R, Rasch, etc
  • Home
  • RISEkbmRasch vignette
  • About
  • Blog
    • Automating reports with Quarto
    • What’s in a latent trait score?
    • Data retrieval with R using API calls
    • Power analysis for multilevel models
    • Data wrangling for psychometrics in R

On this page

  • 1 Creating a function
  • 2 Specifying parameter variations
  • 3 Power analysis
  • 4 Visualization

Power analysis for multilevel models

  • Show All Code
  • Hide All Code

  • View Source

Using powerlmm and pmap() with ggplot() to visualize variations of parameter settings

Author
Affiliation

Magnus Johansson

RISE Research Institutes of Sweden

Published

2024-02-20

I recently needed to do a power analysis for a longitudinal study, and since I always document my explorative work in a Quarto file it seemed reasonable to take a few minutes to share my experiences.

The starting point for me was the “Basic example” made by the author of the powerlmm package.

You may need to install the package from the GitHub website:

devtools::install_github("rpsychologist/powerlmm")
Code
library(powerlmm)
library(tidyverse)

# define ggplot theme
theme_rise <- function(fontfamily = "Lato", axissize = 13, titlesize = 15,
                       margins = 12, axisface = "plain", stripsize = 12,
                       panelDist = 0.6, legendSize = 11, legendTsize = 12) {
  theme_minimal() +
  theme(
    text = element_text(family = fontfamily),
    axis.title.x = element_text(
      margin = margin(t = margins),
      size = axissize
    ),
    axis.title.y = element_text(
      margin = margin(r = margins),
      size = axissize
    ),
    plot.title = element_text(
      face = "bold",
      size = titlesize
    ),
    axis.title = element_text(
      face = axisface
    ),
    plot.caption = element_text(
      face = "italic"
    ),
    legend.text = element_text(family = fontfamily, size = legendSize),
    legend.title = element_text(family = fontfamily, size = legendTsize),
    legend.background = element_rect(color = "lightgrey"),
    strip.text = element_text(size = stripsize),
    strip.background = element_rect(color = "lightgrey"),
    panel.spacing = unit(panelDist, "cm", data = NULL)
  )
}

I made some changes in the example regarding time points, clusters and icc’s. Below is a test run for one combination of parameter settings.

Code
# dropout per treatment group
d <- per_treatment(control = dropout_weibull(0.25, 1),
              treatment = dropout_weibull(0.25, 1))
# Setup design
p <- study_parameters(n1 = 5, # time points
                      n2 = 10, # subjects per cluster
                      n3 = 4, # clusters per treatment arm
                      icc_pre_subject = 0.5,
                      icc_pre_cluster = 0.2,
                      icc_slope = 0.05,
                      var_ratio = 0.02,
                      dropout = d,
                      cohend = -0.8)
# Power
powerAnalysis <- get_power(p)

powerAnalysis$power
[1] 0.7626795

1 Creating a function

We want to be able to vary some of the parameters and compare the resulting power analysis output.

For this example, we will create a function that allows us to vary three of the input parameters:

  • the number of participants per cluster
  • attrition/dropout rate
  • effect size
Code
powerFunc <- function(samplesize = 15, dropoutrate = 0.25, effectsize = -0.8) {
  d <- per_treatment(
    control = dropout_weibull(dropoutrate, 1),
    treatment = dropout_weibull(dropoutrate, 1)
  )

  p <- study_parameters(
    n1 = 5, # time points
    n2 = samplesize, # subjects per cluster
    n3 = 4, # clusters per treatment arm
    icc_pre_subject = 0.5,
    icc_pre_cluster = 0.2,
    icc_slope = 0.05,
    var_ratio = 0.02,
    dropout = d,
    cohend = effectsize
  )
  paout <- get_power(p)
  return(paout$power)
}

Let’s test the function.

Code
powerFunc()
[1] 0.8937192

2 Specifying parameter variations

Next, we will define variables with the parameter variations we want to look at.

Code
# set sample sizes
sampleSizes <- c(10, 12, 14, 16, 18, 20)
# set dropout rates
dropoutRates <- c(0.20, 0.25, 0.30)
# set effect sizes
effectSizes <- c(-0.6, -0.7, -0.8)

And now we utilize the magic of expand_grid() to create a dataframe with all combinations of the three variables.

Code
# make a dataframe with all combinations
combinations <- expand_grid(sampleSizes, dropoutRates, effectSizes)

3 Power analysis

We’ll make use of pmap() from the purrr:map() family since we need to input more than two variables.

Code
# use pmap to iterate over all combinations
powerList <- pmap(
  list(
    combinations$sampleSizes,
    combinations$dropoutRates,
    combinations$effectSizes
  ),
  powerFunc
) %>%
  # set readable names for each output list object, to later separate into variables
  set_names(paste0(
    combinations$sampleSizes, "_",
    combinations$dropoutRates, "_",
    combinations$effectSizes
  ))

The names we set at the end of the chunk above will help us to create separate variables next.

Code
# combine into dataframe for visualization
df.power <- powerList %>%
  bind_rows() %>%
  t() %>%
  as.data.frame() %>%
  rownames_to_column("split_this") %>%
  rename(power = V1) %>%
  separate(split_this,
    into = c("samplesize", "dropoutrate", "effectsize"),
    sep = "_"
  ) %>%
  mutate(across(everything(), ~ as.numeric(.x)))

glimpse(df.power)
Rows: 54
Columns: 4
$ samplesize  <dbl> 10, 10, 10, 10, 10, 10, 10, 10, 10, 12, 12, 12, 12, 12, 12…
$ dropoutrate <dbl> 0.20, 0.20, 0.20, 0.25, 0.25, 0.25, 0.30, 0.30, 0.30, 0.20…
$ effectsize  <dbl> -0.6, -0.7, -0.8, -0.6, -0.7, -0.8, -0.6, -0.7, -0.8, -0.6…
$ power       <dbl> 0.5415203, 0.6703958, 0.7798685, 0.5241826, 0.6515921, 0.7…

4 Visualization

Code
df.power %>%
  mutate(dropoutrate = factor(dropoutrate,
                              labels = c("20%","25%","30%")),
         effectsize = fct_rev(factor(effectsize))) %>%
  
  ggplot(aes(
    x = samplesize,
    y = power,
    color = dropoutrate,
    group = dropoutrate
  )) +
  geom_point() +
  geom_line() +
  scale_x_continuous(breaks = sampleSizes) +
  scale_y_continuous(
    limits = c(0, 1),
    breaks = c(0, 0.2, 0.4, 0.6, 0.8, 1)
  ) +
  labs(
    x = "Cluster sample size",
    y = "Statistical power",
    title = "Power analysis",
    subtitle = "For different effect sizes (Cohen's d)",
    caption = "Note. There are four clusters per treatment arm and five time points."
  ) +
  theme_minimal(
    base_family = "Lato",
    base_size = 14
  ) +
  facet_wrap(~effectsize) +
  geom_hline(
    yintercept = 0.8,
    linetype = 2
  ) +
  scale_color_viridis_d('Dropout rate') +
  theme_rise()

I added a reference line for power = 0.80.

Reuse

CC BY 4.0

Citation

BibTeX citation:
@online{johansson2024,
  author = {Johansson, Magnus},
  title = {Power Analysis for Multilevel Models},
  date = {2024-02-20},
  url = {https://pgmj.github.io/powerviz.html},
  langid = {en}
}
For attribution, please cite this work as:
Johansson, Magnus. 2024. “Power Analysis for Multilevel Models.” February 20, 2024. https://pgmj.github.io/powerviz.html.
Source Code
---
title: "Power analysis for multilevel models"
subtitle: "Using powerlmm and pmap() with ggplot() to visualize variations of parameter settings"
author:
  name: 'Magnus Johansson'
  affiliation: 'RISE Research Institutes of Sweden'
  affiliation-url: 'https://ri.se/shic'
  orcid: '0000-0003-1669-592X'
date: last-modified
date-format: iso
google-scholar: true
citation:
  type: 'webpage'
format:
  html:
    css: styles.css
execute: 
  cache: false
  warning: false
  message: false
editor_options: 
  chunk_output_type: console
---


I recently needed to do a power analysis for a longitudinal study, and since I always document my explorative work in a Quarto file it seemed reasonable to take a few minutes to share my experiences.

The starting point for me was the ["Basic example"](https://rpsychologist.com/introducing-powerlmm#a-basic-example) made by the author of the `powerlmm` package.

You may need to install the package from the GitHub website:
``` r
devtools::install_github("rpsychologist/powerlmm")
```

```{r}
#| code-fold: true
library(powerlmm)
library(tidyverse)

# define ggplot theme
theme_rise <- function(fontfamily = "Lato", axissize = 13, titlesize = 15,
                       margins = 12, axisface = "plain", stripsize = 12,
                       panelDist = 0.6, legendSize = 11, legendTsize = 12) {
  theme_minimal() +
  theme(
    text = element_text(family = fontfamily),
    axis.title.x = element_text(
      margin = margin(t = margins),
      size = axissize
    ),
    axis.title.y = element_text(
      margin = margin(r = margins),
      size = axissize
    ),
    plot.title = element_text(
      face = "bold",
      size = titlesize
    ),
    axis.title = element_text(
      face = axisface
    ),
    plot.caption = element_text(
      face = "italic"
    ),
    legend.text = element_text(family = fontfamily, size = legendSize),
    legend.title = element_text(family = fontfamily, size = legendTsize),
    legend.background = element_rect(color = "lightgrey"),
    strip.text = element_text(size = stripsize),
    strip.background = element_rect(color = "lightgrey"),
    panel.spacing = unit(panelDist, "cm", data = NULL)
  )
}

```

I made some changes in the example regarding time points, clusters and icc's. Below is a test run for one combination of parameter settings.

```{r}
# dropout per treatment group
d <- per_treatment(control = dropout_weibull(0.25, 1),
              treatment = dropout_weibull(0.25, 1))
# Setup design
p <- study_parameters(n1 = 5, # time points
                      n2 = 10, # subjects per cluster
                      n3 = 4, # clusters per treatment arm
                      icc_pre_subject = 0.5,
                      icc_pre_cluster = 0.2,
                      icc_slope = 0.05,
                      var_ratio = 0.02,
                      dropout = d,
                      cohend = -0.8)
# Power
powerAnalysis <- get_power(p)

powerAnalysis$power
```

## Creating a function

We want to be able to vary some of the parameters and compare the resulting power analysis output.

For this example, we will create a function that allows us to vary three of the input parameters: 

- the number of participants per cluster
- attrition/dropout rate
- effect size

```{r}
powerFunc <- function(samplesize = 15, dropoutrate = 0.25, effectsize = -0.8) {
  d <- per_treatment(
    control = dropout_weibull(dropoutrate, 1),
    treatment = dropout_weibull(dropoutrate, 1)
  )

  p <- study_parameters(
    n1 = 5, # time points
    n2 = samplesize, # subjects per cluster
    n3 = 4, # clusters per treatment arm
    icc_pre_subject = 0.5,
    icc_pre_cluster = 0.2,
    icc_slope = 0.05,
    var_ratio = 0.02,
    dropout = d,
    cohend = effectsize
  )
  paout <- get_power(p)
  return(paout$power)
}
```

Let's test the function.

```{r}
powerFunc()
```

## Specifying parameter variations

Next, we will define variables with the parameter variations we want to look at.
```{r}
# set sample sizes
sampleSizes <- c(10, 12, 14, 16, 18, 20)
# set dropout rates
dropoutRates <- c(0.20, 0.25, 0.30)
# set effect sizes
effectSizes <- c(-0.6, -0.7, -0.8)
```

And now we utilize the magic of `expand_grid()` to create a dataframe with all combinations of the three variables.
```{r}
# make a dataframe with all combinations
combinations <- expand_grid(sampleSizes, dropoutRates, effectSizes)
```

## Power analysis

We'll make use of `pmap()` from the `purrr:map()` family since we need to input more than two variables.
```{r}
# use pmap to iterate over all combinations
powerList <- pmap(
  list(
    combinations$sampleSizes,
    combinations$dropoutRates,
    combinations$effectSizes
  ),
  powerFunc
) %>%
  # set readable names for each output list object, to later separate into variables
  set_names(paste0(
    combinations$sampleSizes, "_",
    combinations$dropoutRates, "_",
    combinations$effectSizes
  ))
```

The names we set at the end of the chunk above will help us to create separate variables next.

```{r}
# combine into dataframe for visualization
df.power <- powerList %>%
  bind_rows() %>%
  t() %>%
  as.data.frame() %>%
  rownames_to_column("split_this") %>%
  rename(power = V1) %>%
  separate(split_this,
    into = c("samplesize", "dropoutrate", "effectsize"),
    sep = "_"
  ) %>%
  mutate(across(everything(), ~ as.numeric(.x)))

glimpse(df.power)
```

## Visualization

```{r}
df.power %>%
  mutate(dropoutrate = factor(dropoutrate,
                              labels = c("20%","25%","30%")),
         effectsize = fct_rev(factor(effectsize))) %>%
  
  ggplot(aes(
    x = samplesize,
    y = power,
    color = dropoutrate,
    group = dropoutrate
  )) +
  geom_point() +
  geom_line() +
  scale_x_continuous(breaks = sampleSizes) +
  scale_y_continuous(
    limits = c(0, 1),
    breaks = c(0, 0.2, 0.4, 0.6, 0.8, 1)
  ) +
  labs(
    x = "Cluster sample size",
    y = "Statistical power",
    title = "Power analysis",
    subtitle = "For different effect sizes (Cohen's d)",
    caption = "Note. There are four clusters per treatment arm and five time points."
  ) +
  theme_minimal(
    base_family = "Lato",
    base_size = 14
  ) +
  facet_wrap(~effectsize) +
  geom_hline(
    yintercept = 0.8,
    linetype = 2
  ) +
  scale_color_viridis_d('Dropout rate') +
  theme_rise()

```

I added a reference line for power = 0.80.