Power analysis for multilevel models

Magnus Johansson

Power analysis for multilevel models

Using powerlmm and pmap() with ggplot() to visualize variations of parameter settings

Author

Affiliation

Magnus Johansson

RISE Research Institutes of Sweden

Published

2024-02-20

I recently needed to do a power analysis for a longitudinal study, and since I always document my explorative work in a Quarto file it seemed reasonable to take a few minutes to share my experiences.

The starting point for me was the “Basic example” made by the author of the powerlmm package.

You may need to install the package from the GitHub website:

devtools::install_github("rpsychologist/powerlmm")

Code

library(powerlmm)
library(tidyverse)

# define ggplot theme
theme_rise <- function(fontfamily = "Lato", axissize = 13, titlesize = 15,
                       margins = 12, axisface = "plain", stripsize = 12,
                       panelDist = 0.6, legendSize = 11, legendTsize = 12) {
  theme_minimal() +
  theme(
    text = element_text(family = fontfamily),
    axis.title.x = element_text(
      margin = margin(t = margins),
      size = axissize
    ),
    axis.title.y = element_text(
      margin = margin(r = margins),
      size = axissize
    ),
    plot.title = element_text(
      face = "bold",
      size = titlesize
    ),
    axis.title = element_text(
      face = axisface
    ),
    plot.caption = element_text(
      face = "italic"
    ),
    legend.text = element_text(family = fontfamily, size = legendSize),
    legend.title = element_text(family = fontfamily, size = legendTsize),
    legend.background = element_rect(color = "lightgrey"),
    strip.text = element_text(size = stripsize),
    strip.background = element_rect(color = "lightgrey"),
    panel.spacing = unit(panelDist, "cm", data = NULL)
  )
}

I made some changes in the example regarding time points, clusters and icc’s. Below is a test run for one combination of parameter settings.

Code

# dropout per treatment group
d <- per_treatment(control = dropout_weibull(0.25, 1),
              treatment = dropout_weibull(0.25, 1))
# Setup design
p <- study_parameters(n1 = 5, # time points
                      n2 = 10, # subjects per cluster
                      n3 = 4, # clusters per treatment arm
                      icc_pre_subject = 0.5,
                      icc_pre_cluster = 0.2,
                      icc_slope = 0.05,
                      var_ratio = 0.02,
                      dropout = d,
                      cohend = -0.8)
# Power
powerAnalysis <- get_power(p)

powerAnalysis$power

[1] 0.7626795

1 Creating a function

We want to be able to vary some of the parameters and compare the resulting power analysis output.

For this example, we will create a function that allows us to vary three of the input parameters:

the number of participants per cluster
attrition/dropout rate
effect size

Code

powerFunc <- function(samplesize = 15, dropoutrate = 0.25, effectsize = -0.8) {
  d <- per_treatment(
    control = dropout_weibull(dropoutrate, 1),
    treatment = dropout_weibull(dropoutrate, 1)
  )

  p <- study_parameters(
    n1 = 5, # time points
    n2 = samplesize, # subjects per cluster
    n3 = 4, # clusters per treatment arm
    icc_pre_subject = 0.5,
    icc_pre_cluster = 0.2,
    icc_slope = 0.05,
    var_ratio = 0.02,
    dropout = d,
    cohend = effectsize
  )
  paout <- get_power(p)
  return(paout$power)
}

Let’s test the function.

Code

powerFunc()

[1] 0.8937192

2 Specifying parameter variations

Next, we will define variables with the parameter variations we want to look at.

Code

# set sample sizes
sampleSizes <- c(10, 12, 14, 16, 18, 20)
# set dropout rates
dropoutRates <- c(0.20, 0.25, 0.30)
# set effect sizes
effectSizes <- c(-0.6, -0.7, -0.8)

And now we utilize the magic of expand_grid() to create a dataframe with all combinations of the three variables.

Code

# make a dataframe with all combinations
combinations <- expand_grid(sampleSizes, dropoutRates, effectSizes)

3 Power analysis

We’ll make use of pmap() from the purrr:map() family since we need to input more than two variables.

Code

# use pmap to iterate over all combinations
powerList <- pmap(
  list(
    combinations$sampleSizes,
    combinations$dropoutRates,
    combinations$effectSizes
  ),
  powerFunc
) %>%
  # set readable names for each output list object, to later separate into variables
  set_names(paste0(
    combinations$sampleSizes, "_",
    combinations$dropoutRates, "_",
    combinations$effectSizes
  ))

The names we set at the end of the chunk above will help us to create separate variables next.

Code

# combine into dataframe for visualization
df.power <- powerList %>%
  bind_rows() %>%
  t() %>%
  as.data.frame() %>%
  rownames_to_column("split_this") %>%
  rename(power = V1) %>%
  separate(split_this,
    into = c("samplesize", "dropoutrate", "effectsize"),
    sep = "_"
  ) %>%
  mutate(across(everything(), ~ as.numeric(.x)))

glimpse(df.power)

Rows: 54
Columns: 4
$ samplesize  <dbl> 10, 10, 10, 10, 10, 10, 10, 10, 10, 12, 12, 12, 12, 12, 12…
$ dropoutrate <dbl> 0.20, 0.20, 0.20, 0.25, 0.25, 0.25, 0.30, 0.30, 0.30, 0.20…
$ effectsize  <dbl> -0.6, -0.7, -0.8, -0.6, -0.7, -0.8, -0.6, -0.7, -0.8, -0.6…
$ power       <dbl> 0.5415203, 0.6703958, 0.7798685, 0.5241826, 0.6515921, 0.7…

4 Visualization

Code

df.power %>%
  mutate(dropoutrate = factor(dropoutrate,
                              labels = c("20%","25%","30%")),
         effectsize = fct_rev(factor(effectsize))) %>%
  
  ggplot(aes(
    x = samplesize,
    y = power,
    color = dropoutrate,
    group = dropoutrate
  )) +
  geom_point() +
  geom_line() +
  scale_x_continuous(breaks = sampleSizes) +
  scale_y_continuous(
    limits = c(0, 1),
    breaks = c(0, 0.2, 0.4, 0.6, 0.8, 1)
  ) +
  labs(
    x = "Cluster sample size",
    y = "Statistical power",
    title = "Power analysis",
    subtitle = "For different effect sizes (Cohen's d)",
    caption = "Note. There are four clusters per treatment arm and five time points."
  ) +
  theme_minimal(
    base_family = "Lato",
    base_size = 14
  ) +
  facet_wrap(~effectsize) +
  geom_hline(
    yintercept = 0.8,
    linetype = 2
  ) +
  scale_color_viridis_d('Dropout rate') +
  theme_rise()

I added a reference line for power = 0.80.

Reuse

CC BY 4.0

Citation

BibTeX citation:

@online{johansson2024,
  author = {Johansson, Magnus},
  title = {Power Analysis for Multilevel Models},
  date = {2024-02-20},
  url = {https://pgmj.github.io/powerviz.html},
  langid = {en}
}

For attribution, please cite this work as:

Johansson, Magnus. 2024. “Power Analysis for Multilevel Models.” February 20, 2024. https://pgmj.github.io/powerviz.html.