| Title: | Analyze Data from Stepped Wedge Cluster Randomized Trials |
|---|---|
| Description: | Provide various functions and tools to help fit models for estimating treatment effects in stepped wedge cluster randomized trials. Implements methods described in Kenny, Voldal, Xia, and Heagerty (2022) "Analysis of stepped wedge cluster randomized trials in the presence of a time-varying treatment effect", <doi:10.1002/sim.9511>. |
| Authors: | Avi Kenny [aut, cre, cph], David Arthur [aut], Yongdong Ouyang [aut] |
| Maintainer: | Avi Kenny <[email protected]> |
| License: | GPL-3 |
| Version: | 1.0.0 |
| Built: | 2026-05-19 09:54:47 UTC |
| Source: | https://github.com/avi-kenny/steppedwedge |
Analyze a stepped wedge dataset
analyze(dat, method = "mixed", estimand_type = "TATE", estimand_time = c(1, max(dat$exposure_time)), exp_time = "IT", cal_time = "categorical", family = stats::gaussian, exponentiate = FALSE, re = c("clust", "time"), corstr = "exchangeable", advanced = params())
dat |
A dataframe containing the stepped wedge trial data. |
method |
A character string; either "mixed", for a mixed-effects model, or "GEE", for generalized estimating equations. |
estimand_type |
One of c("TATE", "PTE"); "TATE" represents the time-averaged treatment effect and "PTE" represents the point treatment effect. |
estimand_time |
An integer vector of length 1 or 2. When estimand_type="TATE", 'estimand_time' must be a numeric vector of length 2, representing the start and end times of the exposure time period to average over. When estimand_type="PTE", 'estimand_time' must be a numeric vector of length 1, representing the time period of interest. See examples. |
exp_time |
One of c("IT", "ETI", "DCT", "NCS", "TEH"); model for exposure time. "IT" encodes an immediate treatment model with a single treatment effect parameter. "ETI" is an exposure time indicator model, including one indicator variable for each exposure time point. "NCS" uses a natural cubic spline model for the exposure time trend. "TEH" includes a random slope term in the model, allowing the treatment effect to vary by timepoint. "DCT" encodes a delayed constant treatment model, which estimates individual effects for a specified number of washout periods (set via the 'w' parameter in 'advanced') followed by a single constant treatment effect. |
cal_time |
One of c("categorical", "NCS", "linear", "none"); model for calendar time. "categorical" uses indicator variables for discrete time points, as in the Hussey and Hughes model. "NCS" uses a natural cubic spline, useful for datasets with continuous time. "linear" uses a single slope parameter. "none" assumes that there is no underlying calendar time trend. |
family |
A family object; see documentation for 'glm'. |
exponentiate |
Logical; if TRUE, return exponentiated treatment effect estimates and confidence intervals (including in the 'effect_curve' object). Defaults to FALSE. |
re |
A character vector of random effects to include; only relevant if method="mixed" is used. Possible random effects are "clust" (random intercept for cluster), "time" (random intercept for cluster-time interaction), "ind" (random intercept for individuals; appropriate when a cohort design is used), and "tx" (random treatment effect).
corstr |
One of c("independence", "exchangeable", "ar1"); only relevant if method="GEE" is used. Defines the GEE working correlation structure; see the documentation for 'geepack::geeglm'. |
advanced |
A list of options returned by params().
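The exposure-time options described above can be pictured as different codings of the same exposure-time variable. The sketch below uses base R only and does not call the package; the helper `code_exposure()` and its column names are hypothetical, and the package's internal coding may differ. It assumes exposure time is 0 before treatment and counts 1, 2, ... once treatment starts.

```r
# Hypothetical helper (not part of the package): build treatment-effect design
# columns from exposure times (0 = not yet treated; 1, 2, ... = treated periods).
code_exposure <- function(exposure_time, model = c("IT", "ETI", "DCT"), w = 1) {
  model <- match.arg(model)
  n <- length(exposure_time)
  max_et <- max(exposure_time)
  if (model == "IT") {
    # Immediate treatment: a single indicator, equal to 1 whenever treated
    out <- cbind(tx = as.integer(exposure_time > 0))
  } else if (model == "ETI") {
    # Exposure time indicators: one column per exposure time point
    out <- sapply(seq_len(max_et), function(s) as.integer(exposure_time == s))
    out <- matrix(out, nrow = n)
    colnames(out) <- paste0("et_", seq_len(max_et))
  } else {
    # Delayed constant treatment: separate indicators for the first w periods,
    # then one indicator shared by all later exposure times
    washout <- sapply(seq_len(w), function(s) as.integer(exposure_time == s))
    washout <- matrix(washout, nrow = n)
    colnames(washout) <- paste0("washout_", seq_len(w))
    out <- cbind(washout, tx_const = as.integer(exposure_time > w))
  }
  out
}

et <- c(0, 0, 1, 2, 3, 4)
code_exposure(et, "IT")
code_exposure(et, "ETI")
code_exposure(et, "DCT", w = 1)
```

Under this coding, "IT" estimates one coefficient, "ETI" estimates one coefficient per exposure time point, and "DCT" with w = 1 estimates one washout coefficient plus one constant effect.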
A list with the model object, model type as a string, estimand type as a string, numeric treatment effect estimate, numeric treatment effect standard error, treatment effect 95% confidence interval, p-value corresponding to the null hypothesis that the main treatment effect estimand equals zero, a list with treatment effect estimates (and standard errors and 95% confidence intervals), the data passed to 'analyze()', and an indicator of whether the effect estimates and confidence intervals are exponentiated.
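To make the relationship between the estimand, its standard error, the confidence interval, and the p-value concrete, here is a minimal base R sketch. It assumes that, under an "ETI" model with discrete time, the TATE is the equally weighted average of the point effects over the chosen exposure times. The helper `tate_from_eti()` is hypothetical and the numbers are made up; this is not the package's implementation or output.

```r
# Made-up ETI point estimates for exposure times 1..4 and their covariance
# matrix (illustrative values only, not output from the package).
delta_hat <- c(0.20, 0.35, 0.50, 0.55)
V <- diag(c(0.010, 0.012, 0.015, 0.020))

# Hypothetical helper: average the point effects over exposure times
# estimand_time[1]..estimand_time[2] and attach Wald-type inference.
tate_from_eti <- function(delta_hat, V, estimand_time, exponentiate = FALSE) {
  idx <- seq(estimand_time[1], estimand_time[2])
  a <- numeric(length(delta_hat))
  a[idx] <- 1 / length(idx)                  # equal weights on the chosen window
  est <- sum(a * delta_hat)
  se <- sqrt(drop(t(a) %*% V %*% a))         # SE of a linear combination
  ci <- est + c(-1, 1) * stats::qnorm(0.975) * se
  p <- 2 * stats::pnorm(-abs(est / se))      # two-sided test of estimand = 0
  if (exponentiate) {
    # Only the estimate and interval are transformed; se stays on the link scale
    est <- exp(est)
    ci <- exp(ci)
  }
  list(est = est, se = se, ci = ci, p = p)
}

tate_from_eti(delta_hat, V, estimand_time = c(1, 4))
# A point treatment effect at exposure time 3 is a one-point window
tate_from_eti(delta_hat, V, estimand_time = c(3, 3))
```

Exponentiating is mainly useful with non-identity links, where the transformed estimate reads as a ratio measure such as an odds ratio.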
# Load data
test_data <- load_data(time = "period", cluster_id = "cluster", individual_id = NULL, treatment = "trt", outcome = "outcome_cont", data = sw_data_example)

# Analysis example 1: TATE estimand for exposure times 1 through 4
results_tate <- analyze(dat = test_data, method = "mixed", estimand_type = "TATE", estimand_time = c(1, 4), exp_time = "ETI")
results_tate

# Analysis example 2: PTE estimand for exposure time 3
results_pte <- analyze(dat = test_data, method = "mixed", estimand_type = "PTE", estimand_time = 3, exp_time = "ETI")
results_pte

# Analysis example 3: TATE estimand for exposure times 1 through 4, natural cubic spline model
results_tate_ncs <- analyze(dat = test_data, method = "mixed", estimand_type = "TATE", estimand_time = c(1, 4), exp_time = "NCS", advanced = params(n_knots_exp = 4))
results_tate_ncs

# Analysis example 4: TATE estimand for exposure times 1 through 4 with binomial outcome data
# Load data
test_data_bin <- load_data(time = "period", cluster_id = "cluster", individual_id = NULL, treatment = "trt", outcome = c("numerator", "denominator"), data = sw_data_example_binom)
results_tate_bin <- analyze(dat = test_data_bin, family = binomial, method = "mixed", estimand_type = "TATE", estimand_time = c(1, 4), exp_time = "ETI")
results_tate_bin
Format estimates returned by analyze as a table
as_table(..., labels = NA)
... |
One or more objects of class "sw_analysis". |
labels |
A character vector of length equal to length(list(...)) representing curve labels |
A table of effect estimate values
# Load data
test_data <- load_data(time = "period", cluster_id = "cluster", individual_id = NULL, treatment = "trt", outcome = "outcome_cont", data = sw_data_example)

# Fit three exposure time models
IT_model <- analyze(dat = test_data, method = "mixed", estimand_type = "TATE", estimand_time = c(1, 4), exp_time = "IT")
ETI_model <- analyze(dat = test_data, method = "mixed", estimand_type = "TATE", estimand_time = c(1, 4), exp_time = "ETI")
NCS_model <- analyze(dat = test_data, method = "mixed", estimand_type = "TATE", estimand_time = c(1, 4), exp_time = "NCS")

# Combine the estimates from all three models into one table
ests_table <- as_table(IT_model, ETI_model, NCS_model)
head(ests_table)
Load and format data object
load_data(time, cluster_id, individual_id = NULL, treatment, outcome, exposure_time = NULL, offset = NULL, time_type = "discrete", data)
time |
A character string; the name of the numeric variable representing time. Time can be either discrete or continuous. |
cluster_id |
A character string; the name of the numeric variable identifying the cluster. |
individual_id |
A character string (optional); the name of the numeric variable identifying the individual. |
treatment |
A character string; the name of the binary variable indicating treatment. Values must be either integers (0/1) or Boolean (T/F). |
outcome |
Either a character string or a vector of two character strings; for a numeric or binary outcome, the single character string indicates the name of the numeric or binary outcome variable; for binomial outcome data, the vector of two character strings indicates the "# of successes" variable and the "# of trials" variable, respectively. Values in the outcome variable(s) must be either numeric or Boolean (T/F). |
exposure_time |
A character string (optional); the name of the numeric variable identifying the exposure time variable. If this is not provided, the package will calculate exposure time automatically. |
offset |
A character string (optional); the name of the numeric variable specifying the offset. |
time_type |
One of c("discrete", "continuous"); whether the model treats time as discrete or continuous. |
data |
A dataframe containing the stepped wedge trial data. |
An object of class sw_data
example_data <- load_data(time = "period", cluster_id = "cluster", individual_id = NULL, treatment = "trt", outcome = "outcome_cont", offset = NULL, data = sw_data_example)
base::summary(example_data)
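When no exposure_time variable is supplied, exposure time is calculated automatically. One plausible way to do this for discrete time is sketched below in base R; the helper `add_exposure_time()` is hypothetical and the package's own calculation may differ. It assumes that, once a cluster starts treatment, it stays treated, and that the first treated period has exposure time 1.

```r
# Toy long-format data: 2 clusters, 4 periods, staggered treatment start
toy <- data.frame(
  cluster = rep(1:2, each = 4),
  period  = rep(1:4, times = 2),
  trt     = c(0, 1, 1, 1,   0, 0, 1, 1)
)

# Hypothetical helper: exposure time = number of periods since the cluster
# first received treatment (1 in the first treated period), 0 while untreated.
add_exposure_time <- function(df) {
  treated <- df$trt == 1
  first_tx <- tapply(df$period[treated], df$cluster[treated], min)
  start <- as.numeric(first_tx[as.character(df$cluster)])
  df$exposure_time <- ifelse(treated, df$period - start + 1, 0)
  df
}

toy <- add_exposure_time(toy)
toy
```

Supplying exposure_time explicitly is the safer choice when the design departs from these assumptions, for example with continuous time.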
Set parameters controlling the analysis. This should be used in conjunction with analyze; see examples.
params(offset = NULL, n_knots_exp = 4, n_knots_cal = 4, var_est = "model", var_est_type = "classic", return_ncs = F, re_correlated = F, w = 1)
offset |
A linear predictor offset term; see docs for 'lme4::lmer'. |
n_knots_exp |
An integer; only relevant when exp_time="NCS". Specifies the number of knots to use for exposure time, including boundary knots. The spline basis includes an intercept, so the dimension (degrees of freedom) of the basis is equal to the number of knots. |
n_knots_cal |
An integer; only relevant when cal_time="NCS". Specifies the number of knots to use for calendar time, including boundary knots. The spline basis includes an intercept, so the dimension (degrees of freedom) of the basis is equal to the number of knots. |
var_est |
A character string; either "model", for model-based variance, or "robust", to use the robust variance estimator. |
var_est_type |
A character string; one of c("classic","DF","KC","MD","FG"); only relevant when var_est="robust". |
return_ncs |
Logical; only relevant when exp_time="NCS". Specifies whether the full covariance matrix for the calendar time parameters and the transformed treatment effect parameters is returned. |
re_correlated |
Logical; specifies whether random treatment effect and random intercept for cluster are correlated. |
w |
Integer; the number of washout periods to use when 'exp_time = "DCT"' (Delayed Constant Treatment). Defaults to 1. |
A list of options
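The link between the knot count and the size of the spline basis can be checked directly with `splines::ns()`, which ships with R. The sketch below is an illustration only: the helper `ncs_basis()` is hypothetical, and placing the knots at evenly spaced points is an assumption, since the package may place them differently.

```r
library(splines)  # part of the standard R distribution

# Hypothetical helper: natural cubic spline basis with n_knots knots in total
# (boundary knots included), spaced evenly over the observed range.
ncs_basis <- function(x, n_knots = 4) {
  all_knots <- seq(min(x), max(x), length.out = n_knots)
  ns(
    x,
    knots = all_knots[-c(1, n_knots)],          # interior knots
    Boundary.knots = all_knots[c(1, n_knots)],  # first and last knot
    intercept = TRUE
  )
}

exposure_time <- 0:6
B <- ncs_basis(exposure_time, n_knots = 4)
dim(B)  # one row per time point, one basis column per knot
```

With an intercept in the basis, each additional knot adds one column, so n_knots_exp and n_knots_cal directly control how many spline parameters the model has to estimate.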
dat <- load_data(time = "period", cluster_id = "cluster", individual_id = NULL, treatment = "trt", outcome = "outcome_bin", data = sw_data_example)
analyze(dat = dat, method = "mixed", estimand_type = "TATE", exp_time = "NCS", family = binomial)
Plot observed and predicted outcomes by cluster over time
plot_clusters(analysis_object, ncol = 3)
analysis_object |
A list of class 'sw_analysis'. |
ncol |
Integer; number of columns in the faceted plot. Defaults to 3. |
A list with a 'ggplot2' object of the actual and predicted outcomes by cluster.
# Load data
test_data <- load_data(time = "period", cluster_id = "cluster", individual_id = NULL, treatment = "trt", outcome = "outcome_cont", data = sw_data_example)

# Analyze using TATE estimand for exposure times 1 through 4
results_tate <- analyze(dat = test_data, method = "mixed", estimand_type = "TATE", estimand_time = c(1, 4), exp_time = "ETI")

# Plot by cluster
plot_clusters(results_tate)
Plot stepped wedge design
plot_design(dat)
dat |
A dataframe containing the stepped wedge trial data. |
A list with a plot of the stepped wedge design.
# Load data
example_data <- load_data(time = "period", cluster_id = "cluster", individual_id = NULL, treatment = "trt", outcome = "outcome_cont", data = sw_data_example)

# Plot design
plot_design(example_data)
Plot effect estimates by exposure time for one or more models.
plot_effect_curves(..., labels = NA, facet_nrow = 1)
... |
One or more objects of class "sw_analysis". |
labels |
A character vector of length equal to the length of list(...), representing plot labels. Only used if length(list(...))>1. |
facet_nrow |
Number of rows for displaying plots using ggplot2::facet_wrap(). |
A plot of the effect curve for each "sw_analysis" object passed to the function.
# Load data
test_data <- load_data(time = "period", cluster_id = "cluster", individual_id = NULL, treatment = "trt", outcome = "outcome_cont", data = sw_data_example)

# Fit three exposure time models
IT_model <- analyze(dat = test_data, method = "mixed", estimand_type = "TATE", estimand_time = c(1, 4), exp_time = "IT")
ETI_model <- analyze(dat = test_data, method = "mixed", estimand_type = "TATE", estimand_time = c(1, 4), exp_time = "ETI")
NCS_4_model <- analyze(dat = test_data, method = "mixed", estimand_type = "TATE", estimand_time = c(1, 4), exp_time = "NCS", advanced = params(n_knots_exp = 4))

# Plot the effect curves side by side
plot_effect_curves(IT_model, NCS_4_model, ETI_model, facet_nrow = 1)
Data generated for the purpose of demonstrating the steppedwedge package
sw_data_example
A data frame with 2,063 rows and 5 columns:
cluster: Cluster id
period: Time period
trt: Treatment indicator
outcome_bin: Binary outcome
outcome_cont: Continuous outcome
...
A simulated stepped wedge dataset formatted with separate columns for the number of successes (numerator) and number of trials (denominator).
sw_data_example_binom
A data frame with 90 rows and 5 columns:
cluster: Cluster id
period: Time period
trt: Treatment indicator
denominator: Denominator (# trials)
numerator: Numerator (# successes)
...
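The successes/trials layout can be produced from individual-level binary data by summing within each cluster-period. The base R sketch below uses made-up data and a hypothetical outcome column `y`; it illustrates the layout and is not how the example dataset was generated.

```r
# Toy individual-level data with a binary outcome (made-up values)
ind <- data.frame(
  cluster = c(1, 1, 1, 1, 2, 2, 2, 2),
  period  = c(1, 1, 2, 2, 1, 1, 2, 2),
  trt     = c(0, 0, 1, 1, 0, 0, 0, 0),
  y       = c(1, 0, 1, 1, 0, 0, 1, 0)
)
ind$trials <- 1  # each individual contributes one trial

# One row per cluster-period: sum successes and trials within each cell
agg <- aggregate(cbind(y, trials) ~ cluster + period + trt, data = ind, FUN = sum)
names(agg)[names(agg) == "y"] <- "numerator"
names(agg)[names(agg) == "trials"] <- "denominator"
agg <- agg[order(agg$cluster, agg$period), ]
agg
```

A data frame shaped like `agg` could then be passed to load_data with outcome = c("numerator", "denominator"), as in the binomial example for analyze.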