Package 'steppedwedge'

Title: Analyze Data from Stepped Wedge Cluster Randomized Trials
Description: Provide various functions and tools to help fit models for estimating treatment effects in stepped wedge cluster randomized trials. Implements methods described in Kenny, Voldal, Xia, and Heagerty (2022) "Analysis of stepped wedge cluster randomized trials in the presence of a time-varying treatment effect", <doi:10.1002/sim.9511>.
Authors: Avi Kenny [aut, cre, cph], David Arthur [aut], Yongdong Ouyang [aut]
Maintainer: Avi Kenny <[email protected]>
License: GPL-3
Version: 1.0.0
Built: 2026-05-19 09:54:47 UTC
Source: https://github.com/avi-kenny/steppedwedge

Analyze a stepped wedge dataset

Description

Analyze a stepped wedge dataset

Usage

analyze(
  dat,
  method = "mixed",
  estimand_type = "TATE",
  estimand_time = c(1, max(dat$exposure_time)),
  exp_time = "IT",
  cal_time = "categorical",
  family = stats::gaussian,
  exponentiate = FALSE,
  re = c("clust", "time"),
  corstr = "exchangeable",
  advanced = params()
)

Arguments

dat

A dataframe containing the stepped wedge trial data.

method

A character string; either "mixed", for a mixed-effects model, or "GEE", for generalized estimating equations.

estimand_type

One of c("TATE", "PTE"); "TATE" represents the time-averaged treatment effect and "PTE" represents the point treatment effect.

estimand_time

An integer vector of length 1 or 2. When estimand_type="TATE", 'estimand_time' must be a numeric vector of length 2, representing the start and end times of the exposure time period to average over. When estimand_type="PTE", 'estimand_time' must be a numeric vector of length 1, representing the time period of interest. See examples.

exp_time

One of c("IT", "ETI", "DCT", "NCS", "TEH"); model for exposure time. "IT" encodes an immediate treatment model with a single treatment effect parameter. "ETI" is an exposure time indicator model, including one indicator variable for each exposure time point. "NCS" uses a natural cubic spline model for the exposure time trend. "TEH" includes a random slope term in the model, allowing the treatment effect to vary by timepoint. "DCT" encodes a delayed constant treatment model, which estimates individual effects for a specified number of washout periods (set via the 'w' parameter in 'advanced') followed by a single constant treatment effect.

cal_time

One of c("categorical", "NCS", "linear", "none"); model for calendar time. "categorical" uses indicator variables for discrete time points, as in the Hussey and Hughes model. "NCS" uses a natural cubic spline, useful for datasets with continuous time. "linear" uses a single slope parameter. "none" assumes that there is no underlying calendar time trend.

family

A family object; see documentation for 'glm'.

exponentiate

Logical; if TRUE, return exponentiated treatment effect estimates and confidence intervals (including in the 'effect_curve' object). Defaults to FALSE.

re

A character vector of random effects to include; only relevant if method="mixed" is used. Possible random effects include "clust" (random intercept for cluster), "time" (random intercept for cluster-time interaction), "ind" (random intercept for individuals; appropriate when a cohort design is used), and "tx" (random treatment effect).

corstr

One of c("independence", "exchangeable", "ar1"); only relevant if method="GEE" is used. Defines the GEE working correlation structure; see the documentation for 'geepack::geeglm'.

advanced

A list of options returned by params.

Value

A list with the model object, model type as a string, estimand type as a string, numeric treatment effect estimate, numeric treatment effect standard error, treatment effect 95% confidence interval, p-value corresponding to the null hypothesis that the main treatment effect estimand equals zero, a list with treatment effect estimates (and standard errors and 95% confidence intervals), the data object passed to 'analyze()', and an indicator of whether the effect estimates and CIs are exponentiated.
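
The estimate, standard error, confidence interval, and p-value listed above are related in the usual way for a Wald-type analysis. The following base R sketch uses hypothetical numbers ('est' and 'se' are illustrative, not package output) and assumes a normal-approximation 95% interval; the package's internal computation may differ, for example when a robust variance estimator is requested.

```r
# Hypothetical treatment effect estimate and standard error on the
# log-odds scale (illustrative values, not package output).
est <- 0.40
se <- 0.15

# Wald-type 95% confidence interval and two-sided p-value for H0: effect = 0.
z_crit <- stats::qnorm(0.975)
ci <- c(lower = est - z_crit * se, upper = est + z_crit * se)
p_value <- 2 * stats::pnorm(-abs(est / se))

# On the exponentiated scale (e.g., an odds ratio under a logit link),
# this sketch exponentiates the estimate and both interval limits.
or_est <- exp(est)
or_ci <- exp(ci)
```

The exponentiated interval is not symmetric around the exponentiated estimate, which is expected when limits are transformed from the linear-predictor scale.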

Examples

# Load data
test_data <- load_data(time = "period", cluster_id = "cluster", individual_id = NULL,
  treatment = "trt", outcome = "outcome_cont", data = sw_data_example)

# Analysis example 1: TATE estimand for exposure times 1 through 4
results_tate <- analyze(dat = test_data, method = "mixed", estimand_type = "TATE",
  estimand_time = c(1, 4), exp_time = "ETI")

results_tate

# Analysis example 2: PTE estimand for exposure time 3
results_pte <- analyze(dat = test_data, method = "mixed", estimand_type = "PTE",
  estimand_time = 3, exp_time = "ETI")

results_pte

# Analysis example 3: TATE estimand for exposure times 1 through 4, natural cubic spline model
results_tate_ncs <- analyze(dat = test_data, method = "mixed", estimand_type = "TATE",
  estimand_time = c(1, 4), exp_time = "NCS", advanced = params(n_knots_exp = 4))

results_tate_ncs

# Analysis example 4: TATE estimand for exposure times 1 through 4 with binomial outcome data
# Load data
test_data_bin <- load_data(time = "period", cluster_id = "cluster", individual_id = NULL,
  treatment = "trt", outcome = c("numerator", "denominator"), data = sw_data_example_binom)

results_tate_bin <- analyze(dat = test_data_bin, family = binomial, method = "mixed",
  estimand_type = "TATE", estimand_time = c(1, 4), exp_time = "ETI")

results_tate_bin
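
To make the difference between the two estimands concrete, the base R sketch below computes a TATE and a PTE by hand from a set of exposure-time-specific effects. The values of 'delta' and 'V' are hypothetical (not package output), and the equally weighted average is an assumption that matches a discrete exposure time model such as "ETI"; the computation used for "NCS" or continuous time may differ.

```r
# Hypothetical ETI point estimates for exposure times 1 through 4, with a
# hypothetical covariance matrix (illustrative values, not package output).
delta <- c(0.10, 0.25, 0.35, 0.40)
V <- diag(c(0.04, 0.03, 0.03, 0.05)) + 0.01

# TATE over exposure times 1 through 4: an equally weighted average of the
# point effects, written as a linear combination a' delta.
a <- rep(1 / 4, 4)
tate_est <- sum(a * delta)
tate_se <- sqrt(drop(t(a) %*% V %*% a))

# PTE at exposure time 3: the single point effect and its standard error.
pte_est <- delta[3]
pte_se <- sqrt(V[3, 3])
```

Because the TATE is a linear combination of the point effects, its standard error depends on the covariances between them and not only on their individual variances.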

Create table of estimates

Description

Format estimates returned by analyze as a table

Usage

as_table(..., labels = NA)

Arguments

...

One or more objects of class "sw_analysis" returned by analyze.

labels

A character vector of length equal to length(list(...)), representing curve labels.

Value

A table of effect estimate values

Examples

# Load data
test_data <- load_data(time = "period", cluster_id = "cluster", individual_id = NULL,
  treatment = "trt", outcome = "outcome_cont", data = sw_data_example)

# Fit three exposure time models and tabulate their estimates
IT_model <- analyze(dat = test_data, method = "mixed", estimand_type = "TATE",
  estimand_time = c(1, 4), exp_time = "IT")
ETI_model <- analyze(dat = test_data, method = "mixed", estimand_type = "TATE",
  estimand_time = c(1, 4), exp_time = "ETI")
NCS_model <- analyze(dat = test_data, method = "mixed", estimand_type = "TATE",
  estimand_time = c(1, 4), exp_time = "NCS")
ests_table <- as_table(IT_model, ETI_model, NCS_model)
head(ests_table)

Load and format data object

Description

Load and format data object

Usage

load_data(
  time,
  cluster_id,
  individual_id = NULL,
  treatment,
  outcome,
  exposure_time = NULL,
  offset = NULL,
  time_type = "discrete",
  data
)

Arguments

time

A character string; the name of the numeric variable representing time. Time can be either discrete or continuous.

cluster_id

A character string; the name of the numeric variable identifying the cluster.

individual_id

A character string (optional); the name of the numeric variable identifying the individual.

treatment

A character string; the name of the binary variable indicating treatment. Values must be either integers (0/1) or Boolean (T/F).

outcome

Either a character string or a vector of two character strings; for a numeric or binary outcome, the single character string indicates the name of the numeric or binary outcome variable; for binomial outcome data, the vector of two character strings indicates the "# of successes" variable and the "# of trials" variable, respectively. Values in the outcome variable(s) must be either numeric or Boolean (T/F).

exposure_time

A character string (optional); the name of the numeric variable identifying the exposure time variable. If this is not provided, the package will calculate exposure time automatically.

offset

A character string (optional); the name of the numeric variable specifying the offset.

time_type

One of c("discrete", "continuous"); whether the model treats time as discrete or continuous.

data

A dataframe containing the stepped wedge trial data.

Value

An object of class sw_data

Examples

example_data <- load_data(time = "period", cluster_id = "cluster", individual_id = NULL,
  treatment = "trt", outcome = "outcome_cont", offset = NULL, data = sw_data_example)
base::summary(example_data)
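
When 'exposure_time' is not supplied, exposure time is calculated automatically. The base R sketch below shows one plausible way such a variable can be derived from the time and treatment columns. It is an illustration under stated assumptions, not the package's own code: the toy data are hypothetical, and the convention that exposure time is 0 under control and 1 in the first treated period is inferred from the default 'estimand_time = c(1, max(dat$exposure_time))' in 'analyze()'.

```r
# Toy long-format data: 2 clusters observed over 4 periods (hypothetical).
toy <- data.frame(
  cluster = rep(1:2, each = 4),
  period = rep(1:4, times = 2),
  trt = c(0, 1, 1, 1, 0, 0, 1, 1)
)

# First period in which each cluster is treated.
treated <- toy$trt == 1
start <- tapply(toy$period[treated], toy$cluster[treated], min)

# Exposure time: 0 before crossover, then 1, 2, ... once treatment begins.
toy$exposure_time <- ifelse(
  treated,
  toy$period - start[as.character(toy$cluster)] + 1,
  0
)
toy
```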

Set advanced parameters for analysis of data from stepped wedge trials

Description

This should be used in conjunction with analyze to set parameters controlling analysis; see examples.

Usage

params(
  offset = NULL,
  n_knots_exp = 4,
  n_knots_cal = 4,
  var_est = "model",
  var_est_type = "classic",
  return_ncs = F,
  re_correlated = F,
  w = 1
)

Arguments

offset

A linear predictor offset term; see docs for 'lme4::lmer'.

n_knots_exp

An integer; only relevant when exp_time="NCS". Specifies the number of knots to use for exposure time, including boundary knots. The spline basis includes an intercept, and the dimension (degrees of freedom) of the basis is equal to the number of knots.

n_knots_cal

An integer; only relevant when cal_time="NCS". Specifies the number of knots to use for calendar time, including boundary knots. The spline basis includes an intercept, and the dimension (degrees of freedom) of the basis is equal to the number of knots.

var_est

A character string; either "model", for model-based variance, or "robust", to use the robust variance estimator.

var_est_type

A character string; one of c("classic","DF","KC","MD","FG"); only relevant when var_est="robust".

return_ncs

Logical; only relevant when exp_time="NCS". Specifies whether the full covariance matrix for the calendar time parameters and the transformed treatment effect parameters are returned.

re_correlated

Logical; specifies whether random treatment effect and random intercept for cluster are correlated.

w

Integer; the number of washout periods to use when 'exp_time = "DCT"' (Delayed Constant Treatment). Defaults to 1.

Value

A list of options

Examples

dat <- load_data(time = "period", cluster_id = "cluster", individual_id = NULL,
  treatment = "trt", outcome = "outcome_bin", data = sw_data_example)

analyze(dat = dat, method = "mixed", estimand_type = "TATE", exp_time = "NCS",
  family = binomial, advanced = params(n_knots_exp = 4))
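
The relationship between the knot count and the size of the spline basis described for 'n_knots_exp' and 'n_knots_cal' can be checked with 'splines::ns'. The sketch below is an illustration only: the exposure times are hypothetical, and evenly spaced knots are an assumption made here, since the package's knot placement rule is not stated in this manual.

```r
library(splines)

# Hypothetical exposure times and knot count (illustrative values only).
exposure_time <- 1:7
n_knots <- 4

# Assumption: knots evenly spaced over the observed range; the first and last
# act as boundary knots and the remainder as interior knots.
all_knots <- seq(min(exposure_time), max(exposure_time), length.out = n_knots)
basis <- ns(
  exposure_time,
  knots = all_knots[-c(1, n_knots)],
  Boundary.knots = all_knots[c(1, n_knots)],
  intercept = TRUE
)

# With an intercept, the basis has one column per knot.
n_basis <- ncol(basis)
```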

Plot observed and predicted outcomes by cluster over time

Description

Plot observed and predicted outcomes by cluster over time

Usage

plot_clusters(analysis_object, ncol = 3)

Arguments

analysis_object

A list of class 'sw_analysis'.

ncol

Integer; number of columns in the faceted plot. Defaults to 3.

Value

A list with a 'ggplot2' object of the actual and predicted outcomes by cluster.

Examples

# Load data
test_data <- load_data(time = "period", cluster_id = "cluster", individual_id = NULL,
  treatment = "trt", outcome = "outcome_cont", data = sw_data_example)

# Analyze using TATE estimand for exposure times 1 through 4
results_tate <- analyze(dat = test_data, method = "mixed", estimand_type = "TATE",
  estimand_time = c(1, 4), exp_time = "ETI")

# Plot by cluster
plot_clusters(results_tate)

Plot stepped wedge design

Description

Plot stepped wedge design

Usage

plot_design(dat)

Arguments

dat

A dataframe containing the stepped wedge trial data.

Value

A list with a plot of the stepped wedge design.

Examples

# Load data
example_data <- load_data(time = "period", cluster_id = "cluster", individual_id = NULL,
  treatment = "trt", outcome = "outcome_cont", data = sw_data_example)
# Plot design
plot_design(example_data)
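
For readers new to the layout that such a plot displays, the base R sketch below builds the treatment schedule of a small stepped wedge design as a sequence-by-period matrix. The numbers of sequences and periods are hypothetical, and the matrix is an illustration of the design concept, not the structure returned by 'plot_design()'.

```r
# Hypothetical design: 4 sequences (rows) observed over 5 periods (columns).
n_seq <- 4
n_per <- 5

# Sequence i crosses over to treatment at period i + 1, so every sequence
# starts in control and all are treated by the final period.
design <- outer(seq_len(n_seq), seq_len(n_per), function(i, j) as.integer(j > i))
dimnames(design) <- list(
  sequence = paste0("seq", seq_len(n_seq)),
  period = paste0("t", seq_len(n_per))
)
design
```

Each row switches from 0 to 1 exactly once and never switches back, which is the defining feature of the design.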

Plot effect estimates by exposure time for one or more models

Description

Plot effect estimates by exposure time for one or more models.

Usage

plot_effect_curves(..., labels = NA, facet_nrow = 1)

Arguments

...

One or more objects of class "sw_analysis" returned by analyze.

labels

A character vector of length equal to the length of list(...), representing plot labels. Only used if length(list(...))>1.

facet_nrow

Number of rows for displaying plots using ggplot2::facet_wrap().

Value

A plot of the effect curve for each "sw_analysis" object passed to the function.

Examples

# Load data
test_data <- load_data(time = "period", cluster_id = "cluster", individual_id = NULL,
  treatment = "trt", outcome = "outcome_cont", data = sw_data_example)

IT_model <- analyze(dat = test_data, method = "mixed", estimand_type = "TATE",
  estimand_time = c(1, 4), exp_time = "IT")
ETI_model <- analyze(dat = test_data, method = "mixed", estimand_type = "TATE",
  estimand_time = c(1, 4), exp_time = "ETI")
NCS_4_model <- analyze(dat = test_data, method = "mixed", estimand_type = "TATE",
  estimand_time = c(1, 4), exp_time = "NCS", advanced = params(n_knots_exp = 4))
plot_effect_curves(IT_model, NCS_4_model, ETI_model, facet_nrow = 1)
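
The shape of each plotted curve follows from how the exposure time model codes treatment. The base R sketch below constructs illustrative indicator columns for the "IT", "ETI", and "DCT" models described under 'analyze()'. The column names and the toy exposure times are hypothetical, and the package's internal coding may differ.

```r
# Exposure times for a hypothetical cluster: 0 = control, 1..4 = treated.
s <- 0:4

# IT: a single indicator, so the effect curve is flat after crossover.
x_it <- cbind(treated = as.integer(s > 0))

# ETI: one indicator per exposure time, so each time has its own effect.
x_eti <- sapply(1:4, function(k) as.integer(s == k))
colnames(x_eti) <- paste0("exp_", 1:4)

# DCT with w = 1: a separate indicator for the single washout period,
# then one shared indicator for all later exposure times.
w <- 1
x_dct <- cbind(
  sapply(seq_len(w), function(k) as.integer(s == k)),
  as.integer(s > w)
)
colnames(x_dct) <- c(paste0("washout_", seq_len(w)), "constant")
```

Under this coding, "IT" yields one effect parameter, "ETI" yields one per exposure time, and "DCT" yields 'w' washout parameters plus one constant effect.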

Example stepped wedge data

Description

Data generated for the purpose of demonstrating the steppedwedge package

Usage

sw_data_example

Format

A data frame with 2,063 rows and 5 columns:

cluster

Cluster id

period

Time period

trt

Treatment indicator

outcome_bin

Binary outcome

outcome_cont

Continuous outcome


Example stepped wedge data with aggregated outcomes

Description

A simulated stepped wedge dataset formatted with separate columns for the number of successes (numerator) and number of trials (denominator).

Usage

sw_data_example_binom

Format

A data frame with 90 rows and 5 columns:

cluster

Cluster id

period

Time period

trt

Treatment indicator

denominator

Denominator (# trials)

numerator

Numerator (# successes)
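
Individual-level binary data can be collapsed into this numerator/denominator layout with base R. The sketch below uses a small hypothetical data frame ('ind' and its values are made up for illustration) and is not how 'sw_data_example_binom' itself was generated.

```r
# Toy individual-level data with a binary outcome (hypothetical values).
ind <- data.frame(
  cluster = c(1, 1, 1, 1, 2, 2, 2),
  period = c(1, 1, 2, 2, 1, 1, 1),
  trt = c(0, 0, 1, 1, 0, 0, 0),
  y = c(1, 0, 1, 1, 0, 0, 1)
)

# One row per cluster-period: successes (numerator) and trials (denominator).
agg <- aggregate(
  y ~ cluster + period + trt,
  data = ind,
  FUN = function(v) c(numerator = sum(v), denominator = length(v))
)

# aggregate() stores the two summaries as a matrix column; flatten it.
agg <- cbind(agg[c("cluster", "period", "trt")], as.data.frame(agg$y))
agg
```

Data in this form can then be passed to 'load_data()' with 'outcome = c("numerator", "denominator")'.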
