BMC Medical Research Methodology (Aug 2024)

A simple and effective method for simulating nested exchangeable correlated binary data for longitudinal cluster randomised trials

  • Rhys A. Bowden,
  • Jessica Kasza,
  • Andrew B. Forbes

DOI
https://doi.org/10.1186/s12874-024-02285-4
Journal volume & issue
Vol. 24, no. 1
pp. 1 – 11

Abstract

Background: Simulation is an important tool for assessing the performance of statistical methods for the analysis of data and for the planning of studies. While methods are available for the simulation of correlated binary random variables, all have significant practical limitations for simulating outcomes from longitudinal cluster randomised trial designs, such as the cluster randomised crossover and the stepped wedge trial designs. For these trial designs, as the number of observations in each cluster increases, these methods either become computationally infeasible or have a range of allowable correlations that rapidly shrinks to zero.

Methods: In this paper we present a simple method for simulating binary random variables with a specified vector of prevalences and correlation matrix. This method allows for the outcome prevalence to change due to treatment or over time, and for a ‘nested exchangeable’ correlation structure, in which observations in the same cluster are more highly correlated if they are measured in the same time period than in different time periods, and where different individuals are measured in each time period. This means that our method is also applicable to more general hierarchical clustered data contexts, such as students within classrooms within schools. The method is demonstrated by simulating 1000 datasets with parameters matching those derived from data from a cluster randomised crossover trial assessing two variants of stress ulcer prophylaxis.

Results: Our method is orders of magnitude faster than the best-known general simulation method while also allowing a much wider range of correlations than alternative methods. An implementation of our method is available in the R package NestBin.

Conclusions: This simulation method is the first to allow for practical and efficient simulation of large datasets of binary outcomes with the commonly used nested exchangeable correlation structure. This will allow for much more effective testing of designs and inference methods for longitudinal cluster randomised trials with binary outcomes.
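
To make the ‘nested exchangeable’ correlation structure concrete, the following is a minimal sketch in Python with NumPy that builds the target correlation matrix for a single cluster. It is an illustration only and is not the authors' simulation method or the NestBin interface: the function name `nested_exchangeable_corr`, its parameters, and the example correlation values are all assumptions made here for demonstration.

```python
import numpy as np


def nested_exchangeable_corr(n_periods, n_per_period, rho_within, rho_between):
    """Correlation matrix for one cluster (hypothetical helper, illustration only).

    Observations from the same period share correlation rho_within;
    observations from different periods share correlation rho_between.
    """
    # Period label of each observation: 0, 0, ..., 1, 1, ...
    period = np.repeat(np.arange(n_periods), n_per_period)
    # True where a pair of observations falls in the same period
    same_period = period[:, None] == period[None, :]
    corr = np.where(same_period, rho_within, rho_between).astype(float)
    # Each observation is perfectly correlated with itself
    np.fill_diagonal(corr, 1.0)
    return corr


# Assumed example values: 2 periods, 3 individuals per period
R = nested_exchangeable_corr(n_periods=2, n_per_period=3,
                             rho_within=0.05, rho_between=0.02)
print(R)

# A valid correlation matrix must be positive semi-definite
min_eigenvalue = np.linalg.eigvalsh(R).min()
print(min_eigenvalue > 0)
```

The matrix has a block structure: diagonal blocks hold the within-period correlation and off-diagonal blocks hold the smaller between-period correlation. The eigenvalue check at the end is one way to confirm that a chosen pair of correlations gives a valid target matrix before attempting any simulation.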

Keywords