BMC Medical Research Methodology (Jul 2019)
Imputation strategies when a continuous outcome is to be dichotomized for responder analysis: a simulation study
Abstract
Abstract Background In many clinical trials continuous outcomes are dichotomized to compare proportions of patients who respond. A common and recommended approach to handling missing data in responder analysis is to impute as non-responders, despite known biases. Multiple imputation is another natural choice but when a continuous outcome is ultimately dichotomized, the specifications of the imputation model come into question. Practitioners can either impute the missing outcome before dichotomizing or dichotomize then impute. In this study we compared multiple imputation of the continuous and dichotomous forms of the outcome, and imputing responder status as non-response in responder analysis. Methods We simulated four response profiles representing a two-arm randomized controlled trial with a continuous outcome at four time points. We omitted data using six missing at random mechanisms, and imputed missing observations three ways: 1) replacing as non-responder; 2) multiply imputing before dichotomizing; and 3) multiply imputing the dichotomized response. Imputation models included the continuous response at all timepoints, and additional auxiliary variables for some scenarios. We assessed bias, power, coverage of the 95% confidence interval, and type 1 error. Finally, we applied these methods to a longitudinal trial for patients with major depressive disorder. Results Both forms of multiple imputation performed better than non-response imputation in terms of bias and type 1 error. When approximately 30% of responses were missing, bias was less than 7.3% for multiple imputation scenarios but when 50% of responses were missing, imputing before dichotomizing generally had lower bias compared to dichotomizing before imputing. Non-response imputation resulted in biased estimates, both underestimates and overestimates. In the example trial data, non-response imputation estimated a smaller difference in proportions than multiply imputed approaches. Conclusions With moderate amounts of missing data, multiply imputing the continuous outcome variable prior to dichotomizing performed similar to multiply imputing the binary responder status. With higher rates of missingness, multiply imputing the continuous variable was less biased and had well-controlled coverage probabilities of the 95% confidence interval compared to imputing the dichotomous response. In general, multiple imputation using the longitudinally measured continuous outcome in the imputation model performed better than imputing missing observations as non-responders.
Keywords