BMC Medical Research Methodology (Jul 2016)

How to deal with missing longitudinal data in cost of illness analysis in Alzheimer’s disease—suggestions from the GERAS observational study

  • Mark Belger,
  • Josep Maria Haro,
  • Catherine Reed,
  • Michael Happich,
  • Kristin Kahle-Wrobleski,
  • Josep Maria Argimon,
  • Giuseppe Bruno,
  • Richard Dodel,
  • Roy W Jones,
  • Bruno Vellas,
  • Anders Wimo

DOI
https://doi.org/10.1186/s12874-016-0188-1
Journal volume & issue
Vol. 16, no. 1
pp. 1 – 11

Abstract

Background: Missing data are a common problem in prospective studies with a long follow-up, and the volume, pattern and reasons for missing data may be relevant when estimating the cost of illness. We aimed to evaluate how different methods for dealing with missing longitudinal cost data, and different methods for costing caregiver time, affect total societal costs in Alzheimer’s disease (AD).

Methods: GERAS is an 18-month observational study of costs associated with AD. Total societal costs included patient health and social care costs, and caregiver health and informal care costs. Missing data were classified as missing completely at random (MCAR), missing at random (MAR) or missing not at random (MNAR). Simulation datasets were generated from baseline data with 10–40 % missing total cost data for each missing data mechanism. Datasets were also simulated to reflect the missing cost data pattern at 18 months using MAR and MNAR assumptions. Naïve and multiple imputation (MI) methods were applied to each dataset and the results were compared with complete GERAS 18-month cost data. Opportunity and replacement cost approaches were used for caregiver time, which was costed with and without supervision included, and with time costed for working caregivers only.

Results: Total costs were available for 99.4 % of 1497 patients at baseline. For MCAR datasets, naïve methods performed as well as MI methods. For MAR datasets, MI methods performed better than naïve methods. All imputation approaches performed poorly for MNAR data. For all approaches, percentage bias increased with the volume of missing data. For datasets reflecting 18-month patterns, a combination of imputation methods provided more accurate cost estimates (e.g. bias: −1 % vs −6 % for a single MI method), although different approaches to costing caregiver time had a greater impact on estimated costs (29–43 % increase over the base case estimate).

Conclusions: The methods used to impute missing cost data in AD will affect the accuracy of cost estimates, although varying the approach to costing informal caregiver time has the greatest impact on total costs. Tailoring imputation methods to the reason for missing data will further our understanding of the best analytical approach for studies involving cost outcomes.
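The three missing data mechanisms named in the Methods can be illustrated with a small simulation. The sketch below is not the GERAS simulation design and uses no GERAS data: the synthetic cost model, the `severity` covariate, the helper names `simulate_missing` and `percent_bias`, and the 30 % missingness level are all assumptions chosen for illustration. It deletes values under each mechanism and reports the percentage bias of a naïve complete-case mean against the known full-data mean.

```python
import numpy as np


def simulate_missing(cost, covariate, mechanism, frac, rng):
    """Return a copy of `cost` with a fraction `frac` of values set to NaN.

    MCAR: every record is equally likely to be dropped.
    MAR:  drop probability rises with an observed covariate.
    MNAR: drop probability rises with the (unobserved) cost itself.
    """
    n = cost.size
    if mechanism == "MCAR":
        score = np.zeros(n)
    elif mechanism == "MAR":
        score = (covariate - covariate.mean()) / covariate.std()
    elif mechanism == "MNAR":
        log_cost = np.log(cost)
        score = (log_cost - log_cost.mean()) / log_cost.std()
    else:
        raise ValueError(f"unknown mechanism: {mechanism}")

    # Convert scores to selection probabilities and drop exactly frac * n records.
    weights = np.exp(score)
    n_drop = int(round(frac * n))
    drop = rng.choice(n, size=n_drop, replace=False, p=weights / weights.sum())

    observed = cost.astype(float)  # astype returns a new array
    observed[drop] = np.nan
    return observed


def percent_bias(estimate, truth):
    """Percentage bias of an estimate relative to the known true value."""
    return 100.0 * (estimate - truth) / truth


# Synthetic cohort (assumption): right-skewed costs that increase with severity.
rng = np.random.default_rng(0)
n = 5000
severity = rng.normal(size=n)
cost = np.exp(7.0 + 0.5 * severity + 0.5 * rng.normal(size=n))
true_mean = cost.mean()

bias = {}
for mechanism in ("MCAR", "MAR", "MNAR"):
    observed = simulate_missing(cost, severity, mechanism, 0.30, rng)
    # Naive complete-case estimate: mean of the records that remain.
    bias[mechanism] = percent_bias(np.nanmean(observed), true_mean)
    print(f"{mechanism}: complete-case bias = {bias[mechanism]:.1f} %")
```

In this toy setup the complete-case mean stays close to the truth under MCAR and is pulled downward under MAR and MNAR, because high-cost records are the ones most likely to be dropped. Under MAR the dropout depends only on the observed covariate, which is the situation in which a model-based method such as MI has the information it needs to correct the estimate.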

Keywords