BMC Medical Research Methodology (Mar 2024)

Predicting health outcomes with intensive longitudinal data collected by mobile health devices: a functional principal component regression approach

  • Qing Yang,
  • Meilin Jiang,
  • Cai Li,
  • Sheng Luo,
  • Matthew J. Crowley,
  • Ryan J. Shaw

DOI
https://doi.org/10.1186/s12874-024-02193-7
Journal volume & issue
Vol. 24, no. 1
pp. 1 – 10

Abstract


Background: Intensive longitudinal data (ILD) collected in near real time by mobile health devices provide a new opportunity for monitoring chronic diseases, early disease risk prediction, and disease prevention in health research. Functional data analysis, specifically functional principal component analysis, has great potential to extract trends in ILD but has not been used extensively in mobile health research.

Objective: To introduce functional principal component analysis (fPCA) and demonstrate its potential applicability in estimating trends in ILD collected by mobile health devices, assessing the longitudinal association between ILD and health outcomes, and predicting health outcomes.

Methods: fPCA and scalar-to-function regression models were reviewed. A case study illustrated the process of extracting trends in intensively self-measured blood glucose using fPCA and then predicting future HbA1c values in patients with type 2 diabetes using a scalar-to-function regression model.

Results: Based on the scalar-to-function regression model results, there was a slightly increasing trend between daily blood glucose measures and HbA1c. Blood glucose values measured before breakfast over the three preceding months predicted 61% of the variation in HbA1c (P < 0.0001, $R^{2}_{\text{adjusted}} = 0.61$).

Conclusions: Functional data analysis, specifically fPCA, offers a unique tool to capture patterns in ILD collected by mobile health devices. It is particularly useful in assessing the longitudinal dynamic association between repeated measures and outcomes, and it can be easily integrated into prediction models to improve prediction precision.
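The two-step workflow named in the Methods (fPCA to summarize each trajectory, then regression of a scalar outcome on the component scores) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the data are synthetic, every patient is assumed to have one reading per day on a common 90-day grid (real ILD are often sparse or irregular and may need smoothing-based estimators), and the names `fpca` and `fit_ols` are hypothetical helpers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for ILD (assumption): n patients, one reading per day, T days.
n, T = 200, 90
t = np.linspace(0.0, 1.0, T)
level = rng.normal(0.0, 1.0, n)   # patient-specific overall level
slope = rng.normal(0.0, 0.5, n)   # patient-specific drift over the window
X = (130 + 20 * level[:, None] + 15 * slope[:, None] * (t - 0.5)
     + rng.normal(0.0, 5.0, (n, T)))
# Hypothetical scalar outcome measured after the window.
y = 7.0 + 0.8 * level + 0.3 * slope + rng.normal(0.0, 0.3, n)


def fpca(X, k):
    """Discretized fPCA on a dense common grid via SVD of the centered curves."""
    mean_curve = X.mean(axis=0)
    Xc = X - mean_curve
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]               # rows: estimated principal component curves
    scores = Xc @ components.T        # per-patient scores on each component
    explained = (s ** 2 / np.sum(s ** 2))[:k]
    return mean_curve, components, scores, explained


def fit_ols(scores, y):
    """Ordinary least squares of the scalar outcome on the fPCA scores."""
    Z = np.column_stack([np.ones(len(y)), scores])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    n_obs, n_coef = Z.shape           # n_coef counts the intercept
    r2_adj = 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_coef)
    return beta, r2, r2_adj


k = 2
mean_curve, components, scores, explained = fpca(X, k)
beta, r2, r2_adj = fit_ols(scores, y)

# Coefficient function on the grid: a weighted sum of the component curves.
beta_t = components.T @ beta[1:]
Xc = X - mean_curve
y_hat = beta[0] + Xc @ beta_t         # same fitted values as the score regression

print("variance explained by components:", explained)
print("R^2:", r2, "adjusted R^2:", r2_adj)
```

Plotting `beta_t` against `t` shows how the estimated association between the measurements and the outcome varies across the observation window, which is the kind of longitudinal association the abstract describes. The number of components `k` is a tuning choice here and is not taken from the paper.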