PLoS ONE (Jan 2019)

Predicting atrial fibrillation in primary care using machine learning.

  • Nathan R Hill,
  • Daniel Ayoubkhani,
  • Phil McEwan,
  • Daniel M Sugrue,
  • Usman Farooqui,
  • Steven Lister,
  • Matthew Lumley,
  • Ameet Bakhai,
  • Alexander T Cohen,
  • Mark O'Neill,
  • David Clifton,
  • Jason Gordon

DOI
https://doi.org/10.1371/journal.pone.0224582
Journal volume & issue
Vol. 14, no. 11
p. e0224582

Abstract

BACKGROUND: Atrial fibrillation (AF) is the most common sustained heart arrhythmia. However, as many cases are asymptomatic, a large proportion of patients remain undiagnosed until serious complications arise. Efficient, cost-effective detection of the undiagnosed may be supported by risk-prediction models relating patient factors to AF risk. However, there exists a need for an implementable risk model that is contemporaneous and informed by routinely collected patient data, reflecting the real-world pathology of AF.

METHODS: This study sought to develop and evaluate novel and conventional statistical and machine learning models for risk prediction of AF. This was a retrospective cohort study of adults (aged ≥30 years) without a history of AF, listed on the Clinical Practice Research Datalink, from January 2006 to December 2016. Models evaluated included published risk models (Framingham, ARIC, CHARGE-AF), machine learning models that evaluated baseline and time-updated information (neural network, LASSO, random forests, support vector machines), and Cox regression.

RESULTS: Analysis of 2,994,837 individuals (3.2% AF) identified the time-varying neural network as the optimal model, achieving an AUROC of 0.827 compared with 0.725 for the best existing model, CHARGE-AF, and a number needed to screen of 9 vs. 13 patients at 75% sensitivity. The optimal model confirmed known baseline risk factors (age, previous cardiovascular disease, antihypertensive medication usage) and identified additional time-varying predictors (proximity of cardiovascular events, body mass index (both levels and changes), pulse pressure, and the frequency of blood pressure measurements).

CONCLUSION: The optimal time-varying machine learning model exhibited greater predictive performance than existing AF risk models and reflected known and new patient risk factors for AF.
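The two headline metrics in the results, AUROC and number needed to screen (NNS) at a fixed sensitivity, can be illustrated with a small sketch. This is not the authors' code: the function names, the toy labels and scores, and the definition of NNS as patients flagged per true case found (the reciprocal of positive predictive value) are assumptions made here for illustration only.

```python
from typing import Sequence, Tuple


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random case outscores a random non-case (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def nns_at_sensitivity(
    labels: Sequence[int], scores: Sequence[float], target: float = 0.75
) -> Tuple[float, float, float]:
    """Return (threshold, sensitivity, NNS) at the highest score threshold
    whose sensitivity reaches `target`.

    Assumption: NNS = patients flagged / true cases among them (1 / PPV).
    """
    if not 0.0 < target <= 1.0:
        raise ValueError("target sensitivity must be in (0, 1]")
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one case")
    for thr in sorted(set(scores), reverse=True):
        flagged = [y for y, s in zip(labels, scores) if s >= thr]
        true_pos = sum(flagged)
        sensitivity = true_pos / total_pos
        if sensitivity >= target:
            return thr, sensitivity, len(flagged) / true_pos
    raise ValueError("target sensitivity not reachable")


# Hypothetical toy data: 1 = developed AF, 0 = did not; scores are model risks.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.5, 0.4, 0.2, 0.2, 0.1, 0.1, 0.05]

if __name__ == "__main__":
    print("AUROC:", auroc(toy_labels, toy_scores))
    print("threshold, sensitivity, NNS:", nns_at_sensitivity(toy_labels, toy_scores))
```

Under this definition, a lower NNS at the same sensitivity means fewer patients are flagged for each AF case found, which is how the reported 9 vs. 13 comparison can be read.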