JMIR Medical Informatics (Nov 2024)

Using Machine Learning to Predict the Duration of Atrial Fibrillation: Model Development and Validation

  • Satoshi Shimoo,
  • Keitaro Senoo,
  • Taku Okawa,
  • Kohei Kawai,
  • Masahiro Makino,
  • Jun Munakata,
  • Nobunari Tomura,
  • Hibiki Iwakoshi,
  • Tetsuro Nishimura,
  • Hirokazu Shiraishi,
  • Keiji Inoue,
  • Satoaki Matoba

DOI: https://doi.org/10.2196/63795
Journal volume & issue: Vol. 12, p. e63795

Abstract

Background: Atrial fibrillation (AF) is a progressive disease, and its clinical type is classified according to the AF duration: paroxysmal AF, persistent AF (PeAF; AF duration of less than 1 year), and long-standing persistent AF (AF duration of more than 1 year). When considering the indication for catheter ablation, a long AF duration is considered a risk factor for recurrence; the duration of AF is therefore an important factor in determining the treatment strategy for PeAF.

Objective: This study aims to improve the accuracy of cardiologists’ diagnosis of the AF duration. The steps to achieve this goal are to develop a predictive model of the AF duration and to validate how well the model supports cardiologists’ diagnoses.

Methods: The study enrolled 272 patients with PeAF (aged 20-90 years), with data obtained between January 1, 2015, and December 31, 2023. After excluding 83 (30.5%) patients who met the exclusion criteria, 189 (69.5%) were included in the analysis. Of the 189 patients included, 145 (76.7%) were used as training data to build the machine learning (ML) model and 44 (23.3%) were used as test data to evaluate the predictive ability of the ML model. Using a questionnaire, 10 cardiologists (group A) evaluated whether each of the 44 test patients had AF of more than 1 year's duration (phase 1). Next, the same questionnaire was administered again after providing the ML model’s answer (phase 2). Subsequently, another 10 cardiologists (group B) were shown the test results of group A, were made aware of the limitations of their own diagnostic abilities, and were then administered the same 2-phase test as group A.

Results: The ML model's predictions on the test data had 81.8% accuracy (72% sensitivity and 89% specificity). The mean percentage of correct answers in group A was 63.9% (SD 9.6%) for phase 1 and improved to 71.6% (SD 9.3%) for phase 2 (P=.01). The mean percentage of correct answers in group B was 59.8% (SD 5.3%) for phase 1 and improved to 68.2% (SD 5.9%) for phase 2 (P=.007). The mean percentage of phase 2 answers that differed from the ML model’s prediction (that is, answers where cardiologists did not trust the ML model and relied on their own judgment) was 17.3% (SD 10.3%) in group A and 20.9% (SD 5%) in group B, which was not significantly different (P=.85).

Conclusions: ML models predicting AF duration improved the diagnostic ability of cardiologists. However, cardiologists did not rely entirely on the ML model’s prediction, even when they were aware of the limitations of their own diagnostic capability.
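The cohort percentages in the Methods, and the way accuracy, sensitivity, and specificity relate to one another, can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' code. The patient counts (272, 83, 145, 44) come from the abstract. The helper names `pct` and `binary_metrics` are introduced here for illustration, and the confusion-matrix counts in the last step are hypothetical values chosen only because they are consistent with the reported percentages. The abstract does not report them.

```python
def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Counts stated in the abstract.
enrolled = 272
excluded = 83
included = enrolled - excluded  # 189 patients analysed
train, test = 145, 44

cohort = {
    "included": pct(included, enrolled),
    "excluded": pct(excluded, enrolled),
    "train": pct(train, included),
    "test": pct(test, included),
}


def binary_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, and specificity (in %) from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": pct(tp + tn, total),
        "sensitivity": pct(tp, tp + fn),  # share of true positives detected
        "specificity": pct(tn, tn + fp),  # share of true negatives detected
    }


# HYPOTHETICAL counts (not reported in the abstract): 44 test patients,
# picked only to illustrate figures close to the reported 81.8% / 72% / 89%.
example = binary_metrics(tp=13, fn=5, tn=23, fp=3)

print(cohort)
print(example)
```

The cohort figures reproduce the percentages given in the Methods. The second dictionary only illustrates how the three reported metrics are defined and should not be read as the study's actual confusion matrix.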