PLoS ONE (Jan 2020)

Machine learning insight into the role of imaging and clinical variables for the prediction of obstructive coronary artery disease and revascularization: An exploratory analysis of the CONSERVE study.

  • Lohendran Baskaran,
  • Xiaohan Ying,
  • Zhuoran Xu,
  • Subhi J Al'Aref,
  • Benjamin C Lee,
  • Sang-Eun Lee,
  • Ibrahim Danad,
  • Hyung-Bok Park,
  • Ravi Bathina,
  • Andrea Baggiano,
  • Virginia Beltrama,
  • Rodrigo Cerci,
  • Eui-Young Choi,
  • Jung-Hyun Choi,
  • So-Yeon Choi,
  • Jason Cole,
  • Joon-Hyung Doh,
  • Sang-Jin Ha,
  • Ae-Young Her,
  • Cezary Kepka,
  • Jang-Young Kim,
  • Jin-Won Kim,
  • Sang-Wook Kim,
  • Woong Kim,
  • Yao Lu,
  • Amit Kumar,
  • Ran Heo,
  • Ji Hyun Lee,
  • Ji-Min Sung,
  • Uma Valeti,
  • Daniele Andreini,
  • Gianluca Pontone,
  • Donghee Han,
  • Todd C Villines,
  • Fay Lin,
  • Hyuk-Jae Chang,
  • James K Min,
  • Leslee J Shaw

DOI
https://doi.org/10.1371/journal.pone.0233791
Journal volume & issue
Vol. 15, no. 6
p. e0233791

Abstract

Background
Machine learning (ML) can extract patterns from data and develop algorithms to construct data-driven models. We use ML models to gain insight into the relative importance of variables for predicting obstructive coronary artery disease (CAD) in the Coronary Computed Tomographic Angiography for Selective Cardiac Catheterization (CONSERVE) study, and to compare prediction of obstructive CAD against the CAD consortium clinical score (CAD2). We further perform ML analysis to gain insight into the role of imaging and clinical variables for revascularization.

Methods
For prediction of obstructive CAD, the entire invasive coronary angiography (ICA) arm of the study, comprising 719 patients, was used. For revascularization, 1,028 patients were randomized to ICA or coronary computed tomographic angiography (CCTA). Data were randomly split into 80% training and 20% test sets for model building and validation. Models used extreme gradient boosting (XGBoost).

Results
Mean age was 60.6 ± 11.5 years and 64.3% were female. For the prediction of obstructive CAD, the area under the receiver operating characteristic curve (AUC) was significantly higher for ML at 0.779 (95% CI: 0.672-0.886) than for CAD2 at 0.696 (95% CI: 0.594-0.798) (P = 0.01). BMI, age, and angina severity were the most important variables. For revascularization, the model obtained an overall AUC of 0.958 (95% CI: 0.933-0.983). Performance did not differ whether the imaging parameters used were from ICA (AUC 0.947, 95% CI: 0.903-0.990) or CCTA (AUC 0.941, 95% CI: 0.895-0.988) (P = 0.90). The ML model obtained sensitivity and specificity of 89.2% and 92.9%, respectively. Number of vessels with ≥70% stenosis, maximum segment stenosis severity (SSS), and body mass index (BMI) were the most important variables. Exclusion of imaging variables resulted in performance deterioration, with an AUC of 0.705 (95% CI: 0.614-0.795) (P

Conclusions
For obstructive CAD, the ML model outperformed CAD2. BMI is an important variable, although it is currently not included in most scores. In this ML model, imaging variables were most associated with revascularization. Imaging modality did not influence model performance. Removal of imaging variables reduced model performance.
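
The evaluation quantities named in the abstract (an 80%/20% random split, AUC, sensitivity, and specificity) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the toy labels and scores, the fixed seed, and the 0.5 decision threshold are all assumptions made for illustration, and no XGBoost model is trained here.

```python
import random


def split_80_20(rows, seed=0):
    """Shuffle a copy of rows and split it into 80% training / 20% test."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seed is an illustrative assumption
    cut = int(round(0.8 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def auc(labels, scores):
    """AUC as the chance that a random positive case outscores a random
    negative case, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold=0.5):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    The 0.5 threshold is an assumption, not a value from the study."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: 1 = outcome present, scores = predicted probabilities.
toy_labels = [1, 1, 1, 0, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]

toy_auc = auc(toy_labels, toy_scores)                            # 0.9375
toy_sens, toy_spec = sensitivity_specificity(toy_labels, toy_scores)  # 0.75, 0.75
train_ids, test_ids = split_80_20(range(10))                     # 8 and 2 items
```

In practice the scores would come from a fitted classifier evaluated on the held-out 20%; the pairwise AUC shown here is quadratic in sample size and is meant only to make the definition concrete.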