Diagnostics (Aug 2024)
A Comparison of Interpretable Machine Learning Approaches to Identify Outpatient Clinical Phenotypes Predictive of First Acute Myocardial Infarction
Abstract
Background: Acute myocardial infarctions are deadly to patients and burdensome to healthcare systems. Most recorded infarctions are patients’ first, occur out of the hospital, and are often not accompanied by cardiac comorbidities. The clinical manifestations of the underlying pathophysiology leading to an infarction are not fully understood, and little effort has been made to use explainable machine learning to learn predictive clinical phenotypes before hospitalization is needed. Methods: We extracted outpatient electronic health record data for 2641 case and 5287 matched-control patients, all without pre-existing cardiac diagnoses, from the Michigan Medicine Health System. We compared six interpretable feature extraction approaches, including temporal computational phenotyping, and trained seven interpretable machine learning models to predict the onset of first acute myocardial infarction within six months. Results: Using temporal computational phenotypes significantly improved model performance compared to alternative approaches. The mean cross-validation test-set area under the receiver operating characteristic curve reached values as high as 0.674. The most consistently predictive phenotypes of a future infarction included back pain, cardiometabolic syndrome, family history of cardiovascular diseases, and high blood pressure. Conclusions: Computational phenotyping of longitudinal health records can improve classifier performance and identify predictive clinical concepts. State-of-the-art interpretable machine learning approaches can augment acute myocardial infarction risk assessment and prioritize potential risk factors for further investigation and validation.
Keywords