BMC Pulmonary Medicine (Apr 2023)
The combination of supervised and unsupervised learning based risk stratification and phenotyping in pulmonary arterial hypertension—a long-term retrospective multicenter trial
Abstract
Background: Accurate risk stratification in pulmonary arterial hypertension (PAH), a devastating cardiopulmonary disease, is essential to guide successful therapy. Machine learning may improve risk management and harness clinical variability in PAH.
Methods: We conducted a long-term retrospective observational study (median follow-up: 67 months) including 183 PAH patients from three Austrian PAH expert centers. Clinical, cardiopulmonary function, laboratory, imaging, and hemodynamic parameters were assessed. Cox proportional hazards Elastic Net regression and partitioning around medoids (PAM) clustering were applied to establish a multi-parameter PAH mortality risk signature and to investigate PAH phenotypes.
Results: Seven parameters identified by Elastic Net modeling, namely age, six-minute walking distance, red blood cell distribution width, cardiac index, pulmonary vascular resistance, N-terminal pro-brain natriuretic peptide, and right atrial area, constituted a highly predictive mortality risk signature (training cohort: concordance index = 0.82 [95% CI: 0.75–0.89]; test cohort: 0.77 [95% CI: 0.66–0.88]). The Elastic Net signature demonstrated superior prognostic accuracy compared with five established risk scores. The signature factors defined two clusters of PAH patients with distinct risk profiles. The high-risk/poor-prognosis cluster was characterized by advanced age at diagnosis, poor cardiac output, increased red blood cell distribution width, higher pulmonary vascular resistance, and poor six-minute walking test performance.
Conclusion: Supervised and unsupervised learning algorithms such as Elastic Net regression and medoid clustering are powerful tools for automated mortality risk prediction and clinical phenotyping in PAH.
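The concordance index reported above measures how well a risk score orders patients by survival. As an illustration only, the following minimal Python sketch computes Harrell's concordance index for right-censored data. It is not the authors' implementation: the function name `concordance_index`, the decision to skip pairs with tied follow-up times, and the toy values in the usage lines are all assumptions made for this example and are not taken from the study.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable patient pairs in which the patient
    with the higher risk score has the shorter observed survival time.

    times       -- follow-up time per patient
    events      -- 1 if death was observed, 0 if the patient was censored
    risk_scores -- model output; higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` is the patient with the shorter follow-up.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        # A pair is comparable only if the shorter follow-up ended in an
        # observed death. Pairs with tied times are skipped (a simplification
        # assumed for this sketch).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # higher risk died earlier: concordant
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data (not from the study): months of follow-up,
# death indicator, and two alternative risk scores.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
perfect_scores = [0.9, 0.7, 0.5, 0.1]
mixed_scores = [0.9, 0.1, 0.5, 0.7]

c_perfect = concordance_index(toy_times, toy_events, perfect_scores)
c_mixed = concordance_index(toy_times, toy_events, mixed_scores)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the reported 0.82 (training) and 0.77 (test) should be read.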
Keywords