ERJ Open Research (Sep 2023)
Machine learning to differentiate pulmonary hypertension due to left heart disease from pulmonary arterial hypertension
Abstract
Background and aims: Pulmonary hypertension due to left heart disease (PH-LHD) is the most frequent form of PH. Because distinguishing it from pulmonary arterial hypertension (PAH) has therapeutic implications, it is important to differentiate PH-LHD from PAH accurately and noninvasively before referral to PH centres. The aim was to develop and validate a machine learning (ML) model to improve prediction of PH-LHD in a population of PAH and PH-LHD patients.

Methods: Noninvasive PH-LHD predictors from 172 PAH and 172 PH-LHD patients from the PH centre database at the University Hospitals of Leuven (Leuven, Belgium) were used to develop an ML model. The Jacobs score was used as the performance benchmark. The dataset was split into training and test sets (70:30), and the best model was selected after 10-fold cross-validation on the training dataset (n=240). The final model was externally validated using 165 patients (91 PAH, 74 PH-LHD) from Erasme Hospital (Brussels, Belgium).

Results: In the internal test dataset (n=104), a random forest-based model correctly diagnosed 70% of PH-LHD patients (sensitivity: n=35/50), with 100% positive predictive value, 78% negative predictive value and 100% specificity. The model outperformed the Jacobs score, which identified 18% (n=9/50) of the patients with PH-LHD without false positives. In external validation, the model had 64% sensitivity at 100% specificity, while the Jacobs score had a sensitivity of 3% with no false positives.

Conclusions: ML significantly improves the sensitivity of PH-LHD prediction at 100% specificity. Such a model may substantially reduce the number of patients referred for invasive diagnostics without missing PAH diagnoses.
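The internal test-set figures reported in the Results can be checked with a short calculation. The sketch below is illustrative only and is not the authors' code: the class `ConfusionCounts` and the helper `as_percent` are hypothetical names, and the count of 54 PAH patients is not stated in the abstract but derived from it (104 test patients minus 50 with PH-LHD). With no false positives, all 54 PAH patients are assumed to be true negatives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary classifier where 'positive' means PH-LHD."""

    tp: int  # PH-LHD patients correctly identified
    fn: int  # PH-LHD patients missed
    fp: int  # PAH patients wrongly labelled PH-LHD
    tn: int  # PAH patients correctly not labelled PH-LHD

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    @property
    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)


def as_percent(value: float) -> int:
    """Round a proportion to a whole-number percentage."""
    return round(100 * value)


# Reported in the abstract: internal test set n=104, of which 50 PH-LHD.
N_TEST = 104
N_PH_LHD = 50
N_PAH = N_TEST - N_PH_LHD  # derived here, not stated in the abstract

# Random forest model: 35/50 PH-LHD identified, no false positives.
random_forest = ConfusionCounts(tp=35, fn=N_PH_LHD - 35, fp=0, tn=N_PAH)

# Jacobs score benchmark: 9/50 PH-LHD identified, no false positives.
jacobs = ConfusionCounts(tp=9, fn=N_PH_LHD - 9, fp=0, tn=N_PAH)

if __name__ == "__main__":
    for name, counts in (("random forest", random_forest), ("Jacobs score", jacobs)):
        print(
            f"{name}: sensitivity {as_percent(counts.sensitivity)}%, "
            f"specificity {as_percent(counts.specificity)}%"
        )
    print(
        f"random forest: PPV {as_percent(random_forest.ppv)}%, "
        f"NPV {as_percent(random_forest.npv)}%"
    )
```

Under these assumptions the calculation reproduces the reported 70% sensitivity, 100% specificity, 100% positive predictive value and 78% negative predictive value (54/69) for the random forest model, and the 18% sensitivity of the Jacobs score.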