BMC Pulmonary Medicine (Aug 2022)

Use of machine learning models to predict prognosis of combined pulmonary fibrosis and emphysema in a Chinese population

  • Qing Liu,
  • Di Sun,
  • Yu Wang,
  • Pengfei Li,
  • Tianci Jiang,
  • Lingling Dai,
  • Mengjie Duo,
  • Ruhao Wu,
  • Zhe Cheng

DOI
https://doi.org/10.1186/s12890-022-02124-6
Journal volume & issue
Vol. 22, no. 1
pp. 1 – 11

Abstract

Background Combined pulmonary fibrosis and emphysema (CPFE) is a novel clinical entity with a poor prognosis. This study aimed to develop a clinical nomogram to predict the 1-, 2- and 3-year mortality of patients with CPFE using a machine learning approach, and to validate the predictive ability of the interstitial lung disease-gender-age-lung physiology (ILD-GAP) model in CPFE.
Methods The data of CPFE patients from January 2015 to October 2021 who met the inclusion criteria were retrospectively collected. We used LASSO regression and multivariable Cox regression analysis to identify the variables associated with the prognosis of CPFE and to generate a nomogram. Harrell's C index, the calibration curve and the area under the receiver operating characteristic (ROC) curve (AUC) were used to evaluate the performance of the nomogram. We then performed the likelihood ratio test, net reclassification improvement (NRI), integrated discrimination improvement (IDI) and decision curve analysis (DCA) to compare the performance of the nomogram with that of the ILD-GAP model.
Results A total of 184 patients with CPFE were enrolled. During follow-up, 90 patients died. After variable screening, age, diffusing capacity of the lung for carbon monoxide (DLCO), right ventricular diameter (RVD), C-reactive protein (CRP) and globulin were found to be associated with the prognosis of CPFE. The nomogram was then developed by incorporating these five variables, and it performed well, with a Harrell's C index of 0.757 and an AUC of 0.800 (95% CI 0.736–0.863). Moreover, the calibration plot of the nomogram showed good concordance between the predicted probabilities and the actual observations. The nomogram also showed better discrimination than the ILD-GAP model alone, as substantiated by the likelihood ratio test, NRI and IDI. DCA demonstrated the significant clinical utility of the nomogram.
Conclusion Age, DLCO, RVD, CRP and globulin were identified as being significantly associated with the prognosis of CPFE in our cohort. The nomogram incorporating the 5 variables showed good performance in predicting the mortality of CPFE. In addition, although the nomogram was superior to the ILD-GAP model in the present cohort, further validation is needed to determine the clinical utility of the nomogram.
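For readers unfamiliar with Harrell's C index, the discrimination measure reported above, the following minimal Python sketch shows how it is generally computed from follow-up times, event indicators and predicted risk scores. It is not the authors' code: the function name `harrell_c_index` and the toy data are hypothetical, and pairs with tied follow-up times are simply skipped as a simplification.

```python
def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable patient pairs that the risk score orders correctly.

    times       -- observed follow-up time for each patient
    events      -- 1 if the patient died, 0 if censored
    risk_scores -- predicted risk (higher means worse expected prognosis)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier death in a pair
        for j in range(n):
            # Comparable pair: patient i died strictly before patient j's observed time.
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # earlier death had the higher predicted risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data, invented for illustration only (not from the study cohort).
toy_times = [5, 8, 12, 20, 30]
toy_events = [1, 1, 0, 1, 0]
toy_risk = [0.9, 0.4, 0.6, 0.3, 0.1]
toy_c = harrell_c_index(toy_times, toy_events, toy_risk)
print(round(toy_c, 3))  # 7 of 8 comparable pairs are concordant -> 0.875
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives context for the reported index of 0.757.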

Keywords