PLoS ONE (Jan 2024)

Development and web deployment of prediction model for pulmonary arterial pressure in chronic thromboembolic pulmonary hypertension using machine learning.

  • Takaaki Matsunaga,
  • Atsushi Kono,
  • Mizuho Nishio,
  • Takahiro Yoshii,
  • Hidetoshi Matsuo,
  • Mai Takahashi,
  • Takuya Takahashi,
  • Yu Taniguchi,
  • Hidekazu Tanaka,
  • Kenichi Hirata,
  • Takamichi Murakami

DOI
https://doi.org/10.1371/journal.pone.0300716
Journal volume & issue
Vol. 19, no. 4
p. e0300716

Abstract

Background and purpose
Mean pulmonary artery pressure (mPAP) is a key index for chronic thromboembolic pulmonary hypertension (CTEPH). Using machine learning, we attempted to construct an accurate prediction model for mPAP in patients with CTEPH.

Methods
A total of 136 patients diagnosed with CTEPH, in whom mPAP had been measured, were included. The following patient data were used as explanatory variables in the model: basic patient information (age and sex), blood tests (brain natriuretic peptide (BNP)), echocardiography (tricuspid valve pressure gradient (TRPG)), and chest radiography (cardiothoracic ratio (CTR), right second arc ratio, and presence of avascular area). Seven machine learning methods, including linear regression, were used to build the multivariable prediction models. Additionally, prediction models were constructed using AutoML software. Of the 136 patients, two-thirds were used as the training set and one-third as the validation set. The R squared was averaged over 10 different splits of the data into training and validation sets.

Results
The optimal machine learning model was linear regression (averaged R squared, 0.360). The optimal combination of explanatory variables with linear regression was age, BNP level, TRPG level, and CTR (averaged R squared, 0.388). The R squared of the optimal multivariable linear regression model was higher than that of the univariable linear regression model with only TRPG.

Conclusion
We constructed a prediction model for mPAP in patients with CTEPH that was more accurate than a model using TRPG alone. The prediction performance of our model was improved by selecting the optimal machine learning method and combination of explanatory variables.
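
The evaluation protocol described in the Methods (repeated two-thirds/one-third splits, with R squared averaged over 10 splits) can be illustrated with a minimal sketch. This is not the authors' code: the data below are synthetic, the feature columns are only stand-ins for age, BNP, TRPG, and CTR, and the helper names (`fit_linear`, `predict_linear`, `averaged_r_squared`) are hypothetical. Ordinary least squares via NumPy is assumed here as the linear regression.

```python
import numpy as np


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot


def fit_linear(X, y):
    """Ordinary least squares with an intercept column (hypothetical helper)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(coef, X):
    """Apply coefficients returned by fit_linear to new rows."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef


def averaged_r_squared(X, y, n_splits=10, train_fraction=2 / 3, seed=0):
    """Average validation R squared over repeated random train/validation splits."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_train = int(round(n * train_fraction))
    scores = []
    for _ in range(n_splits):
        perm = rng.permutation(n)
        train, valid = perm[:n_train], perm[n_train:]
        coef = fit_linear(X[train], y[train])
        scores.append(r_squared(y[valid], predict_linear(coef, X[valid])))
    return float(np.mean(scores)), scores


# Synthetic data only (assumption): 136 rows to mirror the cohort size,
# four columns standing in for age, BNP, TRPG, and CTR.
rng = np.random.default_rng(42)
n_patients = 136
X = rng.normal(size=(n_patients, 4))
y = 40.0 + X @ np.array([2.0, 3.0, 5.0, 1.5]) + rng.normal(scale=6.0, size=n_patients)

# Multivariable model versus a univariable model on the TRPG stand-in (column 2).
# The same seed gives both models identical splits.
multi_r2, multi_scores = averaged_r_squared(X, y, seed=0)
uni_r2, uni_scores = averaged_r_squared(X[:, [2]], y, seed=0)

print(f"multivariable averaged R squared: {multi_r2:.3f}")
print(f"univariable averaged R squared:   {uni_r2:.3f}")
```

Averaging over several random splits reduces the dependence of the reported R squared on any single partition, which matters with a cohort of this size. The numbers printed by this sketch reflect the synthetic data only and have no bearing on the values reported in the abstract.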