Development and web deployment of prediction model for pulmonary arterial pressure in chronic thromboembolic pulmonary hypertension using machine learning.

Takaaki Matsunaga; Atsushi Kono; Mizuho Nishio; Takahiro Yoshii; Hidetoshi Matsuo; Mai Takahashi; Takuya Takahashi; Yu Taniguchi; Hidekazu Tanaka; Kenichi Hirata; Takamichi Murakami

doi:10.1371/journal.pone.0300716

PLoS ONE (Jan 2024)

DOI: https://doi.org/10.1371/journal.pone.0300716
Journal volume & issue: Vol. 19, no. 4
p. e0300716

Abstract

Background and purpose: Mean pulmonary artery pressure (mPAP) is a key index for chronic thromboembolic pulmonary hypertension (CTEPH). Using machine learning, we attempted to construct an accurate prediction model for mPAP in patients with CTEPH.

Methods: A total of 136 patients diagnosed with CTEPH, for whom mPAP was measured, were included. The following patient data were used as explanatory variables in the model: basic patient information (age and sex), blood tests (brain natriuretic peptide (BNP)), echocardiography (tricuspid valve pressure gradient (TRPG)), and chest radiography (cardiothoracic ratio (CTR), right second arc ratio, and presence of avascular area). Seven machine learning methods, including linear regression, were used for the multivariable prediction models. Additionally, prediction models were constructed using the AutoML software. Two-thirds of the 136 patients were used as the training set and one-third as the validation set. R squared was averaged over 10 different splits of the data into training and validation sets.

Results: The optimal machine learning model was linear regression (averaged R squared, 0.360). The optimal combination of explanatory variables with linear regression was age, BNP level, TRPG level, and CTR (averaged R squared, 0.388). The R squared of the optimal multivariable linear regression model was higher than that of the univariable linear regression model with only TRPG.

Conclusion: We constructed a prediction model for mPAP in patients with CTEPH that was more accurate than a model using TRPG only. The prediction performance of our model was improved by selecting the optimal machine learning method and combination of explanatory variables.
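
The evaluation protocol described in the abstract (repeated 2/3 to 1/3 splits, linear regression, R squared averaged over 10 splits, and a search over combinations of explanatory variables) can be illustrated with a minimal sketch. This is not the authors' code: the data below are synthetic, the coefficients used to generate them are arbitrary, and all function names (`fit_linear`, `mean_validation_r2`, `best_subset`, `make_synthetic_data`) are hypothetical. It also assumes that the "optimal combination" is the one with the highest averaged validation R squared, which the abstract suggests but does not state in detail.

```python
import random
from itertools import combinations

# Variable names taken from the abstract; the order is an arbitrary choice.
NAMES = ["age", "sex", "BNP", "TRPG", "CTR", "right_second_arc_ratio", "avascular_area"]

def fit_linear(X, y):
    """Ordinary least squares with intercept, solved via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    # Augmented matrix [A^T A | A^T y]
    M = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(p):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [M[i][p] / M[i][i] for i in range(p)]

def predict(coef, X):
    return [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in X]

def r_squared(y_true, y_pred):
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - q) ** 2 for t, q in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def mean_validation_r2(X, y, cols, n_splits=10, seed=0):
    """Average validation R squared over n_splits random 2/3 : 1/3 splits."""
    rng = random.Random(seed)
    n = len(y)
    n_train = (2 * n) // 3
    scores = []
    for _ in range(n_splits):
        idx = list(range(n))
        rng.shuffle(idx)
        train, valid = idx[:n_train], idx[n_train:]
        X_tr = [[X[i][c] for c in cols] for i in train]
        X_va = [[X[i][c] for c in cols] for i in valid]
        coef = fit_linear(X_tr, [y[i] for i in train])
        scores.append(r_squared([y[i] for i in valid], predict(coef, X_va)))
    return sum(scores) / len(scores)

def best_subset(X, y, names, **kwargs):
    """Exhaustively score every non-empty combination of explanatory variables."""
    best = None
    for k in range(1, len(names) + 1):
        for cols in combinations(range(len(names)), k):
            score = mean_validation_r2(X, y, cols, **kwargs)
            if best is None or score > best[0]:
                best = (score, [names[c] for c in cols])
    return best

def make_synthetic_data(n=136, seed=42):
    """Synthetic stand-in data; NOT patient data, coefficients are arbitrary."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        row = [rng.uniform(30, 85), rng.randint(0, 1), rng.uniform(5, 500),
               rng.uniform(20, 100), rng.uniform(0.40, 0.65),
               rng.uniform(0.10, 0.30), rng.randint(0, 1)]
        X.append(row)
        y.append(15 + 0.35 * row[3] + 0.01 * row[2] - 0.05 * row[0]
                 + 20 * row[4] + rng.gauss(0, 8))
    return X, y

X_demo, y_demo = make_synthetic_data()
print("TRPG only:", mean_validation_r2(X_demo, y_demo, [NAMES.index("TRPG")]))
print("best subset:", best_subset(X_demo, y_demo, NAMES))
```

Because every candidate subset is scored on the same 10 splits (fixed `seed`), the comparison between the TRPG-only model and the multivariable models is like-for-like, mirroring the comparison reported in the Results. The numbers printed here reflect only the synthetic data and say nothing about the published values.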

Published in PLoS ONE

ISSN: 1932-6203 (Online)
Publisher: Public Library of Science (PLoS)
Country of publisher: United States
LCC subjects: Medicine; Science
Website: https://journals.plos.org/plosone/
