Development of early prediction model for pregnancy-associated hypertension with graph-based semi-supervised learning

Seung Mi Lee; Yonghyun Nam; Eun Saem Choi; Young Mi Jung; Vivek Sriram; Jacob S. Leiby; Ja Nam Koo; Ig Hwan Oh; Byoung Jae Kim; Sun Min Kim; Sang Youn Kim; Gyoung Min Kim; Sae Kyung Joo; Sue Shin; Errol R. Norwitz; Chan-Wook Park; Jong Kwan Jun; Won Kim; Dokyoon Kim; Joong Shin Park

doi:10.1038/s41598-022-15391-4

Scientific Reports (Sep 2022)

Affiliations

Seung Mi Lee: Department of Obstetrics and Gynecology, Seoul National University College of Medicine
Yonghyun Nam: Department of Biostatistics, Epidemiology and Informatics, The Perelman School of Medicine, University of Pennsylvania
Eun Saem Choi: Department of Obstetrics and Gynecology, Seoul National University Hospital
Young Mi Jung: Department of Obstetrics and Gynecology, Seoul National University College of Medicine
Vivek Sriram: Department of Biostatistics, Epidemiology and Informatics, The Perelman School of Medicine, University of Pennsylvania
Jacob S. Leiby: Department of Biostatistics, Epidemiology and Informatics, The Perelman School of Medicine, University of Pennsylvania
Ja Nam Koo: Seoul Women’s Hospital
Ig Hwan Oh: Seoul Women’s Hospital
Byoung Jae Kim: Department of Obstetrics and Gynecology, Seoul National University College of Medicine
Sun Min Kim: Department of Obstetrics and Gynecology, Seoul National University College of Medicine
Sang Youn Kim: Department of Radiology, Seoul National University College of Medicine
Gyoung Min Kim: Department of Radiology, Yonsei University College of Medicine
Sae Kyung Joo: Department of Internal Medicine, Seoul National University College of Medicine
Sue Shin: Department of Laboratory Medicine, Seoul National University College of Medicine
Errol R. Norwitz: Department of Obstetrics and Gynecology, Tufts University School of Medicine
Chan-Wook Park: Department of Obstetrics and Gynecology, Seoul National University College of Medicine
Jong Kwan Jun: Department of Obstetrics and Gynecology, Seoul National University College of Medicine
Won Kim: Department of Internal Medicine, Seoul National University College of Medicine
Dokyoon Kim: Department of Biostatistics, Epidemiology and Informatics, The Perelman School of Medicine, University of Pennsylvania
Joong Shin Park: Department of Obstetrics and Gynecology, Seoul National University College of Medicine

Journal volume & issue: Vol. 12, no. 1
pp. 1 – 13

Abstract

Clinical guidelines recommend several risk factors to identify women in early pregnancy at high risk of developing pregnancy-associated hypertension. However, these variables result in low predictive accuracy. Here, we developed a prediction model for pregnancy-associated hypertension using graph-based semi-supervised learning (SSL). This is a secondary analysis of a prospective study of healthy pregnant women. To develop the prediction model, we compared prediction performance across five machine learning methods (semi-supervised learning with both labeled and unlabeled data, semi-supervised learning with labeled data only, logistic regression, support vector machine, and random forest) using three different variable sets: [a] variables from clinical guidelines, [b] important variables selected by feature selection, and [c] all routine variables. Additionally, the proposed prediction model was compared with placental growth factor, a predictive biomarker for pregnancy-associated hypertension. The study population consisted of 1404 women, including 1347 women with complete follow-up (labeled data) and 57 women with incomplete follow-up (unlabeled data). Among the 1347 women with complete follow-up, 2.4% (33/1347) developed pregnancy-associated hypertension. Graph-based SSL using the top 11 variables achieved the best average prediction performance (mean area under the curve (AUC) of 0.89 in the training set and 0.81 in the test set), with higher sensitivity (72.7% vs. 45.5% in the test set) and similar specificity (80.0% vs. 80.5% in the test set) compared to risk factors from clinical guidelines. In addition, our proposed model with graph-based SSL outperformed placental growth factor in the total study population (AUC 0.80 for the model vs. 0.71 for placental growth factor, p < 0.001). In conclusion, we could accurately predict the development of pregnancy-associated hypertension in early pregnancy using routine clinical variables with the help of graph-based SSL.
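
To illustrate how graph-based semi-supervised learning can use both labeled and unlabeled cases, the following is a minimal sketch of one common variant: label propagation over a similarity graph with the labeled nodes clamped. It is not the authors' implementation. The abstract does not specify the graph construction or the propagation rule, so the Gaussian similarity, the `sigma` parameter, the function names, and the six toy two-feature "patients" are all assumptions made for illustration.

```python
import math


def rbf_similarity(points, sigma=1.0):
    """Build a dense similarity graph: W[i][j] = exp(-||xi - xj||^2 / (2 sigma^2)).

    The Gaussian kernel and sigma are illustrative assumptions, not taken
    from the paper. The diagonal is zero (no self-loops).
    """
    n = len(points)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                w[i][j] = math.exp(-d2 / (2.0 * sigma ** 2))
    return w


def propagate_labels(points, labels, sigma=1.0, n_iter=200):
    """Spread labels over the graph.

    labels[i] is 1 (outcome), 0 (no outcome), or None (unlabeled, e.g.
    incomplete follow-up). Labeled nodes are clamped to their label on every
    iteration; each unlabeled node takes the similarity-weighted average of
    the current scores of all other nodes.
    """
    w = rbf_similarity(points, sigma)
    n = len(points)
    scores = [0.5 if y is None else float(y) for y in labels]
    for _ in range(n_iter):
        updated = []
        for i, y in enumerate(labels):
            if y is not None:
                updated.append(float(y))  # clamp known labels
            else:
                total = sum(w[i])
                updated.append(sum(w[i][j] * scores[j] for j in range(n)) / total)
        scores = updated
    return scores


# Hypothetical toy data: two standardized features per woman.
points = [
    (0.0, 0.0), (0.2, 0.1), (0.1, 0.3),   # cluster near the origin
    (3.0, 3.0), (3.2, 2.9), (2.9, 3.3),   # cluster near (3, 3)
]
# None marks unlabeled cases that still contribute to the graph structure.
labels = [0, 0, None, 1, None, 1]

scores = propagate_labels(points, labels)
for point, label, score in zip(points, labels, scores):
    print(point, label, round(score, 3))
```

In this sketch the unlabeled points inherit a risk score from their nearest labeled neighbors, which is the general idea behind letting women with incomplete follow-up still inform the model. A real analysis would additionally need feature scaling, a tuned graph, and a held-out evaluation such as the AUC comparison reported above.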

Published in Scientific Reports

ISSN: 2045-2322 (Online)
Publisher: Nature Portfolio
Country of publisher: United Kingdom
LCC subjects: Medicine; Science
Website: https://www.nature.com/srep/
