Predicting Risk of Mortality in Pediatric ICU Based on Ensemble Step-Wise Feature Selection

Shenda Hong; Xinlin Hou; Jin Jing; Wendong Ge; Luxia Zhang

doi:10.34133/2021/9365125

Health Data Science (Jan 2021)

Affiliations

Shenda Hong: National Institute of Health Data Science at Peking University, Beijing, China; Institute of Medical Technology, Health Science Center of Peking University, Beijing, China
Xinlin Hou: Neonatology Department of Peking University First Hospital, Beijing, China
Jin Jing: Harvard Medical School, Boston, MA, USA; Clinical Data Animation Center (CDAC), Massachusetts General Hospital, Boston, MA, USA
Wendong Ge: Harvard Medical School, Boston, MA, USA; Clinical Data Animation Center (CDAC), Massachusetts General Hospital, Boston, MA, USA
Luxia Zhang: National Institute of Health Data Science at Peking University, Beijing, China; Institute of Medical Technology, Health Science Center of Peking University, Beijing, China

Journal volume & issue: Vol. 2021

Abstract

Background. Predicting mortality risk in intensive care units (ICUs) is an important task. Data-driven methods such as scoring systems, machine learning, and deep learning have long been investigated. However, few data-driven methods have been developed specifically for the pediatric ICU. In this paper, we aim to fill this gap by building a simple yet effective linear machine learning model from a number of hand-crafted features for mortality prediction in the pediatric ICU.

Methods. We use a recently released, publicly available pediatric ICU dataset named Pediatric Intensive Care (PIC) from the Children’s Hospital of Zhejiang University School of Medicine in China. Unlike previous sophisticated machine learning methods, we want our method to remain simple enough to be easily understood by clinical staff. Thus, an ensemble step-wise feature ranking and selection method is proposed to select a small subset of effective features from the entire feature set. A logistic regression classifier is then built on the selected features for mortality prediction.

Results. The final predictive linear model with 11 features achieves a ROC-AUC score of 0.7531 on the hold-out test set, which is comparable to a logistic regression classifier using all 397 features (ROC-AUC score of 0.7610) and higher than the well-known pediatric mortality risk score PRISM III (ROC-AUC score of 0.6895).

Conclusions. Our method improves feature ranking and selection by utilizing an ensemble method while keeping a simple linear form of the predictive model, and therefore achieves better generalizability and performance on mortality prediction in the pediatric ICU.
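
The abstract does not spell out how the ensemble ranking is computed, so the following is only a rough, standard-library Python sketch of the general idea of ensembling feature rankings. It is not the authors' step-wise procedure. It assumes a simple stand-in ranker (univariate ROC-AUC per feature) and averages ranks over bootstrap resamples. The function names `roc_auc`, `rank_features`, and `ensemble_select`, the toy data, and the parameters `n_rounds` and `seed` are all hypothetical. The logistic regression stage is omitted.

```python
import random


def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def rank_features(rows, labels):
    """Return feature indices, best first, by |univariate AUC - 0.5|.

    Assumption: this univariate ranker is a stand-in, not the paper's method.
    """
    n_features = len(rows[0])
    strength = [
        abs(roc_auc(labels, [row[j] for row in rows]) - 0.5)
        for j in range(n_features)
    ]
    return sorted(range(n_features), key=lambda j: -strength[j])


def ensemble_select(rows, labels, k, n_rounds=25, seed=0):
    """Average each feature's rank over bootstrap resamples; keep the top k."""
    rng = random.Random(seed)
    n_samples, n_features = len(rows), len(rows[0])
    rank_sum = [0.0] * n_features
    used = 0
    for _ in range(n_rounds):
        idx = [rng.randrange(n_samples) for _ in range(n_samples)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain only one class
        xs = [rows[i] for i in idx]
        for position, j in enumerate(rank_features(xs, ys)):
            rank_sum[j] += position
        used += 1
    if used == 0:
        raise ValueError("no usable bootstrap resample")
    mean_rank = [total / used for total in rank_sum]
    return sorted(range(n_features), key=lambda j: mean_rank[j])[:k]


# Toy data (made up): feature 0 separates the classes, feature 1 is constant,
# feature 2 alternates independently of the label.
toy_labels = [0] * 10 + [1] * 10
toy_rows = [[float(i), 1.0, float(i % 2)] for i in range(20)]

if __name__ == "__main__":
    print(ensemble_select(toy_rows, toy_labels, k=2))
```

Averaging ranks over resamples is meant to make the selected subset less sensitive to any single split of the data. A real pipeline would replace the univariate ranker with the step-wise procedure described in the paper and fit a logistic regression on the selected features.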

Published in Health Data Science

ISSN: 2765-8783 (Online)
Publisher: American Association for the Advancement of Science (AAAS)
Country of publisher: United States
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics
Website: https://spj.sciencemag.org/journals/hds/
