IEEE Access (Jan 2019)

Down Syndrome Prediction Using a Cascaded Machine Learning Framework Designed for Imbalanced and Feature-correlated Data

  • Ling Li,
  • Wanying Liu,
  • Hongguo Zhang,
  • Yuting Jiang,
  • Xiaonan Hu,
  • Ruizhi Liu

DOI
https://doi.org/10.1109/ACCESS.2019.2929681
Journal volume & issue
Vol. 7
pp. 97582 – 97593

Abstract

Down syndrome (DS), caused by the presence of part or all of a third copy of chromosome 21, is the most common form of aneuploidy. Prenatal screening for DS is a key component of antenatal care, and it is recommended that screening be offered universally to women irrespective of age or background. The objective of this paper is to introduce a noninvasive and accurate diagnostic procedure for DS and to minimize the social and financial cost of prenatal diagnosis. Recently, machine learning has received considerable attention in predictive analytics for medical problems. However, few applications to DS prediction have been reported, owing to the difficulty of dealing with highly imbalanced and feature-correlated screening data. In this paper, we propose a cascaded machine learning framework for DS prediction based on three complementary stages: 1) pre-judgment with the isolation forest technique, 2) model ensembling by a voting strategy, and 3) final judgment using logistic regression. The experimental results show that the performance of this framework on a maternal serum screening data set, evaluated with several metrics, is superior to that of several other machine learning methods. The best suggested combination of input features for DS screening is alpha-fetoprotein, human chorionic gonadotropin, unconjugated estriol, and maternal age. In addition, our method has the potential to generate more accurate predictions for imbalanced and feature-correlated data, thereby providing a novel and effective approach for the analysis of certain diseases.
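
The three stages named in the abstract can be illustrated with a minimal scikit-learn sketch. It is not the authors' implementation: the abstract does not say how the stages are wired together, so the wiring here (an isolation forest anomaly score and a soft-voting probability fed into a logistic regression), the choice of base classifiers, and the synthetic four-feature data set are all assumptions made for illustration only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import IsolationForest, RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for screening data (assumption): 4 correlated
# features, roughly 5% positive cases. No real patient data is used.
X, y = make_classification(
    n_samples=2000, n_features=4, n_informative=3, n_redundant=1,
    weights=[0.95, 0.05], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Stage 1 (pre-judgment): unsupervised isolation forest.
# score_samples is lower for more abnormal points, so it is negated
# later to obtain an "anomaly score" that grows with abnormality.
iso = IsolationForest(random_state=0).fit(X_train)

# Stage 2 (model ensemble): soft voting over a few base classifiers.
# The base models are hypothetical choices, not taken from the paper.
voter = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(class_weight="balanced", random_state=0)),
        ("nb", GaussianNB()),
        ("dt", DecisionTreeClassifier(max_depth=4, class_weight="balanced",
                                      random_state=0)),
    ],
    voting="soft",
)
# Out-of-fold probabilities keep stage 3 from training on predictions
# that stage 2 made for samples it had already seen.
oof_prob = cross_val_predict(
    voter, X_train, y_train, cv=5, method="predict_proba"
)[:, 1]
voter.fit(X_train, y_train)


def stage_features(X_part, vote_prob):
    """Stack the stage 1 anomaly score with the stage 2 vote probability."""
    return np.column_stack([-iso.score_samples(X_part), vote_prob])


# Stage 3 (final judgment): logistic regression over the two signals.
stage3 = LogisticRegression(class_weight="balanced")
stage3.fit(stage_features(X_train, oof_prob), y_train)

test_vote_prob = voter.predict_proba(X_test)[:, 1]
final_proba = stage3.predict_proba(stage_features(X_test, test_vote_prob))[:, 1]

# Average precision is one metric suited to imbalanced classes.
print("average precision:", average_precision_score(y_test, final_proba))
```

In this sketch, `class_weight="balanced"` is one simple way to counter the class imbalance; resampling or threshold tuning would be alternatives, and the paper may handle imbalance differently.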

Keywords