Mathematical Biosciences and Engineering (Oct 2023)

Integrative approach for classifying male tumors based on DNA methylation 450K data

  • Ji-Ming Wu,
  • Wang-Ren Qiu,
  • Zi Liu,
  • Zhao-Chun Xu,
  • Shou-Hua Zhang

DOI
https://doi.org/10.3934/mbe.2023845
Journal volume & issue
Vol. 20, no. 11
pp. 19133 – 19151

Abstract

Malignancies such as bladder urothelial carcinoma, colon adenocarcinoma, liver hepatocellular carcinoma, lung adenocarcinoma and prostate adenocarcinoma significantly impact men's well-being. Accurate cancer classification is vital for determining treatment strategies and improving patient prognosis. This study introduces a method that uses gene selection from high-dimensional datasets to improve the performance of a male tumor classification algorithm. Using DNA methylation 450K data obtained from The Cancer Genome Atlas (TCGA) database, the method assesses how reliably DNA methylation data can distinguish the five most prevalent types of male cancers from normal tissues. First, the chi-square test is used for dimensionality reduction; second, L1-penalized logistic regression is used for feature selection. A stacking ensemble learning technique then integrates seven common multiclass classification models. Experimental results show that the stacked ensemble outperformed every individual base classifier, achieving an overall accuracy (ACC) of 99.2% on independent test data. The approach may also suggest new ideas and pathways for early disease detection and treatment.
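The three stages named in the abstract (chi-square filtering, L1-penalized logistic regression for feature selection, and a stacked ensemble) can be sketched with scikit-learn. This is a minimal illustration and not the authors' implementation: the data are synthetic stand-ins for methylation beta values, the three base learners are arbitrary choices (the abstract does not name its seven models), and k=100, C=1.0 and cv=5 are assumed values.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.feature_selection import SelectFromModel, SelectKBest, chi2
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Synthetic stand-in for methylation data: 6 classes (5 tumor types + normal).
X, y = make_classification(
    n_samples=600, n_features=500, n_informative=30,
    n_classes=6, n_clusters_per_class=1, class_sep=2.0, random_state=0,
)
# Rescale each feature to [0, 1] to mimic beta values; chi2 needs non-negative input.
X = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Assumed base learners; the abstract mentions seven models without naming them.
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("knn", KNeighborsClassifier(n_neighbors=5)),
    ("svc", SVC(kernel="rbf", random_state=0)),
]

model = Pipeline([
    # Stage 1: keep the k features with the highest chi-square scores.
    ("chi2_filter", SelectKBest(chi2, k=100)),
    # Stage 2: keep features with non-negligible L1-penalized coefficients.
    ("l1_select", SelectFromModel(
        LogisticRegression(penalty="l1", solver="saga", C=1.0,
                           max_iter=2000, random_state=0)
    )),
    # Stage 3: stack the base learners under a logistic-regression meta-learner.
    ("stack", StackingClassifier(
        estimators=base_learners,
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,
    )),
])

model.fit(X_train, y_train)
n_selected = int(model.named_steps["l1_select"].get_support().sum())
accuracy = model.score(X_test, y_test)
print(f"features kept after L1 selection: {n_selected}")
print(f"held-out accuracy on synthetic data: {accuracy:.3f}")
```

Placing both selection steps inside the pipeline means they are fitted on the training split only, so the held-out score is not inflated by feature selection that has seen the test samples. The accuracy printed here reflects the synthetic data and says nothing about the 99.2% reported for TCGA.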

Keywords