IEEE Access (Jan 2022)

Prediction Methods of Common Cancers in China Using PCA-ANN and DBN-ELM-BP

  • Huitao Qi,
  • Shuangbo Xie,
  • Yanli Chen,
  • Chengqian Wang,
  • Tingting Wang,
  • Bin Sun,
  • Mingxu Sun

DOI
https://doi.org/10.1109/ACCESS.2022.3215706
Journal volume & issue
Vol. 10
pp. 113397 – 113409

Abstract

Accurate prediction of cancer cases is crucial for diagnosing cancer at an early stage, because a long-lasting chronic disease harms both physical and mental health. Although medical data on healthcare and health can be obtained from questionnaires, the true positive rate of cancers predicted by traditional methods is low. Machine learning can provide a classification pattern for types of cancer (mainly lung cancer, liver cancer, upper gastrointestinal cancer, lower gastrointestinal cancer, and breast cancer) using instances from early questionnaire screening. The screening in this study covered 3411 respondents. Principal component analysis (PCA) is used to generate attributes and is coupled with an artificial neural network (ANN) to predict cancer by feeding 28 attributes into the model. A deep belief network (DBN), in turn, is used for unsupervised training and for extracting relevant attributes; an extreme learning machine (ELM) optimizes the DBN and performs supervised classification; and the back propagation (BP) algorithm performs supervised fine-tuning. In this way, PCA-ANN and DBN-ELM-BP prediction models for common cancers are established. On the training and testing sets, respectively, the PCA-ANN model gives 35.29% and 37.5% sensitivity, 98.36% and 98.33% specificity, 97.01% and 97.85% accuracy, and an area under the receiver operating characteristic curve (AUC) of 0.7245 and 0.7221. The DBN-ELM-BP model gives 58.83% and 62.5% sensitivity, 98.31% and 98.52% specificity, 98.03% and 98.24% accuracy, and an AUC of 0.7747 and 0.7238. The results show that the DBN-ELM-BP model offers a non-invasive and economical way to predict the likelihood of common cancers and can support clinicians in making diagnostic decisions.
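
The abstract reports very high accuracy alongside much lower sensitivity. A minimal Python sketch of how these threshold-based metrics follow from confusion-matrix counts may help show why the two can diverge when positive cases are rare. The counts below are hypothetical and chosen only for illustration; they are not taken from the study, and the names `ConfusionCounts`, `sensitivity`, `specificity`, and `accuracy` are illustrative rather than part of the authors' code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from a binary classifier's confusion matrix."""
    tp: int  # cancer cases correctly flagged
    fn: int  # cancer cases missed
    tn: int  # non-cancer cases correctly cleared
    fp: int  # non-cancer cases wrongly flagged


def sensitivity(c: ConfusionCounts) -> float:
    """True positive rate: share of actual positives that are detected."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """True negative rate: share of actual negatives that are cleared."""
    return c.tn / (c.tn + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    """Share of all cases that are classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)


# Hypothetical, heavily imbalanced test set (NOT the study's data):
# 8 positives and 600 negatives.
counts = ConfusionCounts(tp=3, fn=5, tn=590, fp=10)

print(f"sensitivity = {sensitivity(counts):.4f}")
print(f"specificity = {specificity(counts):.4f}")
print(f"accuracy    = {accuracy(counts):.4f}")
```

With these illustrative counts, sensitivity is 3/8 while accuracy is 593/608, because the many correctly classified negatives dominate the accuracy figure. This is why sensitivity and AUC are reported alongside accuracy for screening data in which cancer cases are a small minority.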

Keywords