Digital Health (Mar 2024)

Predicting early gastric cancer risk using machine learning: A population-based retrospective study

  • Xing Ke,
  • Xinyu Cai,
  • Bingxian Bian,
  • Yuanheng Shen,
  • Yunlan Zhou,
  • Wei Liu,
  • Xu Wang,
  • Lisong Shen,
  • Junyao Yang

DOI
https://doi.org/10.1177/20552076241240905
Journal volume & issue
Vol. 10

Abstract

Background: Early detection and treatment are crucial for reducing gastrointestinal tumour-related mortality. The most commonly used diagnostic markers for gastric cancer (GC) have limited diagnostic efficiency, and no single laboratory test meets the requirements of early screening. Machine learning methods that combine multiple indicators are therefore needed to aid the early diagnosis of GC.

Methods: Using the XGBoost algorithm, a new model was developed to distinguish GC from precancerous lesions in patients newly admitted between 2018 and 2023, based on multiple laboratory tests. We evaluated the ability of the prediction score derived from this model to predict early GC. In addition, we investigated how well the model screened for GC in patients with negative protein tumour marker results.

Results: The XHGC20 model constructed using the XGBoost algorithm distinguished GC from precancerous disease well (area under the receiver operating characteristic curve [AUC] = 0.901), with a sensitivity of 0.830 and a specificity of 0.806 at a cut-off value of 0.265. The prediction score was also effective in the diagnosis of early GC: the AUC was 0.888, and at a cut-off value of 0.27 the sensitivity and specificity were 0.797 and 0.807, respectively. The model was likewise effective at identifying GC in patients with negative conventional markers (AUC = 0.970; sensitivity 0.941, specificity 0.906), which helps to reduce the rate of missed diagnoses.

Conclusions: The XHGC20 model established by the XGBoost algorithm integrates information from 20 clinical laboratory tests and can aid in the early screening of GC, providing a useful new method for auxiliary laboratory diagnosis.
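The Results report three kinds of figure for each analysis: an AUC, a cut-off value on the prediction score, and the sensitivity and specificity at that cut-off. The sketch below shows how such figures are conventionally computed from a list of prediction scores and true labels. It is a minimal illustration, not the authors' code: the function names, the toy scores and labels, and the convention that a score at or above the cut-off counts as a positive call are all assumptions, and the abstract does not say how the cut-off was chosen. The value 0.265 is borrowed from the abstract purely as an example threshold.

```python
from typing import List, Tuple


def sensitivity_specificity(
    scores: List[float], labels: List[int], cutoff: float
) -> Tuple[float, float]:
    """Sensitivity and specificity when score >= cutoff is called positive.

    labels: 1 = GC, 0 = precancerous lesion (hypothetical encoding).
    """
    tp = fn = tn = fp = 0
    for score, label in zip(scores, labels):
        called_positive = score >= cutoff
        if label == 1:
            if called_positive:
                tp += 1
            else:
                fn += 1
        else:
            if called_positive:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores: List[float], labels: List[int]) -> float:
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data, invented for illustration only (not from the study).
toy_scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.60, 0.80, 0.90]
toy_labels = [0, 0, 0, 1, 0, 1, 1, 1]

sens, spec = sensitivity_specificity(toy_scores, toy_labels, cutoff=0.265)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
print(f"AUC={auc(toy_scores, toy_labels):.4f}")
```

Note that the AUC summarises ranking quality over all possible cut-offs, whereas sensitivity and specificity describe a single operating point. This is why the abstract reports a separate cut-off alongside each sensitivity and specificity pair.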