Journal of Medical Internet Research (Apr 2021)

Establishing Machine Learning Models to Predict Curative Resection in Early Gastric Cancer with Undifferentiated Histology: Development and Usability Study

  • Bang, Chang Seok,
  • Ahn, Ji Yong,
  • Kim, Jie-Hyun,
  • Kim, Young-Il,
  • Choi, Il Ju,
  • Shin, Woon Geon

DOI
https://doi.org/10.2196/25053
Journal volume & issue
Vol. 23, no. 4
p. e25053

Abstract

Background: Undifferentiated-type early gastric cancer (U-EGC) is included among the expanded indications for endoscopic submucosal dissection (ESD); however, the rate of curative resection remains unsatisfactory. Endoscopists predict the probability of curative resection by considering the size and shape of the lesion and whether ulcers are present. The location of the lesion, which indicates the likely technical difficulty, is also considered.

Objective: The aim of this study was to establish machine learning (ML) models to better predict the possibility of curative resection in U-EGC prior to ESD.

Methods: A nationwide cohort of 2703 U-EGCs treated by ESD or surgery was used for the training and internal validation cohorts. Separately, an independent data set from the Korean ESD registry (n=275) and an Asan Medical Center data set (n=127), both treated by ESD, were chosen for external validation. Eighteen ML classifiers were selected to establish prediction models of curative resection with the following variables: age; sex; location, size, and shape of the lesion; and whether ulcers were present.

Results: Among the 18 models, the extreme gradient boosting classifier showed the best performance (internal validation accuracy 93.4%, 95% CI 90.4%-96.4%; precision 92.6%, 95% CI 89.5%-95.7%; recall 99.0%, 95% CI 97.8%-99.9%; and F1 score 95.7%, 95% CI 93.3%-98.1%). Attempts at external validation showed substantial accuracy (first external validation 81.5%, 95% CI 76.9%-86.1% and second external validation 89.8%, 95% CI 84.5%-95.1%). Lesion size was the most important feature in each explainable artificial intelligence analysis.

Conclusions: We established an ML model capable of accurately predicting the curative resection of U-EGC before ESD by considering the morphological and ecological characteristics of the lesions.
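
The abstract does not state how the 95% CIs were computed. As a rough illustration only, the two external validation intervals are numerically consistent with a normal-approximation (Wald) interval for a proportion, taking the reported accuracy and the stated data set sizes (n=275 and n=127) as inputs. The function name `wald_interval` and the choice of z=1.96 are assumptions made for this sketch, not details taken from the study.

```python
import math


def wald_interval(p_hat, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion.

    p_hat: observed proportion (e.g. accuracy as a fraction)
    n:     number of cases the proportion was computed on
    z:     standard normal quantile (1.96 for a two-sided 95% interval)

    Assumption: this is one common way to obtain such an interval;
    the abstract does not say which method the authors used.
    """
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# Accuracy and sample size for each external validation set,
# taken from the abstract.
external_sets = {
    "Korean ESD registry": (0.815, 275),
    "Asan Medical Center": (0.898, 127),
}

for name, (accuracy, n) in external_sets.items():
    low, high = wald_interval(accuracy, n)
    print(f"{name}: {accuracy:.1%} (95% CI {low:.1%}-{high:.1%})")
```

The Wald interval is the simplest choice and behaves poorly when a proportion is close to 0 or 1 (for instance, a recall of 99.0%), where a Wilson or exact interval is usually preferred.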