BMC Gastroenterology (Feb 2022)

Machine learning-based survival rate prediction of Korean hepatocellular carcinoma patients using multi-center data

  • Byeonggwan Noh,
  • Young Mok Park,
  • Yujin Kwon,
  • Chang In Choi,
  • Byung Kwan Choi,
  • Kwang il Seo,
  • Yo-Han Park,
  • Kwangho Yang,
  • Sunju Lee,
  • Taeyoung Ha,
  • YunKyong Hyon,
  • Myunghee Yoon

DOI
https://doi.org/10.1186/s12876-022-02182-4
Journal volume & issue
Vol. 22, no. 1
pp. 1 – 9

Abstract

Aim: To predict the survival time of Korean hepatocellular carcinoma (HCC) patients from multi-center data, as a foundation for developing a machine learning-based artificial intelligence model that predicts outcomes according to treatment method.

Methods: Data on patients who underwent treatment for HCC from 2008 to 2015 were provided by the Korean Liver Cancer Study Group and the Korea Central Cancer Registry. A total of 10,742 patients with HCC were divided into two groups: Group I (2920 patients), confirmed on biopsy, and Group II (5562 patients), diagnosed according to the HCC diagnostic criteria outlined in the Korean Liver Cancer Association guidelines. The data were modeled on features drawn from the patients' clinical characteristics. Features effective in predicting survival were analyzed retrospectively. Various machine learning methods were used.

Results: The target was overall survival time, split into two classes at approximately 60 months (less than 60 months vs. greater than 60 months). In Group I (514 samples in total), 28.8% (148 samples) survived less than 60 months and 71.2% (366 samples) survived greater than 60 months. In Group II (757 samples in total), 66.6% (504 samples) survived less than 60 months and 33.4% (253 samples) survived greater than 60 months. With the NGBoost method, accuracy was 83%, precision 84%, sensitivity 95%, and F1 score 89% for the more-than-60-months class in Group I with surgical resection. For the less-than-60-months class in Group II with transarterial chemoembolization (TACE), accuracy was 79%, precision 82%, sensitivity 87%, and F1 score 84%. Feature importance by the gain criterion indicated that pathology, portal vein invasion, surgery, metastasis, and needle biopsy were important factors for prediction in the biopsy-confirmed group (Group I).

Conclusion: A predictive model that uses machine learning algorithms to predict the prognosis of HCC patients makes it possible to project an optimized treatment for each case according to liver function and tumor status.
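
The reported metrics can be cross-checked with a few lines of arithmetic. The sketch below is not the authors' code; it assumes the standard definition of the F1 score as the harmonic mean of precision and sensitivity (recall), and the helper names `f1_score` and `class_share` are hypothetical. It recomputes F1 from the rounded precision and sensitivity values in the Results, and the class shares from the sample counts.

```python
def f1_score(precision: float, sensitivity: float) -> float:
    """Harmonic mean of precision and sensitivity (recall)."""
    return 2 * precision * sensitivity / (precision + sensitivity)


def class_share(count: int, total: int) -> float:
    """Percentage of samples that fall in one class."""
    return 100 * count / total


# Rounded figures as reported in the abstract (not raw model output).
reported = {
    "Group I, surgical resection, > 60 months": (0.84, 0.95, 0.89),
    "Group II, TACE, < 60 months": (0.82, 0.87, 0.84),
}

for label, (precision, sensitivity, reported_f1) in reported.items():
    recomputed = f1_score(precision, sensitivity)
    print(f"{label}: recomputed F1 = {recomputed:.3f}, reported = {reported_f1:.2f}")

# Class distribution from the sample counts given in the Results.
for label, count, total in [
    ("Group I, < 60 months", 148, 514),
    ("Group I, > 60 months", 366, 514),
    ("Group II, < 60 months", 504, 757),
    ("Group II, > 60 months", 253, 757),
]:
    print(f"{label}: {class_share(count, total):.1f}%")
```

Both recomputed F1 values round to the reported 89% and 84%, and the class shares match the stated percentages, so the figures in the Results are internally consistent.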

Keywords