Cancer Medicine (Oct 2023)

A novel prediction model of the risk of pancreatic cancer among diabetes patients using multiple clinical data and machine learning

  • Shih‐Min Chen,
  • Phan Thanh Phuc,
  • Phung‐Anh Nguyen,
  • Whitney Burton,
  • Shwu‐Jiuan Lin,
  • Weei‐Chin Lin,
  • Christine Y. Lu,
  • Min‐Huei Hsu,
  • Chi‐Tsun Cheng,
  • Jason C. Hsu

DOI
https://doi.org/10.1002/cam4.6547
Journal volume & issue
Vol. 12, no. 19
pp. 19987 – 19999

Abstract

Introduction: Pancreatic cancer is associated with poor prognosis. Given the rising global incidence of diabetes, and that individuals with diabetes are considered a high‐risk subpopulation for pancreatic cancer, it is critical to detect the risk of pancreatic cancer among people living with diabetes. This study aimed to develop a novel prediction model for pancreatic cancer risk among patients with diabetes, using a real‐world database containing clinical features and employing multiple artificial intelligence algorithms.

Methods: This retrospective observational study analyzed data on patients with Type 2 diabetes from a multisite Taiwanese electronic medical record (EMR) database between 2009 and 2019. Predictors were selected on the basis of a literature review and clinical perspectives. The prediction models were constructed using machine learning algorithms such as logistic regression, linear discriminant analysis, gradient boosting machine, and random forest.

Results: The cohort consisted of 66,384 patients. The linear discriminant analysis (LDA) model generated the highest AUROC, 0.9073, followed by the voting ensemble and gradient boosting machine models. LDA, the best model, exhibited an accuracy of 84.03%, a sensitivity of 0.8611, and a specificity of 0.8403. The most significant predictors identified for pancreatic cancer risk were glucose, glycated hemoglobin, hyperlipidemia comorbidity, antidiabetic drug use, and lipid‐modifying drug use.

Conclusion: This study developed a highly accurate 4‐year risk model for pancreatic cancer in patients with diabetes using real‐world clinical data and multiple machine‐learning algorithms. Potentially, our predictors offer an opportunity to identify pancreatic cancer early and thus widen the prevention and intervention windows to improve survival in diabetic patients.
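For readers unfamiliar with the reported metrics, the following is a minimal Python sketch of how sensitivity, specificity, accuracy, and AUROC are conventionally computed. It is not the authors' code; the function names and all counts and scores are hypothetical and chosen only for illustration. It also shows why, when the outcome is rare, accuracy tends to sit close to specificity.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)              # share of true cases flagged
    specificity = tn / (tn + fp)              # share of non-cases correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


def auroc(labels, scores):
    """AUROC as the probability that a random positive outscores a random
    negative (Mann-Whitney formulation); tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical counts (not from the study): 10 cases among 1,000 patients.
sens, spec, acc = confusion_metrics(tp=8, fn=2, tn=900, fp=90)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")

# Hypothetical risk scores for four patients (labels: 1 = case, 0 = non-case).
print(f"AUROC={auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]):.2f}")
```

Because non-cases dominate the denominator in a rare-outcome cohort, accuracy is driven almost entirely by specificity, so sensitivity and AUROC are usually the more informative figures for a screening-style model.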

Keywords