Cancer Management and Research (2018-11-01)

Development of a prediction model for pancreatic cancer in patients with type 2 diabetes using logistic regression and artificial neural network models

  • Hsieh MH,
  • Sun LM,
  • Lin CL,
  • Hsieh MJ,
  • Hsu CY,
  • Kao CH

Journal volume & issue
Vol. 10
pp. 6317 – 6324

Abstract


Meng Hsuen Hsieh,1,* Li-Min Sun,2,* Cheng-Li Lin,3,4 Meng-Ju Hsieh,5 Chung-Y Hsu,6 Chia-Hung Kao6–8

1 Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, CA, USA
2 Department of Radiation Oncology, Zuoying Branch of Kaohsiung Armed Forces General Hospital, Kaohsiung, Taiwan, Republic of China
3 Management Office for Health Data, China Medical University Hospital, Taichung, Taiwan, Republic of China
4 College of Medicine, China Medical University, Taichung, Taiwan, Republic of China
5 Department of Medicine, Poznan University of Medical Sciences, Poznan, Poland
6 Graduate Institute of Biomedical Sciences, China Medical University, Taichung, Taiwan, Republic of China
7 Department of Nuclear Medicine and PET Center, China Medical University Hospital, Taichung, Taiwan, Republic of China
8 Department of Bioinformatics and Medical Engineering, Asia University, Taichung, Taiwan, Republic of China

* These authors contributed equally to this work

Objectives: Patients with type 2 diabetes (T2DM) are suggested to have a higher risk of developing pancreatic cancer. We used two models to predict pancreatic cancer risk among patients with T2DM.

Methods: The original data used for this investigation were retrieved from the National Health Insurance Research Database of Taiwan. The prediction models included the available possible risk factors for pancreatic cancer. The data were split into training and test sets: 97.5% of the data were used as the training set and 2.5% of the data were used as the test set. Logistic regression (LR) and artificial neural network (ANN) models were implemented using Python (Version 3.7.0). The F1, precision, and recall were compared between the LR and the ANN models. The areas under the receiver operating characteristic (ROC) curves of the prediction models were also compared.

Results: The metrics used in this study indicated that the LR model more accurately predicted pancreatic cancer than the ANN model. For the LR model, the area under the ROC curve in the prediction of pancreatic cancer was 0.727, indicating a good fit.

Conclusion: Using this LR model, our results suggested that we could appropriately predict pancreatic cancer risk in patients with T2DM in Taiwan.

Keywords: pancreatic cancer, type 2 diabetes, logistic regression, artificial neural network
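The Methods compare the two models on precision, recall, F1, and area under the ROC curve. The abstract does not say which libraries or code the authors used, so the following is only a minimal standard-library sketch of how those four metrics can be computed from true labels and predicted scores. The function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made for illustration; none of them come from the study.

```python
def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, F1) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise ranking.

    Equals the probability that a randomly chosen positive case receives
    a higher score than a randomly chosen negative case (ties count 0.5).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from the study): 1 = pancreatic cancer, 0 = none.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.7, 0.2, 0.3, 0.1, 0.5]

# Assumed decision threshold of 0.5 to turn scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f} auc={auc:.3f}")
```

Precision, recall, and F1 depend on the chosen threshold, whereas the ROC area summarizes ranking quality across all thresholds, which is one reason a study may report both kinds of metric when comparing an LR model with an ANN.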

Keywords