Scientific Reports (Jan 2023)

Iterated cross validation method for prediction of survival in diffuse large B-cell lymphoma for small size dataset

  • Chin-Chuan Chang,
  • Chien-Hua Chen,
  • Jer-Guang Hsieh,
  • Jyh-Horng Jeng

DOI
https://doi.org/10.1038/s41598-023-28394-6
Journal volume & issue
Vol. 13, no. 1
pp. 1 – 10

Abstract

Efforts have been made to improve the risk stratification model for patients with diffuse large B-cell lymphoma (DLBCL). This study aimed to evaluate disease prognosis using machine learning models with an iterated cross validation (CV) method. A total of 122 patients with pathologically confirmed DLBCL who received rituximab-containing chemotherapy were enrolled. The contributions of clinical, laboratory, and metabolic imaging parameters from fluorine-18 fluorodeoxyglucose (FDG) positron emission tomography/computed tomography (PET/CT) scans to the prognosis were evaluated using five machine learning models: logistic regression, random forest, support vector classifier (SVC), deep neural network (DNN), and fuzzy neural network. Binary classification predictions were made for 3-year progression-free survival (PFS) and 3-year overall survival (OS). A 10-iterated fivefold CV with shuffling was used to evaluate the predictive capability of the learning machines. The median PFS and OS were 41.0 and 43.6 months, respectively. Two indicators were found to be independent predictors of prognosis: the international prognostic index and the total metabolic tumor volume (MTVsum) from FDG PET/CT. For PFS, SVC and DNN gave the best predictive results (both with accuracy 71%), outperforming the other algorithms. For OS, DNN gave the best predictive result (accuracy 76%). With clinical and metabolic parameters as input variables, machine learning methods with the iterated CV method add predictive value for PFS and OS evaluation in DLBCL patients.
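The 10-iterated fivefold CV with shuffling described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the synthetic one-feature dataset, the toy threshold classifier standing in for the five models, and the names iterated_kfold_indices, threshold_classifier, and iterated_cv_accuracy are all assumptions made for illustration. Only the resampling scheme (reshuffle, split into five folds, repeat ten times, average) follows the abstract.

```python
import random
from statistics import mean


def iterated_kfold_indices(n_samples, n_splits=5, n_iterations=10, seed=0):
    """Yield (iteration, train_idx, test_idx) for repeated shuffled k-fold CV."""
    rng = random.Random(seed)
    for iteration in range(n_iterations):
        order = list(range(n_samples))
        rng.shuffle(order)  # a new random partition for every iteration
        for k in range(n_splits):
            test_idx = order[k::n_splits]  # every n_splits-th sample forms fold k
            held_out = set(test_idx)
            train_idx = [i for i in order if i not in held_out]
            yield iteration, train_idx, test_idx


def threshold_classifier(train_x, train_y):
    """Toy stand-in model (assumption): cut one feature midway between class means."""
    mean0 = mean(x for x, y in zip(train_x, train_y) if y == 0)
    mean1 = mean(x for x, y in zip(train_x, train_y) if y == 1)
    cut = (mean0 + mean1) / 2
    high_label = 1 if mean1 > mean0 else 0
    return lambda x: high_label if x > cut else 1 - high_label


def iterated_cv_accuracy(xs, ys, n_splits=5, n_iterations=10, seed=0):
    """Return the mean test accuracy of each iteration, one value per iteration."""
    fold_scores = {}
    for iteration, train_idx, test_idx in iterated_kfold_indices(
        len(xs), n_splits, n_iterations, seed
    ):
        predict = threshold_classifier(
            [xs[i] for i in train_idx], [ys[i] for i in train_idx]
        )
        hits = sum(predict(xs[i]) == ys[i] for i in test_idx)
        fold_scores.setdefault(iteration, []).append(hits / len(test_idx))
    return [mean(scores) for _, scores in sorted(fold_scores.items())]


# Synthetic data (assumption): 122 samples, one feature, binary outcome.
data_rng = random.Random(42)
ys = [i % 2 for i in range(122)]
xs = [data_rng.gauss(1.0 if y else 0.0, 1.0) for y in ys]

scores = iterated_cv_accuracy(xs, ys)
print(f"accuracy per iteration: {[round(s, 3) for s in scores]}")
print(f"mean over {len(scores)} iterations: {mean(scores):.3f}")
```

Averaging over several reshuffled partitions reduces the dependence of the accuracy estimate on any single random split, which matters most for a small dataset such as the 122 patients studied here.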