Diagnostics (Sep 2023)

AI Evaluation of Imaging Factors in the Evolution of Stage-Treated Metastases Using Gamma Knife

  • Calin G. Buzea
  • Razvan Buga
  • Maria-Alexandra Paun
  • Madalina Albu
  • Dragos T. Iancu
  • Bogdan Dobrovat
  • Maricel Agop
  • Viorel-Puiu Paun
  • Lucian Eva

DOI
https://doi.org/10.3390/diagnostics13172853
Journal volume & issue
Vol. 13, no. 17
p. 2853

Abstract


Background: The study investigated whether three deep-learning (DL) models, namely, the CNN_model (trained from scratch), the TL_model (transfer learning), and the FT_model (fine-tuning), could predict the early response of brain metastases (BM) to radiosurgery using minimally pre-processed MRI images. The dataset consisted of 19 BM patients who underwent stereotactic radiosurgery (SRS) within 3 months. The images used included axial fluid-attenuated inversion recovery (FLAIR) sequences and high-resolution contrast-enhanced T1-weighted (CE T1w) sequences from the tumor center. The patients were classified as responders (complete or partial response) or non-responders (stable or progressive disease).

Methods: A total of 2320 images from the regression class and 874 from the progression class were randomly assigned to training, testing, and validation groups. The DL models were trained using the training-group images and labels, and the validation dataset was used to select the best model for classifying the evaluation images as showing regression or progression.

Results: Among the 19 patients, 15 were classified as “responders” and 4 as “non-responders”. The CNN_model achieved good performance for both classes, showing high precision, recall, and F1-scores. Its overall accuracy was 0.98, with an AUC of 0.989. The TL_model showed high recall but weaker precision for the “progression” class, and high precision but lower recall for the “regression” class. Its overall accuracy was 0.92, and the AUC was 0.936. The FT_model showed high recall but low precision for the “progression” class, and high precision but lower recall for the “regression” class. Its overall accuracy was 0.83, with an AUC of 0.885.

Conclusions: Among the three models analyzed, the CNN_model, trained from scratch, provided the most accurate predictions of SRS response for BM images not seen during training. This suggests that CNN models could potentially predict SRS prognoses from small datasets. However, further analysis is needed, especially in cases where class imbalances exist.
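
The random assignment described in the Methods, and the class imbalance noted in the Conclusions, can be illustrated with a minimal sketch. The class counts (2320 and 874) come from the abstract. The 70/15/15 split fractions, the seed, and the helper name `random_split` are assumptions made for illustration only, since the abstract does not state the proportions or the procedure the authors used.

```python
import random
from collections import Counter

# Class sizes reported in the abstract.
N_REGRESSION = 2320
N_PROGRESSION = 874


def random_split(labels, fractions=(0.70, 0.15, 0.15), seed=0):
    """Shuffle labels and cut them into train/validation/test lists.

    The 70/15/15 fractions are an assumption for illustration; the
    abstract does not state the proportions that were used.
    """
    rng = random.Random(seed)
    shuffled = list(labels)
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(n * fractions[0])
    n_val = int(n * fractions[1])
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder goes to the test group
    return train, val, test


labels = ["regression"] * N_REGRESSION + ["progression"] * N_PROGRESSION
train, val, test = random_split(labels)

# Report how the minority ("progression") class is represented in each group.
for name, part in (("train", train), ("validation", val), ("test", test)):
    counts = Counter(part)
    share = counts["progression"] / len(part)
    print(f"{name}: {len(part)} images, progression share {share:.2f}")
```

Across the whole dataset the progression class makes up about 27% of the images (874 of 3194), so a purely random split leaves each group similarly imbalanced. This is one reason per-class precision and recall are reported alongside overall accuracy.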

Keywords