BMC Medical Imaging (Nov 2024)

Predicting malignancy in breast lesions: enhancing accuracy with fine-tuned convolutional neural network models

  • Li Li,
  • Changjie Pan,
  • Ming Zhang,
  • Dong Shen,
  • Guangyuan He,
  • Mingzhu Meng

DOI
https://doi.org/10.1186/s12880-024-01484-1
Journal volume & issue
Vol. 24, no. 1
pp. 1 – 11

Abstract

Read online

Abstract Background This study aims to explore the accuracy of Convolutional Neural Network (CNN) models in predicting malignancy in Dynamic Contrast-Enhanced Breast Magnetic Resonance Imaging (DCE-BMRI). Methods A total of 273 benign lesions (benign group) and 274 malignant lesions (malignant group) were collected and randomly divided into a training set (246 benign and 245 malignant lesions) and a testing set (28 benign and 28 malignant lesions) in a 9:1 ratio. An additional 53 lesions from 53 patients were designated as the validation set. Five models—VGG16, VGG19, DenseNet201, ResNet50, and MobileNetV2—were evaluated. Model performance was assessed using accuracy (Ac) in the training and testing sets, and precision (Pr), recall (Rc), F1 score (F1), and area under the receiver operating characteristic curve (AUC) in the validation set. Results The accuracy of VGG19 on the test set (0.96) is higher than that of VGG16 (0.91), DenseNet201 (0.91), ResNet50 (0.67), and MobileNetV2 (0.88). For the validation set, VGG19 achieved higher performance metrics (Pr 0.75, Rc 0.76, F1 0.73, AUC 0.76) compared to the other models, specifically VGG16 (Pr 0.73, Rc 0.75, F1 0.70, AUC 0.73), DenseNet201 (Pr 0.71, Rc 0.74, F1 0.69, AUC 0.71), ResNet50 (Pr 0.65, Rc 0.68, F1 0.60, AUC 0.65), and MobileNetV2 (Pr 0.73, Rc 0.75, F1 0.71, AUC 0.73). S4 model achieved higher performance metrics (Pr 0.89, Rc 0.88, F1 0.87, AUC 0.89) compared to the other four fine-tuned models, specifically S1 (Pr 0.75, Rc 0.76, F1 0.74, AUC 0.75), S2 (Pr 0.77, Rc 0.79, F1 0.75, AUC 0.77), S3 (Pr 0.76, Rc 0.76, F1 0.73, AUC 0.75), and S5 (Pr 0.77, Rc 0.79, F1 0.75, AUC 0.77). Additionally, S4 model showed the lowest loss value in the testing set. Notably, the AUC of S4 for BI-RADS 3 was 0.90 and for BI-RADS 4 was 0.86, both significantly higher than the 0.65 AUC for BI-RADS 5. Conclusions The S4 model we propose has demonstrated superior performance in predicting the likelihood of malignancy in DCE-BMRI, making it a promising candidate for clinical application in patients with breast diseases. However, further validation is essential, highlighting the need for additional data to confirm its efficacy.

Keywords