Digital Diagnostics (Dec 2024)

Evaluating the performance of artificial intelligence-based software for digital mammography characterization

  • Yuriy A. Vasilev,
  • Alexander V. Kolsanov,
  • Kirill M. Arzamasov,
  • Anton V. Vladzymyrskyy,
  • Olga V. Omelyanskaya,
  • Serafim S. Semenov,
  • Lubov E. Axenova

DOI
https://doi.org/10.17816/DD625967
Journal volume & issue
Vol. 5, no. 4
pp. 695 – 711

Abstract

BACKGROUND: Digital screening mammography is a key modality for early detection of breast cancer, reducing mortality by 20–40%. Many artificial intelligence (AI)-based services have been developed to automate the analysis of imaging data.

AIM: To compare mammography assessments produced by three types of AI services, each in multiple versions, with radiologists’ conclusions.

MATERIALS AND METHODS: Binary mammography scoring scales and several types and versions of AI services were compared in terms of diagnostic accuracy, the Matthews correlation coefficient, and the maximum Youden’s index.

RESULTS: The comparative analysis showed that the choice of binary scale for evaluating digital mammography affects both the number of detected abnormalities and the accuracy of AI results. Diagnostic accuracy was also found to be threshold dependent. AI Service 1, version 3, performed best according to most diagnostic accuracy parameters.

CONCLUSIONS: The results can be used to select AI services for interpreting mammography screening data. Setting up an AI service by maximizing Youden’s index balances sensitivity and specificity, but that balance is not always clinically relevant.
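
The two summary metrics named in the abstract can be illustrated with a short sketch. The code below is not the authors' pipeline. It assumes a hypothetical setup in which an AI service returns a continuous score per study and the reference standard is a binary label (1 = abnormality, 0 = normal). All function names and the toy data are invented for illustration. It uses the standard definitions: Youden's index J = sensitivity + specificity − 1, and the Matthews correlation coefficient (MCC) computed from the four confusion-matrix counts.

```python
from math import sqrt


def confusion(scores, labels, threshold):
    """Count (tp, fp, tn, fn), calling a study positive when score >= threshold."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold
        if predicted_positive and label:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif label:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def youden_j(tp, fp, tn, fn):
    """Youden's index: sensitivity + specificity - 1."""
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity + specificity - 1.0


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; returns 0.0 when it is undefined."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def best_threshold_by_youden(scores, labels):
    """Return the observed score that maximizes Youden's index as a cutoff."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: youden_j(*confusion(scores, labels, t)))


# Toy data, invented for illustration only.
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 1, 0, 1, 1, 1, 1]

threshold = best_threshold_by_youden(scores, labels)
counts = confusion(scores, labels, threshold)
print(threshold, counts, youden_j(*counts), mcc(*counts))
```

In this toy example, the Youden-optimal cutoff still misses one abnormal study. This illustrates the abstract's caveat: in a screening setting, a lower threshold that favors sensitivity may be preferable to the one that maximizes J.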

Keywords