Breast Cancer Research (Feb 2024)

Are better AI algorithms for breast cancer detection also better at predicting risk? A paired case–control study

  • Ruggiero Santeramo,
  • Celeste Damiani,
  • Jiefei Wei,
  • Giovanni Montana,
  • Adam R. Brentnall

DOI
https://doi.org/10.1186/s13058-024-01775-z
Journal volume & issue
Vol. 26, no. 1
pp. 1 – 7

Abstract

Background: There is increasing evidence that artificial intelligence (AI) breast cancer risk evaluation tools using digital mammograms are highly informative for 1–6 years following a negative screening examination. We hypothesized that algorithms that have previously been shown to work well for cancer detection will also work well for risk assessment and that performance of algorithms for detection and risk assessment is correlated.

Methods: To evaluate our hypothesis, we designed a case-control study using paired mammograms at diagnosis and at the previous screening visit. The study included n = 3386 women from the OPTIMAM registry, which includes mammograms from women diagnosed with breast cancer in the English breast screening program 2010–2019. Cases were diagnosed with invasive breast cancer or ductal carcinoma in situ at screening and were selected if they had a mammogram available at the screening examination that led to detection, and a paired mammogram at their previous screening visit 3 years prior to detection when no cancer was detected. Controls without cancer were matched 1:1 to cases based on age (year), screening site, and mammography machine type. Risk assessment was conducted using a deep-learning model designed for breast cancer risk assessment (Mirai), and three open-source deep-learning algorithms designed for breast cancer detection. Discrimination was assessed using a matched area under the curve (AUC) statistic.

Results: Overall performance using the paired mammograms followed the same order by algorithm for risk assessment (AUC range 0.59–0.67) and detection (AUC 0.81–0.89), with Mirai performing best for both. There was also a correlation in performance for risk and detection within algorithms by cancer size, with much greater accuracy for large cancers (30 mm+, detection AUC: 0.88–0.92; risk AUC: 0.64–0.74) than smaller cancers (0 to < 10 mm, detection AUC: 0.73–0.86; risk AUC: 0.54–0.64). Mirai was relatively strong for risk assessment of smaller cancers (0 to < 10 mm, risk, Mirai AUC: 0.64 (95% CI 0.57 to 0.70); other algorithms AUC 0.54–0.56).

Conclusions: Improvements in risk assessment could stem from enhancing detection capabilities for smaller cancers. Other state-of-the-art AI detection algorithms with high performance for smaller cancers might achieve relatively high performance for risk assessment.
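
The abstract does not define the matched AUC statistic used to assess discrimination. As a minimal sketch, assuming one common definition for 1:1 matched designs (the proportion of matched pairs in which the case receives a higher score than its own control, with ties counted as one half), the calculation could look like the following. The function name `matched_auc` and the example scores are hypothetical and are not taken from the study.

```python
from typing import Sequence


def matched_auc(case_scores: Sequence[float], control_scores: Sequence[float]) -> float:
    """Proportion of 1:1 matched pairs where the case outscores its control.

    Assumed definition (not confirmed by the abstract): a tie within a pair
    counts as half a concordant pair. Position i in both sequences must
    refer to the same matched pair.
    """
    if len(case_scores) != len(control_scores):
        raise ValueError("each case needs exactly one matched control")
    if len(case_scores) == 0:
        raise ValueError("at least one matched pair is required")

    concordant = 0.0
    for case, control in zip(case_scores, control_scores):
        if case > control:
            concordant += 1.0
        elif case == control:
            concordant += 0.5
    return concordant / len(case_scores)


# Hypothetical risk scores for four matched pairs (illustrative, not study data).
cases = [0.9, 0.4, 0.7, 0.5]
controls = [0.3, 0.6, 0.7, 0.2]
example_auc = matched_auc(cases, controls)  # 2 concordant pairs + 1 tie -> 2.5 / 4
```

Comparing each case only with its own matched control keeps the matching factors (age, screening site, and machine type) from contributing to the apparent discrimination, which an unmatched AUC pooled over all case-control combinations would not guarantee.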

Keywords