Interobserver agreement between radiologists and artificial intelligence in mammographic breast density classification

Claudia M. Delsol-Perez; Alix D. Reyes-Mosqueda; Tania A. Rios-Rodriguez; David F. Perez-Montemayor

doi:10.24875/JMEXFRI.M24000074

Journal of the Mexican Federation of Radiology and Imaging (Apr 2024)

Affiliations

Claudia M. Delsol-Perez: Centro de Imagenologia Integral IMAX; Instituto de Estudios Superiores de Tamaulipas, Universidad Anahuac. Tampico, Tamaulipas, Mexico
Alix D. Reyes-Mosqueda: Centro de Imagenologia Integral IMAX; Instituto de Estudios Superiores de Tamaulipas, Universidad Anahuac. Tampico, Tamaulipas, Mexico
Tania A. Rios-Rodriguez: Centro de Imagenologia Integral IMAX; Instituto de Estudios Superiores de Tamaulipas, Universidad Anahuac. Tampico, Tamaulipas, Mexico
David F. Perez-Montemayor: Centro de Imagenologia Integral IMAX; Instituto de Estudios Superiores de Tamaulipas, Universidad Anahuac. Tampico, Tamaulipas, Mexico

Journal volume & issue: Vol. 3, no. 2

Abstract

Artificial intelligence (AI) has been proposed as a tool for assessing mammographic breast density (MBD). This study evaluated the agreement in MBD classification between four radiologists (human readers [HRs]) with different years of experience in breast imaging and the AI system Lunit INSIGHT MMG. In this cross-sectional study, a convenience sample of radiologists trained in breast imaging assessed MBD on screening mammograms of asymptomatic women 35 years or older using BI-RADS descriptors. Agreement between each HR and the AI was measured with Cohen’s kappa (k). A total of 192 women with a mean age of 55.4 ± 31.8 years (range 37-82 years) were included. Agreement between HRs and AI varied in Category a. In Category b, it was substantial for all readers (HR1 k = 0.729, HR2 k = 0.718, HR3 k = 0.768, and HR4 k = 0.672). In Category c, HR1, HR2, and HR3 had substantial agreement (k = 0.728, k = 0.697, and k = 0.738, respectively), while HR4 had moderate agreement (k = 0.578). In Category d, agreement was mostly moderate. Overall, agreement between HRs and AI ranged from fair to substantial. HRs with more years of experience in breast image interpretation had lower agreement with AI for MBD classification than HRs with fewer years of experience.
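
As an illustration of the statistic named in the abstract, the following minimal Python sketch computes Cohen's kappa between one reader and the AI, both over all four BI-RADS density categories and for a single category treated as a yes/no rating. It is not the authors' analysis code. The function names, the one-versus-rest treatment of each category, the toy ratings, and the use of the commonly cited Landis and Koch cut-offs for labels such as "moderate" and "substantial" are all assumptions made for illustration; the abstract does not state how the per-category values were derived.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who labelled the same cases."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("both raters must label the same non-empty set of cases")
    n = len(ratings_a)
    # Observed agreement: share of cases where both raters gave the same label.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return float("nan")  # kappa is undefined when both raters use one label
    return (observed - expected) / (1.0 - expected)


def category_kappa(ratings_a, ratings_b, category):
    """Kappa for one category, treated as a binary rating (assumed scheme)."""
    return cohen_kappa([r == category for r in ratings_a],
                       [r == category for r in ratings_b])


def landis_koch_label(kappa):
    """Verbal label using the commonly cited Landis and Koch cut-offs (assumed)."""
    if kappa < 0.0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical toy ratings, NOT data from the study.
reader = ["a", "b", "b", "c", "c", "d", "b", "c", "a", "d"]
ai = ["a", "b", "c", "c", "c", "d", "b", "b", "b", "d"]

overall = cohen_kappa(reader, ai)
print(f"overall kappa = {overall:.3f} ({landis_koch_label(overall)})")
for cat in "abcd":
    k = category_kappa(reader, ai, cat)
    print(f"category {cat}: kappa = {k:.3f} ({landis_koch_label(k)})")
```

Under these assumed cut-offs, the reported values map to the same labels used in the abstract: k = 0.578 falls in the moderate band and k = 0.672 in the substantial band.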

Published in Journal of the Mexican Federation of Radiology and Imaging

ISSN: 2938-1215 (Print); 2696-8444 (Online)
Publisher: Permanyer
Country of publisher: Spain
LCC subjects: Medicine: Medicine (General): Medical physics. Medical radiology. Nuclear medicine
Website: https://www.jmexfri.com/index.php