Journal of the Mexican Federation of Radiology and Imaging (Apr 2024)
Interobserver agreement between radiologists and artificial intelligence in mammographic breast density classification
Abstract
Artificial intelligence (AI) has been proposed as a tool for assessing mammographic breast density (MBD). This study evaluated the agreement in MBD classification between four radiologists (human readers [HRs]) with different years of experience in breast imaging and the AI software Lunit INSIGHT MMG. This cross-sectional study used a convenience sample of radiologists trained in breast imaging, who assessed MBD on screening mammograms of asymptomatic women 35 years or older using BI-RADS descriptors. Agreement between each HR and the AI was measured with Cohen’s kappa (k). A total of 192 women with a mean age of 55.4 ± 31.8 years (range 37-82 years) were included. Agreement between HRs and AI varied in Category a. In Category b, it was substantial for all HRs (HR1 k = 0.729, HR2 k = 0.718, HR3 k = 0.768, and HR4 k = 0.672). In Category c, HR1, HR2, and HR3 had substantial agreement (k = 0.728, k = 0.697, and k = 0.738, respectively), whereas HR4 had moderate agreement (k = 0.578). In Category d, agreement was mostly moderate. Overall, agreement between HRs and AI ranged from fair to substantial. HRs with more years of experience in breast image interpretation had lower agreement with AI for MBD classification than HRs with fewer years of experience.
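As an illustration of the statistic reported above, the following is a minimal sketch of how Cohen's kappa between one reader and the AI could be computed, both across all four BI-RADS density categories and for a single category. It rests on several assumptions not stated in the abstract: the per-category values are taken to be one-vs-rest (binary) kappas, the verbal labels are taken to follow the Landis and Koch scale (which is consistent with the labels attached to the reported values), and the example ratings are invented for demonstration only. The function names `cohen_kappa`, `per_category_kappa`, and `landis_koch_label` are hypothetical.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who labelled the same cases."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("Both raters must label the same non-empty set of cases.")
    n = len(ratings_a)
    # Observed agreement: share of cases where both raters chose the same label.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the raters' marginal proportions, summed over labels.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical label for every case.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def per_category_kappa(ratings_a, ratings_b, category):
    """One-vs-rest kappa for a single category (assumed approach, see lead-in)."""
    a = [r == category for r in ratings_a]
    b = [r == category for r in ratings_b]
    return cohen_kappa(a, b)


def landis_koch_label(kappa):
    """Verbal label on the Landis and Koch scale (assumed to be the scale used)."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Invented ratings for eight mammograms (BI-RADS density categories a to d).
reader = ["a", "b", "b", "c", "c", "d", "b", "c"]
ai = ["a", "b", "c", "c", "c", "d", "b", "b"]

overall = cohen_kappa(reader, ai)
category_b = per_category_kappa(reader, ai, "b")
print(f"overall kappa = {overall:.3f} ({landis_koch_label(overall)})")
print(f"category b kappa = {category_b:.3f} ({landis_koch_label(category_b)})")
```

With real data, `reader` and `ai` would each hold one density category per mammogram, and the per-category call would be repeated for each of the four categories and each reader.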