EClinicalMedicine (Dec 2024)
Development and validation of a deep learning pipeline to diagnose ovarian masses using ultrasound screening: a retrospective multicenter study
Abstract
Background: Ovarian cancer has the highest mortality rate among gynaecological malignancies, and ultrasound is the initial screening modality. Owing to the high complexity of ultrasound images of ovarian masses and the anatomical characteristics of the deep pelvic cavity, subjective assessment requires extensive experience and skill. Detecting the ovaries and ovarian masses and diagnosing ovarian cancer are therefore challenging. We aimed to develop an automated deep learning framework, the Ovarian Multi-Task Attention Network (OvaMTA), for detection, segmentation, and classification of ovaries and ovarian masses, and for further diagnosis of ovarian masses, based on ultrasound screening.
Methods: Between June 2020 and May 2022, the OvaMTA model was trained, validated, and tested on a training and validation cohort of 6938 images and an internal testing cohort of 1584 images, collected at 21 hospitals from women who underwent ultrasound examinations for ovarian masses. We then recruited two external test cohorts from two further hospitals: 1896 images obtained between February 2024 and April 2024 formed the image-based external test dataset, and 159 videos obtained between April 2024 and May 2024 formed the video-based external test dataset. The artificial intelligence (AI) system, OvaMTA, comprises two models: OvaMTA-Seg, a segmentation model that operates on the entire image for ovary detection, and OvaMTA-Diagnosis, a diagnosis model that predicts the pathological type of an ovarian mass from image patches cropped by OvaMTA-Seg. System performance was evaluated on one internal and two external validation cohorts and compared with doctors' assessments in real-world testing, for which we recruited eight physicians. We also evaluated the value of the system in assisting doctors with diagnosis.
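The hand-off between the two models described in the Methods (a segmentation mask used to crop the patch passed to the diagnosis model) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `mask_bounding_box` and `crop_patch`, the `margin` parameter, and the use of a simple axis-aligned bounding box are all assumptions made for illustration, and images are represented as plain nested lists to keep the example self-contained.

```python
from typing import List, Optional, Tuple

Grid = List[List[int]]
# (top, left, bottom, right); bottom and right are exclusive.
Box = Tuple[int, int, int, int]


def mask_bounding_box(mask: Grid, margin: int = 0) -> Optional[Box]:
    """Return the box enclosing all non-zero mask pixels, or None if the mask is empty.

    Hypothetical helper: `margin` pads the box by a fixed number of pixels
    and is clipped to the image borders.
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [c for row in mask for c, value in enumerate(row) if value]
    height, width = len(mask), len(mask[0])
    top = max(min(rows) - margin, 0)
    left = max(min(cols) - margin, 0)
    bottom = min(max(rows) + 1 + margin, height)
    right = min(max(cols) + 1 + margin, width)
    return top, left, bottom, right


def crop_patch(image: Grid, box: Box) -> Grid:
    """Cut the region described by `box` out of `image`."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


if __name__ == "__main__":
    # Toy 4x5 "image" whose pixel values encode their own coordinates.
    image = [[r * 10 + c for c in range(5)] for r in range(4)]
    # Toy segmentation mask, standing in for a segmentation model's output.
    mask = [
        [0, 0, 0, 0, 0],
        [0, 0, 1, 1, 0],
        [0, 0, 1, 0, 0],
        [0, 0, 0, 0, 0],
    ]
    box = mask_bounding_box(mask)
    if box is not None:
        print(crop_patch(image, box))
```

In a real pipeline the cropped patch would typically be resized and normalised before classification; those steps are omitted because the abstract does not describe them.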
Findings: For segmentation, OvaMTA-Seg achieved an average Dice score of 0.887 on the internal test set and 0.819 on the image-based external test set. OvaMTA-Seg also performed well in detecting ovarian masses in test images that included both healthy ovaries and masses (internal test area under the curve [AUC]: 0.970; external test AUC: 0.877). For diagnostic classification, OvaMTA-Diagnosis performed well on the image-based internal (AUC: 0.941) and external (AUC: 0.941) test sets. In video-based external testing on 159 videos with ovarian masses, OvaMTA achieved an AUC of 0.911, with performance comparable to that of senior radiologists (accuracy [ACC]: 86.2% vs. 88.1%, p = 0.50; sensitivity [SEN]: 81.8% vs. 88.6%, p = 0.16; specificity [SPE]: 89.2% vs. 87.6%, p = 0.68). Junior and intermediate radiologists performed significantly better with AI assistance than without it (ACC: 80.8% vs. 75.3%, p = 0.00015; SEN: 79.5% vs. 74.6%, p = 0.029; SPE: 81.7% vs. 75.8%, p = 0.0032). General practitioners assisted by AI achieved performance comparable to the average performance of radiologists (ACC: 82.7% vs. 81.8%, p = 0.80; SEN: 84.8% vs. 82.6%, p = 0.72; SPE: 81.2% vs. 81.2%, p > 0.99).
Interpretation: The OvaMTA system based on ultrasound imaging is a simple and practical auxiliary tool with diagnostic performance comparable to that of senior radiologists, and a potential tool for ovarian cancer screening.
Funding: This work was supported by the National Natural Science Foundation of China (Grant Nos. 12090020, 82071929, and 12090025) and the R&D project of the Pazhou Lab (Huangpu) (Grant No. 2023K0605).
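For readers unfamiliar with the two headline metrics in the Findings, the sketch below shows how a Dice score and an AUC are commonly computed. It is a generic illustration under stated assumptions, not the study's evaluation code: the function names `dice_score` and `roc_auc` are hypothetical, masks are flattened binary sequences, an empty prediction against an empty target is scored as 1.0 by convention, and the AUC uses the pairwise (Mann-Whitney) formulation with ties counted as one half.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice = 2 * |A intersect B| / (|A| + |B|) for flattened binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks agree perfectly.
    return 1.0 if total == 0 else 2.0 * intersection / total


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive case outscores a random negative case."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Toy inputs only; these are not values from the study.
    print(dice_score([1, 1, 0, 0], [1, 0, 1, 0]))
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The pairwise AUC is quadratic in the number of cases, which is fine for illustration; a rank-based implementation would be the usual choice for larger test sets.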