Journal of Imaging (Feb 2022)
Considerations on Baseline Generation for Imaging AI Studies Illustrated on the CT-Based Prediction of Empyema and Outcome Assessment
Abstract
For AI-based classification tasks in computed tomography (CT), a reference standard for evaluating the clinical diagnostic accuracy of individual classes is essential. To enable the implementation of an AI tool in clinical practice, the underlying data should be drawn from clinical routine using state-of-the-art scanners, evaluated in a blinded manner, and verified with a reference test. We retrospectively included 335 consecutive CTs performed between 1 January 2016 and 1 January 2021 that had a reported pleural effusion and a pathology report from thoracocentesis or biopsy within 7 days of the CT. Two radiologists (4 and 10 postgraduate years, PGY) blindly assessed the chest CTs for pleural CT features. If needed, consensus was reached using the opinion of an experienced radiologist (29 PGY). In addition, diagnoses were extracted from the written radiological reports. We analyzed these findings for a possible correlation with two patient outcomes: mortality and length of hospital stay. For AI prediction, we used an approach consisting of nnU-Net segmentation, PyRadiomics features, and a random forest model. Specificity and sensitivity for CT-based detection of empyema (n = 81 of n = 335 patients) were 90.94% (95% CI: 86.55–94.05) and 72.84% (95% CI: 61.63–81.85), respectively, across all effusions, with moderate to almost perfect interrater agreement for all pleural findings associated with empyema (Cohen’s kappa = 0.41–0.82). The highest accuracies were found for pleural enhancement and pleural thickening, with 87.02% and 81.49%, respectively. For empyema prediction, AI achieved a specificity and sensitivity of 74.41% (95% CI: 68.50–79.57) and 77.78% (95% CI: 66.91–85.96), respectively. Empyema was associated with a longer hospital stay (median = 20 versus 14 days), and findings consistent with pleural carcinomatosis impacted mortality.
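The reported sensitivities and specificities can be checked with a short calculation. The sketch below is illustrative only and rests on two assumptions that the abstract does not state. First, the counts (59 of 81 and 63 of 81 empyema cases detected; 231 of 254 and 189 of 254 non-empyema cases correctly ruled out) are back-calculated from the reported percentages. Second, the interval type is assumed to be the Wilson score interval with continuity correction, chosen because it appears to reproduce the reported bounds. The function name `wilson_cc_interval` and the `ASSUMED_COUNTS` table are hypothetical.

```python
from math import sqrt
from statistics import NormalDist
from typing import Tuple


def wilson_cc_interval(successes: int, total: int, level: float = 0.95) -> Tuple[float, float]:
    """Wilson score interval with continuity correction for a proportion."""
    if total <= 0 or not 0 <= successes <= total:
        raise ValueError("need total > 0 and 0 <= successes <= total")
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    p = successes / total
    q = 1.0 - p
    denom = 2 * (total + z * z)
    lower_root = sqrt(z * z - 2 - 1 / total + 4 * p * (total * q + 1))
    upper_root = sqrt(z * z + 2 - 1 / total + 4 * p * (total * q - 1))
    lower = (2 * total * p + z * z - 1 - z * lower_root) / denom
    upper = (2 * total * p + z * z + 1 + z * upper_root) / denom
    # By convention the interval touches 0 or 1 when the proportion does.
    lower = 0.0 if successes == 0 else max(0.0, lower)
    upper = 1.0 if successes == total else min(1.0, upper)
    return lower, upper


# ASSUMPTION: counts back-calculated from the reported percentages,
# with 81 empyema and 335 - 81 = 254 non-empyema patients.
ASSUMED_COUNTS = {
    "radiologists, sensitivity": (59, 81),
    "radiologists, specificity": (231, 254),
    "AI, sensitivity": (63, 81),
    "AI, specificity": (189, 254),
}

if __name__ == "__main__":
    for name, (k, n) in ASSUMED_COUNTS.items():
        lo, hi = wilson_cc_interval(k, n)
        print(f"{name}: {100 * k / n:.2f}% (95% CI: {100 * lo:.2f}-{100 * hi:.2f})")
```

Other interval types (for example Clopper-Pearson or the uncorrected Wilson interval) give slightly different bounds, so the agreement here should be read as a plausibility check and not as confirmation of the authors' method.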
Keywords