AI-Based CXR First Reading: Current Limitations to Ensure Practical Value

Yuriy Vasilev; Anton Vladzymyrskyy; Olga Omelyanskaya; Ivan Blokhin; Yury Kirpichev; Kirill Arzamasov

doi:10.3390/diagnostics13081430

Diagnostics (Apr 2023)

Affiliations

Yuriy Vasilev: State Budget-Funded Health Care Institution of the City of Moscow “Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies of the Moscow Health Care Department”, Petrovka Street, 24, Building 1, 127051 Moscow, Russia
Anton Vladzymyrskyy: State Budget-Funded Health Care Institution of the City of Moscow “Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies of the Moscow Health Care Department”, Petrovka Street, 24, Building 1, 127051 Moscow, Russia
Olga Omelyanskaya: State Budget-Funded Health Care Institution of the City of Moscow “Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies of the Moscow Health Care Department”, Petrovka Street, 24, Building 1, 127051 Moscow, Russia
Ivan Blokhin: State Budget-Funded Health Care Institution of the City of Moscow “Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies of the Moscow Health Care Department”, Petrovka Street, 24, Building 1, 127051 Moscow, Russia
Yury Kirpichev: State Budget-Funded Health Care Institution of the City of Moscow “Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies of the Moscow Health Care Department”, Petrovka Street, 24, Building 1, 127051 Moscow, Russia
Kirill Arzamasov: State Budget-Funded Health Care Institution of the City of Moscow “Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies of the Moscow Health Care Department”, Petrovka Street, 24, Building 1, 127051 Moscow, Russia

DOI: https://doi.org/10.3390/diagnostics13081430
Journal volume & issue: Vol. 13, no. 8, p. 1430

Abstract

We performed a multicenter external evaluation of the practical and clinical efficacy of a commercial AI algorithm for chest X-ray (CXR) analysis (Lunit INSIGHT CXR). The retrospective evaluation was a multi-reader study. For the prospective evaluation, the AI model was run on CXR studies and its results were compared with the reports of 226 radiologists. In the multi-reader study, the area under the curve (AUC), sensitivity, and specificity of the AI were 0.94 (95% CI: 0.87–1.0), 0.9 (95% CI: 0.79–1.0), and 0.89 (95% CI: 0.79–0.98); for the radiologists, they were 0.97 (95% CI: 0.94–1.0), 0.9 (95% CI: 0.79–1.0), and 0.95 (95% CI: 0.89–1.0). In most regions of the ROC curve, the AI performed slightly worse than, or on par with, an average human reader. The McNemar test showed no statistically significant differences between the AI and the radiologists. In the prospective study of 4752 cases, the AUC, sensitivity, and specificity of the AI were 0.84 (95% CI: 0.82–0.86), 0.77 (95% CI: 0.73–0.80), and 0.81 (95% CI: 0.80–0.82). The lower accuracy in the prospective validation was mainly associated with false-positive findings that experts considered clinically insignificant and with false-negative omissions of the human-reported findings “opacity”, “nodule”, and “calcification”. In this large-scale prospective validation of the commercial AI algorithm in clinical practice, sensitivity and specificity were lower than in the prior retrospective evaluation on data from the same population.
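
For readers unfamiliar with the metrics named in the abstract, the following is a minimal sketch of how sensitivity, specificity, and an exact McNemar test can be computed for two readers scored against the same reference labels. It is an illustration only, not the authors' analysis code: the function names and the ten toy cases are hypothetical, and the abstract does not state which variant of the McNemar test was used, so the exact binomial form here is an assumption.

```python
from math import comb


def sensitivity_specificity(truth, pred):
    """Return (sensitivity, specificity) for binary labels (1 = abnormal)."""
    tp = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def mcnemar_exact(truth, pred_a, pred_b):
    """Two-sided exact McNemar p-value on the correctness of two paired readers.

    Only discordant cases (one reader right, the other wrong) carry information.
    Under the null hypothesis they split 50/50 between the two readers.
    """
    a_only = sum(1 for t, a, b in zip(truth, pred_a, pred_b) if a == t and b != t)
    b_only = sum(1 for t, a, b in zip(truth, pred_a, pred_b) if a != t and b == t)
    n = a_only + b_only
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(a_only, b_only)
    # Two-sided binomial tail probability with p = 0.5, capped at 1.
    p_value = 2 * sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, p_value)


# Hypothetical toy data: 4 abnormal and 6 normal studies (NOT from the paper).
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
ai = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
reader = [1, 1, 0, 1, 0, 0, 0, 0, 0, 1]

sens_ai, spec_ai = sensitivity_specificity(truth, ai)
sens_rd, spec_rd = sensitivity_specificity(truth, reader)
p_value = mcnemar_exact(truth, ai, reader)

print(f"AI:     sensitivity={sens_ai:.2f}, specificity={spec_ai:.2f}")
print(f"Reader: sensitivity={sens_rd:.2f}, specificity={spec_rd:.2f}")
print(f"McNemar exact p-value: {p_value:.2f}")
```

With so few discordant cases the test cannot detect a difference, which illustrates why a non-significant McNemar result in a small multi-reader set does not by itself establish equivalence.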

Published in Diagnostics

ISSN: 2075-4418 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Medicine: Medicine (General)
Website: http://www.mdpi.com/journal/diagnostics
