Digital Diagnostics (Jul 2024)

Comparison of the methods of operation of the artificial intelligence system in the ultra-high sensitivity mode for the autonomous description of chest X-rays without pathology

  • Evgeniy D. Nikitin,
  • Nikita S. Plaksin,
  • Maria B. Garetz,
  • Evgeniy M. Gutin

DOI
https://doi.org/10.17816/DD626001
Journal volume & issue
Vol. 5, no. 1S
pp. 71 – 73

Abstract

BACKGROUND: Up to 95% of digital fluorography screening studies show no pathologic changes, and radiologists typically spend most of their time reviewing and describing such studies. Artificial intelligence systems can automate these descriptions and save physicians’ time [1–3].

AIM: To compare the performance of several algorithms within an existing artificial intelligence system in an ultra-high sensitivity scenario and to estimate the percentage of X-rays that could be described automatically.

MATERIALS AND METHODS: The artificial intelligence system “Cels.Fluorography” version 0.15.3 was used for the analysis. The comparison used a dataset collected from different medical organizations, comprising 11,707 studies without pathology and 5,846 studies with pathology. To calculate the metrics, a subsample of 500 studies with pathology and 9,500 studies without pathology (a 5% to 95% balance) was randomly drawn from the dataset 1,000 times, and the resulting metrics were averaged. Labels from two physicians served as the source of the target variable; when they disagreed, the study was reviewed by an expert physician. An X-ray was considered pathological if the final labeling contained at least one of 12 radiological features [4]. Five methods of computing a study-level score were compared: the maximum (1) and mean (2) probability of the radiological features localized by the detector neural network; the maximum (3) and mean (4) probability of feature presence from dedicated “heads” of the neural network, each trained to determine whether a given feature is present on the image (0 for absent, 1 for present); and the probability (5) from a separate “head” trained to determine the binary presence of pathology in the study (0 for normal, 1 for pathology). For each method, a response threshold was selected so that no more than one pathology was missed per 1,000 examinations in the current subsample. The main quality metric was the percentage of X-rays that artificial intelligence could correctly identify as pathology-free.

RESULTS: For methods 1 to 5, respectively, the average percentages of normal studies screened out were 66.4%, 72.2%, 69.0%, 74.1%, and 68.7%, and the areas under the ROC curve were 0.948, 0.957, 0.964, 0.967, and 0.971. The 95% confidence interval for the screen-out rate of the best method was 66.1% to 79.4%.

CONCLUSIONS: Modern artificial intelligence systems can automate the description of a significant portion of screening studies. The best method for screening out normal studies (over 74% of the flow) was averaging the probabilities from the dedicated “heads” of the neural network trained to identify the presence of each radiological feature (method 4).
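The evaluation procedure described in the methods can be illustrated with a minimal sketch. It is not the authors' code. All function names are hypothetical, and several details are assumptions the abstract does not specify: a study is treated as normal when its score is strictly below the threshold, subsamples are drawn without replacement, and the screen-out share is computed over normal studies rather than over all studies.

```python
import random
from statistics import mean


def study_score(feature_probs, how="mean"):
    """Collapse per-feature probabilities for one study into a single score
    (maximum or mean, as in methods 1-4 of the abstract)."""
    return max(feature_probs) if how == "max" else mean(feature_probs)


def pick_threshold(scores, labels, max_missed_per_1000=1):
    """Largest threshold t such that at most the allowed number of
    pathological studies (label 1) have score < t."""
    allowed = int(max_missed_per_1000 * len(scores) // 1000)
    path_scores = sorted(s for s, y in zip(scores, labels) if y == 1)
    if allowed >= len(path_scores):
        return float("inf")  # every pathological study may be missed
    # Only the `allowed` lowest pathological scores can fall strictly below this.
    return path_scores[allowed]


def screened_out_share(scores, labels, threshold):
    """Share of normal studies (label 0) scoring below the threshold.
    Assumption: the denominator is the number of normal studies."""
    normal_scores = [s for s, y in zip(scores, labels) if y == 0]
    return sum(s < threshold for s in normal_scores) / len(normal_scores)


def resampled_share(scores, labels, n_path=500, n_norm=9500,
                    n_rounds=1000, seed=0):
    """Average screen-out share over repeated random subsamples with a
    fixed pathology/normal balance (sampling without replacement is assumed)."""
    rng = random.Random(seed)
    path = [s for s, y in zip(scores, labels) if y == 1]
    norm = [s for s, y in zip(scores, labels) if y == 0]
    shares = []
    for _ in range(n_rounds):
        sub_scores = rng.sample(path, n_path) + rng.sample(norm, n_norm)
        sub_labels = [1] * n_path + [0] * n_norm
        t = pick_threshold(sub_scores, sub_labels)
        shares.append(screened_out_share(sub_scores, sub_labels, t))
    return mean(shares)
```

With the default arguments, each round draws 10,000 studies, so the threshold tolerates at most 10 missed pathological studies, which corresponds to the limit of one per 1,000 examinations.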

Keywords