Reproducibility of artificial intelligence models in computed tomography of the head: a quantitative analysis

Felix Gunzer; Michael Jantscher; Eva M. Hassler; Thomas Kau; Gernot Reishofer

doi:10.1186/s13244-022-01311-7

Insights into Imaging (Oct 2022)

Affiliations

Felix Gunzer: Division of Neuroradiology, Vascular and Interventional Radiology, Department of Radiology, Medical University Graz
Michael Jantscher: Research Center for Data-Driven Business Big Data Analytics, Know-Center GmbH
Eva M. Hassler: Division of Neuroradiology, Vascular and Interventional Radiology, Department of Radiology, Medical University Graz
Thomas Kau: Department of Radiology, Landeskrankenhaus Villach
Gernot Reishofer: Department of Radiology, Medical University Graz

DOI: https://doi.org/10.1186/s13244-022-01311-7
Journal volume & issue: Vol. 13, no. 1
pp. 1 – 8

Abstract

When developing artificial intelligence (AI) software for applications in radiology, the underlying research must be transferable to other real-world problems. To verify to what degree this is true, we reviewed research on AI algorithms for computed tomography of the head. A systematic review was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA). We identified 83 articles and analyzed them in terms of transparency of data and code, pre-processing, type of algorithm, architecture, hyperparameters, performance measures, and balancing of the dataset in relation to epidemiology. We also classified all articles by their main functionality (classification, detection, segmentation, prediction, triage, image reconstruction, image registration, fusion of imaging modalities). We found that only a minority of authors provided open source code (10.15%, n = 7), making the replication of results difficult. Convolutional neural networks were predominantly used (32.61%, n = 15), whereas hyperparameters were less frequently reported (32.61%, n = 15). Data sets were mostly from single-center sources (84.05%, n = 58), which makes the models more susceptible to bias and thereby raises their error rate. The prevalence of brain lesions in the training (0.49 ± 0.30) and testing (0.45 ± 0.29) datasets differed from real-world epidemiology (0.21 ± 0.28), which may lead to overestimated performance. This review highlights the need for open source code, external validation, and consideration of disease prevalence.
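
The point about prevalence can be illustrated with a small calculation. The following is a minimal sketch, not taken from the reviewed articles: it assumes a hypothetical classifier with fixed sensitivity and specificity of 0.90 (illustrative values only) and uses Bayes' theorem to compare the positive predictive value at the mean test-set prevalence (0.45) with that at the mean real-world prevalence (0.21) reported in the abstract.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Return the positive predictive value (PPV) via Bayes' theorem.

    PPV = P(disease | positive test)
        = sens * prev / (sens * prev + (1 - spec) * (1 - prev))
    """
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


# Hypothetical classifier characteristics (assumed for illustration,
# not reported in the source).
SENSITIVITY = 0.90
SPECIFICITY = 0.90

# Mean prevalences quoted in the abstract.
TEST_SET_PREVALENCE = 0.45
REAL_WORLD_PREVALENCE = 0.21

ppv_test = positive_predictive_value(SENSITIVITY, SPECIFICITY, TEST_SET_PREVALENCE)
ppv_real = positive_predictive_value(SENSITIVITY, SPECIFICITY, REAL_WORLD_PREVALENCE)

print(f"PPV at test-set prevalence:   {ppv_test:.2f}")
print(f"PPV at real-world prevalence: {ppv_real:.2f}")
```

Under these assumed values the same model yields a PPV of about 0.88 on the enriched test set but only about 0.71 at real-world prevalence, which is one way a prevalence mismatch can make reported performance look better than it would be in practice.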

Published in Insights into Imaging

ISSN: 1869-4101 (Online)
Publisher: SpringerOpen
Country of publisher: Germany
LCC subjects: Medicine: Medicine (General): Medical physics. Medical radiology. Nuclear medicine
Website: http://www.springer.com/13244
