Scientific Reports (Aug 2022)

Testing the applicability and performance of Auto ML for potential applications in diagnostic neuroradiology

  • Manfred Musigmann,
  • Burak Han Akkurt,
  • Hermann Krähling,
  • Nabila Gala Nacul,
  • Luca Remonda,
  • Thomas Sartoretti,
  • Dylan Henssen,
  • Benjamin Brokinkel,
  • Walter Stummer,
  • Walter Heindel,
  • Manoj Mannil

DOI
https://doi.org/10.1038/s41598-022-18028-8
Journal volume & issue
Vol. 12, no. 1
pp. 1 – 11

Abstract

Read online

Abstract To investigate the applicability and performance of automated machine learning (AutoML) for potential applications in diagnostic neuroradiology. In the medical sector, there is a rapidly growing demand for machine learning methods, but only a limited number of corresponding experts. The comparatively simple handling of AutoML should enable even non-experts to develop adequate machine learning models with manageable effort. We aim to investigate the feasibility as well as the advantages and disadvantages of developing AutoML models compared to developing conventional machine learning models. We discuss the results in relation to a concrete example of a medical prediction application. In this retrospective IRB-approved study, a cohort of 107 patients who underwent gross total meningioma resection and a second cohort of 31 patients who underwent subtotal resection were included. Image segmentation of the contrast enhancing parts of the tumor was performed semi-automatically using the open-source software platform 3D Slicer. A total of 107 radiomic features were extracted by hand-delineated regions of interest from the pre-treatment MRI images of each patient. Within the AutoML approach, 20 different machine learning algorithms were trained and tested simultaneously. For comparison, a neural network and different conventional machine learning algorithms were trained and tested. With respect to the exemplary medical prediction application used in this study to evaluate the performance of Auto ML, namely the pre-treatment prediction of the achievable resection status of meningioma, AutoML achieved remarkable performance nearly equivalent to that of a feed-forward neural network with a single hidden layer. However, in the clinical case study considered here, logistic regression outperformed the AutoML algorithm. Using independent test data, we observed the following classification results (AutoML/neural network/logistic regression): mean area under the curve = 0.849/0.879/0.900, mean accuracy = 0.821/0.839/0.881, mean kappa = 0.465/0.491/0.644, mean sensitivity = 0.578/0.577/0.692 and mean specificity = 0.891/0.914/0.936. The results obtained with AutoML are therefore very promising. However, the AutoML models in our study did not yet show the corresponding performance of the best models obtained with conventional machine learning methods. While AutoML may facilitate and simplify the task of training and testing machine learning algorithms as applied in the field of neuroradiology and medical imaging, a considerable amount of expert knowledge may still be needed to develop models with the highest possible discriminatory power for diagnostic neuroradiology.