Applied Sciences (Apr 2023)

Performance of AI-Based Automated Classifications of Whole-Body FDG PET in Clinical Practice: The CLARITI Project

  • Arnaud Berenbaum,
  • Hervé Delingette,
  • Aurélien Maire,
  • Cécile Poret,
  • Claire Hassen-Khodja,
  • Stéphane Bréant,
  • Christel Daniel,
  • Patricia Martel,
  • Lamiae Grimaldi,
  • Marie Frank,
  • Emmanuel Durand,
  • Florent L. Besson

DOI
https://doi.org/10.3390/app13095281
Journal volume & issue
Vol. 13, no. 9
p. 5281

Abstract

Read online

Purpose: To assess the feasibility of a three-dimensional deep convolutional neural network (3D-CNN) for the general triage of whole-body FDG PET in daily clinical practice. Methods: An institutional clinical data warehouse working environment was devoted to this PET imaging purpose. Dedicated request procedures and data processing workflows were specifically developed within this infrastructure and applied retrospectively to a monocentric dataset as a proof of concept. A custom-made 3D-CNN was first trained and tested on an “unambiguous” well-balanced data sample, which included strictly normal and highly pathological scans. For the training phase, 90% of the data sample was used (learning set: 80%; validation set: 20%, 5-fold cross validation) and the remaining 10% constituted the test set. Finally, the model was applied to a “real-life” test set which included any scans taken. Text mining of the PET reports systematically combined with visual rechecking by an experienced reader served as the standard-of-truth for PET labeling. Results: From 8125 scans, 4963 PETs had processable cross-matched medical reports. For the “unambiguous” dataset (1084 PETs), the 3D-CNN’s overall results for sensitivity, specificity, positive and negative predictive values and likelihood ratios were 84%, 98%, 98%, 85%, 42.0 and 0.16, respectively (F1 score of 90%). When applied to the “real-life” dataset (4963 PETs), the sensitivity, NPV, LR+, LR− and F1 score substantially decreased (61%, 40%, 2.97, 0.49 and 73%, respectively), whereas the specificity and PPV remained high (79% and 90%). Conclusion: An AI-based triage of whole-body FDG PET is promising. Further studies are needed to overcome the challenges presented by the imperfection of real-life PET data.

Keywords