Frontiers in Medicine (Nov 2022)

Diagnostic accuracy of artificial intelligence for detecting gastrointestinal luminal pathologies: A systematic review and meta-analysis

  • Om Parkash,
  • Asra Tus Saleha Siddiqui,
  • Uswa Jiwani,
  • Fahad Rind,
  • Zahra Ali Padhani,
  • Arjumand Rizvi,
  • Zahra Hoodbhoy,
  • Jai K. Das,
  • Jai K. Das

DOI
https://doi.org/10.3389/fmed.2022.1018937
Journal volume & issue
Vol. 9

Abstract

Read online

BackgroundArtificial Intelligence (AI) holds considerable promise for diagnostics in the field of gastroenterology. This systematic review and meta-analysis aims to assess the diagnostic accuracy of AI models compared with the gold standard of experts and histopathology for the diagnosis of various gastrointestinal (GI) luminal pathologies including polyps, neoplasms, and inflammatory bowel disease.MethodsWe searched PubMed, CINAHL, Wiley Cochrane Library, and Web of Science electronic databases to identify studies assessing the diagnostic performance of AI models for GI luminal pathologies. We extracted binary diagnostic accuracy data and constructed contingency tables to derive the outcomes of interest: sensitivity and specificity. We performed a meta-analysis and hierarchical summary receiver operating characteristic curves (HSROC). The risk of bias was assessed using Quality Assessment for Diagnostic Accuracy Studies-2 (QUADAS-2) tool. Subgroup analyses were conducted based on the type of GI luminal disease, AI model, reference standard, and type of data used for analysis. This study is registered with PROSPERO (CRD42021288360).FindingsWe included 73 studies, of which 31 were externally validated and provided sufficient information for inclusion in the meta-analysis. The overall sensitivity of AI for detecting GI luminal pathologies was 91.9% (95% CI: 89.0–94.1) and specificity was 91.7% (95% CI: 87.4–94.7). Deep learning models (sensitivity: 89.8%, specificity: 91.9%) and ensemble methods (sensitivity: 95.4%, specificity: 90.9%) were the most commonly used models in the included studies. Majority of studies (n = 56, 76.7%) had a high risk of selection bias while 74% (n = 54) studies were low risk on reference standard and 67% (n = 49) were low risk for flow and timing bias.InterpretationThe review suggests high sensitivity and specificity of AI models for the detection of GI luminal pathologies. There is a need for large, multi-center trials in both high income countries and low- and middle- income countries to assess the performance of these AI models in real clinical settings and its impact on diagnosis and prognosis.Systematic review registration[https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=288360], identifier [CRD42021288360].

Keywords