npj Digital Medicine (May 2024)

Artificial intelligence in digital pathology: a systematic review and meta-analysis of diagnostic test accuracy

  • Clare McGenity,
  • Emily L. Clarke,
  • Charlotte Jennings,
  • Gillian Matthews,
  • Caroline Cartlidge,
  • Henschel Freduah-Agyemang,
  • Deborah D. Stocken,
  • Darren Treanor

DOI
https://doi.org/10.1038/s41746-024-01106-8
Journal volume & issue
Vol. 7, no. 1
pp. 1 – 19

Abstract

Ensuring the diagnostic performance of artificial intelligence (AI) before its introduction into clinical practice is essential. Growing numbers of studies using AI for digital pathology have been reported over recent years. The aim of this work is to examine the diagnostic accuracy of AI in digital pathology images for any disease. This systematic review and meta-analysis included diagnostic accuracy studies using any type of AI applied to whole slide images (WSIs) for any disease. The reference standard was diagnosis by histopathological assessment and/or immunohistochemistry. Searches were conducted in PubMed, EMBASE and CENTRAL in June 2022. Risk of bias and applicability concerns were assessed using the QUADAS-2 tool. Data extraction was conducted by two investigators and meta-analysis was performed using a bivariate random effects model, with additional subgroup analyses. Of 2976 identified studies, 100 were included in the review and 48 in the meta-analysis. The included studies came from a range of countries and covered over 152,000 WSIs representing many diseases. These studies reported a mean sensitivity of 96.3% (CI 94.1–97.7) and a mean specificity of 93.3% (CI 90.5–95.4). There was heterogeneity in study design, and 99% of studies identified for inclusion had at least one area at high or unclear risk of bias or applicability concerns. Details on the selection of cases, the division of data between model development and validation, and raw performance data were frequently ambiguous or missing. AI is reported as having high diagnostic accuracy in the reported areas but requires more rigorous evaluation of its performance.
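
As an illustration of the quantities the abstract reports, the sketch below computes sensitivity and specificity from a single study's 2x2 table, together with a Wilson confidence interval and the logit transform on which bivariate models are commonly fitted. It is not the review's analysis: the bivariate random effects pooling step is not implemented here, the function names are hypothetical, and the counts are invented for demonstration rather than taken from any included study.

```python
import math


def sens_spec(tp, fp, fn, tn):
    """Return (sensitivity, specificity) from 2x2 counts of one study."""
    sensitivity = tp / (tp + fn)  # diseased cases correctly flagged
    specificity = tn / (tn + fp)  # non-diseased cases correctly cleared
    return sensitivity, specificity


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


def logit(p):
    """Log-odds transform; undefined for p equal to 0 or 1."""
    return math.log(p / (1 - p))


# Hypothetical counts, for illustration only (not from the review).
tp, fp, fn, tn = 90, 20, 10, 80

sens, spec = sens_spec(tp, fp, fn, tn)
sens_ci = wilson_interval(tp, tp + fn)
spec_ci = wilson_interval(tn, tn + fp)

print(f"sensitivity = {sens:.3f}, CI = ({sens_ci[0]:.3f}, {sens_ci[1]:.3f})")
print(f"specificity = {spec:.3f}, CI = ({spec_ci[0]:.3f}, {spec_ci[1]:.3f})")
print(f"logit(sens) = {logit(sens):.3f}, logit(spec) = {logit(spec):.3f}")
```

A pooled estimate would need the per-study pairs of logit sensitivity and logit specificity to be modelled jointly with between-study variation, which is why the abstract notes that missing raw performance data limits what can be included in the meta-analysis.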