Quality assessment standards in artificial intelligence diagnostic accuracy systematic reviews: a meta-research study

Shruti Jayakumar; Viknesh Sounderajah; Pasha Normahani; Leanne Harling; Sheraz R. Markar; Hutan Ashrafian; Ara Darzi

doi:10.1038/s41746-021-00544-y

npj Digital Medicine (Jan 2022)

Affiliations

Shruti Jayakumar: Department of Surgery and Cancer, Imperial College London
Viknesh Sounderajah: Department of Surgery and Cancer, Imperial College London
Pasha Normahani: Department of Surgery and Cancer, Imperial College London
Leanne Harling: Department of Surgery and Cancer, Imperial College London
Sheraz R. Markar: Department of Surgery and Cancer, Imperial College London
Hutan Ashrafian: Department of Surgery and Cancer, Imperial College London
Ara Darzi: Department of Surgery and Cancer, Imperial College London

DOI: https://doi.org/10.1038/s41746-021-00544-y
Journal volume & issue: Vol. 5, no. 1
pp. 1 – 13

Abstract

Artificial intelligence (AI)-centred diagnostic systems are increasingly recognised as robust solutions in healthcare delivery pathways. In turn, there has been a concurrent rise in secondary research studies of these technologies, intended to influence key clinical and policymaking decisions. It is therefore essential that these studies accurately appraise methodological quality and risk of bias within shortlisted trials and reports. To assess whether this critical step is performed, we undertook a meta-research study evaluating adherence to the Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) tool within AI diagnostic accuracy systematic reviews. A literature search was conducted on all studies published from 2000 to December 2020. Of 50 included reviews, 36 performed a quality assessment, of which 27 utilised the QUADAS-2 tool. Bias was reported across all four domains of QUADAS-2. Two hundred forty-three of 423 studies (57.5%) across all systematic reviews utilising QUADAS-2 reported a high or unclear risk of bias in the patient selection domain, 110 (26%) reported a high or unclear risk of bias in the index test domain, 121 (28.6%) in the reference standard domain and 157 (37.1%) in the flow and timing domain. This study demonstrates the incomplete uptake of quality assessment tools in reviews of AI-based diagnostic accuracy studies and highlights inconsistent reporting across all domains of quality assessment. Poor standards of reporting act as barriers to clinical implementation. The creation of an AI-specific extension for quality assessment tools for AI diagnostic accuracy studies may facilitate the safe translation of AI tools into clinical practice.

Published in npj Digital Medicine

ISSN: 2398-6352 (Online)
Publisher: Nature Portfolio
Country of publisher: United Kingdom
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics
Website: https://www.nature.com/npjdigitalmed/
