Systematic Reviews (Jun 2018)

Evaluation of the reliability, usability, and applicability of AMSTAR, AMSTAR 2, and ROBIS: protocol for a descriptive analytic study

  • Allison Gates,
  • Michelle Gates,
  • Gonçalo Duarte,
  • Maria Cary,
  • Monika Becker,
  • Barbara Prediger,
  • Ben Vandermeer,
  • Ricardo M. Fernandes,
  • Dawid Pieper,
  • Lisa Hartling

DOI
https://doi.org/10.1186/s13643-018-0746-1
Journal volume & issue
Vol. 7, no. 1
pp. 1 – 7

Abstract

Read online

Abstract Background Systematic reviews (SRs) of randomised controlled trials (RCTs) can provide the best evidence to inform decision-making, but their methodological and reporting quality varies. Tools exist to guide the critical appraisal of quality and risk of bias in SRs, but evaluations of their measurement properties are limited. We will investigate the interrater reliability (IRR), usability, and applicability of A MeaSurement Tool to Assess systematic Reviews (AMSTAR), AMSTAR 2, and Risk Of Bias In Systematic reviews (ROBIS) for SRs in the fields of biomedicine and public health. Methods An international team of researchers at three collaborating centres will undertake the study. We will use a random sample of 30 SRs of RCTs investigating therapeutic interventions indexed in MEDLINE in February 2014. Two reviewers at each centre will appraise the quality and risk of bias in each SR using AMSTAR, AMSTAR 2, and ROBIS. We will record the time to complete each assessment and for the two reviewers to reach consensus for each SR. We will extract the descriptive characteristics of each SR, the included studies, participants, interventions, and comparators. We will also extract the direction and strength of the results and conclusions for the primary outcome. We will summarise the descriptive characteristics of the SRs using means and standard deviations, or frequencies and proportions. To test for interrater reliability between reviewers and between the consensus agreements of reviewer pairs, we will use Gwet’s AC1 statistic. For comparability to previous evaluations, we will also calculate weighted Cohen’s kappa and Fleiss’ kappa statistics. To estimate usability, we will calculate the mean time to complete the appraisal and to reach consensus for each tool. To inform applications of the tools, we will test for statistical associations between quality scores and risk of bias judgments, and the results and conclusions of the SRs. Discussion Appraising the methodological and reporting quality of SRs is necessary to determine the trustworthiness of their conclusions. Which tool may be most reliably applied and how the appraisals should be used is uncertain; the usability of newly developed tools is unknown. This investigation of common (AMSTAR) and newly developed (AMSTAR 2, ROBIS) tools will provide empiric data to inform their application, interpretation, and refinement.

Keywords