Chinese Medical Journal (Jun 2023)

Development of the Scientific, Transparent and Applicable Rankings (STAR) tool for clinical practice guidelines

  • Nan Yang,
  • Hui Liu,
  • Wei Zhao,
  • Yang Pan,
  • Xiangzheng Lyu,
  • Xiuyuan Hao,
  • Xiaoqing Liu,
  • Wen'an Qi,
  • Tong Chen,
  • Xiaoqin Wang,
  • Boheng Zhang,
  • Weishe Zhang,
  • Qiu Li,
  • Dong Xu,
  • Xinghua Gao,
  • Yinghui Jin,
  • Feng Sun,
  • Wenbo Meng,
  • Guobao Li,
  • Qijun Wu,
  • Ze Chen,
  • Xu Wang,
  • Janne Estill,
  • Susan L. Norris,
  • Liang Du,
  • Yaolong Chen,
  • Junmin Wei,
  • Xiuyuan Hao

DOI
https://doi.org/10.1097/CM9.0000000000002713
Journal volume & issue
Vol. 136, no. 12
pp. 1430 – 1438

Abstract

Background: This study aimed to develop a comprehensive instrument for evaluating and ranking clinical practice guidelines, the Scientific, Transparent and Applicable Rankings (STAR) tool, and to test its reliability, validity, and usability.

Methods: A multidisciplinary working group was set up, including guideline methodologists, statisticians, journal editors, clinicians, and other experts. A scoping review, Delphi methods, and hierarchical analysis were used to develop the STAR tool. We evaluated the instrument's intrinsic and interrater reliability, content and criterion validity, and usability.

Results: STAR contained 39 items grouped into 11 domains. The mean intrinsic reliability of the domains, indicated by Cronbach's α coefficient, was 0.588 (95% confidence interval [CI]: 0.414, 0.762). Interrater reliability, as assessed with Cohen's kappa coefficient, was 0.774 (95% CI: 0.740, 0.807) for methodological evaluators and 0.618 (95% CI: 0.587, 0.648) for clinical evaluators. The overall content validity index was 0.905. Pearson's correlation coefficient (r) for criterion validity was 0.885 (95% CI: 0.804, 0.932). The mean usability score of the items was 4.6, and the median time spent evaluating each guideline was 20 min.

Conclusion: The instrument performed well in terms of reliability, validity, and efficiency, and can be used for comprehensively evaluating and ranking guidelines.
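
For readers unfamiliar with the two reliability statistics reported above, the following is a minimal sketch of their textbook definitions. It is not the authors' analysis code: the abstract does not state how the coefficients were computed (for example, whether a weighted kappa was used or how more than two raters were handled), the function names are illustrative, and the toy ratings are invented for the example rather than taken from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows (one row per rated guideline,
    one column per item). Uses sample variances."""
    k = len(scores[0])
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    n = len(rater_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    categories = set(rater_a) | set(rater_b)
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)


# Hypothetical toy data (NOT from the study).
toy_item_scores = [[2, 2], [4, 4], [6, 6]]   # two perfectly consistent items
toy_rater_a = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
toy_rater_b = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]

alpha = cronbach_alpha(toy_item_scores)
kappa = cohen_kappa(toy_rater_a, toy_rater_b)
print(f"alpha = {alpha:.3f}, kappa = {kappa:.3f}")
```

In this toy case the two raters agree on 8 of 10 items while chance agreement is 0.5, so kappa is about 0.6; values of alpha near 1 indicate that the items within a domain vary together.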