BMC Cancer (Aug 2023)

Kappa statistic considerations in evaluating inter-rater reliability between two raters: which, when and context matters

  • Ming Li,
  • Qian Gao,
  • Tianfei Yu

DOI
https://doi.org/10.1186/s12885-023-11325-z
Journal volume & issue
Vol. 23, no. 1
pp. 1–5

Abstract

Background: In research designs that rely on observational ratings provided by two raters, assessing inter-rater reliability (IRR) is a frequently required task. However, some studies use inappropriate statistical procedures, omit information essential for interpreting their findings, or fail to address the impact of IRR on the statistical power of subsequent hypothesis tests.

Methods: This article examines the recent publication by Liu et al. in BMC Cancer, analyzing the controversy surrounding the Kappa statistic and methodological issues in the assessment of IRR. The primary focus is the appropriate selection, computation, interpretation, and reporting of two frequently used IRR statistics when two raters are involved.

Results: Cohen's Kappa statistic is typically used to assess agreement between two raters for binary variables or for unordered categorical variables with three or more categories. The weighted Kappa, by contrast, is the widely used measure of agreement between two raters for ordered categorical variables with three or more categories.

Conclusion: Although the statistical dispute does not substantially affect the findings of Liu et al.'s study, it underscores the importance of employing suitable statistical methods. Rigorous and accurate statistical results are essential for producing trustworthy research.
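Cohen's Kappa is defined as κ = (p_o − p_e) / (1 − p_e), where p_o is the observed proportion of agreement and p_e is the agreement expected by chance; the weighted Kappa extends this by giving partial credit to disagreements that lie close together on an ordered scale. To make the distinction concrete, the following is a minimal sketch in Python using scikit-learn's cohen_kappa_score; the rating vectors and category labels are hypothetical and serve only to illustrate unweighted versus weighted Kappa.

    # Minimal sketch (hypothetical ratings): unweighted Cohen's Kappa for
    # nominal categories vs. weighted Kappa for ordered categories.
    from sklearn.metrics import cohen_kappa_score

    # Two raters assigning unordered categories (e.g., subtype labels).
    rater1_nominal = ["A", "B", "B", "C", "A", "C", "B", "A"]
    rater2_nominal = ["A", "B", "C", "C", "A", "B", "B", "A"]

    # Unweighted Cohen's Kappa: every disagreement counts equally.
    kappa_nominal = cohen_kappa_score(rater1_nominal, rater2_nominal)

    # Two raters assigning ordered grades (e.g., severity 1 < 2 < 3).
    rater1_ordinal = [1, 2, 2, 3, 1, 3, 2, 1]
    rater2_ordinal = [1, 3, 2, 3, 2, 3, 2, 1]

    # Weighted Kappa: disagreements are penalized by how far apart the
    # grades are ("linear" or "quadratic" weights).
    kappa_weighted = cohen_kappa_score(rater1_ordinal, rater2_ordinal,
                                       weights="quadratic")

    print(f"Unweighted Kappa (nominal): {kappa_nominal:.3f}")
    print(f"Quadratic-weighted Kappa (ordinal): {kappa_weighted:.3f}")

With quadratic weights, a disagreement between grades 1 and 3 is penalized more heavily than one between grades 2 and 3, which is why the weighted statistic is preferred for ordered categories.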

Keywords