Symmetry (Jan 2022)

An Empirical Comparative Assessment of Inter-Rater Agreement of Binary Outcomes and Multiple Raters

  • Menelaos Konstantinidis,
  • Lisa W. Le,
  • Xin Gao

DOI: https://doi.org/10.3390/sym14020262
Journal volume & issue: Vol. 14, No. 2, p. 262

Abstract

Background: Many methods under the umbrella of inter-rater agreement (IRA) have been proposed to evaluate how well two or more medical experts agree on a set of outcomes. The objective of this work was to assess key IRA statistics in the context of multiple raters with binary outcomes. Methods: We simulated responses from multiple raters (2–5) on 20, 50, 300, and 500 observations. For each combination of raters and observations, we estimated the expected value and variance of four commonly used inter-rater agreement statistics (Fleiss’ Kappa, Light’s Kappa, Conger’s Kappa, and Gwet’s AC1). Results: When outcome prevalences were equal (the symmetric case), the estimated expected values of all four statistics were equal. In the asymmetric case, only the estimated expected values of the three Kappa statistics were equal. In the symmetric case, Fleiss’ Kappa yielded a higher estimated variance than the other three statistics. In the asymmetric case, Gwet’s AC1 yielded a lower estimated variance than the three Kappa statistics in every scenario. Conclusion: Since the population-level prevalence of a set of outcomes may not be known a priori, Gwet’s AC1 statistic should be favored over the three Kappa statistics. For meaningful direct comparisons between IRA measures, transformations between statistics should be conducted.
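
To make the quantities discussed in the abstract concrete, the following is a minimal sketch (not the authors' simulation code) of how Fleiss' Kappa and Gwet's AC1 can be computed for multiple raters with binary outcomes. The NumPy-only implementation, the function names, and the illustrative 5-rater, 50-subject run are assumptions made for illustration only.

import numpy as np

def category_counts(ratings, categories=(0, 1)):
    # ratings: (n_subjects, n_raters) array of category labels.
    # Returns an (n_subjects, n_categories) matrix of per-category counts n_ij.
    return np.stack([(ratings == c).sum(axis=1) for c in categories], axis=1)

def fleiss_kappa(ratings, categories=(0, 1)):
    counts = category_counts(ratings, categories)
    n_subjects, n_raters = ratings.shape
    # Per-subject observed agreement, averaged over subjects.
    p_i = (np.sum(counts**2, axis=1) - n_raters) / (n_raters * (n_raters - 1))
    p_obs = p_i.mean()
    # Chance agreement from the overall category proportions.
    p_j = counts.sum(axis=0) / (n_subjects * n_raters)
    p_exp = np.sum(p_j**2)
    return (p_obs - p_exp) / (1 - p_exp)

def gwet_ac1(ratings, categories=(0, 1)):
    counts = category_counts(ratings, categories)
    n_subjects, n_raters = ratings.shape
    k = len(categories)
    p_i = (np.sum(counts**2, axis=1) - n_raters) / (n_raters * (n_raters - 1))
    p_obs = p_i.mean()
    # Gwet's chance-agreement term: (1 / (k - 1)) * sum_j pi_j * (1 - pi_j),
    # where pi_j is the average proportion of raters choosing category j.
    pi_j = counts.mean(axis=0) / n_raters
    p_exp = np.sum(pi_j * (1 - pi_j)) / (k - 1)
    return (p_obs - p_exp) / (1 - p_exp)

# Illustrative run: 5 raters assigning independent Bernoulli(0.5) ratings to
# 50 subjects (an equal-prevalence, i.e. symmetric, scenario). Because the
# raters are independent, agreement is only by chance and both statistics
# should be close to zero.
rng = np.random.default_rng(0)
ratings = rng.binomial(1, 0.5, size=(50, 5))
print("Fleiss' Kappa:", round(fleiss_kappa(ratings), 3))
print("Gwet's AC1:  ", round(gwet_ac1(ratings), 3))

A full replication of the study would repeat such draws many times across the rater counts (2–5) and sample sizes (20, 50, 300, and 500) and across symmetric and asymmetric prevalence settings, recording the mean and variance of each statistic; the sketch above shows only a single draw.
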

Keywords