A Bayesian hierarchical latent trait model for estimating rater bias and reliability in large-scale performance assessment.

Kaja Zupanc; Erik Štrumbelj

doi:10.1371/journal.pone.0195297

PLoS ONE (Jan 2018)

Journal volume & issue: Vol. 13, no. 4, p. e0195297

Abstract

We propose a novel approach to modelling rater effects in scoring-based assessment. The approach is based on a Bayesian hierarchical model and simulations from the posterior distribution. We apply it to large-scale essay assessment data spanning a period of 5 years. Empirical results suggest that the model fits both the total scores and the scores on individual rubrics well. We estimate the median impact of rater effects on the final grade to be ± 2 points on a 50-point scale, while 10% of essays would receive a score that differs by at least 5 points from their actual quality. Most of the impact is due to rater unreliability, not rater bias.
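
To make the distinction between rater bias and rater unreliability concrete, the following is a minimal simulation sketch, not the authors' model. It assumes each essay is scored by a single rater, that each rater has a fixed additive bias, and that unreliability is independent Gaussian noise added to every score. The function names and the parameter values (`bias_sd`, `noise_sd`, number of raters) are hypothetical and chosen only for illustration; they are not estimates from the paper.

```python
import random
import statistics


def simulate_rater_impact(n_essays=20000, n_raters=200,
                          bias_sd=1.0, noise_sd=3.0, seed=0):
    """Return (median, 90th percentile) of |observed score - true quality|.

    Assumptions (illustrative only): one rater per essay, a fixed additive
    bias per rater, and independent Gaussian noise for unreliability.
    """
    rng = random.Random(seed)
    # Each rater gets a persistent bias: how lenient or severe they are.
    biases = [rng.gauss(0.0, bias_sd) for _ in range(n_raters)]

    abs_errors = []
    for _ in range(n_essays):
        rater = rng.randrange(n_raters)
        # The scoring error is the rater's bias plus one-off noise.
        error = biases[rater] + rng.gauss(0.0, noise_sd)
        abs_errors.append(abs(error))

    abs_errors.sort()
    median_err = statistics.median(abs_errors)
    p90_err = abs_errors[int(0.9 * (len(abs_errors) - 1))]
    return median_err, p90_err


def bias_share(bias_sd, noise_sd):
    """Fraction of the error variance attributable to rater bias."""
    total = bias_sd ** 2 + noise_sd ** 2
    return bias_sd ** 2 / total if total > 0 else 0.0


if __name__ == "__main__":
    median_err, p90_err = simulate_rater_impact()
    print(f"median |error|: {median_err:.2f}")
    print(f"90th percentile |error|: {p90_err:.2f}")
    print(f"share of variance due to bias: {bias_share(1.0, 3.0):.2f}")
```

In this toy setting, setting `noise_sd` much larger than `bias_sd` means that most of the scoring error comes from unreliability rather than bias, which is the kind of decomposition the abstract reports. The paper itself obtains such quantities from posterior simulations of a fitted hierarchical model rather than from fixed parameters.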

Published in PLoS ONE

ISSN: 1932-6203 (Online)
Publisher: Public Library of Science (PLoS)
Country of publisher: United States
LCC subjects: Medicine; Science
Website: https://journals.plos.org/plosone/