Journal of Language Horizons (Jul 2019)

The Investigation of Rater Expertise in Oral Language Proficiency Assessment: A Multifaceted Rasch Analysis

  • Houman Bijani

DOI
https://doi.org/10.22051/lghor.2019.26072.1123
Journal volume & issue
Vol. 2, no. 2
pp. 103 – 124

Abstract

Since oral language proficiency is scored by raters, they are an essential part of performance assessment. One important feature of raters is their teaching and rating experience, which has attracted considerable attention. In the majority of previous studies on rater training, extremely severe or lenient raters benefited more from training programs, and the results of this training therefore showed a significant reduction in severity/leniency in their rating behavior. However, these studies mostly applied FACETS to only one or two facets, and few used a pre- and post-training design. Besides, empirical studies have reported contrasting outcomes and have not shown clearly which group of raters rates more reliably than the other. In this study, 20 experienced and inexperienced raters rated the oral performances produced by 200 test-takers before and after a training program. The results indicated that training leads to higher measures of interrater consistency and reduces measures of bias in the use of rating scale categories. Moreover, since it is almost impossible to eradicate rater variability completely even when training is applied, rater training should be regarded as a procedure for making raters more self-consistent (intrarater reliability) rather than consistent with each other (interrater reliability). The findings of this study indicated that the rating quality of both inexperienced and experienced raters improved after training; however, inexperienced raters showed higher consistency and less bias. Hence, there is no evidence that inexperienced raters should be excluded from rating solely because of their lack of adequate experience. Moreover, inexperienced raters are more economical than experienced ones and thus cost decision-makers less. Therefore, instead of spending a large budget on experienced raters, decision-makers would do better to use that budget to establish better training programs.

Keywords