Education Sciences (Oct 2021)

Low Inter-Rater Reliability of a High Stakes Performance Assessment of Teacher Candidates

  • Scott A. Lyness
  • Kent Peterson
  • Kenneth Yates

DOI: https://doi.org/10.3390/educsci11100648
Journal volume & issue: Vol. 11, No. 10, p. 648

Abstract

The Performance Assessment for California Teachers (PACT) is a high-stakes summative assessment designed to measure pre-service teacher readiness. We examined the inter-rater reliability (IRR) of trained PACT evaluators who rated 19 candidates. As measured by Cohen’s weighted kappa, the overall IRR estimate was 0.17 (poor strength of agreement). Individual IRR estimates ranged from −0.29 (worse than expected by chance) to 0.54 (moderate strength of agreement); all fell below the 0.70 standard for consensus agreement. Follow-up interviews with 10 evaluators revealed possible reasons for the low IRR, such as departures from the established PACT scoring protocol and inconsistent or absent use of a scoring aid document. Evaluators reported difficulties scoring the materials that candidates submitted, particularly with respect to Academic Language. Cognitive Task Analysis (CTA) is suggested as a method to improve IRR in the PACT and in other teacher performance assessments such as the edTPA.
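For readers unfamiliar with the statistic reported above, the following is a minimal sketch of how Cohen’s weighted kappa can be computed for a pair of raters. The rubric scale, the rater scores, and the choice of quadratic weighting are illustrative assumptions; the abstract does not specify the weighting scheme or the underlying data.

    # Minimal sketch: Cohen's weighted kappa for two raters (hypothetical data).
    # The 1-4 rubric scores and the quadratic weighting are assumptions for
    # illustration only, not the study's actual scores or settings.
    from sklearn.metrics import cohen_kappa_score

    rater_a = [2, 3, 3, 1, 4, 2, 3, 2, 1, 3]  # hypothetical scores, rater A
    rater_b = [3, 3, 2, 1, 4, 2, 2, 3, 2, 3]  # hypothetical scores, rater B

    kappa = cohen_kappa_score(rater_a, rater_b, weights="quadratic")
    print(f"Weighted kappa: {kappa:.2f}")  # compare against the 0.70 consensus standard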

Keywords