BMC Medical Education (Oct 2022)

Is the assumption of equal distances between global assessment categories used in borderline regression valid?

  • Patrick J. McGown,
  • Celia A. Brown,
  • Ann Sebastian,
  • Ricardo Le,
  • Anjali Amin,
  • Andrew Greenland,
  • Amir H. Sam

DOI: https://doi.org/10.1186/s12909-022-03753-5
Journal volume & issue: Vol. 22, no. 1, pp. 1–10

Abstract

Background: Standard setting for clinical examinations typically uses the borderline regression method to set the pass mark. This method assumes that the intervals between global ratings (GRs) (e.g. Fail, Borderline Pass, Clear Pass, Good and Excellent) are equal. However, to the best of our knowledge this assumption has never been tested in the medical literature. We examine whether the assumption of equal intervals between GRs is met, and the potential implications for student outcomes.

Methods: Clinical finals examiners were recruited across two institutions to place the typical ‘Borderline Pass’, ‘Clear Pass’ and ‘Good’ candidate on a continuous slider scale between a typical ‘Fail’ candidate at point 0 and a typical ‘Excellent’ candidate at point 1. Each interval was compared with an equal interval size of 0.25 using one-sample t-tests. Secondary data analysis was performed on summative assessment scores for 94 clinical stations and 1191 medical student examination outcomes in the final 2 years of study at a single centre.

Results: On a scale from 0.00 (Fail) to 1.00 (Excellent), mean examiner GRs for ‘Borderline Pass’, ‘Clear Pass’ and ‘Good’ were 0.33, 0.55 and 0.77 respectively. All four intervals between GRs (Fail-Borderline Pass, Borderline Pass-Clear Pass, Clear Pass-Good, Good-Excellent) were statistically significantly different from the expected value of 0.25 (all p-values < 0.0125). An ordinal linear regression using the mean examiner GR locations was then performed for each of the 94 stations to determine pass marks out of 24. This increased the pass mark for all 94 stations compared with the original GR locations (mean increase 0.21 marks), and resulted in one additional fail by the overall exam pass mark (out of 1191 students) and 92 additional station fails (out of 11,346 stations).

Conclusions: Although the current assumption of equal intervals between GRs across the performance spectrum is not met, and an adjusted regression equation increases station pass marks, the effect on overall exam pass/fail outcomes is modest.
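To make the role of GR spacing concrete, the following Python sketch (using entirely hypothetical station scores and ratings, not data from the study) fits the borderline regression for one station: checklist scores out of 24 are regressed on the numeric GR locations, and the pass mark is read off as the predicted score at the ‘Borderline Pass’ location. It compares the equal-interval assumption (0.25 spacing) with the mean examiner placements reported above (0.33, 0.55, 0.77).

# Sketch of the borderline regression method under two GR spacings.
# All scores and ratings below are hypothetical, for illustration only.
import numpy as np

# Station checklist scores (out of 24) and examiner global ratings:
# 0 = Fail, 1 = Borderline Pass, 2 = Clear Pass, 3 = Good, 4 = Excellent.
scores = np.array([8, 11, 13, 14, 16, 17, 19, 20, 22, 23], dtype=float)
grs    = np.array([0, 1, 1, 2, 2, 3, 3, 3, 4, 4])

def brm_pass_mark(scores, grs, gr_locations, borderline_gr=1):
    """Regress checklist scores on GR locations (0-1 scale) and return the
    predicted score at the 'Borderline Pass' location as the pass mark."""
    x = gr_locations[grs]                    # map each GR to its location on [0, 1]
    slope, intercept = np.polyfit(x, scores, 1)
    return slope * gr_locations[borderline_gr] + intercept

equal    = np.array([0.00, 0.25, 0.50, 0.75, 1.00])  # equal-interval assumption
observed = np.array([0.00, 0.33, 0.55, 0.77, 1.00])  # mean examiner placements

print("Pass mark, equal intervals:  %.2f" % brm_pass_mark(scores, grs, equal))
print("Pass mark, observed spacing: %.2f" % brm_pass_mark(scores, grs, observed))

Because the observed ‘Borderline Pass’ location (0.33) sits further from ‘Fail’ than the assumed 0.25, the pass mark read off the regression line is higher, which is the direction of the small pass-mark increases reported in the Results.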

Keywords