BMC Medical Education (Jun 2021)
Variation in performance on common content items at UK medical schools
Abstract
Abstract Background Due to differing assessment systems across UK medical schools, making meaningful cross-school comparisons on undergraduate students’ performance in knowledge tests is difficult. Ahead of the introduction of a national licensing assessment in the UK, we evaluate schools’ performances on a shared pool of “common content” knowledge test items to compare candidates at different schools and evaluate whether they would pass under different standard setting regimes. Such information can then help develop a cross-school consensus on standard setting shared content. Methods We undertook a cross-sectional study in the academic sessions 2016-17 and 2017-18. Sixty “best of five” multiple choice ‘common content’ items were delivered each year, with five used in both years. In 2016-17 30 (of 31 eligible) medical schools undertook a mean of 52.6 items with 7,177 participants. In 2017-18 the same 30 medical schools undertook a mean of 52.8 items with 7,165 participants, creating a full sample of 14,342 medical students sitting common content prior to graduation. Using mean scores, we compared performance across items and carried out a “like-for-like” comparison of schools who used the same set of items then modelled the impact of different passing standards on these schools. Results Schools varied substantially on candidate total score. Schools differed in their performance with large (Cohen’s d around 1) effects. A passing standard that would see 5 % of candidates at high scoring schools fail left low-scoring schools with fail rates of up to 40 %, whereas a passing standard that would see 5 % of candidates at low scoring schools fail would see virtually no candidates from high scoring schools fail. Conclusions Candidates at different schools exhibited significant differences in scores in two separate sittings. Performance varied by enough that standards that produce realistic fail rates in one medical school may produce substantially different pass rates in other medical schools – despite identical content and the candidates being governed by the same regulator. Regardless of which hypothetical standards are “correct” as judged by experts, large institutional differences in pass rates must be explored and understood by medical educators before shared standards are applied. The study results can assist cross-school groups in developing a consensus on standard setting future licensing assessment.