Calibrating the Medical Council of Canada’s Qualifying Examination Part I using an integrated item response theory framework: a comparison of models and designs

Andre F. De Champlain; Andre-Philippe Boulais; Andrew Dallas

doi:10.3352/jeehp.2016.13.6

Journal of Educational Evaluation for Health Professions (Jan 2016)

Affiliations

Andre F. De Champlain: Research & Development, Medical Council of Canada, Ottawa, Ontario, Canada
Andre-Philippe Boulais: Research & Development, Medical Council of Canada, Ottawa, Ontario, Canada
Andrew Dallas: Educational Research Methodology Department, School of Education, University of North Carolina at Greensboro, Greensboro, North Carolina, USA

DOI: https://doi.org/10.3352/jeehp.2016.13.6
Journal volume & issue: Vol. 13

Abstract

Purpose: The aim of this research was to compare different methods of calibrating the multiple-choice question (MCQ) and clinical decision-making (CDM) components of the Medical Council of Canada’s Qualifying Examination Part I (MCCQEI) based on item response theory (IRT).

Methods: Our data consisted of test results from 8,213 first-time applicants to the MCCQEI in the spring and fall 2010 and 2011 test administrations. The data set contained several thousand multiple-choice items and several hundred CDM cases. Four dichotomous calibrations were run using BILOG-MG 3.0. All 3 mixed item format (dichotomous MCQ responses and polytomous CDM case scores) calibrations were conducted using PARSCALE 4.

Results: The 2-PL model had identical numbers of items with chi-square values at or below a Type I error rate of 0.01 (83/3,499 or 0.02). In all 3 polytomous models, whether the MCQs were anchored or run concurrently with the CDM cases, results suggest very poor fit. All IRT abilities estimated from dichotomous calibration designs correlated very highly with each other. IRT-based pass-fail rates were extremely similar, not only across calibration designs and methods, but also with regard to the actual reported decision to candidates. The largest difference noted in pass rates was 4.78%, which occurred between the mixed format concurrent 2-PL graded response model (pass rate = 80.43%) and the dichotomous anchored 1-PL calibrations (pass rate = 85.21%).

Conclusion: Simpler calibration designs with dichotomized items should be implemented. The dichotomous calibrations provided better fit of the item response matrix than more complex, polytomous calibrations.
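
For readers unfamiliar with the models named in the abstract, the following is a minimal Python sketch of the standard 1-PL and 2-PL item response functions, together with a check of the two figures quoted above. It is illustrative only and is not the authors' BILOG-MG or PARSCALE setup. The function names and the parameter values in the usage lines are hypothetical, and the sketch assumes the plain logistic form with no scaling constant.

```python
import math

def irf_2pl(theta, a, b):
    """2-PL item response function: probability of a correct response
    for ability theta, discrimination a, and difficulty b.
    Assumes the plain logistic metric (no scaling constant)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def irf_1pl(theta, b):
    """1-PL model: the 2-PL with discrimination fixed (here at 1.0)."""
    return irf_2pl(theta, 1.0, b)

# Hypothetical item: a candidate of average ability (theta = 0)
# answering an item of average difficulty (b = 0).
p_average = irf_1pl(0.0, 0.0)

# Figures quoted in the abstract.
misfit_proportion = 83 / 3499      # items flagged by the chi-square fit statistic
pass_rate_gap = 85.21 - 80.43      # difference in pass rates, in percentage points

print(round(p_average, 2), round(misfit_proportion, 2), round(pass_rate_gap, 2))
```

When ability equals item difficulty, both functions return 0.5, which is the defining property of the difficulty parameter in these models.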

Published in Journal of Educational Evaluation for Health Professions

ISSN: 1975-5937 (Online)
Publisher: Korea Health Personnel Licensing Examination Institute
Country of publisher: Korea, Republic of
LCC subjects: Education: Special aspects of education; Medicine
Website: http://jeehp.org/
