BMC Medical Education (Apr 2023)

A comparison of 3- and 4-option multiple-choice items for medical subspecialty in-training examinations

  • Dandan Chen,
  • Ann E. Harman,
  • Huaping Sun,
  • Tianpeng Ye,
  • Robert R. Gaiser

DOI
https://doi.org/10.1186/s12909-023-04277-2
Journal volume & issue
Vol. 23, no. 1
pp. 1–7

Abstract

Background: The American Board of Anesthesiology piloted 3-option multiple-choice items (MCIs) for its 2020 administration of the 150-item subspecialty in-training examinations for Critical Care Medicine (ITE-CCM) and Pediatric Anesthesiology (ITE-PA). The 3-option MCIs were created from their 4-option counterparts, which were administered in 2019, by removing the least effective distractor. The purpose of this study was to compare physician performance, response time, and item and exam characteristics between the 4-option and 3-option exams.

Methods: An independent-samples t-test was used to examine differences in physician percent-correct scores; a paired t-test was used to examine differences in response time and item characteristics. The Kuder-Richardson Formula 20 was used to calculate the reliability of each exam form. Both the traditional method (a distractor selected by fewer than 5% of examinees and/or showing a positive correlation with total score) and the sliding scale method (adjusting the frequency threshold for distractor selection by item difficulty) were used to identify non-functioning distractors (NFDs).

Results: Physicians who took the 3-option ITE-CCM (mean = 67.7%) scored 2.1 percentage points higher than those who took the 4-option ITE-CCM (65.7%). Accordingly, 3-option ITE-CCM items were significantly easier than their 4-option counterparts. No such difference was found between the 4-option and 3-option ITE-PAs (71.8% versus 71.7%). Item discrimination (an average of 0.13 for the 4-option and 0.12 for the 3-option ITE-CCM; 0.08 for the 4-option and 0.09 for the 3-option ITE-PA) and exam reliability (0.75 and 0.74 for the 4- and 3-option ITE-CCMs, respectively; 0.62 and 0.67 for the 4- and 3-option ITE-PAs, respectively) were similar between the two formats for both ITEs. On average, physicians spent 3.4 seconds less per item on 3-option than 4-option items for the ITE-CCM (55.5 versus 58.9 seconds) and 1.3 seconds less for the ITE-PA (46.2 versus 47.5 seconds). Using the traditional method, the percentage of NFDs dropped from 51.3% in the 4-option ITE-CCM to 37.0% in the 3-option ITE-CCM, and from 62.7% to 46.0% for the ITE-PA; using the sliding scale method, the percentage of NFDs dropped from 36.0% to 21.7% for the ITE-CCM and from 44.9% to 27.7% for the ITE-PA.

Conclusions: Three-option MCIs function as robustly as their 4-option counterparts. The efficiency gained by spending less time on each item creates opportunities to increase content coverage within a fixed testing period. The results should be interpreted in the context of exam content and the distribution of examinee abilities.
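The exam reliabilities reported above were computed with the Kuder-Richardson Formula 20, which for dichotomously scored (0/1) items is KR-20 = (k/(k-1)) * (1 - Σ p_i q_i / σ²_total), where k is the number of items, p_i the proportion of examinees answering item i correctly, q_i = 1 - p_i, and σ²_total the variance of total scores. A minimal sketch of that computation (the function name and data layout are illustrative, not from the paper):

```python
from statistics import pvariance

def kr20(responses):
    """Kuder-Richardson Formula 20 reliability for dichotomous (0/1) items.

    responses: list of examinee response vectors, one 0/1 entry per item.
    """
    k = len(responses[0])                  # number of items
    n = len(responses)                     # number of examinees
    totals = [sum(r) for r in responses]   # total score per examinee
    var_total = pvariance(totals)          # population variance of totals
    # Sum of p_i * q_i across items, where p_i is the proportion correct.
    pq_sum = 0.0
    for i in range(k):
        p = sum(r[i] for r in responses) / n
        pq_sum += p * (1 - p)
    return (k / (k - 1)) * (1 - pq_sum / var_total)
```

For example, four examinees on a three-item form with response vectors [1,1,1], [1,1,0], [0,0,0], and [1,0,0] yield a KR-20 of 0.75, in the same range as the ITE-CCM reliabilities reported in the Results.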
