Practical Assessment, Research & Evaluation (2014-11-01)

Editorial Changes and Item Performance: Implications for Calibration and Pretesting

  • Heather Stoffel,
  • Mark R. Raymond,
  • S. Deniz Bucak,
  • Steven A. Haist

Journal volume & issue
Vol. 19, no. 14
pp. 1 – 11


Previous research on the impact of text and formatting changes on test-item performance has produced mixed results. This matter is important because it is generally acknowledged that any change to an item requires that it be recalibrated. The present study investigated the effects of seven classes of stylistic changes on item difficulty, discrimination, and response time for a subset of 65 items that make up a standardized test for physician licensure completed by 31,918 examinees in 2012. One of two versions of each item (original or revised) was randomly assigned to examinees such that each examinee saw only two experimental items, with each item being administered to approximately 480 examinees. The stylistic changes had little or no effect on item difficulty or discrimination; however, one class of edits, changing an item from an open lead-in (incomplete statement) to a closed lead-in (direct question), did result in slightly longer response times. Data for nonnative speakers of English were analyzed separately with nearly identical results. These findings have implications for the conventional practice of repretesting (or recalibrating) items that have been subjected to minor editorial changes.