Validating Parallel-Forms Tests for Assessing Anesthesia Resident Knowledge

Allison J. Lee; Stephanie R. Goodman; Melissa E. B. Bauer; Rebecca D. Minehart; Shawn Banks; Yi Chen; Ruth L. Landau; Madhabi Chatterji

doi:10.1177/23821205241229778

Journal of Medical Education and Curricular Development (Feb 2024)

Affiliations

Allison J. Lee: Department of Anesthesiology, New York, NY, USA
Stephanie R. Goodman: Department of Anesthesiology, New York, NY, USA
Melissa E. B. Bauer: Department of Anesthesiology, Durham, NC, USA
Rebecca D. Minehart: Department of Anesthesia, Critical Care and Pain Medicine, Boston, MA, USA
Shawn Banks: Department of Anesthesiology, Perioperative Medicine and Pain Management, University of Miami, Miami, FL, USA
Yi Chen: Teachers College, New York, NY, USA
Ruth L. Landau: Department of Anesthesiology, New York, NY, USA
Madhabi Chatterji: Teachers College, New York, NY, USA

Journal volume & issue: Vol. 11

Abstract

We created a serious game to teach first-year anesthesiology (CA-1) residents to perform general anesthesia for cesarean delivery. We aimed to investigate resident knowledge gains after playing the game and receiving one of 2 modalities of debriefing. We report on the development and validation of scores from parallel test forms for criterion-referenced interpretations of resident knowledge. The test forms were intended for use as pre- and posttests for the experiment. Validating the instruments that measure the study's primary outcome was considered essential for adding rigor to the planned experiment, so that its results could be trusted.

Development of the parallel, multiple-choice test forms involved 4 steps: (1) assessment purpose and population specification; (2) content domain specification and writing/selection of items; (3) content validation by experts of paired items by topic and cognitive level; and (4) empirical validation of scores from the parallel test forms using Classical Test Theory (CTT) techniques. Field testing involved online administration of 52 shuffled items from both test forms to 24 CA-1 residents, 21 second-year anesthesiology (CA-2) residents, 2 fellows, 1 attending anesthesiologist, and 1 participant of unknown rank at 3 US institutions.

Items from each form yielded near-normal score distributions, with similar medians, ranges, and standard deviations. Evaluations of CTT item difficulty (item p values) and discrimination (D) indices indicated that most items met the assumptions of criterion-referenced test design, separating experienced from novice residents. Experienced residents performed better on overall domain scores than novices (P < .05). Kuder-Richardson Formula 20 (KR-20) reliability estimates for both test forms were above the acceptability cutoff of .70, and the parallel forms reliability estimate was high at .86, indicating that results were consistent with theoretical expectations.

Total scores of the parallel test forms demonstrated item-level validity, strong internal consistency, and parallel forms reliability, suggesting sufficient robustness for knowledge outcomes assessments of CA-1 residents.
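
The CTT statistics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function names, the toy response matrix, and the experience flags are all hypothetical. It also makes two assumptions the abstract does not confirm: that D is computed as the difference in item p values between experienced and novice examinees, and that KR-20 uses the population variance of total scores. Parallel forms reliability is taken here as the Pearson correlation between total scores on the two forms.

```python
from statistics import pvariance


def item_difficulty(responses):
    """CTT item p value: proportion of examinees answering each item correctly.

    `responses` is a list of rows, one per examinee, of 0/1 item scores.
    """
    n = len(responses)
    n_items = len(responses[0])
    return [sum(row[j] for row in responses) / n for j in range(n_items)]


def discrimination_index(responses, is_experienced):
    """D per item, assumed here to be p(experienced) - p(novice)."""
    experienced = [r for r, e in zip(responses, is_experienced) if e]
    novice = [r for r, e in zip(responses, is_experienced) if not e]
    p_exp = item_difficulty(experienced)
    p_nov = item_difficulty(novice)
    return [pe - pn for pe, pn in zip(p_exp, p_nov)]


def kr20(responses):
    """Kuder-Richardson Formula 20 for dichotomously scored items.

    Uses the population variance of total scores (an assumed convention).
    """
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    var_total = pvariance(totals)
    if var_total == 0:
        raise ValueError("total scores have zero variance")
    sum_pq = sum(p * (1 - p) for p in item_difficulty(responses))
    return (k / (k - 1)) * (1 - sum_pq / var_total)


def pearson_r(x, y):
    """Pearson correlation, e.g. between total scores on Form A and Form B."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical toy data: 5 examinees x 4 items (1 = correct, 0 = incorrect).
toy_responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
toy_experienced = [True, True, False, False, False]

p_values = item_difficulty(toy_responses)
d_values = discrimination_index(toy_responses, toy_experienced)
reliability = kr20(toy_responses)
```

In a criterion-referenced design like the one described, items with a clearly positive D are the ones that separate experienced from novice examinees, and a KR-20 value would be compared against the .70 cutoff cited in the abstract.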

Published in Journal of Medical Education and Curricular Development

ISSN: 2382-1205 (Online)
Publisher: SAGE Publishing
Country of publisher: United Kingdom
LCC subjects: Education: Special aspects of education; Medicine: Medicine (General)
Website: https://journals.sagepub.com/home/mde