Physical Review Physics Education Research (Oct 2020)

Detecting the influence of item chaining on student responses to the Force Concept Inventory and the Force and Motion Conceptual Evaluation

  • Philip Eaton,
  • Barrett Frank,
  • Shannon Willoughby

DOI
https://doi.org/10.1103/PhysRevPhysEducRes.16.020122
Journal volume & issue
Vol. 16, no. 2
p. 020122

Abstract


Items that are chained, or blocked, together appear on many of the conceptual assessments used in physics education research. However, chaining items together can introduce local dependence between those items, which would violate the assumption of item independence required by classical test theory, unidimensional item response theory, and other measurement theories. Local dependence can be divided into two categories: (i) underlying local dependence, which can be adequately modeled with multidimensional measurement theories, and (ii) surface local dependence (SLD), which cannot be modeled using multidimensional measurement theories. The act of chaining items is thought to be one of many potential sources of SLD between items. Drawing on previous research on local dependence, this study proposes two methods for detecting the presence of local dependence and SLD between items on an assessment. These methods were applied to the Force Concept Inventory (FCI) and the Force and Motion Conceptual Evaluation (FMCE). The assumption of item independence was found to be violated for both assessments, implying that unidimensional measurement theories may not adequately model either the FCI or the FMCE. Further, both detection methods indicated that a minimal amount of SLD may be present in the FCI and a significant amount in the FMCE. This implies that even multidimensional measurement theories may not be capable of adequately modeling the FMCE when its items are scored individually. This result supports the claim made by Thornton et al. that the items on the FMCE should be scored in groups; however, the currently proposed grading scheme was found to be inadequate.
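To make the idea of local dependence concrete, the following is a minimal illustrative sketch, not either of the two detection methods proposed in the study (the abstract does not specify them). It assumes simulated dichotomous responses in which one hypothetical item pair is "chained" by construction, and it uses a simple stand-in diagnostic: the correlation between two items within groups of students who have the same score on the remaining items. All function names, parameters, and the simulated data are assumptions made for illustration only.

```python
import math
import random
from collections import defaultdict


def phi(xs, ys):
    """Pearson correlation of two 0/1 lists; None if either has no variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    vx, vy = mx * (1 - mx), my * (1 - my)
    if vx == 0 or vy == 0:
        return None
    cov = sum(x * y for x, y in zip(xs, ys)) / n - mx * my
    return cov / math.sqrt(vx * vy)


def conditional_correlation(responses, i, j):
    """Size-weighted mean of within-stratum phi for items i and j.

    Students are stratified by their score on all items other than i and j,
    a rough proxy for conditioning on ability.
    """
    strata = defaultdict(list)
    for row in responses:
        rest = sum(row) - row[i] - row[j]
        strata[rest].append((row[i], row[j]))
    total, weight = 0.0, 0
    for pairs in strata.values():
        r = phi([p[0] for p in pairs], [p[1] for p in pairs])
        if r is not None:  # skip strata where an item has no variance
            total += r * len(pairs)
            weight += len(pairs)
    return total / weight if weight else float("nan")


def simulate(n_students=2000, n_items=10, seed=0):
    """Hypothetical data: one latent ability drives every item, and
    item 1 is artificially chained to item 0."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_students):
        ability = rng.gauss(0, 1)
        p = 1 / (1 + math.exp(-ability))  # same difficulty for every item
        row = [int(rng.random() < p) for _ in range(n_items)]
        if rng.random() < 0.9:  # assumption: 90% of the time item 1 copies item 0
            row[1] = row[0]
        data.append(row)
    return data


data = simulate()
chained = conditional_correlation(data, 0, 1)      # chained pair
independent = conditional_correlation(data, 2, 3)  # unchained pair
print(f"chained pair: {chained:.2f}, independent pair: {independent:.2f}")
```

Under these assumptions, the chained pair should retain a large correlation after conditioning on the rest score, while the unchained pair should sit near zero. A residual of this kind is what the item independence assumption rules out. Distinguishing underlying local dependence from SLD requires more than this simple check.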