Physical Review Physics Education Research (Jul 2019)
Missing data and bias in physics education research: A case for using multiple imputation
Abstract
[This paper is part of the Focused Collection on Quantitative Methods in PER: A Critical Examination.] Researchers in physics education research (PER) commonly use complete-case analysis to address missing data. In complete-case analysis, researchers discard all data from any student who is missing any data. Despite its frequent use, no PER article we reviewed that used complete-case analysis provided evidence that the data met the assumption of being missing completely at random (MCAR), which is necessary to ensure accurate results. Not meeting this assumption raises the possibility that prior studies have reported biased results with inflated gains that may obscure differences across courses. To test this possibility, we compared the accuracy of complete-case analysis and multiple imputation (MI) using simulated data. We simulated the data based on prior studies such that students who earned higher grades participated at higher rates, which made the data missing at random (MAR). PER studies seldom use MI, but MI uses all available data, has less stringent assumptions, and is more accurate and more statistically powerful than complete-case analysis. Results indicated that complete-case analysis introduced more bias than MI and that this bias was large enough to obscure differences between student populations or between courses. We recommend that the PER community adopt MI for handling missing data to improve the accuracy of research studies.
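The comparison described above can be illustrated with a minimal sketch. It is not the simulation used in the paper. The grade scale (0 to 4), the score model, the participation probabilities, and all function names are assumptions chosen for illustration. The imputation step is also simplified: it draws imputed scores from a single fitted regression of score on grade, whereas a full MI procedure would additionally draw the regression parameters for each imputed data set and pool variances with Rubin's rules.

```python
import random
import statistics


def simulate(n=5000, seed=0):
    """Simulate (grade, score, observed) per student; all values are illustrative."""
    rng = random.Random(seed)
    students = []
    for _ in range(n):
        grade = rng.choice([0, 1, 2, 3, 4])            # assumed course grade scale
        score = 40 + 10 * grade + rng.gauss(0, 10)     # assumed score model
        p_participate = 0.2 + 0.15 * grade             # higher grade, higher participation
        observed = rng.random() < p_participate        # missingness depends only on grade
        students.append((grade, score, observed))
    return students


def complete_case_mean(students):
    """Mean score using only students whose score was observed."""
    return statistics.fmean([s for _, s, obs in students if obs])


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x, plus residual SD."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    return intercept, slope, statistics.pstdev(residuals)


def imputed_mean(students, m=20, seed=1):
    """Average the mean score over m stochastically imputed data sets."""
    rng = random.Random(seed)
    xs = [g for g, _, obs in students if obs]
    ys = [s for _, s, obs in students if obs]
    intercept, slope, sd = fit_line(xs, ys)
    estimates = []
    for _ in range(m):
        # Keep observed scores; replace missing ones with prediction plus noise.
        completed = [
            s if obs else intercept + slope * g + rng.gauss(0, sd)
            for g, s, obs in students
        ]
        estimates.append(statistics.fmean(completed))
    return statistics.fmean(estimates)


if __name__ == "__main__":
    data = simulate()
    true_mean = statistics.fmean([s for _, s, _ in data])
    print("complete-case bias:", complete_case_mean(data) - true_mean)
    print("imputation bias:   ", imputed_mean(data) - true_mean)
```

Because participation in this sketch depends only on grade, which is known for every student, the imputation model can use grade to recover the scores of nonparticipants, while the complete-case mean overweights high-grade students and overestimates the class average.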