AERA Open (Dec 2024)

Addressing Threats to Validity in Supervised Machine Learning: A Framework and Best Practices for Education Researchers

  • Kylie Anglin

DOI
https://doi.org/10.1177/23328584241303495
Journal volume & issue
Vol. 10

Abstract

Given the rapid adoption of machine learning methods by education researchers, and the growing acknowledgment of their inherent risks, there is an urgent need for tailored methodological guidance on how to improve and evaluate the validity of inferences drawn from these methods. Drawing on an integrative literature review and extending a well-known framework for theorizing validity in the social sciences, this article provides both an overview of threats to validity in supervised machine learning and plausible approaches for addressing such threats. It collates a list of current best practices, brings supervised learning challenges into a unified conceptual framework, and offers a straightforward reference guide on crucial validity considerations. Finally, it proposes a novel research protocol for researchers to use during project planning and for reviewers and scholars to use when evaluating the validity of supervised machine learning applications.