Language Testing in Asia (Aug 2019)
Critical review of validation models and practices in language testing: their limitations and future directions for validation research
Abstract
Purpose and background
The purpose of this paper is to critically review traditional and contemporary validation frameworks (content, criterion, and construct validation; the evidence-gathering approach; the socio-cognitive model; test usefulness; and the argument-based approach), together with empirical studies that used an argument-based approach to validation in high-stakes contexts, in order to discuss the applicability of the argument-based approach. Chapelle and Voss (2014) reported that, despite the usefulness and advantages of an argument-based approach for test validation, only five validation studies using this approach were found in a search of two major journals, Language Testing and Language Assessment Quarterly. We reviewed the validation approaches in language testing and extended the search for empirical studies that used an argument-based approach in five language testing journals including ProQuest Dissertation and Theses. In doing so, this paper aims to inform validation researchers of each approach's conceptual limitations and of future directions for validation research. For validity arguments to be defensible, this paper suggests that various kinds of validity evidence are required, involving multiple test stakeholders.
Implications
By comparing variations of an argument-based approach and reviewing eight representative studies out of 33 empirical validation studies using an argument-based approach, this paper presents the following implications for future researchers to consider: (a) defining test constructs and relevant test tasks through domain analysis; (b) inviting multiple test stakeholders to take part in test validation; (c) investigating the intended and actual interpretations, decisions, and consequences; (d) considering social, cultural, and political values to be embedded; and (e) employing multiple methods beyond statistical analyses using test scores.
Keywords