Developing valid assessments in the era of generative artificial intelligence

Leonora Kaldaras; Hope O. Akaeze; Mark D. Reckase

doi:10.3389/feduc.2024.1399377

Frontiers in Education (Aug 2024)

Affiliations

Leonora Kaldaras: Department of Physics, University of Colorado Boulder, Boulder, CO, United States
Leonora Kaldaras: Graduate School of Education, Stanford University, Stanford, CA, United States
Hope O. Akaeze: Michigan State University, East Lansing, MI, United States
Hope O. Akaeze: Community Evaluation Programs, Office of Public Engagement and Scholarship, University Outreach and Engagement, Michigan State University, East Lansing, MI, United States
Mark D. Reckase: Michigan State University, East Lansing, MI, United States

Journal volume & issue: Vol. 9

Abstract

Generative Artificial Intelligence (GAI) holds tremendous potential to transform the field of education because GAI models can consider context and therefore can be trained to deliver quick and meaningful evaluation of student learning outcomes. However, current versions of GAI tools have considerable limitations, such as social biases often inherent in the data sets used to train the models. Moreover, the GAI revolution comes during a period of moving away from memorization-based education systems toward supporting learners in developing the ability to apply knowledge and skills to solve real-world problems and explain real-world phenomena. A challenge in using GAI tools for scoring assessments aimed at fostering knowledge application is ensuring that these algorithms are scoring the same construct attributes (e.g., knowledge and skills) as a trained human scorer would score when evaluating student performance. Similarly, if using GAI tools to develop assessments, one needs to ensure that the goals of GAI-generated assessments are aligned with the vision and performance expectations of the learning environments for which these assessments are developed. Currently, no guidelines have been identified for assessing the validity of AI-based assessments and assessment results. This paper represents a conceptual analysis of issues related to developing and validating GAI-based assessments and assessment results to guide the learning process. Our primary focus is to investigate how to meaningfully leverage capabilities of GAI for developing assessments. We propose ways to evaluate the validity evidence of GAI-produced assessments and assessment scores based on existing validation approaches. We discuss future research avenues aimed at establishing guidelines and methodologies for assessing the validity of AI-based assessments and assessment results. 
We ground our discussion in the theory of validity outlined in the Standards for Educational and Psychological Testing by the American Educational Research Association and discuss how we envision building on the standards for establishing the validity of inferences made from the test scores in the context of GAI-based assessments.

Published in Frontiers in Education

ISSN: 2504-284X (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Education: Education (General)
Website: http://journal.frontiersin.org/journal/education
