Subjective data, objective data and the role of bias in predictive modelling: Lessons from a dispositional learning analytics application.

Dirk Tempelaar; Bart Rienties; Quan Nguyen

doi:10.1371/journal.pone.0233977

PLoS ONE (Jan 2020)

Journal volume & issue: Vol. 15, no. 6
p. e0233977

Abstract

For decades, self-report measures based on questionnaires have been widely used in educational research to study implicit and complex constructs such as motivation, emotion, and cognitive and metacognitive learning strategies. However, the existence of potential biases in such self-report instruments might cast doubt on the validity of the measured constructs. The emergence of trace data from digital learning environments has sparked a controversial debate on how we measure learning. On the one hand, trace data might be perceived as "objective" measures that are independent of any biases. On the other hand, the evidence is mixed on how compatible trace data are with existing learning constructs, which have traditionally been measured with self-reports. This study investigates the strengths and weaknesses of different types of data when designing predictive models of academic performance based on computer-generated trace data and survey data. We investigate two types of bias in self-report surveys: response styles (i.e., a tendency to use the rating scale in a certain systematic way that is unrelated to the content of the items) and overconfidence (i.e., the differences in predicted performance based on survey responses and a prior knowledge test). We found that the response style bias accounts for a modest to substantial amount of variation in the outcomes of the several self-report instruments, as well as in the course performance data. It is only the trace data, notably that of process type, that stand out in being independent of these response style patterns. The effect of overconfidence bias is limited. Given that empirical models in education typically aim to explain the outcomes of learning processes or the relationships between antecedents of these learning outcomes, our analyses suggest that the bias present in surveys adds predictive power in the explanation of performance data and other questionnaire data.
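To make the notion of a response style concrete, the following is a minimal sketch of how count-based response-style indices might be computed for a single respondent. It is not taken from the study: the function name `response_style_indices`, the 1 to 7 Likert scale, and the four index definitions (share of answers above the midpoint, below the midpoint, at the endpoints, and at the midpoint) are illustrative assumptions.

```python
from typing import Dict, Sequence


def response_style_indices(
    responses: Sequence[int],
    scale_min: int = 1,  # assumed lower end of the Likert scale
    scale_max: int = 7,  # assumed upper end of the Likert scale
) -> Dict[str, float]:
    """Return simple response-style proportions for one respondent.

    Hypothetical helper: each index is the share of items answered in a
    given region of the rating scale, regardless of item content.
    """
    if not responses:
        raise ValueError("responses must not be empty")

    midpoint = (scale_min + scale_max) / 2
    n = len(responses)
    return {
        # share of answers on the agreeing side of the scale
        "acquiescence": sum(r > midpoint for r in responses) / n,
        # share of answers on the disagreeing side of the scale
        "disacquiescence": sum(r < midpoint for r in responses) / n,
        # share of answers at either endpoint of the scale
        "extreme": sum(r in (scale_min, scale_max) for r in responses) / n,
        # share of answers exactly at the scale midpoint
        "midpoint": sum(r == midpoint for r in responses) / n,
    }


# Made-up answers of one respondent to eight unrelated items
example = response_style_indices([7, 7, 6, 4, 1, 7, 5, 4])
print(example)
```

Because the indices are computed across items that differ in content, a respondent who scores high on, say, the extreme index does so regardless of what the items ask, which is the sense in which a response style is unrelated to item content.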

Published in PLoS ONE

ISSN: 1932-6203 (Online)
Publisher: Public Library of Science (PLoS)
Country of publisher: United States
LCC subjects: Medicine; Science
Website: https://journals.plos.org/plosone/
