Large-scale Assessments in Education (Oct 2021)

From byproduct to design factor: on validating the interpretation of process indicators based on log data

  • Frank Goldhammer,
  • Carolin Hahnel,
  • Ulf Kroehne,
  • Fabian Zehner

DOI
https://doi.org/10.1186/s40536-021-00113-5
Journal volume & issue
Vol. 9, no. 1
pp. 1 – 25

Abstract

Read online

Abstract International large-scale assessments such as PISA or PIAAC have started to provide public or scientific use files for log data; that is, events, event-related attributes and timestamps of test-takers’ interactions with the assessment system. Log data and the process indicators derived from it can be used for many purposes. However, the intended uses and interpretations of process indicators require validation, which here means a theoretical and/or empirical justification that inferences about (latent) attributes of the test-taker’s work process are valid. This article reviews and synthesizes measurement concepts from various areas, including the standard assessment paradigm, the continuous assessment approach, the evidence-centered design (ECD) framework, and test validation. Based on this synthesis, we address the questions of how to ensure the valid interpretation of process indicators by means of an evidence-centered design of the task situation, and how to empirically challenge the intended interpretation of process indicators by developing and implementing correlational and/or experimental validation strategies. For this purpose, we explicate the process of reasoning from log data to low-level features and process indicators as the outcome of evidence identification. In this process, contextualizing information from log data is essential in order to reduce interpretative ambiguities regarding the derived process indicators. Finally, we show that empirical validation strategies can be adapted from classical approaches investigating the nomothetic span and construct representation. Two worked examples illustrate possible validation strategies for the design phase of measurements and their empirical evaluation.

Keywords