Large-scale Assessments in Education (Mar 2017)

Assuming measurement invariance of background indicators in international comparative educational achievement studies: a challenge for the interpretation of achievement differences

  • Heike Wendt,
  • Daniel Kasper,
  • Matthias Trendtel

DOI
https://doi.org/10.1186/s40536-017-0043-9
Journal volume & issue
Vol. 5, no. 1
pp. 1 – 34

Abstract

Background: Large-scale cross-national studies designed to measure student achievement use a range of social, cultural, economic and other background variables to explain observed differences in that achievement. Before they are included in a prediction model, these variables are commonly scaled into latent background indices. To allow cross-national comparisons of the latent indices, measurement invariance is assumed. However, it is unclear whether this assumption influences the results of the prediction model; if it does, the reliability and validity of cross-national comparisons of predicted results are called into question.

Methods: To establish the effect size attributable to different degrees of measurement invariance, we rescaled the ‘home resource for learning index’ (HRL) for the 37 countries (n = 166,709 students) that participated in the IEA’s combined ‘Progress in International Reading Literacy Study’ (PIRLS) and ‘Trends in International Mathematics and Science Study’ (TIMSS) assessments of 2011. We used (a) two different measurement models [a one-parameter model (1PL) and a two-parameter model (2PL)] with (b) two different degrees of measurement invariance, resulting in four different models. We introduced the resulting HRL indices as predictors in a generalized linear mixed model (GLMM) with mathematics achievement as the dependent variable. We then compared three outcomes across countries and by scaling model: (1) the fit values of the measurement models, (2) the estimated discrimination parameters, and (3) the estimated regression coefficients.

Results: The least restrictive measurement model fitted the data best, and the degree of assumed measurement invariance of the HRL indices influenced the random effects of the GLMM in all but one country. For one-third of the countries, the fixed effects of the GLMM were also related to the degree of assumed measurement invariance.

Conclusion: The results support the use of country-specific measurement models for scaling the HRL index. In general, equating procedures could be used for cross-national comparisons of the latent indices when country-specific measurement models are fitted. Cross-national comparisons of the coefficients of the GLMM should take into account the measurement model applied in scaling the HRL indices. This could be achieved, for example, by adjusting the standard errors of the coefficients.
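To make the distinction between the scaling models more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a dichotomous logistic item response function, which is a simplification; the study's actual item formats and estimation procedure may differ. All names (`irf`, `response_probability`) and all parameter values are hypothetical and chosen only for illustration. Under a 1PL model the discrimination is fixed to a common value, whereas a 2PL model estimates it per item. Assuming measurement invariance corresponds to using the same item parameters in every country, while relaxing it allows country-specific parameters.

```python
import math


def irf(theta, difficulty, discrimination=1.0):
    """Logistic item response function for a dichotomous item.

    With discrimination fixed at 1.0 this is a 1PL-type model;
    letting it vary per item gives a 2PL-type model.
    """
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


# Hypothetical item parameters (illustrative values, not estimates from the study).
# Invariant case: one parameter set shared by all countries.
POOLED = {"difficulty": 0.0, "discrimination": 1.0}

# Non-invariant case: each country has its own parameters for the same item.
COUNTRY_SPECIFIC = {
    "A": {"difficulty": -0.5, "discrimination": 1.4},
    "B": {"difficulty": 0.5, "discrimination": 0.7},
}


def response_probability(theta, country, invariant):
    """Probability of endorsing the item for a student with latent level theta."""
    params = POOLED if invariant else COUNTRY_SPECIFIC[country]
    return irf(theta, **params)


if __name__ == "__main__":
    theta = 1.0  # same latent home-resource level in both countries
    for country in ("A", "B"):
        p_inv = response_probability(theta, country, invariant=True)
        p_spec = response_probability(theta, country, invariant=False)
        print(f"country {country}: invariant={p_inv:.3f}, country-specific={p_spec:.3f}")
```

In this sketch, two students with the same latent level have identical endorsement probabilities under the invariant model but different ones under the country-specific model. This illustrates why the choice of scaling model can carry through to an index that is later used as a predictor.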

Keywords