Student evaluations of teaching (mostly) do not measure teaching effectiveness

ScienceOpen Research. 2017;2016(01):1-11


Journal Title: ScienceOpen Research

ISSN: 2199-1006 (Online)

Publisher: ScienceOpen

LCC Subject Category: General Works

Country of publisher: United States

Language of fulltext: English

Full-text formats available: PDF, XML



Anne Boring (OFCE, SciencesPo, Paris, France; PSL, Université Paris-Dauphine, LEDa, UMR DIAL, Paris, France)
Kellie Ottoboni (Department of Statistics, University of California, Berkeley, CA, USA)
Philip B. Stark (Department of Statistics, University of California, Berkeley, CA, USA)


Open peer review

Time From Submission to Publication: 1 week


Abstract

Student evaluations of teaching (SET) are widely used in academic personnel decisions as a measure of teaching effectiveness. We show: SET are biased against female instructors by an amount that is large and statistically significant. The bias affects how students rate even putatively objective aspects of teaching, such as how promptly assignments are graded. The bias varies by discipline and by student gender, among other things. It is not possible to adjust for the bias, because it depends on so many factors. SET are more sensitive to students’ gender bias and grade expectations than they are to teaching effectiveness. Gender biases can be large enough to cause more effective instructors to get lower SET than less effective instructors. These findings are based on nonparametric statistical tests applied to two datasets: 23,001 SET of 379 instructors by 4,423 students in six mandatory first-year courses in a five-year natural experiment at a French university, and 43 SET for four sections of an online course in a randomized, controlled, blind experiment at a US university.
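
The abstract refers to nonparametric statistical tests without describing them here. As a rough illustration of that family of methods, the sketch below implements a generic two-sample permutation test for a difference in mean ratings. It is not the authors' analysis: the function name `permutation_test`, its parameters, and the choice of the mean difference as test statistic are assumptions made for this example, and any ratings passed to it would be the reader's own data.

```python
import random


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in mean ratings.

    Hypothetical helper for illustration only; it is not taken from the paper.
    Returns the observed mean difference and an estimated p-value.
    """
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    n_a = len(group_a)
    n_b = len(group_b)
    observed = sum(group_a) / n_a - sum(group_b) / n_b

    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the group labels are exchangeable,
        # so reshuffle the pooled ratings and split them again.
        rng.shuffle(pooled)
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        # Small tolerance guards against floating-point ties.
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1

    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value
```

A call such as `permutation_test(ratings_group_1, ratings_group_2)` compares two lists of numeric ratings. The test makes no distributional assumption about the ratings; it relies only on the labels being exchangeable under the null hypothesis, which is why this kind of test suits ordinal evaluation scores.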