Physical Review Physics Education Research (Jun 2020)

Thematic analysis of 18 years of physics education research conference proceedings using natural language processing

  • Tor Ole B. Odden,
  • Alessandro Marin,
  • Marcos D. Caballero

DOI
https://doi.org/10.1103/PhysRevPhysEducRes.16.010142
Journal volume & issue
Vol. 16, no. 1
p. 010142

Abstract

We have used an unsupervised machine learning method called latent Dirichlet allocation (LDA) to thematically analyze all papers published in the Physics Education Research Conference Proceedings between 2001 and 2018. By looking at co-occurrences of words across the data corpus, this technique has allowed us to identify ten distinct themes or “topics” that have seen varying levels of prevalence in physics education research (PER) over time and to rate the distribution of these topics within each paper. Our analysis suggests that although all identified topics have seen sustained interest over time, PER has also seen several waves of increased interest in certain topics, beginning with initial interest in qualitative, theory-building studies of student understanding, which gave way to a focus on problem solving in the late 2000s. Since 2010, the field has seen a shift toward more sociocultural views of teaching and learning, with a particular focus on communities of practice, student identities, and institutional change. Based on these results, we suggest that unsupervised text analysis techniques like LDA may hold promise for providing quantitative, independent, and replicable analyses of educational research literature.
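
To make the idea of LDA concrete, the following is a minimal sketch of topic inference from word co-occurrences, written with the Python standard library only. It is not the authors' implementation: the collapsed Gibbs sampler, the function name lda_gibbs, the hyperparameters alpha and beta, and the six-document toy corpus are all assumptions chosen for illustration. The abstract reports ten topics on the full proceedings corpus; the toy example uses two.

```python
import random

def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA to tokenized documents with a collapsed Gibbs sampler.

    Returns (theta, top_words): theta[d][k] is the estimated share of topic k
    in document d, and top_words[k] lists the highest-count words of topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    # Count tables that the sampler keeps up to date.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Start from a random topic assignment for every word token.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wi = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][wi] -= 1
                topic_total[k] -= 1
                # Resample: topics common in this document and fond of this
                # word get more weight.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wi] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wi] += 1
                topic_total[k] += 1

    # Smoothed per-document topic distributions.
    theta = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in counts]
        for counts, doc in zip(doc_topic, docs)
    ]
    top_words = [
        [vocab[i] for i in sorted(range(vocab_size), key=lambda i: -row[i])[:4]]
        for row in topic_word
    ]
    return theta, top_words

# Hypothetical toy corpus, not data from the paper.
docs = [
    "students problem solving strategies physics problem".split(),
    "problem solving expert novice strategies".split(),
    "identity community practice students belonging".split(),
    "community practice identity institutional change".split(),
    "students solving problem strategies".split(),
    "institutional change community identity".split(),
]
theta, top_words = lda_gibbs(docs, num_topics=2)
for k, words in enumerate(top_words):
    print(f"topic {k}: {words}")
for d, dist in enumerate(theta):
    print(f"doc {d}: {[round(p, 2) for p in dist]}")
```

Each row of theta is the kind of per-paper topic distribution the abstract describes; averaging such rows by publication year would give a prevalence-over-time curve for each topic. On a corpus this small the result depends on the random seed, so the printed topics are illustrative only.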