Journal of Statistics Education (Jan 2020)

Comparison of Beginning R Students’ Perceptions of Peer-Made Plots Created in Two Plotting Systems: A Randomized Experiment

  • Leslie Myint,
  • Aboozar Hadavand,
  • Leah Jager,
  • Jeffrey Leek

DOI
https://doi.org/10.1080/10691898.2019.1695554
Journal volume & issue
Vol. 28, no. 1
pp. 98–108

Abstract

We performed an empirical study of the perceived quality of scientific graphics produced by beginning R users in two plotting systems: the base graphics package (“base R”) and the ggplot2 add-on package. In our experiment, students taking a data science course on the Coursera platform were randomized to complete identical plotting exercises using either base R or ggplot2. This exercise involved creating two plots: one bivariate scatterplot and one plot of a multivariate relationship that necessitated using color or panels. Students evaluated their peers' plots on visual characteristics key to clear scientific communication, including plot clarity and sufficient labeling. We observed that graphics created with the two systems rated similarly on many characteristics. However, ggplot2 graphics were generally perceived by students to be slightly clearer overall with respect to presentation of a scientific relationship. This increase was more pronounced for the multivariate relationship. Through expert analysis of submissions, we also found that certain concrete plot features (e.g., trend lines, axis labels, legends, panels, and color) tend to be used more commonly in one system than the other. These observations may help educators emphasize the use of certain plot features targeted to correct common student mistakes. Supplementary materials for this article are available online.

Keywords