Journal of Statistics and Data Science Education (May 2024)

Obtaining and Applying Public Data for Training Students in Technical Statistical Writing: Case Studies with Data from U.S. Geological Survey and General Ecological Literature

  • Barb Bennie
  • Richard A. Erickson

DOI
https://doi.org/10.1080/26939169.2023.2195459
Journal volume & issue
Vol. 32, no. 2
pp. 217 – 226

Abstract

Effective undergraduate statistical education requires training with real-world data. Textbook datasets seldom match the complexity and messiness of real-world data, and finding suitable real-world datasets can be challenging for educators. Consulting and industrial datasets often have nondisclosure agreements. Academic datasets often require subject-area expertise beyond that of a general education or lack connections to real-world applications. Many governments, including the United States, now require the release of data from projects they directly complete or fund through grants and contracts. We show how statistical educators may find datasets and incorporate them into courses. Specifically, we use two examples from the U.S. Geological Survey (USGS) and one example from the ecology literature. We demonstrate the use of these datasets in an upper-level analysis of variance (ANOVA) class. In addition to describing how we found the datasets, we describe how to incorporate them into coursework and student assessments. We have used these datasets over multiple semesters and include student feedback from these courses. Although our examples focus on an ANOVA class, the general methods for finding data shared here could be used for statistics classes ranging from high school to graduate education. Supplementary materials for this article are available online.

Keywords