Developing Students’ Statistical Expertise through Writing in the Age of AI

Laura S. DeLuca; Alex Reinhart; Gordon Weinberg; Michael Laudenbach; Sydney Miller; David West Brown

doi:10.1080/26939169.2025.2497547

Journal of Statistics and Data Science Education (Apr 2025)

Affiliations

Laura S. DeLuca: Department of English, Carnegie Mellon University
Alex Reinhart: Department of Statistics & Data Science, Carnegie Mellon University
Gordon Weinberg: Department of Statistics & Data Science, Carnegie Mellon University
Michael Laudenbach: Department of Humanities and Social Sciences, New Jersey Institute of Technology
Sydney Miller: Department of English, Carnegie Mellon University
David West Brown: Department of English, Carnegie Mellon University

Abstract

As large language models (LLMs) such as GPT have become more accessible, concerns about their potential effects on students’ learning have grown. In data science education, the specter of students’ turning to LLMs raises multiple issues, as writing is a means not just of conveying information but of developing their statistical reasoning. In our study, we engage with questions surrounding LLMs and their pedagogical impact by: 1) quantitatively and qualitatively describing how select LLMs write report introductions and complete data analysis reports; and 2) comparing patterns in texts authored by LLMs to those authored by students and by published researchers. Our results show distinct differences between machine-generated and human-generated writing, as well as between novice and expert writing. Those differences are evident in how writers manage information, modulate confidence, signal importance, and report statistics. The findings can help inform classroom instruction, whether that instruction is aimed at dissuading the use of LLMs or at guiding their use as a productivity tool. They also have implications for students’ development as statistical thinkers and writers. What happens when students offload the work of data science to a model that doesn’t write quite like a data scientist?

Published in Journal of Statistics and Data Science Education

ISSN: 2693-9169 (Online)
Publisher: Taylor & Francis Group
Country of publisher: United States
LCC subjects: Science: Mathematics: Probabilities. Mathematical statistics; Education: Special aspects of education
Website: https://www.tandfonline.com/journals/ujse
