PLoS ONE (Jun 2024)

A real-world test of artificial intelligence infiltration of a university examinations system: A "Turing Test" case study.

  • Peter Scarfe,
  • Kelly Watcham,
  • Alasdair Clarke,
  • Etienne Roesch

DOI
https://doi.org/10.1371/journal.pone.0305354
Journal volume & issue
Vol. 19, no. 6
p. e0305354

Abstract


The recent rise in artificial intelligence systems, such as ChatGPT, poses a fundamental problem for the educational sector. In universities and schools, many forms of assessment, such as coursework, are completed without invigilation. Therefore, students could hand in work as their own that was in fact completed by AI. Since the COVID pandemic, the sector has additionally accelerated its reliance on unsupervised 'take home exams'. If students cheat using AI and this goes undetected, the integrity of the way in which students are assessed is threatened. We report a rigorous, blind study in which we injected 100% AI-written submissions into the examinations system in five undergraduate modules, across all years of study, for a BSc degree in Psychology at a reputable UK university. We found that 94% of our AI submissions were undetected. The grades awarded to our AI submissions were on average half a grade boundary higher than those achieved by real students. Across modules there was an 83.4% chance that the AI submissions on a module would outperform a random selection of the same number of real student submissions.
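The abstract does not spell out how the 83.4% figure was computed, but a figure of that kind can be estimated by repeated random resampling: draw, many times, a random selection of real-student grades of the same size as the module's set of AI submissions, and record how often the AI submissions come out ahead. The sketch below is only an illustration of that idea, not the authors' actual analysis; the comparison by mean grade, the function name `prob_ai_outperforms`, and all grade values are assumptions introduced here.

```python
import random

def prob_ai_outperforms(ai_grades, student_grades, n_draws=10_000, seed=0):
    """Estimate the chance that the AI submissions on a module outperform
    a random selection of the same number of real student submissions.

    NOTE: comparing mean grades is an assumption for illustration only;
    the paper's exact criterion is not stated in the abstract.
    """
    rng = random.Random(seed)
    n = len(ai_grades)
    ai_mean = sum(ai_grades) / n
    wins = 0
    for _ in range(n_draws):
        # Random selection of real-student grades, same size as the AI set.
        sample = rng.sample(student_grades, n)
        if ai_mean > sum(sample) / n:
            wins += 1
    return wins / n_draws

# Hypothetical grade data for a single module (illustration only).
ai_grades = [68, 71, 65, 70]
student_grades = [55, 62, 58, 64, 60, 67, 59, 63, 61, 66, 57, 65]
print(prob_ai_outperforms(ai_grades, student_grades))
```

Averaging such per-module estimates across the five modules would give an overall probability comparable in spirit to the reported 83.4%, though the paper itself should be consulted for the exact procedure.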