Comparative analysis of artificial intelligence-driven assistance in diverse educational queries: ChatGPT vs. Google Bard

Mohammad Al Mashagbeh; Latefa Dardas; Heba Alzaben; Amjad Alkhayat

doi:10.3389/feduc.2024.1429324

Frontiers in Education (Sep 2024)

Affiliations

Mohammad Al Mashagbeh: The Department of Mechatronics Engineering, The University of Jordan, Amman, Jordan
Latefa Dardas: School of Nursing, The University of Jordan, Amman, Jordan
Heba Alzaben: The Department of Mechanical Engineering, The School of Engineering Technology, Al Hussein Technical University, Amman, Jordan
Amjad Alkhayat: Department of Educational Sciences, Salt Faculty, Al-Balqa’ Applied University, Salt, Jordan

DOI: https://doi.org/10.3389/feduc.2024.1429324
Journal volume & issue: Vol. 9

Abstract

The use of artificial intelligence tools in education is growing rapidly, which makes a thorough and critical evaluation of their performance imperative. To this end, this study tests the effectiveness of ChatGPT and Google Bard in answering a range of questions within the engineering and health sectors. The question types investigated include true/false, multiple-choice (MCQ), matching, short-answer, essay, and calculation questions. The findings showed that ChatGPT 4 surpassed both ChatGPT 3.5 and Google Bard in creative problem-solving and in accuracy across the various question types. ChatGPT 4 achieved its highest accuracy, 97.5%, on true/false questions and its lowest, 82.5%, on calculation questions. Prompting both ChatGPT and Google Bard to provide short responses apparently prevented them from hallucinating unrealistic or nonsensical answers. For most of the problems that ChatGPT and Google Bard answered incorrectly, the problem-solving approach was correct; however, both AI models struggled to perform simple calculations accurately. In MCQs related to health sciences, ChatGPT appeared to have difficulty discerning the correct answer among several plausible options. All three tools handled the essay questions competently and, unlike with the other question types, avoided any blatantly incorrect responses, although some nuanced differences were noticed. ChatGPT 3.5 consistently adhered more closely to the essay prompts, providing straightforward and essential responses, while ChatGPT 4 was superior to the other two models in adaptability. ChatGPT 4 fabricated references, creating nonexistent authors and research titles in response to prompts for sources. Although the use of AI in education holds promise, even the latest and most advanced versions of ChatGPT and Google Bard were not able to answer all questions accurately. There remains a significant need for human cognitive skills and for further advancements in AI capabilities.

Published in Frontiers in Education

ISSN: 2504-284X (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Education: Education (General)
Website: http://journal.frontiersin.org/journal/education
