Digital (Jan 2024)

Effectiveness of ChatGPT in Coding: A Comparative Analysis of Popular Large Language Models

  • Carlos Eduardo Andino Coello,
  • Mohammed Nazeh Alimam,
  • Rand Kouatly

DOI
https://doi.org/10.3390/digital4010005
Journal volume & issue
Vol. 4, no. 1
pp. 114 – 125

Abstract

This study explores the effectiveness and efficiency of the popular OpenAI model ChatGPT, powered by GPT-3.5 and GPT-4, in programming tasks to understand its impact on programming and potentially software development. To measure the performance of these models, a quantitative approach was employed using the Mostly Basic Python Problems (MBPP) dataset. In addition to the direct assessment of GPT-3.5 and GPT-4, a comparative analysis involving other popular large language models in the AI landscape, notably Google’s Bard and Anthropic’s Claude, was conducted to measure and compare their proficiency in the same tasks. The results highlight the strengths of ChatGPT models in programming tasks, offering valuable insights for the AI community, specifically for developers and researchers. As the popularity of artificial intelligence increases, this study serves as an early look into the field of AI-assisted programming.
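As a rough illustration of what a quantitative, MBPP-style evaluation can look like, the sketch below runs model-generated code against a task's assert-based test cases and reports the fraction of tasks passed. It is not the authors' evaluation harness. The task, the model names (`model_a`, `model_b`), the candidate solutions, and the helpers `passes_all_tests` and `pass_rate` are all hypothetical, and the abstract does not state the exact scoring procedure the study used. The only detail taken from the dataset is that each MBPP task pairs a text prompt with a list of `assert` statements.

```python
def passes_all_tests(candidate_code: str, test_list: list[str]) -> bool:
    """Return True if the candidate code satisfies every assert statement.

    NOTE: exec() on untrusted model output is unsafe; a real harness would
    run this in a sandboxed subprocess with a timeout. This is a sketch only.
    """
    namespace: dict = {}
    try:
        exec(candidate_code, namespace)      # define the candidate function
        for test in test_list:
            exec(test, namespace)            # each test is an `assert ...` string
    except Exception:
        # Syntax errors, runtime errors, and failed asserts all count as a fail.
        return False
    return True


def pass_rate(results: list[bool]) -> float:
    """Fraction of tasks solved; 0.0 when there are no results."""
    return sum(results) / len(results) if results else 0.0


# Hypothetical task written in the MBPP style (prompt text plus assert tests).
task = {
    "text": "Write a function to find the minimum of two numbers.",
    "test_list": [
        "assert min_of_two(1, 2) == 1",
        "assert min_of_two(-5, -4) == -5",
        "assert min_of_two(0, 0) == 0",
    ],
}

# Hypothetical model outputs for the task above.
candidates = {
    "model_a": "def min_of_two(a, b):\n    return a if a < b else b\n",
    "model_b": "def min_of_two(a, b):\n    return b\n",  # deliberately wrong
}

scores = {
    name: passes_all_tests(code, task["test_list"])
    for name, code in candidates.items()
}

for name, solved in scores.items():
    print(f"{name}: {'pass' if solved else 'fail'}")
```

Repeating this check over every task and averaging with `pass_rate` gives one accuracy figure per model, which is one common way to compare models on such a benchmark.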

Keywords