Evaluating LLMs for Code Generation in HRI: A Comparative Study of ChatGPT, Gemini, and Claude

Andrei Sobo; Awes Mubarak; Almas Baimagambetov; Nikolaos Polatidis

doi:10.1080/08839514.2024.2439610

Applied Artificial Intelligence (Dec 2025)

Affiliations

Andrei Sobo: School of Architecture, Technology and Engineering, University of Brighton, Brighton, UK
Awes Mubarak: School of Architecture, Technology and Engineering, University of Brighton, Brighton, UK
Almas Baimagambetov: School of Architecture, Technology and Engineering, University of Brighton, Brighton, UK
Nikolaos Polatidis: School of Architecture, Technology and Engineering, University of Brighton, Brighton, UK

DOI: https://doi.org/10.1080/08839514.2024.2439610
Journal volume & issue: Vol. 39, no. 1

Abstract

This study investigates the effectiveness of Large Language Models (LLMs) in generating code for Human-Robot Interaction (HRI) applications. We present the first direct comparison of ChatGPT 3.5, Gemini 1.5 Pro, and Claude 3.5 Sonnet in this context. Using a series of 20 carefully designed prompts, ranging from simple movement commands to complex object manipulation scenarios, we evaluate the models’ ability to generate accurate and context-aware code. Our findings reveal significant variations in performance: Claude 3.5 Sonnet achieved a 95% success rate, Gemini 1.5 Pro 60%, and ChatGPT 3.5 20%. The study highlights the rapid advancement in LLM capabilities for specialized programming tasks while also identifying persistent challenges in spatial reasoning and adherence to specific constraints. These results suggest promising applications for LLMs in robotics development and education, while emphasizing the continued need for human oversight and specialized training in AI-assisted programming for HRI.
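
As a rough aid to reading the reported success rates, the short Python sketch below converts each percentage into the number of prompts it would imply out of 20. It assumes each model was scored once on each of the 20 prompts, which the abstract suggests but does not state explicitly; the function name `implied_successes` and the variable names are illustrative only and do not come from the paper.

```python
# Assumption: each model was scored once on each of the 20 prompts,
# so a success rate maps directly to a whole number of passed prompts.
TOTAL_PROMPTS = 20

# Success rates as reported in the abstract (percent).
reported_rates = {
    "Claude 3.5 Sonnet": 95,
    "Gemini 1.5 Pro": 60,
    "ChatGPT 3.5": 20,
}


def implied_successes(rate_percent: int, total: int = TOTAL_PROMPTS) -> int:
    """Return the number of passed prompts implied by a percentage rate."""
    passed, remainder = divmod(rate_percent * total, 100)
    if remainder:
        # The rate cannot be produced by a whole number of prompts.
        raise ValueError("rate does not correspond to a whole number of prompts")
    return passed


# Hypothetical helper output: model name -> implied number of passed prompts.
implied = {model: implied_successes(rate) for model, rate in reported_rates.items()}

for model, passed in implied.items():
    print(f"{model}: {passed}/{TOTAL_PROMPTS} prompts")
```

Under that assumption, the rates correspond to 19, 12, and 4 successful prompts respectively, so the gap between the best and worst model is 15 of the 20 prompts.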

Published in Applied Artificial Intelligence

ISSN: 0883-9514 (Print); 1087-6545 (Online)
Publisher: Taylor & Francis Group
Country of publisher: United Kingdom
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science; Science: Science (General): Cybernetics
Website: https://www.tandfonline.com/journals/uaai
