Evaluation of BERT and ChatGPT models in inference, paraphrase and similarity tasks

Kim Radmir; Kotsenko Anton; Andreev Aleksandr; Bazanova Anastasiia; Aladin Dmitry; Todua David; Marushchenko Aleksei; Varlamov Oleg

doi:10.1051/e3sconf/202451503016

E3S Web of Conferences (Jan 2024)

Affiliations

Kim Radmir: Bauman Moscow State Technical University
Kotsenko Anton: Bauman Moscow State Technical University
Andreev Aleksandr: Bauman Moscow State Technical University
Bazanova Anastasiia: Bauman Moscow State Technical University
Aladin Dmitry: Bauman Moscow State Technical University
Todua David: Bauman Moscow State Technical University
Marushchenko Aleksei: Bauman Moscow State Technical University
Varlamov Oleg: Bauman Moscow State Technical University

DOI: https://doi.org/10.1051/e3sconf/202451503016
Journal volume & issue: Vol. 515, p. 03016

Abstract

This paper studies the application of the ChatGPT and BERT models in the field of mechanical engineering. In the context of machine learning, both models can be applied to natural language processing tasks such as analyzing technical documentation, building instructions for a particular version of the documentation, diagnosing malfunctions, and customer service. The paper describes the fundamental features and origins of the BERT and ChatGPT models, examines their main architectural features, and identifies the main advantages and disadvantages of each. It selects a set of natural language processing tasks to test the models' ability to understand natural language in the context of machine learning. The selected tasks are divided into three semantic groups so that the capabilities of ChatGPT and BERT can be assessed in each area: logical inference, paraphrasing, and text similarity. The paper also discusses the concept of operational design, which involves developing inputs that guide the models to produce the desired outputs. It then quantitatively compares the performance of BERT-based and ChatGPT-based models, investigates the causes of the ChatGPT model's bottlenecks in natural language understanding tasks, and considers possible improvements to its performance using the mivar approach.

Published in E3S Web of Conferences

ISSN: 2267-1242 (Online)
Publisher: EDP Sciences
Country of publisher: France
LCC subjects: Geography. Anthropology. Recreation: Environmental sciences
Website: http://www.e3s-conferences.org/
