E3S Web of Conferences (Jan 2024)

Evaluation of BERT and ChatGPT models in inference, paraphrase and similarity tasks

  • Kim Radmir,
  • Kotsenko Anton,
  • Andreev Aleksandr,
  • Bazanova Anastasiia,
  • Aladin Dmitry,
  • Todua David,
  • Marushchenko Aleksei,
  • Varlamov Oleg

DOI
https://doi.org/10.1051/e3sconf/202451503016
Journal volume & issue
Vol. 515
p. 03016

Abstract

The purpose of this paper is to study the application of the ChatGPT and BERT models in the field of mechanical engineering. In the context of machine learning, the ChatGPT and BERT models can be applied to various natural language processing tasks, such as analyzing technical documentation, building instructions according to a particular version of the documentation, diagnosing malfunctions, or providing customer service. The paper discusses the fundamental features of the BERT and ChatGPT models and their origins, examines their main architectural features, and identifies the main advantages and disadvantages of each model. It analyzes and selects natural language processing tasks to test the models’ ability to understand natural language in the context of machine learning. The selected criterion tasks are divided into three semantic groups to identify the capabilities of the ChatGPT and BERT models in each area: logical inference tasks, paraphrasing tasks, and text similarity tasks. The paper also discusses the concept of operational design, which involves developing inputs that guide the models to produce desired outputs. It quantitatively analyzes and compares the performance of BERT-based and ChatGPT-based models. The reasons for the bottlenecks of the ChatGPT model in natural language understanding tasks are identified and investigated. Possible improvements to the ChatGPT model’s performance using the mivar approach are considered.