Applied Sciences (Jul 2024)

Comparing Fine-Tuning and Prompt Engineering for Multi-Class Classification in Hospitality Review Analysis

  • Ive Botunac,
  • Marija Brkić Bakarić,
  • Maja Matetić

DOI
https://doi.org/10.3390/app14146254
Journal volume & issue
Vol. 14, no. 14
p. 6254

Abstract
This study compares the effectiveness of fine-tuning Transformer models, specifically BERT, RoBERTa, DeBERTa, and GPT-2, against using prompt engineering in LLMs like ChatGPT and GPT-4 for multi-class classification of hotel reviews. As the hospitality industry increasingly relies on online customer feedback to improve services and strategize marketing, accurately analyzing this feedback is crucial. Our research employs a multi-task learning framework to simultaneously conduct sentiment analysis and categorize reviews into aspects such as service quality, ambiance, and food. We assess the capabilities of fine-tuned Transformer models and LLMs with prompt engineering in processing and understanding the complex user-generated content prevalent in the hospitality industry. The results show that fine-tuned models, particularly RoBERTa, are more adept at classification tasks due to their deep contextual processing abilities and faster execution times. In contrast, while ChatGPT and GPT-4 excel in sentiment analysis by better capturing the nuances of human emotions, they require more computational power and longer processing times. Our findings support the hypothesis that fine-tuning models can achieve better results and faster execution than using prompt engineering in LLMs for multi-class classification in hospitality reviews. This study suggests that selecting the appropriate NLP model depends on the task’s specific needs, balancing computational efficiency and the depth of sentiment analysis required for actionable insights in hospitality management.
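The multi-task learning framework described above — one shared encoder feeding separate heads for sentiment analysis and aspect categorization — can be sketched in PyTorch. This is a minimal illustration under assumed settings (toy model size, mean pooling, three sentiment and three aspect labels), not the authors' actual implementation, which fine-tunes pretrained models such as RoBERTa.

```python
import torch
import torch.nn as nn

class MultiTaskReviewClassifier(nn.Module):
    """Shared Transformer encoder with two task heads:
    sentiment (e.g. negative/neutral/positive) and
    aspect (e.g. service quality / ambiance / food)."""
    def __init__(self, vocab_size=30522, hidden=128, n_sentiments=3, n_aspects=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        enc_layer = nn.TransformerEncoderLayer(d_model=hidden, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=2)
        self.sentiment_head = nn.Linear(hidden, n_sentiments)
        self.aspect_head = nn.Linear(hidden, n_aspects)

    def forward(self, input_ids):
        h = self.encoder(self.embed(input_ids))
        pooled = h.mean(dim=1)  # mean-pool token representations
        return self.sentiment_head(pooled), self.aspect_head(pooled)

model = MultiTaskReviewClassifier()
tokens = torch.randint(0, 30522, (2, 16))  # dummy batch: 2 reviews, 16 token ids each
sent_logits, aspect_logits = model(tokens)

# Multi-task training combines the two cross-entropy losses;
# the label tensors here are placeholders.
loss = (nn.functional.cross_entropy(sent_logits, torch.tensor([0, 2]))
        + nn.functional.cross_entropy(aspect_logits, torch.tensor([1, 0])))
```

In practice the shared encoder would be a pretrained checkpoint (BERT, RoBERTa, DeBERTa, or GPT-2) and both heads would be fine-tuned jointly on labeled hotel reviews.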

Keywords