IEEE Access (Jan 2024)

Harnessing the Power of LLMs for Service Quality Assessment From User-Generated Content

  • Taha Falatouri
  • Denisa Hrusecka
  • Thomas Fischer

DOI: https://doi.org/10.1109/ACCESS.2024.3429290
Journal volume & issue: Vol. 12, pp. 99755–99767

Abstract

Adopting Large Language Models (LLMs) creates opportunities for organizations to increase efficiency, particularly in sentiment analysis and information extraction tasks. This study explores the performance of LLMs in real-world applications, focusing on sentiment analysis and service quality (SQ) dimension extraction from user-generated content (UGC). For this purpose, we compare the performance of two LLMs (ChatGPT 3.5 and Claude 3) and three traditional NLP methods using two datasets of customer reviews (one in English and one in Persian). The results indicate that LLMs can achieve notable accuracy in information extraction (76% accuracy for ChatGPT and 68% for Claude 3) and sentiment analysis (substantial agreement with human raters for ChatGPT and moderate agreement for Claude 3), demonstrating an improvement over other AI models. However, challenges persist, including discrepancies between model predictions and human judgments and limitations in extracting specific dimensions from unstructured text. While LLMs can streamline the SQ assessment process, human supervision remains essential to ensure reliability.
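The abstract grades model-human agreement as "substantial" and "moderate", terminology conventionally tied to Cohen's kappa bands (Landis & Koch). The exact metric used in the paper is not stated here; as an illustrative sketch only, assuming kappa on categorical sentiment labels, agreement between an LLM's labels and a human rater's labels could be computed like this:

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters over the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is observed
    agreement and p_e is chance agreement from marginal frequencies.
    """
    assert len(labels_a) == len(labels_b) and labels_a
    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each rater's marginal proportions.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical sentiment labels (not from the paper's datasets).
human = ["pos", "pos", "neg", "neu", "pos", "neg"]
llm   = ["pos", "neg", "neg", "neu", "pos", "neg"]
kappa = cohens_kappa(human, llm)
print(f"kappa = {kappa:.2f}")  # ~0.74, "substantial" on the Landis & Koch scale
```

On that scale, 0.41-0.60 is typically read as moderate agreement and 0.61-0.80 as substantial, which matches how the abstract characterizes Claude 3 and ChatGPT respectively.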

Keywords