Big Data and Cognitive Computing (Feb 2024)

A Machine Learning-Based Pipeline for the Extraction of Insights from Customer Reviews

  • Róbert Lakatos,
  • Gergő Bogacsovics,
  • Balázs Harangi,
  • István Lakatos,
  • Attila Tiba,
  • János Tóth,
  • Marianna Szabó,
  • András Hajdu

DOI: https://doi.org/10.3390/bdcc8030020
Journal volume & issue: Vol. 8, no. 3, p. 20

Abstract

The efficiency of natural language processing has improved dramatically with the advent of machine learning models, particularly neural network-based solutions. However, some tasks remain challenging, especially in specific domains. This paper presents a model that extracts insights from customer reviews using machine learning methods integrated into a pipeline. For topic modeling, our composite model combines transformer-based neural networks designed for natural language processing, vector-embedding-based keyword extraction, and clustering. The components of the model have been integrated and tailored to meet the requirements of efficient information extraction, and of topic modeling over the extracted information, for opinion mining. Our approach was validated and compared with other state-of-the-art methods on publicly available benchmark datasets. The results show that our system performs better than existing topic modeling and keyword extraction methods on this task.
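
The three stages named in the abstract (embedding, embedding-based keyword extraction, clustering) can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. The transformer embeddings are replaced here by a toy stand-in (each word is represented by its per-review count vector), the clustering step is a simple greedy cosine-threshold grouping, and the reviews, stopword list, function names, and threshold are all hypothetical choices made for illustration only.

```python
import math
import re
from collections import Counter

# Hypothetical toy reviews and stopwords; not taken from the paper.
REVIEWS = [
    "battery life is great and the battery charges fast",
    "battery drains fast but the screen is great",
    "delivery was late and the delivery box was damaged",
    "late delivery and a damaged box",
]
STOPWORDS = {"is", "and", "the", "but", "was", "a"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def word_vectors(reviews):
    """Toy stand-in for a transformer embedding (assumption):
    each word maps to its vector of counts across the reviews."""
    counts = [Counter(tokenize(r)) for r in reviews]
    vocab = sorted(set().union(*counts))
    return {w: [float(c[w]) for c in counts] for w in vocab}


def extract_keywords(review, vectors, top_n=2):
    """Rank a review's words by similarity to the review's own vector,
    which is the sum of its distinct word vectors. Ties break alphabetically."""
    words = sorted(set(tokenize(review)))
    dim = len(next(iter(vectors.values())))
    doc_vec = [sum(vectors[w][i] for w in words) for i in range(dim)]
    ranked = sorted(words, key=lambda w: (-cosine(vectors[w], doc_vec), w))
    return ranked[:top_n]


def cluster_keywords(keywords, vectors, threshold=0.5):
    """Greedy grouping: a keyword joins the first cluster whose seed
    (first member) is at least `threshold` similar, else starts a new one."""
    clusters = []
    for kw in sorted(set(keywords)):
        for members in clusters:
            if cosine(vectors[kw], vectors[members[0]]) >= threshold:
                members.append(kw)
                break
        else:
            clusters.append([kw])
    return clusters


vectors = word_vectors(REVIEWS)
keywords = [kw for r in REVIEWS for kw in extract_keywords(r, vectors)]
topics = cluster_keywords(keywords, vectors)
print(topics)
```

On this toy data the keywords fall into one product-related group and one delivery-related group. In a realistic setting the count vectors would be replaced by embeddings from a pretrained language model, and the greedy grouping by a proper clustering algorithm.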

Keywords