Proceedings of the XXth Conference of Open Innovations Association FRUCT (Nov 2024)

Exploring Transformer Models and Domain Adaptation for Detecting Opinion Spam in Reviews

  • Christopher G Harris

DOI: https://doi.org/10.23919/FRUCT64283.2024.10749897
Journal volume & issue: Vol. 36, no. 1, pp. 249–255

Abstract

As online reviews play a crucial role in purchasing decisions, businesses are increasingly incentivized to generate positive reviews, sometimes resorting to fake reviews, or opinion spam. Detecting opinion spam requires well-trained models, but annotated training data in the target domain (e.g., hotels) can be difficult to obtain. Transfer learning addresses this by leveraging training data from a similar domain (e.g., restaurants). This paper examines three popular transformer models (BERT, RoBERTa, and DistilBERT) to evaluate how training data from different domains, including imbalanced datasets, affects transformer model performance. Notably, our evaluation of hotel opinion spam detection achieved an AUC of 0.927 using RoBERTa trained on YelpChi restaurant data.
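To make the cross-domain setup concrete, below is a minimal sketch (not the authors' released code) of the protocol the abstract describes: fine-tune roberta-base as a binary spam classifier on restaurant-domain reviews, then report ROC AUC on hotel-domain reviews. The CSV file names and the "text"/"label" column layout are assumptions made purely for illustration.

```python
# Hypothetical sketch of cross-domain opinion-spam detection with RoBERTa.
# Assumed inputs (not from the paper): two CSV files, each with columns
# "text" and "label" (1 = spam, 0 = genuine).
import pandas as pd
import torch
from sklearn.metrics import roc_auc_score
from torch.utils.data import DataLoader, Dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "roberta-base"

class ReviewDataset(Dataset):
    """Tokenizes a list of reviews once, up front, for simplicity."""
    def __init__(self, texts, labels, tokenizer):
        self.enc = tokenizer(list(texts), truncation=True, padding=True,
                             max_length=256, return_tensors="pt")
        self.labels = torch.tensor(list(labels))

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, i):
        return {k: v[i] for k, v in self.enc.items()}, self.labels[i]

def main():
    device = "cuda" if torch.cuda.is_available() else "cpu"
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSequenceClassification.from_pretrained(
        MODEL_NAME, num_labels=2).to(device)

    # Source domain: restaurant reviews (e.g., YelpChi); target: hotels.
    src = pd.read_csv("yelpchi_restaurants.csv")  # assumed file name
    tgt = pd.read_csv("hotel_reviews.csv")        # assumed file name

    train_loader = DataLoader(
        ReviewDataset(src["text"], src["label"], tokenizer),
        batch_size=16, shuffle=True)
    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

    # Fine-tune on the source (restaurant) domain only.
    model.train()
    for _ in range(3):  # a typical small fine-tuning budget
        for batch, labels in train_loader:
            batch = {k: v.to(device) for k, v in batch.items()}
            loss = model(**batch, labels=labels.to(device)).loss
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()

    # Evaluate out of domain: ROC AUC on the target (hotel) reviews.
    model.eval()
    eval_loader = DataLoader(
        ReviewDataset(tgt["text"], tgt["label"], tokenizer), batch_size=32)
    scores, gold = [], []
    with torch.no_grad():
        for batch, labels in eval_loader:
            batch = {k: v.to(device) for k, v in batch.items()}
            probs = model(**batch).logits.softmax(dim=-1)[:, 1]
            scores.extend(probs.cpu().tolist())
            gold.extend(labels.tolist())
    print("Hotel-domain AUC:", roc_auc_score(gold, scores))

if __name__ == "__main__":
    main()
```

The key design point is that the model never sees target-domain labels: any gap between in-domain and cross-domain AUC directly measures how well the learned spam signal transfers from restaurants to hotels.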

Keywords