Detection of Chinese Deceptive Reviews Based on Pre-Trained Language Model

Chia-Hsien Weng; Kuan-Cheng Lin; Jia-Ching Ying

doi:10.3390/app12073338

Applied Sciences (Mar 2022)

Affiliations

Chia-Hsien Weng: Department of Management Information Systems, National Chung Hsing University, Taichung 402, Taiwan
Kuan-Cheng Lin: Department of Management Information Systems, National Chung Hsing University, Taichung 402, Taiwan
Jia-Ching Ying: Department of Management Information Systems, National Chung Hsing University, Taichung 402, Taiwan

Journal volume & issue: Vol. 12, no. 7, p. 3338

Abstract

The advancement of the Internet has changed how people express and share their views with the world, and user-generated content has become a primary guide for customer purchasing decisions. Motivated by commercial interest, some sellers have started manipulating online ratings by writing false positive reviews to promote their own goods and false negative reviews to discredit competitors. Such reviews are generally referred to as deceptive reviews. Deceptive reviews mislead customers into purchasing goods that are inconsistent with the online information and thus obstruct fair competition among businesses. To protect the rights of consumers and sellers, an effective method is required to automate the detection of deceptive reviews. Previously developed methods of translating text into feature vectors usually fail to interpret polysemous words, which limits their detection performance. Using dynamic feature vectors, the present study developed several deceptive-review detection models for the Chinese language. The developed models were then compared with standard models in terms of detection efficiency. Deceptive reviews collected from various online forums in Taiwan by previous studies were used to test the models. The results showed that the proposed models achieve a precision of 0.92, a recall of 0.91, and an F1-score of 0.91. The improvement rate of the proposed approach exceeds 20%. These results indicate that the proposed approach performs better at detecting deceptive reviews, and that models based on dynamic feature vectors capture the meaning of terms more accurately than conventional models based on static feature vectors, thereby enhancing effectiveness.
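For readers unfamiliar with the reported metrics, the following is a minimal sketch of how precision, recall, and F1-score are computed for a binary deceptive/genuine classifier. The function name `precision_recall_f1` and the toy labels are hypothetical and are not taken from the study; the abstract does not say whether the reported scores are per-class or averaged, so this sketch assumes a single positive class (deceptive).

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for one positive class.

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    pairs = list(zip(y_true, y_pred))
    # True positives: deceptive reviews correctly flagged as deceptive.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: genuine reviews wrongly flagged as deceptive.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: deceptive reviews the model missed.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy labels (1 = deceptive, 0 = genuine); assumed data, not from the paper.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

F1 is the harmonic mean of precision and recall, so it stays low unless both are high, which is why it is commonly reported alongside them for detection tasks like this one.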

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
