An unsupervised method for social network spammer detection based on user information interests

Darshika Koggalahewa; Yue Xu; Ernest Foo

doi:10.1186/s40537-021-00552-5

Journal of Big Data (Jan 2022)

Affiliations

Darshika Koggalahewa: School of Computer Science, Queensland University of Technology
Yue Xu: School of Computer Science, Queensland University of Technology
Ernest Foo: School of Information and Communication Technology, Griffith University

Journal volume & issue: Vol. 9, no. 1
pp. 1 – 35

Abstract

Online Social Networks (OSNs) are a popular platform for communication and collaboration, and spammers are highly active in them. Uncovering spammers has become one of the most challenging problems in OSNs. Classification-based supervised approaches are the most commonly used methods for detecting spammers, but they suffer from the limitations of “data labelling”, “spam drift”, “imbalanced datasets” and “data fabrication”. These limitations affect the accuracy of a classifier’s detection. An unsupervised approach does not require labelled datasets, and we aim to address the limitations of data labelling and spam drift through such an approach. We present a purely unsupervised approach for spammer detection that uses the peer acceptance of a user in a social network to distinguish spammers from genuine users. The peer acceptance of one user to another is calculated from the common interests the two users share over multiple shared topics. The main contribution of this paper is the introduction of a purely unsupervised spammer detection approach based on users’ peer acceptance, which does not require labelled training datasets. While it does not better the accuracy of supervised classification-based approaches, our approach is a successful alternative to traditional classifiers for spam detection, achieving an accuracy of 96.9%.
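
The abstract does not give the formula for peer acceptance, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each user is represented as a mapping from topic to a set of interest terms, scores a pair of users by the mean Jaccard overlap of their interests across the topics they share, and flags users whose average score against all other users falls below a threshold. The names `peer_acceptance` and `flag_low_acceptance`, the Jaccard measure, the threshold and the sample data are all hypothetical choices made for illustration.

```python
from statistics import mean

# Hypothetical data model: each user maps topic -> set of interest terms.
Profile = dict[str, set[str]]


def peer_acceptance(a: Profile, b: Profile) -> float:
    """Illustrative score in [0, 1]: mean Jaccard overlap of the two users'
    interests, taken over the topics that both users have in common."""
    shared_topics = a.keys() & b.keys()
    if not shared_topics:
        return 0.0
    overlaps = []
    for topic in shared_topics:
        union = a[topic] | b[topic]
        common = a[topic] & b[topic]
        overlaps.append(len(common) / len(union) if union else 0.0)
    return mean(overlaps)


def flag_low_acceptance(users: dict[str, Profile], threshold: float) -> list[str]:
    """Return users whose mean score against every other user is below
    the (assumed) threshold. No labels are used at any point."""
    flagged = []
    for name, profile in users.items():
        scores = [
            peer_acceptance(profile, other)
            for peer, other in users.items()
            if peer != name
        ]
        if scores and mean(scores) < threshold:
            flagged.append(name)
    return sorted(flagged)


# Toy data, invented for this example only.
users = {
    "alice": {"sports": {"football", "tennis"}, "tech": {"python", "ai"}},
    "bob": {"sports": {"football", "cricket"}, "tech": {"python", "ai"}},
    "promo": {"deals": {"discount", "click"}},
}

suspects = flag_low_acceptance(users, threshold=0.2)
```

In this toy data the account `promo` shares no topics with the other two users, so its average score is zero and it is the only one flagged. A real system would need a principled way to extract topics and interests from posts and to choose the threshold, neither of which the abstract specifies.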

Published in Journal of Big Data

ISSN: 2196-1115 (Online)
Publisher: SpringerOpen
Country of publisher: United Kingdom
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering: Electronics: Computer engineering. Computer hardware; Technology: Technology (General): Industrial engineering. Management engineering: Information technology; Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://journalofbigdata.springeropen.com
