Achieving Online and Scalable Information Integrity by Harnessing Social Spam Correlations

Hailu Xu; Pinchao Liu; Boyuan Guan; Qingyang Wang; Dilma Da Silva; Liting Hu

doi:10.1109/ACCESS.2023.3236604

IEEE Access (Jan 2023)

Affiliations

Hailu Xu: Department of Computer Engineering and Computer Science, California State University, Long Beach, CA, USA
Pinchao Liu: School of Computing and Information Sciences, Florida International University, Miami, FL, USA
Boyuan Guan: School of Computing and Information Sciences, Florida International University, Miami, FL, USA
Qingyang Wang: Department of Computer Science and Engineering, Louisiana State University, Baton Rouge, LA, USA
Dilma Da Silva: Department of Computer Science and Engineering, Texas A&M University, College Station, TX, USA
Liting Hu: Department of Computer Science and Engineering, University of California at Santa Cruz, Santa Cruz, CA, USA

DOI: https://doi.org/10.1109/ACCESS.2023.3236604
Journal volume & issue: Vol. 11, pp. 7768–7781

Abstract

Malicious web links, social rumors, fraudulent advertisements, fake comments, and biased propaganda are overwhelming online social networks. Ensuring information integrity is an active topic in both academia and industry. Traditional social spam detection techniques rely on centralized processing and focus on only one specific set of data sources, thereby ignoring the social spam correlations between distributed data sources. In this paper, we propose an online and scalable misinformation detection system, named Spiral, to uncover social spam by leveraging the correlations between different social data sources in geo-distributed sites. The key insight in our approach is to amplify the effectiveness of state-of-the-art techniques for detecting inappropriate posts by enabling the efficient large-scale propagation of detection information across domains. The novelty of our design lies in three key components: (1) a decentralized, distributed-hash-table-based tree overlay deployment for harvesting and uncovering deceptive information spreading across multiple online social network communities; (2) a progressive aggregation tree for collecting the properties of these posts and creating new classifiers to actively filter out the propagation of inappropriate posts; and (3) a group communication structure that allows multiple groups to exchange the correlations among distributed social data sources. We designed and implemented a prototype of the Spiral system. Our large-scale experiments, using real-world social data, demonstrate Spiral’s scalability, effective load-balancing, and efficiency in online spam detection for social networks.
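
The first two components named in the abstract can be illustrated with a small sketch. It is not taken from the Spiral prototype; it only shows the general idea of hashing site names and a group key onto an identifier ring, rooting a tree at the site closest to the key, and merging per-site spam-feature counts from the leaves up to the root. The names ring_id, build_tree, and aggregate, the 32-bit ring, and the binary tree layout are all assumptions made for illustration. Real DHT overlays usually derive the tree from routing paths rather than from a sorted list.

```python
import hashlib
from collections import Counter

RING_BITS = 32  # assumed identifier width for this sketch


def ring_id(name: str) -> int:
    """Hash a site name or group key onto the identifier ring."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest[: RING_BITS // 8], "big")


def ring_distance(a: int, b: int) -> int:
    """Shortest distance between two identifiers on the ring."""
    size = 1 << RING_BITS
    d = abs(a - b)
    return min(d, size - d)


def build_tree(sites, group_key):
    """Return (root, parent map). The root is the site whose id is
    closest to the hashed group key; the remaining sites are attached
    in a binary layout (an assumption of this sketch, not a DHT rule)."""
    key = ring_id(group_key)
    ordered = sorted(sites, key=lambda s: (ring_distance(ring_id(s), key), s))
    parent = {ordered[0]: None}
    for i in range(1, len(ordered)):
        parent[ordered[i]] = ordered[(i - 1) // 2]
    return ordered[0], parent


def aggregate(parent, local_counts):
    """Merge per-site feature counts bottom-up and return the root's total."""
    children = {}
    for site, par in parent.items():
        if par is not None:
            children.setdefault(par, []).append(site)

    def collect(site):
        total = Counter(local_counts.get(site, {}))
        for child in children.get(site, []):
            total.update(collect(child))  # each child forwards one merged summary
        return total

    root = next(s for s, p in parent.items() if p is None)
    return collect(root)


# Hypothetical sites and spam-feature counts, for illustration only.
sites = ["site-a", "site-b", "site-c", "site-d", "site-e"]
local_counts = {
    "site-a": {"bad-link.example": 3},
    "site-b": {"bad-link.example": 1, "fake-promo": 2},
    "site-d": {"fake-promo": 5},
}
root, parent = build_tree(sites, "spam-links")
summary = aggregate(parent, local_counts)
```

In this sketch each site forwards a single merged summary to its parent, so no site has to receive raw reports from every other site. That is the usual motivation for tree-based aggregation; the paper itself should be consulted for how Spiral builds and uses its trees.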

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
