IEEE Access (Jan 2023)
Achieving Online and Scalable Information Integrity by Harnessing Social Spam Correlations
Abstract
Malicious web links, social rumors, fraudulent advertisements, fake comments, and biased propaganda exert an overwhelming influence on online social networks. Preserving information integrity is an active research topic in both academia and industry. Traditional social spam detection techniques rely on centralized processing and focus on a single set of data sources, thereby ignoring the social spam correlations between distributed data sources. In this paper, we propose an online and scalable misinformation detection system, named Spiral, that uncovers social spam by leveraging the correlations between different social data sources at geo-distributed sites. The key insight of our approach is that the effectiveness of state-of-the-art techniques for detecting inappropriate posts can be amplified by efficiently propagating detection information across domains at large scale. The novelty of our design lies in three key components: (1) a decentralized, distributed-hash-table-based tree overlay for harvesting and uncovering deceptive information spreading across multiple online social network communities; (2) a progressive aggregation tree for collecting the properties of these posts and creating new classifiers that actively filter out the propagation of inappropriate posts; and (3) a group communication structure that allows multiple groups to exchange the correlations among distributed social data sources. We designed and implemented a prototype of the Spiral system. Our large-scale experiments on real-world social data demonstrate Spiral’s scalability, effective load balancing, and efficiency in online spam detection for social networks.
Keywords