Robust Semi-Supervised Fake News Recognition by Effective Augmentations and Ensemble of Diverse Deep Learners

Abdulhameed Al Obaid; Hassan Khotanlou; Muharram Mansoorizadeh; Davood Zabihzadeh

doi:10.1109/ACCESS.2023.3278323

IEEE Access (Jan 2023)

Affiliations

Abdulhameed Al Obaid: Department of Computer Engineering, RIV Laboratory, Bu-Ali Sina University, Hamedan, Iran
Hassan Khotanlou: Department of Computer Engineering, RIV Laboratory, Bu-Ali Sina University, Hamedan, Iran
Muharram Mansoorizadeh: Department of Computer Engineering, RIV Laboratory, Bu-Ali Sina University, Hamedan, Iran
Davood Zabihzadeh: Computer Engineering Department, Hakim Sabzevari University, Sabzevar, Iran

DOI: https://doi.org/10.1109/ACCESS.2023.3278323
Journal volume & issue: Vol. 11, pp. 54526–54543

Abstract

Nowadays, most people obtain information from social media networks, where news accompanied by photos and videos attracts more readers than traditional news. However, some publishers exploit this reach to disseminate fake information rapidly, which harms individuals and societies. Early detection of fake posts is therefore crucial, and an automatic content-based fake news detector is the ideal way to achieve it. Fully supervised models are of limited use in real applications, because news is generated on social media at a very high rate and labeling the large amount of data these models require is expensive and time consuming. To address this limitation, this study presents a semi-supervised method that combines an ensemble of diverse deep learners, effective augmentations, and a distribution-aware pseudo-labeling technique. The proposed hybrid loss function drives the learners toward accurate classification while they attend to different parts of the news content. The proposed augmentations enhance the robustness of the learners and prevent overfitting. The diverse learners annotate the unlabeled posts accurately and update their parameters from the most confidently predicted news in a curriculum fashion, thereby enhancing the quality of the pseudo labels and the robustness of the model. In addition, we use encoded sentences from pre-trained transformer models, such as XLNet, and parameter sharing to build lightweight learners on a common deep feature extractor module. Consequently, although the proposed method has fewer parameters than existing methods, experiments on three public fake news datasets show that it consistently outperforms state-of-the-art models with different proportions of labeled data across all evaluated datasets.
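
The abstract describes pseudo-labeling in which an ensemble annotates unlabeled posts and learns from the most confident predictions in a curriculum. The following is only a minimal sketch of that general idea, not the authors' implementation. It assumes that each learner outputs a class-probability vector per post, that the ensemble prediction is a plain average, and that the curriculum is a linearly growing fraction of retained posts. The function names `average_probs`, `select_pseudo_labels`, and `curriculum_schedule` are hypothetical. The distribution-aware component and the hybrid loss mentioned in the abstract are not modeled here, because the abstract does not specify them.

```python
from typing import List, Sequence, Tuple


def average_probs(member_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average the class-probability vectors of several learners for one post."""
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    return [sum(p[c] for p in member_probs) / n_members for c in range(n_classes)]


def select_pseudo_labels(
    ensemble_probs: Sequence[Sequence[Sequence[float]]],
    keep_fraction: float,
) -> List[Tuple[int, int, float]]:
    """Keep the most confident fraction of unlabeled posts.

    ensemble_probs[i][m] is learner m's probability vector for post i.
    Returns (post_index, pseudo_label, confidence) tuples, most confident first.
    Assumption: confidence is the largest averaged class probability.
    """
    scored = []
    for i, member_probs in enumerate(ensemble_probs):
        avg = average_probs(member_probs)
        label = max(range(len(avg)), key=avg.__getitem__)
        scored.append((i, label, avg[label]))
    scored.sort(key=lambda item: item[2], reverse=True)
    n_keep = int(len(scored) * keep_fraction)
    return scored[:n_keep]


def curriculum_schedule(n_rounds: int, start: float = 0.2, end: float = 1.0) -> List[float]:
    """Hypothetical curriculum: the kept fraction grows linearly per round."""
    if n_rounds == 1:
        return [end]
    step = (end - start) / (n_rounds - 1)
    return [start + step * r for r in range(n_rounds)]
```

In a training loop under these assumptions, each round would call `select_pseudo_labels` with the next value from `curriculum_schedule`, add the selected posts to the training set with their pseudo labels, and retrain the learners before the next round.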

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
