IEEE Access (Jan 2023)
Robust Semi-Supervised Fake News Recognition by Effective Augmentations and Ensemble of Diverse Deep Learners
Abstract
Nowadays, most people obtain information from social media, where news accompanied by photos and videos attracts more readers than traditional news does. However, some publishers misuse these advantages to spread fake information rapidly, adversely affecting individuals and societies. Early detection of fake posts is therefore crucial, and developing an automatic content-based fake news detector is an ideal way to address this problem. Because news is generated on social media at a very high rate, and labeling the large amount of data required by fully supervised models is expensive and time-consuming, such models are impractical in real applications. To address this limitation, this study presents a semi-supervised method that combines an ensemble of diverse deep learners, effective augmentations, and a distribution-aware pseudo-labeling technique. The proposed hybrid loss function forces the learners to classify accurately while attending to different parts of the news content. The proposed augmentations enhance the robustness of the learners and effectively prevent overfitting. The diverse learners annotate the unlabeled posts and update their parameters from the most confidently predicted news in a curriculum manner, which improves the quality of the pseudo labels and the robustness of the model. In addition, we use encoded sentences from pre-trained transformer models, such as XLNet, together with parameter sharing to build lightweight learners on a common deep feature extractor module. Consequently, although the proposed method has fewer parameters than existing methods, experiments on three public fake news datasets show that it consistently outperforms state-of-the-art models across different proportions of labeled data on all evaluated datasets.
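The confidence-based pseudo-labeling step summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `select_pseudo_labels`, the averaging of learner probabilities, and the `keep_fraction` parameter are assumptions made for illustration, and the distribution-aware component is not modeled because the abstract gives no details on it.

```python
import numpy as np


def select_pseudo_labels(learner_probs, keep_fraction):
    """Pick the most confident ensemble predictions as pseudo labels.

    learner_probs: array of shape (n_learners, n_samples, n_classes) holding
        each learner's class probabilities for the unlabeled posts.
    keep_fraction: share of unlabeled posts to accept in this round
        (hypothetical curriculum knob; raise it in later rounds).
    """
    probs = np.asarray(learner_probs, dtype=float)
    mean_probs = probs.mean(axis=0)          # ensemble average per sample
    labels = mean_probs.argmax(axis=1)       # predicted class per sample
    confidence = mean_probs.max(axis=1)      # probability of that class
    n_keep = int(np.ceil(keep_fraction * len(labels)))
    order = np.argsort(-confidence, kind="stable")  # most confident first
    chosen = np.sort(order[:n_keep])
    return chosen, labels[chosen]


# Toy example: two learners, four unlabeled posts, two classes (real/fake).
learner_a = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8], [0.5, 0.5]]
learner_b = [[0.8, 0.2], [0.4, 0.6], [0.1, 0.9], [0.45, 0.55]]

# Early round: accept only the most confident half of the posts.
early_idx, early_labels = select_pseudo_labels([learner_a, learner_b], 0.5)

# Later round: accept every post once the learners are more reliable.
late_idx, late_labels = select_pseudo_labels([learner_a, learner_b], 1.0)
```

In the early round only posts 0 and 2 are accepted, since the two learners agree on them with high probability. Growing `keep_fraction` over successive rounds is one simple way to realize an easy-to-hard curriculum.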
Keywords