IEEE Access (Jan 2024)
Fake News Classification Methodology With Enhanced BERT
Abstract
News is a vital source of information for staying updated on many aspects of life worldwide. However, the massive volume of information available on social media platforms makes it challenging to extract meaningful insights. In addition, the spread of false information has widened, often in service of specific agendas. In this work, we present a novel fake news classification methodology based on an enhanced BERT deep learning model, trained on the self-developed PolitiTweet dataset along with the benchmark Buzzfeed dataset. The PolitiTweet dataset is augmented to address class imbalance and to improve data diversity, so that it captures the regional language nuances and cultural references that help detect fake news more accurately. We enhance the BERTbase model by adding three layers, namely a linear layer, a dropout layer, and an activation layer, and fine-tune the resulting model to obtain the enhanced BERT classifier. The fine-tuned BERT model trained on the augmented dataset captures patterns and nuances within the data, giving better classification results. The enhanced BERT model is then evaluated against the BERTbase model to assess the generalizability and real-world performance of the fine-tuned model. The enhanced BERT model achieved an accuracy of 85% on Buzzfeed and 98% on PolitiTweet. In comparison, the baseline BERT models achieved average accuracies of 81% and 88%, respectively. The proposed enhanced BERT model combines pre-training strategies with fine-tuning techniques to achieve better performance. The developed research data is available online at: https://www.kaggle.com/datasets/ameerhamza123/pak-tweets.
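The abstract does not say which augmentation technique was applied to PolitiTweet. As a rough illustration of the class-balancing part only, the sketch below uses plain random oversampling, which is a stand-in and not necessarily the authors' method. It duplicates existing minority-class examples, so it does not add the data diversity that the abstract also mentions. The function name `oversample` and the toy tweets are hypothetical.

```python
import random
from collections import Counter


def oversample(texts, labels, seed=0):
    """Randomly duplicate minority-class examples until every class
    has as many examples as the largest class.

    Assumption: random oversampling stands in for the unspecified
    augmentation step; it balances classes but adds no new text.
    """
    rng = random.Random(seed)

    # Group the texts by their class label.
    by_label = {}
    for text, label in zip(texts, labels):
        by_label.setdefault(label, []).append(text)

    # Every class is grown to the size of the largest class.
    target = max(len(group) for group in by_label.values())

    out_texts, out_labels = [], []
    for label, group in by_label.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for text in group + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


# Hypothetical toy data: four "real" tweets and one "fake" tweet.
texts = ["r1", "r2", "r3", "r4", "f1"]
labels = ["real", "real", "real", "real", "fake"]

balanced_texts, balanced_labels = oversample(texts, labels)
counts = Counter(balanced_labels)
print(counts)
```

With the toy data above, both classes end up with four examples each. In practice the balanced set would be shuffled before fine-tuning, and oversampling would be applied to the training split only so that duplicated examples do not leak into evaluation.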
Keywords