Heliyon (Sep 2024)
Dissecting the infodemic: An in-depth analysis of COVID-19 misinformation detection on X (formerly Twitter) utilizing machine learning and deep learning techniques
Abstract
The alarming growth of misinformation on social media has become a global concern, as it influences public opinion and compromises social, political, and public health development. The proliferation of deceptive information has resulted in widespread confusion, societal disturbances, and significant consequences for health-related matters. Throughout the COVID-19 pandemic, there was a substantial surge in the dissemination of inaccurate or deceptive information via social media platforms, particularly X (formerly known as Twitter), resulting in the phenomenon commonly referred to as an “Infodemic”. This review paper surveys a broad selection of 600 articles published in the past five years and conducts a thorough analysis of 87 studies that investigate the detection of COVID-19-related fake news on X. In addition, it explores the algorithmic techniques and methodologies used to investigate the individuals responsible for disseminating this type of fake news. A summary of common fake news detection datasets, along with their fundamental characteristics, is also included. We perform an in-depth examination of the most recent literature on identifying fake news, the behavioral patterns of misinformation spreaders, and the analysis of their communities, including the approaches that researchers have used and recommended.
Our key findings can be summarized as follows: (a) around 80% of the papers on fake news detection use Deep Neural Network-based techniques to achieve better performance, although the proposed models suffer from overfitting, vanishing gradients, and longer prediction times; (b) around 60% of the papers analyzing disseminators focus on identifying dominant spreaders and their communities using graph modeling, although little work has been done in this domain; and (c) a wide range of research gaps remain, for example the need for a large and robust training dataset and for deeper investigation of the communities, for which we suggest potential solution strategies. Moreover, to facilitate the use of a large training dataset for detecting fake news, we have created a large database by compiling the training datasets from 17 different research works. The objective of this study is to shed light on how COVID-19-related tweets are beginning to diverge, along with the dissemination of misinformation. Our work uncovers notable findings, including the ongoing rapid growth of the disseminator population, the presence of professional spreaders within the disseminator community, and a substantial level of collaboration among fake news spreaders.