Journal on Interactive Systems (Mar 2023)

Fake news detection: a systematic literature review of machine learning algorithms and datasets

  • Humberto Fernandes Villela,
  • Fábio Corrêa,
  • Jurema Suely de Araújo Nery Ribeiro,
  • Air Rabelo,
  • Dárlinton Barbosa Feres Carvalho

DOI
https://doi.org/10.5753/jis.2023.3020
Journal volume & issue
Vol. 14, no. 1

Abstract

Fake news (i.e., false news created with malicious intent and a high capacity for dissemination) is a problem of great interest to society today, since it has achieved unprecedented political, economic, and social impacts. Taking advantage of modern digital communication and information technologies, it is widely propagated through social media, where its use is intentional and challenging to identify. To mitigate the damage caused by fake news, researchers have sought to develop automated mechanisms to detect it, such as algorithms based on machine learning, along with the datasets employed in their development. This research analyzes the machine learning algorithms, and the datasets used to train them, that the published literature reports for identifying fake news. It is exploratory research with a qualitative approach, which uses a research protocol to identify the studies to be analyzed. The results highlight the Stacking Method, Bidirectional Recurrent Neural Network (BiRNN), and Convolutional Neural Network (CNN) algorithms, with 99.9%, 99.8%, and 99.8% accuracy, respectively. Although these accuracy values are impressive, most of the research employed datasets from controlled environments (e.g., Kaggle) or without information updated in real time (from social networks). Only a few studies have been applied in social network environments, where the most significant dissemination of disinformation occurs nowadays. Kaggle was the platform with the most frequently used datasets, followed by Weibo, FNC-1, COVID-19 Fake News, and Twitter. Future research should extend beyond news about politics, the area that was the primary motivator for the growth of research from 2017 onward, and should explore hybrid methods for identifying fake news.

Keywords