Applied Sciences (Oct 2024)

Enhancing Misinformation Detection in Spanish Language with Deep Learning: BERT and RoBERTa Transformer Models

  • Yolanda Blanco-Fernández,
  • Javier Otero-Vizoso,
  • Alberto Gil-Solla,
  • Jorge García-Duque

DOI
https://doi.org/10.3390/app14219729
Journal volume & issue
Vol. 14, no. 21
p. 9729

Abstract

This paper presents an approach to identifying political fake news in Spanish using Transformer architectures. Current methodologies often overlook political news due to the lack of quality datasets, especially in Spanish. To address this, we created a synthetic dataset of 57,231 Spanish political news articles, gathered via automated web scraping and enhanced with generative large language models. This dataset is used for fine-tuning and benchmarking Transformer models such as BERT and RoBERTa for fake news detection. Our fine-tuned models showed outstanding performance on this dataset, with accuracy ranging from 97.4% to 98.6%. However, testing on a smaller, independent, hand-curated dataset, including statements from political leaders during Spain's July 2023 electoral debates, revealed a performance drop to 71%. Although this suggests that the models need additional refinement to handle the complexity and variability of real-world political discourse, achieving over 70% accuracy is a promising result in the under-explored domain of Spanish political fake news detection.

Keywords