IEEE Access (Jan 2023)

Detecting Misleading Headlines Through the Automatic Recognition of Contradiction in Spanish

  • Robiert Sepulveda-Torres,
  • Alba Bonet-Jover,
  • Estela Saquete

DOI
https://doi.org/10.1109/ACCESS.2023.3295781
Journal volume & issue
Vol. 11
pp. 72007 – 72026

Abstract

Misleading headlines are part of the disinformation problem. A headline should give a concise summary of the news story, helping the reader decide whether to read the body text of the article, which is why headline accuracy is a crucial element of a news story. This work focuses on detecting misleading headlines through the automatic identification of contradiction between the headline and the body text of a news item. When a contradiction is detected, the reader is alerted to the headline's lack of precision or trustworthiness in relation to the body text. To facilitate the automatic detection of misleading headlines, a new Spanish dataset (ES_Headline_Contradiction) is created for the purpose of identifying contradictory information between a headline and its body text. The dataset annotates the semantic relationship between headline and body text, categorising it as compatible, contradictory or unrelated. A further novel aspect of the dataset is that it distinguishes between different types of contradiction, thereby enabling a more fine-grained identification of them. The dataset was built via a novel semi-automatic methodology, which resulted in a more cost-efficient development process. The results of the experiments show that pre-trained language models can be fine-tuned with this dataset, producing very encouraging results for detecting incongruency or non-relation between headline and body text.
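
The three-way annotation scheme described in the abstract can be pictured with a minimal sketch. This is only an illustration under stated assumptions: the names `Relation`, `HeadlineExample` and `should_alert` are hypothetical and do not come from the paper or the dataset, the sample texts are invented placeholders, and the alert rule simply mirrors the abstract's statement that the reader is alerted when a contradiction is detected.

```python
from dataclasses import dataclass
from enum import Enum


class Relation(Enum):
    """The three headline/body relations named in the abstract."""
    COMPATIBLE = "compatible"
    CONTRADICTORY = "contradictory"
    UNRELATED = "unrelated"


@dataclass
class HeadlineExample:
    """Hypothetical record layout: one headline, its body text, and a label."""
    headline: str
    body: str
    relation: Relation


def should_alert(relation: Relation) -> bool:
    # Assumption: only a detected contradiction triggers a reader alert,
    # following the wording of the abstract.
    return relation is Relation.CONTRADICTORY


# Invented placeholder example, not taken from the dataset.
example = HeadlineExample(
    headline="placeholder headline",
    body="placeholder body text",
    relation=Relation.CONTRADICTORY,
)
alert = should_alert(example.relation)
```

Whether unrelated headline/body pairs should also raise an alert is a design choice the abstract does not settle, so the sketch leaves them unflagged.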

Keywords