Applied Sciences (Aug 2023)

Coreference Resolution for Improving Performance Measures of Classification Tasks

  • Kirsten Šteflovič,
  • Jozef Kapusta

DOI
https://doi.org/10.3390/app13169272
Journal volume & issue
Vol. 13, no. 16
p. 9272

Abstract

Classification in natural language processing tasks can be improved in several ways. In this article, we focused on coreference resolution, applied to a manually annotated dataset of true and fake news that was used for the classification task of fake news detection. The research aimed to determine whether classification is more effective when coreference resolution is performed on the input data beforehand or when the data are classified without it. We also wanted to verify whether classifier performance metrics can be enhanced by incorporating coreference resolution into the data preparation process. We proposed a methodology and described its implementation in detail: identifying entity mentions in the text using the neuralcoref algorithm, then representing the text with word-embedding models (TF–IDF, Doc2Vec), and finally applying several machine learning methods. The result was a comparison of the implemented classifiers based on the performance metrics described in the theoretical part. The best accuracy was observed for the dataset with coreference resolution applied, with a median value of 0.8149; the best F1 score had a median value of 0.8101. More importantly, applying coreference resolution to the data improved the performance metrics in the classification tasks.
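
To illustrate why resolving coreferences before vectorization might change what a classifier sees, the following is a minimal sketch and not the authors' implementation. It assumes the coreference clusters are already known (the hand-written `clusters` mapping is a hypothetical stand-in for the output of a resolver such as neuralcoref), and it uses a plain textbook TF-IDF formula, tf * log(N / df), which may differ from the weighting used in the study. The function names `resolve_coreferences` and `tf_idf` are illustrative only.

```python
import math
from collections import Counter


def resolve_coreferences(tokens, clusters):
    """Replace each mention with its referent.

    `clusters` maps a token index to the replacement string. It is
    hand-written here (an assumption); a real resolver would produce it.
    """
    return [clusters.get(i, tok) for i, tok in enumerate(tokens)]


def tf_idf(docs):
    """Textbook TF-IDF: (count / doc length) * log(N / document frequency).

    `docs` is a list of token lists; returns one {term: weight} dict per doc.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Two toy documents (invented for illustration, not from the paper's dataset).
doc_a = "the minister said he would resign".split()
doc_b = "the report was published today".split()

# Token 3 ("he") refers to "minister"; this link is annotated by hand.
doc_a_resolved = resolve_coreferences(doc_a, {3: "minister"})

raw_weights = tf_idf([doc_a, doc_b])
resolved_weights = tf_idf([doc_a_resolved, doc_b])

print("raw:     ", raw_weights[0]["minister"])
print("resolved:", resolved_weights[0]["minister"])
```

In this toy case the pronoun disappears from the vocabulary and the weight of the entity it referred to increases, so the document vector carries more signal about the entity. Whether that translates into better classification is the empirical question the article examines.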

Keywords