Domínios de Lingu@gem (Sep 2022)

Evaluating a typology of signals for automatic detection of complementarity

  • Jackson Wilke da Cruz Souza,
  • Ariani Di Felippo

DOI
https://doi.org/10.14393/DL52-v16n4a2022-10
Journal volume & issue
Vol. 16, no. 4
pp. 1517 – 1543

Abstract

In a cluster of news texts about the same event, two sentences from different documents may express different multi-document phenomena (redundancy, complementarity, and contradiction). Cross-Document Structure Theory (CST) provides labels that represent these phenomena explicitly. Automatically identifying the multi-document phenomena and their corresponding CST relations is useful for Automatic Multi-Document Summarization, since it helps computers understand text meaning. In this paper, we evaluated a typology of textual signals for the automatic detection of the CST relations of complementarity (i.e., Historical background, Follow-up, and Elaboration) in a multi-document corpus of news texts in Brazilian Portuguese. Using algorithms from different machine-learning paradigms, we obtained classifiers with high overall accuracy (above 90%), which indicates the potential of the signals.

Keywords