Journal of the Brazilian Computer Society (Sep 2018)

Update summarization: building from scratch for Portuguese and comparing to English

  • Fernando Antônio Asevedo Nóbrega,
  • Thiago Alexandre Salgueiro Pardo

DOI
https://doi.org/10.1186/s13173-018-0075-1
Journal volume & issue
Vol. 24, no. 1
pp. 1 – 12

Abstract

Update summarization aims to automatically produce a summary of a collection of texts for a reader who has already read some previous texts on the subject of interest. It is a challenging task, since it not only inherits the demands of the summarization area (such as producing informative, coherent, and cohesive summaries) but also adds the problem of finding relevant new or updated content. In this paper, we report a comprehensive investigation of update summarization methods for the Portuguese language, for which there are few initiatives. We also propose new methods that combine some summarization strategies and enrich a traditional method with linguistic knowledge (subtopics), producing better results and advancing the state of the art. In addition, we present a reference dataset for Portuguese, which did not previously exist, and establish an experimental setup for the area in order to foster future research. To confirm some of our summarization results, we run experiments on a well-known benchmark dataset for the English language and show that our methods still perform well.

Keywords