IEEE Access (Jan 2023)

Warm-Starting for Improving the Novelty of Abstractive Summarization

  • Ayham Alomari,
  • Ahmad Sami Al-Shamayleh,
  • Norisma Idris,
  • Aznul Qalid Md Sabri,
  • Izzat Alsmadi,
  • Danah Omary

DOI
https://doi.org/10.1109/ACCESS.2023.3322226
Journal volume & issue
Vol. 11
pp. 112483 – 112501

Abstract

Abstractive summarization is distinguished by its use of novel phrases that do not appear in the source text. However, most previous research neglects this property in favour of improving syntactic similarity with the reference summary. To improve novelty, we use multiple warm-started models that vary in their encoder and decoder checkpoints and vocabulary. These models are then adapted to a paraphrasing task and decoded with a sampling strategy to further increase novelty and quality. In addition, to avoid relying solely on syntactic similarity, we introduce two abstractive summarization metrics: 1) NovScore, a new metric that assigns a novelty score to a summary; and 2) NSSF, a new comprehensive metric that combines Novelty, Syntactic, Semantic, and Faithfulness features into a single score intended to approximate human assessment and provide a reliable evaluation. Finally, we compare our models with state-of-the-art sequence-to-sequence models using both the existing and the proposed metrics. The results show that warm-starting, sampling, and paraphrasing improve novelty by 2%, 5%, and 14%, respectively, while maintaining comparable scores on the other metrics.
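
To make the notion of novelty concrete, the sketch below computes a simple novel n-gram ratio: the share of distinct summary n-grams that never occur in the source. This is a common proxy and not the paper's NovScore, whose definition is not given in the abstract. The function names, the whitespace tokenization, and the default n = 2 are all assumptions made for illustration.

```python
from typing import List, Set, Tuple


def ngrams(tokens: List[str], n: int) -> Set[Tuple[str, ...]]:
    """Return the set of distinct n-grams in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def novel_ngram_ratio(source: str, summary: str, n: int = 2) -> float:
    """Fraction of distinct summary n-grams that never occur in the source.

    Illustrative proxy only; this is NOT the paper's NovScore.
    Tokenization is naive lowercased whitespace splitting (an assumption).
    """
    src = ngrams(source.lower().split(), n)
    summ = ngrams(summary.lower().split(), n)
    if not summ:
        # Summary shorter than n tokens: no n-grams to judge.
        return 0.0
    return len(summ - src) / len(summ)


if __name__ == "__main__":
    source = "the cat sat on the mat"
    summary = "the cat rested on the mat"
    # Two of the five summary bigrams are absent from the source.
    print(novel_ngram_ratio(source, summary))  # 0.4
```

A summary copied verbatim from the source scores 0.0 under this proxy, while a fully reworded one approaches 1.0. A high ratio alone says nothing about faithfulness, which is presumably why the abstract pairs novelty with syntactic, semantic, and faithfulness features in NSSF.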

Keywords