IEEE Access (Jan 2023)
Can Anaphora Resolution Improve Extractive Query-Focused Multi-Document Summarization?
Abstract
Query-Focused Multi-Document Summarization (QF-MDS) is the task of automatically generating a summary from a collection of documents that answers a user’s specific query. Extractive methods are primarily based on identifying, ranking, and selecting sentences according to their relevance to the given query. These methods have shown promising results; however, they may yield incoherent summaries when a selected sentence contains a pronominal anaphoric expression whose antecedent is left out of the summary. To address this issue, this paper proposes a novel method that leverages both contextual embeddings and anaphora resolution. More specifically, the Sentence-BERT (SBERT) model is employed to generate contextual embeddings for the sentences in the documents and for the user’s query. Additionally, the SpanBERT model is utilized to resolve unbound pronominal references in the input sentences of the documents, aiming to improve the cohesiveness of the generated summaries. We have conducted a comprehensive comparative analysis using quantitative and qualitative evaluations against other state-of-the-art systems on the standard DUC’2005 and DUC’2007 datasets. The results show that the proposed method is competitive and outperforms recent query-focused multi-document summarization systems on certain ROUGE evaluation measures. Human evaluation results further confirm that our method is able to generate more informative, more cohesive, and less redundant summaries.
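The query-relevance ranking step described above can be illustrated with a minimal sketch. It is not the paper's implementation: `embed` is a hypothetical bag-of-words stand-in for the SBERT encoder, the use of cosine similarity as the relevance score is an assumption (the abstract does not name the scoring function), and `rank_sentences` and `top_k` are illustrative names. Anaphora resolution is assumed to have been applied to the sentences beforehand.

```python
import math
from collections import Counter


def embed(text):
    """Hypothetical stand-in for an SBERT encoder: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_sentences(query, sentences, top_k=2):
    """Return the top_k sentences ordered by similarity to the query."""
    q = embed(query)
    scored = [(cosine(q, embed(s)), s) for s in sentences]
    # Highest similarity first; Python's sort is stable, so ties keep input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in scored[:top_k]]


sentences = [
    "solar power reduces emissions",
    "the cat sat on the mat",
    "wind and solar power are renewable",
]
summary = rank_sentences("solar power emissions", sentences, top_k=2)
```

In a real system the `embed` function would be replaced by a sentence encoder producing dense vectors; the ranking logic itself would stay the same.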
Keywords