IEEE Access (Jan 2020)

Extractive Document Summarization Based on Dynamic Feature Space Mapping

  • Samira Ghodratnama,
  • Amin Beheshti,
  • Mehrdad Zakershahrak,
  • Fariborz Sobhanmanesh

DOI
https://doi.org/10.1109/ACCESS.2020.3012539
Journal volume & issue
Vol. 8
pp. 139084 – 139095

Abstract

The exponential growth of Web documents has created the need for automatic document summarization. In this context, extractive document summarization, i.e., the task of extracting the most relevant information, removing redundancy, and presenting the remaining data in a coherent and cohesive structure, is a challenging task. In this article, we propose a novel intelligent approach, namely ExDoS, which harvests the benefits of both supervised and unsupervised algorithms simultaneously. To the best of our knowledge, ExDoS is the first approach to combine both supervised and unsupervised algorithms in a single framework and in an interpretable manner for document summarization. ExDoS iteratively minimizes the error rate of the classifier in each cluster with the help of dynamic local feature weighting. Moreover, this approach specifies the contribution of each feature to discriminating each class, which is a challenging issue in the summarization task. Therefore, in addition to summarizing text, ExDoS is also able to measure the importance of each feature in the summarization process. We evaluate our model both automatically (in terms of ROUGE scores) and empirically (through human analysis) on two benchmark datasets, DUC2002 and CNN/DailyMail. Results show that our model obtains higher ROUGE scores compared to most state-of-the-art models. The human evaluation also demonstrates that our model is capable of generating informative and readable summaries.
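For readers unfamiliar with the automatic metric mentioned above, the following is a minimal sketch of how a ROUGE-N score can be computed from clipped n-gram overlap between a candidate summary and a reference summary. It is a simplified illustration, not the evaluation code used for ExDoS: the function name `rouge_n` is hypothetical, tokenization is assumed to be lowercase whitespace splitting, and no stemming or stopword handling is applied, so its numbers will generally differ from those of the official ROUGE toolkit.

```python
from collections import Counter


def rouge_n(candidate: str, reference: str, n: int = 1) -> dict:
    """Simplified ROUGE-N: clipped n-gram overlap between two texts.

    Assumption: lowercase whitespace tokenization, no stemming.
    """

    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()
        # Count every contiguous run of n tokens.
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    # Counter intersection keeps the minimum count per n-gram (clipping).
    overlap = sum((cand & ref).values())

    recall = overlap / sum(ref.values()) if ref else 0.0
    precision = overlap / sum(cand.values()) if cand else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"recall": recall, "precision": precision, "f1": f1}


if __name__ == "__main__":
    scores = rouge_n("the cat sat on the mat", "the cat lay on the mat", n=1)
    print(scores)
```

Recall rewards covering the reference, precision penalizes extra material in the candidate, and the F1 value balances the two; which of these a given paper reports depends on its evaluation setup.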

Keywords