Reaching for upper bound ROUGE score of extractive summarization methods

Iskander Akhmetov; Rustam Mussabayev; Alexander Gelbukh

doi:10.7717/peerj-cs.1103

PeerJ Computer Science (Sep 2022)

Affiliations

Iskander Akhmetov: Kazakh-British Technical University, Almaty, Almaty, Kazakhstan
Rustam Mussabayev: Institute of Information and Computational Technologies, Almaty, Almaty, Kazakhstan
Alexander Gelbukh: Instituto Politecnico Nacional, Mexico, Mexico

DOI: https://doi.org/10.7717/peerj-cs.1103
Journal volume & issue: Vol. 8, p. e1103

Abstract

Extractive text summarization (ETS) methods find the salient information in a text automatically by selecting exact sentences from the source text. In this article, we ask what summary quality ETS methods can achieve. To maximize the ROUGE-1 score, we used five approaches: (1) an adapted reduced variable neighborhood search (RVNS), (2) a greedy algorithm, (3) VNS initialized by the greedy algorithm results, (4) a genetic algorithm, and (5) a genetic algorithm initialized by the greedy algorithm results. We ran experiments on articles from the arXiv dataset. The best of the tested approaches, the genetic algorithm initialized by the greedy algorithm results, achieved scores of 0.59 for ROUGE-1 and 0.25 for ROUGE-2. These scores appear to be higher than those obtained by the current state-of-the-art text summarization models: the best ROUGE-1 score reported in the literature on the same dataset is 0.46. Therefore, there is room for the development of ETS methods, which are now undeservedly forgotten.
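
To make the idea of a greedy search for a high ROUGE-1 extractive summary concrete, the following is a minimal sketch and not the authors' implementation. It assumes whitespace tokenization, the F1 variant of ROUGE-1 with clipped unigram counts, and a fixed cap on the number of selected sentences; the names `rouge1_f`, `greedy_oracle`, and `max_sentences` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def rouge1_f(candidate_tokens, reference_tokens):
    """ROUGE-1 F1: clipped unigram overlap between candidate and reference."""
    if not candidate_tokens or not reference_tokens:
        return 0.0
    # Counter intersection keeps the minimum count per token (clipping).
    overlap = sum((Counter(candidate_tokens) & Counter(reference_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(candidate_tokens)
    recall = overlap / len(reference_tokens)
    return 2 * precision * recall / (precision + recall)


def greedy_oracle(sentences, reference, max_sentences=3):
    """Greedily pick source sentences that raise ROUGE-1 against the reference.

    Assumption: lowercased whitespace tokenization; a real evaluation would
    normally use a dedicated ROUGE package with its own preprocessing.
    """
    ref_tokens = reference.lower().split()
    tokenized = [s.lower().split() for s in sentences]
    selected, best_score = [], 0.0
    while len(selected) < max_sentences:
        best_idx = None
        for i in range(len(sentences)):
            if i in selected:
                continue
            # Score the summary formed by the current selection plus sentence i,
            # keeping sentences in their original document order.
            candidate = [t for j in sorted(selected + [i]) for t in tokenized[j]]
            score = rouge1_f(candidate, ref_tokens)
            if score > best_score:
                best_idx, best_score = i, score
        if best_idx is None:
            break  # no remaining sentence improves the score
        selected.append(best_idx)
    return sorted(selected), best_score


if __name__ == "__main__":
    source = [
        "the cat sat on the mat",
        "dogs bark loudly at night",
        "a cat was on a mat",
    ]
    indices, score = greedy_oracle(source, "the cat was on the mat")
    print(indices, round(score, 4))
```

A selection like this could also serve as the starting point for a local search or a genetic algorithm, in the spirit of approaches (3) and (5), although how the article seeds those methods is not specified here.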

Published in PeerJ Computer Science

ISSN: 2376-5992 (Online)
Publisher: PeerJ Inc.
Country of publisher: United States
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://peerj.com/computer-science/
