Journal of Big Data (Jul 2024)
Text summarization based on semantic graphs: an abstract meaning representation graph-to-text deep learning approach
Abstract
Abstract Nowadays, due to the constantly growing amount of textual information, automatic text summarization constitutes an important research area in natural language processing. In this work, we present a novel framework that combines semantic graph representations along with deep learning predictions to generate abstractive summaries of single documents, in an effort to utilize a semantic representation of the unstructured textual content in a machine-readable, structured, and concise manner. The overall framework is based on a well defined methodology for performing semantic graph parsing, graph construction, graph transformations for machine learning models and deep learning predictions. The employed semantic graph representation focuses on using the model of abstract meaning representation. Several combinations of graph construction and graph transformation methods are investigated to specify the most efficient of them for the machine learning models. Additionally, a range of deep learning architectures is examined, including a sequence-to-sequence attentive network, reinforcement learning, transformer-based architectures, and pre-trained neural language models. In this direction, a semantic graph representation of an original text is extracted, and then the present framework formulates the problem as a graph-to-summary learning problem to predict a summary of an original text. To the best of our knowledge, this formulation of graph-to-summary predictions in abstractive text summarization, without other intermediate steps in the machine learning phase, has not been presented in the relevant literature. Another important contribution is an introduction of a measure for assessing the factual consistency of the generated summaries in an effort to provide a qualitative evaluation. To assess the framework, an extensive experimental procedure is presented that uses popular datasets to evaluate key aspects of the proposed approach. The obtained results exhibit promising performance, validating the robustness of the proposed framework.
Keywords