Integrating Topic-Aware Heterogeneous Graph Neural Network With Transformer Model for Medical Scientific Document Abstractive Summarization

Ayesha Khaliq; Atif Khan; Salman Afsar Awan; Salman Jan; Muhammad Umair; Megat F. Zuhairi

doi:10.1109/ACCESS.2024.3443730

IEEE Access (Jan 2024)

Affiliations

Ayesha Khaliq: Department of Computer Science, University of Agriculture Faisalabad, Faisalabad, Pakistan
Atif Khan: Department of Computer Science, Islamia College Peshawar, Peshawar, Khyber Pakhtunkhwa, Pakistan
Salman Afsar Awan: Department of Computer Science, University of Agriculture Faisalabad, Faisalabad, Pakistan
Salman Jan: Department of Information Technology, Al Buraimi University College, Al Buraimi, Oman
Muhammad Umair: Department of Computer Science, City University of Science and Information Technology, Peshawar, Pakistan
Megat F. Zuhairi: Malaysian Institute of Information Technology, Universiti Kuala Lumpur, Kuala Lumpur, Malaysia

DOI: https://doi.org/10.1109/ACCESS.2024.3443730
Journal volume & issue: Vol. 12
pp. 113855 – 113866

Abstract

Abstractive summarization is a central and challenging task in Natural Language Processing (NLP): it requires intelligent systems that can extract the main ideas of a text and generate a coherent summary. Many existing abstractive approaches disregard the broader context or fail to capture global semantics when identifying salient content for the summary. Moreover, few studies have extensively evaluated abstractive summarization models in specific domains, such as medical scientific document summarization. Motivated by these gaps, this paper develops a framework for abstractive summarization of medical scientific documents that integrates a topic-aware Heterogeneous Graph Neural Network with a Transformer model. The proposed framework uses Latent Dirichlet Allocation (LDA) for topic modeling to uncover latent topics and global information, thus preserving document-level attributes that are important for creating effective summaries. In addition to topic modeling, the framework uses a Heterogeneous Graph Neural Network (HGNN), which captures the relationships between sentences through a graph-based document representation and allows local and global information to be updated concurrently. Finally, the framework is integrated with a Transformer decoder, which greatly enhances the model's ability to produce accurate and informative abstractive summaries. The proposed framework is evaluated on the publicly available PubMed dataset of medical scientific papers. Experimental results show that it outperforms state-of-the-art models, achieving F1-scores of 46.03 for ROUGE-1, 21.42 for ROUGE-2, and 39.71 for ROUGE-L.
This work contributes to natural language processing, particularly to medical scientific document summarization. Alongside its improved ROUGE scores, the framework provides a deeper understanding of document structure and could benefit applications that depend on efficient access to information.
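
For readers unfamiliar with the ROUGE F1-scores quoted in the abstract, the following is a minimal sketch of how ROUGE-1, ROUGE-2, and ROUGE-L F1 can be computed for a single candidate and reference summary. It is an illustration only, not the evaluation code used in the paper: the function names are hypothetical, tokenization is plain lowercased whitespace splitting, and no stemming or multi-sentence handling is applied, so it will not reproduce the published numbers.

```python
from collections import Counter


def _ngrams(tokens, n):
    """Count the n-grams in a token list (empty if the list is shorter than n)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def _f1(overlap, cand_total, ref_total):
    """Harmonic mean of precision (overlap/candidate) and recall (overlap/reference)."""
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n_f1(candidate, reference, n=1):
    """ROUGE-N F1 from clipped n-gram overlap (assumption: whitespace tokens)."""
    cand = _ngrams(candidate.lower().split(), n)
    ref = _ngrams(reference.lower().split(), n)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((cand & ref).values())
    return _f1(overlap, sum(cand.values()), sum(ref.values()))


def _lcs_length(a, b):
    """Length of the longest common subsequence, via row-by-row dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """ROUGE-L F1 from the longest common subsequence of the two token lists."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    return _f1(_lcs_length(cand, ref), len(cand), len(ref))


if __name__ == "__main__":
    # Made-up example sentences, not taken from the PubMed dataset.
    candidate = "the model summarizes medical papers"
    reference = "the model summarizes scientific medical papers"
    print(rouge_n_f1(candidate, reference, 1))
    print(rouge_n_f1(candidate, reference, 2))
    print(rouge_l_f1(candidate, reference))
```

The functions return values between 0 and 1. Scores such as 46.03 are conventionally these values multiplied by 100 and averaged over a test set.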

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
