MultiCGCN: Multi-Label Text Classification using GCNs and Heterogeneous Graphs

Milad Allahgholi; Hossein Rahmani; Parinaz Soltanzadeh; Aylin Naebzadeh

doi:10.22133/ijwr.2024.485064.1243

International Journal of Web Research (Sep 2024)

Affiliations

Milad Allahgholi: School of Computer Engineering, Iran University of Science and Technology, Tehran, Iran
Hossein Rahmani: School of Computer Engineering, Iran University of Science and Technology, Tehran, Iran
Parinaz Soltanzadeh: School of Computer Engineering, Iran University of Science and Technology, Tehran, Iran
Aylin Naebzadeh: School of Computer Engineering, Iran University of Science and Technology, Tehran, Iran

DOI: https://doi.org/10.22133/ijwr.2024.485064.1243
Journal volume & issue: Vol. 7, no. 4
pp. 29 – 37

Abstract

Multi-label text classification is a critical challenge in natural language processing, where the goal is to assign multiple labels to a given document. Recent advances have primarily focused on deep learning approaches, yet many fail to adequately capture the intricate relationships between documents and labels. In this paper, we propose a novel method called MultiCGCN, in which we leverage Graph Convolutional Networks (GCNs) for multi-label text classification by modeling text as a heterogeneous graph. This unified graph incorporates document similarities, label relationships, and document-label associations, enabling the model to effectively capture both document and label dependencies. We transform the multi-label classification problem into a link prediction task, using Term Frequency–Inverse Document Frequency (TF-IDF) for document similarity and applying GCNs to predict label assignments. Our empirical evaluations demonstrate that MultiCGCN achieves a significant performance boost, improving F1 score by 10% over traditional baseline models. This approach opens new avenues for enhancing the accuracy of multi-label classification in various domains.
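
The pipeline outlined in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes whitespace tokenization, a smoothed IDF, a similarity threshold of 0.1, label-label edges defined by co-occurrence, one-hot node features, and a single weight-free GCN-style propagation step in place of a trained network. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """One sparse TF-IDF dict per document (smoothed IDF is an assumption)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(t for tokens in tokenized for t in set(tokens))
    return [
        {t: (c / len(tokens)) * (math.log((1 + n_docs) / (1 + doc_freq[t])) + 1)
         for t, c in Counter(tokens).items()}
        for tokens in tokenized
    ]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def build_graph(docs, doc_labels, n_labels, sim_threshold=0.1):
    """Adjacency matrix over nodes [doc_0..doc_{n-1}, label_0..label_{m-1}]."""
    n = len(docs)
    size = n + n_labels
    adj = [[0.0] * size for _ in range(size)]
    vecs = tfidf_vectors(docs)
    for i in range(n):                      # document-document edges
        for j in range(i + 1, n):
            sim = cosine(vecs[i], vecs[j])
            if sim >= sim_threshold:
                adj[i][j] = adj[j][i] = sim
    for i, labels in enumerate(doc_labels):
        for a in labels:                    # document-label edges
            adj[i][n + a] = adj[n + a][i] = 1.0
            for b in labels:                # label-label edges (assumed: co-occurrence)
                if a != b:
                    adj[n + a][n + b] = 1.0
    return adj


def normalize(adj):
    """Symmetric GCN normalization: D^-1/2 (A + I) D^-1/2."""
    size = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(size)]
             for i in range(size)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(size)]
            for i in range(size)]


def link_scores(embeddings, doc_index, n_docs, n_labels):
    """Dot-product score between one document node and every label node."""
    doc_vec = embeddings[doc_index]
    return [sum(x * y for x, y in zip(doc_vec, embeddings[n_docs + label]))
            for label in range(n_labels)]


# Toy data: document 1 has no labels and plays the role of a test document.
docs = [
    "graph neural network model",
    "graph convolution network layer",
    "stock market price",
]
doc_labels = [[0], [], [1]]  # label 0 and label 1 are placeholders
n_labels = 2

# With one-hot features and no learned weights, one propagation step
# yields the rows of the normalized adjacency matrix as node embeddings.
embeddings = normalize(build_graph(docs, doc_labels, n_labels))
scores = link_scores(embeddings, doc_index=1, n_docs=len(docs), n_labels=n_labels)
print(scores)
```

In this toy graph the unlabeled document shares terms only with document 0, so its only path to a label node runs through that similarity edge, and label 0 receives the higher link score. A trained model would replace the weight-free step with learned GCN layers and a supervised link-prediction objective.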

Published in International Journal of Web Research

ISSN: 2645-4343 (Online)
Publisher: University of Science and Culture
Country of publisher: Iran, Islamic Republic of
LCC subjects: Bibliography. Library science. Information resources: Information resources (General)
Website: http://ijwr.usc.ac.ir/
