International Journal of Web Research (Sep 2024)

MultiCGCN: Multi-Label Text Classification using GCNs and Heterogeneous Graphs

  • Milad Allahgholi,
  • Hossein Rahmani,
  • Parinaz Soltanzadeh,
  • Aylin Naebzadeh

DOI
https://doi.org/10.22133/ijwr.2024.485064.1243
Journal volume & issue
Vol. 7, no. 4
pp. 29 – 37

Abstract

Multi-label text classification is a critical challenge in natural language processing, where the goal is to assign multiple labels to a given document. Recent advances have primarily focused on deep learning approaches, yet many fail to adequately capture the intricate relationships between documents and labels. In this paper, we propose a novel method called MultiCGCN, in which we leverage Graph Convolutional Networks (GCNs) for multi-label text classification by modeling text as a heterogeneous graph. This unified graph incorporates document similarities, label relationships, and document-label associations, enabling the model to effectively capture both document and label dependencies. We transform the multi-label classification problem into a link prediction task, using Term Frequency–Inverse Document Frequency (TF-IDF) for document similarity and applying GCNs to predict label assignments. Our empirical evaluations demonstrate that MultiCGCN achieves a significant performance boost, improving F1 score by 10% over traditional baseline models. This approach opens new avenues for enhancing the accuracy of multi-label classification in various domains.
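
The graph construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy documents, the label matrix, the similarity threshold, the co-occurrence rule for label-label edges, the one-hot node features, and the single untrained propagation layer are all assumptions made for illustration. It builds a unified adjacency matrix from TF-IDF cosine similarities (document-document), label co-occurrence (label-label), and known assignments (document-label), applies the standard symmetric GCN normalisation, and scores every document-label pair as a candidate link.

```python
import numpy as np

def tfidf_matrix(docs):
    """Row-normalised TF-IDF matrix for whitespace-tokenised documents."""
    vocab = sorted({tok for d in docs for tok in d.lower().split()})
    index = {tok: j for j, tok in enumerate(vocab)}
    counts = np.zeros((len(docs), len(vocab)))
    for i, d in enumerate(docs):
        for tok in d.lower().split():
            counts[i, index[tok]] += 1
    df = (counts > 0).sum(axis=0)                    # document frequency per term
    idf = np.log((1 + len(docs)) / (1 + df)) + 1.0   # smoothed IDF (one common variant)
    x = counts * idf
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, 1e-12)

def build_adjacency(docs, y, sim_threshold=0.1):
    """Unified graph over document nodes followed by label nodes.

    sim_threshold is a hypothetical cut-off, not a value from the paper.
    """
    x = tfidf_matrix(docs)
    sim = x @ x.T                                    # cosine similarity (rows are unit length)
    doc_doc = np.where(sim >= sim_threshold, sim, 0.0)
    np.fill_diagonal(doc_doc, 0.0)
    label_label = (y.T @ y > 0).astype(float)        # assumed rule: labels that co-occur
    np.fill_diagonal(label_label, 0.0)
    return np.block([[doc_doc, y], [y.T, label_label]])

def normalize_adjacency(a):
    """Symmetric GCN normalisation: D^{-1/2} (A + I) D^{-1/2}."""
    a_hat = a + np.eye(a.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_link_scores(a_norm, n_docs, hidden=8, seed=0):
    """One untrained GCN layer on one-hot features, then dot-product link scores."""
    rng = np.random.default_rng(seed)
    n = a_norm.shape[0]
    w = rng.normal(scale=0.1, size=(n, hidden))      # random weights; training is omitted
    h = np.maximum(a_norm @ np.eye(n) @ w, 0.0)      # ReLU(A_norm X W) with X = I
    logits = h[:n_docs] @ h[n_docs:].T               # one score per (document, label) pair
    return 1.0 / (1.0 + np.exp(-logits))

# Toy data (hypothetical): four documents, two labels; the last document is unlabeled.
docs = [
    "graph neural networks for text classification",
    "convolutional networks on graph data",
    "protein interaction prediction with neural networks",
    "link prediction with graph convolutional networks",
]
y = np.array([[1, 0], [1, 0], [1, 1], [0, 0]], dtype=float)

adj = build_adjacency(docs, y)
a_norm = normalize_adjacency(adj)
scores = gcn_link_scores(a_norm, n_docs=len(docs))
print(scores.round(3))  # scores[i, j]: score for a link from document i to label j
```

Because the weights are random, the printed scores carry no predictive meaning; in a trained model, the weights would be fitted so that known document-label edges score high, and the scores for unlabeled documents would then be thresholded to produce label assignments.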

Keywords