IEEE Access (Jan 2024)

Enhancing the Performance of Multi-Category Text Classification via Label Relation Mining

  • Yun Wang,
  • Kang Tian,
  • Yanghan Gao,
  • Bowen Hu,
  • Xiaoyang Li,
  • Yadong Zhou

DOI
https://doi.org/10.1109/ACCESS.2024.3394853
Journal volume & issue
Vol. 12
pp. 61433–61442

Abstract


Multi-category text classification aims to assign labels under multiple categories to each sample in a dataset. However, most existing work overlooks the relationships between labels and does not effectively combine text and label information. To address these problems, we propose a label embedding joint cross-attention (LEJCA) model for multi-category text classification that uses both label relations and text information. Specifically, we first feed the characters of the text into a pre-trained language model to obtain a representation of the text. In addition, we propose a label embedding method based on graph representation to achieve multi-dimensional mining and quantitative description of label relations. Finally, we propose a joint cross-attention (JCA) model to integrate the text and label embedding representations. Experimental results on our self-constructed E-government document dataset show that LEJCA's micro-average F1 score is improved by 9.7% compared to the best baseline result.
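
For readers unfamiliar with the reported metric, the following is a minimal sketch of how a micro-average F1 score is commonly computed when each sample carries a set of labels: true positives, false positives, and false negatives are pooled over all samples and labels before precision and recall are taken. The function name micro_f1 and the toy labels are hypothetical and chosen only for illustration; the abstract does not describe the authors' exact evaluation protocol, so this should not be read as their implementation.

```python
def micro_f1(true_sets, pred_sets):
    """Micro-averaged F1 over samples whose labels are given as sets.

    Counts are pooled across every sample and label first, so frequent
    labels contribute more than rare ones.
    """
    tp = fp = fn = 0
    for true, pred in zip(true_sets, pred_sets):
        tp += len(true & pred)   # labels predicted and correct
        fp += len(pred - true)   # labels predicted but wrong
        fn += len(true - pred)   # labels missed by the prediction
    if tp == 0:
        return 0.0               # no correct prediction: F1 is defined as 0 here
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical labels, not taken from the paper's dataset).
y_true = [{"policy", "urgent"}, {"notice"}, {"policy", "routine"}]
y_pred = [{"policy", "routine"}, {"notice"}, {"policy"}]

score = micro_f1(y_true, y_pred)
print(f"micro-F1 = {score:.4f}")
```

In this toy case the pooled counts are 3 true positives, 1 false positive, and 2 false negatives, giving a precision of 0.75 and a recall of 0.60.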

Keywords