Xi'an Gongcheng Daxue xuebao (Feb 2022)

Word vector text representation model based on feature weight

  • JIANG Yanjie,
  • LI Yunhong,
  • SU Xueping,
  • ZHANG Leitao,
  • JIA Kaili,
  • CHEN Jinni

DOI
https://doi.org/10.13338/j.issn.1674-649x.2022.01.015
Journal volume & issue
Vol. 36, no. 1
pp. 108 – 114

Abstract

Traditional text representation methods express text information imprecisely and produce high-dimensional, sparse vectors. To address this, a word vector text representation model based on feature weights is proposed. Word vectors are obtained from the GloVe model and then combined with the TF-IDF and N-Gram models. The resulting representation takes the global information of the text into account and avoids the high-dimensional sparsity of traditional representations. It also captures local information such as text semantics and word order more effectively, which improves the expressiveness of the text features. In experiments on the 20NewsGroup and 5AbstractsGroup datasets, classification accuracy reached 85.93% and 87.02%, respectively, verifying the effectiveness of the text representation model.
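
The abstract does not give the exact weighting formula, so the following Python sketch is only one plausible reading of the combination it describes: each n-gram is mapped to the mean of its words' vectors, and a document is the TF-IDF-weighted average of those n-gram vectors, concatenated over n = 1 and n = 2. The toy `EMBEDDINGS` table (made-up values standing in for pretrained GloVe vectors), the smoothed IDF variant, and all function names are assumptions for illustration, not the authors' implementation.

```python
import math
from collections import Counter

# Toy 3-dimensional vectors standing in for pretrained GloVe embeddings
# (hypothetical values, chosen only for illustration).
EMBEDDINGS = {
    "stock": [0.9, 0.1, 0.0],
    "market": [0.8, 0.2, 0.1],
    "falls": [0.1, 0.7, 0.2],
    "team": [0.0, 0.2, 0.9],
    "wins": [0.1, 0.6, 0.8],
    "match": [0.0, 0.3, 0.9],
}
DIM = 3


def ngrams(tokens, n):
    """Return the contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf_weights(docs_terms):
    """Compute a TF-IDF weight per term for each document (smoothed IDF, an assumption)."""
    n_docs = len(docs_terms)
    doc_freq = Counter(term for terms in docs_terms for term in set(terms))
    weights = []
    for terms in docs_terms:
        counts = Counter(terms)
        total = len(terms)
        weights.append({
            term: (count / total)
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return weights


def term_vector(term):
    """Average the word vectors of an n-gram's words; unknown words are skipped."""
    vecs = [EMBEDDINGS[w] for w in term if w in EMBEDDINGS]
    if not vecs:
        return [0.0] * DIM
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def document_vector(weights):
    """Return the TF-IDF-weighted average of the term vectors of one document."""
    total = sum(weights.values())
    if total == 0:
        return [0.0] * DIM
    acc = [0.0] * DIM
    for term, w in weights.items():
        vec = term_vector(term)
        for i in range(DIM):
            acc[i] += w * vec[i]
    return [x / total for x in acc]


def represent(corpus, max_n=2):
    """Concatenate the weighted-average vectors for n = 1..max_n, per document."""
    tokenized = [doc.lower().split() for doc in corpus]
    parts = []
    for n in range(1, max_n + 1):
        weights = tfidf_weights([ngrams(toks, n) for toks in tokenized])
        parts.append([document_vector(w) for w in weights])
    return [sum((part[d] for part in parts), []) for d in range(len(corpus))]
```

Calling `represent(["stock market falls", "team wins match"])` returns one dense vector of length `DIM * max_n` per document. The dimensionality depends on the embedding size rather than the vocabulary size, which is how a representation of this kind avoids the sparsity of a bag-of-words vector, and the bigram part is what carries the word-order information.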

Keywords