Proceedings of the XXth Conference of Open Innovations Association FRUCT (May 2021)

A Survey of Models for Constructing Text Features to Classify Texts in Natural Language

  • Ksenia Lagutina
  • Nadezhda Lagutina

DOI
https://doi.org/10.23919/FRUCT52173.2021.9435512
Journal volume & issue
Vol. 29, no. 1
pp. 222 – 233

Abstract

In this survey we systematize the state-of-the-art features that are used to model texts for text classification tasks: topical and sentiment classification, authorship attribution, style detection, etc. We classify text models into three categories: standard models that use popular features, linguistic models that apply complex linguistic features, and modern universal models that combine deep neural networks with text graphs or language models. For each category we describe particular models and their adaptations, note the most effective solutions, summarize advantages, disadvantages and limitations, and make suggestions for future research.

Keywords