Proceedings of the International Conference on Applied Innovations in IT (Apr 2021)

Models and Algorithms for Automatic Labelling of Unstructured Texts (Text Tagging)

  • Gyuzel Shakhmametova
  • Ilshat Ishmukhametov

DOI
https://doi.org/10.25673/36586
Journal volume & issue
Vol. 9, no. 1
pp. 69 – 76

Abstract

The article addresses the automatic labelling (tagging) of texts as a way to improve the efficiency of processing unstructured text data. An overview of existing software products for this problem shows the need for a dedicated solution specialized in processing Russian-language texts. Label assignment is treated mathematically as a multilabel classification problem, and the corresponding mathematical models are analysed and described. On this basis, models, algorithms, and a software product for automatically assigning labels to texts have been developed. Numerical experiments showed the universality of the method and its applicability in both non-specialized and specialized fields, in particular for processing medical documents.
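
To make the multilabel formulation concrete, the following is a minimal sketch of one common decomposition, binary relevance, in which one independent yes/no classifier is trained per label and a text receives every label whose classifier fires. It is not the authors' model: the class name BinaryRelevanceTagger, the word-frequency scoring rule, the whitespace tokenizer, and the toy English training texts are all assumptions made for illustration. Real Russian-language text would need proper tokenization and lemmatization.

```python
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return text.lower().split()


class BinaryRelevanceTagger:
    """Illustrative multilabel tagger: one binary scorer per label."""

    def fit(self, texts, label_sets):
        self.labels = sorted(set().union(*label_sets))
        self.weights = {}
        for label in self.labels:
            pos, neg = Counter(), Counter()
            n_pos = n_neg = 0
            for text, labels in zip(texts, label_sets):
                words = set(tokenize(text))
                if label in labels:
                    pos.update(words)
                    n_pos += 1
                else:
                    neg.update(words)
                    n_neg += 1
            # Word weight = share of positive documents containing the word
            # minus share of negative documents containing it.
            self.weights[label] = {
                w: pos[w] / max(n_pos, 1) - neg[w] / max(n_neg, 1)
                for w in set(pos) | set(neg)
            }
        return self

    def predict(self, text):
        words = set(tokenize(text))
        # A label is assigned when its summed word weights are positive;
        # words unseen in training contribute nothing.
        return {
            label
            for label in self.labels
            if sum(self.weights[label].get(w, 0.0) for w in words) > 0
        }


# Toy training set (invented for this sketch); each text may carry several labels.
train_texts = [
    "patient fever cough treatment",
    "football match goal team",
    "athlete injury knee treatment",
    "parliament vote budget law",
]
train_labels = [
    {"medicine"},
    {"sport"},
    {"medicine", "sport"},
    {"politics"},
]

tagger = BinaryRelevanceTagger().fit(train_texts, train_labels)
tags = tagger.predict("knee injury treatment for a football team")
print(sorted(tags))  # ['medicine', 'sport']
```

The key point of the sketch is that the output is a set of labels rather than a single class, which is what distinguishes multilabel classification from ordinary multiclass classification. Binary relevance ignores correlations between labels; other decompositions handle them differently.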

Keywords