Proceedings of the XXth Conference of Open Innovations Association FRUCT (Apr 2022)
Topical Text Classification of Russian News: a Comparison of BERT and Standard Models
Abstract
This paper addresses single-label topical classification of Russian news. The author compares BERT features with standard character-, word-, and structure-level features as text models. Experiments on OpenCorpora show that the BERT model outperforms the standard ones and achieves good classification quality on a small dataset of long news texts. A comparison with state-of-the-art research suggests that BERT can serve as a baseline for future work on the analysis of Russian texts.
Keywords