Informatika (Oct 2016)

ALGORITHMS FOR IDENTIFICATION OF CUES WITH AUTHORS’ TEXT INSERTIONS IN BELARUSIAN ELECTRONIC BOOKS

  • Y. S. Hetsevich,
  • T. I. Okrut,
  • B. M. Lobanov

Journal volume & issue
Vol. 0, no. 1
pp. 68 – 76

Abstract

The main stages of algorithms for identifying characters' gender in Belarusian electronic texts are described. The algorithms rely on punctuation marking and on the detection of gender indicators, such as past tense verbs and nouns with gender attributes. Special dictionaries are developed for the indicators, which makes the algorithms more language-independent and allows dictionaries to be created for related languages. In testing, the harmonic mean measure reached 92.2% for masculine gender detection and 90.4% for feminine gender detection.
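
The two ingredients named in the abstract, punctuation marking and dictionary-based gender indicators, can be illustrated with a minimal sketch. It is not the authors' implementation: the tiny dictionary `GENDER_INDICATORS`, the functions `author_insertion`, `guess_gender` and `f_measure`, and the rule that an author's insertion follows a comma, exclamation mark or question mark plus a dash are all assumptions made for illustration. The last function assumes that the harmonic-mean score reported in the abstract is the usual F-measure of precision and recall, which the abstract does not state explicitly.

```python
import re
from collections import Counter

# Hypothetical toy dictionary of gender indicators (past tense verbs and
# nouns with gender attributes). The paper's real dictionaries are not shown.
GENDER_INDICATORS = {
    "сказаў": "m", "спытаў": "m", "хлопец": "m",
    "сказала": "f", "спытала": "f", "дзяўчына": "f",
}

# Assumed punctuation rule: the author's insertion starts after ',', '!' or '?'
# followed by a dash (U+2014 or U+2013) and runs until the next dash.
INSERTION_RE = re.compile(r"[,!?]\s*[\u2014\u2013]\s*([^\u2014\u2013]*)")


def author_insertion(cue):
    """Return the author's text inside a dialogue cue, or '' if none is found."""
    match = INSERTION_RE.search(cue)
    return match.group(1).strip() if match else ""


def guess_gender(cue):
    """Return 'm', 'f', or None by counting dictionary hits in the insertion."""
    words = re.findall(r"\w+", author_insertion(cue).lower())
    votes = Counter(GENDER_INDICATORS[w] for w in words if w in GENDER_INDICATORS)
    if votes["m"] == votes["f"]:
        return None  # no indicators found, or a tie
    return "m" if votes["m"] > votes["f"] else "f"


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (assumed evaluation metric)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    cue = "\u2014 Я прыйду заўтра, \u2014 сказала дзяўчына."
    print(author_insertion(cue))
    print(guess_gender(cue))
```

Keeping the indicators in a plain dictionary mirrors the point made in the abstract: adapting the sketch to a related language would mean swapping the dictionary rather than changing the code.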