ITM Web of Conferences (Jan 2016)

The fast vocabulary-based algorithm for natural language word form analysis

  • Rozanov Alexey

DOI
https://doi.org/10.1051/itmconf/20160603013
Journal volume & issue
Vol. 6
p. 03013

Abstract

In Natural Language Processing, word form analysis, that is, identifying the part of speech and grammatical information of each word in the input text, is usually the very first level of text processing (or immediately follows splitting the text into words, where that task is non-trivial). Approaches that speed up word form analysis are therefore of significant interest. In this work, using [1] as a basis, we present an approach to the analysis of word forms for natural languages with postfix inflection, following the work done in [3]. We propose a representation of the postfix inflection rules associated with a natural language and an algorithm for word form analysis based on that representation. In conclusion, we provide benchmark data indicating an increase in speed compared to known analysis methods.
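
The abstract does not describe the rule representation or the algorithm itself, so the following is only a generic sketch of what vocabulary-based analysis with postfix inflection can look like, not the method proposed in the paper. It assumes a toy vocabulary of stems, each linked to an inflection paradigm that maps suffixes to grammatical tags. The names `STEMS`, `PARADIGMS` and `analyze`, the tag strings, and the English sample data are all hypothetical.

```python
# Hypothetical toy vocabulary: each stem lists the inflection paradigms it follows.
STEMS = {
    "walk": ["verb_regular"],
    "cat": ["noun_regular"],
}

# Hypothetical paradigms: each maps a suffix (postfix) to a grammatical tag.
PARADIGMS = {
    "verb_regular": {"": "VERB,base", "s": "VERB,3sg", "ed": "VERB,past", "ing": "VERB,gerund"},
    "noun_regular": {"": "NOUN,sg", "s": "NOUN,pl"},
}


def analyze(word):
    """Return every (stem, tag) analysis found by trying each stem/suffix split."""
    results = []
    # Try the longest candidate stem first, then progressively shorter ones.
    for i in range(len(word), 0, -1):
        stem, suffix = word[:i], word[i:]
        for paradigm in STEMS.get(stem, []):
            tag = PARADIGMS[paradigm].get(suffix)
            if tag is not None:
                results.append((stem, tag))
    return results


print(analyze("walked"))
print(analyze("cats"))
```

Each lookup here costs one dictionary probe per candidate split. A faster analyzer would typically replace the plain dictionaries with a more compact structure, such as a trie over stems or suffixes, but which structure the paper uses is not stated in the abstract.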