Современные информационные технологии и IT-образование (Dec 2019)
The Improved Algorithm for Calculation of the Contextual Words Meaning in the Text
Abstract
This paper considers modifications of the context-calculation algorithm published in [1] and proposes a new solution for calculating word and document context. To improve context determination, the distances between the words W1 and W2 are taken into account, which is especially important when the number of W2 words is greater than 1. The results of an investigation of the two formulas are presented. To compare their efficiency, calculations were carried out for 100 texts. Distributions of the average value and the dispersion of C were built and compared with the model data from [1]. The weight function was optimized, and its versions were compared by the value of s/Caver. The dispersion of C was calculated for every version of the weight function. It turned out to be rather large because of the wide variation in text size and in the number of W2 and W3 words, as well as the wide distribution of words across the text. An example of the distribution of L for W2 = "компьютер" ("computer") is given.
Keywords