Труды Института системного программирования РАН (Proceedings of the Institute for System Programming of the RAS), Oct 2018
Keyterm extraction from microblogs' messages using Wikipedia
Abstract
This paper describes a method for extracting keyterms from microblog messages. The approach uses information obtained by analyzing the structure and content of Wikipedia. The algorithm computes a “keyphraseness” score for each term, i.e., an estimate of the probability that the term would be selected as a keyterm in a text. An experimental study showed that the proposed technique performs significantly better than comparable methods. To demonstrate a possible application, a prototype context-sensitive advertising system was implemented; it retrieves descriptions of goods relevant to the extracted keyterms from the Amazon online store. Several ways are also proposed in which information derived from Twitter messages may be used in auxiliary services.
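The abstract does not give the exact formula, so the following is only a minimal sketch of how a keyphraseness score might be computed. It assumes the common link-based estimate: the number of Wikipedia articles in which a term appears as a link anchor, divided by the number of articles in which it appears at all. The function names, the threshold, the n-gram limit, and the toy counts are all hypothetical and are not taken from the paper.

```python
def keyphraseness(term, link_counts, doc_counts):
    """Estimate P(term is a keyterm) as linked occurrences / all occurrences.

    Assumption: link_counts[term] is the number of articles where the term
    is a link anchor, doc_counts[term] the number of articles containing it.
    """
    total = doc_counts.get(term, 0)
    if total == 0:
        return 0.0  # unseen terms get no score
    return link_counts.get(term, 0) / total


def extract_keyterms(message, link_counts, doc_counts,
                     threshold=0.1, max_ngram=3):
    """Score every word n-gram of the message and keep those above threshold."""
    tokens = message.lower().split()
    scored = {}
    for n in range(1, max_ngram + 1):
        for i in range(len(tokens) - n + 1):
            term = " ".join(tokens[i:i + n])
            score = keyphraseness(term, link_counts, doc_counts)
            if score >= threshold:
                scored[term] = score
    # Highest-scoring terms first
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy counts standing in for statistics mined from Wikipedia
LINK_COUNTS = {"machine learning": 800, "python": 300, "the": 5}
DOC_COUNTS = {"machine learning": 1000, "python": 1500, "the": 1000000}

if __name__ == "__main__":
    tweet = "I love machine learning with python"
    for term, score in extract_keyterms(tweet, LINK_COUNTS, DOC_COUNTS):
        print(f"{term}: {score:.2f}")
```

In this sketch a frequent but rarely linked word such as "the" receives a score near zero, while a term that is usually linked when it appears scores high. A real system would also need tokenization and normalization suited to short, noisy microblog text.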