Proceedings of the Institute for System Programming of the RAS (Труды Института системного программирования РАН), Oct 2018
Use of Multiple Features for Extracting Topics from News Clusters
Abstract
In this paper we consider a method for extracting alternative names of a concept or a named entity mentioned in a news cluster. The method relies on the structural organization of news clusters and compares the various contexts in which words occur. These word contexts serve as the basis for multiword expression extraction and main entity detection. Cluster processing yields groups of near-synonyms, and the main synonym of each group is identified.
Keywords