Neural Network Algorithm for Detection of New Word Meanings Denoting Named Entities

Vladimir V. Bochkarev; Stanislav V. Khristoforov; Anna V. Shevlyakova; Valery D. Solovyev

doi:10.1109/access.2022.3186681

IEEE Access (Jan 2022)

Affiliations

Vladimir V. Bochkarev: Institute of Physics, Kazan Federal University, Kazan, Russia
Stanislav V. Khristoforov: Institute of Physics, Kazan Federal University, Kazan, Russia
Anna V. Shevlyakova: Institute of Philology and Intercultural Communication, Kazan Federal University, Kazan, Russia
Valery D. Solovyev: Institute of Philology and Intercultural Communication, Kazan Federal University, Kazan, Russia

DOI: https://doi.org/10.1109/access.2022.3186681
Journal volume & issue: Vol. 10, pp. 68499 – 68512

Abstract

Lexical semantic change detection has developed rapidly in recent years. Existing algorithms for lexical semantic change detection face difficulties when applied to words denoting named entities. This paper proposes a method for identifying words in a large corpus that started being used as named entities, and for dating the first usage of such a word as a proper name. To solve this problem, we first present an algorithm that detects words denoting named entities in a large corpus. The recognizer is based on an analysis of co-occurrences with the most frequent words and was trained on data from the English subcorpus of the Google Books Ngram corpus. It achieves a named entity recognition accuracy of 98.44% on the test sample. Second, we test whether the trained recognizer can be applied to diachronic data. The analysed cases show that the recognizer, although initially trained on total bigram frequencies over a long time interval, provides stable results on annual frequency values, at least for frequent words. This can make the recognizer a good tool for language evolution studies, especially for detecting new meanings of words. The analysed cases also show that the proposed method reveals new word meanings associated with named entities and detects genericized meanings of words that were earlier used as proper names.
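
The abstract describes the recognizer's input only as co-occurrences with the most frequent words, computed from bigram frequencies. A minimal Python sketch of one way such a feature vector could be built is given here for illustration. The function name `cooccurrence_features`, the left/right split, the normalization to relative frequencies, and the toy counts are all assumptions, not details from the paper; the neural network that would be trained on such vectors is omitted.

```python
def cooccurrence_features(word, bigram_counts, context_words):
    """Build a co-occurrence vector for `word` (hypothetical helper).

    bigram_counts maps (first_word, second_word) -> frequency.
    context_words is a fixed list of the most frequent words.
    """
    # Counts of bigrams where a context word precedes the target word.
    left = [bigram_counts.get((c, word), 0) for c in context_words]
    # Counts of bigrams where a context word follows the target word.
    right = [bigram_counts.get((word, c), 0) for c in context_words]
    vec = left + right
    total = sum(vec)
    # Normalize to relative frequencies; all zeros if no co-occurrences exist.
    return [v / total for v in vec] if total else [0.0] * len(vec)


# Toy data, invented for illustration (not taken from the corpus).
context_words = ["the", "of", "and", "is"]
bigram_counts = {
    ("the", "apple"): 30,
    ("apple", "and"): 10,
    ("apple", "is"): 10,
}
features = cooccurrence_features("apple", bigram_counts, context_words)
```

Under this assumed setup, the same function could be applied to total counts over a long interval for training and to per-year counts for a single word, which is the kind of diachronic use the abstract tests.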

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
