Journal of Language and Education (Dec 2024)
Synchronic and Diachronic Predictors of Socialness Ratings of Words
Abstract
Recent work has introduced and studied a new psycholinguistic concept, the socialness of a word. In particular, Diveica et al. (2022) present a dictionary of socialness ratings obtained by a survey method. The socialness rating reflects the social significance of a word. Unfortunately, the existing dictionary of word socialness ratings is relatively small. In this paper, we propose linear and neural network predictors of socialness ratings that take pre-trained fastText vectors as input. Spearman's correlation coefficient between human and machine-predicted socialness ratings reaches 0.869. The trained models made it possible to obtain socialness ratings for 2 million English words, as well as for a wide range of words in 43 other languages. An unexpected result is that the linear model provides a highly accurate estimate of the socialness ratings, which can hardly be improved further. Apparently, this is because the space of word vectors contains a distinguished direction responsible for meanings associated with socialness, driven by social factors that influence word representation and use. The article also presents a diachronic neural network predictor of concreteness ratings using word co-occurrence vectors as input data. It is shown that, using data for a single year from the large diachronic corpus Google Books Ngram, one can obtain accuracy comparable to that of synchronic estimates. We study some examples of words that are characterised by significant changes in socialness ratings over the past 150 years. It is concluded that changes in socialness ratings can serve as a marker of word meaning change.
Keywords