IJCoL (Dec 2020)
Embeddings-based detection of word use variation in Italian newspapers
Abstract
We study how words are used differently in two Italian newspapers at opposite ends of the political spectrum by training embeddings on one newspaper’s corpus, updating the weights on the second newspaper’s corpus, and observing how the word vectors shift. We run two types of analysis: one top-down, based on a preselection of words that are frequent in both newspapers, and one bottom-up, which combines the observed shifts with relative and absolute word frequency. The analysis is specific to this data, but the method can serve as a blueprint for similar studies.
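As a rough illustration of the "observing vector shifts" step, the sketch below ranks words by how far their vector moves between the embeddings trained on the first corpus and the embeddings after updating on the second. It rests on assumptions that the abstract does not confirm: cosine distance is used as the shift measure, the function names `cosine_distance` and `rank_by_shift` are hypothetical, and the two-dimensional vectors are toy values invented for the example, not results from the study.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def rank_by_shift(before, after):
    """Rank words shared by both spaces, largest shift first.

    `before` maps each word to its vector after training on the first corpus;
    `after` maps each word to its vector after updating on the second corpus.
    Cosine distance as the shift measure is an assumption of this sketch.
    """
    shared = before.keys() & after.keys()
    shifts = {w: cosine_distance(before[w], after[w]) for w in shared}
    return sorted(shifts.items(), key=lambda item: item[1], reverse=True)


# Toy vectors, invented for illustration only (not data from the study).
before = {"governo": [1.0, 0.0], "casa": [0.0, 1.0]}
after = {"governo": [0.0, 1.0], "casa": [0.0, 2.0]}

ranking = rank_by_shift(before, after)
for word, shift in ranking:
    print(f"{word}\t{shift:.3f}")
```

In this toy input, the vector for "governo" rotates while the vector for "casa" only changes length, so only the former registers as a shift under cosine distance. A bottom-up analysis of the kind described above would additionally filter or weight such a ranking by word frequency.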