IJCoL (Dec 2020)

Embeddings-based detection of word use variation in Italian newspapers

  • Michele Cafagna,
  • Lorenzo De Mattei,
  • Malvina Nissim

DOI
https://doi.org/10.4000/ijcol.703
Journal volume & issue
Vol. 6, no. 2
pp. 9–22

Abstract

We study how words are used differently in two Italian newspapers at opposite ends of the political spectrum by training embeddings on one newspaper’s corpus, updating the weights on the second one, and observing the resulting vector shifts. We run two types of analysis: one top-down, based on a preselection of words that are frequent in both newspapers, and one bottom-up, based on a combination of the observed shifts with relative and absolute frequency. The analysis is specific to this data, but the method can serve as a blueprint for similar studies.
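
The shift-observation step described in the abstract can be illustrated with a minimal sketch. It assumes that each word already has one vector from the model trained on the first newspaper and one from the same model after its weights were updated on the second. The abstract does not state how shifts are measured or which frequency thresholds are used, so the cosine-distance measure, the `min_count` filter, the function names, and the toy vectors and counts below are all hypothetical choices made for illustration, not the authors' implementation.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_by_shift(before, after, counts_a, counts_b, min_count=1):
    """Rank words by how far their vector moved after the update.

    Assumption: shift = 1 - cosine similarity between the vector before
    and after updating. Words seen fewer than `min_count` times in either
    corpus are skipped (a hypothetical absolute-frequency filter).
    """
    shifts = []
    for word, vec_before in before.items():
        if word not in after:
            continue
        if counts_a.get(word, 0) < min_count or counts_b.get(word, 0) < min_count:
            continue
        shift = 1.0 - cosine_similarity(vec_before, after[word])
        shifts.append((word, shift))
    # Largest shift first.
    return sorted(shifts, key=lambda item: item[1], reverse=True)


# Toy, made-up 2-dimensional vectors and counts (not data from the paper).
before = {"tasse": [1.0, 0.0], "casa": [0.0, 1.0], "raro": [1.0, 1.0]}
after = {"tasse": [0.0, 1.0], "casa": [0.0, 2.0], "raro": [-1.0, -1.0]}
counts_a = {"tasse": 50, "casa": 80, "raro": 1}
counts_b = {"tasse": 40, "casa": 90, "raro": 2}

ranked = rank_by_shift(before, after, counts_a, counts_b, min_count=5)
for word, shift in ranked:
    print(f"{word}\t{shift:.2f}")
```

In this toy setup the rare word is filtered out before ranking, which mirrors the idea of combining shifts with frequency information: a large shift on a word with very few occurrences is more likely to reflect noise than a difference in usage.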