Embeddings-based detection of word use variation in Italian newspapers

Michele Cafagna; Lorenzo De Mattei; Malvina Nissim

doi:10.4000/ijcol.703

IJCoL (Dec 2020)

Journal volume & issue: Vol. 6, no. 2
pp. 9 – 22

Abstract

We study how words are used differently in two Italian newspapers at opposite ends of the political spectrum by training embeddings on one newspaper’s corpus, updating the weights on the second one, and observing vector shifts. We run two types of analysis: one top-down, based on a preselection of frequent words in both newspapers, and one bottom-up, based on a combination of the observed shifts and relative and absolute frequency. The analysis is specific to this data, but the method can serve as a blueprint for similar studies.
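
As a rough illustration of the "observing vector shifts" step, the sketch below ranks words by how far their vectors move between the embeddings trained on the first corpus and the same embeddings after updating on the second. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the toy two-dimensional vectors, and the choice of cosine distance as the shift measure are all hypothetical.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def rank_by_shift(before, after):
    """Rank words shared by both spaces from largest to smallest shift."""
    shared = before.keys() & after.keys()
    shifts = {w: cosine_distance(before[w], after[w]) for w in shared}
    return sorted(shifts.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy vectors: "before" stands for the embeddings trained on
# the first newspaper, "after" for the same embeddings once updated on the
# second newspaper. Real vectors would have hundreds of dimensions.
before = {"tasse": [1.0, 0.0], "casa": [0.0, 1.0], "governo": [1.0, 1.0]}
after = {"tasse": [0.0, 1.0], "casa": [0.0, 2.0], "governo": [1.0, 0.5]}

ranking = rank_by_shift(before, after)
for word, shift in ranking:
    print(f"{word}\t{shift:.4f}")
```

Cosine distance ignores vector length, so a word whose vector is only rescaled (here "casa") registers no shift. A bottom-up analysis of the kind the abstract describes would additionally combine such scores with relative and absolute frequency.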

Published in IJCoL

ISSN: 2499-4553 (Online)
Publisher: Accademia University Press
Country of publisher: Italy
LCC subjects: Social Sciences; Language and Literature: Philology. Linguistics: Computational linguistics. Natural language processing
Website: https://journals.openedition.org/ijcol
