A Collection of Swedish Diachronic Word Embedding Models Trained on Historical Newspaper Data

Simon Hengchen; Nina Tahmasebi

doi:10.5334/johd.22

Journal of Open Humanities Data (Jan 2021)

Affiliations

Simon Hengchen: Språkbanken Text, Department of Swedish, University of Gothenburg
Nina Tahmasebi: Språkbanken Text, Department of Swedish, University of Gothenburg

DOI: https://doi.org/10.5334/johd.22
Journal volume & issue: Vol. 7, no. 0

Abstract

This paper describes the creation of several word embedding models based on a large collection of diachronic Swedish newspaper material available through Språkbanken Text, the Swedish language bank. This data was produced in the context of Språkbanken Text’s continued mission to collaborate with humanities and natural language processing (NLP) researchers and to provide freely available language resources for the development of state-of-the-art NLP methods and tools.

Published in Journal of Open Humanities Data

ISSN: 2059-481X (Online)
Publisher: Ubiquity Press
Country of publisher: United Kingdom
LCC subjects: General Works: History of scholarship and learning. The humanities; Language and Literature
Website: https://openhumanitiesdata.metajnl.com/
