Journal of Open Humanities Data (Oct 2021)
Named-Entity Dataset for Medieval Latin, Middle High German and Old Norse
Abstract
We present a dataset of named entities in three languages: Medieval Latin, Middle High German and Old Norse. The dataset, which contains proper nouns for persons and places, was originally created to extract characters from three related medieval texts. Because the annotations cover low-resource pre-modern languages, they may be valuable for building named-entity recognition tools for languages with little data and high linguistic variation.
Keywords