Journal of Open Humanities Data (Jun 2022)

Bearing a Bag-of-Tales: An Open Corpus of Annotated Folktales for Reproducible Research

  • Joshua Hagedorn
  • Sándor Darányi

DOI
https://doi.org/10.5334/johd.78
Journal volume & issue
Vol. 8

Abstract

Scholars have identified and articulated motifs in folktales and myths, and the computational identification and discovery of such motifs is an area of ongoing research. Progress in this area means meeting scientific requirements (that methods be comparable and replicable) and requirements for collaboration (that multi-disciplinary teams can reliably access data). Both depend on access to consistent reference datasets. Unfortunately, such datasets are not openly available in a format that supports their use in data science. Here we report work in progress toward closing this gap, having converted the Ashliman Folktexts collection into a public dataset of annotated tale texts. The data can be accessed at https://doi.org/10.5281/zenodo.6575263.

Keywords