Journal of Open Humanities Data (Nov 2021)

PapyGreek Treebanks: A Dataset of Linguistically Annotated Greek Documentary Papyri

  • Marja Vierros
  • Erik Henriksson

DOI
https://doi.org/10.5334/johd.55
Journal volume & issue
Vol. 7

Abstract

The PapyGreek Treebanks dataset contains documentary texts written in Postclassical Greek (ca. 300 BCE–700 CE), morphosyntactically annotated according to Dependency Grammar. The source of the texts is the Duke Databank of Documentary Papyri (DDbDP), which preserves the modern editorial treatment of the documents in TEI EpiDoc XML encoding. Aiming to expose linguistic variation in the DDbDP, we have annotated two versions of a selection of documents: the plain transcription and an editorially corrected version. The dataset also comprises metadata about the documents’ dating and provenance, text type, and the persons involved. Furthermore, it facilitates linguistic research on these texts.

Keywords