Journal of Open Humanities Data (Mar 2020)

Dependency Treebanks of Ancient Greek Prose

  • Vanessa B. Gorman

DOI
https://doi.org/10.5334/johd.13
Journal volume & issue
Vol. 6, no. 1

Abstract


This dataset is a collection of dependency syntax trees of representative texts from ancient Greek prose authors (Aeschines, Antiphon, Appian, Athenaeus, Demosthenes, Dionysius of Halicarnassus, Herodotus, Josephus, Lysias, Plutarch, Polybius, Thucydides, and Xenophon), totaling more than 550,000 tokens to date. It was hand-annotated by one person, using the Arethusa program on the Perseids website. Original texts were obtained from the Perseus Digital Library, and some (as indicated) were computer pre-parsed at the Pedalion Project. A stable version of the database (2019-12-31) is archived on Zenodo (DOI: 10.5281/zenodo.3596076), and a continuously updated version is available on GitHub in XML format (https://vgorman1.github.io/). The repository can be used for pedagogical purposes and for research in linguistic analysis, corpus linguistics, stylistics, natural language processing, classification, and literary and historical analysis.

Keywords