Journal of Data Mining and Digital Humanities (Jan 2019)

A Hackathon for Classical Tibetan

  • Orna Almogi,
  • Lena Dankin,
  • Nachum Dershowitz,
  • Lior Wolf

Journal volume & issue
Special Issue on Computer-Aided Processing of Intertextuality in Ancient Languages: Towards a Digital Ecosystem: NLP. Corpus infrastructure. Methods for Retrieving Texts and Computing Text Similarities

Abstract

We describe the course of a hackathon dedicated to developing linguistic tools for Tibetan Buddhist studies. Over five days, a group of seventeen scholars, scientists, and students developed and compared algorithms for intertextual alignment and text classification, along with basic language tools, including a stemmer and a word segmenter.

Keywords