Journal of Data Mining and Digital Humanities (Aug 2017)

Bioinformatics and Classical Literary Study

  • Pramit Chaudhuri,
  • Joseph P. Dexter

Journal volume & issue
Special Issue on Computer-Aided Processing of Intertextuality in Ancient Languages, Project presentations

Abstract

This paper describes the Quantitative Criticism Lab, a collaborative initiative among classicists, quantitative biologists, and computer scientists that applies ideas and methods drawn from the sciences to the study of literature. A core goal of the project is to use techniques from computational biology, natural language processing, and machine learning to investigate authorial style, intertextuality, and related phenomena of literary significance. As a case study in our approach, we review the use of sequence alignment, a common technique in genomics and computational linguistics, to detect intertextuality in Latin literature. Sequence alignment is distinguished by its ability to find inexact verbal similarities, which makes it well suited to identifying phonetic echoes in large corpora of Latin texts. Although especially suited to Latin, sequence alignment can in principle be extended to many other languages.
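
To make the idea of inexact matching concrete, the following is a minimal sketch of character-level local sequence alignment (the Smith-Waterman recurrence familiar from genomics). It is an illustration only, not the Quantitative Criticism Lab's implementation: the function name `local_alignment_score` and the scoring values (match 2, mismatch -1, gap -1) are assumptions chosen for the example.

```python
def local_alignment_score(a: str, b: str,
                          match: int = 2,
                          mismatch: int = -1,
                          gap: int = -1) -> int:
    """Return the best local alignment score between strings a and b.

    Hypothetical helper: the scoring values are illustrative assumptions.
    """
    prev = [0] * (len(b) + 1)  # scores for the previous row of the table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Extend the alignment diagonally with a match or a mismatch.
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    # Two spellings of the same phrase differ in one character but still score highly.
    print(local_alignment_score("arma virumque", "arma uirumque"))
    # Strings with no characters in common score zero.
    print(local_alignment_score("abc", "xyz"))
```

Because a single differing character only lowers the score slightly rather than breaking the match, phrases that are similar but not identical can still be ranked as likely echoes. A realistic system would also need to recover the aligned spans and handle large corpora efficiently, which this sketch does not attempt.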

Keywords