Journal of Data Mining and Digital Humanities (Dec 2020)

Stylo visualisations of Middle English documents

  • Martti Mäkinen

Journal volume & issue
Vol. Special issue on Visualisations in Historical Linguistics

Abstract

Automated approaches to identifying the authorship of a text have become commonplace in stylometric studies. The current article applies an unsupervised stylometric approach to Middle English documents using the script Stylo in R, in an attempt to distinguish between texts from different dialectal areas. The approach is based on the distribution of character 3-grams generated from the texts of the corpus of Middle English Local Documents (MELD). The article adopts the middle ground in the study of Middle English spelling variation, between the concept of relational linguistic space and the real linguistic continuum of medieval England. Stylo can distinguish between Middle English dialects by using the less frequent character 3-grams.
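
As an illustration of the feature type named in the abstract, the following minimal Python sketch extracts overlapping character 3-grams from a text and computes their relative frequencies. It is not the Stylo script itself and does not reproduce the article's pipeline; the function names and the sample string are hypothetical and chosen only for this example.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Return all overlapping character n-grams of a text, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def relative_frequencies(text, n=3):
    """Map each character n-gram to its share of all n-grams in the text."""
    grams = char_ngrams(text, n)
    counts = Counter(grams)
    total = len(grams)
    return {gram: count / total for gram, count in counts.items()}


# Invented sample string for illustration only; it is not taken from MELD.
sample = "knowe all men"
freqs = relative_frequencies(sample)

# List the 3-grams from most to least frequent.
for gram, freq in sorted(freqs.items(), key=lambda item: -item[1]):
    print(repr(gram), round(freq, 3))
```

A frequency table of this kind, computed per document, is the sort of input on which an unsupervised comparison of texts can operate; restricting it to the less frequent 3-grams would be a further filtering step not shown here.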

Keywords