Methodos (Oct 2024)
Finding Relatedness: pathways for detecting textual relatedness in the medieval scholastic corpus
Abstract
To show why historical editions should be prepared first as textual data, with presentation (whether in print or on the web) left as a secondary, downstream task, this article identifies research benefits that computational analysis makes possible once such a corpus of textual data is at hand. Focusing on the deep intertextuality characteristic of the medieval scholastic corpus, it reviews three distinct methods for detecting different forms of textual relatedness within that corpus: n-gram intersections, document embeddings, and convolution. In each case, special attention is given to how a domain-specific knowledge graph helps us both prepare the corpus properly for analysis and visualize the results in ways that enhance research. These results include observing trends in citation practices across different genres and sub-genres of the corpus, automatically grouping questions by similarity, and detecting sustained and uncited textual re-use.
Keywords