Journal of Historical Network Research (Oct 2022)
Recreating the Network of Early Modern Natural Philosophy: A Mono- and Multilingual Text Data Vectorization Method
Abstract
How can one create a network representation of a book corpus that spans more than two hundred years? In this paper, we present a method based on text data vectorization for a complex and multifaceted network representation of an early modern corpus of 239 natural philosophy textbooks published in Latin, French, and English. We use unsupervised methods (namely, topic modeling, term frequency–inverse document frequency (TF-IDF), and multilingual word embeddings) to represent the broader features of this corpus, such as its homogeneity in style and linguistic usage, both among works written in the same language and across multiple languages. We call this the ‘textual dimension.’ We also use a collocate analysis of specific keywords to explore how certain concepts were understood, reshaped, and disseminated in the corpus. We call this the ‘semantic dimension.’ Each of these two dimensions provides a different way of correlating the books via text data vectorization and of representing them as a network. Since these dimensions are complex and multifaceted, the network we construct for each of them is a multiplex, composed of several layer-graphs. Furthermore, using existing bio-bibliographical information, this research provides the grounds for expanding the described network representation into a third multiplex, one that explores some of the social features of the authors in question.
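As a rough illustration of how text data vectorization can yield one layer-graph of such a multiplex, the following minimal Python sketch computes TF-IDF vectors for a toy corpus and links two books whenever their cosine similarity reaches a threshold. It is not the implementation used in this paper: the three sample texts, the whitespace tokenizer, the threshold of 0.1, and all function names are assumptions made for the example.

```python
import math
from collections import Counter
from itertools import combinations


def tfidf_vectors(docs):
    """Map each document id to a sparse TF-IDF vector (term -> weight)."""
    # Naive whitespace tokenization; a real corpus would need lemmatization.
    tokenized = {doc_id: text.lower().split() for doc_id, text in docs.items()}
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter()
    for tokens in tokenized.values():
        doc_freq.update(set(tokens))
    vectors = {}
    for doc_id, tokens in tokenized.items():
        counts = Counter(tokens)
        vectors[doc_id] = {
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_layer(vectors, threshold):
    """Return weighted edges (a, b, similarity) for pairs at or above threshold."""
    edges = []
    for a, b in combinations(sorted(vectors), 2):
        score = cosine(vectors[a], vectors[b])
        if score >= threshold:
            edges.append((a, b, score))
    return edges


# Hypothetical toy corpus standing in for the textbooks.
toy_docs = {
    "A": "motion of bodies in the void",
    "B": "motion of bodies and matter",
    "C": "the soul and its faculties",
}

vectors = tfidf_vectors(toy_docs)
# A multiplex is represented here as a dict of named layers; only the
# TF-IDF layer is built, and further layers would be added as extra keys.
layers = {"tfidf": similarity_layer(vectors, threshold=0.1)}

for a, b, score in layers["tfidf"]:
    print(f"{a} -- {b} (similarity {score:.3f})")
```

In this sketch, layers derived from topic modeling or multilingual word embeddings would share the same node set (the books) and differ only in how the edge weights are computed, which is what makes the structure a multiplex rather than a single graph.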
Keywords