Big Data & Society (Dec 2015)
The fictionality of topic modeling: Machine reading Anthony Trollope's Barsetshire series
Abstract
This essay describes how using unsupervised topic modeling (specifically the latent Dirichlet allocation topic modeling algorithm in MALLET) on relatively small corpora can help scholars of literature circumvent the limitations of some existing theories of the novel. Using an example drawn from work on Victorian novelist Anthony Trollope's Barsetshire series, it argues that unsupervised topic modeling's counterfactual and retrospective reconstruction of the topics out of which a given set of novels has been created allows for a denaturalizing and unfamiliar (though crucially not “objective” or “unbiased”) view. In other words, topic models are fictions, and scholars of literature should consider reading them as such. Drawing on one aspect of Stephen Ramsay's idea of algorithmic criticism, the essay emphasizes the continuities between “big data” methods and longer-standing methods of literary study.