Caracteres: Estudios Culturales y Críticos de la Esfera Digital (Nov 2016)

El análisis estilométrico aplicado a la literatura española: las novelas policiacas e históricas [Stylometric analysis applied to Spanish literature: crime and historical novels]

  • José Manuel Fradejas Rueda

Journal volume & issue
Vol. 5, no. 2
pp. 196 – 245

Abstract

This paper demonstrates that a computer can determine the authorship of a text. To this end, we compiled a corpus of 122 contemporary novels written in Spanish (69 historical novels, 50 crime novels, and 3 westerns). We then analysed the corpus with stylo, a stylometric analysis package written in the R programming language, applying the simplest of the several types of analysis it offers: cluster analysis. Using only the 100 most frequent words (MFW), the computer grouped each author's works together and assigned those published under a pseudonym to their true author, without a single error.
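
The general idea behind a most-frequent-word cluster analysis can be sketched in a few lines of Python. This is not the stylo package and not the paper's actual procedure: the function names (`mfw_profiles`, `zscores`, `delta`, `cluster`) are hypothetical, the distance is assumed to be a Burrows-style Delta (the abstract does not name the measure used), and average linkage is a simplification chosen here for brevity.

```python
from collections import Counter
import math
import re


def tokenize(text):
    # Lowercase alphabetic tokens; the pattern also matches accented letters.
    return re.findall(r"[^\W\d_]+", text.lower())


def mfw_profiles(texts, n_mfw=100):
    """Return the corpus-wide MFW list and each text's relative frequencies."""
    counts = {name: Counter(tokenize(body)) for name, body in texts.items()}
    total = Counter()
    for c in counts.values():
        total.update(c)
    mfw = [word for word, _ in total.most_common(n_mfw)]
    profiles = {}
    for name, c in counts.items():
        size = sum(c.values()) or 1
        profiles[name] = [c[word] / size for word in mfw]
    return mfw, profiles


def zscores(profiles):
    """Standardise each word's frequency across all texts."""
    names = list(profiles)
    dims = len(profiles[names[0]])
    out = {name: [] for name in names}
    for j in range(dims):
        column = [profiles[name][j] for name in names]
        mean = sum(column) / len(column)
        sd = math.sqrt(sum((x - mean) ** 2 for x in column) / len(column))
        for name in names:
            out[name].append((profiles[name][j] - mean) / sd if sd else 0.0)
    return out


def delta(a, b):
    # Assumed measure: mean absolute difference of z-scores (Burrows-style).
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cluster(z, k):
    """Agglomerative clustering with average linkage, stopping at k groups."""
    clusters = [[name] for name in z]

    def linkage(c1, c2):
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(delta(z[a], z[b]) for a, b in pairs) / len(pairs)

    while len(clusters) > k:
        candidates = [(i, j) for i in range(len(clusters))
                      for j in range(i + 1, len(clusters))]
        i, j = min(candidates,
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        merged = clusters.pop(j)  # j > i, so index i stays valid
        clusters[i].extend(merged)
    return clusters
```

Given a dictionary mapping hypothetical file names to full texts, `cluster(zscores(mfw_profiles(texts, 100)[1]), k)` returns `k` groups of names. If the method behaves as the abstract reports, texts by the same author should tend to fall into the same group.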

Keywords