Philologia Hispalensis (Dec 2019)

Lemmatization of CODEA data and its use in quantitative analyses of the eñe and the silent hache

  • Hiroto Ueda

DOI
https://doi.org/10.12795/PH.2019.v33.i01.10
Journal volume & issue
Vol. 33, no. 1
pp. 161 – 178

Abstract

In this article we explain a method for lemmatizing old Spanish documents using data from the «CODEA» corpus (Corpus de Documentos Españoles Anteriores a 1800; Sánchez-Prieto et al., 2009) and the analysis tool «LYNEAL» (Letras y Números en Análisis Lingüísticos). Our goal is to present the simplest possible lemmatization method, one that is easy to perform and highly accurate. We then present two examples of its use in the historical study of Spanish spelling: the eñe and the silent hache.

Keywords