Ibérica (Apr 2017)

SCAP-TT: Tagging and lemmatising Spanish tourism discourse, and beyond

  • Patrick Goethals,
  • Els Lefever,
  • Lieve Macken

Journal volume & issue
Vol. 33
pp. 279 – 288

Abstract

In this research note, we report on the first results of SCAP, the Spanish Corpus Annotation Project, applied to tourism discourse (SCAP_tur). In particular, we present and assess a new TreeTagger parameter set for Spanish (SCAP-TT), which has been trained for the part-of-speech tagging (POS-tagging) and lemmatisation of Spanish promotional tourism texts. Although SCAP-TT has been trained for specialised tourism discourse, we also show promising results for the annotation of other text genres, such as essays and literary texts.

Keywords