Bellaterra Journal of Teaching & Learning Language & Literature (Dec 2017)

Investigating the use of readability metrics to detect differences in written productions of learners: a corpus-based study

  • Paula Lissón

DOI
https://doi.org/10.5565/rev/jtl3.752
Journal volume & issue
Vol. 10, no. 4

Abstract

This paper examines the use of readability metrics as indices of learners' linguistic features in a written corpus of Spanish learners of English as an L2. Seventeen measures of readability are presented and computed for 200 samples of written argumentative essays extracted from the NOCE corpus (Díaz-Negrillo, 2007). Support Vector Machines (SVM) are used to identify which metrics best distinguish the productions of students enrolled in the first year of an English major from those of students in the second year. Metrics based on sentence length, number of sentences, and number of polysyllabic words are reported to be the most accurate for classifying learners' linguistic features.
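
As a rough illustration of the kind of surface features the abstract names (sentence length, number of sentences, number of polysyllabic words), the sketch below computes them for a short text. It is not the paper's implementation: the function names `count_syllables` and `surface_features`, the vowel-group syllable heuristic, and the three-syllable threshold for "polysyllabic" are all assumptions made for this example. Features like these could then be passed to a classifier such as an SVM.

```python
import re


def count_syllables(word: str) -> int:
    """Estimate syllables by counting runs of consecutive vowels.

    This is a crude heuristic (an assumption of this sketch), not a
    dictionary-based syllabifier.
    """
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def surface_features(text: str) -> dict:
    """Return simple sentence- and word-level counts for a text."""
    # Split on runs of sentence-final punctuation; drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Treat runs of letters as words.
    words = re.findall(r"[A-Za-z]+", text)
    # "Polysyllabic" is taken here to mean three or more estimated syllables.
    polysyllabic = [w for w in words if count_syllables(w) >= 3]
    n_sent = len(sentences)
    return {
        "n_sentences": n_sent,
        "n_words": len(words),
        "mean_sentence_length": len(words) / n_sent if n_sent else 0.0,
        "n_polysyllabic": len(polysyllabic),
    }


sample = "Readability is complicated. Cats sit."
features = surface_features(sample)
print(features)
```

The heuristic will miscount some English words (silent final "e", for instance), so a real study would likely rely on an established readability library or a pronunciation dictionary instead.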

Keywords