Learner Corpora without Error Tagging

Rastelli, Stefano

Linguistik Online (Jan 2009)

Journal volume & issue: Vol. 38, no. 2
pp. 57 – 66

Abstract

The article explores the possibility of adopting a form-to-function perspective when annotating learner corpora in order to gain deeper insight into systematic features of interlanguage. A split between forms and functions (or categories) is desirable in order to avoid the "comparative fallacy" and because, especially in basic varieties, forms may precede functions (e.g., what resembles a "noun" might have a different function, or a function may show up in unexpected forms). In the computer-aided error analysis tradition, all items produced by learners are traced to a grid of error tags based on the categories of the target language. By contrast, we believe it is possible to record and make retrievable both words and sequences of characters independently of their functional-grammatical label in the target language. For this purpose, at the University of Pavia we adapted a probabilistic POS tagger designed for L1 for use on L2 data. Despite the criticism that this operation may raise, we found that it is better to work with "virtual categories" than with errors. The article outlines the theoretical background of the project and presents some examples that illustrate the potential of SLA-oriented (non-error-based) tagging.

Published in Linguistik Online

ISSN: 1615-3014 (Online)
Publisher: Bern Open Publishing
Country of publisher: Switzerland
LCC subjects: Language and Literature: Philology. Linguistics: Computational linguistics. Natural language processing; Language and Literature: Philology. Linguistics: Language. Linguistic theory. Comparative grammar
Website: https://bop.unibe.ch/linguistik-online/
