Developing and implementing an English-Spanish literary parallel audio-textual corpus for data-driven ESL learning

Michael Lang; Xavier Gómez Guinovart

doi:10.1590/1678-460x2021370106

DELTA: Documentação de Estudos em Lingüística Teórica e Aplicada (Mar 2021)

DOI: https://doi.org/10.1590/1678-460x2021370106
Journal volume & issue: Vol. 37, no. 1

Abstract

The purpose of this paper is to present the LITTERA corpus, an English-Spanish literary parallel speech corpus created for language learning, and to sketch out a few pedagogical applications for the study of English phonology by Spanish-speaking learners. The corpus is composed of 25 literary texts that have been aligned with their Spanish translations and are accompanied by audio from the corresponding audiobooks. In this article, we detail its conception, composition and features at length, and provide a few examples of how LITTERA can be applied in language learning, particularly in oral comprehension and speech production.

Published in DELTA: Documentação de Estudos em Lingüística Teórica e Aplicada

ISSN: 0102-4450 (Print); 1678-460X (Online)
Publisher: Pontifícia Universidade Católica de São Paulo
Country of publisher: Brazil
LCC subjects: Language and Literature: Philology. Linguistics
Website: http://www.scielo.br/scielo.php?pid=0102-4450&script=sci_serial
