Optimizing Automatic Speech Recognition for Low-Proficient Non-Native Speakers

Catia Cucchiarini; Helmer Strik; Joost van Doremalen

doi:10.1155/2010/973954

EURASIP Journal on Audio, Speech, and Music Processing (Jan 2010)

DOI: https://doi.org/10.1155/2010/973954
Journal volume & issue: Vol. 2010

Abstract

Computer-Assisted Language Learning (CALL) applications for improving the oral skills of low-proficient learners have to cope with non-native speech that is particularly challenging. Since unconstrained non-native ASR is still problematic, a possible solution is to elicit constrained responses from the learners. In this paper, we describe experiments aimed at selecting utterances from lists of responses. The first experiment, on utterance selection, indicates that the decoding process can be improved by optimizing the language model and the acoustic models, thus reducing the utterance error rate from 29–26% to 10–8%. Since giving feedback on incorrectly recognized utterances is confusing, we verify the correctness of the utterance before providing feedback. The results of the second experiment, on utterance verification, indicate that combining duration-related features with a likelihood ratio (LR) yields an equal error rate (EER) of 10.3%, which is significantly better than the EER for the other measures in isolation.
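As a rough illustration of the equal error rate (EER) reported above, the sketch below sweeps a decision threshold over verification scores and returns the error rate at the point where false acceptances and false rejections are closest to equal. The function name `equal_error_rate` and the scores are hypothetical and are not taken from the paper, which does not state here how its EER was computed; higher scores are assumed to mean "more likely correctly recognized".

```python
def equal_error_rate(correct_scores, incorrect_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    correct_scores:   scores of utterances that were recognized correctly
    incorrect_scores: scores of utterances that were misrecognized
    An utterance is accepted when its score is at or above the threshold.
    """
    thresholds = sorted(set(correct_scores) | set(incorrect_scores))
    best_gap, best_eer = None, None
    for t in thresholds:
        # False rejection: a correctly recognized utterance scores below t.
        frr = sum(s < t for s in correct_scores) / len(correct_scores)
        # False acceptance: a misrecognized utterance scores at or above t.
        far = sum(s >= t for s in incorrect_scores) / len(incorrect_scores)
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest.
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer


# Made-up scores, for illustration only.
correct = [0.9, 0.8, 0.7, 0.6, 0.2]
incorrect = [0.1, 0.3, 0.4, 0.5, 0.75]
eer = equal_error_rate(correct, incorrect)
print(f"EER: {eer:.1%}")
```

With only a handful of scores the two error rates may never be exactly equal, which is why the sketch averages them at the closest threshold; an interpolated estimate would be another reasonable choice.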

Published in EURASIP Journal on Audio, Speech, and Music Processing

ISSN: 1687-4722 (Online)
Publisher: SpringerOpen
Country of publisher: United Kingdom
LCC subjects: Science: Physics: Acoustics. Sound; Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://asmp-eurasipjournals.springeropen.com
