Project Euphonia: advancing inclusive speech recognition through expanded data collection and evaluation

Alicia Martin; Robert L. MacDonald; Pan-Pan Jiang; Marilyn Ladewig; Julie Cattiau; Rus Heywood; Richard Cave; Jimmy Tobin; Philip C. Nelson; Katrin Tomanek

doi:10.3389/flang.2025.1569448

Frontiers in Language Sciences (Jun 2025)

Affiliations

Alicia Martin: Google Research, Mountain View, CA, United States
Robert L. MacDonald: Google Research, Mountain View, CA, United States
Pan-Pan Jiang: Google Research, Mountain View, CA, United States
Marilyn Ladewig: CP Unlimited, New York, NY, United States
Julie Cattiau: Google Research, Mountain View, CA, United States
Rus Heywood: Google Research, Mountain View, CA, United States
Richard Cave: Computer Science Department, University College London (UCL), London, United Kingdom
Jimmy Tobin: Google Research, Mountain View, CA, United States
Philip C. Nelson: Google Research, Mountain View, CA, United States
Katrin Tomanek: Google Research, Mountain View, CA, United States

Journal volume & issue: Vol. 4

Abstract

Speech recognition models, predominantly trained on standard speech, often exhibit lower accuracy for individuals with accents, dialects, or speech impairments. This disparity is particularly pronounced for economically or socially marginalized communities, including those with disabilities or diverse linguistic backgrounds. Project Euphonia, a Google initiative dedicated to improving Automatic Speech Recognition (ASR) of disordered speech and originally launched for English, is expanding its data collection and evaluation efforts to include international languages such as Spanish, Japanese, French, and Hindi, in a continued effort to enhance inclusivity. This paper presents an overview of how the processes and methods used for English data collection were extended to more languages and locales, reports progress on the collected data, and details our model evaluation process, which focuses on meaning preservation assessed with Generative AI.

Published in Frontiers in Language Sciences

ISSN: 2813-4605 (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Language and Literature
Website: https://www.frontiersin.org/journals/language-sciences
