Speech recognition of south China languages based on federated learning and mathematical construction

Weiwei Lai; Yinglong Zheng

doi:10.3934/era.2023255

Electronic Research Archive (Jul 2023)

Affiliations

Weiwei Lai: 1. China Southern Power Grid Digital Enterprise Technology (Guangdong) Co., Ltd, Guangzhou 510000, Guangdong, China 2. Northwestern Polytechnical University, Xi'an, Shaanxi Province, China
Yinglong Zheng: 1. China Southern Power Grid Digital Enterprise Technology (Guangdong) Co., Ltd, Guangzhou 510000, Guangdong, China 3. South China University of Technology, Guangzhou, Guangdong Province, China

DOI: https://doi.org/10.3934/era.2023255
Journal volume & issue: Vol. 31, no. 8
pp. 4985 – 5005

Abstract

As speech recognition technology grows more sophisticated and computer processing power increases, more and more recognition technologies are being integrated into software platforms, enabling intelligent speech processing. Based on speech recognition and distributed processing technology, we create a comprehensive processing platform for multilingual resources used in the business and security fields. Building on the federated learning model, this study develops a speech recognition method and its mathematical model for the languages of South China. It also creates a speech dataset for South China dialects, which at present covers Mandarin and three dialects widely spoken in the Guangdong region: Cantonese, Chaoshan and Hakka. To address the unequal label distribution in the dataset, it applies two data enhancement techniques to the speech signal characteristics: audio enhancement and spectrogram enhancement. This structure is combined with a hyperbolic tangent activation function and spatial domain attention to propose a dialect classification model based on hybrid domain attention. Experimental results show a macro-average F-value of 91.54%, and the model is compared with earlier work in the field.
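
For readers unfamiliar with the reported metric, the following is a minimal sketch of how a macro-average F-value is commonly computed: the F1 score is calculated separately for each class and the per-class scores are averaged without weighting, so rare dialects count as much as frequent ones. The function name `macro_f1` and the toy labels are illustrative assumptions and are not taken from the paper, which may define or compute the metric differently.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (hypothetical helper)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels, invented for illustration only (not data from the study)
y_true = ["mandarin", "cantonese", "cantonese", "hakka", "chaoshan", "hakka"]
y_pred = ["mandarin", "cantonese", "hakka", "hakka", "chaoshan", "hakka"]
score = macro_f1(y_true, y_pred)
print(f"macro-average F1: {score:.4f}")
```

Because every class contributes equally to the average, this metric is often preferred when label distribution is unequal, which is the situation the abstract describes.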

Published in Electronic Research Archive

ISSN: 2688-1594 (Online)
Publisher: AIMS Press
Country of publisher: United States
LCC subjects: Science: Mathematics; Technology: Technology (General): Industrial engineering. Management engineering: Applied mathematics. Quantitative methods
Website: https://www.aimspress.com/journal/era