Joint streaming model for backchannel prediction and automatic speech recognition

Yong-Seok Choi; Jeong-Uk Bang; Seung Hi Kim

doi:10.4218/etrij.2023-0358

ETRI Journal (Feb 2024)

DOI: https://doi.org/10.4218/etrij.2023-0358
Journal volume & issue: Vol. 46, no. 1, pp. 118–126

Abstract

In human conversations, listeners often utilize brief backchannels such as "uh-huh" or "yeah." Timely backchannels are crucial to understanding and increasing trust among conversational partners. In human-machine conversation systems, users can engage in natural conversations when a conversational agent generates backchannels like a human listener. We propose a method that simultaneously predicts backchannels and recognizes speech in real time. We use a streaming transformer and adopt multitask learning for concurrent backchannel prediction and speech recognition. The experimental results demonstrate the superior performance of our method compared with previous works while maintaining a speech recognition performance similar to that of a single-task model. Owing to the extremely imbalanced training data distribution, the single-task backchannel prediction model fails to predict any of the backchannel categories, whereas the proposed multitask approach substantially enhances the backchannel prediction performance. Notably, in the streaming prediction scenario, the performance of backchannel prediction improves by up to 18.7% compared with existing methods.
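
The multitask learning mentioned in the abstract can be illustrated with a minimal sketch in which a speech recognition loss and a backchannel classification loss are combined into one training objective. This is not the authors' implementation: the weighted-sum form, the function names, and the weight `bc_weight` are assumptions made for illustration, and the abstract does not state how the two losses are actually balanced.

```python
import math


def cross_entropy(logits, label):
    """Negative log-probability of `label` under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[label]


def joint_loss(asr_loss, bc_logits, bc_label, bc_weight=0.5):
    """Weighted sum of an ASR loss and a backchannel classification loss.

    `bc_weight` is a hypothetical interpolation weight, not a value
    taken from the paper.
    """
    bc_loss = cross_entropy(bc_logits, bc_label)
    return (1.0 - bc_weight) * asr_loss + bc_weight * bc_loss


# Example: one frame with an ASR loss of 2.0 and two equally likely
# backchannel classes (class indices are placeholders).
loss = joint_loss(asr_loss=2.0, bc_logits=[0.0, 0.0], bc_label=0)
```

In such a setup both tasks would share the same encoder, so gradients from the speech recognition term also shape the representation used for backchannel prediction. That sharing is one plausible reason a multitask model could cope better with imbalanced backchannel labels than a single-task one.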

Published in ETRI Journal

ISSN: 1225-6463 (Print); 2233-7326 (Online)
Publisher: Electronics and Telecommunications Research Institute (ETRI)
Country of publisher: Korea, Republic of
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering: Telecommunication; Technology: Electrical engineering. Electronics. Nuclear engineering: Electronics
Website: https://onlinelibrary.wiley.com/journal/22337326
