Transactions of the Association for Computational Linguistics (Jan 2023)

Generative Spoken Dialogue Language Modeling

  • Tu Anh Nguyen,
  • Eugene Kharitonov,
  • Jade Copet,
  • Yossi Adi,
  • Wei-Ning Hsu,
  • Ali Elkahky,
  • Paden Tomasello,
  • Robin Algayres,
  • Benoît Sagot,
  • Abdelrahman Mohamed,
  • Emmanuel Dupoux

DOI: https://doi.org/10.1162/tacl_a_00545
Journal volume & issue: Vol. 11, pp. 250–266

Abstract


We introduce dGSLM, the first “textless” model able to generate audio samples of naturalistic spoken dialogues. It combines recent work on unsupervised spoken unit discovery with a dual-tower transformer architecture with cross-attention, trained on 2,000 hours of two-channel raw conversational audio (Fisher dataset) without any text or labels. We show that our model is able to generate speech, laughter, and other paralinguistic signals in both channels simultaneously, and produces more naturalistic and fluid turn-taking than a text-based cascaded model.
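To make the dual-tower idea concrete, here is a minimal sketch of a two-channel unit language model in PyTorch: each channel's tower applies causal self-attention over its own stream of discrete speech units, then cross-attends to the other channel's hidden states before predicting the next unit. This is an illustrative assumption, not the authors' released implementation; all names (DualTowerLayer, DualTowerLM, the vocabulary size, and the layer sizes) are hypothetical, and the actual dGSLM model additionally handles unit durations and synthesizes waveforms with a vocoder.

```python
import torch
import torch.nn as nn


class DualTowerLayer(nn.Module):
    """One layer: a channel self-attends, then cross-attends to the other channel."""

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                nn.Linear(4 * d_model, d_model))
        self.norm1, self.norm2, self.norm3 = (nn.LayerNorm(d_model) for _ in range(3))

    def forward(self, x, other, causal_mask):
        # Causal self-attention within this channel.
        q = self.norm1(x)
        h, _ = self.self_attn(q, q, q, attn_mask=causal_mask)
        x = x + h
        # Causal cross-attention to the other channel: both streams are
        # time-aligned, so the future of the other channel is masked too.
        h, _ = self.cross_attn(self.norm2(x), self.norm2(other), self.norm2(other),
                               attn_mask=causal_mask)
        x = x + h
        return x + self.ff(self.norm3(x))


class DualTowerLM(nn.Module):
    """Weight-shared towers predicting the next discrete unit in each channel."""

    def __init__(self, vocab_size=500, d_model=256, n_heads=4,
                 n_layers=4, max_len=2048):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.pos = nn.Embedding(max_len, d_model)
        self.layers = nn.ModuleList(
            [DualTowerLayer(d_model, n_heads) for _ in range(n_layers)])
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, units_a, units_b):
        T = units_a.size(1)
        device = units_a.device
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool, device=device),
                          diagonal=1)
        pos = self.pos(torch.arange(T, device=device))
        a, b = self.embed(units_a) + pos, self.embed(units_b) + pos
        for layer in self.layers:
            # Symmetric update: both calls read the pre-update states.
            a, b = layer(a, b, mask), layer(b, a, mask)
        return self.head(a), self.head(b)  # next-unit logits per channel


# Toy usage: two synchronized streams of pseudo-units (e.g., cluster indices
# from a self-supervised speech encoder such as HuBERT).
model = DualTowerLM()
a = torch.randint(0, 500, (2, 128))
b = torch.randint(0, 500, (2, 128))
logits_a, logits_b = model(a, b)
print(logits_a.shape)  # torch.Size([2, 128, 500])
```

Because the two towers share weights and cross-attend at every layer, each channel can condition on the other's ongoing activity, which is what lets the model produce overlapping events such as backchannels and laughter rather than strictly alternating turns.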