Egyptian Informatics Journal (Dec 2022)

A speech separation system in video sequence using dilated inception network and U-Net

  • Ghada Dahy,
  • Mohammed A.A. Refaey,
  • Reda Alkhoribi,
  • M. Shoman

Journal volume & issue
Vol. 23, no. 4
pp. 121 – 131

Abstract

In this paper, an audio-visual model for separating the speech of a target speaker from a mixture of other speakers' speech is proposed. It can be used in speech separation, in automatic speech recognition (ASR) systems, and in creating single-speaker speech databases. Speech separation is a difficult problem when using audio information alone, so visual and auditory signals are combined to perform the separation. The proposed model consists of four modules: two for the audio signal, one for the visual features, and a final one that concatenates the features produced by the previous three modules to generate the separated signals. The proposed model improved Short-Time Objective Intelligibility (STOI) by 11%, Perceptual Evaluation of Speech Quality (PESQ) by 24%, and Frequency-weighted Segmental SNR (fwSNRseg) by 16% compared with previous works. It also improved 'Csig', the predicted rating of speech distortion, by 13%, and 'Covl', the predicted rating of overall quality, by 18% compared with previous audio-visual models.
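The abstract does not specify the internals of the dilated inception network, but the general idea behind such blocks is to run several convolutions with different dilation rates in parallel over the same input and concatenate their outputs, capturing context at multiple time scales. A minimal NumPy sketch of that idea (the kernel values and dilation rates here are illustrative assumptions, not the paper's configuration):

```python
import numpy as np

def dilated_conv1d(x, kernel, dilation=1):
    """1-D convolution with 'same' padding at the given dilation rate."""
    k = len(kernel)
    pad = dilation * (k - 1) // 2          # keep output length equal to input length
    xp = np.pad(x, (pad, pad))
    return np.array([sum(kernel[i] * xp[t + i * dilation] for i in range(k))
                     for t in range(len(x))])

def dilated_inception_block(x, kernel, dilations=(1, 2, 4)):
    """Inception-style block: parallel dilated convolutions over the same
    input, with the branch outputs stacked as separate feature channels."""
    return np.stack([dilated_conv1d(x, kernel, d) for d in dilations])

# Toy 8-sample signal through three parallel branches.
x = np.arange(8, dtype=float)
feats = dilated_inception_block(x, kernel=[0.25, 0.5, 0.25])
print(feats.shape)  # one row of features per dilation rate: (3, 8)
```

In a real model each branch would learn its own kernels; stacking the branch outputs lets later layers mix fine-grained (dilation 1) and long-range (dilation 4) temporal context.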

Keywords