Persian sentences to phoneme sequences conversion based on recurrent neural networks

Behbahani Yasser Mohseni; Babaali Bagher; Turdalyuly Mussa

doi:10.1515/comp-2016-0019

Open Computer Science (Dec 2016)

Affiliations

Behbahani Yasser Mohseni: Speech Processing Laboratory of the Sharif University of Technology, Iran
Babaali Bagher: Department of Computer Science of the University of Tehran, Iran
Turdalyuly Mussa: Institute of Information and Computational Technologies, Almaty, Kazakhstan

DOI: https://doi.org/10.1515/comp-2016-0019
Journal volume & issue: Vol. 6, no. 1
pp. 219 – 225

Abstract

Grapheme-to-phoneme conversion is one of the main subsystems of Text-to-Speech (TTS) systems. Converting a sequence of written words to the corresponding phoneme sequence is more challenging for Persian than for other languages, because the standard orthography omits short vowels and the pronunciation of words depends on their position in a sentence. Common approaches used in commercial Persian TTS systems rely on several modules and complicated models for natural language processing and homograph disambiguation, which make implementation harder and reduce the overall precision of the system. In this paper we define grapheme-to-phoneme conversion as a sequence labeling problem and use modified Recurrent Neural Networks (RNNs) to create a smart, integrated model for this purpose. The recurrent networks are made bidirectional and equipped with Long Short-Term Memory (LSTM) blocks so that they can draw on most of the past and future contextual information when making decisions. The experiments conducted in this paper show that, in addition to having a unified structure, the bidirectional RNN-LSTM performs well in recognizing the pronunciation of Persian sentences, with a precision of more than 98 percent.
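
To make the sequence labeling formulation concrete, the following is a minimal sketch of a bidirectional LSTM forward pass that emits one label distribution per input grapheme. It is not the authors' implementation: the weights are random and untrained, the LSTM cell is a plain variant without peephole connections, and the label inventory, layer sizes, and function names (`make_lstm`, `lstm_run`, `bilstm_label`) are assumptions chosen only for illustration. The point is the structure: one pass reads the graphemes left to right, another reads them right to left, and the label for each position is computed from both hidden states, so past and future context both contribute.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def one_hot(index, size):
    return [1.0 if k == index else 0.0 for k in range(size)]

def make_lstm(n_in, n_hid, rng):
    # Four gates: input (i), forget (f), output (o), candidate (c).
    # Each gate has one weight row per hidden unit over [x, h_prev, bias].
    width = n_in + n_hid + 1
    return {g: [[rng.uniform(-0.1, 0.1) for _ in range(width)]
                for _ in range(n_hid)] for g in "ifoc"}

def lstm_run(params, xs, n_hid):
    # Run a plain LSTM over xs and return the hidden state at each step.
    h = [0.0] * n_hid
    c = [0.0] * n_hid
    states = []
    for x in xs:
        z = x + h + [1.0]
        pre = {g: [sum(w * v for w, v in zip(row, z)) for row in params[g]]
               for g in "ifoc"}
        c = [sigmoid(pre["f"][k]) * c[k]
             + sigmoid(pre["i"][k]) * math.tanh(pre["c"][k])
             for k in range(n_hid)]
        h = [sigmoid(pre["o"][k]) * math.tanh(c[k]) for k in range(n_hid)]
        states.append(h)
    return states

def bilstm_label(xs, fwd, bwd, w_out, n_hid):
    # Forward pass sees the past; backward pass sees the future.
    h_fwd = lstm_run(fwd, xs, n_hid)
    h_bwd = lstm_run(bwd, xs[::-1], n_hid)[::-1]
    outputs = []
    for f, b in zip(h_fwd, h_bwd):
        feat = f + b + [1.0]
        scores = [sum(w * v for w, v in zip(row, feat)) for row in w_out]
        outputs.append(softmax(scores))
    return outputs

rng = random.Random(0)
graphemes = list("کتاب")                # written form, short vowel omitted
alphabet = sorted(set(graphemes))
labels = ["k", "ke", "t", "â", "b"]     # hypothetical label inventory
n_hid = 8                               # assumed hidden size

xs = [one_hot(alphabet.index(g), len(alphabet)) for g in graphemes]
fwd = make_lstm(len(alphabet), n_hid, rng)
bwd = make_lstm(len(alphabet), n_hid, rng)
w_out = [[rng.uniform(-0.1, 0.1) for _ in range(2 * n_hid + 1)]
         for _ in labels]

probs = bilstm_label(xs, fwd, bwd, w_out, n_hid)
for g, p in zip(graphemes, probs):
    print(g, labels[p.index(max(p))])   # arbitrary labels: weights are untrained
```

With trained weights, a label such as the hypothetical `ke` would let the model restore a short vowel that the orthography leaves out. How the paper itself defines its labels is not stated in the abstract.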

Published in Open Computer Science

ISSN: 2299-1093 (Online)
Publisher: De Gruyter
Country of publisher: Poland
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://www.degruyter.com/view/j/comp
