International Journal of Advanced Robotic Systems (Nov 2012)

FST-Based Pronunciation Lexicon Compression for Speech Engines

  • Žiga Golob,
  • Jerneja Žganec Gros,
  • Mario Žganec,
  • Boštjan Vesnicer,
  • Simon Dobrišek

DOI: https://doi.org/10.5772/52795
Journal volume & issue: Vol. 9

Abstract


Finite-state transducers are frequently used to represent pronunciation lexicons in speech engines, where memory and processing resources are scarce. This paper proposes two methods for further reducing the memory footprint of finite-state transducers that represent pronunciation lexicons. First, different alignments of grapheme and allophone transcriptions are studied, and a reduction of up to 30% in the number of states is reported. Second, a combination of grapheme-to-allophone rules with a finite-state transducer is proposed, which yields a finite-state transducer that is 65% smaller than one built with conventional approaches.
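
The intuition behind combining rules with a stored lexicon can be illustrated with a minimal sketch. It assumes one plausible reading of the second proposal: words whose pronunciation the rules already predict need not be stored, so only the exceptions remain. The rule table, the toy lexicon, and the function names are hypothetical, and a plain dictionary stands in for the finite-state transducer; this is not the authors' implementation.

```python
# Hypothetical toy rule set: each grapheme maps to exactly one allophone symbol.
# Real grapheme-to-allophone rules are context-dependent; this is an assumption
# made only to keep the sketch short.
TOY_RULES = {
    "a": "a", "b": "b", "c": "k", "e": "e",
    "i": "i", "s": "s", "t": "t", "y": "i",
}

# Hypothetical toy lexicon: word -> allophone transcription.
LEXICON = {
    "cat": ["k", "a", "t"],
    "bat": ["b", "a", "t"],
    "best": ["b", "e", "s", "t"],
    "city": ["s", "i", "t", "i"],  # the toy rules would predict k-i-t-i
}


def apply_rules(word):
    """Predict a transcription by applying the toy rules grapheme by grapheme."""
    return [TOY_RULES[grapheme] for grapheme in word]


def build_exceptions(lexicon):
    """Keep only the entries that the rules get wrong."""
    return {
        word: transcription
        for word, transcription in lexicon.items()
        if apply_rules(word) != transcription
    }


def lookup(word, exceptions):
    """Use the stored exception if there is one, otherwise fall back to the rules."""
    return exceptions.get(word, apply_rules(word))


exceptions = build_exceptions(LEXICON)
print(f"stored entries: {len(exceptions)} of {len(LEXICON)}")
```

In this toy setting only the irregular word has to be stored, while every word in the original lexicon is still transcribed correctly by `lookup`. The actual saving in a real system depends on the quality of the rules and on how the remaining entries are encoded in the transducer.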