Computer Science (Jan 2009)

Polish Phoneme Statistics Obtained On Large Set Of Written Texts

  • Bartosz Ziółko,
  • Jakub Gałka,
  • Mariusz Ziółko

DOI
https://doi.org/10.7494/csci.2009.10.3.97
Journal volume & issue
Vol. 10
p. 97

Abstract

Phonetic statistics were collected from several Polish corpora. The paper summarizes the data, namely phoneme n-grams, and describes some phenomena observed in the statistics. Triphone statistics describe context-dependent speech units, which play an important role in speech recognition systems and had never been calculated for a large set of Polish written texts. The standard phonetic alphabet for Polish, SAMPA, and methods of providing phonetic transcriptions are described.
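As a rough illustration of what phoneme n-gram statistics are, the minimal sketch below counts unigrams, bigrams, and triphones in a sequence of phoneme symbols. It assumes the text has already been transcribed into a list of symbols. The function name `count_ngrams` and the toy input sequence are hypothetical and are not taken from the paper or its corpora.

```python
from collections import Counter


def count_ngrams(phonemes, n):
    """Count contiguous n-grams in a sequence of phoneme symbols."""
    # Slide a window of length n over the sequence; each window is one n-gram.
    return Counter(
        tuple(phonemes[i:i + n]) for i in range(len(phonemes) - n + 1)
    )


# Toy symbol sequence for illustration only (not data from the paper).
phonemes = ["m", "a", "m", "a", "m", "a"]

unigrams = count_ngrams(phonemes, 1)
bigrams = count_ngrams(phonemes, 2)
triphones = count_ngrams(phonemes, 3)

# Relative frequencies follow by dividing each count by the total.
total_triphones = sum(triphones.values())
triphone_freq = {gram: c / total_triphones for gram, c in triphones.items()}
```

On a real corpus the same counting would run over the transcription of every text, and the resulting tables would be much larger. How word and sentence boundaries are handled is a design choice this sketch leaves out.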

Keywords