Journal of Portuguese Linguistics (Apr 2021)

Detecting word-level stress in continuous speech: A case study of Brazilian Portuguese

  • Simone Harmath-de Lemos

DOI
https://doi.org/10.5334/jpl.238
Journal volume & issue
Vol. 20, no. 1

Abstract

This study discusses the detection of primary stress in continuous speech in Brazilian Portuguese (BP), using the West Point corpus (Morgan et al. 2008) and compressed representations of the speech signal (MFCCs, modelled by HMM-GMMs), as implemented in the Kaldi toolkit (Povey et al. 2011). An acoustic model of BP was trained using 5-fold cross-validation and tested in three experimental conditions. Fairly high accuracy was achieved in all conditions tested, together with high Matthews correlation coefficients (MCCs) and Kappas, indicating that the results are neither an effect of imbalanced data sets nor of chance classification. These results, along with metrics obtained for vowels in pretonic and posttonic positions, indicate that (i) stress in BP is captured fairly well across speakers and genders by representations of the speech signal that encode spectral features and energy information but do not directly compute duration or F0; (ii) as captured by the models used herein, there is an asymmetry between pretonic and posttonic vowels; (iii) in a preliminary analysis, unstressed word tokens tend to cluster in prosodically weak positions of the utterance, raising the question of whether stress is consistently realized in these positions; and (iv) pending further studies, there is an asymmetry between words with ultimate, penultimate and antepenultimate stress as to how successfully stress is captured by the models used herein.
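
The abstract relies on MCC and Kappa to rule out class imbalance and chance as explanations for the accuracy figures. As a minimal sketch of why these metrics serve that purpose, the Python below computes accuracy, the Matthews correlation coefficient and Cohen's kappa from a binary confusion matrix. It is not the study's evaluation code: the function names and the counts are hypothetical, and reading "Kappa" as Cohen's kappa is an assumption.

```python
import math


def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Proportion of correct decisions."""
    return (tp + tn) / (tp + fp + fn + tn)


def mcc(tp: int, fp: int, fn: int, tn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when a row or column of the matrix is empty.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def cohen_kappa(tp: int, fp: int, fn: int, tn: int) -> float:
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = tp + fp + fn + tn
    p_observed = (tp + tn) / n
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
    # Kappa is undefined when chance agreement is total; return 0 here.
    return 0.0 if p_chance == 1 else (p_observed - p_chance) / (1 - p_chance)


# Hypothetical counts: 10 stressed vowels, 90 unstressed vowels, and a
# classifier that labels every vowel as unstressed.
majority = dict(tp=0, fp=0, fn=10, tn=90)
print(accuracy(**majority), mcc(**majority), cohen_kappa(**majority))
```

With these made-up counts, accuracy is 0.9 while both MCC and kappa are 0: the high accuracy reflects only the majority class. High values on all three metrics at once are therefore harder to attribute to imbalance or chance.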

Keywords