نشریه پژوهشهای زبانشناسی (Journal of Researches in Linguistics) (Sep 2020)
Speaker-specific features of simple vowels in Persian based on the source-filter theory
Abstract
Based on source-filter theory, the present research investigates between- and within-speaker variability in the simple vowels of Persian using experimental phonetics tools. It aims to discover which of the simple vowels of Persian carry more speaker-specific information and which acoustic parameters better distinguish Persian speakers. To test between- and within-speaker variability, two types of acoustic parameters were selected: one related to the larynx, i.e. fundamental frequency, and the other related to the vocal tract, i.e. formant frequencies. Fundamental frequency and formant values were extracted from the steady-state point of the vowels uttered by twelve male Persian speakers. Speech data were recorded non-contemporaneously in a laboratory environment on two occasions separated by one to two weeks, which allowed occasion-to-occasion within-speaker variability to be analyzed. Speech tokens were acoustically measured with Praat version 5.2.34, and statistical analyses were carried out with SPSS version 21 and R version 3.3.3. The results indicated that the low front vowel /a/ and the third formant frequency convey more speaker-specific information than the other vowels and formant frequencies. In addition, the discriminatory power of fundamental frequency was found to be stronger than that of the formant frequencies. The results also revealed that fundamental frequency is correlated with the first formant frequency, which indicates interdependence between the source and filter sections.
Keywords: Acoustic phonetics, forensic phonetics, formant frequency, fundamental frequency, speaker identification, source-filter theory
Introduction
Verbal communication is an integral part of human social interaction. Everyday experience tells us that humans can easily recognize familiar speakers by their voice.
This indicates that speech sounds contain speaker-specific information that is reflected in the acoustic characteristics of the speech signal. Vowels are among the speech sounds that have long been at the center of attention in forensic voice comparison. Gold and French (2011) reported that vowels are among the segments most often analyzed by forensic practitioners, and that F0 and formant structures are two acoustic parameters commonly used in forensic voice comparison. Earlier studies on the particularities of vowels focused primarily on calculating average formant and fundamental frequency values over long stretches of speech recordings. However, long-term extraction of F0 and formant values represents solely the discriminatory power of formant structures, without allowing us to measure the strength of each vowel separately. In this study we examine the discriminatory role of the simple vowels of Persian, with a focus on the extent of source-filter independence or interdependence in the context of speaker identification. We aim to determine which Persian vowels better distinguish speakers and which acoustic parameters of the source and filter sections are more speaker-specific. We also examine whether source and filter features can capture complementary information about speakers that could be used to improve speaker discrimination.
Materials & Methods
To test between- and within-speaker variability, twelve male Persian speakers were recorded in two sessions separated by an interval of one to two weeks. Speakers were asked to read 54 sentences one at a time, with a pause, in a natural manner and without any marked intonation. Speech tokens were analyzed using Praat (version 5.2.34, Boersma and Weenink, 2013). For this study, mean values of the fundamental frequency (F0) and the first four formants, i.e.
F1, F2, F3 and F4, were measured at the central points of the six simple vowels of Persian. Statistical analysis of the data was carried out using R (R Core Team 2014) version 3.3.3 and SPSS (IBM Corp. 2012) version 21.
Discussion of Results and Conclusions
In this section, we present the results of the statistical analyses, i.e. univariate analysis of variance, multinomial logistic regression and principal component analysis, that were applied to the collected Persian speech data. In the present study, we explored potential speaker-specific acoustic parameters of the simple vowels of Persian based on source-filter theory. Statistical analysis of the speech data revealed that the selected acoustic parameters, i.e. F0, F1, F2, F3 and F4, of all vowels, except for F2 of the vowel /u/, were able to discriminate between Persian speakers. The findings showed that the low front vowel /a/ appears to convey the highest between-speaker discriminatory power. In terms of formant structures, for most vowels the effect of speaker was stronger on F3 and F4 than on F1 and F2. Additionally, fundamental frequency was found to be more discriminatory than the formant frequencies. The results also revealed a significant correlation between F0 and F1, which indicates considerable interdependence between the source and filter sections.
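The idea behind the univariate analysis of variance mentioned above can be illustrated with a minimal sketch: an acoustic parameter discriminates speakers well when its variance between speakers is large relative to its variance within speakers. The sketch below is written in Python for illustration only (the study itself used SPSS and R). The function name `f_ratio`, the speaker labels and all numerical values are hypothetical toy assumptions and are not data from this study.

```python
from statistics import mean


def f_ratio(groups):
    """One-way ANOVA F statistic for one acoustic parameter.

    `groups` maps a speaker label to that speaker's measurements
    (e.g. F3 values in Hz). A larger F means that between-speaker
    variance is larger relative to within-speaker variance.
    """
    k = len(groups)                              # number of speakers
    n = sum(len(v) for v in groups.values())     # total number of tokens
    if k < 2 or n <= k:
        raise ValueError("need at least two speakers and more tokens than speakers")
    grand = mean(x for v in groups.values() for x in v)
    # Between-speaker sum of squares: speaker means around the grand mean.
    ss_between = sum(len(v) * (mean(v) - grand) ** 2 for v in groups.values())
    # Within-speaker sum of squares: tokens around their own speaker mean.
    ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Invented toy values (Hz), NOT measurements from the study.
well_separated = {
    "spk1": [2400, 2420, 2410],
    "spk2": [2600, 2590, 2610],
    "spk3": [2800, 2810, 2790],
}
overlapping = {
    "spk1": [2400, 2600, 2800],
    "spk2": [2410, 2590, 2810],
    "spk3": [2420, 2610, 2790],
}

print("F (well separated speakers):", f_ratio(well_separated))
print("F (overlapping speakers):   ", f_ratio(overlapping))
```

In this toy setting the first parameter yields a far larger F than the second, which is the sense in which one parameter or vowel can be said to carry more speaker-specific information than another. A full analysis would also need significance testing and the multivariate models named above, which this sketch does not attempt.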
Keywords