Papers in Physics, vol. 6, art. 060002 (2014)

Received: 29 December 2013, Accepted: 27 May 2014
Edited by: E. Mizraji
Reviewed by: J. Brum, Instituto de Física, Facultad de Ciencias, Universidad de la República, Montevideo, Uruguay.
Licence: Creative Commons Attribution 3.0
DOI: http://dx.doi.org/10.4279/PIP.060002
www.papersinphysics.org
ISSN 1852-4249

Study of the characteristic parameters of the normal voices of Argentinian speakers

E. V. Bonzi,1, 2∗ G. B. Grad,1† A. M. Maggi,3‡ M. R. Muñóz3§

∗E-mail: bonzie@famaf.unc.edu.ar
†E-mail: grad@famaf.unc.edu.ar
‡E-mail: alicia.maggi@hotmail.com
§E-mail: eudaimonia13@hotmail.com

1 Facultad de Matemática, Astronomía y Física, Universidad Nacional de Córdoba, Ciudad Universitaria, 5000 Córdoba, Argentina.
2 Instituto de Física Enrique Gaviola (CONICET), Ciudad Universitaria, 5000 Córdoba, Argentina.
3 Escuela de Fonoaudiología, Facultad de Ciencias Médicas, Universidad Nacional de Córdoba, Ciudad Universitaria, 5000 Córdoba, Argentina.

The voice laboratory makes it possible to study the human voice with an objective and noninvasive method. In this work, we studied voice parameters such as pitch, formants, jitter, shimmer and harmonic-to-noise ratio in a group of young people. This statistical information on the parameters was obtained from Argentinian speakers.

I. Introduction

The voice is a multidimensional phenomenon that must be evaluated with special tools for determining its acoustic parameters. These parameters are the pitch, or voice tone; the timbre, regarded as the personality of the voice and particular to each person (it is determined by the fundamental frequency, its harmonics and the formants); and the degree of hoarseness.

During sustained vibration, the vocal folds exhibit variations of fundamental frequency and amplitude; these phenomena are called “frequency perturbation” (jitter) and “amplitude perturbation” (shimmer). They reflect fluctuations in the tension and biomechanical characteristics of the vocal folds, as well as variations in their neural control and in the physiological properties of the individual's voice.

Acoustic analysis is one of the major advances in the study of voice and has increased the accuracy of diagnosis in this area. Normal values are important and necessary as standards to guide voice professionals.

Few studies have been performed for the Latin languages [1–3], whereas several exist for the English language, such as those in Refs. [4–8]. Likewise, the software used for voice therapy is in general designed for languages other than Spanish. A comparison has been made, though, between the vowel systems of English and Spanish (the variety spoken in Madrid, Spain), which have relatively large and small vowel inventories, respectively [9]. For these reasons, we consider it important and necessary to produce more results for the Spanish-speaking population.

We analyzed 72 audio files of female and male voices from an Argentinian Spanish-speaking population to obtain the acoustical parameters using the Praat program [10]. Our data were compared with those of Bradlow [9], Hualde [11] and Casado Morente et al. [12]. The pitches measured were lower than expected, and the First formant of the /a/ and /u/ vowels is higher than in the published data. Additionally, the Harmonic to Noise Ratio (HNR) values for each vowel are presented.

Figure 1: Wave shape of the /a/ sound.

Figure 2: Wave shape of the /i/ sound.

II. Measurement methodology

Pitch, First and Second formants, Jitter, Shimmer and Harmonic to Noise Ratio (HNR) are the cornerstones of acoustic measurement of voice signals, and are often regarded as indices of the perceived quality of both normal and pathological voices [13].
In this work, we analyzed audio files of the five Spanish vowels produced by 72 female and male individuals in order to study the parameters mentioned above. The individuals are Argentinian university students aged between 20 and 30, coming from different regions without any special geographical distribution.

The voices were recorded with a Behringer C-1U (USB) cardioid microphone and a notebook in an acoustically treated room. The microphone was placed 10 cm from the mouth of each subject, who pronounced the vowels at a comfortable intensity and tone. Each sound was sustained for at least five seconds.

The Praat program, commonly used in linguistics for the scientific analysis of the human voice [10], was used to record and analyze the wav files and to obtain all the parameters presented in this work. The sound files were recorded at a sample rate of 44100 Hz.

The wave shapes of the sounds corresponding to the /a/ and /i/ vowels are shown in Figs. 1 and 2. Figs. 3 and 4 show the harmonic components obtained by applying the Fourier transform to the respective vowel signals.

Figure 3: Harmonics of the /a/ vowel.

Figure 4: Harmonics of the /i/ vowel.

Pitch

Pitch is a perceptual attribute of sound closely related to frequency; as a perception, it is a subjective notion. In psychoacoustics, the pitch is related to the fundamental frequency of vibration of the vocal cords, which allows the tone frequency to be perceived. In the Praat program [10], however, the pitch coincides with the fundamental harmonic of the wave, and we used this definition in this work. This parameter depends on gender, being higher for women and lower for men.

Formants

The voice is created in the vocal cords, shaped as a complex sound with harmonics, and modified in the vocal tract by its resonance frequencies.
The amplitudes of the harmonic frequencies are thus enveloped, forming an energy spectrum; the peaks, or maxima, observed in this spectrum are named “formants.” Consequently, a formant is a concentration of acoustic energy around a particular frequency in the speech wave. There are several formants, each at a different frequency corresponding to a resonance of the vocal tract, and the first two in particular are related to the position of the tongue. The magnitude of the First formant (F1) is inversely related to the height (up-down position) of the tongue, and the Second formant (F2) is related to its front-back position.

Jitter and Shimmer

The naturalness of sustained vowels is attributed to the fundamental frequency and the signal amplitude. Even so, voice production involves unwanted variations of the sound signal properties over time.

Jitter indicates the variability, or perturbation, of the fundamental frequency, whereas shimmer refers to the same kind of perturbation in the amplitude of the sound wave, that is, in the intensity of vocal emission. Jitter is affected mainly by a lack of control of vocal fold vibration, and shimmer by a reduction of glottic resistance and by mass lesions in the vocal folds, which are related to the presence of noise at emission and to breathiness [10, 14].

Harmonic to Noise Ratio - HNR

The harmonic-to-noise ratio is defined as the amount of energy conveyed in the fundamental frequency (f0) and its harmonics divided by the energy in the noise frequencies. Frequencies that are not integer multiples of f0 are regarded as noise.

This parameter is related to the perception of vocal roughness and hoarseness [10]. Normal voices have a low level of noise and a high HNR. Conversely, a greater degree of hoarseness increases the noise component and decreases the HNR.

III. Results and Discussion

The measured data were processed statistically, and the results are shown in Tables 1 to 6 and in Figs. 5 and 6.
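As an illustration of the perturbation and noise measures defined in Section II, the following Python sketch computes a local jitter, a local shimmer and an HNR in decibels. It assumes that the cycle periods, the cycle peak amplitudes and the harmonic and noise energies have already been extracted from the recording. The sample numbers and the function names are hypothetical, and the sketch does not reproduce the algorithms that Praat [10] applies internally.

```python
import math


def local_perturbation_percent(values):
    """Mean absolute difference between consecutive cycle values,
    divided by the mean value, expressed in percent.

    Applied to cycle periods this gives a local jitter; applied to
    cycle peak amplitudes it gives a local shimmer.
    """
    if len(values) < 2:
        raise ValueError("at least two cycles are needed")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value


def hnr_db(harmonic_energy, noise_energy):
    """Ratio of harmonic energy to noise energy, in decibels."""
    if harmonic_energy <= 0 or noise_energy <= 0:
        raise ValueError("energies must be positive")
    return 10.0 * math.log10(harmonic_energy / noise_energy)


# Hypothetical values for five consecutive glottal cycles (not measured data).
periods_ms = [5.00, 5.02, 4.98, 5.01, 4.99]
peak_amplitudes = [0.80, 0.82, 0.79, 0.81, 0.80]

jitter_local = local_perturbation_percent(periods_ms)
shimmer_local = local_perturbation_percent(peak_amplitudes)
# Hypothetical split: 99 % of the energy in f0 and its harmonics, 1 % in noise.
hnr = hnr_db(harmonic_energy=99.0, noise_energy=1.0)

print(f"Jitter (local):  {jitter_local:.2f} %")
print(f"Shimmer (local): {shimmer_local:.2f} %")
print(f"HNR:             {hnr:.1f} dB")
```

The same helper serves for jitter and shimmer because both are defined here as a cycle-to-cycle variation relative to the mean; only the input sequence changes. The inputs are made up and carry no experimental meaning.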
Figure 5: Female formant chart. (Panel title: Female vowels; axes: First Formant F1 [Hz] and Second Formant F2 [Hz]; vowels /i/, /e/, /a/, /o/ and /u/.)

Figure 6: Male formant chart. (Panel title: Male vowels; axes: First Formant F1 [Hz] and Second Formant F2 [Hz]; vowels /i/, /e/, /a/, /o/ and /u/.)

The pitches for female and male individuals are shown in Table 1. We used the minimum and maximum values to describe the dispersion instead of the standard deviation because the data distribution was not normal. Our values are in general lower for both genders than the published data [9, 11, 12].

| | Female | Male |
|---|---|---|
| Maximum | 314 | 196 |
| Medium | 225 | 128 |
| Minimum | 155 | 85 |

Table 1: Pitch values of female and male subjects in Hz.

Tables 2 and 3 show the First and Second formant values, and Figs. 5 and 6 show the formant charts for the female and male populations obtained in this work.

We compared our male results with the formant data of male Spanish speakers published by Bradlow [9]. In general, the First (F1) and Second (F2) formant values are comparable to the published ones. In particular, the F1 values for the /a/ and /u/ vowels are higher than the reported ones by 12 % and 21 %, respectively. The Second formant, F2, for the /o/ vowel is lower than Bradlow's value by 12 %.

On the other hand, we cannot compare our female formant values with published results because we could not find results for female individuals in the literature.

Comparing female and male F1 values, we observed that most of the female values are higher by 20 %, whereas for the /o/ vowel the difference is 11 %. For F2, the female values are also higher than the male ones, the difference reaching almost 25 % for the /a/ and /i/ vowels.

Furthermore, the F2 of the /u/ vowel in our samples shows a large scatter for both genders.
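The kind of female-versus-male comparison described above can be sketched in a few lines of Python from the values in Tables 2 and 3. The sketch assumes that the male value is the baseline of each percentage, which the text does not state, and the function names are hypothetical. Because the tabulated values are rounded, the output need not match the percentages quoted in the text exactly.

```python
# Formant values in Hz as (F1, F2), copied from Table 2 (female) and Table 3 (male).
FEMALE = {"i": (370, 2600), "e": (525, 2300), "a": (900, 1500),
          "o": (550, 1000), "u": (440, 1150)}
MALE = {"i": (300, 2220), "e": (450, 1935), "a": (715, 1260),
        "o": (490, 900), "u": (390, 970)}


def percent_higher(value, baseline):
    """Relative difference of value with respect to baseline, in percent."""
    return 100.0 * (value - baseline) / baseline


def female_vs_male():
    """Return {vowel: (F1 difference, F2 difference)} in percent.

    Assumption: the male value is used as the baseline.
    """
    result = {}
    for vowel, (f1_female, f2_female) in FEMALE.items():
        f1_male, f2_male = MALE[vowel]
        result[vowel] = (percent_higher(f1_female, f1_male),
                         percent_higher(f2_female, f2_male))
    return result


for vowel, (d_f1, d_f2) in female_vs_male().items():
    print(f"/{vowel}/  F1: {d_f1:5.1f} %   F2: {d_f2:5.1f} %")
```

Using the female value as the baseline instead would only require swapping the arguments of `percent_higher` and changing the sign.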
Tables 4 and 5 show the Jitter and Shimmer values obtained for each vowel. They are comparable to the Jitter and Shimmer averages obtained by Casado Morente et al. [12] in a study of a group of normal people. In our work, we observed that the Jitter and Shimmer values of the /a/ vowel are larger than those of the other vowels.

Finally, the HNR results (see Table 6) agree with the average value presented by Casado Morente et al. [12]. We could not find HNR values for each of the five Spanish vowels in the literature, so the comparison had to be made with the average over the vowels. In the present work, we found that the vowels show an increasing HNR value from /a/ to /u/, meaning that /u/ has a better signal-to-noise ratio than the other vowels.

| Vowels | F1 [Hz] | F2 [Hz] |
|---|---|---|
| /i/ | 370 ± 45 | 2600 ± 110 |
| /e/ | 525 ± 40 | 2300 ± 130 |
| /a/ | 900 ± 55 | 1500 ± 100 |
| /o/ | 550 ± 40 | 1000 ± 80 |
| /u/ | 440 ± 40 | 1150 ± 430 |

Table 2: First and Second formant of female.

| Vowels | F1 [Hz] | F2 [Hz] |
|---|---|---|
| /i/ | 300 ± 25 | 2220 ± 100 |
| /e/ | 450 ± 35 | 1935 ± 90 |
| /a/ | 715 ± 55 | 1260 ± 60 |
| /o/ | 490 ± 35 | 900 ± 45 |
| /u/ | 390 ± 45 | 970 ± 430 |

Table 3: First and Second formant of male.

| Vowels | Shimmer Local [%] | Jitter Local [%] |
|---|---|---|
| /a/ | 2.7 ± 1.1 | 0.31 ± 0.10 |
| /e/ | 2.1 ± 0.7 | 0.28 ± 0.08 |
| /i/ | 2.2 ± 0.6 | 0.29 ± 0.07 |
| /o/ | 2.0 ± 0.7 | 0.26 ± 0.11 |
| /u/ | 2.1 ± 0.7 | 0.27 ± 0.09 |

Table 4: Shimmer and Jitter of female subjects.

| Vowels | Shimmer Local [%] | Jitter Local [%] |
|---|---|---|
| /a/ | 3.0 ± 0.9 | 0.36 ± 0.10 |
| /e/ | 2.3 ± 0.8 | 0.33 ± 0.09 |
| /i/ | 2.3 ± 0.7 | 0.28 ± 0.08 |
| /o/ | 2.2 ± 0.8 | 0.29 ± 0.10 |
| /u/ | 2.3 ± 0.9 | 0.25 ± 0.07 |

Table 5: Shimmer and Jitter of male subjects.

| Vowels | Female | Male |
|---|---|---|
| /a/ | 21 ± 3 | 20 ± 2 |
| /e/ | 20 ± 2 | 21 ± 2 |
| /i/ | 22 ± 3 | 22 ± 2 |
| /o/ | 25 ± 3 | 24 ± 3 |
| /u/ | 25 ± 4 | 25 ± 3 |

Table 6: Harmonic to Noise Ratio of female and male subjects in dB.

IV. Concluding remarks

The objective of this research was to measure acoustical properties of the Spanish voices of Argentinian speakers.

These voice parameters are generally assessed subjectively by several authors. This form of perceptual analysis of voice has significant limitations, and the subtle interpretative judgments of verbal classifications may not be accurate.

The differences we found between the vowel parameters measured in a group of people from Argentina and those obtained from Spanish-speaking people living in Spain suggest that the region of study has an important influence on the results, as expected.

Studies of this kind are very useful for comparing the properties of normal and pathological voices of people from different regions. It is necessary to test the same parameters in female Spanish speakers as well.

Such work should be performed on a larger scale and should be extended to other countries or regions of Latin America, especially where different ethnic groups can be found.

[1] W R Rodríguez, O Saz, E Lleida, Análisis robusto de la voz infantil con aplicación en terapia de voz, Areté 10, 70 (2010).

[2] T Cervera, J L Miralles, J González-Álvarez, Acoustical analysis of Spanish vowels produced by laryngectomized subjects, J. Speech Lang. Hear Res. 44, 988 (2001).

[3] J Muñoz, E Mendoza, M D Fresneda, G Carballo, P Lopez, Acoustic and perceptual indicators of normal and pathological voice, Folia Phoniatr. Logop. 55, 102 (2003).

[4] H K Vorperian, R D Kent, Vowel acoustic space development in children: A synthesis of acoustic and anatomic data, J. Speech Lang. Hear Res. 50, 1510 (2007).

[5] S P Whiteside, Sex-specific fundamental and formant frequency patterns in a cross-sectional study, J. Acoust. Soc. Am. 110, 464 (2001).

[6] P White, Formant frequency analysis of children's spoken and sung vowels using sweeping fundamental frequency production, J. Voice 13, 570 (1999).

[7] R O Coleman, Male and female voice quality and its relationship to vowel formant frequencies, J. Speech Lang. Hear Res. 14, 565 (1971).
[8] S Bennett, Vowel formant frequency characteristics of preadolescent males and females, J. Acoust. Soc. Am. 69, 231 (1981).

[9] A R Bradlow, A comparative acoustic study of English and Spanish vowels, J. Acoust. Soc. Am. 97, 1916 (1995).

[10] P Boersma, D Weenink, Praat: doing phonetics by computer [Computer program], Version 5.3.51, retrieved 2 June 2013 from http://www.praat.org/.

[11] J I Hualde, The sounds of Spanish, Cambridge University Press, Cambridge (2005).

[12] J C Casado Morente, J A Adrián Torres, M Conde Jiménez, D Piédrola Maroto, V Povedano Rodríguez, E Muñoz Gomariz, E Cantillo Baños, A Jurado Ramos, Estudio objetivo de la voz en población normal y en la disfonía por nódulos y pólipos vocales, Acta Otorrinolaringol. Esp. 52, 476 (2001).

[13] J Kreiman, B R Gerratt, Perception of aperiodicity in pathological voice, J. Acoust. Soc. Am. 117, 2201 (2005).

[14] H F Wertzner, S Schreiber, L Amaro, Analysis of fundamental frequency, jitter, shimmer and vocal intensity in children with phonological disorders, Rev. Bras. Otorrinolaringol. 71, 582 (2005).