Abstract
The aims of our study were (1) to determine whether the acoustical parameters that normal subjects use to discriminate between speakers vary when the voices being compared utter the same vowel or different vowels, and whether they differ for male and female voices; and (2) to ask whether individual voices can reasonably be represented as points in a low-dimensional perceptual space such that similar-sounding voices are located close to one another. Subjects were presented with pairs of voices from 16 male and 16 female speakers uttering the three French vowels “a”, “i” and “u” and were asked to give speaker similarity judgments. Multidimensional analyses of the similarity matrices were performed separately for male and female voices and for three types of comparisons: same vowels, different vowels and overall average. The resulting dimensions were then interpreted a posteriori in terms of relevant acoustical measures. For both male and female voices, a two-dimensional perceptual space was found to be most appropriate, with axes largely corresponding to contributions of the larynx (pitch) and supra-laryngeal vocal tract (formants), mirroring the two largely independent components of source and filter in voice production. These perceptual spaces of male and female voices and their corresponding voice samples are available at http://vnl.psy.gla.ac.uk (section Resources).
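As an illustration of how a similarity matrix can be turned into a low-dimensional perceptual space, the following is a minimal sketch of classical (Torgerson) multidimensional scaling. It is not the procedure used in the study, whose scaling algorithm is not specified in this abstract. The function name `classical_mds`, the random seed and the synthetic 16-voice data are all hypothetical, chosen only so that the example runs on its own.

```python
import numpy as np


def classical_mds(dissimilarity, n_dims=2):
    """Embed items in n_dims dimensions from a symmetric dissimilarity matrix.

    Classical (Torgerson) scaling; an illustrative assumption, not the
    method reported by the authors.
    """
    d = np.asarray(dissimilarity, dtype=float)
    n = d.shape[0]
    # Double-centre the squared dissimilarities to obtain a Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering
    # eigh returns eigenvalues in ascending order, so pick the largest ones.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_dims]
    # Clip tiny negative eigenvalues caused by rounding or non-Euclidean data.
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)


# Hypothetical data: 16 "voices" placed at made-up positions in a 2-D space,
# standing in for averaged pairwise dissimilarity judgments.
rng = np.random.default_rng(0)
true_points = rng.normal(size=(16, 2))
diff = true_points[:, None, :] - true_points[None, :, :]
dissimilarity = np.sqrt((diff ** 2).sum(axis=-1))

coords = classical_mds(dissimilarity, n_dims=2)

# Distances between the recovered points, for comparison with the input.
recovered = np.sqrt(
    ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=-1)
)
```

With real similarity ratings, the ratings would first be converted to dissimilarities (for example by subtracting them from the scale maximum), and the recovered axes would then be compared a posteriori with acoustical measures such as pitch and formant frequencies.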
Acknowledgments
We would like to acknowledge Mike Roy (Secteur Electroacoustique Faculté de Musique, Université de Montreal) for his assistance with recording the voices. We also thank anonymous reviewers for their constructive comments. This project was supported by a grant from the Biotechnology and Biological Sciences Research Council to Pascal Belin.
Cite this article
Baumann, O., Belin, P. Perceptual scaling of voice identity: common dimensions for different vowels and speakers. Psychological Research 74, 110–120 (2010). https://doi.org/10.1007/s00426-008-0185-z
DOI: https://doi.org/10.1007/s00426-008-0185-z