Perceptual scaling of voice identity: common dimensions for different vowels and speakers

  • Original Article
  • Psychological Research

Abstract

The aims of our study were: (1) to determine whether the acoustical parameters used by normal subjects to discriminate between speakers vary depending on whether the two voices in a pair utter the same vowel or different vowels, and whether these parameters differ for male and female voices; (2) to ask whether individual voices can reasonably be represented as points in a low-dimensional perceptual space such that similar-sounding voices are located close to one another. Subjects were presented with pairs of voices from 16 male and 16 female speakers uttering the three French vowels “a”, “i” and “u” and were asked to give speaker similarity judgments. Multidimensional analyses of the similarity matrices were performed separately for male and female voices and for three types of comparisons: same vowels, different vowels and overall average. The resulting dimensions were then interpreted a posteriori in terms of relevant acoustical measures. For both male and female voices, a two-dimensional perceptual space was found to be most appropriate, with axes largely corresponding to contributions of the larynx (pitch) and the supra-laryngeal vocal tract (formants), mirroring the two largely independent components of source and filter in voice production. These perceptual spaces of male and female voices and their corresponding voice samples are available at http://vnl.psy.gla.ac.uk (section Resources).
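The idea of placing voices as points in a low-dimensional space derived from pairwise judgments can be illustrated with a minimal sketch. The abstract does not say which multidimensional scaling variant was used, so classical (Torgerson) scaling is swapped in here purely for illustration. The data are synthetic: the 16 hypothetical speakers, their two latent attributes, and the function name `classical_mds` are all assumptions, not material from the study.

```python
import numpy as np

def classical_mds(dissimilarity, n_dims=2):
    """Classical (Torgerson) MDS: embed items so that Euclidean distances
    between points approximate the given pairwise dissimilarities."""
    d = np.asarray(dissimilarity, dtype=float)
    n = d.shape[0]
    # Double-centre the squared dissimilarities to obtain a Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_dims]
    # Clip small negative eigenvalues that arise from numerical noise.
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)

# Synthetic stand-in for a speaker dissimilarity matrix (assumption):
# 16 hypothetical speakers, each described by two latent attributes.
rng = np.random.default_rng(0)
latent = rng.normal(size=(16, 2))
dissim = np.linalg.norm(latent[:, None, :] - latent[None, :, :], axis=-1)

coords = classical_mds(dissim, n_dims=2)
# Distances between the recovered points, for comparison with the input.
recovered = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
```

Because the synthetic dissimilarities are exact Euclidean distances between two-dimensional points, a two-dimensional solution reproduces them up to rotation and reflection. Real similarity judgments are noisy, so the fit would be approximate and the number of dimensions would have to be chosen from the data.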


Acknowledgments

We would like to acknowledge Mike Roy (Secteur Electroacoustique, Faculté de Musique, Université de Montréal) for his assistance with recording the voices. We also thank the anonymous reviewers for their constructive comments. This project was supported by a grant from the Biotechnology and Biological Sciences Research Council to Pascal Belin.

Author information

Corresponding author

Correspondence to Oliver Baumann.

About this article

Cite this article

Baumann, O., Belin, P. Perceptual scaling of voice identity: common dimensions for different vowels and speakers. Psychological Research 74, 110–120 (2010). https://doi.org/10.1007/s00426-008-0185-z

  • DOI: https://doi.org/10.1007/s00426-008-0185-z
