Auditory Stream Segregation Based on Speaker Size, and Identification of Size-Modulated Vowel Sequences

Tsuzaki, Minoru; Takeshima, Chihiro; Irino, Toshio; Patterson, Roy D.

doi:10.1007/978-3-540-73009-5_31

When a receiver of acoustic signals is surrounded by several vibrating bodies, the incoming sound energy must be sorted into parts that correctly represent the original sources. This problem, known as source segregation, has been investigated in several ways as a core issue of auditory scene analysis. Pitch, the perceptual attribute corresponding to the fundamental periodicity, has been regarded as one of the significant cues for sound segregation. It has also been known that “timbre” can function as another cue (Bregman 1990). However, some problems remain because the definition of timbre is ambiguous.

References

Bregman AS (1990) Auditory scene analysis: the perceptual organization of sound. MIT Press, Massachusetts.
Hartmann WM, Johnson D (1991) Stream segregation and peripheral channeling. Music Percept 9:155–184.
Irino T, Patterson R (2002) Segregating information about the size and shape of the vocal tract using a time-domain auditory model: The stabilised wavelet-Mellin transform. Speech Commun 36:181–203.
Ives DT, Smith DRR, Patterson RD (2005) Discrimination of speaker size from syllable phrases. J Acoust Soc Am 118(6):3816–3822.
Kawahara H, Irino T (2005) Underlying principles of a high-quality speech manipulation system STRAIGHT and its application to speech segregation. In: Divenyi P (ed) Speech separation by humans and machines. Kluwer Academic Pub, Dordrecht, pp 167–180.
Kawahara H, Masuda-Katsuse I, de Cheveigné A (1999) Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds. Speech Commun 27:187–207.
Tsuzaki M, Irino T (2004) Perception of size-modulated speech: the relation between the modulation period and the vowel identification. Trans Tech Committee Psychol Physiolog Acoust, Acoust Soc Jpn H-2004–125, 34.
Wessel DL (1979) Timbre space as a musical control structure. Comput Music J 3:45–52.

Author information

Authors and Affiliations

Kyoto City University of Arts, Kyoto, Japan
Minoru Tsuzaki & Chihiro Takeshima
Wakayama University, Wakayama, Japan
Toshio Irino
Centre for the Neural Basis of Hearing, Cambridge University, Cambridge, UK
Roy D. Patterson

Editor information

Editors and Affiliations

Carl-von-Ossietzky Universität, 26111, Oldenburg, Germany
Birger Kollmeier, Georg Klump, Volker Hohmann, Ulrike Langemann, Manfred Mauermann, Stefan Uppenkamp & Jesko Verhey

Cite this paper

Tsuzaki, M., Takeshima, C., Irino, T., Patterson, R.D. (2007). Auditory Stream Segregation Based on Speaker Size, and Identification of Size-Modulated Vowel Sequences. In: Kollmeier, B., et al. Hearing – From Sensory Processing to Perception. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73009-5_31

DOI: https://doi.org/10.1007/978-3-540-73009-5_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73008-8
Online ISBN: 978-3-540-73009-5
eBook Packages: Biomedical and Life Sciences (R0)
