Abstract
One person talks to another in a crowded, noisy room; a soloist performs a concerto with an orchestra; a car screeches to a halt in the street outside: in each of these situations, the auditory system is faced with the problem of separating several different sources of sound from the complex, composite signal that reaches the ears.
Copyright information
© 1996 Springer-Verlag New York, Inc.
About this chapter
Cite this chapter
Mellinger, D.K., Mont-Reynaud, B.M. (1996). Scene Analysis. In: Hawkins, H.L., McMullen, T.A., Popper, A.N., Fay, R.R. (eds) Auditory Computation. Springer Handbook of Auditory Research, vol 6. Springer, New York, NY. https://doi.org/10.1007/978-1-4612-4070-9_7
DOI: https://doi.org/10.1007/978-1-4612-4070-9_7
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4612-8487-1
Online ISBN: 978-1-4612-4070-9
eBook Packages: Springer Book Archive