Processing of Short Auditory Stimuli: The Rapid Audio Sequential Presentation Paradigm (RASP)

  • Clara Suied
  • Trevor R. Agus
  • Simon J. Thorpe
  • Daniel Pressnitzer
Conference paper
Part of the Advances in Experimental Medicine and Biology book series (volume 787)


Human listeners seem remarkably able to recognise sound sources based on timbre cues. Here we describe a psychophysical paradigm to estimate the time it takes to recognise a set of complex sounds differing only in timbre cues, both in terms of the minimum duration of the sounds and the inferred neural processing time. Listeners had to respond to the human voice while ignoring a set of distractors. All sounds were recorded from natural sources over the same pitch range and equalised to the same duration and power. In a first experiment, stimuli were gated in time with a raised-cosine window of variable duration and random onset time. A voice/non-voice (yes/no) task was used. Performance, as measured by d′, remained above chance for the shortest sounds tested (2 ms); d′ values above 1 were observed for durations longer than or equal to 8 ms. Then, we constructed sequences of short sounds presented in rapid succession. Listeners were asked to report the presence of a single voice token that could occur at a random position within the sequence. This method is analogous to the “rapid sequential visual presentation” paradigm (RSVP), which has been used to evaluate neural processing time for images. For 500-ms sequences made of 32-ms and 16-ms sounds, d′ remained above chance for presentation rates of up to 30 sounds per second. There was no effect of the pitch relation between successive sounds: identical for all sounds in the sequence or random for each sound. This implies that performance was not determined by streaming or forward masking, as both phenomena would predict better performance for the random pitch condition. Overall, the recognition of familiar sound categories such as the voice seems to be surprisingly fast, both in terms of the acoustic duration required and of the underlying neural time constants.
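The three procedural steps named in the abstract (raised-cosine gating at a random onset, assembling a rapid sequence, and scoring a yes/no task with d′) can be sketched in Python as follows. This is only an illustrative sketch, not the authors' code: the function names, the sampling rate, the window spanning the whole excerpt, the placement of sounds at regular onsets of 1/rate, and the log-linear correction applied to the hit and false-alarm rates are all assumptions that the abstract does not specify.

```python
import numpy as np
from statistics import NormalDist


def gate_excerpt(sound, fs, dur_ms, rng):
    """Cut a dur_ms excerpt at a random onset and apply a raised-cosine window.

    Assumption: the window spans the whole excerpt (Hann shape).
    """
    n = max(2, int(round(fs * dur_ms / 1000.0)))
    onset = rng.integers(0, len(sound) - n + 1)  # random onset time
    window = 0.5 * (1.0 - np.cos(2.0 * np.pi * np.arange(n) / (n - 1)))
    return sound[onset:onset + n] * window


def build_sequence(excerpts, fs, rate_hz, total_ms=500.0):
    """Place gated excerpts at regular onsets (1/rate_hz apart) in a fixed-length buffer.

    Assumption: onsets are evenly spaced; excerpts that do not fit are dropped.
    """
    out = np.zeros(int(round(fs * total_ms / 1000.0)))
    for k, x in enumerate(excerpts):
        start = int(round(k * fs / rate_hz))
        if start + len(x) > len(out):
            break
        out[start:start + len(x)] += x
    return out


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Yes/no d' = z(hit rate) - z(false-alarm rate).

    Assumption: a log-linear correction keeps both rates away from 0 and 1.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fs = 16000                            # hypothetical sampling rate (Hz)
    source = rng.standard_normal(fs)      # 1 s of noise standing in for a recording
    tokens = [gate_excerpt(source, fs, 16.0, rng) for _ in range(15)]
    sequence = build_sequence(tokens, fs, rate_hz=30.0)
    print(len(sequence), d_prime(90, 10, 10, 90))
```

With equal hit and false-alarm rates the `d_prime` sketch returns 0, which corresponds to chance performance in the yes/no task.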


Keywords: Sound Source; Fixed Duration; Presentation Rate; Musical Instrument; Target Sound



This work was supported by the Fondation Pierre Gilles de Gennes pour la Recherche.



Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  • Clara Suied (1, 2, 3)
  • Trevor R. Agus (1, 2)
  • Simon J. Thorpe (4)
  • Daniel Pressnitzer (1, 2)

  1. Département d’études cognitives, Equipe Audition, Ecole Normale Supérieure, Paris, France
  2. Laboratoire de Psychologie de la Perception (UMR CNRS 8158), Université Paris Descartes, Paris, France
  3. Département Action et Cognition en Situation Opérationnelle, Institut de Recherche Biomédicale des Armées, Brétigny-sur-Orge, France
  4. Centre de Recherche Cerveau et Cognition, Université Toulouse 3, CNRS. CHU Purpan, Pavillon Baudot, Toulouse, France
