A Bio-inspired Architecture for Cognitive Audio

  • Conference paper
Bio-inspired Modeling of Cognitive Tasks (IWINAC 2007)

Abstract

A comprehensive view of speech and voice technologies now demands better and more sophisticated tools capable of extracting as much knowledge about sound and speech as possible. From a bio-inspired point of view, many knowledge-extraction tasks for speech and voice share well-known procedures at the algorithmic level. The same resources used to decode speech phones may also be used to characterize the speaker (gender, age, speaking group, etc.). Based on these facts, the present paper examines a hierarchy of sound-processing levels, auditory and perceptual, along the neural pathways of the brain, which can be translated into a bio-inspired audio-processing architecture. Its fundamental characteristics are analyzed in relation to current trends in cognitive audio processing. Examples drawn from speech-processing applications in the domain of acoustic phonetics are presented. These may find application in speaker characterization, forensics, and biometry, among others.

Editor information

José Mira, José R. Álvarez

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Gómez-Vilda, P., Ferrández-Vicente, J.M., Rodellar-Biarge, V., Álvarez-Marquina, A., Mazaira-Fernández, L.M. (2007). A Bio-inspired Architecture for Cognitive Audio. In: Mira, J., Álvarez, J.R. (eds) Bio-inspired Modeling of Cognitive Tasks. IWINAC 2007. Lecture Notes in Computer Science, vol 4527. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73053-8_14

  • DOI: https://doi.org/10.1007/978-3-540-73053-8_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-73052-1

  • Online ISBN: 978-3-540-73053-8

  • eBook Packages: Computer Science, Computer Science (R0)
