Detection of Speech Dynamics by Neuromorphic Units

  • Pedro Gómez-Vilda
  • José Manuel Ferrández-Vicente
  • Victoria Rodellar-Biarge
  • Agustín Álvarez-Marquina
  • Luis Miguel Mazaira-Fernández
  • Rafael Martínez-Olalla
  • Cristina Muñoz-Mulas
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5601)

Abstract

Speech and voice technologies are undergoing a profound review as new paradigms are sought to overcome specific problems that cannot be completely solved by classical approaches. Neuromorphic Speech Processing is an emerging area in which research is turning to the natural neural processing of speech by the Human Auditory System in order to capture the basic mechanisms that solve difficult tasks efficiently. This paper presents a further step in mimicking basic neural speech processing with simple neuromorphic units. Building on previous work, it shows how formant dynamics, and hence consonantal features, can be detected using a general neuromorphic unit that mimics the functionality of certain neurons found in the Upper Auditory Pathways. Using these simple building blocks, a General Speech Processing Architecture can be synthesized as a layered structure. Results from different simulation stages are provided, together with a discussion of implementation details. The conclusions and future work describe the functionality to be covered in the next research steps.

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Pedro Gómez-Vilda (1)
  • José Manuel Ferrández-Vicente (2)
  • Victoria Rodellar-Biarge (1)
  • Agustín Álvarez-Marquina (1)
  • Luis Miguel Mazaira-Fernández (1)
  • Rafael Martínez-Olalla (1)
  • Cristina Muñoz-Mulas (1)
  1. Grupo de Informática Aplicada al Tratamiento de Señal e Imagen, Facultad de Informática, Universidad Politécnica de Madrid, Madrid, Spain
  2. Dpto. Electrónica, Tecnología de Computadoras, Univ. Politécnica de Cartagena, Cartagena, Spain
