Neuromorphic Detection of Vowel Representation Spaces

  • Pedro Gómez-Vilda
  • José Manuel Ferrández-Vicente
  • Victoria Rodellar-Biarge
  • Agustín Álvarez-Marquina
  • Luis Miguel Mazaira-Fernández
  • Rafael Martínez-Olalla
  • Cristina Muñoz-Mulas
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6687)

Abstract

This paper presents a layered architecture for spotting and characterizing vowel segments in running speech. The detection process is based on neuromorphic principles: layers of Hebbian units implement lateral inhibition, band probability estimation and mutual exclusion. Results show how an association between the acoustic set of patterns and the phonologic set of symbols may be created. Possible applications include speech event spotting, the study of pathological voice and speaker biometric characterization, among others.
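
The abstract names three operations but does not give their equations, so the following is only a minimal sketch of the general ideas, not the authors' architecture. It assumes a single frame of non-negative band energies; the function names, the neighbourhood `radius`, the inhibition `strength` and the learning `rate` are all hypothetical choices made for illustration.

```python
from typing import List


def lateral_inhibition(bands: List[float], radius: int = 1,
                       strength: float = 0.5) -> List[float]:
    """Subtract a fraction of the neighbouring bands' mean from each band.

    Negative results are clipped to zero, so only bands that stand out
    from their surround remain active (assumed rule, for illustration).
    """
    n = len(bands)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        neighbours = [bands[j] for j in range(lo, hi) if j != i]
        surround = sum(neighbours) / len(neighbours) if neighbours else 0.0
        out.append(max(0.0, bands[i] - strength * surround))
    return out


def mutual_exclusion(activations: List[float]) -> List[int]:
    """Winner-take-all: only the most active unit fires (ties go to the first)."""
    if not activations:
        return []
    winner = max(range(len(activations)), key=lambda i: activations[i])
    return [1 if i == winner else 0 for i in range(len(activations))]


def hebbian_update(weights: List[List[float]], x: List[float],
                   y: List[float], rate: float = 0.1) -> List[List[float]]:
    """Plain Hebb rule: weights[i][j] grows when output i and input j are co-active."""
    return [[w + rate * yi * xj for w, xj in zip(row, x)]
            for row, yi in zip(weights, y)]


if __name__ == "__main__":
    frame = [0.1, 0.9, 0.2, 0.1]          # made-up band energies
    sharpened = lateral_inhibition(frame)  # peak at index 1 is emphasised
    fired = mutual_exclusion(sharpened)    # one-hot vector marking the winner
    print(sharpened, fired)
```

In this sketch, lateral inhibition sharpens spectral peaks, mutual exclusion selects a single active unit, and the Hebb rule strengthens connections between co-active inputs and outputs. How the paper combines these with band probability estimation is not stated in the abstract.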

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Pedro Gómez-Vilda (1)
  • José Manuel Ferrández-Vicente (2)
  • Victoria Rodellar-Biarge (1)
  • Agustín Álvarez-Marquina (1)
  • Luis Miguel Mazaira-Fernández (1)
  • Rafael Martínez-Olalla (1)
  • Cristina Muñoz-Mulas (1)

  1. Grupo de Informática Aplicada al Tratamiento de Señal e Imagen, Facultad de Informática, Universidad Politécnica de Madrid, Madrid, Spain
  2. Dpto. Electrónica, Tecnología de Computadoras, Univ. Politécnica de Cartagena, Cartagena, Spain
