A Study on Vowel Region Detection from a Continuous Speech

  • Ramakrishna Thirumuru
  • Harikrishna Vydana
  • Suryakanth V. Gangashetty
  • Anil Kumar Vuppala
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10089)

Abstract

Vowels exhibit high sonority and loudness due to the varying strength of impulse-like excitations. Acoustic events such as the vowel onset point (VOP) and vowel end point (VEP) can be used to detect precise vowel regions in a speech signal. In this paper, a technique is proposed to detect vowel regions in a continuous speech signal based on these acoustic events. Vowels carry significant energy in the low-frequency bands of speech, so the first stage of the method processes the speech signal with the zero frequency filtering technique. The zero frequency filtered signal predominantly contains the low-frequency content of the speech signal because it is filtered around 0 Hz. Dominant spectral peaks are then extracted from the magnitude spectrum around the glottal closure regions of the speech signal. The vowel onset points and vowel end points are obtained by convolving the spectrum of the zero frequency filtered signal with a first-order Gaussian differentiator. The performance of the proposed vowel region detection method is compared with existing state-of-the-art methods on the TIMIT database. The proposed method is reported to yield a significant improvement in vowel region detection in both clean and noisy environments.
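
The two signal-processing steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no parameter values, so the trend-removal window, the number of trend-removal passes, the kernel length and spread, and the relative threshold are all assumptions, as are the function names. The zero frequency filter follows the commonly described formulation (differencing, four integrations, local mean subtraction), and the evidence contour passed to the Gaussian differentiator is assumed to be computed separately, for example from dominant spectral peaks.

```python
import numpy as np


def zero_frequency_filter(speech, fs, mean_window_ms=10.0, trend_passes=3):
    """Sketch of zero frequency filtering.

    mean_window_ms and trend_passes are illustrative assumptions; the
    window is usually tied to the average pitch period of the speaker.
    """
    s = np.asarray(speech, dtype=np.float64)
    # Difference the signal to remove any DC offset.
    x = np.concatenate(([0.0], np.diff(s)))
    # Two cascaded resonators at 0 Hz (four poles at z = 1) amount to
    # integrating the signal four times.
    y = x
    for _ in range(4):
        y = np.cumsum(y)
    # Remove the slowly growing trend by subtracting a local mean.
    half = max(1, int(round(mean_window_ms * 1e-3 * fs / 2)))
    window = np.ones(2 * half + 1) / (2 * half + 1)
    for _ in range(trend_passes):
        y = y - np.convolve(y, window, mode="same")
    # Samples near both ends are unreliable because of zero padding.
    return y


def fogd_kernel(length, sigma):
    """First-order Gaussian differentiator with an odd number of taps."""
    if length % 2 == 0:
        length += 1
    n = np.arange(length) - (length - 1) / 2.0
    g = np.exp(-0.5 * (n / sigma) ** 2)
    g /= g.sum()
    # Derivative of a Gaussian: positive lobe first, then negative lobe.
    return -n * g / sigma ** 2


def pick_extrema(slope, rel_threshold=0.5):
    """Return indices of strong local maxima and strong local minima."""
    mid, left, right = slope[1:-1], slope[:-2], slope[2:]
    is_max = (mid > left) & (mid >= right) & (mid > rel_threshold * slope.max())
    is_min = (mid < left) & (mid <= right) & (mid < rel_threshold * slope.min())
    return np.flatnonzero(is_max) + 1, np.flatnonzero(is_min) + 1


def onset_end_candidates(evidence, kernel_length=101, rel_threshold=0.5):
    """Convolve an evidence contour with the differentiator.

    Peaks mark rising evidence (onset candidates) and valleys mark
    falling evidence (end candidates).
    """
    kernel = fogd_kernel(kernel_length, kernel_length / 6.0)
    slope = np.convolve(evidence, kernel, mode="same")
    return pick_extrema(slope, rel_threshold)
```

With a toy evidence contour that is 1 between samples 300 and 700 and 0 elsewhere, `onset_end_candidates` returns one onset candidate next to sample 300 and one end candidate next to sample 700. Real evidence contours are noisier and would need smoothing and a tuned threshold.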

Keywords

Vowel Onset Point (VOP), Vowel End Point (VEP), Zero frequency filtering, Magnitude spectrum, First order Gaussian differentiator

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Ramakrishna Thirumuru (1)
  • Harikrishna Vydana (1)
  • Suryakanth V. Gangashetty (1)
  • Anil Kumar Vuppala (1)
  1. Speech and Vision Lab (LTRC), International Institute of Information Technology Hyderabad, Hyderabad, India
