Identifying Issues in Estimating Parameters from Speech Under Lombard Effect

  • Conference paper
Advances in Signal Processing and Intelligent Recognition Systems (SIRS 2017)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 678)

Abstract

The Lombard effect (LE) is the phenomenon in which a person tends to speak louder in the presence of loud noise, because the noise obstructs self-auditory feedback. The main objective of this work is to develop a dataset for studying the influence of LE on speech parameters. The proposed dataset comprises 230 utterances from each of 10 speakers and consists of simultaneous recordings of speech and ElectroGlottoGram (EGG) signals, both for speech under LE and for neutral speech recorded in a noise-free condition. The speech under LE is recorded at 5 different levels (30 dB, 15 dB, 5 dB, 0 dB and \(-20\) dB) of babble noise. The level of LE in the developed dataset is demonstrated by comparing (a) the source parameters, (b) speaker recognition rates and (c) epoch extraction performance. Source parameters such as pitch and Strength of Excitation (SoE) are compared between the neutral speech and the speech under LE; the speech under LE shows higher pitch and lower SoE. Recognition performance also drops when a Mel Frequency Cepstral Coefficient (MFCC) - Gaussian Mixture Model (GMM) based speaker recognition system, built using the neutral speech, is tested with speech under LE from the same set of speakers. Finally, a comparison of epoch extraction from neutral speech and from speech under LE shows that the utterances with LE have a higher epoch deviation than the neutral utterances. All these experiments confirm the level of LE in the prepared database and reinforce the issues that arise in processing speech under LE for different speech processing tasks.
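The abstract does not define how epoch deviation is computed. As a minimal sketch of one plausible reading, the function below takes epoch locations derived from the EGG signal as a reference and reports the mean absolute timing difference to the nearest epoch estimated from the speech signal. The function name, the inputs, and the choice of mean absolute difference are all assumptions made for illustration, not the authors' procedure.

```python
from bisect import bisect_left


def mean_epoch_deviation(reference_epochs, estimated_epochs):
    """Mean absolute timing difference between each reference epoch and
    its nearest estimated epoch (same time unit as the inputs).

    Hypothetical definition for illustration; the paper may use another.
    """
    est = sorted(estimated_epochs)
    if not reference_epochs or not est:
        raise ValueError("both epoch lists must be non-empty")

    total = 0.0
    for r in reference_epochs:
        # Position of the first estimated epoch that is >= r.
        i = bisect_left(est, r)
        # The nearest estimate is either just before or at that position.
        candidates = est[max(i - 1, 0):i + 1]
        total += min(abs(c - r) for c in candidates)
    return total / len(reference_epochs)


# Toy epoch locations in milliseconds (made-up values, not from the dataset).
egg_epochs = [10.0, 20.0, 30.0]
speech_epochs = [10.5, 19.0, 30.0]
deviation_ms = mean_epoch_deviation(egg_epochs, speech_epochs)
```

Under this reading, a larger value for utterances with LE than for neutral utterances would correspond to the higher epoch deviation reported above.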

Author information

Corresponding author

Correspondence to M. Aiswarya.

Copyright information

© 2018 Springer International Publishing AG

About this paper

Cite this paper

Aiswarya, M., Pravena, D., Govind, D. (2018). Identifying Issues in Estimating Parameters from Speech Under Lombard Effect. In: Thampi, S., Krishnan, S., Corchado Rodriguez, J., Das, S., Wozniak, M., Al-Jumeily, D. (eds) Advances in Signal Processing and Intelligent Recognition Systems. SIRS 2017. Advances in Intelligent Systems and Computing, vol 678. Springer, Cham. https://doi.org/10.1007/978-3-319-67934-1_22

  • DOI: https://doi.org/10.1007/978-3-319-67934-1_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-67933-4

  • Online ISBN: 978-3-319-67934-1

  • eBook Packages: Engineering, Engineering (R0)
