Improving Objective Speech Quality Indicators in Noise Conditions

Data Science: New Issues, Challenges and Applications

Part of the book series: Studies in Computational Intelligence (SCI, volume 869)

Abstract

This work aims to modify speech signal samples and to test them with objective speech quality indicators after the original signals have been mixed with noise or with an interfering signal. The modifications applied to the signal are related to the characteristics of Lombard speech, i.e., pitch shifting, utterance duration changes, vocal tract scaling, and formant manipulation. A set of words and sentences in Polish is used, recorded in silence as well as in the presence of interfering signals, i.e., pink noise and so-called babble speech, also referred to as the “cocktail-party” effect. The speech samples were then processed and measured with objective indicators to check whether the modifications applied to the signal in the presence of noise increased the value of the speech quality index defined by the PESQ (Perceptual Evaluation of Speech Quality) standard.
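
The mixing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not state how the interfering signal was scaled, so the target signal-to-noise ratio (`snr_db`), the function names `rms` and `mix_at_snr`, and the use of a sine tone and white Gaussian noise as stand-ins for recorded speech and pink noise are all assumptions made for illustration.

```python
import math
import random


def rms(signal):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise RMS ratio equals `snr_db`, then add it.

    Hypothetical helper: returns (mixed, scaled_noise). The noise must be at
    least as long as the speech; extra noise samples are discarded.
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    noise_rms = rms(noise)
    if noise_rms == 0.0:
        raise ValueError("noise is silent, cannot scale to a target SNR")
    # SNR in dB is 20*log10(rms_speech / rms_noise); solve for the noise level.
    target_noise_rms = rms(speech) / (10.0 ** (snr_db / 20.0))
    gain = target_noise_rms / noise_rms
    scaled = [gain * n for n in noise]
    mixed = [s + n for s, n in zip(speech, scaled)]
    return mixed, scaled


if __name__ == "__main__":
    fs = 16000  # assumed sampling rate in Hz
    # Stand-in signals: a 200 Hz tone for speech, white noise for the interferer.
    speech = [0.5 * math.sin(2.0 * math.pi * 200.0 * n / fs) for n in range(fs)]
    rng = random.Random(0)
    noise = [rng.gauss(0.0, 1.0) for _ in range(fs)]
    mixed, scaled = mix_at_snr(speech, noise, snr_db=5.0)
    achieved = 20.0 * math.log10(rms(speech) / rms(scaled))
    print(f"achieved SNR: {achieved:.2f} dB")
```

In an experiment of the kind the abstract outlines, the clean recording would serve as the reference and the mixed signal as the degraded input to an objective measure such as PESQ, once for the unmodified speech and once for the modified speech.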

Author information

Corresponding author

Correspondence to Gražina Korvel.

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Kąkol, K., Korvel, G., Kostek, B. (2020). Improving Objective Speech Quality Indicators in Noise Conditions. In: Dzemyda, G., Bernatavičienė, J., Kacprzyk, J. (eds) Data Science: New Issues, Challenges and Applications. Studies in Computational Intelligence, vol 869. Springer, Cham. https://doi.org/10.1007/978-3-030-39250-5_11
