Abstract
This work aims at modifying speech signal samples and test them with objective speech quality indicators after mixing the original signals with noise or with an interfering signal. Modifications that are applied to the signal are related to the Lombard speech characteristics, i.e., pitch shifting, utterance duration changes, vocal tract scaling, manipulation of formants. A set of words and sentences in Polish, recorded in silence, as well as in the presence of interfering signals, i.e., pink noise and the so-called babble speech, also referred to as the “cocktail-party” effect is utilized. Speech samples were then processed and measured utilizing objective indicators to check whether modifications applied to the signal in the presence of noise increased values of the speech quality index, i.e., PESQ (Perceptual Evaluation of Speech Quality) standard.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Arantes P (2015) Time-normalization of fundamental frequency contours: a hands-on tutorial. In: Courses on speech prosody, p 98
Bapineedu G (2010) Analysis of Lombard effect speech and its application in speaker verification for imposter detection. M.Sc. thesis, Language Technologies Research Centre, International Institute of Information Technology
Beerends JG, Buuren RV, Vugt JV, Verhave J (2009) Objective speech intelligibility measurement on the basis of natural speech in combination with perceptual modeling. J Audio Eng Soc 57(5):299–308
Beerends JG, Schmidmer C, Berger J, Obermann M, Ullmann R, Pomy J, Keyhl M (2013) Perceptual objective listening quality assessment (POLQA), the third generation ITUT standard for end-to-end speech quality measurement part ii perceptual model. J Audio Eng Soc 61(6):385–402
Boersma P, Weenink D (2018) Praat: doing phonetics by computer [Computer Program]. Version 6.0.39. Retrieved May 2018
Boril H, Fousek P, Höge H (2007a) Two-stage system for robust neutral/Lombard speech recognition. InterSpeech
Boril H, Fousek P, Sündermann D, Cerva P, Zdansky J (2007b) Lombard speech recognition: a comparative study. InterSpeech
Corretge R (2012) Praat vocal toolkit. http://www.praatvocaltoolkit.com
Darwin CJ, Brungart DS, Simpson BD (2003) Effects of fundamental frequency and vocal-tract length changes on attention to one of two simultaneous talkers. J Acoust Soc Am 114(5):2913–2922
Egan JP (1972) Psychoacoustics of the Lombard voice response. J Auditory Res 12:318–324
Ghai S, Sinha R (2009) Exploring the role of spectral smoothing in context of children’s speech recognition. In: 10th Annual conference of the international speech communication association
ITU-R BS.1116 (2016) Methods for the subjective assessment of small impairments in audio systems including multichannel sound systems
ITU-R BS.1284 (2003) General methods for the subjective assessment of sound quality
ITU-T (1996) Methods for subjective determination of transmission quality. Recommendation P.800, Aug
ITU-T (2003) Mapping function for transforming P.862 raw result scores to MOS-LQO. Recommendation P.862.1, Nov
ITU-T (2001) Perceptual evaluation of speech quality (PESQ), an objective method for end-to-end speech quality assessment of narrow band telephone networks and speech codecs. Recommendation P.862, Feb
ITU-T (2004) Single-ended method for objective speech quality assessment in narrow-band telephony applications. Recommendation P.563
ITU-T (2006) Mean opinion score (MOS) terminology. Recommendation P.800.1, July
Junqua J-C, Fincke S, Field K (1999) The Lombard effect: a reflex to better communicate with others in noise. In: 1999 IEEE international conference on acoustics, speech, and signal processing proceedings. ICASSP99 (Cat. No. 99CH36258), vol 4, pp 2083–2086
Kleczkowski P, Żak A, Król-Nowak A (2017) Lombard effect in Polish speech and its comparison in English speech. Arch Acoust 42(4):561–569. https://doi.org/10.1515/aoa-2017-0060
Lau P (2008) The Lombard effect as a communicative phenomenon. UC Berkeley Phonology Lab Annual Report
Lombard E (1911) Le signe de l’élévation de la voix (translated from French). Ann des Mal l’oreille du larynx 37(2):101–119
Lu Y, Cooke M (2008) Speech production modifications produced by competing talkers, babble, and stationary noise. J Acoust Soc Am 124:3261–3275
Mermelstein P (1976) Distance measures for speech recognition, psychological and instrumental. In: Chen RCH (ed) Pattern recognition and artificial intelligence. Academic, New York, NY, USA, pp 374–388
Moulines E, Charpentier F (1990) Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones. Speech Commun 9(5–6):453–467
Nishiura T (2013) Detection for Lombard speech with second-order mel-frequency cepstral coefficient and spectral envelope in beginning of talking speech. J Acoust Soc Am
Stowe LM, Golob EJ (2013) Evidence that the Lombard effect is frequency-specific in humans. J Acoust Soc Am 134(1):640–647. https://doi.org/10.1121/1.4807645
Therrien AS, Lyons J, Balasubramaniam R (2012) Sensory attenuation of self-produced feedback: the Lombard effect revisited. PLoS One 7(11):e49370
Ubul K, Hamdulla A, Aysa A (2009) A digital signal processing teaching methodology using Praat. In: 2009 4th international conference on computer science & education. IEEE, pp 1804–1809
Vlaj D, Kacic Z (2011) The influence of Lombard effect on speech recognition. In: Speech technologies, Chap. 7, pp 151–168
Whitepaper PESQ (2001) An introduction. Psytechnics Limited
Zollinger SA, Brumm H (2011) The evolution of the Lombard effect: 100 years of psychoacoustic research. Behaviour 148:1173–1198
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2020 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Kąkol, K., Korvel, G., Kostek, B. (2020). Improving Objective Speech Quality Indicators in Noise Conditions. In: Dzemyda, G., Bernatavičienė, J., Kacprzyk, J. (eds) Data Science: New Issues, Challenges and Applications. Studies in Computational Intelligence, vol 869. Springer, Cham. https://doi.org/10.1007/978-3-030-39250-5_11
Download citation
DOI: https://doi.org/10.1007/978-3-030-39250-5_11
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-39249-9
Online ISBN: 978-3-030-39250-5
eBook Packages: Intelligent Technologies and RoboticsIntelligent Technologies and Robotics (R0)