Non-intrusive Speech Quality Assessment with Support Vector Regression

Narwaria, Manish; Lin, Weisi; McLoughlin, Ian Vince; Emmanuel, Sabu; Tien, Chia Liang

doi:10.1007/978-3-642-11301-7_34


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5916)

Included in the following conference series:

International Conference on Multimedia Modeling


Abstract

We propose a new non-intrusive speech quality assessment algorithm based on Support Vector Regression (SVR) and Mel Frequency Cepstral Coefficients (MFCCs). The basic idea is to map the MFCCs to the desired quality score using SVR. The sensitivity of the MFCCs to external noise is exploited to gauge changes in the speech signal and thereby evaluate its perceptual quality. SVR brings the ability of machine learning to learn complex data patterns, giving an effective and generalized mapping of features to a perceptual score, in contrast with the feature pooling process often used in existing speech quality estimators. Experimental results on a total of 1792 speech files and their associated subjective scores indicate that the proposed approach outperforms the standard P.563 algorithm for non-intrusive assessment of speech quality.
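
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the frame sizes, the filterbank settings, the toy harmonic signal standing in for speech, the placeholder scores, and every function name below are assumptions made for illustration. The regressor is a simplified linear model trained by subgradient descent on the epsilon-insensitive loss that SVR uses; a kernel SVR from a library such as scikit-learn would be the more usual choice in practice.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc(signal, sr=8000, frame_len=256, hop=128, n_mels=20, n_ceps=13):
    """Return an (n_frames, n_ceps) array of MFCCs for a 1-D signal."""
    # Overlapping frames, each tapered with a Hamming window.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Triangular filters spaced evenly on the mel scale.
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sr)
    fbank = np.zeros((n_mels, freqs.size))
    for m in range(n_mels):
        lo, mid, hi = edges[m:m + 3]
        tri = np.minimum((freqs - lo) / (mid - lo), (hi - freqs) / (hi - mid))
        fbank[m] = np.clip(tri, 0.0, None)
    log_mel = np.log(power @ fbank.T + 1e-10)
    # A DCT-II turns the log filterbank energies into cepstral coefficients.
    k = np.arange(n_ceps)[:, None]
    n = np.arange(n_mels)[None, :]
    return log_mel @ np.cos(np.pi * k * (2 * n + 1) / (2 * n_mels)).T

def add_noise(clean, noise, snr_db):
    """Scale `noise` so that clean + noise has the requested SNR in dB."""
    gain = np.sqrt(np.mean(clean ** 2) / (np.mean(noise ** 2) * 10.0 ** (snr_db / 10.0)))
    return clean + gain * noise

def fit_linear_eps_regressor(X, y, eps=0.1, lam=1e-3, lr=0.005, n_iter=3000):
    """Linear stand-in for SVR: subgradient descent on the epsilon-insensitive loss."""
    w, b = np.zeros(X.shape[1]), float(np.mean(y))
    for _ in range(n_iter):
        r = y - (X @ w + b)
        s = np.where(np.abs(r) > eps, np.sign(r), 0.0)  # zero inside the eps-tube
        w -= lr * (lam * w - X.T @ s / len(y))
        b += lr * s.mean()
    return w, b

rng = np.random.default_rng(0)
sr = 8000
t = np.arange(sr) / sr  # one second of samples
# Toy harmonic signal standing in for speech (an assumption, not real data).
clean = sum(np.sin(2 * np.pi * f * t) / k
            for k, f in enumerate((200, 400, 800, 1600), start=1))
clean_mfcc = mfcc(clean, sr)

def mfcc_distance(snr_db, noise):
    """Mean per-frame Euclidean distance between noisy and clean MFCCs."""
    noisy_mfcc = mfcc(add_noise(clean, noise, snr_db), sr)
    return float(np.mean(np.linalg.norm(noisy_mfcc - clean_mfcc, axis=1)))

# Sensitivity of MFCCs to noise: the same noise at a lower SNR moves them further.
noise = rng.standard_normal(clean.size)
dist_30db = mfcc_distance(30.0, noise)
dist_0db = mfcc_distance(0.0, noise)

# Toy training set: utterance-level mean MFCCs mapped to placeholder scores.
snrs = np.linspace(0.0, 30.0, 16)
X = np.array([mfcc(add_noise(clean, rng.standard_normal(clean.size), s), sr).mean(axis=0)
              for s in snrs])
X = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-8)
y = 1.0 + 4.0 * snrs / 30.0  # placeholder 1-5 scores, NOT subjective ratings
w, b = fit_linear_eps_regressor(X, y)
pred = X @ w + b

print("MFCC distance at 30 dB and 0 dB SNR:", dist_30db, dist_0db)
print("mean absolute training error:", float(np.mean(np.abs(pred - y))))
```

In this sketch each signal is reduced to its frame-averaged MFCC vector before regression, which is only one possible way to obtain a fixed-length feature; the abstract does not say how the paper forms its feature vectors.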



Author information

Authors and Affiliations

School of Computer Engineering, Nanyang Technological University, 639798, Singapore
Manish Narwaria, Weisi Lin, Ian Vince McLoughlin, Sabu Emmanuel & Chia Liang Tien


Editor information

Editors and Affiliations

University of Oldenburg, Germany
Susanne Boll
University of Texas at San Antonio, San Antonio, TX, USA
Qi Tian
Microsoft Research Asia, Beijing, P.R. China
Lei Zhang
Southwest University, Beibei, Chongqing, China
Zili Zhang
School of Engineering and Information Technology, Deakin University, 221 Burwood Highway, Vic, 3125, Australia
Yi-Ping Phoebe Chen


About this paper

Cite this paper

Narwaria, M., Lin, W., McLoughlin, I.V., Emmanuel, S., Tien, C.L. (2010). Non-intrusive Speech Quality Assessment with Support Vector Regression. In: Boll, S., Tian, Q., Zhang, L., Zhang, Z., Chen, YP.P. (eds) Advances in Multimedia Modeling. MMM 2010. Lecture Notes in Computer Science, vol 5916. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11301-7_34


DOI: https://doi.org/10.1007/978-3-642-11301-7_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-11300-0
Online ISBN: 978-3-642-11301-7
eBook Packages: Computer Science, Computer Science (R0)
