Enhanced speaker verification using an adaptive multiple low-rank representation based on the modified adaptive Gaussian mixture model framework

Cluster Computing

Abstract

This paper proposes a new method for calculating the observation-confidence values used in the modified adaptive Gaussian mixture model (GMM) framework for speaker verification. First, an adaptive version of the multiple low-rank representation method is proposed. It uses a weighted decomposition that incorporates prior information about the speech/non-speech content, both to obtain the enhanced speech and to estimate the frame signal-to-noise ratio (SNR) values. Then, a simple sigmoid function converts the frame SNR values into observation-confidence values. To verify the accuracy of the system, we use utterances from the Korean movie You Came From The Stars. The experimental results show that, in noisy environments, the proposed approach achieves greater accuracy than well-known baseline methods such as the GMM-based universal background model, the GMM supervector-based support vector machine (SVM), the i-vector-based SVM, and sparse representation.
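The last two steps of the pipeline (frame SNR estimation, then a sigmoid mapping to a confidence value) can be illustrated with a minimal sketch. The abstract does not give the exact SNR definition or the sigmoid parameters, so everything here is an assumption made for illustration: the function names `frame_snr_db` and `observation_confidence`, the energy-ratio SNR in dB, and the values of `slope` and `midpoint_db`. The low-rank decomposition that would produce the enhanced-speech and noise frames is not shown.

```python
import math


def frame_snr_db(speech_frame, noise_frame, eps=1e-12):
    """Estimate one frame's SNR in dB from two sample sequences.

    Assumption: the enhanced speech and the residual noise for the frame
    are already available (e.g. from some decomposition step).
    """
    speech_energy = sum(x * x for x in speech_frame)
    noise_energy = sum(x * x for x in noise_frame)
    # eps keeps the ratio finite when either frame is silent.
    return 10.0 * math.log10((speech_energy + eps) / (noise_energy + eps))


def observation_confidence(snr_db, slope=0.5, midpoint_db=5.0):
    """Map a frame SNR (dB) to a confidence in [0, 1] with a sigmoid.

    slope and midpoint_db are hypothetical values, not from the paper.
    """
    z = slope * (snr_db - midpoint_db)
    # Numerically stable sigmoid: never exponentiates a large positive number.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


if __name__ == "__main__":
    clean = frame_snr_db([1.0, 1.0], [0.1, 0.1])   # high-SNR frame
    noisy = frame_snr_db([0.1, 0.1], [1.0, 1.0])   # low-SNR frame
    print(observation_confidence(clean), observation_confidence(noisy))
```

Under these assumptions, frames dominated by speech receive a confidence near 1 and frames dominated by noise receive a confidence near 0, so noisy frames would contribute less wherever the confidence is used as a per-frame weight.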

Acknowledgements

This research was supported by the MSIP (Ministry of Science, ICT and Future Planning), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2016-R2718-16-0011) supervised by the IITP (Institute for Information & communications Technology Promotion).

Author information

Corresponding author

Correspondence to Jin Young Kim.

Cite this article

Trinh, T.D., Ma, X., Kim, J.Y. et al. Enhanced speaker verification using an adaptive multiple low-rank representation based on the modified adaptive Gaussian mixture model framework. Cluster Comput 20, 2333–2347 (2017). https://doi.org/10.1007/s10586-017-1051-9

  • DOI: https://doi.org/10.1007/s10586-017-1051-9
