Speech Processing Background

Pathak, Manas A.

doi:10.1007/978-1-4614-4639-2_2

Part of the book series: Springer Theses

Abstract

In this chapter, we review some of the building blocks of speech processing systems. We then discuss the specifics of speaker verification, speaker identification, and speech recognition. We will reuse these constructions when designing privacy-preserving algorithms for these tasks in the remainder of the thesis.

Author information

Authors and Affiliations

Carnegie Mellon University, Forbes Ave. 5000, Pittsburgh, PA, 15213, USA
Manas A. Pathak

Corresponding author

Correspondence to Manas A. Pathak.

About this chapter

Cite this chapter

Pathak, M.A. (2013). Speech Processing Background. In: Privacy-Preserving Machine Learning for Speech Processing. Springer Theses. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-4639-2_2

DOI: https://doi.org/10.1007/978-1-4614-4639-2_2
Published: 26 October 2012
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-4638-5
Online ISBN: 978-1-4614-4639-2
eBook Packages: Engineering (R0)
