Speech Processing Background

Privacy-Preserving Machine Learning for Speech Processing

Part of the book series: Springer Theses

Abstract

In this chapter, we review some of the building blocks of speech processing systems. We then discuss the specifics of speaker verification, speaker identification, and speech recognition. We will reuse these constructions when designing privacy-preserving algorithms for these tasks in the remainder of the thesis.

Author information

Corresponding author

Correspondence to Manas A. Pathak.

Copyright information

© 2013 Springer Science+Business Media New York

About this chapter

Cite this chapter

Pathak, M.A. (2013). Speech Processing Background. In: Privacy-Preserving Machine Learning for Speech Processing. Springer Theses. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-4639-2_2

  • DOI: https://doi.org/10.1007/978-1-4614-4639-2_2

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4614-4638-5

  • Online ISBN: 978-1-4614-4639-2

  • eBook Packages: Engineering, Engineering (R0)
