Abstract
Most conventional features used in speaker authentication are based, in one way or another, on estimates of the spectral envelope, e.g., Mel-scale Filterbank Cepstrum Coefficients (MFCCs), Linear-scale Filterbank Cepstrum Coefficients (LFCCs) and Relative Spectral Perceptual Linear Prediction (RASTA-PLP). This study examines Spectral Subband Centroids (SSCs). Each SSC is the centroid frequency of one subband of the spectrum. SSCs have properties similar to formant frequencies, but each is confined to its own subband. Experiments on the NIST2001 database using SSCs, MFCCs, LFCCs and their combinations by concatenation suggest that SSCs are somewhat more robust than conventional MFCC and LFCC features, and that they are partially complementary to them.
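To make the idea of a "centroid frequency in each subband" concrete, here is a minimal sketch in Python with NumPy. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the filterbank, so the sketch assumes equal-width, non-overlapping rectangular subbands, a Hamming window, and a power-spectrum exponent `gamma` of 1. The function name `spectral_subband_centroids` and all parameter values are hypothetical.

```python
import numpy as np


def spectral_subband_centroids(frame, sample_rate, n_subbands=4, gamma=1.0):
    """Return one centroid frequency (in Hz) per subband for a single frame.

    Assumptions (not taken from the paper): rectangular, equal-width,
    non-overlapping subbands covering 0 Hz to the Nyquist frequency.
    """
    # Power spectrum of the Hamming-windowed frame.
    windowed = frame * np.hamming(len(frame))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)

    # Subband edges, linearly spaced up to Nyquist.
    edges = np.linspace(0.0, sample_rate / 2.0, n_subbands + 1)

    centroids = np.empty(n_subbands)
    for m in range(n_subbands):
        lo, hi = edges[m], edges[m + 1]
        # The last subband also includes the Nyquist bin.
        if m == n_subbands - 1:
            mask = (freqs >= lo) & (freqs <= hi)
        else:
            mask = (freqs >= lo) & (freqs < hi)

        weights = power[mask] ** gamma
        total = weights.sum()
        if total > 0:
            # Power-weighted mean frequency inside the subband.
            centroids[m] = (freqs[mask] * weights).sum() / total
        else:
            # No energy in the subband: fall back to its midpoint.
            centroids[m] = 0.5 * (lo + hi)
    return centroids


# Usage: a 500 Hz sine sampled at 8 kHz, one 256-sample frame.
sample_rate = 8000
t = np.arange(256) / sample_rate
frame = np.sin(2 * np.pi * 500.0 * t)
ssc = spectral_subband_centroids(frame, sample_rate)
```

For this test tone, the first subband (0 to 1000 Hz) yields a centroid close to 500 Hz, which illustrates the formant-like behaviour mentioned above: the centroid follows the dominant spectral peak but can never leave its own subband.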
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Thian, N.P.H., Sanderson, C., Bengio, S. (2004). Spectral Subband Centroids as Complementary Features for Speaker Authentication. In: Zhang, D., Jain, A.K. (eds) Biometric Authentication. ICBA 2004. Lecture Notes in Computer Science, vol 3072. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-25948-0_86
DOI: https://doi.org/10.1007/978-3-540-25948-0_86
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22146-3
Online ISBN: 978-3-540-25948-0
eBook Packages: Springer Book Archive