Parametric Excitation Source Features for Language Identification

Rao, K. Sreenivasa; Nandi, Dipanjan

doi:10.1007/978-3-319-17725-0_4

Part of the book series: SpringerBriefs in Electrical and Computer Engineering (BRIEFSSPEECHTECH)

Abstract

This chapter describes the proposed methods for extracting parametric features at the sub-segmental, segmental and supra-segmental levels to capture language-specific excitation source information. Glottal pulse, spectral and epoch parameters are used to represent the sub-segmental, segmental and supra-segmental information, respectively, present in the excitation source signal. These individual features are then combined at the score level to improve the accuracy of language identification (LID) systems by exploiting the non-overlapping information present among the features.
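
The score-level combination mentioned in the abstract could look roughly like the sketch below. It is only an illustration under stated assumptions: the abstract does not give the fusion rule, so a weighted sum of min-max normalized scores is assumed here, and the function names, weights, language labels and score values are all hypothetical.

```python
def minmax_normalize(scores):
    """Rescale one system's per-language scores to the range [0, 1]."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All languages scored identically: this system carries no preference.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def fuse_scores(system_scores, weights):
    """Weighted sum of normalized per-language scores from several systems.

    Assumption: a simple weighted-sum rule; the chapter's actual rule
    is not stated in the abstract.
    """
    if len(system_scores) != len(weights):
        raise ValueError("need exactly one weight per system")
    total = sum(weights)
    normalized = [minmax_normalize(s) for s in system_scores]
    n_languages = len(normalized[0])
    return [
        sum((w / total) * sys_scores[i] for w, sys_scores in zip(weights, normalized))
        for i in range(n_languages)
    ]


# Hypothetical per-language scores (e.g. log-likelihoods) from three systems.
languages = ["lang_A", "lang_B", "lang_C"]
sub_segmental = [-120.0, -110.0, -130.0]   # glottal pulse parameters
segmental = [-95.0, -99.0, -90.0]          # spectral parameters
supra_segmental = [-60.0, -50.0, -70.0]    # epoch parameters

fused = fuse_scores(
    [sub_segmental, segmental, supra_segmental],
    weights=[0.4, 0.3, 0.3],  # hypothetical weights, not taken from the chapter
)
predicted = languages[fused.index(max(fused))]
print(predicted)
```

Normalizing each system's scores before summing keeps a system with a wider score range from dominating the decision. In practice the weights would be tuned on held-out data.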

Author information

Authors and Affiliations

Indian Institute of Technology Kharagpur, Kharagpur, West Bengal, India
K. Sreenivasa Rao & Dipanjan Nandi

Corresponding author

Correspondence to K. Sreenivasa Rao.

About this chapter

Cite this chapter

Rao, K.S., Nandi, D. (2015). Parametric Excitation Source Features for Language Identification. In: Language Identification Using Excitation Source Features. SpringerBriefs in Electrical and Computer Engineering. Springer, Cham. https://doi.org/10.1007/978-3-319-17725-0_4

DOI: https://doi.org/10.1007/978-3-319-17725-0_4
Published: 16 April 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-17724-3
Online ISBN: 978-3-319-17725-0
eBook Packages: Engineering, Engineering (R0)
