Cepstral Domain Teager Energy for Identifying Perceptually Similar Languages

Patil, Hemant A.; Basu, T. K.

doi:10.1007/978-3-540-77046-6_56

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4815)

Included in the following conference series:

International Conference on Pattern Recognition and Machine Intelligence

Abstract

Language identification (LID) is the task of identifying the unknown language of a test utterance. In this paper, a new feature set, T-MFCC, is developed by combining the Teager Energy Operator (TEO) with the well-known Mel frequency cepstral coefficients (MFCC). The effectiveness of the new feature set is demonstrated on perceptually similar Indian languages such as Hindi and Urdu. A modified polynomial classifier with 2nd- and 3rd-order approximation is used for the LID problem. The results are compared with those of the state-of-the-art MFCC feature set, and T-MFCC is found to be more effective (an average improvement of 21.66%) in the majority of cases. This may be because T-MFCC captures the combined effect of airflow properties in the vocal tract (which are known to be language and speaker dependent) and the human hearing process.
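
To make the TEO component concrete, the following is a minimal sketch of the commonly used discrete form of the operator, psi[x(n)] = x(n)^2 - x(n-1) x(n+1). It illustrates the operator only: the abstract does not state where in the MFCC pipeline the operator is applied, so this is not the T-MFCC extraction procedure itself. The function name `teager_energy` and the test tone parameters are assumptions chosen for illustration.

```python
import math


def teager_energy(x):
    """Discrete Teager energy of a sequence of samples.

    Returns psi[n] = x[n]^2 - x[n-1] * x[n+1] for the interior samples,
    so the result is two samples shorter than the input.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# Illustrative input (assumed values): a pure tone A*cos(omega*n).
amplitude, omega = 0.5, 0.3
tone = [amplitude * math.cos(omega * n) for n in range(64)]
psi = teager_energy(tone)

# For a pure tone the operator is constant and equals (A*sin(omega))^2,
# which is close to (A*omega)^2 when omega is small.
expected = (amplitude * math.sin(omega)) ** 2
```

For a pure tone the output depends on both amplitude and frequency, which is why the operator is often described as tracking the energy needed to generate the signal and not only its squared amplitude.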

Author information

Authors and Affiliations

Dhirubhai Ambani Institute of Information and Communication Technology, DA-IICT, Gandhinagar, Gujarat, India
Hemant A. Patil
Department of Electrical Engineering, Indian Institute of Technology, IIT Kharagpur, West Bengal, India
T. K. Basu

Editor information

Ashish Ghosh, Rajat K. De, Sankar K. Pal

Cite this paper

Patil, H.A., Basu, T.K. (2007). Cepstral Domain Teager Energy for Identifying Perceptually Similar Languages. In: Ghosh, A., De, R.K., Pal, S.K. (eds) Pattern Recognition and Machine Intelligence. PReMI 2007. Lecture Notes in Computer Science, vol 4815. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77046-6_56

DOI: https://doi.org/10.1007/978-3-540-77046-6_56
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-77045-9
Online ISBN: 978-3-540-77046-6
eBook Packages: Computer Science, Computer Science (R0)
