Abstract
Language Identification (LID) is the task of identifying an unknown language from test utterances. In this paper, a new feature set, viz., T-MFCC, is developed by amalgamating the Teager Energy Operator (TEO) with the well-known Mel frequency cepstral coefficients (MFCC). The effectiveness of the newly derived feature set is demonstrated for identifying perceptually similar Indian languages such as Hindi and Urdu. A modified polynomial classifier structure with 2nd- and 3rd-order approximation is used for the LID problem. The results are compared with those of the state-of-the-art feature set, viz., MFCC, and T-MFCC is found to be effective (an average jump of 21.66%) in the majority of cases. This may be because T-MFCC represents the combined effect of the airflow properties in the vocal tract (which are known to be language and speaker dependent) and the human perception process for hearing.
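As an illustration of the TEO named in the abstract, the following is a minimal sketch of the standard discrete-time operator, psi[x(n)] = x(n)^2 - x(n-1) x(n+1). The abstract does not state how the TEO is combined with the MFCC computation, so this sketch covers only the operator itself; the function name `teager_energy` and the test-tone parameters are assumptions made for illustration, not details taken from the paper.

```python
import math


def teager_energy(x):
    """Discrete Teager energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    The operator needs one neighbour on each side, so the result is
    two samples shorter than the input.
    """
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# Illustrative check on a pure tone (hypothetical parameter values).
amplitude = 0.5   # tone amplitude
omega = 0.3       # digital frequency in radians per sample
phase = 0.1       # arbitrary initial phase
tone = [amplitude * math.cos(omega * n + phase) for n in range(200)]

psi = teager_energy(tone)

# For A*cos(omega*n + phase) the operator output is the constant
# A^2 * sin^2(omega), independent of n and of the phase.
expected = amplitude ** 2 * math.sin(omega) ** 2
```

For a pure tone the output depends on both amplitude and frequency, which is why the operator is often described as tracking the energy of the source that produced the signal and not just the squared amplitude of the signal.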
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Patil, H.A., Basu, T.K. (2007). Cepstral Domain Teager Energy for Identifying Perceptually Similar Languages. In: Ghosh, A., De, R.K., Pal, S.K. (eds) Pattern Recognition and Machine Intelligence. PReMI 2007. Lecture Notes in Computer Science, vol 4815. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77046-6_56
DOI: https://doi.org/10.1007/978-3-540-77046-6_56
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-77045-9
Online ISBN: 978-3-540-77046-6
eBook Packages: Computer Science, Computer Science (R0)