Cepstral Domain Teager Energy for Identifying Perceptually Similar Languages

  • Hemant A. Patil
  • T. K. Basu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4815)

Abstract

Language Identification (LID) refers to the task of identifying an unknown language from test utterances. In this paper, a new feature set, viz., T-MFCC, is developed by amalgamating the Teager Energy Operator (TEO) and the well-known Mel frequency cepstral coefficients (MFCC). The effectiveness of the newly derived feature set is demonstrated for identifying perceptually similar Indian languages such as Hindi and Urdu. A modified structure of the polynomial classifier of 2nd and 3rd order approximation has been used for the LID problem. The results have been compared with those of the state-of-the-art feature set, viz., MFCC, and T-MFCC is found to be effective (an average jump of 21.66%) in the majority of cases. This may be because T-MFCC represents the combined effect of airflow properties in the vocal tract (which are known to be language and speaker dependent) and the human perception process for hearing.
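As a rough illustration of two building blocks named in the abstract, the sketch below computes the discrete Teager energy of a signal in Kaiser's form, ψ[x(n)] = x(n)² − x(n−1)·x(n+1), and expands a feature vector into all monomials up to a given order, which is the basis expansion used by polynomial classifiers. It is not the authors' T-MFCC pipeline, whose exact stages are not given here; the function names `teager_energy` and `polynomial_expansion` and the test signal are assumptions made for this example.

```python
import math
from itertools import combinations_with_replacement


def teager_energy(x):
    """Discrete Teager energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns len(x) - 2 values, because the first and last samples
    lack one of the two neighbours the operator needs.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def polynomial_expansion(v, order):
    """All monomials of the entries of v up to `order`, led by the constant 1.

    Hypothetical helper: for v = [a, b] and order 2 it yields
    [1, a, b, a*a, a*b, b*b].
    """
    terms = [1.0]
    for degree in range(1, order + 1):
        for idx in combinations_with_replacement(range(len(v)), degree):
            terms.append(math.prod(v[i] for i in idx))
    return terms


# Illustrative input (an assumption): a pure tone A*cos(omega*n).
amplitude, omega = 2.0, 0.3
tone = [amplitude * math.cos(omega * n) for n in range(50)]
psi = teager_energy(tone)

# Second-order expansion of a toy two-dimensional feature vector.
expanded = polynomial_expansion([2.0, 3.0], 2)
```

For a pure tone the operator returns the constant A²·sin²(ω), so `psi` tracks both amplitude and frequency, which is one reason the TEO is regarded as sensitive to properties that a plain squared-amplitude energy misses. In a polynomial classifier, a vector such as `expanded` would then be scored linearly against a per-class weight vector.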

Keywords

Vocal Tract; Speaker Recognition; Average Success Rate; Similar Language; Test Utterance
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Hemant A. Patil (1)
  • T. K. Basu (2)
  1. Dhirubhai Ambani Institute of Information and Communication Technology (DA-IICT), Gandhinagar, Gujarat, India
  2. Department of Electrical Engineering, Indian Institute of Technology (IIT) Kharagpur, West Bengal, India
