Indian language identification using time-frequency texture features and kernel ELM

Birajdar, Gajanan K.; Raveendran, Smitha

doi:10.1007/s12652-022-03781-5

Original Research
Published: 15 March 2022

Journal of Ambient Intelligence and Humanized Computing
Volume 14, pages 13237–13250 (2023)

Abstract

Precise identification of the language of a speech utterance is the prime task of a language identification system, and such systems are used extensively in multilingual speech applications. This article presents an Indian language identification system that uses textural descriptors extracted from time-frequency visual representations. Conventional LPC and MFCC feature extraction approaches for language identification have limited detection accuracy. In the first step, the input speech signal is converted into spectrogram, MFCC, and cochleagram image representations. Each of these visual representations can be treated as a texture image that characterizes energy variations in different frequency bands over time. The second step extracts completed local binary pattern (CLBP), local phase quantization (LPQ), and Weber local descriptor (WLD) textural features from the visual representations. Finally, a kernel extreme learning machine (KELM) classifier identifies the language-specific class label. The proposed algorithm is validated on the IIIT-H Indic speech databases, covering seven Indian languages from the Indo-Aryan and Dravidian families. The experimental results indicate that the proposed time-frequency texture descriptor method outperforms the other machine learning algorithms it was compared with.
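The three steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a plain STFT log-spectrogram stands in for the paper's time-frequency images, a basic 8-neighbour local binary pattern histogram stands in for the CLBP, LPQ, and WLD descriptors, and the classifier uses the standard closed-form kernel ELM solution with an RBF kernel. All function names, the frame length, hop size, C, and gamma are assumptions chosen for illustration, and the signals are synthetic rather than drawn from the IIIT-H databases.

```python
import numpy as np

def log_spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude STFT scaled to an 8-bit image (frequency x time)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    log_mag = np.log1p(np.abs(np.fft.rfft(frames, axis=1))).T
    scaled = (log_mag - log_mag.min()) / (log_mag.max() - log_mag.min() + 1e-12)
    return np.round(255 * scaled).astype(np.uint8)

def lbp_histogram(image):
    """Basic 3x3 LBP codes, returned as an L1-normalized 256-bin histogram.
    (Simplified stand-in for the CLBP/LPQ/WLD descriptors named in the text.)"""
    img = image.astype(np.int32)
    h, w = img.shape
    center = img[1:-1, 1:-1]
    codes = np.zeros_like(center)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neighbour >= center).astype(np.int32) << bit
    hist = np.bincount(codes.ravel(), minlength=256).astype(float)
    return hist / hist.sum()

class KernelELM:
    """Kernel ELM with an RBF kernel: beta = (K + I/C)^-1 T."""

    def __init__(self, C=100.0, gamma=1.0):
        self.C = C          # regularization strength (assumed value)
        self.gamma = gamma  # RBF kernel width (assumed value)

    def _kernel(self, A, B):
        sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2 * A @ B.T
        return np.exp(-self.gamma * np.maximum(sq, 0.0))

    def fit(self, X, y):
        self.X_ = np.asarray(X, float)
        self.classes_ = np.unique(y)
        # One-vs-rest targets in {-1, +1}, one column per class.
        T = np.where(np.asarray(y)[:, None] == self.classes_[None, :], 1.0, -1.0)
        K = self._kernel(self.X_, self.X_)
        self.beta_ = np.linalg.solve(K + np.eye(len(K)) / self.C, T)
        return self

    def predict(self, X):
        scores = self._kernel(np.asarray(X, float), self.X_) @ self.beta_
        return self.classes_[scores.argmax(axis=1)]

# Synthetic stand-ins for two "languages": noisy tones vs. modulated noise.
rng = np.random.default_rng(0)
t = np.arange(4000) / 8000.0
signals = [np.sin(2 * np.pi * f * t) + 0.1 * rng.standard_normal(t.size)
           for f in (300, 320)]
signals += [rng.standard_normal(t.size) * np.sin(2 * np.pi * 5 * t)
            for _ in range(2)]
labels = np.array([0, 0, 1, 1])

image = log_spectrogram(signals[0])   # step 1: time-frequency image
features = lbp_histogram(image)       # step 2: texture descriptor
X = np.stack([lbp_histogram(log_spectrogram(s)) for s in signals])
model = KernelELM(C=100.0, gamma=10.0).fit(X, labels)  # step 3: KELM
predicted = model.predict(X)
```

Because KELM training reduces to solving one regularized linear system, there are no hidden-layer weights to tune iteratively; in this sketch only C and the kernel width gamma need to be selected, for example by cross-validation.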

Data Availability Statement

The data underlying this article were provided by IIIT-H Indic Speech Databases by permission.

Author information

Authors and Affiliations

Department of Electronics Engineering, Ramrao Adik Institute of Technology, DY Patil Deemed to be University, Navi Mumbai, Maharashtra, 400706, India
Gajanan K. Birajdar & Smitha Raveendran

Corresponding author

Correspondence to Gajanan K. Birajdar.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Birajdar, G.K., Raveendran, S. Indian language identification using time-frequency texture features and kernel ELM. J Ambient Intell Human Comput 14, 13237–13250 (2023). https://doi.org/10.1007/s12652-022-03781-5

Received: 08 June 2021
Accepted: 25 February 2022
Published: 15 March 2022
Issue Date: October 2023
DOI: https://doi.org/10.1007/s12652-022-03781-5
