Abstract
Spoken language identification (LID) is the process of determining and classifying the natural language of a given spoken content or data set. The first step is to extract useful features from the data; this is a mature step, and the standard LID features have been developed using Mel-frequency cepstral coefficients (MFCC), shifted delta cepstral coefficients (SDC), the Gaussian mixture model (GMM) and the i-vector-based framework. However, the learning process still needs to be optimised so that the knowledge embedded in the extracted features is captured fully. The extreme learning machine (ELM) is an effective learning model for classification and regression that trains a single-hidden-layer neural network. However, its learning process is not fully optimised because the weights between the input and hidden layers are selected randomly. This study uses ELM as the LID learning model, built on the standard extracted features. The enhanced self-adjusting extreme learning machine (ESA–ELM), one of the optimisation techniques for ELM, is chosen as the benchmark and is improved by replacing its optimisation approach, EATLBO, with particle swarm optimisation (PSO) in order to achieve higher performance. The improved ESA–ELM is named the particle swarm optimisation–extreme learning machine (PSO–ELM). Results on the same benchmark data set of eight languages show that PSO–ELM LID performs better, with an accuracy of 98.75% compared with 96.25% for ESA–ELM LID.
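The idea summarised in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a sigmoid hidden layer, output weights computed with the Moore–Penrose pseudo-inverse (the basic ELM step), and a textbook PSO in which each particle encodes one candidate set of input weights and biases. The function names, the PSO coefficients, the use of validation accuracy as the fitness, and the synthetic three-class data are all assumptions made for illustration; the paper's features and eight-language data set are not reproduced here.

```python
import numpy as np

def hidden(X, W, b):
    """Sigmoid hidden-layer output H for input weights W and biases b."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def elm_train(X, T, W, b):
    """Basic ELM step: output weights are the least-squares solution pinv(H) @ T."""
    return np.linalg.pinv(hidden(X, W, b)) @ T

def accuracy(X, y, W, b, beta):
    """Fraction of samples whose arg-max network output matches the label."""
    return float(np.mean(np.argmax(hidden(X, W, b) @ beta, axis=1) == y))

def pso_elm(Xtr, ytr, Xva, yva, n_hidden=10, n_particles=10, n_iter=20,
            inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Search ELM input weights with PSO (hypothetical settings, not the paper's)."""
    rng = np.random.default_rng(seed)
    d, k = Xtr.shape[1], int(ytr.max()) + 1
    T = np.eye(k)[ytr]                      # one-hot targets
    dim = d * n_hidden + n_hidden           # each particle encodes (W, b)

    def fitness(p):
        W, b = p[:d * n_hidden].reshape(d, n_hidden), p[d * n_hidden:]
        beta = elm_train(Xtr, T, W, b)
        return accuracy(Xva, yva, W, b, beta)  # assumed fitness: validation accuracy

    # A plain ELM corresponds to evaluating one random particle and stopping.
    pos = rng.uniform(-1, 1, (n_particles, dim))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_fit = np.array([fitness(p) for p in pos])
    init_best = float(pbest_fit.max())
    g, g_fit = pbest[pbest_fit.argmax()].copy(), init_best

    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        # Standard PSO update: inertia + pull towards personal and global bests.
        vel = inertia * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (g - pos)
        pos = np.clip(pos + vel, -1, 1)
        fit = np.array([fitness(p) for p in pos])
        improved = fit > pbest_fit
        pbest[improved], pbest_fit[improved] = pos[improved], fit[improved]
        if pbest_fit.max() > g_fit:
            g_fit = float(pbest_fit.max())
            g = pbest[pbest_fit.argmax()].copy()
    return g, g_fit, init_best

# Synthetic three-class data standing in for real LID feature vectors.
rng = np.random.default_rng(1)
centres = np.array([[0, 0], [2, 2], [0, 3]])
y = np.repeat(np.arange(3), 60)
X = centres[y] + rng.normal(scale=1.0, size=(180, 2))
idx = rng.permutation(180)
tr, va = idx[:120], idx[120:]

best_particle, best_acc, init_acc = pso_elm(X[tr], y[tr], X[va], y[va])
print(f"best random start: {init_acc:.3f}, after PSO: {best_acc:.3f}")
```

By construction the global best never gets worse, so the reported accuracy after PSO is at least that of the best random initialisation. Because the fitness is measured on the same validation split that is reported, a held-out test set would be needed for an unbiased estimate.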
Availability of Data and Materials
The data are available on Figshare at https://doi.org/10.6084/m9.figshare.6015173.v1.
Acknowledgements
The Malaysian government funded this project under research code DCP-2017-013/6.
Ethics declarations
Conflict of interest
All the authors declare that they have no conflict of interest.
Code Availability
The source code is not yet publicly available since this project is still ongoing.
Cite this article
Albadr, M.A.A., Tiun, S. Spoken Language Identification Based on Particle Swarm Optimisation–Extreme Learning Machine Approach. Circuits Syst Signal Process 39, 4596–4622 (2020). https://doi.org/10.1007/s00034-020-01388-9
DOI: https://doi.org/10.1007/s00034-020-01388-9