Exploration of diverse intelligent approaches in speech recognition systems

International Journal of Speech Technology

Abstract

Artificial intelligence is transforming the industrial sector and driving the move toward a smart world. Over the past few years, demand for real-time automatic speech recognition has grown in embedded devices and smartphone applications. Research on automatic speech recognition remains challenging because of environmental noise, especially non-stationary noise. Over the past decades, robust machine learning models have been widely developed for speech recognition applications. Research now focuses mostly on deep learning approaches to improve performance. Deep learning models eliminate the complexity of designing separate feature extraction steps and classification models that earlier approaches required. This article presents a detailed view of the research models developed for automatic speech recognition and their advantages, along with the deep learning frameworks available for exploring future work.

Author information

Corresponding author

Correspondence to Iwin Thanakumar Joseph Swamidason.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Swamidason, I.T.J., Tatiparthi, S., Arul Xavier, V.M. et al. Exploration of diverse intelligent approaches in speech recognition systems. Int J Speech Technol 26, 1–10 (2023). https://doi.org/10.1007/s10772-020-09769-w

  • DOI: https://doi.org/10.1007/s10772-020-09769-w
