Automatic Speech Recognition of Continuous Speech Signal of Gujarati Language Using Machine Learning

  • Conference paper
Mathematical Modeling, Computational Intelligence Techniques and Renewable Energy

Abstract

This work addresses automatic recognition of continuous speech in the Gujarati language using a machine learning (ML) technique. Words are first extracted from the continuous speech signal of each sentence using the short-term autocorrelation (STAC) method. Because the signal for each word is large, its dimension is reduced with a feature extraction algorithm, the mel-frequency discrete wavelet coefficient (MFDWC). An ML algorithm is then trained on these features to recognize the speech.
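
The abstract does not spell out the STAC procedure, so the following is only a minimal sketch of how word extraction by short-term autocorrelation could look, not the authors' implementation. It computes a lag-1 autocorrelation per frame and treats runs of frames above a threshold as words. The function names, the frame length, the hop, the lag, the threshold, and the synthetic two-burst "sentence" are all assumptions made for illustration.

```python
import numpy as np

def short_term_autocorr(signal, frame_len, hop, lag=1):
    """Lag-`lag` autocorrelation of each frame, one value per frame."""
    n_frames = max(0, 1 + (len(signal) - frame_len) // hop)
    values = np.empty(n_frames)
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len]
        # Correlate the frame with a copy of itself shifted by `lag` samples.
        values[i] = np.dot(frame[: frame_len - lag], frame[lag:])
    return values

def find_segments(activity, threshold):
    """Return (start_frame, end_frame) pairs, end exclusive, where activity > threshold."""
    active = activity > threshold
    # Pad with False so every active run has both a rising and a falling edge.
    padded = np.concatenate(([False], active, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    return [(int(s), int(e)) for s, e in zip(edges[0::2], edges[1::2])]

# Hypothetical test signal: two 200 Hz bursts ("words") separated by silence.
fs = 8000                      # assumed sampling rate in Hz
frame_len = hop = 200          # assumed non-overlapping 25 ms frames
silence = np.zeros(1000)
tone = np.sin(2 * np.pi * 200 * np.arange(2000) / fs)
sentence = np.concatenate([silence, tone, silence, tone, silence])

stac = short_term_autocorr(sentence, frame_len, hop, lag=1)
segments = find_segments(stac, threshold=1.0)   # threshold is an assumption
words = [sentence[s * hop : e * hop] for s, e in segments]
print(segments, [len(w) for w in words])
```

On real recordings the threshold would need tuning against the noise floor, and overlapping frames are the more usual choice. Each extracted word would then be passed to the feature extraction stage.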

Author information

Corresponding author

Correspondence to Priyank Makwana.

Copyright information

© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Pandit, P., Makwana, P., Bhatt, S. (2021). Automatic Speech Recognition of Continuous Speech Signal of Gujarati Language Using Machine Learning. In: Sahni, M., Merigó, J.M., Jha, B.K., Verma, R. (eds) Mathematical Modeling, Computational Intelligence Techniques and Renewable Energy. Advances in Intelligent Systems and Computing, vol 1287. Springer, Singapore. https://doi.org/10.1007/978-981-15-9953-8_13
