Automatic Speech Recognition Analysis Over Wireless Networks

  • Conference paper
Intelligent Data Engineering and Analytics (FICTA 2022)

Abstract

This paper presents the effects of speech coders on speech recognition performance. We evaluate our Amazigh speech recognition system over a wireless network, using a configurable platform that combines automatic speech recognition (ASR) and interactive voice response (IVR) technologies. The experiments vary several parameters, including the VoIP audio codec, the hidden Markov models (HMMs) and the Gaussian mixture models (GMMs). The system is trained and tested on the first ten digits, using data collected from 24 native speakers of Tarifit. The VoIP codecs used in this work are G.711, GSM and Speex, all carried over the SIP protocol. Our results show that the best performance, 84.14%, is achieved with the GSM audio codec.
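To give a sense of how a speech coder can alter the signal before it reaches the recognizer, the following minimal sketch applies mu-law companding, the principle behind one variant of G.711, to a synthetic tone and measures the round-trip error. It is an illustration only, not the authors' platform: it uses the continuous mu-law formula rather than the segmented encoding of the actual standard, and the function names and the test tone are assumptions introduced here.

```python
import math

MU = 255  # companding parameter of the mu-law variant of G.711


def mu_law_encode(x: float) -> int:
    """Compress a sample in [-1, 1] and quantize it to an 8-bit code (0..255)."""
    x = max(-1.0, min(1.0, x))  # clip to the valid input range
    y = math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)
    return int(round((y + 1.0) / 2.0 * 255))


def mu_law_decode(code: int) -> float:
    """Expand an 8-bit code back to an approximate sample in [-1, 1]."""
    y = code / 255 * 2.0 - 1.0
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def round_trip(samples):
    """Encode and decode every sample, as a lossy channel would."""
    return [mu_law_decode(mu_law_encode(s)) for s in samples]


# Hypothetical test signal: a 440 Hz tone sampled at 8 kHz, the G.711 rate.
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(80)]
decoded = round_trip(tone)

# Largest absolute difference between the original and the decoded samples.
max_error = max(abs(a - b) for a, b in zip(tone, decoded))
print(f"max round-trip error: {max_error:.4f}")
```

The residual error is the kind of codec-induced distortion whose effect on recognition the paper evaluates; GSM and Speex use different, model-based coding schemes that this sketch does not cover.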

Corresponding author

Correspondence to Mohamed Hamidi.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Hamidi, M., Zealouk, O., Satori, H. (2023). Automatic Speech Recognition Analysis Over Wireless Networks. In: Bhateja, V., Yang, XS., Chun-Wei Lin, J., Das, R. (eds) Intelligent Data Engineering and Analytics. FICTA 2022. Smart Innovation, Systems and Technologies, vol 327. Springer, Singapore. https://doi.org/10.1007/978-981-19-7524-0_44
