
Effect of Codec Bit Rate and Packet Loss on Thai Speech Recognition over IP

  • Conference paper
Advances in Information Technology (IAIT 2013)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 409)


Abstract

VoIP has become a core means of communication on the Internet. One of its crucial applications is the automated interactive voice response (IVR) system, which interacts with callers automatically. Speech recognition plays an important role in such systems. This paper studies the effect of codec bit rate and network packet loss on Thai speech recognition over an IP network. We encoded speech samples of male, female, and artificial voices at various bit rates of the Speex codec. The samples were sent via RTP through an IP network with simulated packet loss. Speech quality was measured with PESQ and compared with the word error rate of the speech recognizer. The results show that both the codec bit rate and the level of packet loss have a significant impact on the performance of speech recognition over IP.
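The two quantities varied and measured in the abstract, packet loss and word error rate, can be illustrated with a minimal Python sketch. It is not the authors' setup: the function names `word_error_rate` and `drop_packets` are hypothetical, the loss model is an assumed independent (Bernoulli) drop per packet rather than the network emulator used in the paper, and the transcripts are assumed to be already segmented into space-separated words (Thai text is not normally written with spaces between words, so a segmentation step would be needed first).

```python
import random


def word_error_rate(reference, hypothesis):
    """Word error rate: word-level edit distance divided by reference length.

    Assumes both strings are already tokenized into space-separated words.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def drop_packets(packets, loss_rate, seed=0):
    """Drop each packet independently with probability loss_rate.

    This Bernoulli model is an assumption made for illustration; real
    networks often lose packets in bursts.
    """
    rng = random.Random(seed)
    return [p for p in packets if rng.random() >= loss_rate]


if __name__ == "__main__":
    # Hypothetical transcripts, used only to exercise the function.
    print(word_error_rate("call the front desk", "call a front desk"))
    # 100 placeholder packet payloads with an assumed 10% loss rate.
    received = drop_packets(list(range(100)), loss_rate=0.10)
    print(len(received), "of 100 packets received")
```

In an experiment of the kind described, the surviving packets would be decoded and passed to the recognizer, and the recognizer's output would be scored against the reference transcript with a function like `word_error_rate`.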




Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Triyason, T., Kanthamanon, P. (2013). Effect of Codec Bit Rate and Packet Loss on Thai Speech Recognition over IP. In: Papasratorn, B., Charoenkitkarn, N., Vanijja, V., Chongsuphajaisiddhi, V. (eds) Advances in Information Technology. IAIT 2013. Communications in Computer and Information Science, vol 409. Springer, Cham. https://doi.org/10.1007/978-3-319-03783-7_21


  • DOI: https://doi.org/10.1007/978-3-319-03783-7_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-03782-0

  • Online ISBN: 978-3-319-03783-7

  • eBook Packages: Computer Science, Computer Science (R0)
