Distributed Speech Recognition Standards

  • David Pearce

This chapter provides an overview of the industry standards for Distributed Speech Recognition (DSR) developed in ETSI, 3GPP and IETF. These standards were created to ensure interoperability between the feature extraction running on a client device and a compatible recogniser running on a remote server. They are intended for use in implementing commercial speech and multimodal services over mobile networks. Substantial performance testing was conducted while the standards were being developed and agreed, and those results are also summarised here. While other chapters provide more general information about feature extraction and channel error processing for DSR, this chapter focuses on introducing the specifics of the standards.

Copyright information

© Springer-Verlag London Limited 2008

Authors and Affiliations

  • David Pearce (1)
  1. Applications Research Centre, Motorola Labs, Basingstoke, UK
