Chapter

Automatic Speech Recognition on Mobile Devices and over Communication Networks

Part of the series Advances in Pattern Recognition pp 1-23

Network, Distributed and Embedded Speech Recognition: An Overview

  • Zheng-Hua Tan, Department of Electronic Systems, Aalborg University
  • Imre Varga, Corporate Technology, Siemens AG

As mobile devices become smaller and more pervasive, the design of efficient user interfaces is rapidly developing into a major issue. The expectation of speech-centric interfaces has stimulated great interest in deploying automatic speech recognition (ASR) on mobile phones, PDAs and in automobiles. Mobile devices have limited computational power, memory size and battery life, whereas state-of-the-art ASR systems are computationally intensive. To circumvent these restrictions, a great deal of effort has been spent on enabling efficient ASR implementation on embedded platforms, primarily through fixed-point arithmetic and algorithm optimisation for low computational complexity and a small memory footprint. The restrictions can also be largely bypassed from the architecture side: distributed speech recognition (DSR) splits ASR processing into client-based feature extraction and server-based recognition. The relief of the computational burden on mobile devices, however, comes at the cost of exposure to network degradations and of additional components such as feature quantisation, error recovery and concealment. An alternative to DSR is network speech recognition (NSR), which uses a conventional speech coder to transmit speech from client to server. Over the past decade, these areas have undergone substantial development. This chapter gives a comprehensive overview of the areas and discusses the pros and cons of the different approaches. The optimal choice depends on the complexity of the ASR components, the resources available on the device and in the network, and the location of the associated applications.
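
The DSR split described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the chapter: the function names, the frame sizes, the use of a single log-energy value per frame as a stand-in for real ASR features, and the 8-bit uniform scalar quantiser. Deployed DSR front-ends typically use cepstral features and more elaborate quantisation and error-protection schemes.

```python
import math

# Hypothetical quantiser range for log-energy features (an assumption).
LO, HI = -25.0, 5.0
LEVELS = 255  # 8 bits per feature value

def frame_log_energies(samples, frame_len=200, hop=80):
    """Client side: one log-energy value per frame (stand-in for real features)."""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        feats.append(math.log(energy + 1e-10))
    return feats

def quantise(feats):
    """Client side: clip each feature to [LO, HI] and map it to one byte."""
    codes = []
    for f in feats:
        clipped = min(max(f, LO), HI)
        codes.append(round((clipped - LO) / (HI - LO) * LEVELS))
    return bytes(codes)

def dequantise(payload):
    """Server side: recover approximate feature values from the byte payload."""
    return [LO + code / LEVELS * (HI - LO) for code in payload]

if __name__ == "__main__":
    # One second of a 440 Hz tone sampled at 8 kHz (synthetic test input).
    samples = [0.5 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(8000)]
    feats = frame_log_energies(samples)
    payload = quantise(feats)      # what the client would transmit
    restored = dequantise(payload)  # what the server-side recogniser would consume
    worst = max(abs(a - b) for a, b in zip(feats, restored))
    print(f"{len(payload)} bytes sent, worst-case feature error {worst:.4f}")
```

In this toy setting the client transmits one byte per 10 ms frame instead of the waveform itself, which is the bandwidth and computation trade-off that motivates DSR; the server only ever sees the quantised features, so any recognition loss comes from quantisation and transmission errors rather than from a speech codec.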