A Comparison of Microphone and Speech Recognition Engine Efficacy for Mobile Data Entry

  • Joanna Lumsden
  • Scott Durling
  • Irina Kondratova
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5333)

Abstract

The research presented in this paper is part of an ongoing investigation into how best to incorporate speech-based input within mobile data collection applications. In our previous work [1], we evaluated the ability of a single speech recognition engine to support accurate, mobile, speech-based data input. Here, we build on our previous research to compare the achievable speaker-independent accuracy rates of a variety of speech recognition engines; we also consider the relative effectiveness of different speech recognition engine and microphone pairings in terms of their ability to support accurate text entry under realistic mobile conditions of use. Our intent is to provide some initial empirical data derived from mobile, user-based evaluations to support technological decisions faced by developers of mobile applications that would benefit from, or require, speech-based data entry facilities.

Keywords

mobile speech input; microphone efficacy; speech recognition accuracy/efficacy; mobile technology; mobile evaluation

References

  1. Lumsden, J., Kondratova, I., Durling, S.: Investigating Microphone Efficacy for Facilitation of Mobile Speech-Based Data Entry. In: Proceedings of British HCI 2007 Conference, Lancaster, UK, September 3-7, pp. 89–98 (2007)
  2. Price, K., Lin, M., Feng, J., Goldman, R., Sears, A., Jacko, J.: Data Entry on the Move: An Examination of Nomadic Speech-Based Text Entry. In: Stary, C., Stephanidis, C. (eds.) UI4ALL 2004, vol. 3196, pp. 460–471. Springer, Heidelberg (2004)
  3. Sawhney, N., Schmandt, C.: Nomadic Radio: Speech and Audio Interaction for Contextual Messaging in Nomadic Environments. ACM Transactions on Computer-Human Interaction 7(3), 353–383 (2000)
  4. Ward, K., Novick, D.: Hands-Free Documentation. In: Proceedings of 21st Annual International Conference on Documentation (SIGDoc 2003), San Francisco, USA, October 12-15, pp. 147–154 (2003)
  5. Oviatt, S.: Taming Recognition Errors with a Multimodal Interface. Communications of the ACM 43(9), 45–51 (2000)
  6. Lumsden, J., Kondratova, I., Langton, N.: Bringing A Construction Site Into The Lab: A Context-Relevant Lab-Based Evaluation Of A Multimodal Mobile Application. In: Proceedings of 1st International Workshop on Multimodal and Pervasive Services (MAPS 2006), Lyon, France, June 29, pp. 62–68 (2006)
  7. Sammon, M., Brotman, L., Peebles, E., Seligmann, D.: MACCS: Enabling Communications for Mobile Workers within Healthcare Environments. In: Proceedings of 8th International Conference on Human Computer Interaction with Mobile Devices and Services (MobileHCI 2006), Helsinki, Finland, September 12-15, pp. 41–44 (2006)
  8. Sebastian, D.: Development of a Field-Deployable Voice-Controlled Ultrasound Scanner System, M.Sc. Thesis, Dept. of Electrical and Computer Engineering, Worcester Polytechnic Institute, Worcester, MA, USA (2004)
  9. Vinciguerra, B.: A Comparison of Commercial Speech Recognition Components for Use with the Project54 System, M.Sc. Thesis, Dept. of Electrical Engineering, University of New Hampshire, Durham, NH, USA (2002)
  10. Pick, H., Siegel, G., Fox, P., Garber, S., Kearney, J.: Inhibiting the Lombard Effect. Journal of the Acoustical Society of America 85(2), 894–900 (1989)
  11. Rollins, A.: Speech Recognition and Manner of Speaking in Noise and in Quiet. In: Proceedings of Conference on Human Factors in Computing Systems (CHI 1985), San Francisco, USA, April 14-18, pp. 197–199 (1985)
  12. Chang, J.: Speech Recognition System Robustness to Microphone Variations, M.Sc. Thesis, Dept. of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA (1995)
  13. NextLink, Invisio Pro, http://www.nextlink.se/
  14.
  15.
  16.
  17.
  18.
  19. Microsoft, Windows Desktop Speech Technology, http://msdn.microsoft.com/en-us/library/system.speech.recognition.aspx
  20. Nuance, Dragon Naturally Speaking, http://www.nuance.com/naturallyspeaking/sdk/client/

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Joanna Lumsden
    • 1
  • Scott Durling
    • 1
  • Irina Kondratova
    • 1
  1. National Research Council of Canada, IIT e-Business, Fredericton, Canada
