SpeakerSense: Energy Efficient Unobtrusive Speaker Identification on Mobile Phones

  • Conference paper
Pervasive Computing (Pervasive 2011)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 6696))

Abstract

Automatically identifying the person you are talking with using continuous audio sensing has the potential to enable many pervasive computing applications, from memory assistance to annotating life-logging data. However, a number of challenges, including energy efficiency and training data acquisition, must be addressed before unobtrusive audio sensing is practical on mobile devices. We built SpeakerSense, a speaker identification prototype that uses a heterogeneous multi-processor hardware architecture that splits computation between a low-power processor and the phone’s application processor to enable continuous background sensing with minimal power requirements. Using SpeakerSense, we benchmarked several system parameters (sampling rate, GMM complexity, smoothing window size, and amount of training data needed) to identify thresholds that balance computation cost with performance. We also investigated channel compensation methods that make it feasible to acquire training data from phone calls and an automatic segmentation method for training speaker models based on one-to-one conversations.
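The abstract names two of the benchmarked parameters, GMM complexity and smoothing window size, without showing how they interact. The sketch below is a minimal illustration of that idea and not the SpeakerSense implementation: it scores each feature frame against one diagonal-covariance Gaussian mixture model per speaker, then smooths the per-frame decisions with a majority vote over a trailing window. The function names, the one-dimensional toy features, the majority-vote rule, and the speaker names "alice" and "bob" are all assumptions made for illustration.

```python
import math
from collections import Counter


def diag_gmm_loglik(x, weights, means, variances):
    """Log-likelihood of one feature vector x under a diagonal-covariance GMM."""
    comp_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            log_p += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
        comp_logs.append(log_p)
    # Log-sum-exp over mixture components for numerical stability.
    m = max(comp_logs)
    return m + math.log(sum(math.exp(c - m) for c in comp_logs))


def classify_frames(frames, speaker_models):
    """Pick the highest-scoring speaker model for each frame independently."""
    labels = []
    for x in frames:
        scores = {name: diag_gmm_loglik(x, *gmm)
                  for name, gmm in speaker_models.items()}
        labels.append(max(scores, key=scores.get))
    return labels


def smooth_labels(labels, window):
    """Majority vote over a trailing window of per-frame decisions."""
    smoothed = []
    for i in range(len(labels)):
        recent = labels[max(0, i - window + 1): i + 1]
        smoothed.append(Counter(recent).most_common(1)[0][0])
    return smoothed


if __name__ == "__main__":
    # Hypothetical toy models: (weights, means, variances), one component each.
    models = {
        "alice": ([1.0], [[0.0]], [[1.0]]),
        "bob": ([1.0], [[4.0]], [[1.0]]),
    }
    frames = [[0.1], [0.3], [3.0], [-0.2], [0.5]]
    raw = classify_frames(frames, models)
    print(raw)
    print(smooth_labels(raw, window=3))
```

In this toy run the third frame is an outlier that the per-frame classifier assigns to the other speaker, and a window of three frames votes it back. More mixture components raise the per-frame scoring cost, while a longer window delays the reaction to a real speaker change, which is the kind of cost-versus-performance balance the abstract describes.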

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lu, H., Bernheim Brush, A.J., Priyantha, B., Karlson, A.K., Liu, J. (2011). SpeakerSense: Energy Efficient Unobtrusive Speaker Identification on Mobile Phones. In: Lyons, K., Hightower, J., Huang, E.M. (eds) Pervasive Computing. Pervasive 2011. Lecture Notes in Computer Science, vol 6696. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21726-5_12

  • DOI: https://doi.org/10.1007/978-3-642-21726-5_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-21725-8

  • Online ISBN: 978-3-642-21726-5

  • eBook Packages: Computer Science, Computer Science (R0)
