Visage: A Face Interpretation Engine for Smartphone Applications

  • Conference paper
Mobile Computing, Applications, and Services (MobiCASE 2012)

Abstract

Smartphones are powerful mobile computing devices that enable a wide variety of new applications and opportunities for human interaction, sensing and communications. Because smartphones come with front-facing cameras, users can now interact with and drive applications through their facial responses, enabling participatory and opportunistic face-aware applications. This paper presents the design, implementation and evaluation of a robust, real-time face interpretation engine for smartphones, called Visage, that enables a new class of face-aware applications. Visage fuses data streams from the phone’s front-facing camera and built-in motion sensors to infer, in an energy-efficient manner, the user’s 3D head pose (i.e., the pitch, roll and yaw of the user’s head with respect to the phone) and facial expressions (e.g., happy, sad, angry). Visage supports a set of novel sensing, tracking, and machine learning algorithms on the phone, which are specifically designed to deal with the challenges presented by user mobility, varying phone contexts, and resource limitations. Results demonstrate that Visage is effective in different real-world scenarios. We also developed two proof-of-concept applications driven by Visage: Streetview+ and Mood Profiler.
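
The abstract describes head pose as three angles: pitch, roll and yaw. As a minimal sketch of what those angles encode, and not the Visage implementation (which the abstract does not detail), the Python below converts between the three angles and a 3D rotation matrix. The ZYX axis convention (yaw about z, pitch about y, roll about x) and the function names `rotation_zyx` and `euler_from_rotation` are assumptions made for illustration only.

```python
import math


def rotation_zyx(yaw, pitch, roll):
    """Build R = Rz(yaw) * Ry(pitch) * Rx(roll) as a 3x3 nested list.

    Angles are in radians. The ZYX convention is an assumption of this
    sketch, not something stated in the paper's abstract.
    """
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cr, sr = math.cos(roll), math.sin(roll)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]


def euler_from_rotation(R):
    """Recover (yaw, pitch, roll) in radians from a ZYX rotation matrix.

    Valid only while |pitch| < 90 degrees; at exactly +/-90 degrees yaw
    and roll are no longer separable (gimbal lock).
    """
    # Clamp to [-1, 1] to guard against floating-point drift before asin.
    pitch = -math.asin(max(-1.0, min(1.0, R[2][0])))
    yaw = math.atan2(R[1][0], R[0][0])
    roll = math.atan2(R[2][1], R[2][2])
    return yaw, pitch, roll


if __name__ == "__main__":
    # Hypothetical pose: head turned 20 degrees, tilted down 10, rolled 5.
    angles = [math.radians(a) for a in (20.0, -10.0, 5.0)]
    R = rotation_zyx(*angles)
    recovered = euler_from_rotation(R)
    print([round(math.degrees(a), 3) for a in recovered])
```

A pose estimator typically produces a rotation matrix (or an equivalent representation) relating the head to the camera; reading off three angles as above is one common way to report it, under whichever axis convention the system adopts.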

Copyright information

© 2013 ICST Institute for Computer Science, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Yang, X., You, CW., Lu, H., Lin, M., Lane, N.D., Campbell, A.T. (2013). Visage: A Face Interpretation Engine for Smartphone Applications. In: Uhler, D., Mehta, K., Wong, J.L. (eds) Mobile Computing, Applications, and Services. MobiCASE 2012. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 110. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36632-1_9

  • DOI: https://doi.org/10.1007/978-3-642-36632-1_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-36631-4

  • Online ISBN: 978-3-642-36632-1

  • eBook Packages: Computer Science, Computer Science (R0)
