Abstract
Smartphones are powerful mobile computing devices that enable a wide variety of new applications and opportunities for human interaction, sensing and communications. Because smartphones come with front-facing cameras, users can now interact with and drive applications through their facial responses, which enables participatory and opportunistic face-aware applications. This paper presents the design, implementation and evaluation of Visage, a robust, real-time face interpretation engine that enables a new class of face-aware applications for smartphones. Visage fuses data streams from the phone’s front-facing camera and built-in motion sensors to infer, in an energy-efficient manner, the user’s 3D head pose (i.e., the pitch, roll and yaw of the user’s head with respect to the phone) and facial expressions (e.g., happy, sad, angry). Visage supports a set of novel sensing, tracking, and machine learning algorithms on the phone, which are specifically designed to deal with the challenges presented by user mobility, varying phone contexts, and resource limitations. Results demonstrate that Visage is effective in different real-world scenarios. Furthermore, we developed two distinct proof-of-concept applications, Streetview+ and Mood Profiler, both driven by Visage.
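To make the head-pose output concrete, the following minimal sketch shows how a pitch, roll and yaw triple relates to a 3D rotation of the head with respect to the phone. It is an illustration only, not the pose estimation method used by Visage: the function names are hypothetical, and the axis convention (rotation composed as Rz(yaw) · Ry(pitch) · Rx(roll)) is an assumption that the abstract does not specify.

```python
import math


def rotation_from_ypr(yaw, pitch, roll):
    """Build a 3x3 rotation matrix R = Rz(yaw) * Ry(pitch) * Rx(roll).

    Angles are in radians. The axis convention is an assumption made
    for this sketch, not something taken from the paper.
    """
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cr, sr = math.cos(roll), math.sin(roll)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]


def ypr_from_rotation(R):
    """Recover (yaw, pitch, roll) in radians from a rotation matrix.

    Inverse of rotation_from_ypr. It is only well defined away from
    pitch = +/-90 degrees, where yaw and roll become ambiguous.
    """
    # Clamp to [-1, 1] to guard against small floating-point overshoot.
    sin_pitch = max(-1.0, min(1.0, -R[2][0]))
    pitch = math.asin(sin_pitch)
    roll = math.atan2(R[2][1], R[2][2])
    yaw = math.atan2(R[1][0], R[0][0])
    return yaw, pitch, roll


# Example: a head turned 20 degrees (yaw), tilted down 10 degrees (pitch)
# and leaning 5 degrees (roll) relative to the phone.
pose_deg = (20.0, -10.0, 5.0)
R = rotation_from_ypr(*(math.radians(a) for a in pose_deg))
recovered_deg = tuple(math.degrees(a) for a in ypr_from_rotation(R))
```

Under this convention, `recovered_deg` matches `pose_deg` up to floating-point error, so the three angles and the rotation matrix are interchangeable descriptions of the same pose for moderate head rotations.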
Copyright information
© 2013 ICST Institute for Computer Science, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Yang, X., You, CW., Lu, H., Lin, M., Lane, N.D., Campbell, A.T. (2013). Visage: A Face Interpretation Engine for Smartphone Applications. In: Uhler, D., Mehta, K., Wong, J.L. (eds) Mobile Computing, Applications, and Services. MobiCASE 2012. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 110. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36632-1_9
DOI: https://doi.org/10.1007/978-3-642-36632-1_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-36631-4
Online ISBN: 978-3-642-36632-1
eBook Packages: Computer Science (R0)