User-Independent Face Landmark Detection and Tracking for Spatial AR Interaction

  • Youngkyoon Jang
  • Eunah Jung
  • Sung Sil Kim
  • Jeongmin Yu
  • Woontack Woo
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9749)


We present novel face landmark detection and tracking methods that are independent of facial differences between users, intended for Spatial Augmented Reality (SAR) interaction. The proposed methods do not require a preliminary general face model to detect or track landmarks. Our contributions are: (i) fast face landmark detection based on our modified Latent Regression Forest (LRF), and (ii) model-independent face landmark tracking that revises outliers using the direction and displacement of neighboring landmarks. We also discuss (iii) feature enhancements based on RGB and depth images that support several interaction scenarios in SAR environments. We anticipate that the proposed methods will enable several interesting SAR interaction scenarios, even under severe head orientation, without requiring users to wear any devices.
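The idea behind contribution (ii) can be illustrated with a minimal sketch. The abstract does not state the actual criteria, so everything below is an assumption made for illustration: the function name `revise_outliers`, the use of the k nearest landmarks as "neighbors", the median neighbor displacement as the reference motion, and the angle and length thresholds. A landmark whose frame-to-frame motion disagrees with that reference in direction or displacement is moved to the position the neighbors' motion predicts.

```python
import math
from statistics import median


def revise_outliers(prev_pts, curr_pts, k=3, max_angle_deg=45.0,
                    max_len_diff=5.0, min_motion=1e-6):
    """Hypothetical outlier revision for tracked 2D landmarks.

    prev_pts, curr_pts: lists of (x, y) positions in consecutive frames.
    Returns (revised_pts, outlier_indices). All thresholds are assumed
    values, not taken from the paper.
    """
    n = len(prev_pts)
    # Frame-to-frame displacement of every landmark.
    disp = [(c[0] - p[0], c[1] - p[1]) for p, c in zip(prev_pts, curr_pts)]
    revised = list(curr_pts)
    outliers = []
    for i in range(n):
        # k nearest neighbors, measured in the previous frame.
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: math.dist(prev_pts[i], prev_pts[j]))
        nbrs = order[:k]
        # Median neighbor displacement serves as a robust reference motion.
        ref = (median(disp[j][0] for j in nbrs),
               median(disp[j][1] for j in nbrs))
        own_len = math.hypot(*disp[i])
        ref_len = math.hypot(*ref)
        # Displacement test: the motion length differs too much.
        bad_length = abs(own_len - ref_len) > max_len_diff
        # Direction test: only meaningful when both motions are non-zero.
        bad_direction = False
        if own_len > min_motion and ref_len > min_motion:
            cos_a = (disp[i][0] * ref[0] + disp[i][1] * ref[1]) / (own_len * ref_len)
            angle = math.degrees(math.acos(max(-1.0, min(1.0, cos_a))))
            bad_direction = angle > max_angle_deg
        if bad_length or bad_direction:
            # Move the landmark to where its neighbors' motion predicts it.
            revised[i] = (prev_pts[i][0] + ref[0], prev_pts[i][1] + ref[1])
            outliers.append(i)
    return revised, outliers


if __name__ == "__main__":
    prev = [(0, 0), (10, 0), (0, 10), (10, 10), (5, 5)]
    curr = [(2, 1), (12, 1), (2, 11), (12, 11), (-15, 20)]  # last point drifts
    print(revise_outliers(prev, curr))
```

Using the median rather than the mean keeps a single drifting neighbor from corrupting the reference motion of the landmarks around it. No general face model is involved, which matches the model-independent framing of the abstract.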


Keywords: Face landmark detection, Face landmark tracking, Random forest, Virtual reality, Computer vision



This work was supported by the DMC R&D Center of Samsung Electronics Co.



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Youngkyoon Jang (1)
  • Eunah Jung (2)
  • Sung Sil Kim (3)
  • Jeongmin Yu (1)
  • Woontack Woo (1, 3), corresponding author

  1. CTRI & AHRC, KAIST, Daejeon, South Korea
  2. School of Computing, KAIST, Daejeon, South Korea
  3. GSCT, KAIST, Daejeon, South Korea
