The Phong Surface: Efficient 3D Model Fitting Using Lifted Optimization

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12346)

Abstract

Real-time perceptual and interaction capabilities in mixed reality require a range of 3D tracking problems to be solved at low latency on resource-constrained hardware such as head-mounted devices. Indeed, for devices such as HoloLens 2, where the CPU and GPU are left available for applications, multiple tracking subsystems must run continuously in real time while sharing a single Digital Signal Processor. To solve model-fitting problems for HoloLens 2 hand tracking, where the computational budget is approximately 100 times smaller than that of an iPhone 7, we introduce a new surface model: the ‘Phong surface’. Using ideas from computer graphics, the Phong surface describes the same 3D shape as a triangulated mesh model, but with continuous surface normals which enable the use of lifting-based optimization, providing significant efficiency gains over ICP-based methods. We show that Phong surfaces retain the convergence benefits of smoother surface models, while triangle meshes do not.
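The core computer-graphics idea the abstract alludes to is Phong shading: even though each triangle of a mesh is flat, a normal field that varies smoothly across the surface can be obtained by barycentrically interpolating the unit normals stored at the vertices. The sketch below is a minimal, hypothetical illustration of that interpolation (the function name, example triangle, and normals are our own, not from the paper):

```python
import numpy as np

def phong_normal(vertex_normals, bary):
    """Barycentrically interpolate unit vertex normals over one triangle,
    then renormalize. This yields a normal that varies continuously with
    the barycentric coordinates, unlike the piecewise-constant face normal
    of a plain triangle mesh."""
    n = bary @ vertex_normals        # weighted sum of the three vertex normals
    return n / np.linalg.norm(n)     # project back onto the unit sphere

# Hypothetical triangle: vertex normals tilted slightly away from +z.
vn = np.array([[0.0,         0.0,         1.0],
               [np.sin(0.3), 0.0,         np.cos(0.3)],
               [0.0,         np.sin(0.3), np.cos(0.3)]])

# At a vertex (bary = one-hot), the interpolated normal is the vertex normal.
print(phong_normal(vn, np.array([1.0, 0.0, 0.0])))

# In the interior it blends smoothly between the three vertex normals.
print(phong_normal(vn, np.array([1/3, 1/3, 1/3])))
```

A continuously varying normal like this is what makes the surface differentiable enough for lifted optimization, where surface correspondences are treated as free variables and optimized jointly with the model pose, rather than being re-estimated by closest-point search at every iteration as in ICP.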

Keywords

Model fitting · Optimization · Hand tracking · Pose estimation

Supplementary material

Supplementary material 1 (PDF, 1.2 MB)


Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Microsoft Mixed Reality & AI Labs, Cambridge, UK
