Combining 3D Model Contour Energy and Keypoints for Object Tracking

  • Bogdan Bugaev
  • Anton Kryshchenko
  • Roman Belov
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11216)


We present a new combined approach for monocular model-based 3D tracking. A preliminary object pose is estimated using a keypoint-based technique. The pose is then refined by optimizing a contour energy function. The energy measures how well the contour of the model projection corresponds to the image edges. It is calculated from both the intensity and the orientation of the raw image gradient. For optimization, we propose a technique and search-area constraints that make it possible to escape local optima and to take into account the information obtained through keypoint-based pose estimation. Owing to its combined nature, our method eliminates numerous issues of keypoint-based and edge-based approaches. We demonstrate the efficiency of our method by comparing it with state-of-the-art methods on a public benchmark dataset that includes videos with various lighting conditions, movement patterns, and speeds.
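The two-stage idea in the abstract (a rough pose first, then refinement against image edges inside a constrained search area) can be illustrated with a toy example. This is a minimal sketch under strong assumptions, not the authors' formulation: the pose is reduced to a 2D integer shift, the energy is assumed to be the negative mean of |∇I · n| over contour samples (so it depends on both gradient magnitude and orientation), and the optimizer is replaced by an exhaustive search in a small window around the initial guess. All function names are hypothetical.

```python
def make_square_image(size, lo, hi):
    """Synthetic grayscale image: 1.0 inside the square [lo, hi] x [lo, hi], else 0.0."""
    return [[1.0 if lo <= x <= hi and lo <= y <= hi else 0.0
             for x in range(size)] for y in range(size)]


def square_contour(lo, hi):
    """Contour samples (x, y, nx, ny) with outward unit normals for a square model."""
    samples = []
    for t in range(lo, hi + 1):
        samples += [(lo, t, -1.0, 0.0), (hi, t, 1.0, 0.0),
                    (t, lo, 0.0, -1.0), (t, hi, 0.0, 1.0)]
    return samples


def gradient(img, x, y):
    """Central-difference image gradient at pixel (x, y), clamped at the borders."""
    h, w = len(img), len(img[0])
    x0, x1 = max(x - 1, 0), min(x + 1, w - 1)
    y0, y1 = max(y - 1, 0), min(y + 1, h - 1)
    return (img[y][x1] - img[y][x0]) / 2.0, (img[y1][x] - img[y0][x]) / 2.0


def contour_energy(img, contour, shift):
    """Assumed energy: negative mean of |gradient . normal| over shifted samples.

    Lower values mean the projected contour lies on strong edges whose
    direction agrees with the contour normal.
    """
    h, w = len(img), len(img[0])
    total = 0.0
    for px, py, nx, ny in contour:
        x, y = int(round(px + shift[0])), int(round(py + shift[1]))
        if 0 <= x < w and 0 <= y < h:  # samples outside the image contribute 0
            gx, gy = gradient(img, x, y)
            total += abs(gx * nx + gy * ny)
    return -total / len(contour)


def refine_shift(img, contour, initial, radius):
    """Exhaustive search for the lowest energy within `radius` of `initial`.

    The window plays the role of a search-area constraint centred on the
    preliminary (keypoint-based) estimate; it is a stand-in for a real optimizer.
    """
    best, best_energy = initial, contour_energy(img, contour, initial)
    for dx in range(-radius, radius + 1):
        for dy in range(-radius, radius + 1):
            candidate = (initial[0] + dx, initial[1] + dy)
            energy = contour_energy(img, contour, candidate)
            if energy < best_energy:
                best, best_energy = candidate, energy
    return best, best_energy
```

For instance, with `img = make_square_image(20, 8, 12)` and `contour = square_contour(5, 9)`, the true offset between model and image is (3, 3) by construction, and `refine_shift(img, contour, (2, 4), 2)` starts from a slightly wrong preliminary estimate and searches only its neighbourhood. A full 6-DoF tracker would replace the shift with a rigid-body pose and the grid search with a continuous optimizer.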


Keywords: 3D tracking, Monocular, Model-based, Pose estimation




Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. National Research University Higher School of Economics, St. Petersburg, Russia
  2. Saint Petersburg Academic University, St. Petersburg, Russia
  3. KeenTools, St. Petersburg, Russia
