Real-Time Monocular Segmentation and Pose Tracking of Multiple Objects

  • Henning Tjaden
  • Ulrich Schwanecke
  • Elmar Schömer
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9908)


We present a real-time system capable of segmenting multiple 3D objects and tracking their pose using a single RGB camera, based on prior shape knowledge. The proposed method uses twist coordinates for pose parametrization and a pixel-wise second-order optimization approach, which together lead to major improvements in tracking robustness compared to previous region-based approaches, especially in cases of fast motion and scale changes. Without relying on GPGPU computations, our implementation runs at about 50–100 Hz on a commodity laptop when tracking a single object. We compare our method to the current state of the art in various experiments involving challenging motion sequences and different complex objects.
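
To illustrate what a twist-coordinate pose parametrization means in practice, the following minimal sketch maps a six-vector of twist coordinates to a 4x4 rigid-body transform using the standard closed-form exponential map. The function names (`hat`, `twist_to_matrix`) and the ordering of the coordinates (rotation part first, then translation part) are assumptions made for this example; this is not the authors' implementation.

```python
import math


def hat(w):
    """Skew-symmetric matrix of a 3-vector w, so that hat(w) * x = w x x (cross product)."""
    wx, wy, wz = w
    return [[0.0, -wz, wy],
            [wz, 0.0, -wx],
            [-wy, wx, 0.0]]


def matmul3(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def twist_to_matrix(xi):
    """Exponential map from twist coordinates to a 4x4 homogeneous transform.

    Assumed ordering (hypothetical): xi = (w1, w2, w3, v1, v2, v3),
    with w the rotational part and v the translational part.
    """
    w, v = list(xi[:3]), list(xi[3:])
    theta = math.sqrt(sum(c * c for c in w))
    if theta < 1e-8:
        # Taylor expansions avoid dividing by a near-zero rotation angle.
        a = 1.0 - theta ** 2 / 6.0
        b = 0.5 - theta ** 2 / 24.0
        c = 1.0 / 6.0 - theta ** 2 / 120.0
    else:
        a = math.sin(theta) / theta
        b = (1.0 - math.cos(theta)) / theta ** 2
        c = (theta - math.sin(theta)) / theta ** 3
    W = hat(w)
    W2 = matmul3(W, W)
    eye = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    # Rotation: R = I + a*W + b*W^2 (Rodrigues' formula).
    R = [[eye[i][j] + a * W[i][j] + b * W2[i][j] for j in range(3)]
         for i in range(3)]
    # Translation: t = (I + b*W + c*W^2) * v.
    V = [[eye[i][j] + b * W[i][j] + c * W2[i][j] for j in range(3)]
         for i in range(3)]
    t = [sum(V[i][j] * v[j] for j in range(3)) for i in range(3)]
    return [R[0] + [t[0]],
            R[1] + [t[1]],
            R[2] + [t[2]],
            [0.0, 0.0, 0.0, 1.0]]


if __name__ == "__main__":
    # A quarter turn about the z-axis combined with no translational part.
    for row in twist_to_matrix((0.0, 0.0, math.pi / 2, 0.0, 0.0, 0.0)):
        print(row)
```

A parametrization of this kind uses six unconstrained numbers per object, so an iterative optimizer can update the pose without having to re-orthonormalize a rotation matrix after each step.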


Tracking, Segmentation, Real-time, Monocular, Pose estimation, Model-based, Shape knowledge



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Henning Tjaden (1)
  • Ulrich Schwanecke (1)
  • Elmar Schömer (2)
  1. Computer Science Department, RheinMain University of Applied Sciences, Wiesbaden, Germany
  2. Institute of Computer Science, Johannes Gutenberg University Mainz, Mainz, Germany
