BOP: Benchmark for 6D Object Pose Estimation

  • Tomáš Hodaň
  • Frank Michel
  • Eric Brachmann
  • Wadim Kehl
  • Anders Glent Buch
  • Dirk Kraft
  • Bertram Drost
  • Joel Vidal
  • Stephan Ihrke
  • Xenophon Zabulis
  • Caner Sahin
  • Fabian Manhardt
  • Federico Tombari
  • Tae-Kyun Kim
  • Jiří Matas
  • Carsten Rother
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11214)

Abstract

We propose a benchmark for 6D pose estimation of a rigid object from a single RGB-D input image. The training data consists of a texture-mapped 3D object model or images of the object in known 6D poses. The benchmark comprises: (i) eight datasets in a unified format that cover different practical scenarios, including two new datasets focusing on varying lighting conditions, (ii) an evaluation methodology with a pose-error function that deals with pose ambiguities, (iii) a comprehensive evaluation of 15 diverse recent methods that captures the status quo of the field, and (iv) an online evaluation system that is open for continuous submission of new results. The evaluation shows that methods based on point-pair features currently perform best, outperforming template-matching methods, learning-based methods, and methods based on 3D local features. The project website is available at bop.felk.cvut.cz.
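
The pose-error function referenced in (ii) is the Visible Surface Discrepancy (VSD), which scores a pose only on the visible part of the object surface and is therefore tolerant to pose ambiguities caused by symmetries and occlusion. The sketch below illustrates how such an error can be computed from renderings; the function name vsd_error, the NumPy-based inputs (distance maps and visibility masks rendered for the estimated and ground-truth poses), and the handling of an empty mask union are illustrative assumptions, not the benchmark's reference implementation.

    import numpy as np

    def vsd_error(dist_est, dist_gt, visib_est, visib_gt, tau):
        # dist_est/dist_gt: distance maps rendered with the estimated and
        # ground-truth object pose; visib_est/visib_gt: boolean visibility
        # masks of the model surface in the test image; tau: misalignment
        # tolerance in the same units as the distance maps.
        union = visib_est | visib_gt              # visible in either rendering
        inter = visib_est & visib_gt              # visible in both renderings
        aligned = np.abs(dist_est - dist_gt) < tau
        cost = np.ones(dist_est.shape)            # a pixel in the union costs 1...
        cost[inter & aligned] = 0.0               # ...unless the surfaces match within tau
        return cost[union].mean() if union.any() else 0.0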

Acknowledgements

We gratefully acknowledge Manolis Lourakis, Joachim Staib, Christoph Kick, Juil Sock and Pavel Haluza for their help. This work was supported by CTU student grant SGS17/185/OHK3/3T/13, Technology Agency of the Czech Republic research program TE01020415 (V3C – Visual Computing Competence Center), and GAČR project No. 16-072105 (Complex network methods applied to ancient Egyptian data in the Old Kingdom, 2700–2180 BC).

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Tomáš Hodaň (1)
  • Frank Michel (2)
  • Eric Brachmann (3)
  • Wadim Kehl (4)
  • Anders Glent Buch (5)
  • Dirk Kraft (5)
  • Bertram Drost (6)
  • Joel Vidal (7)
  • Stephan Ihrke (2)
  • Xenophon Zabulis (8)
  • Caner Sahin (9)
  • Fabian Manhardt (10)
  • Federico Tombari (10)
  • Tae-Kyun Kim (9)
  • Jiří Matas (1)
  • Carsten Rother (3)

  1. CTU in Prague, Prague, Czech Republic
  2. TU Dresden, Dresden, Germany
  3. Heidelberg University, Heidelberg, Germany
  4. Toyota Research Institute, Los Altos, USA
  5. University of Southern Denmark, Odense, Denmark
  6. MVTec Software, Munich, Germany
  7. Taiwan Tech, Taipei, Taiwan
  8. FORTH, Heraklion, Greece
  9. Imperial College London, London, UK
  10. TU Munich, Munich, Germany
