ClearPose: Large-scale Transparent Object Dataset and Benchmark

Conference paper in Computer Vision – ECCV 2022 (ECCV 2022)

Abstract

Transparent objects are ubiquitous in household settings and pose distinct challenges for visual sensing and perception systems. The optical properties of transparent objects render conventional 3D sensors unreliable on their own for object depth and pose estimation. These challenges are highlighted by the shortage of large-scale RGB-Depth datasets focusing on transparent objects in real-world settings. In this work, we contribute a large-scale real-world RGB-Depth transparent object dataset named ClearPose to serve as a benchmark for segmentation, scene-level depth completion, and object-centric pose estimation tasks. The ClearPose dataset contains over 350K labeled real-world RGB-Depth frames and 5M instance annotations covering 63 household objects. The dataset includes object categories commonly used in daily life under various lighting and occluding conditions, as well as challenging test scenarios such as occlusion by opaque or translucent objects, non-planar orientations, and the presence of liquids. We benchmark several state-of-the-art depth completion and object pose estimation deep neural networks on ClearPose. The dataset and benchmarking source code are available at https://github.com/opipari/ClearPose.
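One of the benchmarked tasks is scene-level depth completion, which is commonly scored by error metrics computed only over valid ground-truth pixels (raw depth sensors return zeros or holes on transparent surfaces). As a minimal sketch of such a metric, the function below computes masked root-mean-square error between a completed depth map and ground truth; the function name, units, and synthetic arrays are illustrative assumptions, not the paper's exact evaluation code.

```python
import numpy as np

def depth_rmse(pred, gt, valid_mask=None):
    """RMSE between predicted and ground-truth depth, computed
    only over valid (nonzero ground-truth) pixels."""
    if valid_mask is None:
        valid_mask = gt > 0  # zero depth = missing measurement
    diff = pred[valid_mask] - gt[valid_mask]
    return float(np.sqrt(np.mean(diff ** 2)))

# Synthetic example: ground truth in millimetres with one missing
# (zero) pixel, and a prediction offset by 5 mm everywhere.
gt = np.array([[1000.0, 0.0],
               [1200.0, 1100.0]])
pred = gt + 5.0
print(depth_rmse(pred, gt))  # 5.0 over the three valid pixels
```

The masking step matters in practice: including hole pixels in the average would reward predictions that are trivially zero over transparent regions, which is exactly the failure mode the benchmark is meant to expose.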



Acknowledgement

We greatly thank Dr. Peter Gaskell and Weishu Wu at the University of Michigan for providing devices and objects for dataset collection.

Author information

Corresponding author: Xiaotong Chen


Electronic supplementary material

Supplementary material 1 (PDF, 12,282 KB)


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Chen, X., Zhang, H., Yu, Z., Opipari, A., Chadwicke Jenkins, O. (2022). ClearPose: Large-scale Transparent Object Dataset and Benchmark. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13668. Springer, Cham. https://doi.org/10.1007/978-3-031-20074-8_22


  • DOI: https://doi.org/10.1007/978-3-031-20074-8_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-20073-1

  • Online ISBN: 978-3-031-20074-8

