Segmenting Transparent Objects in the Wild

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12358)


Transparent objects such as glass windows and bottles are common in the real world. Segmenting them is challenging because their appearance is largely inherited from the image background, which makes them look similar to their surroundings. Besides the technical difficulty of the task, only a few datasets have been designed and collected specifically for it, and most of them have major drawbacks: they either have a limited sample size, such as merely a thousand images without manual annotations, or consist entirely of images generated with computer graphics (i.e., not real images). To address this problem, this work proposes a large-scale dataset for transparent object segmentation, named Trans10K, consisting of 10,428 images of real scenes with careful manual annotations, which is 10 times larger than existing datasets. The transparent objects in Trans10K are extremely challenging due to their high diversity in scale, viewpoint and occlusion. To evaluate the effectiveness of Trans10K, we propose a novel boundary-aware segmentation method, termed TransLab, which exploits boundaries as a clue to improve the segmentation of transparent objects. Extensive experiments and ablation studies demonstrate the effectiveness of Trans10K and validate the practicality of learning object boundaries in TransLab. For example, TransLab significantly outperforms 20 recent deep-learning-based object segmentation methods, showing that this task is largely unsolved. We believe that both Trans10K and TransLab make important contributions to both academia and industry, facilitating future research and applications. The code and models will be released at:
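
The abstract does not say how boundary supervision is obtained. As an illustration only, and not the authors' procedure, the following minimal sketch assumes that a boundary map can be derived from a binary annotation mask by marking every object pixel that touches background. The names `boundary_from_mask`, `toy_mask` and `toy_boundary` are hypothetical.

```python
from typing import List

Mask = List[List[int]]  # assumption: 1 = transparent object, 0 = background


def boundary_from_mask(mask: Mask) -> Mask:
    """Mark object pixels that have a background pixel in their 4-neighbourhood."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    boundary = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if mask[y][x] != 1:
                continue
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                # Positions outside the image are ignored, so an object cut off
                # by the image border gets no boundary along that border.
                if 0 <= ny < height and 0 <= nx < width and mask[ny][nx] == 0:
                    boundary[y][x] = 1
                    break
    return boundary


# Toy example: a 3x3 object inside a 5x5 image.
toy_mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
toy_boundary = boundary_from_mask(toy_mask)
```

In this toy case only the centre pixel of the object is left unmarked; the eight pixels around it form the boundary. A practical pipeline would more likely use a morphological operation from an image-processing library, and might thicken the boundary by a few pixels.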


Keywords: Transparent objects · Dataset · Benchmark · Image segmentation · Object boundary



This work is partially supported by the SenseTime Donation for Research, HKU Seed Fund for Basic Research, Startup Fund and General Research Fund No. 27208720.




Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. The University of Hong Kong, Hong Kong, China
  2. SenseTime Research, Hong Kong, China
  3. Nanjing University, Nanjing, China
  4. The University of Adelaide, Adelaide, Australia
