DOOBNet: Deep Object Occlusion Boundary Detection from an Image

  • Conference paper
Computer Vision – ACCV 2018 (ACCV 2018)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11366)

Abstract

Object occlusion boundary detection is a fundamental and crucial research problem in computer vision. It is challenging because training an object occlusion boundary detector involves extreme boundary/non-boundary class imbalance. In this paper, we propose to address this class imbalance by up-weighting the loss contribution of false negative and false positive examples with our novel Attention Loss function. We also propose a unified end-to-end multi-task deep object occlusion boundary detection network (DOOBNet) that shares convolutional features to simultaneously predict object boundaries and occlusion orientation. DOOBNet adopts an encoder-decoder structure with skip connections in order to automatically learn multi-scale and multi-level features. We significantly surpass the state of the art on the PIOD dataset (ODS F-score of .702) and the BSDS ownership dataset (ODS F-score of .555), and improve the detection speed to 0.037 s per image on the PIOD dataset.
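
The idea of up-weighting false negatives and false positives can be illustrated with a small per-pixel loss. The sketch below is an assumption for illustration only, not the paper's Attention Loss formula, which this abstract does not state: it scales a class-weighted binary cross-entropy by a hypothetical modulating factor `beta ** ((1 - p_t) ** gamma)`, where `p_t` is the probability assigned to the true class. The names `modulating_factor` and `weighted_bce` and the default values of `alpha`, `beta`, and `gamma` are likewise assumed.

```python
import math

def modulating_factor(p, y, beta=4.0, gamma=0.5):
    """Hypothetical up-weighting term: grows as the true class gets less probability."""
    p_t = p if y == 1 else 1.0 - p          # probability assigned to the true class
    return beta ** ((1.0 - p_t) ** gamma)   # always >= 1 when beta > 1

def weighted_bce(p, y, alpha=0.5, beta=4.0, gamma=0.5, eps=1e-7):
    """Class-weighted binary cross-entropy for one pixel, scaled by the factor above.

    p: predicted boundary probability, y: 1 for boundary, 0 for non-boundary.
    alpha weights boundary pixels and (1 - alpha) weights non-boundary pixels.
    """
    p = min(max(p, eps), 1.0 - eps)         # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p
    class_weight = alpha if y == 1 else 1.0 - alpha
    return -class_weight * modulating_factor(p, y, beta, gamma) * math.log(p_t)

if __name__ == "__main__":
    # A missed boundary (false negative) versus a confidently detected one.
    print(weighted_bce(0.1, 1), weighted_bce(0.9, 1))
```

Under these assumptions, a boundary pixel predicted with low probability and a non-boundary pixel predicted with high probability both receive a larger factor than correctly classified pixels, so they contribute more to the total loss. Setting `beta=1.0` removes the factor and recovers plain class-weighted cross-entropy.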

Notes

  1. The statistics come from the PIOD dataset.

  2. The conv block refers to a convolution layer followed by batch normalization (BN) [14] and ReLU activation.
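
The conv block described in note 2 can be sketched in plain Python. This is a minimal single-channel illustration under stated assumptions, not the network's actual layer: the helper names `conv2d`, `batch_norm`, `relu`, and `conv_block` are hypothetical, the convolution uses no padding or stride, and the learnable scale and shift of batch normalization are omitted.

```python
import math

def conv2d(image, kernel):
    """Valid 2-D cross-correlation of one single-channel image with one kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

def batch_norm(batch, eps=1e-5):
    """Normalize one channel to zero mean and unit variance over the whole batch."""
    values = [v for fmap in batch for row in fmap for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    scale = 1.0 / math.sqrt(var + eps)
    return [[[(v - mean) * scale for v in row] for row in fmap] for fmap in batch]

def relu(batch):
    """Clamp negative activations to zero."""
    return [[[max(0.0, v) for v in row] for row in fmap] for fmap in batch]

def conv_block(batch, kernel):
    """Convolution, then batch normalization, then ReLU, in that order."""
    return relu(batch_norm([conv2d(image, kernel) for image in batch]))
```

In a deep learning framework the same ordering would be expressed with the framework's own convolution, normalization, and activation layers, each operating over many channels.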

References

  1. Badrinarayanan, V., Kendall, A., Cipolla, R.: SegNet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39(12), 2481–2495 (2017)

  2. Chen, L.C., Papandreou, G., Kokkinos, I., Murphy, K., Yuille, A.L.: DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 40(4), 834–848 (2018)

  3. Cooper, M.C.: Interpreting line drawings of curved objects with tangential edges and surfaces. Image Vis. Comput. 15(4), 263–276 (1997)

  4. Dollár, P., Zitnick, C.L.: Fast edge detection using structured forests. IEEE Trans. Pattern Anal. Mach. Intell. 37(8), 1558–1570 (2015)

  5. Fu, H., Wang, C., Tao, D., Black, M.J.: Occlusion boundary detection via deep exploration of context. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 241–250 (2016)

  6. Gao, T., Packer, B., Koller, D.: A segmentation-aware object detection model with occlusion handling. In: 2011 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1361–1368. IEEE (2011)

  7. Girshick, R.: Fast R-CNN. In: International Conference on Computer Vision (2015)

  8. He, K., Zhang, X., Ren, S., Sun, J.: Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1026–1034 (2015)

  9. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)

  10. He, X., Yuille, A.: Occlusion boundary detection using pseudo-depth. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010. LNCS, vol. 6314, pp. 539–552. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-15561-1_39

  11. Hoiem, D., Stein, A.N., Efros, A.A., Hebert, M.: Recovering occlusion boundaries from a single image. In: IEEE 11th International Conference on Computer Vision, ICCV 2007, pp. 1–8. IEEE (2007)

  12. Hu, X., Liu, Y., Wang, K., Ren, B.: Learning hybrid convolutional features for edge detection. Neurocomputing 313, 377–385 (2018)

  13. Hwang, J.J., Liu, T.L.: Pixel-wise deep learning for contour detection. arXiv preprint arXiv:1504.01989 (2015)

  14. Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167 (2015)

  15. Jia, Y., et al.: Caffe: convolutional architecture for fast feature embedding. arXiv preprint arXiv:1408.5093 (2014)

  16. Kokkinos, I.: Pushing the boundaries of boundary detection using deep learning. arXiv preprint arXiv:1511.07386 (2015)

  17. Leichter, I., Lindenbaum, M.: Boundary ownership by lifting to 2.1D. In: 2009 IEEE 12th International Conference on Computer Vision, pp. 9–16. IEEE (2009)

  18. Lin, T.Y., Goyal, P., Girshick, R., He, K., Dollár, P.: Focal loss for dense object detection. arXiv preprint arXiv:1708.02002 (2017)

  19. Liu, Y., Lew, M.S.: Learning relaxed deep supervision for better edge detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 231–240 (2016)

  20. Liu, Y., Cheng, M.M., Bian, J., Zhang, L., Jiang, P.T., Cao, Y.: Semantic edge detection with diverse deep supervision. arXiv preprint arXiv:1804.02864 (2018)

  21. Liu, Y., Cheng, M.M., Hu, X., Wang, K., Bai, X.: Richer convolutional features for edge detection. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5872–5881. IEEE (2017)

  22. Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440 (2015)

  23. Maire, M.: Simultaneous segmentation and figure/ground organization using angular embedding. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010. LNCS, vol. 6312, pp. 450–464. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-15552-9_33

  24. Maire, M., Narihira, T., Yu, S.X.: Affinity CNN: learning pixel-centric pairwise relations for figure/ground embedding. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 174–182 (2016)

  25. Martin, D.R., Fowlkes, C.C., Malik, J.: Learning to detect natural image boundaries using local brightness, color, and texture cues. IEEE Trans. Pattern Anal. Mach. Intell. 26(5), 530–549 (2004)

  26. Nitzberg, M., Mumford, D.: The 2.1-D sketch. In: 1990 Proceedings of Third International Conference on Computer Vision, pp. 138–144. IEEE (1990)

  27. Ren, X., Fowlkes, C.C., Malik, J.: Figure/ground assignment in natural images. In: Leonardis, A., Bischof, H., Pinz, A. (eds.) ECCV 2006. LNCS, vol. 3952, pp. 614–627. Springer, Heidelberg (2006). https://doi.org/10.1007/11744047_47

  28. Roberts, L.G.: Machine perception of three-dimensional solids. Ph.D. thesis, Massachusetts Institute of Technology (1963)

  29. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28

  30. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)

  31. Sundberg, P., Brox, T., Maire, M., Arbeláez, P., Malik, J.: Occlusion boundary detection and figure/ground assignment from optical flow. In: 2011 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2233–2240. IEEE (2011)

  32. Teo, C.L., Fermüller, C., Aloimonos, Y.: Fast 2D border ownership assignment. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5117–5125. IEEE (2015)

  33. Tighe, J., Niethammer, M., Lazebnik, S.: Scene parsing with object instances and occlusion ordering. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3748–3755. IEEE (2014)

  34. Wang, P., Yuille, A.: DOC: deep occlusion estimation from a single image. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9905, pp. 545–561. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46448-0_33

  35. Xie, S., Tu, Z.: Holistically-nested edge detection. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1395–1403 (2015)

  36. Yang, J., Price, B., Cohen, S., Lee, H., Yang, M.H.: Object contour detection with a fully convolutional encoder-decoder network (2016)

  37. Yu, F., Koltun, V.: Multi-scale context aggregation by dilated convolutions. arXiv preprint arXiv:1511.07122 (2015)

  38. Yu, F., Koltun, V., Funkhouser, T.: Dilated residual networks. In: Computer Vision and Pattern Recognition, vol. 1 (2017)

  39. Zhang, Z., Schwing, A.G., Fidler, S., Urtasun, R.: Monocular object instance segmentation and depth ordering with CNNs. arXiv preprint arXiv:1505.03159 (2015)

Acknowledgement

This work is supported by the National Key R&D Program of China (2017YFB1002702) and the National Natural Science Foundation of China (61572058). We would like to thank Peng Wang for helping to generate the DOC experimental results and for valuable discussions.

Author information

Corresponding author

Correspondence to Xiaohui Liang.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Wang, G., Wang, X., Li, F.W.B., Liang, X. (2019). DOOBNet: Deep Object Occlusion Boundary Detection from an Image. In: Jawahar, C., Li, H., Mori, G., Schindler, K. (eds) Computer Vision – ACCV 2018. ACCV 2018. Lecture Notes in Computer Science, vol 11366. Springer, Cham. https://doi.org/10.1007/978-3-030-20876-9_43

  • DOI: https://doi.org/10.1007/978-3-030-20876-9_43

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-20875-2

  • Online ISBN: 978-3-030-20876-9

  • eBook Packages: Computer Science, Computer Science (R0)
