Abstract
Object occlusion boundary detection is a fundamental and crucial research problem in computer vision. Solving this problem is challenging as we encounter extreme boundary/non-boundary class imbalance during the training of an object occlusion boundary detector. In this paper, we propose to address this class imbalance by up-weighting the loss contribution of false negative and false positive examples with our novel Attention Loss function. We also propose a unified end-to-end multi-task deep object occlusion boundary detection network (DOOBNet) that shares convolutional features to simultaneously predict object boundary and occlusion orientation. DOOBNet adopts an encoder-decoder structure with skip connections in order to automatically learn multi-scale and multi-level features. We significantly surpass the state-of-the-art on the PIOD dataset (ODS F-score of .702) and the BSDS ownership dataset (ODS F-score of .555), as well as improving detection speed to 0.037 s per image on the PIOD dataset.
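The idea of up-weighting hard examples (false negatives and false positives) in a per-pixel boundary loss can be sketched as follows. This is a minimal focal-loss-style illustration [29] in numpy, not the paper's exact Attention Loss formula; the function name and the `gamma` parameter are assumptions for this sketch.

```python
import numpy as np

def attention_weighted_bce(p, y, gamma=2.0, eps=1e-7):
    """Per-pixel binary cross-entropy where hard examples (predictions far
    from the label, i.e. false negatives/positives) are up-weighted.
    Focal-loss-style sketch; gamma=0 recovers plain BCE."""
    p = np.clip(p, eps, 1.0 - eps)
    p_t = np.where(y == 1, p, 1.0 - p)   # probability assigned to the true class
    weight = (1.0 - p_t) ** gamma        # small for easy pixels, large for hard ones
    return float(-(weight * np.log(p_t)).mean())
```

With `gamma > 0`, confidently correct (easy, mostly non-boundary) pixels contribute almost nothing, so the rare boundary pixels and the misclassified pixels dominate the gradient, which is the intent behind addressing the boundary/non-boundary imbalance.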
Notes
- 1. The statistics come from the PIOD dataset.
- 2. The conv block refers to a convolution layer followed by batch normalization (BN) [14] and a ReLU activation.
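The conv-BN-ReLU sequence described in note 2 can be sketched on a single-channel 2D feature map. This is a minimal numpy illustration under simplifying assumptions (one channel, "valid" cross-correlation, per-map normalization with fixed `gamma`/`beta`); real networks operate on batched multi-channel tensors with learned BN parameters.

```python
import numpy as np

def conv_block(x, w, b=0.0, gamma=1.0, beta=0.0, eps=1e-5):
    """Conv -> BatchNorm -> ReLU on a single 2D feature map (sketch)."""
    kh, kw = w.shape
    H, W = x.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):            # "valid" cross-correlation
        for j in range(out.shape[1]):
            out[i, j] = (x[i:i + kh, j:j + kw] * w).sum() + b
    # batch norm: normalize to zero mean / unit variance, then scale and shift
    out = gamma * (out - out.mean()) / np.sqrt(out.var() + eps) + beta
    return np.maximum(out, 0.0)              # ReLU
```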
References
Badrinarayanan, V., Kendall, A., Cipolla, R.: SegNet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39(12), 2481–2495 (2017)
Chen, L.C., Papandreou, G., Kokkinos, I., Murphy, K., Yuille, A.L.: DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 40(4), 834–848 (2018)
Cooper, M.C.: Interpreting line drawings of curved objects with tangential edges and surfaces. Image Vis. Comput. 15(4), 263–276 (1997)
Dollár, P., Zitnick, C.L.: Fast edge detection using structured forests. IEEE Trans. Pattern Anal. Mach. Intell. 37(8), 1558–1570 (2015)
Fu, H., Wang, C., Tao, D., Black, M.J.: Occlusion boundary detection via deep exploration of context. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 241–250 (2016)
Gao, T., Packer, B., Koller, D.: A segmentation-aware object detection model with occlusion handling. In: 2011 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1361–1368. IEEE (2011)
Girshick, R.: Fast R-CNN. In: International Conference on Computer Vision (2015)
He, K., Zhang, X., Ren, S., Sun, J.: Delving deep into rectifiers: surpassing human-level performance on imagenet classification. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1026–1034 (2015)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
He, X., Yuille, A.: Occlusion boundary detection using pseudo-depth. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010. LNCS, vol. 6314, pp. 539–552. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-15561-1_39
Hoiem, D., Stein, A.N., Efros, A.A., Hebert, M.: Recovering occlusion boundaries from a single image. In: IEEE 11th International Conference on Computer Vision, ICCV 2007, pp. 1–8. IEEE (2007)
Hu, X., Liu, Y., Wang, K., Ren, B.: Learning hybrid convolutional features for edge detection. Neurocomputing 313, 377–385 (2018)
Hwang, J.J., Liu, T.L.: Pixel-wise deep learning for contour detection. arXiv preprint arXiv:1504.01989 (2015)
Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167 (2015)
Jia, Y., et al.: Caffe: convolutional architecture for fast feature embedding. arXiv preprint arXiv:1408.5093 (2014)
Kokkinos, I.: Pushing the boundaries of boundary detection using deep learning. arXiv preprint arXiv:1511.07386 (2015)
Leichter, I., Lindenbaum, M.: Boundary ownership by lifting to 2.1D. In: 2009 IEEE 12th International Conference on Computer Vision, pp. 9–16. IEEE (2009)
Lin, T.Y., Goyal, P., Girshick, R., He, K., Dollár, P.: Focal loss for dense object detection. arXiv preprint arXiv:1708.02002 (2017)
Liu, Y., Lew, M.S.: Learning relaxed deep supervision for better edge detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 231–240 (2016)
Liu, Y., Cheng, M.M., Bian, J., Zhang, L., Jiang, P.T., Cao, Y.: Semantic edge detection with diverse deep supervision. arXiv preprint arXiv:1804.02864 (2018)
Liu, Y., Cheng, M.M., Hu, X., Wang, K., Bai, X.: Richer convolutional features for edge detection. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5872–5881. IEEE (2017)
Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440 (2015)
Maire, M.: Simultaneous segmentation and figure/ground organization using angular embedding. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010. LNCS, vol. 6312, pp. 450–464. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-15552-9_33
Maire, M., Narihira, T., Yu, S.X.: Affinity CNN: learning pixel-centric pairwise relations for figure/ground embedding. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 174–182 (2016)
Martin, D.R., Fowlkes, C.C., Malik, J.: Learning to detect natural image boundaries using local brightness, color, and texture cues. IEEE Trans. Pattern Anal. Mach. Intell. 26(5), 530–549 (2004)
Nitzberg, M., Mumford, D.: The 2.1-D sketch. In: 1990 Proceedings of Third International Conference on Computer Vision, pp. 138–144. IEEE (1990)
Ren, X., Fowlkes, C.C., Malik, J.: Figure/ground assignment in natural images. In: Leonardis, A., Bischof, H., Pinz, A. (eds.) ECCV 2006. LNCS, vol. 3952, pp. 614–627. Springer, Heidelberg (2006). https://doi.org/10.1007/11744047_47
Roberts, L.G.: Machine perception of three-dimensional solids. Ph.D. thesis, Massachusetts Institute of Technology (1963)
Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
Sundberg, P., Brox, T., Maire, M., Arbeláez, P., Malik, J.: Occlusion boundary detection and figure/ground assignment from optical flow. In: 2011 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2233–2240. IEEE (2011)
Teo, C.L., Fermüller, C., Aloimonos, Y.: Fast 2D border ownership assignment. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5117–5125. IEEE (2015)
Tighe, J., Niethammer, M., Lazebnik, S.: Scene parsing with object instances and occlusion ordering. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3748–3755. IEEE (2014)
Wang, P., Yuille, A.: DOC: deep occlusion estimation from a single image. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9905, pp. 545–561. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46448-0_33
Xie, S., Tu, Z.: Holistically-nested edge detection. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1395–1403 (2015)
Yang, J., Price, B., Cohen, S., Lee, H., Yang, M.H.: Object contour detection with a fully convolutional encoder-decoder network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2016)
Yu, F., Koltun, V.: Multi-scale context aggregation by dilated convolutions. arXiv preprint arXiv:1511.07122 (2015)
Yu, F., Koltun, V., Funkhouser, T.: Dilated residual networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2017)
Zhang, Z., Schwing, A.G., Fidler, S., Urtasun, R.: Monocular object instance segmentation and depth ordering with CNNs. arXiv preprint arXiv:1505.03159 (2015)
Acknowledgement
This work is supported by National Key R&D Program of China (2017YFB1002702) and National Nature Science Foundation of China (61572058). We would like to thank Peng Wang for helping with generating DOC experimental results and valuable discussions.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Wang, G., Wang, X., Li, F.W.B., Liang, X. (2019). DOOBNet: Deep Object Occlusion Boundary Detection from an Image. In: Jawahar, C., Li, H., Mori, G., Schindler, K. (eds) Computer Vision – ACCV 2018. ACCV 2018. Lecture Notes in Computer Science(), vol 11366. Springer, Cham. https://doi.org/10.1007/978-3-030-20876-9_43
DOI: https://doi.org/10.1007/978-3-030-20876-9_43
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-20875-2
Online ISBN: 978-3-030-20876-9
eBook Packages: Computer Science (R0)