Instance-Level Segmentation of Vehicles by Deep Contours

Part of the Lecture Notes in Computer Science book series (LNIP, volume 10116)

Abstract

The recognition of individual object instances in single monocular images is still an incompletely solved task. In this work, we propose a new approach for detecting and separating vehicles in the context of autonomous driving. Our method uses a fully convolutional network (FCN) both for semantic labeling and for estimating the boundary of each vehicle. A contour is in general a one-pixel-wide structure that cannot be learned directly by a CNN; our network addresses this by predicting areas around the contours instead. Based on these areas, we separate the individual vehicle instances. In our experiments on two challenging datasets (Cityscapes and KITTI), we achieve state-of-the-art performance despite using a subsampling rate of two. Our approach even outperforms all recent works with respect to several evaluation scores.
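The separation step can be illustrated with a minimal sketch. The abstract does not say how the contour areas are turned into instances, so the following is an assumption rather than the authors' procedure: contour-area pixels are removed from the semantic vehicle mask and the remaining 4-connected regions are labeled. The function name `separate_instances` and the toy masks are hypothetical.

```python
from collections import deque


def separate_instances(vehicle_mask, contour_mask):
    """Label 4-connected vehicle regions after removing contour-area pixels.

    Both arguments are equally sized 2D lists of 0/1 values, standing in for
    thresholded per-pixel network outputs (an assumption of this sketch).
    Returns (labels, count); labels is 0 for background and contour areas.
    """
    height, width = len(vehicle_mask), len(vehicle_mask[0])
    labels = [[0] * width for _ in range(height)]
    count = 0
    for r in range(height):
        for c in range(width):
            # Skip background, contour-area pixels, and already labeled pixels.
            if not vehicle_mask[r][c] or contour_mask[r][c] or labels[r][c]:
                continue
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            # Breadth-first flood fill over the 4-neighbourhood.
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and vehicle_mask[ny][nx]
                            and not contour_mask[ny][nx]
                            and labels[ny][nx] == 0):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


# Toy example: two touching vehicles form a single blob in the semantic mask;
# the contour area (middle column) splits the blob into two instances.
vehicle = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
contour = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
labels, count = separate_instances(vehicle, contour)
print(count)  # 2
```

In this sketch the contour-area pixels themselves stay unlabeled; a complete pipeline would still have to assign them to one of the neighbouring instances.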

Keywords

  • Markov Random Field
  • Convolutional Neural Network
  • Conditional Random Field
  • Individual Instance
  • Object Instance

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Author information

Corresponding author

Correspondence to Jan van den Brand.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

van den Brand, J., Ochs, M., Mester, R. (2017). Instance-Level Segmentation of Vehicles by Deep Contours. In: Chen, CS., Lu, J., Ma, KK. (eds) Computer Vision – ACCV 2016 Workshops. ACCV 2016. Lecture Notes in Computer Science, vol 10116. Springer, Cham. https://doi.org/10.1007/978-3-319-54407-6_32

  • DOI: https://doi.org/10.1007/978-3-319-54407-6_32

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-54406-9

  • Online ISBN: 978-3-319-54407-6

  • eBook Packages: Computer Science, Computer Science (R0)