Towards Unified Object Detection and Semantic Segmentation

Dong, Jian; Chen, Qiang; Yan, Shuicheng; Yuille, Alan

doi:10.1007/978-3-319-10602-1_20

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8693)

Included in the following conference series:

European Conference on Computer Vision

Abstract

Object detection and semantic segmentation are two strongly correlated tasks, yet they are typically solved separately or sequentially with substantially different techniques. Motivated by the complementary effect observed in the typical failure cases of the two tasks, we propose a unified framework for joint object detection and semantic segmentation. By enforcing consistency between the final detection and segmentation results, our unified framework can effectively leverage the advantages of leading techniques for these two tasks. Furthermore, both local and global context information are integrated into the framework to better distinguish ambiguous samples. By jointly optimizing the model parameters for all the components, the relative importance of the different components is automatically learned for each category to guarantee the overall performance. Extensive experiments on the PASCAL VOC 2010 and 2012 datasets demonstrate encouraging performance of the proposed unified framework for both object detection and semantic segmentation tasks.
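
To make the idea of consistency between a detection and a segmentation concrete, the following is a minimal illustrative sketch and not the formulation used in the paper. It assumes one simple notion of agreement: the intersection-over-union between a detected bounding box and the tight box around a segmentation mask. The function names (`mask_to_box`, `box_iou`, `consistency`) and the box convention `(x0, y0, x1, y1)` with exclusive upper bounds are hypothetical choices made for this example.

```python
def mask_to_box(mask):
    """Tight box (x0, y0, x1, y1) around the True pixels of a 2D mask.

    `mask` is a list of rows of booleans. Returns None for an empty mask.
    Upper bounds are exclusive (an assumption of this sketch).
    """
    coords = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not coords:
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


def box_iou(a, b):
    """Intersection-over-union of two boxes given as (x0, y0, x1, y1)."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def consistency(det_box, seg_mask):
    """Illustrative agreement score in [0, 1] between a box and a mask."""
    seg_box = mask_to_box(seg_mask)
    if seg_box is None:
        return 0.0
    return box_iou(det_box, seg_box)


# Example: a 10x10 mask with an object occupying rows 2..5 and columns 3..7.
mask = [[(2 <= y < 6) and (3 <= x < 8) for x in range(10)] for y in range(10)]
score_full = consistency((3, 2, 8, 6), mask)   # box matches the mask extent
score_half = consistency((3, 2, 8, 4), mask)   # box covers the top half only
```

A score like this could serve as one term that rewards detections and segmentations that agree. How the paper actually couples the two outputs, and how it weights that coupling per category, is described only in the full text.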

Author information

Authors and Affiliations

Department of Electrical and Computer Engineering, NUS, Singapore
Jian Dong, Qiang Chen & Shuicheng Yan
Department of Statistics, UCLA, Los Angeles, CA, USA
Alan Yuille

Editor information

Editors and Affiliations

Department of Computer Science, University of Toronto, 6 King’s College Road, M5H 3S5, Toronto, ON, Canada
David Fleet
Faculty of Electrical Engineering, Department of Cybernetics, Czech Technical University in Prague, Technicka 2, 166 27, Prague 6, Czech Republic
Tomas Pajdla
Max-Planck-Institut für Informatik, Campus E1 4, 66123, Saarbrücken, Germany
Bernt Schiele
ESAT - PSI, iMinds, KU Leuven, Kasteelpark Arenberg 10, Bus 2441, 3001, Leuven, Belgium
Tinne Tuytelaars

About this paper

Cite this paper

Dong, J., Chen, Q., Yan, S., Yuille, A. (2014). Towards Unified Object Detection and Semantic Segmentation. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds) Computer Vision – ECCV 2014. ECCV 2014. Lecture Notes in Computer Science, vol 8693. Springer, Cham. https://doi.org/10.1007/978-3-319-10602-1_20

DOI: https://doi.org/10.1007/978-3-319-10602-1_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10601-4
Online ISBN: 978-3-319-10602-1
eBook Packages: Computer Science, Computer Science (R0)
