Recovering Occlusion Boundaries from an Image

Hoiem, Derek; Efros, Alexei A.; Hebert, Martial

doi:10.1007/s11263-010-0400-4

Published: 21 October 2010

Volume 91, pages 328–346 (2011)

International Journal of Computer Vision

Derek Hoiem¹,
Alexei A. Efros² &
Martial Hebert²

Abstract

Occlusion reasoning is a fundamental problem in computer vision. In this paper, we propose an algorithm to recover the occlusion boundaries and depth ordering of free-standing structures in the scene. Rather than viewing the problem as one of pure image processing, our approach employs cues from an estimated surface layout and applies Gestalt grouping principles using a conditional random field (CRF) model. We propose a hierarchical segmentation process, based on agglomerative merging, that re-estimates boundary strength as the segmentation progresses. Our experiments on the Geometric Context dataset validate our choices for features, our iterative refinement of classifiers, and our CRF model. In experiments on the Berkeley Segmentation Dataset, PASCAL VOC 2008, and LabelMe, we also show that the trained algorithm generalizes to other datasets and can be used as an object boundary predictor with figure/ground labels.
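The agglomerative merging idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the paper's boundary strengths come from trained classifiers with surface-layout cues and a CRF, whereas here `boundary_strength` is a hypothetical stand-in based on simple appearance contrast, and `Region`, `agglomerate`, and `threshold` are names assumed for illustration. The sketch only shows the control flow of merging the weakest boundary and then re-estimating the remaining strengths from the updated regions.

```python
from dataclasses import dataclass


@dataclass
class Region:
    mean: float  # toy appearance feature (assumption: e.g. mean intensity)
    size: int    # number of pixels in the region


def boundary_strength(a: Region, b: Region) -> float:
    """Hypothetical stand-in for a learned boundary classifier: contrast."""
    return abs(a.mean - b.mean)


def agglomerate(regions, adjacency, threshold):
    """Repeatedly merge across the weakest boundary, re-estimating each round.

    regions:   dict mapping region id -> Region
    adjacency: set of frozenset({id_a, id_b}) for neighbouring regions
    threshold: stop once every remaining boundary is at least this strong
    Returns (regions, adjacency, history) where history lists (a, b, new_id).
    """
    regions = dict(regions)
    adjacency = set(adjacency)
    history = []
    next_id = max(regions) + 1
    while adjacency:
        # Re-estimate all remaining boundaries from the current regions.
        strengths = {
            pair: boundary_strength(*(regions[i] for i in pair))
            for pair in adjacency
        }
        # Break ties deterministically by the sorted region ids.
        weakest = min(strengths, key=lambda p: (strengths[p], sorted(p)))
        if strengths[weakest] >= threshold:
            break
        a, b = sorted(weakest)
        ra, rb = regions.pop(a), regions.pop(b)
        size = ra.size + rb.size
        merged_mean = (ra.mean * ra.size + rb.mean * rb.size) / size
        regions[next_id] = Region(merged_mean, size)
        # Rewire neighbours of a and b to point at the merged region.
        rewired = set()
        for pair in adjacency:
            if pair == weakest:
                continue
            renamed = frozenset(next_id if i in (a, b) else i for i in pair)
            if len(renamed) == 2:
                rewired.add(renamed)
        adjacency = rewired
        history.append((a, b, next_id))
        next_id += 1
    return regions, adjacency, history


# Four regions in a chain: two dark ones next to two bright ones.
regions = {0: Region(10.0, 1), 1: Region(12.0, 1),
           2: Region(50.0, 1), 3: Region(52.0, 1)}
adjacency = {frozenset({0, 1}), frozenset({1, 2}), frozenset({2, 3})}
merged, remaining, history = agglomerate(regions, adjacency, threshold=5.0)
```

In this toy input the two low-contrast boundaries are merged away and the single high-contrast boundary between the dark and bright groups survives, which is the behaviour a hierarchical segmentation of this kind is meant to produce.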

Author information

Authors and Affiliations

Department of Computer Science, University of Illinois at Urbana-Champaign, Champaign, IL, USA
Derek Hoiem
Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, USA
Alexei A. Efros & Martial Hebert

Corresponding author

Correspondence to Derek Hoiem.

About this article

Cite this article

Hoiem, D., Efros, A.A. & Hebert, M. Recovering Occlusion Boundaries from an Image. Int J Comput Vis 91, 328–346 (2011). https://doi.org/10.1007/s11263-010-0400-4

Received: 28 August 2009
Accepted: 05 October 2010
Published: 21 October 2010
Issue Date: February 2011
DOI: https://doi.org/10.1007/s11263-010-0400-4
