
Recovering Occlusion Boundaries from an Image

International Journal of Computer Vision

Abstract

Occlusion reasoning is a fundamental problem in computer vision. In this paper, we propose an algorithm to recover the occlusion boundaries and depth ordering of free-standing structures in the scene. Rather than viewing the problem as one of pure image processing, our approach employs cues from an estimated surface layout and applies Gestalt grouping principles using a conditional random field (CRF) model. We propose a hierarchical segmentation process, based on agglomerative merging, that re-estimates boundary strength as the segmentation progresses. Our experiments on the Geometric Context dataset validate our choices for features, our iterative refinement of classifiers, and our CRF model. In experiments on the Berkeley Segmentation Dataset, PASCAL VOC 2008, and LabelMe, we also show that the trained algorithm generalizes to other datasets and can be used as an object boundary predictor with figure/ground labels.
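The agglomerative merging idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `boundary_strength` function below is a hypothetical stand-in for the learned boundary classifiers and CRF (it only compares mean intensities), and the region dictionaries, the `agglomerate` name, and the `threshold` parameter are assumptions made for illustration. The one point it is meant to show is that boundary scores are recomputed from the updated region statistics after every merge, rather than fixed once at the start.

```python
def boundary_strength(a, b):
    """Hypothetical stand-in for a learned boundary classifier.

    Returns a score in [0, 1); higher means a more likely boundary.
    """
    diff = abs(a["mean"] - b["mean"])
    return diff / (1.0 + diff)


def agglomerate(regions, adjacency, threshold=0.5):
    """Greedily merge adjacent regions whose boundary score is below threshold.

    regions:   {region_id: {"mean": float, "size": int}}
    adjacency: iterable of (region_id, region_id) pairs that share a boundary
    """
    regions = {
        k: {"mean": v["mean"], "size": v["size"], "members": [k]}
        for k, v in regions.items()
    }
    adjacency = {frozenset(edge) for edge in adjacency}

    while adjacency:
        # Re-estimate every boundary from the current region statistics.
        scores = {
            edge: boundary_strength(*(regions[r] for r in edge))
            for edge in adjacency
        }
        weakest = min(scores, key=lambda e: (scores[e], sorted(e)))
        if scores[weakest] >= threshold:
            break  # every remaining boundary is considered real

        # Merge the two regions; the smaller id survives.
        keep, drop = sorted(weakest)
        a, b = regions[keep], regions.pop(drop)
        size = a["size"] + b["size"]
        regions[keep] = {
            "mean": (a["mean"] * a["size"] + b["mean"] * b["size"]) / size,
            "size": size,
            "members": a["members"] + b["members"],
        }

        # Redirect edges of the dropped region and discard self-loops.
        rewired = set()
        for edge in adjacency:
            new_edge = frozenset(keep if r == drop else r for r in edge)
            if len(new_edge) == 2:
                rewired.add(new_edge)
        adjacency = rewired

    return regions


if __name__ == "__main__":
    toy = {i: {"mean": m, "size": 1} for i, m in enumerate([10.0, 11.0, 50.0, 52.0])}
    merged = agglomerate(toy, [(0, 1), (1, 2), (2, 3)], threshold=0.7)
    for rid, info in sorted(merged.items()):
        print(rid, info["members"], info["mean"])
```

On the toy chain of four regions, the two dark regions merge, the two bright regions merge, and the strong boundary between the groups survives. A real system along the lines of the abstract would replace `boundary_strength` with classifiers over surface-layout and grouping cues.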



Author information


Correspondence to Derek Hoiem.


About this article

Cite this article

Hoiem, D., Efros, A.A. & Hebert, M. Recovering Occlusion Boundaries from an Image. Int J Comput Vis 91, 328–346 (2011). https://doi.org/10.1007/s11263-010-0400-4


  • DOI: https://doi.org/10.1007/s11263-010-0400-4
