Learning and Incorporating Top-Down Cues in Image Segmentation

  • Xuming He
  • Richard S. Zemel
  • Debajyoti Ray
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3951)


Bottom-up approaches, which rely mainly on continuity principles, are often insufficient to form accurate segments in natural images. In order to improve performance, recent methods have begun to incorporate top-down cues, or object information, into segmentation. In this paper, we propose an approach to utilizing category-based information in segmentation, through a formulation as an image labelling problem. Our approach exploits bottom-up image cues to create an over-segmented representation of an image. The segments are then merged by assigning labels that correspond to the object category. The model is trained on a database of images, and is designed to be modular: it learns a number of image contexts, which simplify training and extend the range of object classes and image database size that the system can handle. The learning method estimates model parameters by maximizing a lower bound of the data likelihood. We examine performance on three real-world image databases, and compare our system to a standard classifier and other conditional random field approaches, as well as a bottom-up segmentation method.
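The merging step described in the abstract, where over-segments are grouped by assigning each one a category label, can be illustrated with a minimal sketch. This is not the authors' model: the labels here are simply given by hand rather than inferred by a trained conditional random field, and the input format (a 2D grid of segment ids plus a dictionary of labels) and the function names `merge_by_label` and `count_regions` are assumptions made for illustration.

```python
def merge_by_label(segment_map, segment_labels):
    """Replace each over-segment id with its category label.

    segment_map: 2D list of over-segment ids (hypothetical input format).
    segment_labels: dict mapping segment id -> category label (assumed given;
    in the paper these would come from the learned labelling model).
    """
    return [[segment_labels[seg] for seg in row] for row in segment_map]


def count_regions(label_map):
    """Count 4-connected regions of identical label using a flood fill."""
    rows, cols = len(label_map), len(label_map[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = 0
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            # Start a new region and grow it over equal-label neighbours.
            regions += 1
            seen[r][c] = True
            stack = [(r, c)]
            while stack:
                y, x = stack.pop()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and label_map[ny][nx] == label_map[y][x]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
    return regions


# Toy 3x3 image split into four over-segments (ids 0..3).
segment_map = [
    [0, 0, 1],
    [2, 2, 1],
    [2, 3, 3],
]
# Hand-assigned labels standing in for the model's output.
segment_labels = {0: "sky", 1: "sky", 2: "grass", 3: "grass"}

label_map = merge_by_label(segment_map, segment_labels)
print(count_regions(segment_map), "over-segments ->",
      count_regions(label_map), "labelled regions")
```

In this toy case, adjacent over-segments that receive the same label collapse into a single region, which is the sense in which labelling performs the merge.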


Keywords: Image Segmentation; Object Category; Conditional Random Field; Label Distribution; Snow Fence





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Xuming He (1)
  • Richard S. Zemel (1)
  • Debajyoti Ray (1)
  1. Department of Computer Science, University of Toronto, Canada
