Figure-Ground Image Segmentation Helps Weakly-Supervised Learning of Objects

  • Katerina Fragkiadaki
  • Jianbo Shi
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6316)


Given a collection of images containing a common object, we seek to learn a model for the object without the use of bounding boxes or segmentation masks. In linguistics, a single document provides no information about the location of the topics it contains. In contrast, an image has a lot to tell us about where foreground and background topics lie. An extensive literature on modelling bottom-up saliency and pop-out aims at predicting eye fixations and the allocation of visual attention in a single image, prior to any recognition of content. The most salient image parts are likely to capture the image foreground. We propose a novel probabilistic model, the shape and figure-ground aware model (sFG model), that exploits bottom-up image saliency to compute an informative prior on segment topic assignments. Bottom-up saliency combined with co-occurrence gives strong hints about figure/ground that can help guide topic discovery from partially segmented data. Our model exploits both figure-ground organization in each image separately and feature re-occurrence across the image collection. Since we use an image-dependent topic prior, during model learning we optimize a conditional likelihood of the image collection given the images' bottom-up saliency information. Our discriminative framework can tolerate larger intraclass variability of objects with fewer training data. We iterate between bottom-up figure-ground image organization and model parameter learning by accumulating image statistics from the entire image collection. The learned model influences later figure-ground labelling of the images. We present results of our approach on diverse datasets showing great improvement over generative probabilistic models that do not exploit image saliency, indicating the suitability of our model for weakly-supervised visual organization.
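To make the idea of a saliency-informed prior on segment topic assignments concrete, here is a minimal sketch. It is not the sFG model: it assumes a plain two-topic (foreground/background) mixture of multinomials over visual-word counts, fitted with EM, in which each segment's saliency score is used directly as its prior probability of being foreground. The function name, parameters, and toy data are all hypothetical and chosen only for illustration.

```python
import math


def fit_saliency_prior_mixture(counts, saliency, n_iters=20, smoothing=0.1):
    """EM for a two-topic mixture of multinomials with a per-segment prior.

    ASSUMPTION: this is an illustrative stand-in, not the sFG model.
    counts[s][w] : count of visual word w in segment s
    saliency[s]  : bottom-up saliency of segment s, read as the prior
                   probability that segment s belongs to the foreground
    Returns (phi, resp); topic 0 is foreground, topic 1 is background.
    """
    n_segments = len(counts)
    n_words = len(counts[0])
    eps = 1e-6
    # Segment-dependent topic prior, clipped away from 0 and 1 so that
    # taking logarithms is always safe.
    prior = []
    for s in saliency:
        p_fg = min(max(s, eps), 1.0 - eps)
        prior.append((p_fg, 1.0 - p_fg))
    # Start from the prior: before seeing any words, saliency is all we know.
    resp = [list(p) for p in prior]
    phi = []
    for _ in range(n_iters):
        # M-step: accumulate soft word counts over the whole collection.
        phi = []
        for k in range(2):
            weights = [
                smoothing
                + sum(resp[s][k] * counts[s][w] for s in range(n_segments))
                for w in range(n_words)
            ]
            total = sum(weights)
            phi.append([v / total for v in weights])
        # E-step: posterior is proportional to prior times likelihood,
        # computed in log space for numerical stability.
        for s, row in enumerate(counts):
            log_post = [
                math.log(prior[s][k])
                + sum(c * math.log(phi[k][w]) for w, c in enumerate(row))
                for k in range(2)
            ]
            m = max(log_post)
            unnorm = [math.exp(v - m) for v in log_post]
            z = sum(unnorm)
            resp[s] = [u / z for u in unnorm]
    return phi, resp


if __name__ == "__main__":
    # Toy collection: words 0-1 are typical of the object, words 2-3 of clutter.
    toy_counts = [
        [5, 4, 0, 1],
        [6, 3, 1, 0],
        [0, 1, 5, 4],
        [1, 0, 4, 6],
        [4, 5, 1, 0],  # object-like words but uninformative saliency
    ]
    toy_saliency = [0.8, 0.7, 0.2, 0.3, 0.5]
    _, posteriors = fit_saliency_prior_mixture(toy_counts, toy_saliency)
    for idx, (p_fg, _) in enumerate(posteriors):
        print(f"segment {idx}: P(foreground) = {p_fg:.3f}")
```

In this toy setting the saliency prior breaks the symmetry between the two topics, and word re-occurrence across segments then pulls the fifth segment, whose saliency alone is uninformative, towards the foreground topic. The alternation between figure-ground organization and parameter learning described in the abstract is not modelled here; the prior stays fixed throughout.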


Topic Model, Common Object, Image Collection, Image Saliency, Aware Model
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Katerina Fragkiadaki (1)
  • Jianbo Shi (1)
  1. GRASP Laboratory, University of Pennsylvania, Philadelphia
