Machine Vision and Applications, Volume 24, Issue 2, pp 243–254

Unsupervised scene segmentation using sparse coding context

  • Yen-Cheng Liu
  • Hwann-Tzong Chen
Original Paper


This paper presents an approach to image understanding from the perspective of unsupervised scene segmentation. With the goal of image understanding in mind, we define ‘unsupervised scene segmentation’ as the task of dividing a given image into semantically meaningful regions without using annotations or other human-labeled information. We investigate how well an algorithm can partition an image when human-involved learning procedures are kept to a minimum. Specifically, we are interested in developing an unsupervised segmentation algorithm that relies only on the contextual prior learned from a set of images. Our algorithm incorporates a small set of images that are similar to the input image in their scene structures. We use sparse coding to analyze the appearance of this set of images, which allows us to derive the context of the scene a priori from the set. Gaussian mixture models can then be constructed for different parts of the input image based on the sparse-coding contextual prior and combined into a Markov-random-field-based segmentation process. The experimental results show that our unsupervised segmentation algorithm is able to partition an image into semantic regions, such as buildings, roads, trees, and skies, without using human-annotated information. The semantic regions generated by our algorithm can serve as pre-processed inputs for subsequent classification-based labeling algorithms, helping to achieve automatic scene annotation and scene parsing.
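
To make the final step of the abstract concrete, the following is a minimal sketch of how per-region appearance models can be combined with a Markov random field. It is not the authors' implementation. It assumes a grayscale image, a single 1-D Gaussian per label as a stand-in for each Gaussian mixture model, a Potts smoothness term, and iterated conditional modes (ICM) as a simple optimizer. The names `gaussian_nll`, `segment_icm`, and `beta` are hypothetical, and the sparse-coding contextual prior that would supply the models is taken as given.

```python
import math


def gaussian_nll(x, mean, var):
    """Negative log-likelihood of x under a 1-D Gaussian (the unary cost)."""
    return 0.5 * math.log(2.0 * math.pi * var) + (x - mean) ** 2 / (2.0 * var)


def segment_icm(image, models, beta=1.0, max_sweeps=10):
    """Approximately minimize a Potts MRF energy with ICM.

    image  -- list of rows of float intensities
    models -- list of (mean, var) pairs, one per label; an assumed
              stand-in for the per-region Gaussian mixture models
    beta   -- cost for each 4-connected neighbor carrying a different label
    """
    h, w = len(image), len(image[0])
    n_labels = len(models)

    # Start from the maximum-likelihood label at every pixel.
    labels = [
        [min(range(n_labels), key=lambda k: gaussian_nll(image[y][x], *models[k]))
         for x in range(w)]
        for y in range(h)
    ]

    for _ in range(max_sweeps):
        changed = False
        for y in range(h):
            for x in range(w):
                # Labels of the 4-connected neighbors that lie inside the image.
                neighbors = [
                    labels[ny][nx]
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                    if 0 <= ny < h and 0 <= nx < w
                ]

                def local_energy(k):
                    unary = gaussian_nll(image[y][x], *models[k])
                    pairwise = beta * sum(1 for n in neighbors if n != k)
                    return unary + pairwise

                best = min(range(n_labels), key=local_energy)
                if best != labels[y][x]:
                    labels[y][x] = best
                    changed = True
        if not changed:  # converged to a local minimum of the energy
            break
    return labels


# Toy input: dark left half, bright right half, one ambiguous pixel at row 1, column 0.
image = [
    [0.1, 0.1, 0.9, 0.9],
    [0.6, 0.1, 0.9, 0.9],
    [0.1, 0.1, 0.9, 0.9],
    [0.1, 0.1, 0.9, 0.9],
]
models = [(0.1, 0.04), (0.9, 0.04)]  # (mean, variance) for label 0 and label 1
labels = segment_icm(image, models, beta=1.0)
```

In this toy input the ambiguous pixel is closer in intensity to the bright model, so with `beta=0.0` it keeps the bright label; with `beta=1.0` the smoothness term pulls it into the surrounding dark region. ICM only finds a local minimum, so a stronger optimizer may be preferable in practice.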


Unsupervised image segmentation · Semantic scene analysis · Sparse coding · Markov random fields



Copyright information

© Springer-Verlag 2011

Authors and Affiliations

  1. Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan
