Unsupervised scene segmentation using sparse coding context

  • Original Paper
  • Machine Vision and Applications

Abstract

This paper presents an approach to image understanding through unsupervised scene segmentation. We define unsupervised scene segmentation as the task of dividing a given image into semantically meaningful regions without using annotations or other human-labeled information. We investigate how well an algorithm can partition an image when human involvement in the learning procedure is limited. Specifically, we develop an unsupervised segmentation algorithm that relies only on a contextual prior learned from a set of images. Our algorithm uses a small set of images whose scene structures are similar to that of the input image. We analyze the appearance of this set of images with sparse coding, which allows us to derive the scene context a priori from the set. Gaussian mixture models are then constructed for different parts of the input image based on the sparse-coding contextual prior and are combined in a Markov-random-field-based segmentation process. Experimental results show that our unsupervised segmentation algorithm is able to partition an image into semantic regions, such as buildings, roads, trees, and skies, without using human-annotated information. The resulting semantic regions can serve as pre-processed inputs for subsequent classification-based labeling algorithms, toward automatic scene annotation and scene parsing.
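To make the last stage of the pipeline concrete, the following is a minimal sketch of an MRF-style labeling step, not the authors' implementation. It assumes each region is described by a single scalar Gaussian (a stand-in for the per-region Gaussian mixture models), uses a Potts smoothness term on a 4-connected grid, and minimizes the energy with iterated conditional modes (ICM), a simple optimizer chosen here for brevity; the abstract does not say which optimizer the paper uses. The names `gaussian_nll`, `segment_icm`, `models`, and `beta` are hypothetical.

```python
import math


def gaussian_nll(x, mean, var):
    """Negative log-likelihood of scalar x under a Gaussian N(mean, var)."""
    return 0.5 * math.log(2 * math.pi * var) + (x - mean) ** 2 / (2 * var)


def segment_icm(image, models, beta=1.0, n_iters=10):
    """Label pixels by greedily minimizing unary + Potts energy (ICM).

    image  : list of rows of scalar intensities
    models : list of (mean, var) pairs, one per region; an assumed stand-in
             for region models derived from a contextual prior
    beta   : Potts penalty paid for each 4-neighbor carrying a different label
    """
    h, w = len(image), len(image[0])
    n_labels = len(models)

    # Unary cost: how poorly each region model explains each pixel.
    unary = [[[gaussian_nll(image[r][c], m, v) for (m, v) in models]
              for c in range(w)] for r in range(h)]

    # Start from the per-pixel maximum-likelihood label (no smoothing).
    labels = [[min(range(n_labels), key=lambda k: unary[r][c][k])
               for c in range(w)] for r in range(h)]

    for _ in range(n_iters):
        changed = False
        for r in range(h):
            for c in range(w):
                neighbors = [labels[rr][cc]
                             for rr, cc in ((r - 1, c), (r + 1, c),
                                            (r, c - 1), (r, c + 1))
                             if 0 <= rr < h and 0 <= cc < w]

                def local_energy(k):
                    disagreements = sum(1 for n in neighbors if n != k)
                    return unary[r][c][k] + beta * disagreements

                best = min(range(n_labels), key=local_energy)
                if best != labels[r][c]:
                    labels[r][c] = best
                    changed = True
        if not changed:  # a full sweep changed nothing: local minimum
            break
    return labels


if __name__ == "__main__":
    # Toy image: dark left half, bright right half, one ambiguous pixel.
    img = [[0.1, 0.1, 0.9, 0.9],
           [0.1, 0.6, 0.9, 0.9],
           [0.1, 0.1, 0.9, 0.9],
           [0.1, 0.1, 0.9, 0.9]]
    region_models = [(0.1, 0.04), (0.9, 0.04)]
    for row in segment_icm(img, region_models, beta=1.5):
        print(row)
```

With `beta=0` each pixel is labeled independently by its best-fitting model, so the ambiguous pixel takes the bright label; a larger `beta` lets the neighboring pixels pull it into the dark region. ICM only finds a local minimum, so a practical system would more likely use a stronger optimizer such as graph cuts.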



Author information

Corresponding author

Correspondence to Hwann-Tzong Chen.

About this article

Cite this article

Liu, YC., Chen, HT. Unsupervised scene segmentation using sparse coding context. Machine Vision and Applications 24, 243–254 (2013). https://doi.org/10.1007/s00138-011-0401-5

  • DOI: https://doi.org/10.1007/s00138-011-0401-5
