Structure-Sensitive Superpixels via Geodesic Distance

  • Peng Wang
  • Gang Zeng
  • Rui Gan
  • Jingdong Wang
  • Hongbin Zha
Article

Abstract

Segmenting images into superpixels, which serve as supporting regions for feature vectors and primitives and reduce computational complexity, is a common fundamental step in various image analysis and computer vision tasks. In this paper, we describe a structure-sensitive superpixel technique that exploits Lloyd’s algorithm with the geodesic distance. Our method generates smaller superpixels to achieve relatively low under-segmentation in structure-dense regions with high intensity or color variation, and produces larger segments to increase computational efficiency in structure-sparse regions with homogeneous appearance. We adopt geometric flows to compute geodesic distances between pixels. In the segmentation procedure, the density of over-segments is automatically adjusted by iteratively optimizing an energy functional that embeds color homogeneity and structure density. Comparative experiments on the Berkeley database show that the proposed algorithm outperforms prior methods while offering computational efficiency comparable to that of TurboPixels. Further applications in image compression, object closure extraction and video segmentation demonstrate that the approach extends effectively to other tasks.
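
The idea of running Lloyd’s algorithm under a geodesic distance can be illustrated with a minimal sketch. It is not the authors’ implementation, and it makes several assumptions: geodesic distances are computed with multi-source Dijkstra on a 4-connected pixel grid rather than with geometric flows, the step cost between neighbouring pixels is 1 plus a hypothetical weight `gamma` times their intensity difference, and each seed is moved to the region pixel closest to the region’s plain centroid. The function names `geodesic_labels`, `update_seeds` and `geodesic_lloyd` are illustrative only.

```python
import heapq


def geodesic_labels(img, seeds, gamma=10.0):
    """Label every pixel with the index of its geodesically nearest seed.

    Assumed step cost between 4-connected neighbours:
    1 + gamma * |intensity difference| (an illustrative choice).
    """
    h, w = len(img), len(img[0])
    dist = [[float("inf")] * w for _ in range(h)]
    label = [[-1] * w for _ in range(h)]
    heap = []
    for k, (r, c) in enumerate(seeds):
        dist[r][c] = 0.0
        label[r][c] = k
        heapq.heappush(heap, (0.0, r, c, k))
    while heap:
        d, r, c, k = heapq.heappop(heap)
        if d > dist[r][c]:
            continue  # stale heap entry, a shorter path was already found
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w:
                nd = d + 1.0 + gamma * abs(img[nr][nc] - img[r][c])
                if nd < dist[nr][nc]:
                    dist[nr][nc] = nd
                    label[nr][nc] = k
                    heapq.heappush(heap, (nd, nr, nc, k))
    return label


def update_seeds(label, seeds):
    """Move each seed to the pixel of its region closest to the region centroid."""
    sums = [[0.0, 0.0, 0] for _ in seeds]
    for r, row in enumerate(label):
        for c, k in enumerate(row):
            sums[k][0] += r
            sums[k][1] += c
            sums[k][2] += 1
    new_seeds = []
    for k, (sr, sc, n) in enumerate(sums):
        if n == 0:
            new_seeds.append(seeds[k])  # empty region: keep the old seed
            continue
        cr, cc = sr / n, sc / n
        pixels = ((r, c) for r, row in enumerate(label)
                  for c, kk in enumerate(row) if kk == k)
        new_seeds.append(min(pixels, key=lambda p: (p[0] - cr) ** 2 + (p[1] - cc) ** 2))
    return new_seeds


def geodesic_lloyd(img, seeds, gamma=10.0, iters=10):
    """Alternate geodesic assignment and seed relocation until the seeds stop moving."""
    seeds = list(seeds)
    for _ in range(iters):
        label = geodesic_labels(img, seeds, gamma)
        new_seeds = update_seeds(label, seeds)
        if new_seeds == seeds:
            break
        seeds = new_seeds
    return geodesic_labels(img, seeds, gamma), seeds
```

Calling `geodesic_lloyd(img, seeds)` with `img` as a list of rows of grayscale values and `seeds` as a list of `(row, col)` tuples returns the label map and the final seed positions. Because crossing a strong intensity edge is expensive under the assumed cost, region boundaries tend to settle on such edges rather than midway between seeds, which is the behaviour a Euclidean Lloyd iteration lacks.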

Keywords

Superpixel segmentation, Geodesic distance, Iterative optimization, Structure-sensitivity

Copyright information

© Springer Science+Business Media New York 2012

Authors and Affiliations

  • Peng Wang (1)
  • Gang Zeng (1)
  • Rui Gan (2)
  • Jingdong Wang (3)
  • Hongbin Zha (1)

  1. Key Laboratory on Machine Perception, Peking University, Beijing, China
  2. School of Mathematical Sciences, Peking University, Beijing, China
  3. Microsoft Research Asia, Beijing, China
