Abstract
Segmenting images into superpixels, which serve as supporting regions for feature vectors and primitives and thereby reduce computational complexity, is a common fundamental step in image analysis and computer vision tasks. In this paper, we describe a structure-sensitive superpixel technique that exploits Lloyd’s algorithm with the geodesic distance. Our method generates smaller superpixels to achieve relatively low under-segmentation in structure-dense regions with high intensity or color variation, and produces larger segments to increase computational efficiency in structure-sparse regions with homogeneous appearance. We adopt geometric flows to compute geodesic distances among pixels. In the segmentation procedure, the density of over-segments is automatically adjusted by iteratively optimizing an energy functional that embeds color homogeneity and structure density. Comparative experiments on the Berkeley database show that the proposed algorithm outperforms prior methods while offering computational efficiency comparable to that of TurboPixels. Further applications in image compression, object closure extraction and video segmentation demonstrate effective extensions of our approach.
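The core idea of running Lloyd's algorithm under a geodesic distance can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: geodesic distances are approximated with multi-source Dijkstra on a 4-connected pixel grid rather than with geometric flows, the step cost `1 + alpha * |intensity difference|` and the parameter `alpha` are hypothetical, the center update simply snaps to the cell pixel nearest the Euclidean centroid, and the structure-density term of the energy functional is omitted. All function names are illustrative.

```python
import heapq


def geodesic_voronoi(image, seeds, alpha=10.0):
    """Label every pixel with its geodesically nearest seed (multi-source Dijkstra)."""
    h, w = len(image), len(image[0])
    dist = [[float("inf")] * w for _ in range(h)]
    label = [[-1] * w for _ in range(h)]
    heap = []
    for k, (r, c) in enumerate(seeds):
        dist[r][c] = 0.0
        label[r][c] = k
        heap.append((0.0, r, c, k))
    heapq.heapify(heap)
    while heap:
        d, r, c, k = heapq.heappop(heap)
        if d > dist[r][c]:
            continue  # stale queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w:
                # Assumed step cost: crossing an intensity edge is expensive,
                # so cells tend to stop at image structure.
                nd = d + 1.0 + alpha * abs(image[nr][nc] - image[r][c])
                if nd < dist[nr][nc]:
                    dist[nr][nc] = nd
                    label[nr][nc] = k
                    heapq.heappush(heap, (nd, nr, nc, k))
    return label, dist


def update_seeds(label, seeds):
    """Move each seed to the pixel of its cell closest to the cell centroid."""
    members = [[] for _ in seeds]
    for r, row in enumerate(label):
        for c, k in enumerate(row):
            members[k].append((r, c))
    new_seeds = []
    for k, pixels in enumerate(members):
        if not pixels:
            new_seeds.append(seeds[k])  # empty cell: keep the old seed
            continue
        mr = sum(p[0] for p in pixels) / len(pixels)
        mc = sum(p[1] for p in pixels) / len(pixels)
        # Snapping to a member pixel keeps the seed inside its own cell.
        new_seeds.append(
            min(pixels, key=lambda p: (p[0] - mr) ** 2 + (p[1] - mc) ** 2)
        )
    return new_seeds


def geodesic_lloyd(image, seeds, iterations=5, alpha=10.0):
    """Alternate assignment and center update until seeds stop moving."""
    seeds = [tuple(s) for s in seeds]
    for _ in range(iterations):
        label, _ = geodesic_voronoi(image, seeds, alpha)
        new_seeds = update_seeds(label, seeds)
        if new_seeds == seeds:
            break
        seeds = new_seeds
    label, _ = geodesic_voronoi(image, seeds, alpha)
    return label, seeds
```

Here `image` is a list of rows of grayscale intensities and `seeds` is a list of `(row, column)` pairs. On an image made of two flat halves with one seed in each, the resulting cells follow the intensity edge rather than the Euclidean midline between the seeds, which is the behavior the geodesic distance is meant to provide.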
Notes
The Fast Marching Toolbox is written by Gabriel Peyré.
The Multi-scale Normalized Cuts Segmentation Toolbox is written by Timothee Cour et al.
The TurboPixels toolbox is written by Alex Levinshtein.
The Graph-Based method toolbox is written by Felzensz.
The GraphCut superpixel implementation is written by Olga Veksler.
The Lattice superpixel is written by Alastair P. Moore.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (NSFC Grant) 61005037 and 90920304, the National Basic Research Program of China (973 Program) 2011CB302202, and the Beijing Natural Science Foundation (BJNSF Grant) 4113071.
Cite this article
Wang, P., Zeng, G., Gan, R. et al. Structure-Sensitive Superpixels via Geodesic Distance. Int J Comput Vis 103, 1–21 (2013). https://doi.org/10.1007/s11263-012-0588-6
DOI: https://doi.org/10.1007/s11263-012-0588-6