# Structure-Sensitive Superpixels via Geodesic Distance

## Abstract

Segmenting images into superpixels as supporting regions for feature vectors and primitives to reduce computational complexity is commonly used as a fundamental step in various image analysis and computer vision tasks. In this paper, we describe a structure-sensitive superpixel technique that exploits Lloyd’s algorithm with the geodesic distance. Our method generates smaller superpixels to achieve relatively low under-segmentation in structure-dense regions with high intensity or color variation, and produces larger segments to increase computational efficiency in structure-sparse regions with homogeneous appearance. We adopt geometric flows to compute geodesic distances amongst pixels. In the segmentation procedure, the density of over-segments is automatically adjusted by iteratively optimizing an energy functional that embeds color homogeneity and structure density. Comparative experiments with the Berkeley database show that the proposed algorithm outperforms prior art while offering computational efficiency comparable to TurboPixels. Further applications in image compression, object closure extraction and video segmentation demonstrate effective extensions of our approach.

### Keywords

Superpixel segmentation · Geodesic distance · Iterative optimization · Structure-sensitivity

## 1 Introduction

Image over-segmentation has been widely applied in various computer vision pipelines, such as segmentation (Arbelaez et al. 2009; Xiao and Quan 2009; Wang et al. 2008; Hoiem et al. 2005), recognition (Kaufhold et al. 2006), object tracking (Wang et al. 2011; Rasmussen 2007), localization (Fulkerson et al. 2009) and modeling (He et al. 2006; Nwogu and Corso 2008; Micusík and Kosecká 2010).

In these applications, over-segments (aka superpixels) represent small regions with homogeneous appearance and conform to local image structures, and thus provide better support for region-based features than local windows. With superpixels, the computational cost decreases significantly, especially for probabilistic, combinatorial or discriminative approaches, since the underlying graph is greatly simplified in terms of graph nodes and edges. Most superpixel methods face the following challenges: on one hand they are required to reduce image complexity by locally grouping pixels with respect to intensity boundaries, and on the other hand they should avoid under-segmentation and maintain a certain level of detailed structure. These two aspects conflict with each other, and various optimization techniques have been proposed to trade them off, for example, the mean shift algorithm (Comaniciu and Meer 2002), the normalized cuts (Shi and Malik 2000), the local variation (Felzenszwalb and Huttenlocher 2004), the geometric flows (Levinshtein et al. 2009b) and the watershed (Vincent and Soille 1991; Meyer and Maragos 1999; Tai et al. 2007).

Figure 1 shows superpixels generated by the *Graph-based method* (Felzenszwalb and Huttenlocher 2004), *Lattice* (Moore et al. 2008), *N-Cuts* (Mori 2005; Levinshtein et al. 2009a), *TurboPixels* (Levinshtein et al. 2009b), *GraphCut superpixel* (Veksler et al. 2010) and also by using our method. It is noted by previous art (Levinshtein et al. 2009b) that the Graph-based method could easily generate under-segmentation for regions of irregular shapes and sizes due to the lack of compactness constraints, while the other methods employ compactness constraints and markedly restrict under-segmentation. The advantage of utilizing compactness has also been demonstrated in Levinshtein et al. (2009b).

N-Cuts-based superpixel methods (Mori 2005; Levinshtein et al. 2009a) are variations of the normalized cuts algorithm (Shi and Malik 2000), in which the compactness is guaranteed by normalizing the cut cost using edge weights. However, the global optimization is computationally costly, and the time complexity of the segmentation increases significantly with the number of pixels and image size.

Lattice (Moore et al. 2008) generates superpixels by detecting vertical or horizontal strips, and thus naturally maintains a grid structure of regions. In order to achieve an adaptive lattice, a scene shape prior is then combined into the lattice framework (Moore et al. 2009). Their further investigation of lattice superpixels (Moore et al. 2010) is derived from global optimization of a well-designed energy function. The superpixel generation is initialized with a grid, and the graph cut algorithm is adopted to optimize the vertical and horizontal seams alternately.

GraphCut superpixel (Veksler et al. 2010) over-segments an image using roughly regularly placed patches under a single energy framework. Compared with N-Cuts-based superpixel methods, this algorithm also preserves compactness and provides comparable results. However, by using the graph-cut optimization method, it becomes much more efficient.

The work most closely related to ours is that of Levinshtein et al., namely a geometric-flow-based algorithm (aka TurboPixels) for superpixel segmentation (Levinshtein et al. 2009b). Starting from initial seeds regularly placed onto the image, TurboPixels uses the level set method to evolve the superpixels. It yields a lattice-like structure of compact regions, and more importantly it is efficient, especially when compared with N-Cuts-based over-segmentation.

As shown in Fig. 1, a further observation is that, given the large diversity of scene layouts, perspective distortion is unavoidably introduced by the imaging process, and the density of image contents often varies across different parts of the image. The over-segments of *Lattice*, *N-Cuts*, *TurboPixels* and *GraphCut superpixel* in Fig. 1b–e are too large to represent image appearance and lead to under-segmentation in regions near intensity boundaries, while the segments are rather small in homogeneous regions, resulting in unnecessary overhead in high-level applications. Consequently, a quasi-uniform layout of superpixels on an image poses a dilemma for over-segmentation, since the number of superpixels is hard to choose. This is also evidenced by several methods that use multiple levels of over-segmentation as a starting point for further scene segmentation (Malisiewicz and Efros 2007; Russell et al. 2006).

In order to overcome the aforementioned problem and maintain intuitive consistency with the human vision system, a better image representation could be achieved by assigning the density of superpixels adaptively with respect to the co-occurrence of image contents or the “density” of image structures. This motivates us to introduce a structure-sensitive density function and to generate superpixels as regions with similar sizes in terms of this density function.

Most recently, the geodesic distance has been used for interactive segmentation and matting in Bai and Sapiro (2007), Criminisi et al. (2008), and Gulshan et al. (2010). To the best of our knowledge, however, it has never been used as a criterion for determining the distribution and magnitude of superpixels in over-segmentation.

In summary, the contributions of this paper include the following three aspects: (1) we propose an explicit energy for superpixel segmentation which aims at developing compact and structure-sensitive superpixels; (2) we introduce an efficient iterative optimization technique which includes two steps: over-segmentation with known superpixel centers and center relocation given known over-segments; (3) the geodesic distance, which is aware of the image contents, is introduced for measuring the structure and layout of superpixels.

### 1.1 Our Approach

Given a user-specified number of superpixels, the algorithm first places seeds with a small perturbation in order to avoid placement on strong intensity boundaries. The seeds are sampled based on the “density” of the image structure and serve as initial estimates of the superpixel centers.

There are two key components in this iterative approach. The first one generates over-segments from the current set of centers. The fast marching method (Sethian 1996b) is employed to compute the geodesic distance and thus to generate a Voronoi diagram based on the distance. It is computationally efficient but requires more restricted forms of the underlying velocity function. Our velocity function is based on the structure density, with special care taken to satisfy the required forms. The details of this part can be found in Sect. 3.1.

The second component refines the locations of the centers according to superpixels’ distribution and magnitudes. The relocation is based on an energy minimization formulation defined with the geodesic distance. Additional superpixels are created by splitting existing ones when certain conditions of their density are satisfied. The splitting strategy guarantees a descent of the energy and accelerates the algorithm. The description of this part is in Sect. 3.2.

In addition, we introduce an alternative optimization strategy which includes a merging scheme and discuss its pros and cons. An efficient implementation for acceleration is also proposed. The details are presented in Sects. 4.1, 4.2 and 4.3 respectively.

The paper is organized as follows: Sect. 2 introduces the formulation of structure-sensitive superpixels, and the optimization method is presented in Sect. 3. For a deeper understanding of our algorithm, we discuss two optimization methods in Sect. 4. In Sect. 5, we introduce the implementation details and evaluation experiments. Section 6 discusses some applications based on our algorithm, and the conclusion follows in Sect. 7.

## 2 Problem Formulation via Geodesic Distance

Given an input image \(I(\mathbf{x}),\) where \(\mathbf{x}\) indicates the pixel’s position \((x,y),\) the goal is to over-segment \(I(\mathbf{x})\) into dense small regions representing superpixels at different locations. We assign a unique label \(l\) to each superpixel and use \(L(\mathbf{x})\) to denote the label of the current pixel \(\mathbf{x}.\) The set of labels is represented as \(\mathcal{L}.\) All pixels belonging to the \(l\)-th superpixel \(S_l\) can be represented by \(S_l=\{\mathbf{x}\,|\,L(\mathbf{x})=l\}.\) The over-segmentation problem is therefore, in general, a clustering problem (aka unsupervised learning).

Since \(D(\mathbf{x})\) is a monotonically increasing function of gradient magnitude which is large on edges, the geodesic distance of a path across an intensity boundary is always larger than that in a homogeneous region. In addition, we can see the term \(D(\mathbf{x})\) produces a constant distance increment (i.e. \(D(\mathbf{x})=1\) if \(E(\mathbf{x})=0\)) in regions of homogeneous appearance, and thus retains the minimum possible isoperimetric ratio. This also makes the superpixels compact so as to avoid large under-segmentation when the image regions contain little edge information.
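The two properties stated above can be checked on a toy example. The sketch below assumes an exponential density \(D=\exp(E/\nu),\) which is only one function that satisfies them (monotonically increasing in the edge strength and equal to 1 where \(E=0\)). The exponential form, the names `density` and `path_length`, and the parameter `nu` are illustrative assumptions, and the path length is approximated by summing \(D\) over the pixels a path steps onto.

```python
import math

def density(edge_strength, nu=1.0):
    """Hypothetical density D = exp(E / nu), so D == 1 wherever E == 0."""
    return [[math.exp(e / nu) for e in row] for row in edge_strength]

def path_length(D, path):
    """Discrete geodesic length of a pixel path: sum of D over the steps taken."""
    return sum(D[r][c] for r, c in path[1:])

# Toy edge map: a vertical intensity boundary in column 2.
E = [[0.0, 0.0, 2.0, 0.0, 0.0] for _ in range(3)]
D = density(E)

flat_path = [(0, 0), (1, 0), (2, 0)]      # stays in a homogeneous region
crossing_path = [(0, 1), (0, 2), (0, 3)]  # same number of steps, crosses the edge

flat_len = path_length(D, flat_path)
crossing_len = path_length(D, crossing_path)
```

Both paths take two steps, but only the one that stays in the homogeneous region accumulates the constant increment of 1 per step; the crossing path is longer because it steps onto a high-density pixel.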

With the distance measurements at hand, the problem is to cluster the pixels into regions, yielding the superpixels. Recent geodesic-distance-based clustering methods use K-means (Feil and Abonyi 2007), or Fuzzy C-means (Kim et al. 2007).

### 2.1 Energy Minimization

#### 2.1.1 Homogeneity Penalization

#### 2.1.2 Structure Penalization

From Eq. (2), the density function is high in regions with much intensity variation and thus leads to smaller \(S_l\) on the image. This motivates us to penalize superpixels whose image area \(A_l\) is relatively large. Here we define an average area \(\overline{A},\) which is calculated as \(\frac{\sum _l A_l}{N}=\frac{\int _\mathbf{x} D(\mathbf{x})d\mathbf{x}}{N},\) in which \(N\) is the total number of superpixels specified by the user.
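As a small illustration of the average area, the sketch below reads \(A_l\) as the sum of \(D(\mathbf{x})\) over the pixels of \(S_l,\) which is what the identity \(\sum_l A_l=\int_\mathbf{x} D(\mathbf{x})d\mathbf{x}\) implies. The function names and the toy arrays are hypothetical.

```python
def superpixel_areas(D, labels):
    """Density-weighted area A_l of every label l (sum of D over its pixels)."""
    areas = {}
    for d_row, l_row in zip(D, labels):
        for d, l in zip(d_row, l_row):
            areas[l] = areas.get(l, 0.0) + d
    return areas

def average_area(D, n_superpixels):
    """Mean area: total density of the image divided by N."""
    return sum(sum(row) for row in D) / n_superpixels

# Left half is homogeneous (D = 1), right half is structure-dense (D = 3).
D = [[1, 1, 3, 3],
     [1, 1, 3, 3]]
labels = [[0, 0, 1, 1],
          [0, 0, 1, 1]]

areas = superpixel_areas(D, labels)
mean_area = average_area(D, n_superpixels=2)
```

Although both superpixels cover four pixels, the one in the dense region has three times the area under this measure, so it is the one a structure penalty would push to shrink.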

## 3 Iterative Optimization

Due to the non-convex property of Eq. (8) and the induced latent variable, we choose an iterative scheme to minimize the energy functional \(E_{total}.\) More precisely, our optimization process is similar to Lloyd’s algorithm (Lloyd 1982), as mentioned in Sect. 1.1. The convergence and robustness of Lloyd’s algorithm have been elaborated by Du et al. (2006). During the iterative procedure, the soft K-means weights \(W_{\mathbf{x},l}\) and the centers \(\{\mathbf{c}_l\}\) are updated alternately.

The main difference from the traditional EM optimization scheme is the use of a top-down hierarchical optimization strategy during the iteration for adaptively generating new components through splitting. This strategy, which is also known as bisecting k-means (Li and Chung 2007), accelerates the optimization. Moreover, we prioritize the efficiency of the algorithm and use approximations to reduce the computational burden during the iterative optimization. Since superpixel segmentation is a preprocessing step for many vision tasks, computational efficiency is essential for its usefulness in various applications. The strategies and accelerations of the two steps of our algorithm are described in Sects. 3.1 and 3.2 respectively.

### 3.1 Weight Estimation

Given a set of centers \(\{\mathbf{c}_l\},\) the goal of this step is to compute the weights \(W_{\mathbf{x},l}.\)

#### 3.1.1 The Geodesic Distance Computation

In order to generate the geodesic distances defined in Eq. (7), we employ the fast marching method proposed by Sethian (1996a) for better computational efficiency, since this over-segmentation step may be called several times during the outer iterations. Moreover, in our configuration, the front of the evolving contour can only move in the outward normal direction (i.e. the contour expands rather than shrinks), which fits well with the restricted forms of the underlying velocity functions of fast marching.
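To make the idea of a geodesic Voronoi diagram concrete, the sketch below labels every pixel with its geodesically nearest center. It is not fast marching: it substitutes multi-source Dijkstra on the 4-connected pixel grid, which expands outward-only fronts in the same spirit but measures distance along grid edges. The function name, the per-step cost (the density of the pixel being entered) and the toy data are assumptions.

```python
import heapq

def geodesic_voronoi(D, centers):
    """Label every pixel with its geodesically nearest center.

    D is a 2-D list of per-pixel costs; centers is a list of (row, col).
    Multi-source Dijkstra on the 4-connected grid stands in for fast marching.
    """
    rows, cols = len(D), len(D[0])
    dist = [[float("inf")] * cols for _ in range(rows)]
    label = [[-1] * cols for _ in range(rows)]
    heap = []
    for l, (r, c) in enumerate(centers):
        dist[r][c] = 0.0
        label[r][c] = l
        heapq.heappush(heap, (0.0, r, c, l))
    while heap:
        d, r, c, l = heapq.heappop(heap)
        if d > dist[r][c]:
            continue  # stale queue entry, a shorter path was already found
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                nd = d + D[nr][nc]  # cost of stepping onto the neighbour
                if nd < dist[nr][nc]:
                    dist[nr][nc] = nd
                    label[nr][nc] = l
                    heapq.heappush(heap, (nd, nr, nc, l))
    return label, dist

# One image row with an expensive "edge" pixel at column 5.
D = [[1, 1, 1, 1, 1, 10, 1]]
label, dist = geodesic_voronoi(D, centers=[(0, 1), (0, 6)])
```

In this toy row, the pixel at column 4 is closer to the second center in Euclidean terms (2 pixels versus 3), yet it is assigned to the first center because reaching it from the second requires crossing the expensive pixel.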

#### 3.1.2 Weight Approximation for Acceleration

In this step, the main task is to compute the weight \(W_{\mathbf{x},l}\) in Eq. (5). However, using Eq. (5) directly requires calculating the geodesic distance from each pixel to each center, which is extremely time-consuming and makes practical usage intractable. For efficiency, we compute the geodesic distance as rarely as possible and approximate Eq. (5). From the equation, the weight is negatively related to the distance between the pixel \(\mathbf{x}\) and the center \(\mathbf{c}_l\) in most cases. In experiments, we observe that \(W_{\mathbf{x},l}\) decreases with increasing distance, similarly to the numerator of Eq. (5).

### 3.2 Center Refinement

Given a set of superpixels \(L(\mathbf{x}),\) the goal of this step is to re-estimate the centers’ positions \(\{\mathbf{c}_l\}\) according to \(E_{total}\) in Eq. (8).

#### 3.2.1 Center Splitting

As mentioned in Sect. 1, one of the main goals is to generate superpixels that are sensitive to image structure (see Eq. (6)).

During the energy minimization process, given a superpixel \(S_l\) whose area \(A_l\) is much larger than \(\overline{A}\) while its center \(\mathbf{c}_l\) shifts little from the last iteration, the algorithm splits the center \(\mathbf{c}_l\) into two, since the segments subsequently generated from the new centers would produce a lower value of the energy functional in Eq. (8). Such splitting makes the search for the optimal centers much quicker.

Divisive clustering algorithms have been well discussed in Savaresi and Boley (2004). There are mainly two strategies for splitting: the bisecting K-means algorithm and principal direction divisive partitioning (PDDP). Such schemes add one cluster at a time and suffer from errors propagated from upper levels. In our case, adding one cluster at a time is inefficient, and the propagated error is also undesirable. Considering both efficiency and accuracy, we choose to bisect a superpixel whenever it meets our splitting criteria, which are heuristically derived from human intuition and \(E_{total},\) and we re-estimate all the centers’ locations after splitting during the optimization.

In the rare cases in which no splitting criterion is met while the required number of superpixels has not been reached, we select the largest few superpixels (10 in our implementation) for splitting. In a nutshell, the energy functional \(E_{total}\) keeps decreasing.
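The selection logic described in this subsection can be sketched as follows. The exact splitting criteria are not reproduced here, so the test used below (area above a multiple of \(\overline{A}\) and a small center shift) and the thresholds `area_ratio` and `max_shift` are hypothetical stand-ins; only the fallback of splitting the largest few superpixels is taken directly from the text.

```python
def select_splits(areas, shifts, mean_area, n_target,
                  area_ratio=2.0, max_shift=2.0, fallback=10):
    """Return the labels of superpixels to bisect in this iteration.

    areas:  dict label -> area A_l
    shifts: dict label -> distance the center moved since the last iteration
    area_ratio, max_shift: hypothetical thresholds standing in for the criteria
    """
    chosen = [l for l in areas
              if areas[l] > area_ratio * mean_area and shifts[l] <= max_shift]
    if not chosen and len(areas) < n_target:
        # Fallback: no criterion met but more superpixels are still required,
        # so split the few largest ones.
        largest = sorted(areas, key=areas.get, reverse=True)
        chosen = largest[:fallback]
    return sorted(chosen)

areas = {0: 30.0, 1: 8.0, 2: 9.0}
shifts = {0: 0.5, 1: 0.1, 2: 3.0}
by_criterion = select_splits(areas, shifts, mean_area=10.0, n_target=5)

by_fallback = select_splits({0: 12.0, 1: 11.0, 2: 9.0},
                            {0: 0.0, 1: 0.0, 2: 0.0},
                            mean_area=10.0, n_target=5, fallback=2)
```

In the first call only superpixel 0 is both large and stable; in the second call nothing meets the criterion, so the two largest superpixels are chosen instead.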

### 3.3 Initialization and Termination

#### 3.3.1 Initial Seeds Placement

One way to place the initial seeds is similar to TurboPixels (Levinshtein et al. 2009b): \(K\) initial seeds are placed in a lattice formation such that the distance between neighboring seeds is roughly equal to \(\sqrt{M/K},\) where \(M\) is the total number of pixels in the image. They also perturb the seeds by moving them away from pixels with high gradient magnitude to avoid strong intensity boundaries and a bad initialization for later iterations. Unlike the TurboPixels algorithm, we instead set \(K\) to be a fraction of the total number of superpixels \(N\) (specified by the user). During the optimization process, additional superpixels are generated by splitting existing ones until the number of superpixels reaches \(N.\)
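A minimal sketch of lattice seeding with perturbation is given below. The spacing \(\sqrt{M/K}\) follows the text; the perturbation rule (move each seed to the lowest-gradient pixel in a small window, preferring the closest one) and the names `place_seeds` and `radius` are assumptions chosen for illustration.

```python
import math

def place_seeds(grad, k, radius=1):
    """Place about k seeds on a lattice, then nudge each off strong gradients.

    grad is a 2-D list of gradient magnitudes. Each seed moves to the pixel of
    lowest gradient inside a (2*radius+1)^2 window around its lattice position;
    ties are broken in favour of the pixel closest to the lattice position.
    """
    rows, cols = len(grad), len(grad[0])
    step = math.sqrt(rows * cols / k)  # lattice spacing sqrt(M / K)
    seeds = []
    r = step / 2
    while r < rows:
        c = step / 2
        while c < cols:
            r0, c0 = int(r), int(c)
            window = [(rr, cc)
                      for rr in range(max(0, r0 - radius), min(rows, r0 + radius + 1))
                      for cc in range(max(0, c0 - radius), min(cols, c0 + radius + 1))]
            seeds.append(min(
                window,
                key=lambda p: (grad[p[0]][p[1]], abs(p[0] - r0) + abs(p[1] - c0))))
            c += step
        r += step
    return seeds

# 4x4 image with one strong-gradient pixel exactly on a lattice position.
grad = [[0.0] * 4 for _ in range(4)]
grad[1][1] = 5.0
seeds = place_seeds(grad, k=4)
```

Three of the four seeds stay on their lattice positions, while the seed that would have landed on the strong-gradient pixel is moved to an adjacent flat pixel.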

#### 3.3.2 Termination Conditions

We use the following termination conditions: (1) the change of energy between two successive iteration steps is less than a threshold \(\varepsilon _E;\) (2) the total number of iterations exceeds the predefined number \(N_{max}.\)
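The two termination conditions amount to a simple loop guard, sketched below. The callback `step`, which is assumed to perform one weight-estimation plus center-refinement pass and return the new state and its energy, is hypothetical, as are the default values of `eps_energy` and `n_max`.

```python
def iterate(step, state, eps_energy=1e-3, n_max=50):
    """Run `step` until the energy change drops below eps_energy or n_max passes.

    step(state) -> (new_state, energy) is a hypothetical callback performing
    one weight-estimation plus center-refinement pass.
    Returns (state, energy, number_of_passes_run).
    """
    prev_energy = None
    for it in range(1, n_max + 1):
        state, energy = step(state)
        if prev_energy is not None and abs(prev_energy - energy) < eps_energy:
            return state, energy, it  # condition (1): the energy has stalled
        prev_energy = energy
    return state, prev_energy, n_max  # condition (2): iteration cap reached

def halve(s):
    """Toy pass whose energy halves each time: 4, 2, 1, 0.5, 0.25, ..."""
    return s / 2, s / 2

stalled = iterate(halve, 8.0, eps_energy=0.3)
capped = iterate(halve, 8.0, eps_energy=0.0, n_max=3)
```

With the toy pass, the first call stops once two successive energies differ by less than the threshold, and the second call stops only because it runs out of iterations.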

In the final stage, very small superpixels are detected and removed, which results in a small number of unassigned pixels. The final superpixel result is generated by the over-segmentation (in Sect. 3.1) with the remaining centers.

### 3.4 Algorithm Complexity and Convergence

As the algorithm iteratively performs two routines in turn, it is easy to see that the time complexity of our algorithm is \(O((T_{segment}+T_{center})N_I),\) where \(T_{segment}\) and \(T_{center}\) are the complexities of the over-segmentation in Sect. 3.1 and the center refinement in Sect. 3.2 respectively, and \(N_I\) is the total number of iterations.

Let \(M\) denote the number of pixels in an image. The complexity of fast marching can be reduced to roughly \(O(M)\) (Yatziv et al. 2006). It can also be shown that \(T_{center}\) is \(O(M),\) since the center refinement can be achieved by a single scan of all pixels in the image. Thus, the complexity of the whole algorithm becomes \(O(MN_I).\)

## 4 Alternative Strategy for Optimization

### 4.1 Center Merging

Merging adjacent superpixels with very similar appearance is another strategy for making the superpixels’ areas represent the structure of the image. Based on the structure term in Eq. (8), merging a pair of adjacent superpixels in regions with a low density of image contents, while splitting a superpixel in high-density regions, lowers the energy.

Previous art (Muhr and Granitzer 2009) combines splitting and merging to search for the optimal number of clusters. Here we adopt splitting and merging together to search for a better superpixel structure consistent with the image density. In our attempt, we exploit a heuristic design and perform merging simultaneously with splitting during the iterative optimization.

Unlike the center splitting, for initialization, we directly place \(N\) seeds according to the sampling scheme in Sect. 3.3.

### 4.2 Discussion Between Splitting and Splitting-Merging

The Splitting-Merging scheme in Sect. 4.1 has three merits: (1) with both splitting and merging, the algorithm can converge in fewer iterations, as shown in Fig. 12; (2) it makes the algorithm less dependent on the initial fraction \(K\) of seeds required by the splitting strategy and thus reduces the number of user-set parameters; and (3) superpixels with small or tiny areas are given a further possibility to merge with nearby superpixels. On the other hand, the Splitting-Merging scheme incurs extra computational cost at each iteration, while Splitting requires no computation involving neighboring superpixels.

In general, after a few iterations far fewer superpixels need to be split than need to be merged, which makes Splitting more efficient than Splitting-Merging in a single iteration. In our experiments, although Splitting-Merging needs fewer iterations, Splitting achieves higher overall efficiency.

In addition, according to the comparison in our experiments in Sect. 5, the performance of Splitting (with \(K\) initialization seeds) and that of Splitting-Merging (with \(N\) initialization seeds) are very close under our parameter and experiment settings, which is consistent with the energy comparison between the two methods shown in Fig. 12. In fact, the energy curves of the Splitting and Splitting-Merging algorithms do not always converge to the same local minimum of the energy functional. However, our experiments show that the results of both algorithms are visually close.

### 4.3 Acceleration Scheme for Optimization

1. The center \(\mathbf{c}_{l}\) itself shifts little from its position in the last iteration, i.e. \(\Vert \mathbf{c}_{l}-\mathbf{c}_{l}^{\prime }\Vert \le T_{shift},\) and does not meet the splitting (or merging) criteria.
2. All of the center’s adjacent neighbor centers meet the first rule.
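The two rules can be evaluated with a pair of set comprehensions, as in the sketch below. How the resulting set is used (for example, to skip recomputation for those superpixels) is an assumption on our part; the function name and the data layout (dictionaries keyed by label, a `flagged` set of labels that meet a splitting or merging criterion) are likewise hypothetical.

```python
import math

def stable_centers(centers, prev_centers, neighbors, flagged, t_shift=2.0):
    """Return the labels that satisfy both rules.

    centers, prev_centers: dict label -> (row, col), this and the last iteration
    neighbors: dict label -> set of adjacent labels
    flagged: set of labels meeting a splitting (or merging) criterion
    """
    # Rule 1: small shift and no pending split/merge.
    rule1 = {l for l, c in centers.items()
             if math.dist(c, prev_centers[l]) <= t_shift and l not in flagged}
    # Rule 2: every adjacent center also satisfies rule 1.
    return {l for l in rule1 if all(n in rule1 for n in neighbors[l])}

centers = {0: (0, 0), 1: (0, 10), 2: (0, 20)}
prev_centers = {0: (0, 1), 1: (0, 10), 2: (0, 25)}
neighbors = {0: {1}, 1: {0, 2}, 2: {1}}

stable = stable_centers(centers, prev_centers, neighbors, flagged=set())
```

Here center 2 moved by 5 pixels, so it fails the first rule, and center 1 fails the second rule because center 2 is its neighbor; only center 0 passes both.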

## 5 Experimental Evaluations

### 5.1 Parameter Settings

In all experiments, our algorithm is not sensitive to most of the parameters, such as the standard deviation \(\sigma \) and \(\gamma \) in Eq. (2). We set \(\sigma \) adaptively as \(\frac{\sqrt{M/N}}{2}\) and \(\gamma \) as \(0.12,\) where \(M\) is the total number of pixels in the image and \(N\) is the user-specified number of superpixels. For the sensitive parameters, we constructed a validation set of \(20\) training images randomly chosen from the BSD300 data set (Martin et al. 2001) and tuned the parameters based on the performance over these images. We set \(\varepsilon \) in Eq. (13) to be \(2,\,T_s = 2\) and \(T_c = 4\) for the criteria in Eq. (18), and \(T_{shift} = 2\) in Eq. (24).
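For a sense of scale, the adaptive \(\sigma \) can be evaluated for one BSD300-sized image. The superpixel counts used below are arbitrary illustrative choices, and the function name is hypothetical.

```python
import math

def adaptive_sigma(n_pixels, n_superpixels):
    """sigma = sqrt(M / N) / 2, half the side of an average square superpixel."""
    return math.sqrt(n_pixels / n_superpixels) / 2

M = 481 * 321  # pixels in one BSD300 image
sigmas = {n: adaptive_sigma(M, n) for n in (100, 400, 1600)}
```

Since \(\sigma \) scales with \(1/\sqrt{N},\) quadrupling the number of superpixels halves the smoothing scale, which comes out at roughly 19.6, 9.8 and 4.9 pixels for the three counts above.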

### 5.2 Quantitative Evaluation

We evaluated the performance of the proposed algorithm by comparing its accuracy with several leading approaches: TurboPixels (Levinshtein et al. 2009b), N-Cuts (Shi and Malik 2000), Graph-based method (Felzenszwalb and Huttenlocher 2004), Lattice (Moore et al. 2008), GraphCut superpixel (Veksler et al. 2010) and SLIC superpixel (Radhakrishna et al. 2010). We also show the evaluation of the Splitting-Merging algorithm, in which we evaluate the performance with \(N\) initial seeds. Finally, to explicitly illustrate the effect of the iterative optimization, another baseline is constructed by a single fast-marching pass using the \(N\) initial seeds sampled based on Eq. (22), which we call “Sample Seeds”.

We use the Fast Marching Toolbox^{1} to compute geometric flows. The Multi-scale Normalized Cuts Segmentation Toolbox^{2} is applied for N-Cuts. We downloaded the TurboPixels implementation,^{3} the Graph-Based method implementation,^{4} the GraphCut superpixel implementation^{5} and the Lattice implementation^{6} online. In all tests, we use the authors’ original implementations and tune their parameters on a small validation set.

All experiments are performed on a quad-core 3.2 GHz computer, and the evaluation is based on the BSD300 data set (Martin et al. 2001), which contains 100 test images and 200 training images with \(481\times 321\) (or \(321\times 481\)) pixel resolution. The performance is averaged over a random subset (\(50\) images) of the test set because of the high computational cost of N-Cuts.

All of the above-mentioned algorithms preserve compactness except the Graph-based method. We compare ours with these algorithms using the following quantitative criteria.

#### 5.2.1 Under-Segmentation Error

#### 5.2.2 Boundary Recall

The comparison of the boundary recall of some state-of-the-art methods and our method is in Fig. 16b. From the results, ours outperforms the competitors such as TurboPixels and N-Cuts while remaining comparable to the Graph-based method which has high boundary recall.
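For readers who want to reproduce this kind of measurement, the sketch below computes boundary recall in its commonly used form: the fraction of ground-truth boundary pixels that lie within a small tolerance of a superpixel boundary pixel. The tolerance value, the use of Chebyshev distance, and the boundary definition (a pixel whose right or lower neighbor has a different label) are assumptions and may differ from the exact protocol behind Fig. 16b.

```python
def boundary_mask(labels):
    """Mark pixels whose right or lower neighbour carries a different label."""
    rows, cols = len(labels), len(labels[0])
    mask = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols and labels[r][c] != labels[r][c + 1]:
                mask[r][c] = True
            if r + 1 < rows and labels[r][c] != labels[r + 1][c]:
                mask[r][c] = True
    return mask

def boundary_recall(gt_labels, sp_labels, tol=2):
    """Fraction of ground-truth boundary pixels within `tol` pixels
    (Chebyshev distance) of a superpixel boundary pixel."""
    gt, sp = boundary_mask(gt_labels), boundary_mask(sp_labels)
    rows, cols = len(gt), len(gt[0])
    hit = total = 0
    for r in range(rows):
        for c in range(cols):
            if not gt[r][c]:
                continue
            total += 1
            near = any(sp[rr][cc]
                       for rr in range(max(0, r - tol), min(rows, r + tol + 1))
                       for cc in range(max(0, c - tol), min(cols, c + tol + 1)))
            hit += near
    return hit / total if total else 1.0

gt = [[0, 0, 0, 1, 1, 1]] * 2  # true boundary after column 2
sp = [[0, 0, 1, 1, 1, 1]] * 2  # superpixel boundary one pixel to the left

recall_loose = boundary_recall(gt, sp, tol=1)
recall_strict = boundary_recall(gt, sp, tol=0)
```

The one-pixel offset between the two boundaries is forgiven with a tolerance of 1 but counted as a miss with a tolerance of 0, which shows how strongly the reported recall depends on the tolerance chosen.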

#### 5.2.3 Achievable Segmentation Accuracy (ASA)

Figure 16c shows the comparison result between our approach and other algorithms, and we can see the proposed method yields a better achievable segmentation upper-bound.
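The sketch below computes ASA in its usual form, assuming the standard definition: each superpixel is assigned the ground-truth segment with which it overlaps most, and the score is the fraction of pixels labelled correctly under that assignment. The function name and toy label maps are hypothetical.

```python
from collections import Counter

def achievable_segmentation_accuracy(gt_labels, sp_labels):
    """Label each superpixel with its most-overlapping ground-truth segment
    and return the fraction of pixels that end up correctly labelled."""
    overlap = {}
    n_pixels = 0
    for gt_row, sp_row in zip(gt_labels, sp_labels):
        for g, s in zip(gt_row, sp_row):
            overlap.setdefault(s, Counter())[g] += 1
            n_pixels += 1
    return sum(max(counts.values()) for counts in overlap.values()) / n_pixels

gt = [[0, 0, 0, 1, 1, 1]]
sp = [[0, 0, 1, 1, 2, 2]]  # the middle superpixel straddles the true boundary

asa = achievable_segmentation_accuracy(gt, sp)
asa_perfect = achievable_segmentation_accuracy(gt, gt)
```

In the toy case the straddling superpixel can recover only one of its two pixels, so the upper bound is 5 out of 6 pixels; superpixels that never cross a ground-truth boundary give a score of 1.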

#### 5.2.4 Time Cost

As demonstrated in Levinshtein et al. (2009b), TurboPixels is much faster than N-Cuts and also exploits geometric flow for segmentation. We thus conducted time comparisons with TurboPixels. Meanwhile, we also include the Graph-based method, GraphCut superpixel and superpixel lattice to provide a complete view of the relative running times of various superpixel methods. In our experiments, the running time of the compared algorithms is tested with respect to image size and superpixel number.

### 5.3 Qualitative Results

#### 5.3.1 Combination with Supervised Learnt Edge Maps

#### 5.3.2 Combination with Image Saliency

## 6 Applications

### 6.1 Image Compression

### 6.2 Foreground Object Contour Closure and Segmentation

Our algorithm adapts to image structure, which is also very useful for generating better segmentation. In the application of foreground segmentation, we adopted the contour closure extraction method from the work of Levinshtein et al. (2010). Specifically, the goal of contour closure is to find a cycle of connected contour fragments that separates an object from its background. Previous art (Levinshtein et al. 2010) transforms such a problem into finding subsets of superpixels. In order to generate a better segmentation, the algorithm requires the superpixel boundaries to better capture the object edges; thus, in their work, they apply the learned Pb edge map (Maire et al. 2008) for generating the superpixels. The algorithm optimizes the edge difference between the object and background and the edge homogeneity within the object region.

We conclude that our approach achieves visually better segmentation results owing to its structure sensitivity. Specifically, our superpixels capture the object boundary more effectively and thus provide better optimized solutions for the contour closure algorithm.

### 6.3 Video Segmentation

Our approach can be easily extended to video segmentation and is more suitable for this task than previous superpixel methods with quasi-lattice formation such as Levinshtein et al. (2009b). Thanks to the spatial consistency between different frames, instead of being recalculated at each frame, the optimized positions of the centers can be easily transferred by methods like SIFT flow (Liu et al. 2009) or optical flow (Lucas and Kanade 1981). SIFT flow gives better correspondence, but it takes more time to perform dense matching. Here, we computed the superpixel flow based on the popular Lucas–Kanade (LK) algorithm to find stable corresponding centers between adjacent frames. In the video segmentation scenario, the initial seeds have already been given a structure-sensitive placement which was optimized using the previous frame. This implies that the algorithm can converge much faster for later frames.
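The center transfer between frames can be sketched as a lookup into a flow field. The code below assumes the flow has already been estimated (for instance by a Lucas–Kanade tracker) and is stored as one `(d_row, d_col)` displacement per pixel; this layout, the rounding and the clamping to the image bounds are illustrative assumptions.

```python
def propagate_centers(centers, flow, shape):
    """Shift each center by the flow vector at its position, clamped to the image.

    centers: list of (row, col) in the previous frame
    flow:    2-D list of (d_row, d_col) displacements, assumed precomputed
    shape:   (rows, cols) of the next frame
    """
    rows, cols = shape
    moved = []
    for r, c in centers:
        dr, dc = flow[r][c]
        nr = min(max(int(round(r + dr)), 0), rows - 1)
        nc = min(max(int(round(c + dc)), 0), cols - 1)
        moved.append((nr, nc))
    return moved

# Uniform motion of two pixels to the right on a 3x5 frame.
flow = [[(0, 2)] * 5 for _ in range(3)]
seeds_next = propagate_centers([(1, 1), (1, 4)], flow, shape=(3, 5))
```

The propagated positions would then serve as the initial seeds for the next frame in place of a fresh lattice placement.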

## 7 Conclusion

We proposed a structure-sensitive over-segmentation algorithm for computing superpixels for images. It greatly limits under-segmentation by considering the homogeneity of image appearance, density of image contents, compactness of shape, and regularity of layout. The over-segmentation can be formulated as a soft clustering problem by exploiting geodesic distance, and a locally optimal solution can be obtained via geometric flows and an efficient iterative optimization strategy that introduces center splitting. Experimental results on the Berkeley segmentation dataset demonstrate that our algorithm outperforms many state-of-the-art approaches, whilst the running time of the algorithm is acceptable for many practical uses.

Additionally, we discussed another optimization strategy that introduces merging, which gives performance similar to the pure splitting strategy. Finally, we provided several potential applications, e.g. supervised segmentation, image compression and video segmentation, and showed the superiority of our structure-sensitive superpixel method over methods with a grid layout, such as TurboPixels.

## Footnotes

1. The Fast Marching Toolbox is written by Gabriel Peyre.
2. The Multi-scale Normalized Cuts Segmentation Toolbox is written by Timothee Cour et al.
3. The TurboPixels toolbox is written by Alex Levinshtein.
4. The Graph-Based method toolbox is written by Felzensz.
5. The GraphCut superpixel implementation is written by Olga Veksler.
6. The Lattice superpixel is written by Alastair P. Moore.

## Notes

### Acknowledgments

This work is supported by National Nature Science Foundation of China (NSFC Grant) 61005037 and 90920304, National Basic Research Program of China (973 Program) 2011CB302202, and Beijing Natural Science Foundation (BJNSF Grant) 4113071.

### References

- Alpert, S., Galun, M., Basri, R., & Brandt, A. (2007). Image segmentation by probabilistic bottom-up aggregation and cue integration. In
*CVPR*.Google Scholar - Arbelaez, P., Maire, M., Fowlkes, C. C., & Malik, J. (2009). From contours to regions: An empirical evaluation. In
*CVPR*(pp. 2294–2301).Google Scholar - Bai, X., & Sapiro, G. (2007). A geodesic framework for fast interactive image and video segmentation and matting. In
*ICCV*(pp. 1–8).Google Scholar - Comaniciu, D., & Meer, P. (2002). Mean shift: A robust approach toward feature space analysis.
*IEEE Transactions on Pattern Analysis and Machine Intelligence*,*24*(5), 603–619.CrossRefGoogle Scholar - Criminisi, A., Sharp, T., & Blake, A. (2008). Geos: Geodesic image segmentation. In
*ECCV*(pp. 99–112).Google Scholar - Dollár, P., Tu, Z., & Belongie, S. (2006). Supervised learning of edges and object boundaries. In
*CVPR*(Vol. 2, pp. 1964–1971).Google Scholar - Du, Q., Emelianenko, M., & Ju, L. (2006). Convergence of the lloyd algorithm for computing centroidal voronoi tessellations.
*SIJNA: SIAM Journal on Numerical Analysis, 44*, 102–119.Google Scholar - Feil, B., & Abonyi, J. (2007).
*Geodesic distance based fuzzy clustering*. Lecture notes in computer science, soft computing in industrial applications (pp. 50–59).Google Scholar - Felzenszwalb, P. F., & Huttenlocher, D. P. (2004). Efficient graph-based image segmentation.
*International Journal of Computer Vision*,*59*(2), 167–181.CrossRefGoogle Scholar - Fulkerson, B., Vedaldi, A., & Soatto, S. (2009). Class segmentation and object localization with superpixel neighborhoods. In
*ICCV*(pp. 670–677).Google Scholar - Gulshan, V., Rother, C., Criminisi, A., Blake, A., & Zisserman, A. (2010). Geodesic star convexity for interactive image segmentation. In
*CVPR*(pp. 3129–3136).Google Scholar - Harel, J., Koch, C., & Perona, P. (2006). Graph-based visual saliency. In B. Schölkopf, J. C. Platt, & T. Hoffman (Eds.),
*NIPS*(pp. 545–552). Cambridge, MA: MIT Press.Google Scholar - He, X., Zemel, R. S., & Ray, D. (2006). Learning and incorporating top-down cues in image segmentation. In
*ECCV*(Vol. 1, pp. 338–351).Google Scholar - Hoiem, D., Efros, A. A., & Hebert, M. (2005). Geometric context from a single image. In
*ICCV*(pp. 654–661).Google Scholar - Hyvärinen, A. (1999). The fixed-point algorithm and maximum likelihood estimation for independent component analysis.
*Neural Processing Letters*, *10*(1), 1–5.
- Jolliffe, I. T. (1986). *Principal component analysis*. New York: Springer.
- Kaufhold, J. P., Collins, R., Hoogs, A., & Rondot, P. (2006). Recognition and segmentation of scene content using region-based classification. In *ICPR* (Vol. 1, pp. 755–760).
- Kim, J., Shim, K. H., & Choi, S. (2007). Soft geodesic kernel k-means. In *ICASSP* (pp. 429–432).
- Levinshtein, A., Dickinson, S. J., & Sminchisescu, C. (2009a). Multiscale symmetric part detection and grouping. In *ICCV* (pp. 2162–2169).
- Levinshtein, A., Sminchisescu, C., & Dickinson, S. J. (2010). Optimal contour closure by superpixel grouping. In *ECCV* (Vol. 2, pp. 429–493).
- Levinshtein, A., Stere, A., Kutulakos, K. N., Fleet, D. J., Dickinson, S. J., & Siddiqi, K. (2009b). TurboPixels: Fast superpixels using geometric flows. *IEEE Transactions on Pattern Analysis and Machine Intelligence*, *31*(12), 2290–2297.
- Li, Y., & Chung, S. M. (2007). Parallel bisecting k-means with prediction clustering algorithm. *The Journal of Supercomputing*, *39*, 19–37.
- Liu, C., Yuen, J., & Torralba, A. (2009). Nonparametric scene parsing: Label transfer via dense scene alignment. In *CVPR* (pp. 1972–1979).
- Lloyd, S. P. (1982). Least squares quantization in PCM.
*IEEE Transactions on Information Theory*, *28*, 128–137.
- Lucas, B., & Kanade, T. (1981). An iterative image registration technique with an application to stereo vision. In *Proceedings of the DARPA image understanding workshop* (pp. 121–130).
- Maire, M., Arbelaez, P., Fowlkes, C., & Malik, J. (2008). Using contours to detect and localize junctions in natural images. In *CVPR*.
- Malisiewicz, T., & Efros, A. A. (2007). Improving spatial support for objects via multiple segmentations. In *BMVC*.
- Martin, D. R., Fowlkes, C., & Malik, J. (2004). Learning to detect natural image boundaries using local brightness, color, and texture cues. *IEEE Transactions on Pattern Analysis and Machine Intelligence*, *26*(5), 530–549.
- Martin, D. R., Fowlkes, C., Tal, D., & Malik, J. (2001). A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In *ICCV* (pp. 416–425).
- Meyer, F., & Maragos, P. (1999). Multiscale morphological segmentations based on watershed, flooding, and eikonal PDE. In *Scale space* (pp. 351–362).
- Micusík, B., & Kosecká, J. (2010). Multi-view superpixel stereo in urban environments. *International Journal of Computer Vision*, *89*(1), 106–119.
- Moore, A. P., Prince, S. J. D., & Warrell, J. (2010). “Lattice cut” - Constructing superpixels using layer constraints. In *CVPR* (pp. 2117–2124).
- Moore, A. P., Prince, S., Warrell, J., Mohammed, U., & Jones, G. (2008). Superpixel lattices. In *CVPR*.
- Moore, A. P., Prince, S. J. D., Warrell, J., Mohammed, U., & Jones, G. (2009). Scene shape priors for superpixel segmentation. In *ICCV* (pp. 771–778).
- Mori, G. (2005). Guiding model search using segmentation. In *ICCV* (pp. 1417–1423).
- Muhr, M., & Granitzer, M. (2009). Automatic cluster number selection using a split and merge K-means approach. In A. M. Tjoa & R. Wagner (Eds.), *DEXA workshops* (pp. 363–367). IEEE Computer Society.
- Nwogu, I., & Corso, J. J. (2008). (BP)\(^{2}\): Beyond pairwise belief propagation labeling by approximating Kikuchi free energies. In *CVPR*.
- Peyré, G., Péchaud, M., Keriven, R., & Cohen, L. D. (2010). Geodesic methods in computer vision and graphics. *Foundations and Trends in Computer Graphics and Vision*, *5*(3–4), 197–397.
- Radhakrishna, A., Appu, S., Kevin, S., Aurelien, L., Pascal, F., & Susstrunk, S. (2010). SLIC superpixels. Technical Report 149300 EPFL (June), p. 15.
- Rasmussen, C. (2007). Superpixel analysis for object detection and tracking with application to UAV imagery. In *Advances in visual computing* (Vol. I, pp. 46–55).
- Russell, B. C., Freeman, W. T., Efros, A. A., Sivic, J., & Zisserman, A. (2006). Using multiple segmentations to discover objects and their extent in image collections. In *CVPR* (Vol. 2, pp. 1605–1614).
- Savaresi, S. M., & Boley, D. (2004). A comparative analysis on the bisecting K-means and the PDDP clustering algorithms. *Intelligent Data Analysis*, *8*(4), 345–362.
- Sethian, J. A. (1996a). A fast marching level set method for monotonically advancing fronts. *Proceedings of the National Academy of Sciences*, *93*(4), 1591–1595.
- Sethian, J. A. (1996b). A fast marching level set method for monotonically advancing fronts. *Proceedings of the National Academy of Sciences*, *93*(4), 1591–1595.
- Shi, J., & Malik, J. (2000). Normalized cuts and image segmentation. *IEEE Transactions on Pattern Analysis and Machine Intelligence*, *22*(8), 888–905.
- Shotton, J., Winn, J. M., Rother, C., & Criminisi, A. (2006). TextonBoost: Joint appearance, shape and context modeling for multi-class object recognition and segmentation. In *ECCV* (Vol. 1, pp. 1–15).
- Tai, X. C., Hodneland, E., Weickert, J., Bukoreshtliev, N. V., Lundervold, A., & Gerdes, H. H. (2007). Level set methods for watershed image segmentation. In *Scale-space* (pp. 178–190).
- Veksler, O., Boykov, Y., & Mehrani, P. (2010). Superpixels and supervoxels in an energy optimization framework. In *ECCV* (Vol. 5, pp. 211–224).
- Vincent, L., & Soille, P. (1991). Watersheds in digital spaces: An efficient algorithm based on immersion simulations. *IEEE Transactions on Pattern Analysis and Machine Intelligence*, *13*(6), 583–598.
- Wang, J., Jia, Y., Hua, X. S., Zhang, C., & Quan, L. (2008). Normalized tree partitioning for image segmentation. In *CVPR*.
- Wang, S., Lu, H., Yang, F., & Yang, M. H. (2011). Superpixel tracking. In *ICCV* (pp. 1323–1330).
- Xiao, J., & Quan, L. (2009). Multiple view semantic segmentation for street view images. In *ICCV* (pp. 686–693).
- Yatziv, L., Bartesaghi, A., & Sapiro, G. (2006). O(n) implementation of the fast marching algorithm. *Journal of Computational Physics*, *212*(2), 393–399.