Abstract
This work bridges the gap between two popular methodologies for data partitioning: kernel clustering and regularization-based segmentation. While addressing closely related practical problems, these general methodologies may seem very different based on how they are covered in the literature. The differences may show up in motivation, formulation, and optimization, e.g. spectral relaxation versus max-flow. We explain how regularization and kernel clustering can work together and why this is useful. Our joint energy combines standard regularization, e.g. MRF potentials, and kernel clustering criteria like normalized cut. Complementarity of such terms is demonstrated in many applications using our bound optimization Kernel Cut algorithm for the joint energy (code is publicly available). While detailing combinatorial move-making, our main focus are new linear kernel and spectral bounds for kernel clustering criteria allowing their integration with any regularization objectives with existing discrete or continuous solvers.
Similar content being viewed by others
Notes
Iverson brackets \([\cdot ]\) enclosing a logical proposition, e.g. \([S_p=k]\), return 1 or 0 depending on true or false value of this proposition.
Here spectral bound means spectral auxiliary function in the context of optimization, not to be confused with bounds on eigenvalues.
These can be isometrically embedded into a Hilbert space Hein et al. (2004).
Mercer’s theorem is a similar eigen decomposition for continuous p.d. kernels k(x, y) giving infinite-dimensional Hilbert embedding \(\phi (x)\). Discrete kernel embedding \(\phi _p\equiv \phi (I_p)\) in Sect. 4.1 (60) has finite dimension \(|\varOmega |\), which is still much higher than the dimension of points \(I_p\), e.g. \({\mathcal {R}}^3\) for colors. Section 4.1 also shows lower dimensional embeddings \(\tilde{\phi }_p\) approximating isometry (20).
Function \({{\mathop {e}\limits ^{\frown }}}\) and gradient \({\nabla {\mathop {e}\limits ^{\frown }}}\) are defined only at non-zero indicators \(X_t\) where \(w' X_t >0\). We can formally extend \({{\mathop {e}\limits ^{\frown }}}\) to \(X=\mathbf{0}\) and make the bound \(T_t\) work for \({{\mathop {e}\limits ^{\frown }}}\) at \(X_t=\mathbf{0}\) with some supergradient. However, \(X_t=0\) is not a problem in practice since it corresponds to an empty segment.
If k is given as a continuous kernel \(k(x,y):{\mathcal {R}}^N\times {\mathcal {R}}^N \rightarrow {\mathcal {R}}\) matrix \(\mathcal{K}\) is its restriction to finite data set \(\{I_p|p\in \varOmega \} \subset {\mathcal {R}}^N \).
Without \(K\!N\!N\) or other special kernel accelerations.
The smoothness weights for different energies are not directly comparable; Fig. 19 shows all the curves for better visualization.
We found that for the GrabCut database adding texture features to RGB does not improve the results.
References
Achanta, R., Shaji, A., Smith, K., Lucchi, A., Fua, P., & Ssstrunk, S. (2012). Slic superpixels compared to state-of-the-art superpixel methods. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(11), 2274–2282.
Adams, A., Baek, J., & Davis, M. A. (2010). Fast high-dimensional filtering using the permutohedral lattice. Computer Graphics Forum, 29(2), 753–762.
Aggarwal, C. C., & Reddy, C. K. (eds.) (2014). Data clustering: Algorithms and applications. London: Chapman & Hall/CRC.
Arbelaez, P., Maire, M., Fowlkes, C., & Malik, J. (2011). Contour detection and hierarchical image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(5), 898–916.
Bach, F., & Jordan, M. (2003). Learning spectral clustering. Advances in Neural Information Processing Systems, 16, 305–312.
Bazaraa, M. S., Sherali, H. D., & Shetty, C. M. (2006). Nonlinear programming: Theory and algorithms. Hoboken: Wiley.
Belkin, M., & Niyogi, P. (2003). Laplacian eigenmaps for dimensionality reduction and data representation. Neural Computation, 15(6), 1373–1396.
Belongie, S., & Malik, J. (1998). Finding boundaries in natural images: A new method using point descriptors and area completion. In Proceedings of the European Conference on Computer Vision (ECCV).
Ben Ayed, I., Gorelick, L., & Boykov, Y. (2013). Auxiliary cuts for general classes of higher order functionals. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 1304–1311). Portland, Oregon. http://www.csd.uwo.ca/~yuri/Abstracts/cvpr13-auxcut-abs.shtml.
Ben Ayed, I., Mitiche, A., & Belhadj, Z. (2005). Multiregion level set partitioning of synthetic aperture radar images. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 27(5), 793–800.
Bishop, C. M. (2006). Pattern recognition and machine learning. Berlin: Springer.
Boyd, S., & Vandenberghe, L. (2004). Convex optimization. Cambridge: Cambridge University Press.
Boykov, Y., & Jolly, M. P. (2001). Interactive graph cuts for optimal boundary and region segmentation of objects in N-D images. In ICCV (vol. I, pp. 105–112).
Boykov, Y., & Kolmogorov, V. (2003). Computing geodesics and minimal surfaces via graph cuts. In International Conference on Computer Vision (vol. I, pp. 26–33).
Boykov, Y., & Funka-Lea, G. (2006). Graph cuts and efficient N-D image segmentation. International Journal of Computer Vision (IJCV), 70(2), 109–131.
Boykov, Y., Veksler, O., & Zabih, R. (2001). Fast approximate energy minimization via graph cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(11), 1222–1239.
Boykov, Y., Veksler, O., & Zabih, R. (2001). Fast approximate energy minimization via graph cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(11), 1222–1239.
Breiman, L. (1996). Technical note: Some properties of splitting criteria. Machine Learning, 24(1), 41–47.
Brox, T., & Malik, J. (2010). Object segmentation by long term analysis of point trajectories. In Computer Vision–ECCV 2010 (pp. 282–295). Springer.
Brox, T., & Malik, J. (2011). Large displacement optical flow: Descriptor matching in variational motion estimation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(3), 500–513.
Carreira-Perpinan, M. A., & Wang, W. (2013). The K-modes algorithm for clustering. arXiv:1304.6478v1 [cs.LG]
Caselles, V., Kimmel, R., & Sapiro, G. (1997). Geodesic active contours. International Journal of Computer Vision, 22(1), 61–79.
Chambolle, A. (2004). An algorithm for total variation minimization and applications. Journal of Mathematical imaging and vision, 20(1–2), 89–97.
Chambolle, A., & Pock, T. (2011). A first-order primal-dual algorithm for convex problems with applications to imaging. Journal of Mathematical Imaging and Vision, 40(1), 120–145.
Chan, T., & Vese, L. (2001). Active contours without edges. IEEE Transactions on Image Processing, 10(2), 266–277.
Chew, S. E., & Cahill, N. D. (2015). Semi-supervised normalized cuts for image segmentation. In The IEEE International Conference on Computer Vision (ICCV).
Chitta, R., Jin, R., Havens, T., & Jain, A. (2011). Scalable kernel clustering: Approximate kernel k-means. In KDD (pp. 895–903).
Collins, M.D., Liu, J., Xu, J., Mukherjee, L., & Singh, V. (2014). Spectral clustering with a convex regularizer on millions of images. In Computer Vision–ECCV 2014 (pp. 282–298). Springer.
Comaniciu, D., & Meer, P. (2002). Mean shift: A robust approach toward feature space analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 24(5), 603–619.
Cox, I., Rao, S., & Zhong, Y. (1996). “Ratio regions”: A technique for image segmentation. In International Conference on Pattern Recognition (ICPR) (pp. 557–564).
Cox, T., & Cox, M. (2000). Multidimensional scaling. Boca Raton: CRC Press.
Cremers, D., Rousson, M., & Deriche, R. (2007). A review of statistical approaches to level set segmentation: Integrating color, texture, motion and shape. International Journal of Computer Vision, 72(2), 195–215.
Delong, A., Osokin, A., Isack, H., & Boykov, Y. (2012). Fast approximate energy minization with label costs. International Journal of Computer Vision (IJCV), 96(1), 1–27.
Deng, Z., Todorovic, S., & Latecki, L. J. (2015). Semantic segmentation of RGBD images with mutex constraints. In International Conference on Computer Vision (ICCV). Santiago, Chile.
Dhillon, I., Guan, Y., & Kulis, B. (2004). Kernel k-means, spectral clustering and normalized cuts. In KDD.
Dhillon, I., Guan, Y., & Kulis, B. (2007). Weighted graph cuts without eigenvectors: A multilevel approach. IEEE Transactions on Pattern Analysis and Machine Learning (PAMI), 29(11), 1944–1957.
Dou, M., Taylor, J., Fuchs, H., Fitzgibbon, A., & Izadi, S. (2015). 3D scanning deformable objects with a single RGBD sensor. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 493–501).
Duda, R., Hart, P., & Stork, D. (2001). Pattern classification. Hoboken: Wiley.
Duda, R. O., Hart, P. E., & Stork, D. G. (2001). Pattern classification. Hoboken: Wiley.
Eriksson, A., Olsson, C., & Kahl, F. (2011). Normalized cuts revisited: A reformulation for segmentation with linear grouping constraints. Journal of Mathematical Imaging and Vision, 39(1), 45–61.
Geman, S., & Geman, D. (1984). Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6, 721–741.
Girolami, M. (2002). Mercer kernel-based clustering in feature space. IEEE Transactions on Neural Networks, 13(3), 780–784.
Golub, G. H., & Van Loan, C. F. (2012). Matrix computations (Vol. 3). Baltimore: JHU Press.
Gottfried, J. M., Fehr, J., & Garbe, C. S. (2011). Computing range flow from multi-modal kinect data. In Advances in visual computing (pp. 758–767). Springer.
Gulshan, V., Lempitsky, V., & Zisserman, A. (2011). Humanising grabcut: Learning to segment humans using the kinect. In Computer Vision Workshops (ICCV Workshops), 2011 IEEE International Conference on (pp. 1127–1133). IEEE.
Hein, M., Lal, T. N., & Bousquet, O. (2004). Hilbertian metrics on probability measures and their application in SVM’s. Pattern Recognition LNCS, 3175, 270–277.
Hochbaum, D. S. (2010). Polynomial time algorithms for ratio regions and a variant of normalized cut. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(5), 889–898.
Hofmann, T., & Buhmann, J. (1997). Pairwise data clustering by deterministic annealing. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 19(1), 1–14.
Ishikawa, H. (2003). Exact optimization for Markov random fields with convex priors. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(10), 1333–1336.
Jayasumana, S., Hartley, R., Salzmann, M., Li, H., & Harandi, M. (2015). Kernel methods on Riemannian manifolds with Gaussian RBF kernels. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37, 2464–2477.
Jermyn, I., & Ishikawa, H. (2001). Globally optimal regions and boundaries as minimum ratio weight cycles. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 23(10), 1075–1088.
Kappes, J. H., Andres, B., Hamprecht, F. A., Schnörr, C., Nowozin, S., Batra, D., et al. (2015). A comparative study of modern inference techniques for structured discrete energy minimization problems. International Journal of Computer Vision, 115(2), 155–184.
Kearns, M., Mansour, Y., & Ng, A. (1997). An information-theoretic analysis of hard and soft assignment methods for clustering. In Conference on Uncertainty in Artificial Intelligence (UAI).
Kohli, P., Torr, P. H., et al. (2009). Robust higher order potentials for enforcing label consistency. International Journal of Computer Vision, 82(3), 302–324.
Kolmogorov, V., Boykov, Y., & Rother, C. (2007). Applications of parametric maxflow in computer vision. In IEEE International Conference on Computer Vision (ICCV).
Kolmogorov, V. (2006). Convergent tree-reweighted message passing for energy minimization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(10), 1568–1583.
Kolmogorov, V. (2006). Convergent tree-reweighted message passing for energy minimization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(10), 1568–1583.
Krahenbuhl, P., & Koltun, V. (2011). Efficient inference in fully connected CRFs with Gaussian edge potentials. In NIPS.
Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems (pp. 1097–1105).
Kulis, B., Basu, S., Dhillon, I., & Mooney, R. (2009). Semi-supervised graph clustering: A kernel approach. Machine Learning, 74(1), 1–22.
Lange, K., Hunter, D. R., & Yang, I. (2000). Optimization transfer using surrogate objective functions. Journal of Computational and Graphical Statistics, 9(1), 1–20.
Lempitsky, V., Blake, A., & Rother, C. (2008). Image segmentation by branch-and-mincut. In ECCV.
Lempitsky, V., Kohli, P., Rother, C., & Sharp, T. (2009). Image segmentation with a bounding box prior. In International Conference on Computer Vision (ICCV) (pp. 277–284).
Li, T., Ma, S., & Ogihara, M. (2004). Entropy-based criterion in categorical clustering. In: International Conference Mobile Learning.
Li, S. (2009). Markov random field modeling in image analysis (3rd ed.). Berlin: Springer.
Louppe, G., Wehenkel, L., Sutera, A., & Geurts, P. (2013). Understanding variable importances in forests of randomized trees. In NIPS (pp. 431–439).
MacKay, D. J. (2003). Information theory, inference, and learning algorithms. Cambridge: Cambridge University Press.
Malik, J., Belongie, S., Leung, T., & Shi, J. (2001). Contour and texture analysis for image segmentation. International Journal of Computer Vision, 43(1), 7–27.
Marin, D., Tang, M., Ayed, I. B., & Boykov, Y. (2018). Kernel clustering: Density biases and solutions. https://doi.org/10.1109/TPAMI.2017.2780166.
Martin, D., Fowlkes, C., Tal, D., & Malik, J. (2001). A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In Computer Vision, 2001. ICCV 2001. Proceedings. Eighth IEEE International Conference on (vol. 2, pp. 416–423). IEEE.
McGuinness, K., & Oconnor, N. E. (2010). A comparative evaluation of interactive segmentation algorithms. Pattern Recognition, 43(2), 434–444.
Menze, M., & Geiger, A. (2015). Object scene flow for autonomous vehicles. In Conference on Computer Vision and Pattern Recognition (CVPR).
Mitiche, A., & Ayed, I. B. (2010). Variational and level set methods in image segmentation. Berlin: Springer.
Muller, K., Mika, S., Ratsch, G., Tsuda, K., & Scholkopf, B. (2001). An introduction to kernel-based learning algorithms. IEEE Transactions on Neural Networks, 12(2), 181–201.
Müller, K. R., Mika, S., Rätsch, G., Tsuda, K., & Schölkopf, B. (2001). An introduction to kernel-based learning algorithms. IEEE Transactions on Neural Networks, 12(2), 181–201.
Mumford, D., & Shah, J. (1989). Optimal approximations by piecewise smooth functions and associated variational problems. Communications on Pure and Applied Mathematics, 42, 577–685.
Narasimhan, M., & Bilmes, J. A. (2005). A submodular–supermodular procedure with applications to discriminative structure learning. In UAI (pp. 404–412).
Newcombe, R. A., Izadi, S., Hilliges, O., Molyneaux, D., Kim, D., Davison, A. J., Kohi, P., Shotton, J., Hodges, S., & Fitzgibbon, A. (2011). Kinectfusion: Real-time dense surface mapping and tracking. In Mixed and augmented reality (ISMAR), 2011 10th IEEE international symposium on (pp. 127–136). IEEE.
Ng, A., Jordan, M., & Weiss, Y. (2002). On spectral clustering: analysis and an algorithm. In Advances in Neural Information Processing Systems (NIPS) (vol. 2, pp. 849–856).
Nieuwenhuis, C., & Cremers, D. (2013). Spatially varying color distributions for interactive multilabel segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(5), 1234–1247.
Ochs, P., & Brox, T. (2011). Object segmentation in video: A hierarchical variational approach for turning point trajectories into dense regions. In Computer Vision (ICCV), 2011 IEEE International Conference on (pp. 1583–1590). IEEE.
Oliva, A., & Torralba, A. (2001). Modeling the shape of the scene: A holistic representation of the spatial envelope. International Journal of Computer Vision, 42(3), 145–175.
Paris, S., & Durand, F. (2006). A fast approximation of the bilateral filter using a signal processing approach. In Computer Vision–ECCV 2006 (pp. 568–580). Springer.
Park, K., & Gould, S. (2012). On learning higher-order consistency potentials for multi-class pixel labeling. In ECCV.
Pock, T., Chambolle, A., Cremers, D., & Bischof, H. (2009). A convex relaxation approach for computing minimal partitions. In IEEE conference on Computer Vision and Pattern Recognition (CVPR).
Ren, X., Bo, L., & Fox, D. (2012). Rgb-(d) scene labeling: Features and algorithms. In Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on (pp. 2759–2766). IEEE.
Rose, K. (1998). Deterministic annealing for clustering, compression, classification, regression, and related optimization problems. Proceedings of the IEEE, 86(11), 2210–2239.
Rother, C., Kolmogorov, V., & Blake, A. (2004). Grabcut: Interactive foreground extraction using iterated graph cuts. In ACM Transactions on Graphics (SIGGRAPH).
Roth, V., Laub, J., Kawanabe, M., & Buhmann, J. (2003). Optimal cluster preserving embedding of nonmetric proximity data. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(12), 1540–1551.
Rousson, M., & Deriche, R. (2002). A variational framework for active and adaptative segmentation of vector valued images. In Workshop on motion and video computing.
Salah, M. B., Mitiche, A., & Ayed, I. B. (2010). Effective level set image segmentation with a kernel induced data term. IEEE Transactions on Image Processing, 19(1), 220–232.
Shawe-Tayler, J., & Cristianini, N. (2004). Kernel methods for pattern analysis. Cambridge: Cambridge University Press.
Shi, J., & Malik, J. (2000). Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22, 888–905.
Silberman, N., Hoiem, D., Kohli, P. & Fergus, R. (2012). Indoor segmentation and support inference from rgbd images. In ECCV.
Sung, K. K., & Poggio, T. (1995). Example based learning for viewbased human face detection. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 20, 39–51.
Tang, M., Ayed, I. B., & Boykov, Y. (2014). Pseudo-bound optimization for binary energies. In European Conference on Computer Vision (ECCV) (pp. 691–707).
Tang, M., Ayed, I. B., Marin, D., & Boykov, Y. (2015). Secrets of grabcut and kernel k-means. In International Conference on Computer Vision (ICCV). Santiago, Chile.
Tang, M., Gorelick, L., Veksler, O., & Boykov, Y. (2013). Grabcut in one cut. In International Conference on Computer Vision (ICCV). Sydney, Australia.
Tang, M., Marin, D., Ayed, I. B., & Boykov, Y. (2016). Kernel cuts: MRF meets kernel and spectral clustering. In arXiv:1506.07439
Tang, M., Marin, D., Ayed, I. B., & Boykov, Y. (2016). Normalized cut meets MRF. In European Conference on Computer Vision (ECCV). Amsterdam, Netherlands.
Vapnik, V. (1998). Statistical learning theory. Hoboken: Wiley.
Varma, M., & Zisserman, A. (2005). A statistical approach to texture classification from single images. International Journal of Computer Vision, 62(1–2), 61–81.
Veksler, O. (2018). Efficient graph cut optimisation for full CRFs with quantized edges. IEEE Transactions on Pattern Analysis and Machine Intelligence. arXiv:1809.04995.
Vicente, S., Kolmogorov, V., & Rother, C. (2009). Joint optimization of segmentation and appearance models. In International Conference on Computer Vision (ICCV).
Von Luxburg, U. (2007). A tutorial on spectral clustering. Statistics and Computing, 17(4), 395–416.
Wang, S., & Siskind, J. M. (2003). Image segmentation with ratio cut. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 25(6), 675–690.
Werner, T. (2007). A linear programming approach to max-sum problem: A review. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(7), 1165–1179.
Xu, L., Li, W., & Schuurmans, D. (2009). Fast normalized cut with linear constraints. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 2866–2873).
Yedidia, J. S., Freeman, W. T., & Weiss, Y. (2005). Constructing free-energy approximations and generalized belief propagation algorithms. IEEE Transactions on Information Theory, 51(7), 2282–2312.
Yu, S., & Shi, J. (2003). Multiclass spectral clustering. In International Conference on Computer Vision (ICCV).
Yu, Y., Fang, C., & Liao, Z. (2015). Piecewise flat embedding for image segmentation. In Proceedings of the IEEE International Conference on Computer Vision (pp. 1368–1376).
Yu, S. X., & Shi, J. (2004). Segmentation given partial grouping constraints. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 26(2), 173–183.
Zelnik-Manor, L., & Perona, P. (2004). Self-tuning spectral clustering. In Advances in NIPS (pp. 1601–1608).
Zhu, S. C., & Yuille, A. (1996). Region competition: Unifying snakes, region growing, and Bayes/MDL for multiband image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(9), 884–900.
Acknowledgements
We greatly thank Carl Olsson (Lund University, Sweden) for hours of stimulating discussions, as well as for detailed feedback and valuable recommendations at different stages of our work. We appreciate his tremendous patience when our thoughts were much more confusing than they might be now. Ivan Stelmakh (PhysTech, Russia) also gave helpful feedback on our draft and caught several errors. Anders Eriksson (Lund University, Sweden) helped with related work on NC with constraints. We also thank Jianbo Shi (UPenn, USA) for his excellent spectral-relaxation optimization code for NC.
Author information
Authors and Affiliations
Corresponding author
Additional information
Communicated by M. Hebert.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Electronic supplementary material
Below is the link to the electronic supplementary material.
Rights and permissions
About this article
Cite this article
Tang, M., Marin, D., Ben Ayed, I. et al. Kernel Cuts: Kernel and Spectral Clustering Meet Regularization. Int J Comput Vis 127, 477–511 (2019). https://doi.org/10.1007/s11263-018-1115-1
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11263-018-1115-1