Foundations of a Multi-way Spectral Clustering Framework for Hybrid Linear Modeling
The problem of Hybrid Linear Modeling (HLM) is to model and segment data using a mixture of affine subspaces. Different strategies have been proposed to solve this problem; however, rigorous analysis justifying their performance is missing. This paper proposes the Theoretical Spectral Curvature Clustering (TSCC) algorithm for solving the HLM problem and provides careful analysis to justify it. The TSCC algorithm is essentially a combination of Govindu’s multi-way spectral clustering framework (CVPR 2005) and Ng et al.’s spectral clustering algorithm (NIPS 2001). The main result of this paper states that if the given data is sampled from a mixture of distributions concentrated around affine subspaces, then with high sampling probability the TSCC algorithm segments the different underlying clusters well. The quality of the clustering depends on the within-cluster errors, the between-clusters interaction, and a tuning parameter applied by TSCC. The proof also provides new insights for the analysis of Ng et al. (NIPS 2001).
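The combination described in the abstract can be illustrated with a small sketch. It is not the TSCC algorithm as analyzed in the paper: it assumes lines in the plane (d = 1), and it replaces the polar curvature with a simpler stand-in, the smallest height of the triangle formed by three points. The function names, the Gaussian kernel, and the parameter `sigma` (standing in for the tuning parameter) are all assumptions made for illustration. The overall pattern is a multi-way affinity tensor, unfolded into a matrix, followed by the normalization, eigenvector embedding, and k-means steps of Ng et al.

```python
import itertools
import numpy as np


def triangle_height(p, q, r):
    """Smallest height of triangle pqr; zero iff the three points are collinear.

    Hypothetical stand-in for the paper's polar curvature (assumption).
    """
    longest = max(np.linalg.norm(p - q), np.linalg.norm(q - r), np.linalg.norm(r - p))
    u, v = q - p, r - p
    twice_area = abs(u[0] * v[1] - u[1] * v[0])
    return twice_area / longest


def affinity_tensor(X, sigma):
    """3-way affinities exp(-c^2 / (2 sigma^2)); entries with repeated indices stay 0."""
    n = len(X)
    A = np.zeros((n, n, n))
    for i, j, k in itertools.permutations(range(n), 3):
        c = triangle_height(X[i], X[j], X[k])
        A[i, j, k] = np.exp(-c ** 2 / (2 * sigma ** 2))
    return A


def spectral_embed(W, K):
    """Normalize W by its row sums, take the top-K eigenvectors, normalize rows."""
    d = W.sum(axis=1)
    Z = W / np.sqrt(np.outer(d, d))      # D^{-1/2} W D^{-1/2}
    _, vecs = np.linalg.eigh(Z)          # eigenvalues in ascending order
    U = vecs[:, -K:]                     # eigenvectors of the K largest eigenvalues
    return U / np.linalg.norm(U, axis=1, keepdims=True)


def kmeans(T, K, iters=50):
    """Plain Lloyd iterations with a deterministic, mutually-orthogonal initialization."""
    centers = [T[0]]
    for _ in range(K - 1):
        overlap = np.abs(T @ np.array(centers).T).max(axis=1)
        centers.append(T[np.argmin(overlap)])
    centers = np.array(centers)
    for _ in range(iters):
        dists = ((T[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for k in range(K):
            if np.any(labels == k):      # keep the old center if a cluster empties
                centers[k] = T[labels == k].mean(axis=0)
    return labels


def multiway_spectral_clustering(X, K, sigma):
    n = len(X)
    A = affinity_tensor(X, sigma).reshape(n, n * n)  # unfold tensor along first mode
    W = A @ A.T                                      # pairwise weights from the unfolding
    return kmeans(spectral_embed(W, K), K)


# Toy data: eight points on each of two parallel lines in the plane.
t = np.linspace(-1.0, 1.0, 8)
X = np.vstack([np.column_stack([t, np.zeros(8)]),
               np.column_stack([t, np.full(8, 3.0)])])
labels = multiway_spectral_clustering(X, K=2, sigma=0.05)
```

On this noise-free toy data the two lines should end up with distinct labels. The full tensor has N^3 entries, so the sketch is only practical for very small N.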
Keywords: Hybrid linear modeling, d-flats clustering, Multi-way clustering, Spectral clustering, Polar curvature, Perturbation analysis, Concentration inequalities
Mathematics Subject Classification (2000): 68Q32, 68T10, 62H30, 68W40, 60D05, 15A42
- 1. S. Agarwal, K. Branson, S. Belongie, Higher order learning with graphs, in Proceedings of the 23rd International Conference on Machine learning, vol. 148 (2006), pp. 17–24.
- 2. S. Agarwal, J. Lim, L. Zelnik-Manor, P. Perona, D. Kriegman, S. Belongie, Beyond pairwise clustering, in Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), vol. 2 (2005), pp. 838–845.
- 3. E. Arias-Castro, D. Donoho, X. Huo, Near-optimal detection of geometric objects by fast multiscale methods, IEEE Trans. Inf. Theory 51(7) (2005).
- 4. B. Bader, T. Kolda, Algorithm 862: MATLAB tensor classes for fast algorithm prototyping. ACM Trans. Math. Softw. 32(4), 635–653 (2006). http://www.citeulike.org/user/bamberg/article/2875626
- 6. M. Brand, K. Huang, A unifying theorem for spectral embedding and clustering, in Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics, January 2003.
- 11. V. Govindu, A tensor decomposition for geometric grouping and segmentation, in CVPR, vol. 1, June 2005, pp. 1150–1157.
- 12. P. Gruber, F. Theis, Grassmann clustering, in Proc. EUSIPCO 2006, Florence, Italy, 2006.
- 14. J. Ho, M. Yang, J. Lim, K. Lee, D. Kriegman, Clustering appearances of objects under varying illumination conditions, in Proceedings of International Conference on Computer Vision and Pattern Recognition, vol. 1 (2003), pp. 11–18.
- 16. A. Kambhatla, T. Leen, Fast non-linear dimension reduction, in Advances in Neural Information Processing Systems 6 (1994), pp. 152–159.
- 17. K. Kanatani, Motion segmentation by subspace separation and model selection, in Proc. of 8th ICCV, vol. 3, Vancouver, Canada (2001), pp. 586–591.
- 18. K. Kanatani, Evaluation and selection of models for motion segmentation, in 7th ECCV, vol. 3, May 2002, pp. 335–349.
- 22. G. Lerman, J.T. Whitehouse, High-dimensional Menger-type curvatures - part I: Geometric multipoles and multiscale inequalities (2008, submitted). Available from http://arxiv.org/abs/0805.1425v1.
- 23. G. Lerman, J.T. Whitehouse, High-dimensional Menger-type curvatures - part II: d-separation and a menagerie of curvatures. Constr. Approx. (2009, accepted). Available from http://arxiv.org/abs/0809.0137v1.
- 24. G. Lerman, J.T. Whitehouse, Least squares for probability measures via multi-way curvatures (2009, in preparation).
- 27. J. MacQueen, Some methods for classification and analysis of multivariate observations, in Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, vol. 1 (University of California Press, Berkeley, 1967), pp. 281–297.
- 28. J. Mairal, F. Bach, J. Ponce, G. Sapiro, A. Zisserman, Discriminative learned dictionaries for local image analysis, in Proc. CVPR, Alaska, June 2008.
- 29. C. McDiarmid, On the method of bounded differences, in Surveys in Combinatorics (Cambridge University Press, Cambridge, 1989), pp. 148–188.
- 31. A. Ng, M. Jordan, Y. Weiss, On spectral clustering: Analysis and an algorithm, in Advances in Neural Information Processing Systems 14 (2001), pp. 849–856.
- 32. A. Shashua, R. Zass, T. Hazan, Multi-way clustering using super-symmetric non-negative tensor factorization, in ECCV06, vol. IV (2006), pp. 595–608.
- 34. R. Souvenir, R. Pless, Manifold clustering, in The 10th International Conference on Computer Vision (ICCV 2005), 2005.
- 35. A. Szlam, Modifications of k q-flats for supervised learning (2008).
- 39. R. Vidal, Y. Ma, S. Sastry, Generalized principal component analysis (GPCA), IEEE Trans. Pattern Anal. Mach. Intell. 27(12) (2005).
- 41. J. Yan, M. Pollefeys, A general framework for motion segmentation: Independent, articulated, rigid, non-rigid, degenerate and nondegenerate, in ECCV, vol. 4 (2006), pp. 94–106.
- 42. A.Y. Yang, S.R. Rao, Y. Ma, Robust statistical estimation and segmentation of multiple subspaces, in Computer Vision and Pattern Recognition Workshop, June 2006.
- 43. L. Zwald, G. Blanchard, On the convergence of eigenspaces in kernel principal components analysis, in Advances in Neural Information Processing Systems 18 (2005), pp. 1649–1656.