Abstract
Hybrid Linear Modeling (HLM) is the problem of modeling and segmenting data using a mixture of affine subspaces. Different strategies have been proposed to solve this problem; however, rigorous analysis justifying their performance is missing. This paper suggests the Theoretical Spectral Curvature Clustering (TSCC) algorithm for solving the HLM problem and provides careful analysis to justify it. The TSCC algorithm is practically a combination of Govindu’s multi-way spectral clustering framework (CVPR 2005) and Ng et al.’s spectral clustering algorithm (NIPS 2001). The main result of this paper states that if the given data is sampled from a mixture of distributions concentrated around affine subspaces, then with high sampling probability the TSCC algorithm segments the different underlying clusters well. The goodness of clustering depends on the within-cluster errors, the between-clusters interaction, and a tuning parameter applied by TSCC. The proof also provides new insights for the analysis of Ng et al. (NIPS 2001).
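The combination described in the abstract can be illustrated with a minimal sketch; it is not the TSCC algorithm as analyzed in the paper. It assumes lines (d = 1) and substitutes a simple stand-in affinity, a Gaussian of the least-squares line-fit error of each point triple, for the paper's polar curvature. The function names, the parameter `sigma`, and the small k-means routine are all hypothetical choices made for illustration. The multi-way affinity tensor is unfolded into a matrix A, pairwise weights are formed as W = A Aᵀ in the spirit of Govindu's framework, and the normalized spectral embedding of Ng et al. is then clustered.

```python
import numpy as np

def line_fit_affinity(X, sigma):
    """3-way affinity tensor: Gaussian of the squared residual of the
    best-fit line through each triple (a stand-in, NOT polar curvature)."""
    n = X.shape[0]
    i, j, k = np.meshgrid(np.arange(n), np.arange(n), np.arange(n), indexing="ij")
    triples = np.stack([X[i], X[j], X[k]], axis=-2)        # shape (n, n, n, 3, D)
    centered = triples - triples.mean(axis=-2, keepdims=True)
    s = np.linalg.svd(centered, compute_uv=False)          # batched singular values
    err2 = (s[..., 1:] ** 2).sum(axis=-1)                  # energy off the best line
    T = np.exp(-err2 / (2.0 * sigma ** 2))
    T[(i == j) | (j == k) | (i == k)] = 0.0                # ignore degenerate triples
    return T

def kmeans(U, K, iters=50):
    """Plain Lloyd iterations with farthest-point initialization."""
    centers = [U[0]]
    for _ in range(1, K):
        d2 = np.min([((U - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(d2)])
    centers = np.array(centers)
    for _ in range(iters):
        dist = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(dist, axis=1)
        for c in range(K):
            if np.any(labels == c):
                centers[c] = U[labels == c].mean(axis=0)
    return labels

def multiway_spectral_clustering(X, K, sigma):
    n = X.shape[0]
    A = line_fit_affinity(X, sigma).reshape(n, n * n)      # unfold the tensor
    W = A @ A.T                                            # pairwise weights
    deg = W.sum(axis=1)
    Z = W / np.sqrt(np.outer(deg, deg))                    # D^{-1/2} W D^{-1/2}
    _, vecs = np.linalg.eigh(Z)                            # eigenvalues ascending
    U = vecs[:, -K:]                                       # top-K eigenvectors
    U = U / np.linalg.norm(U, axis=1, keepdims=True)       # row normalization
    return kmeans(U, K)

# Toy data: 20 points near each of two parallel lines in the plane.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 20)
line1 = np.column_stack([t, np.zeros_like(t)])
line2 = np.column_stack([t, np.ones_like(t)])
X = np.vstack([line1, line2]) + 0.002 * rng.standard_normal((40, 2))
labels = multiway_spectral_clustering(X, K=2, sigma=0.02)
```

The full tensor has n³ entries, so this sketch is only practical for small n. The value of `sigma` plays the role of a tuning parameter, and the quality of the segmentation is sensitive to it.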
References
S. Agarwal, K. Branson, S. Belongie, Higher order learning with graphs, in Proceedings of the 23rd International Conference on Machine Learning, vol. 148 (2006), pp. 17–24.
S. Agarwal, J. Lim, L. Zelnik-Manor, P. Perona, D. Kriegman, S. Belongie, Beyond pairwise clustering, in Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), vol. 2 (2005), pp. 838–845.
E. Arias-Castro, D. Donoho, X. Huo, Near-optimal detection of geometric objects by fast multiscale methods, IEEE Trans. Inf. Theory 51(7) (2005).
B. Bader, T. Kolda, Algorithm 862: MATLAB tensor classes for fast algorithm prototyping. ACM Trans. Math. Softw. 32(4), 635–653 (2006). http://www.citeulike.org/user/bamberg/article/2875626
P. Bradley, O. Mangasarian, k-plane clustering, J. Glob. Optim. 16(1), 23–32 (2000).
M. Brand, K. Huang, A unifying theorem for spectral embedding and clustering, in Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics, January 2003.
G. Chen, G. Lerman, Spectral curvature clustering (SCC), Int. J. Comput. Vis. 81(3), 317–330 (2009).
J. Costeira, T. Kanade, A multibody factorization method for independently moving objects, Int. J. Comput. Vis. 29(3), 159–179 (1998).
M. Fischler, R. Bolles, Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography, Commun. ACM 24(6), 381–395 (1981).
G. Golub, C. Van Loan, Matrix Computations (Johns Hopkins University Press, Baltimore, 1996).
V. Govindu, A tensor decomposition for geometric grouping and segmentation, in CVPR, vol. 1, June 2005, pp. 1150–1157.
P. Gruber, F. Theis, Grassmann clustering, in Proc. EUSIPCO 2006, Florence, Italy, 2006.
G. Haro, G. Randall, G. Sapiro, Translated Poisson mixture model for stratification learning, Int. J. Comput. Vis. 80(3), 358–374 (2008).
J. Ho, M. Yang, J. Lim, K. Lee, D. Kriegman, Clustering appearances of objects under varying illumination conditions, in Proceedings of International Conference on Computer Vision and Pattern Recognition, vol. 1 (2003), pp. 11–18.
A. Hyvärinen, E. Oja, Independent component analysis: algorithms and applications, Neural Netw. 13(4–5), 411–430 (2000).
A. Kambhatla, T. Leen, Fast non-linear dimension reduction, in Advances in Neural Information Processing Systems 6 (1994), pp. 152–159.
K. Kanatani, Motion segmentation by subspace separation and model selection, in Proc. of 8th ICCV, vol. 3, Vancouver, Canada (2001), pp. 586–591.
K. Kanatani, Evaluation and selection of models for motion segmentation, in 7th ECCV, vol. 3, May 2002, pp. 335–349.
D. Kushnir, M. Galun, A. Brandt, Fast multiscale clustering and manifold identification, Pattern Recognit. 39(10), 1876–1891 (2006).
L. De Lathauwer, B. De Moor, J. Vandewalle, A multilinear singular value decomposition, SIAM J. Matrix Anal. Appl. 21(4), 1253–1278 (2000).
G. Lerman, J.T. Whitehouse, On d-dimensional d-semimetrics and simplex-type inequalities for high-dimensional sine functions, J. Approx. Theory 156(1), 52–81 (2009). http://portal.acm.org/citation.scfm?id=1498013.
G. Lerman, J.T. Whitehouse, High-dimensional Menger-type curvatures—part I: Geometric multipoles and multiscale inequalities (2008, submitted). Available from http://arxiv.org/abs/0805.1425v1.
G. Lerman, J.T. Whitehouse, High-dimensional Menger-type curvatures—part II: d-separation and a menagerie of curvatures. Constr. Approx. (2009, accepted). Available from http://arxiv.org/abs/0809.0137v1.
G. Lerman, J.T. Whitehouse, Least squares for probability measures via multi-way curvatures (2009, in preparation).
Y. Ma, H. Derksen, W. Hong, J. Wright, Segmentation of multivariate mixed data via lossy coding and compression, IEEE Trans. Pattern Anal. Mach. Intell. 29(9), 1546–1562 (2007).
Y. Ma, A.Y. Yang, H. Derksen, R. Fossum, Estimation of subspace arrangements with applications in modeling and segmenting mixed data, SIAM Rev. 50(3), 413–458 (2008).
J. MacQueen, Some methods for classification and analysis of multivariate observations, in Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, vol. 1 (University of California Press, Berkeley, 1967), pp. 281–297.
J. Mairal, F. Bach, J. Ponce, G. Sapiro, A. Zisserman, Discriminative learned dictionaries for local image analysis, in Proc. CVPR, Alaska, June 2008.
C. McDiarmid, On the method of bounded differences, in Surveys in Combinatorics (Cambridge University Press, Cambridge, 1989), pp. 148–188.
G. Medioni, M.-S. Lee, C.-K. Tang, A Computational Framework for Segmentation and Grouping (Elsevier, Amsterdam, 2000).
A. Ng, M. Jordan, Y. Weiss, On spectral clustering: Analysis and an algorithm, in Advances in Neural Information Processing Systems 14 (2001), pp. 849–856.
A. Shashua, R. Zass, T. Hazan, Multi-way clustering using super-symmetric non-negative tensor factorization, in ECCV06, vol. IV (2006), pp. 595–608.
J. Shi, J. Malik, Normalized cuts and image segmentation, IEEE Trans. Pattern Anal. Mach. Intell. 22(8), 888–905 (2000).
R. Souvenir, R. Pless, Manifold clustering, in The 10th International Conference on Computer Vision (ICCV 2005), 2005.
A. Szlam, Modifications of k q-flats for supervised learning (2008).
M. Tipping, C. Bishop, Mixtures of probabilistic principal component analysers, Neural Comput. 11(2), 443–482 (1999).
P.H.S. Torr, Geometric motion segmentation and model selection, Philos. Trans. R. Soc. Lond. A 356, 1321–1340 (1998).
P. Tseng, Nearest q-flat to m points, J. Optim. Theory Appl. 105(1), 249–252 (2000).
R. Vidal, Y. Ma, S. Sastry, Generalized principal component analysis (GPCA), IEEE Trans. Pattern Anal. Mach. Intell. 27(12) (2005).
U. von Luxburg, M. Belkin, O. Bousquet, Consistency of spectral clustering, Ann. Stat. 36(2), 555–586 (2008).
J. Yan, M. Pollefeys, A general framework for motion segmentation: Independent, articulated, rigid, non-rigid, degenerate and nondegenerate, in ECCV, vol. 4 (2006), pp. 94–106.
A.Y. Yang, S.R. Rao, Y. Ma, Robust statistical estimation and segmentation of multiple subspaces, in Computer Vision and Pattern Recognition Workshop, June 2006.
L. Zwald, G. Blanchard, On the convergence of eigenspaces in kernel principal components analysis, in Advances in Neural Information Processing Systems 18 (2005), pp. 1649–1656.
Additional information
Communicated by Albert Cohen.
This work was supported by NSF grant #0612608.
About this article
Cite this article
Chen, G., Lerman, G. Foundations of a Multi-way Spectral Clustering Framework for Hybrid Linear Modeling. Found Comput Math 9, 517–558 (2009). https://doi.org/10.1007/s10208-009-9043-7
DOI: https://doi.org/10.1007/s10208-009-9043-7
Keywords
- Hybrid linear modeling
- d-flats clustering
- Multi-way clustering
- Spectral clustering
- Polar curvature
- Perturbation analysis
- Concentration inequalities