Principal Direction Divisive Partitioning with Kernels and k-Means Steering

  • Dimitrios Zeimpekis
  • Efstratios Gallopoulos

Clustering is a fundamental task in data mining. We propose, implement, and evaluate several schemes that combine partitioning and hierarchical algorithms, specifically k-means and principal direction divisive partitioning (PDDP). Drawing on existing theory for the solution of the clustering indicator vector problem, we use 2-means to induce partitionings around fixed or varying cut-points. 2-means is applied either to the data or to its projection onto a one-dimensional subspace. These techniques are also extended to PDDP(l), a multiway clustering algorithm that generalizes PDDP. To handle data that are not linearly separable, we establish the algebraic framework for a kernel variant, KPDDP. Extensive experiments demonstrate the performance of these methods and suggest that it is advantageous to steer PDDP using k-means. They also show that KPDDP can provide results of higher quality than kernel k-means.
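
As a rough illustration of the kind of scheme the abstract describes, the following minimal sketch shows one bisection step: a plain PDDP split that cuts the projection onto the leading principal direction at the projected centroid, and a variant that lets 2-means on that one-dimensional projection move the cut-point. It is not the authors' implementation. The function names, the rows-as-samples layout, and the choice to start 2-means from the PDDP sign split are all assumptions made for this example.

```python
import numpy as np

def principal_projection(X):
    """Project the rows of X onto the leading principal direction.

    Assumption: rows are samples, columns are features.
    """
    Xc = X - X.mean(axis=0)  # center the data at its centroid
    # The first right singular vector of the centered matrix is the
    # leading principal direction.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[0]

def pddp_split(X):
    """Plain PDDP bisection: cut the projection at 0, the projected centroid."""
    return principal_projection(X) >= 0.0

def two_means_1d(p, iters=100):
    """Lloyd-style 2-means on the scalars p, started from the sign split.

    Assumption: this initialization is one simple way to let 2-means
    'steer' the cut-point away from 0; other starts are possible.
    """
    labels = p >= 0.0
    for _ in range(iters):
        if labels.all() or not labels.any():
            break  # degenerate split, nothing to refine
        c_lo, c_hi = p[~labels].mean(), p[labels].mean()
        new_labels = np.abs(p - c_hi) < np.abs(p - c_lo)
        if np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
    return labels

def pddp_2means_split(X):
    """PDDP bisection whose cut-point is chosen by 2-means on the projection."""
    return two_means_1d(principal_projection(X))
```

Both functions return a boolean mask over the samples. Because the sign of a singular vector is arbitrary, which side is labeled True is not meaningful; only the partition is. A full divisive algorithm would apply such a step repeatedly to a chosen cluster, which this sketch does not attempt.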

Keywords

Singular Value Decomposition, Singular Vector, Spectral Cluster, Kernel Learning, Kernel Version
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag London Limited 2008

Authors and Affiliations

  • Dimitrios Zeimpekis (1)
  • Efstratios Gallopoulos (1)

  1. Department of Computer Engineering and Informatics, University of Patras, Patras, Greece