Clustering Based on Principal Curve

  • Ioan Cleju
  • Pasi Fränti
  • Xiaolin Wu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3540)


Clustering algorithms are widely used in image analysis for compression, segmentation, recognition, and other tasks. In this work we present a new approach to clustering vector datasets: we first find a good ordering of the data points and then apply an optimal segmentation algorithm to the ordered sequence. The algorithm heuristically extends the optimal scalar quantization technique to vector space. The dataset is sequenced using one-dimensional projection spaces. We show that the principal axis is too rigid to preserve the adjacency of the points. We present a way to refine the order using the minimum-weight Hamiltonian path in the data graph. We then propose using the principal curve to better model the non-linearity of the data and to find a good sequence. The experimental results show that principal-curve-based clustering can be used successfully in cluster analysis.
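The order-then-segment idea can be illustrated with a minimal sketch. It covers only the principal-axis baseline that the abstract describes as too rigid, not the Hamiltonian-path refinement or the principal curve method. The function names (`principal_axis_order`, `segment_sequence`, `cluster_by_ordering`) are hypothetical, and the segmentation step is a plain O(kn²) dynamic program, assumed here for clarity rather than taken from the paper.

```python
import numpy as np

def principal_axis_order(X):
    """Indices that sort the points by their projection on the first principal axis."""
    Xc = X - X.mean(axis=0)
    # The first right-singular vector of the centered data is the principal axis.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return np.argsort(Xc @ vt[0], kind="stable")

def segment_sequence(X_sorted, k):
    """Split an ordered sequence into k contiguous clusters with minimal
    total squared error, using a simple dynamic program (requires n >= k)."""
    n, d = X_sorted.shape
    # Prefix sums give the squared error of any contiguous run in O(d) time.
    s1 = np.vstack([np.zeros(d), np.cumsum(X_sorted, axis=0)])
    s2 = np.concatenate([[0.0], np.cumsum((X_sorted ** 2).sum(axis=1))])

    def cost(i, j):
        # Squared error of points i..j-1 around their own mean.
        total = s1[j] - s1[i]
        return s2[j] - s2[i] - total @ total / (j - i)

    dp = np.full((k + 1, n + 1), np.inf)   # dp[m, j]: best error for first j points in m clusters
    back = np.zeros((k + 1, n + 1), dtype=int)
    dp[0, 0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                c = dp[m - 1, i] + cost(i, j)
                if c < dp[m, j]:
                    dp[m, j] = c
                    back[m, j] = i

    # Walk the back-pointers to recover the cluster boundaries.
    bounds = [n]
    for m in range(k, 0, -1):
        bounds.append(int(back[m, bounds[-1]]))
    return float(dp[k, n]), bounds[::-1]

def cluster_by_ordering(X, k):
    """Order the points along the principal axis, then segment the sequence."""
    order = principal_axis_order(X)
    err, bounds = segment_sequence(X[order], k)
    labels = np.empty(len(X), dtype=int)
    for c in range(k):
        labels[order[bounds[c]:bounds[c + 1]]] = c
    return labels, err
```

Calling `cluster_by_ordering(X, k)` on an `(n, d)` array returns one label per point and the total squared error. Replacing `principal_axis_order` with an ordering derived from a fitted principal curve would leave the segmentation step unchanged, which is the separation of concerns the abstract relies on.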


Keywords (added by machine, not by the authors): Cluster Algorithm, Mean Square Error, Principal Axis, Principal Curve, Cluster Problem



Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Ioan Cleju (1)
  • Pasi Fränti (2)
  • Xiaolin Wu (3)
  1. Department of Computer and Information Science, University of Konstanz, Konstanz, Germany
  2. Department of Computer Science, University of Joensuu, Joensuu, Finland
  3. Department of Electrical and Computer Engineering, McMaster University, Hamilton, Canada
