Clustering with ITL Principles

  • Robert Jenssen
  • Sudhir Rao
Part of the Information Science and Statistics book series (ISS)


Learning and adaptation deal with quantifying and exploiting the structure of the input source, as pointed out perhaps for the first time by Watanabe [330]. Although structure is a vague concept that is difficult to quantify, it fills the space with identifiable patterns that may be distinguished macroscopically by the shape of the probability density function (PDF). Entropy and dissimilarity therefore form natural foundations for unsupervised learning, because both are descriptors of PDFs.


Keywords: Spectral Cluster, Kernel Size, Information Force, Proximity Graph, Iris Dataset


  1. Aczél J., Daróczy Z., On measures of information and their characterizations, Mathematics in Science and Engineering, vol. 115, Academic Press, New York, 1975.
  13. Bach F., Jordan M., Finding clusters in independent component analysis, in Int. Symposium on Independent Component Analysis and Blind Signal Separation, Nara, Japan, pp. 891–896, 2003.
  30. Ben-Hur A., Horn D., Siegelmann H., Vapnik V., Support vector clustering, J. Mach. Learn. Res., 2:125–137, 2001.
  49. Carreira-Perpinan M., Mode-finding for mixtures of Gaussian distributions, IEEE Trans. Pattern Anal. Mach. Intell., 22(11):1318–1323, November 2000.
  50. Carreira-Perpinan M., Gaussian mean shift is an EM algorithm, IEEE Trans. Pattern Anal. Mach. Intell., 29(5):767–776, 2007.
  55. Cheng Y., Mean shift, mode seeking and clustering, IEEE Trans. Pattern Anal. Mach. Intell., 17(8):790–799, August 1995.
  60. Comaniciu D., Ramesh V., Meer P., Real-time tracking of nonrigid objects using mean shift, in Proceedings of IEEE Conf. Comput. Vision and Pattern Recogn., 2:142–149, June 2000.
  61. Comaniciu D., Meer P., Mean shift: A robust approach toward feature space analysis, IEEE Trans. Pattern Anal. Mach. Intell., 24(5):603–619, May 2002.
  75. Ding H., He X., Zha H., Gu M., Simon H., A min-max cut algorithm for graph partitioning and data clustering, in Proc. IEEE Int. Conf. Data Mining, pp. 107–114, San Jose, CA, November 29–December 2, 2001.
  80. Duda R., Hart P., Stork D., Pattern Classification and Scene Analysis, John Wiley & Sons, New York, 2nd edition, 2001.
  100. Fine S., Scheinberg K., Cristianini N., Shawe-Taylor J., Williamson B., Efficient SVM training using low-rank kernel representations, J. Mach. Learn. Res., 2:243–264, 2001.
  105. Friedman J., Tukey J., A projection pursuit algorithm for exploratory data analysis, IEEE Trans. Comput., Ser. C, 23:881–889, 1974.
  108. Fukunaga K., An Introduction to Statistical Pattern Recognition, Academic Press, New York, 1972.
  109. Fukunaga K., Hostetler L., The estimation of the gradient of a density function with applications in pattern recognition, IEEE Trans. Inf. Theor., 21(1):32–40, January 1975.
  111. Gdalyahu Y., Weinshall D., Werman M., Self-organization in vision: Stochastic clustering for image segmentation, perceptual grouping, and image database organization, IEEE Trans. Pattern Anal. Mach. Intell., 23(10):1053–1074, 2001.
  114. Gokcay E., Principe J., Information theoretic clustering, IEEE Trans. Pattern Anal. Mach. Intell., 24(2):158–171, 2002.
  115. Grossberg S., Competitive learning: From interactive activation to adaptive resonance, in Connectionist Models and Their Implications: Readings from Cognitive Science (Waltz D. and Feldman J. A., Eds.), Ablex, Norwood, NJ, pp. 243–283, 1988.
  136. Hartigan J., Clustering Algorithms, John Wiley & Sons, New York, 1975.
  153. Hofmann T., Buhmann J., Pairwise data clustering by deterministic annealing, IEEE Trans. Pattern Anal. Mach. Intell., 19(1):1–14, 1997.
  160. Jain K., Dubes R., Algorithms for Clustering Data, Prentice-Hall, Englewood Cliffs, NJ, 1988.
  162. Jenssen R., Principe J., Erdogmus D., Eltoft T., The Cauchy–Schwarz divergence and Parzen windowing: Connections to graph theory and Mercer kernels, J. Franklin Inst., 343:614–629, 2004.
  163. Jenssen R., Erdogmus D., Hild II K., Principe J., Eltoft T., Optimizing the Cauchy–Schwarz PDF divergence for information theoretic, non-parametric clustering, in Proc. Int'l. Workshop on Energy Minimization Methods in Computer Vision and Pattern Recognition (EMMCVPR2005), pp. 34–45, St. Augustine, FL, November 2005.
  165. Jenssen R., Erdogmus D., Hild II K., Principe J., Eltoft T., Information cut for clustering using a gradient descent approach, Pattern Recogn., 40:796–806, 2006.
  182. King S., Step-wise clustering procedures, J. Amer. Statist. Assoc., pp. 86–101, 1967.
  183. Kohonen T., Self-Organizing Maps, 2nd edition, Springer Verlag, New York, 1997.
  186. Koontz W., Narendra P., Fukunaga K., A graph theoretic approach to non-parametric cluster analysis, IEEE Trans. Comput., 25:936–944, 1975.
  208. MacQueen J., Some methods for classification and analysis of multivariate observations, in Fifth Berkeley Symposium on Mathematical Statistics and Probability, 1967, pp. 281–297.
  227. Murphy P., Aha D., UCI repository of machine learning databases, Tech. Rep., Department of Computational Science, University of California, Irvine, California, USA, 1994.
  230. Ng Y., Jordan M., Weiss Y., On spectral clustering: Analysis and an algorithm, in Advances in Neural Information Processing Systems, 14, 2001, vol. 2, pp. 849–856.
  243. Pavlidis T., Structural Pattern Recognition, Springer-Verlag, New York, 1977.
  253. Principe J., Euliano N., Lefebvre C., Neural Systems: Fundamentals through Simulations, CD-ROM textbook, John Wiley, New York, 2000.
  256. Ramanan D., Forsyth D., Finding and tracking people from the bottom up, in Proc. IEEE Conf. Computer Vision Pattern Recognition, June 2003, pp. 467–474.
  260. Rao S., Martins A., Principe J., Mean shift: An information theoretic perspective, Pattern Recogn. Lett., 30(1, 3):222–230, 2009.
  269. Roberts S., Everson R., Rezek I., Maximum certainty data partitioning, Pattern Recogn., 33:833–839, 2000.
  271. Rose K., Gurewitz E., Fox G., Vector quantization by deterministic annealing, IEEE Trans. Inf. Theor., 38(4):1249–1257, 1992.
  280. Sands N., Cioffi J., An improved detector for channels with nonlinear intersymbol interference, Proc. Intl. Conf. on Communications, vol. 2, pp. 1226–1230, 1994.
  286. Scanlon J., Deo N., Graph-theoretic algorithms for image segmentation, in IEEE International Symposium on Circuits and Systems, pp. VI141–144, Orlando, Florida, 1999.
  295. Sheather S., Jones M., A reliable data-based bandwidth selection method for kernel density estimation, J. Roy. Statist. Soc., Ser. B, 53:683–690, 1991.
  296. Shi J., Malik J., Normalized cuts and image segmentation, IEEE Trans. Pattern Anal. Mach. Intell., 22(8):888–905, 2000.
  306. Sneath P., Sokal R., Numerical Taxonomy, Freeman, London, 1973.
  315. Theodoridis S., Koutroumbas K., Pattern Recognition, Academic Press, 1999.
  317. Tishby N., Slonim N., Data clustering by Markovian relaxation and the information bottleneck method, in Advances in Neural Information Processing Systems, 13, Denver, pp. 640–646, 2000.
  322. Urquhart R., Graph theoretical clustering based on limited neighbor sets, Pattern Recogn., 173–187, 1982.
  330. Watanabe S., Pattern Recognition: Human and Mechanical, Wiley, New York, 1985.
  336. Wu Z., Leahy R., An optimal graph theoretic approach to data clustering: Theory and its applications to image segmentation, IEEE Trans. Pattern Anal. Mach. Intell., 15(11):1101–1113, 1993.
  347. Zahn T., Graph theoretic methods for detecting and describing gestalt clusters, IEEE Trans. Comput., 20:68–86, 1971.

Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  • Robert Jenssen (1)
  • Sudhir Rao (1)
  1. Dept. Electrical Engineering & Biomedical Engineering, University of Florida, Gainesville, USA
