
Optimizing the Cauchy-Schwarz PDF Distance for Information Theoretic, Non-parametric Clustering

  • Conference paper
Energy Minimization Methods in Computer Vision and Pattern Recognition (EMMCVPR 2005)

Abstract

This paper addresses the problem of efficient information theoretic, non-parametric data clustering. We develop a procedure for adapting the cluster memberships of the data patterns in order to maximize the recently proposed Cauchy-Schwarz (CS) probability density function (pdf) distance measure. Each pdf corresponds to a cluster. The CS distance is estimated analytically and non-parametrically by means of the Parzen window technique for density estimation. The resulting form of the cost function makes it possible to develop an efficient adaptation procedure based on constrained gradient descent, using stochastic approximation of the gradients. The computational complexity of the algorithm is O(MN), where N is the total number of data patterns and M is the number of data patterns used in the stochastic approximation. We show that the new algorithm is capable of performing well on several odd-shaped and irregular data sets.

This work was partially supported by NSF grants ECS-9900394 and EIA-0135946.
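The abstract's central quantity, the Cauchy-Schwarz pdf distance between two cluster densities, has the closed form D_CS(p1, p2) = -log[(∫p1 p2)² / (∫p1² ∫p2²)], and each integral becomes a double sum over Gaussian kernels when the densities are Parzen estimates with Gaussian windows. The sketch below illustrates that analytic estimator only; it is not the authors' full membership-adaptation algorithm, and the function names and the kernel width `sigma` are illustrative assumptions.

```python
import numpy as np

def gaussian_cross_sum(X, Y, sigma):
    """Sum of Gaussian kernels G(x - y; 2*sigma^2 I) over all pairs (x, y).

    With Parzen windows of width sigma, the convolution of two Gaussian
    windows is a Gaussian of variance 2*sigma^2, which is what makes the
    integral of the product of two Parzen estimates analytic.
    """
    d = X.shape[1]
    diff = X[:, None, :] - Y[None, :, :]        # shape (n_x, n_y, d)
    sq = np.sum(diff ** 2, axis=-1)             # pairwise squared distances
    var = 2.0 * sigma ** 2
    norm = (2.0 * np.pi * var) ** (-d / 2.0)
    return norm * np.sum(np.exp(-sq / (2.0 * var)))

def cs_divergence(X1, X2, sigma):
    """Estimated Cauchy-Schwarz divergence between the Parzen pdfs of
    two clusters X1 and X2: -log of (cross term)^2 / (self1 * self2)."""
    n1, n2 = len(X1), len(X2)
    cross = gaussian_cross_sum(X1, X2, sigma) / (n1 * n2)   # estimates ∫ p1 p2
    self1 = gaussian_cross_sum(X1, X1, sigma) / (n1 * n1)   # estimates ∫ p1^2
    self2 = gaussian_cross_sum(X2, X2, sigma) / (n2 * n2)   # estimates ∫ p2^2
    return -np.log(cross ** 2 / (self1 * self2))
```

By the Cauchy-Schwarz inequality the estimate is non-negative, and it grows as the two clusters' densities overlap less; the paper's clustering procedure adapts memberships so as to maximize this quantity, using only M of the N points per gradient step to reach the stated O(MN) cost.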




Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Jenssen, R., Erdogmus, D., Hild, K.E., Principe, J.C., Eltoft, T. (2005). Optimizing the Cauchy-Schwarz PDF Distance for Information Theoretic, Non-parametric Clustering. In: Rangarajan, A., Vemuri, B., Yuille, A.L. (eds) Energy Minimization Methods in Computer Vision and Pattern Recognition. EMMCVPR 2005. Lecture Notes in Computer Science, vol 3757. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11585978_3


  • DOI: https://doi.org/10.1007/11585978_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-30287-2

  • Online ISBN: 978-3-540-32098-2

  • eBook Packages: Computer Science (R0)
