Universal Clustering with Regularization in Probabilistic Space

Nikulin, Vladimir; Smola, Alex J.

doi:10.1007/11510888_15

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3587)

Included in the following conference series:

International Workshop on Machine Learning and Data Mining in Pattern Recognition

Abstract

We propose universal clustering in line with the concepts of universal estimation. To illustrate this model, we introduce a family of power loss functions in probabilistic space that is marginally linked to the Kullback-Leibler divergence. The model proved effective when applied to synthetic data. We also consider a large web-traffic dataset; the aim of that experiment is to explain and understand the way people interact with web sites.

The paper proposes a special regularization to ensure consistency of the corresponding clustering model.

This work was supported by the grants of the Australian Research Council. National ICT Australia is funded through the Australian Government initiative.

Author information

Authors and Affiliations

Computer Science Laboratory, Australian National University, Canberra, ACT 0200, Australia
Vladimir Nikulin
NICTA, Canberra, ACT 0200, Australia
Alex J. Smola

Editor information

Editors and Affiliations

Institute of Computer Vision and applied Computer Sciences, IBaI, Germany
Petra Perner
Institute of Media and Information Technology, Chiba University, Japan
Atsushi Imiya

Cite this paper

Nikulin, V., Smola, A.J. (2005). Universal Clustering with Regularization in Probabilistic Space. In: Perner, P., Imiya, A. (eds) Machine Learning and Data Mining in Pattern Recognition. MLDM 2005. Lecture Notes in Computer Science, vol 3587. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11510888_15

DOI: https://doi.org/10.1007/11510888_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26923-6
Online ISBN: 978-3-540-31891-0
eBook Packages: Computer Science, Computer Science (R0)
