Universal Clustering with Regularization in Probabilistic Space

  • Conference paper
Machine Learning and Data Mining in Pattern Recognition (MLDM 2005)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3587)

Abstract

We propose universal clustering in line with the concepts of universal estimation. To illustrate this model, we introduce a family of power loss functions in probabilistic space that is marginally linked to the Kullback-Leibler divergence. The model proved effective when applied to synthetic data. We also consider a large web-traffic dataset, where the aim of the experiment is to explain and understand the way people interact with web sites.

The paper proposes a special regularization to ensure consistency of the corresponding clustering model.

This work was supported by grants from the Australian Research Council. National ICT Australia is funded through the Australian Government initiative.


Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Nikulin, V., Smola, A.J. (2005). Universal Clustering with Regularization in Probabilistic Space. In: Perner, P., Imiya, A. (eds) Machine Learning and Data Mining in Pattern Recognition. MLDM 2005. Lecture Notes in Computer Science, vol 3587. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11510888_15

  • DOI: https://doi.org/10.1007/11510888_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-26923-6

  • Online ISBN: 978-3-540-31891-0

  • eBook Packages: Computer Science, Computer Science (R0)
