Statistical Physics and the Clustering Problem

  • Chapter
New Directions in Statistical Physics

Abstract

This chapter reviews statistical approaches to the clustering problem, i.e. the task of partitioning a data set into classes in such a way that points in the same class are more similar to one another than to those in other classes. Although this is technically an ill-posed problem, it is of great importance in a wide range of applications, and numerous methods have been proposed to tackle it. The chapter mainly reviews the coupled maps approach to clustering, which performs a non-parametric classification without any assumptions on the distribution of clusters or the number of classes. The technique is illustrated on a biological example (reconstruction of phylogenetic trees) and on one from coding theory. The merits of various approaches and the remaining challenges are also discussed.

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Stramaglia, S., Angelini, L., Marangi, C., Nitti, L., Pellicoro, M. (2004). Statistical Physics and the Clustering Problem. In: Wille, L.T. (ed.) New Directions in Statistical Physics. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-08968-2_15

  • DOI: https://doi.org/10.1007/978-3-662-08968-2_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-07739-5

  • Online ISBN: 978-3-662-08968-2

  • eBook Packages: Springer Book Archive
