Abstract
This chapter reviews statistical approaches to the clustering problem, i.e. the task of partitioning a data set into classes in such a way that points in the same class are more similar to one another than to points in other classes. Although the problem is technically ill-posed, it is of great importance in a wide range of applications, and numerous methods have been proposed to tackle it. The chapter focuses mainly on the coupled maps approach to clustering, which performs a non-parametric classification without any assumptions about the distribution of the clusters or the number of classes. The technique is illustrated with a biological example (reconstruction of phylogenetic trees) and an example from coding theory. The merits of the various approaches and the remaining challenges are also discussed.
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Stramaglia, S., Angelini, L., Marangi, C., Nitti, L., Pellicoro, M. (2004). Statistical Physics and the Clustering Problem. In: Wille, L.T. (eds) New Directions in Statistical Physics. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-08968-2_15
DOI: https://doi.org/10.1007/978-3-662-08968-2_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-07739-5
Online ISBN: 978-3-662-08968-2
eBook Packages: Springer Book Archive