The Poisson Processes in Cluster Analysis

Hardy, André

doi:10.1007/978-3-642-13312-1_5

Part of the book series: Studies in Classification, Data Analysis, and Knowledge Organization (STUDIES CLASS)

Abstract

This paper reviews some uses of point processes in cluster analysis. The homogeneous Poisson process is, in many ways, the simplest point process, and its role in point process theory is in most respects analogous to that of the normal distribution in the study of random variables. We first propose a statistical model for cluster analysis based on the homogeneous Poisson process. The clustering criterion is derived from that model by maximum likelihood estimation: it consists in minimizing the sum of the Lebesgue measures of the convex hulls of the clusters. We also present a generalization of that model to the non-stationary Poisson process, as well as some monothetic divisive clustering methods that are also based on Poisson processes. Furthermore, the central problem of cluster validation is usually considered to be the determination of the best number of natural clusters. We present two likelihood ratio tests for the number of clusters based on Poisson processes. Most of these clustering methods and tests for the number of clusters have been extended to symbolic data.
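
To make the criterion stated in the abstract concrete, the following minimal sketch evaluates the sum of the Lebesgue measures of the convex hulls of the clusters for a given partition. It is an illustration only, not the chapter's algorithm: it assumes two-dimensional data (so the Lebesgue measure is an area), and the function names convex_hull, polygon_area and hypervolume_criterion, as well as the toy data, are hypothetical choices made for this example.

```python
from collections import defaultdict


def cross(o, a, b):
    """Signed area of the parallelogram spanned by (a - o) and (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain: hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        # Drop points that would create a clockwise or collinear turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Shoelace formula; a point or a segment has area 0."""
    n = len(vertices)
    if n < 3:
        return 0.0
    total = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def hypervolume_criterion(points, labels):
    """Sum over clusters of the area of each cluster's convex hull (2-D only)."""
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(tuple(point))
    return sum(polygon_area(convex_hull(c)) for c in clusters.values())


# Toy data (assumed for illustration): two well-separated unit squares.
points = [(0, 0), (1, 0), (1, 1), (0, 1), (5, 0), (6, 0), (6, 1), (5, 1)]
natural = [0, 0, 0, 0, 1, 1, 1, 1]   # one cluster per square
shifted = [0, 1, 1, 0, 1, 1, 1, 1]   # left edge alone, everything else together

print(hypervolume_criterion(points, natural))
print(hypervolume_criterion(points, shifted))
```

Among partitions into the same number of clusters, the one with the smaller criterion value is preferred; here the partition that follows the two squares scores lower than the one that merges part of the left square with the right one. Comparing values across different numbers of clusters is not meaningful on its own, which is why the abstract treats the choice of the number of clusters as a separate testing problem.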

Author information

Authors and Affiliations

University of Namur, 5000, Namur, Belgium
André Hardy

Corresponding author

Correspondence to André Hardy.

Editor information

Editors and Affiliations

Laboratoire d'Informatique, Université d'Aix-Marseille II, Avenue de Luminy 163, case 901, Marseille cedex 9, 13288, France
Bernard Fichet
Dipartimento di Scienze Statistiche, Università di Napoli "Federico II", Via Leopoldo Rodinò 22, Naples, 80138, Italy
Domenico Piccolo
Facoltà di Studi Politici "Jean Monnet", Seconda Università di Napoli, Via del Setificio 15, Caserta, 81100, Italy
Rosanna Verde
Facoltà di Scienze Statistiche, Università di Roma "La Sapienza", P.le Aldo Moro 5, Rome, 00185, Italy
Maurizio Vichi

About this paper

Cite this paper

Hardy, A. (2011). The Poisson Processes in Cluster Analysis. In: Fichet, B., Piccolo, D., Verde, R., Vichi, M. (eds) Classification and Multivariate Analysis for Complex Data Structures. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13312-1_5

DOI: https://doi.org/10.1007/978-3-642-13312-1_5
Published: 08 November 2010
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13311-4
Online ISBN: 978-3-642-13312-1
eBook Packages: Mathematics and Statistics (R0)
