
Dynamic Clustering of Histogram Data: Using the Right Metric

Selected Contributions in Data Analysis and Classification

Abstract

In this paper we review several metrics that can serve as allocation functions in the Dynamic Clustering Algorithm (DCA) when the data are distributions or histograms of values. The choice of distance plays a central role in the DCA because it determines the criterion function that is optimized. Moreover, it has to be consistent with the prototype that represents each cluster. For each proposed metric, we therefore identify the corresponding prototype by minimizing the criterion function, which yields the best fit between the partition and the representation of the clusters. Finally, we focus on a Wasserstein-based distance and show its optimality in partitioning a set of histogram data when each cluster is represented by its barycenter, expressed as a distribution.
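As an illustration of the last point, the following is a minimal numerical sketch, not the authors' implementation. It assumes one-dimensional histograms stored as a hypothetical `(edges, weights)` pair with strictly positive weights and uniform density inside each bin. Under those assumptions it relies on a standard property of the L2 Wasserstein distance in one dimension: the squared distance is the integral over (0, 1) of the squared difference between quantile functions, so the barycenter of a cluster has the average of the members' quantile functions as its own quantile function. The function names and the grid size `n_grid` are choices made for this sketch.

```python
import numpy as np

def quantile_function(edges, weights, t):
    """Quantile function of a histogram evaluated at levels t in (0, 1).

    Assumes uniform density inside each bin and strictly positive weights,
    so the CDF is piecewise linear and strictly increasing.
    """
    edges = np.asarray(edges, dtype=float)
    weights = np.asarray(weights, dtype=float)
    cdf = np.concatenate(([0.0], np.cumsum(weights / weights.sum())))
    # Inverting a piecewise-linear CDF is a linear interpolation
    # with the roles of the axes swapped.
    return np.interp(t, cdf, edges)

def wasserstein2_sq(h1, h2, n_grid=1000):
    """Approximate squared L2 Wasserstein distance between two histograms
    with a midpoint rule on n_grid quantile levels."""
    t = (np.arange(n_grid) + 0.5) / n_grid
    q1 = quantile_function(h1[0], h1[1], t)
    q2 = quantile_function(h2[0], h2[1], t)
    return float(np.mean((q1 - q2) ** 2))

def barycenter_quantiles(hists, n_grid=1000):
    """Quantile function of the cluster barycenter on a grid of levels:
    the pointwise mean of the members' quantile functions."""
    t = (np.arange(n_grid) + 0.5) / n_grid
    q = np.mean([quantile_function(e, w, t) for e, w in hists], axis=0)
    return t, q

# Two toy histograms: uniform on [0, 1] (two bins) and uniform on [2, 3].
h1 = ([0.0, 0.5, 1.0], [0.5, 0.5])
h2 = ([2.0, 3.0], [1.0])
print(wasserstein2_sq(h1, h2))          # a pure shift by 2, so close to 4
t, q_bar = barycenter_quantiles([h1, h2])
print(q_bar[:3])                        # quantiles of a uniform law on [1, 2]
```

In a DCA-style loop, the allocation step would assign each histogram to the cluster whose barycenter gives the smallest `wasserstein2_sq`, and the representation step would recompute `barycenter_quantiles` for each cluster.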




Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Verde, R., Irpino, A. (2007). Dynamic Clustering of Histogram Data: Using the Right Metric. In: Brito, P., Cucumel, G., Bertrand, P., de Carvalho, F. (eds) Selected Contributions in Data Analysis and Classification. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73560-1_12
