Multiparameter Hierarchical Clustering Methods

  • Conference paper
Classification as a Tool for Research

Abstract

We propose an extension of hierarchical clustering methods, called multiparameter hierarchical clustering methods, which are designed to exhibit sensitivity to density while retaining desirable theoretical properties. The input of the method we propose is a triple (X, d, f), where (X, d) is a finite metric space and f : X → ℝ is a function defined on the data X, which could be a density estimate or could represent some other type of information. The output of our method is more general than dendrograms in that we track two parameters: the usual scale parameter and a parameter related to the function f. Our construction is motivated by the methods of persistent topology (Edelsbrunner et al. 2000), the Reeb graph and Cluster Trees (Stuetzle 2003). We present both a characterization and a stability theorem.
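
To make the two-parameter idea concrete, the following is a minimal sketch and not the construction of the paper. It assumes one plausible reading of the abstract: for a scale eps and a threshold sigma, keep only the points with f(x) <= sigma and cluster them by single linkage at scale eps, that is, take connected components of the graph joining points at distance at most eps. The function name two_parameter_clusters, the direction of the threshold, and the use of single linkage are all illustrative assumptions.

```python
from itertools import combinations


def two_parameter_clusters(points, dist, f, eps, sigma):
    """Cluster the points with f(x) <= sigma by single linkage at scale eps.

    Hypothetical helper for illustration only. Returns a sorted list of
    blocks, each block being a sorted list of indices into `points`.
    """
    # Assumed convention: filter by sublevel set of f.
    kept = [i for i in range(len(points)) if f(points[i]) <= sigma]
    parent = {i: i for i in kept}

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Join every pair of kept points that lies within distance eps.
    for i, j in combinations(kept, 2):
        if dist(points[i], points[j]) <= eps:
            parent[find(i)] = find(j)

    blocks = {}
    for i in kept:
        blocks.setdefault(find(i), []).append(i)
    return sorted(blocks.values())
```

Evaluating this function over a grid of (eps, sigma) values gives a family of partitions indexed by two parameters. Under the assumed convention, increasing either parameter can only add points or merge blocks, which is the kind of nested structure a dendrogram provides for the scale parameter alone.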

References

  • Wong, M. A., & Lane, T. (1983). A kth nearest neighbour clustering procedure. Journal of the Royal Statistical Society: Series B, 45(3), 362–368.

  • Ben-David, S., von Luxburg, U., & Pál, D. (2006). A sober look at clustering stability. In G. Lugosi & H.-U. Simon (Eds.), COLT, volume 4005 of Lecture Notes in Computer Science (pp. 5–19). Berlin, Heidelberg, New York: Springer.

  • Biau, G., Cadre, B., & Pelletier, B. (2007). A graph-based estimator of the number of clusters. ESAIM Probability and Statistics, 11, 272–280.

  • Burago, D., Burago, Y., & Ivanov, S. (2001). A course in metric geometry, volume 33 of AMS Graduate Studies in Maths. American Mathematical Society.

  • Carlsson, G., & Mémoli, F. (2008). Persistent clustering and a theorem of J. Kleinberg. ArXiv e-prints.

  • Cuevas, A., Febrero, M., & Fraiman, R. (2001). Cluster analysis: A further approach based on density estimation. Computational Statistics and Data Analysis, 36(4), 441–459.

  • Edelsbrunner, H., Letscher, D., & Zomorodian, A. (2000). Topological persistence and simplification. In Proceedings of the 41st Annual IEEE Symposium on Foundations of Computer Science (pp. 454–463).

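  • Ester, M., Kriegel, H. P., Sander, J., & Xu, X. (1996). A density-based algorithm for discovering clusters in large spatial databases with noise. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD-96) (pp. 226–231). Menlo Park, CA, USA: AAAI Press.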

  • Hartigan, J. A. (1975). Clustering algorithms. New York-London-Sydney: Wiley. Wiley Series in Probability and Mathematical Statistics.

  • Hartigan, J. A. (1981). Consistency of single linkage for high-density clusters. Journal of the American Statistical Association, 76(374), 388–394.

  • Jardine, N., & Sibson, R. (1971). Mathematical taxonomy. London: Wiley. Wiley Series in Probability and Mathematical Statistics.

  • Kleinberg, J. M. (2002). An impossibility theorem for clustering. In S. Becker, S. Thrun and K. Obermayer (Eds.), NIPS (pp. 446–453). Cambridge, MA: MIT Press.

  • Klemelä, J. (2004). Visualization of multivariate density estimates with level set trees. Journal of Computational and Graphical Statistics, 13(3), 599–620.

  • Lance, G. N., & Williams, W. T. (1967). A general theory of classificatory sorting strategies 1. Hierarchical systems. Computer Journal, 9(4), 373–380.

  • Mac Lane, S. (1998). Categories for the working mathematician (2nd ed.), Vol. 5 of Graduate Texts in Mathematics. New York: Springer-Verlag.

  • Stuetzle, W. (2003). Estimating the cluster type of a density by analyzing the minimal spanning tree of a sample. Journal of Classification, 20(1), 25–47.

  • Stuetzle, W., & Nugent, R. (2008). A generalized single linkage method for estimating the cluster tree of a density.

  • Wishart, D. (1969). Mode analysis: a generalization of nearest neighbor which reduces chaining effects. In Numerical Taxonomy (pp. 282–311). London: Academic Press.

Author information

Corresponding author

Correspondence to Facundo Mémoli.

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Carlsson, G., Mémoli, F. (2010). Multiparameter Hierarchical Clustering Methods. In: Locarek-Junge, H., Weihs, C. (eds) Classification as a Tool for Research. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10745-0_6
