On Hölder fields clustering

  • Original Paper

Abstract

Based on n randomly drawn vectors in a Hilbert space, we study the k-means clustering scheme. Here, clustering is performed by computing the Voronoi partition associated with centers that minimize an empirical criterion, called distortion. The performance of the method is evaluated by comparing the theoretical distortion of empirical optimal centers to the theoretical optimal distortion. Our first result states that, provided that the underlying distribution satisfies an exponential moment condition, an upper bound for the above performance criterion is \(O(1/\sqrt{n})\). Then, motivated by a broad range of applications, we focus on the case where the data are real-valued random fields. Assuming that they share a Hölder property in quadratic mean, we construct a numerically simple k-means algorithm based on a discretized version of the data. With a judicious choice of the discretization, we prove that the performance of this algorithm matches the performance of the classical algorithm.
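As a rough illustration of the quantities named in the abstract, the following Python sketch computes an empirical distortion and runs k-means on discretized curves. It is a minimal sketch under several assumptions that are not taken from the paper: the centers are found with Lloyd's heuristic (which only reaches a local minimizer, not the empirical optimal centers), the discretization simply keeps every `step`-th sample, the plain Euclidean norm stands in for the Hilbert norm, and all function names and the toy data are hypothetical.

```python
import random


def sq_dist(x, c):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(x, c))


def empirical_distortion(data, centers):
    """Average squared distance from each point to its nearest center."""
    return sum(min(sq_dist(x, c) for c in centers) for x in data) / len(data)


def lloyd_kmeans(data, k, n_iter=50, seed=0):
    """Lloyd's heuristic: alternate Voronoi assignment and centroid update.

    Assumption: this stands in for the exact minimization of the empirical
    distortion; it only guarantees a local minimizer.
    """
    rng = random.Random(seed)
    centers = [list(x) for x in rng.sample(data, k)]
    for _ in range(n_iter):
        # Voronoi step: assign each point to the cell of its nearest center.
        cells = [[] for _ in range(k)]
        for x in data:
            j = min(range(k), key=lambda i: sq_dist(x, centers[i]))
            cells[j].append(x)
        # Centroid step: replace each center by the mean of its cell.
        for j, cell in enumerate(cells):
            if cell:  # keep the old center if its cell is empty
                centers[j] = [sum(col) / len(cell) for col in zip(*cell)]
    return centers


def discretize(curve, step):
    """Hypothetical discretization: keep every `step`-th sample of a curve."""
    return curve[::step]


# Toy data (illustrative only): two groups of noisy curves on [0, 1].
rng = random.Random(1)
grid = [i / 63 for i in range(64)]
curves = [[level + t + rng.gauss(0, 0.05) for t in grid]
          for level in (0.0, 5.0) for _ in range(20)]

coarse = [discretize(c, 8) for c in curves]       # 8 samples per curve
centers = lloyd_kmeans(coarse, k=2)
distortion = empirical_distortion(coarse, centers)
```

In this toy setting the clustering is run on 8 samples per curve instead of 64, which is the kind of numerical saving the discretized algorithm is meant to offer; how fine the grid must be to preserve the performance guarantee is the subject of the paper's result and is not captured by this sketch.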

Author information

Corresponding author

Correspondence to Benoît Cadre.

Additional information

Communicated by Domingo Morales.

About this article

Cite this article

Cadre, B., Paris, Q. On Hölder fields clustering. TEST 21, 301–316 (2012). https://doi.org/10.1007/s11749-011-0244-4

  • DOI: https://doi.org/10.1007/s11749-011-0244-4
