
L1-Quantization and clustering in Banach spaces

Mathematical Methods of Statistics

Abstract

Let X be a random variable with distribution µ taking values in a Banach space H. First, we establish the existence of an optimal quantization of µ with respect to the L1-distance. Second, we propose several estimators of the optimal quantizer in the potentially infinite-dimensional space H, with associated algorithms. Finally, we discuss practical results obtained from real-life data sets.
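
To make the objective concrete, here is a minimal sketch of an empirical L1-type distortion, that is, the average unsquared distance from each observation to its nearest center. It is not the estimator or algorithm of the article, whose details are not given in the abstract. It assumes that curves are discretized on a common grid, that the norm is the sup norm, and that candidate centers are restricted to sample points. All function names are hypothetical.

```python
from itertools import combinations

def sup_norm_distance(f, g):
    """Sup-norm distance between two curves sampled on a common grid (assumed norm)."""
    return max(abs(a - b) for a, b in zip(f, g))

def empirical_distortion(sample, centers, dist=sup_norm_distance):
    """Average distance (not squared) from each observation to its nearest center."""
    return sum(min(dist(x, c) for c in centers) for x in sample) / len(sample)

def best_centers_among_sample(sample, k, dist=sup_norm_distance):
    """Exhaustive search over k-subsets of the sample; only feasible for tiny samples."""
    return min(
        combinations(sample, k),
        key=lambda centers: empirical_distortion(sample, centers, dist),
    )

# Toy data: two groups of discretized curves (illustrative values only).
curves = [
    (0.0, 0.5), (0.1, 0.6), (0.3, 0.8),
    (2.0, 2.5), (2.1, 2.6), (2.4, 2.9),
]
centers = best_centers_among_sample(curves, k=2)
distortion = empirical_distortion(curves, centers)
print(centers, distortion)
```

On this toy sample the search picks one center per group. The exhaustive search grows combinatorially with the sample size, so it only illustrates the criterion being minimized and is not a practical algorithm.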



Author information


Correspondence to T. Laloë.

About this article

Cite this article

Laloë, T. L1-Quantization and clustering in Banach spaces. Math. Meth. Stat. 19, 136–150 (2010). https://doi.org/10.3103/S1066530710020031


  • DOI: https://doi.org/10.3103/S1066530710020031
