Abstract
Let X be a random variable with distribution µ taking values in a Banach space H. First, we establish the existence of an optimal quantization of µ with respect to the L1-distance. Second, we propose several estimators of the optimal quantizer in the potentially infinite-dimensional space H, with associated algorithms. Finally, we discuss practical results obtained from real-life data sets.
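The abstract does not spell out the estimators or algorithms, so the following is only a minimal illustrative sketch of empirical L1-quantization, not the paper's method. It assumes the empirical L1 distortion of k centers c_1, ..., c_k is the average over the sample of min_j ||X_i - c_j||, that the elements of H are represented as discretized curves (NumPy arrays), and that the centers are restricted to sample points so that only pairwise distances are needed. The function names `l1_distortion` and `medoid_quantizer`, and the parameters `norm`, `init`, `n_iter` and `seed`, are hypothetical.

```python
import numpy as np


def l1_distortion(data, centers, norm):
    """Empirical L1 distortion: mean distance from each point to its nearest center."""
    dists = np.array([[norm(x - c) for c in centers] for x in data])
    return dists.min(axis=1).mean()


def medoid_quantizer(data, k, norm, init=None, n_iter=50, seed=0):
    """Alternating scheme with centers restricted to sample points (assumption).

    Returns the k selected centers and the cluster label of each sample point.
    """
    n = len(data)
    # Pairwise distances under the chosen norm; no vector-space averaging is used.
    D = np.array([[norm(data[i] - data[j]) for j in range(n)] for i in range(n)])
    if init is None:
        rng = np.random.default_rng(seed)
        idx = rng.choice(n, size=k, replace=False)
    else:
        idx = np.array(init)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest current center.
        labels = D[:, idx].argmin(axis=1)
        new_idx = idx.copy()
        for j in range(k):
            members = np.flatnonzero(labels == j)
            if members.size == 0:
                continue  # keep the old center if a cluster is empty
            # Update step: pick the member minimizing the summed distance
            # to the other members of its cluster.
            within = D[np.ix_(members, members)].sum(axis=0)
            new_idx[j] = members[within.argmin()]
        if np.array_equal(new_idx, idx):
            break
        idx = new_idx
    return data[idx], D[:, idx].argmin(axis=1)


def sup_norm(f):
    """Sup norm of a discretized curve, one possible Banach-space norm."""
    return np.abs(f).max()
```

Neither step can increase the empirical distortion, so the loop stops at a local optimum that depends on the initial centers. Any norm can be passed as `norm`, since the procedure never averages elements of the space.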
About this article
Cite this article
Laloë, T. L1-Quantization and clustering in Banach spaces. Math. Meth. Stat. 19, 136–150 (2010). https://doi.org/10.3103/S1066530710020031
DOI: https://doi.org/10.3103/S1066530710020031