Abstract
This paper is concerned with the study of the consistency of a variational method for probability measure quantization, deterministically realized by means of a minimizing principle, balancing power repulsion and attraction potentials. The proof of consistency is based on the construction of a target energy functional whose unique minimizer is actually the given probability measure \(\omega \) to be quantized. Then we show that the discrete functionals, defining the discrete quantizers as their minimizers, actually \(\Gamma \)-converge to the target energy with respect to the narrow topology on the space of probability measures. A key ingredient is the reformulation of the target functional by means of a Fourier representation, which extends the characterization of conditionally positive semidefinite functions from points in generic position to probability measures. As a byproduct of the Fourier representation, we also obtain compactness of sublevels of the target energy in terms of uniform moment bounds, which has already found applications in the asymptotic analysis of corresponding gradient flows. To model situations where the given probability is affected by noise, we further consider a modified energy, with the addition of a regularizing total variation term, and we investigate again its point mass approximations in terms of \(\Gamma \)-convergence. We show that such a discrete measure representation of the total variation can be interpreted as an additional nonlinear potential, repulsive at short range, attractive at medium range, and without effect at long range, promoting a uniform distribution of the point masses.
Notes
 1. http://en.wikipedia.org/wiki/Pseudorandom_number_generator
 2. One practical way to randomly sample an image would be first to generate (pseudo)randomly a finite number of points according to the uniform distribution, from which one eliminates the points that do not locally realize an integral over a prescribed threshold.
References
 1. Ambrosio, L., Fusco, N., Pallara, D.: Functions of Bounded Variation and Free Discontinuity Problems. Oxford Mathematical Monographs. The Clarendon Press, New York (2000)
 2. Ambrosio, L., Gigli, N., Savaré, G.: Gradient Flows in Metric Spaces and in the Space of Probability Measures. Lectures in Mathematics ETH Zürich, 2nd edn. Birkhäuser, Basel (2008)
 3. Anzellotti, G., Baldo, S., Percivale, D.: Dimension reduction in variational problems, asymptotic development in \(\Gamma \)-convergence and thin structures in elasticity. Asymptotic Anal. 9(1), 61–100 (1994)
 4. Bartels, S.: Total variation minimization with finite elements: convergence and iterative solution. SIAM J. Numer. Anal. 50(3), 1162–1180 (2012)
 5. Billingsley, P.: Convergence of Probability Measures. Wiley, New York (1968)
 6. Billingsley, P.: Probability and Measure. Wiley Series in Probability and Mathematical Statistics, 3rd edn. Wiley, New York (1995)
 7. Carrillo, J.A., Choi, Y.P., Hauray, M.: The Derivation of Swarming Models: Mean-field Limit and Wasserstein Distances. Collective Dynamics from Bacteria to Crowds: An Excursion Through Modeling, Analysis and Simulation Series, vol. 553, pp. 1–46. CISM International Centre for Mechanical Sciences, New York (2014)
 8. Carrillo, J.A., Toscani, G.: Contractive probability metrics and asymptotic behavior of dissipative kinetic equations. Riv. Mat. Univ. Parma 7(6), 75–198 (2007)
 9. Chambolle, A., Caselles, V., Cremers, D., Novaga, M., Pock, T.: Theoretical Foundations and Numerical Methods for Sparse Recovery. Radon Series on Computational and Applied Mathematics. An introduction to total variation for image analysis, vol. 9, pp. 263–340. Walter de Gruyter, Berlin (2010)
 10. Cicalese, M., Spadaro, E.: Droplet minimizers of an isoperimetric problem with long-range interactions. Commun. Pure Appl. Math. 66(8), 1298–1333 (2013)
 11. CVX Research, Inc. CVX: Matlab software for disciplined convex programming, version 2.0 beta. http://cvxr.com/cvx (2013)
 12. Dal Maso, G.: An Introduction to \(\Gamma \)-Convergence. Progress in Nonlinear Differential Equations and their Applications, vol. 8. Birkhäuser, Boston (1993)
 13. Dereich, S., Scheutzow, M., Schottstedt, R.: Constructive quantization: approximation by empirical measures. Ann. Inst. Henri Poincaré (B) 49(4), 1183–1203 (2013)
 14. Devroye, L., Györfi, L.: Nonparametric Density Estimation. Wiley Series in Probability and Mathematical Statistics: Tracts on Probability and Statistics. Wiley, New York (1985)
 15. Di Francesco, M., Fornasier, M., Hütter, J.C., Matthes, D.: Asymptotic behavior of gradient flows driven by power repulsion and attraction potentials in one dimension. SIAM J. Math. Anal. 26(6), 3814–3837 (2014)
 16. Durrett, R.: Probability: Theory and Examples. Cambridge Series in Statistical and Probabilistic Mathematics, 4th edn. Cambridge University Press, Cambridge (2010)
 17. Evans, L.C., Gariepy, R.F.: Measure Theory and Fine Properties of Functions. Studies in Advanced Mathematics. CRC Press, Boca Raton (1992)
 18. Fattal, R.: Blue-noise point sampling using kernel density model. ACM SIGGRAPH 2011 papers, 28(3), 1–10 (2011)
 19. Fenn, M., Steidl, G.: Fast NFFT based summation of radial functions. Sampl. Theor. Sig. Image Process 3(1), 1–28 (2004)
 20. Fornasier, M., Haškovec, J., Steidl, G.: Consistency of variational continuous-domain quantization via kinetic theory. Appl. Anal. 92(6), 1283–1298 (2013)
 21. Goldman, D., Muratov, C.B., Serfaty, S.: The \(\Gamma \)-limit of the two-dimensional Ohta-Kawasaki energy. I. Droplet density. Arch. Ration. Mech. Anal. 210(2), 581–613 (2013)
 22. Goldman, D., Muratov, C.B., Serfaty, S.: The \(\Gamma \)-limit of the two-dimensional Ohta-Kawasaki energy. Droplet arrangement via the renormalized energy. Arch. Ration. Mech. Anal. 212(2), 445–501 (2014)
 23. Gräf, M., Potts, D., Steidl, G.: Quadrature errors, discrepancies, and their relations to halftoning on the torus and the sphere. SIAM J. Sci. Comput. 34(5), A2760–A2791 (2012)
 24. Graf, S., Luschgy, H.: Foundations of Quantization for Probability Distributions. Lecture Notes in Mathematics, vol. 1730. Springer, Berlin (2000)
 25. Grant, M., Boyd, S.: Graph implementations for nonsmooth convex programs. In: Blondel, V., Boyd, S., Kimura, H. (eds.) Recent Advances in Learning and Control. Lecture Notes in Control and Information Sciences, pp. 95–110. Springer (2008). http://stanford.edu/~boyd/graph_dcp.html
 26. Gruber, P.M.: Optimum quantization and its applications. Adv. Math. 186(2), 456–497 (2004)
 27. Hütter, J.C.: Minimizers and Gradient Flows of Attraction-Repulsion Functionals with Power Kernels and Their Total Variation Regularization. Master Thesis, Technical University of Munich (2013)
 28. Knüpfer, H., Muratov, C.B.: On an isoperimetric problem with a competing nonlocal term. I: The planar case. Commun. Pure Appl. Math. 66(7), 1129–1162 (2013)
 29. Knüpfer, H., Muratov, C.B.: On an isoperimetric problem with a competing nonlocal term II: The general case. Commun. Pure Appl. Math. 67(12), 1974–1994 (2014)
 30. Pagès, G.: A space quantization method for numerical integration. J. Comput. Appl. Math. 89(1), 1–38 (1998)
 31. Potts, D., Steidl, G.: Fast summation at nonequispaced knots by NFFT. SIAM J. Sci. Comput. 24(6), 2013–2037 (2003)
 32. Rudin, L.I., Osher, S., Fatemi, E.: Nonlinear total variation based noise removal algorithms. Phys. D 60(1–4), 259–268 (1992)
 33. Schmaltz, C., Gwosdek, P., Bruhn, A., Weickert, J.: Electrostatic halftoning. Comput. Graph. 29, 2313–2327 (2010)
 34. Shorack, G.R., Wellner, J.A.: Empirical Processes with Applications to Statistics. Reprint of the 1986 original edition (2009)
 35. Teuber, T., Steidl, G., Gwosdek, P., Schmaltz, C., Weickert, J.: Dithering by differences of convex functions. SIAM J. Imaging Sci. 4(1), 79–108 (2011)
 36. Wendland, H.: Scattered Data Approximation. Cambridge Monographs on Applied and Computational Mathematics, vol. 17. Cambridge University Press, Cambridge (2005)
 37. Wied, D., Weißbach, R.: Consistency of the kernel density estimator: a survey. Statist. Papers 53(1), 1–21 (2012)
Acknowledgments
Massimo Fornasier is supported by the ERC-Starting Grant for the project “High-Dimensional Sparse Optimal Control”. Jan-Christian Hütter acknowledges the partial financial support of the START-Project “Sparse Approximation and Optimization in High-Dimensions” during the early preparation of this work.
Additional information
Communicated by Hans G. Feichtinger.
Appendix: Conditionally Positive Definite Functions
In order to compute the Fourier representation of the energy functional \( \mathcal {E} \) in Sect. 3.1.3, we used the notion of generalized Fourier transforms and conditionally positive definite functions from [36], which we briefly recall here for the sake of completeness. The main result reported below, Theorem 7.6, is stated in a slightly modified form with respect to [36, Theorem 8.16], which allows us to prove the moment bound in Sect. 4. The representation formula (3.10) is a consequence of Theorem 7.4 below, which serves as a characterization formula in the theory of conditionally positive definite functions.
Definition 7.1
[36, Definition 8.1] Let \( \mathbb {P}_{k}(\mathbb {R}^d) \) denote the set of polynomial functions on \( \mathbb {R}^d \) of degree less than or equal to k. We call a continuous function \( \Phi :\mathbb {R}^d \rightarrow \mathbb {C} \) conditionally positive semidefinite of order m if for all \( N\in \mathbb {N} \), pairwise distinct points \( x_1,\ldots ,x_N \in \mathbb {R}^d \), and \( \alpha \in \mathbb {C}^N \) with
$$\begin{aligned} \sum _{j=1}^N \alpha _j p(x_j) = 0, \quad \text {for all } p \in \mathbb {P}_{m-1}(\mathbb {R}^d), \end{aligned}$$ (7.1)
the quadratic form given by \( (\Phi (x_j - x_k))_{jk} \) is nonnegative, i.e.,
$$\begin{aligned} \sum _{j,k=1}^N \alpha _j \overline{\alpha _k} \Phi (x_j - x_k) \ge 0. \end{aligned}$$
Moreover, we call \( \Phi \) conditionally positive definite of order m if the above inequality is strict for \( \alpha \ne 0 \).
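Definition 7.1 can be probed numerically. The following is a minimal sketch, assuming the one-dimensional kernel \( \Phi (x) = -\left| x \right| ^q \) with \( q \in (0,2) \) and order 1, so that the side condition reduces to \( \sum _j \alpha _j = 0 \). The helper names `quadratic_form`, `random_admissible_sample` and `smallest_value` are hypothetical and not taken from the paper.

```python
import random


def quadratic_form(points, alpha, q):
    """Return sum_{j,k} alpha_j * alpha_k * Phi(x_j - x_k) for Phi(x) = -|x|**q."""
    return sum(
        -a * b * abs(x - y) ** q
        for x, a in zip(points, alpha)
        for y, b in zip(points, alpha)
    )


def random_admissible_sample(n, rng):
    """Draw n pairwise distinct real points and real weights that sum to zero."""
    points = [p / 100.0 for p in rng.sample(range(-1000, 1000), n)]
    alpha = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    mean = sum(alpha) / n
    # Subtracting the mean enforces the order-1 side condition sum(alpha) = 0.
    alpha = [a - mean for a in alpha]
    return points, alpha


def smallest_value(q, trials=200, n=6, seed=0):
    """Smallest quadratic form observed over random admissible samples."""
    rng = random.Random(seed)
    values = []
    for _ in range(trials):
        points, alpha = random_admissible_sample(n, rng)
        values.append(quadratic_form(points, alpha, q))
    return min(values)


if __name__ == "__main__":
    # Admissible weights: the form should stay nonnegative up to rounding.
    print(smallest_value(1.0), smallest_value(1.5))
    # Weights violating sum(alpha) = 0 can make the form negative.
    print(quadratic_form([0.0, 1.0], [1.0, 1.0], 1.0))
```

A random search of this kind can only fail to find a counterexample; it does not prove the property. The last call illustrates why the side condition matters: without it the same kernel yields a negative value.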
Generalized Fourier Transform
When working with distributional Fourier transforms, which can serve to characterize the conditionally positive definite functions defined above, it is convenient to restrict the standard Schwartz space \( \mathcal {S} \) to functions which, in addition to decaying faster than any polynomial for large arguments, also exhibit a certain decay for small ones. In this way, one can neglect the singularities at 0 which could otherwise arise in the Fourier transform.
Definition 7.2
(Restricted Schwartz class \( \mathcal {S}_m \)) [36, Definition 8.8] Let \( \mathcal {S} \) be the Schwartz space of functions in \( C^\infty (\mathbb {R}^d) \) which for \( \left| x \right| \rightarrow \infty \) decay faster than any fixed polynomial. Then, for \( m \in \mathbb {N} \), we denote by \( \mathcal {S}_m \) the subset of those functions \(\gamma \) in \( \mathcal {S} \) which additionally fulfill
$$\begin{aligned} \gamma (\xi ) = O(\left| \xi \right| ^m) \quad \text {for } \xi \rightarrow 0. \end{aligned}$$
Furthermore, we shall call an (otherwise arbitrary) function \( \Phi :\mathbb {R}^d \rightarrow \mathbb {C} \) slowly increasing if there is an \( m \in \mathbb {N} \) such that
$$\begin{aligned} \Phi (x) = O(\left| x \right| ^m) \quad \text {for } \left| x \right| \rightarrow \infty . \end{aligned}$$
Definition 7.3
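(Generalized Fourier transform) [36, Definition 8.9] For \( \Phi :\mathbb {R}^d \rightarrow \mathbb {C} \) continuous and slowly increasing, we call a measurable function \(\widehat{\Phi } \in L_{\mathrm {loc}}^2(\mathbb {R}^d {\setminus } \left\{ 0 \right\} )\) the generalized Fourier transform of \( \Phi \) if there exists a multiple of \( \frac{1}{2} \), \( m = \frac{1}{2}n, \, n \in \mathbb {N}_0 \), such that
$$\begin{aligned} \int _{\mathbb {R}^d} \Phi (x)\widehat{\gamma }(x) \mathrm {d}x = \int _{\mathbb {R}^d} \widehat{\Phi }(\xi ) \gamma (\xi ) \mathrm {d}\xi , \quad \text {for all } \gamma \in \mathcal {S}_{2m}. \end{aligned}$$ (7.3)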
Then, we call m the order of \( \widehat{\Phi } \).
Note that the order here is defined in terms of 2m instead of m , which is why we would like to allow for multiples of \( \frac{1}{2} \).
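The defining identity of the generalized Fourier transform, \( \int \Phi \widehat{\gamma } = \int \widehat{\Phi } \gamma \) for all \( \gamma \in \mathcal {S}_{2m} \), can be checked on a concrete pair. The sketch below is an illustration under stated assumptions: dimension \( d = 1 \), \( \Phi (x) = \left| x \right| \) with \( \widehat{\Phi }(\xi ) = -2/\xi ^2 \) (the value that Theorem 7.6 gives for \( d = 1 \), \( \beta = 1 \)), the unnormalized convention \( \widehat{\gamma }(x) = \int \gamma (\xi ) \mathrm {e}^{-\mathrm {i}x\xi } \mathrm {d}\xi \), and the test function \( \gamma (\xi ) = \xi ^2 \mathrm {e}^{-\xi ^2/2} \in \mathcal {S}_2 \). All function names are hypothetical.

```python
import math


def midpoint_integral(f, a, b, n=20000):
    """Composite midpoint rule; with n even and a = -b no node falls on 0."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def gamma_test(xi):
    """Test function xi^2 * exp(-xi^2 / 2), which is O(|xi|^2) near 0."""
    return xi * xi * math.exp(-0.5 * xi * xi)


def gamma_hat(x):
    """Closed-form transform of gamma_test under the unnormalized convention."""
    return math.sqrt(2.0 * math.pi) * (1.0 - x * x) * math.exp(-0.5 * x * x)


def phi(x):
    """Power function |x| in one dimension."""
    return abs(x)


def phi_hat(xi):
    """Assumed generalized Fourier transform of |x| for d = 1."""
    return -2.0 / (xi * xi)


def both_sides(cutoff=12.0):
    """Return (int Phi * gamma_hat, int Phi_hat * gamma) on [-cutoff, cutoff]."""
    lhs = midpoint_integral(lambda x: phi(x) * gamma_hat(x), -cutoff, cutoff)
    rhs = midpoint_integral(lambda xi: phi_hat(xi) * gamma_test(xi), -cutoff, cutoff)
    return lhs, rhs


if __name__ == "__main__":
    print(both_sides())
```

The factor \( \xi ^2 \) in the test function cancels the singularity of \( \widehat{\Phi } \) at the origin, which is exactly the role of the restricted class \( \mathcal {S}_{2m} \) with \( m = 1 \).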
Representation Formula for Conditionally Positive Definite Functions
Theorem 7.4
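[36, Corollary 8.13] Let \( \Phi :\mathbb {R}^d \rightarrow \mathbb {C} \) be a continuous and slowly increasing function with a nonnegative, nonvanishing generalized Fourier transform \( \widehat{\Phi } \) of order m that is continuous on \( \mathbb {R}^d {\setminus } \left\{ 0 \right\} \). Then, for pairwise distinct points \( x_1, \dots , x_N \in \mathbb {R}^d \) and \( \alpha \in \mathbb {C}^N \) which fulfill condition (7.1), i.e.,
$$\begin{aligned} \sum _{j=1}^N \alpha _j p(x_j) = 0, \quad \text {for all } p \in \mathbb {P}_{m-1}(\mathbb {R}^d), \end{aligned}$$
we have
$$\begin{aligned} \sum _{j,k=1}^N \alpha _j \overline{\alpha _k} \Phi (x_j - x_k) = \frac{1}{(2\pi )^d} \int _{\mathbb {R}^d} \left| \sum _{j=1}^N \alpha _j \mathrm {e}^{\mathrm {i}x_j \cdot \xi } \right| ^2 \widehat{\Phi }(\xi ) \mathrm {d}\xi . \end{aligned}$$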
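As a sanity check of a representation formula of this type, one can compare both sides numerically in one dimension. The sketch below assumes the formula \( \sum _{j,k} \alpha _j \overline{\alpha _k} \Phi (x_j - x_k) = (2\pi )^{-d} \int \left| \sum _j \alpha _j \mathrm {e}^{\mathrm {i}x_j \cdot \xi } \right| ^2 \widehat{\Phi }(\xi ) \mathrm {d}\xi \) with \( d = 1 \), \( \Phi (x) = -\left| x \right| \) and \( \widehat{\Phi }(\xi ) = 2/\xi ^2 \); the normalization \( (2\pi )^{-d} \) corresponds to the unnormalized Fourier convention and is an assumption here. Points, weights and function names are illustrative choices.

```python
import math


def quadratic_form(points, alpha):
    """Left-hand side: sum_{j,k} alpha_j alpha_k Phi(x_j - x_k), Phi(x) = -|x|."""
    return sum(
        -a * b * abs(x - y)
        for x, a in zip(points, alpha)
        for y, b in zip(points, alpha)
    )


def fourier_side(points, alpha, radius=500.0, step=0.01):
    """Right-hand side: (1 / (2 pi)) * int |S(xi)|^2 * (2 / xi^2) d xi.

    S(xi) = sum_j alpha_j exp(i x_j xi). The integrand is even, so the
    midpoint rule is applied on (0, radius) and doubled.
    """
    n = int(round(radius / step))
    total = 0.0
    for i in range(n):
        xi = (i + 0.5) * step
        re = sum(a * math.cos(x * xi) for x, a in zip(points, alpha))
        im = sum(a * math.sin(x * xi) for x, a in zip(points, alpha))
        total += (re * re + im * im) / (xi * xi)
    integral = 2.0 * step * total
    # Tail correction: for |xi| > radius, |S|^2 oscillates around sum(alpha_j^2).
    integral += 2.0 * sum(a * a for a in alpha) / radius
    return integral / math.pi


if __name__ == "__main__":
    pts = [0.0, 0.7, 1.5]
    wts = [1.0, -2.5, 1.5]  # weights sum to zero, i.e. condition (7.1) for m = 1
    print(quadratic_form(pts, wts), fourier_side(pts, wts))
```

The zero-sum condition on the weights makes \( \left| S(\xi ) \right| ^2 \) vanish to second order at \( \xi = 0 \), so the integrand stays bounded there even though \( \widehat{\Phi } \) is singular.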
Computation for the Power Function
Given Theorem 7.4, in this paper we are naturally interested in the explicit formula of the generalized Fourier transform for the power function \( x \mapsto \left| x \right| ^q \) for \( q \in [1,2) \). It is a nice example of how to pass from an ordinary Fourier transform to the generalized Fourier transform by extending the formula by means of complex analysis methods. Our starting point will be the multiquadric \( x \mapsto \left( c^2 + \left| x \right| ^2 \right) ^{\beta } \) for \( \beta < -d/2 \), whose Fourier transform involves the modified Bessel function of the third kind:
For \( \nu \in \mathbb {C} \), \( z \in \mathbb {C} \) with \( \left| {\text {arg}}\, z \right| < \pi /2 \), define
$$\begin{aligned} K_\nu (z) := \int _0^\infty \mathrm {e}^{-z \cosh (t)} \cosh (\nu t) \mathrm {d}t, \end{aligned}$$
the modified Bessel function of the third kind of order \( \nu \in \mathbb {C} \).
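For real order and positive real argument, the modified Bessel function of the third kind can be evaluated directly from the standard integral representation \( K_\nu (r) = \int _0^\infty \mathrm {e}^{-r\cosh t} \cosh (\nu t) \mathrm {d}t \). The following sketch assumes this representation and uses a plain trapezoidal rule; the cutoff `upper` and the number of nodes `n` are ad hoc choices suited to moderate arguments, not recommendations from the paper.

```python
import math


def bessel_k(nu, r, upper=20.0, n=20000):
    """Approximate K_nu(r) = int_0^inf exp(-r cosh t) cosh(nu t) dt for r > 0.

    The integrand decays double-exponentially in t, so truncating at
    `upper` is harmless for moderate nu and r not too close to 0.
    """
    h = upper / n

    def integrand(t):
        return math.exp(-r * math.cosh(t)) * math.cosh(nu * t)

    total = 0.5 * (integrand(0.0) + integrand(upper))
    total += sum(integrand(i * h) for i in range(1, n))
    return h * total


def bessel_k_half(r):
    """Closed form K_{1/2}(r) = sqrt(pi / (2 r)) * exp(-r), used as a reference."""
    return math.sqrt(math.pi / (2.0 * r)) * math.exp(-r)


if __name__ == "__main__":
    for r in (0.5, 1.0, 3.0):
        print(r, bessel_k(0.5, r), bessel_k_half(r))
```

Since \( \cosh \) is even, the representation also makes the symmetry \( K_{-\nu } = K_\nu \) visible at a glance.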
Theorem 7.5
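[36, Theorem 6.13] For \( c > 0 \) and \( \beta < -d/2 \),
$$\begin{aligned} \Phi (x) = \left( c^2 + \left| x \right| ^2 \right) ^{\beta }, \quad x \in \mathbb {R}^d, \end{aligned}$$
has (classical) Fourier transform given by
$$\begin{aligned} \widehat{\Phi }(\xi ) = (2\pi )^{d/2} \frac{2^{1+\beta }}{\Gamma (-\beta )} \left( \frac{\left| \xi \right| }{c} \right) ^{-\beta - d/2} K_{d/2+\beta }(c \left| \xi \right| ). \end{aligned}$$ (7.5)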
In the following result, we have slightly changed the statement compared to the original reference [36, Theorem 8.16] in order to allow orders which are a multiple of 1/2 instead of just integers. The latter situation made sense in [36] because the definition of the order involves the space \( \mathcal {S}_{2m} \) due to its purpose in the representation formula of Theorem 7.4, where a quadratic form appears. However, in Sect. 4 we need the generalized Fourier transform in the context of a linear functional, hence a different range of orders. Fortunately, one can easily generalize the proof in [36] to this fractional case, as all integrability arguments remain true when permitting multiples of 1/2, in particular the estimates in (7.8) and (7.10).
Theorem 7.6

1.
[36, Theorem 8.15] \( \Phi (x) = (c^2 + \left| x \right| ^2)^\beta \), \( x \in \mathbb {R}^d \) for \( c > 0 \) and \( \beta \in \mathbb {R} {\setminus } \mathbb {N}_0 \) has the generalized Fourier transform
$$\begin{aligned} \widehat{\Phi }(\xi ) = (2\pi )^{d/2} \frac{2^{1+\beta }}{\Gamma (-\beta )} \left( \frac{\left| \xi \right| }{c} \right) ^{-\beta - d/2} K_{d/2+\beta }(c \left| \xi \right| ), \quad \xi \ne 0 \end{aligned}$$ (7.6)
of order \( m = \max (0, \lfloor 2\beta + 1\rfloor /2) \).

2.
[36, Theorem 8.16] \( \Phi (x) = \left| x \right| ^\beta \), \( x \in \mathbb {R}^d \) with \( \beta \in \mathbb {R}_+ {\setminus } 2\mathbb {N} \) has the generalized Fourier transform
$$\begin{aligned} \widehat{\Phi }(\xi ) = (2\pi )^{d/2}\frac{2^{\beta +d/2}\Gamma ((d+\beta )/2)}{\Gamma (-\beta /2)} \left| \xi \right| ^{-\beta - d}, \quad \xi \ne 0, \end{aligned}$$
of order \( m = \lfloor \beta + 1\rfloor /2 \).
Note that in the cases of interest to us, the second statement of the theorem means that the generalized Fourier transform of \( \Phi (x) = \left x \right ^\beta \) is of order \( \frac{1}{2} \) for \( \beta \in (0, 1) \) and 1 for \( \beta \in [1,2) \), respectively. As this statement appears in a slightly modified form with respect to [36, Theorem 8.16] we report below an explicit, although rather concise proof of it.
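The constant and the order in the second statement of Theorem 7.6 are easy to tabulate. The sketch below assumes the reading \( \widehat{\Phi }(\xi ) = (2\pi )^{d/2} 2^{\beta +d/2} \Gamma ((d+\beta )/2) / \Gamma (-\beta /2) \, \left| \xi \right| ^{-\beta -d} \) and \( m = \lfloor \beta +1 \rfloor /2 \), and compares the prefactor with two classical values, \( -2 \) for \( d = 1 \) and \( -8\pi \) for \( d = 3 \), both with \( \beta = 1 \). The function names are hypothetical.

```python
import math


def power_transform_constant(beta, d):
    """Prefactor C in hat Phi(xi) = C * |xi|**(-beta - d) for Phi(x) = |x|**beta.

    Requires beta > 0 and beta not an even integer, otherwise Gamma(-beta/2)
    has a pole and math.gamma raises ValueError.
    """
    return (
        (2.0 * math.pi) ** (d / 2.0)
        * 2.0 ** (beta + d / 2.0)
        * math.gamma((d + beta) / 2.0)
        / math.gamma(-beta / 2.0)
    )


def power_transform_order(beta):
    """Order m = floor(beta + 1) / 2 of the generalized Fourier transform."""
    return math.floor(beta + 1.0) / 2.0


if __name__ == "__main__":
    for beta in (0.5, 1.0, 1.5):
        print(beta, power_transform_order(beta), power_transform_constant(beta, 1))
```

For \( \beta \in (0,2) \) the prefactor is negative, because \( \Gamma (-\beta /2) < 0 \) there; this is consistent with \( -\left| x \right| ^\beta \), rather than \( \left| x \right| ^\beta \), being the kernel with a nonnegative generalized Fourier transform.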
Proof

1.
We can pass from formula (7.5) to (7.6) by analytic continuation, where the exponent m serves to give us the needed integrable dominating function, see formula (7.8) below. Let \( G = \left\{ \lambda \in \mathbb {C} : \mathrm{Re}(\lambda ) < m \right\} \ni \beta \) and
$$\begin{aligned} \varphi _\lambda (\xi ) := {}&(2\pi )^{d/2} \frac{2^{1+\lambda }}{\Gamma (-\lambda )} \left( \frac{\left| \xi \right| }{c} \right) ^{-\lambda - d/2} K_{d/2+\lambda }(c \left| \xi \right| ),\\ \Phi _\lambda (\xi ) := {}&\left( c^2 + \left| \xi \right| ^2 \right) ^\lambda . \end{aligned}$$
We want to show that for all \(\lambda \in G\)
$$\begin{aligned} \int _{\mathbb {R}^d} \Phi _\lambda (\xi )\widehat{\gamma }(\xi ) \mathrm {d}\xi = \int _{\mathbb {R}^d} \varphi _\lambda (\xi ) \gamma (\xi ) \mathrm {d}\xi , \quad \text {for all } \gamma \in \mathcal {S}_{2m}, \end{aligned}$$
which is so far true for \(\lambda \) real and \( \lambda < -d/2 \) by (7.5). As the integrands \( \Phi _\lambda \widehat{\gamma } \) and \( \varphi _\lambda \gamma \) are analytic in \( \lambda \), they can be expressed in terms of Cauchy integral formulas. The integral functions
$$\begin{aligned} f_1(\lambda )&= \int _{\mathbb {R}^d} \Phi _\lambda (\xi )\widehat{\gamma }(\xi ) \mathrm {d}\xi = \int _{\mathbb {R}^d} \frac{1}{2 \pi \mathrm {i}} \int _{\mathcal C} \frac{\Phi _z(\xi )}{z - \lambda } \mathrm {d}z \, \widehat{\gamma }(\xi ) \mathrm {d}\xi ,\\ f_2(\lambda )&= \int _{\mathbb {R}^d} \varphi _\lambda (\xi ) \gamma (\xi ) \mathrm {d}\xi = \int _{\mathbb {R}^d} \frac{1}{2 \pi \mathrm {i}} \int _{\mathcal C} \frac{\varphi _z(\xi )}{z - \lambda } \mathrm {d}z \, \gamma (\xi ) \mathrm {d}\xi , \end{aligned}$$
will also be analytic as soon as we can find a uniform dominating function of the integrands on an arbitrary compact curve \( \mathcal {C} \subset G \), to allow the application of the Fubini-Tonelli theorem and derive corresponding Cauchy integral formulas for \(f_1\) and \(f_2\) (see the details of the proofs of [36, Theorem 8.15] and [36, Theorem 8.16]). A dominating function for the integrand of \(f_1(\lambda )\) is easily obtained thanks to the decay of \( \widehat{\gamma } \in \mathcal {S} \), which is faster than the growth of any polynomial (notice that \(\mathrm{Re}(\lambda )<m\)). It remains to find a dominating function for the integrand of \(f_2(\lambda )\). Setting \( b := \mathrm{Re}(\lambda ) \), for \( \xi \) close to 0 we get, by using the bound
$$\begin{aligned} \left| K_\nu (r) \right| \le {\left\{ \begin{array}{ll} 2^{\left| \mathrm{Re}(\nu ) \right| - 1}\Gamma \left( \left| \mathrm{Re}(\nu ) \right| \right) r^{-\left| \mathrm{Re}(\nu ) \right| }, &{}\mathrm{Re}(\nu ) \ne 0,\\ \frac{1}{\mathrm {e}} - \log \frac{r}{2},&{}r < 2, \mathrm{Re}(\nu ) = 0, \end{array}\right. } \end{aligned}$$ (7.7)
for \( \nu \in \mathbb {C}, r > 0 \), as derived in [36, Lemma 5.14], that
$$\begin{aligned} \left| \varphi _z(\xi ) \gamma (\xi ) \right| \le C_\gamma \frac{2^{b+\left| b + d/2 \right| }\Gamma (\left| b + d/2 \right| )}{\left| \Gamma (-\lambda ) \right| }c^{b+d/2-\left| b+d/2 \right| }\left| \xi \right| ^{-b-d/2-\left| b+d/2 \right| +2m} \end{aligned}$$ (7.8)
for \( b \ne -d/2 \) and
$$\begin{aligned} \left| \varphi _z(\xi )\gamma (\xi ) \right| \le C_{z} \frac{2^{1-d/2}}{\left| \Gamma (-\lambda ) \right| }\left( \frac{1}{\mathrm {e}} - \log \frac{c \left| \xi \right| }{2} \right) \end{aligned}$$
for \( b = -d/2 \). Taking into account that \( \mathcal {C} \) is compact and \( 1/\Gamma \) is an entire function, this yields
$$\begin{aligned} \left| \varphi _z(\xi )\gamma (\xi ) \right| \le C_{m,c,\mathcal {C}} \left( 1 + \left| \xi \right| ^{-d+2\varepsilon } - \log \frac{c \left| \xi \right| }{2} \right) , \end{aligned}$$
with \( \left| \xi \right| < \min \left\{ 1/c,1 \right\} \) and \( \varepsilon := m-b>0\), which is locally integrable. For \( \xi \) large, we similarly use the estimate for large r,
$$\begin{aligned} \left| K_\nu (r) \right| \le \sqrt{\frac{2\pi }{r}} \mathrm {e}^{-r} \mathrm {e}^{\left| \mathrm{Re}(\nu ) \right| ^2/(2r)}, \quad r > 0, \end{aligned}$$ (7.9)
from [36, Lemma 5.14] to obtain
$$\begin{aligned} \left| \varphi _z(\xi )\gamma (\xi ) \right| \le C_{\mathcal C} \frac{2^{1+b}\sqrt{2\pi }}{\left| \Gamma (-\lambda ) \right| }c^{b+(d-1)/2} \left| \xi \right| ^{-b-(d+1)/2} \mathrm {e}^{-c \left| \xi \right| } \mathrm {e}^{\left| b+d/2 \right| ^2/(2c \left| \xi \right| )} \end{aligned}$$
and consequently
$$\begin{aligned} \left| \varphi _\lambda (\xi )\gamma (\xi ) \right| \le C_{\gamma ,m,\mathcal {C},c} \, \mathrm {e}^{-c \left| \xi \right| }, \end{aligned}$$
which certainly is integrable.

2.
We want to pass to \( c \rightarrow 0 \) in formula (7.6). This can be done by applying the dominated convergence theorem in the definition of the generalized Fourier transform (7.3). Writing \( \Psi _c(x) := \left( c^2+ \left| x \right| ^2 \right) ^{\beta /2} \) for \( c>0 \), we know that
$$\begin{aligned} \widehat{\Psi }_c(\xi ) = \psi _c(\xi ) := (2\pi )^{d/2} \frac{2^{1+\beta /2}}{\left| \Gamma (-\beta /2) \right| } \left| \xi \right| ^{-\beta - d}(c \left| \xi \right| )^{(\beta +d)/2}K_{(\beta +d)/2}(c \left| \xi \right| ). \end{aligned}$$
By using the decay properties of a \( \gamma \in \mathcal {S}_{2m} \) in the estimate (7.8), we get
$$\begin{aligned} \left| \psi _c(\xi )\gamma (\xi ) \right| \le C_\gamma \frac{2^{\beta +d/2}\Gamma ((\beta +d)/2)}{\left| \Gamma (-\beta /2) \right| } \left| \xi \right| ^{2m-\beta -d} \quad \text {for } \left| \xi \right| \rightarrow 0 \end{aligned}$$ (7.10)
and
$$\begin{aligned} \left| \psi _c(\xi )\gamma (\xi ) \right| \le C_\gamma \frac{2^{\beta +d/2}\Gamma ((\beta +d)/2)}{\left| \Gamma (-\beta /2) \right| } \left| \xi \right| ^{-\beta -d}, \end{aligned}$$
yielding the desired uniform dominating function. The claim now follows by also taking into account that
$$\begin{aligned} \lim _{r\rightarrow 0} r^\nu K_\nu (r) = \lim _{r\rightarrow 0} 2^{\nu -1} \int _{0}^{\infty } \mathrm {e}^{-t} \mathrm {e}^{-r^2/(4t)} t^{\nu -1} \mathrm {d}t = 2^{\nu -1} \Gamma (\nu ). \end{aligned}$$
\(\square \)
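The two Bessel estimates used in the proof and the small-argument limit can be observed numerically for real orders. The sketch below assumes the readings \( \left| K_\nu (r) \right| \le 2^{\left| \nu \right| -1}\Gamma (\left| \nu \right| ) r^{-\left| \nu \right| } \) for (7.7) with real \( \nu \ne 0 \), \( \left| K_\nu (r) \right| \le \sqrt{2\pi /r}\, \mathrm {e}^{-r} \mathrm {e}^{\nu ^2/(2r)} \) for (7.9), and \( \lim _{r \rightarrow 0} r^\nu K_\nu (r) = 2^{\nu -1}\Gamma (\nu ) \) for \( \nu > 0 \). The quadrature routine `bessel_k` is a hypothetical helper based on the standard representation \( K_\nu (r) = \int _0^\infty \mathrm {e}^{-r\cosh t}\cosh (\nu t) \mathrm {d}t \), with ad hoc discretization parameters.

```python
import math


def bessel_k(nu, r, upper=25.0, n=25000):
    """Trapezoidal approximation of K_nu(r) for real nu and r > 0."""
    h = upper / n

    def integrand(t):
        return math.exp(-r * math.cosh(t)) * math.cosh(nu * t)

    total = 0.5 * (integrand(0.0) + integrand(upper))
    total += sum(integrand(i * h) for i in range(1, n))
    return h * total


def small_argument_bound(nu, r):
    """Right-hand side of the first case of (7.7) for real nu != 0."""
    a = abs(nu)
    return 2.0 ** (a - 1.0) * math.gamma(a) * r ** (-a)


def large_argument_bound(nu, r):
    """Right-hand side of (7.9) for real nu."""
    return math.sqrt(2.0 * math.pi / r) * math.exp(-r) * math.exp(nu * nu / (2.0 * r))


def scaled_bessel(nu, r):
    """The quantity r**nu * K_nu(r) whose limit for r -> 0 is used in the proof."""
    return r ** nu * bessel_k(nu, r)


def limit_value(nu):
    """Claimed limit 2**(nu - 1) * Gamma(nu) for nu > 0."""
    return 2.0 ** (nu - 1.0) * math.gamma(nu)


if __name__ == "__main__":
    nu = 1.5
    for r in (0.1, 1.0, 5.0):
        print(r, bessel_k(nu, r), small_argument_bound(nu, r), large_argument_bound(nu, r))
    print(scaled_bessel(nu, 1e-3), limit_value(nu))
```

The first bound is sharp as \( r \rightarrow 0 \) and loose for large r, while the second shows the opposite behavior, which is why the proof switches between them for \( \xi \) close to 0 and \( \xi \) large.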
Cite this article
Fornasier, M., Hütter, JC. Consistency of Probability Measure Quantization by Means of Power Repulsion–Attraction Potentials. J Fourier Anal Appl 22, 694–749 (2016). https://doi.org/10.1007/s00041-015-9432-z
Keywords
 Variational measure quantization
 Fourier–Stieltjes transform
 Total variation regularization
 Gamma convergence
Mathematics Subject Classification
 28A33
 42B10
 49J45
 49M25
 60E10
 65D30
 90C26