Abstract
When performing a learning task on voluminous data, memory and computational time can become prohibitive. In this chapter, we propose a framework for estimating the parameters of a density mixture from training data in a compressive manner, by computing a low-dimensional sketch of the data. The sketch represents empirical moments of the underlying probability distribution. Instantiating the framework in the case where the densities are isotropic Gaussians, we derive a reconstruction algorithm by analogy with compressed sensing. We show experimentally that the mixture parameters can be estimated precisely provided that the sketch is large enough, while consuming less memory when the dataset is large. The framework also provides a privacy-preserving data analysis tool, since the sketch does not disclose information about the individual data points on which it is based.
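As an illustration of what such a sketch might look like, the following minimal Python example averages complex exponentials of the data at a few random frequencies, which amounts to sampling the empirical characteristic function. This particular choice of moments, the sign convention, the Gaussian frequency distribution, and all names (`empirical_sketch`, `gmm_sketch`, `W`) are assumptions made for illustration; the abstract does not specify them. The example compares the data sketch with the closed-form sketch of a mixture of isotropic Gaussians, using the standard characteristic function of \(\mathcal{N}(\mu, \sigma^2 I)\), namely \(\exp(i\langle w, \mu\rangle - \sigma^2\|w\|^2/2)\).

```python
import numpy as np

def empirical_sketch(X, W):
    """Average of exp(i <w_j, x>) over the rows x of X, one entry per frequency w_j."""
    return np.exp(1j * (X @ W.T)).mean(axis=0)

def gmm_sketch(weights, means, sigma2, W):
    """Closed-form sketch of a mixture of isotropic Gaussians N(mu_l, sigma2 * I)."""
    # Characteristic function of N(mu, sigma2 * I) at w:
    #   exp(i <w, mu> - sigma2 * ||w||^2 / 2)
    damping = np.exp(-0.5 * sigma2 * np.sum(W**2, axis=1))  # shape (m,)
    phases = np.exp(1j * (means @ W.T))                     # shape (k, m)
    return damping * (weights @ phases)                     # shape (m,)

# Hypothetical toy setting: n points in dimension d, k components, m frequencies.
rng = np.random.default_rng(0)
n, d, k, m = 20000, 2, 3, 50
weights = np.array([0.5, 0.3, 0.2])
means = rng.normal(scale=3.0, size=(k, d))
sigma2 = 1.0

# Draw the data from the mixture.
labels = rng.choice(k, size=n, p=weights)
X = means[labels] + np.sqrt(sigma2) * rng.normal(size=(n, d))

# Random frequencies (an assumed design, not taken from the chapter).
W = rng.normal(scale=0.5, size=(m, d))

z_hat = empirical_sketch(X, W)                    # m complex numbers, whatever n is
z_model = gmm_sketch(weights, means, sigma2, W)   # sketch of the true mixture
err = np.max(np.abs(z_hat - z_model))
print("max sketch discrepancy:", err)
```

The sketch occupies `m` complex numbers regardless of `n`, and because it is a plain average it can be accumulated in a single pass over the data. Estimating the mixture parameters would then mean searching for weights and means whose closed-form sketch is close to `z_hat`; that reconstruction step is not shown here.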
Notes
1. Namely, a model of probability measures.
2. We also performed experiments where all the weights were equal to \(\frac{1}{k}\) and this didn’t alter the conclusions drawn from the experiments.
Acknowledgements
This work was supported in part by the European Research Council, PLEASE project (ERC-StG-2011-277906).
Copyright information
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Bourrier, A., Gribonval, R., Pérez, P. (2015). Compressive Gaussian Mixture Estimation. In: Boche, H., Calderbank, R., Kutyniok, G., Vybíral, J. (eds) Compressed Sensing and its Applications. Applied and Numerical Harmonic Analysis. Birkhäuser, Cham. https://doi.org/10.1007/978-3-319-16042-9_8
DOI: https://doi.org/10.1007/978-3-319-16042-9_8
Publisher Name: Birkhäuser, Cham
Print ISBN: 978-3-319-16041-2
Online ISBN: 978-3-319-16042-9
eBook Packages: Mathematics and Statistics (R0)