
Compressive Gaussian Mixture Estimation

  • Anthony Bourrier
  • Rémi Gribonval (corresponding author)
  • Patrick Pérez
Chapter
Part of the Applied and Numerical Harmonic Analysis book series (ANHA)

Abstract

When performing a learning task on voluminous data, memory and computation time can become prohibitive. In this chapter, we propose a framework for estimating the parameters of a density mixture from training data in a compressive manner, by computing a low-dimensional sketch of the data. The sketch represents empirical moments of the underlying probability distribution. Instantiating the framework in the case where the densities are isotropic Gaussians, we derive a reconstruction algorithm by analogy with compressed sensing. We show experimentally that the mixture parameters can be estimated precisely provided the sketch is large enough, while consuming far less memory than the raw data when the dataset is large. The framework also provides a privacy-preserving data analysis tool, since the sketch does not disclose information about any individual datum it is built from.
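The sketching idea described above can be illustrated with a minimal NumPy example. It assumes, as a simplification, that the sketch consists of samples of the empirical characteristic function at random frequencies (the kind of generalized empirical moments used in this line of work); the specific dimensions, seed, and variable names are illustrative, not taken from the chapter. For an isotropic Gaussian, the population sketch has a closed form, so the empirical sketch can be checked against it:

```python
import numpy as np

rng = np.random.default_rng(0)

d, n, m = 2, 100_000, 50               # data dimension, sample count, sketch size
mu, sigma = np.array([1.0, -2.0]), 0.5  # parameters of one isotropic Gaussian

# Training data: n points drawn from N(mu, sigma^2 I).
X = mu + sigma * rng.standard_normal((n, d))

# Random frequencies defining the sketch operator (one complex moment each).
W = rng.standard_normal((d, m))

# Empirical sketch: the empirical characteristic function sampled at W.
# It is a mean over the data, so it can be computed in one streaming pass
# and stored in O(m) memory, independently of n.
z_emp = np.exp(1j * X @ W).mean(axis=0)

# Closed-form sketch of N(mu, sigma^2 I): its characteristic function at W,
# exp(i w.mu - sigma^2 ||w||^2 / 2). Parameter estimation amounts to finding
# mixture parameters whose closed-form sketch matches the empirical one.
z_true = np.exp(1j * mu @ W - 0.5 * sigma**2 * (W**2).sum(axis=0))

print(np.abs(z_emp - z_true).max())  # small when n is large
```

The gap between the empirical and closed-form sketches shrinks at the usual Monte Carlo rate as n grows, which is what makes fitting mixture parameters to a fixed-size sketch feasible.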

Keywords

Compressed sensing; isotropic Gaussian; mixture density estimation; compressed learning; Hellinger distance

Notes

Acknowledgements

This work was supported in part by the European Research Council, PLEASE project (ERC-StG-2011-277906).


Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Anthony Bourrier (1, 2)
  • Rémi Gribonval (1, corresponding author)
  • Patrick Pérez (2)
  1. Inria Rennes-Bretagne Atlantique, Rennes, France
  2. Technicolor, Cesson-Sévigné, France
