Optimization via Information Geometry

Conference paper
Part of the Springer Proceedings in Mathematics & Statistics book series (PROMS, volume 114)


Information Geometry has been used to inspire efficient algorithms for black-box optimization, both in the combinatorial and in the continuous case. We give an overview of the authors’ research program and some specific contributions to the underlying theory.


Keywords (machine-generated, not supplied by the authors): Vector Bundle; Exponential Family; Bounded Continuous Function; Natural Gradient; Information Geometry



Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  1. Dipartimento di Informatica, Università degli Studi di Milano, Milano, Italy
  2. Castro Statistics, Collegio Carlo Alberto, Moncalieri, Italy
