
Mathematical Programming, Volume 156, Issue 1–2, pp 549–579

Variable metric random pursuit

  • S. U. Stich
  • C. L. Müller
  • B. Gärtner
Full Length Paper · Series A

Abstract

We consider unconstrained randomized optimization of smooth convex objective functions in the gradient-free setting. We analyze Random Pursuit (RP) algorithms with fixed (F-RP) and variable metric (V-RP). The algorithms use only zeroth-order information about the objective function and compute an approximate solution by repeated optimization over randomly chosen one-dimensional subspaces; the distribution of search directions is dictated by the chosen metric. Variable metric RP uses novel variants of a randomized zeroth-order Hessian approximation scheme recently introduced by Leventhal and Lewis (Optimization 60(3):329–345, 2011. doi: 10.1080/02331930903100141). Here we present (1) a refined analysis of the expected single-step progress of RP algorithms and their global convergence on (strictly) convex functions, and (2) novel convergence bounds for V-RP on strongly convex functions. We also quantify how closely the employed metric must match the local geometry of the function for the RP algorithms to converge at the best possible rate. Our theoretical results are accompanied by numerical experiments comparing V-RP with the derivative-free schemes CMA-ES, Implicit Filtering, Nelder–Mead, NEWUOA, Pattern Search, and Nesterov's gradient-free algorithms.
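The abstract describes two ingredients: a line search over randomly chosen one-dimensional subspaces, and a randomized zeroth-order curvature estimate that shapes the sampling metric. The Python sketch below illustrates both under strong simplifications; the function names, the use of SciPy's `minimize_scalar` as the line search, the finite-difference step `r`, the ridge term, and the update schedule are illustrative assumptions, not the paper's exact algorithms (see the paper and reference [16] for the precise schemes).

```python
import numpy as np
from scipy.optimize import minimize_scalar


def rp_step(f, x, cov, rng):
    """One Random Pursuit step: line search along a random direction.

    The search direction is drawn from N(0, cov); the chosen metric
    (through its inverse, `cov`) dictates this distribution.
    """
    u = rng.multivariate_normal(np.zeros(len(x)), cov)
    # One-dimensional minimization over the random subspace x + t*u.
    t = minimize_scalar(lambda t: f(x + t * u)).x
    return x + t * u


def curvature_update(f, x, B, rng, r=1e-4):
    """Leventhal-Lewis-style randomized curvature update (a sketch).

    Estimates the curvature u^T H u along a random unit direction u by a
    central finite difference, then corrects B so that u^T B u matches it:
        B <- B + (u^T H u - u^T B u) u u^T.
    """
    u = rng.standard_normal(len(x))
    u /= np.linalg.norm(u)
    uhu = (f(x + r * u) - 2.0 * f(x) + f(x - r * u)) / r**2
    return B + (uhu - u @ B @ u) * np.outer(u, u)


def variable_metric_rp(f, x0, iters=300, seed=0):
    """Sketch of V-RP: interleave metric learning with random pursuit."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    B = np.eye(len(x))  # running Hessian approximation (the metric)
    for _ in range(iters):
        B = curvature_update(f, x, B, rng)
        # Sample directions with covariance ~ B^{-1}, so the metric
        # approximately whitens the local geometry. The small ridge keeps
        # the covariance well defined; a real implementation needs more
        # care to preserve positive definiteness.
        cov = np.linalg.inv(B + 1e-8 * np.eye(len(x)))
        x = rp_step(f, x, cov, rng)
    return x


if __name__ == "__main__":
    # Ill-conditioned quadratic: with the identity metric (F-RP), progress
    # per step degrades with the condition number; the learned metric adapts.
    H = np.diag([1.0, 100.0])
    f = lambda z: 0.5 * z @ H @ z
    print(variable_metric_rp(f, np.array([5.0, 5.0])))
```

On the quadratic above the finite-difference curvature estimate is essentially exact, so `B` converges to `H` and the direction distribution aligns with the level sets, matching the intuition stated in the abstract that the metric should match the local geometry for the best possible rate.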

Keywords

Gradient-free optimization · Convex optimization · Variable metric · Line search

Mathematics Subject Classification

90C25 · 90C56 · 90C53 · 68Q25

Notes

Acknowledgments

We would like to thank the anonymous reviewers, whose comments and suggestions greatly helped to improve the quality and content of this paper.

References

  1. Adamczak, R., Litvak, A.E., Pajor, A., Tomczak-Jaegermann, N.: Quantitative estimates of the convergence of the empirical covariance matrix in log-concave ensembles. J. AMS 23, 535–561 (2010). doi: 10.1090/S0894-0347-09-00650-X
  2. Armijo, L.: Minimization of functions having Lipschitz continuous first partial derivatives. Pac. J. Math. 16(1), 1–3 (1966). http://projecteuclid.org/euclid.pjm/1102995080
  3. Brockhoff, D., Auger, A., Hansen, N., Arnold, D., Hohm, T.: Mirrored sampling and sequential selection for evolution strategies. In: Schaefer, R., Cotta, C., Kołodziej, J., Rudolph, G. (eds.) PPSN XI, LNCS, vol. 6238, pp. 11–21. Springer, Berlin, Heidelberg (2011). doi: 10.1007/978-3-642-15844-5_2
  4. Broyden, C.G.: The convergence of a class of double-rank minimization algorithms 1. General considerations. IMA J. Appl. Math. 6(1), 76–90 (1970). doi: 10.1093/imamat/6.1.76. http://imamat.oxfordjournals.org/content/6/1/76.abstract
  5. Davidon, W.C.: Variable metric method for minimization. SIAM J. Optim. 1(1), 1–17 (1991). doi: 10.1137/0801001. http://link.aip.org/link/?SJE/1/1/1
  6. Fletcher, R.: A new approach to variable metric algorithms. Comput. J. 13(3), 317–322 (1970). doi: 10.1093/comjnl/13.3.317. http://comjnl.oxfordjournals.org/content/13/3/317.abstract
  7. Goldfarb, D.: A family of variable-metric methods derived by variational means. Math. Comput. 24(109), 23–26 (1970). http://www.jstor.org/stable/2004873
  8. Goldstein, A.: On steepest descent. J. Soc. Ind. Appl. Math. Ser. A Control 3(1), 147–151 (1965). doi: 10.1137/0303013
  9. Hansen, N., Ostermeier, A.: Completely derandomized self-adaptation in evolution strategies. Evol. Comput. 9(2), 159–195 (2001)
  10. Heijmans, R.: When does the expectation of a ratio equal the ratio of expectations? Stat. Pap. 40, 107–115 (1999)
  11. Horn, R.A., Johnson, C.R.: Matrix Analysis, reprint 1990 edn. Cambridge University Press, Cambridge (1985)
  12. Hu, T.C., Klee, V., Larman, D.: Optimization of globally convex functions. SIAM J. Control Optim. 27(5), 1026–1047 (1989). doi: 10.1137/0327055. http://link.aip.org/link/?SJC/27/1026/1
  13. Jägersküpper, J.: Lower bounds for hit-and-run direct search. In: Hromkovic, J., Královic, R., Nunkesser, M., Widmayer, P. (eds.) Stochastic Algorithms: Foundations and Applications, LNCS, vol. 4665, pp. 118–129. Springer, Berlin (2007)
  14. Kelley, C.T.: Implicit Filtering. SIAM, Philadelphia, PA (2011)
  15. Kjellström, G., Taxen, L.: Stochastic optimization in system design. IEEE Trans. Circuits Syst. 28(7), 702–715 (1981). doi: 10.1109/TCS.1981.1085030
  16. Leventhal, D., Lewis, A.S.: Randomized Hessian estimation and directional search. Optimization 60(3), 329–345 (2011). doi: 10.1080/02331930903100141
  17. Marti, K.: Controlled random search procedures for global optimization. In: Arkin, V., Shiraev, A., Wets, R. (eds.) Stochastic Optimization, Lecture Notes in Control and Information Sciences, vol. 81, pp. 457–474. Springer, Berlin (1986)
  18. Mathai, A.M., Provost, S.B.: Quadratic Forms in Random Variables: Theory and Applications. Statistics: Textbooks and Monographs, no. 126. Dekker, New York (1992)
  19. Müller, C.L., Sbalzarini, I.F.: Gaussian adaptation revisited – an entropic view on covariance matrix adaptation. In: Di Chio, C., et al. (eds.) EvoApplications, LNCS, vol. 6024, pp. 432–441. Springer, Berlin (2010)
  20. Nelder, J.A., Mead, R.: A simplex method for function minimization. Comput. J. 7(4), 308–313 (1965). doi: 10.1093/comjnl/7.4.308. http://comjnl.oxfordjournals.org/content/7/4/308.abstract
  21. Nesterov, Y.: Random Gradient-Free Minimization of Convex Functions. Technical report, ECORE (2011)
  22. Powell, M.: The NEWUOA software for unconstrained optimization without derivatives. In: Pillo, G., Roma, M. (eds.) Large-Scale Nonlinear Optimization, Nonconvex Optimization and Its Applications, vol. 83, pp. 255–297. Springer, US (2006). doi: 10.1007/0-387-30065-1_16
  23. Puntanen, S., Styan, G.P.H., Isotalo, J.: Matrix Tricks for Linear Statistical Models: Our Personal Top Twenty. Springer, Berlin, Heidelberg (2011)
  24. Rosenbrock, H.H.: An automatic method for finding the greatest or least value of a function. Comput. J. 3(3), 175–184 (1960). doi: 10.1093/comjnl/3.3.175. http://comjnl.oxfordjournals.org/content/3/3/175.abstract
  25. Schumer, M., Steiglitz, K.: Adaptive step size random search. IEEE Trans. Autom. Control 13(3), 270–276 (1968). doi: 10.1109/TAC.1968.1098903
  26. Shanno, D.F.: Conditioning of quasi-Newton methods for function minimization. Math. Comput. 24(111), 647–656 (1970). http://www.jstor.org/stable/2004840
  27. Stich, S.U.: Convex Optimization with Random Pursuit. PhD thesis, ETH Zurich (2014). doi: 10.3929/ethz-a-010377352
  28. Stich, S.U., Müller, C.L.: On spectral invariance of randomized Hessian and covariance matrix adaptation schemes. In: Coello, C., Cutello, V., Deb, K., Forrest, S., Nicosia, G., Pavone, M. (eds.) Parallel Problem Solving from Nature – PPSN XII, LNCS, vol. 7491, pp. 448–457. Springer, Berlin, Heidelberg (2012)
  29. Stich, S.U., Müller, C.L., Gärtner, B.: Optimization of convex functions with random pursuit. SIAM J. Optim. 23(2), 1284–1309 (2013)
  30. Stich, S.U., Müller, C.L., Gärtner, B.: Supporting online material for: variable metric random pursuit. arXiv:1210.5114 (2014)
  31. Wedderburn, J.H.M.: Lectures on Matrices. Colloquium Publications. AMS, New York (1938)
  32. Wolfe, P.: Convergence conditions for ascent methods. SIAM Rev. 11(2), 226–235 (1969). doi: 10.1137/1011036
  33. Wolfe, P.: Convergence conditions for ascent methods. II: Some corrections. SIAM Rev. 13(2), 185–188 (1971). doi: 10.1137/1013035

Copyright information

© Springer-Verlag Berlin Heidelberg and Mathematical Optimization Society 2015

Authors and Affiliations

  1. Institute of Theoretical Computer Science, ETH Zurich, Zurich, Switzerland
  2. ICTEAM Institute/CORE, Université catholique de Louvain, Louvain-la-Neuve, Belgium
  3. Simons Center for Data Analysis, Simons Foundation, New York, USA
