Algorithmic Connections between Active Learning and Stochastic Convex Optimization

  • Aaditya Ramdas
  • Aarti Singh
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8139)

Abstract

Recent papers have established interesting theoretical connections between active learning and stochastic convex optimization, owing to the common role of feedback in their sequential querying mechanisms. In this paper, we continue this thread in two parts, exploiting these relations for the first time to yield novel algorithms in both fields and further motivating the study of their intersection. First, inspired by a recent optimization algorithm that adapts to unknown uniform convexity parameters, we present a new active learning algorithm for one-dimensional thresholds that achieves minimax rates by adapting to unknown noise parameters. Next, we show that d-dimensional stochastic minimization of smooth uniformly convex functions is possible when the oracle returns only noisy gradient signs along any coordinate, rather than real-valued gradients: a simple randomized coordinate descent procedure, in which each line search is solved by one-dimensional active learning, provably achieves the same error convergence rate as if the entire real-valued gradient were available. Combining the two parts yields an algorithm that solves stochastic convex optimization of uniformly convex and smooth functions using only noisy gradient signs, by repeatedly performing active learning; it achieves optimal rates and adapts to all unknown convexity and smoothness parameters.
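The coordinate-descent template in the second part can be illustrated concretely. Below is a minimal Python sketch, not the procedure analyzed in the paper: it assumes a constant-probability sign-flip noise model and uses a simple majority-vote bisection as the one-dimensional active learner; the function names and parameters (flip_prob, repeats, bisections, epochs) are hypothetical choices for the toy example, whereas the paper's adaptive subroutine is what yields the stated minimax and optimal convergence guarantees.

```python
import numpy as np

def noisy_grad_sign(grad, x, coord, flip_prob, rng):
    """Hypothetical oracle: sign of the partial derivative of f at x along
    `coord`, flipped with probability `flip_prob` (bounded-noise assumption)."""
    s = np.sign(grad(x)[coord])
    return -s if rng.random() < flip_prob else s

def active_line_search(x, coord, lo, hi, oracle, bisections=40, repeats=15):
    """1-d minimization along `coord` via a bisection-style active learner:
    a majority vote over repeated noisy sign queries at the midpoint decides
    which half of the interval to keep."""
    for _ in range(bisections):
        mid = 0.5 * (lo + hi)
        x_mid = x.copy()
        x_mid[coord] = mid
        votes = sum(oracle(x_mid, coord) for _ in range(repeats))
        if votes > 0:   # gradient likely positive => minimizer lies to the left
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def sign_based_coordinate_descent(grad, x0, bounds, epochs=60,
                                  flip_prob=0.3, seed=0):
    """Randomized coordinate descent driven only by noisy gradient signs:
    each epoch picks a random coordinate and re-optimizes it with the
    1-d active-learning line search above."""
    rng = np.random.default_rng(seed)
    oracle = lambda z, c: noisy_grad_sign(grad, z, c, flip_prob, rng)
    x = np.array(x0, dtype=float)
    for _ in range(epochs):
        coord = rng.integers(len(x))
        x[coord] = active_line_search(x, coord, *bounds[coord], oracle)
    return x

if __name__ == "__main__":
    # Toy usage: a smooth, strongly convex quadratic on [-5, 5]^3
    # with minimizer (1, -2, 0.5).
    A = np.diag([1.0, 3.0, 0.5])
    target = np.array([1.0, -2.0, 0.5])
    grad = lambda x: 2.0 * A @ (x - target)
    x_hat = sign_based_coordinate_descent(grad, x0=[4.0, 4.0, 4.0],
                                          bounds=[(-5.0, 5.0)] * 3)
    print(np.round(x_hat, 3))   # close to [ 1.  -2.   0.5]
```

Even this crude variant converges on the toy quadratic; the abstract's claim is that a properly designed active-learning line search provably matches the error convergence rate obtained with full real-valued gradients.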


Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Aaditya Ramdas (1)
  • Aarti Singh (1)
  1. Carnegie Mellon University, Pittsburgh, USA
