Scale-Free Algorithms for Online Linear Optimization

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9355)

Abstract

We design algorithms for online linear optimization that have optimal regret and at the same time do not need to know any upper or lower bounds on the norm of the loss vectors. We achieve this adaptiveness to the norms of the loss vectors through scale invariance, i.e., our algorithms make exactly the same decisions if the sequence of loss vectors is multiplied by any positive constant. Our algorithms work for any decision set, bounded or unbounded. For unbounded decision sets, these are the first truly adaptive algorithms for online linear optimization.
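
As an informal illustration of the scale-invariance property stated in the abstract, the sketch below is not the paper's algorithm: it runs a simple dual-averaging update in which the cumulative loss vector is divided by the square root of the cumulative squared norms of the observed loss vectors. The function name scale_free_updates and the specific update rule are illustrative assumptions; the point is only that multiplying every loss vector by a positive constant leaves all decisions unchanged.

```python
import numpy as np

def scale_free_updates(losses):
    """Toy dual-averaging rule
        w_{t+1} = -(g_1 + ... + g_t) / sqrt(||g_1||^2 + ... + ||g_t||^2)
    on an unbounded decision set. Illustrative only, not the algorithm from the paper."""
    theta = np.zeros_like(losses[0], dtype=float)   # running sum of loss vectors
    sq_norms = 0.0                                  # running sum of squared norms
    iterates = []
    for g in losses:
        theta += g
        sq_norms += float(np.dot(g, g))
        iterates.append(-theta / np.sqrt(sq_norms))  # snapshot of the new decision
    return iterates

# Scaling all loss vectors by a positive constant does not change the decisions.
rng = np.random.default_rng(0)
losses = [rng.standard_normal(3) for _ in range(5)]
original = scale_free_updates(losses)
rescaled = scale_free_updates([100.0 * g for g in losses])
assert all(np.allclose(a, b) for a, b in zip(original, rescaled))
```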

Keywords

Online Learning, Online Optimization, Bregman Divergence, Bounded Closed Convex Subset, Dual Vector Space

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

Yahoo Labs, New York, USA
