Skip to main content

On Fixed Convex Combinations of No-Regret Learners

  • Conference paper
  • 2340 Accesses

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 5632))

Abstract

No-regret algorithms for online convex optimization are potent online learning tools and have been demonstrated to be successful in a wide-ranging number of applications. Considering affine and external regret, we investigate what happens when a set of no-regret learners (voters) merge their respective decisions in each learning iteration to a single, common one in form of a convex combination. We show that an agent (or algorithm) that executes this merged decision in each iteration of the online learning process and each time feeds back a copy of its own reward function to the voters, incurs sublinear regret itself. As a by-product, we obtain a simple method that allows us to construct new no-regret algorithms out of known ones.

This is a preview of subscription content, log in via an institution.

Buying options

Chapter
USD   29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD   84.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD   109.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Blum, A., Even-Dar, E., Ligett, K.: Routing without regret: on convergence to nash equilibria of regret-minimizing algorithms in routing games. In: PODC 2006: Proceedings of the twenty-fifth annual ACM symposium on Principles of distributed computing (2006)

    Google Scholar 

  2. Blum, A., Mansour, Y.: From external to internal regret. In: Auer, P., Meir, R. (eds.) COLT 2005. LNCS, vol. 3559, pp. 621–636. Springer, Heidelberg (2005)

    Chapter  Google Scholar 

  3. Blum, A., Kumar, V., Rudra, A., Wu, F.: Online learning in online auctions. Theor. Comput. Sci. 324(2-3), 137–146 (2004)

    Article  MathSciNet  MATH  Google Scholar 

  4. Boyd, S., Vandenberghe, L.: Convex optimization. Cambridge University Press, Cambridge (2004)

    Book  MATH  Google Scholar 

  5. Calliess, J.-P.: On fixed convex combinations of no-regret learners, Tech. Report CMU-ML-08-112, Carnegie Mellon (2008)

    Google Scholar 

  6. Calliess, J.-P., Gordon, G.J.: No-regret learning and a mechanism for distributed multiagent planning. In: Proc. of 7th Int. Conf. on Autonomous Agents and Multiagent Systems (AAMAS 2008) (2008)

    Google Scholar 

  7. Foster, D., Vohra, R.: Calibrated learning and correlated equilibrium. Games and Economic Behavior (1997)

    Google Scholar 

  8. Freund, Y., Shapire, R.E.: Game theory, on-line prediction and boosting. In: Schölkopf, B., Warmuth, M.K. (eds.) COLT/Kernel 2003. LNCS, vol. 2777. Springer, Heidelberg (2003)

    Chapter  Google Scholar 

  9. Gordon, G.: No-regret algorithms for online convex programs. In: Advances in Neural Information Processing Systems, vol. 19 (2007)

    Google Scholar 

  10. Gordon, G.J.: Approximate solutions to markov decision processes, Ph.D. thesis, Carnegie Mellon University (1999)

    Google Scholar 

  11. Gordon, G.J., Greenwald, A., Marks, C.: No-regret learning in convex games. In: 25th Int. Conf. on Machine Learning (ICML 2008) (2008)

    Google Scholar 

  12. Hannan, J.: Contributions to the theory of games. Princeton University Press, Princeton (1957)

    Google Scholar 

  13. Jafari, A., Greenwald, A.R., Gondek, D., Ercal, G.: On no-regret learning, fictitious play, and nash equilibrium. In: ICML 2001: Proceedings of the Eighteenth International Conference on Machine Learning, pp. 226–233 (2001)

    Google Scholar 

  14. Kalai, A., Vempala, S.: Efficient algorithms for online decision problems. In: Schölkopf, B., Warmuth, M.K. (eds.) COLT/Kernel 2003. LNCS, vol. 2777, pp. 26–40. Springer, Heidelberg (2003)

    Chapter  Google Scholar 

  15. Littlestone, N., Warmuth, M.K.: The weighted majority algorithm. In: IEEE Symposium on Foundations of Computer Science, pp. 256–261 (1989)

    Google Scholar 

  16. Sahota, M.K., Mackworth, A.K., Barman, R.A., Kingdon, S.J.: Real-time control of soccer-playing robots using off-board vision: the dynamite testbed. In: IEEE International Conference on Systems, Man, and Cybernetics, pp. 3690–3663 (1995)

    Google Scholar 

  17. Shapire, R.E.: The strength of weak learnability. Machine Learning 5(2), 197–227 (1990); First boosting method

    Google Scholar 

  18. Stoltz, G., Lugosi, G.: Learning correlated equilibria in games with compact sets of strategies. Games and Economic Behavior 59, 187–208 (2007)

    Article  MathSciNet  MATH  Google Scholar 

  19. Zinkevich, M.: Online convex programming and generalized infinitesimal gradient ascent. In: Twentieth International Conference on Machine Learning (2003)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Calliess, JP. (2009). On Fixed Convex Combinations of No-Regret Learners. In: Perner, P. (eds) Machine Learning and Data Mining in Pattern Recognition. MLDM 2009. Lecture Notes in Computer Science(), vol 5632. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03070-3_37

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-03070-3_37

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-03069-7

  • Online ISBN: 978-3-642-03070-3

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics