On Fixed Convex Combinations of No-Regret Learners

Calliess, Jan-P.

doi:10.1007/978-3-642-03070-3_37

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5632)

Abstract

No-regret algorithms for online convex optimization are potent online learning tools and have proven successful in a wide range of applications. Considering affine and external regret, we investigate what happens when a set of no-regret learners (voters) merge their respective decisions in each learning iteration into a single, common one in the form of a convex combination. We show that an agent (or algorithm) that executes this merged decision in each iteration of the online learning process, and each time feeds back a copy of its own reward function to the voters, incurs sublinear regret itself. As a by-product, we obtain a simple method for constructing new no-regret algorithms out of known ones.
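The merging scheme described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes linear reward functions r_t(x) = <g_t, x>, the unit Euclidean ball as the feasible set, and projected online gradient ascent as the voters. All names (`GradientAscentVoter`, `run_merged_learner`, `project_to_ball`) are hypothetical and chosen only for illustration.

```python
import math
import random


def project_to_ball(x, radius=1.0):
    """Euclidean projection of x onto the ball of the given radius."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm <= radius:
        return list(x)
    return [v * radius / norm for v in x]


class GradientAscentVoter:
    """Assumed voter: projected online gradient ascent with step c / sqrt(t)."""

    def __init__(self, dim, step_scale):
        self.x = [0.0] * dim
        self.step_scale = step_scale
        self.t = 0

    def decide(self):
        return list(self.x)

    def update(self, reward_gradient):
        # For a linear reward the gradient is the same at every point,
        # so the voter does not need to know which decision was executed.
        self.t += 1
        eta = self.step_scale / math.sqrt(self.t)
        stepped = [xi + eta * g for xi, g in zip(self.x, reward_gradient)]
        self.x = project_to_ball(stepped)


def run_merged_learner(weights, voters, reward_gradients):
    """Play the fixed convex combination of the voters' decisions each round.

    Returns the external regret against the best fixed point in the unit ball.
    """
    assert abs(sum(weights) - 1.0) < 1e-12 and all(w >= 0 for w in weights)
    dim = len(reward_gradients[0])
    total_reward = 0.0
    grad_sum = [0.0] * dim
    for g in reward_gradients:
        decisions = [v.decide() for v in voters]
        merged = [sum(w * d[i] for w, d in zip(weights, decisions))
                  for i in range(dim)]
        total_reward += sum(gi * mi for gi, mi in zip(g, merged))
        for v in voters:
            v.update(g)  # every voter receives a copy of the same reward
        grad_sum = [s + gi for s, gi in zip(grad_sum, g)]
    # The best fixed decision in the unit ball earns ||sum of gradients||.
    best_fixed_reward = math.sqrt(sum(s * s for s in grad_sum))
    return best_fixed_reward - total_reward


if __name__ == "__main__":
    rng = random.Random(0)
    horizon = 2000
    grads = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(horizon)]
    voters = [GradientAscentVoter(2, 1.0), GradientAscentVoter(2, 0.5)]
    regret = run_merged_learner([0.3, 0.7], voters, grads)
    print("average regret per round:", regret / horizon)
```

Under these assumptions the merged reward in each round equals the same convex combination of the voters' individual rewards, so the merged learner's regret is the weighted average of the voters' regrets and stays sublinear whenever each voter's regret does.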

Author information

Authors and Affiliations

Machine Learning Dept., Carnegie Mellon University, Pittsburgh, USA
Jan-P. Calliess

Editor information

Editors and Affiliations

Institut für Bildverarbeitung und angewandte Informatik, Körnerstr. 10, 04107, Leipzig, Germany
Petra Perner

Cite this paper

Calliess, J.-P. (2009). On Fixed Convex Combinations of No-Regret Learners. In: Perner, P. (ed.) Machine Learning and Data Mining in Pattern Recognition. MLDM 2009. Lecture Notes in Computer Science, vol 5632. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03070-3_37

DOI: https://doi.org/10.1007/978-3-642-03070-3_37
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03069-7
Online ISBN: 978-3-642-03070-3
eBook Packages: Computer Science, Computer Science (R0)
