Abstract
No-regret algorithms for online convex optimization are potent online learning tools and have proven successful in a wide range of applications. Considering affine and external regret, we investigate what happens when a set of no-regret learners (voters) merge their respective decisions in each learning iteration into a single, common decision in the form of a convex combination. We show that an agent (or algorithm) that executes this merged decision in each iteration of the online learning process, and each time feeds back a copy of its own reward function to the voters, incurs sublinear regret itself. As a by-product, we obtain a simple method for constructing new no-regret algorithms from known ones.
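The merging scheme described in the abstract can be illustrated with a minimal sketch. It is not the paper's own construction. It assumes linear rewards r_t(x) = g_t · x, the Euclidean unit ball as the feasible set, and projected gradient ascent voters with step size scale / sqrt(t). All names (`GradientAscentVoter`, `run_fixed_combination`, `scales`) are hypothetical and chosen for illustration only.

```python
import numpy as np


def project_to_unit_ball(x):
    """Euclidean projection onto the unit ball (the assumed feasible set)."""
    norm = np.linalg.norm(x)
    return x if norm <= 1.0 else x / norm


class GradientAscentVoter:
    """Hypothetical voter: projected online gradient ascent, step scale/sqrt(t)."""

    def __init__(self, dim, scale):
        self.x = np.zeros(dim)
        self.scale = scale
        self.t = 0

    def predict(self):
        return self.x

    def update(self, grad):
        # The voter receives (the gradient of) a copy of the agent's reward.
        self.t += 1
        step = self.scale / np.sqrt(self.t)
        self.x = project_to_unit_ball(self.x + step * grad)


def run_fixed_combination(gradients, weights, scales):
    """Play the fixed convex combination of the voters' decisions each round.

    Returns the external regret of the merged decisions and of each voter,
    both measured against the best fixed point in hindsight.
    """
    weights = np.asarray(weights, dtype=float)
    voters = [GradientAscentVoter(gradients.shape[1], s) for s in scales]
    merged_reward = 0.0
    voter_rewards = np.zeros(len(voters))
    for g in gradients:
        proposals = np.array([v.predict() for v in voters])
        merged = weights @ proposals          # convex combination of decisions
        merged_reward += g @ merged           # linear reward of the agent
        voter_rewards += proposals @ g        # reward each voter would have earned
        for v in voters:
            v.update(g)                       # same reward function fed to every voter
    # For linear rewards on the unit ball, the best fixed point earns ||sum of g_t||.
    best_fixed = np.linalg.norm(gradients.sum(axis=0))
    return best_fixed - merged_reward, best_fixed - voter_rewards


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    grads = rng.uniform(-1.0, 1.0, size=(2000, 3))
    regret, per_voter = run_fixed_combination(grads, [0.5, 0.3, 0.2], [0.5, 1.0, 2.0])
    print("average regret of merged decision:", regret / len(grads))
```

Under these assumptions the regret of the merged decision equals the weighted sum of the voters' regrets, because the rewards are linear and the weights sum to one. For concave rewards the equality would become an upper bound by Jensen's inequality. Either way, the merged decision inherits sublinear regret from the voters.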
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Calliess, J.-P. (2009). On Fixed Convex Combinations of No-Regret Learners. In: Perner, P. (ed.) Machine Learning and Data Mining in Pattern Recognition. MLDM 2009. Lecture Notes in Computer Science, vol 5632. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03070-3_37
DOI: https://doi.org/10.1007/978-3-642-03070-3_37
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03069-7
Online ISBN: 978-3-642-03070-3
eBook Packages: Computer Science, Computer Science (R0)