Improved Second-Order Bounds for Prediction with Expert Advice

Cesa-Bianchi, Nicolò; Mansour, Yishay; Stoltz, Gilles

doi:10.1007/11503415_15

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3559)

Included in the following conference series:

International Conference on Computational Learning Theory

Abstract

This work studies external regret in sequential prediction games with arbitrary payoffs (nonnegative or non-positive). External regret measures the difference between the payoff obtained by the forecasting strategy and the payoff of the best action. We focus on two important parameters: M, the largest absolute value of any payoff, and \(Q^{*}\), the sum of squared payoffs of the best action. Given these parameters, we first derive a new and simple forecasting strategy with regret of order at most \(\sqrt{Q^{*} \ln N} + M \ln N\), where N is the number of actions. We extend the results to the case where the parameters are unknown and derive similar bounds. We then devise a refined analysis of the weighted majority forecaster, which yields bounds of the same flavour. Finally, we apply the proof techniques we develop to the adversarial multi-armed bandit setting, and we prove bounds on the performance of an online algorithm in the case where there is no lower bound on the probability of each action.
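To make the quantities in the abstract concrete, the following is a minimal Python sketch of an exponentially weighted (weighted-majority-style) forecaster on signed payoffs, together with the parameters M and \(Q^{*}\) and the resulting external regret. It is an illustration under stated assumptions, not the new strategy of the paper, which the abstract does not specify. The function names, the random payoff matrix, and the learning rate \(\sqrt{\ln N / Q^{*}}\) are all hypothetical choices made for this example.

```python
import math
import random


def exp_weights_regret(payoffs, eta):
    """Run an exponentially weighted forecaster on a payoff matrix.

    payoffs[t][i] is the payoff of action i at round t (any sign).
    Returns (expected forecaster payoff, best action payoff, regret).
    """
    n_actions = len(payoffs[0])
    cumulative = [0.0] * n_actions  # cumulative payoff of each action
    forecaster_total = 0.0

    for round_payoffs in payoffs:
        # Weights proportional to exp(eta * cumulative payoff); subtracting
        # the maximum keeps the exponentials numerically stable.
        top = max(cumulative)
        weights = [math.exp(eta * (c - top)) for c in cumulative]
        norm = sum(weights)
        probs = [w / norm for w in weights]

        # Expected payoff of an action drawn from probs this round.
        forecaster_total += sum(p * x for p, x in zip(probs, round_payoffs))
        for i, x in enumerate(round_payoffs):
            cumulative[i] += x

    best_total = max(cumulative)
    return forecaster_total, best_total, best_total - forecaster_total


def second_order_parameters(payoffs):
    """Return (M, Q_star): largest |payoff| and the best action's sum of squares."""
    n_actions = len(payoffs[0])
    m = max(abs(x) for row in payoffs for x in row)
    totals = [sum(row[i] for row in payoffs) for i in range(n_actions)]
    best = max(range(n_actions), key=lambda i: totals[i])
    q_star = sum(row[best] ** 2 for row in payoffs)
    return m, q_star


# Hypothetical data: 200 rounds, 5 actions, payoffs of either sign.
rng = random.Random(0)
demo = [[rng.uniform(-1.0, 1.0) for _ in range(5)] for _ in range(200)]
M, Q_star = second_order_parameters(demo)
eta = math.sqrt(math.log(5) / Q_star)  # illustrative tuning, an assumption
total, best, regret = exp_weights_regret(demo, eta)
print(f"M={M:.3f}  Q*={Q_star:.3f}  regret={regret:.3f}")
print(f"sqrt(Q* ln N) + M ln N = {math.sqrt(Q_star * math.log(5)) + M * math.log(5):.3f}")
```

The sketch computes M and \(Q^{*}\) in hindsight, which corresponds to the abstract's first setting where the parameters are given; the case of unknown parameters treated in the paper is not covered here.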

The work of all authors was supported in part by the IST Programme of the European Community, under the PASCAL Network of Excellence, IST-2002-506778.

Author information

Authors and Affiliations

DSI, Università di Milano, via Comelico 39, 20135, Milano, Italy
Nicolò Cesa-Bianchi
School of Computer Science, Tel-Aviv University, Tel Aviv, Israel
Yishay Mansour
DMA, Ecole Normale Supérieure, 45, rue d’Ulm, 75005, Paris, France
Gilles Stoltz

Editor information

Editors and Affiliations

University of Leoben, A-8700, Leoben, Austria
Peter Auer
Department of Electrical Engineering, Technion, P.O. Box, 3200, Haifa, Israel
Ron Meir

Cite this paper

Cesa-Bianchi, N., Mansour, Y., Stoltz, G. (2005). Improved Second-Order Bounds for Prediction with Expert Advice. In: Auer, P., Meir, R. (eds) Learning Theory. COLT 2005. Lecture Notes in Computer Science, vol 3559. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11503415_15

DOI: https://doi.org/10.1007/11503415_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26556-6
Online ISBN: 978-3-540-31892-7
eBook Packages: Computer Science, Computer Science (R0)
