Hannan Consistency in On-Line Learning in Case of Unbounded Losses Under Partial Monitoring

  • Chamy Allenberg
  • Peter Auer
  • László Györfi
  • György Ottucsák
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4264)

Abstract

This paper considers sequential prediction with expert advice when the losses are unbounded and feedback is given under partial monitoring. We treat a wide class of partial monitoring problems: the combination of the label efficient and multi-armed bandit problems, in which the algorithm is informed about the performance of the chosen expert only with probability ε ≤ 1. For bounded losses, we give an algorithm whose expected regret scales with the square root of the loss of the best expert. For unbounded losses, we prove that Hannan consistency can be achieved, depending on the growth rate of the average squared losses of the experts.
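
The feedback model described in the abstract can be illustrated with a small sketch. It is not the algorithm of the paper, whose details are not given here. It is a generic exponentially weighted forecaster with an importance-weighted loss estimate, assuming losses in [0, 1], a constant learning rate, and uniform exploration. The names `run_forecaster`, `distribution`, `eta`, and `gamma` are hypothetical and chosen only for illustration.

```python
import math
import random


def distribution(est, eta, gamma):
    """Exponential weights over estimated losses, mixed with uniform exploration."""
    n = len(est)
    m = min(est)  # shift by the minimum so every exponent is <= 0
    w = [math.exp(-eta * (e - m)) for e in est]
    s = sum(w)
    return [(1.0 - gamma) * wi / s + gamma / n for wi in w]


def run_forecaster(losses, eps, eta, gamma, seed=0):
    """Simulate the combined bandit and label efficient feedback setting.

    losses: list of rounds, each a list of per-expert losses (assumed in [0, 1]).
    eps:    probability that the chosen expert's loss is revealed, 0 < eps <= 1.
    eta:    learning rate (assumption: held constant).
    gamma:  uniform exploration rate in (0, 1] (assumption).
    Returns (cumulative loss suffered, distribution after the final update).
    """
    rng = random.Random(seed)
    n = len(losses[0])
    est = [0.0] * n  # cumulative estimated loss of each expert
    total = 0.0
    for round_losses in losses:
        probs = distribution(est, eta, gamma)
        chosen = rng.choices(range(n), weights=probs)[0]
        total += round_losses[chosen]
        # Only the chosen expert's loss can be seen, and only with probability eps.
        if rng.random() < eps:
            # Dividing by probs[chosen] * eps keeps the estimate unbiased.
            est[chosen] += round_losses[chosen] / (probs[chosen] * eps)
    return total, distribution(est, eta, gamma)
```

With ε = 1 the sketch reduces to plain multi-armed bandit feedback. The division by `probs[chosen] * eps` is what makes the estimated loss of every expert equal to its true loss in expectation, even though at most one loss is observed per round.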


Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Chamy Allenberg (1)
  • Peter Auer (2)
  • László Györfi (3)
  • György Ottucsák (3)
  1. School of Computer Science, Tel Aviv University, Tel Aviv, Israel
  2. Chair for Information Technology, University of Leoben, Leoben, Austria
  3. Department of Computer Science and Information Theory, Budapest University of Technology and Economics, Budapest, Hungary
