Efficient Algorithms for Online Decision Problems

  • Adam Kalai
  • Santosh Vempala
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2777)

Abstract

In an online decision problem, one makes a sequence of decisions without knowledge of the future. Tools from learning such as Weighted Majority and its many variants [4, 13, 18] demonstrate that online algorithms can perform nearly as well as the best single decision chosen in hindsight, even when there are exponentially many possible decisions. However, the naive application of these algorithms is inefficient for such large problems. For some problems with nice structure, specialized efficient solutions have been developed [3, 6, 10, 16, 17].

We show that a very simple idea, used in Hannan’s seminal 1957 paper [9], gives efficient solutions to all of these problems. Essentially, in each period, one chooses the decision that worked best in the past. To guarantee low regret, it is necessary to add randomness. Surprisingly, this simple approach gives additive ε regret per period, efficiently. We present a simple general analysis and several extensions, including a (1+ε)-competitive algorithm as well as a lazy one that rarely switches between decisions.
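The idea described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes a finite set of n decisions whose per-period costs are revealed after each choice, and it assumes a fresh perturbation drawn uniformly from [0, 1/ε] for each decision in each period. The names follow_perturbed_leader, best_fixed_cost, cost_rounds and seed are hypothetical and chosen only for this example.

```python
import random


def follow_perturbed_leader(cost_rounds, epsilon, seed=0):
    """Pick, each period, the decision with the lowest perturbed past cost.

    cost_rounds: list of per-period cost lists, one cost per decision.
    epsilon: controls the perturbation scale (assumed uniform on [0, 1/epsilon]).
    Returns the list of chosen decision indices and the total cost incurred.
    """
    rng = random.Random(seed)
    n = len(cost_rounds[0])
    cumulative = [0.0] * n  # total past cost of each decision
    choices = []
    total = 0.0
    for costs in cost_rounds:
        # Randomness is added to the past costs before picking the leader.
        perturbed = [cumulative[i] + rng.uniform(0.0, 1.0 / epsilon)
                     for i in range(n)]
        choice = min(range(n), key=perturbed.__getitem__)
        choices.append(choice)
        total += costs[choice]
        # Costs of all decisions are revealed only after the choice is made.
        for i in range(n):
            cumulative[i] += costs[i]
    return choices, total


def best_fixed_cost(cost_rounds):
    """Cost of the best single decision chosen in hindsight."""
    n = len(cost_rounds[0])
    return min(sum(r[i] for r in cost_rounds) for i in range(n))


if __name__ == "__main__":
    rounds = [[0.0, 1.0]] * 200  # decision 0 is always free, decision 1 costs 1
    picks, incurred = follow_perturbed_leader(rounds, epsilon=0.5)
    print("regret:", incurred - best_fixed_cost(rounds))
```

The difference between the incurred cost and best_fixed_cost is the regret. In this sketch the only work per period is finding the minimum of the perturbed totals, so for a structured problem that step could be replaced by any efficient offline optimizer for that problem.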

References

  1. Blum, A.: On-line algorithms in machine learning. Technical Report CMU-CS-97-163, Carnegie Mellon University (1997)
  2. Blum, A., Burch, C.: On-line learning and the metrical task system problem. Machine Learning 39(1), 35–58 (2000)
  3. Blum, A., Chawla, S., Kalai, A.: Static Optimality and Dynamic Search Optimality in Lists and Trees. In: Proceedings of the Thirteenth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2002) (2002)
  4. Cesa-Bianchi, N., Freund, Y., Haussler, D., Helmbold, D., Schapire, R., Warmuth, M.: How to use expert advice. Journal of the ACM 44(3), 427–485 (1997)
  5. Cover, T.: Universal Portfolios. Math. Finance 1, 1–29 (1991)
  6. Freund, Y., Schapire, R., Singer, Y., Warmuth, M.: Using and combining predictors that specialize. In: Proceedings of the 29th Annual ACM Symposium on the Theory of Computing, pp. 334–343 (1997)
  7. Foster, D., Vohra, R.: Regret in the on-line decision problem. Games and Economic Behavior 29, 1084–1090 (1999)
  8. Goemans, M., Williamson, D.: Improved Approximation Algorithms for Maximum Cut and Satisfiability Problems Using Semidefinite Programming. J. ACM 42, 1115–1145 (1995)
  9. Hannan, J.: Approximation to Bayes risk in repeated plays. In: Dresher, M., Tucker, A., Wolfe, P. (eds.) Contributions to the Theory of Games, vol. 3, pp. 97–139. Princeton University Press, Princeton (1957)
  10. Helmbold, D., Schapire, R.: Predicting nearly as well as the best pruning of a decision tree. Machine Learning 27(1), 51–68 (1997)
  11. Kalai, A., Vempala, S.: Geometric algorithms for online optimization. MIT Technical report MIT-LCS-TR-861 (2002)
  12. Knuth, D.: Dynamic Huffman Coding. J. Algorithms 2, 163–180 (1985)
  13. Littlestone, N., Warmuth, M.K.: The weighted majority algorithm. Information and Computation 108, 212–261 (1994)
  14. Sleator, D., Tarjan, R.: Amortized efficiency of list update and paging rules. Communications of the ACM 28, 202–208 (1985)
  15. Sleator, D., Tarjan, R.: Self-Adjusting Binary Search Trees. Journal of the ACM 32, 652–686 (1985)
  16. Takimoto, E., Warmuth, M.: Path Kernels and Multiplicative Updates. In: Proceedings of the Thirteenth Annual Conference on Computational Learning Theory, pp. 74–89 (2002)
  17. Takimoto, E., Warmuth, M.: Predicting Nearly as Well as the Best Pruning of a Planar Decision Graph. Theoretical Computer Science 288(2), 217–235 (2002)
  18. Vovk, V.: Aggregating strategies. In: Proc. 3rd Ann. Workshop on Computational Learning Theory, pp. 371–383 (1990)
  19. Zinkevich, M.: Online Convex Programming and Generalized Infinitesimal Gradient Ascent. CMU Technical Report CMU-CS-03-110 (2003)

Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Adam Kalai (1)
  • Santosh Vempala (1)
  1. Massachusetts Institute of Technology, Cambridge, USA
