Worst Case Prediction over Sequences under Log Loss

M. Opper and D. Haussler

Part of the book series: The IMA Volumes in Mathematics and its Applications (IMA, volume 107)

Abstract

We consider the game of sequentially assigning probabilities to future data based on past observations, under logarithmic loss. We make no probabilistic assumptions about how the data are generated; instead, a player tries to minimize his loss relative to the loss of the best distribution in a target class, chosen with hindsight, on the worst-case data sequence. We give bounds on the minimax regret in terms of the metric entropies of the target class with respect to suitable distances between distributions.
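For context, here is a minimal formalization of this setting; the notation (R_n, q, p, \mathcal{P}, x^n) is a common convention in this literature and is our assumption, not necessarily the chapter's. At each step t the player, having observed x_1, ..., x_{t-1}, assigns a conditional probability q(x_t \mid x^{t-1}) to the next outcome and incurs the log loss -\log q(x_t \mid x^{t-1}). The minimax regret relative to the target class \mathcal{P} is then

\[
R_n(\mathcal{P}) \;=\; \inf_{q}\, \sup_{x^n}
\left[ \sum_{t=1}^{n} \log \frac{1}{q(x_t \mid x^{t-1})}
\;-\; \inf_{p \in \mathcal{P}} \sum_{t=1}^{n} \log \frac{1}{p(x_t \mid x^{t-1})} \right],
\]

i.e., the worst-case excess cumulative log loss of the best sequential strategy q over the distribution p \in \mathcal{P} that is best with hindsight. The metric entropy of \mathcal{P} at scale \varepsilon, which governs the bounds, is the logarithm of the smallest number of \varepsilon-balls (under a suitable distance between distributions) needed to cover \mathcal{P}.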





Copyright information

© 1999 Springer Science+Business Media New York

About this chapter

Cite this chapter

Opper, M., Haussler, D. (1999). Worst Case Prediction over Sequences under Log Loss. In: Cybenko, G., O’Leary, D.P., Rissanen, J. (eds) The Mathematics of Information Coding, Extraction and Distribution. The IMA Volumes in Mathematics and its Applications, vol 107. Springer, New York, NY. https://doi.org/10.1007/978-1-4612-1524-0_6


  • DOI: https://doi.org/10.1007/978-1-4612-1524-0_6

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4612-7178-9

  • Online ISBN: 978-1-4612-1524-0

