Abstract
We consider the game of sequentially assigning probabilities to future data on the basis of past observations, under logarithmic loss. We make no probabilistic assumptions about how the data are generated; instead, we consider a player who tries to minimize his loss relative to the loss of the (in hindsight) best distribution from a target class, on the worst-case data sequence. We give bounds on the minimax regret in terms of the metric entropies of the target class with respect to suitable distances between distributions.
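As a concrete illustration of the setting, the following Python sketch plays this game for the simplest target class, the Bernoulli (biased-coin) distributions on binary sequences. The predictor shown is the classical Krichevsky-Trofimov add-1/2 mixture; this particular strategy is our illustrative choice, not one prescribed by the abstract. The script computes the player's cumulative log loss, the loss of the best coin chosen in hindsight, and their difference, the regret, which is nonnegative and grows only logarithmically in the sequence length uniformly over sequences.

```python
import math

def kt_loss(seq):
    """Cumulative log loss (in nats) of the Krichevsky-Trofimov
    sequential estimator on a binary sequence."""
    loss, ones = 0.0, 0
    for t, x in enumerate(seq):
        p_one = (ones + 0.5) / (t + 1)      # add-1/2 smoothed prediction
        p = p_one if x == 1 else 1.0 - p_one
        loss -= math.log(p)
        ones += x
    return loss

def best_coin_loss(seq):
    """Log loss of the best Bernoulli distribution chosen in hindsight,
    i.e. the coin with bias k/n; terms with zero counts contribute 0."""
    n, k = len(seq), sum(seq)
    loss = 0.0
    if k > 0:
        loss -= k * math.log(k / n)
    if k < n:
        loss -= (n - k) * math.log((n - k) / n)
    return loss

seq = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1] * 3    # an arbitrary binary sequence
regret = kt_loss(seq) - best_coin_loss(seq)
print(regret)  # nonnegative, and O(log n) uniformly over sequences
```

For this class the known worst-case regret of the Krichevsky-Trofimov mixture is about (1/2) log n plus a constant, matching the flavor of the metric-entropy bounds the chapter develops for general target classes.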
Copyright information
© 1999 Springer Science+Business Media New York
Cite this chapter
Opper, M., Haussler, D. (1999). Worst Case Prediction over Sequences under Log Loss. In: Cybenko, G., O’Leary, D.P., Rissanen, J. (eds) The Mathematics of Information Coding, Extraction and Distribution. The IMA Volumes in Mathematics and its Applications, vol 107. Springer, New York, NY. https://doi.org/10.1007/978-1-4612-1524-0_6
Print ISBN: 978-1-4612-7178-9
Online ISBN: 978-1-4612-1524-0