Auer, P., & Long, P. (1994). Simulating access to hidden information while learning. *Proceedings of the 26th Annual ACM Symposium on Theory of Computing* (pp. 263–272). New York: Assoc. Comput. Mach.

Carlson, B.C. (1977). *Special functions of applied mathematics*. New York: Academic Press.

Cesa-Bianchi, N., Freund, Y., Helmbold, D.P., Haussler, D., Schapire, R.E., & Warmuth, M.K. (1993). How to use expert advice. *Proceedings of the 25th Annual ACM Symposium on Theory of Computing* (pp. 382–391). New York: Assoc. Comput. Mach.

Cesa-Bianchi, N., Freund, Y., Helmbold, D.P., & Warmuth, M.K. (1996). On-line prediction and conversion strategies. *Machine Learning*, *25*, 71–110.

Cesa-Bianchi, N., Helmbold, D.P., & Panizza, S. (1996). On Bayes methods for on-line Boolean prediction. *Proceedings of the 9th Annual ACM Conference on Computational Learning Theory* (pp. 314–324). New York: Assoc. Comput. Mach.

Cover, T., & Ordentlich, E. (1996). Universal portfolios with side information. *IEEE Trans. Inform. Theory*, *42*, 348–363.

Dawid, A.P. (1986). Probability forecasting. In S. Kotz & N.L. Johnson (Eds.), *Encyclopedia of Statistical Sciences* (Vol. 7). New York: Wiley.

DeSantis, A., Markowsky, G., & Wegman, M.N. (1988). Learning probabilistic prediction functions. *Proceedings of the 29th Annual IEEE Symposium on Foundations of Computer Science* (pp. 110–119). Los Alamitos, CA: IEEE Comput. Soc.

Feder, M., Merhav, N., & Gutman, M. (1992). Universal prediction of individual sequences. *IEEE Trans. Inform. Theory*, *38*, 1258–1270.

Freund, Y. (1996). Predicting a binary sequence almost as well as the optimal biased coin. *Proceedings of the 9th Annual ACM Conference on Computational Learning Theory* (pp. 89–98). New York: Assoc. Comput. Mach.

Freund, Y., Schapire, R., Singer, Y., & Warmuth, M. (1997). Using and combining predictors that specialize. *Proceedings of the 29th Annual ACM Symposium on Theory of Computing*. New York: Assoc. Comput. Mach.

Haussler, D., Kivinen, J., & Warmuth, M.K. (1994). Tight worst-case loss bounds for predicting with expert advice. (Technical Report UCSC-CRL-94-36). University of California, Santa Cruz, CA, revised December 1994. Short version in P. Vitányi (Ed.), *Computational Learning Theory*. Lecture Notes in Computer Science (Vol. 904). Berlin: Springer (1995).

Helmbold, D., & Schapire, R. (1997). Predicting nearly as well as the best pruning of a decision tree. *Machine Learning*, *27*, 51–68.

Herbster, M., & Warmuth, M. (1995). Tracking the best expert. *Proceedings of the 12th International Conference on Machine Learning* (pp. 286–294). Morgan Kaufmann. To appear in *Machine Learning*.

Herbster, M., & Warmuth, M. (1997). Tracking the best expert, II. Manuscript.

Lauritzen, S.L., & Spiegelhalter, D.J. (1988). Local computations with probabilities on graphical structures and their application to expert systems (with discussion). *J. R. Statist. Soc. B*, *50*, 157–224. Also in (Shafer and Pearl, 1990).

Littlestone, N., & Warmuth, M.K. (1994). The weighted majority algorithm. *Inform. Computation*, *108*, 212–261.

Pearl, J. (1986). Fusion, propagation, and structuring in belief networks. *Artificial Intelligence*, *29*, 241–288. Also in (Shafer and Pearl, 1990).

Rissanen, J. (1983). A universal prior for integers and estimation by minimum description length. *Ann. Statist.*, *11*, 416–431.

Shafer, G., & Pearl, J. (Eds.) (1990). *Uncertain reasoning*. San Mateo, CA: Morgan Kaufmann.

Takimoto, E., Maruoka, A., & Vovk, V. (1998). Predicting nearly as well as the best pruning of a decision tree through dynamic programming scheme. Submitted for publication.

Vovk, V. (1990). Aggregating strategies. *Proceedings of the 3rd Annual Workshop on Computational Learning Theory* (pp. 371–383). San Mateo, CA: Morgan Kaufmann.

Vovk, V. (1992). Universal forecasting algorithms. *Inform. Computation*, *96*, 245–277.

Vovk, V. (1997a). Derandomizing stochastic prediction strategies. *Proceedings of the 10th Annual ACM Conference on Computational Learning Theory* (pp. 32–44). New York: Assoc. Comput. Mach.

Vovk, V. (1997b). On-line competitive linear regression. In M.I. Jordan, M.J. Kearns, & S.A. Solla (Eds.), *Advances in Neural Information Processing Systems 10* (pp. 364–370). Cambridge, MA: MIT Press.

Vovk, V. (1998). A game of prediction with expert advice. *J. Comput. System Sci.*, *56*, 153–173.

Vovk, V., & Watkins, C.J.H.C. (1998). Universal portfolio selection. *Proceedings of the 11th Annual ACM Conference on Computational Learning Theory* (pp. 12–23). New York: Assoc. Comput. Mach.

Watkins, C.J.H.C. (1997). How to use advice from small numbers of experts. (Technical Report CSD-TR-97-16). Department of Computer Science, Royal Holloway, University of London.

Yamanishi, K. (1995). Randomized approximate aggregating strategies and their applications to prediction and discrimination. *Proceedings of the 8th Annual ACM Conference on Computational Learning Theory* (pp. 83–90). New York: Assoc. Comput. Mach.