Online Regression Competitive with Changing Predictors

  • Steven Busuttil
  • Yuri Kalnishkan
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4754)


This paper deals with the problem of making predictions in the online mode of learning when the dependence of the outcome y_t on the signal x_t can change with time. The Aggregating Algorithm (AA) is a technique that optimally merges experts from a pool, so that the resulting strategy suffers a cumulative loss that is almost as small as that of the best expert in the pool. We apply the AA to the case where the experts are all the linear predictors that can change with time. KAARCh is the kernel version of the resulting algorithm; in the kernel case, the experts are all the decision rules in some reproducing kernel Hilbert space that can change over time. We show that KAARCh suffers a cumulative square loss that is almost as small as that of any expert that does not change very rapidly.
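To make the merging step concrete, here is a minimal sketch of the Aggregating Algorithm for square loss with a *finite* pool of experts and outcomes in [-Y, Y] (the paper itself works with the infinite pool of changing linear/kernel predictors, which requires a different derivation; the function name `aa_square_loss` and the constant-expert setup below are illustrative, not from the paper):

```python
import numpy as np

def aa_square_loss(expert_preds, outcomes, Y=1.0):
    """Aggregating Algorithm for square loss with a finite expert pool.

    expert_preds: array of shape (T, N), prediction of expert i at step t
    outcomes:     array of shape (T,), true outcomes, assumed to lie in [-Y, Y]
    Returns the learner's predictions, shape (T,).
    """
    T, N = expert_preds.shape
    eta = 1.0 / (2.0 * Y * Y)      # learning rate for which square loss is mixable
    log_w = np.zeros(N)            # log weights, initially uniform
    preds = np.empty(T)
    for t in range(T):
        gamma = expert_preds[t]

        # generalized prediction g(y) = -(1/eta) log sum_i w_i exp(-eta (y - gamma_i)^2);
        # unnormalized weights are fine, the constant cancels in the substitution below
        def g(y):
            a = log_w - eta * (y - gamma) ** 2
            m = a.max()                       # log-sum-exp, numerically stable
            return -(m + np.log(np.exp(a - m).sum())) / eta

        # Vovk's substitution function turns g into a single permitted prediction
        preds[t] = (g(-Y) - g(Y)) / (4.0 * Y)

        # discount each expert's weight exponentially by its incurred loss
        log_w = log_w - eta * (outcomes[t] - gamma) ** 2
    return preds
```

With this choice of eta, the learner's cumulative square loss exceeds that of the best expert by at most 2 Y^2 ln N, regardless of how the outcomes are generated; this is the kind of guarantee the AA provides, which the paper extends to expert pools whose predictors drift over time.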


Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Steven Busuttil (1)
  • Yuri Kalnishkan (1)

  1. Computer Learning Research Centre and Department of Computer Science, Royal Holloway, University of London, Egham, Surrey, TW20 0EX, United Kingdom