Machine Learning, Volume 14, Issue 1, pp 27–45

Tracking drifting concepts by minimizing disagreements

  • David P. Helmbold
  • Philip M. Long

Abstract

In this paper we consider the problem of tracking a subset of a domain (called the target) which changes gradually over time. A single (unknown) probability distribution over the domain is used both to generate random examples for the learning algorithm and to measure the speed at which the target changes. Clearly, the more rapidly the target moves, the harder it is for the algorithm to maintain a good approximation of it. We therefore evaluate algorithms by how much movement of the target can be tolerated between examples while still predicting with accuracy ε. Furthermore, the complexity of the class H of possible targets, as measured by d, its VC-dimension, also affects the difficulty of tracking the target concept. We show that if the problem of minimizing the number of disagreements with a sample from among concepts in a class H can be approximated to within a factor k, then there is a simple tracking algorithm for H which can achieve a probability ε of making a mistake, provided the target movement rate is at most a constant times ε²/(k(d + k) ln(1/ε)), where d is the Vapnik-Chervonenkis dimension of H. We also show that if H is properly PAC-learnable, then there is an efficient (randomized) algorithm that, with high probability, approximately minimizes disagreements to within a factor of 7d + 1, yielding an efficient tracking algorithm for H which tolerates drift rates up to a constant times ε²/(d² ln(1/ε)). In addition, we prove complementary results for the classes of halfspaces and axis-aligned hyperrectangles, showing that the maximum rate of drift that any algorithm (even with unlimited computational power) can tolerate is a constant times ε²/d.
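
To make the strategy behind these results concrete, here is a minimal sketch (in Python) of tracking by disagreement minimization on a toy class: 1-D thresholds over [0, 1], for which the VC dimension is d = 1 and exact minimization (k = 1) is straightforward. The window size, drift rate, and uniform example distribution are assumptions chosen for the illustration, not parameters taken from the paper.

    import random

    def min_disagreement_threshold(window):
        # Return a threshold t minimizing disagreements with the window,
        # where hypothesis h_t labels x positive iff x >= t. For this toy
        # class, checking each example's position (plus the endpoints)
        # suffices for exact minimization (k = 1).
        candidates = [0.0, 1.0] + [x for x, _ in window]
        return min(candidates, key=lambda t: sum((x >= t) != y for x, y in window))

    random.seed(0)
    target, drift, m = 0.3, 0.0005, 50  # assumed drift per example and window size
    window, mistakes, trials = [], 0, 2000
    for _ in range(trials):
        x = random.random()  # fresh example from the fixed (uniform) distribution
        if window:
            t_hat = min_disagreement_threshold(window)
            mistakes += (x >= t_hat) != (x >= target)  # predict before seeing label
        window.append((x, x >= target))
        window = window[-m:]  # keep only the m most recent examples
        target = min(1.0, target + drift)  # the target concept drifts slowly

    print(f"empirical mistake rate: {mistakes / trials:.3f}")

Shrinking the window trades staleness for variance: a shorter window adapts faster to drift but estimates each hypothesis's error less reliably, which is exactly the tension the drift-rate bounds above quantify.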

Keywords

Computational learning theory, concept drift, concept learning

Copyright information

© Kluwer Academic Publishers 1994

Authors and Affiliations

  • David P. Helmbold (1)
  • Philip M. Long (2)

  1. CIS Board, UC Santa Cruz, Santa Cruz
  2. Institute for Theoretical Computer Science, Technische Universitaet Graz, Graz, Austria
