On-line learning with malicious noise and the closure algorithm

  • Peter Auer
  • Nicolò Cesa-Bianchi


We investigate a variant of the on-line learning model for classes of {0,1}-valued functions (concepts) in which the labels of a certain amount of the input instances are corrupted by adversarial noise. We propose an extension of a general learning strategy, known as “Closure Algorithm”, to this noise model, and show a worst-case mistake bound of m + (d+1)K for learning an arbitrary intersection-closed concept class C, where K is the number of noisy labels, d is a combinatorial parameter measuring C's complexity, and m is the worst-case mistake bound of the Closure Algorithm for learning C in the noise-free model. For several concept classes our extended Closure Algorithm is efficient and can tolerate a noise rate up to the information-theoretic upper bound. Finally, we show how to efficiently turn any algorithm for the on-line noise model into a learning algorithm for the PAC model with malicious noise.
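As a rough illustration of the noise-free strategy the abstract builds on, the sketch below runs the standard Closure Algorithm on one intersection-closed class, axis-parallel boxes over integer points. The hypothesis is always the smallest box containing every positive example seen so far, so in the noise-free setting mistakes can occur only on positive examples. The class choice, the names `ClosureLearner` and `run`, and the toy stream are assumptions made for this example; the noise-tolerant extension and its m + (d+1)K bound are not reproduced here.

```python
from typing import Iterable, List, Optional, Tuple

Point = Tuple[int, ...]


class ClosureLearner:
    """Noise-free Closure Algorithm for axis-parallel boxes (illustrative only)."""

    def __init__(self) -> None:
        # Lower and upper corners of the current hypothesis box.
        # None means the hypothesis is the empty concept.
        self.lo: Optional[List[int]] = None
        self.hi: Optional[List[int]] = None
        self.mistakes = 0

    def predict(self, x: Point) -> int:
        # Predict 1 only if x lies in the closure of the positives seen so far.
        if self.lo is None or self.hi is None:
            return 0
        return int(all(l <= v <= h for l, v, h in zip(self.lo, x, self.hi)))

    def update(self, x: Point, label: int) -> None:
        # Count a mistake, then grow the box minimally on positive examples.
        if self.predict(x) != label:
            self.mistakes += 1
        if label == 1:
            if self.lo is None or self.hi is None:
                self.lo, self.hi = list(x), list(x)
            else:
                self.lo = [min(l, v) for l, v in zip(self.lo, x)]
                self.hi = [max(h, v) for h, v in zip(self.hi, x)]


def run(stream: Iterable[Tuple[Point, int]]) -> ClosureLearner:
    """Feed a sequence of (instance, label) pairs to a fresh learner."""
    learner = ClosureLearner()
    for x, y in stream:
        learner.update(x, y)
    return learner


if __name__ == "__main__":
    # Hypothetical target concept: the box [0,2] x [0,2], with correct labels.
    stream = [((1, 1), 1), ((3, 3), 0), ((0, 2), 1), ((1, 2), 1),
              ((2, 0), 1), ((2, 2), 1), ((3, 0), 0)]
    learner = run(stream)
    print(learner.mistakes, learner.lo, learner.hi)
```

With correct labels the hypothesis never leaves the target concept, which is why negative examples never cause a mistake. A single corrupted positive label would enlarge the box permanently, which is the failure mode a noise-tolerant variant has to handle.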


Boolean Function, Concept Class, Target Class, Noise Rate, Hypothesis Class
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Kluwer Academic Publishers 1998

Authors and Affiliations

  • Peter Auer (1)
  • Nicolò Cesa-Bianchi (2)
  1. IGI, Graz University of Technology, Graz, Austria
  2. DSI, University of Milan, Milano, Italy
