Machine Learning, Volume 66, Issue 2–3, pp 151–163

A new PAC bound for intersection-closed concept classes

  • Peter Auer
  • Ronald Ortner


For hyper-rectangles in \(\mathbb{R}^{d}\), Auer (1997) proved a PAC bound of \(O(\frac{1}{\varepsilon}(d+\log \frac{1}{\delta}))\), where \(\varepsilon\) and \(\delta\) are the accuracy and confidence parameters. It is still an open question whether the same bound holds for intersection-closed concept classes of VC-dimension \(d\) in general. We present a step towards a solution of this problem, showing on the one hand a new PAC bound of \(O(\frac{1}{\varepsilon}(d\log d + \log \frac{1}{\delta}))\) for arbitrary intersection-closed concept classes, complementing the well-known bounds \(O(\frac{1}{\varepsilon}(\log \frac{1}{\delta}+d\log \frac{1}{\varepsilon}))\) and \(O(\frac{d}{\varepsilon}\log \frac{1}{\delta})\) of Blumer et al. (1989) and Haussler, Littlestone, and Warmuth (1994). Our bound is established using the closure algorithm, which generates as its hypothesis the intersection of all concepts that are consistent with the positive training examples. On the other hand, we show that many intersection-closed concept classes, including e.g. maximum intersection-closed classes, satisfy an additional combinatorial property that allows a proof of the optimal bound \(O(\frac{1}{\varepsilon}(d+\log \frac{1}{\delta}))\). For such improved bounds the choice of the learning algorithm is crucial, as there are consistent learning algorithms that need \(\Omega(\frac{1}{\varepsilon}(d\log\frac{1}{\varepsilon} +\log\frac{1}{\delta}))\) examples to learn some particular maximum intersection-closed concept classes.
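To make the closure algorithm concrete, the following sketch specializes it to the class of axis-aligned hyper-rectangles in \(\mathbb{R}^{d}\) discussed in the abstract. For this class, the intersection of all rectangles consistent with the positive examples is simply the smallest axis-aligned box containing them, so the hypothesis can be computed coordinate-wise. This is an illustrative sketch (function names are not from the paper), not the paper's general formulation:

```python
def closure_hypothesis(positives):
    """Closure algorithm for axis-aligned hyper-rectangles:
    return the smallest rectangle containing all positive examples,
    i.e. the intersection of all consistent rectangles.
    `positives` is a non-empty list of d-dimensional points."""
    d = len(positives[0])
    lo = [min(p[i] for p in positives) for i in range(d)]
    hi = [max(p[i] for p in positives) for i in range(d)]
    return lo, hi

def predict(hypothesis, x):
    """Classify x as positive iff it lies inside the hypothesis rectangle."""
    lo, hi = hypothesis
    return all(l <= xi <= h for l, xi, h in zip(lo, x, hi))

# Example: three positive examples in R^2 yield the box [1,3] x [1,4].
hyp = closure_hypothesis([(1, 2), (3, 1), (2, 4)])
print(hyp)                  # ([1, 1], [3, 4])
print(predict(hyp, (2, 2))) # True
print(predict(hyp, (0, 0))) # False
```

Because the hypothesis is contained in the target concept, the closure algorithm can only err on positive points, which is the structural property the bounds above exploit.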


Keywords: PAC bounds · Intersection-closed classes


  1. Auer, P. (1997). Learning nested differences in the presence of malicious noise. Theor. Comput. Sci., 185(1), 159–175.
  2. Auer, P., & Cesa-Bianchi, N. (1998). On-line learning with malicious noise and the closure algorithm. Ann. Math. Artif. Intell., 23(1–2), 83–99.
  3. Auer, P., Long, P. M., & Srinivasan, A. (1998). Approximating hyper-rectangles: Learning and pseudorandom sets. J. Comput. Syst. Sci., 57(3), 376–388.
  4. Blumer, A., Ehrenfeucht, A., Haussler, D., & Warmuth, M. (1989). Learnability and the Vapnik-Chervonenkis dimension. J. ACM, 36(4), 929–965.
  5. Ehrenfeucht, A., Haussler, D., Kearns, M. J., & Valiant, L. G. (1989). A general lower bound on the number of examples needed for learning. Inf. Comput., 82(3), 247–261.
  6. Floyd, S., & Warmuth, M. (1995). Sample compression, learnability, and the Vapnik-Chervonenkis dimension. Machine Learning, 21(3), 269–304.
  7. Haussler, D., Littlestone, N., & Warmuth, M. (1994). Predicting {0,1}-functions on randomly drawn points. Inf. Comput., 115(2), 248–292.
  8. Helmbold, D., Sloan, R., & Warmuth, M. (1990). Learning nested differences of intersection-closed concept classes. Machine Learning, 5, 165–196.
  9. Sauer, N. (1972). On the density of families of sets. J. Combin. Theory Ser. A, 13, 145–147.

Copyright information

© Springer Science + Business Media, LLC 2007

Authors and Affiliations

  1. Department Mathematik und Informationstechnologie, Montanuniversität Leoben, 8700 Leoben, Austria
