Machine Learning, Volume 66, Issue 2–3, pp 151–163

A new PAC bound for intersection-closed concept classes


Abstract

For hyper-rectangles in \(\mathbb{R}^{d}\) Auer (1997) proved a PAC bound of \(O(\frac{1}{\varepsilon}(d+\log \frac{1}{\delta}))\), where \(\varepsilon\) and \(\delta\) are the accuracy and confidence parameters. It is still an open question whether one can obtain the same bound for intersection-closed concept classes of VC-dimension \(d\) in general. We present a step towards a solution of this problem by showing, on one hand, a new PAC bound of \(O(\frac{1}{\varepsilon}(d\log d + \log \frac{1}{\delta}))\) for arbitrary intersection-closed concept classes, complementing the well-known bounds \(O(\frac{1}{\varepsilon}(\log \frac{1}{\delta}+d\log \frac{1}{\varepsilon}))\) and \(O(\frac{d}{\varepsilon}\log \frac{1}{\delta})\) of Blumer et al. (1989) and Haussler, Littlestone and Warmuth (1994). Our bound is established using the closure algorithm, which generates as its hypothesis the intersection of all concepts that are consistent with the positive training examples. On the other hand, we show that many intersection-closed concept classes, including, e.g., maximum intersection-closed classes, satisfy an additional combinatorial property that allows a proof of the optimal bound of \(O(\frac{1}{\varepsilon}(d+\log \frac{1}{\delta}))\). For such improved bounds the choice of the learning algorithm is crucial, as there are consistent learning algorithms that need \(\Omega(\frac{1}{\varepsilon}(d\log\frac{1}{\varepsilon} +\log\frac{1}{\delta}))\) examples to learn some particular maximum intersection-closed concept classes.
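To make the closure algorithm concrete, the following is a minimal sketch for the intersection-closed class of axis-aligned hyper-rectangles in \(\mathbb{R}^{d}\) (the class from Auer, 1997). For this class the intersection of all rectangles consistent with the positive examples is simply their bounding box; the function and variable names below are illustrative choices, not taken from the paper.

```python
# Sketch of the closure algorithm for axis-aligned hyper-rectangles:
# the hypothesis is the tightest rectangle containing every positive
# training example, i.e. the intersection of all consistent rectangles.

def closure_hypothesis(positives):
    """Return (lower, upper) corners of the smallest axis-aligned
    rectangle containing every positive example."""
    d = len(positives[0])
    lower = [min(x[i] for x in positives) for i in range(d)]
    upper = [max(x[i] for x in positives) for i in range(d)]
    return lower, upper

def predict(hypothesis, x):
    """Classify x as positive iff it lies inside the rectangle."""
    lower, upper = hypothesis
    return all(lo <= xi <= hi for lo, xi, hi in zip(lower, x, upper))

# Positive training points in R^2
pos = [(1.0, 2.0), (3.0, 0.5), (2.0, 4.0)]
h = closure_hypothesis(pos)       # bounding box: ([1.0, 0.5], [3.0, 4.0])
print(predict(h, (2.0, 1.0)))     # inside the box  -> True
print(predict(h, (5.0, 1.0)))     # outside the box -> False
```

Because the hypothesis can only grow as positive examples arrive and never covers a point outside every consistent concept, its errors are one-sided: it can mislabel unseen positives but never negatives, which is the property the paper's analysis exploits.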

Keywords

PAC bounds · Intersection-closed classes

References

  1. Auer, P. (1997). Learning nested differences in the presence of malicious noise. Theor. Comput. Sci., 185(1), 159–175.
  2. Auer, P., & Cesa-Bianchi, N. (1998). On-line learning with malicious noise and the closure algorithm. Ann. Math. Artif. Intell., 23(1–2), 83–99.
  3. Auer, P., Long, P. M., & Srinivasan, A. (1998). Approximating hyper-rectangles: Learning and pseudorandom sets. J. Comput. Syst. Sci., 57(3), 376–388.
  4. Blumer, A., Ehrenfeucht, A., Haussler, D., & Warmuth, M. (1989). Learnability and the Vapnik-Chervonenkis dimension. J. ACM, 36(4), 929–965.
  5. Ehrenfeucht, A., Haussler, D., Kearns, M. J., & Valiant, L. G. (1989). A general lower bound on the number of examples needed for learning. Inf. Comput., 82(3), 247–261.
  6. Floyd, S., & Warmuth, M. (1995). Sample compression, learnability, and the Vapnik-Chervonenkis dimension. Machine Learning, 21(3), 269–304.
  7. Haussler, D., Littlestone, N., & Warmuth, M. (1994). Predicting {0,1}-functions on randomly drawn points. Inf. Comput., 115(2), 248–292.
  8. Helmbold, D., Sloan, R., & Warmuth, M. (1990). Learning nested differences of intersection-closed concept classes. Machine Learning, 5, 165–196.
  9. Sauer, N. (1972). On the density of families of sets. J. Combin. Theory Ser. A, 13, 145–147.

Copyright information

© Springer Science + Business Media, LLC 2007

Authors and Affiliations

  1. Department Mathematik und Informationstechnologie, Montanuniversität Leoben, 8700 Leoben, Austria
