
Machine Learning, Volume 51, Issue 3, pp 217–238

Boosting and Hard-Core Set Construction

  • Adam R. Klivans
  • Rocco A. Servedio

Abstract

This paper connects hard-core set construction, a type of hardness amplification from computational complexity, and boosting, a technique from computational learning theory. Using this connection we give fruitful applications of complexity-theoretic techniques to learning theory and vice versa. We show that the hard-core set construction of Impagliazzo (1995), which establishes the existence of distributions under which boolean functions are highly inapproximable, may be viewed as a boosting algorithm. Using alternate boosting methods we give an improved bound for hard-core set construction which matches known lower bounds from boosting and thus is optimal within this class of techniques. We then show how to apply techniques from Impagliazzo (1995) to give a new version of Jackson's celebrated Harmonic Sieve algorithm for learning DNF formulae under the uniform distribution using membership queries. Our new version has a significant asymptotic improvement in running time. Critical to our arguments is a careful analysis of the distributions which are employed in both boosting and hard-core set constructions.
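The boosting framework the abstract refers to maintains a distribution over examples, repeatedly reweighting it toward the examples the current hypotheses get wrong; the paper's observation is that the "hard" distribution such a booster converges on plays the role of a hard-core distribution. As an illustration only, here is a minimal AdaBoost-style boosting loop (in the spirit of Freund & Schapire, 1997) with a simple one-dimensional threshold weak learner; the names `boost` and `stump_learner` are illustrative, and this is not the specific construction of Impagliazzo (1995) or the Harmonic Sieve:

```python
import math

def boost(examples, labels, weak_learner, rounds):
    """Generic boosting loop: maintain a distribution over the examples,
    call the weak learner on the reweighted sample each round, and
    combine the weak hypotheses by a weighted majority vote."""
    n = len(examples)
    dist = [1.0 / n] * n          # start from the uniform distribution
    hypotheses = []
    for _ in range(rounds):
        h = weak_learner(examples, labels, dist)
        # weighted error of h under the current distribution
        err = sum(d for d, x, y in zip(dist, examples, labels) if h(x) != y)
        err = min(max(err, 1e-10), 1 - 1e-10)   # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        hypotheses.append((alpha, h))
        # reweight: misclassified ("hard") examples gain probability mass
        dist = [d * math.exp(-alpha if h(x) == y else alpha)
                for d, x, y in zip(dist, examples, labels)]
        z = sum(dist)
        dist = [d / z for d in dist]
    def final(x):
        return 1 if sum(a * h(x) for a, h in hypotheses) >= 0 else -1
    return final

def stump_learner(examples, labels, dist):
    """Weak learner: the best threshold classifier on 1-D inputs."""
    best, best_err = None, 2.0
    for t in set(examples):
        for sign in (1, -1):
            h = lambda x, t=t, s=sign: s if x >= t else -s
            err = sum(d for d, x, y in zip(dist, examples, labels)
                      if h(x) != y)
            if err < best_err:
                best, best_err = h, err
    return best

# Toy data: labels flip at the threshold x >= 4.
xs = [1, 2, 3, 4, 5, 6]
ys = [-1, -1, -1, 1, 1, 1]
H = boost(xs, ys, stump_learner, rounds=5)
```

The distribution `dist` that the loop concentrates on misclassified examples is the object shared by both settings: in boosting it forces the weak learner to make progress on hard examples, while in hard-core set construction it exhibits a distribution on which the function is highly inapproximable.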

Keywords: boosting, hard-core set construction, computational complexity

References

  1. Babai, L., Fortnow, L., Nisan, N., & Wigderson, A. (1993). BPP has subexponential time simulations unless EXPTIME has publishable proofs. Computational Complexity, 3, 307–318.
  2. Blum, A., Furst, M., Jackson, J., Kearns, M., Mansour, Y., & Rudich, S. (1994). Weakly learning DNF and characterizing statistical query learning using Fourier analysis. In Proceedings of the Twenty-Sixth Annual Symposium on Theory of Computing (pp. 253–262). ACM.
  3. Blumer, A., Ehrenfeucht, A., Haussler, D., & Warmuth, M. (1989). Learnability and the Vapnik-Chervonenkis dimension. Journal of the ACM, 36:4, 929–965.
  4. Boneh, D., & Lipton, R. (1993). Amplification of weak learning over the uniform distribution. In Proceedings of the Sixth Annual Workshop on Computational Learning Theory (pp. 347–351). ACM.
  5. Bshouty, N., Jackson, J., & Tamon, C. (1999). More efficient PAC learning of DNF with membership queries under the uniform distribution. In Proceedings of the Twelfth Annual Conference on Computational Learning Theory (pp. 286–295).
  6. Drucker, H., & Cortes, C. (1996). Boosting decision trees. In Advances in Neural Information Processing Systems 8 (pp. 479–485).
  7. Drucker, H., Cortes, C., Jackel, L. D., LeCun, Y., & Vapnik, V. (1994). Boosting and other ensemble methods. Neural Computation, 6:6, 1289–1301.
  8. Drucker, H., Schapire, R., & Simard, P. (1993a). Boosting performance in neural networks. International Journal of Pattern Recognition and Artificial Intelligence, 7:4, 705–719.
  9. Drucker, H., Schapire, R., & Simard, P. (1993b). Improving performance in neural networks using a boosting algorithm. In Advances in Neural Information Processing Systems 5 (pp. 42–49).
  10. Freund, Y. (1990). Boosting a weak learning algorithm by majority. In Proceedings of the Third Annual Workshop on Computational Learning Theory (pp. 202–216).
  11. Freund, Y. (1992). An improved boosting algorithm and its implications on learning complexity. In Proceedings of the Fifth Annual Workshop on Computational Learning Theory (pp. 391–398).
  12. Freund, Y. (1995). Boosting a weak learning algorithm by majority. Information and Computation, 121:2, 256–285.
  13. Freund, Y., & Schapire, R. (1997). A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55:1, 119–139.
  14. Goldreich, O., Nisan, N., & Wigderson, A. (1995). On Yao's XOR-lemma. Electronic Colloquium on Computational Complexity, TR95-050.
  15. Impagliazzo, R. (1995). Hard-core distributions for somewhat hard problems. In Proceedings of the Thirty-Sixth Annual Symposium on Foundations of Computer Science (pp. 538–545). IEEE.
  16. Impagliazzo, R., & Wigderson, A. (1997). P = BPP if E requires exponential circuits: Derandomizing the XOR lemma. In Proceedings of the Twenty-Ninth Annual Symposium on Theory of Computing (pp. 220–229).
  17. Jackson, J. (1995). The Harmonic sieve: A novel application of Fourier analysis to machine learning theory and practice. Ph.D. Thesis, Carnegie Mellon University.
  18. Jackson, J. (1997). An efficient membership-query algorithm for learning DNF with respect to the uniform distribution. Journal of Computer and System Sciences, 55, 414–440.
  19. Jackson, J. (2002). Personal communication.
  20. Jackson, J., & Craven, M. (1996). Learning sparse perceptrons. In Advances in Neural Information Processing Systems 8 (pp. 654–660).
  21. Kearns, M., & Valiant, L. (1994). Cryptographic limitations on learning boolean formulae and finite automata. Journal of the ACM, 41:1, 67–95.
  22. Klivans, A., & Servedio, R. (2001). Learning DNF in time \(2^{\tilde{O}(n^{1/3})}\). In Proceedings of the Thirty-Third Annual Symposium on Theory of Computing (pp. 258–265).
  23. Levin, L. (1986). Average case complete problems. SIAM Journal on Computing, 15:1, 285–286.
  24. Muller, D., & Preparata, F. (1975). Bounds to complexities of networks for sorting and for switching. Journal of the ACM, 22:2, 195–201.
  25. Nisan, N., & Wigderson, A. (1994). Hardness versus randomness. Journal of Computer and System Sciences, 49, 149–167.
  26. Schapire, R. (1990). The strength of weak learnability. Machine Learning, 5:2, 197–227.
  27. Schapire, R., & Singer, Y. (1998). Improved boosting algorithms using confidence-rated predictions. In Proceedings of the Eleventh Annual Conference on Computational Learning Theory (pp. 80–91).
  28. Servedio, R. (2001). Smooth boosting and learning with malicious noise. In Proceedings of the Fourteenth Annual Conference on Computational Learning Theory (pp. 473–489).
  29. Shaltiel, R. (2001). Towards proving strong direct product theorems. In Proceedings of the Sixteenth Conference on Computational Complexity (pp. 107–117).
  30. Sudan, M., Trevisan, L., & Vadhan, S. (2001). Pseudorandom generators without the XOR lemma. Journal of Computer and System Sciences, 62:2, 236–266.
  31. Valiant, L. (1984). A theory of the learnable. Communications of the ACM, 27:11, 1134–1142.
  32. Wigderson, A. (1999). Personal communication.

Copyright information

© Kluwer Academic Publishers 2003

Authors and Affiliations

  • Adam R. Klivans (1)
  • Rocco A. Servedio (2)
  1. Laboratory for Computer Science, MIT, Cambridge, USA
  2. Division of Engineering and Applied Sciences, Harvard University, Cambridge, USA
