Toward efficient agnostic learning

Kearns, Michael J.; Schapire, Robert E.; Sellie, Linda M.

doi:10.1007/BF00993468

Published: November 1994

Machine Learning, Volume 17, pages 115–141 (1994)

Michael J. Kearns¹,
Robert E. Schapire¹ &
Linda M. Sellie²

Abstract

In this paper we initiate an investigation of generalizations of the Probably Approximately Correct (PAC) learning model that attempt to significantly weaken the target function assumptions. The ultimate goal in this direction is informally termed agnostic learning, in which we make virtually no assumptions on the target function. The name derives from the fact that as designers of learning algorithms, we give up the belief that Nature (as represented by the target function) has a simple or succinct explanation. We give a number of positive and negative results that provide an initial outline of the possibilities for agnostic learning. Our results include hardness results for the most obvious generalization of the PAC model to an agnostic setting, an efficient and general agnostic learning method based on dynamic programming, relationships between loss functions for agnostic learning, and an algorithm for a learning problem that involves hidden variables.

Author information

Authors and Affiliations

AT&T Bell Laboratories, 600 Mountain Avenue, Murray Hill, NJ 07974-0636
Michael J. Kearns & Robert E. Schapire
Department of Computer Science, University of Chicago, Chicago, IL 60637
Linda M. Sellie

About this article

Cite this article

Kearns, M.J., Schapire, R.E. & Sellie, L.M. Toward efficient agnostic learning. Mach Learn 17, 115–141 (1994). https://doi.org/10.1007/BF00993468

Received: 25 January 1993
Accepted: 15 October 1993
Issue Date: November 1994
DOI: https://doi.org/10.1007/BF00993468
