Abstract
This paper presents results on the probabilistic analysis of learning, and illustrates their applicability to settings such as connectionist networks. In particular, it concerns the learning of sets and functions from examples and background information. After a formal statement of the problem, theorems are given identifying conditions necessary and sufficient for efficient learning, with respect to measures of both information complexity and computational complexity. Intuitive interpretations of the definitions and theorems are provided.
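The information-complexity measures the abstract refers to are in the spirit of the standard sample-complexity bound for probably-approximately-correct learning of a finite hypothesis class. The sketch below is illustrative only, assuming that setting (a finite class, a consistent learner, accuracy parameter ε and confidence parameter δ); it is not one of the paper's theorems, and the function name is a placeholder.

```python
import math

def pac_sample_bound(epsilon: float, delta: float, hypothesis_count: int) -> int:
    """Number of random examples sufficient for any consistent learner
    over a finite hypothesis class to be (epsilon, delta)-accurate:
    with probability at least 1 - delta, every hypothesis consistent
    with the sample has true error at most epsilon."""
    return math.ceil((1.0 / epsilon) * (math.log(hypothesis_count) + math.log(1.0 / delta)))

# e.g. 1024 hypotheses, 10% error tolerance, 95% confidence
print(pac_sample_bound(0.1, 0.05, 1024))  # → 100
```

Note that the bound grows only logarithmically in the size of the hypothesis class, which is what makes learning from examples feasible even for very large concept classes.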
Natarajan, B.K. On learning sets and functions. Mach Learn 4, 67–97 (1989). https://doi.org/10.1007/BF00114804