Machine Learning, Volume 4, Issue 1, pp 7–40

Learning Conjunctive Concepts in Structural Domains

  • David Haussler

Abstract

We study the problem of learning conjunctive concepts from examples on structural domains such as the blocks world. This class of concepts is formally defined, and it is shown that even for samples in which each example (positive or negative) is a two-object scene, it is NP-complete to determine whether there is any concept in this class that is consistent with the sample. We demonstrate how this result affects the feasibility of Mitchell's version space approach, and how it implies that this class of concepts is unlikely to be polynomially learnable from random examples alone in Valiant's PAC framework. On the other hand, we show that for any fixed bound on the number of objects per scene, this class is polynomially learnable if, in addition to receiving random examples, the learning algorithm is allowed to make subset queries. In establishing this result, we calculate the capacity of the hypothesis space of conjunctive concepts in a structural domain and use a general theorem of Vapnik and Chervonenkis. This latter result can also be used to estimate a sample size sufficient for heuristic learning techniques that do not use queries.
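
To make the setting concrete, here is a minimal sketch of what such a concept looks like and how it is tested against a scene. This is an illustrative reconstruction, not the paper's formalism: the dictionary encoding and the `matches` function are hypothetical names of our own. An existential conjunctive concept constrains object attributes and binary relations between objects, and matching it against a scene means searching over variable-to-object bindings.

```python
import itertools

# Illustrative encodings (our own, not the paper's notation): a scene is a
# set of objects with attribute values plus binary relations given as sets
# of ordered pairs of object ids.
scene_pos = {
    "objects": {"a": {"shape": "cube", "size": "small"},
                "b": {"shape": "pyramid", "size": "large"}},
    "on_top_of": {("a", "b")},
}
scene_neg = {
    "objects": {"c": {"shape": "cube", "size": "small"},
                "d": {"shape": "cube", "size": "large"}},
    "on_top_of": set(),
}

# A conjunctive concept over existentially quantified variables: per-variable
# attribute constraints plus required relations between variables.
concept = {
    "vars": ["x", "y"],
    "attrs": {"x": {"shape": "cube"}, "y": {"shape": "pyramid"}},
    "relations": {"on_top_of": {("x", "y")}},
}

def matches(concept, scene):
    """True iff some one-to-one binding of variables to scene objects
    satisfies every attribute and relation constraint of the concept."""
    objs = list(scene["objects"])
    # Brute force over all bindings: the number of candidate bindings grows
    # factorially with scene size, which is the combinatorial core of the
    # hardness results summarized in the abstract.
    for binding in itertools.permutations(objs, len(concept["vars"])):
        env = dict(zip(concept["vars"], binding))
        attrs_ok = all(scene["objects"][env[v]].get(a) == val
                       for v, cs in concept["attrs"].items()
                       for a, val in cs.items())
        rels_ok = all((env[u], env[v]) in scene.get(rel, set())
                      for rel, pairs in concept["relations"].items()
                      for (u, v) in pairs)
        if attrs_ok and rels_ok:
            return True
    return False

print(matches(concept, scene_pos))  # True:  bind x -> a, y -> b
print(matches(concept, scene_neg))  # False: no pyramid, no on_top_of pair
```

The brute-force binding search is affordable here only because each scene has two objects; this is why the positive result above fixes a bound on the number of objects per scene, while the consistency problem over unrestricted samples is NP-complete.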

Keywords: concept learning, PAC learning, empirical learning, VC dimension, version spaces, structural domains, conjunctive concepts

References

  1. Angluin, D. (1988). Queries and concept learning. Machine Learning, 2, 319-342.
  2. Baum, E. & Haussler, D. (1989). What size net gives valid generalization? Neural Computation, 1, 151-160.
  3. Buntine, W. Induction of Horn clauses: Methods and the plausible generalization algorithm (Technical Report). New South Wales, Australia: New South Wales Institute of Technology, Department of Computer Science.
  4. Blumer, A., Ehrenfeucht, A., Haussler, D. & Warmuth, M. (1989). Learnability and the Vapnik-Chervonenkis dimension. Journal of the ACM, 36.
  5. Borgida, A., Mitchell, T. & Williamson, K. Learning improved integrity constraints and schemas from exceptions in data and knowledge bases. In M. Brodie & J. Mylopoulos (Eds.), On knowledge base management systems. New York: Springer-Verlag.
  6. Bundy, A., Silver, B. & Plummer, D. (1985). An analytical comparison of some rule-learning programs. Artificial Intelligence, 27, 137-181.
  7. Cohen, P. & Feigenbaum, E. (1982). Handbook of artificial intelligence (Vol. 3). William Kaufmann.
  8. Dietterich, T. G. & Michalski, R. S. (1983). A comparative review of selected methods for learning from examples. In Machine learning: An artificial intelligence approach. Palo Alto, CA: Tioga Press.
  9. Duda, R. & Hart, P. (1973). Pattern classification and scene analysis. New York: John Wiley & Sons.
  10. Ehrenfeucht, A., Haussler, D., Kearns, M. & Valiant, L. (1989). A general lower bound on the number of examples needed for learning. Information and Computation, 82.
  11. Garey, M. & Johnson, D. (1979). Computers and intractability: A guide to the theory of NP-completeness. San Francisco, CA: W. H. Freeman.
  12. Gill, J. (1977). Probabilistic Turing machines. SIAM Journal on Computing, 6, 675-695.
  13. Haussler, D. (1987). Learning conjunctive concepts in structural domains (Technical Report UCSC-CRL-87-01). Santa Cruz, CA: University of California.
  14. Haussler, D. (1988). Quantifying inductive bias: AI learning algorithms and Valiant's learning framework. Artificial Intelligence, 36, 177-221.
  15. Haussler, D., Kearns, M., Littlestone, N. & Warmuth, M. (1988). Equivalence of models for polynomial learnability (Technical Report UCSC-CRL-88-06). Santa Cruz, CA: University of California.
  16. Haussler, D. & Welzl, E. (1987). Epsilon-nets and simplex range queries. Discrete and Computational Geometry, 2, 127-151.
  17. Hayes-Roth, F. & McDermott, J. (1978). An interference matching technique for inducing abstractions. Communications of the ACM, 21, 401-410.
  18. Kearns, M., Li, M., Pitt, L. & Valiant, L. (1987). Recent results in Boolean concept learning. Proceedings of the Fourth International Workshop on Machine Learning (pp. 337-352). Irvine, CA.
  19. Kodratoff, Y. & Ganascia, J. (1986). Improving the generalization step in learning. In R. Michalski, J. Carbonell & T. Mitchell (Eds.), Machine learning II. Los Altos, CA: Morgan Kaufmann.
  20. Knapman, J. (1978). A critical review of Winston's learning structural descriptions from examples. AISB Quarterly, 31, 319-320.
  21. Michalski, R. S. (1983). A theory and methodology of inductive learning. In Machine learning: An artificial intelligence approach. Palo Alto, CA: Tioga Press.
  22. Mitchell, T. M. (1980). The need for biases in learning generalizations (Technical Report CBM-TR-117). New Brunswick, NJ: Rutgers University, Department of Computer Science.
  23. Mitchell, T. M. (1982). Generalization as search. Artificial Intelligence, 18, 203-226.
  24. Mitchell, T. M., Keller, R. M. & Kedar-Cabelli, S. T. (1986). Explanation-based generalization: A unifying view. Machine Learning, 1, 47-80.
  25. Muggleton, S. & Buntine, W. (1988). Machine invention of first-order predicates by inverting resolution. Proceedings of the Fifth International Conference on Machine Learning (pp. 339-352). Ann Arbor, MI.
  26. Natarajan, B. K. (1989). On learning sets and functions. Machine Learning, 4, 67-97.
  27. Pearl, J. (1978). On the connection between the complexity and credibility of inferred models. International Journal of General Systems, 4, 255-264.
  28. Pitt, L. & Valiant, L. G. (1988). Computational limitations on learning from examples. Journal of the ACM, 35, 965-984.
  29. Sammut, C. & Banerji, R. (1986). Learning concepts by asking questions. In R. Michalski, J. Carbonell & T. Mitchell (Eds.), Machine learning II. Los Altos, CA: Morgan Kaufmann.
  30. Stepp, R. (1987). Machine learning from structured objects. Proceedings of the Fourth International Workshop on Machine Learning (pp. 353-363). Irvine, CA.
  31. Subramanian, D. & Feigenbaum, J. (1986). Factorization in experiment generation. Proceedings of AAAI-86 (pp. 518-522). Philadelphia, PA.
  32. Utgoff, P. (1986). Shift of bias for inductive concept learning. In R. Michalski, J. Carbonell & T. Mitchell (Eds.), Machine learning II. Los Altos, CA: Morgan Kaufmann.
  33. Valiant, L. G. (1984). A theory of the learnable. Communications of the ACM, 27, 1134-1142.
  34. Valiant, L. G. (1985). Learning disjunctions of conjunctions. Proceedings of the Ninth International Joint Conference on Artificial Intelligence (pp. 560-566). Los Angeles, CA.
  35. Vapnik, V. N. (1982). Estimation of dependences based on empirical data. New York: Springer-Verlag.
  36. Vapnik, V. N. & Chervonenkis, A. (1971). On the uniform convergence of relative frequencies of events to their probabilities. Theory of Probability and its Applications, 16, 264-280.
  37. Vere, S. A. (1975). Induction of concepts in the predicate calculus. Proceedings of the Fourth International Joint Conference on Artificial Intelligence (pp. 281-287). Tbilisi, USSR.
  38. Winston, P. (1975). Learning structural descriptions from examples. In P. H. Winston (Ed.), The psychology of computer vision. New York: McGraw-Hill.
  39. Winston, P. (1984). Artificial intelligence. Reading, MA: Addison-Wesley.

Copyright information

© Kluwer Academic Publishers 1989

Authors and Affiliations

  1. Department of Computer Science, University of California, Santa Cruz, USA
