Machine Learning

Volume 16, Issue 3, pp 203–225

Complexity-Based Induction

  • Darrell Conklin
  • Ian H. Witten


A central problem in inductive logic programming is theory evaluation. Without a preference criterion, any two theories that explain a set of examples are equally acceptable. This paper presents a scheme for evaluating alternative inductive theories based on an objective preference criterion. The scheme strives to extract maximal redundancy from examples, transforming structure into randomness. A major strength of the method is that it applies to learning problems where negative examples of concepts are scarce or unavailable. A new measure called model complexity is introduced, and its use is illustrated and compared with a proof complexity measure on relational learning tasks. The complementarity of model and proof complexity parallels that of model-theoretic and proof-theoretic semantics. Model complexity, where applicable, seems to be an appropriate measure for evaluating inductive logic theories.
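The compression idea behind such a preference criterion can be illustrated with a generic two-part description length. The sketch below is not the paper's definition of model complexity or proof complexity; it assumes a hypothetical setting in which a theory costs `theory_bits` to state, admits `model_size` ground facts, and the `num_examples` positive examples are identified as a subset of those facts under a uniform code. The function name and all numbers are invented for illustration.

```python
from math import comb, log2


def two_part_length(theory_bits, model_size, num_examples):
    """Bits to state a theory plus bits to pick the examples out of its model.

    Illustrative assumption: the examples are a subset of the model_size
    ground facts the theory admits, encoded with a uniform code over subsets.
    """
    if num_examples > model_size:
        raise ValueError("theory does not cover all examples")
    # Choosing which num_examples of the model_size facts were observed
    # costs log2(C(model_size, num_examples)) bits.
    return theory_bits + log2(comb(model_size, num_examples))


# Hypothetical comparison with 20 positive examples and no negatives.
# An over-general theory is cheap to state but admits many unobserved facts.
general = two_part_length(theory_bits=10, model_size=1000, num_examples=20)
# A tighter theory costs more to state but admits few unobserved facts.
specific = two_part_length(theory_bits=40, model_size=25, num_examples=20)

preferred = "specific" if specific < general else "general"
print(preferred)
```

Under these assumed numbers the tighter theory gives the shorter total encoding, so an over-general theory is penalized even though no negative examples are available.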

Inductive logic programming, data compression, minimum description length principle, model complexity, learning from positive-only examples, theory preference criterion



Copyright information

© Kluwer Academic Publishers 1994

Authors and Affiliations

  • Darrell Conklin (1)
  • Ian H. Witten (2)
  1. Department of Computing and Information Science, Queen's University, Kingston, Canada
  2. Department of Computer Science, University of Waikato, Hamilton, New Zealand
