## Abstract

A central problem in inductive logic programming is theory evaluation. Without a preference criterion, any two theories that explain a set of examples are equally acceptable. This paper presents a scheme for evaluating alternative inductive theories based on an objective preference criterion. The scheme strives to extract maximal redundancy from examples, transforming structure into randomness. A major strength of the method is its application to learning problems where negative examples of concepts are scarce or unavailable. A new measure called *model complexity* is introduced, and its use is illustrated and compared with a *proof complexity* measure on relational learning tasks. The complementarity of model and proof complexity parallels that of model-theoretic and proof-theoretic semantics. Model complexity, where applicable, seems to be an appropriate measure for evaluating inductive logic theories.
