Optimal layered learning: A PAC approach to incremental sampling

  • Stephen Muggleton
Invited Papers
Part of the Lecture Notes in Computer Science book series (LNCS, volume 744)


It is best to learn a large theory in small pieces. An approach called “layered learning” starts by learning an approximately correct theory. The errors of this approximation are then used to construct a second-order “correcting” theory, which will again be only approximately correct. The process is iterated until some desired level of overall theory accuracy is met. The main advantage of this approach is that the sizes of successive training sets (errors of the hypothesis from the last iteration) are kept low. General lower-bound PAC-learning results are used in this paper to show that optimal layered learning results in the total training set size (t) increasing linearly in the number of layers. Meanwhile the total training and test set size (m) increases exponentially and the error (e) decreases exponentially. As a consequence, a model of layered learning which requires that t, rather than m, be a polynomial function of the logarithm of the concept space would make learnable many concept classes which are not learnable in Valiant's PAC model.
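The three growth rates stated above can be illustrated with a small numeric sketch. It is not the paper's derivation. It assumes, purely for illustration, that every layer trains on the same number of examples (`per_layer_train`) and divides the overall error by the same integer factor (`error_reduction`), and that collecting the errors of the previous hypothesis requires drawing about `per_layer_train / error` examples. The function name and both parameters are hypothetical.

```python
def layered_learning_sizes(layers, per_layer_train=100, error_reduction=10):
    """Tabulate (layer, t, m, e) under simplifying, illustrative assumptions.

    t: total training set size over all layers so far
    m: total number of examples drawn (training plus test) so far
    e: overall error after this layer
    """
    rows = []
    total_train = 0   # t
    total_drawn = 0   # m
    for k in range(1, layers + 1):
        # Assumption: before layer k the overall error is error_reduction**-(k-1),
        # so finding per_layer_train misclassified examples takes roughly
        # per_layer_train * error_reduction**(k-1) draws.
        drawn = per_layer_train * error_reduction ** (k - 1)
        total_train += per_layer_train
        total_drawn += drawn
        # Assumption: each layer divides the overall error by error_reduction.
        overall_error = error_reduction ** (-k)
        rows.append((k, total_train, total_drawn, overall_error))
    return rows


if __name__ == "__main__":
    for k, t, m, e in layered_learning_sizes(4):
        print(f"layer {k}: t={t}  m={m}  e={e:g}")
```

Under these assumptions t grows by a constant amount per layer, while m grows and e shrinks by a constant factor per layer, which matches the linear and exponential behaviour described in the abstract.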




  1. M. Bain. Experiments in non-monotonic first-order induction. In Proceedings of the Eighth International Machine Learning Workshop, San Mateo, CA, 1991. Morgan-Kaufmann.
  2. M. Bain and S. Muggleton. Non-monotonic learning. In D. Michie, editor, Machine Intelligence 12. Oxford University Press, 1991.
  3. A. Ehrenfeucht, D. Haussler, M. Kearns, and L. Valiant. A general lower bound on the number of examples needed for learning. In COLT 88: Proceedings of the Conference on Learning Theory, pages 110–120, San Mateo, CA, 1988. Morgan-Kaufmann.
  4. S. Muggleton. Inductive Logic Programming. New Generation Computing, 8(4):295–318, 1991.
  5. J.R. Quinlan. Discovering rules from large collections of examples: a case study. In D. Michie, editor, Expert Systems in the Micro-electronic Age, pages 168–201. Edinburgh University Press, Edinburgh, 1979.
  6. L. Valiant. A theory of the learnable. Communications of the ACM, 27(11):1134–1142, 1984.
  7. S. Wrobel. On the proper definition of minimality in specialization and theory revision. In P. Brazdil, editor, EWSL-93, pages 65–82, Berlin, 1993. Springer-Verlag.

Copyright information

© Springer-Verlag Berlin Heidelberg 1993

Authors and Affiliations

  • Stephen Muggleton, Oxford University Computing Laboratory, Oxford, UK
