Machine Learning, Volume 9, Issue 4, pp 349–372

A framework for average case analysis of conjunctive learning algorithms

  • Michael J. Pazzani
  • Wendy Sarrett

Abstract

We present an approach to modeling the average case behavior of learning algorithms. Our motivation is to predict the expected accuracy of learning algorithms as a function of the number of training examples. We apply this framework to a purely empirical learning algorithm (the one-sided algorithm for pure conjunctive concepts) and to an algorithm that combines empirical and explanation-based learning. The model is used to gain insight into the behavior of these algorithms on a series of problems. Finally, we evaluate how well the average case model performs when the training examples violate the assumptions of the model.
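The one-sided algorithm for pure conjunctive concepts mentioned in the abstract can be sketched as follows. This is a minimal illustration of the standard algorithm (start from the first positive example as the most specific hypothesis, then drop any literal contradicted by a later positive example); the data representation and function names here are illustrative assumptions, not the paper's own code.

```python
def one_sided_conjunctive(examples):
    """Learn a pure conjunctive concept from a stream of labeled examples.

    Each example is a (features, label) pair, where features maps attribute
    names to boolean values. Negative examples are ignored (one-sided).
    """
    hypothesis = None
    for features, label in examples:
        if not label:  # one-sided: only positive examples refine the hypothesis
            continue
        if hypothesis is None:
            hypothesis = dict(features)  # most specific hypothesis
        else:
            # keep only the literals that agree with this positive example
            hypothesis = {a: v for a, v in hypothesis.items()
                          if features.get(a) == v}
    return hypothesis

def predict(hypothesis, features):
    """Classify positive iff every remaining literal is satisfied."""
    return all(features.get(a) == v for a, v in hypothesis.items())
```

Under the model's assumptions (independently drawn examples), expected accuracy after n training examples can be estimated by running such a learner on samples and averaging its error on held-out data.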

Keywords

average-case analysis; combining empirical and analytical learning


Copyright information

© Kluwer Academic Publishers 1992

Authors and Affiliations

  • Michael J. Pazzani (1)
  • Wendy Sarrett (1)

  1. Department of Information and Computer Science, University of California, Irvine, Irvine, USA
