Towards representation independence in PAC learning

  • Manfred K. Warmuth
Invited Lectures
Part of the Lecture Notes in Computer Science book series (LNCS, volume 397)

Abstract

In the recent development of various models of learning inspired by the PAC learning model (introduced by Valiant), there has been a trend towards models that are as representation independent as possible. We review this development and discuss the advantages of representation independence. Motivated by this research in learning, we propose a framework for studying the combinatorial properties of representations.
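
As a concrete illustration of the PAC criterion the abstract refers to (our addition, not part of the lecture), the following Python sketch learns an axis-aligned rectangle in the plane from random labeled examples, the textbook instance of Valiant's model [40]: the tightest rectangle around the positive points has error at most eps with probability at least 1 - delta, given O((1/eps) log(1/delta)) i.i.d. examples. All function and variable names here are illustrative, not from the paper.

    import random

    # Hypothetical illustration (not from the lecture): PAC-learning an
    # axis-aligned rectangle from random labeled examples, in the sense
    # of Valiant's model [40].

    def in_rect(point, rect):
        """Label a point positive iff it lies inside the rectangle."""
        x, y = point
        x_lo, x_hi, y_lo, y_hi = rect
        return x_lo <= x <= x_hi and y_lo <= y <= y_hi

    def learn_rectangle(sample):
        """Return the tightest rectangle enclosing the positive examples.

        With m = O((1/eps) * log(1/delta)) examples drawn i.i.d. from any
        fixed distribution, this hypothesis has error at most eps with
        probability at least 1 - delta.
        """
        pos = [p for p, label in sample if label]
        if not pos:
            return None  # empty hypothesis: classify everything negative
        xs = [x for x, _ in pos]
        ys = [y for _, y in pos]
        return (min(xs), max(xs), min(ys), max(ys))

    # Usage: 500 uniform examples labeled by an unknown target rectangle.
    target = (0.2, 0.7, 0.3, 0.9)
    sample = []
    for _ in range(500):
        p = (random.random(), random.random())
        sample.append((p, in_rect(p, target)))
    print(learn_rectangle(sample))  # approximates the target rectangle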


References

  1. N. Abe. Polynomial learnability as a formal model of natural language acquisition. Ph.D. thesis, in preparation, Department of Computer and Information Science, University of Pennsylvania, 1989.
  2. A. Aho, J. Hopcroft, and J. Ullman. The Design and Analysis of Computer Algorithms. Addison-Wesley, London, 1974.
  3. D. Angluin. Queries and concept learning. Machine Learning, 2:319–342, 1987.
  4. D. Angluin and C. H. Smith. Inductive inference: theory and methods. Computing Surveys, 15(3):237–269, 1983.
  5. J. M. Barzdin. Prognostication of automata and functions, pages 81–84. Elsevier North-Holland, New York, 1972.
  6. J. M. Barzdin and R. V. Freivald. On the prediction of general recursive functions. Soviet Mathematics Doklady, 13:1224–1228, 1972.
  7. A. Blum and R. Rivest. Training a 3-node neural network is NP-complete. In Proceedings of the 1988 Workshop on Computational Learning Theory, Morgan Kaufmann, San Mateo, CA, August 1988.
  8. L. Blum and M. Blum. Toward a mathematical theory of inductive inference. Inform. Contr., 28:125–155, 1975.
  9. A. Blumer, A. Ehrenfeucht, D. Haussler, and M. K. Warmuth. Learnability and the Vapnik-Chervonenkis dimension. Technical Report UCSC-CRL-87-20, Department of Computer and Information Sciences, University of California, Santa Cruz, November 1987. To appear, J. ACM.
  10. T. Cover. Geometrical and statistical properties of systems of linear inequalities with applications to pattern recognition. IEEE Trans. Elect. Comp., 14:326–334, 1965.
  11. R. M. Dudley. A Course on Empirical Processes. Springer-Verlag, New York, 1984. Lecture Notes in Mathematics No. 1097.
  12. A. Ehrenfeucht and D. Haussler. Learning Decision Trees from Random Examples. Technical Report UCSC-CRL-87-15, Department of Computer and Information Sciences, University of California, Santa Cruz, October 1987.
  13. A. Ehrenfeucht, D. Haussler, M. Kearns, and L. G. Valiant. A general lower bound on the number of examples needed for learning. In Proceedings of the 1988 Workshop on Computational Learning Theory, pages 139–154, Morgan Kaufmann, San Mateo, CA, August 1988.
  14. S. I. Gallant. Representability theory. Unpublished manuscript, 1988.
  15. J. Gill. Probabilistic Turing machines. SIAM J. Computing, 6(4):675–695, 1977.
  16. O. Goldreich, S. Goldwasser, and S. Micali. How to construct random functions. J. ACM, 33(4):792–807, 1986.
  17. O. Goldreich, H. Krawczyk, and M. Luby. On the existence of pseudorandom generators. In Proceedings of the 29th Annual IEEE Symposium on Foundations of Computer Science, pages 12–24, IEEE Computer Society Press, Washington, D.C., October 1988.
  18. D. Haussler. Generalizing the PAC model: sample size bounds from metric dimension-based uniform convergence results. In Proceedings of the 30th Annual IEEE Symposium on Foundations of Computer Science, IEEE Computer Society Press, Washington, D.C., October 1989.
  19. D. Haussler. Learning Conjunctive Concepts in Structural Domains. Technical Report UCSC-CRL-87-01, Department of Computer and Information Sciences, University of California, Santa Cruz, February 1987. To appear in Machine Learning.
  20. D. Haussler, M. Kearns, N. Littlestone, and M. K. Warmuth. Equivalence of models for polynomial learnability. In Proceedings of the 1988 Workshop on Computational Learning Theory, pages 42–55, Morgan Kaufmann, San Mateo, CA, August 1988.
  21. D. Haussler and G. Pagallo. Feature discovery in empirical learning. Technical Report UCSC-CRL-88-08, Department of Computer and Information Sciences, University of California, Santa Cruz, October 1987. Revised May 1, 1989.
  22. D. Haussler and E. Welzl. Epsilon-nets and simplex range queries. Discrete & Computational Geometry, 2:127–151, 1987.
  23. D. Helmbold, R. Sloan, and M. K. Warmuth. Learning nested differences of intersection closed concept classes. In Proceedings of the 1989 Workshop on Computational Learning Theory, Morgan Kaufmann, San Mateo, CA, August 1989.
  24. J. Hopcroft and J. Ullman. Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, London, 1979.
  25. M. Kearns, M. Li, L. Pitt, and L. G. Valiant. On the learnability of Boolean formulae. In Proceedings of the 19th Annual ACM Symposium on Theory of Computing, Assoc. Comp. Mach., New York, May 1987.
  26. M. Kearns and L. Pitt. A polynomial-time algorithm for learning k-variable pattern languages from examples. In Proceedings of the 30th Annual IEEE Symposium on Foundations of Computer Science, IEEE Computer Society Press, Washington, D.C., October 1989.
  27. M. Kearns and L. G. Valiant. Cryptographic limitations on learning Boolean formulae and finite automata. In Proceedings of the 21st Annual ACM Symposium on Theory of Computing, pages 433–444, Assoc. Comp. Mach., New York, May 1989.
  28. L. A. Levin. One-way functions and pseudorandom generators. Combinatorica, 7(4):357–363, 1987.
  29. J. Lin and J. Vitter. Complexity issues in learning neural nets. In Proceedings of the 1989 Workshop on Computational Learning Theory, Morgan Kaufmann, San Mateo, CA, July 1989.
  30. N. Littlestone. Learning quickly when irrelevant attributes abound: a new linear-threshold algorithm. Machine Learning, 2:285–318, 1987.
  31. B. K. Natarajan. Some Results on Learning. Technical Report CMU-RI-TR-89-6, Robotics Institute, Carnegie-Mellon University, 1989.
  32. L. Pitt and L. G. Valiant. Computational limitations on learning from examples. J. ACM, 35(4):965–984, 1988.
  33. L. Pitt and M. K. Warmuth. Reductions among prediction problems: on the difficulty of predicting automata. In Proceedings of the 3rd Annual IEEE Conference on Structure in Complexity Theory, pages 60–69, IEEE Computer Society Press, Washington, D.C., June 1988.
  34. K. M. Podnieks. Comparing various concepts of function prediction, part 1. Latv. Gosudarst. Univ. Uch. Zapiski, 210:68–81, 1974. (In Russian.)
  35. K. M. Podnieks. Comparing various concepts of function prediction, part 2. Latv. Gosudarst. Univ. Uch. Zapiski, 233:33–44, 1975. (In Russian.)
  36. D. Pollard. Convergence of Stochastic Processes. Springer-Verlag, New York, 1984.
  37. R. L. Rivest. Learning decision lists. Machine Learning, 2:229–246, 1987.
  38. R. Schapire. The strength of weak learnability. In Proceedings of the 1989 Workshop on Computational Learning Theory, Morgan Kaufmann, San Mateo, CA, August 1989.
  39. L. Valiant and M. K. Warmuth. Predicting symmetric differences of two halfspaces reduces to predicting halfspaces. Unpublished manuscript, 1989.
  40. L. G. Valiant. A theory of the learnable. Comm. Assoc. Comp. Mach., 27(11):1134–1142, 1984.
  41. V. N. Vapnik. Estimation of Dependences Based on Empirical Data. Springer-Verlag, New York, 1982.
  42. A. C. Yao. Theory and applications of trapdoor functions. In Proceedings of the 23rd Annual IEEE Symposium on Foundations of Computer Science, pages 80–91, IEEE Computer Society Press, Washington, D.C., November 1982.

Copyright information

© Springer-Verlag Berlin Heidelberg 1989

Authors and Affiliations

  • Manfred K. Warmuth
    Department of Computer and Information Sciences, University of California, Santa Cruz
