When Can Two Unsupervised Learners Achieve PAC Separation?

  • Conference paper
Computational Learning Theory (COLT 2001)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2111)

Abstract

In this paper we study a new restriction of the PAC learning framework, in which each label class is handled by an unsupervised learner that aims to fit an appropriate probability distribution to its own data. A hypothesis is derived by choosing, for any unlabeled instance, the label whose distribution assigns it the higher likelihood.
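
As a concrete illustration of this decision rule, here is a minimal sketch in Python, assuming one-dimensional real-valued data and Gaussian density fits; the density family, function names, and synthetic data are illustrative assumptions, not details from the paper.

```python
import numpy as np

def fit_gaussian(samples):
    """Unsupervised learner: fit a Gaussian to one label class's data.
    (An illustrative density family, not one prescribed by the paper.)"""
    return samples.mean(), samples.std(ddof=1)

def log_likelihood(x, mu, sigma):
    """Log density of N(mu, sigma^2) at the point x."""
    return -0.5 * np.log(2.0 * np.pi * sigma**2) - (x - mu)**2 / (2.0 * sigma**2)

def classify(x, model_pos, model_neg):
    """The separation rule from the abstract: give x the label whose
    fitted distribution assigns it the higher likelihood."""
    return 1 if log_likelihood(x, *model_pos) >= log_likelihood(x, *model_neg) else 0

# Each learner sees only its own label class (synthetic data for illustration).
rng = np.random.default_rng(0)
model_pos = fit_gaussian(rng.normal(+2.0, 1.0, size=200))   # class 1 sample
model_neg = fit_gaussian(rng.normal(-2.0, 1.0, size=200))   # class 0 sample
print(classify(1.5, model_pos, model_neg))                  # -> 1
```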

The motivation for the new learning setting is that the general approach of fitting separate distributions to each label class is often used in practice for classification problems. The set of probability distributions so obtained is more useful than a collection of decision boundaries. A question that arises, however, is whether it is ever more tractable (in terms of computational complexity or required sample size) to find a simple decision boundary than to divide the problem into separate unsupervised learning problems and find an appropriate distribution for each.

Within the framework, we give algorithms for learning various simple geometric concept classes. In the boolean domain we show how to learn parity functions, and functions having a constant upper bound on the number of relevant attributes. These results distinguish the new setting from various other well-known restrictions of PAC-learning. We give an algorithm for learning monomials over input vectors generated by an unknown product distribution. The main open problem is whether monomials (or any other concept class) distinguish learnability in this framework from standard PAC-learnability.
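
To see why parity functions are amenable to this setting, observe that the examples in each label class of a parity function lie in an affine subspace of GF(2)^n (all x with a·x = b for that class's value of b). An unsupervised learner can therefore fit the uniform distribution over the affine span of its own sample. The sketch below, encoding bit vectors as Python integers, is one plausible realisation of this idea; it is not necessarily the construction given in the paper.

```python
def reduce_vec(v, basis):
    """Reduce bit-vector v (an int) against a GF(2) basis stored as
    {pivot_bit: vector}; returns 0 iff v lies in the span of the basis."""
    while v:
        p = v.bit_length() - 1       # position of v's highest set bit
        if p not in basis:
            return v                 # v has a pivot the basis cannot clear
        v ^= basis[p]
    return 0

def fit_affine_span(samples):
    """Unsupervised learner for one label class of a parity function:
    store a base point plus a GF(2) basis of the XOR-differences."""
    base, basis = samples[0], {}
    for s in samples[1:]:
        r = reduce_vec(s ^ base, basis)
        if r:
            basis[r.bit_length() - 1] = r
    return base, basis

def log2_likelihood(x, base, basis):
    """Uniform distribution over the affine span: probability 2^-dim for
    points inside the span, zero (log-likelihood -inf) outside."""
    return -len(basis) if reduce_vec(x ^ base, basis) == 0 else float("-inf")

# Target concept: f(x) = x0 XOR x2 over GF(2)^3, bit vectors as ints.
positives = [0b001, 0b100, 0b011, 0b110]   # f = 1
negatives = [0b000, 0b101, 0b010, 0b111]   # f = 0
m1, m0 = fit_affine_span(positives), fit_affine_span(negatives)
x = 0b110                                   # f(x) = 0 XOR 1 = 1
print(log2_likelihood(x, *m1), log2_likelihood(x, *m0))  # -2 -inf -> label 1
```

Because the two affine subspaces are disjoint, at most one learner's distribution assigns a test point nonzero likelihood, so the likelihood comparison recovers the label exactly once each span has been learned.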

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Goldberg, P.W. (2001). When Can Two Unsupervised Learners Achieve PAC Separation?. In: Helmbold, D., Williamson, B. (eds) Computational Learning Theory. COLT 2001. Lecture Notes in Computer Science, vol 2111. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44581-1_20

  • DOI: https://doi.org/10.1007/3-540-44581-1_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-42343-0

  • Online ISBN: 978-3-540-44581-4
