Learning distributions by their density levels — A paradigm for learning without a teacher

  • Shai Ben-David
  • Michael Lindenbaum
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 904)


Can we learn from unlabeled examples? We consider here the unsupervised learning scenario in which the examples provided are not labeled (and are not necessarily all positive or all negative). The only information about their membership is disclosed to the learner indirectly, through the sampling distribution.

We view this problem as a restricted instance of the fundamental issue of inferring information about a probability distribution from the random samples it generates. We propose a framework, density-level learning, for acquiring some partial information about a distribution, and develop a model of unsupervised concept learning based on this framework.
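As a point of reference, and not the paper's formal definition (the precise model is given in the full text, not in this abstract), the natural object behind the title is the density level set of a distribution: for a density f and a threshold λ > 0,

    L_f(λ) = { x : f(x) ≥ λ }.

Under this reading, a density-level learner receives an unlabeled sample drawn according to f and must output a set that approximates L_f(λ), the region in which the distribution concentrates its mass.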

We investigate the basic features of these types of learning and provide lower and upper bounds on the sample complexity of these tasks. Our main result is that learnability of a class in this setting is equivalent to the finiteness of its VC-dimension. One direction of the proof reduces density-level learnability to PAC learnability, while the sufficiency condition is proved through the introduction of a generic learning algorithm.
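For calibration (this is the classical PAC sample complexity bound of Blumer, Ehrenfeucht, Haussler and Warmuth, not a bound stated in this abstract), a class of VC-dimension d is PAC learnable to accuracy ε with confidence 1 − δ from

    m(ε, δ) = O( (d/ε) log(1/ε) + (1/ε) log(1/δ) )

examples. Given the reduction mentioned above, one would expect the sample complexity of density-level learning to exhibit the same qualitative dependence on the VC-dimension; the paper's exact bounds appear in the full text.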


Keywords: learning theory, PAC, Vapnik-Chervonenkis dimension, ε-approximation, unsupervised learning





Copyright information

© Springer-Verlag Berlin Heidelberg 1995

Authors and Affiliations

  • Shai Ben-David, Computer Science Department, Technion, Haifa, Israel
  • Michael Lindenbaum, Computer Science Department, Technion, Haifa, Israel
