Abstract
We consider the problem of classification using a variant of the agnostic learning model in which the algorithm’s hypothesis is evaluated by comparison with hypotheses that do not classify all possible instances. Such hypotheses are formalized as functions from the instance space X to {0, *, 1}, where * is interpreted as “don’t know”. We provide a characterization of the sets of {0, *, 1}-valued functions that are learnable in this setting. Using a similar analysis, we improve on sufficient conditions for a class of real-valued functions to be agnostically learnable with a particular relative accuracy; in particular, we improve by a factor of two the scale at which scale-sensitive dimensions must be finite in order to imply learnability.
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
Cite this paper
Long, P.M. (2001). On Agnostic Learning with {0, *, 1}-Valued and Real-Valued Hypotheses. In: Helmbold, D., Williamson, B. (eds) Computational Learning Theory. COLT 2001. Lecture Notes in Computer Science, vol 2111. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44581-1_19
DOI: https://doi.org/10.1007/3-540-44581-1_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42343-0
Online ISBN: 978-3-540-44581-4