Abstract
One of the core applications of machine learning to knowledge discovery is building a hypothesis (such as a decision tree or neural network) from a given set of data, so that it can later be used to predict new instances. In this paper, we focus on the particular situation where the hypothesis used for prediction is assumed to be very simple, so that the hypothesis class is of feasible size. We study the problem of determining which of the hypotheses in the class is almost the best one. We present two on-line sampling algorithms for selecting a hypothesis, give theoretical bounds on the number of examples they need, and analyze them experimentally. We compare them with the simple batch sampling approach commonly used and show that in most situations our algorithms use a much smaller number of examples.
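The paper's exact algorithms are not reproduced on this page, but the following Python sketch conveys the general idea of on-line (sequential) sampling for hypothesis selection: draw examples one at a time and stop as soon as a confidence bound certifies the current leader, instead of fixing the batch size in advance. The function names (select_hypothesis, draw_example) and the particular Hoeffding-style stopping rule are illustrative assumptions, not the authors' construction.

import math

def select_hypothesis(hypotheses, draw_example, epsilon, delta, max_examples=100_000):
    # Hedged sketch of on-line sampling for hypothesis selection, NOT the
    # paper's exact algorithm: examples arrive one at a time and we stop
    # as soon as a Hoeffding-style confidence radius certifies that the
    # current leader's true error is within epsilon of the best in the class.
    n_h = len(hypotheses)
    errors = [0] * n_h          # cumulative mistakes of each hypothesis
    best = 0
    for t in range(1, max_examples + 1):
        x, y = draw_example()   # one labelled example from the data source
        for i, h in enumerate(hypotheses):
            if h(x) != y:
                errors[i] += 1
        # Radius valid simultaneously for all hypotheses and all steps t,
        # by a union bound (the constant is conservative, for illustration).
        radius = math.sqrt(math.log(4.0 * n_h * t * (t + 1) / delta) / (2.0 * t))
        rates = [e / t for e in errors]
        best = min(range(n_h), key=rates.__getitem__)
        runner_up = min((rates[i] for i in range(n_h) if i != best), default=float("inf"))
        # Stop early: the leader is provably within epsilon of optimal.
        if runner_up - rates[best] >= 2.0 * radius - epsilon:
            break
    return hypotheses[best]

A toy usage, again purely illustrative: three threshold rules on one feature with 10% label noise.

import random
rng = random.Random(0)
hs = [lambda x, c=c: x > c for c in (0.3, 0.5, 0.7)]
def draw_example():
    x = rng.random()
    y = (x > 0.5) if rng.random() > 0.1 else (x <= 0.5)  # 10% label noise
    return x, y
h = select_hypothesis(hs, draw_example, epsilon=0.05, delta=0.05)

By contrast, a batch approach would fix roughly O((1/epsilon^2) ln(|H|/delta)) examples up front; a sequential rule can stop much earlier when one hypothesis pulls clearly ahead, which is the kind of saving the abstract refers to.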
Partially supported by the ESPRIT Working Group NeuroCOLT2 (No.27150), the ESPRIT project ALCOM-IT (No.20244), and CICYT TIC97-1475-CE.
Partially supported by the ESPRIT Working Group NeuroCOLT2 (No.27150), DGES project KOALA (PB95-0787), and CIRIT SGR 1997SGR-00366.
Partially supported by the Ministry of Education, Science, Sports and Culture, Grant-in-Aid for Scientific Research on Priority Areas (Discovery Science) 1998.
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Domingo, C., Gavaldà, R., Watanabe, O. (1998). Practical Algorithms for On-Line Sampling. In: Arikawa, S., Motoda, H. (eds) Discovery Science. DS 1998. Lecture Notes in Computer Science, vol. 1532. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-49292-5_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65390-5
Online ISBN: 978-3-540-49292-4