Practical Algorithms for On-Line Sampling

  • Conference paper
Discovery Science (DS 1998)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1532)

Abstract

One of the core applications of machine learning to knowledge discovery is building a hypothesis (such as a decision tree or neural network) from a given amount of data, so that we can later use it to predict new instances of the data. In this paper, we focus on a particular situation where we assume that the hypothesis we want to use for prediction is a very simple one, so the hypothesis class is of feasible size. We study the problem of how to determine which of the hypotheses in the class is almost the best one. We present two on-line sampling algorithms for selecting a hypothesis, give theoretical bounds on the number of examples needed, and analyze them experimentally. We compare them with the simple batch sampling approach commonly used and show that in most situations our algorithms use a much smaller number of examples.
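For orientation, the batch sampling baseline mentioned in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's own algorithms: it uses the textbook Hoeffding bound with a union bound over the class, which may differ from the exact bound the authors use, and the names `batch_sample_size`, `batch_select`, and `draw_example` are hypothetical.

```python
import math


def batch_sample_size(num_hypotheses: int, epsilon: float, delta: float) -> int:
    """Number of examples after which, by Hoeffding's inequality and a union
    bound, every hypothesis's empirical accuracy is within epsilon/2 of its
    true accuracy with probability at least 1 - delta."""
    return math.ceil((2.0 / epsilon ** 2) * math.log(2.0 * num_hypotheses / delta))


def batch_select(hypotheses, draw_example, epsilon, delta):
    """Draw a fixed-size sample up front and return (index, sample_size) for
    the hypothesis with the highest empirical accuracy on that sample.

    `hypotheses` is a list of callables x -> label; `draw_example` is a
    zero-argument callable returning one labeled pair (x, y).
    """
    m = batch_sample_size(len(hypotheses), epsilon, delta)
    correct = [0] * len(hypotheses)
    for _ in range(m):
        x, y = draw_example()
        for i, h in enumerate(hypotheses):
            if h(x) == y:
                correct[i] += 1
    best = max(range(len(hypotheses)), key=lambda i: correct[i])
    return best, m
```

The sample size here is fixed before any data is seen and depends only on the class size, the accuracy parameter, and the confidence parameter. An on-line procedure, by contrast, can stop as soon as the examples seen so far separate a good hypothesis from the rest, which is the source of the savings the abstract reports.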

Partially supported by the ESPRIT Working Group NeuroCOLT2 (No.27150), the ESPRIT project ALCOM-IT (No.20244), and CICYT TIC97-1475-CE.

Partially supported by the ESPRIT Working Group NeuroCOLT2 (No.27150), DGES project KOALA (PB95-0787), and CIRIT SGR 1997SGR-00366.

Partially supported by the Ministry of Education, Science, Sports and Culture, Grant-in-Aid for Scientific Research on Priority Areas (Discovery Science) 1998.

Copyright information

© 1998 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Domingo, C., Gavaldà, R., Watanabe, O. (1998). Practical Algorithms for On-Line Sampling. In: Arikawa, S., Motoda, H. (eds) Discovery Science. DS 1998. Lecture Notes in Computer Science, vol 1532. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-49292-5_14

  • DOI: https://doi.org/10.1007/3-540-49292-5_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-65390-5

  • Online ISBN: 978-3-540-49292-4

  • eBook Packages: Springer Book Archive
