Abstract
In many learning problems, labeled examples are rare or expensive while numerous unlabeled and positive examples are available. However, most learning algorithms use only labeled examples. We therefore address the problem of learning with the help of positive and unlabeled data, given a small number of labeled examples. We present both theoretical and empirical arguments showing that learning algorithms can be improved by the use of both unlabeled and positive data. As an illustrative problem, we consider the statistical learning algorithm for monotone conjunctions in the presence of classification noise and give empirical evidence supporting our assumptions. We give theoretical results on the improvement of Statistical Query learning algorithms with positive and unlabeled data. Lastly, we apply these ideas to tree induction algorithms: we modify the code of C4.5 to obtain an algorithm that takes as input a set LAB of labeled examples, a set POS of positive examples, and a set UNL of unlabeled data, and uses all three sets to construct the decision tree. We provide experimental results, based on data from the UCI repository, which confirm the relevance of this approach.
This research was partially supported by "Motricité et Cognition": Contrat par objectifs région Nord/Pas-de-Calais.
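The abstract's central construction can be sketched concretely: any statistic a tree-induction algorithm needs at a node can be estimated from the three input sets by decomposing Pr(test) = Pr(test ∧ f=1) + Pr(test ∧ f=0), where Pr(f=1) is estimated from LAB, Pr(test | f=1) from POS, and Pr(test) from UNL. The following is a minimal Python sketch of that idea, not the authors' code: the function names (sq_estimates, info_gain) and the synthetic monotone-conjunction demo are our own illustration.

```python
import random
from math import log2


def entropy(p):
    """Binary entropy of a class probability p, clipped to [0, 1]."""
    p = min(max(p, 0.0), 1.0)
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)


def sq_estimates(test, lab, pos, unl):
    """Estimate Pr(test), Pr(test & f=1) and Pr(test & f=0) from the three sets.

    Pr(f=1)        is estimated from LAB (the small labeled sample),
    Pr(test | f=1) is estimated from POS (positive examples),
    Pr(test)       is estimated from UNL (unlabeled examples);
    Pr(test & f=0) follows by subtraction, clipped at 0.
    """
    p_hat = sum(y for _, y in lab) / len(lab)
    p_test_given_pos = sum(test(x) for x in pos) / len(pos)
    p_test = sum(test(x) for x in unl) / len(unl)
    p_test_and_pos = min(p_hat * p_test_given_pos, p_test)
    return p_hat, p_test, p_test_and_pos, p_test - p_test_and_pos


def info_gain(test, lab, pos, unl):
    """C4.5-style information gain of a candidate split `test`, computed from
    positive/unlabeled estimates instead of label counts at the node."""
    p_hat, p_t, p_t_pos, _ = sq_estimates(test, lab, pos, unl)
    if p_t in (0.0, 1.0):
        return 0.0
    pos_given_t = p_t_pos / p_t                       # Pr(f=1 | test)
    pos_given_not_t = (p_hat - p_t_pos) / (1 - p_t)   # Pr(f=1 | not test)
    return (entropy(p_hat)
            - p_t * entropy(pos_given_t)
            - (1 - p_t) * entropy(pos_given_not_t))


if __name__ == "__main__":
    random.seed(0)
    f = lambda x: x[0] == 1 and x[1] == 1              # hidden monotone conjunction
    draw = lambda: tuple(random.randint(0, 1) for _ in range(5))
    lab = [(x, f(x)) for x in (draw() for _ in range(30))]    # small LAB
    pos = [x for x in (draw() for _ in range(5000)) if f(x)]  # cheap POS
    unl = [draw() for _ in range(5000)]                       # cheap UNL
    for i in range(5):  # relevant attributes 0 and 1 should score highest
        g = info_gain(lambda x, i=i: x[i] == 1, lab, pos, unl)
        print(f"gain(x[{i}] = 1) = {g:.3f}")
```

Note how LAB contributes only a single scalar estimate of Pr(f=1), while the bulk of the per-node statistics come from the plentiful POS and UNL samples; this is the source of the improvement the abstract describes.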
References
D. Angluin and P. Laird. Learning from noisy examples. Machine Learning, 2(4):343–370, 1988.
A. Blum and T. Mitchell. Combining labeled and unlabeled data with co-training. In Proceedings of the 11th Annual Conference on Computational Learning Theory, pages 92–100. ACM Press, New York, NY, 1998.
S. E. Decatur. PAC learning with constant-partition classification noise and applications to decision tree induction. In Proceedings of the Fourteenth International Conference on Machine Learning, 1997.
F. Denis. PAC learning from positive statistical queries. In ALT '98, 9th International Conference on Algorithmic Learning Theory, volume 1501 of Lecture Notes in Artificial Intelligence, pages 112–126. Springer-Verlag, 1998.
M. Kearns. Efficient noise-tolerant learning from statistical queries. In Proceedings of the 25th ACM Symposium on the Theory of Computing, pages 392–401. ACM Press, New York, NY, 1993.
M. Kubat and S. Matwin. Addressing the curse of imbalanced training sets: One-sided selection. In Proceedings of the 14th International Conference on Machine Learning, pages 179–186, 1997.
C.J. Merz and P.M. Murphy. UCI repository of machine learning databases, 1998.
K. Nigam, A. McCallum, S. Thrun, and T. Mitchell. Learning to classify text from labeled and unlabeled documents. In Proceedings of the 15th National Conference on Artificial Intelligence, AAAI-98, 1998.
J. R. Quinlan. C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo, CA, 1993.
L.G. Valiant. A theory of the learnable. Commun. ACM, 27(11):1134–1142, November 1984.
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
Cite this paper
De Comité, F., Denis, F., Gilleron, R., Letouzey, F. (1999). Positive and Unlabeled Examples Help Learning. In: Watanabe, O., Yokomori, T. (eds.) Algorithmic Learning Theory. ALT 1999. Lecture Notes in Computer Science, vol. 1720. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46769-6_18