Encyclopedia of Machine Learning

2010 Edition
| Editors: Claude Sammut, Geoffrey I. Webb

Semi-Supervised Text Processing

  • Ion Muslea
Reference work entry
DOI: https://doi.org/10.1007/978-0-387-30164-8_750

Definition

In contrast to supervised and unsupervised learners, which use only labeled or only unlabeled examples, respectively, semi-supervised learning systems exploit both labeled and unlabeled examples. In a typical semi-supervised framework, the system takes as input a (small) training set of labeled examples and a (larger) working set of unlabeled examples; the learner’s performance is evaluated on a test set that consists of unlabeled examples. Transductive learning is a particular case of semi-supervised learning in which the working set and the test set are identical.
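
To make this setup concrete, the sketch below lays out the three data sets involved in a semi-supervised text-classification experiment; the documents, category labels, and variable names are purely hypothetical and are not part of the original entry. It also shows the transductive special case, in which the unlabeled working set doubles as the test set.

```python
# Hypothetical data layout for a semi-supervised text-classification experiment.
# The documents and category labels below are illustrative only.

# Small labeled training set: (document, label) pairs.
labeled_set = [
    ("quarterly earnings exceeded forecasts", "business"),
    ("team wins championship in overtime",    "sports"),
]

# Larger working set of unlabeled documents, available to the learner.
working_set = [
    "stock prices rally after merger news",
    "star striker signs record transfer deal",
    "central bank holds interest rates steady",
    "coach praises defense after shutout win",
]

# Held-out unlabeled test set on which performance is measured.
test_set = ["markets react to inflation report", "playoff schedule announced"]

# Transductive learning: the working set and the test set are one and the same.
transductive_test_set = working_set
```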

Semi-supervised learners use the unlabeled examples to improve upon the system that could be learned from the labeled data alone. Such learners typically exploit, directly or indirectly, the distribution of the available unlabeled examples. Text processing is an ideal application domain for semi-supervised learning because the...
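
One simple way to exploit the unlabeled working set is a self-training loop, a hard-label relative of the EM-based method of Nigam, McCallum, Thrun, and Mitchell (2000) listed in the reading below: a classifier is trained on the labeled documents, assigns provisional labels to the unlabeled documents it is confident about, and is then retrained on the enlarged set. The sketch below is a minimal illustration under assumed tooling (scikit-learn, a toy corpus, a fixed confidence threshold); it is not a procedure prescribed by the entry.

```python
# Minimal self-training sketch for semi-supervised text classification.
# All data, thresholds, and library choices here are illustrative assumptions.
import numpy as np
from scipy.sparse import vstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

# Small labeled training set and larger unlabeled working set (toy data).
labeled_docs = ["cheap meds online now", "meeting moved to noon",
                "win a free prize today", "lunch tomorrow at twelve?"]
labels = np.array([1, 0, 1, 0])                 # 1 = spam, 0 = ham
working_docs = ["free prize inside", "see you at the noon meeting",
                "cheap prize meds now", "lunch works for me"]

vectorizer = TfidfVectorizer()
X_labeled = vectorizer.fit_transform(labeled_docs)
X_working = vectorizer.transform(working_docs)

clf = MultinomialNB().fit(X_labeled, labels)

for _ in range(3):                              # a few self-training rounds
    proba = clf.predict_proba(X_working)
    confident = proba.max(axis=1) >= 0.6        # keep only confident guesses
    pseudo = clf.classes_[proba.argmax(axis=1)]
    # Retrain on the labeled data plus the confidently self-labeled documents.
    X_aug = vstack([X_labeled, X_working[confident]])
    y_aug = np.concatenate([labels, pseudo[confident]])
    clf.fit(X_aug, y_aug)

print(clf.predict(X_working))                   # provisional labels for the working set
```

In the transductive setting, the predictions printed for the working set are themselves the final output; in the general semi-supervised setting, the retrained classifier would instead be applied to a separate test set.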


Recommended Reading

  1. Basu, S., Banerjee, A., & Mooney, R. (2002). Semi-supervised clustering by seeding. In Proceedings of the international conference on machine learning (pp. 19–26). Sydney, Australia.
  2. Basu, S., Bilenko, M., & Mooney, R. (2004). A probabilistic framework for semi-supervised clustering. In Proceedings of the ACM SIGKDD international conference on knowledge discovery and data mining (pp. 59–68). Seattle, WA.
  3. Blum, A., Lafferty, J., Rwebangira, M. R., & Reddy, R. (2004). Semi-supervised learning using randomized mincuts. In Proceedings of the twenty-first international conference on machine learning (p. 13).
  4. Blum, A., & Mitchell, T. (1998). Combining labeled and unlabeled data with co-training. In Proceedings of the 1998 conference on computational learning theory (pp. 92–100).
  5. Chapelle, O., Chi, M., & Zien, A. (2006). A continuation method for semi-supervised SVMs. In Proceedings of the 23rd international conference on machine learning (pp. 185–192). New York: ACM Press.
  6. Cozman, F., Cohen, I., & Cirelo, M. (2003). Semi-supervised learning of mixture models. In Proceedings of the international conference on machine learning (pp. 99–106). Washington, DC.
  7. Joachims, T. (1999). Transductive inference for text classification using support vector machines. In Proceedings of the 16th international conference on machine learning (ICML-99) (pp. 200–209). San Francisco: Morgan Kaufmann.
  8. Joachims, T. (2003). Transductive learning via spectral graph partitioning. In Proceedings of the international conference on machine learning.
  9. McCallum, A., & Nigam, K. (1998). Employing EM in pool-based active learning for text classification. In Proceedings of the 15th international conference on machine learning (pp. 359–367).
  10. Muslea, I., Minton, S., & Knoblock, C. (2002a). Active + semi-supervised learning = robust multi-view learning. In The 19th international conference on machine learning (ICML-2002) (pp. 435–442). Sydney, Australia.
  11. Muslea, I., Minton, S., & Knoblock, C. (2002b). Adaptive view validation: A first step towards automatic view detection. In The 19th international conference on machine learning (ICML-2002) (pp. 443–450). Sydney, Australia.
  12. Muslea, I., Minton, S., & Knoblock, C. (2006). Active learning with multiple views. Journal of Artificial Intelligence Research, 27, 203–233.
  13. Nigam, K., McCallum, A. K., Thrun, S., & Mitchell, T. M. (2000). Text classification from labeled and unlabeled documents using EM. Machine Learning, 39(2/3), 103–134.
  14. Sindhwani, V., Niyogi, P., & Belkin, M. (2005). Beyond the point cloud: From transductive to semi-supervised learning. In Proceedings of the 22nd international conference on machine learning (pp. 824–831). Bonn, Germany.
  15. Zelikovitz, S., & Hirsh, H. (2000). Improving short text classification using unlabeled background knowledge. In Proceedings of the 17th international conference on machine learning (pp. 1183–1190).
  16. Zhou, Z.-H., & Li, M. (2005). Tri-training: Exploiting unlabeled data using three classifiers. IEEE Transactions on Knowledge and Data Engineering, 17(11), 1529–1541.
  17. Zhu, X. (2005). Semi-supervised learning literature survey. Technical report 1530, Department of Computer Sciences, University of Wisconsin, Madison.

Copyright information

© Springer Science+Business Media, LLC 2011
