Unlabeled Data and Multiple Views

  • Zhi-Hua Zhou
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7081)

Abstract

In many real-world applications, unlabeled data are usually abundant while labeled training examples are often limited, since labeling data requires extensive human effort and expertise. Exploiting unlabeled data to help improve learning performance has therefore attracted significant attention. Major techniques for this purpose include semi-supervised learning and active learning. These techniques were initially developed for data with a single view, that is, a single feature set; recent studies have shown, however, that for multi-view data, semi-supervised learning and active learning can work surprisingly well. This article briefly reviews some recent advances in this thread of research.
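The co-training paradigm mentioned above, introduced by Blum and Mitchell for the two-view setting, can be sketched roughly as follows: each view trains its own classifier on the labeled examples, and each classifier then labels the unlabeled examples it is most confident about to augment the training set used by the other view. The sketch below is an illustrative assumption, not the article's own method; in particular, the nearest-centroid base learner, the margin-based confidence score, and all parameters (`rounds`, `per_round`) are stand-ins chosen for brevity.

```python
import numpy as np

class Centroid:
    """Toy nearest-centroid classifier used as an illustrative base learner."""
    def fit(self, X, y):
        self.classes = np.unique(y)
        self.mu = np.array([X[y == c].mean(axis=0) for c in self.classes])
        return self

    def _dist(self, X):
        # Euclidean distance from each example to each class centroid.
        return np.linalg.norm(X[:, None, :] - self.mu[None], axis=2)

    def predict(self, X):
        return self.classes[self._dist(X).argmin(axis=1)]

    def confidence(self, X):
        # Margin between the two nearest centroids: larger = more confident.
        d = np.sort(self._dist(X), axis=1)
        return d[:, 1] - d[:, 0]

def co_train(X1, X2, y, labeled_idx, rounds=5, per_round=2):
    """Co-training sketch: X1 and X2 are the two feature views of the
    same examples; only the examples in labeled_idx have known labels."""
    pseudo = {i: y[i] for i in labeled_idx}          # index -> (pseudo-)label
    unlabeled = set(range(len(y))) - set(labeled_idx)
    for _ in range(rounds):
        if not unlabeled:
            break
        idx = sorted(pseudo)
        t = np.array([pseudo[i] for i in idx])
        for X in (X1, X2):                           # each view teaches the other
            clf = Centroid().fit(X[idx], t)
            pool = sorted(unlabeled)
            if not pool:
                break
            conf = clf.confidence(X[pool])
            # Pseudo-label the examples this view is most confident about.
            for j in np.argsort(conf)[::-1][:per_round]:
                i = pool[j]
                pseudo[i] = clf.predict(X[[i]])[0]
                unlabeled.discard(i)
    idx = sorted(pseudo)
    t = np.array([pseudo[i] for i in idx])
    # Final per-view classifiers trained on the enlarged (pseudo-)labeled set.
    return Centroid().fit(X1[idx], t), Centroid().fit(X2[idx], t)
```

The key design point this sketch illustrates is that the two views exchange confident pseudo-labels rather than each view self-training in isolation, which is what allows a classifier's mistakes in one view to be corrected by the other.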

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Zhi-Hua Zhou
  1. National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China