Using Weighted Nearest Neighbor to Benefit from Unlabeled Data

  • Kurt Driessens
  • Peter Reutemann
  • Bernhard Pfahringer
  • Claire Leschi
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3918)

Abstract

The development of data-mining applications such as text classification and molecular profiling has shown the need for machine learning algorithms that can benefit from both labeled and unlabeled data, where often the unlabeled examples greatly outnumber the labeled examples. In this paper we present a two-stage classifier that improves its predictive accuracy by making use of the available unlabeled data. It applies a weighted nearest neighbor classification algorithm to the combined example sets as its knowledge base. The examples from the unlabeled set are "pre-labeled" by an initial classifier that is built using the limited available training data. By choosing appropriate weights for this pre-labeled data, the nearest neighbor classifier consistently improves on the original classifier.
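
The two-stage scheme described in the abstract could be realized along the following lines. This is a minimal sketch, not the paper's implementation: the Euclidean distance metric, the value of k, the down-weighting factor for pre-labeled examples, and the `initial_clf` object are all illustrative assumptions.

```python
# Hedged sketch of a two-stage weighted nearest neighbor classifier:
# stage 1 pre-labels the unlabeled data with an initial classifier,
# stage 2 classifies test points by a weighted k-NN vote over the
# combined (labeled + pre-labeled) example set.
import numpy as np
from collections import Counter

def two_stage_knn_predict(X_lab, y_lab, X_unlab, initial_clf, X_test,
                          k=5, unlabeled_weight=0.3):
    # Stage 1: pre-label the unlabeled examples with the initial classifier
    # (assumed to expose a scikit-learn-style predict method).
    y_pseudo = initial_clf.predict(X_unlab)

    # Combined knowledge base: truly labeled examples keep weight 1.0,
    # pre-labeled examples get a smaller weight (assumed value).
    X_all = np.vstack([X_lab, X_unlab])
    y_all = np.concatenate([y_lab, y_pseudo])
    w_all = np.concatenate([np.ones(len(X_lab)),
                            np.full(len(X_unlab), unlabeled_weight)])

    preds = []
    for x in X_test:
        # Euclidean distances to every stored example (assumed metric).
        dists = np.linalg.norm(X_all - x, axis=1)
        neighbors = np.argsort(dists)[:k]
        # Stage 2: weighted vote, each neighbor counts with its example weight.
        votes = Counter()
        for i in neighbors:
            votes[y_all[i]] += w_all[i]
        preds.append(votes.most_common(1)[0][0])
    return np.array(preds)
```

In this sketch the weight on pre-labeled examples controls how much the pseudo-labels are trusted relative to the original training data; a weight of 0 reduces the method to plain k-NN on the labeled set alone.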

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Kurt Driessens (1, 2)
  • Peter Reutemann (2)
  • Bernhard Pfahringer (2)
  • Claire Leschi (3)
  1. Department of Computer Science, K.U. Leuven, Belgium
  2. Department of Computer Science, University of Waikato, Hamilton, New Zealand
  3. Institut National des Sciences Appliquées, Lyon, France
