Graph Based Semi-supervised Learning with Sharper Edges

  • Hyunjung (Helen) Shin
  • N. Jeremy Hill
  • Gunnar Rätsch
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4212)


In many graph-based semi-supervised learning algorithms, edge weights are assumed to be fixed and determined by the data points’ (often symmetric) relationships in input space, without considering directionality. However, relationships may be more informative in one direction (e.g. from labelled to unlabelled) than in the reverse direction, and some relationships (e.g. strong weights between oppositely labelled points) are unhelpful in either direction. Undesirable edges may reduce the amount of influence an informative point can propagate to its neighbours – the point and its outgoing edges have been “blunted.” We present an approach to “sharpening” in which weights are adjusted to meet an optimization criterion wherever they are directed towards labelled points. This principle can be applied to a wide variety of algorithms. In the current paper, we present one ad hoc solution satisfying the principle, in order to show that it can improve performance on a number of publicly available benchmark data sets.
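The idea of directed edge weights can be illustrated with a small sketch. The code below is not the optimization criterion from the paper. It makes the crudest possible choice, which is to set every edge directed towards a labelled point to zero, and then runs a standard propagation scheme on the resulting asymmetric graph. The function names (`gaussian_weights`, `sharpen`, `propagate`), the Gaussian similarity, the propagation rule, the parameters `sigma` and `alpha`, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def gaussian_weights(X, sigma=1.0):
    """Symmetric Gaussian similarity matrix with a zero diagonal."""
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W


def sharpen(W, y):
    """Return a copy of W with all edges directed towards labelled points removed.

    Convention assumed in this sketch: W[i, j] is the weight with which
    point j influences point i, so row i holds the incoming edges of point i.
    Labels are +1 / -1, and 0 marks an unlabelled point.
    """
    W_sharp = W.copy()
    W_sharp[y != 0, :] = 0.0  # labelled points no longer receive influence
    return W_sharp


def propagate(W, y, alpha=0.9):
    """Solve f = alpha * P f + (1 - alpha) * y, with P the row-normalised W."""
    degree = W.sum(axis=1)
    degree[degree == 0] = 1.0  # rows zeroed by sharpen() stay all-zero
    P = W / degree[:, None]
    n = len(y)
    return (1.0 - alpha) * np.linalg.solve(np.eye(n) - alpha * P, y.astype(float))


# Toy data (hypothetical): two groups on a line, one labelled point in each.
X = np.array([[0.0], [1.0], [2.0], [5.0], [6.0], [7.0]])
y = np.array([1, 0, 0, 0, 0, -1])

W = gaussian_weights(X, sigma=1.0)
W_sharp = sharpen(W, y)

f_plain = propagate(W, y)        # symmetric graph
f_sharp = propagate(W_sharp, y)  # edges into labelled points removed

print("plain predictions:    ", np.sign(f_plain))
print("sharpened predictions:", np.sign(f_sharp))
```

After `sharpen`, the weight matrix is no longer symmetric: labelled points still send influence to their neighbours but receive none, so their scores stay fixed at `(1 - alpha) * y`. Handling edges between oppositely labelled points, or choosing the weights by an optimization criterion as the abstract describes, would need additional steps that this sketch does not attempt.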


Keywords: Kernel Matrix, Outgoing Edge, Manifold Structure, Multiple Kernel Learning, Machine Learn Research



Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Hyunjung (Helen) Shin (1, 2)
  • N. Jeremy Hill (3)
  • Gunnar Rätsch (1)

  1. Friedrich Miescher Laboratory, Max Planck Society, Tübingen, Germany
  2. Dept. of Industrial & Information Systems Engineering, Ajou University, Suwon, Korea
  3. Max Planck Institute for Biological Cybernetics, Tübingen, Germany
