Semi-supervised Learning with Ensemble Learning and Graph Sharpening

  • Inae Choi
  • Hyunjung Shin
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5326)

Abstract

The generalization ability of a machine learning algorithm depends on the values specified for the model hyperparameters and on the degree of noise in the learning dataset. If the dataset contains a sufficient number of labeled data points, the optimal hyperparameter values can be found by validation on a held-out subset of the data. In semi-supervised learning, however, one of the most recent learning paradigms, this option is rarely available: the dataset is assumed to contain only a few labeled data points, so holding some of them out for validation is difficult. The scarcity of labeled data, furthermore, makes it difficult to estimate the degree of noise in the dataset. To circumvent these difficulties, we propose to employ ensemble learning and graph sharpening. The former replaces the hyperparameter selection procedure with an ensemble of committee members trained with various hyperparameter values. The latter improves performance by removing the unhelpful information flow caused by noise. Experimental results show that the proposed method improves performance on publicly available benchmark problems.
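The abstract gives no code, but the ensemble idea can be sketched concretely. The following is a minimal, illustrative numpy sketch (not the authors' implementation), assuming the label-propagation method of Zhou et al. as the base graph-based learner, with F = (1 − α)(I − αS)⁻¹Y over a symmetrically normalized Gaussian affinity matrix S. The hyperparameters are the kernel width σ and the propagation weight α; instead of selecting one (σ, α) pair by validation, the committee averages predictions over a grid of values. Function names and the hyperparameter grid are hypothetical.

```python
import numpy as np

def lgc_predict(X, y, sigma, alpha):
    """One committee member: Zhou et al.'s local/global consistency,
    F = (1 - alpha) * (I - alpha * S)^(-1) * y.
    y holds +1/-1 for labeled points and 0 for unlabeled points."""
    # Gaussian affinity matrix with zero diagonal (no self-edges).
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Symmetric normalization S = D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    S = d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    n = len(y)
    # Closed-form propagation; (I - alpha*S) is invertible for alpha < 1.
    return (1.0 - alpha) * np.linalg.solve(np.eye(n) - alpha * S, y)

def ensemble_predict(X, y, sigmas, alphas):
    """Average the committee outputs over a grid of hyperparameter values,
    sidestepping validation-based hyperparameter selection."""
    outputs = [lgc_predict(X, y, s, a) for s in sigmas for a in alphas]
    return np.mean(outputs, axis=0)
```

The sign of the averaged output gives the predicted class for each unlabeled point; because no single (σ, α) pair must be chosen, no labeled data need be held out for validation.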

Keywords

Semi-supervised learning · Graph sharpening · Ensemble learning · Hyperparameter selection · Noise reduction

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Inae Choi 1
  • Hyunjung Shin 1

  1. Department of Industrial & Information Systems Engineering, Ajou University, Suwon, Korea