Instance selection method for improving graph-based semi-supervised learning

Wang, Hai; Wang, Shao-Bo; Li, Yu-Feng

doi:10.1007/s11704-017-6543-5

Research Article
Published: 13 February 2018

Volume 12, pages 725–735, (2018)

Frontiers of Computer Science

Hai Wang^1,2,
Shao-Bo Wang^1,2 &
Yu-Feng Li^1,2

Abstract

Graph-based semi-supervised learning is an important semi-supervised learning paradigm. Although graph-based methods have been shown to be helpful in various situations, exploiting unlabeled data can sometimes degrade their performance. In this paper, we propose a new graph-based semi-supervised learning method based on instance selection to reduce the chance of such performance degeneration. The basic idea is that, given a set of unlabeled instances, exploiting all of them is not necessarily the best approach; instead, we should exploit the unlabeled instances that are highly likely to improve performance and leave out those that carry high risk. We develop both transductive and inductive variants of our method. Experiments on a broad range of data sets show that our proposed method is much less likely to suffer performance degeneration than many state-of-the-art graph-based semi-supervised learning methods.
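
As an illustration only, the idea of trusting the graph solely on low-risk unlabeled instances might be sketched as follows. The abstract does not state the selection rule, so everything here is an assumption rather than the authors' algorithm: the graph is a dense RBF affinity matrix, labels are spread with the closed-form local-and-global-consistency solution, "risk" is approximated by a hypothetical confidence margin between the top two class scores, and risky instances fall back to a 1-nearest-neighbor prediction from the labeled data. The names `rbf_affinity`, `label_spreading`, `nearest_labeled`, `selective_predict`, and the `margin` parameter are invented for this sketch.

```python
import numpy as np


def rbf_affinity(X, gamma=1.0):
    """Dense RBF affinity matrix with a zero diagonal (assumed graph construction)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    W = np.exp(-gamma * d2)
    np.fill_diagonal(W, 0.0)
    return W


def label_spreading(W, Y, alpha=0.9):
    """Closed-form label spreading: solve (I - alpha * S) F = Y,
    where S = D^{-1/2} W D^{-1/2} is the symmetrically normalized affinity."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.linalg.solve(np.eye(W.shape[0]) - alpha * S, Y)


def nearest_labeled(X, labeled_idx, y_labeled):
    """Supervised baseline: label of the nearest labeled instance (1-NN)."""
    XL = X[labeled_idx]
    d2 = ((X[:, None, :] - XL[None, :, :]) ** 2).sum(axis=2)
    return y_labeled[np.argmin(d2, axis=1)]


def selective_predict(X, labeled_idx, y_labeled, n_classes,
                      margin=0.2, gamma=1.0, alpha=0.9):
    """Use the graph prediction only where it is confident; otherwise
    fall back to the 1-NN baseline. `margin` is a hypothetical threshold."""
    n = X.shape[0]
    Y = np.zeros((n, n_classes))
    Y[labeled_idx, y_labeled] = 1.0

    F = label_spreading(rbf_affinity(X, gamma), Y, alpha)
    # Row-normalize the (non-negative) scores so margins are comparable.
    P = F / np.maximum(F.sum(axis=1, keepdims=True), 1e-12)

    top2 = np.sort(P, axis=1)[:, -2:]
    confidence = top2[:, 1] - top2[:, 0]   # gap between best and second best
    trusted = confidence >= margin         # instances judged low-risk

    graph_pred = P.argmax(axis=1)
    base_pred = nearest_labeled(X, labeled_idx, y_labeled)
    pred = np.where(trusted, graph_pred, base_pred)
    pred[labeled_idx] = y_labeled          # labeled instances keep their labels
    return pred, trusted
```

Calling `selective_predict(X, labeled_idx, y_labeled, n_classes)` returns the predicted labels for all instances together with a boolean mask marking where the graph-based prediction was used. Raising `margin` makes the sketch more conservative, since more instances then receive the baseline prediction instead of the graph-based one.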

Acknowledgements

The authors thank the associate editors and reviewers for their helpful comments and suggestions. This research was partially supported by the National Natural Science Foundation of China (Grant No. 61403186), the Jiangsu Science Foundation (BK20140613), and the MSRA research fund.

Author information

Authors and Affiliations

National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, 210023, China
Hai Wang, Shao-Bo Wang & Yu-Feng Li
Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing, 210023, China
Hai Wang, Shao-Bo Wang & Yu-Feng Li

Corresponding author

Correspondence to Yu-Feng Li.

Additional information

Hai Wang is a master's student in the Department of Computer Science and Technology at Nanjing University, China. He is currently a member of the LAMDA Group. His main research interest is machine learning.

Shao-Bo Wang is a master's student in the Department of Computer Science and Technology at Nanjing University, China. He is currently a member of the LAMDA Group. His main research interest is machine learning.

Yu-Feng Li is currently an associate researcher in the Department of Computer Science and Technology at Nanjing University, China, and a member of the LAMDA Group. His main research interests include machine learning and data mining. He won the Microsoft Fellowship Award in 2009 and the Excellent Doctoral Dissertation Award of the Chinese Computer Federation in 2013. He has been a senior program committee member of several conferences, including IJCAI’17 and IJCAI’15, and has served as a program committee member for ICML’16, KDD’16, CVPR’16, etc.

About this article

Cite this article

Wang, H., Wang, SB. & Li, YF. Instance selection method for improving graph-based semi-supervised learning. Front. Comput. Sci. 12, 725–735 (2018). https://doi.org/10.1007/s11704-017-6543-5

Received: 17 November 2016
Accepted: 02 March 2017
Published: 13 February 2018
Issue Date: August 2018
DOI: https://doi.org/10.1007/s11704-017-6543-5
