Abstract
Positive and unlabeled (PU) learning is the problem of learning a binary classifier when the training data contain only positive and unlabeled samples. Although PU learning is widely used in real-world applications, model selection for it remains challenging. In particular, traditional model selection methods are often highly sensitive to the class prior as well as the data size, incurring substantial human overhead in hyperparameter optimization. In this paper, we present a method called ODE for robust model selection in PU learning. We introduce two novel model evaluators, based on the integral probability metric and the area under the ROC curve respectively, both of which are free of the class prior, and we further employ a variance reduction technique to improve the quality of model selection. In addition, we perform model selection under user-defined constraints and propose a fast halving-style search algorithm to efficiently identify the most promising model configuration. Extensive empirical studies demonstrate that our proposed method is more robust and more computationally efficient than many state-of-the-art methods.
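To make the two ingredients of the abstract concrete, the sketch below combines (i) a class-prior-free, AUC-style evaluator that only ranks positive samples against unlabeled ones and (ii) a halving-style search that trains candidate configurations on growing data budgets and keeps the better-scoring half at each rung. This is a minimal, hypothetical illustration, not the authors' ODE implementation: `auc_pu`, `halving_search`, the logistic-regression stand-in for a PU learner, and the `cfg["C"]` hyperparameter are all assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def auc_pu(scores_pos, scores_unl):
    """Prior-free ranking score: the probability that a positive sample
    outranks an unlabeled one (an AUC-style evaluator; illustrative,
    no tie correction)."""
    all_scores = np.concatenate([scores_pos, scores_unl])
    ranks = all_scores.argsort().argsort() + 1  # ranks 1..n
    n_p, n_u = len(scores_pos), len(scores_unl)
    r_p = ranks[:n_p].sum()                     # rank sum of positives
    return (r_p - n_p * (n_p + 1) / 2) / (n_p * n_u)  # Mann-Whitney U / (n_p * n_u)

def halving_search(configs, X_p, X_u, budgets=(256, 1024, 4096)):
    """Successive halving over candidate configs: train each survivor on
    an increasing subsample, score it with the prior-free evaluator,
    and keep the top half at every budget rung."""
    survivors = list(configs)
    for b in budgets:
        scored = []
        for cfg in survivors:
            # Simple stand-in for a real PU learner: treat the unlabeled
            # subsample as tentative negatives.
            n_p, n_u = min(b, len(X_p)), min(b, len(X_u))
            X = np.vstack([X_p[:n_p], X_u[:n_u]])
            y = np.r_[np.ones(n_p), np.zeros(n_u)]
            clf = LogisticRegression(C=cfg["C"], max_iter=1000).fit(X, y)
            s = clf.decision_function
            scored.append((auc_pu(s(X_p), s(X_u)), cfg))
        scored.sort(key=lambda t: t[0], reverse=True)
        survivors = [cfg for _, cfg in scored[: max(1, len(scored) // 2)]]
    return survivors[0]
```

With a candidate grid such as `configs = [{"C": c} for c in (0.01, 0.1, 1.0, 10.0)]`, most configurations are only ever trained on small subsamples, which is where a halving-style search gains its efficiency over evaluating every configuration on the full data.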
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Grant No. 61772262). The authors would like to thank the associate editor and the reviewers for their helpful comments and suggestions.
Cite this article
Wei, T., Wang, H., Tu, W. et al. Robust model selection for positive and unlabeled learning with constraints. Sci. China Inf. Sci. 65, 212101 (2022). https://doi.org/10.1007/s11432-020-3167-1