Abstract
Semi-supervised algorithms are well known for combining supervised and unsupervised strategies to improve learning under the assumption that only a few labeled examples, together with their full feature set, are available. In such cases, weak learners are usually preferred as base classifiers, since the iterative behavior of semi-supervised schemes requires building a new temporary model at each iteration. The locally weighted naive Bayes classifier is one such learner, combining the strengths of the NB and k-NN algorithms. In this work, we implement a self-labeled variant that uses locally weighted naive Bayes as the base classifier of a self-training scheme. An in-depth comparison with other well-known semi-supervised classification methods on standard benchmark datasets leads us to conclude that the presented technique achieves better accuracy in most cases.
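The scheme described above can be sketched in a few lines. This is a minimal illustration, not the authors' Java implementation: `LocalNB` is a simple distance-weighted Gaussian naive Bayes fitted on each query's k nearest neighbors, and `self_train` is the generic self-training loop that pseudo-labels the most confident unlabeled points each round. All names, weights, and selection thresholds here are illustrative choices.

```python
import numpy as np

class LocalNB:
    """Locally weighted Gaussian naive Bayes (illustrative sketch): for each
    query point, fit a distance-weighted NB model on its k nearest neighbors."""

    def __init__(self, k=10):
        self.k = k

    def fit(self, X, y):
        self.X = np.asarray(X, float)
        self.y = np.asarray(y)
        self.classes = np.unique(self.y)
        return self

    def predict_proba(self, Xq):
        probs = []
        for q in np.asarray(Xq, float):
            d = np.linalg.norm(self.X - q, axis=1)
            idx = np.argsort(d)[:self.k]          # k nearest labeled points
            w = 1.0 / (1.0 + d[idx])              # simple distance-based weights
            Xl, yl = self.X[idx], self.y[idx]
            logp = []
            for c in self.classes:
                m = yl == c
                if not m.any():
                    logp.append(-1e9)             # class absent from neighborhood
                    continue
                mu = np.average(Xl[m], axis=0, weights=w[m])
                var = np.average((Xl[m] - mu) ** 2, axis=0, weights=w[m]) + 1e-6
                ll = -0.5 * np.sum(np.log(2 * np.pi * var) + (q - mu) ** 2 / var)
                logp.append(np.log(w[m].sum() / w.sum()) + ll)
            logp = np.asarray(logp)
            p = np.exp(logp - logp.max())         # softmax over class log-scores
            probs.append(p / p.sum())
        return np.asarray(probs)

    def predict(self, Xq):
        return self.classes[self.predict_proba(Xq).argmax(axis=1)]


def self_train(clf, XL, yL, XU, n_iter=10, per_iter=20):
    """Self-training: iteratively pseudo-label the most confident unlabeled
    points and retrain the base classifier on the enlarged labeled set."""
    XL, yL, XU = np.array(XL, float), np.array(yL), np.array(XU, float)
    for _ in range(n_iter):
        if len(XU) == 0:
            break
        clf.fit(XL, yL)
        P = clf.predict_proba(XU)
        take = np.argsort(P.max(axis=1))[-per_iter:]   # most confident points
        XL = np.vstack([XL, XU[take]])
        yL = np.concatenate([yL, clf.classes[P[take].argmax(axis=1)]])
        XU = np.delete(XU, take, axis=0)
    return clf.fit(XL, yL)
```

With only a handful of labeled examples per class and well-separated clusters, the loop typically recovers near-supervised accuracy, which is the behavior the self-training scheme exploits.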
Appendix
A Java software tool implementing the proposed algorithm, together with basic run instructions, is available at http://ml.math.upatras.gr/wp-content/uploads/2016/02/SelfLWNB-Experiment.zip.
Cite this article
Karlos, S., Fazakis, N., Panagopoulou, AP. et al. Locally application of naive Bayes for self-training. Evolving Systems 8, 3–18 (2017). https://doi.org/10.1007/s12530-016-9159-3