Evolving Systems, Volume 8, Issue 1, pp 3–18

Locally application of naive Bayes for self-training

  • Stamatis Karlos
  • Nikos Fazakis
  • Angeliki-Panagiota Panagopoulou
  • Sotiris Kotsiantis
  • Kyriakos Sgarbas
Original Paper


Semi-supervised algorithms are well known for combining supervised and unsupervised strategies to improve their learning ability under the assumption that only a few examples are given together with their full feature set. In such cases, weak learners are usually preferred as base classifiers, since the iterative behavior of semi-supervised schemes requires building new temporary models during each iteration. The locally weighted naïve Bayes classifier is one such learner, combining the strengths of the naïve Bayes (NB) and k-nearest neighbor (k-NN) algorithms. In this work, we implement a self-labeled, weighted variant of this local learner that uses NB as the base classifier of a self-training scheme. We performed an in-depth comparison with other well-known semi-supervised classification methods on standard benchmark datasets and concluded that the presented technique achieved better accuracy in most cases.
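As a rough illustration of the idea summarized above, the following Python sketch wraps a locally weighted naive Bayes learner inside a self-training loop. It is not the authors' implementation: the abstract does not specify the neighborhood size, the weighting kernel, the likelihood model, or the confidence threshold, so the Gaussian likelihood, the exponential distance kernel, `k`, `threshold`, `max_iter`, and all function names here are assumptions chosen only to keep the example small.

```python
import math

def weighted_gaussian_nb(X, y, w):
    """Fit Gaussian naive Bayes from instance-weighted data (assumed likelihood)."""
    total, classes = sum(w), set(y)
    model = {}
    for c in classes:
        idx = [i for i, label in enumerate(y) if label == c]
        wc = sum(w[i] for i in idx)
        stats = []
        for j in range(len(X[0])):
            mean = sum(w[i] * X[i][j] for i in idx) / wc
            var = sum(w[i] * (X[i][j] - mean) ** 2 for i in idx) / wc
            stats.append((mean, var + 1e-3))  # variance floor, an assumption
        prior = (wc + 1.0) / (total + len(classes))  # Laplace-smoothed prior
        model[c] = (prior, stats)
    return model

def posterior(model, x):
    """Return normalized class probabilities for one query point."""
    log_scores = {}
    for c, (prior, stats) in model.items():
        s = math.log(prior)
        for xj, (mean, var) in zip(x, stats):
            s += -0.5 * math.log(2 * math.pi * var) - (xj - mean) ** 2 / (2 * var)
        log_scores[c] = s
    top = max(log_scores.values())  # subtract the max for numerical stability
    scores = {c: math.exp(s - top) for c, s in log_scores.items()}
    z = sum(scores.values())
    return {c: v / z for c, v in scores.items()}

def local_nb_predict(X, y, x, k=5):
    """Train NB on the k nearest labeled neighbors of x, weighted by distance."""
    nearest = sorted((math.dist(xi, x), i) for i, xi in enumerate(X))[:k]
    bandwidth = nearest[-1][0] or 1.0  # distance to the k-th neighbor
    idx = [i for _, i in nearest]
    w = [math.exp(-d / bandwidth) for d, _ in nearest]  # assumed kernel
    model = weighted_gaussian_nb([X[i] for i in idx], [y[i] for i in idx], w)
    probs = posterior(model, x)
    label = max(probs, key=probs.get)
    return label, probs[label]

def self_train(X_lab, y_lab, X_unl, k=5, threshold=0.9, max_iter=10):
    """Move confidently predicted unlabeled points into the labeled set."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unl)
    for _ in range(max_iter):
        if not pool:
            break
        preds = [local_nb_predict(X_lab, y_lab, x, k) for x in pool]
        confident = {i for i, (_, p) in enumerate(preds) if p >= threshold}
        if not confident:
            break  # nothing passes the threshold, so stop early
        for i in sorted(confident):
            X_lab.append(pool[i])
            y_lab.append(preds[i][0])
        pool = [x for i, x in enumerate(pool) if i not in confident]
    return X_lab, y_lab

# Toy data: two well-separated clusters with three labeled points each.
X_lab = [(0.0, 0.0), (0.5, 0.2), (0.1, 0.6), (5.0, 5.0), (5.4, 4.8), (4.7, 5.3)]
y_lab = [0, 0, 0, 1, 1, 1]
X_unl = [(0.3, 0.3), (0.2, 0.5), (5.1, 5.2), (4.8, 4.9)]
X_new, y_new = self_train(X_lab, y_lab, X_unl, k=4)
print(y_new[len(y_lab):])  # pseudo-labels assigned to the unlabeled points
```

Because a fresh model is fitted for every query point, the base learner has to be cheap to train, which is consistent with the abstract's argument for using a weak learner such as NB inside an iterative scheme.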


Naive Bayes classifier; Pattern recognition; Classification accuracy; Labeled/unlabeled data; Local decision metrics



Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  • Stamatis Karlos (1, corresponding author)
  • Nikos Fazakis (2)
  • Angeliki-Panagiota Panagopoulou (1)
  • Sotiris Kotsiantis (3)
  • Kyriakos Sgarbas (2)

  1. Department of Mathematics, University of Patras, Patras, Greece
  2. Wire Communications Laboratory, Department of Electrical and Computer Engineering, University of Patras, Patras, Greece
  3. Educational Software Development Laboratory, Department of Mathematics, University of Patras, Patras, Greece
