Evolving Systems

Volume 8, Issue 1, pp 3–18

Local application of naive Bayes for self-training

  • Stamatis Karlos
  • Nikos Fazakis
  • Angeliki-Panagiota Panagopoulou
  • Sotiris Kotsiantis
  • Kyriakos Sgarbas
Original Paper

Abstract

Semi-supervised algorithms are well known for their ability to combine supervised and unsupervised strategies to optimize learning under the assumption that only a few labeled examples, together with their full feature set, are given. In such cases, weak learners are usually preferred as base classifiers, since the iterative behavior of semi-supervised schemes requires building a new temporary model in each iteration. The locally weighted naive Bayes classifier is one such learner, combining the strengths of the NB and k-NN algorithms. In this work, we have implemented a self-labeled weighted variant of this local learner that uses NB as the base classifier of a self-training scheme. We performed an in-depth comparison with other well-known semi-supervised classification methods on standard benchmark datasets and reached the conclusion that the presented technique achieves better accuracy in most cases.
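To make the scheme concrete, here is a minimal Python sketch of self-training with a locally weighted naive Bayes base learner. It is not the authors' implementation: the LocallyWeightedNB class, the 1/(1+d) distance kernel, and the parameter values (neighborhood size k, confidence threshold, examples absorbed per round) are illustrative assumptions; the local models are Gaussian NB fits on distance-weighted neighborhoods.

    import numpy as np
    from sklearn.naive_bayes import GaussianNB
    from sklearn.neighbors import NearestNeighbors

    class LocallyWeightedNB:
        """For each query point, fit a Gaussian naive Bayes model on its k
        nearest labeled neighbours, weighting each neighbour by a decreasing
        function of its distance, and predict from that local model."""

        def __init__(self, k=30):
            self.k = k

        def fit(self, X, y):
            self.X_ = np.asarray(X, dtype=float)
            self.y_ = np.asarray(y)
            self.classes_ = np.unique(self.y_)
            self.nn_ = NearestNeighbors(
                n_neighbors=min(self.k, len(self.y_))).fit(self.X_)
            return self

        def predict_proba(self, X):
            X = np.asarray(X, dtype=float)
            dist, idx = self.nn_.kneighbors(X)
            proba = np.zeros((len(X), len(self.classes_)))
            for i, (d, nbrs) in enumerate(zip(dist, idx)):
                # Simple, always-positive distance kernel (an assumption).
                w = 1.0 / (1.0 + d)
                local = GaussianNB().fit(self.X_[nbrs], self.y_[nbrs],
                                         sample_weight=w)
                for cls, p in zip(local.classes_, local.predict_proba(X[i:i + 1])[0]):
                    proba[i, np.searchsorted(self.classes_, cls)] = p
            return proba

        def predict(self, X):
            return self.classes_[np.argmax(self.predict_proba(X), axis=1)]

    def self_train(X_lab, y_lab, X_unl, rounds=40, per_round=10,
                   threshold=0.95, k=30):
        """Self-training: repeatedly fit LocallyWeightedNB on the labeled
        pool, pseudo-label the most confident unlabeled examples, and
        absorb them into the labeled pool."""
        X_lab = np.asarray(X_lab, dtype=float)
        X_unl = np.asarray(X_unl, dtype=float)
        y_lab = np.asarray(y_lab)
        for _ in range(rounds):
            if len(X_unl) == 0:
                break
            clf = LocallyWeightedNB(k=k).fit(X_lab, y_lab)
            proba = clf.predict_proba(X_unl)
            conf = proba.max(axis=1)
            picked = np.argsort(-conf)[:per_round]      # most confident first
            picked = picked[conf[picked] >= threshold]  # keep only confident ones
            if len(picked) == 0:
                break  # nothing confident enough; stop early
            X_lab = np.vstack([X_lab, X_unl[picked]])
            y_lab = np.concatenate(
                [y_lab, clf.classes_[proba[picked].argmax(axis=1)]])
            X_unl = np.delete(X_unl, picked, axis=0)
        return LocallyWeightedNB(k=k).fit(X_lab, y_lab)

Given arrays X_lab, y_lab holding the few labeled examples and X_unl holding the unlabeled ones, model = self_train(X_lab, y_lab, X_unl) returns a classifier trained on the augmented pool; the confidence threshold and the number of examples absorbed per round are the usual knobs trading pseudo-label noise against coverage.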

Keywords

Naive Bayes classifier · Pattern recognition · Classification accuracy · Labeled/unlabeled data · Local decision metrics


Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  • Stamatis Karlos (1)
  • Nikos Fazakis (2)
  • Angeliki-Panagiota Panagopoulou (1)
  • Sotiris Kotsiantis (3)
  • Kyriakos Sgarbas (2)

  1. Department of Mathematics, University of Patras, Patras, Greece
  2. Wire Communications Laboratory, Department of Electrical and Computer Engineering, University of Patras, Patras, Greece
  3. Educational Software Development Laboratory, Department of Mathematics, University of Patras, Patras, Greece
