Neural Computing and Applications, Volume 31, Issue 10, pp 5699–5713

Toward naive Bayes with attribute value weighting

  • Liangjun Yu
  • Liangxiao Jiang
  • Dianhong Wang
  • Lungan Zhang
Original Article


Naive Bayes assumes that attributes are conditionally independent given the class, but this assumption rarely holds in real-world applications, so numerous attempts have been made to relax it. However, to the best of our knowledge, few studies have assigned different weights to different attribute values. In this study, we propose a simple, efficient, and effective attribute value weighting approach called correlation-based attribute value weighting (CAVW), which assigns a different weight to each attribute value by computing the difference between the attribute value-class correlation (relevance) and the average attribute value-attribute value intercorrelation (average redundancy). CAVW computes these weights with information-theoretic measures, which have a strong theoretical background. Two attribute value weighting measures are employed, the mutual information (MI) measure and the Kullback–Leibler (KL) measure, yielding two versions that we denote as CAVW-MI and CAVW-KL, respectively. In extensive empirical studies on a collection of 36 benchmark datasets from the University of California at Irvine repository, CAVW-MI and CAVW-KL both obtained more satisfactory experimental results than the naive Bayesian classifier and four other existing attribute weighting methods, while maintaining the simplicity of the original naive Bayes model.
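The "relevance minus average redundancy" idea can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so everything here is an assumption made for illustration: the function names `mi_term` and `cavw_mi_weights` are hypothetical, relevance is taken to be the mutual-information contribution of one attribute value summed over the classes, redundancy is taken to be the single mutual-information term for a pair of attribute values, and probabilities are plain frequency estimates without smoothing. It should not be read as the authors' implementation of CAVW-MI.

```python
import math
from collections import Counter


def mi_term(p_xy, p_x, p_y):
    """One summand of mutual information; defined as 0 when the joint probability is 0."""
    if p_xy == 0.0:
        return 0.0
    return p_xy * math.log(p_xy / (p_x * p_y))


def cavw_mi_weights(rows, labels, instance):
    """Hypothetical helper: weight each attribute value of `instance` as
    relevance minus average redundancy (assumed formulas, see lead-in)."""
    n = len(rows)
    m = len(instance)
    class_counts = Counter(labels)

    def p_value(i):
        # Frequency estimate of P(A_i = instance[i]).
        return sum(1 for r in rows if r[i] == instance[i]) / n

    def p_value_class(i, c):
        # Frequency estimate of P(A_i = instance[i], C = c).
        return sum(
            1 for r, y in zip(rows, labels) if r[i] == instance[i] and y == c
        ) / n

    def p_value_value(i, j):
        # Frequency estimate of P(A_i = instance[i], A_j = instance[j]).
        return sum(
            1 for r in rows if r[i] == instance[i] and r[j] == instance[j]
        ) / n

    weights = []
    for i in range(m):
        p_i = p_value(i)
        # Relevance: attribute value-class correlation (assumed MI-style form).
        relevance = sum(
            mi_term(p_value_class(i, c), p_i, count / n)
            for c, count in class_counts.items()
        )
        # Average redundancy: mean value-value intercorrelation over the
        # other attribute values that occur in the same instance.
        others = [j for j in range(m) if j != i]
        if others:
            redundancy = sum(
                mi_term(p_value_value(i, j), p_i, p_value(j)) for j in others
            ) / len(others)
        else:
            redundancy = 0.0
        weights.append(relevance - redundancy)
    return weights


# Toy data: the first attribute determines the class, the second is independent of it.
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "hot"), ("rainy", "mild")]
labels = ["no", "no", "yes", "yes"]
print(cavw_mi_weights(rows, labels, ("sunny", "hot")))
```

On this toy data the value "sunny" receives a positive weight because it is informative about the class, while "hot" receives a weight of zero because it is independent of both the class and the other attribute value.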


Keywords: Attribute value weighting · Mutual information · Kullback–Leibler measure · Naive Bayes



Many thanks to Mark Hall for kindly providing us with the implementations of two other attribute weighting approaches (GRAW and DTAW). The work was partially supported by the Program for New Century Excellent Talents in University (NCET-12-0953), the excellent youth team of scientific and technological innovation of Hubei higher education (T201736), and the Open Research Project of Hubei Key Laboratory of Intelligent Geo-Information Processing (KLIGIP201601).

Compliance with ethical standards

Conflict of interest

The authors declare no conflict of interest.



Copyright information

© The Natural Computing Applications Forum 2018

Authors and Affiliations

  1. Department of Computer Science, China University of Geosciences, Wuhan, China
  2. School of Mechanical Engineering and Electronic Information, China University of Geosciences, Wuhan, China
  3. Hubei Key Laboratory of Intelligent Geo-Information Processing, China University of Geosciences, Wuhan, China
  4. School of Mechanical Engineering and Electronic Information, Wuhan University of Engineering Science, Wuhan, China
