
A naive Bayes probability estimation model based on self-adaptive differential evolution

Published in: Journal of Intelligent Information Systems


When learning a naive Bayes classifier, estimating probabilities from a given set of training samples is crucial. When the training samples are inadequate, however, probability estimation inevitably suffers from the zero-frequency problem. The two main methods used to avoid this problem are the Laplace-estimate and the M-estimate. The settings of two parameters in these methods, m (an integer variable) and p (a probability variable), have a direct impact on the experimental results. In this paper, we study the existing probability estimation methods and carry out a parameter cross-test, experimentally analyzing the performance of the M-estimate under different settings of m and p. These experiments show that the optimal parameter values vary across data sets. Motivated by this analysis, we propose an estimation model based on self-adaptive differential evolution, together with an approach that computes the optimal m and p values for each conditional probability so as to avoid the zero-frequency problem. We evaluate our approach in terms of classification accuracy on 36 benchmark machine learning repository data sets, comparing it to naive Bayes with the Laplace-estimate and with the M-estimate under a variety of parameter settings drawn from the literature, as well as the possibly optimal settings identified by our experimental analysis. The results show that the estimation model is efficient and that our approach significantly outperforms traditional probability estimation approaches, especially on large data sets (those with many instances and attributes).
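The two smoothing methods named in the abstract have simple closed forms, and the parameter search they motivate can be illustrated with a small differential-evolution loop. The sketch below is illustrative only: it assumes a jDE-style self-adaptation scheme (each individual carries its own F and CR, occasionally resampled) and uses a toy held-out log-likelihood as the fitness function; the bounds on m and p, the counts, and all function names are assumptions, not the paper's actual settings.

```python
import math
import random

def m_estimate(count, total, m, p):
    """M-estimate of a conditional probability: (count + m*p) / (total + m)."""
    return (count + m * p) / (total + m)

def laplace_estimate(count, total, k):
    """Laplace-estimate: (count + 1) / (total + k), where k is the number of
    attribute values; equivalent to the M-estimate with m = k and p = 1/k."""
    return (count + 1) / (total + k)

def heldout_loglik(m, p, train=(3, 20), test=(9, 40)):
    """Toy fitness: log-likelihood of held-out (successes, trials) counts
    under the M-estimate fitted on the training counts."""
    q = m_estimate(train[0], train[1], m, p)
    s, n = test
    return s * math.log(q) + (n - s) * math.log(1.0 - q)

def sade_optimize(fitness, pop_size=20, gens=100, seed=0):
    """Self-adaptive DE over (m, p): DE/rand/1 mutation, binomial crossover,
    greedy selection; F and CR are evolved per individual (jDE-style)."""
    rng = random.Random(seed)
    M_LO, M_HI, P_LO, P_HI = 1.0, 30.0, 0.01, 0.99  # illustrative bounds
    # individual layout: [m, p, F, CR]
    pop = [[rng.uniform(M_LO, M_HI), rng.uniform(P_LO, P_HI),
            rng.uniform(0.1, 1.0), rng.random()] for _ in range(pop_size)]
    fit = [fitness(x[0], x[1]) for x in pop]
    for _ in range(gens):
        for i in range(pop_size):
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            # self-adaptation: occasionally resample the control parameters
            F = rng.uniform(0.1, 1.0) if rng.random() < 0.1 else pop[i][2]
            CR = rng.random() if rng.random() < 0.1 else pop[i][3]
            trial = pop[i][:]
            jrand = rng.randrange(2)  # force at least one gene to cross over
            for j in range(2):
                if rng.random() < CR or j == jrand:
                    trial[j] = pop[a][j] + F * (pop[b][j] - pop[c][j])
            trial[0] = min(max(trial[0], M_LO), M_HI)
            trial[1] = min(max(trial[1], P_LO), P_HI)
            trial[2], trial[3] = F, CR
            f = fitness(trial[0], trial[1])
            if f >= fit[i]:  # greedy selection keeps the better vector
                pop[i], fit[i] = trial, f
    best = max(range(pop_size), key=fit.__getitem__)
    return pop[best][0], pop[best][1], fit[best]
```

A quick use: `m, p, f = sade_optimize(heldout_loglik)` returns an (m, p) pair whose smoothed estimate fits the held-out counts at least as well as any initial random setting, which mirrors the paper's observation that good values of m and p depend on the data.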






This work was supported by the National Natural Science Foundation of China under Grant No. 61075063, the Fund for Outstanding Doctoral Dissertation of CUG (No. 2235122), and the Self-Determined and Innovative Research Fund of CUG (No. 1210491B16).

Corresponding author

Correspondence to Zhihua Cai.


Cite this article

Wu, J., Cai, Z. A naive Bayes probability estimation model based on self-adaptive differential evolution. J Intell Inf Syst 42, 671–694 (2014).
