Discretisation Does Affect the Performance of Bayesian Networks

  • Saskia Robben
  • Marina Velikova
  • Peter J.F. Lucas
  • Maurice Samulski
Conference paper


In this paper, we study the use of Bayesian networks to interpret breast X-ray images in the context of breast-cancer screening. In particular, we investigate the performance of a manually developed Bayesian network under various discretisation schemes to check whether the probabilistic parameters in the initial manual network with continuous features are optimal and correctly reflect reality. Classification performance was determined using ROC analysis. A few algorithms performed better than the continuous baseline: the entropy-based method of Fayyad and Irani was best, but simpler algorithms also outperformed the baseline. Two simpler methods with only 3 bins per variable gave results similar to the continuous baseline. These results indicate that it is worthwhile to consider discretising continuous data when developing Bayesian networks, and they support the practical importance of probabilistic parameters in determining the network’s performance.
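As a rough illustration of why the choice of discretisation scheme can matter for ROC-based evaluation, the sketch below bins one continuous feature into 3 intervals in two ways and compares the resulting area under the ROC curve. It does not reproduce the paper's experiments: equal-width and equal-frequency binning are assumed here as examples of simple schemes, the feature values and labels are made up, and the bin index is used directly as a score where the paper would use the network's posterior probability.

```python
from bisect import bisect_right

def equal_width_edges(values, bins=3):
    """Interior cut points that split [min, max] into equally wide bins."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / bins
    return [lo + step * i for i in range(1, bins)]

def equal_frequency_edges(values, bins=3):
    """Interior cut points that put roughly the same number of values in each bin."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(n * i) // bins] for i in range(1, bins)]

def discretise(values, edges):
    """Map each value to the index of the bin it falls into."""
    return [bisect_right(edges, v) for v in values]

def auc(scores, labels):
    """AUC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical data: one skewed continuous feature and binary labels.
feature = [0.1, 0.2, 0.25, 0.4, 0.5, 0.55, 0.7, 0.9, 3.0]
labels = [0, 0, 0, 1, 0, 1, 1, 1, 1]

width_bins = discretise(feature, equal_width_edges(feature))
freq_bins = discretise(feature, equal_frequency_edges(feature))

print("continuous     ", auc(feature, labels))
print("equal width    ", auc(width_bins, labels))
print("equal frequency", auc(freq_bins, labels))
```

On this toy data the single outlier pushes almost every value into the first equal-width bin, so most of the ranking information is lost, while equal-frequency binning keeps the ordering that matters for the AUC. Real data and a full Bayesian network may behave differently.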


Keywords: Receiver Operating Characteristic Curve; Bayesian Network; Bayesian Network Model; Discretisation Technique; Link Level




  1. Abraham, R., Simha, J.B., Iyengar, S.S.: A comparative analysis of discretization methods for medical datamining with naïve Bayesian classifier. In: Proc. of the Ninth International Conference on Information Technology, pp. 235–236 (2006)
  2. Acid, S., de Campos, L.M., Fernandez-Luna, J.M., Rodriguez, S., Rodriguez, J.M., Salcedo, J.L.: A comparison of learning algorithms for Bayesian networks: a case study based on data from an emergency medical service. Artif. Intel. in Medicine 30(3), 215–232 (2004)
  3. Bradley, A.: The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recognition 30(7), 1145–1159 (1997)
  4. Burnside, E., Davis, J., Chhatwal, J., Alagoz, O., Lindstrom, M., Geller, B., Littenberg, B., Shaffer, K., Kahn Jr, C., Page, C.: Probabilistic computer model developed from clinical data in national mammography database format to classify mammographic findings. Radiology 251(3), 663–672 (2009)
  5. Dempster, A., Laird, N., Rubin, D.: Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B 39(1), 1–38 (1977)
  6. D’Orsi, C., Bassett, L., Berg, W., et al.: Breast Imaging Reporting and Data System: ACR BI-RADS Mammography (ed 4) (2003)
  7. Dougherty, J., Kohavi, R., Sahami, M.: Supervised and unsupervised discretization of continuous features. In: Proc. of the 12th ICML, pp. 194–202 (1995)
  8. Druzdzel, M.J., Onisko, A.: Are Bayesian networks sensitive to precision of their parameters? In: Proc. of the International IIS08 Conference, Intelligent Information Systems XVI, pp. 35–44 (2008)
  9. Fayyad, U., Irani, K.: Multi-interval discretization of continuous-valued attributes for classification learning. In: Proc. of the 13th IJCAI, pp. 1022–1027 (1993)
  10. Ferreira, N., Velikova, M., Lucas, P.: Bayesian modelling of multi-view mammography. In: Proc. of the ICML Workshop on Machine Learning for Health-Care Applications (2008)
  11. Flores, J.L., Inza, I., Larrañaga, P.: Wrapper discretization by means of estimation of distribution algorithms. Intelligent Data Analysis 11(5), 525–545 (2007)
  12. Geurts, P., Wehenkel, L.: Investigation and reduction of discretization variance in decision tree induction. Lecture Notes in Computer Science 1810, 162–170 (2000)
  13. Ismail, M.K., Ciesielski, V.: An empirical investigation of the impact of discretization on common data distributions. In: Proc. of the Third Int. Conf. on Hybrid Intelligent Systems: Design and Application of Hybrid Intelligent Systems, pp. 692–701 (2003)
  14. Jensen, F., Nielsen, T.: Bayesian networks and decision graphs. Springer Verlag (2007)
  15. Kahn, C., Roberts, L., Shaffer, K., Haddawy, P.: Construction of a Bayesian network for mammographic diagnosis of breast cancer. Comp. in Biol. and Medic. 27(1), 19–29 (1997)
  16. Mizianty, M., Kurgan, L., Ogiela, M.: Comparative analysis of the impact of discretization on the classification with naïve Bayes and semi-naïve Bayes classifiers. In: Proc. of the Seventh International Conference on Machine Learning and Applications, pp. 823–828 (2008)
  17. Murphy, K.: Bayesian network toolbox (BNT) (2007). Software/BNT/bnt.html
  18. Pradhan, A., Henrion, M., Provan, G., del Favero, B., Huang, K.: The sensitivity of belief networks to imprecise probabilities: an experimental investigation. Artificial Intelligence 84(1-2), 357–357 (1996)
  19. Radstake, N., Lucas, P.J.F., Velikova, M., Samulski, M.: Critiquing knowledge representation in medical image interpretation using structure learning. In: Proc. of the Second Workshop "Knowledge Representation for Health Care", Lisbon, Portugal (2010)
  20. Witten, I.H., Frank, E.: Data Mining: Practical Machine Learning Tools and Techniques, Second Edition. Morgan Kaufmann, San Francisco, CA, USA (2005)
  21. Yang, Y., Webb, G.: Proportional k-interval discretization for naïve-Bayes classifiers. In: Machine Learning: ECML 2001, pp. 564–575. Springer (2001)

Copyright information

© Springer-Verlag London Limited 2011

Authors and Affiliations

  1. Radboud University Nijmegen, Institute for Computing and Information Sciences, Nijmegen, The Netherlands
  2. Department of Radiology, Radboud University Nijmegen Medical Centre, Nijmegen, The Netherlands
