A Bayesian network based learning system: Architecture and performance comparison with other methods
In this paper, we discuss the construction of Bayesian network models from data using the Advanced Pattern Recognition & Identification (APRI) system. APRI is designed for classification of low-probability events over mixed data types, discrete and continuous, with large amounts of training data available (a few million records for a typical application), a setting where other methods such as discriminant analysis and classification trees have difficulty. We show here that APRI does as well as, and in some cases better than, these other methods on less demanding problems. We discuss the architecture of the system as an example of a Bayesian network learning system, and we present a comparison with the classification tree system C4.5 and with statistical discriminant analysis on standard data sets, namely the Congressional voting records and the CRX credit card application data. We show that although APRI was not designed for small data sets, it nevertheless performs well on them. Finally, we discuss the functional advantages and disadvantages of the classification tree (C4.5) and Bayesian network (APRI) methods.
Keywords: Bayesian Classification, Bayesian Learning, Bayesian Networks
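The abstract does not give APRI's actual learning algorithm. As a point of reference only, the simplest Bayesian-network classifier for mixed discrete and continuous attributes is a naive Bayes model: a single class node is the parent of every attribute, discrete attributes get smoothed conditional probability tables, and continuous ones get per-class Gaussians. The sketch below is an illustrative assumption, not APRI; the class name `MixedNaiveBayes`, the Laplace smoothing, the Gaussian model, and the toy loan data are all invented for illustration.

```python
# Minimal naive Bayes over mixed discrete/continuous features.
# Illustrative sketch only -- this is NOT the APRI algorithm; the
# dataset, class name, and modeling choices are invented examples.
import math
from collections import defaultdict

class MixedNaiveBayes:
    """kinds: one entry per feature, 'd' = discrete, 'c' = continuous."""
    def __init__(self, kinds):
        self.kinds = kinds

    def fit(self, X, y):
        self.classes = sorted(set(y))
        self.priors = {c: y.count(c) / len(y) for c in self.classes}
        self.counts = {c: [defaultdict(int) for _ in self.kinds]
                       for c in self.classes}   # discrete value counts
        self.stats = {}                          # (mean, var) per continuous feature
        for c in self.classes:
            rows = [x for x, lab in zip(X, y) if lab == c]
            for x in rows:
                for j, kind in enumerate(self.kinds):
                    if kind == 'd':
                        self.counts[c][j][x[j]] += 1
            per_feature = []
            for j, kind in enumerate(self.kinds):
                if kind == 'c':
                    vals = [x[j] for x in rows]
                    m = sum(vals) / len(vals)
                    v = sum((u - m) ** 2 for u in vals) / len(vals) + 1e-6
                    per_feature.append((m, v))
                else:
                    per_feature.append(None)
            self.stats[c] = per_feature
        return self

    def predict(self, x):
        best, best_lp = None, -math.inf
        for c in self.classes:
            lp = math.log(self.priors[c])
            for j, kind in enumerate(self.kinds):
                if kind == 'd':
                    total = sum(self.counts[c][j].values())
                    bins = len(self.counts[c][j]) + 1
                    # Laplace smoothing so unseen values get nonzero mass
                    lp += math.log((self.counts[c][j][x[j]] + 1) / (total + bins))
                else:
                    m, v = self.stats[c][j]   # Gaussian log-density
                    lp += -0.5 * math.log(2 * math.pi * v) - (x[j] - m) ** 2 / (2 * v)
            if lp > best_lp:
                best, best_lp = c, lp
        return best

# Invented toy data: (employed?, income) -> loan decision.
X = [('y', 50.0), ('y', 60.0), ('n', 20.0), ('n', 25.0)]
y = ['approve', 'approve', 'deny', 'deny']
model = MixedNaiveBayes(['d', 'c']).fit(X, y)
print(model.predict(('y', 55.0)))   # -> approve
```

The strong attribute-independence assumption is what keeps this model tractable on large mixed-type data sets; richer Bayesian networks of the kind the paper discusses relax it by learning dependencies among attributes.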
- [C1] Cooper, G. F., and Herskovits, E., "A Bayesian Method for Constructing Bayesian Belief Networks from Databases," Proceedings of the Seventh Conference on Uncertainty in Artificial Intelligence, pp. 86–94, Los Angeles: Morgan Kaufmann, 1991.
- [E1] Ezawa, K., "Value of Evidence on Influence Diagrams," Proceedings of the Tenth Conference on Uncertainty in Artificial Intelligence, pp. 212–220, Morgan Kaufmann, 1994.
- [E2] Ezawa, K., and Schuermann, T., "Fraud/Uncollectible Debt Detection Using a Bayesian Network Based Learning System: A Rare Binary Outcome with Mixed Data Structures," submitted to UAI'95.
- [F1] Fukunaga, K., Introduction to Statistical Pattern Recognition, Academic Press, 1990.
- [G1] Geiger, D., "An Entropy-Based Learning Algorithm of Bayesian Conditional Trees," Proceedings of the Eighth Conference on Uncertainty in Artificial Intelligence, pp. 92–97, Morgan Kaufmann, 1992.
- [J1] Jensen, F. V., Olesen, K. G., and Andersen, S. K., "An Algebra of Bayesian Belief Universes for Knowledge-Based Systems," Networks, Vol. 20, pp. 637–659, John Wiley & Sons, 1990.
- [L1] Lauritzen, S. L., and Spiegelhalter, D. J., "Local Computations with Probabilities on Graphical Structures and Their Application to Expert Systems," Journal of the Royal Statistical Society, Series B, Vol. 50, No. 2, pp. 157–224, 1988.
- [L2] Langley, P., and Sage, S., "Induction of Selective Bayesian Classifiers," Proceedings of the Tenth Conference on Uncertainty in Artificial Intelligence, pp. 399–406, Morgan Kaufmann, 1994.
- [M1] McLachlan, G. J., Discriminant Analysis and Statistical Pattern Recognition, John Wiley & Sons, 1992.
- [Q1] Quinlan, J. R., C4.5: Programs for Machine Learning, Morgan Kaufmann, 1993.
- [S1] Shachter, R. D., "Evidence Absorption and Propagation through Evidence Reversals," Uncertainty in Artificial Intelligence, Vol. 5, pp. 173–190, North-Holland, 1990.