Abstract
BEXA is a new covering algorithm for inducing propositional concept descriptions. Existing covering algorithms such as AQ15 and CN2 place rigid constraints on the search process to reduce learning time. These restrictions may permit useless specializations while ignoring potentially useful ones. In contrast, BEXA employs three dynamic search constraints that enable it to find simple and accurate concept descriptions efficiently. This paper describes the BEXA algorithm and its relationship to the covering algorithms AQ15, CN2, GREEDY3, PRISM, and an algorithm proposed by Gray. The specialization models of these algorithms are described in the uniform framework of specialization by exclusion of values. BEXA is compared empirically to the state-of-the-art concept learners CN2 and C4.5. It produces rules of comparable accuracy, but with greater simplicity.
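To make the framework of the abstract concrete, the following is an illustrative sketch (not the published BEXA implementation) of a separate-and-conquer covering loop in which each rule is specialized by *excluding* attribute values, the uniform view in which the paper casts AQ15, CN2, GREEDY3, PRISM, and Gray's algorithm. The function names, the greedy scoring heuristic, and the rule representation below are all invented for illustration:

```python
# Sketch of covering with specialization by exclusion of values.
# A rule maps each attribute to the set of values it still allows;
# the most general rule allows every value of every attribute.

def covers(rule, example):
    """A rule covers an example if every attribute value is still allowed."""
    return all(example[a] in allowed for a, allowed in rule.items())

def learn_one_rule(positives, negatives, domains):
    """Greedily exclude values until the rule covers no negative example."""
    rule = {a: set(vals) for a, vals in domains.items()}  # most general
    while any(covers(rule, n) for n in negatives):
        best = None
        for a, vals in domains.items():
            for v in vals:
                if v not in rule[a] or len(rule[a]) == 1:
                    continue  # keep at least one value per attribute
                trial = {k: set(s) for k, s in rule.items()}
                trial[a].discard(v)
                pos = sum(covers(trial, p) for p in positives)
                neg = sum(covers(trial, n) for n in negatives)
                # Prefer exclusions that shed negatives but keep positives
                # (a stand-in for the quality measures the algorithms use).
                score = (pos - neg, pos)
                if pos > 0 and (best is None or score > best[0]):
                    best = (score, a, v)
        if best is None:
            break  # no useful exclusion remains
        rule[best[1]].discard(best[2])
    return rule

def covering(positives, negatives, domains):
    """Separate-and-conquer: learn rules until all positives are covered."""
    rules, remaining = [], list(positives)
    while remaining:
        rule = learn_one_rule(remaining, negatives, domains)
        covered = [p for p in remaining if covers(rule, p)]
        if not covered:
            break  # safeguard against an unproductive rule
        rules.append(rule)
        remaining = [p for p in remaining if p not in covered]
    return rules
```

BEXA's contribution, per the abstract, lies in replacing the rigid constraints that such greedy loops typically impose with three dynamic search constraints; this sketch only shows the shared specialization-by-exclusion skeleton.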
References
Bergadano, F., Matwin, S., Michalski, R.S. & Zhang, J. (1992). Learning two-tiered descriptions of flexible concepts: The POSEIDON system. Machine Learning, 8, 5–43.
Breiman, L., Friedman, J.H., Olshen, R.A., & Stone, C.J. (1984). Classification and regression trees. Belmont: Wadsworth.
Buntine, W. & Niblett, T. (1992). A further comparison of splitting rules for decision-tree induction. Machine Learning, 8, 75–85.
Cendrowska, J. (1987). PRISM: An algorithm for inducing modular rules. International Journal of Man-Machine Studies, 27, 349–370.
Clark, P. & Niblett, T. (1989). The CN2 induction algorithm. Machine Learning, 3, 261–283.
Clark, P. & Boswell, R. (1991). Rule induction with CN2: Some recent improvements. In Y. Kodratoff (Ed.), Machine Learning-European Working Session on Learning EWSL-91, (pp. 151–163). Berlin: Springer-Verlag.
Fayyad, U.M. & Irani, K.B. (1992). On the handling of continuous-valued attributes in decision tree generation. Machine Learning, 8, 87–102.
Gray, N.A.B. (1988). Why grow trees? Technical Report, University of Wollongong, N.S.W., Australia.
Gray, N.A.B. (1990). Capturing knowledge through top-down induction of decision trees. IEEE Expert, June, 41–50.
Haussler, D. (1988). Quantifying inductive bias: AI learning algorithms and Valiant's learning framework. Artificial Intelligence, 36, 177–221.
Hoff, W.A., Michalski, R.S. & Stepp, R.E. (1983). INDUCE 2: A program for learning structural descriptions from examples. Report, University of Illinois at Urbana-Champaign.
Lavrac, N., Mozetic, I., & Kononenko, I. (1986). An experimental comparison of two learning programs in three medical domains. Proceedings of the ISSEK Workshop, Turing Institute, Glasgow.
Lindsay, R.K., Buchanan, B.G., Feigenbaum, E.A., & Lederberg, J. (1980). Applications of artificial intelligence for organic chemistry: The DENDRAL project. New York: McGraw-Hill.
Michalski, R.S. (1975). Variable-valued logic and its applications to pattern recognition and machine learning. In D.C. Rine (Ed.), Computer science and multiple-valued logic: Theory and applications (pp. 506–534). Amsterdam: North-Holland.
Michalski, R.S. & Chilausky, R.L. (1980). Learning by being told and learning from examples: An experimental comparison of the two methods of knowledge acquisition in the context of developing an expert system for soybean disease diagnosis. International Journal of Policy Analysis and Information Systems, 4, 125–161.
Michalski, R.S. (1983). A theory and methodology of inductive learning. In R. S. Michalski, J. G. Carbonell, & T. M. Mitchell (Eds.), Machine learning: An artificial intelligence approach. Los Altos, CA: Morgan Kaufmann.
Michalski, R.S., & Stepp, R.E. (1983). Learning from observation: Conceptual clustering. In R. S. Michalski, J. G. Carbonell, & T. M. Mitchell (Eds.), Machine learning: An artificial intelligence approach. Los Altos, CA: Morgan Kaufmann.
Michalski, R.S., Mozetic, I., Hong, J., & Lavrac, N. (1986). The multi-purpose incremental learning system AQ15 and its testing application to three medical domains. Proceedings of the American Association of Artificial Intelligence (pp. 1041–1045). Los Altos, CA: Morgan Kaufmann.
Mingers, J. (1989). An empirical comparison of selection measures for decision-tree induction. Machine Learning, 3, 319–342.
Mitchell, T.M. (1982). Generalization as search. Artificial Intelligence, 18, 203–226.
Pagallo, G. & Haussler, D. (1990). Boolean feature discovery in empirical learning. Machine Learning, 5, 71–99.
Quinlan, J.R. (1986). Induction of decision trees. Machine Learning, 1, 81–106.
Quinlan, J.R. (1987a). Simplifying decision trees. International Journal of Man-Machine Studies, 27, 221–234.
Quinlan, J.R. (1987b). Generating production rules from decision trees. International Joint Conference on Artificial Intelligence, 304–307.
Rymon, R. (1993). An SE-tree based characterization of the induction problem. 10th International Conference on Machine Learning, 268–275.
Schaffer, C. (1993). Overfitting avoidance as bias. Machine Learning, 10, 153–178.
Theron, H. & Cloete, I. (1993). An empirical evaluation of beam search and pruning in BEXA. In Proceedings of the Fifth International IEEE Conference on Tools for Artificial Intelligence-TAI'93, Cambridge, Massachusetts, 8–11 November.
Theron, H. (1994). Specialization by exclusion: An approach to concept learning. Ph.D. dissertation, Department of Computer Science, University of Stellenbosch, Stellenbosch, South Africa, March 1994.
Wells, M.B. (1971). Elements of combinatorial computing. New York: Pergamon Press.
Wirth, J. & Catlett, J. (1988). Experiments on the costs and benefits of windowing in ID3. Proceedings of the 5th International Workshop on Machine Learning. Los Altos, CA: Morgan Kaufmann, 87–99.
Theron, H., Cloete, I. BEXA: A covering algorithm for learning propositional concept descriptions. Mach Learn 24, 5–40 (1996). https://doi.org/10.1007/BF00117830