Machine Learning, Volume 20, Issue 1–2, pp 63–94

Recursive Automatic Bias Selection for Classifier Construction

  • Carla E. Brodley


The results of empirical comparisons of existing learning algorithms illustrate that each algorithm has a selective superiority; each is best for some but not all tasks. Given a data set, it is often not clear beforehand which algorithm will yield the best performance. In this article we present an approach that uses characteristics of the given data set, in the form of feedback from the learning process, to guide a search for a tree-structured hybrid classifier. Heuristic knowledge about the characteristics that indicate one bias is better than another is encoded in the rule base of the Model Class Selection (MCS) system. The approach does not assume that the entire instance space is best learned using a single representation language; for some data sets, choosing to form a hybrid classifier is a better bias, and MCS has the ability to determine these cases. The results of an empirical evaluation illustrate that MCS achieves classification accuracies equal to or higher than the best of its primitive learning components for each data set, demonstrating that the heuristic rules effectively select an appropriate learning bias.
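The idea of a tree-structured hybrid classifier built by recursive selection can be illustrated with a minimal sketch. This is not the MCS rule base, which the abstract does not spell out. The three primitive model types (a constant leaf, a 1-nearest-neighbor leaf, a univariate threshold test), the `min_size` parameter, and the selection rule itself are all hypothetical stand-ins chosen only to show how a different representation can be picked for each region of the instance space.

```python
from collections import Counter

def majority_leaf(rows):
    """Leaf that always predicts the most common label in rows."""
    label = Counter(y for _, y in rows).most_common(1)[0][0]
    return lambda x: label

def nearest_neighbor_leaf(rows):
    """Leaf that predicts the label of the closest stored instance (1-NN)."""
    def predict(x):
        closest = min(rows, key=lambda r: sum((a - b) ** 2 for a, b in zip(r[0], x)))
        return closest[1]
    return predict

def best_threshold_split(rows):
    """Return (errors, feature, threshold) for the best univariate test, or None."""
    best = None
    for f in range(len(rows[0][0])):
        # Dropping the largest value keeps both sides of every test non-empty.
        for t in sorted({x[f] for x, _ in rows})[:-1]:
            left = [r for r in rows if r[0][f] <= t]
            right = [r for r in rows if r[0][f] > t]
            errors = sum(len(part) - Counter(y for _, y in part).most_common(1)[0][1]
                         for part in (left, right))
            if best is None or errors < best[0]:
                best = (errors, f, t)
    return best

def build(rows, min_size=4):
    """Recursively choose a model for this region of the instance space."""
    if len({y for _, y in rows}) == 1:
        return majority_leaf(rows)           # pure region: a constant suffices
    split = best_threshold_split(rows)
    # Assumed stand-in rule (not from the article): small or unsplittable
    # mixed regions fall back to an instance-based leaf.
    if len(rows) < min_size or split is None:
        return nearest_neighbor_leaf(rows)
    _, f, t = split
    left = build([r for r in rows if r[0][f] <= t], min_size)
    right = build([r for r in rows if r[0][f] > t], min_size)
    return lambda x: left(x) if x[f] <= t else right(x)

# Usage: each row is (feature_tuple, label).
rows = [((0.0,), "a"), ((1.0,), "a"), ((2.0,), "a"),
        ((3.0,), "b"), ((4.0,), "b"), ((5.0,), "b")]
classifier = build(rows)
print(classifier((1.5,)), classifier((4.2,)))
```

In this sketch the choice at each node depends only on region size and label purity. The article instead describes heuristic rules driven by feedback from the learning process, so the selection logic here should be read as a placeholder.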

Keywords: inductive bias, hybrid classifiers, automatic algorithm selection, decision trees, learning from examples



Copyright information

© Kluwer Academic Publishers 1995

Authors and Affiliations

  • Carla E. Brodley, School of Electrical and Computer Engineering, Purdue University, West Lafayette
