Feature Construction for Concept Learning

  • Larry Rendell
Part of The Kluwer International Series in Engineering and Computer Science book series (SECS, volume 87)

Abstract

Attribute-based learning needs feature construction for “hard” concepts. Hard binary concepts have arbitrary compositions of conjuncts and disjuncts. Hard graded concepts are arbitrary functions that have any composition of disjuncts or “peaks.” Learning hard concepts is difficult for empirical (selective) induction algorithms (such as ID3 and PLS1), which become inaccurate, slow, and verbose. Experiments based on a functional characterization of the problem show how traditional induction breaks down with respect to speed and accuracy. These observations lead to “peak-merging” constructive induction to create new attributes or features. Such construction may use domain knowledge. Some effects, issues, and approaches are discussed.
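The breakdown described above can be illustrated with a minimal sketch (not the chapter's own experiments): on a parity-style concept, the positive region splits into disjoint “peaks,” so the single-attribute information gain that drives selective learners such as ID3 is zero for every raw attribute, while a constructed feature that merges the peaks separates the classes at once. The data set, attribute names, and helper functions below are illustrative assumptions, not taken from the source.

```python
import math
from itertools import product

def entropy(labels):
    """Shannon entropy (bits) of a label sequence."""
    n = len(labels)
    out = 0.0
    for v in set(labels):
        p = labels.count(v) / n
        out -= p * math.log2(p)
    return out

def info_gain(examples, labels, attr):
    """Information gain from splitting on one binary attribute."""
    base = entropy(labels)
    remainder = 0.0
    for v in (0, 1):
        sub = [y for x, y in zip(examples, labels) if x[attr] == v]
        if sub:
            remainder += len(sub) / len(labels) * entropy(sub)
    return base - remainder

# A "hard" binary concept in the abstract's sense: XOR of the first two
# attributes, plus one irrelevant attribute. Its positive instances form
# two disjoint peaks that no single raw attribute separates.
examples = list(product([0, 1], repeat=3))
labels = [x[0] ^ x[1] for x in examples]

for a in range(3):
    # Every raw attribute has zero gain, so greedy selective induction
    # has no basis for choosing its first split.
    print(f"gain on raw attribute {a}: {info_gain(examples, labels, a):.3f}")

# A constructed feature merging the two peaks makes the concept trivial:
merged = [(x[0] ^ x[1],) for x in examples]
print(f"gain on constructed feature: {info_gain(merged, labels, 0):.3f}")
```

This reproduces the qualitative point only: the constructed feature here is handed to the learner directly, whereas the chapter's peak-merging induction must discover such features from the data.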

Keywords

Entropy, Metaphor, Peaked, Tame, Rote



Copyright information

© Kluwer Academic Publishers 1990

Authors and Affiliations

  • Larry Rendell
  1. Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, USA
