Discovering Interesting Patterns for Investment Decision Making with GLOWER ☹—A Genetic Learner Overlaid with Entropy Reduction

Data Mining and Knowledge Discovery

Abstract

Prediction in financial domains is notoriously difficult for a number of reasons. First, theories tend to be weak or non-existent, which makes problem formulation open ended by forcing us to consider a large number of independent variables and thereby increasing the dimensionality of the search space. Second, the weak relationships among variables tend to be nonlinear, and may hold only in limited areas of the search space. Third, in financial practice, where analysts conduct extensive manual analysis of historically well performing indicators, a key is to find the hidden interactions among variables that perform well in combination. Unfortunately, these are exactly the patterns that the greedy search biases incorporated by many standard rule learning algorithms will miss. In this paper, we describe and evaluate several variations of a new genetic learning algorithm (GLOWER) on a variety of data sets. The design of GLOWER has been motivated by financial prediction problems, but incorporates successful ideas from tree induction and rule learning. We examine the performance of several GLOWER variants on two UCI data sets as well as on a standard financial prediction problem (S&P500 stock returns), using the results to identify one of the better variants for further comparisons. We introduce a new (to KDD) financial prediction problem (predicting positive and negative earnings surprises), and experiment with GLOWER, contrasting it with tree- and rule-induction approaches. Our results are encouraging, showing that GLOWER has the ability to uncover effective patterns for difficult problems that have weak structure and significant nonlinearities.
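
The point about greedy search missing variables that only perform well in combination can be illustrated with a minimal sketch. This is not the GLOWER algorithm. It uses hypothetical toy data (two binary indicators `a` and `b`, with the class defined as their XOR) and a helper named `information_gain`, introduced here only for illustration. It shows that entropy reduction measured one variable at a time can be zero even when the two variables together determine the class completely.

```python
from collections import Counter
from itertools import product
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, features):
    """Entropy reduction from splitting on the joint value of `features`."""
    groups = {}
    for row, label in zip(rows, labels):
        key = tuple(row[f] for f in features)
        groups.setdefault(key, []).append(label)
    # Weighted average entropy of the groups produced by the split.
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data (an assumption, not from the paper):
# the class is the XOR of two binary indicators.
rows = [{"a": a, "b": b} for a, b in product([0, 1], repeat=2)]
labels = [row["a"] ^ row["b"] for row in rows]

gain_a = information_gain(rows, labels, ["a"])        # single variable
gain_b = information_gain(rows, labels, ["b"])        # single variable
gain_ab = information_gain(rows, labels, ["a", "b"])  # both together

print(f"gain(a)={gain_a:.3f} gain(b)={gain_b:.3f} gain(a,b)={gain_ab:.3f}")
```

A learner that adds one condition at a time and keeps only conditions with a positive individual gain would discard both `a` and `b` here, although the pair removes all uncertainty about the class. A search that evaluates combinations of conditions, as a genetic search over whole rules can, is not bound by that one-step criterion.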

Cite this article

Dhar, V., Chou, D. & Provost, F. Discovering Interesting Patterns for Investment Decision Making with GLOWER ☹—A Genetic Learner Overlaid with Entropy Reduction. Data Mining and Knowledge Discovery 4, 251–280 (2000). https://doi.org/10.1023/A:1009848126475

  • DOI: https://doi.org/10.1023/A:1009848126475
