Risk neutrality in learning classifier systems

Abstract

Both economics and biology have come to agree that successful behavior in a stochastic environment responds to the variance of potential outcomes. Unfortunately, when biological and economic paradigms are mated together in a learning classifier system (LCS), the decision-making agents, called classifiers, typically ignore risk altogether. Because risk management is a fundamental problem of learning, LCS have not always performed as well as theoretically predicted. This paper develops a novel model of risk-neutral reinforcement learning in a traditional Bucket Brigade credit-allocation market under the pressure of a genetic algorithm. I demonstrate the applicability of the basic model to the classical LCS design and reexamine two basic issues where traditional LCS performance fails to meet expectations: default hierarchies and long chains of coupled classifiers. Risk neutrality and noisy probabilistic auctions create dynamic instability in both areas, while identical preferences result in market failure in default hierarchies and exponential attenuation of price signals down classifier chains. Despite the limitations of simple risk-neutral classifiers, I show that they are capable of cheap short-run emulation of more rational behaviors. Still, risk-neutral information markets are a dead end. The model suggests a path toward a new type of LCS built on stable, heterogeneous, and risk-averse preferences under efficient auctions, with access to more complete markets exploitable by competing risk-management strategies. This will require a radical rethinking of the evolutionary and economic algorithms, but it ultimately heralds a return to a market-based approach to LCS.
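
As a concrete, heavily simplified illustration of the chain-attenuation point above, the following sketch is not the paper's model or code: it wires risk-neutral classifiers into a single chain under a plain bucket brigade, with a fixed bid fraction K, an external reward R paid only at the chain's end, and hypothetical starting strengths chosen purely for the demonstration.

```python
# A minimal sketch (not the paper's implementation): classifiers coupled in a
# chain under a classic bucket brigade. Each episode the chain fires in order;
# every classifier bids a fixed fraction K of its strength and pays it to its
# predecessor (classifier 0 pays the environment), and only the final
# classifier receives the external reward R. The gain in strength falls off
# sharply with distance from the reward, illustrating the attenuation of price
# signals down long chains that the abstract refers to.

K, R, N, EPISODES = 0.1, 100.0, 6, 10
START = 10.0
strength = [START] * N                   # classifier 0 fires first, N-1 fires last

for _ in range(EPISODES):
    bids = [K * s for s in strength]     # everyone bids before payments settle
    for i in range(N):
        strength[i] -= bids[i]           # pay own bid
        if i > 0:
            strength[i - 1] += bids[i]   # predecessor collects it
    strength[-1] += R                    # terminal reward enters at the chain's end

for i, s in enumerate(strength):
    print(f"classifier {i} ({N - 1 - i} links from reward): gain {s - START:10.4f}")
```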

Notes

  1. See Drugowitsch [6] for an interesting attempt to build LCS from first principles in a probabilistic, Bayesian framework. The result looks very different from the simple traditional LCS studied here.

  2. Daniel Bernoulli proposed the natural logarithm of wealth as a utility function, which maximizes the geometric-mean growth rate of wealth under risky returns (a standard derivation of this link is sketched after these notes).

  3. Following Savage, a number of others developed alternative axiomatizations of, and variations on, the subjective expected utility model. For an early but thorough review, see Fishburn [10].

  4. Modern behavioral approaches such as Kahneman and Tversky’s prospect theory [11] and its derivatives typically apply subjective preferences to probabilities as well as magnitudes. Many LCS implementations apply ad hoc nonlinear transformations of the match specificity without any supporting behavioral theory; Holland [12], for example, takes a base-2 logarithm.

  5. The probability that a classifier sells its output, and the reward it receives upon completing a sale, are assumed to be independent of the price it paid for its input.

  6. This Reduction of Compound Lotteries is an explicit axiom of, or a derived result in, most expected utility models, but it does not always hold for decision-makers as complex as human beings (Budescu and Fischer [19]).

  7. More complex classifiers able to monitor and attempt to predict the bidding behavior of competitors may not make for smarter bidders. As Vickrey [15] showed, demand-revealing behavior can be the optimal strategy even when bidders can fully observe the bids of rivals, as in the English or progressive “open” auctions, so there is little justification for additional complexity here.

  8. The optimal bid satisfying Eq. (9) is solvable analytically only in the risk-neutral case of a linear value function, v(w); under nonlinear preferences it must be found numerically (a stylized numerical example follows these notes).

  9. ‘Constraint’ is a term inherited from SEU and other economic models, but here the budget is not binding in the traditional sense: it depends on the classifier’s choice of bid.

  10. Traditional LCS must initialize classifier wealths from some uniform distribution.
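
On footnote 2: the connection between Bernoulli’s log-of-wealth utility and geometric-mean growth is the standard textbook one rather than anything taken from this paper; the gross per-round growth factors g_t, initial wealth w_0, and horizon T below are introduced purely as notation for the sketch.

```latex
% Wealth after T risky rounds with i.i.d. gross growth factors g_t:
\[
  w_T = w_0 \prod_{t=1}^{T} g_t
  \qquad\Longrightarrow\qquad
  \left(\frac{w_T}{w_0}\right)^{1/T}
  = \exp\!\left(\frac{1}{T}\sum_{t=1}^{T} \ln g_t\right)
  \;\longrightarrow\; e^{\mathbb{E}[\ln g]} \quad (T \to \infty).
\]
% Because exp is monotone, a bettor who maximizes E[ln w] each round
% (Bernoulli's log utility) is exactly the one who maximizes the long-run
% geometric-mean growth rate of wealth.
```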

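On footnote 8: since Eq. (9) is not reproduced here, the snippet below is only a stylized stand-in. The win probability p(b), wealth w, resale revenue R, and the logarithmic value function are all hypothetical choices, meant only to show how an optimal bid can be found numerically once the value function v(w) is nonlinear.

```python
# Illustrative only: a stylized stand-in for the paper's Eq. (9), not its actual
# first-order condition. A classifier with wealth w bids b; it wins the auction
# with a hypothetical probability p(b) that rises with the bid, pays b if it
# wins, and then resells its message for an expected revenue R. The optimal bid
# maximizes the expected value v(.) of ending wealth, found here numerically
# for both a linear (risk-neutral) and a logarithmic (risk-averse) v.

import numpy as np
from scipy.optimize import minimize_scalar

w, R = 10.0, 4.0                      # current wealth and expected resale revenue
p = lambda b: b / (b + 2.0)           # hypothetical win probability, increasing in b

def expected_value(b, v):
    """Expected value of ending wealth for bid b under value function v."""
    return p(b) * v(w - b + R) + (1.0 - p(b)) * v(w)

for name, v in [("risk-neutral", lambda x: x), ("log (risk-averse)", np.log)]:
    res = minimize_scalar(lambda b: -expected_value(b, v),
                          bounds=(1e-6, w), method="bounded")
    print(f"{name:>18}: optimal bid ≈ {res.x:.3f}")
```
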
References

  1. Bernoulli D (1738) Exposition of a new theory on the measurement of risk. Translated in 1954 in Econometrica 22(1):23–36

  2. Von Neumann J, Morgenstern O (1944) Theory of games and economic behavior. Princeton University Press, Princeton

  3. Real L, Caraco T (1986) Risk and foraging in stochastic environments. Annu Rev Ecol Syst 17:371–390

  4. Holland JH, Reitman JS (1978) Cognitive systems based on adaptive algorithms. In: Waterman DA, Hayes-Roth F (eds) Pattern-directed inference systems. Academic Press, Waltham

  5. Wilson SW, Goldberg DE (1989) A critical review of classifier systems. In: Schaffer JD (ed) Proceedings of the third international conference on genetic algorithms. Morgan Kaufmann, pp 244–255

  6. Drugowitsch J (2008) Design and analysis of learning classifier systems: a probabilistic approach. Springer, Berlin

  7. Savage LJ (1954) The foundations of statistics. Wiley, New York

  8. Bayes T (1763) An essay towards solving a problem in the doctrine of chances. Philos Trans R Soc 53:370–418

  9. Ellsberg D (1961) Risk, ambiguity, and the Savage axioms. Q J Econ 75:643–669

  10. Fishburn PC (1981) Subjective expected utility: a review of normative theories. Theory Decis 13(2):139–199

  11. Kahneman D, Tversky A (1979) Prospect theory: an analysis of decision under risk. Econometrica 47(2):263–292

  12. Holland JH (1992) Adaptation in natural and artificial systems, 2nd edn. MIT Press, Cambridge

  13. Grefenstette JJ (1991) Conditions for implicit parallelism. In: Rawlins GJE (ed) Foundations of genetic algorithms. Morgan Kaufmann, Waltham

  14. Goldberg DE (1989) Genetic algorithms in search, optimization, and machine learning. Addison-Wesley, Boston

  15. Vickrey W (1961) Counterspeculation, auctions, and competitive sealed tenders. J Financ 16(1):8–37

  16. DeGroot MH (1970) Optimal statistical decisions. McGraw-Hill, New York

  17. Baum EB, Durdanovic I (2000) Evolution of cooperative problem solving in an artificial economy. Neural Comput 12:2743–2775

  18. Goldberg DE (1990) Probability matching, the magnitude of reinforcement, and classifier system bidding. Mach Learn 5:407–425

  19. Budescu DV, Fischer I (2001) The same but different: an empirical investigation of the reducibility principle. J Behav Decis Mak 14:187–206

  20. Riolo RL (1987a) Bucket brigade performance: I. Long sequences of classifiers. In: Grefenstette JJ (ed) Proceedings of the second international conference on genetic algorithms. Lawrence Erlbaum Associates, pp 184–195

  21. Riolo RL (1987b) Bucket brigade performance: II. Default hierarchies. In: Grefenstette JJ (ed) Proceedings of the second international conference on genetic algorithms. Lawrence Erlbaum Associates, pp 196–201

  22. Wilson SW (1995) Classifier fitness based on accuracy. Evol Comput 3(2):149–175

  23. Arrow KJ (1971) Essays in the theory of risk-bearing. North-Holland, Amsterdam

  24. Real LA (1987) Objective benefit versus subjective perception in the theory of risk-sensitive foraging. Am Nat 130(3):399–411

  25. Healy PJ, Moore DA (2007) Bayesian overconfidence. SSRN: http://ssrn.com/abstract=1001820 or http://dx.doi.org/10.2139/ssrn.1001820

  26. Kovacs T (2002) A comparison of strength and accuracy-based fitness in learning classifier systems. Dissertation, University of Birmingham

  27. Wilson SW (1986) Hierarchical credit allocation in a classifier system. Research Memo RIS No. 37r. The Rowland Institute for Science

  28. Holland JH (1985) Properties of the bucket brigade algorithm. In: Proceedings of the first international conference on genetic algorithms. Lawrence Erlbaum, pp 1–7

  29. Holland JH (1986) Escaping brittleness: the possibilities of general-purpose learning algorithms applied to parallel rule-based systems. In: Michalski RS, Carbonell JG, Mitchell TM (eds) Machine learning II. Morgan Kaufmann, Waltham

  30. Wilson SW (1989) Bid competition and specificity reconsidered. Complex Syst 2:705–723

  31. Smith RE, Goldberg DE (1991) Variable default hierarchy separation in a classifier system. Found Genet Algorithms 1:141–167

  32. Booker LB (2000) Do we really need to estimate rule utilities in classifier systems? In: Lanzi PL, Stolzmann W, Wilson SW (eds) Lecture notes in artificial intelligence 1813. Springer, Berlin

  33. Holland JH (1995) Hidden order: how adaptation builds complexity. Addison-Wesley, Boston

  34. Smith JTH (2010) Implicit fitness and heterogeneous preferences in the genetic algorithm. In: Proceedings of the 12th annual genetic and evolutionary computation conference (GECCO). ACM

  35. Holland JH, Miller JH (1991) Artificial adaptive agents in economic theory. Am Econ Rev 81(2):365–370

  36. Robson AJ (2001) The biological basis of economic behavior. J Econ Lit 39(1):11–33

  37. Rayo L, Becker GS (2007) Evolutionary efficiency and happiness. J Polit Econ 115(2):302–337

  38. Netzer N (2009) Evolution of time preferences and attitudes toward risk. Am Econ Rev 99(3):937–955

Acknowledgments

This paper began development in Stephanie Forrest’s Complex Adaptive Systems seminar at the University of New Mexico. I am grateful to Dr. Forrest, to Janie M. Chermak at UNM and John H. Miller at CMU/SFI, and to the anonymous reviewers for feedback that substantially helped me clarify arguments, improve examples, and fix mistakes. All remaining errors are my own. Cheers!

Author information

Corresponding author

Correspondence to Justin T. H. Smith.

About this article

Cite this article

Smith, J.T.H. Risk neutrality in learning classifier systems. Evol. Intel. 5, 69–86 (2012). https://doi.org/10.1007/s12065-012-0079-2
