Learning classifier systems with memory condition to solve non-Markov problems

Zang, Zhaoxiang; Li, Dehua; Wang, Junying

doi:10.1007/s00500-014-1357-y

Learning classifier systems with memory condition to solve non-Markov problems

Methodologies and Application
Published: 01 July 2014

Volume 19, pages 1679–1699, (2015)
Cite this article

Soft Computing Aims and scope Submit manuscript

Zhaoxiang Zang^1,2,
Dehua Li² &
Junying Wang¹

276 Accesses
10 Citations
Explore all metrics

Abstract

In the family of learning classifier systems, the classifier system XCS has been successfully used for many applications. However, the standard XCS has no memory mechanism and can only learn optimal policy in Markov environments, but fails in non-Markov ones. In this work, we aim to develop a new classifier system based on XCS to tackle this problem. It adds a memory list with numbered slots to XCS to record input sensation history, and extends only a small number of classifiers with memory conditions. The classifier’s memory condition, as a foothold to disambiguate non-Markov states, is used to sense a specified element in the memory list, which makes our system can “jump over” irrelevant or confusing states to get decisive prior information that may be far back in time. Besides, a detection method is employed to recognize non-Markov states in environments, to avoid these states controlling over classifiers’ memory conditions. Furthermore, four sets of different complex maze environments have been tested by the proposed method. Experimental results show that our system can overcome the overhead problem often encountered in history-window approaches, and is an effective technique to solve non-Markov environments.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

BACS: A Thorough Study of Using Behavioral Sequences in ACS2

Context-Aware Decision Making for Maze Solving

Introducing Action Planning to the Anticipatory Classifier System ACS2

Notes

The more explanations will be given in Sect. 4.
The variable \(s_t \) is added to highlight system prediction is dependent on the current input \(s_t \) since the match set [\(M\)] is determined by \(s_t \). This will provide convenience for later use.
To refer to one of the attributes of a classifier \(cl\), we use the dot notation. For example, \(cl.mp\) is used to refer to \(mp\) of \(cl\).

References

Bagnall AJ, Zatuchna ZV (2005) On the classification of maze problems. In: Bull L, Kovacs T (eds) Foundations of learning classifier systems. Studies in fuzziness and soft computing, vol 183. Springer, Heidelberg, pp 307–316
Browne W, Scott D (2005) An abstraction algorithm for genetics-based reinforcement learning. In: Beyer H (ed) GECCO 2005: genetic and evolutionary computation conference, vol 2. ACM Press, Washington, DC, pp 1875–1882
Bull L, Hurst J (2003) A neural learning classifier system with self-adaptive constructivism. In: Proceedings of the 2003 congress on evolutionary computation, CEC ’03, vol 2. IEEE Press, pp 991–997
Butz MV (2003) Documentation of XCS+ts c-code 1.2. Illinois Genetic Algorithm Laboratory (IlliGAL), University of Illinois at Urbana-Champaign
Butz MV, Kovacs T, Lanzi PL, Wilson SW (2001) How XCS evolves accurate classifiers. In: Spector L, Goodman ED, Wu A (eds) GECCO-2001: Proceedings of the genetic and evolutionary computation conference. Morgan Kaufmann, Waltham, pp 927–934
Google Scholar
Butz MV, Wilson SW (2002) An algorithmic description of XCS. Soft Comput 6(3–4):144–153
Article MATH Google Scholar
Cliff D, Ross S (1994) Adding temporary memory to ZCS. Adapt Behav 3(2):101–150
Article Google Scholar
Dung LT, Komeda T, Takagi M (2008) Reinforcement learning for POMDP using state classification. Appl Artif Intell 22:761–779. doi:10.1080/08839510802170538
Article Google Scholar
Gilles E, Mathias P (2008) Adapted Pittsburgh classifier system: building accurate strategies in non Markovian environments. In: Proceedings of the 2008 GECCO conference companion on genetic and evolutionary computation. ACM, Atlanta, GA, USA, pp 2001–2008. doi:10.1145/1388969.1389013
Gilles É, Mathias P (2010) Building accurate strategies in non Markovian environments without memory. In: Proceedings of learning classifier systems: 11th international workshop, IWLCS 2008, Atlanta, GA, USA, July 13, 2008, and 12th international workshop, IWLCS 2009, Montreal, QC, Canada, July 9, 2009, revised selected papers. Springer, Berlin, pp 107–126
Goldberg DE (1989) Genetic algorithms in search, optimization, and machine learning. Addison-Wesley Publishing Company Inc, Reading
MATH Google Scholar
Hamzeh A, Hashemi S, Sami A, Rahmani A (2009) A recursive classifier system for partially observable environments. Fundamenta Informaticae 97(1):15–40
Google Scholar
Hamzeh A, Rahmani A (2008) A new architecture for learning classifier systems to solve POMDP problems. Fundamenta Informaticae 84(3):329–351
MATH MathSciNet Google Scholar
Holland JH (1975) Adaptation in natural and artificial systems. University of Michigan Press, Ann Arbor
Google Scholar
Holland JH, Reitman JS (1977) Cognitive systems based on adaptive algorithms. ACM SIGART Bull 63:49. doi:10.1145/1045343.1045373
Article Google Scholar
Iqbal M, Browne WN, Zhang M (2012) Extracting and using building blocks of knowledge in learning classifier systems. In: Soule T, Moore JH (eds) Proceedings of the fourteenth international conference on genetic and evolutionary computation conference. GECCO ’12. ACM, Philadelphia, PA, USA, pp 863–870
Kaelbling LP, Littman M, Moore A (1996) Reinforcement learning: a survey. J Artif Intell Res 4(1):237–285
Google Scholar
Kovacs T (2000) Towards a theory of strong overgeneral classifiers. In: Martin W, Spears WM (eds) Foundations of genetic algorithms (FOGA), vol 6. Morgan Kaufmann, San Francisco, pp 165–184
Google Scholar
Landau S, Sigaud O (2008) A comparison between ATNoSFERES and learning classifier systems on non-Markov problems. Inf Sci 178(23):4482–4500
Article Google Scholar
Lanzi PL (1998a) Adding memory to XCS. In: Proceedings of the IEEE conference on evolutionary computation (ICEC98). IEEE Press, Anchorage, AK, USA, pp 609–614
Lanzi PL (1998b) An analysis of the memory mechanism of XCSM. In: Koza JR, Banzhaf W, Chellapilla K (eds) Genetic programming 1998: Proceedings of the third annual conference. Morgan Kaufmann, University of Wisconsin, Madison, Wisconsin, USA, pp 643–651
Lanzi PL (1999) An analysis of generalization in the XCS classifier system. Evol Comput 7(2):125–149
Article Google Scholar
Lanzi PL (2002) Learning classifier systems from a reinforcement learning perspective. Soft Comput 6(3–4):162–170
Article MATH Google Scholar
Lanzi PL, Wilson SW (2000) Toward optimal classifier system performance in non-Markov environments. Evol Comput 8(4):393–418
Article Google Scholar
Liepins GE, Hilliard MR, Palmer M, Rangarajan G (1991) Credit assignment and discovery in classifier systems. Int J Intell Syst 6(1):55–69
Article Google Scholar
Littman ML (1993) An optimization-based categorization of reinforcement learning environments. From animals to animats 2 : simulation of adaptive behavior. MIT Press, Honolulu, Hawai, USA, pp 262–270
Littman ML, Cassandra AR, Kaelbling LP (1995) Learning policies for partially observable environments: scaling up. Machine learning: Proceedings of the twelfth international conference on machine learning. Morgan Kaufmann Publishers Inc., Tahoe City, California, pp 362–370
Mccallum RA (1993) Overcoming incomplete perception with utile distinction memory. Proceedings of the tenth international conference on machine learning. Morgan Kaufmann, Amherst, pp 190–196
Google Scholar
Mccallum RA (1996) Hidden state and reinforcement learning with instance-based state identification. Proc IEEE Trans Syst Man Cybern Part B 26(3):464–473 Special issue on learning autonomous robots
Article Google Scholar
Métivier M, Lattaud C (2003) Anticipatory classifier system using behavioral sequences in non-Markov environments. In: Proceedings of learning classifier systems: 5th international workshop, IWLCS, (2002) vol 2661. Springer, New York, pp 143–162
Moioli RC, Vargas PA, Zuben FJV (2008) Analysing learning classifier systems in reactive and non-reactive robotic tasks. In: Bacardit J, Bernadó-Mansilla E, Butz MV, Kovacs T, Llorà X, Takadama K (eds) Learning classifier systems, vol 4998. Springer, New York, pp 286–305
Preen R, Bull L (2009) Discrete dynamical genetic programming in XCS. In: Proceedings of the 11th annual conference on genetic and evolutionary computation, GECCO ’09. ACM, New York, USA, pp 1299–1306
Preen RJ, Bull L (2014) Discrete and fuzzy dynamical genetic programming in the XCSF learning classifier system. Soft Comput 18(1):153–167. doi:10.1007/s00500-013-1044-4
Article Google Scholar
Sigaud O, Wilson S (2007) Learning classifier systems: a survey. Soft Comput 11(11):1065–1078
Article MATH Google Scholar
Smith RE (1994) Memory exploitation in learning classifier systems. Evol Comput 2(3):199–220
Article Google Scholar
Stolzmann W (1999) Latent learning in Khepera robots with anticipatory classifier systems. In: Wu A (ed) Proceedings of the 1999 genetic and evolutionary computation conference workshop. Morgan Kaufmann, San Francisco, California, pp 290–297
Stolzmann W (2000) An introduction to anticipatory classifier systems. In: Lanzi P, Stolzmann W, Wilson SE (eds) Learning classifier systems: from foundations to applications, Lecture notes in artificial intelligence, vol 1813. Springer, Berlin, pp 175–194
Tomlinson A, Bull L (1999) A zeroth level corporate classifier system. In: Banzhaf W, Daida J, Eiben AE et al (eds) Proceedings of the genetic and evolutionary computation conference (GECCO’99). Morgan Kaufmann, San Francisco, pp 306–313
Google Scholar
Tomlinson A, Bull L (2002) An accuracy-based corporate classifier system. Soft Comput 6(3):200–215
Article MATH Google Scholar
Widrow B, Hoff ME (1988) Adaptive switching circuits. Neurocomputing: foundations of research. MIT Press, Cambridge, pp 123–134
Google Scholar
Wilson SW (1991) The Animat path to AI. In: Meyer JA, Wilson SW (eds) From animals to animats 1: Proceedings of the first International conference on simulation of adaptive behavior (SAB90). MIT Press/Bradford Books, Cambridge, MA, pp 15–21
Wilson SW (1994) ZCS: a zeroth level classifier system. Evol Comput 2(1):1–18
Article Google Scholar
Wilson SW (1995) Classifier fitness based on accuracy. Evol Comput 3(2):149–175
Article Google Scholar
Wilson SW (1998) Generalization in the XCS classifier system. In: Koza JR, Banzhaf W, Chellapilla K et al (eds) Proceedings of the third annual genetic programming conference. Morgan Kaufmann, San Francisco, pp 665–674
Wilson SW, Goldberg DE (1989) A critical review of classifier systems. In: Schaffer JD (ed) Proceedings of the 3rd international conference on genetic algorithms. Morgan Kaufmann, San Francisco, pp 244–255
Google Scholar
Zatuchna Z, Bagnall A (2009) A learning classifier system for mazes with aliasing clones. Nat Comput 8(1):57–99
Article MATH MathSciNet Google Scholar
Zatuchna ZV (2005) AgentP: a learning classifier system with associative perception in maze environments, PhD, School of Computing Sciences, University of East Anglia (UEA), Norwich, England

Download references

Acknowledgments

The authors wish to express thanks to the anonymous reviewers for their insightful comments and suggestions. Also, we want to appreciate Dr. Stewart Wilson for his comments and encouragement to our work. This work was supported by National Natural Science Foundation of China, and the Doctoral Startup Foundation of China Three Gorges University (Project No. KJ2013B064).

Author information

Authors and Affiliations

College of Computer and Information Technology, China Three Gorges University, Yichang, 443002, Hubei, China
Zhaoxiang Zang & Junying Wang
Institute for Pattern Recognition and Artificial Intelligence, Huazhong University of Science and Technology, Wuhan, 430074, Hubei, China
Zhaoxiang Zang & Dehua Li

Authors

Zhaoxiang Zang
View author publications
You can also search for this author in PubMed Google Scholar
Dehua Li
View author publications
You can also search for this author in PubMed Google Scholar
Junying Wang
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Zhaoxiang Zang.

Additional information

Communicated by V. Loia.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Zang, Z., Li, D. & Wang, J. Learning classifier systems with memory condition to solve non-Markov problems. Soft Comput 19, 1679–1699 (2015). https://doi.org/10.1007/s00500-014-1357-y

Download citation

Published: 01 July 2014
Issue Date: June 2015
DOI: https://doi.org/10.1007/s00500-014-1357-y

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Learning classifier systems with memory condition to solve non-Markov problems

Abstract

Access this article

Similar content being viewed by others

BACS: A Thorough Study of Using Behavioral Sequences in ACS2

Context-Aware Decision Making for Maze Solving

Introducing Action Planning to the Anticipatory Classifier System ACS2

Notes

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Additional information

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Learning classifier systems with memory condition to solve non-Markov problems

Abstract

Access this article

Similar content being viewed by others

BACS: A Thorough Study of Using Behavioral Sequences in ACS2

Context-Aware Decision Making for Maze Solving

Introducing Action Planning to the Anticipatory Classifier System ACS2

Notes

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Additional information

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation