Evolutionary Intelligence, Volume 3, Issue 1, pp 31–50

A learning classifier system with mutual-information-based fitness

  • Robert Elliott Smith
  • Max Kun Jiang
  • Jaume Bacardit
  • Michael Stout
  • Natalio Krasnogor
  • Jonathan D. Hirst
Research Paper


This paper introduces a new variety of learning classifier system (LCS), called MILCS, which uses mutual information as fitness feedback. Unlike most LCSs, MILCS is specifically designed for supervised learning. We present experimental results and compare them with results from XCS, UCS, GAssist, BioHEL, C4.5 and Naïve Bayes. We discuss the explanatory power of the resulting rule sets. MILCS is also shown to promote the discovery of default hierarchies, an important advantage of LCSs. We close with future directions for this research, including investigations of neural networks and other systems.
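For readers unfamiliar with the quantity named in the abstract, the following is a minimal sketch of the standard empirical mutual information between two discrete variables. It is not the MILCS fitness computation, which the abstract does not specify. The function name `mutual_information` and the toy `matched`/`label` data are hypothetical, chosen only to illustrate how much knowing whether a rule matched an example tells us about that example's class.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information I(X;Y), in bits, of two discrete sequences."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    n = len(xs)
    if n == 0:
        return 0.0
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written in terms of counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


# Hypothetical toy data: 1 if a rule's condition matched the example, else 0.
matched = [1, 1, 0, 0]
label_informative = [1, 1, 0, 0]   # class fully determined by the match
label_independent = [1, 0, 1, 0]   # class unrelated to the match

print(mutual_information(matched, label_informative))
print(mutual_information(matched, label_independent))
```

In this toy case the first call yields one bit (the match fully determines the class) and the second yields zero bits (the match carries no information about the class).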


Evolutionary computation; Learning classifier systems; Machine learning; Information theory; Mutual information; Supervised learning; Protein structure prediction; Explanatory power



The authors gratefully acknowledge support provided by the UK Engineering and Physical Sciences Research Council (EPSRC) under grants GR/T07541/01 and GR/T07534/01.



Copyright information

© Springer-Verlag 2010

Authors and Affiliations

  • Robert Elliott Smith (1)
  • Max Kun Jiang (1, corresponding author)
  • Jaume Bacardit (2)
  • Michael Stout (2)
  • Natalio Krasnogor (2)
  • Jonathan D. Hirst (3)

  1. Department of Computer Science, University College London, London, UK
  2. School of Computer Science, University of Nottingham, Nottingham, UK
  3. School of Chemistry, University of Nottingham, Nottingham, UK
