An Adaptive Approach for the Exploration-Exploitation Dilemma and Its Application to Economic Systems

  • Lilia Rejeb
  • Zahia Guessoum
  • Rym M’Hallah
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3898)


Learning agents must resolve the exploration-exploitation dilemma. Choosing between exploration and exploitation is especially difficult in dynamic systems, and in particular in large-scale ones such as economic systems. Recent research shows that this problem has neither an optimal nor a unique solution. In this paper, we propose an adaptive approach that uses meta-rules to adjust the choice between exploration and exploitation according to variations in the agents' performance. To validate the approach, we apply it to economic systems and compare it to two adaptive methods originally proposed by Wilson: one local and one global. We also compare different exploration strategies and examine their influence on the performance of the agents.
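
The abstract does not spell out the meta-rules themselves, so the following is only a minimal sketch of the general idea of adapting an exploration probability to performance variations. Every name and constant in it (`adapt_exploration`, `choose_action`, `step`, `p_min`, `p_max`) is a hypothetical choice made for illustration and is not taken from the paper.

```python
import random


def adapt_exploration(p_explore, prev_perf, curr_perf,
                      step=0.05, p_min=0.01, p_max=0.5):
    """Hypothetical meta-rule (an assumption, not the paper's rule):
    explore more when performance drops, explore less when it improves,
    and leave the probability unchanged when performance is stable."""
    if curr_perf < prev_perf:
        p_explore += step
    elif curr_perf > prev_perf:
        p_explore -= step
    # Keep the probability inside the assumed bounds [p_min, p_max].
    return min(p_max, max(p_min, p_explore))


def choose_action(values, p_explore, rng=random):
    """With probability p_explore pick a random action (exploration),
    otherwise pick the action with the highest estimated value (exploitation)."""
    if rng.random() < p_explore:
        return rng.randrange(len(values))
    return max(range(len(values)), key=values.__getitem__)
```

In such a scheme an agent would call `adapt_exploration` once per evaluation period with its previous and current performance, then pass the updated probability to `choose_action` at each decision step.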


Keywords: Economic System, Exploitation Rate, Exploration Probability, Gain Factor, Exploration Strategy




References

  1. Azoulay-Schwartz, R., Kraus, S., Wilkenfeld, J.: Exploration vs. exploitation: choosing a supplier in an environment of incomplete information. Elsevier Science (2003)
  2. Baum, J.A.C., Rao, H.: Handbook of Organizational Change and Development: Evolutionary Dynamics of Organizational Populations and Communities. Oxford University Press, Oxford (1999)
  3. Butz, M.V., Wilson, S.W.: An algorithmic description of XCS. Journal of Soft Computing 6, 144–153 (2002)
  4. Carmel, D., Markovitch, S.: Exploration Strategies for Model-Based Learning in Multi-agent Systems. In: Jennings, N., Sycara, K., Georgeff, M. (eds.) Autonomous Agents and Multi-agent Systems, vol. 2(2), pp. 141–172 (1999)
  5. Gittings, J.C.: Multi-armed bandit allocation indices. John Wiley and Sons, New York (1989)
  6. Roux-Dufort, C.: L’apprentissage organisationnel et le développement de l’organisation. Développement de l’organisation, Nouveaux regards. Durand, R., Economica, pp. 111–134 (2002)
  7. Kaelbling, L.P., Moore, A.W.: Reinforcement learning: A survey. Journal of Artificial Intelligence Research 4, 237–285 (1996)
  8. March, J.G., Simon, H.A.: Les organisations, Dunod edn. (1991)
  9. Meuleau, N., Bourgine, P.: Exploration of multi-state environments: Local measure and backpropagation of uncertainty. Machine Learning 35(2), 117–154 (1999)
  10. Miramontes Hercog, L., Fogarty, T.C.: Social Simulation Using a Multi-agent Model Based on Classifier Systems: The Emergence of Vacillating Behavior in the “El Farol” Bar Problem. In: Lanzi, P.L., Stolzmann, W., Wilson, S.W. (eds.) IWLCS 2001. LNCS (LNAI), vol. 2321, pp. 88–111. Springer, Heidelberg (2002)
  11. Rejeb, L., Guessoum, Z.: Adaptive Firms. In: Proc. AISTA 2004 International Conference on Advances in Intelligent Systems - Theory and Applications. In co-operation with the IEEE Computer Society. Luxembourg, November (2004)
  12. Penrose, E.T.: The theory of the growth of the firm. Basil Blackwell, Malden (1959)
  13. Peres-Uribe, A., Hirsbrunner, B.: The risk of Exploration in multi-agent learning systems: a case study. In: Proc. Agents 2000 Joint workshop on learning agents, Barcelona, June 3–7, pp. 33–37 (2000)
  14. Sutton, R.S., Barto, A.G.: Reinforcement learning, an introduction. MIT Press, Cambridge (1998)
  15. Thrun, S.B.: The role of exploration in learning control. In: Sofge, D.A. (ed.) Handbook of Intelligent Control: Neural, Fuzzy and Adaptive Approaches. Florence, Kentucky: Van Nostrand Reinhold (1992)
  16. Watkins, C., Dayan, P.: Q-Learning. Machine Learning 8, 279–292 (1999)
  17. Wiering, M.: Explorations in Efficient Reinforcement Learning. Ph.D. thesis (February 1999)
  18. Wilson, S.W.: Classifiers Fitness Based on Accuracy. Evolutionary Computation 3(2), 149–175 (1995)
  19. Wilson, S.W.: Explore/Exploit Strategies in Autonomy. In: Maes, P., Mataric, M., Pollac, J., Meyer, J.-A., Wilson, S. (eds.) From Animals to Animats 4, Proc. of the 4th International Conference of Adaptive Behavior, Cambridge (1996)

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Lilia Rejeb (1)
  • Zahia Guessoum (1, 2)
  • Rym M’Hallah (3)

  1. CReSTIC, MODECO Team, Reims Cedex 2, France
  2. Université de Paris-VI, LIP6, OASIS Team, France
  3. Dep. of Statistics and Operations Research, Kuwait University, Safat, Kuwait
