A New Distributed Reinforcement Learning Algorithm for Multiple Objective Optimization Problems

  • Carlos Mariano
  • Eduardo Morales
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1952)


This paper describes a new algorithm, called MDQL, for the solution of multiple objective optimization problems. MDQL is based on a new distributed Q-learning algorithm, called DQL, which is also introduced in this paper. In DQL, a family of independent agents, each exploring different options, finds a common policy in a common environment. Information about the goodness of actions is transmitted using traces over state-action pairs. MDQL extends this idea to multiple objectives by assigning a family of agents to each objective involved. A non-dominance criterion is used to construct Pareto fronts, and by delaying adjustments to the rewards MDQL achieves better distributions of solutions. Furthermore, an extension for applying reinforcement learning to continuous functions is also given. Successful results of MDQL on several test-bed problems suggested in the literature are described.
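
The non-dominance criterion mentioned above can be illustrated with a minimal sketch. It assumes every objective is minimized and that candidate solutions are already available as tuples of objective values. The names `dominates` and `pareto_front` are hypothetical and are not taken from the paper. This is a generic Pareto filter, not the MDQL procedure itself.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` dominates `b`, assuming all objectives are minimized.

    `a` dominates `b` when it is no worse in every objective and
    strictly better in at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the points that no other point dominates (O(n^2) filter)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Illustrative objective vectors (made up for this example, two objectives each).
candidates = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (4.0, 4.0)]
front = pareto_front(candidates)
print(front)
```

Here `(3.0, 4.0)` and `(4.0, 4.0)` are discarded because `(2.0, 3.0)` is at least as good in both objectives and strictly better in at least one. The remaining three points trade one objective against the other and so form the front.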



Copyright information

© Springer-Verlag Berlin Heidelberg 2000

Authors and Affiliations

  • Carlos Mariano (1)
  • Eduardo Morales (2)
  1. Instituto Mexicano de Tecnología del Agua, Jiutepec, Morelos, Mexico
  2. ITESM - Campus Morelos, Temixco, Morelos, Mexico
