Annals of Operations Research, Volume 271, Issue 2, pp 469–486

Computing multiobjective Markov chains handled by the extraproximal method

  • Julio B. Clempner
Original Research


This paper proposes a new method for generating the Pareto front in multi-objective Markov chains that overcomes some drawbacks of existing multi-objective methods. A fundamental issue is finding strong Pareto policies, that is, policies whose cost-function value is closest in Euclidean norm to the utopian point. A strong Pareto policy is reached when each cost function, constrained by the strategies of the others, cannot further improve its own criterion. The constraints associated with the objective function are implemented by formulating the problem as a bi-level optimization problem. We convert it into a single-level optimization problem by introducing a generalized Lagrangian function that represents the original multi-objective problem in terms of a related nonlinear programming problem. We then apply the Tikhonov regularization method to the objective function. The regularization ensures that all Pareto policies generated along the Pareto front are strong Pareto policies. To solve the problem we employ the extraproximal method, which effectively approximates every optimal Pareto point in the Pareto front; in this case each such point is a strong Pareto point. An experimental result, applied to route selection in a counter-kidnapping problem, validates the effectiveness and usefulness of the method.
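To make the notion of "closest in Euclidean norm to the utopian point" concrete, the following minimal Python sketch works on a small finite set of candidate cost vectors. It is an illustration only, not the paper's extraproximal algorithm: the function names (`dominates`, `pareto_front`, `utopian_point`, `closest_to_utopia`) and the sample data are hypothetical, and the sketch assumes all objectives are minimized.

```python
import math


def dominates(a, b):
    """True if cost vector a Pareto-dominates b (all objectives minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(costs):
    """Indices of the cost vectors that no other vector dominates."""
    return [i for i, c in enumerate(costs)
            if not any(dominates(d, c) for d in costs)]


def utopian_point(costs):
    """Component-wise minimum: the best value of each objective taken alone."""
    return [min(column) for column in zip(*costs)]


def closest_to_utopia(costs):
    """Pick the non-dominated vector nearest to the utopian point.

    Returns (index, utopian point, Euclidean distance).
    """
    utopia = utopian_point(costs)
    front = pareto_front(costs)
    best = min(front, key=lambda i: math.dist(costs[i], utopia))
    return best, utopia, math.dist(costs[best], utopia)


# Hypothetical cost vectors (objective 1, objective 2) for four candidate policies.
costs = [[1.0, 5.0], [2.0, 2.0], [5.0, 1.0], [4.0, 4.0]]
front = pareto_front(costs)
best, utopia, distance = closest_to_utopia(costs)
```

In this toy data the fourth vector is dominated by the second, so the front contains the first three candidates, and the second one is selected because it lies nearest to the utopian point `[1.0, 1.0]`. In the paper's setting the candidates are policies of a Markov chain and the search is continuous, so an enumeration like this only conveys the selection criterion.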


Multi-objective · Strong Pareto optimal · Euclidean norm · Nash · Markov chains · Route selection



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Escuela Superior de Física y Matemáticas (School of Physics and Mathematics), Instituto Politécnico Nacional (National Polytechnic Institute), Gustavo A. Madero, Mexico City, Mexico
