Computing multiobjective Markov chains handled by the extraproximal method
Abstract
This paper proposes a new method for generating the Pareto front in multi-objective Markov chains that overcomes some drawbacks of existing multi-objective methods. A fundamental issue is to find strong Pareto policies, that is, policies whose cost-function value is the closest in Euclidean norm to the utopian point. A strong Pareto policy is reached when no cost function, constrained by the strategies of the others, can further improve its own criterion. The constraints associated with the objective function are handled by formulating the problem as a bi-level optimization problem. We then reduce it to a single-level optimization problem by introducing a generalized Lagrangian function, which represents the original multi-objective problem in terms of a related nonlinear programming problem. Next, we apply the Tikhonov regularization method to the objective function. The regularization ensures that all the Pareto policies generated along the Pareto front are strong Pareto policies. To solve the problem we employ the extraproximal method. The method effectively approximates every optimal Pareto point in the Pareto front, which in this case is a strong Pareto point. An experiment on a route-selection problem for counter-kidnapping validates the effectiveness and usefulness of the method.
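The selection criterion in the abstract (the cost vector closest in Euclidean norm to the utopian point) can be illustrated with a minimal sketch. It is not the paper's extraproximal algorithm. It assumes a small, finite set of candidate policies whose cost vectors are already known, and it takes the utopian point to be the componentwise minimum over those candidates. The function names and the sample costs are hypothetical.

```python
import math

def utopian_point(costs):
    # Assumption: the utopian point is the componentwise minimum
    # over the candidate cost vectors (costs are minimized).
    return [min(column) for column in zip(*costs)]

def closest_to_utopia(costs):
    # Return the index of the cost vector nearest to the utopian point
    # in Euclidean norm, together with the utopian point and the distance.
    u = utopian_point(costs)
    distances = [math.dist(c, u) for c in costs]
    best = min(range(len(costs)), key=distances.__getitem__)
    return best, u, distances[best]

# Hypothetical cost vectors of three candidate policies, two objectives each.
costs = [[1.0, 9.0], [3.0, 4.0], [8.0, 2.0]]
best, u, d = closest_to_utopia(costs)
print(best, u, d)
```

In this toy setting the utopian point is generally not attained by any single policy, so the criterion picks a compromise. In the paper's setting the candidates are not enumerated; the abstract describes reaching such points through the regularized Lagrangian problem solved by the extraproximal method.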
Keywords
Multi-objective; Strong Pareto optimal; Euclidean norm; Nash; Markov chains; Route selection