Abstract
In many practical control or management systems, the system manager may tolerate system errors or parameter deviations as long as their statistical probability stays within a certain range. Decision optimization under probabilistic constraints is therefore a problem that needs to be addressed. In this paper, we develop an event-based approach for solving probabilistically constrained decision problems in discrete event dynamic systems. We first introduce the framework of event-based optimization and then, using the methodology of performance sensitivity analysis, present an online event-based policy iteration algorithm based on the derived performance gradient formula. We apply the event-based idea and propose the concepts of “risk state”, “risk event” and “risk index”, which better describe the nature of the probabilistically constrained problem. Furthermore, by taking the Lagrangian approach, the constrained decision problem can be solved in two steps. Finally, numerical experiments are designed to verify the efficiency of the proposed method.
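To make the two-step Lagrangian idea concrete, the following is a minimal sketch and not the algorithm of the paper. It assumes a hypothetical two-state Markov chain, takes the "risk index" to be the long-run probability of being in a risk state, and replaces the online event-based policy iteration with brute-force enumeration of deterministic policies. All names (`TRANSITIONS`, `REWARDS`, `RISK_STATES`, `solve`) and all numbers are illustrative assumptions.

```python
from itertools import product

# Hypothetical example: state 0 is normal, state 1 is the "risk state".
# TRANSITIONS[s][a] is the next-state distribution, REWARDS[s][a] the reward.
TRANSITIONS = [
    [[0.5, 0.5], [0.9, 0.1]],  # state 0: action 0 = aggressive, 1 = cautious
    [[0.5, 0.5]],              # state 1: a single recovery action
]
REWARDS = [[12.0, 6.0], [0.0]]
RISK_STATES = {1}


def stationary(P, iters=1000):
    """Stationary distribution of an ergodic chain by power iteration."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def evaluate(policy):
    """Return (average reward, long-run risk probability) of a policy."""
    P = [TRANSITIONS[s][a] for s, a in enumerate(policy)]
    pi = stationary(P)
    eta = sum(pi[s] * REWARDS[s][a] for s, a in enumerate(policy))
    risk = sum(pi[s] for s in RISK_STATES)
    return eta, risk


def best_policy(lam):
    """Step 1: maximize the Lagrangian eta - lam * risk for a fixed lam."""
    policies = product(*(range(len(acts)) for acts in TRANSITIONS))

    def lagrangian(d):
        eta, risk = evaluate(d)
        return eta - lam * risk

    return max(policies, key=lagrangian)


def solve(alpha, lam_max=100.0, steps=50):
    """Step 2: bisect on lam until the constraint risk <= alpha holds."""
    policy = best_policy(0.0)
    if evaluate(policy)[1] > alpha:
        if evaluate(best_policy(lam_max))[1] > alpha:
            raise ValueError("no feasible policy found for lam <= lam_max")
        lo, hi = 0.0, lam_max  # invariant: best_policy(hi) is feasible
        for _ in range(steps):
            mid = (lo + hi) / 2
            if evaluate(best_policy(mid))[1] <= alpha:
                hi = mid
            else:
                lo = mid
        policy = best_policy(hi)
    eta, risk = evaluate(policy)
    return policy, eta, risk


policy, eta, risk = solve(alpha=0.2)
print(policy, round(eta, 4), round(risk, 4))
```

With the risk bound `alpha=0.2`, the sketch rejects the aggressive action, which earns more reward but spends half of the time in the risk state, and settles on the cautious one. Because only deterministic policies are searched, the policy returned is feasible but not guaranteed to be optimal for the constrained problem in general.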
Acknowledgements
This work is supported by the Natural Science Foundation of Anhui Province (Nos. 1808085QG220, 1708085QG164), the National Natural Science Foundation of China (Nos. 71601066, 71501055, 71690230, 71690235), and the Humanities and Social Science Foundation of Ministry of Education in China (No. 16YJC630093).
Cite this article
Lu, X., Peng, Z., Zhang, Q. et al. Event-based optimization approach for solving stochastic decision problems with probabilistic constraint. Optim Lett 15, 569–590 (2021). https://doi.org/10.1007/s11590-019-01403-2
DOI: https://doi.org/10.1007/s11590-019-01403-2