Abstract
We study a sequence of many-agent exit time stochastic control problems, parameterized by the number of agents, with risk-sensitive cost structure. We identify a fully characterizing assumption, under which each such control problem corresponds to a risk-neutral stochastic control problem with additive cost, and sequentially to a risk-neutral stochastic control problem on the simplex that retains only the distribution of states of agents, while discarding further specific information about the state of each agent. Under some additional assumptions, we also prove that the sequence of value functions of these stochastic control problems converges to the value function of a deterministic control problem, which can be used for the design of nearly optimal controls for the original problem, when the number of agents is sufficiently large.
References
Arapostathis A, Borkar VS, Fernández-Gaucherand E, Ghosh MK, Marcus SI (1993) Discrete-time controlled Markov processes with average cost criterion: a survey. SIAM J Control Optim 31(2):282–344
Avila-Godoy G, Fernández-Gaucherand E (1998) Controlled Markov chains with exponential risk-sensitive criteria: modularity, structured policies and applications. In: Proceedings of the 37th IEEE conference on decision and control, 1998, vol 1. IEEE, pp 778–783
Bertsekas DP (2012) Dynamic programming and optimal control: approximate dynamic programming, vol 2. Athena Scientific, Belmont
Billingsley P (1995) Probability and measure, vol 3. Wiley series in probability and mathematical statistics. Wiley, New York
Borkar VS, Meyn SP (2002) Risk-sensitive optimal control for Markov decision processes with monotone cost. Math Oper Res 27(1):192–209
Cavazos-Cadena R (2010) Optimality equations and inequalities in a class of risk-sensitive average cost Markov decision chains. Math Methods Oper Res 71(1):47–84
Chung K-J, Sobel MJ (1987) Discounted MDP’s: distribution functions and exponential utility maximization. SIAM J Control Optim 25(1):49–62
Di Masi GB, Stettner Ł (2007) Infinite horizon risk sensitive control of discrete time Markov processes under minorization property. SIAM J Control Optim 46(1):231–252
Dupuis P, McEneaney WM (1997) Risk-sensitive and robust escape criteria. SIAM J Control Optim 35(6):2021–2049
Dupuis P, James MR, Petersen IR (2000) Robust properties of risk-sensitive control. Math Control Signals Syst 13:318–332
Dupuis P, James MR, Petersen IR (1998) Robust properties of risk-sensitive control. In: Proceedings of the 37th IEEE conference on decision and control (Cat. No.98CH36171), vol 2. IEEE, pp 2365–2370
Dupuis P, Ramanan K, Wu W (2016) Large deviation principle for finite-state mean field interacting particle systems. arXiv:1601.06219, p 62
Ethier SN, Kurtz TG (1986) Markov processes. In: SpringerReference, p 544
Fleming WH, Hernández-Hernández D (1997) Risk-sensitive control of finite state machines on an infinite horizon I. SIAM J Control Optim 35(5):1790–1810
Fleming WH, Soner HM (1989) Asymptotic expansions for Markov processes with Levy generators. Appl Math Optim 19:203–223
Ghosh MK, Saha S (2014) Risk-sensitive control of continuous time Markov chains. Stoch Int J Probab Stoch Process 86:37–41
Hernández-Hernández D, Marcus SI (1996) Risk sensitive control of Markov processes in countable state space. Syst Control Lett 29(3):147–155
Hernández-Lerma O, Lasserre JB (1999) Further topics in discrete time Markov control processes. Springer, Berlin
Howard RA, Matheson JE (1972) Risk-sensitive Markov decision processes. Manag Sci 18:356–369
Ikeda N, Watanabe S (1989) Stochastic differential equations and diffusion processes, vol 24 of North-Holland Mathematical Library, 2nd ed. North-Holland Publishing Co., Amsterdam; Kodansha, Ltd., Tokyo
Jaskiewicz A (2007) Average optimality for risk-sensitive control with general state space. Ann Appl Probab 17(2):654–675
Marcus SI, Fernández-Gaucherand E, Hernández-Hernández D, Coraluppi SP, Fard P (1997) Risk sensitive Markov decision processes. In: Systems and control in the 21st century. Birkhäuser Boston, Boston, MA, p 17
Petersen IR, James MR, Dupuis P (2000) Minimax optimal control of stochastic uncertain systems with relative entropy constraints. IEEE Trans Autom Control 45(3):398–412
Puterman ML (1991) Markov decision processes (Chapter 8). In: Heyman DP, Sobel MJ (eds) Stochastic models, vol 2. North Holland, Amsterdam
Sion M (1958) On general minimax theorems. Pac J Math 8:171–176
Whittle P (1996) Optimal control: basics and beyond. Wiley-Interscience series in systems and optimization, John Wiley & Sons, p 464
Xianping G, Hernández-Lerma O (2009) Continuous-time Markov decision processes: theory and applications. Springer, Berlin
Additional information
P. Dupuis: Research supported in part by AFOSR FA9550-12-1-0399; V. Laschos: Research supported in part by AFOSR FA9550-12-1-0399; K. Ramanan: Research supported in part by AFOSR FA9550-12-1-0399 and PI NSF DMS-171303.
Appendices
Properties of Hamiltonians
In this section we establish Lemma 3.4 and Theorem 3.3. We start with the proof of Lemma 3.4.
Proof of Lemma 3.4
To prove the exchange between supremum and infimum, we will apply a modification of Sion’s theorem (Corollary 3.3 in [25]), which states that if a continuous function G(u, q) is quasi-concave in u on some convex set \({\mathcal {U}}\) for every fixed q, and quasi-convex in q on some convex set \({\mathcal {Q}}\) for every fixed u, and if one of the two sets is compact, then we can exchange the supremum with the infimum. We start by investigating the validity of these properties when \( G= L_{xy}.\) Since \(\ell \) is convex, for each \(u \ge 0\) the map \(q \mapsto L_{xy}(u,q)\)
is convex. The map \(u \mapsto L_{xy}(u,q)\), on the other hand, need not be concave for \(q \ge 0\). However, we now show that under Assumption 3.2, for each \(q \ge 0\), \(u \mapsto L_{xy}(u,q)\) is quasi-concave, or equivalently, that \(\{u \ge 0:L_{xy}(u,q)\ge c\}\) is convex for every \(c\in {\mathbb {R}}.\) By differentiating with respect to u, we get
$$\begin{aligned} \partial _{u}L_{xy}(u,q)=-\frac{q}{u}-(C_{xy})^{\prime }\left( \frac{u}{\gamma _{xy}}\right) +1. \end{aligned}$$
If we prove that for each q the set of roots for \(\partial _{u}L_{xy}(u,q)\) is an interval or a point we are done, because a real function that changes monotonicity from increasing to decreasing at most once is quasi-concave. However \(\partial _{u}L_{xy}(u,q)\) has the same roots as \(Q(u) = u(C_{xy})^{\prime }\left( {\frac{u}{\gamma _{xy}}}\right) -u+q\). By part 1 of Assumption 3.2, Q(u) is increasing, which gives what is needed.
Thus, we are almost in a situation where we can apply Sion’s theorem, except that our sets are \([0,\infty )\) and hence non-compact. However, as we explain below, we can still apply this result by using the fact that \(\lim _{q\rightarrow \infty }L_{xy}(q,1)=\infty \). If we prove that
then we are done, since by Corollary 3.3 in [25]
Let \(M:= \inf _{q\in [0,\infty )}\sup _{u\in (0,\infty )}L_{xy}(u,q) \). We will assume that \(M<\infty \), and note that the case \( M=\infty \) is treated similarly. Since \(\lim _{q\rightarrow \infty }L_{xy}(q,1)=\infty ,\) we can find \({\tilde{q}}\) such that \(L_{xy}(q,1)>2M\) for every \(q\ge {\tilde{q}}.\) Now we have
and
which gives
\(\square \)
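The quasi-concavity step of this proof can be checked numerically on a toy instance. The sketch below is an illustration under stated assumptions, not part of the paper: the cost \(C(v)=1/v+v-2\), the rate `GAMMA`, and the grids are hypothetical choices, and \(\ell (x)=x\log x-x+1\). The check is run on \(G_{xy}(u,q)=u\ell (q/u)-\gamma _{xy}C_{xy}(u/\gamma _{xy})\), whose derivative in u has the same roots as Q(u). For this cost, \(Q(u)=q-\gamma _{xy}^{2}/u\), which is increasing, so each \(u\mapsto G_{xy}(u,q)\) should rise and then fall at most once.

```python
import math

GAMMA = 1.5  # hypothetical rate gamma_xy (assumption)


def ell(x):
    """ell(x) = x log x - x + 1, with the convention 0 log 0 = 0."""
    return 1.0 if x == 0.0 else x * math.log(x) - x + 1.0


def cost(v):
    """Hypothetical cost C(v) = 1/v + v - 2 (assumption, not from the paper)."""
    return 1.0 / v + v - 2.0


def cost_prime(v):
    """Derivative of the hypothetical cost."""
    return 1.0 - 1.0 / (v * v)


def G(u, q, gamma=GAMMA):
    """G(u, q) = u * ell(q/u) - gamma * C(u/gamma)."""
    return u * ell(q / u) - gamma * cost(u / gamma)


def Q(u, q, gamma=GAMMA):
    """Q(u) = u * C'(u/gamma) - u + q, as in the proof."""
    return u * cost_prime(u / gamma) - u + q


def is_unimodal(values, tol=1e-12):
    """True if the sequence never increases again once it has decreased."""
    decreasing = False
    for a, b in zip(values, values[1:]):
        if b < a - tol:
            decreasing = True
        elif b > a + tol and decreasing:
            return False
    return True


grid = [0.01 * k for k in range(1, 2001)]  # u in (0, 20]
q_values = [0.0, 0.3, 1.0, 4.0]

Q_increasing = all(
    Q(u2, q) >= Q(u1, q) for q in q_values for u1, u2 in zip(grid, grid[1:])
)
G_quasi_concave = all(is_unimodal([G(u, q) for u in grid]) for q in q_values)
print(Q_increasing, G_quasi_concave)
```

A grid check of this kind cannot prove quasi-concavity; it only gives a quick sanity test of the monotonicity of Q and of the single-peak shape of \(u\mapsto G_{xy}(u,q)\) for one admissible-looking cost.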
Proof of Theorem 3.3
Let \(H^-\) (respectively, \(H^+\)) denote the left-hand side (respectively, right-hand side), of (3.7). Since each term in the sum that generates \(H^{+}\) is bigger than the corresponding one in the sum of \(H^{-},\) we get equality for all of them. By the theory of the Legendre transform, we know that \(\inf _{q\in [0,\infty )}\sup _{ u\in (0,\infty )}\left\{ q\xi _{xy}+G_{xy}(u,q)\right\} \) is actually a concave function. Since we can exchange the order between the supremum and infimum, then \(\sup _{u\in (0,\infty )}\inf _{q\in [0,\infty )}\left\{ q\xi _{xy}+G_{xy}(u,q)\right\} \) must be a concave function as well. By using the formula
we have that \(\left( C_{xy}\right) ^{*}\left( -\ell ^{*}\left( \xi \right) \right) =\left( C_{xy}\right) ^{*}\left( 1-e^{\xi }\right) \) must also be concave. By differentiating with respect to \(\xi \) we get, \(e^{2\xi }\left( \left( C_{xy}\right) ^{*}\right) ^{\prime \prime }\left( 1-e^{\xi }\right) -e^{\xi }\left( \left( C_{xy}\right) ^{*}\right) ^{\prime }\left( 1-e^{\xi }\right) \le 0,\) from which, by using the identity \((f^{*})^{\prime }=(f^{\prime })^{-1}\), we get
By substituting \({\tilde{u}}=1-e^{\xi }\) we get
Now the last inequality implies that either \(\left( C_{xy}\right) ^{\prime }\left( u\right) \ge 1\) or that \(u(C_{xy})^{\prime }\left( u\right) -u\) is locally increasing and, moreover, that if \(\left( C_{xy}\right) ^{\prime }\left( u_{0}\right) \ge 1\) for some \(u_{0},\) then it must remain so for every \(u\ge u_{0}.\) If that were not the case, then we could find \( u_{1}>u_{0}\) such that \(u_{1}(C_{xy})^{\prime }\left( u_{1}\right) -u_{1}< {\hat{q}}\) for some negative \({\hat{q}},\) while \(u_{0}(C_{xy})^{\prime }\left( u_{0}\right) -u_{0}\ge 0.\) By a suitable application of the mean value theorem, we would get the existence of an r for which the last inequality fails. If we set \({\tilde{u}}_{xy}=\inf \{u: \left( C_{xy}\right) ^{\prime }\left( u\right) \ge 1\},\) then Assumption 3.2 is recovered. \(\square \)
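The identity \((f^{*})^{\prime }=(f^{\prime })^{-1}\) used in this proof can be illustrated on a concrete convex function. The following sketch is only a numerical illustration under assumptions: the function \(f(v)=1/v+v-2\) on \(v>0\), the grid bounds, and the step size `h` are hypothetical choices, and the Legendre transform is approximated by brute force rather than computed in closed form.

```python
import math


def f(v):
    """Hypothetical convex function f(v) = 1/v + v - 2 on v > 0 (assumption)."""
    return 1.0 / v + v - 2.0


def f_prime_inverse(s):
    """Inverse of f'(v) = 1 - 1/v**2, defined for s < 1."""
    return 1.0 / math.sqrt(1.0 - s)


def legendre(s, lo=1e-3, hi=1e3, steps=100_000):
    """Brute-force f*(s) = sup_{v>0} { s*v - f(v) } on a logarithmic grid."""
    ratio = (hi / lo) ** (1.0 / steps)
    best, v = -math.inf, lo
    for _ in range(steps + 1):
        best = max(best, s * v - f(v))
        v *= ratio
    return best


def legendre_derivative(s, h=1e-3):
    """Central finite difference of the numerically computed f*."""
    return (legendre(s + h) - legendre(s - h)) / (2.0 * h)


# Compare (f*)'(s) with (f')^{-1}(s) at a few slopes s < 1.
errors = {
    s: abs(legendre_derivative(s) - f_prime_inverse(s)) for s in (-3.0, 0.0, 0.5)
}
print(errors)
```

The slopes are kept below 1 because \(f^{\prime }(v)<1\) for every \(v>0\) in this example, so the supremum defining \(f^{*}(s)\) is attained at an interior point only for such s.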
Properties of \(F_{xy}\)
Proof of Lemma 3.7
(1) We have
$$\begin{aligned} F_{xy}(q)= & {} \sup _{u\in (0,\infty )}\left\{ u\ell \left( \frac{q}{u}\right) -\gamma _{xy}C_{xy}\left( \frac{u}{\gamma _{xy}}\right) \right\} \ge \gamma _{xy}\ell \left( \frac{q}{\gamma _{xy}}\right) -\gamma _{xy}C_{xy}\left( \frac{\gamma _{xy}}{\gamma _{xy}}\right) \\ \ge & {} \gamma _{xy}\ell \left( \frac{q }{\gamma _{xy}}\right) \ge 0 . \end{aligned}$$
(2) We have
$$\begin{aligned} \begin{aligned} F_{xy}(\gamma _{xy})&=\sup _{u\in (0,\infty )}G_{xy}(u,\gamma _{xy})=\sup _{u\in (0,\infty )}\left\{ u\ell \left( \frac{\gamma _{xy}}{u} \right) -\gamma _{xy}C_{xy}\left( \frac{u}{\gamma _{xy}}\right) \right\} \\&=\sup _{u\in (0,\infty )}\left\{ \gamma _{xy}\log \gamma _{xy}-\gamma _{xy}\log u-\gamma _{xy}+u-\gamma _{xy}C_{xy}\left( \frac{u}{\gamma _{xy}} \right) \right\} , \end{aligned} \end{aligned}$$
and by applying part 2 of Lemma 3.6
$$\begin{aligned} \gamma _{xy}C_{xy}\left( \frac{u}{\gamma _{xy}}\right) \ge \gamma _{xy}\log \gamma _{xy}-\gamma _{xy}\log u-\gamma _{xy}+u. \end{aligned}$$
Therefore, \(F_{xy}(\gamma _{xy})\le 0.\) However, by part (1) of this lemma \( F_{xy}(\gamma _{xy})\ge 0,\) and therefore, the equality follows.
(3) By definition \(F_{xy}(q)=\sup _{u\in (0,\infty )}G_{xy}(u,q).\) Let \(a\in (0,1)\) and \(0\le q_{1}<q_{2}<\infty \), and let \(q=aq_{1}+(1-a)q_{2}\). Using the convexity of \(G_{xy}(u,q)\) for fixed u as a function of q, we have
$$\begin{aligned} \begin{aligned} F_{xy}(aq_{1}+(1-a)q_{2})&=\sup _{u\in (0,\infty )}G_{xy}(u,aq_{1}+(1-a)q_{2}) \\&\le \sup _{u\in (0,\infty )}\left\{ aG_{xy}(u,q_{1})+(1-a)G_{xy}(u,q_{2})\right\} \\&\le a\sup _{u\in (0,\infty )}G_{xy}(u,q_{1})+(1-a)\sup _{u\in (0,\infty )}G_{xy}(u,q_{2}) \\&\le aF_{xy}(q_{1})+(1-a)F_{xy}(q_{2}). \end{aligned} \end{aligned}$$
\(\square \)
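The three properties of \(F_{xy}\) established above can be sanity-checked numerically. The sketch below rests on assumptions that are not from the paper: the cost \(C(v)=1/v+v-2\) (which satisfies \(C(1)=0\) and \(C(v)\ge v-1-\log v\)), the rate `GAMMA`, and the logarithmic grid used to approximate the supremum over u are all hypothetical choices.

```python
import math

GAMMA = 1.5  # hypothetical rate gamma_xy (assumption)


def ell(x):
    """ell(x) = x log x - x + 1, with the convention 0 log 0 = 0."""
    return 1.0 if x == 0.0 else x * math.log(x) - x + 1.0


def cost(v):
    """Hypothetical cost C(v) = 1/v + v - 2 (assumption, not from the paper)."""
    return 1.0 / v + v - 2.0


def G(u, q, gamma=GAMMA):
    """G(u, q) = u * ell(q/u) - gamma * C(u/gamma)."""
    return u * ell(q / u) - gamma * cost(u / gamma)


def F(q, gamma=GAMMA, lo=1e-4, hi=1e4, steps=20_000):
    """Approximate F(q) = sup_{u>0} G(u, q) on a logarithmic grid.

    The point u = gamma is always included, since part (1) of the
    proof evaluates G exactly there.
    """
    ratio = (hi / lo) ** (1.0 / steps)
    best, u = G(gamma, q, gamma), lo
    for _ in range(steps + 1):
        best = max(best, G(u, q, gamma))
        u *= ratio
    return best


qs = [0.05, 0.5, 1.5, 3.0, 10.0]
values = {q: F(q) for q in qs}

# (1) F(q) >= gamma * ell(q / gamma) >= 0
part1 = all(values[q] >= GAMMA * ell(q / GAMMA) >= 0.0 for q in qs)
# (2) F(gamma) = 0, up to rounding
part2 = abs(values[GAMMA]) < 1e-9
# (3) midpoint convexity, up to the grid error of the approximate supremum
part3 = all(
    F(0.5 * (a + b)) <= 0.5 * (values[a] + values[b]) + 1e-5
    for a in qs
    for b in qs
    if a < b
)
print(part1, part2, part3)
```

For this particular cost the supremum is attained at \(u=\gamma _{xy}^{2}/q\), and a direct computation gives \(F_{xy}(q)=2\gamma _{xy}\ell (q/\gamma _{xy})\), which is consistent with all three parts of the lemma.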
For the proof of Lemma 4.5, we will use the following auxiliary lemma. Recall the definition of \(G_{xy}\) in (1.8).
Lemma B.1
If \(\{{\varvec{C}}^{n}\}\) satisfies Assumption , then the following hold for every \((x,y)\in ~{\mathcal {Z}}\).
1. There exists a positive real number M that does not depend on (x, y), such that for the decreasing function \(M_{xy}^{1}:(0,\infty )\rightarrow [0,\infty ),\) given by
$$\begin{aligned} M_{xy}^{1}(q)\doteq \min \left\{ \gamma _{xy}\left( \frac{\gamma _{xy}}{q} \right) ^{1/p},M\right\} , \end{aligned}$$
we have that \(G_{xy}(u,q)\) is increasing as a function of u on the interval \((0,M_{xy}^{1}(q)].\)
2. There exists a decreasing function \(M_{xy}^{2}:(0,\infty )\rightarrow [0,\infty ),\) with \(M_{xy}^{2}(q)\ge M_{xy}^{1}(q),\) such that \( G_{xy}(u,q)\) is decreasing as a function of u on the interval \(\left[ M_{xy}^{2}(q),\infty \right) \).
Proof
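By taking the derivative with respect to u in the definition (1.8), we get
$$\begin{aligned} \partial _{u}G_{xy}(u,q)=-\frac{q}{u}-(C_{xy})^{\prime }\left( \frac{u}{\gamma _{xy}}\right) +1. \end{aligned}$$
(1) By part 2 of Assumption 4.3 there exists \(M\in (0,\infty )\) such that if \(u<M\), then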
$$\begin{aligned} -\frac{q}{u}-(C_{xy})^{\prime }\left( \frac{u}{\gamma _{xy}}\right) +1\ge - \frac{q}{u}+\left( \frac{\gamma _{xy}}{u}\right) ^{p+1}+1, \end{aligned}$$
and by taking \(u\le \gamma _{xy}\left( \gamma _{xy}/q\right) ^{1/p}\) we get
$$\begin{aligned} -\frac{q}{u}+\left( \frac{\gamma _{xy}}{u}\right) ^{p+1}+1\ge -\frac{q}{u}+ \frac{q}{u}+1>0. \end{aligned}$$
Therefore, for
$$\begin{aligned} M_{xy}^{1}(q)=\min \left\{ \gamma _{xy}\left( \frac{\gamma _{xy}}{q}\right) ^{1/p},M\right\} , \end{aligned}$$
we have \(-\frac{q}{u}-(C_{xy})^{\prime }\left( \frac{u}{\gamma _{xy}}\right) +1\ge 0\) on the interval \((0,M_{xy}^{1}(q)].\)
(2) By applying part 3 of Assumption 4.3, we get that there exists a decreasing \({\tilde{M}}_{xy}^{2}(q)<\infty ,\) such that if \(u>{\tilde{M}} _{xy}^{2}(q)\) then
$$\begin{aligned} \frac{u}{\gamma _{xy}}(C_{xy})^{\prime }\left( \frac{u}{\gamma _{xy}}\right) -\frac{u}{\gamma _{xy}}\ge -\frac{q}{\gamma _{xy}}. \end{aligned} \tag{B.1}$$
Then \(M_{xy}^{2}(q)\doteq \max \{M_{xy}^{1}(q),{\tilde{M}}_{xy}^{2}(q)\}\) is decreasing and bigger than \(M_{xy}^{1}\), and using (B.1) we get
$$\begin{aligned} -\frac{q}{u}-(C_{xy})^{\prime }\left( \frac{u}{\gamma _{xy}}\right) +1=- \frac{q}{u}-\frac{\gamma _{xy}}{u}\left( \frac{u}{\gamma _{xy}} (C_{xy})^{\prime }\left( \frac{u}{\gamma _{xy}}\right) -\frac{u}{\gamma _{xy} }\right) \le 0 \end{aligned}$$
on the interval \([M_{xy}^{2}(q),\infty ).\) \(\square \)
Proof of Lemma 4.5
(1) Let \(\epsilon >0,\) and \(q\ge \epsilon \). By Lemma B.1, we have that \(G_{xy}\left( u,q\right) ,\) as a function of u, is increasing on the interval \((0,M_{xy}^{1}(q)]\). Therefore, for all \(u\in (0,M_{xy}^{1}(q)]\) we have
$$\begin{aligned} \begin{aligned} u\ell \left( \frac{q}{u}\right) -\gamma _{xy}C_{xy}\left( \frac{u}{\gamma _{xy}}\right)&\le M_{xy}^{1}(q)\ell \left( \frac{q}{M_{xy}^{1}(q)}\right) -\gamma _{xy}C_{xy}\left( \frac{M_{xy}^{1}(q)}{\gamma _{xy}}\right) \\&\le M_{xy}^{1}(q)\ell \left( \frac{q}{M_{xy}^{1}(q)}\right) \\&\le q\log \left( \frac{q}{M_{xy}^{1}(q)}\right) +M_{xy}^{1}(q) \\&\le q\log \left( \frac{q}{M_{xy}^{1}(q)}\right) +M_{xy}^{1}(\epsilon ) \\&\le q\log \left( q\right) -q\log \left( M_{xy}^{1}(q)\right) +M_{xy}^{1}(\epsilon ) \\&\overset{M_{xy}^{1}(\epsilon )\le M_{xy}^{2}(\epsilon )}{\le }q\log \left( q\right) -q\log \left( M_{xy}^{1}(q)\right) +M_{xy}^{2}(\epsilon ). \end{aligned} \end{aligned}$$
By the second part of Lemma B.1, we have that \(G_{xy}(u,q) \) is decreasing on the interval \((M_{xy}^{2}(\epsilon ),\infty )\). Therefore, for all \(u\in (M_{xy}^{2}(\epsilon ),\infty )\)
$$\begin{aligned} \begin{aligned} u\ell \left( \frac{q}{u}\right) -\gamma _{xy}C_{xy}\left( \frac{u}{\gamma _{xy}}\right)&\le M_{xy}^{2}(\epsilon )\ell \left( \frac{q}{ M_{xy}^{2}(\epsilon )}\right) -\gamma _{xy}C_{xy}\left( \frac{ M_{xy}^{2}(\epsilon )}{\gamma _{xy}}\right) \\&\le M_{xy}^{2}(\epsilon )\ell \left( \frac{q}{M_{xy}^{2}(\epsilon )}\right) \\&\le q\log \left( \frac{q}{M_{xy}^{2}(\epsilon )}\right) +M_{xy}^{2}(\epsilon ) \\&\overset{M_{xy}^{2}(q)\le M_{xy}^{2}(\epsilon )}{\le }q\log \left( q\right) -q\log \left( M_{xy}^{2}(q)\right) +M_{xy}^{2}(\epsilon ) \\&\overset{M_{xy}^{1}(q)\le M_{xy}^{2}(q)}{\le }q\log \left( q\right) -q\log \left( M_{xy}^{1}(q)\right) +M_{xy}^{2}(\epsilon ). \end{aligned} \end{aligned}$$
Finally, for the interval \([M_{xy}^{1}(q),M_{xy}^{2}(\epsilon )]\) we have
$$\begin{aligned} \begin{aligned} u\ell \left( \frac{q}{u}\right) -\gamma _{xy}C_{xy}\left( \frac{u}{\gamma _{xy}}\right)&\le u\ell \left( \frac{q}{u}\right) =q\log q-q\log u-q+u \\&\le q\log q-q\log (M_{xy}^{1}(q))+M_{xy}^{2}(\epsilon ). \end{aligned} \end{aligned}$$
Now if we recall the definition of \(M_{xy}^{1}\) given in Lemma and set \({\bar{M}}(q)\doteq \max \{M_{xy}^{2}(q):(x,y)\in {\mathcal {Z}}\},\) then
$$\begin{aligned} G_{xy}(u,q)\le q\log \frac{q}{\min \left\{ \gamma _{xy}\left( \frac{\gamma _{xy}}{q}\right) ^{1/p},M\right\} }+{\bar{M}}(\epsilon ), \end{aligned}$$
and by taking the supremum over u we end up with \(F_{xy}(q)\) satisfying the same bound.
(2) This is straightforward since \(F_{xy}\) is finite on the interval \( (0,\infty ),\) and convex. \(\square \)
Tightness functionals
Proof of Lemma 5.1
Let \(c_{2}>0\) and \(\{(\varvec{\mu }^{n},T^{n})\}\) be a deterministic sequence in S with \(\varvec{\mu }^{n}\) absolutely continuous such that
and \(|\dot{\varvec{\mu }}^{n}(t)|=0\) for \(t>T^{n}\). We need to show that H has level sets with compact closure. Since all elements are positive, we have that \(T^{n}\le c_{2}/c_{1}\). Let \( \varvec{\bar{\mu }}^{n}\) denote the restriction of \(\varvec{\mu }^{n}\) to \([0,c_{2}/c_{1}]\). If we prove that \(\varvec{\bar{\mu }}^{n}\) converges along some subsequence, then we are done. Using the inequality \( ab\le e^{ca}+\ell (b)/c,\) which is valid for \(a,b\ge 0,\) and \(c\ge 1,\) we have that
This shows that the family \(\left\{ \varvec{\bar{\mu }}^{n}\right\} \) is equicontinuous. Since \(\varvec{\bar{\mu }}^{n}(t)\) takes values in the compact set \({\mathcal {P}}({\mathcal {X}})\), by the Arzelà–Ascoli theorem there is a convergent subsequence. \(\square \)
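The inequality \(ab\le e^{ca}+\ell (b)/c\) used in this proof follows from Young's inequality for the pair \(\ell (b)=b\log b-b+1\) and its Legendre dual \(e^{a}-1\), which gives \(ab\le (e^{ca}-1)/c+\ell (b)/c\). The sketch below is only a numerical sanity check on a finite grid; the grid ranges and the helper name `young_gap` are hypothetical choices made for illustration.

```python
import math


def ell(x):
    """ell(x) = x log x - x + 1, with the convention 0 log 0 = 0."""
    return 1.0 if x == 0.0 else x * math.log(x) - x + 1.0


def young_gap(a, b, c):
    """e^{ca} + ell(b)/c - a*b; the inequality says this is nonnegative."""
    return math.exp(c * a) + ell(b) / c - a * b


a_grid = [0.1 * i for i in range(51)]    # a in [0, 5]
b_grid = [0.25 * j for j in range(201)]  # b in [0, 50]
c_grid = [1.0, 2.0, 5.0]                 # c >= 1

min_gap = min(young_gap(a, b, c) for a in a_grid for b in b_grid for c in c_grid)
print(min_gap)
```

A nonnegative `min_gap` on the grid is consistent with the inequality but does not replace the short analytic argument via the Legendre dual.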
Cite this article
Dupuis, P., Laschos, V. & Ramanan, K. Exit time risk-sensitive control for systems of cooperative agents. Math. Control Signals Syst. 31, 279–332 (2019). https://doi.org/10.1007/s00498-019-0239-3