Encyclopedia of Systems and Control

Living Edition
Editors: John Baillieul, Tariq Samad

Stochastic Linear-Quadratic Control

  • Dr. Shanjian Tang
Living reference work entry
DOI: https://doi.org/10.1007/978-1-4471-5102-9_228-1

Abstract

In this short article, we briefly review some major historical studies and recent progress on continuous-time stochastic linear-quadratic (SLQ) control and related mean-variance (MV) hedging.

Keywords

Riccati equation; Quadratic backward stochastic differential equations; BMO-martingale; Bellman’s quasilinearization; Monotone convergence; Mean-variance hedging

Introduction

A stochastic linear-quadratic (SLQ) control problem is the optimal control of a linear stochastic dynamic equation subject to an expected quadratic cost functional of the system state and control. As shown in Athans (1971), it is a typical case of optimal stochastic control in both theory and application. Owing to the linearity of the system dynamics and the quadratic form of the cost functional, the optimal control law is usually synthesized as a feedback (also called closed-loop) form of the optimal state, and the corresponding gain coefficients are specified by the associated Riccati equation. In what follows, we restrict our exposition to the continuous-time SLQ problem, and mainly to the finite-horizon case.

The initial study of the continuous-time SLQ problem seems to be due to Florentin (1961). However, his linear stochastic control system is assumed to be Gaussian; that is, the system noise is additive and multiplies neither the state nor the control. Such a case is usually termed the linear-quadratic Gaussian (LQG) problem, and in the case of complete observation, the optimal feedback law remains unchanged when the white noise vanishes. The continuous-time partially observable case was first discussed by Potter (1964), and a more general formulation was later given by Wonham (1968a). It is proved that the optimal control can be obtained in the following two separate steps: (1) generate the conditional mean estimate of the current state using a Kalman filter and (2) apply the optimal feedback as if the conditional mean state estimate were the true state of the system. This result is referred to as the certainty equivalence principle or the strict separation theorem. Different assumptions for the separation of control and state estimation were discussed by Tse (1971).

Wonham (1967, 1968b, 1970) investigated the SLQ problem in a fairly general and systematic framework. In the first two papers, his stochastic system admits a state-dependent noise. Finally, Wonham (1970) considered the following very general (admitting both state- and control-dependent noise) linear stochastic differential system driven by a d-dimensional Brownian motion \(W = (W^{1},W^{2},\ldots ,W^{d})\):
$$X_{t} = x +\displaystyle\int _{ 0}^{t}(A_{ s}X_{s} + B_{s}u_{s})\,ds +\displaystyle\int _{ 0}^{t}\displaystyle\sum _{ i=1}^{d}(C_{ s}^{i}X_{ s} + D_{s}^{i}u_{ s})\,dW_{s}^{i},\quad t \in [0,T];$$
and the following cost functional:
$$J(u) = E\langle MX_{T},X_{T}\rangle + E\displaystyle\int _{0}^{T}[\langle Q_{ t}X_{t},X_{t}\rangle +\langle N_{t}u_{t},u_{t}\rangle ]\,dt.$$
Here, \(T > 0\), \(X_{t} \in R^{n}\) is the state at time t, and \(u_{t} \in R^{m}\) is the control at time t. Assume that all the coefficients \(A,B;C^{i},D^{i},i = 1,2,\ldots ,d;Q,N\) are piecewise continuous matrix-valued functions of time (of suitable dimensions), that M and \(Q_{t}\) are nonnegative matrices, and that \(N_{t}\) is uniformly positive. Wonham (1970) gave the following Riccati equation:
$$\left \{\begin{array}{rcl} -\dot{ K}_{t}& =&A_{t}^{{\ast}}K_{t} + K_{t}A_{t} + C_{t}^{i{\ast}}K_{t}C_{t}^{i} + Q_{t} - \Gamma _{t}(K_{t})^{{\ast}}(N_{t} + D_{t}^{i{\ast}}K_{t}D_{t}^{i})\Gamma _{t}(K_{t}),\quad t \in [0,T); \\ K_{T}& =&M.\end{array} \right .$$
(1)
Here, the asterisk stands for transpose, the repeated superscripts imply summation from 1 to d, and the function \(\Gamma \) is defined by
$$\Gamma _{t}(K) := -(N_{t} + D_{t}^{i{\ast}}KD_{ t}^{i})^{-1}(KB_{ t} + C_{t}^{i{\ast}}KD_{ t}^{i})^{{\ast}}$$
for time \(t \in [0,T]\) and any \(K \in \mathcal{S}_{+}^{n} :=\{ \mbox{ all nonnegative $n \times n$ matrices}\}\). This Riccati equation is a nonlinear ordinary differential equation (ODE). Since the nonlinear term \(\Gamma _{t}(K)^{{\ast}}(N_{t} + D_{t}^{i{\ast}}KD_{t}^{i})\Gamma _{t}(K)\) on the right-hand side is in general not uniformly Lipschitz in K, the standard existence and uniqueness theorem for ODEs does not directly tell whether this Riccati equation has a unique continuous solution in \(\mathcal{S}_{+}^{n}\). To resolve this issue, Wonham (1970) used Bellman’s principle of quasilinearization and constructed the following sequence of successive linear approximating matrix-valued ODEs.
Define for \((t,K,\tilde{\Gamma }) \in [0,T] \times R^{n\times n} \times R^{m\times n}\),
$$\displaystyle\begin{array}{rcl} F_{t}(K,\tilde{\Gamma })& :=& [A_{t} + B_{t}\tilde{\Gamma }]^{{\ast}}K + K[A_{ t} + B_{t}\tilde{\Gamma }] \\ & & +[C_{t}^{i} + D_{ t}^{i}\tilde{\Gamma }]^{{\ast}}K[C_{ t}^{i} + D_{ t}^{i}\tilde{\Gamma }] + Q_{ t} + \tilde{\Gamma }^{{\ast}}N_{ t}\tilde{\Gamma }.\end{array}$$
(2)
For \(K \in \mathcal{S}_{+}^{n}\), the matrix \(F_{t}(K,\tilde{\Gamma }) - F_{t}(K,\Gamma _{t}(K))\) is nonnegative, that is,
$$F_{t}(K,\tilde{\Gamma }) \geq F_{t}(K,\Gamma _{t}(K)),\quad \forall \,\tilde{\Gamma } \in R^{m\times n}.$$
(3)
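Inequality (3) can be checked by completing the square. The following is a sketch in the notation above; the abbreviations \(R_{t}(K)\) and \(S_{t}(K)\) are introduced here only for this computation. Write \(R_{t}(K) := N_{t} + D_{t}^{i{\ast}}KD_{t}^{i}\) and \(S_{t}(K) := (KB_{t} + C_{t}^{i{\ast}}KD_{t}^{i})^{{\ast}}\), so that \(\Gamma _{t}(K) = -R_{t}(K)^{-1}S_{t}(K)\). Expanding (2) for symmetric K and collecting the terms in \(\tilde{\Gamma }\) gives
$$F_{t}(K,\tilde{\Gamma }) = A_{t}^{{\ast}}K + KA_{t} + C_{t}^{i{\ast}}KC_{t}^{i} + Q_{t} + \tilde{\Gamma }^{{\ast}}S_{t}(K) + S_{t}(K)^{{\ast}}\tilde{\Gamma } + \tilde{\Gamma }^{{\ast}}R_{t}(K)\tilde{\Gamma },$$
and therefore
$$F_{t}(K,\tilde{\Gamma }) - F_{t}(K,\Gamma _{t}(K)) = (\tilde{\Gamma } - \Gamma _{t}(K))^{{\ast}}R_{t}(K)(\tilde{\Gamma } - \Gamma _{t}(K)) \geq 0,$$
since \(R_{t}(K)\) is positive definite for \(K \in \mathcal{S}_{+}^{n}\). Setting \(\tilde{\Gamma } = \Gamma _{t}(K)\) in the first identity also yields \(F_{t}(K,\Gamma _{t}(K)) = A_{t}^{{\ast}}K + KA_{t} + C_{t}^{i{\ast}}KC_{t}^{i} + Q_{t} - \Gamma _{t}(K)^{{\ast}}R_{t}(K)\Gamma _{t}(K)\).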
Riccati equation (1) can then be written in the following form:
$$\left \{\begin{array}{rcl} -\dot{ K}_{t}& =&F_{t}(K_{t},\Gamma _{t}(K_{t})),\quad t \in [0,T); \\ K_{T}& =&M.\end{array} \right .$$
(4)
The iterative linear approximations are therefore structured as follows: set \(K^{0} \equiv M\), and for \(l = 1,2,\ldots\),
$$\left \{\begin{array}{rcl} -\dot{ K}_{t}^{l}& =&F_{t}(K_{t}^{l},\Gamma _{t}(K_{t}^{l-1})),\quad t \in [0,T); \\ K_{T}^{l}& =&M.\end{array} \right .$$
(5)
Using the above minimal property (3) of \(F_{t}(K,\cdot )\) at \(\Gamma _{t}(K)\), Wonham showed that the unique nonnegative solution \(K^{l}\) of ODE (5) is monotonically decreasing in the index \(l = 1,2,\ldots\). By the method of monotone convergence, the sequence of solutions \(\{K^{l}\}\) is shown to converge to some \(K \in \mathcal{S}_{+}^{n}\), which turns out to solve Riccati equation (1).
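The iteration (5) can be tried numerically in the scalar case n = m = d = 1 with constant coefficients. The following is a minimal sketch, not Wonham's construction itself: the function names, the coefficient values, the fixed time grid, and the classical Runge-Kutta integrator are all assumptions made here for illustration. Each pass freezes the gain \(\Gamma _{t}(K_{t}^{l-1})\) on the grid and solves the resulting linear ODE backward from \(K_{T} = M\); the result is compared with a direct backward integration of the nonlinear form (4).

```python
def gamma(K, coef):
    """Scalar Gamma(K) = -(N + D*K*D)^(-1) * (K*B + C*K*D)."""
    a, b, c, d, q, n = coef
    return -(K * b + c * K * d) / (n + d * K * d)


def F(K, g, coef):
    """Scalar F(K, g) from (2); affine in K for a fixed gain g."""
    a, b, c, d, q, n = coef
    return 2.0 * (a + b * g) * K + (c + d * g) ** 2 * K + q + n * g * g


def solve_backward(M, steps, h, rhs):
    """Integrate -dK/dt = rhs(s, K) from K_T = M back to t = 0 with RK4.

    s is a (possibly half-integer) grid index, so that t = s * h.
    """
    K = [0.0] * (steps + 1)
    K[steps] = M
    for k in range(steps - 1, -1, -1):
        y = K[k + 1]
        k1 = rhs(k + 1, y)
        k2 = rhs(k + 0.5, y + 0.5 * h * k1)
        k3 = rhs(k + 0.5, y + 0.5 * h * k2)
        k4 = rhs(k, y + h * k3)
        K[k] = y + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return K


def interp(values, s):
    """Grid value at index s; midpoints use the average of the neighbours."""
    lo = int(s)
    return values[lo] if lo == s else 0.5 * (values[lo] + values[lo + 1])


def quasilinearize(M, T, steps, coef, iterations):
    """Return [K^1, ..., K^iterations] from scheme (5), starting at K^0 = M."""
    h = T / steps
    K_prev = [M] * (steps + 1)
    iterates = []
    for _ in range(iterations):
        # Gain Gamma_t(K_t^{l-1}) frozen on the grid for this pass.
        gains = [gamma(K, coef) for K in K_prev]
        K_prev = solve_backward(
            M, steps, h, lambda s, K, g=gains: F(K, interp(g, s), coef)
        )
        iterates.append(K_prev)
    return iterates


def riccati_direct(M, T, steps, coef):
    """Integrate the nonlinear form (4) directly, for comparison."""
    return solve_backward(
        M, steps, T / steps, lambda s, K: F(K, gamma(K, coef), coef)
    )


if __name__ == "__main__":
    coef = (-0.5, 1.0, 0.3, 0.2, 1.0, 1.0)  # hypothetical a, b, c, d, q, n
    iterates = quasilinearize(M=1.0, T=1.0, steps=200, coef=coef, iterations=8)
    reference = riccati_direct(M=1.0, T=1.0, steps=200, coef=coef)
    for l, K in enumerate(iterates, start=1):
        gap = max(abs(x - y) for x, y in zip(K, reference))
        print(f"l = {l}: K_0 = {K[0]:.8f}, max gap to direct solve = {gap:.2e}")
```

Because each pass only requires a linear ODE solve, the scheme avoids the non-Lipschitz nonlinearity at the price of repeated integrations; on a discrete grid the monotone decrease of the iterates holds only up to discretization error.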

The Case of Random Coefficients and Backward Stochastic Riccati Equation

The papers of Bismut (1976, 1978) are the first studies on the SLQ problem with random coefficients. Let \(\{\mathcal{F}_{t},t \in [0,T]\}\) be the completed natural filtration of W. When the coefficients \(A,B;C^{i},D^{i},i = 1,2,\ldots ,d;Q,N\) and M may be random, with \(A,B;C^{i},D^{i},i = 1,2,\ldots ,d;Q,N\) being \(\mathcal{F}_{t}\)-adapted and essentially bounded and M being \(\mathcal{F}_{T}\)-measurable and essentially bounded, Bismut (1976, 1978) used the stochastic maximum principle for optimal control and derived the following Riccati equation:
$$\left \{\begin{array}{rcl} - dK_{t}& =&[A_{t}^{{\ast}}K_{t} + K_{t}A_{t} + C_{t}^{i{\ast}}K_{t}C_{t}^{i} + Q_{t} + C_{t}^{i{\ast}}L_{t}^{i} + L_{t}^{i}C_{t}^{i} \\ & & - \Psi _{t}(K_{t},L_{t})^{{\ast}}(N_{t} + D_{t}^{i{\ast}}K_{t}D_{t}^{i})\Psi _{t}(K_{t},L_{t})]\,dt - L_{t}^{i}\,dW_{t}^{i},\quad t \in [0,T); \\ K_{T}& =&M,\end{array} \right .$$
(6)
where the function \(\Psi _{t}\) for t ∈ [0, T] is defined as follows:
$$\Psi _{t}(K,L) := -(N_{t} + D_{t}^{i{\ast}}KD_{t}^{i})^{-1}(KB_{t} + C_{t}^{i{\ast}}KD_{t}^{i} + L^{i}D_{t}^{i})^{{\ast}},\quad \forall \,K \in \mathcal{S}_{+}^{n},\ \forall \,L := (L^{1},\ldots ,L^{d}) \in (R^{n\times n})^{d}.$$
Peng (1992b) applied his stochastic Hamilton-Jacobi-Bellman equation to the SLQ problem and also derived the above equation. Both authors established the existence and uniqueness of an adapted solution of backward stochastic Riccati equation (6) when the function \(\Psi _{t}(K,L)\) does not contain L. Bismut used the fixed-point method, whereas Peng (1992b) used Bellman’s principle of quasilinearization and the method of monotone convergence. Neither methodology works for the general case, in which the drift of the stochastic equation has quadratic growth in the second unknown variable L. Bismut (1976, 1978) and Peng (1999) stated the general case as an open problem. By considering the stochastic equation for the inverse of \(K_{t}\), Kohlmann and Tang (2003a) solved some particular cases where the function \(\Psi _{t}(K,L)\) can depend on L. Tang (2003) finally solved the general case, using the method of stochastic flows.

In the general case, the optimal feedback coefficient \(\Psi _{t}(K_{t},L_{t})\) at time t depends on \(L_{t}\) in a linear manner and is in general not essentially bounded with respect to \((t,\omega )\). Kohlmann and Tang (2003b) observed that the stochastic integral process \(\int _{0}^{\cdot }L_{t}^{i}\,dW_{t}^{i}\) is a BMO-martingale.
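For reference, the BMO property can be written out in the present notation. This is the standard definition for continuous martingales on [0, T], stated here as a reading aid and not quoted from Kohlmann and Tang (2003b): the stochastic integral of L with respect to W is a BMO-martingale if
$$\sup _{\tau }\left \|E\left [\displaystyle\int _{\tau }^{T}\displaystyle\sum _{i=1}^{d}\vert L_{s}^{i}\vert ^{2}\,ds\ \Big \vert \ \mathcal{F}_{\tau }\right ]\right \|_{L^{\infty }} < \infty ,$$
where the supremum is taken over all stopping times \(\tau \leq T\). This condition is weaker than essential boundedness of L with respect to \((t,\omega )\).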

Indefinite SLQ Problem

Chen (1985) contains a theory of singular LQG control (in which the control weighting matrix in the quadratic cost functional vanishes), which is a particular type of indefinite SLQ problem. In the deterministic linear-quadratic (LQ) control theory, the well-posedness of the problem (i.e., the finiteness of the value function on \([0,T] \times R^{n}\)) suggests that the control weighting matrix N in the quadratic cost functional be positive definite. In the stochastic case, when \(N_{t}\) is slightly negative, the SLQ problem may still be well posed if the control also increases the intensity of the system noise. Peng (1992a) used an indefinite but well-posed SLQ problem to illustrate his new second-order stochastic maximum principle. Chen et al. (1998) gave a deeper study of this feature of the SLQ problem. Yong and Zhou (1999) gave a systematic account of the progress on the indefinite SLQ problem.
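A minimal scalar illustration of this effect, constructed here and not taken from the cited works: let n = m = d = 1, A = B = C = 0, D = 1, Q = 0, M = 1, and let the control weight be a constant N = r. Then \(X_{t} = x +\int _{0}^{t}u_{s}\,dW_{s}\), and for square-integrable adapted controls the Itô isometry gives
$$J(u) = E\vert X_{T}\vert ^{2} + r\,E\displaystyle\int _{0}^{T}\vert u_{t}\vert ^{2}\,dt = x^{2} + (1 + r)\,E\displaystyle\int _{0}^{T}\vert u_{t}\vert ^{2}\,dt.$$
Hence the problem is well posed, with minimal cost \(x^{2}\) attained at \(u \equiv 0\), for every \(r \geq -1\), including negative values of r, whereas for \(r < -1\) the cost is unbounded from below. In the deterministic counterpart \(X_{t} = x +\int _{0}^{t}u_{s}\,ds\) with the same cost, any \(r < 0\) already makes the cost unbounded from below: a control that oscillates with zero time integral leaves \(X_{T} = x\) while \(\int _{0}^{T}\vert u_{t}\vert ^{2}\,dt\) can be made arbitrarily large.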

Mean-Variance Hedging

In the theory of finance, Duffie and Richardson (1991) introduced the SLQ control model to hedge a contingent claim in an incomplete market. Schweizer (1992) developed a first framework for MV hedging, which was then extended to a very general setting in Gouriéroux et al. (1998). Before 2000, the martingale method was used to solve the MV hedging problem. Kohlmann and Zhou (2000) began to use the standard SLQ theory to derive the optimal hedging strategy for a general contingent claim in a financial market with deterministic coefficients, and such an SLQ methodology was subsequently extended to very general settings for financial markets by Kohlmann and Tang (2002, 2003b), Bobrovnytska and Schweizer (2004), and Jeanblanc et al. (2012). For more detailed surveys of the literature, see Pham (2000), Schweizer (2010), and Jeanblanc et al. (2012).

Summary and Future Directions

In comparison to the continuous-time deterministic LQ theory, the continuous-time SLQ theory has the following two striking features: an indefinite SLQ problem may be well posed, and the optimal feedback coefficient may be unbounded due to its linear dependence on the martingale part L of the stochastic solution of the Riccati equation. Due to the second feature, the convergence of the sequence of successive approximations constructed via Bellman’s quasilinearization remains an open question in the general case. This problem partially motivated Delbaen and Tang (2010) to study the regularity of unbounded stochastic differential equations, and it may also help to explain the need for extensive studies on mean-variance hedging and on the closedness of stochastic integrals with respect to semimartingales (as in Delbaen et al. 1994, 1997) in various general settings.

Recommended Reading

The theory of SLQ control in various contexts is available in textbooks, monographs, and papers. Anderson and Moore (1971, 1989), Bensoussan (1992), and Chen (1985) include good accounts of the LQG control theory. Wonham (1970) includes a full introduction to the SLQ problem with deterministic piecewise continuous coefficients. Bismut (1978) gives a systematic and readable introduction, in French, to the SLQ problem with random coefficients. Yong and Zhou (1999) include an extensive discussion of the well-posed indefinite SLQ problem. Tang (2003) gives a complete solution of a general backward stochastic Riccati equation.

Bibliography

  1. Anderson BDO, Moore JB (1971) Linear optimal control. Prentice-Hall, Englewood Cliffs
  2. Anderson BDO, Moore JB (1989) Optimal control: linear quadratic methods. Prentice-Hall, Englewood Cliffs
  3. Athans M (1971) The role and use of the stochastic linear-quadratic-Gaussian problem in control system design. IEEE Trans Autom Control AC-16(6):529–552
  4. Bensoussan A (1992) Stochastic control of partially observable systems. Cambridge University Press, Cambridge
  5. Bismut JM (1976) Linear quadratic optimal stochastic control with random coefficients. SIAM J Control Optim 14:419–444
  6. Bismut JM (1978) Contrôle des systèmes linéaires quadratiques: applications de l’intégrale stochastique. In: Dellacherie C, Meyer PA, Weil M (eds) Séminaire de probabilités XII. Lecture Notes in Math 649. Springer, Berlin, pp 180–264
  7. Bobrovnytska O, Schweizer M (2004) Mean-variance hedging and stochastic control: beyond the Brownian setting. IEEE Trans Autom Control 49:396–408
  8. Chen H (1985) Recursive estimation and control for stochastic systems. Wiley, New York, pp 302–335
  9. Chen S, Li X, Zhou X (1998) Stochastic linear quadratic regulators with indefinite control weight costs. SIAM J Control Optim 36:1685–1702
  10. Delbaen F, Tang S (2010) Harmonic analysis of stochastic equations and backward stochastic differential equations. Probab Theory Relat Fields 146:291–336
  11. Delbaen F et al (1994) Weighted norm inequalities and closedness of a space of stochastic integrals. C R Acad Sci Paris Sér I Math 319:1079–1081
  12. Delbaen F et al (1997) Weighted norm inequalities and hedging in incomplete markets. Financ Stoch 1:181–227
  13. Duffie D, Richardson HR (1991) Mean-variance hedging in continuous time. Ann Appl Probab 1:1–15
  14. Florentin JJ (1961) Optimal control of continuous-time, Markov, stochastic systems. J Electron Control 10:473–488
  15. Gouriéroux C, Laurent JP, Pham H (1998) Mean-variance hedging and numéraire. Math Financ 8:179–200
  16. Jeanblanc M et al (2012) Mean-variance hedging via stochastic control and BSDEs for general semimartingales. Ann Appl Probab 22:2388–2428
  17. Kohlmann M, Tang S (2002) Global adapted solution of one-dimensional backward stochastic Riccati equations, with application to the mean-variance hedging. Stoch Process Appl 97:255–288
  18. Kohlmann M, Tang S (2003a) Multidimensional backward stochastic Riccati equations and applications. SIAM J Control Optim 41:1696–1721
  19. Kohlmann M, Tang S (2003b) Minimization of risk and linear quadratic optimal control theory. SIAM J Control Optim 42:1118–1142
  20. Kohlmann M, Zhou XY (2000) Relationship between backward stochastic differential equations and stochastic controls: a linear-quadratic approach. SIAM J Control Optim 38:1392–1407
  21. Peng S (1992a) New developments in stochastic maximum principle and related backward stochastic differential equations. In: Proceedings of the 31st conference on decision and control, Tucson, Dec 1992. IEEE, pp 2043–2047
  22. Peng S (1992b) Stochastic Hamilton-Jacobi-Bellman equations. SIAM J Control Optim 30:284–304
  23. Peng S (1999) Open problems on backward stochastic differential equations. In: Chen S, Li X, Yong J, Zhou XY (eds) Control of distributed parameter and stochastic systems, IFIP, Hangzhou. Kluwer, pp 267–272
  24. Pham H (2000) On quadratic hedging in continuous time. Math Methods Oper Res 51:315–339
  25. Potter JE (1964) A guidance-navigation separation theorem. Experimental Astronomy Laboratory, Massachusetts Institute of Technology, Cambridge, Rep. RE-11, 1964
  26. Schweizer M (1992) Mean-variance hedging for general claims. Ann Appl Probab 2:171–179
  27. Schweizer M (2010) Mean-variance hedging. In: Cont R (ed) Encyclopedia of quantitative finance. Wiley, New York, pp 1177–1181
  28. Tang S (2003) General linear quadratic optimal stochastic control problems with random coefficients: linear stochastic Hamilton systems and backward stochastic Riccati equations. SIAM J Control Optim 42:53–75
  29. Tse E (1971) On the optimal control of stochastic linear systems. IEEE Trans Autom Control AC-16(6):776–785
  30. Wonham WM (1967) Optimal stationary control of a linear system with state-dependent noise. SIAM J Control 5:486–500
  31. Wonham WM (1968a) On the separation theorem of stochastic control. SIAM J Control 6:312–326
  32. Wonham WM (1968b) On a matrix Riccati equation of stochastic control. SIAM J Control 6:681–697. Erratum (1969): SIAM J Control 7:365
  33. Wonham WM (1970) Random differential equations in control theory. In: Bharucha-Reid AT (ed) Probabilistic methods in applied mathematics. Academic, New York, pp 131–212
  34. Yong JM, Zhou XY (1999) Stochastic controls: Hamiltonian systems and HJB equations. Springer, New York

Copyright information

© Springer-Verlag London 2014

Authors and Affiliations

  1. Fudan University, Shanghai, China