Robust Sub-optimality of Linear-Saturated Control via Quadratic Zero-Sum Differential Games

Abstract

In this paper, we determine the approximation ratio of a linear-saturated control policy for a typical robust-stabilization problem. We consider a system whose state integrates the discrepancy between an unknown but bounded disturbance and the control. The control aims at keeping the state within a target set, whereas the disturbance aims at pushing the state outside the target set by opposing the control action. The literature often solves this kind of problem via a linear-saturated control policy. We show how this policy approximates the optimal control policy by reframing the problem in the context of quadratic zero-sum differential games. We prove that the approximation ratio is asymptotically bounded by 2, and that it is upper bounded by 2 in the case of a one-dimensional system. In this last case, we also discuss how the approximation ratio may apparently change when the system’s demand is subject to uncertainty. In conclusion, we compare the approximation ratio of the linear-saturated policy with that of a family of control policies which generalize the bang–bang one.

Acknowledgements

This work was partially supported by CUP H76C18000650001 within Progetto di Eccellenza of the Department of Management, Ca’ Foscari.

Author information

Corresponding author

Correspondence to Rosario Maggistro.

Additional information

Communicated by Dean A. Carlson.

Appendix

Proof of Lemma 4.1

In this proof, we show that \(\left( u^*(.),\omega ^*(.)\right) = \left( \text {sat}_{[-\hat{u},\hat{u}]}\left( - z(t) - \hat{\omega }(1-e^{t-T})\right) ,-\hat{\omega }\right) \), for \(0 \le t \le T\), is a saddle-point policy and we determine the state trajectory z(t).

The values of payoff \(J(z_0, u(.),\omega (.))\) (1a) and of limit (7) can be trivially determined by substituting the values of z(t) and \(u^*(t)\) in the respective formulas. In particular, we consider the Hamiltonian (9) associated with problem (1) taking into account the constraints on z(t) and u(t) to determine z(t), \(u^*(t)\) and \(\omega ^*(t)\). We have:

$$\begin{aligned} H(z(t),u(t),\omega (t))&=\frac{1}{2} (z^2(t) + u^2(t)) + p(t) (u(t)-\omega (t)) \\&\quad -\nu _+(t)(\varepsilon -z(t))-\nu _-(t)(\varepsilon +z(t))\nonumber \\&\quad -\lambda _+(t)(\hat{u}-u(t))-\lambda _-(t)(\hat{u}+u(t)), \nonumber \end{aligned}$$
(9)

where p(t) is the costate and \(\nu _+(t), \nu _-(t),\lambda _+(t), \lambda _-(t)\) are four nonnegative functions for \(0 \le t \le T\), such that \(\nu _+(t)(\varepsilon -z(t))= 0\), \(\nu _-(t)(\varepsilon +z(t))=0\), \(\lambda _+(t)(\hat{u}-u(t))= 0\), \(\lambda _-(t)(\hat{u}+u(t))=0\). These complementary slackness conditions imply that, at each time t, at least one of \(\nu _+(t)\) and \(\nu _-(t)\), and at least one of \(\lambda _+(t)\) and \(\lambda _-(t)\), must be equal to 0, since z(t) and u(t) cannot lie on both of their bounds at once.

From (9) we obtain that the best responses of the two players are:

$$\begin{aligned} u^*(t)&=\arg \min _{u(t)} H(z(t),u(t),\omega (t))= - p(t) -\lambda _+(t) + \lambda _-(t) \end{aligned}$$
(10a)
$$\begin{aligned} \omega ^*(t)&=\arg \max _{\omega (t)} H(z(t),u(t),\omega (t))= - \text {sign} (p(t))\hat{\omega }. \end{aligned}$$
(10b)

The latter condition implies \(p(0) \ge 0\), as we assume \(z_0>0\), and hence \(\omega ^*(0) = -\hat{\omega }\).

The dynamics on z(t) and p(t) are then

$$\begin{aligned} \dot{z}(t)&=u(t)-\omega (t)=-p(t)-\lambda _+(t) + \lambda _-(t) + \mathrm{sign}(p(t))\hat{\omega }, {\qquad } z(0) = z_0,\end{aligned}$$
(11a)
$$\begin{aligned} \dot{p}(t)&=-z(t) - \nu _+(t) +\nu _-(t), {\qquad } p(T)=z(T). \end{aligned}$$
(11b)
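The best responses (10) and the dynamics (11) can be sketched in a few lines of code. The sketch below assumes that the state constraint is inactive (\(\nu _+ = \nu _- = 0\)) and uses the fact that, once the multipliers \(\lambda _\pm \) are eliminated through complementary slackness, (10a) reduces to a saturation of \(-p(t)\). The function names `sat`, `sign`, `best_responses` and `rhs` are illustrative and do not come from the paper.

```python
def sat(x, bound):
    """Clip x to the interval [-bound, bound]."""
    return max(-bound, min(bound, x))


def sign(x):
    """Sign function with sign(0) = 0."""
    return (x > 0) - (x < 0)


def best_responses(p, u_hat, w_hat):
    """Best responses (10a)-(10b) for a given costate value p.

    Assumption: lambda_+ and lambda_- are eliminated via complementary
    slackness, so that u* = sat(-p) on [-u_hat, u_hat].
    """
    u_star = sat(-p, u_hat)
    w_star = -sign(p) * w_hat
    return u_star, w_star


def rhs(z, p, u_hat, w_hat):
    """Right-hand sides of (11a)-(11b), assuming nu_+ = nu_- = 0."""
    u_star, w_star = best_responses(p, u_hat, w_hat)
    return u_star - w_star, -z


# Illustrative values (assumptions): u_hat = 2, w_hat = 1.
print(best_responses(3.0, 2.0, 1.0))   # saturated control
print(best_responses(0.5, 2.0, 1.0))   # unsaturated control
```

With \(p(t) \ge \hat{u}\) the control saturates at \(-\hat{u}\), while for \(0< p(t) < \hat{u}\) it equals \(-p(t)\); these are the two regimes analysed in cases (i) and (ii) of the proof.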

We now prove that there is a saddle point where \(p(t) >0\) for \(0 \le t \le T\). To this end, we consider the following three cases for p(t) and z(t): i) \(p(t)\ge \hat{u}\), for \(0 \le t \le T\), ii) \(0<p(t)<\hat{u}\), for \(0 \le t \le T\), iii) \(p(t)>0\), for \(0 \le t \le T\). Specifically, we use the results of the first two cases to address the third general case.

  1. (i)

    If \(p(t) \ge \hat{u}\), condition (10a) imposes \(\lambda _-(t) = p(t)+u^*(t)+\lambda _+(t) \ge 0\). Then, when \(\lambda _-(t) > 0\), we have \(u^*(t)=-\hat{u}\), which in turn implies \(\lambda _+(t) = 0\). On the other hand, as \(\lambda _+(t) \ge 0\) must hold, \(\lambda _-(t) = 0\) implies \(p(t) = \hat{u}\), \(u^*(t)=-\hat{u}\), and \(\lambda _+(t) = 0\). As a consequence, straightforward computations yield the following functions as the unique candidate optimal solutions of the two-point boundary value problem defined by (10) and (11), for \(0 \le t \le T\):

    $$\begin{aligned} u^*(t)&= - \hat{u}, \qquad \qquad \qquad \qquad \qquad \qquad \,\, \omega ^*(t) = - \hat{\omega } \end{aligned}$$
    (12a)
    $$\begin{aligned} z(t)&= z_0 - (\hat{u}-\hat{\omega }) t = z_0 - \delta t, \qquad p(t) = \frac{1}{2}\delta (t^2 - T(2 + T)) + (1 - t + T) z_0 . \end{aligned}$$
    (12b)

    Note that, given \(u^*(t)\) and \(\omega ^*(t)\), z(t) is decreasing in t. Moreover, as we show next, \(p(t) \ge \hat{u}\) implies \(z_0 \ge \hat{u} + \delta T\) and hence \(z(t) \ge z_0 - \delta T >0\). Then \(z_0 \ge 0\) in \(\mathcal {S}\) implies \(z(t) \in int\{\mathcal {S}\}\). The candidate saddle-point policies (12) do not contradict the assumption \(p(t)\ge \hat{u}\) for all \(0 \le t \le T \) if and only if \(z_0 \ge \hat{u}+\delta T\). Indeed, \(p(T) = z_0 - \delta T\ge \hat{u}\) holds only if \(z_0 \ge \hat{u} + \delta T\), and the function p(t) attains its minimum at \(t^* = \frac{z_0}{\delta }\), which is greater than T for \(z_0 \ge \hat{u} + \delta T\), so that p(t) is decreasing on [0, T] and its minimum over this interval is p(T).

  2. (ii)

    If \(0\le p(t)< \hat{u}\), condition (10a) imposes \(\lambda _-(t) = 0\), as \(\lambda _-(t) > 0\) would imply both \(u^*(t) >- \hat{u}\), due to the current assumption on p(t), and \(u^*(t) =- \hat{u}\), due to the complementary slackness condition. Similarly, (10a) also imposes \(\lambda _+(t) = 0\). Hence \(-\hat{u}< u(t) \le 0\) and \(\lambda _+(t) = \lambda _-(t) = 0\) for all \(0\le t \le T\), and we have

    $$\begin{aligned} u^*(t)&= -\hat{\omega }(1-e^{-T}\cosh (t))-z_0 e^{-t}, \; \omega ^*(t) = - \hat{\omega } ,\end{aligned}$$
    (13a)
    $$\begin{aligned} z(t)&= \hat{\omega }\sinh (t) e^{-T} + z_0 e^{-t}, p(t) = \hat{\omega }(1-e^{-T}\cosh (t))+ z_0 e^{-t} . \end{aligned}$$
    (13b)

    The candidate saddle-point policies (13) do not contradict \(z(t) \in \mathcal {S}\) and the assumption \(0\le p(t)<\hat{u}\) for all \(0 \le t \le T\) when \(0\le z_0 < \delta + \hat{\omega }e^{-T}\), as can be directly verified. In particular, we observe that z(t) is a nonnegative convex function for \(t >0\) if \(z_0 \ge 0\). As a consequence, \(z(t) \in \mathcal {S}\) if and only if \(z_0\) and \(z(T) \in \mathcal {S}\), which in turn requires \(\varepsilon \ge \hat{\omega } \frac{1+e^{-T}}{2}\); the latter is implied by the fact that \(\varepsilon > \hat{\omega }\) holds due to (2).

  3. (iii)

    If \(p(t) \ge 0\), we have a more general situation not included in the previous two cases when \(\delta + \hat{\omega }e^{-T}\le z_0 \le \hat{u}+\delta T\).

    We preliminarily observe that we can partition the payoff (1) as follows:

    $$\begin{aligned} J(z_0,u(.),\omega (.))= \underbrace{\int _0^{\hat{t}}\frac{1}{2}(z^2(t)+u^2(t)) dt}_{K(z_0,u(.),\omega (.),\hat{t})} + \underbrace{\int _{\hat{t}}^T\frac{1}{2} (z^2(t)+u^2(t)) dt +\frac{1}{2} z(T)^2}_{L(z({\hat{t}}),u(.),\omega (.), \hat{t})} \end{aligned}$$

    for any \(0\le \hat{t} \le T\). Then, as equation (1b) describes a first-order system, we can determine the value of the payoff (1a) in two steps. First, we compute the value of

    $$\begin{aligned} L^*(z({\hat{t}}),\hat{t})=\min _{u(.)} \max _{\omega (.)} L(z(\hat{t}),u(.),\omega (.),\hat{t}) \end{aligned}$$

    as a function of \(z({\hat{t}})\). Second, we solve the optimization problem

    $$\begin{aligned} \min _{u(.)} \max _{\omega (.)}\{K(z_0,u(.),\omega (.),\hat{t})+L^*(z({\hat{t}}),\hat{t})\}, \end{aligned}$$

    where \(L^*(z({\hat{t}}),\hat{t})\) is seen as a final penalty term. In particular, we denote by \(\hat{t}\) the first instant, if it exists, such that \(z_{\hat{t}}:=z(\hat{t})=\delta +\hat{\omega }e^{-(T-\hat{t})}\).

    We obtain the optimal value \(L^*(z_{\hat{t}})\) from conditions (13) by translating the time origin in \(\hat{t}\), and assuming a horizon length \(T-\hat{t}\):

    $$\begin{aligned} L^*(z_{\hat{t}})= \frac{T-\hat{t}}{2}\hat{\omega }^2+ \frac{(e^{-2(T-\hat{t})}-1)\hat{\omega }^2+ 4z_{\hat{t}}\hat{\omega }(1-e^{-(T-\hat{t})}) + 2z_{\hat{t}}^2}{4}. \end{aligned}$$

    Next, we solve the optimization problem with respect to \(K(z_0,u(t),\omega (t),\hat{t})+L^*(z_{\hat{t}})\). We observe that the boundary condition is

    $$\begin{aligned} p(\hat{t})= \left. \frac{\partial L^*(z_{\hat{t}})}{\partial z_{\hat{t}}}\right| _{z_{\hat{t} }=\delta +\hat{\omega }e^{-(T-\hat{t})}} = \hat{\omega } + z_{\hat{t}} - \hat{\omega } e^{-(T- \hat{t})} \end{aligned}$$

    Now, by contradiction, we show that \(u^*(t)=-\hat{u}\) must hold for \(t\le \hat{t}\). Indeed, assume that there exists an instant \(\bar{t}\le \hat{t}\) such that \(u^*(\bar{t})> -\hat{u}\). This implies \(p (\bar{t})< \hat{u}\) and, since \(\dot{p}(t)=-z(t)<0\) (as \(\nu _+(t) = \nu _-(t) = 0\)), also \(p(\hat{t})\le p(\bar{t})<\hat{u}\), which contradicts the boundary condition \(p(\hat{t})=\hat{u}\). Hence, \(\bar{t}\) does not exist. In summary, the saddle-point policies and the associated dynamics on z(t) are:

    $$\begin{aligned} u^*(t)= & {} \left\{ \begin{array}{ll} -\hat{u}, &{}\quad \mathrm {if}~ t< \hat{t},\\ -\hat{\omega }(1-e^{-(T-\hat{t})}\cosh (t-\hat{t}))-z_{\hat{t}} e^{-(t-\hat{t})}, &{}\quad \mathrm {if} ~ t \ge \hat{t}, \end{array} \right. \\ \omega ^*(t)= & {} - \hat{\omega }, \\ z(t)= & {} \left\{ \begin{array}{ll} z_0 - \delta t, &{}\quad \mathrm {if}~ t < \hat{t},\\ \hat{\omega }\sinh (t-\hat{t}) e^{-(T-\hat{t})} + z_{\hat{t}} e^{-(t-\hat{t})}, &{}\quad \mathrm {if} ~ t \ge \hat{t}. \end{array}\right. \end{aligned}$$

    Here \(z_{\hat{t}} = z_0 - \delta \hat{t}\) is the state reached at the switching instant, so that z(t) is continuous in \(\hat{t}\). Note that even in this case \(\varepsilon \ge \hat{\omega }\) implies \(z(t) \le \varepsilon \) for \(0 \le t\le T\).

    As a consequence, \(\hat{t}\), if it exists, is the solution of the equation \(z_0-\delta \hat{t}= \delta + \hat{\omega }e^{-T+\hat{t}}\); hence \(\hat{t} = -1 + \frac{z_0}{\delta } -W\left( \frac{\hat{\omega }}{\delta } e^{\frac{-\delta - \delta T + z_0}{\delta }}\right) , \) where W(.) is the Lambert W-function. It remains to show that \(\hat{t}\) always exists (at most we have \(\hat{t} =T)\). If \(\hat{t}\) did not exist, we would have the lower bound \(z(t)>\delta +\hat{\omega }e^{-T+ t}\) for all \(t \le T \). This is not possible, since the upper bound \(z(t)< \hat{u}+\delta (T-t)\) must also hold for all \(0\le t \le T\), and for \(t=T\) both bounds are equal to \(\hat{u}\), i.e., \(\delta +\hat{\omega }e^{-T+ t}=\hat{u}+\delta (T-t)=\hat{u}\). \(\square \)
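The closed-form expressions used in case (iii) can be checked numerically. The following is a minimal sketch, not part of the original proof: it evaluates the switching instant \(\hat{t}\) through a hand-written Newton iteration for the Lambert W-function, verifies that \(\hat{t}\) solves \(z_0-\delta \hat{t}= \delta + \hat{\omega }e^{-T+\hat{t}}\), and compares the closed-form value \(L^*(z_{\hat{t}})\) with a Simpson quadrature of the payoff along the trajectory (13) shifted to start at \(\hat{t}\). The parameter values and all function names are illustrative assumptions.

```python
import math


def lambert_w0(x, tol=1e-14, max_iter=100):
    """Principal branch W(x) for x >= 0, by Newton's method on w*exp(w) = x."""
    w = math.log1p(x)  # starting point at or to the right of the root
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))
        w -= step
        if abs(step) < tol:
            break
    return w


def switching_time(z0, u_hat, w_hat, T):
    """Closed-form t_hat from the proof, with delta = u_hat - w_hat."""
    delta = u_hat - w_hat
    arg = (w_hat / delta) * math.exp((z0 - delta - delta * T) / delta)
    return -1.0 + z0 / delta - lambert_w0(arg)


def L_star(z, tau, w_hat):
    """Closed-form optimal value over a residual horizon tau = T - t_hat."""
    return 0.5 * tau * w_hat**2 + (
        (math.exp(-2.0 * tau) - 1.0) * w_hat**2
        + 4.0 * z * w_hat * (1.0 - math.exp(-tau))
        + 2.0 * z**2
    ) / 4.0


def payoff_along_13(z, tau, w_hat, n=2000):
    """Simpson quadrature of 0.5*(z^2 + u^2) along (13), plus 0.5*z(tau)^2."""
    def zt(t):
        return w_hat * math.sinh(t) * math.exp(-tau) + z * math.exp(-t)

    def ut(t):
        return -w_hat * (1.0 - math.exp(-tau) * math.cosh(t)) - z * math.exp(-t)

    def f(t):
        return 0.5 * (zt(t) ** 2 + ut(t) ** 2)

    h = tau / n  # n must be even for Simpson's rule
    s = f(0.0) + f(tau) + sum((4 if k % 2 else 2) * f(k * h) for k in range(1, n))
    return s * h / 3.0 + 0.5 * zt(tau) ** 2


# Illustrative parameters (assumptions), chosen inside the range of case (iii).
u_hat, w_hat, T, z0 = 2.0, 1.0, 1.0, 2.0
delta = u_hat - w_hat

t_hat = switching_time(z0, u_hat, w_hat, T)
residual = (z0 - delta * t_hat) - (delta + w_hat * math.exp(t_hat - T))
z_hat, tau = z0 - delta * t_hat, T - t_hat
gap = L_star(z_hat, tau, w_hat) - payoff_along_13(z_hat, tau, w_hat)

print(t_hat, residual, gap)
```

Both `residual` and `gap` should be zero up to numerical precision. A library routine for the Lambert W-function could replace the Newton iteration; the hand-written version is used here only to keep the sketch free of external dependencies.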

Proof of Lemma 4.2

In this proof, we determine player 2’s best-response, under the assumption that player 1 plays the LSC policy and we determine the state trajectory z(t).

The values of payoff \(J(z_0, u(.),\omega (.))\) (1a) and of limit (8) can be trivially determined by substituting the values of z(t) and u(t) in the respective formulas. In particular, we apply the Pontryagin conditions associated with (1) to determine z(t) and \(\omega ^*(t)\). We consider separately the two cases \(0 \le z_0 \le \hat{u} \le \varepsilon \) and \(0 \le z_0 \le \varepsilon \le \hat{u}\).

If \(0 \le z_0 \le \hat{u} \le \varepsilon \), we have:

$$\begin{aligned} \begin{aligned} \omega ^*(t)&= -\text {sign}(p(t))\hat{\omega },\\ \dot{z}(t)&= u(z(t)) + \text {sign}(p(t))\hat{\omega },\\ \dot{p}(t)&= - z(t) -u(z(t))\frac{\partial u(z(t))}{\partial z(t)}-p(t)\frac{\partial u(z(t))}{\partial z(t)},\\ p(T)&= z(T). \end{aligned} \end{aligned}$$
(14)

To prove that \(\omega ^*(t) = -\text {sign}(z(t))\hat{\omega }\), initially, consider \(0 \le z_0 \le \hat{\omega }\), thus \(u(z_0) = -z_0\) and observe that possible \(\omega ^*(.)\) and z(.) components of the solutions of (14) are:

$$\begin{aligned} \omega ^{*}(t) = -\hat{\omega }, \qquad z(t) = \hat{\omega }(1-e^{-t}) + z_0e^{-t} > 0 \;(\text {and} \le \hat{u}). \end{aligned}$$
(15)

To prove that \(\omega ^*(t)\) is actually the solution of (14), we show that any other policy would lead to a worse payoff for player 2. Indeed, suppose that player 2 uses the constant policy \(\omega (t) = \hat{\omega }\). We obtain \(z(t) = -\hat{\omega }(1-e^{-t}) + z_0e^{-t}\), thus \(u(z(t)) = -z(t)\) and \(z(T) = -\hat{\omega }(1-e^{-T}) + z_0e^{-T}\). Direct computation of the respective values of the payoff shows that the policy \(\omega (t) =\hat{\omega }\) is worse for player 2 than the policy \(\omega ^*(.)\) in (15), and no better when \(z_0 =0\). Observe also that player 2 does not benefit from using a time-varying policy. In this case, we obtain \(-\hat{\omega }(1-e^{-t}) + z_0e^{-t} \le z(t) \le \hat{\omega }(1-e^{-t}) + z_0e^{-t}\), thus \(u(z(t)) = -z(t)\) and \(-\hat{\omega }(1-e^{-T}) + z_0e^{-T} \le z(T) \le \hat{\omega }(1-e^{-T}) + z_0e^{-T}\). Even in this case the payoff of player 2, due to its quadratic structure, takes on a worse value than the value returned by the policy \(\omega ^*(.)\) in (15).

It is apparent that if player 2’s best-response is \( \omega ^*(t) = -\hat{\omega }\) when \(0\le z_0\le \hat{\omega }\), then the same policy is the best-response when \(z_0\) takes greater values. For any other policy, both u(z(t)) and z(t) would take on smaller absolute values than in the case of \(\omega ^*(t) = -\hat{\omega }\).
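The comparison between disturbance policies can be illustrated numerically. The sketch below assumes that the LSC policy is \(u(z) = \text {sat}_{[-\hat{u},\hat{u}]}(-z)\), consistent with \(u(z_0) = -z_0\) above, and approximates the payoff (1a) by an explicit Euler scheme for three disturbances: the constant \(-\hat{\omega }\), the constant \(\hat{\omega }\), and one arbitrary time-varying signal. The parameter values, the test signal and the function names are assumptions chosen for illustration only.

```python
import math


def lsc(z, u_hat):
    """Linear-saturated control, assumed here to be u(z) = sat(-z)."""
    return max(-u_hat, min(u_hat, -z))


def payoff(z0, omega, u_hat, T, n=20000):
    """Explicit-Euler approximation of the payoff (1a) under the LSC policy.

    omega is a function of time returning the disturbance value.
    """
    dt = T / n
    z, cost = z0, 0.0
    for k in range(n):
        u = lsc(z, u_hat)
        cost += 0.5 * (z * z + u * u) * dt   # running cost
        z += (u - omega(k * dt)) * dt        # dz/dt = u - omega
    return cost + 0.5 * z * z                # add the terminal cost


# Illustrative parameters (assumptions): 0 <= z0 <= w_hat < u_hat.
u_hat, w_hat, T, z0 = 2.0, 1.0, 1.0, 0.5

J_worst = payoff(z0, lambda t: -w_hat, u_hat, T)                    # omega = -w_hat
J_plus = payoff(z0, lambda t: w_hat, u_hat, T)                      # omega = +w_hat
J_osc = payoff(z0, lambda t: w_hat * math.cos(5.0 * t), u_hat, T)   # time-varying

print(J_worst, J_plus, J_osc)
```

In this example the constant disturbance \(-\hat{\omega }\) yields the largest payoff of the three, in line with the argument above; a single run of this kind is only an illustration and does not replace the proof.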

Given the above arguments, the state trajectory z(t) associated with u(.) and \(\omega ^*(.)\) is defined by \(\dot{z}(t) = \tilde{u}(t) + \text {sign}(z(t))\hat{\omega }.\) Then,

$$\begin{aligned} z(t) = \left\{ \begin{array}{ll} z_0 - \delta t, &{}\quad \text{ if } z_0 \ge \hat{u}+\delta T, \\ (z_0 - \delta t)1\{t\le \frac{z_0 -\hat{u}}{\delta }\} + (\hat{\omega }+\delta e^{\frac{z_0 -\hat{u}}{\delta }-t})1\{\frac{z_0 -\hat{u}}{\delta }< t \le T\}, &{}\quad \text{ if } \hat{u}< z_0 < \hat{u}+\delta T, \\ \hat{\omega }(1-e^{-t}) + z_0e^{-t}, &{} \quad \text{ if } 0\le z_0 \le \hat{u}. \end{array}\right. \end{aligned}$$
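The piecewise trajectory above can be compared with a direct simulation. The sketch below again assumes the LSC policy \(u(z) = \text {sat}_{[-\hat{u},\hat{u}]}(-z)\) and the disturbance \(\omega ^*(t) = -\hat{\omega }\), and integrates \(\dot{z}(t) = u(z(t)) + \hat{\omega }\) with an explicit Euler scheme. The function names and the parameter values are illustrative assumptions.

```python
import math


def z_closed(t, z0, u_hat, w_hat):
    """Closed-form z(t) for z0 >= 0 under LSC and omega = -w_hat."""
    delta = u_hat - w_hat
    if z0 <= u_hat:
        # Unsaturated regime: dz/dt = -z + w_hat.
        return w_hat * (1.0 - math.exp(-t)) + z0 * math.exp(-t)
    t1 = (z0 - u_hat) / delta  # instant at which z reaches u_hat
    if t <= t1:
        # Saturated regime: dz/dt = -delta.
        return z0 - delta * t
    return w_hat + delta * math.exp(t1 - t)


def z_simulated(t_end, z0, u_hat, w_hat, n=20000):
    """Explicit-Euler integration of dz/dt = sat(-z) + w_hat up to t_end."""
    dt = t_end / n
    z = z0
    for _ in range(n):
        u = max(-u_hat, min(u_hat, -z))  # assumed LSC policy
        z += (u + w_hat) * dt
    return z


# Illustrative parameters (assumptions); the three z0 values hit the three branches.
u_hat, w_hat, T = 2.0, 1.0, 1.0
errors = [
    abs(z_closed(T, z0, u_hat, w_hat) - z_simulated(T, z0, u_hat, w_hat))
    for z0 in (0.5, 2.5, 4.0)
]
print(errors)
```

The three initial states correspond to \(0\le z_0 \le \hat{u}\), \(\hat{u}< z_0 < \hat{u}+\delta T\) and \(z_0 \ge \hat{u}+\delta T\); the differences should be of the order of the Euler step size.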

Similar arguments hold for \(0<z_0 \le \varepsilon < \hat{u}\). Even in this case, player 2’s best-response is \(\tilde{\omega }(t) = -\hat{\omega }\). Also, as \(z_0 \le \varepsilon \), then the state trajectory z(t) associated with \(\tilde{u}(.)\) and \(\tilde{\omega }(.)\) is \( z(t) = \hat{\omega }(1-e^{-t}) + z_0e^{-t}\), for \(0 \le t \le T. \) In particular, \(\varepsilon \ge \hat{\omega }\) implies \(z(t) \le \varepsilon \). \(\square \)

Cite this article

Bauso, D., Maggistro, R. & Pesenti, R. Robust Sub-optimality of Linear-Saturated Control via Quadratic Zero-Sum Differential Games. J Optim Theory Appl 184, 1109–1125 (2020). https://doi.org/10.1007/s10957-019-01611-x

Keywords

  • Robust optimization
  • Bounded disturbances
  • Differential games
  • Linear-saturated control

Mathematics Subject Classification

  • 93D21
  • 49N70
  • 91A05