
A Uniform Tauberian Theorem in Optimal Control


Part of the book series: Annals of the International Society of Dynamic Games ((AISDG,volume 12))

Abstract

In an optimal control framework, we consider the value V T (x) of the problem starting from state x with finite horizon T, as well as the value W λ(x) of the λ-discounted problem starting from x. We prove that uniform convergence (on the set of states) of the values V T ( ⋅) as T tends to infinity is equivalent to uniform convergence of the values W λ( ⋅) as λ tends to 0, and that the limits are identical. An example is also provided to show that the result does not hold for pointwise convergence. This work is an extension, using similar techniques, of a related result by Lehrer and Sorin in a discrete-time framework.


Notes

  1.

    Lemma 6 and Theorem 8 in [4] deal with this general setting, but we believe them to be incorrect since they are stated for pointwise convergence and, consequently, are contradicted by the example in Sect. 10.4.

  2.

    The reader may verify that this is indeed not the case in the example of Sect. 10.4.

  3.

    We thank Marc Quincampoix for pointing out this example to us, which is simpler than our original one.

  4.

    We thank Frédéric Bonnans for the idea of this proof.

References

  1. Alvarez, O., Bardi, M.: Ergodic problems in differential games. In: Advances in Dynamic Game Theory. Ann. Int. Soc. Dynam. Games, vol. 9, pp. 131–152. Birkhäuser, Boston (2007)
  2. Alvarez, O., Bardi, M.: Ergodicity, stabilization, and singular perturbations for Bellman-Isaacs equations. Mem. Am. Math. Soc. 204(960), 1–90 (2010)
  3. Arisawa, M.: Ergodic problem for the Hamilton-Jacobi-Bellman equation I. Ann. Inst. Henri Poincaré 14, 415–438 (1997)
  4. Arisawa, M.: Ergodic problem for the Hamilton-Jacobi-Bellman equation II. Ann. Inst. Henri Poincaré 15, 1–24 (1998)
  5. Arisawa, M., Lions, P.-L.: On ergodic stochastic control. Comm. Partial Diff. Eq. 23(11–12), 2187–2217 (1998)
  6. Artstein, Z., Gaitsgory, V.: The value function of singularly perturbed control systems. Appl. Math. Optim. 41(3), 425–445 (2000)
  7. Bardi, M., Capuzzo-Dolcetta, I.: Optimal Control and Viscosity Solutions of Hamilton-Jacobi-Bellman Equations. Systems & Control: Foundations & Applications. Birkhäuser, Boston (1997)
  8. Barles, G.: Some homogenization results for non-coercive Hamilton-Jacobi equations. Calc. Var. Partial Diff. Eq. 30(4), 449–466 (2007)
  9. Bellman, R.: On the theory of dynamic programming. Proc. Natl. Acad. Sci. U.S.A. 38, 716–719 (1952)
  10. Bettiol, P.: On ergodic problem for Hamilton-Jacobi-Isaacs equations. ESAIM: COCV 11, 522–541 (2005)
  11. Cardaliaguet, P.: Ergodicity of Hamilton-Jacobi equations with a noncoercive nonconvex Hamiltonian in ℝ²/ℤ². Ann. Inst. Henri Poincaré (C) Non Linear Anal. 27, 837–856 (2010)
  12. Carlson, D.A., Haurie, A.B., Leizarowitz, A.: Optimal Control on Infinite Time Horizon. Springer, Berlin (1991)
  13. Colonius, F., Kliemann, W.: Infinite time optimal control and periodicity. Appl. Math. Optim. 20, 113–130 (1989)
  14. Evans, L.C.: An Introduction to Mathematical Optimal Control Theory. Unpublished lecture notes, U.C. Berkeley (1983). Available at http://math.berkeley.edu/~evans/control.course.pdf
  15. Feller, W.: An Introduction to Probability Theory and Its Applications, vol. II, 2nd edn. Wiley, New York (1971)
  16. Grüne, L.: On the relation between discounted and average optimal value functions. J. Diff. Eq. 148, 65–99 (1998)
  17. Hardy, G.H., Littlewood, J.E.: Tauberian theorems concerning power series and Dirichlet's series whose coefficients are positive. Proc. London Math. Soc. 13, 174–191 (1914)
  18. Hestenes, M.: A General Problem in the Calculus of Variations with Applications to the Paths of Least Time. Research Memorandum 100. RAND Corporation, Santa Monica, CA (1950)
  19. Isaacs, R.: Games of Pursuit. Paper P-257. RAND Corporation, Santa Monica (1951)
  20. Isaacs, R.: Differential Games. A Mathematical Theory with Applications to Warfare and Pursuit, Control and Optimization. Wiley, New York (1965)
  21. Kohlberg, E., Neyman, A.: Asymptotic behavior of nonexpansive mappings in normed linear spaces. Isr. J. Math. 38, 269–275 (1981)
  22. Kirk, D.E.: Optimal Control Theory: An Introduction. Prentice Hall, Englewood Cliffs, NJ (1970)
  23. Lee, E.B., Markus, L.: Foundations of Optimal Control Theory. SIAM, Philadelphia (1967)
  24. Lehrer, E., Sorin, S.: A uniform Tauberian theorem in dynamic programming. Math. Oper. Res. 17, 303–307 (1992)
  25. Lions, P.-L., Papanicolaou, G., Varadhan, S.R.S.: Homogenization of Hamilton-Jacobi equations. Unpublished (1986)
  26. Monderer, D., Sorin, S.: Asymptotic properties in dynamic programming. Int. J. Game Theory 22, 1–11 (1993)
  27. Pontryagin, L.S., Boltyanskii, V.G., Gamkrelidze, R.V.: The Mathematical Theory of Optimal Processes. Nauka, Moscow (1962) (Engl. transl.: Wiley)
  28. Quincampoix, M., Renault, J.: On the existence of a limit value in some nonexpansive optimal control problems. SIAM J. Control Optim. 49, 2118–2132 (2011)
  29. Shapley, L.S.: Stochastic games. Proc. Natl. Acad. Sci. 39, 1095–1100 (1953)

Acknowledgements

This article was written as part of the first author's PhD. Both authors wish to express their many thanks to Sylvain Sorin for his numerous comments and his great help. We also thank Hélène Frankowska and Marc Quincampoix for helpful remarks on earlier drafts.

Author information

Correspondence to Miquel Oliu-Barton.

Appendix

We give here another proof (Footnote 4) of Theorem 10.5, using the analogous result in discrete time [24] together with an equivalence argument between discrete and continuous dynamics.

Consider a deterministic dynamic programming problem in continuous time as defined in Sect. 10.2.1, with a state space Ω, a payoff g and a dynamic Γ. Recall that, for any ω ∈ Ω, Γ(ω) is the nonempty set of feasible trajectories starting from ω. We construct an associated deterministic dynamic programming problem in discrete time as follows.

Let \(\widetilde{\Omega } = \Omega \times [0,1]\) be the new state space and let \(\widetilde{g}\) be the new cost function, given by \(\widetilde{g}(\omega ,x) = x\). We define a multivalued-function with nonempty values \(\widetilde{\Gamma } :\widetilde{ \Omega } \rightrightarrows \widetilde{ \Omega }\) by

$$(\omega ,x) \in \widetilde{\Gamma }(\omega ',x')\Longleftrightarrow \exists X \in \Gamma (\omega ')\quad \text{with } X(1) = \omega \quad \text{and}\quad \int_{0}^{1}g(X(t))\,\mathrm{d}t = x.$$

Following [24], we define, for any initial state \(\widetilde{\omega } = (\omega ,x)\)

$$\begin{array}{rcl} v_{n}(\widetilde{\omega })& =& \inf \frac{1}{n}\sum\limits_{i=1}^{n}\widetilde{g}(\widetilde{\omega }_{i}) \\ w_{\lambda }(\widetilde{\omega })& =& \inf \lambda \sum\limits_{i=1}^{+\infty }{(1 - \lambda )}^{i-1}\widetilde{g}(\widetilde{\omega }_{i}) \end{array}$$

where the infima are taken over the set of sequences \(\{\widetilde{\omega }_{i}\}_{i\in \mathbb{N}}\) such that \(\widetilde{\omega }_{0} =\widetilde{\omega }\) and \(\widetilde{\omega }_{i+1} \in \widetilde{\Gamma }(\widetilde{\omega }_{i})\) for every i ≥ 0.
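To make these definitions concrete, here is a minimal numerical sketch on a hypothetical three-state discrete problem (the states, transitions and costs below are invented for illustration and are unrelated to the example of Sect. 10.4). The infima over plays are computed by Bellman recursion and value iteration rather than by enumerating sequences.

```python
# Hypothetical 3-state discrete problem: a multivalued dynamic (successors)
# and a stage cost bounded by 1. Invented purely for illustration.
successors = {0: [1], 1: [0, 2], 2: [0]}
g = {0: 0.0, 1: 1.0, 2: 0.2}

def v_n(start, n):
    """Cesaro value: U_k(s) = minimal total cost of k further stages from s,
    via the Bellman recursion U_k(s) = min_{t in Gamma(s)} g(t) + U_{k-1}(t)."""
    U = {s: 0.0 for s in successors}
    for _ in range(n):
        U = {s: min(g[t] + U[t] for t in successors[s]) for s in successors}
    return U[start] / n

def w_lam(start, lam, iters=20_000):
    """Abel value: fixed point of w(s) = min_t [lam*g(t) + (1-lam)*w(t)],
    reached by value iteration (contraction factor 1 - lam)."""
    w = {s: 0.0 for s in successors}
    for _ in range(iters):
        w = {s: min(lam * g[t] + (1 - lam) * w[t] for t in successors[s])
             for s in successors}
    return w[start]
```

On this toy problem both values approach, as n → ∞ and λ → 0 respectively, the optimal average cost 0.4 of the cycle 0 → 1 → 2 → 0, consistent with the Tauberian equivalence of [24].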

Theorem 10.5 is then the consequence of the following three facts.

Firstly, the main theorem of Lehrer and Sorin in [24] states that uniform convergence (on \(\widetilde{\Omega }\)) of v n to some v is equivalent to uniform convergence of w λ to the same v.

Secondly, the concatenation hypothesis (10.4) on Γ implies that for any \((\omega ,x)\in \widetilde{\Omega }\)

$$v_{n}(\omega ,x) = V_{n}(\omega )$$

where \(V_{t}(\omega ) =\inf_{X\in \Gamma (\omega )}\frac{1}{t}\int_{0}^{t}g(X(s))\,\mathrm{d}s\), as defined in equation (10.7). Consequently, because of the bound on g, for any t ∈ ℝ+ we have

$$\vert V_{t}(\omega ) - v_{\lfloor t\rfloor }(\omega ,x)\vert \leq \frac{2}{\lfloor t\rfloor }$$

where ⌊t⌋ stands for the integer part of t.
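This estimate only uses that the cost is bounded by 1 in absolute value. As a quick sanity check, one can compare averages at horizons t and ⌊t⌋ of a hypothetical bounded cost along a fixed trajectory (here sin, chosen arbitrarily):

```python
import math

def avg(g, T, steps=100_000):
    # left-Riemann approximation of (1/T) * int_0^T g(s) ds
    h = T / steps
    return sum(g(i * h) for i in range(steps)) * h / T

# for any |g| <= 1, the horizon-t and horizon-floor(t) averages
# differ by at most 2/floor(t)
for t in [4.7, 10.3, 25.9]:
    n = math.floor(t)
    gap = abs(avg(math.sin, t) - avg(math.sin, n))
    assert gap <= 2 / n + 1e-6
```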

Finally, again because of hypothesis (10.4), for any λ ∈ ]0, 1],

$$w_{\lambda }(\omega ,x) =\inf_{X\in \Gamma (\omega )}\lambda \int_{0}^{+\infty }{(1 - \lambda )}^{\lfloor t\rfloor }g(X(t))\,\mathrm{d}t.$$

Hence, by equation (10.8) and the bound on the cost function, for any λ ∈ ]0, 1],

$$\vert W_{\lambda }(\omega ) - w_{\lambda }(\omega ,x)\vert \leq \lambda \int_{0}^{+\infty }\left\vert {(1 - \lambda )}^{\lfloor t\rfloor } -\mathrm{e}^{-\lambda t}\right\vert \mathrm{d}t$$

which tends uniformly (with respect to x and ω) to 0 as λ goes to 0 by virtue of the following lemma.

Lemma 10.6.

The function

$$\lambda \mapsto \lambda \int_{0}^{+\infty }\left\vert {(1 - \lambda )}^{\lfloor t\rfloor } -\mathrm{e}^{-\lambda t}\right\vert \mathrm{d}t$$

converges to 0 as λ tends to 0.

Proof.

Since \(\lambda \int_{0}^{+\infty }{(1 - \lambda )}^{\lfloor t\rfloor }\,\mathrm{d}t = \lambda \int_{0}^{+\infty }\mathrm{e}^{-\lambda t}\,\mathrm{d}t = 1\) for any λ ∈ ]0, 1], the lemma is equivalent to the convergence to 0 of

$$E(\lambda ) := \lambda \int_{0}^{+\infty }{\left[{(1 - \lambda )}^{\lfloor t\rfloor } -\mathrm{e}^{-\lambda t}\right]}_{+}\mathrm{d}t$$

where [x]+ denotes the positive part of x. Now, from the relation \(1 - \lambda \leq \mathrm{e}^{-\lambda }\), true for any λ, one can easily deduce that, for any λ ∈ ]0, 1] and t ≥ 0, the relation \({(1 - \lambda )}^{\lfloor t\rfloor }\mathrm{e}^{\lambda t} \leq \mathrm{e}^{\lambda }\) holds. Hence,

$$\begin{array}{rcl} E(\lambda )& =& \lambda \int_{0}^{+\infty }\mathrm{e}^{-\lambda t}{\left[{(1 - \lambda )}^{\lfloor t\rfloor }\mathrm{e}^{\lambda t} - 1\right]}_{+}\mathrm{d}t \\ & \leq & \lambda \int_{0}^{+\infty }\mathrm{e}^{-\lambda t}(\mathrm{e}^{\lambda } - 1)\,\mathrm{d}t \\ & =& \mathrm{e}^{\lambda } - 1 \end{array}$$

which converges to 0 as λ tends to 0. □ 
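The rate in Lemma 10.6 can also be checked numerically. The sketch below evaluates D(λ) = λ∫₀^∞ |(1 − λ)^⌊t⌋ − e^{−λt}| dt interval by interval: on each [i, i + 1) the integrand has at most one sign change, so the integral is available in closed form. Since both functions integrate to 1 against λ dt, we have D(λ) = 2E(λ), so the proof's estimate gives D(λ) ≤ 2(e^λ − 1).

```python
import math

def D(lam):
    """lam * int_0^inf |(1-lam)^floor(t) - exp(-lam t)| dt, summed over
    the integer intervals [i, i+1); the tail beyond i ~ 50/lam is of
    order e^{-50} and is dropped."""
    n_terms = int(50 / lam) + 1
    total = 0.0
    for i in range(n_terms):
        a = (1.0 - lam) ** i
        lo, hi = float(i), float(i + 1)
        # exp(-lam t) crosses the level a at t* = -ln(a)/lam >= i
        tstar = -math.log(a) / lam if a > 0 else float("inf")

        def F(l, r):
            # int_l^r (a - exp(-lam t)) dt in closed form
            return a * (r - l) - (math.exp(-lam * l) - math.exp(-lam * r)) / lam

        if tstar <= lo:          # a >= exp(-lam t) on the whole interval
            total += F(lo, hi)
        elif tstar >= hi:        # a <= exp(-lam t) on the whole interval
            total += -F(lo, hi)
        else:                    # one sign change inside the interval
            total += -F(lo, tstar) + F(tstar, hi)
    return lam * total
```

For instance D(0.1), D(0.01), D(0.001) decrease roughly linearly in λ, each staying below the bound 2(e^λ − 1).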


Copyright information

© 2013 Springer Science+Business Media New York

Cite this chapter

Oliu-Barton, M., Vigeral, G. (2013). A Uniform Tauberian Theorem in Optimal Control. In: Cardaliaguet, P., Cressman, R. (eds) Advances in Dynamic Games. Annals of the International Society of Dynamic Games, vol 12. Birkhäuser, Boston, MA. https://doi.org/10.1007/978-0-8176-8355-9_10
