Abstract
In an optimal control framework, we consider the value \(V_{T}(x)\) of the problem with finite horizon T starting from state x, as well as the value \(W_{\lambda }(x)\) of the λ-discounted problem starting from x. We prove that uniform convergence (on the set of states) of the values \(V_{T}(\cdot )\) as T tends to infinity is equivalent to uniform convergence of the values \(W_{\lambda }(\cdot )\) as λ tends to 0, and that the limits are identical. An example is also provided to show that the result does not hold for pointwise convergence. This work is an extension, using similar techniques, of a related result by Lehrer and Sorin in a discrete-time framework.
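For orientation, the two values compared here are the Cesàro mean and the Abel (discounted) mean of the running cost along feasible trajectories. In the notation used in the appendix (our transcription of the forms referenced there as equations (10.7) and (10.8)), they read

\[
V_{T}(x) = \inf_{X \in \Gamma (x)} \frac{1}{T}\int_{0}^{T} g(X(s))\,\mathrm{d}s,
\qquad
W_{\lambda }(x) = \inf_{X \in \Gamma (x)} \lambda \int_{0}^{+\infty } e^{-\lambda s} g(X(s))\,\mathrm{d}s .
\]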
Notes
- 1.
- 2. The reader may verify that this is indeed not the case in the example of Sect. 10.4.
- 3. We thank Marc Quincampoix for pointing out this example to us, which is simpler than our original one.
- 4. We thank Frédéric Bonnans for the idea of this proof.
References
Alvarez, O., Bardi, M.: Ergodic Problems in Differential Games. Advances in Dynamic Game Theory, pp. 131–152. Ann. Int’l. Soc. Dynam. Games, vol. 9, Birkhäuser Boston (2007)
Alvarez, O., Bardi, M.: Ergodicity, stabilization, and singular perturbations for Bellman-Isaacs equations. Mem. Am. Math. Soc. 960(204), 1–90 (2010)
Arisawa, M.: Ergodic problem for the Hamilton-Jacobi-Bellman equation I. Ann. Inst. Henri Poincare 14, 415–438 (1997)
Arisawa, M.: Ergodic problem for the Hamilton-Jacobi-Bellman equation II. Ann. Inst. Henri Poincare 15, 1–24 (1998)
Arisawa, M., Lions, P.-L.: On ergodic stochastic control. Comm. Partial Diff. Eq. 23(11–12), 2187–2217 (1998)
Artstein, Z., Gaitsgory, V.: The value function of singularly perturbed control systems. Appl. Math. Optim. 41(3), 425–445 (2000)
Bardi, M., Capuzzo-Dolcetta, I.: Optimal Control and Viscosity Solutions of Hamilton-Jacobi-Bellman Equations. Systems & Control: Foundations & Applications. Birkhäuser Boston, Inc., Boston, MA (1997)
Barles, G.: Some homogenization results for non-coercive Hamilton-Jacobi equations. Calculus Variat. Partial Diff. Eq. 30(4), 449–466 (2007)
Bellman, R.: On the theory of dynamic programming. Proc. Natl. Acad. Sci. U.S.A, 38, 716–719 (1952)
Bettiol, P.: On ergodic problem for Hamilton-Jacobi-Isaacs equations. ESAIM: COCV 11, 522–541 (2005)
Cardaliaguet, P.: Ergodicity of Hamilton-Jacobi equations with a non coercive non convex Hamiltonian in \(\mathbb{R}^{2}/\mathbb{Z}^{2}\). Ann. Inst. Henri Poincaré (C) Non Linear Anal. 27, 837–856 (2010)
Carlson, D.A., Haurie, A.B., Leizarowitz, A.: Optimal Control on Infinite Time Horizon. Springer, Berlin (1991)
Colonius, F., Kliemann, W.: Infinite time optimal control and periodicity. Appl. Math. Optim. 20, 113–130 (1989)
Evans, L.C.: An Introduction to Mathematical Optimal Control Theory. Unpublished Lecture Notes, U.C. Berkeley (1983). Available at http://math.berkeley.edu/~evans/control.course.pdf
Feller, W.: An Introduction to Probability Theory and its Applications, vol. II, 2nd ed. Wiley, New York (1971)
Grüne, L.: On the relation between discounted and average optimal value functions. J. Diff. Eq. 148, 65–99 (1998)
Hardy, G.H., Littlewood, J.E.: Tauberian theorems concerning power series and Dirichlet’s series whose coefficients are positive. Proc. London Math. Soc. 13, 174–191 (1914)
Hestenes, M.: A General Problem in the Calculus of Variations with Applications to the Paths of Least Time, vol. 100. RAND Corporation, Research Memorandum, Santa Monica, CA (1950)
Isaacs, R.: Games of Pursuit. Paper P-257. RAND Corporation, Santa Monica (1951)
Isaacs, R.: Differential Games. A Mathematical Theory with Applications to Warfare and Pursuit, Control and Optimization. Wiley, New York (1965)
Kohlberg, E., Neyman, A.: Asymptotic behavior of nonexpansive mappings in normed linear spaces. Isr. J. Math. 38, 269–275 (1981)
Kirk, D.E.: Optimal Control Theory: An Introduction. Prentice Hall, Englewood Cliffs, NJ (1970)
Lee, E.B., Markus, L.: Foundations of Optimal Control Theory. SIAM, Philadelphia (1967)
Lehrer, E., Sorin, S.: A uniform Tauberian theorem in dynamic programming. Math. Oper. Res. 17, 303–307 (1992)
Lions, P.-L., Papanicolaou, G., Varadhan, S.R.S.: Homogenization of Hamilton-Jacobi Equations. Unpublished (1986)
Monderer, M., Sorin, S.: Asymptotic Properties in Dynamic Programming. Int. J. Game Theory 22, 1–11 (1993)
Pontryagin, L.S., Boltyanskii, V.G., Gamkrelidze, R.V., Mishchenko, E.F.: The Mathematical Theory of Optimal Processes. Nauka, Moscow (1962) (Engl. transl.: Wiley)
Quincampoix, M., Renault, J.: On the existence of a limit value in some non expansive optimal control problems. SIAM J. Control Optim. 49, 2118–2132 (2011)
Shapley, L.S.: Stochastic games. Proc. Natl. Acad. Sci. 39, 1095–1100 (1953)
Acknowledgements
This article was written as part of the first author's PhD. Both authors wish to express their many thanks to Sylvain Sorin for his numerous comments and his great help. We also thank Hélène Frankowska and Marc Quincampoix for helpful remarks on earlier drafts.
Appendix
We give here another proof (Footnote 4) of Theorem 10.5, using the analogous result in discrete time [24] together with an equivalence argument between discrete and continuous dynamics.
Consider a deterministic dynamic programming problem in continuous time, as defined in Sect. 10.2.1, with state space Ω, payoff g and dynamics Γ. Recall that, for any ω ∈ Ω, Γ(ω) is the nonempty set of feasible trajectories starting from ω. We construct an associated deterministic dynamic programming problem in discrete time as follows.
Let \(\widetilde{\Omega } = \Omega \times [0,1]\) be the new state space and let \(\widetilde{g}\) be the new cost function, given by \(\widetilde{g}(\omega ,x) = x\). We define a set-valued map with nonempty values \(\widetilde{\Gamma } :\widetilde{ \Omega } \rightrightarrows \widetilde{ \Omega }\) by
\(\widetilde{\Gamma }(\omega ,x) = \left \{\left (X(1),\ \int_{0}^{1}g(X(s))\,\mathrm{d}s\right )\,:\, X \in \Gamma (\omega )\right \}.\)
Following [24], we define, for any initial state \(\widetilde{\omega } = (\omega ,x)\),
\(v_{n}(\widetilde{\omega }) =\inf \frac{1}{n}\sum _{i=1}^{n}\widetilde{g}(\widetilde{\omega }_{i}),\qquad w_{\lambda }(\widetilde{\omega }) =\inf \lambda \sum _{i=1}^{+\infty }(1-\lambda )^{i-1}\,\widetilde{g}(\widetilde{\omega }_{i}),\)
where the infima are taken over the set of sequences \(\{\widetilde{\omega }_{i}\}_{i\in \mathbb{N}}\) such that \(\widetilde{\omega }_{0} =\widetilde{ \omega }\) and \(\widetilde{\omega }_{i+1} \in \widetilde{ \Gamma }(\widetilde{\omega }_{i})\) for every i ≥ 0.
Theorem 10.5 is then the consequence of the following three facts.
Firstly, the main theorem of Lehrer and Sorin in [24] states that uniform convergence (on \(\widetilde{\Omega }\)) of v n to some v is equivalent to uniform convergence of w λ to the same v.
Secondly, the concatenation hypothesis (10.4) on Γ implies that, for any \((\omega ,x)\in \widetilde{\Omega }\) and any \(n \in \mathbb{N}^{*}\),
\(v_{n}(\omega ,x) = V_{n}(\omega ),\)
where \(V_{t}(\omega ) =\inf _{X\in \Gamma (\omega )}\frac{1}{t}\int_{0}^{t}g(X(s))\,\mathrm{d}s\), as defined in equation (10.7). Consequently, because of the bound on g, for any t ≥ 1 we have
\(\big\vert V_{t}(\omega ) - v_{\lfloor t\rfloor }(\omega ,x)\big\vert \leq \frac{2\Vert g\Vert _{\infty }}{t},\)
where ⌊t⌋ stands for the integer part of t.
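As a quick numerical illustration (not part of the chapter), the elementary estimate behind this step, namely \(\vert \frac{1}{t}\int_{0}^{t}g - \frac{1}{\lfloor t\rfloor }\int_{0}^{\lfloor t\rfloor }g\vert \leq 2\Vert g\Vert _{\infty }/t\), can be checked on a single bounded cost. We take the hypothetical choice g(s) = sin(s), for which the running average has the closed form (1 − cos t)∕t:

```python
import math

def running_average(t: float) -> float:
    """Cesaro mean (1/t) * integral_0^t sin(s) ds = (1 - cos t) / t for the toy cost g(s) = sin(s)."""
    return (1.0 - math.cos(t)) / t

# |g| <= 1 here, so the estimate gives |V_t - V_floor(t)| <= 2/t for all t >= 1.
for i in range(1000):
    t = 1.0 + 49.0 * i / 999.0
    gap = abs(running_average(t) - running_average(float(math.floor(t))))
    assert gap <= 2.0 / t + 1e-12, (t, gap)
```

The same estimate holds for any measurable cost bounded by \(\Vert g\Vert _{\infty }\); the sine cost is used only because its average is available in closed form.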
Finally, again because of hypothesis (10.4), for any λ ∈ ]0, 1],
\(w_{\lambda }(\omega ,x) =\inf _{X\in \Gamma (\omega )}\lambda \int_{0}^{+\infty }(1-\lambda )^{\lfloor t\rfloor }g(X(t))\,\mathrm{d}t.\)
Hence, by equation (10.8) and the bound on the cost function, for any λ ∈ ]0, 1],
\(\big\vert W_{\lambda }(\omega ) - w_{\lambda }(\omega ,x)\big\vert \leq \Vert g\Vert _{\infty }\,\lambda \int_{0}^{+\infty }\big\vert (1-\lambda )^{\lfloor t\rfloor } - {e}^{-\lambda t}\big\vert \,\mathrm{d}t,\)
which tends uniformly (with respect to x and ω) to 0 as λ goes to 0 by virtue of the following lemma.
Lemma 10.6.
The function \(\lambda \longmapsto \lambda \int_{0}^{+\infty }\big\vert (1-\lambda )^{\lfloor t\rfloor } - {e}^{-\lambda t}\big\vert \,\mathrm{d}t\)
converges to 0 as λ tends to 0.
Proof.
Since \(\lambda \int_{0}^{+\infty }(1-\lambda )^{\lfloor t\rfloor }\,\mathrm{d}t =\lambda \int_{0}^{+\infty }{e}^{-\lambda t}\,\mathrm{d}t = 1\) for any λ ∈ ]0, 1], the lemma is equivalent to the convergence to 0 of
\(\lambda \int_{0}^{+\infty }{\left [(1-\lambda )^{\lfloor t\rfloor } - {e}^{-\lambda t}\right ]}_{+}\,\mathrm{d}t,\)
where [x] + denotes the positive part of x. Now, from the relation 1 − λ ≤ e− λ, true for any λ, one easily deduces that, for any λ ∈ ]0, 1] and t ≥ 0, the relation (1 − λ)⌊t⌋eλt ≤ eλ holds. Hence,
\(\lambda \int_{0}^{+\infty }{\left [(1-\lambda )^{\lfloor t\rfloor } - {e}^{-\lambda t}\right ]}_{+}\,\mathrm{d}t \leq ({e}^{\lambda } - 1)\,\lambda \int_{0}^{+\infty }{e}^{-\lambda t}\,\mathrm{d}t = {e}^{\lambda } - 1,\)
which converges to 0 as λ tends to 0. □
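As an independent sanity check (again not part of the chapter), the quantity \(\lambda \int_{0}^{+\infty }\vert (1-\lambda )^{\lfloor t\rfloor } - {e}^{-\lambda t}\vert \,\mathrm{d}t\) can be computed exactly interval by interval, since on [k, k + 1) the first factor is the constant (1 − λ)ᵏ and the exponential integrates in closed form. By the argument above (writing |f − g| = 2[f − g]₊ − (f − g) and using that both terms integrate to 1∕λ), the value is bounded by 2(eᵏ... more precisely by 2(e^λ − 1), and it shrinks with λ:

```python
import math

def lemma_integral(lam: float) -> float:
    """lam * integral_0^inf |(1-lam)^floor(t) - exp(-lam*t)| dt, computed piecewise
    on the intervals [k, k+1), with the tail truncated at K ~ 40/lam (negligible)."""
    K = int(math.ceil(40.0 / lam))
    total = 0.0
    for k in range(K):
        c = (1.0 - lam) ** k                 # step value on [k, k+1)
        ea = math.exp(-lam * k)              # exp(-lam*t) at t = k
        eb = math.exp(-lam * (k + 1))        # exp(-lam*t) at t = k+1
        # exact integral of exp(-lam*t) over [a, b] is (ea - eb)/lam
        if eb >= c:      # exponential above the step on the whole interval
            total += (ea - eb) / lam - c
        elif ea <= c:    # exponential below the step on the whole interval
            total += c - (ea - eb) / lam
        else:            # one crossing at t* in (k, k+1) where exp(-lam*t*) = c
            ts = -math.log(c) / lam
            total += ((ea - c) / lam - c * (ts - k)) \
                   + (c * (k + 1 - ts) - (c - eb) / lam)
    return lam * total

# The bound 2*(e^lam - 1) from the proof holds, and the integral vanishes with lam.
for lam in (0.5, 0.1, 0.01):
    val = lemma_integral(lam)
    assert 0.0 < val <= 2.0 * (math.exp(lam) - 1.0) + 1e-9, (lam, val)
```

The crossing-point case relies on the exponential being decreasing, so each interval contains at most one sign change of the integrand.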
Copyright information
© 2013 Springer Science+Business Media New York
Cite this chapter
Oliu-Barton, M., Vigeral, G. (2013). A Uniform Tauberian Theorem in Optimal Control. In: Cardaliaguet, P., Cressman, R. (eds) Advances in Dynamic Games. Annals of the International Society of Dynamic Games, vol 12. Birkhäuser, Boston, MA. https://doi.org/10.1007/978-0-8176-8355-9_10
DOI: https://doi.org/10.1007/978-0-8176-8355-9_10
Publisher Name: Birkhäuser, Boston, MA
Print ISBN: 978-0-8176-8354-2
Online ISBN: 978-0-8176-8355-9
eBook Packages: Mathematics and Statistics