
Sequential Stochastic Control (Single or Multi-Agent) Problems Nearly Admit Change of Measures with Independent Measurement

Published in Applied Mathematics & Optimization (2023)

Abstract

Change of measures has been an effective method in stochastic control and analysis. In continuous-time control it follows from Girsanov’s theorem, applied to both fully observed and partially observed models; in decentralized stochastic control (or stochastic dynamic team theory) it is known as Witsenhausen’s static reduction; and in discrete-time classical stochastic control Borkar applied it to partially observed Markov decision processes (POMDPs), generalizing Fleming and Pardoux’s continuous-time approach. The method allows for equivalent optimal stochastic control or filtering in a new probability space in which the measurements form an independent exogenous process, in both discrete and continuous time, while the Radon–Nikodym derivative (between the true measure and the reference measure formed via the independent measurement process) is pushed to the cost or dynamics. However, for the method to be applicable, an absolute continuity condition is necessary. This raises the following question: can we perturb any discrete-time sequential stochastic control problem by adding arbitrarily small additive noise (e.g., Gaussian or otherwise) to the measurements so that the measurements become absolutely continuous, making a change of measure (or static reduction) applicable with arbitrarily small error in the optimal cost? That is, are all sequential stochastic (single-agent or decentralized multi-agent) problems \(\epsilon \)-away from being static reducible, as far as optimal cost is concerned, for every \(\epsilon > 0\)? We show that this is possible when the cost function is bounded and continuous in the controllers’ actions and the action spaces are convex. We also note that the solution and cost obtained for the perturbed system are realizable (under a randomized policy) for the original model.
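The change-of-measure identity described above can be illustrated numerically. The following is a minimal sketch, not taken from the paper: a one-step scalar linear-Gaussian model in which the expected cost under the true measure (where the measurement depends on the state) is recovered under a reference measure (where the measurement is an independent exogenous Gaussian) by weighting the cost with the Radon–Nikodym derivative. The specific densities, the policy, and all variable names are illustrative assumptions.

```python
import numpy as np

# One-step illustration of the change of measure (illustrative, not the paper's model).
# True measure P: state X ~ N(0,1), measurement Y = X + sigma*W with W ~ N(0,1).
# Reference measure Q: same X, but Y drawn independently of X from N(0, 1+sigma^2).
# The Radon-Nikodym derivative L = p(y|x)/q(y) is "pushed to the cost":
#   E_P[c(X, u(Y))] = E_Q[c(X, u(Y)) * L(X, Y)].

SIGMA = 0.5

def normal_pdf(z, mean, var):
    # density of N(mean, var) evaluated at z
    return np.exp(-(z - mean) ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

def cost(x, y):
    # quadratic cost under the linear policy u(y) = y / (1 + sigma^2)
    u = y / (1.0 + SIGMA ** 2)
    return (x - u) ** 2

rng = np.random.default_rng(0)
n = 400_000

# Monte Carlo under the true measure P (measurement depends on the state)
x_p = rng.standard_normal(n)
y_p = x_p + SIGMA * rng.standard_normal(n)
cost_true = cost(x_p, y_p).mean()

# Monte Carlo under the reference measure Q (measurement independent of the state),
# reweighted by the likelihood ratio L = p(y|x) / q(y)
x_q = rng.standard_normal(n)
y_q = np.sqrt(1.0 + SIGMA ** 2) * rng.standard_normal(n)
weights = normal_pdf(y_q, x_q, SIGMA ** 2) / normal_pdf(y_q, 0.0, 1.0 + SIGMA ** 2)
cost_ref = (cost(x_q, y_q) * weights).mean()

# Both estimates target sigma^2 / (1 + sigma^2) = 0.2 for this model.
```

Both Monte Carlo estimates agree (up to sampling error), even though under the reference measure the measurement carries no information about the state; all of the statistical dependence has been moved into the weight, which is exactly the mechanism that static reduction exploits.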

Fig. 1


References

  1. Arapostathis, A., Borkar, V.S., Ghosh, M.K.: Ergodic Control of Diffusion Processes, vol. 143. Cambridge University Press, Cambridge (2012)

  2. Baras, J.S., Bensoussan, A., James, M.R.: Dynamic observers as asymptotic limits of recursive filters: special cases. SIAM J. Appl. Math. 48(5), 1147–1158 (1988)

  3. Baras, J.S., Krishnaprasad, P.S.: Dynamic observers as asymptotic limits of recursive filters. In: 21st IEEE Conference on Decision and Control, pp. 1126–1127. IEEE (1982)

  4. Beneš, V.E.: Existence of optimal stochastic control laws. SIAM J. Control 9(3), 446–472 (1971)

  5. Bertsekas, D.P., Shreve, S.E.: Stochastic Optimal Control: The Discrete Time Case. Academic Press, New York (1978)

  6. Bianchini, S., Bressan, A.: Vanishing viscosity solutions of nonlinear hyperbolic systems. Ann. Math. 161, 223–342 (2005)

  7. Bismut, J.M.: Partially observed diffusions and their control. SIAM J. Control Optim. 20(2), 302–309 (1982)

  8. Blackwell, D.: The comparison of experiments. In: Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability, pp. 93–102 (1951)

  9. Borkar, V.S.: White-noise representations in stochastic realization theory. SIAM J. Control Optim. 31, 1093–1102 (1993)

  10. Borkar, V.S.: Average cost dynamic programming equations for controlled Markov chains with partial observations. SIAM J. Control Optim. 39(3), 673–681 (2000)

  11. Borkar, V.S.: Dynamic programming for ergodic control of Markov chains under partial observations: a correction. SIAM J. Control Optim. 45(6), 2299–2304 (2007)

  12. Charalambous, C.D.: Decentralized optimality conditions of stochastic differential decision problems via Girsanov’s measure transformation. Math. Control Signals Syst. 28(3), 1–55 (2016)

  13. Ciampa, G., Rossi, F.: Vanishing viscosity in mean-field optimal control. arXiv preprint arXiv:2111.13015 (2021)

  14. Davis, M.H.A., Varaiya, P.: Information states for linear stochastic systems. J. Math. Anal. Appl. 37(2), 384–402 (1972)

  15. Davis, M.H.A., Varaiya, P.: Dynamic programming conditions for partially observable stochastic systems. SIAM J. Control 11(2), 226–261 (1973)

  16. Dudley, R.M.: Real Analysis and Probability, 2nd edn. Cambridge University Press, Cambridge (2002)

  17. Dugundji, J.: An extension of Tietze’s theorem. Pac. J. Math. 1(3), 353–367 (1951)

  18. Fleming, W., Pardoux, E.: Optimal control for partially observed diffusions. SIAM J. Control Optim. 20(2), 261–285 (1982)

  19. Fleming, W.H., Rishel, R.W.: Deterministic and Stochastic Optimal Control. Springer, New York (1975)

  20. Fleming, W.H., Soner, H.M.: Controlled Markov Processes and Viscosity Solutions, vol. 25. Springer Science & Business Media, New York (2006)

  21. Gihman, I.I., Skorohod, A.V.: Controlled Stochastic Processes. Springer Science & Business Media, New York (2012)

  22. Girsanov, I.V.: On transforming a certain class of stochastic processes by absolutely continuous substitution of measures. Theory Probab. Appl. 5(3), 285–301 (1960)

  23. Gupta, A., Yüksel, S., Başar, T., Langbort, C.: On the existence of optimal policies for a class of static and sequential dynamic teams. SIAM J. Control Optim. 53, 1681–1712 (2015)

  24. Heunis, A.J.: Non-linear filtering of rare events with large signal-to-noise ratio. J. Appl. Probab. 24(4), 929–948 (1987)

  25. Hijab, O.: Asymptotic Bayesian estimation of a first order equation with small diffusion. Ann. Probab. 12(3), 890–902 (1984)

  26. Ho, Y.C., Chu, K.C.: Team decision theory and information structures in optimal control problems: part I. IEEE Trans. Autom. Control 17, 15–22 (1972)

  27. Hogeboom-Burr, I., Yüksel, S.: Continuity properties of value functions in information structures for zero-sum and general games and stochastic teams. SIAM J. Control Optim. 61, 11035 (2023)

  28. Hunter, J., Nachtergaele, B.: Applied Analysis. World Scientific, Singapore (2005)

  29. James, M.R., Baras, J.S., Elliott, R.J.: Risk-sensitive control and dynamic games for partially observed discrete-time nonlinear systems. IEEE Trans. Autom. Control 39(4), 780–792 (1994)

  30. Kushner, H.J.: A partial history of the early development of continuous-time nonlinear stochastic systems theory. Automatica 50(2), 303–334 (2014)

  31. Langen, H.: Convergence of dynamic programming models. Math. Oper. Res. 6(4), 493–512 (1981)

  32. Pra, P.D., Meneghini, L., Runggaldier, W.J.: Connections between stochastic control and dynamic games. Math. Control Signals Syst. 9(4), 303–326 (1996)

  33. Reddy, A.S., Budhiraja, A., Apte, A.: Some large deviations asymptotics in small noise filtering problems. arXiv preprint arXiv:2106.05512 (2021)

  34. Rudin, W.: Functional Analysis. McGraw-Hill, New York (1991)

  35. Saldi, N., Yüksel, S.: Geometry of information structures, strategic measures and associated control topologies. Probab. Surv. 19, 450–532 (2022)

  36. Saldi, N., Yüksel, S., Linder, T.: Finite model approximations and asymptotic optimality of quantized policies in decentralized stochastic control. IEEE Trans. Autom. Control 62, 2360–2373 (2017)

  37. Schäl, M.: Conditions for optimality in dynamic programming and for the limit of n-stage optimal policies to be optimal. Z. Wahrscheinlichkeitstheorie verw. Geb. 32, 179–196 (1975)

  38. Serfozo, R.: Convergence of Lebesgue integrals with varying measures. Sankhyā: Indian J. Stat. Ser. A, 380–402 (1982)

  39. Witsenhausen, H.S.: The intrinsic model for discrete stochastic control: some open problems. Lect. Notes Econ. Math. Syst. 107, 322–335 (1975)

  40. Witsenhausen, H.S.: Equivalent stochastic control problems. Math. Control Signals Syst. 1, 3–11 (1988)

  41. Yüksel, S.: A universal dynamic program and refined existence results for decentralized stochastic control. SIAM J. Control Optim. 58(5), 2711–2739 (2020)

  42. Yüksel, S., Başar, T.: Stochastic Networked Control Systems: Stabilization and Optimization Under Information Constraints. Springer, New York (2013)

  43. Yüksel, S., Linder, T.: Optimization and convergence of observation channels in stochastic control. SIAM J. Control Optim. 50, 864–887 (2012)

  44. Yüksel, S., Saldi, N.: Convex analysis in decentralized stochastic control, strategic measures and optimal solutions. SIAM J. Control Optim. 55, 1–28 (2017)

Author information

Corresponding author

Correspondence to Ian Hogeboom-Burr.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.



About this article


Cite this article

Hogeboom-Burr, I., Yüksel, S. Sequential Stochastic Control (Single or Multi-Agent) Problems Nearly Admit Change of Measures with Independent Measurement. Appl Math Optim 87, 51 (2023). https://doi.org/10.1007/s00245-023-09965-5
