Skip to main content

Sandboxing Controllers for Stochastic Cyber-Physical Systems

  • 553 Accesses

Part of the Lecture Notes in Computer Science book series (LNTCS,volume 11750)


Current cyber-physical systems (CPS) are expected to accomplish complex tasks. To achieve this goal, high performance, but unverified controllers (e.g. deep neural network, black-box controllers from third parties) are applied, which makes it very challenging to keep the overall CPS safe. By sandboxing these controllers, we are not only able to use them but also to enforce safety properties over the controlled physical systems at the same time. However, current available solutions for sandboxing controllers are just applicable to deterministic (a.k.a. non-stochastic) systems, possibly affected by bounded disturbances. In this paper, for the first time we propose a novel solution for sandboxing unverified complex controllers for CPS operating in noisy environments (a.k.a. stochastic CPS). Moreover, we also provide probabilistic guarantees on their safety. Here, the unverified control input is observed at each time instant and checked whether it violates the maximal tolerable probability of reaching the unsafe set. If this probability exceeds a given threshold, the unverified control input will be rejected, and the advisory input provided by the optimal safety controller will be used to maintain the probabilistic safety guarantee. The proposed approach is illustrated empirically and the results indicate that the expected safety probability is guaranteed.


  • Stochastic cyber-physical systems
  • Fault-tolerance
  • Sandboxing controllers

This work was supported in part by the H2020 ERC Starting Grant AutoCPS (grant agreement No 804639) and German Research Foundation (DFG) through the grants ZA 873/1-1 and ZA 873/4-1. Marco Caccamo was supported by an Alexander von Humboldt Professorship endowed by the German Federal Ministry of Education and Research. Any opinions, findings, and conclusions or recommendations expressed in this publication are those of the authors and do not necessarily reflect the views of the Alexander von Humboldt Foundation.

This is a preview of subscription content, access via your institution.

Buying options

USD   29.95
Price excludes VAT (USA)
  • DOI: 10.1007/978-3-030-29662-9_15
  • Chapter length: 18 pages
  • Instant PDF download
  • Readable on all devices
  • Own it forever
  • Exclusive offer for individuals only
  • Tax calculation will be finalised during checkout
USD   59.99
Price excludes VAT (USA)
  • ISBN: 978-3-030-29662-9
  • Instant PDF download
  • Readable on all devices
  • Own it forever
  • Exclusive offer for individuals only
  • Tax calculation will be finalised during checkout
Softcover Book
USD   74.99
Price excludes VAT (USA)
Fig. 1.
Fig. 2.
Fig. 3.
Fig. 4.


  1. 1.

    No input needed to be provided at \(t=H\) since it is the end of the execution.


  1. Abate, A., Prandini, M., Lygeros, J., Sastry, S.: Probabilistic reachability and safety for controlled discrete time stochastic hybrid systems. Automatica 44(11), 2724–2734 (2008).

    CrossRef  MathSciNet  MATH  Google Scholar 

  2. Abdi, F., Chen, C.Y., Hasan, M., Liu, S., Mohan, S., Caccamo, M.: Preserving physical safety under cyber attacks. IEEE Internet Things J. (2018).

    CrossRef  Google Scholar 

  3. Abdi, F., Tabish, R., Rungger, M., Zamani, M., Caccamo, M.: Application and system-level software fault tolerance through full system restarts. In: 2017 ACM/IEEE 8th International Conference on Cyber-Physical Systems (ICCPS), pp. 197–206. IEEE (2017).

  4. Alshiekh, M., Bloem, R., Ehlers, R., Könighofer, B., Niekum, S., Topcu, U.: Safe reinforcement learning via shielding. In: Thirty-Second AAAI Conference on Artificial Intelligence (2018)

    Google Scholar 

  5. Bak, S., Johnson, T.T., Caccamo, M., Sha, L.: Real-time reachability for verified simplex design. In: 2014 IEEE Real-Time Systems Symposium, pp. 138–148. IEEE (2014).

  6. Bak, S., Manamcheri, K., Mitra, S., Caccamo, M.: Sandboxing controllers for cyber-physical systems. In: 2011 IEEE/ACM Second International Conference on Cyber-Physical Systems, pp. 3–12. IEEE (2011).

  7. Bloem, R., Könighofer, B., Könighofer, R., Wang, C.: Shield synthesis: runtime enforcement for reactive systems. In: International Conference on Tools and Algorithms for the Construction and Analysis of Systems, pp. 533–548. Springer (2015).

    CrossRef  Google Scholar 

  8. Crenshaw, T.L., Gunter, E., Robinson, C.L., Sha, L., Kumar, P.: The simplex reference model: limiting fault-propagation due to unreliable components in cyber-physical system architectures. In: 28th IEEE International Real-Time Systems Symposium, RTSS 2007, pp. 400–412. IEEE (2007).

  9. Esmaeil Zadeh Soudjani, S.: Formal abstractions for automated verification and synthesis of stochastic systems. Ph.D. thesis, Technical University of Delft (2014).

  10. Hernández-Lerma, O., Lasserre, J.B.: Discrete-Time Markov Control Processes: Basic Optimality Criteria. Springer, New York (1996).

    CrossRef  MATH  Google Scholar 

  11. Hovakimyan, N., Cao, C., Kharisov, E., Xargay, E., Gregory, I.M.: L 1 adaptive control for safety-critical systems. IEEE Control Syst. Mag. 31(5), 54–104 (2011).

    CrossRef  MathSciNet  MATH  Google Scholar 

  12. Humphrey, L., Könighofer, B., Könighofer, R., Topcu, U.: Synthesis of admissible shields. In: Bloem, R., Arbel, E. (eds.) HVC 2016. LNCS, vol. 10028, pp. 134–151. Springer, Cham (2016).

    CrossRef  Google Scholar 

  13. Kupferman, O., Vardi, M.Y.: Model checking of safety properties. Form. Methods Syst. Des. 19(3), 291–314 (2001).

    CrossRef  MATH  Google Scholar 

  14. Lavaei, A., Soudjani, S., Zamani, M.: From dissipativity theory to compositional construction of finite Markov decision processes. In: Proceedings of the 21st International Conference on Hybrid Systems: Computation and Control (part of CPS Week), pp. 21–30. ACM (2018).

  15. Lavaei, A., Soudjani, S., Zamani, M.: Compositional synthesis of large-scale stochastic systems: a relaxed dissipativity approach. arXiv preprint arXiv:1902.01223 (2019)

  16. Monahan, G.E.: State of the art–a survey of partially observable markov decision processes: theory, models, and algorithms. Manag. Sci. 28(1), 1–16 (1982).

    CrossRef  MATH  Google Scholar 

  17. Reis, C., Barth, A., Pizano, C.: Browser security: lessons from google chrome. Commun. ACM 52(8), 45–49 (2009).

    CrossRef  Google Scholar 

  18. Sha, L.: Using simplicity to control complexity. IEEE Softw. 20–28 (2001).

    CrossRef  Google Scholar 

  19. Tkachev, I., Mereacre, A., Katoen, J.P., Abate, A.: Quantitative automata-based controller synthesis for non-autonomous stochastic hybrid systems. In: Proceedings of the 16th International Conference on Hybrid Systems: Computation and Control, pp. 293–302. ACM (2013).

  20. Wang, X., Hovakimyan, N., Sha, L.: L1simplex: fault-tolerant control of cyber-physical systems. In: 2013 ACM/IEEE International Conference on Cyber-Physical Systems (ICCPS), pp. 41–50. IEEE (2013).

  21. Wang, X., Hovakimyan, N., Sha, L.: RSimplex: a robust control architecture for cyber and physical failures. ACM Trans. Cyber Phys. Syst. 2(4), 27 (2018).

    CrossRef  Google Scholar 

  22. Yao, J., Liu, X., Zhu, G., Sha, L.: Netsimplex: controller fault tolerance architecture in networked control systems. IEEE Trans. Ind. Inform. 9(1), 346–356 (2013).

    CrossRef  Google Scholar 

Download references


The authors would like to thank Abolfazl Lavaei for the discussions on synthesizing optimal safety controllers for stochastic systems.

Author information

Authors and Affiliations


Corresponding author

Correspondence to Bingzhuo Zhong .

Editor information

Editors and Affiliations

Appendix: Proof of Theorem 2

Appendix: Proof of Theorem 2

The Proof of Theorem 2 is done with the help of the following lemma.

Lemma 1

Given a finite MDP \(\mathfrak {M}\,=\,\{\tilde{X},\,\tilde{U},\,\tilde{T}\}\) and a Markov policy \(\mu \,=\,(\mu _0, \mu _1,\ldots ,\mu _{H-1})\) in a finite time horizon \(\overline{0,H}\), we have

$$\begin{aligned} 1-\tilde{V}_{n+1}(\tilde{x}) = \sum _{\tilde{y}\in \tilde{X}\backslash \{\phi \}}(1-\tilde{V}_{n}(\tilde{y}))\tilde{T}(\tilde{y}|\tilde{x},\mu _{H-n-1}(\tilde{x})) \end{aligned}$$

where \(\tilde{V}_{n}(\tilde{x})\) is the value function for the reach-avoid problem and \(\tilde{x}\in \tilde{X}\).

The proof can be readily derived based on Theorem 1 and the definition of \(\tilde{T}\). Let \(\mu '\) be the Markov policy used to control the system when the unverified controller is accepted at some states at some time instants. Here, we use \(\tilde{X}_s\) to represent \(\tilde{X}\backslash \{\phi \}\). Let’s define:

$$\begin{aligned} f(\tilde{x}(k),\mu '_{k}(\tilde{x}(k))) = 1-\sum _{\tilde{x}(k+1)\in \tilde{X}_s}\tilde{V}_{*,H-k-1}(\tilde{x}(k+1))\tilde{T}(\tilde{x}(k+1)|\tilde{x}(k),\mu '_{k}(\tilde{x}(k))), \end{aligned}$$


$$\begin{aligned} g(\tilde{x}(k-1),\mu '_{k-1}(\tilde{x}(k-1)))=\tilde{T}(\tilde{x}(k)|\tilde{x}(k-1),\mu '_{k-1}(\tilde{x}(k-1))). \end{aligned}$$

Given initial state \(s_0\in \tilde{X}_s\), at each time instant \(t=k\) where \(k\in \overline{0,H-1}\), we have

$$\begin{aligned} \begin{aligned}&1-\tilde{V}^{\mu '}_{H}(s_0)\\ =&\sum _{\tilde{x}(1)\in \tilde{X}_s}\left( \sum _{\tilde{x}(2)\in \tilde{X}_s}\left( \ldots \left( \sum _{\tilde{x}(k)\in \tilde{X}_s}f(\tilde{x}(k),\mu '_{k}(\tilde{x}(k)))g(\tilde{x}(k-1),\mu '_{k-1}(\tilde{x}(k-1)))\right) \right. \right. \\&\left. \left. \ldots \right) g(\tilde{x}(1),\mu '_{1}(\tilde{x}(1)))\right) g(s_0,\mu '_{0}(s_0))\\ \end{aligned} \end{aligned}$$
$$\begin{aligned} \begin{aligned} \ge&\sum _{\tilde{x}(1)\in \tilde{X}_s}\left( \sum _{\tilde{x}(2)\in \tilde{X}_s}\left( \ldots \left( f(\underline{\tilde{x}(k)},\underline{\mu '_{k}(\tilde{x}(k))})\sum _{\tilde{x}(k)\in \tilde{X}_s}g(\tilde{x}(k-1),\mu '_{k-1}(\tilde{x}(k-1)))\right) \right. \right. \\&\left. \left. \ldots \right) g(\tilde{x}(1),\mu '_{1}(\tilde{x}(1)))\right) g(s_0,\mu '_{0}(s_0))\\ \ge&\sum _{\tilde{x}(1)\in \tilde{X}_s}\left( \sum _{\tilde{x}(2)\in \tilde{X}_s}\left( \ldots \left( \left( \sum _{\tilde{x}(k)\in \tilde{X}_s}g(\underline{\tilde{x}(k-1)},\underline{\mu '_{k-1}(\tilde{x}(k-1))}))\right) f(\underline{\tilde{x}(k)},\underline{\mu '_{k}(\tilde{x}(k))})\right. \right. \right. \\&\left. \left. \left. \sum _{\tilde{x}(k-1)\in \tilde{X}_s}g(\tilde{x}(k-2),\mu '_{k-2}(\tilde{x}(k-2)))\right) \ldots \right) g(\tilde{x}(1),\mu '_{1}(\tilde{x}(1)))\right) g(s_0,\mu '_{0}(s_0))\\ \ge&\sum _{\tilde{x}(1)\in \tilde{X}_s}\left( \sum _{\tilde{x}(2)\in \tilde{X}_s}\left( \ldots \left( \left( \sum _{\tilde{x}(k-1)\in \tilde{X}_s}g(\underline{\tilde{x}(k-2)},\underline{\mu '_{k-2}(\tilde{x}(k-2))})\right) \right. \right. \right. \\&\left( \sum _{\tilde{x}(k)\in \tilde{X}_s}g(\underline{\tilde{x}(k-1)},\underline{\mu '_{k-1}(\tilde{x}(k-1))})\right) f(\underline{\tilde{x}(k)},\underline{\mu '_{k}(\tilde{x}(k))})\\&\left. \left. \left. \sum _{\tilde{x}(k-2)\in \tilde{X}_s}g(\tilde{x}(k-3),\mu '_{k-3}(\tilde{x}(k-3)))\right) \ldots \right) g(\tilde{x}(1),\mu '_{1}(\tilde{x}(1)))\right) g(s_0,\mu '_{0}(s_0))\\&\ldots \\ \ge&\prod _{t=1}^{k}\sum _{\tilde{x}(t)\in \tilde{X}_s}g(\underline{\tilde{x}(t-1)}, \underline{\mu _{t-1}(\tilde{x}(t-1))}) (f(\underline{\tilde{x}(k)},\underline{\mu '_{k}(\tilde{x}(k))}) \end{aligned} \end{aligned}$$


$$\begin{aligned} (\underline{\tilde{x}(t-1)},\,\underline{\mu _{t-1}(\tilde{x}(t-1))}) = \mathop {\arg \min }_{\begin{array}{c} \tilde{x}(t-1)\in \tilde{X}_s\\ \mu _{t-1}(\tilde{x}(t-1)) \end{array}}\sum _{\tilde{x}(t)\in \tilde{X}_s}g(\tilde{x}(t-1), \mu _{t-1}(\tilde{x}(t-1))) \end{aligned}$$

for all \(t\in \overline{0,k}\), and

$$\begin{aligned} (\underline{\tilde{x}(k)},\,\underline{\mu _{k}(\tilde{x}(k))}) = \mathop {\arg \min }_{\begin{array}{c} \tilde{x}(k)\in \tilde{X}_s\\ \mu _{k}(\tilde{x}(k)) \end{array}}f(\tilde{x}(k),\mu '_{k}(\tilde{x}(k))). \end{aligned}$$

Noted that \(\omega = (\underline{\tilde{x}(0)}, \underline{\mu _0(\tilde{x}(0))},\underline{\tilde{x}(1)}, \underline{\mu _1(\tilde{x}(1))}\ldots \,\underline{\tilde{x}(k)})\) is one of the paths up to time instant k which can be generated by the system controlled by the Markov policy \(\mu '\), and the History-based Supervisor ensures that for all paths \(\omega \) up to arbitrary time instant \(k\in \overline{0,H}\),

$$\begin{aligned} \prod _{t=1}^{k}\sum _{\tilde{x}\in \tilde{X}_s}g(\omega _x(t-1), \omega _u(t-1))\left( f(\omega _x(k),u_{uc}(\omega _x(k),k))\right) \ge 1-\rho . \end{aligned}$$

Note that we have \(1-\tilde{V}^{\mu '}_{H}(s_0)\ge 1-\rho \), i.e. \(p^{\mu '}_{s_0}(\diamond ^{\le H}\mathcal {A}^c) =\tilde{V}^{\mu '}_{H}(s_0) \le \rho \).

Rights and permissions

Reprints and Permissions

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Verify currency and authenticity via CrossMark

Cite this paper

Zhong, B., Zamani, M., Caccamo, M. (2019). Sandboxing Controllers for Stochastic Cyber-Physical Systems. In: André, É., Stoelinga, M. (eds) Formal Modeling and Analysis of Timed Systems. FORMATS 2019. Lecture Notes in Computer Science(), vol 11750. Springer, Cham.

Download citation

  • DOI:

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-29661-2

  • Online ISBN: 978-3-030-29662-9

  • eBook Packages: Computer ScienceComputer Science (R0)