
Intelligent and Dependable Decision-Making Under Uncertainty

  • Conference paper

Formal Methods (FM 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14000)


Abstract

This talk highlights our vision of foundational and application-driven research toward safety, dependability, and correctness in artificial intelligence (AI). We take a broad stance on AI that combines formal methods, machine learning, and control theory. As part of this research line, we study problems inspired by autonomous systems, planning in robotics, and industrial applications. We consider reinforcement learning (RL) as a specific machine learning technique for decision-making under uncertainty. RL generally learns to behave optimally via trial and error. Consequently, and despite its massive success in recent years, RL lacks mechanisms to ensure safe and correct behavior. Formal methods, in particular formal verification, provide formal guarantees of a system’s correctness and safety based on rigorous analysis and precise specifications. Yet, fundamental challenges have obstructed the effective application of verification to reinforcement learning. Our main objective is to devise novel, data-driven verification methods that tightly integrate with RL. In particular, we develop techniques that address real-world challenges to the safety of AI systems in general: scalability, expressiveness, and robustness against the uncertainty that arises when operating in the real world. The overall goal is to advance the real-world deployment of reinforcement learning.

N. Jansen—This work was supported by the ERC Starting Grant 101077178 (DEUCE).
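A central theme of the talk is the tight integration of verification with RL. As a purely illustrative aid, the sketch below shows one common shape such an integration can take: a "shield" that restricts a tabular Q-learning agent to actions a prior verification step has marked safe. The toy corridor environment, the class and function names, and the hand-written safe-action table are assumptions made for this example; they do not reproduce the author's methods or the COOL-MC tool.

```python
# Minimal, hypothetical sketch of shielded reinforcement learning: a table of
# verified-safe actions restricts which actions a tabular Q-learning agent may
# execute. The toy environment and all names are illustrative assumptions only.

import random


class CorridorEnv:
    """Toy 5-state corridor; 'jump' is an unsafe shortcut the shield should block."""

    def reset(self):
        self.state = 0
        return self.state

    def actions(self, state):
        return ["left", "right", "jump"]

    def step(self, action):
        if action == "jump":                       # unsafe: large penalty, episode ends
            return self.state, -10.0, True
        self.state = max(0, min(4, self.state + (1 if action == "right" else -1)))
        done = self.state == 4                     # state 4 is the goal
        return self.state, (1.0 if done else -0.1), done


class Shield:
    """Restricts the agent to actions a prior verification step marked safe."""

    def __init__(self, safe_actions):
        self.safe_actions = safe_actions           # e.g. computed by a model checker

    def allowed(self, state, actions):
        return [a for a in actions if a in self.safe_actions[state]]


def shielded_q_learning(env, shield, episodes=200, alpha=0.1, gamma=0.99, eps=0.1):
    """Tabular Q-learning in which exploration never leaves the shielded action set."""
    q = {}
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            safe = shield.allowed(state, env.actions(state))
            if random.random() < eps:
                action = random.choice(safe)       # explore, but only among safe actions
            else:
                action = max(safe, key=lambda a: q.get((state, a), 0.0))
            nxt, reward, done = env.step(action)
            best_next = 0.0 if done else max(q.get((nxt, a), 0.0) for a in env.actions(nxt))
            q[(state, action)] = (1 - alpha) * q.get((state, action), 0.0) \
                + alpha * (reward + gamma * best_next)
            state = nxt
    return q


if __name__ == "__main__":
    safe_table = {s: {"left", "right"} for s in range(5)}  # 'jump' is never allowed
    q_values = shielded_q_learning(CorridorEnv(), Shield(safe_table))
    print(max(q_values, key=q_values.get))
```

In this sketch the safe-action table stands in for the output of a verification step; in the research line described above, such tables would be derived from formal models and precise specifications rather than written by hand.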



Notes

  1. Available at https://github.com/LAVA-LAB/COOL-MC.


Acknowledgements

The approaches presented in this talk are the results of fruitful and enjoyable collaborations with a number of co-authors, in particular: Alessandro Abate, Thom S. Badings, Bernd Becker, Roderick Bloem, Steven Carr, Murat Cubuktepe, Dennis Gross, Sebastian Junges, Joost-Pieter Katoen, Bettina Könighofer, David Parker, Guillermo A. Pérez, Hasan A. Poonawala, Licio Romao, Sanjit Seshia, Alex Serban, Thiago D. Simão, Mariëlle Stoelinga, Marnix Suilen, Ufuk Topcu, and Ralf Wimmer.

Author information

Corresponding author: Nils Jansen.



Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Jansen, N. (2023). Intelligent and Dependable Decision-Making Under Uncertainty. In: Chechik, M., Katoen, J.-P., Leucker, M. (eds.) Formal Methods. FM 2023. Lecture Notes in Computer Science, vol. 14000. Springer, Cham. https://doi.org/10.1007/978-3-031-27481-7_3


  • DOI: https://doi.org/10.1007/978-3-031-27481-7_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-27480-0

  • Online ISBN: 978-3-031-27481-7

  • eBook Packages: Computer Science (R0)
