A probabilistic argumentation framework for reinforcement learning agents

Towards a mentalistic approach to agent profiles


A bounded-reasoning agent may face two dimensions of uncertainty: first, the uncertainty arising from partial information and conflicting reasons, and second, the uncertainty arising from the stochastic nature of its actions and the environment. This paper attempts to address both dimensions within a single unified framework by bringing together probabilistic argumentation and reinforcement learning. We show how a probabilistic rule-based argumentation framework can capture Markov decision processes and reinforcement learning agents, and how the framework allows us to characterise agents and their argument-based motivations from both a logic-based perspective and a probabilistic perspective. We advocate and illustrate the use of our approach to capture models of agency and norms, and argue that, in addition to providing a novel method for investigating agent types, the unified framework offers a sound basis for taking a mentalistic approach to agent profiles.




Notes

  1. Though there seems to be an emerging consensus in the literature conceiving ‘undercutting’ to mean an attack on a rule and ‘undermining’ to be an attack on premises, we prefer to adopt here a terminology closer to early work on rule-based argumentation, see e.g. [41].

  2. Recall: the set of assumptive arguments supporting a set of assumptions \({Assum}\) is denoted \({\mathrm {AssumArg}}({Assum})\), see Notation 4.4.

  3. Recall: the set of assumptive arguments supporting a set of assumptions \({Assum}\) is denoted \({\mathrm {AssumArg}}({Assum})\), see Notation 4.4.

  4. We use the standard notation, so for \(\mathbf {Y} \subseteq \mathbf {X}\), we use \(\mathbf {x}(\mathbf {Y})\) to refer to the assignment within \(\mathbf {x}\) to the variables in \(\mathbf {Y}\). For example, if \(\mathbf {X}=\{X1,X2,X3\}\), \(\mathbf {Y}=\{X1,X2\}\) and \(\mathbf {x}=\{X1=1,X2=2,X3=3\}\), then \(\mathbf {x}(\mathbf {Y})=\{X1=1,X2=2\}\).
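
The restriction notation in the last note can be illustrated with a minimal Python sketch. It assumes an assignment is represented as a plain dictionary from variable names to values; the function name `restrict` is hypothetical and is not taken from the paper.

```python
def restrict(assignment, variables):
    """Return the part of `assignment` that concerns `variables`.

    `assignment` plays the role of x (a dict from variable names to values),
    and `variables` plays the role of the subset Y of X.
    """
    missing = set(variables) - set(assignment)
    if missing:
        # Y must be a subset of the variables assigned by x.
        raise KeyError(f"variables not in assignment: {sorted(missing)}")
    return {var: assignment[var] for var in variables}


# The example from the note: X = {X1, X2, X3}, Y = {X1, X2}.
x = {"X1": 1, "X2": 2, "X3": 3}
y = ["X1", "X2"]
print(restrict(x, y))  # {'X1': 1, 'X2': 2}
```

The explicit subset check mirrors the requirement that \(\mathbf {Y} \subseteq \mathbf {X}\); a dictionary is only one possible encoding of an assignment.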


References

  1. Alexy, R. (1989). A theory of legal argumentation: The theory of rational discourse as theory of legal justification. Oxford: Clarendon.

  2. Amgoud, L. (2009). Argumentation for decision making. In Argumentation in artificial intelligence (pp. 301–320). Springer.

  3. Artikis, A., Sergot, M., & Pitt, J. (2009). Specifying norm-governed computational societies. ACM Transactions on Computational Logic, 10(1), 1:1–1:42.

  4. Artikis, A., Sergot, M., Pitt, J., Busquets, D., & Riveret, R. (2016). Specifying and executing open multi-agent systems. In Social coordination frameworks for social technical systems (pp. 197–212). Springer.

  5. Atkinson, K., Baroni, P., Giacomin, M., Hunter, A., Prakken, H., Reed, C., et al. (2017). Towards artificial argumentation. AI Magazine, 38(3), 25–36.

  6. Atkinson, K., & Bench-Capon, T. J. M. (2007). Practical reasoning as presumptive argumentation using action based alternating transition systems. Artificial Intelligence, 171(10–15), 855–874.

  7. Baroni, P., Caminada, M., & Giacomin, M. (2011). An introduction to argumentation semantics. The Knowledge Engineering Review, 26(4), 365–410.

  8. Baroni, P., Governatori, G., & Riveret, R. (2016). On labelling statements in multi-labelling argumentation. In Proceedings of the 22nd European conference on artificial intelligence (Vol. 285, pp. 489–497). IOS Press.

  9. Bellman, R. (1956). Dynamic programming and Lagrange multipliers. Proceedings of the National Academy of Sciences of the United States of America, 42(10), 767.

  10. Bench-Capon, T. J. M., & Atkinson, K. (2009). Abstract argumentation and values. In I. Rahwan & G. Simari (Eds.), Argumentation in artificial intelligence. Springer.

  11. Bertsekas, D. P. (1995). Dynamic programming and optimal control (Vol. 1). Belmont, MA: Athena Scientific.

  12. Besnard, P., García, A. J., Hunter, A., Modgil, S., Prakken, H., Simari, G. R., et al. (2014). Introduction to structured argumentation. Argument & Computation, 5(1), 1–4.

  13. Broersen, J., Dastani, M., Hulstijn, J., & van der Torre, L. (2002). Goal generation in the BOID architecture. Cognitive Science Quarterly, 2(3–4), 428–447.

  14. Chen, S. H., & Huang, Y. C. (2005). Risk preference and survival dynamics. In Agent-based simulation: From modeling methodologies to real-world applications, Agent-based social systems (Vol. 1, pp. 135–143). Tokyo: Springer.

  15. Conte, R., & Castelfranchi, C. (1995). Cognitive and social action. London: University College of London Press.

  16. Conte, R., & Castelfranchi, C. (2006). The mental path of norms. Ratio Juris, 19, 501–517.

  17. Conte, R., Falcone, R., & Sartor, G. (1999). Introduction: Agents and norms: How to fill the gap? Artificial Intelligence and Law, 7(1), 1–15.

  18. Cormen, T. H., Leiserson, C. E., Rivest, R. L., Stein, C., et al. (2001). Introduction to algorithms (Vol. 2). Cambridge: MIT Press.

  19. Dung, P. M. (1995). On the acceptability of arguments and its fundamental role in nonmonotonic reasoning, logic programming and n-person games. Artificial Intelligence, 77(2), 321–358.

  20. Edmonds, B. (2004). How formal logic can fail to be useful for modelling or designing MAS. In Regulated agent-based social systems, Lecture Notes in Computer Science (Vol. 2934, pp. 1–15). Springer.

  21. Fasli, M. (2004). Formal systems and agent-based social simulation equals null? Journal of Artificial Societies and Social Simulation, 7(4), 1–7.

  22. Fornara, N., & Colombetti, M. (2009). Specifying and enforcing norms in artificial institutions. In Declarative agent languages and technologies VI, Lecture Notes in Computer Science (Vol. 5397, pp. 1–17). Springer.

  23. Fox, J., & Parsons, S. (1997). On using arguments for reasoning about actions and values. In Proceedings of the AAAI spring symposium on qualitative preferences in deliberation and practical reasoning.

  24. Gao, Y., & Toni, F. (2014). Argumentation accelerated reinforcement learning for cooperative multi-agent systems. In Proceedings of 21st European conference on artificial intelligence (pp. 333–338). IOS Press.

  25. Gao, Y., Toni, F., & Craven, R. (2012). Argumentation-based reinforcement learning for RoboCup soccer keepaway. In Proceedings of 20th European conference on artificial intelligence (pp. 342–347). IOS Press.

  26. Gaudou, B., Lorini, E., & Mayor, E. (2013). Moral guilt: An agent-based model analysis. In Advances in social simulation: Proceedings of the 9th conference of the European Social Simulation Association (pp. 95–106).

  27. Governatori, G., & Rotolo, A. (2008). BIO logical agents: Norms, beliefs, intentions in defeasible logic. Autonomous Agents and Multi-Agent Systems, 17(1), 36–69.

  28. Hunter, A., & Thimm, M. (2017). Probabilistic reasoning with abstract argumentation frameworks. Journal of Artificial Intelligence Research, 59, 565–611.

  29. Koller, D., & Friedman, N. (2009). Probabilistic graphical models: Principles and techniques. Adaptive computation and machine learning. Cambridge: The MIT Press.

  30. Kostrikin, A. I., Manin, Y. I., & Alferieff, M. E. (1997). Linear algebra and geometry. Washington, DC: Gordon and Breach Science Publishers.

  31. Modgil, S., & Caminada, M. (2009). Proof theories and algorithms for abstract argumentation frameworks. In Argumentation in artificial intelligence (pp. 105–129). Springer.

  32. Muller, J., & Hunter, A. (2012). An argumentation-based approach for decision making. In 24th international conference on tools with artificial intelligence (Vol. 1, pp. 564–571). IEEE.

  33. Ng, A., Harada, D., & Russell, S. (1999). Policy invariance under reward transformations: Theory and application to reward shaping. In Proceedings of 16th international conference on machine learning (pp. 278–287).

  34. Ng, A. Y., Coates, A., Diel, M., Ganapathi, V., Schulte, J., Tse, B., Berger, E., & Liang, E. (2006). Autonomous inverted helicopter flight via reinforcement learning. In Experimental robotics IX (pp. 363–372). Springer.

  35. Oren, N. (2014). Argument schemes for normative practical reasoning (pp. 63–78). Berlin: Springer.

  36. Parsons, S., & Fox, J. (1996). Argumentation and decision making: A position paper. In Practical reasoning (pp. 705–709). Springer.

  37. Pattaro, E. (2005). The law and the right. In E. Pattaro (Ed.), Treatise of legal philosophy and general jurisprudence (Vol. 1). Berlin: Springer.

  38. Pollock, J. L. (1995). Cognitive carpentry: A blueprint for how to build a person. Cambridge, MA: MIT Press.

  39. Prakken, H. (2006). Combining sceptical epistemic reasoning with credulous practical reasoning. In Proceedings of the 1st conference on computational models of argument (pp. 311–322). IOS Press.

  40. Prakken, H. (2011). An abstract framework for argumentation with structured arguments. Argument and Computation, 1(2), 93–124.

  41. Prakken, H., & Sartor, G. (1997). Argument-based extended logic programming with defeasible priorities. Journal of Applied Non-Classical Logics, 7(1–2), 25–75.

  42. Prakken, H., & Sartor, G. (2015). Law and logic: A review from an argumentation perspective. Artificial Intelligence, 227, 214–245.

  43. Rahwan, I., & Simari, G. R. (Eds.). (2009). Argumentation in artificial intelligence. Berlin: Springer.

  44. Riveret, R., Baroni, P., Gao, Y., Governatori, G., Rotolo, A., & Sartor, G. (2018). A labelling framework for probabilistic argumentation. Annals of Mathematics and Artificial Intelligence, 83(1), 21–71.

  45. Riveret, R., Korkinof, D., Draief, M., & Pitt, J. V. (2015). Probabilistic abstract argumentation: An investigation with Boltzmann machines. Argument & Computation, 6(2), 178–218.

  46. Riveret, R., Pitt, J. V., Korkinof, D., & Draief, M. (2015). Neuro-symbolic agents: Boltzmann machines and probabilistic abstract argumentation with sub-arguments. In Proceedings of the 14th international conference on autonomous agents and multiagent systems (pp. 1481–1489). ACM.

  47. Riveret, R., Rotolo, A., & Sartor, G. (2012). Probabilistic rule-based argumentation for norm-governed learning agents. Artificial Intelligence and Law, 20(4), 383–420.

  48. Ross, A. (1958). On law and justice. London: Stevens.

  49. Rummery, G. A., & Niranjan, M. (1994). On-line Q-learning using connectionist systems. Technical report. University of Cambridge.

  50. Sartor, G. (2005). Legal reasoning: A cognitive approach to the law. Berlin: Springer.

  51. Shams, Z., Vos, M. D., Oren, N., Padget, J., & Satoh, K. (2015). Argumentation-based normative practical reasoning. In Proceedings of the 3rd international workshop on theory and applications of formal argumentation, revised selected papers (pp. 226–242). Springer.

  52. Simari, G. I., Shakarian, P., & Falappa, M. A. (2016). A quantitative approach to belief revision in structured probabilistic argumentation. Annals of Mathematics and Artificial Intelligence, 76(3), 375–408.

  53. Stone, P., Sutton, R. S., & Kuhlmann, G. (2005). Reinforcement learning for RoboCup soccer keepaway. Adaptive Behavior, 13, 165–188.

  54. Sutton, R. S., & Barto, A. (1998). Reinforcement learning: An introduction. Cambridge: MIT Press.

  55. Tadepalli, P., Givan, R., & Driessens, K. (2004). Relational reinforcement learning: An overview. In Proceedings of the ICML04 workshop on relational reinforcement learning.

  56. van der Hoek, W., Roberts, M., & Wooldridge, M. (2007). Social laws in alternating time: Effectiveness, feasibility, and synthesis. Synthese, 156(1), 1–19.



We would like to thank Pietro Baroni for his insights into argumentation. This work was supported by the Marie Curie Intra-European Fellowship PIEFGA-2012-331472.

Author information



Corresponding author

Correspondence to Régis Riveret.


Cite this article

Riveret, R., Gao, Y., Governatori, G. et al. A probabilistic argumentation framework for reinforcement learning agents. Auton Agent Multi-Agent Syst 33, 216–274 (2019). https://doi.org/10.1007/s10458-019-09404-2



  • Probabilistic argumentation
  • Markov decision process
  • Reinforcement learning
  • Norms