A Natural Language Argumentation Interface for Explanation Generation in Markov Decision Processes

  • Thomas Dodson
  • Nicholas Mattei
  • Judy Goldsmith
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6992)


A Markov Decision Process (MDP) policy prescribes, for each state, an action that ideally maximizes the expected reward accrued over time. In this paper, we present a novel system that generates, in real time, natural language explanations of the optimal action recommended by an MDP while the user interacts with the MDP policy. We rely on natural language explanations to build trust between the user and the explanation system, leveraging existing research in psychology to generate salient explanations for the end user. Our explanation system is designed for portability between domains and uses a combination of domain-specific and domain-independent techniques. The system automatically extracts implicit knowledge from an MDP model and its accompanying policy. This policy-based explanation system can be ported between applications without additional effort by knowledge engineers or model builders. Our system separates domain-specific data from the explanation logic, allowing for a robust system capable of incremental upgrades. Domain-specific explanations are generated through case-based explanation techniques specific to the domain and a knowledge base of concept mappings for our natural language model.
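
To make concrete what it means for a policy to assign each state an action that maximizes expected reward over time, the following minimal sketch computes such a policy for a two-state toy MDP by value iteration. It is not the system described in this paper: the toy model, the state and action names, the discount factor of 0.9, and the function name `value_iteration` are all assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical toy MDP (not from the paper):
# transitions[state][action] = list of (probability, next_state, reward)
Transitions = Dict[str, Dict[str, List[Tuple[float, str, float]]]]

TOY_MDP: Transitions = {
    "low": {
        "wait": [(1.0, "low", 0.0)],
        "work": [(0.8, "high", -1.0), (0.2, "low", -1.0)],
    },
    "high": {
        "wait": [(1.0, "high", 2.0)],
        "work": [(1.0, "high", 1.0)],
    },
}


def q_value(outcomes, values, gamma):
    """Expected discounted return of one action, given current state values."""
    return sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)


def value_iteration(transitions: Transitions, gamma: float = 0.9, tol: float = 1e-8):
    """Return (state values, greedy policy) for a discounted infinite-horizon MDP."""
    values = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            best = max(q_value(o, values, gamma) for o in actions.values())
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    # The policy picks, in each state, the action with the highest expected return.
    policy = {
        s: max(actions, key=lambda a: q_value(actions[a], values, gamma))
        for s, actions in transitions.items()
    }
    return values, policy


values, policy = value_iteration(TOY_MDP)
print(policy)
```

In this toy model the computed policy pays a short-term cost in state `low` to reach the more rewarding state `high`. That is the kind of recommendation an explanation system would need to justify to a user.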


Optimal Policy; Markov Decision Process; Variable Assignment; Explanation System; Discount Function
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Thomas Dodson (1)
  • Nicholas Mattei (1)
  • Judy Goldsmith (1)
  1. Department of Computer Science, University of Kentucky, Lexington, USA
