Interpreting a deep reinforcement learning model with conceptual embedding and performance analysis


Abstract

The weak interpretability of deep reinforcement learning (DRL) models is a serious impediment to deploying DRL agents in areas that require high reliability. To interpret the behavior of a DRL agent, researchers use saliency maps to identify the parts of the agent’s observation that influence its decisions. However, saliency maps still cannot explicitly present the cause and effect between an agent’s observations and its actions. In this paper, we analyze the inference procedure of the DRL architecture and propose embedding interpretable intermediate representations into an agent’s policy, where the intermediate representations are compressed and abstracted for explanation. We utilize a conceptual embedding technique to regulate the latent representation space of the deep model so that it produces interpretable causal factors aligned with human concepts. Furthermore, the information loss of the intermediate representation is analyzed to define the upper bound of model performance and to measure performance degradation. Experiments validate the effectiveness of the proposed method and the relationship between the observation information and an agent’s performance upper bound.
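
The abstract's core idea, a compressed and concept-aligned intermediate representation inserted between the observation encoder and the policy head, can be illustrated with a short sketch. The following is a minimal, hypothetical PyTorch example, not the authors' architecture: the class and function names, layer sizes, supervised concept targets, and the weighting coefficient beta are illustrative assumptions, and the cross-entropy task term is only a simple supervised surrogate for the actual RL objective.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ConceptBottleneckPolicy(nn.Module):
    # Policy network whose decisions pass through an interpretable latent.
    def __init__(self, obs_dim, concept_dim, n_actions, hidden=128):
        super().__init__()
        # Encoder: compress the raw observation into a small concept vector.
        self.encoder = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, concept_dim),
        )
        # Policy head: acts only on the interpretable intermediate representation.
        self.policy_head = nn.Linear(concept_dim, n_actions)

    def forward(self, obs):
        concepts = self.encoder(obs)          # interpretable intermediate representation
        logits = self.policy_head(concepts)   # action preferences derived from concepts
        return logits, concepts

def training_loss(logits, concepts, action_targets, concept_targets, beta=1.0):
    # Task term plus a concept-alignment term that regularizes the latent
    # toward human-labelled concept values; beta trades task performance
    # against interpretability.
    task_loss = F.cross_entropy(logits, action_targets)
    concept_loss = F.mse_loss(concepts, concept_targets)
    return task_loss + beta * concept_loss

Because the policy head sees only the compressed concept vector, any drop in return relative to an unconstrained policy reflects the information lost in the bottleneck, which is the kind of quantity the paper's performance upper-bound analysis addresses.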


Acknowledgements

This work is supported in part by the China Postdoctoral Science Foundation under Grant Number 2021M693976, the Hunan Provincial Natural Science Foundation under Grant Number 2020JJ5367, and the Key Project of Teaching Reform in Colleges and Universities of Hunan Province under Grant Number HNJG-2021-0251.

Author information

Corresponding authors

Correspondence to Yinglong Dai or Xiaojun Duan.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Dai, Y., Ouyang, H., Zheng, H. et al. Interpreting a deep reinforcement learning model with conceptual embedding and performance analysis. Appl Intell 53, 6936–6952 (2023). https://doi.org/10.1007/s10489-022-03788-7

