On Stable Profit Sharing Reinforcement Learning with Expected Failure Probability

  • Daisuke Mizuno
  • Kazuteru Miyazaki (Email author)
  • Hiroaki Kobayashi
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 848)


This paper defines the Expected Success Probability (ESP) and proposes a reinforcement learning method, Stable Profit Sharing with Expected Failure Probability (SPSwithEFP). In SPSwithEFP, the Expected Failure Probability (EFP) is used in roulette wheel selection, and ESP is used in the update equation for the weight of a rule. EFP allows risky actions to be discarded, and ESP reduces the spread of the learned results. The effectiveness of the method is shown through simulation experiments in a maze environment with pitfalls.
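As a rough illustration of how EFP could enter roulette wheel selection, the following minimal sketch discards actions whose EFP exceeds a threshold and then samples among the remaining actions in proportion to their rule weights. The function name `select_action`, the threshold value, the dictionary layout, and the fallback when every action looks risky are all assumptions made for this example; the abstract does not specify them, and the ESP-based weight update is not shown because its form is not given here.

```python
import random


def select_action(weights, efp, efp_threshold=0.5, rng=random):
    """Roulette-wheel selection restricted to actions with EFP <= threshold.

    weights: dict mapping action -> non-negative rule weight (hypothetical layout)
    efp:     dict mapping action -> estimated failure probability in [0, 1]
    """
    # Assumed filtering rule: drop actions judged too risky.
    safe = [a for a in weights if efp.get(a, 0.0) <= efp_threshold]
    # Assumed fallback: if every action is risky, consider all of them.
    candidates = safe or list(weights)

    total = sum(weights[a] for a in candidates)
    if total <= 0:
        # No weight information yet, so choose uniformly at random.
        return rng.choice(candidates)

    # Spin the wheel: each candidate owns a slice proportional to its weight.
    r = rng.random() * total
    cumulative = 0.0
    for a in candidates:
        cumulative += weights[a]
        if r < cumulative:
            return a
    return candidates[-1]  # guard against floating-point rounding
```

For example, with weights `{"left": 1.0, "right": 5.0, "forward": 2.0}` and an EFP of 0.9 for `"right"`, the sketch never returns `"right"` even though it carries the largest weight, which is the sense in which a risky action is discarded.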


Reinforcement learning, XoL, Profit Sharing, EFP



This work was supported by JSPS KAKENHI Grant Number 17K00327.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Daisuke Mizuno (1)
  • Kazuteru Miyazaki (2), Email author
  • Hiroaki Kobayashi (3)
  1. Tokyo Institute of Technology, Tokyo, Japan
  2. National Institution for Academic Degrees and Quality Enhancement of Higher Education, Tokyo, Japan
  3. Meiji University, Kanagawa, Japan
