A reinforcement learning with switching controllers for a continuous action space

  • Original Article
Artificial Life and Robotics

Abstract

Reinforcement learning (RL) attracts much attention as a technique for realizing computational intelligence, such as adaptive and autonomous decentralized systems. In general, however, it is not easy to put RL to practical use. One difficulty is the problem of designing a suitable action space for an agent, i.e., of satisfying two requirements that are in trade-off: (i) to preserve the characteristics (or structure) of the original search space as much as possible, so that strategies close to the optimum can be found, and (ii) to reduce the search space as much as possible, so that learning is expedited. In order to design a suitable action space adaptively, we propose in this article an RL model with switching controllers, based on Q-learning and an actor-critic, that mimics the process of an infant’s motor development, in which gross motor skills develop before fine motor skills. A method for switching between the controllers is then constructed by introducing and referring to an “entropy” measure. Finally, computational experiments on a path-planning problem with a continuous action space confirm the validity and potential of the proposed method.
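The abstract describes an entropy-based criterion for handing control from a coarse controller (Q-learning over a coarse-grained action set) to a finer one (an actor-critic over the continuous action space). The following Python fragment is only a minimal sketch of that idea under stated assumptions: it assumes a softmax (Boltzmann) action-selection policy for the coarse Q-learning controller and a fixed entropy threshold; the names (softmax_probs, should_switch_to_fine, tau, threshold) are illustrative and do not come from the paper, and the actor-critic fine controller itself is not shown.

# A minimal sketch of entropy-based controller switching (assumptions noted
# in the text above; names here are illustrative, not taken from the paper).
import numpy as np

def softmax_probs(q_values, tau=1.0):
    # Boltzmann (softmax) action-selection probabilities for one state.
    z = (np.asarray(q_values, dtype=float) - np.max(q_values)) / tau
    e = np.exp(z)
    return e / e.sum()

def policy_entropy(probs):
    # Shannon entropy (in nats) of the action-selection distribution.
    p = np.clip(probs, 1e-12, 1.0)
    return float(-(p * np.log(p)).sum())

def should_switch_to_fine(q_values, tau=1.0, threshold=0.3):
    # Hand control from the coarse (Q-learning) controller to the fine
    # (actor-critic) controller once the normalized policy entropy drops
    # below a threshold, i.e., once the coarse policy has become decisive.
    probs = softmax_probs(q_values, tau)
    max_entropy = np.log(len(probs))  # entropy of a uniform policy
    return policy_entropy(probs) / max_entropy < threshold

# A nearly converged coarse policy triggers the switch; a uniform one does not.
print(should_switch_to_fine([5.0, 0.1, 0.2, 0.1]))   # True
print(should_switch_to_fine([0.1, 0.1, 0.1, 0.1]))   # False

The intuition is that the entropy of the coarse controller's action-selection distribution falls as its learning converges, signalling that control can be handed over to the finer controller.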



Author information

Corresponding author

Correspondence to Masato Nagayoshi.

Additional information

This work was presented in part at the 15th International Symposium on Artificial Life and Robotics, Oita, Japan, February 4–6, 2010

About this article

Cite this article

Nagayoshi, M., Murao, H. & Tamaki, H. A reinforcement learning with switching controllers for a continuous action space. Artif Life Robotics 15, 97–100 (2010). https://doi.org/10.1007/s10015-010-0772-0
