Abstract
Reinforcement learning (RL) has attracted considerable attention as a technique for realizing computational intelligence, such as adaptive and autonomous decentralized systems. In practice, however, RL is often difficult to apply. One source of this difficulty is designing a suitable action space for an agent, which must satisfy two conflicting requirements: (i) preserving the characteristics (or structure) of the original search space as far as possible, so that strategies close to the optimum can be found, and (ii) reducing the search space as far as possible, so that learning is expedited. To design a suitable action space adaptively, in this article we propose an RL model with switching controllers, based on Q-learning and an actor-critic, that mimics the process of an infant’s motor development, in which gross motor skills develop before fine motor skills. A method for switching between the controllers is then constructed by introducing and referring to the “entropy.” Finally, computational experiments on a path-planning problem with a continuous action space confirm the validity and potential of the proposed method.
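The switching idea described above can be illustrated with a minimal sketch. The details below are assumptions for illustration, not the authors’ exact formulation: we take the entropy of a softmax (Boltzmann) action-selection policy over the coarse controller’s Q-values as the switching signal, and hand control to the fine-grained actor-critic once that entropy drops below a threshold (i.e., once the coarse Q-learning controller has committed to a region of the action space). The function names and threshold value are hypothetical.

```python
import math

def softmax(q_values, tau=1.0):
    """Boltzmann action-selection probabilities for a list of Q-values."""
    m = max(q_values)  # subtract max for numerical stability
    exps = [math.exp((q - m) / tau) for q in q_values]
    z = sum(exps)
    return [e / z for e in exps]

def policy_entropy(probs):
    """Shannon entropy (nats) of an action-selection distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def select_controller(q_values, threshold=0.5, tau=1.0):
    """Hypothetical switching rule: stay with the 'coarse' (Q-learning)
    controller while the policy is still exploring (high entropy), and
    switch to the 'fine' (actor-critic) controller once the policy has
    become decisive (entropy below threshold)."""
    h = policy_entropy(softmax(q_values, tau))
    return "fine" if h < threshold else "coarse"
```

For example, with three indistinguishable actions the entropy is ln 3 ≈ 1.10, so the coarse controller is kept; once one action clearly dominates, the entropy collapses toward zero and control passes to the actor-critic.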
This work was presented in part at the 15th International Symposium on Artificial Life and Robotics, Oita, Japan, February 4–6, 2010
Nagayoshi, M., Murao, H. & Tamaki, H. A reinforcement learning with switching controllers for a continuous action space. Artif Life Robotics 15, 97–100 (2010). https://doi.org/10.1007/s10015-010-0772-0