Reinforcement Learning Algorithm with CTRNN in Continuous Action Space
Applying traditional reinforcement learning algorithms to robot motion control tasks is difficult because most algorithms deal with discrete actions and assume that the state is completely observable. This paper addresses both problems by combining a reinforcement learning algorithm with a CTRNN (continuous-time recurrent neural network) learning algorithm. We carried out an experiment on the pendulum swing-up task without rotational speed information. The results show that the rotational speed, treated as a hidden state, is estimated and encoded in the activation of a context neuron. As a result, the proposed algorithm accomplishes the task within several hundred trials.
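To make the setting concrete, the following is a minimal sketch, not the paper's implementation. It assumes the textbook CTRNN form tau * du/dt = -u + W_rec * tanh(u) + W_in * x + bias, integrated with forward Euler, and a frictional pendulum whose angle is zero when upright. All names (`ctrnn_step`, `pendulum_step`, `observe`) and all parameter values are hypothetical and chosen only for illustration.

```python
import math


def ctrnn_step(u, x, w_rec, w_in, bias, tau, dt=0.05):
    """One forward-Euler step of tau * du/dt = -u + W_rec*tanh(u) + W_in*x + bias.

    u is the list of internal neuron states (context neurons included),
    x is the current observation. This standard form is an assumption.
    """
    y = [math.tanh(ui) for ui in u]  # neuron activations
    u_next = []
    for i in range(len(u)):
        rec = sum(w_rec[i][j] * y[j] for j in range(len(u)))
        inp = sum(w_in[i][k] * x[k] for k in range(len(x)))
        du = (-u[i] + rec + inp + bias[i]) / tau[i]
        u_next.append(u[i] + dt * du)
    return u_next


def pendulum_step(theta, omega, torque, dt=0.05, g=9.8, m=1.0, l=1.0, mu=0.01):
    """Semi-implicit Euler step of m*l^2 * theta'' = -mu*theta' + m*g*l*sin(theta) + torque.

    theta = 0 is upright, theta = pi is hanging down (assumed convention).
    """
    alpha = (-mu * omega + m * g * l * math.sin(theta) + torque) / (m * l * l)
    omega_next = omega + dt * alpha
    theta_next = theta + dt * omega_next
    return theta_next, omega_next


def observe(theta):
    """Only the angle is observable; the rotational speed stays hidden."""
    return [math.cos(theta), math.sin(theta)]


if __name__ == "__main__":
    # Untrained, hand-picked weights: two neurons, the second acting as a
    # slower context neuron that integrates past observations.
    u = [0.0, 0.0]
    w_rec = [[0.0, 0.5], [0.5, 0.0]]
    w_in = [[1.0, 0.0], [0.0, 1.0]]
    bias = [0.0, 0.0]
    tau = [0.1, 1.0]
    theta, omega = math.pi, 0.0
    for _ in range(100):
        theta, omega = pendulum_step(theta, omega, torque=1.0)
        u = ctrnn_step(u, observe(theta), w_rec, w_in, bias, tau)
    print(theta, omega, u)
```

Because `observe` returns only the angle, any estimate of the rotational speed has to be carried in the recurrent state `u`. The weights here are untrained, so the sketch shows the data flow only; it does not reproduce the learned encoding or the trial counts reported in the abstract.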
Keywords: Discrete Action; Reinforcement Learn Algorithm; Total Reward; Reinforcement Learning Method; Complete Observability