Challenges of Machine Learning for Living Machines
Machine learning algorithms, and reinforcement learning (RL) in particular, have proved very successful in recent years. They have achieved super-human performance in many different tasks, from video games and board games to complex cognitive tasks such as path planning or Theory of Mind (ToM) in artificial agents. Nonetheless, this super-human performance is also super-artificial. Although some metrics (e.g. cumulative reward) exceed what a human can achieve, performance on less common metrics (e.g. time to reach the learning asymptote) is significantly worse. Moreover, the means by which these results are achieved fail to extend our understanding of the human or mammalian brain. In addition, most of the approaches used are based on black-box optimization, which makes any comparison beyond performance (e.g. at the architectural level) difficult. In this position paper, we review the origins of reinforcement learning and propose extending it with models of learning derived from fear and avoidance behaviors. We argue that avoidance-based mechanisms are required when training embodied, situated systems, both to ensure fast and safe convergence and to potentially overcome some of the current limitations of the RL paradigm.
Keywords: Reinforcement learning, Neural networks, Avoidance
This work is supported by the European Research Council's CDAC project: The Role of Consciousness in Adaptive Behavior: A Combined Empirical, Computational and Robot based Approach (ERC-2013-ADG 341196).