Abstract
Deep reinforcement learning (DRL) requires large samples and long training times to operate optimally. Yet humans rarely require long periods of training to perform well on novel tasks, such as computer games, once they are provided with an accurate program of instructions. We used perceptual control theory (PCT) to construct a simple closed-loop model that requires no training samples and no training time, within a video game study using the Arcade Learning Environment (ALE). The model was programmed to parse inputs from the environment into hierarchically organised perceptual signals, and it computed a dynamic error signal, by subtracting the incoming signal for each perceptual variable from a reference signal, in order to drive output signals that reduce this error. We tested the same model across three Atari games (Breakout, Pong, and Video Pinball) and achieved performance at least as high as DRL paradigms, and close to good human performance. Our study shows that perceptual control models, based on simple assumptions, can perform well without learning. We conclude by specifying a parsimonious role for learning that may be more consistent with psychological functioning.
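The core mechanism described above can be illustrated with a minimal sketch of a single negative-feedback control unit. This is a hypothetical illustration under simple assumptions (a scalar perceptual variable, a proportional output function with an assumed gain), not the authors' released implementation; the function and variable names are our own.

```python
def control_step(perception, reference, gain=0.5):
    """One iteration of a PCT-style negative-feedback control unit.

    The error signal is the reference signal minus the incoming
    perceptual signal; the output acts to reduce this error.
    """
    error = reference - perception
    return gain * error  # output signal proportional to error


# Toy simulation: a paddle tracking a ball's horizontal position
# (a Breakout-like scenario). The perceived variable is the paddle's
# position and the reference is the ball's position, so the loop
# drives the paddle toward the ball without any training.
paddle_x, ball_x = 0.0, 10.0
for _ in range(50):
    paddle_x += control_step(perception=paddle_x, reference=ball_x)

print(round(paddle_x, 3))  # converges toward 10.0
```

In the paper's hierarchical arrangement, the output of a higher-level unit of this form would set the reference signal for a lower-level unit, rather than acting on the environment directly.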
Data Availability
No data or materials were required for this research.
Code Availability
The code is available at https://github.com/PCT-Models/PCTagent_Breakout_Atari.
Funding
Open Access funding enabled and organized by CAUL and its Member Institutions. No funding was sought for this research.
Author information
Authors and Affiliations
Contributions
Both authors have contributed equally in this research.
Corresponding author
Ethics declarations
Ethics approval
No ethical approval was required for this research.
Consent for Publication
All authors provide consent for publication.
Competing interests
All authors agreed with the content and gave explicit consent to submit this paper.
Consent to participate
No consent form for human participants was required.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Warren Mansell contributed equally to this work.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Gulrez, T., Mansell, W. High Performance on Atari Games Using Perceptual Control Architecture Without Training. J Intell Robot Syst 106, 45 (2022). https://doi.org/10.1007/s10846-022-01747-5
Received:
Accepted:
Published:
DOI: https://doi.org/10.1007/s10846-022-01747-5