The skinner automaton: A psychological model formalizing the theory of operant conditioning

Ruan, XiaoGang; Wu, Xuan

doi:10.1007/s11431-013-5369-0

Published: 28 September 2013

Science China Technological Sciences, Volume 56, pages 2745–2761 (2013)

XiaoGang Ruan¹ & Xuan Wu¹

Abstract

Operant conditioning is one of the fundamental mechanisms of animal learning; it holds that the behavior of all animals, from protists to humans, is guided by its consequences. We present a new stochastic learning automaton, the Skinner automaton, a psychological model that formalizes the theory of operant conditioning. We identify animal operant learning with a thermodynamic process and derive a so-called Skinner algorithm from the Monte Carlo method, the Metropolis algorithm, and simulated annealing. Under certain conditions, we prove that the Skinner automaton is expedient, ε-optimal, and optimal, and that the operant probabilities converge to the set of stable roots with probability 1. The Skinner automaton enables machines to learn autonomously in an animal-like way.
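
The abstract names the Metropolis algorithm and simulated annealing as ingredients of the Skinner algorithm, but the algorithm itself is not given on this page. As a rough illustration of those two ingredients only, and not of the authors' method, the sketch below applies the standard Metropolis acceptance rule with a geometric cooling schedule to a choice among a few actions. The names `metropolis_accept`, `anneal_action`, the fixed `costs` list, and all parameter values are assumptions made for this example.

```python
import math
import random


def metropolis_accept(delta, temperature, rng):
    """Metropolis rule: always accept a candidate that is no worse;
    accept a worse one with probability exp(-delta / temperature)."""
    if delta <= 0:
        return True
    return rng.random() < math.exp(-delta / temperature)


def anneal_action(costs, steps=500, t0=1.0, cooling=0.99, seed=0):
    """Simulated annealing over a finite set of actions.

    costs[i] is a hypothetical, fixed cost of action i (lower is better).
    Returns the index of the action held when the schedule ends.
    """
    rng = random.Random(seed)
    current = rng.randrange(len(costs))
    temperature = t0
    for _ in range(steps):
        # Propose a uniformly random action and compare it with the current one.
        candidate = rng.randrange(len(costs))
        delta = costs[candidate] - costs[current]
        if metropolis_accept(delta, temperature, rng):
            current = candidate
        temperature *= cooling  # geometric cooling schedule (an assumption)
    return current


if __name__ == "__main__":
    print(anneal_action([3.0, 1.0, 2.0]))
```

At high temperature the rule accepts worse actions fairly often, which corresponds to exploration; as the temperature falls, the choice settles on a low-cost action. How the paper maps operant probabilities and reinforcement onto these quantities is described only in the full text.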


Author information

Authors and Affiliations

Institute of Artificial Intelligence and Robots, School of Electronic Information and Control Engineering, Beijing University of Technology, Beijing, 100124, China
XiaoGang Ruan & Xuan Wu

Corresponding author

Correspondence to Xuan Wu.

About this article

Cite this article

Ruan, X., Wu, X. The skinner automaton: A psychological model formalizing the theory of operant conditioning. Sci. China Technol. Sci. 56, 2745–2761 (2013). https://doi.org/10.1007/s11431-013-5369-0

Received: 06 August 2013
Accepted: 13 September 2013
Published: 28 September 2013
Issue Date: November 2013
DOI: https://doi.org/10.1007/s11431-013-5369-0
