Reward Training: Methods and Data

  • M. E. Rashotte
Part of the NATO Advanced Study Institutes Series (NSSA, volume 19)

Abstract

Reward training may be summarized by the equation
$$S\colon R \to S^{V}$$
which designates that a specified response (R) performed in a given stimulus situation (S) results in presentation of a stimulus that is currently valued by the animal ($S^V$). Thorndike’s (1898) experiment in which hungry cats pulled a loop in a box to gain access to food illustrates one variety of reward training in which $S^V$ is an appetitive stimulus. Non-appetitive forms of $S^V$ include periods of freedom from an aversive stimulus (i.e., “escape training”), certain classes of exteroceptive stimuli (e.g., a change in ambient illumination level for rats), stimulus conditions which allow certain responses to be performed (e.g., presentation of a manipulable puzzle to monkeys), direct stimulation of certain brain areas (e.g., electrical stimulation of the hypothalamus), and classically conditioned stimuli previously paired with $S^V$ (i.e., “conditioned reinforcement” training). When the parameters of training are properly arranged, an occurrence of $S\colon R \to S^V$ results in an increased likelihood that R will occur when the animal subsequently encounters S. It is this result which is summarized in the Empirical Law of Effect (Chapter 1).
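The Empirical Law of Effect lends itself to a simple computational illustration. The following Python sketch is not from the chapter: it uses a Bush–Mosteller-style linear-operator update, a standard model from mathematical learning theory, and every name and parameter in it (simulate_reward_training, alpha, p_initial) is illustrative only. It simulates a series of S:R → S^V trials in which each delivery of S^V moves the probability of R in S a fixed fraction of the remaining distance toward 1.

    # Illustrative sketch only: a Bush-Mosteller-style linear-operator
    # update, not the chapter's own model or data.
    import random

    def simulate_reward_training(n_trials=50, alpha=0.2, p_initial=0.05, seed=0):
        """Simulate S:R -> S^V trials under a reward contingency.

        On each trial the animal encounters S and emits R with probability p.
        If R occurs, S^V is delivered and p moves a fraction `alpha` of the
        way toward 1 -- the Law of Effect in miniature.
        """
        rng = random.Random(seed)
        p = p_initial
        history = []
        for _ in range(n_trials):
            if rng.random() < p:            # R performed in S
                p += alpha * (1.0 - p)      # S^V delivered: S-R strengthened
            history.append(p)
        return history

    probs = simulate_reward_training()
    print(f"p(R|S) after 50 trials: {probs[-1]:.2f}")  # rises from 0.05 toward 1

Under these assumed parameters the response probability stays low until the first few chance occurrences of R are rewarded, after which acquisition accelerates, which is the qualitative pattern the Empirical Law of Effect describes.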

Keywords

Conditioned Stimulus · Reward Event · Instrumental Response · Reward Magnitude · Operant Schedule

References

  1. Amsel, A. The role of frustrative nonreward in noncontinuous reward situations. Psychological Bulletin, 1958, 55, 102–119.
  2. Amsel, A. Partial reinforcement effects on vigor and persistence. In K. W. Spence & J. T. Spence (Eds.), The Psychology of Learning and Motivation, Vol. 1. New York: Academic Press, 1967.
  3. Bacon, W. E. Partial reinforcement extinction effect following different amounts of training. Journal of Comparative and Physiological Psychology, 1962, 55, 998–1003.
  4. Bitterman, M. E. Toward a comparative psychology of learning. American Psychologist, 1960, 15, 704–712.
  5. Boakes, R. A., Poli, M., & Lockwood, M. J. A study of misbehavior: Token reinforcement with rats. Paper delivered at the annual meeting of the Psychonomic Society, Colorado, 1975.
  6. Bower, G. H., Fowler, H., & Trapold, M. A. Escape learning as a function of amount of shock reduction. Journal of Experimental Psychology, 1959, 58, 482–484.
  7. Breland, K., & Breland, M. Animal Behavior. New York: The Macmillan Co., 1966.
  8. Brown, P. L., & Jenkins, H. M. Auto-shaping of the pigeon’s key-peck. Journal of the Experimental Analysis of Behavior, 1968, 11, 1–8.
  9. Campbell, B. A., & Kraeling, D. Response strength as a function of drive level and amount of drive reduction. Journal of Experimental Psychology, 1953, 45, 97–101.
  10. Capaldi, E. J. The effect of different amounts of training on the resistance to extinction of different patterns of partially reinforced responses. Journal of Comparative and Physiological Psychology, 1958, 51, 367–371.
  11. Capaldi, E. J. Partial reinforcement: A hypothesis of sequential effects. Psychological Review, 1966, 73, 459–477.
  12. Capaldi, E. J. A sequential hypothesis of instrumental learning. In K. W. Spence & J. T. Spence (Eds.), The Psychology of Learning and Motivation, Vol. 1. New York: Academic Press, 1967.
  13. Capaldi, E. J., & Haggbloom, S. J. Response events as well as goal events as sources of animal memory. Animal Learning & Behavior, 1975, 3, 1–10.
  14. Capaldi, E. J., & Morris, M. D. A role of stimulus compounds in eliciting responses: Relatively spaced extinction following massed acquisition. Animal Learning & Behavior, 1976, 4, 113–117.
  15. Catania, A. C. Concurrent performances: A baseline for the study of reinforcement magnitude. Journal of the Experimental Analysis of Behavior, 1963, 6, 299–300.
  16. Catania, A. C., & Reynolds, G. S. A quantitative analysis of the responding maintained by interval schedules of reinforcement. Journal of the Experimental Analysis of Behavior, 1968, 11, 327–383.
  17. Conrad, D. G., & Sidman, M. Sucrose concentration as a reinforcement for lever pressing by monkeys. Psychological Reports, 1956, 2, 381–384.
  18. Crespi, L. P. Quantitative variation of incentive and performance in the white rat. American Journal of Psychology, 1942, 55, 467–517.
  19. Cumming, W. W., & Schoenfeld, W. N. Behavior under extended exposure to a high-value fixed-interval schedule. Journal of the Experimental Analysis of Behavior, 1958, 1, 245–263.
  20. Dews, P. B. Studies on responding under fixed-interval schedules of reinforcement: The effects on the pattern of responding of changes in the requirements at reinforcement. Journal of the Experimental Analysis of Behavior, 1969, 12, 191–199.
  21. Dews, P. B. The theory of fixed-interval responding. In W. N. Schoenfeld (Ed.), The Theory of Reinforcement Schedules. New York: Appleton-Century-Crofts, 1970.
  22. Felton, M., & Lyon, D. O. The post-reinforcement pause. Journal of the Experimental Analysis of Behavior, 1966, 9, 131–134.
  23. Ferster, C. B., & Skinner, B. F. Schedules of Reinforcement. New York: Appleton-Century-Crofts, 1957.
  24. Flaherty, C. F., & Davenport, J. W. Successive brightness discrimination in rats following regular versus random intermittent reinforcement. Journal of Experimental Psychology, 1972, 96, 1–9.
  25. Fleshler, M., & Hoffman, H. S. A progression for generating variable-interval schedules. Journal of the Experimental Analysis of Behavior, 1962, 5, 529–530.
  26. Fowler, H., & Trapold, M. A. Escape performance as a function of delay of reinforcement. Journal of Experimental Psychology, 1962, 63, 464–467.
  27. Gonzalez, R. C., Bainbridge, P., & Bitterman, M. E. Discrete-trials lever pressing in the rat as a function of pattern of reinforcement, effortfulness of response and amount of reward. Journal of Comparative and Physiological Psychology, 1966, 61, 110–122.
  28. Gonzalez, R. C., & Champlin, G. Positive behavioral contrast, negative simultaneous contrast and their relation to frustration in pigeons. Journal of Comparative and Physiological Psychology, 1974, 87, 173–187.
  29. Goodrich, K. P. Performance in different segments of an instrumental response chain as a function of reinforcement schedule. Journal of Experimental Psychology, 1959, 57, 57–63.
  30. Grastyán, E., & Vereczkei, L. Effects of spatial separation of the conditioned signal from the reinforcement: A demonstration of the conditioned character of the orienting response or the orientational character of conditioning. Behavioral Biology, 1974, 10, 121–146.
  31. Grice, G. R. The relation of secondary reinforcement to delayed reward in visual discrimination learning. Journal of Experimental Psychology, 1948, 38, 1–16.
  32. Guttman, N. Operant conditioning, extinction, and periodic reinforcement in relation to concentration of sucrose used as reinforcing agent. Journal of Experimental Psychology, 1953, 46, 213–224.
  33. Guttman, N. Equal-reinforcement values for sucrose and glucose solutions compared with equal-sweetness values. Journal of Comparative and Physiological Psychology, 1954, 47, 358–361.
  34. Hayes, K. J., & Hayes, C. Imitation in a home-reared chimpanzee. Journal of Comparative and Physiological Psychology, 1952, 45, 450–459.
  35. Hearst, E., & Jenkins, H. M. Sign-Tracking: The Stimulus-Reinforcer Relation and Directed Action. Austin, Texas: The Psychonomic Society, 1974.
  36. Hodos, W., & Kalman, G. Effects of increment size and reinforcer volume on progressive ratio performance. Journal of the Experimental Analysis of Behavior, 1963, 6, 387–392.
  37. Holland, P. C. Conditioned stimulus as a determinant of the form of the Pavlovian conditioned response. Journal of Experimental Psychology: Animal Behavior Processes, 1977, 3, 77–104.
  38. Honig, W. K. (Ed.), Operant Behavior: Areas of Research and Application. New York: Appleton-Century-Crofts, 1966.
  39. Honig, W. K., & James, P. H. R. (Eds.), Animal Memory. New York: Academic Press, 1971.
  40. Honig, W. K., & Staddon, J. E. R. (Eds.), Handbook of Operant Behavior. Englewood Cliffs, N. J.: Prentice-Hall, 1977.
  41. Hull, C. L. Goal attraction and directing ideas conceived as habit phenomena. Psychological Review, 1931, 38, 487–506.
  42. Hull, C. L. The rat’s speed-of-locomotion gradient in the approach to food. Journal of Comparative Psychology, 1934, 17, 393–422.
  43. Hull, C. L. Principles of Behavior. New York: Appleton-Century-Crofts, 1943.
  44. Hull, J. H. Instrumental response topographies of rats. Animal Learning & Behavior, 1977, 5, 207–212.
  45. Hunter, W. S. The delayed reaction in animals and children. Behavior Monographs, 1913, 2, 21–30.
  46. Jenkins, H. M. Sequential organization in schedules of reinforcement. In W. N. Schoenfeld (Ed.), The Theory of Reinforcement Schedules. New York: Appleton-Century-Crofts, 1970.
  47. Jenkins, H. M., & Moore, B. R. The form of the auto-shaped response with food and water reinforcements. Journal of the Experimental Analysis of Behavior, 1973, 20, 163–181.
  48. Jenkins, W. O., & Clayton, F. L. Rate of responding and amount of reinforcement. Journal of Comparative and Physiological Psychology, 1949, 42, 174–181.
  49. Jenkins, W. O., & Stanley, J. C., Jr. Partial reinforcement: A review and critique. Psychological Bulletin, 1950, 47, 193–234.
  50. Jobe, J. B., Mellgren, R. L., Feinberg, R. A., Littlejohn, R. L., & Rigby, R. L. Patterning, partial reinforcement, and N-length effects at spaced trials as a function of reinstatement of retrieval cues. Learning and Motivation, 1977, 8, 77–97.
  51. Katz, S., Woods, G. T., & Carrithers, J. H. Reinforcement aftereffects and intertrial interval. Journal of Experimental Psychology, 1966, 72, 624–626.
  52. Keesey, R. E., & Kling, J. W. Amount of reinforcement and free-operant responding. Journal of the Experimental Analysis of Behavior, 1961, 4, 125–132.
  53. Kintsch, W. Runway performance as a function of drive strength and magnitude of reinforcement. Journal of Comparative and Physiological Psychology, 1962, 55, 882–887.
  54. Kohn, B., & Dennis, M. Observation and discrimination learning in the rat: Specific and nonspecific effects. Journal of Comparative and Physiological Psychology, 1972, 78, 292–296.
  55. Kraeling, D. Analysis of amount of reward as a variable in learning. Journal of Comparative and Physiological Psychology, 1961, 54, 560–565.
  56. Lett, B. T. Delayed reward learning: Disproof of the traditional theory. Learning and Motivation, 1973, 4, 237–246.
  57. Lett, B. T. Visual discrimination learning with a 1-min delay of reward. Learning and Motivation, 1974, 5, 174–181.
  58. Lett, B. T. Long delay learning in the T-maze. Learning and Motivation, 1975, 6, 80–90.
  59. Lett, B. T. Regarding Roberts’s reported failure to obtain visual discrimination learning with delayed reward. Learning and Motivation, 1977, 8, 136–139.
  60. Logan, F. A. Incentive: How the Conditions of Reinforcement Affect the Performance of Rats. New Haven: Yale University Press, 1960.
  61. LoLordo, V. M., McMillan, J. C., & Riley, A. L. The effects upon food-reinforced pecking and treadle-pressing of auditory and visual signals for response-independent food. Learning and Motivation, 1974, 5, 24–41.
  62. Ludvigson, H. W., & Sytsma, D. The sweet smell of success: Apparent double alternation in the rat. Psychonomic Science, 1967, 9, 283–284.
  63. Mackintosh, N. J. The Psychology of Animal Learning. London: Academic Press, 1974.
  64. McHose, J. H., & Moore, J. N. Reinforcer magnitude and instrumental performance in the rat. Bulletin of the Psychonomic Society, 1976, 8, 416–418.
  65. Medin, D. L., Roberts, W. A., & Davis, R. T. Processes of Animal Memory. Hillsdale, N. J.: Lawrence Erlbaum Associates, 1976.
  66. Meltzer, D., & Brahlek, J. A. Quantity of reinforcement and fixed-interval performance. Psychonomic Science, 1968, 12, 207–208.
  67. Miller, N. E. A reply to “Sign-Gestalt or Conditioned Reflex?” Psychological Review, 1935, 42, 280–292.
  68. Moore, B. R. The role of directed Pavlovian reactions in simple instrumental learning in the pigeon. In R. A. Hinde & J. Stevenson-Hinde (Eds.), Constraints on Learning. London: Academic Press, 1973.
  69. Morrison, R. R., & Ludvigson, H. W. Discrimination by rats of conspecific odors of reward and nonreward. Science, 1970, 167, 904–905.
  70. Morse, W. H., & Kelleher, R. T. Schedules as fundamental determinants of behavior. In W. N. Schoenfeld (Ed.), The Theory of Reinforcement Schedules. New York: Appleton-Century-Crofts, 1970.
  71. Morse, W. H., & Kelleher, R. T. Determinants of reinforcement and punishment. In W. K. Honig & J. E. R. Staddon (Eds.), Handbook of Operant Behavior. Englewood Cliffs, N. J.: Prentice-Hall, 1977.
  72. Perin, C. T. A quantitative investigation of the delay-of-reinforcement gradient. Journal of Experimental Psychology, 1943, 32, 37–51.
  73. Platt, J. R. Discrete trials and their relation to free-behavior situations. In H. H. Kendler & J. T. Spence (Eds.), Essays in Neobehaviorism: A Memorial Volume to Kenneth W. Spence. New York: Appleton-Century-Crofts, 1971.
  74. Premack, D. Reinforcement theory. In D. Levine (Ed.), Nebraska Symposium on Motivation. Lincoln: University of Nebraska Press, 1965.
  75. Rashotte, M. E., Adelman, L., & Dove, L. D. Influence of percentage-reinforcement on runway running of rats. Learning and Motivation, 1972, 3, 194–208.
  76. Reese, E. P. Human Behavior: Analysis and Application. 2nd Edition. Dubuque, Iowa: William C. Brown, 1978.
  77. Roberts, W. A. Failure to replicate visual discrimination learning with a 1-min delay of reward. Learning and Motivation, 1976, 7, 313–325.
  78. Roberts, W. A. Still no evidence for visual discrimination learning: A reply to Lett. Learning and Motivation, 1977, 8, 140–144.
  79. Schneider, B. A. A two-state analysis of fixed-interval responding in the pigeon. Journal of the Experimental Analysis of Behavior, 1969, 12, 677–687.
  80. Schoenfeld, W. N. (Ed.), The Theory of Reinforcement Schedules. New York: Appleton-Century-Crofts, 1970.
  81. Schoenfeld, W. N., Antonitis, J. J., & Bersh, P. J. Unconditioned response rate of the white rat in a bar pressing apparatus. Journal of Comparative and Physiological Psychology, 1950, 43, 41–48.
  82. Schoenfeld, W. N., & Cole, B. K. Stimulus Schedules: The t-τ Systems. New York: Harper & Row, 1972.
  83. Schwartz, B., & Gamzu, E. Pavlovian control of operant behavior: An analysis of autoshaping and its implications for operant conditioning. In W. K. Honig & J. E. R. Staddon (Eds.), Handbook of Operant Behavior. Englewood Cliffs, N. J.: Prentice-Hall, 1977.
  84. Sevenster, P. Incompatibility of response and reward. In R. A. Hinde & J. Stevenson-Hinde (Eds.), Constraints on Learning. London: Academic Press, 1973.
  85. Shimp, C. P. Perspectives on the behavioral unit: Choice behavior in animals. In W. K. Estes (Ed.), Handbook of Learning and Cognitive Processes, Vol. 2. Hillsdale, N. J.: Lawrence Erlbaum Associates, 1975.
  86. Shimp, C. P. Organization in memory and behavior. Journal of the Experimental Analysis of Behavior, 1976, 26, 113–130.
  87. Shull, R. L. The response-reinforcement dependency in fixed-interval schedules of reinforcement. Journal of the Experimental Analysis of Behavior, 1970, 14, 55–60.
  88. Sidley, N. A., & Schoenfeld, W. N. Behavior stability and response rate as functions of reinforcement probability on “random ratio” schedules. Journal of the Experimental Analysis of Behavior, 1964, 7, 281–283.
  89. Skinner, B. F. The Behavior of Organisms. New York: Appleton-Century-Crofts, 1938.
  90. Skinner, B. F. Are theories of learning necessary? Psychological Review, 1950, 57, 193–216.
  91. Skinner, B. F. Contingencies of Reinforcement: A Theoretical Analysis. New York: Appleton-Century-Crofts, 1969.
  92. Smith, R. F. Topography of the food-reinforced key peck and the source of 30-millisecond interresponse times. Journal of the Experimental Analysis of Behavior, 1974, 21, 541–551.
  93. Spence, K. W. Behavior Theory and Conditioning. New Haven: Yale University Press, 1956.
  94. Spence, K. W. The roles of reinforcement and non-reinforcement in simple learning. In K. W. Spence (Ed.), Behavior Theory and Learning: Selected Papers. Englewood Cliffs, N. J.: Prentice-Hall, 1960.
  95. Staddon, J. E. R., & Frank, J. A. The role of the peck-food contingency on fixed-interval schedules. Journal of the Experimental Analysis of Behavior, 1975, 23, 17–23.
  96. Staddon, J. E. R., & Simmelhag, V. L. The “superstition” experiment: A reexamination of its implications for the principles of adaptive behavior. Psychological Review, 1971, 78, 3–43.
  97. Stebbins, W. C., Mead, P. B., & Martin, J. M. The relation of amount of reinforcement to performance under a fixed-interval schedule. Journal of the Experimental Analysis of Behavior, 1959, 2, 351–356.
  98. Surridge, C. T., & Amsel, A. Acquisition and extinction under single alternation and random partial-reinforcement conditions with a 24-hour intertrial interval. Journal of Experimental Psychology, 1966, 72, 361–368.
  99. Terrace, H. S., Gibbon, J., Farrell, L., & Baldock, M. D. Temporal factors influencing the acquisition and maintenance of an autoshaped keypeck. Animal Learning & Behavior, 1975, 3, 53–62.
  100. Thorndike, E. L. Animal intelligence: An experimental study of the associative processes in animals. Psychological Monographs, 1898, 2 (4, Whole No. 8).
  101. Thorndike, E. L. Animal Intelligence: Experimental Studies. New York: Macmillan, 1911.
  102. Tinklepaugh, O. L. An experimental study of representative factors in monkeys. Journal of Comparative Psychology, 1928, 8, 197–236.
  103. Tyler, D. W., Wortz, E. C., & Bitterman, M. E. The effect of random and alternating partial reinforcement on resistance to extinction in the rat. American Journal of Psychology, 1953, 66, 57–65.
  104. Wagner, A. R. Effects of amount and percentage of reinforcement and number of acquisition trials on conditioning and extinction. Journal of Experimental Psychology, 1961, 62, 234–242.
  105. Weinstock, S. Acquisition and extinction of a partially reinforced running response at a 24-hour intertrial interval. Journal of Experimental Psychology, 1958, 46, 151–158.
  106. Williams, D. R., & Williams, H. Auto-maintenance in the pigeon: Sustained pecking despite contingent non-reinforcement. Journal of the Experimental Analysis of Behavior, 1969, 12, 511–520.
  107. Wolin, B. R. Difference in manner of pecking a key between pigeons reinforced with food and with water. Conference on the Experimental Analysis of Behavior, Note #4 (mimeographed), April 5, 1948. (Reprinted in A. C. Catania [Ed.], Contemporary Research in Operant Behavior. Glenview, IL: Scott, Foresman & Co., 1968.)
  108. Woodruff, G., Conner, N., Gamzu, E., & Williams, D. R. Associative interaction: Joint control of key pecking by stimulus-reinforcer and response-reinforcer relationships. Journal of the Experimental Analysis of Behavior, 1977, 28, 133–144.
  109. Zentall, T. R., & Levine, J. M. Observational learning and social facilitation in the rat. Science, 1972, 178, 1220–1221.

Copyright information

© Plenum Press, New York 1979

Authors and Affiliations

  • M. E. Rashotte
  1. Florida State University, Tallahassee, USA