It’s painful to criticize a long, thoughtful article. But Travis Thompson’s review of The New Behaviorism (TNB) contains too many mistakes and misunderstandings to let it pass.

First, a couple of obvious errors. “Staddon is a highly regarded experimental psychologist known for his landmark chapter on schedule-induced behavior in Honig and Staddon's Handbook of Operant Behavior. . . .” Well, thanks for that, but a better choice might be the paper on which the 1977 chapter was based, which is discussed at some length in TNB, namely the 1971 repeat by Virginia Simmelhag and myself of Skinner’s (1948) “superstition” experiment. We replicated Skinner’s result but, by recording behavior second-by-second, showed that his conclusion was wrong.

Figure 1 (Figure 6.1 in TNB) shows the frequency of three different activities in each 2-s period of the 12-s interfood interval of a fixed-time schedule for a single experimental subject. The bottom panel makes it clear that the behavior that should have been “adventitiously reinforced” (“head in food magazine”), because it was reliably contiguous with food delivery, was in fact replaced after seven experimental sessions by the terminal response of pecking, which had never previously been contiguous with food. Whatever the source of pecking (presumably a Pavlovian-type temporal conditioning), it was not “adventitious reinforcement.”

Fig. 1. Development of the terminal response. (The graph shows, for Bird 49 on the response-independent fixed-interval procedure, the transition from Head in magazine (R11) to Pecking (R7) as terminal response, and includes one interim activity, \( \frac{1}{4} \) circles (R4), for comparison. Each panel covers 2 seconds of the 12-second interval and indicates the number of intervals (out of 64) in which a given response occurred in that 2-second block for each session over the first 36 sessions. Gaps indicate days for which data are not available. Bird 49 was not run for a 9-day period between Sessions 21 and 22.)

Design and Interpretation of Experiments: What are the Data Telling Us?

Second, “Staddon’s book seems to be a paean to Herrnstein,” writes Thompson. This claim is especially odd for a couple of reasons. I was possibly the only student of Herrnstein’s whose PhD thesis was not on some aspect of the thing (Footnote 1) for which he is most famous, namely the “matching law.” There is a lengthy section in The New Behaviorism on the matching law, not because I wished to extol it but because I wanted to explore the idea that it tells us little that we did not already know.

Experimenters want to be able to demonstrate reliable effects and orderly relations. The details of the procedure are carefully tuned by trial and error until simple, replicable effects are obtained. The danger is that sometimes the effects may reflect constraints of the procedure more than the properties of the organism. Order may be achieved at the expense of relevance.

The matching law may be one example, but first a simpler one, Abram Amsel’s frustration theory, the idea that if an animal is used to getting a reward, and then the reward is omitted, he is “frustrated” and responds more vigorously immediately afterwards.

This hypothetical frustration effect was tested with rats running in a “double runway,” that is, a runway with two goal boxes: G1, a short way from the start box and the second, G2, some distance after that: a short first runway (3.5 ft) followed by a much longer second one (10 ft). In training the rat runs to G1 and gets a bit of food, then he runs to G2 and gets another bit of food. The experimenter measures how fast the rat runs in the first part of the long second runway.

The experiment is in two phases. The rat always gets food in G2. In the first phase, he also gets food in G1 on every trial. Thus, he learns to expect food in the first goal box. In the second phase, he gets food in the first goal box on only half the trials: rewarded on half, “frustrated” on half. The question: How fast does he run in the long runway after food and after no food in the first goal box? The answer is: after training he runs faster (Footnote 2) when there is no food in the mid-goal box compared to when there is food, especially in the first third of the long runway. This is Amsel’s “frustration effect” (FE).

I reasoned as follows: if the effect is confined to the first third of the long second runway, why does the second runway need to be so much longer than the first? Why not save wood, time, and free parameters by making both runways the same length? Perhaps, if the second runway is short, there is no effect? Perhaps the FE is in some way caused by the longer second runway, which imposes a delay between the first reward in G1 and the second in G2?

It was well-known that rats, pigeons, and people react to any reliable delay between a stimulus (such as food in G1) and reward (food in G2) by delaying the rewarded response—the postreinforcement pause (wait) that well-trained rats and pigeons show on fixed-interval schedules is an example. Perhaps, we conjectured, the frustration effect reflects not excitation triggered by nonreward so much as disinhibition caused by the omission of the inhibitory time marker, food, in G1? The normal hesitation after the food in G1 is absent when food is omitted, hence the faster running in the initial segment of the long second runway.

We were able to show in a series of experiments using temporal procedures (Footnote 3) that this interpretation is the correct one, that Alan Wagner’s control experiment (of which we were not aware at the time) is not relevant, and that a “reverse frustration effect” can be produced by a training schedule that produces a high response rate after reward. The frustration effect had to be abandoned (Footnote 4). The message: As you manipulate the details of procedure to get your effect, be sure to incorporate those variables into your theory. Amsel made the second runway longer than the first, but failed to wonder why this extension was necessary.

The matching law is a bit more complicated, but the same principles apply. The law is based on experiments with concurrent (two-choice) variable-interval (VI) schedules. It is a relation between two dependent variables: rate of response to each choice and obtained rate of reinforcement for each choice. In the steady state, the ratio of responses made equals the ratio of reinforcements obtained, which is the matching law. Thus, given a suitably motivated subject long-trained with VI 1 min for responding on the left and VI 3 min for responding on the right, the ratio of reinforcers obtained and responses made will both equal 3:1.

But, as I point out in TNB, to get this result Herrnstein had to tweak the simple concurrent VI VI by enforcing a delay of 1.5 s or so for switching (L → R or R → L). Absent a changeover delay (CoD), animals tend to undermatch, choosing 2:1 when the reinforcement ratio is 3:1, for example. There is a tendency to overmatching (e.g., 3:1 response ratio and 2:1 reinforcement ratio) when the CoD is much longer than 1.5 s. An explicit travel requirement between choices leads to strong overmatching (Baum, 1982). Yet the free parameter of CoD duration finds no place in matching theory.
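As a worked illustration, the numerical examples above can be put in symbols. The notation is an assumption introduced here rather than taken from TNB: \(B_L, B_R\) are response rates and \(R_L, R_R\) are obtained reinforcement rates on the two choices. The power-function form is the generalized matching law attributed to Baum (1974), with sensitivity \(a\) and bias \(b\); the values of \(a\) below are simply what the 2:1 and 3:1 examples imply if bias is assumed absent (\(b = 1\)).

```latex
\begin{align}
  % Strict matching: response ratio equals obtained reinforcement ratio
  \frac{B_L}{B_R} &= \frac{R_L}{R_R} \\
  % Generalized matching: a is sensitivity, b is bias
  \frac{B_L}{B_R} &= b \left( \frac{R_L}{R_R} \right)^{a} \\
  % Undermatching example (2:1 responses at 3:1 reinforcement, b = 1)
  2 = 3^{a} \;&\Rightarrow\; a = \frac{\ln 2}{\ln 3} \approx 0.63 \\
  % Overmatching example (3:1 responses at 2:1 reinforcement, b = 1)
  3 = 2^{a} \;&\Rightarrow\; a = \frac{\ln 3}{\ln 2} \approx 1.58
\end{align}
```

Strict matching is the special case \(a = 1\), \(b = 1\); undermatching corresponds to \(a < 1\) and overmatching to \(a > 1\).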

Matching is usually thought of as a law of choice behavior. But does it reflect intrinsic properties of the choosing organism or properties of a procedure with two negative feedbacks?

  1. no responding = no reinforcement (remember that the law is between response ratio and obtained reinforcement ratio), and

  2. the longer the time since a response to one alternative, the higher the probability of payoff for the next response: payoff probability increases with delay.

Feedback 1 means that reinforcements automatically equal 0 when responding ceases so that matching holds even on concurrent ratio schedules where subjects usually fixate on the higher-probability choice. Feedback 2 means that the longer the subject avoids one choice the higher its payoff probability, suggesting that almost any reinforcement principle will lead to some responding to both choices on concurrent VI VI, no matter how disparate the schedules may be.

To test the idea that molar matching is insensitive to the details of the choice rule, Hinson and Staddon (1983) simulated conc VI VI choice with a range of choice rules that balanced a tendency to “stay” immediately after reinforcement against a tendency to “switch” as postreinforcement time increases. In general, the tendency was to undermatching; in every case the molar functions fit the unbiased generalized matching law (Baum, 1974).
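The kind of simulation described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the rules used by Hinson and Staddon (1983): time advances in 1-s ticks with one response per tick, each VI schedule is approximated by a constant per-tick arming probability, there is no changeover delay, and the choice rule (the function name, the parameter `k`, and the exponential switch probability are all hypothetical) never switches immediately after food and switches with increasing probability as postreinforcement time grows.

```python
import math
import random


def simulate_conc_vi_vi(vi_left=60.0, vi_right=180.0, k=0.01,
                        ticks=200_000, seed=1):
    """Simulate one response per 1-s tick on concurrent VI VI (hypothetical rule)."""
    rng = random.Random(seed)
    # Per-tick arming probability: a random-interval approximation to VI.
    p_setup = {"L": 1.0 / vi_left, "R": 1.0 / vi_right}
    armed = {"L": False, "R": False}
    responses = {"L": 0, "R": 0}
    reinforcers = {"L": 0, "R": 0}
    side = "L"        # key currently being responded to
    since_food = 0    # ticks since the last reinforcer on either key

    for _ in range(ticks):
        # Feedback 2: an unarmed schedule may arm on any tick and then holds
        # its reinforcer, so payoff probability grows with time away.
        for key in ("L", "R"):
            if not armed[key] and rng.random() < p_setup[key]:
                armed[key] = True

        # Assumed choice rule: no switching right after food; the switch
        # probability rises with postreinforcement time.
        if rng.random() < 1.0 - math.exp(-k * since_food):
            side = "R" if side == "L" else "L"

        responses[side] += 1
        # Feedback 1: only a response can collect a reinforcer.
        if armed[side]:
            armed[side] = False
            reinforcers[side] += 1
            since_food = 0
        else:
            since_food += 1

    return responses, reinforcers


resp, rf = simulate_conc_vi_vi()
print("response ratio L:R      =", resp["L"] / resp["R"])
print("reinforcement ratio L:R =", rf["L"] / rf["R"])
```

Comparing the two printed ratios across several schedule pairs, and then varying `k` or swapping in a different switch rule, is one way to probe how much the molar relation depends on the details of the choice rule.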

It is hard not to conclude that molar matching told us little that we didn’t already know about the actual process, the real-time rules, that govern choice. Herrnstein’s procedure achieved order, but at the expense of relevance to the object of inquiry: the organism. I’m not sure that he would have considered that conclusion a “paean.” (Nor is it the case, as Thompson strangely asserts, that “Staddon’s notion of free will arises from Herrnstein’s matching law!”)

Models

“It isn’t clear that the main subject matter of TNB concerns behavior,” writes Thompson. He is right: the book is not just about behavior. It is about the processes that cause behavior, which include concepts other than lever presses or key pecks. The aim of science is to explain, understand, predict—not just describe. Description is often a necessary preliminary, of course. Exploration—of chemical substances, individual organisms, physical phenomena, and of reflexes and reinforcement schedules—is the first step in any science. From orderly description come generalizations that lead to hypotheses that can be tested: first induction, then deduction, and finally, test. The later steps have been skimped by behavior analysis, perhaps because Skinner (1938) himself was unenthusiastic about exploration for its own sake, dismissing it as the “botanizing of reflexes.” But a science of behavior, like the science of biology, must begin by botanizing in order to end with a theory of evolution.

Thompson is puzzled about theory, writing “Staddon’s theoretical state and integrator constructs have no tangible material referents so far as this reviewer has determined.” Perhaps my exposition was at fault. In a misplaced effort to downplay the math I wrote “[An] emerging theme is the idea that many of the properties of simple learning can be explained by interactions among independent agents (“integrators”), each of which retains a memory of its past effectiveness in a given context.” Bad choice; sounds like cognitive psychology. I’m talking here not about motivated human-like “agents” but simply about a set of linked equations.

An example might help. There are two phenomena, one in human memory the other in the learned behavior of primitive animals, that seem to depend on the same simple process:

  1. Jost’s Law of Forgetting: “[I]f 2 memories are of the same strength but different ages, the older will decay more slowly than the younger” (Wixted, 2004).

  2. Rate-sensitive habituation: The responses elicited by many stimuli diminish in strength with repeated stimulus presentation (habituation). Habituation occurs more rapidly when interstimulus intervals are short than when they are long, but also recovers more rapidly after short ISIs (rate sensitivity: Staddon, 1993).

Most theories of memory assume some kind of decaying trace. H. A. Simon (1966) showed many years ago that Jost’s Law is incompatible with simple exponential decay. A trace that is the sum of two or more exponentials, decaying at different rates, is necessary to account for slower decay of older memories. Staddon (1993, 2001) showed essentially the same thing for rate-sensitive habituation: faster learning but also faster forgetting for closely spaced versus widely spaced training stimuli requires memory decay that is the sum of successively slower and slower-decaying exponentials.
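The argument about exponential traces can be checked numerically. The sketch below is a simplified illustration, not the MTS model itself: it treats a memory trace as a weighted sum of independent exponentials, and the function name, decay rates, weights, ages, and delay are all assumed values chosen for the example. It compares the fraction of strength that a young and an old memory retain over the same further delay; because two memories matched in current strength differ only by a scale factor, that fraction is all that matters for Jost’s Law.

```python
import math


def retained(age, delay, rates, weights):
    """Fraction of current trace strength left after a further delay.

    The trace is assumed to be sum_i weights[i] * exp(-rates[i] * t).
    """
    def trace(t):
        return sum(w * math.exp(-r * t) for r, w in zip(rates, weights))
    return trace(age + delay) / trace(age)


# Single exponential: young and old memories lose the same fraction,
# so two memories of equal strength stay equal (contrary to Jost's Law).
single = dict(rates=[0.5], weights=[1.0])
print(retained(1.0, 5.0, **single), retained(10.0, 5.0, **single))

# Sum of a fast and a slow exponential: the old memory is dominated by
# the slow component and retains a larger fraction, as Jost's Law requires.
double = dict(rates=[0.5, 0.05], weights=[0.5, 0.5])
print(retained(1.0, 5.0, **double), retained(10.0, 5.0, **double))
```

With a single rate the retained fraction is exp(-rate × delay) whatever the age, which is why simple exponential decay cannot produce the age dependence; adding a slower component makes the retained fraction rise with age.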

This approach has turned out to be quite powerful. It also has some promising links to neurophysiology, although its explanatory usefulness does not depend on them. Staddon, Chelaru, and Higa (2002a) showed that essentially the same theoretical model (the multiple-time-scale [MTS] model) could explain Hermann Ebbinghaus’s original forgetting function (uniquely, for studies of human memory, a single-subject study), as well as numerous static and dynamic properties of interval timing. Figure 2 (Staddon et al., 2002a, Figure 9) shows predictions of the model in three input sequences, compared with the responses of individual subjects.

Fig. 2. Response of the model to three impulse patterns. Top: two short (15-s) IRIs separated by eight baseline (45-s) IRIs. Middle: eight short IRIs. Bottom: eight short separated by four baseline IRIs. Light lines + markers: data from 3 individual pigeons. Heavy line: predictions of the MTS model (Staddon et al., 2002a, Figure 9).

It is intriguing that (although this lead does not seem to have been followed up) five properties of the MTS model, from the existence of multiple exponential integrators, through sequential properties and chained links between successive units, were demonstrated in magnetic source brain imaging studies by Uusitalo et al. (1996) and Glanz (1998; see Staddon et al. 2002b, Fig. 8).

It is important to emphasize, however, that neither the MTS model nor any other postulated set of internal states need be directly measurable. They may be perfectly good as explanations or to link apparently disparate phenomena without being accessible to the tools of neurophysiology. On the other hand, if in the future such links are found, so much the better. But the utility of these ideas as explanations does not depend on immediate links to brain structure or function (Footnote 5).

Scientific hypotheses often involve concepts that are not directly measurable, although they will—must—have measurable implications. The existence of atoms was inferred from the empirical Law of Multiple Proportions in chemistry in 1808. Eventually atoms were identified by other means and after a still longer time could be sort of seen by electron micrographers. But atoms were useful to science long before they could be seen. The same is true of the (perhaps misleadingly named) “agents”—exponential integrators—in the MTS models: they are useful as explanations, even without a 3-D image.

Skinner himself was inconclusive about theory. On the one hand, he wrote that science should not postulate “events taking place somewhere else, at some other level of observation, described in different terms, and measured, if at all, in different dimensions,” which would seem to rule out unobservables like genes and atoms, not to mention integrator memories. On the other hand, Skinner also raised no objection to theory as “a formal representation of the data reduced to a minimal number of terms” (Skinner, 1950a, p. 193; 1950b, p. 216; both works are quoted in TNB). What if the most parsimonious, “minimal” account of a set of behavioral phenomena involves postulating hypothetical processes like memory traces? Would Skinner have gone along? Probably not, because he quickly abandoned his own theoretical foray, the “Reflex Reserve” (Killeen, 1988). But in a discussion of Verbal Behavior, Skinner (1948) explicitly acknowledged the possibility of “latent” (i.e., unconscious, unmeasurable) responses, entities not too remote from memory traces. In the face of this ambiguity, the field, naturally excited by the new experimental possibilities offered by the operant conditioning method, avoided deductive theory almost entirely.

Was Staddon Mean to Skinner?

“Staddon’s descriptions of B. F. Skinner's ideas and professional work exceeds the bounds of civility,” writes Thompson, citing equally critical comments from reviewers of the first edition. No offense intended! I tried in this edition to match my comments to the ambition of Skinner’s own claims. First, as I have said repeatedly, Skinner has no peer in psychology as an experimenter and innovator. He wrote simply and brilliantly. The concept of the operant is unchallenged. His defects are his deprecation of theory and his extrapolations from a highly specialized body of experimental research to problems of ethics and politics to which it has only the most limited application.

He wrote in 1955 that “To confuse and delay the improvement of cultural practices by quibbling about the word improve is itself not a useful practice” (Skinner, 1955/1961), implying that (to him) what is right and wrong is obvious. So he felt no need to justify the ethical basis for his proposals. Their practical feasibility as political systems flowed, he thought, from data on operant conditioning, mostly from rats, pigeons, and human clinical populations in laboratory situations.

Skinner entitles a bestseller Beyond Freedom and Dignity, and then denies the reality of freedom and says nothing about what dignity should be, discussing only how people (i.e., contemporary Americans) recognize and acknowledge it. He alludes not at all to other writing on these difficult topics (neither Adam Smith nor David Hume appears; even Bertrand Russell, whom we know Skinner read, is absent, as are any historians). Other writers have gained readability by omitting reference to many relevant people (Yuval Harari’s Sapiens is an example), but they can be forgiven if they do not at the same time recommend massive social engineering, as Skinner does in Beyond Freedom and Dignity and the utopian novel Walden Two. The novel sketches out a kind of Platonic technocracy that is both incompatible with any kind of democratic republic and unworkable as a sustained system, as the troubled and usually short lives of its exemplars suggest. All this is totally forgivable in a novel. But Skinner assigned it, along with 1984 and Brave New World, to his introductory classes, so it is my best guess as to what he actually believed.

“Experiment” was to be the key to Skinner’s new community. A laudable aim, to be sure. But valid experimentation is the one thing that can’t be done on most social issues. Should the response to Covid-19 and the attendant economic stress be an injection of government money, a general pensions subsidy, protectionist tariffs to “deglobalize” the United States, or nothing at all? Experiment is not possible. Legislators and the executive must act without knowing for certain what the outcome will be. Action in the face of imperfect knowledge is the rule for most major public policy decisions.

And there is the problem of prediction and time horizon: even if it can be established that a new social practice is likely to have good effects in the short term, how about the long? That’s why societies have values, to guide choice even when the future is profoundly uncertain. Values are important, but the reader must guess Skinner’s values because they are not made explicit.

Thompson gives many examples of what he considers my “derisive comments” about Skinner. The reader can judge for herself by reading them in context. But this one struck me as particularly mistaken: “In referring to Skinner’s earlier work, Staddon described it as ‘a story of insight and happy accident’” (TNB, p. 30), implying that this is a disparaging comment and that accidents play no role in creative science. It is not and they do. The context is this passage (admittedly many pages later): “real science is a story of percipience, persistence and failure. All the great discoveries involve either accident, like Becquerel and radioactivity and Fleming and penicillin, or incredible persistence in the face of repeated failure, like Skinner’s evolution of the Skinner box and discovery of reinforcement schedules (Chapter 3). . .” (p. 118).

Skinner (1956), in one of his most insightful papers (“A Case History in Scientific Method,” discussed at length in TNB) described his discovery of the power of intermittent reinforcement as a byproduct of a shortage of food pellets for his rats caused by his unwillingness to undertake the labor of making them:

The procedure was painstaking and laborious. Eight rats eating a hundred pellets each per day could easily keep up with production. One pleasant Saturday afternoon I surveyed my supply of dry pellets, and, appealing to certain elemental theorems in arithmetic, deduced that unless I spent the rest of that afternoon and evening at the pill machine, the supply would be exhausted by ten-thirty Monday morning. Since I do not wish to deprecate the hypotheticodeductive method, I am glad to testify here to its usefulness. It led me to apply our second principle of unformalized scientific method and to ask myself why every press of the lever had to be reinforced. . . .

It is surely not “derisive” to refer to this as a “happy accident,” which, combined with what Pasteur called a “prepared mind,” led to the momentous discovery of reinforcement schedules. This is how creative science happens.

Thompson comments that my criticism of Skinner’s insistence on his own vocabulary is unfair. This point is reasonable as applied to the book Verbal Behavior, which explored a wholly novel field. Not so, when applied to operant conditioning in general. Perfectly good terms already existed for operant (instrumental), reinforcement (reward), conditioned (secondary) reinforcement, ontogenic (ontogenetic), contingent (dependent), etc. In the absence of an established theory, a new term will have as many problems as the old, just as “energy” had an uncertain meaning until Lord Kelvin, Willard Gibbs, and thermodynamics. Insisting on new terms ungrounded in theory almost guarantees endless debate about exactly how they should be defined.

Perhaps it was unkind to see in Skinner’s control of language an attempt to create a self-isolated group of disciples. But it was surely a reasonable inference, especially after listening myself to three weeks of his Harvard Pro-seminar in which his insistence on terminology was the main theme.

I should add, although I should not have to, that my personal relations with Skinner were excellent. He tolerated my attempt to introduce automata theory to his seminar, was generous in passing my French language exam and presided gracefully over my PhD final. My beef is with Skinner’s philosophy and his scientistic proposals to reform society, not with him (Staddon, 2002).