Learning & Behavior, Volume 40, Issue 4, pp 380–392

Delayed matching to sample: Reinforcement has opposite effects on resistance to change in two related procedures

  • John A. Nevin
  • Timothy A. Shahan
  • Amy L. Odum
  • Ryan Ward
Article

Abstract

The effects of reinforcement on delayed matching to sample (DMTS) have been studied in two within-subjects procedures. In one, reinforcer magnitudes or probabilities vary from trial to trial and are signaled within trials (designated signaled DMTS trials). In the other, reinforcer probabilities are consistent for a series of trials produced by responding on variable-interval (VI) schedules within multiple-schedule components (designated multiple VI DMTS). In both procedures, forgetting functions in rich trials or components are higher than and roughly parallel to those in lean trials or components. However, during disruption, accuracy has been found to decrease more in rich than in lean signaled DMTS trials and, conversely, to decrease more in lean than in rich multiple VI DMTS components. In the present study, we compared these procedures in two groups of pigeons. In baseline, forgetting functions in rich trials or components were higher than and roughly parallel to those in lean trials or components, and were similar between the procedures. During disruption by prefeeding or extinction, accuracy decreased more in rich signaled DMTS trials, whereas accuracy decreased more in lean multiple VI DMTS components. These results replicate earlier studies and are predicted by a model of DMTS from Nevin, Davison, Odum, and Shahan (2007).

Keywords

Delayed matching to sample; Multiple schedules; Signaled trials; Reinforcer probability; Resistance to change

The delayed-matching-to-sample (DMTS) paradigm has been used extensively in research on short-term working memory in nonhuman animals. The basic paradigm involves the presentation of one or the other of two samples, S_1 or S_2, in discrete trials. After sample offset, the comparison stimuli C_1 and C_2 are presented at the end of a retention interval, and reinforcers may follow responses to the comparison that matches the sample. The data are often presented as forgetting functions relating the accuracy of matching to the length of the retention interval. Several studies have used the DMTS paradigm to evaluate the effects of memory processes such as proactive and retroactive interference (e.g., Edhouse & White, 1988; Harper & White, 1997). Other studies have addressed the effects on DMTS accuracy of drugs (e.g., Picker, White, & Poling, 1985) or of brain lesions (e.g., Colombo, Swain, Harper, & Alsop, 1997).

In one way or another, the studies cited above have identified variables or processes that challenge accurate DMTS performance. The effects of such challenges may, however, depend on the conditions of reinforcement and the procedure employed. We consider two procedures that have been used to study the effects of reinforcement on DMTS performance within subjects and sessions.

Nevin and Grosch (1990) trained pigeons on a signaled-DMTS-trials procedure in which an auditory stimulus was presented at sample onset and, continuing through the retention interval, signaled whether correct matches would produce large or small reinforcers; signaled large- and small-reinforcer trials alternated irregularly, and the duration of the retention interval varied irregularly across trials. With accuracy expressed as logit p (see note 1), the steady-state baseline forgetting function on large-reinforcer (rich) trials was consistently higher than and roughly parallel to that for small-reinforcer (lean) trials (see Fig. 1, left panel). Performance was disrupted with injections of sodium pentobarbital (NaPB) at three dose levels, flashing the houselight during retention intervals, and reduced sample duration, with baseline recovery between disruptor tests. For every disruptor, the average proportion of baseline was greater on small- than on large-reinforcer trials (Fig. 1, right panel).
Fig. 1

The left panel presents the average steady-state forgetting functions obtained by Nevin and Grosch (1990) for trials with signaled large-magnitude and small-magnitude reinforcers; the retention intervals varied across subjects, always in the ratios 1:2:3. The right panel shows the values of logit p, averaged across retention intervals, as a proportion of average logit p for the steady-state functions at the left, during three disruptors: injections of sodium pentobarbital (NaPB), flashing houselight during the retention interval, and reduced sample duration. Error bars are omitted because the individual data have been lost

Comparable baseline forgetting functions have been reported by Brown and White (2005a, Exp. 1) with stimuli presented after sample offset that signaled whether correct matches would be reinforced with high (rich) or low (lean) probability in irregularly alternating DMTS trials. Thus, the effects of reinforcement on baseline forgetting functions are replicable with signals presented at sample offset rather than sample onset, and with signaled reinforcer probability rather than amount (see also Jones, White, & Alsop, 1995; McCarthy & Voss, 1995).

In a related procedure, Odum, Shahan and Nevin (2005) presented DMTS trials contingent on responding according to variable-interval (VI) schedules in alternating multiple-schedule components. Different reinforcer probabilities were signaled by distinctive stimuli for the entire duration of each component, comprising several successive trials with different retention intervals. This procedure, designated multiple VI DMTS and based on that of Schaal, Odum, and Shahan (2000), allows the experimenter to evaluate the effects of reinforcement on response rates as well as forgetting functions. Odum et al. found that response rates were higher in the high-probability (rich) than in the low-probability (lean) component, and that the forgetting function in the rich component was higher than and roughly parallel to that in the lean component (left panel of Fig. 2, with accuracy expressed as log d; see note 1), as in the signaled-trials procedure of Nevin and Grosch (1990). They found that both the rate of responding and the forgetting function in the rich component were more resistant to disruption by presentations of food during intervals separating components (intercomponent interval [ICI] food) and by extinction than in the lean component (Fig. 2, right panel). These opposite effects on the forgetting functions from the ones reported by Nevin and Grosch (Fig. 1, right panel) may be attributable to the differences between studies in the procedures or the disruptors.
Fig. 2

The left panel presents the average steady-state forgetting functions obtained by Odum et al. (2005) for VI DMTS components with high-probability (rich) or low-probability (lean) reinforcers. The right panel shows the values of log d, averaged across retention intervals and expressed as a proportion of average log d for the steady-state functions at the left, for two disruptors: presentation of food during the ICI, as well as extinction. Standard errors are indicated by range bars

Because resistance to disruption is important for effective working memory, the effects of reinforcement on the persistence of short-term remembering deserve analysis. Moreover, the opposed effects on resistance to change of rich versus lean conditions of reinforcement in the signaled DMTS trials and multiple VI DMTS procedure, displayed in Figs. 1 and 2, are predicted by a theoretical model of DMTS by Nevin et al. (2007; summarized below), but need confirmation within a single experiment. Accordingly, we conducted systematic replications of the studies of Brown and White (2005a, Exp. 1) and Odum et al. (2005) with identical retention intervals, to establish comparable baseline forgetting functions, and then employed identical disruptors in both procedures.

Method

Subjects

Ten White Carneau pigeons were maintained at 80% (±15 g) of their free-feeding weights by postsession feeding and were individually housed in a temperature-controlled colony with free access to water under a 12:12-h light:dark cycle. Five of the pigeons served in the signaled DMTS trials procedure, and the other five served in the multiple VI DMTS procedure. All pigeons had previous histories with diverse operant procedures.

Apparatus

Four Lehigh Valley Electronics pigeon chambers, 350 mm long, 350 mm high, and 300 mm wide, were used. Each front panel had three translucent plastic keys that could be lit from behind with green, red, blue, and yellow light, as well as various symbols, and that required a force of about 0.10 N to record a response. The keys were 25 mm in diameter and 240 mm from the floor. A lamp (28-V, 1.1-W) mounted 45 mm above the center key served as a houselight. A rectangular opening 100 mm above the chamber floor provided access to a hopper filled with pelleted pigeon chow through a 50 × 55 mm aperture. During hopper presentations, the opening was lighted white and the houselight and keylights were extinguished. White noise and chamber ventilation fans masked extraneous noise. The contingencies were programmed and data collected by a microcomputer located in an adjacent room using Med Associates (St. Albans, VT) interfacing and software.

Procedure—Multiple VI DMTS

The baseline procedure was identical to that used by Odum et al. (2005). Two components of a multiple schedule alternated, signaled by the color of the center key (either red or green). Pecks to the lit center key changed it to yellow or blue (the sample) on a VI 20-s schedule. If no peck occurred within 80 s (the longest interval duration plus 20 s), the schedule progressed to sample presentation without a keypeck. The sample remained on until the first peck after 3 s or until a total of 6 s had elapsed. After sample offset, the center key returned to the color present during the VI phase. Following a retention interval of 0.1, 2, 4, or 8 s, the center key was extinguished and the side keys were lit, one yellow and one blue (the comparisons). A single peck turned off the side keys and was followed by food or blackout. The procedure is diagrammed in Fig. 3.
Fig. 3

Sequence of events within a trial in the VI DMTS procedure. The center key color before and after sample presentation signaled the reinforcer probability. See the text for a complete description

The components differed in their probabilities of reinforcement for pecking the color that matched the sample. In one component (rich), correct matches produced 2-s access to food with a probability of .9. In the other component (lean), correct matches produced 2-s access to food with a probability of .1. The red or green color assignments varied across birds. In both components, nonreinforced matches and incorrect choices produced a 2-s blackout. Components alternated after blocks of four trials that contained one presentation of each retention interval; their order was chosen randomly within each block. The components were separated by a 15-s ICI, during which the houselight was on and the keys were dark. The experimental sessions ended after 96 trials, 48 per component, and were conducted daily at about the same time.
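The block structure described above can be sketched as a short trial-list generator. This is only an illustration under stated assumptions, not the authors' Med Associates program: the function name `build_session`, the choice of which component starts the session, and the equiprobable random choice of sample color are all hypothetical, and the VI phase, ICI, and timing are omitted.

```python
import random

RETENTION_INTERVALS = (0.1, 2, 4, 8)          # seconds, from the Method
REINFORCER_PROB = {"rich": 0.9, "lean": 0.1}  # p(food | correct match)
TRIALS_PER_SESSION = 96


def build_session(rng, first_component="rich"):
    """Return a list of trial descriptions for one multiple VI DMTS session.

    Assumptions (not stated in the text): the starting component and the
    equiprobable choice of sample color on each trial.
    """
    trials = []
    component = first_component
    while len(trials) < TRIALS_PER_SESSION:
        # One block holds one presentation of each retention interval,
        # in an order chosen randomly within the block.
        block = list(RETENTION_INTERVALS)
        rng.shuffle(block)
        for delay in block:
            trials.append({
                "component": component,
                "retention_interval": delay,
                "sample": rng.choice(("yellow", "blue")),  # assumed p = .5
                "p_food_if_correct": REINFORCER_PROB[component],
            })
        # Components alternate after each block of four trials.
        component = "lean" if component == "rich" else "rich"
    return trials


session = build_session(random.Random(0))
```

With these assumptions each session contains 12 blocks per component, which yields the 12 trials per retention interval per component that enter the accuracy calculations.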

Procedure—Signaled DMTS trials

The baseline procedure was similar to that used by Brown and White (2005a), modified so that trial signals accompanied sample onset and the range of retention intervals was the same as in the multiple VI DMTS procedure. Each session began with an intertrial interval (ITI) lasting 15 s, during which the houselight was on but all three keys were dark. After the ITI, a red or green sample was presented on the center key. The sample remained on until the first peck after 3 s or until a total of 6 s had elapsed. After a retention interval of 0.1, 2, 4, or 8 s, the side keys were lit, one red and one green (the comparisons). The key that was lit with each color varied randomly across trials. A single peck turned off the side keys and was followed by food or a blackout. The procedure is diagrammed in Fig. 4.
Fig. 4

Sequence of events within a trial in the signaled-trials procedure. The geometric figure projected on the center key throughout the trial signaled the reinforcer probability. See the text for a complete description

The probability of reinforcement for a correct match was signaled by a circle or a vertical line that was superimposed on the center key at the onset of the sample and remained present until a comparison was pecked. On trials with a circle, correct matches produced 2-s access to food with probability 1.0 (rich); incorrect choices produced a 2-s blackout. On trials with a line, correct matches produced 2-s access to food with probability .2 (lean); nonreinforced matches and incorrect choices produced a 2-s blackout. (These reinforcer probabilities were the same as in Brown & White, 2005a, Exp. 1.) An ITI began after either food or a blackout. Sessions ended after 64 trials, 32 rich and 32 lean, and were conducted daily at about the same time.

Resistance tests

To examine the resistance to change of matching accuracy, several disruptors were introduced in successive tests. Each disruptor was in effect for 10 consecutive sessions, and a minimum of 20 baseline sessions intervened between disruptors. The 10 sessions immediately preceding each disruptor constituted the baseline against which disruptor effects were evaluated. Disruptors were arranged identically across procedures. Two of the disruptors involved novel stimuli presented within trials, and two involved general disruptors that were in effect throughout each test session.

Sample disruption

The houselight and white side keys began flashing at sample onset and continued flashing until sample termination. The side keys and houselight flashed separately and successively every 0.2 s, rotating either clockwise or counterclockwise. On each trial, the direction of the flashing houselight and side keys was randomly selected (p = .5).

Comparison disruption

The houselight and white center key flashed on and off every 0.2 s while the comparisons were presented on the side keys.

Prefeeding

The pigeons received 30 g of pigeon chow in their home cages 30 min prior to each session.

Extinction

Correct matches were never followed by food, but instead were always followed by a blackout. If no peck was made to a comparison stimulus within 20 s, the comparisons were extinguished, a blackout ensued, and that trial was not counted as correct or incorrect.

Table 1 lists the numbers of sessions and sequence of exposure to resistance tests for individual subjects in two replications of each procedure.
Table 1

Sequence of conditions and numbers of sessions for all pigeons in both procedures

 

Signaled DMTS Trials (1)      P49830     P54    P587
Baseline                          50      50      50
Prefeeding                        10      10      10
Baseline                          50      50      50
Disrupt during samples            10      10      10
Baseline                          35      35      40
Disrupt during comparisons        10      10      10
Baseline                          40      35      35
Extinction                        10      10      10

Signaled DMTS Trials (2)         P11    P958
Baseline                         120     120
Disrupt during samples            10      10
Baseline                          20      20
Disrupt during comparisons        10      10
Baseline                          18      55
Prefeeding                        10      10
Baseline                          20      20
Extinction                        10      10

Multiple VI DMTS (1)           P1188    P216   P3060   P1821
Baseline*                         20      20      20      20
Disrupt during comparisons        10      10      10      10
Baseline                          20      20      20      20
Disrupt during samples            10      10      10      10
Baseline**                        20      21      20      20
ICI food**                        10      10      10      10
Baseline                          20      20      20      20
Prefeeding                        10      10      10      10
Baseline                          20      20      20      20
Extinction                        10      10      10      10

Multiple VI DMTS (2)           P1173
Baseline                         130
Disrupt during comparisons        10
Baseline                          35
Disrupt during samples            10
Baseline                          76
Prefeeding                        10
Baseline                           3
Extinction                      died

   

*All 4 pigeons had previous experience with multiple VI DMTS, so extensive baseline training was not needed. **The ICI food results are not reported because the test was not replicated with P1173 and was not employed with signaled DMTS trials.

Measures

In both procedures, accuracy was expressed as log d, the logarithm of the geometric mean of the ratios of correct responses to errors on trials with samples S_1 and S_2, where B_1 and B_2 signify pecks to comparisons C_1 and C_2, respectively:
$$ \log d = 0.5 \log \left[ \left( \frac{B_1 \mid S_1}{B_2 \mid S_1} \right) \left( \frac{B_2 \mid S_2}{B_1 \mid S_2} \right) \right], $$
(1)
calculated separately for rich and lean components or trials. This measure (Davison & Tustin, 1978) has been used in many studies of conditional discrimination. Unlike percent correct, the more traditional measure, log d has no upper bound and is, at least in principle, independent of biases toward C_1 or C_2. Log d is not defined if any of its terms is 0, as may happen with easy discriminations at short retention intervals. Accordingly, we added 0.25 to all cells for all calculations (see Brown & White, 2005b). As a result, in multiple VI DMTS with 12 trials per session at each retention interval in rich and lean components, pooled over 10-session blocks, the maximum value of log d is 2.38. In signaled DMTS trials with 8 rich and lean trials per session at each retention interval, pooled over 10-session blocks, the maximum value of log d is 2.21.
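The two ceiling values quoted above can be checked with a few lines of arithmetic. The sketch below assumes base-10 logarithms and an even split of trials between the two samples within each 10-session block (neither is stated explicitly in the text, but both reproduce the reported maxima); the function names are illustrative only.

```python
import math

CORRECTION = 0.25  # constant added to every cell, as described in the text


def log_d(correct_s1, errors_s1, correct_s2, errors_s2, correction=CORRECTION):
    """Eq. 1 with the 0.25 correction applied to all four cells (base-10 log)."""
    ratio_s1 = (correct_s1 + correction) / (errors_s1 + correction)
    ratio_s2 = (correct_s2 + correction) / (errors_s2 + correction)
    return 0.5 * math.log10(ratio_s1 * ratio_s2)


def max_log_d(trials_per_session, sessions=10):
    """Ceiling on log d when every pooled trial is correct.

    Assumes trials are divided evenly between the two samples.
    """
    per_sample = trials_per_session * sessions / 2
    return log_d(per_sample, 0, per_sample, 0)


vi_ceiling = round(max_log_d(12), 2)        # multiple VI DMTS: 12 trials per interval
signaled_ceiling = round(max_log_d(8), 2)   # signaled trials: 8 trials per interval
```

Under these assumptions the ceilings are log10(60.25/0.25) and log10(40.25/0.25), which round to the 2.38 and 2.21 reported above.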

Results

Baseline

Forgetting functions based on data pooled for the four 10-session blocks of baseline training that preceded resistance tests, averaged across pigeons, are shown in Fig. 5. The left and right panels present the results for groups trained on multiple VI DMTS and signaled DMTS trials, respectively. The functions for the rich components or trials are higher than and roughly parallel to those for the lean components or trials, replicating Odum et al. (2005) and Brown and White (2005a). The average levels of the rich and lean forgetting functions did not differ between procedures: A 2 × 2 repeated measures analysis of variance found a main effect of rich versus lean reinforcement conditions [F(1, 16) = 14.86, p = .001] but no effect of procedures [F(1, 16) = 1.95, p = .182] and no interaction between reinforcement conditions and procedures [F(1, 16) < 1.0]. The heights of the forgetting functions at the shortest retention interval do not differ significantly between procedures (two-tailed t tests, p > .10). Thus, any differences in resistance to change between the procedures cannot be ascribed to differences in baseline forgetting functions.
Fig. 5

Forgetting functions averaged over 5 pigeons and pooled for 10 sessions of baseline training before each of the four resistance tests for multiple VI DMTS (left panel) and signaled trials (right panel); standard errors are indicated by range bars

Response rates in VI DMTS during baseline were higher in the rich than in the lean component for every pigeon in every replication of baseline, consistent with earlier findings (Odum et al., 2005); the data are summarized in Table 2.
Table 2

Response rates in the baseline and proportions of the baseline during prefeeding and extinction for individual subjects in multiple VI DMTS

 

            Responses/min           Proportions of Baseline
            Pooled Baseline      Prefeeding        Extinction
Pigeon      Rich     Lean       Rich    Lean      Rich    Lean
P1188      116.5     53.7       .532    .257      .550    .184
P216        85.2     50.2       .777    .778      .701    .312
P3060       82.8     57.6       .774    .540      .863    .500
P1821      100.5     43.7       .747    .609      .547    .312
P1173       87.0     36.8       .410    .138      died

 

Resistance tests

Accuracy levels in the baseline and disruptor test sessions were summarized by averaging log d across retention intervals separately for the rich and lean components or trials for each individual. Then, for each pigeon, average values of log d during the 10 resistance test sessions were expressed as proportions of their levels during the immediately preceding 10 baseline sessions.
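As a minimal sketch of this summary measure, the calculation for one pigeon and one reinforcement condition might look as follows. The log d values are hypothetical numbers chosen for illustration, not data from the study, and the helper names are assumptions.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def proportion_of_baseline(baseline_log_d, test_log_d):
    """Average log d across retention intervals in each phase, then
    express the test-phase average as a proportion of the baseline average."""
    return mean(test_log_d) / mean(baseline_log_d)


# Hypothetical log d values at the 0.1-, 2-, 4-, and 8-s retention intervals.
baseline = [2.0, 1.5, 1.0, 0.5]    # preceding 10 baseline sessions
disrupted = [1.5, 1.0, 0.75, 0.5]  # 10 resistance-test sessions

result = proportion_of_baseline(baseline, disrupted)
```

A value of 1.0 would indicate no change from baseline, and smaller values indicate greater disruption.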

Average proportions of baseline in the rich and lean components or trials are presented for each procedure and disruptor in Fig. 6. The figure shows that presenting novel flashing lights during the samples had modest but similar decremental effects in both procedures, whereas flashing lights during comparisons did not reduce accuracy in multiple VI DMTS. When the data for these within-trial disruptors were expressed as differences between average proportions of baseline in the rich and lean components or trials, a 2 × 2 repeated measures analysis of variance found no main effects of procedures or disruptors [F(1, 16) < 1.0] and no reliable interaction between procedures and disruptors [F(1, 16) = 1.499, p = .23]. Accordingly, these data will not be considered further.
Fig. 6

Values of log d during resistance tests, averaged across retention intervals and expressed as proportions of average log d for the steady-state functions in the immediately preceding baseline in multiple VI DMTS (left panel) and signaled trials (right panel) for rich and lean components or trials and all four disruptors: Dissamp, flashing lights during samples; Discomp, flashing lights during comparisons; PF, prefeeding; Ext, extinction. Standard errors are indicated by range bars

Figure 6 also shows that the general disruptors, prefeeding and extinction, reduced accuracy more in lean than in rich VI DMTS components, whereas the opposite ordering occurred in rich and lean signaled trials. By inspection, the effects of the general disruptors were more clearly differentiated between procedures than the effects of within-trial disruptors. When the data for these general disruptors were expressed as differences between the average proportions of baseline in rich and lean components or trials, the difference was positive for VI DMTS and negative for signaled trials, as shown by the left-hand pairs of bars in Fig. 7. A 2 × 2 repeated measures analysis of variance showed that the main effect of procedures was significant [F(1, 16) = 11.05, p = .002]; the main effect of disruptors and the interaction between procedures and disruptors were not significant [F(1, 16) < 1.0]. The extinction data of P1173, which died during that phase, were replaced with the mean of the remaining 4 pigeons.
Fig. 7

Average differences between the proportions of baseline log d in rich and lean multiple VI DMTS components and signaled trials during prefeeding and extinction (left two sets of histogram bars). Positive values signify greater resistance to disruption in rich components or trials; standard errors are indicated by range bars. The right two sets of histogram bars exhibit the differences predicted by the model of Nevin et al. (2007) when the probability of attending to the sample, p(A_s), is reduced by increasing parameter x in Eq. 2, or when the probability of attending to the comparisons, p(A_c), is reduced by increasing parameter z in Eq. 3; see the Discussion and the Appendix for explanation and calculations, respectively

We conclude that accuracy is more resistant to general disruptors in the rich than in the lean component in VI DMTS, and that the reverse is true in signaled trials, as suggested by the difference between the effects of disruptors reported by Odum et al. (2005, Fig. 2) and by Nevin and Grosch (1990, Fig. 1).

The effects of prefeeding and extinction on response rates in VI DMTS are in accordance with the effects on accuracy. As shown in Table 2, the proportions of baseline were higher in the rich components of prefeeding and extinction for all pigeons except P216, prefeeding, for which there was virtually no difference. These data confirm the results of Odum et al. (2005).

Discussion

Two apparently similar procedures for the study of DMTS yielded similar forgetting functions when different reinforcer probabilities were arranged in multiple-schedule components or signaled in irregularly alternating trials. However, the effects of prefeeding and extinction differed between procedures in ways that are consistent with previous studies. Odum et al. (2005) found that accuracy in the rich component of multiple VI DMTS was less affected by presenting response-independent food between components and by extinction than was accuracy in the lean component, where ICI food was a general disruptor analogous to prefeeding. By contrast, Nevin and Grosch (1990) found that accuracy in rich signaled trials was more affected by three doses of NaPB (a general disruptor) than was accuracy in lean signaled trials. Similar results were obtained with a flashing houselight during retention intervals and with reduced sample duration (within-trial disruptors). Although Nevin and Grosch varied reinforcer magnitude rather than probability between rich and lean trials and employed different disruptors, their data resemble the signaled-trial data presented above.

In the present study, the procedures arranged for multiple VI DMTS and signaled DMTS trials differed in a number of ways. For example, different key colors were used as samples and comparisons; reinforcer probability was signaled before and after the sample in multiple VI DMTS, but during and after the sample in signaled DMTS trials; and the reinforcer probabilities in multiple VI DMTS were .9 and .1 for the rich and lean components, whereas in signaled DMTS trials, they were 1.0 and .2 for the rich and lean trials. Because the present results replicated those of previous signaled-trial studies that had also differed in a number of ways, and because the baseline forgetting functions were similar, it is unlikely that the incidental differences between the VI DMTS and signaled-trials procedures arranged here affected the ordinal differences in resistance to disruption.

The opposed orderings of resistance to disruption in VI DMTS and signaled DMTS trials are predicted by a model proposed by Nevin et al. (2007), which we summarize here (see Nevin et al., 2007, for a full exposition of the model’s rationale, assumptions, and applications to data).

Modeling DMTS accuracy and the effects of disruptors

The model of Nevin et al. (2007) assumes that correct performance in DMTS requires attending to both samples and comparisons, that the probabilities of attending to the samples and comparisons in DMTS trials are independent, and that both depend directly on signaled reinforcer rates, expressed relative to the context in which the stimuli appear according to equations derived from behavioral momentum theory. Attending to the samples is assumed to include the subjects’ observing behavior before onset of the samples, discriminative behavior in the presence of the samples, and attending to the recently encountered samples during retention intervals (rehearsal). Attending to the comparisons is assumed to include the subjects’ observing behavior during retention intervals and discriminative behavior in the presence of the comparisons themselves. Figure 8 portrays the sequence of events and the times during which attending to samples and comparisons is assumed to occur.
Fig. 8

Time-line diagram of experimentally arranged events within a DMTS trial, and the times during which the subject is assumed to attend to the sample and comparisons. Times during which reinforcers and disruptors are assumed to operate on attending to samples or comparisons are also indicated. Reprinted from Fig. 2 of “A Theory of Attending, Remembering, and Reinforcement in Delayed Matching to Sample,” by Nevin, Davison, Odum, and Shahan, 2007, Journal of the Experimental Analysis of Behavior, 88, pp. 285–317. Copyright 2007 by the Society for the Experimental Analysis of Behavior, Inc. Reproduced with permission

The probability of attending to the sample, p(A_s), is given by
$$ p(A_s) = \exp \left( \frac{-x - qt}{(r_s/r_a)^{0.5}} \right), $$
(2)
where the sample-related reinforcer rate r_s (i.e., reinforcers per trial divided by the time preceding, during, and following sample presentation until onset of the comparisons in each trial) is expressed relative to the average reinforcer rate for an entire session, r_a, the overall context within which DMTS trials appear. The value of the exponent on r_s/r_a is based on fits to parametric data sets for free-operant responding and was used in all fits reported by Nevin et al. (2007). Attending to the sample may be reduced by increasing the general background disruptor x and by increasing a separate disruptor q during a retention interval of length t.
Likewise, the probability of attending to the comparisons, p(A_c), is given by
$$ p(A_c) = \exp \left( \frac{-z - vt}{(r_c/r_s)^{0.5}} \right), $$
(3)
where the comparison-related reinforcer rate r_c (i.e., reinforcers per trial divided by the time from sample offset to comparison offset in each trial) is expressed relative to the reinforcer rate for attending to the samples, r_s, the context in which comparisons appear. Overall attending to the comparisons may be reduced by a general background disruptor z and by a separate disruptor v during a retention interval of length t.

In experiments with easily discriminated stimuli and with equal reinforcer probabilities for correct matches following S_1 and S_2, we assume that the subject always responds correctly on a given trial if it attends to both sample and comparisons. If it fails to attend to either the sample or the comparisons, it responds randomly. Thus, discrimination accuracy for a block of trials is predicted by a weighted average of trials with and without attending, given by Eqs. 2 and 3. Specifically, the proportion correct is p(A_s) * p(A_c) + 0.5 * [1 - p(A_s) * p(A_c)]. The proportion correct is then transformed to logit p and plotted in relation to the retention interval for comparison with empirical forgetting functions with accuracy expressed as log d. The x and z parameters determine the level of the predicted forgetting function, whereas q and v affect its slope (Nevin et al., 2007).
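The chain from Eqs. 2 and 3 to a predicted forgetting function can be sketched as follows. This is an illustration rather than the fitting code of Nevin et al. (2007): the parameter values and reinforcer-rate ratios are hypothetical, the function names are assumptions, and logit p is taken here to be the base-10 log odds of a correct response.

```python
import math


def p_attend_sample(x, q, t, rs_over_ra):
    """Eq. 2: probability of attending to the sample."""
    return math.exp((-x - q * t) / rs_over_ra ** 0.5)


def p_attend_comparisons(z, v, t, rc_over_rs):
    """Eq. 3: probability of attending to the comparisons."""
    return math.exp((-z - v * t) / rc_over_rs ** 0.5)


def proportion_correct(p_as, p_ac):
    """Correct when attending to both; chance (.5) otherwise."""
    both = p_as * p_ac
    return both + 0.5 * (1.0 - both)


def logit_p(p):
    """Assumed definition: base-10 log odds (undefined at p = 1)."""
    return math.log10(p / (1.0 - p))


def forgetting_function(delays, x, q, z, v, rs_over_ra, rc_over_rs):
    """Predicted logit p at each retention interval."""
    return [
        logit_p(proportion_correct(
            p_attend_sample(x, q, t, rs_over_ra),
            p_attend_comparisons(z, v, t, rc_over_rs),
        ))
        for t in delays
    ]


# Hypothetical parameter values and reinforcer-rate ratios, for illustration only.
delays = [0.1, 2, 4, 8]
predicted = forgetting_function(delays, x=0.1, q=0.05, z=0.1, v=0.05,
                                rs_over_ra=1.0, rc_over_rs=1.0)
```

With positive q and v the predicted function declines as the retention interval grows, while raising x or z lowers it at every delay.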

With the inclusion of parameters characterizing the discriminability of sample and comparison stimuli and the generalization of reinforcement across those stimuli, the model can account for the effects of differential reinforcement on the steady-state allocation of responses to the comparison stimuli (Nevin, Davison, & Shahan, 2005; Nevin et al., 2007). However, these complexities are not critical for modeling baseline performances and their resistance to change in the present study, because we employed distinctively colored samples and comparisons and arranged symmetrical reinforcer probabilities for correct responses.

Predictions for multiple VI DMTS and signaled trials

As stated above, subjects are assumed to engage in observing behavior during the VI phase or the ITI before sample presentation, to discriminate the sample while it is present, and to rehearse the recently presented sample during the retention interval. All of these activities constitute attending to the sample, and their probability p(A_s) is given by Eq. 2, where the denominator is (r_s/r_a)^0.5. Because r_s (reinforcement associated with attending to the sample) is greater in high-probability (rich) than in low-probability (lean) VI DMTS components or signaled trials, while r_a (session-wide reinforcement) is the same, p(A_s) must be greater in rich than in lean VI DMTS components or signaled trials. However, the ways in which reinforcers contribute to r_s in VI DMTS and signaled-trials procedures are different. In VI DMTS, reinforcer probability is signaled throughout the VI as well as during DMTS trials. In signaled trials, by contrast, reinforcer probability is signaled only during DMTS trials, so the effective reinforcer rate during the ITI preceding a trial is based on the overall expected or average probability on rich and lean trials. As a result, p(A_s) differs more between rich and lean components in the VI DMTS procedure than between rich and lean trials in the signaled-trials procedure.

The probability of attending to the comparisons, p(A c), is assumed to depend on the ratio r c/r s according to Eq. 3. In both procedures, the ratio of r cRICH to r cLEAN is equal to the ratio of reinforcer probabilities. In multiple VI DMTS, the ratio of r sRICH to r sLEAN is also equal to the ratio of reinforcer probabilities, so r c/r s must be the same in rich and lean components, even though its absolute value depends on VI length and trial duration. In signaled trials, by contrast, r c/r s must be greater in rich than in lean trials because the ratio of r sRICH to r sLEAN must be less than the ratio of reinforcer probabilities. The Appendix presents exact calculations for the procedures employed in the experiment reported above.

In general, p(A s) is more differentiated between rich and lean VI DMTS components than between rich and lean signaled trials, whereas p(A c) is more differentiated between rich and lean signaled trials than between rich and lean VI DMTS components. As a consequence, a disruptor that affects attending to samples is predicted to reduce p(A s) less in rich than in lean VI DMTS components, and the difference should be greater than for the same disruptor in signaled trials. When p(A s) is reduced, the predicted difference between proportions of baseline log d in rich and lean components will be positive, whereas the predicted difference between the proportions of baseline log d in rich and lean signaled trials will be negative. Conversely, a disruptor that affects attending to the comparisons is predicted to have the same effect on p(A c) in rich and lean VI DMTS components, whereas p(A c) will be reduced less in rich than in lean signaled trials. When p(A c) is reduced, the predicted difference between the proportions of baseline log d in rich and lean components will be negative, whereas the predicted difference between the proportions of baseline log d in rich and lean signaled trials will be positive. The Appendix explains these predictions in detail.

To compare the data with these predictions, we expressed the proportions of baseline for prefeeding and extinction (see Fig. 6) as differences between rich and lean VI DMTS components or signaled trials (see Fig. 7). Predicted differences were derived from Eqs. 2 and 3 by choosing values of x to reduce p(A s), or z to reduce p(A c), that would yield predicted proportions of baseline equal to the obtained proportions of baseline averaged over rich and lean components or trials. (Note that by using the average of rich and lean proportions of baseline to select parameter values, we did not predetermine differences between rich and lean proportions of baseline.)

The right-hand pairs of bars in Fig. 7 show that the predicted effect of reducing p(A s) corresponds ordinally to the effects of the general disruptors in both procedures: Accuracy is more resistant to change in the rich than in the lean VI DMTS component (i.e., differences are positive), whereas accuracy is less resistant to change in rich than in lean signaled trials (i.e., differences are negative). By contrast, the predicted effect of reducing p(A c) is ordinally opposite to the obtained differences in both procedures.

The correspondence with predictions for reducing p(A s) suggests that general disruptors will reduce attending to the sample, which includes observing behavior before sample onset (see Fig. 8). As described above, the effects of the general disruptors on VI response rates, which may be construed as observing responses, are consistent with disruption of attending to the samples: Response rates, like accuracy, are more resistant to change in the rich component. The differences between the data for VI DMTS and signaled trials obtained here are similar to the differences between the VI DMTS data of Odum et al. (2005) and the signaled-trials data of Nevin and Grosch (1990). Taken all in all, the data on resistance to disruption of DMTS accuracy are consistent with the ordinal predictions of the Nevin et al. (2007) model, as elaborated for VI DMTS and signaled trials. However, the predicted magnitude of the difference is smaller than that obtained (see Fig. 7), suggesting that the model needs revision to achieve quantitative agreement with the data; one such revision would be to allow the exponent on r s/r a in Eq. 2 to vary as a free parameter rather than being fixed at 0.5.

Alternative models of DMTS

A very different model of DMTS performance has been proposed by White and Wixted (1999). In their model, samples S 1 and S 2 are represented as overlapping Gaussian distributions on a dimension of stimulus value. The ordinates of these distributions are multiplied by reinforcer probabilities to yield the distributions of expected reinforcers associated with each point on the stimulus value dimension. When a subject encounters a particular stimulus value on a given trial, its choice of C 1 or C 2 is assumed to match the ratio of expected reinforcers at that value.
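The matching rule described here can be sketched in a few lines. This is a minimal illustration rather than White and Wixted's (1999) implementation: the distribution means, the common standard deviation, and the reinforcer probabilities are hypothetical, and the function names are introduced only for this example.

```python
import math


def normal_pdf(v, mean, sd=1.0):
    """Ordinate of a Gaussian distribution at stimulus value v."""
    z = (v - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def p_choose_c1(v, mean_s1, mean_s2, prob_r1, prob_r2, sd=1.0):
    """Probability of choosing C1 at stimulus value v.

    Each ordinate is weighted by the reinforcer probability for the
    corresponding correct choice, and choice matches the resulting
    ratio of expected reinforcers.
    """
    expected_r1 = normal_pdf(v, mean_s1, sd) * prob_r1
    expected_r2 = normal_pdf(v, mean_s2, sd) * prob_r2
    return expected_r1 / (expected_r1 + expected_r2)


# Hypothetical values: S1 and S2 one standard deviation apart,
# equal reinforcer probabilities for the two correct choices.
MEAN_S1, MEAN_S2 = 0.0, 1.0
PROB_R1 = PROB_R2 = 0.9

for value in (-1.0, 0.0, 0.5, 1.0, 2.0):
    p = p_choose_c1(value, MEAN_S1, MEAN_S2, PROB_R1, PROB_R2)
    print(f"stimulus value {value:+.1f}: p(choose C1) = {p:.3f}")
```

With equal reinforcer probabilities the two weights cancel, so choice depends only on the ordinates. Scaling both probabilities by the same factor therefore leaves every choice probability unchanged.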

Because the model’s predictions are based on reinforcer ratios, it cannot account for the effects of absolute reinforcer probabilities on forgetting functions, such as those illustrated in Figs. 1, 2, and 5. In order to account for the enhancement of accuracy by more frequent reinforcement, Brown and White (2009) added a term for unmeasured extraneous reinforcers by assuming that the relative effects of reinforcers R 1 and R 2 explicitly arranged for correct choices of C 1 or C 2 are, in effect, diluted by extraneous reinforcers. Thus, choices at each point along the stimulus value dimension are given by
$$ \frac{B_1}{B_2} = \frac{R_1 + R_{\text{e}}}{R_2 + R_{\text{e}}}, \tag{4} $$
where R e represents extraneous reinforcers. Brown and White (2009) varied reinforcer probabilities in successive experimental conditions and showed that the effects on the level of the forgetting function could be explained by a single value of parameter R e.

Brown and White (2009) also showed that the effects of explicit alternative reinforcers could be treated similarly. Brown and White (2005c) arranged DMTS trials where center-key pecks were reinforced with food according to VI schedules during the retention intervals, and found that the level of the forgetting function decreased as the frequency of food provided by the VI schedule increased, while the slope remained about constant. These results are consistent with those of Jans and Catania (1980), who found that presenting food throughout the retention interval reduced accuracy to near-chance levels; the VI schedules used by Brown and White (2005c) could be expected to have similar but less drastic effects. The results are also consistent with predictions based on Eq. 4: Replacing hypothetical extraneous reinforcers R e with explicit reinforcers R o, it is clear that as R o increases, the B 1/B 2 ratio must decrease and approach 1.0 as R o becomes large.
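The limiting behavior of Eq. 4 can be checked numerically. The sketch below evaluates the choice ratio at a single point on the stimulus value dimension for increasing amounts of added reinforcement; the values of R 1, R 2, and R o are hypothetical and the function name is introduced only for this example.

```python
def choice_ratio(r1, r2, r_other):
    """Eq. 4 with extraneous reinforcers replaced by explicit ones (R_o)."""
    return (r1 + r_other) / (r2 + r_other)


# Hypothetical expected reinforcers for C1 and C2 at one stimulus value.
R1, R2 = 0.9, 0.1

# Hypothetical, increasing amounts of alternative reinforcement.
R_OTHER_VALUES = (0.0, 0.1, 1.0, 10.0, 100.0)

ratios = [choice_ratio(R1, R2, r_o) for r_o in R_OTHER_VALUES]

for r_o, ratio in zip(R_OTHER_VALUES, ratios):
    print(f"R_o = {r_o:6.1f}: B1/B2 = {ratio:.3f}")
```

The ratio falls monotonically as R o grows and stays above 1.0, that is, above indifference between C 1 and C 2, for any finite R o.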

In addition, Brown and White (2009) showed that Brown and White’s (2005c) data could be explained by the Nevin et al. (2007) model by allowing the disruption parameters x, q, z, and v to increase. This makes sense, because added reinforcers during the retention interval are readily construed as disruptors, as suggested by Jans and Catania (1980). Brown and White (2009) noted that the effects of R e in Eq. 4 depend on the absolute values of R 1 and R 2. Thus, if R 1 and R 2 are large relative to R e, as in rich components or trials, reductions in accuracy due to increases in R e will be smaller than if R 1 and R 2 are small, as in lean components or trials. If R e is assumed, not unreasonably, to increase relative to R 1 and R 2 during prefeeding and extinction, the augmented White–Wixted model predicts that these general disruptors will have a smaller decremental effect on accuracy in the rich component, as found for multiple VI DMTS. However, decremental effects on accuracy were greater in rich signaled trials, and it is not obvious how the White–Wixted model could be adapted to account for the opposed results in closely related DMTS procedures. Although the White–Wixted and Nevin et al. (2007) models are equally effective in accounting for steady-state forgetting functions in relation to the conditions of reinforcement, despite their structural differences, tests of resistance to change such as those reported here can differentiate between the models.
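The prediction of the augmented White–Wixted model for general disruptors can be illustrated as follows. The sketch uses the log choice ratio at a single stimulus value as a stand-in for accuracy, which is a simplification, and all values of R 1, R 2, and R e are hypothetical.

```python
import math


def log_choice_ratio(r1, r2, r_e):
    """Log (base 10) of the Eq. 4 choice ratio at one stimulus value."""
    return math.log10((r1 + r_e) / (r2 + r_e))


def proportion_of_baseline(r1, r2, re_baseline, re_disrupted):
    """Log choice ratio under disruption relative to its baseline value."""
    disrupted = log_choice_ratio(r1, r2, re_disrupted)
    baseline = log_choice_ratio(r1, r2, re_baseline)
    return disrupted / baseline


# Hypothetical expected reinforcers (R1, R2): same ratio, different scale.
RICH = (0.36, 0.09)
LEAN = (0.04, 0.01)

# ASSUMPTION: disruption is modeled as an increase in R_e.
RE_BASELINE = 0.01
RE_DISRUPTED = 0.10

prop_rich = proportion_of_baseline(*RICH, RE_BASELINE, RE_DISRUPTED)
prop_lean = proportion_of_baseline(*LEAN, RE_BASELINE, RE_DISRUPTED)

print(f"proportion of baseline, rich: {prop_rich:.3f}")
print(f"proportion of baseline, lean: {prop_lean:.3f}")
```

Because the same increase in R e is small relative to the rich values of R 1 and R 2 but large relative to the lean values, the rich proportion of baseline is the higher of the two. Nothing in this calculation distinguishes components from signaled trials, so the sign of the difference cannot reverse between procedures.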

Footnotes

  1. The calculation of log d is described in the Measures section. Logit p is given by log[P/(1 – P)], where P is the proportion correct. Logit p is identical to log d if there are no biases toward one or the other comparison stimulus or key position; no such biases were reported by Nevin and Grosch (1990).

Notes

Author note

The research reported here was supported by NIMH Grant 65949 to the University of New Hampshire and was conducted at Utah State University. We thank Wesley Thomas for assistance.

References

  1. Brown, G. S., & White, K. G. (2005a). On the effects of signaling reinforcer probability and magnitude in delayed matching to sample. Journal of the Experimental Analysis of Behavior, 83, 119–128.
  2. Brown, G. S., & White, K. G. (2005b). The optimal correction for estimating extreme discriminability. Behavior Research Methods, 37, 436–449.
  3. Brown, G. S., & White, K. G. (2005c). Remembering: The role of extraneous reinforcement. Learning and Behavior, 33, 309–323.
  4. Brown, G. S., & White, K. G. (2009). Reinforcer probability, reinforcer magnitude, and the reinforcement context for remembering. Journal of Experimental Psychology: Animal Behavior Processes, 35, 238–249.
  5. Colombo, M., Swain, N., Harper, D. N., & Alsop, B. (1997). The effects of hippocampal and area parahippocampalis lesions in pigeons: I. Delayed matching to sample. Quarterly Journal of Experimental Psychology, 50B, 149–171.
  6. Davison, M. C., & Tustin, R. D. (1978). The relation between the generalized matching law and signal-detection theory. Journal of the Experimental Analysis of Behavior, 29, 331–336.
  7. Edhouse, W. V., & White, K. G. (1988). Sources of proactive interference in animal memory. Journal of Experimental Psychology: Animal Behavior Processes, 14, 56–70.
  8. Harper, D. N., & White, K. G. (1997). Retroactive interference and rate of forgetting in delayed matching-to-sample performance. Animal Learning & Behavior, 25, 158–164.
  9. Jans, J. E., & Catania, A. C. (1980). Short-term remembering of discriminative stimuli in pigeons. Journal of the Experimental Analysis of Behavior, 34, 177–183. doi: 10.1901/jeab.1980.34-177
  10. Jones, B. M., White, K. G., & Alsop, B. A. (1995). On two effects of signaling the consequences of remembering. Animal Learning & Behavior, 23, 256–272.
  11. McCarthy, D. C., & Voss, P. (1995). Delayed matching-to-sample performance: Effects of relative reinforcer frequency and of signaled versus unsignaled reinforcer magnitudes. Journal of the Experimental Analysis of Behavior, 63, 33–51.
  12. Nevin, J. A., Davison, M., & Shahan, T. A. (2005). A theory of attending and reinforcement in conditional discrimination. Journal of the Experimental Analysis of Behavior, 84, 281–303.
  13. Nevin, J. A., Davison, M., Odum, A. L., & Shahan, T. A. (2007). A theory of attending, remembering, and reinforcement in delayed matching to sample. Journal of the Experimental Analysis of Behavior, 88, 285–317. doi: 10.1901/jeab.2007.88-285
  14. Nevin, J. A., & Grosch, J. (1990). Effects of signaled reinforcer magnitude on delayed matching-to-sample performance. Journal of Experimental Psychology: Animal Behavior Processes, 16, 298–305.
  15. Odum, A. L., Shahan, T. A., & Nevin, J. A. (2005). Resistance to change of forgetting functions and response rates. Journal of the Experimental Analysis of Behavior, 84, 65–75.
  16. Picker, M., White, W., & Poling, A. (1985). Effects of phenobarbital, clonazepam, valproic acid, ethosuximide, and phenytoin on the delayed matching-to-sample performance of pigeons. Psychopharmacology, 86, 494–498.
  17. Schaal, D. W., Odum, A. L., & Shahan, T. A. (2000). Pigeons may not remember the stimuli that reinforced their recent behavior. Journal of the Experimental Analysis of Behavior, 73, 125–139.
  18. White, K. G., & Wixted, J. T. (1999). Psychophysics of remembering. Journal of the Experimental Analysis of Behavior, 71, 91–113.

Copyright information

© Psychonomic Society, Inc. 2011

Authors and Affiliations

  • John A. Nevin (1), Email author
  • Timothy A. Shahan (2)
  • Amy L. Odum (2)
  • Ryan Ward (3)

  1. The University of New Hampshire, Vineyard Haven, USA
  2. Department of Psychology, Utah State University, Logan, USA
  3. Columbia University Medical Center, New York State Psychiatric Institute, New York, USA
