Normalization between stimulus elements in a model of Pavlovian conditioning: Showjumping on an elemental horse
- Thorwart, A., Livesey, E. J., & Harris, J. A. (2012). Learning & Behavior, 40, 334. doi:10.3758/s13420-012-0073-7
Harris and Livesey (Learning & Behavior, 38, 1–26, 2010) described an elemental model of associative learning that implements a simple learning rule that produces results equivalent to those proposed by Rescorla and Wagner (1972), and additionally modifies in “real time” the strength of the associative connections between elements. The novel feature of this model is that stimulus elements interact by suppressively normalizing one another’s activation. Because of the normalization process, element activity is a nonlinear function of sensory input strength, and the shape of the function changes depending on the number and saliences of all stimuli that are present. The model can solve a range of complex discriminations and account for related empirical findings that have been taken as evidence for configural learning processes. Here we evaluate the model’s performance against the host of conditioning phenomena that are outlined in the companion article, and we present a freely available computer program for use by other researchers to simulate the model’s behavior in a variety of conditioning paradigms.
Keywords: Associative learning, Computational modelling
Most models of conditioning describe the content of learning as the strengthening or weakening of an association or link that connects some representations of the conditioned stimulus (CS) and the unconditioned stimulus (US). One fundamental issue that has been extensively debated concerns the distributed versus unitary nature of the way that stimuli are represented within the associative network. On one side of this distinction are theories that assume individual stimuli to be represented by multiple elements distributed across the network; on the other side are theories in which whole stimulus patterns are represented by a single configural unit. Much of the debate has centered on how these models perform in solving particular “nonlinear” discriminations, such as negative patterning. The key to solving the discrimination is to provide a means by which some of what is learned on reinforced trials does not generalize to nonreinforced trials, or vice versa. Most viable models of associative learning (elemental or configural) achieve this by assuming that stimulus representations involve a nonlinear combination of stimulus elements (e.g., Pearce, 1994; Wagner 2008). In the case of purely elemental models (e.g., Harris, 2006; Harris & Livesey, 2010; McLaren & Mackintosh, 2000, 2002), there is quantitative nonlinearity in the processing of stimulus elements. For example, the elements representing a compound, AB, are the same as those representing the individual stimuli, A and B, but the strength of any element’s activation by the compound will often differ from its activation by A or B alone (or, if the element is common to A and B, its activation in the compound will differ from the simple sum of its activations by both stimuli). This means that most complex discriminations, including negative and positive patterning, biconditional discriminations, and some aspects of occasion setting, can be solved by relying solely on elemental representations.
The present article focuses on a model first proposed by Harris (2006), and subsequently elaborated by Harris and Livesey (2010). The model uses a simple normalization operation, attributed to a capacity limitation of attention, to regulate the activation strengths of the stimulus elements as a function of the total number of elements activated. We will give only a brief description of the model here, but we refer the reader to previous reports for detailed descriptions (Harris, 2006, 2010; Harris & Livesey, 2010). We will then go on to describe how the model performs when accounting for the variety of empirical phenomena that have been set out in the companion article, and we will present several illustrative simulations of the model. In doing this, we will concentrate on phenomena that are not already discussed at length in our previous articles.
An attention-modulated associative network (AMAN)
Equation 1 derives from computational rules that have been used in numerous existing models of sensory systems that incorporate a gain control process of normalization (e.g., Grossberg, 1973; Heeger, 1992; Reynolds & Chelazzi, 2004; Reynolds & Heeger, 2009), and its application to the activation of elements has been discussed recently by Harris (2010). It is a monotonically increasing function that asymptotes at 1. If the inhibitory input N = 0, the excitatory input Input^p must equal the constant D to excite the unit to half of its potential response (R = .5). As N (the normalization term) increases, the function is shifted to the right, such that the strength of the excitatory Input^p must increase by N^p in order for R to reach half height. The constant D scales the range of Input values over which critical changes in R occur, and the power p determines the slope of the function. When p = 1, the function is a simple monotonically increasing curve. Higher values of p make the function sigmoid, and increasing p increases the maximal slope.
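The verbal description above pins down a plausible form for Equation 1. The following sketch is a hedged reconstruction, not the published equation, which may differ in detail; the function name and argument order are our own:

```python
# Hedged reconstruction of Equation 1 from its verbal description:
# R rises monotonically and asymptotes at 1; with N = 0, R = 0.5 when
# Input^p equals the constant D; and the half-height point shifts right
# by N^p as the normalizing input N grows.
def unit_response(inp, N, D, p=1.0):
    """Response of a unit: R = Input^p / (Input^p + N^p + D)."""
    return inp ** p / (inp ** p + N ** p + D)
```

With p = 1 the curve is a simple saturating function; with p > 1 the numerator and denominator terms make it sigmoid, with a steeper maximal slope as p increases, matching the description in the text.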
Equation 1 describes the general form of the function relating the response of a unit to its input. Below, we give the specific equations for each E, I, and A unit. For E units, the excitatory inputs come from both sensory input S and other E units, and the normalizing inputs come from the I units; for I units, the excitatory inputs come from all E units and the normalizing inputs from A units; and for A units, the excitatory inputs come from S, whereas the normalizing inputs come from other A units (as illustrated in Fig. 1). We have set at zero the lower limit on the total summed input to any unit, constraining the activity in any unit to be nonnegative.
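The connectivity just described can be sketched as one synchronous update step. This is an illustrative sketch under stated assumptions, not the published implementation: the response function follows the general form of Equation 1 with p = 1, and the constant D, the pooling of inputs, and the weight-matrix layout are our own simplifications.

```python
D = 0.2  # assumed semi-saturation constant (illustrative value)

def R(excitatory, normalizing):
    # Total summed input is floored at zero, so activity is nonnegative
    excitatory = max(excitatory, 0.0)
    return excitatory / (excitatory + normalizing + D)

def update(S, E, I, A, W):
    """One synchronous update for n elements.
    S: sensory input per element; E, I, A: current unit activities;
    W[i][j]: associative weight from E_j to E_i (hypothetical layout)."""
    n = len(S)
    # A units: excited by S, normalized by the other A units
    A_new = [R(S[i], sum(A) - A[i]) for i in range(n)]
    # I units: excited by all E units, normalized by their A unit
    I_new = [R(sum(E), A[i]) for i in range(n)]
    # E units: excited by S plus associative input from other E units,
    # normalized by their I unit
    E_new = [R(S[i] + sum(W[i][j] * E[j] for j in range(n)), I[i])
             for i in range(n)]
    return E_new, I_new, A_new
```

Iterating this update from rest lets activity settle toward the normalized values; the published model runs such dynamics in discrete time steps within each trial.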
The Harris and Livesey (2010) model specifies associative processes in real time, using an algorithm in which associative change is determined by the activation state of the recipient element. In most respects, this algorithm functions in a fashion very similar to the summed error term of the Rescorla–Wagner (1972) model, because the instantaneous change in activation of the recipient element is generally proportional to the summed error term. The algorithm is similar to an idea first suggested by Konorski (1948) (see also McLaren & Dickinson, 1990; Sutton & Barto, 1981) in assuming that the strength of a connection between the E units of two elements increases when activity in the E unit of the recipient element rises. However, unlike those earlier proposals, which assumed that connection strength decreases when activity in the recipient E unit falls, the learning rule defined by Harris and Livesey (2010) states that the strength of the connection decreases when activity rises in the inhibitory (I) unit of the recipient element. Thus, rather than tying the direction of associative change to the direction of change in activity of the recipient excitatory E unit, the present rule ties it to the rise in excitation versus inhibition of the recipient element.
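The sign logic of this learning rule can be sketched as follows. Only the direction of change is taken from the text; the rate constants and the multiplicative form are placeholder assumptions, not the published rule:

```python
# Sketch of the sign logic: a connection from a presynaptic E unit
# strengthens when the recipient's E activity is rising, and weakens when
# the recipient's I activity is rising. Falls in activity drive no change,
# unlike the earlier Konorski-style proposals described in the text.
def weight_change(pre_E, dE_recipient, dI_recipient,
                  beta_up=0.1, beta_down=0.1):
    """Instantaneous weight change for one time step (illustrative form)."""
    dw = 0.0
    if dE_recipient > 0:      # rise in recipient excitatory (E) activity
        dw += beta_up * pre_E * dE_recipient
    if dI_recipient > 0:      # rise in recipient inhibitory (I) activity
        dw -= beta_down * pre_E * dI_recipient
    return dw
```

Because increments track rises in E and decrements track rises in I, the net change across a trial approximates a summed error correction, which is why the model's trial-level predictions resemble those of Rescorla and Wagner (1972).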
Simulating performance in experimental designs
We provide an easy-to-use simulator program developed to assist researchers in simulating the AMAN model proposed by Harris and Livesey (2010). The program, with accompanying instructions, can be freely downloaded from our website at http://sydney.edu.au/science/psychology/staff/justinh/downloads/.
The program is available as two .m files that can be run using MATLAB or as executable files that can be run as standalone programs, for use by researchers who do not have the MATLAB software package. All simulations described in this article used the same parameter values, with the exception of the number of time steps per trial. Because the simulations are computationally complex and time-consuming, the number of time steps was kept as small as possible. The full parameter file, as used in these simulations, is included in the Appendix of this article.
To demonstrate how the model performs on the list of conditioning phenomena outlined in the lead article of this volume, we have divided the to-be-explained phenomena into three categories. First, we will discuss those phenomena that are accounted for by the model in terms of a Rescorla–Wagner-like learning rule applied to elemental representations of stimuli. Second, we will discuss those phenomena that can be accounted for by suppressive normalization between elements and the operation of the attention mechanism, as well as the effects on element-to-element associations that develop across stimulus presentations. Finally, we will consider those phenomena that are beyond the explanatory scope of the model, at least in its current formulation.
Phenomena explained by a summed-error learning rule and elemental representations
Acquisition, extinction, and conditioned inhibition
Responding to an extinguished CS has been shown to recover when that CS is presented in a context different from that in which it was extinguished (e.g., Bouton & Peck, 1989; Bouton & Ricker, 1994). AMAN can anticipate this context specificity of extinction via at least two mechanisms. One of these allows elements that form part of the extinction context to acquire some inhibitory strength that contributes to the loss (extinction) of responding to the CS in that context; this inhibition is then lost when the CS is presented in a different context after extinction, leading to some recovery of responding. However, this mechanism does not explain context-specific extinction under circumstances in which two CSs are extinguished in different contexts before the contexts are switched for test (Harris, Jones, Bailey, & Westbrook, 2000). We will return to this issue later.
In a feature-negative discrimination, A + AX–, the feature X acquires inhibitory strength that can reduce responding even when X is presented with a different CS, B (Pavlov, 1927). The model explains the acquisition of conditioned inhibition as the increase in negative associative strength between the E units of X and the US. However, unlike the Rescorla–Wagner (1972) model, AMAN does not predict extinction of inhibition when an inhibitor is presented alone, without the US. This is because the E units of the US cannot become negatively activated, and therefore there is no rise in E units of the US when an inhibitor is presented alone, and thus there is no opportunity for any increment in strength between the E units of the inhibitor and the US.
As described by Harris and Livesey (2010), AMAN predicts the summation of conditioned responses when two separately conditioned CSs are presented simultaneously as a compound (Andrew & Harris, 2011; Kehoe, 1982; Rescorla, 1997). This is because, like many associative models, it assumes that activation of the US representation is based on the summed associative input from each CS present on that trial. However, the amount of summation between two CSs will be reduced to the extent that they suppressively normalize one another’s activation. Since normalization is assumed to be greater between stimuli from the same modality than between stimuli from different modalities, this accounts for the observation that summation is greater between CSs from different modalities than between CSs from a single modality (Kehoe, Horne, Horne, & Macrae, 1994; Thein, Westbrook, & Harris, 2008).
The summing of associative strengths affects both responding and learning, and therefore the model explains a variety of effects that demonstrate the competition of cues during learning. For example, AMAN is equivalent to other models that use a summed error term (e.g., Rescorla & Wagner, 1972) as the means by which they explain the following keystone phenomena: overshadowing (weaker conditioning to a CS when it is conditioned in compound with another stimulus than when it is conditioned in isolation); blocking (weaker conditioning to a CS when it is conditioned in compound with a previously conditioned CS than when it is conditioned in compound with a neutral stimulus); unblocking by increasing the magnitude of the US (Kamin, 1968); the US preexposure effect (weaker conditioning when a US has been repeatedly presented on its own, prior to CS–US pairings; Randich & LoLordo, 1979; Wagner, 1969); and relative validity (superior conditioning to a CS when it is conditioned in compound with a stimulus that is less well correlated with the US; Wagner, Logan, Haberlandt, & Price, 1968). As such, it should not be surprising that the model accounts for both the decrease in responding when two separately conditioned CSs are conditioned in compound (“overexpectation”) and the facilitation of learning about a CS when it is paired with a conditioned inhibitor during conditioning (“superconditioning”) (Garfield & McNally, 2009; Lattal, 1998; Pearce & Redhead, 1995; Rescorla, 1971). The model can also account for faster learning with longer intertrial intervals (Gibbon, Baldock, Locurto, Gold, & Terrace, 1977; Spence & Norris, 1950) in terms of differential extinction of the context over the intertrial interval, thereby permitting faster discrimination between the presence and absence of the CS.
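Because the model's cue-competition predictions follow the Rescorla–Wagner summed error term, a plain Rescorla–Wagner simulation illustrates the logic of blocking; the learning rate, asymptote, and trial counts below are arbitrary illustrative choices, not the parameters of AMAN:

```python
# Minimal Rescorla-Wagner simulation to illustrate blocking.
def rescorla_wagner(trials, alpha=0.3, lam=1.0):
    """trials: list of (cues_present, reinforced) pairs.
    Each trial updates every present cue by alpha times the summed error."""
    V = {}
    for cues, reinforced in trials:
        error = (lam if reinforced else 0.0) - sum(V.get(c, 0.0) for c in cues)
        for c in cues:
            V[c] = V.get(c, 0.0) + alpha * error
    return V

# Blocking group: A+ alone, then AB+. Control group: AB+ from the outset.
blocking = rescorla_wagner([(("A",), True)] * 20 + [(("A", "B"), True)] * 20)
control = rescorla_wagner([(("A", "B"), True)] * 20)
```

In the blocking group, A absorbs nearly all the available associative strength before B is introduced, so the summed error on AB+ trials is close to zero and B ends with little strength; in the control group, A and B share the strength roughly equally.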
Like other elemental models, the Harris and Livesey (2010) model predicts that conditioned responding to one CS will generalize to a second stimulus to the extent that the two stimuli share common elements (Guttman & Kalish, 1956). It also uses a learning rule that functions in a fashion similar to a summed error term, and thus provides a means to achieve perfect performance on a simple discrimination (S+ vs. S–), despite common elements shared between the two stimuli (Pavlov, 1927), because elements unique to S– acquire inhibitory strength that cancels the generalized excitatory strength from S+. (These mechanisms have also been shown to account for peak shift effects following discrimination training of S+ vs. S–; Blough, 1975; Hanson, 1959; Terrace, 1968.)
Sensory preconditioning and second-order conditioning
Finally, like many similar elemental models, AMAN explicitly assumes that excitatory and inhibitory associations are acquired between CSs, as well as between a CS and a US. Therefore, the model is capable of accounting for evidence of sensory preconditioning (responding to a stimulus that was initially paired with another stimulus before the latter was conditioned) and second-order conditioning (responding to a stimulus that was paired with a CS after the latter had been conditioned; Brogden, 1939; Rizley & Rescorla, 1972). Nonetheless, in either case, the activation of US elements by a second-order CS is weak, because it depends on associative activation of E units of either the primary CS (Rizley and Rescorla showed that the activation of the primary CS contributes little to the responding to the second-order CS) or the US, and such associative activation is much weaker than direct activation by the stimulus itself.
Phenomena arising from the interactions between elements
AMAN describes mechanisms by which stimulus elements interact to influence one another’s activation. As described earlier, elements suppress (normalize) the activity of E units of other elements both by direct activation of inhibitory units and via the normalizing influences between their attention units. E units within and between CSs also develop excitatory and inhibitory associations as a consequence of stimulus exposure. As reviewed below, these influences allow stimuli to interact in ways that enable the model to explain a number of empirical findings that are otherwise beyond the scope of simple elemental models.
External inhibition and overshadowing
The normalization between CS elements was originally invoked to allow the simple elemental approach to solve nonlinear discriminations in which the associative strength of the CS compound does not equal the summed associative strength of the individual CSs, such as negative patterning (A+, B+, AB–) and the biconditional discrimination (AB+, CD+, AC–, BD–). The way that this is achieved, and its success in dealing with the more complex patterning discriminations tested by Pearce and colleagues (e.g., Pearce, 1994), has been dealt with at length by Harris (2010) and Harris and Livesey (2010) and will not be discussed in more detail here. Nonetheless, it is instructive to consider two empirical findings on this topic that the model does not account for as successfully as its forerunner (Harris, 2006). The current implementation of the Harris and Livesey (2010) model does not predict that the biconditional discrimination would be more difficult than negative patterning, as shown by Harris and Livesey (2008) and Harris, Livesey, Gharaei, and Westbrook (2008), and it incorrectly predicts that a redundant cue will facilitate rather than impede the acquisition of negative patterning (Pearce & Redhead, 1993; Rescorla, 1972).
Unequal learning to two CSs
As described by Harris (2006), one important feature of a simple elemental model like AMAN is that it can explain some recent reports by Rescorla showing unequal learning to each of two CSs conditioned in compound if those CSs start with different conditioning strengths (Rescorla, 2000, 2001, 2002). For example, in a blocking design in which stimulus A is conditioned prior to conditioning of the compound AB, Rescorla (2001) showed that less is learned about A than about B during those compound training trials. This finding is problematic for associative models that assume that any learning that takes place during the AB conditioning trials will be equally shared by A and B. The elemental model initially proposed by Harris (2006), and subsequently developed by Harris and Livesey (2010), can account for Rescorla’s (2001) result. To explain how this works, consider a very simplified example in which A, B, and the US each comprise three elements, and in which each element of A and B is connected to just one element of the US (this is a simplified illustration of the partial connectivity assumed to exist between CS and US elements). During the initial conditioning, each element of A acquires associative strength with one element of the US. When A is subsequently presented in compound with B, one of its elements is suppressed by B (and one, but not necessarily the same one, of B’s elements is suppressed by A). Thus, some of A’s associative strength is lost (as described already to account for external inhibition), and one US element is now available for further conditioning. The important point is that any conditioning that can occur during AB trials is confined to that one US element. However, because this US element is connected to the one A element that has been suppressed, A has no opportunity to increase its associative strength with the US. 
In contrast to this situation for A, there is no preordained match between B’s suppressed element and the relevant US element, and therefore B is in a better position than A to acquire whatever associative strength is available. Our simulations with AMAN confirm that this outcome holds even when each stimulus is represented with a larger number of elements and the connections between the E units of the elements are random rather than one to one.
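The logic of the three-element example above can be made explicit with a toy computation. The element names, the one-to-one wiring, and the choice of which element is suppressed are all hypothetical illustrations of the simplified case described in the text:

```python
# Toy version of the simplified example: after Phase 1, each of A's three
# elements predicts exactly one US element (one-to-one partial connectivity).
links_A = {"a1": "us1", "a2": "us2", "a3": "us3"}

def available_learning(suppressed_A_element):
    """On an AB trial, which US element is underpredicted, and can any
    still-active A element reach it?"""
    # The suppressed A element's US partner loses its prediction and so
    # becomes available for further conditioning.
    free_us = links_A[suppressed_A_element]
    active_A = set(links_A) - {suppressed_A_element}
    # A can only learn about free_us through an element that is both active
    # and connected to it; by construction, no such element exists.
    A_can_learn = any(links_A[a] == free_us for a in active_A)
    return free_us, A_can_learn
```

Whichever A element is suppressed, the freed US element is always the one A cannot reach, so the available learning accrues to B's active elements instead. B's suppressed element has no such preordained match with the freed US element, which is the asymmetry driving the model's account of Rescorla's (2001) result.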
Timing of the conditioned response
Two limitations must be acknowledged in regard to the way that AMAN deals with the timing of conditioned responses. First, the simulations presented here have all involved delay conditioning in which the US is presented when the CS terminates. The model does not perform well in simulating conditioning when the US is presented early during a prolonged CS. Such schedules produce a large backward pairing (CS after US) that undermines the development of net excitatory associative strength between the CS and the US in the model’s simulations. Second, while AMAN can effectively simulate inhibition of delay (Pavlov, 1927), we note that it is not able to simulate the temporal specificity of the conditioned response beyond the time of the US (Davis et al., 1989; Kehoe & Joscelyne, 2005; Roberts, 1981; Smith, 1968; Williams et al., 2008). For example, if rats are conditioned with a US that occurs midway through a long CS, the conditioned response not only rises during the CS as the scheduled time of the US approaches, but it also falls as time passes beyond the scheduled time of the US, as assayed on probe trials in which the US is omitted (e.g., Williams et al., 2008). It is clear from Fig. 6 that this fall in the conditioned response beyond the scheduled time of the US is not reproduced by AMAN; instead, the CS continues to activate US elements well beyond the expected US time. Our analysis suggests that this is because, as time elapses, the activation profile of CS elements becomes more stable, thereby reducing the opportunity for inhibitory strength to be acquired by CS elements that are more strongly activated after the US than during it.
Latent inhibition and perceptual learning
Finally, Harris and Livesey (2010) showed that AMAN can simulate the latent inhibition that results from repeated presentations of a stimulus prior to its conditioning, as well as the superior discrimination between two CSs with overlapping elements that results if the CSs have been preexposed prior to training (so-called perceptual learning). However, the model does not account for differences in perceptual learning when those CS preexposures are intermixed (A . . . B . . . A . . . B . . .) rather than blocked (A . . . A . . . A . . . B . . . B . . . B).
Phenomena beyond the explanatory scope of the model
A number of phenomena are not explained by AMAN. Some of these describe aspects of conditioned responding that are simply outside the scope of any of the constructs used by the model to capture learning. For example, because the model identifies CS–US conditioning with the associative activation of US elements, rather than with changes in the CS elements, it does not explain how different CSs might influence the form of conditioned responses (Holland, 1977), nor does it account for changes in the associability of CSs in relation to their history of pairing with the US (e.g., Hall & Pearce, 1979). And, like the associative models that it borrows from, its commitment to predicting overshadowing prevents AMAN from being able to explain potentiation (the enhancement of conditioning to a CS when it is reinforced in compound; Batson & Batsell, 2000; Westbrook, Clarke, & Provost, 1980; Westbrook, Homewood, Horn, & Clarke, 1983), and its commitment to the mechanisms that produce blocking prevents it from explaining evidence for unblocking by a decrease in US magnitude (Dickinson & Mackintosh, 1979; Dickinson, Mackintosh, & Cotton, 1980). In much the same vein, because its learning mechanisms are tuned to positive or negative contingencies between CS and US, it does not explain what learning takes place when the CS and US are truly uncorrelated (Bonardi & Hall, 1996). Finally, the model is not equipped with any process that can account for the effect of the passage of time in recovering conditioned responding after extinction (Pavlov, 1927) or following latent inhibition (Westbrook, Jones, Bailey, & Harris, 2000).
These challenges notwithstanding, we have shown here that the conditioning model proposed by Harris and Livesey (2010) is well equipped to explain a large number of empirical observations regarding Pavlovian conditioning and discrimination learning. In this sense, the model clears many hurdles. But developing a model of this type does require specifying many details that may make the model appear overly elaborate, or even cumbersome. Our intention here has been to take simple concepts that underlie an elemental approach to conditioning and to develop these to a natural conclusion, and the present exercise shows how such a model fares in the broad experimental arena. We believe that the model has acquitted itself well in this showjumping exercise, even if it were not to score full marks in the dressage.