Introduction

This debate with Frank Ohl and Henning Scheich (O&S) is about associative learning and neurophysiological plasticity in the primary auditory cortex (A1). The traditional view that A1 (and other primary sensory cortices) are only stimulus analyzers has not been tenable for some time. A major focus of current research is to understand the nature of learning-related plasticity in the primary auditory cortex, including its (a) forms and (b) functions. O&S have criticized us on both issues, which are addressed in turn. Some disputes concern appropriate experimental designs, while others deal with more general problems. However, all of the issues have broad implications, so this debate can best serve as a vehicle for consideration of several key topics in the neuroscience of learning/memory and sensory/perceptual processes [1, 2]. An important aspect of this debate is to delineate and examine assumptions because many controversies ultimately reflect different assumptions that are embedded within implicit theoretical frameworks.

Theoretical framework and assumptions

Before examining the specific issues, it is appropriate that we make explicit our theoretical framework. While it does not bear directly on the interpretation of every experimental finding, it does differ from that of O&S. For example, they view memory as tightly linked to behavioral indices of learning. Thus, they assume that memory traces develop only in areas whose destruction produces behavioral deficits (see also Point #6C). We have a different position (see below).

We have made extensive use of “simple” one-tone classical conditioning because it is the most fundamental form of associative learning, and thus, we assume that it is an appropriate starting place. We assume that the total memory of a given experience is multidimensional and therefore is distributed in the brain. We also presume that any cortical neuron participates in numerous memories as a component of innumerable complex networks. As memories have specific and detailed sensory content, we suppose that auditory memories are likely to be stored, at least in part, within recognized structures of the auditory system. We further assume that circuitry which is sufficient for some aspects of a “simple” association can be completely subcortical, so that it can survive cortical lesions. However, we also believe that cortical areas can store information in parallel with subcortical systems. In addition, we presume that cortical memory traces are more complex than subcortical traces. The former have access to a much greater range of information than the latter, so cortical traces could be used in a highly flexible manner to support adaptive behaviors in an unknown future. We think it likely that neural mechanisms of all learning-related A1 plasticity involve both stimulus-driven (“bottom-up”) and goal-driven (“top-down”) influences. Thus, the particular form of plasticity that is expressed by neurons after learning depends both on current stimulus parameters and the situation in which the subjects are being “interrogated” by experimenters. Therefore, A1 (or any other cortical field) is not “fixed” in a given expression of plasticity and is thus not severely limited in its capacity to store different memory traces. Finally, we believe that A1 works with other auditory circuitry to extract behaviorally meaningful information from the acoustic environment and use that information to meet whatever may be the current behavioral opportunities and challenges. However, virtually nothing is presently known about learning-related integration among cortical auditory fields.

Associative learning and perceptual learning are different

Associative learning (sometimes called “content learning” [3]) and perceptual learning are related but different. “Associative learning” simply refers to the acquisition of information that two events occur nonrandomly, usually with one preceding the other. Classical conditioning is the most basic form of associative learning, in which a conditioned stimulus (CS, e.g., tone) is followed by an unconditioned stimulus (US, e.g., shock or food). The subjects learn and remember that the CS predicts the US. “Instrumental conditioning”, which is based on prior classical conditioning, consists of learning to perform a particular behavioral response (e.g., key press) when presented with a CS, to obtain a reward or avoid a noxious stimulus. The subjects learn and remember that the specific instrumental response in the presence of a particular CS will produce a certain reinforcement (e.g., food). Associative learning always has “content”, i.e., associations are about the relationships between some particular experiences. Such associations enable animals and humans to learn the “causal fabric” of their environments [4].

“Perceptual learning” consists of first learning to discriminate between two different signal stimuli (e.g., tones), one of which is designated as “correct” by the experimenter. After an easy discrimination has been achieved, increasingly difficult discrimination problems are used, until there is no further improvement. Training usually involves hundreds to thousands of trials over many days. The result of perceptual learning is to improve perceptual abilities on the training dimension, such as improving acuity for frequency discrimination. However, unlike basic associative learning, perceptual learning is not presumed to include “perceptual memory”, i.e., to yield specific memories for each of the many paired frequencies used throughout the extensive period of training [5]. Regardless of the duration of training, subjects ordinarily first learn basic associations, i.e., classical conditioning, and then learn to make a response contingent on an acoustic stimulus and a reinforcer, i.e., simple instrumental conditioning (Fig. 1).

Fig. 1

Comparison of typical protocols for associative learning and perceptual learning. Whereas associative learning produces specific memories, perceptual learning produces an increase in acuity for the trained dimension without necessarily specific memories of each of the discriminative stimulus values. The error signal (feedback) for responses to a CS is illustrated for completeness but is not invariably used in studies of perceptual learning.

There are two reasons within the current debate to emphasize the differences between associative learning and perceptual learning. First, associative learning usually precedes perceptual learning. For example, subjects generally first learn that a sound is followed by food before they learn which particular sound frequency (CS+, reinforced; CS−, non-reinforced) is followed by food. Thus, when neural correlates of perceptual learning are obtained, the brain (in the present case, the primary auditory cortex) may have previously developed associative plasticity. Such associative plasticity could be the foundation upon which plasticity for perceptual learning develops. Second, some workers treat all auditory learning as “perceptual learning” because it occurs in a “perceptual system”. Ohl and Scheich appear not to distinguish between associative learning and perceptual learning. Therefore, they seem to assume that mechanisms proposed for associative learning are intended to account for perceptual learning as well [6]. Their critique of our findings on such a basis is not further considered in this debate, as there is no need for us to defend a position that we have not held. Our research has concerned associative plasticity, and it is therefore to this topic that we now turn.

Background

Learning-induced plasticity in the primary auditory cortex was discovered by Galambos et al. [7]; classical conditioning increased the amplitude of evoked potentials in A1 as cats learned a sound–airpuff association. Subsequently, other laboratories extended investigations to various learning situations and increasingly replaced evoked potential recordings with multiple-unit activity. The findings were the same, viz., that sounds which gained behavioral importance also evoked greater activity. The next development involved single unit recording. This research revealed a more complicated situation: a certain proportion of A1 cells increased their discharges (as expected), but invariably, another proportion decreased their discharges. While these effects were shown to be associative, they made little functional sense (reviewed in Weinberger and Diamond [8]).

Our own studies of classical fear conditioning (sound-shock pairing) in both A1 and the little understood adjacent “secondary field” (A2) yielded similar mixed effects [9, 10]. For example, we found that during training trials, equal numbers of single units in A1 developed increased and decreased responses to the acoustic CS. Such findings convinced us that simply recording cellular activity during training trials yielded data that were both difficult to interpret and too limiting.

The interpretive difficulties stem from the fact that although associations are formed during training trials, non-learning factors are invariably present, due at least to the presence of positive (reward) or negative (punishment) reinforcements. Such factors include changes in attention, arousal level, motor planning, and motor performance. For example, as subjects learn that they face a problem posed by experimenters, their arousal level is likely to increase. It may well remain high or even increase further during early stages of the training experience, before they are able to solve the challenge posed. But as they achieve a solution, e.g., learn the predictive relationships between a CS and US, their excitability generally declines. Indeed, in some cases, subjects will sleep between trials, as when dogs that had solved a shuttlebox avoidance problem leisurely awoke to the CS presentation and performed the required response in plenty of time to avoid shock [11].

Because of such performance factors, Rescorla has emphasized the dangers of relying on data obtained during training trials to infer the strength of learning and those aspects of an experience that enter into memory. Rather, these attributes are best determined by appropriate post-training assessments of behavior [4]. This counsel is equally applicable to plasticity that develops during training trials. Thus, even when the development of plasticity during training trials can be attributed to associative factors (e.g., by use of a sensitization control group), the form and magnitude of this neural correlate do not necessarily reflect only associative processes, but rather, are likely to be a mixture of associative and non-associative factors. That is why plasticity obtained during training trials can be so different in sign (increase vs decrease) from plasticity obtained in post-training assessments outside of the training context [12, 13].

Although not universally recognized, there is a substantial difference between demonstrating that a neural correlate of learning is associative and determining the nature and degree of the specificity of plasticity. The former is a first step. The latter provides information on how the representation of information is altered by associative learning. Primary sensory cortices are advantageous regions of the brain in which to address this issue of specific representational plasticity because their cells have reliable receptive fields and the primary auditory, somatosensory, and visual cortices contain well-characterized, systematic topographical “maps” of their individual sensory epithelia.

This consideration brings us to the second limitation of information that can be gleaned from limiting recording to training trials. The specificity of plasticity cannot be ascertained because training involves only one or two sensory stimuli. For example, in “simple” acoustic conditioning, there is a single auditory CS paired with a US. In discrimination learning, two sounds are employed, one of which (the CS+) is paired with the US, the other (the CS−) is not. While successful discrimination training demonstrates that subjects have learned to distinguish the stimuli in question, two stimuli are insufficient to determine the actual degree of specificity of any resultant neural plasticity.

A unified experimental design: a synthesis of two disciplines

The problems of confounding performance factors and the inability to determine the specificity of plasticity more precisely require an experimental design different from standard training trials. A key goal of a new approach was to determine the effects of learning on the frequency receptive fields (RF) of cells in the auditory system, particularly in the primary auditory cortex. The basic design is simple: (1) obtain tuning curves; (2) train subjects using any frequency except the best frequency (peak of tuning curve) as a signal for reward or punishment [e.g., conditioned stimulus (CS) in classical conditioning]; (3) obtain post-training tuning curves. Additional post-training RFs can be obtained later to determine the long-term retention of any induced plasticity. (For sake of exposition, findings from classical conditioning will be used, but the approach and findings also apply to other types of training.) Fig. 2 summarizes both this and related designs.
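
The pre/post comparison at the heart of this design can be sketched in code. The following is a minimal illustration only, not the analysis pipeline of any cited study: the function names, the dictionary representation of a tuning curve, and all spike rates are hypothetical. It computes an RF difference function (post minus pre-training) and locates the frequency with the largest absolute change, so a CS-specific decrease would be detected as readily as a CS-specific increase.

```python
def rf_difference(pre, post):
    """Post-minus-pre response difference at each test frequency."""
    if pre.keys() != post.keys():
        raise ValueError("pre and post tuning curves must share frequencies")
    return {f: post[f] - pre[f] for f in pre}


def best_frequency(tuning):
    """Frequency (kHz) that evokes the largest response in a tuning curve."""
    return max(tuning, key=tuning.get)


def largest_change_frequency(diff):
    """Frequency with the largest absolute change, regardless of sign."""
    return max(diff, key=lambda f: abs(diff[f]))


# Hypothetical spike rates (spikes/s) keyed by tone frequency in kHz.
pre = {0.75: 40.0, 1.0: 30.0, 2.5: 10.0, 5.0: 5.0}
post = {0.75: 15.0, 1.0: 20.0, 2.5: 45.0, 5.0: 6.0}
cs_khz = 2.5  # assumed CS frequency; chosen away from the pre-training BF

diff = rf_difference(pre, post)
peak = largest_change_frequency(diff)

print("pre BF:", best_frequency(pre), "post BF:", best_frequency(post))
print("largest change at CS:", peak == cs_khz, "sign:", "+" if diff[peak] > 0 else "-")
```

With these made-up values, the best frequency moves from 0.75 to 2.5 kHz, and the largest change in the difference function is an increase at the assumed CS frequency.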

Fig. 2

Schematic summary of experimental designs employed in the neurophysiological study of learning and the auditory cortex. Depicted are four basic designs [14] and their treatments during three experimental periods: “Pre” (before training), “During” (during training) and “Post” (after training). 1 Standard training, in which recordings are obtained only during training trials, e.g., [10]. 2 Unified design (see text). Pre–Post designs 2a and 2b illustrate the fact that any training paradigm can be used. 2a shows single tone conditioning (e.g., [22]), and 2b illustrates two-tone discrimination conditioning (e.g., [20]). 3 Modifications of the unified design for cases in which pre-training data cannot be obtained, e.g., complete mapping of the cortex. Designs 3a and 3b also illustrate the fact that any training paradigm can be used with a post-design. 3a illustrates the case of single tone conditioning, whereas 3b shows an example of two-tone instrumental training, in which reward is contingent upon the correct response, i.e., one response if the two tones (S1 and S2) are the same and another response if they are different (responses not shown). In 3a, the “x” in the post-period signifies sacrifice of the animal for 2-DG analysis after repeated presentation of a conditioned stimulus (e.g., [45]). In design 4, the “CS/US” denotes that one of the frequencies in a series of tone bursts is designated as the conditioned stimulus and is paired with shock; the serial order of tones is random from one sequence to another (e.g., [14]). The repeated vertical lines represent presentation of tone bursts. The dotted lines in the post-period for designs 2 and 4 indicate that additional post-periods can be used to determine long-term retention, etc. Illustrations are not to scale.

There are two additional requirements. First, pre- and post-training data must be obtained in the same context, so that any differences between the RFs can be attributed to the intervening training and learning rather than to any collateral factor, such as the effects of novelty in a new post-training environment.

Second, training must take place in a different context from that in which pre- and post-training RFs are obtained. This reduces or eliminates the possibility of experimental extinction during the post-training period, when the training reinforcer (e.g., food or shock) must be absent. It also prevents transfer of any contextual learning from the training situation to the post-training environment, which would affect a subject’s state of arousal, fear or expectation; these would render the post-training state of the subject different from the pre-training state.

Of course, there must be behavioral verification that learning has occurred, and this can be obtained during training. The absence of behavioral conditioning in a control group that receives the conditioned and unconditioned stimuli (in an unpaired or random manner) indicates that effects obtained from a CS–US pairing protocol are due to associative processes.

Evidence of the lack of transfer from the training period to the post-training period can be obtained by showing that while subjects give learned behavioral responses to the signal frequency during training, the same frequency presented post-training elicits no such behavior. An effective method to avoid transfer is to also change the acoustic context. For example, when training consists of a 20-s tone of a single frequency followed by reinforcement, with intervals between trials averaging many (e.g., 60) seconds, RFs are obtained by presenting many brief (e.g., 100 ms) tones of different frequencies at a high rate (e.g., 2/s). When so tested, subjects do not respond to the CS frequency during RF determination [12].

The first study to use the unified design with contextual control investigated classical conditioning in the cat and studied the secondary (A2) and ventral ectosylvian (VE) auditory cortical fields [13]. Post-conditioning frequency RFs revealed CS-specific associative plasticity that could be either an increase or a decrease in response. Of particular note, the sign of plasticity that was observed during training trials usually differed from that which was evident in post-training tuning curves, e.g., increased response to the CS during conditioning vs decreased CS-frequency specific response in receptive fields. Therefore, the precautions of the unified design were well advised, as plasticity observed during actual training trials does not predict the form (direction of CS-specific change) in receptive fields. All subsequent studies in our and other laboratories have investigated the primary auditory cortex.

The dominant finding in our laboratory has been a CS-specific increase in response magnitude (rate of discharge or amplitude of evoked potentials). Examples of receptive fields and CS-specific RF plasticity are given in Figs. 3 and 4. A summary of the effects of conditioning (CS–US pairing), sensitization and habituation (tone presented repeatedly alone) is given in Fig. 5.

Fig. 3

Classical conditioning produces tuning shifts. An example of a complete shift of frequency tuning of a single cell in A1 of the guinea pig from a pre-training best frequency (BF) of 0.75 kHz to the CS frequency of 2.5 kHz after 30 trials of tone-shock pairing, during which the guinea pig developed a cardiac conditioned response. Inset shows pre- and post-training poststimulus time histograms (PSTHs) for the pre-training BF and the CS frequencies.

Fig. 4

Associative processes favor responses to the frequency of the CS in a variety of circumstances. Single unit recordings from A1 of the guinea pig. a Double-peaked tuning, with pre-training BFs at 5.0 and 8.0 kHz. The CS was selected to be 6.0 kHz, a low point. After conditioning (30 trials), responses to the CS frequency increased to become the peak of tuning. b A cell that exhibited minimal or no response to tones before tuning developed tuning specifically to the CS frequency after conditioning (30 trials).

Fig. 5

Summary of the effects of a conditioning, b sensitization, and c habituation on frequency receptive fields in the primary auditory cortex of the guinea pig. Data are normalized to octave distance from the CS frequency (a), the pre-sensitization best frequency (b), or the repeated frequency (c). Note that conditioning produces a CS-specific increased response, whereas sensitization (tone-shock or light-shock unpaired) produces general increases across the spectrum. Habituation produces frequency-specific decreased response.
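
The frequency-axis normalization mentioned in this caption can be illustrated briefly. Octave distance between two frequencies is the base-2 logarithm of their ratio; the sketch below applies that standard definition to re-express a tuning curve relative to a reference frequency (the CS, the best frequency, or the repeated tone). The function names and response values are hypothetical, and any normalization of response magnitude used in the original analyses is not reproduced here.

```python
import math


def octave_distance(freq_khz, ref_khz):
    """Signed distance in octaves from ref_khz to freq_khz."""
    if freq_khz <= 0 or ref_khz <= 0:
        raise ValueError("frequencies must be positive")
    return math.log2(freq_khz / ref_khz)


def normalize_to_reference(tuning, ref_khz):
    """Re-key a tuning curve by octave distance from a reference frequency."""
    return {round(octave_distance(f, ref_khz), 3): r for f, r in tuning.items()}


# Hypothetical response changes keyed by tone frequency in kHz.
change = {1.25: -2.0, 2.5: 12.0, 5.0: -1.0}
aligned = normalize_to_reference(change, ref_khz=2.5)
print(aligned)
```

Aligning curves this way lets cells trained with different CS frequencies be pooled on a common axis centered on zero octaves.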

We can now turn to the issues in dispute.

Form of associative representational plasticity

  1.

    Rejection of our findings

    This issue, according to O&S, is whether associative representational plasticity (hereafter “ARP”) involves specific increases or specific decreases in response to the CS. They reject the CS-specific increases we observed, claiming that they were tainted by contextual factors.

    “...learning-induced changes in the CS+ representation have been shown to be critically dependent on delicate contextual circumstances in other experiments (Weinberger and Diamond, 1986; Diamond and Weinberger, 1989...). Hence, we took care to develop an experimental protocol which minimizes contextual influences that are not under experimental control.” [14] [page 1012].

    Rebuttal to 1. The basis for this critique could reflect two conjoint assumptions by O&S: (a) that the form of plasticity during training trials reflects only associative factors; (b) that the sign of plasticity must be the same during and after training trials.

    We deliberately imposed strikingly different acoustic contexts for the periods of receptive field determination vs training (conditioning) periods, to avoid experimental extinction and contextual transfer, as explained above. This feature is the essence of the new unified design. The phrase of O&S “...delicate contextual circumstances...” is misleading; there was nothing “delicate” about them. As explained above, the unified experimental design benefits from using markedly different acoustic contexts between training and RF determination (Table 1). Moreover, these contextual influences were most definitely under our control.

  2.

    Claim That We “Selected” Findings of CS Increases and Ignored CS Decreases

    O&S criticize us for focusing on CS-specific increased responses, and presumably, the accompanying tuning shifts to and toward the CS frequency. They assert that we have selectively ignored CS-specific decreased responses [6, 14].

    “In the case of auditory single-unit plasticity studied with extracellular recording, a phenomenon that has gained wide attention is the learning-induced increase in the CS+ evoked spike rate. Most of the currently held views of auditory cortical plasticity are based upon this phenomenon (Weinberger, 1990a, b). This selective view [ital. added] tends to neglect two pieces of experimental evidence which are either explicitly mentioned in some of these studies or, in other cases, can be taken from the published figures. The first is the existence of sometimes large changes in tone-evoked activity in response to frequencies other than the CS (Diamond and Weinberger, 1986, 1989); the other is the existence of reduced CS responses after training [9, 10].” [14], [page 1014].

    This critique needs to be separated into three parts. The first concerns changes in response to frequencies other than the CS. The second is about decreased responses to the CS during training trials. Closely related is the third: CS-specific decreased responses in frequency receptive fields after training.

  2A.

    Response changes to non-CS frequencies

    O&S criticize us for allegedly failing to account for “...sometimes large changes in tone-evoked activity in response to frequencies other than the CS in post-training RFs [12, 13].”

    Rebuttal to 2A. O&S are referring to our first RF study, which did not concern primary auditory cortex, but rather, secondary (A2) and ventral ectosylvian (VE) fields. As our model [2, 15] and all other studies concern primary auditory cortex, this critique reflects the failure of O&S to distinguish between A1 and fields that differ from it greatly in functional organization (see also rebuttal to 2C, below). Furthermore, their point is irrelevant, whether for A1 or any other part of the auditory system, because we have not held that large changes must be confined to the CS frequency (see also rebuttal to 3).

  2B.

    Decreased responses to the CS during training trials

    O&S are referring to decreased responses to the CS in single unit studies of plasticity during training trials in A1 and also secondary auditory cortex (A2), before we had started to use the unified design to study RF plasticity [9, 10].

    Rebuttal to 2B. Of course, we observed decreased responses to the CS and increased responses to the CS during training trials. As explained above, the mixed sign of response changes to the CS was a major motivation to undertake RF studies, given the realization that data obtained during training do not reflect associative processes alone and cannot reveal the degree of specificity of plasticity. The unified design avoids basing conclusions on response changes during training trials, but rather compares receptive fields before and after training when the subjects are in the same state and in an acoustic context different from that of training.

    In short, the fact of decreased (and increased) responses to the CS during training trials is irrelevant to the issue of the sign of CS-specific plasticity in post-training receptive fields. This critique seems to be based on the assumption of O&S that periods of training trials and RF determination are functionally the same. If so, their critique illustrates a failure to adequately appreciate the distinction, e.g., that state and other performance factors are endemic to training trials. The critique also seems to embody the assumption that the sign of plasticity must be the same during training and post-training RF determination. These assumptions would explain why O&S consider decreased responses to the CS during training trials to be contradictory to the dominant findings of CS-specific increased responses to the CS in post-training receptive fields.

  2C.

    Receptive field plasticity after training trials

    In point 2B, O&S may have intended to refer to CS-specific decreased responses in our first studies of RF plasticity [12, 13]. These particular findings appear directly contradictory to our observations of only CS-specific increased responses in A1.

    Rebuttal to 2C. However, these recordings were not obtained from the primary auditory cortex; they were obtained from the secondary auditory (A2) and the ventral ectosylvian (VE) fields. The mixed outcome of RF plasticity (CS-specific increased and decreased responses) in A2 and VE remains a valid observation. Unfortunately, no other laboratories have studied RF plasticity in these auditory fields, and we, like other laboratories, have since concentrated on A1 because more is known about its basic processing of sound and its functional organization than any other auditory field. We have focused on CS-specific increased responses (and tuning shifts) in A1 simply because this is the predominant finding in the primary auditory cortex (e.g., [16–27]). Concordant with the findings of increased response to behaviorally important stimuli, it is noteworthy that studies of parameters other than frequency also have reported specificity of increased responses, e.g., sound intensity (level) [28] and the repetition rate of sound pulses [29]. Table 2 summarizes findings from our laboratory.

  3.

    Claim that our measures are biased to detect only CS-specific increased responses

    O&S argue that our analyses of the effects of learning on receptive fields are biased to permit detection only of CS-specific increased responses.

    “The criterion for frequency-specific plasticity used by Weinberger, namely the requirement that learning-induced changes of firing probability at the training frequency must exceed all other changes [ital. added], leads a strong bias towards the particular type of retuning described in the article.” [30].

    Rebuttal to 3. O&S are wrong because they are confused. In the paper to which they refer, a set of conservative criteria for CS-specific RF plasticity was used, to avoid false positive findings. One criterion was “...the largest change in the RF difference function [i.e., post minus pre-training RFs] had to be at the CS frequency ...” [22, page 275].

    The previous paragraph of this paper provides the formula which was based on the absolute difference, not a positive difference. Thus, the “largest change” could have been either a CS-specific increase or a CS-specific decrease. But CS-specific decreases were seldom observed in A1. However, the same analysis revealed some CS-specific decreases in the dorsal medial geniculate nucleus [31] and in A1 during avoidance conditioning [18]; see also Table 2. Therefore, our measures are sensitive to decreases and increases, and the claim of O&S can be rejected.

  4.

    O&S claim of CS-specific decreased responses in post-training receptive fields

    O&S performed an experiment in which they found CS-specific decreased responses after training. A single CS+ tone (paired with shock) was randomly intermixed with 11–30 different CS− (no shock) frequencies in a single training session. The same frequencies were presented before, during, and after training without break, so that the only information that training was underway was the presence of an occasional shock [14, 32] (Figure 2.4).

    Two assumptions seem to underlie the claimed validity of learning-induced CS-specific decreased responses in RFs. First, O&S assumed that their subjects learned this unique, difficult discrimination. However, there are no reports in the literature that a discrimination between one CS+ frequency and 11–30 other CS− frequencies can be learned, particularly in a single session. Second, O&S assumed that differences between RFs from pre- and post-training periods are attributable only to CS–US learning during training. Both assumptions are untenable.

  4A.

    Claim that subjects had learned the discrimination

    O&S provided no behavioral evidence that the subjects learned the difficult discrimination. Although the authors recorded heart rate, they chose to use extremely brief intertone/intertrial intervals of 0.25–3.0 s, thus precluding an opportunity to obtain behavioral evidence of discrimination. Had they employed accepted discrimination intertrial intervals (e.g., a minimum of ∼20 s), O&S might have obtained discriminative cardiac conditioned responses [20], which would have substantiated their claim of successful discrimination learning.

    In the absence of behavioral validation of discrimination learning, the authors resorted to indirect arguments to support their claims. Referring to their current results, the authors stated:

    “A similar rapidity in the development of plastic effects has been reported by Edeline et al. (1993). Long-term retention of receptive field plasticity was also recently reported by Weinberger et al. (1993). These results provide strong support for the argument that the type of plasticity we describe is indeed a correlate of learning. This is important to note since the nature of the study did not technically allow us to measure a stringent behavioural correlate of learning.” [page 1013].

    Rebuttal to 4A. The use of our findings to support their claim of behavioral learning is ironic, but more importantly, it is illogical. First, attributes of plasticity cannot validate behavioral learning. This constitutes a “category error”, i.e., confusing properties of the whole (the organism) with properties of a part (its cells) [33]. Second, the characteristic of rapid development of plasticity is not unique to learning. Even within the field of learning/memory, rapid changes are not unique to successful acquisition in conditioning. For example, they are equally typical of extinction. Of secondary interest, the studies O&S use to support their claims showed rapid development of plasticity in five trials [21] and long-term retention of 24 h to 8 weeks [19], neither of which was attained in the O&S study.

  4B.

    Claim that CS-specific decreased responses are associative

    O&S argue that the decreased response to the CS frequency was due only to associative processes that were operative during the training phase. They arrived at this conclusion by subtracting post-training RFs from pre-training RFs. This subtraction indeed did show decreased responses. However, the authors also realized that the post-training period constituted a formal period of experimental extinction, i.e., due to the removal of the shock unconditioned stimulus (Figure 2.4). Therefore, they attempted to avoid the effects of extinction by limiting statistical analyses to the first ten repetitions of the 11–30 different tone sequences used to obtain receptive fields.

    Rebuttal to 4B. The attempt to avoid the effects of extinction is problematic because extinction can develop within ten repetitions of a CS, if the subjects had learned the discrimination. In fact, the specificity of the decreased response to the CS frequency is consistent with this possibility. But as noted above, there is no way to resolve this issue on the basis of the experiments performed by the authors.

    Moreover, regardless of whether or not extinction had occurred, the subjects’ post-training state must have been different from pre-training because arousal and fear would have been higher due to the anticipation of shock after training. Therefore, the difference in neural response between the post-training and pre-training RF periods could be due to learning that had occurred during training, or to the change of state, or to both (Table 3). Thus, it is impossible to assign any RF changes exclusively to putative discrimination learning during the training period. This is precisely why the unified experimental design was formulated; it avoids such confounds by keeping the pre- and post-training periods identical, while changing the context during training (Figure 2.2a,b).
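
The logic of this confound can be illustrated with a minimal sketch. It assumes, purely for illustration, that the observed post-training response is the pre-training response plus an additive learning term and an additive confound term (change of state and/or extinction). The additive model, the function names, and all numerical values are hypothetical and are not taken from any of the studies discussed; the sign convention (post-training minus pre-training) is also an assumption.

```python
# Illustrative only: every number below is invented, and the additive model
# (observed change = learning term + confound term) is an assumption.

def rf_difference(post, pre):
    """Post-training minus pre-training response at each test frequency."""
    return [p - q for p, q in zip(post, pre)]

def observed_post(pre, training_learning, confound):
    """Hypothetical post-training RF under the assumed additive model."""
    return [r + l + c for r, l, c in zip(pre, training_learning, confound)]

# Hypothetical pre-training receptive field (spikes/s) at five test tones;
# index 2 stands in for the CS frequency.
pre_rf = [10.0, 14.0, 20.0, 14.0, 10.0]
zeros = [0.0, 0.0, 0.0, 0.0, 0.0]
cs_decrease = [0.0, 0.0, -6.0, 0.0, 0.0]

# Scenario A: the CS-specific decrease is entirely due to learning
# that occurred during training.
diff_a = rf_difference(observed_post(pre_rf, cs_decrease, zeros), pre_rf)

# Scenario B: no such learning; the same decrease arises from the confound
# (changed state and/or extinction during the post-training period).
diff_b = rf_difference(observed_post(pre_rf, zeros, cs_decrease), pre_rf)

# The subtraction alone cannot tell the two scenarios apart.
indistinguishable = diff_a == diff_b

# If pre- and post-training periods are kept identical (confound term zero),
# the difference reduces to the learning term.
unified_diff = rf_difference(observed_post(pre_rf, cs_decrease, zeros), pre_rf)

print(diff_a, diff_b, indistinguishable)
```

In this toy case both scenarios yield the same difference vector, so the subtraction cannot by itself attribute the decrease to learning during training. Only when the confound term is held at zero does the difference equal the learning term.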

Table 1 Designated contextual differences between training and RF determination
Table 2 Summary of the direction of CS-specific plasticity in studies of primary auditory cortex and medial geniculate nucleus from Weinberger lab employing the unified design
Table 3 Comparison of the factors of State and Learning in the experimental design of Ohl and Scheich [14, 32] and the “unified design” [e.g., 19] during the three experimental periods: pre-training, during training, and post-training

Functions of associative plasticity in primary auditory cortex

  1. 5.

    Memory traces, lateral contrast enhancement, and task dependency

    We suggest that CS-specific associative plasticity in A1 represents the storage of acquired information, i.e., is part of the substrate of auditory associative memory. O&S believe that a selective decrease in response to the CS frequency with enhancement of side-band frequencies serves as “lateral contrast enhancement” [14]. They also argue that plasticity is “task dependent”, which O&S say is incompatible with learning/memory functions. Referring to our findings (later replicated independently by Suga and associates (e.g., [2427]) and Gerstein’s laboratory [23]), O&S state the following:

    “The general finding in these experiments was that training shifted the best frequency (BF) of neurons towards the reinforced frequency ... The main argument is that this type of retuning constitutes memory storage in the service of different future adaptive behaviors, that is, it is not task-specific” [italics added]. [6] [pages 470–471].

    Rebuttal to 5. The veracity of CS-specific decreased responses in RFs during associative learning has already been examined and found to be wanting (see 4). It follows that “lateral contrast enhancement”, which is based on such findings, lacks sufficient empirical basis in associative learning. It might develop in perceptual learning (see below).

    According to O&S, plasticity is task-specific, which is incompatible with A1 plasticity as memory storage. Although O&S have not yet explicated the rationale for this alleged incompatibility, it would follow logically that if plasticity does reflect memory storage, then it cannot be task-specific. The evidence that CS-specific associative plasticity represents part of the substrate of auditory memory is quite strong. Thus, such plasticity has all of the major attributes of associative memory: it is associative, highly specific, discriminative, develops rapidly, exhibits consolidation (becomes stronger and more specific over days in the absence of further training), and exhibits long-term retention (weeks) (reviewed in [34, 35]). In addition, the amount of expanded representation of the CS frequency band in A1 is an increasing function of the level of behavioral importance of that stimulus, indicating that area of representation could serve as a “memory code” for the acquired behavioral significance of sound [16].

    We do not hold that task-specificity is incompatible with memory storage. Quite the contrary. Acoustic habituation is characterized by a specific decrease in response to the repeated tone [36] (Fig. 5C). Behavioral habituation certainly depends upon memory storage, and the sign of plasticity is a decrease rather than an increase. Therefore, the signs of plasticity are different in different tasks, indicating that task-specificity and memory storage are not incompatible.

  2. 6.

    O&S’s fallacious assertion of our stance on plasticity and auditory memory

    O&S accuse us of claiming that A1 plasticity is essential for all auditory learning and memory [30].

    “The review by Weinberger is of merit in pointing to the often neglected fact that learning-induced plasticity is found as early as primary sensory cortical areas. A main focus of the article is the attempt to establish a mechanistic relationship between suitably designed behavioural paradigms and plastic phenomena on the neuronal level of the primary auditory cortex. However, it should be pointed out that the article went too far in claiming the dependence of auditory memory on the type of neuronal plasticity that is described. [ital. added] Best frequency shifts as a form of retuning of neurons’ receptive fields, and the reorganization of the tonotopic map measured by best frequencies are neither sufficient nor necessary for the described types of auditory learning and memory. They are not sufficient because map reorganization induced by alternative means (electrical microstimulation) does not alter frequency discrimination performance, and they are not necessary because tonal memory can develop even after bilateral ablation of the auditory cortex ... On the other hand, in training paradigms in which the relevance of the auditory cortex has been positively shown [37] many mechanisms are required which cannot be accounted for by simple retuning of a unit’s best frequency.”

    This O&S critique involves three issues: (a) our position on the necessity of A1 plasticity, (b) more general issues concerning sufficiency and necessity, and (c) the role of A1 lesions and pure tones for claims of memory-relatedness.

  3. 6A.

    Our stance on cortical plasticity

    O&S, referring to a recent review [1], assert that we claim A1 plasticity is essential for auditory memory: “...went too far in claiming the dependence of auditory memory on the type of neuronal plasticity that is described.”

    Rebuttal to 6A. We have never made any claim that auditory memory depends on receptive field (or other) plasticity in A1, either in the review cited by O&S or in any other venue. Therefore, it is incumbent upon O&S either to substantiate or retract their assertion.

  4. 6B.

    Necessity and sufficiency

    O&S hold that A1 plasticity is neither necessary nor sufficient for all auditory memory.

    Rebuttal to 6B. We have never claimed that A1 plasticity is either sufficient or necessary for all auditory memory. Indeed, we first showed an instance in which CS-specific A1 plasticity was not sufficient, by increasing task difficulty in two-tone discrimination, such that while plasticity did develop, behavioral discrimination did not [20].

    However, O&S draw conclusions that are too sweeping. They cite the failure of cortical microstimulation to alter frequency discrimination performance [38] to conclude that cortical plasticity can never be sufficient for behavioral memory. This way of thinking is problematic for two reasons. First, the effects of microstimulation on cortical organization are highly transient so that cortical plasticity may well have dissipated before the poststimulation behavioral test (G. Gerstein, personal communication). Second, O&S dismiss tuning shifts and map expansions as ever being sufficient for behavioral memory based on one example of dissociation between plasticity and a single measure of auditory performance. But such a dissociation can only show that cortical plasticity is not sufficient for some aspect of auditory memory as measured in a certain way. In short, the argument from an example to the whole constitutes an invalid inductive leap.

    More importantly, O&S cast the issue of the relationship between cortical plasticity and learning/memory in too simple a dichotomy: either plasticity is sufficient or necessary, or not. But Nature is not often so dichotomous. A1 plasticity might be sufficient or necessary or both for some types of learning and memory but not others. The fundamental issue concerns the relationships between plasticities and memories, i.e., discovery of the principles governing the circumstances under which cortical plasticity of a certain type develops, its relations to types of information stored, and ultimately how they are integrated to guide and underlie thought and behavior. The positive relationship between level of motivation and area of representation is one demonstration that “yes or no” approaches fail to capture essential aspects of A1 plasticity.

  5. 6C.

    A1 lesions, pure tones and memory traces

    O&S strongly imply that lesions of A1 are critical for determining which tasks produce plasticities that are part of the substrate for auditory memory. They refer to a study in which A1 lesions impair discrimination of frequency-modulated (FM), but not pure tone, stimuli [37]: “... in training paradigms in which the relevance of the auditory cortex has been positively shown ...”. Given the context of their critique, the implication is a relative disregard for CS-specific increases and tuning shifts during tasks employing pure tones.

    Rebuttal to 6C. It is, of course, well-established that lesions of A1 can fail to reveal auditory-based deficits for pure tone tasks, as emphasized by O&S. However, that depends on the methods of training (for an important early review, see [39]) and the “questions” posed to the animal subjects. For example, two-tone discrimination training can reveal lesion-based deficits in acquisition and retention [40, 41], and A1 lesions impair extinction after removal of the US [42].

    O&S appear to assume that memory traces can develop only in structures whose destruction produces behavioral impairments. In contrast, contemporary conceptions of memory substrates acknowledge distributed storage. For tonal conditioning, specific receptive-field shifts develop not only in A1 but also in all three nuclei of the medial geniculate body, and associative plasticity (not yet checked for specificity) develops in lower auditory structures and at least the amygdala and the hippocampus. A fundamental goal is to determine the relative contribution of all involved structures in the acquisition, storage and representation of experience, not which structure holds the entire memory.

    Moreover, learning studies employing pure tones have universally found the development of plasticity in A1, beginning with the work of Galambos et al. [7] (see also Background section). One can argue that the results of all pure tone studies are epiphenomenal, but it would be more fruitful to seek the principles that govern the learning-based induction of A1 plasticity for all acoustic stimuli, including pure tones. We believe that although this task is difficult, success will ultimately depend upon a detailed characterization of the plasticity under consideration.

Summary

Forms of plasticity (points 1–4)

O&S have cast doubt on our findings of CS-specific increased responses and CS-directed tuning shifts on the grounds of (a) failure to control context, (b) selection of CS-specific increases, while ignoring CS-specific decreases, and (c) using a measure that was biased to reveal increases at the expense of decreases. All of these critiques have been examined and found to reflect unsupported assumptions, failure to distinguish between training trials and post-training receptive fields, failure to distinguish between primary auditory cortex and fields A2 and VE, and misreading of our Methods sections. Moreover, our findings have been replicated for different reinforcers (appetitive, aversive, brain reward), tasks (classical and instrumental conditioning), and taxa (guinea pig, rat, bat) (reviewed in [35]).

In addition, O&S claim that CS-specific decreased responses develop when context is controlled, but in fact, they actually changed state between training and post-training testing, failed to provide behavioral validation of learning, and presented results that could be explained by extinction if their subjects had learned.

Functions of plasticity (points 5, 6)

O&S hold that CS-specific plasticity in A1 cannot represent part of the substrate of auditory associative memory because this function is incompatible with task-specific plasticity. However, the evidence for a mnemonic function for CS-specific increases, CS-directed tuning shifts, and enlargement of CS-band representation in A1 is consistent and strong. Moreover, task-specificity does not appear to be incompatible with memorial functions.

O&S falsely attribute to us the claim that A1 is essential for all auditory memory. They further minimize studies using pure tones because lesions of A1 do not prevent learning about pure tones. However, their conclusions are too sweeping as one cannot draw conclusions about all auditory memory from examples of particular auditory tasks. Thus, A1 is important in pure tone tasks depending on the methods of training and the “questions” posed; e.g., A1 lesions impair both discrimination learning and extinction. O&S seem to hold that memory traces can develop only in structures in which lesions produce obvious behavioral impairments. However, contemporary conceptions of memory are based on parallel processing and distributed storage so that memories can be stored in the auditory cortex even if a particular task demand exploits only subcortical information storage.

What does it all mean?

The overwhelming finding for associative representational plasticity is that learning biases the processing and representation of stimuli to emphasize sounds that gain increased behavioral importance. CS-specific increases in responses and CS-directed tuning shifts have been observed across laboratories, species, types of conditioning, and motivation (reviewed in Weinberger [35]). As the attributes of this type of plasticity are the same as the attributes of behavioral associative memory, CS-specific increases satisfy the criteria for constituting specific memory traces. However, additional research is needed to determine if such plasticity truly indexes aspects of memory. Moreover, the manner in which such memory storage would guide planning and behavior constitutes a major problem that will require the concerted efforts of many laboratories.

The only evidence for CS-specific decreases and lateral contrast enhancement as animals learn about the signal importance of a tone has been provided by O&S. However, their claim cannot be sustained due to the lack of behavioral validation of learning and the state and/or extinction confounds that are endemic to the experimental design which they used. Of course, it is possible that more compelling studies of associative learning will reveal CS-specific decreases, in which case, it will become necessary to determine the superordinate principles that govern the sign of CS-specific plasticity in A1. It is more likely that processes such as lateral contrast enhancement are engaged in perceptual learning [6]. However, the situation may be very complex because solid experiments have reported either a specific increase in A1 area of discriminative frequency bands in the monkey [43] or no plasticity despite good perceptual frequency learning in the cat [44].

Beyond particular studies that have been conducted to date, this debate highlights critical procedural and conceptual issues that need to be considered for future research. Chief among these is that neural data obtained during training trials do not reflect only associative processes, but rather a mixture of performance (e.g., state) and associative factors. Moreover, while the specificity of plasticity can be obtained using the unified experimental design, it is essential that pre-training and post-training RFs and other representations of neural processing be obtained while subjects are in the same state.

This debate has underscored the need for careful and comprehensive conceptualization of relationships between measures of brain function and measures of behavior. Learning is always inferred from behavior, never directly observed. Neural plasticity can be directly observed. Therefore, in asking questions about learning, it is essential to consider which questions to ask and which behavioral measures would most sensitively reveal the answers. This needs to involve determination of what subjects have learned and even how they have learned it. For example, subjects can use different learning strategies to arrive at the same overall level of correct performance.

Finally, the practice of explicitly stating assumptions should be encouraged. This is particularly important in behavioral neuroscience, in which researchers have diverse backgrounds. In the case of the two disciplines of the neurobiology of learning and memory and brain mechanisms of sensory/perceptual processes, the need seems especially great. These two fields are both concerned with the processing and “fate” of environmental stimuli, i.e., experience. Research on primary auditory cortex, and other sensory cortices, increasingly reveals their entwinement, as Nature does not respect such disciplinary boundaries.