Journal of Computational Neuroscience

Volume 24, Issue 3, pp 277–290

Distributed representation of perceptual categories in the auditory cortex



Categorical perception is a process by which a continuous stimulus space is partitioned to represent discrete sensory events. Early experience has been shown to shape categorical perception and enlarge cortical representations of experienced stimuli in the sensory cortex. The present study examines the hypothesis that enlargement in cortical stimulus representations is a mechanism of categorical perception. Perceptual discrimination and identification behaviors were analyzed in model auditory cortices that incorporated sound exposure-induced plasticity effects. The model auditory cortex with over-representations of specific stimuli exhibited categorical perception behaviors for those specific stimuli. These results indicate that enlarged stimulus representations in the sensory cortex may be a mechanism for categorical perceptual learning.


Keywords: Categorical perception, Sensory cortex, Learning

1 Introduction

While sensory stimuli may vary continuously along their physical dimensions, the behaviorally significant events that they represent are often discrete. Through a process called categorical perception, the sensory system maps continuous stimulus spaces to discrete perceptual events (Harnad 2003). For instance, lights of gradually changing wavelength are perceived as having discrete hues (Bornstein et al. 1976). Gradual shifts in sound frequency may lead to categorical changes in the perceived musical intervals (Burns and Ward 1978). Categorically perceived stimuli may be recognized more quickly in the presence of distortions and contextual interference. This efficient sensory processing provides the basis for higher-level cognitive functions such as verbal communication and music appreciation (Harnad 1987).

Categorical perception was first discovered in speech research and was thought to involve language-specific, higher-level brain mechanisms, but not the basic sensory processing mechanisms of the auditory system (Liberman et al. 1957, 1967). Later research indicated that categorical perception occurs in a variety of non-speech sounds (Ehret 1992; Ehret and Haack 1981; Nelson and Marler 1989; Wyttenbach et al. 1996). In addition, speech sounds are categorically perceived by animals of many species (Kluender et al. 1987; Kuhl and Miller 1975; Kuhl and Padden 1982, 1983). These findings suggest that categorical perception may be an auditory, rather than a purely phonetic, process and may be mediated by the auditory sensory system.

Neural mechanisms underlying categorical perception are not well understood. Investigations of such mechanisms often involve searching for categorical neurons, i.e., those that respond preferentially to all stimuli in one category but not to stimuli in any other category, showing sigmoidal stimulus selectivity. These categorical neurons have been found in the frontal cortex (Freedman et al. 2001; Romo et al. 1997). Although behavioral and psychophysical evidence suggests that sensory systems may mediate categorical perception, the neurons in the sensory cortex, which typically respond to a broad range of stimuli and exhibit bell-shaped tuning curves, are not considered categorical.

Categorical perception may arise both through innate mechanisms and as a result of sensory experiences and learning (Livingston et al. 1998). Some human speech sounds, for instance, are categorically perceived in newborn human infants (Eimas 1974) and in some model animals that have never been exposed to the speech sounds (Ehret and Haack 1981; Kluender et al. 1987; Kuhl and Miller 1975; Nelson and Marler 1989; Wyttenbach et al. 1996). It has been suggested that the auditory systems of both humans and the model animals are innately sensitive to the acoustic distinctions of those speech sounds, and our vocal communication system simply exploits this sensitivity (Holt et al. 2004; Steinschneider et al. 2003). On the other hand, language experience can also alter the perceptual sensitivity of the auditory system to speech sounds and change their categorical boundaries (Lasky et al. 1975; MacKay et al. 2001; Williams 1977). This language-specific reshaping of the phonetic perceptual categories occurs in the first year of life (Kuhl et al. 1992), presumably as a result of acoustic exposure to the speech sound environment. Categorical perception of pitch is also shaped by musical experiences (Burns and Ward 1978).

Sensory experience in a limited window of early life has a profound influence on the development of cortical sensory representations (Wiesel 1982). Recent studies indicate that repeated exposure to a stimulus results in enlarged cortical representations of the experienced stimulus, i.e., more neurons becoming selectively responsive to the stimulus (Chang and Merzenich 2003; Erickson et al. 2000; Sengpiel et al. 1999; Zhang et al. 2001). Similar preferential representations of experienced speech sounds and musical notes have also been shown in humans (Naatanen et al. 1997; Pantev et al. 1998). Given the profound impact of early experience on categorical perception of speech sounds and musical pitch, as well as on cortical sound representations, it is possible that experience-driven reorganization of the auditory cortex plays a role in forming perceptual categories (Crozier 1997; Lasky et al. 1975; MacKay et al. 2001; Takeuchi and Hulse 1993; Williams 1977). In this study, we construct models of acoustic representations in the primary auditory cortex, and examine the effects of experience-induced reorganization of these representations on the perceptual discrimination and identification performance of the model primary auditory cortex. We show that categorical perception may arise as a result of enlarged cortical representation induced, for instance, by early experience.

2 Materials and methods

2.1 Modeling the frequency representations in the primary auditory cortex

The parameters of the model were chosen based on properties of primary auditory cortical neurons documented in the literature and on our unpublished observations. The firing rates of neurons in the auditory cortex exhibit significant variability. We recorded response magnitudes to tone pips from 121 neurons in the primary auditory cortex (AI) and obtained a mean Fano factor of 0.98 ± 0.21 (SD), suggesting that the neuronal firing may be modeled as a Poisson process. An earlier study showed more reliable responses of auditory cortical neurons (DeWeese et al. 2003). The difference in observations may be due to the use of different stimuli, experimental conditions, and recording techniques. We modeled the frequency representations in the primary auditory cortex with a population of Poisson-firing neurons. Each neuron had a Gaussian-shaped response-frequency tuning curve given by
$$ R = \alpha e^{{\frac{{ - {\left( {{\text{frequency}} - \mu } \right)}^{2} }} {{2\sigma ^{2} }}}} + \delta $$
The maximum response magnitude, α, was set at 1 spike/tone recorded in a 50-ms response window for all neurons. Spontaneous firing rate of the model neurons was set at 1 spike/second, corresponding to δ = 0.05 spike in the 50-ms response window. The tuning bandwidth of the model neurons, defined here as two standard deviations of the Gaussian-shaped tuning curve, was set at 1 octave (i.e., σ  =  0.5 octave). These response parameters were chosen for simplicity, and are consistent with properties of recorded neurons (Bao et al. 2001, 2004). We also examined in later sections how changes in these parameters impacted perception behaviors. For the model naïve AI, the best frequencies (BFs), corresponding to μ in Eq. (1), were equally spaced along the logarithmic frequency scale from 1 to 50 kHz (Fig. 1(a)). To model the AI of a 7.1-kHz-tone-exposed animal, best frequencies of model neurons in a range of 7.1 kHz ± 1 octave were shifted to have a Gaussian distribution centered at 7.1 kHz and a standard deviation of 0.1 octave (Fig. 1(b)).
Fig. 1

Tuning curves of the model neurons in the model naïve primary auditory cortex (a) and the model 7.1-kHz-exposed primary auditory cortex (b). All model neurons have a maximum response magnitude of one spike per tone and a tuning bandwidth of one octave. Spontaneous firings of the model neurons are not shown in the tuning curves
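The map construction in Section 2.1 can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, frequencies are expressed in octaves above 1 kHz, and the retuned best frequencies are placed at evenly spaced quantiles of the Gaussian rather than drawn at random, which is an assumption (the text states only that they follow a Gaussian distribution).

```python
import math
from statistics import NormalDist

ALPHA = 1.0                  # peak response, spikes per tone (50-ms window)
DELTA = 0.05                 # spontaneous spikes per window (1 spike/second)
SIGMA = 0.5                  # tuning-curve standard deviation, octaves
F_LOW, F_HIGH = 1.0, 50.0    # frequency range of the map, kHz


def octaves(f_khz):
    """Frequency expressed in octaves above 1 kHz."""
    return math.log2(f_khz)


def tuning(f_khz, bf_oct, alpha=ALPHA, sigma=SIGMA, delta=DELTA):
    """Mean spike count of a neuron whose best frequency is bf_oct."""
    d = octaves(f_khz) - bf_oct
    return alpha * math.exp(-d * d / (2.0 * sigma ** 2)) + delta


def naive_bfs(n):
    """n best frequencies equally spaced on a log axis from 1 to 50 kHz."""
    lo, hi = octaves(F_LOW), octaves(F_HIGH)
    return [lo + (hi - lo) * i / (n - 1) for i in range(n)]


def exposed_bfs(n, tone_khz=7.1, reach=1.0, spread=0.1):
    """Retune BFs within `reach` octaves of the tone to a Gaussian around it.

    Assumption: quantile placement instead of random sampling.
    """
    centre = octaves(tone_khz)
    bfs = naive_bfs(n)
    idx = [i for i, b in enumerate(bfs) if abs(b - centre) <= reach]
    dist = NormalDist(centre, spread)
    for k, i in enumerate(idx):
        bfs[i] = dist.inv_cdf((k + 0.5) / len(idx))
    return bfs
```

Calling `exposed_bfs(800)` keeps the neuron count fixed while concentrating roughly one third of the best frequencies around 7.1 kHz, which leaves the flanking bands without any best frequencies.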

2.2 Modeling frequency discrimination

The response of the ith neuron of the model AI to a tone of frequency f was denoted as Ri—the number of spikes in the response window. As the model neurons fire spikes in a Poisson-random fashion, the probability of the neuron responding to f with response Ri is
$$ P{\left( {R_{i} |\;f} \right)} = \frac{{T_{i} {\left( f \right)}^{{R_{i} }} }} {{R_{i} !}}e^{{ - T_{i} {\left( f \right)}}} $$
where Ti is the neuron’s response-frequency tuning curve. In practice, Ri was simulated as a Poisson random number with a mean of Ti(f).
The responses of all the N model neurons (1, 2, … N) to a tone of an unknown frequency f were simulated as R1, R2, … RN. We obtained the maximum likelihood estimate of f, denoted as F, by maximizing the following log-likelihood (LL) function (Jazayeri and Movshon 2006; Seung and Sompolinsky 1993)
$$ \begin{array}{*{20}c} {{LL{\left( F \right)} = {\sum\limits_{i = 1}^N {\ln P{\left( {R_{i} |F} \right)}} } = {\sum\limits_{i = 1}^N {\ln {\left( {\frac{{T_{i} {\left( F \right)}^{{R_{i} }} }} {{R_{i} !}}e^{{ - T_{i} {\left( F \right)}}} } \right)}} }}} \\ {{ = {\sum\limits_{i = 1}^N {R_{i} \ln T_{i} {\left( F \right)}} } - {\sum\limits_{i = 1}^N {T_{i} {\left( F \right)}} } - {\sum\limits_{i = 1}^N {\ln {\left( {R_{i} !} \right)}} }}} \\ \end{array} $$
using a sequential quadratic programming method available in Matlab toolboxes (Powell 1977).
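The decoding step can be illustrated with a minimal Python sketch. Two substitutions are made here and should be read as assumptions: the maximum is found by a plain grid search rather than the sequential quadratic programming routine used in the study, and Poisson counts are drawn with Knuth's multiplication method because the standard library has no Poisson sampler. The term ln(Ri!) is dropped because it does not depend on F. All function names are hypothetical, and frequencies are in octaves above 1 kHz.

```python
import math
import random


def tuning(f_oct, bf_oct, alpha=1.0, sigma=0.5, delta=0.05):
    """Gaussian tuning curve plus spontaneous activity (mean spike count)."""
    d = f_oct - bf_oct
    return alpha * math.exp(-d * d / (2.0 * sigma ** 2)) + delta


def poisson(mean, rng):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit, k, p = math.exp(-mean), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def respond(f_oct, bfs, rng):
    """One Poisson spike count per neuron for a tone at f_oct."""
    return [poisson(tuning(f_oct, b), rng) for b in bfs]


def log_likelihood(f_oct, counts, bfs):
    """LL(F) without the ln(R_i!) term, which is constant in F."""
    total = 0.0
    for r, b in zip(counts, bfs):
        t = tuning(f_oct, b)
        total += r * math.log(t) - t
    return total


def decode(counts, bfs, lo, hi, steps=400):
    """Grid-search maximum-likelihood estimate of the tone frequency."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda f: log_likelihood(f, counts, bfs))
```

A grid search is slower and coarser than a gradient-based optimizer, but it cannot be trapped in a local maximum, which makes it a convenient reference when checking a faster implementation.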
F may be regarded as the “percept” of f by the model AI. As the model neuronal responses are stochastic, the estimated frequency F may deviate from the true frequency f. Such variability of “perception” of the model AI determines its frequency decoding precision and frequency discrimination capability. To estimate this variability, we presented the model AI with pairs of tones of the same frequency (i.e., f1 = f2). We then calculated the estimates of the two frequencies, F1 and F2, and their difference, ΔF0 = ∣F1 − F2∣. This was done 100 times to obtain the distribution of the ΔF0s (Fig. 2). The 50th percentile of all these ΔF0s was chosen as the threshold, so that chance-level performance would be 50%. To determine discrimination performance for a pair of different frequencies (i.e., f1 ≠ f2), the difference of the estimates (ΔF) was calculated 100 times to obtain the distribution of ΔF. A ΔF greater than the ΔF0 threshold indicated that the pair of tones was discriminated by the model AI. The percentage of successful discriminations was used as the performance level. All graphs present the means of 200 performance levels in each testing condition. The variability of the performance was measured with 95% confidence intervals, which span the 2.5th to 97.5th percentiles of the performance levels.
Fig. 2

Example of distributions (probability density functions) of the differences between the decoded frequencies (i.e., ΔF, where F denotes the decoded frequency) with the input frequencies being 0, 0.1 and 0.5 octave apart in a model naïve AI. ΔF may be considered as perceived frequency difference by the model AI. The distribution is shifted towards the right with increasingly greater differences in the input frequencies (Δf). The vertical dashed line marks the 50th percentile ΔF value of the distribution with Δf = 0. This value was used as the discrimination threshold. Any two tones that produced ΔFs greater than this threshold value were considered discriminated
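The thresholding procedure of Section 2.2 can be sketched as follows. To keep the example self-contained, the model AI is replaced by a hypothetical stand-in decoder that returns the true frequency plus Gaussian noise; the 0.05-octave noise level, the function names, and the choice of f1 for the identical-tone pairs are all assumptions made for illustration.

```python
import random
import statistics


def discrimination_performance(decode, f1, f2, n_pairs=100):
    """Fraction of (f1, f2) pairs whose decoded difference exceeds the
    median decoded difference of identical-tone pairs."""
    same = [abs(decode(f1) - decode(f1)) for _ in range(n_pairs)]
    threshold = statistics.median(same)          # 50th percentile of dF0
    diff = [abs(decode(f1) - decode(f2)) for _ in range(n_pairs)]
    return sum(d > threshold for d in diff) / n_pairs


def summarize(levels):
    """Mean and the 2.5th and 97.5th percentiles of repeated levels."""
    cuts = statistics.quantiles(levels, n=40, method="inclusive")
    return statistics.mean(levels), cuts[0], cuts[-1]


rng = random.Random(1)


def noisy_decode(f_oct):
    """Hypothetical percept: true frequency plus 0.05-octave Gaussian noise."""
    return f_oct + rng.gauss(0.0, 0.05)


# 200 performance levels for tones 0.1 octave apart, as in the text.
levels = [discrimination_performance(noisy_decode, 3.0, 3.1)
          for _ in range(200)]
mean_level, low, high = summarize(levels)
```

Because the threshold is the median of the identical-tone differences, a pair with f1 = f2 is "discriminated" on about half of the trials, which is what anchors chance performance at 50%.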

2.3 Modeling frequency identification

In a typical behavioral identification task, the subject is presented with an unknown stimulus (fx) and asked to make a forced choice on which of two fixed stimuli (f1 and f2) is more likely to be the unknown stimulus. In our simulation, the model AI was presented with an unknown frequency (fx). The response of model AI to fx was denoted as Rx. The task was to determine which of two known frequencies (f1 and f2) was more likely to be the one that activated Rx. We modeled the perceptual decision process in the frequency identification task with a stochastic process and a deterministic process.

The stochastic model assumes that decision-making is a Bernoulli-stochastic process with two alternative outcomes—i.e., the model AI chooses either the low frequency f1 or the high frequency f2. We first calculated the log of the ratio of the likelihood that Rx was activated by f1 over the likelihood that it was activated by f2 as
$$ LLR{\left( {f_{x} ,f_{1} ,f_{2} } \right)} = \ln {\left( {\frac{{P{\left( {R_{x} |f_{1} } \right)}}} {{P{\left( {R_{x} |f_{2} } \right)}}}} \right)} = {\sum\limits_{i = 1}^N {{\left( {R_{i} \ln T_{i} {\left( {f_{1} } \right)} - T_{i} {\left( {f_{1} } \right)} - R_{i} \ln T_{i} {\left( {f_{2} } \right)} + T_{i} {\left( {f_{2} } \right)}} \right)}} } $$
The probability that the model AI selected f1 as opposed to f2 is determined as
$$ \operatorname{P} {\left( {{\text{choosing }}\;f_{1} } \right)} = \frac{{LLR{\left(\, {f_{x} ,f_{1} ,f_{2} } \right)} - LLR{\left(\, {f_{2} ,f_{1} ,f_{2} } \right)}}} {{LLR{\left( \,{f_{1} ,f_{1} ,f_{2} } \right)} - LLR{\left(\, {f_{2} ,f_{1} ,f_{2} } \right)}}} $$
in which LLR(f1, f1, f2) is the log-likelihood ratio that an f1-activated response is activated by f1 as opposed to f2, and LLR(f2, f1, f2) is the log-likelihood ratio that an f2-activated response is activated by f1 as opposed to f2. With fixed f1 and f2, LLR(f1, f1, f2) and LLR(f2, f1, f2) are also fixed and were calculated. Thus, the probability of categorizing a frequency fx as f1 is a scaled linear function of LLR(fx, f1, f2), which is a function of the unknown frequency fx. The probability P is bounded in the range of 0–1.
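A minimal Python sketch of the Bernoulli-stochastic rule is given here. The text does not say how the two reference values LLR(f1, f1, f2) and LLR(f2, f1, f2) were obtained; the sketch assumes they are evaluated on the mean (noise-free) responses to f1 and f2, and it enforces the 0–1 bound by clipping. Both choices, and all names, are assumptions for illustration.

```python
import math
import random


def tuning(f_oct, bf_oct, alpha=1.0, sigma=0.5, delta=0.05):
    """Gaussian tuning curve plus spontaneous activity (mean spike count)."""
    d = f_oct - bf_oct
    return alpha * math.exp(-d * d / (2.0 * sigma ** 2)) + delta


def llr(counts, f1, f2, bfs):
    """Log-likelihood ratio that `counts` were evoked by f1 rather than f2."""
    total = 0.0
    for r, b in zip(counts, bfs):
        t1, t2 = tuning(f1, b), tuning(f2, b)
        total += r * math.log(t1 / t2) - t1 + t2
    return total


def p_choose_f1(counts, f1, f2, bfs):
    """Linearly rescaled LLR, clipped to the range 0..1."""
    # Assumption: reference LLRs use the mean responses to f1 and f2.
    ref1 = llr([tuning(f1, b) for b in bfs], f1, f2, bfs)
    ref2 = llr([tuning(f2, b) for b in bfs], f1, f2, bfs)
    p = (llr(counts, f1, f2, bfs) - ref2) / (ref1 - ref2)
    return min(1.0, max(0.0, p))


def choose(counts, f1, f2, bfs, rng):
    """One Bernoulli draw: report 'f1' with probability p_choose_f1."""
    return "f1" if rng.random() < p_choose_f1(counts, f1, f2, bfs) else "f2"
```

With these reference values the probability is 1 for the mean response to f1, 0 for the mean response to f2, and close to 0.5 for a tone midway between them on a uniform map.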
The deterministic model assumes that decision-making in the frequency identification task is determined solely by the perceptual process. The following likelihood ratio
$$ LR\left( f_{x} ,f_{1} ,f_{2} \right) = \frac{P\left( R_{x} |f_{1} \right)}{P\left( R_{x} |f_{2} \right)} = \exp \left( \sum\limits_{i = 1}^N \left( R_{i} \ln T_{i} \left( f_{1} \right) - T_{i} \left( f_{1} \right) - R_{i} \ln T_{i} \left( f_{2} \right) + T_{i} \left( f_{2} \right) \right) \right) $$
was used to directly determine which frequency to choose in the frequency identification task: f1 was chosen if the ratio was greater than one, and f2 was chosen otherwise. This method is referred to as the likelihood ratio (LR) method. In addition, we modeled the stimulus identification process with a maximum likelihood estimation method (Eq. (2)), in which fx was decoded from Rx and the decoded frequency was denoted as Fx. The frequency (either f1 or f2) that was closer to Fx was chosen. This method is referred to as the maximum likelihood estimation (MLE) method.
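The two deterministic rules reduce to a pair of comparisons, sketched here in Python with hypothetical names. The LR test is carried out on the logarithm (LR > 1 is equivalent to ln LR > 0) to avoid overflow in the exponential, and "closer" in the MLE rule is taken to mean closer on the octave scale, which is an assumption.

```python
import math


def tuning(f_oct, bf_oct, alpha=1.0, sigma=0.5, delta=0.05):
    """Gaussian tuning curve plus spontaneous activity (mean spike count)."""
    d = f_oct - bf_oct
    return alpha * math.exp(-d * d / (2.0 * sigma ** 2)) + delta


def log_lr(counts, f1, f2, bfs):
    """ln LR(fx, f1, f2) for the population spike counts `counts`."""
    return sum(r * math.log(tuning(f1, b) / tuning(f2, b))
               - tuning(f1, b) + tuning(f2, b)
               for r, b in zip(counts, bfs))


def choose_lr(counts, f1, f2, bfs):
    """LR method: choose f1 when LR > 1, i.e. when ln LR > 0."""
    return "f1" if log_lr(counts, f1, f2, bfs) > 0.0 else "f2"


def choose_mle(decoded_f, f1, f2):
    """MLE method: choose the prototype nearer to the decoded frequency."""
    return "f1" if abs(decoded_f - f1) < abs(decoded_f - f2) else "f2"
```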

Each simulation was run 100 times, and the percentage of runs in which the model AI chose f1 was used as the identification index. Each point in all the graphs is the mean of 200 individually calculated identification indices in the specific testing condition. The variability of the performance was measured with 95% confidence intervals, which span the 2.5th to 97.5th percentiles of the identification indices.

2.4 Testing stimulus discrimination in adult rats

All procedures were approved by the Animal Care and Use Committee of the University of California, Berkeley. Five female Sprague-Dawley rats (200–300 g, over 2 months old) were trained in a tonal frequency discrimination task. Training and testing took place in a wire cage located in an anechoic sound-attenuation chamber. Upon automatic initiation of a trial, tone pips of 100-ms duration and of a standard frequency were played five times per second through a calibrated speaker. All tones were played at a 60-dB sound pressure level. After a random duration of 5–35 s, tone pips of a target frequency were played in place of every other standard tone pip. Rats were trained to detect the frequency difference and make a nose-poke in a nosing hole within 3 s after the first target tone, which was scored as a hit and rewarded with a food pellet. The false alarm rate was determined as the percentage of responses to probe trials, in which the target frequency was the same as the standard frequency. On each training day, an animal was allowed to achieve 200–300 hits. The difference between the target and the standard tone pips was varied. The animal’s performance may be influenced by its motivational state and its internal response criteria. To account for these factors and to estimate the animals’ perceptual ability, we used the discrimination index (Grier 1971; Pollack and Norman 1964):
$$ {A}\prime = \frac{1} {2} + \frac{{{\left[ {{\left( {h - fa} \right)}{\left( {1 + h - fa} \right)}} \right]}}} {{{\left[ {4h{\left( {1 - fa} \right)}} \right]}}} $$
in which h is the hit rate and fa is the false alarm rate. A′ varies from 0.5 to 1, which allows comparison with the discrimination probability of the cortical model (see below).
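For reference, the index can be computed directly from the two rates. The sketch below uses a hypothetical function name and assumes h ≥ fa, the case in which this form of A′ applies; it returns 0.5 when h = fa, which also avoids a division by zero when both rates are 0 or both are 1.

```python
def a_prime(hit_rate, false_alarm_rate):
    """Nonparametric discrimination index A' for hit rate h >= false alarm rate fa."""
    h, fa = hit_rate, false_alarm_rate
    if not (0.0 <= fa <= h <= 1.0):
        raise ValueError("this form of A' assumes 0 <= fa <= h <= 1")
    if h == fa:
        return 0.5                      # no sensitivity: chance level
    return 0.5 + ((h - fa) * (1.0 + h - fa)) / (4.0 * h * (1.0 - fa))
```

For example, a hit rate of 0.9 with a false alarm rate of 0.1 gives 0.5 + (0.8 × 1.8)/(4 × 0.9 × 0.9) ≈ 0.944.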

2.5 Testing stimulus identification in adult rats

Animals were first trained to recognize two prototype tonal frequencies. In each trial, 100-ms tone pips of a prototype frequency were played at a rate of five pips per second and at 60 dB SPL. The animal was trained to make a nose-poke in one of two nosing holes (either on the left or on the right) depending on which one of the two prototype frequencies (6 kHz or 12 kHz) was being played, i.e., an identification task. A nose-poke in the correct hole within 10 s from the onset of the sound was considered a “hit” and rewarded with a food pellet. A nose-poke in the wrong hole or inaction in the 10-s period was a miss and not rewarded. It took approximately 10 days for naïve animals to reach an asymptotic performance level of approximately 80% correct recognition. Then, we tested how animals perceived and categorized a series of nine tones of intermediate frequencies. These frequencies were logarithmically equally spaced between the two prototype frequencies. The prototype sounds were tested in regular trials (80% of all trials). The intermediate sounds were tested in probe trials (20%), in which the animal did not receive a food pellet regardless of its response. We did not reinforce the animals in these trials to avoid biasing their responses, which could interfere with the perceptual tests. To keep animals motivated with food pellet reward, we included 80% regular trials in which correct responses to prototype stimuli were rewarded. The percentage of trials in which animals made a nose-poke in the left nosing hole (corresponding to the lower frequency) was used to construct the identification function.
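The probe frequencies can be reconstructed from the description above. The sketch assumes that the nine tones, together with the two prototypes, form eleven points with equal steps on a logarithmic axis (steps of 0.1 octave for prototypes one octave apart); the function name is hypothetical.

```python
def intermediate_tones(f_low_khz, f_high_khz, n_tones=9):
    """n_tones frequencies equally spaced on a log axis, strictly between
    the two prototypes (the prototypes themselves are excluded)."""
    ratio = f_high_khz / f_low_khz
    return [f_low_khz * ratio ** (k / (n_tones + 1))
            for k in range(1, n_tones + 1)]


tones = intermediate_tones(6.0, 12.0)   # nominal 6- and 12-kHz prototypes
```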

3 Results

3.1 Psychometric function of the model AI

We first examined the model performances as a function of the input frequency difference and the total number of neurons in the model AI. As shown in Fig. 3, the psychometric performance-difference function was approximately sigmoidal. Having more model neurons improved the model performance, as indicated by a leftward shift of the psychometric function. The shape of the psychometric function, however, did not change with the neuron numbers. As predicted (Seung and Sompolinsky 1993), the discrimination threshold of the model AI, as measured with the half-height frequency difference, was inversely proportional to the square root of the number of neurons (Fig. 4(a)).

We examined animal performance in a frequency discrimination task, in which discrimination of various frequency differences was tested in adult rats that had not been exposed to specific sounds (hereafter referred to as naïve animals, in contrast to sound-exposed animals with altered frequency representations). The psychometric function of naïve rats was sigmoidal, similar to that of the model AI. Furthermore, the performance of the model AI with 800 neurons fitted the animal performance well. The total number of neurons in the primary auditory cortex of the rat (1–2 mm² in size) is on the order of 100,000, including local and inhibitory neurons (Cherniak 1990; O’Kusky and Colonnier 1982). The relatively small number of neurons required for the model to reach the performance levels of the animals is consistent with earlier modeling results (Paradiso 1988). All simulations presented in the subsequent sections used model AIs with 800 neurons.

The tuning bandwidth, response magnitude and spontaneous firing rate of the model neurons were also varied to examine how these properties influence perceptual discrimination behaviors of the model AI. The frequency discrimination threshold decreases with greater response magnitude, narrower tuning bandwidth and lower spontaneous firing rate (Fig. 4(b–d)). These results provide constraints for further comparison between model and animal performances.

3.2 Perceptual discrimination by sound-exposed model AI

One of the two behavioral traits of categorical perception is that the perceptual discrimination ability is worse within a category than between different categories. If a perceptual category forms around the experienced stimulus, perceptual discrimination would be relatively poor within the category. We constructed a sound-exposed model AI, incorporating sound exposure-induced plasticity effects: over-representation of the experienced frequency and under-representation of neighboring frequencies in the range of ±1 octave (see Fig. 1(b) and Chang and Merzenich 2003). Simulation results indicate that discrimination of 0.1-octave frequency differences in the over-represented frequency range was significantly impaired. By contrast, discrimination of neighboring frequencies was improved (Fig. 5).

These results may be understood in terms of the amount of Fisher information the model neurons provide for frequency decoding (Dayan and Abbott 2001). Sensory neurons contribute to stimulus decoding by changing their firing rates (Bala et al. 2003; Luna et al. 2005; Paradiso 1988). Two similar stimuli that are near the center of a Gaussian-shaped tuning curve of a neuron will elicit similar firing rates (close to the maximum response magnitude). However, two similar stimuli that fall on the slopes of a neuron’s tuning curve, where firing rate is most sensitive to stimulus differences, will elicit responses of very different firing rates. In the sound-exposed AI, a large number of neurons become tuned near the experienced frequency. These retuned neurons are less sensitive to changes in frequencies near the experienced tone, because those frequencies fall near the center of their tuning curves. Instead, these neurons become sensitive to frequency changes in the neighboring frequency bands, where the slopes of the tuning curves are located. The limit of decoding accuracy set by Fisher information measure can be attained by maximum likelihood estimation, when a large number of neurons are involved in coding (Dayan and Abbott 2001). Thus, discrimination thresholds derived from Fisher information should be similar to those calculated with MLE.
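The Fisher information argument can be checked numerically. For independent Poisson neurons the Fisher information about frequency is the sum over neurons of T_i′(f)² / T_i(f), and the Cramér–Rao bound makes the attainable decoding standard deviation scale as the inverse square root of that sum. The Python sketch below evaluates the sum for a naïve and a 7.1-kHz-exposed map; the map builders repeat the construction of Section 2.1 with quantile placement of the retuned best frequencies, which is an assumption, and all names are hypothetical.

```python
import math
from statistics import NormalDist

ALPHA, SIGMA, DELTA = 1.0, 0.5, 0.05   # parameter values from Section 2.1
TOP = math.log2(50.0)                  # map spans 0..TOP octaves above 1 kHz


def fisher_information(f_oct, bfs):
    """Sum over neurons of T_i'(f)^2 / T_i(f) for Poisson spike counts."""
    total = 0.0
    for b in bfs:
        d = f_oct - b
        g = ALPHA * math.exp(-d * d / (2.0 * SIGMA ** 2))
        slope = -d / SIGMA ** 2 * g     # derivative of the tuning curve
        total += slope * slope / (g + DELTA)
    return total


def naive_bfs(n=800):
    """Best frequencies equally spaced on the log-frequency axis."""
    return [TOP * i / (n - 1) for i in range(n)]


def exposed_bfs(n=800, tone_khz=7.1, reach=1.0, spread=0.1):
    """Assumption: retuned BFs sit at evenly spaced Gaussian quantiles."""
    centre = math.log2(tone_khz)
    bfs = naive_bfs(n)
    idx = [i for i, b in enumerate(bfs) if abs(b - centre) <= reach]
    dist = NormalDist(centre, spread)
    for k, i in enumerate(idx):
        bfs[i] = dist.inv_cdf((k + 0.5) / len(idx))
    return bfs


for khz in (7.1, 11.9):
    f = math.log2(khz)
    print(khz, fisher_information(f, naive_bfs()),
          fisher_information(f, exposed_bfs()))
```

Under these assumptions the exposed map carries less Fisher information than the naïve map at 7.1 kHz and more at 11.9 kHz, in line with the impaired and improved discrimination described above.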

3.3 Perceptual identification by sound-exposed model AI

The second behavioral trait of categorical perception is the sigmoidal identification function, in which stimuli on one side of a categorical boundary are classified as members of the same category. Behaviorally, it is often tested with an identification task, in which subjects are required to classify a series of equally spaced stimuli into two categories. We performed a frequency identification test in naïve animals, and observed a near-linear frequency identification function (Fig. 6). Using this result as a constraint, we explored three methods to model the stimulus identification process: a Bernoulli-stochastic process method, a likelihood-ratio threshold method and a maximum-likelihood estimation method (see Section 2 for details). Among the three methods, only the Bernoulli-stochastic process method produced a near-linear identification function for the naïve model AI. The performances of the likelihood-ratio threshold (LR) and maximum-likelihood estimation (MLE) methods were almost identical, and were pooled together (Fig. 6). The LR/MLE methods produced an inverted sigmoidal identification function that diverges from the corresponding animal behavior. The identification function generated with these two methods shows a complete categorical transition within a 0.2-octave frequency distance, similar to the frequency discrimination threshold shown in Fig. 3. This is not surprising because the methods essentially perform frequency decoding, and then make perceptual decisions based on the decoded frequency. The result that the model AI performed equally well in identification and discrimination tasks when LR/MLE methods are used is inconsistent with experimental findings that animals generally perform worse in identification than in discrimination tasks (for a discussion, see Massaro 1987), suggesting that the LR/MLE methods are inappropriate as models of the perceptual identification processes.
The difference between the Bernoulli-stochastic and LR/MLE methods is likely due to their different assumptions about the decision-making process—the Bernoulli stochastic method assumes that the decision-making is stochastic, and the LR/MLE methods assume that the decision-making is deterministic (see Section 2).

Comparison of likelihood measures has been proposed as a model of the perceptual decision processes (Green and Swets 1966). In simple stimulus difference detection tasks (e.g., stimulus discrimination), subjects may compare a likelihood of having perceived stimulus differences with a threshold value to make a perceptual decision (as in the frequency discrimination process described above). Thus the performance is limited by the frequency decoding ability. In the perceptual identification task, however, the stimulus differences are often supra-threshold—i.e., fx is perceived as different from both f1 and f2. Deciding which one of f1 and f2 is closer to the unknown frequency fx is likely a probabilistic process, not a simple comparison of an index value to a fixed threshold. The notion that the discrimination and identification tasks involve different perceptual decision processes is consistent with the findings that performances are generally worse in identification than in discrimination tasks (Massaro 1987). Figure 6 indicates that the performances of MLE/LR methods are as good as the performances of the model AI in a discrimination task, but deviate from the animal performance. Instead, a Bernoulli-random process with the choice probabilities described by the linearly scaled log-likelihood ratio may capture some aspects of the perceptual identification behaviors in an identification task.

We analyzed perceptual identification behaviors of the model 7.1-kHz-exposed AI using the Bernoulli-random process method. The results showed that the tone-exposed AI consistently classified frequencies near 7.1 kHz as the lower one (i.e., 5.9 kHz) of the two prototypes (Fig. 7). This behavior, together with the reduced discrimination performance near 7.1 kHz, indicates that frequencies near 7.1 kHz were grouped into a perceptual category. It is a result of the sound exposure, because it only occurred near the exposed frequency, but not for the frequencies above 8.3 kHz.

3.4 Representations of two perceptual categories

The above results indicate that exposure to a single stimulus may shape a perceptual category near the stimulus. In psychophysical studies, categorical perception is typically defined between at least two categories by a peaked discrimination function and a sigmoidal identification function. We also simulated cortical plasticity effects of exposure to two tones of different frequencies that were either 2 or 0.5 octaves apart (Fig. 8). The characteristics of the plasticity effects were similar to those in the earlier simulations: neurons that used to be tuned to within one octave of the exposed frequencies were retuned closer to the exposed frequencies, and the retuned best frequencies had a Gaussian distribution with a 0.1-octave standard deviation (the same as in previous sections, see Section 2.1). The neurons that used to be tuned to the frequencies in between the two exposure frequencies were split equally between the two frequencies. Other neuronal response properties (tuning bandwidths, maximum response magnitudes and spontaneous firing rate) were unchanged. It should be noted that the specific forms of two-tone-induced cortical plasticity used in our simulation are hypothetical, simplified and extrapolated from single-tone-induced effects (Chang and Merzenich 2003; Zhang et al. 2001).
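One way to realize the two-tone map described above is sketched in Python below. It rests on two assumptions: each neuron within one octave of an exposed tone is retuned toward the nearer of the two tones, so that the neurons lying between the tones are split at their log-frequency midpoint, and the retuned best frequencies are placed at evenly spaced Gaussian quantiles rather than drawn at random. The function name is hypothetical, and best frequencies are in octaves above 1 kHz.

```python
import math
from statistics import NormalDist


def two_tone_bfs(n, tone_a_khz, tone_b_khz, reach=1.0, spread=0.1):
    """Best frequencies of a model AI exposed to two tones."""
    top = math.log2(50.0)
    bfs = [top * i / (n - 1) for i in range(n)]       # naive map, 1-50 kHz
    centres = sorted((math.log2(tone_a_khz), math.log2(tone_b_khz)))
    groups = {c: [] for c in centres}
    for i, b in enumerate(bfs):
        nearest = min(centres, key=lambda c: abs(b - c))
        if abs(b - nearest) <= reach:                 # within one octave
            groups[nearest].append(i)
    for c, idx in groups.items():
        dist = NormalDist(c, spread)                  # 0.1-octave spread
        for k, i in enumerate(idx):
            bfs[i] = dist.inv_cdf((k + 0.5) / len(idx))
    return bfs


wide = two_tone_bfs(800, 3.5, 14.0)     # tones two octaves apart
narrow = two_tone_bfs(800, 5.9, 8.3)    # tones about 0.5 octave apart
```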
Fig. 3

Discrimination performances of the model naïve AI and animals as a function of differences between the input frequencies. The model performance was quantified with the proportion of all trials in which two frequencies were discriminated. The dashed lines show the performances of nine different models, each using a different number of neurons. From left to right, the numbers of model neurons are 3,200, 2,262, 1,600, 1,131, 800, 565, 400, 282 and 200. Animal performance, shown as circles, was quantified with the discrimination index A′. The performance of the model with 800 neurons matched that of the naïve animals closely. Therefore, 800 model neurons were used in all subsequent simulations

Fig. 4

Influences of neuronal response properties on the discrimination threshold of model naïve AI. Discrimination threshold was determined as the frequency difference at which the model AI performance is 75% (see Fig. 3). Discrimination threshold decreases with the total number of neurons in the model AI (a) and with the maximum response magnitude—i.e., the peak height of the Gaussian tuning curve (b). It increases with neuronal tuning bandwidth (c) and with spontaneous firing rate (d). A linear regression indicates that the threshold is inversely proportional to the square root of the population size. The parameters that were not systematically varied were given the following values: Tuning bandwidth, one octave; response magnitude, one spike/tone; spontaneous activity, one spike/second

Fig. 5

Impaired discrimination of the over-represented frequencies. The model naïve AI and 7.1-kHz-exposed AI were tested on frequency discrimination ability of various tone-pip pairs with Δf = 0.1 octaves. The model naïve AI performed the frequency discrimination task with the same success rate across all frequencies (dashed line). Model 7.1-kHz-exposed AI, which had enlarged representations near 7.1 kHz, showed impaired discrimination performance near 7.1 kHz, and improved performance in the neighboring 11.9-kHz frequency range. The performance was quantified using the proportion of trials in which the two frequencies were discriminated

Fig. 6

Comparison of naïve animal performances with model naïve AI performances in a frequency identification task. The animals showed a near linear identification function (dashed line). Naïve model AI performance in the identification task was simulated with three methods, likelihood ratio, maximum likelihood estimation and Bernoulli stochastic process (for details, see Methods). The first two methods yield identical results, which differ from the animal performance. The Bernoulli stochastic choice method produced near linear identification function, and was used for later analysis of model identification behaviors. The performance was measured with the percentage of trials in which the tonal frequency was identified as the low frequency of 5.9 kHz

Fig. 7

Frequency identification by the model naïve AI and model 7.1-kHz-exposed AI. (a) The ratio of the log-likelihood that an unknown frequency is 5.9 kHz vs. the log-likelihood that the frequency is 11.9 kHz. (b) The percentage of trials in which a frequency in the range from 5.9 to 11.9 kHz is identified as 5.9 kHz by the naïve model AI and by the 7.1-kHz-exposed AI (see Methods for details). The 7.1-kHz-exposed AI consistently identifies several tones as 5.9 kHz, forming what appears to be a perceptual category for lower-frequency tones

Fig. 8

Representations of two perceptual categories. (a) Model AI with over-representations of 3.5-kHz and 14-kHz tones. The neighboring frequencies were under-represented. The over- and under-representations were similar to those described in Fig. 1(b). (b) The model AI described in (a) showed categorical perception of the frequencies with categorical centers at 3.5 and 14 kHz. The model AI showed a sigmoidal identification function and better discrimination of 0.1-octave frequency differences at the categorical boundary at 7 kHz. (c) Model AI with over-representations of 5.9- and 8.3-kHz tones. (d) The model AI described in (c) did not show categorical perception

Simulation results indicate that when the two experienced frequencies were two octaves apart, the model two-tone-exposed AI showed categorical perceptual behaviors: a sigmoidal identification function and a peaked discrimination function. The discrimination function is similar to that of categorical discrimination of phonemes observed in animals (Kuhl and Padden 1983). These results indicate that categorical perception may be mediated by populations of neurons with bell-shaped tuning curves. In addition, the prototypes of the categorically perceived stimuli are over-represented; for example, more neurons were tuned to the categorically perceived frequencies near 3.5 and 14 kHz, as shown in Fig. 8(a–b). Interestingly, when the two frequencies were 0.5 octave apart, no categorical perception was observed. Categorical perception would be established in this case if the tuning bandwidths of the neurons became narrower (data not shown). These results suggest that the properties of the cortical circuits constrain the categorical learning processes. Certain stimuli may be more learnable as categorical prototypes than others.

Figure 9 illustrates the population categorical responses of the model AI in comparison to non-categorical responses. Tones of different frequencies activated overlapping and gradually shifting populations of neurons in the model naïve AI (Fig. 9(a)). In the model AI that had over-representations of 3.5- and 14-kHz tones, two distinctive populations of neurons were activated by tones in the two different categories (Fig. 9(b)). In the model AI with over-representations of 5.9 and 8.3 kHz (Fig. 9(c)), the same population of neurons was activated by a range of frequencies near the over-represented ones. These activation patterns are consistent with the results in Fig. 8, showing categorical perception when the over-represented frequencies are two octaves apart.

Fig. 9

Categorical population responses. Activity of various model AIs, each having 800 model neurons. Each row shows the responses of all neurons to a single presentation of a tone. The frequency of the tone is indicated on the left. The bars represent the number of spikes that a neuron discharged in response to the tone. Neurons are arranged by best frequency such that the neuron with the lowest best frequency is to the left on the x-axis. (a) Activity of model naïve AI. Tones of different frequencies activated different but overlapping populations of neurons. (b) Activity of the model AI with over-representations of 3.5- and 14-kHz tones. The response patterns activated by 2.5-, 3.5- and 5-kHz tones were similar, and were much different from those activated by 10.5-, 14- and 20-kHz tones. (c) Activity of the model AI with over-representation of 5.9- and 8.3-kHz tones. Tones of frequencies from 5.6 to 9.1 kHz all activated the same population of neurons

We also varied neuronal properties (tuning bandwidth, response magnitude and spontaneous firing rate) and examined how they influence categorical sound representation. The model AI used was the same as described in Fig. 8(a), having over-representations of 3.5 and 14 kHz. Altering tuning bandwidth had a profound impact on categorical representation (Fig. 10, (a) and (d)). When the bandwidth was between one and two octaves, we observed two perceptual categories at the two over-represented frequencies. When the bandwidth was four octaves, the 0.1-octave frequency difference was perceived equally poorly across the tested frequency range, and the identification function was close to linear, suggesting that there were no perceptual categories. When the bandwidth was 0.5 octaves, there appeared to be three categories. Changing response magnitude altered discrimination performances, but not frequency identification performances (Fig. 10, (b) and (e)). Altering the level of spontaneous activity impacted frequency discrimination, but not frequency identification performances (Fig. 10, (c) and (f)).

Fig. 10

Influences of neuronal response properties on categorical representation of frequencies. We investigated how changes in tuning bandwidth, response magnitude and spontaneous activity influence discrimination of 0.1-octave frequency differences (a–c) and identification of frequencies (d–f) in a model with over-representations of 3.5 and 14 kHz. (a) and (d) Performances of models with four different neuronal tuning bandwidths: 0.5, 1, 2 and 4 octaves. Thicker lines indicate broader tuning bandwidths. Tuning bandwidth has a strong influence on how the model discriminates and identifies frequencies. (b–c) and (e–f) Response magnitude and spontaneous firing rate influence discrimination but not identification performances. The response magnitudes are indicated in the figure in units of spikes/tone. Spontaneous firing rates were 0.1, 0.2, 0.4, 1, 2 and 4 spikes/second. Frequency discrimination performance was generally better with higher response magnitude and lower spontaneous firing rates. Frequency identification performances were not changed by response magnitude or spontaneous firing rate; the performances overlapped completely. The parameters that were not systematically varied were given the following values: Tuning bandwidth, one octave; response magnitude, one spike/tone; spontaneous activity, one spike/second

4 Discussion

Categorical perception may be learned by exposure to specific stimuli during early development, or by extensive training in adulthood (Goldstone 1994; Lasky et al. 1975; MacKay et al. 2001; Williams 1977). After learning, the stimuli within a stimulus category are perceived as being more similar, and stimuli from different categories are perceived as being more different. These two forms of perceptual alterations are referred to as acquired perceptual equivalence and distinctiveness, respectively (Liberman et al. 1957). They are believed to underlie categorical perceptual behaviors, e.g., peaked discrimination functions and sigmoidal identification functions. Recent electrophysiological studies have revealed that sensory exposure and perceptual training often enlarge cortical representations of the relevant stimuli by retuning neuronal selectivity to the stimuli. In the present study, we examined the possibility that enlargement in cortical representation is a cortical mechanism of categorical perception. Our computational simulation results indicate that the perceptual contrast of the over-represented stimuli may be reduced, analogous to acquired perceptual equivalence, and the perceptual contrast of the neighboring under-represented stimuli may be enhanced, resulting in acquired perceptual distinctiveness. Thus, a perceptual category may form for the over-represented stimuli. Further analysis of the model AI with two over-represented stimulus ranges revealed behaviors characteristic of categorical perception: a peaked discrimination function and a sigmoidal identification function. These results support the notion that enlargement in cortical representation mediates learned categorical perception.

Previous electrophysiological studies have investigated the neural mechanisms of categorical perception by identifying categorical neurons, i.e., those that respond to all members of one category but not to any members of other categories. These neurons may be regarded as the category readout neurons. It is still unclear what kind of transformation of sensory information gives rise to this category selectivity and where the transformations take place. Results of the present study suggest that experience-dependent reorganization of stimulus representations in the primary sensory cortex could provide the transformation underlying learned categorical perception. In the sensory cortex, sensory information and hence perceptual categories are represented in populations of neurons, each of which shows graded responses to a large range of stimuli. There must be readout mechanisms to transform this distributed categorical representation into categorical responses in single neurons. In the present study, we obtained categorical perceptual behaviors in the models of AI using analyses of likelihood measures. Whether and how neural systems perform likelihood analysis is still under active investigation, and some models have been proposed (Jazayeri and Movshon 2006; Zhang et al. 1998). These models may provide the needed readout mechanisms to transform distributed categorical representations into categorical responses in single neurons.

Several computational models of categorical learning have been investigated in earlier studies, such as unsupervised auto-associative feedback networks (Anderson et al. 1977) and supervised multi-layered networks with a hidden layer and back-propagating error signals (Harnad et al. 1991). The construction of these models was primarily based on theoretical considerations, and the biological plausibility of some of the mechanisms (e.g., the back-propagation of error signals) is unclear. In the present study, the model auditory representations were based on findings of electrophysiological studies, e.g., that more neurons become tuned to more frequently experienced frequencies. We only considered the cortical decoding capacity and how it would influence animals’ perceptual performances. We did not address how the experience-altered cortical decoding capacity can be transformed into categorical neuronal responses and guide perceptual behaviors (i.e., the readout problem). The shaping of categorical perception with sensory exposure described in the present study is similar to the learning of perceptual categories by the auto-associative network in that both involve unsupervised learning and the learned perceptual categories are represented in distributed population responses (Anderson et al. 1977). The acoustic representations modeled in the present study may also be analogous to the hidden layers of the multilayer network models, which may be altered by experience in animals, and by learning in the multilayer network models (Harnad et al. 1991). Studies of sensory plasticity may provide insights for constructing biologically plausible models of categorical learning.

The results of this study provide some insights into cortical mechanisms of perceptual learning. Enlarged cortical representations of relevant stimuli have been observed after extensive training of adult animals to discriminate tonal frequencies (Recanzone et al. 1993), sound levels (Polley et al. 2004, 2006), temporal modulation rates (Bao et al. 2004), or somatosensory stimuli (Recanzone et al. 1992). Some of the studies show that representational sizes are highly correlated with tonal frequency discrimination performances after perceptual training (Recanzone et al. 1993). These results led to the notion that greater cortical representations are the neural basis for better perceptual discrimination performance. Such a simplistic view, however, has been challenged by contrary results showing that perceptual discrimination training sometimes does not alter the cortical feature representational map (Brown et al. 2004). Furthermore, animals with cortical representations of certain tonal frequencies enlarged by intracortical electrical stimulation did not show any improvement in stimulus discrimination performances in the over-represented frequency range (Talwar and Gerstein 2001). These results suggest that perceptual discrimination capability may be determined by many cortical neuronal properties, and not just by representational sizes. This is consistent with the simulation results of the present study, which show that enlarged representations of a very narrow frequency range may cause impaired discrimination of the over-represented frequencies. Our modeling results also indicate that over-represented frequencies may be discriminated better if the tuning bandwidths of the neurons become narrower (Fig. 10(a)), or if a large range of frequencies is over-represented (not shown). These results help to reconcile the seemingly contradictory results reviewed above.
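
Why a dense cluster of neurons tuned to one frequency can impair discrimination at that frequency can be illustrated with a Fisher-information calculation for independent Poisson neurons. The following is a minimal sketch and not the model used in the study: the retuning rule in `exposed_bfs`, the reading of "tuning bandwidth" as a 1-octave full width at half maximum, the baseline count `R_BASE`, and the 1 to 32 kHz frequency axis are all assumptions made for illustration.

```python
import math

# Assumption: "one octave" bandwidth is read as the full width at half maximum.
SIGMA = 1.0 / (2.0 * math.sqrt(2.0 * math.log(2.0)))
R_MAX = 1.0    # peak evoked response, spikes/tone (value quoted in the captions)
R_BASE = 0.05  # assumed baseline spike count inside the response window


def octaves(khz):
    """Convert a frequency in kHz to octaves relative to 1 kHz."""
    return math.log2(khz)


def naive_bfs(n=800, lo=0.0, hi=5.0):
    """Best frequencies (BFs) evenly spaced on the octave axis."""
    step = (hi - lo) / n
    return [lo + (i + 0.5) * step for i in range(n)]


def exposed_bfs(bfs, center, reach=0.5, squeeze=0.2):
    """Hypothetical retuning: BFs within `reach` octaves are pulled toward `center`."""
    return [center + squeeze * (b - center) if abs(b - center) < reach else b
            for b in bfs]


def fisher_information(x, bfs):
    """Sum over neurons of (tuning slope)^2 / (mean count) for Poisson spiking."""
    total = 0.0
    for b in bfs:
        d = x - b
        evoked = R_MAX * math.exp(-d * d / (2.0 * SIGMA ** 2))
        slope = -d / SIGMA ** 2 * evoked
        total += slope * slope / (evoked + R_BASE)
    return total


naive = naive_bfs()
exposed = exposed_bfs(naive, octaves(7.1))
for khz in (7.1, 11.9):
    x = octaves(khz)
    print(khz, "kHz  naive:", round(fisher_information(x, naive), 1),
          " exposed:", round(fisher_information(x, exposed), 1))
```

Under these assumptions the information falls at 7.1 kHz and rises at 11.9 kHz. A Gaussian tuning curve is flat at its peak, so neurons retuned to 7.1 kHz add almost no slope there, while their steep flanks now fall on the neighboring frequencies.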

Maximum likelihood estimation is an optimal population decoding method. It is not considered a biologically realistic decoding mechanism, although certain neuronal architectures are thought to be able to perform similar computations (Jazayeri and Movshon 2006; Zhang et al. 1998). In the limit of large numbers of encoding neurons and for Poisson firing rate distributions, its performance saturates the Cramér–Rao bound on the variance of the estimate, and sets the upper limit of the performance of biological systems (Dayan and Abbott 2001; Seung and Sompolinsky 1993). In essence, maximum likelihood estimation measures the maximum decoding capacity of a representational system. It has been used to model visual discrimination processes (Paradiso 1988). Although the successful applications of the method do not imply that the brain decodes sensory information using a similar maximum likelihood decoding method, they do indicate that perceptual behaviors are correlated with the stimulus decoding capacity of the neuronal network revealed by the method. We followed the same rationale in our analysis of the impact of cortical plasticity effects on perceptual discrimination performance.
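
The connection between this bound and the inverse square-root dependence of the naïve model's discrimination threshold on population size can be written out explicitly. This is the standard result for independent Poisson neurons (Dayan and Abbott 2001). The symbols $r_i$ (mean spike count of neuron $i$), $\hat{f}$ (an unbiased frequency estimate), $W$ (width of the represented frequency range) and the assumption of evenly spaced best frequencies $b$ are introduced here for illustration only.

```latex
I(f) = \sum_{i=1}^{N} \frac{\left[ r_i'(f) \right]^2}{r_i(f)},
\qquad
\operatorname{Var}(\hat{f}) \ge \frac{1}{I(f)}

% Evenly spaced best frequencies with density N / W and a common tuning shape r:
I(f) \approx \frac{N}{W} \int \frac{\left[ r'(f - b) \right]^2}{r(f - b)} \, db
\;\propto\; N
\quad\Longrightarrow\quad
\sqrt{\operatorname{Var}(\hat{f})} \;\gtrsim\; \frac{1}{\sqrt{I(f)}} \;\propto\; \frac{1}{\sqrt{N}}
```

In this idealized setting the smallest resolvable frequency difference shrinks as $1/\sqrt{N}$, which matches the regression result reported for the naïve model.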

The information processing events underlying the perceptual identification behavior are unknown. The traditional view is that both discrimination and identification are mediated by the same perceptual processes so that their performances should match each other. Later experiments showed that the stimulus identification performance is generally worse than what would be predicted from discrimination functions (Massaro 1987). In the present study, animals showed a nearly linear identification function across a large frequency range. Such a linear identification function is inconsistent with a purely discrimination-based identification process, which would have yielded a sigmoidal identification function like that of the MLE/LR group in Fig. 6. We modeled identification behaviors in two steps: first, the choice probability is determined with the log-likelihood ratio, and, second, a Bernoulli random process determines the identification choices. The two steps may correspond to the two separate processes underlying identification behaviors, namely sensory decoding and decision-making.
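
The two-step scheme can be sketched as follows, contrasting a deterministic likelihood-ratio choice with a Bernoulli choice. This is a hedged illustration only: the exact mapping from log-likelihoods to choice probability is defined in the Methods, and the ratio used for `p_low` below is a hypothetical stand-in. The tuning width `SIGMA`, baseline count `R_BASE`, population size of 200 and all function names are likewise assumptions.

```python
import math
import random

SIGMA = 0.42   # assumed Gaussian tuning width, octaves
R_MAX = 1.0    # peak evoked response, spikes/tone
R_BASE = 0.05  # assumed baseline spike count per tone window


def rates(x, bfs):
    """Mean spike counts of all neurons for a tone at x (octaves re 1 kHz)."""
    return [R_MAX * math.exp(-(x - b) ** 2 / (2 * SIGMA ** 2)) + R_BASE for b in bfs]


def poisson(mean, rng):
    """Draw one Poisson count (Knuth's method, adequate for small means)."""
    limit, k, p = math.exp(-mean), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def log_likelihood(counts, hypothesis_rates):
    """Poisson log-likelihood without the log(k!) term shared by both hypotheses."""
    return sum(k * math.log(r) - r for k, r in zip(counts, hypothesis_rates))


def fraction_identified_low(x, bfs, x_low, x_high, rng, stochastic, trials=200):
    """Fraction of trials in which a tone at x is labeled as the low frequency."""
    low_rates, high_rates = rates(x_low, bfs), rates(x_high, bfs)
    tone_rates = rates(x, bfs)
    hits = 0
    for _ in range(trials):
        counts = [poisson(r, rng) for r in tone_rates]   # step 0: noisy response
        ll_low = log_likelihood(counts, low_rates)       # step 1: sensory decoding
        ll_high = log_likelihood(counts, high_rates)
        if stochastic:
            # Hypothetical graded mapping; both sums are negative in practice,
            # and the clamp guards against unusual draws.
            p_low = min(1.0, max(0.0, ll_high / (ll_low + ll_high)))
            hits += rng.random() < p_low                 # step 2: Bernoulli choice
        else:
            hits += ll_low > ll_high                     # deterministic LR choice
    return hits / trials


bfs = [(i + 0.5) * 5.0 / 200 for i in range(200)]
x_low, x_high = math.log2(5.9), math.log2(11.9)
rng = random.Random(0)
for step in range(5):
    x = x_low + (x_high - x_low) * step / 4
    det = fraction_identified_low(x, bfs, x_low, x_high, rng, stochastic=False)
    sto = fraction_identified_low(x, bfs, x_low, x_high, rng, stochastic=True)
    print(f"{2 ** x:5.1f} kHz  deterministic={det:.2f}  bernoulli={sto:.2f}")
```

Separating the decoding step from the choice step keeps the sensory evidence identical in both variants, so any difference in the shape of the identification function comes from the decision rule alone.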

In this study, we simplified neuronal tuning properties: all neurons have the same firing rate, tuning bandwidth and spontaneous firing rate. Essentially the same results were obtained with model neurons whose properties have the same distributions as those of recorded neurons (data not shown). The sound exposure-induced cortical plasticity effects were also simplified in this study, and only changes in the tuning frequencies were included in the analysis. Other neuronal response properties, such as the shapes of the tuning curves, the maximum response magnitudes, spontaneous firing rates, and spike timing/correlation can also be altered either by sound exposure or by perceptual learning (Bao et al. 2001; Beitel et al. 2003; Blake et al. 2006; Brown et al. 2004; Chang and Merzenich 2003; Chowdhury and Suga 2000; Edeline and Weinberger 1993; Engineer et al. 2004; Fritz et al. 2003; Kilgard and Merzenich 1998; Kilgard et al. 2001; Ma and Suga 2003; Ohl and Scheich 1996; Polley et al. 2004; Recanzone et al. 1993; Schoups et al. 2001; Zhang et al. 2001). Those forms of cortical plasticity effects could also contribute to the learning of categorical perception. Nevertheless, our analysis demonstrates that the enlargement of cortical representations could be a mechanism for categorical perception. Systematic examinations of categorical perception in animals that have been exposed to controlled sensory input would provide new insights into the neural mechanisms of categorical perceptual learning.



The work was supported by a grant from the US National Institutes of Health.


  1. Anderson, J. A., Silverstein, J. W., Ritz, S. A., & Jones, R. S. (1977). Distinctive features, categorical perception, and probability learning: some applications of a neural model. Psychological Review, 84, 413–451.
  2. Bala, A. D., Spitzer, M. W., & Takahashi, T. T. (2003). Prediction of auditory spatial acuity from neural images on the owl's auditory space map. Nature, 424(6950), 771–774.
  3. Bao, S., Chan, V. T., & Merzenich, M. M. (2001). Cortical remodelling induced by activity of ventral tegmental dopamine neurons. Nature, 412(6842), 79–83.
  4. Bao, S., Chang, E. F., Woods, J., & Merzenich, M. M. (2004). Temporal plasticity in the primary auditory cortex induced by operant perceptual learning. Nature Neuroscience, 7(9), 974–981.
  5. Beitel, R. E., Schreiner, C. E., Cheung, S. W., Wang, X., & Merzenich, M. M. (2003). Reward-dependent plasticity in the primary auditory cortex of adult monkeys trained to discriminate temporally modulated signals. Proceedings of the National Academy of Sciences of the United States of America, 100(19), 11070–11075.
  6. Blake, D. T., Heiser, M. A., Caywood, M., & Merzenich, M. M. (2006). Experience-dependent adult cortical plasticity requires cognitive association between sensation and reward. Neuron, 52(2), 371–381.
  7. Bornstein, M. H., Kessen, W., & Weiskopf, S. (1976). The categories of hue in infancy. Science, 191(4223), 201–202.
  8. Brown, M., Irvine, D. R., & Park, V. N. (2004). Perceptual learning on an auditory frequency discrimination task by cats: association with changes in primary auditory cortex. Cerebral Cortex, 14(9), 952–965.
  9. Burns, E. M., & Ward, W. D. (1978). Categorical perception–phenomenon or epiphenomenon: evidence from experiments in the perception of melodic musical intervals. Journal of the Acoustical Society of America, 63(2), 456–468.
  10. Chang, E. F., & Merzenich, M. M. (2003). Environmental noise retards auditory cortical development. Science, 300(5618), 498–502.
  11. Cherniak, C. (1990). The bounded brain: toward quantitative neuroanatomy. Journal of Cognitive Neuroscience, 2, 58–68.
  12. Chowdhury, S. A., & Suga, N. (2000). Reorganization of the frequency map of the auditory cortex evoked by cortical electrical stimulation in the big brown bat. Journal of Neurophysiology, 83(4), 1856–1863.
  13. Crozier, J. B. (1997). Absolute pitch: practice makes perfect, the earlier the better. Psychology of Music, 25, 110–119.
  14. Dayan, P., & Abbott, L. F. (2001). Theoretical Neuroscience. Cambridge, MA: The MIT Press.
  15. DeWeese, M. R., Wehr, M., & Zador, A. M. (2003). Binary spiking in auditory cortex. Journal of Neuroscience, 23(21), 7940–7949.
  16. Edeline, J. M., & Weinberger, N. M. (1993). Receptive field plasticity in the auditory cortex during frequency discrimination training: selective retuning independent of task difficulty. Behavioral Neuroscience, 107(1), 82–103.
  17. Ehret, G. (1992). Categorical perception of mouse-pup ultrasounds in the temporal domain. Animal Behaviour, 43(3), 409–416.
  18. Ehret, G., & Haack, B. (1981). Categorical perception of mouse pup ultrasound by lactating females. Naturwissenschaften, 68(4), 208–209.
  19. Eimas, P. D. (1974). Auditory and linguistic processing of cues for place of articulation by infants. Perception & Psychophysics, 16, 564–570.
  20. Engineer, N. D., Percaccio, C. R., Pandya, P. K., Moucha, R., Rathbun, D. L., & Kilgard, M. P. (2004). Environmental enrichment improves response strength, threshold, selectivity, and latency of auditory cortex neurons. Journal of Neurophysiology, 92(1), 73–82.
  21. Erickson, C. A., Jagadeesh, B., & Desimone, R. (2000). Clustering of perirhinal neurons with similar properties following visual experience in adult monkeys. Nature Neuroscience, 3(11), 1143–1148.
  22. Freedman, D. J., Riesenhuber, M., Poggio, T., & Miller, E. K. (2001). Categorical representation of visual stimuli in the primate prefrontal cortex. Science, 291(5502), 312–316.
  23. Fritz, J., Shamma, S., Elhilali, M., & Klein, D. (2003). Rapid task-related plasticity of spectrotemporal receptive fields in primary auditory cortex. Nature Neuroscience, 6(11), 1216–1223.
  24. Goldstone, R. (1994). Influences of categorization on perceptual discrimination. Journal of Experimental Psychology: General, 123(2), 178–200.
  25. Green, D. M., & Swets, J. A. (1966). Signal detection theory and psychophysics. New York: Wiley.
  26. Grier, J. B. (1971). Nonparametric indexes for sensitivity and bias: computing formulas. Psychological Bulletin, 75(6), 424–429.
  27. Harnad, S. R. (1987). Categorical perception: the groundwork of cognition. Cambridge: Cambridge University Press.
  28. Harnad, S. (2003). Categorical perception. In L. Nadel (Ed.), Encyclopedia of cognitive science. London: Macmillan.
  29. Harnad, S., Hanson, S. J., & Lubin, J. (1991). Categorical perception and the evolution of supervised learning in neural nets. In L. Reeker (Ed.), Working Papers of the AAAI Spring Symposium on Machine Learning of Natural Language and Ontology. pp. 65–74. Stanford, CA.
  30. Holt, L. L., Lotto, A. J., & Diehl, R. L. (2004). Auditory discontinuities interact with categorization: implications for speech perception. Journal of the Acoustical Society of America, 116(3), 1763–1773.
  31. Jazayeri, M., & Movshon, J. A. (2006). Optimal representation of sensory information by neural populations. Nature Neuroscience, 9(5), 690–696.
  32. Kilgard, M. P., & Merzenich, M. M. (1998). Cortical map reorganization enabled by nucleus basalis activity. Science, 279(5357), 1714–1718.
  33. Kilgard, M. P., Pandya, P. K., Vazquez, J., Gehi, A., Schreiner, C. E., & Merzenich, M. M. (2001). Sensory input directs spatial and temporal plasticity in primary auditory cortex. Journal of Neurophysiology, 86(1), 326–338.
  34. Kluender, K. R., Diehl, R. L., & Killeen, P. R. (1987). Japanese quail can learn phonetic categories. Science, 237(4819), 1195–1197.
  35. Kuhl, P. K., & Miller, J. D. (1975). Speech perception by the chinchilla: Voiced-voiceless distinction in alveolar plosive consonants. Science, 190(4209), 69–72.
  36. Kuhl, P. K., & Padden, D. M. (1982). Enhanced discriminability at the phonetic boundaries for the voicing feature in macaques. Perception & Psychophysics, 32(6), 542–550.
  37. Kuhl, P. K., & Padden, D. M. (1983). Enhanced discriminability at the phonetic boundaries for the place feature in macaques. Journal of the Acoustical Society of America, 73(3), 1003–1010.
  38. Kuhl, P. K., Williams, K. A., Lacerda, F., Stevens, K. N., & Lindblom, B. (1992). Linguistic experience alters phonetic perception in infants by 6 months of age. Science, 255(5044), 606–608.
  39. Lasky, R. E., Syrdal-Lasky, A., & Klein, R. E. (1975). VOT discrimination by four to six and a half month old infants from Spanish environments. Journal of Experimental Child Psychology, 20(2), 215–225.
  40. Liberman, A. M., Harris, K. S., Hoffman, H. S., & Griffith, B. C. (1957). The discrimination of speech sounds within and across phoneme boundaries. Journal of Experimental Psychology, 54(5), 358–368.
  41. Liberman, A. M., Cooper, F. S., Shankweiler, D. P., & Studdert-Kennedy, M. (1967). Perception of the speech code. Psychological Review, 74(6), 431–461.
  42. Livingston, K. R., Andrews, J. K., & Harnad, S. (1998). Categorical perception effects induced by category learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24(3), 732–753.
  43. Luna, R., Hernandez, A., Brody, C. D., & Romo, R. (2005). Neural codes for perceptual discrimination in primary somatosensory cortex. Nature Neuroscience, 8(9), 1210–1219.
  44. Ma, X., & Suga, N. (2003). Augmentation of plasticity of the central auditory system by the basal forebrain and/or somatosensory cortex. Journal of Neurophysiology, 89(1), 90–103.
  45. MacKay, I. R., Flege, J. E., Piske, T., & Schirru, C. (2001). Category restructuring during second-language speech acquisition. Journal of the Acoustical Society of America, 110(1), 516–528.
  46. Massaro, D. W. (1987). Categorical partition: A fuzzy logical model of categorical behavior. In S. Harnad (Ed.), Categorical perception: the groundwork of cognition. pp. 254–283, Cambridge, UK: Cambridge University Press.
  47. Naatanen, R., Lehtokoski, A., Lennes, M., Cheour, M., Huotilainen, M., Iivonen, A., et al. (1997). Language-specific phoneme representations revealed by electric and magnetic brain responses. Nature, 385(6615), 432–434.
  48. Nelson, D. A., & Marler, P. (1989). Categorical perception of a natural stimulus continuum: birdsong. Science, 244(4907), 976–978.
  49. Ohl, F. W., & Scheich, H. (1996). Differential frequency conditioning enhances spectral contrast sensitivity of units in auditory cortex (field AI) of the alert Mongolian gerbil. European Journal of Neuroscience, 8(5), 1001–1017.
  50. O’Kusky, J., & Colonnier, M. (1982). A laminar analysis of the number of neurons, glia, and synapses in the adult cortex (area 17) of adult macaque monkeys. Journal of Comparative Neurology, 210(3), 278–290.
  51. Pantev, C., Oostenveld, R., Engelien, A., Ross, B., Roberts, L. E., & Hoke, M. (1998). Increased auditory cortical representation in musicians. Nature, 392(6678), 811–814.
  52. Paradiso, M. A. (1988). A theory for the use of visual orientation information which exploits the columnar structure of striate cortex. Biological Cybernetics, 58(1), 35–49.
  53. Pollack, L., & Norman, D. A. (1964). A non-parametric analysis of recognition experiments. Psychonomic Science, 1, 125–126.
  54. Polley, D. B., Heiser, M. A., Blake, D. T., Schreiner, C. E., & Merzenich, M. M. (2004). Associative learning shapes the neural code for stimulus magnitude in primary auditory cortex. Proceedings of the National Academy of Sciences of the United States of America, 101(46), 16351–16356.
  55. Polley, D. B., Steinberg, E. E., & Merzenich, M. M. (2006). Perceptual learning directs auditory cortical map reorganization through top–down influences. Journal of Neuroscience, 26(18), 4970–4982.
  56. Powell, M. J. D. (1977). A fast algorithm for nonlinearly constrained optimization calculations. In G. A. Watson (Ed.), Numerical analysis. New York: Springer.
  57. Recanzone, G. H., Merzenich, M. M., Jenkins, W. M., Grajski, K. A., & Dinse, H. R. (1992). Topographic reorganization of the hand representation in cortical area 3b owl monkeys trained in a frequency-discrimination task. Journal of Neurophysiology, 67(5), 1031–1056.
  58. Recanzone, G. H., Schreiner, C. E., & Merzenich, M. M. (1993). Plasticity in the frequency representation of primary auditory cortex following discrimination training in adult owl monkeys. Journal of Neuroscience, 13(1), 87–103.
  59. Romo, R., Merchant, H., Zainos, A., & Hernandez, A. (1997). Categorical perception of somesthetic stimuli: psychophysical measurements correlated with neuronal events in primate medial premotor cortex. Cerebral Cortex, 7(4), 317–326.
  60. Schoups, A., Vogels, R., Qian, N., & Orban, G. (2001). Practising orientation identification improves orientation coding in V1 neurons. Nature, 412(6846), 549–553.
  61. Sengpiel, F., Stawinski, P., & Bonhoeffer, T. (1999). Influence of experience on orientation maps in cat visual cortex. Nature Neuroscience, 2(8), 727–732.
  62. Seung, H. S., & Sompolinsky, H. (1993). Simple models for reading neuronal population codes. Proceedings of the National Academy of Sciences of the United States of America, 90(22), 10749–10753.
  63. Steinschneider, M., Fishman, Y. I., & Arezzo, J. C. (2003). Representation of the voice onset time (VOT) speech parameter in population responses within primary auditory cortex of the awake monkey. Journal of the Acoustical Society of America, 114(1), 307–321.
  64. Takeuchi, A. H., & Hulse, S. H. (1993). Absolute pitch. Psychological Bulletin, 113(2), 345–361.
  65. Talwar, S. K., & Gerstein, G. L. (2001). Reorganization in awake rat auditory cortex by local microstimulation and its effect on frequency-discrimination behavior. Journal of Neurophysiology, 86(4), 1555–1572.
  66. Wiesel, T. N. (1982). Postnatal development of the visual cortex and the influence of environment. Nature, 299(5884), 583–591.
  67. Williams, L. (1977). The perception of stop consonant voicing by Spanish–English bilinguals. Perception & Psychophysics, 21, 289–297.
  68. Wyttenbach, R. A., May, M. L., & Hoy, R. R. (1996). Categorical perception of sound frequency by crickets. Science, 273(5281), 1542–1544.
  69. Zhang, K., Ginzburg, I., McNaughton, B. L., & Sejnowski, T. J. (1998). Interpreting neuronal population activity by reconstruction: unified framework with application to hippocampal place cells. Journal of Neurophysiology, 79(2), 1017–1044.
  70. Zhang, L. I., Bao, S., & Merzenich, M. M. (2001). Persistent and specific influences of early acoustic environments on primary auditory cortex. Nature Neuroscience, 4(11), 1123–1130.

Copyright information

© Springer Science+Business Media, LLC 2007

Authors and Affiliations

  1. Helen Wills Neuroscience Institute, University of California, Berkeley, USA
