The ability of single-fiber counting models to explain Weber’s Law
The degree to which the average discharge rate of individual AN fibers can account for robust level discrimination over their limited dynamic range depends on the shape of their rate-level function and on the level dependence of the variance in their response. First, several simple rate-level functions are considered for the Poisson case to evaluate the relationship between the requirements for single AN fibers to produce Weber’s Law and known physiological response properties. The sensitivity per decibel δ′ (and thus the JND) can be calculated if the rate-level function r(L) is specified. Second, the effects of non-Poisson variance on the ability to explain Weber’s Law are explored by evaluating an existing model that includes dead-time refractoriness.
The effect of rate-level shape
The first rate-level function considered is given by the following equations in which all levels are in decibels re: threshold:
where SR is the spontaneous rate of discharge and L_sat is the level above which the rate saturates. This function is plotted in Figure 1 for L_sat = 40 dB and for several values of SR. It can be verified that, for this choice of r(L), (δ′)² in Eq. (7) is given by
In Figure 2, (δ′)² vs. L is plotted for T = 0.1 s and three values of SR, corresponding to the three classes of fibers suggested by Liberman (1978): those with low spontaneous rates (Fig. 2A with SR = 0.5 sp/s), those with medium spontaneous rates (Fig. 2B with SR = 10 sp/s), and those with high spontaneous rates (Fig. 2C with SR = 50 sp/s). Saturation at L_sat makes the function zero above L_sat. The dashed curves show the JND in decibels as a function of L for a single channel with the corresponding (δ′)². Note that all functions are plotted relative to a threshold (which would vary among AN fibers).
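The single-fiber calculation plotted in Figure 2 can be sketched numerically. The snippet below is only an illustration under stated assumptions: the rate-level function is taken to be flat at SR below threshold, linear in decibels up to L_sat, and flat above; Eq. (7) is taken to be the usual Poisson-count relation (δ′)² = T[r′(L)]²/r(L); and the JND is taken as 1/δ′. The slope value `SLOPE` and all function names are hypothetical choices, not values from the text.

```python
import math

T = 0.1        # duration in seconds (value used for Figure 2)
L_SAT = 40.0   # saturation level in dB re: threshold (value used for Figure 1)
SLOPE = 5.0    # sp/s per dB; hypothetical value, for illustration only


def rate(level_db, sr, slope=SLOPE, l_sat=L_SAT):
    """Assumed piecewise-linear rate-level function (sp/s)."""
    return sr + slope * min(max(level_db, 0.0), l_sat)


def rate_slope(level_db, slope=SLOPE, l_sat=L_SAT):
    """Derivative of the assumed rate-level function (sp/s per dB)."""
    return slope if 0.0 <= level_db < l_sat else 0.0


def delta_prime_sq(level_db, sr, duration=T):
    """Poisson-count sensitivity per dB, squared: T * r'(L)**2 / r(L)."""
    return duration * rate_slope(level_db) ** 2 / rate(level_db, sr)


def jnd_db(level_db, sr, duration=T):
    """JND in dB, taken as 1/delta' (infinite where delta' is zero)."""
    dp_sq = delta_prime_sq(level_db, sr, duration)
    return math.inf if dp_sq == 0.0 else 1.0 / math.sqrt(dp_sq)


if __name__ == "__main__":
    for sr in (0.5, 10.0, 50.0):  # low, medium, and high SR
        row = [round(delta_prime_sq(level, sr), 4) for level in (5, 20, 35, 45)]
        print(f"SR = {sr:4.1f} sp/s: (delta')^2 at 5, 20, 35, 45 dB = {row}")
```

With any positive slope, the sketch reproduces the qualitative behavior described here: (δ′)² falls with level while the rate is growing, is smaller for higher SR, and is zero above L_sat.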
Several observations are relevant here. First, note that a single low-to-medium spontaneous-rate fiber provides sufficient rate information for a JND of 3 or 4 dB at levels just above threshold. When a longer duration, say T = 0.3 s, and a higher slope, say 10 sp/s/dB (achieving a discharge rate of 200 sp/s at 20 dB above threshold), are used in the calculations, a single fiber provides sufficient rate information for a JND of approximately 1 dB.
Second, note that high-SR fibers provide significantly less information in terms of average discharge rate than do low-SR fibers. In Figure 2, (δ′)² for a low-SR unit is approximately three times larger than that for a high-SR unit. This effect comes from the larger variance associated with higher means in Poisson random variables. The details of the rate-level functions vary among the different spontaneous rate groups (Sachs and Abbas 1974; Winter et al. 1990; Schoonhoven et al. 1997); however, a more precise description of the rate-level functions would be expected to have only a small quantitative, but not qualitative, effect on the calculations and conclusions of this study.
Third, the shape of the dependence of (δ′)² for a single Poisson channel with the basic rate-level function of Eq. (17) differs grossly from psychoacoustic observations. It is suggested by the results plotted in Figure 2 [and is easily verified analytically, see Eq. (7)] that whenever r(L) is increasing linearly (on a dB scale), (δ′)² is decreasing with level (due to increased variance with increases in rate). There is no physiological evidence of fibers with rate-level functions that increase faster than linearly over a range greater than 10 or 20 dB. Thus, it can be concluded that a single Poisson channel with a rate-level function compatible with available physiology cannot provide sufficient information even for Weber’s Law, let alone improvement of performance with level, or “the near miss to Weber’s Law.”
The second rate-level function considered is given by
where L is in dB relative to a threshold reference value. It is easily verified that Weber’s Law predictions are obtained from a Poisson channel with this rate-level function. In this case δ′ is equal to a constant value [10√(4cT)] that is independent of L for levels above threshold. It follows that a near-miss prediction on a single Poisson channel requires a rate-level function that grows faster than quadratically on a decibel level scale.
The last rate-level function considered has an exponential shape (on a dB level scale). This type of function was used in the counting models of McGill and Goldberg (1968a,b) and Luce and Green (1974). In both models, count is a Poisson (or nearly Poisson) random variable with a rate-level function that can be written as
where a and b are constants and L is the level in dB re: some reference level L_ref. In this case, the resulting expression for (δ′)² is T a b² e^(bL). The result is a JND that decreases with increasing level, and thus these models can predict the observed near-miss behavior. However, rate-level functions with this shape have not been observed in AN fibers and at best could represent combinations of many fibers.
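The stated result for the exponential case can be checked in one line. The following is a sketch that assumes the rate-level function has the form r(L) = a e^{bL} (inferred from the stated result) and that Eq. (7) is the Poisson-count relation (δ′)² = T[r′(L)]²/r(L), with the JND taken as 1/δ′:

```latex
\[
(\delta')^{2} \;=\; \frac{T\,[r'(L)]^{2}}{r(L)}
 \;=\; \frac{T\,\bigl(a b\,e^{bL}\bigr)^{2}}{a\,e^{bL}}
 \;=\; T\,a\,b^{2}\,e^{bL},
\qquad
\mathrm{JND} \;\approx\; \frac{1}{\delta'} \;=\; \frac{e^{-bL/2}}{b\sqrt{aT}} .
\]
```

Under these assumptions, for b > 0 the sensitivity grows exponentially with L, so the JND shrinks steadily with level rather than remaining constant.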
Since single-channel Poisson counting models of level discrimination require rate-level functions that do not represent physiological data directly, we next consider whether deviations from Poisson variability can account for Weber’s Law in single AN fibers.
The effect of deviations from Poisson variability
The importance of the assumptions for the statistical properties of the model discharge patterns is illustrated by single-channel predictions using the formulae of Teich and Lachs (1979). They give expressions for the mean and variance of the count for a dead-time-modified Poisson process, assuming that the rate of the original Poisson process grows in proportion to stimulus energy. The mean and variance of the count in the modified process are given by
and
where E is the stimulus energy, E_ref is a threshold constant, T is the duration, and τ is the dead time. If (δ′)² is computed for a single channel with these statistics, one obtains
Thus, (δ′)² for a single channel would saturate and become independent of level.
This example shows the importance of variance assumptions, since the mean rate-level function [mean count in Eq. (21) divided by T] has a shape very similar to Eq. (17) with L_sat = 20 dB (if thresholds are adjusted), and yet the predicted (δ′)² in Eq. (23) is dramatically different from that given in Eq. (7) and shown in Figure 2. Also, a saturating rate-level function can provide information sufficient for Weber’s Law, even at a δ′ level consistent with a JND of 0.3 dB (δ′ = 3.2) when τ/T = 0.005.
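The saturation of δ′ in the dead-time case can be illustrated numerically. The sketch below assumes the standard results for a Poisson process modified by a fixed (nonparalyzable) dead time, namely a mean count of λT/(1 + λτ) and a count variance of λT/(1 + λτ)³, and assumes that the driving rate λ is proportional to stimulus energy. Whether these forms match the expressions referred to above term by term is an assumption; the constant `LAMBDA_0` and the function names are hypothetical.

```python
import math

T = 0.1            # duration in seconds
TAU = 0.005 * T    # dead time chosen so that tau/T = 0.005, as in the text
LAMBDA_0 = 10.0    # driving rate at 0 dB in events/s; hypothetical value


def driving_rate(level_db):
    """Rate of the underlying Poisson process, assumed proportional to energy."""
    return LAMBDA_0 * 10.0 ** (level_db / 10.0)


def count_mean(level_db):
    """Mean count of the dead-time-modified process (assumed standard form)."""
    lam = driving_rate(level_db)
    return lam * T / (1.0 + lam * TAU)


def count_var(level_db):
    """Count variance of the dead-time-modified process (assumed standard form)."""
    lam = driving_rate(level_db)
    return lam * T / (1.0 + lam * TAU) ** 3


def delta_prime(level_db, step_db=1e-3):
    """Sensitivity per dB: slope of the mean count divided by the count SD."""
    d_mean = (count_mean(level_db + step_db) - count_mean(level_db - step_db)) / (2.0 * step_db)
    return d_mean / math.sqrt(count_var(level_db))


# High-level limit under the same assumptions: sqrt(T/tau) * ln(10)/10
limit = math.sqrt(T / TAU) * math.log(10.0) / 10.0

if __name__ == "__main__":
    for level in (0, 20, 40, 60, 80):
        print(f"L = {level:2d} dB: delta' = {delta_prime(level):.3f}")
    print(f"high-level limit = {limit:.3f}")
```

Under these assumptions the limit is about 3.26, in line with the quoted δ′ of 3.2 and JND of roughly 0.3 dB for τ/T = 0.005.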
However, a question for this article is how well non-Poisson models of this type describe AN behavior. The mean function in the model of Teich and Lachs (1979) is similar to observed rate-level functions; the variance, however, is clearly inconsistent with available data near saturation. For example, with a saturation rate of 100 sp/s, the variance of the count over 1 s at a rate of 90 sp/s is less than unity, and the coefficient of variation (the ratio of the standard deviation to the mean count) is less than 0.01. Furthermore, this relative variability continues to decrease inversely proportionally to the stimulus energy because the model fibers are stimulated to discharge almost immediately upon the conclusion of the fixed dead time after each firing. AN data are closer to the Poisson assumption. For example, the count data from Young and Barta (1986, their Fig. 6a) show that a count of 20 discharges per 200 ms (100 sp/s) has a standard deviation of approximately 3 discharges per 200 ms. (For a count of 100 discharges over a full second, this would correspond to a standard deviation of 3√5 = 6.7). Thus, the coefficient of variation at this mean count would be 0.067, which is roughly a factor of 1.5 less than expected for a Poisson process (0.1), but much greater than the model used by Teich and Lachs (1979). It follows that this non-Poisson model does not appropriately describe AN patterns and thus overestimates the amount of AN information at high levels.
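The arithmetic behind this comparison can be laid out explicitly. The short sketch below uses the numbers quoted above and, for the dead-time case, assumes the standard variance λT/(1 + λτ)³ for a fixed nonparalyzable dead time; the variable names are illustrative only.

```python
import math

# Young and Barta (1986) example quoted in the text
count_200ms = 20.0   # mean count in 200 ms (100 sp/s)
sd_200ms = 3.0       # approximate SD of the count in 200 ms

# Scale to a 1-s window, treating the five 200-ms segments as independent
n_segments = 5
mean_1s = count_200ms * n_segments          # 100 discharges
sd_1s = sd_200ms * math.sqrt(n_segments)    # 3 * sqrt(5), about 6.7
cv_data = sd_1s / mean_1s                   # about 0.067

# Poisson reference: variance equals the mean
cv_poisson = math.sqrt(mean_1s) / mean_1s   # 0.1

# Dead-time model at 90 sp/s with a 100 sp/s saturation rate (tau = 10 ms)
tau = 1.0 / 100.0
out_rate = 90.0
lam = out_rate / (1.0 - out_rate * tau)     # driving rate that yields 90 sp/s
var_dead_time = lam * 1.0 / (1.0 + lam * tau) ** 3   # count variance over 1 s

if __name__ == "__main__":
    print(f"CV from data:    {cv_data:.3f}")
    print(f"CV for Poisson:  {cv_poisson:.3f}")
    print(f"Poisson/data:    {cv_poisson / cv_data:.2f}")
    print(f"Dead-time count variance at 90 sp/s: {var_dead_time:.2f}")
```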
This example illustrates the extent to which variance must be reduced from Poisson statistics to produce Weber’s Law and that this reduction is much greater than has been reported for AN fibers (e.g., Young and Barta 1986; Delgutte 1987; Winter and Palmer 1991). Thus, the deviation from Poisson discharge-count variance observed in AN fibers cannot account for the inability of Poisson counting models to predict robust level encoding.
The ability of multiple-CF counting models to explain the “near miss” to Weber’s Law
Since single-fiber models cannot simultaneously be consistent with physiological observations and psychophysical observations, multiple-channel models are considered. When models for level discrimination of narrowband stimuli are considered, the spread of excitation to fibers with CFs that differ from stimulus frequency becomes a central issue. This section begins with a description of a simple AN model (Siebert 1965, 1968) to demonstrate how a population of AN fibers with limited dynamic range can produce Weber’s Law. Several modifications to Siebert’s model are then discussed in terms of their ability to produce the “near miss” to Weber’s Law.
Siebert’s model of Weber’s Law based on spread of excitation
Unlike many other modeling studies that also explicitly included a spread of excitation over CF (e.g., Zwicker 1956; Maiwald 1967a,b,c; Florentine and Buus 1981), Siebert (1965, 1968) included the AN discharge patterns explicitly in his multiple-CF model. Siebert (1965, 1968) assumed optimum processing of a population of Poisson counts, which were based on a saturating rate-level function that was the same for all fibers except that tone threshold varied with CF based on AN frequency tuning. With these assumptions, level discrimination using fibers within a narrow CF band is poor except for a narrow range of levels near threshold (as discussed above), so that the robustness of performance across level is almost completely determined by the spread of excitation over CF bands. Siebert (1965, 1968) showed that Weber’s Law is predicted for tonal stimuli by this model if one assumes a uniform-in-log-frequency distribution of CFs, and two-piece linear tuning curves with constant slopes (in decibels versus log-frequency axes). Although AN fibers with CFs near the tone frequency saturate, the edges of the activity pattern provide a constant amount of information as level increases.
Possible modifications of Siebert’s model to explain the “near miss”
If the distribution of CFs is changed from uniform-in-log-frequency to uniform-in-linear-frequency, a “near miss” deviation from Weber’s Law is predicted with the amount of deviation dependent upon assumptions about the slopes of the tuning curves. This deviation is a direct consequence of having more fibers in the nonsaturated region of CFs as level increases. Specifically, the increase in the number of fibers in the nonsaturated region with CFs above the stimulus frequency is much greater than the decrease in the number with CFs below the stimulus frequency. However, the original uniform-in-log-frequency assumption is much more descriptive of available physiological data than the uniform-in-linear-frequency alternative, thus rejecting this possibility for the purposes of the present study. [Note that the uniform-in-linear-frequency assumption with this model results in the incorrect prediction that the masking of high-CF fibers results in a decrease in performance as level increases when the masking forces the system to use information on low-frequency fibers, since the number of fibers in the useful range (nonsaturated) decreases as level increases.]
If the shape of the tuning curves changes as a function of CF such that higher-CF fibers have lower slopes (decreasing Q), then the spread of excitation would proceed more quickly and place more high-CF fibers in the useful range at higher levels. A model with this assumption would also result in a “near miss” prediction. Although the narrowly tuned “tip” portion of tuning curves shows an increasing Q with increasing CF, the tails of the tuning curves at high CFs (Kiang and Moxon 1974) provide a clear physiological basis for this assumption. Other examples of the dependence of the tails on CF can also be seen in Kiang (1980) and Evans (1972). Instead of describing available tuning curves and the distribution of CFs and calculating the spread of excitation, one can measure the spread directly by measuring the distribution of thresholds for a fixed stimulus waveform for all AN fibers. A sample of measured thresholds for a 1-kHz tone from three cats can be seen in Figure 4 in Kiang and Moxon (1974). The slope of the mean threshold as a function of CF decreases with increasing CF when plotted on the log-frequency axis, consistent with the increasing number of useful fibers as the level increases (if the distribution of CFs is approximately uniform on a log-frequency scale and if the distribution of thresholds at a fixed CF is independent of CF). There are not sufficient data to characterize this factor with quantitative precision; it is clear, however, that this effect would contribute to a deviation from Weber’s Law in the observed direction, i.e., an improvement in performance with increasing level.
The third factor is the shape of the rate-level functions for fibers with CFs above and below the stimulus frequency (Sachs and Abbas 1974; Cooper and Yates 1994). The slope of the rate-level function for a given fiber decreases as the stimulus frequency increases above CF. As frequency decreases below CF, the slope either increases or remains roughly constant. This result indicates that many high-CF fibers will have steeper rate-level functions than fibers with CFs near the stimulus frequency. This would also predict an improvement in discrimination performance at higher levels (other things being equal) relative to Siebert’s prediction of Weber’s Law. If the slope increases by a factor of 3, the predicted δ′ for a single fiber increases by a factor between √3 and 3, depending on the spontaneous rate. Note that such a slope change is consistent with the nonlinear growth of the output of the high-frequency channels in Zwicker’s (1956) model that leads to a predicted improvement in performance at high levels. Furthermore, the fibers with CF below the stimulus frequency are less useful than the fibers with higher CFs. If it were possible to eliminate the higher-CF fibers, performance (i.e., sensitivity per decibel) would be expected to decrease as level increased as a consequence of this effect.
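The range quoted for the effect of a threefold slope increase can be recovered from a simple case. As a sketch, assume a Poisson count and a rate-level function that is linear in decibels with slope s above threshold, r(L) = SR + sL (the symbol s is introduced here for illustration only):

```latex
\[
\delta'(s) \;=\; s\,\sqrt{\frac{T}{\mathrm{SR} + sL}},
\qquad
\frac{\delta'(3s)}{\delta'(s)} \;=\; 3\,\sqrt{\frac{\mathrm{SR} + sL}{\mathrm{SR} + 3sL}}
\;\longrightarrow\;
\begin{cases}
3, & \mathrm{SR} \gg sL,\\[4pt]
\sqrt{3}, & \mathrm{SR} \ll sL .
\end{cases}
\]
```

When spontaneous activity dominates the variance, the steeper slope raises the mean change without adding much variance, so δ′ scales with the slope. When driven activity dominates, the extra variance offsets part of the gain and δ′ scales only with the square root of the slope.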
To summarize the conclusions from Siebert’s model (optimum processing of stationary Poisson patterns), deviations from Weber’s Law that are comparable to psychophysical data (a near miss) could be predicted for tones by modifying the model to incorporate the tails of tuning curves for high-CF fibers and/or changes in slope of the rate-level function with tone frequency relative to CF. It is important to consider how well the data being predicted constrain the models being investigated. For example, as discussed above, many modifications of Siebert’s basic model can produce a “near miss” to Weber’s Law based on spread of excitation (also see Lachs et al. 1984; Delgutte 1996; Heinz et al. 2001a,b). Thus, the ability to predict the “near miss” rather than Weber’s Law for tones in quiet is not a critical issue for evaluating level encoding in the AN. A much stronger constraint is the ability to explain the observation that level-discrimination performance is still robust in the presence of off-frequency masking noise (e.g., Moore and Raab 1974, 1975; Viemeister 1983). The simplest (and most common) interpretation of this result is that spread of excitation is not necessary for robust level encoding. This interpretation is based on the assumption that the only influence of the off-frequency masker is prevention of any spread of excitation to CFs away from the tone frequency. If this is true, it becomes critical to account for Weber’s Law only on the basis of information in AN fibers with CFs near the tone frequency. In fact, models that assume Weber’s Law within single-CF channels produce a near miss to Weber’s Law for tones in quiet based on spread of excitation (e.g., Florentine and Buus 1981). The influence of off-frequency maskers may be more complicated than typically assumed because of nonlinear interactions between the signal and masker (e.g., Rhode et al. 1978); however, a quantitative evaluation of these effects requires a more complex AN model than is considered in the present study (see Heinz 2000; Heinz et al. 2002). Nonetheless, it is informative to evaluate level encoding in single-CF channels, and thus the next section continues with the analytical approach to examine the ability of rate information to account for Weber’s Law based on pooling across AN fibers with similar CFs.
The ability of single-CF counting models to explain Weber’s Law
In this section, level discrimination performance (as characterized by δ′ vs. L) is obtained from a population of AN fibers with a common CF (equal to the stimulus frequency). The results depend upon the postulated combination rule as well as the set of assumptions about the discharge patterns. This section focuses on encoding in terms of discharge rate for illustrative purposes, while contributions of temporal information are evaluated below.
Optimum processing
First consider optimum processing of time-invariant Poisson processes (i.e., optimally weighted Poisson counting variables) with rate-level functions given by Eq. (17) as plotted in Figure 1. Since δ′ for L_thr = 0 has been calculated for this case (as plotted in Fig. 2), overall performance can be calculated by combining across individual AN fibers according to Eq. (15). The distribution of threshold values, n(L), must be specified along with the values for spontaneous discharge rate. To specify the thresholds, the observation that the rate thresholds of fibers at their CFs are (negatively) correlated with the spontaneous rates of discharge (SRs) is incorporated (Liberman 1978). Three distributions of thresholds are chosen, one for each of the SR categories (low SR = 0.5 sp/s, medium SR = 10 sp/s, and high SR = 50 sp/s). The threshold distributions shown in Figure 3 are based on the data of Liberman (1978). With these assumptions, the optimum sensitivity per decibel is given by
where δ′_L, δ′_M, and δ′_H are the sensitivities per decibel for L_thr = 0 described by the functions in Figure 2 for the low, medium, and high SR cases, respectively; n_L(L), n_M(L), and n_H(L) represent the threshold distributions shown in Figure 3; and * represents convolution. The result of this calculation for (δ′_op)² is shown in Figure 4 for a band that is assumed to contain 2200 fibers, corresponding roughly to the number of fibers in a single 1/3-octave band of CFs when frequencies are uniformly distributed on a logarithmic scale (1350 high-SR, 500 medium-SR, and 350 low-SR fibers).
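The structure of this population calculation can be sketched as a discrete convolution. The example below is a rough illustration and is not intended to reproduce Figure 4. It assumes that optimum combination of independent fibers sums the single-fiber (δ′)² values, that each threshold distribution counts the fibers at each threshold, and that the single-fiber (δ′)² follows the Poisson relation T[r′(L)]²/r(L) with a piecewise-linear rate-level function. The slope `SLOPE` and the uniform threshold ranges in `THRESHOLD_RANGE` are hypothetical stand-ins for the physiological data.

```python
T = 0.1        # duration in seconds
L_SAT = 40.0   # dynamic range of each fiber in dB
SLOPE = 5.0    # sp/s per dB; hypothetical value

N_FIBERS = {"low": 350, "medium": 500, "high": 1350}   # counts from the text
SR = {"low": 0.5, "medium": 10.0, "high": 50.0}        # sp/s, from the text
# Hypothetical threshold ranges in dB (higher thresholds for lower SR)
THRESHOLD_RANGE = {"low": (20, 50), "medium": (5, 30), "high": (0, 20)}


def delta_prime_sq_single(level_re_thr, sr):
    """Single-fiber Poisson (delta')^2 for a fiber whose threshold is 0 dB."""
    if not 0.0 <= level_re_thr < L_SAT:
        return 0.0
    return T * SLOPE ** 2 / (sr + SLOPE * level_re_thr)


def threshold_counts(group):
    """Number of fibers at each integer threshold (uniform, hypothetical)."""
    lo, hi = THRESHOLD_RANGE[group]
    thresholds = range(lo, hi + 1)
    per_bin = N_FIBERS[group] / len(thresholds)
    return {thr: per_bin for thr in thresholds}


def delta_prime_op_sq(level_db):
    """Sum over SR groups of n(threshold) convolved with single-fiber (delta')^2."""
    total = 0.0
    for group in N_FIBERS:
        for thr, n in threshold_counts(group).items():
            total += n * delta_prime_sq_single(level_db - thr, SR[group])
    return total


if __name__ == "__main__":
    for level in range(0, 100, 10):
        print(f"L = {level:2d} dB: (delta'_op)^2 = {delta_prime_op_sq(level):7.2f}")
```

Even with these crude stand-in distributions, the summed sensitivity peaks at low levels and falls as more fibers saturate, which is the qualitative pattern discussed for the optimum rule.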
It is apparent in Figure 4 that optimum use of the counts on all fibers in a common CF band does not predict a level dependence corresponding to Weber’s Law or the near miss to Weber’s Law. Rather it predicts a significant decrease in performance as level increases above about 15 dB. However, predictions for reference levels near 15 dB using 2200 fibers are better than observed performance (e.g., δ′_op ≅ 4.5, whereas δ′_observed ≅ 1 since the JND ≅ 1 dB). The inability of single-CF Poisson rate information to account for Weber’s Law is consistent with similar studies that have used more accurate rate-level shapes (i.e., that vary with spontaneous rate and threshold) and discharge-count variance based on AN data from cat (e.g., Delgutte 1987; Viemeister 1988; Winslow and Sachs 1988). In contrast, Winter and Palmer (1991) predicted robust level-discrimination performance over at least 110 dB based on single-CF AN rate-level responses in guinea pig. Robust level encoding at high levels in their model resulted from the contribution of high-threshold, low-SR fibers with nonsaturating (“straight”) rate-level functions. However, “straight” rate-level functions were not observed in the guinea pig data for CFs below 1.5 kHz (Winter and Palmer 1991) and have not been observed in data from cat at any CF (e.g., Sachs and Abbas 1974; Delgutte 1987; Winslow and Sachs 1988). Thus, optimal processing of rate information within a single-CF band does not generally predict Weber’s Law. This conclusion implies that this type of rate-based single-CF model alone cannot describe the action of a single (critical-band) channel in models of the type suggested by Zwicker (1956) and Maiwald (1967a,b,c) since performance [e.g., δ′(L)] is postulated to be independent of L for a single channel stimulated at its CF. However, the wide dynamic range over which enough single-CF rate information is available to account for human performance suggests that combination rules other than the optimal rule should be examined.
Other (nonoptimal) combination rules
Throughout the level range for which predicted performance is superior to observed performance (from less than 0 dB to greater than 70 dB in Fig. 4), there is generally sufficient information available in this single band of fibers to allow performance equal to observed performance if appropriate nonoptimum processing is assumed. This means in essence that many nonoptimum models could describe the observed results in this range. Most of these nonoptimum models may be contrived and ad hoc, but some may be simple and appealing.
In the discussion of combination rules above, total-count and single-fibers-at-a-time rules were considered in addition to the optimum rule. The total count statistic can give performance only equal to or poorer than optimum. Since saturated fibers contribute maximum variance and a negligible change in the mean to the total count, total-count performance will be significantly worse than optimum at high levels. Since this degradation will be relatively less important at lower levels, the total-count statistic will give a description of level discrimination that is even worse (more rapid decrease in performance with level) than the optimum use of Poisson counts. Further, as seen in Figure 2, a single-fiber-at-a-time rule does not provide adequate sensitivity; however, a similar rule applied to groups of fibers (i.e., using a different set of fibers at each level, e.g., Winslow et al. 1987) could be constructed to give Weber’s Law performance over a range of at least 80 dB. Similarly, Delgutte (1987) has shown that a combination rule in which low-SR, high-threshold fibers were processed more efficiently than high-SR, low-threshold fibers could extend the dynamic range over which Weber’s Law was predicted; however, performance still degraded significantly above 80 dB SPL.
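The claim that the total-count statistic can never outperform the optimum rule follows from the Cauchy–Schwarz inequality. As a sketch, assume N independent Poisson fibers with rates r_i(L), and assume that the optimum rule sums the single-fiber (δ′)² values (the subscript "tot" is notation introduced here):

```latex
\[
(\delta'_{\mathrm{op}})^{2} \;=\; T \sum_{i=1}^{N} \frac{[r_i'(L)]^{2}}{r_i(L)},
\qquad
(\delta'_{\mathrm{tot}})^{2} \;=\; \frac{T \bigl[\sum_{i} r_i'(L)\bigr]^{2}}{\sum_{i} r_i(L)},
\]
\[
\Bigl[\sum_{i} r_i'\Bigr]^{2}
 \;=\; \Bigl[\sum_{i} \frac{r_i'}{\sqrt{r_i}}\,\sqrt{r_i}\Bigr]^{2}
 \;\le\; \Bigl(\sum_{i} \frac{(r_i')^{2}}{r_i}\Bigr)\Bigl(\sum_{i} r_i\Bigr)
\;\;\Longrightarrow\;\;
(\delta'_{\mathrm{tot}})^{2} \;\le\; (\delta'_{\mathrm{op}})^{2}.
\]
```

A saturated fiber has r_i′ = 0, so it adds nothing to the numerator of the total-count expression while adding its full rate to the denominator. This is why the total count degrades faster than the optimum rule as more fibers saturate.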
The considerations for cases in which only fibers within a single CF band are available can be summarized as follows: Performance based on rate information would ultimately degrade at high levels, and therefore the full range of CFs must be included to understand level discrimination of tones at the highest levels. When all fibers within a given CF band are included, and when all uncertainties are considered, it is not possible to exclude the possibility of performance consistent with Weber’s Law over a wide range of levels using only the rate information in a single-CF band. However, a parsimonious and general model for predicting robust level encoding based on the processing of average-rate information does not exist at this time. Thus, it is of interest to extend the analytical approach used in the present study to the quantification of other sources of level information contained in single-CF AN responses, specifically temporal information.
RESULTS: LEVEL DISCRIMINATION BASED ON TEMPORAL INFORMATION
The ability of synchrony information to explain Weber’s Law
The time-varying Poisson single-channel case [Eq. (10)] is considered here, assuming that the time-varying rate-level function is given by Eq. (9) with Θ independent of level (i.e., level-dependent synchrony is included, but not level-dependent phase). Since the characteristics of the first term in Eq. (10) (i.e., rate information) have been described above, attention is focused on the second, synchrony term.
To evaluate the effect of the second term in Eq. (10), specific assumptions about the function g(L) are made. The maximum value of g(L) depends on frequency; in cat, the largest values are about 5 and occur for low frequencies (as do the largest slopes of g vs. L) (Johnson 1980). The maximum value of g(L) decreases steadily above about 1–3 kHz (Johnson 1980; Weiss and Rose 1988; Koppl 1997). As a convenient approximation to available data (Evans 1980; Johnson 1980), it is assumed in the following that g(L) increases linearly over a range of 20 dB as shown in Figure 5A for a low-frequency fiber. Also, since the discharge patterns on AN fibers often show phase-locking to the stimulus at levels below the level at which the average rate of discharges starts to increase (Johnson 1980), a hypothetical fiber is considered for which g(L) increases to its maximum value before the rate increases above the spontaneous rate. [In actuality, the dynamic range for synchronization partly overlaps that of average rate, but the conclusions drawn here are not affected by this simplification.] For easy comparison to the average-rate-alone results in Figure 2, the duration is again taken to be T = 0.1 s. For dg(L)/dL = 1/4, (δ′)² reduces to (5/8) SR d²[ln I₀(g)]/dg², where SR is the spontaneous rate. This function is plotted in Figure 5B for two values of SR (SR = 50 sp/s and SR = 10 sp/s). Note that in contrast to rate information, which decreases as SR increases, synchrony information increases with SR because of the increased number of discharges that encode temporal information.
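The synchrony expression quoted above can be evaluated directly. The sketch below takes the factor (5/8) SR from the text and uses the identity d²[ln I₀(g)]/dg² = 1 − I₁(g)/(g I₀(g)) − [I₁(g)/I₀(g)]², which follows from I₀′ = I₁ and I₁′ = I₀ − I₁/g. The Bessel functions are computed from their power series so that only the standard library is needed. The mapping `g_of_level`, which starts g at zero at the synchrony threshold, is an assumption, and all function names are illustrative.

```python
import math


def bessel_i(order, x, terms=40):
    """Modified Bessel function of the first kind, I_order(x), by power series."""
    return sum(
        (x / 2.0) ** (2 * k + order) / (math.factorial(k) * math.factorial(k + order))
        for k in range(terms)
    )


def d2_ln_i0(g):
    """Second derivative of ln I0(g): 1 - I1/(g*I0) - (I1/I0)**2."""
    if g == 0.0:
        return 0.5  # limiting value, since I1(g)/g -> 1/2 as g -> 0
    ratio = bessel_i(1, g) / bessel_i(0, g)
    return 1.0 - ratio / g - ratio ** 2


def g_of_level(level_db):
    """Assumed g(L): linear with slope 1/4 over 20 dB, from 0 up to 5."""
    return min(max(level_db, 0.0), 20.0) / 4.0


def sync_delta_prime_sq(level_db, sr):
    """Synchrony term from the text, (5/8)*SR*d2[ln I0(g)]/dg2, where g is changing."""
    if not 0.0 <= level_db < 20.0:
        return 0.0  # g is constant here, so it carries no level information
    return (5.0 / 8.0) * sr * d2_ln_i0(g_of_level(level_db))


if __name__ == "__main__":
    for sr in (10.0, 50.0):
        row = [round(sync_delta_prime_sq(level, sr), 3) for level in (0, 5, 10, 15)]
        print(f"SR = {sr:4.1f} sp/s: synchrony (delta')^2 at 0, 5, 10, 15 dB = {row}")
```

In this sketch the synchrony term is largest where g is small and scales linearly with SR, consistent with the observation that synchrony information increases with SR.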
This example shows that synchrony can provide much information for level discrimination at low frequencies. Since the synchrony threshold is clearly below the rate threshold, this source of information could extend the range of levels over which a single fiber could provide robust performance. If synchrony information is included, the (δ′)² for synchrony in Figure 5B is essentially added to each fiber’s (δ′_m)² from the rate-alone analysis in accordance with Eq. (10) above. For the single-CF population model considered above (see Fig. 4), this information could add 10–15 dB to the range of levels over which (δ′_op)² is above observed performance, but it does not change the fact that predicted performance deteriorates rapidly at high levels.
The ability of nonlinear-phase information to explain Weber’s Law
An additional source of information in the phase-locked discharges of low-frequency AN fibers is the nonlinear phase (Anderson et al. 1971), which introduces the third term on the right side of Eq. (10). As mentioned above, the usefulness of this cue is dependent upon either the availability of an absolute phase reference, which is unlikely, or the use of relative times of the discharges of fibers with different CFs (Carney 1994). The Poisson model with nonlinear phase cues can be studied using the expression for the time-varying rate given in Eq. (9), which includes level-dependent rate and synchrony in addition to level-dependent phase. The average-rate-level function used in this section is described in Eq. (17) (Fig. 1) and the level-dependent synchrony was described in the last section (Fig. 5A).
The level-dependent phase is described by a simple function that captures the key features described by Anderson et al. (1971) for AN responses, Ruggero et al. (1997) for basilar membrane responses, and Cheatham and Dallos (1998) for inner hair cells. The phase of a fiber’s response to tones has increasing lag as a function of level in response to stimulus frequencies below CF, has no change with level at CF, and has decreasing lag with level in response to frequencies above CF. Figure 6 shows the dependence of phase on frequency for a single model fiber’s responses at several levels; the plotted phases are referenced to phase at 90 dB SPL [using Anderson et al.’s (1971) convention]. The model phase varies linearly between 30 and 90 dB SPL. This is a conservative range of levels over which the nonlinear-phase cue might convey information for level discrimination; Ruggero et al. (1997) showed that in the most sensitive experimental preparations, the compressive nonlinearity has a threshold of about 20 dB SPL and extends to levels of 100 dB or higher. The maximum difference in phase between the nonlinear-phase threshold (30 dB SPL) and 90 dB SPL is specified as π/2, and that maximum is reached at frequencies 1/2-octave above and below CF.
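The phase model described in this paragraph can be written as a small function. The sketch below assumes that the maximum phase difference grows linearly (in octaves) from zero at CF to π/2 at one half-octave from CF and is held constant beyond that; the text does not say what happens outside the half-octave range, so that choice is hypothetical. The sign convention (positive values denote a phase lead relative to 90 dB SPL) and the function names are also assumptions.

```python
import math

PHASE_THR_DB = 30.0      # level at which the nonlinear phase begins (dB SPL)
PHASE_REF_DB = 90.0      # reference level for the plotted phase (dB SPL)
MAX_SHIFT = math.pi / 2  # maximum phase difference between 30 and 90 dB SPL


def max_shift_at(freq_hz, cf_hz):
    """Largest phase difference (rad) between 30 and 90 dB SPL at this frequency."""
    octaves = abs(math.log2(freq_hz / cf_hz))
    # Linear growth up to 1/2 octave from CF; held constant beyond (assumption).
    return MAX_SHIFT * min(octaves / 0.5, 1.0)


def phase_re_90(level_db, freq_hz, cf_hz):
    """Phase (rad) relative to the phase at 90 dB SPL; positive means a lead."""
    level = min(max(level_db, PHASE_THR_DB), PHASE_REF_DB)
    frac = (PHASE_REF_DB - level) / (PHASE_REF_DB - PHASE_THR_DB)
    # Below CF the lag grows with level, so lower levels lead the 90-dB phase;
    # above CF the lag shrinks with level, so lower levels lag it.
    direction = 1.0 if freq_hz < cf_hz else -1.0
    return direction * frac * max_shift_at(freq_hz, cf_hz)


if __name__ == "__main__":
    cf = 1200.0
    for level in (30, 50, 70, 90):
        print(f"{level} dB SPL, 1000-Hz tone: {phase_re_90(level, 1000.0, cf):+.3f} rad")
```

For a 1000-Hz tone and a 1200-Hz CF (about a quarter-octave below CF), the sketch gives roughly half of the maximum phase change across the 30 to 90 dB SPL range.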
This AN model has a highly simplified representation of the nonlinear phase, which facilitates the calculations here. A more accurate representation would vary the amount and frequency range of the level-dependent phase as a function of CF to incorporate the change in the strength of the active process as a function of CF (see Heinz 2000). Nevertheless, the form chosen here yields phase-level curves that are comparable to those of Anderson et al. (1971) for low CFs. As in the treatments of level-dependent rate and synchrony, the details of the level-dependent phase are not important to the goal of illustrating a method for quantifying the information in this neural cue.
When quantifying the information for level discrimination that is available in responses that contain all three level-dependent response properties, the three terms in Eq. (10) can be plotted separately to illustrate the relative contributions of each cue. The upper panels of Figure 7 show rate r(L), synchrony g(L), and phase Θ(L) versus level for a high-SR, 1200-Hz CF model fiber in response to a 1000-Hz tone. Siebert’s (1968) tuning curve function,
was used to compute the threshold for this off-CF tone. For illustration, the frequency of the tone was chosen to be approximately a quarter-octave below CF, resulting in a half-maximal phase cue (see Fig. 6). Recall that the nonlinear-phase cue for tones exists only for fibers responding to frequencies above or below CF. The lower panels of Figure 7 show (δ′)² for each of the three terms in Eq. (10). The rate-level and sync-level functions are shifted approximately 15 dB to the right compared with Figures 1 and 5 because the fiber is responding to a tone at a frequency away from CF. As before, the changes in rate with SPL contribute information over a limited level range between rate threshold and L_sat. The synchrony contributes a relatively large amount of information, but only at very low levels. In contrast, the nonlinear phase contributes values of (δ′)² comparable to those of the rate term, which are maintained at mid-to-high SPLs. The nonlinear-phase cue increases from 30 to 55 dB SPL because the rate-level function has still not saturated at these levels. Above 55 dB SPL, where rate is saturated, the phase cue remains constant until 90 dB SPL, where the phase becomes level-independent in the model and no information about level change is provided.
Relative amounts and CF distributions of rate, synchrony, and nonlinear-phase information
The definition of the time-varying rate function in Eq. (9) resulted in the ability to “parse” the level information into the three terms in Eq. (10). The overall information for level discrimination contributed by the three cues can be examined by simply summing the three terms of (δ′)2 (Fig. 8), which illustrates the differing importance of the rate and temporal forms of information over different ranges of sound levels for a single fiber. Of course, the distribution of information provided by some of these cues also varies with CF. The CFs that convey information in the form of rate and synchrony vary with level because of spread of excitation, saturation, and the change in amount of compression as a function of CF.
Figure 9 illustrates (δ′)² vs. CF for the three terms in Eq. (10) and their sum at three sound levels of a 1000-Hz tone. The level that excites each model fiber is determined by the simple triangular tuning-curve filter described in Eq. (25). The effects of saturation for fibers with CF near the stimulus frequency and the spread of excitation with increasing level are clear in the “rate” and “synchrony” terms. The “phase” term illustrates that, at moderate-to-high levels, the fibers tuned near the tone frequency have information for level discrimination. The sum of the three terms illustrates that the CF range near the tone frequency provides information at all three SPLs, due to synchrony and rate at low sound levels and to phase at moderate-to-high sound levels. Thus, at low CFs, where the average-rate dynamic ranges of both low- and high-SR fibers are limited, the nonlinear-phase cues may be especially important for conveying information related to changes in level.
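The saturation and spread-of-excitation effects in the rate term can be reproduced qualitatively with a minimal Python sketch. It assumes Poisson discharge counts, a piecewise-linear rate-level function, a common fiber threshold of 0 dB SPL at CF, and a triangular tuning curve whose slopes (30 and 100 dB/octave) are hypothetical values chosen for illustration; they are not the parameters of Eq. (25). The function names and the three tone levels are likewise illustrative.

```python
import math

def attenuation_db(cf_hz, tone_hz, below_cf_slope=30.0, above_cf_slope=100.0):
    """Hypothetical triangular tuning curve: attenuation (dB) of a tone at a fiber
    with the given CF, linear in octaves, shallower for tones below CF."""
    octaves = math.log2(tone_hz / cf_hz)  # > 0 when the tone lies above CF
    return above_cf_slope * octaves if octaves > 0 else -below_cf_slope * octaves

def rate_info(tone_db, cf_hz, tone_hz=1000.0, sr=50.0, slope=5.0,
              l_sat=40.0, duration=0.1):
    """Poisson rate term of (delta')^2 per dB^2 for one fiber.
    Assumes every fiber has a threshold of 0 dB SPL at its CF."""
    eff = tone_db - attenuation_db(cf_hz, tone_hz)  # effective level re threshold
    if eff <= 0.0 or eff >= l_sat:
        return 0.0  # below threshold or saturated: no rate information
    return duration * slope ** 2 / (sr + slope * eff)

def informative_cfs(tone_db, cfs):
    """CFs whose fibers carry nonzero rate information at this tone level."""
    return [cf for cf in cfs if rate_info(tone_db, cf) > 0.0]

if __name__ == "__main__":
    # 250 Hz to 8 kHz in quarter-octave steps
    cfs = [250.0 * 2 ** (k / 4) for k in range(21)]
    for tone_db in (20, 50, 80):
        print(tone_db, [round(cf) for cf in informative_cfs(tone_db, cfs)])
```

Under these assumptions, fibers tuned to the tone carry rate information only at the lowest level; at higher levels they saturate and the informative CFs move to the edges of the excitation pattern, mainly toward CFs above the tone frequency.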
GENERAL DISCUSSION
This study explored several issues related to the encoding of level in AN discharge patterns. Analytical models of AN tone responses and signal detection theory were used to quantify optimal performance limits based on the stochastic responses of the AN. Simple analytical AN models provided insight into the relative importance of different sources (rate and temporal) of neural information for level encoding. Specifically, simple equations were derived for the relative contributions of average-rate, synchrony, and phase cues. The inclusion of temporal information in analytical AN models extends previous modeling studies of level encoding, which have been primarily limited to average-rate information (e.g., Siebert 1965, 1968; Delgutte 1987; Winslow and Sachs 1988; Viemeister 1988; Winter and Palmer 1991).
The ability of individual AN fibers to robustly encode level changes based on average rate depends on the shape of the rate-level function and on the nature of the discharge randomness. It was shown that the rate information provided by individual AN fibers is maximal at stimulus levels within 5–10 dB above fiber threshold and that information begins to degrade at levels well below those for which rate saturation limits performance. This degradation is primarily due to the variance of AN discharge counts increasing significantly with increases in rate, while AN rate-level curves do not increase faster than linearly (versus decibels) over wide level ranges. Thus, individual AN fibers are even more limited in their ability to robustly encode changes in stimulus level based on rate than saturation would suggest.
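This degradation can be illustrated with a minimal Python sketch. It assumes Poisson discharge counts (mean and variance both equal to T·r(L)) and a piecewise-linear rate-level function, in which case the sensitivity per decibel satisfies (δ′)² = T·[r′(L)]²/r(L). The function names are hypothetical, the JND is taken as the level change giving a sensitivity of one (1/δ′), and the defaults (10 sp/s/dB, T = 0.3 s, L_sat = 40 dB) reuse the illustrative values quoted earlier rather than the exact parameters behind the figures.

```python
import math

def rate_level(level_db, sr, slope=10.0, l_sat=40.0):
    """Assumed piecewise-linear rate-level function (sp/s); level in dB re threshold."""
    return sr + slope * min(max(level_db, 0.0), l_sat)

def dprime_sq(level_db, sr, duration=0.3, slope=10.0, l_sat=40.0):
    """(delta')^2 per dB^2 for a Poisson count: T * r'(L)^2 / r(L)."""
    if level_db <= 0.0 or level_db >= l_sat:
        return 0.0  # flat segments: rate carries no level information
    return duration * slope ** 2 / rate_level(level_db, sr, slope, l_sat)

def jnd_db(level_db, sr, **kwargs):
    """Level change (dB) giving a sensitivity of 1; infinite where the rate is flat."""
    info = dprime_sq(level_db, sr, **kwargs)
    return math.inf if info == 0.0 else 1.0 / math.sqrt(info)

if __name__ == "__main__":
    for sr in (0.5, 10.0, 50.0):
        row = ", ".join(f"{jnd_db(level, sr):.2f}" for level in (1, 10, 20, 39))
        print(f"SR = {sr:4.1f} sp/s: JND (dB) at 1, 10, 20, 39 dB = {row}")
```

Because the slope is constant while the count variance grows with rate, (δ′)² in this sketch falls as 1/r(L) throughout the sloping segment, so the JND rises steadily well before L_sat is reached.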
Since there is considerably more than enough information in the AN population response to allow observed performance in level discrimination in quiet over a wide range of levels, the interesting question becomes how to understand the parametric dependencies and the effects of off-frequency maskers. It is typically assumed that good performance in the presence of off-frequency maskers implies that Weber’s Law must be produced by AN fibers within a narrow CF band. However, consistent with previous studies, it was shown here that optimal processing of average-rate information does not account for Weber’s Law based on fibers with a limited CF range because performance degrades significantly above about 40 dB SPL.
While the predicted trends in optimal performance were inconsistent with behavioral performance, it is not possible to rule out rate-based models because there is enough total rate information to account for robust level-discrimination performance over a wide range of levels (for general discussions of the use of optimal performance limits to evaluate neural encoding, see Siebert 1968, 1970; Colburn 1973; Delgutte 1996; Heinz et al. 2001a). Optimal performance limits superior to behavioral performance suggest the need for a suboptimal combination scheme (as discussed below). However, the strong degradation in rate information as level increases above medium levels suggests that parsimonious suboptimal combination schemes based on rate information may not exist and that other sources of neural information may be needed to account for robust level encoding in the AN.
The analytical AN model used in the present study allowed for the quantitative comparison of the relative contributions of rate and of temporal information. The level dependence of synchrony provides information that extends the dynamic range for robust level encoding at low frequencies, but only at low levels. Thus, synchrony information per se does not help account for robust level encoding at high levels based on fibers within a narrow range of CFs. In contrast, it was shown that nonlinear-phase cues provide robust level information within a narrow CF range over a wide range of levels, including high levels.
The third term of Eq. (10) illustrates the dependence of nonlinear-phase information on basic AN response properties. It was shown that phase information depends not only on the rate of change in phase with level, but also on average discharge rate and strength of synchrony. This makes sense intuitively, as changes in phase are easier to decode when many spikes are observed and when these spikes are strongly phase locked to the stimulus. This dependence implies that nonlinear-phase information at low frequencies is robust up to high levels in all fibers because average rate and synchrony are essentially constant at levels more than ~30 dB above fiber threshold, and the rate of change of the phase is essentially constant with level (Anderson et al. 1971; Ruggero et al. 1997).
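The dependence of phase information on discharge rate, synchrony, and phase slope can be checked numerically. The Python sketch below assumes an inhomogeneous Poisson process, for which the level information per dB² is the time integral of (∂r/∂L)²/r, and a von Mises-shaped period histogram in which only the phase varies with level. The rate waveform, the function name, and all parameter values are illustrative assumptions; they are not taken from Eq. (9) or Eq. (10).

```python
import math

def phase_info(duration, mean_rate, g, phase_slope,
               level_db=50.0, n=2000, d_level=1e-3):
    """Level information per dB^2 from a level-dependent phase shift alone.

    duration    : stimulus duration T in s (assumed to span whole cycles)
    mean_rate   : average discharge rate in sp/s (held constant with level)
    g           : synchrony-strength parameter of the assumed von Mises shape
    phase_slope : rate of change of response phase with level, rad/dB
    """
    angles = [2.0 * math.pi * k / n for k in range(n)]
    # Normalisation so the rate averaged over one cycle equals mean_rate
    # (this is a numerical estimate of the Bessel function I0(g)).
    norm = sum(math.exp(g * math.cos(x)) for x in angles) / n

    def rate(x, level):
        return mean_rate * math.exp(g * math.cos(x - phase_slope * level)) / norm

    total = 0.0
    for x in angles:
        # Central difference for dr/dL at this stimulus phase.
        dr = (rate(x, level_db + d_level) - rate(x, level_db - d_level)) / (2.0 * d_level)
        total += dr * dr / rate(x, level_db)
    # Mean over one cycle times T gives the integral over the whole stimulus.
    return duration * total / n

if __name__ == "__main__":
    base = phase_info(0.1, 200.0, 2.0, 0.02)
    print("baseline:", base)
    print("rate x2 :", phase_info(0.1, 400.0, 2.0, 0.02) / base)
    print("slope x2:", phase_info(0.1, 200.0, 2.0, 0.04) / base)
```

For this assumed waveform the integral has the closed form T·r̄·(θ′)²·g·I₁(g)/I₀(g), where I₁(g)/I₀(g) is the synchrony index of a von Mises distribution. Doubling the average rate therefore doubles the phase information, doubling the phase slope quadruples it, and stronger synchrony increases it.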
The relation between nonlinear-phase responses and nonlinear tuning implies that nonlinear-phase cues exist over the entire range of levels for which the cochlear amplifier produces compressive BM responses (i.e., at least up to ~90 dB SPL; Ruggero et al. 1997). The predicted optimal performance limits do not depend on (or suggest) a specific mechanism for decoding the nonlinear-phase cues. However, these phase cues can be decoded by any mechanism that compares the relative phase response across fibers with different CFs (discussed further below) because the level dependence of phase differs across frequency relative to CF (Anderson et al. 1971; also see Fig. 6). Thus, nonlinear-phase responses appear to provide a realistic source of robust level information near CF and may provide an alternative explanation at low frequencies to the level-dependent processing schemes that are necessary to account for Weber’s Law based on average rate.
The present study provides constraints for two possible explanations, one based on average rate and one based on nonlinear phase, for robust level encoding at high stimulus levels based on AN fibers within a narrow range of CFs. As discussed in the following paragraphs, a specific neural mechanism has been proposed for each explanation of how AN information could be decoded in the cochlear nucleus to produce robust level encoding. Further support for or against each explanation can be garnered by considering whether there are cell types in the cochlear nucleus that could perform the proposed neural processing.
Winslow et al. (1987) have proposed a “selective listening” mechanism in which average-rate information from high-SR, low-threshold fibers is used at low levels, while that from low-SR, high-threshold fibers is used at high levels. Lai et al. (1994) have demonstrated that such a selective-listening strategy can be performed by a simple model of a cochlear nucleus stellate neuron based on shunting inhibition. However, the required anatomical innervation patterns of the different SR fibers to stellate neurons and quantitative psychophysical predictions have not been demonstrated for this mechanism. Furthermore, it is not clear that a model that relies solely on low-SR fibers at high levels would produce Weber’s Law because the information provided by low-SR fibers also begins to degrade within 10 dB above their threshold (see Fig. 2).
Carney (1994) suggested that a monaural, across-frequency coincidence detection mechanism could be used to decode the level information provided by nonlinear-phase cues. There is physiological evidence that some cell types in the cochlear nucleus (e.g., globular bushy cells) have responses that are consistent with a coincidence-detection mechanism (e.g., Carney 1990; Joris et al. 1994a,b). Heinz et al. (2001b) have quantitatively evaluated the ability of a simple across-frequency coincidence-counting mechanism to account for robust level encoding based on the information in AN responses. They showed that a near-CF population of coincidence counters could reliably decode the robust nonlinear-phase cues provided at low frequencies. In addition, the coincidence-detector population also produced Weber’s Law at high frequencies based on the more robust average-rate cues associated with stronger compression at high frequencies. Carney et al. (2002) have also demonstrated the ability of a monaural, across-frequency, coincidence-detection mechanism to account for detection of tones in noise. Future physiological studies are needed to test specific single-unit-response predictions for the coincidence-detection mechanism, as well as the selective-listening mechanism, in order to provide further support for the types of AN information that are important for robust level encoding.
As noted above, available data indicate that AN phase locking to the cycles of a tone decreases at frequencies higher than approximately 1–3 kHz (Johnson 1980; Weiss and Rose 1988; Köppl 1997); however, this rolloff in synchrony was not included in the simple analytical model. If it is assumed that synchrony information in the human AN is similarly reduced at high frequencies, then the information conveyed by synchrony and level-dependent phase cues for encoding the level of a tone is significantly reduced at high frequencies. At high frequencies, the contributions of rate cues from high-threshold, low-SR fibers with wider dynamic ranges are potentially more important (Winter and Palmer 1991; Heinz 2000; Heinz et al. 2001b). The low-SR fibers depend upon large amounts of compression for their wide dynamic ranges, and the amount of compression increases as a function of CF (e.g., Cooper and Yates 1994). These facts are consistent with the observation that, in guinea pig, fibers with non-saturating (“straight”) rate-level functions are not observed at CFs below about 1500 Hz (Winter and Palmer 1991).
If robust level encoding were dependent on phase cues at low frequencies and on rate cues at high frequencies, then a variation in level-discrimination performance across frequency could be expected. However, Heinz et al. (2001b) have shown that linear spread of excitation plays a strong role for level discrimination in quiet, which suggests that this frequency effect would be subtle. In fact, a subtle frequency dependency has been observed in level-discrimination performance (Jesteadt et al. 1977; Florentine et al. 1987). While the near miss to Weber’s Law occurs for low frequencies, a small but significant nonmonotonicity in performance as a function of level occurs at high frequencies. This “midlevel bump,” which begins to appear between 1 and 4 kHz, can be accounted for by the strong BM compression at high frequencies that starts around 30 dB SPL (Heinz et al. 2001b). Finally, it should also be mentioned that the present analysis does not address the time variation in the rate that occurs after the onset of a stimulus (Smith and Brachman 1979); the level dependence of this adaptation (i.e., a wider dynamic range at onset) could also provide level information (cf. Evans 1980) and is not limited to low frequencies.
The significance of potential variations across species is another issue that requires future work. For example, “straight” rate-level curves have been observed in guinea pig AN responses for high CFs (Winter et al. 1990) but not for low CFs (Winter and Palmer 1991), whereas “straight” rate-level functions have not been observed for any CFs in cat (e.g., Sachs and Abbas 1974; Delgutte 1987; Winslow and Sachs 1988). This result suggests that the strength and frequency dependence of compression may differ for cats and guinea pigs. Heinz et al. (2001b) have demonstrated that the strength of compression has a large effect on the ability of near-CF rate information to account for Weber’s Law. Thus, an important remaining issue is the strength of compression in humans relative to species for which physiological BM and AN data are available. Psychophysical methods have recently been developed that estimate BM compression based on forward-masking studies (e.g., Oxenham and Plack 1997; Nelson et al. 2001). These methods have been shown to produce estimates of human compression that are consistent with the amount of BM compression that has been measured at high frequencies. However, these methods rely on assumptions for which the physiological evidence at low frequencies is not definitive, e.g., that below-CF responses are linear. These methods show promise for estimating the strength of cochlear nonlinearity in humans, but their ability to accurately estimate compression strength as a function of frequency remains to be shown.
In summary, it is likely that level discrimination is mediated by a multiplicity of attributes of the physiological data and that the relative usefulness of these attributes is dependent upon the stimulus circumstances, such as masked or unmasked, wideband or narrowband, short or long duration, and fast or slow stimulus onsets and offsets. The present study provides a quantitative framework to analyze and compare different types of information available in AN responses for encoding level. Future studies with more complex AN models can extend the results in the present study by using this quantitative approach.