Introduction

Mammalian hearing organs are richly supplied with olivocochlear efferent innervation that allows the brain to control cochlear function. One group of efferents, the medial olivocochlear (MOC) efferents, innervates outer hair cells (OHCs) and controls the gain of mechanical amplification within the cochlea (Cooper and Guinan 2006). Cochlear amplification is produced by OHC receptor currents causing audio-frequency changes in the length of OHCs that amplify cochlear mechanical responses to sound (Dallos et al. 2008). MOC fibers synapse on OHCs and produce an OHC hyperpolarization that reduces the effect of OHC receptor currents and decreases cochlear amplification (Fuchs 2002).

MOC fibers respond to sound and form a cochlear negative feedback system. In cats and guinea pigs, ∼2/3 of MOC fibers respond to ipsilateral sound (re the cochlea innervated) and ∼1/3 respond to contralateral sound (Robertson 1984; Liberman and Brown 1986; Brown 1989). Tuning curves (TCs) from MOC fibers have narrow V-shaped tips that are only slightly wider than TCs from cochlear afferents with similar best frequencies (BFs). Individual MOC fibers innervate a cochlear place that has a BF similar to the MOC fiber’s BF (Robertson 1984; Liberman and Brown 1986; Brown 1989). Thus, the MOC reflex appears to provide frequency-specific feedback to a narrow region around the frequency of the sound that elicited the reflex (Winslow and Sachs 1987).

Studies of the MOC reflex in humans indicate that the largest MOC effects seen at one cochlear frequency region are produced by broadband stimuli, at least for MOC activation by contralateral sounds (Norman and Thornton 1993; Micheyl et al. 1999; Velenovsky and Glattke 2002; Lisowska et al. 2002). Maison et al. (2000) found that reflex amplitude grew as constant sound pressure level (SPL) contralateral noise bands were widened up to 2 octaves, the highest bandwidth tested. It is unknown whether the MOC reflex continues to grow at larger bandwidths, or grows with increased noise bandwidth for ipsilateral or bilateral elicitors.

At everyday sound levels, MOC efferents are thought to provide antimasking that increases the discriminability of transient signals in ongoing noise. MOC antimasking is produced by reducing cochlear amplification which reduces the noise response and reduces adaptation at the inner hair cell–auditory nerve synapse. The resulting increased neural output range provides increased discriminability of a new sound (Winslow and Sachs 1987, 1988; Kawase et al. 1993). Moderate-level masking noise is most effective at frequencies close to the sound being masked, so for antimasking, frequency-specific MOC excitation might be expected.

To better understand summation across frequency in MOC reflexes and the role of this in antimasking, we measured MOC effects as a function of noise bandwidth up to 6.7 octaves with noise bands centered at—and 2 octaves below—the probe tone. We used contralateral, ipsilateral, and bilateral elicitors and probed three frequency regions—0.5, 1, and 4 kHz. In a separate experiment, we tested whether individual half-octave bands summate linearly. The results show that MOC reflexes integrate information from almost the whole cochlea with relatively small differences in the patterns from contralateral, ipsilateral, and bilateral elicitors.

Methods

Methods overview

As a monitor of the mechanical response of the cochlea and MOC effects on this response, we used stimulus frequency otoacoustic emissions (SFOAEs). Otoacoustic emissions (OAEs) are low-level sounds produced in healthy cochleae that can be measured noninvasively in humans with a sensitive microphone in the ear canal. SFOAEs were used because (1) they are the most frequency specific of any OAE and (2) SFOAEs can be evoked by a single low-level (40 dB SPL) probe tone which, by itself, elicits little MOC activity (Guinan et al. 2003). MOC effects were quantified by the change produced in SFOAEs, ΔSFOAE.

Experiment 1: Elicitor noise bandwidth series

In the first experiment, MOC effects were measured as a function of elicitor noise bandwidth with the noise bands centered (on a logarithmic scale) at the probe frequency. The noise bandwidths were 0.5, 1, 2, and 4 octaves for the 0.5- and 1-kHz probes and 0.5, 1, and 2 octaves for the 4-kHz probe. In addition, broadband noise (BBN; 0.1 to 10 kHz) was used for all probe frequencies. BBN is centered on the probe frequency for the 1-kHz probe but not for the 0.5- or 4-kHz probes. All elicitors were 60 dB SPL. The bandwidth series and the BBN data were obtained in separate runs and the criterion (detailed below) that the maximum point must have a signal-to-noise ratio (SNR) >3 was applied to each separately. Note that as bandwidth was increased, the spectral density of the elicitor noise decreased because the SPL was kept constant. Thus, if the MOC response is determined primarily by integration of afferent activity over frequency regions close to the probe frequency, the MOC effect would be expected to decrease as the bandwidth increased beyond this integration region.
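
The trade-off between bandwidth and spectral density described above can be sketched numerically. The following is a minimal illustration, not part of the original analysis: the function names are hypothetical, and it assumes the noise is spectrally flat on a linear frequency axis (as stated for the calibrated noise bursts), so that the spectrum level is the overall level minus 10·log10 of the bandwidth in Hz.

```python
import math

def band_edges(center_hz, bandwidth_oct):
    """Lower and upper edges of a noise band centered on a logarithmic frequency axis."""
    half = bandwidth_oct / 2.0
    return center_hz * 2.0 ** (-half), center_hz * 2.0 ** half

def spectrum_level_db(total_spl_db, f_lo_hz, f_hi_hz):
    """Level per 1-Hz band of flat noise whose overall level is total_spl_db (assumed flat in Hz)."""
    return total_spl_db - 10.0 * math.log10(f_hi_hz - f_lo_hz)

# Bandwidth series for a 1-kHz probe with the overall level held at 60 dB SPL.
for bw in (0.5, 1, 2, 4):
    lo, hi = band_edges(1000.0, bw)
    print(bw, round(lo), round(hi), round(spectrum_level_db(60.0, lo, hi), 1))
```

Each widening of the band lowers the printed spectrum level, which is why an MOC effect that grows with bandwidth implies contributions from the added frequency regions.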

Experiment 2: Off-centered bandwidth series

In the second experiment, MOC effects were measured as a function of elicitor noise bandwidth with the noise bands centered (on a logarithmic scale) 2 octaves below the probe frequency. This was done to determine if the increase in MOC effect with increasing noise bandwidth depended on the noise band being centered on the probe frequency. In this experiment, 60-dB SPL noise with bandwidths of 0.5, 1, 1.5, and 2 octaves were used, and data were obtained only for 1 and 4 kHz probes. One advantage of these off-centered noise-band elicitors is that they did not have frequency components near the probe frequency so that ipsilateral and bilateral elicitors did not produce significant two-tone suppression of the probe-tone SFOAE (see Supplementary Material, section 3). Thus, we were able to measure the MOC effects from all elicitor lateralities in a window during the elicitor.

Experiment 3: Calculating MOC effects by combining effects from half-octave noise elicitors

In the third experiment, done only for 1 kHz probes, the composite MOC effects produced by adding the effects of adjacent half-octave noise bands were compared to the measured effects from increasing noise bandwidth. This task was complicated by the fact that for the measured effects, the noise spectral level decreased as the noise bandwidth increased (to keep the overall level at 60 dB SPL). Thus, we needed to know the MOC effect produced by each half-octave band, not at 60 dB SPL, but for the half-octave band presented at a (lower) spectral level that would produce a 60-dB SPL elicitor when all of the half-octave noise bands in the composite noise band were added together.

For this experiment, three sets of measurements were used. The first set was the “real bandwidth series”, i.e., (set 1) measurements of the MOC effect as a function of elicitor bandwidth for 60-dB SPL elicitors. The next two sets were used to calculate a “composite bandwidth” series to be compared with the “real bandwidth series”. The second set was (set 2) measurements of the MOC effects from 60-dB SPL half-octave noise elicitors, with center frequencies in half-octave steps ranging over 4 octaves around the probe frequency (see Supplementary Fig. S1). For set 2, the MOC effects (i.e., ΔSFOAEs) produced by half-octave noise bands were measured in 12 ears (six subjects). Set 1 data were then taken from the real bandwidth series of the corresponding 12 ears (i.e., a subset of the data in Fig. 1). The data of set 2 provided the shape of the MOC effect versus frequency on the 12 ears, but for 60-dB SPL elicitors. This level was equal to, or greater than, the spectral levels used in the “real bandwidth series”. To be able to scale the set 2 data to the spectral levels used in the “real bandwidth series”, we obtained set 3: measurements of the growth of MOC effects with elicitor level for each half-octave band of set 2. To obtain MOC-effect growth functions, the MOC effects in each half-octave band were measured for 30-, 45-, and 60-dB SPL elicitors in three ears (see Supplementary Fig. S2 A, C, E). Obtaining the growth-function data required many hours of subject measurement time so growth functions were measured only on three ears (three subjects). For the growth function of each half-octave band, we used the average across the three ears of the magnitudes of the MOC effects (Supplementary Fig. S2 B, D, F).

FIG. 1

MOC effect as a function of elicitor bandwidth for noise bands centered at the probe frequency. ΔSFOAEn magnitudes from bilateral (circles), ipsilateral (×’s), and contralateral (triangles) 60-dB SPL elicitors were measured in the postelicitor window. The squares are ΔSFOAEn magnitudes from the same contralateral elicitors except measured in a during-elicitor window. ΔSFOAEn is the change in the SFOAE normalized by the magnitude of the SFOAE. BBN = broadband noise (0.1–10 kHz, or 6.67 octaves). Note that BBN is centered only for the 1-kHz probe. For probe tones at 0.5, 1, and 4 kHz, data from 5, 15, and 9 ears, respectively, were included in the averages. Error bars are standard errors of the mean. Horizontal lines: solid = noise-floor mean, dotted = 1 standard deviation above the noise mean. Average values that were statistically different from the noise floor are indicated by the small symbols: one symbol = P ≤ 0.05, two symbols = P ≤ 0.01, three symbols = P ≤ 0.001.

From the set 2 and set 3 data described above, we then calculated composite bandwidth series to compare with the real bandwidth series. For each of the 12 ears of set 2, we started with the amplitude of the MOC effect from the MOC effect versus frequency series of set 2 and applied the growth shape from set 3 (but not the absolute values in the growth functions). For each half-octave band in each ear, the corresponding growth function was scaled so that it showed the same MOC effect at 60 dB SPL as the data from set 2. Then by interpolation along this scaled growth function, we obtained the MOC effect that would be produced by that half-octave band at the lower spectral level that would make each composite frequency band have a total overall level of 60 dB SPL. Finally, all of the appropriate half-octave MOC-effect magnitudes were vectorially added to estimate the MOC effect produced by each composite bandwidth.
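
The composite calculation can be sketched as follows. This is a hedged illustration of the steps described above, not the authors' code. The data layout, the function names, the use of piecewise-linear interpolation between the 30-, 45-, and 60-dB SPL points, and the assumption that each half-octave band's share of the 60-dB SPL total follows its width in Hz (flat noise) are all assumptions made for the example. The phase of each band's effect is taken from the 60-dB SPL measurement.

```python
import math

GROWTH_LEVELS_DB = (30.0, 45.0, 60.0)  # elicitor levels at which growth functions were measured

def interp(x, xs, ys):
    """Piecewise-linear interpolation, clamped to the measured range (assumed scheme)."""
    if x <= xs[0]:
        return ys[0]
    for (x0, y0), (x1, y1) in zip(zip(xs, ys), zip(xs[1:], ys[1:])):
        if x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return ys[-1]

def band_level_db(total_db, band_hz, composite_hz):
    """Level of one sub-band of flat noise whose overall level is total_db."""
    band_width = band_hz[1] - band_hz[0]
    total_width = composite_hz[1] - composite_hz[0]
    return total_db + 10.0 * math.log10(band_width / total_width)

def composite_effect(bands, total_db=60.0):
    """Vector sum of half-octave effects, each rescaled to its level within the composite band.

    bands: list of dicts with keys
      'edges_hz'  -> (low, high) edges of the half-octave band
      'effect_60' -> complex MOC effect measured in this ear at 60 dB SPL (set 2)
      'growth'    -> average effect magnitudes at 30, 45, 60 dB SPL (set 3, nonzero at 60)
    """
    lo = min(b["edges_hz"][0] for b in bands)
    hi = max(b["edges_hz"][1] for b in bands)
    total = 0j
    for b in bands:
        level = band_level_db(total_db, b["edges_hz"], (lo, hi))
        # Only the growth *shape* is used: the ratio rescales this ear's own 60-dB value.
        ratio = interp(level, GROWTH_LEVELS_DB, b["growth"]) / b["growth"][-1]
        total += b["effect_60"] * ratio
    return total
```

For a single half-octave band the composite reduces to the measured 60-dB SPL effect, which is a convenient sanity check on the scaling.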

Subjects

All subjects included in this study had normal hearing thresholds in both ears (i.e., 20 dB HL, or better, at octave frequencies from 250 Hz to 4 kHz). Measurements were conducted in a sound-attenuating room. A warning light was automatically turned on before each stimulus presentation and the subject was instructed to sit still while the light was on and sounds were presented. Data from ears were accepted only if (1) there were no significant middle ear muscle (MEM) contractions at the stimulus levels used for each experiment (MEM test explained below), (2) the subject was able to stay awake and sit still during the experiment, and (3) the subject returned to complete the study. A minimum SNR criterion of 3 (∼9.5 dB) was applied to each data set to minimize amplitude estimation bias (Backus 2007). The criterion was applied to the maximum point of the MOC effect rather than to each point individually to allow points to be used even if they showed no MOC effect, as long as there was a large effect at some point in the series for that ear. The number of ears and subjects as well as their gender and age ranges are summarized in Supplementary Table S1. All experiments followed protocols approved by the Massachusetts Eye and Ear Infirmary and Massachusetts Institute of Technology human studies committees.
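
As a small worked check of the acceptance criterion, the sketch below converts an amplitude SNR of 3 to decibels and applies the criterion to the maximum point of a series. The function names are hypothetical, and defining SNR as the effect magnitude divided by the noise-floor magnitude is an assumption made for illustration.

```python
import math

def snr_db(snr_ratio):
    """Amplitude ratio expressed in dB."""
    return 20.0 * math.log10(snr_ratio)

def series_passes(effect_magnitudes, noise_floor, criterion=3.0):
    """Accept a series if its largest point exceeds the SNR criterion (assumed SNR definition)."""
    return max(effect_magnitudes) / noise_floor > criterion

print(round(snr_db(3.0), 2))
```

Because only the maximum is tested, points with little or no MOC effect remain in an accepted series.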

Acoustic stimuli

As a probe sound to evoke SFOAEs in both ears, a continuous tone was simultaneously presented at 40 dB SPL to both ears. The tone was in the 0.5-, 1-, or 4-kHz frequency region and was produced by one earphone of an Etymotic ER10C acoustic assembly in each ear. For each subject and frequency region, a probe frequency that was within 10% of the target frequency (0.5, 1, or 4 kHz) was selected that (1) was at least 100 Hz away from any spontaneous OAE with magnitude above −10 dB SPL (to avoid possible entrainment) and (2) produced the largest ΔSFOAE magnitude to a 60-dB SPL contralateral BBN (so that the SNR criterion could be reached with the smallest number of stimulus repeats). No overall differences were found in the normalized ΔSFOAEs from frequencies selected within ±10% of the target frequency, as explained above, versus always using a probe at the target frequency (Backus and Guinan 2007).

To elicit MOC activity, a 60-dB SPL noise band was presented ipsilaterally, contralaterally, or bilaterally for 2.5 s. Preceding the elicitor was a 0.5-s onset period from which the baseline response was measured, and following the elicitor was a 2-s period for recovery. The resulting stimulus repetition period was 5 s. In each of the three experiments, the elicitors were presented in a randomized order to avoid bias.

Calibrations of the acoustic outputs from each of the two sound sources in each ER10C acoustic assembly were performed by presenting a spectrally flat, random-phase, broadband (0.1 to 10 kHz) electrical signal (i.e., a “noise”) to the sound source and measuring the resulting ear-canal sound pressure using the microphone in the ER10C. The fast Fourier transform of the microphone signal divided by the level of the electrical signal gave the calibration at each frequency. This was done at the beginning of every data gathering session and frequently within a session to account for small changes in the probe fit in the ear canal. Noise bursts were made spectrally flat at the ER10C probe tip by applying these calibrations.

The SFOAE produced by the probe tone alone, referred to as the “baseline SFOAE”, was measured by the suppression method (Guinan 1990; Kalluri and Shera 2007). With this method, a second tone was presented at a frequency near the probe tone and a level 20 dB, or more, above the probe-tone level. This “suppressor tone” pushes OHC stereocilia into saturation regions, which lowers the cochlear amplifier gain at nearby frequencies and thereby suppresses the SFOAE from the probe. Because of their different actions, we distinguish between this “two-tone suppression” and MOC “inhibition”. To measure the SFOAE by suppression, we presented a 40-dB SPL probe tone with and without a suppressor (a 60-dB SPL tone, 110 Hz below the probe frequency, presented for 500 ms every second) and calculated the SFOAE as the vector difference between the probe-frequency ear-canal sound pressures with and without the suppressor. Although Backus and Guinan (2007) found that these parameter values produce only 80–100% suppression, this did not affect the shape of the MOC-effect curves because, for a given ear, all points were normalized by the same SFOAE magnitude. For both suppressor tones and MOC elicitors, 5-ms rise/fall cosine ramps were used to minimize spectral splatter. Consecutive elicitor or suppressor presentations had opposite polarities so that after averaging, their acoustic waveforms would cancel, leaving a residual equal to the induced change in the SFOAE.
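
The vector-difference step can be illustrated with complex numbers. This is a toy model under stated assumptions, not the measurement code: pressures are in arbitrary units, the function names are hypothetical, and incomplete suppression is modeled as removing a fixed fraction of the SFOAE.

```python
import cmath

def measured_pressures(probe, sfoae, suppression_fraction):
    """Probe-frequency ear-canal pressure without and with the suppressor (toy model)."""
    p_alone = probe + sfoae
    p_suppressed = probe + (1.0 - suppression_fraction) * sfoae
    return p_alone, p_suppressed

def sfoae_by_suppression(p_alone, p_suppressed):
    """SFOAE estimate as the vector difference of the two complex pressures."""
    return p_alone - p_suppressed

probe = 1.0 + 0j
sfoae = 0.02 * cmath.exp(0.7j)
for fraction in (1.0, 0.8):
    estimate = sfoae_by_suppression(*measured_pressures(probe, sfoae, fraction))
    print(fraction, abs(estimate) / abs(sfoae))
```

With 80% suppression the estimate is the true SFOAE scaled by 0.8, so normalizing every point from one ear by the same estimate rescales the curve without changing its shape.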

Measurement analysis

Data gathering runs were done in blocks that averaged four to eight (always an even number) artifact-free responses at each elicitor frequency. Responses were rejected as being contaminated with artifacts when the difference between one pair of responses and the next pair exceeded a criterion set for each subject. Stimulus pairs were used because elicitors (or suppressors) were alternated in polarity across stimulus repetitions. Multiple data blocks were averaged until the SNR criterion was reached and at least 12 stimulus repetitions were included (N ≥ 12). The averaged waveforms were then heterodyned (heterodyning is equivalent to sending the signal through a lock-in amplifier; for more details, see Guinan et al. 2003) to obtain P(t), the magnitude and phase of the sound pressure at the probe frequency as functions of time over one repetition period (note: the magnitude and phase are expressed by the single complex time function P(t)).
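
Heterodyning can be sketched in a few lines. The example below is a generic illustration rather than the procedure of Guinan et al. (2003): the function name, the moving-average low-pass filter, and the window length are assumptions chosen for simplicity.

```python
import cmath
import math

def heterodyne(samples, fs_hz, f_probe_hz, window_s):
    """Complex amplitude P(t) at f_probe_hz, obtained by frequency shifting and averaging.

    The moving-average low-pass filter is an assumed choice; a window spanning an
    integer number of probe periods removes the double-frequency term.
    """
    n_win = max(1, round(window_s * fs_hz))
    # Shift the probe component to 0 Hz; the factor 2 restores the tone's amplitude.
    shifted = [2.0 * x * cmath.exp(-2j * math.pi * f_probe_hz * n / fs_hz)
               for n, x in enumerate(samples)]
    out = []
    acc = 0j
    for n, value in enumerate(shifted):
        acc += value
        if n >= n_win:
            acc -= shifted[n - n_win]  # slide the window forward
        if n >= n_win - 1:
            out.append(acc / n_win)
    return out

fs = 20000.0
tone = [0.5 * math.cos(2 * math.pi * 1000.0 * n / fs + 0.3) for n in range(2000)]
p = heterodyne(tone, fs, 1000.0, 0.01)
print(abs(p[0]), cmath.phase(p[0]))
```

The magnitude and phase of each output sample correspond to the single complex function P(t) described above.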

The ear-canal sound pressure at the probe frequency, P(t), is the vector sum of the probe tone and the evoked SFOAE(t) (both are complex quantities, i.e., they have amplitudes and phases). An MOC elicitor, or a suppressor, can induce a change in SFOAE(t) and thereby change P(t). Changes in the SFOAE(t), ΔSFOAE(t), were calculated from P(t) by subtracting P_baseline from P(t), where P_baseline is the vector average of P(t) between 50 and 450 ms, i.e., the average P(t) before the onset of the suppressor or MOC elicitor (see Guinan et al. 2003 for more detail). The MOC-induced change was then expressed as the normalized ΔSFOAE(t), ΔSFOAEn(t), i.e., as the change in the SFOAE normalized by the magnitude of the SFOAE. ΔSFOAEn(t) was obtained by dividing ΔSFOAE(t) by the baseline SFOAE vector. That is, ΔSFOAEn(t) = ΔSFOAE(t)/SFOAE. Thus, ΔSFOAEn(t) is ΔSFOAE(t) expressed as a fraction of the SFOAE magnitude.
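
The normalization above can be written compactly. The following sketch uses hypothetical names and a synthetic P(t); it assumes P(t) and the baseline SFOAE are available as complex values sampled at known times.

```python
import cmath

def vector_mean(values):
    """Average of complex values (vector average)."""
    return sum(values) / len(values)

def delta_sfoae_n(p_t, t_s, sfoae_baseline, baseline_window=(0.050, 0.450)):
    """Normalized change in the SFOAE: (P(t) - P_baseline) / SFOAE."""
    start, stop = baseline_window
    p_baseline = vector_mean([p for p, t in zip(p_t, t_s) if start <= t < stop])
    return [(p - p_baseline) / sfoae_baseline for p in p_t]

# Synthetic example: the SFOAE drops to 80% of its baseline while the elicitor is on.
probe = 1.0 + 0j
sfoae = 0.02 * cmath.exp(0.7j)
times = [i * 0.01 for i in range(500)]
pressure = [probe + sfoae * (0.8 if 0.5 <= t < 3.0 else 1.0) for t in times]
change = delta_sfoae_n(pressure, times, sfoae)
print(abs(change[100]))
```

In this synthetic case the normalized change during the elicitor has a magnitude of about 0.2, i.e., a 20% reduction expressed as a fraction of the SFOAE.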

To obtain MOC-effect metrics from ΔSFOAEn(t), the data were vector averaged within certain time windows. (1) The postelicitor window: Ipsilateral and bilateral elicitors centered on the probe frequency produce “two-tone suppression” while they are on. This suppression obscures the MOC effect during the elicitor. Fortunately, suppression decays in a few milliseconds, much more quickly than the decay of the MOC effect (i.e., τ_Suppression < 10s of ms, τ_MOCR ∼ 100s of ms; Guinan 1990). Thus, for ipsilateral and bilateral elicitors (and contralateral elicitors when they were plotted on the same axis as the others), we used a 100-ms postelicitor window starting 50 ms after the end of the elicitor. This postelicitor window avoids two-tone suppression, but at the cost of obtaining only a decaying part of the MOC effect. (2) The during-elicitor window: Contralateral MOC elicitors and off-centered elicitors 2 octaves below the probe frequency produced no suppression of the SFOAE, so measurements could be made during the elicitor. Two during-elicitor windows were used. For the off-centered elicitors, we used a 400-ms time period that ended 50 ms before the end of the elicitor. This long window provided the best SNR. When centered-elicitor contralateral and ipsilateral data were in the same figure (Fig. 1), we used a 100-ms time period (also ending 50 ms before the end of the elicitor) so that all the data had comparable noise floors. (3) The noise-floor window: The noise floor was estimated by averaging ΔSFOAEn(t) over a time window of the same length as the data analysis time window (during or postelicitor) but positioned such that the end of the noise time window was 50 ms before the end of the stimulus repetition period.
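
The three analysis windows can be laid out against the stimulus timing given earlier (elicitor on from 0.5 to 3.0 s within a 5-s repetition period). The sketch below uses hypothetical function names and is meant only to make the window positions explicit.

```python
ELICITOR_OFF_S = 3.0  # 0.5-s onset period plus 2.5-s elicitor
PERIOD_S = 5.0        # stimulus repetition period
MARGIN_S = 0.050      # 50-ms offset used for all windows

def post_window(length_s=0.100):
    """Postelicitor window: starts 50 ms after the elicitor ends."""
    start = ELICITOR_OFF_S + MARGIN_S
    return (start, start + length_s)

def during_window(length_s):
    """During-elicitor window: ends 50 ms before the elicitor ends."""
    end = ELICITOR_OFF_S - MARGIN_S
    return (end - length_s, end)

def noise_window(length_s):
    """Noise-floor window: same length, ends 50 ms before the repetition period ends."""
    end = PERIOD_S - MARGIN_S
    return (end - length_s, end)

def window_average(values, times, window):
    """Vector average of complex values whose times fall inside the window."""
    start, stop = window
    inside = [v for v, t in zip(values, times) if start <= t < stop]
    return sum(inside) / len(inside)

print(post_window(), during_window(0.400), during_window(0.100), noise_window(0.100))
```

Matching the noise-window length to the analysis-window length keeps the noise-floor estimate comparable to the data it is compared against.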

Middle ear muscle contraction test

A MEM test was performed on each subject to ensure that the MEMs were not activated at the stimulus levels used. In this test, a 65-dB SPL continuous suppressor tone at 110 Hz above the probe frequency was presented in addition to the normal acoustic stimuli. This tone completely suppressed the probe-frequency SFOAE, with the result that only MEM contractions could produce a change in the sound pressure at the SFOAE probe frequency. Thus, the presence of an elicitor-induced change in the ear-canal sound at the SFOAE frequency indicates there has been a MEM contraction, while responses within the noise floor are taken to mean there has been no significant MEM contraction (see Guinan 2006).

Determining the statistical significance of variations across parameters

To determine the statistical significance of the variations of MOC effect obtained with variations in stimulus parameters, we employed an n-dimensional analysis of variance (ANOVA; Matlab Statistical Toolbox) with a Bonferroni correction for multiple comparisons. These statistical tests were done only when the MOC effect results passed the Lilliefors test for normality. If they did not pass, the Friedman nonparametric two-way ANOVA was used. Statistical significance was accepted at the 0.05 level. Data were compared to the noise-window measurements that gave the response for the null hypothesis of no ΔSFOAEn.

Results

MOC effects were quantified by ΔSFOAEn, the change in the SFOAE normalized by the magnitude of the SFOAE. For all of the data shown here, MEM tests were done and revealed no significant MEM contractions in response to the stimuli used (see “Methods” section). Hence, all of the measured elicitor-induced changes arose from within the cochlea, presumably due to MOC activity. Most measurements with ipsilateral and bilateral elicitors used a postelicitor window to avoid two-tone suppression of the probe-tone SFOAE. Judging from contralateral elicitor data where both during- and postelicitor responses can be measured, the magnitudes in the postelicitor window were 1/2–2/3 of those from the during-elicitor window (compare triangles to squares in Fig. 1). We did not see any systematic differences in the shapes of data from the postelicitor window versus from the during-elicitor window.

Multidimensional ANOVA on each data set showed no significant differences between the average left and right ear responses except for the off-centered bandwidth series at 1 kHz (ANOVA results in Supplementary Material, Table S2). Considering the small number of ears involved (six), the reliability of right–left difference in the 1-kHz off-centered bandwidth series is questionable. Accordingly, in each data set, responses from right and left ears have been pooled.

MOC effects from bandwidth series centered at the probe frequency

MOC effects as functions of elicitor bandwidth for elicitors centered at the probe frequency are shown in Fig. 1. Elicitor bandwidth, elicitor laterality, and subject all produced significant variation of MOC effects (ANOVA results in Supplementary Table S2). In general, the magnitude of ΔSFOAE increased as the elicitor bandwidth increased. For 500 Hz probes, the maximum effect was for 2 and 4 octave stimuli with slightly lower effects for broadband noise (which is not a centered stimulus at 500 Hz). For 1 and 4 kHz probes, the maximum effect was for broadband noise (Fig. 1). The increase in response as bandwidth increased was observed despite the fact that, since the overall elicitor level was held constant, the noise energy at frequencies near the probe frequency decreased as bandwidth increased.

Figure 1 also shows that for all probe frequencies, the bilateral response magnitude was greater than the ipsilateral and contralateral response magnitudes regardless of the elicitor bandwidth. When elicitors were narrow, ipsilateral effects were greater than the contralateral effects, as expected from animal work (look only at data from the postelicitor window where ipsilateral and contralateral data were measured in the same way). In contrast, as elicitor bandwidth increased, the MOC effects from ipsilateral and contralateral elicitors converged (Fig. 1).

MOC effects for bandwidth series centered 2 octaves below the probe frequency

To determine if the increases in MOC effects as elicitor bandwidth widens occur only when the elicitors are centered at the probe frequency, we did bandwidth series with the elicitor centered 2 octaves below the probe frequency. These off-centered noise bands elicited increasing MOC effects with increasing elicitor bandwidth for all elicitor lateralities and probe frequencies (Fig. 2). In addition, binaural MOC effects were the largest, and ipsilateral MOC effects were slightly larger (at 1 kHz) or similar (at 4 kHz) to contralateral MOC effects.

FIG. 2

MOC effect as a function of elicitor bandwidth for noise bands centered 2 octaves below the probe frequency. ΔSFOAEn magnitudes from bilateral (circles), ipsilateral (×’s), and contralateral (triangles) 60-dB SPL elicitors were measured in a during-elicitor window. For probe tones at 1 and 4 kHz, data from six and three ears were included in the averages. Error bars are standard errors of the mean. Horizontal lines: solid = noise-floor mean, dotted = 1 standard deviation above the noise mean. Average values that were statistically different from the noise floor are indicated by the small symbols: one symbol = P ≤ 0.05, two symbols = P ≤ 0.01, three symbols = P ≤ 0.001.

At first glance, the 4-kHz off-centered elicitors appear to produce MOC effects of similar magnitude to those of the centered-bandwidth elicitors (comparing Fig. 1 data up to 2 octaves with Fig. 2). However, the off-centered data were obtained from a during-elicitor window while three of the four traces of the centered data in Fig. 1 were obtained from the postelicitor window, which yields lower magnitudes. For contralateral elicitors, where both sets of data were measured using a during-elicitor window, the centered elicitors at both 1 and 4 kHz produced MOC effects that were two to three times larger than the off-centered elicitors.

Strength of MOC effects as a function of probe frequency

To compare MOC effects across probe frequency, the MOC effects from probe-centered elicitors (from Fig. 1) and off-centered elicitors (from Fig. 2) are plotted against probe frequency in Fig. 3. For half-octave and 2-octave elicitors, MOC effects decreased as a function of probe frequency at all elicitor lateralities (Fig. 3). For BBN elicitors (Fig. 3C), MOC effects did not show a clear trend across probe frequency, except that the 4-kHz effects were always the smallest.

FIG. 3

MOC effects tended to decrease as probe frequency increased. ΔSFOAEn magnitudes from centered noise bands (top) measured in the postelicitor window and from off-centered noise bands (bottom) measured in a during-elicitor window. Error bars are standard errors of the mean.

Figure 3 shows in a different way than earlier that for both centered and off-centered elicitors, bilateral MOC effects were greater than either the ipsilateral or the contralateral MOC effect. Interestingly, the ipsilateral and contralateral MOC effects were similar in all cases except the centered half-octave elicitors where the ipsilateral MOC effect was considerably greater than the contralateral MOC effect.

MOC effect versus bandwidth estimated from half-octave elicitor MOC effects

To help understand the process by which sounds of increasing bandwidth produce increased excitation, we sought to determine if the addition of the effects of adjacent half-octave elicitors would equal the effect of a single wide-band noise of the same spectral density (and overall SPL). This was done in several steps. First, in 12 ears from six subjects, we measured the MOC effects produced by adjacent half-octave noise elicitors with center frequencies spanning 4 octaves around the probe frequency (data shown in Supplementary Figure S1). These data are similar to data shown in Lilaonitkul and Guinan (2009), but were independently obtained and had fewer responses averaged from each subject. The MOC effects from these 60-dB SPL elicitors were scaled to show the MOC effects at the spectral level of a single noise band with a bandwidth equal to the composite bandwidth and a net level of 60 dB SPL; this was done using average MOC effect growth functions obtained from three ears (three subjects; data shown in Supplementary Figure S2, see Methods for more detail). The MOC effects from each half-octave elicitor were then vectorially added to give the calculated composite MOC effects shown by the dashed lines in Fig. 4. The actual measured values for the same 12 ears are shown as the solid lines in Fig. 4. For contralateral elicitors, the calculated MOC effect was remarkably similar to the actual MOC effect (Fig. 4C). However, for the ipsilateral and bilateral MOC effects, the calculated effects were generally greater than the actual MOC effects, especially for wider elicitors (Fig. 4A, B). Given the close fit achieved in the contralateral case, the departure observed for the ipsilateral and bilateral effects seems more likely to be due to a real physiological phenomenon than an inaccurate calculation. Thus, the data show that the MOC effects from contralateral elicitors are approximately equal to the sum of the effects from individual half-octave noise bands, but the MOC effects from ipsilateral elicitors of 1.5 octaves or more are considerably less than the sum of the effects from the individual bands.

FIG. 4

MOC effects (ΔSFOAEn magnitudes) calculated by summing the effects of half-octave band elicitors (dashed lines and ×’s) compared to measured effects (solid lines) from 60-dB SPL elicitors of different bandwidths (see “Methods”). The calculated effects were similar to the measured effects for contralateral elicitors but were greater than the measured effects for ipsilateral and bilateral elicitors. Error bars represent standard errors of the mean.

Discussion

The MOC effects are due to inhibition of SFOAE sources near the probe-frequency place

We attribute the measured MOC effects to inhibition of SFOAE components that originate near the peak of the probe-tone traveling wave. Although Siegel et al. (2003, 2005) proposed that SFOAEs “originate from sources (or generators) that are widely distributed along the length of the cochlea near and basal to the peak of the traveling wave”, the data on which this was based came from animals, not humans, and it is not clear that this explanation is correct even for animals. The data for this conclusion are largely from chinchilla (Siegel et al. 2003, 2005; also see Shera et al. 2008) and cat (Guinan 1990), particularly from the finding of residuals at the frequency of a probe tone that were produced by a second higher-level so-called suppressor tone that was swept to frequencies several octaves above the probe frequency. Although Siegel et al. (2003) and Guinan (1990) interpreted these residuals as showing suppression of SFOAE generators that are far basal to the probe-tone cochlear place, an alternate explanation is that the probe-frequency residuals are showing a distortion component at the probe-tone frequency produced by the “suppressor” tone, i.e., this component is not there when there is no high-frequency “suppressor” tone (Shera et al. 2004). However, whatever the origin of the residuals in animals, the residual pattern from similar experiments in humans is very different. For “suppressor” tones swept in frequency around probe tones in 24 subjects, Keefe et al. (2008) found probe-frequency residuals that had amplitude plateaus for suppressor frequencies near the probe frequency and amplitudes that decreased at higher frequencies, while maintaining constant phases throughout. Kemp and Chum (1980) also show data from humans consistent with this. This pattern is consistent with the suppression acting in the tonotopic region of the probe tone, i.e., that the place of generation of the SFOAE residual did not vary as the suppressor frequency varied (Keefe et al. 2008). In addition, in three subjects, Backus and Guinan (2007) found that as “suppressor” tones were increased in frequency, the residual decreased to the noise floor at about 1 octave above the probe frequency. In contrast to these patterns, in chinchillas and cats, large residuals (often larger than the residual produced by “suppressors” near the probe frequency) were found with suppressors many octaves above the probe frequency and the phase of these residuals varied dramatically with “suppressor” frequency. Overall, the data are consistent with the interpretation that, in humans, SFOAEs originate from a region near the tonotopic place of the probe tone. Finally, the data from the elicitor bandwidth series centered 2 octaves below the probe frequency cannot be explained by inhibition of SFOAE components that originate 2 octaves apical to the probe-frequency place because the probe-frequency traveling wave is cut off before it reaches this place. Thus, these elicitors, although distant in frequency from the probe, appear to be eliciting MOC activity that has its effect in the cochlear region near the probe frequency. A simple explanation consistent with all of the human data is that the SFOAE changes we measure with wide-band elicitors are due to MOC activity acting near the probe-tone place to inhibit the probe-tone SFOAE.

How does the observed inhibition of SFOAEs correspond to neural inhibition?

For hearing function, the important effects of MOC activity are on cochlear neural responses, so we would like to know what the observed SFOAE inhibitions tell us about the corresponding neural inhibitions. Shock-evoked MOC effects in animals show similar time courses in the change in SFOAEs compared to the change in neural or cochlear-microphonic responses (Guinan 1986; Kemp and Souter 1988). MOC inhibition of SFOAEs and cochlear compound action potential (CAP) neural responses had the same threshold and parallel growth as MOC-shock level was increased (Guinan 1986). However, the relationship between the dB change in SFOAEs and the dB change in CAPs has not been determined. For distortion-product OAEs (DPOAEs), the neural change was similar to, or larger than, the DPOAE change (both in dB; Puria et al. 1996). Whether a similar pattern holds for SFOAEs is unknown. The pattern of MOC inhibition on SFOAEs in humans is similar in many respects to what would be expected if the SFOAE is proportional to the amount of cochlear amplification at the probe frequency (Kemp and Chum 1980; Keefe et al. 2008). This suggests that SFOAEs provide a good metric for measuring MOC effects on cochlear amplification in humans, but with an unknown proportionality constant.

The integration of MOC effects across frequency

Figure 1 demonstrates that MOC effects increase with elicitor bandwidth well past the 2 octaves shown by Maison et al. (2000). Since the elicitor SPL was constant as bandwidth increased, the spectral density at frequencies near the probe frequency decreased; nonetheless, the overall MOC effect increased. Thus, the MOC activation from frequencies added as bandwidth increased more than compensated for the lowered sound levels at frequencies near the probe. For 1 kHz, the MOC effect saturated at 4 octaves. Because the MOC effect did not decrease as bandwidth increased from 4 to 6.7 octaves, the frequency regions added between 4 and 6.7 octaves must have supplied enough MOC activation to overcome the decrease in activation at frequencies close to the probe. These results indicate that MOC reflexes integrate information from almost the whole cochlea.
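The trade-off between constant overall SPL and falling spectral density can be made concrete with a short calculation. The sketch below is illustrative only: it assumes a flat-spectrum noise band placed geometrically (in octaves) around a 1-kHz center and an overall level of 60 dB SPL, and the function names are hypothetical. It uses the standard relation that spectrum level equals overall level minus 10 log10 of the bandwidth in Hz.

```python
import math

def band_edges(center_hz, bandwidth_oct):
    """Lower and upper edges of a band centered geometrically on center_hz.

    Assumption: the band extends bandwidth_oct / 2 octaves on each side.
    """
    half = bandwidth_oct / 2.0
    return center_hz * 2.0 ** (-half), center_hz * 2.0 ** half

def spectrum_level_db(overall_spl_db, center_hz, bandwidth_oct):
    """Spectrum level (dB SPL per Hz) of a flat noise band with fixed overall SPL."""
    lo, hi = band_edges(center_hz, bandwidth_oct)
    return overall_spl_db - 10.0 * math.log10(hi - lo)

if __name__ == "__main__":
    # Illustrative bandwidths; 60 dB SPL and 1 kHz are assumed values.
    for bw in (0.5, 1.0, 2.0, 4.0, 6.7):
        level = spectrum_level_db(60.0, 1000.0, bw)
        print(f"{bw:4.1f} octaves: spectrum level = {level:5.1f} dB SPL/Hz")
```

Under these assumptions the spectrum level near the probe drops steadily as the band widens, so any growth of the MOC effect with bandwidth has to come from the added frequency regions.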

At first glance, MOC reflex integration over many octaves is surprising considering that MOC fibers have narrow TCs and innervate cochlear BF regions close to their BF (Robertson 1984; Liberman and Brown 1986; Brown 1989). Excitation of high-BF auditory nerve fibers in their low-frequency tails seems unlikely to provide this integration because a 60-dB SPL noise is below human tail thresholds (e.g., Oxenham and Plack 1997). Several factors may play a role in spreading the MOC effect. First, individual MOC fibers innervate cochlear regions covering 0.1–1 octave and these regions are not always centered at the cochlear BF corresponding to the MOC-fiber BF. Second, MOC fibers decrease cochlear amplification that originates from a region 1/3–1 octave basal to the BF cochlear place (de Boer and Nuttall 2000; Shera et al. 2007). These factors must spread the MOC effect along the cochlea, but they do not account for MOC activation from noise components several octaves, or more, from the probe frequency.

One anatomical system that could spread MOC effects is the neural network formed by type II auditory nerve fibers. Type II fibers travel longitudinally under the OHCs and form reciprocal synapses with OHCs along the way (Francis and Nadol 1993a, b; Sato et al. 1997; Thiers et al. 2000, 2002a, b, 2008). MOC fibers innervate both OHCs and type II fibers. The action of the neural network formed by MOC fibers, type II fibers, and OHCs is not known, but it seems plausible that it may act to spread MOC effects along the cochlea.

Whatever the role of these peripheral factors, the central nervous system must be a major player in integrating MOC excitation across frequency. In animal experiments, activation by high-level sounds greatly widened subsequently obtained MOC-fiber TCs (Liberman 1988), which indicates that, even though they may show narrow TCs, individual MOC fibers receive information from frequency regions far from their best frequency. Furthermore, broadband noise typically evoked higher MOC firing rates than tones (Liberman 1988; Brown et al. 1998), which is consistent with the MOC neurons integrating afferent activity broadly across frequency. Another factor is that all of the recordings from single MOC fibers were from anesthetized animals but MOC activation is stronger in awake animals (Boyev et al. 2002). Overall, it seems likely that (1) the narrow tone TCs from MOC fibers are not good indicators of brainstem across-frequency convergence of MOC activation, (2) both humans and animals have MOC systems that integrate incoming activation widely across cochlear regions, and (3) the frequency integration of MOC effects in awake humans shown by both Figs. 1 and 2 is due to a combination of all of the factors listed above.

The above factors may explain how MOC excitation is integrated across frequency, but not why the MOC effect increased as bandwidth increased. This increase is likely to originate from the compressive mechanical response of the cochlea and the narrow dynamic range of auditory nerve fibers. Because of the compressive mechanical response, spreading stimulus energy to wider bandwidths produces only small decreases in mechanical response at central frequencies, and this decrease is more than compensated for by newly excited distant frequencies. Furthermore, low-threshold auditory nerve fibers have narrow dynamic ranges, so the high spectral density of concentrated noise bands may saturate them and the lower spectral density of wider bandwidth elicitors may reduce their rate very little. The combination of these factors can lead to increased total auditory nerve firing as a constant SPL noise is increased in bandwidth. As long as MOC neurons adequately integrate the information across frequency, this increased peripheral excitation will increase MOC activation as bandwidth is increased. Maison et al. (2000) suggested a similar hypothesis. This hypothesis is consistent with both the centered and off-centered bandwidth data.
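The compression argument can be illustrated with a toy calculation. The sketch below is not the model used in this study; it assumes that a fixed total stimulus power is divided equally among N independent frequency bands, that each band's response grows as a power law of its input power, and that the band responses simply add. The exponent of 0.3 and the function name are illustrative assumptions, and fiber saturation is ignored.

```python
def summed_response(n_bands, total_power=1.0, exponent=0.3):
    """Total response when total_power is split equally across n_bands.

    Assumptions (illustrative only): each band responds as power ** exponent,
    and band responses add linearly. exponent < 1 models compression;
    exponent == 1 models a linear system.
    """
    per_band_power = total_power / n_bands
    return n_bands * per_band_power ** exponent

if __name__ == "__main__":
    for n in (1, 2, 4, 8, 16):
        compressive = summed_response(n)
        linear = summed_response(n, exponent=1.0)
        print(f"{n:2d} bands: compressive = {compressive:.2f}, linear = {linear:.2f}")
```

In this toy model the summed response equals P^c N^(1-c), so it grows with the number of bands whenever the exponent c is below 1 and stays constant when the system is linear. This is the qualitative behavior that the hypothesis above relies on.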

Our third experiment tested the hypothesized summation across frequency outlined above. The summation of MOC excitation from sounds in individual half-octave bands produced a calculated MOC effect that almost equaled the MOC effect from wide-band noise for contralateral elicitors but not for ipsilateral or bilateral elicitors (Fig. 4). Thus, there is a difference in the frequency-summation properties of the ipsilateral and contralateral MOC reflexes. It seems likely that this difference originates centrally rather than peripherally because crossed and uncrossed MOC fibers have similar synapses at the base of OHCs, and when separately stimulated electrically, both crossed and uncrossed MOC fibers appear to produce the same effects in the cochlea (Guinan et al. 1983; Gifford and Guinan 1987; Guinan and Gifford 1988).

Our finding that MOC effects from narrow-band contralateral elicitors summate almost linearly to produce the MOC effect measured at wider elicitor bandwidths appears to differ from the conclusion of Maison et al. (2000) that noise bands slightly offset from the probe enhance the response. However, the two data sets are different in several respects. Maison et al. obtained the effects of noise bands centered away from the probe frequency by a subtraction process whereas we measured them by a more direct process. Also, the Maison et al. bands were narrower than ours so that the pattern reported by Maison et al. (2000) would not necessarily be shown by our half-octave noise bands.

Ipsilateral versus contralateral MOC effects

The MOC effects we found do not fit the expectation from studies in cats and guinea pigs that the ipsilateral reflex is twice as strong as the contralateral reflex (Robertson 1984; Liberman and Brown 1986; Brown 1989). In squirrel monkeys, 58% of MOC neurons are crossed in the brainstem (Thompson and Thompson 1986), so the ipsilateral/contralateral MOC reflex ratio might be expected to be 58/42 or ∼1.4 (crossed MOC fibers mediate the double-crossed ipsilateral MOC reflex—see Guinan 1996). The ratio in humans is unknown. Also, in cats, the crossed/uncrossed MOC ratio is a function of cochlear place and is near unity in the apex (Guinan et al. 1984). Our data did not show any clear change in the ipsi-/contra-MOC effect ratio as a function of probe frequency, but the data did show that the ratio depended on noise bandwidth. For narrow-band centered elicitors, ipsilateral MOC effects were much greater than contralateral, but for wide-band elicitors, the ipsilateral and contralateral effects were similar. Obviously, the crossed/uncrossed MOC innervation ratio does not change with elicitor bandwidth, so the change in ipsi-/contra-MOC effect ratio is likely to be due to central neural processes.
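The anatomical expectations quoted above follow from a simple ratio, sketched below. The calculation assumes, as the argument in the text does, that reflex strength scales with the fraction of MOC fibers serving each reflex; the function name is hypothetical.

```python
def expected_ipsi_contra_ratio(fraction_ipsi):
    """Expected ipsilateral/contralateral reflex ratio.

    Assumption: reflex strength is proportional to the fraction of MOC
    fibers mediating each reflex, and every fiber serves exactly one reflex.
    """
    return fraction_ipsi / (1.0 - fraction_ipsi)

if __name__ == "__main__":
    # About 2/3 of MOC fibers respond to ipsilateral sound in cats and guinea pigs.
    print(f"cat / guinea pig: {expected_ipsi_contra_ratio(2.0 / 3.0):.2f}")
    # 58% of MOC neurons are crossed in squirrel monkeys.
    print(f"squirrel monkey:  {expected_ipsi_contra_ratio(0.58):.2f}")
```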

Another disagreement is in the pattern of MOC effects across probe frequencies. MOC innervation peaks in the midbasal turn and decreases toward the cochlear apex (Schuknecht et al. 1959). In contrast, MOC effects generally increased as probe frequency decreased (i.e., the effect increased going toward the apex—Fig. 3). The apical increase in MOC effects could be due to central processes that increase MOC firing and/or cochlear properties that produce larger SFOAE changes at lower probe frequencies.

Implications of the results for the role of MOC efferents in hearing

The main role of MOC efferents in everyday hearing is thought to be to reduce masking (Guinan 2006). The narrow MOC-fiber TCs and the MOC cochlear innervation pattern have suggested that the MOC reflex provides frequency-specific antimasking (Winslow and Sachs 1987). In contrast, our data show that MOC reflex effects seen at one place in the cochlea are due to integration of excitation from almost the whole cochlea (Fig. 1). Although no previous study systematically studied MOC reflex integration up to wide bandwidths, more limited data led many authors to also conclude that the most potent elicitor of MOC activity is broadband noise (Norman and Thornton 1993; Micheyl et al. 1999; Velenovsky and Glattke 2002; Lisowska et al. 2002; Maison et al. 2000). The wide bandwidth of the MOC reflex does not mean that antimasking is not the main role of the MOC reflex, nor does it mean that the MOC reflex cannot produce frequency-specific inhibition under some conditions (e.g., when the listener is paying attention to a certain frequency). However, it does imply that for antimasking (or other MOC functions at moderate sound levels) wide-band activation is the normal MOC activation pattern.

At high sound levels, MOC reflexes reduce acoustic trauma (Rajan 1992; Maison and Liberman 2000). We only used sounds at 60 dB SPL, or lower, and it is possible that MOC reflex properties change at higher, traumatic levels. The reflex properties that work best might be different for antimasking versus trauma protection because for trauma protection, the reflex needs to reduce the response at the cochlear place of the sound triggering the reflex, whereas for antimasking, particularly forward masking, the auditory system must be ready to hear a new sound that may be at a different frequency from the masking sound. Thus, for antimasking compared to trauma protection, it may be more useful to have MOC activation elicited by a wide range of frequencies.

Our result that MOC effects are triggered by sounds over a wider frequency range than previously appreciated may impact the interpretation of many previous experiments. In particular, in many psychophysical experiments, masking sounds are used (e.g., “off-frequency maskers”) with the implicit assumption that these maskers affect responses only in the frequency bands directly affected by masker energy. It now seems likely that frequency bands distant from the masker can be affected by MOC activity elicited by the masker and that the effects of this MOC activity should be taken into account in interpreting the results.