INTRODUCTION

The cochlea is not only the recipient of sounds but also a sound generator in itself. The discovery of otoacoustic emissions (OAEs) (Kemp 1978), of the vulnerable sharp tuning of basilar membrane (BM) vibration (Sellick et al. 1982), and of outer hair cell (OHC) motility (Brownell et al. 1985; Zenner et al. 1985; Kachar et al. 1986; Ashmore 1987; Zheng et al. 2000) shaped our current understanding of hearing as an active process (Davis 1983; deBoer 1983; Ashmore and Mammano 2001).

OAEs (Probst et al. 1991), a by-product of the cochlear amplifier (CA), i.e., the amplification mechanism internal to the organ of Corti, are generally detected from the ear canal with a sensitive microphone. Signals recorded in response to brief acoustic stimuli (tonebursts, rectangular- or Gaussian-shaped clicks) are termed transient evoked emissions (Prieve et al. 1996). Emissions that can be detected without deliberately presenting any acoustic stimulus to the ear are termed spontaneous (Probst et al. 1991) and are characterized by sharp spectral peaks at selected frequencies (Robinette and Glattke 2002). Stimulus-frequency emissions (Kemp and Chum 1980) are evoked by presenting the ear with tonal stimuli of low to moderate intensity and changing the input frequency as dictated by sophisticated experimental protocols (Shera and Zweig 1993). Under these conditions, in a region above 1 kHz the pressure level of the OAEs exhibits a modulation amplitude of about 2 dB vs. input frequency, with a periodicity of about 100 Hz. This phenomenon is more pronounced near hearing threshold and disappears above about 40 dB sound pressure level (SPL). Cochlear nonlinearity cannot be invoked to explain stimulus-frequency OAEs, as the BM input–output curve of the amplified cochlea is virtually linear up to 30–40 dB SPL (Robles and Ruggero 2001). A simple explanation based on the present model is provided in the Appendix.

The prevailing model of the cochlea used to explain transient evoked OAE generation is the transmission line (Kemp 1978, 1980; Wilson 1980; Zwicker 1986; Neely and Kim 1986; Kaernbach et al. 1987; Furst and Lapid 1988; Fukazawa 1992). This is conceptually appealing since for a long time evoked emissions have been thought of as being due to “reflectance” of the traveling wave (TW) at putative discontinuities of cochlear partition parameters. In transmission lines, distributed parameter discontinuities upset the amplitude balance between progressive and regressive waves as imposed by the continuity condition for the local flows of energy and momentum and, as a by-product, generate wave reflection. Scattering from random inhomogeneity of the cochlear partition has also been invoked as the main contributor at low sound pressure levels (Zweig and Shera 1995; Talmadge et al. 1998).

In reality, it is difficult to reconcile these concepts with the physics of the cochlea, where energy and momentum for the BM motion are conserved globally rather than locally, for fluid coupling links distal BM sites stepping over possible parameter discontinuities. This implies that no continuity condition is locally imposed on the flows of energy and momentum within the cochlea. Some transmission line models, however, seem to account well, at least qualitatively, for some OAE phenomena near hearing threshold (Shera and Zweig 1993; Talmadge et al. 1998; Shera and Guinan 1999).

Here we propose a different interpretation of OAEs based on the instantaneous fluid coupling between stapes footplate and BM and among the BM oscillating elements themselves. This interpretation does not require modeling the cochlea as a transmission line.

METHODS

Modeling otoacoustic emission

OAE time-domain simulations presented here are based on a hydrodynamic model (Mammano and Nobili 1993; Nobili and Mammano 1996; Nobili et al. 1998), adapted so as to fit physical and geometrical characteristics of the human inner ear and completed with the inclusion of forward and reverse middle-ear transfer functions (Puria and Rosowski 1996; Puria et al. 1997). Model characteristics are discussed in the Appendix and illustrated in Figures 1 and 2. Figure 1 shows schematic diagrams of (a) auditory periphery, (b) organ of Corti, and (c) forward-gain and reverse-gain transfer functions of a human middle ear (top, amplitude; bottom, phase). Figure 2 graphically illustrates (a) the main distributed parameters of the model, (b) the stapes–BM fluid coupling factor, (c) the nonlinear profile of the OHC-generated force as a function of the local displacement η of the tectorial membrane (TM) relative to the reticular lamina (RL), and (d) the hydrodynamic Green’s function that accounts for BM self-interaction mediated by the cochlear fluid.

Figure 1

Inner and middle ear. a. Scheme of the peripheral auditory system including ear canal (EC), middle-ear cavity (ME, dash-dotted contour), and cochlea. ED, eardrum; M, malleus and its ligaments (short dashed line); I, incus; S, stapes; OW, oval window; RW, round window; ET, eustachian tube; BM, basilar membrane; B, base; A, apex. The cochlear spiral canal comprises the cochlear partition which can be thought of as a collection of mechanical oscillators (arrow pair) vibrating transverse to the BM plane (dashed line). b. Scheme of an organ of Corti element: TM, tectorial membrane; RL, reticular lamina; OHCs + DCs, outer hair cells in series with Deiters’ cells. Trans, Rad, Long indicate transversal, radial, and longitudinal directions, respectively. c. Forward (left panels) and reverse (right panels) transfer function of a human middle ear according to Figure 3 of Puria and Rosowski (1996). (Top) Ratio of fluid pressure in scala vestibuli (vestibular pressure) to pressure in the ear canal in decibels (dB). (Bottom) Phase in degrees. Insets show associated impulse responses. Note pronounced filtering irregularities at selected frequencies, shown by the model to be relevant for generating and shaping transient evoked otoacoustic emissions.

Figure 2

Graphic representation of basilar membrane (BM) distributed parameters for the human cochlea model. a. m(x) is mass, k(x) is stiffness, h(x) is positional viscosity, and s(x) is shearing viscosity. All parameters are plotted versus normalized longitudinal position (x) on the BM. b. Profile of the stapes–BM fluid coupling factor G_S(x), representing the downward force sensed by the unit BM segment centered at the BM site x under the action of unit inward stapes acceleration [Eq. (A3)]; G_S(x) also represents the outward force sensed by the stapes under unit BM upward acceleration [Eq. (1)]. c. Sigmoid profile representing the outer hair cell (OHC) motor force as −f_OHC(x, η) vs. stereocilia deflection η [Eq. (A3)]. d. Green’s function G(x, x′) representing the BM–BM fluid coupling factor, i.e., the downward force sensed by the unit BM segment centered at the BM site x under the action of unit upward acceleration of the unit BM segment at x′. G(x, x′) enters as a positive-definite (x ↔ x′) symmetric kernel in the BM motion Eq. (A3). Spikes should be understood as logarithmic singularities of G(x, x′) at x = x′ (Allen 1977). G_S(x) and G(x, x′) were determined using spiral canal geometry, BM width, and stapes area for the human cochlea (Zwislocki–Mościcki 1948; Fernàndez 1952) following the computational procedure described in Mammano and Nobili (1993).

Like all physical models, ours relies on necessary simplifications. It also has some performance limitations (maximum gain is 53 dB instead of 60–65 dB) and lacks precise estimates for some of its parameters (particularly regarding viscosity; Fig. 2a). The model should nevertheless be expected to agree at least qualitatively with experiments.

In our approach, the middle ear is one of the key players for transient evoked OAE generation. As a mechanical pressure transducer, the middle ear converts sound pressure at the tympanic membrane to intracochlear fluid pressure (Olson 1999) so as to match fluid/air interface impedance (Fig. 1a), yielding a maximum forward gain somewhat less than 30 dB (Fig. 1c, left). In the reverse direction, the middle ear converts intracochlear fluid pressure oscillations into ear canal pressure waves with about 30 dB minimum loss (Fig. 1c, right). Forward and reverse transfer functions differ appreciably in their filtering properties and their product exhibits a few dB minimum loss only in the 1–1.5 kHz region. In our computations we used the diagrams published by Puria and Rosowski (1996), redrawn in Figure 1c, as they reproduce the only complete and sufficiently detailed human middle-ear data set that we were able to find in the literature (albeit presented only as preliminary conference proceedings).

Connecting middle and inner ear

The model connection to the middle ear was implemented using the transfer function data in Figure 1c to derive the impulse response of the middle ear (insets). Convolution of the latter with input waveforms representing sound pressure at the tympanic membrane yielded fluid pressure in scala vestibuli near the stapes, or the BM base (vestibular pressure). To link vestibular pressure and stapes acceleration, which was used as input to the inner ear model, we considered that the cochlear acoustic impedance Z_c (vestibular pressure divided by the product of stapes footplate area and stapes velocity) appears approximately independent of frequency in the relevant range for OAEs (Z_c ≈ 21 GΩ; Aibara et al. 2001), implying approximate proportionality between pressure and velocity. Accordingly, stapes acceleration was computed as a quantity proportional to the time derivative of sound pressure at the eardrum convolved with the middle-ear forward impulse response (Fig. 1c, left).
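The chain just described (convolution with the forward impulse response, then a time derivative scaled by the cochlear impedance) can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, the footplate area `A_STAPES` is an assumed typical value that is not given in the text, and the impulse response must be supplied by the user; it is not the authors' Matlab code.

```python
# Illustrative sketch only; no middle-ear data from Puria and Rosowski (1996)
# are reproduced here.

Z_C = 21e9         # cochlear acoustic impedance in Pa*s/m^3 (value quoted in the text)
A_STAPES = 3.2e-6  # stapes footplate area in m^2 (ASSUMED; not given in the text)


def convolve(signal, kernel, dt):
    """Causal discrete approximation of the convolution integral."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k in range(min(n + 1, len(kernel))):
            acc += kernel[k] * signal[n - k]
        out.append(acc * dt)
    return out


def time_derivative(samples, dt):
    """Backward finite difference; silence is assumed before the first sample."""
    return [(samples[n] - (samples[n - 1] if n > 0 else 0.0)) / dt
            for n in range(len(samples))]


def stapes_acceleration(ear_canal_pressure, forward_impulse_response, dt):
    """p_V = h_forward * p_EC, then a_S = (dp_V/dt) / (Z_c * A_S)."""
    p_v = convolve(ear_canal_pressure, forward_impulse_response, dt)
    dp_v = time_derivative(p_v, dt)
    return [d / (Z_C * A_STAPES) for d in dp_v]
```

With a frequency-independent Z_c, a linearly rising vestibular pressure maps to a constant stapes acceleration, which is a convenient sanity check for any implementation of this step.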

Finally, OAEs were computed as poststimulus vestibular pressure, convolved with the reverse transfer function of the middle ear. As the product of forward and reverse middle-ear transfer functions is everywhere less than 1, the middle ear in this model is effectively capable of power dissipation, with negligible reflection coefficient. The method used to compute vestibular pressure is described later.

Modeling fluid coupling

Hydrodynamics is central to cochlear function (Allen 1977; Allen and Sondhi 1979; Kim et al. 1980; Mammano and Nobili 1993) because sound stimuli arriving at the stapes through the middle ear are transmitted to the BM by the fluid filling the spiral canal, and because the organ of Corti vibration itself is heavily conditioned by fluid inertial effects.

In the literature there is some confusion about the possibility of establishing an equivalence between transmission line and hydrodynamic models. Unfortunately, transmission line models reduce fluid coupling to a sort of local interaction, thus failing to represent adequately its long-range character. A rough equivalence between transmission line and hydrodynamic models of the cochlea can be established only for the simplified geometry of the box model, where the bulk portion of Green’s function can be effectively cancelled out by performing a double space derivative operation (Allen 1977). Since our model uses a realistic representation of human cochlea geometry, such a mathematical expedient is inapplicable, leaving us no option but to use the full integrodifferential form of the BM motion equation [see Appendix, Eq. (A3)].

In the cochlea, grading of cochlear partition distributed parameters from base to apex, particularly stiffness and viscosity, and the instantaneous fluid coupling between stapes footplate and BM, and among the BM oscillating elements themselves, contribute to generating BM responses to sounds with characteristic waveforms whose amplitude rarely exceeds tens of nanometers in normal hearing conditions (Robles and Ruggero 2001). Modeling an active cochlea also requires an adequate representation of the nonlinear behavior of the CA that boosts BM oscillatory responses by 2–3 orders of magnitude for sound intensities up to 30–40 dB SPL, which corresponds to the onset of amplifier saturation (Ruggero and Rich 1991; deBoer and Nuttall 2002).
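For orientation, the amplitude ratios and decibel figures quoted for the CA are related by the usual decibel definition for amplitude-like quantities. This is a general conversion rather than a model result, and the symbols A_active and A_passive are introduced here only for illustration.

```latex
G_{\mathrm{dB}} = 20\,\log_{10}\frac{A_{\mathrm{active}}}{A_{\mathrm{passive}}},
\qquad
10^{2} \leftrightarrow 40\ \mathrm{dB},
\qquad
10^{3} \leftrightarrow 60\ \mathrm{dB}.
```

On this scale, a boost of 2–3 orders of magnitude corresponds to 40–60 dB of gain.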

An overview of the model

The cochlear partition was modeled as two (continuous) arrays of damped oscillators representing (1) the BM interacting with surrounding fluid and (2) the TM viscoelastically coupled to the RL through the OHC stereocilia (S in Fig. 1b). The CA force sensed by the BM at x was modeled as a sigmoid-shaped function f_OHC(x, η) of the OHC stereocilia deflection η at the corresponding BM site x, or, equivalently, of the local displacement η of TM relative to RL (Fig. 2c). BM oscillators were longitudinally coupled among themselves, and to the stapes footplate, by long-range hydrodynamic forces f_H(x, t) that depended linearly on the local BM (upward) accelerations a_BM(x, t) and the (inward) stapes acceleration a_S(t) according to the formula

$$f_H(x,t) = -\int_0^1 G(x,x')\,a_{\mathrm{BM}}(x',t)\,dx' \;-\; G_S(x)\,a_S(t) \qquad (1)$$

where x is the positional coordinate of the BM normalized to BM length, G(x, x′) is the (x ↔ x′) symmetric positive-definite Green’s function (Fig. 2d), which represents fluid coupling between the unit BM segments at x and x′, and G_S(x) is the stapes–BM fluid coupling factor (Fig. 2b). As detailed in the Appendix, once moved to the mass side of the BM motion equation [Eq. (A3)], the first term on the right-hand side of Eq. (1) can be interpreted as a nondiagonal BM mass term.
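A minimal numerical sketch of how a force of this kind could be evaluated on a discretized BM is given below. It assumes the form f_H(x, t) = −∫ G(x, x′) a_BM(x′, t) dx′ − G_S(x) a_S(t), with forces counted positive upward (a sign convention inferred from the caption of Figure 2). The function name, the simple quadrature, and the toy numbers in the usage note are illustrative assumptions, not the authors' routines.

```python
def hydrodynamic_force(G, G_S, a_BM, a_S, dx):
    """Discretized fluid-coupling force on each BM segment.

    G    : n x n list of lists, samples of the Green's function G(x_i, x_j)
    G_S  : length-n list, stapes-BM coupling factor G_S(x_i)
    a_BM : length-n list, upward BM accelerations at one instant
    a_S  : float, inward stapes acceleration at the same instant
    dx   : length-n list of segment widths (normalized BM length units)

    Note: G has a logarithmic singularity at x = x', so in practice the
    diagonal entries must be segment averages rather than point samples.
    """
    n = len(a_BM)
    return [
        -sum(G[i][j] * a_BM[j] * dx[j] for j in range(n)) - G_S[i] * a_S
        for i in range(n)
    ]
```

For example, with the toy values `G = [[2, 1], [1, 2]]`, `G_S = [3, 1]`, `a_BM = [1, -1]`, `a_S = 0.5` and `dx = [0.5, 0.5]`, the function returns `[-2.0, 0.0]`. Because the BM term is linear in the accelerations, the matrix `G[i][j] * dx[j]` can be added to the diagonal mass, which is the sense in which it acts as a nondiagonal mass term.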

In previous investigations, seeking to match model responses and experimental data, we were compelled to conceive an amplification mechanism based on the undamping of cochlear partition mechanics, i.e., an effective compensation for intrinsic viscous losses. This idea is far from being new, having been advanced since the pioneering work of Kim et al. (1980) and Neely and Kim (1986), and reproposed by several others after them.

As detailed in the Appendix, the condition that OHC activity compensates for the positional viscosity of the cochlear partition imposes strong constraints on possible stereocilia deflection mechanisms, ultimately on the motion equation of TM relative to RL. Here we present our interpretation.

In the cochlea, the TM shears in the radial direction relative to the RL plane (Fig. 1b), so that the relative oscillation of these structures is virtually unaffected by fluid coupling in the longitudinal direction. Therefore, we modeled the TM–RL subsystem as an array of highly damped harmonic oscillators driven by the underlying portion of the organ of Corti. Due to the large shearing viscosity of the narrow fluid cleft separating the TM from the RL and the small mass of this subsystem (small compared with the fluid mass set into motion by local BM accelerations), its mechanical reaction on the BM motion was neglected. We assumed the TM–RL subsystem to resonate weakly at frequencies close to the corresponding BM characteristic frequencies (CFs) (Gummer et al. 1996; Nobili and Mammano 1996; Hemmert et al. 2000) and to elicit motor responses from OHCs through mechanical input to their stereocilia (Robles and Ruggero 2001). More detailed representations are conceivable but probably unnecessary as viscosity prevents most degrees of freedom within the cochlear partition from expressing their proper oscillation modes. These hindered degrees of freedom are therefore effectively “enslaved” to the principal oscillation mode, i.e., the one amplified by the action of the OHCs.

What actually matters most in cochlear dynamics is the effect of fluid coupling, here represented by G(x, x′), interacting with the BM stiffness (Gummer et al. 1981), which we assumed to be an exponentially graded function of position and such as to reproduce the frequency–position map of the human cochlea (Greenwood 1990). In our model, the local resonance frequency of the BM differs substantially from the naive formula

$$\frac{1}{2\pi}\sqrt{\frac{k(x)}{m(x)}}\,,$$

where k(x) is the local BM stiffness and m(x) is the organ of Corti local mass. According to this formula, to cover the range from 100 Hz to 16 kHz, which is characteristic for the human BM frequency map, k(x) should vary by more than 4 orders of magnitude from base to apex (Fig. 1a), since the frequency ratio of 160 enters squared. As shown in Figure 2a, our model required instead a stiffness varying by only about 3 orders of magnitude, for both cochlea geometry and effective hydrodynamic mass come into play.

Hydrodynamics, combined with saturation of the CA output, also determined the markedly nonlinear properties of the sound processing performed by this model, notably tone-to-tone suppression (Nobili and Mammano 1996). These properties underlie one of the most important functional characteristics of the cochlea (which may seem an engineering paradox): fast responsiveness paired with high frequency selectivity.

Model performance

Numeric solutions of our equation system reproduced well all typical vibration patterns detected from cochlear partitions of mammals (Robles and Ruggero 2001) and were consistent with psychoacoustic data (Zwicker and Fastl 1990). Historically, patterns elicited by pure tones were termed TWs because their progressive phase delay gives the illusion of base to apex propagation. Peaking at BM frequency-dependent locations, they project an input sound spectrum map on the BM.

Model responses to impulsive inputs (clicks) were shaped as spindles (see Fig. 3c, c′) formed by a continuous spectrum of TWs, which in the linear regime appeared to propagate slowly from base to apex.

Figure 3

Transient evoked otoacoustic emissions simulated by the nonlinear cochlear model in the time domain. a, a′. Time waveform of stapes acceleration used as input to the inner-ear model and corresponding Fourier transform (FT) amplitude (b, b′). c, d, c′, d′. Time course of basilar membrane (BM) oscillations following inputs shown in a, a′ plotted at different scales (thick vertical bars, 10 m/s²). e, e′. Simulated otoacoustic emissions and corresponding FT amplitude (f, f′). a–f represent response characteristics of a click filtered through an ideal middle ear with smooth transfer function. Note the absence of any appreciable OAEs. a′–f′ represent similar data for an input click filtered by a middle ear with the transfer functions shown in Figure 1c.

All model responses were remarkably stable with respect to parameter irregularities but critically dependent on CA regulation. CA gain is a function of a distributed control parameter λ(x′) that regulates the amplitude of the OHC motor force at BM site x′. Function λ(x′) was determined computationally as described in the Appendix so as to reach the desired CA gain profile, illustrated in Figure 4a. Due to the nonlocal character of fluid coupling, represented by Green’s function G(x, x′) (Fig. 2d), a local change of λ(x′) at x′ produces a change of the CA gain profile in a wide interval around x′. As shown in the Results section, this functional dependence is responsible for the onset of spontaneous oscillations and the appearance of curious modulation phenomena near the threshold of hearing, where amplification is maximal. These effects disappear as the level of sound input is increased over the range of the BM compressive nonlinearity (30–40 dB SPL and above). Because of such critical behavior, accurate CA gain regulation proved essential in discriminating potential mechanisms of transient evoked OAE generation and in highlighting possible middle-ear contributions.

Figure 4

Cochlear amplifier (CA) gain and its effects on spontaneous otoacoustic emissions. a. Profile of the model CA gain: Ordinates, dB scale as ratio of BM velocity in the active cochlea for near-threshold stimuli to BM velocity in the passive cochlea; upper abscissa, characteristic frequency (kHz); lower abscissa, fractional distance from stapes; thick solid bar marks slight decrease of distributed control parameter λ(x) causing spontaneous basilar membrane (BM) oscillations (b). c. Conceptual sketch to indicate how acceleration of a BM portion (downward arrow labeled a) causes lateral rebound forces (upward arrows labeled f) responsible for unbalancing undamping in the cochlea. Forces rebounding from localized deceleration may overcome a critical threshold (dotted line), starting self-sustained BM oscillations. d. Emissions arising when a subthreshold noise-like input is applied to the stapes in the conditions described in b, simulating spontaneous OAEs. e. Spectrum. f. Input–output curves of basilar membrane (BM) displacement at the indicated characteristic frequencies for the active cochlea model (solid lines) and the passive model [λ(x) ≡ 0, dotted lines].

Computation of vestibular pressure

Based on Eq. (1), vestibular pressure p_V(t) was computed as

$$p_V(t) = -\frac{f_H(0,t)}{w_{\mathrm{BM}}(0)} = \frac{1}{w_{\mathrm{BM}}(0)}\left[\int_0^1 G(0,x')\,a_{\mathrm{BM}}(x',t)\,dx' + G_S(0)\,a_S(t)\right] \qquad (2)$$

where w_BM(0) is the BM width at the base. In general, due to the progressive phase delay of TWs, the effects of poststimulus BM oscillations tend to cancel out at the oval window [p_V(t) ≈ 0]. However, nonlinear modulation of BM oscillation patterns, generally associated with tone-to-tone suppression, may unbalance selected half-wave components in a spindle, yielding non-negligible p_V(t), hence emissions.

Acoustic impedance of the cochlea model

Note that p_V(t) in Eq. (2) is the sum of two terms [see Eq. (1)], one related to BM acceleration and the other to stapes acceleration. A simple computation shows that the second term, which represents vestibular pressure in a cochlea with rigid BM, is approximately equal to 2Lρ a_S(t) ≅ 67 kg/m² × a_S(t), where L is BM length and ρ is the density of the cochlear fluid. In our model, this quantity is almost completely cancelled by the first term at signal onsets, as the BM yields to the stapedial input, thus shortening the hydrodynamic circuit across the BM near the base. The mutual cancellation of the acceleration terms makes vestibular pressure essentially a function of fluid velocity alone. This explains why the acoustic impedance of the cochlea is resistive over a wide frequency range (Aibara et al. 2001). In our model, the acoustic impedance of the passive and the active cochlea are the same and close to 21 GΩ in a frequency range of 0.5–6 kHz.
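The coefficient of the rigid-BM term can be checked by simple arithmetic. The values L ≈ 33.5 mm and ρ ≈ 10³ kg/m³ used below are assumptions chosen to be consistent with the quoted 67 kg/m²; they are not stated explicitly in the text, and the superscript "rigid" is introduced here only as a label.

```latex
2L\rho \approx 2 \times \left(3.35\times10^{-2}\ \mathrm{m}\right)
                 \times \left(1.0\times10^{3}\ \mathrm{kg/m^{3}}\right)
       = 67\ \mathrm{kg/m^{2}},
\qquad
p_V^{\mathrm{rigid}}(t) \approx 67\ \mathrm{kg/m^{2}} \times a_S(t).
```

The units are consistent: kg/m² multiplied by m/s² gives kg/(m·s²), i.e., pascals.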

Numerical methods

All results concerning OAEs presented in this article were obtained by solving the full nonlinear model described in the Appendix. A package of routines written in Matlab (The MathWorks, Inc., Natick, MA) was developed to solve the time-domain equations for the BM and the TM–RL subsystem interacting with each other and the surrounding fluid. The cochlear partition was subdivided into 500 segments using a variable grid spacing, with a Gaussian point density centered at the 2 kHz CF site and a maximum density ratio of about 3:1. The fluid coupling functions (Fig. 2b, d) were constructed from physical and geometric parameters of the human cochlea (Zwislocki–Mościcki 1948; Fernàndez 1952) using the numerical procedure described by Mammano and Nobili (1993). The motion equations for the BM [Eq. (A3)] and TM–RL subsystem [Eq. (A4)] were integrated numerically in the time domain at a sampling rate of 200 kHz. In these computations, the implicit (or backward) Euler method proved to be more efficient than Runge–Kutta methods (Press et al. 1992). A set of Matlab-6 routines that can be used to simulate emissions based on this model is available.
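Two of the numerical ingredients named above can be illustrated on a toy scale: a nonuniform grid whose point density is a Gaussian bump on a constant background, and a backward Euler step for a single damped oscillator. The sketch below is written in Python rather than Matlab and is not the authors' package. The bump position `center`, its `width`, and all function names are assumptions, and the single oscillator stands in for the full coupled system, where each implicit step would instead require solving a linear system that includes the fluid-coupling mass matrix.

```python
import math


def gaussian_density_grid(n_segments=500, center=0.47, width=0.1,
                          max_ratio=3.0, n_fine=20000):
    """Node positions in [0, 1] with density 1 + (max_ratio - 1) * Gaussian.

    `center` and `width` are placeholders (ASSUMED), not values from the text.
    """
    def density(x):
        return 1.0 + (max_ratio - 1.0) * math.exp(-0.5 * ((x - center) / width) ** 2)

    # Cumulative density on a fine uniform grid (trapezoid rule).
    xs = [i / n_fine for i in range(n_fine + 1)]
    cum = [0.0]
    for i in range(n_fine):
        cum.append(cum[-1] + 0.5 * (density(xs[i]) + density(xs[i + 1])) / n_fine)
    total = cum[-1]

    # Invert the cumulative density by linear interpolation.
    nodes, j = [], 0
    for i in range(n_segments + 1):
        target = total * i / n_segments
        while j < n_fine - 1 and cum[j + 1] < target:
            j += 1
        frac = (target - cum[j]) / (cum[j + 1] - cum[j])
        nodes.append(xs[j] + frac / n_fine)
    return nodes


def backward_euler_oscillator(m, h, k, force, dt, n_steps, x0=0.0, v0=0.0):
    """Integrate m*x'' + h*x' + k*x = force(t) with the implicit Euler rule.

    The update solves the implicit equations exactly for this linear case:
    v_new = (m*v + dt*(f_new - k*x)) / (m + dt*h + dt^2*k), x_new = x + dt*v_new.
    """
    xs, vs = [x0], [v0]
    x, v = x0, v0
    for n in range(n_steps):
        f_new = force((n + 1) * dt)
        v = (m * v + dt * (f_new - k * x)) / (m + dt * h + dt * dt * k)
        x = x + dt * v
        xs.append(x)
        vs.append(v)
    return xs, vs
```

The backward Euler rule is unconditionally stable for this linear oscillator, at the price of some numerical damping, which is one common reason for preferring it to explicit schemes on stiff problems.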

RESULTS OF NUMERICAL SIMULATIONS

Overview

To obtain transient evoked OAEs from this model with regularized CA gain profiles, nonlinearity of the undamping force f_OHC(x, η) was mandatory, together with sufficient input sound pressure level (more than 30–40 dB SPL). In fact, simulations from a linear (or linearized) model with regularized CA gain profile, which at low input levels performed like the nonlinear model, yielded zero transient evoked OAEs at all input SPLs.

A totally different scenario emerged in the presence of slight irregularities of the CA gain profile. In this case the nonlinear model gave measurable spontaneous OAEs (Probst et al. 1991) and stimulus-frequency emissions for near-threshold inputs (Kemp and Brown 1983; Zweig and Shera 1995; Talmadge et al. 1998; Shera and Guinan 1999), as well as transient evoked OAEs, irrespective of middle-ear transfer function characteristics. Remarkably, similar stimulus-frequency emissions were generated also by the linearized version of our model at all input levels. Thus, our results suggest that there are at least two main sources of OAEs in the cochlea: one related to CA gain irregularities and the other to middle-ear characteristics. One of the aims of this article is to show how these two sources can be discriminated.

Transient evoked OAEs

Figure 3 summarizes the main results obtained by simulating transient (click) evoked OAEs. Left panels in Figure 3 show time waveform (Fig. 3a) and Fourier transform amplitude (Fig. 3b) of stapes acceleration following a click filtered through a middle ear with an idealized (smooth) transfer function. Figure 3c, d show the time course of BM spindles. Figure 3e, f show corresponding OAE time course and Fourier transform amplitude, respectively. Figure 3a′–f′ show similar quantities for a click filtered through a middle ear represented by the transfer function displayed in Figure 1c (after Puria and Rosowski 1996). Note that spindles as regular as those shown in Figure 3c, d can be obtained only if the CA gain profile is extremely smooth (as in Fig. 4a). After the initial transients due to signal onset, no transient evoked OAEs are seen in Figure 3e, f. In contrast, the most remarkable response features obtained with the same CA gain profile are spindle irregularities and persistence of BM oscillations at CFs close to the sharpest frequency peaks of the middle-ear forward transfer function (Fig. 3c′, d′; see Fig. 1c), and transient evoked OAEs (Fig. 3e′, f′) strikingly similar to those well known to audiologists (Kemp 1978; Probst et al. 1991; Prieve et al. 1996; Robinette and Glattke 2002). Furthermore, the ratio between model OAEs and input pressure level for a click of 0.6 Pa maximum variation was about −40 dB, as found experimentally (see Fig. 2 in Probst et al. 1991).

Transient evoked OAEs appeared to arise as a combination of two main factors, both related to tone-to-tone suppression, which enhanced the irregularity of middle-ear frequency filtering: (1) lateral suppression of comparatively smaller BM oscillations at frequencies close to the frequency of dominant oscillations (winner-takes-all effect) and (2) mutual quenching of BM oscillations associated with a continuum of equally expressed responses.

Properties of cochlear amplifier gain and generation of spontaneous OAEs

Based on this model, an explanation can be found also for the mechanism underlying spontaneous OAE generation. A striking observation concerning spontaneous emissions is the close correspondence between emission frequencies and minima of hearing threshold level (Zwicker and Fastl 1990, p. 44, Fig. 3.23). Unquestionably, increasing cochlear amplification at a given BM site lowers the corresponding hearing threshold, possibly priming spontaneous BM oscillations at that site. However, spontaneous OAEs may arise also from localized damage to the CA. Proving this point required a two-step procedure.

First, as no direct measurements of the human CA gain profile exist in the literature, we resorted to inferring the profile shown in Figure 4a (solid line) by comparing psychoacoustic data from subjects with normal hearing to data from patients with acquired hearing loss of cochlear origin (Carney and Nelson 1983). Note that maximum model gain (47–53 dB amplification) is in the 1–5 kHz interval, which coincides with the typical spectral range of transient evoked OAEs (Robinette and Glattke 2002).

Second, when we represented localized damage to the CA as an indentation in the distributed control parameter λ(x), altering an otherwise smooth CA gain profile (Fig. 4a, solid bar), spontaneous BM oscillations appeared at the CFs of the BM sites corresponding to the indentation ends (see Appendix). With an indentation corresponding to a mere 1–2 dB loss in the 47–53 dB amplification region, any sound input covering a frequency spectrum wide enough to include the indentation CFs generated transient responses that developed into self-sustained BM oscillations (Fig. 4b). The hydrodynamic mechanism underlying this phenomenon is sketched in Figure 4c.

The minimum rectangular indentation width that produced emission at two distinct frequencies covered a CF interval of ≃100 Hz when centered around 2.5 kHz (corresponding to 0.25 Bark; 1 Bark ≃ 20% CF, above 0.5 kHz); a shorter indentation resulted in an unresolved spectral line. Given the oversimplified shape of the indentation, this result agrees remarkably well, at least qualitatively, with the “0.4 Bark rule” that establishes the existence of a minimal frequency distance between neighboring spontaneous emissions (Zwicker and Fastl 1990). Spontaneous oscillations also occurred when the input was a noise-like signal of amplitude comparable to Brownian motion in the ear (Fig. 4d, e). In Figure 4f, dotted lines show representative BM input–output curves for the passive cochlea model; solid lines connect points at which the full nonlinear model’s input–output function was tested; horizontal arrows indicate CA gain (in dB) at the specified CFs.

Stimulus frequency OAEs

Experimentally, a modulation interval of about 100 Hz characterizes the spacing of stimulus frequency OAEs in the 1–2 kHz frequency range (Zweig and Shera 1995). We succeeded in simulating this phenomenon, obtaining the modulation and intensity characteristics illustrated in Figure 5, by imposing the presence of a small CA gain irregularity and a smooth middle-ear transfer function. At variance with the experimental protocol used by Shera and Zweig (1993), our simulations were obtained using input with continuously varying frequency and small gliding rates K = f⁻¹ df/dt and yielded a modulation interval of about 50 Hz. Figure 5 illustrates the effects for K = 2.8 and 0.7 s⁻¹, whereas the experimental protocol would ideally correspond to the limit K → 0. The simplest explanation is that the strong dependence on K of modulation amplitude and phase is due to the settling time of the BM oscillation elicited at the irregularity site (CF = 1.2 kHz). Note that emissions were maximal for near-threshold inputs, i.e., in the conditions of maximal amplification (lower trace). The appearance of stimulus frequency OAE modulations attributable to localized damage is generally associated with the presence of spontaneous OAEs (Shera and Zweig 1993; Shera et al. 2002), in accord with our results.
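A gliding rate defined by K = f⁻¹ df/dt implies an exponential frequency sweep, f(t) = f₀ exp(Kt). The following sketch shows one way such a stimulus could be generated. The function names, the unit amplitude, and the starting frequency are illustrative assumptions, not a description of the stimulus code actually used.

```python
import math


def glide_frequency(f0, K, t):
    """Instantaneous frequency solving df/dt = K * f with f(0) = f0."""
    return f0 * math.exp(K * t)


def glide_phase(f0, K, t):
    """Phase in radians: 2*pi times the time integral of the frequency."""
    return 2.0 * math.pi * f0 * math.expm1(K * t) / K


def glide_tone(f0, K, duration, sample_rate, amplitude=1.0):
    """Sampled tone whose frequency glides exponentially upward from f0."""
    n = int(round(duration * sample_rate))
    return [amplitude * math.sin(glide_phase(f0, K, i / sample_rate))
            for i in range(n)]


def sweep_time(f_start, f_end, K):
    """Time needed to glide from f_start to f_end at rate K."""
    return math.log(f_end / f_start) / K
```

Sweeping one octave takes ln(2)/K, i.e., about 0.99 s for K = 0.7 s⁻¹ and about 0.25 s for K = 2.8 s⁻¹, which gives a sense of how much time the BM oscillation at the irregularity site has to settle in each case.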

Figure 5

Stimulus-frequency otoacoustic emissions. Model emissions detected at the eardrum, elicited by stimuli of 10–40 dB SPL and frequency f slowly varying according to the law df/dt = Kf, were simulated by a time-domain implementation of the model described in the text. Solid lines: (K = 0.7 s⁻¹) emission generated when the cochlear amplifier (CA) is slightly defective at the BM site of CF = 1.2 kHz. Dotted lines: the same, with K = 2.8 s⁻¹. In both cases, modulations of maximum ~2 dB amplitude and ~50 Hz spacing, extending over an interval of ~250 Hz, are noted in the emission profile; their amplitude is larger at smaller input levels and is negligible when the BM response reaches the saturation level of the CA (35–40 dB SPL). Dashed lines: (K = 2.8 s⁻¹) emission of a cochlea with regular (smooth) CA gain profile; no modulations are noted. Traces were offset vertically for clarity.

DISCUSSION

All of the results presented here depended strictly on the hydrodynamic character of cochlear dynamics, in particular, the instantaneous character of fluid coupling between BM and stapes. This model conceives OAEs not as due to some kind of waves back-propagating from irregularity sites on the cochlear partition but rather as residual oscillations of the BM, possibly caused by such irregularities but often imputable to other factors too, and instantly transmitted to the stapes by fluid coupling [Eq. (1)].

To clarify the rationale underlying our approach, we analyze comparatively the BM integrodifferential motion equation [Eq. (A3)] and the hyperbolic differential equation

that governs sound, light, and surface wave propagation (ignoring dissipative effects). As is well known, Eq. (3) admits two independent types of solutions. For example, if the local phase velocity v(x) is a smooth function of its argument, approximate solutions to Eq. (3) for a given frequency ω are

representing forward and backward propagating waves, respectively (Carrier and Pearson 1997). In the case of Eq. (3), a local force input (think of plucking the string of a guitar) generates both forward and reverse wave components that propagate with amplitude scaling as the square root of v(x). Therefore, the two components proceed towards the ends of the integration domain, where reflection can occur. Dispersive waves, such as earthquake, Schrödinger, and surface waves, albeit governed by different equations, exhibit similar long-range propagation properties for each of their frequency components.
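For orientation, the textbook variable-speed wave equation and its first-order (WKB-type) approximate solutions match the description given here: two counter-propagating solutions whose amplitude scales as the square root of v(x). The form below is a standard one assumed for illustration; the field symbol ψ and the exact notation used in Eq. (3) are assumptions.

```latex
\frac{\partial^{2}\psi}{\partial t^{2}} = v^{2}(x)\,\frac{\partial^{2}\psi}{\partial x^{2}},
\qquad
\psi_{\pm}(x,t) \approx \sqrt{v(x)}\;
\exp\!\left\{ i\omega\left[\, t \mp \int^{x}\frac{dx'}{v(x')} \right] \right\}
```

The upper sign describes the forward wave and the lower sign the backward wave. The approximation is valid when v(x) varies little over one wavelength.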

Our model disclosed a different behavior. Here, the BM oscillation profile elicited by a sinusoidally varying force directly applied to the BM (Fig. 6, solid lines) is very similar to a scaled version of the TW profile generated by a tone that drives the stapes at the same frequency (Fig. 6, dotted lines). In particular, both profiles affect, with appreciable amplitude, the same limited region of the cochlear partition, i.e. a neighborhood of the CF site. The most relevant difference is that the amplitude profile of the TW elicited by the local stimulus presents a more or less pronounced notch near the stimulus site (Fig. 6, top, arrow), while the phase profile (Fig. 6, bottom) presents a distortion basal to the CF site. By analogy with the relationship between phase sign and wave propagation direction in transmission lines, the phase distortion in Figure 6 might be interpreted as a back-traveling wave. However, its effects remain confined to the neighborhood of the CF site, as wave amplitude decreases rapidly toward the base of the cochlea.

Figure 6

Traveling wave elicited by applying a local stimulus to the basilar membrane. Solid line: amplitude and phase of a TW elicited by a pure tone applied at the BM site indicated by the arrow. Note the positive increase of the phase profile on the TW tail and its flattening at the stimulus site. Dotted line: amplitude and phase of a TW elicited by the same pure tone applied at the stapes. Amplitude responses were scaled to similar peak values.

Since the effect of a discontinuity of the cochlear partition parameters is equivalent to a local perturbation of the type described above, the result illustrated in Figure 6 indicates that internal TW reflections could hardly be invoked to explain the generation of OAEs. Instead, according to Eq. (2), OAEs arise from the cumulative hydrodynamic effect of BM residual oscillations.

In the transmission line view, the delays between input and output in the ear canal are interpreted as travel time of back-propagating waves. Instead, the delays observed in our simulations resulted simply from the delayed expression of BM oscillations due to the interplay of BM elasticity and the kinetic energy of the hydrodynamic field.

Hydrodynamics appeared to be responsible for a number of other interesting phenomena, which we discuss below.

Spontaneous BM oscillations

How is it that a local drop in amplification generates spontaneous BM oscillations, which would be expected only from a local amplification excess? As shown in Figure 4c, because of the nature of fluid coupling, a locally decreased BM acceleration (downward arrow labeled a) rebounds laterally as positive hydrodynamic forces (upward arrows labeled f) acting on adjacent BM segments. With CA gain very close to criticality, not only a slight gain increment but also a decrement may unbalance dissipation and injection of power, locally increasing amplification at the discontinuity and engendering spontaneous BM oscillations at that site. Both maximum hearing sensitivity, corresponding to threshold level minima in psychoacoustic measurements, and self-sustained BM oscillations, corresponding to spontaneous OAEs, are then expected to occur at CFs corresponding to local maxima or minima of the first space derivative of an irregular CA gain profile. To further clarify this crucial point, we consider in detail energy dissipation and the interplay between mechanical and hydrodynamic forces in the cochlea.

On undamping

Two main types of viscous drag hinder the motion of the cochlear partition: One opposes BM displacement relative to its resting position (positional viscosity; Fig. 2a, third panel), the other opposes relative displacements of adjacent organ of Corti segments (shearing viscosity; fourth panel). In a cochlea model with zero shearing viscosity, even the slightest overcompensation of positional viscosity would drive the system into instability, priming spontaneous BM oscillations. In our model, compensation of positional viscosity alone was insufficient to achieve large amplification levels because of the residual dissipation caused by shearing viscosity. As fluid coupling forced the BM to oscillate with a negative-definite phase gradient all along its length, thus preventing shearing forces from vanishing locally, the maintenance of subcritical dissipation conditions was consequently favored. In summary, shearing viscosity contributed everywhere to the energy balance of cochlear dynamics, providing distributed sinking for possible excess power locally delivered by the CA. We then conclude that the distributed (nonlocal) balance between energy injected by the OHCs and energy dissipated by viscous losses (Fig. 7) can be kept within stability boundaries even at high amplification levels, up to 60 dB gain, as found experimentally in the active cochlea (Robles and Ruggero 2001; Shera et al. 2002). Note that, because of the nonlocal character of energy balance, the power dissipation profile (Fig. 7, solid line) crosses the zero axis close to the CF site, meaning that energy delivered basal to the peak of the TW (dotted line) is absorbed apical to the peak, i.e., where shearing viscosity is mostly effective.

Figure 7

Source-sink power balance in the active cochlea. Solid line: power dissipation profile of a traveling wave. It is negative in the region where the cell motors deliver their excess power and positive where the shearing viscosity is mostly effective, i.e., on the TW decline, where the wavelength shrinks to zero. The oscillation is stable, however, because the dissipation integrated over the BM length is positive. Dotted line: amplitude of the TW. Note that the zero of the power dissipation profile is very close to the TW peak.

On the mechanism underlying stimulus frequency emissions

When the effect of a local decrease of the OHC feedback force at BM site x_0, in a cochlea model otherwise characterized by a regular CA gain, is treated as a first-order perturbation term, the motion equation is modified as if the BM sensed an additional local force at x_0 of strength proportional to the BM velocity at x_0 (see Appendix). At high amplification, the BM response to a force like this is a sort of phase-distorted TW whose amplitude and phase depend on the velocity at x_0 of the main (unperturbed) TW elicited by stapedial input. Because of phase distortion, the hydrodynamic feedback to the stapes produced by such a perturbation term is small, but non-negligible (about 2 dB). Consequently, vestibular pressure is perturbed by an additional contribution whose phase depends on the position of the main TW with respect to x_0. The interference of this contribution with the main TW ultimately imposes on the ear canal pressure the frequency-dependent amplitude modulation typical of stimulus-frequency OAEs (Fig. 5). The effect is maximum for the largest amplification levels, i.e., for input at the threshold of hearing, and when the peak of the main TW passes across x_0. It is then clear that the modulation is related to the wavelength of the TW around the peak region (peak wavelength).

In our model, the modulation cycle caused by local damage at the 1.2 kHz CF site was about 50 Hz because in the frequency range of 1–2 kHz this corresponds to the TW peak wavelength measured in frequency units. Discrepancies between model results and experimental data showing a modulation cycle of about 100 Hz (Shera and Zweig 1993; Zweig and Shera 1995) are probably attributable to underestimation of the BM shearing viscosity coefficient s(x) (see Appendix), since the peak wavelength increases with s(x). Nonetheless, the qualitative features of this phenomenon are reproduced well in our simulations. If the local CA gain damage is not too small, spontaneous BM oscillations also appear at x_0, resulting in spontaneous OAEs at the CF of the damaged site. Note, however, that all such phenomena are relevant only if the CA gain is larger than ~40–50 dB, as the number of modulations in the interference pattern depends on the number of oscillations enveloped by the TW peak (which increases with increasing amplification level).
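The interference argument can be illustrated with a deliberately minimal toy calculation, which is not the cochlear model itself. A unit-amplitude main component is summed with a small secondary component of relative amplitude r whose phase is assumed to advance linearly by one full turn every `spacing_hz`. The linear phase law, the reference frequency, and all function names are assumptions introduced here for illustration only.

```python
import cmath
import math

def ripple_db(rel_amplitude):
    """Peak-to-peak level ripple (dB) when a component of relative amplitude r
    interferes with a unit-amplitude main component."""
    return 20.0 * math.log10((1.0 + rel_amplitude) / (1.0 - rel_amplitude))

def rel_amplitude_for_ripple(ripple_in_db):
    """Inverse of ripple_db: relative amplitude giving a stated peak-to-peak ripple."""
    ratio = 10.0 ** (ripple_in_db / 20.0)
    return (ratio - 1.0) / (ratio + 1.0)

def level_db(f_hz, rel_amplitude, spacing_hz, f_ref_hz=1200.0):
    """Toy ear-canal level: main term plus a perturbation whose relative phase
    advances by 2*pi every spacing_hz (assumed linear phase law)."""
    phase = 2.0 * math.pi * (f_hz - f_ref_hz) / spacing_hz
    return 20.0 * math.log10(abs(1.0 + rel_amplitude * cmath.exp(1j * phase)))

r = rel_amplitude_for_ripple(2.0)          # relative amplitude for a 2 dB ripple
levels = [level_db(1100.0 + i, r, 50.0) for i in range(201)]   # 1100-1300 Hz, 1 Hz steps
print(round(r, 3), round(max(levels) - min(levels), 2))
```

In this toy picture a 2 dB peak-to-peak ripple corresponds to a secondary contribution of relative amplitude of roughly 0.11, i.e., about 19 dB below the main one, and the ripple period in frequency is set entirely by how fast the relative phase changes with stimulus frequency.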

On the time course of TWs and spindles

The BM response to a tone of given frequency, i.e., a TW, has a characteristic oscillatory waveform related to the cyclic exchange of BM elastic potential energy and kinetic energy of the surrounding fluid. Since this exchange is local (see Fig. 1A in Nobili et al. 1998), no total energy propagation takes place along the BM.

Dissipation phenomena resulting from cochlear partition viscosities determine the time course of the TW at the offset of an eliciting tone. During this decay process, energy exchange continues to take place over the limited BM region where the oscillation amplitude is appreciable. In the case of a highly amplified cochlea near threshold, the spatial extent of this region is extremely limited (Ren 2002).

In our model, the BM response to a click has a spindle waveshape. This depends on the fact that a click can be Fourier synthesized from a continuum of pure tones of suitable phases and amplitudes. Consequently, in the linear approximation, i.e., both in the passive cochlea and in the active cochlea near threshold, the BM response is a superposition of TWs, each one evolving independently of the others. When a click is presented to the stapes, each TW component of the global BM response is elicited with a different delay, proportional to the TW period. Therefore, basal BM regions begin to oscillate earlier than more apical regions, imparting the characteristic spindle waveshape to the BM oscillation pattern and also giving the impression that the forming spindle extends progressively toward the apex of the cochlea. At stimulus offset, in the linear regime, the shape of the spindle is determined by the distribution of decay times of the underlying TW components, which are shorter at higher frequencies. This gives the impression of forward propagation for the extinguishing wave packet; however, no effective energy propagation occurs.

In the nonlinear regime, the time course of the spindle is also influenced by tone-to-tone suppression. This is the main cause of the emergence and persistence of residual BM oscillations, which may yield OAEs under the conditions analyzed in this article. Furthermore, the asymmetry of tone-to-tone suppression accentuates the apparent forward propagation of the spindle, as its lower-frequency components suppress the higher-frequency ones more than vice versa.

CONCLUSIONS

The present findings have far-reaching implications. Analysis of the model’s performance under various conditions indicates that either marked irregularities in the forward transfer function of the middle ear, with a regular CA gain profile, or slight irregularities of the CA gain profile, with regular transfer function, suffice to generate detectable transient evoked OAEs. Very often, in the latter case, spontaneous emissions arise also. Thus, our results suggest that, when found in the absence of spontaneous emissions, transient evoked OAEs are mainly attributable to the characteristics of forward middle-ear filtering. This explanation is in accordance with hypotheses previously advanced on the basis of the similarity between middle-ear transfer function profiles and spectra of transient evoked OAEs (Puria and Rosowski 1996). Curiously, in the same vein, absence of both types of emissions in a perfectly sensitive ear, a puzzling finding for the audiologist, should be explained as the result of having both a smooth middle-ear transfer function and a smooth CA gain profile, i.e., just an ideally performing ear!

The interpretation of OAEs advanced by this model differs substantially from those proposed by several other authors. We indicate here how a simple experiment may help to validate our conclusions. The prediction is that subjects with normal hearing but negligible click-evoked OAEs will produce enhanced emissions when the waveform of the input click is altered so as to simulate the effect of a middle ear with an irregular transfer function.

APPENDIX

The main mathematical features of the model are described here, based on our previously published work (Mammano and Nobili 1993; Nobili and Mammano 1996; Nobili et al. 1998), for the double purpose of introducing our approach to OAEs in a unitary way and of lending mathematical support to the arguments of the Discussion.

The BM motion equation

The BM is represented as a continuous array of adjacent harmonic oscillators affected by (1) positional and shearing viscosity, (2) feedback forces, due to OHC electromotility, of suitable phase and amplitude so as to cancel intrinsic viscous losses (undamping), and (3) hydrodynamic forces depending on BM and stapes accelerations.

Accordingly, the local BM displacements ζ(x, t), with t as time and 0 ≤ x ≤ 1 as BM position normalized to BM length, are governed by a motion equation of the form

where ∂_x is the partial space derivative and overdots are for partial time derivatives, m(x) is mass per unit BM length of the organ of Corti, and h(x) > 0 and s(x) > 0 are its positional and shearing viscosity coefficients, respectively. The profile of h(x), shown in Figure 2a, was selected so that the waveforms of the TWs reported by von Békésy (for the passive human cochlea) were reproduced fairly well when setting the CA gain to zero in the model. The shearing viscosity term in Eq. (A1) is required, for setting s(x) ≡ 0 would generally render the model unstable and the TW profiles totally unrealistic. The profile of s(x) was assumed to scale like the organ of Corti cross-sectional area. Its magnitude was chosen so as to obtain a realistic slope for the TW apical rolloff, but it is probably still underestimated.

The term f_OHC[x, η(x, t)] in Eq. (A1) represents the OHC motor force, which is responsible for undamping the BM motion, as a local function of stereocilia deflection η(x, t). With this expression, the motor force is assumed to be independent of frequency, despite the frequency rolloff of the receptor potential due to the OHC membrane capacitance. This assumption is based on experimental evidence that Deiters’ cells behave like a viscous cushion interposed between the OHCs and the BM (Lagostena et al. 2001). Viscous coupling forces, increasing in proportion to frequency, compensate for capacitive shunting, resulting in a flat motor transfer function over the relevant frequency range.

In the expression

S(…) is a sigmoid function shaped as −f_OHC in Figure 2c and normalized to unit slope and height; a(x) and b(x) are suitable distributed parameters that depend on selected features of the organ of Corti architecture and functionality (RL–BM mechanical gain, stereocilia length, angle formed by the plane of the RL and the BM plane, sensitivity of stereociliary transduction channels, etc.). Values are assigned to the functions a(x) and b(x) in a later subsection.

The term f_H(x, t) in Eq. (1) (see Methods) represents the hydrodynamic force per unit BM length sensed by the BM at site x. The integral expression on the right-hand side of that equation accounts for the effect produced at x by the (upward) BM acceleration

at x′, while the second term accounts for the effect produced at x by the (inward) stapes acceleration a_S(t). Functions G(x, x′) and G_S(x) represent the magnitude of the BM–BM and stapes–BM fluid coupling, respectively. The profiles of the distributed parameters used to model the human cochlea are graphically represented in Figure 2a. Function G_S(x) is plotted in Figure 2c and samples of G(x, x′) for x′ = 0.1, 0.2,…,0.9 are shown in Figure 2d.

Equation (A1) can be economically rearranged in the form

where

is the linear integrodifferential kernel of the BM equation, ∂_t is the time derivative operator, and δ(x − x′) is Dirac’s delta centered at x′. In our computations, all factors multiplying time derivative operators were represented as 500 × 500 nondiagonal matrices generated by sampling each factor over the discrete set of x values forming our computational grid. Note that G(x, x′) is equivalent to a nondiagonal mass of the BM oscillators and, since ∫ G(x, x′)dx ≫ m(x′), it largely dominates the inertial behavior of the system. This is indeed the way in which the fluid mass comes into play in cochlear dynamics. The three main ingredients of the BM motion equation are then displayed in Eq. (A3). The first term depends on the BM displacement ζ and its first two time derivatives, the second one depends on stereocilia deflection η, and the third one depends on stapes acceleration a_S(t).
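As a rough illustration of how a coupling kernel on 0 ≤ x ≤ 1 can be sampled into a nondiagonal matrix, the following Python sketch builds a 500 × 500 grid representation and applies it to an acceleration profile. The kernel shape `toy_kernel`, its length scale, and the mass value are placeholders chosen here for illustration; they are not the G(x, x′) or m(x) of the model.

```python
import math

N = 500                                   # grid size quoted in the text
dx = 1.0 / N
x = [(i + 0.5) * dx for i in range(N)]    # cell-centred samples of 0 <= x <= 1

def toy_kernel(xi, xj, scale=0.1):
    """Placeholder symmetric coupling kernel; NOT the model's G(x, x')."""
    return math.exp(-abs(xi - xj) / scale)

# Sample the kernel on the grid: G[i][j] couples site x_j to site x_i.
G = [[toy_kernel(xi, xj) for xj in x] for xi in x]
m = [1e-3] * N                            # placeholder mass per unit BM length

def inertial_force(accel):
    """Discretized  m(x) a(x) + integral of G(x, x') a(x') dx'  (rectangle rule)."""
    return [m[i] * accel[i] + dx * sum(G[i][j] * accel[j] for j in range(N))
            for i in range(N)]

accel = [math.sin(2.0 * math.pi * xi) for xi in x]   # arbitrary test profile
force = inertial_force(accel)

# "Fluid mass" seen from each site x': integral of G(x, x') over x.
fluid_mass = [dx * sum(G[i][j] for i in range(N)) for j in range(N)]
print(len(G), len(G[0]), all(fm > mi for fm, mi in zip(fluid_mass, m)))
```

With these placeholder values the integrated kernel exceeds the local mass at every site, which mimics, in a purely qualitative way, the statement that the fluid contribution dominates the inertial behavior.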

A motion equation for stereocilia deflection

The details of the organ of Corti micromechanics remain experimentally controversial to date. However, here we present a phenomenological scheme for the mechanical input to the OHC stereocilia that is largely independent of such details.

We assumed that the TM sheared in the radial direction relative to the RL and that the cochlear fluid surrounding the TM was dragged in the same direction by this motion. Consequently, the longitudinal fluid coupling among the oscillating elements of the TM–RL subsystem was neglected. So, as a first approximation, the TM–RL subsystem was modeled as an array of damped harmonic oscillators (mass attached at one end of a spring in parallel with a dashpot) driven by the motion of the other spring end (ultimately the BM). The mass

of each oscillator accounts for the inertial properties of this subsystem formed by the corresponding TM–RL segment together with the dragged fluid mass. The spring stiffness

represents the elastic properties of the structural components involved in this motion (e.g., stereocilia bundles), which deflect elastically under TM–RL shearing. The dashpot damping constant

represents dissipation factors (e.g., the viscous fluid layer in the TM–RL cleft), which damp this motion.

This is a simplification of a more detailed model that also accounts for the shearing viscosity affecting the radial motion of adjacent TM–RL segments and the mechanical effects due to the viscoelastic attachment of the TM to limbus spiralis. For the purposes of our investigation, there are two good reasons for neglecting both details: (1) Simulations carried out with the inclusion of TM–RL shearing viscosity

, which is a fraction of the organ of Corti shearing viscosity s(x), proved that the effect of this term on cochlear dynamics was negligible, probably because of the dominant effect of s(x); therefore, we set

in our numerical simulations. (2) Far enough from the apical region of the cochlea, i.e., safely within the region of interest for otoacoustic emissions, viscoelastic coupling between TM and limbus spiralis is small relative to the viscoelastic coupling between TM and RL.

In harmony with this view and with the “continuous array” representation [Eq. (A1)], the displacement of the unit TM segment relative to RL at the BM site x and time t, i.e., the stereocilia deflection η(x, t), is assumed to depend upon the BM acceleration

through the linear differential equation

where

, and g(x) is a nondimensional gain factor coupling BM and RL motion. We assumed constant g(x) all along the cochlear partition and determined its magnitude by imposing the condition that the amplitude of model click-evoked OAEs correspond to experimental data (see below).

To assign numerical values to functions ω_TM²(x) and γ_TM(x) we assumed distributed parameter values aimed at describing a set of TM–RL subsystem segments where each element resonates weakly at a frequency close to the CF of the underlying BM site for the passive cochlea (in the active cochlea, the CFs at the same site are half an octave higher; Gummer et al. 1996). The quality factor of the resonance was between 1.1 and 1.5, implying that the resonance profile was approximately flat over a relatively wide region around the CF site.
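To get a feel for how weak a resonance with quality factor between 1.1 and 1.5 is, the following Python sketch evaluates the steady-state response of a base-driven damped oscillator. It assumes that the stereocilia equation has the standard form η̈ + γ η̇ + ω₀² η = −g ζ̈ with γ = ω₀/Q; this form, its sign convention, and the function name are assumptions made here and should be checked against Eq. (A4).

```python
import cmath
import math

def tm_rl_response(freq_ratio, quality_factor, gain=1.0):
    """Complex ratio eta/zeta for the assumed oscillator
    eta'' + gamma*eta' + w0^2*eta = -gain*zeta'',
    driven at w = freq_ratio * w0, with gamma = w0 / quality_factor.
    Frequencies are expressed in units of w0."""
    w = freq_ratio
    gamma = 1.0 / quality_factor
    return gain * w**2 / (1.0 - w**2 + 1j * gamma * w)

for q in (1.1, 1.5):
    at_resonance = tm_rl_response(1.0, q)
    half_octave_below = tm_rl_response(2.0 ** -0.5, q)
    half_octave_above = tm_rl_response(2.0 ** 0.5, q)
    # Magnitude and phase at resonance, and magnitudes half an octave either side.
    print(q, abs(at_resonance), math.degrees(cmath.phase(at_resonance)),
          abs(half_octave_below), abs(half_octave_above))
```

Under this assumed form, the response at resonance has magnitude gain × Q and lags the BM displacement by 90°, i.e., the deflection is proportional to BM velocity, which is the property exploited in the undamping argument.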

Stereocilia deflections η(x, t) described by the solutions to Eq. (A4), with the above parameter choice, supplied a feedback force term f_OHC[x, η(x, t)] capable of uniformly and effectively undamping the BM motion. To understand what made this possible, note that at resonance the relationship

holds, and, therefore, in the proximity of resonance, the condition

is met because of the large damping coefficient γ_TM(x). Consequently, Eq. (A4) simplifies to

, which, after time integration, yields

Inserting this into Eq. (A2) and noting that for b(x)η(x, t) ≪ 1 the expression for the undamping term is linearly approximated by

, we obtain

with

. It is then clear that, within the limits of these approximations and with TM–RL subsystems tuned to resonate weakly close to the CFs of all BM sites, f_OHC behaves as a negative viscosity term that tends to cancel the positional viscosity term

and also compensates for the shearing viscosity term

, provided that the CA gain is a smooth function of position. Actually, the condition c(x) > h(x) had to be fulfilled all along the BM length if the shearing viscosity was to be sufficiently compensated to produce cochlear gains approaching that of the active cochlea. This compensation was most effective at low SPL. As the input level increased, the phase angle between the feedback force f_OHC and the BM displacement ζ decreased from 90° to 45° for a resonance quality factor around 1, implying that the OHCs also exert an influence on BM tuning. However, although tone-to-tone suppression is undoubtedly affected by this dephasing, we did not quantify the extent of this effect on our simulations.
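The chain of approximations described in this subsection can be summarized in the following sketch. It assumes that Eq. (A4) has the standard form of a base-driven damped oscillator and that Eq. (A2) reads f_OHC = −a(x) S[b(x)η]. Both forms, and in particular the sign conventions, are assumptions made here for illustration and should be checked against the original equations.

```latex
\begin{aligned}
&\ddot{\eta} + \gamma_{TM}(x)\,\dot{\eta} + \omega_{TM}^{2}(x)\,\eta = -g(x)\,\ddot{\zeta}
  && \text{(assumed form of Eq. A4)} \\
&\left|\ddot{\eta} + \omega_{TM}^{2}\,\eta\right| \ll \gamma_{TM}\,|\dot{\eta}|
  && \text{(near resonance, large damping)} \\
&\gamma_{TM}\,\dot{\eta} \approx -g\,\ddot{\zeta}
  \;\Longrightarrow\;
  \eta \approx -\frac{g}{\gamma_{TM}}\,\dot{\zeta} \\
&f_{OHC} \approx -a\,b\,\eta = c(x)\,\dot{\zeta},
  \qquad c(x) = \frac{a(x)\,b(x)\,g(x)}{\gamma_{TM}(x)}
  && \text{(for } b\,\eta \ll 1\text{)}
\end{aligned}
```

Under these assumptions the feedback force is proportional to BM velocity, so it opposes the positional viscosity term h(x)ζ̇ in the motion equation, in line with the condition c(x) > h(x) and with the 90° phase angle between f_OHC and ζ at low input levels.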

To adjust the model gain with sufficient accuracy, we set c(x) = λ(x)h(x), where λ(x) is a distributed control parameter. Calling the over-undamping coefficient

, the factor λ(x) − 1 represents the proportion by which the positional viscosity has to be over-undamped in order to undamp shearing viscosity to the desired extent. The CA gain profile reported in Figure 4a, which was obtained by the recursive procedure described in the next subsection, required 0.034 < λ(x) < 1.36.

Determination of the cochlear amplifier saturation properties and gain profile

Numerical values for the function b(x) were established heuristically by imposing the condition that the input–output curve plateau for the human cochlea at 2 kHz CF be centered at 50 dB (see Fig. 4f). Assuming uniform CA saturation properties all along the cochlear partition, we set b(x) ≡ 1/15 nm⁻¹, g(x) ≡ 10, and consequently

. This choice fixed the scale for stereocilia deflection at CA saturation onset to be equal to 15 nm. With these parameter values, the TM–RL shearing displacement at the 2 kHz CF site is 20 nm for an input stapes acceleration |a_S| = 2 m/s², estimated to correspond to about 80 dB SPL in the ear canal. At the same site and input level, the BM displacement is ~0.32 µm. The large gain value implies that the BM vibrates with substantially less amplitude than the RL, as previously reported (Mammano and Ashmore 1993; Scherer et al. 2003).

Following the proposal by Robles and Ruggero (2001), the CA gain function is defined as the difference between the peaks in the sensitivity functions for low- and high-intensity tones or, equivalently, between in vivo and postmortem responses. In the chinchilla, the CA gain is in the range of 35–58 dB at the 9–10 kHz CF sites. In the guinea pig, the CA gain is about 35 dB in the 17–18 kHz region. A rather different scenario is found at the apex of the cochlea. Our knowledge of the mechanics of this region is still poor, as experimental data are affected by damage induced by the experimenter. At present we can conclude only that the responses at the apex of the cochlea differ from those at the base. The estimated value of the apical CA gain falls in a wide range, from 25 dB down to negative values. An active attenuation has also been proposed (Khanna and Hao 1999; Zinn et al. 2000). As it is impossible to perform BM vibration measurements in vivo in the middle part of the cochlea, experimental data from this region are lacking altogether.

This state of affairs imposed several restrictions upon us, making it impossible to use direct measurements to estimate the distribution of the CA gain along the BM in the mammalian cochlea. This was especially restrictive as the goal of our model was to mimic the behavior of the human cochlea. We then decided to use indirect experimental evidence for the CA gain which came from psychophysical tuning curves (Carney and Nelson 1983). In order to obtain an estimate for the spatial distribution of the CA gain, we resorted to comparing psychophysical tuning curves from normal and hearing-impaired subjects.

The distribution of the CA gain obtained empirically as described above could not be inserted analytically into the model. Instead, we created a recursive procedure aimed at generating the target amplification levels over the entire BM length, namely a suitable set of values for the distributed control parameter λ(x).

In the zero-order step, we detected the peak height of the linear passive model response to a constant amplitude tone whose frequency was slowly varied across the range of 16 kHz–100 Hz (thus sweeping the BM from base to apex). This set of values, hereafter defined as M_0(x), was stored into memory. The array M_0(x) was then multiplied by the inferred array of CA gain values (Fig. 4a), producing the set of target values M_T(x). Then we recursively applied the same input to the active model using a sufficiently small value of the signal amplitude that would maintain the response within the undamped regime. Starting from the uniform distribution λ(x) ≡ 1, we slightly modified λ(x), producing a new array of maxima M_k(x) at each iteration. The changes in λ(x) were proportional to the difference

between the target set and the responses generated by the previous step. After about 50 cycles, convergence was reached under the condition of 0.1% tolerance, yielding the final set of λ(x) that was used in all numerical computations of our model. The fine adjustment of λ(x) obtained by this recursive procedure yielded a 53 dB maximum gain before the onset of numerical instabilities. Higher amplification levels, however, could have been achieved (at the expense of computation time) by increasing the number of BM oscillators in the model.
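The structure of this recursive procedure can be sketched in a few lines of Python. The sketch is only schematic: the function `toy_peak_response` is a hypothetical, monotonic stand-in for running the active cochlear model, and the update uses the relative difference between target and response with an arbitrary step size; none of these choices is taken from the original implementation.

```python
import math

def toy_peak_response(lam, passive_peak, sensitivity=20.0):
    """Hypothetical stand-in for the active model: the peak BM response grows
    monotonically with the control parameter (lam = 1 gives the passive peak)."""
    return passive_peak * math.exp(sensitivity * (lam - 1.0))

def fit_lambda(passive_peaks, gains_db, step=0.02, tol=1e-3, max_iter=500):
    """Iteratively adjust lambda at each site until the response peaks M_k
    match the targets M_T = M_0 * gain within the relative tolerance tol."""
    targets = [m0 * 10.0 ** (g / 20.0) for m0, g in zip(passive_peaks, gains_db)]
    lam = [1.0] * len(targets)                      # uniform starting distribution
    for iteration in range(1, max_iter + 1):
        peaks = [toy_peak_response(l, m0) for l, m0 in zip(lam, passive_peaks)]
        errors = [(mt - mk) / mt for mt, mk in zip(targets, peaks)]
        if max(abs(e) for e in errors) < tol:       # 0.1% tolerance by default
            return lam, iteration
        # Correction proportional to the (relative) target-response difference.
        lam = [l + step * e for l, e in zip(lam, errors)]
    raise RuntimeError("no convergence within max_iter iterations")

lam, n_iter = fit_lambda([1.0, 2.0, 0.5], [20.0, 50.0, 35.0])
print(n_iter, [round(l, 4) for l in lam])
```

In the real procedure each iteration requires a full frequency sweep of the time-domain model, so the cost per step is far higher than in this toy, and the step size has to be small enough to keep the response within the stable, undamped regime.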

On the mechanisms underlying emission periodicities near hearing threshold

In the framework of our model, neither partial reflection from the stapes nor coherent reflections from putative periodicity in the organ of Corti roughness are responsible for the observed periodicity in the evoked emissions (Zweig and Shera 1995). Such periodicity is simply governed by the phase difference between the main TW and the secondary (perturbative) TW activated by the main TW at a given site of CA gain irregularity. A mathematical explanation of this phenomenon can be derived by studying the effect of amplification changes on BM responses, as detailed hereafter.

In the active cochlea, the amplitude profile of a TW elicited by an input tone of intensity approaching hearing threshold is sharp enough to be covered broadly by the resonance profile of the TM–RL subsystem. Consequently, Eq. (A6) holds as an excellent approximation. The TM–RL subsystem oscillates in the linear range of the sigmoid profile that defines f_OHC (Fig. 1c), thus f_OHC can be safely replaced by its linear approximation [Eq. (A5)], which leads to the simplified linear equation

Now assume that, for input of constant amplitude close to hearing threshold, the fine regulation of λ(x) = c(x)/h(x) guarantees a regular, i.e., smooth, CA gain profile. We imagined that the active cochlea is endowed with feedback controls capable of approaching these conditions.

Now we analyze what happens when c(x) departs slightly from the above ideal conditions. Let us consider the difference

, for the same input stimulus, between the BM response ζ(x, t) obtained with unperturbed coefficient c(x) and the response ζ′(x, t) obtained with a perturbed coefficient c′(x), which differs from c(x) by a slight local decrease, say

. Here ε is a positive quantity to be treated as a small number and δ(x − x_0) is Dirac’s delta centered at a putative damage site x_0. Subtraction of the unvaried linearized Eq. (A7) from the varied one yields, in the first-order approximation in ε, the equation

Comparing this with Eq. (A7), we see that the variation in the BM response caused by the localized damage is equivalent to the BM response to a stimulus of strength

directly applied to the damage site and proportional to the BM velocity. Note that in the region of maximum gain (47–53 dB) even slight damage corresponding to ~1 dB CA gain loss suffices to elicit a disturbance of amplitude close to the saturation level of the CA (~35 dB). As shown in Figure 6, for a pure tone input and in highly amplified regions Δζ is a TW of appreciable amplitude, peaking at the very same CF site of a TW normally elicited by stapes input but bearing a positive phase distortion on the tailward side of its peak. As the phase of Δζ relative to that of ζ depends on the phase of

at x_0, which in turn depends on the frequency of ζ, the amplitude of the perturbed response ζ′(x, t) comes out larger or smaller than that of ζ(x, t), depending on the input frequency. The amplitude of the OAEs associated with Δζ is correspondingly modulated. The effect tends to disappear for TW amplitudes approaching the CA saturation level, because compressive nonlinearity works as an equalizer of BM responses. This affords an explanation of both amplitude modulation and input-level dependence of evoked OAEs for near-threshold input (Fig. 5).

Stability conditions at threshold

Since each term in Eq. (A7) represents a local force, the total power dissipation balance is obtained by multiplying both sides of the equation by the BM velocity ζ̇(x, t) and integrating over x from 0 to 1. Time-averaging of the result, followed by elimination of conservative terms that represent reversible storage of mechanical and hydrodynamic energy, yields

where <…> means time average. In performing this computation, we exploited the symmetry of G(x, x′) and assumed that

is negligible at the BM ends. In the integrand on the left-hand side, the quantity

represents the mean shearing viscosity dissipation, and

represents the mean excess power locally delivered by the OHCs.

The zero-input stability condition at threshold is expressed by the inequality

Clearly, stability is guaranteed if the sum of the two terms in the integrand is positive. However, as the hydrodynamic field provides instantaneous energy transfer among distal BM sites, the balance is not local and the system can be stable even if the integrand is not everywhere positive. Indeed, our simulations showed that, for a stable TW elicited by a pure tone, the integrand changes sign across the TW peak (Fig. 7). Excess power is delivered mostly where the TW amplitude tends to rise and is dissipated where the TW wavelength tends to shrink, i.e., on the apical side of the TW peak.

The content of Ineq. (A8) can be better analyzed in the complex domain for periodic solutions of the form

, where A(x) is real and positive and φ(x) is a real function of x, respectively representing the amplitude and phase of a TW elicited by a pure tone, and ω is the angular frequency of the oscillation. Inequality (A8), when divided by ω², takes the form

Note that at the TW peak, i.e., where ∂_x A = 0 and A² is very large, the dissipation rate is extremely sensitive to the phase gradient ∂_x φ. This means that in critical conditions, i.e., when the undamping level is close to the threshold of spontaneous oscillations, even a slight decrease of the phase slope in the region of a TW peak can bring the system to instability. As discussed above, and shown in Figure 6, the main effect of local damage to the CA is precisely a phase-slope decrease of that sort. We ascribe to this effect the damage-induced instabilities discussed in the text.
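The sensitivity to the phase gradient can be made explicit with a short worked derivation. It assumes a steady-state solution whose real part is ζ = A(x) cos[ωt − φ(x)] and a stability condition built from the two integrand terms named above, the mean shearing dissipation and the mean excess power delivered by the OHCs. The exact form and sign conventions of the original inequality are assumptions here.

```latex
\begin{aligned}
\langle \dot{\zeta}^{2} \rangle &= \tfrac{1}{2}\,\omega^{2} A^{2}, \qquad
\langle (\partial_x \dot{\zeta})^{2} \rangle
  = \tfrac{1}{2}\,\omega^{2}\left[ (\partial_x A)^{2} + A^{2} (\partial_x \varphi)^{2} \right], \\[4pt]
\int_{0}^{1} \Big\{ s(x)\,\langle (\partial_x \dot{\zeta})^{2} \rangle
  &- \left[ c(x) - h(x) \right] \langle \dot{\zeta}^{2} \rangle \Big\}\, dx > 0 \\[4pt]
\Longrightarrow\quad
\int_{0}^{1} \Big\{ s(x)\left[ (\partial_x A)^{2} + A^{2} (\partial_x \varphi)^{2} \right]
  &- \left[ c(x) - h(x) \right] A^{2} \Big\}\, dx > 0 .
\end{aligned}
```

The common factor ω²/2 has been dropped in the last line. At the TW peak the term (∂_x A)² vanishes, so the only positive contribution to the integrand is s(x) A² (∂_x φ)², which is why a reduced phase slope there directly erodes the stability margin.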