Introduction

Simultaneous dichotic loudness balance, commonly abbreviated SDLB, was first popularized by Hood (1950). A more succinct description of the method, and of the interpretation of its outcomes, appeared in Egan (1955b), amongst other sources. Egan (1955b, p. 111) declared that SDLB measures “the decrease in the loudness of a steady acoustic stimulus during its presentation” (original italics). This decrease has variously been called “perstimulatory fatigue” and/or “loudness adaptation.” It was assumed to occur in any single ear (i.e., monaurally). The particulars of the method, and the resulting data, will be described below.

Despite its long history, SDLB remains relevant. For example, it is still utilized (e.g., D’Alessandro, 2008), and it continues to intrigue theorists (e.g., Nizami, 2017). Indeed, the original interpretation of SDLB data (Egan, 1955b; Hood, 1950), namely that loudness adapts monaurally, is still taken as fact by many authors (e.g., Arieh & Marks, 2011; D’Alessandro & Norwich, 2009; Gygi & Shafiro, 2011; Jones, Weiler, Warm, Dember, & Sandman, 2003; Ortiz de Gortari & Griffiths, 2014; Pike, 2015; Tang, Liu, & Zeng, 2006). There has, however, been a slow but steady accumulation of reservations about the original interpretation (amongst others, Bocca & Pestalozza, 1959; Dirks, Morgan, & Bray, 1974; Mirabella, Taub, & Teichner, 1967; Thwing, 1956; and the work of Scharf and colleagues, reviewed in Meunier, 2013). All of this will be described below.

Evidently, a fresh, up-to-date interpretation of SDLB is due. The present article offers it. Some concepts within the present article have been described elsewhere (Nizami, 2012, 2013).

In the following work, “fatigue” will be presented in quotation marks, to remind the reader that it is only a proselytized inference, not a confirmed fact. Indeed, it will be argued that “fatigue” does not happen when only a listener’s single ear experiences acoustic stimuli (commonly called “sound”); two ears are required, whose responses to stimuli hypothetically interact in a manner described below. Loudness itself is assumed to be a “central” experience, occurring at the cerebral cortex, not a peripheral phenomenon occurring at the ears. As such, each ear makes a contribution to loudness, not a loudness contribution. This is an important distinction.

Simultaneous dichotic loudness balance (SDLB)

To introduce SDLB, note well its motivation (Small, 1963, p. 289):

If a pure tone is presented to a listener continuously and at the end of five minutes he is asked if the stimulus sounds differently than it did in the beginning, his usual response is “no, it sounds the same.” The perceived loudness of the stimulus remains very nearly unchanged. It is as though the listener had neither an internal loudness standard nor an effective memory and thus is able to compare the loudness in a particular segment of time only with the loudness of the stimulus in the immediately preceding segment—an imperceptible change. The key to the perception and measurement of a loudness decrement under these circumstances seems to be the availability of a comparison stimulus.

A comparison stimulus is precisely what SDLB supplies. Egan (1955b, p. 111) explained as follows, while introducing some jargon that has continued to be used in the field:

A fatiguing stimulus having constant spectral characteristics is presented to one ear. A comparison stimulus whose intensity the listener can control is simultaneously presented to the other ear. During the simultaneous dichotic stimulation the listener adjusts the intensity of the comparison stimulus until it appears as loud as the fixed, fatiguing stimulus. After this loudness balance the comparison stimulus is turned off, but the fatiguing stimulus continues to sound. Later the comparison stimulus is again briefly presented for a loudness balance with the fatiguing stimulus. In this way the temporal course of the decline in loudness of the fatiguing stimulus may be obtained.

Note that the interval during which “the listener adjusts the intensity of the comparison stimulus until it appears as loud as the fixed, fatiguing stimulus” will henceforth be called the “adjustment session,” no matter how brief it may be.

Figure 1 shows the experiment from the perspective of the actual laboratory listener. Unfortunately, the meanings of “fatiguing ear” and “comparison ear” have sometimes been reversed in the literature. Furthermore, the term “test ear” has been applied to either the “fatiguing ear” or the “comparison ear”. Here, therefore, in an attempt at clarity, the ear receiving the “fatiguing” stimulus will be called “ipsilateral”, and the ear receiving the comparison stimulus will be called “contralateral”. The terms “fatiguing” and “comparison” will be used only if necessary.

Fig. 1

The research listener’s role in the SDLB experiment. “Stickman” sits comfortably and alone within a soundproof chamber. Leads (dashed lines) from outside the chamber extend to the headphones. The lead to Stickman’s right-ear headphone passes through an attenuator that Stickman can adjust by hand. The right-ear headphone is therefore the “contralateral” headphone here. (Of course, which of left or right is contralateral or ipsilateral can be changed by simply turning the headphones around.) Within an adjustment session of 5–20 s, Stickman adjusts the attenuator so that the contralateral (“comparison”) stimulus at his right-ear headphone produces the same contribution to overall loudness as the ipsilateral (“fatiguing”) stimulus at his left-ear headphone. The final attenuator setting from each such match is recorded by the experimenter. From “Per-Stimulatory Fattigue,”[sic] by J. D. Hood, 1950, Acta Oto-Laryngologica, Vol. 38, Supplement 92, pages 26–56. Copyright 1950 by Informa UK Limited. Adapted with permission

A single experimental “run” in classic SDLB was described by Egan (1955b, p. 112) as follows:

The temporal sequence of the stimuli in measuring perstimulatory fatigue was as follows. The fatiguing and the comparison stimuli were presented together for 20 seconds, during which time the listener adjusted the intensity of the comparison stimulus for a loudness balance. Both stimuli were then turned off and the listener called out his [attenuator] setting [see Fig. 1 ]. Forty seconds later both stimuli were presented again for another loudness balance. After cycle [sic] was repeated several times, the fatiguing stimulus was left on. During this fatiguing period, the comparison stimulus was presented every minute for 20 seconds beginning on the minute. The recovery from perstimulatory fatigue was traced by turning off both the fatiguing and comparison stimuli for 40 seconds and then presenting both stimuli for another loudness match.

The comparison periods that respectively precede and follow the “perstimulatory” period were deemed the “prestimulatory” and “poststimulatory” periods. Figure 2 shows all three periods. The per- or poststimulatory “fatigue” at any given loudness match is the average prestimulatory comparison-stimulus level, in decibels sound-pressure-level (dB SPL), minus the comparison-stimulus level (dB SPL) that the listener set at that match.
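The subtraction just described can be illustrated with a minimal Python sketch. The function name `fatigue_db` and all of the decibel values are hypothetical, chosen only for illustration; they are not data from any of the cited studies.

```python
from statistics import mean


def fatigue_db(prestim_levels_db_spl, match_level_db_spl):
    """Return the 'fatigue' in dB for one loudness match.

    'Fatigue' = (mean prestimulatory comparison level) - (comparison level
    set at the later per- or poststimulatory match), both in dB SPL.
    """
    baseline = mean(prestim_levels_db_spl)
    return baseline - match_level_db_spl


# Hypothetical comparison-stimulus settings (dB SPL), for illustration only.
prestim_matches = [80.0, 79.0, 81.0]   # matches made before the perstimulatory period
perstim_matches = [70.0, 65.0, 62.0]   # matches made during the perstimulatory period

fatigue = [fatigue_db(prestim_matches, m) for m in perstim_matches]
print(fatigue)  # [10.0, 15.0, 18.0]
```

A positive value means that the listener set the comparison stimulus below the prestimulatory average, which is the direction conventionally read as “fatigue.”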

Fig. 2

Stimulus schedule during a typical SDLB run (see the text). Note that, for simplicity’s sake, “The on-off markers do not show the variations in intensity of the comparison [i.e., contralateral] stimulus during a loudness balance” (Egan, 1955b, Fig. 1). From “Perstimulatory Fatigue as Measured by Heterophonic Loudness Balances,” by J. P. Egan, 1955, Journal of the Acoustical Society of America, Vol. 27, pages 111–120, Fig. 1. Copyright 1955 by AIP Publishing LLC. Adapted with permission

Small and Minifie (1961, p. 1028) noted that “Unfortunately, it takes an appreciable interval to obtain a loudness balance”, and, indeed, Egan’s (1955b) listeners used all of their allotted 20 s. Also, a listener’s attenuator was always set to its maximum (i.e., minimum loudness) between loudness matches; furthermore, an arbitrary amount of further attenuation (plus or minus), unknown to the listener, could be introduced by another attenuator, one manipulated by the experimenter. Hence, “As a consequence of the attenuation introduced into the [attenuator] pads of the experimenter and observer, on any given [loudness] balance the intensity [sic] of the comparison stimulus at its onset was either completely inaudible or relatively weak” (Thwing, 1955). Therefore, each research listener was obliged to begin each adjustment session by raising the intensity of the comparison stimulus. This seems quite innocent. It will, however, prove crucial to understanding the empirical results for SDLB.

Not all experiments on SDLB have followed the above description. Some experimenters believed that contralateral stimuli “self-fatigue” the contralateral ear. The presumed confound was allegedly avoided by making the “comparison” tones as brief as possible. Such studies could be called “nonclassical,” and include those of Stokinger and Studebaker (1968), Petty, Fraser, and Elliott (1970, Fig. 4), Stokinger, Cooper, and Meissner (1972), Stokinger, Cooper, Meissner, and Jones (1972), Bray, Dirks, and Morgan (1973), and Dirks, Morgan, and Bray (1974), whose comparison-stimulus durations were 1 s, 1 s, 0.2 s (or 1 s or 2 s), 0.2 s, 3 s, and 0.3 s, respectively. Of course, such experiments do not allow time for smooth attenuation of the contralateral intensity by the listener, although Bray et al. (1973) nonetheless made listeners perform traditional attenuator adjustments on their 3-s tones. In the other articles just cited, the comparison stimulus was kept at a fixed intensity on any single presentation, and the listener signaled whether or not it was louder than the “fatiguing” stimulus. Over a series of such judgments, the experimenter adjusted the comparison stimulus’s intensity so that it crossed back and forth over the intensity that putatively provided equal loudness. Each such determination would typically be followed by a rest period for the listener and for the experimenter, while the “fatiguing” stimulus continued to play to the listener.
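The experimenter-controlled bracketing just described can be sketched in Python as a simple up-down track. This is only an idealized illustration: the step size, the stopping rule, the averaging of reversal points, and the noise-free simulated listener are all assumptions made here, not details reported in the cited studies.

```python
def bracket_equal_loudness(is_louder, start_db, step_db=2.0,
                           n_reversals=6, max_trials=200):
    """Estimate the comparison level (dB) judged equal in loudness.

    is_louder(level_db) -> True if the listener reports the comparison
    stimulus as louder than the "fatiguing" stimulus on that presentation.
    The experimenter lowers the level after "louder" and raises it
    otherwise; levels at which the direction flips are the reversals.
    """
    level = start_db
    last_direction = None
    reversals = []
    for _ in range(max_trials):
        direction = -1 if is_louder(level) else +1
        if last_direction is not None and direction != last_direction:
            reversals.append(level)
            if len(reversals) == n_reversals:
                # Assumed rule: average the reversal levels.
                return sum(reversals) / n_reversals
        last_direction = direction
        level += direction * step_db
    raise RuntimeError("too few reversals within max_trials")


# Hypothetical, noise-free listener whose equal-loudness point is 63 dB SPL.
def simulated_listener(level_db, equal_loudness_db=63.0):
    return level_db > equal_loudness_db


estimate = bracket_equal_loudness(simulated_listener, start_db=70.0)
print(estimate)  # 63.0
```

A real listener’s judgments would be variable from trial to trial, so the reversal levels would scatter around the equal-loudness point rather than alternating between two fixed values.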

Such “nonclassical” experiments can be informative, as will be explained later in this article. Regardless, there should be some minimum adjustment-session duration, below which adjustments are too difficult to perform. By the same token, there should be some upper adjustment-session duration beyond which the listeners are sufficiently unhurried that yet-longer adjustment sessions do not change the measured “fatigue.” Both of these notions are confirmed in the literature. The listeners of Bray et al. (1973, p. 1546) found it “difficult” to attenuate 3-s tones. Hence, 5 s will be taken here as the probable lower limit for effective stimulus attenuation by listeners. At the other extreme, real differences in “fatigue” may occur between adjustment sessions using tones of 10 s and 20 s, but perhaps not between adjustment sessions using tones longer than 20 s (Small & Minifie, 1961, Fig. 3).

“Fatigue” with two ears (with notable exceptions), but rarely with one

Typical SDLB results: “Fatigue” with two ears

Figure 3 shows typical results of a classic SDLB experiment, in which the experimenter holds constant the ipsilateral stimulus intensity. As the ipsilateral stimulus continues during the perstimulatory period, the amount of “fatigue” increases, but with a declining rate-of-change, such that the amount of “fatigue” eventually asymptotes over the perstimulatory time (e.g., Egan, 1955a, 1955b; Egan & Thwing, 1955; Fraser, Petty, & Elliott, 1970; Sergeant & Harris, 1963; Small & Minifie, 1961; Stokinger, Cooper, & Meissner, 1972; Teichner & Sadler, 1966; Thwing, 1955, 1956). Reaching the asymptote requires at least 5 min, and perhaps more than 10 min for stimuli ≥ 80 dB SPL; the requisite time increases with the ipsilateral stimulus intensity (Carterette, 1955, 1956; Hood, 1950; Jerger, 1957; Petty et al., 1970). However, the greatest development of “fatigue” consistently occurs within the first 1–2 min.

Fig. 3

Results of a typical SDLB run (as schematized in Fig. 2). The contralateral stimulus intensity is represented by the vertical scale. The constant ipsilateral stimulus intensity is represented by the horizontal line. The listener adjusts the contralateral stimulus intensity (after Fig. 1) to produce a contribution to loudness perceived to be equal to that of the constant-intensity ipsilateral stimulus. Deviations below the horizontal line represent “fatigue.” This “fatigue” can vary from 0 to 50 dB, depending upon the particulars of the experiment (see the text). Therefore, the vertical axis here has no annotations. During the prestimulatory period, the listener should (in principle) set the contralateral stimulus intensity equal to the ipsilateral stimulus intensity (see the text); hence, deviations from the horizontal line during this period represent the minimum variability in the listener’s intensity settings. From “Perstimulatory Fatigue as Measured by Heterophonic Loudness Balances,” by J. P. Egan, 1955, Journal of the Acoustical Society of America, Vol. 27, pages 111–120, Fig. 2. Copyright 1955 by AIP Publishing LLC. Adapted with permission

Figures 2 and 3 are archetypal, but SDLB measurements can be conducted in a variety of ways. Some of that variety is reflected in Table 1, which supplies the procedural details for some representative SDLB experiments, namely those that followed the typical setup, in which the ipsilateral and contralateral stimuli have the same frequency composition.

Table 1 Methods used in SDLB experiments for which the ipsilateral and contralateral stimuli had the same waveform frequency composition

Nontypical SDLB results, as well as results of monaural stimulation

Does monaural loudness actually “fatigue”?

SDLB experiments have historically been interpreted as showing that the contribution-to-loudness that arises from a single ear “fatigues” (adapts) over the course of a sustained stimulus in that ear. But accumulating contemporaneous evidence has suggested that the loudness of a monaural stimulus does not “fatigue” unless that stimulus is presented within 20 dB of its absolute detection threshold (20 dB SL: i.e., decibels sensation level). Other evidence has suggested that, in special circumstances, “fatigue” need not occur even in SDLB. This evidence has not all been summarized in one place in the literature, and, as such, it is briefly reviewed here.

Consider first the detection threshold of a simultaneously-masked pure tone when the masker and tone are presented to the same ear. As Bocca and Pestalozza (1959, p. 350) explained for the case of a white-noise masker and a 50 dB SPL pure tone (of unmentioned frequency), “The intensity level of the white noise [which is] sufficient to make the pure tone just detectable, should be as many db [sic] less than the initial level as the value of the loudness loss for the pure tone in perstimulatory adaptation.” Translation: “fatigue” should cause a decrease in the masker intensity needed in the one exposed ear to just-mask the tone. But Bocca and Pestalozza did not find the expected decrease. Nonetheless, they noted (p. 350) that “In the same normal listeners the binaural test [i.e., SDLB] demonstrated a remarkable loudness loss [i.e., ‘fatigue’].” That is, their observed lack of monaural “fatigue” did not preclude the “fatigue” found in SDLB.

The Bocca and Pestalozza (1959) article was exceedingly brief; we can only assume that, in their SDLB procedure, the contralateral stimulus that listeners were required to equalize in loudness to an ipsilateral ongoing white noise had the same waveform frequency as the tone used in their simultaneous monaural masking task. (The cited supporting articles are in Italian.) The situation was unambiguous, however, in an earlier article from another laboratory. Egan (1955a) used a wideband noise of 90 dB SPL, for both monaural masking and SDLB (similar to what Bocca & Pestalozza, 1959, did later). Egan’s (1955a) monaurally masked tone was of 1 kHz, presented in repeating bursts of 0.7 s on, 0.3 s off. In the SDLB task, by contrast, the noise was loudness-matched by noise of the same bandwidth. Altogether, Egan (1955a) obtained results similar to those of Bocca and Pestalozza.

However, there was an apparent contradiction in Egan (1955a), which is why Bocca and Pestalozza (1959) are mentioned first. Another investigator who needs to be mentioned in this regard is Thwing (1956), who, using the same bandwidth of noise and the same frequency of tone as Egan (1955a), found a 6-dB drop in the needed masking-noise intensity for several intensities of the “fatiguing” tone, and an increasing drop of 6–9.33 dB as the tone frequency (at 65 dB SPL) was changed from 0.5 to 1 to 3 kHz. But all these differences fall within the possible realm of error in “fatigue” experiments (see Discussion). Furthermore, Thwing (1956) found that, for a 1-kHz tone at 60 dB SPL masked by a 0.7-kHz tone, experienced listeners showed no change in their setting of the masker intensity. To Thwing (1956, p. 609), this implied that “when the masking stimulus is a pure tone, adaptation virtually disappears with practice.”

Monaural loudness “fatigue” is negligible: Further evidence

Yet another line of research suggests that monaural adaptation is negligible. Harris and Pikler (1960) showed that practiced listeners using a turning attenuator-dial could accurately compensate for experimenter-imposed stimulus intensity changes, in order to maintain constant monaural loudness of a 1-kHz tone after 3 min of exposure to that tone. This discovery proved important for later studies of monaural loudness, as follows. Mirabella et al. (1967) cited Harris and Pikler, and noted that if hearing adapted to monaural stimuli, the listener (if allowed) would increase the stimulus intensity to compensate. And, for 3.5-kHz tones and broadband noises, listeners actually compensated by raising the intensity of a 70 dB SPL stimulus by as much as 5 dB. However, they lowered the intensity of a 90-dB SPL stimulus by the same amount. Mirabella et al. repeated the experiment, using 1-kHz tones, and found a compensatory increase of as much as 10 dB for stimuli of 40 dB SPL, but a compensatory decrease of 5 dB for 80 dB SPL stimuli, and a drop and subsequent rise for 70 dB SPL stimuli.

Mirabella et al. (1967) were (not surprisingly) suspicious of these inconsistencies. Hence, they introduced systematic fluctuations into the intensity of a 40 dB SPL, 1-kHz tone. Listeners who were required to track the intensity of these tones showed far less compensation than listeners required to track a constant-intensity tone. Mirabella et al. concluded that programmed intensity fluctuations maintain the listener’s attention. Thus, auditory “adaptation” to a steady, unwavering tone represents attentional drift, not actual sensory drift. This result is very important, because it implies that monaural loudness itself does not vary.

Wiley, Small, and Lilly (1973) performed similar studies, allowing listeners to adjust the stimulus intensity required to track changes in monaural tones of 0.5 or 4 kHz at 10, 20, 40, or 60 dB SL. Wiley et al. found no net change in loudness over the course of 30 s. Naturally, as Wiley et al. noted, such experiments depend upon the listeners having a memory for loudness, and indeed, such memory would have to be accurate and persistent.

Negligible monaural loudness “fatigue”: A third line of proof

A third line of work suggests that the loudness evoked by a steady monaural stimulus does not decline over time. These experiments involve the concept of the “duty cycle,” which must now be defined, not least because it frequently recurs in the literature. Indeed, it will inevitably prove to be an important factor in the new model to be introduced later.

The duty cycle of a continuously repeating stimulus is the stimulus duration divided by the period of the on–off cycle, that is, by the sum of the stimulus duration and the subsequent interval during which the stimulus is turned off before the cycle repeats. For example, if a stimulus has a repeated cycle in which it is on for 100 ms and off for 300 ms, its duty cycle is 25% (i.e., 100/400). Likewise, if a stimulus is on for 3 s and off for 2 s, its duty cycle is 60% (i.e., 3/5). The literature shows two kinds of duty cycle, namely the “overall duty cycle” and the “fine duty cycle,” which are illustrated in Fig. 4 and will be explained and dealt with below.
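The definition of the duty cycle can be restated as a short Python sketch, which reproduces the two worked examples from the text. The function name `duty_cycle` is hypothetical; the only requirement is that the on and off durations share the same unit.

```python
def duty_cycle(on_time, off_time):
    """Return the duty cycle as a fraction: on / (on + off).

    on_time and off_time must be in the same unit (e.g., both ms or both s).
    """
    if on_time < 0 or off_time < 0 or on_time + off_time == 0:
        raise ValueError("durations must be non-negative and not both zero")
    return on_time / (on_time + off_time)


print(duty_cycle(100, 300))  # 0.25, i.e., 25% (100 ms on, 300 ms off)
print(duty_cycle(3, 2))      # 0.6, i.e., 60% (3 s on, 2 s off)
```

A stimulus that is never turned off has an off time of zero, and hence a duty cycle of 100%.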

Fig. 4

“Overall” and “fine” duty cycles. The vertical scale for each plot shows the stimulus intensity. The upper plot schematizes the stimulus to the ipsilateral ear, and the lower plot schematizes the stimulus to the contralateral ear. For the ipsilateral and contralateral stimuli, the “fine duty cycle” is set at 50% here. The “overall duty cycle” is set at 100% for the ipsilateral stimulus, and at 25% for the contralateral stimulus. Each individual tone pulse lasts 2 s, and each putative adjustment session lasts 15 s, in an experimental run lasting at least 2 min. From “The Relation of Perstimulatory Adaptation to Other Short-Term Threshold-Shifting Mechanisms,” by R. L. Sergeant and J. D. Harris, 1963, Journal of Speech and Hearing Research, Vol. 6, pages 27–39, Fig. 2. Copyright 1963 by the American Speech-Language-Hearing Association. Adapted with permission

The third line of work suggesting that the loudness evoked by a steady monaural stimulus does not decline over time consists of SDLB experiments in which the ipsilateral and contralateral pure tones differ sufficiently in frequency that the physiological response to one tone is unlikely to affect the physiological response to the other (by any known physiological means). Consider Bray et al. (1973), who used SDLB with trains of contralateral 3-s tones (33% duty cycle). The general stimulus conditions were (1) a 70 dB SPL ipsilateral tone of 0.5 kHz, with contralateral tones of either 0.4 or 3 kHz, and (2) a 70 dB SPL ipsilateral tone of 3 kHz, with contralateral tones of either 0.5 or 2 kHz. Bray et al. employed two different techniques to obtain the matching contralateral loudness: either traditional adjustment by the listener, or the method mentioned above, in which the experimenter adjusted the comparison stimulus’s intensity so that it crossed back and forth over the intensity that putatively provides equal loudness. Furthermore, Bray et al. allowed 7 min of ipsilateral stimulation to elapse, a relatively long time, before perstimulatory loudness matches were made. The amount of “fatigue” found was within the likely limits of error (see Discussion). Altogether, the loudness evoked by the matching contralateral tone (and hence the loudness evoked by the ipsilateral tone) had apparently not changed over the long course of the ipsilateral tone.

Later, Morgan and Dirks (1973) gathered matching loudnesses for a 0.4-kHz contralateral (i.e., matching) tone over the entire course of a 7-min, 0.5-kHz ipsilateral tone. The contralateral tones were either of 3-s duration (33% duty cycle) or of 0.3-s duration, the latter presented in trains of six (with 0.6-s gaps between individual tones) at an overall 56% duty cycle for these trains. In both cases, the intensity adjustments were made by the listeners, and no overall “fatigue” was found. Later yet, the experimenters in Dirks et al. (1974) adjusted the stimulus intensities of contralateral tones (of 0.3-s duration). No “fatigue” was evident over the course of 3 min for ipsilateral tones of 0.5 or 4 kHz at intensities of either 70 or 80 dB SPL when, respectively, loudness-matched by contralateral tones of 0.4 or 2 kHz. In short, studies other than Bray et al. (1973) have also shown a lack of “fatigue” when the ipsilateral and contralateral pure tones differed in frequency.

Note well that some caution may be required in interpreting such results; the new model (below) hypothesizes that the duration of the contralateral tone affects the degree of “fatigue.” The size of this effect is difficult to predict, however; indeed, when listeners loudness-matched 0.25-s, 0.4-kHz contralateral tones to relatively longer (3-s), 0.5-kHz, 70 dB SPL ipsilateral tones, the matching intensities were effectively identical (Morgan & Dirks, 1973, Exp. 2).

Negligible monaural loudness “fatigue”: A fourth line of proof

A fourth, and more recent, line of evidence indicates that monaural loudness does not decline. These data derive from the method of successive estimates. As Canévet, Scharf, and Botte (1985, p. 432) explain, “The subject matched the magnitude of a number to the loudness of an ongoing sound whenever a visual instruction appeared, usually every 20 s. The subject chose any positive number that seemed appropriate at that moment and entered it into a computer terminal”. Bertram Scharf and his colleagues (including Canévet) used this method repeatedly, and their work has been conveniently reviewed by Meunier (2013). Scharf and colleagues found that tones of a single waveform frequency can decline to inaudibility over several minutes, if their intensity is 20 dB SL or less. Indeed, tones could become inaudible for even higher sensation levels, up to 40 dB SL, if the waveform frequency exceeded 10 kHz. Otherwise, however, the loudness evoked by monaural (and binaural) tones does not appreciably decline.

Tang et al. (2006) employed the same method of loudness estimation as Scharf and colleagues, but failed to mention whether their listeners used one ear or both. Regardless, Tang et al. found results similar to those of Scharf and colleagues, for tones of 0.125, 1, or 8 kHz.

Unfortunately, this loudness estimation method is bedeviled by large individual differences (Meunier, 2013). Consider Canévet et al. (1985, p. 433), who found that the standard deviations (in decibels) associated with mean across-listener loudness estimates approached half of those mean values. Such variability may explain why, for example, Cook (1986) found that a 2-kHz tone at 20 dB SL did not subjectively disappear after 1 min (the only testing duration that she employed), although perhaps a greater duration was required.

“Fatigue” with two ears (with notable exceptions), but rarely with one: Summary

The evidence reviewed above suggests overall that the “fatigued” ear (here called the “ipsilateral” ear) does not seem to “fatigue” for tones over 20 dB SL when the comparison ear (here, the “contralateral” ear) has no stimulus presented to it. What, then, is the nature of the “fatigue” found through SDLB? The present article postulates and supports a new model, one that depends upon the following observations and concepts.

The physiology of “fatigue” with two ears: The role of the olivocochlear bundle (OCB)

Introducing the OCB

When both ears are exposed simultaneously to auditory stimuli, experience shows that they both contribute to the overall loudness. Nonetheless, attempts have been made to quantify how overall loudness compares to that evoked by a stimulus at a single ear (e.g., Epstein & Florentine, 2009; Hellman & Zwislocki, 1963; Scharf & Fishken, 1970). A firm consensus has not emerged, other than that two ears offer greater loudness than one when given a common stimulus. Regardless, an individual ear’s contribution to the overall loudness is usually presumed to rise with (1) the number of primary “afferent” neurons (those carrying signals brainward) that are firing above their spontaneous rates, and (2) the actual firing rate of each neuron, up to its maximum (i.e., saturation) firing rate (see Nizami & Schneider, 1997, and the articles reviewed therein).

Given the afferent neuronal firing from any one ear, a great deal of evidence suggests that this firing can affect the afferent neuronal firing at the opposite ear. This action involves anatomical features well “below” the brain, specifically the olivocochlear neuron bundle (OCB). OCB activity is crucial to the present model. Unfortunately, the characteristics of OCB activity are frustratingly difficult to assemble from the available publications, which are numerous but largely uncoordinated. Indeed, the best review of the data up to 1999 may well be a chapter in the present author’s PhD dissertation (Nizami, 1999). Post-1999, the broadest peer-reviewed summary of the anatomy and physiology of the OCB that the present author could find is Guinan (2006), a meaty review that is nonetheless far from fully comprehensive, thanks to an overemphasis on work from Guinan’s own institution. Guinan also failed to mention some of the supporting psychophysical work; that gap is partly filled below.

The OCB pathways

An ongoing tone at one ear evokes simultaneous firing, in the OCB, of efferent neurons, carrying signals “away from” the brain. The OCB efferents project across to the opposite ear. A tone as lengthy as 10 min or more may evoke continuing firing of efferent voltage spikes, with a slight firing-rate decline. This neuronal firing acts to “turn down the volume” at the opposite ear—that is, to reduce its gain (sensitivity). The loss of sensitivity is equivalent to a same-frequency tone being diminished by as much as 24 dB; even stronger effective reductions may be possible. Olivocochlear efferents may be found for primary afferents of all characteristic frequencies (CFs), where the CF is the tone frequency that evokes the greatest firing rate in a particular primary afferent. There can be multiple OCB efferents of different thresholds for a given CF (Guinan, 2006; Nizami, 1999), allowing smooth and progressive “volume turn-down” in response to a tone of that CF.
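For a sense of scale, a reduction expressed in decibels can be converted to linear ratios using the standard decibel definitions (20 log10 for sound pressure, 10 log10 for intensity). The following Python sketch does this for the 24-dB figure quoted above. The function names are hypothetical, and the sketch makes no claim about OCB physiology; it merely restates 24 dB in linear terms.

```python
def db_to_pressure_ratio(delta_db):
    """Convert a level change in dB to a sound-pressure ratio (20*log10 rule)."""
    return 10 ** (delta_db / 20)


def db_to_intensity_ratio(delta_db):
    """Convert a level change in dB to an intensity (power) ratio (10*log10 rule)."""
    return 10 ** (delta_db / 10)


reduction_db = -24.0  # the 24-dB effective reduction quoted in the text
pressure_ratio = db_to_pressure_ratio(reduction_db)    # roughly 0.063
intensity_ratio = db_to_intensity_ratio(reduction_db)  # roughly 0.004
print(round(pressure_ratio, 3), round(intensity_ratio, 3))
```

In other words, a tone diminished by 24 dB has roughly one-sixteenth of its original sound pressure, or about 0.4% of its original intensity.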

Figure 5 sparsely illustrates the OCB pathways from the ipsilateral side of the head (here, arbitrarily chosen as the left) to the contralateral side. Similar pathways exist, as mirror images, from the contralateral side to the ipsilateral side. The action of the OCB is intensity-, frequency-, and time-specific, as follows (Guinan, 2006; Nizami, 1999). Increasing the stimulus intensity in one ear increases the firing rates in the OCB neurons that project to the opposite ear. This effect is frequency-specific, as revealed by auditory “tuning curves.” These trace an auditory primary afferent neuron’s threshold for evoked voltage-spike firing as a function of the frequency of a single-frequency stimulating tone (see, e.g., Kiang, Watanabe, Thomas, & Clark, 1965). Such afferent tuning curves are V-shaped. So are the tuning curves for OCB efferents, as found by similar procedures; indeed, the afferent and efferent tuning curves are quite similar (Guinan, 2006). The V shape implies that possible OCB effects, such as “fatigue,” can spread to frequencies other than that of the perstimulatory ipsilateral stimulus with increasing intensity of that stimulus; this point is significant, and will be discussed much farther below.

Fig. 5

A simplified picture of the physiological elements comprising the olivocochlear bundle (OCB), which allows ipsilateral stimuli to affect contralateral hearing. (This is merely a portion of a larger olivocochlear system, which is complicated in its anatomy, physiology, and proposed functions; see Guinan, 2006.) To understand “ipsilateral” and “contralateral” here, imagine the flat page itself as representing the human head, seen by looking toward the face. The vertical dashed line therefore runs through the head’s center of mass, from the underside of the jaw (ventral; bottom of the figure) to the top of the crown (dorsal; top of the figure). Hence, “ipsilateral” is presently the left side of the head. The placement of neuronal elements roughly mimics their actual mutual spatial relations. Boxes are nuclei (masses of neuronal cell bodies), and lines are bundles of axons (conduction pathways). The arrows indicate the direction of propagation of voltage spikes along the respective axons. IHC = inner hair cell; CN = cochlear nucleus; TB = trapezoid bundle; DAS = dorsal acoustic stria; IAS = intermediate acoustic stria; HN = higher nuclei; MOC = medial olivocochlear nuclei; UOCB = uncrossed portion of olivocochlear [neuron] bundle; OHC = outer hair cell. From “On Auditory Dynamic Range,” by L. Nizami, 1999, Doctoral dissertation, Department of Psychology, University of Toronto, Toronto, Canada, Fig. 32; supplemented by data from Guinan (2006). Copyright 1999 by Iftikhar Riaz Nizami

Regarding time dependence, Guinan (2006) noted two time-scales of OCB action. There is a “fast” time scale, on which OCB effects rise over 100 ms or so with ongoing OCB firing. But there is also a “slow” time scale, of tens of seconds, over which the “fast” OCB effects can accumulate, even if the OCB is actually stimulated for less than 50% of that total time. Such cumulative effects will prove important in the model introduced below. Regarding the slow effect, Guinan (p. 595) noted that “the slow effect wears off slowly”—that is, the end of tone-evoked firing in OCB neurons does not produce immediate reversal of OCB effects. This, too, will be addressed.
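The accumulation and slow decay just described can be pictured with a toy leaky integrator. The Python sketch below is only an illustration: the function name `slow_effect`, the build-up and decay time constants (20 s and 40 s), and the 40% on/off pattern are assumptions chosen for the example, not values taken from Guinan (2006). It shows a normalized "slow" effect accumulating under intermittent drive, and wearing off gradually after the drive ends.

```python
import math

def slow_effect(drive, dt, tau_build=20.0, tau_decay=40.0):
    """Leaky integration of an on/off drive (1.0 = OCB firing, 0.0 = silent).

    tau_build and tau_decay (seconds) are assumed values, not measurements.
    Returns the normalized effect (0 to 1) at every time step.
    """
    level = 0.0
    trace = []
    for d in drive:
        # Build up while the drive exceeds the current level; otherwise decay.
        tau = tau_build if d > level else tau_decay
        level += (d - level) * (1.0 - math.exp(-dt / tau))
        trace.append(level)
    return trace

DT = 0.1      # time step, s
N_ON = 600    # 60 s of stimulation
N_OFF = 100   # 10 s of silence afterwards

# Continuous drive versus a drive that is on for 2 s out of every 5 s (40%).
continuous = [1.0] * N_ON + [0.0] * N_OFF
pulsed = [1.0 if i % 50 < 20 else 0.0 for i in range(N_ON)] + [0.0] * N_OFF

trace_cont = slow_effect(continuous, DT)
trace_pulsed = slow_effect(pulsed, DT)

print(f"after 60 s, continuous drive: {trace_cont[N_ON - 1]:.2f}")
print(f"after 60 s, 40% drive:        {trace_pulsed[N_ON - 1]:.2f}")
print(f"10 s after offset (cont.):    {trace_cont[-1]:.2f}")
```

Under these assumed constants, the intermittent drive still produces a sizeable effect, and the effect has only partly decayed 10 s after the drive stops. Different constants would change the numbers but not this qualitative pattern.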

Evidence for OCB action in humans

To repeat an important point: the present model depends on the notion that firing in OCB efferents effectively “turns down” the opposite ear’s “volume,” as if the same-frequency tones presented to that ear had decreased in intensity by as much as 24 dB or more. This observation was made in animals, however, which suggests (but hardly guarantees) an extrapolation to humans. There is nonetheless evidence for “volume turn-down” in humans. That evidence is not well reviewed in the literature and is therefore briefly summarized here, as follows.

Physiological evidence for OCB action in humans

The concept that a stimulus in one ear affects the stimulus-responsiveness of the opposite ear is shown through changes in the latter ear’s stimulus-evoked oto-acoustic emissions (OAEs). Guinan (2006, p. 599), in his literature review, noted that many such experiments had been done, “almost all of which are consistent with the conclusion that MOC [medial olivo-cochlear; i.e., OCB] efferents have the same qualitative effects in humans as they do in other mammals.”

Psychophysical evidence for OCB action in humans

But is there psychophysical evidence for “volume turn-down” in humans? We might expect an elevation of the detection threshold for a stimulus given to one ear, caused by activity in the OCB projecting to that ear from the opposite side. Unfortunately, the ears in question have been referred to, respectively, in the literature as “ipsilateral” and “contralateral,” which may seem contrary to present use. Such differences are confusing but inevitable. Regardless, we might also expect the diminishing of the contribution-to-loudness of “ipsilateral” stimuli that are presented well above their detection thresholds, due to bursts of “contralateral” stimuli. In fact, elevation of detection thresholds and reduction of loudnesses have both been observed. They were not observed using SDLB, and as such they are not SDLB phenomena, as follows.

Elevation of detection thresholds due to “contralateral masking”

In a technique that was later denoted by the somewhat confusing term “contralateral masking,” Ingham (1959) used “pure” (i.e., single-waveform-frequency) tones of 30 dB SL, played in one ear (deemed the “contralateral” ear in later literature), in order to mask the detection of pure tones played in the other (“test”) ear. Ingham made the waveform frequency of the tones to the two ears different. Listeners were required to establish detection thresholds for the “test” ear, rather than matching the loudnesses across-ears. Ingham discovered an elevation of pure-tone detection thresholds of as much as 15 dB, which progressively lessened with increase in the frequency separation between the “masking” tone (0.4, 0.84, or 1 kHz) and the “test” tone (e.g., any one of 0.6, 0.76, 0.92, 1.08, 1.24, or 1.4 kHz). All tones were continuous, but Ingham did not mention what their durations were during the threshold-determination sessions. They were, however, probably several minutes long—that is, more than long enough to stimulate OCB firing.

Dirks and Malmquist (1965) used a different “contralateral masker” and “test tone”—respectively, a narrowband noise centered at 4 kHz, and a 4-kHz tone. These stimuli were played simultaneously to the ears as pulses (500 ms on, 500 ms off, in trains) or were played continuously. Unlike Ingham (1959), Dirks and Malmquist allowed their listeners to establish the masked-detection thresholds themselves, through continuous adjustment of the intensity of the tone (tracking). As the experimenter-determined noise intensities rose from 0 to 90 dB SPL, the listener-indicated tone-detection thresholds rose by as much as 7.5–8.5 dB. More importantly, these increases were not immediate; they developed over the course of the first 15 s of tone-intensity adjustment (see Dirks & Malmquist, 1965, Fig. 5).

Dirks and Norris (1966) extended the Dirks and Malmquist (1965) results, using the same methods but changing the stimuli. With a wideband noise “contralateral masker” of 60 dB SPL, Dirks and Norris found threshold elevations of 6 dB or so for 4-kHz test tones, 5 dB or so for 1-kHz test tones, and 2.5 dB or so—probably within the limits of error (see Discussion)—for 0.25-kHz test tones. They then repeated their experiment, using “contralateral masker” noise of 15–75 dB SPL. For all frequencies of the test tones, threshold elevations increased with masking-noise intensities. This is what would be expected from an OCB-mediated effect that strengthens with strengthening of the OCB-evoking stimulus. Elevation of the detection threshold for the 1-kHz test tones reached nearly 10 dB. The trials were then repeated, using as “contralateral maskers” wideband noises, narrowband noises, and pure tones having waveform frequencies close to that of the masked test-tone (and intensities of 40 dB SPL). These trials revealed that the degree of threshold elevation increases as the masker’s bandwidth narrows—that is, as the proportion of the masker’s energy near the frequency of the test tone increases. Finally, for test tones of 1 or 4 kHz, pure-tone maskers that had nearby waveform frequencies and intensities of just 40 dB SPL caused threshold elevations of over 12 dB (as much as 16 dB, for the 4-kHz tone). Altogether, Dirks and Norris confirmed the degrees of threshold elevation found by Ingham (1959).

Blegvad (1967) used very similar experimental conditions overall to those of Dirks and Malmquist (1965) and Dirks and Norris (1966), but with white noise (of 50 or 70 dB SPL) as the “contralateral masker,” and test tones of 0.25, 1, or 4 kHz. The threshold elevations for the 0.25-kHz test tone were 2 dB or less, but elevations as high as 8 dB were found for 1-kHz test tones, and threshold elevations tended to be greater for the 70 dB SPL “contralateral masker” than for the 50 dB SPL “contralateral masker.” None of this contradicts the Dirks and Norris findings.

Altogether, Ingham (1959), Dirks and Malmquist (1965), Dirks and Norris (1966), and Blegvad (1967) found that the detection threshold for a “contralaterally masked” test tone depends on two factors. One is the “contralateral masker” intensity. This dependence is to be expected, from the dependence of the OCB firing rate upon the intensity of the evoking stimulus, which here is the “contralateral masker.” Another factor is the closeness of the waveform frequency of the “contralateral masker” and test tone (Dirks & Malmquist, 1965; Dirks & Norris, 1966; Ingham, 1959). This is to be expected from the OCB’s frequency specificity, as was described above.

Importantly, Dirks and Norris (1966, p. 18) observed, as had Dirks and Malmquist (1965), that the elevation in detection threshold for the test tone under “contralateral masking,” as indicated by the listener’s tracking over the course of 1 min, gradually increased early on. This agrees with the expected growth in OCB-firing effectiveness over the course of a continuous OCB-evoking stimulus, as was reviewed above. As Blegvad (1967, p. 164) stated, “It would be reasonable to assume that masking of the non-test ear exerts an influence on the test ear via efferent fibers and that the effect is closely related to adaptative [sic] phenomena.”

Change in ipsilateral loudness due to contralateral stimuli

Meunier (2013) provides a convenient review of various important studies by Scharf and colleagues (1981–1997). The latter found that the loudness evoked by ipsilateral tones could be reduced by intermittent tones in the contralateral ear. (Here, “ipsilateral” and “contralateral” coincide with the orientation used in most of the present article.) Consider one particular example. In Botte, Baruch, and Scharf (1986), a 1-kHz 60 dB SPL tone was played to the ipsilateral ear; 20 s after its start, a single-frequency 60 dB SPL tone of equal or different frequency started in the contralateral ear, ending there 50 s before the end of the ipsilateral tone. Listeners were instructed to judge only the loudness in the ipsilateral ear (loudness estimation; see above). Empirically, the loudness evoked by the ipsilateral tone declined during the contralateral tone, with the decline being greater as the contralateral-tone frequency approached 1 kHz and as the contralateral-tone duration increased from 5 to 20 to 40 s. However, the rate of the contralaterally-evoked loudness-drop progressively lessened during the contralateral tone, similar to the approach to asymptote that is found in plots of evoked “fatigue” (see Fig. 3).

The greater ipsilateral-loudness decline that was found as the waveform frequency of the contralateral tone approached that of the ipsilateral tone is expected from the frequency-dependence of OCB action (above). The loudness evoked by the ipsilateral tone actually disappeared altogether, after the first 20 s of a 1-kHz, 60 dB SPL, 40-s-long contralateral tone. However, for contralateral tones of 1 s or less, listeners found that loudness judgments were impossible. Furthermore, after termination of the contralateral tone, the loudness evoked by the ipsilateral tone did not recover immediately, and the recovery was dependent on what the frequency of the contralateral tone had been.

The frequency selectivity of the contralaterally-evoked loudness drop was further confirmed by Botte et al. (1986) using a 10-s contralateral tone, for ipsilateral tones of 0.5 or 1 or 3 kHz.

The putative role of the olivocochlear bundle in SDLB: A model

The section above revealed substantial evidence for “volume turn-down” in humans, presumably mediated by the OCB. We can now model the role of the OCB during SDLB.

Two basic assumptions of the model

Two assumptions are imperative. First, following the reviews above, it will be assumed that a single ear’s overall contribution to loudness remains constant when that ear is the only one receiving an acoustic stimulus—when the overall loudness is monaural. That is, overall loudness that is purely monaural does not “fatigue.” This notion is consistent with the well-established phenomenon of neural/sensory adaptation. Nizami (2010) briefly summarized some of the salient data on the adaptation of peripheral neurons, which is the best-established of all the sensory neuronal-adaptation data. Specifically, Nizami (2010) supplied a table that lists details of the species studied, the response of the neuronal firing rate to an ongoing steady stimulus (whether complete adaptation down to the spontaneous firing-rate or below, or merely adaptation to a firing-rate plateau), and the approximate elapsed time to that response, for audition, vision, taste, and olfaction. These are all responses from just one side of the body (i.e., monaural or its equivalent). In audition and vision, in which complete psychophysical adaptation (subjective absence of the stimulus) to far-suprathreshold stimuli does not occur in humans, neurons show firing-rate plateaus. (Why any brief, initial decrease in peripheral firing seemingly does not affect the sensation evoked by an ongoing stimulus is still not understood.) In taste and olfaction, in which humans can show complete psychophysical adaptation, peripheral neurons can likewise show complete adaptation. Of course, judging how well neuronal and psychophysical adaptation correlate would require a thorough, dedicated review of a fairly diffuse literature.

The second assumption to be adopted here is one that is often implicit in the literature: namely, when overall loudness is binaural, having contributions from both ears, each ear’s contribution to the loudness is simply added to the other’s, to give an overall contribution. Note well that this notion of equal weighting is a simple default assumption of symmetry. It need not imply that the loudness of a stimulus played to both ears is any particular multiple of the loudness of that stimulus when it is played to only one ear.

Relation of the listener’s behavior to OCB activity

In modeling the role of the OCB during SDLB, it is best to start with the middle of the procedure, namely, the perstimulatory period (Fig. 2). During this time, the ipsilateral stimulus is usually played continuously, which simplifies the interpretation of its effects. Figure 6 illustrates the model for the “1st session” and “2nd session” of perstimulatory adjustments. Later sessions are omitted for simplicity’s sake, but they can be extrapolated. Note well that this figure and the ones that follow represent a coupling of the author’s imagination to the available literature. The time courses of the curves in the illustrations are meant to be “best guesses,” not precise descriptions. Not all of the respective phenomena bear precise descriptions.

Fig. 6

Hypothetical actions of the ipsilateral and contralateral ears to affect each other’s “average volume setting” (see the text), from the beginning of the perstimulatory period (see Fig. 2) to the end of the second perstimulatory adjustment session

The continuous ipsilateral stimulus (Fig. 6b) evokes continuous neural firing in the OCB that projects (see Fig. 5) to the contralateral ear. This ongoing firing of the OCB will progressively “turn down the volume” at the contralateral ear (Fig. 6a, c, e, and g). During the adjustment sessions (Fig. 6c, d, g, and h), as was explained above, the listener must equate the contributions-to-loudness from each ear, by adjusting the contralateral stimulus intensity (Fig. 1). Several phenomena affect the listener’s choice of the final contralateral stimulus setting within the session.

First, the contralateral stimulus itself must cause the ipsilateral ear to desensitize, momentarily reducing the ipsilateral ear’s contribution to overall loudness (Fig. 6d and h). This interpretation is crucial but seems well-justified by the literature on the operation of the OCB (see above). Nonetheless, the notion that the ear undergoing stimulus intensity adjustment can influence the ear experiencing a constant stimulus is typically omitted from the SDLB literature. This is truly remarkable; before the spurt of experiments on SDLB in the 1960s and 1970s, there had already been suggestions that some brainward pathway might allow each ear to influence the other during SDLB. For example, in a peer-discussion section in the French language in Bocca and Pestalozza (1959), M. Portmann claimed that in a conference in 1954 (1er Congrès Extraordinaire de la Société Internationale d’Audiologie, Buenos Aires) he had proposed that “Hood’s [1950] binaural adaptation is not a uniquely peripheral phenomenon”—in other words, that “fatigue” does not involve only the cochlea of the ipsilateral (“fatigued”) ear (present author’s translation).

Altogether, the ipsilateral contribution to overall loudness must decline during each adjustment session, thanks to a declining ipsilateral “volume,” as shown in the second and fourth marked intervals in each of Fig. 6d, f, and h. By the end of an adjustment session, the listener must match this weakened ipsilateral contribution to loudness. The listener does so by reducing the contralateral stimulus intensity from its initial setting to a final setting. This brings us to the second phenomenon alluded-to above: namely, that the initial contralateral intensity setting can actually be quite high, causing substantial “volume turn-down” at the ipsilateral ear, thanks to an unfortunate procedure within SDLB experiments. Between the adjustment sessions in a run, the experimenter resets the mechanical attenuator—that is, turns the knob that the listener uses to adjust the intensity (and hence the contribution to loudness) of the contralateral stimulus (Fig. 1). Typically, the experimenter sets the listener’s contralateral-stimulus attenuator to its maximum, and then adds or subtracts some small randomly-determined amount, using either the listener’s attenuator itself and/or a second, external, experimenter-controlled attenuator. All of this attenuator manipulation minimizes the contralateral-stimulus intensity overall, and hence the contralateral contribution to loudness. The listener must counteract these settings when starting each adjustment session, instinctively doing so by quickly and profoundly increasing the contralateral stimulus intensity. This maximizes the “volume turn-down” at the ipsilateral ear, hence reducing the ipsilateral contribution to loudness.

The latter decrease is an experimental artifact, and a crucial one. To understand its importance, consider the opposite case. Suppose that each adjustment session started with the contralateral stimulus intensity equalized to that of the ipsilateral stimulus, instead of beginning at a near-minimum. This has, in fact, been tried (Teichner & Sadler, 1966). The inferred “fatigue” did not exceed 11 dB (using tones of 3.5 or 5 kHz), which is less than that found in most studies presently cited, even after a 5-min perstimulatory duration (Teichner & Sadler, 1966, Fig. 1). Indeed, only 15 s of perstimulatory time sufficed to produce most of the “fatigue”—a duration that, coincidentally, is the typical duration of an adjustment session (see Table 1). Nonetheless, near-maximum initial attenuator settings by the experimenter remained the norm in the literature, perhaps indeed because they produced the greatest “fatigue”! The notion that “fatigue” is circumstantial will feature prominently in the model below.

A key assumption of the new model, then—after Teichner and Sadler (1966), and others to be cited later—is that substantial “volume turn-down” at one ear by the other can occur within the duration of an adjustment session. Typical adjustment sessions last 10 s (Hood, 1950) to 20 s (Egan, 1955b). These durations are sufficient to allow changes in the degree of “volume turn-down” by the OCB, according to the “slow” time scale of OCB effects (Guinan, 2006).

By the same token, of course, the “volume turn-down” at the ipsilateral ear that is evoked by the momentary presence of a contralateral stimulus will, in turn, momentarily reduce the effective intensity of the ipsilateral stimulus, and hence the degree of OCB-mediated, ipsilaterally-induced “volume turn-down” at the contralateral ear (and so on and so forth, potentially ad infinitum for an ongoing ipsilateral stimulus). This reverberation is conceptually complicated and will therefore be omitted from the present model, save for showing in Fig. 6 those momentary slowings of the rate of volume turn-down in the contralateral ear, as lessenings of the slope of the curve in panels c and g during the durations labeled “1st session” and “2nd session.”

How the model explains the principal claim of SDLB: Increasing “fatigue” with ipsilateral stimulus duration

Note well the role of the “duty cycle” (Fig. 4), whether this be the fine duty cycle or the overall duty cycle. The lower the duty cycle of a repeating stimulus applied to one ear, the longer the quiet intervals at that ear, during which the opposite ear can recover from OCB-mediated “volume turn-down.” In terms of the new interpretation (above) of the classic SDLB experiment, the intervals in-between the contralateral adjustment sessions should allow the ipsilateral ear to recover somewhat from the “volume turn-down” induced there by the contralateral stimulus. Such recovery is exemplified by the exponentially rebounding curve in the third segment in Fig. 6f.

Nonetheless, the ipsilateral stimulus continues at its fixed intensity. The ipsilateral ear therefore continues to act through the OCB to “turn down the volume” at the contralateral ear. That is, the contralateral ear continues to lose its sensitivity to stimuli. The listener must therefore, over successive adjustment sessions, set the contralateral stimulus to even higher starting magnitudes. We assume this to be mechanically possible, given no indication to the contrary. This, in turn, creates increasingly greater “volume turn-down” at the ipsilateral ear, reducing its apparent intensity and hence its contribution to overall loudness. This reduced loudness must be matched by stimulus intensity adjustments at the contralateral ear; by the end of the adjustment session, therefore, the listener assigns an even lower contralateral stimulus intensity than at the end of the previous adjustment session. This is classical SDLB “fatigue.”
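The chain of steps above can be written out as a toy bookkeeping exercise. The Python sketch below is hypothetical throughout: the function name `perstimulatory_fatigue`, the exponential growth of the contralateral “volume turn-down,” the 20-dB ceiling, the 120-s time constant, and the `coupling` factor are all assumptions made for illustration, not quantities from the SDLB literature. It encodes only the sequence stated in the text (contralateral turn-down, then a higher starting setting, then ipsilateral turn-down, then a lower final setting); whether the contralateral turn-down also enters the final match is not modeled here.

```python
import math

def perstimulatory_fatigue(ipsi_db, session_times_s,
                           max_contra_turndown_db=20.0,
                           tau_s=120.0, coupling=0.5):
    """Toy bookkeeping of the chain described in the text.

    Every number here is an assumption made for illustration:
    - the contralateral turn-down grows exponentially toward
      max_contra_turndown_db with time constant tau_s;
    - the listener raises the session-starting setting by that amount;
    - the ipsilateral turn-down is `coupling` times that extra intensity;
    - the final setting is the ipsilateral intensity minus that turn-down.
    Returns a list of (time, start_setting_db, final_setting_db, fatigue_db).
    """
    rows = []
    for t in session_times_s:
        contra_turndown = max_contra_turndown_db * (1.0 - math.exp(-t / tau_s))
        start_setting = ipsi_db + contra_turndown          # listener compensates
        ipsi_turndown = coupling * (start_setting - ipsi_db)
        final_setting = ipsi_db - ipsi_turndown            # end-of-session match
        rows.append((t, start_setting, final_setting, ipsi_db - final_setting))
    return rows

sessions = perstimulatory_fatigue(ipsi_db=80.0,
                                  session_times_s=[60, 120, 180, 240, 300])
for t, start, final, fatigue in sessions:
    print(f"t = {t:3d} s  start = {start:5.1f} dB  "
          f"final = {final:5.1f} dB  'fatigue' = {fatigue:4.1f} dB")
```

With these assumed rules, the starting settings rise and the final settings fall from session to session, with progressively smaller steps. That is the qualitative shape of the classical “fatigue” curve, although the particular decibel values carry no empirical weight.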

Figure 7 illustrates the approach to “fatigue” in a novel graphical manner. The figure uses a common linear time scale (bottom of figure) marked “Perstimulatory time.” The upper and middle panels of the figure, respectively, show the ipsilateral and contralateral ears’ contributions to loudness, which are assumed to be equally weighted (as noted above). When there is no auditory stimulus, these respective contributions to loudness are zero. Zero is also used to indicate the corresponding stimulus condition, namely, no applied auditory stimulus at all. (On a decibel scale of stimulus intensity, this is −∞.)

Fig. 7

(Bottom) Hypothetical moving-average contralateral stimulus settings during successive adjustment sessions (left to right) in the perstimulatory stage of an SDLB run. Of course, the sharp onsets and offsets do not represent the output of actual moving-average low-pass filters. (Middle, top) The consequent final contralateral (middle) and ipsilateral (top) contributions to loudness at the end of each adjustment session (see the text)

Stimulus intensity is shown in the lowest panel of Fig. 7. It is the moving-average stimulus intensity at the contralateral ear, “average” because the listener adjusts the contralateral-stimulus intensity up and down during that stimulus’s comparatively brief appearances during the perstimulatory adjustment sessions.

Summary of the new model

In the new interpretation of what happens during SDLB, the continuous ipsilateral stimulus evokes continuous neural firing in the OCB that projects to the contralateral ear, “turning down the (contralateral) volume.” Likewise, during each contralateral-intensity adjustment session, in which the listener matches the contributions-to-loudness from the two ears, the contralateral stimulus itself causes the ipsilateral ear to desensitize, temporarily reducing the ipsilateral contribution to overall loudness. It is that reduced contribution-to-loudness that the listener matches by the end of each adjustment session. The ipsilateral contribution to loudness may be even weaker than anticipated, thanks to a laboratory custom in SDLB: namely, that between adjustment sessions, the experimenter resets the contralateral attenuators to near-maximum. This obliges the listener to compensate by initially quickly increasing the contralateral stimulus intensity, and to a high level. Such adjustments are necessarily fast and extreme, given that listeners are typically allowed only 5–20 s for stimulus-magnitude adjustment (see above), which is very little time for a motor task more complicated than simply pressing a button. A relatively sudden, profound contralateral stimulus will reflexively cause a great reduction of the ipsilateral contribution to loudness.

The ipsilateral contribution can potentially recover in-between SDLB adjustment sessions. Nevertheless, the ipsilateral stimulus is ongoing, causing the contralateral ear to continually desensitize. Over successive adjustment sessions, then, the listener must compensate for contralateral-ear desensitization by further increasing the session-starting magnitude of the contralateral stimulus intensity—causing a yet greater across-session reduction of the ipsilateral contribution to loudness. This phenomenon is classically recorded as “fatigue.”

Hypotheses arising from the model, and evidence consistent with them

Various hypotheses arise from the new model, and there is evidence consistent with those hypotheses, as follows.

Hypotheses regarding prestimulatory contralateral stimulus settings

Recall that in a classic SDLB experiment there is a prestimulatory period (Fig. 2). That comparison period precedes the perstimulatory period, providing a baseline for assessing perstimulatory “fatigue” (Fig. 3). During the prestimulatory period, the ipsilateral and contralateral stimuli are typically on periodically (stimulus modulation) and simultaneously, that is, they are coincident bursts. Each ipsilateral burst has the same intensity; the amplitude of the train of contralateral bursts can be adjusted by the listener within an adjustment session. Indeed, a burst may fill an adjustment session (a practical minimum of 5 s; see above), so that the listener can balance the loudness of a single burst at the contralateral ear to the loudness of a single burst at the ipsilateral ear. When the bursts are too short for this, then in principle one train of stimuli can be loudness-matched to the other. Regardless, the “volume turn-down” at each ear due to the stimulus at the other ear should progress identically for each ear during the adjustment sessions. Altogether, then, the contralateral stimulus intensity should be set equal to the ipsilateral stimulus intensity. This means zero “prestimulatory fatigue.”
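The symmetry argument can be checked numerically. The following Python sketch is a minimal illustration under stated assumptions: the turn-down rule in `turndown_db` (its gain and floor), the use of “effective intensity” (stimulus intensity minus turn-down) as a stand-in for each ear's contribution to loudness, and the function name `matched_contra_setting` are all hypothetical. It searches for the contralateral intensity at which the two effective intensities are equal when both ears obey the same rule.

```python
def turndown_db(other_ear_db, gain=0.2, floor_db=20.0):
    """Assumed turn-down (dB) produced by a stimulus in the other ear."""
    return gain * max(other_ear_db - floor_db, 0.0)

def matched_contra_setting(ipsi_db, lo=0.0, hi=120.0, tol=1e-6):
    """Bisection for the contralateral intensity that equalizes the two
    effective intensities when both ears use the same turn-down rule."""
    def imbalance(contra_db):
        ipsi_effective = ipsi_db - turndown_db(contra_db)
        contra_effective = contra_db - turndown_db(ipsi_db)
        return contra_effective - ipsi_effective
    # imbalance() increases monotonically with contra_db, so bisection applies.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if imbalance(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

for ipsi_db in (40.0, 60.0, 80.0):
    match = matched_contra_setting(ipsi_db)
    print(f"ipsilateral {ipsi_db:.0f} dB -> matched contralateral {match:.2f} dB")
```

Because the same rule is applied to both ears, the match lands on the ipsilateral intensity itself, whatever gain and floor are assumed. This is the zero “prestimulatory fatigue” argued for above.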

Zero “prestimulatory fatigue” is what was seen in Carterette (1955, caption to Fig. 3; overall duty cycle = 25%), Egan (1955b, Figs. 2 and 7; overall duty cycle = 33%), Fraser et al. (1970, Table 1; overall duty cycle = 25%), and Petty et al. (1970, Table 1; overall duty cycle = 25%, see their Fig. 2). When insufficient time was allowed for smooth attenuation by the listener of the contralateral-burst intensity, there was still no evidence of “fatigue” during the prestimulatory period—for example, see Wright (1960, Table 2), Bray et al. (1973, Table 1), or Weiler and Blackmond (1973).

Interestingly, some authors eliminated the perstimulatory and poststimulatory periods altogether, extending the prestimulatory period to entirely fill the experimental run. In terms of the duty cycle, the contralateral and ipsilateral stimuli would in this case both have “overall duty cycles” of 100% but “fine duty cycles” of < 100% (Fig. 4), the latter describing the modulation. Regardless, the hypothesized outcome of the loudness balances would be the same as above: no “prestimulatory fatigue.” This hypothesis was confirmed. Egan (1955b) used 20-s periods of contralateral-intensity adjustment by the listener, but for a “prestimulatory” period alone, which lasted nearly 14 min (Egan, 1955b, Fig. 6; fine duty cycle = 33%). During that time, no “fatigue” was indicated, within the limits of error (limits that were established by Egan himself and are displayed in his Fig. 8), for ipsilateral and contralateral tones of 0.8 kHz, or for ipsilateral tones of 0.8 kHz and contralateral tones of 1 kHz. Similarly, Egan and Thwing (1955, Fig. 1, “Control” condition, “simultaneous”) used a “prestimulatory” period alone, lasting 8 min (1-kHz tones, 15-s adjustment sessions), during which no consistent “fatigue” was found.

Fig. 8

Hypothetical moving-average contralateral stimulus settings (bottom) during successive adjustment sessions (left to right) in the poststimulatory stage of an SDLB run, and the consequent final contralateral (middle) and ipsilateral (top) contributions to loudness at the end of each adjustment session (see the text). Unlike in Fig. 7, the adjustment sessions (being poststimulatory) have not been numbered

Hypotheses regarding poststimulatory recovery

Let us return to the full SDLB experiment (Fig. 2). By the end of the perstimulatory period, the contralateral ear is suffering from many minutes of “volume turn-down,” evoked by OCB firing caused by the continuous ipsilateral stimulation. In the subsequent poststimulatory period, the ipsilateral stimulus is turned off, except during adjustment sessions, in which the ipsilateral stimulus intensity is kept the same as in the prestimulatory and perstimulatory periods (Fig. 2), and the intensity of the contralateral stimulus is adjusted by the listener until the contralateral contribution to overall loudness matches the ipsilateral contribution to overall loudness. But the contralateral ear is now free of the constant ipsilateral influence that occurred during the perstimulatory period. Over time, therefore, the contralateral ear will resensitize. The listener must compensate by reversing what had been done during the perstimulatory period—that is, the initial contralateral stimulus setting must be progressively lower from one poststimulatory session to the next. This, in turn, successively decreases the influence of the OCB that projects from the contralateral ear to the ipsilateral ear. This ipsilateral ear therefore produces an increasing contribution-to-loudness over poststimulatory adjustment sessions. This increasing contribution is what the listener must match, as a “loudness match,” by the end of each adjustment session. Across sessions, then, the final intensity setting at the contralateral ear will increase. This is classic SDLB “recovery” (Fig. 3), seen in Carterette (1955, Fig. 3; stimuli = bandpass noises), Egan (1955b, Fig. 2), and Thwing (1955, Fig. 3).
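The reversal described above can be sketched in the same toy terms. In the Python fragment below, the function name `poststimulatory_recovery`, the exponential decay of the contralateral turn-down, its 18-dB starting value, the 60-s time constant, and the `coupling` factor are assumptions made purely for illustration. Only the direction of the changes is meant to mirror the text.

```python
import math

def poststimulatory_recovery(ipsi_db, times_since_offset_s,
                             contra_turndown_at_offset_db=18.0,
                             tau_recovery_s=60.0, coupling=0.5):
    """Assumed poststimulatory chain: the contralateral turn-down decays,
    so starting settings fall and final settings rise across sessions.
    Returns a list of (time, start_setting_db, final_setting_db)."""
    rows = []
    for t in times_since_offset_s:
        contra_turndown = (contra_turndown_at_offset_db
                           * math.exp(-t / tau_recovery_s))
        start_setting = ipsi_db + contra_turndown        # less compensation needed
        ipsi_turndown = coupling * (start_setting - ipsi_db)
        final_setting = ipsi_db - ipsi_turndown          # end-of-session match
        rows.append((t, start_setting, final_setting))
    return rows

recovery = poststimulatory_recovery(80.0, [0, 30, 60, 120, 240])
for t, start, final in recovery:
    print(f"t = {t:3d} s after offset  start = {start:5.1f} dB  "
          f"final = {final:5.1f} dB")
```

Under these assumptions, the final settings climb back toward the ipsilateral intensity as the poststimulatory period proceeds, which corresponds to the “recovery” limb of the classical plot.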

Figure 8 illustrates the proposed time-course of events. As in Fig. 7, the bottom of the figure has a common linear time-scale, now labeled “Poststimulatory time.” As in Fig. 7, the upper and middle panels of the figure, respectively, show the ipsilateral and contralateral contributions to loudness. As before, the lowest panel indicates the moving-average intensity of the contralateral stimulus. The dashed lines in Fig. 8 show, for comparison’s sake, the maximum ipsilateral contribution to loudness during the perstimulatory period.

The hypothesized effect of raising the intensity of the continuous ipsilateral stimulus across experimental runs

In some experiments, the intensity of the ipsilateral stimulus was changed from one experimental “run” to the next, where a run is one complete set of pre-, per-, and poststimulatory periods. Increasing the perstimulatory ipsilateral stimulus intensity empirically increases “fatigue.” Figure 9 shows this effect, in the style of Fig. 3. The same pattern of results is seen for pure tones in Hood (1950, Fig. 15), Carterette (1956, Fig. 6), Jerger (1957), Stokinger and Studebaker (1968, Table 1), and Petty et al. (1970, Fig. 5). For wideband noise, the pattern is also seen in Carterette (1955, Fig. 3; 1956, Fig. 1).

Fig. 9

Changes in the plot of “fatigue” (arrow) with increases in ipsilateral stimulus intensity. All ipsilateral intensities are represented here by the one horizontal line (“Ipsilateral stimulus intensity”). Hence, the contralateral intensities (vertical scale) are in decibels relative to their respective ipsilateral intensity, rather than dB SPL as in Fig. 3. Note that other factors can cause the plot of “fatigue” to change similarly (see the later text)

Now, according to the arguments made just above, greater perstimulatory “fatigue” should be followed by slower poststimulatory recovery. Unfortunately, not all articles report the course of poststimulatory recovery, but the hypothesized slowing was confirmed in Carterette (1955, Fig. 3) for bandpass noises.

How does the new model explain greater “fatigue” with greater ipsilateral stimulus intensity? The answer is straightforward. At any particular perstimulatory juncture across experimental runs, an increased ipsilateral stimulus intensity corresponds to greater “volume turn-down” at the contralateral ear. The listener compensates for this during adjustment sessions by setting the initial contralateral stimulus intensity higher. This precipitates a lower final contralateral intensity setting—that is, greater “fatigue” (see above and Fig. 7). Figure 10 illustrates this course of events for a typical perstimulatory adjustment session of 5–20 s, taken at some common perstimulatory juncture across experimental runs, such that the time-scale seen at the bottom of Fig. 7 becomes redundant and is replaced by the intensities of the ipsilateral stimulus.
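This across-run argument can also be written as a short sweep over ipsilateral intensities at one fixed perstimulatory juncture. The Python sketch below is hypothetical: the function name `fatigue_at_juncture`, the linear growth of contralateral turn-down with ipsilateral intensity (`turndown_per_db`, `floor_db`), and the `coupling` factor are assumptions for illustration only.

```python
def fatigue_at_juncture(ipsi_db, turndown_per_db=0.25,
                        floor_db=20.0, coupling=0.5):
    """'Fatigue' (dB) at one fixed perstimulatory juncture, under assumed rules."""
    # Assumed: a more intense ipsilateral stimulus turns the contralateral ear
    # down further.
    contra_turndown = turndown_per_db * max(ipsi_db - floor_db, 0.0)
    start_setting = ipsi_db + contra_turndown        # listener compensates
    ipsi_turndown = coupling * (start_setting - ipsi_db)
    final_setting = ipsi_db - ipsi_turndown          # end-of-session match
    return ipsi_db - final_setting

sweep = {level: fatigue_at_juncture(level) for level in (40.0, 60.0, 80.0)}
for level, fatigue in sweep.items():
    print(f"ipsilateral {level:.0f} dB -> 'fatigue' {fatigue:.1f} dB")
```

Any rule in which the contralateral turn-down grows with ipsilateral intensity would give the same ordering. The linear form is used here only because it is the simplest.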

Fig. 10

The hypothetical moving-average contralateral stimulus settings (bottom) for a particular adjustment session in the perstimulatory stage of an SDLB run, for different across-run intensities (horizontal bottom scale) of a continuous perstimulatory ipsilateral stimulus. The middle and upper frames, respectively, show the consequent final contralateral and ipsilateral contributions to loudness at the end of each particular adjustment session

Hypotheses regarding the effect of the duty cycle of the contralateral stimulus

The hypothesized effect of an overall contralateral duty cycle of 100%

Identical contralateral and ipsilateral stimuli: The case of binaural stimulation

The overall contralateral duty cycle during the perstimulatory period (see Fig. 4) has been manipulated in the literature and has been shown to affect the degree of “fatigue.” The details are described below, along with how the model explains the outcomes.

Recall that the overall duty cycle of the contralateral stimulus can be made as much as 100%—that is, the contralateral stimulus can fill the entire perstimulatory period—even if its intensity profile is finely “chopped” into segments that are too short for individual adjustment by the listener (i.e., a fine duty cycle; Fig. 4). Consider now a contralateral stimulus whose overall duty cycle is 100% and whose fine duty cycle is also 100%: in other words, a truly continuous stimulus. Imagine also an identical ipsilateral stimulus—same frequency content, same phase, same intensity. Each ear would then equally affect the contribution-to-loudness of the other, by acting through their OCBs. Now, despite the contralateral stimulus being continuous, perstimulatory adjustment sessions could still be done; the listener need merely alter the contralateral stimulus intensity during the designated adjustment sessions. And with the binaural stimulus just described, the listener would not alter the contralateral stimulus intensity. That is, “fatigue” would be zero. Figure 11, on the extreme right-hand side, represents this case.

Fig. 11

Hypothetical moving-average contralateral stimulus settings (bottom) for a particular perstimulatory adjustment session across SDLB runs, for different across-run intensities of a continuous perstimulatory contralateral stimulus (numerals at the bottom, which are decibels relative to that of the continuous perstimulatory ipsilateral stimulus). The middle and upper frames, respectively, show the consequent final contralateral and ipsilateral contributions to loudness at the end of each adjustment session. These are equal (i.e., “fatigue” is zero) when the ipsilateral and contralateral stimuli have the same intensity (rightmost case, “0” on the bottom scale)

Nonidentical contralateral and ipsilateral stimuli

Suppose now that the experimenter sets the intensity of the continuous contralateral stimulus to be lower than that of the (otherwise identical) continuous ipsilateral stimulus. Nonetheless, the contralateral stimulus continuously “turns down the volume” at the ipsilateral ear. The ipsilateral ear’s effective intensity and, hence, its contribution to loudness are therefore continually diminishing. Notwithstanding, the ipsilateral ear still acts through the OCB to “turn the volume down” at the contralateral ear, but with diminishing effectiveness over time, thanks to the aforementioned decline in its own effective intensity.

In sum, then, even when the experimenter sets a continuous contralateral stimulus intensity lower than the continuous ipsilateral stimulus intensity, the ears act continuously and mutually on each other to produce less cumulative “volume turn-down” at the contralateral ear at any given perstimulatory moment than during the typical SDLB experiment involving contralateral stimuli that only appear during 5- to 20-s adjustment sessions. This bears repeating: There will be less “volume turn-down” at the contralateral ear, given a 100% overall contralateral duty cycle, than during the usual SDLB routine (Fig. 2 or 4).

Even so, perstimulatory adjustment sessions can be done; the listener merely alters the contralateral stimulus intensity during the designated adjustment periods. During those periods, however, and unlike the case for a noncontinuous contralateral stimulus (Fig. 7), the listener’s initial setting of the loudness-matching contralateral stimulus intensity will not need to be as high (to overcome “volume turn-down”), and the final stimulus-intensity setting will therefore not be as low, as in the usual SDLB routine. That is, there should be less “fatigue” than for a noncontinuous contralateral stimulus. And indeed, if the adjustment session itself is started at the beginning of the perstimulatory period and is extended to fill that period, then the degree of “fatigue” should decrease toward zero. This hypothesis is confirmed in the literature, as follows.

  • An overall contralateral duty cycle of 100%: Small and Minifie (1961)

Small and Minifie (1961, Fig. 3d) used 4-kHz ipsilateral and contralateral tones, with the contralateral tone intensity set equal to the ipsilateral tone intensity at the beginning of the perstimulatory period. Listeners were allowed to adjust the contralateral tone intensity throughout the entire perstimulatory period; there was an across-listener average “fatigue” of nearly 9 dB after nearly 6 min. This may seem profound, but it is small compared to the “fatigue” that Small and Minifie found when contralateral stimuli were not continuous: approximately 25 dB. According to the model, there should have been no “fatigue”; but the 9 dB found by Small and Minifie mimics the amount of attentional drift found by Mirabella et al. (1967). Indeed, there is a remarkable degree of possible error in loudness-matching of the two ears. This issue is important, and is pursued at length in the Discussion.

  • An overall contralateral duty cycle of 100%: Hood (1950)

Hood (1950) had listeners adaptively and continuously track the contralateral stimulus intensity required to match the loudness of an ipsilateral same-frequency tone (presumably 1 kHz at 80 dB SL, 6-min duration; see Hood, 1950, p. 26). Hood found no “fatigue” under these circumstances, as was noted by Wright (1960) in a clarifying summary of Hood’s account; the tracking fluctuations were evidently less than 5 dB (Hood, 1950, p. 28).

Wright (1960) notes yet another Hood (1950) experiment. A listener with sensorineural hearing loss at one ear used continuous adaptive tracking to loudness-match across the two ears. With the stimulus intensity in the unimpaired ear held constant (2 kHz, starting at 70 dB SL; Hood, 1950, p. 43), the listener adjusted the stimulus intensity at the impaired ear, and over the course of 6 min the intensity had to be raised by perhaps 10 dB or more. However, in different runs, the stimulus intensity in the impaired ear was held constant, and the listener adjusted the stimulus intensity at the unimpaired ear (2 kHz, starting at 70 dB SL; Hood, 1950, p. 42). In that case, the stimulus intensity had to be lowered over the course of 6 min, and by as much as 60 dB for some listeners.

Impairment is permanent “volume turn-down.” As such, Hood’s (1950) results with impaired ears can be explained qualitatively within the new model, as follows. Given that any impaired ear has a lower “volume setting” than an unimpaired ear, an identical stimulus to both ears would evoke a greater initial contribution-to-loudness from the unimpaired ear than from the impaired ear. The listener who performs loudness-matching must therefore compensate by increasing the intensity at the impaired ear, or by reducing the intensity at the unimpaired ear (whichever is allowed). It is unclear why Hood found greater decrease than increase. However, there can be great differences in performance between listeners, as will be discussed farther below.

  • Changing contralateral intensity across runs

Consider now what happens if the experimenter sets the contralateral stimulus to play continuously during the perstimulatory period within a single experimental run, but increases its intensity across experimental runs, so as to gradually approach the (constant) ipsilateral stimulus intensity from below. Suppose also that the listener still adjusts the contralateral intensity during the adjustment periods. What end-adjustment-period intensity will the listener choose? The contralateral ear will become more and more sensitive across runs, because increasing the contralateral intensity will make the ipsilateral stimulus less and less loud, counteracting its ability to “turn down the volume” at the contralateral ear. Consequently, for reasons explained above, less and less “fatigue” will be indicated at the end of any given perstimulatory adjustment session as compared to previous runs. Figure 11 illustrates the idea. Unfortunately, no single document could be found that effectively tests this implication, but the literature is vast, and such data may well exist.

The hypothesized effect of reducing the contralateral overall duty cycle

The contralateral overall duty cycle can be reduced from 100% by turning off the contralateral stimulus at regular intervals. Consider, for example, the 25% overall duty cycle in Fig. 4. Hypothetically, the progressive reduction of duty cycle will progressively reduce the cumulation of contralaterally-originating OCB “volume turn-down” at the ipsilateral ear. This will make ipsilateral stimuli louder, resulting in greater ipsilaterally-originating OCB “volume turn-down” at the contralateral ear.

During perstimulatory adjustment sessions, then, the louder ipsilateral stimulus and the lower sensitivity of the contralateral ear together encourage the listener to increase the contralateral stimulus intensity at the start of the adjustment session. Therefore, by the end of the adjustment session, as was explained earlier, the listener will set a lower final contralateral intensity setting. From one perstimulatory adjustment session to the next, the initial setting will get higher and the final setting will get lower (Fig. 7), producing greater “fatigue.”

If the overall contralateral duty cycle is reduced from one run to the next, the “fatigue” shown in a time-plot like Fig. 3 should increase, following the style of Fig. 10. And indeed, Small and Minifie (1961, Fig. 3) found notably more “fatigue” with contralateral overall duty cycles of 33% than with 66%, and found differences between cycles of 50% and 80%, but not for 16.7% versus 20% versus 33%, which suggests the degree of uncertainty in their data.

The hypothesized effect of the contralateral stimulus having a fine duty cycle

Consider now how “fatigue” should be affected when the contralateral stimulus has a fine duty cycle superimposed upon its overall duty cycle, but its overall duty cycle is less than 100%; indeed, the contralateral stimulus appears only during the adjustment sessions (as usual). That is, when the contralateral stimulus does appear, it is divided into a train of bursts that may last for seconds or only milliseconds, rather than minutes. Figure 4, for example, shows a fine duty cycle of 50% imposed upon a contralateral overall duty cycle of 25%. When the fine duty cycle is below 100% (100% = nonchopped stimulus), there will be less stimulus energy concentrated at a given waveform frequency. Consequently, less firing is evoked in the OCB that projects from the contralateral to the ipsilateral ear for that waveform frequency (the OCB is frequency-specific; see above). Hence, the ipsilateral ear experiences less frequency-specific “volume turn-down” than for a nonchopped contralateral stimulus; the lower the contralateral fine duty cycle, the less the ipsilateral “volume turn-down.” This is the same consequence that arises from reducing the overall contralateral duty cycle, with the same anticipated result: greater “fatigue.”

By the same reasoning, if the contralateral overall duty cycle is 100% (i.e., the contralateral stimulus fills the entire perstimulatory period), imposing on it a fine duty cycle should induce greater “fatigue,” where it would otherwise be zero or would decrease toward zero. The smaller the fine duty cycle, the greater the “fatigue.” Consider a perstimulatory contralateral stimulus that is a continuation of the string of brief tonal bursts appearing during the prestimulatory period, such that the ongoing string now has a “fine duty cycle.” That is, the contralateral stimulus’s overall perstimulatory duty cycle is 100% but its fine duty cycle is < 100%. Assume, further, that the individual bursts are too brief for loudness adjustments for each one. For across-ear loudness-matching, then, a different procedure is required. For example, Stokinger, Cooper, and Meissner (1972, Fig. 6) had their listeners match the loudness evoked by a series of 1-s 1-kHz bursts to that evoked by an ipsilateral continuous 1-kHz tone. But rather than using a smooth adjuster knob, the pip intensity was instead adjusted continuously using a staircase procedure, and the resulting intensities were averaged over 20-s epochs. All of this differs from traditional SDLB, but we may still attempt to explain their results.
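The epoch-averaging step described above can be sketched as follows. This is only an illustration: the staircase rule is not specified here, so the code covers just the averaging of successive settings over 20-s epochs. The function name `epoch_means`, the 80-dB starting level, and the 0.5-dB downward drift per burst are all hypothetical and are not taken from Stokinger, Cooper, and Meissner (1972).

```python
from collections import defaultdict


def epoch_means(samples, epoch_s=20.0):
    """Average (time_s, level_db) samples within consecutive epochs of epoch_s seconds."""
    bins = defaultdict(list)
    for t, level in samples:
        # Index of the epoch that contains time t.
        bins[int(t // epoch_s)].append(level)
    # Return (epoch start time, mean level) pairs in time order.
    return [(k * epoch_s, sum(v) / len(v)) for k, v in sorted(bins.items())]


# Hypothetical track (invented numbers): one setting every 2 s (1 s on, 1 s off),
# drifting downward by 0.5 dB per burst from an assumed 80 dB.
track = [(2.0 * i, 80.0 - 0.5 * i) for i in range(30)]

for start, mean in epoch_means(track):
    print(f"epoch starting at {start:5.1f} s: mean setting {mean:.2f} dB")
```

Averaging in this way smooths the burst-to-burst fluctuations of the track, so that the plotted course of “fatigue” reflects the trend across epochs rather than individual settings.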

Stokinger, Cooper, and Meissner (1972) found greater “fatigue” for contralateral stimuli of 1 s on, 1 s off (50% fine duty cycle) than for 1 s on, 5 s off (16 2/3% fine duty cycle). The Stokinger, Cooper, and Meissner results appear to contradict the notion that a lower fine duty cycle should result in greater “fatigue.” But there is a devil in the details. We might assume that 1 s of silence is enough time for ipsilateral-ear recovery from the “volume turn-down” effects of a 1-s contralateral stimulus. (Generally, psychophysical recovery times are assumed to equal or exceed stimulus-exposure times.) Hence, a fine duty cycle of 50% or less might be thought to always cause zero “fatigue,” when matching contralaterally-evoked and ipsilaterally-evoked loudness contributions over at least one cycle of the fine duty cycle. If so, then continuous loudness-matching, as was used in the Stokinger, Cooper, and Meissner staircase procedure, should reveal the ongoing effect of the ipsilateral stimulus in “turning down the volume” at the contralateral ear, resulting in the listener setting the contralateral stimulus to successively higher intensities than the ipsilateral intensity: in short, a “negative fatigue.” Remarkably, three out of 24 listeners did indeed show “negative fatigue” (Stokinger, Cooper, & Meissner, 1972, p. 606) by the end of the 5-min perstimulatory period. However, the magnitude was 5 dB, which is well within the apparent limits of error (see Discussion). The “negative fatigue” hardly differed by fine duty cycle, which is to be expected if recovery is complete for a fine duty cycle of 50% or less. Another nine listeners were classified by Stokinger, Cooper, and Meissner as “nonadapters,” showing anywhere from −5 to +5 dB of cumulative “fatigue,” and 12 were classified as “adapters,” showing average cumulative “fatigue” of more than 10 dB for the 16 2/3% fine duty cycle, and nearly 25 dB for the 50% fine duty cycle.

These vast differences in the degree of “fatigue” suggest serious problems with the conduct of the experiment. It would be too easy to blame the observed performance differences on across-listener anatomical differences, differences that seem an unlikely explanation, as compared to differences in [voluntary] behavior. In particular, the observed vast differences in “fatigue” imply that different listeners were effectively doing different tasks—or doing the same task under different strategies. Regardless, given the data’s uncertainty, we must consider an alternative hypothesis, namely, that recovery from OCB firing is slower than the “volume turn-down” evoked by that firing. If so, then the time-average of the firing evoked by contralateral stimuli in the OCB projecting to the ipsilateral ear, acting over one or more cycles of the contralateral stimulus’s fine duty cycle, would produce “volume turn-down” at the ipsilateral ear. The ipsilateral stimulus would hence seem less loud, and the listener would reduce the contralateral stimulus accordingly. We therefore have the scenario for “fatigue” that was explored above—with the exception that the progression of perstimulatory “fatigue” would not be quite as profound. Indeed, it would hypothetically be less for a smaller fine duty cycle, rather than more as hypothesized for a smaller overall duty cycle (see above). All this agrees with the most profound finding of Stokinger, Cooper, and Meissner (1972): namely, an average cumulative “fatigue” of more than 10 dB for the 16 2/3% fine duty cycle, and nearly 25 dB for the 50% fine duty cycle.
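For reference, the two fine duty cycles compared above follow directly from the on-times and off-times. The short sketch below assumes the usual definition of duty cycle as on-time divided by the sum of on-time and off-time; the function name `duty_cycle` is introduced here for illustration only.

```python
from fractions import Fraction


def duty_cycle(on_s, off_s):
    """Fraction of each on/off cycle during which the stimulus is on."""
    on = Fraction(on_s)
    off = Fraction(off_s)
    return on / (on + off)


# The two contralateral burst patterns discussed in the text.
for on_s, off_s in [(1, 1), (1, 5)]:
    dc = duty_cycle(on_s, off_s)
    print(f"{on_s} s on, {off_s} s off -> {float(dc) * 100:.1f}% fine duty cycle")
```

Exact fractions are used so that the 1-s-on, 5-s-off pattern comes out as exactly 1/6 (i.e., 16 2/3%) rather than as a rounded decimal.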

When the contralateral stimulus is a lone brief burst: The hypothesized effect of burst duration

It was noted above that stimuli only a few hundreds of milliseconds long can evoke “fast” OCB effects. Of course, the listener in SDLB cannot react fast enough to manipulate the loudness of a contralateral burst that only lasts a few hundred milliseconds. Therefore, if the burst is to constitute an adjustment session, the listener must have many such sessions, in order to alter the intensity of the bursts up or down. But if there is only one burst in each perstimulatory period, then adjustments must be done across contiguous experimental runs. Furthermore, as contralateral stimulus bursts are shortened, some burst duration should be reached below which there is no burst-evoked OCB activity, such that “fatigue” disappears altogether.

Stokinger, Cooper, and Meissner (1972) provided relevant experiments. Their “Experiment 1” used an ipsilateral tone of only 8 s, and a single contralateral comparison tone of only 2 s. Observed “fatigue” was 10.9 dB. But when the single contralateral comparison tone was reduced to 200 ms, “fatigue” was 5.0 dB, well within the limits of error (see Discussion). The Stokinger, Cooper, and Meissner results suggest that a lone 2-s stimulus is long enough to effectively “turn down the volume” at the opposite ear, but that a lone 200-ms stimulus is not. The latter conclusion is reinforced by Stokinger, Cooper, Meissner, and Jones (1972), who found that “fatigue” using lone contralateral comparison stimuli of 200 ms again did not exceed 5 dB. (Their ipsilateral stimuli covered four intensities, namely 30, 50, 80, or 100 dB SPL, at each of three tone frequencies, namely 0.25, 1, or 4 kHz.)

Hypotheses regarding a discontinuous ipsilateral stimulus

Hypotheses based upon the ipsilateral stimulus having a fine duty cycle when the contralateral stimulus does not

What happens if the ipsilateral stimulus, but not the contralateral one, fills the entire perstimulatory period (a 100% overall duty cycle), but has a fine duty cycle of less than 100%? This scenario is schematized in the upper part of Fig. 4. In such a case, the contralateral ear should suffer far less cumulative influence of the ipsilateral ear than under a smoothly continuous ipsilateral stimulus. Two factors contribute. First, the “chopped” ipsilateral stimulus offers less overall energy at its waveform (i.e., carrier wave) frequency, hence evoking less cumulative activity in the frequency-specific OCB that projects to the contralateral side. Secondly, and more importantly, a chopped ipsilateral stimulus allows interburst recovery time at the contralateral ear, which therefore suffers less “volume turn-down,” particularly as the ipsilateral stimulus’s off-time increases in duration relative to its on-time. Therefore, during any particular perstimulatory adjustment session, the listener will set a lower initial contralateral stimulus setting, and hence a higher final contralateral stimulus setting—that is, less “fatigue”—than during the classical case of a continuous ipsilateral stimulus. This difference increases as the ipsilateral stimulus’s duty cycle decreases. Graphically, this reverses the direction of the arrow in Fig. 9.

One study provides relevant data, namely Carterette (1955, Fig. 5), who used an ipsilateral stimulus of 90 dB SPL whose overall duty cycle was 100% but whose fine duty cycle was 50% (the situation shown in the upper part of Fig. 4). The contralateral stimulus appeared during 15-s adjustment sessions. The stimuli were not tones but white noise, of identical waveforms to both ears. Hence, the amount of energy in the stimulus at any given waveform frequency should not have mattered, but nonetheless a chopped ipsilateral stimulus would offer less cumulative energy for OCB activation than a continuous one. Carterette (1955) changed the number of times that silent intervals (interruptions) occurred per second in the ipsilateral stimulus; empirically, as the rate of interruption increased from 1 interruption per second (i.e., an on-duration of 500 ms) to 12.5 interruptions per second (i.e., an on-duration of 40 ms), so did “fatigue,” reaching 8.5 dB. Unfortunately, this number may lie within the limits of error (see Discussion). When the ipsilateral noise was uninterrupted, in contrast, and of 87 dB SPL, Carterette (1955) found about 3 dB more “fatigue” (his Fig. 3). Whether this 3-dB difference was significant is difficult to say. However, when the continuous ipsilateral noise was 100 dB SPL, the “fatigue” reached 19.5 dB. Altogether, then, there might well have been less “fatigue” when the ipsilateral stimulus was chopped than when it was continuous.
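The on-durations quoted above follow from the interruption rate and the 50% fine duty cycle. The sketch below assumes that each interruption cycle consists of one on-interval followed by one silent interval, so that the on-duration is the fine duty cycle multiplied by the cycle period; the function name `on_duration_ms` is hypothetical.

```python
def on_duration_ms(interruptions_per_s, fine_duty_cycle=0.5):
    """On-time per cycle, in ms, for a stimulus interrupted at a regular rate."""
    # One full on/off cycle per interruption.
    period_ms = 1000.0 / interruptions_per_s
    return fine_duty_cycle * period_ms


# The two extreme interruption rates cited in the text.
for rate in (1.0, 12.5):
    print(f"{rate} interruptions/s -> on-duration {on_duration_ms(rate):.0f} ms")
```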

The hypothesized effect of contralateral and ipsilateral stimuli both having “fine duty cycles” < 100% when the contralateral stimulus has an overall duty cycle < 100%

Imagine once again that the ipsilateral stimulus is ongoing (overall duty cycle of 100%), but is chopped, giving it a fine duty cycle. Consider now a contralateral stimulus having the same waveform frequency and fine duty cycle as the ipsilateral stimulus, such that the two stimuli coincide, except during relatively long gaps in the contralateral stimulus, such that its overall duty cycle is < 100%. Those gaps allow a cumulating net excess, over the perstimulatory period, of the voltage spikes heading toward the contralateral ear from the ipsilateral ear. Therefore, the accumulating “volume turn-down” at the contralateral ear due to the ipsilateral stimulus will exceed the accumulating “volume turn-down” at the ipsilateral ear due to the contralateral stimulus. This difference, for reasons explored above, should manifest as nonzero “fatigue.” The “fatigue” should increase as the overall duty cycle of the contralateral stimulus decreases (see above). If, conversely, the overall contralateral duty cycle is held constant at < 100% and the mutual ipsilateral/contralateral fine duty cycle is increased, overall “volume turn-down” at the contralateral ear at any perstimulatory juncture will increase, manifesting as greater “fatigue” (see above).

Unfortunately, no data could be found to test this hypothesis about change of the overall contralateral duty cycle, but the literature is vast, and such data may exist. The data certainly exist regarding what happens when the fine duty cycle is changed. Sergeant and Harris (1963) employed a 1-kHz ipsilateral tone having a 100% overall perstimulatory duty cycle, with a fine duty cycle superimposed upon it. The contralateral 1-kHz tone appeared in 10-s adjustment sessions, separated by 20 s of contralateral silence, an overall duty cycle of 33%. The contralateral stimulus had a fine duty cycle superimposed upon it, synchronized to the ipsilateral fine duty cycle. The mutual fine duty cycles were characterized by any combination of on-times and off-times (Fig. 4) of 0.1, 0.3, 1, 3, or 10 s. The latter is the adjustment-session duration, such that the contralateral and ipsilateral tones were continuous (i.e., fine duty cycle of 100%) during the adjustment session for that particular condition. Sergeant and Harris found that as off-times got shorter or as on-times got longer (in both cases, the fine duty cycle got larger), “fatigue” progressively increased from less than 5 dB (sometimes indistinguishable from zero) to as much as 20 dB. This agrees with the hypothesis in the preceding paragraph.
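The duty cycles in this design can be tabulated as a rough check on the direction of the effect. The sketch below again assumes that duty cycle is on-time divided by the sum of on-time and off-time. The 10-s value is left out of the grid, because the text treats that condition as continuous within the adjustment session. The names `duty_cycle` and `TIMES_S` are introduced here for illustration only.

```python
from fractions import Fraction

# On-times and off-times in seconds. The 10-s case is omitted because the text
# treats it as a continuous tone within the adjustment session.
TIMES_S = ["0.1", "0.3", "1", "3"]


def duty_cycle(on_s, off_s):
    """Fraction of each on/off cycle during which the stimulus is on."""
    on, off = Fraction(on_s), Fraction(off_s)
    return on / (on + off)


# Overall contralateral duty cycle: 10-s sessions separated by 20 s of silence.
overall = duty_cycle("10", "20")
print(f"overall contralateral duty cycle: {float(overall):.0%}")

# Fine duty cycle for each on-time (rows) and off-time (columns).
print("on\\off " + "".join(f"{off:>7}" for off in TIMES_S))
for on in TIMES_S:
    row = "".join(f"{float(duty_cycle(on, off)):>7.0%}" for off in TIMES_S)
    print(f"{on:>6} " + row)
```

Reading along a row (longer off-times) the fine duty cycle falls, and reading down a column (longer on-times) it rises, which is the ordering that the reported increase in “fatigue” is compared against.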

Hypotheses regarding changing the perstimulatory ipsilateral stimulus intensity for an adjustment session

How should “fatigue” change if the perstimulatory ipsilateral stimulus intensity is kept constant in-between adjustment sessions, but differs during those sessions? A decrease in ipsilateral stimulus intensity during an adjustment session will cause a decrease in the concurrent ipsilateral contribution to loudness, which the listener will match by reducing the contralateral contribution to loudness. The latter is achieved by reducing the initial contralateral stimulus intensity, resulting (by the end of the session) in greater “fatigue” than is usual (for the usual case, see above). Conversely, an increase in ipsilateral stimulus intensity for an adjustment session will be met by an increase in the initial contralateral stimulus intensity, resulting (by the end of the session) in less “fatigue.”

One study did, indeed, decrease the ipsilateral stimulus intensity for the adjustment sessions. Egan (1955b) sometimes decreased a 90 dB SPL 0.8-kHz ipsilateral tone during 20-s adjustment sessions (with overall contralateral duty cycle of 33%). During those sessions, the ipsilateral tone was sometimes 90 dB SPL, sometimes 80 dB SPL, and sometimes 70 dB SPL. Under this scheme, the maximum “fatigue” during an experimental run was 16.4 dB (Egan, 1955b, Table 3), as compared to 12.5 dB when the ipsilateral tone remained at 90 dB SPL (Egan, 1955b, Table 1). In short, decreasing the ipsilateral stimulus intensity for the duration of some of the adjustment sessions did seem, on average, to cause greater “fatigue,” although a change of about 4 dB might appear to be marginal (see Discussion).

Hypotheses regarding how “fatigue” depends upon the mutual waveform frequency of the ipsilateral and contralateral tones

The OCB connects to the outer hair cells (OHCs) at the opposite ear’s organ of Corti (shown in Fig. 5 as the “contralateral” ear). The OCB connections vary in density. At those frequencies at which the connections are densest, namely, mid-to-high frequencies, we may expect “volume turn-down,” and hence “fatigue,” to be greatest. (What constitutes “mid-to-high frequencies” will vary by species; remember that the anatomical and physiological studies are largely from nonhumans; Guinan, 2006; Nizami, 1999.) The data from Jerger (1957) support the hypothesis of frequency dependence. Jerger (1957, Fig. 5) plotted “fatigue” after five perstimulatory minutes, as a function of ipsilateral-tone frequency and sensation level, for ipsilateral tones of 0.125, 0.25, 0.5, 1, 2, 4, or 8 kHz. The lowest “fatigue” was found for 0.125 and 0.25 kHz, greater “fatigue” occurred for 0.5 kHz, and yet-greater “fatigue” was found for 1, 2, 4, and 8 kHz, with the greatest “fatigue” (as much as 30 dB) actually being recorded for 4 kHz.

Hypotheses regarding “fatigue” when ipsilateral and contralateral tones have different waveform frequencies

OCB neurons have “tuning curves,” as was noted above. Hence, perstimulatory “fatigue” should hypothetically occur even with a contralateral stimulus whose waveform frequency differs from that of the ipsilateral stimulus. That is, the closer the contralateral and ipsilateral waveform frequencies, the greater should be the “fatigue.” Thwing (1955) supplied a roundabout way to test this hypothesis. During the actual perstimulatory adjustment sessions, Thwing (1955) changed the waveform frequency of the ipsilateral tone, and used that changed frequency also for the contralateral tone. Nonetheless, ipsilateral tone intensity was kept constant. Now, hypothetically, the lesser the respective change in ipsilateral-tone frequency, the greater the effect of between-session ipsilateral-evoked “volume turn-down” at the contralateral ear, and consequently the greater the “fatigue” indicated by the listener for that particular session. Put differently, the closer the within-session tone frequency to the intersession tone frequency, the greater the “fatigue” should be. Such greater “fatigue” is what is schematized in Fig. 9. And Thwing shows just such plots (Thwing, 1955, Figs. 3 and 4) for perstimulatory periods involving an intersession ipsilateral 1-kHz tone and within-session tones of higher or lower frequency.

There is another, illuminating way to plot the Thwing (1955) data. The intensity of the within-session contralateral tone can be plotted as a function of the tone’s waveform frequency, when its contribution-to-loudness matches that of a same-frequency ipsilateral tone. The self-same plots should resemble the tuning curves of OCB efferents. And indeed they do. Figure 12 shows such plots for adjustment sessions that interrupt a perstimulatory 1-kHz ipsilateral tone (after Thwing, 1955). Each plot is labeled by a parameter, in minutes, that is not a waveform frequency, but rather the time at which contralateral adjustment sessions began within a perstimulatory period. Each plot’s tip represents the greatest “fatigue,” occurring when the contralateral and ipsilateral waveform frequencies during the adjustment session are the same as that of the ipsilateral stimulus during the rest of the perstimulatory period.

Fig. 12

“Fatigue” when the ipsilateral and contralateral tones have the same waveform frequency during the perstimulatory adjustment sessions, but the ipsilateral waveform frequency is a different (but constant) value between the adjustment sessions (see the text). The left vertical axis indicates the intensity of the perstimulatory contralateral tone whose evoked loudness is matched by the listener to that evoked by a same-frequency ipsilateral tone, during adjustment sessions that interrupt a perstimulatory 1-kHz ipsilateral tone (after Thwing, 1955). The left-hand scale covers 30 dB, and “fatigue” increases downward. The waveform frequency during the adjustment sessions is constant during an experimental run and is changed across runs. Each plot here is labeled by a parameter, that is not a waveform frequency but rather the perstimulatory time, in minutes, at which contralateral adjustment sessions began. That is, each plot represents a given adjustment session, labeled by when it started, across different experimental runs. Hence, a vertical line cutting the plots would represent a single experimental run. The dashed V-shaped line is provided for comparison. Its corresponding scale is on the right-hand side. The dashed line shows the threshold for firing of an OCB efferent in data from the cat, as a function of the waveform frequency of the OCB-stimulating pure tone (Liberman & Brown, 1986, Fig. 7, right-hand side, middle trace). The right-hand scale covers 50 dB, but is compressed into a 30-dB range here for illustration. Also, the tuning curve’s tip was originally at 10 kHz, but the curve is for a cat, and cats’ hearing ranges are well-known to be disposed toward much higher frequencies than those of humans, such that this particular efferent tuning curve might well be representative of those for 1 kHz in humans

There are other means of charting the frequency-dependence of the spread of “fatigue.” Unlike Thwing (1955), Egan (1955b) did not change the waveform frequency of the perstimulatory ipsilateral tone during the adjustment sessions. However, he did change the waveform frequency of the contralateral tone. Egan (1955b) found that the smaller the frequency difference between the ipsilateral and contralateral tones, the greater was the measured “fatigue” (ipsilateral tones of 0.8 kHz [Egan, 1955b, Tables 1, 3, and 4] or 2 kHz [Egan, 1955b, Table 2]). Indeed, Fraser et al. (1970) used the same approach as Egan (1955b), and found a lack of “fatigue” for what appear to be small frequency differences, namely, for a 1-kHz ipsilateral tone with a 1.2-kHz contralateral tone (Fraser et al., 1970, Figs. 1 and 4), for a 1.2-kHz ipsilateral tone with a 1-kHz contralateral tone (Fraser et al., 1970, Fig. 4), and for a 1-kHz ipsilateral tone with a 0.8-kHz contralateral tone (Fraser et al., 1970, Fig. 4). The latter results imply that “fatigue” is quite frequency-specific. In further support, Bray et al. (1973) found no “fatigue” outside the limits of error (those limits are discussed below) for a variety of ipsilateral-frequency/contralateral-frequency combinations (Bray et al., 1973, Figs. 3 and 4).

Discussion (1): What is the actual amount of “fatigue”? Centering and its problems

Many times above, it was noted that experimental results may not be as impressive as they seem, because of to-be-described uncertainties in the degree of “fatigue.” Those uncertainties will now be addressed. No small amount of sleuthing was required, because no single article on SDLB has yielded an adequate account of listener training and/or actual listener strategies (and difficulties). Likewise, no single article has yielded an adequate account of the experimenter-employed data-analysis methods (especially statistical analyses) and/or across-listener differences in “fatigue.” What details are available regarding stimuli employed, actual listener strategies, and across-listener differences in “fatigue” were therefore assembled for seventeen SDLB studies. Those studies are cited elsewhere in the present article and are compared here because, within each study, the ipsilateral and contralateral stimuli have the same waveform frequency composition. Table 1 lists the studies.

Altogether, the new model hypothesizes that “fatigue” depends on the closeness of the ipsilateral and contralateral waveform frequencies, and the literature supports this notion.

Centering: A definition

In SDLB, listeners are typically instructed to adjust the intensity of a stimulus at one ear until it seems as loud as the stimulus at the other. There is, however, a serious confound to SDLB, which is recognized in the literature but nonetheless reappears there. This confound is “centering.” The rest of this section describes centering, how some authors have tried to avoid it, and what difference it may make.

Egan (1955b, p. 111) explained the issue best (bracketed terms added):

Now the simultaneous dichotic loudness balance is in fact based upon a complex judgment. The same pure tone presented in phase to each ear [a common experimental strategy] is heard as a single tone and this phantom sound is localized at some position within the head of the listener. At low [waveform] frequencies the exact position of the phantom sound is a function of the phase relation and the relative intensities of the two tones. When the pure tones are low in frequency, the listener is forced to consider the localization of the sound in his attempt to make the tones equally loud. Even at high frequencies, radical changes in the intensity of the comparison stimulus shift the sound back and forth between the two ears, and the loudness match is most readily accomplished by centering the sound image.

Small (1963, p. 293) noted that all of this assumes that a centered “sound image” corresponds to equal contributions to loudness from both ears. This assumption has apparently been widely accepted. But what happens when, as in some experimental manipulations, a tone at one ear differs in frequency from a tone at the other ear? Egan (1955b, p. 112) stated that “If two pure tones, one in each ear, differ sufficiently in frequency (cps), then the listener localizes one tone in one ear and the other tone in the other ear.” That is, the listener’s task is loudness balance, because “centering a sound image” becomes impossible. However, for stimuli that do not “differ sufficiently in frequency”—for example, white noise from a single generator presented binaurally—then

The apparent movement between the ears of the phantom sound, which is induced by intensity changes in a dichotically presented noise is a most compelling phenomenon. Because of this fusion into a single migrating sound, it is in fact virtually impossible with an unfatigued ear to equate in loudness a noise in one ear to the loudness of a noise in the contra-lateral ear: the sound can only be localized in one ear or the other, or somewhere in auditory space. (Carterette, 1955, p. 104)

In this case, a separate loudness cannot be assigned to each ear; the “loudness balance” devolves to “centering a sound image” between the two ears (Carterette, 1955). This procedure is called “median-plane localization,” henceforth called “centering.” The way to avoid centering while using white-noise stimuli was explained by Egan and Thwing (1955, p. 1226): “When the fatiguing and the comparison stimuli are produced by separate noise generators . . . the listener localizes each noise at the appropriate ear. Under the latter circumstances, the judgment is more properly a loudness balance.”

When centering fails

Using separate noise generators for each ear, Egan and Thwing (1955) allegedly found less “fatigue” (they provided no numbers) than when the noises were 180 deg out of phase (in an effort to reduce centering) or when they were identical. Their results suggest that centering is not the same as loudness balancing, and further that centering exaggerates “fatigue.” These results seem crucial, yet the present author could find no published replications.

Centering proved problematic even when stimuli of the same frequency and phase were given to both ears, which is precisely when centering should be easiest. For example, Jerger (1957) noted that listeners had difficulty doing perstimulatory centering using tones of 2–8 kHz and identical phase to both ears, due to differences in “quality” between the contralateral tone and the ongoing ipsilateral tone. Listeners were therefore instructed to perform “loudness matches,” the supposed instruction in all SDLB, when “quality” differences appeared. “Quality” was also a problem for the listeners in Small and Minifie (1961), exposed to 4-kHz tones of identical phase to both ears:

The instructions to the listeners in the present study called for a “loudness balance.” After completion of the experiment listeners were questioned as to how they actually maintained a “balance.” Nearly all listeners reported that they discovered in the first orientation session that when the two tones were nearly identical in loudness they perceived a sound image within the head at the median plane. In subsequent sessions they tried to keep the sound image centered at the median plane. However, after the first minute or so of an adaptation [i.e., perstimulatory] run, they reported that the sound image became diffuse and consequently they were no longer able to use localization as a cue and relied completely on loudness judgments.

That is, as perstimulatory time passed, “The adapting [i.e., ipsilateral “fatiguing”] stimulus was characterized by listeners as dull, noiselike, and having little pitch quality” (Small & Minifie, 1961, p. 1031). This was noted elsewhere; Barratt and Hood (1984, p. 195) remarked on a “marked change in quality or ‘timbre’ of a sustained pure tone,” the tone becoming more “fuzzy” over the course of 5 min. Likewise, Canévet et al. (1985) reported increased “fuzziness” of a 1-kHz tone over 3 min. Cook (1986), however, found no consistent evidence of change in quality of a 2-kHz tone played at 20 dB SL for 1 min.

Such findings raise the confusing possibility that all listeners hearing stimuli of identical phase and frequency in SDLB may have started perstimulatory balances by performing centering, but may have finished perstimulatory balances by doing actual across-ear loudness-matching of stimuli that seemed distinct. This switch in experimental performance is rarely mentioned in the literature. In retrospect, this is not surprising; among psychophysicists, it is common knowledge that experimenters rarely ask listeners to describe how they actually perform listening tasks. Indeed, no mention of asking the listeners appears in the classic SDLB study of Hood (1950).

Of course, the story is not yet complete. One way to separate the loudness contributions from the two ears would be to use pulsed contralateral stimuli but smoothly continuous ipsilateral stimuli, thereby perceptually segregating the stimuli. Pulsing can be achieved by imposing a fine duty cycle. For example, Ahaus, Stokinger, and Wylde (1975) had their listeners adjust the intensities of 20-s segments of contralateral 1-kHz tones having “fine duty cycles” of 20%–80% and on-durations of 200–800 ms, presented 180 deg out-of-phase to continuous 1-kHz tones in the ipsilateral ear. In the prestimulatory period, listeners perceived a single “sound image”; however, during the perstimulatory period, in which the 1-kHz ipsilateral tone remained on continuously, the contralateral 1-kHz tones could be distinguished by their evoked loudness. “Fatigue” of as much as 30 dB was found, suggesting that actual loudness balances can produce that much “fatigue.” However, it is difficult to compare this “fatigue” to classic “fatigue,” due to the procedural differences from “classic” SDLB experiments, particularly the use of phase differences between the compared stimuli, which are well-known to be a potentially substantial auditory cue.

Discussion (2): Attempts to remedy the problems of centering, through presentation of ipsilateral and contralateral stimuli nonsimultaneously

Small (1963, p. 291) noted potential ways of eliminating possible confusion due to centering. One notion was nonsimultaneous presentation to the two ears, involving “delayed balance” and “alternate binaural loudness balance,” methods that Small favored:

Since the delayed balance method and the alternate binaural balance method employ stimuli that are in fact not presented simultaneously to the two ears, both have an important virtue. They are the only methods reported that measure loudness adaptation [i.e., “fatigue”] directly, that is, by loudness balances uncontaminated by localization effects.

“Directly” was Small’s own opinion, as will be seen. “Delayed balance” and “alternate binaural loudness balance” are defined as follows. In delayed balance, the contralateral stimulus is presented subsequently to the “fatiguing” ipsilateral stimulus, after the latter is turned off. For present purposes, the period in which the “fatiguing” ipsilateral stimulus appears will be called the perstimulatory period, as customary; the period in which the contralateral stimulus appears will be called the poststimulatory period. The contralateral intensity is adjusted by the listener or by the experimenter (see below), until the contralateral loudness equals that of the end of the ipsilateral stimulus, according to the listener’s memory. Regarding delayed balance, Stokinger and Studebaker (1968) declared outright an assumption that Small (1963) and others were apparently taking for granted, namely, that delayed balance involved “little if any interaction” between one ear and the other (Stokinger & Studebaker, 1968, p. 256). This assumption seems naïve, because the role of the OCB in “volume turn-down” could have been anticipated by that time from the contemporaneous literature on OCB anatomy and physiology (see the citations in Guinan, 2006; Nizami, 1999).

Recall Small’s (1963, p. 291) mention of another technique, alternate binaural loudness balance (Hood, 1950, p. 45). There, a long preceding “fatiguing” ipsilateral stimulus is absent, and the ipsilateral and contralateral stimuli are relatively brief and are alternated in time. The ipsilateral stimulus is usually kept at a constant intensity. In each ipsilateral–contralateral stimulus pair, the listener or the experimenter adjusts the intensity of the contralateral stimulus, until the contralateral and ipsilateral stimuli eventually seem equally loud to the listener.

Delayed balance: Single subsequent contralateral stimulus

What does the new model imply about the contralateral intensity at the end of any delayed balance? During the perstimulatory period, the ipsilateral stimulus has been continuously “turning down the volume” at the contralateral ear. Therefore, when the listener matches the loudness of the contralateral stimulus to the remembered loudness of the just-terminated ipsilateral stimulus, the listener must compensate for “volume turn-down” by setting a high contralateral stimulus intensity, indeed one that is higher than the ipsilateral stimulus intensity. That is, there should be “negative fatigue” (see above). But over poststimulatory time, with increasing delay of the contralateral stimulus after termination of the ipsilateral stimulus, the contralateral ear will be recovering from ipsilateral-induced “fatigue,” and therefore the contralateral stimulus intensity set by the listener should progressively decline toward the ipsilateral stimulus intensity. Of course, the aforementioned decline will not be as rapid if contralateral recovery slows down due to the ipsilateral stimulus being made longer or more intense and thereby providing greater “volume turn-down” at the contralateral ear.

Delayed balance may involve only a single poststimulatory contralateral stimulus, one that is too brief for adjustment by the listener. Consider Stokinger and Studebaker (1968). They used 1-s contralateral comparison tones, and the same contralateral and ipsilateral tone frequency (1 kHz). The experimenter adjusted the contralateral-tone intensity over successive experimental runs, according to the listener’s report of “too loud” or “too quiet,” until loudness was balanced. The 1-s comparison tone was presented either during the last second of the perstimulatory ipsilateral tone (“simultaneous balance”), or immediately afterward (“delayed balance”). The perstimulatory ipsilateral tone lasted for 30 s at 50 dB SPL, or 30 s at 80 dB SPL, or 2 min at 80 dB SPL. The “fatigue” found under the simultaneous balances was, respectively, 11.25, 14.00, or 13.00 dB, an irregular pattern that nonetheless has surprisingly little variation, suggesting altogether a constant effect. This is not to be expected for ipsilateral tones of such different intensity/duration combinations. However, under the delayed balances the “fatigue” found was, respectively, 3.67, 7.00, or 5.50 dB. The latter numbers suggest a truly remarkable rate of poststimulatory recovery, given that the delayed balances were done immediately after the perstimulatory ipsilateral tone, yet involved as much as 7.58 dB less “fatigue.” Indeed, the “fatigue” found under the delayed balances lies well within the presently presumed margins of error (see Discussion). Altogether, we might conclude that what Stokinger and Studebaker (1968) reported, whether “simultaneous” or “delayed,” were not loudness balances. What they actually were is unclear.
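The size of the apparent poststimulatory recovery can be checked directly from the values quoted above. The following minimal Python sketch simply subtracts each delayed-balance “fatigue” from the corresponding simultaneous-balance “fatigue”; the variable names and condition labels are illustrative choices made here, not terms from Stokinger and Studebaker (1968).

```python
# "Fatigue" values (dB) quoted in the text from Stokinger and Studebaker (1968).
# Condition labels are shorthand introduced here for illustration only.
conditions = ["30 s at 50 dB SPL", "30 s at 80 dB SPL", "2 min at 80 dB SPL"]
simultaneous = [11.25, 14.00, 13.00]
delayed = [3.67, 7.00, 5.50]

# Difference between the two balance types, rounded to suppress float noise.
differences = [round(s - d, 2) for s, d in zip(simultaneous, delayed)]

for label, diff in zip(conditions, differences):
    print(f"{label}: {diff:.2f} dB less 'fatigue' under delayed balance")

# Largest difference, and the total spread of the simultaneous-balance values.
largest_difference = max(differences)
spread_simultaneous = round(max(simultaneous) - min(simultaneous), 2)
```

The three differences are 7.58, 7.00, and 7.50 dB, whereas the simultaneous-balance values themselves span only 2.75 dB across the three intensity/duration combinations.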

Similar delayed-balance conditions were employed by Petty et al. (1970), including a 60 dB SL ipsilateral tone of 1 kHz, and a contralateral 1-kHz tone of 1-s duration presented immediately after the ipsilateral stimulus. The contralateral tone’s intensity was always determined by the experimenter in response to the listener’s reports. The ipsilateral tone duration within any experimental run was either 1, 5, 10, or 15 s. Petty et al. also used a scenario in which the 15-s ipsilateral tones followed a 7-min ipsilateral “conditioning” tone. Regardless, in all cases they found an across-listener average of just 2 dB of “fatigue,” consistent across listeners. This is unexpected, given that the contralateral tone immediately followed the ipsilateral tone; as described above, negative “fatigue” would have been expected. The same listeners had earlier completed a traditional SDLB experiment that revealed 10–20 dB of [positive] “fatigue.” In sum, the findings of Petty et al. (1970) suggest, once again, that delayed balance and SDLB do not address the same phenomenon.

Delayed balances and simultaneous balances were done by Stokinger, Cooper, Meissner, and Jones (1972), using ipsilateral and contralateral tones of identical waveform frequency (0.25, 1, or 4 kHz). Stokinger, Cooper, Meissner, and Jones used ipsilateral durations of 1–30 s and intensities of 30, 50, 80, or 100 dB SPL. In delayed balance, 200-ms contralateral tones appeared 400 ms after the ipsilateral tone. Stokinger, Cooper, Meissner, and Jones found negligible “fatigue.” They then allowed simultaneous balances, by moving the contralateral tone so that it ended at the same time as the ipsilateral tone. They found perhaps 5 dB of “fatigue.” Unfortunately, the latter result is not compelling, as it is small enough to be artifactual. Again, we may ask what task the listeners were actually performing.

All of the above findings on “delayed balance” are complicated by the usual practice of across-listener averaging of data, which can obscure negative “fatigue” experienced by some listeners (see above regarding Stokinger, Cooper, & Meissner, 1972).

Alternate binaural loudness balance: The ipsilateral and contralateral stimuli alternated

A second method of nonsimultaneous binaural stimulus presentation is alternate binaural loudness balance (Hood, 1950, p. 45). Here, the ipsilateral and contralateral stimuli are relatively brief and are alternated in time (although the labels of “ipsilateral” and “contralateral” for the ears remain the same). Hood, for example, used contralateral and ipsilateral tones of the same frequency and duration, 0.3 s. Egan and Thwing (1955, Fig. 1, “control”) used tones of 0.5 s. Hood did not describe who adjusted the contralateral stimulus intensity; Egan and Thwing allowed the listeners to do so, and in a related experiment, the experimenter in Petty et al. (1970) made the intensity adjustments based on indications given by the listener.

Given such methods, what “fatigue” should ensue? The tones used by Hood (1950) and by Egan and Thwing (1955) and others should have been long enough to activate OCB action. Hypothetically, then, each successive ipsilateral or contralateral stimulus will “turn down the volume” at the opposite ear. The contralateral tones will at first be adjusted to higher intensity than the ipsilateral tone, in order to counteract this effect; as the OCB activity evoked by the contralateral tones likewise “turns down the volume” at the ipsilateral ear, the contralateral intensity will be adjusted toward the ipsilateral intensity (which remains constant). That is, over the course of numerous adjustments, both ears will experience similar “volume turn-down,” hence both ears will similarly contribute to loudness in response to similar stimuli; hence, the stimuli at both ears will be set to the same intensity. But the ipsilateral intensity is kept constant; therefore, the contralateral intensity will be set to the ipsilateral value, representing no “fatigue.”

And that is what Hood found (Hood, 1950, p. 45) for stimuli of 1 or 2 kHz (the same frequency at both ears) and stimulus intensities of 80–100 dB SPL, each tone being 0.3 s long and succeeded by the same tone in the opposite ear after 0.6 s. Now, the latter gap might seem to give plenty of recovery time after each tone. Recall, however, that OCB effects do not disappear instantly, hence they cumulate. Nonetheless, Egan and Thwing (1955, Fig. 1, “control”) found equal intensities, within limits of error (i.e., no “fatigue”), for 1-kHz tones of 0.5 s, presented in immediate alternation (i.e., no recovery time) for a total of 15 s.

In one variation, Hood (1950, p. 46) imposed 1–5 min of same-frequency ipsilateral stimulation before the alternate binaural loudness balances, which were otherwise done as described above. In the new model, the listener should begin these poststimulatory loudness balances by initially setting the contralateral stimulus intensities higher than those of the alternating ipsilateral stimuli (which are still of constant intensity), in order to overcome the long-term “volume turn-down” induced at the contralateral ear by the long, preliminary ipsilateral stimulus. As the contralateral ear recovers from the “volume turn-down,” the listener should progressively decrease the contralateral stimulus intensity toward the (constant) ipsilateral stimulus intensity. That is, “fatigue” should start out negative and approach zero. Empirically, however, the poststimulatory contralateral stimulus intensities were always set equal to the ipsilateral stimulus intensity (Hood, 1950, Fig. 30). This unexpected lack of difference has two obvious explanations. Either the new model does not apply to such brief stimulus bursts as 0.3 s, for whatever reasons, or the listeners were not performing true loudness balances.

A slightly different task was described by Petty et al. (1970, Fig. 3). In this case, the duration of the “fatiguing” tone was changed across runs, from 0.5 s to as much as 7 min. In the poststimulatory balancing period, which was a single session, the experimenter adjusted the contralateral stimulus intensity up or down 11 times according to hand signs given by the listener, in order to bracket the loudness-balancing stimulus intensity. The actual poststimulatory loudness balances required the loudness of 1-s 1-kHz contralateral tones to be balanced to those of alternating same-frequency 15-s ipsilateral tones. Such tones could conceivably evoke considerable “volume turn-down” at the contralateral ear. Empirically, the balancing tone’s intensity did not stray from 3 dB or so below that of the “fatiguing” tone (Petty et al., 1970, Fig. 6, upper plot), well within the limits of error. In a variation of this scenario, the poststimulatory 1-s contralateral tone appeared during the last second of the 15-s ipsilateral tone (Petty et al., 1970, Fig. 4). Now, averaged “fatigue” of 5–9 dB was found (Petty et al., 1970, Fig. 6, lower plot). Unfortunately, these differences across methods are not definitive, adding further confusion to the issue.

Discussion (3): Variability in “fatigue” within any assigned task

“Fatigue” measurements are invariably presented as averages across listeners. However, averaging can paint a misleading picture, by obscuring individual differences in “fatigue.” Indeed, averages are used for illustrations precisely because they provide smoother data plots. This averaging involves a surprisingly common, but remarkably rarely mentioned, assumption, namely, that differences across listeners in any specific experimental context represent the randomness of samples drawn from a common underlying statistical distribution. Any systematic (and, by implication, nonrandom) differences across listeners are thereby concealed, and their possible sources are implicitly denied. Such sources are (1) genuine anatomical differences, leading to differences in sensations, and/or (2) inadequate training, amplifying individual behavioral idiosyncrasies in the laboratory, and/or (3) differences in attentiveness, due (e.g.) to unequal motivation, which is a separate issue from training. As will now be shown, across-listener differences in loudness balances are a persistent feature of the SDLB literature. The sheer size of the differences is such that, of the three potential causes mentioned above, Cause 1 seems irrelevant. Granted, we do not know just how much anatomical variation would be required to produce substantial variation in loudness adjustments in the types of experiments discussed here; but in view of the potential differences in performance due to behavior, it seems redundant to blame anatomical differences too.
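How an average can conceal systematic differences may be seen in a small numerical sketch. The listener labels and decibel values below are hypothetical, invented purely for illustration and not taken from any study cited here; positive values denote “fatigue” and negative values denote “negative fatigue.”

```python
from statistics import mean, stdev

# HYPOTHETICAL "fatigue" values (dB) for five imaginary listeners.
# These numbers are invented for illustration; they are not data.
fatigue_db = {"L1": 14.0, "L2": 12.0, "L3": -6.0, "L4": -5.0, "L5": 15.0}

values = list(fatigue_db.values())
average = mean(values)   # the single number that would usually be plotted
spread = stdev(values)   # sample standard deviation across listeners

# Listeners whose individual result has the opposite sign to the average.
hidden = [name for name, v in fatigue_db.items() if v * average < 0]

print(f"mean = {average:.1f} dB, SD = {spread:.1f} dB, opposite sign: {hidden}")
```

In this invented case the average is 6 dB of “fatigue,” although two of the five imaginary listeners show “negative fatigue,” and the standard deviation exceeds the average itself.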

Variability in perstimulatory loudness balances

Let us begin at the very beginning. Hood (1950) used various ipsilateral perstimulatory intensities. When he plotted “fatigue” versus those intensities, he found that “the fatigue increases with the intensity of the [ipsilateral] fatiguing tone at a rate which is different for each subject” (Hood, 1950, p. 33; italics supplied). Hood used identical waveform frequencies for the stimuli at each ear, 0.5, 1, or 2 kHz. Hood did not mention fusion of the contributions-to-loudness from the two ears into a single roving “image” between the ears (see above), but fusion seems inevitable. Hence, differences in ability to centralize sounds might have contributed to the individual differences observed by Hood (1950).

Centering is important beyond Hood (1950). The listeners of Carterette (1955) performed perstimulatory centering of identical white noises. When Carterette (1955) averaged “fatigue” across listeners, he found that the data’s standard deviation increased with “fatigue,” being as high as 5 dB after 7 min of ipsilateral noise (Carterette, 1955, Fig. 8). The actual average “fatigue” was 9.875 dB (using numbers from Carterette, 1955, p. 106), which was across listeners and across the different ipsilateral noise intensities employed (30, 60, 87, or 100 dB SPL). In Carterette (1955), then, the standard deviation could be half the size of the average.

Consider also Egan (1955b, Fig. 8), whose listeners performed perstimulatory centering of 80 dB SPL 0.8-kHz tones. Egan (1955b, p. 117) noted “large, consistent differences among listeners.” Egan (1955b, p. 117) showed an illustration of standard deviations in perstimulatory “fatigue,” which he had manipulated in an unexplained manner such that “The variability in the total variance of the scores caused by the large differences between listeners has been removed,” yielding “the estimated standard deviation [in ‘fatigue’] based upon a large number of runs made by the typical listener” (Egan, 1955b, p. 117). That manipulated standard deviation nonetheless increased with across-listener-averaged “fatigue,” being as high as 8 dB after 6 min of the ipsilateral tone for “fatigue” of 17.16 dB. In other words, once again the standard deviation is half the size of the average. Furthermore, that average concealed an across-listener range of “fatigue” of 6.3–32.4 dB.

Such a wide range of “fatigue” across listeners is not rare in SDLB. Thwing (1955, Fig. 7) found a broad range of perstimulatory “fatigue” across listeners when it was evaluated for each listener using ipsilateral and contralateral tones having a common waveform frequency that was not equal to that of the perstimulatory “fatiguing” tone, but was otherwise in-phase across the ears (Thwing, 1955, p. 742). As the frequency difference between those tones and the “fatiguing” tone shrank (with “fatigue” commensurately increasing), the range of “fatigue” from the least-“fatigued” listener to the most-“fatigued” listener was 39 dB. This is truly remarkable, and raises the question of whether the participants were all doing the same task.

In standard experiments, those in which the ipsilateral waveform’s frequency composition did not change during the adjustment sessions, strong across-listener differences in “fatigue” continued to be found. Carterette (1956) discovered that after six perstimulatory minutes, the mean “fatigue” across listeners and its standard deviation were 5.5 and 2.3 dB for noises of 40 dB SPL, 9.9 and 5.6 dB for noises of 70 dB SPL, 11.4 and 6.6 dB for noises of 90 dB SPL, 14.4 and 5.8 dB for noises of 100 dB SPL, and finally 16.3 and 7.0 dB for noises of 105 dB SPL. Yet again, the standard deviations approximated half the mean values. Likewise, in Jerger (1957), for tones of 0.125–8 kHz played for five perstimulatory minutes, the standard deviation of “fatigue” across listeners tended to be half the mean “fatigue” itself, such that the maximum mean “fatigue,” 25 dB, involved a standard deviation of about 12 dB.
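The claim that the standard deviations approximate half the means can be checked from the Carterette (1956) values quoted above. The following is a minimal sketch; the dictionary name and layout are choices made here for illustration.

```python
# Mean "fatigue" and standard deviation (dB) after six perstimulatory minutes,
# as quoted in the text from Carterette (1956), keyed by noise level (dB SPL).
carterette_1956 = {
    40: (5.5, 2.3),
    70: (9.9, 5.6),
    90: (11.4, 6.6),
    100: (14.4, 5.8),
    105: (16.3, 7.0),
}

# Ratio of standard deviation to mean for each noise level.
ratios = {level: sd / m for level, (m, sd) in carterette_1956.items()}

for level, ratio in ratios.items():
    print(f"{level} dB SPL: SD/mean = {ratio:.2f}")
```

The ratios fall between roughly 0.40 and 0.58, consistent with the description of the standard deviations as approximately half the means.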

Variability in loudness balances when “fatigue” should be nil

It is instructive to examine variability in loudness balances when “fatigue” should be nil. Bray et al. (1973) used both “classic” and “nonclassic” methods of SDLB for 3-s contralateral tones whose frequency differed from that of the ipsilateral tones. Under such circumstances, according to the new model, “fatigue” should have been negligible. Instead, there was substantial variability in the perstimulatory loudness-balanced intensities across listeners, showing across-listener standard deviations of as much as 6.5 dB in classic SDLB, and as much as 7.6 dB in “nonclassic” SDLB (Bray et al., 1973, Table 1). Similar across-listener variability appeared in similar contemporaneous work from the same laboratory (Dirks et al., 1974; Morgan & Dirks, 1973). Evidently, even when “fatigue” obtained through perstimulatory loudness balances should, in principle, have been zero, individual differences can be so high that “fatigue” is nonzero, even given the possibility of “negative fatigue” (defined above). Indeed, a net “fatigue” occurred because, evidently, tendencies toward “fatigue” numerically overpowered any tendencies toward “negative fatigue.” (Granted, performances can in principle vary across listeners without a net effect in any specific direction; but in the present author’s experience and reading, individual differences tend to be systematic, and so profound that one or two people will bias an average in one particular direction.)

Likewise, “fatigue” should hypothetically have been nil when 4-kHz tones were played simultaneously and in-phase at equal intensities to both ears for 6 min (Small & Minifie, 1961, Fig. 3d). But, empirically, 9 dB of across-listener average “fatigue” was found during the final loudness-balancing session. Nine decibels seems a remarkable amount of averaged drift, although one might be tempted to blame this drift on boredom. There should have been less boredom, however, during the prestimulatory loudness balances, during which “fatigue” should always hypothetically be zero. Furthermore, if there is jitter in “fatigue,” it should be minimized by using contralateral tones of different frequency from the ipsilateral tones. In fact, Bray et al. (1973) used such stimuli in both “classic” and “nonclassic” SDLB; the prestimulatory intensities set by the listeners had across-listener standard deviations of as much as 6.4 dB for “classic” SDLB, and as much as 5.7 dB for “nonclassic” SDLB (Bray et al., 1973, Table 1). Similar work from the same laboratory (Dirks et al., 1974; Morgan & Dirks, 1973) showed similar across-listener variability for prestimulatory loudness balances.

Altogether, then, across-listener average “fatigue” can be nonzero even when it should be nil. There is substantial across-listener variability, which raises the possibility of systematic differences across listeners, confounding this issue.

What uncertainty in “fatigue” means for attempts to mathematicize the present conceptual model, or any other

The present article presents a conceptual model of how the idiosyncrasies of SDLB derive from the underlying physiological events. Some readers might argue that the model needs greater specificity, especially in terms of the time course, to both fully test its fit to existing data and make specific predictions that could be used to further test the model in the future. Indeed, some might think that an algebraic version of the present model would meet such a desired degree of specificity. However, mathematical descriptions of time courses inevitably involve “integration times” (sometimes called “time constants”). Auditory psychophysics has revealed that such constants are only constant within narrow experimental paradigms, and are woefully variable across paradigms. In fact, their values can cover several orders of magnitude (de Boer, 1985). This conundrum has not been resolved. But without appropriate integration times, any mathematical model would involve sheer speculation. Presently, the relevant time constants might be inferred from the known properties of the olivocochlear bundle (OCB), as discussed above. But these are not known for humans, and we can only speculate what time constants inferred from animals would be appropriate in what role in a math model. Such integration times could perhaps be programmed as free parameters in a math model, and then “obtained” through curve-fitting. But curve-fitting is not measurement; math models can allow far too much flexibility of fit to evoke firm estimates of parameters. And, regardless of any assumed physiological time constants, the SDLB data are too diverse and insufficiently precise (see above) to allow the inference of psychoacoustic integration times.

It should be no surprise, then, that no convincing mathematical model of the processes underlying SDLB appears to have emerged. After all, a convincing set of concepts needs to arise first. The present author knows of only one attempt at a mathematical model (Norwich, 2010), which subsequently proved deficient (Nizami, 2015, 2017), partly due to its critical lack of an adequate conceptual base. Furthermore, an overriding concern has not yet been addressed: an important limitation of mathematical models generally. As Uttal (1998, p. 77) explained, in principle an infinite number of mathematical models might describe any set of data. Consequently, choosing the “correct” quantification, even when based upon a conceptual model and utilizing multiple sets of data from different circumstances, becomes moot.
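The non-uniqueness of mathematical descriptions can be made concrete with a standard textbook construction, sketched below. It is not Uttal's own argument, and the observation times and the base curve are hypothetical choices. Adding to any candidate curve a multiple c of a polynomial that vanishes at every observed time yields a rival curve that agrees with the first at every observation, yet differs from it everywhere else. Each value of c gives another such rival.

```python
import math

# Hypothetical observation times (arbitrary units); invented for illustration.
OBSERVED_TIMES = [1.0, 2.0, 3.0, 5.0]


def base_model(t):
    """An arbitrary smooth curve standing in for one candidate model."""
    return 20.0 * (1.0 - math.exp(-t / 2.0))


def rival_model(t, c):
    """Base model plus c times a polynomial that is zero at every observed time."""
    return base_model(t) + c * math.prod(t - ti for ti in OBSERVED_TIMES)


if __name__ == "__main__":
    for c in [0.0, 0.5, -2.0]:
        at_data = [round(rival_model(t, c), 3) for t in OBSERVED_TIMES]
        between = round(rival_model(4.0, c), 3)
        print(f"c = {c:4.1f}  at observed times: {at_data}  at t = 4: {between}")
```

Every rival reproduces the values at the observed times exactly, so those observations alone cannot favor one over another; only their predictions between and beyond the observations differ.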

In any case, a mathematical model is unnecessary to generate hypotheses that cannot presently be confirmed or denied through the existing literature; the present conceptual model already does so in one particular case, described above.

Overall summary

Simultaneous dichotic loudness balance (SDLB) was detailed by Hood (1950) as a method of measuring “perstimulatory fatigue” in a single ear, sometimes called “loudness adaptation.” A “fatiguing” stimulus composed of one or more waveform frequencies is presented to what is here called the ipsilateral ear. Simultaneously, a comparison stimulus, usually of the same frequency composition but whose intensity the listener can control, is presented to the opposite ear, here called contralateral. The listener adjusts the intensity of the contralateral stimulus until it seems as loud as the ipsilateral stimulus. As long as 20 s may be allowed for such a loudness balance. It begins with the contralateral stimulus being either completely inaudible or relatively weak, such that the listener must substantially raise its intensity. Of course, the experimenter can make the contralateral stimulus so brief (less than 5 s) that the listener cannot adjust its intensity while it is heard; those are the “nonclassic” SDLB experiments, and they too are discussed here.

When the contralateral ear has no stimulus presented to it, the ipsilateral ear does not seem to “fatigue” for tones more than 20 dB above their detection threshold. What, then, is the nature of the ipsilateral “fatigue” found during SDLB—that is, when the contralateral ear does experience stimuli? The new model provides answers. It depends upon the observed properties of a system of neurons called the olivocochlear bundle (OCB), which lies below the brain, physically and physiologically. Empirically, an ongoing tone at one ear evokes simultaneous firing, in the OCB, of neurons projecting to the opposite ear. That OCB firing “turns down the volume,” as if same-frequency tones presented to that opposite ear had decreased in intensity by as much as 24 dB. The “volume turn-down” probably also spreads to frequencies other than that of the ipsilateral stimulus. When the ipsilateral stimulus is turned off, there is recovery from OCB influence, but not immediately. All these results stem from animal studies, but there is also physiological and psychophysical evidence for OCB influence in humans, independently of SDLB experiments.

The new model of how “fatigue” occurs in SDLB is as follows. When the ipsilateral stimulus is played continuously, during the perstimulatory period of SDLB, it evokes likewise-continuous voltage-spike-firing in the OCB that projects to the contralateral ear. This firing “turns down the volume” at the contralateral ear. The perstimulatory period contains adjustment sessions during which the listener equates the contribution-to-loudness from each ear, but only by adjusting the contralateral stimulus intensity. But any contralateral stimulus will cause “volume turn-down” at the ipsilateral ear. This action will, in turn, momentarily reduce the ipsilateral ear’s contribution to overall loudness. It is the reduced ipsilateral contribution-to-loudness that the listener must effectively match by the end of each SDLB adjustment session. Presumably, this is achieved by reducing the contralateral stimulus intensity from an early peak to a final setting. No accounts of actual listener behavior have been published; nonetheless, the early peak-setting is a natural action by the listener. It is enhanced thanks to a custom within SDLB experiments: namely, that the mechanical attenuators turned by the listener to determine contralateral-ear stimulus intensity are typically reset to near-maximum attenuation by the experimenter between adjustment sessions. The listener must therefore quickly and profoundly increase the contralateral stimulus intensity when starting each adjustment session, hence inadvertently maximizing the “volume turn-down” at the ipsilateral ear. The listener must counter this by eventually setting, by the end of the adjustment session, a contralateral stimulus intensity that is lower than the constant ipsilateral stimulus intensity.

In between the adjustment sessions, the ipsilateral ear recovers from the “volume turn-down” induced there by the contralateral stimulus, while continuing to act through its own OCB to “turn down the volume” at the contralateral ear. Therefore, over successive adjustment sessions, the contralateral stimulus intensity at a session’s start must be set progressively higher, resulting in successively greater “volume turn-down” at the ipsilateral ear, and hence in successively lower intensities at final loudness matches. The latter is classically interpreted as ipsilateral “fatigue.”

The new model explains a number of other findings from SDLB. Consider the prestimulatory period, a comparison period that precedes the perstimulatory period and provides a baseline for assessing perstimulatory “fatigue.” Empirically, there is no “prestimulatory fatigue.” The prestimulatory period can be extended to entirely fill the experimental run, in which case there is still no “fatigue.” The model explains these findings, in ways that are detailed above. SDLB also contains a poststimulatory period, which follows the perstimulatory period. In the poststimulatory period, the ipsilateral stimulus is turned off except during adjustment sessions, during which the ipsilateral stimulus’s intensity is the same as in the perstimulatory period. The model explains why the listener’s final intensity setting at the contralateral ear will increase over poststimulatory time—classic SDLB “recovery.”

The perstimulatory period remains the period of greatest focus, because both the ipsilateral and contralateral stimuli have been empirically manipulated during this period. The results of those manipulations are as idiosyncratic as the manipulations themselves, but the following findings have been reported, and all can be explained by the model. First, a higher ipsilateral stimulus intensity during the perstimulatory period increases “fatigue” and causes slower poststimulatory recovery. Second, if the entire perstimulatory period is made into a single, long perstimulatory adjustment session, the degree of “fatigue” will decrease toward zero. Third, if the overall contralateral duty cycle is reduced across experimental runs, “fatigue” increases. Fourth, if the contralateral stimulus has a fine duty cycle (chopping), while the ipsilateral stimulus remains continuous, then “fatigue” is less for a smaller contralateral fine duty cycle, but more for a smaller contralateral overall duty cycle.

The results of yet other manipulations are explained by the model. Specifically, as contralateral stimulus bursts are shortened, some burst duration is reached below which there is no burst-evoked OCB activity, such that “fatigue” disappears altogether. A lone 2-s stimulus is long enough to effectively “turn down the volume” at the opposite ear, but a lone 200-ms stimulus is not. Furthermore, regarding stimulus chopping (fine duty cycle < 100%), there is less “fatigue” when the ipsilateral stimulus is chopped than when it is continuous. When the contralateral stimulus has a fine duty cycle superimposed upon it, and that chopping is synchronized to the ipsilateral fine duty cycle, then as the fine duty cycle gets longer “fatigue” progressively increases.

Also, the model explains why an increase in ipsilateral stimulus intensity during an adjustment session will result in less “fatigue,” a highly counterintuitive result. Finally, at those frequencies at which OCB connections are densest—namely, mid-to-high frequencies—“fatigue” is greatest. “Fatigue” also depends upon the closeness of the ipsilateral and contralateral waveform frequencies.

Final conclusions

Since 1950, there have been numerous Simultaneous Dichotic Loudness Balance studies. They have allegedly demonstrated that the loudness evoked by a stimulus to a single ear “fatigues” (adapts) over the course of a sustained auditory stimulus to that ear. However, a concurrent accumulation of other work has suggested that the loudness evoked by a monaural stimulus does not “fatigue,” unless that stimulus is presented within 20 dB of its absolute detection threshold. That is, in the jargon of SDLB, the “fatigued” ear does not seem to “fatigue” for tones over 20 dB SL, if the comparison ear (the ear opposite) receives no stimulus. How, then, is this phenomenon compatible with the supposed adaptation found through SDLB?

The present article postulates a new model of interaction between the two ears during SDLB. In brief, the model postulates “volume turn-down” at one ear being caused by a stimulus at the other. The model generates hypotheses that agree with a variety of effects found in SDLB. Within the model, those effects turn out to be artifacts of SDLB itself. This conclusion is disappointing, but it is hardly the first instance of an alleged perceptual phenomenon (here, loudness adaptation) apparently being an artifact of the experimental design (here, SDLB). Uttal (1998) has exposed, at some length, some other major artifacts in perceptual psychology.