It is estimated that more than half of the world’s population speaks at least two languages (Grosjean, 2010), with numbers steadily growing. This multilingualism means that listening to foreign-accented speech has become a standard listening situation in metropolitan areas. In foreign-accented speech, native listeners are confronted with pronunciations that deviate from their language standards (e.g., Dutch speakers pronouncing kettle instead of cattle or, for native Japanese speakers, something similar to flied lice instead of fried rice). How then do native listeners cope with foreign-accented speech? Is understanding of foreign-accented speech hindered only by large deviations from the intended pronunciation? Can inexperienced listeners adapt quickly to a new speaker? Furthermore, after adaptation to one nonnative speaker, is it then possible to understand another speaker with the same accent? The present study addresses these questions.

Variation in native speech

Although foreign accents add variation to speech, native speech contains considerable variability too. Most research about variation in speech in fact stems from the first language (L1) domain. Research on L1 speech shows that even when one phoneme in a word is changed arbitrarily, the word can still be recognized (e.g., Connine, Blasko, & Titone, 1993; Marslen-Wilson, 1993), at least as long as this change does not create a new word (Marslen-Wilson & Zwitserlood, 1989). However, the larger the phonemic deviation in a nonword is from the standard pronunciation of a word, the harder it is to recognize the word correctly (Connine et al., 1993). When words are changed in ways typical for natural speech, as in greem bench, where the word green is assimilated to what sounds like greem because the following context warrants nasal place assimilation, this does not prevent recognition of the intended word (e.g., Gaskell & Marslen-Wilson, 1996, 1998; Gow, 2002; Mitterer & Blomert, 2003). This effect does not occur, however, when the following context does not license place assimilation (Gaskell & Marslen-Wilson, 1996, 1998; Mitterer & Blomert, 2003). Reductions also form an interesting case for looking at deviation. Words that are frequently reduced in conversational speech can be recognized easily as their intended targets; hearing posman, for example, will facilitate recognition of unreduced postman (Ernestus, Baayen, & Schreuder, 2002; Mitterer & Ernestus, 2006). Together, these findings thus show that although the amount of deviation is important in recognizing word forms, the context in which the variation occurs has to be appropriate in order for successful recognition to occur. A foreign-accented speaker provides a natural context for deviant pronunciations that are consistent with that accent but, typically, not with the target language.

Frequency and familiarity effects

Frequency of occurrence is another important factor for recognition of deviant word forms. That is, variant forms are more likely to be recognized correctly when they are presented in a phonological context in which they frequently occur (such as certain reductions, including Dutch ['dam] for /'darɔm/ [therefore] [e.g., Ernestus et al., 2002; Mitterer & Ernestus, 2006], medial t-deletion, such as 'senner' or 'sennah' for English 'center' [Pitt, 2009], and vowel raising [Dahan, Drucker, & Scarborough, 2008]). For example, Pitt found that participants judged /t/-deleted variants as words only if the phonological environment in which the /t/-deletion occurred was common in production. A similar effect was found by Dahan et al., who manipulated English words ending in /g/ and /k/ such that they would contain either a raised vowel (like the [] in bag) or an unraised vowel (similar to the [æ] in back). Half of the words had contextually appropriate vowel raising, and the other half did not. Participants learned to understand the intended word and even learned to expect the incorrect pronunciations when new items were presented. Moreover, adaptation proved to be extremely rapid; just a few trials were necessary to get a (limited) effect of generalization.

Processing of words is also affected by the frequency of the variant. Words that are frequently pronounced without a schwa (e.g., corporate as corp’rate) were more likely to be judged as two-syllable words than were words with a low schwa-deletion rate (Connine et al., 1993; Connine, Ranbom, & Patterson, 2008). This influence of the distribution of variant representations has also been found in other languages, such as Dutch (Mitterer & Ernestus, 2006). Experiments on the nasal flap in English (found in gentle in American English) have also shown that lexical representations are stronger when people have more experience with this phenomenon (Ranbom & Connine, 2007). Reductions also form an interesting perspective on frequency of occurrence. Word-final /t/, for example, can be realized with an alveolar closure and an audible release or as a glottal stop without a release. English listeners were able to recognize both forms equally quickly in a semantic priming task; however, when there was a delay added to the task, participants did noticeably better with the typical variant (Sumner & Samuel, 2005). Degree of familiarity with foreign-accented pronunciations should thus influence how listeners process them. In the present study, we tested different types of familiarity effect: We first looked at the role of long-term experience with an accent on comprehension and then investigated how rapidly comprehension of an unfamiliar accent could improve when participants were briefly exposed to that accent immediately before testing. We thus asked what type of experience is needed to process foreign-accented speech correctly.

Sumner and Samuel (2009) looked at the role of long-term experience by studying dialectal variation (New York City English vs. General American English) and found that speakers of a New York City dialect that drops word-final –r (turning bak[] ‘baker’ into bak[]) could instantly interpret these dialectal forms as the intended word, whereas nondialectal speakers (speakers of General American) did not show such an effect. When tested again after a short (20- to 30-min) time lag, General American listeners had more trouble recognizing the dialectal forms than the standard forms, whereas for the New York City dialectal listeners, there was no difference. Thus, listeners who are familiar (passively or actively) with a dialect are apparently more flexible in form processing than inexperienced listeners are: Experienced listeners can deal with more variation, relative to the standard pronunciation, when listening to dialectal speech. A similar result was found with British English listeners who had moved to the United States; they learned to interpret correctly the medial flap (ɾ) in “todal” (/toɾal/) as /t/ (thereby recognizing the intended word total; Scott & Cutler, 1984). These listeners seemed to have adapted their perceptual system to American English standards.

Effects of accent strength in foreign-accented speech?

The present study focuses on foreign-accented speech rather than dialectal variations. In foreign-accented speech, speakers often replace target language sounds with native language sounds when the target speech sounds are not found in the speaker’s native language (e.g., cattle pronounced as kettle by Dutch speakers; Broersma & Cutler, 2011). When speech sounds are shared between the two languages, however, substitutions that involve a different category are usually not observed. The former types of alteration are often perceived as stronger accent markers than the latter. Thus, even within one language combination, there are words that can be strongly affected by foreign accents and words that are affected to a smaller extent. Here, we contrast the recognition of more strongly accented words with that of less strongly accented words.

Most research on foreign-accented speech has focused on its intelligibility (e.g., Bradlow & Bent, 2008; Derwing & Munro, 1997; Munro & Derwing, 1995a). In general, intelligibility increases as the strength of accent of a speaker decreases (Bent & Bradlow, 2003), and native listeners are known to benefit from more exposure to improve their understanding of low-intelligibility speakers, whereas this is not necessary for highly intelligible speakers (Bradlow & Bent, 2008). Exposure from multiple speakers is also beneficial for understanding foreign-accented speech (Sidaras, Alexander, & Nygaard, 2009). Native English listeners were familiarized with multiple speakers of Spanish-accented English or with native English control speakers. Participants were able to attain higher transcription accuracy on novel words and utterances after a familiarization phase with multiple speakers, as compared with no exposure at all. Furthermore, participants who received accented familiarization were better at recognizing some of the accented vowels than were participants with only native training. Which vowels participants were exposed to and how this affected learning of these vowels, however, was not manipulated systematically. Most of the studies on intelligibility have used tasks in which participants were required to make judgments about sentences that were not controlled for specific accent markers. The present study investigated specific accent markers and their effect on understanding accented speech. We examined the recognition of individual words that vary in perceived accentedness. The difference in perceived accentedness is mainly driven by vowels that deviate to a larger or smaller extent from the standard form. This perspective allowed us to investigate in greater detail what makes foreign-accented speech hard to understand and whether all types of variation lead to processing difficulty.

Familiarity with German-accented Dutch

As has already been noted, we were also interested in the effects of familiarity with a naturally occurring accent. In particular, we tested whether Dutch listeners who are either familiar or unfamiliar with German-accented Dutch are able to correctly interpret German-accented Dutch words. Research with second-language (L2) listeners suggests that recognition of familiar variant forms is possible (Weber, Broersma, & Aoyagi, 2011). In an experiment in which Dutch and Japanese participants listened to either Dutch-accented English or Japanese-accented English, the L2 listeners could recognize accented words easily when they were produced in their own accent (e.g., Dutch listeners could recognize Dutch-accented English words easily, and Japanese listeners Japanese-accented English words). But are participants also able to adapt to foreign-accented speech when the target language is their native language? In that case, listeners are constantly exposed to the native pronunciations from their fellow countrymen. Native Dutch speakers, for instance, will usually have far more experience with native Dutch than with German-accented Dutch. Can native listeners therefore easily understand only weakly accented words, and do they, in order to understand strongly accented words, need to attain a certain level of familiarity with the accent first?

Dutch listeners participated in three cross-modal priming experiments with Dutch as the target language, although spoken with a German accent. German-accented Dutch was chosen because native Dutch speakers are known to vary in how familiar they are with the accent. There are a substantial number of German students in the Netherlands, but they tend to study at Dutch universities close to the German border. Fewer German students are found in the center of the Netherlands. So it is possible to find Dutch listeners with either limited or extensive experience with German-accented Dutch.

A German accent in Dutch is particularly noticeable when it comes to vowels, and these vowels produce an excellent starting point for looking at effects of degree of accentedness. We therefore chose words with two particular Dutch diphthongs. Although both Dutch and German are Germanic languages, their vowel systems differ in a number of ways. Both languages have a large vowel inventory: Dutch has 13 monophthongs and 3 diphthongs (Gussenhoven, 1999); German has 12 monophthongs and 3 diphthongs (Kohler, 1999). While there is some overlap between the monophthongs, the diphthongs vary more across the two languages. Dutch has the three diphthongs //, /œy/, and /ʌu/, while German has /a/, //, and //. The two Dutch diphthongs that were the focus of this study, [œy] and [], are thus not part of the German vowel inventory. Both diphthongs are difficult for many learners of Dutch (Doeleman, 1998), but in particular, /œy/ is a rare sound across languages and poses great difficulties for L2 learners. The Dutch [œy]-vowel was replaced with the German [] by the speaker of this experiment, and the Dutch [] was replaced with the German [a]. Acoustically, the trajectories of [] and [œy] deviate more than those of [] and [a]. This was confirmed for the speaker of our experiment with acoustical analyses (see the Method section for Experiment 1). German learners of Dutch are usually well aware of the large deviation between their pronunciation and the intended Dutch [œy]-vowel, while they are often oblivious to the smaller deviation between their [a] pronunciation and the intended Dutch []. In this study, Dutch words with [œy] were considered to be strongly accented, and Dutch words with [] as medium accented. There was a third set of words without any segmental substitutions. These items contained only phonemes shared between Dutch and German but were, nonetheless, spoken by the same nonnative speaker. These words were therefore considered to be weakly accented. The three different strengths of perceived accentedness were confirmed in a rating study (see the Method section for Experiment 1).

In summary, the present study had three goals. First, in Experiment 1, we investigated whether foreign-accented speech (in this study, German-accented Dutch) can be understood by native (Dutch) listeners with limited previous exposure and contrasted their results with those of native listeners who were already highly familiar with the accent. Second, we asked whether effects of familiarity depend on how strongly accented the stimulus words were. Third, in two subsequent experiments, we examined how short-term training on an accent influences word recognition, again as a function of strength of accent (Experiment 2), but also as a function of speaker (Experiment 3). The results of these experiments will be related to models of spoken-word recognition and the accounts they offer for how listeners cope with pronunciation variation—those based on representation (e.g., Goldinger, 1998; Johnson, 2006; Pierrehumbert, 2001; Ranbom & Connine, 2007) and those based on processing (e.g., Gaskell & Marslen-Wilson, 1998; Lotto & Holt, 2006; Mitterer, Csépe, Honbolygo, & Blomert, 2006).

Experiment 1

Experiment 1 was designed to test whether familiarity with a foreign accent influences adaptation to that accent, as measured in terms of ease of word recognition. We compared Dutch listeners with limited experience with German-accented Dutch with listeners with extensive long-term experience with German-accented Dutch. Participants first listened to German-accented Dutch primes and then judged whether target words that appeared on a screen were Dutch words or nonwords. Reaction times (RTs) to a target word are known to be shorter when the auditory prime and the visually presented target word are identical than when the auditory prime is unrelated (see, e.g., Clarke & Garrett, 2004; Marslen-Wilson, Nix, & Gaskell, 1995; McQueen, Cutler, & Norris, 2006). Phonologically similar but nonidentical auditory primes generally do not produce significant facilitatory priming and, sometimes, even produce inhibitory priming (e.g., Van Alphen & McQueen, 2006). We will therefore take statistically significant facilitatory priming as our measure of successful online word recognition (Marslen-Wilson, Moss, & van Halen, 1996; Marslen-Wilson & Zwitserlood, 1989) and, hence, of successful adaptation to the accent. Facilitatory priming of responses to visual target words after auditory primes produced with a foreign accent will thus be taken to suggest that listeners recognized the accented primes online as being the same Dutch words as the visual targets.

We expected that recognition of the variant forms would be more successful when the deviation from the standard was smaller (e.g., that we could observe significant priming for the weakly accented words, but not for the strongly accented words) and that the more experience people had had with an accent, the easier it would be for them to adapt to it. Adaptation was measured separately for the first and second halves of the experiment. It was possible that less experienced listeners would not show priming in the first half but would in the second half, after having had time to adapt to the accent. Given previous findings with native speech (e.g., Marslen-Wilson & Zwitserlood, 1989), it was predicted that Dutch listeners could interpret the weakly accented words correctly, regardless of their previous experience with the German accent. Furthermore, we expected that participants with limited previous exposure to German-accented Dutch would have trouble understanding medium and strongly accented words, whereas participants with extensive experience with the accent would have no problems understanding such words.

Method

Participants

Two groups of participants were tested. The limited-experience group (n = 23, 19 females, mean age = 20.41 years), all native speakers of Dutch, was tested in Utrecht, a city in the middle of the Netherlands. These participants were recruited from the Utrecht University participant pool; the vast majority studied at Utrecht University. The extensive-experience group also consisted of 21 native speakers of Dutch (19 females, mean age = 22.97 years) and was tested in Nijmegen, a city in the east of the Netherlands, near the German border. These participants were recruited from the MPI participant pool; the majority studied at the Radboud University Nijmegen.

There are many more German students enrolled at the Radboud University Nijmegen (approximately 8 % of the students in 2011 were German) than at Utrecht University (approximately 1 % German students in 2011). As such, it could be expected that Dutch students in Nijmegen are, in general, more frequently exposed to German-accented Dutch than students in Utrecht are. But students in Utrecht can happen to have German friends or family with whom they communicate in Dutch, and students in Nijmegen can happen to major in a subject where only a few German students are enrolled. In order to control more closely for the amount of prior experience with German-accented Dutch, a language history questionnaire was administered. One of the questions asked how often participants heard German-accented Dutch (possible answers: never, less than once a week, once a week, and multiple times a week). Another question asked from how many speakers participants heard German-accented Dutch (possible answers: 0–1, 2–5, 6–10, and more than 10). Only Utrecht-based students who reported hearing German-accented Dutch less than once a week from fewer than two speakers were included in the limited-experience group, and only Nijmegen-based participants who reported hearing German-accented Dutch multiple times a week from more than two speakers were included in the extensive-experience group. Because the questionnaire was administered after the main experiment (to avoid a strong focus on German), a number of additional participants were tested but were not included in the analysis, because they did not meet these criteria (14 participants in Utrecht, 10 participants in Nijmegen).

In addition to their exposure to German-accented Dutch, we also asked participants about their knowledge of German. In The Netherlands, all students in upper educational levels have to take German language courses for at least 3 years and will, thus, have some knowledge of German. Since the educational programs for German are very similar across the country, the knowledge acquired in school should be comparable for our listener groups. Indeed, none of the participants in Experiment 1 studied German, none reported being fluent in German, and German was always reported to be either their second or the third nonnative language.

All participants volunteered and were paid a small fee for participating. None reported a hearing disorder, and all had normal or corrected-to-normal vision. They all reported that English was the first nonnative language they had learned, usually starting in school around the age of 10.

Materials

There were 48 critical items and 96 fillers. Each critical item was a combination of two auditory prime words and a visual target word. The targets were Dutch mono- or bisyllabic words (38 nouns, 7 adjectives, 2 adverbs, and 1 pronoun), and their corresponding primes were either the German-accented variants of the targets (identical primes) or phonologically and semantically unrelated Dutch words (unrelated primes). The 48 identical primes comprised 12 strongly accented words, 12 medium accented words, and 24 weakly accented words. Strongly accented primes were words with the Dutch vowel [œy] as in huis, ‘house.’ This diphthong is not part of the German phoneme inventory (see, e.g., Kohler, 1999), and German learners of Dutch mostly substitute it with [], a German diphthong that is perceptually and acoustically quite different from Dutch [œy] (Dutch [œy] starts front-central, half open, and ends front-central, near close; German [] starts back-central, half closed, and ends front-central, near open).

Medium accented prime words contained the Dutch diphthong [] as in lijst, ‘list’; this diphthong also does not exist in German (Kohler, 1999) and is usually substituted with German [a] by German speakers of Dutch. The diphthong [a] is not present in the Dutch phonemic inventory (Gussenhoven, 1999) but is phonetically relatively similar to the Dutch diphthong []. German [aɪ] begins central, half open, and ends front, close-mid; Dutch [] begins front, open-mid, and ends front, close-mid. Both vowels thus end in approximately the same place. We ensured that the remaining sounds (consonants and other vowels) of the strongly and medium accented words did not contain other obvious segmental substitutions by using only phonemes that were shared between the languages. Thus, except for the investigated diphthongs, all sounds were part of the Dutch and German phoneme inventories.

To ensure that the experimental words could not be interpreted as other existing Dutch words, we asked 10 additional native Dutch participants who did not hear German-accented Dutch multiple times a week to transcribe all 48 experimental words. Incorrect transcriptions were given, by a maximum of 4 participants, to only four words (two strongly accented, one medium accented, and one weakly accented). The incorrect transcriptions included one existing Dutch word (one word, and by only 1 participant), loan words (two words), and a name (one word). All other words were transcribed by all participants as the intended words.

The weakly accented words contained only vowels and consonants that are present in both the Dutch and German phonemic inventory, such as [] and [ɛ] vowels and [m], [ŋ], and [b]. We thus minimized segmental variation and strength of perceived accentedness. The same subset of additional vowels and consonants that was used for the strongly and medium accented words was used for the weakly accented words. To the extent that it can be expected that these sounds contribute to the perceived accentedness of words, they should do so to a comparable extent for all three accent types. The 48 unrelated primes matched, overall, in number of phonemes with their corresponding targets (e.g., ‘ketting’ [chain] and ‘prikkel’ [incentive]), and the overall lexical lemma frequency of unrelated primes was not different from the frequency of the targets (log frequency taken from the CELEX database [Baayen, Piepenbrock, & Van Rijn, 1993]), t(48) = 0.082, p = .935. For a complete list of critical items, see Appendix 1.

Of the 96 filler items, 24 had a Dutch word as the visual target. The remaining 72 fillers all had a nonword as their visual target. Eighteen of these items contained [] or [a] in the prime, so that not every [] or [a] prime would predict a yes response. Therefore, the overall ratio of words and nonwords for the visual targets was 1:1, resulting in 50 % yes responses for errorless participants. Half of these 72 filler items were preceded by nonword primes; the others were preceded by existing Dutch word primes. Both the word and nonword auditory primes could contain the [] and [a] vowels, again to ensure that participants could not form a response strategy based on the presence of these vowels in the items.

Speaker selection

Seven native speakers of German were recorded while reading a short Dutch text. Speakers differed in their level of proficiency and time spent in the Netherlands. These recordings were made to find a learner of Dutch who would produce the requested mispronunciations spontaneously and consistently. The chosen speaker was a male native speaker of German, who grew up in Bavaria, in the south of Germany. At the time of recording, he had lived in the Netherlands for 2 years while studying in Dutch at the Radboud University Nijmegen. The speaker was quite fluent in Dutch and knew the meaning of almost all Dutch words used in the experiment.

The Dutch word and nonword primes were recorded in pseudorandomized order from a list of items given in their correct Dutch spelling. The speaker was not instructed to change his pronunciation, so all mispronunciations occurred naturally. The speaker produced the primes one by one, separated by a pause, in a clear citation style, recording each prime at least two times. The recordings were made in a sound-attenuated booth with a Sennheiser microphone and were stored directly onto a computer at a sample rate of 44 kHz. Primes were excised from the recording using the speech editor Praat (Boersma & Weenink, 2009), and the best tokens were selected by the first author, a native speaker of Dutch.

Acoustic measurements

We recorded the complete vowel space for our speaker both in Dutch and in German by having him pronounce all Dutch and German vowels separately in an hVba-context. This context was chosen because it minimizes influences of other segments on the vowels (Jenkins et al., 1997). All words were recorded at least twice per vowel in the speaker’s natural accent, in clear citation, from correct Dutch or German spelling (e.g., the investigated Dutch vowels were written as huiba and hijba, the German vowels as Heuba and Heiba). These recordings were made in the same session as the recordings of the auditory primes. The best two tokens per vowel were selected by a native speaker of Dutch (the first author). For the critical German and Dutch diphthongs, we measured the first and second formants at the 25 and 75 percent points of the vowels. Figure 1 plots the average values for the first two formants for the Dutch diphthongs [œy] and [], as well as the German diphthongs [] and [a] as produced by the speaker of the experiment, as well as the diphthongs [œy] and [] spoken by Dutch native speakers (data taken from Adank, Van Hout, & Smits, 2004).

Fig. 1
figure 1

F1 and F2 formant values of the two critical diphthongs, as pronounced by the experimental speaker (German-accented Dutch in solid lines, native German in bold dashed lines), as compared with Dutch diphthongs from the Adank et al. (2004) corpus (thin dashed lines)

When looking at the Dutch // in Fig. 1, it can be seen that our speaker’s F1 value is higher than the average Dutch speaker’s F1, but the trajectories are quite similar. In comparison, our speaker’s F2 is higher than the average Dutch speaker’s F2 for Dutch /œy/, and the trajectories are furthermore quite different. As can also be seen in Fig. 1, our speaker’s Dutch // and his German /a/ are quite similar, as well as his Dutch /œy/ and German //. This supports the notion that our speaker substituted the Dutch diphthongs with existing German categories.

Rating study

To further ensure that the three types of identical primes were indeed perceived as varying in accent strength, 20 native Dutch speakers who did not participate in the priming study rated the items. We used all strongly accented [œy]-primes, all medium accented []-primes, and half of the weakly accented primes, so that there were 12 items of each type. We recorded two more sets of 36 items from two additional speakers: a native speaker of Dutch and a native speaker of Italian with a very strong accent in Dutch. The reason for adding these two speakers was to add more variation to the materials, thereby avoiding artificial inflation of a perceived difference between prime types for the German speaker. During the rating study, participants heard one word at a time over closed headphones, immediately followed by a rating scale where they could indicate how accented the word was on a scale ranging from 1 (not accented) to 10 (very strongly accented).

The data were analyzed with paired-samples t-tests that indicated that, indeed, the strongly accented [œy]-items (M = 7.98, SD = 1.18) were rated as more accented than the weakly accented items (M = 4.73, SD = 1.67), t(19) = 9.443, p < .001, and the medium accented []-items (M = 7.01, SD = 1.22), t(19) = 7.223, p < .001. Furthermore, the medium accented []-items were rated as more accented than the weakly accented items, t(19) = 7.576, p < .001. These results thus confirm the picture emerging from the acoustic measures: The strongly accented words deviated most from the standard pronunciations and were, indeed, rated as more strongly accented than the medium accented items.

Design and procedure of priming experiment

Participants were seated in a sound-attenuated booth and were informed that they would first hear a Dutch word or nonword spoken by a German-accented speaker and then see a Dutch word or nonword on the screen. Two versions of the cross-modal priming experiment were created, so that every participant saw each visual target only once. To control for effects of presentation order, each participant received a different pseudorandomized list. The first two items of the experiment were always fillers, and there were never two critical items in a row.

The participants’ task was to decide as quickly and accurately as possible whether the word presented on the screen was an existing Dutch word or not. Participants responded by pushing one of two buttons on a button box in front of them. Yes responses were always made with the dominant hand, and RTs were measured from visual target onset.

Auditory primes were presented binaurally over closed headphones at a comfortable listening level. Participants saw the visual targets on a computer screen situated about 50 cm in front of them. Visual targets were presented in white lowercase 24-point Tahoma letters on a black background, 500 ms after the acoustic offset of the auditory primes. The visual targets stayed on the screen for 2,000 ms, after which the next trial started. The experiment was created in Presentation (version 13, Neurobehavioural Systems Inc.) and was controlled with NESU hardware (Nijmegen Experiment Set-Up). After the cross-modal priming experiment, participants were asked to fill out the language history questionnaire.

Results

Three items with weakly accented primes were discarded from the analysis because they had high error rates (more than 10 %). A possible reason for these error rates is that all three of these targets had a very low lexical frequency (for the exact items, see Appendix 1). These items were also excluded from analyses of all other experiments described here.

The remaining cross-modal priming data were analyzed with general linear model repeated measure analyses of variance (ANOVAs) using a 3 (accent type: strong, medium, or weak) × 2 (priming: identical vs. unrelated) × 2 (half: first or second half of the experiment) design. All of these were within–participants factors. We analyzed the results separately by participants (F 1) and items (F 2). In addition to these analyses, we conducted planned comparisons using paired sample t-tests to look at the priming effects. These were calculated separately by half and accent type.

Limited-experience group

In addition to the three weakly accented items, a further 1.8 % of trials on which participants made errors, as well as trials with RTs that deviated more than 2.5 SDs from the condition’s overall mean, were discarded (together, <5 % of all trials). Errors were distributed evenly across the conditions (see Appendix 2) and, due to their low overall occurrence, will not be analyzed statistically.

As is shown in Fig. 2 (calculated priming effects) and Appendix 2 (mean RTs), participants with limited experience were faster to respond to identical than to unrelated trials, F 1(1, 22) = 49.028, p < .001; F 2(1, 11) = 86.190, p < .001. A main effect of accent type across participants furthermore indicated that participants reacted to target words with different speeds, F 1 (2, 44) = 3.876, p = .028; F 2(2, 22) = 1.100, p = .350. Participants also got faster, overall, during the course of the experiment, F 1(1, 22) = 7.536, p = .019; F 2(1, 11) = 7.267, p = .013.

Fig. 2
figure 2

Experiment 1: Priming effects and confidence intervals for participants with limited experience by experiment halves and accent type

The F 1 analysis showed a significant interaction across participants between accent type and priming, F 1(2, 44) = 4.580, p = .016; F 2(2, 22) = 2.541, p = .102, reflecting that participants did not show equal priming effects for all accent types. This was investigated further using planned pairwise comparisons (see Table 1). The interaction between half and priming was not significant, F 1 < 1, F 2 < 1, nor was the three-way interaction, F 1 < 1, F 2 < 1. The pairwise comparisons show that participants had no problems with adapting to the weakly and medium accented items but could not adapt to the strongly accented items during the experiment.

Table 1 Planned pairwise comparisons of priming effects for all accent types across participants and items for participants with limited experience with German-accented Dutch

Extensive-experience group

Trials on which participants made errors (2.0 %) were excluded from the analyses. In addition, we excluded trials on which the RTs deviated more than 2.5 SDs from the condition’s overall mean (together, <5 % of all trials). Inspection of the errors showed no specific patterns.

As is shown in Fig. 3 (calculated priming effects) and Appendix 2 (mean RTs), participants with extensive experience with German-accented Dutch were faster to respond to identical than to unrelated items, F 1(1, 20) = 81.727, p < .001; F 2(1, 11) = 29.828, p < .001, and responded similarly to the three different accent types, F 1 < 1; F 2 < 1. Participants were faster in the second half of the experiment than in the first, F 1(1, 20) = 5.301, p = .033; F 2(1, 11) = 8.957, p = .014. Across participants, there was a marginally significant interaction between accent type and priming, F 1(2, 40) = 3.073, p = .058; F 2 < 1, reflecting that priming effects differed in size for the accent types. The remaining interactions did not reach significance.

Fig. 3
figure 3

Experiment 1: Priming effects and confidence intervals for participants with extensive experience by experiment halves and accent type

Planned pairwise comparisons were used to look at the priming effects for each accent type separately. Table 2 displays the statistical analyses of the priming effects across participants and items. Participants with extensive experience showed significant priming from the start of the experiment for all accent types, thereby showing that they had previously adapted to the accent and were able to apply this knowledge rapidly to the experimental speaker.

Table 2 Planned pairwise comparisons of priming effects for all accent types across participants and items for participants with extensive experience with German-accented Dutch

Discussion

Taken together, the results of Experiment 1 thus show that listeners with limited experience with German-accented Dutch and those with extensive experience can immediately interpret the medium and weakly accented items correctly, but only the listeners with extensive experience show facilitatory priming for strongly accented items. We also analyzed both listener groups (limited experience and extensive experience) in one overall repeated measures analysis with group as an additional between-participants factor. These analyses showed an effect of group marginally significant across participants, F 1(1, 43) = 3.737, p = .060; F 2(1, 20) = 18.223, p < .001, with the experienced listeners responding faster, overall, than the inexperienced listeners, but there were no significant interactions between any of the factors. This could imply that the difference between the groups with respect to processing strongly accented words was possibly less pronounced than the separate analyses suggest or that this effect depends on the strength of elements of the accent. Moreover, it is possible that even Dutch listeners with extensive experience with German-accented Dutch may still encounter some difficulties understanding strongly accented words.

Experiment 1 allowed us to study the role of long-term exposure to German-accented Dutch. It showed not only that the degree of experience influences how easily a listener can adapt to foreign-accented speech, but also that this effect depends on the strength of the accent. In Experiment 2, we wanted to shed more light on the role of short-term experience in word recognition and, in particular, on the effect of short-term exposure on adaptation to strongly accented words. Research on perceptual learning has shown that the word recognition process not only is sensitive to long-term listening experience, but also can adapt after a short amount of exposure. There is some evidence for rapid adaptation to foreign-accented speech (see, e.g., Bradlow & Bent, 2008; Clarke & Garrett, 2004). This is in line with perceptual learning research, with artificial accents or speech sounds constructed to be ambiguous showing flexibility in online word recognition processes (e.g., Maye, Aslin, & Tanenhaus, 2008; McQueen et al., 2006), as well as in the perception of phoneme categories (e.g., Eisner & McQueen, 2006; Norris, McQueen, & Cutler, 2003).

Experiment 2

In Experiment 2, we asked for the first time whether there is short-term adaptation to individual vowels in a real foreign accent (as opposed to an artificial accent). We again tested Dutch listeners with limited experience with German-accented Dutch, but they were now exposed to a 4-min story immediately before the cross-modal priming experiment. The short story was spoken by the same speaker as that used in the priming experiment and would thus function as additional exposure to that speaker. Two versions of the story were recorded, one with 12 strongly accented items (words containing the [œy]-vowel) not used in the main experiment, and one without strongly accented items. The goal of this experiment was to investigate whether a short period of familiarization with the speaker would be sufficient to create adaptation to novel words. We expected to replicate the findings of Experiment 1 with respect to the medium and weakly accented words (i.e., priming in both halves of the experiment). We also expected to find priming for the strongly accented words when participants with limited experience had been exposed to a story containing strongly accented words (in contrast to Experiment 1) at least in the second half and, possibly, already in the first half of the experiment. We expected participants who listened to the weakly accented exposure also to show adaptation to the strongly accented items, but later or to a lesser extent than listeners who were exposed to the story with strongly accented words.

Method

Participants

Fifty-seven participants took part in Experiment 2. Participants were randomly assigned to one of two story exposure groups (group 1, n = 19, 15 females, mean age = 21.18 years; group 2, n =19, 16 females, mean age = 22.15 years). Participants from Utrecht were selected using the same criteria as in the limited-exposure group in Experiment 1. As before, therefore, we excluded additional participants who heard German-accented Dutch more than once a week from more than two speakers. In total, we excluded 19 participants. As in Experiment 1, participants did not report a hearing disorder and had normal or corrected-to-normal vision. All of them reported basic knowledge of German, with German being their second or third nonnative language (after English); none considered themselves fluent in German.

Materials

Two versions of a short story were created. We chose to focus on the effects of exposure to strongly accented words, because learning for the medium and weakly accented words was already at ceiling in Experiment 1. Both stories were recorded by the speaker from Experiment 1. The story was based on the fairytale “Jorinde and Joringel” by the Brothers Grimm. The two versions were identical, except for 12 words. One version contained 12 strongly accented words with the [œy]-vowel (pronounced by the speaker as []). None of these words appeared in the main experiment. In the other version, these 12 words where replaced with words without the vowel [œy] (e.g., duizend, ‘thousand’ replaced with honderd, ‘hundred’). The two stories thus both contained medium and weakly accented words. Both versions were recorded from a script using correct Dutch spelling. The speaker was not instructed to change his pronunciation; again, all mispronunciations occurred naturally. The speaker recorded one paragraph at a time. All paragraphs were recorded multiple times, and the best paragraphs were selected to create a story that was spoken without hesitations and misreadings. That is, recordings were selected in which all words were pronounced as intended, including the 12 strongly accented words consistently pronounced with []. Recordings were made in a sound-attenuated booth with a Sennheiser microphone and were stored directly onto a computer at a sample rate of 44 kHz.

Design and procedure

Participants were seated in a sound-attenuated booth and were informed that they would first listen to a Dutch story spoken by a German speaker and then perform a cross-modal priming experiment. There was no additional task when participants were listening to the story. Group 1 listened to the story with 12 strongly accented words. Group 2 listened to the story without strongly accented words. Participants were randomly assigned to a group. Immediately after the story ended, participants saw the instructions for the cross-modal priming experiment on the screen. The cross-modal priming experiment was identical to the one used in Experiment 1. After the experiment, participants filled out the language history questionnaire (identical to that in Experiment 1).

Results

Limited-experience group–exposure with strongly accented words

The statistical analyses were identical to those used in Experiment 1. Trials containing errors were excluded (1.3 %) from the analyses. We also excluded trials on which the RTs deviated more than 2.5 SDs from the condition’s overall mean (together, <5 % of all trials). Inspection of the errors (see Appendix 2) again showed no specific patterns of priming.

The priming effects for participants exposed to the story with strongly accented words are depicted in Fig. 4 (mean RTs are given in Appendix 2). Participants were faster overall in responding to identical trials, as compared with unrelated trials, F 1(1, 18) = 82.771, p < .001; F 2(1, 11) = 132.515, p < .001. The participant analysis showed that participants responded with different speeds to the three word types, F 1(2, 36) = 3.788, p = .033; F 2 < 1. Participants were faster in the second half of the experiment than in the first half, although this effect was significant only by items, F 1(1, 18) = 3.638, p = .074; F 2 (1, 11) = 6.754, p = .025. There were no significant interactions between the factors.

Fig. 4
figure 4

Experiment 2: Priming effects and confidence intervals for the limited experience group with exposure to strongly accented items by experiment halves and accent type

Planned pairwise comparisons were used to confirm the priming effects. Table 3 displays the priming effects across participants and items. Participants showed successful adaptation to all accent types in both halves of the experiment.

Table 3 Planned pairwise comparisons of priming effects for all accent types across participants and items for participants with exposure to strongly accented items

Limited-exposure group–exposure without strongly accented words

We excluded trials on which participants made errors (1.8 %) from the analyses. We also excluded trials on which the RTs deviated more than 2.5 SDs from the condition’s overall mean (together, <5 % of all trials). The errors (see Appendix 2) showed no systematic priming effects.

Priming effects for participants exposed to the story without strongly accented words are shown in Fig. 5 (mean RTs are given in Appendix 2). Participants were faster overall in responding to identical trials than to unrelated trials, F 1(1, 18) = 76.162, p < .001; F 2(1, 11) = 35.956, p < .001. There was no difference in the way participants responded to the different types of items, F 1(2, 36) = 1.185, p = .318; F 2 < 1. Participants were not faster overall in the second half than in the first, F 1 < 1, F 2 < 1. There was no interaction between accent type and half, F 1 < 1; F 2(2, 20) = 1.039, p = .372, or between accent type and priming, F 1(2, 36) = 1.462, p = .246; F 2 < 1. The interaction between half and priming was significant across participants, F 1(1, 18) = 5.806, p = .028; F 2 < 1, reflecting that, in general, priming effects were larger in the second half of the experiment than in the first half. The three-way interaction was not significant.

Fig. 5
figure 5

Experiment 2: Priming effects and confidence intervals for the limited experience group with no exposure to strongly accented items by experiment halves and accent type

The results of the planned pairwise comparisons of the priming effects are given in Table 4. They show that participants were able to interpret the weakly accented items and the medium accented items throughout the experiment. The strongly accented items, however, could be interpreted correctly only in the second half of the experiment.

Table 4 Planned pairwise comparisons of priming effects for all accent types across participants and items for participants exposed to the story without strongly accented items

Discussion

When participants were exposed to the same speaker without strongly accented words, we saw adaptation to the strongly accented items in the second half of the experiment. Participants thus have some benefit from prior exposure to the speaker even though he did not produce these specific mispronunciations. Moreover, while participants’ performance increased for the strongly and medium accented items across halves, there was no increase in the priming effect for the weakly accented items. Since the only difference between the segments in the words in the three sets is in the vowels (we used the same consonants in all three accent types), the differences for the strongly and medium accented words must be driven by perceptual adaptation to the vowels.

At this point, an interesting comparison can be made between participants with limited experience from Experiment 1 (who did not have a story exposure phase) to the two groups of participants from Experiment 2, who did have an exposure phase. We compared the priming effects for the strongly accented items, separately for the two halves. Participants who had received strongly accented exposure showed significantly more priming for the strongly accented items than did participants without prior experience, both in the first, t(40) = 2.612, p = .013, and in the second, t(40) = 2.523, p = .025, halves of the experiment. The strongly accented exposure thus increased priming in both halves of the experiment and, therefore, allowed listeners to interpret the accent more easily. Participants who received weakly accented exposure did not show more priming than did participants without exposure in the first half of the experiment, t(40) = 0.546, p = .588, but did in the second half of the experiment, t(40) = 2.721, p = .010. Hearing just the speaker without the strongly accented items thus gave participants a head start but was not enough to lead to priming from the start of the experiment.

In Experiment 2, we saw that participants were able to correctly interpret the weakly and medium accented items, as in Experiment 1. For the strongly accented items, however, we saw that if participants listened to the speaker using the strongly accented items for a short exposure period (4 min) before starting the cross-modal priming experiment, they could correctly interpret the strongly accented items from the start of the experiment. This may be evidence for speaker-specific adaptation. Experiment 3 looked further at the effect of speaker. Do the same effects occur when participants with limited exposure listen to different speakers of the same accent in the exposure phase and test phase? Previous research with exposure with multiple talkers does indicate that adaptation to an accent can transfer across speakers (e.g., Bradlow & Bent, 2008; Sidaras et al., 2009). However, in Experiment 2, we had only one speaker during exposure, which might not be enough for speaker-independent adaptation to occur. If adaptation to accents is speaker specific, the exposure phase with a different speaker should not help participants to the same extent as listening to the strongly accented story spoken by the test speaker. However, when taking into account the results from Experiment 1, where listeners who were very familiar with German-accented Dutch could quickly adapt to a new speaker, it is also possible that adaptation is to some degree speaker independent in foreign-accented speech. If this is the case, we should see the same results as when participants listened to the strongly accented exposure from the same speaker.

Experiment 3

Method

Participants

Nineteen participants took part in Experiment 3 (mean age = 21.48 years, 17 females). Limited-exposure participants were recruited and selected using the same procedures as in Experiments 1 and 2. Three additional participants were excluded on the basis of the language history questionnaire. All participants reported normal hearing and normal or corrected-to-normal vision. All of them reported basic knowledge of German, with German being their second or third nonnative language (after English), and none considered themselves fluent in German.

Materials

In Experiment 3, we used a different speaker to record the story used for the exposure phase. This speaker (speaker 2) was selected from a pretest and was chosen because he pronounced the [œy] vowel in a similar fashion to the speaker of Experiments 1 and 2 (thus as []). The speaker was a male native speaker of German, raised in Nordrhein-Westfalen. He had lived in The Netherlands for 3 years while studying in Dutch at the Radboud University Nijmegen. Like the previous speaker (speaker 1), speaker 2 was quite fluent in Dutch and knew almost all Dutch words used in the recorded story. Speaker 2 read one Dutch story, identical to the one used in the strongly accented exposure condition in Experiment 2. The procedure of multiple recordings and of selection of final materials from those recordings was identical to that used in Experiment 2. In particular, recordings were selected in which speaker 2 consistently pronounced the 12 strongly accented words with []. The procedure of the cross-modal priming experiment (with speaker 1) was the same as in Experiments 1 and 2.

Results

The statistical analyses were identical to those used in Experiments 1 and 2. We excluded trials on which participants made errors (2.5 %) and those trials on which the RTs deviated more than 2.5 SDs from the condition’s overall mean (together, <5 % of all trials). The errors (see Appendix 2) showed no specific patterns of priming.

As is shown in Fig. 6 (calculated priming effects) and Appendix 2 (mean RTs), participants were faster to respond to the identical words than to the unrelated words, F 1(1, 18) = 62.868, p < .001; F 2(1, 11) = 64.568, p < .001. In addition, participants were faster overall in the second half than in the first, F 1(1, 18) = 9.941, p = .006; F 2(1, 11) = 22.634, p = .001. Participants did not respond differently to the three types of words, F 1(2, 36) = 2.448, p = .102; F 2(2, 20) = 1.279, p = .300. There were no significant interactions between any of the factors.

Fig. 6
figure 6

Experiment 3: Priming effects and confidence intervals for the limited experience group with exposure to strongly accented items from a different speaker by experiment halves and accent type

Table 5 displays the planned pairwise comparisons of the priming effects. They show that participants could understand the weakly accented and medium accented words throughout the experiment. The priming effect for the strongly accented items, however, was present only in the second half of the experiment.

Table 5 Planned pairwise comparisons of priming effects for all accent types across participants and items for participants with strongly accented exposure from different speaker

Discussion

When comparing these results with those for participants with limited experience who did not receive prior exposure (Experiment 1), we see that participants with exposure to a different speaker of German-accented Dutch did not show more priming for strongly accented items in the first half of the experiment, t(40) = 0.516, p = .604, but did show significantly more priming in the second half of the experiment, t(40) = 2.116, p = .041. This result suggests that hearing a speaker with the same mispronunciations and the same accent does provide an advantage, as compared with not hearing the speaker at all (limited-experience group, Experiment 1), but this advantage is not as large as hearing the same speaker (limited-experience group–exposure with strongly accented items, Experiment 2). We cannot exclude the possibility that the weaker priming effects in Experiment 3 (speaker 2 exposure) than in Experiment 2 (speaker 1 exposure) arose because speaker 2 produced more variable pronunciations of the strongly accented words and, thus, was harder to learn from than speaker 1. This appears unlikely, however, because the two speakers were selected for their similar pronunciation of the critical words and because recordings of these words in the exposure story were selected in which both speakers consistently mispronounced all the critical vowels as [].

General discussion

The present study investigated the effects of different types of experience on word recognition in foreign-accented speech. Dutch listeners with limited prior experience with German-accented Dutch were able to interpret Dutch words with a weak or medium strength German accent correctly, as measured in a cross-modal priming study, but they had difficulties interpreting strongly accented words. But when a short exposure phase was added immediately before the cross-modal priming study, these participants’ performance on the strongly accented words improved, even though they had not heard those tokens in the exposure phase. It improved most when listeners had been exposed to the same speaker producing comparable strongly accented items; in this case, the short additional exposure led to equivalent performance to that of listeners with extensive prior experience. Short exposure to the speaker without strongly accented tokens, as well as short exposure to strongly accented tokens by a different German-accented speaker, also improved word recognition; improvement in these cases, however, was not observable immediately and emerged only in the second half of the experiment. These findings constitute evidence that short-term adaptation to a naturally occurring foreign accent generalizes across words and show for the first time that this adaptation depends on the strength of the accented words—specifically, on the vowels in those words.

Even though all words used in Experiments 1 and 2 were noticeably accented (as demonstrated in the rating study and the acoustic measurements), native Dutch listeners never had difficulties interpreting the weakly and medium accented items. This shows that recognition of variant forms is not necessarily difficult in L2 speech. This is in contrast with research using L1 speech, where it is a robust finding that priming with a word that mismatches with one phoneme from its canonical form does not facilitate target recognition (Marslen-Wilson et al., 1995) and sometimes even causes inhibition (Van Alphen & McQueen, 2006). Possibly, listeners more readily accept variation in pronunciation from L2 speakers; alternatively, deviations from the standard in weakly and medium accented words were too small to severely disrupt word recognition (see, e.g., Connine et al., 1993). In the case of the []-vowel in medium accented words, it could even be that similarity with a native variation facilitated recognition; for example, in ‘Poldernederlands’, the ‘polder ij’ (Jacobi, 2009) is lowered and closer to [a] in comparison with the standard [], and it is therefore somewhat similar to the German-accented [a]. But even though the German [a] might be close to the ‘polder ij’, the rating study still indicated that native Dutch listeners considered the medium accented words to be more accented than the weakly accented words. In any case, the fact that listeners with limited prior experience could interpret weakly accented and medium accented words correctly suggests that extensive experience with an accent is not always required in order to be able to understand the accent. To a certain degree, we can rely on short-term perceptual learning mechanisms for handling variation in foreign-accented speech.

There is, however, also a role for long-term experience in understanding foreign-accented speech. Participants with extensive prior experience with German-accented Dutch had no difficulty recognizing the strongly accented words, and they could do so even without brief preexposure to the speaker (i.e., in Experiment 1). Although it is still possible that listeners with extensive experience may encounter some difficulties interpreting foreign-accented speech, our results suggest that there is speaker-independent adaptation to foreign-accented speech. This is indeed good news for L1 listeners (and L2 speakers), since it implies that we do not have to adapt anew to each L2 speaker of a familiar accent. Since both listener groups showed a basic understanding of the accent (i.e., they could all understand weakly and medium accented words), it is likely that additional long-term experience puts listeners a little further ahead of the limited experience listeners and allows them to recognize even the strongly accented items speaker independently.

In Experiment 1, strongly accented items such as /hs/ for /hœys/ [house] posed difficulties for native Dutch listeners with limited experience throughout the experiment. These listeners thus did not learn to interpret the strongly accented words correctly during the 8 min of the experiment. This result differs from those of earlier studies (e.g., Clarke & Garrett, 2004) that found adaptation to foreign-accented speech within 1 min of exposure. One explanation for this difference in findings is that the accent markers in the present study were stronger and, hence, more difficult to learn. Previous studies on short-term adaptation to foreign accents usually did not control for specific accent markers, and it is feasible that items with varying strength of perceived accentedness were combined in these studies. A second explanation could lie in the contextual presentation of stimuli: Most of the previous studies used sentences, whereas the present experiment used isolated words. Sentence context, of course, provides richer information about the pronunciation habits of a particular speaker than isolated words do, and this additional information could make it easier for participants to tune in to a foreign-accented speaker. The fact that listeners performed better in the cross-modal priming task in Experiment 2 after being exposed to a story featuring the speaker and the accent supports the latter explanation.

In Experiment 3, it was found that short exposure to another speaker with the same accent also aids word recognition for strongly accented words, but not to the same extent as exposure to the same speaker does. The two German-accented speakers were similar to one another in a number of ways: They were male, were approximately the same age, and mispronounced Dutch words in a similar way. An analysis of the two renditions of each of the 12 critical words from the story revealed that the two speakers were also comparable in terms of pitch (speaker 1 mean = 133.0 Hz; speaker 2 mean = 132.7 Hz), t(22) = 0.039, p = .969, and speaking rate (speaker 1 mean = 425 ms; speaker 2 mean = 374 ms), t(22) = 1.229, p = .232. Despite these similarities, listeners recognized the accented words better when they were tested on the forms produced by the same speaker as they had heard during story exposure. This suggests that short-term adaptation to foreign-accented speech is speaker specific to some extent. But the fact that hearing a different speaker during exposure aided word recognition at all suggests that even the initial stages of learning are also, in part, speaker independent. It is possible that participants would be better at learning the accent if they were exposed to more than one or two speakers. This would allow them to learn more about the accent, rather than just about the speaker. Hearing multiple speakers might lead to speaker-independent adaptation. This could be tested by adding multiple accented speakers to the exposure and/or the test phase. Studies that have looked at general accent adaptation (i.e., not to specific accent markers) did show a beneficial effect of exposure to multiple speakers (e.g., Bradlow & Bent, 2008; Sidaras et al., 2009). In fact, listeners are even able to adapt to an accent when the experiment contains one accented and one native speaker (Trude & Brown-Schmidt, 2012). One possible explanation of this result is that it was easier for participants in the Trude and Brown-Schmidt study to contrast native pronunciations with the accented mispronunciations. More evidence for differential effects of multiple speakers is found in studies on intelligibility of foreign-accented speech (e.g., Munro & Derwing, 1995b), but whether a similar effect is found for online word recognition remains a question for further research.

The results of the present priming studies constrain accounts of how words and their accented variants are represented in the lexicon. Theories on how variants in L1 speech are handled can be divided into two types of accounts: representational and processing-based accounts. The first type of account assumes that variation is encoded in lexical entries; that is, not only are the canonical forms stored, but also other variant forms of these words (Goldinger, 1998). Episodic representational accounts postulate that each word, as well as all its variations, is encoded in the lexicon with fine-grained phonetic detail (e.g., Johnson, 2006; Pierrehumbert, 2001). Another viewpoint is that the lexicon has not episodic traces but, rather, multiple abstract representations for variant forms (e.g., Ranbom & Connine, 2007). These representational accounts, of course, also make processing assumptions, but their explanation for recognition of variant pronunciations is based on representations. They propose that variant recognition should be easy when listeners have been exposed to the variants before (because the variation would already be encoded in the mental lexicon). In the present experiments, prior storage of variants (in abstract or episodic form) could explain the benefit shown by the participants with extensive exposure to German-accented Dutch and, potentially, the ability of the participants with limited exposure to recognize the weakly and medium accented words. It is less clear, however, how representational accounts could explain the learning shown by the limited-exposure participants on the strongly accented words and, especially, the generalization of learning from the words in the story to the new words in the test phase. A representation-based explanation for this kind of generalization would require that the limited-exposure participants happened to have heard (and stored) German-accented variants of the words used in the test phase prior to the experiment. This is unlikely, but not impossible. However, even if these participants had happened to have stored the pronunciation variants of the test words, a representation-based explanation would still require additional mechanisms that could account for how these stored variants did not immediately influence recognition (i.e., in Experiment 1) but, instead, only started to do so after in some way being triggered by exposure to other strongly accented words (i.e., in Experiment 2).

A processing-based account of the adaptation appears to be more plausible. Such accounts assume that only canonical forms are stored in the lexicon and that listeners learn how to map variant forms onto stored canonical forms through exposure (e.g., Gaskell & Marslen-Wilson, 1998; Lotto & Holt, 2006; Mitterer et al., 2006). These models thus also make assumptions about representations, but the burden of their explanation for recognition of variant forms is carried by their assumptions about processing. The demonstration of generalization of learning for strongly accented words from the exposure phase to the test phase speaks for such models. If perceptual adaptation reflects a change in the way a speech sound is mapped onto the lexicon and that change takes place at a prelexical level of processing, that learning will be reflected in all words containing that sound (McQueen et al., 2006). In the present situation, if the way the strongly accented vowel is mapped onto the lexicon is modified during the exposure phase, then in the test phase, it should be possible to recognize all words containing that vowel (i.e., whether those specific words have been heard before or not). This processing-based account is thus parsimonious because it is based on the same kind of perceptual-learning mechanism that has been proposed to explain learning about artificial accents and idiosyncratic pronunciations. The mapping that is involved seems to be highly dependent on the type of exposure, as is shown by the different results across the three experiments. The fact that listeners with ample prior experience have less trouble interpreting strongly accented words than listeners with limited experience may indicate that the accented variants might be more strongly linked to their canonical forms for the listeners with more extensive experience.

Experience with an accent thus plays an important role in adaptation: Different types of experience lead to different kinds of adaptation. Prior experience does not seem to be necessary for adaptation to medium and weakly accented items: Remapping for these words takes place immediately or during the first few trials. For the strongly accented items, however, more experience is needed to interpret these items correctly. This experience can be gained either outside the laboratory through extensive exposure to multiple speakers (Experiment 1) or with short-term exposure in the laboratory, but then only under the right circumstances—namely, with the same speaker and the same strongly accented mispronunciations. Only limited exposure of this type appears to be sufficient: In the present experiment, we used a 4-min story with only 12 strongly accented mispronunciations. This is not the full picture, however, because hearing only the speaker without any strongly accented words in the story also helped listeners (Experiment 2). A possible explanation could be that adaptation to foreign-accented speech takes place at a general level first before continuing to vowel-specific adaptation. Evidence for this vowel-specific adaptation can be found in Experiment 2: Only participants who have heard both the speaker and the strongly accented mispronunciations were able to generalize correctly to new items in the first half of the experiment. It is therefore more likely that exposure to the story with weakly accented words boosted the general adaptation, but not the vowel-specific adaptation.

In summary, although strongly accented forms disturb online word recognition initially, adaptation to foreign-accented speech occurs within even a couple of minutes of exposure. More research is needed to look at how long-lasting this exposure effect is and whether this is a language-specific phenomenon (e.g., adaptation occurs only when listening to speakers with the same accent) or language independent (e.g., when listening to accented speech listeners’ word recognition becomes more flexible in general). The present study suggests that even within one accent, there can be substitutions that differ in their comprehensibility. Recognition of words with vowel substitutions that are judged to be only weakly accented appears to be unproblematic. Vowel substitutions that are judged to be more strongly accented do create recognition problems, but adaptation to them is rapid, with exposure to as few as 12 such words appearing to be sufficient for successful recognition of other strongly accented words.