Introduction

A morpheme is “a linguistic form which bears no partial phonetic–semantic resemblance to any other form” (Bloomfield, 1933, as cited in Nida, 1976, pp. 6–7), essentially the smallest pronounceable unit of a word that carries meaning. In English, many common words are made up of multiple morphemes, such as prefixes, suffixes, and roots; for example, the word unfriendly is made up of three morphemes: un-, friend, and -ly. Morphemes as a linguistic class can be divided into two categories: Free morphemes, such as friend, stand alone as complete and meaningful words in isolation, while bound morphemes, such as un- or -ly, cannot occur in isolation (e.g., Nida, 1976). Here, we used event-related potentials (ERPs) to investigate the online neural processing of written words and nonwords composed of bound or free morphemes. Specifically, we focused on the N400 component of the ERP waveform as a potential index of semantic composition, a process of integration of the meanings of the constituent morphemes in morphologically complex words (Koester, Gunter, & Wagner, 2007, p. 65).

Some sort of process of semantic composition would appear to be a necessary counterpart to a process of morphological decomposition (e.g., Koester et al., 2007; Meunier & Longtin, 2007). Although there is continuing debate about whether and how morphemes are represented in the lexicon (e.g., Gonnerman, Seidenberg, & Anderson, 2007; Kielar & Joanisse, 2011; Sandra, 1994; Seidenberg & Gonnerman, 2000), there is “a fairly broad consensus” that morphologically complex words are decomposed into their constituent morphemes in fluent reading (McCormick, Rastle, & Davis, 2008, p. 308). Indeed, many theorists have proposed such a process of decomposition, with constituent morphemes stored in the lexicon, perhaps along with whole words, at different levels or along parallel access routes (e.g., Andrews, 1986; Baayen & Schreuder, 1999; Caramazza, Laudanna, & Romani, 1988; Colé, Segui, & Taft, 1997; Giraudo & Grainger, 2001; Marslen-Wilson, Tyler, Waksler, & Older, 1994; Taft & Forster, 1975). According to these types of theories, fluent readers automatically decompose at least some morphologically complex words into their constituent morphemes while reading, which would necessitate a recomposition process in order to access the semantic meaning of the words. For example, in the case of a compound word like teacup, made up of two free morphemes, there must be a composition process by which the meaning of the whole word is computed, beyond the separate meanings of its morphological constituents tea and cup (e.g., Baayen & Schreuder, 1999; Badecker, 2001). Note that the necessity for semantic composition of morphemic units is still present even if, as in some hybrid models (e.g., Colé et al., 1997), both the whole word teacup and its constituents tea and cup are activated.

Behavioral evidence for morphological decomposition

There is substantial behavioral evidence in support of morphological decomposition during reading. Much of this evidence comes from visual priming experiments in which orthographic, morphological, and semantic relationships between prime and target words are varied and morphological priming effects are shown to be independent of both semantic and orthographic relatedness (see, e.g., McQueen & Cutler, 1998, for a review). This is consistent with other evidence for the independence of morphological, semantic, and orthographic effects in reading (e.g., Feldman, 2000). Facilitatory morphological priming effects are reported not only when the prime and target share a true morphological relation (e.g., cleaner–clean), but also when they share an apparent relation (e.g., corner–corn), beyond the effects of orthographic overlap, in both visual (e.g., Domínguez, Segui, & Cuetos, 2002; Kazanina, Dukova-Zheleva, Dana, Kharlamov, & Tonciulescu, 2008; Longtin, Segui, & Hallé, 2003; Rastle, Davis, Marslen-Wilson, & Tyler, 2000; Rastle, Davis, & New, 2004) and cross-modal (e.g., Diependaele, Sandra, & Grainger, 2005) priming paradigms.

Similar behavioral priming effects have also been reported with nonword stimuli. For example, in a visual masked priming paradigm with a lexical decision task in French, Longtin and Meunier (2005) found that morphologically complex pseudowords (e.g., sportation, the noun sport with the illegal suffix -ation) facilitated recognition of their roots, but not unrelated roots. They interpreted this pattern as evidence for early morphological decomposition triggered by the morphological structure of the prime, regardless of its semantic interpretability or lexicality; that is, the separate morphemes were recognized as units even if the letter string was not recognized as a word. This indicates that the process of morphological decomposition can occur independently of whole-word semantics. Indeed, some suggest that early morphemic segmentation is based not on semantic properties, but on orthographic information (see, e.g., Rastle & Davis, 2008, for a review). Interestingly, a visual priming effect for lexical decision appears to extend beyond complete morphemes in word or pseudoword primes, as in the case of orthographic alterations (e.g., adore/adorable), even in the absence of a semantic relation between prime and target (e.g., fetish/fete; McCormick et al., 2008), or in the case of transpositions (e.g., wranish/warn; Beyersmann, Castles, & Coltheart, 2011). The latter findings again suggest a process of morphological decomposition insensitive to semantic information (McCormick et al., 2008, p. 317).

ERP evidence for morphological decomposition

While ERP studies of derivational morphology are scarce (e.g., Bölte, Jansma, Zilverstand, & Zwitserlood, 2009, p. 338), electrophysiological findings from priming studies are relatively consistent with these behavioral data in indicating morphological decomposition in fluent readers (e.g., Morris, Grainger, & Holcomb, 2008; see also Fiorentino & Poeppel, 2007, for similar findings from magnetoencephalography). Indeed, there is ERP evidence for decomposition even when such a process is not semantically useful. For instance, in a visual masked priming paradigm with a lexical decision task, Morris, Frank, Grainger and Holcomb (2007) recorded ERPs in response to words primed by semantically transparent (e.g., hunter–hunt), opaque (e.g., corner–corn), and orthographically related (e.g., scandal–scan) primes and reported a linear relationship such that priming effects on both N250 and N400 amplitude were greatest for transparent items and least for orthographic items. Recently, in another visual priming paradigm with lexical decision, Lavric and colleagues (Lavric, Clapp, & Rastle, 2007; Lavric, Rastle, & Clapp, 2011) reported a similar N400 priming effect for word pairs with apparent and genuine morphological relationships, greater than that for nonmorphologically related pairs. They reported priming effects in both a masked condition, during the 340- to 380-ms and 460- to 500-ms epochs, and an unmasked condition, during the 300- to 380-ms epoch. Previously, with Spanish stimuli, Barber, Domínguez and de Vega (2002) found that stem homographs elicited attenuated N400s similar to morphologically related words in a visual priming paradigm with a lexical decision task, and Domínguez, de Vega and Barber (2004), in a similar paradigm with similar stimuli, reported similar results.
However, in a recent cross-modal lexical decision paradigm with auditory primes, the N400 priming effect was significantly stronger for semantically transparent than for semantically opaque pairs, a graded effect suggesting a possible contribution of semantics (Kielar & Joanisse, 2011; see also Feldman, O'Connor, & del Prado Martín, 2009). Overall, though, these electrophysiological findings from priming studies, similar to the behavioral data, suggest that morphological decomposition of written words occurs automatically initially, regardless of semantic benefit.

ERP evidence for morphological processing: bound and free

A few ERP studies have specifically investigated morphological processing in terms of bound and free morphemes. This set of findings is more equivocal, particularly in terms of the N400 component. In regard to bound morphemes, the majority of ERP studies have investigated regular and irregular verb inflections or noun pluralizations (e.g., Allen, Badecker, & Osterhout, 2003; Gross, Say, Kleingers, Clahsen, & Münte, 1998; Justus, Larsen, de Mornay Davies, & Swick, 2008; Justus, Yang, Larsen, de Mornay Davies, & Swick, 2009; Lavric, Pizzagalli, Forstmeier, & Rippon, 2001; Lehtonen et al., 2007; Leinonen et al., 2009; Morris & Holcomb, 2005; Münte, Say, Clahsen, Schiltz, & Kutas, 1999; Newman, Ullman, Pancheva, Waligura, & Neville, 2007; Penke et al., 1997; Rodriguez-Fornells, Clahsen, Lleó, Zaake, & Münte, 2001; Rodriguez-Fornells, Münte, & Clahsen, 2002; Weyerts, Münte, Smid, & Heinze, 1996; Weyerts, Penke, Dohrn, Clahsen, & Münte, 1997). For example, in a study of verb inflections using unimodal auditory priming, phonological and orthographic control stimuli elicited a typical priming effect, an attenuated early N400; in contrast, no N400 priming effect was evident in a comparable cross-modal priming study, consistent with an interpretation of reduced prelexical facilitation in cross-modal priming and an interpretation of the N400 as sensitive to prelexical processes (Justus et al., 2008; Justus et al., 2009). In another study using a lexical decision paradigm with Finnish nouns and pseudowords, N400 amplitude was modulated by lexicality, suggesting sensitivity to the whole word, and thus reflective of semantic composition processes (Lehtonen et al., 2007). N400 amplitude was also more negative for inflected than for monomorphemic nouns, suggesting a relative semantic processing cost for inflected words (Lehtonen et al., 2007). However, there is debate as to whether studies of inflections can detect neural correlates of decomposition (Zweig & Pylkkänen, 2009).

Beyond inflectional morphology, fewer ERP studies have investigated processing of prefixes, suffixes, and bound stems, although such studies could provide insight into morphological decomposition and semantic composition processes. For example, in a lexical decision study of German suffixes, pseudowords created with real suffixes elicited an N400-like effect, consistent with the notion of decomposition and attempted semantic composition of even incorrectly derived words (Janssen, Wiese, & Schlesewsky, 2006). In a lexical decision paradigm with Spanish stimuli, facilitative priming effects were observed for prefixes in the 360- to 430-ms time window on a component with a frontal distribution, again suggesting decomposition (of prefix and root; Domínguez, Alija, Cuetos, & de Vega, 2006). And in a lexical decision paradigm with English stimuli, McKinnon, Allen and Osterhout (2003) reported that nonwords composed of bound morphemes (e.g., exceive) elicited an N400 of similar amplitude to real words composed of bound morphemes (e.g., receive) and to morphologically complex control stimuli (e.g., muffler). However, control nonwords composed of nonmorpheme segments of the real word control stimuli that could not be decomposed into smaller meaningful elements (e.g., flermuf) elicited markedly more negative N400s. This pattern of findings suggests that morphological decomposition extends to words made up of bound morphemes. Moreover, N400 amplitude seemed “entirely a function of the morphological content of the letter string . . . sensitive to the presence or absence of morphemes in the string, but not to whether the morphemes combine[d] to form a word” (McKinnon et al., 2003, p. 886). On the basis of these data, one could conclude that the N400 serves as an index of morphological decomposition, rather than semantic composition.

In regard to free morphemes, ERPs have been used to investigate the auditory and visual processing of compound words. In Chinese, in a lexical decision paradigm involving sequential auditory presentation of two meaning units forming a compound, the authors reported a more negative N400 for compounds made up of semantically distinct meaning units (e.g., the more opaque butterfly, as compared with the more transparent wineglass, in English) and interpreted this pattern as an indication that such compounds require greater semantic composition processing resources (Bai et al., 2008). In German, in an auditory semantic judgment paradigm, each of the second and third constituents of three-constituent compound words elicited an N400, suggesting that lexico-semantic integration of auditory compound words is an incremental process (Koester, Holle, & Gunter, 2009). Similarly, in Basque, in a visual sentence judgment paradigm, low-frequency second constituents of compound words elicited more negative N400s than did high-frequency second constituents, again suggesting that compound words are decomposed (Vergara-Martínez, Duñabeitia, Laka, & Carreiras, 2009). And in Italian, in a lexical decision task, El Yagoubi, Chiarelli, Mondini, Perrone, Danieli and Semenza (2008) reported that compound nonwords composed of two real words (e.g., spadapesce) and real compound words (e.g., pescespada, swordfish) were processed similarly in terms of the N400, a pattern again suggesting “[t]hat participants may access the meaning of both constituents as a result of a decomposition process; this in turn would mitigate the impact of the linguistic difference between words and nonwords in the compound-noun category” (El Yagoubi et al., 2008, p. 573). In contrast, there was a larger N400 lexicality effect for nonwords derived from noncompound words; these nonwords, less like familiar words, elicited more negative N400s (El Yagoubi et al., 2008). 
In addition, in an earlier time window (270–370 ms), compounds elicited a larger anterior negativity than did noncompounds (El Yagoubi et al., 2008), consistent with ERP evidence from auditory studies of German compound word processing (e.g., Koester et al., 2007; Koester, Gunter, Wagner, & Friederici, 2004). Thus, the findings of El Yagoubi et al. also suggest that the N400 serves as an index of morphological decomposition, rather than semantic composition.

Taken together, the results of these previous ERP studies using word and nonword stimuli made up of bound and free morphemes are unclear with respect to the type of processing indexed by the N400 during reading of complex morphological stimuli. Some findings indicate that the N400 can serve as an index of semantic composition for words including bound (e.g., Lehtonen et al., 2007) and free (e.g., Bai et al., 2008) morphemes. Other findings, from studies with both bound (e.g., McKinnon et al., 2003) and free (e.g., El Yagoubi et al., 2008) morpheme stimuli, suggest that the N400 may be more sensitive to morphological decomposition than to lexicality or semantic composition. In the latter studies, the meaningfulness of the constituent morphological parts of the stimuli seems to override the lexicality of the whole stimulus, resulting in an N400 of similar amplitude for words and nonwords made up of the same morphological components.

ERP evidence: The N400

This pattern of findings appears discrepant with the N400 literature outside the realm of morphology, which is relatively consistent with the hypothesis that this component reflects lexico-semantic processing such that N400 amplitude indexes the ease of accessing or activating a word in memory (e.g., Holcomb, 1988; Kutas & Federmeier, 2000; Van Petten & Luka, 2006). Previous studies not involving explicit morphological manipulations have shown that pronounceable nonwords elicit more negative N400s than do real words, likely related to extended search within lexical memory and difficulty of integration (e.g., Bentin, 1987; Friedrich, Eulitz, & Lahiri, 2006; Holcomb & Neville, 1990). However, there is some debate about whether the N400 indexes lexical access or integration or both (e.g., Brown & Hagoort, 1993; Deacon, Dynowska, Ritter, & Grose-Fifer, 2004; Lau, Phillips, & Poeppel, 2008; Pylkkänen & Marantz, 2003).

In some models, N400 amplitude reflects the ease of linking or integrating across levels of representation—specifically, the “form–meaning interface” among orthographic, phonological, and semantic representations (e.g., Grainger & Holcomb, 2009, p. 141). Interestingly, Rastle and Davis (2008) have proposed that “form–meaning correspondences drive morphemic segmentation at the orthographic level” (p. 16). It is possible that the products of earlier morphological processing, as, for example, indexed by the N250 elicited in precisely timed masked priming studies (e.g., Morris et al., 2007; Morris et al., 2008) or the anterior negativity reported in studies of compounds (e.g., El Yagoubi et al., 2008), could be one of the levels of representation integrated within the N400 time window (see also Pylkkänen & Marantz, 2003). This would be consistent with a view of the N400 as an index of interactive, integrative word processing at multiple, cascaded levels of representation (e.g., Coch & Holcomb, 2003; Laszlo & Federmeier, 2011, p. 185)—and as a direct index of semantic composition, but only an indirect index of morphological decomposition.

The present study

In the present study, we tested the hypothesis that the amplitude of the N400 would be more sensitive to semantic composition, operationalized here as the lexicality of the whole stimulus, than to morphological decomposition, operationalized here as the meaningfulness or morphological status of the constituent parts of the stimuli. We used three carefully controlled sets of stimuli in a lexical decision task: bound stem words (e.g., discern, predict) and nonwords (e.g., disject, percern), free morpheme words (e.g., cobweb, earring) and nonwords (e.g., cobline, bobweb), and control monomorphemic words (e.g., garlic, minnow) and control nonmorphemic nonwords (e.g., gartus, buzlic). We predicted that nonwords would elicit a more negative N400 than matched real words across all three morphological types, regardless of the morphological composition of the matched pairs. That is, we predicted that the N400 would be more sensitive to the lexicality of the whole stimulus than to the parts of the stimulus. In this view, nonwords made up of bound or free morphemes would elicit more negative N400s than would matched real words made up of the same bound or free morphemes, respectively. This might occur because, although consisting of the same morphological elements represented in the lexicon, composition of those elements into a whole would be more difficult and/or result in a relatively extended lexical search, for an item with no lexical representation, in the case of nonwords. The same lexicality effect on the N400 would be expected for control words, as compared with control nonwords, neither of which have representations at the morphemic level: There would be a more effortful integration of the information available and a more extended lexical search for nonwords, as compared with words. 
If, instead, the N400 serves as a more direct index of morphological decomposition, we would have predicted similar amplitude N400s to bound words and nonwords and free words and nonwords, since each is made up of matched meaningful parts. However, in this view, we would have predicted a different pattern—a more negative N400 to nonwords than to words—for the control condition, in which stimuli did not contain meaningful parts.

Furthermore, consistent with the substantial evidence for morphological decomposition reviewed above, we predicted that participants would be less accurate and take more time to decide that a nonword composed of two morphemes, regardless of whether the morphemes were bound or free, was not a real word than that a control, monomorphemic nonword was not a real word (e.g., Meunier & Longtin, 2007). We reasoned that meaningful elements combined in nonmeaningful ways would slow, as reflected in reaction times, and “confuse,” as reflected in accuracy, the lexical decision process. To our knowledge, this is the first ERP study to directly compare the processing of printed sets of words composed of bound and free morphemes and monomorphemic control stimuli in order to explore the relative sensitivity of the N400 to morphological recomposition or semantic composition (i.e., the status of the whole) and morphological decomposition (i.e., the status of the parts).

Method

Participants

Sixteen (8 female) undergraduate students 18;8 to 22;5 years of age (mean 20;8, SD 1;4) participated. All were right-handed (Oldfield, 1971), monolingual native English speakers who did not self-report fluency in any other language. Participants, except 2 who reported difficulty with pronouncing /r/ as young children, self-reported no history of language or reading disorders or neurological dysfunction. No participants were taking medication affecting neural functioning. All had normal or corrected-to-normal vision, defined as 20/30 or better, as tested with a standard vision chart. All were volunteers paid $20 for their participation.

Behavioral testing

In order to ensure that participants were fluent readers, they were given a battery of standardized behavioral tests measuring various aspects of reading-related skills. The Homophone Choice and Sight Spelling subtests comprising the Spelling Accuracy composite of the Test of Orthographic Competence measured orthographic skills (Mather, Roberts, Hammill, & Allen, 2008). This test is normed for participants ages 6;0 to 17;11; norms for the 17;11 age group were used, since we were unable to find a normed instrument to measure orthographic awareness in college students. Furthermore, the Elision and Blending Words subtests comprising the Phonological Awareness composite of the Comprehensive Test of Phonological Processing were used as a measure of phonological skill (normed to age 25; Wagner, Torgesen, & Rashotte, 1999). In addition, the Peabody Picture Vocabulary Test IIIA was administered to provide a measure of vocabulary (normed through 90+; Dunn & Dunn, 1997), and the Passage Comprehension subtest of the Woodcock Reading Mastery Tests–Revised (normed through 75+; Woodcock, 1987) was administered as a measure of reading comprehension.

ERP task stimuli

Three stimulus sets were constructed for use in the ERP lexical decision task. For the first set, bound stem words and nonwords, 60 words containing bound stems were chosen (e.g., discern, predict). From these, 60 bound stem nonwords were formed by recombining the first and second morphemes of different words into nonword forms that were legal and pronounceable English letter strings (e.g., disject, percern; see Footnote 1). For the second set, free morpheme words and nonwords, 60 words composed of free morphemes (e.g., cobweb, earring) were selected. From these, 60 free morpheme nonwords were created that were pronounceable and composed of legal English letter strings (e.g., cobline, bobweb). For the third set, control monomorphemic words and nonmorphemic nonwords, 60 monomorphemic control words (e.g., garlic, minnow) were selected and used to create 60 nonwords by combining the first syllable of one word with the second syllable of another word (e.g., gartus, buzlic). All stimuli were two syllables, except 2 bound words and their matched nonwords and 2 free words and their matched nonwords. Each participant viewed all 360 stimuli presented in the same pseudorandom order. Presentation order was random, except that at least five items separated any related items (i.e., any stimuli that shared any parts were separated by at least five items) and there were no more than 5 nonwords or 5 words in a row, in order to maintain engagement with the task.
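The two ordering constraints described above can be expressed as a simple validity check. The following Python sketch is illustrative only and is not the procedure the authors used to build their list; the `Item` record, its `parts` labels, and the function name are hypothetical.

```python
from typing import FrozenSet, List, NamedTuple


class Item(NamedTuple):
    """Hypothetical stimulus record (not from the original study materials)."""
    string: str
    is_word: bool
    parts: FrozenSet[str]  # labels for constituents that may be shared with other items


def order_is_valid(order: List[Item], min_gap: int = 5, max_run: int = 5) -> bool:
    """Return True if a presentation order satisfies both constraints in the text."""
    run = 0
    for i, item in enumerate(order):
        # Run-length constraint: no more than max_run words (or nonwords) in a row.
        if i > 0 and item.is_word == order[i - 1].is_word:
            run += 1
        else:
            run = 1
        if run > max_run:
            return False
        # Separation constraint: none of the next min_gap items may share a part,
        # so at least min_gap items lie between any two related items.
        for other in order[i + 1 : i + 1 + min_gap]:
            if item.parts & other.parts:
                return False
    return True
```

A candidate random order could be shuffled repeatedly and accepted only when this check passes, although other construction strategies would satisfy the same constraints.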

In an attempt to control for possible effects of orthography and frequency, given the known sensitivities of the N400 (e.g., Barber & Kutas, 2007; Holcomb, Grainger, & O'Rourke, 2002; Van Petten & Kutas, 1990), words and nonwords within each of the three morphological types (bound, free, control) were matched for length, orthographic neighborhood size, constrained bigram and trigram frequency, and unconstrained bigram and trigram frequency as calculated by MCWord, an orthographic wordform database (Medler & Binder, 2005; see Table 1). All paired t-tests resulted in ps > .1. Similarly, words across each of the three morphological types and nonwords across each of the three morphological types were balanced across all of these measures (all ps > .1). In addition, word frequency across the three morphological types was controlled (all ps > .6), on the basis of raw frequency of the string in the modified CELEX database, converted to frequency per million in MCWord (Medler & Binder, 2005). Frequency for nonwords was, of course, zero across morphological types.
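As an illustration of the matching checks described above, a paired t statistic for one stimulus property (e.g., length) across matched word/nonword pairs might be computed as follows. This is a minimal sketch using only the Python standard library; the function name is hypothetical, and obtaining the p-value additionally requires the t distribution (available, for example, through `scipy.stats.ttest_rel`).

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def paired_t(words: Sequence[float], nonwords: Sequence[float]) -> Tuple[float, int]:
    """Return (t, degrees of freedom) for matched word/nonword property values."""
    if len(words) != len(nonwords) or len(words) < 2:
        raise ValueError("need two equally long samples with at least two pairs")
    # Work on the pairwise differences, as in a paired (dependent-samples) test.
    diffs = [w - n for w, n in zip(words, nonwords)]
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("differences have zero variance; t is undefined")
    t = mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1
```

A t value near zero for a given property is what well-matched word and nonword sets would be expected to produce.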

Table 1 Summary of stimulus characteristics by morphological type [mean, (SD)]

Procedure

Participants were provided with an overview of the procedures, and any questions were addressed before participants signed a consent form. Behavioral testing was conducted in a quiet room prior to or after participation in the ERP lexical decision paradigm. For electroencephalogram (EEG) recording, participants were fitted with an elastic electrode cap (Electro-Cap International, Eaton, OH) with active electrodes including FP1/2, F7/8, FT7/8, F3/4, FC5/6, C3/4, C5/6, T3/4, CT5/6, P3/4, T5/6, TO1/2, and O1/2. Recordings were also taken from the midline sites Fz, Cz, and Pz, but data from these sites, which showed the same pattern as at medial sites, are not reported here, except as represented in the topographical voltage maps. Mastoid electrodes were used for reference; online recordings were referenced to the right mastoid, and recordings were rereferenced to averaged mastoids in the final data averaging. Electrodes located below the right eye and at the outer canthi of the left and right eyes were used to identify blinks, in conjunction with recordings from FP1/2, and horizontal eye movements, respectively. Mastoid electrode impedances were maintained below 4 kΩ, scalp electrode impedances below 5 kΩ, and eye electrode impedances below 10 kΩ. Once electrode preparation was complete, participants were seated in a comfortable chair in a sound-attenuating and electrically shielded booth for the ERP task.

For the ERP paradigm, stimuli were presented using Presentation software (Neurobehavioral Systems) at the center of a 19-in. LCD monitor approximately 66 in. in front of each participant. Stimuli were displayed in 36-point Arial font in white on a black background. All letter string stimuli in the lexical decision task subtended less than 1° of vertical visual angle and from 0.9° to 2.1° of horizontal visual angle in order to discourage scanning eye movements. Each trial began with a white crosshair fixation at the center of the screen; participants pressed either of two buttons on a hand-held response device to advance to the presentation of the stimulus. After this buttonpress, the stimulus was presented for a maximum of 3,000 ms, but participants were asked to press a button to indicate their word/nonword decision as quickly as possible after stimulus onset. Word/nonword buttons (i.e., response hand) were counterbalanced across participants. The buttonpress response removed the stimulus from the screen and advanced the program back to the fixation point to start a new trial. If the participant did not press a button to indicate a response within 3 s of stimulus onset, the stimulus disappeared from the screen, and the program advanced to the fixation point automatically. The session was self-paced in that only a buttonpress by the participant would advance from the fixation to the next trial. A brief practice session of 12 trials, not including any of the experimental stimuli, preceded the experimental session. On average, the ERP task took about 12.1 min (SD 2.8).
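For readers who wish to relate the reported visual angles to on-screen size, the standard geometric relation between stimulus size, viewing distance, and visual angle can be sketched in Python as follows. The function names are hypothetical, and the on-screen stimulus sizes are not reported in the text; only the 66-in. viewing distance and the angles are.

```python
import math


def visual_angle_deg(size: float, distance: float) -> float:
    """Visual angle (degrees) subtended by a stimulus of a given size at a given distance.

    Size and distance must be in the same unit (e.g., inches).
    """
    return math.degrees(2.0 * math.atan(size / (2.0 * distance)))


def size_for_angle(angle_deg: float, distance: float) -> float:
    """Stimulus size that subtends angle_deg at the given distance (inverse of the above)."""
    return 2.0 * distance * math.tan(math.radians(angle_deg) / 2.0)
```

Under this relation, and assuming the reported 66-in. viewing distance, horizontal angles of 0.9° and 2.1° would correspond to letter strings roughly 1.0 in. and 2.4 in. wide, respectively.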

Data analysis

EEG was amplified with SA Instrumentation bioamplifiers, with bandpass of 0.01–100 Hz, and digitized online with a sampling interval of 4 ms (i.e., a sampling rate of 250 Hz). ERPs were time-locked to the visual presentation onset of each word and nonword. Offline, separate ERPs to the bound words, bound nonwords, free words, free nonwords, control words, and control nonwords were averaged for each subject at each electrode site over a 1,000-ms epoch, using a 200-ms pre-stimulus-onset baseline. Only trials on which participants responded correctly and within the allotted time in the lexical decision task were included in the ERP averages. Trials contaminated by eye movements, muscular activity, or electrical noise were not included in analyses. Standard artifact rejection parameters were initially employed, and data were subsequently analyzed on an individual basis for artifact rejection. The average numbers of trials included in the primary conditions of interest were the following: bound words, mean 48.8 (SD 5.3) or 81 % (SD 8.8 %); bound nonwords, mean 46.1 (SD 6.6) or 77 % (SD 11 %); free words, mean 44.8 (SD 6.3) or 75 % (SD 10.5 %); free nonwords, mean 47.3 (SD 7.5) or 79 % (SD 12.5 %); control words, mean 49.9 (SD 5.4) or 83 % (SD 9 %); and control nonwords, mean 51.3 (SD 5.6) or 86 % (SD 9.3 %).
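The baseline correction and averaging steps described above can be illustrated with a short Python sketch. It is not the authors' analysis code: it assumes that each epoch is a list of voltages for one electrode, sampled every 4 ms, and that the epoch begins with the 200-ms pre-stimulus baseline; the function names are hypothetical.

```python
from typing import List, Sequence

SAMPLE_MS = 4      # sampling step reported in the text
BASELINE_MS = 200  # pre-stimulus baseline reported in the text


def baseline_correct(epoch: Sequence[float],
                     baseline_ms: int = BASELINE_MS,
                     sample_ms: int = SAMPLE_MS) -> List[float]:
    """Subtract the mean of the pre-stimulus samples from every sample in the epoch."""
    n_base = baseline_ms // sample_ms  # 50 samples at a 4-ms step
    base = sum(epoch[:n_base]) / n_base
    return [v - base for v in epoch]


def average_erp(epochs: Sequence[Sequence[float]]) -> List[float]:
    """Average baseline-corrected epochs (accepted trials only) sample by sample."""
    if not epochs:
        raise ValueError("no accepted trials to average")
    corrected = [baseline_correct(e) for e in epochs]
    n = len(corrected)
    return [sum(samples) / n for samples in zip(*corrected)]
```

In this sketch, trials rejected for incorrect responses or artifacts are simply left out of `epochs` before averaging.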

Mean amplitude of the N400 was measured within the 300- to 600-ms window. An omnibus ANOVA with within-subjects factors morphological type (bound, free, control), lexicality (word, nonword), anterior/posterior [6 levels: frontal (F7/8, F3/4), fronto-temporal (FT7/8, FC5/6), temporal (T3/4, C5/6), central (CT5/6, C3/4), temporoparietal (T5/6, P3/4), and occipital (TO1/2, O1/2)], lateral/medial, and hemisphere (left, right) was performed on the mean amplitude data. Follow-up analyses with Bonferroni correction were used to further investigate effects of morphological type. Difference waves resulting from the subtraction of the word ERPs from the nonword ERPs for each of the three morphological types were created to compare the lexicality effect across morphological types more directly. The Greenhouse–Geisser correction was applied to all within-subjects measures with more than one degree of freedom, and corrected p-values are reported below. Partial eta squared (η²) values are reported as estimates of effect size for the key effects. All results are significant at the .05 level unless otherwise noted.
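The mean amplitude measure and the nonword-minus-word difference waves can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' code: each ERP is assumed to be a list of voltages sampled every 4 ms and starting 200 ms before stimulus onset, the end of the measurement window is treated as exclusive, and all function names are hypothetical.

```python
from typing import List, Sequence


def time_to_index(t_ms: int, baseline_ms: int = 200, sample_ms: int = 4) -> int:
    """Convert a post-stimulus latency (ms) to a sample index in the epoch."""
    return (t_ms + baseline_ms) // sample_ms


def mean_amplitude(erp: Sequence[float], start_ms: int = 300, end_ms: int = 600) -> float:
    """Mean voltage within the measurement window (default: the 300- to 600-ms window)."""
    window = erp[time_to_index(start_ms):time_to_index(end_ms)]  # end excluded (assumption)
    return sum(window) / len(window)


def difference_wave(nonword_erp: Sequence[float], word_erp: Sequence[float]) -> List[float]:
    """Subtract the word ERP from the nonword ERP, sample by sample."""
    return [n - w for n, w in zip(nonword_erp, word_erp)]
```

With these conventions, a more negative N400 to nonwords than to words appears as a negative mean amplitude in the difference wave for that morphological type.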

Results

Behavioral tests

Results from the battery of standardized behavioral tests are summarized in Table 2. These results indicate that participants scored within normal limits—and, on average, above average—on measures of orthography, phonology, vocabulary, and comprehension. In short, the participants could be considered fluent readers with good vocabularies and no reading disorders. These results serve as an important control, since reading ability is a potentially confounding factor.

Table 2 Standardized behavioral test results [mean, (SD)]

Accuracy and reaction time in the lexical decision task

Accuracy on the lexical decision task is summarized in Fig. 1a. An ANOVA with the factors morphological type and lexicality yielded a main effect of morphological type, such that participants were most accurate for the control stimuli (93.9 %), followed by the bound (88.4 %) and free (85.7 %) stimuli, F(2, 30) = 30.64, p < .001. Follow-up pair-wise comparisons with a Bonferroni corrected p-value of .017 (.05/3) showed that participants were more accurate with bound than with free stimuli, t(15) = 3.03, p = .009, and more accurate with control stimuli than with either free, t(15) = 7.52, p = .001, or bound, t(15) = 4.59, p = .001, stimuli. However, this main effect of morphological type was modified by an interaction with lexicality, F(2, 30) = 5.50, p < .05. Follow-up pair-wise comparisons within morphological type did not support the hypothesis that accuracy would be lower for nonwords than for words, although there was a trend toward this pattern for the bound morphological type (bound, p = .08; free, p = .20; control, p = .18).

Fig. 1 Bar graphs illustrating (a) accuracy and (b) reaction time in the lexical decision task. Participants were faster to respond to words than to nonwords for all three morphological types. Bars indicate standard errors.

Reaction times on the ERP lexical decision task are summarized in Fig. 1b. An ANOVA with the factors morphological type and lexicality yielded both a main effect of lexicality, such that responses were faster to words than to nonwords, F(1, 15) = 19.28, p < .01, and a main effect of morphological type, such that participants were fastest to respond to control stimuli, followed by bound and free stimuli, F(2, 30) = 37.47, p < .001. Follow-up pair-wise comparisons for the main effect of morphological type with a Bonferroni-corrected p-value of .017 (.05/3) showed that response times were similar for bound and free stimuli (p = .123) but faster for control stimuli than for either bound, t(15) = 7.66, p = .001, or free, t(15) = 6.95, p = .001, stimuli.

These main effects were modified by an interaction between morphological type and lexicality, F(2, 30) = 3.85, p < .05, which was also followed up by pair-wise comparisons. For each of the three morphological types, as predicted, participants were faster to respond to words than to nonwords at the Bonferroni-corrected p-value of .017 (.05/3) [bound, t(15) = 4.13, p = .001; free, t(15) = 4.93, p = .001; control, t(15) = 2.99, p = .009]. Looking only at word stimuli, the effect of morphological type was significant, F(2, 30) = 8.10, p = .005; this was also the case for nonword stimuli, F(2, 30) = 38.82, p = .001. Follow-up pair-wise comparisons with a corrected p-value of .008 (.05/6) showed slower response times to both bound, t(15) = 4.07, p = .001, and free, t(15) = 3.11, p = .007, words, in comparison with control words. Similarly, response times to both bound, t(15) = 7.60, p = .001, and free, t(15) = 7.74, p = .001, nonwords were longer than those to control nonwords.
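The corrected thresholds quoted above (.05/3 and .05/6) follow from dividing the nominal alpha by the number of comparisons in each family. The sketch below reproduces that arithmetic for the word-versus-nonword reaction time comparisons; the function names are hypothetical, and the p-values are simply those reported in the text.

```python
def bonferroni_threshold(alpha, n_comparisons):
    """Per-comparison significance threshold under Bonferroni correction."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return alpha / n_comparisons


def passes_bonferroni(p_values, alpha=0.05):
    """Map each labelled p-value to True if it is below alpha / (number of comparisons)."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return {label: p < threshold for label, p in p_values.items()}


# Word-versus-nonword reaction time comparisons reported in the text (three tests).
lexicality_rt = {"bound": 0.001, "free": 0.001, "control": 0.009}

print(round(bonferroni_threshold(0.05, 3), 3))  # 0.017
print(round(bonferroni_threshold(0.05, 6), 3))  # 0.008
print(passes_bonferroni(lexicality_rt))
```

All three comparisons fall below the corrected threshold, whereas a value such as the p = .123 reported for the bound-versus-free reaction time comparison would not.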

ERP waveforms in the lexical decision task

Grand average ERP waveforms for each of the three morphological types are shown in Fig. 2a–c. An omnibus ANOVA with the within-subjects factors morphological type, lexicality, anterior/posterior, lateral/medial, and hemisphere yielded a main effect of morphological type, F(2, 30) = 30.84, p < .001, η² = .67, with mean amplitude of the N400 most negative for free stimuli (−0.96 μV, SE 0.27), followed by control (−0.70 μV, SE 0.25) and bound (−0.04 μV, SE 0.24) stimuli. This effect varied across the scalp, suggesting different distributions of the N400 across the morphological types [morphological type × lateral/medial, F(2, 30) = 13.54, p < .001, η² = .47; morphological type × anterior/posterior × lateral/medial, F(10, 150) = 3.17, p < .01, η² = .18]. Follow-up pair-wise comparisons with a Bonferroni-corrected p-value of .017 showed that the N400 was more negative for the free stimuli than for the control stimuli, particularly at medial sites [morphological type × lateral/medial, F(1, 15) = 9.35, p = .008]. Furthermore, the N400 was more negative for control stimuli than for bound stimuli, also particularly at medial sites [morphological type, F(1, 15) = 46.40, p = .001; morphological type × lateral/medial, F(1, 15) = 7.44, p = .016]. Finally, the N400 was more negative for free stimuli than for bound stimuli, particularly at anterior and temporoparietal medial sites [morphological type, F(1, 15) = 44.91, p = .001; morphological type × lateral/medial, F(1, 15) = 19.79, p = .001; morphological type × lateral/medial × anterior/posterior, F(5, 75) = 5.18, p = .002].

Fig. 2 Grand average ERP waveforms elicited by word (solid line) and nonword (dashed line) stimuli for each of the three morphological types: (a) bound, (b) free, and (c) control. More anterior sites are toward the top of each figure, while more posterior sites are toward the bottom; left-hemisphere sites are on the left, and right-hemisphere sites are on the right; lateral sites are toward the outer edges, and medial sites are toward the middle; each vertical tick marks 100 ms; and negative is plotted up. The calibration bar marks 1 μV. A clear effect of lexicality, such that nonwords elicited a larger negativity than did words in the N400 time window (300–600 ms) for each morphological type, is apparent.

The omnibus ANOVA also yielded a main effect of lexicality, F(1, 15) = 40.32, p < .001, η² = .73, with nonwords (mean amplitude −1.05 μV, SE 0.30) eliciting more negative N400s than did words (−0.08 μV, SE 0.20). This effect also varied across the scalp and was most pronounced at medial posterior sites, consistent with the typical distribution of the N400 [lexicality × anterior/posterior, F(5, 75) = 3.95, p < .05, η² = .21; lexicality × lateral/medial, F(1, 15) = 18.25, p < .01, η² = .55; lexicality × anterior/posterior × lateral/medial, F(5, 75) = 8.54, p < .001, η² = .36].
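As a rough consistency check on the effect sizes in this paragraph, partial eta squared can be recovered from an F ratio and its degrees of freedom as F × df_effect / (F × df_effect + df_error). The sketch below assumes that the reported values follow this standard relation and were computed with the uncorrected degrees of freedom shown in the text; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and both df values positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (F, df_effect, df_error) for the lexicality effects reported in the text.
reported = {
    "lexicality": (40.32, 1, 15),
    "lexicality x anterior/posterior": (3.95, 5, 75),
    "lexicality x lateral/medial": (18.25, 1, 15),
    "lexicality x anterior/posterior x lateral/medial": (8.54, 5, 75),
}

for label, (f_value, df_effect, df_error) in reported.items():
    print(label, round(partial_eta_squared(f_value, df_effect, df_error), 2))
```

Under these assumptions, the recomputed values round to .73, .21, .55, and .36, matching the effect sizes given for the lexicality effects.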

In order to better visualize the apparent N400 lexicality effect (i.e., the difference between word and nonword processing) for each of the three morphological types, difference waves were created by subtracting the word ERPs from the nonword ERPs for each morphological type. Topographical voltage maps were then created using a spherical spline interpolation to interpolate the potential on the surface of an idealized, spherical head (Perrin, Pernier, Bertrand, & Echallier, 1989) on the basis of the mean voltages measured in the difference waves at each electrode location within the 300- to 600-ms time window (see Fig. 3). Analysis of the amplitude of the difference waves across morphological types yielded no significant results (all ps > .11), confirming that the mainly centroparietal N400 lexicality effect was of similar amplitude and similarly distributed across the three morphological types.

Fig. 3 Topographical voltage maps illustrating a similar N400 lexicality effect across the three morphological types. A spherical spline interpolation (Perrin et al., 1989) was used to interpolate the potential on the surface of an idealized, spherical head based on the voltages measured from the difference waves (reflecting ERPs to words subtracted from ERPs to nonwords) at each electrode location within the N400 time window (300–600 ms) for each of the three morphological types.

Importantly, there were no significant interactions between morphological type and lexicality in the omnibus ANOVA (morphological type × lexicality, p = .35, η² = .07; all other interactions involving morphological type, lexicality, and distributional factors, ps > .10).

Discussion

In order to investigate the sensitivity of the N400 component of the ERP waveform to morphological decomposition and semantic composition, we used a stimulus set well controlled for orthography and frequency, including words and matched nonwords made up of bound morphemes, words and matched nonwords made up of free morphemes, and monomorphemic control words and matched nonwords. We reasoned that if the N400 were an index of morphological decomposition, nonwords made up of the same meaningful parts as real words, whether those parts were bound or free morphemes, would elicit an N400 similar to their matched real word counterparts (e.g., El Yagoubi et al., 2008; McKinnon et al., 2003). Conversely, if the N400 were an index of semantic composition, we predicted that nonwords would elicit a more negative N400 than would matched real words across all three morphological types, indicating a sensitivity to the overall lexicality of the stimulus, rather than to the meaningfulness of its parts (e.g., Bai et al., 2008; Lehtonen et al., 2007). Our results are consistent with the latter and strongly suggest that the N400 serves as an index of semantic composition in fluent readers.

The N400 and lexicality

Analyses of the N400 ERP data in the lexical decision task showed a clear main effect of lexicality such that nonwords elicited more negative N400s than did real words across all three morphological types. Analyses of difference waves and topographical voltage maps confirmed the similarity of the N400 lexicality effect across morphological types. Thus, regardless of whether the parts of words were meaningful morphemes (as in the bound and free morpheme stimuli, reflected in derived and compound complex words) or not (as in the control syllable stimuli), nonwords made up of those parts consistently elicited a more negative N400 than did their matched word counterparts. That even the control nonwords elicited a more negative N400 than did the control words might suggest that the amplitude of the N400 is not sensitive to morphology at all (but see discussion below). This overall pattern of results is consistent with previous studies outside the domain of morphological processing, which have indicated that pseudowords elicit more negative N400s than do real words, likely due to greater effort spent on lexical access and meaningful integration (e.g., Bentin, 1987; Friedrich et al., 2006; Holcomb & Neville, 1990). This is inconsistent with a process of morphological decomposition insensitive to semantic information (e.g., Longtin & Meunier, 2005; McCormick et al., 2008; Morris et al., 2007) and suggestive instead of the N400 as an index of semantic composition in morphological processing.

That real words elicited less negative N400 amplitudes—even though the bound and free nonwords were made up of the same morphological components as their real word counterparts—is consistent with some previous research with bound and free morphemic stimuli showing a lexicality effect for N400 amplitude (e.g., Bai et al., 2008; Lehtonen et al., 2007). However, it is inconsistent with other ERP studies that have reported an N400 more sensitive to morphological decomposition than to semantic composition (e.g., El Yagoubi et al., 2008; McKinnon et al., 2003). As was discussed in the Introduction, McKinnon and colleagues (2003, p. 883) reasoned that if readers decomposed their bound morpheme word and matched nonword stimuli into their constituent morphemes and these morphemes were represented in the lexicon, the words and nonwords would elicit N400s of the same amplitude, since the stimuli were made up of the same morphemes; and this is exactly what they found. This pattern of results is in direct contradiction to our bound stimuli findings.

There are a number of methodological differences between the two studies that may have contributed to the different results. The words and nonwords in McKinnon et al.’s (2003) bound condition were similar to ours, while their control condition included morphologically complex words; examples provided were bookmark and muffler, the first of which would have been included with our free morpheme stimuli. In addition, our stimuli were slightly shorter in length on average, and ours were controlled not only for bigram frequency for words and nonwords within a morphological type (as theirs were), but also for constrained and unconstrained bigram frequency, constrained and unconstrained trigram frequency, orthographic neighborhood size, and overall frequency both within and across morphological types. Given that N400 amplitude is known to be sensitive to both orthographic neighborhood size and frequency (e.g., Barber & Kutas, 2007; Holcomb et al., 2002; Van Petten & Kutas, 1990), these differences in stimulus construction, particularly with respect to the creation of bound nonwords, may have contributed to the differences in results. Moreover, Amenta and Crepaldi (2012) speculated that characteristics of the entire experimental list could influence morphological effects (see, e.g., Andrews, 1986; Feldman & Basnight-Brown, 2008; Taft, 2004); we included a compound word morphological type that was not used by McKinnon et al., which may have contributed to the differing findings. In addition, although McKinnon et al. reported accuracy data for their ERP lexical decision task, they did not note whether only trials correctly responded to were included in the ERP averages (as done here), and they used a different statistical approach than our own. Furthermore, McKinnon et al. did not report on the reading ability of their participants; familiarity and fluency with words could be a confounding factor in the lexical decision task. Finally, McKinnon et al. included 12 participants and described development of two experimental lists, although it is unclear whether each participant viewed both lists of stimuli or only one. Both of these factors could have contributed to a difference in power between the two experiments. However, visual inspection of our individual participant data indicated that only 1 of 16 participants showed similar-amplitude N400s for bound word and nonword stimuli. Overall, differences in stimulus construction and design likely contributed to the different outcomes in the present study and that of McKinnon et al. Perhaps these characteristics differentially affected the contributing processes reflected in the integrative N400 (e.g., Coch & Holcomb, 2003; Laszlo & Federmeier, 2011; Pylkkänen & Marantz, 2003).

With respect to our free morpheme stimuli, in an ERP study of compound words in Italian, El Yagoubi and colleagues (2008) also reported an effect opposite of ours: N400s of similar amplitude for real words and matched nonwords made up of the components of the real words. However, their nonwords were created by simply reversing the order of the components (e.g., the real word capobanda, meaning band leader, became the nonword bandacapo, with no meaning), rather than recombining parts across different stimuli, as we did here (e.g., the components of cobweb were distributed to the nonwords cobline and bobweb). The smaller N400 lexicality effect for compounds reported by El Yagoubi et al. might have been related to this feature of stimulus design, particularly considering that the stimuli were in Italian, a language with a shallow orthography and in which headedness in compounds is not fixed. In comparison, the English used here has a deep orthography and fixed headedness.

Another possibility is that the smaller lexicality effect reported by El Yagoubi et al. (2008) is related to their exclusive use of transparent compounds, while our free morpheme stimuli included both transparent (e.g., earring) and opaque (e.g., chickpea) compounds. Since opaque compounds elicit more negative N400s than do transparent compounds when auditory components of the compounds are sequentially presented (Bai et al., 2008) and N400 priming effects are stronger for semantically transparent than for semantically opaque stimuli in a cross-modal lexical decision paradigm (Kielar & Joanisse, 2011), it is possible that visually presented opaque compounds could elicit greater N400s than transparent compounds. However, it is unclear how this would contribute to the greater lexicality effect observed here, since all of the nonwords were necessarily opaque. Thus, the greatest lexicality effect for N400 amplitude would be expected for transparent words (least negative N400), as compared with matched nonwords (most negative N400), the conditions in the El Yagoubi et al. study. But we found a greater lexicality effect for N400 amplitude with our mix of transparent and opaque words, as compared with matched nonwords. It remains for future research to more directly address the effects of lexicality and transparency on N400 amplitude in visual compound word and nonword processing. For our purposes, a significant lexicality effect was observed for a mix of transparent and opaque words, as compared with matched nonwords—the same as the effect observed for the bound and control stimuli.

The N400 and morphological type

In addition to the clear effect of lexicality on N400 amplitude across the three morphological types, the amplitude of the N400 was, separately, sensitive to morphological type. Previous behavioral studies have reported similar priming effects for free and bound stems (see, e.g., Amenta & Crepaldi, 2012). Here, the N400 was most negative to free morpheme stimuli and more negative to control stimuli than to bound morpheme stimuli, all effects with a medial distribution. These findings suggest that the N400 may be sensitive to some internal, rather than overall, aspects of the stimuli. Since the stimuli were controlled on measures of length, orthography, and frequency and varied only in terms of type of constituent (bound morphemes, free morphemes, or syllables), these effects suggest that the N400 may be sensitive to constituent type—or some other factor for which we did not control.

How might N400 amplitude be sensitive to constituent type? Given that it is known that each constituent of a compound word can elicit an N400 (e.g., Koester et al., 2009; Vergara-Martínez et al., 2009), it is tempting to speculate that the more negative N400s to free morpheme stimuli could reflect lexical access or integration processes not only for the overall stimulus, but also for the two lexical components of the stimulus; in this way, the N400 could be considered an index of morphological decomposition. However, bound morpheme stimuli, which could also be decomposed into meaningful parts, elicited the least negative N400s. In turn, control stimuli, which could not be decomposed into meaningful parts (e.g., Meunier & Longtin, 2007, in which complex words not composed of morphemes did not appear to be decomposed into parts of words that did not contain meaning), elicited N400s less negative than those elicited by free morpheme stimuli but more negative than those elicited by bound morpheme stimuli. If a morphological decomposition process were based on whether stimuli could be parsed into morphemic units (e.g., Meunier & Longtin, 2007, p. 459) and the N400 were an index of this decomposition process, we would not expect an N400 for the monomorphemic control stimuli and would expect N400s for the decomposable bound and free stimuli. Thus, this overall pattern suggests that N400 amplitude here is not primarily an index of the morphological decomposability of the stimuli.

This line of argument rests on the morphological decomposability of the bound and free morpheme stimuli into meaningful parts, in contrast to the control stimuli. By definition, morphemes are meaningful parts. However, it could be argued that bound stem morphemes exist as morpho-orthographic units, not as “meaningful” morpho-semantic units (see, e.g., Morris, Porter, Grainger, & Holcomb, 2011; Rastle & Davis, 2008). Indeed, one would expect a pattern of a more negative N400 to free than to bound stimuli, as observed here, if bound stems do not activate semantic (“meaningful”) representations, while free stems do. However, this argument does not account for the observed difference in N400 amplitude between control and bound stimuli, both of which, according to this line of reasoning, include nonmeaningful morpho-orthographic constituents. In addition, following this argument, if the N400 were purely an index of morphological decomposition and decomposition occurred only in cases in which morphemes were defined as semantic units, but not in cases in which they were defined as orthographic units, we would expect no N400s to either bound or control stimuli, while both elicited substantial N400s here. We do not have independent evidence from our participants regarding the semantic status of the bound, free, or control constituents used in stimulus construction in order to begin to adjudicate this argument. However, given evidence that the N400 likely reflects integration across multiple processes (e.g., Coch & Holcomb, 2003; Laszlo & Federmeier, 2011; Pylkkänen & Marantz, 2003), our finding of a main effect of morphological type is consistent with the notion that somewhat different contributing processes may be involved across morphological types. The N400 here does not seem to serve as a direct index of those contributing processes; rather, an interpretation of the N400 amplitude as an index of integration across those contributing processes is more consistent across the N400 literature, the lexicality findings, and the morphological type findings.

The behavioral findings

Turning to the behavioral results, the reaction time data from the lexical decision task were consistent with a processing time cost for morphologically complex words: As was predicted, responses were fastest for the monomorphemic control stimuli, followed by the bound and free stimuli (e.g., Caramazza et al., 1988). Whether this time cost reflects an actual process of decomposition or costs of associated semantic composition or lexical decision making as required by the task is unclear. This study was not designed to address this issue. Also as was expected, responses were faster for real words than for nonwords across all three morphological types, although a similar lexicality effect was not apparent in the accuracy data. The lexicality effect in response times was mirrored in the N400 amplitude data, such that nonwords elicited both slower responses and more negative N400s, and replicates findings in previous electrophysiological reports focused on issues other than morphological processing (e.g., Bentin, McCarthy, & Wood, 1985; Holcomb, 1988, 1993; Holcomb & Neville, 1990).

Conclusion

To be clear, our findings regarding morphological processing and the N400 do not indicate that a process of early, prelexical morphological decomposition (e.g., Taft & Forster, 1975) does not occur, just that the N400 is not a direct index of this process (but cf. Giraudo & Grainger, 2001, for a supralexical account of morphological representation). As was reviewed in the Introduction, there is ample evidence for a process of morphological decomposition, for which some sort of process of semantic composition would appear to be a necessary counterpart (e.g., Koester et al., 2007; Meunier & Longtin, 2007). Since morphological decomposition necessitates morphological semantic composition, the N400 could be considered an indirect index of decomposition, which may account for the observed effect of morphological type on N400 amplitude. But earlier components, such as the N250 elicited in masked priming experiments of morphology (e.g., Morris et al., 2007; Morris et al., 2008), the left anterior negativity elicited in auditory compound word processing (Koester et al., 2007), the anterior negativity identified in visual compound word processing (El Yagoubi et al., 2008), or the magnetoencephalographic M170 sensitive to morphological complexity (Zweig & Pylkkänen, 2009), would appear to be better candidates as more direct neural indices of a process of morphological decomposition. Instead, our results suggest that morphological semantic composition is one of the integrative processes indexed by the N400.

In conclusion, the pattern of results observed here and in previous ERP studies of morphological processing (e.g., Bai et al., 2008; Koester et al., 2007; Lehtonen et al., 2007) indicating that the N400 is more reflective of semantic composition than of morphological decomposition is consistent with a larger literature on the N400 component as an index of lexico-semantic integration processes (e.g., Holcomb, 1988; Kutas & Federmeier, 2000; Van Petten & Luka, 2006). Integration across earlier orthographic, phonological, and morphological processing of word and word-like stimuli characterizes an N400 at the form–meaning interface (e.g., Grainger & Holcomb, 2009) that serves as an index of integration across multiple levels of representation (e.g., Barber & Kutas, 2007; Coch & Holcomb, 2003; Laszlo & Federmeier, 2011; Pylkkänen & Marantz, 2003). In turn, this is consistent with theories of morphological processing involving both sublexical, orthographic and supralexical, semantic levels (e.g., Diependaele et al., 2005; Järvikivi & Pyykkönen, 2011). As was noted above, Meunier and Longtin (2007) and Koester and colleagues (2004), among others, have proposed that reading morphologically complex words involves both an early, rapid decomposition process based on morphological and orthographic information and a later process during which the component products of this decomposition process are semantically and syntactically integrated, in a continuous cycle. Of the two, it is the latter composition process that appears to be most directly indexed by the N400 in complex word reading.