A perennial question regarding compound words is whether speakers store compounds in the lexicon as simple atomic units, as the collection of their parts, or some combination of the two. Pure decomposition approaches (e.g., Taft & Forster, 1976) hold that compound forms are broken down into their constituent parts in the lexicon. Nondecompositional accounts (e.g., Butterworth, 1983), in contrast, hold that compounds correspond to unitary lexemes. Hybrid accounts (Baayen & Schreuder, 1999; Schreuder & Baayen, 1995) strike a middle ground, allowing for the coexistence of and connection between the lexemes that represent a compound as a contiguous whole and its constituent lexemes.

We contribute to the debate on compound representation with an investigation of the processing of complex verbs in Norwegian. Complex verbs—alternatively called “particle verbs” or “prefixed” verbs (see, e.g., Smolka, Komlosi, & Rösler, 2009)—are two-constituent compounds containing a first element (which can vary in part of speech) and a root verb. We confine our study to complex verbs such as undersøke (“investigate”), in which the first constituent is a preposition (under “under”) combined with a root verb (søke “seek”). We asked whether Norwegian speakers store complex verbs holistically in the lexicon or decompose these verbs at morphemic boundaries.

According to a pure decompositional account, verbs such as undersøke should map to two distinct lexical entries: the lexical entry for the preposition under, and the lexical entry for the root søke. Nondecompositional accounts hold that such a verb should simply have a single, holistic lexical entry, in which the preposition and root are represented jointly (Footnote 1). Hybrid accounts posit the existence of an entry for the contiguous compound that is linked to (at least) the entries of the two constituents. We present an eyetracking-while-reading study in Norwegian that investigated how the whole-word and constituent frequencies impact the reading times for complex verbs, in an attempt to provide empirical support for one of these accounts. To preview our findings, our results were most compatible with a version of hybrid representation.

A second question of interest arises within hybrid theories. In the case of structured or multicomponent lexical entries, it is an open question which of the various components are accessed immediately, and which are activated subsequently. Early decomposition-first models propose that compound words undergo automatic morphological segmentation prior to lexical access, and that recognition proceeds first via access to the constituent lexemes before ending with the whole-word representation (Taft & Forster, 1976). Whole-word-first models (such as the supralexical model of Giraudo & Grainger, 2000) maintain the opposite order of operations: Access to the whole-word representation gates subsequent access to the constituent parts. Dual-route models (either parallel or interactive), by contrast, allow for simultaneous early access to both the whole-word and constituent representations (Baayen & Schreuder, 1999), provided, of course, that automatic prelexical segmentation has occurred. Below we review previous work that bears on these questions of interest.

Evidence for and against decomposition

Priming

Previous work on complex verbs in German and Dutch has argued for lexical decomposition, on the basis of priming effects in both visual and auditory lexical decision (Schriefers, Zwitserlood, & Roelofs, 1991; Smolka, Gondan, & Rösler, 2015; Smolka et al., 2009; Smolka, Preller, & Eulitz, 2014; Smolka, Zwitserlood, & Rösler, 2007; Zwitserlood, Bolwiender, & Drews, 2007). Smolka et al. (2014) conducted a series of studies in German that tested whether a simple verb (e.g., binden “bind”) was primed by four different types of particle verbs: (i) semantically associated morphological derivatives of the simple verb (e.g., zubinden “tie”), (ii) morphological derivatives whose meaning was not transparently related to the root (entbinden “deliver”), (iii) semantic associates that were not morphologically related to the verb (zuschnüren “tie”), and (iv) unrelated verbs whose root was formally similar to the critical verb (abbilden “depict”). The researchers found that morphological family members facilitated processing of their root, regardless of semantic transparency; both zubinden and entbinden primed binden to equal degrees. In contrast, morphologically unrelated semantic associates (zuschnüren) offered either small or inconsistent priming effects, and form-related lures failed to prime the root (see also Smolka et al., 2009). These results suggest that the verbal root is independently activated when processing a complex verb in German, rendering the root available to create priming. This in turn implies that the complex verb is decomposed at some stage in lexical access, as is expected by decompositional and hybrid models. Thus, the representations of German and Dutch complex verbs appear to incorporate the root lexeme, as predicted by both strictly decompositional and hybrid models.

Frequency effects

Other researchers have used distributional characteristics apart from priming, such as the frequency of words or morphemes, to investigate lexical decomposition. These studies rely on the well-established finding that the time it takes to read a word is influenced by that word’s frequency (Rayner & Duffy, 1986). Frequency effects are often thought to provide an index of lexical access, since frequency influences recognition times for words in tasks such as lexical decision (e.g., Andrews & Heathcote, 2001). Investigations into decomposition generalize the logic of lexical frequency effects: If decomposition requires access to the lexical items of a complex word’s constituents, then each of these constituents’ frequencies should have an impact on the processing time. The most commonly used frequency measures are (i) whole-word frequency, which counts occurrences of the citation-form word, and (ii) root frequency, which sums occurrences of the root word across different morphological contexts (søke, undersøke, oppsøke, etc.). Effects of whole-word frequency are thought to reflect access to a unitary lexical representation, whereas effects of root frequency indicate decomposition, under the assumption that the root’s properties can only influence processing if that root’s independent lexical entry has been accessed (though see Baayen, Wurm, & Aycock, 2007). Findings of both whole-word and root frequency effects support hybrid models.

We know of no work that has directly investigated how the frequencies of individual constituents affect the recognition of complex verbs in Germanic languages, so below we focus on work on other compound forms and mention similar investigations conducted with other multimorphemic items. Researchers have looked for frequency effects in both lexical decision and eye movement experiments.

Lexical decision

Fiorentino and Poeppel (2007) compared the processing of compound words (e.g., teacup) to single words (e.g., crescent) that were matched for overall word form frequency (and other superficial factors). Although the word groups were matched for whole-word frequency, the compounds were composed of constituent words (tea, cup) whose frequencies were higher than the frequency of the compound. Fiorentino and Poeppel reasoned that such compounds should be recognized faster than their single-word counterparts if recognition of the compound involves decomposing the compound into its higher-frequency roots. The authors found that compounds were recognized more quickly, on average, than single words in a lexical decision task across frequency bins (Footnote 2), which, the authors concluded, favors an early-decomposition model of compound processing.

Whereas Fiorentino and Poeppel (2007) found evidence for the decomposition of compounds by comparing them to monomorphemic words, other studies have looked for evidence of decomposition by directly comparing compounds whose properties were manipulated. Bronk, Zwitserlood, and Bölte (2013) found that German participants in a lexical decision task recognized compound nouns with high-frequency first constituents (e.g., Papierhut “paper hat”) more quickly than compound nouns with low-frequency first constituents (e.g., Zauberhut “magic hat”), even when controlling for the whole-word frequency of the compounds. Other studies that have manipulated the frequency of the first constituent independently of the second have also found first-constituent frequency effects, in languages including English (Andrews, 1986; Juhasz, Starr, Inhoff, & Placke, 2003; Shoolman & Andrews, 2003; Taft, 1979; Taft & Forster, 1976) and Chinese (Taft, Huang, & Zhu, 1994).

In spite of these findings, the lexical decision evidence does not provide conclusive support for full or automatic decomposition. First-constituent effects may provide less compelling evidence, since first constituents enjoy a perceptual advantage in left-to-right reading. Second constituents are arguably more important for determining full decomposition, in languages where (endocentric) compounds have word-final heads. Two studies have shown that second-constituent frequency can facilitate the recognition of noun–noun compounds (Duñabeitia, Perea, & Carreiras, 2007; Juhasz et al., 2003), but these effects are not firmly established, nor have they been demonstrated for compounds of other types, such as complex verbs.

There are also suggestions that constituent frequency effects are not equally reliable across languages (see, e.g., Dronjic, 2011, for a discussion of Chinese) and that their appearance may depend on external factors such as task difficulty. For example, Bronk et al. (2013) found that manipulating pseudoword foils modulated compound-processing effects (see also Andrews, 1986). It would appear, then, that lexical decision studies do not provide conclusive evidence for full or automatic decomposition.

Eyetracking while reading

Eyetracking studies provide a separate—and arguably more ecologically valid—approach for investigating decomposition and its automaticity under normal processing.

The literature on the processing of compound nouns (e.g., moonwalk) has provided evidence that the properties of the first constituent (moon) can affect early indices of visual word recognition such as first-fixation duration (i.e., the duration of the reader’s first eye fixation on the word). Early studies on the reading of Finnish compounds, for example, showed that the first-constituent frequency consistently affected first-fixation times, but the whole-word frequency did not (Hyönä & Pollatsek, 1998; Pollatsek, Hyönä, & Bertram, 2000; see also Andrews, Miller, & Rayner, 2004, and Juhasz et al., 2003, for similar results in English compounds). Other studies have shown simultaneous first-constituent and whole-word frequency effects on early measures (Juhasz, 2008; Kuperman, Bertram, & Baayen, 2008; Kuperman, Schreuder, Bertram, & Baayen, 2009). Simultaneous effects of constituent and whole-word frequencies are compatible with dual-route models.

Later work on compound nouns has suggested that the influence of the first constituent might be exaggerated by the use of long compounds (in Finnish), none of which could be processed within a single fixation. Bertram and Hyönä (2003) demonstrated that first-constituent frequency effects emerge in long compounds, but not in short compounds that could be apprehended in a single fixation. The authors advocated a “horse-race,” parallel-processing model, in which shorter compounds are processed directly and longer compounds are processed using the compositional route.

Second-constituent frequency has not been shown to have a reliable effect on early processing. When Pollatsek et al. (2000) manipulated second-constituent frequency in Finnish compounds while holding first-constituent frequency constant, second-constituent effects were only found in later measures, such as second-fixation durations and gaze durations (i.e., the sum of all fixation durations during the reader’s first encounter with the word).

Therefore, as with the lexical decision effects, eyetracking studies on nominal compounds do not provide unqualified support for full or automatic decomposition.

The small literature on the processing of morphologically complex, prefixed verbs in Western European languages is also relevant. These verbs, although not compounds per se, are superficially quite similar to complex verbs in Norwegian. In both cases, the verb consists of both a prefix and a verbal root in second position. Evidence for early effects of decomposition in the processing of prefixed verbs has been mixed and may perhaps vary cross-linguistically.

Early work on French failed to find root frequency effects for prefixed verbs (Beauvillain, 1996; Colé, Beauvillain, & Segui, 1989). These findings appear to support a nondecompositional analysis in which French prefixed verbs only have whole-word representations (Footnote 3). More recent research in English, however, has suggested that both whole-word and root representations play integral roles in the processing of prefixed verbs. Niswander-Klement and Pollatsek (2006) studied prefixed verbs that were composed of a prefix and a free root morpheme (e.g., “remove,” “re-” + “move”). Across two studies the authors found a reliable negative correlation between root frequency and gaze duration. In their second experiment they also found that root frequency interacted with word length, such that a higher root frequency reduced gaze duration for long but not for short verbs.

Pollatsek, Slattery, and Juhasz (2008) compared the processing of lexicalized and novel prefixed words. In an experiment that crossed word novelty and root frequency, the authors inspected the processing of verbs in identical preceding contexts (e.g., Chris was warned not to {overload/overmelt} the . . .). First-fixation times were longer on novel words, but there was not a reliable effect of root frequency on this measure. Root frequency affected both gaze duration and regression path duration (i.e., the sum of all fixations from the first encounter with the word until moving past it to the right, including any regressive rereading) for both novel and lexicalized verbs, but the size of the effect did not differ by verb type. The authors concluded that decomposition was just as likely when processing novel words that necessarily required decomposition (since it was impossible for them to have whole-word representations) and lexicalized words.
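The three reading-time measures referred to in this section (first-fixation duration, gaze duration, and regression path duration) can be made concrete with a minimal sketch. The fixation record format, a temporally ordered list of (word index, duration) pairs, and the function name `reading_measures` are assumptions made for illustration; they are not the data format or procedure of any study cited here.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical record format: (index of fixated word, duration in ms).
Fixation = Tuple[int, int]


def reading_measures(fixations: List[Fixation], target: int) -> Optional[Dict[str, int]]:
    """Compute first-pass measures for the word at index `target`.

    Returns None if the target was skipped in first pass, i.e. if a word to
    its right was fixated before the target itself.
    """
    # Find the first fixation on the target; give up if a later word comes first.
    start = None
    for i, (word, _) in enumerate(fixations):
        if word == target:
            start = i
            break
        if word > target:
            return None
    if start is None:
        return None

    first_fixation = fixations[start][1]

    # Gaze duration: consecutive fixations on the target before the eyes leave it.
    gaze = 0
    i = start
    while i < len(fixations) and fixations[i][0] == target:
        gaze += fixations[i][1]
        i += 1

    # Regression path duration: all fixations from first entering the target
    # until the eyes move past it to the right, including regressive rereading.
    regression_path = 0
    j = start
    while j < len(fixations) and fixations[j][0] <= target:
        regression_path += fixations[j][1]
        j += 1

    return {
        "first_fixation": first_fixation,
        "gaze": gaze,
        "regression_path": regression_path,
    }
```

In this sketch, gaze duration is always at least as long as first-fixation duration, and regression path duration is at least as long as gaze duration, which mirrors the ordering of "early" and "later" measures in the text.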

In sum, whole-word frequency effects have been found in first-fixation durations, which strongly implies a role for unitary lexical representations in the processing of morphologically complex words in such languages as English. We believe that much of the present evidence is consistent with one of two assumptions: either whole-word processing always precedes constituent processing, with decomposition occurring subsequently, or the morphological segmentation necessary for initial constituent processing is not automatic (for some forms or some languages).

Norwegian complex verbs in comparison

Complex verbs in Norwegian are similar to complex verbs in German or Dutch, in that they are often composed, as we discussed above and as is illustrated again in Example 1, of (at least) two free morphemes, most commonly a preposition/particle and a verbal root.

(1)

a. Norwegian: avgjøre (av + gjøre) = “to decide” (lit. “off” + “do”)

b. German: abholen (ab + holen) = “to pick up” (lit. “off/up” + “get/fetch”)

The parallel is underscored by the overlap and etymological relation between many of the prefixes (e.g., av and ab) that surface in compound verbs across the two languages.

The Germanic languages are also alike in that prepositional prefixation is only semiproductive, if not almost entirely lexically determined. The prefixes av and ab, for instance, cannot be productively attached to new roots with the same ease as verbal prefixes like English “re-” or “over-,” studied in previous work on decomposition (e.g., Pollatsek et al., 2008). Moreover, the semantic contribution of individual prefixes in a compound does not straightforwardly reflect the lexical meaning of its free form (Libben, 2010), nor is it always possible to specify a core meaning contributed by the prefixes across their uses (again contrary to “re-” or “over-”). Relatedly, Germanic complex verbs are also alike in that the interpretation of a whole compound can vary in transparency from the perspective of the individual meanings of its parts. Examples range from the entirely transparent Norwegian complex verb utbygge (“to extend, build out,” literally “out” + “build”), to the less transparent avgjøre in Example 1a above, to the entirely opaque, idiomatic omkomme (“to die,” literally “around” + “come”).

Given the many parallels, it is possible that the processing of such complex verbs could be similar across Germanic languages. However, we here note one way in which complex verbs in Norwegian differ from their German cousins: Norwegians and Germans receive vastly different surface input as to the structural “separability” of the prefix and the root in complex verbs. Norwegian verbs such as avgjøre appear as a contiguous unit when inflected. This is true across a range of syntactic contexts: The morphemes appear together in the infinitive citation form (Ex. 1a), when inflected in main clauses (Ex. 2a), and in the past participle (Ex. 2b).

(2)

a. Jeg avgjør om det skjer.
   I decide.pres if that happens
   “I decide if that happens.”

b. Jeg ha-r avgjort det.
   I have-pres decide.part that
   “I have decided that.”

German particles, in contrast, form a unit with their roots in the infinitive citation form (Ex. 1b) but are frequently separated from their root when inflected. For example, the prefix ab is “stranded” by the root hole when abholen is the main verb of the highest clause (Ex. 3a). Moreover, the prefix is separated from the root by the prefix ge- in the past participle (Ex. 3b).

(3)

a. Ich hol-e das ab.
   I fetch-pres.1sg that off/up
   “I pick it up.”

b. Ich hab-e das ab-ge-hol-t.
   I have-pres.1sg that off-pprt1-fetch-pprt2
   “I have picked that up.”

If surface cues to morphological or syntactic separability play a role in determining lexical decomposition, as was suggested by Smolka et al. (2014), it is conceivable that Norwegians may be less likely than Germans to decompose complex verbs. We will return to this possibility in the Discussion.

Present experiment

We sought to determine whether Norwegian complex verbs are decomposed in the lexicon by testing whether their processing is sensitive to whole-word and root frequency effects. In the case that both whole-word and root frequency effects were found, we wished to know whether the indices of decomposition would precede, follow, or occur simultaneously with whole-word effects.

Method

Participants

A total of 36 native speakers of Norwegian (25 female, 11 male; mean age = 31.2, range = 22–47) from the local community in Trondheim, Norway, participated in the experiment. Participants received a gift certificate redeemable for one movie ticket as compensation.

Materials

In all, 70 prefixed verbs were chosen. Each verb was composed of one of 62 roots, in conjunction with one of 12 prepositions (av, for, fram/frem, inn, mot, om, opp, over, , til, ut, and ved). The verbs ranged from five to 11 characters in length (mean length = 7.57, median = 7). All verbs were presented in the present tense, such that they all ended in the suffix -(e)r. We did not systematically control for the semantic transparency of our compound verbs.

The entire experiment was conducted in Bokmål, one of two official standards for written Norwegian (the other being Nynorsk; Venås, 1993). We chose Bokmål because it is the most commonly used standard in everyday and academic writing; roughly 85%–90% of Norwegians prefer to use Bokmål over Nynorsk in their day-to-day life (Vikør, 1995). Because instruction in both norms is also mandated by law, Norwegians are familiar with both forms (Staalesen, 2014). All but three of our participants reported using Bokmål almost exclusively, and the remaining three, Nynorsk-preferring participants reported that they regularly read Bokmål.

The target verbs were embedded in carrier sentences unique to each verb. These carrier sentences were, on average, 11 words in length (range = 8–16). The ordinal position of the target verbs ranged from 5 to 8, and the target verbs were never presented sentence-finally. The full list of experimental stimuli appears in the Appendix.

To partially control for potential confounding effects of each item’s unique carrier sentence, we collected cloze probabilities and plausibility ratings for the individual verbs in context. Items were created by truncating our test sentences before the critical verb. Two separate surveys were conducted on the Ibex Farm experimental platform (Drummond, 2012). Predictability values came from a cloze task, in which 20 native Norwegian volunteers (recruited through social media platforms) read sentence fragments one by one and filled in the verb that they felt best continued the sentence. Plausibility values were collected from 21 different volunteers. The volunteers were instructed to read each sentence fragment, accompanied by its corresponding verb, and to judge whether the verb was a plausible continuation of the preceding sentence fragment on a 7-point scale.

The cloze probability of each item was calculated as the proportion of trials on which participants provided the complex verb used in the test sentence. Most verbs did not appear in the cloze continuations: Sixty-one of our 70 verbs were never provided in the participant responses, and thus had a cloze probability of 0. The remaining nine verbs ranged from low (.05) to relatively high (.57) cloze probability. Plausibility values were generally high: Sixty of the verbs received an average rating of 4 or more. Summary statistics can be found in Table 1.
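As a minimal illustration of the cloze calculation described above, the sketch below computes the proportion of responses that match the target verb. The function name, the exact-match scoring after trimming and lowercasing, and the example responses are assumptions; the text does not specify how responses were scored (for example, how spelling or inflectional variants were treated).

```python
from typing import List


def cloze_probability(responses: List[str], target: str) -> float:
    """Proportion of cloze responses that match the target verb.

    Matching is exact after trimming whitespace and lowercasing, which is an
    assumption made for this sketch.
    """
    if not responses:
        raise ValueError("need at least one response")
    normalized = [r.strip().lower() for r in responses]
    return normalized.count(target.strip().lower()) / len(normalized)


# Hypothetical responses from 20 raters for one sentence fragment.
example_responses = ["avgjør"] + ["bestemmer"] * 19
example_cloze = cloze_probability(example_responses, "avgjør")  # 1 of 20 = 0.05
```

With 20 raters, a single matching response yields a cloze probability of .05, and no matching responses yields 0.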

Table 1 Descriptive statistics for the complex verbs used in the eyetracking experiment

The test sentences were intermixed among 42 filler sentences. Each sentence was followed by a yes–no comprehension question. The presentation order was randomly determined for each participant. A practice session, including instructions and five practice items, began the experiment. The experiment, including calibration, the practice items, and discretionary breaks, took approximately 45 min to complete. One item was removed due to a typo.

Procedure

Eye movements were recorded using an EyeLink 1000 eyetracker (SR Research, Toronto, Ontario, Canada) with a sampling rate of 1000 Hz. Stimuli were presented on a BENQ XL2420Z monitor. The experiment was implemented using the EyeTrack software, available from the Eyetracking lab at UMASS Amherst (www.psych.umass.edu/eyelab/software). All text was presented in the fixed-width font Monaco, size 20, and the sentences never comprised more than a single line of text. The viewing distance was roughly 100 cm, such that 4.39 characters subtended 1° of visual arc. Participants were instructed to read sentences for comprehension at their own pace and to use the button box to indicate when they had finished reading a sentence. At the beginning of the session, the eyetracker was calibrated using a three-point grid. Calibration was corrected at the start of each trial by displaying a fixation point near the center left-hand side of the screen. Following calibration, the test sentences were presented one at a time. Sentences were not revealed at the start of a trial until participants’ gaze had settled on the fixation point. Participants then pressed either the JA (“yes”) button or the NEI (“no”) button on a button box to answer the comprehension question that followed each sentence. If a participant failed to respond within 3,000 ms, the question trial was aborted and the next trial began. Data preprocessing and preliminary analysis were performed using Robodoc, also available from the Eyetracking lab at UMASS.
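The relation between viewing distance and characters per degree can be worked through with basic trigonometry. The sketch below is a back-of-the-envelope calculation, not a reported measurement: it takes the stated values (roughly 100 cm, 4.39 characters per degree) and derives the implied on-screen width of one character. The function names are hypothetical.

```python
import math


def cm_per_degree(viewing_distance_cm: float) -> float:
    """On-screen extent (cm) that subtends 1 degree of visual angle,
    centered on the line of sight."""
    return 2 * viewing_distance_cm * math.tan(math.radians(0.5))


def char_width_cm(viewing_distance_cm: float, chars_per_degree: float) -> float:
    """Implied width of a single fixed-width character in cm."""
    return cm_per_degree(viewing_distance_cm) / chars_per_degree


# Values taken from the text; the distance is only approximate ("roughly 100 cm").
implied_width = char_width_cm(100.0, 4.39)  # about 0.40 cm per character
```

Because the viewing distance is given only approximately, the implied character width of about 0.40 cm should be read as an estimate.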

Analysis

The distributional predictors used in the experiment were drawn from NoWaC, version 1.0 (Norwegian Web as Corpus; Guevara, 2010), a corpus of over 700 million tokens of Norwegian (Bokmål) text generated by automatically crawling, downloading, and processing websites with the .no domain (see Baroni, Bernardini, Ferraresi, & Zanchetta, 2009, for a discussion of the general method).

We collected a variety of frequency measures for the complex verbs used in the study. Whole-word frequency was calculated as the overall token frequency of the complex verb in the NoWaC corpus (Footnote 4). We calculated two different measures of root frequency: the free root frequency and root family frequency. Free root frequency represents the frequency with which the verb’s root appeared as a standalone verb (e.g., the frequency of søke, for the complex verb undersøke). Root family frequency was defined as the free root frequency plus the cumulative frequency of all complex verbs sharing the same root (e.g., søke + oppsøke + . . . for the verb undersøke). We also computed the free preposition frequency in a manner similar to the free root frequency. Table 1 provides a descriptive summary of the length and raw frequency measures for the complex verbs in our study. Table 2 presents the correlations between the measures after the frequency measures were log-transformed.
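The frequency measures defined above can be illustrated with a small sketch. All counts below are invented for illustration and are not NoWaC values; the function name and the hand-listed family are hypothetical. The sketch also assumes that the target verb itself counts toward its own root family, which the definition above does not state explicitly.

```python
from typing import Dict, List


def frequency_measures(
    counts: Dict[str, int], verb: str, root: str, family: List[str]
) -> Dict[str, int]:
    """Whole-word, free root, and root family frequency for one complex verb.

    `family` lists the complex verbs sharing `root` (given by hand here).
    Assumption: the target verb is itself a member of `family`.
    """
    free_root = counts[root]
    return {
        "whole_word": counts[verb],
        "free_root": free_root,
        "root_family": free_root + sum(counts[v] for v in family),
    }


# Invented token counts, for illustration only.
toy_counts = {"søke": 5000, "undersøke": 1200, "oppsøke": 300}
toy = frequency_measures(
    toy_counts, verb="undersøke", root="søke", family=["undersøke", "oppsøke"]
)
```

With these toy counts, the root family frequency (6,500) exceeds the free root frequency (5,000) by the summed counts of the complex verbs built on the same root.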

Table 2 Correlation matrix for item-level variables

Choice of predictors

We used whole-word frequency and root family frequency as the main predictors of interest in our models. The correlation between log-transformed whole-word and family frequency was moderate (r = .43). Figure 1 shows the relationship between the frequency measures.

Fig. 1 Relationship between (log-transformed) root family and whole-word frequencies of the verbs used in the eyetracking experiment

We chose family frequency instead of free root frequency because (i) the two measures were highly correlated (r = .95) and (ii) the free root frequency of the verb might underestimate the frequency with which the underlying root is accessed if decomposition occurs. Moreover, we reasoned that family frequency was less likely to be correlated with reading times than was free root frequency if decomposition does not occur, since the frequencies of words associated with the complex verbs would add noise to the measure. In addition to the frequency measures, we also included verb length as a predictor, because prior research had shown that length can interact with indices of frequency in morphologically complex words, especially for later morphemes (Bertram & Hyönä, 2003).

We evaluated whether to include predictability and plausibility values as predictors in our models for statistical control. We decided against entering predictability into the model, because so many of the verbs had a predictability value of 0. Because plausibility ratings were highly correlated with whole-word frequency, we instead used the residuals of a linear model that predicted the average verb plausibility by the whole-word frequency. This residual plausibility measure permitted us to control for other factors that contribute to a verb’s plausibility that are not directly related to word frequency.
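The residualization step can be sketched as follows. This is a minimal illustration of taking residuals from a simple least-squares regression of one variable on another, using only the Python standard library; it is not the analysis code used in the study, and the function name and example values are hypothetical.

```python
from statistics import mean
from typing import List


def residualize(y: List[float], x: List[float]) -> List[float]:
    """Residuals of the simple least-squares regression y ~ 1 + x.

    Here y would be mean plausibility per verb and x the (log) whole-word
    frequency; the residuals are the part of y not linearly predicted by x.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must vary")
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Invented values, for illustration only.
toy_log_freq = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_plausibility = [2.0, 4.1, 5.9, 8.2, 9.8]
toy_residuals = residualize(toy_plausibility, toy_log_freq)
```

By construction, the residuals are linearly uncorrelated with the frequency predictor, so entering them in a model alongside whole-word frequency avoids the collinearity that raw plausibility ratings would introduce.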

We analyzed reading times using linear mixed-effect models implemented using the packages lme4 (Bates, Maechler, Bolker, & Walker, 2014) and lmerTest (Kuznetsova, Brockhoff, & Christensen, 2016) in R. The frequency measures were log-transformed and all predictors standardized before the analysis. We began all analyses with a model that included as predictors whole-word frequency, root frequency, verb length, and their interactions, as well as a main effect of residual plausibility. The initial models included random intercepts for participant and verb, as well as by-participant random slopes for whole-word frequency, root frequency, and their interaction. We then conducted an iterative backward model selection procedure by submitting the fully specified model to lmerTest’s step() function, which iteratively identifies and removes predictors that are not significant at the p = .05 level, starting with the highest-order coefficients and working down to main effects. If a higher-order coefficient is found to be significant, the function does not eliminate the lower-order coefficients for effects that participate in that interaction, even if they do not meet the p = .05 criterion. After identifying which predictors were candidates for elimination according to step(), we manually checked on a case-by-case basis whether the simpler model was superior to the more complex model according to the Akaike information criterion (AIC; Akaike, 1973; Müller, Scealy, & Welsh, 2013).
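Two of the steps described above, log-transforming and standardizing a predictor and comparing two models by AIC, can be sketched in a few lines. This is not a reimplementation of lme4 or of lmerTest's step(); the function names are hypothetical, the natural logarithm and the sample standard deviation are assumptions, and the log-likelihoods passed to `prefer_simpler` would have to come from fitted models.

```python
import math
from statistics import mean, stdev
from typing import List


def log_standardize(raw: List[float]) -> List[float]:
    """Log-transform positive values, then convert to z-scores.

    Uses the natural log and the sample standard deviation (n - 1); both are
    assumptions. The choice of log base does not affect the z-scores.
    """
    logged = [math.log(v) for v in raw]
    m, s = mean(logged), stdev(logged)
    return [(v - m) / s for v in logged]


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def prefer_simpler(
    ll_simple: float, k_simple: int, ll_complex: float, k_complex: int
) -> bool:
    """True if the simpler model has an AIC no higher than the complex one."""
    return aic(ll_simple, k_simple) <= aic(ll_complex, k_complex)
```

Under this criterion, dropping a predictor is justified when the loss in log-likelihood is smaller than the penalty saved by estimating fewer parameters.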

Results

First fixation duration

The percentage of first fixations that were also single fixations was 77.4%. A summary of the coefficients in our fully specified original model for first-fixation duration before backward selection of coefficients can be found in Table 3.

Table 3 Original model for first-fixation duration before backward model selection

Backward selection by step() identified a model with only main effects of (log-transformed) whole-word and root family frequency as the optimal model. That model is given in Table 4.

Table 4 Final model for first-fixation duration identified by backward stepwise selection

Both whole-word and root family frequency were negatively correlated with first-fixation duration. The simple effects of both frequency measures on first-fixation durations are shown in Fig. 2.

Fig. 2 Simple effects of log-transformed root family frequency (left panel) and whole-word frequency (right panel) on log-transformed first-fixation duration

Gaze duration

Our analysis of gaze durations began with a full model, as with first-fixation durations. Table 5 provides a summary of the coefficients from this original model before backward selection.

Table 5 Original model for gaze duration before backward model selection

Stepwise selection settled on the model in Table 6 as the final model, which contained simple effects of verb length and whole-word frequency, as well as a significant Whole-Word × Root Family Frequency interaction. On average, gaze durations increased with verb length, but decreased as a function of whole-word frequency. Figure 3 shows the simple effect of whole-word frequency, split up by verb length.

Table 6 Final model for gaze duration identified by backward stepwise selection

Fig. 3 Effect of log-transformed whole-word frequency on log-transformed gaze duration, binned by verb length

To resolve the Whole-Word × Root Family Frequency interaction, we split verbs by their log-transformed whole-word frequency quartiles and then plotted average gaze duration by log-transformed root family frequency. The result is in Fig. 4.
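The quartile split used for this visualization can be sketched as below. The sketch covers only the binning step, using the Python standard library; the function name, the handling of values that fall exactly on a cut point, and the default quantile method are assumptions, and the text does not specify how the quartiles were computed.

```python
from bisect import bisect_left
from statistics import quantiles
from typing import List


def quartile_bins(values: List[float]) -> List[int]:
    """Assign each value to a quartile from 1 (lowest) to 4 (highest).

    Cut points come from statistics.quantiles with its default method; a
    value equal to a cut point is placed in the lower bin.
    """
    cuts = quantiles(values, n=4)  # three cut points
    return [bisect_left(cuts, v) + 1 for v in values]


# Invented log whole-word frequencies, for illustration only.
toy_bins = quartile_bins([5.0, 1.0, 8.0, 3.0, 2.0, 7.0, 4.0, 6.0])
```

Once each verb has a quartile label, gaze durations can be grouped by label and plotted against root family frequency within each group.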

Fig. 4 Effect of log-transformed root family frequency on log-transformed gaze duration, split by whole-word frequency quartile

Figure 4 makes apparent that the effect of root family frequency on gaze duration is inversely related to whole-word frequency. Effects of root family frequency are visible for verbs in the lower two quartiles, but root family frequency seems to exert very little influence on the gaze durations of verbs with high whole-word frequencies. The longer average gaze durations in the bottom two quartiles suggest that refixation was more likely when verbs had lower whole-word frequency. Logistic regression modeling indicated that the probability of refixation was negatively correlated with whole-word frequency, although this effect was not statistically reliable (t = −1.88, p < .10). Thus, the effect of root family frequency was more pronounced on verbs that participants had to refixate during the gaze duration.

Regression path duration

The fully specified model for regression path times before backward selection is given in Table 7.

Table 7 Original model for regression path duration before backward model selection

The three-way interaction of verb length, whole-word frequency, and root family frequency was marginally significant in the full model, but according to the elimination criterion adopted for backward model selection, it should be removed because it was not significant at the p < .05 level. One might regard removal of the three-way interaction as overly aggressive, however. We therefore conducted a post-hoc exploration of the interaction in order to determine whether elimination was justified. We plotted log-transformed regression path duration as a function of whole-word frequency values, broken up by verb length. To visualize the interaction of root family frequency with these reading times, we further split the scatterplots into groups corresponding to verbs whose family frequencies fell above and below the median. Figure 5 reveals that the main drivers of the interaction were two tendencies: Whole-word frequency had a numerically larger effect for short (seven-letter) verbs with a lower-than-average family frequency, as well as for long (ten-letter) verbs with a higher-than-average family frequency. Because we know of no clear mechanism that predicts such a pattern, we refrain from interpreting it further. We therefore elected to eliminate the three-way interaction in backward model selection.
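The elimination criterion described above can be sketched as a generic backward-selection loop. This is an illustration under assumptions rather than the procedure actually run for Tables 7 and 8: `backward_select` and `stub_fit` are hypothetical names, the loop drops one term per step without enforcing any ordering between interactions and their component terms, and the p-values in `made_up_p` are invented placeholders, not the values reported in Table 7. In a real analysis, `fit_pvalues` would refit the regression model on the remaining terms at every step, so the p-values would change between iterations.

```python
def backward_select(terms, fit_pvalues, alpha=0.05):
    """Drop the least reliable term until all remaining terms have p < alpha.

    terms: labels of the terms in the full model.
    fit_pvalues: hypothetical callable that refits the model with the given
        terms and returns {term: p_value}.
    """
    kept = list(terms)
    while kept:
        pvals = fit_pvalues(kept)
        worst = max(kept, key=lambda t: pvals[t])
        if pvals[worst] < alpha:
            break  # every remaining term meets the criterion
        kept.remove(worst)
    return kept


# Invented placeholder p-values (NOT taken from the paper's tables).
made_up_p = {
    "length": 0.20,
    "whole_word": 0.30,
    "root_family": 0.01,
    "three_way": 0.08,  # "marginal": below .10 but not below .05
}


def stub_fit(kept):
    """Stand-in for a model refit; returns fixed p-values for the kept terms."""
    return {t: made_up_p[t] for t in kept}


final_terms = backward_select(list(made_up_p), stub_fit)
print(final_terms)
```

Under a strict p < .05 criterion, a marginal term such as the placeholder `three_way` is removed, which is why a separate check of the interaction is a reasonable safeguard before accepting the simplified model.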

Fig. 5 Effect of log-transformed whole-word frequency on log-transformed regression path duration, binned by verb length and root family frequency

Backward selection ended with the simple model for regression path duration in Table 8, in which root family frequency was the only significant predictor. The simple effect of root family frequency can be seen in Fig. 6.

Table 8 Model for regression path duration after backward model selection
Fig. 6 Simple effect of log-transformed root family frequency on log-transformed regression path duration

Total times

A summary of the fully specified model for total reading time prior to backward selection is given in Table 9. From the outset, it appears that total times are strongly correlated with both whole-word frequency and residual plausibility. Root family frequency and verb length are marginally significant predictors in the full model.

Table 9 Original model for total reading time before backward model selection

After backward model selection, three significant predictors of total reading times remained: root family frequency, whole-word frequency, and residual plausibility. The marginally significant effect of verb length did not survive iterative model selection. A summary of the simplified model is in Table 10.

Table 10 Model for total reading time after backward model selection

Plotting the simple effect of each of the main predictors shows clear evidence of strong negative correlations between total reading time and whole-word frequency (Fig. 7a) and residual plausibility (Fig. 7c), as well as a slightly weaker correlation between root family frequency and total time (Fig. 7b).

Fig. 7 Simple effects of log-transformed whole-word frequency (a), root family frequency (b), and residual plausibility (c) on log-transformed total reading time

Discussion

In the present study, we measured root frequency effects in Norwegian complex verbs composed of a preposition and a root verb (e.g., undersøke), with the goal of answering two interrelated questions: Are such compound verbs lexically decomposed? And if so, what roles do the constituent morphemes play in the process of lexical access during word recognition? We measured how whole-word frequency and root family frequency impact the different reading times of complex verbs in sentences.

We found that both root family and whole-word frequency reliably impacted first-fixation times, such that more frequent words were read more quickly. Whole-word frequency and verb length were reliable simple predictors of gaze duration, but we also found that root family frequency interacted with whole-word frequency in the measure. This interaction was driven by larger effects of root family frequency on verbs with low whole-word frequencies than on high-frequency verbs. The effects of root family frequency were more pronounced when verbs were refixated, and verbs with lower whole-word frequency were refixated more often, on average. We return to the interpretation of this interaction below. Regression path duration was reliably predicted by root family frequency alone. Finally, whole-word frequency, root family frequency, and our measure of (residual) plausibility all made independent contributions to the prediction of total times.

Returning to our two questions of interest, the persistent effects of our root frequency measure support the conclusion that Norwegian complex verbs are lexically decomposed, similar to complex verbs in German (Smolka et al., 2015; Smolka et al., 2009; Smolka et al., 2014; Smolka et al., 2007). Decomposition occurs early, as was evidenced by the effect of root family frequency on first-fixation duration. Thus, the constituent morphemes play a role in lexical access or visual word recognition. Moreover, lexical decomposition occurs in naturalistic reading tasks, not simply artificial lexical decision paradigms (e.g., Fiorentino & Poeppel, 2007; Smolka et al., 2007).

Our results also suggest that lexical processing is not entirely root-driven. The effects of whole-word frequency support the idea that unitary representations also play a role in lexical access or visual word recognition. As for the time course of decomposition, family frequency effects in first-fixation durations, the earliest measure of lexical processing, are consistent with early-decomposition models, in which morphological decomposition is automatic (e.g., Fiorentino & Poeppel, 2007; Taft, 2004). We found no evidence to suggest the kind of staged access proposed by whole-word-first models.

Taken together, our results are consistent with dual-route models of lexical access and word recognition (Allen & Badecker, 2002; Baayen & Schreuder, 1999, 2000; Frauenfelder & Schreuder, 1992; Pollatsek et al., 2000; Schreuder & Baayen, 1997). Different dual-route models propose that the decompositional and whole-word access streams can proceed in parallel (Allen & Badecker, 2002; Baayen & Schreuder, 1999; Pollatsek et al., 2000) or interact (Baayen & Schreuder, 2000; Kuperman et al., 2009). The fact that we observed independent main effects, but no interaction, of whole-word and root family frequency in first-fixation times is consistent with parallel dual-route models. Nevertheless, the interaction of whole-word frequency and root family frequency in gaze durations could be interpreted as evidence for interactive processing. Under this interpretation, verbs could forgo the decompositional route when their whole-word representation is high-frequency, and therefore quickly accessible. Effects of root family frequency would only emerge when the whole-word route failed to access a low-frequency representation. Although this is an intriguing possibility, it is somewhat difficult to reconcile with the independent effects of the two frequency measures in first fixations. Moreover, we suspect that the interaction is better explained as a statistical artifact, arising from the fact that the lower end of the root family frequency range is represented only for words with lower whole-word frequencies. As can be seen in Fig. 4, each successive quartile of the whole-word frequency distribution covers a narrower band of the root family frequency range. Under such conditions, a simple (slightly nonlinear) effect of root family frequency could result in a statistically significant interaction. We favor this interpretation, and therefore conclude that our results are most consistent with parallel dual-route models.
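The artifact argument can be made concrete with a small noise-free sketch. Everything in it is an assumption chosen for illustration: the function `gaze` is a hypothetical, mildly nonlinear effect of root family frequency with no whole-word term at all, and `lower_bounds` encodes invented ranges in which each successive whole-word quartile covers a narrower, higher band of root family frequency. It is not a reanalysis of the data.

```python
from statistics import mean


def ols_slope(xs, ys):
    """Least-squares slope of ys on xs (one-predictor regression)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


def gaze(rf):
    """Hypothetical log gaze duration: a function of root family frequency only.

    The curve flattens as rf grows; whole-word frequency plays no role.
    """
    return 6.0 - 0.1 * rf + 0.005 * rf ** 2


# Invented range restriction: quartile -> lowest root family frequency observed.
# All quartiles share the same upper end of the range.
lower_bounds = {0: 1, 1: 3, 2: 5, 3: 7}
UPPER = 9

apparent_slopes = {}
for q, lo in lower_bounds.items():
    xs = list(range(lo, UPPER + 1))
    ys = [gaze(x) for x in xs]
    apparent_slopes[q] = ols_slope(xs, ys)

for q in sorted(apparent_slopes):
    print(f"quartile {q + 1}: linear slope = {apparent_slopes[q]:.3f}")
```

Although the generating function contains no interaction, the linear slope fitted within each quartile shrinks toward zero as the sampled range narrows toward the flat end of the curve. A linear model with a Whole-Word × Root Family Frequency term could therefore pick up this pattern as an interaction.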

The fact that we found evidence for decomposition of complex verbs in Norwegian speaks to a typological question raised by Smolka et al. (2014), regarding cross-linguistic differences in lexical decomposition. Smolka et al. (2014) speculated that the tendency to lexically decompose complex verbs might vary cross-linguistically and might depend on idiosyncratic properties of individual languages. They supposed that three properties of German induce decomposition: (i) the overall frequency of compounding within the language (see Footnote 5), (ii) the derivational productivity of complex verbs within the language, and (iii) surface cues to the separability of the root and prefix. As we mentioned earlier, German speakers receive frequent input that confirms the “separability” of the particle and the root, whereas Norwegian speakers more regularly encounter complex verbs as contiguous units in everyday language. Since our results indicate that Norwegian complex verbs are decomposed, it would appear that abundant surface cues to separability are not required: Frequency of compounding and derivational productivity are sufficient.

Finally, our results suggest that surface cues to separability are not required for rapid decomposition of the group of verbs that we selected. However, our results do not determine whether other factors modulate whether and at what stage of processing decomposition occurs. It has been suggested that semantic transparency influences the processing of compounds and other morphologically complex words (e.g., Libben 1998; Libben, Gibson, Yoon, & Sandra, 2003; Marslen-Wilson, Tyler, Waksler, & Older, 1994), though the exact role of transparency differs by theory. Multiple eyetracking studies have revealed indices of early decomposition for both semantically transparent and opaque compounds (e.g., Frisson, Niswander-Klement, & Pollatsek, 2008; Juhasz, 2007; Pollatsek & Hyönä, 2005), though they have also found that transparency may affect later processing (Juhasz, 2007). Since most prior studies have investigated transparency in nominal compounds (though see Smolka et al., 2015), it remains to be determined whether compound verbs are processed differently. Because we did not control for or manipulate the semantic transparency of the compound verbs we used, it is an open question whether transparency interacts with the decomposition process for these verbs. We leave testing this possibility to future work.