The field of visual word recognition is quite productive, and recently there has been great interest in using available databases of visual lexical decision and word naming latencies, such as the English Lexicon Project (ELP: Balota et al., 2007), to examine word recognition processes (e.g., New, Ferrand, Pallier, & Brysbaert, 2006; Yap & Balota, 2009; Yap, Balota, Sibley, & Ratcliff, 2012). Although recent work has extended the examination of visual word recognition processes beyond the short monosyllabic words that were previously the main source of data (e.g., Balota, Cortese, Sergent-Marshall, Spieler, & Yap, 2004), this approach has not often been used to investigate morphological processing (however, see Baayen, Feldman, & Schreuder, 2006). Research has shown that English compound words are processed differently than length- and frequency-matched monomorphemic words (e.g., Fiorentino & Poeppel, 2007; Inhoff, Briihl, & Schwartz, 1996; Ji, Gagné, & Spalding, 2011). The purpose of the present study was to develop a database that contains variables that are relevant for examining morphological processing for a large number of English compound words that can be interfaced with available behavioral data, such as those contained in the ELP (Balota et al., 2007). Specifically, we collected ratings on compound familiarity, age of acquisition (AoA), semantic transparency, lexeme meaning dominance (LMD), imageability, and sensory experience rating (SER). Prior to a detailed discussion of the database, we provide a brief review of research on the lexical decomposition of compound words, as well as research on each of the variables in the present study, with a specific focus on studies investigating English compound words (Footnote 1).

Lexical decomposition of compound words

Since Taft and Forster (1976) first reported that the frequency of the first lexeme in English compound words influences lexical decision times, a growing body of research has examined how English compound words are recognized. The major question has been whether familiar compounds are always accessed through a whole word representation or whether they are obligatorily decomposed into their constituent lexemes during recognition. Constituent-priming studies, in which individual lexemes are used as primes for the full compound in a lexical decision task, are sometimes used to gauge decomposition processes in English compounds (e.g., Fiorentino & Fund-Reznicek, 2009; Libben, Gibson, Yoon, & Sandra, 2003; Monsell, 1985; Shoolman & Andrews, 2003). However, many studies have opted to manipulate the frequencies of the first and/or second lexemes while controlling for the overall compound word’s frequency. If the frequency of a compound’s constituent lexemes influences processing times on the compound, then it follows that those lexemes are accessed during the recognition of the full compound. The results of most studies with English compound words converge in showing a functional role for lexemes during compound processing using a variety of tasks such as word naming, lexical decision, and eye movements during sentence reading (e.g., Andrews, 1986; Andrews, Miller, & Rayner, 2004; Inhoff, Starr, Solomon, & Placke, 2008; Juhasz, 2007, 2008, 2012; Juhasz, Starr, Inhoff, & Placke, 2003). These results with English compounds share many similarities to those observed in languages such as Finnish (e.g., Hyönä & Pollatsek, 1998; Pollatsek, Hyönä, & Bertram, 2000) in which compounding is more productive. 
In addition, research with multiple languages has shown that compounds may be organized in the mental lexicon on the basis of morphological families (e.g., de Jong, Feldman, Schreuder, Pastizzo, & Baayen, 2002; Juhasz & Berkowitz, 2011; Kuperman, Bertram, & Baayen, 2008; Kuperman, Schreuder, Bertram, & Baayen, 2009). Thus, examination of compound word processing provides insight into the structure of the mental lexicon.

Familiarity

In the morphological processing literature, whole-word frequency effects are often interpreted as indicating whole-word access (for a different view, see Baayen, Wurm, & Aycock, 2007). Therefore, it is important to assess the role of whole-word frequency in compound word recognition. However, English compound words tend to be low in frequency. This restriction of range makes it challenging to conduct experiments manipulating English compound frequency (see Libben, 2005, for a discussion and Andrews et al., 2004, for a post-hoc analysis of English compound frequency). Compound word frequency has often been examined in other languages such as Finnish (e.g., Bertram & Hyönä, 2003; Kuperman et al., 2008; Pollatsek et al., 2000) and Dutch (e.g., Kuperman et al., 2009; van Jaarsveld & Rattink, 1988) since restriction of the frequency range is not as great an issue in these languages. Further, research on English monomorphemic words has shown that even for low frequency items, rated familiarity can still affect word recognition (e.g., Gernsbacher, 1984; Juhasz & Rayner, 2003; Williams & Morris, 2004). Rated familiarity can be thought of as a measure of subjective frequency such that it indexes the experience that an individual has with a given word. Using this logic, Juhasz (2008) manipulated the rated familiarity of long (ten or more letters) and short (seven or fewer letters) English compound words in an attempt to extend Bertram and Hyönä’s work on Finnish compounds. Rated familiarity influenced first fixation durations similarly for both shorter and longer English compounds. The effect of rated familiarity was especially pronounced in the gaze duration measure (i.e., the time spent on the compound before a reader moves their eyes to a different word) for long English compounds. Thus, rated familiarity has an early and long-lasting effect on reading behavior for English compounds.
Manipulating rated familiarity may make it possible to examine whether overall compound frequency influences decomposition of English compounds (see, e.g., Ji et al., 2011), a question that is relevant for theories and models of morphological processing.

Age of acquisition

AoA is another variable that affects word recognition performance such that words learned early in life are processed faster than words learned later in life. AoA has been found to have a robust effect on a variety of word recognition tasks, and although it is correlated with word frequency and familiarity, its influence on word processing has been found to be distinct from these variables (see Johnston & Barry, 2006; Juhasz, 2005, for reviews). AoA is often estimated from adults using either a rating scale (e.g., the Gilhooly & Logie, 1980, 1–7 scale) or estimates of the actual age at which a word is learned (e.g., Kuperman, Stadthagen-Gonzalez, & Brysbaert, 2012). In two recent large-scale studies, rated AoA was found to predict word naming latencies and lexical decision performance for over 2,000 monosyllabic words (Cortese & Khanna, 2007) and for almost 2,000 disyllabic words (Cortese & Schock, 2013) when a large number of control variables were included in the regression models. Although AoA effects on word recognition performance are well established at this point, the cause of this effect is still debated. According to the semantic locus hypothesis (see Juhasz, 2005, for a discussion), early acquired words may have a benefit in processing because they contain richer semantic representations or contain more connections to related concepts in the lexicon (e.g., Steyvers & Tenenbaum, 2005). Another possibility discussed in the literature is related to how connectionist networks learn. According to this network plasticity hypothesis (Ellis & Lambon Ralph, 2000), early acquired words have an advantage in the network due to the greater plasticity early in training. This theory predicts AoA to have effects at all levels within the mental lexicon including access to orthographic word forms, phonological word forms, and semantic representations.
Although, to our knowledge, AoA has not previously been investigated in compound processing, we have included this rating in the present database for two reasons. First, we hope to motivate research examining the role that AoA plays in compound processing in adults. Second, there has been some interest in examining how children process existing compounds in English (e.g., Krott & Nicoladis, 2005). Inclusion of an AoA rating may be useful to researchers who are interested in examining compound word processing in children.

Semantic transparency

Another question in the field of morphological processing is the role that semantic transparency plays in the representation and processing of morphologically complex words. As was discussed by Libben (1998), a fully transparent compound is one in which both lexemes contribute to the meaning of the compound word (TT; sunlight). The opposite extreme is a fully opaque compound word in which neither lexeme contributes to the meaning (OO; flapjack). In addition, partially opaque compounds exist, in which only one lexeme overlaps in meaning with the compound (OT, jaywalk; TO, cheapskate). According to some theories of morphological processing, semantic transparency may influence whether constituents are activated during the processing of complex words (e.g., Marslen-Wilson, Tyler, Waksler, & Older, 1994; Schreuder & Baayen, 1995). Libben (1998) proposed a theory of compound word representation that includes an important role for semantic transparency. According to this theory, compound words are represented at a lexical and semantic level in the mental lexicon. At the lexical level, all compounds are tied to the representations of their lexemes. However, only compounds that are transparent to some degree are tied to the semantic representations of their (transparently related) lexemes.

Studies that have used priming in a lexical decision task to investigate the role of semantic transparency in compound word recognition have mostly supported Libben’s (1998) theory. For example, Monsell (1985) employed a constituent-priming paradigm in the lexical decision task. The stimuli were English transparent compounds, opaque compounds, and pseudocompounds (e.g., trombone). No interaction was found between item type and constituent priming, suggesting that all of the items were significantly primed by their constituents. Libben et al. (2003) investigated the processing of fully transparent, partially opaque, and fully opaque English compounds. The compounds were preceded by a neutral prime or either their first or second constituent in a lexical decision task. A main effect of compound type on reaction times emerged, with fully opaque and TO compounds being responded to more slowly than other compounds. However, all four types of compounds received significant priming from their lexemes relative to the neutral prime (see also Jarema, Busson, Nikolova, Tsapkini, & Libben, 1999, for a similar investigation with French and Bulgarian compounds, and Zwitserlood, 1994, for an investigation in Dutch). A recent study by Fiorentino and Fund-Reznicek (2009) also explored the role of semantic transparency for English compound words using a masked priming paradigm. In this experiment, the prime was the compound word and the target was either the beginning or ending lexeme. Both transparent and opaque compounds significantly primed their constituents.

In contrast, semantic priming appears to only occur for compounds with at least one semantically transparent lexeme. Sandra (1990) investigated the processing of Dutch transparent and opaque compounds in a series of three semantic priming experiments using the lexical decision task. In each, a word semantically related to the first or second lexeme in a compound word was used as a prime and compared to an unrelated condition. Although significant semantic priming was observed for the constituents in transparent compounds, it was not observed for opaque compounds. In Zwitserlood (1994), fully transparent, fully opaque, partially opaque, and pseudocompounds were used as primes for a word that was semantically related to either the first or second lexeme in the lexical-decision task. Only transparent and partially opaque compounds significantly primed the targets that were semantically related to both lexemes.

In a recent study conducted by Ji et al. (2011), lexical decision times to semantically transparent and opaque English compound words were compared to frequency- and length-matched monomorphemic words. Both transparent and opaque compounds were responded to faster than monomorphemic words when presented correctly to participants (i.e., as unspaced compounds). When the experimental setup encouraged decomposition by insertion of a space between constituents, presenting the constituents in different colors, or changing the nature of the nonwords, the pattern of effects changed. Although transparent compounds were still processed faster than monomorphemic items, no significant difference was found between opaque compounds and the monomorphemic words. In addition, correlations showed a facilitative effect of first-lexeme frequency for transparent compounds, but an inhibitory effect for opaque compounds. They interpreted their results as indicating that even opaque compounds underwent a process of meaning integration of the two lexemes and this slowed recognition of these types of compounds.

Eyetracking studies have also examined the role of semantic transparency in compound processing, with mixed results. An early investigation by Underwood, Petley, and Clews (1990) showed that opaque English compounds in sentences received longer gaze durations when compared to transparent compounds. Pollatsek and Hyönä (2005) investigated whether semantic transparency influenced the decomposition of long Finnish compounds. In a series of three experiments, they employed opaque and transparent compounds that were either high or low in first-lexeme frequency. Pollatsek and Hyönä did not find any reliable evidence for a transparency effect in gaze durations across the three experiments, and a first-lexeme frequency effect was observed for both types of compounds. Juhasz (2007) manipulated semantic transparency as well as first and second-lexeme frequency for English compound words. As in the results of Underwood et al., a main effect of transparency emerged in the participant analysis for the gaze duration measure, as did main effects of the first- and second-lexeme frequencies. Finally, Frisson, Niswander-Klement, and Pollatsek (2008) also investigated the role of semantic transparency of English compound words in an eye movement experiment. They employed fully transparent, partially opaque, and fully opaque compounds. No significant effects of transparency were observed on the compound words when they were presented correctly. When spaces were inserted between the lexemes of the compound words an effect of transparency was observed on gaze durations in a “spillover” region later in the sentence.

The studies on semantic transparency discussed above employed a variety of measures assessing semantic transparency, including ratings of the overall transparency of the entire compound expression (e.g., Ji et al., 2011; Juhasz, 2007; Libben et al., 2003; Pollatsek & Hyönä, 2005), and assessing the role of each individual lexeme to the entire compound meaning, either through a definition generation task (e.g., Sandra, 1990) or a rating (e.g., Fiorentino & Fund-Reznicek, 2009; Frisson et al., 2008; Libben et al., 2003; Zwitserlood, 1994). The results of these experiments have been somewhat inconclusive as to the role of semantic transparency in compound word recognition. However, this variable has played an important role in theories of morphological processing, and some evidence indicates that transparency may affect compound word processing. It is therefore important that researchers who are designing studies with compound words take this variable into consideration. In the present study, we have included an overall rating of semantic transparency for the entire compound expression. This rating is a relatively easy measure to manipulate or control during stimulus construction and it can be combined with the lexeme meaning dominance rating discussed below to identify relevant partially opaque compounds.

Lexeme meaning dominance

In general, English compound words are right-headed, meaning that their second lexeme is the semantic head of the compound. This is not the case in all languages, however. Some studies have examined semantic effects in compound word decomposition by conducting a cross-language comparison in which compounding conventions differ between the languages (e.g., Jarema et al., 1999). This was the method used by Duñabeitia, Perea, and Carreiras (2007) with Basque and Spanish compounds. They orthogonally manipulated the frequencies of the first and second lexemes in a lexical decision task. In their experiment, approximately 75% of the Spanish compound words’ semantic heads were in the final position (i.e., the second lexeme), whereas approximately 80% of the Basque compounds had their semantic heads in the first position (i.e., the first lexeme). Duñabeitia et al. observed significant second-lexeme frequency effects for compounds in both languages and small, nonsignificant, effects of first-lexeme frequency. Since the lexical decomposition effects did not vary as a function of language, they concluded that decomposition occurs prior to access to compound word meaning. Duñabeitia et al.’s results can be contrasted with those reported by Marelli and Luzzatti (2012), who employed Italian compounds that were either head-initial or head-final (both compound types exist in Italian). They included many potential predictor variables in a regression design (including lexeme frequencies, compound length, compound frequency, and a rating of compound semantic transparency) and utilized two tasks: lexical decision, and sentence reading while eye movements were recorded. In both tasks, complex interactions emerged between lexeme frequency, semantic transparency, and headedness. Taken together, their results suggested that semantic transparency and headedness can have relatively early and long-lasting effects on compound word processing in Italian.

Inhoff et al. (2008) also observed differing patterns of lexeme frequency effects in English compound words as a function of a semantic variable. Specifically, they defined a new rated variable that we will refer to as “lexeme meaning dominance.” This variable was rated on a scale of 0–10, where a rating of 0 indicated that the meaning of the entire compound expression was contained solely in the first lexeme, and a rating of 10 indicated that it was contained solely in the second lexeme. Compound words that received a rating of 4.0 or less on this scale were considered to contain most of their meaning in the first lexeme and were referred to as “headed” in the experiment (e.g., stairwell). Compound words that received a rating of 6.0 or more on this scale were considered to contain most of their meaning in the second lexeme and were referred to as “tailed” in the experiment (e.g., armchair). Across three tasks (lexical decision, naming, and sentence reading while eye movements were recorded), larger lexeme frequency effects were observed for the meaning dominant lexeme. Thus, for headed compounds, larger effects of first-lexeme frequency were observed, whereas for tailed compounds, larger effects of second-lexeme frequency were observed. The results of this experiment suggest that the meaning of a compound word can influence the decomposition of the compound. This variable is therefore important to consider when designing studies with English compound words.
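
The classification rule described above can be made concrete with a short sketch. The cutoffs (4.0 or less for "headed," 6.0 or more for "tailed") and the 0–10 scale come from the description of Inhoff et al. (2008); the function name, the "unclassified" label for ratings between the cutoffs, and the numeric ratings in the example are hypothetical and are included only for illustration.

```python
def classify_lmd(rating: float) -> str:
    """Label a compound from its mean lexeme meaning dominance rating (0-10 scale).

    Cutoffs follow the description of Inhoff et al. (2008) given in the text.
    The "unclassified" label for in-between ratings is an assumption of this sketch.
    """
    if not 0.0 <= rating <= 10.0:
        raise ValueError(f"LMD rating must lie between 0 and 10, got {rating}")
    if rating <= 4.0:
        return "headed"   # meaning carried mostly by the first lexeme
    if rating >= 6.0:
        return "tailed"   # meaning carried mostly by the second lexeme
    return "unclassified"


# Hypothetical mean ratings, invented for illustration only (not data from any study).
example_ratings = {"stairwell": 3.2, "armchair": 7.5, "midscale_item": 5.0}
labels = {word: classify_lmd(r) for word, r in example_ratings.items()}
print(labels)
```

Treating the cutoffs as inclusive mirrors the wording "4.0 or less" and "6.0 or more"; a researcher using different cutoffs would only need to change the two comparisons.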

The results of both Marelli and Luzzatti (2012) and Inhoff et al. (2008) suggest that semantic information can affect early morphological processing. The finding that activation of a compound word’s semantic representation can influence its initial processing brings up the question of what exactly is contained in the semantic representation of a compound word. The meaning of the entire compound expression is greater than the sum of the individual lexeme meanings. For example, according to Merriam-Webster’s online dictionary (www.merriam-webster.com), an eyesore is “something offensive to view,” not a sore on the eye or an eye that is sore. Research by Spalding and Gagné (2011; see also Gagné & Spalding, 2004, 2009) suggests that relational information between the two lexemes (e.g., a snowball is a ball MADE OF snow) is generated for familiar compounds. Specifically, in a series of experiments they showed that compounds (e.g., snowball) are responded to more slowly in a sense/nonsense decision task when preceded by a compound with a different relational structure (e.g., snowshovel, a shovel FOR snow) than when preceded by a compound with the same relational structure (e.g., snowfort, a fort MADE OF snow).

Imageability

Although semantic transparency and lexeme meaning dominance are semantic variables that may influence the processing of morphologically complex words, a number of other semantic variables that are known to influence word recognition processes are also potentially important to take into consideration for compound words. According to Paivio’s (1971) dual-coding theory, some words such as door evoke both a verbal representation and a nonverbal image, as compared to other words, such as fact, that may only evoke a verbal representation. Highly imageable words should therefore have an advantage during processing due to the dual code that exists in the mental lexicon. Ratings of imageability have previously been collected on a large number of English words (e.g., Clark & Paivio, 2004; Cortese & Fugett, 2004; Paivio, Yuille, & Madigan, 1968; Schock, Cortese, & Khanna, 2012; Toglia & Battig, 1978). Imageability has been found to reliably predict word recognition performance in a number of tasks, such as lexical decision (Balota et al., 2004; Cortese & Khanna, 2007; Cortese & Schock, 2013) and word naming (Balota et al., 2004; Cortese & Schock, 2013), as well as to influence memory for word stimuli as indexed by both recognition performance (e.g., Cortese, Khanna, & Hacker, 2010) and free recall of words (e.g., Rubin & Friendly, 1986). In addition, concreteness, a variable that is related to imageability and highly correlated with it (see Paivio et al., 1968; Toglia & Battig, 1978), has been found to influence gaze durations on words during reading (Juhasz & Rayner, 2003; Sheikh & Titone, 2013).

Although semantic variables have previously been investigated for compound words, Kuperman (2013) noted that most of these variables are relational, in that they require an evaluation of the meaning of the compound’s two lexemes in relation to the meaning of the entire compound word. Not much effort has been directed at examining what nonrelational semantic properties influence compound word recognition or whether the individual meanings of the lexemes are accessed in concert with the compound word. Kuperman made an important step toward this by examining the influences of six semantic variables (valence, arousal, imageability, concreteness, SER, and body–object interaction [BOI] ratings) on compound word processing using a regression design with lexical decision times to compounds from the ELP (Balota et al., 2007) as the dependent measure. Ratings for one of the six variables for both lexemes and the compound word were added to a baseline regression model that contained word length, lexeme frequencies, and compound frequency. With respect to imageability, Kuperman found that compound word lexical decision times were predicted by compound word imageability but not lexeme imageability (this was also the case for the related concreteness variable).

Sensory experience ratings

Interest has also been growing in variables related to sensory and perceptual processing in the word recognition literature and several variables have been introduced recently. For example, BOI ratings index the degree that a word’s referent can be physically interacted with, and therefore gauge sensorimotor processes (see Siakaluk, Pexman, Aguilera, Owen, & Sears, 2008). In another approach, Amsel, Urbach, and Kutas (2012) asked individuals to rate 559 object concepts on seven different sensory related dimensions (smell intensity, visual motion, taste pleasantness, color vividness, sound intensity, graspability, and likelihood of pain). A related variable, SER is thought to index the degree that a word evokes a sensory or perceptual experience when read silently (see Juhasz & Yap, 2013; Juhasz, Yap, Dicke, Taylor, & Gullick, 2011). SER can be collected on all word types and for all sensations using a single 1–7 scale with relatively easy instructions. Juhasz et al. (2011) demonstrated that SER reliably predicted lexical decision response times to over 2,000 monosyllabic words contained in the Balota et al. (2004) database and the British Lexicon Project database (Keuleers, Lacey, Rastle, & Brysbaert, 2012) when several other variables, such as length, neighborhood size, frequency, consistency, imageability, and AoA, were included in a hierarchical regression. Juhasz and Yap extended this work to disyllabic words, as well, and included additional control variables. Kuperman (2013) examined the influences of both BOI and SER on compound word processing. Although BOI was not found to have an effect on compound word lexical decision times, compound SER did significantly predict these times. As was the case for imageability, lexeme SER did not influence compound processing. 
On the basis of the null effects for the lexeme semantic variables during compound processing (for all variables except valence ratings), Kuperman suggested that lexeme meanings are typically only activated in the context of the overall compound meaning, not as individual words.

Interrelations between variables

The six variables selected for the present investigation can be organized into three associated groups: lexical/semantic, relational semantic, and nonrelational semantic. Familiarity and AoA can be thought of as indexing lexical and semantic aspects of compound word recognition since they gauge a reader’s past experience with both a word’s form and its meaning. Familiarity is a measure of how often a person experiences a given word. This variable is clearly related to the AoA variable, which indexes the age at which a word is first learned, as words that are learned earlier in life do tend to be more frequently encountered over the lifespan. Although familiarity has been shown to influence compound word processing (Juhasz, 2007), AoA has not been investigated previously. On the basis of past research with morphologically simple words (see Juhasz, 2005, for a review), we predicted that AoA should influence both lexical decision and word naming times for English compounds. The word frequencies of the lexemes have long been known to influence compound processing (e.g., Andrews, 1986; Andrews et al., 2004; Inhoff et al., 2008; Juhasz, 2007, 2008, 2012; Juhasz et al., 2003). An open question is whether lexeme AoA also contributes to the processing of English compounds, over and above the influence of lexeme frequency.

Semantic transparency and LMD are both examples of what Kuperman (2013) refers to as relational semantic variables, since they both index the relationship in meaning between the lexemes in the compound word and the overall compound word meaning. As we discussed above, semantic transparency has been found to influence the processing of English compound words in a lexical decision task, but only when the experimental setup favors decomposition (Ji et al., 2011). To our knowledge, semantic transparency has not been investigated for English compound words in the word naming task. With respect to LMD, Inhoff et al. (2008) reported that it modulated the effects of lexeme frequency in both lexical decision and word naming.

Finally, both imageability and SER are nonrelational semantic variables that index properties related to the meaning of the entire compound expression. Kuperman (2013) found that both variables influenced lexical decision times for compound words, whereas the associated individual lexeme properties did not. We sought to replicate this finding with our set of 629 compound words and to extend these findings to word naming. In addition, we were interested in examining whether the relevant lexeme properties would influence the overall compound ratings for the AoA, imageability, and SER variables. Although the relational semantic variables (transparency and lexeme meaning dominance) specifically require participants to decompose the compound words in order to provide a rating, decomposition of the compound is not required to rate its AoA, imageability, or SER. Examination of the effect of lexeme properties on these ratings can provide insight into how these ratings are accomplished for morphologically complex words.

Purpose of the present study

The above (nonexhaustive) review of the research on English compound word recognition illustrates the range of questions that have been the focus of research in this area for over three decades. In addition, English compound words are often popular stimuli in other types of experiments, such as those investigating transposed-letter and morpheme effects (e.g., Christianson, Johnson, & Rayner, 2005; Crepaldi, Rastle, Davis, & Lupker, 2013), the use of parafoveal information in reading (e.g., Angele & Rayner, 2013; Drieghe, Pollatsek, Juhasz, & Rayner, 2010; Juhasz, Pollatsek, Hyönä, Drieghe, & Rayner, 2009; Juhasz, White, Liversedge, & Rayner, 2008), and memory conjunction errors in recognition memory tasks (e.g., Jones & Atchley, 2002, 2006; Jones, Brown, & Atchley, 2007; Lampinen, Odegard, & Neuschatz, 2004; Lloyd, 2013; Wong & Rotello, 2010). Thus, a centralized database of English compound words with ratings of several attributes should be useful to researchers who use these stimuli. In the present study, we have included information for 629 English compound words on six rated variables: subjective familiarity, AoA, semantic transparency, LMD, imageability, and SER.

Method

Participants

A total of 189 questionnaires were completed by Wesleyan University undergraduates during a period of 2 years (March 2012–April 2014). The number of completed ratings for each questionnaire is provided below. Participants either received partial Introductory Psychology course credit for their participation or received a small payment (up to $10.00; Footnote 2).

Materials

A total of 629 English bilexemic compound words were selected from a larger set collected for a previous study (Juhasz & Berkowitz, 2011). Care was taken to include compound words that should vary in familiarity to a native speaker of American English, as well as vary on the other dimensions examined. In addition, each selected compound was included in the English Lexicon Project (Balota et al., 2007) database, so that naming and lexical decision times could be extracted. The compound words ranged in length from five to 13 letters (M = 8.36, SD = 1.35). Their first-lexeme lengths ranged from two to eight letters (M = 4.21, SD = 1.11), and the second-lexeme lengths ranged from two to eight letters (M = 4.16, SD = 0.84). The frequencies of occurrence for the entire compound ranged from 0.02 to 266.65 per million (M = 3.58, SD = 17.50) in the SUBTLEX-US corpus (Brysbaert & New, 2009), with a frequency range of 0.10 to 9,842.45 per million for the first lexeme (M = 568.27, SD = 1,261.80), and of 0.41 to 22,677.84 per million for the second lexeme (M = 489.63, SD = 1,278.45). The 629 compounds were assigned to two sets. Set 1 consisted of 315 compound words, and Set 2 consisted of 314 compound words.

Questionnaires

Six types of questionnaires were administered: imageability (Set 1 N = 21, Set 2 N = 21), familiarity (Set 1 N = 14, Set 2 N = 14), SER (Set 1 N = 15, Set 2 N = 15), AoA (Set 1 N = 15, Set 2 N = 15), semantic transparency (Set 1 N = 14, Set 2 N = 15), and LMD (Set 1 N = 15, Set 2 N = 15).

For imageability, the instructions were modified from those of Schock, Cortese, and Khanna (2012), in the following ways: The number of items was removed, the word examples were changed to blackboard and something to illustrate high- and low-imagery compound items, and reference to the computer buttons was removed because the questionnaire was completed on paper. The scale was from 1 to 7, where higher numbers indicate greater ease of imagery for an item. The instructions for the SERs were taken directly from Juhasz and Yap (2013) and Juhasz et al. (2011) and were not modified for the present study. A rating of 1 to 7 was used, with higher numbers indicating that a word evoked a greater sensory/perceptual experience in the mind of the rater. For the LMD rating, instructions were modified from the description included in Inhoff et al. (2008), and the rating was conducted on a 0-to-10 scale. The semantic transparency instructions were modified from those used previously by the first author (Juhasz, 2007) to study transparency effects in English compound word recognition. A 1-to-7 scale was used, where higher numbers indicate that the lexemes were transparently related to the meaning of the compound word. The AoA instructions were those used in past studies by the first author (e.g., Juhasz & Rayner, 2003) using the 1–7 scale developed by Gilhooly and Logie (1980), where a rating of 1 indicates that the word was learned between ages 0–2, and a rating of 7 indicates that the word was learned at age 13 or older. The familiarity instructions were those used in past studies examining familiarity effects by the first author (e.g., Juhasz, 2008). Words were rated on a 1-to-7 scale, where higher numbers indicate a higher level of familiarity with the item. The complete instructions for the four previously unpublished directions (LMD, transparency, familiarity, and AoA) are provided in the Appendix.

Procedure

Ratings were collected over the course of 2 years (March 2012 to April 2014) at Wesleyan University. For all questionnaires, participants were provided with a packet consisting of one set of the compound words, with a sheet on top detailing the questionnaire instructions. A short statement reiterating the main instructions and scale was also included in the header of each questionnaire page. In the case of the AoA rating, participants were also provided with a sheet containing the Gilhooly and Logie (1980) scale, indicating the age range corresponding to each of the values on the 1-to-7 scale. The ratings were collected in concert with other, unrelated studies being conducted in the Eye Movement and Reading Laboratory, such as word-naming and eyetracking studies. In these situations, participants were tested individually in a testing room. For some ratings, larger groups of participants were tested in a classroom setting. Care was taken to ensure that the ratings were not administered immediately after an experiment assessing compound word recognition, since this could influence the results of the ratings. All rated values were entered into an Excel database and double-checked by research assistants.

Results and discussion

The means, standard deviations, and ranges of each of the rated variables are presented in Table 1. Correlations between the six rated variables, in addition to length and word frequency, are presented in Table 2. Following a discussion of rating reliability, the characteristics of each of the rated variables are discussed separately. This is followed by analyses examining how lexeme characteristics contribute to the ratings. Finally, analyses are reported examining how each of these variables contributes to compound word processing as indexed by the lexical-decision and naming response times in the ELP (Balota et al., 2007).

Table 1 Descriptive statistics for the six ratings in the compound word database
Table 2 Correlations between length, frequency, and the six rated variables in the compound word database

Reliability

The interrater reliability for the sets of ratings was examined using intraclass correlations (ICCs) for each set of ratings. Specifically, the reliability of the average rating data was assessed via a two-way random consistency model. ICCs were in the excellent range for all sets of ratings (as determined by the cutoffs provided by Cicchetti, 1994; see also Hallgren, 2012): familiarity (Set 1 ICC = .896, Set 2 ICC = .927), AoA (Set 1 ICC = .941, Set 2 ICC = .955), semantic transparency (Set 1 ICC = .898, Set 2 ICC = .911), LMD (Set 1 ICC = .913, Set 2 ICC = .909), imageability (Set 1 ICC = .957, Set 2 ICC = .960), and SER (Set 1 ICC = .889, Set 2 ICC = .837; see Footnote 3).
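
For readers who wish to compute a comparable statistic on their own ratings, the following is a minimal sketch of an average-measures consistency ICC from a two-way layout, (MS_rows − MS_error) / MS_rows. It is not the code used for the present analyses; the function name and the toy ratings are illustrative assumptions, and the sketch assumes a complete items × raters matrix with no missing ratings.

```python
def icc_consistency_average(ratings):
    """Average-measures consistency ICC for a complete items x raters matrix.

    `ratings` is a list of rows, one per item, each holding one score per rater.
    Assumption: every rater scored every item (no missing cells).
    """
    n = len(ratings)       # number of items (rows)
    k = len(ratings[0])    # number of raters (columns)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Partition the total sum of squares into items, raters, and residual.
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / ms_rows


# Toy data (hypothetical): four items rated by three raters on a 1-7 scale.
toy = [[4, 5, 4], [2, 3, 3], [6, 6, 7], [1, 2, 1]]
print(round(icc_consistency_average(toy), 3))
```

Because the consistency form ignores differences in rater means, raters who differ only by a constant offset still yield an ICC of 1.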

Rated variable characteristics

Familiarity

As we discussed in the introduction, English compound words tend to be of a relatively low frequency in published corpora, leading to a restriction of range issue when designing studies. This was evident in the present database. According to the SUBTLEX-US frequency norms (Brysbaert & New, 2009), 595 of the compounds in the present study have a word frequency of 10 or less per million (94.59 %), 31 compounds have a frequency between 10 and 100 (4.93 %), and only three compounds have a frequency over 100 (0.48 %; outside = 170.02, whatever = 246.75, and nobody = 266.65). Rated familiarity, which can be thought of as a form of subjective frequency, ranged from 1.57 to 7.00 in the present study. However, over 91 % of the compounds were rated above a 4.00 on the seven-point scale (578/629), indicating that participants were familiar with the majority of these compound words. The correlation between rated familiarity and word frequency, although statistically significant, was quite low at r = .164. We also calculated the correlations between rated familiarity and two additional measures of word frequency: the HAL frequency corpus (Burgess & Livesay, 1998) and Kučera and Francis (1967). These were also quite low, amounting to r = .164 for all 629 compounds in the HAL corpus, and r = .181 for the 352 compounds contained in the Kučera and Francis corpus. Larger correlations were obtained between the familiarity ratings and AoA (r = –.670), imageability (r = .475), SER (r = .440), and semantic transparency (r = .287), indicating that more-familiar compounds tended to be those that were rated as being learned earlier in life, produced more ready mental imagery, evoked a greater sensory/perceptual experience, and were rated as being more semantically transparent.

Age of acquisition

To our knowledge, AoA is a variable that has not been directly examined for English compound words. One motivation for including the variable in this database was to aid developmental researchers in selecting compound words for future studies. As is displayed in Fig. 1, approximately 31 % of the compounds in the database were rated below a 4 on the Gilhooly and Logie (1980) scale, indicating that according to retrospective self-report, they were known prior to 6 years of age. Although rated AoA has not been specifically examined for compound words, Kuperman and colleagues (2012) recently conducted a large rating experiment for over 30,000 English words. For the 597 compounds that appear in both databases, the correlation between the AoA ratings was r = .839. In the present database, AoA was marginally correlated with compound word length (r = .072), and was modestly negatively correlated with word frequency (r = –.200). AoA correlated significantly with transparency (r = –.400), SER (r = –.555), imageability (r = –.636), and familiarity (r = –.670). The direction of these correlation coefficients indicated that compounds learned early in life were more familiar, evoked greater imagery and sensory/perceptual experiences, and also tended to be more semantically transparent.

Fig. 1

Percentages of compound words in the database that received average ratings in each range from the Gilhooly and Logie (1980) AoA scale. In this scale, a rating of 1 indicates that a word was learned between 0 and 2 years of age, 2 indicates the word was learned between 2 and 4 years of age, 3 indicates the word was learned between 4 and 6 years of age, 4 indicates the word was learned between 6 and 8 years of age, 5 indicates the word was learned between 8 and 10 years of age, 6 indicates the word was learned between 10 and 12 years of age, and 7 indicates the word was learned at age 13 or older.

Transparency

As we mentioned in the introduction, evidence has been inconsistent as to the role that semantic transparency plays in English compound word recognition, with some studies reporting that opaque English compounds are processed more slowly than transparent ones (e.g., Juhasz, 2007; Underwood et al., 1990), whereas others report no difference in overall processing unless the experimental setup favored decomposition (Frisson et al., 2008; Ji et al., 2011). In the present study, compound words were specifically selected to represent a range of semantic transparencies for use in future experiments. In the present database, 222 compound words were rated as having a transparency of 4 or less on the 1–7 scale (35.3 %), whereas 407 were rated as having a transparency over 4 on this scale (64.7 %). Thus, it should be possible for researchers to select items for use in experiments on the semantic transparency of English compound words. Semantic transparency was not significantly correlated with frequency or LMD. However, significant correlations were obtained with word length (r = .124), AoA (r = –.400), familiarity (r = .287), imageability (r = .469), and SER (r = .404). The correlations indicated that longer, more familiar, and earlier-acquired compounds tended to be rated as more semantically transparent. In addition, compounds that were more semantically transparent were those that evoked more mental imagery and sensory/perceptual experiences.

Lexeme meaning dominance

The inclusion of the LMD rating in this database was based on the work of Inhoff and colleagues (2008), which suggested that LMD can modulate the pattern of lexeme frequency effects observed for English compound words. Inhoff et al. (2008) defined headed compounds as those that received a 0–4 on the 0–10 scale, and tailed compounds as those that received a rating of 6–10 on the scale. In the present database, 145 compounds met this definition of headed (23.1 %), 169 met the definition of tailed (26.9 %), and the remainder (50.1 %) had an average rating in the 4–6 range on the scale. Thus, the breakdown in ratings should provide researchers interested in further examining this variable with a relatively large set of items. LMD ratings were not strongly correlated with other variables in this study. The only significant correlations were with familiarity (r = –.082), SER (r = .093), and imageability (r = .130).
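
The headed/tailed split described above can be sketched as a simple classification of each compound's mean LMD rating. This is an illustrative sketch only: the function name and example ratings are hypothetical, and the handling of ratings that fall exactly on 4 or 6 (assigned here to the middle band) is an assumption, because the boundary treatment is not specified in the text.

```python
def classify_lmd(mean_rating):
    """Label a compound from its mean LMD rating on the 0-to-10 scale.

    Assumption: ratings of exactly 4 or 6 are placed in the middle band.
    """
    if not 0 <= mean_rating <= 10:
        raise ValueError("LMD ratings are on a 0-to-10 scale")
    if mean_rating < 4:
        return "headed"
    if mean_rating > 6:
        return "tailed"
    return "neither"


# Hypothetical mean ratings, not values from the database.
for rating in (2.5, 5.0, 7.2):
    print(rating, classify_lmd(rating))
```

Researchers who prefer inclusive boundaries can change the two comparisons; the choice affects only items whose mean rating falls exactly on a cutoff.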

Imageability

Compound imageability ranged from 1.05 to 6.95 in the present database. A total of 269 (42.8 %) of the compound words received a rating of 4 or less on the 1–7 scale, whereas 360 (57.2 %) received a rating over 4 on the scale. Compound imageability did not correlate significantly with the length of the compound, and the correlation with frequency was only marginally significant (r = –.076). Significant correlations were obtained with LMD (r = .130), semantic transparency (r = .469), familiarity (r = .475), AoA (r = –.636), and SER (r = .702).

Sensory experience ratings

SER ranged from 1.00 to 5.93 in the present database. Unlike with the imageability variable, the majority of compounds (542; 86.2 %) received an average rating of 4 or less on the seven-point scale, whereas only 87 compounds (13.8 %) received a rating over 4 on the scale. Therefore, only a relatively small number of compounds were judged as activating a strong sensory/perceptual experience. The highest-rated compounds on SER were meatball (5.53), popcorn (5.87), and watermelon (5.93), supporting Amsel et al.’s (2012) suggestion that this variable may be particularly related to sensory experiences evoked by edible concepts. SER did not correlate significantly with word length or frequency in the present database. Significant correlations were observed with LMD (r = .093), familiarity (r = .440), semantic transparency (r = .404), AoA (r = –.555), and imageability (r = .702). The correlations with AoA and imageability were higher for these compound items than for a large set (over 5,000) of mono- and disyllabic words investigated by Juhasz and Yap (2013). A number of compounds in the database received high ratings of imageability, indicating that they easily evoke mental images, but were rated as not being evocative of strong automatic sensory/perceptual experiences (e.g., signpost, 5.86 imageability, 1.60 SER; hubcap, 5.10, 1.67; cupboard, 6.05, 1.73; briefcase, 6.14, 1.80; or pocketbook, 5.62, 1.93), suggesting that these two variables are dissociable for these compound words.

The influence of lexeme characteristics on compound ratings

As we discussed in the introduction, a large body of research suggests that compound words are decomposed into their lexemes during processing. One motivation for the present study was to examine whether the characteristics of the individual lexemes in a compound word can influence the ratings of lexical, relational, and nonrelational semantic variables for the entire compound expression. This was accomplished by conducting a by-items baseline regression that included the following predictors: compound word length, compound word frequency, Lexeme 1 frequency, and Lexeme 2 frequency (measured as the log of the SUBTLEX-US frequency count; Brysbaert & New, 2009). In addition, lexeme ratings of AoA (obtained from Cortese & Khanna, 2008; Schock, Cortese, Khanna, & Toppi, 2012), imageability (obtained from Cortese & Fugett, 2004, and Schock et al., 2012), and SER (obtained from Juhasz & Yap, 2013) were included in the relevant regression analyses. SER values were not available for all lexemes. This regression analysis was therefore based on a subset of 509 compound words. The results of these regression analyses are reported in Table 3.

Table 3 Standardized regression coefficients (β) and significance from models predicting compound ratings

As can be seen from Table 3, lexeme characteristics did predict several of the overall ratings. Specifically, Lexeme 1 frequency was a marginal predictor of compound familiarity and a significant predictor of compound AoA, imageability, and SER. Second-lexeme frequency was a significant predictor of compound AoA, LMD, and SER. Transparency was not significantly predicted by either lexeme frequency.

Notably, each of the relevant lexeme ratings was a significant predictor of the overall compound rating, such that both the Lexeme 1 and 2 AoAs predicted the compound’s AoA, Lexeme 1 and 2 imageabilities predicted the compound imageability, and Lexeme 1 and 2 SERs predicted the compound SER.

However, the lexeme frequencies and lexeme AoAs were themselves highly correlated (Lexeme 1, r = –.701, p < .001; Lexeme 2, r = –.646, p < .001); thus, this could have caused an issue with multicollinearity in the regression model. In order to further explore the impact of the rated lexeme characteristics on the compound ratings, we performed analyses for the AoA, imageability, and SER variables, comparing the baseline model with the extended model including the relevant lexeme ratings. Addition of the lexeme ratings significantly increased the proportion of variance accounted for by each model [AoA: change in R² = .134, F(2, 622) = 61.37, p < .001; imageability: change in R² = .239, F(2, 622) = 111.55, p < .001; SER: change in R² = .186, F(2, 502) = 62.86, p < .001]. This suggests that when individuals are asked to rate a compound word on a particular variable, the compound word is decomposed, and the relevant characteristics of both of the lexemes influence the rating for the entire compound word.
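
Model comparisons of this kind rest on the standard F test for the change in R² between nested regression models. A minimal sketch follows; the function name and the R² values in the usage lines are illustrative assumptions, not values from the present study. Only the degrees of freedom are derived from the text: 629 items and six predictors in the extended model (four baseline predictors plus two lexeme ratings) give 629 − 6 − 1 = 622.

```python
def f_change(r2_reduced, r2_full, n_items, k_full, q_added):
    """F statistic for the R-squared change between nested OLS models.

    r2_reduced : R-squared of the smaller (baseline) model
    r2_full    : R-squared of the extended model
    n_items    : number of observations (here, compound words)
    k_full     : number of predictors in the extended model (intercept excluded)
    q_added    : number of predictors added to the baseline model
    """
    df1 = q_added
    df2 = n_items - k_full - 1
    f = ((r2_full - r2_reduced) / df1) / ((1.0 - r2_full) / df2)
    return f, df1, df2


# Hypothetical R-squared values, chosen only to show the call.
f, df1, df2 = f_change(r2_reduced=0.30, r2_full=0.40,
                       n_items=629, k_full=6, q_added=2)
print(f"F({df1}, {df2}) = {f:.2f}")
```

The same function reproduces the denominator degrees of freedom for the SER comparison when `n_items=509` is supplied, since 509 − 6 − 1 = 502.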

Influence on compound word recognition

Analyses were conducted to assess the predictive role that each variable has in compound word processing. Lexical decision and naming results for the compounds were retrieved from the ELP (Balota et al., 2007). Following the work of Kuperman (2013), the control variables used were compound word length, compound word frequency, Lexeme 1 frequency, and Lexeme 2 frequency, and the log-transformed latencies were used as the dependent measures in the regression models. All frequencies were the log of the word frequency count from the SUBTLEX-US (Brysbaert & New, 2009) corpus. Each rated variable was then added to this baseline model separately to assess its predictive role. Where appropriate, the relevant characteristics of each lexeme were also added to the baseline model, in concert with the compound ratings.

Lexical decision

Table 4 provides the results of these models for the lexical decision times (LDTs). The baseline model indicated that LDTs were significantly predicted by all four control variables (compound length, compound frequency, and Lexeme 1 and 2 frequencies). Familiarity was also a significant predictor of the LDTs when it was added to the model. The baseline model was extended to include both the rated AoA of the entire compound word and the AoAs of the two lexemes. Compound AoA was a significant predictor of performance. The Lexeme 1 AoA was a marginally significant predictor of LDTs, whereas Lexeme 2 AoA was not a significant predictor. In this model, Lexeme 1 frequency was no longer significant, perhaps due to a multicollinearity issue. In order to further explore whether lexeme AoA influences the LDTs for compound words, a baseline + compound AoA model was computed, with all but the two lexeme AoAs included as predictors (length, frequency, Lexeme 1 frequency, Lexeme 2 frequency, and compound AoA). This baseline + compound AoA model was compared to an extended model containing the lexeme AoAs. The change in R² for the extended model was .005, which was only marginally significant [F(2, 621) = 2.67, p = .070]. Thus, we did not find strong support for an effect of lexeme AoAs on compound processing in the lexical decision task.

Table 4 Standardized regression coefficients (β) and significance from models predicting lexical decision results from the ELP (Balota et al., 2007)

In terms of the relational semantic variables, there were mixed results. Semantic transparency was a significant predictor of LDTs in the present study, supporting the idea that transparency may influence overall compound processing (see, e.g., Juhasz, 2007). However, LMD did not significantly predict LDTs when added to the baseline model. Nonrelational semantic variables did significantly predict LDTs to compound words. When the baseline model was extended to include the imageability of the compound as well as the lexeme imageabilities, only the compound imageability significantly predicted LDTs. The same pattern emerged with the model containing compound and lexeme SERs (for the subset of 509 compounds). This pattern replicates the results of Kuperman (2013), who had previously examined these variables’ effects on lexical decision performance. In order to further examine whether the lexeme ratings of imageability and SER influence compound processing, model comparisons were conducted in an analogous way to the AoA models described above. Addition of the relevant lexeme variables to the models containing the compound rating did not produce a significant increase in R² for either imageability or SER (both ps > .500).

Word naming

Table 5 provides the results for the word-naming analyses using models identical to those described above (see Footnote 4). As in the LDT results, the baseline model indicated that compound word length, frequency, Lexeme 1 frequency, and Lexeme 2 frequency were significant predictors of compound word naming times. The results for the extended models for familiarity and AoA were quite similar to those from the lexical decision models discussed above. Both compound familiarity and AoA significantly predicted naming times. In the word-naming model, neither the Lexeme 1 nor Lexeme 2 AoA predicted naming times. A model comparison analogous to the one conducted for the AoA model in the lexical decision task demonstrated that adding lexeme AoA ratings to the model did not produce a significant change in the R² associated with the baseline + compound AoA model (p > .300).

Table 5 Standardized regression coefficients (β) and significance from models predicting word naming results from the ELP (Balota et al., 2007)

The relational semantic variables (transparency and LMD) did not significantly predict naming times. The discrepancy in the roles of transparency in the two tasks will be discussed in the Conclusion section. In contrast, both nonrelational semantic variables explored (imageability and SER) significantly predicted naming times. Whereas lexeme imageability ratings did not significantly predict naming times, we found an unexpected influence of the Lexeme 2 SER, indicating that compounds with a higher second-lexeme SER took longer to name. However, model comparisons of the baseline + compound SER model to a model containing both lexeme SERs did not produce a significant change in R² once the lexeme SERs were included (p > .100). The same was true for the model comparison with the imageability variable (p > .700).

Conclusion

Familiarity and AoA both index the experience that an individual has with a given compound word form and meaning, and are therefore lexical/semantic in nature. Both variables significantly affected compound word LDTs and naming times, even when the effects of compound word length, frequency, and the lexeme frequencies were statistically controlled. The finding that compound word familiarity has an effect over and above word frequency is consistent with studies on morphologically simple words (e.g., Gernsbacher, 1984; Juhasz & Rayner, 2003; Williams & Morris, 2004). The familiarity ratings included in the present database should allow researchers to explore whether compound familiarity impacts the lexical decomposition process for English compound words. To our knowledge, AoA has not previously been examined for compound words. The finding that AoA affects compound word recognition in both tasks is therefore novel but not surprising, given that it has been found to influence LDTs and word-naming times for large numbers of monosyllabic and multisyllabic words (e.g., Cortese & Khanna, 2007; Cortese & Schock, 2013). Inclusion of these ratings in the database will allow researchers to select items that should be known to individuals of differing age groups. In addition, future research can explore whether the age at which a compound word is acquired influences the representation of the compound word in the mental lexicon.

Both nonrelational semantic variables explored in this study, imageability and SER, influenced LDTs and word-naming times for the compound words. The LDT results replicate Kuperman (2013), who first explored the influence of these variables on compound word recognition. The present study extends this work by also showing an effect of these variables in compound naming times. From a methodological standpoint, this suggests that studies utilizing a factorial design in which compounds are placed in different conditions (e.g., compounds with a high-frequency lexeme vs. compounds with a low-frequency lexeme) should ensure that the compounds do not differ on these variables across groups.

The influence of the relational semantic variables, transparency and LMD, was not as robust as those for the other variables explored. LMD did not have a direct effect on either lexical decision or word naming times. This result is consistent with Inhoff et al. (2008), who observed very similar overall LDTs for headed and tailed compounds and did not report a main effect of LMD on word naming. However, Inhoff et al. (2008) did observe different patterns of lexeme usage for headed than for tailed compound words. The present database should provide interested researchers with enough items to design studies further examining this variable. The review of semantic transparency in the introduction illustrated that although transparency is relevant for theories of compound word processing (e.g., Libben, 1998), the effect of semantic transparency on compound word recognition has not been consistently observed across experiments. A recent study by Ji et al. (2011) showed a main effect of transparency in a lexical decision task only when the experimental conditions favored decomposition (e.g., spacing between lexemes or coloring of lexemes). In contrast, in the present analysis a main effect of semantic transparency emerged in the lexical-decision times from the ELP (Balota et al., 2007), even though lexical decomposition was not encouraged through an experimental manipulation. The discrepancy in these results may perhaps be due to an issue with power, since the present analyses included a much larger number of compound words (629) than were in the Ji et al. study (between 60 and 72 items). However, even with the large number of compound words in the present study, an effect of semantic transparency was not observed in the word naming task. This study has been the first, to our knowledge, to explore the influence of transparency in word naming. Although semantic effects are observed in the word naming task, such as the finding of significant effects of both imageability and SER in the present investigation (see also, e.g., Yap & Balota, 2009), word naming may be less sensitive to semantic variables than is lexical decision, for the latter task may encourage the use of word meaning to correctly discriminate words from nonwords (see, e.g., Chumbley & Balota, 1984). The present results suggest that whereas nonrelational semantic variables influence the naming of compound words, relational semantic variables do not. In our opinion, this is an intriguing finding, since it suggests that the effects of semantic transparency observed in the lexical decision task may occur relatively late, after nonrelational properties of the compound word’s semantic representation are accessed.

Another purpose of this study was to explore whether characteristics of the constituent lexemes in a compound word influence ratings of semantic properties of compound words and/or influence compound word recognition. The answer to the first question was clearly affirmative. When individuals rate a compound word on AoA, imageability, or SER, the relevant characteristics of both lexemes influence the overall compound rating. Thus, individuals are clearly decomposing the compound during the rating task, and the semantic properties of the lexemes influence the resulting ratings. In contrast, we did not find much support for any lexeme characteristic other than lexeme frequency influencing compound word processing in the lexical decision and word naming tasks. There was only a marginal effect of first-lexeme AoA in the lexical decision task, and although the effect of second-lexeme SER was significant in the word naming analysis, this effect disappeared when the initial phoneme was coded in the model (see note 4). These findings support the position put forth by Kuperman (2013) that the semantic representations of the lexemes in a compound word are not automatically activated when the compound word is processed.

Of course, many potential variables may impact the processing of compound words, such as morphological family size and family frequency (e.g., de Jong et al., 2002; Juhasz & Berkowitz, 2011; Kuperman et al., 2009; Kuperman et al., 2008), valence and arousal (e.g., Kuperman, 2013), and relational structure (e.g., Gagné & Spalding, 2004, 2009; Spalding & Gagné, 2011). Future work could expand the present database to incorporate other relevant variables. The inclusion of the six rated variables in the present database can provide insight into the effects of lexical/semantic, relational semantic, and nonrelational semantic variables on English compound word recognition. The creation of this English compound word database should provide word recognition researchers interested in examining morphological complexity with a large set of words to select stimuli for inclusion in future studies exploring a number of questions related to English compound word recognition. In addition, this database should facilitate more complex analyses that can be conducted with the present set of items, using lexical decision times and naming times extracted from the ELP (Balota et al., 2007).

Characteristics of the English compound word database

The supplemental materials consist of the database of 629 English compound words used in the present study (in Excel and .pdf formats). Contained within the database are the compound words, their lengths in letters (LEN), and the average ratings for each of the six rated variables (FAM = familiarity, AOA = age of acquisition, TRAN = transparency, LMD = lexeme meaning dominance, IMAGE = imageability, SER = sensory experience rating). Next to each rating is a column listing the number of scores contributing to each mean rating. Although the number of questionnaires completed is listed in the Method section, participants were encouraged in the instructions for some of the questionnaires to not provide a rating for compounds whose meaning they were uncertain of. Thus, individual items vary in the number of ratings. Standardized z scores for each variable (labeled ZFAM, ZAOA, ZTRAN, ZLMD, ZIMAGE, and ZSER) are also provided in the database, to facilitate comparisons between the various ratings for individual items.
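
As an illustration of how standardized columns such as ZFAM relate to the raw means, the sketch below converts a list of mean ratings to z scores. It is a sketch under stated assumptions rather than a description of how the database columns were produced: the function name and the example values are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, because the text does not state which form was used.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize a list of mean ratings to z scores.

    Assumption: the sample standard deviation (n - 1) is used.
    """
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]


# Hypothetical familiarity means for three compounds (not database values).
fam = [6.5, 4.0, 5.25]
print([round(z, 2) for z in z_scores(fam)])
```

Because each z score is expressed in standard deviation units, an item's ZFAM can be compared directly with its ZAOA or ZSER even though the raw scales differ.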