Over the course of a lifetime, people develop a large vocabulary of many thousands of words. Age of acquisition (AoA, i.e., the age at which a word was learned) is a variable that documents this process. Many studies have shown that AoA is related to performance on a variety of tasks, including picture naming (Carroll & White, 1973), recognition memory (Cortese, Khanna, & Hacker, 2010), reading aloud and lexical decision (Cortese & Khanna, 2007), and more (for a review, see Juhasz, 2005). The AoA effect in reading aloud and lexical decision remains significant even when frequency is controlled (Brysbaert & Cortese, 2011). Overall, very few variables relate to word and picture processing performance as consistently as AoA does.
There has been some controversy as to the locus of AoA effects. Juhasz (2005) reviewed the results of picture naming, word naming, and lexical decision experiments and found the largest effects of AoA in picture naming tasks, followed by lexical decision, then word naming. One interpretation of this pattern is that AoA has a primarily semantic basis, because tasks that place a primary emphasis on semantic information (e.g., picture naming and lexical decision) show larger effects of AoA than tasks that rely less on semantic information (e.g., word naming). This pattern of results is consistent with both the semantic locus hypothesis (Steyvers & Tenenbaum, 2005) and the network plasticity hypothesis (Ellis & Lambon Ralph, 2000). The semantic locus hypothesis proposes that earlier acquired concepts provide a structure onto which later acquired words are attached. Thus, early acquired concepts have more connections from other concepts than later acquired concepts do. Furthermore, words with more connections are more easily retrieved, resulting in an effect of AoA. The network plasticity hypothesis predicts an effect of AoA based on how connectionist models learn. As new words are presented to a model, the resulting changes in connection weights are not constant over time. Rather, items learned earlier produce larger changes to the connection weights than items learned later. Over time, the model loses plasticity, resulting in less efficient learning of words presented later. This cost for later acquired items is especially pronounced when the relationship between inputs and outputs is not consistent (Ellis & Lambon Ralph, 2000). For example, a consistent word such as balk may be acquired relatively late, but the computation of its phonological code may benefit from orthographic-phonological knowledge of earlier acquired words that share a similar orthographic-to-phonological mapping (e.g., talk, walk, and chalk). In contrast, an inconsistent word such as pint will not benefit from previous knowledge of other -int words (e.g., mint, hint, and lint) because they are pronounced differently.
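The connectivity claim behind the semantic locus hypothesis can be illustrated with a toy growing network. The sketch below is not the model of Steyvers and Tenenbaum (2005); it is a simplified preferential-attachment simulation in which each new node (a hypothetical "word") links to existing nodes with probability proportional to their current number of connections. The function name, the parameter values, and the random seed are illustrative assumptions.

```python
import random


def grow_network(n_words=500, links_per_word=2, seed=1):
    """Toy growing network: return each node's degree, in order of acquisition.

    Each new word attaches to `links_per_word` distinct existing words, chosen
    with probability proportional to their current degree (preferential
    attachment). This is an illustrative assumption, not a published model.
    """
    rng = random.Random(seed)
    # Start from a small, fully connected seed of links_per_word + 1 words.
    n_seed = links_per_word + 1
    degree = [n_seed - 1] * n_seed
    # Each node index appears in `endpoints` once per connection it has, so a
    # uniform draw from this list is a degree-proportional draw over nodes.
    endpoints = [i for i in range(n_seed) for _ in range(n_seed - 1)]
    for new in range(n_seed, n_words):
        targets = set()
        while len(targets) < links_per_word:
            targets.add(rng.choice(endpoints))
        degree.append(0)
        for t in targets:
            degree[t] += 1
            degree[new] += 1
            endpoints.extend([t, new])
    return degree


def mean(xs):
    return sum(xs) / len(xs)


degree = grow_network()
early = mean(degree[:100])   # first 100 "words" acquired
late = mean(degree[-100:])   # last 100 "words" acquired
print(f"mean connections, early words: {early:.2f}")
print(f"mean connections, late words:  {late:.2f}")
```

Under these assumptions, nodes added early accumulate more connections than nodes added late, which is the structural asymmetry that the hypothesis links to easier retrieval.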
Also controversial is the separation of the effects of AoA and imageability in predicting reading aloud performance, because imageability and AoA are moderately correlated (Cortese & Khanna, 2007; Schock, Cortese, & Khanna, in press). Several researchers using a large-scale multiple regression method have found that imageability significantly predicts reading aloud performance (e.g., Balota, Cortese, Sergent-Marshall, Spieler, & Yap, 2004). However, when Cortese and Khanna (2007) added AoA to the set of predictor variables, the effect of imageability was no longer significant. In addition, Monaghan and Ellis (2002) reported that, when AoA was controlled, the interaction among frequency, consistency, and imageability reported by Strain, Patterson, and Seidenberg (1995) was no longer significant. It has therefore been suggested that the observed effect of imageability on reading aloud performance may actually be due to a failure to adequately control for AoA (Monaghan & Ellis, 2002; Ellis & Monaghan, 2002).
In addition, partly because of the limited availability of AoA estimates, it has been somewhat difficult to separate AoA effects from word frequency effects, because earlier acquired words tend to be associated with higher frequency values than later acquired words (see, e.g., Zevin & Seidenberg, 2004). As AoA estimates become more readily available, the picture that emerges is that AoA influences word processing even after word frequency has been well controlled (see Brysbaert & Cortese, 2011).
Word processing studies have traditionally used factorial designs with small sets of items; however, flaws in this approach (see Balota et al., 2004) have prompted many researchers to move to a mega study approach (Balota et al., 2004; Cortese & Khanna, 2007; Chateau & Jared, 2003; New, Ferrand, Pallier, & Brysbaert, 2006; Yap & Balota, 2009). As noted by Balota et al. (2004), some of the shortcomings of the factorial approach include dichotomizing continuous variables (see, e.g., Humphreys, 1978), difficulty in controlling for all of the relevant variables (also see Cutler, 1981), problems associated with stimulus selection (Forster, 2000), and an overemphasis on determining statistical significance at the expense of assessing the relative influence of factors. The mega study (for a review, see Balota, Yap, Hutchison, & Cortese, in press) typically employs a large number of trials, and multiple regression analyses are conducted to examine the influence of predictor variables on performance measures. For example, the English Lexicon Project (ELP; Balota et al., 2007) provides reaction time estimates in the reading aloud and lexical decision tasks for over 40,000 English words. While AoA estimates are available for large numbers of monosyllabic words (see Cortese & Khanna, 2008), AoA ratings do not exist for most of the polysyllabic words in the ELP. For example, we have now obtained AoA and imageability ratings for 3,000 monosyllabic words and 3,000 disyllabic words. Of this corpus of 6,000 words, only 3.3% (2.3% of the disyllabic corpus) are represented in the Morrison, Chappell, and Ellis (1997) norms, 20.2% (14.5% of the disyllabic corpus) are represented in the Bird, Franklin, and Howard (2001) norms, and 20.0% (17.8% of the disyllabic corpus) are represented in the Stadthagen-Gonzalez and Davis (2006) norms. So, as word processing research turns more to the study of polysyllabic words, and as the mega study approach becomes more common, normative data for large sets of polysyllabic words will be required.
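The regression logic that motivates such norms can be sketched with simulated data. The example below assumes a hypothetical set of 3,000 items in which AoA and log frequency are negatively correlated, and it estimates the AoA effect on latency after frequency has been partialled out of both variables (for ordinary least squares, this residual-on-residual slope equals the AoA coefficient from a multiple regression that includes frequency). All values are synthetic, the variable names and effect sizes are assumptions, and nothing here reproduces ELP data or any reported result.

```python
import random


def slope_intercept(x, y):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def residuals(x, y):
    """Part of y that a linear function of x cannot account for."""
    a, b = slope_intercept(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


rng = random.Random(42)
n_items = 3000

# Synthetic items: later-acquired words tend to be lower in frequency.
log_freq = [rng.gauss(0, 1) for _ in range(n_items)]
aoa = [-0.6 * f + rng.gauss(0, 0.8) for f in log_freq]
# Simulated latencies (ms) with assumed effects of -20 per unit of frequency
# and +15 per unit of AoA, plus item-level noise.
rt = [600 - 20 * f + 15 * a + rng.gauss(0, 30) for f, a in zip(log_freq, aoa)]

# Slope of RT on AoA alone, which also absorbs part of the frequency effect.
_, raw_effect = slope_intercept(aoa, rt)
# Partial out frequency from both AoA and RT, then relate the residuals.
_, aoa_effect = slope_intercept(residuals(log_freq, aoa), residuals(log_freq, rt))

print(f"AoA slope, frequency ignored:    {raw_effect:.1f} ms per unit")
print(f"AoA slope, frequency controlled: {aoa_effect:.1f} ms per unit")
```

In this simulation the uncontrolled slope overstates the assumed AoA effect because it absorbs part of the frequency effect, whereas the controlled slope recovers a value close to the one built into the data. Separating the two predictors in this way is only possible when AoA estimates exist for the items being analyzed.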
The current study provides AoA estimates for 3,000 disyllabic words. The procedures used to collect these estimates were very similar to those used by Cortese and Khanna (2008). It is expected that these norms will be useful to those who are analyzing performance in the ELP or those whose studies require a large number of disyllabic words.