Abstract
Age of acquisition (AoA) ratings based on a 1-7 scale for 3,000 disyllabic words were obtained from 32 participants. We demonstrate that these estimates are both reliable and valid. These estimates add to those collected on monosyllabic words and are of value to researchers interested in factors that contribute to word processing. They also can be used in regression analyses on measures obtained from large databases, and can be used in conjunction with imageability ratings for the same word corpus to differentiate AoA from imageability.
Age of acquisition estimates for 3,000 disyllabic words
Over the course of a lifetime, people develop a large vocabulary of many thousands of words. Age of acquisition (AoA, i.e., the age at which a word was learned) is a variable that documents this process. Many studies have shown that AoA is related to performance on a variety of tasks, including picture naming (Carroll & White, 1973), recognition memory (Cortese, Khanna, & Hacker, 2010), reading aloud and lexical decision (Cortese & Khanna, 2007), and more (for a review, see Juhasz, 2005). The AoA effect in reading aloud and lexical decision remains significant even when frequency is controlled (Brysbaert & Cortese, 2011). Overall, very few variables relate to word and picture processing performance as consistently as AoA does.
There has been some controversy as to the locus of AoA effects. Juhasz (2005) reviewed the results of picture naming, word naming, and lexical decision experiments and found the largest effects of AoA in picture naming tasks, followed by lexical decision, then word naming. One interpretation of this pattern is that AoA has primarily a semantic basis, because tasks that place a primary emphasis on semantic information (e.g., picture naming and lexical decision) show larger effects of AoA than tasks that rely less on semantic information (e.g., naming). This pattern of results is consistent with both the semantic locus hypothesis (Steyvers & Tenenbaum, 2005) and the network plasticity hypothesis (Ellis & Lambon Ralph, 2000). The semantic locus hypothesis proposes that earlier acquired concepts provide a structure onto which later acquired words associate. Thus, early AoA concepts have more connections from other concepts than later acquired concepts. Furthermore, words with more connections are more easily retrieved, resulting in an effect of AoA. The network plasticity hypothesis predicts an effect of AoA based on how connectionist models learn. As new words are presented to a model, the resulting changes in connection weights are not constant over time. Rather, those items learned earlier result in larger changes to the connection weights than those learned later. Over time, the model loses plasticity, resulting in less efficient learning of words presented later. This cost for later acquired concepts occurs especially when the relationship between inputs and outputs is not consistent (Ellis & Lambon Ralph, 2000). For example, balk may be acquired relatively late, but the computation of its phonological code may benefit from orthographic-phonological knowledge of earlier acquired words that share a similar orthographic-to-phonological mapping (e.g., talk, walk, and chalk).
In contrast, an inconsistent word such as pint will not benefit from previous knowledge of other -int words (e.g., mint, hint, and lint) because they are pronounced differently.
Also controversial is the separation of the effects of AoA and imageability in predicting reading aloud performance, because imageability and AoA are moderately correlated (Cortese & Khanna, 2007; Schock, Cortese, & Khanna, in press). Several researchers using a large-scale multiple regression method have found that imageability significantly predicts reading aloud performance (e.g., Balota, Cortese, Sergent-Marshall, Spieler, & Yap, 2004). However, when Cortese and Khanna (2007) added AoA to the set of predictor variables, the effect of imageability was no longer significant. In addition, Monaghan and Ellis (2002) reported that, when AoA was controlled, the interaction between frequency, consistency, and imageability reported by Strain, Patterson, and Seidenberg (1995) was no longer significant. So, it has been suggested that the observed effect of imageability on reading aloud performance may actually be due to a failure to adequately control for AoA (Monaghan & Ellis, 2002; Ellis & Monaghan, 2002).
In addition, partially due to the limited availability of AoA estimates, it has been somewhat difficult to separate AoA effects from word frequency effects as earlier acquired words tend to be associated with higher frequency values than later acquired words (see, e.g., Zevin & Seidenberg, 2004). As AoA estimates are becoming more readily available, the picture that emerges is that AoA influences word processing even after word frequency has been well controlled (see Brysbaert & Cortese, 2011).
Word processing studies have traditionally used factorial designs with small sets of items; however, flaws in this approach (see Balota et al., 2004) have prompted many researchers to move to a mega study approach (Balota et al., 2004; Cortese & Khanna, 2007; Chateau & Jared, 2003; New, Ferrand, Pallier, & Brysbaert, 2006; Yap & Balota, 2009). As noted by Balota et al. (2004), some of the shortcomings of the factorial approach include dichotomizing continuous variables (see, e.g., Humphreys, 1978), difficulty in controlling for all of the relevant variables (also see Cutler, 1981), problems associated with stimulus selection (Forster, 2000), and an overemphasis on determining statistical significance at the expense of assessing the relative influence of factors. The mega study (for a review, see Balota, Yap, Hutchison, & Cortese, in press) typically employs a large number of trials, and multiple regression analyses are conducted to examine the influence of predictor variables on performance measures. For example, the English Lexicon Project (ELP; Balota et al., 2007) provides reaction time estimates in the reading aloud and lexical decision tasks for over 40,000 English words. While AoA estimates are available for large numbers of monosyllabic words (see Cortese & Khanna, 2008), AoA ratings do not exist for most of the polysyllabic words in the ELP. For example, we now have obtained AoA and imageability ratings for 3,000 monosyllabic words and 3,000 disyllabic words. Of this corpus of 6,000 words, only 3.3% (2.3% of the disyllabic corpus) are represented in the Morrison, Chappell, and Ellis (1997) norms, 20.2% (14.5% of the disyllabic corpus) are represented in the Bird, Franklin, and Howard (2001) norms, and 20.0% (17.8% of the disyllabic corpus) are represented in the Stadthagen-Gonzalez and Davis (2006) norms.
So, as word processing research turns more to the study of polysyllabic words, and the mega study approach increases in frequency, normative data for large sets of polysyllabic words will be required.
The current study provides AoA estimates for 3,000 disyllabic words. The procedures used to collect these estimates were very similar to those used by Cortese and Khanna (2008). It is expected that these norms will be useful to those who are analyzing performance in the ELP or those whose studies require a large number of disyllabic words.
Method
Participants
Thirty-two students enrolled in undergraduate psychology courses at the University of Nebraska Omaha (28) and Creighton University (4) participated for course credit or extra credit. The participants ranged in age from 17 to 40 (M = 20.69). Twenty-six participants were female, and six were male. Their education level ranged from first year of college to fourth year of college (M = 1.9). Based on free response, there were 28 Caucasian participants, two Asian participants, one Black participant and one Hindu participant.
Stimuli
The stimuli were 3,000 disyllabic words. Stimulus characteristics for these words are presented in Table 1. Table 2 provides a correlation matrix of the relationship among AoA and other semantic variables for a relatively large number of disyllabic words (N = 2,792). For a smaller set of words (N = 302) that were in common with the Bennett, Burnett, Siakaluk, and Pexman (2011) norms, we computed the correlation between AoA and body-object interaction (r = -.06, N.S.). In addition, for a small number of words (N = 160) in common with Bradley and Lang’s (1999) corpus, we computed correlations between AoA and valence (r = -.03, N.S.) and between AoA and arousal (r = -.17, p = .03). The words examined here were mainly monomorphemic, but very common multimorphemic words were also included (e.g., awesome, baseball, bathroom). They ranged in frequency from zero to 71.21 occurrences per million, according to the Zeno et al. (1995) norms, and ranged in length from three to 11 letters. We began with 23,365 disyllabic words and narrowed this list to 3,000, as our previous research has indicated that this is the number of items that can be rated in approximately 4 hours. To reduce the number of items, the list of 23,365 was divided among seven undergraduate research assistants, each of whom reviewed a section of the list and selected the words (s)he knew. All words that were not familiar to the undergraduate assistants were eliminated, leaving 15,434 words. We proceeded this way to reduce the number of words and to include mostly words that undergraduates know. To reduce the sample further, many multimorphemic words were eliminated, leaving a representative, but not exhaustive, list of disyllabic words.
Procedure
The procedures of Cortese and Khanna (2008) were followed as closely as possible. A computer was used to collect ratings in a laboratory. Participants were asked to rate words on a scale of one to seven based on their subjective estimate of how early the word was acquired (the instructions appear in the Appendix). Two sessions of 1.25 to 2.00 hours were conducted within one week of each other. Each session consisted of four blocks of 375 words each, for a total of 1,500 words. At the end of each block, the participant was given the opportunity to take a break. Stimuli were presented in a different random order for each participant.
Each trial consisted of a word being presented in lowercase letters in the center of the screen, while the rating scale was visible at the bottom of the screen. The ratings were entered using the number keypad on the right side of the keyboard. Responses that were faster than 500 ms were followed by the message “response too fast – slow down!” on the bottom of the screen. After 2000 ms the word then reappeared on the screen to be rerated. This delay was intended to discourage the participant from again responding too quickly. Responses that were not numbers between 1 and 7 were followed by the message “response invalid – try again” at the bottom of the screen for 2000 ms, after which the word reappeared on the screen to be rerated. The instructions and scale were the same as those given to participants in Cortese and Khanna (2008) for monosyllabic words, with two exceptions (see Appendix). First, “single syllable” was changed to “two syllable.” In addition, participants were instructed that whenever they noticed that a word had more than one meaning, they should provide the estimate for the meaning that was acquired first. This modification was based on work by Khanna and Cortese (2011) that examined AoA for ambiguous words.
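The response checks described above can be sketched in a few lines. This is only an illustrative sketch, not the software used in the study: the function name, the constants, and the order in which the two checks are applied are assumptions, and the message strings are ASCII approximations of the ones quoted in the text.

```python
# Illustrative sketch of the per-trial response checks (not the study's software).
MIN_RT_MS = 500                                # responses faster than this are rejected
VALID_KEYS = {str(n) for n in range(1, 8)}     # keys for the 1-7 rating scale
TOO_FAST_MSG = "response too fast - slow down!"
INVALID_MSG = "response invalid - try again"


def check_response(key, rt_ms):
    """Return (accepted, message) for a single keypress.

    Assumption: the speed check is applied before the validity check;
    the source does not say which comes first.
    """
    if rt_ms < MIN_RT_MS:
        return False, TOO_FAST_MSG
    if key not in VALID_KEYS:
        return False, INVALID_MSG
    return True, ""
```

In either rejection case the word would be shown again after the 2000 ms message, so a trial only ends once check_response returns an accepted rating.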
Analyses, results and discussion
Data were collected from 32 participants. Participants were not individually monitored during the collection of data and so a method was needed for ensuring that each participant took the task seriously (i.e., did not make ratings by pressing buttons without consideration). We screened the data using the following procedure. The overall mean for each item was calculated by averaging the ratings given to that word by each of the 32 participants. Next, the correlation between each participant’s ratings and the overall means of all 3,000 words was calculated. From this set of correlation coefficients, a mean correlation (.677) and standard deviation (.137) were established. One participant, whose correlation coefficient was more than 2 standard deviations below the mean, was eliminated. The rationale for eliminating this participant’s responses was that the participant did not take the task seriously, did not understand the instructions, or was not representative of the population we were interested in sampling. In fact, this participant’s correlation coefficient, r = .05, was 4.58 standard deviations below the overall mean. We then collected responses from an additional participant. Thus, the estimates provided in the current study were derived from 32 people. Of those 32 participants who remained, the correlation coefficients ranged from .33 to .83.
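A minimal sketch of this screening step is given below, assuming the ratings are stored as one list per participant with the items in a common order. The function names and the use of the sample standard deviation (n - 1 denominator) are assumptions; the text does not say which form of the standard deviation was used.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def screen_participants(ratings, n_sd=2.0):
    """Flag participants whose ratings track the item means poorly.

    ratings[p][i] is participant p's rating of item i (hypothetical layout).
    Returns the per-participant correlations and the indices of participants
    whose correlation is more than n_sd standard deviations below the mean.
    """
    n_items = len(ratings[0])
    # Item means are taken over all participants, as described in the text.
    item_means = [sum(p[i] for p in ratings) / len(ratings) for i in range(n_items)]
    rs = [pearson(p, item_means) for p in ratings]
    mean_r = sum(rs) / len(rs)
    # Assumption: sample standard deviation of the correlation coefficients.
    sd_r = sqrt(sum((r - mean_r) ** 2 for r in rs) / (len(rs) - 1))
    flagged = [i for i, r in enumerate(rs) if r < mean_r - n_sd * sd_r]
    return rs, flagged
```

With the values reported above, the excluded participant's distance from the mean works out as (.677 - .05) / .137 ≈ 4.58 standard deviations, well beyond the cutoff of 2.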
To establish content validity, we calculated correlations for the items common to our data set and three others: Stadthagen-Gonzalez and Davis (2006; r = .84, N = 535), Morrison et al. (1997; r = .72, N = 68), and Bird et al. (2001; r = .78, N = 436; see Fig. 1). Interrater reliability was assessed via Cronbach’s alpha (α = .962). The Spearman-Brown coefficient was .965.
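For readers who want to compute comparable reliability indices on their own rating data, the sketch below shows the standard formulas for Cronbach's alpha (with raters treated as the "items" of the scale) and the Spearman-Brown step-up of a split-half correlation. The data layout and function names are assumptions, and the text does not state how the raters were split for the Spearman-Brown coefficient.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha with raters as the scale 'items'.

    ratings[r][w] is rater r's rating of word w (hypothetical layout).
    alpha = k / (k - 1) * (1 - sum of rater variances / variance of word totals)
    """
    k = len(ratings)
    n_words = len(ratings[0])
    rater_vars = [variance(r) for r in ratings]
    totals = [sum(r[w] for r in ratings) for w in range(n_words)]
    return k / (k - 1) * (1 - sum(rater_vars) / variance(totals))


def spearman_brown(r_half):
    """Step a split-half correlation up to full-length reliability."""
    return 2 * r_half / (1 + r_half)
```

A split-half correlation would typically be obtained by averaging the ratings within two halves of the rater pool (for example, odd- and even-numbered raters) and correlating the two sets of word means before applying spearman_brown.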
Combining the AoA estimates obtained in the present study with the 3,000 estimates reported by Cortese and Khanna (2008) provides researchers with AoA estimates for 6,000 words obtained via very similar procedures. We were interested in examining the relationship between the AoA value of a word and the number of semantic associates that generate that word in word association, as well as the relationship between the AoA value of a word and the number of semantic associates that the word itself produces. We assessed these relationships using our AoA estimates and the word association norms of Nelson, McEvoy, and Schreiber (2004). We found that 3,055 (1,716 monosyllabic and 1,339 disyllabic) of the 6,000 words that our participants and Cortese and Khanna’s (2008) participants had rated for AoA appeared both as cues and as associates produced by at least one other word in the Nelson et al. (2004) norms. We found that relative to later acquired words, earlier acquired words have a relatively large number of semantic associates that produce them in word association (r = -.474, p < .001). This relationship is consistent with the idea that later acquired words are learned via association to earlier acquired words. Furthermore, the relationship was log linear; the correlation was stronger when the number of semantic associates variable was log transformed (r = -.619, p < .001; see Fig. 2). In contrast, the AoA value of a word is not related to the number of associates it produces in word association (r = .026, p = .157). These results are remarkably similar to the relationships reported by De Deyne and Storms (2008) in Dutch. Based on analyses involving 1,117 Dutch words, De Deyne and Storms reported correlations of r = -.61 (for a word’s AoA and the frequency of being produced as an associate of other words), and r = .03 (for a word’s AoA and the number of associates it produces).
We note that these values are also based on log transformed values for the number of semantic associate variables.
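The comparison between the raw and log-transformed correlations can be sketched as follows. The function names are assumptions and any data passed in would be the reader's own; the sketch relies on every word having at least one cue that produces it, which matches the selection criterion described above, so the logarithm is always defined. The base of the logarithm does not affect a Pearson correlation.

```python
from math import log, sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def aoa_vs_associates(aoa, n_cues):
    """Correlate AoA with the raw and log-transformed number of cues.

    aoa[i] is the mean AoA rating of word i; n_cues[i] is the number of
    cue words that produce word i as an associate (must be >= 1).
    Returns (r_raw, r_log).
    """
    r_raw = pearson(aoa, n_cues)
    r_log = pearson(aoa, [log(n) for n in n_cues])
    return r_raw, r_log
```

If the relationship is log linear, as reported here, r_log should be larger in magnitude than r_raw.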
Speed of response was not emphasized in our instructions. However, reaction time data were collected. The average reaction time across all items was 2007.11 ms. Average reaction times for individual items ranged from 1140.25 ms to 6727.27 ms. Reaction time was negatively correlated with rating (r = -.141), such that words acquired earlier in life took longer to rate.
Perhaps the greatest value of these AoA ratings is that they can be used in conjunction with the imageability ratings that we have obtained previously to assess independent contributions of AoA and imageability to reading aloud and lexical decision performance for monosyllabic and disyllabic words. Previously, in their analyses of monosyllabic words, Cortese and Khanna (2007) reported that AoA accounted for unique variance in reading aloud and lexical decision reaction times, whereas imageability’s effect was limited to lexical decision (although it did account for unique variance in the accuracy of reading aloud). More recently, Schock, Cortese, and Yap (2011) looked at these relationships in 1,937 disyllabic words. More specifically, using the current set of AoA estimates and imageability estimates from Schock et al. (in press), Schock et al. (2011) entered AoA in Step 8 of a hierarchical regression analysis, after initial phoneme characteristics and numerous sublexical and lexical variables were controlled. In fact, Schock et al. examined imageability and AoA while controlling for all of the variables examined in the recent Yap and Balota (2009) studies. Schock et al. found that AoA and imageability each accounted for unique variance in reading aloud and lexical decision reaction times. Interestingly, the effects of AoA and imageability were more similar across reading aloud and lexical decision tasks than they were across these same tasks for monosyllabic words. In contrast, Cortese and Khanna (2007) found that AoA and imageability effects were much stronger in lexical decision than reading aloud. We also note that while AoA accounted for unique variance in reading aloud and lexical decision reaction times for monosyllabic words (see, e.g., Brysbaert & Cortese, 2011; Cortese & Khanna, 2007), imageability did not account for unique variance in reading aloud reaction times for monosyllabic words when AoA was controlled (Cortese & Khanna, 2007).
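The logic of testing for unique variance in a hierarchical regression can be illustrated with a reduced, two-step sketch: fit the control predictors alone, then add the predictor of interest and take the change in R². This is not the eight-step analysis of Schock et al. (2011); the function names and the plain least-squares solver are assumptions chosen to keep the example dependency-free.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def r_squared(predictors, y):
    """R^2 of an ordinary least-squares fit with an intercept.

    predictors is a list of columns, one list of values per predictor.
    """
    n = len(y)
    cols = [[1.0] * n] + [[float(v) for v in c] for c in predictors]
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(p * q for p, q in zip(ci, y)) for ci in cols]
    beta = solve(xtx, xty)
    fitted = [sum(b * c[i] for b, c in zip(beta, cols)) for i in range(n)]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def delta_r2(control_cols, new_col, y):
    """Variance in y explained by new_col beyond the control predictors."""
    return r_squared(control_cols + [new_col], y) - r_squared(control_cols, y)
```

For example, delta_r2([frequency, length], aoa, rt) would give the share of reaction time variance attributable to AoA after frequency and length are entered, where those four lists are the reader's own item-level data. For large predictor sets a dedicated statistics package is preferable to the normal equations used here.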
We maintain the idea that AoA has primarily a semantic basis, which is consistent with the aforementioned semantic locus hypothesis (i.e., AoA affects the structure of semantic associations) and the network plasticity hypothesis (i.e., AoA effects emerge more strongly when associations between inputs and outputs are more arbitrary, as they are between orthography/phonology and semantics). In addition, we hypothesize that the difference in results for imageability between monosyllabic and disyllabic reading aloud reaction times reflects the idea that disyllabic words take longer to process and allow more time for semantic information to influence processing. Specifically, Yap and Balota (2009) reported that reaction time increases with number of syllables (r = .44, p < .001).
In addition, these norms, combined with our previously published norms, will be useful for researchers interested in further examining effects of AoA in monosyllabic and disyllabic words. Given that Schock et al. (2011) found that AoA and imageability account for unique variance in reading aloud and lexical decision performance, it will be important to control for the influences of these variables. Also, given that AoA values now exist for a very large set of words, it will be easier to distinguish effects of AoA from other factors as well (e.g., word frequency).
References
Balota, D. A., Cortese, M. J., Sergent-Marshall, S. D., Spieler, D. H., & Yap, M. J. (2004). Visual word recognition of single-syllable words. Journal of Experimental Psychology: General, 133, 283–316. doi:10.1037/0096-3445.133.2.283
Balota, D. A., Yap, M. J., Cortese, M. J., Hutchison, K. A., Kessler, B., Loftis, B., . . . Treiman, R. (2007). The English lexicon project. Behavior Research Methods, 39, 445–459.
Balota, D. A., Yap, M. J., Hutchison, K. A., & Cortese, M. J. (in press). Megastudies: Large scale analyses of lexical processes. To appear in J.S. Adelman (Ed.) Visual word recognition Vol. 1: Models and Methods, Orthography and Phonology. Hove, England: Psychology Press.
Bennett, S. D. R., Burnett, A. N., Siakaluk, P. D., & Pexman, P. M. (2011). Imageability and body-object interaction ratings for 599 multisyllabic nouns. Behavior Research Methods, 43, 1100–1109. doi:10.3758/s13428-011-0117-5
Bird, H., Franklin, S., & Howard, D. (2001). Age of acquisition and imageability ratings for a large set of words, including verbs and function words. Behavior Research Methods, Instruments, & Computers, 33, 73–79.
Bradley, M. M., & Lang, P. J. (1999). Affective norms for English words (ANEW): Instruction manual and affective ratings. Technical Report C-1, The Center for Research in Psychophysiology, University of Florida.
Brysbaert, M., & Cortese, M. J. (2011). Do the effects of subjective frequency and age of acquisition survive better word frequency norms? The Quarterly Journal of Experimental Psychology. doi:10.1080/17470218.2010.503374
Brysbaert, M., & New, B. (2009). Moving beyond Kucera and Francis: A Critical Evaluation of Current Word Frequency Norms and the Introduction of a New and Improved Word Frequency Measure for American English. Behavior Research Methods, 41, 977–990. doi:10.3758/BRM.41.4.977
Brysbaert, M., New, B., & Keuleers, E. (in press). Adding part-of-speech information to the SUBTLEX-US word frequencies. Behavior Research Methods.
Carroll, J. B., & White, M. N. (1973). Word frequency and age-of-acquisition as determiners of picture-naming latency. Quarterly Journal of Experimental Psychology, 25, 85–95. doi:10.1080/14640747308400325
Chateau, D., & Jared, D. (2003). Spelling-sound consistency effects in disyllabic word naming. Journal of Memory and Language, 48, 255–280. doi:10.1016/S0749-596X(02)00521-1
Cortese, M. J., & Khanna, M. M. (2007). Age of acquisition predicts naming and lexical-decision performance above and beyond 22 other predictor variables: An analysis of 2,342 words. The Quarterly Journal of Experimental Psychology, 60, 1072–1082. doi:10.1080/17470210701315467
Cortese, M. J., & Khanna, M. M. (2008). Age of acquisition ratings for 3,000 monosyllabic words. Behavior Research Methods, 40, 791–794. doi:10.3758/BRM.40.3.791
Cortese, M. J., Khanna, M. M., & Hacker, S. (2010). Recognition memory for 2,578 monosyllabic words. Memory, 18, 595–609. doi:10.1080/09658211.2010.493892
Cutler, A. (1981). Making up materials is a confounded nuisance: or Will we be able to run any psycholinguistic experiments at all in 1990? Cognition, 10, 65–70.
De Deyne, S., & Storms, G. (2008). Word associations: Network and semantic properties. Behavior Research Methods, 40, 213–231. doi:10.3758/BRM.40.1.213
Durda, K., & Buchanan, L. (2006). WordMine2 [Online] Available: www.wordmine2.org
Ellis, A. W., & Lambon Ralph, M. A. (2000). Age of acquisition effects in adult lexical processing reflect loss of plasticity in maturing systems: Insights from connectionist networks. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26, 1103–1123. doi:10.1037//0278-7393.26.5.1103
Ellis, A. W., & Monaghan, J. (2002). Reply to Strain, Patterson, and Seidenberg (2002). Journal of Experimental Psychology: Learning, Memory, and Cognition, 28, 215–220. doi:10.1037//0278-7393.28.1.215
Forster, K. I. (2000). The potential for experimenter bias effects in word recognition experiments. Memory & Cognition, 28, 1109–1115.
Humphreys, L. G. (1978). Research on individual differences requires correlational analysis, not ANOVA. Intelligence, 2, 1–5.
Juhasz, B. J. (2005). Age-of-acquisition effects in word and picture identification. Psychological Bulletin, 131, 684–712. doi:10.1037/0033-2909.131.5.684
Khanna, M. M., & Cortese, M. J. (2011). Age of acquisition ratings for 1,208 ambiguous and polysemous words. Behavior Research Methods, 43, 89–96. doi:10.3758/s13428-010-0027-y
Miller, G. A. (2009). “WordNet – About Us,” WordNet. Retrieved from http://wordnet.princeton.edu
Monaghan, J., & Ellis, A. (2002). What exactly interacts with spelling-sound consistency in word naming? Journal of Experimental Psychology: Learning, Memory, and Cognition, 28, 183–206. doi:10.1037/0278-7393.28.1.183
Morrison, C. M., Chappell, T. D., & Ellis, A. W. (1997). Age of acquisition norms for a large set of object names and their relation to adult estimates and other variables. The Quarterly Journal of Experimental Psychology, 50, 528–559. doi:10.1080/027249897392017
Nelson, D. L., McEvoy, C. L., & Schreiber, T. A. (2004). The University of South Florida free association, rhyme, and word fragment norms. Behavior Research Methods, Instruments, & Computers, 36, 402–407. doi:10.3758/BF03195588
New, B., Ferrand, L., Pallier, C., & Brysbaert, M. (2006). Reexamining the word length effect in visual word recognition: New evidence from the English Lexicon Project. Psychonomic Bulletin & Review, 13, 45-52. Retrieved from PsycINFO database.
Schock, J., Cortese, M. J., & Khanna, M. M. (in press). Imageability ratings for 3,000 disyllabic words. Behavior Research Methods.
Schock, J., Cortese, M. J., & Yap, M. J. (2011). Imageability and age of acquisition effects in disyllabic word recognition. In Proceedings of the 51st Annual Meeting of the Psychonomic Society. Seattle, WA: Psychonomic Society.
Stadthagen-Gonzalez, H., & Davis, C. J. (2006). The Bristol norms for age of acquisition, imageability and familiarity. Behavior Research Methods, 38, 598–605.
Steyvers, M., & Tenenbaum, J. B. (2005). The large-scale structure of semantic networks: Statistical analyses and a model of semantic growth. Cognitive Science, 29, 41–78. doi:10.1207/s15516709cog2901_3
Strain, E., Patterson, K., & Seidenberg, M. S. (1995). Semantic effects in single-word naming. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 1140–1154. doi:10.1037/0278-7393.21.5.1140
Yap, M. J., & Balota, D. A. (2009). Visual word recognition of polysyllabic words. Journal of Memory and Language, 60, 502–529. doi:10.1016/j.jml.2009.02.001
Zeno, S. M., Ivens, S. H., Millard, R. T., & Duvvuri, R. (1995). The educator’s word frequency guide. Brewster, NY: Touchstone Applied Science.
Zevin, J. D., & Seidenberg, M. S. (2004). Age of acquisition effects in reading aloud: Tests of cumulative frequency and frequency trajectory. Memory & Cognition, 32, 31–38. doi:10.3758/BF03195818
Electronic supplementary material
Below is the link to the electronic supplementary material.
ESM 1
(XLSX 241 kb)
Appendix
We acquire words throughout our lives. Some words are acquired at a very early age, some are acquired later, and others fall in between. The purpose of this study is to determine the approximate age for which 3,000 two syllable words have been acquired.
On each trial of the study, you will be presented with a word. Try to estimate the age at which you acquired that word according to the scale shown at the bottom of the screen. Note that words acquired at an early age should be given a low rating, and words acquired at an older age should be given a higher rating. For any word that you do not know, you should assign a value of 7. The scale will appear below each word for easy reference.
When making your ratings, try to be as accurate as possible, but do not spend too much time on any one word. You may notice that some have more than one meaning. When this happens, you should provide the estimate for the first meaning learned. For example, DUCK can refer to the bird or to lower one’s head. In this case, you may have acquired the bird meaning first, and so you would provide the estimate for that meaning. If you have any questions, ask the experimenter now.
Otherwise PRESS <ENTER> TO BEGIN THE STUDY.
Cite this article
Schock, J., Cortese, M.J., Khanna, M.M. et al. Age of acquisition estimates for 3,000 disyllabic words. Behav Res 44, 971–977 (2012). https://doi.org/10.3758/s13428-012-0209-x
DOI: https://doi.org/10.3758/s13428-012-0209-x