The research of the word is predominantly the research of the noun (Clark & Paivio, 2004; Medin, Lynch, & Solomon, 2000). Adjectives in particular have been overlooked, perhaps because not all languages have this word class and because they are often defined negatively, as a set of lexical items that are distinct from the universal noun and verb classes on morphological and syntactic grounds (Dirven & Taylor, 1988; Dixon, 1982; Vogel, 2004). In many languages adjectives are, however, undeniably important parts of speech, both because of their number and because of their semantic role. The CELEX count for Dutch words, for instance, contains 95,657 nouns, 13,912 adjectives, and 11,837 verbs (Baayen, Piepenbrock, & van Rijn, 1993). In the “Small World of Words” Dutch word association norms (SWOW-NL, De Deyne & Storms, 2008a) 18% of the produced associations are adjectives, making them the second most frequently produced word class (after nouns with 72%). Semantically, nouns refer to concepts, whereas adjectives refer to properties (Gärdenfors, 2000). In communication adjectives thus allow one to distinguish or identify instances that are referred to by the same noun (as in: hand me the tall glass; Dixon, 1982). Adjectives also constitute the majority of the hubs in semantic networks, connecting distinct parts of the mental lexicon (e.g., the adjective white connects the semantically remote words North Pole and sink; De Deyne & Storms, 2008a). As such, they contribute considerably to the richness and flexibility of our language.

The predominance of nouns is also reflected in available norm studies (Bird, Franklin, & Howard, 2001). Whenever adjectives are included in norm studies, either they are few in number (Altarriba, Bauer, & Benvenuto, 1999; Berrian, Metzler, Kroll, & Clark-Meyers, 1979; Grühn & Smith, 2008; Võ et al., 2009) or data are obtained for only a small number of variables (Anderson, 1968; Bird et al., 2001; Brysbaert, Stevens, De Deyne, Voorspoels, & Storms, 2014a; Kuperman, Stadthagen-Gonzalez, & Brysbaert, 2012; Lynott & Connell, 2009; van Loon-Vervoorn, 1985). However, to avoid sampling bias, to allow for generalization, and to ensure that the critical lexical variables can be controlled for, norming data for a variety of words and variables are required (Clark & Paivio, 2004; Kousta, Vinson, & Vigliocco, 2009). In line with other recent studies (e.g., Quadflieg, Michel, Bukowski, & Samson, 2014), in the present study we aimed to address the paucity of normative data for adjectives by having 1,300 students provide ratings on seven variables for 1,000 Dutch adjectives.

We purposefully included both distributional and experiential variables. We follow Andrews, Vigliocco, and Vinson (2009) in defining distributional variables as specifying how words are statistically distributed across different spoken or written texts, and experiential variables as capturing perceived attributes or properties associated with the referents of the words. Distributional variables thus pertain to linguistic information (e.g., frequency, orthographic neighborhood size, summated bigram frequency), whereas experiential variables result from experience with the physical world (Andrews et al., 2009; Barsalou, Santos, Simmons, & Wilson, 2008; Santos, Chaigneau, Simmons, & Barsalou, 2011). Among the experiential variables, one sometimes distinguishes between lexicosemantic ones (e.g., age of acquisition, familiarity, concreteness, imageability) and affective ones (e.g., valence, arousal, dominance), but the distinction can be hard to make (Andrews et al., 2009; Citron, Weekes, & Ferstl, 2014).

There is a growing consensus that the integration of distributional and experiential information is an asset for research domains at both sides of the spectrum (Kousta, Vigliocco, Vinson, Andrews, & Del Campo, 2011; Langacker, 1987; Talmy, 2000), and empirical evidence supporting this point is accumulating (e.g., Dolan, 2002; Kousta et al., 2009; Kuperman, Estes, Brysbaert, & Warriner, 2014). In light of the roles adjectives play in our language (see above), we feel it is especially important to include experiential variables in addition to linguistic ones, to adequately describe the stimulus domain (Clark & Paivio, 2004; De Deyne, Voorspoels, Verheyen, Navarro, & Storms, 2014). Lexicosemantic variables like concreteness and imageability, and affective variables like valence and arousal are important to include since the adjectival domain covers words representing the very abstract (ideal, infinite) to the very concrete (blue, green) and includes both emotion words (happy, sad) and neutral words (level, similar).

In the following sections, we describe the details of the data collection and demonstrate how the norms might be applied through an investigation of the structure of a word association network. In line with the arguments provided above, this demonstration will show that the consideration of both experiential and distributional information proves fruitful for understanding the organization of the mental lexicon.

Collection of the norms

Participants

Our participants were university students because this is the population typically tested in the studies for which the ratings are intended. A total of 1,300 Dutch-speaking students were recruited at KU Leuven, Belgium. They participated in exchange for course credit. Each participant provided ratings for only one variable so that the ratings for one variable could not influence or contaminate the ratings for another variable.

Materials

The 488 Dutch adjectives from van Loon-Vervoorn (1985) were included. The 328 English adjectives from Berrian et al. (1979) were translated into Dutch and were included as well. Other research purposes (reported in De Deyne, Voorspoels, Verheyen, Navarro, & Storms, 2014) prompted us to add additional adjectives for a total of 1,000.

Attention was drawn in the instructions to the fact that all the stimuli were to be regarded as adjectives, because the classification of some words may be arguable without context (e.g., abstract, bitter, fine, firm, flat, gold, human, ideal, light, minute, musical, patient, piercing, safe, stiff, and sweet).

Procedure

Experiential variables

In collecting the ratings, we employed a modular approach, in which the full sample of words was divided into five separate blocks, except for age of acquisition (AoA), for which we employed 35 blocks. This decision was made because for AoA participants needed to indicate the age at which they first acquired a word, which was deemed more effortful than providing a judgment on a scale, as was required for the other variables (see below). For each variable, two (AoA, concreteness, imageability) or four (arousal, dominance, familiarity, valence) permutations of the adjectives in each block were randomly distributed across 20 participants. Participants completed the ratings on paper (AoA, familiarity, valence) or on a computer (concreteness, imageability, arousal, dominance).
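
The block design accounts for the reported sample size. As a quick check, the sketch below assumes 20 raters per block for every variable (as stated above) and that each participant rated a single block of a single variable. The block sizes are not reported, so `words_per_block` only gives an average under the assumption of an even split; all helper names are illustrative.

```python
# Counts below are taken from the text; the helper names are illustrative.
RATERS_PER_BLOCK = 20
N_ADJECTIVES = 1000

BLOCKS_PER_VARIABLE = {
    "AoA": 35,
    "familiarity": 5,
    "concreteness": 5,
    "imageability": 5,
    "valence": 5,
    "arousal": 5,
    "dominance": 5,
}


def total_participants(blocks, raters_per_block=RATERS_PER_BLOCK):
    """Assumes one block of one variable per participant, so counts add up."""
    return sum(n_blocks * raters_per_block for n_blocks in blocks.values())


def words_per_block(n_blocks, n_words=N_ADJECTIVES):
    """Average block size, assuming the words were spread evenly over blocks."""
    return n_words / n_blocks


if __name__ == "__main__":
    print(total_participants(BLOCKS_PER_VARIABLE))  # 1300
    print(words_per_block(5))                       # 200.0
```

Under these assumptions the 35 AoA blocks would contain 28 or 29 adjectives each, against 200 adjectives per block for the scale-rated variables.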

All ratings were performed on 7-point scales, except for AoA, for which participants were asked to enter the age (in years) at which they thought they had learned the word. Participants were also given the option to indicate that they did not know a word. Where possible, the instructions were taken from other Dutch norming studies.

For AoA, we used the same instructions Moors et al. (2013) did. For each word, the participants were asked to enter the age (in years) at which they thought they had learned the word.

The instructions for familiarity were taken from De Deyne et al. (2008). Participants rated how familiar they were with each word in the list by indicating how often they had encountered or used the word. Ratings were indicated on a 7-point scale with the ends labeled never (1) and very often (7).

The instructions for concreteness were taken from Van der Goten, De Vooght, and Kemps (1999). Participants rated each word’s level of abstraction on a 7-point scale ranging from very abstract (1) to very concrete (7).

The instructions for imageability were taken from Paivio, Yuille, and Madigan (1968). Participants rated each word’s capacity to arouse a mental image by indicating how easily they could form a mental image of the word on a 7-point scale with the ends labeled difficult (1) and easy (7).

The instructions for valence were taken from Van der Goten, De Vooght, and Kemps (1999) and invited participants to rate the valence of each word by indicating on a 7-point scale the extent to which it evoked a very bad (1) or a very good (7) feeling.

Following Moors et al. (2013), participants were asked to indicate the degree to which a word refers to something very passive/calm (1) or very active/arousing (7), and to something very weak/submissive (1) or very strong/dominant (7) for arousal and dominance, respectively.

Distributional variables

The experiential norms were supplemented with distributional norms. For each adjective, we included the number of characters (nchar) and the number of syllables (nsyl), as well as the orthographic neighborhood size (neighb), the summated position-nonspecific bigram frequency (bigram), and the frequency (celex_freq) based on the Dutch version of the CELEX database (Baayen et al., 1993). The variable neighb indicates the number of CELEX words that are obtained by changing one letter of the adjective, whereas bigram indicates the frequency of the adjacent letter pairs of the adjective across all the words included in CELEX. The variable subtlex_freq lists the lemma frequency of the adjectives in the SUBTLEX-NL database (Keuleers, Brysbaert, & New, 2010a). It indicates the frequency with which any form of the 1,000 stimulus words occurs in a corpus of Dutch subtitles as an adjective (based on part-of-speech tagging).

The variables in-strength and betweenness were taken from the Small World of Words project (https://smallworldofwords.org/), a large-scale study that aims to build a map of the human lexicon in the major languages of the world (including Dutch) using word association data (De Deyne, Navarro, Perfors, Brysbaert, & Storms, 2019). To this end, participants were shown 15 cue words and invited to provide the three words that first come to mind in response to each cue (see De Deyne & Storms, 2008b, for details). These responses were used to construct a word association network from which the above measures could be derived (see De Deyne, Navarro, & Storms, 2013, for details). The in-strength $s_i$ of a word i reflects the number of links that go into the node representing the word in the word association network and as such provides an indication of the word’s importance in the mental lexicon. It is operationalized as the weighted sum of the incoming links:

$$ {s}_i=\sum \limits_{j\in N}{w}_{ij}, $$

with $w_{ij}$ being the weight of the link connecting nodes i and j, and N the set of nodes in the network. Betweenness is another indicator of the centrality of a node in the network. It takes the global structure of the word association network into account more than in-strength does, in that it captures how often a node is located on the shortest path between other nodes in the network. Betweenness was calculated with the igraph R package (Csardi & Nepusz, 2006). The betweenness $b_i$ of a node i in the network is operationalized as

$$ {b}_i=\frac{2}{n^2-3n+2}\sum \limits_{\substack{h,j\in N\\ h\ne j,\ h\ne i,\ j\ne i}}\frac{\rho_{hj}(i)}{\rho_{hj}}, $$

with $\rho_{hj}(i)$ corresponding to the weighted number of shortest paths from h to j passing through i, $\rho_{hj}$ the total number of shortest paths from h to j, and n the number of nodes in the network. The summed proportions are divided by the number of all possible pairs h, j. This first term of the equation normalizes the score so that networks of different sizes can be compared.
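
To make the two centrality measures concrete, the following is a minimal sketch on a toy network. It is not the code used for the norms (those values were obtained with the R package mentioned above). The words, weights, and function names are hypothetical. For brevity, betweenness is computed on unweighted, undirected links and summed over unordered pairs, which is one reading of the formula that is consistent with its normalization term.

```python
from collections import deque
from itertools import combinations


def in_strength(edges, node):
    """Weighted sum of incoming links; edges maps (source, target) -> weight."""
    return sum(w for (_, target), w in edges.items() if target == node)


def undirected_adjacency(edges):
    """Ignore direction and weight: a simplification made for this sketch."""
    adj = {}
    for source, target in edges:
        adj.setdefault(source, set()).add(target)
        adj.setdefault(target, set()).add(source)
    return adj


def bfs_counts(adj, source):
    """Hop distance and number of shortest paths from source to each node."""
    dist, sigma = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]
    return dist, sigma


def betweenness(adj, node):
    """Normalized share of shortest paths between other nodes that pass through node."""
    n = len(adj)  # requires n >= 3
    info = {v: bfs_counts(adj, v) for v in adj}
    dist_i, sigma_i = info[node]
    total = 0.0
    for h, j in combinations([v for v in adj if v != node], 2):
        dist_h, sigma_h = info[h]
        if j not in dist_h or node not in dist_h or j not in dist_i:
            continue  # unreachable pairs contribute nothing
        if dist_h[node] + dist_i[j] == dist_h[j]:
            total += sigma_h[node] * sigma_i[j] / sigma_h[j]
    return 2.0 * total / (n * n - 3 * n + 2)


# Hypothetical association strengths (cue -> response).
toy_edges = {
    ("north_pole", "white"): 0.4,
    ("sink", "white"): 0.2,
    ("snow", "white"): 0.5,
    ("white", "snow"): 0.3,
    ("north_pole", "snow"): 0.1,
}
toy_adj = undirected_adjacency(toy_edges)
```

In this toy network, `white` lies on the only shortest path between `sink` and each of the other two words, but not between `north_pole` and `snow`, which are directly linked. Its betweenness is therefore 2 of 3 possible pairs, whereas that of `sink` is zero.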

Results

The full set of norms may be downloaded from https://osf.io/nyg8v/. In addition to the distributional variables, the norms include the original responses, mean values, standard deviations, and % unknown for the rated attributes of each word. The file also contains English translations of the Dutch materials.

Reliability

The reliability of the ratings within each block was evaluated by applying the Spearman-Brown formula to the split-half correlations (Spearman, 1904). Table 1 shows the reliabilities of the variables familiarity, concreteness, imageability, valence, arousal, and dominance for each of the five blocks. The reliabilities for AoA, for which there were 35 blocks, ranged between .95 and .99. The reliability values indicate a substantial degree of agreement between the participants. Do note, however, that because of homonymy, the consistency may be less for some words included in the modules. The rating variability of each word, indicated by the standard deviation, is arguably a good indication of this ambiguity.
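
As an illustration of the reliability estimate, the sketch below applies the Spearman-Brown formula to a split-half correlation. How the halves were formed is not reported here, so the odd/even split of raters is an assumption, as are the function names and the toy ratings.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_brown(r_half):
    """Step a split-half correlation up to the reliability of the full rater group."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(ratings):
    """ratings: one list per rater, with the words in the same order for everyone."""
    half_a = ratings[0::2]  # assumed split: raters 1, 3, 5, ...
    half_b = ratings[1::2]  # assumed split: raters 2, 4, 6, ...
    means_a = [mean(word) for word in zip(*half_a)]
    means_b = [mean(word) for word in zip(*half_b)]
    return spearman_brown(pearson(means_a, means_b))


# Hypothetical 7-point ratings from four raters for five words.
toy_ratings = [
    [1, 2, 4, 6, 7],
    [2, 2, 5, 5, 7],
    [1, 3, 4, 6, 6],
    [1, 2, 4, 7, 7],
]
print(round(split_half_reliability(toy_ratings), 3))
```

A split-half correlation of .80, for example, corresponds to a stepped-up reliability of 1.6/1.8, or about .89.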

Table 1 Estimates of the reliability of the experiential variables for each block of adjectives

Validity

All the collected ratings can be validated by correlating the mean ratings obtained in our study with those obtained in other studies. In the section below, N will always refer to the number of overlapping stimuli. The values reported between square brackets pertain to the subset of the stimuli that only have an adjectival reading in Dutch. We made it clear in our instructions that all words should be considered adjectives, but since other studies included words from other word classes as well, participants in these studies might have responded toward the noun reading, which in turn might have affected the rating.

Our mean AoA ratings correlate .93 [.92] with those in Brysbaert, Stevens, De Deyne, Voorspoels, and Storms (2014a) for the N = 473 [416] stimuli included in both studies. When the ratings of four Dutch AoA norming studies are combined (Brysbaert et al., 2014a; Ghyselinck, Custers, & Brysbaert, 2003; Ghyselinck, De Moor, & Brysbaert, 2000; Moors et al., 2013) the overlap increases to 959 [707] and the correlation becomes .94 [.94]. Our mean valence, arousal, and dominance ratings correlate .96 [.97], .92 [.93], and .90 [.92], respectively, with those in Moors et al. (2013) for the 494 [298] stimuli included in both studies. Our mean valence ratings correlate .96 [.96] with those for the 154 adjectives that were also included in Hermans and De Houwer (1994). Our mean familiarity ratings correlate .82 [.90] with the familiarity ratings taken from the same article. These substantial correlations with external norms indicate that the ratings for AoA, familiarity, valence, arousal, and dominance are well in line with those of previous studies. The magnitude of the correlations approaches the theoretical maximum, considering that the reliability of the ratings (see Table 1) constitutes a ceiling for the correlations of the ratings with external variables (Spearman, 1904). Mismatches due to noun readings of particular adjectives in earlier studies appear to be limited. Restricting the correlation to stimuli that only have an adjectival reading does not impact the correlation heavily.
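
The ceiling referred to here follows from the classical correction for attenuation. As a sketch, writing $r_{xx}$ and $r_{yy}$ for the reliabilities of two sets of ratings and $r_{xy}$ for their observed correlation (symbols introduced here for illustration only), a correlation between error-free scores of at most 1 implies

$$ {r}_{xy}\le \sqrt{r_{xx}\,{r}_{yy}}. $$

With reliabilities of, say, .95 and .90 (illustrative values, not taken from Table 1), the observed correlation could not exceed about .92.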

The mean imageability ratings for the N = 488 [305] adjectives in van Loon-Vervoorn (1985) correlate .76 [.74] with our mean imageability ratings and .45 [.40] with our mean concreteness ratings. The magnitude of the latter correlation coefficient is very similar to the .46 [.41] correlation observed between imageability and concreteness in our own study for the subset of 488 [305] adjectives. The mean concreteness ratings for the 957 [707] adjectives in Brysbaert et al. (2014a) correlate .68 [.69] with our mean imageability ratings and .49 [.47] with our mean concreteness ratings. Although the latter correlation does not seem to be strongly affected by the potentially different interpretation of the word class in Brysbaert et al. (2014a), it is considerably lower than one would expect a priori. We believe that the composition of the stimulus set and differences in the wording of the rating scale might be responsible for this discrepancy. As compared to the participants in Brysbaert et al. (2014a), our participants appeared to use a restricted range of the concreteness scale, which might have reduced the correlation. The 5th and 95th percentile of the mean concreteness ratings for the overlapping adjectives on the 5-point rating scale used by Brysbaert et al. (2014a) are 1.40 and 4.13, whereas the corresponding values for our 7-point rating scale are 3.25 and 6.00. It thus appears that our participants rarely used the lower points on the concreteness scale, presumably because, unlike the participants in Brysbaert et al. (2014a), the stimuli they were asked to rate did not include nouns and verbs, many of which are arguably relatively concrete as compared to adjectives. This observation has ramifications for the way in which concreteness norms collected for specific word classes—be it nouns, adjectives, or verbs—are used. These data cannot be straightforwardly combined because the relative magnitudes might not reflect comparable levels of abstractness. 
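
The restricted use of the scale becomes easier to see when both sets of percentiles are expressed on a common footing. The sketch below linearly maps the 5-point values onto a 7-point range and computes the share of each scale covered by the 5th to 95th percentile interval. The linear mapping is an assumption made for illustration and is not part of the reported analyses; the function names are hypothetical.

```python
def rescale(value, old_min=1.0, old_max=5.0, new_min=1.0, new_max=7.0):
    """Linearly map a rating from one scale range onto another."""
    return new_min + (value - old_min) * (new_max - new_min) / (old_max - old_min)


def range_share(low, high, scale_min, scale_max):
    """Proportion of the scale width covered by the interval [low, high]."""
    return (high - low) / (scale_max - scale_min)


# Percentiles reported in the text.
external_p5, external_p95 = 1.40, 4.13   # 5-point scale
ours_p5, ours_p95 = 3.25, 6.00           # 7-point scale

print(round(rescale(external_p5), 2), round(rescale(external_p95), 2))
print(round(range_share(external_p5, external_p95, 1, 5), 2))
print(round(range_share(ours_p5, ours_p95, 1, 7), 2))
```

On this mapping the external percentiles correspond to roughly 1.6 and 5.7 on a 7-point scale and span about two thirds of the scale, whereas our percentiles of 3.25 and 6.00 span less than half of it and start much higher.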

We believe that the lack of agreement between our concreteness ratings and those of Brysbaert et al. (2014a) is also due to the use of different operationalizations of the concreteness scale, which might carry different meanings. In their instructions for the concreteness rating task, Brysbaert et al. (2014a) equated “abstract” with language-based and “concrete” with experiential. This might explain why the Brysbaert et al. concreteness ratings correlate better with our imageability ratings than they do with our concreteness ratings. Following Van der Goten, De Vooght, and Kemps (1999), we employed rather general concreteness instructions, in which the interpretation of “abstract” and “concrete” was left to the participants, whereas in the imageability instructions, a rather explicit reference is made to the (visual) senses. Our imageability instructions thus seem to agree better with Brysbaert et al.’s (2014a) concreteness instructions, which also refer to experience. The observation that scales intended to measure the same construct can yield different results due to what appear to be minor phrasing differences is a cause for concern and warrants further investigation.

Correlations

The collected ratings can be further validated by verifying whether the pattern of intercorrelations between the variables is similar to that of other norming studies. Table 2 shows the correlations between the seven experiential variables in our study.

Table 2 Correlations between the experiential variables collected for 1,000 Dutch adjectives

A natural comparison is one with the AoA, valence, arousal, and dominance norms of Moors et al. (2013) for 4,300 Dutch words, 494 of which are included in our study. We found that AoA shares hardly any variance with the affective variables valence, arousal, and dominance (see Table 2). This is also the case in the Moors et al. data, both for the overlapping stimuli (r = – .06, .05, and – .01, respectively) and the entire stimulus set (r = – .17, .03, and .08). The correlations among the affective variables in Table 2 are also mirrored in the Moors et al. data for the overlapping stimuli, in which the correlation between arousal and dominance is found to be the strongest (r = .71), and the correlation between valence and dominance (r = .56) is found to be somewhat stronger than that between valence and arousal (r = .27). This was to be expected given the strong correlation between our data and those of Moors et al. (all rs > .90; see above). The correlations among the affective variables are, however, more pronounced for the overlapping stimuli than for the stimulus set as a whole (r = – .01, .27, and .59, respectively), indicating that the observed structure of the affective variables might be specific to adjectives, and might not generalize across word classes.

Another natural comparison is with Brysbaert et al. (2014a), in which AoA and concreteness were found to correlate – .34 across 25,882 Dutch words from different word classes. When the stimulus set is restricted to the 473 adjectives that are also included in our stimulus set, the correlation becomes – .35. Both correlations are comparable to the – .31 correlation across our 1,000 adjectives (see Table 2).

Finally, we established a less pronounced relationship between familiarity and valence than Hermans and De Houwer (1994) did. Whereas we observed a correlation of .18 (see Table 2), they found that familiarity and valence correlated .25 across 370 adjectives, and .34 across the 154 adjectives included in both studies.

No Dutch norming studies are available that include AoA, familiarity, concreteness, and imageability. For comparison, we therefore turn to two English studies that incorporate all four lexicosemantic variables. Gilhooly and Logie (1980) contains ratings for 1,944 nouns. Bird, Franklin, and Howard (2001) reported ratings of AoA (N = 2,694), familiarity (N = 1,217), concreteness (N = 1,070), and imageability (N = 2,019) for nouns, verbs, adjectives, numerals, adverbs, and function words. For ease of comparison, the correlations between the four lexicosemantic variables are shown next to one another in Table 3.

Table 3 Correlations between lexicosemantic variables collected in the present and two previous studies

The intercorrelations between the lexicosemantic variables AoA, familiarity, concreteness, and imageability appear comparable across languages and word classes. The correlations we observed for the 1,000 Dutch adjectives tended to be in between the correlations reported in Gilhooly and Logie (1980) and Bird et al. (2001), for English stimuli that mostly comprised nouns. This was the case for the correlation between AoA and familiarity, between AoA and concreteness, and between familiarity and imageability. The – .47 correlation we found between AoA and imageability was somewhat smaller than the correlations reported in the literature, but very close to the – .50 correlation reported by Bird et al. The .20 correlation between familiarity and concreteness was somewhat higher than the correlations in the literature, but close to the .11 correlation from Gilhooly and Logie. The largest discrepancy was found for the correlation between concreteness and imageability. We found a moderate correlation between the two variables, whereas the variables were found to correlate strongly in the English norming studies. As we discussed in the previous section, we believe the wording of the concreteness scale and the composition of the stimulus set to be responsible for this discrepancy.

Application

The idea that the mental lexicon can be thought of as an organized network based on meaningful word associations dates back to at least Deese (1966). It is currently regaining popularity due to the ability to compile large, contemporary word association datasets through crowdsourcing (De Deyne et al., 2019) and the observation that these data are particularly well suited to capturing semantic relations across words of varying levels of abstraction (De Deyne, Verheyen, & Storms, 2015) and relatedness (De Deyne, Navarro, Perfors, & Storms, 2016a). Word association data display assortativity for valence, arousal, and dominance: cues of a particular affective quality tend to elicit responses with a similar affective quality (Pollio, 1964; Staats & Staats, 1959; Van Rensbergen, Storms, & De Deyne, 2015b). Accurate predictions of words’ standings on all three affective dimensions can also be obtained from word association data (Vankrunkelsven, Verheyen, Storms, & De Deyne, 2018; Van Rensbergen, De Deyne, & Storms, 2015a). Therefore, word association data have the potential to uncover the extent to which there are systematic relationships between the manner in which words are organized in the mental lexicon and the words’ affective dimensions, which have been claimed to be an integral part of the stored word meaning (Osgood, Suci, & Tannenbaum, 1957; Samsonovich & Ascoli, 2010).

Investigations of the extent to which distributional variables like word frequency (Steyvers & Tenenbaum, 2005) and contextual diversity (Hills, Maouene, Riordan, & Smith, 2010), and lexicosemantic variables like AoA (Steyvers & Tenenbaum, 2005) affect the interconnectivity of words in a word association network have already been undertaken. However, no study has systematically looked at the effects of the affective variables valence, arousal, and dominance on the organization of the mental lexicon. The present norms allow us to undertake this investigation for adjectives, which not only can be assumed to cover the entire range of the affective variables under investigation, but also constitute an interesting class of words to investigate because of the role they have been shown to play in establishing the small-world structure of the word association network. As we already mentioned in the introduction, the majority of the hubs in semantic networks (those words connecting remote parts of the mental lexicon) are adjectives (De Deyne & Storms, 2008a).

Method

We chose to evaluate the effect of the affective variables on the organization of the mental lexicon by regressing the variables in-strength and betweenness on the distributional and experiential variables included in the norms. Whereas in-strength represents a local measure of the organization of the word association network, taking just the number of links between a word and its directly connected neighbors into account, betweenness provides an indication of the broader context of the word in the network through the proportion of times it features along indirect paths that connect it with all other words (see also the Distributional Variables section). Both measures indicate the centrality of words in the network and have been shown to explain a host of findings in the word processing and memory literature (e.g., Hutchison, 2003; Nelson & McEvoy, 2000).

Following the investigation of the nonlinear effects that valence and arousal have on lexical decision by Kousta, Vinson, and Vigliocco (2009), we carried out an ordinary least squares regression analysis of in-strength and betweenness with the same predictors included in their study, with the addition of dominance and with the position-nonspecific mean bigram frequency instead of the mean positional bigram frequency. The model thus incorporated all linear and nonlinear effects of our experiential variables (AoA, familiarity, concreteness, imageability, valence, arousal, dominance) and the distributional variables orthographic neighborhood size, mean bigram frequency, number of characters, and SUBTLEX word frequency. We took the square root of neighborhood size and logarithmically transformed in-strength, betweenness, and word frequency before entering them in the regression.

All statistical analyses were carried out in R version 3.5.1 (R Development Core Team, 2007). We used restricted cubic splines (Harrell, 2001) to model nonlinear relationships between the predictors and the dependent variables (in all instances, using three knots at quantiles {.1, .5, .9} on a given variable). Alpha was set to .05 to establish significance.
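
To illustrate what a three-knot restricted cubic spline contributes to the regression, the sketch below builds its basis in the truncated-power form described by Harrell (2001). Each predictor then enters the model with one linear and one nonlinear term, which matches the 2-df overall tests and 1-df nonlinearity tests reported in the Results. This is not the code used for the analyses: the quantile definition, the scaling by the squared knot range, and all function names are assumptions made for illustration.

```python
def quantile(values, q):
    """Linear-interpolation quantile (an assumption; other definitions exist)."""
    xs = sorted(values)
    pos = q * (len(xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])


def cube_plus(u):
    """Truncated cubic: u**3 for positive u, zero otherwise."""
    return u ** 3 if u > 0 else 0.0


def rcs_basis(x, knots):
    """Return (linear term, nonlinear term) of a 3-knot restricted cubic spline."""
    t1, t2, t3 = knots
    nonlinear = (
        cube_plus(x - t1)
        - cube_plus(x - t2) * (t3 - t1) / (t3 - t2)
        + cube_plus(x - t3) * (t2 - t1) / (t3 - t2)
    ) / (t3 - t1) ** 2  # scaling keeps the term on the scale of x
    return x, nonlinear


def rcs_design(values, probs=(0.1, 0.5, 0.9)):
    """Place knots at the given quantiles and return them with the basis rows."""
    knots = tuple(quantile(values, p) for p in probs)
    return knots, [rcs_basis(v, knots) for v in values]


# Hypothetical predictor values, e.g., mean valence ratings.
toy_valence = [1.8, 2.4, 3.1, 3.9, 4.0, 4.2, 4.7, 5.3, 5.9, 6.4, 6.8]
knots, rows = rcs_design(toy_valence)
```

The nonlinear term is zero below the first knot and linear above the last one, so the fitted curve can bend only within the range where most of the data lie. A U-shaped effect such as the one reported for valence shows up as a reliable coefficient on this nonlinear term.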

Results

The adjusted R² was .70 for in-strength and .65 for betweenness, indicating that the predictors in our model captured a considerable amount of variability in both dependent variables. Adjectives tended to score higher on in-strength and betweenness the shorter, more frequent, more familiar, more imageable, and earlier acquired they were (see the complete overview of the regression results in the Appendix, Tables 4 and 5). Here we focus on the effects of valence, arousal, and dominance on in-strength and betweenness. The top row of Fig. 1 shows the results for in-strength. The bottom row contains the results for betweenness. From left to right, the panels show the partial effects of valence, arousal, and dominance, respectively, when all other predictors in the model are set to their median values.

Fig. 1 Plots of the partial effects of the affective variables valence (left), arousal (middle), and dominance (right) on ln(in-strength) (top) and ln(betweenness) (bottom) for median values of all the other predictors. The gray zones indicate 95% confidence intervals.

Valence and arousal were found to be significant predictors of both in-strength and betweenness. Valence displays a U-shaped relationship with in-strength (see the top left panel in Fig. 1), indicating that positive and negative words receive more incoming links than neutral words do [F(2, 930) = 8.73, p < .001; nonlinear: F(1, 930) = 13.18, p < .001]. The results for betweenness indicate that negative words occupy a more central position in the word association network than do neutral and positive words [F(2, 930) = 6.00, p < .01; nonlinear: F(1, 930) = 3.20, p = .07; bottom left panel in Fig. 1].

Words with an active/arousing character tended to receive fewer incoming links and to feature less frequently in the connections between other words: Arousal had linear effects on both in-strength [F(2, 889) = 5.10, p = .006; nonlinear: F(1, 889) = .00, p = .95] and betweenness [F(2, 889) = 4.61, p = .01; nonlinear: F(1, 889) = .00, p = .97].

Dominance significantly predicted in-strength [F(2, 930) = 4.16, p = .02; nonlinear: F(1, 930) = 2.75, p = .10], indicating that words tend to receive more incoming links the stronger/more dominant they are. Dominance was not a significant predictor of betweenness [F(2, 930) = 2.94, p = .05; nonlinear: F(1, 930) = 0.99, p = .32].

Discussion

The study of the mental lexicon has largely neglected affective variables, perhaps because previous studies were biased toward concrete neutral words. Recently, however, researchers have begun to acknowledge that affect accounts for considerable variability in the meaning of words (De Deyne, Navarro, Collell, & Perfors, 2018), particularly of abstract words (Lenci, Lebani, & Passaro, 2018; Wang et al., 2017) and adjectives (De Deyne et al., 2014). By regressing indices of the centrality of 1,000 adjectives in a word association network on a variety of distributional, lexicosemantic, and affective variables, we found that valence, arousal, and dominance are among the organizing principles of the mental lexicon. That is, the centrality of adjectives in the mental lexicon varies as a function of the words’ affective dimensions: Negative adjectives and adjectives low in arousal take a more central position in the word association network than neutral adjectives do, and as such connect remote parts of the word association network. Highly affective adjectives (be it positive or negative), adjectives low in arousal, and adjectives that express strength take up a prominent position in the network as well. They are provided as an associate more often than other adjectives. The early research on word association has already established that words elicit more variable responses the more positive they are (Johnson & Lim, 1964; Koen, 1962). Our findings go beyond these early results in that they pertain to a much larger set of words and take the structure of the entire word association network into account. We are currently investigating the extent to which our findings generalize across word classes and languages. This investigation also intends to explain any discrepancies in the findings for in-strength versus betweenness. One of the working hypotheses for the observation of a U-shaped relationship between valence and in-strength but not betweenness is the disproportional distribution of positive and negative words across the lexicon (Boucher & Osgood, 1969; Dodds et al., 2015).

We have already completed a comparable analysis involving 2,042 Dutch nouns, which shows significant linear and nonlinear effects of valence, arousal, and dominance on both in-strength and betweenness. The nature of the relationship between the centrality measures and the affective dimensions for these nouns appears similar to that for the adjectives. In-strength and betweenness display a U-shaped relationship with valence, a decreasing relationship with arousal, and an increasing relationship with dominance. This result thus confirms our general finding that there are systematic relationships between the manner in which words are organized in the mental lexicon and the words’ affective dimensions.

Word associations arguably involve more semantic elaboration than the lexical decision and word naming tasks for which the effect of emotional content on language processing has been investigated so far (Estes & Adelman, 2008a, 2008b; Kousta et al., 2009; Kuperman et al., 2014; Larsen, Mercer, Balota, & Strube, 2008; Rodríguez-Ferreiro & Davies, 2019; Vinson, Ponari, & Vigliocco, 2014; Yap & Seow, 2014). The involvement of valence, arousal, and also dominance (a variable not considered in previous studies on language processing) in the organization of the mental lexicon suggests that all three of these affective variables should be used as controls in studies both with and without semantic elaboration (contrary to current practice; but see Moffat, Siakaluk, Sidhu, & Pexman, 2015, for a notable exception).

General discussion

We presented lexicosemantic (age of acquisition, familiarity, concreteness, imageability), affective (valence, arousal, dominance), and distributional variables (number of characters, number of syllables, summated position-nonspecific bigram frequency, orthographic neighborhood size, and word frequency) for 1,000 Dutch adjectives. The ratings of the lexicosemantic and affective variables proved very reliable. Wherever possible, we compared our ratings to ratings obtained in other studies, which tended to include fewer adjectives. In the case of age of acquisition, familiarity, imageability, valence, arousal, and dominance, the resulting correlations were considerable, validating our measurements. In the case of concreteness, the correlation was less pronounced, suggesting that different constructs are being measured in the various norming studies and indicating that claims regarding the comparability of experiential norms should take into account the phrasing of the accompanying instructions and the composition of the stimulus set.

The pattern of intercorrelations between the included variables also mirrored that observed in previous norming studies, again validating our measurements, except for the correlations involving concreteness, which differed because of the use of different instructions in different studies. We found that the affective variables were more strongly correlated among adjectives than across words from different word classes. The observation that valence, arousal, and dominance might correlate differently depending on the word class adds to the ongoing methodological discussion about affective variable ratings. Moors et al. (2013) noted that different patterns of intercorrelations between affective variables have been reported. They attributed these differences to (i) differences in the make-ups of the stimulus sets, (ii) instructions to rate the stimuli versus the participants’ feelings in response to the stimuli, and (iii) the use of a between-subjects versus a within-subjects design. In the latter case, participants might be more inclined to emphasize differences between the different variables. No gold standard is currently available for the measurement of lexicosemantic and affective variables, and individual researchers seem to choose the phrasing that best suits their reading of the underlying construct. To establish a gold-standard rating scale, it would appear that one would have to turn to external variables for validation. Especially for the affective variables this seems possible, in that these experiential variables should supposedly correlate with physiological or behavioral measures. However, there is now a growing consensus that the latter measures constitute distinct aspects of emotional experience, suggesting that a gold standard is out of reach (Mauss & Robinson, 2009). For concreteness, the instruction issue might be resolved by turning to ratings of how stimuli are perceived through each of the perceptual modalities (Lynott & Connell, 2009, 2013). Whereas concreteness rating instructions have tended to overemphasize the visual modality or have left the modality unspecified (Brysbaert, Warriner, & Kuperman, 2014b; Connell & Lynott, 2012), having participants provide separate ratings for each modality (visual, haptic, auditory, olfactory, gustatory) might make the ensuing data more straightforward to interpret.

Using the available norms, we investigated how affective variables provide insight into the organization of the mental lexicon. We found that adjectives with a pronounced negative or positive valence receive more incoming links, giving rise to a U-shaped relationship between valence and in-strength. Adjectives low in arousal and adjectives high in dominance were found to have a higher number of incoming links. The adjectives that tend to take the most central positions in a word association network were found to be negatively valenced and low in arousal. Although word associations are often referred to as constituting a corpus, suggesting that the information they contain is language-based and distributional in nature, they have been shown to contain both semantic and lexical information (De Deyne, Verheyen, & Storms, 2015, 2016b; see also Collins & Loftus, 1975; Szalay & Deese, 1978). The present findings indicate that they also yield affective information about the words for which associations have been gathered (see also De Deyne et al., 2018).

A multimodal distributional view on word meaning, according to which meaning is both embodied in modal representations and informed by word usage in context, currently prevails (e.g., Barsalou, Santos, Simmons, & Wilson, 2008; Louwerse, 2011). The affective embodiment account (AEA; Vigliocco, Meteyard, Andrews, & Kousta, 2009) is an extension of this view, arguing that affect should be considered another element of meaning that is grounded in experience based on internal states, affective experiences, or implicit evaluative appraisals. Moreover, the AEA claims that affective grounding is not limited to emotion words, but extends to most words (Kousta et al., 2009). The involvement of valence, arousal, and dominance in the organization of the mental lexicon supports the AEA.

Our norms add a substantial number of adjectives to the growing set of Dutch experiential and distributional norms, and can easily be connected to the behavioral norm data that are amassing for Dutch, pertaining to lexical decision (Brysbaert, Stevens, Mandera, & Keuleers, 2016; Keuleers, Diependaele, & Brysbaert, 2010b), word prevalence (Keuleers et al., 2015), text reading (Cop, Dirix, Drieghe, & Duyck, 2017), and word fragment completion (Heyman, Van Akeren, Hutchison, & Storms, 2016). We believe the norms would benefit both research that studies adjectives proper (e.g., to establish a typology; Dixon, 1982; Raskin & Nirenburg, 1998) and research in which adjectives constitute the preferred stimulus material, such as research on vagueness (Hampton, 2011; Kennedy, 2007; Van Rooij, 2011; Verheyen & Egré, 2018), spatial cognition (Bianchi, Savardi, & Burro, 2011a; Bianchi, Savardi, & Kubovy, 2011b), affective word processing (Bernat, Bunce, & Shevrin, 2001; Herbert, Kissler, Junghofer, Peyk, & Rockstroh, 2006), and inference (Gotzner, Solt, & Benz, 2018; Ruytenbeek, Verheyen, & Spector, 2017). The norms can be used both as explanatory variables (Gilet & Jallais, 2011; Kuperman et al., 2014) and as control variables (Estes & Adelman, 2008a; Larsen, Mercer, & Balota, 2006). Our own application demonstrates that the study of adjectives can also constitute an avenue into phenomena that pertain to words more generally.

Open Practices Statement

All the data and materials are available at https://osf.io/nyg8v/. R scripts for conducting the regression analyses are available at https://osf.io/8zhpw/. None of the procedures were preregistered. The data, materials, and scripts used in this article are licensed under a Creative Commons Attribution 4.0 International License (CC-BY), which permits their use, sharing, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate whether changes were made. Any other third-party material in this article is included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

Author note

S.V. and S.D.D. conceptualized the study and analyzed the data. S.V., S.D.D., and S.L. gathered the data. S.V. drafted the manuscript. S.D.D., S.L., and G.S. provided critical revisions. All four authors discussed the findings thoroughly and read and approved the final version of the manuscript. S.V. was funded by KU Leuven Research Council grant C14/16032, awarded to G.S., and S.D.D. was funded by Australian Research Council grant DE140101749. We thank Kathleen Meylemans for her help with data collection. We thank Hendrik Vankrunkelsven, Wouter Voorspoels, and two anonymous reviewers for helpful suggestions.