Naming visually presented objects is one of the most frequent practices in psychological research on object perception, memory, and language production. Depending on the research purposes, the presented object and its name often need to be controlled or contrasted in the following respects: (a) visual complexity—that is, the amount of visual details in the presented object; (b) imageability—that is, the degree to which a word can evoke a mental image; (c) image agreement—that is, the degree to which the mental image evoked by an object name corresponds to the presented object; (d) familiarity—that is, the subjective frequency of one’s exposure to a word; (e) name agreement—that is, the degree to which speakers agree on the name of an object; (f) age of acquisition (AoA)—that is, the age at which a word was learned; and (g) the frequency of use of a target name.

Naming an object involves object perception, conceptual preparation, word meaning retrieval, word-form encoding, and articulation (Caramazza, 1997; Dell, 1986; Glaser, 1992; Humphreys, Riddoch, & Quinlan, 1988; Levelt, Roelofs, & Meyer, 1999). Each aspect of the presented object may affect different processing stages during object naming. For instance, visual complexity influences the object perception stage; image agreement probably influences the object recognition stage (Barry, Morrison, & Ellis, 1997; Bonin, Chalard, Méot, & Fayol, 2002); and word frequency may affect the word-form encoding stage (Alario et al., 2004; Jescheniak & Levelt, 1994). To better understand the complex cognitive processes involved in naming, it is essential to obtain norms so that these variables can be controlled in the testing materials. Although the precise origins of the effects of some variables are still under debate, and the variables do not contribute equally to naming processing, all of them have been shown to be related to the naming latencies of line drawings. For instance, in picture-naming tasks, objects are named faster when they are visually simpler (Alario & Ferrand, 1999) or when they have higher name agreement (Shao, Roelofs, Acheson, & Meyer, 2014a). Furthermore, objects are also named more quickly if their names can evoke images more easily (Balota et al., 2004) or if their names are learned earlier in life and used more frequently (Barry et al., 1997; Brysbaert & Cortese, 2011; Ellis & Morrison, 1998; Liu, Hao, Li, & Shu, 2011). However, these findings are largely based on studies using black-and-white line drawings. It is unclear whether the same results would be found with photos.

Many studies have collected norms for black-and-white drawings of objects in different languages (Alario & Ferrand, 1999, for French; Nisi, Longoni, & Snodgrass, 2000, for Italian; Sanfeliu & Fernandez, 1996, for Spanish; Severens, Van Lommel, Ratinckx, & Hartsuiker, 2005, for Dutch; Snodgrass & Vanderwart, 1980, for English; Wang, 1997, for Chinese). However, line drawings lack some of the essential attributes of objects, such as color. Therefore, some researchers have recently questioned the ecological validity of object naming using black-and-white line drawings (Moreno-Martínez & Montoro, 2012). Consequently, in some studies researchers have collected norms for colored line drawings in different languages (Bonin, Guillemard-Tsaparina, & Méot, 2013, for French; Nishimoto, Ueda, Miyawaki, Une, & Takahashi, 2012, for Japanese).

Using photos in a naming task is more natural than using colored line drawings. Unlike line drawings, which are prototypical schematic representations of objects, photos are more realistic representations. Photos provide multidimensional features of the objects, such as color, texture, and 3-D features (cf. Brodeur, Dionne-Dostie, Montreuil, & Lepage, 2010; Brodeur, Guérard, & Bouras, 2014). Therefore, the difference in surface detail richness between line drawings and photos may affect object recognition and naming differently. It has been found that colored objects can facilitate object recognition (Hagen, Vuong, Scott, Curran, & Tanaka, 2014; Price & Humphreys, 1989); that colored line drawings are named more quickly than black-and-white line drawings (Ostergaard & Davidoff, 1985); and that colored photos are named more quickly than black-and-white line drawings (Biederman & Ju, 1988; Brodie, Wallace, & Sharrat, 1991). Although color information thus seems to contribute to object recognition and naming, exactly what it contributes is still unclear. Some studies have shown that color improves object perception (Price & Humphreys, 1989) and identification (Wurm, Legge, Isenberg, & Luebker, 1993), whereas other studies have shown that color contributes to naming at the semantic level (Ostergaard & Davidoff, 1985).

Visual complexity can be measured in an objective way (i.e., indicated by compressed file size) or a subjective way (i.e., rated by participants). For subjective visual complexity, it has been argued that when participants estimate the visual complexity of an object, they can hardly avoid activating the object’s name (Szekely & Bates, 2000). Hence, subjective ratings of visual complexity are possibly influenced by top-down perceptual cues, such as image agreement. For this reason, we also collected objective measures of visual complexity.

However, only a few studies have provided norms of psycholinguistic variables for photos. In addition, these studies have examined only a limited number of variables or have used a relatively small number of photos (e.g., 107 photos were used to measure six psycholinguistic variables in Adlington, Laws, & Gale, 2009; 480 photos were used to measure four psycholinguistic variables in Brodeur et al., 2010; and 174 colored pictures were used to measure three psycholinguistic variables in Viggiano, Vannucci, & Righi, 2004). Moreover, most of the norms for photos are for English, and only two normative studies exist for other languages (Bonin, Boyer, Méot, Fayol, & Droit, 2004, for French; Moreno-Martínez & Montoro, 2012, for Spanish).

To the best of our knowledge, there is no Dutch normative study of photos of objects. The present study aims to fill this gap. First, we recorded naming latencies and calculated name agreement for 327 photos, and then we collected norms for the photos and their names. Eight variables were included: objective and subjective visual complexity, imageability, image agreement, subjective AoA, familiarity, word length (in syllables and in letters), and word frequency. We tested which of these variables are significant predictors of naming latencies.

Method

Participants

A total of 184 native speakers of Dutch (164 females, 20 males; mean age = 21 years, range = 18–32 years) participated in the rating and naming studies. Thirty-one different participants took part in each of the naming, imageability rating, image agreement rating, and visual complexity rating tasks, and 30 different participants took part in each of the AoA and word familiarity rating tasks. All participants had normal or corrected-to-normal vision and were selected from the participant pool of the Max Planck Institute for Psycholinguistics, Nijmegen, the Netherlands. They all signed a consent form before participating and were paid €8 for their participation.

Object naming task

Materials

One set of 327 photos was used in all of the studies (see Fig. 1 for two examples). All photos were downloaded from two databases (www.pixelio.de/ and http://pixabay.com/en/). All of the photos are shown in the supplementary materials.

Fig. 1 Examples of two photographs, named banaan and autobus in Dutch (“banana” and “bus” in English)

Procedure

All photos were resized to fit into frames of 4 × 4 cm on a computer screen (2.29° of visual angle). On each trial, a fixation cross (+) was presented for 800 ms in the center of the screen. Then a photo was presented for up to 2,000 ms. A trial was terminated as soon as the participant pressed a key on the keyboard. The interstimulus interval was 1,500 ms; the order of photos was pseudorandomized and was different for each participant. The participants were tested individually.
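For illustration only, the trial structure could be scripted as in the following minimal sketch. It uses PsychoPy rather than the Presentation software actually employed (see the Apparatus section), and the file name, window size, and on-screen image size are hypothetical.

```python
from psychopy import core, event, visual  # PsychoPy: illustrative stand-in for Presentation

win = visual.Window(size=(1024, 768), units="pix", color="white")
fixation = visual.TextStim(win, text="+", color="black")
clock = core.Clock()

def run_trial(photo_path):
    """One trial: 800-ms fixation, photo shown for up to 2,000 ms or
    until a keypress, then a 1,500-ms interstimulus interval."""
    fixation.draw()
    win.flip()
    core.wait(0.8)                                    # fixation cross for 800 ms
    photo = visual.ImageStim(win, image=photo_path,
                             size=(300, 300))         # display size is an assumption
    photo.draw()
    win.flip()
    clock.reset()
    keys = event.waitKeys(maxWait=2.0,
                          timeStamped=clock)          # a key press ends the trial
    win.flip()                                        # clear the screen
    core.wait(1.5)                                    # interstimulus interval
    return keys                                       # None if no key within 2,000 ms

run_trial("banaan.jpg")                               # hypothetical file name
win.close()
```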

Apparatus

The naming task was performed on an HP 8540P laptop with the Presentation software package (Version 14.3, www.neurobs.com). Naming latencies were manually measured using the speech analysis program Praat (Boersma, 2001).

Scoring

Participants’ responses were transcribed, and we computed the H statistic for each photo to index name agreement. The H statistic, introduced by Snodgrass and Vanderwart (1980), captures the spread of responses for an individual photo and is calculated as follows:

$$ H=\sum_{i=1}^{k}P_i\log_2\left(1/P_i\right), $$

where k represents the number of different names given to a photo, and P_i represents the proportion of participants giving each valid name. For instance, if there is only one name for a photo, H is zero; if two names occur with equal frequencies, H is 1. H increases with the number of different names given and decreases as a single name becomes more dominant.
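As an illustration, the H statistic can be computed from the raw responses as in the following minimal Python sketch (the response lists are hypothetical):

```python
from collections import Counter
from math import log2

def h_statistic(responses):
    """Snodgrass and Vanderwart's (1980) H statistic for one photo.

    `responses` is a list of valid names given by participants.
    H = sum over names of P_i * log2(1 / P_i), i.e., the Shannon
    entropy (in bits) of the name distribution.
    """
    counts = Counter(responses)
    n = sum(counts.values())
    return sum((c / n) * log2(n / c) for c in counts.values())

# Examples from the text: one unanimous name gives H = 0;
# two names occurring with equal frequencies give H = 1.
print(h_statistic(["banaan"] * 31))                  # 0.0
print(h_statistic(["bus"] * 15 + ["autobus"] * 15))  # 1.0
```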

Responses were coded as errors when participants used names that differed from the expected or dominant names, when the response contained a repair or disfluency (e.g., stutters, or responses that started with filler words such as “emm”), or when no response was recorded, either because of technical problems or because the participant did not respond. Only correct responses were used in the subsequent analyses.

Rating tasks

The 327 photos and their names were used in the rating tasks, which were performed online. The photo stimuli were sized to fit into frames of 150 × 150 pixels; their actual size depended on the size of the participant’s computer screen. Word stimuli were presented in black lowercase letters in a large Tahoma font. The rating scales were presented simultaneously below the photos. The study was programmed using the Oracle Application Development Framework 11.1.1.4.0.

The instructions for the rating tasks were adapted from those in Shao, Roelofs, and Meyer (2014b). On each trial, one stimulus (a photo or a written word) was presented in the center of the computer screen. A trial was terminated when participants responded, and the next trial was presented immediately. All tasks were self-paced, but the program automatically terminated if no response was recorded within 1 h. Participants could not revise their ratings. The order of photos was pseudorandomized across participants and across rating tasks.

In the visual complexity rating task (adapted from Snodgrass & Vanderwart, 1980), participants were instructed to rate the visual complexity of each photo. A value of 1 indicated a very simple photo, and a value of 7 indicated a very complex photo. Participants were required to rate the complexity of the photo itself rather than the complexity of the concept it represented. Here, “complexity” was defined as the amount of visual detail in the photo. In addition, we measured the objective visual complexity of each photo following Szekely and Bates (2000): All photos were compressed into 300 × 300 pixel JPG files, and the size of each compressed file was used to estimate its objective visual complexity.
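A minimal sketch of this objective measure is given below, assuming the Pillow imaging library (the exact compression settings used by Szekely and Bates, 2000, may differ, and the file name is hypothetical):

```python
import os
from PIL import Image  # Pillow; one plausible tool, not necessarily the authors' choice

def objective_complexity(path, tmp="_tmp_complexity.jpg"):
    """Estimate objective visual complexity as the size (in bytes)
    of the photo after compression to a 300 x 300 pixel JPG file."""
    img = Image.open(path).convert("RGB").resize((300, 300))
    img.save(tmp, format="JPEG")
    size = os.path.getsize(tmp)
    os.remove(tmp)
    return size

# Hypothetical usage: a visually busier photo yields a larger compressed file.
print(objective_complexity("banaan.jpg"))
```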

On each trial in the image agreement rating task (also adapted from Snodgrass & Vanderwart, 1980), a noun was presented in the center of a computer screen for 10 s and was followed by the corresponding photo, which remained in view until the participant responded. A value of 1 indicated a very poor match, and a value of 7 indicated a very good match.

In the imageability rating task (adapted from Paivio, Yuille, & Madigan, 1968), participants were presented with written nouns and instructed to rate how readily each noun evoked a mental image. A value of 1 indicated that the noun evoked mental images with greatest difficulty or not at all, and a value of 7 indicated that the noun evoked mental images most readily.

In the familiarity rating task (adapted from Gilhooly & Logie, 1981), participants were given a 5-point scale to rate how often they had experienced a concept. The value 1 indicated that the participants were not familiar with the concept, and the value 5 indicated that they were very familiar with the concept. Participants were instructed to estimate the familiarity of the concept rather than of the word.

In the AoA rating task, participants were given a 9-point scale to rate the age at which they thought they had acquired a particular noun. On the scale, the value 1 indicated 0–2 years, the value 2 indicated 3 years, the value 3 indicated 4 years, the value 4 indicated 5 years, the value 5 indicated 6 years, the value 6 indicated 7–8 years, the value 7 indicated 9–10 years, the value 8 indicated 11–12 years, and the value 9 indicated 13 years and older. If the participants did not know the word, they were asked to choose Ik ken het woord niet (“I don’t know the word”). “Acquiring a noun” was defined as the age at which the participants thought they would have understood the noun if somebody had used it in front of them, even if they could not yet say, read, or write the noun themselves.

Results

Table 1 provides descriptive statistics for each variable assessed in the rating tasks, along with the log-transformed word-form frequency (obtained from the SUBTLEX-NL database; Keuleers, Brysbaert, & New, 2010) and the word length (in numbers of syllables and letters) of the dominant names. The ratings showed that the tested words had relatively high imageability and that the photos had good image agreement. Table 1 also shows that the word frequencies of the dominant names covered a broad range. Table 2 provides descriptive statistics for the H statistics, naming latencies, and naming error rates. Errors included nonexpected or nondominant names and missing or disfluent responses. The mean ratings for each photo are listed in the supplementary materials.

Table 1 Descriptive statistics for 327 photos: Age of acquisition (AoA), familiarity, imageability, image agreement, subjective and objective visual complexity, subjective and objective word frequency, and word length in syllables and letters
Table 2 Descriptive statistics for the H statistic, naming latency (in milliseconds), and error rates

We then computed correlations among the naming latencies and all predictors. As shown in Table 3, the naming latencies were related to all predictors except word length in letters and objective visual complexity. The two measures of visual complexity were highly correlated, as were the two measures of word length. In addition, AoA was highly correlated with concept familiarity, imageability, word length, and log-transformed word frequency. Concept familiarity was highly correlated with imageability and with log-transformed word frequency (rs > .40).

Table 3 Matrix of correlations between naming latencies and the predictor variables

A simultaneous multiple regression analysis was used to assess how much of the variance in the naming latencies was explained uniquely by the predictors. We included only the predictors that were significantly correlated with naming latencies. Table 4 summarizes the results and shows that imageability, image agreement, the H statistic (name agreement), subjective visual complexity, and log-transformed word frequency were significant predictors of naming latencies. The contribution of familiarity was marginally significant, p = .09. The regression analysis yielded an adjusted R² = .54, F(8, 250) = 36.86, p < .001. Table 4 also shows that collinearity was not a problem for the regression model, according to Field (2009), because the tolerance statistics were all above .2 and the average variance inflation factor (VIF) was 1.86.

Furthermore, we used principal components analysis (PCA) to explore how many underlying factors were reflected by the predictors of naming latencies. The PCA with varimax rotation was computed on all variables. The Kaiser–Meyer–Olkin measure verified the sampling adequacy of the analysis, KMO = .64, and Bartlett’s test of sphericity was significant, χ²(28) = 617.98, p < .001, indicating that the correlations between variables were sufficiently high for PCA. Three components had eigenvalues greater than 1 (Kaiser’s criterion), and together they explained 66.37% of the variance. Table 5 shows the three factors extracted in the analysis. In particular, Factor 1 loaded on AoA, familiarity, log-transformed word frequency, and word length; Factor 2 loaded on imageability, image agreement, and name agreement; and Factor 3 loaded on subjective visual complexity. Similar results were found in the normative study of line drawings of actions for Dutch by Shao et al. (2014b), in which the first factor loaded on AoA, familiarity, and word frequency; the second factor loaded on image agreement, name agreement, and imageability; and the third factor loaded on word length and visual complexity.

Table 4 Results of the multiple regression analysis, with naming latency as the criterion variable and age of acquisition (AoA), familiarity, imageability, image agreement, H statistic, word length in syllables, subjective visual complexity, and log-transformed word frequency as predictor variables
Table 5 Summary of results of the principal components analysis
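For readers who want to reproduce this type of analysis on the norms, the following is a minimal sketch using pandas, statsmodels, and scikit-learn. The file and column names are hypothetical, these were not necessarily the tools used in the original analyses, and plain (unrotated) PCA is shown in place of the varimax-rotated solution reported above.

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical column names; "norms.csv" stands in for the item-level data.
df = pd.read_csv("norms.csv")
predictors = ["aoa", "familiarity", "imageability", "image_agreement",
              "h_statistic", "syllables", "subj_complexity", "log_freq"]

# Simultaneous multiple regression on naming latencies.
X = sm.add_constant(df[predictors])
model = sm.OLS(df["naming_latency"], X).fit()
print(model.summary())  # coefficients, adjusted R-squared, F statistic

# Collinearity diagnostics: VIF per predictor (tolerance = 1 / VIF).
for i, name in enumerate(predictors, start=1):  # column 0 is the constant
    print(name, variance_inflation_factor(X.values, i))

# PCA on the standardized predictors; a varimax rotation would need an
# extra step (e.g., the factor_analyzer package).
pca = PCA().fit(StandardScaler().fit_transform(df[predictors]))
print(pca.explained_variance_)  # retain components with eigenvalues > 1
```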

Reliability

We calculated the split-half reliabilities for all rating tasks, including subjective visual complexity, AoA, imageability, image agreement, and concept familiarity (see Table 6).

Table 6 Split-half reliabilities for subjective visual complexity, age of acquisition, imageability, image agreement, and concept familiarity within the sample
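A minimal sketch of one common split-half procedure follows (an odd–even split of raters with a Spearman–Brown correction); the exact splitting scheme used in the study is an assumption here, and the demo data are randomly generated.

```python
import numpy as np

def split_half_reliability(ratings):
    """Split-half reliability for one rating task.

    `ratings` is an items x raters array. Raters are split into odd and
    even halves, item means are computed within each half, the two sets
    of means are correlated, and the Spearman-Brown formula corrects
    the correlation for the halved test length.
    """
    ratings = np.asarray(ratings, dtype=float)
    half_a = ratings[:, 0::2].mean(axis=1)  # raters 1, 3, 5, ...
    half_b = ratings[:, 1::2].mean(axis=1)  # raters 2, 4, 6, ...
    r = np.corrcoef(half_a, half_b)[0, 1]
    return 2 * r / (1 + r)                  # Spearman-Brown correction

# Hypothetical data: 327 items rated by 31 raters on a 7-point scale.
# Random ratings yield near-zero reliability; real ratings would be higher.
rng = np.random.default_rng(0)
demo = rng.integers(1, 8, size=(327, 31))
print(round(split_half_reliability(demo), 2))
```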

To indicate the reliability of the data collected in the present study, we correlated the AoA, familiarity, and imageability ratings for the items overlapping between the present study and other published Dutch norms (Hermans & De Houwer, 1994; Moors et al., 2013; van Loon-Vervoorn, 1985). In particular, for AoA, 185 items overlapped, r = .92, p < .001; for imageability, 221 items overlapped, r = .56, p < .001; and for familiarity, 62 items overlapped, r = .88, p < .001. The correlation for imageability was weaker than the other correlations. One reason is that many items were given the same ratings in van Loon-Vervoorn (1985), which reduced the size of the correlation. In addition, the imageability ratings from van Loon-Vervoorn (1985) were collected 30 years ago, and the imageability of individual words may have changed over such a long time. For instance, plaat (“record”) is mentioned and seen less often nowadays than 30 years ago. In general, these results suggest that the norms reported in the present study provide reliable information.

To the best of our knowledge, there is only one published normative study for line drawings of objects for Dutch (Severens et al., 2005). Of our items, 159 overlapped with those from Severens et al.’s study. For the overlapping items, the naming latencies from Severens et al. were positively correlated with the naming latencies from the present study, r = .45, p < .001, indicating the reliability of the data.

We then carried out t tests to compare the numbers of alternative names, the H statistics, and the average naming latencies for the overlapping items. The results showed that the items used in the present study elicited similar numbers of alternative names, t(316) = 0.36, p = .72 (M = 2.83 in Severens et al., 2005), higher name agreement, t(316) = 9.53, p < .001 (M = .64 in Severens et al., 2005), and longer naming latencies, t(316) = −3.35, p = .001 (M = 1,006 ms in Severens et al., 2005).
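As an illustration of such a comparison, a minimal sketch with scipy is given below; the latency vectors are randomly generated stand-ins, not the real norms.

```python
import numpy as np
from scipy.stats import ttest_ind

# Hypothetical stand-in data for the 159 overlapping items: naming
# latencies (ms) from the present study (photos) and from Severens
# et al. (2005; line drawings). Real values would come from the norms.
rng = np.random.default_rng(1)
photos = rng.normal(1100, 150, size=159)
drawings = rng.normal(1006, 150, size=159)

# Independent-samples t test; df = 159 + 159 - 2 = 316.
t, p = ttest_ind(photos, drawings)
print(f"t(316) = {t:.2f}, p = {p:.3f}")
```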

Discussion

The present study is the first to provide norms for a set of 327 photos of objects, derived from a group of healthy Dutch participants. A large set of psycholinguistic variables was included: AoA, familiarity, imageability, image agreement, objective and subjective visual complexity, word frequency, word length (in syllables and in letters), and the H statistic (name agreement). The ratings and the total number of alternative names for each photo are listed in the supplementary materials.

Among these predictors, AoA and word frequency strongly correlated with familiarity and word length, and AoA strongly correlated with imageability. Furthermore, almost all predictors, except word length in letters and objective visual complexity, correlated with naming latencies. The simultaneous multiple regression analysis showed that naming latencies were predicted only by imageability (how easily a given word can evoke a mental image), image agreement (how well the photo corresponded to the raters’ mental images of a given word), the H statistic (how well speakers agree on the name of the photo), word frequency (how frequently speakers use the target name), and subjective ratings of visual complexity. Together, the predictors accounted for 54% of the variance (adjusted R²) in the naming latencies. The remaining variables, including AoA and word length, made no independent contributions to naming latencies.

The results of the PCA analysis suggest that naming latencies are influenced by three factors: (1) frequency of experience with a concept or a word form (indicated by familiarity and word frequency); (2) the association between an object and its name (indicated by imageability, image agreement, and H statistic); and (3) the difficulty of processing the visual information (indicated by visual complexity).

It is noteworthy that although objective visual complexity was positively correlated with subjective visual complexity, r = .53, p < .01, only subjective visual complexity predicted naming latencies. It is possible that subjective visual complexity is influenced by top-down perceptual cues (e.g., imageability or familiarity), because participants cannot avoid activating the name of an object when they estimate its visual complexity (Szekely & Bates, 2000). Combining the results of the multiple regression and PCA analyses, it seems that Factors 1 and 2 have strong impacts on naming latencies.

It is surprising that AoA was not a predictor of photo-naming latencies. The impact of AoA on naming latencies has been found in many studies using line drawings (Barry et al., 1997; Brysbaert & Cortese, 2011; Ellis & Morrison, 1998; Liu et al., 2011). To the best of our knowledge, however, only one photo-norming study has shown that AoA predicts photo-naming latencies (Nishimoto et al., 2012). Since AoA ratings have not been collected in other photo-norming studies, future research will be needed to verify whether AoA is a reliable determinant of photo-naming latencies.

Importantly, we found that photos have higher name agreement but longer naming latencies than line drawings. This contrasts with the finding that colored photos were named more quickly than line drawings (Brodie et al., 1991). It is noteworthy, however, that only a few photos were used in Brodie et al. (1991; six objects in Exp. 1, 20 objects in Exp. 2, 12 objects in Exp. 3, and 33 objects in Exp. 4). Moreover, most of the photos used in the present study contained much more surface detail and background than the photos used by Brodie et al., which may also have influenced the naming latencies.

Why does photo naming differ from line-drawing naming? In general, photos, as compared to line drawings, contain more surface details, including color, texture, shadow, and sometimes background details. It is possible that these details could reduce the ambiguity of the presented objects, and therefore improve the degree of name agreement. Color in particular may contribute to the difference between naming photos and naming line drawings. For instance, Tanaka and Presnell (1999) demonstrated that color information helps object recognition, especially for objects strongly associated with a particular color. Moreover, Uttl, Graf, and Santacruz (2006) hypothesized that the recognition of photos and line drawings may recruit different perceptual and semantic processes, because line drawings may be viewed as typical representations of an object class—a type—whereas photos are viewed as individual objects—tokens. Therefore, different types of stimuli may be suitable for different research questions. Nevertheless, the difference between photos and line drawings should be taken into account in future studies.

Conclusions

To summarize, the present study is the first to report Dutch normative data as well as naming latencies for 327 photos of objects. The analyses showed that name agreement, word frequency, imageability, image agreement, and subjective visual complexity were significant predictors of naming latencies. The normative data will be useful for future research on object naming in both clinical and healthy populations.