Background

Although research questions on information processing of object-directed actions are diverse, a common requirement of many studies is a set of visual stimuli that depict object-directed actions (e.g., Bach, Peelen, & Tipper, 2010; Hesse, Sparing, & Fink, 2009; Spunt, Satpute, & Lieberman, 2011). The database introduced in this article provides a set of such actions, many of which are available as short videos of different manipulations of objects under controlled conditions. These stimuli therefore provide a tool for investigating the processing of perceived actions.

This is not the first time action stimuli have been presented. In an attempt to systematize action stimuli, Fiez and Tranel (1997) described a set of 280 transitive and intransitive actions, each depicted either in a single photograph or in a photograph pair. Items were characterized by native English speakers as to their name, the correspondence of name and photograph, item familiarity, and visual complexity. In a study by Bonin, Boyer, Méot, Fayol, and Droit (2004), a subset of photographs from the Fiez and Tranel database was rated by a French sample, using the same psycholinguistic norms but including additional characteristics such as imageability and age of acquisition. Comparable name agreement scores and ratings have also been collected for line drawings depicting transitive and intransitive actions with English, French, and Spanish samples (Cuetos & Alija, 2003; Masterson & Druks, 1998; Schwitter, Boyer, Méot, Bonin, & Laganaro, 2004; Szekely et al., 2005; Szekely et al., 2004).

However, most sets that have been published contain static action stimuli, that is, photographs or line drawings. This is surprising given that actions are inherently dynamic and unfold over time. Only a few studies have acknowledged this dynamic aspect by using action video clips, rather than action photographs, as stimulus materials (e.g., Hamilton & Grafton, 2008; Spunt et al., 2011; Zalla, Labruyère, Clément, & Georgieff, 2010). One reason for this restriction is very likely that the preparation and production of video clips are time-consuming and laborious. To researchers in the field, it would therefore be useful if a set of standardized action video clips were freely available. Researchers could then easily pick those actions that fit their research goals and use them as stimulus materials in their studies, rather than producing their own action video clips anew each time. To our knowledge, there is only one previous study, published in French, in which a rather small set of action video clips (110 items) was introduced and rated as to naming agreement and correspondence of action verb and video clip (Bonin, Roux, Méot, Ferrand, & Fayol, 2009). These video clips were based on the actions that had been used in Bonin, Boyer, Méot, Fayol, and Droit (2004) and Schwitter et al. (2004).

In this article, we introduce a more extensive database of video clips referring to actions. These video clips have been rated for familiarity in China and Germany and are, therefore, suitable for investigating action information processing cross-culturally. Although actions as such are ubiquitous in our everyday lives, they are also highly related to culture, and cultures differ partly in their action repertoire. For instance, imagine a regular food intake situation in China and in Germany. Whereas an ordinary Chinese person would use chopsticks in order to transport long noodles from a plate into his or her mouth, this would be a rather unusual action to take for a German person. In contrast, an ordinary German person would wind the noodles up using a fork and spoon, an action that would be exceptional for the Chinese person. Because actions differ in familiarity between cultures, it is likely that some identical actions are differentially processed and represented by Easterners and Westerners.

Previous studies have demonstrated that an action’s familiarity influences perceptual processing, memory performance, imitation, and outcome prediction (e.g., Calvo-Merino, Ehrenberg, Leung, & Haggard, 2010; Knopf, 1991; Wang, Fu, Aschersleben, & Zimmer, 2012; Zalla et al., 2010). The familiarity status of actions and tools can also modulate brain activation patterns (e.g., Calvo-Merino, Grèzes, Glaser, Passingham, & Haggard, 2006; Rumiati et al., 2005; Vingerhoets, Acke, Vandemaele, & Achten, 2009). In a recent study, it was also demonstrated that cross-cultural action familiarity differences affect information processing (Liew, Han, & Aziz-Zadeh, 2011). The results of Liew and colleagues suggest that different brain regions are involved during observation of familiar and unfamiliar gestures when the task is to infer the actors’ intentions.

Capitalizing on these results, we conducted a study with object-directed actions in which we compared memory for seen actions that were physically identical but either familiar or unfamiliar to the observer, depending on whether the action was common in the observer’s own culture or not. We showed that the content of our memory representations was dependent on the action’s familiarity in the given culture.

During encoding, video clips of object-directed actions differing in familiarity were presented to a Chinese and a German sample. The actions fell into four equally sized groups (25 % each): familiar in both cultures, familiar in China and unfamiliar in Germany, familiar in Germany and unfamiliar in China, and unfamiliar in both cultures. In a recognition memory test, different but related action video clips were presented. Half of the participants were required to make an old/new judgment as to the means (i.e., the detailed interaction of effector and object), and half of the participants as to the ends (i.e., the intended physical consequences) of the actions. The data speak in favor of a hierarchical model of action representations that is common across cultures. Whereas the “end” information was equally well represented for familiar and unfamiliar actions, detection of changed means was better for familiar than for unfamiliar actions. Participants' memory in both cultures was modulated in the same way by familiarity. Consequently, items that were physically identical were remembered differently in the Chinese and the German sample, due to reversed familiarity. This suggests that culture-specific familiarity has to be considered if one wants to predict memory or the way actions are processed.

Details of this study will be reported elsewhere (Umla-Runge, Zimmer, Fu, & Wang, 2012). Because collecting cross-cultural familiarity ratings of object-directed actions was a necessary preparatory step in the Umla-Runge et al. study, we used that work to build the standardized data set of actions presented in the present article.

A common feature of psycholinguistic rating studies is that real physical stimuli (line drawings, photographs, or video clips) are presented to the participants and their task is either to name them or to rate them with regard to some aspects like imageability or familiarity (Bonin et al., 2009; Bonin et al., 2004; Cuetos & Alija, 2003; Fiez & Tranel, 1997; Masterson & Druks, 1998; Schwitter et al., 2004; Szekely et al., 2005; Szekely et al., 2004). We adopted a different approach in order to identify object-directed actions that differ in familiarity between Eastern and Western cultures. We decided to collect ratings of verbal descriptions of actions. Presenting real actions bears the risk that the perceiver does not judge the familiarity of the action but the familiarity of the object that is manipulated by the actor, which would be a rating of object instead of action familiarity. In order to avoid this, we presented verbal action descriptions rather than visual action depictions.

Action familiarity ratings were obtained in two waves, the first one yielding categorical familiarity ratings, the second one yielding numerical familiarity ratings. Procedures and outcomes will be described in detail in the following sections. Resulting from this two-step procedure, action descriptions are available for 1,315 actions in Chinese and German, as well as in English, and English action descriptions are available for another 439 actions. For 784 actions, video clips are available: 494 actions are depicted in one action video clip, and 290 actions are depicted in two action video clips showing the same action with different object exemplars. In total, 1,074 action video clips of object-directed actions are available, which can be downloaded as supplemental material.

Study 1: Wave 1 ratings

Categorical action familiarity rating

Method

Participants and procedure

We collected 1,754 English action descriptions for object-directed manual actions specifying both the means and the end of the action. They corresponded partly to action phrases that had been used in previous studies from our lab or descriptions of actions that had been observed or executed in everyday life in Germany and/or China. Some examples are given in Table 1. Two of the authors, one native Chinese speaker (L.W.) and one native German speaker (K.U.R.), both fluent in English, made a familiarity judgment of the actions the descriptions referred to. The familiarity judgment consisted of a judgment of what they thought would be true for the majority of right-handed adults between 18 and 40 years of age from their home country. Action descriptions were listed in a table in an electronic document. Next to each item, raters filled in a letter corresponding to their familiarity judgment. Three categories were possible: “I think the action is mostly familiar” (F), “I think the action is mostly unfamiliar” (U), and “I am not sure whether the action is mostly familiar or unfamiliar” (N). For the familiarity judgment, frequency of performing and/or observing the action was considered relevant. Both means and end were to be taken into account.

Table 1 Exemplary action descriptions specifying means and ends in English

From the 1,754 action descriptions, 689 actions could be identified for which both the Chinese and the German rater were sure that the action would be either familiar or unfamiliar for the majority of young adults from their home country. These items were preselected for the categorical rating procedure. Actions for which at least one of the raters was “not sure” were rated numerically (Wave 2 ratings).

The 689 action descriptions were pseudorandomly divided into three subsets of 172 actions and one subset of 173 actions. Care was taken that action descriptions referring to different means of performing an action with the same end were not included in the same subset, in order to avoid familiarity comparisons between those actions. The order of action descriptions in each subset was also pseudorandomized. Each subset was rated in the same way as described above by 3 additional native Chinese speakers and 3 additional native German speakers fluent in English. Each additional rater rated the action descriptions of one subset only. In total, 12 additional native Chinese (mean age: 26.3 years, 8 females, 10 postgraduate students) and 12 additional native German speakers (mean age: 27.8 years, 10 females, 11 postgraduate students) participated in the categorical familiarity rating. The pseudorandom order in each subset was kept constant across raters.
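The subset construction described above can be illustrated with a minimal sketch. It assumes that each action description is stored together with a label for its end; the function name `split_into_subsets`, the pair format, and the seeded generator are hypothetical and are not taken from the procedure actually used.

```python
import random
from collections import defaultdict


def split_into_subsets(items, n_subsets, seed=0):
    """Split (description, end) pairs into balanced, pseudorandomly ordered subsets.

    Descriptions that share an end never land in the same subset.
    """
    rng = random.Random(seed)  # fixed seed keeps the order identical for every rater
    base, extra = divmod(len(items), n_subsets)
    capacity = [base + (1 if i < extra else 0) for i in range(n_subsets)]

    # Group the descriptions by the end they refer to.
    by_end = defaultdict(list)
    for description, end in items:
        by_end[end].append(description)

    groups = list(by_end.values())
    rng.shuffle(groups)

    subsets = [[] for _ in range(n_subsets)]
    for descriptions in groups:
        if len(descriptions) > n_subsets:
            raise ValueError("more variants of one end than subsets")
        # Rank subsets by remaining free slots (most free first), ties broken at random.
        ranked = sorted(
            range(n_subsets),
            key=lambda i: (len(subsets[i]) - capacity[i], rng.random()),
        )
        # zip pairs each variant of the same end with a different subset.
        for description, i in zip(descriptions, ranked):
            subsets[i].append(description)

    # Pseudorandomize the presentation order within each subset.
    for subset in subsets:
        rng.shuffle(subset)
    return subsets
```

Applied to 689 items and four subsets, such a procedure yields three subsets of 172 items and one subset of 173 items, provided that no end has more than four variants.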

Results

An action was categorized as familiar or unfamiliar for right-handed young adults in a given country if at least 3 of the 4 raters from that country assigned it to the same category (“F” or “U”). To be retained, an action had to meet this criterion in both countries.

Four hundred thirty-nine actions reached this criterion. They could be divided into four categories: familiar in both China and Germany, familiar in China and unfamiliar in Germany, familiar in Germany and unfamiliar in China, and unfamiliar in both countries. Of the 439 actions, 340 were familiar in both countries, 24 were familiar in China and unfamiliar in Germany, 40 were familiar in Germany and unfamiliar in China, and 35 were unfamiliar in both countries.
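The agreement criterion and the resulting four categories can be expressed as a short sketch. The function names `country_category` and `classify` and the list-of-letters input format are hypothetical; only the letters F, U, and N and the 3-out-of-4 rule come from the procedure described above.

```python
from collections import Counter


def country_category(ratings, min_agree=3):
    """Return "F" or "U" if enough raters of one country agree, otherwise None."""
    label, count = Counter(ratings).most_common(1)[0]
    if label in ("F", "U") and count >= min_agree:
        return label
    return None  # no sufficient agreement, or agreement only on "N" (not sure)


def classify(chinese_ratings, german_ratings):
    """Return (category in China, category in Germany), or None if the criterion fails."""
    china = country_category(chinese_ratings)
    germany = country_category(german_ratings)
    if china is None or germany is None:
        return None
    return (china, germany)
```

For example, `classify(["F", "F", "F", "U"], ["U", "U", "N", "U"])` returns `("F", "U")`, that is, familiar in China and unfamiliar in Germany, whereas an action with a 2:2 split in either country returns `None` and would be passed on to the numerical rating.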

For 235 out of the 439 actions, video clips were generated. They were chosen by taking practical considerations into account, such as the availability of objects and the ease/difficulty of producing corresponding video clips. One hundred forty-five actions were depicted in one video clip (83 familiar in China and Germany, 22 familiar in China and unfamiliar in Germany, 25 unfamiliar in China and familiar in Germany, 15 unfamiliar in China and Germany), and 90 actions were depicted in two video clips (67 familiar in China and Germany, 14 unfamiliar in China and familiar in Germany, 9 unfamiliar in China and Germany). Whether one or two video clips were available of a given action was dependent on the action’s use as an experimental or a filler item in the Umla-Runge et al. (2012) study. Descriptions of the items are categorized as to familiarity in China and Germany in Table A (supplemental material). Video clip duration varies between 1,500 and 4,000 ms. Each clip contains an object-directed manual action from a third-person perspective (from left, from right, or from a position opposite to the actor). Seven different actors were involved in performing the actions (two male actors, five female actors). Only the hands and arms of the actors are visible. The video clips are purely visual stimuli and do not contain any sound. Unless otherwise indicated, double clips of the same action involve the same actor from the same perspective but acting upon a different object exemplar. Table A also lists for each action description whether one or two video clips for the action are available. Furthermore, the clip’s duration and perspective and a label for the actor’s identity (e.g., F1 = female actor 1) are given. Examples of action video clips from the four categories of familiarity are depicted in Fig. 1.

Fig. 1

Actions from different categories of familiarity. For each exemplary action, the action description and one frame from a video clip depicting the action are displayed

To summarize, of the 1,754 action descriptions generated in English in Study 1, 439 reached the criterion of being familiar or unfamiliar in Germany and China. For the remaining 1,315 actions, the familiarity status of the actions was established with a numerical method. For this purpose, a second study was conducted where familiarity was judged on a scale from 1 (= very unfamiliar) to 5 (= very familiar).

Study 2: Wave 2 ratings

Numerical action familiarity rating

Method

Participants and procedure

One thousand three hundred fifteen English action descriptions were translated into Mandarin and German by one Chinese and one German native speaker and were double-checked by one other Chinese and German native speaker. Translators were all fluent in English. The action descriptions were divided into five subsets of 219 items and one subset of 220 items. Care was taken that action descriptions referring to different means of performing an action with the same end were not included in the same subset, in order to avoid familiarity comparisons between those items. Each subset was rated by 16 Chinese and 16 German native speakers, half of them males and half of them females. In total, 96 Chinese and 96 German native speakers participated in the rating. They were all right-handed and between 18 and 40 years old. Raters were paid for their participation. For the familiarity judgment, frequency of performing and/or observing the action was considered relevant. Both means and end were to be taken into account.

Unlike in Study 1, participants were instructed to judge the familiarity of each action for themselves, rather than giving a judgment for the majority of young adults from their own country. Each action was rated on a scale from 1 (= very unfamiliar) to 5 (= very familiar). Each participant received an electronic questionnaire that contained the action descriptions in their native language from the respective subset. Participants were required to rate each action’s familiarity by ticking the box corresponding to their judgment out of boxes numbered 1 to 5, which were displayed below each action description.

Results

For each item, mean familiarity ratings and standard deviations were calculated for the Chinese and the German samples. Four categories of mean familiarity ratings were defined: [1, …, 2[, [2, …, 3[, [3, …, 4[, and [4, …, 5] (Footnote 1). Sixteen combinations of action familiarity in China and Germany resulted from the four categories. Table 2 lists the number of items in each category. For each familiarity combination, the table further contains the number of items for which one or two action video clips are available.
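The categorization of mean ratings and the resulting 16 combinations can be sketched as follows. This is an illustration only: the function names and the pair-based input format are hypothetical, and the bracket notation is read as half-open intervals, so that a mean of exactly 2 falls into the second category and only the top category includes its upper bound.

```python
from collections import Counter

# Lower bound included, upper bound excluded, except for the top category.
LABELS = ["[1, 2[", "[2, 3[", "[3, 4[", "[4, 5]"]


def familiarity_category(mean_rating):
    """Map a mean rating on the 1-5 scale to one of the four categories."""
    if not 1 <= mean_rating <= 5:
        raise ValueError("mean rating must lie between 1 and 5")
    index = min(int(mean_rating) - 1, 3)  # a mean of exactly 5 stays in the top category
    return LABELS[index]


def cross_tabulate(mean_pairs):
    """Count items per (category in China, category in Germany) combination."""
    return Counter(
        (familiarity_category(china), familiarity_category(germany))
        for china, germany in mean_pairs
    )
```

For instance, an item with a mean rating of 4.5 in China and 1.25 in Germany is counted in the cell combining [4, 5] for China with [1, 2[ for Germany.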

Table 2 Familiarity distribution of 1,315 action items

For 549 out of the 1,315 actions, video clips were generated. They were chosen taking into account both high ratings for familiarity/unfamiliarity and practical considerations such as the availability of objects and the ease/difficulty of producing corresponding video clips. Three hundred forty-nine actions were depicted in one video clip, and 200 actions were depicted in two video clips. Again, whether an action was depicted in one or two video clips depended on its use as an experimental or a filler item in the Umla-Runge et al. (2012) study. In Table B (supplemental material), we present descriptions of these items in Mandarin, English, and German, their familiarity ratings in China and Germany, and the names of the respective video clip(s). Clip duration varies between 1,500 and 4,000 ms. Each clip contains an object-directed manual action from a third-person perspective (from left, from right, or from a position opposite to the actor). Eight different actors were involved in performing the actions (two male actors, six female actors). Only the hands and arms of the actors are visible. The video clips are purely visual stimuli and do not contain any sound. Unless otherwise indicated, double clips of the same action involve the same actor from the same perspective, but acting upon a different object exemplar. Table B also lists for each action description whether one or two video clips for the action are available. Furthermore, the clip’s duration and perspective and a label for the actor’s identity (e.g., F1 = female actor 1) are given.

Mean familiarity ratings, mean standard deviations, and standard deviations of the mean standard deviations for the 549 actions available as video clips are listed in Table 3 for the four categories [1, …, 2[, [2, …, 3[, [3, …, 4[, and [4, …, 5] separately for the Chinese and German ratings.

Table 3 Familiarity ratings in China and Germany

Mean familiarity ratings and standard deviations for the four categories of familiarity are comparable across cultures. In a 4 (familiarity category: [4, …, 5], [3, …, 4[, [2, …, 3[, [1, …, 2[) × 2 (culture: China, Germany) mixed-model ANOVA with mean familiarity ratings of the 192 participants as the dependent variable, a significant main effect of familiarity category emerged, F(3, 570) = 1,220.4, ηp² = .87, p < .001. Planned comparisons revealed that the mean familiarity ratings of all four categories differed significantly from one another. There was neither a main effect of culture nor a significant interaction of culture and familiarity category on the mean familiarity ratings. An analogous ANOVA was conducted with mean standard deviations of the 192 participants as the dependent variable. A significant main effect of familiarity category was obtained, F(3, 519) = 58.04, ηp² = .25, p < .001 (Footnote 2). A post hoc Tukey HSD test showed that mean standard deviations for the moderate categories ([3, …, 4[ and [2, …, 3[) were significantly higher than mean standard deviations for the more extreme categories ([4, …, 5] and [1, …, 2[). A smaller but significant main effect of culture emerged, with higher mean standard deviations in familiarity ratings for Germans than for Chinese, F(1, 173) = 6.27, ηp² = .03, p < .05. Familiarity category and culture did not interact significantly. Mean familiarity ratings and standard deviations are plotted for the four familiarity categories separately for Chinese and German participants in Fig. 2.
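The reported effect sizes can be cross-checked against the F values. The following is a minimal sketch, assuming the standard relation between partial eta squared and an F statistic; the symbols df_1 (effect degrees of freedom) and df_2 (error degrees of freedom) are introduced here for illustration.

$$ \eta_p^2 = \frac{F \cdot df_1}{F \cdot df_1 + df_2} $$

$$ \frac{1220.4 \cdot 3}{1220.4 \cdot 3 + 570} \approx .87, \qquad \frac{58.04 \cdot 3}{58.04 \cdot 3 + 519} \approx .25, \qquad \frac{6.27 \cdot 1}{6.27 \cdot 1 + 173} \approx .03 $$

Under this assumption, the three values agree with the effect sizes reported for the main effect of familiarity category on mean ratings, the main effect of familiarity category on mean standard deviations, and the main effect of culture on mean standard deviations, respectively.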

Fig. 2

Mean familiarity ratings (left) and mean standard deviations (right) for the four familiarity categories in Chinese and German participants. Interaction effects were not significant [mean familiarity ratings, F(3, 570) = 1.19, p ≤ .31; mean standard deviations, F(3, 519) = 1.24, p ≤ .29]. Bars denote the standard errors of the means

Furthermore, we looked at interrater consistency for familiarity ratings within each culture. We considered an item to have exceptionally low interrater consistency if the standard deviation of its ratings (SD_Item) exceeded the mean standard deviation of its familiarity category (MSD) by more than twice the standard deviation of the standard deviations in that category (MSD_SD).

Low interrater consistency:

$$ \text{SD}_{\text{Item}} > \text{MSD} + 2 \times \text{MSD}_{\text{SD}} $$

From the 549 actions depicted in video clips, 5 actions show such low interrater consistency in the Chinese ratings, and 14 actions in the German ratings. They are identified in Table B.
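The consistency criterion can be illustrated with a minimal sketch for one familiarity category within one culture. The function name `flag_low_consistency` and the dictionary input are hypothetical, and the use of the sample standard deviation (`statistics.stdev`) for MSD_SD is an assumption, since the text does not state whether the sample or the population formula was applied.

```python
from statistics import mean, stdev


def flag_low_consistency(item_sds):
    """Return the ids of items whose rating SD exceeds MSD + 2 * MSD_SD.

    item_sds maps an item id to the standard deviation of its ratings,
    for all items of one familiarity category within one culture.
    """
    sds = list(item_sds.values())
    msd = mean(sds)      # MSD: mean of the item standard deviations
    msd_sd = stdev(sds)  # MSD_SD: sample SD of the item SDs (assumption)
    threshold = msd + 2 * msd_sd
    return sorted(item for item, sd in item_sds.items() if sd > threshold)
```

With nine items whose ratings have a standard deviation of 0.5 and one item with a standard deviation of 2.0, only the latter item is flagged.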

Additional information on familiarity ratings for action descriptions for which no video clips are available is listed in Tables C (categorical rating; 204 items) and D (numerical rating; 766 items). These tables are also available as supplemental material.

Discussion

The purpose of this article is to make a large action database available to other researchers in the field. We believe that this can be helpful in reducing the time and costs involved in preparing stimulus material for studies on action information processing. The database includes 1,754 object-directed actions. All actions are specified in verbal action descriptions containing means and ends and have been rated for familiarity in China and Germany. Each action description is available in an English version, 1,315 of them additionally in a Mandarin and a German version. For 494 action descriptions, one corresponding video clip is available; for 290 action descriptions, two corresponding video clips are available. In total, 1,074 action video clips have been recorded and can be downloaded as supplemental material.
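The totals given above follow directly from the counts reported for the two studies, as a short worked check shows:

$$ 439 + 1{,}315 = 1{,}754 \ \text{action descriptions} $$

$$ 494 + 290 = 784 \ \text{actions with video clips} $$

$$ 494 \times 1 + 290 \times 2 = 1{,}074 \ \text{video clips} $$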

Familiarity ratings were obtained in two waves, with one yielding categorical and the other numerical ratings. Items with numerical ratings have been rated on a scale from 1 (= very unfamiliar) to 5 (= very familiar), and mean numerical ratings have been categorized into the groups [1, …, 2[, [2, …, 3[, [3, …, 4[, and [4, …, 5]. Interrater consistency within cultures has been described by mean standard deviations for each of the four groups, as well as each item. In both cultures, interrater consistency for the items at the extreme ends of the rating continuum was higher, as compared with interrater consistency for items in the middle. This could be due to the possibility that items in the medium familiarity range can deviate to a greater extent toward both ends of the continuum, whereas items at the extreme ends can deviate only toward medium familiarity. Comparing mean familiarity ratings and mean standard deviations of the familiarity ratings for items in the four groups between cultures yielded a high similarity in the rating behavior across cultures.

It has been demonstrated before that an action’s familiarity matters for action information processing and that brain activations differ for familiar and unfamiliar actions (e.g., Calvo-Merino et al., 2010; Cross, Kraemer, Hamilton, Kelley, & Grafton, 2009; Knopf, 1991). Using categorical and numerical rating procedures, we were able to identify object-directed actions that differ, and others that are comparable, in familiarity for Chinese and German participants. In a cross-cultural study on recognition memory for specific aspects of object-directed actions, which we conducted using this database, we showed that the content of our memory representations is dependent on the action’s familiarity in the given culture. These results will be reported elsewhere (Umla-Runge et al., 2012). The action database we have presented here can be of special interest to researchers focusing on the effects of action familiarity or to researchers investigating action information processing in a cross-cultural context.

As a final note, there are some general limitations in the standardization of object-directed actions regarding their familiarity. First, action familiarity is an idiosyncratic feature. For instance, “whipping cream with a hand mixer using the right hand” is an unfamiliar action for most Chinese young adults. Still, a Chinese person frequently cooking according to German recipes will probably be very familiar with this action. In every culture, there will be specific subgroups of people to whom actions will be familiar that are unfamiliar to most of the others. Second, action familiarity changes with technical developments. Inserting a floppy disk into a computer’s drive probably was a very familiar action to many people about 15 years ago, whereas today, most computers do not have drives for floppy disks anymore. As a consequence, we would expect to receive different action familiarity ratings for some of the actions in the database were we to repeat the studies at a later point in time. Therefore, it would be advisable to repeat the cross-cultural familiarity rating procedure after some years in order to assess its temporal stability for the individual actions.