Introduction

Interdisciplinary evidence from robotics (Marocco, Cangelosi, Fischer, & Belpaeme, 2010), neuroscience (Hauk, & Pulvermüller, 2011) and cognitive psychology (Bekkering, & Neggers, 2002) support the so-called theory of embodied cognition (Barsalou, 2008). This theory argues that the processing of concepts is associated with the activation of perceptual and motor systems (see Barsalou, 2008; Binder, & Desai, 2011), and such an association is bidirectional, i.e. the activation of sensorimotor systems affects conceptual processing (e.g. see experiments in Rueschemeyer, Lindemann, van Rooj, van Dam, & Bekkering, 2010), and the activation of concepts affects sensorimotor systems (e.g. see experiment in Glenberg, & Kaschak, 2002). The relationship between concepts and sensorimotor systems is considered essential for effective social cognition, a type of cognition used in everyday life situations.Footnote 1 That is, for example, our perceptual and motor system can influence our cognitive processes (e.g. judgment, thinking, decision-making), just as these processes can influence our physical actions in social contexts (e.g. Wilson, 2002).

Based on this theory, Casasanto (2009) proposed the body-specificity hypothesis (BSH). The BSH argues that people implicitly associate positive-valenced concepts with the side of their bodily space on which they are more skilful. The experiments by Casasanto (2009) supported this prediction showing that right-handers were more likely than left-handers to associate the right space with positive ideas and the left space with negative ideas, whilst the opposite holds true for left-handed participants. Accordingly, right- and left-handers tended to link good things such as intelligence, attractiveness, honesty, and happiness more strongly with their dominant side. In employing functional magnetic resonance imaging (fMRI) to compare right- and left-handers’ brain activity during motor imagery tasks and action verb understanding, Casasanto (2011) found that whilst left-hemisphere motor areas were activated in right-handers, right-hemisphere motor areas were activated in left-handers. This finding lends additional support to the BSH from a neuroscience perspective.

In addition to this, Ansorge and Bohner (2013; see also Ansorge, Khalid, & König, 2013) reported a congruency effect when subjects had to categorise spatial words like up as elevated or less elevated (i.e. as high or low in the vertical space), as well as categorise affective words like happy as positive or negative. Their results support the assumption that valence–vertical space associations exist in semantic memory, so that faster responses were observed when target words were presented in spatially congruent locations (e.g. happy in the upper part of a computer screen). Similarly, Meier and Robinson (2004) found that positive-valenced words activated higher areas of visual space, whilst negative words activated lower areas of visual space (Study 2; see also Xie, Wang, & Chang, 2014), and Sasaki, Yamada and Miura (2015) showed that the emotional valence of images is influenced by motor action towards the upper or lower vertical spatial location (see also Sasaki, Yamada, & Miura, 2016).

To further expand on these previous studies, Marmolejo-Ramos, Elosúa, Yamada, Hamm, and Noguchi (2013) examined whether a dominance of the vertical plane exists over the horizontal plane. Their results supported the predictions of the BSH described above, but also showed that the vertical plane is more salient than the horizontal plane in relation to the allocation of valenced words. That is, whilst a rating task showed that left-handers rated the word left as more positive than right and right-handers showed the opposite pattern, a word allocation task showed that positively valenced words were placed in upper locations, whereas negatively valenced words were placed in lower locations regardless of participants’ handedness. Thus, the results lend support to the BSH and also indicate a higher saliency of the vertical plane over the horizontal in the allocation of valenced words (recent evidence as to the saliency of the vertical plane over the horizontal plane is further reported by Damjanovic, & Santiago, 2016). Note that Marmolejo-Ramos et al. (2013) reported some differences in the rating task amongst several linguistic groups (see Fig. 1 in their paper), but there were no linguistic differences in the word allocation task.

Fig. 1
figure 1

Materials used in the rating (a) and the word allocation (b) tasks. a The case of joy for illustrative purposes only

However, in a recent specialised section devoted to research in embodied cognition (Marmolejo-Ramos, & D’Angiulli, 2014), one article reported a study about the effect of linguistic factors on the valence–space metaphor. Marmolejo-Ramos, Montoro, Elosúa, Contreras, and Jiménez-Jiménez (2014) evaluated whether gender and cultural factors have an effect on the mapping of valenced sentences on the vertical space. In the first experiment, Colombian and Spaniards had to recall and report specific personal situations or contexts related to joy, sadness, surprise, anger, fear, and disgust; i.e. participants recalled and reported situations or contexts in which these emotions occur. Results showed that females expressed more contexts than males, and importantly, Colombians reported more contexts than Spaniards. Based on these results, the researchers designed a new spatial–emotional congruency verification task including sentences that recreated the most representative contexts for the emotions of joy and sadness (e.g. John had a good time with his friends). After reading a sentence, participants had to judge whether a probe word, displayed in either a high or low position on the screen, was congruent or incongruent with the previous sentence. The results showed a mapping between emotions and vertical space induced by sentences recreating representative emotional contexts. This evidence is in line with research (e.g. Schubert, 2005) suggesting that perceptions and judgments of abstract concepts are processed in metaphorical ways by estimating their relative position inside a vertical space.

The emotion words joy and sadness are exemplars of positive and negative emotions that have been studied in the context of other valenced concepts (see for an example, the classic study by Bradley and Lang, 1999). Whilst the words joy and sadness represent highly positive- and highly negative-valenced concepts that are readily mapped onto upper and lower locations in space (e.g. Ansorge, & Bohner, 2013), it is unknown how emotion words with rather neutral valence would be mapped onto space. An emotion word that seems to have a rather neutral valence (e.g. Reali, & Arciniegas, 2015) and whose metaphorical location onto space has not been investigated is that of surprise. Surprise is broadly defined as the detection of unexpected situations that challenge a person’s beliefs (Reisenzein, 2009, Reisenzein, Meyer, & Niepel, 2012). It is a peculiar emotion that seems to swing between being negative (e.g. when a person is victim of a robbery) and also positive (e.g. when a person finds his friends at home to celebrate his birthday; see also Macedo, Cardoso, Reisenzein, Lorini, & Castelfranchi, 2009). Also, it has been found that less verbal contexts can be reported for surprise compared to emotions such as joy and sadness (Marmolejo-Ramos et al., 2014). Interestingly, though this emotion has not been studied in the context of embodiment, therefore, the current study aims to do so along with the previously examined emotions; joy and sadness.

The first step before investigating how these three emotions are mapped onto space is to characterise them regarding their level of concreteness (i.e. the degree to which the concept denoted by a word refers to a perceptible entity (Brysbaert, Warriner, & Kuperman, 2014)], imageability [i.e. the ease with which a word gives rise to a sensory mental image of the word (Paivio, Yuille, & Madigan, 1968)], context availability [i.e. the ease with which a context can be brought to mind in which the person would feel that emotion (Schwanenflugel, & Shoben, 1983)] and valence [i.e. the level of positive–negative emotional state attached to what the emotion concept refers to (see Grühn, & Scheibe, 2008)]. The first objective of the study was met by having several linguistic groups rate these three emotion words. Having the ratings from several linguistic groups enables us to gain a comprehensive picture of these emotion words with regards to the levels listed above. Although linguistic differences are expected in the rating of words (see Fig. 1 in Marmolejo-Ramos et al., 2013), it is hypothesised that, across linguistic groups, these emotions could have medium-to-low levels of concreteness, and medium-to-high levels of imageability and context availability. As shown in Table 1, such levels are expected based on previous studies in which the average concreteness, imageability and context availability ratings for the words joy, surprise and sadness have been reported (see Altarriba, Bauer, & Benvenuto, 1999; Altarriba, & Bauer, 2004; Brysbaert et al., 2014).Footnote 2

Table 1 Mean concreteness, imageability, context availability and valence ratings of three emotion words as reported in previous studies

In regard to surprise, it most likely exhibits lower context availability than joy (and possibly sadness) as found by Marmolejo-Ramos et al. (2014; see Tables 1, 2 in the article). Note that, in that study, participants generated verbal contexts representing six different emotions, including the three emotions studied herein. These researchers found that surprise had the lowest number of verbal contexts (joy had the highest number of verbal contexts, followed by fear and sadness). Thus, it is expected to support such finding via a rating task. It could be speculated that fewer verbal contexts and lower context availability ratings for the concept of surprise could be attributed to the neutrality of the concept, which, in turn, may hinder thinking of clear-cut scenarios associated with that given emotion.

Table 2 Demographic and descriptive statistic information of the participants in Study 1 and 2 (MAD = median absolute deviation)

Regarding emotional valence, it is expected that joy will be rated as highly positive, whilst sadness will be rated as highly negative. This result has also been reported in previous studies (see Table 1). In the ratings reported in Bradley and Lang (1999), surprise seems to lean towards positivity (see Table 1). However, based on theoretical accounts arguing that surprise is a rather neutral emotion (e.g. Macedo et al., 2009), we expect that the valence ratings will indicate that surprise is, in fact, neutral.

With regard to the levels of concreteness, context availability, imageability and valence of each emotion word, some variability due to linguistic differences can be expected (see Evans, & Levinson, 2009). This will ultimately be reflected in language effects in all of the 12 rating conditions [i.e. three emotion words (joy, surprise, and sadness) × four word rating dimensions (concreteness, context availability, imageability, and valence)].

The second objective of the study was to investigate the allocation of these three emotions in space via various linguistic groups. Finding that the positive emotion joy and the negative emotion sadness are placed on upper and lower spatial locations, respectively, would support the findings of Ansorge and Bohner (2013; see also Ansorge et al., 2013; Meier, & Robinson, 2004; Xie et al., 2014, 2015). Indeed, finding that right-handers place the words joy and sadness towards rightward and leftward spatial locations, respectively, would lend extra support to the BSH (see Casasanto, 2009, 2011). However, based on the results by Marmolejo-Ramos et al. (2013), the distance between joy and sadness on the horizontal plane (i.e. BSH) is expected to not be significant; rather, it is hypothesised a significant difference between joy and sadness on the vertical plane exclusively.Footnote 3 These findings would then lend support to evidence suggesting a saliency of the vertical plane over the horizontal plane (see Fig. 2f in Marmolejo-Ramos et al., 2013). Finding that surprise is located half-way between the vertical locations of joy and sadness would show for the first time that surprise’s emotional valence is mapped onto space. Specifically, we expect to find that given the neutral valence of surprise, this word would be mapped onto a vertical location near the mid-point (i.e. placed between joy and sadness). The non-linguistic differences originally reported by Marmolejo-Ramos et al. (2013) in the allocation of valenced words onto space suggest that there could be minimal chances of finding language effects in the allocation of these words.

Fig. 2
figure 2

Results of the rating (a) and the word allocation (b) tasks. The notches in the box plots and the error bars represent 95 % CI around the median. Closed triangle = joy, closed square = surprise and closed circle = sadness

Methods

Participants

University undergraduate students and members of the community from six different linguistic backgrounds (i.e. English, Hindi, Japanese, Spanish, Vietnamese and German) voluntarily participated in the rating (n = 325) and the word allocation (n = 362) tasks. The experimental protocol was approved by the ethics committees of the institutions involved in the studies. Participants gave written informed consent to abide by the principles of the Declaration of Helsinki. Table 2 reports demographic and descriptive statistic information of the participants (participants whose responses reflected a lack of understanding of the instructions were illegible, or were incomplete and were discarded. Also, participants with incomplete demographic data, e.g. no information about gender, handedness, age or language, were not included in the analyses).

Materials

The three emotion words joy, surprise and sadness were used in the rating study. The ratings were performed via a simple paper-based task (see Fig. 1a). The word allocation task also consisted of a paper-based task (see Fig. 1b).

Procedure

Rating task

Participants were asked to rate the three emotions on the following dimensions: concreteness, imageability, context availability and valence. The ratings were made by placing a mark (e.g. via a pen or a pencil) on 10-cm horizontal lines; one line for each attribute. On the left end, the scales were labelled as ‘highly abstract’ (concreteness scale), ‘hard to imagine’ (imageability scale), ‘hard to think of a context’ (context availability scale) and ‘highly negative’ (valence scale). On the right end, the scales were labelled as ‘highly concrete’ (concreteness scale), ‘easy to imagine’ (imageability scale), ‘easy to think of a context’ (context availability scale) and ‘highly positive’ (valence scale). The three words were presented to participants for rating in a random order; however, the order of each rating (concreteness, imageability, context availability and valence) for each word was given in a fixed order (see Fig. 1a).

Word allocation task

Participants were asked to locate three symbols representing the words joy, surprise and sadness on a 10-cm2 gridded square (this grid resembles that used in Experiment 2 by Marmolejo-Ramos et al., 2013). A triangle represented joy, a square represented surprise and a circle represented sadness, and this matching was used for all participants (see Appendix for supplementary results that reflect the counterbalanced emotion/symbol combinations). The instructions read: “assuming the words joy, surprise and sadness were symbols to be placed in the following square, where would you put them?” Participants were also instructed that each symbol should occupy only one square within the grid, each symbol should occupy different squares in the grid, and each symbol should be drawn only once (see Fig. 1b). There were no time restrictions to complete this task.

Design and analyses

The data in both tasks were analysed via high-breakdown and high-efficiency robust linear regression modelling (see Yohai, 1987) via the ‘lmRob’ function in the ‘robust’ R package. For the rating study, the independent variables were participant, i.e. all participants in rating study (P), language, i.e. the six languages studied (L), gender, i.e. males and females (G), handedness, i.e. right- and left-handers (H), age, i.e. the ages of the participants in the rating study (A), word, i.e. joy, surprise and sadness (W) and word dimension, i.e. concreteness, imageability, context availability and valence (D). These factors were hierarchically entered in this order, and the dependent variable was the rating values.

For the word allocation study, the independent variables were participant, i.e. all participants in word allocation study (P), language, i.e. the six languages studied (L), gender, i.e. males and females (G), handedness, i.e. right- and left-handers (H), age, i.e. the ages of the participants in the word allocation study (A), and word, i.e. joy, surprise and sadness (W). These factors were entered in this order for the location values obtained in the X and Y axes; i.e. the two dependent variables in the word allocation study. The variables W, H and L were central to this study and added to the model based on previous research showing that they play a part in the mapping of words onto space (see Marmolejo-Ramos et al., 2013, 2014). Whilst the variable D is specific to the rating task, the variables P and A were peripheral to this study and were included to account for their potential effects on the dependent variables. Some of the estimates of the beta weights of the levels of the independent variables (β values) and their associated t and p values were reported to illustrate their influence on the model. For each hierarchical model, the variability accounted for was estimated as adjusted R 2 × 100. The models’ fits via ANOVA and robustified F tests (F r).

Avewere comparedrage values and associated measures of deviation were estimated via the median (Mdn) and median absolute deviation (MAD), respectively. The formula \(\pm 1.58 \times \left( {\frac{\text{IQR}}{\sqrt n }} \right)\), where IQR = interquartile range and n = sample size, was used to generate 95 % CI around the medians for assessing equality of medians at approximately 5 % significance level (see McGill, Tukey, & Larsen, 1978). Based on the results of the robust ANOVA model comparison, pairwise comparisons were examined via the degree of CIs overlap between groups of interest (e.g. within levels of a variable or between variables). Non-overlapping CIs were taken as evidence of significant difference between the groups’ medians (see Cumming, & Finch, 2005; Cumming, 2012). However, when there was some degree of overlap between two or more dependent groups, the Agresti–Pendergast ANOVA test (F AP) was used via the R function ‘apanova’ (see Wilcox, 2012). The p values of multiple comparisons were adjusted via the false discovery rate method, p FDR (Benjamini, & Hochberg, 1995). Pairwise comparisons between two or more independent groups were performed via the Cucconi permutation test, MC (Marozzi, 2012, 2014).

Results

The rating results suggested no differences among the three emotion words regarding their concreteness levels. However, joy received higher context availability ratings than surprise, and the three words differed in terms of imageability ratings; i.e. joy > surprise > sadness. Central to this study was the finding that, in terms of valence, joy was rated higher than sadness, and surprise’s average ratings fell between the other two words.

Rating task

Only the models P, P + L + G and P + L + G + H did not have significant t and p values associated with the β values. The other models had significant β values [e.g. in the P + L model: β Hindi = −1.86 (t = −6.65, p < 0.001); in the P + L + G + H + A model: β age = −0.03 (t = −2.88, p < 0.01); in the P + L + G + H + A + W model: β sadness = −1.78 (t = −17.11, p < 0.001); and in the P + L + G + H + A + W + D model: β context = 1.49 (t = 12.42, p < 0.001)]. The variability accounted for by each model was 1.02 % (P), 4.57 % (P + L), 4.63 % (P + L + G), 4.66 % (P + L + G + H), 4.82 % (P + L + G + H + A), 10.78 % (P + L + G + H + A + W), and 18.41 % (P + L + G + H + A + W + D). A comparison of the models further suggested that there was an improvement of the fitness of the hierarchical models to the rating data when P, L, and A were added; F r = 40.90, p < 0.001, F r = 22.49, p < 0.001 and F r = 7.03, p = 0.006, respectively. However, the largest improvement occurred when W and D were finally added to the model; F r = 111.45, p < 0.001 and F r = 104.77, p < 0.001, respectively.

The model P was significant in that there were differences in the ratings across participants. For example, whereas a participant in the English sample had a median rating of 3.95 [95 % CI (3.15, 4.74)], a participant in the Vietnamese sample had a median rating of 7.7 [95 % CI (4.89, 10.50)]. Language had an effect on the ratings, which was due to median ratings differing across linguistic groups. For example, whilst the median rating in the Hindi sample was 5.4 [95 % CI (5.18, 5.61)], the median rating in the Japanese sample was 6.5 [95 % CI (6.26, 6.73)]. The effect of age on the ratings was graphically explored via a scatter plot with linear and smooth fit lines and a correlation test. The results indicated a near-significant positive correlation (r τ = 0.02, z = 1.87, p = 0.06) such that, for example, the median rating of participants aged 17–25 was 6.7 [95 % CI (6.49, 6.90)], and the median rating of participants aged 30 to 35 was 7.95 [95 % CI (6.70, 9.19)].

The effect of word type (W) was substantiated by the non-overlap between the confidence intervals around the median ratings for the words joy, surprise and sadness; Mdnjoy = 7.6 [95 % CI (7.42, 7.77)], Mdnsurprise = 6.2 [95 % CI (6.059, 6.34)], and Mdnsadness = 5.8 [95 % CI (5.54, 6.054)].Footnote 4 In the case of the factor word dimension (D), whilst the average ratings in the context and imageability dimensions did not differ {Mdncontext = 7.4 [95 % CI (7.20, 7.59)], Mdnimageability = 7.4 [95 % CI (7.24, 7.55)]}, the average ratings in the concreteness and valence dimensions did {Mdnconcreteness = 5.7 [95 % CI (5.45, 5.94)], Mdnvalence = 5.1 [95 % CI (4.80, 5.39)]}. Also, the ratings for the words in the context and imageability dimensions were higher than the ratings for the words in the concreteness and valence dimensions {Mdncontext+imageability = 7.4 [95 % CI (7.27, 7.52)] and Mdnconcreteness+valence = 5.2 [95 % CI (5.02, 5.03)]}.

Given the significant effects of W and D on the ratings, their relationship was analysed. Figure 2a shows the ratings of the three words according to the dimension in which they were evaluated. In the concreteness dimension, the median ratings of joy {Mdn = 5.7 [95 % CI (5.27, 6.12)]}, sadness {Mdn = 5.2 [95 % CI (5.54, 6.25)]} and surprise {Mdn = 5.9 [95 % CI (4.73, 5.66)]} did not differ [F AP (2, 648) = 1.26, p = 0.28]. In the context dimension, there were differences between groups [F AP (2, 648) = 4.69, p = 0.009] due to the median rating of joy {Mdn = 7.6 [95 % CI (7.27, 7.92)]} differing from that of surprise {Mdn = 7.2 [95 % CI (6.96, 7.63)]} [F AP (1, 324) = 8.68, p FDR = 0.01]. Other pairwise comparisons in this dimension, and that involved the word sadness {Mdn = 7.3 [95 % CI (6.88, 7.51)]}, were not significant (all p FDR > 0.05). There were also differences between joy {Mdn = 7.8 [95 % CI (7.58, 8.01)]}, sadness {Mdn = 7.5 [95 % CI (7.18, 7.81)]} and surprise {Mdn = 7 [95 % CI (6.70, 7.29)]} in the imageability dimension [F AP (2, 648) = 14.13, p < 0.001] due to all pairwise comparisons being significant (all p FDR < 0.05). The non-overlap between the 95 % CIs of joy {Mdn = 9.1 [95 % CI (8.89, 9.30)]}, sadness {Mdn = 1.65 [95 % CI (1.40, 1.89)]}, and surprise {Mdn = 5.1 [95 % CI (4.98, 5.21)]} in the valence dimension indicates that the average ratings between these groups differed significantly.

Effects of covariates on the ratings of each emotion word

Emotion word JOY: Analyses of the effects of the covariates participant (P), language (L), gender (G), handedness (H), and age (A), on the four types of ratings revealed an effect of P (i.e. P model) on the context availability (CA), imageability (I) and valence (V) ratings of joy (CA: F r = 15.67, p = 5.45e −05; I: F r = 5.90, p = 0.01; V: F r = 16.59, p = 3.30e −05). There was also an effect of L (i.e. P + L model) on the CA and V ratings of joy (CA: F r = 12.74, p = 0.03; V: F r = 19.03, p = 0.003). All the other models were not significant; p > 0.05.

Emotion word SURPRISE: Analyses of the effects of the covariates P, L, G, H, and A on the four types of ratings revealed an effect of P on the CA and I ratings of surprise (CA: F r = 4.16, p = 0.03; I: F r = 15.58, p = 5.74e −05). There was also an effect of A (i.e. P + L + G + H + A model) on the V ratings of surprise (F r = 10.35, p = 0.001; a Kendall’s tau test did not support this effect: τ = 0.005, p = 0.89). All the other models were not significant; p > 0.05.

Emotion word SADNESS: Analyses of the effects of covariates P, L, G, H, and A on the four types of ratings revealed an effect of P on the concreteness (C), CA, I, and V ratings of sadness (C: F r = 13.04, p < 0.001; CA: F r = 29.77, p = 2.68e −08; I: F r = 26.10, p = 1.92e −07; V: F r = 29.96, p = 2.43e −08). There was also an effect of A (i.e. P + L + G + H + A model) on the C ratings of surprise (F r = 4.30, p = 0.03; τ = 0.09, p = 0.01), an effect of L (i.e. P + L model) on the CA ratings (F r = 18.69, p = 0.003), and an effect of G (i.e. P + L + G model) on the I ratings (F r = 4.39, p = 0.03; a Cucconi test did not support this effect: MC = 1.45, p = 0.23). All the other models were not significant; p > 0.05.

Word allocation task

The results showed that whilst no one factor had effects on the X-axis data, in the case of the Y axis, regardless of language, gender, handedness and age, joy was located in upper spatial locations and sadness in lower spatial locations. The neutral emotional concept of surprise was located mid-way between joy and sadness. In regard to the language factor, results were in line with those reported by Marmolejo-Ramos et al. (2013) in that there were some differences among linguistic groups in the rating task but none in the word allocation task.

Robust linear regression on the X-axis data

In none of the models, the t values associated with the β values were significant (all p > 0.05). The variability accounted for by each model was 0.02 % (P), 0.23 % (P + L), 0.28 % (P + L + G), 0.45 % (P + L + G + H), 0.45 % (P + L + G + H + A), and 0.66 % (P + L + G + H + A + W). A comparison of the models further suggested no improvement of the fitness of the hierarchical models to the X-axis data; P model: F r = 0.17, p = 0.66; P + L model: F r = 0.34, p = 0.99; P + L + G model: F r = 0.44, p = 0.49; P + L + G + H model: F r = 1.40, p = 0.22; P + L + G + H + A model: F r = 0.01, p = 0.88; and P + L + G + H + A + W model: F r = 0.54, p = 0.90.

The overlap between the confidence intervals for the words when located in the X axis suggests that they are not positioned differently on the horizontal plane (see Fig. 2b). Indeed, although there was variability in the location of the words (MADjoy = 5.93, MADsurprise = 5.93, and MADsadness = 8.89), the median location for the three words was −1.Footnote 5

Effects of covariates on the horizontal position of each emotion word

Analyses of the effects of the covariates participant (P), language (L), gender (G), handedness (H), and age (A) on the X values (e.g. effects of those covariates on the values in the X axis when the word was joy) showed that there were non-significant results in the X axis (p > 0.05 in all models for each of the three words).

Robust linear regression on the Y-axis data

The same analysis described above for the data in the X axis was performed for the data in the Y axis. Only in the last model, the t values associated with the β values were significant; e.g. β surprise = −2.67 (t = −6.66, p < 0.001), and β sadness = −12.14 (t = −29.77, p < 0.001). The variability accounted for by each hierarchical model was 0.01 % (P), 0.26 % (P + L), 0.28 % (P + L + G), 0.32 % (P + L + G + H), 0.37 % (P + L + G + H + A), and 49.88 % (P + L + G + H + A + W). A comparison of the models suggested an improvement of the fitness of the hierarchical models to the Yaxis data only when the predictor W was added; P model: F r = 0.19, p = 0.66; P + L model: F r = 0.40, p = 0.99; P + L + G model: F r = 0.18, p = 0.66; P + L + G + H model: F r = 0.29, p = 0.58; P + L + G + H + A model: F r = 0.46, p = 0.49; and P + L + G + H + A + W model: F r = 373.43, p < 0.001.

The non-overlap between the confidence intervals for the words when located in the Y axis suggests that they are positioned differently on the vertical plane (see Fig. 2b). There was some variability in the location of the words (MADjoy = 2.96, MADsurprise = 4.44, and MADsadness = 4.44), and they had notably different locations on the Y axis. Specifically, whilst joy was located in the upper end of the square {Mdnjoy = 7 [95 % CI (6.46, 7.53)]}, sadness was positioned on the lower end of the square {Mdnsadness = −7 [95 % CI (−7.58, −6.41)]}, and surprise was placed in between the other two words {Mdnsurprise = 3 [95 % CI (2.58, 3.41)]}.

Effects of covariates on the vertical position of each emotion word

There was an effect of P in the cases of joy and sadness only (joy: P model: F r = 2.03, p = 0.14; sadness: P model: F r = 16.46, p = 3.54e −05), such that some participants allocated these words more upward/downward than others (all other models in joy and sadness had p > 0.05). There was an effect of H in the case of surprise only (P + L + G + H model: F r = 4.25, p = 0.03; a Cucconi test confirmed this difference: MC = 3.32, p = 0.03), such that right-handers allocated this word higher {Mdn = 3, [95 % CI (2.46, 3.53)]} than left-handers {Mdn = 2, [95 % CI (0.58, 3.41)]}. All the other models in surprise had p > 0.05 (see Appendix for supplementary results).

Discussion and conclusions

The aim of the rating task was to characterise the words under scrutiny in their concreteness, context availability, imageability, and valence dimensions. The word allocation task aimed to determine the allocation of these three emotions in space by various linguistic groups. Overall, the results suggest that the valence of the emotion words joy, surprise and sadness (as indicated on the valence dimension in the rating task) is metaphorically mapped onto the vertical plane, such that joy is located in upper locations, sadness is located in lower locations and surprise is located mid-way between the other two words (word allocation task).

The results of the rating study agree with previous research in which the concreteness, imageability, context availability, and valence of the words joy, sadness and surprise have been assessed (see Table 1; Fig. 2a); however, the present results add novel details. It was found that the three words have similar levels of concreteness and are rated as mildly concrete. Although the results showed that, overall, the three words have medium-to-high levels of imageability, as previous studies have indicated, it was further found that joy is more imageable than sadness, and sadness is more imageable than surprise. In addition, the finding that joy rated higher than surprise in regards to context availability is in line with Marmolejo-Ramos et al. (2014; Tables 1, 2) in which participants generated less emotional contexts for surprise than joy. The present results thus corroborate the findings of these authors via a rating task. Finally, in agreement with past research, joy was rated as more positive than sadness, and surprise was rated mid-way between the other two emotions. However, the median valence rating of surprise {Mdn = 5.1 [95 % CI (4.98, 5.21)]} indicates that this word is regarded as neither positive nor negative. This is a novel finding since it empirically demonstrates that surprise is a rather neutral emotion concept. It is interesting to note that we found an effect of language in the rating task, but such a factor did not mediate the word allocation task (see below).

The results of the word allocation study confirm that highly positive emotions such as joy are mapped onto upper spatial locations, whilst highly negative emotions such as sadness are mapped onto lower spatial locations. This finding is in keeping with research suggesting a metaphorical association between emotion stimuli and the vertical spatial axis (e.g. Ansorge, & Bohner, 2013, Ansorge et al., 2013; Damjanovic, & Santiago, 2016; Marmolejo-Ramos et al., 2014; Meier, & Robinson, 2004; Sasaki et al., 2015, 2016; Xie et al., 2014, 2015). Indeed, the average location of the words on the horizontal axis was no different, and handedness had no effect, which lends extra support to the idea that the vertical plane is more prominent than the horizontal plane for the mapping of emotions onto space as originally suggested by Marmolejo-Ramos et al. (2013). Interestingly, whilst in the rating task, the language and age variables had an influence on the words’ ratings, this was not the case in the word allocation task. As shown in Fig. 1, in the study conducted by Marmolejo-Ramos et al. (2013), the average ratings of words tend to vary across linguistic groups, and as shown by Bird, Franklin and Howard (2001), age of acquisition can correlate with, for instance, the imageability ratings of words. Thus, concluding that language and age have an effect on the ratings of emotion words is not surprising [see for example, Evans, & Levinson (2009) arguments regarding linguistic diversity]. However, in the word allocation task, these factors, along with the factors gender and handedness, did not have any effect. The results of the word allocation task hence suggest that, regardless of language, gender, handedness and age, positive words are located in upper spatial areas and negative words are located in lower spatial areas. This result corroborates the findings from Marmolejo-Ramos et al. (2013).

The novel finding is that surprise was located mid-way between sadness and joy in the vertical axis. Although the median location of surprise on the vertical axis was not exactly zero, it was located rather close to it {Mdn = 3 [95 % CI (2.58, 3.41)]}. Numerically speaking, the exact mid-way location in the vertical axis between where joy and sadness were located is zero, and the exact mid-way location between zero and where joy was located is 3.5 (see Fig. 2b). Thus, it could be said that a location above 3.5 should be an indication of the word leaning towards positivity, whilst a value on the Y axis below 3.5 should be an indication of the word leaning towards neutrality. Given that the upper arm of the CI around the median rating of surprise did not cover 3.5, it is then reasonable to assert that this emotion tends to be located mid-way between joy and sadness in the vertical spatial plane. This result thus provides further evidence that the neutral emotional valence of surprise (as found in the rating task) is reflected in this emotion being mapped mid-way between upper and lower locations onto the vertical plane.

Why is vertical space so salient? It has been argued that locations on the horizontal plane (i.e. left and right) are less salient than locations on the vertical plane (i.e. up and down) since people tend to confuse East–West more than North–South (see Mark, & Frank, 1989, as cited in Marmolejo-Ramos et al., 2013). Locations on the horizontal plane are less noticeable as it is equally easy to look left or right. Locations on the vertical plane, on the other hand, are clear in that locations above eye level are immediately observable and, therefore, more likely to be preferred (i.e. likely to be associated with positive valence) than locations below eye level (see also Freeman, 1975, as cited in Marmolejo-Ramos et al., 2013; see also studies on locatives and comparatives by Clark, Carpenter, & Just, 1973). It is, thus, likely that a mapping of positive-valenced concepts (concepts that refer to events, objects and people) onto upper spatial locations is strongly influenced by bodily configuration and experience rather than language, which labels such experiences.

Note that all studies on the valence–space metaphor focus on mapping of the opposite ends of the affective continuum of a concept (e.g. positive emotions vs negative emotions) onto the opposite ends of the vertical plane (e.g. high spatial location vs low spatial location). The results have consistently shown that high spatial locations are associated with positivity and low spatial locations are associated with negativity (see Clark et al, 1973, and other references cited herein). No previous studies have investigated the location on the vertical plane of neutrally valenced concepts. Our study is the first to show that such concepts, exemplified here with the case of surprise, are associated with the mid-point (between joy and sadness) in the vertical plane.

It is worth noting that focused analyses showed that there were no language effects on the allocation of the three words in the X and Y axes in the first WAT task, but there was a language effect on the allocation of joy in the X axis and the allocation of sadness in the Y axis in the second WAT task (see Appendix). This finding can be due to simple linguistic variability (see Evans, & Levinson, 2009). Interestingly, no covariate had an effect on the allocation of surprise in the vertical and horizontal planes. This suggests that whilst there could be some degree of variability across languages as to the allocation of joy and sadness in 2D space, there seems to be less variability as to the spatial location of surprise. In other words, surprise seems to be zeroed in a specific vertical and horizontal coordinate.

This novel result indicates that the location of a concept on the vertical plane mimics the concept’s degree of emotional valence regardless of linguistic background. Indeed, it could be entertained that the location of any stimulus on the vertical plane should mimic the stimulus’ degree of emotional valence. That is, the more positively valenced the stimulus, the higher in vertical space it would be located; likewise, the more negatively valenced the stimulus, the lower it would be located. By the same token, a stimulus that is neither too positive nor too negative would tend to be located towards the middle in the vertical plane, as surprise was found to be here. A recent study by Sasaki et al. (2015) could be modified to verify this claim. Sasaki et al. (2015) had participants evaluate emotional images. Before evaluation responses were made, the participants had to swipe the display upward or downward, and then, they made an evaluation of the image’s valence. Surprisingly, when participants swiped upward before the evaluation, a more positive evaluation was given to images, and vice versa. Instead of swiping towards a fixed upper or lower area on the screen, as Sasaki et al. did, participants could be required to freely drag the image along a vertical line which would allow for measurement of the distance from the centre of the screen to the place where the emotional stimulus was dragged to. Then the participants would rate the valence of the stimulus. Based on the current findings, it would be hypothesised that the upper/lower the stimulus is located on the vertical axis on the screen, the more positive/negative it would be rated. This finding would support the claim made by Sasaki et al. (2015) that close temporal associations between somatic information and visual events leads to their retrospective integration and provide further credibility to the findings reported herein.

Whilst the emotions joy and sadness have distinctive sensorimotor correlates, these correlates are very broad in the case of surprise. That is, whilst clapping of hands and head hanging on contracted chest are some of the bodily correlates of joy and sadness, respectively (see Wallbott, 1998), surprise manifests in visual search, eye-brow raising, eye-widening, jaw drop, among others (see Reisenzein et al., 2012). However, given that surprise seems to be a neutral emotion, its bodily and sensorimotor correlates can be difficult to pinpoint, and this situation could lead this emotion to not be regarded as an emotion but as a cognitive state (Reisenzein et al., 2012). Given current theories arguing that there are degrees in the embodiment of language and emotions (e.g. Chatterjee, 2010; Marmolejo-Ramos, & Dunn, 2013; Meteyard, Rodríguez, Bahrami, & Vigliocco, 2012), it is possible that as the more neutral a concept (and the object it refers to) becomes, the lower the degree of sensorimotor properties. Such low activation of sensorimotor correlates and neutral valence can be metaphorically mapped onto space in vertical locations that are near the middle instead of upper or lower areas. Moreover, the metaphorical mapping of emotions onto space has so far been limited to the two-dimensional space (i.e. up–down in the Y Cartesian coordinate and left–right in the X coordinate). It is reasonable to suggest that if valenced concepts were to be allocated in a three-dimensional physical space, highly positively valenced concepts would be placed near the body, highly negatively valenced concepts would be placed far away from the body, and neutrally valenced concepts mid-way between these two. That is, valenced concepts should also have different locations on the Z Cartesian coordinate. This is merely conjectural, and further empirical testing is needed to explore this notion.