The words of language are carriers of complex features. These features may come within the scope of surface aspects (number of letters, number of syllables, etc.), use-related aspects of language (usage frequency, grammatical category, etc.), or aspects relating to in-depth treatment (concreteness, imagery, emotions, etc.). The emotional characterization of words permits the control of experimental materials according to variables that influence complex cognitive processes (Bargh, Chaiken, Raymond, & Hymes, 1996; Klauer, 1997) and authorizes the establishment of an empirical base to determine the valence of emotional texts (Leveau, Jhean-Larose, & Denhière, 2011). Two approaches dominate this characterization. The first approach considers the discrete emotion to which the words refer; the other approach focuses on evaluating these terms along one or more emotional dimensions. In a dimensional representation of emotions, valence (from unpleasant to pleasant) and arousal (from calm to excited) are usually considered to be primary characteristics (Russell, 2003). In French, seven norms characterize words in valence (Bonin, Méot, Aubert, Niedenthal, & Capelle-Toczek, 2003; Messina, Morais, & Cantraine, 1989; Niedenthal, Auxiette, Nugier, Dalle, Bonin, & Fayol, 2004; Painchaud, 2005; Syssau & Font, 2003; Vikis-Freibergs, 1976) or in valence and arousal (Leleu, 1987). Other norms for valence or arousal characteristics exist in English (Bradley & Lang, 1999), German (Võ, Conrad, Kuchinke, Urton, Hofmann, & Jacobs, 2009), Spanish (Redondo, Fraga, Pradròn, & Montserrat, 2007), Italian (Zammuner, 1998), and Finnish (Eilola & Havelka, 2010). In a discrete representation of emotions, some emotions have a specific status and are considered basic because (1) they exist in all cultures and in some higher animals, (2) they are universally associated with characteristic facial expressions, or (3) they have an adaptive role for the individual or for the species.
Thus, basic emotions can be combined to form more complex emotions (Johnson-Laird & Oatley, 1989). Usual negative basic emotions are fear, anger, sadness, and disgust, and one usual positive basic emotion is happiness. Surprise is also considered as a basic nonvalenced emotion by some authors (Ekman, 1992).

The conceptualization of words denoting emotions, like other words of the French language and, in particular, those relating to abstract concepts, is constructed through the use that individuals make of these words. The experiential and linguistic environments of a term are therefore crucial to providing it with meaning and, in particular, with emotional characteristics. If we refer to Western languages, it may not be reasonable to consider the French word désir as the equivalent of the English desire, of the Italian desiderio, of the Spanish deseo, or of the German Wunsch. Conversely, we can consider an intercultural and interlingual invariance, based on the homogeneity of the mental representation associated with a term (Johnson-Laird, 1983), and assume that emotional characteristics are stable regardless of the language in which this term is expressed. Zammuner (1998) compared the results of evaluations of emotional words in English and Italian and observed a strong correlation between the ratings of the valence of words in Italian and English (Cronbach α = .88). However, few studies have extended this comparison to other Western languages, and to our knowledge, no comparison has ever been made of how people from different countries evaluate the characteristics of emotional words of the same language. Finally, if the conceptualization of the emotional components of a word is highly based on experiential data, what is the stability of these components over time?

Three objectives are addressed by this article. First, we will examine the validity of the collected data according to different criteria. If we assume that the emotional characteristics of the words are part of the mental representation, we can expect a strong similarity between the evaluations of these characteristics by humans, independent of their language, their culture, the date the judgment was made, or even the type of emotional analysis performed by participants (intrinsic characteristics of words or personal experience while reading). If a similarity is observed between the results of these studies, the construction of a metanorm will become possible and, thus, constitute an important linguistic resource for further analysis. Second, we will examine the possibility of using this metanorm for the study of language and emotions. Three tests are performed. The first test consists in identifying basic emotions by valence and arousal coordinates of words denoting these emotions, the second involves recognizing the emotional orientation of texts judged by humans as positive or negative, and the third evaluates the emotional intensity of positive and negative texts. Finally, we will introduce EMOVAL/SEMOTEX, a Web application for text analysis that uses the metanorm.

From norms to metanorm

In this section, we present the 12 norms cited in the introduction, with the objective of comparing the data provided by each of them. If similar emotional valence and/or arousal values are obtained from different norms for similar words, we will be justified in treating the 12 norms as one metanorm. First, we will present the main characteristics of the 12 considered norms. Second, we will describe the results of comparing the norms along both valence and arousal dimensions.

Presentation of emotional norms

The main characteristics of the 12 norms cited in the introduction are shown in Table 1. Seven norms concern French words (Bonin et al., 2003; Leleu, 1987; Messina et al., 1989; Niedenthal et al., 2004; Painchaud, 2005; Syssau & Font, 2003; Vikis-Freibergs, 1976), and 5 norms concern non-French words (Bradley & Lang, 1999; Eilola & Havelka, 2010; Redondo et al., 2007; Võ et al., 2009; Zammuner, 1998). Four norms concern the emotional characteristics of the word itself (Nos. 4, 5, 7, and 9), and 8 norms concern the feeling of the participant when reading the word (Nos. 1, 2, 3, 6, 8, 10, 11, 12). Only 5 norms provide arousal characteristics.

Table 1 Presentation of the main characteristics of emotional norms

Comparison of norms and building of a metanorm

For the French norms, all words were retained for the norm comparison. For non-French norms, words were automatically translated to French and then back to their original language, using online dictionaries (http://www.reverso.net). Only when this double translation produced the identical original words were words from non-French norms retained. One hundred fifty-three words were selected from Zammuner (1998), 564 words from ANEW (Bradley & Lang, 1999), 563 words from Redondo et al. (2007), 146 words from the BAWL–R (Võ et al., 2009), and 93 from Eilola and Havelka (2010).
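The double-translation filter described above can be sketched as follows. This is a minimal illustration only: the two lookup dictionaries are toy data standing in for the online dictionary, and the function name `round_trip_filter` is hypothetical rather than part of the original procedure.

```python
def round_trip_filter(words, to_french, from_french):
    """Keep only words whose translation to French and back yields the same word.

    `to_french` and `from_french` are plain dictionaries standing in for the
    online dictionary lookups (an assumption made for this sketch).
    """
    kept = {}
    for word in words:
        french = to_french.get(word)
        if french is None:
            continue  # no French translation available
        if from_french.get(french) == word:
            kept[word] = french  # round trip returned the original word
    return kept


# Toy dictionaries (illustrative only, not taken from any of the norms).
EN_TO_FR = {"joy": "joie", "fear": "peur", "dread": "peur"}
FR_TO_EN = {"joie": "joy", "peur": "fear"}

retained = round_trip_filter(["joy", "fear", "dread"], EN_TO_FR, FR_TO_EN)
```

In this toy case, "dread" is dropped because its French translation maps back to "fear" rather than to the original word.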

For each norm, the common words were identified, and the correlations between the evaluations of these words were calculated on the scales of valence and arousal. The results presented in Table 2 indicated very high and significant correlations (from .77 to .99; all ps < .02) between the valences.
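As an illustration of this comparison, the sketch below computes a Pearson correlation over the words shared by two norms. The ratings are invented toy values, and the function names are hypothetical; the text does not state which correlation coefficient was used, so Pearson's r is an assumption here.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant ratings")
    return cov / (sd_x * sd_y)


def correlate_norms(norm_a, norm_b):
    """Return (r, number of common words) for two word -> rating dictionaries."""
    common = sorted(set(norm_a) & set(norm_b))
    if len(common) < 3:
        raise ValueError("too few common words to correlate")
    r = pearson([norm_a[w] for w in common], [norm_b[w] for w in common])
    return r, len(common)


# Toy ratings on two different scales (illustrative only).
norm_a = {"joie": 8.0, "peur": 2.0, "table": 5.0, "mort": 1.0}
norm_b = {"joie": 6.5, "peur": 2.5, "table": 4.0, "ciel": 5.5}

r, n_common = correlate_norms(norm_a, norm_b)
```

Because the correlation is invariant under linear rescaling, norms collected on different rating scales can be compared directly in this way.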

Table 2 Correlations between the valence norms (all ps < .02), with the number of common words between two norms presented in parentheses

Correlations between arousal values were lower than those observed between the values of emotional valence, although they remained significant (from .28 to .82) (see Table 3).

Table 3 Correlations between arousal judgments of each norm (** denotes p < .01; * indicates the correlation was not significant)

These results demonstrate the relevance of constituting a French metanorm based on norms from different Western language resources, from different francophone countries, and at different dates and considering either intrinsic evaluation of words or personal feeling while reading. We name this metanorm EMONORM.

To constitute the EMONORM metanorm, the valences of the words characterized within the 12 norms were centered and rescaled to the interval [−1, +1], and the arousal values from the 5 norms providing them were rescaled to the interval [0, 1]. Both transformations are linear. For each word, mean values were calculated for each dimension. The constructed metanorm comprises 6,383 words characterized in valence and 4,345 words characterized in arousal.
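The construction can be sketched as a linear rescaling followed by a per-word average. The scale bounds and ratings below are invented for illustration, the function names are hypothetical, and giving each norm equal weight in the mean is an assumption of this sketch.

```python
def rescale(value, old_min, old_max, new_min, new_max):
    """Linearly map a value from [old_min, old_max] to [new_min, new_max]."""
    return new_min + (value - old_min) * (new_max - new_min) / (old_max - old_min)


def build_metanorm(norms, new_min, new_max):
    """Average rescaled ratings per word.

    `norms` is a list of (ratings, scale_min, scale_max) tuples, where
    `ratings` maps each word to its rating on that norm's original scale.
    """
    collected = {}
    for ratings, scale_min, scale_max in norms:
        for word, value in ratings.items():
            scaled = rescale(value, scale_min, scale_max, new_min, new_max)
            collected.setdefault(word, []).append(scaled)
    return {word: sum(vals) / len(vals) for word, vals in collected.items()}


# Two toy valence norms on different scales (illustrative values only).
valence_norms = [
    ({"joie": 8.0, "peur": 2.0}, 1, 9),      # hypothetical 1-to-9 scale
    ({"joie": 2.4, "deuil": -3.0}, -3, 3),   # hypothetical -3-to-+3 scale
]

valence = build_metanorm(valence_norms, -1.0, 1.0)
```

The same function yields arousal values when called with the target interval `0.0, 1.0`.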

Tests of EMONORM

Three tests were performed to assess the relevance of EMONORM. The first test is devoted to the identification of basic emotions on a valence–arousal space, based on the usual words associated with these emotions. The objective of the second test is to identify the valence orientation (positive vs. negative) of texts in comparison with human judgments. Finally, considering a set of positive and negative valence texts, the third test focuses on analyzing the intensity of the expressed emotion.

Test 1: Identification of basic emotions in a valence–arousal space

In previous studies, Stevenson and colleagues (Stevenson, Mikels & James, 2007; Stevenson et al., 2010) used regression to relate the emotional category membership of words taken from the ANEW (Bradley & Lang, 1999) to their valence, arousal, and dominance values. The objective of our first test was to determine whether basic emotions can be distinguished in a bidimensional valence–arousal space using the EMONORM data.

Materials

Words referring to happiness (17), fear (22), anger (22), sadness (24), disgust (17) and surprise (11), for which valence and arousal characteristics are available in EMONORM, were selected from the French lexical resources of Piolat and Bannour (2009).

Analyses, results, and conclusion

Means and standard deviations of the valence and arousal of words were computed for each emotion category (Fig. 1). Two ANOVAs were performed with emotion as a categorical factor and valence and arousal as dependent variables. Main effects were observed on valence, F(5, 107) = 69.44, p < .001, and on arousal, F(5, 84) = 3.40, p < .01. For valence, planned comparisons revealed significant differences between surprise and happiness, F(1, 107) = 30.19, p < .01, and between surprise and all negative emotions, F(1, 107) = 64.39, p < .001; within the negative emotions, sadness is significantly different from the other negative emotions, F(1, 107) = 6.17, p < .02. For arousal, the planned comparisons revealed two significant differences: first between sadness and the group fear, anger, and disgust, F(1, 84) = 7.72, p < .01, and second between anger and the group fear and disgust, F(1, 84) = 8.17, p < .01.

Fig. 1 Position and standard error for six basic emotions in a valence–arousal space according to the EMONORM data

Figure 1 shows that five groups of emotions can be significantly distinguished from valence and arousal: happiness, surprise, sadness, fear–disgust, and anger.

Test 2: Evaluation of positive and negative texts

The objective of the second test is to evaluate the capability of identifying the orientation of texts using EMONORM, in comparison with human judgments.

Materials

For the human judgment evaluation, 15 texts, including 7 positive ones (surprise [1], happiness [2], confidence [2], and desire [2]) and 8 negative ones (sadness [2], fear [2], anger [2], disgust [1], and surprise [1]), were constructed using the Piolat and Bannour (2009) lexical resources. The texts were an average of 45.4 words long (SD = 12.1) (see Table 4).

Table 4 Total number of words, content words, and words analyzed for each text of test 2

Participants and procedure

For the human evaluation, 75 students earning a master’s degree in education from the University of Orléans (France) participated in the study. Participants were asked the extent to which the positive texts (surprise, happiness, confidence, and desire) were considered as pleasant and the negative texts (sadness, fear, anger, disgust, surprise) were considered as unpleasant, judged on a 4-point scale ranging from 1 (low pleasantness/low unpleasantness) to 4 (very high pleasantness/very high unpleasantness). It was not possible for a participant to rate a positive text as unpleasant or to rate a negative text as pleasant. Positive texts were assigned a positive value, and negative texts were assigned a negative value. For an easy comparison with EMONORM, the results were linearly reduced to a [−1; 1] interval.

To evaluate the texts using EMONORM, we used the procedure proposed by Heise (1965), consisting of computing the mean valence for all words belonging to both the text and EMONORM (Fig. 2). On average, 24.3 % of the words in the texts were analyzed, ranging from 11.4 % (5 words out of 44 for one of the surprise texts) to 37.5 % (21 words out of 56 for one of the happiness texts).
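A minimal sketch of this computation is given below: the mean valence is taken over the words found both in the text and in the norm, and the proportion of analyzed words is reported alongside it. The tokenization rule, the function name, and the two-word norm are assumptions made for illustration; they do not reproduce EMONORM values.

```python
import re


def mean_valence(text, norm):
    """Return (mean valence of matched words, proportion of words matched).

    Words are taken to be maximal runs of letters in the lowercased text,
    which is a simplifying assumption of this sketch.
    """
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    matched = [norm[token] for token in tokens if token in norm]
    if not matched:
        return None, 0.0  # no word of the text belongs to the norm
    return sum(matched) / len(matched), len(matched) / len(tokens)


# Toy norm with invented values on the [-1, +1] valence scale.
toy_norm = {"joie": 0.8, "peur": -0.6}

score, coverage = mean_valence("La joie et la peur", toy_norm)
```

Here two of the five words belong to the toy norm, so the score rests on 40 % of the text, which is comparable in spirit to the coverage figures reported above.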

Fig. 2 Mean valence rating of texts for each emotion based on EMONORM computation and on human judgments

Analyses, results, and conclusion

The correlation between human judgment and the automatic evaluation is significant (r = .96, p < .01). The correlation is also significant within positive (r = .69, p < .01) and negative (r = .76, p < .02) emotion texts. In short, on the valence dimension, EMONORM mimics the human judgments.

Test 3: Evaluation of the intensity of negative or positive texts

Whereas the second test compared human judgment with EMONORM computation on the basis of valence judgments, the objective of the third test was to compare human judgment with EMONORM computation on the basis of the intensity of categorical judgment.

Materials

To perform the third test, a corpus of texts referring to positive and negative emotional categories was needed. Sixteen emotional categories referring to eight positive emotions (love, happiness, calm, courage, liveliness, relief, kindness, satisfaction) and eight negative emotions (hate, suffering, tension, fear, depression, trouble, aggressiveness, frustration) were considered. Six keywords selected from the Piolat and Bannour (2009) lexical resources were assigned to each of the 16 emotional categories. Finally, 16 sets of 50 texts (400 positive texts and 400 negative texts) 80–150 words long, referring to each emotional category, were extracted from a modern literature corpus (Denhière, 2011) using the keywords.

Participants and procedure

For the human judgment, 51 students (16 males and 35 females) pursuing a master’s degree in education from the University of La Sorbonne (France) participated in the study. The text judgment was performed via the Internet. After participants had connected to the Web site, the instructions asked them to indicate, on a 4-point nominal scale (very little, little, much, and very much), to what extent the text expressed the indicated emotion. The emotion category in which participants had to evaluate the texts was permanently displayed on the top of the screen. The evaluation started with 8 practice texts, followed by 50 experimental texts. In case of interruption, participants could, if desired, reconnect and continue the evaluation from the last text judged after the 8 practice texts were reproposed for evaluation. The 800 texts and their ratings are available as supplemental materials in the journal’s supplemental archive.

For the judgment using EMONORM, each text was evaluated on the valence and arousal scales using the Heise (1965) procedure.

Analyses, results, and conclusion

Each text was rated by at least 3 participants (M = 3.54, SD = 0.84). The human judgment was coded with a numeric value from 0 for very little to 3 for very much. Computing the mean score for each text produced an emotion intensity value ranging from 0 to 3. For each of the eight positive emotion categories, the 50 texts illustrating that category were divided into four quartiles according to the intensity of emotion based on human judgments. Thus, for each positive emotion category, quartile 1 contains 13 low-intensity pleasant texts, and quartile 4 contains 13 high-intensity pleasant ones. Similarly, for each of the eight negative emotions, the 50 corresponding texts were divided into four quartiles according to the intensity of human judgments. So, for each negative emotion category, quartile 1 contains 13 low-intensity unpleasant texts, and quartile 4 contains 13 high-intensity unpleasant ones.
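The coding and the quartile split can be sketched as follows. The rule used here to obtain groups of 13, 12, 12, and 13 texts (extra texts go to the outer quartiles first) is an assumption consistent with the group sizes reported above, and the scores are generated toy data.

```python
# Numeric coding of the four response labels, as described in the text.
CODES = {"very little": 0, "little": 1, "much": 2, "very much": 3}


def mean_intensity(labels):
    """Mean coded intensity (0 to 3) of the labels given to one text."""
    return sum(CODES[label] for label in labels) / len(labels)


def split_into_quartiles(mean_scores):
    """Split text ids into four groups ordered by increasing mean intensity.

    When the number of texts is not a multiple of 4, the extra texts are
    assigned to the outer quartiles first (an assumption of this sketch).
    """
    ranked = sorted(mean_scores, key=mean_scores.get)
    base, extra = divmod(len(ranked), 4)
    sizes = [base] * 4
    for index in [0, 3, 1, 2][:extra]:
        sizes[index] += 1
    quartiles, start = [], 0
    for size in sizes:
        quartiles.append(ranked[start:start + size])
        start += size
    return quartiles


# Fifty toy texts with invented mean intensities between 0 and 3.
toy_scores = {f"text{i:02d}": (i % 10) / 3 for i in range(50)}
quartiles = split_into_quartiles(toy_scores)
quartile_sizes = [len(q) for q in quartiles]
```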

In both cases, an ANOVA was performed with the quartile as a categorical factor and with valence and arousal, computed using EMONORM, as dependent variables. For positive texts, results did not show a significant effect of the quartile category, either on valence, F(3, 396) = 1.40, n.s., or on arousal, F(3, 396) = 0.37, n.s. However, a significant difference was observed between quartile 1 and quartile 4 on the valence dimension, F(1, 206) = 4.37, p < .04, with a higher valence for quartile 4.

For negative texts, results showed a main effect of the quartile category both on arousal, F(3, 396) = 3.13, p < .03, and on valence, F(3, 396) = 15.88, p < .001, with an increasing valence value as the intensity of negative texts decreased.

In conclusion, EMONORM is capable, on the valence scale, of assessing the intensity of positive and negative texts with respect to human judgments. The finer discrimination is observed for negative texts.

EMOVAL/SEMOTEX: A Web interface for emotional analysis of texts

We have implemented the text evaluation using EMONORM in a Web application named EMOVAL. EMOVAL is part of the SEMOTEX web platform (http://www.semotex.fr). It provides emotional valence and arousal analyses of texts, using the Heise (1965) procedure. Before being analyzed, texts may or may not be lemmatized or stemmatized (Fig. 3). Lemmatization consists of extracting the lemma of an inflected word (e.g., “included”: “include”; “abilities”: “ability”). Stemmatization (Porter, 1980) consists of extracting the root of a word (e.g., “included”: “includ”; “abilities”: “abil”). Lemmatization or stemmatization is applied to verbs, to adjectives, to nouns, and/or to adverbs. Furthermore, if a word appears more than once in a text, EMOVAL provides the option to ignore the repetitions. In the example displayed in Fig. 3, the text will be analyzed in a lemmatized form. It is considered as referring mainly to sadness, leading to the expectation that it will be of negative valence and low arousal (see Fig. 1):

Son isolement face à cette morne plaine brune, le temps maussade et triste, la mélancolie qui semblait attachée aux pas des paysans moroses et nostalgiques, contribuaient à entretenir son ennui et son pessimisme qui risquaient de transformer sa solitude et son spleen en désespoir.

(His isolation from this dull brown plain, the gloomy bad weather, the melancholy that seemed attached to the steps of sullen and nostalgic farmers, helped to maintain his boredom and pessimism that threatened to turn his loneliness and spleen into despair.)

Fig. 3 EMOVAL/SEMOTEX Web interface for text input

The first analysis focuses on the static valence–arousal emotional characteristics of the text. It consists of (1) the computation of the mean valence and arousal of words from the EMONORM data, (2) the projection of the emotional characteristics of the words on a valence–arousal space (Fig. 4), and (3) the distribution of words within valence and arousal values (Fig. 5). For the sadness text, we observe on the right-side graph of Fig. 4 that most of the words are located in the low-valence and low-arousal space.

Fig. 4 EMOVAL/SEMOTEX window for static emotional analysis of texts

Fig. 5 EMOVAL/SEMOTEX window for dynamic emotional analysis of texts

The second analysis presents a dynamic emotional analysis of texts consisting of the valence and arousal values of words as a function of their position within the analyzed text. Results are presented in a four-graph window (Fig. 5). The left side is dedicated to the valence analysis, the right side to the arousal analysis. The upper graphs indicate the characteristics of terms (valence or arousal) as a function of their position in the text. The lower graphs indicate the number of terms within the whole text for each valence or arousal value. In the example presented in Fig. 5, we observe (1) on the upper left graph, very negative valence words separated by moderate positive valence words (positions 4, 6, 10, 11, 14, 15, 19); (2) on the upper right graph, high-arousal words in the latter part of the text (positions 10, 11, 15 and 18); (3) in the lower left graph, more negative than positive valenced words; (4) in the lower right graph, more low- than high-arousal words. These results are consistent with the negative valence and low-arousal global analysis of the sadness text.
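The two kinds of graphs described above can be approximated with the following sketch, which returns the value of each matched word by position and a count of words per value bin. It is not the EMOVAL implementation: the bin width, the function name, and the valence values are assumptions made for illustration.

```python
import math
from collections import Counter


def dynamic_profile(tokens, norm, bin_width=0.5):
    """Return (values by position, histogram of values) for one dimension.

    Positions are 1-based indices in the token list; only tokens present in
    `norm` are kept. Each histogram key is the lower bound of a bin.
    """
    by_position = [
        (position, norm[token])
        for position, token in enumerate(tokens, start=1)
        if token in norm
    ]
    counts = Counter(
        math.floor(value / bin_width) * bin_width for _, value in by_position
    )
    return by_position, dict(sorted(counts.items()))


# Toy valence values (invented, not EMONORM data).
toy_valence = {"isolement": -0.6, "morne": -0.7, "plaine": 0.2, "triste": -0.8}
toy_tokens = ["son", "isolement", "morne", "plaine", "triste"]

positions, histogram = dynamic_profile(toy_tokens, toy_valence)
```

The first result corresponds to an upper graph (value as a function of position) and the second to a lower graph (number of words per value); calling the function with arousal values would give the right-hand pair.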

In summary, the EMOVAL/SEMOTEX Web application offers a powerful tool for analyzing the dimensional emotional characteristics of texts.

General discussion

This article had a threefold objective. The first was to compare a set of 12 norms in English, French, Spanish, German, Italian, and Finnish that emotionally characterized a set of terms according to the dimensions of valence and arousal. By comparing the terms after translating them into French, we observed a strong similarity of judgments between norms despite differences in language, date, country, or instruction. This first result supports the hypothesis that the emotional components of words are included in their meaning and in their mental representation. It was therefore reasonable to construct EMONORM, a synthesis of the 12 norms comprising 6,383 words characterized in valence and 4,345 in arousal.

We then performed three tests. First, we replicated a result of Stevenson et al. (2007) showing the link between discrete and dimensional characterizations of emotions. Second, we showed a correlation between human judgments of the valence of texts and the computed orientation using EMONORM. Third, we observed the capability of EMONORM to assess the intensity of an emotion for positive and negative texts with respect to human judgments.

Last, we presented EMOVAL/SEMOTEX, a Web application that enables the research community to use EMONORM for automatic analysis of valence and arousal characteristics of texts.

In summary, the proposed metanorm EMONORM is a resource for researchers who need to construct emotionally controlled experimental materials. Beyond this, the interlingual homogeneity of word characteristics suggests that emotional information is stored in the mental representation of words, which gives reason to use the metanorm to characterize larger texts.