Psychometric properties of the benign and malicious envy scale: Assessment of structure, reliability, and measurement invariance across the United States, Germany, Russia, and Poland

The Benign and Malicious Envy Scale is a promising self-report measure forming a counterpoint to the unidimensional approach to the assessment of dispositional envy. The goals of the present study were to examine the reliability, structure, and measurement equivalence of the Benign and Malicious Envy Scale across four independent groups from the United States, Germany, Russia, and Poland. Confirmatory factor analyses demonstrated that the structure of the Benign and Malicious Envy Scale is two-dimensional and its measurement is reliable. Moreover, multigroup confirmatory factor analysis, supplemented by alignment optimization, revealed that the scale is invariant by country across all factors regardless of whether a linguistic distinction between the two envy types in the respective language exists. The results speak to the current debate about whether envy should be conceptualized as unitary or as an emotion that occurs in two distinct forms, supporting the latter view. Additionally, country-level differences in envy point to cultural differences which merit further research.


Introduction
Envy is a complex reaction resulting from a social comparison in which the individual lacks another person's subjectively superior qualities, possessions, or achievements (Parrott & Smith, 1993). Throughout its long research tradition, envy was mostly treated as an episodic construct (e.g., Parrott & Smith, 1993; Van de Ven, Zeelenberg, & Pieters, 2009), but in the field of individual differences it has been argued to arise at the dispositional level as well (Lange, Blatz, & Crusius, 2018a; Lange & Crusius, 2015a; Smith et al., 1999). Dispositional envy is generally defined as a unitary construct with two core characteristics: (1) inferiority triggered by the tendency to interpret an upward social comparison in a negative way; and (2) invidious ill will resulting from a deep, although subjective, sense of injustice (Smith et al., 1999). By this definition, envious individuals are more sensitive to unfavorable social comparisons and are more likely to react with hostility toward superior others (Parrott & Smith, 1993; Smith et al., 1999; Smith & Kim, 2007).
One of the most popular measures of envy is the eight-item Dispositional Envy Scale (DES; Smith et al., 1999). It is found to be a reliable, internally consistent measurement tool capturing envy as a unidimensional construct focused on the feeling of inferiority, resentment, and hostility (Lange, Weidman, et al., 2018b; Smith et al., 1999; Smith & Kim, 2007). The DES is characterized by good psychometric properties, i.e., very good reliability, acceptable support for a one-factor solution verified via confirmatory factor analysis (CFA), and construct validity. Within research on envy, other measurement tools are also applied, though each defines envy in a similar manner to the DES. For example, the 10-item envy subscale of the Vices and Virtues Scale (Veselka, Giammarco, & Vernon, 2014) captures envy as a personality trait classified within the idea of "seven deadly sins", as inspired by the traditional Christian classification of basic human sins and shortcomings, which in turn stand in opposition to seven virtues. In this vein, envy is opposed to the virtue of kindness and is defined by "an overwhelming sense of resentment, where individuals wish for others to be deprived of the things that they themselves lack" (p. 76). A different approach to dispositional envy is presented within the Domain-Specific Envy Scale (Rentzsch & Gross, 2015), which defines envy as a "stable tendency to experience intense unpleasant feelings when being confronted with negative social comparison" (p. 531). This comes earlier in the process of emotion generation (thus the measure is not focused on its outcomes) and might be specified within three domains relevant to eliciting a further envious reaction: attraction, competence, and wealth. Despite some subtle differences, taken together, these measures of dispositional envy are reliable and appear valid in their measurement.
What unites them is that they conceptualize envy as a dispositional unitary construct grounded within the sensitivity to negative social comparisons (Rentzsch & Gross, 2015;Smith et al., 1999;Veselka et al., 2014).

Two faces of envy
In contrast to the unidimensional approach, recent research on envy, originating from analyzing it at the state level (as episodic envy), distinguishes two forms, benign and malicious, captured under the Dual Envy Theory (Lange, Weidman, et al., 2018b; Van de Ven, 2016; Van de Ven et al., 2009). The theory posits that both forms of envy arise from upward social comparisons in personally important domains that raise a threat to the self-view of an individual (Van de Ven, 2016). However, whereas the unidimensional approach suggests one kind of reaction which is particularly negative and hostile (Cohen-Charash & Larson, 2016; Smith et al., 1999; Smith & Kim, 2007), the two-dimensional theory puts forward that there are two distinct forms of envy reflecting two ways of dealing with the ego threat: either by improving one's own position in the case of benign envy or by depreciation of the superior other in the case of malicious envy (Van de Ven et al., 2009; Van de Ven, 2016). These forms relate to different appraisal dimensions (Lange, Crusius, & Hagemeyer, 2016a; Van de Ven et al., 2012), cognitions, and feelings, and distinct action tendencies aiming at self-improvement versus harming superior others (e.g., Crusius & Lange, 2014; Van de Ven et al., 2009). Particularly noteworthy is the fact that research on the conceptualization of envy is very dynamic, and new concepts that integrate years of existing research are emerging. An example of such a unifying approach is the recently developed Pain-driven Dual Envy Theory (Lange, Weidman, et al., 2018b), which combines two theories and implies that the pain, which is rather episodic and results directly from an upward comparison (the core component of the Pain Theory of Envy), is a common factor of two enduring and independent envy kinds assimilated from the Dual Envy Theory, i.e., benign and malicious envy (Lange, Weidman, et al., 2018b).
The two-dimensional approach finds support in lexical analyses (Falcon, 2015; Van de Ven et al., 2009). In particular, the fact that envy was at first analyzed as a unidimensional construct might reflect the influence of the English language, which has only one term for envy. However, many languages appear to have words or phrases identifying two distinct kinds of envy, one more benign and one more malicious; this applies, e.g., to German (beneiden and missgönnen), Russian (белая [white] and черная [black] зависть [envy]), and Polish (zazdrość and zawiść; Lange, Blatz, et al., 2018a; Van de Ven, 2016). Further research using latent class and taxometric analyses revealed that even in languages with only one word for envy it is possible to differentiate between classes of emotional experiences reflecting benign and malicious envy (Falcon, 2015; Van de Ven et al., 2009). The current study aimed to investigate whether this also holds at the dispositional level.
According to the unidimensional perspective on the dispositional character of envy, which is influenced by the frequency of engaging in social comparisons (Smith et al., 1999) and following up the Dual Envy Theory (Lange, Blatz, et al., 2018a;Van de Ven, 2016;Van de Ven et al., 2009)-the divergent types of reaction significantly influence the propensity to experiencing the two envy kinds. Thus, people in general tend to compare to each other and are touchy in case of status threat, although in terms of individual differences they vary in their propensity to react with negative affect. Consequently, such an approach led Lange and Crusius (2015a) to distinguish two distinct forms of envy at dispositional level. The theory holds that both envy kinds stem from a deep-rooted sense of inferiority, a persistent tendency to compare, as well as a painful experience during confrontation with an upward status comparison (Lange, Blatz, et al., 2018a;Lange & Crusius, 2015a;Lange, Weidman, et al., 2018b). However, there are differences between these two forms at the level of motivation, emotions, and behavior. Malicious envy, similarly to previous conceptualizations (Smith et al., 1999;Smith & Kim, 2007), is related to the deep feeling of hostility and resentment resulting from perceiving the success of a superior person as undeserved-as a consequence, it leads to several harming behaviors. Benign envy is driven by a feeling of respect and admiration toward the superior other, which motivates to self-improvement (Lange et al., 2016a;Lange, Blatz, et al., 2018a;Lange & Crusius, 2015a;Lange & Crusius, 2015b;Van de Ven, 2016;Van de Ven et al., 2009;Van de Ven et al., 2012). 
Both envy kinds are emotional traits or, in other words, a dispositional tendency to react differently depending on the status threat-whereas dispositional benign envy genuinely grows out of sensitivity to the prestige threat, the dispositional malicious envy is deeply connected to the sensitivity to the dominance threat-which results in divergent behaviors (Lange, Blatz, et al., 2018a).

Benign and Malicious Envy Scale
To date, there are two measures that capture two distinct forms of dispositional envy: the Benign and Malicious Envy Scale (BeMaS; Lange & Crusius, 2015a) and its counterpart measuring chronic benign and malicious envy in strict organizational context (Sterling, Van de Ven, & Smith, 2016). The BeMaS is a brief 10-item measure where subjects respond using a six-point Likert-type scale (1 = strongly disagree; 6 = strongly agree). As a result of exploratory factor analyses (EFA) and CFA, BeMaS was demonstrated to reveal two factors. Both subscales were found to be highly reliable and internally consistent, and in terms of convergent and discriminant validity, the full pattern of a double dissociation was supported-dispositional benign envy was predicting benign envy at the state level without any crossing relations, dispositional malicious envy respectively (Lange & Crusius, 2015a). Confronted with measures of unidimensional envy, only the dispositional malicious envy scale showed a positive relation; conversely, the dispositional benign envy scale was not related to any previous measurement of envy (Lange & Crusius, 2015a;Lange, Blatz, et al., 2018a), which suggests that it brings a new quality in this research field. Furthermore, the BeMaS can help to explain different motivations (such as hope for success and fear of failure) and relate to specific behavioral outcomes (active avoidance vs higher goal setting; Lange & Crusius, 2015a).
So far, except for the original study by Lange and Crusius (2015a) on German and American samples, the measurement of the BeMaS was explored only in two national studies in Japan and Turkey (Çırpan & Özdoğru, 2017;Sawada & Fujii, 2016). In both studies, the two-factor structure of the scale turned out to be stable; however, to date, neither of the studies replicated the BeMaS structure using more stringent techniques like CFA (Çırpan & Özdoğru, 2017;Sawada & Fujii, 2016).

Current study
The objectives of the current research were threefold. First, we analyzed the measurement model of two-dimensional dispositional envy as measured by the BeMaS in Americans, Germans, Russians, and Poles using CFA. Secondly, we verified whether the measurement model of the dispositional benign and malicious envy is invariant among the tested samples. Additionally, we tested for differences in latent mean scores of the dispositional benign envy and malicious envy across compared groups. Finally, we aimed to test whether the BeMaS reliably measures the two forms of envy in four different countries.
Based on the foregoing research purposes, we hypothesized that (1) the BeMaS covers the two-dimensional measurement model suggested in previous studies (Çırpan & Özdoğru, 2017; Lange & Crusius, 2015a; Sawada & Fujii, 2016). Although, to date, no research has investigated measurement invariance across samples from different countries, we hypothesized that (2) the structure of the BeMaS would be stable across the compared samples, because previous studies revealed the occurrence of two-dimensional envy even in languages that make no linguistic differentiation between the two envy forms (Falcon, 2015; Van de Ven et al., 2009). Lastly, we hypothesized that (3) the BeMaS test scores are reliable in measuring benign and malicious envy.

Participants and procedure
The study involved a total sample of N = 2792 residents of the United States (US), Germany, Russia, and Poland. The German research involved N = 558 (65% females) students in their twenties while the US study conducted via Amazon Mechanical Turk (MTurk) involved N = 799 (59% females) participants in their thirties. 2 The data from the other two studies were collected online from N = 708 (62% females; M age = 19.53; SD age = 1.89) Russian and N = 727 (69% females; M age = 22.19; SD age = 2.54) Polish participants aged 18-35 years. In both studies, informed consent was obtained from all individual participants included in the study. Participation was voluntary; each participant had the right to terminate at any time, and only fully completed questionnaires approved by the participant at the end of the survey were submitted to the database. All the participants were administered the BeMaS scale in their native language; for the needs of the study we prepared Russian and Polish translations (see Table 5 in the Appendix), which were generated with the authors of the original scale following the back-translation procedure. For the transparency of our results, we share data used for analyses at the OSF: https://osf.io/7jqgc/.

Statistical analyses
To test Hypothesis 1, we used CFA to assess the BeMaS structure. Within the assessment of goodness-of-fit we used the comparative fit index (CFI) and the root mean square error of approximation (RMSEA). As the BeMaS structure was not yet analyzed in different cultures, we used more liberal criteria, i.e., if the value of CFI is greater than .90 and RMSEA is less than .08, the model may be deemed as well-fitted to the data (Hu & Bentler, 1999; Marsh, Hau, & Wen, 2004).
2 To get German and US datasets, we combined the data from two publicly available databases from the OSF platform (Lange et al., 2016b; Lange & Crusius, 2017). The German sample was composed of N = 558 respondents from study DESr5 (N = 134; Lange & Crusius, 2017) and Study 5 (N = 424) previously used by Lange and colleagues (2016). The US sample consisted of N = 799 MTurk respondents from studies DESr4 (N = 218), DESr6 (N = 195), DESr7 (N = 194), and DESr9 (N = 192); these subsets were previously used in order to conduct CFA, albeit they were a part of a bigger subset mixed with German participants (Lange & Crusius, 2015a; Lange & Crusius, 2017).
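The liberal fit criteria stated for Hypothesis 1 (CFI greater than .90 and RMSEA less than .08) can be illustrated with a minimal Python sketch. It uses the textbook single-group formulas for both indices; the function names and the chi-square values in the example are hypothetical, and the sketch does not reproduce the Mplus output or the scaled estimator used in the study.

```python
import math


def rmsea(chi2: float, df: int, n: int) -> float:
    """Textbook point estimate of RMSEA for a single-group model."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model: float, df_model: int, chi2_base: float, df_base: int) -> float:
    """Comparative fit index relative to the baseline (independence) model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, d_model)
    return 1.0 if d_base == 0 else 1.0 - d_model / d_base


def acceptable_fit(cfi_value: float, rmsea_value: float,
                   cfi_min: float = 0.90, rmsea_max: float = 0.08) -> bool:
    """Apply the liberal cut-offs quoted in the text."""
    return cfi_value > cfi_min and rmsea_value < rmsea_max


if __name__ == "__main__":
    # Hypothetical values for a two-factor, ten-indicator model (df = 34).
    model_cfi = cfi(chi2_model=120.0, df_model=34, chi2_base=3000.0, df_base=45)
    model_rmsea = rmsea(chi2=120.0, df=34, n=700)
    print(round(model_cfi, 3), round(model_rmsea, 3),
          acceptable_fit(model_cfi, model_rmsea))
```

With robust or scaled chi-square statistics, software packages apply corrections that this sketch deliberately leaves out.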
To test Hypothesis 2, we tested for measurement invariance at three levels determining different outcomes: (1) configural, which refers to the accuracy of the measurement model across samples and informs whether the analyzed structure is the same across compared groups (e.g., whether the number of factors is equal); (2) metric, discerning whether factor loadings are equivalent across groups and whether the latent construct is understood in the same way; and (3) scalar, which assumes intercepts to be equal across the compared groups (Meredith, 1993). Establishing scalar invariance allows for meaningful comparison of latent mean scores between the analyzed samples (Van de Schoot, Lugtig, & Hox, 2012). The assessment at each level is based on CFI and RMSEA indices. The basic condition is that the structure at the configural level should initially demonstrate a good fit; if this model is well-fitted, then the differences in fit indices between subsequent models are compared (the differences between the configural and metric level, and between the metric and scalar level). Because cut-offs for measurement invariance vary depending on the number of factors and indicators, in the current study (which aims to test two-factor models with five indicators each) we assumed the criteria proposed by Chung and Rensvold (2002), i.e., full measurement invariance is demonstrated when the ΔCFI does not exceed .0082, while the ΔRMSEA does not exceed .009. 3 Additionally, we assessed which parameters did not hold invariance across the compared groups using alignment optimization (AO), which belongs to a group of methods targeted especially at estimating more trustworthy means even in the presence of non-invariance, and which is particularly recommended in cross-cultural research (Cieciuch, Davidov, & Schmidt, 2018). Apart from the diagnosis of the non-invariant parameters, the AO allows for latent mean comparisons if the non-invariant parameters (loadings and intercepts) do not exceed a cut-off of 25%.
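The decision rules described for Hypothesis 2 can be summarized in a short Python sketch. It assumes that the deltas are taken in the direction of worsening fit (a drop in CFI, a rise in RMSEA) when moving to the more constrained model; the class and function names and all example values are hypothetical and are not taken from the study's tables.

```python
from dataclasses import dataclass


@dataclass
class Fit:
    cfi: float
    rmsea: float


def invariance_holds(less_constrained: Fit, more_constrained: Fit,
                     max_d_cfi: float = 0.0082,
                     max_d_rmsea: float = 0.009) -> bool:
    """Compare two nested models using the cut-offs quoted in the text."""
    d_cfi = less_constrained.cfi - more_constrained.cfi        # drop in CFI
    d_rmsea = more_constrained.rmsea - less_constrained.rmsea  # rise in RMSEA
    return d_cfi <= max_d_cfi and d_rmsea <= max_d_rmsea


def alignment_trustworthy(n_noninvariant: int, n_total: int,
                          cutoff: float = 0.25) -> bool:
    """Check the 25% rule for the share of non-invariant parameters."""
    return n_noninvariant / n_total <= cutoff


if __name__ == "__main__":
    # Hypothetical fit values for the three nested models.
    configural = Fit(cfi=0.960, rmsea=0.065)
    metric = Fit(cfi=0.956, rmsea=0.064)
    scalar = Fit(cfi=0.930, rmsea=0.075)
    print("metric:", invariance_holds(configural, metric))
    print("scalar:", invariance_holds(metric, scalar))
    print("alignment:", alignment_trustworthy(n_noninvariant=8, n_total=40))
```

The sketch presupposes that the configural model itself already fits well, which is the basic condition stated above.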
To test Hypothesis 3 concerning the reliability of test scores in each sample, we estimated two kinds of coefficients which do not differ in terms of interpretation (the higher the value, the greater the reliability): McDonald's (1999) omega, i.e., omega total, an estimate of the total reliability of a test (Revelle & Condon, 2018; Zinbarg, Revelle, Yovel, & Li, 2005), and Cronbach's (1951) alpha. We decided to report alpha values due to the coefficient's popularity and for comparability purposes. Being aware of alpha's limitations resulting from assumptions of uncorrelated errors, normality, and essential tau-equivalence (Sijtsma, 2009), we based the reliability assessment mainly on the omega coefficient. Reliability indices were computed using the omega function within the psych package (Revelle, 2018) in R software (R Core Team, 2016). All structural analyses were computed in Mplus v. 7.2 software (Muthén & Muthén, 2012).
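To make the two reliability coefficients concrete, the following Python sketch computes Cronbach's alpha from raw item scores and a simplified omega from standardized loadings of a one-factor model. This is an illustration only: it is not the procedure of the psych package's omega function, which fits a more elaborate factor model, and the function names and example data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows: list[list[float]]) -> float:
    """Cronbach's alpha; each row holds one respondent's item scores."""
    k = len(rows[0])
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def omega_one_factor(loadings: list[float]) -> float:
    """Omega for a one-factor model with standardized loadings.

    Assumes uncorrelated residuals, so each item's unique variance
    equals 1 minus its squared loading.
    """
    common = sum(loadings) ** 2
    unique = sum(1 - loading ** 2 for loading in loadings)
    return common / (common + unique)


if __name__ == "__main__":
    # Hypothetical responses of five people to a five-item subscale (1-6 scale).
    responses = [
        [5.0, 4.0, 5.0, 6.0, 5.0],
        [2.0, 3.0, 2.0, 2.0, 3.0],
        [4.0, 4.0, 3.0, 4.0, 4.0],
        [6.0, 5.0, 6.0, 5.0, 6.0],
        [3.0, 2.0, 3.0, 3.0, 2.0],
    ]
    print("alpha:", round(cronbach_alpha(responses), 3))
    # Hypothetical standardized loadings for the same five items.
    print("omega:", round(omega_one_factor([0.75, 0.70, 0.80, 0.72, 0.78]), 3))
```

When loadings differ across items, alpha tends to underestimate reliability relative to omega, which is one reason the text treats omega as the primary index.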

Descriptive statistics
The descriptive statistics of the scales for all four studied samples are presented in Table 1.
The skewness and kurtosis estimates did not indicate deviations from univariate normality, as their values stayed within the range from −2 to +2 (Gravetter & Wallnau, 2014). However, both scales showed effects associated with socially desirable responding, i.e., participants scored higher on benign and lower on malicious envy. According to Mardia's test (Mardia, 1970; Korkmaz, Goksuluk, & Zararsiz, 2014), however, the datasets did not follow a multivariate normal distribution. Thus, in further structural analysis we used a maximum likelihood with scaled shifted correction as an estimator which is suitable to deal with the lack of multivariate normality.
3 The delta (Δ) symbol means the difference between the values.
Note. M = mean; SD = standard deviation; S = skewness; K = kurtosis; ωt = omega total; α = Cronbach's alpha. *p < .001
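The ±2 screening rule for skewness and kurtosis described in this section can be sketched in Python as follows. The sketch uses simple moment-based estimators and excess kurtosis (zero for a normal distribution); statistical packages often apply small-sample corrections, so their values may differ slightly. The function names and the example data are hypothetical.

```python
def central_moment(xs: list[float], order: int) -> float:
    """Population central moment of the given order."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** order for x in xs) / len(xs)


def skewness(xs: list[float]) -> float:
    """Moment-based skewness (third standardized moment)."""
    return central_moment(xs, 3) / central_moment(xs, 2) ** 1.5


def excess_kurtosis(xs: list[float]) -> float:
    """Moment-based kurtosis minus 3, so a normal distribution gives 0."""
    return central_moment(xs, 4) / central_moment(xs, 2) ** 2 - 3.0


def within_bounds(xs: list[float], bound: float = 2.0) -> bool:
    """True when both estimates stay inside the [-bound, +bound] range."""
    return abs(skewness(xs)) <= bound and abs(excess_kurtosis(xs)) <= bound


if __name__ == "__main__":
    # Hypothetical subscale scores with a mild floor effect.
    scores = [1.0, 1.2, 1.4, 1.6, 2.0, 2.2, 2.6, 3.0, 3.8, 4.6]
    print(round(skewness(scores), 2), round(excess_kurtosis(scores), 2),
          within_bounds(scores))
```

A constant variable has zero variance and would raise a division error here; real data screening would need to handle that case separately.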

Structural validity
To test the structural validity of the BeMaS we ran CFA in each sample. The goodness-of-fit statistics of the analyzed measurement models are presented in Table 2. None of the one-factor models met the goodness-of-fit criteria. The two-factor solution fitted the data well in the US, German, and Russian samples. The upper boundary of the RMSEA confidence interval for the German and Russian samples slightly exceeded the goodness-of-fit cut-off, though it was still acceptable. By contrast, in the Polish sample the model showed acceptable fit according to the CFI and poor fit according to the RMSEA. We improved the fit by introducing a modification: we added the correlation between the residuals of items 6 and 8. The convergence of both test items was high, as the standardized correlation coefficient was at the level of .72. The confirmed factors were weakly correlated within all analyzed samples, at the level of .15 for the US, .22 for the German, .24 for the Russian, and .15 for the Polish sample (all significant at p < .01). The standardized factor loadings of the BeMaS in the US, German, Russian, and Polish samples are presented in Table 3.
Most factor loadings approached or exceeded a value of .70 and were thus satisfactory. Compared to other items, the factor loadings of item 7 in the Russian sample and items 6 and 8 in the Polish sample (with the correlation between residuals introduced within the CFA) were moderate. As a result, Hypothesis 1 about the two-factor structure of the BeMaS was supported.

Measurement invariance across countries
In the next step we ran a measurement invariance analysis to verify the hypothesis that two-dimensional dispositional envy is a construct understood in the same way, and that factor loadings as well as intercepts of test items are invariant across Americans, Germans, Russians, and Poles. The results are displayed in Table 2. Models were well-fitted at the initial level, supporting configural invariance. According to the values of ΔCFI and ΔRMSEA, we established metric invariance; however, we did not find support for scalar invariance. We made one attempt to improve the fit of the model to demonstrate partial scalar invariance (we freed the intercept of item 6). This contributed to a significant improvement, although the model still did not meet the criteria. Instead of modifying the model and revealing partial scalar invariance through a few more steps (freeing the intercepts of problematic test items or introducing correlations between residuals), we decided to scrutinize which parameters are non-invariant by carrying out alignment optimization (AO).
To analyze the non-invariance of our data and to make mean comparisons more trustworthy, we conducted the fixed AO on the two-factor model previously obtained via CFA. The non-invariant parameters for the AO are displayed in Table 4.
The total percentage of non-invariant parameters was less than 25% (0.08% for loadings and 20% for intercepts); according to recommendations, this allows us to conclude that the two-dimensional structure of the BeMaS is universal across the compared independent samples, which supports Hypothesis 2.
Note. χ²(df) = chi-square test of model fit; CFI = comparative fit index; RMSEA = root mean square error of approximation; CI = confidence interval. a Correlation between errors of items 6 and 8. b In the partial scalar model, the intercept of item 6 was freed. *p < .001

Latent means comparisons
As the results of alignment pointed to an acceptable number of non-invariant parameters, the comparisons of latent means may be deemed more trustworthy even in the presence of scalar non-invariance (Cieciuch et al., 2018). In comparisons, we chose the German participants as the reference group. 4 Results demonstrate that in terms of benign and malicious envy, Americans scored significantly higher than participants from other samples (Z benign = .37; Z malicious = .45). Within benign envy, Poles scored significantly higher (Z = .20) than Russians (Z = .05) and Germans, whereas there were no significant differences between the latter two samples. Regarding malicious envy, however, Russians displayed significantly the lowest scores (Z = −.31), while Poles (Z = .04) and Germans did not significantly differ from each other.

Reliability
The reliability estimates (McDonald's omega total and Cronbach's alpha), which were computed for each envy subscale, are displayed in Table 1. Both dispositional benign and malicious envy subscales had very good reliability in all studied samples; thus, only a small amount of variance was due to measurement error. Therefore, Hypothesis 3 was confirmed and the BeMaS may be considered a highly reliable measurement tool of benign and malicious envy across the US and three adjoining European countries: Germany, Russia, and Poland.

Discussion
The aim of the current paper was to analyze psychometric properties of the BeMaS in terms of its structural validity, measurement invariance, and reliability. We combined data from four independent samples: US, German, Russian, and Polish (data for the first two samples were obtained from the authors of the BeMaS, while data for the other two samples were collected specifically for the purposes of this study).

Psychometric properties of the benign and malicious envy scale
Only the initial study of Lange and Crusius (2015a) confirmed the two-dimensional measurement model of the BeMaS using CFA. Further adaptations of the scale conducted in independent populations examined the underlying factor structure by performing EFA (Çırpan & Özdoğru, 2017; Sawada & Fujii, 2016). However, these two analytical methods serve different purposes: while EFA is a technique which allows the exploration and identification of the underlying factor structure through reduction of the variables, CFA enables testing hypotheses on the basis of theory and previous empirical research (Schmitt, 2011). In the current study, the two-factor model was well-fitted to the data in the US, German, and Russian samples, while in the Polish sample the RMSEA fell just short of the acceptable range. We traced this worse fit in the Polish sample to the correlation between the residuals of items 6 and 8, which turned out to be very similar in their content. To additionally make sure that the envy structure is not unidimensional, we tested one-factor models of the BeMaS via CFA and found unacceptable fit to the data in each sample, which favors the two-factor solution. In the current study we compared the data from countries that linguistically distinguish between the two envy types (Germany, Russia, and Poland) and countries like the US that do not. Our findings support previous results on episodic envy (Falcon, 2015; Van de Ven et al., 2009), and seem to be valuable concerning the cultural universality of the two envy dimensions (Cohen-Charash & Larson, 2017; Lange, Weidman, et al., 2018b). Summing up, even if there are linguistic differences, the envy structure is still two-dimensional and independent of whether the language has one or two terms for this phenomenon.
4 As Germany is the country in which the BeMaS was developed, in the current study we treat it as the country of reference, which in terms of further comparisons signifies that its latent mean was fixed to zero and other groups are compared to this mean.
In the next step, we aimed to verify the hypothesis concerning the measurement invariance of the BeMaS across the US, Germany, Russia, and Poland. The values of CFI and RMSEA suggested that metric invariance across the compared samples may be assumed; however, we did not find evidence for scalar invariance, which suggests that the latent means should not be compared. The AO procedure enabled us to assess which parameters were non-invariant (significantly different from the mean value of the parameters assigned to other groups; Cieciuch et al., 2018). The largest number of non-invariant parameters concerned the Polish sample; this might be mostly an outcome of the modified measurement model obtained via CFA. Obstacles related to one pair of items (items 6 and 8) were noticeable from the beginning of the analyses, that is, they were strongly correlated in CFA and their factor loadings were the weakest. A more detailed look through the MGCFA and AO procedures suggests that the "culprit" behind the problems is mainly the poor quality of item 6. Despite the back-translation procedure, the translation of this statement into Polish turned out to be insufficient and too similar in its content to item 8. Therefore, there is a need to consider a reformulation of this particular statement in subsequent research using the Polish version of the BeMaS. In order to correct the problem we suggest slightly rephrasing it to the following version: Mam złą wolę wobec ludzi, do których czuję zawiść (for details see Table 5 in the Appendix).
Finally, in terms of psychometric properties, as hypothesized, the BeMaS turned out to be characterized by excellent reliability across all groups regardless of the applied coefficient. All values of McDonald's omega, which we decided to mainly consider due to its superiority in relation to the most widely used Cronbach's alpha, were higher than .85 for both subscales in each of the analyzed samples. Further, according to recommendations, the omega coefficient should be preferred for assessing the reliability of scales that contain skewed test items (Trizano-Hermosilla & Alvarado, 2016). We find this important when analyzing the reliability of the BeMaS test scores. It applies especially to the items measuring malicious envy, which tend to show a positively skewed distribution suggesting a "floor effect": subjects usually answer using the lower points of the response scale (as people tend not to admit features which are directly antagonistic) and, in consequence, get lower results on the whole subscale (Kowalski, Rogoza, Vernon, & Schermer, 2018).
Summarizing our findings so far, the current paper provides support for applying the BeMaS in two new language versions. Both the Russian and Polish adaptations are valid and reliable in terms of measuring benign and malicious envy. In addition, we demonstrated that the two-dimensional structure of envy is invariant regardless of whether a linguistic distinction between the two envy types exists in the respective language. Nevertheless, the current study is not free from limitations: (1) we did not test nomological networks, which could further complete the investigation of the BeMaS psychometric properties in Russia and Poland; (2) our research was based on self-report data only, which may lead to the so-called monomethod bias; (3) there were certain differences between the tested samples, e.g., the US group consisted of MTurk respondents in their thirties, while for other countries we used mostly student samples, so the differences can be a matter of age and participant specificity. Moreover, cross-cultural gender differences in envy should be a topic for consideration in future research. In the current study, due to the lack of access to the full range of data for the US and Germany, we were not able to conduct detailed gender-based ancillary analyses, which is a shortcoming of our research. According to previous results, women tend to achieve slightly higher results in benign envy (e.g., Lange & Crusius, 2015a; Xiang, Chao, & Ye, 2018), which presumably corresponds to divergent envy-evoking situations and adaptive challenges faced by men and women over evolutionary time (DelPriore, Hill, & Buss, 2012); however, these results are not well established and may meaningfully vary across cultures.

Cross-cultural differences in benign and malicious envy
Note. Markings in parentheses represent intercepts
Additionally, based on the generated latent mean scores, we demonstrated several significant differences between the tested samples for both benign and malicious envy. The obtained results can be justified through the lens of individualism-collectivism, which is one of the national culture dimensions distinguished by Hofstede and represents patterns of thinking, feeling, and behavior that are characteristic of representatives of a given nation (Hofstede, Hofstede, & Minkov, 2010). In terms of envy, individualism-collectivism, generally defined as "the degree of interdependence a society maintains among individuals" (Hofstede, 1984, p. 83), is found to be the most differentiating dimension (Foster, 1972; Poelker, Gibbons, Hughes, & Powlishta, 2016). In collectivistic societies, interpersonal relationships are above the achievement of one's own goals, while the group one belongs to provides support and a sense of security in exchange for a kind of loyalty as well as avoiding conflicts, primarily with members of one's own group. Members of these societies are characterized by the "we" self-concept, so the individual represents both oneself and the entire group, which imposes increased responsibility for one's actions, thoughts, and feelings (Hofstede, 1984; Hofstede et al., 2010). In individualistic societies, however, the individual strives to achieve a personal goal over relationships with others. From an early age, the "I" self-concept is developed and the pattern of a strong personality aspiring for individual success is promoted, because material status and self-esteem primarily determine one's position in the social hierarchy (Hofstede, 1984; Hofstede et al., 2010). The latent mean scores for the US, which is recognized as one of the most individualistic countries in the world (Hofstede et al., 2010), were the highest for both forms of envy.
We also found it worth looking at these scores through the lens of the capitalist and, later, neoliberal economic systems that have prevailed in this country (Wrenn, 2015). In the US, the free market economy developed early. According to the Scottish philosopher Adam Smith (1981), within such a stable system of trade and evaluation, individuals wishing to get rich will seek to specialize in their own work. In such a situation, people will naturally increase the value of the manufactured product without any interference from the state. Veblen (2007; cf. Wrenn, 2015), in turn, states that benign envy is a by-product of such industrialized and capitalistic economies, in which material goods become a symbol of social status. Envy therefore has a developmental function because it arouses the desire to achieve a socioeconomic status similar to that of the envied person and, consequently, the need to be admired (enviable), which would serve to confirm the individual's conviction of having achieved the desired status. Currently in the US, however, we are dealing with the last stage of capitalism, neoliberalism, at the center of which is the idea of hyper-individualism and according to which the state is focused on the individual as a performer of market operations and not on society as a whole (Wrenn, 2015). Among Western countries, the US leads the way in terms of income inequality (Alvaredo, Chancel, Piketty, Saez, & Zucman, 2018), and Wrenn (2015) states that "the growing inequality gap that has come to characterize neoliberal economies-that not only widens, but distances individuals in terms of observable qualities-could lead to even greater disillusionment and turn emulative-driven, benign envy into malicious envy" (p. 507). In the US, there are different social desirability norms with respect to envy. Even in its malicious form, envy is viewed rather as a source of development through rivalry with superior others. Consequently, it is perceived positively through the prism of its outcomes favorable to the individual (Matt, 2003).
The latent mean scores for the samples from Poland and Germany, countries considered individualistic to a moderate degree (Hofstede et al., 2010), were similar only for malicious envy, whereas for benign envy Poles scored higher than both Russians and Germans. In comparison to their eastern and western neighbors, Poles are much more likely to express their admiration and respect for people who are better or who succeed in important domains, while being inspired by the achievements of others to further their own development. Research shows that since the economic transformation resulting from the collapse of the Polish People's Republic, socioeconomic status has become increasingly significant for Poles in comparison with citizens of other countries (Kuchinke, Ardichvili, Borchert, & Rozanski, 2009), and work has become one of the most important areas of their everyday lives (Jasińska-Kania & Marody, 2002). Germany, in turn, is currently on a different path and at a different stage of economic development, as the formation of the state's social market economy began right after World War II in West Germany; it served the purpose of both achieving general social goals and guaranteeing social security, so as to reduce the risk of German citizens losing their livelihood. One might hypothesize that the solid state social policy, satisfaction with jobs and earnings (Kuchinke et al., 2009), and a strong emphasis on mutual equal treatment (e.g., Federal Anti-Discrimination Agency, 2019) create an atmosphere that potentially reduces the possibility of making the negative social comparisons that underlie dispositional envy.
The latent mean scores for both forms of envy in Russia, which is considered a collectivistic country (Hofstede et al., 2010), were the lowest among the countries studied. Russia is one of the European states that have undergone the industrialization process most recently; until then, the economic efficiency of the state was rather low for many years, and one of the hypothesized reasons for this is a high awareness and fear of envy, which inhibited economic development (Foster, 1972; Clanton, 2006). Envy in collectivistic countries potentially disturbs the harmony between group members, which is why it is deeply undesirable (Lindholm, 2008). In addition, the communist political system, which until recently ruled in Russia, favored further collectivization and avoidance of envy (both envy of others and being envied) because individual profit was considered immoral (Clanton, 2006). One might then hypothesize that the lowest mean scores obtained by the Russians in their declared level of envy are the result of a strong community orientation (and thus specific desirability norms regarding envy). This orientation clashes with the current economic situation of the Russian state, which is experiencing economic growth alongside large income and wealth inequality (Alvaredo et al., 2018), a combination that possibly fuels envious responses toward superior others in post-capitalistic economic systems (Veblen, 2007; Wrenn, 2015).

Conclusions
Is envy understood in the same way by people in different cultures? Our research suggests the answer is yes: despite existing linguistic and cultural differences, we have demonstrated measurement equivalence for the BeMaS. The results of our study not only provide empirical arguments confirming the quality of the scale's measurement (i.e., good reliability and structural validity), but also offer a specific starting point for future research on envy in the cross-cultural context.
a Because the measurement invariance analysis and alignment optimization suggested a problem with item 6 in the Polish translation (the previous wording of the test item: Czuję niechęć do ludzi, wobec których czuję zawiść), we decided to rephrase it slightly in order to correct the problem. The table presents the revised wording of this test item; however, it should be emphasized that the revised wording may, but need not, improve the model's fit; this requires additional verification in pilot studies.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.