Always look on the bright side of life? Exploring the between-variance and within-variance of emotion regulation goals

Currently, emotion regulation goals are being perceived as highly situational. This assumption might be wrong, though, as the preeminent measure [the intraclass coefficient (1), ICC(1)] overestimates the proportion of within-variance under the condition of measurement error. We therefore empirically test whether emotion regulation goals represent more of a between-person or a within-person phenomenon, using the reliability-adjusted ICC(1). A total of 305 students participated in a daily diary study and answered a questionnaire about their emotion regulation goals in the most negative event of the day over the course of 9 days. Multilevel analyses suggest that emotion regulation goals vary more between persons than heretofore assumed, especially for hedonic goals, but also for social goals. Besides, we show substantial differences in the within-variance across individuals. We conclude by discussing theoretical implications for general and clinical psychology.


Introduction
Henry and Marta both had a long and exhausting workday. In the evening, Marta watches a horror flick that gives her the creeps. Henry chooses comedy instead and relaxes with a good, hearty laugh.
Why do Henry and Marta pursue different emotion regulation goals (defined as the reason for the individual's attempt to regulate their emotions)? Or, more generally, why do people in seemingly similar situations pursue different emotions? From a between-person perspective, we could examine the ways in which Henry and Marta differ. From a within-person perspective, we could ask questions about the situation and how it differs for them or how Henry and Marta differ in this situation from the average Henry and Marta.
Previous research theoretically conceived emotion regulation goals mainly as fluctuating, time-variant and within-person phenomena (e.g., English et al. 2017; Tamir 2016; Wilms et al. 2020b). Recent research, though, added a time-invariant or between-person perspective (Eldesouky and English 2018a; Eldesouky and Gross 2019). In this article, we want to reconcile both views.
We argue that the between-variance and within-variance, and in particular the intraclass coefficient (1) (ICC(1); Fisher 1934; Shrout and Fleiss 1979; Snijders and Bosker 2012), shed light on whether emotion regulation goals represent more of a time-invariant, between-person or a time-variant, within-person phenomenon. The more a construct varies across individuals and the less it varies across situations over time (i.e., the larger the between-variance compared to the within-variance; the closer the ICC(1) to one), the more it reflects a between-person phenomenon. Conversely, the more a construct varies across situations over time and the less it varies across individuals (i.e., the larger the within-variance compared to the between-variance; the closer the ICC(1) to zero), the more it reflects a within-person phenomenon (see Preszler et al. 2019, p. 100 for a similar argument). In line with our argument, we advocate for a continuous transition from a between-person to a within-person phenomenon, as constructs can be more or less stable across time and situations.
Most diary and experience sampling studies report, based on the ICC(1), that emotion regulation goals consist of substantially more within-variance than between-variance (e.g., English et al. 2017; Wilms et al. 2020b). This finding might represent an artifact, though. Since measurement error (i.e., lack of perfect reliability) induces a negative bias in the ICC(1) (Wilms et al. 2020a) and previous studies did not correct for it, the interpretation of emotion regulation goals as a within-person phenomenon may be premature. The present study addresses this methodological limitation by using an estimator that is robust to measurement error, the reliability-adjusted ICC(1) (Wilms et al. 2020a). Besides, we examine, to our knowledge for the first time, the distribution of the within-variance across individuals (i.e., the degree to which individuals differ in how much an emotion regulation goal varies across situations).

Emotion regulation goals
Emotion regulation goals are activated when the currently experienced emotion and the desired emotion differ, and they aim to reduce this mismatch (Eldesouky and Gross 2019). This study focuses on hedonic and social instrumental goals for two reasons. Firstly, these goals are more prevalent in daily life (Eldesouky and English 2018b; English et al. 2017; Wilms et al. 2020b). Secondly, Eldesouky and English (2018a) validated the Emotion Regulation Goal Scale (ERGS) for hedonic goals (i.e., pro-hedonic and contra-hedonic goals) and social instrumental goals (i.e., pro-social and impression management goals). The ERGS thereby makes it possible to estimate the reliability needed to apply the reliability-adjusted ICC(1) (Wilms et al. 2020a).
Hedonic goals aim at changing the feeling component of an emotion (i.e., the experienced pleasure or pain). In particular, pro-hedonic goals aim at changing the pleasure-to-pain ratio in favor of pleasure, whereas contra-hedonic goals aim at changing it in favor of pain (Tamir 2016). For example, a person pursuing pro-hedonic goals may be motivated to search for funny cat videos to lift her mood, while another person pursuing contra-hedonic goals may be motivated to watch a horror movie to experience anxiety.
Instrumental goals represent a class of superordinate emotion regulation goals that can be achieved not only by regulating emotions, but also by other means (Tamir 2016). Social instrumental goals denote a subclass of instrumental goals that encompasses both building or cultivating positive social relationships and averting harm to them (Eldesouky and English 2018b;English et al. 2017;Gable and Impett 2012;Tamir 2016). This study focuses on two prime social instrumental goals: pro-social and impression management goals. Pro-social goals aim "to maintain or promote social interactions and relationships, because it involves influencing relationships for the sake of others" (Eldesouky and English 2018a, p. 751). Impression management goals instead aim "to appear a certain way to others, because it involves influencing relationships for one's own sake" (Eldesouky and English 2018a, p. 751).
Finally, individuals often do not only pursue emotion regulation goals that target their emotional experience, but also their emotional expression (Greenaway and Kalokerinos 2018). We refer to them as emotion expression goals. Pro-hedonic expression goals aim at changing the emotional expression in favor of more positive and/or less negative emotional expression. For example, an individual may pursue the goal to show more happiness at a party to appear more approachable. Contra-hedonic expression goals aim at changing the emotional expression in favor of more negative and/or less positive emotions. For example, an individual may aim to show more anger in order to intimidate the opponent in a negotiation.

The between-variance and the within-variance of psychological constructs
In real life, the time-invariant individual effect and the time-variant situation effect often determine a construct's realization together (e.g., Atkinson 1957). Put plainly, we understand a construct's realization as a function of the individual effect (e.g., a person pursues hedonic goals more than others) and the situation effect (e.g., a situation with intense negative valence activates hedonic goals more than less intense situations).
The between-variance describes the expected squared distance between a person mean (e.g., the average degree of extraversion for a single person) and the grand mean (e.g., the average degree of extraversion for all persons; e.g., Snijders and Bosker 2012). Accordingly, the between-variance refers to the variability of the individual effect in psychological constructs. For example, Henry may pursue pro-hedonic goals more often than Marta on average, and between-variance describes the magnitude of this difference between individuals for the sample/population.
The within-variance describes the expected squared distance between a single observation of a person (e.g., the degree of extraversion in a specific situation) and the person mean (e.g., the average degree of extraversion for a single person; e.g., Snijders and Bosker 2012). Accordingly, the within-variance refers to the variability of the situation effect in psychological constructs, independent of the individual effect. For example, individuals (in general) may pursue pro-hedonic goals more often in unpleasant than in pleasant situations (English et al. 2017), and within-variance describes the magnitude of this difference across situations, while cancelling out individual differences.
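The two definitions above can be illustrated with a minimal sketch. It uses the classical one-way ANOVA estimators for balanced data, which is an assumption made here for brevity and not the multilevel estimation used later in this article; the function name and the toy scores for Henry and Marta are hypothetical.

```python
from statistics import mean


def variance_components(scores_by_person):
    """One-way ANOVA estimates of between- and within-variance.

    Assumes balanced data: every person has the same number of observations.
    """
    k = len(scores_by_person)                        # number of persons
    n = len(next(iter(scores_by_person.values())))   # observations per person
    if any(len(v) != n for v in scores_by_person.values()):
        raise ValueError("this sketch assumes the same number of observations per person")

    person_means = {p: mean(v) for p, v in scores_by_person.items()}
    grand_mean = mean(list(person_means.values()))

    # Within-variance: squared distances of observations from their person mean.
    ss_within = sum((x - person_means[p]) ** 2
                    for p, v in scores_by_person.items() for x in v)
    ms_within = ss_within / (k * (n - 1))

    # Between-variance: squared distances of person means from the grand mean,
    # corrected for the share of within-variance contained in the person means.
    ms_between = n * sum((m - grand_mean) ** 2 for m in person_means.values()) / (k - 1)
    between_var = max((ms_between - ms_within) / n, 0.0)

    return between_var, ms_within


# Hypothetical toy data: two persons, two events each.
between_var, within_var = variance_components({"henry": [1, 3], "marta": [5, 7]})
icc1 = between_var / (between_var + within_var)
```

In this toy example the two persons differ much more from each other than each person varies across events, so the resulting ICC(1) lies closer to one than to zero.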
Researchers typically address questions about the degree of within-variance and between-variance with the ICC(1). The closer the value of the ICC(1) to one, the more between-variance. The closer the value of the ICC(1) to zero, the more within-variance (Fisher 1934; McGraw and Wong 1996; Shrout and Fleiss 1979). As a reference for psychological constructs, Podsakoff et al. (2019) found an average ICC(1) of about 0.52 in a meta-analysis, which comprised more than 23 psychological constructs (see Footnote 2) and 222 intraindividual empirical studies. They included studies with different time referents (i.e., momentary, daily and last few hours), response formats (most typically agreement measures), numbers of anchor points (most typically 5 points), numbers of days surveyed (M = 8.54) and numbers of surveys per day (M = 2.26). All of the examined constructs showed substantial within-person variability [i.e., at maximum an ICC(1) of 0.6], even though they "have historically been treated as between-person phenomena" (Podsakoff et al. 2019, p. 737).
But Wilms et al. (2020a) showed that the variance of the measurement error induces a positive bias in the within-variance, which in turn leads to an underestimation of the ICC(1). In other words, if researchers measure a construct with measurement error, the estimated ICC(1) shows more within-variance and less between-variance than the correct estimate. For example, if the true ICC(1) for a psychological construct is 0.5 and it is measured with a reliability of 0.7 (0.5), then the measurement error-affected ICC(1) will be 0.41 (0.33) in expectation (see Footnote 3). A reliability between 0.5 and 0.7 appears realistic for experience sampling and diary studies, since they typically use short scales or even one-item measures, and the fewer the items, the lower the reliability (e.g., Brown 1910). It is important to note that reliability represents a quality of the measurement and not of the psychological construct itself. To understand whether the construct varies more between or within individuals, the ICC(1) must be corrected for measurement error; otherwise the ICC(1) estimate varies as a function of the measurement's reliability. All in all, this example highlights the severity of the measurement error-induced bias in ICC(1) estimation.
Instead, a correct estimate of the within-variance is given by the product of the measurement error-affected within-variance and the construct's reliability (see Theorem 2 in Wilms et al. 2020a). Applying this correction to the results of Podsakoff et al. (2019), Wilms et al. (2020a) suggested that the average (reliability-adjusted) ICC(1) for psychological constructs could be higher [i.e., about 61% (68%) between-variance under the assumption of a reliability of 0.7 (0.5)], indicating that the majority of psychological constructs vary more between individuals. Why is it important to examine between-variance and within-variance? Put plainly, we can only test hypotheses about individual differences if meaningful variation on the individual level exists. Likewise, we can only test hypotheses about situational differences if meaningful variation on the situational level exists. Accordingly, a measurement error-affected ICC(1) may falsely guide research to develop theories and test hypotheses about situational factors while neglecting individual differences, even though theories about individual differences could be promising.
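The attenuation example can be retraced numerically. The sketch below uses the variance values given in the article's footnote on this example (between-variance 2, true within-variance 2, measurement error variance 0.86 or 2); the function names are hypothetical and chosen only for illustration.

```python
def icc1(between_var, within_var):
    """ICC(1): share of between-variance in the total variance."""
    return between_var / (between_var + within_var)


def reliability(true_within_var, error_var):
    """Within-reliability: true within-variance over observed within-variance."""
    return true_within_var / (true_within_var + error_var)


def attenuated_icc1(between_var, true_within_var, error_var):
    """Expected ICC(1) when the within-variance is inflated by measurement error."""
    observed_within_var = true_within_var + error_var
    return icc1(between_var, observed_within_var)


for error_var in (0.86, 2.0):
    print(round(reliability(2.0, error_var), 2),
          round(attenuated_icc1(2.0, 2.0, error_var), 2))
# prints "0.7 0.41" and then "0.5 0.33"
```

The true ICC(1) is 0.5 in both cases, so the drop to 0.41 and 0.33 stems from the measurement error alone.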
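The adjusted values for the meta-analytic average can be retraced by hand. The rearrangement below is a sketch that assumes only the uncorrected ICC(1) and the reliability are known; it follows from dividing numerator and denominator of the adjusted ratio by the total estimated variance. The subscripts "adj" and "unc" are notation introduced here for illustration.

```latex
\mathrm{ICC(1)}_{\mathrm{adj}}
  = \frac{\sigma_b^2}{\sigma_b^2 + \alpha\,\sigma_w^{*2}}
  = \frac{\mathrm{ICC(1)}_{\mathrm{unc}}}
         {\mathrm{ICC(1)}_{\mathrm{unc}} + \alpha\,\bigl(1 - \mathrm{ICC(1)}_{\mathrm{unc}}\bigr)}

\alpha = 0.7:\quad \frac{0.52}{0.52 + 0.7 \cdot 0.48} = \frac{0.52}{0.856} \approx 0.61
\qquad
\alpha = 0.5:\quad \frac{0.52}{0.52 + 0.5 \cdot 0.48} = \frac{0.52}{0.76} \approx 0.68
```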

The between-variance and within-variance of emotion regulation goals
Previous studies explored the ICC(1) for emotion regulation goals in negative events. Kalokerinos et al. (2017) found an ICC(1) for pro-hedonic goals, contra-hedonic goals and social goals of 0.30, 0.27 and 0.15, respectively. English et al. (2017) indicated an ICC(1) between 0.18 and 0.38 for different emotion regulation goals (they did not report emotion regulation goals separately). These estimates were based on daily diary studies. Wilms et al. (2020b) reported an ICC(1) of 0.21 for pro-hedonic goals, and ICC(1) between 0.02 and 0.14 for social goals. They examined negative events five times a day. Their results showed the largest amount of within-variance, but also used a higher resolution (i.e., prompts five times a day) than prior studies, which may explain the lower ICC(1) estimate (Podsakoff et al. 2019). Overall, emotion regulation goals appear to consist of more within-variance than between-variance.
By contrast, another daily diary study (Eldesouky and English 2018b) found ICC(1) estimates of 0.78 and 0.87 for pro-hedonic and contra-hedonic goals, respectively, and between 0.59 and 0.60 for social instrumental goals. Here, participants rated the extent to which they pursued emotion regulation goals over the day. This study did not focus on a specific event, but required individuals to give an aggregation of the events of the day. We argue that averaging the events of the day leads to a less extreme day score and therefore to more similar scores across days. This may explain the higher between-variance. Moreover, Eldesouky and English (2018b) modelled a two-level structure (events in individuals), even though their data had a three-level structure (events in individuals in couples). When this additional nesting factor is ignored, the between-variance contains the variance of both the individual effect and the couple effect, which may lead to an overestimation of the between-variance. Accordingly, their ICC(1) estimates do not capture the consistency in scores across specific events, and we exclude this study from our discussion. All in all, we conclude that emotion regulation goals seem to be more context-dependent.

Footnote 2: They included affect, coping, stress, counterproductive work behavior or intentions, self-efficacy/self-esteem, emotional exhaustion, emotional labor, engagement, leadership, justice, job/task characteristics, job performance, motivation, motives, need satisfaction, personality, organizational citizenship behavior, recovery, satisfaction, sleep, stressors/demands, social support and work-family conflict.

Footnote 3: For the first and second example, we assumed a between-variance, within-variance and measurement error variance of 2, 2 and 0.86, and 2, 2 and 2, respectively. This yields a reliability of 0.7 (i.e., 2/(2 + 0.86); reliability = true within-variance/(true within-variance + measurement error variance)) for the first example, and 0.5 (i.e., 2/(2 + 2)) for the second example. Wilms et al. (2020a) showed that the estimated within-variance is equal to the true within-variance plus the variance of the measurement error in expectation. Accordingly, the estimated within-variance equals 2.86 for the first example and 4 for the second example (in expectation). The ICC(1) is estimated by the between-variance divided by the total variance and yields 0.41 (i.e., 2/(2 + 2.86)) for the first example and 0.33 (i.e., 2/(2 + 2 + 2)) for the second example.

The present study
Since previous studies operationalized emotion regulation goals with one item only, they were unable to report reliabilities. This is a critical limitation, as perfect reliability seems very unlikely, and we thus assume that these studies underestimated the ICC(1) of emotion regulation goals. The current study aims to estimate the ICC(1) of emotion regulation goals correctly by applying the reliability-adjusted ICC(1) (Wilms et al. 2020a). Thus, this study investigates the between-variance and within-variance of emotion regulation goals with higher precision than previous studies. Besides, we explore, to our knowledge for the first time, the distributions of the within-variance of emotion regulation goals across individuals.

Participants
A total of 305 students (83% female; M age = 22.61, SD age = 4.40; two participants did not indicate their age) were recruited from a German university, where they were enrolled in psychology, social work and teacher education. Psychology students were recruited via a system that lists the studies currently conducted at the psychology department and received trial hours (which are required to complete their studies). The other students were recruited by advertising the study at the beginning of a lecture or seminar and received course credit.

Procedure
The study contained two parts: a laboratory session and a diary part. Participants came to the laboratory session in groups of 1 to 7 individuals. They gave their informed consent, provided their gender and age, and completed questionnaires unrelated to this study. At the end of the laboratory session, participants completed a trial run of the diary questionnaire and were encouraged to ask questions about it, if needed.
The diary study started the next day. On each of the nine consecutive days, participants received an email at 19:00 with a link to the diary survey, which expired at 23:30 the same day. We encouraged students to participate by raffling off vouchers (10 vouchers worth €10 and one worth €50), and each additional participation increased their chances of winning.

The diary procedure
After we had thanked the students for their cooperation and encouraged them to continue their participation, they were instructed to remember the event with the most negative emotion of the day. Next, they described this event in a few sentences, which ensured that they recalled it. We told them that, for those who had a great day, the most negative event may also be something trivial. Participants then completed both the emotional intensity scale and the ERGS (Eldesouky and English 2018a) with respect to the recalled event. Finally, they answered two additional scales unrelated to this study. Participants completed the surveys with a median response time of 59:37 s.

Emotion regulation goals
To measure emotion regulation goals, we used a translated and back-translated version of the ERGS (Eldesouky and English 2018a). We instructed participants as follows: "People often try to manage (or regulate) their emotions. This includes, on the one hand, what they feel on the inside and, on the other hand, what they show on the outside (such as facial expressions, gestures, or tone of voice). The following questions focus on the reasons why you regulated your emotions (i.e., why you tried to manage your emotions). Related to the event described earlier: Please indicate the reasons why you acted to influence your emotions." The ERGS contains three items for pro-hedonic goals ("To feel more positive emotions (e.g., joy, contentment)"; α = 0.64), three items for contra-hedonic goals ("To feel more negative emotions (e.g., anger, sadness)"; α = 0.66), five items for pro-social goals ("To cheer someone else up?"; α = 0.80) and four items for impression management goals ("To avoid being rejected by others"; α = 0.75), as well as items for performance goals (unrelated to this study).
We measured pro-hedonic expression goals ("To show more positive emotions (e.g., joy, contentment)"; α = 0.67) and contra-hedonic expression goals ("To show less positive emotions (e.g., joy, contentment)"; α = 0.69) by substituting "to feel" with "to show" in the scales for pro-hedonic and contra-hedonic goals, as expression goals aim at showing a positive or negative emotion, but not at experiencing it (Greenaway and Kalokerinos 2018). The items were measured on a scale from 0 "not at all" to 6 "very", and we computed the arithmetic mean over a construct's items to obtain the construct score. We estimated the reliability based on a multilevel model (i.e., items nested in events nested in individuals; Nezlek 2017; Raudenbush et al. 1991).

Emotional intensity of the event
To measure the emotional intensity of the most negative event of the day, individuals indicated "How intense was your most negative emotion?". The item was measured on a scale of 0 "not at all" to 6 "very".
Descriptive statistics can be found in Table 1.

Analysis
All analyses were performed with R (R Core Team 2015).
We excluded from the analysis the records of participants who did not describe what they had experienced. Additionally, we excluded the events for which individuals rated their negative emotion as not at all intense. We believe that observations with these characteristics do not stem from the population of interest. If an individual experiences a very low emotional intensity, she would not pursue any emotion regulation goals, and therefore these observations systematically fall outside our sample. Before the exclusion, individuals completed an average of 7.57 questionnaires (variance 1.59; median 8), 2310 in total. After the exclusion, individuals completed an average of 7.39 questionnaires (variance 1.90; median 8), 2255 in total.

Confirmatory factor analysis
We conducted a confirmatory factor analysis (CFA) for two reasons: Firstly, we wanted to test whether the translated version of the ERGS has the same factor structure as the original one. Secondly, we wanted to test whether the adapted scale for pro-hedonic expression goals and contra-hedonic expression goals fits the data well. The CFAs were performed using the R package lavaan (Rosseel 2012). Since the Doornik-Hansen test (Korkmaz et al. 2015) showed that the data are not multivariate normally distributed, we used the robust maximum likelihood estimator. We demeaned the variables of interest because of the nested nature of our data (events within individuals). This cancels out the individual-specific effect on the variables (Curran and Hussong 2009).

Table 1 Means, standard deviations, between-correlations (below the diagonal) and within-correlations (above the diagonal). M and SD are used to represent mean and standard deviation, respectively. The mean and standard deviation are calculated without accounting for the multilevel structure.
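The demeaning step described for the CFA (person-mean centering) can be sketched as follows. This is a minimal illustration on plain Python records and not the R code used for the analysis; the field names `person` and `score` and the function name are hypothetical.

```python
from statistics import mean


def person_mean_center(records, person_key, value_key):
    """Subtract each person's own mean from that person's observations."""
    values_by_person = {}
    for record in records:
        values_by_person.setdefault(record[person_key], []).append(record[value_key])
    person_means = {p: mean(v) for p, v in values_by_person.items()}

    # Return copies of the records with an added centered value.
    return [
        {**record, value_key + "_centered": record[value_key] - person_means[record[person_key]]}
        for record in records
    ]


# Hypothetical diary records: two events for person "a", one for person "b".
diary = [
    {"person": "a", "score": 1},
    {"person": "a", "score": 3},
    {"person": "b", "score": 6},
]
centered = person_mean_center(diary, "person", "score")
```

After centering, every person's mean is zero, so differences between persons no longer contribute to the covariances that enter the factor model.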

The between-variance and within-variance of emotion regulation goals
We examined three different ICC(1) estimates for each of the emotion regulation goals: the reliability-adjusted ICC(1) (Wilms et al. 2020a), the uncorrected ICC(1), and the (averaged) uncorrected ICC(1) of the single items.

The reliability-adjusted ICC(1)
We are interested in the ratio of between-variance and within-variance, and the ICC(1) provides the percentage of between-variance in the total variance. Measurement error induces a substantial positive bias in the estimation of the within-variance, while the between-variance is not affected.
To correct this bias, Wilms et al. (2020a) propose to weight the measurement error-affected within-variance with the construct's reliability and analytically proved that the result is a robust estimate. The reliability-adjusted ICC(1) uses this corrected within-variance for the ratio of between-variance and within-variance. It is defined as:

$$\mathrm{ICC(1)}_{\mathrm{adj}} = \frac{\sigma_b^2}{\sigma_b^2 + \alpha\,\sigma_w^{*2}} \qquad (1)$$

where $\sigma_b^2$ refers to the estimated between-variance, $\sigma_w^{*2}$ denotes the estimated (usually measurement error-affected) within-variance, and $\alpha$ denotes the estimated within-reliability of the variable of interest.
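Why weighting by the reliability removes the bias can be seen in a short derivation. It is a sketch that relies on the two relations stated in the article's footnote on the attenuation example; the symbols $\sigma_w^2$ (true within-variance) and $\sigma_e^2$ (measurement error variance) are notation introduced here for illustration.

```latex
% Expected value of the estimated within-variance and definition of the within-reliability
\mathbb{E}\bigl[\sigma_w^{*2}\bigr] = \sigma_w^2 + \sigma_e^2,
\qquad
\alpha = \frac{\sigma_w^2}{\sigma_w^2 + \sigma_e^2}

% Weighting the estimated within-variance by the reliability recovers the true within-variance
\alpha \cdot \mathbb{E}\bigl[\sigma_w^{*2}\bigr]
  = \frac{\sigma_w^2}{\sigma_w^2 + \sigma_e^2}\,\bigl(\sigma_w^2 + \sigma_e^2\bigr)
  = \sigma_w^2
```

With the weighted within-variance in the denominator, the ratio therefore targets $\sigma_b^2 / (\sigma_b^2 + \sigma_w^2)$ rather than a value attenuated by $\sigma_e^2$.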

The standard ICC(1)
The standard ICC(1) represents the ratio of between-variance and total variance and is defined as:

$$\mathrm{ICC(1)} = \frac{\sigma_b^2}{\sigma_b^2 + \sigma_w^{*2}} \qquad (2)$$

where all estimates have been defined above. It follows from Eq. 1 if $\alpha$ is set to 1 (i.e., perfect reliability). Measurement error induces a negative bias if the construct's reliability is falsely assumed to be 1 but is in truth smaller. In this case, the measurement error-affected within-variance estimate is not weighted by its reliability and is thus too large. This in turn leads to an underestimation of the ICC(1).

The averaged ICC(1) of the single items
The averaged uncorrected ICC(1) of the single items is estimated using two steps: Firstly, we computed the ICC(1) for each item per construct, and secondly, we computed their arithmetic mean.

A comparison of the estimators
The reliability-adjusted ICC(1) (Wilms et al. 2020a) provides an estimate unaffected by measurement error and informs on the time-variant or time-invariant nature of a psychological construct, independent of the measurement's reliability.
The uncorrected ICC(1) is based on the construct scores that do not necessarily have a perfect reliability and yields an estimate affected by measurement error. Accordingly, this estimate encompasses information about the degree of the time-variant or time-invariant nature of the construct and the measurement's reliability. The less reliable the measurement, the lower the ICC(1) estimate (holding all else equal). This estimator benefits from using more items, since this increases the reliability (e.g., Brown 1910). The more reliable the measurement, the smaller the bias of the ICC(1).
The averaged uncorrected ICC(1) of the single items can be understood as an estimate for the ICC(1) of a single item, which is adjusted for the uniqueness of each item by taking the mean. Even though this estimator is similar to the uncorrected ICC(1), it does not benefit from using more items. For each item, the ICC(1) estimate contains the true ICC(1) plus a sample-specific and item-specific negative bias, whose size depends on the unreliability of the item. Accordingly, their average contains the true ICC(1) of the construct plus the average of the sample-specific and item-specific negative biases of the items. Thus, the bias in this estimator does not decrease with an increasing number of items.
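The contrast between the three estimators can be made concrete with a small numerical sketch. It assumes parallel items with independent errors of equal variance, a simplification made here for illustration only; the function name and the variance values are hypothetical.

```python
def expected_icc1_estimates(between_var, within_var, item_error_var, n_items):
    """Expected ICC(1) values under parallel items with independent errors.

    Returns the true ICC(1), the uncorrected ICC(1) of the construct score
    (mean of n_items items) and the averaged uncorrected ICC(1) of single items.
    """
    true_icc = between_var / (between_var + within_var)

    # Averaging k parallel items shrinks the error variance of the score to error / k.
    composite_icc = between_var / (between_var + within_var + item_error_var / n_items)

    # Every single item carries the full error variance, so the mean of the
    # item-wise ICC(1) values does not change with the number of items.
    single_item_icc = between_var / (between_var + within_var + item_error_var)

    return true_icc, composite_icc, single_item_icc


for k in (1, 3, 5, 20):
    true_icc, composite, single = expected_icc1_estimates(2.0, 2.0, 2.0, k)
    print(k, round(true_icc, 2), round(composite, 2), round(single, 2))
```

Under these assumptions the uncorrected ICC(1) of the construct score approaches the true value as items are added, while the averaged single-item ICC(1) stays at the level of a one-item measure.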

Estimation
To estimate the between-variance and within-variance, we fitted a two-level model (i.e., events nested in individuals) using the function lmer and extracted the variance components (Wilms et al. 2020a). To estimate the reliability of the constructs, we fitted a three-level model (i.e., items nested in events nested in individuals), extracted the variance components and computed the ratio of the within-variance to the total variance (i.e., the within-variance plus the error variance, the latter divided by the number of items) (Nezlek 2017; Raudenbush et al. 1991). All estimation was performed with the R package lme4 (Bates et al. 2015).
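Written out, the reliability described in the estimation paragraph takes the following form. The symbols $\sigma_w^2$ (event-level variance of the three-level model), $\sigma_e^2$ (item-level error variance) and $k$ (number of items) are notation introduced here, and the numbers in the second line are purely hypothetical values chosen to show the arithmetic.

```latex
\alpha = \frac{\sigma_w^2}{\sigma_w^2 + \sigma_e^2 / k}

% Hypothetical illustration: \sigma_w^2 = 2,\ \sigma_e^2 = 3,\ k = 3
\alpha = \frac{2}{2 + 3/3} = \frac{2}{3} \approx 0.67
```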

Confirmatory factor analysis
We report fit measures according to the recommendation of Jackson et al. (2009). The CFA showed a good fit for the translated ERGS and for the scale of pro-hedonic expression goals and contra-hedonic expression goals, respectively. The robust comparative fit index (Bentler 1990) amounted to 0.93 and 0.99, respectively. Values greater than 0.95 (0.90) indicate a good fit (an acceptable fit). The robust root mean square error of approximation (Steiger 1990) amounted to 0.06 and 0.02, respectively. Values lower than 0.05 indicate a good fit (Kline 2016). The robust standardized root mean square residual amounted to 0.04 and 0.01, respectively. Factor loadings ranged from 0.72 to 1.00 for pro-hedonic goals, 0.68 to 1.00 for contra-hedonic goals, 0.73 to 1.00 for pro-hedonic expression goals, 0.62 to 1.00 for contra-hedonic expression goals, 0.70 to 1.00 for pro-social goals and 0.56 to 1.00 for impression management goals.

The between-variance and within-variance of emotion regulation goals
Table 2 shows the results of the examination of the between-variance and within-variance with the reliability-adjusted ICC(1), the uncorrected ICC(1) of the averaged items and the averaged uncorrected ICC(1) of the single items. The reliability-adjusted ICC(1) yielded 0.51 for pro-hedonic goals, 0.47 for contra-hedonic goals, 0.49 for pro-hedonic expression goals, 0.46 for contra-hedonic expression goals, 0.37 for pro-social goals and 0.40 for impression management goals. The estimates are based on the reliabilities reported in the measures section.
The uncorrected ICC(1) yielded 0.40 for pro-hedonic goals, 0.37 for contra-hedonic goals, 0.39 for pro-hedonic expression goals, 0.37 for contra-hedonic expression goals, 0.32 for pro-social goals and 0.33 for impression management goals.
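The relation between the two sets of estimates can be checked with a short sketch. It takes the uncorrected ICC(1) values and the reliabilities reported in the measures section as inputs and applies the reliability weighting to the within-variance share; because the inputs are rounded to two decimals, agreement with the reported adjusted values is only expected at that precision. The function and variable names are hypothetical.

```python
def reliability_adjusted_icc1(uncorrected_icc, alpha):
    """Reliability-adjusted ICC(1) expressed through the uncorrected ICC(1).

    The within-variance share (1 - ICC) is weighted by the reliability alpha.
    """
    return uncorrected_icc / (uncorrected_icc + alpha * (1.0 - uncorrected_icc))


# (uncorrected ICC(1), reliability) as reported in the article, rounded values.
reported = {
    "pro-hedonic": (0.40, 0.64),
    "contra-hedonic": (0.37, 0.66),
    "pro-hedonic expression": (0.39, 0.67),
    "contra-hedonic expression": (0.37, 0.69),
    "pro-social": (0.32, 0.80),
    "impression management": (0.33, 0.75),
}

adjusted = {
    goal: round(reliability_adjusted_icc1(icc, alpha), 2)
    for goal, (icc, alpha) in reported.items()
}
for goal, value in adjusted.items():
    print(goal, value)
```

The recomputed values match the reliability-adjusted estimates reported above (0.51, 0.47, 0.49, 0.46, 0.37 and 0.40).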
Additionally, we explored the robustness of our results by examining the reliability-adjusted ICC(1) for males and females separately, as the sample contained more females. For males vs. females, the reliability-adjusted ICC(1) yielded 0.51 vs. 0.53 for pro-hedonic goals, 0.47 vs. 0.47 for contra-hedonic goals, 0.48 vs. 0.51 for pro-hedonic expression goals, 0.46 vs. 0.50 for contra-hedonic expression goals, 0.37 [...]

Table 2 The comparison of the reliability-adjusted ICC(1), the reliability-adjusted ICC(1) by gender subgroups, the uncorrected ICC(1) and the averaged ICC(1)

Finally, Fig. 1 shows the histograms and the density distributions of the person-specific and reliability-adjusted within-standard deviation (i.e., the square root of the reliability-adjusted within-variance) of the emotion regulation goals. We used the standard deviation instead of the variance here because it is easier to interpret relative to the 7-point scale. We found substantial between-person differences in the reliability-adjusted within-variability (Wilms et al. 2020a) for all emotion regulation goals. That is, some individuals pursued emotion regulation goals more consistently across situations (i.e., represented by low values in the plots), while others varied drastically from situation to situation (i.e., represented by high values in the plots). The between-person standard deviation of the person-specific and reliability-adjusted within-person standard deviation represents the extent to which individuals differ on average from each other in their variability of emotion regulation goals across situations. It yielded 0.37 for pro-hedonic goals, 0.41 for contra-hedonic goals, 0.39 for pro-hedonic expression goals, 0.40 for contra-hedonic expression goals, 0.40 for pro-social goals and 0.41 for impression management goals.
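The person-specific quantities described for Fig. 1 can be sketched as follows. The sketch assumes that each person's within-variance is the sample variance of that person's scores and that the same construct-level reliability is applied to every person; both are assumptions made here, since the exact computation is not spelled out in the text. Function names and toy data are hypothetical.

```python
from math import sqrt
from statistics import stdev, variance


def person_within_sds(scores_by_person, alpha):
    """Reliability-adjusted within-person standard deviation for each person."""
    return {
        person: sqrt(alpha * variance(scores))
        for person, scores in scores_by_person.items()
        if len(scores) >= 2  # a variance needs at least two events
    }


def between_person_sd_of_within_sds(within_sds):
    """How strongly persons differ from each other in their within-person variability."""
    return stdev(list(within_sds.values()))


# Hypothetical scores on the 0 to 6 scale: one consistent and one variable person.
toy_scores = {"consistent": [2, 2, 2], "variable": [0, 3, 6]}
toy_sds = person_within_sds(toy_scores, alpha=0.64)
toy_spread = between_person_sd_of_within_sds(toy_sds)
```

A person with identical scores across events receives a within-person standard deviation of zero, whereas a person who uses the full scale receives a large value; the spread of these values across persons corresponds to the between-person standard deviation reported above.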

Discussion
This study investigated the between-variance and within-variance of emotion regulation goals. Unsurprisingly, (pro- and contra-) hedonic experience goals and expression goals showed similar ICC(1) estimates: Greenaway and Kalokerinos (2018) emphasized that, even though they represent distinct concepts, they often occur together. This is supported by the strong correlation (see Table 1), and we therefore discuss hedonic expression goals together with hedonic experience goals. Pro-hedonic goals and pro-hedonic expression goals comprised about 50% between-variance and about 50% within-variance. Contra-hedonic goals and contra-hedonic expression goals comprised about 46% between-variance and about 54% within-variance. Accordingly, the individual effect and the situation effect accounted almost equally for the constructs' variability. This suggests that these goals may be equally time-variant and time-invariant, even though contra-hedonic goals and contra-hedonic expression goals appear to be slightly more time-variant.
Compared with our results for pro-hedonic goals and contra-hedonic goals, other studies suggested a more time-variant nature (English et al. 2017; Millgram et al. 2019; Wilms et al. 2020b). The difference may mostly be caused by the use of a different estimator for the ICC(1) (see Implications for methodology). To our knowledge, no previous study has examined the ICC(1) for expression goals.
Pro-social goals comprised 37% between-variance and 63% within-variance. Impression management goals comprised 40% between-variance and 60% within-variance. Accordingly, the situation effect accounted for the variability of social goals more than the individual effect. This suggests that these goals may be more time-variant than time-invariant.
Compared with our results, other studies suggested a more time-variant nature for social goals (English et al. 2017; Wilms et al. 2020b). As above, we attribute the differences for social goals in general mostly to the use of a different estimator for the ICC(1) (see discussion below). One study (Eldesouky and English 2018b) suggested a more time-invariant nature for pro-social goals and impression management goals in particular, but, as outlined above, it did not capture the consistency in emotion regulation goals across events (see The between-variance and within-variance of emotion regulation goals).
To examine the robustness of our results, we tested for gender differences. This was of particular importance, since our sample contained mostly females. We did not find substantial differences between female and male participants, suggesting that the oversampling of female participants did not threaten the generalizability to both genders.
Our results highlight another important point. In the field of emotion regulation, two terminologies exist that refer to reasons why individuals regulate their emotions: emotion regulation motives (e.g., Tamir 2016; Tamir and Millgram 2017) and emotion regulation goals (e.g., Eldesouky and English 2018b; Eldesouky and Gross 2019; English et al. 2017). This is unfortunate, as two different terms should not be used interchangeably to describe the same phenomenon. Nevertheless, our results strongly support the view that a differentiation is indeed sensible, as the time-invariant nature of emotion regulation is more important than previously assumed. Accordingly, we argue for naming the time-variant and time-invariant parts differently. We propose to use the term emotion regulation goals to refer to the time-variant nature of the reasons for emotion regulation, and emotion regulation motives to refer to the time-invariant nature. This will help to clarify whether future studies contribute to the understanding of time-variant emotion regulation goals or of time-invariant emotion regulation motives.

Implications for theory
Overall, our results suggest that the reasons for emotion regulation have two components: emotion regulation goals and emotion regulation motives. This opens new possibilities well beyond the scope of this article. We have shown that substantial and meaningful variation exists on the individual level. Research could test whether emotion regulation motives are a useful predictor of, for example, general well-being or vulnerability to psychiatric disorders (Millgram et al. 2020a, b). Likewise, research could test which predictors, such as personality traits (Eldesouky and English 2018a), lead to a specific expression of emotion regulation motives. These hypotheses can only be tested because meaningful differences between individuals exist.
We have also shown that substantial and meaningful variation exists on the situation level. Research could test whether emotion regulation goals are a useful predictor of, for example, daily well-being or daily symptoms of psychiatric disorders. Likewise, research could test which predictors, such as emotional intensity, lead to a specific expression of emotion regulation goals. These hypotheses can only be tested because meaningful differences between situations exist.
Research has made substantial progress in understanding why individuals regulate their emotions (Tamir 2016). However, we strongly encourage researchers to distinguish between emotion regulation goals and motives. We support our claim with some (arbitrarily chosen) examples of misleading terminology. Tamir et al. (2019) showed that pro-hedonic goals are associated with more psychological well-being in a cross-sectional design and lead to less negative emotions in an experiment. Measuring the extent to which participants generally pursue pro-hedonic goals represents an average score across different events and thus rather captures the effect of pro-hedonic motives. In their experiment, the activation of a pro-hedonic goal is manipulated in a specific situation, while the individual effects are kept constant because of the randomization into experimental and control groups. Accordingly, the experiment captures the effect of the activation of pro-hedonic goals in this specific situation. But experiments do not always reveal the situation effect. Millgram et al. (2015; Study 1 and 2) showed in an experiment that depressed individuals are more motivated to experience sadness (i.e., choosing to view sad pictures or listen to sad music again) than nondepressed individuals. Here, they did not manipulate the situation and also did not randomly assign individuals to an experimental or control group, but rather examined how the individual effect (i.e., depressed vs. nondepressed) in emotion regulation motives affects choices for pictures. Further, English et al. (2017) examined the effect of emotion regulation goals on strategies, but they did not model the individual and the situation effect explicitly. Their analysis therefore yields an aggregate of the effects of emotion regulation goals and motives (which do not necessarily have the same sign) and is thus hard to interpret (Curran and Bauer 2011).
We hope that these examples emphasize the need to distinguish between emotion regulation goals and motives in order to better understand emotion regulation from a between-person and within-person perspective.
Finally, we found that individuals differ meaningfully in their within-variance (i.e., the variability in the expression of emotion regulation goals across situations). Previous research highlighted the notion of emotion regulation flexibility as the adaptive alternation of emotion regulation strategies in accordance with personal goals and contextual demands (Aldao et al. 2015; Bonanno and Burton 2013). As emotion regulation goals are associated with specific strategies (Wilms et al. 2020b), a higher within-variance in emotion regulation goals may also be associated with a higher level of emotion regulation flexibility. Further, some first studies suggest that the variability in emotion regulation goals may be related to psychiatric disorders (Millgram et al. 2020a, b). Individuals with bipolar disorder appear to be motivated to experience lability, meaning they should be characterized by a higher variability in emotion regulation goals. These examples show the value of exploring the within-variance of emotion regulation goals between individuals.

Implications for methodology
Overall, we provided a comparison of the reliability-adjusted ICC(1), the uncorrected ICC(1) and the averaged ICC(1) of the single items. In comparison to the reliability-adjusted ICC(1), the uncorrected ICC(1) and the averaged ICC(1) of the single items underestimated the ratio of between-variance to total variance (i.e., the sum of between-variance and within-variance) because of measurement error (Wilms et al. 2020a). Since reliability represents a quality of the measurement and not of the psychological construct itself, the uncorrected ICC(1) misinforms on the time-variant and time-invariant nature of a construct, unless the measurement has perfect reliability.
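The direction of this bias can be made explicit with a short sketch. The notation is an illustrative assumption: $\sigma^2_b$ denotes the between-variance, $\sigma^2_w$ the true within-variance and $\sigma^2_e$ the measurement error variance, and we assume for simplicity that measurement error adds only to the observed within-variance.

```latex
\begin{align}
\mathrm{ICC(1)}_{\text{true}} &= \frac{\sigma^2_b}{\sigma^2_b + \sigma^2_w}, \\
\mathrm{ICC(1)}_{\text{uncorrected}} &= \frac{\sigma^2_b}{\sigma^2_b + \sigma^2_w + \sigma^2_e}
\;\le\; \mathrm{ICC(1)}_{\text{true}}.
\end{align}
```

Under this assumption, equality holds only if $\sigma^2_e = 0$, that is, if the measurement is perfectly reliable.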
Previous studies (i.e., English et al. 2017; Kalokerinos et al. 2017; Wilms et al. 2020b) did not report the reliability of their measures, since they used one item to measure the respective emotion regulation goal. It is very unlikely, however, that their measurement was perfectly reliable (Brown 1910; Nezlek 2017), which in turn should have led to an underestimation of the ICC(1) (Wilms et al. 2020a). The severity of the measurement error-induced bias can be estimated if we are willing to guess a reliability. Under the assumption of a reliability of 0.7 (0.5) and an uncorrected ICC(1) of roughly 0.3, the reliability-adjusted ICC(1) yields 0.38 (0.46) for the previous studies. Here we see that the corrected ICC(1) of previous studies is much larger if we reject the assumption of perfect reliability. Certainly, our correction is based on a guessed and not statistically estimated reliability, since none of the authors of the other studies reported reliabilities. We believe, however, that a reliability of 0.7, and even more so a reliability of 0.5 (Nezlek 2012, 2017), is much more realistic for a single-item measurement than perfect reliability. Of course, study design differences could additionally explain the difference in the ICC(1) estimates (Podsakoff et al. 2019), but previous studies were similar to ours in many respects. The number of days surveyed was similar to ours, with 7, 7 and 9 for English et al. (2017), Kalokerinos et al. (2017) and Wilms et al. (2020b), respectively. Each of the studies asked for the most negative event within a specific time frame. The number of surveys per day was different for Wilms et al. (2020b) with five prompts per day, while the rest of the studies were daily diaries, and indeed Wilms et al. (2020b) found the smallest ICC(1) across studies. This is in line with the findings of Podsakoff et al. (2019), since they showed that a higher number of surveys per day is associated with more within-variance across psychological constructs. The biggest difference probably lies in the chosen response format, which was dichotomous for the three previous studies and a seven-point extent scale for our study. Podsakoff et al. (2019) provided initial evidence that these response format differences do not affect the ICC(1) significantly.
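The rough correction for the previous studies can be reproduced with a minimal sketch. It rests on an illustrative assumption, namely that the guessed reliability applies to the observed within-variance only, so that the true within-variance equals the reliability times the observed within-variance. The function name `reliability_adjusted_icc` is hypothetical, and the sketch is not the multilevel estimator used in our analyses.

```python
def reliability_adjusted_icc(icc_uncorrected, reliability):
    """Adjust an uncorrected ICC(1) for a guessed reliability.

    Assumption (illustrative): measurement error inflates only the observed
    within-variance. With the total observed variance scaled to 1, the
    between-variance is icc_uncorrected and the observed within-variance is
    1 - icc_uncorrected; the true within-variance is reliability times that.
    """
    if not 0.0 < icc_uncorrected < 1.0:
        raise ValueError("icc_uncorrected must lie strictly between 0 and 1")
    if not 0.0 < reliability <= 1.0:
        raise ValueError("reliability must lie in (0, 1]")
    between = icc_uncorrected
    true_within = reliability * (1.0 - icc_uncorrected)
    return between / (between + true_within)


# Uncorrected ICC(1) of roughly 0.3 with guessed reliabilities of 0.7 and 0.5.
adjusted_07 = round(reliability_adjusted_icc(0.3, 0.7), 2)  # 0.38
adjusted_05 = round(reliability_adjusted_icc(0.3, 0.5), 2)  # 0.46
```

With a reliability of 1, the function returns the uncorrected value unchanged, which mirrors the point that the uncorrected ICC(1) is only accurate under perfect reliability.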
Accordingly, we argue that previous studies underestimated the ICC(1) substantially because of measurement error and thus falsely indicated a more time-variant nature of emotion regulation goals. All in all, this highlights the importance of correcting the ICC(1) estimate for measurement error, because the uncorrected estimate may otherwise overemphasize within-variance (Wilms et al. 2020a).

Limitations and directions for future research
The study has limitations that also point to directions for future research. First and foremost, we only examined emotion regulation goals for negative events. We cannot extend our results to positive events. Future studies may address this gap and test whether our results generalize to positive events.
Secondly, we only examined the ICC(1) for hedonic goals and social goals, but not other emotion regulation goals such as epistemic goals, performance goals and eudaimonic goals. Based on previous ICC(1) estimates of these goals (Kalokerinos et al. 2017), we predict that they will also show equal amounts of between-variance and within-variance. Future studies may test this prediction.
Thirdly, our sample was very young, mostly female and more educated than the general population. It remains unclear to what extent our results generalize to other populations. Previous research highlighted differences in the preference for pro-hedonic goals among age groups (Riediger et al. 2009). But English et al. (2017) and Kalokerinos et al. (2017) found very similar results (ICC(1) from 0.18 to 0.38 vs. ICC(1) from 0.15 to 0.30, respectively), even though their sample age was very different (M = 18 vs. M = 35.23). This provides a preliminary hint that the ratio of between-variance and within-variance does not change with increasing age. Future research could try to replicate our results in a more representative sample.
Fourthly, other studies may find it useful to increase the number of prompts per day. Doing so usually leads to a higher within-variance (Podsakoff et al. 2019). Further, it captures a more natural and broader spectrum of emotional intensities. Higher emotional intensity has been shown to trigger the initiation of emotion regulation processes (Milyavsky et al. 2019), and therefore should also activate emotion regulation goals (Eldesouky and Gross 2019). Capturing also lower emotional intensities may provide a more diverse picture of emotion regulation goals, since individuals may be less motivated to regulate their emotions in these situations. This could increase the within-variance and reveal a more fluctuating pattern for emotion regulation goals (e.g., Wilms et al. 2020b).
Fifthly, participants indicated their emotion regulation goals over the period of 9 days, and we therefore only tested the consistency in scores across a limited time period. Increasing the number of days may reveal a larger variability over longer time. Alternatively, researchers could use a longitudinal diary study (e.g., m periods of diary study with several weeks break in between). Accordingly, more research is needed to give a final answer to the ratio of between-variance and within-variance of emotion regulation goals.
Finally, we used the ICC(1) instead of the latent state-trait model (e.g., Steyer et al. 1999) to examine the ratio of the between-variance and within-variance for various reasons: the ICC(1) is frequently used for experience sampling studies (Podsakoff et al. 2019) and we used the ICC(1) to show current empirical evidence for the time-variant nature of emotion regulation goals. Additionally, the ICC(1) is easy to interpret, and latent state-trait models often need sample sizes above 500 to yield proper results (Cole et al. 2005). In summary, both the ICC(1) and the latent state-trait model provide estimates for the between-variance and within-variance. Future studies, particularly longitudinal ones over a longer period of time, may use the latent state-trait model to investigate the time-variant or time-invariant nature of emotion regulation goals, examining whether different estimation approaches yield similar results.

Conclusion
Overall, hedonic goals appear to be mostly equally time-invariant and time-variant, while social goals appear to be slightly more time-variant than time-invariant. We propose, firstly, to use the term emotion regulation goals to refer to the time-variant nature of the reasons for emotion regulation, and emotion regulation motives to refer to their time-invariant nature. Secondly, we conclude that the reasons for emotion regulation consist mostly equally of emotion regulation goals and emotion regulation motives. Finally, we conclude that previous studies underestimated the ICC(1) and falsely emphasized a time-variant nature of emotion regulation goals because of measurement error (Wilms et al. 2020a).