Ecological momentary assessment of emotional awareness: Preliminary evaluation of psychometric properties

The Levels of Emotional Awareness Scale (LEAS) is a well-validated performance measure of trait emotional awareness (EA); low levels of EA are associated with psychological and physical problems. EA is, however, expected to vary over time, and we aimed to adapt the LEAS to permit the measurement of EA in daily life as a function of momentary state. Twenty-five students completed 12 ecological momentary assessments (EMAs) of EA across 2 days. The correlation between the mean EMAs of EA and trait EA, and the change over time in EA, were examined. Findings revealed a significant positive correlation between state and trait EA. The within-person reliability was substantial, suggesting that EMAs can reliably assess EA over time across individuals. Importantly, latent state-trait analysis showed that about 50% of EA variability was due to state variance whereas only 2% of EA variability was due to trait variance. Preliminary psychometric properties suggest that the developed method allows for the measurement of EA in daily life and supports the claim that EA can be measured using both hypothetical (as in the LEAS) and real-life (using EMAs) scenarios.

measure EA as a trait. Studies are needed that uncover how EA unfolds in daily life and how it relates to changes in psychological and physical health.

Measuring Emotional Awareness
A frequently used measure of EA is the Levels of Emotional Awareness Scale (LEAS; Lane et al. 1990). The LEAS presents 20 scenarios, each involving two persons, and participants are instructed to anticipate and describe their own feelings in the scenario and the feelings of the other person. Greater complexity in the described emotional experience is associated with higher levels of EA. The measure emanates from the cognitive-developmental theory of EA (Lane and Schwartz 1987), which postulates that the ability to recognize and describe emotions, both in oneself and in others, undergoes a developmental path. The theory follows the structural characteristics of Piaget's theory of cognitive development, but identifies levels instead of stages. A greater degree of differentiation and integration of emotion is posited to occur with each higher level, and different levels can be used to characterize both transitory states and traits. Hence, variability in EA within individuals is expected to occur. The hypothesized variation in EA is further reinforced by the literature on mentalization-based therapy (Bateman and Fonagy 2013). Core to this perspective is the observation that at times, particularly during periods of high arousal or conflict, people temporarily lose the ability to mentalize about the thoughts and feelings of self and others. In therapy, therapists work with clients to identify, clarify, and, when appropriate, challenge existing perspectives on self and others. By rewinding and reviewing the sequence of events, the therapist aims to stimulate the individual's ability to mentalize. Importantly, individuals are not considered to have a permanent deficit in mentalization, but rather a temporary functional deficit that occurs as a function of arousal (Bateman and Fonagy 2013). These observations suggest that EA varies; however, this variability in daily life has not previously been addressed in empirical research.

Emotional Awareness in Daily Life
The LEAS can be used to examine individuals' average level of EA, but is insensitive to variations in actual behavior during the day (e.g., as a function of arousal). Yet being able to measure these variations in daily life might be crucial in order to understand why low levels of EA are associated with psychological and physical problems (Levine et al. 1997; Subic-Wrana et al. 2005). Daily assessments of EA are believed to provide a more complete understanding of EA and its relation to health, because individuals are studied in their natural environment and contextual and environmental factors can be accounted for. Ecological momentary assessments (EMAs) may be able to capture these variations in emotional processes within individuals, and these processes can be linked to specific contexts (Ebner-Priemer and Trull 2009). The EMA methodology minimizes some forms of retrospective bias and, because the assessments occur in real life, the findings are ecologically valid. Given the advantages of EMAs, developing an EMA version of the LEAS will not only help to gain insight into potential variations of EA, but will also help to study more specifically whether and how variations in daily EA are related to daily variations in psychological and physiological health.
It is notable that the LEAS performs well even though the scenarios are hypothetical and may not relate to the person's life. A more recent innovation is to score written text, like essays, pertaining to real life situations using the LEAS scoring system. A study showed that such essay scores about a current medical problem correlated (i.e., r = .30) with LEAS scores, indicating that trait LEAS scores do correspond to reported experiences in everyday life (Lane et al. 2012). The scoring system has also been used to rate the emotional content of individuals' descriptions of the 'emotional' movement of non-verbal stimuli (e.g., Stonnington et al. 2013). In that study the emotional content was found to be lower in patients with somatoform disorders compared to controls, consistent with expectations from cognitive-developmental theory that lower scores indicate a greater focus on bodily sensations when emotion is activated. As the LEAS scoring system was successfully used outside of its original context of hypothetical scenarios, it is conceivable that it can be used to score the emotional content in real life scenarios.

Current Study
The main aim of the current study was to pilot test whether EA could be assessed in daily life using real life scenarios (versus the hypothetical scenarios used in the LEAS) and to assess potential variability in EA in real life that is overlooked by the LEAS. To accomplish this, the LEAS was modified to be applicable as an EMA. Instead of rating a standard prespecified imaginary context, individuals use the context that they are actually in. Although this results in a decrease in control over the scenarios that are scored, it can provide large amounts of ecologically valid data and this could result in a reliable estimate of EA. For this pilot study, healthy participants were asked to complete six EMAs of EA per day for 2 consecutive days. Even though this sampling coverage is limited, it will provide a first indication of the applicability of the EA measure in daily life and it will provide preliminary psychometric properties. In addition to the EMAs of EA, participants completed a number of trait questionnaires (including the LEAS as a trait measure of EA). Firstly, a positive association was expected between the mean EMAs of EA ('state EA') and trait EA (based on earlier findings with the LEAS scoring system, see Lane et al. 2012; Stonnington et al. 2013). Secondly, the new measure allowed us to examine variations in state EA. Variation in state EA was expected, because the LEAS scoring system may not only be used to measure EA as a trait, but may also be used to describe transitory states (Lane and Schwartz 1987). The methodology that was used to examine variation in state EA is discussed in the section Analysis Plan.
To establish construct validity, we first examined the association between EA and the related concepts of alexithymia and psychological mindedness. Alexithymia includes problems in distinguishing between emotions and verbalizing emotions (Kooiman et al. 2002), and was expected to negatively relate to EA. Psychological mindedness refers to an interest in and the ability to be in touch with and reflect upon psychological processes (Nyklíček and Denollet 2009), and was expected to positively relate to EA. Second, the association between verbal ability and EA was examined to determine whether EA was not just a reflection of verbal ability (Lane et al. 1996). As EA is conveyed through language, a small (positive) correlation can be expected. Third, social desirability was measured, as this can be predictive of an individual's defensiveness and individuals high in defensiveness may be less inclined to self-disclose (Evans 1982). Based on previous findings by Lane et al. (2000), a negative association was expected between social desirability and EA. Lastly, we explored the associations between EA and explicit and implicit affect.

Participants
Twenty-five native Dutch students (i.e., 18/25 female, age range 19-33, mean age of 21.84 [SD = 2.94]) were recruited at Leiden University to participate in a study on emotions in daily life. No other inclusion or exclusion criteria were used. The study was approved by the Internal Review Board (CEP16-0407/165). All participants were randomized (1:1 ratio) to one of two conditions using a random number generator (https://www.random.org/). Twelve participants (i.e., 10/12 female) were allocated to the condition in which the (trait) questionnaires were completed before the EMAs and 13 participants (i.e., 8/13 female) completed the questionnaires after the EMAs.

Procedure
The study consisted of two lab sessions and 2 days with EMAs scheduled in between the lab visits. The EMA days were scheduled on two consecutive weekdays. All participants were required to complete the questionnaires and EMAs, but the temporal order depended on the participants' condition. By varying the order in these conditions, we could examine whether state EA was influenced by exposure to trait LEAS.
Participants signed up for the study and scheduled the lab sessions online. In the first lab session all participants were consented and they completed a demographic questionnaire. The experimenter explained that the aim of the study was to assess emotions in daily life and, to do this, participants were asked to complete multiple assessments in daily life using their smartphones. Participants received six EMAs per day, for a total of 2 days. The EMAs were randomly triggered using the MovisensXS application between 10 AM and 10 PM, with a minimum of 1 h between each assessment. Participants had the option to delay the assessment for 1 h (or less) or to dismiss the assessment. To incentivize full participation, participants were given 15 Euros when they completed at least 10 of the 12 EMAs (plus the two lab sessions). If participants completed fewer EMAs, they were given 10 Euros. Next, participants were shown what questions to expect in daily life and how to answer those questions. Furthermore, the Android-based application was installed on the smartphone of participants. If participants did not own an Android phone, they were lent one for the duration of the study (n = 10). Depending on the condition, the lab session was either finished or a number of questionnaires had to be completed. In case of the latter, participants were seated behind a computer and were asked to complete the LEAS, IPANAT, explicit affect questions, TAS-20, BIPM, and subscale social desirability of the EPQ-RSS. After this, the experimenter administered the vocabulary subscale of the WAIS-IV-NL.
During the second lab session, participants were first asked to complete the questionnaires (if these had not already been done in the first session). Next, participants received a debriefing and their compensation.
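The sampling scheme described in this section (six prompts per day, randomly placed between 10 AM and 10 PM, with at least 1 h between prompts) can be illustrated with a minimal sketch. This is not the MovisensXS algorithm, which is not described here; the function name `schedule_prompts`, its parameters, and the whole-minute resolution are assumptions made for illustration only.

```python
import random


def schedule_prompts(n_prompts=6, start_min=10 * 60, end_min=22 * 60,
                     min_gap=60, rng=None):
    """Return sorted prompt times (minutes after midnight) for one day.

    Hypothetical helper: draws n_prompts random times in
    [start_min, end_min] that are at least min_gap minutes apart.
    """
    rng = rng or random.Random()
    # Time left over once the mandatory gaps have been reserved.
    slack = (end_min - start_min) - (n_prompts - 1) * min_gap
    if slack < 0:
        raise ValueError("window too short for the requested prompts and gap")
    # Draw sorted offsets within the slack, then re-insert the gaps so that
    # consecutive prompts are always at least min_gap minutes apart.
    offsets = sorted(rng.randint(0, slack) for _ in range(n_prompts))
    return [start_min + off + i * min_gap for i, off in enumerate(offsets)]


if __name__ == "__main__":
    for day in (1, 2):
        times = schedule_prompts()
        print(day, [f"{t // 60:02d}:{t % 60:02d}" for t in times])
```

Reserving the gaps first and sampling only within the remaining slack avoids rejection sampling and guarantees a valid schedule on every draw.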

Measures
Emotional Awareness. Trait EA was measured with the computer-administered 10-item LEAS (Lane et al. 1990). The LEAS identifies five levels of EA, from low to high: 1) bodily sensations (e.g., sleepy), 2) global, undifferentiated emotions or action tendencies (e.g., good, laugh), 3) single emotions (e.g., happy), 4) blends of emotion (e.g., I would feel sad, but disappointed), and 5) combinations of blends (e.g., I would feel sad and angry; the other would feel happy and relieved). Each item of the LEAS presents a hypothetical emotion-eliciting social scenario including the participant and another individual (see Appendix 1). Participants indicate how they would feel and how the other person would feel.
Each word in the written responses is then scored using an extensive glossary (e.g., 0 = non-emotion word, 1 = bodily sensation, et cetera) and a coding scheme is used to construct a single score per scenario (for details see Lane et al. 1990). The score of each scenario is then summed to provide a measure of emotional complexity-with higher scores indicating greater awareness and differentiation in emotions. All answers were hand-scored (i.e., by author AV who received guidance from author RDL). Cronbach's alpha was adequate (.73). The scale has good intra-test, inter-rater and test-retest reliability, as well as strong evidence of validity (see Lane 2000).
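How per-scenario scores could be combined and summed can be sketched as follows. The combination rule used here (take the higher of the self and other ratings, and reserve level 5 for responses in which both reach level 4 and are differentiated from each other) is an assumption made to match the five levels listed above; the authoritative glossary and coding scheme are those in Lane et al. (1990). The function names and the example ratings are hypothetical.

```python
def scenario_score(self_level, other_level, differentiated=False):
    """Combine self and other ratings (each 0-4) into one scenario score (0-5).

    Assumed rule, for illustration only: the scenario score is the higher of
    the two ratings, except that two level-4 ratings describing clearly
    different blends are scored as level 5.
    """
    for level in (self_level, other_level):
        if level not in range(5):
            raise ValueError("self and other levels must be integers from 0 to 4")
    if self_level == 4 and other_level == 4 and differentiated:
        return 5
    return max(self_level, other_level)


def total_score(scenarios):
    """Sum scenario scores; each scenario is (self, other, differentiated)."""
    return sum(scenario_score(*scenario) for scenario in scenarios)


if __name__ == "__main__":
    # Hypothetical ratings for three scenarios, not data from the study.
    ratings = [(3, 2, False), (4, 4, True), (1, 0, False)]
    print(total_score(ratings))
```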
State EA was measured in daily life by asking participants how they felt in their current social interaction (both face-to-face and digital interaction qualified). If participants were not in a social interaction at the time of the assessment, they had to describe their most recent interaction. Next, participants had to indicate how the other person felt in the social interaction. When the interaction was with more than one person, the participant was instructed to describe the feelings of the person that was most significant to them at that moment in time. Participants could answer by typing in their answer on the smartphone (i.e., no word limit). Responses were scored using the LEAS scoring system (by author AV who received guidance from author RDL). At each assessment, participants also had to indicate whether a face-to-face or digital interaction was described, and the number of people involved in the social interaction. Screenshots of the smartphone application that was used to assess state EA are presented in Appendix 2.
Alexithymia. The 20-item Toronto Alexithymia Scale (TAS-20; Bagby et al. 1994) was used to measure alexithymia and items were answered on a 3-point Likert scale with 1 = disagree, 3 = neither disagree nor agree, and 5 = agree (i.e., this differs from the originally used 5-point Likert scale). The questionnaire has three subscales: (1) difficulty identifying feelings (identification scale), (2) difficulty in describing feelings (communication scale), and (3) externally oriented thinking, which reflects the tendency of individuals to focus their attention externally. Cronbach's alpha was satisfactory for the first two scales (respectively .74 and .84), but was unsatisfactory for the third scale (i.e., .38). This finding is in line with other studies (see Kooiman et al. 2002) and this subscale was therefore not included in further analyses.
Psychological Mindedness. The 14-item Balanced Index of Psychological Mindedness (BIPM; Nyklíček and Denollet 2009) measured psychological mindedness. The subscale 'interest' was used to measure an individual's attitude towards psychological states and processes, and the subscale 'insight' reflects the skill of also being aware of these processes. Items are scored on a 5-point scale ranging from 'not true' (0) to 'very much true' (4). Cronbach's alpha was adequate for 'interest' (.73), but was considerably lower for 'insight' (.60).
Verbal Ability. The vocabulary subscale of the Wechsler Adult Intelligence Scale was used to assess verbal comprehension (WAIS-IV-NL; Wechsler 2014). This test presents participants with 30 words of increasing difficulty and participants have to define each word. Each response receives a score (between 0 and 2) based on the level of precision. Based on age and gender, the total (raw) score is converted to a scaled score and this standardization helps to interpret the test scores across participants.
Social Desirability. The 12-item social desirability subscale of the Eysenck Personality Questionnaire (EPQ-RSS; Eysenck and Eysenck 1991) measures the extent to which individuals tend to give socially desirable responses. An example item is 'Are all your habits good and desirable ones?' and participants can either agree or disagree with the item. The items are formulated in extremes and agreement with such items (or non-agreement with reverse-scored items) is indicative of socially desirable responding. Cronbach's alpha was relatively low (.64).
Affect. Explicit affect was assessed on one occasion by having participants rate 12 emotional adjectives (e.g., angry; nine adjectives for negative affect and three adjectives for positive affect). Participants had to indicate to what extent they generally experienced these emotions. Cronbach's alpha was good for both explicit positive and negative affect (.87 and .82, respectively).
Implicit affect was assessed on one occasion using the Implicit Positive and Negative Affect Test (IPANAT; Quirin et al. 2009). Participants are told that the test examines how the meaning or mood of words from an artificial language can be communicated via sound. The test presents five nonexistent words (e.g., RONPE) and each word is coupled with 12 emotional adjectives. This results in 60 pairs that are scored on a 6-point scale ranging from 'doesn't fit at all' to 'fits very well.' Internal consistency, test-retest reliability, and construct and criterion-based validity were adequate among students (Quirin et al. 2009). Cronbach's alpha was good for implicit negative affect (.88), but inadequate for implicit positive affect (.52). The relation between implicit negative and explicit negative affect, and the relation between implicit positive and explicit positive affect, were in the expected direction (respectively, r = .35, p = .088 and r = .36, p = .079).

Analysis Plan
To examine the association between state EMA of EA and trait EA, a Pearson correlation analysis was performed. Next, we examined whether the EMA of EA reliably measured systematic variation in EA over time using reliability coefficients based on generalizability theory (Cranford et al. 2006). To this end, we calculated between- and within-person reliability coefficients (R_KF and R_C, respectively) across the 2 assessment days (for details see Cranford et al. 2006). EA ratings were person-centered for the analysis of within-person reliability, which was used as an indication of the reliability of change and reflects the proportion of variability due to changes in ratings over time across individuals. Moderate reliability was assumed for values between .61 and .80 and substantial reliability for values between .81 and 1.0 (Shrout 1998).
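As a rough illustration of how such coefficients are obtained from variance components, the sketch below computes a between-person and a within-person (change) reliability. The formulas are written as they are commonly given within this generalizability framework and should be checked against Cranford et al. (2006) before use; the estimation of the variance components themselves (e.g., from a person by occasion by item decomposition) is not shown, and the function names and example values are hypothetical rather than taken from the study.

```python
def between_person_reliability(var_person, var_person_item, var_error,
                               n_items, n_occasions):
    """Reliability of person means aggregated over items and occasions.

    Assumed form: true-score variance is the person variance plus the
    person-by-item variance averaged over items; error shrinks with the
    number of items and occasions.
    """
    true_var = var_person + var_person_item / n_items
    return true_var / (true_var + var_error / (n_occasions * n_items))


def within_person_reliability(var_person_occasion, var_error, n_items):
    """Reliability of within-person change across occasions.

    Assumed form: systematic person-by-occasion variance relative to itself
    plus residual error averaged over items.
    """
    return var_person_occasion / (var_person_occasion + var_error / n_items)


if __name__ == "__main__":
    # Hypothetical variance components, for illustration only.
    print(round(between_person_reliability(1.0, 0.2, 0.4, 2, 2), 3))
    print(round(within_person_reliability(0.5, 0.4, 2), 3))
```

Keeping the two coefficients as separate functions makes it explicit that they answer different questions: whether people can be distinguished from one another, and whether change within a person can be distinguished from noise.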
The latent state-trait (LST) theory was used to estimate the amount of occasion-specific (state) and across-occasion consistent (trait) variance (Hagemann and Meyerhoff 2008; Steyer et al. 1992). LST theory is an extension of classical test theory that allows for the decomposition of test scores into true latent states and true latent traits. These approaches normally require the use of structural equation modeling and thus large samples; however, Hagemann and Meyerhoff (2008) provided a simplified method that was shown to provide reliable LST estimates in small samples. This method allows the LST parameters to be estimated from the observed covariance matrix (see Hagemann and Meyerhoff 2008 for details of the modeling and estimation procedures). Specifically, estimates of the variability in EMA of EA scores for the self and other that is due to occasion-specific (state) versus across-occasion consistent (trait) variance are provided, as well as estimates of the reliability (the sum of the state and trait variances). In the present study, all error variances of the EA self-variables were set equal and all error variances of the EA other-variables were set equal. In addition, all variances of the latent state residuals were set equal and all paths were fixed to unity.
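The variance decomposition underlying these estimates can be written out as a short sketch. The notation is an assumption introduced here, not taken from the cited papers: Y_{ik} is the (mean-centered) score on indicator i at occasion k, \xi the latent trait, \zeta_k the latent state residual at occasion k, and \varepsilon_{ik} the measurement error, with all paths fixed to unity and the components mutually uncorrelated.

```latex
\begin{align*}
Y_{ik} &= \xi + \zeta_k + \varepsilon_{ik} \\
\operatorname{Var}(Y_{ik}) &= \operatorname{Var}(\xi) + \operatorname{Var}(\zeta_k) + \operatorname{Var}(\varepsilon_{ik}) \\
\text{trait proportion} &= \frac{\operatorname{Var}(\xi)}{\operatorname{Var}(Y_{ik})}, \qquad
\text{state proportion} = \frac{\operatorname{Var}(\zeta_k)}{\operatorname{Var}(Y_{ik})} \\
\text{reliability} &= \text{trait proportion} + \text{state proportion}
\end{align*}
```

Under this decomposition, whatever share of the observed variance is not attributed to trait or state is attributed to measurement error.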
Furthermore, Pearson's correlation analyses were done to examine the association between EA (both trait and state) and related variables. Spearman's correlation coefficients were used for explicit positive affect and for the communication subscale of the TAS, as these data were not normally distributed (and the distribution was not improved with data transformation). As the analyses are exploratory and not confirmatory, we did not adjust the alpha value to correct for the multiple correlations.
Table 1 displays means and standard deviations of the different assessments. No gender differences were observed. The level of EA in males was higher than usual, considering that males typically score significantly lower on EA than females. Moreover, two participants scored in the alexithymic range (based on TAS-20 cutoffs specified by Bagby et al. 1994). Data of one participant on the 'other' subscale of the LEAS were identified as an outlier by the outlier labeling rule (Hoaglin and Iglewicz 1987) and were therefore not included in the analyses. Participants completed on average 9.00 EMAs (SD = 2.38). Completing the EMAs took on average 123 s (SD = 108) and, on average, participants responded to the trigger of the EMAs after 9 min and 40 s (SD = 18:26 [mm:ss]). One participant completed only two EMAs and these EMAs were therefore excluded from the analyses. State EA was not significantly different between conditions (B = −0.28, SE = 0.22, 95% CI [−0.73; 0.18], p = .222).
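The outlier labeling rule mentioned above flags values that lie more than a multiple g of the interquartile range beyond the quartiles. A minimal sketch follows; the multiplier g = 2.2 is the value commonly attributed to Hoaglin and Iglewicz (1987) and is an assumption here, since the multiplier and quartile definition actually used are not reported. The function names and the example scores are hypothetical.

```python
import statistics


def outlier_labeling_bounds(values, g=2.2):
    """Return (lower, upper) bounds: Q1 - g*(Q3 - Q1) and Q3 + g*(Q3 - Q1)."""
    # Quartile definitions differ between packages; the 'inclusive' method
    # is one reasonable choice, not necessarily the one used in the study.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    spread = g * (q3 - q1)
    return q1 - spread, q3 + spread


def flag_outliers(values, g=2.2):
    """Return the values that fall outside the labeling bounds."""
    lower, upper = outlier_labeling_bounds(values, g)
    return [v for v in values if v < lower or v > upper]


if __name__ == "__main__":
    # Hypothetical scores, not data from the study.
    scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]
    print(outlier_labeling_bounds(scores))
    print(flag_outliers(scores))
```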

Trait and State Emotional Awareness
State EA was not different for face-to-face versus digital interactions (t(221) = −0.51, p = .612) and did not differ for interactions with one other person present versus multiple persons present (t(221) = 0.25, p = .804). A strong positive correlation was found between state EA and trait EA with r(24) = .69, p < .001 (i.e., total scale, see Table 2). Moreover, there were moderate positive correlations between the subscales of trait EA and state EA (i.e., 'self' emotions: r(24) = .59, p = .002; 'other' emotions: r(23) = .44, p = .034). Figure 1 displays how state EA changes over time for all participants. Between-person reliability of the EMAs of EA was moderate (R_KF = .69), indicating that the ratings of state EA reflect individual differences. Importantly, the within-person reliability was substantial (R_C = .91) and this suggests that the EMAs can be reliably used to assess EA over time across individuals. The minimum to maximum change in EA showed that none of the participants had identical levels of EA across the different EMAs. Six participants had scores on two different levels of EA, nine participants had scores on three levels, eight participants had scores on four levels and one participant had scores on all levels. Altogether, the results show that state EA is not invariable, but varies over time.
To determine the degree of variation, the LST theory was used. The results of the LST analysis showed that 46% of the self EA and 53% of the other EA variability was due to state variance, whereas only 2% of both self and other EA was due to trait variance. These results clearly show that the EMA of EA is capable of tapping occasion specific or state variability.

Association between Emotional Awareness and Related Constructs
Considering the small sample size, the following results must be interpreted with caution. In contrast with our expectation, both state and trait EA were not significantly associated with most related variables (see Table 1). There was a significant correlation between the communication subscale of the TAS-20 and state EA, with ρ = .52, p = .009. Contrary to our expectation, higher levels of state EA were associated with greater perceived difficulty in communicating feelings. To better understand this association, subsequent (exploratory) Spearman correlation analyses with the five items of the communication subscale and state EA were done. Two items were associated with EA: 'It is difficult for me to reveal my innermost feelings, even to close friends' and 'I am able to describe my feelings easily' (reverse-scored) (resp. ρ = .61, p = .002 and ρ = .41, p = .046). None of the other items were significantly associated with state EA. Explicit positive affect was the only variable that had a significant association with trait EA. Participants with higher levels of trait EA had lower levels of explicit positive affect (ρ = −.41, p = .041). A scatterplot identified one participant who scored two standard deviations above the mean for positive affect and after removing this datum the association was no longer significant (ρ = −.33, p = .112).
Fig. 1 Mean state emotional awareness for all participants across the 2 ecological momentary assessment days. Within-subject error bars represent ±2 SE

Discussion
The LEAS is a well-validated performance measure that assesses individuals' trait EA. The aim of the current project was to adapt the LEAS method to permit the measurement of EA in daily life as a function of momentary state. The preliminary findings showed that there was a strong positive correlation between the LEAS and the EMA of EA, suggesting that the two types of assessment measure the same construct and, importantly, providing further evidence that both hypothetical and real life scenarios can be used to measure EA. The data further provide a first indication that there is variability in EA over time. The LST estimates showed that approximately half of the variability in EMA of EA scores is due to state variance whereas there was negligible trait variance. This pilot study thus suggests that the developed methodology allows for the measurement of EA in daily life. Findings suggest that EA varies over time, but it is important to determine what these variations are due to. Whereas the LST analysis showed that the EMA of EA can be used to assess occasion-specific or state variance, there was still substantial unexplained variance left to be accounted for. Although type of interaction (face-to-face versus digital) and the number of persons involved did not influence state EA, other potential sources of variation (e.g., variation due to trait, state, occasion and measurement error) can be explored in systematic studies. For example, the setting or an individual's momentary mood might influence that person's current level of EA. Even though we do not necessarily have predictions about the direction of such effects, it would be worthwhile to examine whether and how people in the vicinity (e.g., friend vs foe), current activity (e.g., pleasant vs unpleasant), or mood can influence EA scores.
Furthermore, early adversity could be a factor that influences trait variance, considering that chronic stress, such as early adversity, has been found to be associated with problems in the processing of emotional information (Taylor et al. 2011) and the capacity for self-awareness (Colvert et al. 2008). Systematically studying such variables will give us a better idea of reliable sources of variation in EA other than measurement error.
Our findings regarding states and traits require some further explication. When the EMA scores on each occasion were aggregated over the 2 days, each EMA score was comparable to the score of one item on the trait LEAS. As such, the correlation of .69 between trait LEAS and EMA-LEAS indicates that a between-person comparison of the mean score of the LEAS reveals substantial consistency whether completing the trait measure consisting of hypothetical scenarios or the EMA measure reporting on real life experiences. When within-subject variability in the EMA scores across day 1 and day 2 was calculated, only 2% of the variance was attributable to a stable within-person trait. Thus, the within-subject variability within each day (i.e., occasion variance) was substantial, suggesting that in real life contexts variation in LEAS scores meaningfully captures variability within subjects in the complexity of emotional experiences over time. These results are comparable to EMA measures of other constructs (see Sliwinski et al. 2018) that vary over time. Future research is needed to understand the optimal psychometric properties of EMA data more generally as these methods become more common, and as trade-offs between psychometric reliability and participant burden are considered.
Contrary to our expectations, a positive association was found between state EA and perceived difficulty in revealing and describing emotions. This finding is in contrast with previous studies (e.g., Lane et al. 2000; Maroti et al. 2018) and might be due to our sample. Compared with previous studies, the current sample had unusually high EA levels, specifically in males. In a sample of aware participants, a performance measure might differ from a subjective assessment, like the TAS-20, that relies on self-report. To clarify, individuals who score high on a performance measure of EA might also realize how much they do not know and this realization could influence their self-report (i.e., these individuals might believe that they are inadequate in communicating emotions).
Another unexpected finding was the negative association between trait EA and explicit positive affect. This association might be due to the high levels of reported positive affect (i.e., mean = 4.68 on scale ranging from 1 to 6). High scores can result in a reduction of variability in the data and this can increase the chance of finding a false-positive (Austin and Brunner 2003). Indeed, when removing one extreme score the association was no longer significant.
The study is limited by the small sample size and the sample size is specifically small for the correlation analyses. Statistical power is reduced with small sample sizes and can result in lower study validity (i.e., it increases the risk for falsely accepting the null-hypothesis). The results of this pilot study should therefore be interpreted cautiously. Future studies should include more participants to examine the relation between EA (both trait and state) and related concepts. Additionally, to get a better understanding of the variability in EA, future studies should also extend the sampling coverage (i.e., more EMAs across more days). Considering the added burden to participants (which can result in reduced reliability of the collected data), it is important to carefully consider both the number of EMAs per day and the number of sampling days (for a full discussion, see Mehl and Conner 2012). Another limitation pertains to the response scale of the measure of alexithymia. The original questionnaire used a 5-point response scale whereas this study used a 3-point answering scale. Nevertheless, the labels were comparable and a similar scoring scheme could be applied. Therefore the scales and ensuing results are likely comparable.
Altogether, this pilot study indicates that EA can be measured in daily life using an adapted version of the LEAS. Moreover, both hypothetical and real life scenarios can be used when assessing EA. The preliminary psychometric properties suggest that EA varies over time and that the EMA of EA captures occasion specific variance. The latter is consistent with the original theory of levels of emotional awareness (Lane and Schwartz 1987), which proposed a theory of levels precisely because the complexity of emotional state varies over time within people as well as demonstrating substantial consistency in mean level between people. Future studies should 1) increase the sample size, 2) increase the study duration, and 3) systematically study variations in emotional processes whilst accounting for potential sources of variation. Overall, this study provides the first evidence that EA can be assessed in real life contexts and thus the EMA of EA can enrich our understanding of what it means to describe an individual as having a trait level of EA.