Introduction

Empathy is widely considered to be an important element in pro-social, moral and altruistic behaviour (Barbot & Kaufman, 2020; Eisenberg, 2003). Empathy has long been considered a multidimensional capacity, ability or skill. Empathy enables people to perceive the emotions of others, resonate with others emotionally and cognitively and to take the perspective of others (Davis, 1980; Eisenberg et al., 1997; Reiss, 2017). More recently, the multidimensional elements of empathy have been supported by the studies in neuroscience highlighting key differences between the pathways in the brain for different empathetic reactions (e.g. Decety & Jackson, 2006). Although there is dispute in the literature about the composing facets of empathy, there is an agreement that there are both cognitive and affective elements (Decety & Jackson, 2006; Eisenberg, 2000). Broadly, cognitive empathy is considered the ability to understand and recognise the way another feels (Preston & Hofelich, 2012). Affective empathy is considered the capacity to experience (or share in) the emotions of another (Jolliffe & Farrington, 2004). These two concepts are often linked with sympathy, which is a related but different construct, referring to feelings of sorrow and concern for another’s misfortune (Vossen et al., 2015).

The three constructs vary in their definition despite being related and sometimes used interchangeably. As mentioned above, cognitive empathy is the capability to understand another’s feelings, whereas affective empathy is focused on the experience of another’s feelings. Sympathy, or concern for another’s misfortune, is often confused with affective empathy, particularly as both can involve feelings of sorrow and both can lead to subsequent pro-social behaviour (Vossen et al., 2015). Consequently, research on empathy has been occasionally problematic, as measurement of affective empathy can overlap with various features of sympathy (Vossen et al., 2015).

Experiencing sympathy does not rely on a ‘sameness’ between self and other in the way experiencing affective empathy does. Affective empathy can be described as an ‘emotional resonance’ between the self and other that is not necessary for sympathy (Keum & Shin, 2016). That is, to feel sympathy, you do not need to experience the same emotion as the other person (e.g. one feels sorry, the other feels anger). Some empathy theorists draw the distinction in terms of the behaviours each elicits: Chismar (1988) suggests sympathy involves an egotistic motivation to help (conscious) and thus cannot lead to true altruism (unconscious) like empathy can.

Empathy and sympathy are separate, but related, constructs (Decety & Michalska, 2010). Reiss (2017) explains the distinction through this example:

You look out your office window and see a man in the cold rain, shivering, no raincoat. You feel sorrow for this person. This is sympathy. Empathy is the capability to imagine as if oneself is next to the man, out there in the rain. It’s the capability to experience their specific discomfort as if it was all your own; without losing the sense of the ‘as if’. If we lose the ‘as if’ we are unable to move beyond our own self-interests; and it is with the ‘as if’ that motivates a caring empathetic response (e.g. taking down an umbrella). (p. 13)

Table 1 highlights some of the key demarcations between these three concepts.

Table 1 Key points of demarcation between cognitive empathy, affective empathy and sympathy

With these concepts clearly demarcated, we now turn our attention to how they play out in the school environment.

The development of empathy and the role of schools

Empathy can be considered a developmental, malleable skill (Ratka, 2018). Preschool children begin to show the capability to take another person’s perspective, which leads to a development in cognitive empathy (Bensalah et al., 2016). Throughout childhood and adolescence, cognitive and emotional empathy increase through a combination of biological and environmental influences (Allemand et al., 2015; Heyes, 2018). Environmental influences on empathy include parenting styles and relationships (Feldman, 2007; McDonald & Messinger, 2011) and social media (Vossen & Valkenburg, 2016).

While affective empathy is sometimes considered less influenced by the environment and a more stable and inherited type of empathy (i.e. trait, dispositional empathy), there is current evidence that both affective and cognitive empathy undergo a period of growth in adolescence that can be stimulated (Bunge et al., 2002; Decety, 2020; Decety & Jackson, 2004; Frith & Frith, 2003; Johnson, 2012; Schwenck et al., 2014). Furthermore, adolescence tends to be considered a time where any developmental changes have a long-term consequence; there is a predictive element between empathy development during adolescence and social outcome variables (such as perceived social integration and relationship satisfaction) in adulthood (Allemand et al., 2015).

The ability to ‘feel and show empathy’ is one of the key characteristics of the ‘social awareness’ skill and can be explicitly taught in schools (CASEL, 2019). There is evidence that both cognitive and affective empathy can be improved in this age group with programmes that can be delivered in an educational environment (e.g. Castillo et al., 2013). Some research suggests that these programmes can improve empathy for short periods of time, and although it may then decline, there tends to be a lasting awareness of others, or an overall improvement in a related ‘empathetic response’ (Herrera et al., 2018). Other research (Doreille et al., 2021) has found empathy training programmes show a sustained improvement in empathy that is retained months later.

While in Australia there are a range of programmes and resources for socio-emotional learning (SEL), and direction from the Australian curriculum to implement such programmes, it is largely the school's responsibility to define where SEL fits into their curriculum (Bowles et al., 2017). The investment in these programmes can be significant for schools and deciding how to implement such programmes requires careful thought. However, the research base does not focus on the adolescent period where empathy developmentally increases, with only 13% of studies being on high school students (Years 9–12), and 31% on middle school (Years 6–8). Additionally, most studies were completed in the United States (Durlak et al., 2011).

Given schools use SEL programmes to develop empathy during the adolescent developmental period, we now turn to the role digital technologies might offer given their increasing use in the education system.

The role of virtual reality as the ‘Ultimate Empathy Machine’

Social emotional learning programmes in schools over the last decade have embraced the use of digital technology. Theorists have suggested that to experience empathy, we need to see and gain empathetic cues (such as facial expressions, body language, tone of voice) from the other, and that only a truly interactive experience can promote empathy (Hassan, 2020; Reiss, 2017). As digital technologies improve, they have the potential to provide these cues more authentically.

The idea of psycho-social skill development using technology is closely linked to the technology’s ability to change an individual’s presence. Two key terms are important here. The first is immersion, which refers to the extent to which an individual physically experiences the virtual world (Slater, 2003). For example, an experience in which sound and sight are limited to the virtual world and effectively block out the real world (referred to as sensory fidelity) is more immersive than one in which they are not. The second is presence, which refers to the extent to which one is involved in human experience. Improving immersion (with increased attention to sensory modalities) may increase presence (Baños et al., 2004). There is also a connection between empathy and presence. Nicovich et al. (2005) suggested that empathy refers to the individual connecting to another person, whereas presence is the individual connecting to another environment. Experiencing empathy uses similar perceptual tools to experiencing presence. However, this research posits that without presence you cannot have empathy; that is, if you do not see someone shivering in the rain, you cannot empathise. As Brinck (2018) suggests, it is this presence that provides a platform for the experience of empathy.

Virtual reality offers a higher level of immersion (and presence) than traditional methods of viewing media content such as 2D-projected films (watching on a screen) (Makransky et al., 2019; Sanchez-Vives & Slater, 2005; Vesisenaho et al., 2019). Someone using virtual reality can see a full 360-degree environment, choose where they look, and completely block all sounds from the external world. It follows that if virtual reality increases immersion and presence, there is a possibility it may also increase empathy compared to 2D media forms, and this has been supported by research (Alberghini, 2020; Barbot & Kaufman, 2020; van Loon et al., 2018). However, there is also research to suggest virtual reality does not improve empathy, but possibly can improve related constructs such as pro-social behaviour (e.g. signing petitions) and attitude change (Hargrove et al., 2020; Herrera et al., 2018; Ventura et al., 2020). Others have found no significant difference between 2D-projected film and virtual reality (Bang & Yildirim, 2018). The perspective-taking element to (cognitive) empathy appears key in research involving virtual reality and subsequent improved empathetic responses or behaviours associated with empathy (Barbot & Kaufman, 2020; Herrera et al., 2018).

More recently, research has shown that using virtual reality can increase both types of empathy. Schutte and Stilinović (2017) found that virtual reality headsets can increase engagement compared to 2D-projected films, which was associated with a greater overall experience for both affective and cognitive empathy. Similarly, Alberghini (2020) found improvements in empathy when comparing the virtual reality and 2D experience with adolescents. Martingano et al. (2021) found that empathy can be improved using virtual reality, although there were no differences in charitable donations to a related charity compared to their control conditions. The films used in these studies included Clouds over Sidra and Step into a Refugee Camp and were designed to promote empathetic responses.

Therefore, empathy can elicit pro-social behaviour and may be influenced by external factors and programmes. Given the sensitivity of adolescence as a time of important development of empathy, and the past confusion over the measurement and definition of the types of empathy and sympathy, it is appropriate that research is conducted with this adolescent age group that investigates both forms of empathy, with sympathy as a comparison. This research aims to add to the body of literature in understanding the role virtual reality plays in improving empathy in adolescence.

In summary, there are critical periods of empathy development in adolescence that are associated with positive social outcomes. There is also a possibility that virtual reality can improve empathy, and a potential to use this technology for SEL during this period when face-to-face learning is not always feasible. The next section considers the research questions posed for this research.

Research questions and hypotheses

The following research questions were investigated in this study:

1) Do adolescents receiving the same information from different mediums with different immersion levels (virtual reality vs 2D-projected film) have different empathetic and sympathetic reactions?

2) Does the use of a virtual reality medium affect adolescent empathy and sympathy?

In the current study, the following hypotheses were examined:

1) That 13–15-year-old students who experience the documentary Clouds over Sidra in any condition (VR or 2D) will show an increase in affective and cognitive empathy and sympathy.

2) That 13–15-year-old students who experience the documentary Clouds over Sidra using virtual reality (VR) will experience a larger increase in empathy after viewing the documentary compared to those viewing the 2D-projected format.

3) That 13–15-year-old students who use virtual reality (VR) to watch the film Clouds over Sidra will experience a larger increase in sympathy after viewing the documentary compared to those viewing the 2D-projected format.

Method

Participants

Research took place at an Australian independent, co-educational school in Melbourne’s south-eastern suburbs. Year 8 (aged 13–15 years) was the chosen population for the following reasons. Firstly, school administrators felt this year group would benefit from the link between the film and the curriculum, as students covered the topic in their regular coursework. Secondly, the manufacturer of the virtual reality headsets recommended they not be used with children under 12 years old. Thirdly, the period of adolescence shows a tremendous growth in empathy. Finally, the documentary used (Clouds over Sidra) is about a 12-year-old. Participants were therefore of a similar age, which potentially enables a more authentic empathetic reaction.

A total of 116 participants enrolled in the study and were involved in the experiment at time 1. There were two individuals who dropped out, meaning there were 114 participants left at time 2, which was approximately 8–10 min after viewing the film. After further dropouts and absences, there were 77 participants who completed the research two weeks later at time 3 resulting in an attrition rate of close to 30%. Adolescents surveyed near the end of middle school or secondary school are especially prone to attrition (Murray & Xie, 2024). Thus, steps to minimise attrition for this age group were taken, including using digital tools (online survey), using existing relationships with teaching staff as rapport, and reducing barriers such as time to do the task (Murray & Xie, 2024). This rate is somewhat expected and consistent with psychological research involving adolescents over a time period of two weeks (Chin et al., 2021; Farris et al., 2020; Graham, 2009). The reasons for this rate of attrition include possible absences from class and the lack of presence of the researcher during the survey at time 3. Higher attrition rates affect the power of tests and generalisability.

From the initial 116 participants, the experimental group (time 1, n = 63) consisted of single-sex girls and boys classes (males = 35, females = 28). The control group (time 1, n = 53) also consisted of both girls and boys classes (males = 29, females = 24). In total, students from nine classes in the school participated in the research; all participants were aged between 13 and 15 years.

Procedure

The regular classroom teacher invited students to take part in the research during their regular humanities class by handing them the plain language statement. As participants were under 18, caregivers were emailed via the school’s administration software regarding their involvement including the plain language statement, withdrawal rights and consent form (for students and parents). If students did not want to participate, they were informed that it would have no effect on grades or reports, and a similar alternative activity was offered. No participant incentives were offered. Students who did not consent to being part of the research were invited via email to have a turn using headsets later, as were participants in the 2D condition. The research was approved by the University of Melbourne ethics committee in March 2018.

To protect privacy and minimise any experimenter effect, a third-party generated code was used to match participant responses over time. These codes were emailed to participants by the third party. The code consisted of a letter (indicating condition) and three numbers (indicating the individual). Those coded with A### were in the virtual reality condition, and those coded with B### were in the 2D film condition. At each time, participants were asked to enter the code to match their responses over time.
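The coding scheme described above can be sketched in a few lines of Python. This is only an illustration of how a letter-plus-three-digit code could be validated and used to match responses across time points; the function names, the data layout and the example codes are hypothetical and are not taken from the study materials.

```python
import re
from typing import NamedTuple

# Mapping taken from the description above: A = virtual reality, B = 2D film.
CONDITIONS = {"A": "VR", "B": "2D"}

# One letter (A or B) followed by exactly three digits, e.g. "A017".
CODE_PATTERN = re.compile(r"([AB])(\d{3})")


class ParticipantCode(NamedTuple):
    condition: str
    participant_id: str


def parse_code(raw: str) -> ParticipantCode:
    """Validate a participant code and split it into condition and ID."""
    match = CODE_PATTERN.fullmatch(raw.strip().upper())
    if match is None:
        raise ValueError(f"Not a valid participant code: {raw!r}")
    letter, digits = match.groups()
    return ParticipantCode(CONDITIONS[letter], digits)


def match_responses(time_points: dict) -> dict:
    """Group survey scores from several time points by participant code.

    `time_points` maps a time label to a {code: score} dictionary
    (a hypothetical layout chosen for this sketch).
    """
    matched = {}
    for time_label, rows in time_points.items():
        for code, score in rows.items():
            matched.setdefault(parse_code(code), {})[time_label] = score
    return matched
```

Normalising case and whitespace before matching is a design choice made here so that small typing differences between time points do not break the match.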

At the chosen school, Year 8 classes were stratified by gender: four girls’ classes and five boys’ classes. The participants completed the tasks in their usual class time and were allocated to the experimental (VR) or control (2D) condition depending on the constraints of the school timetable and resources. For example, the virtual reality headsets had to be charged after each use, so classes were allocated based on whether the previous class had used them or not. This meant that the allocation to the control/experimental groups was not truly random (thus classified as quasi-experimental).

Experimental condition

In the experimental (VR) condition, the researcher (with support of the classroom teacher) asked the participants to complete the survey below online, prior to any experience which was recorded as time 1. The participants then watched the documentary Clouds over Sidra using the VR headset and headphones. Immediately after viewing, they completed the survey for a second time (time 2). The time between time 1 and time 2 was approximately 10 min; they completed the survey immediately before and after watching the documentary. Approximately two weeks later, participants were asked to complete the survey a third time (time 3) in the same class, with the same classroom teacher. At all times, participants were reminded they could leave the research at any time.

Control condition

Participants in the control condition were surveyed the same three times as the experimental group participants. The only difference was that in the control condition participants watched a 2-dimensional (2D) viewing of the documentary Clouds over Sidra projected onto a whiteboard, instead of via a virtual reality headset.

The participants completed the research over a period of three weeks. This allowed change in empathy and sympathy to be measured across time and between conditions (VR or 2D). Therefore, the design of the research was ‘within-between’.

Materials

The film

Developed in 2015 by the United Nations, Clouds Over Sidra (https://www.with.in/watch/clouds-over-sidra/) is an eight-minute documentary developed to raise awareness of the Syrian Refugee Crisis. It follows a ‘day in the life’ of Sidra, who is a female 12-year-old refugee who narrates the programme. It was filmed using a 360-degree camera for use in virtual reality and is available in 2D form. This film was shown due to its specific development by the United Nations to improve empathy and understanding by gaining the perspective of an adolescent refugee in the Syrian Refugee crisis. This film has also been used in previous research studies on changes to empathy using virtual reality (Alberghini, 2020; Martingano et al., 2021; Schutte & Stilinović, 2017). Some of these changes to empathy include improved empathetic reactions when watching the film in virtual reality compared to other 2D films, including short-term improvements in altruistic behaviour (Alberghini, 2020), improvements in empathy (using the Davis empathetic scale; 1983) and improvements in engagement in the VR condition compared to 2D version of the same film.

The virtual reality headset

In the experimental condition, participants used virtual reality headsets. The headsets are fully adjustable for individuals, with focus dials for each eye and an overall ‘depth’ focus dial. Headphones were used to provide auditory immersion. In the alternative condition, the 2D version of the film was projected onto a whiteboard to form a large screen, which is typical to how the participants watch classroom films. The room and seating were the same for both conditions.

The survey measurement

Empirical research and measurement in this area offer varying definitions of various subsets of empathy and sympathy (Jolliffe & Farrington, 2006a, 2006b; Reniers et al., 2011; Vossen & Valkenburg, 2016; Vossen et al., 2015). One of the most popular tools used to measure empathy is the Interpersonal Reactivity Index generated by Davis in the 1980s (Melchers et al., 2016). This Index measures empathy over four subscales: perspective-taking, fantasy, empathetic concern and personal distress. While widely used, this tool has been criticised for not making a clear distinction between empathy and sympathy or accurately measuring both types of empathy (Chrysikou & Thompson, 2016; Vossen & Valkenburg, 2016). Specifically, the empathetic concern (EC) subscale (commonly associated with affective empathy) has been criticised as not differentiating between sympathy and empathy (Jolliffe & Farrington, 2006a, 2006b; Vossen & Valkenburg, 2016). The EC subscale aims to measure ‘the tendency of the respondent to experience feelings of warmth, compassion, and concern for others undergoing negative experiences’, which refers more closely to the definition of sympathy than empathy (Davis, 1980, in Vossen & Valkenburg, 2016, p. 120). On close analysis of the subscale, the items also tend to be more associated with feelings of sympathy, for example: ‘sometimes I don’t feel very sorry for people when they are having problems’ (Davis, 1980, p. 2). Davis (1983) later acknowledged that the EC scale assesses other-oriented feelings of sympathy, which further supports Vossen and Valkenburg’s (2016) analysis. However, there is still disagreement over these definitions, as some modern empathy research suggests that the empathetic concern scale is perhaps more a ‘motivating’ call to action for pro-social behaviour, otherwise called ‘motivational empathy’.

Along with demographic and coding questions, this research used the Adolescent Measure of Empathy and Sympathy (AMES) Survey (Vossen et al., 2015) at each time point to measure empathy and sympathy. The AMES survey was developed by Vossen et al. (2015) after analysing key issues in the measurement of cognitive empathy, affective empathy and sympathy (as described above). Vossen et al. (2015) developed and validated the AMES survey as a measure to ‘differentiate between empathy and sympathy and to balance emphasis of cognitive empathy and affective empathy’ (Vossen et al., 2015, p. 2). According to Sesso et al. (2021), the AMES survey is one of the few empathy measures that has assessed test–retest reliability. The scores for test–retest reliability were satisfactory according to the authors, with r = 0.56 for affective empathy, r = 0.66 for cognitive empathy and r = 0.69 for sympathy (Vossen et al., 2015). The AMES survey also shows internal consistency (α = 0.75–0.86) (Sesso et al., 2021). The AMES survey is appropriate for adolescents and its language is simplified from adult surveys. Vossen et al. (2015) suggest the validation and reliability confirmation indicates the survey is appropriate for those aged 10–15 years.

The AMES survey offers four statements each on cognitive empathy, affective empathy and sympathy. The measure uses a Likert-type scale with the options of (1) never, (2) almost never, (3) sometimes, (4) often and (5) always. An example of a cognitive empathy statement is ‘I can easily tell how others are feeling’ and an affective empathy statement example is ‘When people around me are nervous, I am nervous too’ (Vossen et al., 2015). Sesso et al. (2021) advise the context and setting is the most important element in choosing the right survey for empathy and sympathy, proposing, for example, that the Interpersonal Reactivity Index is perhaps more suited to clinical conditions. Given the current research was conducted on 13–15-year-olds with the clear purpose to differentiate sympathy from empathy, the AMES survey was the most suitable choice for our study.
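As an illustration of the scale structure just described, the sketch below converts the five response options to the values 1 to 5 and averages the four statements in each subscale. Two things are assumptions rather than details from the source: the item labels (`cog1`, `aff1`, and so on) are hypothetical placeholders, and scoring each subscale as the mean of its items is assumed because the reported scores fall on the same 1 to 5 range.

```python
from statistics import mean

# Hypothetical item labels: the text states only that each subscale has four
# statements, not how the items are ordered in the questionnaire.
SUBSCALES = {
    "cognitive_empathy": ["cog1", "cog2", "cog3", "cog4"],
    "affective_empathy": ["aff1", "aff2", "aff3", "aff4"],
    "sympathy": ["sym1", "sym2", "sym3", "sym4"],
}

# Response options and values as listed in the text.
LIKERT = {"never": 1, "almost never": 2, "sometimes": 3, "often": 4, "always": 5}


def score_subscales(responses: dict) -> dict:
    """Return the mean 1-5 rating per subscale for one participant.

    Mean scoring is an assumption made for this sketch.
    """
    scores = {}
    for name, items in SUBSCALES.items():
        ratings = [LIKERT[responses[item].strip().lower()] for item in items]
        scores[name] = mean(ratings)
    return scores
```

For example, a participant answering ‘often’ to every sympathy statement would receive a sympathy score of 4 under this scheme.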

Data analysis procedure

The data were analysed in SPSS. To compare the difference in cognitive empathy, affective empathy and sympathy between the three time points, a repeated-measures MANOVA was used as it measures between and within subject effects. Between-subject effects involve measuring differences between the control (2D) and experimental (VR) condition. The between-subject effects analysis assists in answering Research Question 1: Do adolescents receiving the same information from different mediums (virtual reality vs 2D-projected film) that have different immersion levels have different empathetic and sympathetic reactions? That is, is there a difference between the control group (2D) and the experimental group (VR)?

Investigating within-subjects effects directly addresses Research Question 2: Does the use of a virtual reality programme affect adolescent empathy and sympathy? This is because a within-subjects design investigates changes before and after an experience (such as watching a film using virtual reality headsets). The design is repeated measures as the participants were asked to complete the survey at three different time points. The three dependent variables measured are cognitive empathy, affective empathy and sympathy.

Results

Descriptive statistics and reliability

Table 2 shows the descriptive statistics for cognitive empathy, affective empathy and sympathy at each measurement point (time 1, time 2 and time 3). The first correlations presented are those between cognitive empathy, affective empathy and sympathy at time 1. These are followed by the correlations of each construct with itself across times 1 and 2 (cognitive empathy, affective empathy and sympathy), together with the correlations between constructs (e.g. cognitive empathy and affective empathy). Finally, the correlations of the three constructs with themselves and with each other across times 1, 2 and 3 are presented.

Table 2 Descriptive statistics and Cronbach’s reliability co-efficient

Correlation and means

The following interpretations of each correlation come from Mukaka (2012). Each factor had a high positive correlation with itself across the time points.

Cognitive empathy

There was a high positive correlation between time 1 and time 2 cognitive empathy (r = 0.81, p < 0.01); time 1 and time 3 (r = 0.78, p < 0.01); and time 2 and time 3 (r = 0.75, p < 0.01).

Affective empathy

There was a high positive correlation between time 1 and time 2 affective empathy (r = 0.82, p < 0.01); time 1 and time 3 (r = 0.76, p < 0.01); and time 2 and time 3 (r = 0.73, p < 0.01). Affective empathy at time 1 had a moderate positive correlation with sympathy at time 2 (r = 0.54, p < 0.01) and time 3 (r = 0.51, p < 0.01). Affective empathy at time 2 had a moderate positive correlation with sympathy at time 2 (r = 0.65, p < 0.01) and time 3 (r = 0.51, p < 0.01). Affective empathy at time 3 had a moderate positive correlation with sympathy at time 3 (r = 0.55, p < 0.01).

Sympathy

There was a high positive correlation between time 1 and time 2 (r = 0.84, p < 0.01); time 1 and time 3 (r = 0.87, p < 0.01); and time 2 and time 3 (r = 0.83, p < 0.01). Sympathy at time 3 had a moderate positive correlation with affective empathy at time 3 (r = 0.50, p < 0.01).

All other relationships had low positive correlations.
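The verbal labels used above can be reproduced with a small helper. The sketch below encodes the rule of thumb from Mukaka (2012) as it is commonly summarised (0.90 and above very high, 0.70 to 0.90 high, 0.50 to 0.70 moderate, 0.30 to 0.50 low, below 0.30 negligible). Treating each lower bound as inclusive is an assumption, chosen because it matches the way r = 0.50 is described as moderate in the text.

```python
def interpret_correlation(r: float) -> str:
    """Label a correlation coefficient using Mukaka's (2012) rule of thumb.

    Lower bounds are treated as inclusive (an assumption of this sketch).
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    size = abs(r)
    if size >= 0.90:
        strength = "very high"
    elif size >= 0.70:
        strength = "high"
    elif size >= 0.50:
        strength = "moderate"
    elif size >= 0.30:
        strength = "low"
    else:
        return "negligible"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction}"
```

Applied to the values reported above, `interpret_correlation(0.81)` gives "high positive" and `interpret_correlation(0.54)` gives "moderate positive".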

Means

The mean scores of sympathy (time 1 M = 4.29, time 2 M = 4.35, time 3 M = 4.16) were higher than both of the empathy scores at each time. For all factors, the mean scores increased between times 1 and 2 and subsequently decreased between times 2 and 3. Average scores at time 3 were the lowest recorded for each factor.

Reliability

Cronbach’s alpha found good internal consistency in most of the items at each time for each factor, except for time 1 sympathy items (α = 0.62). This is considered a questionable score according to George and Mallery (2003). According to George and Mallery (2003), the items used in the study had good reliability over time for cognitive empathy (α = 0.80) and affective empathy (α = 0.81). The items to investigate sympathy had acceptable reliability over time (α = 0.70). Overall, the average of all items over time had acceptable reliability (α = 0.77).
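For readers unfamiliar with the statistic, the following is a minimal sketch of how Cronbach’s alpha is computed from item-level responses, together with the George and Mallery (2003) labels used above. It is not the SPSS procedure used in the study; the data layout (one row per participant, one column per item) and the inclusive cut-offs are assumptions, the latter chosen to match how α = 0.80 and α = 0.70 are labelled in the text.

```python
from statistics import variance


def cronbach_alpha(item_scores: list) -> float:
    """Cronbach's alpha for rows of participants and columns of items."""
    n_items = len(item_scores[0])
    if n_items < 2 or len(item_scores) < 2:
        raise ValueError("Need at least two items and two participants")
    # Sample variance of each item (column) and of the participants' totals.
    columns = list(zip(*item_scores))
    item_variance_sum = sum(variance(col) for col in columns)
    total_variance = variance([sum(row) for row in item_scores])
    return (n_items / (n_items - 1)) * (1 - item_variance_sum / total_variance)


def describe_alpha(alpha: float) -> str:
    """Rule-of-thumb label after George and Mallery (2003)."""
    if alpha >= 0.9:
        return "excellent"
    if alpha >= 0.8:
        return "good"
    if alpha >= 0.7:
        return "acceptable"
    if alpha >= 0.6:
        return "questionable"
    if alpha >= 0.5:
        return "poor"
    return "unacceptable"
```

With these cut-offs, the time 1 sympathy value of α = 0.62 is labelled "questionable" and the overall value of α = 0.77 is labelled "acceptable", as reported.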

Between and Within-Subjects Analyses

An a priori power analysis was conducted using G*Power (Faul et al., 2007). Results indicated the required sample size to achieve 80% power for detecting a medium effect, with a significance criterion of α = 0.05, was N = 78 for the MANOVA. The obtained sample size at time 1 of N = 116 met this criterion.

All assumptions were met except for normality of data. The Shapiro–Wilk test of normality was used as this is the recommended test in terms of power and can be used in up to 2,000 cases (Hernandez, 2021). In all cases, the data appeared not to be normally distributed. We additionally ran a Kolmogorov–Smirnov test to check for normality, which demonstrated a non-normal distribution for all except five of the 18 cases. However, some researchers suggest that if skewness and kurtosis are within an acceptable range, and the sample size is large enough (> 30 in each condition), MANOVA is robust to this violation (Blanca et al., 2017). Table 3 shows that skewness and kurtosis were within normal ranges (Brown, 2015).
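To make the two shape statistics concrete, the sketch below computes skewness and excess kurtosis from the central moments of a set of scores. It uses the simple moment-based definitions; statistical packages such as SPSS typically report bias-corrected versions, so values will differ slightly from those in Table 3. The function names are illustrative only, and no acceptable range is encoded because the text does not state one.

```python
def central_moment(values: list, order: int) -> float:
    """Mean of the deviations from the mean, raised to the given power."""
    centre = sum(values) / len(values)
    return sum((v - centre) ** order for v in values) / len(values)


def skewness(values: list) -> float:
    """Moment-based skewness: 0 for a symmetric distribution."""
    m2 = central_moment(values, 2)
    if m2 == 0:
        raise ValueError("Skewness is undefined when all values are equal")
    return central_moment(values, 3) / m2 ** 1.5


def excess_kurtosis(values: list) -> float:
    """Moment-based kurtosis minus 3, so a normal distribution scores 0."""
    m2 = central_moment(values, 2)
    if m2 == 0:
        raise ValueError("Kurtosis is undefined when all values are equal")
    return central_moment(values, 4) / m2 ** 2 - 3.0
```

A positive skewness indicates a longer right tail, and a negative excess kurtosis indicates a flatter distribution than the normal curve.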

Table 3 Skewness and kurtosis for cognitive empathy, affective empathy and sympathy across time and condition

Change in empathy and sympathy over time

Table 4 shows that there was an effect of time on cognitive empathy, F(2, 150) = 3.51, p < 0.05. There was an effect of time on affective empathy, F(2, 150) = 3.41, p < 0.05. There was also an effect of time on sympathy, F(2, 150) = 15.23, p < 0.001.
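The degrees of freedom in these F statistics can be checked against the design. The sketch below applies the standard formulas for a two-way mixed design with one between-subjects factor and one within-subjects factor, assuming sphericity-assumed univariate tests and assuming that the 77 participants who completed time 3 formed the analysed sample; neither assumption is stated explicitly in the text.

```python
def mixed_design_df(n_participants: int, n_groups: int, n_times: int) -> dict:
    """Degrees of freedom for the within-subjects tests of a mixed design."""
    return {
        # Main effect of time: number of time points minus one.
        "time": n_times - 1,
        # Condition-by-time interaction.
        "interaction": (n_groups - 1) * (n_times - 1),
        # Within-subjects error term.
        "error": (n_participants - n_groups) * (n_times - 1),
    }


# 77 completers, 2 conditions (VR and 2D), 3 time points.
print(mixed_design_df(77, 2, 3))
```

Under these assumptions the time effect has 2 and 150 degrees of freedom, which is consistent with the reported F(2, 150).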

Table 4 The comparison of means of cognitive empathy, affective empathy and sympathy across time

Given the main effects on time for all measures of sympathy and empathy, a closer analysis for each time and each measurement was conducted using ANOVA with a Bonferroni correction to minimise Type 1 error.
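As a brief illustration of the correction mentioned above, the sketch below multiplies each p-value by the number of comparisons and caps the result at 1. The p-values in the usage note are hypothetical and are not the study’s results; the number of comparisons per measure is likewise an assumption (three pairwise comparisons among three time points).

```python
def bonferroni(p_values: list, alpha: float = 0.05) -> list:
    """Return (adjusted p-value, significant?) for each comparison."""
    n_comparisons = len(p_values)
    results = []
    for p in p_values:
        # Multiply by the number of comparisons and cap at 1.
        adjusted = min(1.0, p * n_comparisons)
        results.append((adjusted, adjusted < alpha))
    return results
```

For three hypothetical comparisons with raw p-values of 0.01, 0.02 and 0.5, only the first remains below 0.05 after adjustment. This is equivalent to testing each raw p-value against 0.05 divided by 3.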

Noting that the scale ranged from 1 to 5, cognitive empathy increased by 0.08 after watching the film (p = 0.03) and then decreased by 0.13 between times 2 and 3 (p = 0.02). Affective empathy increased by 0.15 immediately after watching the film (p = 0.05) and decreased by 0.13 between times 2 and 3 (p = 0.46). Sympathy decreased overall between times 1 and 3 by 0.14 (p = 0.01), with a drop of 0.21 between times 2 and 3 (p < 0.001) (Fig. 1).

Fig. 1 Mean cognitive empathy, affective empathy and sympathy scores across time

The interaction between condition and time

There was no significant interaction between condition and time for cognitive empathy, F(2, 150) = 1.48, p > 0.05, for affective empathy, F(2, 150) = 0.17, p > 0.05, or for sympathy, F(2, 150) = 0.03, p > 0.05. The analyses are shown in Table 5.

Table 5 Comparison of means of cognitive empathy, affective empathy and sympathy across time and condition

Gender differences

There was a significant between-subjects effect of gender for each measure at the pre-test (time 1), as shown in Table 6. However, a closer look at gender differences indicated that there was no significant interaction between condition, gender and time for cognitive empathy, F(2, 150) = 1.33, p > 0.05, affective empathy, F(2, 150) = 0.18, p > 0.05, or sympathy, F(2, 150) = 1.79, p > 0.05.

Table 6 Between-subject effects for gender

Discussion

The discussion will first address key results relating to the hypotheses and the subsequent implications for schools and socio-emotional learning (SEL) and then explore how and where the current research supports and contradicts existing research. Next, the discussion will address how the results support the conceptualisation of empathy and sympathy. Finally, future directions and limitations are addressed before the conclusion.

The first hypothesis was that 13–15-year-old students who experience the documentary Clouds Over Sidra (in either VR or 2D) will experience an increase in affective and cognitive empathy and in sympathy. Results indicated that, for both conditions, there was a one-time, immediate increase in both cognitive and affective empathy after viewing, although this change did not persist after two weeks. Sympathy decreased over time.

These results suggest that an empathy-provoking stimulus/film can generate a small short-term improvement in empathy. This result is consistent with limited past research suggesting that any form of (perspective-taking) intervention to promote empathy tends to produce a short-term improvement in empathy before a decline, although there may be an improvement in valuing or attitudes (Herrera et al., 2018). With this result in mind, it is suggested that future empathy-invoking research be conducted over a period longer than three weeks to establish the extent of this change, with additional research questions on attitude.

For socio-emotional teaching, these findings suggest that one-time emotional experiences are not likely to have a lasting effect on empathy in adolescents. The results support the idea that even well-implemented SEL programmes have the largest positive wellbeing effects immediately after the programme, and their effects fade later (Sklad et al., 2012). Further, SEL research in schools has suggested that without combining a range of socio-emotional competencies, there may be a lack of long-term change (Durlak et al., 2011). Thus, the drop in empathy between times 2 and 3 may be explained by the lack of focus on empathy and related constructs within these times. This suggests socio-emotional learning in empathy is more likely to be successful within an embedded whole-school programme that is maintained, rather than one-time experiences focusing on a single skill or an externally provided programme. As such, schools could use these programmes and interventions as a platform for further discussion or engagement within a comprehensive programme.

Results also indicated that exposure to the documentary through the different mediums (VR and projected 2D) generated no difference in empathetic and sympathetic reactions. This means that using virtual reality did not improve empathy any more than watching the documentary in 2D format. Therefore, hypotheses Two and Three are rejected and the null hypotheses are retained. That is, there was no difference in empathy or sympathy between 13–15-year-olds who used virtual reality to watch the film Clouds over Sidra and those who watched the film projected in 2D format. Given that the film Clouds over Sidra did by itself produce a change in empathy, this result suggests that increasing immersion by adding virtual reality and increasing sensory fidelity does not provide a better way to develop empathy than the 2D format, even in the short term. This result is important given the expense to schools and communities considering investing in virtual reality programmes for the purposes of empathy building in their SEL programmes for adolescents.

The current research supports the findings of Herrera et al. (2018), who found no significant difference in empathy between 2D and virtual reality interventions, and that immersion levels did not have a direct effect on empathy. According to their research, provided a perspective-taking task was engaged, there was no effect of immersion levels on empathy. In other research, Bujić et al. (2020) found that virtual reality was more significantly associated with positive attitudinal change (e.g. donation to a United Nations fund) than with empathy.

The results challenge the assumption that improved immersion using virtual reality (compared to 2D films) leads to improved empathy. Digitally ‘being in another’s shoes’ may not equal the improved understanding or shared experience that is so important in empathy. Perhaps empathy is less reliant on immersion and more reliant on other factors such as perspective-taking and storytelling. This suggests that improved technical immersion, as measured by increased sensory fidelity and autonomy, may not always improve the psychological capacity. Some theorists (Hassan, 2020) have suggested that digitisation can only produce a shallow experience of the true interactive experience required for empathy. Hargrove et al. (2020) found that virtual reality did not improve empathy more than an embodied or ‘lived’ experience, and thus more embodied stimuli within immersive virtual reality could be investigated in future research. However, this research suggests the more important element in developing adolescent empathy is engaging in different perspective-taking stories over time, and less important is the level of immersion in the story itself.

These results have implications for schools that are considering investing in virtual reality for widespread use across the school. There are also implications for those wishing to run SEL programmes remotely, as schools and teachers need not utilise expensive technology tools to improve empathy and sympathy among their adolescent student body. Instead, long-term, well-structured and embedded SEL programmes with perspective-taking tasks are more likely to assist in developing empathy. Virtual reality could potentially be used to promote engagement or as a novelty tool as part of the SEL programme, although this was not the focus of the current research, and investigation of engagement levels is recommended in future research.

The current findings also contradict the most recent work in this area by Schutte and Stilinović (2017) and Alberghini (2020). Some reasons for this difference may be the different measures and stimuli used. Schutte and Stilinović (2017) used adjusted items from Davis’s (1983) IRI measurement of perspective-taking and empathetic concern, which has been criticised by some researchers (e.g. Chrysikou & Thompson, 2016) as measuring elements of sympathy and stating them as a subtype of affective empathy. Therefore, any differences in empathy levels may, in actuality, be differences in sympathy levels. Alberghini’s (2020) research was based on self-report and (as they described) drew its sample from a liberal school, which may have introduced social desirability bias. Furthermore, the research by Schutte and Stilinović (2017) and Alberghini (2020) was conducted over two time points rather than three. To account for any novelty effect, and to test whether any change in empathy could be a one-off experience, the current research measured three time points.

Empathy and sympathy as concepts

The current research highlights the importance of making the distinction between empathy and sympathy. There were differences in the results between (cognitive and affective) empathy and sympathy. Sympathy was rated highest across all three times, with a significant drop between times 2 and 3. Cognitive and affective empathy showed similar trends to each other: both increased between times 1 and 2 and then dropped between times 2 and 3, while sympathy decreased over time. This supports researchers who have suggested that empathy and sympathy are related but distinct concepts in socio-emotional learning. Considering the importance of this distinction, this research also supports using tools that clearly define and distinguish between these concepts, and close examination of the tools and results used in previous studies of empathy.

Additionally, given that each of these concepts correlated strongly and positively with itself across time, there is an indication that those with higher baseline empathy and sympathy maintain this higher level across time. This possibly shows support for the inherent stability of empathy, especially over two weeks. This supports past research pointing to both affective and cognitive empathy being relatively stable, with overall increases over adolescence (Davis & Franzoi, 1991).

Sympathy levels in this research appear to have had a ceiling effect, perhaps as a result of social desirability. It would therefore be prudent to investigate whether adolescents understand the difference between empathy and sympathy as concepts and, if they do, why sympathy is seen to be more socially desirable among adolescents. This could help inform SEL programmes and build an understanding of empathy development in adolescents.

Gender and empathy

The research also adds to the body of research on the differences between girls and boys in empathy development. There were significant between-group effects between boys and girls in the initial survey across each of the three measures. This supports research that suggests girls are more empathetic than boys in adolescence (Mestre Escrivá et al., 2009). Future research in this area could consider whether social desirability in adolescence may add to this difference.

Limitations

This research aimed to build on the current body of research investigating how empathy can be developed through different mediums, with a particular focus on schools and adolescent development of empathy. However, the study used a convenience sample and a quasi-experimental design, limiting generalisations. The large independent school in Melbourne from which the sample was recruited could account for a possible ceiling effect in relation to sympathy due to social desirability, impeding results. Participants knew they were watching a film about refugees, and the first survey (M = 4.29, SD = 0.55) had high results; students from this school may have (before seeing the film) wanted to show high levels of sympathy. This situation may also account for the drop in empathy at time 3. Future research may consider a broader sample group and ensure the same researcher is available over the entire research period.

A second limitation is the conditions in which the survey at time 3 was undertaken. The decrease across all three measures could be explained by a novelty effect and social desirability bias present for the first two surveys. The third survey was given in the students’ regular classroom (not the senior school library), in 10 min of allocated time in an otherwise ‘normal’ lesson. There were no researchers, headsets or trip to the library, and the survey took place among other teaching. It carried no novelty and was instead something students needed to do at the request of the teacher. Therefore, the decrease across concepts by time 3 could be explained by the (lack of) novelty effect and survey fatigue.

A third limitation was that few pre-test questions were asked about exposure to virtual reality outside of school, and differences in prior exposure may lead to differences in the effect of novelty. To attempt to mitigate any novelty effect, the surveys were conducted a third time; the results showed a short-term increase in empathy from watching the film (although not from virtual reality). Thus, it is recommended that future research on virtual reality include an element of longitudinal design, to build on the limited research in this area, and a pre-test question on the amount of virtual reality experience. We also recommend that a qualitative approach be considered, particularly after viewing the film, given the complexity of the concept of empathy.

Another limitation is related to the stimulus and technology used. While the control condition had less sensory fidelity and autonomy (and less subsequent immersion) than the virtual reality condition, we acknowledge that the virtual reality condition offered no further interactivity or autonomy other than being able to look in a chosen direction. Additionally, Clouds over Sidra places the viewer within the refugee camp, allowing for a first-person perspective of a refugee, but the embodiment of Sidra herself does not fully occur because the viewer cannot walk and interact with the surroundings as Sidra. Although Bowman and McMahan (2007) suggest that sensory fidelity is more central to immersion than interactivity, other researchers suggest autonomy and embodiment are important (Gall et al., 2021; Ijaz et al., 2020). Due to technical constraints related to the school’s finances, and the limited options for appropriate embodiment stimuli for 13–15-year-olds available at the time of the research, it is acknowledged that the virtual reality condition was not as immersive as it could have been, limiting scope. We recommend that future research carefully consider the stimuli chosen and use a range of immersive techniques such as embodiment.

Finally, although there was a control and an experimental condition, including a non-perspective-taking task would provide more scope to explain any differences and to investigate whether the change in empathy is due to the task rather than immersion levels, as suggested by Herrera et al.’s (2018) research.

Conclusion

This research aimed to investigate the effects of virtual reality on empathy. Findings suggest that for adolescents, an emotion-provoking film (Clouds over Sidra) can invoke a short-term improvement in both cognitive and affective empathy but not in sympathy, highlighting the differences between these constructs. However, the medium of the film, virtual reality compared to the control condition of projected 2D film, had no effect on empathy or sympathy. This research supports some past research indicating that virtual reality does not improve empathy compared to other conditions (Herrera et al., 2018).

This work has built on the body of emerging research on virtual reality and empathy and contributes to the broader question of whether virtual reality is the ‘Ultimate Empathy Machine’. More specifically, it adds to the discussion in schools and policy groups on SEL and the use of technology, particularly at a time when off-site/online learning is occurring more readily.