Perspective Matters: A Systematic Review of Immersive Virtual Reality to Reduce Racial Prejudice

In the wake of the COVID-19 pandemic and the rise of social justice movements, increased attention has been directed to levels of intergroup tension worldwide. Racial prejudice is one such tension that permeates societies and creates distinct inequalities at all levels of our social ecosystem. Whether these prejudices present explicitly (directly or consciously) or implicitly (unconsciously or automatically), research suggests that manipulating body ownership by embodying an avatar of another race using immersive virtual reality (IVR) can reduce racial bias. Adhering to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, this systematic review encompassed 12 studies that employed IVR and embodiment techniques to investigate racial attitudes. Subsequently, two mini meta-analyses were performed on four and five of these studies, respectively, both of which utilised the Implicit Association Test (IAT) as a metric to gauge these biases. This review demonstrated that IVR allows not only the manipulation of a sense of body ownership but also the investigation of wider social identities. However, despite the novelty of IVR as a tool to help understand and possibly reduce racial bias, our review has identified key limitations in the existing literature. Specifically, we found inconsistencies in the measures employed, as well as in demographic characteristics within both the sampled population and the embodiment of avatars. Future studies are needed to address these critical shortcomings by appropriately utilising implicit and explicit measures of racial prejudice, ensuring diverse sample representation, and considering a broader spectrum of embodied social groups.


Introduction
In the last 20 years, the use of immersive virtual reality (IVR) has gained a significant amount of traction in the social and cognitive neurosciences. By integrating feedback from sensory (seeing oneself in a virtual body or avatar) and motor signals (moving in real-time with the avatar), multisensory integration is achieved to create a full-body illusion (Blanke 2012; Slater et al. 2010). IVR, therefore, offers a powerful tool to manipulate the sense of body ownership (i.e., the feeling that your body belongs to you; Gallagher 2000), going far beyond the pioneering studies of the rubber hand illusion (Botvinick & Cohen 1998).
Accordingly, IVR has had diverse applications in multidisciplinary fields, such as in neurorehabilitation (Demeco et al. 2023), education (Hodgson et al. 2019) and visual perception (for an overview, see Wilson & Soranzo 2015). Drawing on this embodied cognition framework, social neuroscientists have utilised IVR methods to understand the psychological mechanisms involved in feelings of prejudice, especially in relation to perceptions of race and implicit racial attitudes (Farmer & Maister 2017; Maister et al. 2015; Peck et al. 2013).

Implicit Bias and the Implicit Association Test (IAT)
Implicit biases refer to unconscious or automatic mental associations that are typically thought to arise as a product of one's internalised schemas and subsequently guide one to partake in discriminatory behaviours without conscious intent (FitzGerald & Hurst 2017). Conversely, explicit biases refer to preferences, beliefs and attitudes that a person is consciously aware of and can identify (Dovidio & Gaertner 2010). Considering that implicit attitudes represent automatic associations, they were once believed to be relatively uncontrollable in nature. This led researchers to initially hypothesise that external environmental cues could not alter them in any shape or form (Jost et al. 2004). However, current research highlights the social and contextual sensitivity of implicit attitudes, suggesting that they are highly malleable and receptive to even the subtlest environmental influences (Dasgupta & Greenwald 2001; Lowery et al. 2001).
Implicit attitudes are most frequently measured with the IAT, a tool on which researchers rely heavily, particularly for race-based research (Schnabel et al. 2008; Banakou et al. 2016; Peck et al. 2013). Developed in 1998, the IAT is an indirect measure of the strength of association between a bipolar target (e.g., me versus others) and a bipolar attribute (e.g., dark skinned versus light skinned) (Greenwald et al. 1998; Schnabel et al. 2008). The speed and accuracy with which stimuli are categorised in each task of the test are used as an indication of the level of bias (Greenwald et al. 1998). Although many advocate the use of this tool in implicit social-cognitive research, praising its behavioural predictive value in particularly sensitive contexts (Nock et al. 2010) and its role as an alternative to explicit self-report measures that otherwise present numerous limitations (Schnabel et al. 2008), a number of criticisms have challenged its validity.
Critiques of the test encompass concerns about its lack of psychometric quality, susceptibility to influence from social-contextual factors, and reinforcement of cultural stereotypes (Barden et al. 2004; Dasgupta & Greenwald 2001; Schimmack & Howard 2021). Multiple meta-analyses focusing on the predictive validity of the IAT have documented an average correlation coefficient of between 0.13 and 0.24 when measuring the strength of the relationship between race IAT scores and prejudicial behaviours (Schimmack & Howard 2021; Carlsson & Agerstrom 2016; Greenwald et al. 2015; Oswald et al. 2015). This could, in part, be due to poor methodological correspondence or measurement errors (Greenwald et al. 2015); however, if we are to rely on the race IAT as an experimental measure of implicit attitudes, it is vital that we are aware of this popular tool's drawbacks.

Implicit Racial Bias in IVR
Alterations in negative implicit biases for factors such as age, disability, race or gender can be achieved by implementing IVR. For example, in an attempt to support their Proteus Effect hypothesis (which suggests that an IVR user will inevitably adjust their self-representation and behaviour to conform to that of their virtual avatar), Yee and Bailenson (2007) showed that the embodiment of an elderly person can reduce implicit prejudices and negative stereotypes held against senior populations. Similarly, it has been shown that the virtual embodiment of an avatar in a wheelchair can positively affect implicit associations towards people with disabilities (Chowdhury et al. 2019). While virtual embodiment and intergroup encounters may work to positively adjust many categories of prejudice, the literature surrounding the alteration of implicit racial attitudes seems to be incongruent. Although most ingroup-outgroup interactions lead to a social divide or distinction in a real-world context, race represents a particularly potent cue for prejudicial categorisation due to its visual salience (Cosmides et al. 2003).
Racial ingroup members are generally evaluated more favourably than outgroup members, a result that has even been documented in infants as young as three months old (Bar-Haim et al. 2006). Due to the strength of this racial ingroup mentality, the sense of realism afforded by interpersonal contact in IVR has been shown to magnify prejudicial racial attitudes (Banakou et al. 2020). For example, in a study conducted by Rossen et al. (2008), medical students were less likely to express empathy for dark-skinned virtual patients than for light-skinned patients. Similarly, a study using interpersonal distance and physiological measures as an indication of racial prejudice showed that participants were fearful of racial minority avatars (Dotsch & Wigboldus 2008). Together, these studies suggest that racial biases have the potential to permeate virtual intergroup encounters, although understanding the context and characteristics of the participants is crucial for accurately interpreting their fear responses.
Research indicates that perspective-taking tasks involving perceptual ownership of an alternative racial body in IVR can also impact negative racial biases (Banakou et al. 2016; Farmer et al. 2012). Reductions in in-group favouritism (Dovidio et al. 2004), implicit racial biases (Banakou et al. 2016; Forscher et al. 2019) and explicit stereotypes (Galinsky & Moskowitz 2000) have been attributed to embodiment interventions that alter perception through IVR. Moreover, several studies have documented a positive change in implicit racial bias towards black individuals when white participants perceive an illusory rubber arm belonging to a black virtual body as their own (Blanke et

Shortcomings of IVR Racial Research
In contrast to the aforementioned studies, some research has documented opposing relationships, providing evidence indicating either increased or little to no change in performance on measures of bias after being subjected to embodiment and/or interpersonal interaction in IVR (Groom et al. 2009; Hasler et al. 2017; Thériault et al. 2021). We hypothesise that multiple factors may underlie these inconclusive results, including stereotypical representations of the minority group in IVR scenarios, a lack of methodological precision or insufficient appreciation of the underlying socio-cognitive mechanisms and moderating variables of intergroup bias, especially in relation to embodiment (Chen et al. 2021). Similarly, research in this domain represents a racial issue in and of itself, since researcher-related and methodological complications may contribute to the results and the interpretation thereof in the final publication.
Persons involved in the research process, including the experimenters, authors, participants and editors, are often systematically connected, with very few authors employing and examining non-White participants for the purpose of their study (Roberts et al. 2020). Therefore, it is imperative that we acknowledge the sensitivity of geographical and social context in racial research as well as the diversity of external factors that may contribute to the final results.
Additionally, as the number of researchers capitalising on the potential of IVR to reduce racial prejudice increases, it is crucial that we consider additional methodological or measurement-related consequences of the research process, including the degree of immersion a participant may experience with their embodied avatar, the complexities of implementing IVR technology and the details pertaining to the measurement of racial bias or prejudice used. Therefore, this systematic review synthesises articles that have used IVR to investigate and change implicit and/or explicit racial prejudice with the aim of understanding how virtual embodiment may contribute to our racial and social beliefs, opinions and prejudices. More specifically, this review aims to determine the effectiveness of IVR interventions in altering racial prejudice, to critically analyse the representativeness and external validity of the included studies, and to identify the methodological advantages and limitations of IVR, how these factors might influence the outcomes of current race research, and the implications for such research in the future. As a supplementary assessment, we performed a meta-analysis on the application of the IAT in investigating racial attitudes using the articles encompassed in this review.
Accordingly, the main research questions guiding this review were as follows: (1) Has embodiment using IVR been successful in eliciting a reduction in racial prejudice? (2) What measures are used to assess implicit and explicit racial biases in IVR studies? (3) What are the characteristics and degrees of representation of samples used in these IVR studies? (4) To what extent do the results of the IAT differ across various studies focusing on IVR embodiment and racial implicit bias?

Methods
This review followed the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines and has been pre-registered in the PROSPERO database (https://www.crd.york.ac.uk/prospero/display_record.php?ID=CRD42022325576). All deviations from the pre-registration are explicitly documented in this review. We undertook a comprehensive systematic review by searching four distinct online databases: PsycINFO, Embase, MEDLINE and Global Health (all of which were accessed via OvidSP, a search platform enabling access to a variety of international databases, journals and books). We also attempted to source additional records from the reference lists of selected articles during the screening process, although none were found. The termination point for the database search was 03.10.2022. Table 1 illustrates the employed search terms.

Table 1: Summary of Search Terms and Corresponding Boolean Operators
The inclusion and exclusion criteria were outlined according to the Population, Intervention, Comparator and Outcomes (PICO) framework (Richardson et al. 1995). In terms of population, the review included studies examining neurologically healthy adults and/or children capable of using IVR technology and embodiment. As the intervention approach, participants were required to engage with IVR technology through complete embodiment of an avatar from a different racial background (i.e., a race distinct from their own). The results of the intervention strategy were investigated in contrast to a control group, who (i) did not experience an intervention; (ii) experienced an intervention that did not involve IVR technology (e.g., perspective-taking exercises, viewing videos, playing two-dimensional (2D) or three-dimensional (3D) video games); (iii) used IVR technology to embody or interact with a virtual avatar of their own race; or (iv) partially embodied an avatar of another race (e.g., embodiment via the rubber hand illusion).
The included articles used either quantitative or qualitative measures of implicit and/or explicit racial bias/prejudice. These encompass reaction time assessments such as the IAT, evaluations of racism-related attitudes, racial prejudice or bias, response time evaluations, measurements of behavioural or interpersonal proximity, physiological indicators (such as skin conductance and heart rate measurements), and any other qualitative evaluations of racial prejudice. Articles were excluded if they were classified as reviews or opinion pieces, were unpublished or had not yet been peer reviewed. Additionally, articles targeting non-race-related prejudice (e.g., ageism, sexism, disease-related prejudice) and studies not written or published in English were excluded for the purpose of this systematic review.
Following the electronic database search, we transferred our search results to Zotero, a reference management software, to remove duplicate articles.
Our preliminary search results were then exported to Rayyan, a collaborative online tool for systematic reviewers, to screen articles according to the established set of inclusion and exclusion criteria. The screening process involved three independent phases: title and abstract screening, full-article screening and conflict resolution. Two independent reviewers (SH & SA) performed an initial screening of the titles and abstracts in accordance with the aforementioned criteria. Subsequently, a full-text review was performed to confirm the eligibility of each article for inclusion. A third reviewer (BD) was then consulted to resolve any conflicts. The risk of bias was minimised using Rayyan's 'blind' option, which ensured that each reviewer could not view their collaborators' screening decisions. Additionally, we evaluated the methodological rigour of each paper utilising the JBI critical appraisal tool (as detailed in the Supplementary Information). The relevant extracted information was organised into three data extraction tables (as seen in Tables 2-4 below).
Thereafter, two random effects meta-analyses were conducted to augment the outcomes and strengthen the statistical power of the systematic review.
Incorporating eight out of 12 eligible papers, the meta-analysis focused on implicit bias assessment through the IAT. However, the methods of measuring IAT scores varied among the studies. Specifically, five of the eight articles reported Post IAT scores, indicating assessments conducted after the experimental conditions, while four others reported dIAT scores (difference in IAT score), reflecting the difference between Pre- and Post-test IAT scores. Consequently, two distinct mini meta-analyses were conducted: one utilising Post IAT scores (n = 5) and the other using dIAT scores (n = 4) from these studies. Within the context of these meta-analyses, the effects observed across primary studies were transformed into standardised effect sizes. These effect sizes were calculated for each study by subtracting the average score of the experimental group, which embodied a racial outgroup avatar in IVR, from the average score of the control group (i.e., participants who either embodied their own-race avatars or engaged in different activities, such as perspective-taking interventions). The resulting value was divided by the combined standard deviations of both groups. The relevant data were extracted and organised in Excel, with analyses and figures produced using the "meta" package for R (Schwarzer et al. 2015).
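The effect-size calculation described above can be sketched as follows. This is a minimal illustration only, assuming that the "combined standard deviation" corresponds to the pooled standard deviation used in Cohen's d; the function name and the example values are hypothetical and are not taken from the included studies.

```python
import math


def standardised_mean_difference(mean_control, sd_control, n_control,
                                 mean_exp, sd_exp, n_exp):
    """Cohen's d: (control mean - experimental mean) / pooled SD.

    Assumption: the 'combined standard deviation' is the pooled SD.
    """
    pooled_var = (((n_control - 1) * sd_control ** 2
                   + (n_exp - 1) * sd_exp ** 2)
                  / (n_control + n_exp - 2))
    return (mean_control - mean_exp) / math.sqrt(pooled_var)


# Hypothetical IAT scores, for illustration only.
d = standardised_mean_difference(0.50, 0.40, 30, 0.30, 0.40, 30)
print(round(d, 2))
```

Under this sign convention, and assuming that higher IAT scores indicate stronger bias, a positive value corresponds to lower bias scores in the outgroup-embodiment group than in the control group.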

Search Results
A total of 681 articles were first identified. After removing duplicates, 479 articles were screened according to their title and abstract. Subsequently, 18 full-text papers were screened, 12 of which met the eligibility criteria and were included in the review. Figure 1 depicts the PRISMA flowchart, illustrating the progression of information across the various stages of the review.

[Table fragment: study characteristics; field labels inferred, remaining rows not recovered]
Type of interaction: Embodiment & interaction. Title: The Effect of VR Avatar Embodiment on Improving Attitudes and Closeness Toward Immigrants. Authors: Chen et al. (2021). Country: Singapore. Aim: Investigate the effect of VR embodiment on attitudes and closeness toward an outgroup vs ingroup in a novel co-ethnic immigration context in Asia.
Type of interaction: Embodiment & interaction. Title: Virtual body ownership and its consequences for implicit racial bias are dependent on social context.

Characteristics of the Studies and Samples
Seven studies, accounting for half of the total number, were published between 2020 and 2022, thereby demonstrating the growing interest and progress of IVR research in the cognitive sciences in investigating the question of racial prejudice. Of the remaining articles, five studies were published between 2013 and 2018, and the earliest publication was dated 2009. All studies except for one, which was conducted in Singapore (Chen et al. 2021), were undertaken in the global North, with the majority of studies originating from Spain (n = 4) and the United States of America (USA; n = 3). Most studies used a between-group design (n = 10), with additional study designs including repeated measures (n = 1) and mixed factorial research design (n = 1).
Sample sizes ranged from 32 to 171 participants, and all studies consisted of young participants, with mean ages ranging from 21 to 38.5 years. Every study had a higher percentage of female participants, with some studies including only female participants. Limited information was provided regarding the socio-economic status of participants, with most articles only reporting that their sample comprised university students (n = 9). Additional data on socio-economic status were provided by Salmanowitz (2018), who stated that their participants were predominantly liberal and highly educated. Finally, the majority of the included studies consisted of either an all-White sample (n = 7) or a majority-White sample (n = 3). Five studies included Asian participants, and three studies incorporated Hispanic or self-identified other participants.

IVR Interaction and Embodied
All studies included an embodiment condition. The main forms of interaction included a combination of both avatar embodiment and avatar interaction (n = 6), embodiment-only conditions (n = 3) and alternative conditions (n = 3). The alternative conditions included mental perspective-taking (imagining taking the perspective of a research confederate) versus embodied perspective-taking ("body swapping" with the research confederate; Two studies included neutral conditions, namely, alien (purple) avatars, with one study using an alien embodiment condition (Peck et al. 2013) and the other using an alien interaction condition (Harjunen et al. 2022). In addition to the alien interaction condition, Harjunen et al. (2022) included a condition that enabled the participants to interact with a white and black virtual hand. Similarly, this condition was included in three other studies (Hasler et

Measures
Commonly investigated outcomes were implicit (n = 8) and explicit (n = 5) attitudes towards a particular target group. Only three studies used both implicit and explicit measures. Following the narrative of IVR being the ultimate "empathy machine" (Barbot & Kaufman 2020), several studies assessed empathy, pain perception, mimicry, evaluation of mock legal cases and self-other overlap (n = 7, using at least one such measure). Neurophysiological measures such as skin conductance, heart rate and electroencephalography (EEG) were applied rarely, with only one study using three EEGs to assess empathetic resonance to ethnic outgroup pain as measured by sensorimotor beta event-related desynchronisation (ERD; Harjunen et al. 2022).
As anticipated, the IAT remained the predominant gauge of racial (implicit) prejudice (n = 8), followed by the Interpersonal Reactivity Index (IRI; Davis 1980), which stood as the second most frequently employed assessment (n = 3). The IRI is a multidimensional measure of empathy that comprises four subscales, namely, perspective taking (PT), empathetic concern (EC), fantasy (FS) and personal distress (PD). Unlike the IAT and the IRI, there is a wide range of different measures used across the studies for both explicit bias and empathy. Specifically, measures of explicit bias include the Symbolic Racism Scale Finally, most studies (n = 10) included some form of embodiment questionnaire.

Research Outcomes
Of the studies that measured implicit biases, five out of the eight studies reported a decrease in implicit bias scores (Banakou et 2021) found converging results for implicit and explicit bias, it is important to note that they coincided in that there were nonsignificant results for implicit and explicit biases. Thus, across all studies, no effect of IVR embodiment on explicit bias was found.
an embodied perspective-taking condition showed an increase in empathy in comparison to the control group (Thériault et al. 2021). Moreover, cross-racial resemblance in IVR has been shown to increase mimicry (Hasler et al. 2017), modulate sensorimotor resonance to others' perceived pain (Harjunen et al. 2022) and lead to more conservative evaluations of legal cases (Salmanowitz 2018). Empathy significantly explained the variance observed in IAT scores, with perspective-taking, empathic concern, and personal distress being significant predictors of implicit bias. Chen et al. (2021) showed that empathy functioned as a mediator of IVR contact when it came to embodying outgroup members and that participants who placed greater importance on their various group memberships demonstrated stronger intervention effects (i.e., an increase in self-other overlap with the embodied outgroup). These findings correspond with those establishing that intergroup contact reduces prejudice via both affective mediators, namely, empathy and intergroup anxiety, and cognitive mediators, including perspective-taking, knowledge and increased familiarity (for a review of intergroup contact, see Pettigrew & Tropp 2008). Nevertheless, IVR contact has also been shown to have no significant effect on empathy (Tassinari et al. 2022).

Assessment of Methodological Quality
The Joanna Briggs Institute (JBI) critical appraisal checklist for randomised controlled trials was used to assess the quality of the included studies. Two reviewers (SA & SH) evaluated the quality and thus the eligibility of each study. Any disagreements were resolved by a third reviewer (BD). A summary of this analysis is provided in the supplementary information (Online Resource 1). The average quality score of the studies was 5.75 (95% confidence interval: 4.97 to 6.52) out of a maximum possible score of 10. Few studies clearly stated whether true randomisation was used to assign participants to treatment groups (n = 5). Similarly, it was unclear in most studies whether individuals delivering treatment were blind to treatment assignment (n = 10) and whether the outcome assessors were blind to treatment assignment (n = 9). Only one study made it clear that a follow-up was completed.
However, all but one study scored highly for using appropriate statistical tests. The exception was marked down because of the numerous statistical tests used in that study, which the reviewers agreed would increase the likelihood of Type I errors. All papers measured treatment outcomes in the same way across treatment groups, and 10 articles clearly stated that treatment groups were treated identically. Taken together, insufficient clarity was provided in relation to whether the studies followed certain procedures that are characteristic of randomised controlled trials, including randomisation and the blinding of those delivering treatment and of outcome assessors. Similarly, the articles either were vague about whether a participant follow-up was conducted or did not conduct one at all.

Meta-Analyses of the IAT
The results of our first meta-analysis on dIAT scores conducted on four of the 12 studies (as depicted in Fig. 2) show that Cochran's Q statistic yielded a value of 7.43, suggesting a notable level of heterogeneity among the included studies. This indicates that the effect sizes across the included studies are not entirely consistent and could be influenced by factors beyond chance. The I^2 statistic, computed at 60%, underscores the extent of heterogeneity. This implies that approximately 60% of the observed variation in effect sizes could be attributed to real differences rather than random sampling errors. While this value indicates moderate heterogeneity, it further supports the notion that there are underlying factors contributing to the diversity of results. The p value (p = 0.06) does not quite reach the threshold of statistical significance needed to reject homogeneity but, taken together with the I^2 value, suggests that meaningful differences may exist among the effect sizes. Testing for the main effect of condition, the meta-analysis revealed no significant overall decrease in IAT after embodiment of black vs white avatars, t(3) = −1.48, p = .24.
However, the high level of variance in effect sizes warrants a cautious interpretation of this result and prompts consideration of potential sources of heterogeneity within the studies.
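The relationship between the heterogeneity statistics reported above can be checked with a short sketch. It assumes the standard definition I^2 = (Q − df) / Q with df = k − 1 for k studies, and it uses a closed-form chi-square tail probability that holds only for three degrees of freedom; the function names are illustrative and are not part of the authors' analysis code.

```python
import math


def i_squared(q, k):
    """Higgins' I^2 from Cochran's Q and the number of studies k."""
    df = k - 1
    return max(0.0, (q - df) / q)


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 degrees
    of freedom (closed form; valid only for df = 3)."""
    z = x / 2.0
    return (math.erfc(math.sqrt(z))
            + 2.0 / math.sqrt(math.pi) * math.sqrt(z) * math.exp(-z))


q, k = 7.43, 4  # values reported for the dIAT meta-analysis
print(round(100 * i_squared(q, k)))  # I^2 as a percentage
print(round(chi2_sf_df3(q), 2))      # p value of the Q test
```

With Q = 7.43 and four studies, these formulas should agree, to rounding, with the reported I^2 of 60% and p = 0.06.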
Similarly, the results of our meta-analysis on Post IAT scores conducted on five of the 12 studies (as depicted in Fig. 3) show that a pronounced degree of heterogeneity exists within the analysed studies. The I^2 statistic, calculated at 90%, suggests that most of the observed variation in effect sizes could be attributed to genuine differences among the studies rather than mere random variability. The associated p value, which is less than 0.001, indicates a highly significant departure from homogeneity. Testing for the main effect of condition, the meta-analysis revealed no significant overall decrease in IAT after embodiment of black vs white avatars, t(4) = −1.05, p = .35. Again, however, the substantial heterogeneity means that careful consideration of potential sources of variability among the studies becomes crucial to understanding the overall effect size and its implications. It is also important to recognise that our ability to draw robust conclusions is somewhat constrained by the relatively small number of studies available for analysis.
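To illustrate how a random-effects model pools studies when heterogeneity is present, the sketch below uses the DerSimonian-Laird estimator of the between-study variance. This is an assumption made for illustration: the text states only that random-effects models were fitted with the "meta" package for R, not which estimator was used, and the t-based tests reported above are not reproduced here. The input values are hypothetical.

```python
def dersimonian_laird(effects, variances):
    """Random-effects pooled estimate with the DerSimonian-Laird tau^2.

    effects: per-study standardised effect sizes
    variances: their sampling variances
    Returns (pooled_effect, tau_squared).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]                 # fixed-effect weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)               # between-study variance
    w_star = [1.0 / (v + tau2) for v in variances]   # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    return pooled, tau2


# Hypothetical effect sizes and variances, for illustration only.
pooled, tau2 = dersimonian_laird([0.6, -0.1, 0.9, 0.0, 0.4],
                                 [0.05, 0.04, 0.06, 0.05, 0.04])
print(round(pooled, 3), round(tau2, 3))
```

When the between-study variance is large, the random-effects weights become more nearly equal across studies, so the pooled estimate is less dominated by the most precise studies and its uncertainty grows, which is consistent with the cautious interpretation given above.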

Discussion
This review demonstrates an increase in the use of IVR methods to induce body ownership illusions and investigate wider social identities.
Although this review has highlighted the potential for IVR methods to be used to understand intergroup attitudes, our results show considerable homogeneity in the population groups used (sampled populations and embodied avatars) and inconsistencies in the outcome measures. These results are unpacked in the discussion below.
Demographic biases were demonstrated consistently in nearly all studies included in the review. All studies, except for one, were conducted in the global North and, thus, comprised participants from predominantly western, educated, industrialised, rich and democratic (WEIRD) settings and populations (Henrich et al. 2010). WEIRD populations often lack representation from a wide range of racial or ethnic backgrounds. As identified by Durrheim (2023), progressive calls such as 'WEIRD' exclude conversations around race, suggesting an extension of the term to white and western populations. As such, various measurement and sampling-related biases emerged in the review, underscoring the current limitations in external validity within the existing studies. Considering the complexity and diversity of human behaviour and attitudes within and between different racial groups, the results from the reviewed studies may portray an incomplete perspective. A lack of consideration of such cultural differences and social context may lead to a superficial understanding of the effects of the embodiment phenomena in IVR, reducing the depth of insights that this research is able to offer. Additionally, researchers originating from WEIRD locations may possess their own implicit assumptions regarding the cultural, socio-political and racial norms of their own societies, which will inadvertently shape the research process and interpretation of results.
By a similar token, the existing literature has investigated either an ingroup perspective (i.e., embodiment in a same-race avatar) or an outgroup perspective (i.e., embodiment in a different race Taken together, our findings reveal that in addition to the overrepresentation of WEIRD participants, the majority of articles either exclusively or predominantly recruited White participants who embody either same-raced (White) avatars and/or Black avatars, the latter of which is consistently used as the representation of the social outgroup. While it has been argued that this choice is frequently influenced by demographic attributes (that is, White individuals being the dominant majority in the study region), the studies in the current review do not provide a theoretically driven justification or rationale for their choice of study sample and avatar race. Additionally, race is often considered to be a sensitive or challenging area of empirical inquiry (Silverio et al. 2022), especially considering that the research area of focus is particularly controversial in that it involves one racial group embodying another. As a result, researchers might be hesitant to incorporate ethnically varied participants and avatars who could potentially embody marginalised social groups. This, in turn, could contribute to the observed patterns in the studies included. Nevertheless, the tendency to solely or predominantly draw on a White sample perpetuates and reifies the notion that this particular population sets the standard against which others are to be measured. Thus, future research should be directed at diversifying not only the avatars embodied but also the research sample to better capture heterogeneity within diverse populations.
Although the included studies in this review had limited diversity in their sample and avatar embodiment, they demonstrated that embodiment can be induced for outgroup (e.g., black or PRC group) avatars. In particular, ten out of the 12 studies included some form of embodiment questionnaire, assessing either immersion and feelings of presence in the virtual world or body ownership, thereby controlling for the success of the IVR experience. There is a need for further standardisation of measures of embodiment, such as The Participant Experience of Embodiment Questionnaire developed by Peck and Gonzalez-Franco (2021). It is also worth noting that an exclusively psychometric approach to assessing embodiment omits a dimension of depth that could be enriched through the incorporation of qualitative methods (Hassard 2023; Lewis & Lloyd 2010).
In addition to assessments of embodiment experiences, it is necessary to re-evaluate the appropriateness of both explicit and implicit measures of racial bias.
In particular, the IAT, commonly used as a standard measure of implicit racial bias, requires substantial refinement. As borne out by our meta-analysis results, there appears to be general heterogeneity and discordance regarding the best practices to adopt for IVR studies, particularly in relation to the variability of measures of explicit bias. Moreover, there are also different scoring methods for measures such as the IRI, some of which have yet to be verified (Wang et al. 2020), including the use of standalone scores for empathy subscales (Patané et al. 2020; Thériault et al. 2021) or the sum of scores, suggesting that empathy is a general construct. The use of a variety of different tests and scoring methods brings into question the convergent validity of these measures, that is, the degree to which a test is related to other measures of the same construct and, consequently, the degree to which results are comparable across studies (Westen & Rosenthal 2003). Thus, throughout the examined studies, a wide range of prejudice-related measures, questionnaires and tasks are administered without any apparent emerging standards in the field. Although there is a need to replicate studies using existing measures, future research should also consider using alternative implicit and explicit measures to demonstrate a consistent change in racial bias or prejudice across a diversity of measures.
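As a rough illustration of what checking convergent validity can involve, the sketch below computes a Pearson correlation between two measures intended to tap the same construct. It is a minimal example under stated assumptions, not the procedure used in any of the reviewed studies: the function name `pearson_r` and the score lists are hypothetical, and the numbers are invented purely for demonstration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations around the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a measure has no variance")
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for six participants on two bias measures
# (invented values, for illustration only).
measure_a = [0.10, 0.35, 0.42, 0.58, 0.71, 0.90]
measure_b = [12, 15, 14, 19, 22, 25]

print(f"r = {pearson_r(measure_a, measure_b):.2f}")
```

A high positive correlation would be consistent with the two measures capturing a shared construct, whereas a weak correlation would echo the discordance between measures described above. A single coefficient from a small sample is, of course, only a starting point for such an assessment.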
Finally, the current systematic review has notable strengths, namely, the inclusion of the quality appraisal procedure, the use of PICO and PRISMA guidelines, and the use of two independent reviewers together with a third reviewer to resolve conflict. These factors assisted in minimising errors and enhancing the power of the review. Nevertheless, notable limitations persist, including the scope of the literature search, which was only conducted in four major electronic databases (Embase, Global Health, MEDLINE and PsycINFO). Additionally, articles were only included if they were available in English. Therefore, the use of limited databases and the decision to solely incorporate English articles may have led to the oversight of additional relevant studies. The significance of language choice as a limitation of this review becomes particularly apparent considering that a drawback we identified is the prevalence of studies conducted in WEIRD settings such as the USA, where English is the predominant language. Hence, the inclusion of articles contingent on whether they are in English possibly constitutes selection bias. Another potential limitation is the small number of articles included in the review. While the relatively limited number of included articles may be indicative of stringent inclusion criteria, it is possible that it is largely dependent on the research topic of the review as well as the amount of available supporting evidence. Lastly, there is some heterogeneity across the studies in terms of control groups, interventions and measures, which have the potential to affect study results (Bartolucci & Hillegass 2010). Nevertheless, heterogeneity may be attributed to the scope of the review, as it determines the extent to which the included articles are diverse.
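For readers less familiar with how between-study heterogeneity is commonly quantified, the following sketch computes Cochran's Q and the I² statistic from study-level effect sizes and their sampling variances. It is a generic illustration under stated assumptions and not a reconstruction of the analyses in this review: the function name `heterogeneity` is hypothetical, and the effect sizes and variances are invented placeholders rather than values from the included studies.

```python
def heterogeneity(effects, variances):
    """Return (pooled effect, Cochran's Q, I-squared in percent).

    Uses inverse-variance (fixed-effect) weights, w_i = 1 / v_i.
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Q: weighted sum of squared deviations from the pooled effect.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I-squared: share of total variation beyond chance, floored at 0.
    i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, q, i_squared


# Invented effect sizes and variances for four hypothetical studies.
example_effects = [0.20, 0.45, -0.05, 0.60]
example_variances = [0.04, 0.05, 0.03, 0.06]

pooled, q, i2 = heterogeneity(example_effects, example_variances)
print(f"pooled = {pooled:.2f}, Q = {q:.2f}, I^2 = {i2:.1f}%")
```

With only a handful of studies, as in the mini meta-analyses reported here, Q has low power and I² is imprecise, so such statistics are best read alongside the qualitative differences in control groups, interventions and measures noted above.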

Concluding comments
Recent research in the field has highlighted the potency of IVR as a tool for inducing and controlling embodiment while also showing the potential to change attitudes and reduce racial bias, albeit temporarily. However, a cautionary note is also needed in this line of research, so as not to undermine the complexities of racism and racist attitudes, as well as the post-colonial legacies involved in systemic prejudice. This review has highlighted that the sense of immersion and embodiment fostered by the IVR encounter renders it a powerful method to understand the embodied nature of perspective-taking and attitude change, but the methodology is still highly reductionist in nature. IVR methods therefore have the potential to help examine the embodied mechanisms underlying multifaceted processes such as racial bias, but care is needed to understand these results within the wider socio-political and historical context in which they are embedded. Future studies drawing on more diverse sample groups and embodied avatars, while also using more standardised benchmarks for measures of implicit and explicit racial biases, will undoubtedly enrich and broaden the application of IVR methods in racial prejudice research.

Forest plot of the meta-analysis of the dIAT effects (n = 4) when comparing the experimental and control conditions

Declarations
al. 2017; Patané et al. 2020; Salmanowitz 2018) in which participants could interact with a black and white virtual avatar. The study conducted by Patané et al. (2020) included a black interaction condition only. Finally, three studies involved interaction with avatars of different ethnic groups, including Hispanic (Alvidrez et al. 2020), Asian (Banakou et al. 2016) and Middle Eastern descent (Tassinari et al. 2022).
al. 2020; Banakou et al. 2016; Patané et al. 2020; Peck et al. 2013; Salmanowitz 2018). While these studies have demonstrated IVR's potential in the reduction of implicit racial bias, there is some conflicting evidence of the effect of embodiment on intergroup attitudes. For example, Hasler et al. (2017) and Thériault et al. (2021) found no differences in IAT scores across condition groups, suggesting no change in implicit racial bias resulting from embodiment, whereas Groom et al. (2009) observed higher IAT scores for participants embodying black avatars compared to participants who embodied white avatars. Similarly, Alvidrez & Peña (2020) demonstrated that participants categorised in the self-resembling avatars (ingroup) condition reported less perceived outgroup bias compared to participants who were required to customise avatars who looked physically different from themselves (outgroup). Three studies assessed explicit bias together with implicit bias (Banakou et al. 2020; Salmanowitz 2018; Thériault et al. 2021), two of which failed to obtain converging results, suggesting that these two types of measures are often discordant. While Thériault et al. (
to be verified (Wang et al. 2020), including the use of standalone scores for empathy subscales (Patané et al. 2020; Thériault et al. 2021) or the sum of scores,

Table 2: Details of Included Studies
Spain: Determine if a sense of body ownership can be induced in a differently raced avatar, and to determine if this illusion could reduce negative implicit responses toward the other race.
Finland: Investigate how VR can be used to create positive intergroup contact with a member of a stigmatised outgroup and present the results of the effect of intergroup contact in VR on empathy.

Table 4: Details of the Methods, Measures and Results

(Groom et al. 2009)21); a sham condition, whereby participants experienced the virtual world but without any connection to a virtual body (Salmanowitz 2018); and a perspective-taking condition, which involved participants looking at a photograph of a model and imagining themselves as the model (Groom et al. 2009). Two major forms of social group embodiment emerged from the analysis. These included the ingroup perspective (i.e., participants embodying same-race avatars) and the outgroup perspective (i.e., participants embodying outgroup avatars or different-race avatars). Most studies employed a combination of both ingroup and outgroup embodiment (n = 9), the rest of which included outgroup-only embodiment (n = 3). In both types of designs, the ingroup perspective typically involved White participants embodying White/light-skinned avatars, and the outgroup perspective usually entailed White participants embodying Black/dark-skinned avatars. Specifically, of the seven studies that comprised White-only participants, five involved participants embodying either their own race (White avatars) and/or Black avatars (Banakou et al. 2016; Banakou et al. 2020; Harjunen et al. 2022; Hasler et al. 2017; Peck et al. 2013). The remaining two studies entailed participants embodying Black avatars only (Patané et al. 2020). In contrast, Chen et al. (2021) included Singaporean Chinese (SC) participants (ingroup) who embodied both SC avatars and People's Republic of China (PRC) Chinese avatars.
(Alvidrez & Peña 2020)sinari et al. 2022)erspective typically involved White participants embodying White avatars, while the outgroup perspective entailed White participants embodying Black avatars. Only one study included different ingroup-outgroup embodiment conditions other than Black and White ethnic groups. Within this study, Singaporean Chinese (SC) participants (ingroup) embodied both SC avatars and People's Republic of China (PRC) Chinese avatars (outgroup). Of the studies that recruited participants from different ethnic groups (i.e., Asian, Hispanic or self-identified Other participants), most emulated the abovementioned patterns. That is, most still involved participants embodying either a combination of Black and White avatars (Groom et al. 2009; Tassinari et al. 2022) or Black avatars only (Thériault et al. 2021). Hence, while these studies can be considered more inclusive by recruiting a more diverse sample, the embodiment condition is still limited to Black and White social groups. Moreover, even with the inclusion of a more varied participant pool, the predominant majority remains White in most cases, except a single study in which the characteristics of the embodied avatar are unclear (Alvidrez & Peña 2020).