Abstract
The application of virtual reality to the study of conversation and social interaction is a relatively new field of study. While the affordances of VR in the domain compared to traditional methods are promising, the current state of the field is plagued by a lack of methodological standards and shared understanding of how design features of the immersive experience impact participants. In order to address this, this paper develops a relationship map between design features and experiential outcomes, along with expectations for how those features interact with each other. Based on the results of a narrative review drawing from diverse fields, this relationship map focuses on dyadic conversations with agents. The experiential outcomes chosen include presence & engagement, psychological discomfort, and simulator sickness. The relevant design features contained in the framework include scenario agency, visual fidelity, agent automation, environmental context, and audio features. We conclude by discussing the findings of the review and framework, which highlight the multimodal nature of social VR and the importance of environmental context, and provide recommendations for future research in social VR.
1 Introduction
1.1 Virtual reality for psychology research
Virtual reality (VR) systems involve the use of some combination of computer-generated images, audio, and haptics designed to give the user the feeling that the virtual environment is real (Park et al. 2019). In recent years, virtual reality systems have become readily available in research, business, and commercial contexts as a result of improvements in technologies and affordability (Pan et al. 2018). One burgeoning field of application for VR is in the study of human conversations, one of the primary topics of study in psychology research (Yoon and Brown-Schmidt 2019).
Compared to some traditional conversation research paradigms, VR methods offer several notable advantages (Park et al. 2019). In theory, VR offers an enticing solution to the historical tension in the field of psychology between the desire for experimental control and ecological validity, or realism (Parsons 2015). VR allows for the chance to replace the use of static, abstract stimuli with responsive, multimodal, and contextually embedded scenarios, while allowing for near full control of what is presented, along with detailed recording of the behaviours of participants within the tool and potential for measures to be administered within VR. The requirement of developing specific VR tools for research can also improve the reproducibility of findings, as other researchers can more simply make use of the same VR tools.
In terms of tracking behaviours, the specific information recorded depends on the VR system employed, with popular HMD (Head Mounted Display) systems such as the Oculus Quest 2 tracking the head and hands of users (Carnevale et al. 2022). It is also possible to track the full body of a VR user by adding additional tracking devices or employing motion capture systems (Caserman et al. 2020). Eye tracking has also been implemented in HMD systems, with devices such as the HTC Vive Pro Eye having reliable eye tracking directly embedded into the device (Sipatchin et al. 2021). Neural recording methods such as fMRI, MEG, and EEG have also been used in VR studies (Lenormand and Piolino 2022; Li et al. 2021; Roberts et al. 2019; Tehrani et al. 2021), though the degree to which users can freely move while wearing these devices varies based on their portability.
The field of proxemics is a notable example of how the unique affordances of VR methodologies can be applied to psychology research. Proxemics, originally introduced by Hall (1968), refers to the perceptions and behaviours of people in relation to the space around them. In the current day, the tracking capabilities of VR along with the ability to precisely manipulate stimuli, even when moving, have been used to explore the proxemics of topics such as social settings (Duverné et al. 2020), crowds (Dickinson et al. 2019), and conversations (Kolkmeier et al. 2016). On a similar topic, distance perception has also been a popular avenue of study in VR (e.g. Ebrahimi et al. 2018; Ries et al. 2008; Vienne et al. 2020).
1.2 Social VR
The study of conversation is a field of psychology where VR presents real promise for addressing some of the historical issues. Traditional methods for studying elements of social psychology involve the use of trained actors or reducing the number and complexity of stimuli. For example, the Reading the Mind in the Eyes test presents a set of images of eyes and asks the participant to identify the emotion shown (Baron-Cohen and Wheelwright 2001). The inclusion of actors in a study leads to issues around replication, as the specific characteristics and behaviours of the actors influence outcomes (Baumeister 2016). Virtual characters in VR methods provide a compromise to address these problems by allowing for greater consistency and control of the conversational stimuli without fully abstracting from real world scenarios. Virtual characters are categorised as avatars and agents. Avatars are human-controlled characters in a virtual space, such as the character you control in a videogame. Agents, in comparison, are non-user-controlled virtual characters that are generally human-like in visual design. The level of sophistication for agents varies for their visual style and fidelity, as well as their interactivity and responsiveness.
The term social VR is used in different ways to broadly refer to different social aspects of VR. For example, it is often used to refer to commercially available online spaces/programmes designed for multiple users to interact with each other using VR devices (Handley et al. 2022). In research, it is more commonly used to refer to the tools for teaching (Bermejo et al. 2023), training (Howard and Gutworth 2020), therapy (Anderson et al. 2013), and exploring interaction with agents, with single or multiple users (Pan et al. 2018). This review is specifically examining the latter form of social VR, though some of the findings may have relevance to the design considerations of online space social VR systems. For more information on design principles and social experiences of online commercial forms of social VR see reviews by McVeigh-Schultz et al. (2018), Kolesnichenko et al. (2019), Jonas et al. (2019), and Cao et al. (2023), as well as the works of Guo Freeman (e.g. Maloney et al. 2020; Freeman and Maloney 2021; Freeman and Acena 2021).
While the application of VR to the study of conversation has many promising features, the use of VR in psychology is still in its infancy. There are also significant risks involved in treating VR methods as a magic solution that by default provides benefits such as greater ecological validity. Indeed, it is the manner in which VR tools are implemented that dictates their value (Christophers et al., in press). An unintentionally unnerving social interaction in VR with a robotic conversation partner could scarcely be argued to be more ecologically valid than traditional methods. On the methodological side, while no review has been conducted to date on VR conversation studies, the results of reviews of general psychology studies using VR have highlighted notable weaknesses. For example, poor reporting and a lack of open availability of tools were observed in a review conducted by Lanier et al. (2019). In addition, the methodologies employed in VR studies are heterogeneous, making comparison between studies a difficult task (Vasser and Aru 2020).
The lack of standards has led some researchers to refer to the field as the “Wild West” (Birckhead et al. 2019). Given the current state of the field, it is imperative that a shared understanding be established. The use of varied methods and designs of conversational agents and scenarios is not inherently problematic, but for the results of these studies to be meaningful we must be able to determine what characteristics of the tool are responsible for the observed psychological outcomes. To achieve this, the designs of these experiences must be intentionally crafted, with extensive reporting of the specific methodology employed.
One of the primary causes of these disparities is a lack of theory for the experience of conversing with a virtual agent. While there may be theories informing individual elements such as nonverbal gestures (Wang and Ruiz 2021), there is no unified theory of the complete experience. Following Whetten’s (1989) proposal of what constitutes a theory, a model must outline the key factors involved, the relations between these factors, explain these relations, and outline the circumstances in which the model applies. Through this, theories allow for clear communication between researchers and provide a structure for the collection, analysis, and interpretation of data, allowing for cross-comparisons for building a body of research on a topic (Hayes 2023). Toward that aim, it is necessary to first establish the relevant factors and their relations, as well as identify relationships that have been as yet untested.
1.3 Contribution
In this paper, we conducted a narrative review and developed a relationship map for the impact of key design features of dyadic conversations with VR agents on experiential outcomes. We use the term relationship map here to refer to a visual representation of the existing findings on the relations between factors, along with our expectations for how those features would interact in cases where they have yet to be directly examined. This relationship map is aimed both at providing the basis for the development of a theoretical model and at providing VR conversation researchers with an overview of the current state of the field and a baseline set of expectations for the impact of design choices.
The specific dyadic conversation format looked at in this paper involves one human interacting with one agent. The selected outcomes are experiential in nature from the perspective of the user and are also based on the common aims of current VR tools, such as wishing to prompt feelings of social stress (Zimmer et al. 2019). Our relationship map was based on the findings of a narrative review of the current literature we conducted.
The narrative review drew from diverse disciplines, including social psychology, conversation analysis, HCI, environmental study, as well as social VR, both general and conversation specific. This decision was motivated by several factors, namely the limited quantity and varied quality of VR conversation studies to date, the multidisciplinary nature of the area, and the aim of identifying knowledge gaps in the current field of VR conversation research.
Using the findings of the review of the state of research for social VR, we developed a relationship map. This map defines expectations for how design features impact qualitative outcomes for dyadic agent conversations. In the narrative review results section, we explore and define the key design features of dyadic VR agent conversations, the relevant experiential outcomes they influence, as well as their relationships with other features.
2 Narrative review results
2.1 User experiential outcomes
The scope of this review in terms of outcomes is purely concerned with the conscious, self-reported outcomes of the user during and following the VR conversation experience. Because of this, no behavioural or physiological outcomes are included. The specific outcomes were chosen based on their relevance to dyadic agent conversations, as well as their popularity as a topic of examination in the field of social VR. One outcome that was originally included in this list was simulator sickness: unintended side effects of VR experiences including dizziness, nausea, and blurred vision. While this is a popular topic of study in VR and an important consideration when designing VR experiences, we consider it to play a more minor role in dyadic conversation experiences due to the typically stationary format of these conversations (Pan et al. 2018). Participants typically sit or stand opposite the agent without significant movement or virtual locomotion, which generally pose the greatest risk of cybersickness (Saredakis et al. 2020). For more information on the topic of cybersickness see the recent review conducted by Tian et al. (2022).
Below, we first present the major categories of user experiential outcomes. Further below, we present the key features of relevant VR experiences and research findings on how they impact (or do not impact) the listed outcomes.
2.1.1 Presence and engagement
Presence refers to the degree to which users feel they are “really there” in a virtual environment, and that elements of that world are “real” (Piccione et al. 2019). Engagement here refers to a temporal set of affective and motivational experiences of the user during conversation (Lohse et al. 2016; Wiebe et al. 2014), with levels of presence being significantly related to levels of engagement (Deriu et al. 2021). Presence in particular is regularly assessed in social VR research, due to its theorised role as a mechanism in the effectiveness of VR therapy (Price et al. 2011) and the emotional impact of VR experiences (Diemer et al. 2015). While results are somewhat mixed, the general findings show that heightened presence can lead to greater emotional responses (e.g. Jicol et al. 2021), particularly for emotions related to arousal such as fear (Diemer et al. 2015).
2.1.2 Social presence
One important element of interaction with social agents is the concept of social presence, the experience that a character is “real” and that you can perceive their thoughts and emotions (Biocca 1997). This is often viewed as a way of assessing how successful a communication system is at emulating face-to-face interaction with a human and is regularly theorised by researchers to lead to greater positive social outcomes (Oh et al. 2018), such as positive emotional experiences. High levels of social presence can, however, also increase psychological discomfort for some individuals, particularly for those who are generally uncomfortable with social interactions (Allmendinger 2010; Cortese and Seo 2012). Findings are somewhat limited and mixed on whether social presence has a direct relationship to general presence, with the overall results suggesting that social presence has a positive correlation with general presence (Bulu 2012; Thie and van Wijk 1998; Zhang and Zigurs 2009). It should be noted that the nature of this relationship, as well as the manner in which presence and its related components should be understood and operationalised are ongoing points of debate in the field (e.g. Latoschik and Wienrich 2022; Skarbez et al. 2018; Slater et al. 2022).
2.1.3 Psychological discomfort
One of the most frequently examined outcomes in psychology studies using VR, psychological discomfort refers to the degree of stress, fear, and/or anxiety that result from the conversation (Somarathna et al. 2022). Psychological discomfort appears to have a bidirectional relationship with presence/engagement, with greater levels of fear leading to increased presence/engagement and vice versa (Diemer et al. 2015; Jicol et al. 2021). Examples of how this is studied in relation to conversation include social stress induction paradigms (Zimmer et al. 2019), public speaking phobia (Jinga et al. 2021), and social anxiety (Kerous et al. 2020). Psychological discomfort is naturally interlinked with other forms of emotional experience.
A distinction is drawn in this relationship map between psychological discomfort and other emotional experiences. This distinction is not a claim that these are perfectly discrete categories of outcomes; rather, it is motivated by the resounding popularity of phobic research and therapy in the VR space. In addition, the results of studies examining the relationships of core elements of VR such as presence with nonphobic emotions are distinct when compared to similar work on fear-related emotions (Diemer et al. 2015).
2.1.4 Non-fear-related emotional experience
This outcome encompasses the emotional experiences of users during the VR conversation that are not directly associated with fear, such as joy, sadness, and relaxation. Emotions are one of the core interests of psychological research, and prompting specific emotions has been a popular aim in both psychology and therapeutic research using VR (Somarathna et al. 2022). Compared with the relationship between psychological discomfort and presence, the relationship between non-fear-related emotion and presence is less consistent and does not appear to have the same bidirectionality (Diemer et al. 2015).
2.2 Key design features
This set of key design features for dyadic agent conversations was selected based on a combination of their importance to the topic, along with how frequently they varied between and within studies in the area of social VR. The proposed relationships are not intended to be reductive, universal rules, but rather inform the reader of current findings and expectations of the area and prompt consideration on how best to apply those features when developing VR tools, based on the specific aims and qualities of the project.
Some features were considered but ultimately omitted from the relationship map, including the type of VR device used. This was removed as a standalone feature due to the wide array of devices currently available and the lack of research directly comparing the impact that the VR device type as a whole has on the identified outcomes for social VR experiences. In its stead, relevant components are included in other features below, such as the display fidelity, level of control agency, haptic-based interpersonal touch, and audio quality. As VR technology will continue to advance and shift, focusing on the affordances of those devices is a potentially more valuable avenue of study.
For each feature in this section, a definition and background are provided, followed by a description of the feature’s relationship with the experiential outcomes and with other features (see Fig. 4 for full relationship map).
2.2.1 Scenario agency
One element explored in several VR papers is the effect of the VR scenario being active or passive in nature. Here, we define agency in VR scenarios as being two-pronged, involving both the ability to freely move throughout the space and the degree to which the user can impact the scenario. A VR scenario is active when the participant is involved and has control over the experience, such as being able to freely move around an area and talk to whomever they choose. In contrast, passive scenarios only allow the user to look around as a predetermined experience plays out around them. Freedom of movement in VR can vary from three degrees of freedom (tracking head rotation, but not movement) at the most restrictive, to six degrees of freedom (translating the user’s movement through space), with the further ability to move to other areas in the scene through some form of virtual locomotion/steering (movement in VR that exceeds the physical input, usually from a handheld controller) or teleportation.
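To make the distinction between three and six degrees of freedom concrete, the following minimal Python sketch models which parts of a measured headset pose reach the user under each tracking mode. The names `HeadPose` and `apply_tracking` are hypothetical and are not drawn from any VR SDK; the sketch simply assumes that a 3DoF device passes rotation through and discards positional change.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class HeadPose:
    # Rotation in degrees and position in metres (illustrative units).
    yaw: float = 0.0
    pitch: float = 0.0
    roll: float = 0.0
    x: float = 0.0
    y: float = 0.0
    z: float = 0.0


def apply_tracking(current: HeadPose, measured: HeadPose, dof: int) -> HeadPose:
    """Return the virtual pose given the physically measured pose."""
    if dof == 6:
        # Rotation and translation are both carried into the scene.
        return measured
    if dof == 3:
        # Only rotation is updated; the virtual position stays where it was.
        return replace(
            current, yaw=measured.yaw, pitch=measured.pitch, roll=measured.roll
        )
    raise ValueError("dof must be 3 or 6")


start = HeadPose()
measured = HeadPose(yaw=30.0, x=0.5)  # user turns and leans half a metre
pose_3dof = apply_tracking(start, measured, dof=3)
pose_6dof = apply_tracking(start, measured, dof=6)
```

Under these assumptions, `pose_3dof` keeps the starting position while adopting the new yaw, whereas `pose_6dof` reflects the lean as well, which is why a 3DoF setup cannot let a user step back from an agent.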
In the domain of pain management, a study conducted by Phelan et al. (2019) found that active scenarios extended the pain thresholds of participants and were rated as being more engaging and immersive as compared to the passive scenarios. While pain management has been the area that has most commonly examined the impact of active and passive scenarios (Boylan et al. 2018; Furness et al. 2019), similar findings of improved engagement and immersion for active scenarios were reported in studies looking at skills training (Piccione et al. 2019), and social anxiety (Sekhavat and Nomani 2017). For social presence specifically, greater agency is also associated with greater levels of social presence (Fortin and Dholakia 2005; Oh et al. 2018; Skalski and Tamborini 2007).
To achieve greater levels of scenario agency, VR device setups that include motion tracking of multiple body points provide greater opportunities for interaction. Active scenarios have been found to result in greater levels of presence/engagement compared to passive scenarios (Furness et al. 2019; Piccione et al. 2019). In terms of emotional outcomes, little work has been conducted directly examining the influence of scenario agency levels. The results of a pilot study looking at social anxiety suggest that active scenarios were more impactful (Sekhavat and Nomani 2017), while another study that compared traumatic 2D films with interactive VR scenes found no differences in terms of negative emotional impact (Dibbets and Schulte-Ostermann 2015). A more recent study conducted by Jicol et al. (2021) directly examined the relationships between emotions (happy vs fear), agency, and presence. They found that agency was positively correlated with presence and moderated the impact of emotion on presence.
Active scenarios can moderate the impact of an agent’s proximity on psychological discomfort, as users can freely move around the space and move to a more comfortable distance. Passive scenarios are likely to moderate the impact of a self-avatar on presence, as being unable to move and have the avatar move in line with you removes some of its key benefits (Makled et al. 2018).
In relation to conversations in VR, scenario agency can relate to a variety of aspects including whether the user can freely move around the area, whether they can initiate conversations or are forced into them, and the degree of responsiveness of the conversation agents. In general, active scenarios are more impactful, and we would argue that some level of interactivity should be included in VR conversation tools where possible to better immerse the user.
2.2.2 Visual fidelity
In terms of technical features of the VR scenario, the visual fidelity of the experience is a key consideration. Visual fidelity has a layered meaning in the field of VR, including both the technical features such as texture resolution, as well as the design which is considered in terms of realism (realistic vs. cartoon). As noted by Vasser and Aru (2020), two studies that purport to address the same topic can have wildly varying levels of visual fidelity, leading to potential reliability issues. While the general trend in the field has been a push towards achieving realistic visuals, there remains debate on whether this is important to the experiences and study outcomes (Slater et al. 2020). At the most basic level, more realistic visuals may lead to users feeling more present in the scene, and greater engagement (Riva et al. 2019; Rizzo and Koenig 2017).
As visual fidelity and realism increase so too do the risks of inducing the “uncanny valley” effect. This is the idea that virtual characters can cause feelings of eeriness and aversion the more realistic and human-like they are. Findings on the effect suggest this comes as a result of inconsistent realism between elements of the character, such as their visuals and behaviours. For example, the more realistic the character looks, the greater our expectations for them to be lifelike and natural feeling (Kätsyri et al. 2018). In order to avoid this problem, for research questions that are not contingent on realistic visuals we would argue it is often preferable to make use of moderate levels of fidelity.
On the topic of conversation, a study that looked at the influence of visual fidelity on anxiety in a job interview scenario found that the visuals had no significant impact on anxiety, with the level of anxiety appearing to be more related to the scenario and the sensitivity of the participant (Kwon et al. 2013). For simulation training programmes, the visual fidelity of the hardware was also non-significant as a moderating factor for their effectiveness (Appel et al. 2020). For an example of a tool with moderate fidelity visuals see Fig. 1. Taking this approach also has the added benefit of reducing the complexity and development time of a VR tool.
Along with degrees of realism, other design elements of agents have been investigated. In line with findings on self-similarity, where people tend to like and trust other people who look like them (Byrne 1971; Montoya et al. 2008), participants who had virtual conversations with other participants rated avatars who looked similar to them as being more likeable and less eerie compared to dissimilar avatars (Shih et al. 2023).
While this feature is multi-layered, general findings suggest that more realistic visuals inspire greater levels of presence and engagement (Vasser and Aru 2020). In line with the findings related to the uncanny valley, findings on the impact of the realism of character models on social presence suggest that greater visual realism only consistently leads to increased social presence when the level of behavioural realism is appropriate (Bailenson et al. 2005; Garau et al. 2003). The impact of the virtual reality device’s display, in terms of image definition and display size, has also been examined by a small number of studies, with two (Ahn et al. 2014; Bracken 2005) finding that better displays resulted in higher levels of social presence, with another two finding no effect of display fidelity (James et al. 2011; Skalski and Whitbred 2010). These display characteristics are in part or fully determined by the technical features of the VR device used.
The visual fidelity of self-avatars has been found to mediate their influence on presence and embodiment (Gorisse et al. 2019). Visual fidelity can also mediate the influence of nonverbal signals on both categories of emotional experiences as more realistic visuals can result in greater expectations for those motions to be lifelike and risk uncanny valley effects (Mori et al. 2012). With that said, combining higher levels of visual realism with appropriate behavioural realism has been found to lead to higher levels of positive affect towards the characters (Ferstl et al. 2021b; Zibrek et al. 2018; Zibrek and McDonnell 2019). Lastly, the visual fidelity of the environment mediates its impact on emotional experiences (both categories), with more realistic environments generally heightening the emotional impact (Newman et al. 2022).
2.2.3 Inclusion of a self-avatar
Another technical feature of VR tools that has received attention is the self-avatar. When using a head mounted display (HMD), a VR device you place on your head, the user is no longer able to see their actual body as their field of view is covered by the device’s screen. To address this, virtual bodies that match the movements of the user known as self-avatars have been implemented in some tools (Pan et al. 2018). Research looking at the impact of self-avatars has found that they can lead to greater user engagement, presence, and sense of embodiment in the scene (Parmar et al. 2022; Wagnerberger et al. 2021; Young et al. 2015). On the more physical side of things, several studies have demonstrated that having a virtual self-avatar can aid in distance perception tasks within VR (Lin et al. 2015; Phillips et al. 2010; Ries et al. 2008). Recent results also suggest they could enhance performance or training effects in general (Birk and Mandryk 2019; Friehs et al. 2022; Ratan et al. 2022).
The specific design of avatars can also have an impact, with one study finding that having a self-avatar that is dissimilar to the user can reduce the amount of social anxiety experienced, when compared to having an avatar that looks like their real life self (Aymerich-Franch et al. 2014). Likely the most prominent finding from this area of study is the proteus effect, which suggests that users generally act in line with how they would expect their avatar to behave (Praetorius and Görlich 2020). For example, participants who exercised with an obese avatar showed decreased physical activity (Peña et al. 2016). A review of this effect carried out by Ratan et al. (2020) indicates that effect sizes are relatively consistent, falling between small and medium (0.22–0.26).
Offering users the ability to customise their avatars can also influence a variety of outcomes. While most studies on avatar customisation have focused on non-VR contexts such as health intervention tools or video games, findings from these areas can help provide expectations for the effects of implementing them in VR tools. As an example, the creation of customised self-avatars appears to lead to increased motivation for participants using digital self-improvement programmes (Birk and Mandryk 2018, 2019; Darville et al. 2018). In line with the proteus effect, prompting participants to make specific types of self-avatars can also impact behavioural and qualitative outcomes (Peña et al. 2022; Sah et al. 2017), such as students who created avatars that resembled their actual selves performing better than those who created ideal self or future self-avatars (Ratan et al. 2022). Additionally, participants who could personalise their avatars reported higher levels of presence during an experience where they performed various motions in front of a mirror (Waltemate et al. 2018), though this may be a more pointedly embodied experience than typical social VR experiences.
The inclusion of these self-representations is linked with greater feelings of presence and embodiment in the space (Caserman et al. 2020; Parmar et al. 2022). The nature of the avatar, such as its visual characteristics, can influence both categories of emotional experience, as the proteus effect suggests that users will, to a moderate degree, match their behaviours and mindset to their expectations of the avatar (Praetorius and Görlich 2020; Vahle and Tomasik 2022). While the influence of self-avatars on social presence when in conversation with an agent has not been specifically investigated, results from virtual interactions between participants suggest that full self-avatar embodiment results in greater levels of social presence (Aseeri and Interrante 2021; Cho et al. 2020; Heidicker et al. 2017; Yoon et al. 2019). For a more in-depth examination of self-avatar design choices, see Weidner et al. (2023).
2.2.4 Non-verbal signals
Nonverbal forms of communication are an important element of conversation (Gatica-Perez 2009), leading researchers to regularly develop sets of these behaviours for virtual agents (Wang and Ruiz 2021). The nonverbal avenues of communication studied for social agents are primarily physical in nature, including facial expressions, gestures, and touch. One common application of reactive emotional expressions is in the domain of public speaking exposure therapy. El-Yamri et al. (2019) developed a system where the emotional expressions of the audience in a VR public speaking setting would react to the voice tone, speech content, and gaze behaviours of the user. The emotional valence of the audience in turn can influence the level of anxiety experienced by the speaker (Jinga et al. 2021).
Gestures are motions and poses typically made with hands and arms as part of communication, generally in combination with speech. Their form and function have previously been categorised into systems such as those described by McNeill (1992). These gestures range from small beat gestures that move in rhythm with words, to pointing-based deictic gestures that indicate a location, and emblem gestures that represent objects or concepts, sometimes in place of words. For social agents, gestures are one of the most commonly implemented forms of nonverbal signals (Wang and Ruiz 2021). With the aim of improving the believability and level of expressiveness of the agents, behaviours including nodding (Cassell et al. 1999), beat gestures (Mancini et al. 2011), and emblem gestures (Rickel and Johnson 1999) have been developed. In one study, two forms of agent nodding behaviour were compared (Aburumman et al. 2022). While listening to participants, agents would either exhibit fast nodding, or nodding that mimics the users’ nods with a short delay. Both implicit and explicit measures demonstrated that participants liked and trusted the agents that mimicked their nodding. This highlights the importance of nonverbal communication being delivered in an appropriate manner. This is reinforced by the findings of Conrad et al. (2015), who found that while agents who displayed more facial expressions prompted more acknowledgements and smiles from participants, they were rated as being less natural. To support appropriate delivery, systems have been developed with the aim of automatically generating nonverbal gestures based on the characteristics of vocal speech (e.g. Marsella et al. 2013; Yang et al. 2020). A recent model developed by Ferstl et al. (2021a) was rated as being significantly more appropriate when compared to randomly generated gestures (see Fig. 2 for an example gesture sequence).
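The delayed nod mimicry described above, in which an agent reproduces the user's nods after a short delay, can be illustrated with a minimal Python sketch. This is not the implementation used by Aburumman et al. (2022); the class name `DelayedMimic`, the use of head pitch as the nodding signal, and the one-second delay in the usage lines are all assumptions made for illustration.

```python
from collections import deque


class DelayedMimic:
    """Replays the user's head pitch on the agent after a fixed delay."""

    def __init__(self, delay_s: float):
        self.delay_s = delay_s       # hypothetical delay parameter, in seconds
        self._samples = deque()      # (timestamp, pitch) pairs, oldest first
        self._agent_pitch = 0.0      # agent starts with a neutral head pose

    def update(self, t: float, user_pitch: float) -> float:
        """Record the user's pitch at time t and return the agent's pitch."""
        self._samples.append((t, user_pitch))
        # Release every buffered sample that is at least delay_s old,
        # so the agent ends up showing the newest sufficiently old pitch.
        while self._samples and t - self._samples[0][0] >= self.delay_s:
            _, self._agent_pitch = self._samples.popleft()
        return self._agent_pitch


mimic = DelayedMimic(delay_s=1.0)    # illustrative value only
trace = [
    mimic.update(0.0, 10.0),         # user begins to nod
    mimic.update(0.5, 20.0),         # user reaches the bottom of the nod
    mimic.update(1.0, 0.0),          # agent now shows the pitch from t = 0.0
    mimic.update(1.5, 0.0),          # agent now shows the pitch from t = 0.5
]
```

A buffer of timestamped samples keeps the sketch independent of frame rate; a production system would also need nod detection and smoothing, which are omitted here.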
The inclusion of social touch in conversation has also shown promise in the field of VR agent research. These studies generally make use of mixed reality methods, combining HMD VR systems with either physical props or haptic feedback synced up with the touch of a social agent. As an example, Hoppe et al. (2020) created an artificial hand that was used to apply social touch to participants, in this case, a tap on the shoulder. They found that the inclusion of this touch led to participants reporting greater presence for the agent they were interacting with, as well as greater uncertainty of the distinction between avatars and agents. In the domain of economic bargaining, social touch delivered through haptic vibrations was found to generally increase compliance with unfair offers (Harjunen et al. 2018). These findings demonstrate the persuasive and engaging power of touch in VR applications.
The posture held by agents is another avenue of nonverbal communication that has been studied, though less commonly compared to gestures and gaze behaviours (I. Wang and Ruiz 2021). Posture involves the orientation of one’s body, and plays a part in communicating emotion and intention (Dael et al. 2012). Typically implemented in agents intended for therapeutic use cases, agent postures are manipulated with the intent of improving rapport (e.g. Gratch et al. 2006; Kang et al. 2008; Huang et al. 2011; DeVault et al. 2014). There have been few studies to date examining the impact of agent postures on psychological outcomes. Results from a study examining postural mirroring during a job interview found no significant differences in terms of stress or presence, though the female agent was rated as warmer when it exhibited postural mirroring (Antonio Gómez Jáuregui et al. 2021). Comparing an agent displaying open and closed body language, Li et al. (2018) also found no impact on presence.
Lastly, one of the most developed and commonly studied sets of features in VR is gaze behaviour. Gaze behaviours can provide cues to coordinate the flow of conversation (Kendon 1990), and indicate attentiveness (Heylen 2006). In terms of avatars, gaze behaviours that match the speech of the user were viewed more positively and resulted in greater social presence ratings (Garau et al. 2003). In a study on the single and joint effects of agent gaze and proxemics during interaction, the gaze and proxemics responses of participants were strongest when agents manipulated both at the same time (Kolkmeier et al. 2016). The implementation of an algorithm that matched gaze patterns of virtual agents to categories of emotional states also resulted in increases in the sense of general presence in the scenario (Randhavane et al. 2019a, b).
The nature of the signals performed by the agent, such as their valence, frequency, and format (e.g. expression, gestures, touch) has been shown to influence the emotional experiences of users (El-Yamri et al. 2019). Poorly implemented or unnatural nonverbal communication from the agent can also lead to reduced presence, social presence and engagement (Conrad et al. 2015; Garau et al. 2003; Oh et al. 2018). For social presence in particular, a study was conducted that made use of a model to generate movements (gait, gesture, gaze) for agents with the intent of being friendly and likeable, based on their interactions with a user (Randhavane et al. 2019a, b). The application of their model increased the level of reported social presence, once again emphasising the value of nonverbal signals being responsive and based on the ongoing interaction. It is worth noting that personal conversation style preferences and contextual factors of the conversation may moderate the value of mirroring (Aneja et al. 2021; Wang and Ruiz 2021). The value of nonverbal signals in fostering a sense of social presence has been similarly supported by a range of studies, with higher levels of behavioural realism resulting in higher levels of social presence (Bailenson et al. 2005; Garau et al. 2005; Guadagno et al. 2007; Nowak and Biocca 2003). For a more in-depth review of agent nonverbal communication see Wang and Ruiz (2021).
2.2.5 Level of agent automation
Virtual social agents can be classified into two categories according to their level of autonomy. Fully autonomous agents have no human input and converse with users using some combination of their speech and body language to respond with appropriate communication cues in order to give the feeling of a natural conversation. Semi-autonomous agents, on the other hand, operate using a “Wizard of Oz” setup, where the behaviours and responses of the virtual character are operated out of sight by a researcher or therapist (Pan et al. 2018). Semi-autonomous agents are far less complicated to implement than fully autonomous conversation agents.
One important consideration and potential drawback to Wizard of Oz systems is the added latency in responses and/or nonverbal signals depending on the degree to which the operator controls the agent. As an example, the operator may need time to figure out which of their predetermined responses is most appropriate, and simultaneously select the nonverbal behaviours to accompany the speech. For nonverbal signals in particular, timing is an important element contributing to whether they feel natural to the user (Aburumman et al. 2022; Ota et al. 2021), and for speech, rhythm is a key element of conversation quality (Borrie et al. 2019; Clark 1996), with larger delays potentially leading to more negative perceptions of the conversation partner (Schoenenberg et al. 2014). These problems can also apply to more automated systems depending on the speed at which they analyse and respond.
While several fully autonomous agents have been developed (DeVault et al. 2014; Kahl and Kopp 2018; R. Zhao et al. 2016), to date they have considerable limitations. The first is that they are highly specialised, designed for particular social circumstances and one or two pre-established topics of conversation. The behaviours both recognised and performed by the agent must also be clearly established beforehand by the developers. Lastly, most of these systems currently require the use of non-immersive VR displays in order to allow for the expressions of users to be clearly visible.
At the current stage of development, the expectation is that agents with high autonomy will be more limited, and as such prompt lower levels of presence and engagement (Pan et al. 2018). For social presence specifically, studies to date have typically studied the impact of perceived avatar/agent status rather than their actual level of automation (Oh et al. 2018), with the direct impact of degrees of agent automation being understudied. With that said, findings of a recent meta-analysis suggest that social presence ratings are typically higher for perceived avatars compared to perceived agents (Felnhofer et al. 2023). Additionally, highly autonomous agents run the risk of increasing feelings of psychological discomfort through uncanny valley effects, particularly if the topic of discussion is emotional in nature (Stein and Ohler 2017).
2.2.6 Agent proximity
The proximity of the agents should also be considered. Proximity in this case refers to the position and angle at which an agent is placed in the virtual space relative to the user. When talking, we generally have a preferred distance between us and our conversation partners. This can vary both across cultures (Sicorello et al. 2019) and in the moment in response to how the conversation is playing out (Bönsch et al. 2018). While this preferred personal space includes not wanting to be too close or too far from our conversation partner, recent findings suggest that being too close causes greater and more immediate discomfort when compared to being too far away (Welsch et al. 2019). As a result of this, agents that violate the personal space boundaries of participants will likely result in feelings of discomfort.
The angle at which an agent is standing can also serve as an additional source of nonverbal communication. As an example, a virtual agent developed by Pejsa et al. (2017) altered its body alignment in order to cue the next speaker in a triadic (two humans, one agent) conversation. While dyadic conversation generally takes place either looking directly at one another or at a 90° angle (Kendon 1990), direct orientation appears to make people feel more attended to (Nagels et al. 2015), though as with all nonverbal communication, this is dependent on factors such as culture and gender (Brugel et al. 2015). The characteristics of a user’s self-avatar, particularly their arm length, have also been found to influence interpersonal space expectations (Buck et al. 2022). Additionally, the form of VR used may influence personal space preferences, with one study finding that users preferred larger distances in CAVE systems compared to when in an HMD (Bönsch et al. 2020).
The agent being too far from the user can cause psychological discomfort, with more immediate discomfort potentially arising in cases where the agent intrudes inside the user’s personal space (Hecht et al. 2019; Welsch et al. 2019). Similarly, participants who were asked to walk through agents in augmented reality exhibited heightened physiological arousal and reported qualitatively that the experience felt unnatural and uncomfortable (Huang et al. 2022). The emotions displayed by agents have also been found to influence the size of the personal space of participants, with greater distance being kept when interacting with angry agents (Bönsch et al. 2018).
2.2.7 Environmental context
The environmental context of an immersive virtual environment includes its setting (e.g. an office space; see Fig. 3 for a set of example locations), layout, features, and visual design. This multifaceted set of features is a vital element of most virtual immersive experiences, particularly for building a sense of presence (Newman et al. 2022). The most commonly studied element of environmental context in terms of experiential outcomes is the impact of nature and urbanity.
To start with the layout, work in the field of architecture has been conducted looking at its experiential and behavioural outcomes. The general finding is that exposure to scenes of nature, even just through photographs, leads to stress reduction and physiological relaxation (Jo et al. 2019). This effect is reflected in the improved affect for people living in urban areas with nature fixtures such as parks when compared to those living in other urban areas (van den Bosch and Ode Sang 2017). These findings have motivated the development and study of the efficacy of a series of immersive nature environments, particularly for therapeutic applications (Appel et al. 2020; Blum et al. 2019; Kim et al. 2017). The results of these studies support the potential of nature scenes for relaxation and prompting positive affect.
Looking more specifically at the structure of the environment, a CAVE VR study examined the influence of architectural design on physiological stress reactions (Fich et al. 2014). Participants were asked to carry out a series of stressful tasks (such as giving a speech) in front of a panel. The layout of the virtual room in which these tasks took place was manipulated, and participants displayed significantly greater levels of stress in the closed room (no visible exits or windows) than in the open room (three large openings in the wall).
The relevance of these elements is linked with the architectural design concept “visual comfort”, which is the subjective perception of comfort drawn from an individual’s visual environment (Davis and Nutter 2010). Building on this concept, Cha et al. (2020) carried out a VR study looking at the effect of interior colour schemes on emotions, heart rate, and productivity. They found that the colour changes had a significant impact, with blue, white, and green scenarios leading to lower heart rates and red colour schemes being rated as more exciting, unpleasant, and tense. A later study also found that varying the colour schemes of a VR environment impacted both qualitative and physiological measures for participants (Li et al. 2021). While the influence of visual features on psychological factors is a relatively underexplored avenue compared to nature scenes, existing theory and findings highlight the importance of considering how they may influence the results of Social VR studies.
Other elements of the environmental context to bear in mind include the social setting and how populated with characters the environment is. The social setting refers to the social expectations of the location, as well as an individual’s subjective relationship with the setting. This sociological concept was explored in VR by Duverné et al. (2020), who found that proxemics (people’s perception and use of/movement through space) norms varied according to their subjective relationship with the social setting, with no main effect for the settings themselves. Lastly, a study that employed a VR crowd simulation in a university setting found that the denser the crowd, the greater the levels of negative affect reported, along with differences in proxemics behaviours within the space (Dickinson et al. 2019).
Based on these findings, we would argue that environmental context is a key factor to consider for agent conversations in VR. The social setting of the experience provides a set of expectations for the user in terms of how they should act, as well as how the agent should act. On the more implicit side of things, the layout, visual, and auditory qualities of the environment in which the conversation is taking place will also impact their experience of the interaction.
The setting employed has been found to moderate the impact of visual fidelity on presence and engagement in some cases, as strong sets of expectations for the situation can lead users to pay less attention to the visual features of the scene (Kwon et al. 2013). The environmental context also mediates the impact of an agent’s proximity on psychological discomfort, as different settings provide different sets of expectations for personal distance (Duverné et al. 2020).
2.2.8 Audio features
The audio features of an agent interaction, as part of this relationship map, include any agent vocal utterances, along with any background audio and the quality of the audio system. While voice is a critical element of typical conversation (Moore et al. 2016), it is an understudied element of social agent interactions (Hortensius et al. 2018). With that said, as with many of the outcomes discussed in this paper, findings from face-to-face conversations and purely voice-based agents can be used as a baseline set of expectations to be experimentally validated in the field of research.
Vocal characteristics have been found to influence perceptions of a speaker’s personality traits (Wang et al. 2021) and emotional state (Mehrabian 2008). For example, a higher vocal pitch has been found to lead to greater attribution of feminine traits and likeability (Ko et al. 2006; Krahé et al. 2021; Pisanski and Rendall 2011). These characteristics appear to interact with stereotypes individuals hold about the speaker, such as those related to their gender (Aung and Puts 2020; Jin and Park 2023), with pitch differences in some cases leading to different attributions for men compared to women. We would expect differences in perception of the character based on vocal characteristics of a social agent to impact emotional experience, and potentially psychological discomfort (e.g. feeling the agent is judgemental or cold), but this is a yet unexplored avenue of research.
The soundscape of the environment is another important consideration. Consisting of the auditory stimuli related to the location, the soundscape includes both background noises and sounds that play in response to your movement and actions. The inclusion of an appropriate and reactive soundscape can enhance feelings of presence and realism (Kern and Ellermeier 2020; Zhao et al. 2021).
While pre-recorded vocal lines are often used for social agents (Pan et al. 2018), rapid advances have been observed in synthetic speech technology in recent years (Tan 2023). Synthetic speech involves output from computer systems that are designed to mimic human speech. The source of input for synthetic speech can range from vocal recordings which are then digitally altered (e.g. pitch frequency, speech rate, vocal tract length), to systems like IBM Watson that only require text (Cabral et al. 2017). At the current time, text-to-speech based systems typically lack elements of human expressiveness compared to systems making use of vocal recordings (Higgins et al. 2022). The applicability of these methods to social agents has also been examined, though primarily for purely audio-based agents. Recent findings suggest that while users accepted and liked the synthetic voices, they still considered them to be more eerie compared to human speech (Mckie et al. 2022). This has been proposed to result from uncanny valley effects (Do et al. 2022; Kühne et al. 2020). Recent studies combining synthetic speech with virtual agents have largely supported this, reporting increased eeriness, particularly for perceived mismatches between social cues across modalities (Abdulrahman and Richards 2022; Higgins et al. 2022). For these studies, results were mixed on whether there was a difference in social presence between human and synthetic speech. As it stands, the application of text-to-speech may result in increased psychological discomfort compared to human speech, but based on the rate of advancement this effect may be reduced significantly, or eliminated, in coming years. Additionally, developments in the field of natural language generation are likely the next step for agent voice interaction, potentially allowing for agents to respond in naturalistic manners to users without the need for text supplied by an outside party (Foster 2019).
Regarding emotional impact, a study comparing voice types found that participants had stronger emotional reactions to recording-based synthetic voices and natural voices than to text-to-speech voices, with the emotional state of the character also having a direct effect on emotional outcomes (Higgins et al. 2022).
On a more general level, while investigations of the relationship between audio quality and social presence are limited, findings suggest that higher audio quality leads to higher levels of social presence (Christie 1974; Dicke et al. 2010; Skalski and Whitbred 2010). An avenue of potential interest that has been understudied to date is the use of non-semantic verbal interjections (e.g. “hmm”, “argh”, “uhh”), the addition of which to a voice-based agent resulted in greater levels of rapport and enhanced learner motivation (Ceha and Law 2022). Another is the inclusion of dynamic speech directivity, where different types of vocal noises influence the direction of the sound in line with real life speech (Arai 2001). A series of recent studies implemented dynamic audio systems towards this end, though it remains unclear how much such systems affect ratings of naturalness, with participants generally not being able to identify the difference between static and dynamic systems (Ehret et al. 2020; Noufi et al. 2023; Sugimoto and Kinoshita 2023).
3 Discussion and conclusion
3.1 Limitations
One of the limitations of the model presented in this paper is its scope. Other conversation setups are present in the VR literature, such as group conversations with an increased number of agents (Novick et al. 2018), conversations with other human-controlled avatars (M. Wang 2020), and conversations with a combination of agents and avatars (Pejsa et al. 2017). While some relationships identified in our review will hold true across these scenarios, particularly those related to the environmental context or visual fidelity, the move from dyadic to group conversations involves a significant shift in the dynamics and complexity of the interaction (Cooney et al. 2020).
Another limitation comes from the current field of VR conversation literature. Due to the infancy of the field, we were required to broaden the scope of the narrative review to draw from wide-ranging areas of study. While this had advantages for the review in terms of allowing for a more layered perspective on the topic, the lack of direct research in the area required us to make assumptions about the transferability of in-person findings to VR scenarios.
3.2 Recommendations and conclusion
In terms of recommendations, we argue that the next step for developing the field of social VR is the development of a complete theoretical model of the experience of interacting with a social agent. The findings of this paper can serve as a basis for building this theory. While the field is still in its infancy and many areas of interest are as yet understudied, the development of a model could aid in the development of a shared understanding of the experience, particularly between disciplines (Hayes 2023; Whetten 1989). One of our other primary suggestions is to make research data and tools freely available to other researchers. This would provide a twofold improvement to the field, first by allowing for improved replicability. Secondly, the availability of these tools would provide better opportunities for researchers to build upon existing paradigms without having to start from scratch each time.
Another recommendation is to take the factors and relationships identified in our relationship map into account during the development of tools that make use of dyadic agent conversations. See Fig. 4 for a summary of the relationships. As well as this, it is vital that future research continues to directly investigate the relationships between design features and user outcomes in VR in order to strengthen our understanding of them, as well as identify their relative importance.
In terms of other opportunities for research in the field, one underutilised feature of VR as a method is the potential for reactive stimuli. While some of the agents discussed in this paper make use of some combination of the user’s speech and movements to inform how they behave, the degree of responsivity could be improved in future. One avenue is the utilisation of physiological tracking data, which has previously been used for customising VR training (Uribe-Quevedo et al. 2021), assessing mental workload (Luong et al. 2020), and emotion recognition (Gupta 2022).
The key features identified in the model highlight the multidimensional nature of VR conversations, including visual, technical, behavioural, and contextual factors for the immersive environment and the design of the agent. Based on this, we would argue that there is considerable value in the inclusion of additional measures when conducting social VR research. For example, rather than simply assessing a participant’s level of anxiety following a job interview scenario, measurements should be taken to assess what specifically contributed to those feelings. Additionally, the diverse behavioural measures that VR affords should be taken into consideration as another source of insight into conversation.
In conclusion, the application of VR methods to the field of psychology, particularly for the study of conversation, is still in its early stages of maturation and is plagued by methodological and theoretical weaknesses. To address this failing, we conducted a narrative review and developed a conceptual model to aid future researchers in making informed design decisions when creating VR methods. Drawing on results from varied fields of research, this relationship map provides a set of expectations for how the design features of VR experiences impact psychological outcomes for the user for dyadic agent conversation scenarios. While exact guidelines cannot be given on the “optimal” levels of various design features at this stage, this model contributes to the field by providing initial expectations of the role these features play in experiential outcomes.
3.3 Interdisciplinary collaboration considerations
Given the multifaceted nature of social VR, interdisciplinary collaboration is a valuable avenue for the advancement of the field, both in terms of technical development and methodological guidance. This paper worked towards the aim of bringing together findings from disparate fields for a better shared understanding, with one of the next steps ideally being for researchers from those fields to collaborate. For example, strides towards photorealistic avatars/agents and, by connection, animation require joint efforts or at minimum an understanding of a variety of fields including 3D modelling, animation, psychology, and anatomy (e.g. Wheatland et al. 2015). On the methodological side of things, as conversational agents become more advanced and naturalistic, the introduction of techniques from domains such as conversation analysis or discourse analysis could be a way of gaining more insight into the conversation dynamics (for an overview see Rapley 2018). On the technology side of things, as social VR tools become more complicated and more commonly integrate multiple users, research on networking (Cheng et al. 2022a, b, c) and the creation of end-to-end systems (Friston et al. 2021) become more important.
Towards the aim of understanding and guiding best practices for interdisciplinary research, the European Commission-funded SHAPE-ID project carried out a review of interdisciplinary research and developed a toolkit to guide researchers on best practices (European Commission 2021). Their toolkit provides resources including case studies of successful collaborations, top tips, guides, and FAQs. We would recommend these resources as a starting point for any researchers interested in collaborating across disciplines on social VR research.
4 Image attributions
- “Anxiety” by dDara, https://thenounproject.com/icon/anxiety-1938926/
- “Avatar” by artworkbean, https://thenounproject.com/icon/avatar-1079584/
- “Building” by Kiran Shastry, https://thenounproject.com/icon/building-2333802/
- “Conversation” by Yannik Wölfel, https://thenounproject.com/icon/conversation-4558401/
- “Emotions” by Teewara soontorn, https://thenounproject.com/icon/emotions-4091815/
- “Immersion” by Gregor Cresnar, https://thenounproject.com/icon/immersion-5239843/
- “Movement” by Martin Königsmann, https://thenounproject.com/icon/movement-5418687/
- “Nausea” by Travis Avery, https://thenounproject.com/icon/nausea-4210646/
- “Proximity” by Brandon Shields, https://thenounproject.com/icon/proximity-1022086/
- “Visual” by Kukuh Wachyu Bias, https://thenounproject.com/icon/visual-4518805/
- “X mark icon” by freepik, https://www.freepik.com/icon/x-mark_1766
- “Question sign icon” by Dave Gandyy, https://www.freepik.com/icon/question-sign_25333
Data availability
We do not analyse or generate any datasets as this paper is focused on a theoretical approach. All relevant materials from the narrative review and relationship map can be found in the references below.
References
Abdulrahman A, Richards D (2022) Is natural necessary? Human voice versus synthetic voice for intelligent virtual agents. Multimodal Technol Interact 6(7):51. https://doi.org/10.3390/mti6070051
Aburumman N, Gillies M, Ward JA, de Hamilton AFC (2022) Nonverbal communication in virtual reality: nodding as a social signal in virtual interactions. Int J Hum Comput Stud 164:102819. https://doi.org/10.1016/j.ijhcs.2022.102819
Ahn D, Seo Y, Kim M, Kwon JH, Jung Y, Ahn J, Lee D (2014) The effects of actual human size display and stereoscopic presentation on users’ sense of being together with and of psychological immersion in a virtual character. Cyberpsychol Behav Soc Netw 17(7):483–487. https://doi.org/10.1089/cyber.2013.0455
Allmendinger K (2010) Social presence in synchronous virtual learning situations: the role of nonverbal signals displayed by avatars. Educ Psychol Rev 22(1):41–56. https://doi.org/10.1007/s10648-010-9117-8
Anderson PL, Price M, Edwards SM, Obasaju MA, Schmertz SK, Zimand E, Calamaras MR (2013) Virtual reality exposure therapy for social anxiety disorder: a randomized controlled trial. J Consult Clin Psychol 81(5):751–760. https://doi.org/10.1037/a0033559
Aneja D, Hoegen R, McDuff D, Czerwinski M (2021) Understanding conversational and expressive style in a multimodal embodied conversational agent. In: Proceedings of the 2021 CHI conference on human factors in computing systems. pp 1–10. https://doi.org/10.1145/3411764.3445708
Antonio Gómez Jáuregui D, Giraud T, Isableu B, Martin J-C (2021) Design and evaluation of postural interactions between users and a listening virtual agent during a simulated job interview. Comput Anim Virtual Worlds 32(6):e2029. https://doi.org/10.1002/cav.2029
Appel L, Appel E, Bogler O, Wiseman M, Cohen L, Ein N, Abrams HB, Campos JL (2020) Older adults with cognitive and/or physical impairments can benefit from immersive virtual reality experiences: a feasibility study. Front Med. https://doi.org/10.3389/fmed.2019.00329
Arai T (2001) The replication of Chiba and Kajiyama’s mechanical models of the human vocal cavity (<Feature Articles> Sixtieth anniversary of the publication of the vowel its nature and structure by Chiba and Kajiyama). J Phon Soc Jpn 5(2):31–38. https://doi.org/10.24467/onseikenkyu.5.2_31
Aseeri S, Interrante V (2021) The influence of avatar representation on interpersonal communication in virtual social environments. IEEE Trans Visual Comput Gr 27(5):2608–2617. https://doi.org/10.1109/TVCG.2021.3067783
Aung T, Puts D (2020) Voice pitch: a window into the communication of social power. Curr Opin Psychol 33:154–161. https://doi.org/10.1016/j.copsyc.2019.07.028
Aymerich-Franch L, Kizilcec RF, Bailenson JN (2014) The relationship between virtual self similarity and social anxiety. Front Hum Neurosci 8:944. https://doi.org/10.3389/fnhum.2014.00944
Bailenson J, Swinth K, Hoyt C, Persky S, Dimov A, Blascovich J (2005) The independent and interactive effects of embodied-agent appearance and behavior on self-report, cognitive, and behavioral markers of copresence in immersive virtual environments. Presence 14:379–393. https://doi.org/10.1162/105474605774785235
Baron-Cohen S, Wheelwright S (2001) The ‘reading the mind in the eyes’ test revised version: a study with normal adults, and adults with asperger syndrome or high-functioning autism. J Child Psychol Psychiatry 42(2):241. https://doi.org/10.1111/1469-7610.00715
Baumeister RF (2016) Charting the future of social psychology on stormy seas: winners, losers, and recommendations. J Exp Soc Psychol 66:153–158. https://doi.org/10.1016/j.jesp.2016.02.003
Bermejo B, Juiz C, Cortes D, Oskam J, Moilanen T, Loijas J, Govender P, Hussey J, Schmidt AL, Burbach R, King D (2023) AR/VR teaching-learning experiences in higher education institutions (HEI): a systematic literature review. Informatics 10(2):45. https://doi.org/10.3390/informatics10020045
Biocca F (1997) The cyborg’s dilemma: progressive embodiment in virtual environments. J Comput Mediat Commun 3(2):JCN324. https://doi.org/10.1111/j.1083-6101.1997.tb00070.x
Birckhead B, Khalil C, Liu X, Conovitz S, Rizzo A, Danovitch I, Bullock K, Spiegel B (2019) Recommendations for methodology of virtual reality clinical trials in health care by an international working group: iterative study. JMIR Mental Health 6(1):e11973. https://doi.org/10.2196/11973
Birk MV, Mandryk RL (2018) Combating attrition in digital self-improvement programs using avatar customization. In: Proceedings of the 2018 CHI conference on human factors in computing systems. pp 1–15. https://doi.org/10.1145/3173574.3174234
Birk MV, Mandryk RL (2019) Improving the efficacy of cognitive training for digital mental health interventions through avatar customization: crowdsourced quasi-experimental study. J Med Internet Res 21(1):e10133. https://doi.org/10.2196/10133
Blum J, Rockstroh C, Göritz AS (2019) Heart rate variability biofeedback based on slow-paced breathing with immersive virtual reality nature scenery. Front Psychol 10:2172. https://doi.org/10.3389/fpsyg.2019.02172
Bönsch A, Radke S, Ehret J, Habel U, Kuhlen TW (2020) The impact of a virtual agent’s non-verbal emotional expression on a user’s personal space preferences. In: Proceedings of the 20th ACM international conference on intelligent virtual agents. pp 1–8. https://doi.org/10.1145/3383652.3423888
Bönsch A, Radke S, Overath H, Asché LM, Wendt J, Vierjahn T, Habel U, Kuhlen TW (2018) Social VR: how personal space is affected by virtual agents’ emotions. In: 2018 IEEE conference on virtual reality and 3D user interfaces (VR). pp 199–206. https://doi.org/10.1109/VR.2018.8446480
Borrie SA, Barrett TS, Willi MM, Berisha V (2019) Syncing up for a good conversation: a clinically meaningful methodology for capturing conversational entrainment in the speech domain. J Speech Lang Hear Res 62(2):283–296. https://doi.org/10.1044/2018_JSLHR-S-18-0210
Boylan P, Kirwan GH, Rooney B (2018) Self-reported discomfort when using commercially targeted virtual reality equipment in discomfort distraction. Virtual Reality 22(4):309–314. https://doi.org/10.1007/s10055-017-0329-9
Bracken CC (2005) Presence and image quality: the case of high-definition television. Media Psychol 7(2):191–205. https://doi.org/10.1207/S1532785XMEP0702_4
Brugel S, Postma-Nilsenová M, Tates K (2015) The link between perception of clinical empathy and nonverbal behavior: the effect of a doctor’s gaze and body orientation. Patient Educ Couns 98(10):1260–1265. https://doi.org/10.1016/j.pec.2015.08.007
Buck LE, Chakraborty S, Bodenheimer B (2022) The impact of embodiment and avatar sizing on personal space in immersive virtual environments. IEEE Trans Visual Comput Graphics 28(5):2102–2113. https://doi.org/10.1109/TVCG.2022.3150483
Bulu ST (2012) Place presence, social presence, co-presence, and satisfaction in virtual worlds. Comput Educ 58(1):154–161. https://doi.org/10.1016/j.compedu.2011.08.024
Byrne DE (1971) The attraction paradigm. Academic Press
Cabral JP, Cowan BR, Zibrek K, McDonnell R (2017) The influence of synthetic voice on the evaluation of a virtual character. Interspeech 2017:229–233. https://doi.org/10.21437/Interspeech.2017-325
Cao J, He Q, Wang Z, Lc R, Tong X (2023) DreamVR: curating an interactive exhibition in social VR through an autobiographical design study. In: Proceedings of the 2023 CHI conference on human factors in computing systems. pp 1–18. https://doi.org/10.1145/3544548.3581362
Carnevale A, Mannocchi I, Sassi MSH, Carli M, De Luca G, Longo UG, Denaro V, Schena E (2022) Virtual reality for shoulder rehabilitation: accuracy evaluation of Oculus Quest 2. Sensors (Basel, Switzerland) 22(15):5511. https://doi.org/10.3390/s22155511
Caserman P, Garcia-Agundez A, Göbel S (2020) A survey of full-body motion reconstruction in immersive virtual reality applications. IEEE Trans Visual Comput Graphics 26(10):3089–3108. https://doi.org/10.1109/TVCG.2019.2912607
Cassell J, Bickmore T, Billinghurst M, Campbell L, Chang K, Vilhjálmsson H, Yan H (1999) Embodiment in conversational interfaces: Rea. In: Proceedings of the SIGCHI conference on human factors in computing systems. pp 520–527. https://doi.org/10.1145/302979.303150
Ceha J, Law E (2022) Expressive auditory gestures in a voice-based pedagogical agent. In: Proceedings of the 2022 CHI conference on human factors in computing systems. pp 1–13. https://doi.org/10.1145/3491102.3517599
Cha SH, Zhang S, Kim TW (2020) Effects of interior color schemes on emotion, task performance, and heart rate in immersive virtual environments. J Inter Des 45(4):51–65. https://doi.org/10.1111/joid.12179
Cheng R, Wu N, Chen S, Han B (2022a) Reality check of metaverse: a first look at commercial social virtual reality platforms. In: 2022 IEEE conference on virtual reality and 3D user interfaces abstracts and workshops (VRW). pp 141–148. https://doi.org/10.1109/VRW55335.2022.00040
Cheng R, Wu N, Chen S, Han B (2022b) Will metaverse be NextG internet? Vision, hype, and reality. IEEE Network 36(5):197–204. https://doi.org/10.1109/MNET.117.2200055
Cheng R, Wu N, Varvello M, Chen S, Han B (2022) Are we ready for metaverse? A measurement study of social virtual reality platforms. In: Proceedings of the 22nd ACM internet measurement conference. pp 504–518. https://doi.org/10.1145/3517745.3561417
Cho S, Kim S, Lee J, Ahn J, Han J (2020) Effects of volumetric capture avatars on social presence in immersive virtual environments. In: 2020 IEEE conference on virtual reality and 3D user interfaces (VR). pp 26–34. https://doi.org/10.1109/VR46266.2020.00020
Christie B (1974) Perceived usefulness of person-person telecommunications media as a function of the intended application. Eur J Soc Psychol 4(3):366–368. https://doi.org/10.1002/ejsp.2420040307
Christophers L, Lee CT, Rooney B (2023) Exploring subjective realism: Do evaluative realism and felt realism respond differently to different cues? Int J Hum Comput Stud 175:103027. https://doi.org/10.1016/j.ijhcs.2023.103027
Christophers L, Mulvaney P, Rooney B (in press) The felt realism of “unreal” environments: testing a dual awareness model of subjective realism. Int J Hum Comput Interact
Clark HH (1996) Using language. Cambridge University Press, Cambridge
Conrad FG, Schober MF, Jans M, Orlowski RA, Nielsen D, Levenstein R (2015) Comprehension and engagement in survey interviews with virtual agents. Front Psychol 6:1578. https://doi.org/10.3389/fpsyg.2015.01578
Cooney G, Mastroianni AM, Abi-Esber N, Brooks AW (2020) The many minds problem: disclosure in dyadic versus group conversation. Curr Opin Psychol 31:22–27. https://doi.org/10.1016/j.copsyc.2019.06.032
Cortese J, Seo M (2012) The role of social presence in opinion expression during FtF and CMC discussions. Commun Res Rep 29(1):44–53. https://doi.org/10.1080/08824096.2011.639913
Dael N, Mortillaro M, Scherer KR (2012) Emotion expression in body action and posture. Emotion 12(5):1085–1101. https://doi.org/10.1037/a0025737
Darville G, Anderson-Lewis C, Stellefson M, Lee Y-H, MacInnes J, Pigg RM, Gilbert JE, Thomas S (2018) Customization of avatars in a HPV digital gaming intervention for college-age males: an experimental study. Simul Gaming 49(5):515–537. https://doi.org/10.1177/1046878118799472
Davis JA, Nutter DW (2010) Occupancy diversity factors for common university building types. Energy Build 42(9):1543–1551. https://doi.org/10.1016/j.enbuild.2010.03.025
Deriu M, Bachis F, Massa M (2021) Improving the user engagement in a fully immersive experience by the means of a conversational non-playable character used as a tourist guide. IoT Vert Top Summit Tour 2021:1–4. https://doi.org/10.1109/IEEECONF49204.2021.9604871
DeVault D, Artstein R, Benn G, Dey T, Fast E, Gainer A, Georgila K, Gratch J, Hartholt A, Lhommet M, Lucas G, Marsella S, Morbini F, Nazarian A, Scherer S, Stratou G, Suri A, Traum D, Wood R, Morency L-P (2014) SimSensei kiosk: a virtual human interviewer for healthcare decision support. In: Proceedings of the 2014 international conference on autonomous agents and multi-agent systems. pp 1061–1068
Dibbets P, Schulte-Ostermann MA (2015) Virtual reality, real emotions: A novel analogue for the assessment of risk factors of post-traumatic stress disorder. Front Psychol. https://doi.org/10.3389/fpsyg.2015.00681
Dicke C, Aaltonen V, Rämö A, Vilermo M (2010) Talk to me: the influence of audio quality on the perception of social presence (p. 318). https://doi.org/10.14236/ewic/HCI2010.36
Dickinson P, Gerling K, Hicks K, Murray J, Shearer J, Greenwood J (2019) Virtual reality crowd simulation: effects of agent density on user experience and behaviour. Virtual Reality 23(1):19–32. https://doi.org/10.1007/s10055-018-0365-0
Diemer J, Alpers GW, Peperkorn HM, Shiban Y, Mühlberger A (2015) The impact of perception and presence on emotional reactions: a review of research in virtual reality. Front Psychol. https://doi.org/10.3389/fpsyg.2015.00026
Do TD, McMahan RP, Wisniewski PJ (2022) A new uncanny valley? The effects of speech fidelity and human listener gender on social perceptions of a virtual-human speaker. In: Proceedings of the 2022 CHI conference on human factors in computing systems. pp 1–11. https://doi.org/10.1145/3491102.3517564
Duverné T, Rougnant T, Le Yondre F, Berton F, Bruneau J, Zibrek K, Pettré J, Hoyet L, Olivier A-H (2020) Effect of social settings on proxemics during social interactions in real and virtual conditions. In: Bourdot P, Interrante V, Kopper R, Olivier A-H, Saito H, Zachmann G (eds) Virtual reality and augmented reality. Springer, Berlin
Ebrahimi E, Hartman LS, Robb A, Pagano CC, Babu SV (2018) Investigating the effects of anthropomorphic fidelity of self-avatars on near field depth perception in immersive virtual environments. In: 2018 IEEE conference on virtual reality and 3D user interfaces (VR). pp 1–8. https://doi.org/10.1109/VR.2018.8446539
Ehret J, Stienen J, Brozdowski C, Bönsch A, Mittelberg I, Vorländer M, Kuhlen TW (2020) Evaluating the influence of phoneme-dependent dynamic speaker directivity of embodied conversational agents’ speech. In: Proceedings of the 20th ACM international conference on intelligent virtual agents. pp 1–8. https://doi.org/10.1145/3383652.3423863
El-Yamri M, Romero-Hernandez A, Gonzalez-Riojo M, Manero B (2019) Emotions-responsive audiences for VR public speaking simulators based on the speakers’ voice. In: 2019 IEEE 19th international conference on advanced learning technologies (ICALT). pp 349–353. https://doi.org/10.1109/ICALT.2019.00108
European Commission (2021) SHAPE-ID: shaping interdisciplinary practices in Europe. Final toolkit for dissemination. Publications Office of the European Union. https://doi.org/10.3030/822705
Felnhofer A, Knaust T, Weiss L, Goinska K, Mayer A, Kothgassner OD (2023) A virtual character’s agency affects social responses in immersive virtual reality: a systematic review and meta-analysis. Int J Hum Comput Interact. https://doi.org/10.1080/10447318.2023.2209979
Ferstl Y, Neff M, McDonnell R (2021a) ExpressGesture: Expressive gesture generation from speech through database matching. Comput Animat Virtual Worlds 32(3–4):e2016. https://doi.org/10.1002/cav.2016
Ferstl Y, Thomas S, Guiard C, Ennis C, McDonnell R (2021b) Human or robot? Investigating voice, appearance and gesture motion realism of conversational social agents. In: Proceedings of the 21st ACM international conference on intelligent virtual agents. pp 76–83. https://doi.org/10.1145/3472306.3478338
Fich LB, Jönsson P, Kirkegaard PH, Wallergård M, Garde AH, Hansen Å (2014) Can architectural design alter the physiological reaction to psychosocial stress? A virtual TSST experiment. Physiol Behav 135:91–97. https://doi.org/10.1016/j.physbeh.2014.05.034
Fortin DR, Dholakia RR (2005) Interactivity and vividness effects on social presence and involvement with a web-based advertisement. J Bus Res 58(3):387–396. https://doi.org/10.1016/S0148-2963(03)00106-1
Foster ME (2019) Natural language generation for social robotics: opportunities and challenges. Philos Trans R Soc B Biol Sci 374(1771):20180027. https://doi.org/10.1098/rstb.2018.0027
Freeman G, Acena D (2021) Hugging from a distance: building interpersonal relationships in social virtual reality. In: ACM international conference on interactive media experiences. pp 84–95. https://doi.org/10.1145/3452918.3458805
Freeman G, Maloney D (2021) Body, avatar, and me: the presentation and perception of self in social virtual reality. Proc ACM Hum Comput Interact 4(CSCW3):1–27. https://doi.org/10.1145/3432938
Friehs MA, Dechant M, Schäfer S, Mandryk RL (2022) More than skin deep: About the influence of self-relevant avatars on inhibitory control. Cogn Res Princ Implic 7:31. https://doi.org/10.1186/s41235-022-00384-8
Friston SJ, Congdon BJ, Swapp D, Izzouzi L, Brandstätter K, Archer D, Olkkonen O, Thiel FJ, Steed A (2021) Ubiq: a system to build flexible social virtual reality experiences. In: Proceedings of the 27th ACM symposium on virtual reality software and technology. pp 1–11. https://doi.org/10.1145/3489849.3489871
Furness PJ, Phelan I, Babiker NT, Fehily O, Lindley SA, Thompson AR (2019) Reducing pain during wound dressings in burn care using virtual reality: a study of perceived impact and usability with patients and nurses. J Burn Care Res 40(6):878–885. https://doi.org/10.1093/jbcr/irz106
Garau M, Slater M, Pertaub D-P, Razzaque S (2005) The responses of people to virtual humans in an immersive virtual environment. Presence Teleop Virtual Environ 14(1):104–116. https://doi.org/10.1162/1054746053890242
Garau M, Slater M, Vinayagamoorthy V, Brogni A, Steed A, Sasse MA (2003) The impact of avatar realism and eye gaze control on perceived quality of communication in a shared immersive virtual environment. In: Proceedings of the SIGCHI conference on human factors in computing systems. pp 529–536. https://doi.org/10.1145/642611.642703
Gatica-Perez D (2009) Automatic nonverbal analysis of social interaction in small groups: a review. Image Vis Comput 27(12):1775–1787. https://doi.org/10.1016/j.imavis.2009.01.004
Gorisse G, Christmann O, Houzangbe S, Richir S (2019) From robot to virtual doppelganger: impact of visual fidelity of avatars controlled in third-person perspective on embodiment and behavior in immersive virtual environments. Front Robot AI 6:8. https://doi.org/10.3389/frobt.2019.00008
Gratch J, Okhmatovskaia A, Lamothe F, Marsella S, Morales M, van der Werf RJ, Morency L-P (2006) Virtual rapport. In: Gratch J, Young M, Aylett R, Ballin D, Olivier P (eds) Intelligent virtual agents. Springer, Berlin, pp 14–27
Guadagno RE, Blascovich J, Bailenson JN, McCall C (2007) Virtual humans and persuasion: the effects of agency and behavioral realism. Media Psychol 10(1):1–22. https://doi.org/10.1080/15213260701300865
Gupta K (2022) [DC] Exploration of context and physiological cues for personalized emotion-adaptive virtual reality. In: 2022 IEEE conference on virtual reality and 3D user interfaces abstracts and workshops (VRW). pp 960–961. https://doi.org/10.1109/VRW55335.2022.00331
Hall ET, Birdwhistell RL, Bock B, Bohannan P, Diebold AR, Durbin M, Edmonson MS, Fischer JL, Hymes D, Kimball ST, La Barre W, Frank Lynch SJ, McClellan JE, Marshall DS, Milner GB, Sarles HB, Trager GL, Vayda AP (1968) Proxemics [and comments and replies]. Curr Anthropol 9(2/3):83–108
Handley R, Guerra B, Goli R, Zytko D (2022) Designing social VR: a collection of design choices across commercial and research applications. arXiv:2201.02253. arXiv. https://doi.org/10.48550/arXiv.2201.02253
Harjunen VJ, Spapé M, Ahmed I, Jacucci G, Ravaja N (2018) Persuaded by the machine: The effect of virtual nonverbal cues and individual differences on compliance in economic bargaining. Comput Hum Behav 87:384–394. https://doi.org/10.1016/j.chb.2018.06.012
Hayes D (2023) The integrated gameplay entertainment model: developing, validating & using a comprehensive model of the video game entertainment experience. University College Dublin
Hecht H, Welsch R, Viehoff J, Longo MR (2019) The shape of personal space. Acta Psychol 193:113–122. https://doi.org/10.1016/j.actpsy.2018.12.009
Heidicker P, Langbehn E, Steinicke F (2017) Influence of avatar appearance on presence in social VR. In: 2017 IEEE symposium on 3D user interfaces (3DUI). pp 233–234. https://doi.org/10.1109/3DUI.2017.7893357
Heylen D (2006) Head gestures, gaze and the principles of conversational structure. Int J Hum Rob 03(03):241–267. https://doi.org/10.1142/S0219843606000746
Higgins D, Zibrek K, Cabral J, Egan D, McDonnell R (2022) Sympathy for the digital: Influence of synthetic voice on affinity, social presence and empathy for photorealistic virtual humans. Comput Gr 104:116–128. https://doi.org/10.1016/j.cag.2022.03.009
Hoppe M, Rossmy B, Neumann DP, Streuber S, Schmidt A, Machulla T-K (2020) A human touch: social touch increases the perceived human-likeness of agents in virtual reality. In: Proceedings of the 2020 CHI conference on human factors in computing systems. pp 1–11. https://doi.org/10.1145/3313831.3376719
Hortensius R, Hekele F, Cross ES (2018) The perception of emotion in artificial agents. IEEE Trans Cogn Dev Syst 10(4):852–864. https://doi.org/10.1109/TCDS.2018.2826921
Howard MC, Gutworth MB (2020) A meta-analysis of virtual reality training programs for social skill development. Comput Educ 144:103707. https://doi.org/10.1016/j.compedu.2019.103707
Huang A, Knierim P, Chiossi F, Chuang LL, Welsch R (2022) Proxemics for human-agent interaction in augmented reality. In: Proceedings of the 2022 CHI conference on human factors in computing systems. pp 1–13. https://doi.org/10.1145/3491102.3517593
Huang L, Morency L-P, Gratch J (2011) Virtual rapport 2.0. In: Vilhjálmsson HH, Kopp S, Marsella S, Thórisson KR (eds) Intelligent virtual agents. Springer, Berlin
James CA, Haustein K, Bednarz TP, Alem L, Caris C, Castleden A (2011) Remote operation of mining equipment using panoramic display systems: exploring the sense of presence. Ergon Open J 4(1). https://benthamopen.com/ABSTRACT/TOERGJ-4-93
Jicol C, Wan CH, Doling B, Illingworth CH, Yoon J, Headey C, Lutteroth C, Proulx MJ, Petrini K, O’Neill E (2021) Effects of emotion and agency on presence in virtual reality. In: Proceedings of the 2021 CHI conference on human factors in computing systems. pp 1–13. https://doi.org/10.1145/3411764.3445588
Jin WJ, Park SH (2023) Your voice pitch speaks volumes about you: How voice pitch affects mind perception of the speakers. Br J Soc Psychol 62(3):1230–1250. https://doi.org/10.1111/bjso.12630
Jinga N, Moldoveanu A, Moldoveanu F, Morar A, Mitrut O (2021) VR training systems for public speaking—a qualitative survey. Int Sci Conf eLearn Softw Educ 2:174–181. https://doi.org/10.12753/2066-026X-21-092
Jo H, Song C, Miyazaki Y (2019) Physiological benefits of viewing nature: a systematic review of indoor experiments. Int J Environ Res Public Health 16(23):4739. https://doi.org/10.3390/ijerph16234739
Jonas M, Said S, Yu D, Aiello C, Furlo N, Zytko D (2019) Towards a taxonomy of social VR application design. In: Extended abstracts of the annual symposium on computer-human interaction in play companion extended abstracts. pp 437–444. https://doi.org/10.1145/3341215.3356271
Kahl S, Kopp S (2018) A predictive processing model of perception and action for self-other distinction. Front Psychol. https://doi.org/10.3389/fpsyg.2018.02421
Kang S-H, Gratch J, Wang N, Watt JH (2008) Does the contingency of agents’ nonverbal feedback affect users’ social anxiety? In: Proceedings of the 7th international joint conference on autonomous agents and multiagent systems. 1:120–127
Kätsyri J, de Gelder B, de Borst AW (2018) Virtual reality and the new psychophysics. Br J Psychol 109(3):421–426. https://doi.org/10.1111/bjop.12308
Kendon A (1990) Conducting interaction: patterns of behavior in focused encounters. Cambridge University Press, Cambridge
Kern AC, Ellermeier W (2020) Audio in VR: effects of a soundscape and movement-triggered step sounds on presence. Front Robot AI 7:20. https://doi.org/10.3389/frobt.2020.00020
Kerous B, Barteček R, Roman R, Sojka P, Bečev O, Liarokapis F (2020) Social environment simulation in VR elicits a distinct reaction in subjects with different levels of anxiety and somatoform dissociation. Int J Hum Comput Interact 36(6):505–515. https://doi.org/10.1080/10447318.2019.1661608
Kim A, Darakjian N, Finley JM (2017) Walking in fully immersive virtual environments: an evaluation of potential adverse effects in older adults and individuals with Parkinson’s disease. J Neuroeng Rehabil 14:16. https://doi.org/10.1186/s12984-017-0225-2
Ko SJ, Judd CM, Blair IV (2006) What the voice reveals: within- and between-category stereotyping on the basis of voice. Pers Soc Psychol Bull 32(6):806–819. https://doi.org/10.1177/0146167206286627
Kolesnichenko A, McVeigh-Schultz J, Isbister K (2019) Understanding emerging design practices for avatar systems in the commercial social VR ecology. In: Proceedings of the 2019 on designing interactive systems conference. pp 241–252. https://doi.org/10.1145/3322276.3322352
Kolkmeier J, Vroon J, Heylen D (2016) Interacting with virtual agents in shared space: single and joint effects of gaze and proxemics. In: Traum D, Swartout W, Khooshabeh P, Kopp S, Scherer S, Leuski A (eds) Intelligent virtual agents. Springer, Berlin, pp 1–14
Krahé B, Uhlmann A, Herzberg M (2021) The voice gives it away. Soc Psychol 52(2):101–113. https://doi.org/10.1027/1864-9335/a000441
Kühne K, Fischer MH, Zhou Y (2020) The human takes it all: humanlike synthesized voices are perceived as less eerie and more likable. Evidence from a subjective ratings study. Front Neurorobot 14:593732. https://doi.org/10.3389/fnbot.2020.593732
Kwon JH, Powell J, Chalmers A (2013) How level of realism influences anxiety in virtual reality environments for a job interview. Int J Hum Comput Stud. https://doi.org/10.1016/j.ijhcs.2013.07.003
Lanier M, Waddell TF, Elson M, Tamul DJ, Ivory JD, Przybylski A (2019) Virtual reality check: Statistical power, reported results, and the validity of research on the psychology of virtual reality and immersive environments. Comput Hum Behav 100:70–78. https://doi.org/10.1016/j.chb.2019.06.015
Latoschik ME, Wienrich C (2022) Congruence and plausibility, not presence: pivotal conditions for XR experiences and effects, a novel approach. Front Virtual Reality. https://doi.org/10.3389/frvir.2022.694433
Lenormand D, Piolino P (2022) In search of a naturalistic neuroimaging approach: exploration of general feasibility through the case of VR-fMRI and application in the domain of episodic memory. Neurosci Biobehav Rev 133:104499. https://doi.org/10.1016/j.neubiorev.2021.12.022
Li C, Androulakaki T, Gao AY, Yang F, Saikia H, Peters C, Skantze G (2018) Effects of posture and embodiment on social distance in human-agent interaction in mixed reality. In: Proceedings of the 18th international conference on intelligent virtual agents. pp 191–196. https://doi.org/10.1145/3267851.3267870
Li J, Wu W, Jin Y, Zhao R, Bian W (2021) Research on environmental comfort and cognitive performance based on EEG+VR+LEC evaluation method in underground space. Build Environ 198:107886. https://doi.org/10.1016/j.buildenv.2021.107886
Lin Q, Rieser J, Bodenheimer B (2015) Affordance judgments in HMD-based virtual environments: stepping over a pole and stepping off a ledge. ACM Trans Appl Percept 12(2):1–21. https://doi.org/10.1145/2720020
Llobera J, Beacco A, Oliva R, Şenel G, Banakou D, Slater M (2021) Evaluating participant responses to a virtual reality experience using reinforcement learning. R Soc Open Sci 8(9):210537. https://doi.org/10.1098/rsos.210537
Lohse KR, Boyd LA, Hodges NJ (2016) Engaging environments enhance motor skill learning in a computer gaming task. J Mot Behav 48(2):172–182. https://doi.org/10.1080/00222895.2015.1068158
Luong T, Martin N, Raison A, Argelaguet F, Diverrez J-M, Lécuyer A (2020) Towards real-time recognition of users mental workload using integrated physiological sensors into a VR HMD. In: 2020 IEEE international symposium on mixed and augmented reality (ISMAR). pp 425–437. https://doi.org/10.1109/ISMAR50242.2020.00068
Makled E, Abdelrahman Y, Mokhtar N, Schwind V, Abdennadher S, Schmidt A (2018) I like to move it: investigating the effect of head and body movement of avatars in VR on user’s perception. In: Extended abstracts of the 2018 CHI conference on human factors in computing systems. pp 1–6. https://doi.org/10.1145/3170427.3188573
Maloney D, Freeman G, Wohn DY (2020) ‘Talking without a voice’: understanding non-verbal communication in social virtual reality. Proc ACM Hum Comput Interact 4(CSCW2):1–25. https://doi.org/10.1145/3415246
Mancini M, Castellano G, Peters C, Mcowan P (2011) Evaluating the communication of emotion via expressive gesture copying behaviour in an embodied humanoid agent (p. 224). https://doi.org/10.1007/978-3-642-24600-5_25
Marsella S, Xu Y, Lhommet M, Feng A, Scherer S, Shapiro A (2013) Virtual character performance from speech. In: Proceedings of the 12th ACM SIGGRAPH/Eurographics symposium on computer animation. pp 25–35. https://doi.org/10.1145/2485895.2485900
Mckie I, Narayan B, Kocaballi B (2022) Conversational voice assistants and a case study of long-term users: a human information behaviours perspective. J Austr Libr Inf Assoc 71(3):233–255. https://doi.org/10.1080/24750158.2022.2104738
McNeill D (1992) Hand and mind: what gestures reveal about thought. University of Chicago Press, Chicago
McVeigh-Schultz J, Márquez Segura E, Merrill N, Isbister K (2018) What’s it mean to ‘be social’ in VR? Mapping the social VR design ecology. In: Proceedings of the 2018 ACM conference companion publication on designing interactive systems. pp 289–294. https://doi.org/10.1145/3197391.3205451
Mehrabian A (2008) Communication without words. In: Communication theory, 2nd edn. Routledge
Montoya RM, Horton RS, Kirchner J (2008) Is actual similarity necessary for attraction? A meta-analysis of actual and perceived similarity. J Soc Pers Relat 25(6):889–922. https://doi.org/10.1177/0265407508096700
Moore RK, Marxer R, Thill S (2016) Vocal interactivity in-and-between humans, animals and robots. Front Robot AI 3:61. https://doi.org/10.3389/frobt.2016.00061
Mori M, MacDorman KF, Kageki N (2012) The uncanny valley [From the Field]. IEEE Robot Autom Mag 19(2):98–100. https://doi.org/10.1109/MRA.2012.2192811
Nagels A, Kircher T, Steines M, Straube B (2015) Feeling addressed! The role of body orientation and co-speech gesture in social communication. Hum Brain Mapp 36(5):1925–1936. https://doi.org/10.1002/hbm.22746
Newman M, Gatersleben B, Wyles KJ, Ratcliffe E (2022) The use of virtual reality in environment experiences and the importance of realism. J Environ Psychol 79:101733. https://doi.org/10.1016/j.jenvp.2021.101733
Noufi C, Markovic D, Dodds P (2023) Reconstructing the dynamic directivity of unconstrained speech. In: 2023 immersive and 3D audio: from architecture to automotive (I3DA). pp 1–13. https://doi.org/10.1109/I3DA57090.2023.10289447
Novick D, Hinojos LJ, Rodriguez AE, Camacho A, Afravi M (2018) Conversational interaction with multiple agents initiated via proxemics and gaze. In: Proceedings of the 6th international conference on human-agent interaction. pp 356–358. https://doi.org/10.1145/3284432.3287185
Nowak KL, Biocca F (2003) The effect of the agency and anthropomorphism on users’ sense of telepresence, copresence, and social presence in virtual environments. Presence Teleop Virtual Environ 12(5):481–494. https://doi.org/10.1162/105474603322761289
Oh CS, Bailenson JN, Welch GF (2018) A systematic review of social presence: definition, antecedents, and implications. Front Robot AI. https://doi.org/10.3389/frobt.2018.00114
Ota S, Taki S, Jindai M, Yasuda T (2021) Nodding detection system based on head motion and voice rhythm. J Adv Mech Design Syst Manuf 15(1):JAMDSM0005. https://doi.org/10.1299/jamdsm.2021jamdsm0005
Pan X, Hamilton AFdC (2018) Why and how to use virtual reality to study human social interaction: the challenges of exploring a new research landscape. Br J Psychol. https://doi.org/10.1111/bjop.12290
Park MJ, Kim DJ, Lee U, Na EJ, Jeon HJ (2019) A literature overview of virtual reality (VR) in treatment of psychiatric disorders: recent advances and limitations. Front Psych 10:505. https://doi.org/10.3389/fpsyt.2019.00505
Parmar D, Lin L, Dsouza N, Joerg S, Leonard AE, Daily SB, Babu S (2022) How immersion and self-avatars in VR affect learning programming and computational thinking in middle school education. IEEE Trans Visual Comput Graphics. https://doi.org/10.1109/TVCG.2022.3169426
Parsons TD (2015) Virtual reality for enhanced ecological validity and experimental control in the clinical, affective and social neurosciences. Front Hum Neurosci 9:660. https://doi.org/10.3389/fnhum.2015.00660
Pejsa T, Gleicher M, Mutlu B (2017) Who, me? How virtual agents can shape conversational footing in virtual reality. In: Beskow J, Peters C, Castellano G, O’Sullivan C, Leite I, Kopp S (eds) Intelligent virtual agents. Springer, Berlin, pp 347–359
Peña J, Craig M, Baumhardt H (2022) The effects of avatar customization and virtual human mind perception: a test using Milgram’s paradigm. New Media Soc. https://doi.org/10.1177/14614448221127258
Peña J, Khan S, Alexopoulos C (2016) I am what I see: how avatar and opponent agent body size affects physical activity among men playing exergames. J Comput-Mediat Commun 21(3):195–209. https://doi.org/10.1111/jcc4.12151
Phelan I, Furness PJ, Fehily O, Thompson AR, Babiker NT, Lamb MA, Lindley SA (2019) A mixed-methods investigation into the acceptability, usability, and perceived effectiveness of active and passive virtual reality scenarios in managing pain under experimental conditions. J Burn Care Res 40(1):85–90. https://doi.org/10.1093/jbcr/iry052
Phillips L, Ries B, Kaeding M, Interrante V (2010) Avatar self-embodiment enhances distance perception accuracy in non-photorealistic immersive virtual environments. IEEE Virtual Reality Conf 2010:115–118. https://doi.org/10.1109/VR.2010.5444802
Piccione J, Collett J, Foe AD (2019) Virtual skills training: the role of presence and agency. Heliyon 5(11):e02583
Pisanski K, Rendall D (2011) The prioritization of voice fundamental frequency or formants in listeners’ assessments of speaker size, masculinity, and attractiveness. J Acoust Soc Am 129(4):2201–2212. https://doi.org/10.1121/1.3552866
Praetorius AS, Görlich D (2020) How avatars influence user behavior: a review on the proteus effect in virtual environments and video games. In: International conference on the foundations of digital games. pp 1–9. https://doi.org/10.1145/3402942.3403019
Price M, Mehta N, Tone EB, Anderson PL (2011) Does engagement with exposure yield better outcomes? Components of presence as a predictor of treatment response for virtual reality exposure therapy for social phobia. J Anxiety Disord 25(6):763–770. https://doi.org/10.1016/j.janxdis.2011.03.004
Randhavane T, Bera A, Kapsaskis K, Gray K, Manocha D (2019a) FVA: modeling perceived friendliness of virtual agents using movement characteristics. IEEE Trans Visual Comput Graphics 25(11):3135–3145. https://doi.org/10.1109/TVCG.2019.2932235
Randhavane T, Bera A, Kapsaskis K, Sheth R, Gray K, Manocha D (2019b) EVA: generating emotional behavior of virtual agents using expressive features of gait and gaze. ACM Symp Appl Percept 2019:1–10. https://doi.org/10.1145/3343036.3343129
Rapley T (2018) Doing conversation, discourse and document analysis. SAGE Publications Ltd., Thousand Oaks. https://doi.org/10.4135/9781526441843
Ratan R, Beyea D, Li BJ, Graciano L (2020) Avatar characteristics induce users’ behavioral conformity with small-to-medium effect sizes: A meta-analysis of the proteus effect. Media Psychol 23(5):651–675. https://doi.org/10.1080/15213269.2019.1623698
Ratan R, Klein MS, Ucha CR, Cherchiglia LL (2022) Avatar customization orientation and undergraduate-course outcomes: actual-self avatars are better than ideal-self and future-self avatars. Comput Educ 191:104643. https://doi.org/10.1016/j.compedu.2022.104643
Rickel J, Johnson WL (1999) Animated agents for procedural training in virtual reality: perception, cognition, and motor control. Appl Artif Intell 13(4–5):343–382. https://doi.org/10.1080/088395199117315
Ries B, Interrante V, Kaeding M, Anderson L (2008) The effect of self-embodiment on distance perception in immersive virtual environments. In: Proceedings of the 2008 ACM symposium on virtual reality software and technology. pp 167–170. https://doi.org/10.1145/1450579.1450614
Riva G, Wiederhold BK, Mantovani F (2019) Neuroscience of virtual reality: from virtual exposure to embodied medicine. Cyberpsychol Behav Soc Netw 22(1):82–96. https://doi.org/10.1089/cyber.2017.29099.gri
Rizzo AS, Koenig ST (2017) Is clinical virtual reality ready for primetime? Neuropsychology 31(8):877–899. https://doi.org/10.1037/neu0000405
Roberts G, Holmes N, Alexander N, Boto E, Leggett J, Hill RM, Shah V, Rea M, Vaughan R, Maguire EA, Kessler K, Beebe S, Fromhold M, Barnes GR, Bowtell R, Brookes MJ (2019) Towards OPM-MEG in a virtual reality environment. Neuroimage 199:408–417. https://doi.org/10.1016/j.neuroimage.2019.06.010
Sah YJ, Ratan R, Tsai H-YS, Peng W, Sarinopoulos I (2017) Are you what your avatar eats? Health-behavior effects of avatar-manifested self-concept. Media Psychol 20(4):632–657. https://doi.org/10.1080/15213269.2016.1234397
Saredakis D, Szpak A, Birckhead B, Keage HAD, Rizzo A, Loetscher T (2020) Factors associated with virtual reality sickness in head-mounted displays: a systematic review and meta-analysis. Front Hum Neurosci 14:96. https://doi.org/10.3389/fnhum.2020.00096
Schoenenberg K, Raake A, Koeppe J (2014) Why are you so slow?—Misattribution of transmission delay to attributes of the conversation partner at the far-end. Int J Hum Comput Stud 72(5):477–487. https://doi.org/10.1016/j.ijhcs.2014.02.004
Sekhavat Y, Nomani P (2017) A comparison of active and passive virtual reality exposure scenarios to elicit social anxiety. Int J Serious Games. https://doi.org/10.17083/ijsg.v4i2.154
Shih MT, Lee Y-C, Huang C-M, Chan L (2023) “Do you get déjà vu”: persuasiveness effects of communicating with an avatar of similar appearance in social virtual reality. In: Extended abstracts of the 2023 CHI conference on human factors in computing systems. pp 1–8. https://doi.org/10.1145/3544549.3585839
Sicorello M, Stevanov J, Ashida H, Hecht H (2019) Effect of gaze on personal space: a Japanese-German cross-cultural study. J Cross Cult Psychol 50(1):8–21. https://doi.org/10.1177/0022022118798513
Sipatchin A, Wahl S, Rifai K (2021) Eye-tracking for clinical ophthalmology with virtual reality (VR): a case study of the HTC Vive Pro Eye’s usability. Healthcare. https://doi.org/10.3390/healthcare9020180
Skalski P, Tamborini R (2007) The role of social presence in interactive agent-based persuasion. Media Psychol 10(3):385–413. https://doi.org/10.1080/15213260701533102
Skalski P, Whitbred R (2010) Image versus sound: a comparison of formal feature effects on presence and video game enjoyment. PsychNol J 8:67–84
Skarbez R, Brooks FP Jr, Whitton MC (2018) A survey of presence and related concepts. ACM Comput Surv 50(6):1–39. https://doi.org/10.1145/3134301
Slater M, Banakou D, Beacco A, Gallego J, Macia-Varela F, Oliva R (2022) A separate reality: an update on place illusion and plausibility in virtual reality. Front Virtual Reality. https://doi.org/10.3389/frvir.2022.914392
Slater M, Gonzalez-Liencres C, Haggard P, Vinkers C, Gregory-Clarke R, Jelley S, Watson Z, Breen G, Schwarz R, Steptoe W, Szostak D, Halan S, Fox D, Silver J (2020) The ethics of realism in virtual and augmented reality. Front Virtual Reality. https://doi.org/10.3389/frvir.2020.00001
Somarathna R, Bednarz T, Mohammadi G (2022) Virtual reality for emotion elicitation—a review. IEEE Trans Affect Comput. https://doi.org/10.1109/TAFFC.2022.3181053
Stein J-P, Ohler P (2017) Venturing into the uncanny valley of mind—the influence of mind attribution on the acceptance of human-like characters in a virtual reality setting. Cognition 160:43–50. https://doi.org/10.1016/j.cognition.2016.12.010
Sugimoto T, Kinoshita K (2023) Angular resolution of radiation characteristics required to reproduce uttered speech in all three-dimensional directions. Acoust Sci Technol 44(5):360–370. https://doi.org/10.1250/ast.44.360
Tan X (2023) Neural text-to-speech synthesis. Springer Nature, Berlin
Tehrani BM, Wang J, Truax D (2021) Assessment of mental fatigue using electroencephalography (EEG) and virtual reality (VR) for construction fall hazard prevention. Eng Constr Archit Manag 29(9):3593–3616. https://doi.org/10.1108/ECAM-01-2021-0017
Thie S, van Wijk J (1998) A general theory on presence. 1st Int. Wkshp. on Presence. http://www0.cs.ucl.ac.uk/staff/m.slater/BTWorkshop/KPN
Tian N, Lopes P, Boulic R (2022) A review of cybersickness in head-mounted displays: raising attention to individual susceptibility. Virtual Reality 26(4):1409–1441. https://doi.org/10.1007/s10055-022-00638-2
Uribe-Quevedo A, Kapralos B, Gualdron DR, Dubrowski A, Perera S, Alam F, Xu S (2021) Physical and physiological data for customizing immersive VR training. In: 2021 IEEE/ACIS 20th international fall conference on computer and information science (ICIS Fall). pp 156–160. https://doi.org/10.1109/ICISFall51598.2021.9627412
Vahle N, Tomasik MJ (2022) The embodiment of an older avatar in a virtual reality setting impacts the social motivation of young adults. Exp Aging Res 48(2):164–176. https://doi.org/10.1080/0361073X.2021.1943793
van den Bosch M, Ode Sang Å (2017) Urban natural environments as nature-based solutions for improved public health—a systematic review of reviews. Environ Res 158:373–384. https://doi.org/10.1016/j.envres.2017.05.040
Vasser M, Aru J (2020) Guidelines for immersive virtual reality in psychological research. Curr Opin Psychol 36:71–76. https://doi.org/10.1016/j.copsyc.2020.04.010
Vienne C, Masfrand S, Bourdin C, Vercher J-L (2020) Depth perception in virtual reality systems: effect of screen distance, environment richness and display factors. IEEE Access 8:29099–29110. https://doi.org/10.1109/ACCESS.2020.2972122
Wagnerberger L, Runde D, Lafci MT, Przewozny D, Bosse S, Chojecki P (2021) Inverse kinematics for full-body self representation in VR-based cognitive rehabilitation. IEEE Int Symp Multimed 2021:123–129. https://doi.org/10.1109/ISM52913.2021.00029
Waltemate T, Gall D, Roth D, Botsch M, Latoschik ME (2018) The impact of avatar personalization and immersion on virtual body ownership, presence, and emotional response. IEEE Trans Vis Comput Graph 24(4):1643–1652. https://doi.org/10.1109/TVCG.2018.2794629
Wang I, Ruiz J (2021) Examining the use of nonverbal communication in virtual agents. Int J Hum Comput Interact 37(17):1648–1673. https://doi.org/10.1080/10447318.2021.1898851
Wang M (2020) Social VR: a new form of social communication in the future or a beautiful illusion? J Phys: Conf Ser 1518(1):012032. https://doi.org/10.1088/1742-6596/1518/1/012032
Wang X, Lu S, Li XI, Khamitov M, Bendle N (2021) Audio mining: the role of vocal tone in persuasion. J Consum Res 48(2):189–211. https://doi.org/10.1093/jcr/ucab012
Weidner F, Boettcher G, Arboleda SA, Diao C, Sinani L, Kunert C, Gerhardt C, Broll W, Raake A (2023) A systematic review on the visualization of avatars and agents in AR & VR displayed using head-mounted displays. IEEE Trans Vis Comput Graph 29(5):2596–2606. https://doi.org/10.1109/TVCG.2023.3247072
Welsch R, von Castell C, Hecht H (2019) The anisotropy of personal space. PLoS ONE 14(6):e0217587. https://doi.org/10.1371/journal.pone.0217587
Wheatland N, Wang Y, Song H, Neff M, Zordan V, Jörg S (2015) State of the art in hand and finger modeling and animation. Comput Gr Forum 34(2):735–760. https://doi.org/10.1111/cgf.12595
Whetten DA (1989) What constitutes a theoretical contribution? Acad Manag Rev 14(4):490–495. https://doi.org/10.2307/258554
Wiebe EN, Lamb A, Hardy M, Sharek D (2014) Measuring engagement in video game-based environments: investigation of the user engagement scale. Comput Hum Behav 32:123–132. https://doi.org/10.1016/j.chb.2013.12.001
Yang Y, Yang J, Hodgins J (2020) Statistics-based motion synthesis for social conversations. Comput Gr Forum 39(8):201–212. https://doi.org/10.1111/cgf.14114
Yoon B, Kim H, Lee GA, Billinghurst M, Woo W (2019) The effect of avatar appearance on social presence in an augmented reality remote collaboration. In: 2019 IEEE conference on virtual reality and 3D user interfaces (VR). pp 547–556. https://doi.org/10.1109/VR.2019.8797719
Yoon SO, Brown-Schmidt S (2019) Audience design in multiparty conversation. Cogn Sci 43(8):e12774. https://doi.org/10.1111/cogs.12774
Young MK, Rieser JJ, Bodenheimer B (2015) Dyadic interactions with avatars in immersive virtual environments: high fiving. In: Proceedings of the ACM SIGGRAPH symposium on applied perception. pp 119–126. https://doi.org/10.1145/2804408.2804410
Zhang C, Zigurs I (2009) An exploratory study of the impact of a virtual world learning environment on student interaction and learning satisfaction. AMCIS 2009 Proceedings. https://aisel.aisnet.org/amcis2009/424
Zhao G, Orlosky J, Uranishi Y (2021) Evaluating presence in VR with self-representing auditory-vibrotactile input. In: 2021 IEEE conference on virtual reality and 3D user interfaces abstracts and workshops (VRW). pp 577–578. https://doi.org/10.1109/VRW52623.2021.00171
Zhao R, Sinha T, Black A, Cassell J (2016) Automatic recognition of conversational strategies in the service of a socially-aware dialog system. In: Proceedings of the 17th annual meeting of the special interest group on discourse and dialogue. pp 381–392. https://doi.org/10.18653/v1/W16-3647
Zibrek K, Kokkinara E, McDonnell R (2018) The effect of realistic appearance of virtual characters in immersive environments—does the character’s personality play a role? IEEE Trans Vis Comput Graph 24(4):1681–1690. https://doi.org/10.1109/TVCG.2018.2794638
Zibrek K, McDonnell R (2019) Social presence and place illusion are affected by photorealism in embodied VR. Motion Interact Games. https://doi.org/10.1145/3359566.3360064
Zimmer P, Buttlar B, Halbeisen G, Walther E, Domes G (2019) Virtually stressed? A refined virtual reality adaptation of the Trier Social Stress Test (TSST) induces robust endocrine responses. Psychoneuroendocrinology 101:186–192. https://doi.org/10.1016/j.psyneuen.2018.11.010
Funding
Open Access funding provided by the IReL Consortium.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Mulvaney, P., Rooney, B., Friehs, M.A. et al. Social VR design features and experiential outcomes: narrative review and relationship map for dyadic agent conversations. Virtual Reality 28, 45 (2024). https://doi.org/10.1007/s10055-024-00941-0
DOI: https://doi.org/10.1007/s10055-024-00941-0