1 Introduction

As virtual reality (VR) continues to hold great promise, there is an increasing demand for a systematic and interdisciplinary understanding of quality of experience (QoE) in VR. Consequently, both research and industry communities are actively formulating diverse strategies and methodologies to gain insights into how users perceive and encounter this new immersive medium, ultimately striving to enhance and optimize the overall QoE in VR (International Telecommunication Union - Telecommunication Standardization Sector 2020, 2021; Perkis et al. 2020). Nevertheless, because of the large number of factors that might influence QoE in VR, researchers usually take a one-sided perspective. To investigate this point, the current study provides an overview of literature from multiple academic and industry-related sources and critically reviews the state of QoE findings by pointing out key challenges and possible future avenues. Importantly, this work does not aim to present a new theoretical QoE model. Instead, we aim to investigate how frequently QoE-related concepts are studied and to highlight any discrepancies in their study frequency. To do so, we first present a working definition of quality.

1.1 What is quality?

Quality is a concept that, while seemingly straightforward to grasp, proves challenging to outline. In the field of multimedia, quality is generally associated with engineering goals and is a primary measure for assessing systems, applications, or services during development and operation (Möller and Raake 2014a). In this light, quality is a comprehensive concept influenced by various factors. Consider elements like image size, packet loss, frame rate, and audio-visual synchronization, which play a crucial role in assessing visual and audiovisual quality (Sector 2018).

However, over time, the notions related to quality have been refined, yielding several definitions varying according to their contextual application (Belmudez 2015). The following section presents various interpretations of quality depending on several contexts:

The International Organization for Standardization (ISO) context. ISO provides universal guidelines that ensure manufacturers consistently achieve universally recognized standards for the development of products. In the product-based ISO-9004 context, quality is defined as “fitness for use, performance, safety, and dependability,” sometimes known as the quality of design (International Organization for Standardization 2018). Notably, this definition disregards individual user characteristics and centres its attention on distinct attributes for quality determination. However, in the context of ISO 9000-2015, quality of products and services “includes not only their intended function and performance, but also their perceived value and benefit to the customer” (International Organization for Standardization 2015). This definition thus already recognises that quality is a multifaceted concept that extends beyond the technical characteristics of a product.

Manufacturing context. Quality, according to this perspective, is characterized as conformity to predetermined criteria, usually articulated in numerical terms. Dale et al. (1997) define it as “Strict and consistent adherence to measurable and verifiable standards to achieve uniformity of output that satisfies specific customer or user requirements".

Value-Based context. This context seeks to balance quality and cost based on customer willingness to pay. For example, according to Akinyele (2008) quality is considered “the degree of excellence at an acceptable cost" under this perspective.

Objective/Quantitative context. In this view, quality is defined as “measurable and verifiable aspects of a thing or phenomenon, expressed in numbers or quantities," such as weight, thickness, hardness, or softness (Ebrahimi 2009).

Subjective/User-Based context. Quality is specified as “fitness for purpose" and relies on individual user perspectives. It considers features like ergonomics, efficiency, and dependability, assessed from the user’s viewpoint (Ebrahimi 2009).

QUALINET context. QUALINET is the European Network on Quality of Experience in Multimedia Systems and Services which works in association with ISO. QUALINET defines quality as the product of an individual’s cognitive processes, which entail comparison and judgment. These processes encompass perception, conceptualization, and outcome narration. They refer to the assessment of attributes such as goodness and the fulfilment of requirements. Additionally, quality is contingent upon the identification of a “quality event", distinguished by its specific location, character, and the time of its occurrence (Callet et al. 2013). Thus, this perspective presents a more holistic view of QoE.

In summary, quality is the outcome of human perception, influenced by various factors. Some factors are based on measurable attributes (i.e., Quality of Service (QoS) factors), while others depend on cognitive processes that are challenging to quantify (i.e., QoE factors). The next two subsections elaborate on what QoS and QoE encompass.

1.1.1 Quality of service (QoS)

In the realm of telecommunications, computer networking, and related technological domains, the term QoS is employed to characterize the degree of performance, reliability, and efficiency exhibited by a service or system (Möller and Raake 2014a). It encompasses a multitude of attributes and metrics that facilitate the evaluation and quantification of the overall quality of a service such as VR (Möller and Raake 2014a). The following are the fundamental facets of QoS as outlined by Möller and Raake (2014a):

Reliability. This dimension refers to the capacity of a service or system to consistently deliver its intended functionality without unforeseen failures or disruptions.

Bandwidth. In the context of networking, bandwidth signifies the data transfer rate or the capacity available for data transmission. A higher bandwidth often translates to enhanced service quality by enabling swifter data transfer.

Latency. Also referred to as delay, latency signifies the time taken by data packets to traverse from the source to the destination. Lower latency is generally preferred, especially in applications demanding real-time responsiveness such as video conferencing or VR applications.

Jitter. Jitter denotes the fluctuation in latency over time. It can impact the smoothness of data transmission, particularly in applications where precise timing is critical, such as VR training applications.

Packet Loss. Packet loss arises when data packets fail to reach their intended destination. Lower packet loss rates are typically favoured to uphold data integrity.

Priority and Differentiated Services. Certain services or data types may necessitate elevated priority or distinct treatment to guarantee a specific quality level. This is particularly pertinent in networks accommodating a variety of traffic types.

Scalability. Scalability denotes a system’s capacity to manage increased traffic or workload without substantial deterioration in service quality.

Security. Ensuring data confidentiality and integrity during transmission holds paramount significance in preserving QoS, especially within sensitive applications. This can be considered especially important in the context of VR with the rollout of devices that have external cameras or physiological signals that measure the user’s state (e.g., the Meta Quest 3 and the HP Reverb G2 Omnicept VR headsets).

Resource Reservation. In certain scenarios, resources may need to be reserved in advance to ensure a predetermined QoS level. This is exemplified in QoS models for VR that need to reserve and allocate resources to multiple systems responsible for tracking accuracy, processing delay, and transmission delay (Chen et al. 2018).

Notably, the above-listed QoS factors are reflective of scenarios where multiple users interact within a virtual environment and/or the virtual environment is dependent on network infrastructure, hence their applicability may diminish in a local, non-networked setting.
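To make the network-related facets above more concrete, the following minimal Python sketch derives mean latency, jitter, and packet loss from a short packet trace. The `Packet` record, the `qos_summary` function, and the trace values are hypothetical and serve only as illustration. Jitter is computed here as the mean absolute change in latency between consecutively delivered packets, which is one of several definitions in use and not necessarily the one adopted in the cited works.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Sequence


@dataclass(frozen=True)
class Packet:
    """One sent packet; recv_ms is None when the packet never arrived."""
    sent_ms: float
    recv_ms: Optional[float]


def qos_summary(packets: Sequence[Packet]) -> Dict[str, float]:
    """Summarise latency, jitter and packet loss for a packet trace."""
    if not packets:
        raise ValueError("the trace must contain at least one packet")

    # Latency: one-way delay of every packet that was actually delivered.
    latencies = [p.recv_ms - p.sent_ms for p in packets if p.recv_ms is not None]

    # Packet loss: share of sent packets that never reached the destination.
    loss_rate = 1 - len(latencies) / len(packets)

    mean_latency = sum(latencies) / len(latencies) if latencies else float("nan")

    # Jitter (assumed definition): mean absolute change in latency
    # between consecutive delivered packets.
    changes = [abs(b - a) for a, b in zip(latencies, latencies[1:])]
    jitter = sum(changes) / len(changes) if changes else 0.0

    return {
        "mean_latency_ms": mean_latency,
        "jitter_ms": jitter,
        "loss_rate": loss_rate,
    }


# Illustrative trace: four packets sent, the third one is lost.
trace = [Packet(0, 20), Packet(10, 34), Packet(20, None), Packet(30, 48)]
summary = qos_summary(trace)
print(summary)
```

In this illustrative trace, the delivered packets have latencies of 20, 24 and 18 ms, so the sketch reports a packet loss rate of 0.25 and a jitter of 5 ms.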

In summary, QoS encompasses an array of factors that collectively define the excellence and dependability of a service or system, spanning from fundamental attributes like reliability and bandwidth to advanced considerations such as resource allocation and security, all vital components in the assessment of service quality (Möller and Raake 2014a).

While QoS primarily emphasizes end-to-end system performance and may not definitively capture the cause-and-effect relationship concerning user satisfaction, QoE takes into consideration user perception and contentment which might be influenced by factors beyond the performance of service components (Callet et al. 2013).

1.1.2 Quality of experience (QoE)

To understand QoE, we first refer to the 2012 QUALINET White Paper on Definitions of Quality of Experience and Related Concepts, according to which an experience is the perception and interpretation of an event or multiple events, perception involves recognizing and interpreting incoming stimuli, and an event is an occurrence in space, time and character that can be observed (Perkis et al. 2020). Finally, quality includes the perception of an event, the reflection on one’s perception and the description of the outcome (Callet et al. 2013).

Another view of QoE is presented by the International Telecommunication Union (ITU-T), which states that QoE is: “The overall acceptability of an application or service, as perceived subjectively by the end-user” (International Telecommunication Union - Telecommunication Standardization Sector 2007).

It is important to note that the initial recommendations of ITU-T have outlined standard measuring tools and procedures for investigating QoE that did not include factors integral to VR. For example, the sense of presence in immersive environments is vital for the users’ QoE as it increases the credibility of the experience. Also, the increased sense of spatial immersion in VR can increase the discrepancy between visual motion cues and the lack of corresponding vestibular feedback, potentially leading to a stronger manifestation of cybersickness symptoms. While similar effects can be observed in other media, they are typically less intense due to the reduced level of sensory immersion. Thus, it is important to consider cybersickness in the context of VR because it might have a profound effect on the user experience in VR by reducing one’s sense of presence and overall enjoyment (Cebeci et al. 2019; Weech et al. 2020).

To bridge the concepts of QoE and VR, the 2020 QUALINET White Paper paid attention to the individual perceptual differences between users that might potentially alter QoE (Perkis et al. 2020). Even though the paper presented a more holistic overview of the potential branches of factors that alter the QoE in VR, a more detailed outline of the relationships between branches was needed. For instance, Hvass et al. (2018) illustrated that a higher degree of virtual realism leads to a stronger feeling of presence. Also, Weech et al. (2020) demonstrated a negative relationship between presence and simulator sickness. A common issue among these studies, however, was that they adopted a top-down approach: participants reflected on their experience in VR by scoring adjectives chosen by the researcher and thus were not able to propose other dimensions that might have described their experience better.

Addressing this issue, Letter et al. (2021) used a qualitative bottom-up approach to create a user-driven three-factor QoE model that consisted of user, content, and system-related factors. The user branch encompassed factors related to the participant’s experience in VR (e.g., presence, autonomy). The content factors consisted of attributes related to the design of the VE and the facets of the narrative itself (e.g., realism). Finally, system factors reflected qualities related to the system (e.g., video quality) (Letter et al. 2021).

Meanwhile, ITU also released several new regulations and recommendations for measuring QoE, specific to VR. For example, ITU-T P.919, Subjective test methodologies for 360° video on head-mounted displays (International Telecommunication Union - Telecommunication Standardization Sector 2020), explicitly mentioned simulator sickness (i.e., cybersickness) as an important QoE factor. Also, ITU-T G.1035, Influencing Factors on Quality of Experience for Virtual Reality Services (International Telecommunication Union - Telecommunication Standardization Sector 2021), paid great attention to delineating human, system and context factors that influence the QoE in VR. These recommendations brought a better, more holistic understanding of the multifaceted nature of QoE in VR by mentioning cybersickness and immersion as important user-related factors while also outlining more traditional QoS-related factors such as compression, coding delay and audio sample rate (to name a few). What is more, the ITU-T G.1035 recommendation incorporated factors such as the physical or social context in which the user is using VR (amongst other context variables) in its overall QoE assessment, bringing about an even more detailed and accurate picture of the complexity of QoE in VR (International Telecommunication Union - Telecommunication Standardization Sector 2021).

In summary, QoE includes but also transcends the conventional scope of end-to-end QoS parameters, encompassing a wide array of diverse factors, including the user’s mental state, to enhance the overall quality experienced by individuals. QoE pertains to the intricate perception and interpretation of events, encapsulating the subjective acceptability of an application or service as end-users encounter it. This concept takes into consideration factors such as the sense of presence, realism, and other user-driven evaluations, especially pertinent in immersive settings like VR.

Importantly, a comprehensive model of QoE for VR entails acknowledging the multifaceted elements contributing to the users’ holistic subjective assessments of their experiences. In the following subsections, we will delve into the primary hypothesized trends associated with constructing a comprehensive model of this nature. In this way, the current research critically identifies these trends and explores potential approaches to address them.

1.2 The research questions

We suggest that a primary trend encountered in contemporary academic literature is the prevalent one-dimensional perspective adopted by a substantial portion of research articles. This manifests as an inclination to investigate QoE factors through the lens of a single disciplinary framework. This trend, although understandably driven by the complexity of the concept, inadvertently reinforces the dichotomy between QoS and QoE, thus impeding the development of a holistic and cohesive comprehension of the multifaceted factors that collectively contribute to shaping the user experience in VR. To test this assumption, the current study poses the following questions:

  • RQ1: What are the main discrepancies in the research interest towards QoE factors?

  • RQ2: What are the most studied factors under the umbrella of QoE in VR?

We believe that by answering these questions we would facilitate a more holistic examination of the current research trends in QoE in VR. This, in turn, can be used by future researchers to address factors that have stayed under the radar thus improving the quality of research in the field.

A well-fitting methodology to address the current research questions is the integrative literature review method. This method follows a five-stage process entailing a clear search strategy and audit trail, in which sources are critically analysed, aggregated, synthesised and interpreted. Importantly, the method allows for the amalgamation of research from various academic fields and industry articles (Snyder 2019). This methodology is instrumental in broadening the scope of our investigation and fostering a well-rounded perspective on the subject.

With this approach, we aim to promote a heightened awareness of the factors of VR QoE that are predominantly explored. At the same time, we strive to outline factors that have been overlooked within the current scholarly discourse.

Next, we provide a comprehensive overview of each step within the integrative literature methodology. Furthermore, we expand upon the significance of each step in the context of the current research objective, which is to discern concepts associated with QoE in VR; both those that have been adequately addressed and those that have been overlooked.

2 Methodology

2.1 Design

The design of an integrative review is not developed according to a specific standard (Torraco 2005). However, to ensure the integrity of the current research, we followed the Whittemore and Knafl methodology (Whittemore and Knafl 2005), which includes the following five steps: step 1—identify a framework (i.e., a blueprint for the analysis), step 2—search the database for sources, step 3—filter the sources according to the inclusion criteria, step 4—code the relevant sources according to the framework identified in step 1, and step 5—aggregate the findings in a detailed QoE model.

Before we outline the analysis steps, it is important to note the definition of factors that this study used and why we decided to only use the term factors. The term factors is often used to address broader, holistic elements that affect the user experience, while features have been related to the specific design and technical attributes that can be fine-tuned to improve that experience. With this distinction in mind, we chose to only use the factors label because we believe that in the case of VR, many elements that influence the users’ quality of experience can be both factors and features simultaneously. For example, the resolution of a VR headset can be considered a factor affecting the quality of experience, but it can also be a specific feature of the headset itself. Since the aim of the current paper is not to present a detailed theoretical QoE model but rather to assess the state of the current literature and investigate which factors are mostly studied and which are often overlooked, we believe that using one term was sufficient. Nevertheless, we discuss the possible implications of this decision on the interpretation of the results in the Limitation chapter of this paper.

2.2 Analysis

2.2.1 Step 1: Identify a relevant framework

The first step of the process is to adopt a guiding view of the topic (i.e., a conceptual framework). Based on the exhaustive findings of the QUALINET white paper (Perkis et al. 2020) and the paper of Letter et al. (2021) we decided to follow a four-branch model of QoE (Fig. 1). These papers were chosen because they successfully incorporate the multifaceted nature of QoE in VR (i.e., they include branches that are related both to the user and to the system). Moreover, the two studies use complementary research designs (i.e., the repertory grid bottom-up approach and the top-down literature review approach), which allows us to cast a wider net on the diversity of factors that influence the experience in VR. Thus, we argue that the conceptual framework presented in Fig. 1 incorporates all main facets that might influence QoE in VR (i.e., user, content, context and system related). However, to stay true to the interdisciplinary claim of the current paper, we decided to remain open to adding factors that might not fit into the conceptual framework. Therefore, the rest of the analysis followed the blueprint of the conceptual framework to filter and code the data but remained flexible in incorporating other concepts.

Fig. 1 Conceptual framework based on step 1 of the analysis

2.2.2 Step 2: Search databases

We undertook two strategies to identify all academic and industry sources that present findings on one or more than one of the four factor branches of the conceptual framework. First, searches were conducted between November 2021 and September 2023 in (1) academic sources—Web of Science, Science Direct and Google Scholar, and (2) industry sources—Medium, Road to VR, and VR Scout. The non-academic sources were chosen due to their daily updates, which cater to the dynamic development of VR and because they report on a variety of facets of QoE (e.g., data safety, hardware development, storytelling, etc.) as opposed to focusing only on technological developments or user experience. The relevant search terms were “virtual reality" and “quality of experience (QoE)". The second search strategy was to screen the reference sections of relevant meta-analysis review articles and cross-reference them.

2.2.3 Step 3: Filter through the sources

Combined, the two search strategies yielded a dataset of 1612 academic sources and 90 non-academic sources (1702 in total), which included scientific articles, dissertations, theses, conference presentations, press messages and VR development tutorials. All sources were filtered through three inclusion criteria, as recommended by Torraco (2005) (see Fig. 2).

First, we screened for and removed duplicate sources that appeared repeatedly in our search. Second, we filtered out the sources that did not mention at least one variable related to one of the branches in the conceptual framework. Third, academic sources had to include a sample size larger than nine (Torraco 2005). This filter was not applied to the non-academic sources as they did not follow the typical structure of an academic article and thus usually did not use a sample of participants. Instead, they presented more technical details of new hardware, software and network developments or press releases of new safety protocols for VR platforms. Using the above procedures, 729 sources were retained in this literature review. All sources are listed and outlined in detail on the Open Science Framework page of the project (see https://bit.ly/3MGT4Dy).
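The three inclusion criteria can be sketched as a simple filter. The snippet below is a hypothetical Python illustration rather than the procedure actually used for this review: the `Source` record, its field names, and the example entries are invented, duplicates are matched on a normalised title, and an academic source with no reported sample size is assumed to fail the third criterion.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional

# The four branches of the conceptual framework.
BRANCHES = frozenset({"user", "content", "context", "system"})


@dataclass(frozen=True)
class Source:
    """Hypothetical record for one retrieved source."""
    title: str
    academic: bool
    branches: FrozenSet[str]           # framework branches the source mentions
    sample_size: Optional[int] = None  # None when no sample is reported


def filter_sources(sources: List[Source]) -> List[Source]:
    """Apply the three inclusion criteria in order."""
    seen_titles = set()
    retained = []
    for source in sources:
        # Criterion 1: drop sources that appear repeatedly in the search.
        key = source.title.strip().lower()
        if key in seen_titles:
            continue
        seen_titles.add(key)

        # Criterion 2: at least one variable from a framework branch.
        if not (source.branches & BRANCHES):
            continue

        # Criterion 3 (academic sources only): sample size larger than nine.
        # Assumption: a missing sample size counts as not meeting the criterion.
        if source.academic and (source.sample_size is None or source.sample_size <= 9):
            continue

        retained.append(source)
    return retained


# Invented example entries, one per filtering outcome.
candidates = [
    Source("Study A", True, frozenset({"user"}), 24),
    Source("study a ", True, frozenset({"user"}), 24),   # repeat of Study A
    Source("Study B", True, frozenset(), 40),            # no framework branch
    Source("Study C", True, frozenset({"system"}), 8),   # sample too small
    Source("Industry post D", False, frozenset({"system"})),
]
retained = filter_sources(candidates)
print([s.title for s in retained])  # ['Study A', 'Industry post D']
```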

Fig. 2 The number of academic and non-academic sources in each of the filtering steps. The grey arrow at the bottom right corner indicates that this step was not applied to the three industry sources

2.2.4 Step 4: Code and analysis

The goal of this step was a thorough and unbiased interpretation of primary sources, along with an innovative synthesis of the evidence. To achieve that, we followed the process outlined in Whittemore and Knafl (2005), which consists of (1) data reduction, (2) data display, (3) data comparison, and (4) conclusion drawing and verification.

Data reduction. The first phase of data reduction involved dividing the sources into groups according to the source they come from. Hence, we compiled one spreadsheet, with 6 different sheets—one per data source. Next, we simplified the data by organizing it into a manageable framework. Predetermined and relevant features (i.e., authors list, year of publication, abstract, QoE-related factors, QoE branch) were extracted from all primary data sources. For example, after reading the study by Gomes et al. (2020), we extracted the name of the authors, year of publication and abstract and input them into our main data frame. Then, we pinpointed the factor latency as mentioned in the article that is related to QoE in VR. Finally, we categorised latency under the branch of system factors. In the end, each primary source was reduced to a single row with the same type of data extracted from individual sources (of each subgroup) (https://bit.ly/3MGT4Dy).

Data display. This step involved consolidating the extracted data from individual sources into four spreadsheets each of which assembled the data from multiple primary sources around one particular branch (i.e., user, content, context, system).

Data comparison. Finally, we went through an iterative process of examining the source data to identify patterns, themes, and relationships. Thus, we observed the development of further sub-groups of factors. To accommodate the newly emerged patterns we added two new columns to each of the data frames (i.e., Level 1 and Level 2). For instance, some articles mentioned variables related to the system factors branch that repeatedly noted network characteristics (e.g., latency), hence a sub-branch titled network was added in the Level 2 column.

Conclusion drawing and verification. This step moved the interpretive effort from the description of patterns and relationships to higher levels of abstraction. Patterns and processes were isolated and identified with a gradual elaboration of a small set of generalizations that encompassed each subgroup database. Finally, we calculated the number of times a factor was mentioned in the database.
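The final counting step can be illustrated with a short Python sketch. The row layout (source identifier, branch, factor), the source identifiers, and the `count_mentions` helper are hypothetical, and the sketch assumes that a factor is counted once per source even if a source mentions it several times; the actual coding sheets are available on the Open Science Framework page of the project.

```python
from collections import Counter
from typing import Iterable, Tuple

# Hypothetical coded row: (source id, QoE branch, factor).
Row = Tuple[str, str, str]


def count_mentions(rows: Iterable[Row]) -> Counter:
    """Count in how many sources each (branch, factor) pair is mentioned."""
    # Assumption: a factor counts once per source, so exact repeats are dropped.
    unique_rows = set(rows)
    return Counter((branch, factor) for _source, branch, factor in unique_rows)


# Invented rows using factor names that appear in the review.
rows = [
    ("S1", "system", "latency"),
    ("S1", "system", "latency"),   # repeated within one source, counted once
    ("S2", "system", "latency"),
    ("S2", "user", "presence"),
    ("S3", "user", "presence"),
    ("S3", "user", "cybersickness"),
]

mentions = count_mentions(rows)
for (branch, factor), n_sources in mentions.most_common():
    print(f"{branch:8s} {factor:15s} {n_sources}")
```

Keying the counter on the (branch, factor) pair keeps instances of one label coded under two branches visible as separate entries, which is the kind of conflict the verification phase has to resolve.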

During this phase, explicit care was taken to avoid premature analytic closure (e.g., being locked into a particular pattern). This was especially pertinent when addressing conflicting evidence. For example, there were several instances where the factor video quality was coded as part of both the system factors branch and the user factors branch. To resolve this conflict, we went back to the data comparison phase to explore the different explanations of the term video quality and to determine whether it fits in both categories, whether it should be re-labelled, or whether the term carries more nuance in what it incorporates. As a result, we found that the factor video quality encompassed a range of factors (e.g., HMD resolution, HMD framerate, HMD refresh rate, etc.) contributing to the user’s overall experience. In light of this, we re-labelled the instances of video quality with their better corresponding labels.

2.2.5 Step 5: Create a QoE model

Finally, on completion of each subgroup analysis, we synthesised the important elements of each subgroup into an integrated summation of the current QoE model. This resulted in a new QoE conceptualization that integrated all subgroups into a comprehensive portrayal of the topic (Fig. 5). We chose to present this model only briefly in the subsequent chapter. This is because the primary aim of the article is to use the results as a stepping stone for a critical analysis of the state of QoE research. However, if the reader is further interested in exploring all of the factors (and factor definitions) that the model incorporates, then they can visit the online interactive dashboard (see https://bit.ly/3qs1SV0).

Fig. 3 Definition of the four main factor branches and the number of factors per branch

Fig. 4 A screenshot of the dashboard dedicated to user factors

Table 1 The proportion of the Top 10 research fields covered in the current database

Fig. 5 A visual representation of the findings of the integrative literature review. The figure presents the percentages of the most mentioned factors per branch or sub-branch. To see all factors visit the online dashboard at https://bit.ly/3qs1SV0

3 Results

This research analysed 661 academic and 68 industry sources to get a comprehensive view of the state of research in the field of QoE in VR. To do so, a set of coding parameters and categories were created and based on these, the data was analysed. This resulted in the currently outlined QoE model. The model consisted of 4 main branches of factors (i.e., user, content, system and context). Figure 3 presents a short description of each of the main branches and the number of factors mentioned per branch. One striking outcome is that the majority of factors outlined in the data sources were user-related (143), followed by system (71), context (25) and content-related (13) factors. To investigate whether this trend was driven by the predominance of a research field (e.g., psychology) in the data sources we outlined the research fields for all sources. As shown in Table 1 most data sources came from the field of 3D interfaces and human-computer interactions.
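As a quick arithmetic check on the counts reported above, the short Python sketch below sums the factors per branch and expresses each branch as a share of the total. The percentages are derived here from the reported counts for illustration only; they are not figures stated in the review, and the variable names are hypothetical.

```python
# Counts as reported in this section.
factors_per_branch = {"user": 143, "system": 71, "context": 25, "content": 13}
sources = {"academic": 661, "industry": 68}

total_factors = sum(factors_per_branch.values())
total_sources = sum(sources.values())

# Share of all factors that falls into each branch, in percent.
shares = {
    branch: round(100 * count / total_factors, 1)
    for branch, count in factors_per_branch.items()
}

print(total_sources)  # 729
print(total_factors)  # 252
print(shares)         # {'user': 56.7, 'system': 28.2, 'context': 9.9, 'content': 5.2}
```

Under these counts, user-related factors make up more than half of all factors identified, which is the imbalance the remainder of this section examines.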

Next, to answer the research questions we briefly expand on the top 5 most frequently mentioned factors per branch (and/or sub-branch). If the reader is interested in all 252 factors and their descriptions outlined based on the current review they can visit an online dashboard at https://bit.ly/3qs1SV0. The dashboard (see a screenshot in Fig. 4) summarizes all factors, per branch and the frequency with which they were mentioned in the sources used in this review. What is more, the dashboard presents a description of each of the factors.

3.1 User factors

User factors included both static ones, which referred to the fixed traits of the user (i.e., user characteristics) and dynamic ones, which incorporated the current state of the user (i.e., user experience). The analysis showed that user experience factors were more predominant compared to user characteristics.

3.1.1 User characteristics

Forty-six user characteristics were mentioned in 119 of the studies. The top five most frequently mentioned ones were age (38 mentions), previous experience (34 mentions), physical health (19 mentions), gender (17 mentions), and immersive tendencies (14 mentions).

Age was the most often mentioned factor, particularly among older VR users (> 65 years), impacting their experience due to HMD weight, object positioning, and controller comprehension challenges (Abeele et al. 2020, 2021; Barsasella et al. 2020; Coldham and Cook 2017). Previous VR experience was shown to influence cybersickness, control, and presence (Weech et al. 2020). Physical health factors, like reduced visual acuity and hearing ability, were illustrated to play a role, with some users needing extra hardware for optimal VR experience (Ling et al. 2013; Silva et al. 2019). Gender differences were often studied and were shown to impact physical and social presence in some cases (Felnhofer et al. 2021), though sometimes due to content variations (Melo et al. 2018). Lastly, immersive tendencies were shown to affect presence, with highly immersive users often experiencing more cybersickness symptoms (Ling et al. 2013).

3.1.2 User experience

Ninety-five various user experience factors were mentioned in 475 of the data sources. The top five most frequently mentioned ones were presence (213 mentions), cybersickness (134 mentions), immersion (108 mentions), enjoyment (53 mentions), and affect (52 mentions).

The most predominantly studied concept was presence. Presence is a psychological phenomenon where individuals mentally experience being in another world, similar to physical reality (Sas 2004; Slater 2009). It encompasses telepresence, where users feel transported to remote locations, and social presence—feeling connected with real people in virtual environments, which influences persuasion and attraction (Oh et al. 2018). The second most studied factor was cybersickness, which included discomfort, fatigue, headache, and eye strain (Fernandes and Feiner 2016). Immersion, described as VR’s capacity to create realistic experiences, was also frequently studied among the sources analysed in the current paper. Furthermore, it was shown that immersion depended on system and content factors like audio, video quality, frame rate, and network latency (Slater et al. 1994; Witmer and Singer 1998; Jang and Park 2019). Next, enjoyment was often shown to influence immersion and technology use intentions (Carrion et al. 2021), with it being a core predictor of users’ willingness to adopt new technology, including VR (Jang and Park 2019). Finally, affect, the emotional impact induced by VR, was shown to increase the overall enjoyment of VR (Kartiko et al. 2010; Mirvis and Csikszentmihalyi 1991).

3.2 System factors

Overall, seventy-one system-related factors were mentioned in 297 of the articles that were part of this review. In general, three broader subcategories emerged—factors related to the hardware (HMD, controllers, tracking devices), factors related to the software (3D engine, tracking algorithms, rendering algorithms), and factors related to the network (latency, bandwidth, error rate, etc.).

3.2.1 Hardware

HMD resolution was shown to be a pivotal metric related to enriching VR experiences with enhanced detail perception, text clarity, and distant target recognition (Mehrfard 2019; Perroud et al. 2019). Haptic devices were also listed as important for elevating immersion by providing tactile feedback through touch, feel, and manipulation (Garcia-Valle et al. 2018). The field of view (FoV) was shown to bolster VR immersion by allowing the user to perceive a greater number of environmental stimuli (Boger 2017), albeit potentially exacerbating symptoms of cybersickness (Kim et al. 2018). HMD types, encompassing wireless, phone-in-a-box, stand-alone, and tethered variants, were shown to be important in catering to diverse user preferences, offering varying degrees of comfort, audiovisual quality, and immersion (Angelov et al. 2020). Tracking devices were listed as playing a central role, with orientational tracking offering 3 degrees of freedom (DoF) and positional tracking providing an additional 3 DoF, essential for realistic VR interactions (Angelov et al. 2020).
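The distinction between orientational and positional tracking can be made concrete with a small data structure. The following is a minimal sketch only; the class and field names (`Orientation`, `Position`, `Pose`) are hypothetical and are not taken from any of the reviewed sources or from a specific VR SDK.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Orientation:
    """Rotational state: the 3 DoF provided by orientational tracking."""
    yaw: float    # rotation about the vertical axis, in degrees
    pitch: float  # rotation about the side-to-side axis, in degrees
    roll: float   # rotation about the front-to-back axis, in degrees


@dataclass(frozen=True)
class Position:
    """Translational state: the additional 3 DoF from positional tracking."""
    x: float  # metres
    y: float  # metres
    z: float  # metres


@dataclass(frozen=True)
class Pose:
    """Tracked pose; position is None when only orientation is tracked."""
    orientation: Orientation
    position: Optional[Position] = None

    @property
    def degrees_of_freedom(self) -> int:
        # Orientation alone gives 3 DoF; adding position gives 6 DoF.
        return 3 if self.position is None else 6


# Hypothetical example values, for illustration only.
seated_viewer = Pose(Orientation(yaw=30.0, pitch=-5.0, roll=0.0))
room_scale_user = Pose(Orientation(0.0, 0.0, 0.0), Position(0.4, 1.7, -0.2))
```

In this sketch, a headset that tracks only head rotation is represented by a pose without a position, whereas a headset that also tracks translation carries both components.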

3.2.2 Software

Image rendering was shown to play a pivotal role in shaping VR experiences by determining the visual content displayed and enabling significant advancements in rendering engine speed and cost-effectiveness (de Regt et al. 2020). Tracking was illustrated to be fundamental for user immersion and interaction within VR, with precise tracking of head, body, and eye movements being paramount. The literature also indicated that inaccurate tracking can induce cybersickness symptoms, underscoring the need for tracking precision (Kunz et al. 2016). Additionally, eye gaze tracking was shown to enhance user interactions and efficiency in VR (Li et al. 2017). Lastly, adaptive streaming was illustrated to contribute to delivering VR content efficiently, optimizing video configurations based on network conditions, device capabilities, and content characteristics (Anwar et al. 2020).

3.2.3 Network

The most often mentioned factor here was network latency, that is, the time it takes for data packets to traverse the network. Indeed, the review showed that VR headsets demand exceptionally low latency, ideally between 6 and 15 milliseconds, as latency exceeding this range could not only deteriorate visual quality but also induce symptoms of cybersickness (Saxena et al. 2020). Secondly, bandwidth, denoting the maximum data transfer rate, was shown to be imperative for delivering high-quality VR experiences. Yet, the current capabilities of the more widespread 4G wireless communication were reported to often fall short, necessitating adaptive streaming and rate adaptation algorithms (Anwar et al. 2020). Bit rate, describing the speed of data transfer, was illustrated to significantly impact QoE by affecting video representation and viewing smoothness (Gao et al. 2019). Delay, distinct from latency as it pertains to the arrival time of the first byte, was shown to influence the overall VR user experience, with delays exceeding 11 milliseconds leading to decreased immersion and satisfaction (Grzelka et al. 2019). Lastly, rate adaptation mechanisms were shown to play a pivotal role in addressing the challenges posed by the demanding VR environment, enabling dynamic selection of video representations based on network conditions, ultimately enhancing audiovisual quality and reducing delay, thereby enriching the overall user experience (Gao et al. 2019; Jiang et al. 2020).
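To illustrate what rate adaptation means in practice, the sketch below selects a video representation from the measured network throughput. It is a simple throughput-based heuristic written for illustration, not the algorithm of Gao et al. (2019) or Jiang et al. (2020); the representation names, bit rates, and the `safety_margin` parameter are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Representation:
    """One encoded version of the same VR video (hypothetical ladder entry)."""
    name: str
    bitrate_mbps: float


def select_representation(representations, throughput_mbps, safety_margin=0.8):
    """Pick the highest-bitrate representation that fits the throughput budget.

    The budget is the measured throughput scaled by an assumed safety margin.
    If nothing fits, fall back to the lowest-bitrate representation so that
    playback can continue at reduced quality.
    """
    if not representations:
        raise ValueError("at least one representation is required")
    budget = throughput_mbps * safety_margin
    ordered = sorted(representations, key=lambda r: r.bitrate_mbps)
    affordable = [r for r in ordered if r.bitrate_mbps <= budget]
    return affordable[-1] if affordable else ordered[0]


# Hypothetical bitrate ladder; the values are illustrative, not measured.
ladder = [
    Representation("low", 5.0),
    Representation("medium", 15.0),
    Representation("high", 40.0),
]

choice = select_representation(ladder, throughput_mbps=25.0)
print(choice.name)
```

With a measured throughput of 25 Mbps and the assumed margin of 0.8, the budget is 20 Mbps, so the sketch selects the medium representation. Algorithms for VR streaming typically also consider buffer occupancy and the user's viewport, which this sketch leaves out.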

3.3 Context factors

Context factors express the circumstances and situational properties that describe the environment in which an individual uses a VR system. Twenty-five context factors were mentioned in 80 of the studies. The most frequently mentioned ones were safety (48 mentions), environment (21 mentions), purpose (18 mentions), costs (17 mentions), accessibility (7 mentions) and viewing position (7 mentions).

The foremost factor in this branch was safety. Ball et al. (2021) revealed that the perceived impact of the pandemic on physical safety influenced the adoption of VR HMDs in education, tourism, and work contexts. Privacy and data safety concerns were also mentioned in the Virtual Reality Awareness and Adoption: 2022 Annual Report (Turner 2022), particularly regarding Meta devices (Lang 2021; Rosenberg 2021). Second, the user’s physical environment was reported to significantly impact their experience (Hameed et al. 2019). For instance, Hong et al. (2018) demonstrated that the type of chair a user sits in could affect their spatial immersion and engagement. Third, the purpose of VR use was often shown to dictate hardware, software, and network requirements, thus shaping the overall user experience (Boas 2013). Fourth, the reviewed literature showed that high overall costs can create cognitive risk perception and can influence the users’ willingness to adopt VR (Carrozzino and Bergamasco 2010; Chen et al. 2020; Jang and Park 2019; Perdana and Mokhtar 2022). Finally, VR’s accessibility, and especially the HMDs’ facial and eye tracking algorithms, was mentioned as a relevant QoE factor (Dionisio et al. 2013). Another context factor, mentioned equally often, was the viewing position of the user: McDonnell (2019) demonstrated its impact on comfort, immersion, and simulator sickness symptoms, particularly for physically challenged users.

3.4 Content factors

Thirteen content-related factors were mentioned. These factors described the subject matter of the content and the way it is expressed in VR. The top five most frequently mentioned ones were interaction design techniques (140 mentions), editing techniques (66 mentions), graphics (59 mentions), genre (44 mentions), and audio (28 mentions).

In this branch, the most frequently mentioned factor was interaction design techniques, which aimed at enhancing virtual presence and sensory awareness (Newman et al. 2022). Next, editing techniques were mentioned as an important influencer of the user’s QoE. For example, Zheleva et al. (2021) showed that the frequency of straight cuts (i.e., moving directly from one scene to the next) in editing influenced user agency, enjoyment, and story comprehension. Graphics quality and realism were also shown to significantly impact VR users’ sense of presence and immersion (Arora et al. 2012; Hameed et al. 2019). Next, the genre of VR content was shown to affect the users’ presence and overall experience, particularly in terms of content purpose and graphics realism (Letter et al. 2021). Interestingly, Chen et al. (2011) showed that higher genre abstraction levels correspond to lower reported presence levels (i.e., students felt less present in a VR environment that presented a fictional story compared to one that resembled a student classroom scenario). Finally, audio was illustrated to play a crucial role in VR, influencing both user factors like presence and content factors like realism (Kern and Ellermeier 2020).

4 Discussion

The current study undertook an integrative literature review approach to gather a holistic view of the current state of literature that focuses on QoE in VR. As part of this approach, 729 academic and industry sources were thoroughly investigated. The top five fields that the sources came from were 3D interface, human-computer interaction, virtual reality, quality of multimedia experiences and computer science. Each of these sources was examined for the type of factor it investigated concerning QoE in VR and the definition of those factors. Based on this rich dataset, we then created a QoE model that illustrates the most researched QoE factors grouped in four branches (Fig. 5).

The main conclusion of the current investigation is that the majority of articles researched factors related to the user (that is, user characteristics and/or user experience). The discrepancy between the number of user factors (N = 143) and the number of system (N = 71), context (N = 25) and content factors (N = 13) was striking, especially bearing in mind the rather technical QoS origin of the concept of quality (as explained in Sect. 1 of this paper).
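The size of this discrepancy can be expressed as shares of all distinct factors identified. The short sketch below only restates the four counts reported above as proportions; the function and variable names are illustrative, and the figures refer to the number of distinct factors per branch, not to the number of sources or mentions.

```python
# Number of distinct factors per branch, as reported in this section.
factor_counts = {"user": 143, "system": 71, "context": 25, "content": 13}


def factor_shares(counts):
    """Return each branch's share of the total number of distinct factors."""
    total = sum(counts.values())
    return {branch: n / total for branch, n in counts.items()}


shares = factor_shares(factor_counts)
for branch, share in shares.items():
    print(f"{branch:<8}{factor_counts[branch]:>4}  {share:.1%}")
```

On these counts, user factors make up roughly 57% of the 252 distinct factors, and there are eleven times as many user factors as content factors.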

The ubiquity of factors that measure the user experience can be explained by the novel developments in the field which could have drawn attention to the central role of the user. Researchers investigating emerging media platforms such as VR often exhibit a predominant interest in user-related factors, eclipsing a comprehensive exploration of the hardware, software, context, and content factors for several compelling reasons. Firstly, these studies are fundamentally motivated by a user-centric perspective, recognizing that the success of these technologies hinges on their ability to engage and resonate with users. This user-centric focus aligns with human-centred design principles, emphasizing the importance of crafting technology to suit the needs and preferences of the human user (van der Bijl-Brouwer and Dorst 2017). Additionally, the usability and acceptance of these technologies among users are pivotal for their widespread adoption; hence, understanding how users perceive, interact with, and embrace these technologies takes precedence in research agendas (Sagnier et al. 2019). Furthermore, as these technologies seek to find their place in the market, insights into user-related factors can guide businesses and developers in aligning their products with user needs and preferences, ultimately driving adoption and market success.

Although the focus on user-related factors is undeniably prominent, it does not diminish the importance of a holistic perspective. Indeed, research appears to reflect this, as shown by the observable increase in interdisciplinary efforts to dynamically integrate system, context, and content factors, leveraging their interplay to enhance user experience. To this point, one should not underestimate the intricate relationship that the user has with the system they are using, the context in which they are using it and the content with which they are engaging.

Indeed, expanding our scientific understanding of system, context, and content factors influencing VR QoE holds immense significance on multiple fronts. Firstly, delving into system factors allows for the refinement of VR technology, yielding improved graphics rendering, reduced latency, and more seamless interactions. These advancements culminate in enhanced user immersion and satisfaction (Saxena et al. 2020). Secondly, exploring context factors is crucial in adapting VR experiences to various physical environments and scenarios, ensuring users’ comfort and continuity of immersion, regardless of their surroundings and the equipment they use (Hong et al. 2018). This adaptability promotes the accessibility of VR across diverse contexts, from gaming at home to professional applications in healthcare or training. What is more, the lack of research on temporal and social contexts in QoE for VR means that important insights are being overlooked. Social context, for example, is especially crucial for multiplayer VR, enhancing empathy and immersion more than conventional platforms (Vlahovic et al. 2022). Additionally, the duration and frequency of VR use can affect QoE, highlighting the need for more studies on momentary, reflective, repetitive, and retrospective experiences (Vlahovic et al. 2022). Thirdly, studying content factors is pivotal in crafting captivating and meaningful VR experiences. Understanding how content type, storytelling techniques, and interactive elements impact user engagement and enjoyment can revolutionize content creation, sparking user interest and widespread adoption of VR technologies (Zheleva et al. 2021).

5 Limitations

Notably, there are certain limitations that the reader should consider when interpreting the results of the current review. Particularly, the distinction between ‘quality factors’ and ‘quality features’ in the context of QoE requires careful attention. As outlined in Möller and Raake (2014b), quality features refer to perceivable and nameable characteristics of an individual’s experience of a service that contribute to its quality. These are formed as part of the QoE formation process and can include elements such as ease of use, cybersickness, and immersion. On the other hand, quality factors are “characteristic of a user, system, service, application, or context whose actual state or setting may influence the Quality of Experience for the user” (Möller and Raake 2014b).

In this study, we used the term factors to maintain a consistent narrative that would be accessible to a broader readership, including those not well versed in QoE research terminologies. This decision was also influenced by our literature review, which revealed that many sources did not explicitly differentiate between factors and features or used the terms interchangeably. Given this ambiguity, we refrained from introducing our subjective interpretation (even if based on the terminology outlined by Möller and Raake (2014a)) that may not have been supported by the original texts.

However, we recognize that such a decision might oversimplify the complexities and could potentially affect the granularity of the analysis and conclusions. For instance, aspects such as presence, cybersickness, and immersion are extensively studied within the QoE community and might be more accurately categorized as features rather than factors. While this does not alter the overarching finding that these elements are researched more frequently than content or context-related ones, it does suggest a need for a more cautious interpretation of the listed findings. Therefore, readers are advised to consider the use of the term factors within this manuscript as encompassing both factors and features, per the QoE research lexicon, and interpret the findings with an understanding of the potential overlap between these concepts.

Furthermore, the technical developments that partially steer the quality of user experience in VR are fast-paced. This can be witnessed in the recent and quick adoption of artificial intelligence (AI) in VR development. As Thau (2024) aptly explains, AI-powered VR allows people to experience user-centered adaptive environments, which (via gesture and facial recognition, for example) can create smoother and more intuitive interactions with the environment. This can ultimately result in a higher sense of immersion, presence and overall enjoyment. Therefore, we want to encourage the reader to view the current text and its supplementary materials through a critical lens and keep in mind that new technological developments have appeared during and will appear after the publication of this article.

6 Conclusion

Our study has shed light on a crucial issue: the need for a more diverse research effort to understand the Quality of Experience in Virtual Reality. Our findings emphasize that there is a substantial gap in our current research landscape, where we often overlook the complex interactions among users, systems, contexts, and content in VR.

To address this gap, we propose fostering interdisciplinary collaborations that bring together experts from computer science, psychology, neuroscience, design and engineering among other disciplines. These collaborations hold the potential to generate innovative solutions that can push VR technology forward. What is more, interdisciplinary cooperation can help address ethical concerns related to privacy, safety, and content appropriateness as VR becomes increasingly integrated into our daily lives. Thus, the implications of these interdisciplinary efforts extend beyond technological advancements. They have far-reaching consequences for society, impacting our collective well-being and security. By embracing such collaborations, we can unlock the full potential of tailored VR experiences, which can transform various aspects of human life, from education and healthcare to training and therapy.

Supplementary information: This article is supplemented by an online interactive dashboard which can be found at the following web address: https://bit.ly/3qs1SV0.