1 Introduction

Immersive Virtual Reality (iVR) is a promising technology to improve human performance and wellbeing (Hamilton et al. 2021; Narciso et al. 2021; Patsaki et al. 2022). Tasks such as learning, training, and relaxing can be improved in iVR environments in safe, autonomous, and personalized ways (Checa and Bustillo 2020). Biosignals gathered during iVR experiences can significantly improve user autonomy and the adaptation of the iVR environment to the characteristics of each user (Houzangbe et al. 2020). Biosignals can be defined as descriptions of physiological reactions (Kaniusas 2012), changes to which result in physiological signals that are recorded and tracked by biosensor devices. The most common wearable devices monitor cardiac, electrodermal, and cerebral activity (Antoniou et al. 2020). These devices gather biosignals from a user in iVR experiences and provide feedback as a form of interaction in biofeedback-based experiences (Sajno et al. 2022). Biofeedback can be used, for example, to adapt iVR learning experiences to the pace of each student, relaxation techniques to user stress levels, and training tasks to an appropriate level of difficulty (Michela et al. 2022; Weibel et al. 2023).

However, if biofeedback-based iVR experiences are to become standard solutions, then extensive research is still needed. Many issues on the proper design and the evaluation of such experiences for different sectors of application are summarized in Sect. 2. Regarding the design of the iVR application, the selection of convenient wearable biosensors, the appropriate Head-Mounted Display (HMD), and the type of iVR experience are the first three points that should be properly addressed. These points can eventually be resolved in relation to the final application, as will be explained in Sect. 2. As an example, an experience for patients at an Intensive Care Unit (ICU) must be passive, and must use a non-invasive HMD and biosensors that may already be in use at the ICU (e.g., Badke et al. 2022). In contrast, if physical activation is to be measured during sports activities, then neither passive applications nor sensors that cannot measure physical arousal, such as electroencephalography (EEG), are appropriate (e.g., Rutkowski et al. 2021).

Key points when testing the design of iVR experiences are the types of control experiences to be compared and the size of both user groups (the control and experimental group). The availability of sufficient final users is a challenge in many applications. While it might be easy to find undergraduate volunteers at universities for research in iVR environments (e.g., Kampa et al. 2022), the contrary might occur in the case of testing an iVR-based therapy for a rare physical illness (e.g., Lai et al. 2020). The statistical value of the conclusions of both works will in all likelihood differ, in accordance with the numbers of final users.

General-purpose responses are given to the previous points within the seven fields of application that are considered in this review: education, military, psychology, medicine, sports, tourism and marketing, and ergonomics. An eighth field, research, was added, which encompassed all the works focused on the design of an application, although with no specific final use. This division of fields was possible due to the large number of studies published over the last seven years, presenting iVR applications with biosignal acquisition: 560 manuscripts. Nevertheless, biosignals are rarely used to provide feedback during iVR experiences (only around 20% of the papers under review describe biofeedback-based iVR experiences), as is concluded in this research. Most of the studies simply report data acquisition, without using the results to improve the experience performance. The main reasons might be: (1) the current lack of easy and direct solutions to extract and to process biosignals in real time from the most common wearable biosensors; and (2) the difficulty with their real-time integration in the most popular iVR game engines. In this review, the 98 works that can be considered biofeedback-based iVR experiences were therefore analyzed to extract guidelines for iVR experience design in this promising topic.

The final objective of the statistical analysis was to identify the most common solutions, the best practices, and the most promising trends among biofeedback-based iVR applications. On the basis of the good practice identified in this review, practical guidelines are proposed for the future development of biofeedback-based iVR applications. As previously outlined, the practical orientation of this review considers the difference in both the requirements and the needs of the different final sectors where the iVR applications will be used. Section 4 therefore has different subsections with the best practices of each of the 7 final sectors under consideration, helping to focus attention directly on the needs of each reader.

The novel focus of this review concerns the potential to combine wearable biosensors and iVR. Some other reviews were also used as a foundation for the development of the current review. The main one is the literature review of the historical evolution of biosensors in iVR between 1995 and 2019 (Halbig and Latoschik 2021), although its presentation of biosensor combinations and their current potential is very limited. There are also several recent reviews on specific fields of application for iVR in which biosensors are given a secondary role, such as medical therapies (Lüddecke and Felnhofer 2022), psychological interventions to reduce anxiety (Alneyadi et al. 2021), and training applications for industrial workers (Radhakrishnan et al. 2021).

The remainder of this review paper is structured in the following way. First, in Sect. 2, the taxonomy used in the review is presented, together with the main issues related to iVR environments, biosensors, fields of application of the iVR experiences, and the post-assessment objectives. Then, the methodology followed in the survey is described in Sect. 3. A statistical analysis forms the basis of Sect. 4 where the results of this analysis are presented in four sub-sections: (1) demographics, so as to give a general view of research time evolution, fields of application, and the main types of wearable biosensors; (2) development of the iVR environment, considering the HMD in use, the game engine, and the interaction level; (3) the experience design, focused on sample size, the control experience and the structure of the evaluation methods; and (4) the biofeedback-based iVR applications, considering how biofeedback is used, how they are designed and the conclusions that researchers may draw from these works. The good practices identified in the results of each final sector are summarized in Sect. 5. Finally, the main conclusions of the review are presented in Sect. 6, while the main lines to be covered by future research in biofeedback-based iVR applications are summarized in Sect. 7.

2 Taxonomy

According to Horvat et al., the design of iVR experiences can be divided into three phases: preparation, execution, and post-processing (Horvat et al. 2022). As shown in Fig. 1, this process can be extrapolated to the creation and the evaluation of a complete experience. In the Preparation stage, the objectives have to be defined. In addition, a decision must be taken on whether to use an application or 360° video, and, in either case, whether to develop it, to outsource the process, or to use an existing source. If the application is to be developed, some steps will be required: to select the engine, the type and the form of interaction, the device on which the application will be run, and the physiological data to be extracted.

Fig. 1 Design process of an iVR experience with biosensors

The Execution stage was divided into two phases: design of the experience and its performance. The first points to be decided for the design of the experience are the sample size and its organization into groups, as well as whether a control group will be included in the experience for a comparison of the results. Then, the experience is performed under the supervision of researchers. After performing the experience, a post-processing stage takes place. In this phase, different tests related to the objectives of each application are performed. A sociodemographic test can be used in the initial stages of the experience. Furthermore, post-assessments for subjective evaluations of the experience can be performed. At the end of the experience, the researchers should analyse the (objective and subjective) data that may have been gathered, which will determine the achievement of the objectives of the experience (Fig. 2).

Fig. 2 PRISMA flow diagram of the procedure for the selection of the papers

The papers to be reviewed were classified for their analysis in accordance with the taxonomy presented in the following sub-sections, with which the decisions behind the experiences described in each study may be better understood.

2.1 Immersive Virtual Reality

Immersive Virtual Reality (iVR) refers to the technology through which users feel fully immersed in the virtual world (Narasimha et al. 2019). It can also be defined as an extension of 3D-animation and static rendering (Bai 2022). iVR incorporates techniques that help users to feel their physical presence within the virtual environment (Won et al. 2023), and the interaction happens when the user moves or uses control devices in a natural and instinctive way (Feng et al. 2018). This review was therefore limited to academic papers published since 2015 and using iVR with HMDs. HMD versions earlier than 2015 are not considered fully immersive, as they never blocked visual access to the environment outside the virtual world, nor physically disconnected the user from the real world (Shadiev and Li 2022).

There are some previous classifications of iVR-related hardware (Anthes et al. 2016). However, a simpler classification is proposed regarding the type of device and its degrees of freedom. Firstly, the following types of devices can be found:

  • Desktop: the HMD becomes one more wired peripheral of the computer (or console in the case of PlayStation VR). The application runs on the computer and its contents are displayed on the iVR device.

  • Standalone: the device works as a stand-alone console where different applications are uploaded or downloaded. No computer is needed, although sometimes it can be connected with a cable to become a Desktop.

  • Cardboard: a holder mount that is connected to a mobile phone. It tracks head rotation, but not user movement in space. In some models, there is a button on the glasses or very simple controls to select options and for limited movement through the virtual environment.

Depending on the Degrees Of Freedom (DOF), i.e., the possible movements within the virtual space, applications can be 3DOF or 6DOF (Rossi et al. 2021):

  • 3DOF: these degrees of freedom correspond to the rotation of the head in its different axes (x, y, and z). The interaction with the environment is only possible by changing the viewing direction, i.e., looking up, down, sideways, and tilting the head from side to side.

  • 6DOF: in addition to the head rotations of 3DOF, these applications track translational movement along all three axes (x, y, and z). Therefore, the view of the virtual objects will change, depending on the user perspective. This type of movement appears more realistic and adds to the sense of immersion.

Furthermore, there are several ways to interact with the iVR application. For the classification of these types of interaction, a previous review was used as a reference (Checa and Bustillo 2020). Each of the different types are presented as follows:

  • Passive: the most limited sort of interaction, as the user can only observe events with the HMD.

  • Interactive: the user can interact with the environment, but cannot move freely.

  • Explorative: free exploration is allowed, though no direct interaction with the environment.

  • Explorative with interaction: the most complete type of interaction, as the user can explore freely and interact directly with the environment.
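The three classifications above (device type, degrees of freedom, and interaction type) can be sketched as a small data structure. This is only an illustrative sketch and not part of the review protocol: the names `DeviceType`, `Interaction`, `IvrSetup`, and `classify_interaction` are assumptions introduced here, and the two Boolean inputs are one possible reading of the four interaction types.

```python
from dataclasses import dataclass
from enum import Enum


class DeviceType(Enum):
    DESKTOP = "Desktop"        # HMD wired to a computer or console
    STANDALONE = "Standalone"  # no computer needed
    CARDBOARD = "Cardboard"    # holder mount for a mobile phone


class Interaction(Enum):
    PASSIVE = "Passive"
    INTERACTIVE = "Interactive"
    EXPLORATIVE = "Explorative"
    EXPLORATIVE_WITH_INTERACTION = "Explorative with interaction"


def classify_interaction(can_explore: bool, can_interact: bool) -> Interaction:
    """Map two yes/no properties onto the four interaction types.

    Assumption: 'explore' means free movement through the environment and
    'interact' means direct interaction with it, as in the list above.
    """
    if can_explore and can_interact:
        return Interaction.EXPLORATIVE_WITH_INTERACTION
    if can_explore:
        return Interaction.EXPLORATIVE
    if can_interact:
        return Interaction.INTERACTIVE
    return Interaction.PASSIVE


@dataclass(frozen=True)
class IvrSetup:
    """Hypothetical record describing the iVR side of one experience."""
    device: DeviceType
    dof: int  # 3 (rotation only) or 6 (rotation plus translation)
    interaction: Interaction

    def __post_init__(self) -> None:
        if self.dof not in (3, 6):
            raise ValueError("dof must be 3 or 6")
```

For instance, a 360° video viewed on a phone holder would be described as `IvrSetup(DeviceType.CARDBOARD, 3, classify_interaction(False, False))`, that is, a passive 3DOF experience.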

2.2 Physiology and wearable biosensors

Physiological studies can cast light on bodily functions, cellular reactions, human organs, and tissues, among many other aspects (Moro et al. 2021). The biosignal classifications conducted in two prior works were taken into consideration for the development of this taxonomy (Navarro et al. 2021; Sajno et al. 2022). Another reference for the biosensor taxonomy was a review of the efficacy of combining different brain-computer interfaces with immersive and non-immersive Virtual Reality (D. Wen et al. 2021). In this review, only physiological signals that can be measured using non-intrusive wearable biosensors have been considered. The classification of physiological signals, derived from both the analysis of the papers within this review and comparisons with reference articles, is presented as follows:

  • Neural activity: An electroencephalogram (EEG) monitors fluctuations in brain waves (cerebral electrical activity) (Cohen 2017). It is usually measured with electrodes placed on the scalp or with a special headset. Information on mental processes, such as attention and relaxation, is provided through this parameter (Wu et al. 2022).

  • Muscular activity: Electromyography (EMG) is a technique that measures the electrical activity of a muscle as it contracts. This parameter is typically measured through pairs of electrodes adhered to the skin (De Luca 2006).

  • Cardiovascular activity: The most common parameters related to cardiovascular activity are heart rate (HR) and blood pressure (BP). There are various types of devices for measuring cardiac activity, including bioelectrical and optoelectrical methods (S. Chen et al. 2021a, b). Cardiac activity can provide insights into the emotional state of an individual and their physical arousal (Wu et al. 2022).

  • Electrodermal activity (EDA): EDA is a measurement of electrodermal conductance and eccrine sweat gland activity (Posada-Quintero and Chon 2020). It is also known as Galvanic Skin Response (GSR). Sensors are usually placed on the body areas where most eccrine glands can be found, such as the hands, feet, and nape of the neck. This parameter provides information on the user's physical activation (J. Kim et al. 2019).

  • Respiration: The respiratory rate is usually used as a measure of user relaxation or excitation (Vanegas et al. 2020). It is usually monitored with belts placed around the abdomen or chest to measure bodily movements and their magnitude.

  • Eye movements: Eye Tracking (ET) is used to monitor pupil movements and size (Sundstedt and Garro 2022). There are two main types of measurements: electro-oculography and video-based corneal reflection. The first one uses adhesive electrodes placed over the muscles that surround the eyes to monitor the electrical activity of the skin due to muscular activity. In video-based corneal reflection, infrared light is projected onto the eyes and the light reflected off the cornea is tracked to estimate the position of the gaze.

2.3 Fields of application

The fields of application were classified on the basis of information from other reviews (e.g., Suh and Prophet 2018; Zhao 2009). However, the classification was adapted to the needs of the papers found in this review. The fields are presented as follows:

  • Psychology: Experiences related to the study of human behavior, mental processes, and mental disorders. This category encompasses experiences related to mental well-being, such as relaxation techniques or the development of interpersonal skills.

  • Medicine: Diagnostic and curative processes within the medical field, i.e., diagnosis, rehabilitation, and pre- and post-surgery processes.

  • Sports: Applications with physical activities as the main topic, also known as exergames.

  • Education: Educational experiences for learning a particular topic. Training and Prevention fields are both included in this category. Training: experiences that serve to train users in a specific task. Prevention: experiences that serve to coach users for a situation, so they know how to react to the same situation in real life.

  • Ergonomics: Experiences related to assessing an individual's interaction with various environments (e.g., urban or industrial) or services, to enhance their design so that it is better aligned with user needs and expectations.

  • Military: All experiences related to the army and law-enforcement forces, whether for training, rehabilitation, or post-traumatic stress disorder treatment.

  • Tourism: Experiences with the objective of bringing the public closer to cultural content. It also includes Marketing, with the aim of demonstrating product qualities on show for sale. Both categories were grouped together due to their low number of papers, and because the ultimate objectives of the papers were similar: to showcase a finished product to the public, be it a tangible object or a tourism product.

  • Research: All other experiences that have no particular field of application and that are in general related to iVR itself and its characteristics, such as immersion and cybersickness.

2.4 Post assessment

After an iVR experience, users are often administered a survey. Surveys generally show subjective information on the experience, the results of which shed light on the user and the application and verify whether the objective of an experience was achieved. This post-assessment process was classified into four categories:

  • Emotions: A category in which the feelings of the user during the experience were assessed. If biosensors were included in the experience, the data that were collected helped to understand the emotional relationship of the user with the application, e.g., anxiety, fear.

  • Features of the experience: To understand the opinion of the user towards the experience (e.g., cybersickness), its features are analyzed. This category includes general interviews or surveys on the experience and its development.

  • Experience of use: This category referred to the assessment of UX-related features, i.e., the way users interact with the application and whether they like it. The most frequently asked questions were related to the ease of using the application and satisfaction with its use.

  • iVR features: This category covered the evaluation of user opinions on the iVR-related characteristics of the application. The evaluations depended on the application and the device in use (e.g., degree of immersion or presence in the iVR environment), not on the experience.

3 Methodology

The PRISMA methodology was followed because it facilitates transparent documentation of the review, its purpose, the steps that the authors followed, and their findings (Page et al. 2021). Figure 2 summarizes the PRISMA procedure followed in this review. All the information concerning this methodology may be consulted in Annex 1.

Two searches were performed. The first one was “virtual” AND “reality” AND (“biofeedback” OR “feedback” OR “neurofeedback” OR “BCI” OR “electroencephalography” OR “heart” AND “rate” OR “breath”). The second one was “virtual” AND “reality” AND (“biofeedback” OR “feedback” OR “neurofeedback”) AND (“electroencephalography” OR (“heart” AND “rate”) OR “electrodermal” OR “respiration” OR “eyetracking” OR “electromyography”). Furthermore, only the articles published between 2015 and 2022 were analyzed. Searches were performed twice: in October 2021 and June 2022.
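The second search string, which is fully parenthesized, can be assembled programmatically so that its Boolean structure is explicit. The following is a minimal sketch under stated assumptions: the helper names (`quoted`, `all_of`, `any_of`, `build_second_query`, `in_review_window`) are hypothetical, straight quotes replace the typographic ones, and no database-specific field codes are added because none are stated in the text.

```python
def quoted(term: str) -> str:
    """Wrap a search term in double quotes."""
    return f'"{term}"'


def all_of(*parts: str) -> str:
    """Join sub-expressions with AND (no surrounding parentheses)."""
    return " AND ".join(parts)


def any_of(*parts: str) -> str:
    """Join sub-expressions with OR inside one pair of parentheses."""
    return "(" + " OR ".join(parts) + ")"


def build_second_query() -> str:
    """Rebuild the second search string described in the text."""
    feedback = any_of(quoted("biofeedback"), quoted("feedback"),
                      quoted("neurofeedback"))
    heart_rate = "(" + all_of(quoted("heart"), quoted("rate")) + ")"
    signals = any_of(quoted("electroencephalography"), heart_rate,
                     quoted("electrodermal"), quoted("respiration"),
                     quoted("eyetracking"), quoted("electromyography"))
    return all_of(quoted("virtual"), quoted("reality"), feedback, signals)


def in_review_window(year: int) -> bool:
    """Publication-year filter stated in the text: 2015 to 2022 inclusive."""
    return 2015 <= year <= 2022


if __name__ == "__main__":
    print(build_second_query())
```

The first search string is not rebuilt here because its final group (“heart” AND “rate” OR “breath”) has no inner parentheses, so the grouping intended by the authors cannot be recovered from the text alone.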

The searches were run only in Scopus. No further searches were performed because the number of papers retrieved was considered a sufficiently large sample. In all, the first search yielded 1,368 papers and the second one yielded 297. In addition, some articles that appeared in the bibliography of those included and other related articles were added. Finally, 560 articles were analyzed in this literature review. The sample was compiled to serve as an example of major developments over recent years.

Firstly, a list of exclusion criteria for the papers was established. When one of the exclusion criteria was applicable to a paper, that paper was automatically excluded. Using those criteria, the abstracts of the papers obtained in the searches were then reviewed to remove those affected by the exclusion criteria. In cases of uncertainty regarding any of the papers, a comprehensive review was undertaken to ensure that all criteria were followed. Once all the papers forming part of the review had been collected, they were carefully read to gather the data analyzed in the subsequent sections: the publication year and the country of the first author, the knowledge field, the application or type of application in use, the application development approach, the type of interaction of the experience, the type of Head-Mounted Display (HMD) in use, the physiological parameters that were measured, the use of biofeedback, the sample size, the presence or absence of a control group, the type of post-evaluation that was conducted, and the journal or conference in which the article had been published. All data can be consulted in Annex 2.

Following the initial classification, a second reading of the articles was conducted to prevent errors. Additionally, a brief investigation into research groups was carried out to complete missing data (e.g., the type of HMD typically used by the research group). If this information could not be located, it was documented as "Not stated" to prevent potential errors.
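The data-extraction step described above can be pictured as filling one record per paper, with any value that could not be located documented as "Not stated". The sketch below is illustrative only: the field names in `FIELDS` and the function `normalize_record` are assumptions that paraphrase the items listed in the text, not the actual column names of Annex 2.

```python
NOT_STATED = "Not stated"

# One entry per item extracted from each paper (names are hypothetical).
FIELDS = (
    "year",
    "country",
    "knowledge_field",
    "application",
    "development_approach",
    "interaction_type",
    "hmd",
    "physiological_parameters",
    "biofeedback",
    "sample_size",
    "control_group",
    "post_evaluation",
    "venue",
)


def normalize_record(raw: dict) -> dict:
    """Return a complete record, marking missing or blank values explicitly."""
    record = {}
    for name in FIELDS:
        value = raw.get(name)
        is_blank = isinstance(value, str) and not value.strip()
        if value is None or is_blank:
            value = NOT_STATED
        record[name] = value
    return record


if __name__ == "__main__":
    # Invented values, used only to show the behavior.
    example = normalize_record({"year": 2021, "hmd": "", "sample_size": 30})
    print(example["hmd"])  # the blank HMD entry becomes "Not stated"
```

Using an explicit marker rather than an empty cell keeps "unknown" distinguishable from "absent", for example a paper with no control group versus a paper that does not say whether one was used.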

Below is the list of exclusion criteria for the articles in this review:

  • No iVR with HMD used in the experience. All the papers that described the use of Augmented Reality, Mixed Reality, CAVEs, Second Life, flat screen applications and screen projections were therefore excluded. CAVE-based solutions, although proper iVR applications, were excluded due to their high cost, which limits their democratization and general use.

  • Papers containing no direct results; for example, papers focused on the design rather than the use of the experience.

  • Review articles, comments, editorials, conference records, or any other type of communication other than articles in which experimental results are reported.

  • The content of the papers was unrelated to the topic of this review.

  • Physiological measurements were not part of the experience.

  • The experience was repeated or was complementary to another. In this case, only the most recent paper or the one with the best results was included.

  • The papers were not written in English.

  • The papers were focussed on the design and the development of hardware, including comparisons of hardware types, such as different HMDs or AR/VR.
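Since a single applicable criterion was enough to exclude a paper, the screening rule amounts to a logical OR over the eight criteria listed above. A minimal sketch follows, assuming one hypothetical Boolean flag per criterion; the class `ScreenedPaper` and its attribute names are introduced here for illustration and do not come from the review.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScreenedPaper:
    """Hypothetical screening flags; defaults describe an includable paper."""
    uses_ivr_with_hmd: bool = True
    reports_direct_results: bool = True
    is_experimental_article: bool = True
    on_topic: bool = True
    measures_physiology: bool = True
    is_repeat_of_included_paper: bool = False
    written_in_english: bool = True
    is_hardware_focused: bool = False


def exclusion_reasons(paper: ScreenedPaper) -> List[str]:
    """List every exclusion criterion that applies to the paper."""
    checks = [
        (not paper.uses_ivr_with_hmd, "no iVR with HMD"),
        (not paper.reports_direct_results, "no direct results"),
        (not paper.is_experimental_article, "not an experimental article"),
        (not paper.on_topic, "unrelated to the topic"),
        (not paper.measures_physiology, "no physiological measurements"),
        (paper.is_repeat_of_included_paper, "repeated or complementary"),
        (not paper.written_in_english, "not written in English"),
        (paper.is_hardware_focused, "hardware design or comparison"),
    ]
    return [reason for applies, reason in checks if applies]


def is_excluded(paper: ScreenedPaper) -> bool:
    """A paper is excluded as soon as one criterion applies."""
    return len(exclusion_reasons(paper)) > 0
```

Returning the list of reasons, rather than only a yes/no answer, would make it possible to report how many papers were removed by each criterion, as a PRISMA flow diagram typically does.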

4 Results

4.1 Demographics

4.1.1 Year of publication

As stated in Sect. 2.1, one criterion for inclusion was that the papers had been published since 2015. In Fig. 3, a comparison of the number of VR-related papers is shown alongside those that also include biosensors. In the graph, the grey line represents the total number of papers that included the term “Virtual Reality” in their research per year. The two dashed lines depict the percentage of those VR articles that also used biosensors: the total of VR&BS papers (% VR&BS total) is represented by the dashed green line, and those analyzed in this review (% iVR&BS analyzed) are represented by the dashed orange line.

Fig. 3 Trends since 2015 of Virtual Reality (VR) publications with and without Biosensors

As outlined in Fig. 3, the number of papers on general VR experiences and biosensors is increasing. Less costly HMDs, and a deeper general interest in this type of technology, may explain this academic interest. In addition, the use of biosensors has increased, as they have become more accessible. Nevertheless, the percentage of articles that use biosensors with iVR applications remains low (less than 2%). It may be either because the use of biosensors is at an early stage, or because not enough is known to optimize performance with the existing procedures. However, between 2015 and 2021, the proportion of papers using biosensors combined with iVR increased by 0.8%. Although this growth is slow, its positive tendency is expected to continue.

In 2020, there was a slight decrease in the number of publications, due to the COVID-19 pandemic when testing many people was not feasible. However, 2021 showed a peak which may have been due to a backlog of experiences that could not be carried out in 2020, added to those completed in that same year. Articles published in 2022 were not included because most of those papers were not accessible at the time of writing this review.

4.1.2 Keyword connections

A matrix of bibliometric data from the selected papers was created with the Bibliometrix software (Aria and Cuccurullo 2017), using clustering and network analysis methods, in order to establish initial correlations. Consequently, a graph was generated showing the interconnections between the SCOPUS keywords. Those keywords were generated with an algorithm that detected words and phrases repeated in the titles of the references of each paper. Words directly associated with the content of each paper were extracted in the process. Figure 4 displays the keywords separated into clusters and their correlations.

Fig. 4 Correlations among the SCOPUS keywords

The first conclusion drawn from Fig. 4 is that Virtual Reality serves as a central axis (purple circle) and is strongly connected to human factors (green circles: human, male, female, adult…). The large number of words related to human age underlines the clear need to optimize the VR applications for each target group. Many of the keywords that appear are directly related to two knowledge fields included in the green cluster: Psychology (stress, psychology, anxiety, and virtual reality exposure therapy) and Medicine (pain, pain measurement, and pathophysiology). The keywords within the red circles are related to physiological parameters (heart rates, physiological models and electroencephalography), which are the focus of research on the evaluation of physical functions through the monitoring of bodily activity. Heart rate is already established as the most common physical function for monitoring (high repetition in many clusters). The highest dispersion of the types of biosensors in use appears in the blue cluster (general research-related papers rather than papers on certain fields of application of the VR experience), which was foreseeable given the range of biosensor types and the fact that this research was not focused on any one particular application. Finally, there were some keywords related to the extraction of subjective data (surveys and questionnaire), a compulsory activity in VR applications where usability plays a major role in their successful application and in educational applications where learning might not be measured in other direct ways. Nonetheless, there were no keywords related to the design of the experiences or to the development of applications, pointing to the lack of research that is directly focused on VR design methodologies. All of these conclusions are quantitatively reflected throughout the paper in the statistical analysis of the survey and are aligned with the conclusions drawn in each of the following sections.

4.1.3 Fields of application

Figure 5 shows the number of papers analyzed in each field. The areas have been categorized into two groups according to the number of papers they contain. Areas with more than 40 papers belong to High-use fields, while those with fewer than 40 papers are in the Low-use fields. Both are highlighted with an orange or yellow dashed rectangle in the figure.
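The grouping rule can be written as a simple threshold test. The sketch below is illustrative: the function names are assumptions, the counts in the usage example are invented rather than taken from Fig. 5, and a field with exactly 40 papers (a case the rule above leaves open) is placed in the Low-use group purely by assumption.

```python
from typing import Dict, List

HIGH_USE_THRESHOLD = 40  # papers; threshold stated in the text


def usage_group(paper_count: int) -> str:
    """Return the group of a field from its number of papers.

    Assumption: exactly 40 papers counts as Low-use, since the text only
    defines 'more than 40' and 'fewer than 40'.
    """
    return "High-use" if paper_count > HIGH_USE_THRESHOLD else "Low-use"


def group_fields(counts: Dict[str, int]) -> Dict[str, List[str]]:
    """Split fields into the two groups, keeping the input order."""
    groups: Dict[str, List[str]] = {"High-use": [], "Low-use": []}
    for field_name, count in counts.items():
        groups[usage_group(count)].append(field_name)
    return groups


if __name__ == "__main__":
    # Invented counts, used only to show the behavior.
    print(group_fields({"Field A": 120, "Field B": 12}))
```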

Fig. 5 Number of papers per field

There are four areas included in the High-use fields group, a category that encompasses the fields associated with healthcare applications and research activities. The utilization of biosensors is more firmly established in these fields due to their connection with the monitoring of individuals' physiological signals and the quest for new solutions through the integration of iVR and biosensors. Most papers were found in the field of Psychology (38.2%), as biosensors have traditionally been employed within that field, and their usage is well defined and studied (Mancuso et al. 2022). The most common use was to study the influence of iVR on reducing anxiety (e.g., Maples-Keller et al. 2017) and stress levels (e.g., Bourassa et al. 2020) through different stimuli. Research was the second field with the highest number of papers (26.4%). Those papers reported studies on the potential and the characteristics of iVR, for instance, the effects of cybersickness (e.g., Garcia-Agundez et al. 2019) and the sensation of immersion with iVR devices (e.g., Skarbez et al. 2021). As the objectives varied depending on each research project, the utilization of biosensors shows considerable diversity within this field. Medicine was the third most frequent field with 14.6% of all papers due to the extensive expertise of professionals at monitoring the physiological signals of patients (Haleem et al. 2021). Its experiences mainly tested the effect of using iVR for reducing pain thresholds during medical procedures (e.g., Menekli et al. 2022) and physical rehabilitation (e.g., Winter et al. 2021). The following field with a notable proportion of papers was Sports (7.7%), which may be attributable to the fact that exergames were a previously established category. Biosensors were used for objective monitoring of individual physical arousal, activity levels, and heart rate during exercise (e.g., Burin et al. 2020).

The Low-use fields group encompassed the remaining areas where the utilization of biosensors was not as firmly established. Education and Training was the field with most studies (3.8%). The most common objectives of those papers were to test the utility of iVR for learning (e.g., Stavroulia et al. 2019), as well as to simulate real-life situations, so as to learn how to manage them outside the virtual environment (e.g., Chen et al. 2021a, b). Biosensors were not commonly employed in this field because learning outputs are more readily measurable through a practical exercise or a post-experience test (Checa et al. 2021). In the field of Ergonomics (3.9%), biosensors were mainly used to assess user comfort in an environment (e.g., Zhang et al. 2021). The aim was to create safer and more comfortable virtual environments that can be reproduced in real life in relation to interior design (e.g., Chinazzo et al. 2021), urban planning (e.g., Birenboim et al. 2021), and industry (e.g., Xiao and Cheng 2020). Military was one of the smallest categories (1.8%). Biosensors were included in experiences of military training and state agencies (e.g., Muñoz et al. 2020), as well as experiences whose main objective was to treat military personnel for various illnesses, such as Post-Traumatic Stress Disorder (PTSD) (e.g., Binsch et al. 2021). Finally, the field of Tourism and Marketing amassed the lowest number of articles (0.5%). Biosensors were generally used within that field to identify areas for improvement, by testing user reactions to final products (e.g., Luangrath et al. 2022), museum tours (e.g., Marchiori et al. 2018), and commercial goods (e.g., X.-T. Huang et al. 2020b).

4.1.4 Types of biosensors

Figure 6 shows the use of biosensors in different fields. In this figure, the total number of biosensors is represented, instead of the total number of papers, because more than one type of biosensor is used in many of them.
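The difference between counting papers and counting biosensors can be made concrete with a short tally: each paper contributes one count per distinct biosensor type that it uses, so the total number of uses can exceed the number of papers. The sketch below is illustrative; the function name `biosensor_shares` and the three-paper sample in the usage example are invented and do not reproduce the data behind Fig. 6.

```python
from collections import Counter
from typing import Dict, Iterable, List


def biosensor_shares(papers: Iterable[List[str]]) -> Dict[str, float]:
    """Percentage of all biosensor uses accounted for by each type.

    Each inner list holds the biosensor types used in one paper; a type
    repeated within the same paper is counted once.
    """
    counts = Counter(sensor for sensors in papers for sensor in set(sensors))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {sensor: round(100 * n / total, 1) for sensor, n in counts.items()}


if __name__ == "__main__":
    # Invented sample: three papers, four biosensor uses in total.
    sample = [["cardiovascular", "EDA"], ["cardiovascular"], ["EEG"]]
    print(biosensor_shares(sample))
```

In this invented sample, cardiovascular activity appears in two of the three papers but accounts for two of the four uses, which is why percentages of this kind are shares of biosensor uses rather than shares of papers.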

Fig. 6 Quantity of biosensors used per field

As Fig. 6 shows, cardiovascular activity was the most common physical parameter measured in all fields (53.3% of the total). Its extended use was due to the simplicity of its data representation, the affordability of the devices, and its low intrusiveness for the users in comparison with other devices. It was widely used in Sports, because it provides information on physical arousal, which was the main objective of most exergames (e.g., Rutkowski et al. 2021). In Psychology and Medicine, it was generally used to measure stress and anxiety levels (e.g., Dings et al. 2021; Wright et al. 2022; respectively). In Tourism and Marketing, Education and Training, and Ergonomics, cardiovascular activity devices were used to measure user comfort in different environments (e.g., Huang et al. 2020a; Y. Huang et al. 2022; Saeidi et al. 2021, respectively). Generally, cardiovascular activity was not the only parameter measured in the experiences. Electrodermal activity, shown in dark blue, was also a widely measured physiological parameter (17.5%). It provides information on the user’s level of physical arousal and tension. Electrodermal activity was used most of all in three fields (20%): Psychology, Ergonomics, and Military. These categories coincide with those in which cardiovascular activity was measured most of all. This overlap could be attributed to the fact that, in at least 73 papers, the same device was described as being used to acquire both types of data (e.g., Yeom et al. 2021), or that the two parameters were measured simultaneously because they offer analogous information about the physiology of the individual (e.g., Mihara et al. 2022).

While some parameters are easy to measure and interpret, others, such as neural activity (in yellow), are complex to measure, and the devices for their measurement are uncomfortable to wear. However, EEG provides extensive information that can be very useful in the experiences where it was used (14.2%). In both Education and Training and Psychology, EEG measured attention, to ascertain whether the user was focused on the task (e.g., Parong and Mayer 2021; Delvigne et al. 2022, respectively). In Psychology, EEG was also used to classify emotions (e.g., T. Y. Wen and Mohd Aris 2022) and to examine the areas and the brain waves that different stimuli can activate (e.g., Kaur et al. 2019). In Medicine, EEG was employed for the rehabilitation of individuals with neurological disorders (e.g., Arroyo-Ferrer et al. 2021). In Ergonomics, EEG was employed to analyze the emotional response of individuals within various environments (e.g., Z. Zhang et al. 2021). In contrast, EEG was not mentioned in the papers on Tourism and Marketing, because the device was neither comfortable nor in line with the main objective of the papers within that field.

Respiration (7.8%), represented in green, was mostly measured in areas where user relaxation levels are important, such as Psychology and Medicine. In those fields, it could be useful to teach users how to control their breathing, in order to reduce pain (e.g., Felix et al. 2021) and anxiety (e.g., Nouri et al. 2022). In Sports, respiration was used to measure the level of physical arousal and fatigue during exergames (e.g., Charoensook et al. 2019), and in Education and Training, it was helpful to know whether users could control their physiological parameters in different situations (e.g., Michela et al. 2022). Eye tracking, shown in light blue, was measured in most areas (5.1%). More widespread usage was expected, due to the built-in eye tracking systems that can now be found in some HMDs. However, the variables that can be measured (e.g., fixation, saccadic movements, and pupil dilation) are frequently difficult to interpret, as they require the assistance of techniques such as Machine Learning to extract the information (Serrano-Mamolar et al. 2023). In Ergonomics, eye tracking was often included to understand user gaze behavior within urban areas (e.g., Birenboim et al. 2021). In Medicine, it was used in processes related to vision disorders (e.g., Leitner et al. 2021). Finally, the electromyogram, represented in purple, was rarely mentioned (2.2%). It was used in the fields of Psychology and Medicine to study the relation between user arousal levels and their moods (e.g., Jacob et al. 2022; Kaminska et al. 2020). In other fields, it complemented the data of other biosensors.

4.2 Virtual reality development

4.2.1 Head mounted displays

Over the years, new types of HMDs have been introduced to the market, as is shown in Fig. 7. The new generation of HMDs is increasingly used, because of its higher quality and affordable cost. In the past, HMDs were less affordable, which restricted their use to the fields of Medicine and Psychology.

Fig. 7 Type of HMDs used per year

Desktop, including 3DOF and 6DOF, was the most widely used HMD type (70.5%). Desktop HMDs present some advantages in comparison with the other HMDs: a large space to run the applications is not necessary; they are connected to a computer, which offers higher computational capacity; and performance is easy to assess because developers and supervisors can track user VR views. Desktop is divided into 3DOF and 6DOF. The first group includes obsolete devices, such as eMagine Z800, Fove 0, nVisor and Oculus Rift DK1. In contrast, 6DOF offers higher immersion levels and sense of presence, which are especially important nowadays to develop higher-quality interactive experiences (Rossi et al. 2021). The best-known Desktop 6DOF HMDs are HTC Vive, HTC Vive Pro, Oculus Rift, Oculus Rift CV1, Oculus Rift DK2, Oculus Rift S, PlayStation VR, Samsung Odyssey VR and Valve Index. Furthermore, in recent years, the use of the Standalone group has grown. Some of the most popular devices within this category are Oculus Go, Oculus Quest, Oculus Quest 2, and Pico Neo 2. The advantage of these devices is that their operation requires no computer. While these devices are continuously improving in quality and in their capacity for running applications, they have yet to reach the level of Desktop 6DOF systems (Sadek Hosny et al. 2020). In contrast, the Cardboard group, which includes devices such as Samsung Gear, is rapidly becoming stagnant. Their dependency on mobile devices and low user interaction with the environment (e.g., selecting options and limited movement around the environment) make the experiences entirely Passive. They are, however, suitable for 360° videos (e.g., Kim and Jeon 2021). Although the use is similar for Cardboard and Desktop 3DOF, the latter are better because they are connected to a computer, with all the advantages that entails.

In addition, Fig. 8 shows that HMD use is dependent on its area of use.

Fig. 8 Type of HMDs used per field

Desktop devices were the most widely used HMD types within the High-use fields group, especially the 6DOF ones. This may be due to a professional need within Psychology (59.9%) and Medicine (39%) to observe what users were seeing, in order to analyze and to guide their behaviors (e.g., Krisch et al. 2020; Winter et al. 2021, respectively). Cardboard devices were the next most frequently used HMDs in these fields and were commonly used for 360° video experiences (e.g., Mladenovic and Djordjevic 2021), while Standalone HMDs were less used (3.9% in Psychology and 11% in Medicine). In Sports, Desktop 6DOF HMDs were used in 59.5% of all the experiences that were analyzed, which was not expected, as it was thought that wireless devices might have been used more for user convenience. This may be because sensors work better when incorporated with HMDs connected to a computer (e.g., Schormann et al. 2020). Meanwhile, the usage of Cardboard and Standalone devices was 24%. In these experiences, users only see the environment when playing a virtual sport, so there was hardly any interaction with the application (e.g., Calogiuri et al. 2018).

The contrary happens in the case of the Low-use fields group. The use of Cardboard devices in the field of Education was of no interest, as interaction was so limited. The Desktop devices were the most frequently used (82.9%), as the educator or trainer can observe what users do in the virtual environment (e.g., Awada et al. 2021). A similar situation was observed in the Military field. Desktop 6DOF devices were used for user-training experiences (70%) (e.g., Caserman et al. 2022), and Cardboard HMDs (10%) were used in those experiences related to reducing PTSD by watching 360° videos (e.g., Park et al. 2017). Finally, in Ergonomics and Tourism and Marketing, Cardboard and Standalone devices were not used, because user reactions to what the users were looking at could not be seen with those devices. Therefore, Desktop 6DOF HMDs were preferred in almost every paper (e.g., Luangrath et al. 2022; Erkan 2021).

4.2.2 Application development

iVR application development differed in each case. In 14.8% of the studies under analysis, the application was purchased, for two reasons: the expense of application development, and the research group's level of knowledge of iVR application development. In quite a few studies (17.5%), there was a preference for the use of 360° videos, to avoid either developing or purchasing the application, or there was collaboration with other departments, or the development was outsourced. In studies where none of those options was preferred, the application development process was not described, and they were categorized as Not stated (23.9%).

However, some research groups preferred to develop their own applications. The game engine Unity was usually employed (37.5%), although it requires programming knowledge. Unreal Engine was the second most widely used engine (3.2%), due to the ease of programming with nodes. It is also replacing Source (1.4% of use), which is becoming obsolete. This distribution of the options in application development aligned with the results obtained in other reviews (e.g., Checa and Bustillo 2020).

4.2.3 Interaction with the application

The experiences were classified according to their type of interaction, as explained in Sect. 2.1. Figure 9 shows this classification and the type of HMD used in the experience.

Fig. 9 Type of HMDs per form of interaction

Most of the experiences were Passive (41.6%) or Interactive (40.5%), as their development requires less work, time, and cost than Explorative (7.3%) and Explorative with interaction (9.3%) experiences (Checa and Bustillo 2020).

Passive experiences require neither interaction nor programming (e.g., 360° videos, simple environments, with no interaction). Indeed, 46% of these experiences only reproduce 360° videos (e.g., Brivio et al. 2021). Hence, this type of experience is optimal for relaxation simulations, where the movement of the user is not considered important for the objectives of the experience (e.g., Kim et al. 2021). That reason, the low price of Cardboards, and their simplicity all explain why Cardboards were commonly used. However, the most widely used devices were Desktop HMDs (17.7%), a type of HMD that professionals prefer so that they can follow the user gaze (e.g., Brundage et al. 2016). In contrast, Standalone was used far less (7.3%). Together with Cardboards, Standalone HMDs were commonly used in healthcare settings where the device needs to be worn comfortably, for example, to reduce patient pain and anxiety levels (e.g., Alaterre et al. 2020). The second most common type was the Interactive experience, for which the most common devices were Desktop (81%). In those experiences, Desktop devices can, through computer connections, incorporate other types of interaction such as exercise bikes, platforms, and biosensors (e.g., Winter et al. 2021). In contrast, Standalone HMDs were hardly used (2.2%), as interaction is only possible with the controllers (e.g., Shoko et al. 2021). The same applies to Cardboard HMDs, in so far as they can only be used in experiences with limited interaction and movement (e.g., Pallavicini et al. 2019). Such applications are more expensive to develop than Passive ones, but have some advantages over them, such as interaction through biosensors (e.g., Blum et al. 2020). For this reason, they were chosen to be developed despite their cost.

The two Explorative types had the lowest number of experiences. In both, the Desktop devices were the most widely used (73.2%), because they can run larger and more detailed environments (e.g., Irshad et al. 2021). In contrast, Standalone (14.6%) and Cardboard (2.4%) HMDs were less used, in view of their limited storage capacity and mobility (e.g., Adhyaru and Kemp 2022; Varela-Aldás et al. 2019, respectively). The same was observed when the Explorative experiences were also interactive. Those sorts of experiences are the most difficult to develop, because they require advanced development and lengthier investment times. In addition, their larger size means that the experiences are harder to run with Standalone (1.9%) and Cardboard (5.8%) devices. Desktops were therefore the preferred devices (86.5%), and they can also be connected to other devices (e.g., Alyan et al. 2021).

Moreover, the objective of an experience defines the type of interaction, and the most suitable HMD to use. Figure 10 shows the most common types of interaction in each field.

Fig. 10 Type of interaction per field

In the field of Psychology, most experiences were Passive (43.7%). Many of them were used to display 360° or relaxation videos that helped the user to meditate and to relax, or to observe user reactions within a virtual situation (e.g., Crosswell and Yun 2022). Interactive experiences were also commonly used (40.3%). Interaction was usually included as biofeedback, which modified the application according to how the user was participating (e.g., Lan et al. 2021). In contrast, Explorative applications were the least used (6.9%, and 9.1% with interaction). The objective of that type of application was to assess emotional responses to a situation (e.g., Tardif et al. 2019). In the field of Medicine, Passive experiences excelled (64.2%), due to their non-intrusive nature. They were therefore ideally suited for medical treatments such as surgery (e.g., Alaterre et al. 2020) and ICU recovery (e.g., Gerber et al. 2017). The next most common experiences were the Interactive ones, which were most extensively used in cases of rehabilitation and when including biofeedback (e.g., Winter et al. 2021), as in Psychology. The least used types were the Explorative experiences (3.7%, and 1.2% with interaction). With no interaction, they were used to examine user performance in different scenarios, for instance with a wheelchair (e.g., Younis et al. 2019) and through a natural environment (e.g., Ashley Verzwyvelt et al. 2021). Explorative with interaction experiences were used, for instance, to study how individuals with autism developed social skills (e.g., Kotsopoulos et al. 2021). Unlike Psychology and Medicine, Sports hardly included Passive experiences (14.3%). These applications were designed to play only 360° videos while the user was exercising to show the level of immersion in the sport (e.g., Kim and Lee 2018). The most common experiences were the Interactive ones (73.8%), because they provided a sense of playing sport (e.g., Cao et al. 2021) and usually included biosensors, platforms, and other forms of interaction, despite the absence of free exploration. In contrast, the user could move freely while exercising in Explorative experiences (e.g., Varela-Aldás et al. 2019), though those experiences were less common (2.4%), even when including interaction (9.5%) (e.g., Dębska et al. 2019).

Like Sports, Education hardly included Passive experiences (11.8%), which were only used to watch educational content without interaction (e.g., Parong and Mayer 2021). In some cases, biosensors were included to study the attention that users paid to the educational content (e.g., Hubbard et al. 2017), which was recorded as Interactive (35.3%). In addition, they were commonly used to teach a specific task (e.g., Narciso et al. 2020). The other types of interaction, Explorative (20.6%) and Explorative with interaction (32.4%), were within the category of Prevention. These applications proposed several tasks to the user, such as evacuating a building (e.g., Chen et al. 2021a, b) or escaping a fire (e.g., Kostakos et al. 2021). The most common interactions within the field of Ergonomics were Passive (59.1%), whose main objective was to study the reactions to urban (e.g., Z. Zhang et al. 2021) and indoor environments (e.g., Yin et al. 2020). Interactive applications (27.3%) adapted the environment to the user with biosensors (e.g., Xiao and Cheng 2020) or studied what makes the user feel comfortable (e.g., Li et al. 2021a, b). Explorative experiences (13.6%) had the same objective but gave the user freedom within the environment (e.g., Birenboim et al. 2021). The same occurred within the Military field: the most common interaction type was Passive (40%). Soldiers watched war-related 360° videos while their reactions were analyzed (e.g., Park et al. 2017). Interactive experiences were less common (30%), and their objective was to observe how individuals perform in various situations (e.g., Muñoz et al. 2020). In contrast, Explorative experiences were rarely used (10%). They were used to study the relationship between the way soldiers moved in a war scenario and their emotions (e.g., Malta et al. 2021). The objective was the same as for the Explorative with interaction experiences (20%) (e.g., Binsch et al. 2021). In the fields of Tourism and Marketing, Passive was the only type of interaction used, because the objective was to show a final project to users and to study their reactions (e.g., Luangrath et al. 2022).

4.3 Experience design

4.3.1 Sample size and control group

Figure 11 shows the sample size of each experience, together with the nature of the control group (pie chart). The orange dashed line in Fig. 11 represents the average number of participants: 45.

Fig. 11 Sample size, sample average, and information on the control group

Experiences with fewer than 20 participants were the most common (33.2%). This size was not expected, as more participants are required for a good validation of an experience and reliable results. As shown in the figure, the larger groups were the least common. The pie chart included in Fig. 11 shows the number of experiences with control groups. Control groups were used to compare results, so that conclusions can be drawn. Despite the benefit that a control group brings to the reliability of the results, most of the experiences (62.3%) included no control group. This may be because many experiences that included biosensors had the objective of testing and investigating the devices, instead of comparing the results between groups. In addition, the design of experiences was among the most common topics of the articles under review. In some cases, those designs were tested, in order to conduct further experiences. One example is generating algorithms and extracting user data to train the algorithms and to verify their effectiveness, so that control groups are not necessary. The control group experiences represented 37.1% of all experiences and were subdivided into two groups. The first one (20.4%) compared an iVR-based group with a group that performed the same experience in a different way (i.e., using a flat screen or real interventions instead of HMDs). The second group (16.8%) tested different iVR-related variables, and the results were compared across several groups.

4.3.2 Pre-tests and post-tests of the experiences

In most of the experiences, a pre-test was included to collect the demographic data of the users (i.e., age, gender, educational level, and previous experience with iVR). The data were used to evaluate whether any of the demographic factors conditioned subsequent performance. As that is a widespread and general practice, it was not analyzed in this review. In contrast, the use of post-tests was analyzed. The main objective of those surveys, performed in 80.9% of the papers, was to study the development of the experiences by evaluating the responses of users to several types of questions defined in Sect. 2.1. Figure 12 represents the total number of questions which assessed each topic under analysis in the post-test experiences. Note that more than one topic can be evaluated in a post-test.

Fig. 12 Diagram of the types of questions in post-tests

As Fig. 12 shows, the topic of Emotions was the most widely evaluated (31.2%), almost twice as often as each of the following three topics, which were similarly represented: Experience features, Use of experience, and Virtual Reality features. The Emotions category covered anxiety, stress, fear, self-perception (or how users evaluate their mood during the experience) and other emotions. It was considered important to know how the user felt during the experience in the areas which assessed emotions, such as Psychology, Medicine, and Education and Training. The second most evaluated topic was Experience features (19.9%). In that category, user opinions were evaluated on the characteristics of the experience (i.e., cybersickness, and sense of embodiment). They were assessed through surveys and interviews. In addition, this topic was used in most fields, with the exception of Tourism and Marketing. The Use of experience category was the next most evaluated (18.5%), and it was used to investigate user opinions on the use of the devices and applications (Usability topic), and to find out whether they enjoyed it (Satisfaction topic). It was included in most fields, as user opinions are important to improve the applications and the experience.

In contrast, the Virtual Reality features topic was less common (14.6%). In this category, iVR aspects such as the feeling of presence, the sense of immersion, and level of realism were evaluated. The assessment of those features meant that the quality of the iVR experience could be verified in relation to whether it was sufficient to meet the objectives. Aspects that could not be included in any of the previous topics were gathered under another category (4.3%), which was more common in the fields of Medicine and Psychology. Some examples proposed in these areas were personality tests, in order to understand user behavior during the experience, and user self-assessment of their perceived level of pain, to study whether they were distracted with iVR. Finally, no use of a post-test was mentioned in 11.6% of the studies, which was therefore recorded as ‘Not stated’.

4.4 Biofeedback experiences

The objective of this review, a broad overview of the potential of iVR applications used in combination with wearable biosensors, also included the analysis of biofeedback experiences. Biofeedback refers to the measurement and assessment of user physiological parameters when task performance is demanded. Sensors record data, which are then processed to provide feedback to users (Navarro et al. 2021) from the application, thereby creating a personal and unique user experience. Most of the applications described in this review collect data, though the data were used for biofeedback in very few of them: user-adapted biofeedback was mentioned in only 98 articles.
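The loop described above (record, process, feed back) can be sketched in a few lines. This is only an illustration under stated assumptions: the class name, the moving-average smoothing, the baseline and maximum heart rates, and the fog-density scene parameter are all hypothetical choices, not a method reported in the reviewed articles.

```python
from collections import deque


class BiofeedbackMapper:
    """Map a stream of heart-rate samples to a 0..1 control value for the scene."""

    def __init__(self, baseline_bpm, max_bpm, window=5):
        if max_bpm <= baseline_bpm:
            raise ValueError("max_bpm must exceed baseline_bpm")
        self.baseline_bpm = baseline_bpm
        self.max_bpm = max_bpm
        self.samples = deque(maxlen=window)  # moving-average window

    def update(self, bpm):
        """Add one sample and return the smoothed, normalized arousal level."""
        self.samples.append(bpm)
        smoothed = sum(self.samples) / len(self.samples)
        level = (smoothed - self.baseline_bpm) / (self.max_bpm - self.baseline_bpm)
        return min(1.0, max(0.0, level))  # clamp to [0, 1]


def scene_fog_density(arousal_level):
    """Hypothetical scene parameter: thicker fog when arousal is higher."""
    return round(0.1 + 0.8 * arousal_level, 3)


# Assumed values: resting rate of 60 bpm, upper bound of 120 bpm, 3-sample window.
mapper = BiofeedbackMapper(baseline_bpm=60, max_bpm=120, window=3)
levels = [mapper.update(bpm) for bpm in (60, 90, 120, 150)]
```

Smoothing before normalizing keeps a single noisy sample from abruptly changing the environment, which is one plausible design choice among many.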

The data are shown in Fig. 13: most of the areas include biofeedback, although WBS were used in only 17.3% of all experiences either to modify or to interact with the application. These results may be due to a lack of clarity on how biofeedback can contribute to iVR applications.

Fig. 13 Use of biofeedback per field

As represented in Fig. 13, High-use fields were those where biofeedback was used more: 20.3% in Psychology, 16% in Medicine, and 18.6% in Sports. In Psychology, biofeedback was used to change environments according to the mood of the user (e.g., Liao et al. 2018); to help the user to relax through, for example, controlled breathing (e.g., Min et al. 2020); and even to enhance self-awareness and empathy by listening to user (or partner) heartbeats in real time (e.g., Lan et al. 2021). In the field of Medicine, users can modify stress and pain levels through relaxation while focusing the mind on their physiological responses with the help of biofeedback (e.g., Fominykh et al. 2018). Furthermore, other experiences seek to provide a specialized motor imagery service for patients using neural sensors (e.g., Spicer et al. 2017). As part of their rehabilitation, patients try to control the application through the EEG. In Sports, biofeedback was mainly applied to adapt the level of exercise to user needs (e.g., Li and Chen 2020) or, in the case of athletes, to monitor their data when exercising (e.g., Setiawan et al. 2018).

On the contrary, biofeedback was less common in the Low-use fields. In the area of Education and Training (8.6%), the proposed applications were designed to adapt the pace of user learning (e.g., D. Huang et al. 2020a), so the use of biofeedback is expected to increase. Biofeedback was not included in the fields of Military and Tourism and Marketing, because its implementation in those fields has yet to be fully explored, even though data-extraction sensors can be found. Something similar occurs in the field of Ergonomics where real-time biofeedback was hardly used (4.5%), as most experiences were simply intended to observe user responses to the environment without trying to modify them.

4.4.1 Design of biofeedback experiences

The design of experiences with biofeedback can differ in each field. Figure 14 tries to shed some light on those differences. The type of interaction (one-third of the circle represented with circular columns of warm colors), HMDs (one-third of the circle represented in shades of green), and wearable biosensors (one-third of the circle represented in shades of blue) are shown in Fig. 14 for each field of application. The radius of the gray circle represents the total of the elements. 

Fig. 14 Influence of experience type (third with warm colors), device (third with green tones) and biosensors (third with blue tones) on biofeedback experience design in each field

As shown in Fig. 14, Interactive applications were the most common type of interaction in Psychology (70.2%), because biofeedback was the form of interaction in these applications (i.e., controlling the application with the level of relaxation; e.g., Soyka et al. 2016). This was the only field in which Passive experiences appear in this section (e.g., Finseth et al. 2022). There was a preference for the Desktop 6DOF devices (72.3%) and their wide range of interactions. The Desktop 3DOF (6.4%) and Cardboards (10.6%) were used in Passive experiences with relaxing videos and little or no interaction. The most frequently used wearable biosensors measured HR (36.8%), EEG (27.9%), respiration (23.5%), and EDA (10.3%). Respiration was primarily evaluated in applications related to mental well-being and relaxation (e.g., Lan et al. 2021).

In the field of Medicine, almost all the experiences were Interactive (92.8%), as the interaction was biofeedback-based (e.g., McClinton et al. 2019). The other type of interaction was Explorative, whose objective was user rehabilitation using EEG (e.g., Younis et al. 2019). Desktop 3DOF HMDs were more widely used than in other fields, which may be explained by the high expenditure levels on medical equipment and a preference to purchase new models as soon as they appear on the market. However, the most widely used devices were Desktop 6DOF HMDs (64.3%). Cardboard was only used in one experience to assess visual field alteration in conjunction with an external eye tracking device (Leitner et al. 2021). EEG biosensors were widely used (47.1%), due to expanding knowledge of brainwave analysis within this field (e.g., Spicer et al. 2017), as was HR (23.5%), which was preferred as an in-app interaction method (e.g., Fominykh et al. 2018).

In the field of Sports, all experiences were Interactive, as biofeedback was used for interaction, and the feedback modified the application while the user performed a sport (e.g., Schormann et al. 2020). For those applications, Desktop 6DOF devices (50%) were mainly used, because they allow the connection of other devices such as exercise bikes and biosensors (e.g., Li and Chen 2020). When using Standalone or Cardboard HMDs, there was a wireless wristband connection to relay HR measurements for use in the application (e.g., Lai et al. 2020). HR was measured in this field more than any other parameter (80%), because it was the physiological parameter which best represented physical arousal, and its outgoing data were easily interpretable by the application. EDA was less commonly measured (20%), but the data were always complemented with cardiac activity.
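A simple rule of the kind described above, in which HR feedback modifies the application during exercise, might look as follows. The function name, the target heart-rate zone (120 to 150 bpm), and the ten resistance levels of an exercise bike are assumptions made for illustration; the reviewed papers do not prescribe these values.

```python
def adapt_resistance(current_level, heart_rate_bpm, target_low, target_high,
                     min_level=1, max_level=10):
    """Return the next resistance level for a hypothetical exercise bike.

    Raise the level when the heart rate is below the target zone, lower it
    when above, and leave it unchanged inside the zone.
    """
    if target_low >= target_high:
        raise ValueError("target_low must be below target_high")
    if heart_rate_bpm < target_low:
        next_level = current_level + 1
    elif heart_rate_bpm > target_high:
        next_level = current_level - 1
    else:
        next_level = current_level
    # Keep the result within the range the device supports.
    return max(min_level, min(max_level, next_level))


# Simulated session: one HR reading per adaptation step.
level = 5
history = []
for hr in (95, 105, 130, 150, 165, 140):
    level = adapt_resistance(level, hr, target_low=120, target_high=150)
    history.append(level)
```

Changing the level by one step at a time, rather than jumping directly to a computed value, is one way to avoid abrupt changes in effort while the user is exercising.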

As in the Sports field, Interactive applications were the most frequently used within the fields of Education and Training (66.7%) (e.g., Michela et al. 2022). An exploratory feature to assess user performance within the environment was only added to one experience (D. Huang et al. 2020a). Desktop 6DOF HMDs were used in all the experiences, because educators found them easy to use and they can be connected to several biosensors (e.g., Michela et al. 2022). HR (40%) and respiration data were usually collected with these biosensors to analyze user moods during an experience (e.g., D. Huang et al. 2020a), and likewise EEG (40%) was used to gather information on attention and concentration (e.g., Škola and Liarokapis 2018). Finally, biofeedback was rarely used in the field of Ergonomics and only one experience was found to include it (Xiao and Cheng 2020). The experience was Interactive, and users modified a virtual object through physiological parameters. HR, EEG, EDA, and wearable eye tracking biosensors were used, although the HMD was not mentioned.

In summary, Passive experiences were not widely used (5.1%), and their use could even be decreasing. The increasing popularity of interaction in iVR experiences was reflected in the growing use of Interactive experiences (78.6%). In addition, this type of experience can be performed with Desktop 6DOF devices. In contrast, Explorative experiences were less common (8.2%) and they commonly used Standalone and Desktop 6DOF HMDs. Those devices enable the user to move through the virtual environment using controllers, which can involve natural movements and even their own physiological parameters. Desktop 6DOF HMDs were the only HMDs used for Explorative with interaction experiences (8.2%). The requirements for higher storage capacity and resources to run these types of applications also make them optimal devices for including biosensors. At the same time, the choice of sensor, its ease of use, and its data collection potential will depend on the final objective of each experience; the most common sensors were HR (35.3%), EEG (33.8%), and respiration (15.1%).

4.4.2 Conclusions of the experiences

At the end of the experiences and based on the results, each research group drew its own conclusions. Of the studies that included biofeedback, 55.7% yielded positive results (e.g., H. Li et al. 2021a, b), and many of them mentioned an intention to continue developing the same line of research in the future. Likewise, the conclusions of 21.6% of the studies were in agreement that the combination of iVR with biofeedback yielded the best results (e.g., Houzangbe et al. 2018). In these papers, a neutral control group was compared with an iVR group, and variables such as biofeedback results were studied. Furthermore, 12.4% of the papers yielded the expected results on the basis of their own objectives. The results were context-dependent and heterogeneous; however, the combination of iVR and biofeedback was beneficial in all of the papers. Finally, no significant results were obtained in a total of 6.18% of the papers. This may be due to the small sample sizes that were analyzed or the absence of control groups with which to compare the results. Another reason may be that many experiences were first-time studies, attempting to make initial contact with biosensors. The conclusions of only one article were that the incorporation of biofeedback in iVR was not effective when compared to other types of user-media interaction (Tinga et al. 2019).

5 Good practices

This section is intended to serve as a guide for the creation of an iVR application with biofeedback. In the following subsections, recommendations are presented for each field, based on the trends observed in the biofeedback articles analyzed in this review. As a general recommendation, it is proposed to use the videogame engine that best suits each research group, and in any case, the use of 360° images and videos is not recommended, as they cannot be modified in real time by biofeedback. In addition, a control group is essential, for comparison of the results and to verify the effectiveness of the application as it is developed.

5.1 Psychology

The types of application that can be created depend on the objective pursued. For example, some experiences aim to help users to reduce their stress or anxiety levels (Darnall et al. 2020; Rockstroh et al. 2019; Tinga et al. 2019), or to overcome personal fears such as public speaking (e.g., Liao et al. 2018) and phobias (e.g., Alvear-Suarez et al. 2019). In those cases, the use of Interactive or Explorative with interaction applications is recommended. Users must complete a task to reduce their anxiety or stress levels. Regarding the use of sensors, it is recommended to use HR wearable sensors, as they measure information on user physical activation, and can be combined with EDA. However, EEG is rarely employed, although brain waves can be significantly affected in the case of phobias. Overall, the use of Desktop 6DOF or Standalone devices is recommended. The choice of these HMDs will depend on the need for sensor connections through a computer and the required capacity to run the applications.

Furthermore, applications focused on cerebral processes (e.g., attention span, decision-making, creativity) and their understanding can be developed (e.g., Yang et al. 2019; Zhang et al. 2019). It is recommended for their development that the interaction be Interactive or Explorative, depending on the task to be evaluated. For instance, an Explorative application may be useful to assess user point of focus and thereby to measure attention. In contrast, it seems more reasonable to propose a specific task to the users for the rest of the processes, to assess their performance without letting them move around the scenario. EEG sensors are essential as the objective is to observe and to understand brain processes. EEG can be combined with other types of wearable sensors, depending on each case and its secondary objectives. Sensors of that sort should be connected to Desktop 6DOF devices.

Another topic of application can be to understand the relationship between iVR, biosensors, and emotions (e.g., Houzangbe et al. 2018), such as improving usability and studying how to increase the sense of presence. Desktop 6DOF is the preferred headset for connecting all these devices. However, the selection of wearable biosensor devices depends on the objective of each experience: HR and EDA are recommended to measure physical activation (J. Kim et al. 2019), and EEG to perform Interactive experiences while assessing the individual's cognitive processes in response to a stimulus (Wu et al. 2022). In this case, it is not advisable to combine Interactive and Explorative applications, as their combination might provoke user confusion. In contrast, Explorative experiences, in which the user can move freely around the environment, are a suitable option.

On the other hand, experiences can be designed that are aimed at instructing individuals to become more mindful of their own physiology (e.g., Lan et al. 2021) or at teaching relaxation and meditation techniques for the management of certain physiological parameters (e.g., Roo et al. 2017). Applications of that sort should be Interactive and provide individuals with a task upon which to focus that is centered on their inner world. Control over respiration and HR is easier for individuals to comprehend because both parameters are interrelated and can be regulated, for instance, through diaphragmatic breathing exercises (Hopper et al. 2019). The use of these sensors is, therefore, recommended. Desktop 6DOF devices are recommended for establishing real-time connectivity between the biosensors and the application.

Finally, applications to improve empathy can also be performed (e.g., Salminen et al. 2019). In these cases, multiplayer or collaborative environments are recommended, so users can observe the physiological parameters of another user to better understand personal feelings. In addition, the technique of body swapping can be included in those experiences. It entails immersing the user in the perspective of another individual to evoke similar emotions and sensations (Sansoni et al. 2022). Parameters that are easily understood, such as HR and breathing rate, are recommended for these experiences. In addition, EEG is useful for researchers to study the mental processes related to empathy, as emotional perception can be identified and classified. Interaction will also depend on the objective of the experience: Explorative with interaction is recommended for increasing cooperation, whereas Passive experiences are suggested for understanding individuals’ emotions. Overall, Desktop 6DOF is the most widely recommended HMD, as the biosensors of both users must be connected to a computer in real time.

5.2 Medicine

In the field of Medicine, there are mainly two situations in which iVR applications with biofeedback are used. The first one is aimed at assisting patients with pain that can occur during chronic illnesses and operative convalescence (Solcà et al. 2018), among other situations. In that case, the application is used to teach the user to be calm and to focus on something other than pain (i.e., breathing (e.g., Prabhu et al. 2020)). Applications of that sort must be Interactive, so that the user is not distracted from the task while moving around the environment. The recommended wearable biosensors are those that measure HR to study user anxiety, and EDA and respiration devices to complement it. Once again, Desktop 6DOF is the recommended HMD.

The second situation is to help patients to control the virtual environment through motor imagery (e.g., Winter et al. 2021). These applications are often used by patients undergoing rehabilitation for amyotrophic lateral sclerosis (e.g., McClinton et al. 2019), strokes (e.g., Spicer et al. 2017), paralysis (e.g., Qidwai et al. 2019), and other illnesses causing user immobility. In such contexts, EEG wearable sensors are always used, because they transform brain waves into movements within the virtual environment, so there is no need for physical movements. Other sensors such as HR can be used in a complementary way to assess user feelings. Interactive is the most commonly used interaction type, so that the user concentrates on a task, the complexity of which can increase thanks to the use of EEG. Desktop 6DOF devices are recommended for an optimal experience.

5.3 Sports

There are several types of applications that can be developed. The first consists of studying the effect of receiving biofeedback while exercising (e.g., Campbell and Fraser 2019). In these cases, users see the data of their physiological parameters in real time while performing the experience, and they modify their performance to achieve certain objectives, thereby altering the physiological parameters. HR and EDA wearable biosensors, associated with physical activation, are recommended for such experiences, so that the user can comprehend how to regulate personal arousal. In view of the need for interaction, Interactive applications are recommended, so that the user concentrates on a specific exercise, such as rowing or mimicking an exercise avatar. Those tasks can be adapted to the needs of users to achieve better results. Standalone HMD devices are recommended, as they enable wireless biosensor connectivity, as are Desktop 6DOF devices, which can connect several biosensors and run more complex applications.

The second type comprises applications in which the exergame itself is controlled with nothing more than the data extracted from the biosensors (e.g., Schormann et al. 2020). In those cases, applications must be Interactive. Moreover, HR and EDA wearable sensors are recommended, because those parameters best define physical arousal and are the easiest physiological parameters for users to control. Standalone and Desktop 6DOF HMDs are also recommended.

Finally, iVR and biofeedback can be combined for the physical rehabilitation of patients (e.g., Lai et al. 2020). In that case, the application is focused on the body part to be rehabilitated, so the applications must be Interactive and propose a single task: to rehabilitate that area of the body. HR is the most useful parameter to monitor, because it shows the physical effort of the user. EDA and respiration can also be used. Standalone wireless HMDs are recommended in the case of monitoring sports performance.

5.4 Education and training

Applications with real-time biofeedback have hardly been developed in the field of Education, though they offer many possibilities. The applications must be Interactive and must help users to learn something specific without distracting them. The choice of biosensors will depend on the purpose of each application. For example, the use of EEG to assess brain processes during learning is interesting, as is the use of eye tracking devices, which record user gaze patterns. Sensor combinations provide information on cognitive load during the learning process. Additionally, the application should be adapted to user learning rates based on the data. The use of Desktop 6DOF HMD devices is recommended, so that user movements can be monitored.

The development of training-related experiences will depend on the objective. For example, if the training is oriented towards mental skills such as interaction through motor imagery, EEG is necessary as an interaction method. If the objective is to train the user in decision-making under pressure, sensors that measure HR and respiration may be more useful (e.g., Michela et al. 2022). In any case, the applications must be Interactive or allow only very limited exploration of the environment, since exploration can distract users from the tasks. Desktop 6DOF HMDs are recommended for those applications.

Explorative with interaction is recommended for Prevention-related applications because users can perform prevention tasks in those environments, as if in real-life situations (e.g., Huang et al. 2020a). The most highly recommended physiological responses to be monitored with wearable sensors are HR and EDA that provide information on user physical arousal. In addition, eye tracking devices can also be useful to evaluate gaze patterns, and sensors that track respiration can be used as complements. Once again, the use of Desktop 6DOF HMDs is recommended in such situations.

5.5 Ergonomics

One of the latest fields where biosensors have been incorporated is Ergonomics. The use of Explorative applications is recommended, so that users can move freely within the environment and so that the relationship between individuals and urban spaces can be assessed. EEG wearable sensors are recommended for evaluating the influence of spaces on cognitive processes. Eye tracking devices can be valuable for analyzing where the individual’s gaze is fixated. Other complementary sensors, such as HR and EDA, can also be included to gain a better understanding of user emotions. Some of those applications may be based on environments that users modify in accordance with emotional reactions, making it advisable to use Desktop 6DOF devices, which can efficiently run large Explorative applications and easily connect biosensors through a computer.

Industrial spaces can be associated with training or risk prevention. In those cases, the applications should be Interactive or Explorative with interaction, so that the users can carry out specific tasks and, in specific cases, move around. The most recommended wearable biosensors are those that monitor eye tracking to record user gaze patterns in real time, and as a form of interaction. In addition, eye tracking can be combined with other sensors, such as EEG to provide information on both emotions and the learning processes of each individual. Additionally, HR and EDA can indicate user arousal in different situations. The choice of these sensors will depend on the objectives of each experience. Finally, Desktop 6DOF HMDs with built-in eye tracking are recommended.

5.6 Military

No iVR applications with real-time biofeedback were found in the Military field. However, this combination could be of interest in some types of applications. For instance, applications could be developed to comprehend and assist individuals with post-traumatic stress disorder (PTSD). In this case, biofeedback would serve to adapt the application in real time to their anxiety levels and to customize the treatment. The form of interaction can be Passive, so users only observe a situation, or Interactive to perform a task in an anxiety-inducing scenario. Wearable sensors that measure user tension such as HR, EDA, or respiration are recommended. EEG devices could be added to evaluate the neural response associated with PTSD and to detect emotions. Both Desktop 6DOF and Standalone HMDs are recommended. In contrast, applications focused on military training should be Explorative with interaction, so that free movement within the environment is possible. The data collected by the biosensors could be used to adapt the task difficulty in real time. The advised sensors are those related to physical arousal and anxiety: HR, EDA, and respiration. Finally, Desktop 6DOF HMDs are recommended for those tasks.

5.7 Tourism and marketing

Although biosensors appear to offer few advantages in both fields, they can be implemented in several ways. For instance, customized experiences can create a feeling of uniqueness for each user in iVR applications with biofeedback for Tourism. In the case of dark tourism, the environments can be modified if individuals' emotions become excessively negative, to prevent them from worsening while contemplating wartime conflict. The type of interaction varies, though Passive and Explorative ones are recommended. HR and EDA are suggested as they are directly related to emotions. EEG devices can also be used, though they are more uncomfortable for the user in this type of experience. The recommended HMDs are the Desktop 6DOF and Standalone ones. In addition, users can interact with tangible heritage in a tourist environment recreated in Interactive applications. The choice of biosensors depends on the objective of the experience, but eye tracking appears to be an interesting form of interaction (e.g., highlighting an area and displaying an explanation).

In the field of Marketing, the use of biosensors can be related to the assessment of user responses to a product. They can be useful in the process of creating a product, in order to know whether the user might like it. Biosensor data can lead to adaptations of the product according to user tastes. Such applications can be Passive in the case of product sampling, or Interactive if the user has to interact with the product. HR and EDA sensors are recommended, as they collect data that point to user comfort levels with a product. EEG can also be used to measure and to classify cerebral processes related to emotions, and eye tracking can provide information on the parts of the product where most attention is paid.
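
The recommendations in this section can be condensed into a small lookup structure for use when planning an experience. The following Python sketch is only an illustration: the `Recommendation` type, the `recommend` helper, and the key names are hypothetical and introduced here for readability, and only four of the cases described above are encoded.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Recommendation:
    """Hypothetical container for one recommendation from Sect. 5."""
    interaction: Tuple[str, ...]
    primary_sensors: Tuple[str, ...]
    complementary_sensors: Tuple[str, ...]
    hmd: Tuple[str, ...]


# Keys are (field, objective). The objective labels are illustrative
# shorthand, not a taxonomy defined in the review.
RECOMMENDATIONS: Dict[Tuple[str, str], Recommendation] = {
    ("Psychology", "stress, anxiety, fears and phobias"): Recommendation(
        interaction=("Interactive", "Explorative with interaction"),
        primary_sensors=("HR",),
        complementary_sensors=("EDA",),
        hmd=("Desktop 6DOF", "Standalone"),
    ),
    ("Medicine", "pain management"): Recommendation(
        interaction=("Interactive",),
        primary_sensors=("HR",),
        complementary_sensors=("EDA", "respiration"),
        hmd=("Desktop 6DOF",),
    ),
    ("Medicine", "motor imagery"): Recommendation(
        interaction=("Interactive",),
        primary_sensors=("EEG",),
        complementary_sensors=("HR",),
        hmd=("Desktop 6DOF",),
    ),
    ("Sports", "biosensor-controlled exergame"): Recommendation(
        interaction=("Interactive",),
        primary_sensors=("HR", "EDA"),
        complementary_sensors=(),
        hmd=("Standalone", "Desktop 6DOF"),
    ),
}


def recommend(field: str, objective: str) -> Recommendation:
    """Look up the encoded recommendation; raises KeyError if not encoded."""
    return RECOMMENDATIONS[(field, objective)]
```

A call such as `recommend("Medicine", "motor imagery")` returns the entry with EEG as the primary sensor. The remaining fields could be added in the same way, although cases whose sensor choice depends on the objective of each experience do not reduce to a single entry.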

6 Conclusions

This systematic review has provided an overview of the combined use of iVR applications and wearable biosensors. The combination of both techniques is still being tested, with a considerable learning curve ahead. However, the trend toward the use of biosensors along with iVR applications is growing at a similar rate to iVR applications. A total of 560 papers have been included in this literature review. The following is a summary of the most important findings:

  • The usage of iVR applications with wearable biosensors is a rising trend as higher quality devices are being developed at lower cost.

  • Sensors were more commonly used in the fields of Psychology and Medicine (52.8%). Biosensors were often used at medical centres and their cost was no obstacle in this field, as medical equipment costs are generally quite high. Both areas aim to enhance the user experience through knowledge of the physiological parameters (e.g., reducing user anxiety or pain levels). In addition, the use of iVR along with biosensors was growing in the other areas under analysis.

  • The reason for including biosensors in iVR experiences remains unclear in most studies. Hence, they are used experimentally, to determine how their effectiveness may be maximized. The most widely described wearable biosensors monitored HR (53.3%) with easily interpretable outputs. Simultaneous measurements of HR and EDA can be performed by some of those sensors, which were occasionally utilized for both applications. Both parameters represent user tension and physical arousal. HR and EDA biosensors are both low cost and non-intrusive for users. In contrast, the EEG sensors can extract very useful data to understand brainwaves, but those data are difficult to interpret, and the devices are uncomfortable for users to wear. Finally, eye tracking and electromyography are the least frequently used techniques with which to measure physiological parameters. Interestingly, there were few experiences with eye tracking (5.1%), even though the HMD equipment can do so. This may be due to the complex nature of the data, the use of which still remains unclear in some areas.

  • The use of Desktop 3DOF and Cardboard devices has been decreasing. However, in fields such as Psychology and Medicine, they were used in situations where passivity is a key point (e.g., viewing relaxing videos to assist ICU patients or reducing anxiety through meditation). Generally, Desktop 6DOF HMDs were preferred (70.5%) as they offered wider interaction and a sense of immersion, and supported larger applications and a greater number of sensors. Data could therefore be acquired in real time, and professionals could monitor user activities within the virtual environment. In addition, Standalone devices were increasingly being used due to their greater storage capacity and better quality. They were replacing Cardboard in Passive experiences and competing with Desktop 6DOF in all other interactions.

  • Unity was the most widely used videogame engine for developing iVR applications (37.5%). Unreal Engine had replaced Source after it was discontinued. Many departments purchased the applications (14.8%) or outsourced development, due to the lack of capacity to produce iVR applications. The use of 360° video (17.5%), which was commonly chosen to relax and to entertain the user, to reduce development costs, and for its ease of use, was reported in another large percentage of articles.

  • Regarding the interaction type, most of the experiences were either Passive (41.6%) or Interactive (40.5%). Passive experiences stood out in Tourism and Marketing where the objective was to observe the reaction of users to a product or at a tourist site, e.g., a museum. In all fields, Interactive experiences were distributed in a balanced way, because the interaction was mostly biosensor-based. In Sports, exergames were usually Interactive, as users were playing a sport while interacting with the virtual environment. Explorative and Explorative with interaction experiences were the least used, as they were the most difficult to develop. Most Explorative experiences were found in the field of Education and Training and, particularly, in the Prevention section. The reason is that prevention was implemented in environments where users must move a lot, such as the evacuation of a building or an escape from fire. A similar usage was also prevalent in the Military area where users were required to enter war scenarios. They were also used in Ergonomics where the aim was to know how the users are affected by the environment and, therefore, how to move around it.

  • Experiences with a sample size of fewer than 20 individuals stood out (33.2%). Furthermore, most experiences had no control group with which to compare the results (62.3%). This may be because research into the use of biosensors is still ongoing, and many of the papers reported preliminary studies to test them. In other cases, those experiences served only to collect data that were then analyzed to train algorithms.

  • Almost all the articles conducted a pre-experience demographic survey to collect data. In addition, a post-assessment was reported in most of the works (80.9%) to analyse the development of the experience. In the articles under review, Emotions were measured more than any other category, a type of evaluation that was described in a large number of papers related to the topic. The remaining post-assessment categories were balanced.

  • Few papers described biofeedback within the application (17.3%), due to its relative novelty and limited awareness of its advantages. However, the trend towards the implementation of iVR with biofeedback increased, due to the emergence of new sensors and the collaboration of people who know how to analyse the data acquired through sensors with people who are able to integrate these data into applications. Psychology and Medicine were among the fields where biofeedback was directly employed as an interaction within the application (e.g., relaxation while viewing personal parameters and rehabilitation of stroke patients with EEG). In Sports, it was also widely used to adapt the sport to the user, so that the users can see their physical activation levels. Although it has a great future, it was rarely used in the fields of Education, Training and Prevention. In areas such as Ergonomics and Military, the use of biofeedback has yet to begin.

  • Among the papers in which biofeedback was reported, Passive experiences were almost absent. Most of those experiences were Interactive, as biofeedback was only used as a method of interaction. There were also very few Explorative ones. Desktop 6DOF HMDs were the most frequently used, as they can be connected to the computer together with the sensors, providing a better connection quality. The sensors in use depended on the objective of each application, but those that measured HR and EEG were generally preferred. Finally, most of the experiences yielded positive results, pointing towards a promising field of research where there is still much to discover from the use of biofeedback linked to iVR.

The overall conclusion from this review is that the combination of iVR and wearable biosensors is a field with a promising future. Over the coming years, research into using both technologies together will increase significantly, especially in applications that use biosensor data to modify and to adapt the experience to the needs of the user. These developments will also be driven by improvements to both HMD quality and the speed of wearable biosensor connections. In addition, other technologies such as AI for data analysis will be incorporated to make this process simpler and more systematic.

7 Future lines

The combination of wearable biosensors and iVR provides a tool from which many benefits can be obtained, especially when biofeedback is included. Biofeedback can contribute to the creation of applications and their optimization to improve user experiences, or even to personalize real-time iVR experiences. It is expected that researchers will include biofeedback, as the use of this technology is positive, and the costs of wearable biosensors are gradually decreasing. In coming years, wearable biosensors will be increasingly deployed in all areas, to understand the user-iVR application relation and to obtain objective data on experiences. Further growth will be seen in some areas such as Education and Training, particularly, in the inclusion of real-time biofeedback within the application. Researchers can use the data that are gathered for a better understanding of learning processes, and for adapting the experience to users. Combining these technologies with AI to understand the relationship between users and the application could be an interesting future line of research. Once extracted, the data can be analyzed in a systematic way, facilitating the task of researchers. AI can also be used to modify the application in real time through biofeedback to adapt it to the performance of users (e.g., changing the difficulty levels as the user progresses to increase engagement and to improve learning processes).
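
As an illustration of the real-time adaptation mentioned above, the following Python sketch adjusts a difficulty level from a stream of HR samples. It is a minimal rule-based example, not an AI model and not a method taken from the reviewed papers: the function names, the smoothing factor, the resting and maximum HR values, and the target arousal band are all assumptions chosen for the example.

```python
from typing import List, Sequence


def smooth(previous: float, sample: float, alpha: float = 0.2) -> float:
    """Exponential moving average to damp noise (alpha is an assumed value)."""
    return alpha * sample + (1.0 - alpha) * previous


def arousal(hr: float, hr_rest: float, hr_max: float) -> float:
    """Normalise HR to [0, 1] between the assumed resting and maximum HR."""
    fraction = (hr - hr_rest) / (hr_max - hr_rest)
    return min(1.0, max(0.0, fraction))


def adapt_difficulty(level: int, arousal_value: float,
                     low: float = 0.3, high: float = 0.6,
                     min_level: int = 1, max_level: int = 10) -> int:
    """Raise difficulty below the target band, lower it above the band."""
    if arousal_value < low:
        level += 1
    elif arousal_value > high:
        level -= 1
    return min(max_level, max(min_level, level))


def run_session(hr_samples: Sequence[float], hr_rest: float = 60.0,
                hr_max: float = 180.0, start_level: int = 5) -> List[int]:
    """Return the difficulty level selected after each HR sample."""
    if not hr_samples:
        return []
    smoothed = hr_samples[0]  # seed the average with the first sample
    level = start_level
    levels = []
    for sample in hr_samples:
        smoothed = smooth(smoothed, sample)
        level = adapt_difficulty(level, arousal(smoothed, hr_rest, hr_max))
        levels.append(level)
    return levels
```

With these assumed defaults, a user whose HR stays at the resting value sees the difficulty rise step by step, whereas a sustained HR well above the band lowers it. In an actual application, the thresholds would need to be calibrated per user and per task, and the HR stream would come from the wearable biosensor rather than from a list.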

In this review, few studies of eye tracking technology were found. This is peculiar, because some of the HMDs described in the papers include eye tracking systems. However, those sensors extract data whose analysis and interpretation are complex and hardly straightforward. Such data will be systematically processed and better understood in the future. Likewise, EEG wearable biosensors are expected to become more accessible, affordable, and less cumbersome for users to wear. This will enable them to be included in more research and fields such as Sports, as well as to be tested on a wider range of people. The trend indicates that HMD Desktop 3DOF devices will be discontinued, because nowadays users prefer to try higher quality experiences. The latest generation of Desktop 6DOF or Standalone HMDs create a greater sense of presence and immersion and promote more interaction with the application. In addition, the popularity of Standalone devices appears to be growing, as these devices offer many features, improved quality, and better storage capacity, enabling the utilization of more complex applications. In the future, they will be used for their capability to operate without the need for a computer connection which, at the same time, will reduce the cost of the experience. Such devices will also be easier to connect to biosensors and to support real-time biofeedback. Desktop 6DOF devices will continue to be used for their usability advantages such as the capability to observe what the user sees and sensors that are directly connected to a computer.

Having analyzed what can be done with the combination of iVR and wearable biosensors, further testing needs to be conducted to better understand the full range of iVR and wearable biosensor combinations. To do so, experiments should be conducted with a sufficiently large sample to reach conclusive results. In such experiences, a minimum of 40 participants is expected to be needed to extract conclusive results and to study additional aspects in the post-assessments, which will in the near future lead to improved experiences.

In future research, network and cluster classifications will be conducted, to explore correlations between the various characteristics of the experiences and the applications that may be presented in each paper.