1 Introduction

The ecological validity of Virtual Reality (VR) is a central question when VR is used as a predictive tool. This is typically the case in industrial product design, where human-in-the-loop simulations of products are part of the design process and help avoid the cost of real prototypes. It is also the case in therapy applications, where VR is used to assess human behavior. In such predictive uses, the human response within the virtual environment must be close to the response in the equivalent real situation.

We first define two terms used in this paper: immersion and ecological validity. We use the term immersion to encompass two things: (1) the sensori-motor interfaces, i.e. what a virtual reality system provides from the technical point of view to address visual, haptic, auditory or other modalities; and (2) the cognitive interfaces, which describe the way activities are performed in the virtual environment. We consider ecological validity to be the fact that the user’s behavioral response is realistic.

In the specific case of VR as a predictive tool, the role of the designer of the VR simulation is to choose the immersion for each use case and human activity that he or she wants to simulate, with the objective of maximizing the ecological validity.

The research field of ecological validity in VR is very large, since the sensori-motor and cognitive resources involved differ for each activity, and may even differ between users. Thus each activity requires an adaptation of the levels of immersion in VR and an assessment of its validity.

We can consider ecological validity in VR to be a specific case of a more general question, namely the relation between immersion and presence. Its specificity is that the definition of presence considered is “response as if real”, one of the definitions proposed by Slater [14].

On the general problem of the relation between immersion and presence, a large number of user studies in VR focus on the role of specific levels of immersion (visual, haptic, auditory) in activities performed in virtual reality. Some of those activities are not considered in relation to an equivalent real activity, and thus fall outside the scope of research on ecological validity. For example, research on immersion in VR to support scientific data visualization (fluid dynamics, molecular visualization), or studies on levels of immersion for application control using menus, are not representative of real activities: these are activities that cannot be performed in real life. Other activities in VR are ecological in nature, such as navigation or object manipulation, but little of the literature objectively examines how close they are to their real counterparts. The VR Knowledge database (footnote 1) is a curated repository of research findings in the field of virtual reality, and is potentially very useful to the research community. It aims to bring together research results on the relation between levels of immersion and performance metrics on various tasks in VR. Ecological validity, as defined in this paper, is not directly considered there, though.

Some works have compared behavioral metrics between a virtual activity and its real counterpart in order to assess ecological validity. Various tasks have been considered. It would be tempting to present them sorted by a (tentative) level of task complexity; however, sorting tasks by complexity is a difficult and open problem. We therefore simply list a series of tasks, starting with those that involve only a perceptive component and require no active, conscious reaction from the user, and then moving to more interactive ones (footnote 2). Note that this list is not exhaustive, and is meant to show the variety of tasks considered in the literature. Real/virtual comparisons have been made on the visual perception of distances [12] and heights [13], and on the perception of materials [6]. Among more interactive tasks, we can note the hoop and wire game [1], playing with Lego bricks [3], peg insertion [21], aiming movements [9], and reaching [17]. The next tasks can be considered to require a supplementary level of cognitive involvement: navigation and spatial knowledge [10], car driving [11], and wheelchair driving [4]. As a last group, we propose works that tackle activities with an emotional or social component: collaboratively solving puzzles with other people [2], social behavior in small groups [15], and cooperation with robots [20].

This list is far from exhaustive, as the focus of this paper is not to review the literature on ecological validity. The intent of the author is to present different ways of tackling the question, by reporting on three studies previously carried out by the author and his colleagues [6, 7, 8, 18, 19, 20]. These studies are located at three different points in the continuum of tasks proposed in the previous paragraph: perception only; perception and action; and emotional components and/or interaction with other entities.

2 Complex Material Observation

In cinema or video games, an esthetically driven, or simply plausible, visual appearance of objects can be acceptable. This is not the case when VR is used to design the visual appearance of materials, where the visual aspect of objects needs to be physically realistic. The industrial need is a virtual material design workshop, in which the user virtually specifies a material composition and can then visualize the resulting visual aspect in a VR environment. In this virtual material workshop, we determine the composition of a future real material whose appearance is supposed to be predicted by the virtual one. The realism of the image is thus paramount. In order to set up such a workshop, a research project (footnote 3) was built on a four-step methodology: (1) optical measurements to characterize the composition of the chosen material samples, (2) light-matter interaction models, (3) rendering in a VR system, and (4) perceptive validations with the human in the loop, by comparing real and virtual samples. The results of this research are described in Medina et al.’s works [18, 19] for homogeneous coloured materials, and in [6, 7] for complex materials (such as car paints with effects).

The first step in order to obtain a realistic appearance in VR is to ensure a proper calibration of the display chain (including display and stereoscopic glasses characteristics), given that the rendering models and engine provide the spectrum of the light being emitted by the virtual materials. We have set up such a process, in the case of homogeneous materials [18, 19] (Fig. 1).

In this paper we focus on another type of material: complex materials with visual effects, such as car paints in which metallic flakes embedded in the base coat produce sparkling effects. Statistical models describing such materials have been set up in the project [5], based on optical measurements. The models create virtual microstructures that are fed into a rendering engine (see Fig. 1). In summary, the above methodology yielded two results. Firstly, the use of stereoscopy decreases the perception thresholds of specific properties of the material (size and density of aluminium particles) [7]. Secondly, user perceptive studies validated that, for a given visual aspect, our VR visualization system makes it possible to match the descriptive metrics of the virtual material to those of the real material [6].

Both results have consequences for the ecological validity of VR for the observation of materials. (1) We show that stereoscopy has an impact on perception thresholds, and we are able to quantify it. This quantification can serve as a guideline for designing VR systems for ecological validity, as it can drive immersion choices (given a required threshold, should we use stereo or not?). (2) We show that ecological validity can be achieved in VR at the level of material composition. In the experimental setup, a simple stereoscopic display was used, without dynamic perspective (no head tracking).

Fig. 1. In order to simulate car paints that have metallic flakes embedded in the base coat and produce sparkling effects, statistical models describing such materials have been set up [5], based on optical measurements. The models create virtual microstructures that are fed into a rendering engine for stereoscopic visualization.

3 Gestures in Object Manipulation

Gestures performed in a virtual reality system are likely to differ from gestures performed in everyday life tasks, and the quality of the virtual reality system can modify the way gestures are performed. One example is the visual latency of the system, that is, the delay between movement capture and the resulting visual feedback on the avatar of one’s hand; high latencies are known to decrease performance. Another is the number of degrees of freedom that are captured: a simple hand pose measurement and a full acquisition of the hand joint configuration, using gloves or vision-based tracking, lead to very different levels of ecological validity of prehension. Finally, the use of props that reproduce the shape of the manipulated object can improve user experience and gesture validity.

Here, the question we asked ourselves is: how much can we manipulate (reduce, distort) the visual feedback and still obtain stable, close-to-normal results?

In a previous work [8], we studied the role of visual feedback and of the visual appearance of objects on manipulation gestures in VR, that is, gestures performed while manipulating objects.

A variety of feedbacks were implemented on a box opening task. The user performed the gesture to open the box and, depending on the progression of the gesture (its position along a prerecorded gesture), visual feedback on the opening of the box was provided. The visual feedbacks (see Fig. 2) ranged from boolean (the box visually opens only at the end of the gesture), textual (a percentage of the opening is displayed), and discrete (only a few steps are displayed), to normal, enhanced (with a gain greater than 1 between motion capture and visual displacement), and non-coupled (or open loop: an automatic animation of the box starts at contact, and user movements have no effect on it).

Fig. 2. Study of gesture feedback: on a box opening task, the gesture is measured and several visual feedbacks are proposed to the user (from left to right: boolean, textual, steps, normal, enhanced, open loop) and compared in terms of gesture completion and proximity to an initially recorded natural gesture.
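The six feedback conditions can be read as different mappings from gesture progression to what is displayed. The following is a minimal sketch of that reading, not the implementation used in [8]. It assumes that progression is normalized to the range 0 to 1 along the prerecorded gesture. The function names, the number of discrete steps, the gain value of 1.5 and the animation duration are all hypothetical choices made for illustration; the study only states that the enhanced gain is greater than 1.

```python
import math


def clamp01(x: float) -> float:
    """Clamp a value to the [0, 1] range."""
    return max(0.0, min(1.0, x))


def boolean_feedback(progress: float) -> float:
    """Box stays visually closed until the gesture is complete."""
    return 1.0 if progress >= 1.0 else 0.0


def textual_feedback(progress: float) -> str:
    """Opening shown as a percentage string instead of a moving box."""
    return f"{round(clamp01(progress) * 100)}%"


def discrete_feedback(progress: float, steps: int = 4) -> float:
    """Opening displayed only at a few steps (step count is an assumption)."""
    return math.floor(clamp01(progress) * steps) / steps


def normal_feedback(progress: float) -> float:
    """One-to-one coupling between gesture progression and box opening."""
    return clamp01(progress)


def enhanced_feedback(progress: float, gain: float = 1.5) -> float:
    """Coupling with a gain greater than 1 (the value 1.5 is an assumption)."""
    return clamp01(gain * progress)


def open_loop_feedback(time_since_contact: float, duration: float = 2.0) -> float:
    """Automatic animation started at contact; user movement is ignored.

    The animation duration (in seconds) is an assumption.
    """
    return clamp01(time_since_contact / duration)
```

Written this way, every condition except the textual one returns a displayed opening between 0 and 1, which makes explicit that the conditions differ only in how much of the gesture progression reaches the display.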

We compared gesture descriptors between an initially recorded natural gesture (performed with no feedback constraints) and the gestures performed with each feedback. Results showed, firstly, that the non-coupled feedback is immediately recognized as such, and that its gesture descriptors differ greatly from those of the initial gesture (smaller movements, shorter time). Mechanical work and completion time also differ for all feedbacks. Finally, users’ subjective preference went to the enhanced feedback, and decreased for feedbacks that carry less information. In a second experiment, we studied the role of affordances in gesture memorization. We showed that affordances, i.e. the visual cues in an object’s appearance that show how it is operated, increase user recall of gestures.
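To illustrate what comparing a performed gesture with a reference gesture can involve, the sketch below computes three simple descriptors from sampled hand positions. It is not the analysis of [8]: only completion time is a descriptor named in the study, while path length and mean deviation after time normalization are stand-ins chosen here for illustration. Mechanical work is not computed because it would require force data. The sample format `(t, x, y, z)` and all function names are assumptions.

```python
import math
from typing import List, Sequence, Tuple

# One sample of a recorded gesture: (time in s, x, y, z). Hypothetical format.
Sample = Tuple[float, float, float, float]


def completion_time(gesture: Sequence[Sample]) -> float:
    """Duration between the first and last sample."""
    return gesture[-1][0] - gesture[0][0]


def path_length(gesture: Sequence[Sample]) -> float:
    """Total distance travelled by the hand."""
    return sum(math.dist(a[1:], b[1:]) for a, b in zip(gesture, gesture[1:]))


def resample(gesture: Sequence[Sample], n: int) -> List[Tuple[float, ...]]:
    """Linearly interpolate n positions evenly spaced in time (n >= 2)."""
    t0, t1 = gesture[0][0], gesture[-1][0]
    out, j = [], 0
    for i in range(n):
        t = t0 + (t1 - t0) * i / (n - 1)
        # Advance to the segment [gesture[j], gesture[j + 1]] containing t.
        while j < len(gesture) - 2 and gesture[j + 1][0] < t:
            j += 1
        ta, *pa = gesture[j]
        tb, *pb = gesture[j + 1]
        w = 0.0 if tb == ta else min(1.0, max(0.0, (t - ta) / (tb - ta)))
        out.append(tuple(a + w * (b - a) for a, b in zip(pa, pb)))
    return out


def mean_deviation(gesture: Sequence[Sample],
                   reference: Sequence[Sample], n: int = 50) -> float:
    """Mean distance to the reference after normalizing both durations."""
    pairs = zip(resample(gesture, n), resample(reference, n))
    return sum(math.dist(p, q) for p, q in pairs) / n


reference = [(0.0, 0.0, 0.0, 0.0), (1.0, 1.0, 0.0, 0.0)]
performed = [(0.0, 0.0, 0.0, 0.0), (0.5, 0.5, 0.0, 0.0)]
print(completion_time(performed), path_length(performed),
      mean_deviation(performed, reference, n=3))
```

In this toy case the performed gesture is both shorter in time and smaller in amplitude than the reference, the pattern reported above for the non-coupled feedback.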

The outcome of this study was mainly to set up a method to explore immersion and its effects: a systematic approach of gradually simplifying feedback in order to determine what is necessary in the stimuli in virtual environments to perform valid/representative actions.

4 Human Robot Cooperation

Introducing robots as coworking or collaborative units alongside humans would help reduce physical strain in difficult tasks. However, it raises many questions about their social and practical acceptability. We looked into the validity of virtual reality as a tool to assess the acceptability, from the human point of view, of working with a robot. To do so, we set up two use cases in both a real setup and a virtual environment (CAVE), and compared what is representative in VR and what is not [20]. The two use cases were Human Robot (HR) Copresence and HR Cooperation. Both are studied in the context of car part mounting tasks in a factory. In the case of HR Copresence, a worker and a robot work side by side on mounting parts of a car door (see Figs. 3 and 4). The studied variables were the environment type (Real/Virtual) and the distance between the operator and the robot (Close, Far).

Fig. 3. Human Robot copresence study. Real environment.

Fig. 4. Human Robot copresence study. Virtual environment.

In the case of HR Cooperation, the worker and the robot face each other, and the robot can hand various parts to the worker (see Figs. 5 and 6). Here the studied variables were the environment type (Real/Virtual) and different levels of assistance from the robot. In the copresence scenario, user preference (measured by questionnaires) went to the ‘far’ condition, and this result was consistent between the real and virtual environments. An increase in heart rate was found when the robot was close, in both real and virtual conditions. Finally, the rise in skin conductance (which may be representative of user stress) observed in the real condition when the robot was close was not found in the virtual condition. In the cooperation scenario, the main result is again that the questionnaires yield similar responses in the real and virtual conditions (acceptability, perceived security, usability), and we observe the same trends in physiological measures as for the copresence use case. We can hypothesize that more complete feedbacks (such as haptics providing contact information with the environment, or auditory feedback for factory or simulated robot sounds) may increase ecological validity, but these require specific studies.

Fig. 5. Human Robot cooperation study. Real environment.

Fig. 6. Human Robot cooperation study. Virtual environment.

5 Discussion

Virtual reality shows ecological validity for each of the tasks that we have studied. However, this validity is established only to a certain extent: some descriptors of human behavior, physiological measures for example, do not match between the virtual and real situations. The results that we have presented, comparing behavior descriptors between the same activities performed in real and virtual situations, are specific to the use cases and tasks considered. In particular, for each use case, many additional studies can and should be performed to assess the influence of other levels of immersion. We propose below a few examples of characteristics of immersion that could change user response towards higher ecological validity.

  1. Material Observation: resolution of displays, gamut of displays, high dynamic range displays, use of head tracking and dynamic perspective.

  2. Gestures in Object Manipulation: use of finger tracking, use of a physics engine, haptic feedback.

  3. Human Robot Cooperation: haptic feedback, auditory feedback.

Here, we argue that the three use cases presented also have a methodological interest. They show that real/virtual comparisons can be made at very different levels. For material observation, we can compare the global subjective appearance of materials or, at a totally different level, validate the underlying physics models. For the gestures case, two different methods were used. The first is a spatial and temporal comparison of gestures. For the second, we tried to develop a method that not only seeks to ascertain real/virtual coherence, but also to find the minimal relevant information in the feedback. We did so by stripping down the feedback, with the goal of finding the minimal information needed to obtain a valid gesture. In this case, the method showed that the maximum information is needed; nevertheless, it is the opinion of the author that this method can be used in different contexts. Finally, for HR copresence and cooperation, we used subjective questionnaires (Acceptability, Security) and more objective physiological measures (footnote 4).

An important open question is how designers of VR systems can reuse such results. If one has to set up an activity that is close to those that appear in the real/virtual comparison literature, transposing the results should lead her or him to choose similar levels of immersion. The difficulty lies in deciding what “close” means. For example, in our material observation use case, we have shown ecological validity for a specific type of metallic flakes. Do these results hold if we change the average flake size, the flake shape, or the size or shape of the sample on which the material is presented? And what should one do if the use case being set up does not appear in the literature at all? In that case there are no guidelines. A dream tool answering the question of reusability would be a dictionary of core activities, together with the immersion needed for ecological validity in each. Those core activities, simple in nature, would be the building blocks of any more complex activity. The dictionary would thus provide the building blocks to infer the immersion needed for any given task. Today this dream tool is purely conceptual.
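To make the idea of such a dictionary concrete, the sketch below shows one possible shape for it. It is purely illustrative: no such dictionary exists, the activity names are hypothetical, and the ordinal immersion levels are placeholders, not experimental results. The combination rule, which takes for each modality the highest level required by any core activity, is also an assumption; other rules are conceivable.

```python
from typing import Dict, Iterable

# Hypothetical dictionary: core activity -> minimal immersion level per
# modality. Levels are ordinal placeholders, NOT measured values.
CORE_ACTIVITIES: Dict[str, Dict[str, int]] = {
    "observe_material": {"visual": 2},
    "manipulate_object": {"visual": 1, "haptic": 1},
    "share_space_with_robot": {"visual": 1, "auditory": 1},
}


def infer_immersion(core_activities: Iterable[str]) -> Dict[str, int]:
    """Combine core activities into the immersion needed for a complex task.

    Assumed rule: for each modality, keep the highest level required by
    any of the core activities that make up the task.
    """
    required: Dict[str, int] = {}
    for name in core_activities:
        if name not in CORE_ACTIVITIES:
            # No real/virtual comparison is recorded for this activity,
            # so the dictionary cannot provide a guideline.
            raise KeyError(f"no entry for core activity {name!r}")
        for modality, level in CORE_ACTIVITIES[name].items():
            required[modality] = max(required.get(modality, 0), level)
    return required


print(infer_immersion(["observe_material", "manipulate_object"]))
```

The `KeyError` branch mirrors the situation described above, in which a use case does not appear in the literature and no guideline can be given.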

6 Conclusion

We have shown that, in our three use cases, ecological validity can be established through a real/virtual comparison method. The comparison holds for some descriptors (questionnaires, movement measures, visual perception of specific properties) but not for others, such as physiological measures. Virtual reality communities have to continue exploring the question of ecological validity and accumulating results for reuse in future designs, but this is a long-term effort. VR and Cognitive Sciences have started to interact and to share their results on the role of immersion in presence. Could this be the beginning of a dictionary of (micro-level) tasks that can be combined to describe any (macro-level) task? In that case, a dream tool that adds up the immersion required for ecological validity at the micro level, in order to construct the immersion required at the macro level, would be a little closer to reality.