1 Introduction

Immersive displays such as VR devices have recently become widespread. A head-mounted display (HMD) provides a strong illusion of presence, but its major practical issue is that users commonly experience adverse physical reactions such as headaches, nausea, dizziness, and eye strain, which make the VR experience uncomfortable and hamper long-term use. These symptoms are collectively termed simulator sickness, and approximately 80% of HMD users are reported to experience it (McCauley 1984; Stanney 2003). Yildirim (2019) showed experimentally that the Oculus Rift CV1 and HTC Vive cause a greater level of sickness than desktop displays. Simulator sickness therefore remains a problem in modern VR devices, which emphasizes the need for strategies to mitigate it.

Among the many theories explaining simulator sickness, the cue conflict theory and the postural stability theory are two of the best known (Kolasinski 1995; Keshavarz 2015).

There is no well-defined measurement of VR sickness because it is highly subjective. Previous approaches for alleviating sickness include field of view (FOV) reduction, brightness reduction, and an independent visual background, but all of them also reduce presence (Fernandes and Feiner 2016; Whittinghill et al. 2015). Reducing VR sickness while retaining presence is critical to promoting the use of VR devices.

In this paper, we propose a novel VR sickness measurement using the inertial measurement unit (IMU) sensor in the VR device. Based on this measurement, we study the relationship between sickness and presence and determine the practical range of FOV reduction. We also propose a sickness score prediction algorithm based on content analysis and use it to drive dynamic FOV processing that reduces sickness. Both sickness reduction and presence are taken into account by dynamically reducing the FOV to the practical limit during high-sickness scenes and preserving the full FOV during relaxed scenes. Furthermore, with an optimized content analysis algorithm, the system renders in real time on consumer electronics devices after FOV processing.

2 Related work

Several studies have addressed measuring and reducing simulator sickness, and this work has recently been extended to sickness in HMDs. Because the level of VR sickness varies from person to person and with daily condition, it is challenging to measure accurately.

Researchers have proposed several methods for measuring simulator sickness (Kennedy et al. 1993; Bertin et al. 2005; Dahlman 2009; Chardonnet et al. 2015). Among subjective measurements, the simulator sickness questionnaire (SSQ) proposed by Kennedy et al. (1993) is the most common. It calculates scores on three subscales: nausea, disorientation, and oculomotor problems. Bertin et al. (2005) proposed a continuous evaluation in which subjects indicate their score by positioning a cursor on a scale. However, this evaluation is affected by response latencies, which can introduce arbitrary interpretation.

Researchers have therefore been evaluating objective measurements of simulator sickness. Dahlman (2009) tried various physiological measures, including skin conductance, skin temperature, skin potential, saccade amplitudes, respiration rate, heart rate, facial pallor, and gastric activity. Disturbance of the vestibular, proprioceptive, or visual system causes autonomic responses that upset the balance of the autonomic nervous system, producing symptoms such as increased skin conductance and heart rate, decreased skin temperature, and cold sweating. Another method, proposed by Chardonnet et al. (2015), uses the body's centre of gravity (COG): based on the postural stability theory, the body sway signal is analyzed to measure sickness.

For sickness reduction, FOV control is the most common method. Duh et al. (2001) found that the postural instability of subjects increases as the FOV increases from 30° to 180°. Lin et al. (2002) investigated the effect of FOV on presence, enjoyment, memory, and simulator sickness in a virtual environment; subjects' SSQ and presence subscale scores were higher with increasing FOV.

Building on our previous work (Nupur et al. 2017), we propose a novel sickness measurement and FOV processing to reduce VR sickness. The FOV processing comprises an optimal luminance profile for the periphery, asymmetric FOV processing based on human visual characteristics, and dynamic FOV processing. The dynamic FOV is driven by content analysis, and the appropriate FOV range is obtained from experiments with various fixed FOVs.

In this study, we conducted the experiments on the Samsung Gear VR device, viewing input video in the 360° equirectangular projection format.

3 Proposed approach

3.1 Sickness measurement

3.1.1 Objective measurement

Several studies have evaluated objective measurements of simulator sickness, including physiological measures and the body's COG (Dahlman 2009; Chardonnet et al. 2015). Skin conductance is hard to use as a valid measurement because its results vary with weather, humidity, and the condition of the subjects. Chardonnet et al. (2015) analyzed the body sway signals from the COG in both the time domain and the frequency domain. This measure correlates highly with the subjective sickness score and gives consistently reliable results. However, it requires an additional postural stability sensing device and postprocessing to synchronize the sensor data with the video frames.

So that sickness can be measured while VR content is being watched, we propose measuring the head dispersion of subjects with the IMU sensor in the VR device rather than measuring COG. If subjects are asked to look straight ahead and keep their head still, changes in roll and pitch from the IMU sensor correspond to changes along the x- and y-axes of the COG, because the contribution of intentional head movement to the IMU values is close to zero. To verify the reliability of the IMU data, we recorded COG sensor data and the Gear VR's IMU sensor data while the subject moved left and right and back and forth. Figure 1a, b shows the IMU roll and pitch values over time, respectively, and Fig. 1c shows the trajectory, that is, pitch plotted against roll. Figure 2a, b shows the COG X and COG Y data from the COG sensor over time, respectively, and Fig. 2c shows the corresponding sway area. The sway area is the confidence ellipse, the region that contains 95% of all COG samples drawn from the underlying Gaussian distribution. It therefore approximates the trajectory area of the body sway signal obtained from a COG sensor device (GaitView AFA-50). As Figs. 1 and 2 show, the roll and pitch of the IMU sensor and the COG X and COG Y values of the COG sensor have very similar tendencies.
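The sway area described above can be illustrated with a minimal sketch. It assumes the usual bivariate Gaussian confidence ellipse, whose area is π times the chi-square quantile (2 degrees of freedom) times the square root of the determinant of the sample covariance matrix. The function name `sway_area_95` and the use of the unbiased covariance are assumptions made for illustration; the source does not state how the COG device computes the area.

```python
import math


def sway_area_95(xs, ys, confidence=0.95):
    """Area of the confidence ellipse of 2-D samples (bivariate Gaussian assumption)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long series with at least 2 samples")
    mx = sum(xs) / n
    my = sum(ys) / n
    # Entries of the sample covariance matrix (unbiased, n - 1 in the denominator).
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    det = sxx * syy - sxy ** 2
    # For 2 degrees of freedom the chi-square quantile is -2 ln(1 - p).
    chi2 = -2.0 * math.log(1.0 - confidence)
    # max(...) guards against a tiny negative determinant from rounding.
    return math.pi * chi2 * math.sqrt(max(det, 0.0))
```

Samples that lie on a straight line give an area of zero, while samples spread in both directions give a larger ellipse.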

Fig. 1 a Roll and b pitch in IMU data over time. c Corresponding head motion trajectory

Fig. 2 a COG X and b COG Y data from the COG sensor over time. c Corresponding sway area

For a more specific analysis, we measured IMU and COG sensor values at the same time while subjects watched a sequence of four VR contents. First, subjects watch an image with homogeneous content for 15 s so that all sensor data settle to zero. Second, a VR video with no camera motion is shown for 45 s. Third, the same image as in the first step is shown for 15 s to reset the subjects' memory and the sensor data. Last, a VR video with high camera motion is shown for 45 s. IMU and COG sensor values are measured only while subjects watch the VR videos. To measure stability from the head motion data, we introduce Eq. (1):

$${\text{Head}}\;{\text{Dispersion}} = \sqrt {\frac{{\sum ({\text{roll}} - \overline{{{\text{roll}}}} )^{2} + \sum ({\text{pitch}} - \overline{{{\text{pitch}}}} )^{2}}}{n}}$$
(1)

The roll and pitch values are in degrees, \(\overline{{{\text{roll}}}}\) and \(\overline{{{\text{pitch}}}}\) are the mean values of roll and pitch over the session, and n is the number of samples. When subjects watch the VR video with no camera motion, both head dispersion and sway area are low. Analogously, when subjects watch the VR video with a lot of camera motion, both are high. To analyze the correlation between the COG and IMU sensors, we compared the sway area from the COG sensor with the head dispersion, as shown in Fig. 3. The two are highly correlated (R = 0.82, p < 0.01). We therefore use head dispersion as an objective measurement that requires no additional device and no synchronization processing.
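Equation (1) can be written out directly. The sketch below is an illustration only: the function name `head_dispersion` and the list-based inputs are assumptions, and n is taken to be the number of samples in the session.

```python
import math


def head_dispersion(roll_deg, pitch_deg):
    """Head dispersion of Eq. (1) from roll and pitch samples given in degrees."""
    n = len(roll_deg)
    if n == 0 or n != len(pitch_deg):
        raise ValueError("roll and pitch must be non-empty and equally long")
    mean_roll = sum(roll_deg) / n
    mean_pitch = sum(pitch_deg) / n
    # Sum of squared deviations of both angles from their session means.
    squared = sum((r - mean_roll) ** 2 for r in roll_deg)
    squared += sum((p - mean_pitch) ** 2 for p in pitch_deg)
    return math.sqrt(squared / n)
```

A perfectly still head gives a dispersion of zero, and the value grows with the spread of roll and pitch around their means.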

Fig. 3 The correlation between sway area in COG and head dispersion in IMU sensor of VR device

3.1.2 Presence measurement

After each experiment, subjects are asked to fill in the presence questionnaire. It covers three categories: reality, image detail, and border invisibility. One question is asked for each category:

  • How much did your experiences in the virtual environment seem consistent with your real-world experiences?

  • How well were you able to see the details in the visual environment?

  • How well could you perceive the frame surrounding the image? (high score means low presence)

The scale ranges from 1 to 5, and the meaning of each score is shown in Table 1.

Table 1 The scales used for subjective scoring

3.2 FOV processing

3.2.1 Profiles for peripheral FOV processing

To provide a natural view of the periphery, we darken the pixels progressively from the center toward the periphery on either side. This gives the effect of a FOV that decreases gradually toward the periphery. As shown in Fig. 4, Hanada (2012) investigated the dazzling sensation evoked by three luminance profiles: linear, logistic, and inverse logistic.

Fig. 4 Different profiles for the peripheral FOV processing

We applied these luminance profiles to FOV processing, as shown in Fig. 5. In Fig. 5a, a white band is visible where the linear gradation begins, and it is more severe for the inverse logistic profile. The logistic profile produced a more seamless transition and was therefore selected as the profile for peripheral FOV processing, as shown in Fig. 5b.

Fig. 5 a Linear processing of peripheral FOV and b logistic processing of peripheral FOV

The peripheral FOV darkens gradually according to the logistic profile. We define the logistic slope range as the range from where darkening begins to where the luminance reaches zero. To determine this range, we asked subjects which of several candidate logistic slope ranges made the gradual darkening least noticeable. Based on these pilot experiments, we chose 18° as the logistic slope range.
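A minimal sketch of such a logistic luminance profile is given here for illustration. The function name `logistic_gain`, the `start_deg` parameter (the eccentricity at which darkening begins), and the `steepness` value are assumptions that do not come from the source; only the 18° slope range is taken from the text. Because a logistic curve never reaches exactly 1 or 0, the sketch rescales it so that the gain is 1 where darkening begins and 0 at the end of the slope range.

```python
import math


def logistic_gain(ecc_deg, start_deg, slope_range_deg=18.0, steepness=10.0):
    """Luminance gain in [0, 1] at a given eccentricity (degrees from the center).

    steepness is a hypothetical shape parameter, not a value from the source.
    """
    if slope_range_deg <= 0:
        raise ValueError("slope_range_deg must be positive")
    t = (ecc_deg - start_deg) / slope_range_deg  # 0 at start, 1 at full black
    if t <= 0.0:
        return 1.0
    if t >= 1.0:
        return 0.0

    def s(u):
        # Decreasing logistic curve centered on the middle of the slope range.
        return 1.0 / (1.0 + math.exp(steepness * (u - 0.5)))

    hi, lo = s(0.0), s(1.0)
    # Rescale so the gain is exactly 1 at t = 0 and exactly 0 at t = 1.
    return (s(t) - lo) / (hi - lo)
```

Multiplying each pixel's luminance by this gain keeps the image untouched inside `start_deg` and fades it smoothly to black over the following 18°.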

3.2.2 Asymmetric FOV processing based on human visual characteristics

According to human visual characteristics, the vertical FOV extends from 50° upward to 70° downward (Hatada et al. 1980). In practice, when the FOV is reduced symmetrically, subjects are bothered more by the reduction in the downward direction than in the upward direction, because the human FOV is biased downward. We therefore propose asymmetric FOV processing, which preserves the ratio of 50° upward to 70° downward when the FOV is reduced, as shown in Fig. 6.
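As a worked illustration of the asymmetric split, the sketch below divides a given vertical extent between the upward and downward directions in the 50:70 ratio. The function name `asymmetric_vertical_limits` is hypothetical, and applying the ratio to the whole vertical extent is an assumed reading of the description above.

```python
def asymmetric_vertical_limits(vertical_fov_deg, up_ratio=50.0, down_ratio=70.0):
    """Split a vertical FOV into (upward, downward) angles in the up:down ratio."""
    if vertical_fov_deg < 0:
        raise ValueError("vertical_fov_deg must be non-negative")
    total = up_ratio + down_ratio
    up = vertical_fov_deg * up_ratio / total
    down = vertical_fov_deg * down_ratio / total
    return up, down
```

Under this reading, a vertical extent of 60° would be placed 25° above and 35° below the line of sight, rather than 30° on each side as in the symmetric case.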

Fig. 6 a Symmetric FOV and b asymmetric FOV processing

3.2.3 Range of FOV processing

Although reducing the FOV is an effective way to reduce sickness, there is a trade-off between presence and FOV, so it is important to study the optimal FOV control range. We analyzed the relation between sickness and presence while setting the FOV to full, 75°, 60°, and 45°. According to the experiments, the minimum FOV that effectively reduces sickness with almost negligible loss in presence is 60°. The detailed procedures and results of the experiments are explained in Sect. 4.

3.3 Content analysis algorithm

Figure 7 shows the overall flowchart of the content analysis algorithm. The input VR videos are in the 360° equirectangular projection format. Figure 8 shows one full frame of an input video. When the frame is viewed in VR, only part of the entire area is seen at any moment, depending on head movement. The content analysis algorithm therefore uses the central 800 × 600 area of each frame, where distortion is lowest, as the region of interest (ROI), marked by the red box. For feature detection, we use the Shi and Tomasi (1994) algorithm, which is suited to finding the most prominent corners in an image. In Fig. 9, the green dots are the detected features. A feature tracking algorithm then follows the coordinates of each feature through successive frames. If a feature is not found in the next frame, it is lost. If the number of features falls below a certain level, the accuracy of feature tracking cannot be guaranteed. Feature detection is therefore applied to the first frame of the input video and to any frame in which the number of remaining features is less than 20% of the maximum number of features. If the current frame has more than 20% of the maximum number of features, feature tracking is applied. Optical flow is used for feature tracking (Bouguet 1999).
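The ROI selection and the detect-or-track decision described above can be sketched as follows. The function names `center_roi` and `next_step` are hypothetical. The behavior at exactly 20% is not specified in the text, so treating it as "track" is an assumption, as is the 3840 × 1920 frame size used in the usage note.

```python
def center_roi(frame_w, frame_h, roi_w=800, roi_h=600):
    """Return (x0, y0, width, height) of an ROI centered in the frame."""
    if roi_w > frame_w or roi_h > frame_h:
        raise ValueError("ROI must fit inside the frame")
    x0 = (frame_w - roi_w) // 2
    y0 = (frame_h - roi_h) // 2
    return x0, y0, roi_w, roi_h


def next_step(frame_index, n_tracked, n_max, redetect_ratio=0.2):
    """Choose between re-running feature detection and tracking existing features.

    Detection runs on the first frame and whenever fewer than redetect_ratio
    of the maximum number of features remain; otherwise tracking continues.
    """
    if frame_index == 0 or n_tracked < redetect_ratio * n_max:
        return "detect"
    return "track"
```

For an assumed 3840 × 1920 equirectangular frame, `center_roi(3840, 1920)` places the ROI at x = 1520, y = 660. In an implementation, the "detect" branch could call a Shi-Tomasi corner detector and the "track" branch a pyramidal Lucas-Kanade tracker; OpenCV's `cv2.goodFeaturesToTrack` and `cv2.calcOpticalFlowPyrLK` are one readily available pair, although the source does not state which implementation was used.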

Fig. 7 The flowchart of the content analysis algorithm

Fig. 8 The input VR video with ROI for content analysis

Fig. 9 The visualization of content analysis results based on feature tracking in ROI (color figure online)

To remove outlier features accurately, we propose subregion-based correspondence point tracking. As shown in Fig. 10, the ROI is divided into up, down, left, right, and center regions, and the average motion vector of the features in each region is calculated. Points in each region that satisfy the conditions of both Eqs. (2) and (3) are removed.

$$\left( {\left| {\overrightarrow {{{\text{mv}}_{{{\text{sub}}}} \left( i \right)}} } \right| > 2 \times \left| {\overrightarrow {{{\text{mv}}_{{{\text{subAVG}}}} }} } \right|} \right) \cup \left( {\left| {\overrightarrow {{{\text{mv}}_{{{\text{sub}}}} \left( i \right)}} } \right| < \frac{1}{4} \times \left| {\overrightarrow {{{\text{mv}}_{{{\text{subAVG}}}} }} } \right|} \right)$$
(2)
$$\left( {\overrightarrow {{{\text{mv}}_{{{\text{sub}}_{x} }} }} \cdot \overrightarrow {{{\text{mv}}_{{{\text{subAVG}}_{x} }} }} < 0} \right)\cup \left( {\overrightarrow {{{\text{mv}}_{{{\text{sub}}_{y} }} }} \cdot \overrightarrow {{{\text{mv}}_{{{\text{subAVG}}_{y} }} }} < 0} \right)$$
(3)

where \(\overrightarrow {{{\text{mv}}_{{{\text{sub}}}} \left( i \right)}}\) is the motion vector of the i-th point in the subregion, and \(\overrightarrow {{{\text{mv}}_{{{\text{subAVG}}}} }} = \frac{1}{n}\sum\nolimits_{i = 1}^{n} {\overrightarrow {{{\text{mv}}_{{{\text{sub}}}} \left( i \right)}} }\) is the average over the n points in the subregion. This removes outliers effectively in fast-motion videos such as rollercoaster and thus prevents the prediction errors that would propagate if remaining outlier points were used to calculate the interframe motion vectors. After outlier removal, the interframe variables, including motion vector, acceleration, and rotation, are calculated using a rigid transform. Figure 9 shows the content motion and each feature extracted through the feature tracking process. The blue lines connecting the green dots represent the motion trajectory of each feature over consecutive frames. The content motion representing all features is shown as the displacement of a red cross relative to a fixed blue cross, and the red cross inside the blue circle at the lower right rotates clockwise or counterclockwise to visualize the rotation. Because content analysis is performed on a limited 800 × 600 area of the 4K input and the outlier elimination algorithm has low computational complexity, it runs at an average of about 69 fps on the five videos in Table 2.
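The outlier test of Eqs. (2) and (3) for one subregion can be sketched as below. The function names and the `require_both` flag are hypothetical. The flag defaults to the literal reading of the text (a point is removed only if it satisfies both equations); setting it to `False` removes a point that satisfies either one, which is the other plausible reading.

```python
import math


def average_vector(vectors):
    """Component-wise mean of a non-empty list of (x, y) motion vectors."""
    n = len(vectors)
    return (sum(v[0] for v in vectors) / n, sum(v[1] for v in vectors) / n)


def is_outlier(mv, avg, require_both=True):
    """Apply the magnitude test of Eq. (2) and the direction test of Eq. (3)."""
    mag = math.hypot(mv[0], mv[1])
    avg_mag = math.hypot(avg[0], avg[1])
    # Eq. (2): more than twice, or less than a quarter of, the average magnitude.
    bad_magnitude = mag > 2.0 * avg_mag or mag < 0.25 * avg_mag
    # Eq. (3): x or y component points against the average component.
    bad_direction = mv[0] * avg[0] < 0 or mv[1] * avg[1] < 0
    if require_both:
        return bad_magnitude and bad_direction
    return bad_magnitude or bad_direction


def remove_outliers(vectors, require_both=True):
    """Return the motion vectors of one subregion with outliers removed."""
    if not vectors:
        return []
    avg = average_vector(vectors)
    return [v for v in vectors if not is_outlier(v, avg, require_both)]
```

In use, `remove_outliers` would be called once per subregion (up, down, left, right, center) before the remaining vectors are passed to the rigid transform estimation.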

Fig. 10 ROI partitioning for subregion-based correspondence points tracking

Table 2 The videos including different characteristics used for the sickness estimation modeling

To design dynamic FOV processing based on content analysis, we used five videos with different characteristics, as listed in Table 2: one with a lot of fast motion (rollercoaster), one with moderate motion including acceleration and turns (twisted colossus), one animated video (tarzan), one motionless rest scene (beach), and one rest scene with large object motion but no camera motion (elephant). While subjects watched each video, we obtained the interframe variables from content motion analysis and the actual sickness from head dispersion. The sickness score is estimated by regression modeling as in Eq. (4).

$$\begin{aligned} {\text{Predicted}}\;{\text{sickness}}\;{\text{score}} & = a_{0} + a_{1} \left| {{\text{mv}}\_x} \right| + a_{2} \left| {{\text{mv}}\_y} \right| + a_{3} \left| {{\text{accel}}\_x} \right| \\ & \quad + \,a_{4} \left| {{\text{accel}}\_y} \right| + a_{5} \left| {{\text{mv}}} \right| + a_{6} \left| {{\text{accel}}} \right| \\ & \quad + \,a_{7} \left| {{\text{mv}}\_\theta } \right| + \, a_{8} \left| {{\text{accel}}\_\theta } \right| \\ \end{aligned}$$
(4)

For modeling, each video in Table 2 is split into a first half and a second half to construct the training and test sets. The parameters {a0, a1, a2, a3, a4, a5, a6, a7, a8} are derived by fitting the predicted sickness of Eq. (4) to the actual sickness obtained from Eq. (1).
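Once the parameters are known, Eq. (4) is a plain linear combination of the absolute interframe variables. The sketch below illustrates only the evaluation step, not the fitting. The function names are hypothetical, the coefficients in the usage note are arbitrary placeholders rather than fitted values, and computing |mv| and |accel| as Euclidean magnitudes of their x and y components is an assumption.

```python
import math


def sickness_features(mv_x, mv_y, accel_x, accel_y, mv_theta, accel_theta):
    """Feature vector in the term order of Eq. (4), with a leading 1 for a0."""
    return [
        1.0,
        abs(mv_x),
        abs(mv_y),
        abs(accel_x),
        abs(accel_y),
        math.hypot(mv_x, mv_y),        # |mv|, assumed Euclidean magnitude
        math.hypot(accel_x, accel_y),  # |accel|, assumed Euclidean magnitude
        abs(mv_theta),
        abs(accel_theta),
    ]


def predict_sickness(coeffs, features):
    """Evaluate Eq. (4): the dot product of [a0..a8] with the feature vector."""
    if len(coeffs) != len(features):
        raise ValueError("coeffs and features must have the same length")
    return sum(a * f for a, f in zip(coeffs, features))
```

For example, with placeholder coefficients `[0.5, 0.1, 0.1, 0, 0, 0.2, 0, 1.0, 0]` and a frame with mv = (-3, 4) and rotation -1, the features are `[1, 3, 4, 0, 0, 5, 0, 1, 0]` and the score is 0.5 + 0.3 + 0.4 + 1.0 + 1.0 = 3.2.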

Figure 11 shows the result of regression modeling on the training and test sets; \(R^{2}\) values of 0.9168 and 0.869 were obtained, respectively. The predicted sickness score is normalized to the range 0 to 1 and used for dynamic FOV mapping from full to 60°.
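For reference, a goodness-of-fit value of this kind can be computed as the coefficient of determination. The sketch below assumes that definition (one minus the ratio of residual to total sum of squares); the source does not state which definition of \(R^{2}\) was used, and the function name `r_squared` is hypothetical.

```python
def r_squared(actual, predicted):
    """Coefficient of determination between actual and predicted values."""
    n = len(actual)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and equally long")
    mean_actual = sum(actual) / n
    # Residual sum of squares: what the model fails to explain.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total sum of squares: spread of the actual values around their mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("actual values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot
```

A perfect prediction gives 1, and a prediction no better than the mean of the actual values gives 0.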

Fig. 11 Actual sickness (blue) and predicted sickness (red) of a training and b testing set (color figure online)

Figure 12a shows the sickness score of twisted colossus calculated by the regression model. Based on the estimated sickness score, the FOV angle is narrowed when the sickness score is high and widened when it is low. Figure 12b shows the FOV angle adjusted according to the sickness score. The FOV angle varies inversely with the sickness score, and the gradient can be modified to match the user's preferred dynamic FOV sensitivity. The maximum ΔFOV between frames is preset to prevent abrupt FOV changes at scene changes.
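The mapping from sickness score to FOV angle can be illustrated with a minimal sketch. It assumes min-max normalization of the scores and a linear mapping from the normalized score to the FOV, with the per-frame change clamped to a maximum ΔFOV. The function names, the linear form, and the default `max_delta_deg` are assumptions; the full FOV of the device is not given in the text, so it is a required parameter, and the 100° in the usage note is a placeholder.

```python
def normalize(scores):
    """Min-max normalize a list of sickness scores to the range [0, 1]."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]


def dynamic_fov(norm_scores, full_fov_deg, min_fov_deg=60.0, max_delta_deg=1.0):
    """Per-frame FOV angles: high score narrows the FOV, low score widens it."""
    fovs = []
    current = full_fov_deg
    for s in norm_scores:
        s = min(max(s, 0.0), 1.0)
        # Linear mapping (assumed): score 0 -> full FOV, score 1 -> minimum FOV.
        target = full_fov_deg - s * (full_fov_deg - min_fov_deg)
        # Clamp the frame-to-frame change to avoid abrupt jumps.
        step = max(-max_delta_deg, min(max_delta_deg, target - current))
        current += step
        fovs.append(current)
    return fovs
```

With a placeholder full FOV of 100° and a maximum change of 30° per frame, the normalized scores `[0, 1, 1, 0]` give `[100, 70, 60, 90]`: the FOV approaches each target over several frames instead of jumping. A steeper or shallower mapping would correspond to the adjustable gradient mentioned above.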

Fig. 12 a The predicted sickness score and b the adjusted dynamic FOV angle

4 Experiment and results

The Gear VR is used for the experiments, and sickness is measured under FOV control while subjects watch 360° videos (Fig. 13).

Fig. 13 The different FOVs used in this experiment. a The beach scene used as a rest scene to ease the VR experience, b original settings, c 75° FOV, d 60° FOV, e 45° FOV

4.1 Design of experiment

We selected the rollercoaster video for the experiments, which includes translation, rotation, and various other kinds of motion. Subjects first watched the beach video, which has negligible movement, to adapt to the VR environment. During the experiments, subjects were asked to keep the Romberg stance, which is commonly used for quantifying balance, and to look straight ahead to minimize intentional head movement.

We carried out two experiments. The first studied the relation between sickness and presence while participants watched videos with different FOVs; from this experiment, we determine the practical range of FOV. The second studied the effect of dynamic FOV: sickness was measured for full FOV and dynamic FOV, in randomized order.

4.2 Different fixed FOV experiment

The head dispersions of 8 subjects were observed for the rollercoaster and beach videos, as shown in Fig. 14. The result for the beach video serves as the reference dispersion in a relaxed condition. Sickness decreases as the FOV is reduced to 60°, but when the FOV is reduced further to 45°, the dispersion increases slightly, from 1.37 to 1.55.

Fig. 14 The head dispersion when 8 subjects watch rollercoaster and beach videos with different fixed FOV

Presence was measured by three factors: border invisibility, reality, and image detail. As shown in Fig. 15, border invisibility and image detail decrease gradually from full FOV to 45°. Border invisibility in particular shows a clear trend in the subjective scores across FOVs, which suggests that it is more intuitive and clear to judge than the other two factors. For reality, full FOV scored lower than 75°, which we attribute to the image quality at full FOV differing from the real world. Taken together, presence stays at a moderate level from full FOV to 60° but drops suddenly to 2.67 at 45°, as shown in Fig. 15. From the fixed FOV experiment, we conclude that the limit of FOV reduction that reduces sickness while preserving presence close to that of full FOV is around 60°.

Fig. 15 The presence score when 8 subjects watch rollercoaster with different fixed FOV

4.3 Dynamic FOV experiment

To study the effect of dynamic FOV, 17 participants took part in the experiment. The range of dynamic FOV is set from full to 60°, the minimum FOV obtained from the previous experiment. Figure 16 shows the sickness of subjects watching the rollercoaster video with full and dynamic FOV. Average sickness decreases by 37.05%, from 1.77 to 1.12, when going from full FOV to dynamic FOV, and the difference is significant at the 0.002 level, as shown in Fig. 17.

Fig. 16 The sickness of 17 subjects watching videos with full FOV and dynamic FOV

Fig. 17 The variation of head dispersion for 17 subjects

5 Discussion

In the fixed FOV experiment, head dispersion increases slightly at 45°. Most participants reported that the black regions in the periphery were barely noticeable down to an FOV of 60°, but when the FOV is reduced further to 45°, the border invisibility score in the presence questionnaire increases suddenly. This trend is also observed in the beach video, which was selected to relax the subjects. We therefore interpret the result not as 45° FOV causing more VR sickness than 60°, but as the visible black periphery annoying the subjects.

As Fig. 16 shows for the dynamic FOV experiment, the amount of sickness reduction differs from person to person. For example, subject 5 shows the maximum reduction in dispersion, 2.61°, while subject 12 shows the minimum, 0.02°. Subjects 11 and 16 even show slightly higher head dispersion with dynamic FOV than with full FOV, by 0.21° and 0.12°, respectively. Although these are only 2 of 17 subjects and the increases are small, they show that sensitivity to sickness varies among people.

The system has an average processing time of about 16 ms per frame, including sickness score prediction, dynamic FOV processing, and VR content rendering, for the five videos in Table 2. This corresponds to 60 fps or more (1000 ms / 16 ms ≈ 62.5 fps), which enables real-time processing of the algorithm.

As future work, if a VR video carries sensor data such as pitch and roll as metadata, this could provide more accurate motion information about the video and be combined with content analysis for more effective dynamic FOV processing.

6 Conclusion

This paper presents a VR sickness reduction method using dynamic FOV processing. To provide a natural view of the periphery and reflect human visual characteristics, a logistic profile and asymmetric FOV processing are adopted. Interframe variables are calculated for sickness estimation, and dynamic FOV processing is applied to mitigate VR sickness. From the experiments, we found that the practical range of FOV is full to 60° and that dynamic FOV processing reduces VR sickness by 37.05%. This technology is an advance for the VR consumer electronics field: it is a low-complexity software algorithm embedded in the VR device and operates in real time. Because it provides the optimal FOV based on content analysis, it is expected to be applicable to various VR devices without hardware dependency. By reducing VR sickness, a major hurdle in using VR, it has the potential to greatly improve usability and contribute to the expansion of the VR market.