1 Introduction

Being inside an immersive Virtual Reality (VR) simulation allows the user to fade out the real surroundings and focus on the presented virtual illusion. In fully immersive VR, visual stimuli from the real surroundings are blocked out, which makes the experience of presence possible. Presence is often described as the feeling of being there [4], accompanied by fading out the real surroundings and experiencing the virtual world as the real world. Besides the importance of different visual aspects [1], a realistic auditory environment is also a major influence on presence in virtual worlds [5, 11, 17]. In contrast to visual stimuli, auditory cues from the real world can still reach the user and thus affect the experience. Having two different sources of auditory information can be problematic because of the way humans process input from different modalities. According to Wickens [15], this leads to a conflict and to an increased cognitive demand to fade out the disruptive sound. This is a major problem for applications that rely on the player feeling part of the virtual world, such as VR-systems that are used during local surgeries to distract the patient from potentially stress-inducing stimuli [7]. One solution would be to integrate the distracting sound elements into the VR world, maintaining the illusion by pretending there is a reason for the sound. This idea raises the question of to what degree such a workaround would benefit immersion or presence. To answer this, we developed a testbed prototype that recognizes real-life sounds and adapts the visual environment accordingly. The adaptive virtual world is based on classification algorithms that analyze the auditory environment of the real world. The virtual world thus changes according to the classified sound in order to match the virtual visual elements with the real auditory surroundings. Offering this match of sensory information ought to help players experience presence and ignore possibly disruptive perceptions.

2 Related Work

2.1 Effects of Disruptive Audio

One of the main prerequisites for the experience of presence is sensory immersion [16]. Sensory immersion is defined by the ability of a system to deliver sensory cues that are close to real perceptions [13]. Thus, immersion can be described as an objective quality of a medium. The level of immersion increases as more senses and cognitions of the player are addressed. Hence, modern VR-systems can create a high level of sensory immersion by presenting nearly photo-realistic stereoscopic images, and thus a greater experience of presence. In-game sound is a core element of video games and modern VR-applications and a crucial factor for experiencing presence [5, 11, 17]. While modern VR-systems use head-mounted displays (HMDs) that render the stereoscopic image and shield the user from external visual cues, intense sound from the real world can still reach the user and thus disrupt the player’s presence. Although this argument also applies to other sensory modalities such as haptics, olfaction, or gustation, auditory perception and the experienced realism of the auditory cues are, besides visual perception, the main influencing factors of a virtual experience [17]. In an experiment by Wharton and Collins [14], participants reported that the main reason for listening to additional music while playing games was to mask external sounds and thereby increase their immersion. This underlines users’ need to immerse themselves in the virtual world and to fade out external stimuli. Furthermore, Wharton and Collins [14] found that the wrong selection of music can have an adverse effect and reduce the experienced immersion, especially if the music does not fit the visual content. Hence, non-matching visual and auditory cues can break the experience of presence.

2.2 Adaptive Audio

Active Noise Cancelling (ANC) headphones are a common approach to fading out disruptive sounds. These headphones eliminate sound using destructive interference and are often used to shield the user from distracting ambient sound [9, 12]. Because ear shapes differ from person to person and sound is also transmitted through the cranial bone, ANC headphones cannot eliminate the sound completely but only reduce it. In addition, the effectiveness of ANC headphones varies with the pitch of the sound present [3, 9, 10]. Furthermore, ANC headphones can filter out important information that must be perceived, e.g. instructions during medical procedures, and are therefore not applicable in several fields of application. Hence, disruptive sound can still reach the user. Adaptive audio is one approach to counter the problem of disruptive sound. Adaptive audio in games refers to the ability of a game to change the in-game sound or music according to the gameplay [2]. A well-known example of the first application of this technique can be found in the game “Super Mario Bros.”, where the tempo of the music increases when the timer begins to run out. However, most digital games only consider the game state as a possible trigger for changing the music and do not include external auditory sources such as music or ambient sound of the real world. Nevertheless, games like “Audiosurf”, “Beat Hazard” or “Audio Shield” generate game content according to preselected music. These are first approaches to adapting the game world to external influences; however, the chosen music is analyzed prior to the game. Hence, changes in the ambient sound of the real world are not considered. “The Polynomial” takes this one step further: it considers microphone input and manipulates the visual game world accordingly. This game world is adapted in real time and is a first approach to matching the visual impressions with the auditory cues of the surroundings. However, “The Polynomial” only changes the visual surroundings and does not adapt the gameplay according to real-life sounds. So far, we could not find any application that allows such a gameplay adaptation based on real-life ambient sound. Therefore, we have developed a fully functional testbed prototype of an adaptive VR-game that generates games and regulates the visual surroundings based on auditory cues classified in real time.

3 Development of an Adaptive VR Game

Based on the concept of a VR application that considers real-life sound, changes game world elements accordingly, and thus integrates external sounds into the virtual world in a meaningful way, we examined the idea of an adaptive audio VR-game. To this end, we developed a fully immersive VR-environment that uses the HTC Vive to display the rendered virtual world and an external microphone to analyze all auditory cues of the real world. We created a scenario that allows the player to explore four different types of environments. Every environment is located in a different area, allows different kinds of playful interaction, and is activated when the pitch of the outside sound is similar to the dominant sound of the corresponding environment. In this way, we try to achieve a fit between the virtual environment and the real-world surroundings. The four situations are a showcase for four discriminable pitches and are a first approach to integrating outside sound in a meaningful way. An interactive walkthrough is not possible, since every environment is only active while the matching real-life sound is present. Our development is a first approach to a testbed game that makes it possible to examine how the perception of disruptive sound can be reduced and which pitches are of particular importance. The virtual world was implemented using the game engine Unity. The game is a 3D single-player game that takes place in a virtual city consisting of different buildings, railroads, airports, and a street system. The player has a top-down view that allows observation of the complete urban surroundings. The virtual city is located on an island surrounded by water. All assets are designed in a low-poly style, since we did not focus on a photo-realistic setting but rather on a first testbed prototype. In addition to the activation of virtual environments according to the pitch of the real-life sound, the volume of the ambient real-world sound regulates the dynamics of the city. Noisy surroundings lead to a high volume of traffic on the streets and in the air.

Fig. 1. Each environment and the corresponding mini-game that can be activated is located in a different area of the virtual city. The dominant sound of the environment differs depending on the location.

3.1 Adaptive Elements

The game incorporates two different adaptive approaches. On the one hand, mini-games are triggered according to the pitch of the real-world sound. On the other hand, an increase in the overall real-world volume increases the city’s traffic and thus leads to louder ambient traffic sound.

Mini-games. There are four acoustic places, each of which comes with a mini-game. Each mini-game is activated when the real-life sound is classified accordingly. The first mini-game (Game A) takes place in a park near the city and consists of a bonfire that is about to run out of wood. The player’s task is to rapidly gather wood to keep the fire going. The dominant sound in this environment consists of medium-high frequency sounds, such as birds chirping. The second mini-game (Game B) is located on top of a building and allows the player to paint on a wall using the controller as a spray can. The sound of the spray can and the sound of the air conditioning system installed next to the wall are high frequency sounds. In the third mini-game (Game C), the player is inside the airport tower and controls departing airplanes. The sound of the engines lies in a medium-low frequency range. The fourth mini-game (Game D) is located near the central station, where the player has to clean and repair incoming trains. Here the situation suggests a low frequency sound due to the mechanical surroundings. A game is activated when the corresponding sound is classified, and only one game is playable at a time. The surroundings of each mini-game match a certain real-life pitch. If this pitch is recognized, the corresponding game is activated. Real-life surroundings that sound similar to birds chirping will lead to the bonfire game, whereas deep industrial noise will trigger the train repair game.
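As an illustration, the mapping from a recognized pitch class to a mini-game could be sketched as follows. This is a minimal sketch under stated assumptions and not the classifier used in the prototype: the band boundaries in `BANDS`, the function name `select_mini_game`, and the idea of summarizing the microphone input as one dominant frequency in Hz are all hypothetical, since the text names four pitch classes but gives no numeric limits.

```python
from typing import Optional

# Hypothetical band edges in Hz (assumption: the paper only names the
# four pitch classes, it does not state numeric boundaries).
BANDS = [
    (0.0, 250.0, "D"),        # low: train repair near the central station
    (250.0, 1000.0, "C"),     # medium-low: airport tower
    (1000.0, 4000.0, "A"),    # medium-high: bonfire in the park
    (4000.0, 20000.0, "B"),   # high: spray painting on the rooftop
]


def select_mini_game(dominant_hz: Optional[float]) -> Optional[str]:
    """Return the label of the single active mini-game, or None.

    The bands do not overlap, so at most one game is active at a time.
    None means that no sound was classified and no game is activated.
    """
    if dominant_hz is None:
        return None
    for low, high, game in BANDS:
        if low <= dominant_hz < high:
            return game
    return None
```

Because the bands are disjoint, the sketch reproduces the rule that only one game is playable at a time; a real classifier would likely use richer features than a single frequency value.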

Fig. 2. Illustration of the location of the mini-games. Each mini-game is located in a different environment with its own ambient sound. Game A is located in a natural environment next to a bonfire with a medium-high sound pitch. Game B takes place on top of a building next to an AC system, corresponding to a high sound pitch. Game C is located in the tower of an airport, which corresponds to medium-low sound pitches, and Game D places the player next to a rail station that represents low sound pitches.

Traffic. As a metaphor, outside sound is mapped to in-game traffic. This mechanism manipulates the visual ambient environment and matches real-life sound to the sound of the driving cars and airplanes. The city contains a street system with several parking lots. At the start of the game, all cars are parked. If the system recognizes a certain volume of real-world sound, cars start to drive a random route through the city. The higher the real-world volume, the more cars are driving. The same principle applies to airplanes. If the real-world volume decreases, cars park at the nearest parking lot and airplanes start to land.
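The volume-to-traffic mapping described above could look roughly like the following sketch. It assumes that the microphone delivers normalized samples in the range -1 to 1 and that loudness is measured as the root mean square (RMS) of one audio frame; the names `rms` and `target_car_count` as well as the values of `threshold` and `full_level` are hypothetical and not taken from the prototype.

```python
import math
from typing import Sequence


def rms(samples: Sequence[float]) -> float:
    """Root mean square of one audio frame (0.0 for an empty frame)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def target_car_count(level: float, total_cars: int,
                     threshold: float = 0.05,
                     full_level: float = 0.5) -> int:
    """Number of cars that should currently be driving.

    At or below `threshold` all cars stay parked. Above it, the share of
    driving cars grows linearly until `full_level`, where every car is
    on the road. Both limits are assumed values for illustration.
    """
    if level <= threshold:
        return 0
    fraction = min(1.0, (level - threshold) / (full_level - threshold))
    # Round up so that at least one car drives once the threshold is passed.
    return min(total_cars, math.ceil(fraction * total_cars))
```

A game loop could compare the returned target with the number of cars currently driving and then either send parked cars onto a random route or direct driving cars to the nearest parking lot. The same function could be reused for airplanes.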

4 Discussion and Future Work

Our goal is to improve immersion and presence in VR despite disturbing outside sounds. Hence, we developed a fully functional prototype that analyzes real-life ambient sound and changes game elements within the virtual world accordingly. This makes it possible to integrate potentially disruptive sounds of the surroundings into the game world in a meaningful way and is intended to maintain the illusion and the experience of presence. Considering the works of Hoffman [6, 7, 8], who examined the use of VR to distract patients from painful and stress-inducing medical procedures, our approach can be used to increase the distracting effect of the VR world by countering discomforting ambient sound in such situations. This field of application was our initial motivation for this investigation. Our development thus serves as a testbed game, which we will evaluate further. We want to use our testbed game to examine which pitches and sounds are specifically suitable for the adaptive approach. We hope to use the findings of our future research to derive guidelines on how a diverse adaptive audio application should be implemented and which factors are crucial to increase presence. For further development, the site of operation should be considered when adapting the game. In this way, the sounds that will occur during the use of our prototype can be estimated, resulting in game changes that are closer to the real-life sounds. For this purpose, it is advisable to pre-analyze disruptive sounds at the site of operation and create an acoustic fingerprint for each sound that might occur. These fingerprints can be compared with the current sound at run-time. If a stored fingerprint matches the current sound, the game can instantly activate game content that matches this specific sound. Pre-defining situations, as implemented in our prototype, would then not be necessary.
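The run-time comparison of fingerprints could be sketched as follows. This is only one possible reading of the idea and rests on several assumptions: a fingerprint is represented here as a short vector of band energies, similarity is measured with the cosine of the angle between two such vectors, and the names `normalize`, `best_match`, and `min_similarity` as well as the example sounds are hypothetical.

```python
import math
from typing import Dict, List, Optional, Sequence


def normalize(band_energies: Sequence[float]) -> List[float]:
    """Scale a band-energy vector to unit length (zeros stay zeros)."""
    norm = math.sqrt(sum(e * e for e in band_energies))
    if norm == 0.0:
        return [0.0] * len(band_energies)
    return [e / norm for e in band_energies]


def best_match(current: Sequence[float],
               stored: Dict[str, Sequence[float]],
               min_similarity: float = 0.9) -> Optional[str]:
    """Return the name of the most similar stored fingerprint.

    Similarity is the cosine similarity of the normalized vectors.
    None is returned if no stored fingerprint reaches `min_similarity`
    (an assumed cut-off), so unknown sounds trigger no game content.
    """
    query = normalize(current)
    best_name: Optional[str] = None
    best_score = min_similarity
    for name, fingerprint in stored.items():
        reference = normalize(fingerprint)
        score = sum(q * r for q, r in zip(query, reference))
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Hypothetical fingerprints recorded beforehand at the site of operation.
STORED = {
    "drill": [0.1, 0.2, 0.9, 0.8],
    "ventilator": [0.9, 0.7, 0.1, 0.05],
}
```

With these example values, `best_match([0.2, 0.3, 1.0, 0.7], STORED)` selects the hypothetical "drill" entry, while a flat spectrum such as `[1.0, 1.0, 1.0, 1.0]` stays below the cut-off and yields no match.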
Ideally, such an adaptive virtual world could react to a great variety of outside sounds and generate a more persistent game experience. An extension of the existing idea would be the use of spatial sound. So far, the position of the sound source is not considered. Elements that are generated in the virtual world to match a real-life sound should be located at the same position in the virtual world as the sound source in the real world. Otherwise, the visual representation does not match the acoustic surroundings, which can break presence. This research presents a first approach to improving presence and immersion despite disturbing outside sounds. Although shortcomings exist and the prototype has to be adapted to the respective field of application, the method can contribute to delivering a more compelling VR experience.