
1 Introduction and Research Question

In modern cockpits, most information is presented to the pilot visually. Large visual display units in state-of-the-art head-down glass cockpits provide a considerable amount of information, e.g., the Primary Flight Display, Navigation Display, Systems Display, and Engine and Warning Display. Additionally, helmet-mounted displays and head-up displays increase situational awareness on the one hand but heavily consume the pilot's visual resources on the other (Jovanovic 2009). Audio as an interface in modern aircraft cockpits is under-represented compared to the multitude of visual displays. In present-day civil cockpits it conveys no spatial information and is used only to draw attention to a visual display (Begault and Pittman 1996) or for intra-crew and crew-to-air-traffic-control communication. In addition to the limited space in modern cockpits, human cognitive capacity is limited (Wickens 2002). The increasing number of systems that have to be used, managed or monitored by the cockpit crew creates new operational burdens and new kinds of failure modes in the overall human-machine system (Oving et al. 2004; Spence and Ho 2008).

Audio research in aviation has been sparse and has mostly covered spatial audio with a set of loudspeakers around the participant's head or a simple left-right volume difference in the headset (Simpson et al. 2007). However, several studies have suggested a multitude of applications for 3D audio in the cockpit (Begault and Pittman 1996; Haas 1998; Veltman et al. 2004). This paper aims to fill this gap and presents the design and results of a psychoacoustic 3D audio experiment focused on aviation needs.

The experiment tests the localization of 3D audio presented via a standard stereo headset as commonly used in present aircraft cockpits. Three main aspects were considered. Firstly, the localization performance was evaluated, i.e., how accurately participants localize sound sources placed at predetermined positions in 3D space. This shows with what uncertainty participants can localize sound sources at desired positions. This information is needed to decide which applications in the aviation domain could in future be supplemented or replaced by a 3D audio system. Secondly, the influence of linked head movement, measured by a head tracker, on the localization performance was analyzed. Head linking was predicted to have a distinctly positive influence on the localization performance. It is important to know how strong an impact head movement has, and of what kind, on the acceptance and performance of the participants. Thirdly, three different test sounds were presented to the participants at the same positions and under the same conditions. The main question was whether the localization performance increases for a sound with a wider frequency spectrum.

The paper is organized in three parts. Section 2 gives an overview of the setup as well as the dependent and independent variables. In Sect. 3 the findings of the 3D audio experiment and the raw data are briefly discussed. Conclusions and an outline of future work are presented in Sect. 4.

2 Experiment Design

2.1 Procedure

The experiment was split into two parts for each participant. First, every participant was introduced to the experiment and given a short summary of the following steps. After that, the pure tone audiometry test was carried out. A quick result was shown to the experiment operator to ensure that the participant met the requirements of the experiment. The main part of the experiment followed.

At the beginning, participants received a digital questionnaire. They were asked for their age and gender and whether they held a pilot license. Participants holding a pilot license were asked about the license type, flight hours, medical and the use of a headset during flight. All participants were asked whether they had any experience with spatial audio. Moreover, they were asked how often and for which applications they use headphones and whether they play computer games or a musical instrument.

Four different sessions were created and presented to the participants in mixed order. Participants were introduced to the first experiment session and told what to expect. They were asked to rotate their whole body with the swivel chair to localize the test sound. If required, the head tracker was calibrated before a new test sound was played. In every session and for every sound angle the presentation sequence was the same. First, the sound was played twice at the 0\(^{\circ }\) position. Then it moved to the target position and was played three times during the movement. At the target position, the test sound was played either five times (sessions A, D) or until the participant pressed a button (sessions B, C). The head tracker was enabled in sessions A and B, and participants were asked to rotate their body as they preferred while the sound played. In sessions C and D the head tracker was not active and the test sound was not played relative to the orientation of the participant’s head.

During one session, 20 different angles were each presented twice. Each session took around 15 to 20 min, depending on the time the participant needed to localize the sound and decide on the perceived position. Directly after each test session, participants answered a questionnaire. They were asked how they felt after the session and what they thought about the given sound. After a short break the next session began, again with an introduction to the upcoming setup.

After four sessions, the main experiment part ended and participants were asked for their overall opinion of the experiment and the 3D audio sounds.

2.2 Sound

Three test sounds (A, B, Voice) were created as basic stimuli for the experiment, each with a length of one second. These sounds were used to evaluate the localization performance. Sound A was designed as a technical warning sound with a frequency of 2,000 Hz, Sound B as one with a frequency of 4,000 Hz. Both sounds are similar to typical warning sounds used in aviation. Sound Voice is a synthetic English female voice speaking the word "position".

2.3 Experiment Sessions

The experiment was split into four test sessions [A–D] for each participant; the order was mixed between the participants:

  • A with head tracker and five iterations at target angle

  • B with head tracker and looped iteration at target angle

  • C without head tracker and looped iteration at target angle

  • D without head tracker and five iterations at target angle
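The four sessions differ in only two factors, so they can be written down as a small configuration table. The following is a minimal sketch, not the original experiment software; the names `Session`, `SESSIONS` and `session_order` are hypothetical, and the uniform shuffle is only one possible way to mix the order between participants.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Session:
    name: str
    head_tracker: bool           # True: audio is rendered relative to head orientation
    looped: bool                 # True: sound repeats until the participant presses a button
    repetitions: Optional[int]   # fixed number of plays at the target angle, None if looped


# Session definitions as listed in the text.
SESSIONS = {
    "A": Session("A", head_tracker=True, looped=False, repetitions=5),
    "B": Session("B", head_tracker=True, looped=True, repetitions=None),
    "C": Session("C", head_tracker=False, looped=True, repetitions=None),
    "D": Session("D", head_tracker=False, looped=False, repetitions=5),
}


def session_order(rng: random.Random) -> list:
    """Return a mixed session order for one participant (assumed: uniform shuffle)."""
    return rng.sample(list(SESSIONS), k=len(SESSIONS))
```

Passing a seeded `random.Random` instance keeps the order reproducible per participant.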

For the experiment, 20 sound angles were defined. The first six angles were always presented at the positions 90\(^{\circ }\), 30\(^{\circ }\), 270\(^{\circ }\), 330\(^{\circ }\), 150\(^{\circ }\), 210\(^{\circ }\). After these, the next 12 angles were defined randomly. Each angle was presented to the participant a second time after the first pass through the angle set had been completed. Each presented sound angle was at least 40\(^{\circ }\) away from the one presented before. For the test sessions, four distinct angle sets were defined, with no correspondence between them. In total, \(4\times 40\) sound angles were presented to each participant during the experiment.
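The construction of one angle set can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used: the 10\(^{\circ }\) candidate grid, the requirement that angles within a set are distinct, and the names `circular_distance` and `build_angle_set` are all hypothetical. The number of random angles is left as a parameter. Only the fixed starting positions and the 40\(^{\circ }\) minimum separation are taken from the text.

```python
import random

# Fixed starting positions taken from the text.
FIXED_START = [90, 30, 270, 330, 150, 210]


def circular_distance(a: float, b: float) -> float:
    """Smallest angular distance between two compass directions, in degrees."""
    d = abs(a - b) % 360
    return min(d, 360 - d)


def build_angle_set(rng: random.Random, n_random: int,
                    step: int = 10, min_gap: int = 40) -> list:
    """Fixed start angles followed by n_random randomly drawn angles.

    Assumptions (not from the paper): candidates lie on a `step`-degree grid
    and every angle occurs only once per set.
    """
    angles = list(FIXED_START)
    candidates = list(range(0, 360, step))
    for _ in range(n_random):
        allowed = [a for a in candidates
                   if a not in angles
                   and circular_distance(a, angles[-1]) >= min_gap]
        if not allowed:
            raise ValueError("no candidate angle satisfies the constraints")
        angles.append(rng.choice(allowed))
    return angles
```

The distance is computed on the circle, so that, for example, 350\(^{\circ }\) and 10\(^{\circ }\) count as 20\(^{\circ }\) apart and would violate the 40\(^{\circ }\) rule.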

2.4 Software and Head Tracker

Audio for the right and left ear was calculated in real time by the experiment software. The head-related transfer function (HRTF) used in this experiment was not individualized for each participant. In the two head tracker sessions, the audio corresponded to the direction of the desired sound source position relative to the orientation of the participant’s head. This was made possible by combining the headset with a Carl Zeiss cinemizer head tracker, which continuously sent the head position and orientation to the experiment software. With this information, the software was able to relocate the audio in real time whenever the participant moved the head. Figure 1 shows the structure of this system. The complete real-time 3D audio system, including HRTF, head tracker, sound source and logging, was executed on a DELL E7440 laptop.
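The core of head-coupled rendering is the azimuth handed to the HRTF stage. A minimal sketch, assuming compass-style angles in degrees and considering head yaw only; the function name `relative_azimuth` is hypothetical and the actual experiment software may work differently:

```python
def relative_azimuth(source_deg: float, head_yaw_deg: float,
                     head_tracking: bool = True) -> float:
    """Azimuth of the sound source as it would be passed to the HRTF, in [0, 360).

    With head tracking, the source stays fixed in the room, so the head yaw
    is subtracted. Without head tracking, the source is fixed relative to the
    head and the yaw is ignored.
    """
    if not head_tracking:
        return source_deg % 360.0
    return (source_deg - head_yaw_deg) % 360.0
```

With head tracking enabled, a source at 90\(^{\circ }\) is rendered straight ahead (0\(^{\circ }\)) once the participant has turned to face 90\(^{\circ }\). Without head tracking, it stays at 90\(^{\circ }\) relative to the head regardless of the rotation.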

Fig. 1. Structure of the test system used in the experiment

2.5 Definition of Direction

Direction in this experiment is defined like a compass: the angle rises clockwise from 0\(^{\circ }\) to 360\(^{\circ }\). An angle of 0\(^{\circ }\) denotes the direction straight ahead and 180\(^{\circ }\) the direction directly behind the participant. Accordingly, 90\(^{\circ }\) is directly to the right of the participant and 270\(^{\circ }\) directly to the left. Following the experiment design, the elevation angle was fixed to 0\(^{\circ }\), i.e., the horizontal plane at eye level for all participants.

2.6 Pointing Device

Participants had to point to the perceived position of the sound. To do so, they moved a digital red ball, shown on the wall of the simulator, to the perceived position. For this purpose, the ball position was coupled with the head tracker attached to the participant’s headset. The ball was shown in the center of attention after the test sound had been played. Participants were instructed to rotate their whole body on the swivel chair as part of the localization task. They confirmed the perceived position by clicking a presenter button held in their hand. If required, the combination of ball and head tracker was calibrated by the participant before a new audio position was played. Pointing by hand and pointing by head movement make no difference to the result (Majdak et al. 2010).

2.7 Location

The experiment took place at the Institute of Flight Guidance of the German Aerospace Center (DLR), Braunschweig, Germany. A 360-degree round room, normally used as a tower simulator, was used. The wall of the simulator was light blue during the experiment. The positions 0\(^{\circ }\), 90\(^{\circ }\), 180\(^{\circ }\) and 270\(^{\circ }\) were marked by a black chessboard line, as shown in Fig. 2. Inside the tower simulator, a constant background noise of 43 dB from the projectors was present during the whole experiment. The participant sat on a swivel chair in the center of the room, centered between the four marked positions. The cable of the headset and head tracker was attached to the chair so that the participant could rotate on the chair without limitation. The experiment operator sat in the same room at approximately 150\(^{\circ }\), 3 m away from the participant.

Fig. 2. 360-degree tower simulator

2.8 Participants

The experiment is targeted towards an application in aviation systems. Nevertheless, this part of the study aimed at a wider population. Thus, participants were chosen randomly from the scientists of the research facility. Twenty-three people, five female and eighteen male, aged 25 to 62 (M = 36.43, SD = 10.01), participated in the experiment. Ten participants held a pilot license, with an average of 478 flight hours in total (SD = 448.06).

2.9 Pure Tone Audiometry

According to the German Acoustical Society, participants in a hearing test should have a normal threshold of hearing. This can be verified by threshold audiometry or a questionnaire (Hellbrück et al. 2008). For this experiment, a pure tone audiometry test was built. It used the up-5-down-10 method, in which the tone is raised by 5 dB for every "No" and lowered by 10 dB for every "Yes". The tones were presented without a rhythmic pattern, at reasonably irregular intervals (Gelfand 2001). Every time participants heard a tone, they had to press a button. The hearing test was done with the same type of headset as used in the experiment. Pilots with a valid European license have to pass their medical examination periodically, which includes a test of the hearing threshold. In pure tone audiometry, the hearing loss in each ear must not exceed 35 dB at 500 Hz, 1,000 Hz and 2,000 Hz, and 50 dB at 3,000 Hz (BMVI 2007).
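The staircase logic of the up-5-down-10 method and the subsequent limit check can be sketched as follows. This is an illustration only. The stopping rule (the first level heard twice on an ascending step), the starting level, the level range and the names `up5_down10` and `within_pilot_limits` are assumptions not stated in the text. Only the 5 dB and 10 dB step sizes and the per-frequency limits come from the description above.

```python
from typing import Callable, Dict, Optional

# Maximum permitted hearing loss per frequency (Hz -> dB), as given in the text.
PILOT_LIMITS_DB = {500: 35, 1000: 35, 2000: 35, 3000: 50}


def up5_down10(hears: Callable[[float], bool], start_db: float = 40,
               min_db: float = -10, max_db: float = 90,
               needed: int = 2) -> Optional[float]:
    """Estimate a hearing threshold for one frequency and one ear.

    `hears(level)` stands for the participant's button press at that level.
    The level drops 10 dB after every "Yes" and rises 5 dB after every "No".
    Assumed stopping rule: return the first level that was heard `needed`
    times directly after a "No" (an ascending step).
    """
    level = start_db
    ascending_hits: Dict[float, int] = {}
    previous_heard = True  # the first presentation is not an ascending step
    while min_db <= level <= max_db:
        heard = hears(level)
        if heard:
            if not previous_heard:
                ascending_hits[level] = ascending_hits.get(level, 0) + 1
                if ascending_hits[level] >= needed:
                    return level
            level -= 10
        else:
            level += 5
        previous_heard = heard
    return None  # threshold lies outside the tested range


def within_pilot_limits(thresholds_db: Dict[int, float]) -> bool:
    """Check measured thresholds of one ear against the limits for pilots."""
    return all(thresholds_db[f] <= limit for f, limit in PILOT_LIMITS_DB.items())
```

For a simulated listener who responds to every tone at or above 15 dB, `up5_down10(lambda level: level >= 15)` descends from 40 dB, misses the tone at 10 dB, and settles on 15 dB after hearing it twice on the way up.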

All participants were tested and passed the pure tone audiometry test within the limits for pilots at the frequencies 500 Hz, 1,000 Hz, 2,000 Hz and 3,000 Hz. The average hearing threshold was 16.03 dB (SD = 7.05), with an average left-right difference below 5 dB (M = 3.04 dB, SD = 1.84 dB).

3 Results

The localization error was calculated as the difference between the actual direction and the direction estimated by the participant. The errors were assumed to be normally distributed. Mean and standard deviation were calculated from the raw directional data.
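Because directions wrap around at 360\(^{\circ }\), the difference has to be taken on the circle before averaging. A minimal sketch of this calculation follows. The sign convention (positive means the response lies clockwise of the target), the use of the sample standard deviation, and the names `signed_error` and `summarize` are assumptions, since the text does not specify them.

```python
import statistics
from typing import Iterable, Tuple


def signed_error(estimated_deg: float, actual_deg: float) -> float:
    """Signed angular difference wrapped to [-180, 180).

    Assumed convention: positive values mean the response lies clockwise
    (to the right) of the presented direction.
    """
    return (estimated_deg - actual_deg + 180.0) % 360.0 - 180.0


def summarize(trials: Iterable[Tuple[float, float]]) -> Tuple[float, float]:
    """Mean and sample standard deviation of the signed errors.

    `trials` holds (estimated, actual) pairs in degrees.
    """
    errors = [signed_error(est, act) for est, act in trials]
    return statistics.mean(errors), statistics.stdev(errors)
```

Without the wrap-around, a response of 10\(^{\circ }\) to a target at 350\(^{\circ }\) would be scored as an error of \(-340^{\circ }\) instead of \(20^{\circ }\), which would distort both mean and standard deviation near the 0\(^{\circ }\) position.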

3.1 Direction Offset

During the experiment, each participant heard 160 sound positions in total. As described before, 20 sound angles were defined for the experiment and were repeated randomly in all sessions. The average localization error across all conditions in this experiment was M = \(-0.29\) \(^{\circ }\), SD = 16.20. The lowest localization error, except for 0\(^{\circ }\), was found at the right (90\(^{\circ }\), M = 5.87\(^{\circ }\), SD = 13.63) and left position (260\(^{\circ }\), M = \(-1.66\) \(^{\circ }\), SD = 14.44) of the participant. In a range of \(\pm 30^{\circ }\) around the 0\(^{\circ }\) position, the localization error was highest (M = \(-0.51\), SD = 30.03).

Fig. 3. Average error over all tested sessions with aberration to the left and right side

Fig. 4. Sum of all perceived positions for the whole experiment

Following these results, Fig. 3 shows that the standard deviation in the rear hemisphere is distributed evenly between left and right. However, responses in the front-right quadrant tend to deviate to the left (lower angle), whereas in the front-left quadrant participants shifted the localization to the right (higher angle).

Figure 4 gives a summary of the perceived positions independent of the given sound angle. The angles are grouped into 15\(^{\circ }\) blocks. It can be seen that participants tend to orient themselves in 45\(^{\circ }\) steps with a strong focus on the cardinal angles 0\(^{\circ }\), 90\(^{\circ }\), 180\(^{\circ }\) and 270\(^{\circ }\). The peak in the range 240\(^{\circ }\) to 270\(^{\circ }\) requires further consideration.
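Grouping the responses into 15\(^{\circ }\) blocks amounts to a circular histogram. The sketch below assumes that the blocks start at 0\(^{\circ }\) (0 to 15, 15 to 30, and so on); whether the blocks in Fig. 4 are aligned this way or centered on the cardinal angles is not stated. The name `bin_responses` is hypothetical.

```python
from collections import Counter
from typing import Iterable


def bin_responses(responses_deg: Iterable[float], width: int = 15) -> Counter:
    """Count perceived directions per block; keys are the lower block edges.

    Responses are first wrapped into [0, 360), so 360 counts as 0 and
    negative angles are mapped onto the compass.
    """
    counts: Counter = Counter()
    for response in responses_deg:
        lower_edge = int((response % 360) // width) * width
        counts[lower_edge] += 1
    return counts
```

A preference for the cardinal angles would then show up as high counts in the blocks containing 0\(^{\circ }\), 90\(^{\circ }\), 180\(^{\circ }\) and 270\(^{\circ }\).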

3.2 Head Movement

In the experiment, two sessions were conducted with an active head tracker and two without coupled head movement. The session order was randomized over the participants. As expected, the localization error in the head tracker sessions was low compared to the no head tracker condition. In the head tracker sessions, participants perceived the sound with an average error of \(-0.59\) \(^{\circ }\) and a standard deviation of 8.51 over all given angles. Without a coupled head tracker, localization was less precise: the average error rose to 1.34\(^{\circ }\) with a standard deviation of 23.89 over all given angles. These results are in line with previous studies (Barfield et al. 1997; Minnar et al. 2001; Weinzierl 2008), which found advantages for coupled head movement during spatial audio sessions. In the no head tracker sessions, a higher average error was measured at the front-right and back-left positions. Besides those findings, participants reported a more natural feeling with head-coupled audio.

3.3 Different Sounds

Participants were divided into three groups, and each group was presented with a different test sound. Sound A was designed as a technical warning sound with a frequency of 2,000 Hz, Sound B as one with a frequency of 4,000 Hz. Sound Voice is a synthetic English female voice speaking the word "position".

The results of the experiment show that the localization performance is similar for the three given sounds. The standard deviation of the localization error ranges from about 15 to almost 18 (Sound A: M = \(-0.19\) \(^{\circ }\), SD = 15.66; Sound B: M = \(-0.13\) \(^{\circ }\), SD = 17.54; Sound Voice: M = 1.45\(^{\circ }\), SD = 15.41). The synthetic voice resulted in a low localization error, especially in the area directly left and right of the participant. Participants reported that the wider frequency range was less disturbing and hence easier to concentrate on.

4 Conclusion

All participants were able to localize a given sound with good precision. The best results were found in the half between 90\(^{\circ }\) and 270\(^{\circ }\). During the experiment, no participant remarked that they were unable to localize the position of the given sound. No front-back confusion or inside-the-head phenomenon was recorded during the experiment, neither in the head tracker sessions nor without head tracker. These results suggest that, for applications with low precision requirements, complex head tracking inside an aircraft cockpit could be dispensed with.

The localization uncertainty was, however, noticeably lower with head-coupled movement during the spatial audio sessions. The results for localization performance with and without head tracker compare well to results presented by other authors. An important finding of this experiment is that, with the programmed experiment software and setup, a sound object can be localized in real time with a precision that makes further research into future applications in the domain of pilot assistance systems conceivable.

Nevertheless, the influence of the different sounds was not as high as expected. It is conceivable that the chosen sounds were too similar to each other. Still, a small advantage for the synthetic voice is noticeable. Further research should be done in this field.

This work opens the door for further research on the benefits of 3D audio in the aviation domain. With the results presented in this paper, several 3D audio assistance systems for pilots are conceivable. Ongoing research in this field will be carried out by the author.