The following section describes the simulator used for the study as well as the study design and the materials used to investigate the hypothesized effects. The implementation of the study was approved by the ethics committee of the TU Berlin in April 2019 and by Robert Bosch GmbH.
The driving simulator consists of six monitors that simulate a 360-degree view (Fig. 2). The driving simulation software SILAB is used for traffic and scenario simulation. To make the experience comparable to driving a car on the road, the driving unit is movable and tilts in accordance with the visual simulation. The HMI is equipped with a speedometer and a visual display for notifications. Indicators are located on the steering wheel. Further, the driving unit contains mirrors, pedals, a handbrake and a fixed mount for tablets. The simulator is equipped with a sound system for driving sounds and speech output.
The study is an experimental laboratory study conducted in a driving simulator. This means that participants are tested in a controlled environment rather than in real traffic. The driving simulator has the advantage that several participants can be tested under exactly the same traffic conditions. This is essential here, as the impact of the surrounding traffic and familiarity effects has to be controlled.
SILAB allows a very precise generation of the six scenarios tested in this study (Fig. 3). As shown in Fig. 3, actions and their timing are tested in different takeover situations. The takeover is always triggered when the ego vehicle is on the center lane of the highway at a speed of 120 km/h. No reason for the takeover is given to the participants, both to prevent unwanted side effects and because the reason for the takeover request is not relevant here. However, the takeover request is always an uncritical one (e.g. exit highway, construction zone ahead). The surrounding traffic is set up to trigger one of three maneuver options, each with a relatively high or low complexity, resulting in six scenarios overall (Fig. 3). In order to prevent participants from preparing the maneuver, the traffic constellation is changed as soon as the takeover request appears. Thus, even if participants do not attend fully to the non-driving-related task (NDRT), they cannot prepare for the upcoming scenario in advance.
The scenarios are set up as follows. The number of relevant vehicles per scenario is referred to as RV in the following. Vehicles count as relevant when they directly affect the corresponding maneuver and have to be perceived for the maneuver to be performed safely. The maneuvers are a lane change to the right (RIGHT), car following (FOLLOW) and a lane change to the left (LEFT). The RIGHT scenarios have the lowest complexity. Cases in which a lane change to the right is necessary are set up with no RV on the right lane ahead (obligation to drive on the right). In one scenario, no relevant traffic surrounds the ego vehicle (0RV); in the other, a slower car (80 km/h) behind on the right is relevant (1RV).
In the FOLLOW scenarios, the right lane is always occupied and the vehicle in front drives faster (130 km/h) than the ego vehicle. In one of the FOLLOW scenarios, only the relevant vehicle on the right lane (80 km/h) and the one in front (130 km/h) are set up (2RV). In the other FOLLOW scenario, the left lane is occupied as well (160 km/h; 3RV). In the LEFT scenarios, the right lane is occupied (80 km/h) and the vehicle in front is clearly slower (80 km/h) than the ego vehicle. In one scenario, only the two relevant vehicles in front and on the right lane are set up (2RV). In the other scenario, the left and right lanes are heavily occupied (6RV; Fig. 3). Overall, three blocks are run, and each scenario appears once per block. The sequence of the scenarios is randomized separately for each block, resulting in three different blocks. Additionally, each participant starts at a different point of the first block in order to randomize trials and prevent order effects.
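The block design described above can be illustrated with a minimal sketch. It is not the SILAB configuration used in the study; the function names, the seed and the reading of "a different point of the first block" as a cyclic rotation of that block are assumptions made for illustration only.

```python
import random

# Relevant vehicles (RV) per scenario, as stated in the text.
SCENARIO_RV = {
    "RIGHT-LOW": 0,
    "RIGHT-HIGH": 1,
    "FOLLOW-LOW": 2,
    "FOLLOW-HIGH": 3,
    "LEFT-LOW": 2,
    "LEFT-HIGH": 6,
}


def build_blocks(seed, n_blocks=3):
    """Return n_blocks shuffled scenario orders that all differ from each other."""
    rng = random.Random(seed)  # seeded so the schedule is reproducible
    blocks = []
    for _ in range(n_blocks):
        order = list(SCENARIO_RV)
        rng.shuffle(order)
        # Reshuffle in the rare case that an identical order was already drawn.
        while order in blocks:
            rng.shuffle(order)
        blocks.append(order)
    return blocks


def participant_schedule(blocks, start_offset):
    """Flatten the blocks into one trial list for a single participant.

    Assumption: the participant-specific starting point is modelled as a
    cyclic rotation of the first block; later blocks are left unchanged.
    """
    first = blocks[0]
    k = start_offset % len(first)
    trials = first[k:] + first[:k]
    for block in blocks[1:]:
        trials.extend(block)
    return trials


blocks = build_blocks(seed=2019)  # hypothetical seed
schedule = participant_schedule(blocks, start_offset=2)
```

Under these assumptions every participant still encounters each scenario exactly once per block, so the repetition index of a scenario (first, second, third) remains well defined.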
Variables and Measurement
The study focuses on four variables. The two independent variables are the familiarity with the situation (situation familiarity) and the objective complexity of the traffic environment (number of relevant vehicles; objective traffic complexity). The subjective complexity is a pseudo-dependent variable, as shown in Fig. 1. It is hypothesized that subjective complexity is influenced by the two manipulated independent variables. In addition, subjective complexity is assumed to have an impact on the dependent variable, the time to action decision. Hence, it is also assumed to be a mediator variable. The variables and their measurement methods are described in detail below.
Familiarity with a situation is implemented by repeating the scenarios. Participants drive through each of the six scenarios three times. Thus, familiarity is lowest in the first repetition of each scenario and highest in the third. As a situation is never exactly the same in real traffic, the term situation familiarity is chosen instead of learning. In contrast to learning, no representation of identical facts can be learned for traffic situations. The habituation to general traffic situations is called situation familiarity and rises with repeated exposure. To exclude learning effects of the takeover and the HMI representations, a learning session is conducted prior to the experiment.
Objective Traffic Complexity
In this paper, the objective complexity of the traffic environment is based on the number of relevant vehicles in relation to the ego vehicle. In each condition it is either high or low. In the low-complexity cases, only the vehicles needed to trigger the corresponding maneuver are used. In the high-complexity cases, as many vehicles as possible for the corresponding scenario are set up. Still, only vehicles that need to be attended to in order to execute the maneuver-based action count as relevant vehicles. Section 2.3 describes the distribution and set-up of the vehicles; this section focuses on the relevance of the vehicles that add up to the objective complexity. The number of relevant vehicles in the RIGHT-LOW scenario is zero (0RV), as no vehicle is relevant for the maneuver. The RIGHT-HIGH scenario includes one relevant vehicle, located behind the ego vehicle on the right. Thus, the driver has to check the mirror and decide whether the lane change can be executed under the given condition (1RV). In the FOLLOW-LOW scenario, two vehicles are relevant for the action decision: the vehicle on the right, indicating that the right lane is occupied, and the car in front, which has a higher speed than the ego vehicle, so that there is no need for overtaking (2RV). The same applies to the FOLLOW-HIGH scenario, with an additional vehicle on the right lane in the maneuver-relevant area (3RV). Only two vehicles are relevant in the LEFT-LOW scenario: the vehicle on the right, indicating that no lane change to the right can be executed, and the car in front, which has a significantly lower speed than the ego vehicle (speed difference of 40 km/h). Hence, the ego vehicle approaches the vehicle ahead very quickly and a lane change to the left has to be executed in order to avoid strong braking (2RV).
The same holds for the LEFT-HIGH scenario, except for another close vehicle on the right lane and three vehicles on the left lane: two of them in front of the ego vehicle and one behind that is overtaking. All three are driving at a speed of 160 km/h. These vehicles have to be considered for the maneuver execution. With six relevant vehicles, the LEFT-HIGH scenario has the highest objective complexity (6RV; Fig. 3). As the number of relevant vehicles differs depending on the maneuver, scenarios are distinguished by their number of relevant vehicles rather than by the high and low conditions only. The benefit is that a more precise, interval-scaled distinction can be drawn rather than a nominal one.
The subjective complexity indicates how complex the participants perceive the scenario to be. In order to assess the subjective complexity, the rating sheet of the NASA-TLX (NASA Task Load Index) is used. It is a multi-dimensional rating procedure with six subscales that are rated on a 20-point Likert scale. The NASA-TLX was originally developed to measure workload. Although subjective complexity is not the same as workload, the items of the NASA-TLX are presumed to be indicators of subjective complexity, which makes it useful for this study. The first item, mental demand, is described as “How much mental and perceptual activity was required (e.g. thinking, deciding, calculating, remembering, looking, searching, etc.)? Was the task easy or demanding, simple or complex, exacting or forgiving?”. Physical demand is described as “How much physical activity was required (e.g. pushing, pulling, turning, controlling, activating, etc.)? Was the task easy or demanding, slow or brisk, slack or strenuous, restful or laborious?”. The third item, temporal demand, is described as “How much time pressure did you feel due to the rate or pace at which the tasks or task elements occurred? Was the pace slow and leisurely or rapid and frantic?”, performance as “How successful do you think you were in accomplishing the goals of the task set by the experimenter (or yourself)? How satisfied were you with your performance in accomplishing these goals?”, effort as “How hard did you have to work (mentally and physically) to accomplish your level of performance?” and frustration as “How insecure, discouraged, irritated, stressed and annoyed versus secure, gratified, content, relaxed and complacent did you feel during the task?”. The weighting of the items has been criticized in the past. Therefore, it is not used in this study. Based on the evaluation, it would not be beneficial for the current purpose either.
Participants answer the NASA-TLX after each trial, resulting in a total of 18 ratings per NASA-TLX subscale.
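As an illustration of an unweighted use of the NASA-TLX, the following minimal sketch averages the six subscale ratings of one trial. How the subscales are aggregated is not specified in this section, so the unweighted mean, the subscale keys and the example ratings are assumptions for illustration and not the reported analysis.

```python
from statistics import mean

# Hypothetical keys for the six NASA-TLX subscales named in the text.
SUBSCALES = (
    "mental_demand",
    "physical_demand",
    "temporal_demand",
    "performance",
    "effort",
    "frustration",
)


def raw_tlx(rating):
    """Return the unweighted mean of the six subscale ratings of one trial.

    Assumption: the pairwise weighting step of the original procedure is
    skipped and all subscales contribute equally.
    """
    missing = [s for s in SUBSCALES if s not in rating]
    if missing:
        raise ValueError(f"missing subscale ratings: {missing}")
    return mean(rating[s] for s in SUBSCALES)


# Illustrative ratings for a single trial (invented values, not study data).
example_rating = {
    "mental_demand": 10,
    "physical_demand": 2,
    "temporal_demand": 12,
    "performance": 6,
    "effort": 9,
    "frustration": 3,
}
example_score = raw_tlx(example_rating)
```

With 18 trials per participant, applying such a function per trial would yield 18 scores that can be related to scenario and repetition.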
Time to Action Decision
The subjective complexity is further assumed to have an impact on the time needed to make an action decision. After every takeover, participants have to state their action decision aloud as soon as it is made. This method is chosen to capture the time of the actual action decision rather than the time of the action execution, because the execution depends on the surrounding traffic and cannot always follow directly once the decision is made. Hence, the verbal indication directly measures the time of the decision to execute a maneuver. Based on the action decision time, the time to build up a situation model can be deduced.
After a short introduction, participants complete the formalities (declaration of consent, participant code for complete anonymisation and participant information). Before the actual data acquisition starts, participants read the instructions. To get used to the simulator dynamics and to exclude learning effects of the takeover and the simulator functions, participants practice the takeover before starting the experiment. Finally, the questionnaire for sociodemographic data is filled out.
In the experiment, participants start in a parking lot and have to drive onto a three-lane highway. Participants are instructed to activate the highway pilot on the middle lane as soon as it is available. During the automated drive, a quiz is available on the tablet mounted at the center console. Participants engage in this non-driving-related task while the automated mode is active. As soon as a takeover request (TOR) is triggered (always at a speed of 120 km/h), participants are advised to stop answering the quiz immediately. No further action is needed to deactivate the quiz; participants can simply stop playing and turn their attention away from it. Participants are instructed to take over the driving task and to try to maintain the speed of 120 km/h. The action participants decide to execute must be verbalized clearly as soon as the decision is made. The action decision depends on the surrounding traffic and German traffic law (especially the obligation to drive on the right). The speed should be held as constant as possible, so that mainly the steering wheel is used for the maneuver. After the maneuver, participants are instructed to head for the next parking lot, where they come to a stop and fill out the NASA-TLX questionnaire to measure the subjective complexity of the preceding scenario. Starting from the parking lot again, participants drive into the next scenario. This is repeated 18 times: three blocks of six scenarios each. Each block has a different randomized order, and each participant starts at a different set-up point of the first block. During the experiment, the time of the action decision is recorded with a key press: as soon as the participant states the action decision, the investigator presses the space key on the keyboard.
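The key-press measurement can be turned into a time to action decision as sketched below. The sketch assumes that the latency is counted from the onset of the TOR to the investigator's key press and that both timestamps are logged in seconds on the same clock; the class, the field names and the example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    scenario: str           # e.g. "LEFT-HIGH"
    tor_time_s: float       # assumed: timestamp of the TOR onset in seconds
    keypress_time_s: float  # assumed: timestamp of the space-key press in seconds


def time_to_action_decision(trial):
    """Return the latency between TOR onset and key press in seconds."""
    latency = trial.keypress_time_s - trial.tor_time_s
    if latency < 0:
        raise ValueError("key press logged before the takeover request")
    return latency


# Invented timestamps for illustration, not study data.
example_trial = Trial("LEFT-HIGH", tor_time_s=312.40, keypress_time_s=316.15)
example_latency = time_to_action_decision(example_trial)
```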
The study was conducted in April and May 2019 in a simulator of Robert Bosch GmbH in Renningen. In a pre-study, the test design, methods and technical functionality were tested. The final results are based on N = 20 participants who took part in the main study. Of the 20 participants, 13 are male and 7 are female, with a mean age of 26.2 years (SD = 2.69). The largest group drives a car daily (N = 9). The driving frequency of the others ranges from five to six times a week to less than once a week. The most common average driving duration per drive is 30 minutes, with a range from 15 to 120 minutes. Highways are the most frequently used road type (N = 7), followed by rural roads (N = 4) and city roads (N = 2). Seven participants did not indicate their most common road type. All participants are regular drivers and have prior knowledge of highway situations, although the amount of highway usage varies. Overall, 13 participants report a moderate driving style, three a defensive driving style and four a mainly sporty one.
To test whether the high and low complexity conditions differ significantly from each other, the Mann-Whitney U test is used. Although the data visually appear to be normally distributed, Shapiro-Wilk normality tests do not support the assumption of a normal distribution in all conditions. The regression analysis used for statistical evaluation is based on the regression equation y = xβ + ε (β = slope; ε = error). Non-linearity, normal distribution, homoscedasticity and influential outliers are checked using residuals vs. fitted, normal Q-Q, scale-location and residuals vs. leverage plots. Further, the relation between the variables is tested for mediation effects. Monte Carlo analysis is used to test whether indirect effects can be found. As achieving both high coverage and high power is a challenge in mediation analysis, the Sobel test and bootstrapping are carried out in addition.
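For orientation, the Sobel test mentioned above can be sketched with the standard first-order formula z = ab / sqrt(b²·SE_a² + a²·SE_b²). Here a is taken to be the regression slope of the predictor on the mediator (e.g. objective on subjective complexity) and b the slope of the mediator on the outcome (time to action decision) with the predictor controlled for. The function name and the input values are illustrative assumptions, not coefficients from this study.

```python
import math


def sobel_test(a, se_a, b, se_b):
    """Return (z, two-sided p) for the indirect effect a*b.

    a, se_a: slope and standard error of predictor -> mediator
    b, se_b: slope and standard error of mediator -> outcome
    """
    denom = math.sqrt(b**2 * se_a**2 + a**2 * se_b**2)
    if denom == 0:
        raise ValueError("standard error of the indirect effect is zero")
    z = (a * b) / denom
    # Two-sided p-value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Invented coefficients for illustration only.
z_example, p_example = sobel_test(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
```

The Sobel test relies on the indirect effect being approximately normally distributed, which is one reason for complementing it with resampling-based approaches such as bootstrapping.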