1 Introduction

About 94 % of critical crashes are caused by human drivers [1], and highly automated vehicles are expected to avoid such crashes and deliver on the promise of safer transportation. However, accidents involving highly automated vehicles still occur; they attract increasing attention and have become the biggest roadblock to mass production. At present, functional deficiencies in perception algorithms are significant contributors to accidents, and under a triggering condition these deficiencies may cause safety of the intended functionality (SOTIF) problems [2, 3]. SOTIF is defined as the absence of unreasonable risk due to hazards resulting from functional insufficiencies of the intended functionality or from reasonably foreseeable misuse by persons. One example is a Tesla colliding with an overturned white truck that was misidentified as a white cloud. Humans are essentially a special kind of sensor, so it is worthwhile to study human-in-loop decision-making based on human states as a way to overcome functional deficiencies of perception algorithms and improve SOTIF.

In recent years, several studies on human states in the context of autonomous vehicles have been carried out, and some researchers point out that fNIRS-measured prefrontal activity may discriminate cognitive states in real life [4,5,6,7,8]. Yamamoto et al. [9] found that both the parietal association cortex and the prefrontal area are activated when people are driving. Izzetoglu et al. [10] explored the relationship between drivers’ behavior and cognitive measures observed by fNIRS; their preliminary results showed that driving speed was positively correlated with the increase in oxygenation levels in dual-task driving. Geissler et al. [11] studied mental workload in city and country environments, which impose distinct demands, and proposed that the right middle frontal gyrus might be a suitable region for small-area brain-computer interfaces. Horrey et al. [12] explored the impact of task engagement on driving performance. They reported that drivers responded more slowly to braking events in the interesting audio condition, and that drivers showed a reduced concentration of cerebral oxygenated hemoglobin when listening to interesting material compared with the baseline and boring conditions. Balters et al. [13] concluded that fNIRS is suitable for detecting the driver habituation that occurs when drivers operate new automated driving systems.

The research above makes it evident that driving-task factors, such as driving speed and driving environment, influence the mental activity of the driver. Huve et al. [14] presented a brain-computer interface (BCI) that analyzes brain activity in real time and deduces the current driving mode. With an average online classification accuracy of 61.7%, it shows the potential of DNN-based classification of fNIRS signals in developing BCIs for monitoring mental states. Le et al. [15] classified the mental workload induced by a secondary task using a machine-learning method; in subject-dependent classification, the prediction accuracy on the fNIRS test data was 96.8%. Previous studies have thus discussed the mental activity that different driving tasks evoke in the prefrontal cortex, which is also our region of interest, and have analysed online and offline classification of fNIRS signals for monitoring mental states, but all of these works were aimed at drivers. In high-level automated vehicles, the human takes the role of a passenger, and the mental activity of a passenger may differ from that of a driver because the passenger cannot operate the vehicle when he or she senses danger.

The objective of the current study is to analyze the prefrontal correlates of passengers’ mental activity caused by risk and to explore the potential of using passengers’ fNIRS signals in developing a BCI for improving the SOTIF of high-level automated vehicles. The remainder of this paper is structured as follows: Sect. 2 introduces the methodology, Sect. 3 presents the results of the experiment analysis, and Sect. 4 summarizes the conclusions drawn from the study.

2 Methodology

fNIRS is a valuable tool for measuring cerebral oxy-hemoglobin and deoxy-hemoglobin levels by directing light of specific wavelengths at the cerebral cortex. Each data acquisition channel of the blood oxygen monitoring device is formed by an emitter and a receiver [16]. Because fNIRS equipment is non-invasive, safe, and portable, it is an ideal choice for monitoring brain activity, and its robustness makes it suitable for real-world highway driving experiments. In this paper, the influence of risk on passengers’ prefrontal mental activity is analyzed by comparing the cerebral oxygen change on the prefrontal cortex between a low-risk segment and a high-risk segment. The framework of this study is shown in Fig. 1. First, a highway cut-in scenario is designed and divided into low-risk and high-risk segments based on the kinetic energy field, in order to assess the impact of perceived risk on passengers’ mental activity. Second, two experiments are conducted, one using a driving simulator and the other a real-world vehicle, and the driving data and cerebral cortical activity data are matched in time. Finally, the influence of risk on passengers’ prefrontal mental activity is analyzed by comparing the cerebral oxygen change on the prefrontal cortex between the low-risk and high-risk segments using the t-test and the Wilcoxon Signed Rank Test.

Fig. 1 The framework of this study

2.1 Experiment Equipment

The fNIRS device used in this study is the OctaMon+, provided by Artinis, a Dutch company. The OctaMon+ is equipped with eight emitters and two receivers. The device and its 3D model are shown in Fig. 2. The measurement region is located at the prefrontal cortex, the distance between two optodes is 30 mm, and the sampling frequency is 50 Hz. The head attachment is placed so that the center of the front row is 35 mm above the nasion. While the probes are attached, they are carefully adjusted so that only minimal pressure is applied to the skin surface.

Fig. 2 Experimental device and its 3D model

To analyze the influence of risk on passengers’ prefrontal mental activity, we built a signal acquisition system. It consists of a Matlab/Simulink module, a Python module, and the OxySoft software, which record, respectively, the vehicle states (which may be used to build the risk field), the participant states, and the passenger’s cerebral oxygen exchange data. The timing error between these data streams can be kept within 10 ms.

2.2 Kinetic Energy Field

The kinetic energy field is a safety indicator [17, 18] that reflects the potential danger level within a driving scenario. It is expressed mathematically in Eq. (1). The kinetic energy field [19] involves the relative longitudinal distance and the relative speed. This information is also related to the time to collision (TTC) and the enhanced time to collision [20], so there is a relationship between the kinetic energy field and TTC. In this paper, segments where \(\textit{E}_{v}>0.05\) are considered high-risk, and all others are deemed low-risk.

$$\begin{aligned} \textit{E}_{v}=\frac{GR_{2}M_{2}}{{{r}}^{k_{1}}}\frac{{{r}}^{k_{1}}}{\vert {{r}}^{k_{1}}\vert }e^{\left[ k_{2}v_{2}\cos (\theta _{2})\right] } \end{aligned}$$
(1)

where \(k_{1},k_{2},G\) are three constants, \(M_{2}\) indicates target vehicle mass, \({{r}}\) indicates the distance between target vehicle and ego vehicle, \(v_{2}\) represents target vehicle speed, \(R_{2}\) is road friction coefficient, and \(\theta _{2}\) indicates the angle between \({{r}}\) and \(v_{2}\).
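As a rough illustration of how Eq. (1) and the 0.05 threshold could be applied to a single sample, the following Python sketch evaluates the magnitude of \(\textit{E}_{v}\) (the factor \({{r}}^{k_{1}}/\vert {{r}}^{k_{1}}\vert\) is treated here as a pure direction term with unit magnitude, which is an assumption). The function names and all default constant values (`G`, `k1`, `k2`, `R2`) are hypothetical placeholders, since the paper does not report the constants it used.

```python
import math


def kinetic_field_magnitude(r, v2, theta2, M2, R2=1.0, G=0.001, k1=1.0, k2=0.05):
    """Magnitude of E_v following Eq. (1).

    r      : distance between target and ego vehicle (m), must be > 0
    v2     : target vehicle speed (m/s)
    theta2 : angle between r and v2 (rad)
    M2     : target vehicle mass (kg)
    R2, G, k1, k2 : placeholder values, NOT the constants used in the paper
    """
    if r <= 0:
        raise ValueError("distance r must be positive")
    return G * R2 * M2 / r ** k1 * math.exp(k2 * v2 * math.cos(theta2))


def label_risk(e_v, threshold=0.05):
    """Label a sample as high-risk when E_v exceeds the threshold (0.05 in the paper)."""
    return "high" if e_v > threshold else "low"


if __name__ == "__main__":
    # Illustrative values only: a 1500 kg target vehicle seen at two distances.
    for distance in (50.0, 10.0):
        e_v = kinetic_field_magnitude(r=distance, v2=11.1, theta2=0.0, M2=1500.0)
        print(distance, round(e_v, 4), label_risk(e_v))
```

With any fixed set of positive constants, the value grows as the target vehicle gets closer, which is the property the segment division relies on.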

2.3 Data Processing

The data processing procedure is shown in Fig. 3 and can be divided into two parts: data preprocessing and data analysis. In the preprocessing part, the kinetic energy field and the cerebral oxygen exchange \(\triangle TH\) are aligned by time. In the analysis part, the scenario is divided into a low-risk segment and a high-risk segment at a split point, and the \(\triangle TH\) in the two segments is analyzed using mean values and statistical tests to find the influence of risk on passengers’ prefrontal mental activity.

Fig. 3 The data analysis process

2.3.1 Data Preprocessing

First, the oxy-haemoglobin concentration changes (\(\triangle HbO\)) and deoxy-haemoglobin concentration changes (\(\triangle HB\)), which are obtained from the fNIRS-based blood oxygen monitoring equipment, are processed in three steps: (1) raw data are collected with OxySoft 3.4.9 x64; (2) the raw data are converted to the snirf format with the MATLAB toolbox “oxysoft2matlab”; (3) the intensity data are converted to \(\triangle HbO\) and \(\triangle HB\) using Homer3. The third step includes filtering and motion-artifact removal. Motion artifacts are corrected with the correlation-based signal improvement algorithm [21], and a bandpass filter (0.015–0.085 Hz) is used to remove noise from respiration, heart rate, blood pressure fluctuations, Mayer waves, and other sources.

Second, an index is chosen to indicate mental activity. Previous studies have shown that \(\triangle TH\) is an effective index of mental activity [22,23,24]; it equals \(\sqrt{2}\) times the difference between \(\triangle HB\) and \(\triangle HbO\). In this paper, passengers’ mental activity is analyzed by comparing the \(\triangle TH\) on the prefrontal cortex of passengers between low-risk and high-risk segments.
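The index can be computed sample by sample from the two preprocessed signals. The sketch below follows the wording above literally, \(\triangle TH=\sqrt{2}\,(\triangle HB-\triangle HbO)\); the order of the subtraction and the function name `delta_th` are assumptions made for illustration, not details confirmed by the paper.

```python
import math


def delta_th(delta_hbo, delta_hb):
    """Return the Delta-TH series: sqrt(2) * (Delta-HB - Delta-HbO), per sample.

    delta_hbo, delta_hb : equally long sequences of concentration changes
    The sign convention (HB minus HbO) is an assumption taken from the text.
    """
    if len(delta_hbo) != len(delta_hb):
        raise ValueError("both series must have the same length")
    return [math.sqrt(2) * (hb - hbo) for hbo, hb in zip(delta_hbo, delta_hb)]


if __name__ == "__main__":
    # Made-up values for one channel, only to show the call.
    hbo = [0.10, 0.12, 0.15]
    hb = [-0.02, -0.03, -0.05]
    print(delta_th(hbo, hb))
```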

Finally, the kinetic energy field and \(\triangle TH\) are aligned by time, and the scenario is then divided into a low-risk segment and a high-risk segment. For ease of description, the duration of the low-risk segment is termed the "window time," and results for different window times are analyzed in turn.
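The alignment and segmentation step might be sketched as follows. This is only one possible reading of the procedure: the kinetic energy field is resampled onto the fNIRS time base with a zero-order hold, the split point is taken as the first sample whose field value exceeds 0.05, and the low-risk segment is the window of `window_s` seconds immediately before that point. The helper names and these three choices are assumptions, not the authors' implementation.

```python
from bisect import bisect_right


def align_to(times_src, values_src, times_dst):
    """Zero-order hold: for each destination time, take the latest source
    value recorded at or before it (the first value if none precedes it)."""
    aligned = []
    for t in times_dst:
        i = bisect_right(times_src, t) - 1
        aligned.append(values_src[max(i, 0)])
    return aligned


def split_by_risk(times, e_v, d_th, window_s, threshold=0.05):
    """Split a Delta-TH series into (low_risk, high_risk) samples.

    times, e_v, d_th : equally long, time-aligned sequences
    window_s         : window time in seconds (length of the low-risk segment)
    """
    split_idx = next((i for i, e in enumerate(e_v) if e > threshold), None)
    if split_idx is None:
        raise ValueError("no sample exceeds the risk threshold")
    t_split = times[split_idx]
    low = [x for t, x in zip(times, d_th) if t_split - window_s <= t < t_split]
    high = [x for t, e, x in zip(times, e_v, d_th) if t >= t_split and e > threshold]
    return low, high


if __name__ == "__main__":
    # Toy data: field logged at 1 Hz, Delta-TH at 2 Hz (the real device runs at 50 Hz).
    field_t, field_v = [0.0, 1.0, 2.0], [0.01, 0.02, 0.08]
    fnirs_t = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
    d_th = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    e_v = align_to(field_t, field_v, fnirs_t)
    print(split_by_risk(fnirs_t, e_v, d_th, window_s=1.0))
```

Repeating the split for several values of `window_s` yields one pair of segment means per window time, which is the quantity compared in the analysis.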

2.3.2 Data Analysis

The low-risk and high-risk segments represent distinct phases. For each window time, the mean value of the risk field in the high-risk segment is compared with the mean value in the low-risk segment. In addition, to discern how risk influences passengers’ mental activity, the t-test and the Wilcoxon Signed Rank Test are performed on the mean values of \(\triangle TH\) in the low-risk and high-risk segments. The data processing is carried out using MATLAB R2020a, Python 3.8.8, and SPSS 25.0.
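To make the paired comparison concrete, the sketch below implements a two-sided Wilcoxon signed-rank test with an exact p value, using only the Python standard library. It is a generic textbook version (zero differences dropped, tied absolute differences given average ranks, null distribution enumerated over all sign assignments) and is not claimed to reproduce the SPSS settings used in the study; the function name and the example numbers are hypothetical.

```python
def wilcoxon_signed_rank(low, high):
    """Two-sided exact Wilcoxon signed-rank test on paired samples.

    low, high : paired mean Delta-TH values (one pair per trial)
    Returns (W, p) where W = min(W+, W-).
    """
    diffs = [h - l for l, h in zip(low, high) if h != l]  # drop zero differences
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)

    # Exact null distribution: every rank is positive or negative with
    # probability 1/2. Ranks are doubled so that averaged ranks stay integers.
    counts = {0: 1}
    for r in ranks:
        step = int(round(2 * r))
        updated = {}
        for s, c in counts.items():
            updated[s] = updated.get(s, 0) + c
            updated[s + step] = updated.get(s + step, 0) + c
        counts = updated

    w2 = int(round(2 * w))
    p_lower = sum(c for s, c in counts.items() if s <= w2) / 2 ** n
    return w, min(1.0, 2 * p_lower)


if __name__ == "__main__":
    # Hypothetical segment means for nine trials (not data from the paper).
    low = [0.10, 0.08, 0.12, 0.05, 0.09, 0.11, 0.07, 0.10, 0.06]
    high = [0.18, 0.07, 0.20, 0.11, 0.15, 0.13, 0.16, 0.19, 0.12]
    w, p = wilcoxon_signed_rank(low, high)
    print(w, p, "significant at 10%" if p < 0.1 else "not significant at 10%")
```

The exact enumeration is practical for small samples such as a handful of real-vehicle trials; for the several hundred simulator data points a normal approximation, or the paired t-test named above, would be the usual choice.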

3 Experiments Analysis Results

The high complexity of automated driving systems (ADS) and the associated costs of real-world testing have led to a substantial increase in test effort for practical scenarios. Scenario-based methods play a crucial role in the verification and validation of ADS, and Reference [25] provides a comprehensive survey of approaches and methods for scenario generation and evaluation in ADS testing and validation. Notably, three common single-car scenarios (cut-in, cut-out, and emergency braking) were investigated. This paper focuses on the prefrontal correlates of passengers’ mental activity during a cut-in scenario. Both simulation and real-world vehicle experiments have been conducted. The simulation experiment, involving twenty participants, examines the impact of scenario risk on passengers’ prefrontal cortex activity. The results of the simulation experiment are used to assess this impact, while the findings of the real-world vehicle experiment serve to validate the conclusions drawn.

The participants are volunteers who provided informed consent after a thorough explanation of the tasks involved. Participants were explicitly informed that they could withdraw from the experiment at any point without penalty. The study adhered to the principles of the Declaration of Helsinki and received approval from the Institutional Review Board of Tsinghua University, China (approval number: 20210102).

3.1 Simulation Experiment Introduction

The simulation experiment is performed in a driving simulator with a signal acquisition system based on hardware-in-loop equipment. As depicted in Fig. 4, participants sit in the driving simulator and watch the scenarios in front of them. On sensing danger or hearing a stimulating sound, the participant is only required to press a key on the keyboard. The scenarios are built with the Virtual Test Drive software and comprise 14 kinds: the cut-in scenario and 13 other kinds of scenarios. In the cut-in scenario, the ego vehicle travels at 72 km/h and the target vehicle at 40 km/h. When the distance between the target vehicle and the ego vehicle is 25 m, the target vehicle cuts in from the left lane. Twenty-five scenarios are randomly selected from the 14 kinds to make up one simulation scenario, which lasts approximately 13 min. No events occur within the initial 1000 ms of a simulation scenario, which gives participants time to settle into the experimental state. Each participant needs to complete 12 simulation experiments. In total, 28 blood oxygen recordings of the cut-in scenario may be collected for each participant.
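For orientation, the stated speeds and gap can be turned into a time to collision at the moment the cut-in starts. The sketch below assumes that the 25 m distance is the longitudinal gap and that both vehicles keep a constant speed; neither assumption, nor the resulting TTC value, is stated in the paper, and the function name is hypothetical.

```python
import math


def time_to_collision(gap_m, ego_kmh, target_kmh):
    """TTC in seconds for an ego vehicle closing on a slower target ahead.

    Returns math.inf when the ego vehicle is not closing in.
    """
    closing_speed = (ego_kmh - target_kmh) / 3.6  # km/h -> m/s
    if closing_speed <= 0:
        return math.inf
    return gap_m / closing_speed


if __name__ == "__main__":
    # Values of the simulated cut-in scenario: 72 km/h vs 40 km/h, 25 m apart.
    print(round(time_to_collision(25.0, 72.0, 40.0), 2))  # roughly 2.81 s
```

Under these assumptions the passenger has a little under three seconds between the start of the cut-in and a potential collision if the ego vehicle does not react.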

Fig. 4 The simulation experiment process and scenario information

3.2 Simulation Experiment Result

Twenty participants completed the assigned task. The group consisted of 5 females (with a mean age of 24.73, ranging from 21 to 41) and 15 males (with a mean age of 34, ranging from 25 to 46). Among these participants, 7 had valid driving experience. Fifty-three results were excluded due to data recording errors, leaving a total of 506 valid data points for analysis. The p values of the t-test on the mean values of \(\triangle TH\) between the low-risk and high-risk segments, together with the mean values of the kinetic energy field, are presented in Fig. 5. Here, L denotes the low-risk segment, H denotes the high-risk segment, and p denotes the associated probability.

Fig. 5 The results in simulation experiment

Across all five window times, the mean values of the kinetic energy field are consistently higher in the high-risk segment than in the low-risk segment, which indicates that the way the cut-in scenario is divided is effective. The t-test results show that, within 5 s, the p value for \(\triangle TH\) between the low-risk and high-risk segments is smaller than 0.1. The null hypothesis is therefore rejected at the 10% significance level, indicating a significant difference in passengers’ prefrontal mental activity between the low-risk and high-risk segments.

3.3 Real-World Vehicle Experiment Introduction

The real-world vehicle experiment is performed on the dedicated road for intelligent connected vehicles at an experimental base in Chongqing, China. Figure 6 shows a real cut-in scenario together with a virtual diagram. The experimental base is owned by Automotive Engineering Research Institute Co., Ltd and is located in Dazu District, Chongqing. It comprises a dynamic square and a dedicated road for intelligent connected vehicles.

Fig. 6 Cut-in scenario

In these experiments, an automotive drive platform (ADP) is used to build the cut-in scenario. The ADP, provided by CAERI Intelligent Connected Technology Co., Ltd., comprises an inertial navigation system, a driving robot, a communication module, and a Global Vehicle Target (GVT). It can obtain the motion parameters of the ego vehicle and the target vehicle and the relative parameters between them, including speed, acceleration, relative longitudinal distance, relative lateral distance, and relative speed.

The GVT is powered, and its trajectory can be controlled by an operator. In this experiment, it plays the role of the target vehicle. Once the ego vehicle and the GVT reach a constant speed and relative distance, the GVT cuts in from the left lane.

3.4 Real-World Vehicle Experiment Result

In these experiments, the participant sits in the driving position but plays the role of a passenger and does not need to operate the high-level automated vehicle. The blood oxygen monitoring equipment is attached to the participant’s forehead so that concentration changes can be measured. Because this scenario inherently carries risk, only 9 groups of tests were conducted, all with one 23-year-old male participant.

In the real-world vehicle experiment, the Wilcoxon Signed Rank Test is used to analyse the difference in cerebral oxygen exchange between the low-risk and high-risk segments, because the sample size is small. Here p denotes probability; \(p<0.1\) means that the test rejects the null hypothesis at the \(10\%\) significance level, i.e., that there is a significant difference. The results of these experiments are shown in Fig. 7, where L represents the low-risk segment and H represents the high-risk segment.

Fig. 7 The results in real-world vehicle experiment

The p values obtained from the t-test or the Wilcoxon Signed Rank Test for the mean values of \(\triangle TH\) between the low-risk and high-risk segments are below 0.1 in all five window times. This indicates a significant difference between the low-risk and high-risk segments. Furthermore, the mean values of the kinetic energy field in the high-risk segments consistently exceed those in the low-risk segments across the five window times.

Based on the results of the simulation and real-world vehicle experiments, two conclusions can be drawn: (1) In both the simulation experiment and the real-world vehicle experiment, the p values of the t-test or the Wilcoxon Signed Rank Test are all less than 0.1, which indicates a clear difference in passengers’ prefrontal mental activity between the low-risk and high-risk segments. (2) In the real-world vehicle experiment, the trend of the kinetic energy field over the five window times differs from that observed in the simulation experiment. This disparity is attributed to the different criteria used to delineate the low-risk and high-risk segments. In the simulation experiment, a segment in which \(\textit{E}_{v}>0.05\) is considered high-risk, and any other segment is considered low-risk. In the real-world vehicle experiment, the front half of the scenario is considered the high-risk segment, and the latter half is considered the low-risk segment.

4 Conclusions

For high-level automated vehicles, functional deficiencies in robustness and logical completeness may cause SOTIF problems. Humans, in the role of passengers, can be seen as special sensors. The mental activities of passengers may differ from those of drivers in SOTIF scenarios because passengers cannot intervene to control the vehicle when they perceive danger. It is worth noting that existing studies predominantly concentrate on drivers rather than passengers.

In this study, two experiments were conducted, one with a driving simulator and one with a real-world vehicle. The t-test and the Wilcoxon Signed Rank Test were employed to analyze the disparity in \(\triangle TH\) between the low-risk and high-risk segments of a cut-in scenario, with the aim of assessing the impact of risk on passengers’ mental activity.

The findings of both the simulation and the real-world vehicle experiments indicate that risk exposure can alter passengers’ mental activity in the prefrontal cortex. This alteration can be discerned by analyzing \(\triangle TH\) measured with fNIRS, which offers a promising avenue for implementing passenger-in-loop decision-making. By leveraging passengers’ states, such as the \(\triangle TH\) on the prefrontal cortex, in combination with reinforcement learning techniques, it becomes feasible to address the functional deficiencies of perception algorithms and improve SOTIF.