1 Introduction

Many car accidents are a consequence of drivers’ stress or lack of attention, which can be affected by emotional events. Traffic congestion can be directly correlated with drivers’ mental health; hence, developing a continuous monitoring system that automatically detects drivers’ stress is vital to enhance safety. Driving in stressful conditions, such as in a city or on a freeway, is associated with a higher rate of accidents and life-threatening situations, and it compromises decision-making skills. The physiological response of the human body to stress includes an increase in heart rate, respiration rate, muscle contraction, and sweating [1]. Employing physical indicators and analyzing physiological signals are techniques that can be used to detect and classify stress. Physiological signals including galvanic skin response (GSR), electrocardiogram (ECG), respiratory rate (RR), and electromyography (EMG) can be acquired for stress level monitoring [2, 3]. Martinez et al. [4] designed an experimental procedure to elicit stress conditions and, using quantitative analysis, classified different stress levels based on ECG and GSR signals.

Many studies have been conducted to recognize and classify stress levels computationally [5]. Rigas et al. introduced a method of stress estimation for drivers based on a dynamic Bayesian network (BN) [6]. Healey and Picard [1] achieved an accuracy of 97.4% for two levels of high and moderate stress based on data extracted from EMG, RR, ECG, and GSR. That research aimed to identify the stress level using the signal fusion of multiple sensors. However, the method is not suitable for classifying stress into three levels, particularly when only a single signal is considered. A support vector machine (SVM) technique based on electroencephalography (EEG) and ECG has also been used to recognize driver fatigue. Zhai [7] and Angus [8] monitored and recorded three types of physiological signals, namely skin temperature (ST), GSR, and blood volume pulse (BVP), and introduced a novel automated system for stress classification. In addition, they used three classification methods: SVM, the Naive Bayes (NB) classifier, and a decision tree. Recently, a k-nearest-neighbor classifier was used in [9] for stress recognition while driving. There, parameter-based and trend-based techniques are proposed to generate features from ECG signals. A vector quantization learning method is used in [10] to distinguish stress from relaxation. According to that paper, the most informative signals are the cardiac signals.

1.1 Related work

GSR and ECG signals are the two most reliable physiological signals for stress recognition [11, 12]. One notable point that makes the GSR signal a better indicator than the ECG signal is that GSR can be obtained by two sensors on the skin of the hand and foot, while the collection of ECG signals requires an additional sensor on the chest [13, 14]. In a practical application, using fewer sensors on drivers is a significant improvement because of the inconvenience that sensors may cause during the driving task [15]. The GSR signal is mostly used along with other physiological signals to increase classification accuracy; however, in the past few years, some literature has focused on the performance of the GSR signal alone, to reduce the cost and complexity of the required measurement equipment compared to using multiple sensors. The feasibility of using a single physiological signal to classify stress levels has been investigated as a practical alternative in [9]. In that study, the advantages of a single signal over the multiple-signal approach are comprehensively explored. Tang et al. have studied the effect of continuous stress monitoring using GSR on the pattern classification features and illustrated that the GSR signal facilitates this procedure [16]. In [17], the GSR signal along with blood pressure (BP) is investigated for stress level detection. Liu et al. have applied SVM to automatically selected GSR features to classify human emotion with an accuracy of 66.67%. Ogorevc et al. have shown that the GSR signal provides larger peaks and amplitudes in response to human emotion [18, 19]. Also, in [20] it has been recognized that the GSR signal correlates better with emotional events during driving tasks than other physiological signals.

Table 1 gives an overview of several stress-detection studies with multiple physiological signals, their classification methods, and the achieved accuracy. According to this overview, the novelty of the present study follows from two aspects. First, the GSR signal alone is a reliable source of data to discriminate stress, since it performs exceptionally well based on a single feature. Second, using a large number of sensors results in a large number of features, which leads to a considerable computational burden and time-consuming processing. In addition, using a single sensor not only provides a cost-effective approach but also does not limit the driver’s performance during the driving task. To develop an efficient and reliable system that precisely detects stress levels in drivers, preprocessing calculations, time, and cost should be minimized, and drivers’ comfort and safety should be taken into account.

Table 1 Overview of stress classification methods and their accuracies

1.2 Contribution

The importance of both time and frequency features for successfully classifying cognitive tasks is undeniable and has been mentioned in several studies [26]. In this paper, we propose a method to categorize stress into three levels (low, medium, and high) based on a single physiological signal during the driving task. We focus on an extensive comparison of the two most reliable features of hand and foot GSR signals. Accordingly, we demonstrate that appropriate processing and classification of the GSR signal alone can make it a viable solution for discriminating stress in terms of feasibility and effectiveness. The most efficient feature extraction and preprocessing methods are devised to achieve the highest performance, and a considerable number of features are extracted and processed. We use a one-way ANOVA classifier since it is theoretically simple and powerful, and it is a common way to perform statistical analysis on experiments with more than two groups of data. The novelty of this approach lies in employing ANOVA as a classification technique to reduce data dimensionality, computational complexity, and time, and to improve classification accuracy. Working with one signal in real-time stress recognition is computationally efficient and cost-effective. Moreover, it requires fewer monitoring sensors, which makes data acquisition easier and does not interfere with drivers’ natural behavior. Consequently, a recognition rate of 95.83% is achieved using the ‘foot’ amplitude feature, which potentially offers a promising solution for future automatic stress detection devices.

This paper is organized as follows: Section 2 presents the methodology, including data selection, feature extraction, data normalization, and the classification method. Section 3 presents the results and discussion. The final section presents the conclusion of this study.

2 Methodology

To carry out the proposed methodology, the GSR data from the MIT-BIH database [1] are employed in this paper. This database was collected by Healey and Picard during a real-world driving experiment. It consists of seventeen drivers’ raw data for ECG, EMG, ‘foot’ and ‘hand’ GSR, heart rate (HR), marker, and RR, acquired from various wearable sensors. In [1], the experiment was conducted in the Greater Boston area: the drivers started from a rest state, passed through three city districts and two highways, and returned to the initial location.

2.1 Data selection

To recognize stress based on the GSR signal, we have to monitor all related datasets comprehensively. Because of the large number of driver datasets, all results provided in this paper belong to driver 5 as training data. Since some drivers’ data sources lack some information, the authors have decided to choose driver 5 to driver 12. These drivers’ data include both foot and hand GSR signals along with a marker. The marker signal, collected by an observer who accompanied the drivers during the driving task, supports the main signal with alternative data and serves as an identification tool to define each segment of the experiment. The test duration is 5055 s and the sampling interval is 30 ms. The overall number of samples is 167183 for each type of signal. The signal segmentation using the hand and foot GSR signals and the marker data is presented in Fig. 1. Once the signal is extracted, it is divided into seven segments as data samples: the initial and final rest, three city districts, and two highways. The segmentation of the foot GSR signal with respect to the different stress levels is shown in Fig. 2. Each segment is an indicator of a low, medium, or high stress level.

Fig. 1 Signal segmentation for Driver 5 with respect to the marker

Fig. 2 Stress level segmentation of Driver 5 based on foot GSR Signal
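The marker-based segmentation described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `split_by_marker`, the `threshold` parameter, and the rule that a segment boundary is a rising edge of the marker are all hypothetical and are not taken from the original experiment.

```python
def split_by_marker(signal, marker, threshold):
    """Split `signal` at rising edges of `marker` (hypothetical boundary rule).

    A boundary is assumed wherever the marker crosses `threshold` from below.
    Six marker events therefore produce seven segments.
    """
    if len(signal) != len(marker):
        raise ValueError("signal and marker must have the same length")
    boundaries = [
        i for i in range(1, len(marker))
        if marker[i] >= threshold and marker[i - 1] < threshold
    ]
    edges = [0] + boundaries + [len(signal)]
    return [signal[a:b] for a, b in zip(edges[:-1], edges[1:])]


# Toy data: 14 samples and six marker pulses at indices 2, 4, ..., 12.
toy_signal = list(range(14))
toy_marker = [1 if i in (2, 4, 6, 8, 10, 12) else 0 for i in range(14)]
segments = split_by_marker(toy_signal, toy_marker, threshold=1)
print(len(segments))  # 7
```

Each returned segment could then be labelled as low, medium, or high stress according to the part of the route it covers.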

2.2 Feature extraction

The two main components of the GSR signal are the skin conductance level (SCL), known as the tonic level, and the skin conductance response (SCR), known as the phasic response. Tonic level changes are slow (within 10 s to 1 min) and depend on the skin condition, which may be dry or moist. Hence, the SCL is not informative enough for this research objective. In contrast, the SCR rides on top of the tonic level with rapid fluctuations, so it is easy to find the peaks and bursts that provide valuable information for identifying emotional stimulus events. Generally, the response to a stimulus event occurs approximately 1 to 5 s after the onset of the emotional stimulus [27].

According to Fig. 3, four features could be utilized to characterize SCR. These features are introduced as follows:

Fig. 3 Characteristics of the SCR in GSR signal

  • (1) Latency of response onset (SCR lat.):

Generally, it may take 1 to 5 s after the stimulus onset to reach the minimum amplitude. The response onset is set to the time at which the SCR reaches this minimum amplitude. SCR changes before this period are disregarded; they are treated as non-specific skin conductance responses, which occur spontaneously in the body every 1 to 3 min and are not considered responses to a stimulus.

  • (2) The amplitude (SCR amp):

In this paper, the amplitude is the difference between the peak value and the minimum amplitude at the response onset, obtained for each stress segment.

  • (3) Rise time (SCR rise time):

This feature indicates the time of rising from the response onset to the response peak.

  • (4) Half time value of the recovery time (SCR recovery 1/2):

This feature is the duration from the peak to the point of 50% recovery.
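The four SCR features listed above can be illustrated with a minimal sketch. It assumes a trace that contains a single response, a known stimulus sample index, and an onset defined as the minimum within a 5 s window after the stimulus; the function name `scr_features` and its parameters are hypothetical and do not reproduce the authors’ MATLAB implementation.

```python
def scr_features(scr, dt, stimulus_idx, max_latency_s=5.0):
    """Return latency, amplitude, rise time and half-recovery time of one SCR.

    scr           -- list of samples containing a single response (assumption)
    dt            -- sampling interval in seconds
    stimulus_idx  -- sample index of the stimulus onset (assumed known)
    """
    # Response onset: minimum within the latency window after the stimulus.
    window_end = min(len(scr), stimulus_idx + int(round(max_latency_s / dt)) + 1)
    onset_idx = min(range(stimulus_idx, window_end), key=lambda i: scr[i])
    # Response peak: maximum after the onset.
    peak_idx = max(range(onset_idx, len(scr)), key=lambda i: scr[i])
    amplitude = scr[peak_idx] - scr[onset_idx]
    # Half recovery: first sample after the peak at or below peak - amplitude/2.
    half_level = scr[peak_idx] - 0.5 * amplitude
    half_idx = next(
        (i for i in range(peak_idx, len(scr)) if scr[i] <= half_level), None
    )
    return {
        "latency_s": (onset_idx - stimulus_idx) * dt,
        "amplitude": amplitude,
        "rise_time_s": (peak_idx - onset_idx) * dt,
        "half_recovery_s": None if half_idx is None else (half_idx - peak_idx) * dt,
    }


# Toy response sampled once per second, stimulus at sample 0.
toy_scr = [1.0, 0.8, 0.5, 1.5, 2.5, 2.0, 1.4, 1.0]
features = scr_features(toy_scr, dt=1.0, stimulus_idx=0)
print(features)
```

For the toy trace, the onset is at sample 2 and the peak at sample 4, so the latency and rise time are both 2 s and the amplitude is 2.0.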

2.3 Data normalization and segmentation

Since noise is an inevitable part of the collected GSR signal, the choice of filter and associated cut-off frequency is vital to smooth the GSR signal, obtain the SCR, and remove the SCL. Therefore, a fourth-order low-pass elliptic filter, shown in Fig. 4, is designed with 0.1 dB maximum ripple in the pass-band, 40 dB attenuation in the stop-band, and a 4 Hz cut-off frequency. It is applied to both ‘foot’ and ‘hand’ GSR data to remove high-frequency noise.

Fig. 4 Designed elliptical filter

The signals are downsampled to obtain values at specific time intervals. Since skin conductance inevitably varies between individuals, the signals need to be normalized. For this purpose, the one-step difference signal is used to eliminate these inter-individual effects. These procedures are applied to every segment of every signal. The first-order difference equation used to normalize the foot GSR signal for medium stress is

$$Y = \left[ X(2) - X(1),\; X(3) - X(2),\; \ldots,\; X(m) - X(m-1) \right],$$
(1)

where \(Y\) is the output signal after implementing the first-order difference equation, \(X\) is the raw signal and \(m\) shows the number of samples in each dataset. The processed data is shown in Fig. 5.
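Equation (1) amounts to subtracting each sample from its successor. A minimal sketch, using a hypothetical helper name `first_difference` and made-up sample values:

```python
def first_difference(x):
    """Return [x[1]-x[0], x[2]-x[1], ..., x[m-1]-x[m-2]] for a list x of m samples."""
    return [b - a for a, b in zip(x, x[1:])]


# Illustrative values only; not taken from the dataset.
raw = [2.0, 2.5, 2.25, 3.0]
diff = first_difference(raw)
print(diff)  # [0.5, -0.25, 0.75]
```

The output has m - 1 samples, and a constant offset added to the raw signal (for example a different baseline skin conductance) cancels out in the difference.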

Fig. 5 Original, filtered, and first-order difference signals of driver 5 (Medium level segment)

The derived first-order difference signal represents the GSR phasic response and the associated stimulus point. The stimulus point is identified as the significant start point, considering only amplitudes higher than 0.001 mV so that the tonic level is removed. In the next step, the response onset and the peak time are determined to build the amplitude and rise time vectors.
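One possible way to turn this description into amplitude and rise time vectors is sketched below. The 0.001 threshold comes from the text; everything else is an assumption made for illustration: the function name `detect_responses`, the rule that a response starts where the difference first exceeds the threshold, and the rule that the peak is reached when the difference stops being positive.

```python
def detect_responses(x, dt, threshold=0.001):
    """Build amplitude and rise-time vectors from a filtered GSR segment.

    x         -- list of filtered samples
    dt        -- sampling interval in seconds
    threshold -- minimum first-order difference that starts a response
    """
    diff = [b - a for a, b in zip(x, x[1:])]
    amplitudes, rise_times = [], []
    i = 0
    while i < len(diff):
        if diff[i] > threshold:
            onset = i
            j = i
            # Follow the rise until the signal stops increasing (assumed peak rule).
            while j < len(diff) and diff[j] > 0:
                j += 1
            peak = j
            amplitudes.append(x[peak] - x[onset])
            rise_times.append((peak - onset) * dt)
            i = j
        else:
            i += 1
    return amplitudes, rise_times


# Toy segment with two responses, sampled every 0.5 s.
toy_x = [0.0, 0.0, 0.5, 1.0, 0.75, 0.5, 0.5, 1.5, 1.0]
amps, rises = detect_responses(toy_x, dt=0.5)
print(amps, rises)  # [1.0, 1.0] [1.0, 0.5]
```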

2.4 Data classification

Analysis of variance (ANOVA) is a statistical model that tests whether or not the means of several groups are equal. In this sense, it generalizes the t-test to more than two groups [17].

The ANOVA classifier evaluates the between-group variance relative to the average within-group variance. When the between-group variance is substantially larger than the within-group variance, the group means differ. The ratio of between-group variance to within-group variance is the main focus of ANOVA and is given in (2). ANOVA performs the analysis by comparing the sample datasets and produces a single P-value. The significance level assumed here is 0.05. If the F ratio is larger than would be expected by chance, at least one group mean differs substantially from the others. The P-value is derived from the cdf of the F-distribution. Thus, to reach a statistical conclusion, the F value calculated from the observed data is compared with the critical value at the α error level of 0.05 [28, 29].

$$F = \frac{\text{Between-group variance}}{\text{Within-group variance}}$$
(2)
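For reference, the ratio in (2) can be written out in the standard one-way ANOVA form. The symbols are introduced here for illustration and do not appear in the original text: \(k\) is the number of groups (here, stress levels), \(n_j\) the number of observations in group \(j\), \(N\) the total number of observations, \(x_{ij}\) the \(i\)-th observation in group \(j\), \(\bar{x}_j\) the mean of group \(j\), and \(\bar{x}\) the grand mean.

$$F = \frac{MS_{\text{between}}}{MS_{\text{within}}} = \frac{\dfrac{1}{k-1}\sum_{j=1}^{k} n_j \left( \bar{x}_j - \bar{x} \right)^2}{\dfrac{1}{N-k}\sum_{j=1}^{k}\sum_{i=1}^{n_j} \left( x_{ij} - \bar{x}_j \right)^2}$$

Under the null hypothesis of equal group means, this statistic follows an F-distribution with \(k-1\) and \(N-k\) degrees of freedom.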

All statistical results are obtained using laboratory-developed software based on MATLAB. In this stage, a t-test is applied to compare the various stress levels, and the analysis of variance is performed to evaluate the proposed classification results. If the obtained P-value is smaller than 0.05, the null hypothesis is rejected and the between-group differences are statistically significant. Therefore, this method can be a reliable way to discriminate stress levels.

3 Results and discussion

In this study, the amplitude and rise time features are employed to classify the stress levels. To investigate whether the means of these extracted features differ significantly, the variance test is used and the results of sixteen sets of experiments (one hand and one foot GSR signal for each of eight drivers) are assessed. Each experiment consists of two low-stress segments, two moderate-stress segments, and three high-stress segments. Each of the 112 segments is investigated statistically and the corresponding P-values are identified. The ANOVA tables for hand and foot GSR signals are presented in Tables 2, 3, 4 and 5.

Table 2 ANOVA table using hand-amplitude feature
Table 3 ANOVA table using hand-time feature
Table 4 ANOVA table using foot-amplitude feature
Table 5 ANOVA table using foot-time feature

SS denotes the sum of squares and df the degrees of freedom. The total degrees of freedom equal the total number of observations minus one. The between-groups degrees of freedom equal the number of groups minus one, and the within-groups degrees of freedom equal the total degrees of freedom minus the between-groups degrees of freedom. MS is the mean square (SS divided by df), and the F-statistic is the ratio of the mean squares. The P-value is the probability that the test statistic takes a value greater than the computed value of the test statistic; a small P-value (below 0.05) indicates that the differences between column means are significant and that H0 can be rejected.
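The entries of such an ANOVA table can be reproduced with a short sketch. This is a generic one-way ANOVA computation in plain Python, not the authors’ MATLAB software; the function name `one_way_anova_table` and the toy groups are hypothetical. The P-value is deliberately left out because it requires the F-distribution; a statistics library (for example `scipy.stats.f.sf`, if SciPy is available) could supply it.

```python
def one_way_anova_table(groups):
    """Compute SS, df, MS and F for a one-way ANOVA over a list of groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-groups sum of squares: group size times squared mean deviation.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-groups sum of squares: squared deviations from each group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "ss_within": ss_within,
        "df_between": df_between, "df_within": df_within,
        "df_total": n_total - 1,
        "ms_between": ms_between, "ms_within": ms_within,
        "f": ms_between / ms_within,
    }


# Toy feature values for three stress levels (illustrative only).
table = one_way_anova_table([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(table)
```

For the toy groups, the between-groups and within-groups sums of squares are approximately 26 and 6, with 2 and 6 degrees of freedom, giving an F-statistic of about 13.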

Boxplots illustrate the upper and lower confidence limits of the median for each group. The boxplots of both features for the foot and hand GSR signals are shown in Figs. 6, 7, 8 and 9.

Fig. 6 Classification performance of hand amplitude feature
Fig. 7 Classification performance of hand time feature
Fig. 8 Classification performance of foot amplitude feature
Fig. 9 Classification performance of foot time feature

The calculated between-group P-values of amplitude and rise time for all eight drivers, for both ‘foot’ and ‘hand’ signals, are shown in Figs. 10, 11, 12 and 13. P-values greater than 0.05 are considered errors that reduce the precision of the classification.

Fig. 10 P-value of hand’s amplitude feature
Fig. 11 P-value of hand’s time feature
Fig. 12 P-value of foot’s amplitude feature
Fig. 13 P-value of foot’s time feature
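Treating P-values above 0.05 as errors suggests a simple way to express a discrimination rate: the share of comparisons whose P-value falls below the significance level. The sketch below illustrates that idea only; the function name `discrimination_accuracy` and the P-values are hypothetical, and the procedure behind the reported accuracies may differ.

```python
def discrimination_accuracy(p_values, alpha=0.05):
    """Percentage of comparisons whose P-value is below `alpha`."""
    if not p_values:
        raise ValueError("at least one P-value is required")
    significant = sum(1 for p in p_values if p < alpha)
    return 100.0 * significant / len(p_values)


# Made-up P-values for four comparisons; one exceeds 0.05.
toy_p_values = [0.001, 0.02, 0.2, 0.03]
accuracy = discrimination_accuracy(toy_p_values)
print(accuracy)  # 75.0
```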

According to the results, the extracted foot GSR amplitude feature provides more precise information about the drivers’ stress level. In addition, the P-values of all drivers are compared, and the results show a significant difference between all three levels of stress for the amplitude feature. On the other hand, discrimination between high and moderate stress levels is less accurate when the hand signal time feature is used. The average discrimination accuracies for the different signals using the amplitude and time features are presented in Tables 6 and 7, respectively.

Table 6 Discriminant accuracy between different stress levels using amplitude feature of hand and foot signal for 8 drivers
Table 7 Discriminant accuracy between different stress levels using time feature of hand and foot signal for 8 drivers

It is concluded that the amplitude feature is the most informative source for identifying stress levels. Moreover, the foot GSR signal yields a higher stress recognition accuracy than the hand signal. The ANOVA classification model provides an average classification rate of 95.83%.

4 Conclusion

This research focuses on stress classification based on the GSR signal during the driving task. For this purpose, the ANOVA classifier model is employed as a classification tool. The GSR datasets are derived from the MIT Media Lab. It is concluded that the ANOVA model performs very well in classifying three levels of stress based on P-value calculations, with a 95.83% rate of accuracy in discriminating the three levels. We conclude that GSR signals alone can be a reliable data source for stress recognition and classification, particularly when the amplitude features of the foot signal are used with the ANOVA approach. The need for less expensive sensors and less interference with the driver during the driving task supports the view that a single-signal measurement procedure would be well suited to future stress detection devices. Although the accuracy achieved in this paper is promising and the acquisition of the GSR signal is considerably easier and less cumbersome than that of other physiological signals, the effect of external temperature on skin conductance, which might lead to some errors, is undeniable [30]. Since this research is based on previously collected data from the PhysioNet website, using more precise and comprehensive data would improve the classification accuracy.

Future research could aim at a noncontact stress measurement system that is inexpensive and easy to implement, minimizes the influence of external factors, and achieves high accuracy. A reliable stress measurement procedure that allows the driver to perform the driving task more conveniently is another possible direction.