Driver Classification Using Self-reported, Psychophysiological, and Performance Metrics Within a Simulated Environment

Kummetha, Vishal Chandra; Durrani, Umair; Mason, Justin; Concas, Sisinnio; Kondyli, Alexandra

doi:10.1007/s42421-023-00069-8

Driver Classification Using Self-reported, Psychophysiological, and Performance Metrics Within a Simulated Environment

Research
Published: 12 April 2023

Volume 5, article number 6, (2023)
Cite this article

Download PDF

Data Science for Transportation Aims and scope Submit manuscript

Driver Classification Using Self-reported, Psychophysiological, and Performance Metrics Within a Simulated Environment

Download PDF

1823 Accesses
1 Altmetric
Explore all metrics

Abstract

Driver classification provides an efficient approach to isolating unique traits associated with specific driver types under various driving conditions. Several past studies use classification to identify behavior and driving styles; however, very few studies employ both measurable physiological changes and environmental factors. This study looked to address the shortcomings in driver classification research using a data-driven approach to assess driving tasks performed under varying mental workloads. Psychophysiological and driving performance changes experienced by drivers when engaged in simulated tasks of varying difficulty were coupled with machine-learning techniques to provide a more accurate estimate of the ground truth for behavioral classification. A driving simulator study consisting of six tasks was carefully designed to incrementally vary complexity between individual tasks. Ninety drivers were recruited to participate in both the subjective and driving components of this research. Positive and Negative Affect Schedule (PANAS), Cognitive Reflection Task (CRT), Interpersonal Reactivity Index (IRI), Empathy Assessment Index (EAI), Psychological Entitlement Scale (PES), 18-point Need for Cognition (NFC), and a basic demographic survey were administered. Time-series clustering using the Dynamic Time Warping (DTW) score within hierarchical clustering was applied to determine difference in driving styles. The use of pre-driving psychometric questionnaires to determine the most suitable metrics for predicting driving style outside of the automobile suggest promising results. The results of the binary logistic regression indicate that an individual’s annual driving mileage, CRT score, age, IRI fantasy score, and EAI affective response score, contribute significantly towards predicting their driving style.

Assessment and Profiling of Driving Style and Skills

The Role of Eco-Driving and Wearable Sensors in Industry 4.0

User Engagement with Driving Simulators: An Analysis of Physiological Signals

Introduction

Driver classification is a common strategy used to easily group individuals based on similar behavioral traits, driving style, or demographic characteristics. Classification provides an efficient approach to isolating unique characteristics associated with specific driver types under various driving conditions. Not only can this process be used to develop better driver behavioral models for car-following or lane changing, but also to identify risk-averse driving behaviors; thus, providing context for targeted safe-driving education.

Existing research indicates that driver classification has primarily been performed using aggregate or continuous driving performance metrics and driving-centric questionnaires. Furthermore, very few studies incorporate the use of physiological (defined as the information pertaining to a driver’s bodily functions such as cognition, heart rate, vision-related changes, skin conductance, perspiration, and others (Lin et al. 2006)) changes and environmental factors when assigning driving styles. To achieve a more holistic classification, this study seeks to address the above shortcomings using a data-driven approach to assess driving tasks performed under varying traffic conditions resulting in measurable changes to mental workloads (defined as the proportion of mental capacity required by an individual to perform a task (Brookhuis et al. 1991)). We also go a step further to assess the overall performance under complex tasks such as distracted driving. The main goals of this research are to classify drivers based on their simulated driving and psychophysiological (defined as physiological changes resulting from variations of psychological processes or behavior (Kummetha 2020)) performance using a machine-learning approach and predict their classification outcome (i.e., driving profile) from the scores of various psychometric questionnaires administered prior to their driving assessment, as independent variables within logistic regression. The novelty of this research stems from the extensive driving performance and psychophysiological time-series data collected across various task complexities and for a relatively large pool of ninety participants.

The manuscript begins by providing a comprehensive overview on the existing research on subjective and quantitative driver classification. The section detailing the data and methods is then presented, diving into the participant selection and simulation, psychophysiological measures, data preparation, and methodological approach to driver classification and statistical modeling. The results, discussion, and conclusions are then presented.

Background

Several studies have employed classification to identify behavior and driving styles. Some rely on continuous measures of driving performance or physiological parameters while others take on a subjective approach relying on questionnaires. Aoude et al. developed algorithms to classifying drivers by considering their behavior at intersections (Aoude et al. 2012). The authors collected roadside data which included vehicle speeds, range, and lateral positions. A GPS antenna also recorded video from all intersection approaches including the traffic signal phases. They implemented two machine-learning algorithms called Support Vector Machine (SVM) and Hidden Markov Model (HMM) to classify whether the driver was compliant or violating the signal indication. The developed model was then validated on another naturalistic dataset.

However, roadside data alone may not accurately classify drivers. Driving performance data from inside the vehicle can provide richer information for driver classification. Kondyli and Elefteriadou observed behavioral trends within a naturalistic setting suggesting three driver behavioral types: aggressive, average, and conservative. Aggressive drivers were observed to drive at high speeds (> 15 mph speed limit), perform six discretionary lane changes, and aggressive lane changing (Kondyli and Elefteriadou 2011). Average drivers did not exceed 10 mph over posted speed limits while conservative drivers maintained speeds within 5 mph of posted limits. Fernandez et al. used several data elements such as gas/brake pedal usage, speed indicators, and acceleration collected from a 3-D virtual driving simulator with a lifelike cockpit to group drivers (Fernández and Ito 2016). They developed a Fuzzy rule-based system that can profile the drivers into five groups (very passive, passive, normal, aggressive, and dangerous). Ly et al. built both supervised (SVM) and unsupervised (K-means clustering) machine-learning models using the inertia data collected from an instrumented vehicle’s Controller Area Network (CAN) bus (Ly et al. 2013). The study showed that features associated with the acceleration did not contribute much to classify drivers, rather the braking and turning events were prominent features to differentiate between drivers. Zhang et al. also developed a similar SVM model to classify drivers based on individual characteristics sensed by only low-level sensors (Zhang et al. 2016). Low-level sensor data include data gathered from sensors that are available on smartphones along with the sensor information available through the vehicle’s on-board diagnostics (OBD-II) port. Acceleration and jerk were used as main parameters instead of speed, as these variables were more closely related to driving behavior (Zhang et al. 2016). One of the shortcomings of the study was that the classifier model worked best with fewer drivers, which may not be a realistic case for real world-driving conditions. A general framework to classify drivers through mathematical strategies was proposed by Jensen et al. (2011). In-vehicle data and survey data were both collected to build the framework to classify drivers into six classes ranging from ‘timid’ to ‘aggressive’. The in-vehicle data were collected through the OBD-II recorder and included vehicle speed, acceleration, jerk, brake pressure, steering wheel angle, and yaw rate. The survey was mostly limited to questionnaires regarding demographic characteristics of the drivers.

Further, researchers have also explored the use of physiological measures in driver classification. Lin et al. classified ten drivers by analyzing physiological measures in response to an unexpected obstacle in a virtual simulator. Drivers were classified using driving trajectory, speeds, and event-related potentials (ERP) to form two categories: aggressive and gentle (Lin et al. 2006). Zeng et al. utilized Electroencephalography (EEG) signals to classify driver’s mental state (Zeng et al. 2018). Physiological data (EEG signal, blink rate, and heart rate) were collected through a driving simulator experiment. They proposed two deep convolutional neural network-based models called EEG-Conv and EEG-Conv-R, which can predict the mental state of the drivers (whether the driver is drowsy or not). The experiment and validation concluded that the developed models were more reliable and accurate for binary classification of the driver’s mental state than traditional machine-learning models like SVM.

Subjective questionnaires are frequently used in driving research to capture information not readily available or captured in performance datasets. The Driver Behavior Questionnaire (DBQ) was initially created to assess whether a questionnaire could distinguish between four main factors, i.e., unfocused errors, emotional errors, reckless violations, and confused errors. The questionnaire could explain up to 33% of the variance in aggressive driving taking into account various errors and violations (Hong et al. 2014). Berdoulat et al. investigated the potential effects of aggressiveness and transgressive behavior of the drivers while driving (Berdoulat et al. 2013). They utilized five different types of scales and questionnaires to assess driving anger, aggressiveness, and impulsiveness. DBQ was used to classify three types of behaviors: “violations”, “slips and lapses”, and “Errors”.

Tao et al. investigated the role of personality traits and driving experience in risky driving behaviors among drivers in China (Tao et al. 2017). Personality traits were measured by Eysenck Personality Questionnaire (EPQ) which has 12-item subscales assessing four personality constructs such as: extraversion (e.g., “Are you a talkative person?”), neuroticism (e.g., “Does your mood often go up and down?”), psychoticism (e.g., “Do you enjoy co-operating with others?”) and lie (e.g., “Do you say some bad words casually?”). The risky driving behaviors were analyzed by the Chinese version of DBQ which modified the original four factors of DBQ into: ordinary errors and violations (15 items), lapses (6 items), aggressive violations (3 items), and speeding (3 items). Another questionnaire that has been moderately used in driving behavior studies is “The Driving Vengeance Questionnaire (DVQ)”. It was created and used to gauge how often drivers avenge themselves in typical driving circumstances (Wiesenthal et al. 2000). Wiesenthal et al. further validated the DVQ where they showed that DVQ scores were predicted by driver violence and vengeful drivers had violence into their past acts. In situations of heavy traffic, vengeful drivers were more likely to display moderate aggression.

Other psychometric evaluation techniques used in behavioral studies to assess mood and personality, cognitive engagement, and empathy and social decision-making, could also be applied to link general behavioral characteristics to driving styles. The most popular questionnaires/metrics as found in the literature are summarized in Table 1.

Table 1 Summary of common psychometric evaluators

Full size table

From the literature, it is evident that past studies have primarily performed classification using one of the three categories of data, i.e., driving performance, physiological measures, or driving-centric questionnaires. However, most metrics are directly associated with driving, without regard for other behavioral traits outside the realm of the automobile but still influence the tasks associated with driving. In addition, most of the previous driver-classification studies, focused on uncovering extreme behaviors (e.g., errors or risky behaviors) and their correlation with safety and crash occurrence. However, they do not investigate how driver variability and different driver styles are affected by the roadway/traffic environment and complexity, and consequently, how this relationship may impact the operation of the traffic stream. Therefore, the main objective of this study was to investigate driver variability and potential driver classes while completing driving tasks under different roadway environments and normal/crash-free operations, by combining a variety of complex psychophysiological and driving performance measures. A secondary objective was to assess whether static driver information obtained from subjective questionnaires and demographic data can equivalently be used to predict the derived driver classes. This is achieved by:

Data fusion of psychophysiological and driving performance changes experienced by drivers when engaged in simulated tasks of varying level of difficulty, coupled with machine-learning techniques to provide a more accurate estimate of the ground truth for behavioral classification, and
Employing pre-driving psychometric questionnaires to determine the most suitable subjective metrics for predicting driving style. If psychometric questionnaires have strong predictive qualities, this may save resources and time for defensive driving educators, researchers, driving rehabilitation specialists, and drivers.

Data and Methods

In this study, we utilize time-series datasets of complex psychophysiological and driving performance measures as the ground truth to complement a set of psychometric and demographic questionnaires to assign and predict driver classes. This section details the participant selection and driving simulation, psychophysiological measures employed, and data preparation and methodological framework.

Participants and Driving Simulation

The established methodology involves collecting driving performance and psychophysiological data from six tasks in a driving simulator setting as shown in Fig. 1. A fixed-base simulator, in a half cab, utilizing three forward screens to provide a 170° horizontal field of view and a rear screen was used in this research. Ninety participants ranging from 18 to 64 years of age were recruited for the study (mean age = 31.4 years, standard deviation (SD) = 14.2 years, Male driver to Female driver ratio of 1). PANAS, CRT, IRI, EAI, PES, 18-point NFC, and a basic demographic survey (i.e., age; gender; annual mileage traveled; crash history; insurance type; traffic violation history; cell phone use while driving; and education level) were administered to drivers prior to their participation in the simulated drive. Drivers were compensated for their participation in the study with a $50 gift card. Table 2 presents a summary of the demographic properties of the recruited drivers.

Table 2 Summary of participant demographics

Full size table

Participants’ eligibility for the study was determined based on the following criteria:

Aged between 18 and 65 years and in possession of a valid U.S. driver’s license;
At least 1 year of driving experience with an annual mileage of not less than 1,000 miles; and
In good health (no eye conditions apart from corrective lenses, free from any heart conditions and seizures, no history of inner ear/balance problems, and little to no known history of severe motion sickness).

For the simulated drive, a tutorial and six tasks consisting of select external conditions as detailed in Table 3 were created. The tutorial comprised of familiarizing participants with the simulator environment and equipment, particularly gas and brake pedal responsiveness, steering wheel sensitivity, speed and gap adjustment, and lane changing activity. The tutorial also served as a tool to screen for simulator sickness and any participants that showed signs of severe simulator sickness (i.e., vertigo, nausea, sweating, dizziness, fatigue, stomach awareness, vomiting, and general discomfort) were advised to forfeit with a partial monetary compensation.

Table 3 Summary of common psychometric evaluators

Full size table

Each task was 8 km long with a posted speed limit of 70 mph (112.7 km/h) and was designed to capture incremental changes to behavioral and driving performance measures. Tasks varied in traffic density, but drivers were free to maintain any gap or speed. Emphasis was placed in scenario design to carefully control in-vehicle body movements, visual and auditory discrepancies, and task uniformity, as these significantly affect data quality of psychophysiological measures. The task sequence for each participant was completely randomized to eliminate order bias. Further, psychophysiological measures were continuously collected during the drive by capturing micro-level changes to vision and attention. At the end of each task, participants were required to complete the National Aeronautics and Space Administration-Task Load Index (NASA-TLX) (Hart 1986) and Situation Awareness Rating Technique (SART) (Selcon and Taylor 1990) questionnaires to provide a well-rounded subjective baseline.

Tasks 4, 5, and 6 shared the same roadway configuration as shown in Fig. 2. Task 6 additionally required participants to attempt a visual secondary task on a 10-in. Windows tablet, up to four times at random intervals no longer than 20 s each.

The tablet was placed approximately 30° to the lower right from drivers’ line of sight and was used to simulate the act of driving while using the media center or entering text into an on-board navigation system as shown in Fig. 3 (Kummetha 2020; Kummetha et al. 2020). The data collection zone was limited to the middle 6.4 km to minimize data irregularities arising from the beginning and ending of the drive, shown in Fig. 2.

Psychophysiological measures

For the continuous measurement of vision-related metrics such as pupil diameter, gaze point vectors, blinks, and fixations, a Fovio FX3 eye tracker with a data collection frequency of 60 Hz was utilized (EyeTracking-Inc. 2019). Mental workload was established from the eye tracker in one-second increments using a patented pupillometric technique called Index of Cognitive Activity (ICA) (Vogels et al. 2018). ICA was chosen over the raw pupil diameter metric because of its ability to disentangle pupil response arising from changing lighting conditions especially prominent in virtual environments (Vogels et al. 2018). ICA scores vary continuously between 0 and 1 indicating the lower and higher spectrum of mental workloads, respectively.

Engagement level (EL), defined as the level of engagement/arousal experienced in response to a particular task, was an important component in this research (Pope et al. 1995). Enobio 8 portable Bluetooth-based system was used to capture participants baseline and time-correlated EEG at 500 Hz (Neuroelectrics 2019). Eight EEG electrode positions (i.e., P3, PZ, P4, CZ, T7, T8, O1, and FZ) aligned with the 10–20 international mapping layout and findings of Pope et al. were used as shown in Fig. 4 (Pope et al. 1995). An ear clip was also used to add two reference electrodes, i.e., common mode sense (CMS) and driven right leg (DRL). Driver EL was calculated using the power spectral density (PSD) relationship β/(α + θ) established by Pope et al. and Prinzel III et al. to assess alertness and engagement, mental effort, and attention investment; with theta band (4–8 Hz), alpha band (8–13 Hz): linked to mental workload, and cognitive fatigue, and beta band (13–22 Hz): linked to visual attention and short-term memory (Pope et al. 1995; Prinzel 2001).

Data were first system time-synchronized to identify critical analysis zones using MATLAB. The EEG data were isolated using unique identifiers and split into 10-s events. This resulted in approximately twenty 10-s events for each task with the goal of 20-point estimates of EL per task. Postprocessing was performed using the EEGLAB v14.1.1 toolbox in MATLAB (Delorme and Makeig 2004). A custom MATLAB script was developed to convert data from the native Neuroelectrics-Enobio 8 (.easy) format to EEGLAB (.set). The EEG data were then resampled from their native 500 to 240 Hz to ensure noise and processing time reduction. The dataset was filtered by applying a high-pass filter at 1.6 Hz and a notch filter at 60 Hz. An independent component analysis was performed by identifying and pruning artifacts from blinks and muscle movement to improve signal quality. This was followed by the application of a low-pass filter at 40 Hz. The filtered signal was then split into 2-s epochs with 5% overlap. The Darbeliai plugin in EEGLAB was applied to perform a fast Fourier transformation (FFT) using a 2-s window to generate PSD values for each electrode position. Using β/(α + θ) at PZ, CZ, P3, and P4 electrode locations, a single combined PSD value representing driver EL was obtained every 10-s of the drive. PSD combined values closer to 0 indicate lower EL.

Further, a Polar H10 chest strap was used to monitor drivers heart rate (HR) at 1 Hz as a surrogate measure of mental workload. HR data were manually synchronized with the frames of the driving simulator output. Baseline HR data were collected prior to the start of the drive and HR variability within participants was normalized with respect to the baseline.

Data Preparation and Methods Summary

Due to the vast data collection techniques used in the study design, data fusion is employed to merge all output elements. The data fusion framework used to uniformly synchronize all data variables to 10 Hz is shown in Fig. 5.

Figure 6 summarizes the methodological approach with respect to five main stages, i.e., psychometric questionnaires, simulator study, data fusion, driver classification, and the prediction model.

Results

Task-Specific Analysis

As each task was designed to capture performance and behavioral variability, changes in mental workload and situation awareness between individual tasks were assessed using a series of repeated measure analysis of variances (ANOVAs) with task 1 as the baseline. The ANOVAs were used to test the null hypothesis that the driver-reported means of the average NASA-TLX and SART scores across the six tasks were similar and to identify the presence of any statistically significant differences, providing quantitative validity for the experimental design in terms of the varying complexity of the attempted driving tasks. The average NASA-TLX scores showed no significant differences (α = 0.05) between the mean scores of task 2 and task 1 (baseline). However, significant differences in scores were observed between the baseline and tasks 3 (F(1, 83) = 4.087, p = 0.046, ${\eta }_{P}^{2}$ = 0.047), 4 (F(1, 83) = 16.298, p < 0.001, ${\eta }_{P}^{2}$ = 0.164), 5 (F(1, 83) = 35.230, p < 0.001, ${\eta }_{P}^{2}$ = 0.298), and 6 (F(1, 83) = 201.257, p < 0.001, ${\eta }_{P}^{2}$ = 0.708). The significant increases in NASA-TLX scores suggest that the developed tasks successfully captured various levels of mental workload, shown in Fig. 7.

Similar trends across tasks were observed with the SART scores. No significant differences were observed between the baseline and the mean scores of the task 2 and task 3; however, task 4 (F(1, 83) = 7.448, p = 0.008, ${\eta }_{P}^{2}$ = 0.082), task 5 (F(1, 83) = 7.840, p = 0.006, ${\eta }_{P}^{2}$ = 0.086), and task 6 (F(1, 83) = 26.794, p < 0.001, ${\eta }_{P}^{2}$ = 0.244) showed significant differences. Overall, with greater task complexity an increase in subjective mental workload was observed along with a decrease in situation awareness (Kummetha and Kondyli 2022).

Time-Series Clustering

As tasks were constant between participants, time-series clustering using the Dynamic Time Warping (DTW) algorithm was selected for the analysis. The DTW algorithm provides a robust method of measuring and matching similar time-series datasets, such as trajectory and psychophysiological data in our case. The DTW algorithm also allows us to compare time-series datasets of different lengths, or even with time lag between two pairs of interest, especially useful in driving studies where no two events can be completely identical (Giorgino 2009; Mueen and Keogh 2016). The DTW score (dissimilarity) for the two time-series (D1, D2) can be calculated using Eq. 1. The higher the DTW score the larger the dissimilarity and the closer the score to zero the more identical the observed pattern in the selected variable across the pair of drivers (Durrani and Lee 2019).

$$DTW\left(D1, D2\right)=\frac{1}{K}\sqrt{\sum_{k=1}^{K}{d}_{k},}$$

(1)

where ${d}_{k}$ is the distance in cell k in the shortest path. K is the total number of cells in the shortest path.

The same approach is employed to compare more than two time-series trends. In this study, acceleration (m/s²), steering wheel angle (degrees), headway (m), mental workload as ICA from left eye, and EL, were used for time-series clustering using the hierarchical clustering algorithm with DTW score as the distance metric (Yan et al. 2019). HR was not included as ICA was deemed to be more accurate for measuring the mental workload. Hierarchical clustering was selected due to its ability to provide intuitive visualization of the clustering process and unsupervised identification of cluster numbers (Johnson 1967). Hierarchical clustering was performed with various iterations of the number of clusters (i.e., n = 2 to 5), resulting in an optimal solution consisting of two clusters. The Silhouette coefficient was also calculated for various n values to verify the optimal solution and the best/largest coefficient value (0.33) was obtained using n = 2 (Rousseeuw 1987). Hierarchical clustering does have some limitations associated with sensitivity to the distance metric and inability to handle missing data points, however, with careful selection of the distance metric (i.e., DTW score) the effects were minimized.

The summary of the aggregate metrics for the two clusters established for each task is shown in Table 4. The two established clusters did not exhibit qualities of aggressive driving as evident from the average speeds and headways. Average speeds were within 6 km/h of the posted speed limit while the smallest headway was 68.4 m (2.2 s). Drivers showed distinct properties within these clusters, including differences in self-reported measures of mental workload (NASA-TLX score) and situation awareness (SART score).

Table 4 Cluster properties

Full size table

Cluster outputs from the six driving tasks were aggregated to individually classify drivers as having moderate or conservative driving styles i.e., if a driver was classified to represent moderate driving style in three or more of the tasks, they were classified as moderate, else conservative. This was necessary because although hierarchical clustering classifies drivers, it is not apparent as to what each cluster represents, and the aggregate cluster properties help differentiate the behavioral composition within them. Figure 8 shows the mean variable spread of the conservative and moderate clusters across the six tasks. Cluster ID 2 in almost all instances has a larger surface spread, with the exception of the speed axis. This surface spread explains the differences between the two clusters and suggests that moderate drivers (i.e., Cluster 1) were younger with less driving experience and on average maintained shorter headway, at higher but safe speeds, and exhibited lower mental workload and higher situation awareness compared to conservative drivers.

The recorded increase in self-reported mental workload and decrease in situation awareness across both driver clusters could be a result of driving under extreme/non-typical conditions (such as Task 6 with the presence of a distraction). Out of 90 drivers, 84 drivers completed all the components of the questionnaires and driving tasks. 20 drivers were classified as overall conservative across all tasks while 64 drivers were classified as moderate.

Logistic Regression

A 70/30 training–testing stratified split was applied to each of the obtained cluster IDs before performing further analysis. Binary logistic regression, shown in Eq. 2, is a form of statistical modeling appropriate for independent binary outcome variables, such as in classification problems to distinguish one class from another. Further, binary logistic regression is capable of handling both continuous and categorical independent variables and provides informative outcomes expressed as odds ratios. In this application, binary logistic regression was performed to distinguish between conservative (indicated by the value 0) and moderate (indicated by the value 1) drivers and predict their driving styles, using the psychometric metric and demographic questionnaires data.

$$\mathrm{log}\left(\frac{p}{1-p}\right)={\beta }_{0}+{\beta }_{1}{X}_{1}+{\beta }_{2}{X}_{2}+\cdots +{\beta }_{n}{X}_{n},$$

(2)

where $\mathrm{log}()$ is the natural logarithm. $p$ is the probability of the dependent variable. ${\beta }_{0}$ is the intercept of the constant term. ${\beta }_{1},\dots ,{\beta }_{n}$ are the coefficients assigned to the independent variables ${X}_{1},\dots ,{X}_{n}$

Training

A total of 10 iterations were performed using the following independent variables from the survey and questionnaires: PANAS (subscales: positive affect, negative affect), CRT, IRI (subscales: perspective taking, fantasy, empathic concern, and personal distress), EAI (subscales: AR, SA, ER, PT, EA), PES, 18-point NFC, and other demographics (i.e., age; gender; annual mileage traveled; crash history; insurance type; traffic violation history; cell phone use while driving; and education level). A balance between over-fitting the model and classification accuracy was achieved by excluding correlated parameters within the questionnaires. The backward stepwise selection was implemented to sequentially remove statistically insignificant variables in the model. Manual checks were performed to ensure that the included independent variables were contributing substantially to the beta coefficient (B) i.e., Exp(B) and 95% confidence intervals deviating considerably from 1. The results of the logistic regression are shown in Table 5.

Table 5 Logistic regression summary

Full size table

The results from the binary logistic regression indicate that an individual’s annual mileage, CRT score, Age, IRI fantasy (FS), and EAI affective response (AR) can significantly help in predicting their driving style (moderate or conservative) with a training prediction accuracy of 87.7%. Figure 9 shows the scatter plots of the training dataset with respect to the significant variables in the logistic regression model. Results indicate a weak to moderate correlation between Annual Mileage and Age (r = 0.39) and CRT score and EAI AR (r = − 0.32), suggesting no substantial effect of multicollinearity.

Testing

Results from the confusion-matrix of the test sample, shown in Table 6, indicate a higher-than-average classification accuracy of 65.4% and a relatively good recall, precision score, and F₁ score, suggesting a good predictive model (especially for moderate drivers). However, for conservative drivers the percent of correct predictions is relatively low at 33%. This may be attributed to factors such as the relatively low sample size of the conservative drivers, selected training–testing data split, and the existence of other non-linear relationships between the variables not accounted for within the binary logistic regression.

Table 6 Confusion matrix

Full size table

Discussion

An important objective of this research was to investigate driver variability under different roadway/traffic environments and task complexity. This was achieved through the testing of six driving simulator scenarios (Table 3) with multiple levels of task difficulty by varying the traffic, geometry, and control conditions, and presence of distraction. It was found that, as task difficulty increased, mental workload increased and situation awareness decreased, supporting the linear relationship between these variables and ensuring that our data collection methods were appropriate to study these relationships.

Time-series clustering using the hierarchical approach with the DTW score as the distance metric was also performed to classify drivers by utilizing a holistic approach where both performance metrics (acceleration, headway, steering wheel angle) and psychophysiological characteristics (mental workload, situation awareness) were adopted. The two derived clusters (i.e., moderate and conservative) indicate distinct differences in driver age, driving experience, average headway, average speed, and average subjective scores for mental workload and situation awareness, similar to the observations made by Lin et al. (2006) and Manjunatha et al. (2019).

The second modeling approach of binary logistic regression was used to investigate whether we can use offline, static driver information to accurately cluster drivers. The results of the logistic regression indicate that annual driving mileage, CRT score, Age, IRI fantasy score, and EAI affective response score, can be used to predict driving styles with a decent accuracy and precision. The odds ratios (Exp(B)) for the significant variables shown in Table 5 were assessed for more insights into the findings. The odds ratio for annual mileage suggests that an increase in reported mileage increases the chance of being a moderate driver. To be specific, every 1000 miles increase in the reported annual mileage is associated with a 54.7% increase in the odds of being a moderate driver. The CRT score can be used as a measure of intuitive thinking and objective assessment, an important component of executive function, which is crucial for safe driving. The results suggest that an increase in the CRT score (indicating higher level of objective assessment) by one point decreases the odds of being a moderate driver by 60.7%. Similarly, for every 1-year increase in age, the odds of being classified as a moderate driver decreases by 10.4%.

The IRI FS score, which represents an individual’s tendency to transpose into imaginative feeling and characters, was also found to be significant. A one-point increase in IRI FS score decreases the odds of being classified as a moderate driver by up to 14%. Although, not directly intuitive to driving, inference can be made from the fact that a high IRI FS score indicates the general tendency to use more cognitive resources engaging in thoughts while driving suggesting a slower behavioral adaptation to changes in conditions, typically associated with decreased driving performance (Davis 1983).

EAI AR, which is a subcomponent of empathy derived from affective response, was also found to be significant. An increase in the EAI AR score by one point suggests up to 28.6% decrease in the likelihood of being classified as moderate driver. This result seems plausible and similar to the past research (Wiesenthal et al. 2000) as an increase in empathy could indicate less vengeance and lead to more conservative driving behavior such as letting motorists cut in front or allowing merging at on/off ramps.

Overall, the results show similarities with past research utilizing other forms of questionnaires with similar subcomponents of personality traits. However, the differences lie within the variety of psychometric and psychophysiological data collected, multi-complexity experimental design, clustering approach, and prediction methods. The combined accuracy of predicting driving style from the logistic regression performed on the training and testing datasets was 76.6%, which is higher than similar studies using non-driving performance metrics for classification (Liao et al. 2022; Razmara et al. 2018).

Conclusions

This study aimed at addressing the shortcomings in driver classification research using a data-driven approach and assess driving tasks performed under varying mental workloads. A simulator study was carefully designed to achieve variability in complexity between individual tasks. Ninety drivers were recruited to participate in both the subjective and driving components of this research. The main goals of this research were achieved by successfully performing data fusion of psychophysiological and driving performance. Time-series clustering using the DTW score as the distance metric within the hierarchical clustering was performed to provide a more accurate and holistic estimate of the ground truth for behavioral classification.

Evaluating the use of pre-driving psychometric questionnaires to determine the most suitable metrics for predicting driving style outside of the automobile suggest promising results. Binary logistic regression summary indicates that an individual’s annual driving mileage, CRT score, Age, IRI fantasy score, and EAI affective response score, can be directly attributed to identifying their driving style. The results provide a promising outlook towards the practical strength of using psychometric evaluators in driver profiling with respect to personality, attitudes, and habits, risk assessment to develop targeted interventions for road safety improvement, deeper understanding of driver behavior and decision making processes that affect the conscious and subconscious components of driving, development of training and education programs targeted towards pre-screened individuals, and design/development of intelligent and autonomous transport systems to tailor behavioral needs and preferences of individual drivers.

Although the research goals were achieved, the limitations of simulator-based driving data should be acknowledged. However, the use of complex physiological equipment such as EEG and presence of hazardous conditions (in-vehicle distractions) might not be feasible for real-world testing. A larger sample size would provide more concrete evidence of the observed findings. Lastly, large sample sizes are also needed to investigate the attributes of the aggressive driving style cohort, since the existing dataset only captured moderate and conservative driving styles.

Based on the conclusions derived from this study, future research can utilize a similar methodology to study how psychometric evaluators can be used to predict driver behavior during partial (i.e., adaptive cruise control and lane centering) and full automation. Another aspect to explore would be utilizing psychometric evaluators to subjectively predict individual driving styles during travel-altering emergencies such as pandemics (i.e., coronavirus disease 2019 (COVID-19) travel restrictions/lockdowns) and disaster evacuations (i.e., hurricanes, wildfires, floods, and other types of severe weather).

Availability of Data and Materials

The datasets generated during and/or analyzed during the study are not publicly available due to them containing sensitive information that could compromise research participant privacy/consent but are available from the corresponding author [VK] on reasonable request.

References

Aoude GS, Desaraju VR, Stephens LH, How JP (2012) Driver behavior classification at intersections and validation on large naturalistic data set. IEEE Trans Intell Transp Syst 13(2):724–736. https://doi.org/10.1109/TITS.2011.2179537
Article Google Scholar
Berdoulat E, Vavassori D, Sastre MT (2013) Driving anger, emotional and instrumental aggressiveness, and impulsiveness in the prediction of aggressive and transgressive driving. Accid Anal Prev 50:758–767. https://doi.org/10.1016/j.aap.2012.06.029
Article Google Scholar
Brookhuis KA, de Vries G, de Waard D (1991) The effects of mobile telephoning on driving performance. Accid Anal Prev 23(4):309–316. https://doi.org/10.1016/0001-4575(91)90008-s
Article Google Scholar
Cacioppo JT, Petty RE, Feng Kao C (1984) The efficient assessment of need for cognition. J Pers Assess 48(3):306–307. https://doi.org/10.1207/s15327752jpa4803_13
Article Google Scholar
Campbell WK, Bonacci AM, Shelton J, Exline JJ, Bushman BJ (2004) Psychological entitlement: interpersonal consequences and validation of a self-report measure. J Pers Assess 83(1):29–45. https://doi.org/10.1207/s15327752jpa8301_04
Article Google Scholar
Davis MH (1983) Measuring individual differences in empathy: evidence for a multidimensional approach. J Pers Soc Psychol 44(1):113–126. https://doi.org/10.1037/0022-3514.44.1.113
Article Google Scholar
Delorme A, Makeig S (2004) EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis. J Neurosci Methods 134(1):9–21. https://doi.org/10.1016/j.jneumeth.2003.10.009
Article Google Scholar
Durrani U, Lee C (2019) Investigating heterogeneity in car-following behavior due to driving style and lead vehicle type: implications for the wiedemann model. In: 98th transportation research board annual meeting, Washington, DC
EyeTracking-Inc. (2019) EyeWorks 3 User Manual
Fernández S, Ito T (2016) Driver classification for intelligent transportation systems using fuzzy logic. In: 2016 IEEE 19th international conference on intelligent transportation systems (ITSC), pp 1212–1216
Frederick S (2005) Cognitive reflection and decision making. J Econ Perspect 19(4):25–42. https://doi.org/10.1257/089533005775196732
Article Google Scholar
Giorgino T (2009) Computing and visualizing dynamic time warping alignments in R: the dtw package. J Stat Softw 31(7):1–24. https://doi.org/10.18637/jss.v031.i07
Article Google Scholar
Hart SG (1986) NASA task load index (TLX)
Highway Capacity Manual (2010) Fifth edition. Washington, D.C.: Transportation Research Board, c2010-. https://search.library.wisc.edu/catalog/9910110589002121
Hong J-H, Margines B, Dey AK (2014) A smartphone-based sensing platform to model aggressive driving behaviors. In: Proceedings of the SIGCHI conference on human factors in computing systems, Toronto, Ontario, Canada.https://doi.org/10.1145/2556288.2557321
Jensen M, Wagner J, Alexander K (2011) Analysis of in-vehicle driver behaviour data for improved safety. Int J Veh Saf 5(3):197–212. https://doi.org/10.1504/ijvs.2011.042850
Article Google Scholar
Johnson SC (1967) Hierarchical clustering schemes. Psychometrika 32(3):241–254
Article MATH Google Scholar
Kondyli A, Elefteriadou L (2011) Modeling driver behavior at freeway-ramp merges. Transp Res Rec 2249(1):29–37. https://doi.org/10.3141/2249-05
Article Google Scholar
Kummetha VC, Kondyli A (2022) Simulator-based framework to incorporate driving heterogeneity via a biobehavioral extension to the intelligent driver model. Transport Res Rec 1(1):03611981221134630. https://doi.org/10.1177/03611981221134630
Article Google Scholar
Kummetha VC, Kondyli A, Chrysikou EG, Schrock SD (2020) Safety analysis of work zone complexity with respect to driver characteristics — A simulator study employing performance and gaze measures. Accident Anal Prevent 142:105566. https://doi.org/10.1016/j.aap.2020.105566
Article Google Scholar
Kummetha VC, Kondyli A, Devos H (2021) Evaluating driver comprehension of the roadway environment to retain accountability of safety during driving automation. Transport Res Part F Traffic Psychol Behav 81:457–471. https://doi.org/10.1016/j.trf.2021.07.007
Article Google Scholar
Kummetha VC (2020) Incorporating biobehavioral architecture into car-following models: a driving simulator study, University of Kansas]. https://www.proquest.com/openview/59eab8d718164956a0befca0e85df097/1?cbl=18750&diss=y&pq-origsite=gscholar
Liao X, Mehrotra S, Ho S, Gorospe Y, Wu X, Mistu T (2022) Driver profile modeling based on driving style, personality traits, and mood states. In: 2022 IEEE 25th international conference on intelligent transportation systems (ITSC)
Lietz CA, Gerdes KE, Sun F, Geiger JM, Wagaman MA, Segal EA (2011) The Empathy Assessment Index (EAI): a confirmatory factor analysis of a multidimensional model of empathy. J Soc Soc Work Res 2(2):104–124. https://doi.org/10.5243/jsswr.2011.6
Article Google Scholar
Lin C, Liang S, Chao W, Ko L, Chao C, Chen Y, Huang T (2006) Driving style classification by analyzing EEG responses to unexpected obstacle dodging tasks. In: 2006 IEEE international conference on systems, man and cybernetics
Ly MV, Martin S, Trivedi MM (2013) Driver classification and driving style recognition using inertial sensors. IEEE Intell Veh Symp (IV) 2013:1040–1045
Google Scholar
Manjunatha P, Kummetha VC, Kondyli A, Elefteriadou L (2019) Validating the task-capability extension to the Intelligent Driver Model (IDM) using driving simulator data. In: 98th annual meeting of the transportation research board
Mueen A, Keogh E (2016) Extracting optimal performance from dynamic time warping. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, San Francisco, California, USA. https://doi.org/10.1145/2939672.2945383
Neuroelectrics (2019) Neuroelectrics User Manual-Enobio 8
Pope AT, Bogart EH, Bartolome DS (1995) Biocybernetic system evaluates indices of operator engagement in automated task. Biol Psychol 40(1):187–195. https://doi.org/10.1016/0301-0511(95)05116-3
Article Google Scholar
Prinzel LJ (2001) Empirical Analysis of EEG and ERPs for Psychophysiological Adaptive Task Allocation. https://books.google.com/books?id=oYg9AQAAMAAJ
Razmara A, Aghamolaei T, Madani A, Hosseini Z, Zare S (2018) Prediction of taxi drivers’ safe-driving behaviors based on the theory of planned behavior: the role of habit. J Educ Health Promotion. https://doi.org/10.4103/jehp.jehp_61_18
Article Google Scholar
Rousseeuw PJ (1987) Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J Comput Appl Math 20:53–65
Article MATH Google Scholar
Selcon, S. J., & Taylor, R. (1990). Evaluation of the Situational Awareness Rating Technique(SART) as a tool for aircrew systems design. AGARD, Situational Awareness in Aerospace Operations 8 p(SEE N 90–28972 23–53).
Tao D, Zhang R, Qu X (2017) The role of personality traits and driving experience in self-reported risky driving behaviors and accident risk among Chinese drivers. Accid Anal Prev 99(Pt A):228–235. https://doi.org/10.1016/j.aap.2016.12.009
Article Google Scholar
Vogels J, Demberg V, Kray J (2018) The index of cognitive activity as a measure of cognitive processing load in dual task settings. Front Psychol 9:2276. https://doi.org/10.3389/fpsyg.2018.02276
Article Google Scholar
Watson D, Clark LA, Tellegen A (1988) Development and validation of brief measures of positive and negative affect: the PANAS scales. J Pers Soc Psychol 54(6):1063
Article Google Scholar
Wiesenthal DL, Hennessy D, Gibson PM (2000) The Driving Vengeance Questionnaire (DVQ): the development of a scale to measure deviant drivers’ attitudes. Violence Vict 15(2):115–136
Article Google Scholar
Yan F, Liu M, Ding C, Wang Y, Yan L (2019) Driving style recognition based on electroencephalography data from a simulated driving experiment [original research]. Front Psychol. https://doi.org/10.3389/fpsyg.2019.01254
Article Google Scholar
Zeng H, Yang C, Dai G, Qin F, Zhang J, Kong W (2018) EEG classification of driver mental states by deep learning. Cogn Neurodyn 12(6):597–606. https://doi.org/10.1007/s11571-018-9496-y
Article Google Scholar
Zhang C, Patel M, Buthpitiya S, Lyons K, Harrison B, Abowd GD (2016) Driver classification based on driving behaviors. In: Proceedings of the 21st international conference on intelligent user interfaces, Sonoma, California, USA. https://doi.org/10.1145/2856767.2856806

Download references

Acknowledgements

The authors would like to thank the Mid-America Transportation Center for funding and supporting this research initiative. They would also like to thank Dr. Evangelia Chrysikou from Drexel University for aiding with the questionnaires and methodological developments.

Funding

This work was supported by the Mid-America Transportation Center (MATC) under the Grant number 69A3551747107.

Author information

Authors and Affiliations

Center for Urban Transportation Research, University of South Florida, Tampa, FL, USA
Vishal Chandra Kummetha & Sisinnio Concas
Department of Civil and Environmental Engineering, University of Windsor, Windsor, ON, Canada
Umair Durrani
National Advanced Driving Simulator, University of Iowa, Coralville, IA, USA
Justin Mason
Department of Civil, Environmental, and Architectural Engineering, University of Kansas, Lawrence, KS, USA
Alexandra Kondyli

Authors

Vishal Chandra Kummetha
View author publications
You can also search for this author in PubMed Google Scholar
Umair Durrani
View author publications
You can also search for this author in PubMed Google Scholar
Justin Mason
View author publications
You can also search for this author in PubMed Google Scholar
Sisinnio Concas
View author publications
You can also search for this author in PubMed Google Scholar
Alexandra Kondyli
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

Conceptualization: VCK, AK; methodology: VCK, UD, JM; formal analysis and investigation: VCK, UD, JM, SC; writing—original draft preparation: VCK, UD; writing—review and editing: VCK, UD, JM, SC, AK; funding acquisition: AK. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Vishal Chandra Kummetha.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Cite this article

Kummetha, V.C., Durrani, U., Mason, J. et al. Driver Classification Using Self-reported, Psychophysiological, and Performance Metrics Within a Simulated Environment. Data Sci. Transp. 5, 6 (2023). https://doi.org/10.1007/s42421-023-00069-8

Download citation

Received: 22 September 2022
Revised: 22 February 2023
Accepted: 04 April 2023
Published: 12 April 2023
DOI: https://doi.org/10.1007/s42421-023-00069-8

Driver Classification Using Self-reported, Psychophysiological, and Performance Metrics Within a Simulated Environment

Abstract

Similar content being viewed by others

Assessment and Profiling of Driving Style and Skills

The Role of Eco-Driving and Wearable Sensors in Industry 4.0

User Engagement with Driving Simulators: An Analysis of Physiological Signals

Introduction

Background