1 Introduction

The World Health Organization (WHO) recently stated that around 1.35 million people die every year in road accidents around the globe [1]. Around the world, on average, out of 5,891,000 vehicle accidents each year, about 21% (1,235,000) are due to adverse weather and road conditions such as heavy rainfall, sleet, fog, snow, and blowing sand or debris. According to the study by Hamilton [2], based on NHTSA data (2007–2016), almost 5,000 to 6,000 people are killed and over 400,000 people are injured on average due to unfavorable weather and road conditions. Bucsuházya et al. [3] have stated that human factors are responsible for road accidents. After analyzing a large number of accidents of various types, they put forth reasons such as inexperienced driving, panic, glare, unstable health conditions, drowsiness, mental illness, handicap, speeding, and violation of traffic rules. Road accidents happen in a fraction of a second, even to an expert driver who makes a small mistake or gets distracted. An intelligent environment can provide solutions for such real-time issues. An intelligent environment [4] is a system enriched with interconnected devices such as sensors, actuators, processors, and information terminals. It gathers real-time information and makes decisions to benefit the environment based on the accumulated historical data. Vehicle automation is a part of intelligent transportation [5] that uses the Internet of Things (IoT), Artificial Intelligence (AI), mechanics, and electronics to assist the person driving the vehicle and to give the vehicle the ability to make decisions in place of the driver in case of a mishap. A vehicle employing such features is an intelligent or smart vehicle [6]. Such a smart vehicle can help reduce the probability of an accident.

IoT [7] is defined as a sensor network or a collection of everyday things that are interconnected through the internet and can collect, process, and share vital data [8]. Different kinds of measuring devices, such as ultrasonic sensors, passive infrared (PIR) sensors, digital temperature and humidity (DHT) sensors, and light-dependent resistors (LDR), sense environmental data such as distance, heat, and light. Car-following models in general are used to calculate the safe speed of the vehicle with respect to the preceding vehicle based on the performance of the driver [9]. An intelligent car-following model is a smart system developed to assist drivers by automatically predicting the safe speed based on various environmental conditions [10]. Smart cars and car-following models use the aforementioned sensors to measure and predict quantities such as tire pressure, temperature, the distance to be maintained between two vehicles, road condition, and GPS position [11]. The data is then reconstructed into a logical format for collection or further processing. IoT can also be combined with AI [12].

AI, in simple terms, means providing a brain or thinking capability to a machine. Technically, it is a part of computer science that concentrates on making smart machines that act like humans, e.g., through speech recognition [13], problem-solving [14], and natural language understanding [14]. Machine learning (ML) is a subset of AI that also plays a significant role in car-following models. ML equips computers to learn and interpret data without being explicitly programmed to do so [15]. Vehicle automation includes the use of IoT sensors [16], electronics, AI [13], and ML to assist the motorist. An automobile employing these characteristics can be called a smart car [17].

Besides driver errors such as speeding, drunken driving, and jumping red lights, adverse road and weather conditions are also responsible for many of the accidents that happen throughout the world [18]. For example, roads become slippery as soon as the snow melts, and a fast-moving car might get out of control if the brakes are applied suddenly. Rain, floods, slippery roads, fog, uneven roads, etc. all increase the probability of accidents [19, 20]. Climate and roadway conditions create a risky setting for the vehicle and increase the reaction time of the driver. An efficient monitoring system is required to alert drivers to external weather and road conditions. It should continuously monitor the environment and periodically advise the driver on the speed to maintain by analyzing the current weather and road conditions.

The main contribution of this paper is an intelligent IoT based accident avoidance system for adverse weather and road conditions. The proposed system is a combination of an IoT system and a machine learning system. The IoT system is used to perceive the environment for adverse weather and road conditions. The machine learning model is used to analyze the dataset collected by the IoT system to predict the safe speed and make intelligent decisions to assist the driver. The highlights of the contributions are as follows:

  • An IoT-based system is modelled to sense and collect data with respect to the weather and road conditions.

  • Experimentation with various machine learning models to identify the best ML technique to be adopted for the car-following model.

  • Simulation of the proposed system through the Blynk application.

  • Experimental analysis and performance evaluation of the proposed system.

In this paper, we also analyze the dataset collected by Hjelkrem and Ryeng [21], which consists of around 311,908 observations. Sample records of the dataset along with its parameters are shown in Table 1.

Table 1 Parameters and sample records of the dataset

The remainder of the paper is organized as follows: Sect. 2 presents the literature survey to support the research on car-following models. In Sect. 3, the proposed methodology is presented along with the system architecture. Section 4 presents the experimental setup required for the system. Section 5 discusses the results and observations. In Sect. 6, we present the experimental analysis of the system on a real-time dataset. Section 7 compares the proposed system with other existing systems. Finally, Sects. 8 and 9 conclude the paper and discuss its potential future scope.

2 Literature survey

In this section, we present a study of the state-of-the-art literature on different car-following models. There can be various reasons for vehicle accidents, a few of which are the driver’s health or attention, glare, panic, weather and road conditions, traffic, speeding, and drunken driving. Of these, this paper mainly focuses on accidents caused by adverse weather and road conditions. In our previous work [22], we presented a thorough study of 4 different types of car-following models, namely vehicle-following models based on (a) sensors [23], (b) weather and road conditions [24], (c) networking [25], and (d) ML algorithms [26]. Here, we provide an updated literature study on car-following models based on recent advancements.

2.1 IoT based car-following models

Car-following models observe the driver’s behavior and the environment to suggest a safe speed relative to the preceding vehicle in order to avoid accidents. Recent advancements in the field of electronics have greatly benefited car-following models. The Internet of Things (IoT) has emerged as one of the important technologies behind various technical advancements [27]. It comprises various sensors and microcontrollers that continuously collect data and perform operations in various environments. An IoT based car-following model is a system equipped with IoT sensors that periodically collects real-time data to make better decisions to avoid accidents. A number of studies have been presented recently. Singh et al. [28] use a dashboard camera and a combination of IoT sensors to improve the controlled-following capability. The camera can acquire live traffic or accident recordings, which can then be transferred to people nearby, a health center, a police station, family members, an insurance provider, etc., along with the exact position and well-being, through a mobile network or Wi-Fi [29]. The study by Nyamati [30] is based on a PIR sensor that can detect objects or people in front of or beside the vehicle while it is in motion. The distance between the car body and any surrounding object can be calculated using the ultrasonic sensor or PIR sensor. The sensors raise an alarm if anything comes too close to the car. If the driver is still unable to apply the brakes in time after this alarm, the microprocessor in the car automatically applies the brakes and the car stops 5 m away from the object. Such a system is good for distracted drivers or people who cannot respond quickly to changes. Eshghi and Schmidtke [31] have proposed a navigation system for vehicles with reduced time complexity. The proposed system is based on geosensor networks to identify a safe path during hurricane damage. This helps the car reach a safe destination. Kumar et al. [32] have proposed a virtual IoT and intelligent model for predicting driver intention and avoiding traffic congestion. The model utilizes the twin approach and blockchain technologies to make vehicles more sophisticated and smarter. This makes the transport system intelligent enough to avoid congestion and accidents.

2.2 Car-following models based on machine learning and artificial intelligence

Artificial Intelligence (AI) based car-following models use predicates and logic to predict various aspects in order to avoid accidents. Machine Learning (ML) based systems use various algorithms to predict the safe distance from multiple inputs. AI or ML integrated with sensors enables vehicles to navigate through traffic and to make complex instant decisions. The concept of smart cars and autonomous cars is wholly based on AI and ML. The ML algorithms used in the following research papers are RF [33], CCF [34], BPNN [35], and NGSIM. Chen et al. [36] have proposed a rear-end collision avoidance system that uses a probabilistic model named CGPN, concentrating on the chances of accidents using the Internet of Vehicles. BPNN evaluates this probability with V2I and V2V transmission support, for which the training data file is generated from VISSIM [29, 30].

Czubenko et al. [37] discussed Google Car movements and automation. The Google car is designed to manoeuvre similarly to a human driver. Still, these vehicles do not produce safe motions in every condition. To solve this problem, they combined ML models and kinematics-based models to improve coherence and create a powerful machine, known as the CCF model, also known as the Gipps model [9]. The resulting models are the Gipps-RF and the Gipps-BPNN models, which use the RF and BPNN models, respectively. These models make vehicle-following more powerful and coherent. Both are stronger than others in stabilizing traffic flow, reducing crowding, and preventing accidents.

Guo et al. [38] worked to enhance the security of the smart control technique by proposing a system that provides a pragmatic means to find the target in advance. From the results of naturalistic on-road experiments, a vehicle-following technique based on the tracking capability of nearby vehicles is developed. The desired car-following model is improved by the proposed merging prediction model. The improved model supports vehicle maneuvering, ride comfort, and enhanced safety. A correlation matrix is generated for refining smart-car control and enhancing the precision of intelligent machines.

Alanezi et al. [39] proposed notifying the driver of a following vehicle by installing an LCD at the back of the car, operated by a touch-screen or voice recognition device, which can show alerts such as low visibility, flooded road, damaged road, crowd, flat tire, and maintain space. This system is cheap, as it requires only a screen to display the caution signs. Hjelkrem and Ryeng [40] have shown how light, rain, and bad roadways affect the awareness of the person driving the vehicle. In a vehicle-following situation, CRI describes the risk for drivers. This study used a dataset that contains more than 71,255 rows of data on vehicle driving habits and environment updates. The way of driving changes according to the road and weather conditions. This means that the probability of an accident is higher on snow-filled roads and in medium rain, whereas drivers of bigger vehicles can struggle on sleet-filled roads but not on mildly flooded roads. Their study also provides data collected on Norwegian roadways, comprising around 311,908 vehicle observations with 21 unique parameters related to climate and roadways. This data was collected over a few years so that every possible climate and roadway situation can be considered, ranging from dry summer to adverse winter and from snowy roads to heavy flooding rainfall [21]. Such a large amount of data is essential to create a smart system that can identify almost all possible environmental conditions and then help identify the safe conditions and speed so that an accident does not happen. Table 1 shows the 17 parameters (carload, speed, length, climate, roadways, heat, etc.) that are important in the research to develop a vehicle-following model.

Cheng et al. [41] use deep learning to predict road surface conditions through image classification and recognition on an image dataset captured by a camera placed at the front of driverless vehicles. They classify five common types of road surface conditions, i.e., dry roads, wet roads, snowy roads, muddy roads, and other roads [42, 43]. The Gai-ReLU function is compared with TanH, SoftMax, ReLU, and leaky ReLU. The image recognition accuracy, network structure, and hyperparameters can be improved further. The study by Xu et al. [44] presents automobiles as the superior aspect in a prototype of vehicle-following behaviours, and its potency is shown on a broad set of vivid steering information.

Galanis et al. [45] analyzed roadway and climate conditions to find the RSI, based on which the system can automatically predict the speed limit to be maintained by the vehicle. Four weather conditions are considered in the research, i.e., dry, wet, snow, and ice, in relation to the drag on the track. However, the analysis does not account for the mass and momentum of the automobile, which affect the braking angle and can lead to accidents.

Zhai et al. [46] have studied the likelihood of vehicles crashing into pedestrians in different weather conditions. Their analysis shows that 17% of the 20,381 road accident-related deaths in 2015 were pedestrians. The adverse weather conditions responsible for pedestrian crashes that are considered in the paper include increasing variation in air temperature and precipitation, heatwaves, storms, and heavy rainfall. Their objective is to determine the relationship between the intensity of pedestrian crashes and roadway conditions, automobile conditions, pedestrian behaviour, and climate conditions. Their prediction is based on binary and mixed logistic regression models that calculate the crash intensity for probable risk factors and climatic conditions. The 1,513,792 accidents reported in the FARS [Fatality Analysis Reporting System] [NHTSA, 2016] are used to study the effect of climate on fatal crashes, which involve a huge number of automobiles [47]. The study concludes that drivers should reduce their speed in low visibility, on snowy roads, and in rainfall. Speed is a significant factor that causes accidents.

Singh and Singh [28] have proposed using a dashboard camera called Smart eye to enhance the accident tracking capability. The camera, along with IoT sensors, can gather real-time traffic information, convert it into text, audio, and video formats, and convey it to the nearby police station, hospitals, or family members using cellular networks or Wi-Fi. A similar study by Nyamati et al. [30] uses an ultrasonic sensor to identify any person or object near or around the car while driving. The vehicle can automatically apply the brakes using a motor driver if anyone is near the car or if the car is going to hit something. Other sensors that can be used in a vehicle along with the ultrasonic sensor are the safety belt sensor, radar sensor, motion sensor, gas/oil leakage sensor, impact sensor, door sensor, and parking sensor. Different ML algorithms can be used for such vehicle automation and accident prediction models, namely BPNN [35], NGSIM [48], CCF [34], RF [33], GLM [40], NN [36], and ACC [38]. In our proposed system, we experimented with a few efficient ML algorithms.

The literature study provides detailed information about the state-of-the-art techniques for accident avoidance and for alerting drivers. It is understood that car-following models use various sensors and devices to gather environmental conditions inside as well as outside the vehicle. Such techniques are crucial in reducing the chances of accidents to a great extent and in ensuring a safe driving experience. However, there are certain drawbacks and research gaps in the existing studies, such as the need for advanced techniques to learn the environment, accurate prediction models, and alert generation with minimal time delay. In this paper, we present a comparative analysis of various ML techniques that address the aforementioned drawbacks of the existing systems. We also propose an IoT based machine learning system to increase the accuracy of accident avoidance methods, mainly for accidents due to adverse road and weather conditions.

3 Proposed system

In this section, we present the system architecture of the proposed IoT based accident avoidance system, data collection methods, preprocessing techniques to clean the data, and various ML algorithms.

The proposed work combines IoT sensors, which obtain parameters from the vehicle as well as the environment, with an intelligent system comprising a machine learning algorithm that predicts accidents by analyzing the data collected by the IoT sensors. The sensor network used in the proposed model consists of ultrasonic sensors to calculate the distance to objects around the car, a Digital Temperature and Humidity (DHT) sensor to measure the current air temperature and humidity, a speedometer to get the vehicle speed, a dashboard camera to get the road surface status as well as the weather, a visibility sensor to measure the visibility on the road, and a rain sensor to measure the precipitation intensity. The architecture has five main steps, shown in Fig. 1: data collection, data pre-processing, data sampling, ML algorithms, and prediction.

Fig. 1 System architecture

The system architecture consists of data collection, data pre-processing, data sampling (based on Road Surface, Precipitation Type, and Time of Day), and the ML algorithm. In the ML algorithm step, we perform an experimental analysis of various ML techniques to identify the appropriate algorithm for the accident avoidance system, i.e., the one with the highest accuracy and minimal time delay.

3.1 Data collection

Data collection is the initial step of the proposed system. An intelligent car-following model should be equipped with various sensors to continuously monitor adverse weather and road conditions. In this paper, we suggest various sensors that could be used to collect real-time data from the environment. However, for simulation in this study we use a real-world dataset [21], which is a collection of vehicle observations under adverse weather and road conditions. The dataset consists of 311,908 observations classified by road surface, climatic conditions, weight of the vehicle, speed of the vehicle, time of day, and precipitation type.

3.2 Data preprocessing

Data preprocessing is an essential task that examines the dataset for missing values and noisy data, eliminates outliers, and normalizes the data. Tkachenko et al. [49] have proposed approaches to deal with missing values and noisy data in the IoT environment, and we adopted such approaches in our work. Missing values in the dataset are replaced with the average value of the corresponding parameter. Noisy data and outliers influence the accuracy of the system, so it is important to eliminate such values from the dataset. We identified noisy and outlier data through a calculated threshold value; parameter values that do not reach the threshold are removed from the dataset. Because the selected dataset had many missing values, noisy data, and attributes with varying scales, we also rescaled the attributes using the scikit-learn MinMaxScaler and applied a binarizing technique. Data normalization is another crucial task that enhances ML learning and accuracy. In our work, we use an adaptive normalization method [50] for time series data to normalize the data generated by the IoT sensors.
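The preprocessing steps described above can be sketched in plain Python as follows. This is a minimal illustration, not the code used in this work: the function names, the sample readings, and the fixed outlier bounds are hypothetical, and the scaling function only mirrors the formula that a min-max scaler with the default range [0, 1] applies.

```python
from statistics import mean


def impute_with_mean(values):
    """Replace missing entries (None) with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def drop_outliers(values, low, high):
    """Keep only values inside the closed interval [low, high]."""
    return [v for v in values if low <= v <= high]


def min_max_scale(values):
    """Rescale values linearly to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical air-temperature readings (deg C) with one gap and one spike.
raw = [2.0, None, 4.0, 6.0, 95.0]
filled = impute_with_mean(raw)                          # gap filled with the column mean
cleaned = drop_outliers(filled, low=-40.0, high=50.0)   # assumed plausibility bounds
scaled = min_max_scale(cleaned)                         # all values now lie in [0, 1]
```

In this sketch the mean is computed before the spike is removed, so the imputed value is pulled towards the outlier; removing outliers first is an equally reasonable ordering.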

3.3 Data sampling

Data sampling is required to divide the data into different categories so that the appropriate algorithm can be applied efficiently. In our proposed work, we divided the data into nine categories based on the parameters of road surface, precipitation type, and time of day. Figure 2 shows the correlation matrix of the dataset used in the proposed system. A correlation matrix shows the correlation coefficients among different sets of values in the dataset. A darker shade in the correlation matrix shows that the variables are strongly correlated, whereas a lighter shade shows that the variables are less strongly related to each other. The correlation matrix helps us to understand the dataset. The more the data are correlated, the better they can contribute to prediction accuracy. If the data are not correlated, preprocessing steps need to be performed to improve the dataset. In our proposed work, the data correlation is checked using a machine learning-based correlation technique [51]. We adopted this correlation technique to improve the prediction accuracy of the accident avoidance system.

Fig. 2 Correlation matrix
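As an illustration of how the entries of a correlation matrix can be obtained, the sketch below computes pairwise Pearson correlation coefficients. Pearson correlation is assumed here for illustration only; it is not necessarily the technique of [51], and the column names and values are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length, non-constant lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """Return a nested dict with the coefficient for every pair of columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical columns: speed drops as precipitation intensity rises.
data = {
    "speed": [80.0, 70.0, 60.0, 50.0],
    "precipitation": [0.0, 1.0, 2.0, 3.0],
}
matrix = correlation_matrix(data)
```

Values close to 1 or -1 correspond to the darker cells of such a matrix, and values close to 0 to the lighter ones.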

3.4 Algorithms

The classification algorithms that can be used for prediction include Logistic Regression [52], Naïve Bayes [53], K-Nearest Neighbors [54], Decision Tree [55], Random Forest [26], and Support Vector Machines (SVM) [56]. In this paper, we present a comparative analysis of 6 different ML algorithms, i.e., Logistic Regression, Decision Tree, SVM, Naïve Bayes, Random Forest, and Neural Network. These algorithms are contrasted to suggest the appropriate technique for the accident avoidance system. The algorithm with the best accuracy and the least loss is considered the best for the proposed idea. Applying these classification algorithms to the test dataset, we predict whether an accident can happen for the given values of vehicle speed, vehicle weight, weather, and road condition. Table 2 shows the various attributes of the dataset [21] that are collected during vehicle observation and linked to data on road surface and weather conditions.

Table 2 Attributes, range, and datatype of the dataset

The algorithms analyze the data to find the association between one or more independent variables and one dependent binary variable. The output is a probability between 0 and 1 that indicates whether an accident will happen in the given road and weather conditions. A detailed explanation of the ML algorithms and their outputs is given in the results section.

4 Experimental setup

This section describes the basic requirement for the proposed system. The details of dataset, IoT sensors, and system requirements are presented in this section.

4.1 Dataset

In our work, we used a dataset collected by Hjelkrem and Ryeng [21] on a road in Norway around 10 km from Oslo. The data was recorded between March 21st 2012 and April 30th 2014. Since this dataset was collected in a particular region and not around the world, the results of the proposed system are confined to this region. Table 2 shows the parameters that are considered in the proposed system.

Table 3 shows the different categories that are sampled from the dataset. The dataset is divided into nine categories based on different weather and road conditions and on the time of day. The precipitation types are clear, rain, and snow; the road surface conditions are dry, wet, and slippery; and the times of day are twilight, day, and night.

Table 3 Data divided into categories
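A minimal sketch of how observations could be split by condition is given below. The record layout and field names are hypothetical, and the sketch simply groups by every observed combination of the three parameters; it does not reproduce the specific nine categories of Table 3.

```python
from collections import defaultdict


def split_by_condition(records, keys=("precipitation", "road_surface", "time_of_day")):
    """Group records by the tuple of values they hold for the given keys."""
    groups = defaultdict(list)
    for record in records:
        groups[tuple(record[k] for k in keys)].append(record)
    return dict(groups)


# Hypothetical observations; the field names are assumptions for illustration.
observations = [
    {"precipitation": "rain", "road_surface": "wet", "time_of_day": "night", "speed": 62.0},
    {"precipitation": "rain", "road_surface": "wet", "time_of_day": "night", "speed": 58.0},
    {"precipitation": "clear", "road_surface": "dry", "time_of_day": "day", "speed": 79.0},
]
categories = split_by_condition(observations)
```

Each group can then be passed to a classifier separately, or the group key can be kept as a categorical feature.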

4.2 Sensors

In an actual implementation of the proposed system, the sensors that can be used are ultrasonic sensors to calculate the distance to objects around the car, a DHT sensor to measure the current air temperature and humidity, a speedometer to get the vehicle speed, a dashboard camera to get the road surface status as well as the weather, a visibility sensor to measure the visibility on the road, and a rain sensor to measure the precipitation intensity. However, in this paper we used a publicly available real-world dataset and did not collect data in real time. Nevertheless, we identified and discussed the sensors and actuators that are required to collect real-time data.

5 Results and discussions

This section discusses the result of various machine learning algorithms for the proposed model and the performance is evaluated through confusion matrix, precision, recall, F1-score, and support metrics.
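For reference, the metrics named above can be derived from the four cells of a binary confusion matrix as in the following sketch. The function name is illustrative and the counts in the example are hypothetical; they are not taken from the experiments reported in this section.

```python
def classification_metrics(tn, fp, fn, tp):
    """Compute accuracy, precision, recall, F1-score, and support for the positive class."""
    total = tn + fp + fn + tp
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "support": tp + fn,  # number of actual positive samples
    }


# Hypothetical counts, used only to show the calculation.
metrics = classification_metrics(tn=40, fp=10, fn=5, tp=45)
```

With a strongly imbalanced test set, accuracy alone can look high even when the minority class is rarely predicted correctly, which is why precision, recall, and F1-score are reported alongside it.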

5.1 Logistic regression

Logistic regression is a statistical model that in its basic form uses a logistic function to model a binary dependent variable although many more complex extensions exist. Logistic regression indicates the result in a value between 0 and 1.

$$ P = \frac{{e^{a + bx} }}{{1 + e^{a + bx} }} $$
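The logistic function P = e^(a + bx) / (1 + e^(a + bx)) can be evaluated as in the sketch below for a single input x. The coefficient values in the example are hypothetical; in practice a and b are estimated from the training data.

```python
from math import exp


def logistic_probability(x, a, b):
    """Return P = e^(a + b*x) / (1 + e^(a + b*x)) in a numerically stable way."""
    z = a + b * x
    if z >= 0:
        # Equivalent form that avoids overflow for large positive z.
        return 1.0 / (1.0 + exp(-z))
    ez = exp(z)
    return ez / (1.0 + ez)


# Hypothetical intercept and slope, for illustration only.
p = logistic_probability(x=2.0, a=-1.0, b=0.5)   # here a + b*x = 0, so p = 0.5
predicted_accident = p >= 0.5                    # thresholding at 0.5 gives the class
```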

Accuracy of Logistic Regression model is 0.9078 i.e., 90.78%.

Confusion matrix for the Logistic Regression model is shown just below.

| N = 1410 | Predicted: No | Predicted: Yes | Total |
| --- | --- | --- | --- |
| Actual: No | TN = 1 | FP = 55 | 56 |
| Actual: Yes | FN = 3 | TP = 1352 | 1355 |
| Total | 4 | 1407 | |

In the confusion matrix above, 1315 is the number of true positives, 1 is the number of true negatives, 55 is the number of false positives, and 3 is the number of false negatives. Table 4 shows the performance metrics evaluation of logistic regression.

Table 4 Classification of logistic regression

5.2 Decision tree

A decision tree algorithm is used here for a binary classification problem with only two classes, positive and negative [34]. It is a way to represent an algorithm that contains only conditional control statements. Decision trees are commonly used in operations research, specifically in decision analysis, to help identify the strategy most likely to reach a goal. The formula for decision tree entropy is given below.

$$ E\left( {T,x} \right) = \mathop \sum \limits_{C \in x} P\left( C \right)E\left( C \right) $$
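The weighted entropy E(T, x) after splitting on a feature x can be computed as in the following sketch, where P(C) is the fraction of samples taking value C and E(C) is the entropy of the class labels within that subset. The feature values and labels in the example are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((count / n) * log2(count / n) for count in Counter(labels).values())


def split_entropy(feature_values, labels):
    """Weighted entropy of the labels after partitioning by feature value."""
    n = len(labels)
    subsets = {}
    for value, label in zip(feature_values, labels):
        subsets.setdefault(value, []).append(label)
    return sum(len(subset) / n * entropy(subset) for subset in subsets.values())


# Hypothetical samples: road surface as the feature, accident (1) or not (0) as the label.
road = ["wet", "wet", "dry", "dry"]
accident = [1, 1, 0, 0]
before = entropy(accident)             # labels are evenly mixed before the split
after = split_entropy(road, accident)  # each subset is pure after the split
gain = before - after                  # information gain of splitting on road surface
```

A decision tree learner that uses entropy as its criterion picks, at each node, the split with the largest such gain.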

Accuracy of Decision tree model is 0.8617, i.e., 86.17%.

Confusion matrix for Decision tree is shown just below.

| N = 1410 | Predicted: No | Predicted: Yes | Total |
| --- | --- | --- | --- |
| Actual: No | TN = 17 | FP = 39 | 56 |
| Actual: Yes | FN = 55 | TP = 1263 | 1318 |
| Total | 72 | 1302 | |

In the confusion matrix above, 1263 is the number of true positives, 17 is the number of true negatives, 39 is the number of false positives, and 55 is the number of false negatives. Table 5 shows the performance metrics evaluation of the decision tree.

Table 5 Classification of decision tree

5.3 Support vector machine (SVM)

SVM is used for regression and classification problems. It uses a kernel technique to find a suitable boundary between the given inputs [43]; e.g., when the data is not linearly separable, it maps the data into a high-dimensional feature space. SVM can thus efficiently perform non-linear classification using the kernel trick, implicitly mapping its inputs into a high-dimensional feature space. We have implemented the SVM algorithm using the Gaussian kernel [57]. The equation is given below.

$$ k\left( {x,y} \right) = \exp \left( { - \frac{{\left\| {x - y} \right\|^{2} }}{{2\sigma^{2} }}} \right) $$
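A minimal sketch of the Gaussian kernel, i.e., the exponential of the negative squared Euclidean distance between x and y divided by 2σ², is shown below. The bandwidth value and the sample vectors are hypothetical.

```python
from math import exp


def gaussian_kernel(x, y, sigma=1.0):
    """Return exp(-||x - y||^2 / (2 * sigma^2)) for two equal-length vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return exp(-squared_distance / (2.0 * sigma ** 2))


# Hypothetical scaled feature vectors, e.g. (speed, precipitation intensity).
similar = gaussian_kernel([0.2, 0.1], [0.2, 0.1])   # identical points give 1.0
distant = gaussian_kernel([0.0, 0.0], [1.0, 1.0])   # squared distance 2 gives exp(-1)
```

The kernel value decays from 1 towards 0 as the two points move apart, and a smaller σ makes this decay faster, which yields a more flexible decision boundary.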

Accuracy of Support Vector machine is 0.901, i.e., 90.1%.

Confusion matrix for SVM is shown just below.

| N = 1410 | Predicted: No | Predicted: Yes | Total |
| --- | --- | --- | --- |
| Actual: No | TN = 19 | FP = 37 | 56 |
| Actual: Yes | FN = 12 | TP = 1306 | 1318 |
| Total | 31 | 1343 | |

In the confusion matrix above, 1306 is the number of true positives, 19 is the number of true negatives, 37 is the number of false positives, and 12 is the number of false negatives. Table 6 shows the performance metrics evaluation of the support vector machine.

Table 6 Classification of SVM

5.4 Random forest (RF)

RF is a technique for regression and classification that operates by constructing a group of decision trees [18]. Random forests correct for the decision tree’s habit of overfitting to the training set. The method is based on a random selection of features to construct a collection of decision trees with controlled variance.

$$ normfi_{i} = \frac{{fi_{i} }}{{\sum\nolimits_{j \in \,all\,features} {fi_{j} } }} $$
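Reading the expression above as a normalized feature importance, where each raw importance is divided by the sum of the raw importances over all features, the calculation can be sketched as follows. This reading, the feature names, and the raw values are assumptions made for illustration.

```python
def normalize_importances(raw_importances):
    """Divide each raw importance by the total so that the results sum to 1."""
    total = sum(raw_importances.values())
    if total == 0:
        raise ValueError("at least one feature must have non-zero importance")
    return {name: value / total for name, value in raw_importances.items()}


# Hypothetical raw importances for two features.
raw = {"speed": 3.0, "road_surface": 1.0}
normalized = normalize_importances(raw)   # speed: 0.75, road_surface: 0.25
```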

Accuracy of Random Forest is 0.913, i.e., 91.3%.

Confusion matrix for the RF model is shown just below.

| N = 1410 | Predicted: No | Predicted: Yes | Total |
| --- | --- | --- | --- |
| Actual: No | TN = 8 | FP = 48 | 56 |
| Actual: Yes | FN = 7 | TP = 1311 | 1318 |
| Total | 15 | 1359 | |

In the confusion matrix above, 1311 is the number of true positives, 8 is the number of true negatives, 48 is the number of false positives, and 7 is the number of false negatives. Table 7 shows the performance metrics evaluation of random forest.

Table 7 Classification of random forest

5.5 Naïve Bayes

In ML, Naïve Bayes belongs to the family of simple ‘probabilistic classifiers’ based on Bayes’ theorem with strong independence assumptions among the parameters. Naïve Bayes classifiers are highly scalable, requiring a number of parameters linear in the number of features in a learning problem.
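As a brief illustration of the independence assumption, the standard Naïve Bayes decision rule can be written as follows, where y denotes the class (accident or no accident) and x_1, ..., x_n denote the feature values. These symbols are introduced here for illustration and are not part of the notation used elsewhere in this paper.

$$ \hat{y} = \mathop {\arg \max }\limits_{y} \; P\left( y \right)\prod\limits_{i = 1}^{n} {P\left( {x_{i} \mid y} \right)} $$

The product form follows from treating the features as conditionally independent given the class.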

Accuracy of our Naïve Bayes model is 0.9278, i.e., 92.78%.

Confusion Matrix for Naïve Bayes is shown just below.

| N = 1410 | Predicted: No | Predicted: Yes | Total |
| --- | --- | --- | --- |
| Actual: No | TN = 14 | FP = 42 | 56 |
| Actual: Yes | FN = 12 | TP = 1306 | 1318 |
| Total | 26 | 1348 | |

In the confusion matrix above, 1306 is the number of true positives, 14 is the number of true negatives, 42 is the number of false positives, and 12 is the number of false negatives. Table 8 shows the performance metrics evaluation of Naïve Bayes.

Table 8 Classification report of Naïve bayes

5.6 Dataset and algorithm analysis

The graphical representation of the dataset and the analysis of the conditions in which most accidents happen are presented in Figs. 3 and 4. Figure 3 compares the number of accidents across three times of day, i.e., day, twilight, and night. According to the available dataset, the rate of accidents is higher in the daytime than at twilight or at night.

Fig. 3 Time of day vs. accident

Fig. 4 Road surface vs. accident

Figure 4 indicates the probability of accidents in two different road conditions, i.e., dry and wet. According to the available dataset, the rate of accidents is higher when the roads are wet than when they are dry.

Figure 5 shows the accuracy of the various ML algorithms used in the proposed system, comparing 5 different techniques with the type of technique on the X-axis and accuracy on the Y-axis. From the tables and Fig. 5, we can conclude that among the ML algorithms, Naïve Bayes has the highest accuracy in predicting the rate of accidents, at 92.78%. In contrast, the accuracy of the decision tree is the lowest.

Fig. 5 Accuracy of ML algorithms

6 Experimental analysis

In this section, we further analyze the accident avoidance system, which uses machine learning techniques to analyze adverse weather and road conditions. Figures 6 and 7 show the experimental analysis of the Decision Tree and Logistic Regression algorithms on time and road surface classification. Each graph is divided into two regions, which correspond to Time of Day and Road Surface. The red and green dots indicate accidents. Dots lying in the region of the opposite color are outliers, i.e., wrongly predicted accidents, whereas dots lying in the region of the same color are correctly predicted accidents.

Fig. 6 a Decision tree classification between time of day and road surface (Training set). b Decision tree classification between time of day and road surface (Test set)

Fig. 7 a Logistic regression classification between time of day and road surface (Training set). b Logistic regression classification between time of day and road surface (Test set)

Figure 6a shows the decision tree classification of time of day vs. road surface on the training set, and Fig. 6b shows the same classification on the test set. The graphs in Fig. 6a, b quantify the relationship between time of day and road surface with respect to the rate of accidents in the two conditions using the Decision Tree.

Figure 7a shows the logistic regression classification of time of day vs. road surface on the training set, and Fig. 7b shows the same classification on the test set. The graphs in Fig. 7a, b quantify the relationship between time of day and road surface with respect to the rate of accidents in the two conditions using logistic regression. The red dots in the green areas represent wrong predictions of the algorithm.

7 Performance evaluation

In this section, the performance of the proposed ML algorithms and the Neural Network model is compared with existing state-of-the-art techniques.

7.1 Performance evaluation of proposed ML techniques

First, we compare our model with an existing model proposed by Yuan et al. [58], who carried out similar research on accidents related to adverse weather and road conditions using the same dataset. Their study used three algorithms: RF, SVM, and DT. The performance metrics used to compare the existing and proposed models are Precision, Recall, F1-Score, and Accuracy. Table 9 shows the performance metrics of the existing system.

Table 9 Performance of the existing system

Table 10 shows the performance of our proposed model in terms of precision, recall, F1-score, and accuracy. Comparing the performance of the existing and proposed systems, we can conclude that the proposed system performs better than the state-of-the-art study.

Table 10 Performance of the proposed system

Figure 8 shows a side-by-side comparison of the accuracy of the proposed and existing systems. The darker shade represents the accuracy of the proposed modified ML algorithms, whereas the lighter shade represents the accuracy of the existing algorithms. The graph shows that the proposed ML algorithms yield higher accuracy than those implemented in the existing system (Fig. 9).

Fig. 8 Comparison of accuracy of proposed and existing algorithms

Fig. 9 Accuracy of Neural Network

7.2 Performance evaluation of Neural Network model

We have also evaluated the proposed model using a Neural Network. Here we use the kernel initializer 'Uniform', the activation function 'ReLU', and the optimizer 'Adam'. Three-quarters of the dataset is used for training and the remaining quarter for testing.
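The 3:1 partition of the data can be sketched as follows. This is an illustrative helper only; the function name, the fixed random seed, and the use of integer placeholders for samples are assumptions made for the example, and the split procedure actually used in the experiments is not described here.

```python
import random


def train_test_split(samples, test_fraction=0.25, seed=0):
    """Shuffle the samples and hold out `test_fraction` of them for testing."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# 1000 placeholder samples -> 750 for training, 250 for testing
train, test = train_test_split(range(1000))
```

Shuffling before splitting avoids a biased test set when the records are stored in a meaningful order, for example by date.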

Figure 10 shows the decrease in the loss of the proposed model implemented using the Neural Network. In the initial epochs the loss is around 0.85, but as the number of epochs increases the loss drops to 0.1. Figure 9 shows that in the initial epochs the accuracy is around 82%, but as the number of epochs increases the accuracy reaches 97%.

Fig. 10 Loss in Neural Network

7.3 Simulation of accident avoidance system

The proposed system is simulated using a mobile application called Blynk [59]. Blynk allows us to quickly build interfaces for monitoring and controlling hardware models from iOS and Android devices. Using the Blynk app we can control the Arduino Uno, Arduino Mini, NodeMCU (ESP8266), Raspberry Pi, and almost all similar electronics boards. In the proposed plan we use Blynk to send a message that is displayed on the back of the car.

Figure 11 shows a screenshot of the Blynk application interface, which can create customizable messages to be displayed on the screen at the back of the car to alert the drivers following it. The default messages include 'Changing lane', 'Overtake', 'Traffic jam ahead', 'Reverse', 'Bad weather', 'Keep distance', 'Engine Failure', and 'Slippery Road'. The messages can be customized as required, and more options can be added whenever needed, such as 'Bad road condition', 'Road Flooded', etc.
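The set of default and user-defined alerts can be modeled as a small extensible catalog. The sketch below is independent of Blynk and does not use its API; the class name `AlertCatalog` and its methods are hypothetical and only illustrate how new options could be appended to the default list.

```python
DEFAULT_MESSAGES = [
    "Changing lane", "Overtake", "Traffic jam ahead", "Reverse",
    "Bad weather", "Keep distance", "Engine Failure", "Slippery Road",
]


class AlertCatalog:
    """Holds the alert messages that can be shown on the rear screen."""

    def __init__(self, messages=DEFAULT_MESSAGES):
        self._messages = list(messages)  # copy so the defaults stay intact

    def add(self, message):
        """Add a custom message, ignoring duplicates."""
        message = message.strip()
        if not message:
            raise ValueError("alert message must not be empty")
        if message not in self._messages:
            self._messages.append(message)

    def select(self, index):
        """Return the message chosen by its position in the menu."""
        return self._messages[index]

    def all(self):
        return list(self._messages)


catalog = AlertCatalog()
catalog.add("Bad road condition")
catalog.add("Road Flooded")
```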

Fig. 11 For simulation we are using the Blynk app to send data from mobile to the system

Figure 12 shows how the message is displayed on the screen. For the prototype, we use a 2.0-inch SPI TFT LCD color screen module to post the alert, whereas in a real implementation a bigger LED screen can be fixed behind the car. Ateeq et al. [39] have proposed a similar idea of placing an LED screen behind the car so that the following vehicles get an immediate alert about the conditions in front of that particular car. In the existing system, the given options are limited and cannot be customized in real time; by using the Blynk application, as described in the proposed method, alerts can be customized whenever needed.

Fig. 12 Message displayed on the screen

The proposed system also includes a sensor network for parking assistance, for avoiding close impacts, and for sending real-time alerts to the following cars. The IoT components required in the proposed system are an ultrasonic sensor, a NodeMCU, an LED, and a crash sensor. Figure 13 shows an instance of the Blynk application, which is used to connect IoT devices so that they can be controlled from a mobile device. In the proposed plan, we show a demo of how it works. Nowadays, most cars come with built-in dashboard screens, which provide many operational controls within the vehicle. We propose to place a set of 8–16 ultrasonic sensors around the car, depending on its dimensions, which will be activated when the driver is driving through congested places and while parking the vehicle. A similar image can be displayed on the dashboard screen as shown in Fig. 12. To display the distances read by the ultrasonic sensors, we use the Blynk application connected to the NodeMCU board, which transfers data over Wi-Fi and displays the sensor readings according to their positions around the vehicle. NodeMCU supports eight input channels, so if more than eight sensors need to be installed around the car, we need to use either more than one NodeMCU board or an Arduino Mega with a Wi-Fi board.
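The board count implied by the eight-channel limit can be worked out with a short calculation. The sketch below takes the figure of eight input channels per board from the text; the function names and the sensor position labels are hypothetical and serve only to illustrate how sensors could be distributed across boards.

```python
import math

CHANNELS_PER_BOARD = 8  # input channels per NodeMCU, as stated in the text


def boards_needed(sensor_count, channels_per_board=CHANNELS_PER_BOARD):
    """Smallest number of boards that can host all sensors."""
    if sensor_count < 0:
        raise ValueError("sensor_count must be non-negative")
    return math.ceil(sensor_count / channels_per_board)


def assign_sensors(positions, channels_per_board=CHANNELS_PER_BOARD):
    """Map each sensor position to a (board index, channel index) pair."""
    return {
        position: divmod(i, channels_per_board)
        for i, position in enumerate(positions)
    }


# Hypothetical layout with 10 sensors around the car
positions = [f"sensor_{n}" for n in range(10)]
layout = assign_sensors(positions)
required = boards_needed(len(positions))
```

With this layout, the ninth and tenth sensors fall on the second board, which matches the need for more than one board once the sensor count exceeds eight.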

Fig. 13 An instance of the Blynk application

8 Conclusion

Road accidents are increasing every year, and 30% of them are due to bad weather and adverse road conditions. Though road accidents cannot be prevented entirely, their odds can be decreased. An intelligent transportation system can help mitigate accidents on the road through its interconnected sensors and actuators. Car-following models, which are a part of the intelligent transportation system, can help reduce the probability of accidents to a great extent. A combination of AI and ML will contribute to understanding the parameters that are responsible for accidents and help to eliminate them. The proposed system aims to decrease the rate of accidents caused by harsh weather and road conditions. It is capable of alerting the driver not to exceed a particular speed limit in different road and weather conditions. It is also able to analyze the given input data on road and weather conditions and predict the chances of an accident. The Naïve Bayes classifier gives faster and better results than the other ML algorithms, whereas the decision tree is not as efficient as the other algorithms. The side-by-side comparison of the proposed system with the existing system given in the paper shows an increase in the accuracy and efficiency of the algorithms. The accuracy and versatility of the system can be improved by incorporating more extensive weather and road data from around the globe. Finally, we can say that this system will help to make vehicles safer.

9 Future scope

The machine learning algorithms mentioned in the paper are applied to a dataset collected from a particular region. In the future, we plan to collect data from different places, which can help in determining the probability of accidents under conditions that we have not yet taken into consideration. Algorithms other than those mentioned in the proposed plan can also be used. This plan can also be combined with autonomous cars in the future.