Classifying collisions in road accidents using XGBOOST, CATBOOST and SALP SWARM based optimization algorithms

Multimedia Tools and Applications

Abstract

Traffic accidents are the leading cause of death and injury in many developed nations, and any road user can be involved in an accident at any time. The type of collision also plays a role in determining who is accountable for an accident. Classifying collisions in road accidents can therefore pave the way for safer roads and reduced accident rates. This article proposes a novel approach for classifying the type of collision that might take place between vehicles and nearby pedestrians, obstacles, etc. on roads. Six hybrid classifiers are introduced: “XGBoost classifier using ISSA”, “XGBoost classifier using ESSA”, “XGBoost classifier using TVBSSA”, “CatBoost classifier using ISSA”, “CatBoost classifier using ESSA”, and “CatBoost classifier using TVBSSA”. The SWITRS dataset is used to classify “Type_of_Collision”, with a total of 103,000 accidents considered. Classification is performed with the XGBoost and CatBoost algorithms, and three nature-inspired algorithms (NIAs) are used at the feature selection stage: the Improved Salp Swarm Algorithm (ISSA), the Enhanced Salp Swarm Algorithm (ESSA), and the Time-Varying Binary Salp Swarm Algorithm (TVBSSA). It is concluded that the XGBoost classifier using ISSA presents good stability with fewer hyper-parameters and the highest accuracy under different levels of training data volume. The accuracy, mean squared error, and ROC-AUC of XGBoost using ISSA are 90.40, 0.1624, and 97.75, respectively. Moreover, the confusion matrix and evaluation metrics of the XGBoost classifier using ISSA were better than those of the other two approaches. The findings of this study would be helpful in classifying the “type of collision”. They are highly significant for smart city projects that aim to establish timely, proactive strategies and improve road traffic safety.
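To make the feature selection stage described in the abstract more concrete, the following is a minimal sketch of wrapper-style feature selection with a basic binary Salp Swarm Algorithm. It is not the authors' ISSA, ESSA, or TVBSSA implementation, and it does not use XGBoost, CatBoost, or the SWITRS data. The sigmoid transfer function, the weighted fitness (error rate plus a small penalty on the number of selected features), the nearest-centroid stand-in classifier, the synthetic data, and all names and parameter values (`binary_ssa`, `make_toy_fitness`, `alpha`, the bounds) are assumptions chosen only for illustration.

```python
import math
import random


def sigmoid(x):
    # Clamp the input so math.exp cannot overflow.
    x = max(-50.0, min(50.0, x))
    return 1.0 / (1.0 + math.exp(-x))


def binary_ssa(fitness, n_features, n_salps=10, n_iter=30, lb=-6.0, ub=6.0, seed=0):
    """Minimise fitness(mask) over 0/1 feature masks with a basic binary SSA."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lb, ub) for _ in range(n_features)] for _ in range(n_salps)]

    def to_mask(x):
        # Sigmoid transfer: a larger coordinate makes selecting the feature more likely.
        mask = [1 if rng.random() < sigmoid(v) else 0 for v in x]
        if not any(mask):  # keep at least one feature selected
            mask[rng.randrange(n_features)] = 1
        return mask

    best_mask, best_fit, food = None, float("inf"), None
    for t in range(1, n_iter + 1):
        # Evaluate every salp; the best position seen so far is the food source.
        for x in pos:
            mask = to_mask(x)
            f = fitness(mask)
            if f < best_fit:
                best_mask, best_fit, food = mask, f, list(x)
        # c1 decays over the iterations, shifting from exploration to exploitation.
        c1 = 2.0 * math.exp(-((4.0 * t / n_iter) ** 2))
        for i in range(n_salps):
            for j in range(n_features):
                if i == 0:
                    # The leader moves around the food source.
                    step = c1 * ((ub - lb) * rng.random() + lb)
                    pos[i][j] = food[j] + step if rng.random() >= 0.5 else food[j] - step
                else:
                    # Each follower moves halfway toward the salp ahead of it.
                    pos[i][j] = 0.5 * (pos[i][j] + pos[i - 1][j])
                pos[i][j] = max(lb, min(ub, pos[i][j]))
    return best_mask, best_fit


def make_toy_fitness(n_features=8, informative=(0, 1, 2), n_samples=200, alpha=0.99, seed=1):
    """Build a wrapper fitness on synthetic two-class data (illustrative stand-in)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        label = rng.randrange(2)
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:  # only these columns carry class information
            row[j] += 2.0 * label
        X.append(row)
        y.append(label)
    half = n_samples // 2
    train, test = range(half), range(half, n_samples)

    def fitness(mask):
        cols = [j for j, m in enumerate(mask) if m]
        # Nearest-centroid classifier restricted to the selected columns.
        cents = {}
        for c in (0, 1):
            rows = [X[i] for i in train if y[i] == c]
            cents[c] = [sum(r[j] for r in rows) / len(rows) for j in cols]
        errors = 0
        for i in test:
            dist = {c: sum((X[i][j] - cents[c][k]) ** 2 for k, j in enumerate(cols))
                    for c in (0, 1)}
            errors += min(dist, key=dist.get) != y[i]
        # Weighted sum of error rate and the fraction of features kept.
        return alpha * errors / len(test) + (1 - alpha) * len(cols) / n_features

    return fitness


toy_fitness = make_toy_fitness()
selected, score = binary_ssa(toy_fitness, n_features=8)
print("selected mask:", selected, "fitness:", round(score, 4))
```

In a setting closer to the one the abstract describes, the `fitness` callable would instead train a gradient-boosted classifier on the selected columns and return its validation error. The search loop would stay the same, since it only needs a function that maps a 0/1 mask to a score to minimise.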

Availability of data and material

Not Applicable

Code Availability

The Python code is available from the corresponding author upon request.

Funding

Not Applicable

Author information

Contributions

The first author, Dr. Insha Altaf, carried out the implementation and wrote the article. The second author, Dr. Ajay Kaul, contributed to the idea and framing of the article.

Corresponding author

Correspondence to Insha Altaf.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflicts of interest or competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Altaf, I., Kaul, A. Classifying collisions in road accidents using XGBOOST, CATBOOST and SALP SWARM based optimization algorithms. Multimed Tools Appl 83, 38387–38410 (2024). https://doi.org/10.1007/s11042-023-16969-4

  • DOI: https://doi.org/10.1007/s11042-023-16969-4
