Comprehensive evaluation of multiple machine learning classifiers for predicting freeway incident duration

  • State-of-the-Art Paper
  • Innovative Infrastructure Solutions

Abstract

This study compares the accuracy and complexity of eleven machine learning classifiers for incident duration (ID) prediction. The proposed framework integrates feature selection and modeling techniques to evaluate the effect of multiple influencing factors and to choose the best model for predicting incident durations. Models were developed and tested on more than 110,000 incident records collected from the Houston TranStar incident archive. Features were selected by integrating the results of information gain, correlation-based, and relief-based evaluators. The fine-tuned classifiers were compared on multiple accuracy measures (precision, recall, F1 score, and AUC) and complexity measures (memory storage, training time, and testing time). Among the developed models, support vector machines (SVM), K-Nearest Neighbors (K-NN), and Gaussian process classification outperformed the other classifiers, each with a prediction accuracy of 97%. The Decision Tree classifier recorded the lowest performance, with a prediction accuracy of 82%. Considering the trade-off between accuracy and complexity, K-NN combined high accuracy with the lowest training time: 97% accuracy, 0.024 s of training time, 0.042 s of testing time, and 0.04 megabytes of memory storage. The SVM achieved the same 97% accuracy while consuming much less memory (0.004 megabytes) and a shorter testing time (0.01 s). Although K-NN recorded the lowest training time, the SVM can therefore be considered the best model for the ID-prediction classification problem.
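
The comparison described in the abstract, accuracy measures alongside complexity measures, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: `TinyKNN`, `macro_scores`, and `evaluate` are hypothetical names, the toy features and labels are invented, the size of the pickled model state stands in for memory storage, and AUC is omitted for brevity.

```python
import pickle
import time
from collections import Counter


class TinyKNN:
    """Minimal k-nearest-neighbors classifier (illustrative stand-in only)."""

    def __init__(self, k=3):
        self.k = k
        self.X, self.y = [], []

    def fit(self, X, y):
        # k-NN is a lazy learner: "training" only stores the data.
        self.X, self.y = list(X), list(y)
        return self

    def predict(self, X):
        preds = []
        for q in X:
            # Rank training points by squared Euclidean distance to the query.
            nearest = sorted(
                range(len(self.X)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(self.X[i], q)),
            )[: self.k]
            votes = Counter(self.y[i] for i in nearest)
            preds.append(votes.most_common(1)[0][0])
        return preds


def macro_scores(y_true, y_pred):
    """Return macro-averaged (precision, recall, F1) over all classes."""
    labels = sorted(set(y_true) | set(y_pred))
    p_sum = r_sum = f_sum = 0.0
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        p_sum, r_sum, f_sum = p_sum + prec, r_sum + rec, f_sum + f1
    n = len(labels)
    return p_sum / n, r_sum / n, f_sum / n


def evaluate(model, X_train, y_train, X_test, y_test):
    """Collect accuracy and complexity measures for one classifier."""
    t0 = time.perf_counter()
    model.fit(X_train, y_train)
    train_s = time.perf_counter() - t0

    t0 = time.perf_counter()
    y_pred = model.predict(X_test)
    test_s = time.perf_counter() - t0

    precision, recall, f1 = macro_scores(y_test, y_pred)
    accuracy = sum(t == p for t, p in zip(y_test, y_pred)) / len(y_test)
    return {
        "accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1,
        "train_s": train_s, "test_s": test_s,
        # Serialized size of the fitted state, a rough proxy for memory storage.
        "model_mb": len(pickle.dumps(vars(model))) / 1e6,
    }


# Hypothetical toy data: (lanes blocked, vehicles involved) -> duration class.
X_train = [(0, 1), (0, 2), (1, 1), (1, 2), (3, 3), (3, 4), (4, 3), (4, 4)]
y_train = ["short"] * 4 + ["long"] * 4
X_test = [(0, 0), (1, 0), (4, 5), (5, 4)]
y_test = ["short", "short", "long", "long"]

report = evaluate(TinyKNN(k=3), X_train, y_train, X_test, y_test)
print(report)
```

Running `evaluate` once per candidate classifier on the same split would yield comparable rows of accuracy and complexity figures. Timings and serialized sizes from a sketch like this depend on the machine and the implementation, so they are not comparable to the values reported in the abstract.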

Author information

Corresponding author

Correspondence to Khaled Hamad.

Ethics declarations

Conflict of interest

The authors have no competing interests to declare relevant to this article’s content. No direct funding was received to assist with the preparation of this manuscript.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Informed consent

For this type of study, formal consent is not required.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Hamad, K., Obaid, L., Nassif, A.B. et al. Comprehensive evaluation of multiple machine learning classifiers for predicting freeway incident duration. Innov. Infrastruct. Solut. 8, 177 (2023). https://doi.org/10.1007/s41062-023-01138-1

  • DOI: https://doi.org/10.1007/s41062-023-01138-1
