Abstract
In this paper, we aim to identify the variables that contribute significantly to the injury severity of people in a car when an accident happens, and to build a statistical model that predicts the maximum injury severity level and estimates the potential economic cost of a car accident from those variables. The main data source is the General Estimates System data from 2012 to 2013, a representative sample of police-reported motor vehicle crashes of all types collected by the National Highway Traffic Safety Administration. Other data sources, such as the car safety ratings from the United States Department of Transportation and the state-specific cost of crash deaths fact sheets, are also used in building the predictive model. Based on the results of the predictive modeling, an interactive system is developed in HyperText Markup Language, Cascading Style Sheets and JavaScript. The system is hosted at http://gessmu.azurewebsites.net for public access. It allows users to input the variables that are significant contributors in car accidents and obtain the predicted maximum injury severity level and potential economic cost of a car accident.
Acknowledgements
This paper is based on an entry to the American Statistical Association Government Statistics Section 2016 Data Challenge. The authors would like to thank the guest editors, Dr. Roya Amjadi and Dr. Wendy Martinez, for the opportunity to contribute to this special issue, and the two anonymous referees for their useful comments and suggestions on an earlier version of this manuscript, which resulted in this improved version. H. K. T. Ng's work was supported by a grant from the Simons Foundation (#709773 to Tony Ng).
Cite this article
Alkan, G., Farrow, R., Liu, H. et al. Predictive modeling of maximum injury severity and potential economic cost in a car accident based on the General Estimates System data. Comput Stat 36, 1561–1575 (2021). https://doi.org/10.1007/s00180-021-01074-7
DOI: https://doi.org/10.1007/s00180-021-01074-7