A Correlation Based Imputation Method for Incomplete Traffic Accident Data

Deb, Rupam; Wee-Chung Liew, Alan; Oh, Erwin

doi:10.1007/978-3-319-13560-1_77


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8862)

Included in the following conference series:

Pacific Rim International Conference on Artificial Intelligence


Abstract

Death, injury and disability from road traffic crashes continue to be a major global public health problem. Recent data suggest that traffic crashes kill more than 1.25 million people each year, with non-fatal injuries affecting a further 20 to 50 million people. It is predicted that by 2030 road traffic accidents will have become the fifth leading cause of death and that the number of people dying annually from traffic accidents will have doubled from current levels. Methods to reduce accident severity are therefore of great interest to traffic agencies and the public at large. The road accident fatality rate depends on many environmental and road-related factors, which makes investigating the dependencies between attributes a very challenging task. Any missing data in the database could obscure the discovery of important factors and lead to invalid conclusions. To make traffic accident datasets useful for analysis, they must be preprocessed properly. In this paper, we present a novel method for imputing missing values, based on sampling from distributions obtained from correlation measures, to improve the quality of traffic accident data. We evaluated our algorithm using two publicly available traffic accident databases from the United States (explore.data.gov, data.opencolorado.org). Our results indicate that the proposed method performs significantly better than three existing algorithms.
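The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it names: measure how strongly each attribute is associated with the attribute that has gaps, then sample each missing value from a distribution conditioned on the most strongly associated attribute. The choice of Cramér's V as the correlation measure, the fallback to the marginal distribution, and the names `cramers_v` and `impute` are all assumptions made for illustration and are not taken from the paper.

```python
import math
import random
from collections import Counter, defaultdict


def cramers_v(xs, ys):
    """Cramér's V between two categorical columns; pairs containing None are skipped."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)
    row = Counter(x for x, _ in pairs)
    col = Counter(y for _, y in pairs)
    k = min(len(row), len(col)) - 1
    if k == 0:
        return 0.0  # a constant column carries no association
    chi2 = 0.0
    for x in row:
        for y in col:
            expected = row[x] * col[y] / n
            chi2 += (joint.get((x, y), 0) - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))


def impute(records, target, rng=None):
    """Return a copy of `records` with missing `target` values filled by sampling.

    Assumption (not from the paper): condition on the single attribute with
    the highest Cramér's V against `target`.
    """
    rng = rng or random.Random(0)
    others = [a for a in records[0] if a != target]
    tcol = [r[target] for r in records]
    best = max(others, key=lambda a: cramers_v([r[a] for r in records], tcol))

    # Empirical distributions of the target: conditional on `best`, and marginal.
    cond = defaultdict(Counter)
    marginal = Counter()
    for r in records:
        if r[target] is not None:
            marginal[r[target]] += 1
            if r[best] is not None:
                cond[r[best]][r[target]] += 1

    out = []
    for r in records:
        r = dict(r)
        if r[target] is None:
            # Fall back to the marginal when the conditioning value is unseen or missing.
            dist = cond.get(r[best]) or marginal
            if dist:
                values, weights = zip(*dist.items())
                r[target] = rng.choices(values, weights=weights, k=1)[0]
        out.append(r)
    return out
```

Sampling from the conditional distribution, rather than always inserting the most frequent value, keeps the spread of the imputed attribute closer to that of the observed data. Whether the paper conditions on one attribute or combines several is not stated in the abstract.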



Author information

Authors and Affiliations

School of Information and Communication Technology, Griffith University, Australia
Rupam Deb & Alan Wee-Chung Liew
School of Engineering, Griffith University, Australia
Erwin Oh


Editor information

Editors and Affiliations

MIMOS Berhad Technology Park Malaysia, 57000, Bukit Jalil, KL, Malaysia
Duc-Nghia Pham
Kyungpook National University, Sankyuk-Dong, Buk-Gu, 702-701, Daegu, Korea
Seong-Bae Park


Cite this paper

Deb, R., Wee-Chung Liew, A., Oh, E. (2014). A Correlation Based Imputation Method for Incomplete Traffic Accident Data. In: Pham, DN., Park, SB. (eds) PRICAI 2014: Trends in Artificial Intelligence. PRICAI 2014. Lecture Notes in Computer Science, vol 8862. Springer, Cham. https://doi.org/10.1007/978-3-319-13560-1_77


DOI: https://doi.org/10.1007/978-3-319-13560-1_77
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13559-5
Online ISBN: 978-3-319-13560-1
eBook Packages: Computer Science, Computer Science (R0)
