A Correlation Based Imputation Method for Incomplete Traffic Accident Data

  • Conference paper
PRICAI 2014: Trends in Artificial Intelligence (PRICAI 2014)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8862)

Abstract

Death, injury and disability from road traffic crashes continue to be a major global public health problem. Recent data suggest that traffic crashes cause more than 1.25 million fatalities each year, with non-fatal injuries affecting a further 20-50 million people. It is predicted that by 2030 road traffic accidents will have become the 5th leading cause of death and that the number of people who die annually from traffic accidents will have doubled from current levels. Therefore, methods to reduce accident severity are of great interest to traffic agencies and the public at large. The road accident fatality rate depends on many factors, and investigating the dependencies between attributes is very challenging because of the large number of environmental and road accident factors involved. Any missing data in the database could obscure the discovery of important factors and lead to invalid conclusions. To make traffic accident datasets useful for analysis, they should be preprocessed properly. In this paper, we present a novel method based on sampling of distributions obtained from correlation measures for the imputation of missing values, to improve the quality of traffic accident data. We evaluated our algorithm using two publicly available traffic accident databases from the United States (explore.data.gov, data.opencolorado.org). Our results indicate that the proposed method performs significantly better than three existing algorithms.
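
As an informal illustration of the general idea of sampling imputed values from a distribution tied to a correlation measure, the following Python sketch fills a missing categorical value by sampling from the empirical distribution of the target attribute, conditioned on the attribute most strongly associated with it. This is not the algorithm proposed in the paper, whose details are not given in this abstract. The use of Cramér's V as the association measure, the function names `cramers_v` and `impute`, and the toy attributes (`surface`, `light`, `severity`) are all assumptions made for the example.

```python
import math
import random
from collections import Counter, defaultdict


def cramers_v(pairs):
    """Cramér's V association between two categorical variables.

    `pairs` is a list of (a, b) observations. Assumed measure for this
    sketch; the paper's own correlation measures are not stated here.
    """
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)
    row = Counter(a for a, _ in pairs)
    col = Counter(b for _, b in pairs)
    k = min(len(row), len(col)) - 1
    if k == 0:
        return 0.0
    chi2 = 0.0
    for a in row:
        for b in col:
            expected = row[a] * col[b] / n
            chi2 += (joint[(a, b)] - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))


def impute(records, target, seed=0):
    """Fill missing `target` values (None) by conditional sampling.

    Hypothetical helper: picks the attribute with the highest Cramér's V
    against `target`, then samples each missing value from the observed
    distribution of `target` given that attribute's value. Falls back to
    the marginal distribution when the predictor is missing or unseen.
    Returns (new_records, chosen_attribute); the input is not modified.
    """
    rng = random.Random(seed)
    others = [name for name in records[0] if name != target]

    def strength(attr):
        # Association is measured only on rows where both values are observed.
        return cramers_v([(r[attr], r[target]) for r in records
                          if r[attr] is not None and r[target] is not None])

    best = max(others, key=strength)

    conditional = defaultdict(Counter)
    marginal = Counter()
    for r in records:
        if r[target] is not None:
            marginal[r[target]] += 1
            if r[best] is not None:
                conditional[r[best]][r[target]] += 1

    filled = []
    for r in records:
        r = dict(r)
        if r[target] is None:
            dist = conditional.get(r[best]) or marginal
            values, weights = zip(*dist.items())
            r[target] = rng.choices(values, weights=weights)[0]
        filled.append(r)
    return filled, best


# Toy data (invented for illustration): severity tracks surface, not light.
records = [
    {"surface": "wet", "light": "day", "severity": "high"},
    {"surface": "wet", "light": "night", "severity": "high"},
    {"surface": "dry", "light": "day", "severity": "low"},
    {"surface": "dry", "light": "night", "severity": "low"},
    {"surface": "wet", "light": "day", "severity": None},
]
filled, best = impute(records, "severity")
```

In this toy data `surface` is selected as the predictor, and because every observed wet-surface record has high severity, the missing value is filled with `"high"`. Sampling from the conditional distribution, rather than always taking its mode, keeps the spread of imputed values closer to that of the observed data.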

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Deb, R., Wee-Chung Liew, A., Oh, E. (2014). A Correlation Based Imputation Method for Incomplete Traffic Accident Data. In: Pham, DN., Park, SB. (eds) PRICAI 2014: Trends in Artificial Intelligence. PRICAI 2014. Lecture Notes in Computer Science, vol 8862. Springer, Cham. https://doi.org/10.1007/978-3-319-13560-1_77

  • DOI: https://doi.org/10.1007/978-3-319-13560-1_77

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-13559-5

  • Online ISBN: 978-3-319-13560-1

  • eBook Packages: Computer Science, Computer Science (R0)
