Comparing Methods of Imputation for Time Series Missing Values

  • Conference paper
IoT and Big Data Technologies for Health Care (IoTCare 2021)

Abstract

With the rapid development of modern information engineering, large volumes of data are used in active research fields such as machine learning, data cleaning, and data mining. Most of the algorithms and models in these fields are built for complete data sets. In practice, however, data go missing at many stages, including collection, collation, transmission, and storage, and this creates obstacles for models that assume complete data. The usual way to handle missing values is to delete the affected records. Deletion is simple and convenient, but it causes two problems: it shrinks the original data set, and it reduces the reliability of the data. When a large share of the data is missing, deletion can remove much of the data set, so a method more effective than direct deletion is needed. This paper therefore addresses the filling of missing values in time series data. Mean filling, median filling, mode filling, PCA-EM filling, and other methods are used to fill traffic data, and the filling effect of each method is evaluated by comparing them.
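
The three simplest methods named in the abstract (mean, median, and mode filling) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `fill_missing` and `rmse`, the toy traffic counts, and the choice of root-mean-square error as the comparison metric are all assumptions made here for illustration, and PCA-EM filling is not covered.

```python
import math
import statistics


def fill_missing(series, strategy="mean"):
    """Replace None entries with one statistic of the observed values."""
    observed = [x for x in series if x is not None]
    if not observed:
        raise ValueError("series has no observed values")
    if strategy == "mean":
        fill = statistics.mean(observed)
    elif strategy == "median":
        fill = statistics.median(observed)
    elif strategy == "mode":
        fill = statistics.mode(observed)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if x is None else x for x in series]


def rmse(truth, filled, mask):
    """Root-mean-square error, computed only at the positions that were hidden."""
    errors = [(t - f) ** 2 for t, f, m in zip(truth, filled, mask) if m]
    return math.sqrt(sum(errors) / len(errors))


# Hypothetical traffic counts (illustrative values, not data from the paper).
truth = [10, 12, 12, 15, 30, 12, 11, 12]
# True marks a value that is artificially hidden to simulate missing data.
mask = [False, False, False, True, False, True, False, False]
observed = [None if m else t for t, m in zip(truth, mask)]

# Fill with each strategy and score it against the hidden ground truth.
scores = {
    s: rmse(truth, fill_missing(observed, s), mask)
    for s in ("mean", "median", "mode")
}
for name, score in scores.items():
    print(f"{name}: RMSE = {score:.3f}")
```

Hiding known values and scoring the filled series against them is one common way to compare filling methods. All three strategies here ignore the time ordering of the series, which is one reason model-based methods such as PCA-EM are worth comparing against them.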

This work is supported by Shandong Key R&D Program grant 2019JZZY021005.

Author information

Correspondence to Mingran Li or Mingxu Sun.

Copyright information

© 2022 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Geng, R., Li, M., Sun, M., Wang, Y. (2022). Comparing Methods of Imputation for Time Series Missing Values. In: Wang, S., Zhang, Z., Xu, Y. (eds) IoT and Big Data Technologies for Health Care. IoTCare 2021. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 415. Springer, Cham. https://doi.org/10.1007/978-3-030-94182-6_24

  • DOI: https://doi.org/10.1007/978-3-030-94182-6_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-94181-9

  • Online ISBN: 978-3-030-94182-6

  • eBook Packages: Computer Science (R0)
