Comparing Methods of Imputation for Time Series Missing Values

Geng, Renkang; Li, Mingran; Sun, Mingxu; Wang, Yujie

doi:10.1007/978-3-030-94182-6_24

Part of the book series: Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering (LNICST, volume 415)

Included in the following conference series:

IoT and Big Data Technologies for Health Care

Abstract

With the rapid development of modern information engineering, large volumes of data are used in active research fields such as machine learning, data cleaning and data mining. Most algorithms and data models in these fields are built for complete data sets. In practice, however, data go missing at many stages, including collection, collation, transmission and storage, which makes it difficult to build models that assume complete data. The usual way to handle missing values is simply to delete the affected records. Deletion is simple and convenient, but it causes two problems: it shrinks the original data set, and it reduces the reliability of the data. When the proportion of missing data is large, deletion discards a large part of the data set, so a more effective method than direct deletion is needed. To address this, we focus on filling in the missing values of time series data, which has become an urgent problem. In this paper, mean filling, median filling, mode filling, PCA-EM filling and other methods are used to fill traffic data, and the filling effect of each method is evaluated by comparing them.
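The filling methods named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`fill_constant`, `pca_em_fill`, `rmse`), the `rank` and `n_iter` parameters, and the use of RMSE on held-out entries as the evaluation score are all assumptions made for illustration. In particular, `pca_em_fill` is one common reading of "PCA-EM" (alternate between fitting a low-rank PCA model and re-estimating the missing entries from it), which may differ from the variant used in the paper.

```python
import numpy as np


def fill_constant(x, how):
    """Fill NaNs in a 1-D series with the mean, median, or mode of the observed values."""
    x = np.asarray(x, dtype=float)
    observed = x[~np.isnan(x)]
    if how == "mean":
        value = observed.mean()
    elif how == "median":
        value = np.median(observed)
    elif how == "mode":
        values, counts = np.unique(observed, return_counts=True)
        value = values[np.argmax(counts)]  # ties resolve to the smallest value
    else:
        raise ValueError(f"unknown method: {how}")
    filled = x.copy()
    filled[np.isnan(filled)] = value
    return filled


def pca_em_fill(X, rank=1, n_iter=100):
    """Iterative low-rank fill of a 2-D array (rows: time steps, columns: series).

    Assumed EM-style scheme: start from column means, then repeatedly
    fit a rank-`rank` PCA to the filled matrix and overwrite only the
    missing entries with the PCA reconstruction.
    Assumes every column has at least one observed value.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    filled = np.where(missing, np.nanmean(X, axis=0), X)
    for _ in range(n_iter):
        mu = filled.mean(axis=0)
        U, s, Vt = np.linalg.svd(filled - mu, full_matrices=False)
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank] + mu
        filled = np.where(missing, low_rank, X)  # observed entries stay untouched
    return filled


def rmse(truth, estimate, mask):
    """Root-mean-square error over the entries selected by the boolean mask."""
    truth = np.asarray(truth, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    return float(np.sqrt(np.mean((truth[mask] - estimate[mask]) ** 2)))
```

To compare methods in this setup, one would hide a subset of known values, fill them with each method, and compute `rmse` on the hidden entries only; a lower score indicates a better fill on that data set. The constant fills ignore temporal order and correlation between series, whereas the low-rank fill uses the correlation between columns.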

This work is supported by Shandong Key R&D Program grant 2019JZZY021005.

Author information

Authors and Affiliations

School of Electrical Engineering, University of Jinan, Jinan, 250022, Shandong, China
Renkang Geng, Mingran Li, Mingxu Sun & Yujie Wang

Corresponding authors

Correspondence to Mingran Li or Mingxu Sun.

Editor information

Editors and Affiliations

University of Leicester, Leicester, UK
Shuihua Wang
Harbin Institute of Technology, Shenzhen, China
Zheng Zhang
University of Jinan, Jinan, China
Yuan Xu

About this paper

Cite this paper

Geng, R., Li, M., Sun, M., Wang, Y. (2022). Comparing Methods of Imputation for Time Series Missing Values. In: Wang, S., Zhang, Z., Xu, Y. (eds) IoT and Big Data Technologies for Health Care. IoTCare 2021. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 415. Springer, Cham. https://doi.org/10.1007/978-3-030-94182-6_24

DOI: https://doi.org/10.1007/978-3-030-94182-6_24
Published: 18 June 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-94181-9
Online ISBN: 978-3-030-94182-6
eBook Packages: Computer Science, Computer Science (R0)
