Time Series Prediction with Preprocessing and Clustering

  • Conference paper
IoT and Big Data Technologies for Health Care (IoTCare 2021)

Abstract

This paper studies the similarity of time series and, on the basis of clustering, the influence of category weights on prediction results. We first introduce the practical significance and research purpose of the topic, summarize the current state of research domestically and internationally, and outline the content of the paper. Second, we describe related concepts. We then study the overall prediction workflow for time series using the Dodger data set. First, we extract features and preprocess the data to generate time series from the raw data. The series are then divided into training data and test data for subsequent processing. Next, a clustering algorithm divides the time series into seven categories that follow the one-week cycle of the data. The mean of each category is computed to represent that category, and similarities are then compared. Finally, the weight of each category is computed from its degree of similarity, and the data are predicted using these weights. The forecasts are analyzed and evaluated with MAE, R-squared, MAPE, and other indicators.
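The workflow summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract names neither the similarity measure nor the weighting formula, so the Euclidean distance, the inverse-distance weights, the synthetic data, and all function names below are assumptions chosen for illustration.

```python
import numpy as np

def category_means(days, labels, n_categories=7):
    """Mean daily profile of each category (here: one category per weekday)."""
    return np.stack([days[labels == c].mean(axis=0) for c in range(n_categories)])

def similarity_weights(observed, means, eps=1e-9):
    """Weights from inverse Euclidean distance to each category mean.
    Assumption: the abstract does not specify the similarity measure."""
    k = observed.shape[0]
    dist = np.linalg.norm(means[:, :k] - observed, axis=1)
    sim = 1.0 / (dist + eps)
    return sim / sim.sum()  # normalise so the weights sum to 1

def predict_rest_of_day(observed, means):
    """Forecast the unobserved part of a day as a weighted sum of category means."""
    k = observed.shape[0]
    return similarity_weights(observed, means) @ means[:, k:]

def mae(y, yhat):
    return float(np.mean(np.abs(y - yhat)))

def mape(y, yhat):
    # Percentage error; assumes y contains no zeros.
    return float(np.mean(np.abs((y - yhat) / y)) * 100.0)

def r_squared(y, yhat):
    ss_res = np.sum((y - yhat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Synthetic stand-in for preprocessed daily series (8 weeks, 24 points per day).
rng = np.random.default_rng(0)
n_weeks, points = 8, 24
labels = np.tile(np.arange(7), n_weeks)                      # weekday of each day
t = np.arange(points)
base = 20.0 + 3.0 * np.arange(7)[:, None] + 5.0 * np.sin(2 * np.pi * t / points)
days = base[labels] + rng.normal(0.0, 1.0, size=(labels.size, points))

# Split into training and test data: first 6 weeks train, last 2 weeks test.
split = 6 * 7
train, test = days[:split], days[split:]
means = category_means(train, labels[:split])

# Observe the first half of each test day and forecast the second half.
k = points // 2
forecast = np.stack([predict_rest_of_day(day[:k], means) for day in test])
actual = test[:, k:]

print("MAE :", mae(actual, forecast))
print("MAPE:", mape(actual, forecast))
print("R^2 :", r_squared(actual, forecast))
```

In this sketch a category that closely matches the observed part of the day dominates the forecast, while dissimilar categories contribute little; a different similarity measure or weighting rule would change only `similarity_weights`.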

This work is supported by Shandong Key R&D Program grant 2019JZZY021005.

Author information

Corresponding authors

Correspondence to Lin Han or Jidong Feng.

Copyright information

© 2022 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Sun, H., Lin, S., Han, L., Feng, J., Sun, M. (2022). Time Series Prediction with Preprocessing and Clustering. In: Wang, S., Zhang, Z., Xu, Y. (eds) IoT and Big Data Technologies for Health Care. IoTCare 2021. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 415. Springer, Cham. https://doi.org/10.1007/978-3-030-94182-6_27

  • DOI: https://doi.org/10.1007/978-3-030-94182-6_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-94181-9

  • Online ISBN: 978-3-030-94182-6

  • eBook Packages: Computer Science, Computer Science (R0)
