A novel prediction method of complex univariate time series based on k-means clustering

  • Methodologies and Application
Soft Computing

Abstract

Time-series prediction has been widely studied and applied in many fields. For time series with a high acquisition frequency and heavy noise, it is very difficult to build a prediction model directly on the raw data. It is therefore necessary to first extract the change trend of the series accurately and then build a prediction model for that trend. To this end, this paper proposes a novel prediction method for complex univariate time series based on K-means clustering. The method first obtains the change trend information of the original time series using the K-means clustering idea; a gated recurrent unit with an input attention mechanism is then used to build a prediction model for the extracted trend. Extensive experiments on an electromagnetic radiation dataset that we collected, and on the publicly available AEP_hourly and Wind Turbine Scada datasets, demonstrate that the proposed K-means clustering method effectively reduces noise interference and accurately captures the change trend of the time series. Comparative experiments show that our prediction model achieves the best prediction accuracy among the models compared, and that the proposed complex univariate time-series prediction algorithm has great practical value.
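The abstract does not spell out how K-means is applied to extract the change trend, so the following is only a minimal sketch of one plausible reading, not the authors' algorithm. It assumes the series is cut into fixed-length, non-overlapping windows, each window is clustered with plain one-dimensional K-means, and the center of the most populated cluster is taken as that window's trend value, so that isolated noise spikes fall into a minority cluster and are ignored. The names `kmeans_1d` and `window_trend`, the window length, and the choice of `k` are all hypothetical.

```python
from statistics import fmean


def kmeans_1d(values, k=2, iters=20):
    """Plain Lloyd's k-means on scalars; returns (centers, labels)."""
    lo, hi = min(values), max(values)
    if k < 2 or lo == hi:
        # Degenerate case: a single cluster holds every point.
        return [fmean(values)], [0] * len(values)
    # Deterministic initialisation (an assumption): spread the k centers
    # evenly between the window minimum and maximum.
    centers = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
        # Update step: each center becomes the mean of its members;
        # an empty cluster keeps its previous center.
        new_centers = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_centers.append(fmean(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


def window_trend(series, window=5, k=2):
    """One trend value per non-overlapping window (hypothetical scheme):
    the center of the cluster that contains the most points."""
    trend = []
    for start in range(0, len(series) - window + 1, window):
        chunk = series[start:start + window]
        centers, labels = kmeans_1d(chunk, k)
        dominant = max(range(len(centers)), key=labels.count)
        trend.append(centers[dominant])
    return trend


if __name__ == "__main__":
    # Two windows, each containing one large noise spike.
    noisy = [1.0, 1.1, 0.9, 1.0, 9.0, 2.0, 2.1, 1.9, 2.0, -5.0]
    print(window_trend(noisy, window=5, k=2))
```

On the toy input above, the two trend values come out close to 1.0 and 2.0, whereas the plain window means (2.6 and 0.6) are pulled strongly toward the spikes. The resulting shorter, smoother sequence is the kind of input that a downstream recurrent predictor, such as the attention-based GRU named in the abstract, could then be trained on.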

Acknowledgements

This work is supported by the National Natural Science Foundation of China (Nos. 61672522 and 61976216).

Author information

Corresponding author

Correspondence to Shifei Ding.

Ethics declarations

Conflict of interest

All authors declare that they have no conflict of interest.

Informed consent

All procedures followed were in accordance with the ethical standards of the responsible committee on human experimentation (institutional and national) and with the 1975 Declaration of Helsinki, as revised in 2008. Additional informed consent was obtained from all patients for whom identifying information is included in this article.

Human and animal rights

This article does not contain any studies with human or animal subjects performed by any of the authors.

Additional information

Communicated by V. Loia.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Liu, Y., Ding, S. & Jia, W. A novel prediction method of complex univariate time series based on k-means clustering. Soft Comput 24, 16425–16437 (2020). https://doi.org/10.1007/s00500-020-04952-2

  • DOI: https://doi.org/10.1007/s00500-020-04952-2
