A novel prediction method of complex univariate time series based on k-means clustering

Liu, Yunxin; Ding, Shifei; Jia, Weikuan

doi:10.1007/s00500-020-04952-2

Methodologies and Application
Published: 24 April 2020

Volume 24, pages 16425–16437, (2020)
Soft Computing

Yunxin Liu^1,2,
Shifei Ding^1,2 &
Weikuan Jia^3

Abstract

Time-series prediction has been widely studied and applied in various fields. For time series with high acquisition frequency and high noise, it is very difficult to build a prediction model directly on the raw data. It is therefore necessary to first extract the change trend of the time series accurately, and then build a prediction model for that trend. To this end, this paper proposes a novel prediction method for complex univariate time series based on K-means clustering. The method first obtains the change trend information of the original time series using the K-means clustering idea, and then uses a gated recurrent unit with an input attention mechanism to build a prediction model for the extracted trend information. Extensive experiments on an electromagnetic radiation dataset collected by the authors and on two datasets published online, the AEP_hourly dataset and the Wind Turbine Scada dataset, demonstrate that the proposed K-means clustering method can effectively reduce noise interference and accurately obtain the time-series change trend information. Comparative experiments with different prediction models demonstrate that the proposed model achieves the best prediction accuracy among the models compared, and that the proposed complex univariate time-series prediction algorithm has great practical value.
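
The abstract does not spell out how the clustering step turns a noisy series into trend information, so the following is only a minimal sketch under stated assumptions, not the authors' algorithm. It assumes the series is cut into non-overlapping windows, that the samples in each window are clustered by value with plain one-dimensional K-means, and that the centroid of the most populated cluster serves as the trend value for that window. The names `kmeans_1d` and `extract_trend` and the parameters `window` and `k` are hypothetical.

```python
import numpy as np


def kmeans_1d(values, k, n_iter=50):
    """Plain Lloyd's K-means on scalar values; returns (centroids, labels)."""
    values = np.asarray(values, dtype=float)
    # Deterministic initialisation: spread the centroids over the quantiles.
    centroids = np.quantile(values, np.linspace(0.0, 1.0, k))
    for _ in range(n_iter):
        # Assignment step: index of the nearest centroid for every sample.
        labels = np.argmin(np.abs(values[:, None] - centroids[None, :]), axis=1)
        # Update step: each centroid moves to the mean of its members.
        new_centroids = centroids.copy()
        for j in range(k):
            members = values[labels == j]
            if members.size:  # an empty cluster keeps its old centroid
                new_centroids[j] = members.mean()
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    # Final assignment against the centroids that are returned.
    labels = np.argmin(np.abs(values[:, None] - centroids[None, :]), axis=1)
    return centroids, labels


def extract_trend(series, window=20, k=3):
    """ASSUMED reduction (not taken from the paper): one trend value per
    non-overlapping window, chosen as the centroid of the largest cluster."""
    series = np.asarray(series, dtype=float)
    trend = []
    for start in range(0, len(series) - window + 1, window):
        centroids, labels = kmeans_1d(series[start:start + window], k)
        counts = np.bincount(labels, minlength=k)
        trend.append(centroids[np.argmax(counts)])
    return np.array(trend)


if __name__ == "__main__":
    # Toy series: a flat level of 5.0 with one spike of 100.0 per window.
    toy = np.tile(np.array([5.0] * 8 + [100.0, 5.0]), 3)
    print(extract_trend(toy, window=10, k=2))
```

On the toy series, each window's spike lands in its own cluster, so the trend value stays at the level of 5.0, whereas a plain window mean would be pulled up to 14.5. The shorter trend sequence would then be the input to the prediction model; the gated recurrent unit with input attention is not sketched here because the abstract gives no architectural details.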

Acknowledgements

This work is supported by the National Natural Science Foundation of China (Nos. 61672522 and 61976216).

Author information

Authors and Affiliations

School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, 221116, China
Yunxin Liu & Shifei Ding
Mine Digitization Engineering Research Center of Ministry of Education of the People’s Republic of China, Xuzhou, 221116, China
Yunxin Liu & Shifei Ding
School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China
Weikuan Jia

Corresponding author

Correspondence to Shifei Ding.

Ethics declarations

Conflict of interest

All authors declare that they have no conflict of interest.

Informed consent

All procedures followed were in accordance with the ethical standards of the responsible committee on human experimentation (institutional and national) and with the 1975 Declaration of Helsinki, as revised in 2008 (5). Additional informed consent was obtained from all patients for which identifying information is included in this article.

Human and animal rights

This article does not contain any studies with human or animal subjects performed by any of the authors.

Additional information

Communicated by V. Loia.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Liu, Y., Ding, S. & Jia, W. A novel prediction method of complex univariate time series based on k-means clustering. Soft Comput 24, 16425–16437 (2020). https://doi.org/10.1007/s00500-020-04952-2

Published: 24 April 2020
Issue Date: November 2020
DOI: https://doi.org/10.1007/s00500-020-04952-2
