A novel multi-resolution representation for time series sensor data analysis

  • Yupeng Hu
  • Cun Ji
  • Qingke Zhang
  • Lin Chen
  • Peng Zhan
  • Xueqing Li
Methodologies and Application

Abstract

The evolution of the Internet of Things (IoT) has made sensing devices of all types common across industrial fields and has led to enormous growth in the volume of sensor data. Because sensor data are high in both volume and dimensionality, in-depth analysis and data mining are difficult to perform directly on the raw time series. To address this problem, we propose a novel dimensionality reduction and multi-resolution representation approach for time series sensor data. The approach selects an appropriate number of important data points (IDPs) from a given sensor time series and uses them to produce a corresponding multi-resolution piecewise linear representation (MPLR), called MPLR-IDP. Theoretical analyses and experiments show that MPLR-IDP reduces dimensionality while preserving the important characteristics of the time series, and that it can represent the data flexibly enough to meet the diverse needs of different users.
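The general idea of a piecewise linear representation at several resolutions can be illustrated with a minimal sketch. It is not the MPLR-IDP algorithm itself: the point-selection rule below (greedily keeping the point with the largest vertical deviation) is a simple stand-in for the IDP selection, the fitting error is assumed to be a sum of squared residuals, and all function names are hypothetical.

```python
from typing import List, Sequence


def interpolate_plr(series: Sequence[float], idx: Sequence[int]) -> List[float]:
    """Rebuild a full-length approximation by joining the kept points
    (sorted, distinct indices) with straight lines."""
    approx = [0.0] * len(series)
    for left, right in zip(idx, idx[1:]):
        span = right - left
        for t in range(left, right + 1):
            w = (t - left) / span
            approx[t] = (1 - w) * series[left] + w * series[right]
    return approx


def fitting_error(series: Sequence[float], approx: Sequence[float]) -> float:
    """Sum of squared residuals (an assumed error measure)."""
    return sum((a - b) ** 2 for a, b in zip(series, approx))


def select_points(series: Sequence[float], k: int) -> List[int]:
    """Stand-in selection rule, not the paper's IDP method: start from the
    two endpoints and repeatedly keep the point that deviates most from
    the current piecewise linear approximation, until k points are kept."""
    idx = [0, len(series) - 1]
    while len(idx) < k:
        approx = interpolate_plr(series, idx)
        worst = max(range(len(series)), key=lambda t: abs(series[t] - approx[t]))
        if worst in idx:  # approximation is already exact everywhere
            break
        idx = sorted(idx + [worst])
    return idx


# Toy series: a coarse resolution keeps few points, a finer one keeps more.
series = [0, 1, 2, 3, 2, 1, 0, 1, 2]
for k in (2, 3, 4):
    kept = select_points(series, k)
    err = fitting_error(series, interpolate_plr(series, kept))
    print(k, kept, round(err, 3))
```

Keeping more points gives a finer resolution with a smaller fitting error, while keeping fewer gives a stronger reduction in dimensionality; a user-specified segment count or error bound would decide where to stop.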

Keywords

Internet of things, Time series, Piecewise linear representation, Multi-resolution representation

Abbreviations

\(TS_n\)

A time series with length n

PLR

Piecewise linear representation

MPLR

Multi-resolution PLR

BMPLR

The basic multi-resolution PLR

EMPLR

The extended multi-resolution PLR

PIPs

Perceptually important points

TPs

Turning points

IDPs

Important data points

TSRSs

Time series representation standards

\(Num_{seg}\)

The user-specified number of segments

TFE

The user-specified fitting error of the entire time series

\(MFE_{seg}\)

The user-specified maximum fitting error of a segment

ARI

Adaptive representation index

SB-Tree

Specialized binary tree index

OBST

The optimal binary search tree

LI

Linear interpolation

LR

Linear regression

\(seg\langle x,y\rangle\)

Segment object from \(v_x\) to \(v_y\)

\(es_{\langle x,y\rangle}\)

The fitting error of \(seg\langle x,y\rangle\)

\(DS_m\)

A time series dataset containing \(m\) time series

\(MN_\mathrm{TP}\)

The maximum number of TPs

DCR

Data compression ratio

TSC

Time series classification

ST

Shapelet transformation

TDS

Time series training dataset

SubTS

The set of all time series subsequences

FSS

The final set of shapelets

Notes

Acknowledgements

The authors would like to thank the anonymous reviewers and the editors for their insightful comments and suggestions, which greatly helped to improve the quality of this paper. This work was supported by the National Natural Science Foundation of China (Nos. 61772310, 61702300, 61702302 and 61802231), the Science and Technology Development Funds of Shandong Province (No. 2014GGX101028), and the Project of Qingdao Postdoctoral Applied Research.

Compliance with ethical standards

Conflict of interest

All authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Informed consent

Informed consent was obtained from all individual participants included in the study.

Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  1. School of Computer Science and Technology, Shandong University, Tsingtao, China
  2. School of Information Science and Engineering, Shandong Normal University, Jinan, China
  3. School of Software, Shandong University, Jinan, China