Introducing time series chains: a new primitive for time series data mining

Abstract

Time series motifs were introduced in 2002 and have since become a fundamental tool for time series analytics, finding diverse uses in dozens of domains. In this work, we introduce Time Series Chains, which are related to, but distinct from, time series motifs. Informally, a time series chain is a temporally ordered set of subsequence patterns in which each pattern is similar to the one that preceded it, but the first and last patterns can be arbitrarily dissimilar. In the discrete space, this is similar to extracting the text chain “data, date, cate, cade, code” from a text stream. The first and last words have nothing in common, yet they are connected by a chain of words in which each differs only slightly from its neighbor. Time series chains can capture the evolution of systems and help predict the future. As such, they potentially have implications for prognostics. In this work, we introduce two robust definitions of time series chains and scalable algorithms that allow us to discover them in massive, complex datasets.
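The text-chain analogy can be checked directly. The following is a minimal sketch, not the paper's algorithm: it assumes Hamming distance (the number of differing character positions) as the notion of "difference" between words, and the helper name `hamming` is introduced here purely for illustration.

```python
def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


# The word chain quoted in the abstract.
chain = ["data", "date", "cate", "cade", "code"]

# Distance between each word and the one that follows it.
step_distances = [hamming(a, b) for a, b in zip(chain, chain[1:])]

# Distance between the first and the last word of the chain.
end_to_end = hamming(chain[0], chain[-1])

print(step_distances)  # [1, 1, 1, 1]
print(end_to_end)      # 4
```

Every consecutive pair differs in a single letter, while the first and last words differ in all four positions. A time series chain has the same shape, with subsequences in place of words and a real-valued distance in place of Hamming distance.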

Acknowledgements

We would like to acknowledge funding from MERL and from NSF IIS-1161997 II and NSF IIS-1510741. We especially want to thank Dr. John Michael Criley and Dr. Gregory Mason for their invaluable advice on the hemodynamics domain, and Dr. Matsubara for providing the GoogleTrend data.

Author information

Corresponding author

Correspondence to Yan Zhu.

About this article

Cite this article

Zhu, Y., Imamura, M., Nikovski, D. et al. Introducing time series chains: a new primitive for time series data mining. Knowl Inf Syst 60, 1135–1161 (2019). https://doi.org/10.1007/s10115-018-1224-8

Keywords

  • Time series
  • Motifs
  • Prognostics
  • Link analysis