The Journal of Supercomputing, Volume 72, Issue 10, pp 3850–3867

Discovering sub-patterns from time series using a normalized cross-match algorithm

  • Xueyuan Gong
  • Simon Fong (email author)
  • Raymond K. Wong
  • Sabah Mohammed
  • Jinan Fiaidhi
  • Athanasios V. Vasilakos

Abstract

Time series data stream mining has attracted considerable research interest in recent years, and pattern discovery is one of its challenging problems. Because the data update continuously and sampling rates may differ, dynamic time warping (DTW)-based approaches are used to solve the pattern discovery problem in time series data streams. However, the naive form of the DTW-based approach is computationally expensive. Toyoda et al. therefore proposed the CrossMatch (CM) approach to discover patterns between two time series data streams (sequences), which requires only O(n) time per data update, where n is the length of one sequence. CM, however, does not support normalization, which is required for some kinds of sequences (e.g., stock prices, ECG data). We therefore propose a normalized-CrossMatch (NCM) approach that extends CM to enforce normalization while maintaining the same performance capabilities.
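
To illustrate why normalization matters for sequences such as stock prices, the following is a minimal Python sketch of z-normalization combined with the naive quadratic-time DTW distance. It is not the CrossMatch or NCM algorithm, whose details are not given in this abstract; the function names `z_normalize` and `dtw_distance` and the squared pointwise cost are assumptions made for illustration only.

```python
import math


def z_normalize(seq):
    """Rescale a sequence to zero mean and unit standard deviation."""
    n = len(seq)
    mean = sum(seq) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in seq) / n)
    if std == 0:
        return [0.0] * n  # constant sequence carries no shape information
    return [(x - mean) / std for x in seq]


def dtw_distance(a, b):
    """Naive DTW with squared pointwise cost: O(len(a) * len(b)) time."""
    inf = float("inf")
    # prev[j] holds the best warping cost for the previous row of the table.
    prev = [0.0] + [inf] * len(b)
    for x in a:
        curr = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            cost = (x - y) ** 2
            # Extend the cheapest of the three admissible predecessor cells.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[len(b)]


a = [1, 2, 3, 2, 1]
b = [101, 102, 103, 102, 101]  # same shape as a, shifted by an offset

raw = dtw_distance(a, b)
normalized = dtw_distance(z_normalize(a), z_normalize(b))
```

In this sketch, `raw` is large because of the offset between the two sequences, while `normalized` is close to zero because both sequences have the same shape. The nested loop also shows where the quadratic cost of the naive approach comes from.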

Keywords

Pattern discovery · CrossMatch · NCM · Data streams · Time series

Notes

Acknowledgments

The authors are thankful for the financial support from the research grant “Temporal Data Stream Mining by Using Incrementally Optimized Very Fast Decision Forest (iOVFDF)”, Grant No. MYRG2015-00128-FST, offered by the University of Macau, FST, and RDAO.

Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  • Xueyuan Gong (1)
  • Simon Fong (1), email author
  • Raymond K. Wong (2)
  • Sabah Mohammed (3)
  • Jinan Fiaidhi (3)
  • Athanasios V. Vasilakos (4)

  1. Department of Computer and Information Science, University of Macau, Macau, China
  2. School of Computer Science and Engineering, University of New South Wales, Sydney, Australia
  3. Department of Computer Science, Lakehead University, Thunder Bay, Canada
  4. Department of Computer Science, Electrical and Space Engineering, Lulea University of Technology, Lulea, Sweden
