Invariant time-series factorization

Data Mining and Knowledge Discovery

Abstract

Time-series analysis is an important domain of machine learning, and a plethora of methods has been developed for the task. This paper proposes a new representation of time series which, in contrast to existing approaches, decomposes a time-series dataset into latent patterns and membership weights of local segments to those patterns. The process is formalized as a constrained objective function, and a tailored stochastic coordinate descent optimization is applied. The time series are projected to a new feature representation consisting of the sums of the membership weights, which captures the frequencies of local patterns. Features from various sliding-window sizes are concatenated in order to encapsulate the interaction of patterns of different sizes. The derived representation offers a set of features that boosts classification accuracy. Finally, a large-scale experimental comparison against 11 baselines over 43 real-life datasets indicates that the proposed method achieves state-of-the-art prediction accuracy.
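The feature construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the patterns here are random placeholders rather than learned ones, and the membership weights come from a softmax over negative squared distances instead of the constrained objective and stochastic coordinate descent that the paper uses. All function names, the `temperature` parameter, and the window sizes are assumptions made for illustration only.

```python
import numpy as np


def sliding_segments(series, window, stride=1):
    """Return all length-`window` segments of a 1-D series as rows."""
    series = np.asarray(series, dtype=float)
    starts = range(0, len(series) - window + 1, stride)
    return np.stack([series[s:s + window] for s in starts])


def soft_memberships(segments, patterns, temperature=1.0):
    """Soft-assign each segment to each pattern (each row sums to 1).

    Assumption: a softmax over negative squared Euclidean distances is
    used as a stand-in for learned membership weights.
    """
    diffs = segments[:, None, :] - patterns[None, :, :]
    sq_dist = (diffs ** 2).sum(axis=2)
    logits = -sq_dist / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)


def pattern_frequency_features(series, patterns_by_window):
    """Sum memberships per pattern, then concatenate across window sizes."""
    features = []
    for window, patterns in patterns_by_window.items():
        segments = sliding_segments(series, window)
        memberships = soft_memberships(segments, patterns)
        features.append(memberships.sum(axis=0))
    return np.concatenate(features)


rng = np.random.default_rng(0)
series = np.sin(np.linspace(0.0, 6.0 * np.pi, 60))
patterns_by_window = {
    8: rng.normal(size=(3, 8)),    # 3 hypothetical patterns of length 8
    16: rng.normal(size=(2, 16)),  # 2 hypothetical patterns of length 16
}
features = pattern_frequency_features(series, patterns_by_window)
print(features.shape)  # (5,)
```

Because each segment's weights sum to one in this sketch, the features for a given window size sum to the number of segments, so each entry can be read as a soft count of how often a pattern occurs in the series.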

Notes

  1. www.cs.ucr.edu/~eamonn/time_series_data.

  2. http://fs.ismll.de/publicspace/InvariantFactorization/.

Acknowledgments

Partially co-funded by the Seventh Framework Programme of the European Commission, through project REDUCTION (# 288254). www.reduction-project.eu.

Author information

Corresponding author

Correspondence to Josif Grabocka.

Additional information

Responsible editors: Toon Calders, Floriana Esposito, Eyke Hüllermeier, Rosa Meo.

About this article

Cite this article

Grabocka, J., Schmidt-Thieme, L. Invariant time-series factorization. Data Min Knowl Disc 28, 1455–1479 (2014). https://doi.org/10.1007/s10618-014-0364-z

  • DOI: https://doi.org/10.1007/s10618-014-0364-z
