Abstract
Comparing and contrasting subtle historical patterns is central to time series analysis. Here we introduce a new approach to quantify deviations in the underlying hidden stochastic generators of sequential discrete-valued data streams. The proposed measure is universal in the sense that we can compare data streams without any feature engineering step and without the need for any hyper-parameters. Our core idea is the generalization of the Kullback–Leibler divergence, often used to compare probability distributions, to a notion of divergence between finite-valued ergodic stationary stochastic processes. Using this notion of process divergence, we craft a measure of deviation on finite sample paths, which we call the sequence likelihood divergence (SLD), that approximates a metric on the space of the underlying generators within a well-defined class of discrete-valued stochastic processes. We compare the performance of SLD against state-of-the-art approaches, e.g., dynamic time warping (Petitjean et al. in Pattern Recognit 44(3):678–693, 2011), on synthetic data, in real-world applications with electroencephalogram data and gait recognition, and on diverse time-series classification problems from the University of California, Riverside time series classification archive (Thanawin Rakthanmanon and Westover). We demonstrate that the new tool is on par or better in classification accuracy, while being significantly faster in comparable implementations. We have released SLD in the public domain, and we are hopeful that it will enhance the standard toolbox used in classification, clustering and inference problems in time series analysis.
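As a point of reference for the starting quantity named above, the following is a minimal sketch of the classical Kullback–Leibler divergence between the empirical symbol distributions of two discrete-valued streams. It is not the SLD and not the process-level divergence introduced in the paper: it ignores temporal order entirely. The function names, the example streams and the additive smoothing constant are all illustrative assumptions.

```python
import math
from collections import Counter


def empirical_distribution(seq, alphabet, smoothing=1.0):
    """Additively smoothed symbol frequencies over a fixed alphabet.

    The smoothing constant is an assumption made here so that no
    symbol receives probability zero; it is not taken from the paper.
    """
    counts = Counter(seq)
    total = len(seq) + smoothing * len(alphabet)
    return {a: (counts[a] + smoothing) / total for a in alphabet}


def kl_divergence(p, q):
    """Classical D_KL(p || q) in bits for two distributions on one alphabet."""
    return sum(p[a] * math.log2(p[a] / q[a]) for a in p if p[a] > 0)


# Two hypothetical binary streams (illustrative data only).
alphabet = "01"
x = "0101010111"
y = "0000010001"

p = empirical_distribution(x, alphabet)
q = empirical_distribution(y, alphabet)

d_pq = kl_divergence(p, q)  # divergence of x's distribution from y's
d_qp = kl_divergence(q, p)  # generally differs from d_pq
d_pp = kl_divergence(p, p)  # zero for identical distributions
```

The two directions `d_pq` and `d_qp` differ in general, which illustrates why the plain divergence is not itself a metric. Because this sketch only looks at symbol frequencies, two streams with the same frequencies but different dynamics would be indistinguishable here.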
References
Berndt DJ, Clifford J (1994) Using dynamic time warping to find patterns in time series. In: KDD workshop, vol 10. Seattle, WA, pp 359–370
Bondy JA, Murty USR (2008) Graph theory. Grad Texts in Math
Chattopadhyay I (2014) Causality networks. arXiv preprint arXiv:1406.6651
Chattopadhyay I, Lipson H (2013) Abductive learning of quantized stochastic processes with probabilistic finite automata. Philos Trans R Soc A Math Phys Eng Sci 371(1984):20110543
Chattopadhyay I, Lipson H (2014) Data smashing: uncovering lurking order in data. J R Soc Interface 11(101):20140826
Chen L, Özsu MT, Oria V (2005) Robust and fast similarity search for moving object trajectories. In: Proceedings of the 2005 ACM SIGMOD international conference on management of data, pp 491–502. ACM
Ching WK, Ng MK (2006) Markov chains: models, algorithms and applications. International Series in Operations Research & Management Science. Springer US, ISBN 9780387293370
Cover TM, Thomas JA (2012) Elements of information theory. Wiley, New York
Crutchfield JP (1994) The calculi of emergence: computation, dynamics and induction. Physica D Nonlinear Phenomena 75(1–3):11–54
Dau HA, Bagnall A, Kamgar K, Yeh C-CM, Zhu Y, Gharghabi S, Ratanamahatana CA, Keogh E (2019) The UCR time series archive. IEEE/CAA J Automatica Sinica 6(6):1293–1305
Dekking FM, Kraaikamp C, Lopuhaä HP, Meester LE (2005) A modern introduction to probability and statistics: understanding why and how. Springer, Berlin
Dempster A, Petitjean F, Webb GI (2020) Rocket: exceptionally fast and accurate time series classification using random convolutional kernels. Data Min Knowl Discov 34(5):1454–1495
Dua D, Graff C (2017) UCI machine learning repository
Dupont P, Denis F, Esposito Y (2005) Links between probabilistic automata and hidden Markov models: probability distributions, learning models and induction algorithms. Pattern Recognit 38(9):1349–1371
Goldberger AL, Amaral LAN, Glass L, Hausdorff JM, Ivanov PC, Mark RG, Mietus JE, Moody GB, Peng C-K, Eugene Stanley H (2000) Physiobank, physiotoolkit, and physionet: components of a new research resource for complex physiologic signals. Circulation 101(23):e215–e220
Gupta G, Pequito S, Bogdan P (2018) Dealing with unknown unknowns: identification and selection of minimal sensing for fractional dynamics with unknown inputs. In: 2018 Annual American Control Conference (ACC). IEEE, pp 2814–2820
Gupta G, Pequito S, Bogdan P (2019) Learning latent fractional dynamics with unknown unknowns. In: 2019 American Control Conference (ACC). IEEE, pp 217–222
Hardy GH (1992) Divergent series, with a preface by J. E. Littlewood and a note by L. S. Bosanquet, reprint of the revised (1963) edition. Éditions Jacques Gabay, Sceaux
Helstrom CW (1991) Probability and stochastic processes for engineers. Macmillan Coll Division
Jain S, Xiao X, Bogdan P, Bruck J (2021) Generator based approach to analyze mutations in genomic datasets. Sci Rep 11(1):1–12
Kullback S, Leibler RA (1951) On information and sufficiency. Ann Math Stat 22(1):79–86
Lin J, Keogh E, Lonardi S, Chiu B (2003) A symbolic representation of time series, with implications for streaming algorithms. In: Proceedings of the 8th ACM SIGMOD workshop on Research issues in data mining and knowledge discovery. ACM, pp 2–11
Löning M, Bagnall A, Ganesh S, Kazakov V, Lines J, Király FJ (2019) sktime: a unified interface for machine learning with time series. arXiv preprint arXiv:1909.07872
Middlehurst M, Large J, Flynn M, Lines J, Bostrom A, Bagnall A (2021) Hive-cote 2.0: a new meta ensemble for time series classification. Mach Learn 110(11):3211–3243
Möller-Levet CS, Klawonn F, Cho K-H, Wolkenhauer O (2003) Fuzzy clustering of short time-series and unevenly distributed sampling points. In: International symposium on intelligent data analysis. Springer, pp 330–340
Navarro G (2001) A guided tour to approximate string matching. ACM Comput Surv (CSUR) 33(1):31–88
Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J, Passos A, Cournapeau D, Brucher M, Perrot M, Duchesnay E (2011) Scikit-learn: machine learning in Python. J Mach Learn Res 12:2825–2830
Petitjean F, Ketterlin A, Gançarski P (2011) A global averaging method for dynamic time warping, with applications to clustering. Pattern Recognit 44(3):678–693
Rabiner LR (1989) A tutorial on hidden Markov models and selected applications in speech recognition. Proc IEEE 77(2):257–286
Rakthanmanon T, Campana B, Mueen A, Batista G, Westover B, Zhu Q, Zakaria J, Keogh E (2012) Searching and mining trillions of time series subsequences under dynamic time warping. In: Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining. pp 262–270
Rényi A (1965) On the foundations of information theory. Revue de l’Institut International de Statistique, pp 1–14
Ruiz AP, Flynn M, Large J, Middlehurst M, Bagnall A (2021) The great multivariate time series classification bake off: a review and experimental evaluation of recent algorithmic advances. Data Min Knowl Discov 35(2):401–449
Shannon CE (2001) A mathematical theory of communication. ACM SIGMOBILE Mob Comput Commun Rev 5(1):3–55
Mueen A, Zhu Q, Zakaria J, Keogh E, Batista G, Rakthanmanon T, Campana B, Westover B. UCR suite for time series subsequence search. (Accessed on 01/20/2021)
Vidyasagar M (2007) Bounds on the Kullback–Leibler divergence rate between hidden Markov models. In: 2007 46th IEEE conference on decision and control. IEEE, pp 6160–6165
Vidyasagar M (2014) Hidden Markov processes: theory and applications to biology, vol 44. Princeton University Press, Princeton
Xue Y, Bogdan P (2019) Reconstructing missing complex networks against adversarial interventions. Nat Commun 10(1):1–12
Xue Y, Rodriguez S, Bogdan P (2016) A spatio-temporal fractal model for a CPS approach to brain-machine-body interfaces. In: 2016 design, automation & test in Europe conference & exhibition (DATE), pp 642–647. IEEE
Yang R, Sala F, Bogdan P (2021) Hidden network generating rules from partially observed complex networks. Commun Phys 4(1):1–12
Acknowledgements
We thank the anonymous reviewers for their very useful comments and suggestions. Part of this work was done while Li Shen and Ling Cheng were doing research at Griffith University. The work was supported by the Australian Research Council (ARC) Large Grant A849602031.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Huang, Y., Rotaru, V. & Chattopadhyay, I. Sequence likelihood divergence for fast time series comparison. Knowl Inf Syst 65, 3079–3098 (2023). https://doi.org/10.1007/s10115-023-01855-0
DOI: https://doi.org/10.1007/s10115-023-01855-0