Abstract
Knowledge discovery and information extraction from large and complex datasets have attracted great attention in areas ranging from statistics and biology to medicine. Tools from machine learning, data mining, and neurocomputing have been extensively explored and applied to such data analytics tasks. However, for time-series data with active dynamic characteristics, many state-of-the-art techniques may not capture the inherent temporal structures well. In this paper, by integrating the Koopman operator and linear dynamical systems theory with support vector machines, we develop a novel dynamic data mining framework that constructs low-dimensional linear models approximating the nonlinear flow of high-dimensional time-series data generated by unknown nonlinear dynamical systems. This framework directly enables pattern recognition, e.g., classification, of complex time-series data, distinguishing their dynamic behaviors by using the trajectories generated by the reduced linear systems. Moreover, we demonstrate the applicability and efficiency of this framework on time-series classification problems in bioinformatics and healthcare, including cognitive classification with fMRI data and seizure detection with EEG data. The developed Koopman dynamic learning framework lays a solid foundation for effective dynamic data mining and promises a mathematically justified method for extracting the dynamics and significant temporal structures of nonlinear dynamical systems.
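The idea of approximating a nonlinear flow with a low-dimensional linear model can be illustrated with a short sketch. The abstract does not specify the algorithm, so the code below is an assumption: it uses a standard dynamic mode decomposition (DMD) style least-squares fit as a stand-in for a finite-dimensional Koopman approximation, and the function names `fit_reduced_linear_model` and `koopman_features` are hypothetical, not taken from the paper.

```python
import numpy as np


def fit_reduced_linear_model(X, rank):
    """Fit a reduced linear map A_r such that x[k+1] ~ A x[k] (DMD-style).

    X has shape (n_features, n_timesteps); columns are consecutive snapshots.
    Returns the reduced operator (rank x rank) and the projection basis
    (n_features x rank). This is an illustrative assumption, not the
    authors' published procedure.
    """
    X0, X1 = X[:, :-1], X[:, 1:]                  # snapshot pairs (x[k], x[k+1])
    U, s, Vh = np.linalg.svd(X0, full_matrices=False)
    U_r, s_r, Vh_r = U[:, :rank], s[:rank], Vh[:rank, :]
    # A_r = U_r^T X1 V_r diag(1/s_r); the division scales the columns.
    A_r = (U_r.T @ X1 @ Vh_r.T) / s_r
    return A_r, U_r


def koopman_features(X, rank):
    """Summarize one time series by the spectrum of its reduced linear model.

    Returns the eigenvalue magnitudes followed by the eigenvalue phases,
    ordered by decreasing magnitude (ties broken by imaginary part).
    """
    A_r, _ = fit_reduced_linear_model(X, rank)
    eigvals = np.linalg.eigvals(A_r)
    order = np.lexsort((eigvals.imag, -np.abs(eigvals)))
    eigvals = eigvals[order]
    return np.concatenate([np.abs(eigvals), np.angle(eigvals)])
```

One feature vector per recording could then be passed to any standard classifier, for example a support vector machine, to separate recordings by their dynamic behavior. Using the eigenvalues as features is one possible design choice; the paper itself describes classification based on trajectories generated by the reduced linear systems.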
Additional information
Responsible editor: Eamonn Keogh.
This work was supported in part by the National Science Foundation under the awards CMMI-1763070 and ECCS-1509342 and the National Institutes of Health Grants R21EY027590 and R01GM131403.
About this article
Cite this article
Zhang, W., Yu, YC. & Li, JS. Dynamics reconstruction and classification via Koopman features. Data Min Knowl Disc 33, 1710–1735 (2019). https://doi.org/10.1007/s10618-019-00639-x
DOI: https://doi.org/10.1007/s10618-019-00639-x