Abstract
With the biomedical field generating large quantities of time series data, there has been growing interest in developing and refining machine learning methods that allow such data to be mined and exploited. Classification is one of the most important and challenging machine learning tasks related to time series. Many biomedical phenomena, such as brain activity or blood pressure, change over time. The objective of this chapter is to provide a gentle introduction to time series classification. In the first part, we describe the characteristics of time series data and the challenges in its analysis. The second part provides an overview of common machine learning methods used for time series classification. A real-world use case, the early recognition of sepsis, demonstrates the applicability of the methods discussed.
Key words
- Time series
- Classification
- Onset detection
- Subsequence mining
- Deep learning
Notes
1. While we try to use notation consistently, we may occasionally need to diverge slightly from the notation introduced initially to keep equations readable. Where this is the case, we will say so in the text.
2. We refer to subsequences that are statistically significantly associated with a class label as shapelets. Note that this term is typically used for subsequences that maximize information gain.
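To make the information-gain reading of the term in Note 2 concrete, the following is a minimal sketch, not the chapter's own method. It scores one candidate subsequence by how well a distance threshold separates two classes. The function names (`min_distance`, `entropy`, `information_gain`), the toy data, and the threshold are all hypothetical and chosen only for illustration. The brute-force sliding-window Euclidean distance is an assumption made for brevity.

```python
import math


def min_distance(series, candidate):
    """Smallest Euclidean distance between `candidate` and any
    equal-length window of `series` (brute-force sliding window)."""
    m = len(candidate)
    return min(
        math.sqrt(sum((series[i + j] - candidate[j]) ** 2 for j in range(m)))
        for i in range(len(series) - m + 1)
    )


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum(
        (labels.count(c) / n) * math.log2(labels.count(c) / n)
        for c in set(labels)
    )


def information_gain(dataset, labels, candidate, threshold):
    """Entropy reduction obtained by splitting the data set into series
    that lie within `threshold` of `candidate` and series that do not."""
    near, far = [], []
    for series, label in zip(dataset, labels):
        if min_distance(series, candidate) <= threshold:
            near.append(label)
        else:
            far.append(label)
    n = len(labels)
    return (
        entropy(labels)
        - (len(near) / n) * entropy(near)
        - (len(far) / n) * entropy(far)
    )


# Hypothetical toy data: class 1 contains a short peak, class 0 does not.
dataset = [
    [0, 0, 1, 3, 1, 0],
    [0, 1, 3, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
]
labels = [1, 1, 0, 0]
candidate = [1, 3, 1]  # candidate subsequence (the peak)

gain = information_gain(dataset, labels, candidate, threshold=1.0)
print(f"information gain of candidate: {gain:.3f}")
```

In this toy setting the candidate separates the two classes perfectly, so the gain equals the entropy of the labels. A search over many candidates and thresholds would keep the one with the highest gain, whereas the statistical-significance view in Note 2 would instead test each candidate's association with the class label.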
Acknowledgments
The authors would like to thank Dr. Bastian Rieck, Dr. Damian Roqueiro, and Max Horn for their valuable input and discussion.
© 2021 Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Bock, C., Moor, M., Jutzeler, C.R., Borgwardt, K. (2021). Machine Learning for Biomedical Time Series Classification: From Shapelets to Deep Learning. In: Cartwright, H. (ed) Artificial Neural Networks. Methods in Molecular Biology, vol 2190. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-0826-5_2
DOI: https://doi.org/10.1007/978-1-0716-0826-5_2
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-0825-8
Online ISBN: 978-1-0716-0826-5
eBook Packages: Springer Protocols