Data Mining and Knowledge Discovery, Volume 30, Issue 2, pp 403–437

Classification of streaming time series under more realistic assumptions


Abstract

Much of the vast literature on time series classification makes several assumptions about the data and the algorithm’s eventual deployment that are almost certainly unwarranted. For example, many research efforts assume that the beginning and ending points of the pattern of interest can be correctly identified, during both the training phase and later deployment. Another example is the common assumption that queries will be made at a constant rate that is known ahead of time, so that computational resources can be exactly budgeted. In this work, we argue that these assumptions are unjustified, and that they have in many cases led to unwarranted optimism about the performance of the proposed algorithms. As we shall show, the task of correctly extracting individual gait cycles, heartbeats, gestures, behaviors, etc., is generally much more difficult than the task of actually classifying those patterns. Likewise, gesture classification systems deployed on a device such as Google Glass may issue queries at frequencies that range over an order of magnitude, making it difficult to plan computational resources. We propose to mitigate these problems by introducing an alignment-free time series classification framework. The framework requires only very weakly annotated data, such as “in this ten minutes of data, we see mostly normal heartbeats…,” and, by generalizing the classic machine learning idea of data editing to streaming/continuous data, allows us to build robust, fast and accurate anytime classifiers. We demonstrate on several diverse real-world problems that, beyond removing unwarranted assumptions and requiring essentially no human intervention, our framework is both extremely fast and significantly more accurate than current state-of-the-art approaches.

Keywords

Time series classification; Data dictionary; Anytime algorithms

Copyright information

© The Author(s) 2015

Authors and Affiliations

  1. Department of Computer Science & Engineering, University of California, Riverside, USA
