Machine Learning, Volume 58, Issue 2–3, pp. 179–216

Classification of Multivariate Time Series and Structured Data Using Constructive Induction

  • Mohammed Waleed Kadous
  • Claude Sammut

We present a method of constructive induction aimed at learning tasks involving multivariate time series data. Using metafeatures, the scope of attribute-value learning is expanded to domains with instances that have some kind of recurring substructure, such as strokes in handwriting recognition, or local maxima in time series data. The types of substructures are defined by the user, but are extracted automatically and are used to construct attributes.

Metafeatures are applied to two real domains: sign language recognition and ECG classification. Using metafeatures, we are able to generate classifiers that are either comprehensible or accurate, producing results comparable both to hand-crafted preprocessing and to human experts.
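The metafeature idea described above — extract parametrised events of a user-defined type from each series, group the observed event parameters into regions, and use region membership as constructed boolean attributes — can be sketched roughly as follows. This is an illustrative toy, not the paper's implementation: the event type (local maxima), the equal-width "clustering" stand-in, and all function names are assumptions.

```python
# Hedged sketch of metafeature-style constructive induction.
# 1. A user-defined metafeature extracts parametrised events from
#    each series (here: local maxima, as (time, height) pairs).
# 2. Event parameters observed in training data are grouped into
#    regions (here: equal-width time bins, standing in for clustering).
# 3. Each region becomes a boolean attribute: "does this series
#    contain an event falling in that region?"

def local_maxima(series):
    """Extract (time, value) events at strict local maxima of a 1-D series."""
    return [(t, series[t])
            for t in range(1, len(series) - 1)
            if series[t - 1] < series[t] > series[t + 1]]

def make_regions(all_events, k):
    """Split the observed event-time range into k equal-width bins
    (a crude stand-in for clustering event parameters)."""
    times = sorted(t for t, _ in all_events)
    lo, hi = times[0], times[-1]
    width = (hi - lo) / k or 1
    return [(lo + i * width, lo + (i + 1) * width) for i in range(k)]

def attributes(series, regions):
    """Propositionalise one series: one boolean attribute per region."""
    events = local_maxima(series)
    return [any(lo <= t <= hi for t, _ in events) for lo, hi in regions]

# Two toy training series: an early peak and a late peak.
train = [[0, 3, 0, 0, 0, 0], [0, 0, 0, 0, 5, 0]]
events = [e for s in train for e in local_maxima(s)]
regions = make_regions(events, 2)
print([attributes(s, regions) for s in train])
# → [[True, False], [False, True]]
```

The resulting boolean vectors can then be fed to any attribute-value learner (the paper uses standard propositional learners downstream), which is the sense in which this is propositionalisation.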


Keywords: time series; constructive induction; propositionalisation; substructure



Copyright information

© Springer Science + Business Media, Inc. 2005

Authors and Affiliations

  1. School of Computer Science and Engineering, University of New South Wales, Sydney, Australia
