On Reduction of Data Series Dimensionality

  • Maciej Krawczak
  • Grażyna Szkatuła
Part of the Studies in Computational Intelligence book series (SCI, volume 530)


In this paper we introduce a multi-step procedure for reducing the dimensionality of multidimensional data series. Each step yields a new representation of the data series together with a reduction of dimension. The approach is based on the concept of aggregated envelopes of data series and on principal components, called here 'essential attributes', generated by a multilayer neural network; the essential attributes are taken from the outputs of the hidden-layer neurons. Next, all pairwise differences of the essential attributes are treated as new attributes. The real values of these new attributes are then nominalized, giving a nominal representation of the original data series while considerably reducing its dimension. The proposed approach has been verified in practice on time series classification and clustering problems; the results are reported in separate papers by the authors, and the brief summary given here confirms the utility of the dimension reduction procedure.
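The three stages described above (envelope aggregation, extraction of essential attributes, nominalization of attribute differences) can be sketched as follows. This is an illustrative outline only: the window size, the bin labels, and the use of SVD-based principal components as a stand-in for the trained hidden layer are assumptions of this sketch, not the authors' actual settings.

```python
import numpy as np

def aggregate_envelope(series: np.ndarray, window: int = 4) -> np.ndarray:
    """Stage 1: replace each window of the series by its (min, max) pair,
    producing a shorter 'aggregated envelope' representation."""
    n = len(series) // window
    trimmed = series[: n * window].reshape(n, window)
    return np.column_stack([trimmed.min(axis=1), trimmed.max(axis=1)]).ravel()

def essential_attributes(data: np.ndarray, k: int = 2) -> np.ndarray:
    """Stage 2: project the envelope vectors onto their first k principal
    components (here via SVD, standing in for the hidden-layer outputs
    of a trained multilayer network)."""
    centered = data - data.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T

def nominalize(values: np.ndarray, labels=("low", "mid", "high")) -> list:
    """Stage 3: discretize real-valued attributes into nominal symbols
    using equal-frequency bins."""
    edges = np.quantile(values, np.linspace(0, 1, len(labels) + 1)[1:-1])
    return [labels[int(np.searchsorted(edges, v))] for v in values]

# Toy data: 5 series of length 16.
rng = np.random.default_rng(0)
raw = rng.normal(size=(5, 16))
envs = np.array([aggregate_envelope(s) for s in raw])  # 16 -> 8 dims
ess = essential_attributes(envs, k=2)                  # 8 -> 2 dims
# Pairwise differences of essential attributes become the new attributes.
pairs = [(i, j) for i in range(ess.shape[1]) for j in range(i + 1, ess.shape[1])]
diffs = np.column_stack([ess[:, i] - ess[:, j] for i, j in pairs])
nominal = [nominalize(col) for col in diffs.T]         # real -> nominal
```

Each stage shrinks or re-encodes the representation, so the final nominal description is far smaller than the raw series, which is what makes the subsequent rule-based classification and clustering tractable.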


Keywords: Data series · Nominal attributes · Dimension reduction · Envelopes · Essential attributes · Data series mining


References

  1. Chan, K., Fu, A.W.: Efficient time series matching by wavelets. In: Proceedings of the 15th IEEE International Conference on Data Engineering, Sydney, Australia, pp. 126–133 (1999)
  2. Choy, E., Krawczak, M., Shannon, A., Szmidt, E. (eds.): A Survey of Generalized Nets. KvB Institute of Technology, Sydney (2007)
  3. Faloutsos, C., Ranganathan, M., Manolopoulos, Y.: Fast subsequence matching in time-series databases. SIGMOD Rec. 23, 519–529 (1994)
  4. Fu, T.C.: A review on time series data mining. Eng. Appl. Artif. Intell. 24, 164–181 (2011)
  5. Jolliffe, I.T.: Principal Component Analysis. Springer, New York (2002)
  6. Johnson, S.C.: Hierarchical clustering schemes. Psychometrika 32, 241–254 (1967)
  7. Kacprzyk, J., Szkatuła, G.: An inductive learning algorithm with a preanalysis of data. Int. J. Knowl. Based Intell. Eng. Syst. 3, 135–146 (1999)
  8. Kacprzyk, J., Szkatuła, G.: An integer programming approach to inductive learning using genetic and greedy algorithms. In: Jain, L.C., Kacprzyk, J. (eds.) New Learning Paradigms in Soft Computing. Studies in Fuzziness and Soft Computing, pp. 323–367. Physica-Verlag, Heidelberg (2002)
  9. Kacprzyk, J., Szkatuła, G.: A softened formulation of inductive learning and its use for coronary disease data. Lect. Notes Artif. Intell. 3488, 200–209 (2005a)
  10. Kacprzyk, J., Szkatuła, G.: An inductive learning algorithm with a partial completeness and consistence via a modified set covering problem. Lect. Notes Comput. Sci. 3697, 661–666 (2005b)
  11. Kacprzyk, J., Szkatuła, G.: Inductive learning: a combinatorial optimization. In: Koronacki, J., Ras, Z.W., Wierzchoń, S.T., Kacprzyk, J. (eds.) Advances in Machine Learning. Studies in Computational Intelligence, vol. 262. Springer (2010)
  12. Keogh, E., Chakrabarti, K., Pazzani, M.: Locally adaptive dimensionality reduction for indexing large time series databases. In: Proceedings of the ACM SIGMOD Conference on Management of Data, Santa Barbara, pp. 151–162, 21–24 May 2001
  13. Krawczak, M.: Multilayer Neural Systems and Generalized Net Models. Academic Publishing House EXIT, Warsaw (2003a)
  14. Krawczak, M.: Heuristic dynamic programming—learning as control problem. In: Rutkowski, L., Kacprzyk, J. (eds.) Neural Networks and Soft Computing, pp. 218–223. Physica-Verlag, Heidelberg (2003b)
  15. Krawczak, M.: A novel modelling methodology: generalized nets. In: Cader, A., Rutkowski, L., Tadeusiewicz, R., Żurada, J. (eds.) Artificial Intelligence and Soft Computing. Academic Publishing House EXIT, Warsaw (2006)
  16. Krawczak, M., Szkatuła, G., et al.: On decision rules application to time series classification. In: Atanassov, K.T. (ed.) Advances in Fuzzy Sets, Intuitionistic Fuzzy Sets, Generalized Nets and Related Topics. Academic Publishing House EXIT, Warsaw (2008)
  17. Krawczak, M., Szkatuła, G.: Time series envelopes for classification. In: IEEE Intelligent Systems Conference, London, 7–9 July 2010 (2010a)
  18. Krawczak, M., Szkatuła, G.: On time series envelopes for classification problems. In: Atanassov, K.T., et al. (eds.) Developments in Fuzzy Sets, Intuitionistic Fuzzy Sets, Generalized Nets and Related Topics, vol. II. SRI PAS, Warsaw (2010b)
  19. Krawczak, M., Szkatuła, G.: Dimensionality reduction for time series. Case Stud. Pol. Assoc. Knowl. 31, 32–45 (2010c)
  20. Krawczak, M., Szkatuła, G.: A hybrid approach for dimension reduction in classification. Control Cybern. 40(2), 527–552 (2011)
  21. Lin, J., Keogh, E., Patel, P., Lonardi, S.: Finding motifs in time series. In: The 2nd Workshop on Temporal Data Mining, the 8th ACM International Conference on Knowledge Discovery and Data Mining, Edmonton, Canada, pp. 53–68 (2002)
  22. Lin, J., Keogh, E., Wei, L., Lonardi, S.: Experiencing SAX: a novel symbolic representation of time series. Data Min. Knowl. Disc. 15(2), 107–144 (2007)
  23. Matheus, C., Rendell, L.: Constructive induction on decision trees. In: Proceedings of the Eleventh International Joint Conference on Artificial Intelligence. Morgan Kaufmann, San Mateo, CA (1989)
  24. Nanopoulos, A., Alcock, R., Manolopoulos, Y.: Feature-based classification of time-series data. Int. J. Comput. Res. 10, 49–61 (2001)
  25. Oja, E.: Principal components, minor components and linear neural networks. Neural Netw. 5, 927–935 (1992)
  26. Rodríguez, J.J., Alonso, C.J.: Interval and dynamic time warping-based decision trees. In: Proceedings of the 2004 ACM Symposium on Applied Computing (SAC), pp. 548–552 (2004)
  27. Shahabi, C., Tian, X., Zhao, W.: TSA-tree: a wavelet-based approach to improve the efficiency of multi-level surprise and trend queries. In: Proceedings of the 12th International Conference on Scientific and Statistical Database Management, Berlin, pp. 55–68 (2000)
  28. Szkatuła, G.: Machine learning from examples under errors in data. Ph.D. thesis, SRI PAS, Warsaw, Poland (1995)
  29. Szkatuła, G.: Application of modified covering problem in machine learning. In: Gutenbaum, J. (ed.) Automatics Control Management, pp. 431–445. SRI PAS, Warsaw (2002)
  30. Szkatuła, G., Kacprzyk, J.: An inductive learning algorithm with a partial completeness and consistency. In: Dramiński, M., Grzegorzewski, P., Trojanowski, T., Zadrożny, S. (eds.) Issues in Intelligent Systems. Models and Techniques, pp. 229–246. EXIT, Warszawa (2005)
  31. Wang, B.: A new clustering algorithm on nominal data sets. In: Proceedings of the International MultiConference of Engineers and Computer Scientists 2010 (IMECS 2010), Hong Kong, 17–19 March 2010
  32. Wu, Y., Chang, E.Y.: Distance-function design and fusion for sequence data. In: Proceedings of CIKM '04, pp. 324–333 (2004)
  33. Yang, Q., Wu, X.: 10 challenging problems in data mining research. Int. J. Inf. Technol. Decis. Making 5(4), 597–604 (2006)

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland