Semi-supervised Multivariate Sequential Pattern Mining

  • Zhao Xu
  • Koichi Funaya
  • Haifeng Chen
  • Sergio Leoni
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9607)

Abstract

Multivariate sequence analysis is of growing interest for learning from data that consist of numerous correlated time-stamped sequences. Such data are characterized by correlations among the dimensions of each multivariate sequence, so a multivariate sequence cannot simply be analyzed as multiple independent univariate sequences. At the same time, labeled data are usually expensive and difficult to obtain in many real-world applications. We present a graph-based semi-supervised learning framework for multivariate sequence classification. The framework explores the correlations within the multivariate sequences and exploits additional information about the distribution of both labeled and unlabeled data to improve predictive performance. We also develop an efficient method to extend the graph-based learning approach to out-of-sample prediction. We demonstrate the effectiveness of our approach on real-world multivariate sequence datasets from three domains.
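The abstract does not spell out the algorithm, so the following is only a generic sketch of what graph-based semi-supervised classification of multivariate sequences can look like, not the authors' method. It assumes a dependent multivariate dynamic time warping distance, a dense Gaussian-kernel adjacency matrix, and harmonic-function label propagation. All function names, the bandwidth heuristic, and the synthetic data in `demo` are hypothetical choices made for illustration.

```python
import numpy as np


def dtw_distance(a, b):
    """DTW between two sequences of shape (T, d), all dimensions warped jointly."""
    n, m = len(a), len(b)
    # Euclidean cost between every pair of time steps across all dimensions.
    cost = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1]
            )
    return acc[n, m]


def build_adjacency(sequences, k=3):
    """Gaussian-kernel adjacency matrix from pairwise DTW distances."""
    n = len(sequences)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = dtw_distance(sequences[i], sequences[j])
    # Assumed heuristic: bandwidth is the mean distance to the k-th nearest
    # neighbor (column 0 of the sorted rows is the node itself); needs n > k.
    sigma = np.sort(dist, axis=1)[:, k].mean()
    w = np.exp(-(dist ** 2) / (2.0 * sigma ** 2))
    np.fill_diagonal(w, 0.0)
    return w


def propagate_labels(w, labels, n_classes):
    """Harmonic-function label propagation; labels == -1 marks unlabeled nodes."""
    labeled = labels >= 0
    unlabeled = ~labeled
    laplacian = np.diag(w.sum(axis=1)) - w
    y_l = np.eye(n_classes)[labels[labeled]]  # one-hot labels of labeled nodes
    l_uu = laplacian[np.ix_(unlabeled, unlabeled)]
    w_ul = w[np.ix_(unlabeled, labeled)]
    # Solve (D_uu - W_uu) f_u = W_ul y_l for the unlabeled class scores.
    f_u = np.linalg.solve(l_uu, w_ul @ y_l)
    pred = labels.copy()
    pred[unlabeled] = f_u.argmax(axis=1)
    return pred


def demo(seed=0):
    """Ten synthetic 2-D sequences of varying length, one label per class."""
    rng = np.random.default_rng(seed)
    sequences, true_labels = [], []
    for cls in (0, 1):
        for _ in range(5):
            t = np.linspace(0.0, 2.0 * np.pi, rng.integers(20, 30))
            offset = 3.0 * cls  # class 1 is shifted so the classes separate
            clean = np.stack([np.sin(t) + offset, np.cos(t) - offset], axis=1)
            sequences.append(clean + 0.05 * rng.standard_normal(clean.shape))
            true_labels.append(cls)
    true_labels = np.array(true_labels)
    labels = np.full(len(sequences), -1)
    labels[[0, 5]] = true_labels[[0, 5]]  # reveal one label per class
    w = build_adjacency(sequences)
    return propagate_labels(w, labels, n_classes=2), true_labels
```

Calling `demo()` returns the predicted and true labels for the synthetic sequences. The sketch is transductive only: it labels the sequences already in the graph and does not attempt the out-of-sample extension mentioned in the abstract.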

Keywords

Adjacency Matrix; Dynamic Time Warping; Unlabeled Data; Graph Construction; Label Propagation
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Zhao Xu (1)
  • Koichi Funaya (1)
  • Haifeng Chen (2)
  • Sergio Leoni (1)

  1. NEC Laboratories Europe, Heidelberg, Germany
  2. NEC Laboratories America, Princeton, USA
