A Novel Time Series Kernel for Sequences Generated by LTI Systems

  • Liliana Lo Presti
  • Marco La Cascia
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10113)


Hankelets were recently introduced to describe time series under the assumption that the series has been generated by a vector autoregressive (VAR) model of order p. Hankelet-based time series representations have been successful mainly in nearest neighbor classifiers, which raises the question of whether and how this representation can be used in kernel machines without resorting to the usual mid-level representations (such as codebook-based representations). It is also of interest to investigate how this representation relates to probabilistic approaches for time series modeling, and which characteristics of the VAR model a Hankelet can capture. This paper aims to fill these gaps by deriving a time series kernel function for Hankelets (TSK4H), demonstrating the relations between the derived TSK4H and earlier dissimilarity/similarity scores, and highlighting an alternative probabilistic interpretation of Hankelets.
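As a minimal sketch of the representation discussed above, the following Python code builds a block Hankel matrix from a multivariate sequence and compares two sequences through their Frobenius-normalized Gram matrices. The function names (`block_hankel`, `normalized_gram`, `gram_similarity`), the number of block rows, and the similarity score itself are assumptions made for illustration; this is not the TSK4H derived in the paper.

```python
import math


def block_hankel(series, num_block_rows):
    """Stack lagged copies of a multivariate series into a block Hankel matrix.

    series: list of T observations, each a list of d floats.
    Returns a (num_block_rows * d) x (T - num_block_rows + 1) nested list
    whose block (i, j) holds the observation series[i + j].
    """
    T = len(series)
    d = len(series[0])
    num_cols = T - num_block_rows + 1
    if num_cols < 1:
        raise ValueError("series is too short for the requested block rows")
    rows = []
    for i in range(num_block_rows):
        for k in range(d):
            rows.append([series[i + j][k] for j in range(num_cols)])
    return rows


def normalized_gram(H):
    """Return H H^T divided by its Frobenius norm (assumes H is not all zeros)."""
    G = [[sum(a * b for a, b in zip(r1, r2)) for r2 in H] for r1 in H]
    norm = math.sqrt(sum(v * v for row in G for v in row))
    return [[v / norm for v in row] for row in G]


def gram_similarity(series_a, series_b, num_block_rows=3):
    """Frobenius inner product of the two normalized Gram matrices.

    Illustrative score only (hypothetical, not the paper's kernel).
    Both series must have the same observation dimension d.
    """
    Ga = normalized_gram(block_hankel(series_a, num_block_rows))
    Gb = normalized_gram(block_hankel(series_b, num_block_rows))
    return sum(x * y for ra, rb in zip(Ga, Gb) for x, y in zip(ra, rb))


# Usage: two toy 2-D sequences with different oscillation frequencies.
slow = [[math.sin(0.3 * t), math.cos(0.3 * t)] for t in range(20)]
fast = [[math.sin(1.1 * t), math.cos(0.7 * t)] for t in range(25)]
print(gram_similarity(slow, slow))
print(gram_similarity(slow, fast))
```

Because each Gram matrix is normalized to unit Frobenius norm, the score does not depend on the amplitude of either sequence, and the Gram matrix size depends only on the number of block rows and the observation dimension, so sequences of different lengths can be compared.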

Experiments with an off-the-shelf SVM implementation and extensive validation in action classification and emotion recognition on several feature representations show that the proposed TSK4H achieves classification accuracy that matches or exceeds the state of the art reported in past work. In contrast to state-of-the-art time series kernel functions, which suffer from numerical issues and tend to produce diagonally dominant kernel matrices, empirical results suggest that the TSK4H has limited numerical issues in high-dimensional spaces. On three widely used public benchmarks, TSK4H consistently outperforms other time series kernel functions despite its simplicity and limited time complexity.
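The notion of a diagonally dominant kernel matrix mentioned above can be made concrete with a small diagnostic. The sketch below uses a plain Gaussian kernel on fixed-length vectors as a stand-in (it is not one of the time series kernels evaluated in the paper) and reports the ratio between the mean diagonal entry and the mean off-diagonal entry of the kernel matrix. The function names, the `gamma` value, and the ratio itself are illustrative assumptions.

```python
import math
import random


def gaussian_kernel_matrix(points, gamma):
    """Kernel matrix with entries exp(-gamma * ||x_i - x_j||^2)."""
    n = len(points)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            K[i][j] = math.exp(-gamma * sq_dist)
    return K


def diagonal_dominance_ratio(K):
    """Mean diagonal entry divided by mean off-diagonal entry.

    Values far above 1 indicate a nearly diagonal kernel matrix, in which
    every sample looks similar only to itself.
    """
    n = len(K)
    mean_diag = sum(K[i][i] for i in range(n)) / n
    mean_off = sum(
        K[i][j] for i in range(n) for j in range(n) if i != j
    ) / (n * (n - 1))
    return mean_diag / mean_off


rng = random.Random(0)  # fixed seed so the comparison is repeatable


def sample_points(dim, n=20):
    """Draw n standard normal vectors of the given dimension."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]


# Same kernel and bandwidth, low-dimensional versus high-dimensional inputs.
low_dim_ratio = diagonal_dominance_ratio(
    gaussian_kernel_matrix(sample_points(2), gamma=0.5)
)
high_dim_ratio = diagonal_dominance_ratio(
    gaussian_kernel_matrix(sample_points(200), gamma=0.5)
)
print(low_dim_ratio, high_dim_ratio)
```

With a fixed bandwidth, pairwise distances grow with the input dimension, so the off-diagonal entries shrink toward zero and the ratio grows; a kernel machine trained on such a matrix has little similarity information to exploit.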


Keywords: Support Vector Machine, Dynamic Time Warping, Hankel Matrix, Face Emotion Recognition, Precision Matrix



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. DIID, Università degli Studi di Palermo, Palermo, Italy
