Forecasting and Granger Modelling with Non-linear Dynamical Dependencies

  • Magda Gregorová (email author)
  • Alexandros Kalousis
  • Stéphane Marchand-Maillet
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10535)


Traditional linear methods for forecasting multivariate time series cannot satisfactorily model the non-linear dependencies that may exist in non-Gaussian series. We build on the theory of learning vector-valued functions in a reproducing kernel Hilbert space and develop a method for learning prediction functions that accommodate such non-linearities. The method learns not only the predictive function but also the matrix-valued kernel underlying the function search space, directly from the data. Our approach is based on learning multiple matrix-valued kernels, each composed of a set of input kernels and a set of output kernels learned in the cone of positive semi-definite matrices. In addition to superior predictive performance in the presence of strong non-linearities, our method also recovers the hidden dynamic relationships between the series and is thus a new alternative to existing graphical Granger techniques.
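To make the notion of a matrix-valued kernel built from an input kernel and an output kernel more concrete, the following is a minimal sketch of one-step-ahead forecasting with a separable kernel H(x, x') = k(x, x') L. It is not the authors' algorithm: the abstract describes learning the kernels from the data, whereas here the scalar Gaussian input kernel k and the positive definite output matrix L are simply fixed by hand. All function names, the toy series, and the values of gamma and lam are assumptions made for illustration.

```python
import numpy as np

def gaussian_gram(A, B, gamma):
    """Gram matrix of the scalar Gaussian kernel exp(-gamma * ||a - b||^2)."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fit_separable_kernel_ridge(X, Y, L, gamma, lam):
    """Solve K C L + lam * C = Y for the n x m coefficient matrix C."""
    n, m = Y.shape
    K = gaussian_gram(X, X, gamma)
    # Column-major identity vec(K C L) = (L^T kron K) vec(C); L is symmetric.
    A = np.kron(L, K) + lam * np.eye(n * m)
    c = np.linalg.solve(A, Y.flatten(order="F"))
    return c.reshape((n, m), order="F"), K

def predict(X_new, X_train, C, L, gamma):
    """Evaluate f(x) = sum_i k(x, x_i) L c_i for each row x of X_new."""
    return gaussian_gram(X_new, X_train, gamma) @ C @ L

# Toy bivariate series (assumed): series 0 depends non-linearly on lagged series 1.
rng = np.random.default_rng(0)
T, m = 120, 2
y = np.zeros((T, m))
for t in range(1, T):
    y[t, 0] = 0.8 * np.sin(y[t - 1, 1]) + 0.1 * rng.standard_normal()
    y[t, 1] = 0.5 * y[t - 1, 1] + 0.1 * rng.standard_normal()

X, Y = y[:-1], y[1:]                     # lag-1 inputs and one-step-ahead targets
L = np.array([[1.0, 0.3], [0.3, 1.0]])   # fixed output kernel (assumed, not learned)
gamma, lam = 1.0, 0.1                    # kernel width and ridge penalty (assumed)

C, K = fit_separable_kernel_ridge(X, Y, L, gamma, lam)
forecast = predict(y[-1:], X, C, L, gamma)   # forecast for the next time step
```

The Kronecker-product solve is cubic in n·m and is only practical for small problems; it is used here because it makes the role of the two kernels explicit, with K coupling time points and L coupling the output series.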



This work was partially supported by the research projects HSTS (ISNET) and RAWFIE #645220 (H2020). We thank Francesco Dinuzzo for helping to form the initial ideas behind this work through fruitful discussions during a visit to IBM Research, Dublin.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Magda Gregorová (1, 2), email author
  • Alexandros Kalousis (1, 2)
  • Stéphane Marchand-Maillet (2)

  1. Geneva School of Business Administration, HES-SO University of Applied Sciences of Western Switzerland, Geneva, Switzerland
  2. University of Geneva, Geneva, Switzerland
