Abstract
Graphical modelling strategies have recently been recognized as a versatile tool for analyzing multivariate stochastic processes. Vector autoregressive processes can be represented structurally by mixed graphs with both directed and undirected edges between the variables that represent the process components. To allow for more expressive vector autoregressive structures, we consider models with separate time dynamics for each directed edge and non-decomposable graph topologies for the undirected part of the mixed graph.
In contrast to static graphical models, the number of possible mixed graphs is extremely large even for small systems, and consequently standard Bayesian computation based on Markov chain Monte Carlo is not a feasible alternative for model learning in practice. To obtain a numerically efficient approach, we utilize a recent Bayesian information theoretic criterion for model learning, which has attractive properties when the potential model complexity is large relative to the size of the observed data set. The performance of our method is illustrated by analyzing both simulated and real data sets. Our simulation experiments demonstrate the gains in predictive accuracy that can be obtained by considering structural learning of vector autoregressive processes instead of unstructured models. The analysis of the real data also shows that the understanding of the dynamics of a multivariate process can be improved significantly by considering more flexible model classes.
Additional information
Editor: Zoubin Ghahramani.
This work was financially supported by the COMMIT graduate school, the research funds of University of Helsinki, and grant no. 121301 from the Academy of Finland. The authors are grateful to Prof. H. Karrasch, Universität Heidelberg, for the air pollution data set.
About this article
Cite this article
Marttinen, P., Corander, J. Bayesian learning of graphical vector autoregressions with unequal lag-lengths. Mach Learn 75, 217–243 (2009). https://doi.org/10.1007/s10994-009-5101-2
DOI: https://doi.org/10.1007/s10994-009-5101-2