Causality Discovery with Additive Disturbances: An Information-Theoretical Perspective

  • Kun Zhang
  • Aapo Hyvärinen
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5782)


We consider causally sufficient acyclic causal models in which the relationships among the variables are nonlinear while the disturbances have linear (additive) effects, and show that three principles are equivalent: the causal Markov condition (together with the independence between each disturbance and the corresponding parents), minimum disturbance entropy, and mutual independence of the disturbances. This equivalence motivates new and more efficient methods for some causal discovery problems. In particular, we propose to use multichannel blind deconvolution, an extension of independent component analysis, to perform Granger causality analysis with instantaneous effects. This approach gives more accurate estimates of the parameters and can easily incorporate sparsity constraints. For additive disturbance-based nonlinear causal discovery, we first use the conditional independence relationships to obtain the equivalence class; the causal directions left undetermined are then found by nonlinear regression and pairwise independence tests. This avoids a brute-force search and greatly reduces the computational load.
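The last step of the abstract, orienting an undetermined edge by nonlinear regression followed by an independence check, can be illustrated with a minimal sketch. This is not the authors' implementation. As labeled assumptions, a cubic polynomial least-squares fit stands in for the nonlinear regressor, a biased HSIC statistic with Gaussian kernels stands in for a calibrated independence test, and the data, the function names, and the "smaller statistic wins" decision rule are all hypothetical choices made for illustration.

```python
import numpy as np


def gaussian_gram(v):
    """Gaussian kernel Gram matrix; bandwidth set by the median heuristic."""
    d2 = (v[:, None] - v[None, :]) ** 2
    return np.exp(-d2 / np.median(d2[d2 > 0]))


def hsic(a, b):
    """Biased empirical HSIC; larger values suggest stronger dependence."""
    n = len(a)
    centering = np.eye(n) - np.ones((n, n)) / n
    k_centered = centering @ gaussian_gram(a) @ centering
    # trace(K H L H) / n^2, written as an elementwise sum (L is symmetric).
    return np.sum(k_centered * gaussian_gram(b)) / n ** 2


def residual_dependence(cause, effect, degree=3):
    """Regress effect on cause, then measure dependence between the
    hypothesised cause and the regression residual (assumed stand-ins)."""
    coeffs = np.polyfit(cause, effect, degree)
    residual = effect - np.polyval(coeffs, cause)
    return hsic(cause, residual)


# Synthetic pair generated as y = f(x) + disturbance, so x -> y is the truth.
rng = np.random.default_rng(0)
x = rng.uniform(-2.0, 2.0, 500)
y = x ** 3 + rng.uniform(-1.0, 1.0, 500)

forward = residual_dependence(x, y)   # hypothesis x -> y
backward = residual_dependence(y, x)  # hypothesis y -> x

# Prefer the direction whose residual looks more independent of the input.
print("x -> y" if forward < backward else "y -> x")
```

In a fuller pipeline this comparison would be applied only to the edges left undirected within the equivalence class, and the raw statistic would be replaced by a test with a significance threshold.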


Keywords: Mean Square Error; Causal Relation; Directed Acyclic Graph; Granger Causality; Conditional Independence



Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Kun Zhang (1)
  • Aapo Hyvärinen (1, 2)
  1. Dept of Computer Science & HIIT, University of Helsinki, Finland
  2. Dept of Mathematics and Statistics, University of Helsinki, Finland
