Structural Equation Models and Directed Networks

  • Steve Horvath


Undirected networks, which are encoded in symmetric adjacency matrices, cannot describe causal relationships between random variables. Causal information is instead encoded in directed networks, where an arrow A → B indicates that variable A causally influences variable B. We refer to the process of assigning a causal direction to the edges of an association network as “edge orienting”. This section gives a short review of structural equation models (SEMs) and shows how they can be used to construct directed networks between random variables. SEMs lead to predictions about the variance–covariance matrix of the observed variables, which is why they are also known as covariance structure models. We review likelihood-based approaches for evaluating the fit of an SEM. In particular, we describe how local structural equations based on causal anchors can be used to infer causal networks among variables. Such causal networks have been used in systems genetics applications to infer causal relationships based on genetic markers. The network edge orienting (NEO) method and R software can orient the edges of correlation networks (also known as quantitative trait networks), provided the edges can be anchored to causal anchors (e.g., genetic polymorphisms). This section reviews and extends work with Jason Aten and Jake Lusis.
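The link between a causal model, its predicted covariance structure, and a likelihood-based fit measure can be illustrated with a minimal sketch. It is not the NEO software and makes several assumptions: all variables are standardized, the correlation matrix `S` is made up for illustration, and the helper names (`chain_implied_corr`, `ml_discrepancy`) are hypothetical. For a three-variable chain X → Y → Z, the model predicts cor(X, Z) = cor(X, Y) · cor(Y, Z); the sketch compares two orientations anchored at a marker M using the usual maximum likelihood discrepancy F = log|Σ| + tr(SΣ⁻¹) − log|S| − p.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def inv3(m):
    """Inverse of a 3x3 matrix via the adjugate formula."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = det3(m)
    adj = [
        [e * i - f * h, c * h - b * i, b * f - c * e],
        [f * g - d * i, a * i - c * g, c * d - a * f],
        [d * h - e * g, b * g - a * h, a * e - b * d],
    ]
    return [[v / det for v in row] for row in adj]

def ml_discrepancy(S, Sigma):
    """ML fit function F = log|Sigma| + tr(S Sigma^-1) - log|S| - p."""
    p = len(S)
    Sigma_inv = inv3(Sigma)
    trace = sum(S[i][j] * Sigma_inv[j][i] for i in range(p) for j in range(p))
    return math.log(det3(Sigma)) + trace - math.log(det3(S)) - p

def chain_implied_corr(S, first, middle, last):
    """Model-implied correlation matrix for the chain first -> middle -> last.

    Assumption: standardized variables, with the two path coefficients set to
    the observed correlations of adjacent variables. Only the correlation
    between the two ends of the chain is constrained by the model.
    """
    Sigma = [row[:] for row in S]
    implied = S[first][middle] * S[middle][last]
    Sigma[first][last] = implied
    Sigma[last][first] = implied
    return Sigma

# Illustrative (made-up) correlations among a marker M and two traits A, B.
M, A, B = 0, 1, 2
S = [
    [1.0, 0.6, 0.3],
    [0.6, 1.0, 0.5],
    [0.3, 0.5, 1.0],
]
n_samples = 100  # hypothetical sample size

f_causal = ml_discrepancy(S, chain_implied_corr(S, M, A, B))    # M -> A -> B
f_reactive = ml_discrepancy(S, chain_implied_corr(S, M, B, A))  # M -> B -> A

# (N - 1) * F is the usual model chi-square statistic; each chain model here
# has 1 degree of freedom (3 observed correlations, 2 free path coefficients).
chi2_causal = (n_samples - 1) * f_causal
chi2_reactive = (n_samples - 1) * f_reactive
print(f"M -> A -> B: F = {f_causal:.4f}, chi-square = {chi2_causal:.2f}")
print(f"M -> B -> A: F = {f_reactive:.4f}, chi-square = {chi2_reactive:.2f}")
```

In this toy input, cor(M, B) equals cor(M, A) · cor(A, B), so the orientation M → A → B reproduces the observed correlations with a discrepancy of essentially zero, while M → B → A has a clearly positive discrepancy. A comparison of this kind is what allows a causal anchor such as a genetic marker to suggest a direction for the edge between A and B.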


Keywords: Structural equation model; Exogenous variable; Endogenous variable; Causal model; Path diagram



Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  1. University of California, Los Angeles, Los Angeles, USA
