Geometric Interpretation of Correlation Networks Using the Singular Value Decomposition

  • Steve Horvath


The nodes of a correlation network correspond to the columns of a numeric data matrix datX. Based on the singular value decomposition (SVD) of datX, we are able to characterize approximately factorizable correlation networks, i.e., adjacency matrices that factor into node-specific contributions. The SVD yields singular vectors that have important practical applications. For example, the first left singular vector (referred to as the eigenvector) explains the maximum amount of variation of the columns of datX. The eigenvector is also known as the module eigengene in the context of a gene co-expression network module. Right singular vectors can be used for signal balancing, e.g., to remove batch effects and other technical artifacts. Based on the eigenvector (the first left singular vector), we define a new type of network concept, referred to as an eigenvector-based network concept. Eigenvector-based concepts are analogous to approximate conformity-based network concepts but have a major advantage: they often allow for a geometric interpretation based on the angular interpretation of correlations. The underlying structure of correlation networks affects network analysis results. For example, there are geometric reasons why intramodular hub nodes in important modules tend to be important, and why hub nodes in one module cannot be hubs in another distinct module. The hub node significance of a module can be interpreted as the angle between a sample trait and the module eigengene. Since the intramodular connectivity kIM_i is highly related to the module membership measure kME_i, it can be interpreted as the angle between x_i and the module eigenvector ME. A short dictionary for translating between the languages of data mining and network theory may facilitate communication between the two fields. Mouse and brain gene co-expression network applications are used to illustrate the results.
This work reviews and extends work with Jun Dong (Horvath and Dong PLoS Comput Biol 4(8):e1000117, 2008).
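The relationship between the eigenvector, the module membership measure kME_i, and the angular interpretation of correlations can be illustrated with a minimal sketch. It is not the chapter's own code: the function names (`module_eigengene`, `module_membership`), the simulated data, the column standardization step, and the sign convention for the eigenvector are all assumptions made for illustration. The sketch computes the first left singular vector of a standardized datX, takes kME_i as the correlation between column x_i and that vector, and converts each correlation into an angle.

```python
import numpy as np


def module_eigengene(dat_x):
    """First left singular vector of the column-standardized data matrix.

    Rows are samples, columns are nodes. Standardizing the columns is an
    assumption of this sketch.
    """
    z = (dat_x - dat_x.mean(axis=0)) / dat_x.std(axis=0, ddof=1)
    u, d, _ = np.linalg.svd(z, full_matrices=False)
    eigengene = u[:, 0]
    # The sign of a singular vector is arbitrary. As a convention chosen here,
    # orient it to correlate positively with the average standardized profile.
    if np.corrcoef(eigengene, z.mean(axis=1))[0, 1] < 0:
        eigengene = -eigengene
    # Proportion of variance explained by the first singular vector.
    prop_var = d[0] ** 2 / np.sum(d ** 2)
    return eigengene, prop_var


def module_membership(dat_x, eigengene):
    """kME_i = cor(x_i, ME) for every column x_i of dat_x."""
    n_nodes = dat_x.shape[1]
    return np.array(
        [np.corrcoef(dat_x[:, i], eigengene)[0, 1] for i in range(n_nodes)]
    )


# Hypothetical module: 50 samples, 5 nodes driven by one shared signal.
rng = np.random.default_rng(0)
signal = rng.normal(size=50)
loadings = np.array([0.9, 0.8, 0.7, 0.6, 0.5])
dat_x = np.outer(signal, loadings) + 0.3 * rng.normal(size=(50, 5))

eigengene, prop_var = module_eigengene(dat_x)
k_me = module_membership(dat_x, eigengene)

# Angular interpretation: a correlation is the cosine of the angle between
# the centered vectors, so the angle is arccos(kME_i).
angles = np.degrees(np.arccos(np.clip(k_me, -1.0, 1.0)))

for i, (k, a) in enumerate(zip(k_me, angles)):
    print(f"node {i}: kME = {k:.3f}, angle to eigenvector = {a:.1f} degrees")
print(f"proportion of variance explained: {prop_var:.3f}")
```

In this sketch, nodes with kME_i close to 1 form a small angle with the eigenvector, which is the geometric picture behind interpreting intramodular hub nodes as the nodes lying closest to the module eigenvector.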


  1. Dobra A, Hans C, Jones B, Nevins JR, Yao G, West M (2004) Sparse graphical models for exploring gene expression data. J Multivar Anal 90(1):196–212
  2. Alter O, Brown PO, Botstein D (2000) Singular value decomposition for genome-wide expression data processing and modelling. Proc Natl Acad Sci USA 97:10101–10106
  3. Carlson M, Zhang B, Fang Z, Mischel P, Horvath S, Nelson SF (2006) Gene connectivity, function, and sequence conservation: Predictions from modular yeast co-expression networks. BMC Genomics 7(7):40
  4. Dong J, Horvath S (2007) Understanding network concepts in modules. BMC Syst Biol 1(1):24
  5. Fuller TF, Ghazalpour A, Aten JE, Drake T, Lusis AJ, Horvath S (2007) Weighted gene coexpression network analysis strategies applied to mouse weight. Mamm Genome 18(6–7):463–472
  6. Ghazalpour A, Doss S, Zhang B, Plaisier C, Wang S, Schadt EE, Thomas A, Drake TA, Lusis AJ, Horvath S (2006) Integrating genetics and network analysis to characterize genes related to mouse weight. PLoS Genet 2(2):8
  7. Hibbs MA, Hess DC, Myers CL, Huttenhower C, Li K, Troyanskaya OG (2007) Exploring the functional landscape of gene expression: Directed search of large microarray compendia. Bioinformatics 23(20):2692–2699
  8. Holter NS, Mitra M, Maritan A, Cieplak M, Banavar JR, Fedoroff NV (2000) Fundamental patterns underlying gene expression profiles: Simplicity from complexity. Proc Natl Acad Sci USA 97(15):8409–8414
  9. Horvath S, Dong J (2008) Geometric interpretation of gene co-expression network analysis. PLoS Comput Biol 4(8):e1000117
  10. Horvath S, Zhang B, Carlson M, Lu KV, Zhu S, Felciano RM, Laurance MF, Zhao W, Shu Q, Lee Y, Scheck AC, Liau LM, Wu H, Geschwind DH, Febbo PG, Kornblum HI, Cloughesy TF, Nelson SF, Mischel PS (2006) Analysis of oncogenic signaling networks in glioblastoma identifies ASPM as a novel molecular target. Proc Natl Acad Sci USA 103(46):17402–17407
  11. Langfelder P, Horvath S (2007) Eigengene networks for studying the relationships between co-expression modules. BMC Syst Biol 1(1):54
  12. Liao JC, Boscolo R, Yang YL, Tran LM, Sabatti C, Roychowdhury VP (2003) Network component analysis: Reconstruction of regulatory signals in biological systems. Proc Natl Acad Sci USA 100(26):15522–15527
  13. Oldham MC, Konopka G, Iwamoto K, Langfelder P, Kato T, Horvath S, Geschwind DH (2008) Functional organization of the transcriptome in human brain. Nat Neurosci 11(11):1271–1282
  14. Shen R, Ghosh D, Chinnaiyan A, Meng Z (2006) Eigengene-based linear discriminant model for tumor classification using gene expression microarray data. Bioinformatics 22(21):2635–2642
  15. Spellman PT, Sherlock G, Zhang MQ, Iyer VR, Anders K, Eisen MB, Brown PO, Botstein D, Futcher B (1998) Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Mol Biol Cell 9(12):3273–3297
  16. Tamayo P, Scanfeld D, Ebert BL, Gillette MA, Roberts CW, Mesirov JP (2007) Metagene projection for cross-platform, cross-species characterization of global transcriptional states. Proc Natl Acad Sci USA 104(14):5959–5964
  17. West M, Blanchette C, Dressman H, Huang E, Ishida S, Spang R, Zuzan H, Olson JA, Marks JR, Nevins JR (2001) Predicting the clinical status of human breast cancer by using gene expression profiles. Proc Natl Acad Sci USA 98(20):11462–11467

Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  1. University of California, Los Angeles, Los Angeles, USA
