A Weighted SNP Correlation Network Method for Estimating Polygenic Risk Scores

  • Morgan E. Levine (Email author)
  • Peter Langfelder
  • Steve Horvath
Part of the Methods in Molecular Biology book series (MIMB, volume 1613)


Polygenic scores are useful for examining the joint associations of genetic markers. However, because traditional methods simply sum weighted allele counts, they may fail to capture the complexity of the underlying biology. Here we describe a network-based method, which we call weighted SNP correlation network analysis (WSCNA), and demonstrate how it can be used to generate meaningful polygenic scores. Using data on human height in a US population of non-Hispanic whites, we illustrate how this method can be used to identify SNP networks from GWAS data, create network-specific polygenic scores, examine network topology to identify hub SNPs, and gain biological insights into complex traits. In our example, this method explains a larger proportion of the variance in human height than traditional polygenic score methods. We also identify hub genes and pathways that have previously been reported to influence human height. Going forward, this method may be useful for generating genetic susceptibility measures for other health-related traits, examining genetic pleiotropy, identifying at-risk individuals, examining gene score by environment effects, and gaining a deeper understanding of the underlying biology of complex traits.
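The contrast drawn above between summing weighted allele counts and analyzing a weighted SNP correlation network can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the soft-threshold power of 6, the use of absolute Pearson correlation, and the toy genotype data are all hypothetical choices made for this example.

```python
import numpy as np


def traditional_polygenic_score(genotypes, weights):
    """Traditional score: per-individual sum of weighted allele counts.

    genotypes: (n_individuals, n_snps) array of allele counts (0, 1, or 2).
    weights:   (n_snps,) array of per-SNP effect sizes, e.g. from a GWAS.
    """
    return np.asarray(genotypes, dtype=float) @ np.asarray(weights, dtype=float)


def soft_threshold_adjacency(genotypes, power=6):
    """Weighted SNP network: |Pearson correlation| raised to a power.

    The power (an assumed value here) shrinks weak correlations toward zero
    while keeping the network weighted rather than thresholded to 0/1.
    Every SNP column must vary across individuals, otherwise the
    correlation is undefined.
    """
    corr = np.corrcoef(np.asarray(genotypes, dtype=float), rowvar=False)
    adjacency = np.abs(corr) ** power
    np.fill_diagonal(adjacency, 0.0)  # ignore self-connections
    return adjacency


def connectivity(adjacency):
    """Connectivity of each SNP: the sum of its connection strengths."""
    return adjacency.sum(axis=1)


# Toy data (hypothetical): SNPs 0-2 are noisy copies of one shared signal,
# SNPs 3-4 are independent of it and of each other.
rng = np.random.default_rng(0)
n_individuals = 200
shared = rng.integers(0, 3, size=n_individuals)
noisy_copies = [
    np.where(rng.random(n_individuals) < 0.9,
             shared,
             rng.integers(0, 3, size=n_individuals))
    for _ in range(3)
]
independent = [rng.integers(0, 3, size=n_individuals) for _ in range(2)]
toy_genotypes = np.column_stack(noisy_copies + independent)

toy_weights = np.array([0.2, 0.1, 0.3, 0.05, -0.1])  # made-up effect sizes
scores = traditional_polygenic_score(toy_genotypes, toy_weights)

adjacency = soft_threshold_adjacency(toy_genotypes, power=6)
hub_snp = int(np.argmax(connectivity(adjacency)))  # most connected SNP
```

In this sketch a hub SNP is simply the SNP with the highest connectivity. Grouping SNPs into networks and building a network-specific score from each group are further steps that are not shown here.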

Key words

Polygenic score, Weighted network, GWAS, Height



Copyright information

© Springer Science+Business Media LLC 2017

Authors and Affiliations

  • Morgan E. Levine (1, 2), Email author
  • Peter Langfelder (2)
  • Steve Horvath (1, 3)

  1. Department of Human Genetics, University of California, Los Angeles, USA
  2. Semel Institute for Neuroscience and Human Behavior, University of California, Los Angeles, USA
  3. Department of Biostatistics, University of California, Los Angeles, USA
