Applications of Graphical Models in Quantitative Genetics and Genomics

  • Guilherme J. M. Rosa
  • Vivian P. S. Felipe
  • Francisco Peñagaricano


In this chapter, we provide a brief introduction to graphical models, with an emphasis on Bayesian networks, and discuss some of their applications in genetics and genomics studies of agricultural and livestock species. First, key definitions regarding stochastic graphical models are provided, along with basic principles of inference on graphical structure and model parameters. Next, we discuss examples of applications, which include the prediction of complex traits using genomic information or other correlated traits, and the investigation of the flow of information from DNA polymorphisms to endpoint phenotypes, including intermediate phenotypes such as gene expression. A first prediction example concerns forecasting total egg production in quails using early expressed traits (such as weekly body weight, partial egg production, and egg quality traits) as explanatory variables, to support decision making (e.g., earlier culling decisions) in production and breeding systems. A second example uses genomic information to estimate the genetic merit of selection candidates for the genetic improvement of economically important traits. A causal inference example deals with the network underlying carcass fat deposition and muscularity in pigs, inferred by jointly modeling phenotypic, genotypic, and transcriptomic data. Additional applications of Bayesian networks and other graphical model techniques are also highlighted, including multitrait quantitative trait loci (QTL) analysis and structural equation models with latent variables. Graphical models such as Bayesian networks are shown to offer a powerful and insightful approach to both prediction and causal inference, with many applications in genetics and genomics and in the study of complex phenotypic traits in agriculture.


Keywords: Quantitative Trait Locus; Bayesian Network; Directed Acyclic Graph; Genomic Selection; Conditional Independency



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Guilherme J. M. Rosa (1)
  • Vivian P. S. Felipe (2)
  • Francisco Peñagaricano (3)

  1. University of Wisconsin, Madison, USA
  2. Cobb-Vantress Inc., Siloam Springs, USA
  3. University of Florida, Gainesville, USA
