Genetica, Volume 136, Issue 2, pp 319–332

Developments in statistical analysis in quantitative genetics

Abstract

Research in statistical genetics has gained remarkable momentum since the last World Conference, stimulated by breakthroughs in molecular genetics, automated data-recording devices and computer-intensive statistical methods. The latter were revolutionized by the bootstrap and by Markov chain Monte Carlo (McMC). This overview uses a number of specific areas to illustrate the enormous flexibility that McMC provides for fitting models and exploring features of data that were previously inaccessible. The selected areas are inference of the trajectories over time of genetic means and variances, models for the analysis of categorical and count data, the statistical genetics of a model postulating that environmental variance is partly under genetic control, and a short discussion of models that incorporate massive genetic marker information. We also give an overview of the application of McMC to the study of model fit and, finally, discuss the development of efficient McMC updating schemes for non-standard models.

Keywords

Statistical genetics, McMC, Genetic models

Copyright information

© Springer Science+Business Media B.V. 2008

Authors and Affiliations

  1. Department of Genetics and Biotechnology, Faculty of Agricultural Sciences, University of Aarhus, Tjele, Denmark
