Abstract
Research in statistical genetics has gained remarkable momentum since the last World Conference, stimulated by breakthroughs in molecular genetics, automated data-recording devices and computer-intensive statistical methods. The latter were revolutionized by the bootstrap and by Markov chain Monte Carlo (McMC). This overview uses a number of specific areas to illustrate the enormous flexibility that McMC has provided for fitting models and for exploring features of data that were previously inaccessible. The selected areas are inference of the trajectories over time of genetic means and variances, models for the analysis of categorical and count data, the statistical genetics of a model postulating that environmental variance is partly under genetic control, and a short discussion of models that incorporate massive genetic marker information. We also provide an overview of the application of McMC to the study of model fit and, finally, discuss the development of efficient McMC updating schemes for non-standard models.
Cite this article
Sorensen, D. Developments in statistical analysis in quantitative genetics. Genetica 136, 319–332 (2009). https://doi.org/10.1007/s10709-008-9303-5
DOI: https://doi.org/10.1007/s10709-008-9303-5