A Gene-Free Formulation of Classical Quantitative Genetics Used to Examine Results and Interpretations Under Three Standard Assumptions

  • Regular Article
  • Acta Biotheoretica

Abstract

Quantitative genetics (QG) analyses variation in traits of humans, other animals, or plants in ways that take account of the genealogical relatedness of the individuals whose traits are observed. “Classical” QG, where the analysis of variation does not involve data on measurable genetic or environmental entities or factors, is reformulated in this article using models that are free of hypothetical, idealized versions of such factors, while still allowing for defined degrees of relatedness among kinds of individuals or “varieties.” The gene-free formulation encompasses situations encountered in human QG as well as in agricultural QG. This formulation is used to describe three standard assumptions involved in classical QG and provide plausible alternatives. Several concerns about the partitioning of trait variation into components and its interpretation, most of which have a long history of debate, are discussed in light of the gene-free formulation and alternative assumptions. That discussion is at a theoretical level, not dependent on empirical data in any particular situation. Additional lines of work to put the gene-free formulation and alternative assumptions into practice and to assess their empirical consequences are noted, but lie beyond the scope of this article. The three standard QG assumptions examined are: (1) partitioning of trait variation into components requires models of hypothetical, idealized genes with simple Mendelian inheritance and direct contributions to the trait; (2) all other things being equal, similarity in traits for relatives is proportional to the fraction shared by the relatives of all the genes that vary in the population (e.g., fraternal or dizygotic twins share half of the variable genes that identical or monozygotic twins share); (3) in analyses of human data, genotype-environment interaction variance (in the classical QG sense) can be discounted. 
The concerns about the partitioning of trait variation discussed include: the distinction between traits and underlying measurable factors; the possible heterogeneity in factors underlying the development of a trait; the kinds of data needed to estimate key empirical parameters; and interpretations based on contributions of hypothetical genes; as well as, in human studies, the labeling of residual variance as a non-shared environmental effect; and the importance of estimating interaction variance.

References

  • Couzin-Frankel J (2010) Major heart disease genes prove elusive. Science 328:1220–1221

  • Donoghue JR, Collins LM (1990) A note on the unbiased estimation of the intraclass correlation. Psychometrika 55:159–164

  • Falconer DS, Mackay TFC (1996) Introduction to quantitative genetics, 4th edn. Longman, Harlow

  • Feng R, Zhou G, Zhang M, Zhang H (2009) Analysis of twin data using SAS. Biometrics 65:584–589

  • Holland JB, Nyquist WE, Cervantes-Martínez C (2003) Estimating and interpreting heritability for plant breeding: an update. Plant Breed Rev 22:9–112

  • Howell DC (2002) Intraclass correlation: for unordered pairs. http://www.uvm.edu/~dhowell/StatPages/More_Stuff/icc/icc.html. Accessed 10 Jan 2007

  • Jacquard A (1983) Heritability: one word, three concepts. Biometrics 39:465–477

  • Kendler KS, Prescott CA (2006) Genes, environment, and psychopathology: understanding the causes of psychiatric and substance abuse disorders. The Guilford Press, New York

  • Layzer D (1974) Heritability analyses of IQ scores: science or numerology? Science 183:1259–1266

  • Lewontin RC (1974) The analysis of variance and the analysis of causes. Am J Hum Genet 26:400–411

  • Lynch M, Walsh B (1998) Genetics and analysis of quantitative traits. Sinauer, Sunderland

  • McCarthy MI, Abecasis GR, Cardon LR, Goldstein DB, Little J, Ioannidis JPA, Hirschhorn JN (2008) Genome-wide association studies for complex traits: consensus, uncertainty and challenges. Nat Rev Genet 9:356–369

  • McClellan J, King M-C (2010) Genetic heterogeneity in human disease. Cell 141:210–217

  • Moffitt TE, Caspi A, Rutter M (2005) Strategy for investigating interactions between measured genes and measured environments. Arch Gen Psychiatry 62:473–481

  • Nuffield Council on Bioethics (2002) Genetics and human behavior: the ethical context. http://www.nuffieldbioethics.org. Accessed 22 June 2007

  • Parens EA, Chapman N (eds) (2006) Wrestling with behavioral genetics: science, ethics, and public conversation. Johns Hopkins University Press, Baltimore

  • Plomin R (1999) Genetics and general cognitive ability. Nature 402:C25–C29

  • Plomin R, DeFries JC, Loehlin JC (1977) Genotype-environment interaction and correlation in the analysis of human behavior. Psychol Bull 84:309–322

  • Plomin R, DeFries JC, McClearn GE, Rutter M (1997) Behavioral genetics. Freeman, New York

  • Rijsdijk FV, Sham PC (2002) Analytic approaches to twin data using structural equation models. Brief Bioinform 3:119–133

  • Rutter M (2002) Nature, nurture, and development: from evangelism through science toward policy and practice. Child Dev 73:1–21

  • Taylor PJ (2010) Three puzzles and eight gaps: what heritability studies and critical commentaries have not paid enough attention to. Biol Philos 25:1–31

  • Turkheimer E, Waldron M (2000) Nonshared environment: a theoretical, methodological, and quantitative review. Psychol Bull 126:78–108

  • Visscher PM, Medland SE, Ferreira M, Morley K, Zhu G, Cornes B, Montgomery GW, Martin NG (2006) Assumption-free estimation of heritability from genome-wide identity-by-descent sharing between full siblings. PLoS Genet 2:e41. doi:10.1371/journal.pgen.0020041

  • Visscher PM, Macgregor S, Benyamin B, Zhu G, Gordon S, Medland S, Hill WG et al (2007) Genome partitioning of genetic variation for height from 11,214 sibling pairs. Am J Hum Genet 81:1104–1110

  • Wahlsten D (2000) Analysis of variance in the service of interactionism. Hum Dev 43:46–50

  • Zuk O, Hechter E, Sunyaev S, Lander E (2012) The mystery of missing heritability: genetic interactions create phantom heritability. PNAS 109:1193–1198

Acknowledgments

This article is based on research supported by the National Science Foundation under grant SES-0634744. The comments on previous versions by anonymous reviewers and an editor helped in the revision process.

Author information

Corresponding author

Correspondence to Peter J. Taylor.

Appendices

1.1 Appendix 1: Gene-Free Formulations for Classes of Relatives Other Than Twins

Gene-free formulations can be derived and applied through five steps:

  1. Define varieties in terms of the progenitors for individuals in the variety, e.g., a clone, a pair of parents, one mother and unrelated fathers, a pair of grandparents.

  2. Specify the variant of Eq. 8 that encompasses the different kinds of relatives possible for such progenitors (e.g., sibs, first cousins).

  3. Spell out the intraclass correlation equations for the different kinds of relatives in the different circumstances (e.g., raised together, raised apart).

  4. Rearrange the intraclass correlation equations to produce estimators for the variance fractions and for any empirically determined parameter that had to be introduced to take degree of relatedness into account.

  5. Collect data for the classes of relatives needed in order to estimate the variance fractions and parameters of interest using the equations from step 4.

1.2 Appendix 2: Numerical Illustrations of the Gene-Free Formulation of Sect. 2 and of the Contrasting Results under the Standard and Alternative Assumptions of Sect. 3

This Appendix provides results from applying the formulas of Sects. 2.2 and 3.2 to data sets generated for a range of values of broad-sense heritability and the other fractions of variance. First, data sets of 100 varieties and 100 locations were randomly generated with two MZ twins and two DZ twins for each variety–location combination. Each row of Table 4 corresponds to a pair of V/Y and γ values drawn from the ranges [.2–.8] and [.3–.8], respectively. L, VL, and E were set equal to (1–V)/3. V and VL were set to γ·V and γ·VL, respectively; T and TL were set to (1−γ)·V and (1−γ)·VL, respectively. Equations 37 were used to calculate from the full data set the actual fractions for V, L, VL, and E for MZ and for DZ twins. The value given in Table 4 is the average of the two. (The actual fractions differ slightly from the pre-set values because of the random generation process.) Equation 12 was used to calculate the actual values for γ. The values for broad-sense heritability and the fraction of variation due to differences among location averages were estimated first using the conventional formulas (Eqs. 18, 19) and then using the adjusted formulas that factor in γ (Eqs. 15, 16). The fraction due to noise is 1 minus the sum of these estimates for broad-sense heritability and the location fraction.
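The data-generation scheme described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the article's own code. The function names (`simulate_twins`, `icc_unordered`, `estimates`), the normally distributed contributions, the normalization Y = 1, and the double-entry estimator of the intraclass correlation are all choices made here. Because Eqs. 15, 16, 18, and 19 are not reproduced in this section, the sketch assumes that the intraclass correlations are r_MZ = H + L/Y and r_DZ = γH + L/Y with H = (V + VL)/Y, and it uses the familiar twin-study formulas 2(r_MZ − r_DZ) and 2r_DZ − r_MZ as the conventional estimators.

```python
import numpy as np

def icc_unordered(a, b):
    """Intraclass correlation for unordered pairs (double-entry Pearson r)."""
    x = np.concatenate([a, b])
    y = np.concatenate([b, a])
    return np.corrcoef(x, y)[0, 1]

def simulate_twins(V=0.5, gamma=0.6, n_var=100, n_loc=100, seed=0):
    """Hypothetical generator: one MZ pair and one DZ pair per variety-location cell."""
    rng = np.random.default_rng(seed)
    L = VL = E = (1.0 - V) / 3.0              # pre-set fractions, taking Y = 1
    shape = (n_var, n_loc)
    draw = lambda var, size: rng.normal(0.0, np.sqrt(var), size)

    loc = draw(L, (1, n_loc))                 # location contributions l_j
    # MZ twins share the whole variety and interaction contributions.
    v, vl = draw(V, (n_var, 1)), draw(VL, shape)
    mz = [(v + loc + vl + draw(E, shape)).ravel() for _ in range(2)]
    # DZ twins share only a fraction gamma of those contributions (assumption);
    # the remainder (T, TL in the text) is drawn separately for each twin.
    v_sh, vl_sh = draw(gamma * V, (n_var, 1)), draw(gamma * VL, shape)
    dz = [(v_sh + draw((1 - gamma) * V, shape) + loc
           + vl_sh + draw((1 - gamma) * VL, shape)
           + draw(E, shape)).ravel() for _ in range(2)]
    return mz, dz

def estimates(r_mz, r_dz, gamma):
    """Adjusted (assumed form) and conventional estimates of H and L/Y."""
    adj_h = (r_mz - r_dz) / (1.0 - gamma)
    adj_l = r_mz - adj_h
    conv_h = 2.0 * (r_mz - r_dz)
    conv_l = 2.0 * r_dz - r_mz
    return adj_h, adj_l, conv_h, conv_l

gamma = 0.6
mz, dz = simulate_twins(V=0.5, gamma=gamma, seed=0)
r_mz, r_dz = icc_unordered(*mz), icc_unordered(*dz)
adj_h, adj_l, conv_h, conv_l = estimates(r_mz, r_dz, gamma)
print(f"adjusted H = {adj_h:.3f}, conventional H = {conv_h:.3f}")
print(f"adjusted L/Y = {adj_l:.3f}, conventional L/Y = {conv_l:.3f}")
```

With only 100 varieties and 100 locations, the realized variance fractions drift from the pre-set values, so the estimates should be read against the realized fractions rather than the nominal ones.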

Table 4 Estimates of broad-sense heritability and L/Y derived from full simulated data set for a series of combinations of actual values

The results in Table 4 indicate that: (a) the adjusted estimates of broad-sense heritability are close to (V + VL)/Y and the adjusted values of L/Y are close to the actual values, a result that matches the theory in the body of the text; (b) the conventional formulas yield estimates of the two quantities that differ from the corresponding estimates from the adjusted formula by a close-to-equal and opposite amount. This amount is close to the value given by the difference between Eqs. 15 and 18, which can be shown to be (2γ−1) times the adjusted formula’s estimate of V+/Y.
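The relationship in (b) can be checked with a short derivation. Equations 15, 16, 18, and 19 are not reproduced in this section, so the forms below are assumptions: the intraclass correlations are taken to be linear in the broad-sense heritability H = (V + VL)/Y, and the conventional estimators are taken to be the familiar twin-study formulas.

$$ r_{\text{MZ}} = H + \frac{L}{Y}, \qquad r_{\text{DZ}} = \gamma H + \frac{L}{Y} $$

$$ \hat{H}_{\text{adj}} = \frac{r_{\text{MZ}} - r_{\text{DZ}}}{1 - \gamma}, \qquad \hat{H}_{\text{conv}} = 2\,(r_{\text{MZ}} - r_{\text{DZ}}) = 2\,(1 - \gamma)\,\hat{H}_{\text{adj}} $$

$$ \hat{H}_{\text{adj}} - \hat{H}_{\text{conv}} = (2\gamma - 1)\,\hat{H}_{\text{adj}} $$

Under the same assumptions, each location estimate equals r_MZ minus the corresponding heritability estimate, so the two location estimates differ by the same amount with the opposite sign:

$$ \widehat{(L/Y)}_{\text{adj}} - \widehat{(L/Y)}_{\text{conv}} = -\,(2\gamma - 1)\,\hat{H}_{\text{adj}} $$

When γ = 0.5 the difference vanishes and the two sets of formulas coincide.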

To produce the results given in Table 5, 48 MZT pairs, 48 DZT pairs, and 48 UVT pairs were sampled from the full data sets of 10,000 MZ twin pairs and 10,000 DZ twin pairs. (Each row of the table corresponds to a row of Table 4, that is, to a particular pair of V/Y and γ values). The UVT pairs were sampled from the MZ data by choosing one member of the pairs for two different varieties at a given location. (One sample of pairs for pre-set values of V from .2 to .8 and γ = .5 is downloadable from http://bit.ly/TwinPairs). The values for broad-sense heritability and the fraction of variation due to differences among location averages were again estimated using both the conventional and adjusted formulas. After the sampling process was repeated 100 times for each data set, the mean and standard deviation of the various values were calculated.
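The repeated-sampling step can be sketched as follows. This is an illustration only, under several assumptions: the pairs are generated directly with preset within-pair correlations (0.83 for MZ and 0.50 for DZ, illustrative values chosen here) rather than taken from the article's data sets, the UVT pairs are omitted, and only the familiar conventional estimator 2(r_MZ − r_DZ) is computed. All function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

def correlated_pairs(r, n, rng):
    """n pairs of unit-variance values with expected within-pair correlation r."""
    shared = rng.normal(0.0, np.sqrt(r), n)
    a = shared + rng.normal(0.0, np.sqrt(1.0 - r), n)
    b = shared + rng.normal(0.0, np.sqrt(1.0 - r), n)
    return np.column_stack([a, b])

def icc_unordered(pairs):
    """Intraclass correlation for unordered pairs (double-entry Pearson r)."""
    x = np.concatenate([pairs[:, 0], pairs[:, 1]])
    y = np.concatenate([pairs[:, 1], pairs[:, 0]])
    return np.corrcoef(x, y)[0, 1]

# Stand-ins for the full data sets of 10,000 MZ and 10,000 DZ pairs.
mz_full = correlated_pairs(0.83, 10_000, rng)
dz_full = correlated_pairs(0.50, 10_000, rng)

def resample_heritability(n_pairs=48, n_reps=100):
    """Draw n_pairs of each kind n_reps times; return mean and SD of the estimate."""
    est = np.empty(n_reps)
    for k in range(n_reps):
        mz = mz_full[rng.choice(len(mz_full), n_pairs, replace=False)]
        dz = dz_full[rng.choice(len(dz_full), n_pairs, replace=False)]
        est[k] = 2.0 * (icc_unordered(mz) - icc_unordered(dz))
    return est.mean(), est.std(ddof=1)

mean_h, sd_h = resample_heritability()
print(f"mean = {mean_h:.3f}, SD = {sd_h:.3f}")
```

With samples of 48 pairs, the spread of the estimates across repetitions is large relative to the estimate itself, which is the kind of sampling variability the standard deviations in Table 5 are meant to convey.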

Table 5 Estimates of broad-sense heritability and L/Y derived from repeated samples from full data sets in Table 4

The results summarized in Table 5 have the same features as those in Table 4 except that: (a) the means of the estimates of broad-sense heritability using the adjusted formula tend to overestimate slightly the actual values for (V + VL)/Y and the means of the corresponding estimates of L/Y tend to underestimate the actual values; and (b) the standard deviations of the estimates are substantial but, in general, smaller for the adjusted formulas than for the conventional ones.

1.3 Appendix 3: Contrasting Formulations in a Selection of Articles That Point to the Potential Importance of Interaction Variance

This Appendix translates the main arguments about the potential importance of interaction variance made by a range of authors into the terms and notation of this article. This translation makes apparent the differences among the formulations and their differences from the account in this article. Two adjustments should be noted: (a) e is used below for environmental factors, not for noise or error; and (b) the equations are centered on the average value of the trait over all varieties, locations, and replicates. In other words, in Eq. 1, subtract m from both sides to produce the deviation of the trait value from the overall mean, then average over the replicates for each variety–location combination and call this average y′ij.

$$ {\text{y}}^{\prime }_{{{\text{ij}}.}} = {\text{v}}_{\text{i}} + {\text{l}}_{\text{j}} + {\text{vl}}_{\text{ij}} $$
(20)

1.4 Layzer (1974)

Main argument: The contribution of variety–location (“genotype–environment”) interaction to trait variation consists of a direct interaction contribution and a covariance that is “negligible if, and only if, the genotypic and environmental variables are statistically independent to a high degree of approximation.” (Layzer’s discussion focuses on variety–location covariance or correlation for human traits.) In mathematical terms:

$$ {\text{y}}^{\prime }_{{{\text{ii}} .}} = {\text{v}}_{\text{i}} + {\text{l}}_{\text{i}} + {\text{vl}}_{\text{ii}} $$
(21)

where

$$ {\text{v}}_{\text{i}} = \int \ldots \int {\text{y}}(\underline{\text{g}}_{\text{i}}, \underline{\text{e}})\, {\text{p}}(\underline{\text{g}}_{\text{i}} \mid \underline{\text{e}})\, d{\text{e}}_{1} \ldots d{\text{e}}_{\text{s}} $$
(22)

\( \underline{\text{g}}_{\text{i}} \) is the set of genetic factors present in variety i; \( \underline{\text{e}} \) is a set of possible values of environmental factors 1, …, s; \( {\text{y}}(\underline{\text{g}}_{\text{i}}, \underline{\text{e}}) \) is the value of the trait for the set \( \underline{\text{e}} \); \( {\text{p}}(\underline{\text{g}}_{\text{i}} \mid \underline{\text{e}}) \) is the conditional probability of the occurrence of \( \underline{\text{g}}_{\text{i}} \) for the set \( \underline{\text{e}} \); and \( \int \ldots \int \ldots d{\text{e}}_{1} \ldots d{\text{e}}_{\text{s}} \) is the integration over all possible sets of environmental factors 1, …, s.

Similarly, \( {\text{l}}_{\text{i}} \) is an integration over all possible sets of genetic factors. The equation for variances corresponding to Eq. 21 is

$$ {\text{Y}} = {\text{V}} + {\text{L}} + 2{\text{ Covariance }}\left( {{\text{V}},{\text{ L}}} \right) + 2{\text{ Covariance }}\left( {{\text{V}} + {\text{L}},{\text{ VL}}} \right) + {\text{VL}} $$
(23)

which simplifies if covariances are zero to

$$ {\text{Y}} = {\text{V}} + {\text{L}} + {\text{VL}} $$
(24)

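but covariances are zero if and only if \( {\text{p}}(\underline{\text{g}}_{\text{i}} \mid \underline{\text{e}}) = {\text{p}}(\underline{\text{g}}_{\text{i}}) \) and \( {\text{p}}(\underline{\text{e}}_{\text{i}} \mid \underline{\text{g}}) = {\text{p}}(\underline{\text{e}}_{\text{i}}) \).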

1.5 Lewontin (1974)

Main argument: When the norm of reaction, that is, the response of a variety (or genetic type) to changes in some measurable environmental factor, varies for different varieties in slope or position, this can confound any attempt to extrapolate the relative ranking of varieties observed over part of the range of the environmental factor to the full range. This situation is equivalent to the \( {\text{vl}}_{\text{ij}} \) values being non-negligible in Eq. 25:

$$ {\text{y}}^{\prime }_{{{\text{ij}}.}} = {\text{v}}_{\text{i}} + {\text{c}} \cdot {\text{e}}_{\text{j}} + {\text{vl}}_{\text{ij}} $$
(25)

where \( {\text{e}}_{\text{j}} \) is the average value for the environmental factor in the jth location; c is a scaling constant; and \( {\text{vl}}_{\text{ij}} \) is the additional contribution from the i, jth variety–location combination.

1.6 Plomin et al. (1977)

Main argument: A proxy for variety–location interaction is constructed, namely, the interaction of two variants of a given variable, e.g., average educational attainment, first averaged for the individual’s biological parents (standing in for the variety contribution) and then for the individual’s adoptive parents (standing in for the location contribution). Low values of the interaction between the two variants are found for human behavioral traits. In other words, the \( {\text{vl}}^{\text{r}}_{\text{ij}} \) values are low in the following equation:

$$ {\text{y}}^{\prime }_{{{\text{ij}}.}} = {\text{v}}^{\text{b}}_{\text{i}} + {\text{l}}^{\text{a}}_{\text{j}} + {\text{vl}}^{\text{r}}_{\text{ij}} $$
(26)

where \( {\text{v}}^{\text{b}}_{\text{i}} \) is the average value of y for the biological parents of person i; \( {\text{l}}^{\text{a}}_{\text{j}} \) is the average value of y for the adoptive parents of person i raised in family j; and \( {\text{vl}}^{\text{r}}_{\text{ij}} \) is the residual after allowing for the previous two contributions.

1.7 Jacquard (1983)

Main argument: In reality the condition is hardly ever fulfilled that the interaction contributions (estimated by \( {\text{vl}}_{\text{ij}} \) in Eq. 20) are all zero. Thus heritability in the broad sense (V/Y) has no meaning.

1.8 Wahlsten (2000)

Main argument: Multiplicative functional relationships can result in a significant interaction variance even if the functional relationship contains no functional interaction term. Consider the two functional relationships:

$$ {\text{y}}^{\prime }_{{{\text{ij}} .}} = {\text{g}}_{\text{i}} + {\text{e}}_{\text{j}} + {\text{ge}}_{\text{ij}} $$
(27)
$$ {\text{y}}^{\prime }_{{{\text{ij}} .}} = {\text{g}}_{\text{i}} {\text{e}}_{\text{j}} $$
(28)

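where \( {\text{g}}_{\text{i}} \) is the functional contribution of the ith variety; \( {\text{e}}_{\text{j}} \) is the functional contribution of the jth location; and \( {\text{ge}}_{\text{ij}} \) is the functional contribution from the i, jth variety–location combination.

If the \( {\text{ge}}_{\text{ij}} \) values in Eq. 27 are negligible, the values of \( {\text{vl}}_{\text{ij}} \) in Eq. 20 will also be negligible. However, even without any \( {\text{ge}}_{\text{ij}} \) term, as in the purely multiplicative relationship of Eq. 28, statistically significant values of \( {\text{vl}}_{\text{ij}} \) can arise.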
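A small numerical example makes this point concrete. The sketch below is illustrative only: the values of g and e are made up, and the function name `additive_decomposition` is hypothetical. It fits the additive decomposition of Eq. 20 to a table generated by a purely additive relationship (Eq. 27 with all interaction terms zero) and to one generated by the purely multiplicative relationship of Eq. 28.

```python
import numpy as np

def additive_decomposition(y):
    """Split a variety-by-location table into mean, variety, location,
    and interaction contributions, as in Eq. 20."""
    m = y.mean()
    v = y.mean(axis=1) - m                    # variety contributions v_i
    l = y.mean(axis=0) - m                    # location contributions l_j
    vl = y - m - v[:, None] - l[None, :]      # interaction contributions vl_ij
    return m, v, l, vl

g = np.array([1.0, 2.0, 3.0])                 # made-up functional contributions
e = np.array([1.0, 2.0, 3.0])

y_add = g[:, None] + e[None, :]               # additive, no interaction term
y_mult = np.outer(g, e)                       # multiplicative, as in Eq. 28

_, _, _, vl_add = additive_decomposition(y_add)
_, v, l, vl_mult = additive_decomposition(y_mult)

print(vl_add)                                 # all zeros
print(vl_mult)                                # equals outer(g - mean(g), e - mean(e))
print((vl_mult ** 2).mean())                  # interaction variance, 4/9
```

For the multiplicative table, the interaction contributions reduce to the product of the centered g and e values, so they vanish only if one of the two factors does not vary.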

1.9 Summary

In this article, the linear model underlying the partitioning of variation in the trait connects values of the trait for an individual to the summation of variety, location, variety–location interaction, and noise contributions (e.g., Eq. 1). It does so without reference to measurable genetic or environmental factors (unlike Layzer, Wahlsten), their functional relationships (unlike Wahlsten), probabilities of their occurrence (unlike Layzer), or a gradient underlying the differences in the variety or location contributions (unlike Lewontin). Interaction variance can be estimated (i.e., no proxy is needed; unlike Plomin) provided data are available from the appropriate classes of defined degrees of relatedness. In that case, broad-sense heritability can be estimated (unlike Jacquard).

Several of the authors also make arguments about the potential importance of variety–location (“genotype–environment”) correlation (or covariance). In this article lack of such correlations is assumed only to keep the focus on partitioning of variance under three standard assumptions and their alternatives. As noted in Sect. 2.4, whether these conditions can be met in human situations is a matter of ongoing controversy.

Cite this article

Taylor, P.J. A Gene-Free Formulation of Classical Quantitative Genetics Used to Examine Results and Interpretations Under Three Standard Assumptions. Acta Biotheor 60, 357–378 (2012). https://doi.org/10.1007/s10441-012-9164-2

  • DOI: https://doi.org/10.1007/s10441-012-9164-2
