Effects of data imbalance on estimation of heritability

Summary

Effects of data imbalance on the bias, sampling variance and mean square error of heritability estimated from variance components were examined using a random two-way nested classification. Four designs, ranging from zero imbalance (balanced data) through “low” and “medium” to “high” imbalance, were considered for each of four combinations of heritability (h² = 0.2 and 0.4) and sample size (N = 120 and 600). Observations were simulated for each design by drawing independent pseudo-random deviates from normal distributions with zero means and with variances determined by heritability. Each simulation was replicated 100 times, with the same design matrix used in all replicates. Variance components were estimated by analysis of variance (Henderson's Method 1) and by maximum likelihood (ML). For the design and model used in this study, bias in heritability based on Method 1 and ML estimates of variance components was negligible. The effect of imbalance on the variance of heritability was smaller for ML than for Method 1 estimation, and was smaller for heritability based on estimates of sire-plus-dam variance components than for heritability based on estimates of sire or dam variance components. Mean square error of heritability based on sire-plus-dam variance components appeared less sensitive to data imbalance than that of heritability based on sire or dam variance components, especially with Method 1 estimation. Estimation of heritability from sire-plus-dam components was insensitive to differences in data imbalance, especially for the larger sample size.
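The simulation described in the summary can be illustrated with a minimal sketch for the balanced case only, where the analysis-of-variance estimators take a simple closed form. Several details here are assumptions and are not taken from the article: phenotypic variance is scaled to 1, sire and dam variance components are each set to h²/4 (a purely additive model), heritability is computed as 4σ²s/σ²p, 4σ²d/σ²p and 2(σ²s + σ²d)/σ²p, and the layout of 5 sires × 4 dams × 6 offspring is just one hypothetical way to reach N = 120. The function names are illustrative, and the unbalanced designs and ML estimation are not covered.

```python
import numpy as np


def simulate_nested(h2, n_sires, n_dams, n_off, rng):
    """Simulate a balanced sire/dam-within-sire/offspring data set.

    Assumption (not from the article): phenotypic variance is 1 and the
    sire and dam components are each h2 / 4.
    """
    var_s = var_d = h2 / 4.0
    var_e = 1.0 - var_s - var_d
    sire = rng.normal(0.0, np.sqrt(var_s), size=(n_sires, 1, 1))
    dam = rng.normal(0.0, np.sqrt(var_d), size=(n_sires, n_dams, 1))
    err = rng.normal(0.0, np.sqrt(var_e), size=(n_sires, n_dams, n_off))
    return sire + dam + err  # broadcasts to (n_sires, n_dams, n_off)


def anova_heritability(y):
    """Balanced nested ANOVA estimates of variance components and h2."""
    s, d, n = y.shape
    dam_means = y.mean(axis=2)
    sire_means = y.mean(axis=(1, 2))
    grand_mean = y.mean()

    # Mean squares for sires, dams within sires, and residual.
    ms_s = d * n * np.sum((sire_means - grand_mean) ** 2) / (s - 1)
    ms_d = n * np.sum((dam_means - sire_means[:, None]) ** 2) / (s * (d - 1))
    ms_e = np.sum((y - dam_means[:, :, None]) ** 2) / (s * d * (n - 1))

    # Equate mean squares to their expectations (estimates may be negative).
    var_e = ms_e
    var_d = (ms_d - ms_e) / n
    var_s = (ms_s - ms_d) / (d * n)
    var_p = var_s + var_d + var_e

    return {
        "sire": 4.0 * var_s / var_p,
        "dam": 4.0 * var_d / var_p,
        "sire_plus_dam": 2.0 * (var_s + var_d) / var_p,
    }


def summarize(h2, n_sires, n_dams, n_off, n_rep=100, seed=0):
    """Empirical bias, sampling variance and MSE over n_rep replicates."""
    rng = np.random.default_rng(seed)
    estimates = [
        anova_heritability(simulate_nested(h2, n_sires, n_dams, n_off, rng))
        for _ in range(n_rep)
    ]
    summary = {}
    for key in ("sire", "dam", "sire_plus_dam"):
        values = np.array([est[key] for est in estimates])
        summary[key] = {
            "bias": values.mean() - h2,
            "variance": values.var(ddof=1),
            "mse": np.mean((values - h2) ** 2),
        }
    return summary


if __name__ == "__main__":
    # Hypothetical balanced layout giving N = 5 * 4 * 6 = 120.
    for name, stats in summarize(0.4, 5, 4, 6).items():
        print(name, stats)
```

Comparing the three rows of the printed summary gives a rough feel for why the sire-plus-dam estimator tends to have smaller sampling variance than either single-component estimator; extending the sketch to unequal subclass numbers would require the general Method 1 formulas rather than the balanced mean squares used here.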

References

  • Cavalli-Sforza LL, Bodmer WF (1971) The genetics of human populations. Freeman, San Francisco

  • Corbeil RR, Searle SR (1976) A comparison of variance component estimators. Biometrics 32:779–791

  • Falconer DS (1981) Introduction to quantitative genetics. Longman, New York

  • Gill JL, Jensen EL (1968) Probability of obtaining negative estimates of heritability. Biometrics 24:517–526

  • Grossman M, Norton HW (1981) An approximation of the minimum-variance estimator of heritability based on variance component analysis. Genetics 98:417–426

  • Harville DA (1968) Statistical dependence between subclass means and the numbers of observations in the subclasses for the two-way completely-random classification. J Am Stat Assoc 63:1484–1494

  • Hemmerle WJ, Hartley HO (1973) Computing maximum likelihood estimators for the mixed A.O.V. model using the W-transformation. Technometrics 15:819–832

  • Henderson CR (1953) Estimation of variance and covariance components. Biometrics 9:226–252

  • Kendall M, Stuart A (1979) The advanced theory of statistics, vol 2. Macmillan, New York, p 21

  • Pearson K (1897) Mathematical contributions to the theory of evolution. On a form of spurious correlation which may arise when indices are used in the measurement of organs. Proc R Soc London, Ser B 60:489–498

  • Rothschild MF, Henderson CR, Quaas RL (1979) Effects of selection on variances and covariances of simulated first and second lactations. J Dairy Sci 62:996–1002

  • SAS Institute Inc (1982) SAS user's guide: statistics. SAS Institute Inc, Cary, North Carolina

  • Searle SR (1971) Linear models. Wiley and Sons, New York, p 475

  • Searle SR (1979) Notes on variance component estimation: a detailed account of maximum likelihood and kindred methodology. Mimeo BU-673-M Biometrics Unit, Cornell University, Ithaca, New York

Additional information

Supported by grants from the Illinois Agricultural Experiment Station and the University of Illinois Research Board. Charles Smith, H. W. Norton and D. Gianola contributed valuable suggestions.

Communicated by L. D. Van Vleck

About this article

Cite this article

Caro, R.F., Grossman, M. & Fernando, R.L. Effects of data imbalance on estimation of heritability. Theoret. Appl. Genetics 69, 523–530 (1985). https://doi.org/10.1007/BF00251098

  • DOI: https://doi.org/10.1007/BF00251098

Key words

  • Unbalanced data
  • Heritability
  • Variance components