A Limited Dependent Variable Model for Heritability Estimation with Non-Random Ascertained Samples

Behavior Genetics

Abstract

In a questionnaire study, a random sample of Dutch families was asked whether they suffered from asthma and related symptoms. From these families, a selected sample was invited to come to the hospital for further phenotyping. Families were selected if at least one family member reported a history of asthma and the twins were 18 years of age or older. Not all families that were thus selected volunteered, leaving us with a fraction of the original sample.

The aim of this paper is to describe a limited dependent variable model that can be used in such situations in order to obtain estimates that are representative of the population from which the sample was originally drawn. The model is a linear (DeFries-Fulker) regression model corrected for sample selection. This correction is possible when (some of) the characteristics that determine whether subjects volunteer (or not) are known for all subjects, including those that did not volunteer.
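As a rough illustration of what a selection-corrected DeFries-Fulker regression can look like, the sketch below assumes a Heckman-type (1979) correction: the expected co-twin score among volunteers equals the usual linear predictor plus a term proportional to the inverse Mills ratio of the participation index. The function names, the augmented design row (intercept, proband score, coefficient of relationship, and their product), and the parameters `rho` and `sigma` are illustrative assumptions, not the authors' specification.

```python
import math


def norm_pdf(a):
    """Standard normal density."""
    return math.exp(-0.5 * a * a) / math.sqrt(2.0 * math.pi)


def norm_cdf(a):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))


def inverse_mills(a):
    """Inverse Mills ratio phi(a) / Phi(a); intended for moderate values of a."""
    return norm_pdf(a) / norm_cdf(a)


def df_design_row(proband, relatedness):
    """Hypothetical design row for an augmented DeFries-Fulker regression:
    intercept, proband score P, coefficient of relationship R, and P * R."""
    return [1.0, proband, relatedness, proband * relatedness]


def expected_cotwin_score(row, beta, participation_index, rho, sigma):
    """Expected co-twin score given that the family volunteered.

    participation_index is the linear predictor of an assumed probit
    participation model; rho is the assumed correlation between the
    participation and outcome errors; sigma is the outcome error SD.
    """
    linear = sum(b * x for b, x in zip(beta, row))
    return linear + rho * sigma * inverse_mills(participation_index)
```

With `rho = 0` the correction term vanishes and the model reduces to the ordinary regression; a nonzero `rho` shifts the expected score among volunteers, which is the bias such a correction is meant to remove.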

The questionnaire study is of interest by itself but serves mainly to provide a concrete illustration of our method. The present model is used to analyze the data and the results are compared to those obtained with other methods: raw (or direct) likelihood estimation, multiple imputation, and sample weighting. Throughout, Rubin's general theory of inference with missing data serves as an integrating framework.
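Of the comparison methods, sample weighting is the simplest to sketch: each observed family is weighted by the inverse of its probability of taking part, so that under-represented groups count more heavily. The function name below is hypothetical, and the sketch assumes the participation probabilities have already been estimated.

```python
def weighted_mean(values, participation_probs):
    """Inverse-probability-weighted mean of the observed values.

    participation_probs holds, for each observed unit, its (assumed known)
    probability of having volunteered; all entries must be > 0.
    """
    weights = [1.0 / p for p in participation_probs]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)
```

For example, a unit observed with probability 0.25 receives twice the weight of one observed with probability 0.5.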

REFERENCES

  • Aitchison, J., and Brown, J. (1969). The Lognormal Distribution with Special Reference to Its Uses in Econometrics. New York: Cambridge University Press.

  • Arbuckle, J. L. (1996). Full information estimation in the presence of incomplete data. In Schumacker, R., and Marcoulides, G. A. (eds.), Advanced Structural Equation Modeling. Mahwah, NJ: Lawrence Erlbaum.

  • Arminger, G. H. (1995). Specification and estimation of mean structures: Regression models. In Arminger, G., Clogg, C., and Sobel, M. (eds.), Handbook of Statistical Modeling for the Social and Behavioral Sciences. New York: Plenum Press.

  • Bechger, T. M. (1997). A case in heritability estimation with complex non-random ascertained samples. Unpublished report, Faculty of Psychology, Free University Amsterdam.

  • Boomsma, D. I., Geus, E. J. C. de, Baal, G. C. M. van, and Koopmans, J. R. (1999). Religious upbringing reduces the influence of genetic factors on disinhibition: Evidence for interaction between genotype and environment. Twin Research, in press.

  • Burrows, B., Hasan, F. M., Barbee, R. A., Halonen, M., and Lebowitz, M. D. (1989). Association of asthma with serum IgE levels and skin-test reactivity to allergens. N. Engl. J. Med. 320:271–277.

  • DeFries, J. C., and Fulker, D. W. (1985). Multiple regression analysis of twin data. Behavior Genetics 15:467–473.

  • Dudewicz, E. J., and Mishra, S. N. (1988). Modern Mathematical Statistics. New York: Wiley.

  • Finkbeiner, C. (1979). Estimation for the multiple factor model when data are missing. Psychometrika 44(4):409–420.

  • Fulker, D. W., Cardon, L. R., DeFries, J. C., Kimberling, W. F., Pennington, B. F., and Smith, S. D. (1991). Multiple regression analysis of sib-pair data on reading to detect quantitative trait loci. Reading and Writing: An Interdisciplinary Journal 3: 299–313.

  • Greene, W. H. (1981). Sample selection as a specification error: Comment. Econometrica 49:795–798.

  • Greene, W. H. (2000). Econometric Analysis. (4th edition). Upper Saddle River, NJ: Prentice Hall.

  • Hanson, B., McGue, M., Roitman-Johnson, B., Segal, N. L., Bouchard, T. J., and Blumenthal, M. N. (1991). Atopic disease and Immunoglobulin E in twins reared apart and together. American Journal of Human Genetics 48:873–879.

  • Heath, A. C., Madden, P. A. F., and Martin, N. G. (1998). Assessing the effects of cooperation bias and attrition in behavioral genetic research using data-weighting. Behavior Genetics 28(6):415–428.

  • Heckman, J. (1979). Sample selection bias as a specification error. Econometrica 47:153–161.

  • Heitjan, D. F., and Basu, S. (1996). Distinguishing between “missing at random” and “missing completely at random.” The American Statistician 50(3):207–212.

  • Koopmans, J. R., Boomsma, D. I., Heath, A. C., and Doornen, L. J. P. (1995). A multivariate genetic analysis of sensation seeking. Behavior Genetics 25:349–356.

  • Koopmans, J. R., and Boomsma, D. I. (1996). Familial resemblance in alcohol use: Genetic or cultural transmission? Journal of Studies on Alcohol 57:19–28.

  • Lange, K., Westlake, J., and Spence, M. (1976). Extensions to pedigree analysis: III. Variance components by the scoring method. Annals of Human Genetics 39:485–491.

  • Little, R. J. A., and Rubin, D. B. (1987). Statistical Analysis with Missing Data. New York: Wiley.

  • Little, R. J. A., and Schenker, N. (1995). Missing data. In Arminger, G., Clogg, C., and Sobel, M. (eds.), Handbook of Statistical Modeling for the Social and Behavioral Sciences. New York: Plenum Press.

  • Lykken, D. T., McGue, M., and Tellegen, A. (1987). Recruitment Bias in Twin Research: The Rule of Two-Thirds Reconsidered. Behavior Genetics 17(4):343–362.

  • Maddala, G. S. (1983). Limited-Dependent and Qualitative Variables in Econometrics. Econometric Society Monographs No. 3. Cambridge: Cambridge University Press.

  • Nawata, K. (1993). Estimation of sample selection bias models by the maximum likelihood estimator and Heckman's two-step estimator. Economics Letters 45:33–40.

  • Nawata, K., and Nagase, N. (1996). Estimation of sample selection bias models. Econometric Reviews 15:387–400.

  • Neale, M. C. (1997). Mx. Statistical Modeling with Mx. Box 126 MCV, Richmond, VA 23298: Department of Psychiatry.

  • Neale, M. C., Eaves, L. J., Kendler, K. S., and Hewitt, J. K. (1989). Bias in correlations from selected samples of relatives: The effects of soft selection. Behavior Genetics 19(2):163–169.

  • Neale, M. C., and Eaves, L. J. (1993). Estimating and controlling for the effect of volunteer bias within pairs of relatives. Behavior Genetics 23(3):271–277.

  • Rubin, D. B. (1976). Inference and Missing data. Biometrika 63: 581–592.

  • Rubin, D. B. (1987). Multiple Imputation for Nonresponse in Surveys. New York: Wiley.

  • Hopp, R. J., Bewtra, A. K., Watt, G. D., Nair, N. M., and Townley, R. G. (1984). Genetic analysis of allergic disease in twins. Journal of Allergy and Clinical Immunology 73:265–270.

  • Schafer, J. L. (1997). Analysis of Incomplete Multivariate Data. New York: Chapman and Hall.

  • Tas, R. F. J. (1990). Meerlingen in Nederland, 1900–1988 [Multiple Births in the Netherlands, 1900–1988]. Maandstatistiek Bevolking: CBS (Central Bureau of Statistics).

  • Wilde, A. G. J. S. (1970). Neurotische labiliteit gemeten volgens de vragenlijstmethode [The Questionnaire Method as a Means of Measuring Neurotic Instability]. Amsterdam: van Rossen.

  • Wilks, S. S. (1932). Moments and distributions of estimates of population parameters from fragmentary samples. Annals of Mathematical Statistics 3:163–195.

Author information

Corresponding author

Correspondence to Timo M. Bechger.

About this article

Cite this article

Bechger, T.M., Boomsma, D.I. & Koning, H. A Limited Dependent Variable Model for Heritability Estimation with Non-Random Ascertained Samples. Behav Genet 32, 145–151 (2002). https://doi.org/10.1023/A:1015257908396

  • DOI: https://doi.org/10.1023/A:1015257908396