An Empirical Comparison of Information-Theoretic Selection Criteria for Multivariate Behavior Genetic Models

Behavior Genetics

Abstract

Information theory provides an attractive basis for statistical inference and model selection. However, little is known about the relative performance of different information-theoretic criteria in covariance structure modeling, especially in behavioral genetic contexts. To explore these issues, information-theoretic fit criteria were compared with regard to their ability to discriminate between multivariate behavioral genetic models under various model, distribution, and sample size conditions. Results indicate that performance depends on sample size, model complexity, and distributional specification. The Bayesian Information Criterion (BIC) is more robust to distributional misspecification than Akaike's Information Criterion (AIC) under certain conditions, and outperforms AIC in larger samples and when comparing more complex models. An approximation to the Minimum Description Length (MDL; Rissanen, 1996, 2001) criterion, involving the empirical Fisher information matrix, exhibits variable patterns of performance due to the complexity of estimating Fisher information matrices. Results indicate that a relatively new information-theoretic criterion, Draper's Information Criterion (DIC; Draper, 1995), which shares features of the Bayesian and MDL criteria, performs similarly to or better than BIC. Results emphasize the importance of further research into theory and computation of information-theoretic criteria.
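The abstract names the criteria without stating their formulas, so the following is only a minimal sketch of how three of them might be computed from a fitted model's maximized log-likelihood. AIC and BIC use their standard definitions. The form used for Draper's criterion, -2 ln L + k ln(n / 2π), is an assumption based on the Laplace-approximation penalty usually attributed to Draper (1995) and is not confirmed by the abstract. The MDL approximation is left out because it requires a Fisher information matrix. The model names, log-likelihoods, parameter counts, and sample size are hypothetical.

```python
import math


def aic(log_lik, k, n=None):
    """Akaike's Information Criterion: -2 ln L + 2k (n is unused)."""
    return -2.0 * log_lik + 2.0 * k


def bic(log_lik, k, n):
    """Bayesian Information Criterion: -2 ln L + k ln n."""
    return -2.0 * log_lik + k * math.log(n)


def draper_ic(log_lik, k, n):
    """Assumed form of Draper's criterion: -2 ln L + k ln(n / (2*pi))."""
    return -2.0 * log_lik + k * math.log(n / (2.0 * math.pi))


def select(candidates, n, criterion):
    """Return the name of the candidate with the smallest criterion value."""
    scores = {
        name: criterion(log_lik, k, n)
        for name, (log_lik, k) in candidates.items()
    }
    return min(scores, key=scores.get)


# Hypothetical fits: name -> (maximized log-likelihood, free parameters).
candidates = {"ACE": (-1520.4, 9), "AE": (-1523.1, 6)}
n = 500  # hypothetical sample size

for label, criterion in [("AIC", aic), ("BIC", bic), ("DIC", draper_ic)]:
    print(label, "selects", select(candidates, n, criterion))
```

Under this assumed form, Draper's criterion differs from BIC only by the constant k ln(2π) per model, so it penalizes additional parameters slightly less than BIC at any given sample size.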


REFERENCES

  • Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In B. N. Petrov and F. Csaki (eds.), Proceedings of the Second International Symposium on Information Theory. Budapest: Akademiai Kiado, pp. 267–281.

  • Azzalini, A., and Dalla Valle, A. (1996). The multivariate skew-normal distribution. Biometrika 83:715–726.

  • Azzalini, A., and Capitanio, A. (2003). Distributions generated by perturbation of symmetry with emphasis on a multivariate skew t-distribution. J. Roy. Stat. Soc.: Ser. B (Methodol.) 65:367–389.

  • Barron, A. R., and Cover, T. M. (1991). Minimum complexity density estimation. IEEE Trans. Inform. Theory 37:1034–1054.

  • Barron, A. R., Rissanen, J., and Yu, B. (1998). The minimum description length principle in coding and modeling. IEEE Trans. Inform. Theory 44:2743–2760.

  • Bickel, P., and Zhang, P. (1992). Variable selection in nonparametric regression with categorical covariates. J. Am. Stat. Assoc. 87:90–97.

  • Boomsma, D. (1987). The genetic analysis of repeated measures. I. Simplex models. Behav. Genet. 17:111–123.

  • Burnham, K. P., and Anderson, D. R. (1998). Model selection and inference: A practical information-theoretic approach. New York: Springer.

  • Celeux, G., and Soromenho, G. (1996). An entropy criterion for assessing the number of clusters in a mixture model. J. Classif. 13:195–212.

  • Cover, T. M., and Thomas, J. A. (1991). Elements of information theory. New York: Wiley.

  • Draper, D. (1995). Assessment and propagation of model uncertainty. J. Roy. Stat. Soc.: Ser. B (Methodol.) 57:45–97.

  • Duffy, D. L., Martin, N. G., Battistutta, D., Hopper, J. L., and Matthews, J. D. (1990). Genetics of asthma and hay-fever in Australian twins. Am. Rev. Respirat. Dis. 142:1351–1358.

  • Fishler, E., Grosmann, M., and Messer, H. (2002). Detection of signals by information theoretic criteria: General asymptotic performance analysis. IEEE Trans. Signal Process. 50:1027–1036.

  • Fujikoshi, Y., and Satoh, K. (1997). Modified AIC and Cp in multivariate linear regression. Biometrika 84:707–716.

  • Hansen, M. H., and Yu, B. (2001). Model selection and the principle of minimum description length. J. Am. Stat. Assoc. 96:746–774.

  • Hurvich, C. M., and Tsai, C-L. (1989). Regression and time series model selection in small samples. Biometrika 76:297–307.

  • Hurvich, C. M., and Tsai, C-L. (1990). The impact of model selection on inference in linear regression. Am. Stat. 44:214–217.

  • Ichikawa, M. (1988). Empirical assessments of AIC procedure for model selection in factor analysis. Behaviormetrika 24:33–40.

  • Ichikawa, M., and Konishi, S. (1999). Model evaluation and information criteria in covariance structure analysis. Brit. J. Math. Stat. Psychol. 52:285–302.

  • Ihaka, R., and Gentleman, R. (1996). R: A language for data analysis and graphics. J. Comput. Graph. Stat. 5:299–314.

  • Lin, T. H., and Dayton, C. M. (1997). Model selection information criteria for non-nested latent class models. J. Educat. Behav. Stat. 22:249–264.

  • Myung, I. J., Balasubramanian, V., and Pitt, M. A. (2000). Counting probability distributions: Differential geometry and model selection. Proc. Natl. Acad. Sci. 97:11170–11175.

  • Neale, M. C., Boker, S. M., Xie, G., and Maes, H. H. (1999). Mx: Statistical modeling (5th ed.). Richmond, VA: Department of Psychiatry.

  • Neale, M. C., and Cardon, L. R. (1992). Methodology for genetic studies of twins and families. Dordrecht, NL: Kluwer.

  • Pauler, D. K. (1998). Schwarz criterion and related methods for normal linear models. Biometrika 85:13–27.

  • Raftery, A. E. (1993). Bayesian model selection in structural equation models. In K. A. Bollen and J. S. Long (eds.), Testing structural equation models. Newbury Park, CA: Sage, pp. 163–180.

  • Rissanen, J. (1978). Modeling by shortest data description. Automatica 14:465–471.

  • Rissanen, J. (1983). A universal prior for integers and estimation by minimum description length. Ann. Stat. 11:416–431.

  • Rissanen, J. (1989). Stochastic complexity and statistical inquiry. Singapore: World Scientific.

  • Rissanen, J. (1996). Fisher information and stochastic complexity. IEEE Trans. Inform. Theory 42:40–47.

  • Rissanen, J. (2001). Strong optimality of the normalized ML models as universal codes and information in data. IEEE Trans. Inform. Theory 47:1712–1717.

  • Schwarz, G. (1978). Estimating the dimension of a model. Ann. Stat. 6:461–464.

  • Sugiura, N. (1978). Further analysis of the data by Akaike's information criterion and the finite corrections. Commun. Stat. Theory Meth. A7:13–26.

  • Takeuchi, K. (1976). Distribution of information statistics and criteria for adequacy of models. Math. Sci. 153:12–18.

  • van den Oord, E. J. C. G., Simonoff, E., Eaves, L. J., Pickles, A., Silberg, J., and Maes, H. (2000). An evaluation of different approaches for behavior genetic analyses with psychiatric symptom scores. Behav. Genet. 30:1–18.

  • Yang, C. (1998). Finite mixture model selection with psychometric applications (Doctoral dissertation, University of California, Los Angeles, 1996). Dissert. Abs. Int.: Sect. B: Sci. Eng. 59(9-A):3421.

  • Zhang, P. (1993). On the convergence of model selection criteria. Commun. Stat. Theory Meth. 22:2765–2775.

About this article

Cite this article

Markon, K.E., Krueger, R.F. An Empirical Comparison of Information-Theoretic Selection Criteria for Multivariate Behavior Genetic Models. Behav Genet 34, 593–610 (2004). https://doi.org/10.1007/s10519-004-5587-0

  • DOI: https://doi.org/10.1007/s10519-004-5587-0
