Abstract
Information theory provides an attractive basis for statistical inference and model selection. However, little is known about the relative performance of different information-theoretic criteria in covariance structure modeling, especially in behavioral genetic contexts. To explore these issues, information-theoretic fit criteria were compared with regard to their ability to discriminate between multivariate behavioral genetic models under various model, distribution, and sample size conditions. Results indicate that performance depends on sample size, model complexity, and distributional specification. The Bayesian Information Criterion (BIC) is more robust to distributional misspecification than Akaike's Information Criterion (AIC) under certain conditions, and outperforms AIC in larger samples and when comparing more complex models. An approximation to the Minimum Description Length (MDL; Rissanen, 1996, 2001) criterion, involving the empirical Fisher information matrix, exhibits variable patterns of performance due to the complexity of estimating Fisher information matrices. Results indicate that a relatively new information-theoretic criterion, Draper's Information Criterion (DIC; Draper, 1995), which shares features of the Bayesian and MDL criteria, performs similarly to or better than BIC. Results emphasize the importance of further research into theory and computation of information-theoretic criteria.
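As a rough illustration of how the criteria named above can disagree, the sketch below scores two hypothetical models. The abstract does not state the formulas, so the textbook forms are assumed: AIC = -2 ln L + 2k, BIC = -2 ln L + k ln n, and, for Draper's criterion, -2 ln L + k ln(n / 2π). The log-likelihoods, parameter counts, sample size, and function names are invented for the example and are not results from the article.

```python
import math

def aic(log_lik, k):
    """Akaike's Information Criterion (standard form, assumed here)."""
    return -2.0 * log_lik + 2.0 * k

def bic(log_lik, k, n):
    """Bayesian Information Criterion (Schwarz form, assumed here)."""
    return -2.0 * log_lik + k * math.log(n)

def draper_ic(log_lik, k, n):
    """Draper's criterion, assumed to be BIC with n replaced by n / (2*pi)."""
    return -2.0 * log_lik + k * math.log(n / (2.0 * math.pi))

def preferred(models, n):
    """Return, for each criterion, the name of the model with the lowest value."""
    scores = {
        "AIC": {name: aic(ll, k) for name, (ll, k) in models.items()},
        "BIC": {name: bic(ll, k, n) for name, (ll, k) in models.items()},
        "DIC": {name: draper_ic(ll, k, n) for name, (ll, k) in models.items()},
    }
    return {crit: min(vals, key=vals.get) for crit, vals in scores.items()}

# Hypothetical fits: (maximized log-likelihood, number of free parameters).
models = {"A": (-1000.0, 6), "B": (-996.0, 9)}
n = 500  # hypothetical sample size

best = preferred(models, n)
print(best)
```

With these made-up numbers, AIC favors the larger model B, while BIC and the assumed Draper form favor the smaller model A, because their penalties per parameter grow with the sample size.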
REFERENCES
Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In B. N. Petrov and F. Csaki (eds.), Proceedings of the Second International Symposium on Information Theory. Budapest: Akademiai Kiado, pp. 267–281.
Azzalini, A., and Dalla Valle, A. (1996). The multivariate skew-normal distribution. Biometrika 83:715–726.
Azzalini, A., and Capitanio, A. (2003). Distributions generated by perturbation of symmetry with emphasis on a multivariate skew t-distribution. J. Roy. Stat. Soc.: Ser. B (Methodol.). 65:367–389.
Barron, A. R., and Cover, T. M. (1991). Minimum complexity density estimation. IEEE Trans. Inform. Theory 37:1034–1054.
Barron, A. R., Rissanen, J., and Yu, B. (1998). The minimum description length principle in coding and modeling. IEEE Trans. Inform. Theory 44:2743–2760.
Bickel, P., and Zhang, P. (1992). Variable selection in nonparametric regression with categorical covariates. J. Am. Stat. Assoc. 87:90–97.
Boomsma, D. (1987). The genetic analysis of repeated measures. I. Simplex models. Behav. Genet. 17:111–123.
Burnham, K. P., and Anderson, D. R. (1998). Model selection and inference: a practical information-theoretic approach. New York: Springer.
Celeux, G., and Soromenho, G. (1996). An entropy criterion for assessing the number of clusters in a mixture model. J. Classif. 13:195–212.
Cover, T. M., and Thomas, J. A. (1991). Elements of information theory. New York: Wiley.
Draper, D. (1995). Assessment and propagation of model uncertainty. J. Roy. Stat. Soc.: Ser. B (Methodol.). 57:45–97.
Duffy, D. L., Martin, N. G., Battistutta, D., Hopper, J. L., and Matthews, J. D. (1990). Genetics of asthma and hay-fever in Australian twins. Am. Rev. Respirat. Dis. 142:1351–1358.
Fishler, E., Grosmann, M., and Messer, H. (2002). Detection of signals by information theoretic criteria: general asymptotic performance analysis. IEEE Trans. Signal Process. 50:1027–1036.
Fujikoshi, Y., and Satoh, K. (1997). Modified AIC and Cp in multivariate linear regression. Biometrika 84:707–716.
Hansen, M. H., and Yu, B. (2001). Model selection and the principle of minimum description length. J. Am. Stat. Assoc. 96:746–774.
Hurvich, C. M., and Tsai, C-L. (1989). Regression and time series model selection in small samples. Biometrika 76:297–307.
Hurvich, C. M., and Tsai, C-L. (1990). The impact of model selection on inference in linear regression. Am. Stat. 44:214–217.
Ichikawa, M. (1988). Empirical assessments of AIC procedure for model selection in factor analysis. Behaviormetrika 24:33–40.
Ichikawa, M., and Konishi, S. (1999). Model evaluation and information criteria in covariance structure analysis. Brit. J. Math. Stat. Psychol. 52:285–302.
Ihaka, R., and Gentleman, R. (1996). R: a language for data analysis and graphics. J. Comput. Graph. Stat. 5:299–314.
Lin, T. H., and Dayton, C. M. (1997). Model selection information criteria for non-nested latent class models. J. Educat. Behav. Stat. 22:249–264.
Myung, I. J., Balasubramanian, V., and Pitt, M. A. (2000). Counting probability distributions: Differential geometry and model selection. Proc. Natl. Acad. Sci. 97:11170–11175.
Neale, M. C., Boker, S. M., Xie, G., and Maes, H. H. (1999). Mx: statistical modeling (5th ed.). Richmond, VA: Department of Psychiatry.
Neale, M. C., and Cardon, L. R. (1992). Methodology for genetic studies of twins and families. Dordrecht, NL: Kluwer.
Pauler, D. K. (1998). Schwarz criterion and related methods for normal linear models. Biometrika 85:13–27.
Raftery, A. E. (1993). Bayesian model selection in structural equation models. In K. A. Bollen and J. S. Long (eds.), Testing structural equation models. Newbury Park, CA: Sage, pp. 163–180.
Rissanen, J. (1978). Modeling by shortest data description. Automatica 14:465–471.
Rissanen, J. (1983). A universal prior for integers and estimation by minimum description length. Ann. Stat. 11:416–431.
Rissanen, J. (1989). Stochastic complexity and statistical inquiry. Singapore: World Scientific.
Rissanen, J. (1996). Fisher information and stochastic complexity. IEEE Trans. Inform. Theory 42:40–47.
Rissanen, J. (2001). Strong optimality of the normalized ML models as universal codes and information in data. IEEE Trans. Inform. Theory 47:1712–1717.
Schwarz, G. (1978). Estimating the dimension of a model. Ann. Stat. 6:461–464.
Sugiura, N. (1978). Further analysis of the data by Akaike's information criterion and the finite corrections. Commun. Stat. Theory Meth. A7:13–26.
Takeuchi, K. (1976). Distribution of information statistics and criteria for adequacy of models. Math. Sci. 153:12–18.
van den Oord, E. J. C. G., Simonoff, E., Eaves, L. J., Pickles, A., Silberg, J., and Maes, H. (2000). An evaluation of different approaches for behavior genetic analyses with psychiatric symptom scores. Behav. Genet. 30:1–18.
Yang, C. (1998). Finite mixture model selection with psychometric applications (Doctoral dissertation, University of California, Los Angeles, 1996). Dissert. Abs. Int.: Sect. B: Sci. Eng. 59(9-A):3421.
Zhang, P. (1993). On the convergence of model selection criteria. Commun. Stat. Theory Meth. 22:2765–2775.
Cite this article
Markon, K.E., Krueger, R.F. An Empirical Comparison of Information-Theoretic Selection Criteria for Multivariate Behavior Genetic Models. Behav Genet 34, 593–610 (2004). https://doi.org/10.1007/s10519-004-5587-0
DOI: https://doi.org/10.1007/s10519-004-5587-0