Abstract
Model specification tests are essential tools for evaluating whether a probability model is appropriate for estimation and inference. White (Econometrica, 50: 1–25, 1982) proposed detecting model misspecification by testing the null hypothesis that the Fisher information matrix (IM) equality holds, comparing linear functions of the Hessian and outer product gradient (OPG) inverse covariance matrix estimators. Unfortunately, a number of researchers have reported difficulties in obtaining reliable inferences using White’s original information matrix test (IMT). In this chapter, we extend White (Econometrica, 50: 1–25, 1982) to present a new generalized information matrix test (GIMT) theory and develop a new Adjusted Classical GIMT and five new Eigenspectrum GIMTs that compare nonlinear functions of the Hessian and OPG covariance matrix estimators. We then evaluate the level and power of these new GIMTs in simulation studies on realistic epidemiological data and find that they exhibit appealing performance at sample sizes typically encountered in practice. Our results suggest that these new GIMTs are important tools for detecting and assessing model misspecification, and thus will have broad applications for model-based decision making in the social, behavioral, engineering, financial, medical, and public health sciences.
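To make the IM equality concrete, the following is a minimal sketch, not the chapter's GIMT procedure. It fits a correctly specified logistic regression to simulated data and computes the two estimators the abstract refers to: the negative average Hessian of the log-likelihood and the average outer product of the per-observation gradients. The function names, the simulated design, and the choice to print eigenvalues are all illustrative assumptions; no test statistic or p-value is computed here.

```python
import numpy as np

def fit_logistic(X, y, iters=25):
    """Maximum likelihood fit by Newton-Raphson (illustrative, no safeguards)."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p)                       # score of the log-likelihood
        hess = (X * (p * (1.0 - p))[:, None]).T @ X  # negative Hessian
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def hessian_and_opg(X, y, beta):
    """Return (A_hat, B_hat): negative average Hessian and average OPG."""
    n = X.shape[0]
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    A_hat = (X * (p * (1.0 - p))[:, None]).T @ X / n
    scores = X * (y - p)[:, None]                  # per-observation gradients
    B_hat = scores.T @ scores / n
    return A_hat, B_hat

# Hypothetical simulated data from a correctly specified logistic model.
rng = np.random.default_rng(0)
n = 5000
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
true_beta = np.array([-0.5, 1.0, -1.0])
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ true_beta))).astype(float)

beta_hat = fit_logistic(X, y)
A_hat, B_hat = hessian_and_opg(X, y, beta_hat)

# Under correct specification the two matrices estimate the same quantity,
# so their eigenvalues should be close for large n.
print("eigenvalues of A_hat:", np.linalg.eigvalsh(A_hat))
print("eigenvalues of B_hat:", np.linalg.eigvalsh(B_hat))
```

When the model is misspecified (for example, if a relevant covariate is omitted from `X`), the two matrices generally estimate different quantities, which is the discrepancy that information matrix tests are designed to detect.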
References
Agresti, A.: Categorical data analysis. New York: Wiley-Interscience, 2002.
Akaike, H.: “Information theory and an extension of the maximum likelihood principle”, 1973.
Alonso, A., S. Litière, and G. Molenberghs: “A family of tests to detect misspecifications in the random-effects structure of generalized linear mixed models”, Computational Statistics and Data Analysis, 52(2008), 4474–4486.
Aparicio, T., and I. Villanua: “The asymptotically efficient version of the information matrix test in binary choice models. A study of size and power”, Journal of Applied Statistics, 28(2001), 167–182.
Archer, K. J., and S. Lemeshow: “Goodness-of-fit test for a logistic regression model fitted using survey sample data”, The Stata Journal, 6(2006), 97–105.
Arminger, G., and M. E. Sobel: “Pseudo-maximum likelihood estimation of mean and covariance structures with missing data”, Journal of the American Statistical Association, 85(1990), 195–203.
Begg, M. D., and S. Lagakos: “On the consequences of model misspecification in logistic regression”, Environmental Health Perspectives, 87(1990), 69–75.
Bera, A. K., and S. Lee: “Information Matrix Test, Parameter Heterogeneity and ARCH: A Synthesis”, The Review of Economic Studies, 60(1993), 229–240.
Bertolini, G., R. D’Amico, D. Nardi, A. Tinazzi, and G. Apolone: “One model, several results: the paradox of the Hosmer-Lemeshow goodness-of-fit test for the logistic regression model”, Journal of Epidemiology and Biostatistics, 5(2000), 251–253.
Box, G. E. P., G. M. Jenkins, and G. C. Reinsel: Time Series Analysis: Forecasting and Control. New York: John Wiley & Sons, 2008.
Bozdogan, H.: “Akaike’s Information Criterion and Recent Developments in Information Complexity”, Journal of Mathematical Psychology, 44(2000), 62–91.
Bradley, A. P.: “The Use of the Area Under the ROC Curve in the Evaluation of Machine Learning Algorithms”, Pattern Recognition, 30(1997), 1145–1159.
Burnham, K. P., and D. R. Anderson: Model selection and multimodel inference: a practical information-theoretic approach. New York: Springer, 2002.
Chesher, A.: “The information matrix test: Simplified calculation via a score test interpretation”, Economics Letters, 13(1983), 45–48.
Chesher, A., and R. Spady: “Asymptotic Expansions of the Information Matrix Test Statistic”, Econometrica, 59(1991), 787–815.
Christensen, R.: Log-Linear Models and Logistic Regression. Springer Texts in Statistics, 1997.
Collett, D.: Modelling Binary Data. Chapman & Hall/CRC, 2003.
Copas, J.B.: “Unweighted sum of squares test for proportions”, Applied Statistics, 38(1989), 71–80.
Cox, D.R.: “Role of models in statistical analysis”, Statistical Science, 5(1990), 169–174.
Cramér, H.: Mathematical Methods of Statistics. Princeton: Princeton University Press, 1946.
Davidson, R., and J. G. MacKinnon: “A New Form of the Information Matrix Test”, Econometrica, 60(1992), 145–157.
Davidson, R., and J. G. MacKinnon: “Graphical Methods for Investigating the Size and Power of Hypothesis Tests”, The Manchester School, 66(1998), 1–26.
Davison, A. C., D. V. Hinkley, and G. A. Young: “Recent Developments in Bootstrap Methodology”, Statistical Science, 18(2003), 141–157.
Davison, A. C., and C. L. Tsai: “Regression model diagnostics”, International Statistical Review, 60(1992), 337–353.
Deng, X., S. Wan, and B. Zhang: “An improved goodness-of-fit test for logistic regression models based on case-control data by random partition”, Communications in Statistics: Simulation and Computation, 38(2009), 233–243.
Dhaene, G., and D. Hoorelbeke: “The information matrix test with bootstrap-based covariance matrix estimation”, Economics Letters, 82(2004), 341–347.
DHHS: “The International Classification of Diseases, 9th Revision, Clinical Modification (ICD-9-CM). DHHS Publication No. (PHS) 80–1280”, Washington D.C.: Department of Health and Human Services, 1980.
Farrington, C.P.: “On assessing goodness of fit of generalized linear models to sparse data”, Journal of the Royal Statistical Society, Series B, 58(1996), 349–360.
Fawcett, T.: “An introduction to ROC analysis”, Pattern Recognition Letters, 27(2006), 861–874.
Fisher, R.A.: “On the mathematical foundations of theoretical statistics”, Philosophical Transactions of the Royal Society of London, Series A, 222(1922), 309–368.
Gallini, J.: “Misspecifications that can result in path analysis structures”, Applied Psychological Measurement, 7(1983), 125–137.
Golden, R.M.: Mathematical methods for neural network analysis and design. Cambridge, Mass.: MIT Press, 1996.
Golden, R. M.: “Statistical tests for comparing possibly misspecified and nonnested models”, Journal of Mathematical Psychology, 44(2000), 153–170.
Golden, R.M.: “Discrepancy risk model selection test theory for comparing possibly misspecified or nonnested models”, Psychometrika, 68(2003), 229–249.
Greene, W.: Econometric Analysis. New Jersey: Prentice-Hall, 2003.
Hall, A.: “The Information Matrix Test for the Linear Model”, The Review of Economic Studies, 54(1987), 257–263.
Hamilton, J. D.: Time Series Analysis. Princeton, New Jersey: Princeton University Press, 1994.
Hanley, J. A., and B. J. McNeil: “The Meaning and Use of the Area under a Receiver Operating Characteristic (ROC) Curve”, Radiology, 143(1982), 29–36.
Harrell, F. E.: Regression modeling strategies: with applications to linear models, logistic regression, and survival analysis. New York: Springer, 2001.
Hastie, T., R. Tibshirani, and J. Friedman: The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer Series in Statistics, 2009.
Hastie, T. J., and R. J. Tibshirani: “Generalized additive models”, Statistical Science, 3(1986), 297–318.
Hastie, T. J., and R. J. Tibshirani: Generalized Additive Models. Chapman & Hall/CRC, 1990.
Henley, S. S., R. M. Golden, T. M. Kashner, and H. White: “Exploiting Hidden Structures in Epidemiological Data: Phase II Project”, (R44AA011607) National Institute on Alcohol Abuse and Alcoholism, 2000. http://www.sbir.gov/sbirsearch/detail/223679
Henley, S. S., R. M. Golden, T. M. Kashner, H. White, and R. D. Katz: “Improving Validity Measures for Alcohol-Related Models: Phase I Project”, (R43AA013351) National Institute on Alcohol Abuse and Alcoholism, 2001. http://www.sbir.gov/sbirsearch/detail/223681
Henley, S. S., R. M. Golden, T. M. Kashner, H. White, and R. D. Katz: “Robust Classification Methods for Categorical Regression: Phase I Project”, (R43AA014302) National Institute on Alcohol Abuse and Alcoholism, 2003. http://www.sbir.gov/sbirsearch/detail/223689
Henley, S. S., R. M. Golden, T. M. Kashner, H. White, and D. Paik: “Robust Classification Methods for Categorical Regression: Phase II Project”, (R44CA139607) National Cancer Institute, 2008. http://www.sbir.gov/sbirsearch/detail/223709
Henley, S. S., R. M. Golden, T. M. Kashner, H. White, L. Xuan, D. Paik, and R. D. Katz: “Improving Validity Measures in Alcohol-Related Models: Phase II Project”, (R44AA013351) National Institute on Alcohol Abuse and Alcoholism, 2004. http://www.sbir.gov/sbirsearch/detail/223693
Hilbe, J. M.: Logistic Regression Models. New York: Chapman and Hall, 2009.
Horowitz, J.L.: “Bootstrap-based critical values for the information matrix test”, Journal of Econometrics, 61(1994), 395–411.
Horowitz, J.L.: “The bootstrap in econometrics”, Statistical Science, 18(2003), 211–218.
Hosmer, D. W., T. Hosmer, S. LeCessie, and S. Lemeshow: “A comparison of goodness-of-fit tests for the logistic regression model”, Statistics in Medicine, 16(1997), 965–980.
Hosmer, D. W., and S. Lemeshow: “A goodness-of-fit test for the multiple logistic regression model”, Communications in Statistics, A10(1980), 1043–1069.
Hosmer, D. W., and S. Lemeshow: Applied Logistic Regression. New York: John Wiley & Sons, 2000.
Hosmer, D. W., S. Lemeshow, and J. Klar: “Goodness-of-Fit Testing for Multiple Logistic Regression Analysis when the Estimated Probabilities are Small”, Biometrical Journal, 30(1988), 1–14.
Hosmer, D. W., S. Taber, and S. Lemeshow: “The importance of assessing the fit of logistic regression models: a case study”, American Journal of Public Health, 81(1991), 1630–1635.
Huber, P.: “The behavior of maximum likelihood estimates under non-standard conditions”, University of California Press, 1967.
Kashner, T. M., T. J. Carmody, T. Suppes, A. J. Rush, M. L. Crismon, A. L. Miller, M. Toprac, and M. Trivedi: “Catching up on health outcomes: The Texas Medication Algorithm Project”, Health Services Research, 38(2003), 311–331.
Kashner, T. M., S. S. Henley, R. M. Golden, J. M. Byrne, S. A. Keitz, G. W. Cannon, B. K. Chang, G. J. Holland, D. C. Aron, E. A. Muchmore, A. Wicker, and H. White: “Studying the Effects of ACGME Duty Hours Limits on Resident Satisfaction: Results From VA Learners’ Perceptions Survey”, Academic Medicine, 85(2010), 1130–1139.
Kashner, T. M., S. S. Henley, R. M. Golden, A. J. Rush, and R. B. Jarrett: “Assessing the preventive effects of cognitive therapy following relief of depression: A methodological innovation”, Journal of Affective Disorders, 104(2007), 251–261.
Kashner, T. M., R. Rosenheck, A. B. Campinell, A. Suris, and C. W. T. S. Team: “Impact of work therapy on health status among homeless, substance-dependent veterans - A randomized controlled trial”, Archives of General Psychiatry, 59(2002), 938–944.
Konishi, S., and G. Kitagawa: “Generalized information criteria in model selection”, Biometrika, 83(1996), 875–890.
Kuss, O.: “Global goodness-of-fit tests in logistic regression with sparse data”, Statistics in Medicine, 21(2002), 3789–3801.
Lancaster, T.: “The Covariance Matrix of the Information Matrix Test”, Econometrica, 52(1984), 1051–1054.
Lehmann, E. L.: “Model specification: The views of Fisher and Neyman, and later developments”, Statistical Science, 5(1990), 160–168.
Maddala, G. S.: Limited-dependent and Qualitative Variables in Econometrics. New York: Cambridge, 1999.
Magnus, J. R.: “On differentiating eigenvalues and eigenvectors”, Econometric Theory, 1(1985), 179–191.
Magnus, J. R., and H. Neudecker: Matrix Differential Calculus with Applications in Statistics and Econometrics. New York: John Wiley & Sons, 1999.
McCullagh, P.: “On the asymptotic distribution of Pearson’s statistic in linear exponential family models”, International Statistical Review, 53(1985), 61–67.
McCullagh, P., and J. A. Nelder: Generalized linear models. New York: Chapman and Hall, 1989.
Orme, C.: “The Calculation of the Information Matrix Test for Binary Data Models”, The Manchester School, 56(1988), 370–376.
Orme, C.: “The small-sample performance of the information-matrix test”, Journal of Econometrics, 46(1990), 309–331.
Osius, G., and D. Rojek: “Normal goodness-of-fit tests for multinomial models with large degrees-of-freedom”, Journal of the American Statistical Association, 87(1992), 1145–1152.
Pepe, M. S.: The Statistical Evaluation of Medical Tests for Classification and Prediction. Oxford: Oxford University Press, 2004.
Politis, D. N., J. P. Romano, and M. Wolf: Subsampling. New York: Springer, 1999.
Qin, J., and B. Zhang: “A goodness-of-fit test for logistic regression models based on case-control data”, Biometrika, 84(1997), 609–618.
Raudenbush, S. W., and A. S. Bryk: Hierarchical Linear Models: Applications and Data Analysis Methods. Thousand Oaks, CA: Sage Publications, Inc., 2002.
Sarkar, S. K., and H. Midi: “Importance of assessing the model adequacy of binary logistic regression”, Journal of Applied Sciences, 10(2010), 479–486.
Serfling, R. J.: Approximation theorems of mathematical statistics. New York: John Wiley & Sons, 1980.
Stomberg, C., and H. White: “Bootstrapping the Information Matrix Test”, University of California, San Diego Department of Economics Discussion Paper, 2000.
Stukel, T.A.: “Generalized logistic models”, Journal of the American Statistical Association, 83(1988), 426–431.
Takeuchi, K.: “Distribution of information statistics and a criterion of model fitting for adequacy of models”, Mathematical Sciences, 153(1976), 12–18.
Taylor, L.W.: “The Size Bias of White’s Information Matrix Test”, Economics Letters, 24(1987), 63–67.
Tsay, R.S.: Analysis of Financial Time Series. New York: John Wiley & Sons, 2010.
Tsiatis, A.A.: “A Note on a goodness-of-fit test for the logistic regression model”, Biometrika, 67(1980), 250–251.
Verbeke, G., and E. Lesaffre: “The effect of misspecifying the random-effects distribution in linear mixed models for longitudinal data”, Computational Statistics and Data Analysis, 23(1997), 541–556.
Vuong, Q.H.: “Likelihood ratio tests for model selection and non-nested hypotheses”, Econometrica, 57(1989).
Wald, A.: “Tests of Statistical Hypotheses Concerning Several Parameters When the Number of Observations is Large”, Transactions of the American Mathematical Society, 54(1943), 426–482.
Wei, B.: Exponential Family Nonlinear Models. New York: Springer, 1998.
White, H.: “Using least squares to approximate unknown regression functions”, International Economic Review, 21(1980), 149–170.
White, H.: “Consequences and detection of misspecified nonlinear regression models”, Journal of the American Statistical Association, 76(1981), 419–433.
White, H.: “Maximum Likelihood Estimation of Misspecified Models”, Econometrica, 50(1982), 1–25.
White, H.: “Specification Testing in Dynamic Models”, Cambridge University Press, 1987.
White, H.: Estimation, inference, and specification analysis. Cambridge: Cambridge University Press, 1994.
Wickens, T.D.: Elementary Signal Detection Theory. New York: Oxford University Press, 2002.
Winkler, G.: Image Analysis, Random Fields, and Dynamic Monte Carlo Methods. New York: Springer-Verlag, 1991.
Zhang, B.: “A chi-squared goodness-of-fit test for logistic regression models based on case-control data”, Biometrika, 86(1999), 531–539.
Zhang, B.: “An information matrix test for logistic regression models based on case-control data”, Biometrika, 88(2001), 921–932.
Acknowledgments
This research was made possible by grants from the National Cancer Institute (NCI) (R44CA139607, PI: S.S. Henley) and the National Institute on Alcohol Abuse and Alcoholism (NIAAA) (R43AA014302, PI: S.S. Henley; R43/44AA013351, PI: S.S. Henley; R44AA011607, PI: S.S. Henley) under the Small Business Innovation Research (SBIR) program. The authors wish to gratefully acknowledge this support. This chapter reflects the authors’ views and not necessarily the opinions or views of the NCI or the NIAAA. The authors would also like to thank the anonymous referee for helpful comments and suggestions.
Copyright information
© 2013 Springer Science+Business Media New York
About this chapter
Cite this chapter
Golden, R.M., Henley, S.S., White, H., Kashner, T. (2013). New Directions in Information Matrix Testing: Eigenspectrum Tests. In: Chen, X., Swanson, N. (eds) Recent Advances and Future Directions in Causality, Prediction, and Specification Analysis. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-1653-1_6
DOI: https://doi.org/10.1007/978-1-4614-1653-1_6
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-1652-4
Online ISBN: 978-1-4614-1653-1
eBook Packages: Business and Economics, Economics and Finance (R0)