Abstract
There has been considerable debate on how important goodness of fit is as a tool in regression analysis, especially with regard to the controversy over R² in linear regression. This article reviews some of the arguments of this debate and its relationship to other goodness of fit measures. It attempts to clarify the distinction between goodness of fit measures and other model evaluation tools as well as the distinction between model test statistics and descriptive measures used to make decisions on the agreement between models and data. It also argues that the utility of goodness of fit measures depends on whether the analysis focuses on explaining the outcome (model orientation) or explaining the effect(s) of some regressor(s) on the outcome (factor orientation).
In some situations a decisive goodness of fit test statistic exists and is a central tool in the analysis. In other situations, where the goodness of fit measure is not a test statistic but a descriptive measure, it can be used as a heuristic device along with other evidence whenever appropriate. The availability of goodness of fit test statistics depends on whether the variability in the observations is restricted, as in table analysis, or whether it is unrestricted, as in OLS and logistic regression on individual data. Hence, G² is a decisive tool for measuring goodness of fit, whereas R² and SEE are heuristic tools.
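As a rough illustration of the three measures named above, the following sketch computes G² for a small two-way table and R² and SEE for a simple OLS fit. It uses the standard textbook definitions, not code from the article; the function names and all data values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def independence_expected(table):
    """Expected cell counts under independence for a two-way table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[rt * ct / n for ct in col_totals] for rt in row_totals]


def g_squared(table, expected):
    """Likelihood-ratio statistic G^2 = 2 * sum(O * ln(O / E)).

    Cells with an observed count of zero contribute nothing to the sum.
    """
    total = 0.0
    for obs_row, exp_row in zip(table, expected):
        for o, e in zip(obs_row, exp_row):
            if o > 0:
                total += o * math.log(o / e)
    return 2.0 * total


def ols_fit_measures(x, y):
    """Fit y = a + b*x by least squares and return (R^2, SEE).

    R^2 = 1 - SSR/SST; SEE = sqrt(SSR / (n - 2)), the standard error
    of estimate for a model with an intercept and one regressor.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    ssr = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ssr / sst, math.sqrt(ssr / (n - 2))


# Hypothetical 2x2 table (made-up counts, for illustration only).
table = [[30, 20], [20, 30]]
g2 = g_squared(table, independence_expected(table))
# With 1 degree of freedom, G^2 can be referred to a chi-square
# distribution (critical value 3.841 at the 0.05 level).
print("G^2:", round(g2, 3))

# Hypothetical individual-level data (made up, for illustration only).
r2, see = ols_fit_measures([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print("R^2:", round(r2, 3), "SEE:", round(see, 3))
```

The contrast the abstract draws shows up in how the outputs are read: G² can be compared against a reference distribution with known degrees of freedom, whereas R² and SEE are descriptive quantities with no built-in cut-off.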
Cite this article
Hagquist, C., Stenbeck, M. Goodness of Fit in Regression Analysis – R² and G² Reconsidered. Quality & Quantity 32, 229–245 (1998). https://doi.org/10.1023/A:1004328601205
DOI: https://doi.org/10.1023/A:1004328601205