
Goodness of Fit in Regression Analysis – R² and G² Reconsidered

Quality and Quantity

Abstract

There has been considerable debate on how important goodness of fit is as a tool in regression analysis, especially with regard to the controversy on R² in linear regression. This article reviews some of the arguments of this debate and its relationship to other goodness of fit measures. It attempts to clarify the distinction between goodness of fit measures and other model evaluation tools as well as the distinction between model test statistics and descriptive measures used to make decisions on the agreement between models and data. It also argues that the utility of goodness of fit measures depends on whether the analysis focuses on explaining the outcome (model orientation) or explaining the effect(s) of some regressor(s) on the outcome (factor orientation).

In some situations a decisive goodness of fit test statistic exists and is a central tool in the analysis. In other situations, where the goodness of fit measure is not a test statistic but a descriptive measure, it can be used as a heuristic device along with other evidence whenever appropriate. The availability of goodness of fit test statistics depends on whether the variability in the observations is restricted, as in table analysis, or whether it is unrestricted, as in OLS and logistic regression on individual data. Hence, G² is a decisive tool for measuring goodness of fit, whereas R² and SEE are heuristic tools.
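The contrast drawn above can be illustrated with a minimal sketch. It is not taken from the article: the function names and the data are hypothetical, and it assumes the simplest cases only, a two-way table tested against the independence model for G² and a one-regressor OLS fit for R² and SEE (read here as the standard error of estimate). The G² value can be referred to a chi-square distribution and so yields a test decision, while R² and SEE are only reported as descriptive summaries.

```python
import math


def g_squared(observed):
    """Likelihood-ratio statistic G^2 = 2 * sum(O * ln(O / E)) for a
    two-way table, with E taken from the independence model."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    g2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            if o > 0:  # a zero cell contributes nothing to the sum
                g2 += 2.0 * o * math.log(o / e)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return g2, df


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of
    freedom (valid for df = 1 only, e.g. a 2 x 2 table)."""
    return math.erfc(math.sqrt(x / 2.0))


def r_squared_and_see(x, y):
    """R^2 and standard error of estimate for OLS of y on a single x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - sse / sst
    see = math.sqrt(sse / (n - 2))  # n - 2: intercept and slope estimated
    return r2, see


# Hypothetical 2 x 2 table (grouped data): G^2 supports a formal test.
table = [[30, 10], [10, 30]]
g2, df = g_squared(table)
print("G^2 =", round(g2, 3), "df =", df, "p =", chi2_sf_df1(g2))

# Hypothetical individual-level data: R^2 and SEE are descriptive only.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]
r2, see = r_squared_and_see(x, y)
print("R^2 =", round(r2, 4), "SEE =", round(see, 4))
```

With one degree of freedom, a G² above roughly 3.84 leads to rejecting the independence model at the 5% level. No comparable cut-off exists for R² or SEE, which is why they are treated as heuristic here.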




Cite this article

Hagquist, C., Stenbeck, M. Goodness of Fit in Regression Analysis – R² and G² Reconsidered. Quality & Quantity 32, 229–245 (1998). https://doi.org/10.1023/A:1004328601205


  • DOI: https://doi.org/10.1023/A:1004328601205
