Adding Meaning to Regression

Taagepera, Rein

doi:10.1057/eps.2010.28

Research
Published: 17 December 2010

Volume 10, pages 73–85, (2011)
European Political Science

Rein Taagepera^1,2

Abstract

In any data analysis we should look for ability to predict and for connections to a broader comparative context. Our equations must not predict absurdities, even under extreme circumstances, if we want to be taken seriously as scientists. Poorly done linear regression analysis often does lead to absurd predictions. Fixed exponent and exponential patterns seem more prevalent in social nature than linear patterns. Before applying regression to two variables, graph them against each other, showing the borders of the conceptually allowed space and possible logical anchor points. Transform the data until anchor points and data points do fit a straight line which does not pierce conceptual ceilings or floors. During regression, consider symmetric regression, because Ordinary Least Squares y-on-x and x-on-y differ from each other and their slopes depend on the degree of scatter. After regression, look at the numerical values of parameters and ask what they tell us in a comparative context. When considering multivariable regression, pay more than lip service to Occam's Razor.

Acknowledgements

I thank Allan Sikk, Mirjam Allik, Rune H. Andersen, Russ Dalton, Steve Coleman and two anonymous reviewers for thoughtful comments on the manuscript, and Rune also for finalizing the graphs.

Author information

Authors and Affiliations

Institute of Government, University of Tartu, Tartu, EE-50090, Estonia
Rein Taagepera
School of Social Sciences, University of California, Irvine, 92697, CA, USA
Rein Taagepera

APPENDIX

TURNING CURVES INTO STRAIGHT LINES AND CALCULATING THE PARAMETERS

Unbounded field

If we expect a linear pattern y=a+bx because the field is unbounded (all four quadrants are allowed), no transformation is needed. Graph y versus x. If the data cloud is linear, then y=a+bx applies, and we can regress y versus x.

How can we find the coefficients a and b, using the visual best-fit line?

Intercept a is the value of y where the line crosses the y axis (because here x=0).
Slope b is the ratio −a/c, c being the value of x where the line crosses the x axis (because here y=0).
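
The axis-crossing rule can be sketched in a few lines of Python. The function name and the sample crossings are illustrative assumptions, not part of the article; the arithmetic is just a=(y at x=0) and b=−a/c.

```python
def line_from_axis_crossings(y_crossing, x_crossing):
    """Return (a, b) for y = a + b*x from the two axis crossings.

    y_crossing: value of y where the line crosses the y axis (x = 0).
    x_crossing: value c of x where the line crosses the x axis (y = 0).
    """
    if x_crossing == 0:
        # A line through the origin crosses both axes at the same point,
        # so the slope cannot be recovered from the crossings alone.
        raise ValueError("x_crossing must be non-zero")
    a = y_crossing
    b = -a / x_crossing
    return a, b


# Hypothetical reading from a graph: the line crosses y at 6 and x at -3.
a, b = line_from_axis_crossings(6.0, -3.0)
print(a, b)  # 6.0 2.0
```

When the visual line passes through or very near the origin, the crossings carry no slope information and the two-point approach is the safer choice.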

How can we find the coefficient values in y=a+bx from two points?

Take two ‘typical’ points along the axis of the data belt, far away from each other: x₁,y₁ and x₂,y₂.
For y=a+bx we have b=(y₁−y₂)/(x₁−x₂). Then a=y₁−bx₁.
When a=0 is imposed, the equation is reduced to y=bx. Then b=y₁/x₁.
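
As a minimal sketch of the two-point calculation, assuming the points are given as (x, y) pairs (the function names and sample points are made up for illustration):

```python
def line_from_two_points(p1, p2):
    """Return (a, b) for y = a + b*x passing through p1 and p2."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("the two points need different x values")
    b = (y1 - y2) / (x1 - x2)   # slope
    a = y1 - b * x1             # intercept
    return a, b


def slope_through_origin(p1):
    """Return b for y = b*x (a = 0 imposed) passing through p1."""
    x1, y1 = p1
    return y1 / x1


print(line_from_two_points((1, 5), (4, 11)))  # (3.0, 2.0)
print(slope_through_origin((4, 10)))          # 2.5
```

Choosing points far apart, as the text recommends, keeps small reading errors in y from being magnified when divided by a small x₁−x₂.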

Only one quadrant allowed

If we expect a fixed exponent pattern y=Ax^k, because only one quadrant is allowed (x and y both positive), taking logarithms leads to a linear relationship between log y and log x: log y=log A+k log x. Designating log A as a takes us to the familiar linear form (log y)=a+k(log x). Graph log y versus log x. If the data cloud is linear, then y=Ax^k applies. Then we can regress log y versus log x.
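
The log-log regression can be sketched as follows. This is an illustration under stated assumptions: it uses plain Ordinary Least Squares of log y on log x (not the symmetric regression the abstract recommends considering), the helper names are hypothetical, and the data are synthetic points generated from y=3x^0.5.

```python
import math


def ols(xs, ys):
    """Ordinary Least Squares of ys on xs; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_power_law(xs, ys):
    """Fit y = A * x**k by regressing log10(y) on log10(x)."""
    log_x = [math.log10(x) for x in xs]
    log_y = [math.log10(y) for y in ys]
    a, k = ols(log_x, log_y)   # (log y) = a + k (log x)
    return 10 ** a, k          # A = 10**a because a = log A


# Synthetic, noise-free data lying exactly on y = 3 * x**0.5.
xs = [1, 4, 9, 16, 25]
ys = [3, 6, 9, 12, 15]
A, k = fit_power_law(xs, ys)
print(round(A, 6), round(k, 6))
```

All x and y values must be strictly positive, which is exactly the one-quadrant condition that motivates the transformation.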

How can we find the coefficients A and k in y=Ax^k, using the log-log graph?

Coefficient A is the value of y where the line crosses the log y axis (because here log x=0 and x=1).
Exponent k is the ratio −(log A)/c, c being the value of log x where the line crosses the log x axis (because here log y=0).
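
A small sketch of reading A and k off the log-log graph. Because the straight line lives in logarithmic coordinates, (log y)=a+k(log x) with a=log A, the exponent is computed here from log A rather than from A itself. The function name and the sample crossings are assumptions for illustration, and decimal logarithms are assumed on both axes.

```python
import math


def power_law_from_loglog_crossings(y_at_crossing, log_x_at_crossing):
    """Return (A, k) for y = A * x**k from the two crossings on a log-log graph.

    y_at_crossing: value of y where the line crosses the log y axis (x = 1).
    log_x_at_crossing: value c of log10(x) where the line crosses the
        log x axis (y = 1, so log y = 0).
    """
    if log_x_at_crossing == 0:
        raise ValueError("line passes through (x=1, y=1); use two points instead")
    A = y_at_crossing
    k = -math.log10(A) / log_x_at_crossing   # from 0 = log A + k*c
    return A, k


# Hypothetical reading: y = 100 at x = 1, and y = 1 at x = 0.1 (log x = -1).
print(power_law_from_loglog_crossings(100.0, -1.0))  # (100.0, 2.0)
```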

How can we find the coefficient values in y=Ax^k from two points on the curved graph y versus x?

Take two ‘typical’ points of the data belt, far away from each other: x₁,y₁ and x₂,y₂.
For y=Ax^k we have k=log(y₁/y₂)/log(x₁/x₂). Then A=y₁/(x₁^k).
When A=1 is imposed, the equation is reduced to y=x^k. Then k=log y₁/log x₁.
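
The two-point formulas for the fixed exponent pattern can be sketched like this. Function names and sample points are illustrative assumptions; either natural or decimal logarithms work for k because only a ratio of logarithms is involved.

```python
import math


def power_law_from_two_points(p1, p2):
    """Return (A, k) for y = A * x**k passing through p1 and p2."""
    (x1, y1), (x2, y2) = p1, p2
    k = math.log(y1 / y2) / math.log(x1 / x2)
    A = y1 / x1 ** k
    return A, k


def exponent_with_unit_coefficient(p1):
    """Return k for y = x**k (A = 1 imposed) passing through p1."""
    x1, y1 = p1
    return math.log(y1) / math.log(x1)


# Two hypothetical points lying on y = 3 * x**2.
A, k = power_law_from_two_points((2, 12), (8, 192))
print(round(A, 6), round(k, 6))
print(round(exponent_with_unit_coefficient((10, 1000)), 6))
```

The A=1 variant fails at x₁=1, where log x₁=0; any point on y=x^k has y=1 there, so it carries no information about k.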

Only two quadrants allowed

If we expect an exponential pattern y=A(B^x), because only the two positive-y quadrants are allowed (x can take any value but y must be positive), taking logarithms leads to a linear relationship between log y and non-logged x: log y=log A+x(log B). Designating log A as a and log B as b takes us to the familiar linear form (log y)=a+bx. Graph log y versus x itself. If the data cloud is linear, then y=A(B^x) applies. Then we can regress log y versus x itself.

There are often reasons to use the alternative exponential expression y=A(e^(kx)) and natural logarithms (ln). Because ln e=1, taking natural logarithms gives ln y=ln A+kx=a+kx, where a now stands for ln A. Graph ln y versus x itself. If the data cloud is linear, then y=Ae^(kx) applies. Then we can regress ln y versus x itself.
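
A minimal sketch of the semilog regression, assuming plain Ordinary Least Squares of ln y on x. The helper names are hypothetical and the data are synthetic, generated from y=2e^(0.3x); the base B of the equivalent form y=A(B^x) is recovered as e^k.

```python
import math


def ols(xs, ys):
    """Ordinary Least Squares of ys on xs; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_exponential(xs, ys):
    """Fit y = A * exp(k*x) by regressing ln(y) on x; returns (A, k, B)."""
    ln_y = [math.log(y) for y in ys]
    a, k = ols(xs, ln_y)                 # ln y = a + k x, with a = ln A
    return math.exp(a), k, math.exp(k)   # B = e**k in y = A * B**x


# Synthetic, noise-free data on y = 2 * exp(0.3 * x); x may be negative.
xs = [-2, -1, 0, 1, 2, 3]
ys = [2 * math.exp(0.3 * x) for x in xs]
A, k, B = fit_exponential(xs, ys)
print(round(A, 6), round(k, 6), round(B, 6))
```

Only y is logged, so x is free to be zero or negative while every y must be strictly positive.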

How do natural (ln x) and decimal (log x) logarithms relate? ln x=2.30log x and conversely, log x=0.434ln x. Often we can use either.
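
The two conversion factors are ln 10 and its reciprocal log e, which a quick check confirms. The helper names below are illustrative; the last line shows the same factor linking the two exponential forms, since k=ln B=2.30 log B.

```python
import math

LN_10 = math.log(10)          # factor turning log10 into ln
LOG10_E = math.log10(math.e)  # factor turning ln into log10

print(round(LN_10, 4))    # 2.3026
print(round(LOG10_E, 4))  # 0.4343


def ln_from_log10(log10_value):
    """Convert a decimal logarithm into a natural logarithm."""
    return LN_10 * log10_value


def log10_from_ln(ln_value):
    """Convert a natural logarithm into a decimal logarithm."""
    return LOG10_E * ln_value


x = 50.0
assert math.isclose(ln_from_log10(math.log10(x)), math.log(x))
assert math.isclose(log10_from_ln(math.log(x)), math.log10(x))

# Rate constant k for a hypothetical base B = 2 in y = A * B**x = A * e**(k*x).
print(round(ln_from_log10(math.log10(2.0)), 4))  # 0.6931
```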

How can we find the coefficients in y=A(B^x) or y=Ae^(kx), using the ‘semilog’ graph?

One may get confused between log and ln, so it is better to use the two-point formula below.

How can we find the coefficient values in y=A(B^x)=A(e^(kx)) from two points on the curved graph y versus x?

Take two ‘typical’ points of the data belt, far away from each other: x₁,y₁ and x₂,y₂.
For y=A(B^x) we have log B=[log(y₁/y₂)]/(x₁−x₂). Then B=10^(log B) and A=y₁/(B^(x₁)).
For y=A(e^(kx)) we have k=[ln(y₁/y₂)]/(x₁−x₂). Then A=y₁(e^(−kx₁)).
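
Both two-point formulas can be sketched in one function that returns A together with B and k. The function name and the sample points are assumptions for illustration; the two forms describe the same curve because B=e^k.

```python
import math


def exponential_from_two_points(p1, p2):
    """Return (A, B, k) with y = A * B**x = A * exp(k*x) through p1 and p2."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("the two points need different x values")
    # Decimal-log route for the base B.
    log_B = math.log10(y1 / y2) / (x1 - x2)
    B = 10 ** log_B
    # Natural-log route for the rate constant k.
    k = math.log(y1 / y2) / (x1 - x2)
    A = y1 * math.exp(-k * x1)
    return A, B, k


# Two hypothetical points lying on y = 3 * 2**x.
A, B, k = exponential_from_two_points((1, 6), (3, 24))
print(round(A, 6), round(B, 6), round(k, 6))
```

Computing A from either route gives the same value, since y₁/(B^(x₁)) and y₁e^(−kx₁) are the same quantity written in two bases.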

About this article

Cite this article

Taagepera, R. Adding Meaning to Regression. Eur Polit Sci 10, 73–85 (2011). https://doi.org/10.1057/eps.2010.28

Published: 17 December 2010
Issue Date: 01 March 2011
DOI: https://doi.org/10.1057/eps.2010.28
