
The Variable Selection Problem in the Three Worlds of Welfare Literature


Based on a quantitative meta-analysis of empirical studies, this article points out a significant flaw in the Three Worlds of Welfare literature, the “variable selection problem.” Compiling, classifying, and quantitatively analysing all variables that have been employed in this literature, the article shows first that variable selection has depended more on case selection than on theory. Scholars tend to employ variables based on data availability, rather than selecting variables according to theoretical frameworks. Second, the use of welfare policy variables is mostly limited to the analysis of Organization for Economic Co-operation and Development (OECD) countries, while studies analysing non-OECD countries, where data are limited, tend to use developmental outcome variables as a proxy. This tendency harms the conceptualization and operationalization of welfare regimes and blurs the boundary between development studies and welfare regime studies. Third, the use of the original Esping-Andersen variables remains very limited, undermining continuity, comparability, and reliability within the literature.


Fig. 1


  1. We contend that the use of certain political variables is necessary for capturing the political causes or effects of welfare regime changes. Our intention, however, is to point out a more general tendency in the literature to readily replace welfare policy variables with other variables, with little regard for the ensuing validity problems.

  2. Here “all” does not refer to all countries in the world but rather connotes those with available comparable datasets. Kim (2015, p. 314) refers to this group of studies as those encompassing OECD and non-OECD nations on the basis of certain case selection criteria. Even if the study itself aimed at “all nation-states of the world,” as for instance in Abu Sharkh (2009), case selection was narrowed down to those reporting data or with UN or World Bank “guestimates” of national data (Kim 2015, p. 315).

  3. To allow replication of the analysis provided in this paper, an anonymized replication dataset is uploaded online at

  4. Each observation refers to a variable used in a specific study. The same variable can be reused in multiple articles, and we treat each use of a variable in a specific study as a single observation. Therefore, our unit of analysis is a “variable-study” dyad. For convenience, in the rest of the article, wherever we use the term variable, we refer to a particular instance in which a variable is used by a particular article.

  5. Exponentiated coefficients obtained from a multinomial logit are called “relative risk ratios,” rather than odds ratios. We employed a multinomial logistic regression analysis since our dependent variable has more than two categories. The multinomial logit method is used to predict a nominal dependent variable given one or more independent variables. It is an extension of binomial logistic regression that allows for a dependent variable with more than two categories. As with other types of regression, multinomial logistic regression can have nominal and/or continuous independent variables and can include interactions between independent variables to predict the dependent variable.

  6. These datasets and related documentation are available from, and

  7. The 16 variables used by Esping-Andersen (1990) are as follows: (1) minimum pension replacement rate, (2) standard pension replacement rate, (3) number of years of contributions required to qualify for an old-age pension, (4) the share of total pension finance paid by individuals, (5) the percentage of persons above pension age actually receiving a pension (take-up rate), (6) sickness benefits replacement rate, (7) number of waiting days to receive sickness benefits, (8) number of weeks of sickness benefit duration, (9) unemployment benefits replacement rate, (10) number of waiting days to receive unemployment benefits, (11) number of weeks of unemployment benefit duration, (12) corporatism (occupationally distinct public pension schemes), (13) etatism (measured as expenditure on pensions to government employees as a percentage of GDP), (14) means-tested poor relief (as a percentage of total public social expenditure), (15) private pensions (as a percentage of total pensions), and (16) private health spending (as a percentage of total). In his analysis, Esping-Andersen (1990) also uses “average universalism” and “average benefit equality” measures, but we excluded those from our review since these two measures are his own index calculations and hence not original variables.

  8. Scruggs and Allan (2006, 2008) and Powell and Barrientos (2004) are examples of good practice. They use genuine social policy variables to represent welfare state efforts. Castles and Obinger (2008) can be cited as an example of the type of flaw that dominates the literature. In their cluster analysis, they mixed different types of variables to estimate welfare regimes, including total fertility rate, public sector employment, social security contributions, direct taxes, indirect taxes, inflation, unemployment, education expenditure, subsidies, male employment, social transfers, total tax revenues, female employment, outlays of government, and economic growth.
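
The relative risk ratios mentioned in note 5 can be illustrated with a minimal sketch. The coefficients and intercepts below are hypothetical values chosen for illustration, not estimates from the article, and the helper names (category_probabilities, relative_risk) are assumptions of this sketch. It shows that, in a multinomial logit with a base category, the exponentiated coefficient equals the factor by which the risk of a category relative to the base changes when the predictor increases by one unit.

```python
import math


def category_probabilities(x, coefs, intercepts):
    """Multinomial logit probabilities with category 0 as the base outcome.

    coefs[k] and intercepts[k] belong to non-base category k + 1; the base
    category has its linear predictor fixed at zero.
    """
    scores = [0.0] + [a + b * x for a, b in zip(intercepts, coefs)]
    denom = sum(math.exp(s) for s in scores)
    return [math.exp(s) / denom for s in scores]


def relative_risk(probs, k):
    """Risk of category k relative to the base category 0."""
    return probs[k] / probs[0]


# Hypothetical parameters for a three-category outcome (illustrative only).
intercepts = [0.2, -0.5]
coefs = [0.7, -0.3]

p_at_0 = category_probabilities(0.0, coefs, intercepts)
p_at_1 = category_probabilities(1.0, coefs, intercepts)

# Relative risk ratio for category 1 when x moves from 0 to 1,
# computed once from the probabilities and once from the coefficient.
rrr_from_probs = relative_risk(p_at_1, 1) / relative_risk(p_at_0, 1)
rrr_from_coef = math.exp(coefs[0])

print(rrr_from_probs, rrr_from_coef)
```

The two printed values agree up to floating-point error, which is why exponentiated multinomial logit coefficients are read as ratios of relative risks against the base category rather than as odds ratios.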




We thank the team members of the Emerging Welfare ERC project for their comments and criticisms, the editor of Social Indicators Research for her continuous support, and the anonymous reviewers for their very constructive criticisms and suggestions.


This work was supported by the European Research Council (ERC) (Grant Number: 714868).

Author information



Corresponding author

Correspondence to Erdem Yörük.


About this article


Cite this article

Yörük, E., Öker, İ., Yıldırım, K. et al. The Variable Selection Problem in the Three Worlds of Welfare Literature. Soc Indic Res 144, 625–646 (2019).



  • Welfare modelling
  • Case selection
  • Methodology
  • Comparative analysis
  • Welfare regime