This chapter discusses methods appropriate to the study of social protection in Latin America. It reviews and assesses the main methodological approaches employed by researchers in the past and, building on these, it identifies methods appropriate to developing a theoretical perspective capable of explaining the evolution and current configuration of social protection institutions in the region. The task is to select methods of data collection and analysis that will connect theoretical propositions to empirical counterparts.

By and large, the literature on social protection in Latin America has relied on case studies, cross-country institutional comparisons, linear structural equations estimated on aggregate annual data, and quasi-experimental evaluation of social protection interventions. After the turn of the century, critical improvements in data availability have supported comparative analysis. The spread, regularity, and accessibility of individual and household data (Footnote 1) since the mid-1990s have facilitated the application of a wider range of analytical tools for the comparative study of social protection. Dedicated beneficiary surveys in some countries in the region provide good-quality, detailed data on social protection (Footnote 2). More recently, impact evaluation survey data collected to assess the effects of social assistance programmes have become available to researchers. They offer an excellent resource for research on social protection outcomes (Footnote 3). Region-wide attitudinal survey data provide an essential resource to study public preferences on social protection (Footnote 4). There are also notable improvements in the availability of administrative data (Footnote 5).

Recent trends in social research methods reveal a renewed emphasis on causal inference. This is the outcome of push and pull factors. On the one hand, there is a growing awareness of the deficiencies associated with correlation bias in quantitative analysis. On the other, there is the growing application of experimental approaches to data collection and analysis (Angrist & Pischke, 2008; Banerjee & Duflo, 2008; Gelman & Imbens, 2013). The application of quasi-experimental methods in the study of the outcomes of social protection interventions, initially in the context of conditional income transfers, has focused attention on the causal effects of social protection interventions.

The issue to be discussed in this chapter is the extent to which these methodological developments can guide research on social protection institutions. There are significant challenges to causal inference in social protection research. Arguably, the study of social protection institutions could be seen as fundamentally different to the study of whether conditional income transfers have specific effects on their recipients. There are obvious limits to the application of experimental techniques in the study of social protection institutions and policies. Institutions evolve over time, making it hard to keep conditions unchanged. Control groups can be hard to identify in the context of social policy. Implementation issues can be significant in the context of decentralised programmes affected by territorial diversity (Niedzwiecki, 2018).

Nevertheless, causal inference studies are increasingly being implemented in the study of social policy (Morgan & Winship, 2015). Taking causal inference seriously promises to deliver significant gains. The implementation of graphical causal models forces researchers to examine critically the relationships between the variables of interest on the basis of their theoretical frameworks (Elwert, 2013). Sifting causal from non-causal factors associated with social protection systematically, as opposed to plugging multiple control variables into regression exercises, is a welcome discipline. The potential outcomes approach can be usefully applied in the context of observational data, provided the preconditions for its applicability are considered carefully (Imbens, 2019). And alternatives to time-series-cross-section studies that avoid post-treatment bias and omitted variables can support credible and reliable causal inference (Blackwell & Glynn, 2018). Identifying the challenges of implementing causal inference methods in social protection research could contribute to our understanding of these institutions in the region.

The materials in this chapter are arranged as follows. Section 2.1 will provide a brief review of the methods employed in the current literature on social protection in Latin America. Section 2.2 will make a case for adopting methods focused on causal inference. Section 2.3 discusses the implications of distinguishing between ‘effects of cause’ and ‘causes of effects’ for our approach to methods. Sections 2.4 and 2.5 offer a brief introduction to two causal inference tools: the potential outcomes approach and directed acyclic graphs. Section 2.6 identifies causality deficits in current social protection research. Section 2.7 summarises methodological challenges associated with placing greater emphasis on causal models. A final section concludes.

2.1 Methods in the Comparative Literature

The comparative literature on social protection in the region, especially in the last quarter of the last century, relied mainly on country and cross-country case studies. Case studies described institutional patterns and their evolution over time, focusing on the institutional set-up in specific countries or focusing on specific institutions and instruments. In the new century, case studies focusing on multiple countries and regional comparisons have expanded (Footnote 6). Regional reports from ECLAC, ILO, IADB, UNDP and from the World Bank regularly offer comprehensive regional perspectives on social protection (Footnote 7). This comparative literature provides a substantive knowledge base on social protection policies and practices in the region. The analytical interest is often the identification of trends, cross-country and sub-regional differences in institutional performance, responses to the issue of the day, and comparisons with other regions or countries.

The identification of country clusters has generated a lively literature grouping countries according to the key features of their institutions. The relevant studies will be covered in some detail in a later chapter. Here the interest is to identify the methods employed. Studies on country clusters follow up on the welfare regime literature on welfare states pioneered by the work of Esping-Andersen (1990). This approach distinguishes welfare systems according to the institution that dominates their welfare provision: markets, the state, or families. This comparative approach has generated a great deal of research interest because it claims to unveil fundamental differences in the structure of capitalism among high income countries (Footnote 8). The main analytical tools employed in this literature cluster countries by extracting an index of differences or similarities from multivariate data (Barrientos, 2015). Cluster analysis is a commonly used approach (Abu Sharkh & Gough, 2010; Gough, 2001; Hirschberg et al., 1991; Powell & Barrientos, 2004). Martínez Franzoni (2008) has applied this approach to social policy in Latin America. Alternative data reduction techniques have been implemented in the region. Cruz-Martínez (2014, 2017b) employs principal components analysis to reduce several indicators, supporting the construction of indexes which are then combined into a multidimensional welfare index. Countries are then ranked by their scores on this welfare index. Two points from this literature need to be underlined at this juncture. First, a crucial underlying assumption is that welfare institutions are multidimensional, hence the need to implement data reduction. Second, country groupings are helpful, analytically, in distinguishing multiple configurations of welfare institutions in the region, but they also have a normative content in as much as they reveal good and bad outcomes.

Qualitative Comparative Analysis (QCA) has proved especially useful in the study of institutions, their configuration, and change over time (Kangas, 1994; Kvist, 2007; Vis, 2012). It has been applied to the study of welfare institutions in Latin America (Cruz-Martínez, 2017a, 2019; Segura-Ubiergo, 2007). Simulation studies are also effective in providing a comparative perspective on social protection (Altamirano Montoya et al., 2018).

Time-series-cross-section and panel data are employed extensively in economic and political studies which include social protection institutions. Studies have considered whether globalisation has undermined social expenditure (Avelino et al., 2005; Huber et al., 2008) or whether democracy or conflict are associated with social programmes and expenditure (Yörük, 2022). Studies have relied on time-series-cross-section data to investigate the factors associated with the origins and evolution of welfare institutions (Haggard & Kaufman, 2008; Huber et al., 2008; Kaufman & Segura-Ubiergo, 2001; Segura-Ubiergo, 2007). Time-series-cross-section and panel data analysis are grounded on structural linear equation models unveiling correlations among variables of interest. The statistical models estimated are scrutinised as regards the significance of the estimated coefficients and the strength of the model’s ability to reduce unexplained variance.

Event studies are helpful in identifying the factors constraining or facilitating the emergence of specific social protection institutions. They focus on unveiling correlates of foundational events (Knutsen & Rasmussen, 2018; Rasmussen & Knutsen, 2017; Schmitt, 2015; Schmitt et al., 2015).

More recently, the availability of evaluation and observational data has encouraged the use of quasi-experimental techniques to study the impact of social protection interventions. In Latin America the spread of conditional income transfers helped bed in the collection of evaluation data and techniques for their evaluation. Mexico’s Progresa was the pioneer (Skoufias, 2005). Exploiting a scheduled implementation of the programme, difference-in-differences estimates of its impact on poverty helped to protect it from contracting government budgets (Levy, 2006). Impact evaluations of old age transfers estimate the effects of the programmes by focusing on the differences in social indicators around the age of eligibility, a regression discontinuity design (Galiani et al., 2014). Regression discontinuity design can be implemented where programme regulations or their change over time generate exogenous breaks in benefit entitlements (Barrientos & Villa, 2015). Comparative studies extract information from the estimated impact of specific programmes. Meta studies of the impact of conditional income transfers on education (Saavedra & Garcia, 2017), child mortality (Cavalcanti, 2023), labour supply (Alzúa et al., 2010), or elections (Araújo, 2021) provide a comparative perspective on social assistance. The rapid spread of evaluation data and quasi-experimental analytical techniques has greatly refined our understanding of the impact of social protection interventions in the region.

2.2 Shifting Attention to Causality

The growing use of experimental and quasi-experimental methods in social protection research, especially in the study of the effects of social assistance programmes, is part of a methodological shift focused on causal inference. In the context of social assistance programmes this focus on causality has greatly benefited from the design of impact evaluations and the availability of quasi-experimental data. This will be discussed in detail in the Protection chapter. This approach is predicated on the presence of an ad hoc intervention, social assistance transfers for example, capable of leading to a change in the behaviour of recipients. Dividing potential participants into two equivalent groups and applying the intervention to the treatment group makes it possible to measure the difference in outcomes across the treatment and control groups after the intervention. This measure is a representation of the causal effect of the intervention.

There are multiple ways in which social protection institutions, like occupational pension funds or individual savings plans, are likely to have more complex effects than conditional income transfers. It is harder, and often unfeasible, for governments to experiment with large scale pension schemes, if only because of the time horizons required. To a significant extent, researchers can only rely on observational data to try to understand social protection institutions. Generalising from local experiments is perilous as the causal effects identified by impact evaluation studies trade off external validity for internal validity. These and other issues have persuaded researchers examining social protection institutions to sidestep causal inference in favour of associational models, qualitative and quantitative.

The aim of the materials discussed here is to argue that there is a path towards a causal examination of social protection institutions in Latin America. The main argument is that causal models are essential to unveil the factors explaining existing social protection in the region. The causal framework should not be limited to the study of the effects of local interventions but can also be deployed to study the configuration of social protection institutions in the region. To do this successfully, we need to integrate a set of methodological tools into our research together with an understanding of their limitations or, what is the same, an understanding of the conditions in which they can help advance understanding.

2.3 ‘Effects of Causes’ and ‘Causes of Effects’ Explanations

Gelman and Imbens (2013) make a distinction between two types of causal explanations in the social sciences. First, a forward explanation that answers the question: ‘what are the effects of causes?’ This is at the core of the experimental methods revolution. A second set of explanations consists of backward explanations answering the question: ‘what are the causes of effects?’ They start from a set of effects and seek to understand what factors caused them. Impact evaluation studies of conditional income transfers belong to the first set of explanations, the ‘effects of causes’, while studies aiming to explain the origins of occupational insurance funds in Latin America in terms of associated features of society, politics, or the economy belong to the second set of explanations, the ‘causes of effects’. Yamamoto (2012) argues that ‘causes of effects’ explanations “are about attribution, instead of effects, because their primary concern is the extent to which the actual occurrence of events can be attributed to a suspected cause” (p. 1). This is closer to the term in common use. This distinction is relevant to the methodological approach in this book.

The ‘experimental revolution’ involved shifting the focus from ‘causes of effects’ to ‘effects of causes’. In Gelman and Imbens’ view, estimating causal relations can only be done with forward causal questions. Studies attempting to explain causes of phenomena usually begin by identifying a set of potential causal factors and then test for the presence or absence of these factors in the antecedents of the phenomena. The presence of the effect often precludes consideration of possible counterfactuals. Even when attention is paid to counterfactuals, reverse causation studies find multiple explanatory factors capable of confounding the potential link between causes and the effect. Research findings are on stronger ground if they can dismiss, or falsify, the influence of some factors on the presence of the phenomena.

The current dominance of ‘effects of causes’ methodologies, especially experimental methods, has cast doubt on the contribution that ‘causes of effects’ studies can make (Goertz & Mahoney, 2012) and on whether this approach can be accommodated within statistical or econometric estimation models (Gelman & Imbens, 2013). Some researchers go further and argue that only an external intervention or manipulation can support causal inference. On this view, experimental data are needed to support causal inference; observational data lacking an intervention in the data generating process will not be capable of supporting it.

It can be argued that this view is too restrictive, and that ‘causes of effects’ explanations have a role to play in advancing causal knowledge. Experimental methods are unfeasible in many social protection or social policy contexts that motivate social protection researchers in Latin America. However, a ‘causes of effects’ approach could be helpful in the task of identifying non-causal factors, the factors that are not causally linked to social protection institutions; in refining existing models capable of capturing the relationships involved; and perhaps in helping formulate hypotheses to be explored using ‘effects of causes’ research methods. Identification of causal and non-causal factors is a first step (Morgan & Winship, 2015, Chapter 3). It makes explicit the assumptions that are required to claim causal status. Searching for causes of effects or inverse causal inference could lead to improvements in the maintained model or suggest its replacement with a better one. Gelman and Imbens suggest that ‘causes of effects’ questions fit into statistics and econometrics “not as inferential questions to be answered with estimates or confidence intervals, but as the identification of statistical anomalies that motivate improved models” (Gelman & Imbens, 2013, p. 4).

In fact, the identification of relevant factors capable of explaining the emergence of social protection institutions has been the primary objective in the quantitative literature. In some contributions, this relates to conditions required for social protection institutions, for example the level of economic development, or democracy, or openness, or urbanisation. Studies interrogating time-series-cross-section data have focused on identifying the main correlates of social expenditure. The underlying statistical model is a linear structural equation model including a range of variables to ‘control’ for potential confounders and estimated using regression analysis. Studies using time-series-cross-section data (Segura-Ubiergo, 2007) and panel data (Haggard & Kaufman, 2008) provide some examples.

‘Causes of effects’ or reverse causation studies have contributed to refining the identification of the factors explaining the emergence and evolution of social protection institutions in the region. They have helped to clear the ground for ‘effects of causes’ research leading to reliable causal inference. But perhaps the most significant methodological challenge is to extend the application of ‘effects of causes’ to the study of social protection institutions in the region. It requires shifting the focus of research from conditions to effects. The following sections introduce some of the relevant tools.

2.4 The Potential Outcomes Framework

The potential outcomes framework provides a foundation for causal inference (Imbens, 2019). Assume that education is a unit treatment, so that an individual is a college graduate or only manages to complete basic education. It is a well-established fact that college educated workers command higher salaries than workers with only basic education. In an experimental setting, an individual selected at random will have two potential outcomes, denoted y1 if offered college education or y0 if offered only basic education. In theory the difference between the two outcomes, y1 − y0, would be the college premium. In practice, only one outcome can be observed for an individual although in theory there are two potential outcomes. For the individual who completes college education, the salary of the individual without college education is the counterfactual, and vice versa. Researchers examining a sample of individuals, some ‘treated’ with college education and others in the ‘control’ group, can measure the average college premium, the payoff to the treatment.

There are some complications the researcher must address (Angrist & Pischke, 2008). In the real world, individuals in the sample might not start from the same position. Some will have benefited from wealthy parents, or from a privileged social position, or their mothers’ attention in early life. Some might have genetic advantages while others might be affected by learning difficulties. Variables that influence both the treatment and the outcome are referred to as confounders. Random assignment of individuals to the treatment, the experimental approach, could overcome these complications and lend credibility to the findings from the research. The design of the experimental research requires some rules guiding the random assignments. If random assignment is not feasible, identifying groups in the sample that share the same characteristics is essential. Grouping individuals according to the income level of their parents or the education level reached by their mothers might strip away some of the confounders such that the college premium can be measured across comparable groups. This strategy is referred to as a ‘quasi-experiment’ in that researchers do not have full control over the assignment but are able to find ex post that conditions are equivalent to a random assignment. Evaluating average potential outcomes requires appropriate assumptions.

The potential outcome framework can be stated more formally for an individual i as

$$ {y}_i={y}_{i1}-{y}_{i0} $$

Where yi is the effect of the treatment on the outcome for individual i; yi1 denotes the outcome if treated and yi0 the outcome if not treated (Footnote 9). Aggregating across the sample, it is possible to evaluate the sample average treatment effect (SATE).

$$ {Y}_{\mathrm{sate}}=\frac{1}{n}\sum \limits_{i=1}^n\left({y}_{i1}-{y}_{i0}\right) $$

Several assumptions are required to ensure validity for the SATE (Hernán, 2020). The ignorability assumption refers to the requirement that the treatment assignment is independent of the potential outcomes. This assumption can be formalised as t ⊥ y0, y1. The stable unit treatment value assumption (SUTVA) requires that the potential outcome for unit i depends only on the treatment that unit receives. This implies the requirement that there are no hidden treatments and no spillovers among units.
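The role of the ignorability assumption can be illustrated with a small simulation. The sketch below is not drawn from the chapter's data: the outcome equations, the 'ability' variable and all parameter values are hypothetical, chosen only to show that a difference in observed means approximates the SATE under random assignment and departs from it when units select into treatment.

```python
import random
import statistics

random.seed(0)

n = 10_000
# Hypothetical potential outcomes: y0 without treatment, y1 with treatment.
# The unit-level effect varies around an assumed mean of 2.0.
units = []
for _ in range(n):
    ability = random.gauss(0.0, 1.0)  # pre-treatment characteristic
    y0 = 10.0 + 3.0 * ability + random.gauss(0.0, 1.0)
    y1 = y0 + 2.0 + random.gauss(0.0, 0.5)
    units.append((ability, y0, y1))

# The SATE uses both potential outcomes, which only a simulation can see.
sate = statistics.fmean(y1 - y0 for _, y0, y1 in units)


def difference_in_means(assign):
    """Treated-minus-control mean of observed outcomes under a given rule."""
    treated, control = [], []
    for ability, y0, y1 in units:
        if assign(ability):
            treated.append(y1)  # y0 is the unobserved counterfactual
        else:
            control.append(y0)  # y1 is the unobserved counterfactual
    return statistics.fmean(treated) - statistics.fmean(control)


# Ignorable assignment: a coin flip, independent of (y0, y1).
randomised = difference_in_means(lambda ability: random.random() < 0.5)

# Non-ignorable assignment: higher-ability units select into treatment.
self_selected = difference_in_means(
    lambda ability: ability + random.gauss(0.0, 1.0) > 0
)

print(f"SATE: {sate:.2f}")
print(f"randomised difference in means: {randomised:.2f}")
print(f"self-selected difference in means: {self_selected:.2f}")
```

Under these assumptions the randomised comparison should sit close to the SATE, while the self-selected comparison overstates it because ability raises both the chance of treatment and the untreated outcome.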

The potential outcomes framework facilitates thinking through causality in the context of social protection institutions. It emphasises the role of counterfactuals, that is the road not taken.

2.5 Directed Acyclical Graphs

Directed acyclical graphs (DAGs) provide a tool for assessing causal models that could potentially deliver causal inference (Elwert, 2013; Morgan & Winship, 2015; Pearl, 2010). DAGs are visual representations of hypothesised qualitative causal relationships. In constructing directed acyclical graphs, researchers encode their perceptions about how the world works, aided by a set of formal logical rules (Pearl, 2010). Elwert emphasises two primary uses for DAGs: “First, DAGs can be used to prove or disprove the identification of causal effects, that is, the possibility of computing causal effects from observable data… Since identification is always conditional on the validity of the assumed causal model, it is fortunate that the second use of DAGs is to present those assumptions explicitly and reveal their testable implications, if any” (Elwert, 2013, p. 246).

Elwert describes DAGs as nonparametric structural equation models (NPSEM), including a range of (parametric) linear structural equation models as special cases. The fact that DAGs are nonparametric implies that the relationships identified do not require the specification of the distribution of the variables involved, nor the functional form or the size of the causal effects. In this sense NPSEM identify causal relationships at a more basic level than identification in (parametric) linear structural equation models. DAGs can tell us whether causality can be ascertained in theory, assuming the relevant observable data and appropriate statistical models are available (Footnote 10).

DAGs encode assumptions about qualitative causal relationships by arrows pointing to the outcomes. Arrows are directed edges linking nodes or random variables. Missing arrows indicate the absence of a causal effect (Footnote 11). Causal paths represent all paths in which the arrows point away from the treatment and towards the outcome. Non-causal paths are all other paths. Variables directly caused by another variable are referred to as its children, whilst the causing variables are referred to as parents. Descendants are all variables directly or indirectly caused by a parent. A collider is a variable with two arrows pointing into it. A DAG is acyclic because it does not contain a directed path that returns to the original variable, a cyclic path (Footnote 12). This requirement rules out simultaneous causality, as in ‘trade unions and social spending cause each other’.
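The vocabulary of parents, children, descendants, colliders and acyclicity can be made concrete with a small data structure. The following is a minimal sketch, assuming a DAG is stored as a mapping from each node to the set of its children; the function names are illustrative and do not come from any particular graph library.

```python
def build_dag(edges):
    """Map each node to the set of its children from (parent, child) pairs."""
    children = {}
    for parent, child in edges:
        children.setdefault(parent, set()).add(child)
        children.setdefault(child, set())  # register nodes with no children
    return children


def parents(dag, node):
    """Nodes with an arrow pointing directly into `node`."""
    return {p for p, kids in dag.items() if node in kids}


def descendants(dag, node):
    """All nodes reachable from `node` by following arrows forward."""
    seen, stack = set(), list(dag[node])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(dag[current])
    return seen


def is_acyclic(dag):
    """True when no node is its own descendant."""
    return all(node not in descendants(dag, node) for node in dag)


def colliders(dag):
    """Nodes with at least two incoming arrows.

    Strictly, a node is a collider on a specific path; this simple check
    flags nodes that are colliders on the path through two of their parents.
    """
    return {node for node in dag if len(parents(dag, node)) >= 2}


chain = build_dag([("X", "B"), ("B", "Y")])      # X -> B -> Y
collider = build_dag([("X", "B"), ("Y", "B")])   # X -> B <- Y
# Simultaneous causality is not a DAG: the two edges form a cycle.
cyclic = build_dag([("unions", "spending"), ("spending", "unions")])

print(descendants(chain, "X"))
print(colliders(collider))
print(is_acyclic(chain), is_acyclic(cyclic))
```

The last example encodes the 'trade unions and social spending cause each other' case, which the acyclicity check rejects.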

In Fig. 2.1(a) the causal path X → B → Y represents that Y is the outcome of X. Fig. 2.1(b) represents the same causal path X → B → Y, but now with a non-causal path from X to D or D ← X. Fig. 2.1(c) shows B as a collider in that both X and Y influence B or X → B ← Y. Fig. 2.1(d) shows that B is a common cause of X and Y or X ← B → Y.

Fig. 2.1 DAGs showing causal and non-causal paths. Panels: (a) X → B → Y; (b) D ← X → B → Y; (c) X → B ← Y; (d) X ← B → Y.

DAGs encode researchers’ beliefs about how the world works. Paying attention to a small number of rules, they can help translate researchers’ beliefs into models of associations that are potentially observable in the data. DAG rules can also help exclude non-causal paths. DAG rules for identification offer interesting insights into assessing these causal models. For example, in Fig. 2.1(a) variable X has a marginal effect on the outcome Y. The presence of B does not prevent the association between X and Y. In fact, estimation of a statistical model that ‘controls’ for B would fail to compute the effect of X on Y (Footnote 13). In Fig. 2.1(b) computing the causal relation of X on Y should ignore the non-causal path from X to D as it does not affect the primary causal relationship.

In Fig. 2.1(c) there is no causal path between X and Y; the two variables are not dependent on each other. However, an estimation of the association between X and Y that controls for B, a common outcome, could produce a spurious association between the two. This is commonly described as ‘endogenous selection bias’. In Fig. 2.1(d), X and Y are not dependent on each other but share a common cause. In this case B is a ‘confounder’ because it influences both X and Y, and ‘controlling’ for B would lead to the identification of the association between X and Y (none in this case).
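The consequences of 'controlling' for B in panels (a), (c) and (d) of Fig. 2.1 can be checked by simulation. The sketch below assumes linear relationships with unit coefficients and standard normal noise, which are illustrative choices rather than anything implied by the DAGs themselves, and it uses the first-order partial correlation as a stand-in for adding B to a regression.

```python
import math
import random

random.seed(1)


def corr(u, v):
    """Pearson correlation between two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)


def partial_corr(x, y, b):
    """Correlation between x and y after 'controlling' for b."""
    rxy, rxb, ryb = corr(x, y), corr(x, b), corr(y, b)
    return (rxy - rxb * ryb) / math.sqrt((1 - rxb ** 2) * (1 - ryb ** 2))


def noise(n):
    return [random.gauss(0.0, 1.0) for _ in range(n)]


n = 20_000

# Panel (a), X -> B -> Y: B transmits the effect of X.
x_a = noise(n)
b_a = [x + e for x, e in zip(x_a, noise(n))]
y_a = [b + e for b, e in zip(b_a, noise(n))]

# Panel (c), X -> B <- Y: B is a collider, X and Y are independent.
x_c, y_c = noise(n), noise(n)
b_c = [x + y + e for x, y, e in zip(x_c, y_c, noise(n))]

# Panel (d), X <- B -> Y: B is a common cause, X has no effect on Y.
b_d = noise(n)
x_d = [b + e for b, e in zip(b_d, noise(n))]
y_d = [b + e for b, e in zip(b_d, noise(n))]

for label, x, y, b in [("(a) mediator", x_a, y_a, b_a),
                       ("(c) collider", x_c, y_c, b_c),
                       ("(d) confounder", x_d, y_d, b_d)]:
    print(f"{label}: raw {corr(x, y):+.2f}, "
          f"given B {partial_corr(x, y, b):+.2f}")
```

In this set-up, conditioning on B removes a genuine association in panel (a), creates a spurious one in panel (c), and removes a spurious one in panel (d).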

Figure 2.2(a), borrowed from Morgan and Winship (2015), represents the effect of education E on earnings Y. A is a common cause of education and earnings, say unobserved ability, and C is a confounding variable also affecting education and earnings. Note that ‘controlling’ for the confounding variable C is not enough to remove bias from an estimate of the direct effect: because A is unobserved, controlling for C would not support an unbiased estimate of the effect of education on earnings. Figure 2.2(b) points to another issue in DAGs: unobserved error terms associated with variables can be included, as in Uc, Ue and Uy in the figure. To the extent that they could be described as independent of each other and as ‘idiosyncratic’ causes of each variable, they are usually left out of the DAG. This practice would not apply if the unobserved error terms were judged to influence more than one variable.
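The structure in Fig. 2.2(a) can also be simulated to see what an unobserved common cause does to an estimate. In the sketch below E is the treatment, Y the outcome, C an observed confounder and A an unobserved one; the linear form, the coefficients and the true effect of 1.0 are assumptions made for illustration only. Adjustment is done by residualising on the controls, which reproduces the coefficient a multiple regression would give.

```python
import random

random.seed(3)


def mean(v):
    return sum(v) / len(v)


def slope(y, x):
    """Slope from a simple regression of y on x (with intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def residualise(v, w):
    """Residuals of v after a simple regression of v on w (with intercept)."""
    b = slope(v, w)
    mv, mw = mean(v), mean(w)
    return [vi - mv - b * (wi - mw) for vi, wi in zip(v, w)]


n = 20_000
TRUE_EFFECT = 1.0  # assumed effect of E on Y

A = [random.gauss(0, 1) for _ in range(n)]  # unobserved in practice
C = [random.gauss(0, 1) for _ in range(n)]  # observed confounder
E = [a + c + random.gauss(0, 1) for a, c in zip(A, C)]
Y = [TRUE_EFFECT * e + 2.0 * a + 1.5 * c + random.gauss(0, 1)
     for e, a, c in zip(E, A, C)]

# No adjustment: both back-door paths (through A and through C) are open.
naive = slope(Y, E)

# Adjusting for C only: the path through A remains open.
E_c, Y_c = residualise(E, C), residualise(Y, C)
c_only = slope(Y_c, E_c)

# Adjusting for C and A, which is possible only because this is a simulation.
A_c = residualise(A, C)
both = slope(residualise(Y_c, A_c), residualise(E_c, A_c))

print(f"no controls: {naive:.2f}")
print(f"controlling for C: {c_only:.2f}")
print(f"controlling for C and A: {both:.2f}")
```

With these hypothetical values, adjusting for C alone narrows the gap but still leaves the estimate well above the assumed effect; only the infeasible adjustment for A as well recovers it.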

Fig. 2.2 Confounders. Panels: (a) A → E, A → Y, C → E, C → Y, E → Y, with A encircled; (b) C → E, C → Y, E → Y, Uc → C, Ue → E, Uy → Y, with Uc, Ue and Uy encircled.

As can be seen, applying some basic rules to the encoding of researchers’ beliefs concerning the phenomena under investigation adds precision to the description of causal and non-causal relations between variables, and suggests appropriate strategies for the analysis of observable counterparts.

2.6 Causality Deficits in Social Protection Research

As noted in the Introduction, an incipient application of causal inference in social protection research can be traced back to Mesa-Lago’s (1978) Social Security in Latin America: Pressure Groups, Stratification, and Inequality. There he sketches three models of social security development, using arrows to underline causal linkages between pressure groups PG, the state S, social security SS, and political parties PP. The first model describes an acyclical path from occupational pressure groups to the state and from the state to social security institutions, PG → S → SS. This model requires democratic pluralism, which he associated with conditions in Chile. The second model describes a causal path from the state to social security institutions, with a further link from the state to pressure groups denoting their co-optation. This can be represented as PG ← S → SS. Here the state is a common cause of both pressure groups and social security. Co-optation of pressure groups and the establishment of publicly supported social security institutions are sourced to authoritarian state elites. The third implicit model is more complex. It has an edge from the state to social security, an edge from pressure groups to the state, and an edge from pressure groups to political parties and political parties to the state. This can be interpreted as political parties acting as a common cause of both pressure groups and the state, or PG ← PP → S → SS plus PG → S. Mesa-Lago offered an early insight into mapping the causal links between the different actors. This methodological approach was not developed further in the literature.

The availability of harmonised cross-country data on aggregate social spending has enabled researchers to construct panels of repeated observations for countries in the region. Quantitative research (Footnote 14) on social protection in Latin America has made use of panel and time-series-cross-section data to investigate determinants of social protection dynamics. Time-series-cross-section data open a new range of potential research questions associated with changes in social protection provision over time. At the same time, these data present some challenges for researchers as regards appropriate statistical estimation (Beck & Katz, 1995, 1996, 2011; Plümper et al., 2005; Wilson & Butler, 2007). It is likely that pooling repeated observations for countries in the region will show serial correlation in that spending in one year will be correlated with spending in previous years. It is also the case that the different units or countries will show temporal and spatial correlation, as with the impact of crises or conflict. Differences in social protection across countries may be permanent or at least persistent if due to differences in natural resources, population, history, or legal or constitutional frameworks. The point is that pooled observations cannot be assumed to be independent random draws from nature. It is essential to take account of these potential biases in the estimation model (Footnote 15).

The workhorse is some form of error correction estimating model that takes care of serial correlation and panel heteroskedasticity (Beck & Katz, 2011). Starting from a static regression model as in

$$ \mathrm{Static}\ \mathrm{model}:{Y}_t=\alpha +{\beta}_0{X}_t+{u}_t $$

Where Yt is the outcome of interest, Xt are the determinant factors and ut is an iid error term.

An error correction model (ECM) includes lagged variables: a first-order autoregressive term, and the independent variables in both lagged levels and rates of change:

$$ \mathrm{Error\ correction\ model:}\quad Y_t - Y_{t-1} = \alpha + \gamma_1 Y_{t-1} + \beta_0 X_{t-1} + \beta_1 \left( X_t - X_{t-1} \right) + u_t $$

The ECM has several advantages in estimating time-series-cross-section (TSCS) data. Assuming the serial correlation is first-order autoregressive, AR(1), the ECM models the dynamics in the data generating process. Estimated by OLS with panel corrected standard errors, it addresses both serial correlation and panel heteroskedasticity.
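To illustrate how the error correction specification is taken to data, the following minimal sketch simulates a single stationary series from the ECM equation and recovers its coefficients by OLS. Everything in it is an assumption made for illustration: the parameter values, the AR(1) process for X, and the function names `simulate_ecm` and `fit_ecm`. It covers one unit only and does not compute panel corrected standard errors, so it is not a substitute for the TSCS estimators used in the studies discussed here.

```python
import numpy as np


def simulate_ecm(T, alpha, gamma1, beta0, beta1, rho_x=0.7, sigma_u=0.5, seed=0):
    """Simulate one series from the ECM; X follows an assumed stationary AR(1)."""
    rng = np.random.default_rng(seed)
    x = np.zeros(T)
    y = np.zeros(T)
    for t in range(1, T):
        x[t] = rho_x * x[t - 1] + rng.normal()
        # Change in Y as defined by the error correction equation.
        dy = (alpha + gamma1 * y[t - 1] + beta0 * x[t - 1]
              + beta1 * (x[t] - x[t - 1]) + sigma_u * rng.normal())
        y[t] = y[t - 1] + dy
    return y, x


def fit_ecm(y, x):
    """OLS of (Y_t - Y_{t-1}) on a constant, Y_{t-1}, X_{t-1}, and (X_t - X_{t-1})."""
    dy = y[1:] - y[:-1]
    dx = x[1:] - x[:-1]
    design = np.column_stack([np.ones(len(dy)), y[:-1], x[:-1], dx])
    coef, *_ = np.linalg.lstsq(design, dy, rcond=None)
    return dict(zip(["alpha", "gamma1", "beta0", "beta1"], coef))


# Illustrative parameter values, chosen so that the simulated process is stationary.
y, x = simulate_ecm(T=5000, alpha=1.0, gamma1=-0.5, beta0=0.4, beta1=0.8)
estimates = fit_ecm(y, x)
print(estimates)
```

With a long simulated series and serially uncorrelated errors, the OLS estimates should lie close to the values used in the simulation. In applied work the series are short and pooled across countries, which is where panel corrected standard errors come in.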

An important assumption in ECMs is that the data are drawn from a stationary process, that is, observations are mean reverting and the best long-run forecast of a stationary process is that mean. We can think of the mean as “the ‘equilibrium’ of a stationary process” (Beck & Katz, 2011, p. 333). ECMs are therefore appropriate to the study of the way in which ‘shocks’ impact on the variables of interest, so that the estimated coefficients describe both the immediate response to a shock and the rate of adjustment back to the equilibrium.
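The notion of equilibrium can be made concrete using the error correction model above. As a sketch, assume that X settles at a constant value and that the error term is set to its zero mean, so that Y stops changing. The starred symbols are introduced here for illustration only.

$$ 0 = \alpha + \gamma_1 Y^{*} + \beta_0 X^{*} \quad \Longrightarrow \quad Y^{*} = -\frac{\alpha + \beta_0 X^{*}}{\gamma_1} $$

Under these assumptions, a permanent unit change in X shifts the equilibrium by $-\beta_0 / \gamma_1$, while $\beta_1$ captures the immediate response to a change in X. With $\gamma_1$ between $-1$ and $0$, a share $-\gamma_1$ of any gap between $Y_{t-1}$ and $Y^{*}$ is closed in each period.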

Among the studies on social protection in Latin America using ECM, Kaufman and Segura-Ubiergo (2001) and Avelino et al. (2005) investigate the influence of economic openness on social spending in the region. Following the literature on OECD countries, they test alternative explanations of social spending responses to globalisation. Increased competition brought about by globalisation could force governments to reduce the costs of welfare provision to improve efficiency and competitiveness, or alternatively to raise their social spending to compensate citizens for any adverse effects of globalisation. Kaufman and Segura-Ubiergo (2001), relying on data from 1973–1997, find that populist governments squeeze all social spending, while democratic governments increase spending in health and education and reduce spending on social security. Avelino et al. (2005) on the other hand, using data from 1980–1999 harmonised using purchasing power parity as opposed to exchange rates, find that openness tends to enhance rather than diminish social spending. They also find that the effects of globalisation differ across the components of social spending. As they put it, “… Latin America’s heightened exposure to international competition does not affect all social programs equally. In fact, our results suggest that politicians in open economies both compensate certain groups (spend on social security) and undertake policies that raise the level of efficiency in an economy (spend on education). In addition, democracies enhance the prospects for investing in human capital while preserving social security payments.” (Avelino et al., 2005, p. 634).

A study of the effects of protests and democracy on social spending shows another relevant application of the ECM (Zarate Tenorio, 2014). In this study, interacting a democracy indicator variable with variables capturing alternative types of protest generates interesting findings. Zarate Tenorio does not “…find evidence that democracy has an independent effect on social security spending, neither in the short nor in the long run” (Zarate Tenorio, 2014, p. 1961). In fact, “…the effect of democracy on social security spending is conditioned by organized labor protest. Their organizational capacity inherited from earlier periods puts them in a better position relative to other groups to effectively mobilize to defend their entitlements. However, mass protest has also had important consequences for human capital spending. While democratic leaders increase human capital spending as a consequence of electoral incentives, mass protest inhibit cutbacks in these areas in democratic regimes” (Zarate Tenorio, 2014, p. 1965).

Huber et al. (2008) also examine the effect of openness and domestic politics on social spending in the region for the period 1970 to 2000. They are interested in the determinants of long-term patterns of social spending. They aim to explain the level of social spending rather than the rate of change or adjustment towards an equilibrium. They argue that democracy and partisanship do not produce change instantly, but over time. It is long-term periods of unbroken democracy or left partisan coalitions in power, rather than instantaneous change, that are held to influence the level of social spending. Accordingly, their model does not include a lagged dependent variable and the independent variables are estimated in levels. They also avoid including fixed country effects.Footnote 16 They find that democracy matters for social spending in the longer term for both social security and education and health spending. Authoritarian regimes, on the other hand, maintain health and education spending at low levels but raise social security and welfare spending. This finding, shared by the other studies discussed above, is predicated on the acute regressivity of occupational insurance provision in the region.

Altman and Castiglioni (2020) study the influence of political and growth factors on the egalitarian character of social policy expansion in Latin America. They construct a panel of countries for the period 1990 to 2013. Their estimation method includes lagged independent variables, country fixed effects and 5-year dummies. They find that political competitiveness and civil society strength, rather than left-right partisanship, influenced the egalitarian character of social policy expansion in the region (Altman & Castiglioni, 2020).

Studies of social protection spending in the region using panel or time-series-cross-section (TSCS) data have adopted ECMs, or models in levels with panel corrected standard errors (PCSE), to account for the dynamics of spending and its determinants and in the process account for serial correlation and panel heteroskedasticity. The models are appropriate to address the effects of ‘shocks’ to stationary data. However, as demonstrated by Huber et al. (2008), ECMs are less appropriate to the study of the longer-term determinants of social protection, since the relevant data reveal the presence of a trend rather than stationarity. The adoption of alternative methods reflects the research questions each study addresses.

As anticipated by the discussion on ‘causes of effects’ research, the findings from the literature examining the influence of globalization and democracy on social spending in Latin America are helpful but meagre (Flechtner & Sánchez-Ancochea, 2022). Overall, they confirm the influence of political regimes on social spending but fail to throw much light on the actual mechanism through which this influence is exercised. There is some consensus around the view that partisanship is not as important as it appears to be in the welfare state literature, a hypothesis that echoes the cleavage literature (Bornschier, 2009; Dix, 1989; Roberts, 2002). The studies agree on the need to disaggregate social spending into its component parts: social protection and basic services. This is based on the view that each of the components has a different underlying political logic. The studies are inconclusive as regards the influence of democracy, measured as a continuum, or the role of interest groups, in explaining spending on social security. They offer contested findings on the influence of authoritarian regimes on the persistence of social security spending and on the influence of democracy on the expansion of service spending.

To an important extent, the frailties of studies based on time-series-cross-section and panel data are well known (Kittel, 2006; Plumper et al., 2005; Wilson & Butler, 2007) and perhaps it is unfair to blame the tools. The assumptions underlying ECMs might be the real issue. At the core is the question of whether the social protection data generating process in Latin America is characterised by stationarity or, alternatively, by a trend. This has implications for whether estimated coefficients are constant or variable over time. It is important to keep in mind that the TSCS literature reviewed above focuses on the neoliberal phase in the region and the findings might apply to this period only.

Structural equation models in quantitative studies of social protection generally focus on correlation, not causation. This is made explicit in Kaufman and Segura-Ubiergo’s (2001) estimation of the association of democracy and globalization with social spending in a cross-section panel of Latin American countries. They are careful to interpret the estimated coefficients in their model as associations or correlations, precluding causality claims. In the spirit of ‘causes of effects’ or attribution studies, their study helps shape an ‘effects of causes’ research strategy. Discussing the finding that democratic governments are more likely to raise education and health spending while authoritarian governments are more likely to protect spending in pensions, they note that if “our analysis cannot definitely uncover the causal mechanisms that underlie the statistical findings, however, it does provide a frame of reference that might orient future research” (Kaufman & Segura-Ubiergo, 2001, p. 582).

Structural equation models can also underlie thinking about social protection implicitly. Levy and Cruces (2021) develop a model in which a long list of variables is taken to influence social protection and each other. They do not go on to estimate such a model, although it is implicit in their approach that they have a linear structural equation model in mind. As all the relevant variables are assumed to influence each other simultaneously, this approach precludes any consideration of causal associations.

Dynamic models emphasising feedback effects are often employed in institutional analyses of social protection. They hypothesise simultaneous influences among the variables of interest. They are essentially ruled out by a DAG causal model, because a directed acyclic graph admits no cycles. In DAG causal models the aim is to “‘redefine’ effects as a general capacity to transmit changes among variables” (Pearl, 2009, p. 107). Approaching feedback mechanisms and simultaneous associations between variables from a causal perspective would require separating out different causal relationships in time and treating them as separate phases lending themselves to separate causal models.
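The idea of separating feedback relationships in time can be sketched in a few lines. In the example below, a two-way relationship between two variables forms a cycle and so cannot be drawn as a DAG. Indexing each variable by period, and letting each cause act only on the following period, yields an acyclic graph. The variable names `spending` and `union_strength` and the helpers `has_cycle` and `unroll` are hypothetical, chosen only to illustrate the point.

```python
from graphlib import CycleError, TopologicalSorter


def has_cycle(edges):
    """Return True if the directed graph given as (cause, effect) pairs has a cycle."""
    predecessors = {}
    for cause, effect in edges:
        predecessors.setdefault(effect, set()).add(cause)
        predecessors.setdefault(cause, set())
    try:
        # prepare() raises CycleError when no topological order exists.
        TopologicalSorter(predecessors).prepare()
    except CycleError:
        return True
    return False


def unroll(edges, periods):
    """Replace each edge a -> b with time-indexed edges a_t -> b_(t+1)."""
    return [
        (f"{cause}_{t}", f"{effect}_{t + 1}")
        for t in range(periods - 1)
        for cause, effect in edges
    ]


# Hypothetical feedback loop between two variables.
feedback = [("spending", "union_strength"), ("union_strength", "spending")]

print(has_cycle(feedback))             # simultaneous feedback: cyclic
print(has_cycle(unroll(feedback, 3)))  # separated in time: acyclic
```

The design choice here is that every influence takes one period to operate. Whether that assumption is defensible depends on the theory and on the frequency of the data.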

2.7 Methodological Challenges

The review of methods in social protection research in Latin America points to several challenges for this book. They are associated with expanding the implementation of causal models. The potential outcomes and directed acyclic graph approaches discussed above could productively complement the range of current methodological approaches to the study of social protection in Latin America.

Graphical methods can encourage a deeper consideration of the quality of association between variables and discriminate between causal and non-causal association. This approach points to the need for close attention to theory, a need shared across methodological approaches. Noting areas of consensus between structural equation modelling and structural causal modelling, Glynn and Quinn (2013) observe that “a theoretical understanding of the substantive problem, particularly regarding the nature of any temporal dependencies, is an important part of a successful estimation strategy” (p. 3).Footnote 17 They recommend recourse to graphical methods associated with the structural causal model of Pearl (2009) to establish the quality of these relationships.

Awareness of the limitations of linear structural models in capturing the influence of time-varying covariates has encouraged work on estimation models embedding potential outcomes and treatment histories (Blackwell & Glynn, 2018). In the ECM presented above, the effects of lagged covariates enter in several ways: the effect of $X_{t-1}$ on $Y_t$, and the effect of $X_{t-2}$ and $X_{t-1}$ on $Y_t$ through the inclusion of the lagged dependent variable $Y_{t-1}$. This complex set of interactions is hard to model and leads to estimation biases, especially where time-varying covariates are present. Note that in the static model covariates are assumed to be constant: only their current value is considered. In the single shot scenario, covariates can be included as part of a strategy to control for baseline conditions or for preintervention variables. In a dynamic context, some covariates are time varying. Take the example of past social protection spending that strengthens trade unions or supportive political parties, which could themselves influence current social protection spending. In this case TSCS models will include trade union or partisan strength as a covariate. These variables qualify as posttreatment. TSCS or panel regression can avoid posttreatment bias by not including them as control variables, but this exacerbates any omitted variable bias. TSCS models with time-varying covariates thus face a dilemma: excluding time-varying covariates leads to omitted variable bias, but including them results in posttreatment bias. Blackwell and Glynn (2018) explore marginal structural models with treatment histories as a potential solution.
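The dilemma can be illustrated with a small simulation. The sketch below assumes a purely hypothetical linear data generating process in which a lagged treatment (`x_lag`) raises a time-varying covariate (`z`), which in turn affects both the current treatment (`x`) and the outcome (`y`). All coefficients and names are invented for illustration. The code only shows the two biases of ordinary regression; it does not implement the marginal structural models explored by Blackwell and Glynn (2018).

```python
import numpy as np


def simulate(n=100_000, seed=0):
    """Hypothetical process: x_lag -> z -> x, with x_lag, x, and z all affecting y."""
    rng = np.random.default_rng(seed)
    x_lag = rng.normal(size=n)
    z = 0.5 * x_lag + rng.normal(size=n)   # covariate affected by past treatment
    x = 0.5 * z + rng.normal(size=n)       # current treatment affected by covariate
    y = 0.3 * x_lag + 1.0 * x + 0.6 * z + rng.normal(size=n)
    return x_lag, z, x, y


def ols(y, **regressors):
    """OLS slopes (intercept included but not returned), keyed by regressor name."""
    names = list(regressors)
    design = np.column_stack([np.ones(len(y))] + [regressors[k] for k in names])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return dict(zip(names, coef[1:]))


x_lag, z, x, y = simulate()

# True effects in this simulation:
#   current treatment x: 1.0
#   lagged treatment x_lag, total: 0.3 direct + 0.6 * 0.5 via z = 0.6
with_z = ols(y, x_lag=x_lag, x=x, z=z)     # controls for the posttreatment covariate
without_z = ols(y, x_lag=x_lag, x=x)       # omits the confounder of x and y

print("controlling for z:", with_z)
print("omitting z:", without_z)
```

In this setup, controlling for `z` recovers the effect of the current treatment but returns only the direct effect of the lagged treatment, understating its total effect. Omitting `z` overstates the effect of the current treatment. Neither regression recovers both quantities, which is the dilemma described above.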

2.8 Conclusions

This chapter has provided a review of methodological approaches in social protection research. The availability of harmonised cross-country data on social protection in the region has encouraged the application of a wider range of methods, qualitative and quantitative. Step improvements in data availability and methodological innovation will make it possible for researchers to address an expanding range of research questions on social protection.

The discussion in the chapter focused on applications of parametric structural equation models estimated with time-series-cross-section and panel data in the context of Latin American social protection institutions. They have contributed to our understanding of the influence of globalisation, democracy, and conflict on social spending, including spending on social protection. At the same time, these studies demonstrate some weaknesses associated with the lack of fit between their assumptions regarding the data generating process and the characteristics of the relevant data for Latin America. The model assumptions are too restrictive. Their findings privilege correlation over causation.

The chapter has made a case for paying greater attention to causal models in social protection research. It discussed three main areas where this strategy can be advanced.

First, it made a distinction between ‘causes of effects’ and ‘effects of causes’ types of research enquiry. In ‘causes of effects’ enquiries the aim is to attribute phenomena to specific causes. It is important to define the potential contribution of this type of research to our understanding of social protection. Attribution studies can help sift through possible causal factors, primarily by identifying non-causal factors, and can shape follow-up ‘effects of causes’ research. The latter aims to establish the effects of changes in causal factors, for example the consumption effects of conditional income transfers. ‘Causes of effects’ enquiry accounts for the bulk of research on social protection institutions in Latin America. More recently, experimental and quasi-experimental methods have been implemented in the study of the effects of social protection interventions. They signal a shift towards ‘effects of causes’ research methods. There are challenges to the application of ‘effects of causes’ methods in the context of research on institutions relying on observational data.

Second, the potential outcomes approach offers a systematic framework for incorporating attention to counterfactuals.

Third, graphical causal models offer a range of tools for assessing the association between variables in a systematic fashion. Directed acyclic graphs offer a representation of researchers’ beliefs regarding the way the world works. The application of a set of basic rules (Pearl, 2010) helps discriminate between causal and non-causal association between variables, refining researchers’ hypotheses and linking the model to potential empirical counterparts.