Introduction

This chapter is a product of the research conducted in the Collaborative Research Center “Global Dynamics of Social Policy” at the University of Bremen. The center is funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation)—project number 374666841—SFB 1342.

The global diffusion of social policy is an emerging field in political science and comparative macro-sociology. Detailed, qualitative studies can precisely highlight the mechanisms of diffusion at work, e.g., learning, emulation, competition, or coercion (Gilardi 2016; Obinger et al. 2013). Even though this approach can reveal these mechanisms, it is limited to the respective cases under investigation. At a higher level of abstraction, researchers can apply statistical models for diffusion research on a comprehensive set of countries and over a long historical period. On the one hand, such studies usually abstract from the country-specific “micro” mechanisms; on the other hand, they provide a “macro” perspective on the diffusion process in the overall population of countries around the globe. The empirical studies collected in this volume follow the second approach.

Our research was conducted in the Collaborative Research Center 1342 (CRC 1342) at the University of Bremen, which is funded by the German Research Foundation (DFG). The members of the CRC 1342 collected an unprecedented amount of historical data on welfare policies around the globe to allow for macro-quantitative analyses of global diffusion in different subfields of social policy covering almost all countries in the world. This book is a collaborative effort of the quantitative projects in the CRC 1342 that analyze the diffusion of welfare policies.

We regard diffusion as a process driven by multiplex ties between countries in global social networks. In social network research, multiplexity means that subjects have network ties in various dimensions. In our view, global trade, colonial history, similarity in culture, and spatial proximity link countries to each other. In an epidemic, nowadays an unfortunately well-known type of diffusion, the share of infected subjects in the population depends on single events of disease-adoption at the micro-level; these events, in turn, result from some kind of interaction between subjects. Hence, networks are the “pipe structure,” or the structural backbone, of the diffusion process. We will analyze diffusion in several subfields of social policy, investigating the question of which network dimensions drive the process. For instance, the introduction of certain labor regulations might depend more on economic ties, in particular, global trade, whereas cultural similarity between countries could be more important for family or education policy. This volume aims at testing the different network structures against one another in their relevance for the diffusion process in different subfields of social policy. These policy fields are old age and survivor pensions, labor and labor markets, health and long-term care, education and training, and family and gender policy.

The present chapter introduces a network diffusion model for the analysis of social policy diffusion. We will give a detailed overview of the networks used in the following contributions. By applying an identical methodology to different fields of social policy, studies in this volume contribute to comparative research on the diffusion of social policy.

Four different networks will be analyzed as explanatory variables in this volume. The first is the network of geographical distance or proximity, which is represented by the distances between the capitals of the countries included in the sample. This network is based on the assumption that diffusion processes are subject to “slowing” effects of distance (Staudacher 2005; Berry 1972). However, geographical distances do not represent actual network contacts but merely promote the formation, frequency, and intensity of contacts. For this reason, we will secondly analyze the effect of the global trade network. We assume that beneficial economic exchange in global markets is a crucial condition for domestic economic growth (Krugman et al. 2018), but global economic transactions might be less costly if labor or educational standards are similar. Thirdly, we will analyze the network representing “cultural spheres” (Windzio and Martens 2021), which we assume to be of particular importance in the subfields of family and education policy. The fourth network represents ties of colonial legacies between states and captures long-term, asymmetric interdependencies. In this framework, the spatial distance network, or more precisely the spatial proximity network, serves as a reference point for determining whether the contacts in the three other network types exceed the braking effect of distances and are therefore more relevant to the diffusion of social policy (Simmons and Elkins 2004).

The aim of this chapter is to present in detail the methodology of the network diffusion model used in the following chapters and the networks of geographical distances, global trade, cultural spheres, and colonial legacies. We will give a brief overview of the current state of research and argue that the respective networks might be relevant in explaining diffusion processes in social policy. Subsequently, we will describe the construction of the networks, network parameters, and visualizations. Accounting for the change of network contacts over time, we apply longitudinal exponential random graph models (ERGMs) (Harris 2014) to analyze relevant variables influencing the probability of network ties.

The Network Diffusion Model in Event History Analysis

Processes of social diffusion often follow a logistic growth curve. Logistic growth processes are common in epidemiology, where they describe the spread of infectious diseases (Shen 2020). If the mechanism of diffusion is contagion via contact among subjects, the probability of meeting an “infected” subject is very low at the beginning of an epidemic, but the likelihood increases as the share of those who have already contracted the disease rises.

Yet subjects show considerable variance in social behavior as well as in their likelihood to contract the disease. Depending on the disease, some subjects turn out to be immune, have very few network ties, or are even isolated. Moreover, if other subjects recover from the disease and are immune afterward, the increase in the probability of becoming infected at a particular moment slows down once most subjects with whom infected persons are in contact are immune (left-hand side of Fig. 1.1). This applies not only to the spread of diseases (Shen 2020); we can also describe the diffusion of innovation in this way and, accordingly, the diffusion of different social policies as well. Even though the logistic growth curve is a crucial characteristic of diffusion processes (Rogers 2003), the underlying structure is a network. Networks were not systematically included in diffusion analysis until the mid-1990s, when Thomas Valente developed the network diffusion model (Valente 1995). At the micro-level, events of contraction drive the diffusion process, which means that subjects change their state, for example from uninfected to infected or from non-adopter to adopter of an innovation (in our case, a policy). Each single micro-level event contributes to the time-dependent aggregation of these events to the macro-level, where we then describe the diffusion process as a characteristic of the overall population, for example, by the cumulative logistic growth function (right-hand side of Fig. 1.1).

Fig. 1.1
Two graphs depict the logistic density and cumulative logistic density on the Y axis, while the X axis contains values of x.

Logistic density and cumulative logistic density function

At the starting point of an epidemic, all subjects are at risk of contracting the disease. Due to the waiting time until the moment of contraction, the underlying micro-level data are called episodes, with the starting point being the first occurrence of an infection in the population and the endpoint being either the contraction of the disease, the end of the epidemic, or simply the end of the window of observation. Consequently, we will apply event history models to analyze micro-level events of policy adoption in order to reconstruct the diffusion process at the population level. In these models, the dependent variable is the hazard rate (in our case the rate of adoption of the respective social policy). It is defined as the probability P that the event, which occurs at time T, falls within a particular interval between t and t + Δt, given that the event has not yet occurred at t, that is, given that T is greater than or equal to t.

$$ r(t) = P(t \le T < t + \Delta t|T \ge t) = \frac{P(t \le T < t + \Delta t)}{{P(T \ge t)}} $$

In a discrete-time situation, we can estimate event history regression models by using binary outcome models (Singer and Willett 2003) such as logit, probit, or complementary log–log models. In this volume, we will use logit models, where the hazard rate r(t) is predicted by j time dummies with coefficients α that indicate, e.g., 25-year time intervals, by the exposure in our four networks of trade, colonial history, cultural spheres, and spatial proximity, and by some control variables β′x.

$$ r(t) = \frac{1}{1 + \exp\left( -(\alpha_{1} t_{1} + \dots + \alpha_{j} t_{j} + \beta_{1}\,\text{trade} + \beta_{2}\,\text{colony} + \beta_{3}\,\text{culture} + \beta_{4}\,\text{proximity} + \mathbf{\beta}^{\prime}\mathbf{x}) \right)} $$
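To make the logit link concrete, the following minimal sketch evaluates the hazard for a single country-period. The function name `logit_hazard` and all coefficient and exposure values are hypothetical and chosen only for illustration; they are not estimates from this volume.

```python
import math

def logit_hazard(period_effect, exposures, betas, controls=(), control_betas=()):
    """Discrete-time hazard r(t) = 1 / (1 + exp(-eta)) for one country-period."""
    eta = period_effect  # alpha of the time interval the observation falls into
    eta += sum(betas[name] * exposures[name] for name in betas)
    eta += sum(b * x for b, x in zip(control_betas, controls))
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical coefficients and exposures, for illustration only
betas = {"trade": 1.2, "colony": 0.3, "culture": 0.8, "proximity": 0.5}
exposures = {"trade": 0.5, "colony": 0.0, "culture": 0.25, "proximity": 0.4}

r = logit_hazard(-4.0, exposures, betas)
print(round(r, 4))
```

With these invented numbers the linear predictor is about -3, so the implied adoption probability in that period is slightly below 5 percent.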

Contagion at time t depends on exposure to subjects already infected at t−1. Valente (1995, 43) defines exposure as the share of infected subjects j in the (time-varying) egocentered network of subject i. The term x_ij denotes a tie between subject i and alter j in the egocentered network of subject i, and a_j indicates whether alter j is already infected at t. The formula below shows that exposure is a function of t, which means that it depends on time.

$$ E_{i} (t) = \frac{{(\sum\nolimits_{j \ne i} {x_{ij}} \bullet a_{j})_{t}}}{{(\sum\nolimits_{j \ne i} {x_{ij}})_{t}}} $$

Figure 1.2 gives an example of how exposure is calculated and represented in time-dependent episode data. The table on the right-hand side of Fig. 1.2 represents the underlying data structure, which comprises two subjects i and j. It shows the dependent variable “d” that denotes whether the innovation was adopted at a particular time point “t,” the network exposure (“expo.”), and one binary control variable. For this exemplary representation, we chose as the control a dummy variable which indicates that subject i belongs to the WEIRD “cultural sphere” of western, educated, industrialized, rich, and democratic countries (Henrich 2020; Seitzer et al. 2021). We will describe this category in more detail later on. To the left of Fig. 1.2, we see the graphical representation of network exposure in the respective episodes. Observation i is exposed to 2/6 of its alters who already adopted a social policy at t1, to 3/6 at t2, 4/6 at t3, and 5/6 at t4. Since subject i adopts the social policy at t4 + 1, when 5/6 in its network are adopters, i’s threshold is 5/6. In contrast, subject j adopted the social policy at t2 + 1 at a threshold of 3/6.
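The exposure sequence of subject i can be reproduced with a few lines of code. This is a sketch under assumptions: the alter labels `a` to `f` and their adoption times are invented so that the shares match the 2/6, 3/6, 4/6, 5/6 sequence described above, and isolates are assigned an exposure of zero.

```python
from fractions import Fraction

def exposure(ties, adoption_time, t):
    """E_i(t): tie-weighted share of ego's alters that have adopted by time t."""
    total = sum(ties.values())
    if total == 0:
        return Fraction(0)  # isolates: exposure set to 0 (assumption of this sketch)
    adopted = sum(w for alter, w in ties.items()
                  if adoption_time[alter] is not None and adoption_time[alter] <= t)
    return Fraction(adopted, total)

# Ego i has six alters with binary ties; adoption times are hypothetical
ties_i = {alter: 1 for alter in "abcdef"}
adoption_time = {"a": 1, "b": 1, "c": 2, "d": 3, "e": 4, "f": None}

series = [exposure(ties_i, adoption_time, t) for t in (1, 2, 3, 4)]
threshold_i = series[-1]  # exposure in the period before i adopts at t4 + 1
print(series, threshold_i)
```

Because i adopts in the period after t4, its threshold is the exposure observed at t4, which is 5/6 here.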

Fig. 1.2
Two illustrations labeled i and j illustrate the network exposure and hazard rate in id, t, d, expo., and WEIRD. The values from the illustration are depicted in a separate table.

Network exposure and the hazard rate

In the column “expo.” in the table to the right of Fig. 1.2, there is a particular value of exposure for each year in which the two countries i and j were at risk of adopting (which means that they had not yet adopted, or T ≥ t). The event of adoption occurs as a result of a given exposure in the moment before adoption, so the respective exposure is lagged by one year. At the bottom of Fig. 1.2, hazard ratios are shown for the binary explanatory variable WEIRD. Country i is WEIRD and has 1 event out of 4 time periods at risk and thus a hazard rate of 0.25. Period t1 has been dropped because of the lagged exposure, i.e., there is no data on subjects that adopted at t ≤ 1. Country j (non-WEIRD) has 1 event out of 2 time periods at risk and thus a hazard rate of 0.5, so the hazard ratio is (1/4)/(1/2) = 0.5. Computing hazard ratios and standard errors for continuous variables, such as exposure, is much more difficult and requires the application of maximum likelihood estimation, particularly if the model includes further covariates.
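The arithmetic of this example can be checked directly on the episode data. The rows below are a reconstruction of the table in Fig. 1.2 from the description in the text (period t1 dropped, i adopting in period 5, j in period 3); the exposure column is omitted because it is not needed for the hazard ratio of the binary variable.

```python
# Person-period rows: (id, t, d, weird), reconstructed from the text
rows = [
    ("i", 2, 0, 1), ("i", 3, 0, 1), ("i", 4, 0, 1), ("i", 5, 1, 1),
    ("j", 2, 0, 0), ("j", 3, 1, 0),
]

def hazard_rate(rows, weird):
    """Events divided by periods at risk within one group."""
    outcomes = [d for (_id, _t, d, w) in rows if w == weird]
    return sum(outcomes) / len(outcomes)

rate_weird = hazard_rate(rows, 1)      # 1 event in 4 periods at risk
rate_non_weird = hazard_rate(rows, 0)  # 1 event in 2 periods at risk
hazard_ratio = rate_weird / rate_non_weird
print(rate_weird, rate_non_weird, hazard_ratio)
```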

The Methodology Used in This Volume

Throughout this volume, we use discrete-time logistic hazard models. The dependent variable is the absorbing destination state of having adopted a social policy (= 1). As in Fig. 1.2, once a country has adopted the social policy in question, it drops out of the risk set. Since j adopts at t2 + 1, there are no data entries for the subsequent time points. Conversely, more entries are given for i because the adoption comes later in t4 + 1. Countries that adopted a policy prior to 1880 are left-censored and dropped from the risk set, and if they did not adopt until 2010, they are right-censored. In hazard models, the consequence of left-censoring is usually that the beginning of the episode is unknown, so we cannot properly compute time-at-risk. Those countries are not considered in the risk set, i.e., in the underlying sample on which hazard ratios are estimated. However, they contribute to the estimation of the network exposure of countries that have not yet adopted. Right-censoring, on the other hand, means that those countries remain in the risk set throughout the entire time frame.
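The handling of the risk set can be sketched as a small function that expands one country into country-year rows. The function name `episodes`, the country labels, and the adoption years are hypothetical; treating an adoption in 1880 itself as falling inside the window is an assumption of this sketch.

```python
def episodes(country, adoption_year, start=1880, end=2010):
    """Return country-year rows (country, year, d) for one country.

    adoption_year=None means the policy was not adopted within the window.
    """
    if adoption_year is not None and adoption_year < start:
        return []  # left-censored: adopted before the window, never in the risk set
    last = end if adoption_year is None or adoption_year > end else adoption_year
    return [(country, year, int(year == adoption_year))
            for year in range(start, last + 1)]

early = episodes("A", 1850)     # adopted before 1880
adopter = episodes("B", 1900)   # adopts inside the window
censored = episodes("C", None)  # no adoption by 2010: right-censored
print(len(early), len(adopter), len(censored))
```

Country A contributes no rows to the hazard model but would still count as an adopter when the exposure of B and C is computed.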

To test whether the diffusion of social policy occurs along particular network contacts, four different networks build the underlying structure through which we assume diffusion to occur. As mentioned before, these are geographic proximity, trade relations, cultural similarity, and colonial legacies. Exposure to countries that already adopted a social policy is calculated separately for every network. Hence, while the exposure of a country i can be very high in the global trade network, it can be zero in the colonial legacies network simply because the country has not had any colonial relationship. Furthermore, the exposure in the respective network is weighted by tie strength, e.g., a geographically close country that has already adopted the social policy contributes more to exposure than an adopter that is further away. Lastly, exposure is estimated either undirected (for the networks of geographic distance, global trade, and cultural spheres) or directed (for the network of colonial legacies). For the latter, this means that if the colonial power adopted a social policy, exposure for its (past) colonial entities increases. However, this does not hold the other way around. In undirected networks, a tie has the same strength in both directions. Generally, (unweighted) exposure is included in the logistic hazard models as a numeric variable ranging from 0, where no alter has adopted the social policy, to 1, where all alters have adopted the social policy. In addition, the geographical proximity network is time constant, meaning the tie strength does not change over the duration of analysis, while cultural spheres, trade, and colonial legacies are time-variant to account for the declining influence of colonial powers after decolonization, changing economic partnerships, and evolving cultural characteristics.
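The difference between weighted, undirected exposure and directed exposure can be illustrated as follows. This is a sketch, not the volume's implementation: the country labels, distances, and tie values are invented, and the normalization by the sum of ego's tie strengths is an assumption that keeps weighted exposure between 0 and 1.

```python
def weighted_exposure(ego, ties, adopters):
    """Tie-weighted share of ego's alters that have adopted.

    ties[ego][alter] is the tie strength as seen from ego; in a directed
    network ties[a][b] may exist while ties[b][a] does not.
    """
    out = ties.get(ego, {})
    total = sum(out.values())
    if total == 0:
        return 0.0  # no ties in this network: exposure is zero
    return sum(w for alter, w in out.items() if alter in adopters) / total

# Undirected proximity network: inverse distances, invented for illustration
proximity = {
    "A": {"B": 1 / 100, "C": 1 / 1000},
    "B": {"A": 1 / 100, "C": 1 / 900},
    "C": {"A": 1 / 1000, "B": 1 / 900},
}
# Directed colonial network: influence runs from colonial power P to colony K only
colonial = {"K": {"P": 1.0}}

near = weighted_exposure("A", proximity, {"B"})  # the close neighbor adopted
far = weighted_exposure("A", proximity, {"C"})   # the distant country adopted
colony = weighted_exposure("K", colonial, {"P"})
power = weighted_exposure("P", colonial, {"K"})
print(near, far, colony, power)
```

An adoption by the nearby country B raises A's exposure far more than an adoption by the distant country C, and an adoption by the colony K leaves the exposure of the colonial power P at zero.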

We take the four networks as the underlying structures for the diffusion process. As we will see later in the chapter, all networks constitute different avenues or “pipes” through which communication and information about social policies can travel. Taken together, these networks emphasize different specificities of countries’ interdependencies. By including different networks, we assume to catch as many instances of network diffusion as possible through the different mechanisms.

However, social policy diffusion can also depend on domestic factors such as a country’s level of economic development or financial capability. The same is true for civil freedom in the political regime (Lindert 2004). Thus, we introduce Gross Domestic Product (GDP) per capita (Inklaar et al. 2018) and a democratization index as baseline control variables. The former was linearly interpolated for the whole time frame by taking the minimum value for every income group based on all observations before 1800 and filling in any missing values according to the minimum of the respective income group of the corresponding country by assuming a logistic growth function. Provided there were no data available, these were the values to start the interpolation into future years. This yields a continuous measure of economic development from 1880 to 2010 for almost all countries in our dataset. For the level of democratization, we use the basic Varieties of Democracy Regime Score (Lührmann et al. 2018), which in the raw data ranges from 0 to 9 and was linearly interpolated for any missing data points. This method introduces some noise to the data, as it fills missing data points with non-natural numbers (decimals). However, filling missing data either with the number observed before or thereafter would make the measurement error even greater. We suspect the benefits of the interpolation to be greater than its disadvantages and certainly greater than having to discard observations.
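The linear interpolation of the regime score, and the decimals it produces, can be seen in a minimal sketch. The function name `interpolate_linear` and the example series are hypothetical; gaps before the first or after the last observed value are left untouched here, which is a simplification.

```python
def interpolate_linear(series):
    """Fill None gaps between observed values by linear interpolation."""
    filled = list(series)
    known = [k for k, v in enumerate(series) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (series[right] - series[left]) / (right - left)
        for k in range(left + 1, right):
            filled[k] = series[left] + step * (k - left)
    return filled

# Hypothetical yearly regime scores with missing years
scores = [3, None, None, 6, None, 7]
filled = interpolate_linear(scores)
print(filled)
```

The value 6.5 in the filled series is the kind of non-integer score the text refers to: it lies between two observed categories rather than copying either of them.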

Additionally, the diffusion process in question might show time dependency resulting from unobserved heterogeneity. Hence, we control for time dependency by using a piecewise constant step function, based on a baseline of, e.g., 25-year steps, from 1880 until 2010. Nevertheless, as different as the social policy fields are in this volume, authors might very well find a way of defining time effects that better fits their theories and hypotheses. One last variable that needs introduction is “trade existed.” This variable stems from and directly refers to the global trade network. Because of the historicity of the data, we are often unable to accurately describe national units that were not established in the respective historical period. This problem is especially apparent in the network of global trade based upon the Correlates of War (COW) International Trade Dataset (Barbieri and Keshk 2016). Since their collection efforts were for the purpose of measuring trade between states, any states considered to be non-existent at a particular moment according to the COW definition are not included in times of non-existence. Because our data cover the network across all 164 countries from 1880 until 2010, empty dyads in the trade network do not necessarily mean that no trade happened. It might just mean that the country did not exist as an independent trading partner and therefore trade with this country was impossible. States that did exist but did not officially trade are coded with a value of zero. To control for the possible distortion of the two different meanings of zero ties, we include a dummy variable in all estimations which signifies whether a country, according to the trade data, existed (= 1) or did not exist (= 0).
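The piecewise constant step function amounts to one dummy per time interval. The sketch below assumes 25-year steps counted from 1880, so the last interval (2005 to 2010) is shorter than the others; the function names are hypothetical.

```python
def period_index(year, start=1880, step=25):
    """Index of the time interval a year falls into (0 for 1880-1904)."""
    return (year - start) // step

def period_dummies(year, start=1880, end=2010, step=25):
    """One 0/1 indicator per interval; exactly one of them equals 1."""
    n_periods = period_index(end, start, step) + 1
    k = period_index(year, start, step)
    return [int(p == k) for p in range(n_periods)]

print(period_dummies(1880))
print(period_dummies(1905))
print(period_dummies(2010))
```

Each country-year row receives these indicators as the time dummies of the hazard model, so the baseline hazard is constant within an interval and may shift between intervals.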

Lastly, we face a problem with statistically non-independent observations. During the time frame under investigation, there are historical periods in which several countries did not exist because they were part of a larger unit. An ideal-typical example is the countries of the former Yugoslavia. If Slovenia and Croatia both adopted a social policy while they were part of Yugoslavia, then Yugoslavia was the overarching unit that actually adopted the policy, so the adoptions recorded for Slovenia and Croatia are not independent observations. This is due to the way we arrange the dataset for the diffusion analysis: the set of nodes in the network is constant over time, which implies that Slovenia and Croatia are treated as existing before, during, and after Yugoslavia existed. Our approach to address this problem is to regard Slovenia and Croatia as “spatial patches,” remaining well aware of the fact that many countries actually changed their borders throughout history. From this perspective, Slovenia and Croatia were spatial patches at risk of adopting a social policy before, during, and after Yugoslavia existed. Yugoslavia will not be regarded as a subject in our sample, but Slovenia and Croatia and all other countries formerly belonging to Yugoslavia are indeed distinct units. These subjects are not, however, statistically independent from one another. This does not pose a problem for the calculation of exposure through the networks, but it does violate the assumption of independent error terms in the maximum likelihood estimation. In the logistic diffusion model, we address this statistical non-independence by using cluster-robust standard errors (Zeileis et al. 2020). Our procedure has the following advantage: it accounts for the statistical non-independence of observations when they are part of an overarching cluster (spatial patch) by using the corrected standard errors, but it does not impose any standard error correction in the hazard model for country-years not belonging to the respective cluster or to any other cluster.

The analyses in all chapters of this book follow the same rationale: First, the exposure to already “infected” countries is calculated for each network, as discussed above. This statistic, i.e., the weighted share of ego’s network contacts who had already adopted the policy in question at t−1, is then handed over to a discrete-time hazard model. In this model, the adoption rate is regressed on exposure, controlling for GDP per capita, the democracy index, and additional policy field-specific factors. The cluster-robust standard errors account for the statistical non-independence that would otherwise distort the standard errors. In most chapters, we present the coefficients as hazard ratios, representing the influence of the predictors on the risk of policy adoption. The results therefore allow us to determine which of our networks represent a “pipe structure” for the contagion of social policies through exposure to countries that already adopted the respective policy. We can determine whether factors such as cultural similarity or trade, for example, have a stronger effect on the adoption of a policy, as they represent the better “diffusion channel.” To give an example: if we extend a diffusion model based on the cultural spheres network by adding the trade network and the effect of the former thereby loses significance, then the trade network is not just a mediator of the effect of cultural similarity on diffusion but also the more appropriate explanatory variable.

Networks of Social Policy Diffusion

In the following, we discuss the networks we use to explain diffusion processes in different fields of social policy. As mentioned before, countries are tied to each other in networks of geographical proximity, global trade, cultural spheres, and colonial legacies. These network dimensions are the basis of our comparative analysis of diffusion based on the network diffusion event history model discussed in the previous section.

At first sight, our four networks seem to correspond with the mechanisms discussed in the diffusion literature (Obinger et al. 2013; Starke and Tosun 2019; Gilardi 2016). Networks of colonial legacies could correspond with coercion, global trade networks with competition, cultural spheres networks with learning, and geographical proximity with imitation. On second thought, however, such an assignment between network dimensions and diffusion mechanisms does not capture the complex reality of policy diffusion. For instance, global trade networks can also indicate cooperation and division of labor, so that the mechanism at the dyadic or country level would surely be different. Moreover, whether a policy adoption in a particular country results from learning or imitation is hard to decide from a global, macro-quantitative perspective. It is thus important to put the power of the network diffusion approach into perspective. Network diffusion analysis based on multiplex networks can reveal the relative importance of the respective “pipe structure” for the diffusion process under investigation. But it neither provides information on agency and decision-making nor guarantees that the networks considered in the analysis actually are the most important structures. Possibly, other network dimensions, international organizations, or even personal networks between experts and policymakers are more important, e.g., for learning. Our approach is thus a starting point for the global analysis of network diffusion of social policies.

Network of Geographic Distances

There is little doubt that geographical distance influences diffusion processes. The closer the objects of investigation are located to each other, the more likely they are to come into contact and the more likely the content of the diffusion process (e.g., a disease, an innovation, or a social policy) will be contracted or adopted. A simple and illustrative example is the spread of a virus transmitted via personal contacts (Cliff 1979), or, alternatively, the negative effect of geographical distance in migration (Windzio 2018) as predicted by the gravity model (Dodd 1950). The “neighborhood effect” is a simplified version of spatial distance, whereby a location in the immediate neighborhood increases the risk of adoption.

The strength of neighborhood effects can be derived from the diffusion rate, which in turn depends on the properties of the diffusing information. The adoption rate usually declines with increasing complexity of knowledge or increasing capital intensity (Staudacher 2005). Rumors about prominent personalities spread rather quickly, while complex scientific findings, for example, show a much slower diffusion. In addition to the speed of diffusion, the spatial area in which diffusion takes place is a crucial factor. If geographical distances were the only explanatory factor, the speed of diffusion would allow conclusions about the topology of the area and the distribution of subjects within this area. The diffusion rate is usually not constant across time and space. Rather, there are preferred routes, for example through particularly intensive contacts, which increase the propagation velocity along certain diffusion channels and thus have a significant influence on the propagation area (Grabher 2006).

In previous research on policy diffusion (Obinger et al. 2013), geographical distances were used as weighting matrices in spatial regression models to capture dependencies in the form of “spatial lags” (e.g., Franzese and Hays 2007; Schmitt and Obinger 2013). A simple form of a spatial weighting matrix is the neighborhood matrix. If two countries have a shared border, the respective cell of the neighborhood matrix has a value of 1, and otherwise 0 (Windzio et al. 2019). The neighborhood matrix thus implies the assumption that only countries with a common border can influence each other (Obinger et al. 2013). Not least because of the criticism of this very narrow assumption, the distances between capital cities were used instead of, or rather in addition to, the neighborhood matrix to define “spatial lags” in the weighting matrices (Schmitt 2019; Simmons and Elkins 2004).

As a justification of the relevance of geographical proximity, it is often argued that the intensity of communication between countries can increase due to their proximity. This argument implies the assumption that the exchange of information between neighboring or geographically close countries is substantially higher. Even unintended forms of information exchange occur more easily and thus more frequently. In addition, policy examples from neighboring or nearby countries are often regarded as a blueprint for a country’s own national policies, so that a high degree of mutual influence is assumed due to geographical proximity (Schmitt and Obinger 2013). However, a clear assignment of geographical proximity to one of the mechanisms from the diffusion literature, namely (i) learning, (ii) competition, (iii) imitation, or (iv) coercion (Obinger et al. 2013; Starke and Tosun 2019), is difficult. Maggetti and Gilardi (2016) conclude that “Geography is often an important component of diffusion, but it cannot be linked straightforwardly to any of the […] mechanisms. Therefore, it is a catch-all indicator that usually discriminates between them. It is best used in combination with other indicators” (Maggetti and Gilardi 2016, 93).

Similarly, Simmons and Elkins (2004) note that geographical distance does not provide a satisfactory explanation for policy diffusion per se. In line with their view, Beck et al. (2006) point out in a contribution with the telling title “Space is more than Geography” that taking geographical distances into account in spatial econometrics is a methodological enrichment, but that other measures for determining interconnectedness between states would produce more fruitful results. Similarly, Boschma (2005) argues that proximity not only encompasses physical–geographical proximity, but that cognitive, organizational, social, and institutional forms of proximity exist as well. Accordingly, neighboring countries are more likely to display similar social structures and traditions. These similarities serve as one explanation for the high correlation between culture and spatial proximity. This argument fits well with our idea that cultural proximity can also be an important dimension. Ties in the network of “cultural spheres” (see below), which is correlated with spatial proximity, can be a much more meaningful condition of diffusion. However, whereas spatial proximity between capitals is measured almost without error, the network of cultural spheres is a combination of various complex characteristics and therefore more prone to measurement error. Based on this brief overview, we argue that the network of geographical distances serves as a reference point to measure the relevance of the contact networks of global trade, cultural spheres, and colonial connections.

The calculation of distances between capitals is described in detail in Eiser et al. (2020). The corresponding dataset is available in the Global Welfare State Information System WeSIS (www.wesis.org). To ensure that an increasing geographical distance indicates a decrease in the intensity of contact, we calculated the inverse of distance. The value for the contact between two countries i and j due to geographical proximity is therefore:

$$ x_{ij} = {1}/{\text{capital distance}} $$

Even though there are occasional shifts of the capital in some countries, for pragmatic reasons the distances are based on the capital cities in 2020. Therefore, the geographical distances are a time-invariant network.
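The inverse-distance tie can be sketched as follows. The great-circle (haversine) formula is an assumption of this sketch, since the actual distance calculation is documented in Eiser et al. (2020) and may differ; the coordinates for Berlin and Paris are approximate and serve only as an illustration.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))

def proximity_tie(distance_km):
    """x_ij = 1 / capital distance, so closer capitals have stronger ties."""
    return 1.0 / distance_km

# Approximate capital coordinates (latitude, longitude), for illustration only
berlin = (52.52, 13.405)
paris = (48.8566, 2.3522)

d_km = haversine_km(*berlin, *paris)
x_ij = proximity_tie(d_km)
print(round(d_km), x_ij)
```

Because the tie is the reciprocal of distance, a capital ten times further away receives one tenth of the weight when exposure is computed.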

Global Trade Networks

In international comparative social policy research, trade networks are a central indicator for mapping economic globalization processes. Both in the first wave of globalization from 1890 to World War I (WWI), and especially during the second wave of globalization from World War II (WWII) to the mid-1980s, the density of trade networks increased rapidly, and trade was the central engine of economic globalization (Mossig and Lischka 2022). In social policy research, the share of trade ([imports + exports]/GDP) was traditionally interpreted as an indicator of economic openness (Busemeyer 2009). Cameron (1978) was one of the first to show an empirical association between the expansion of the public sector and the integration into world trade for 18 Western industrialized countries. According to his argument, open economies with a high share of trade in GDP are particularly dependent on external events, such as price developments on the world market. In order to counteract these external dependencies, these open economies try to extend their influence within the domestic economic sectors. Smaller economies in particular have comparatively high trade shares as a percentage of GDP due to the smaller domestic market and a high degree of specialization in their own industrial structure. Accordingly, the economic openness of smaller economies, such as the Scandinavian countries or the Netherlands, partially explains the disproportionate expansion of the welfare state. In the literature, such side effects of economic globalization are discussed in the context of the compensation thesis (Rieger and Leibfried 2003; Starke and Tosun 2019).

Since the 1980s, the importance of trade networks for world market integration has declined. States have now become increasingly involved in global competition for foreign direct investment (FDI). This competition takes place with regard to the range of low-cost location conditions offered, for example, in terms of social security contributions or taxes (Mossig and Lischka 2022; Düpont et al. 2022). In order to survive this competition, policymakers considered it necessary to dismantle the welfare state by lowering social standards and social contributions (Swank 2010), which was referred to as a “race to the bottom” (Kvist 2004) in the literature.

Openness or inclusion as measured by trade shares in a country’s GDP or foreign direct investments (FDI stocks or flows) is a highly aggregated indicator. It disregards the varying importance of different trading partners, i.e., it does not differentiate between trading partners that are important and unimportant to ego. In addition, indirect connections via third trading partners are neglected. However, the structure of the network and the position of the individual states in this network largely determine the scope of action and also influence the vulnerability and sensitivity of interstate relations (Glückler and Doreian 2016; Maoz 2011). The significance of economic globalization and the relevance of intensifying trade linkages for the diffusion of social policy is based on the assumption that important trading partners influence a country’s policies more strongly than subordinate trading partners do. As a result of the globalization process, countries are becoming more closely aligned with one another, although this does not necessarily mean that social policy has to converge (Jahn 2016).

The trade networks were defined as follows: The trade data are collected from the Correlates of War Project (Barbieri and Keshk 2016). The edge weights were determined for each year according to the following rules. The volume of trade between each pair of countries comprises the total trade in goods in one year and is therefore undirected. The original trade flows were converted into US$ using the average exchange rate from 2011 to avoid an inflation-related densification of the networks. Due to the extremely different trade volumes, we logarithmically transformed the trade values. The edge weight of trade interdependence between two countries i, j is therefore:

$$ x_{ij} = \begin{cases} \log(\text{trade}) & \text{if trade} > 0 \\ 0 & \text{if trade} = 0 \end{cases} $$

If a dyad shared any trade volume in a given year, the log of this volume was used; otherwise, the edge was set to 0. Further decisions regarding the construction of trade networks concern former countries that have split up over time, e.g., Austria-Hungary, Czechoslovakia, or Serbia-Montenegro. In such cases, the trade volume of the shared years was divided according to the GDP proportions of the successor countries after their separation. In the case of the Union of Soviet Socialist Republics (USSR) this refers to the period from 1922 to 1991, in the case of the Baltic States 1941–1991, and in the case of the former Yugoslavia the period from 1918 to 1992. “Small” states that once existed but are not represented in the selected country sample for this anthology were deleted (e.g., Yemen People’s Republic, Republic of Vietnam, Korea from 1880 to 1905, Kosovo, Zanzibar). Furthermore, because there are some missing values, we include a dummy variable in the later analysis which depicts whether a country “existed” based on the COW definition, as explained in detail above. The network representation in Fig. 1.3 is a quadrilateral Simmelian backbone (Nocaj et al. 2015).
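
The two construction rules described above, the log-transformed edge weight and the GDP-proportional split of a predecessor state's trade, can be sketched as follows. This is only an illustration under stated assumptions: the natural logarithm is assumed because the text does not name the base, and the function names, country labels, and figures are hypothetical.

```python
import math

def trade_edge_weight(trade_volume):
    """Edge weight x_ij: log(trade) if trade > 0, otherwise 0.

    Assumption: natural logarithm; the base is not stated in the text.
    """
    if trade_volume < 0:
        raise ValueError("trade volume cannot be negative")
    return math.log(trade_volume) if trade_volume > 0 else 0.0

def split_by_gdp_share(total_trade, successor_gdp):
    """Divide a predecessor state's dyadic trade among its successors
    in proportion to their GDP after separation."""
    total_gdp = sum(successor_gdp.values())
    if total_gdp <= 0:
        raise ValueError("successor GDP must sum to a positive value")
    return {country: total_trade * gdp / total_gdp
            for country, gdp in successor_gdp.items()}

# Hypothetical example: a predecessor state traded 100 units with a partner;
# its two successors later have GDPs in a 3:1 ratio.
shares = split_by_gdp_share(100.0, {"successor_a": 3.0, "successor_b": 1.0})
edge_weights = {country: trade_edge_weight(v) for country, v in shares.items()}
```

Note that with a log transformation the unit of the trade values matters: volumes between 0 and 1 would yield negative weights, so the sketch presupposes volumes recorded in units large enough to avoid this.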

Fig. 1.3
A cluster diagram of the global trade network. Labeled nodes include ETH, SWZ, SLE, SAB, AZE, and GEO, among other connected countries.

The network of global trade in 2010

The network visualization only shows to a limited extent how intensively individual countries are involved in global trade. However, weighted degree centrality can be used as a measure of network integration. In 2010, China was the country with the highest trade integration (degree centrality of 1.92). The value 1.92 indicates that China was involved in 1.92% of bilateral trade worldwide, followed by the USA (1.80), Germany (1.69), and France (1.57). Table 1.1 divides the country sample into quartiles. The 17 countries (10.4% of 164 countries) with the highest centrality values account for 25% of the cumulative degree centrality. In contrast, the last quartile is occupied by the 81 countries with the lowest centrality values in global trade. Bhutan, for example, which ranks last, only accounts for 0.05% of global trade. Lesotho (0.08), the Comoros (0.09), and the Solomon Islands (0.10) also had a very low degree centrality in 2010.

Table 1.1 Countries in quartiles of degree centrality
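
A minimal sketch of how weighted degree centrality can be expressed as a percentage share and then grouped into quartiles of cumulative centrality. It assumes that each country's weighted degree is divided by the sum of all weighted degrees, which is one plausible reading of the percentages reported above rather than a documented procedure; the toy network and function names are hypothetical.

```python
def degree_shares(edges):
    """edges: dict {(i, j): weight} of an undirected network.
    Returns each node's weighted degree as a percentage of the total."""
    strength = {}
    for (i, j), weight in edges.items():
        strength[i] = strength.get(i, 0.0) + weight
        strength[j] = strength.get(j, 0.0) + weight
    total = sum(strength.values())
    return {node: 100.0 * s / total for node, s in strength.items()}

def cumulative_quartiles(shares):
    """Rank nodes from most to least central and assign each node to the
    quartile of cumulative share reached before the node is added."""
    groups = {1: [], 2: [], 3: [], 4: []}
    cumulative = 0.0
    ranked = sorted(shares.items(), key=lambda item: item[1], reverse=True)
    for node, share in ranked:
        quartile = min(4, int(cumulative // 25) + 1)
        groups[quartile].append(node)
        cumulative += share
    return groups

# Hypothetical toy network with four countries.
toy_edges = {("A", "B"): 4.0, ("A", "C"): 3.0, ("B", "C"): 2.0, ("C", "D"): 1.0}
toy_shares = degree_shares(toy_edges)
toy_quartiles = cumulative_quartiles(toy_shares)
```

Because centrality is concentrated in a few nodes, the first quartile of cumulative centrality typically contains far fewer countries than the last one, which mirrors the 17 versus 81 countries reported above.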

The Network of “Cultural Spheres”

During the last decades, culture became an increasingly important concept in economics and the social sciences (Rose 2019; Emirbayer and Goodwin 1994). Despite its importance, however, culture is quite a controversial concept. Culture exists at different levels (Basáñez 2016); it can be very local, or it can encompass wider regions of the world—the term can be used to refer to the character of business organizations or of neighborhoods, cities, and nation-states (Anderson-Levitt 2007). Huntington’s “clash of civilizations” emphasized the role of cultural conflicts after the end of the Cold War but attracted sharp criticism for his approach because it also challenged optimistic views on cultural diversity. He derived his typology of world cultures from the most important world religions but did not appropriately account for the cultural diversity within these religions and regions. Finally, he focused on “fault lines” between cultures, where he supposed conflicts to be most likely to occur (Huntington 1993). Given this criticism, scientific investigations should think more carefully about how to classify cultures rather than simply abstaining from analyzing this important driving force of global politics and political and social change of nation-states. We thus use the concept of “cultural spheres,” which distinguishes cultures in the world but allows fuzzy boundaries, a considerable degree of overlap, and change in cluster membership over time (Windzio and Martens 2021). Our typology of cultural spheres results from a combination of time-varying indicators. By regarding cultures as spheres with fuzzy boundaries, changing membership, and considerable overlap, we avoid an essentialist concept of culture. 
We coded our cultural indicators as binary variables and created a valued two-mode network in which countries are linked to one another by sharing one or several cultural characteristics, e.g., the highest quartile of the index of political liberties or the same language group. We used the following cultural characteristics to build the two-mode network of cultural spheres: a country’s dominant religion, gender relations, civil liberties, rule of law, government ideology (nationalist, socialist or communist, restorative or conservative, separatist or autonomist, religious), dominant language group, hegemonic language (English, Spanish, Arabic), Huntington’s civilizations (African, Buddhist, Hindu, Islamic, Latin American, Lone States, Orthodox, Sinic, Western), and both long and short-term colonial influence (Besche-Truthe et al. 2020). The more of these characteristics two countries share, the higher their cultural proximity. In our network diffusion models, we thus include exposure as a weighted term, which means that exposure increases with the growing share of adopters in the network but also with the increasing tie strength to these adopters. The cultural spheres network is time-varying. For example, the proportions of dominant religious or ethnic groups as well as the dominant language changed over time.
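
The step from binary cultural indicators to a valued country-by-country network can be sketched as a simple projection of the two-mode network: the tie weight between two countries is the number of characteristics they share. The characteristics and country labels below are hypothetical placeholders, and the function is an illustration, not the procedure used to build the published dataset.

```python
from itertools import combinations

def cultural_proximity(memberships):
    """memberships: dict mapping a country to the set of binary cultural
    characteristics coded 1 for that country.
    Returns {(i, j): number of shared characteristics} for all pairs
    that share at least one characteristic."""
    ties = {}
    for i, j in combinations(sorted(memberships), 2):
        shared = len(memberships[i] & memberships[j])
        if shared > 0:
            ties[(i, j)] = shared
    return ties

# Hypothetical coding for one year.
memberships = {
    "A": {"language:english", "religion:x", "civil_liberties:top_quartile"},
    "B": {"language:english", "civil_liberties:top_quartile"},
    "C": {"religion:x"},
    "D": {"language:spanish"},
}
proximity = cultural_proximity(memberships)
```

Recomputing the projection for each year with that year's coding yields a time-varying network, since the set of characteristics attached to a country can change.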

To gain a better overview of the network and the resulting cultural spheres, we clustered the network with a Louvain clustering algorithm. In line with the time-variant nature of the network, the result suggests a five-cluster solution in 1880 but a three-cluster solution in 2010. This supports the idea that there has been an increasing isomorphism in institutional structures around the globe (Meyer et al. 1997). Figure 1.4 shows the result of the Louvain clustering procedure, which yields a three-cluster solution for the year 2010. Blue vertices represent a cluster of mostly WEIRD (see above) and economically developed countries, the second cluster (green) mainly consists of non-dominantly Muslim African, Asian, and South American countries, and the third cluster (orange) is dominated by Muslim countries. A closer inspection of these clusters shows that there is some overlap between cultural spheres and world regions, but this correspondence is far from perfect.

Fig. 1.4
A clustered network of countries depicting the cultural spheres. Labeled nodes include KOR, JPN, IND, and MNG, among other connected countries.

Network of cultural spheres in 2010

Network of Colonial Legacies

Researching the history of social policy adoption means acknowledging specific historical interdependencies. A thorough and encompassing diffusion study must consider early social policy diffusion “under the conditions of colonialism” and “under conditions of continuing post-colonial ties” (Kuhlmann et al. 2020, 81). Influences of these dependencies can be as diverse as the mechanisms of diffusion. On the one hand, we can assume a coercive mechanism in that the empire just implemented policies in colonies without deliberation. The process of social policies diffusing from the empire to dependent entities is described as “imperial diffusion” (Kuhlmann et al. 2020). After the colonial dominion ended, however, a different diffusion mechanism might have been at work. For example, we know from diffusion research that perceived similarity can foster orientation toward some specific “role model” countries; Australia might look to Britain and Guinea to France for appropriate policy solutions (Dobbin et al. 2007, 453). Furthermore, possible policy solutions can be easier to implement because of path dependencies, such as institutional structures implemented during colonial rule that were modeled according to the role model. However, adverse effects can also exist, as the institutionalization of policies in colonies differed in light of different characteristics and the strength of indigenous traditions (Craig 1981, 192).

Moreover, after colonization ends, the forged linkages between nation-states can facilitate diffusion in several ways. Specialized actors enter into transnational contact, especially in cases where nation-states are actively searching for role models for their institutions or for the transformation of their welfare systems. Once a colonial link has been forged, the influence does not recede immediately after independence. Indeed, past studies show a strong correlation between colonial past on the one hand and enhanced contact and influence between the two countries on the other, for example through migration (Windzio 2018) or development aid (Shields and Menashy 2017).

To include both colonial dependencies and postcolonial influences, we established a network of colonial legacies which is time-variant, directed, and weighted. This means that, much like in social network surveys, colonized countries “nominate” their colonizers. In the years of colonial dominion, the weight of the tie is 1. After colonization ended, an exponential decay is applied to the tie weight, representing the eroding influence of the former colonial link. The exponential function has been chosen because the values tend to get quite small, i.e., the influence via a link of colonial legacy is diminishing. For example, the values of ties are 0.97 one year after colonization ended, 0.77 ten years after, and 0.08 one hundred years after. The decay was calculated with the following function:

$$ \exp\left(-\frac{\text{no. of years since colony ends}}{40}\right) $$
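
As a small worked illustration of this decay function, the sketch below computes the tie weight for a given number of years since the end of colonial rule. The function name and the treatment of years during colonial rule (weight fixed at 1) follow the description in the text; everything else is a hypothetical convenience.

```python
import math

def colonial_tie_weight(years_since_colony_ended, scale=40.0):
    """Weight of a colonial tie: 1 during colonial rule, then
    exp(-(years since colony ended) / 40)."""
    if years_since_colony_ended <= 0:
        return 1.0  # still under colonial rule (or the final year of it)
    return math.exp(-years_since_colony_ended / scale)

# Tie weights 1, 10, and 100 years after the end of colonial rule.
decay_values = {years: colonial_tie_weight(years) for years in (1, 10, 100)}
```

The resulting values agree with the examples quoted in the text (0.97, 0.77, and 0.08) when they are truncated to two decimal places.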

In our view, the influence of a colonial power does not simply disappear immediately after the colony becomes officially independent. According to our assumption, the influence of the colonial power declines much more gradually over time after official independence is achieved. There are different variants to compute exposure due to the colonial legacy based on this function. We apply the function to the standardized exposure as computed by the netdiffuseR package (Vega Yon and Valente 2021), which restricts the range of exposure to between zero and one. Accordingly, the theoretical assumption is that colonial legacy is very strong and the strength of the colonial power’s influence remains almost constant after independence. In contrast, if we do not standardize the exposure, the influence of former colonial powers still exists, but compared to the standardized computation of exposure, the power declines after colonization. To date, there is no commonly accepted standard by which the influence of former colonial ties on the subsequent history of a country can be modeled. Further alternative approaches are conceivable, e.g., that the network of colonial ties is simply cross-sectional, but this would be a strange assumption for the historical periods before colonization. Another approach would be to test the influence in a time-constant way after colonization ended, whereas the tie in the colonization network is zero before colonization. The “right” way to capture the effect of colonial legacies might also depend on the particular social policy under investigation. Finally, since we are interested in comparing the effects of different networks on the diffusion of social policy, we should keep in mind the strong correlation of exposure across different networks. Hence, researchers should also interpret their results against the background of considerable multicollinearity.

Fig. 1.5
A cluster diagram of the colonial legacy network. Labeled nodes include BEL, NLD, PRT, GBR, RUS, TUR, FRA, ESP, MGN, CHN, and JPN.

The network of colonial legacies


The raw data is based on the Colonial Dates Dataset (COLDAT) by Bastian Becker (2019) in combination with the Centre d’Etudes Prospectives et d’Informations Internationales (CEPII) (Head and Mayer 2014) and our own data collection using Wikipedia. In line with CEPII, our definition of colonial links is that a colonial relationship should involve long-term, civilian administration that includes significant settlement. We assume, for instance, that the territory of what is now known as Armenia was “colonized” by the Persian Empire before 1828 and by the Ottoman Empire before 1920, as well as simultaneously by Russia between 1813 and 1918. After that time, we assume Armenia to be a “colony” of Russia until the dissolution of the USSR in 1991. Although these relations do not depict “classic” (exploitative) colonial relations, we find merit in a more encompassing approach. The long rule of an empire leaves marks on the society and the political system at large. We still see some former USSR states that actively seek contact with Russia and openly base their (authoritarian) policies on Russian examples, e.g., Belarus. Furthermore, by using the aforementioned decay function we do assume a decreasing influence of former rule by empires. Nevertheless, the colonial network poses a methodological problem when, for instance, a social policy was adopted for the entirety of the USSR. Due to the simultaneous adoption of policy, the lagged exposure of former USSR states would be calculated as 0 at the time of policy adoption. That, however, would assume an incorrect threshold and therefore distort the regression estimation. Hence, in contrast to Fig. 1.2, exposure was calculated without a one-year lag in the colonial network, i.e., exposure at t is calculated as the ratio of alters that had adopted at t rather than at t−1.
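
The consequence of dropping the one-year lag can be made concrete with a short sketch. It reads exposure as the weighted share of ego's alters that have adopted by year t minus the lag, which is an interpretation of the description above rather than the netdiffuseR implementation. The node labels and the adoption year are hypothetical.

```python
def exposure(ego, weights, adoption_year, year, lag=1):
    """Weighted share of ego's alters that had adopted by (year - lag).

    weights: dict {(ego, alter): tie weight} of a directed network in which
             ego "nominates" alter (here: colony nominates colonizer).
    adoption_year: dict {node: year of adoption}; missing means never adopted.
    """
    total = 0.0
    adopted = 0.0
    for (sender, alter), weight in weights.items():
        if sender != ego:
            continue
        total += weight
        alter_year = adoption_year.get(alter)
        if alter_year is not None and alter_year <= year - lag:
            adopted += weight
    return adopted / total if total > 0 else 0.0

# Hypothetical case: a dependent territory and its ruling center adopt a
# policy in the same year.
ties = {("territory", "center"): 1.0}
adoption = {"territory": 1930, "center": 1930}

lagged = exposure("territory", ties, adoption, year=1930, lag=1)
contemporaneous = exposure("territory", ties, adoption, year=1930, lag=0)
```

With the lag, the territory appears to adopt without any exposure; without it, the simultaneous adoption by the center is counted as full exposure.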

Correlations of Our Networks

Social networks are the structural backbone of the diffusion process. We are interested in whether the dimensions of the multiplex network (geographic proximity, colonial heritage, global trade, and cultural proximity) actually relate to different influence channels, or whether they tend to be rather redundant. If the edge values of the same dyads are strongly correlated across networks, these networks tend to be redundant. As Table 1.2 indicates, this is definitely not the case. Here we see a correlation matrix of the weighted edges and find only minor correlations overall. We find the highest correlation between the network of cultural spheres and (log) trade (r = 0.242). As a result, these four networks are far from being redundant.

Table 1.2 Correlations of (weighted) networks
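
A correlation of weighted edges across two networks can be computed by aligning the networks dyad by dyad and correlating the two vectors of edge values. The sketch below assumes undirected dyads and treats absent edges as 0; both are simplifying assumptions, and the toy networks are hypothetical.

```python
import math
from itertools import combinations

def pearson(xs, ys):
    """Pearson correlation of two equally long lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def edge_weight(network, i, j):
    """Weight of the undirected dyad (i, j); absent edges count as 0."""
    return network.get((i, j), network.get((j, i), 0.0))

def network_correlation(net_a, net_b, nodes):
    """Correlate the edge values of two networks over all dyads."""
    dyads = list(combinations(sorted(nodes), 2))
    values_a = [edge_weight(net_a, i, j) for i, j in dyads]
    values_b = [edge_weight(net_b, i, j) for i, j in dyads]
    return pearson(values_a, values_b)

# Hypothetical toy networks on four countries.
nodes = ["A", "B", "C", "D"]
toy_trade = {("A", "B"): 5.0, ("A", "C"): 2.0, ("B", "D"): 1.0}
toy_culture = {("A", "B"): 3.0, ("C", "D"): 4.0}
r = network_correlation(toy_trade, toy_culture, nodes)
```

A value of r close to 0 indicates that strong ties in one dimension say little about tie strength in the other, which is the pattern reported in Table 1.2.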

However, the correlation of these networks is not the same as the correlation of exposure to alters that have already adopted the information. At the beginning of a pandemic, for example, when exposure is generally low, it does not matter whether these networks are correlated or not. Exposure will be low anyway. Minor differences in network structure can correspond to strong differences in exposure if, for example, the ego-network of country i has just one more tie to an adopter in the (weighted) trade network than in the cultural spheres network but the additional tie in the trade network has a particularly high weight in the computation of exposure. Similarly important are situations when most alters are already infected and exposure is generally high. Exposure can be 1 (maximum normalized exposure) in a network dimension where ego is tied to 12 alters, but it can also be 1 in another network dimension where ego is tied only to 2 alters. Table 1.3 shows the correlations of (weighted) exposure to alters that adopted compulsory education. Indeed, correlations are considerably higher. Exposure in the network of geographic proximity is strongly correlated with exposure in the network of cultural proximity (0.919) and also with exposure in the trade network (0.741). Moreover, exposure in the trade network and exposure in the network of cultural proximity are highly correlated as well (0.728).

Table 1.3 Correlations of (weighted) exposure to alters that adopted compulsory education
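
The two situations described above, identical normalized exposure despite very different numbers of alters and a single heavy tie dominating exposure, can be reproduced with a few lines. The sketch assumes that normalized exposure is the weighted share of ego's alters that have adopted; all labels and weights are hypothetical.

```python
def normalized_exposure(alter_weights, adopters):
    """alter_weights: dict {alter: tie weight} for one ego in one network.
    Returns the weighted share of ego's alters that have adopted (0 to 1)."""
    total = sum(alter_weights.values())
    if total == 0:
        return 0.0
    adopted = sum(w for alter, w in alter_weights.items() if alter in adopters)
    return adopted / total

# Ego is tied to 12 alters in one dimension and to 2 alters in another.
many_alters = {f"alter_{k}": 1.0 for k in range(12)}
few_alters = {"alter_0": 1.0, "alter_1": 1.0}
all_adopted = set(many_alters)

# The same single adopter "a", once with equal ties and once with a heavy tie.
equal_ties = {"a": 1.0, "b": 1.0, "c": 1.0}
one_heavy_tie = {"a": 8.0, "b": 1.0, "c": 1.0}

exposure_many = normalized_exposure(many_alters, all_adopted)
exposure_few = normalized_exposure(few_alters, all_adopted)
exposure_equal = normalized_exposure(equal_ties, {"a"})
exposure_heavy = normalized_exposure(one_heavy_tie, {"a"})
```

Once most alters have adopted, exposure approaches 1 in every dimension regardless of how the underlying ties differ, which is one reason why exposure can be highly correlated across networks even when the edge weights themselves are not.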

Structural Features and Interdependencies of Our Networks

How can we further characterize these networks? Networks of positive ties often show transitive hierarchies, as epitomized by the adage “friends of my friends are my friends.” If node i names node j as a friend, and if j is friends with node k, i tends to close the triad and establish a tie to k because i regards friends of j as his or her friends as well. However, not all networks show this pattern. A visual inspection of the colonial ties network in Fig. 1.5 suggests that the share of transitive triads among all triads is comparatively low, but the structure is dominated by so-called “in-stars.” We use Exponential Random Graph Models (ERGMs; Harris 2014) in order to explain the basic determinants of the respective networks in a multivariate regression. We recoded the weighted edges into binary values by setting the lowest quintile of geographic distance to 1 (else = 0). We did the same with values of ≥ 3 of log(trade) and values > 3 of weighted cultural proximity. These thresholds identify rather strong ties in the respective network. The motivation of this model is to maximize the likelihood of actually observing the empirical network x out of the huge set of networks X that the respective set of nodes (in our case countries) could form. The outcome of interest is the probability P of observing the empirical network x out of the huge set X. The normalizing constant κ(θ) sums the corresponding terms over all possible networks; dividing by κ(θ) ensures that P is indeed a probability in the equation below, which is expressed in a way that resembles a multinomial logit model.

$$ P(\mathbf{X} = x) = \frac{\exp\{\boldsymbol{\theta}^{\prime}\mathbf{z}(x)\}}{\kappa(\boldsymbol{\theta})}, \quad \text{where } \kappa(\boldsymbol{\theta}) = \sum_{n = 1}^{2^{g(g - 1)}} \exp\{\boldsymbol{\theta}^{\prime}\mathbf{z}(x)\} $$

The likelihood is maximized by inserting coefficients θ for the network characteristics z(x), e.g., transitive closure, homophily, or any other kind of explanatory variable. Because of the statistical non-independence in networks, it is almost impossible to get reliable results by using standard maximum likelihood methods; the estimation is therefore based on Markov Chain Monte Carlo (MCMC) simulations. Given the specified regression equation, the algorithm generates a huge set of networks by inserting θ coefficients drawn from a random distribution and adapts these coefficients until the equation generates networks similar to the empirical network with respect to the underlying characteristics z(x). The resulting coefficients θ of a converged model can be interpreted as changes in the log odds of a tie in the respective network due to a one-unit change in the explanatory variable z(x).
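
For a very small network, the probability in the equation above can be computed exactly by enumerating all possible networks, which may help to see what κ(θ) does. The sketch below does this for a directed network of g = 3 nodes, i.e., 2^(3·2) = 64 possible networks. The two statistics used for z(x), the number of edges and the number of mutual dyads, and the coefficient values are chosen purely for illustration and are not the specification estimated in this chapter.

```python
import math
from itertools import product

def ordered_pairs(g):
    """All possible directed ties among g nodes (no self-ties)."""
    return [(i, j) for i in range(g) for j in range(g) if i != j]

def network_stats(x):
    """z(x): number of edges and number of mutual (reciprocated) dyads."""
    edges = len(x)
    mutual = sum(1 for (i, j) in x if i < j and (j, i) in x)
    return (edges, mutual)

def all_networks(g):
    """Enumerate every directed network on g nodes as a frozenset of ties."""
    pairs = ordered_pairs(g)
    for bits in product((0, 1), repeat=len(pairs)):
        yield frozenset(pair for pair, bit in zip(pairs, bits) if bit)

def unnormalized_weight(x, theta):
    """exp(theta' z(x)) for one network x."""
    return math.exp(sum(t * z for t, z in zip(theta, network_stats(x))))

def kappa(theta, g):
    """Normalizing constant: sum of exp(theta' z(x)) over all networks."""
    return sum(unnormalized_weight(x, theta) for x in all_networks(g))

def ergm_probability(x, theta, g):
    """P(X = x) = exp(theta' z(x)) / kappa(theta)."""
    return unnormalized_weight(x, theta) / kappa(theta, g)

# Illustrative coefficients: sparse network with a tendency toward reciprocity.
theta = (-1.0, 0.5)
observed = frozenset({(0, 1), (1, 0), (2, 0)})
p_observed = ergm_probability(observed, theta, g=3)
```

Exhaustive enumeration is only feasible for toy cases: with 164 countries the sum would run over 2^(164·163) networks, which is why estimation relies on MCMC simulation instead.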

The first column in Table 1.4 shows determinants of ties in the trade network, the second column in the network of cultural spheres, and the third column in the network of colonial histories. We estimated a temporal ERGM for the period from 1890 to 2010 in 20-year intervals and eight measurement occasions using bootstrapping methods (Leifeld et al. 2016). The term “edges” is the intercept of the regression model and represents the log odds of the network density, given that all covariates are constrained to zero. The positive significant effect of gwesp (geometrically weighted edgewise shared partners) indicates that transitive closure occurs much more often in the empirical network than in a corresponding random network. In contrast, gwdsp (geometrically weighted dyadwise shared partners) shows a significantly negative effect and points to the lower probability of open triads (Harris 2014). Aside from these network structural effects, ties in the trade network depend on spatial proximity (0.1488*) and cultural spheres (0.9971*) but not significantly on colonial legacies. They occur less often if two countries have the same level of democratization (same regime) and the higher the absolute difference in GDP per capita between ego and alter is. Unsurprisingly, global trade is an issue of economically well-performing countries since high levels of GDP increase the degree (0.0332*). We also estimated the memory term of “tie stability” (Leifeld et al. 2016), which indicates the stability of ties and non-ties, and thereby accounts for how strongly the state of the network at t depends on its previous state at t−1.

Table 1.4 Determinants of ties in networks of countries 1880–2010 (t = 8), temporal exponential random graph models, N = 164

Column 2 in Table 1.4 shows the effects on the log odds of ties in the network of cultural spheres. Again, we find the pattern of high transitivity (gwesp) and a negative tendency toward open triads (gwdsp). Having a tie in the network of colonial legacies has a negative effect on cultural similarity (−0.3214*), which means that countries colonized other countries that were culturally rather different. Conversely, a tie in the network of global trade increases the log odds of a tie in the cultural spheres network (0.8275*), which we also do not interpret in a strict causal sense because the direction of the influence could also be reversed. Our model does not indicate that political regime type in terms of levels of democratization and economic development corresponds with culture: if two countries have the same level of democratization, the log odds of a tie in the cultural spheres network is only insignificantly increased. The absolute difference in GDP is insignificant as well. Again, the memory term indicates a significant effect of the lagged network.

Finally, we analyze the network of colonial legacies, which has quite a specific topology as shown in Fig. 1.5. This network is rather special since there are few “hubs” with many ingoing ties, and there is a clear distinction between node sets of senders and receivers. We find a positive effect of 2-in-stars. This means that two ingoing ties occur significantly more often than expected by chance, which is obvious from the visual representation in Fig. 1.5. In addition, there are positive effects of spatial proximity and ties in the trade network. While there is no effect of same regime, effects of GDP per capita are negative on outdegree. Accordingly, richer countries name other countries as colonizers less often: overall, richer countries have a considerably lower risk of being colonized.
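
Because ERGM coefficients are changes in log odds, exponentiating them gives odds ratios, which are often easier to read. The short sketch below converts the coefficients quoted in the text; the dictionary labels are descriptive shorthand introduced here and are not the variable names of Table 1.4.

```python
import math

def odds_ratio(log_odds_coefficient):
    """Convert a log-odds coefficient into an odds ratio."""
    return math.exp(log_odds_coefficient)

# Coefficients quoted in the text (significance stars omitted).
coefficients = {
    "trade ~ spatial proximity": 0.1488,
    "trade ~ cultural spheres": 0.9971,
    "trade ~ GDP (degree)": 0.0332,
    "culture ~ colonial legacy": -0.3214,
    "culture ~ trade": 0.8275,
}
odds_ratios = {label: odds_ratio(value) for label, value in coefficients.items()}
```

Read this way, a tie in the cultural spheres network multiplies the odds of a trade tie by roughly 2.7, holding the other terms constant, whereas a colonial tie lowers the odds of a cultural tie by roughly a quarter.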

Overview of the Volume

Social policy fields investigated in this volume are old age and survivor pensions, labor and labor markets, health and long-term care, education and training, and family and gender policy. In Chapter 2, Breznau and Lanver analyze the introduction of work injury insurance, which often marks the beginning of an emerging welfare state. According to the results, spatial proximity and levels of democratization are the major determinants of adoption, but ties in the trade network also have a positive effect. Emerging education states are analyzed by Seitzer, Besche-Truthe, and Windzio in Chapter 3. They show that cultural proximity has a strong effect on the adoption of compulsory education, but this effect vanishes after controlling for spatial proximity. Similarly, in Besche-Truthe’s study (Chapter 4) the effect of a tie in the network of cultural proximity on the adoption of adult basic education policies becomes insignificant after controlling for spatial proximity, GDP per capita, and level of democratization. Moreover, although the introduction of healthcare systems, as analyzed in Chapter 5 by Polte, Haunss, Schmid, De Carvalho, and Rothgang, mainly occurred in economically prosperous countries before WWII, the effect of GDP decreases in subsequent periods. In addition, the effect of spatial proximity decreases over time, whereas the effect of trade networks seems to increase. Another important policy in aging societies is long-term care, analyzed by Fischer, Polte, and Sternkopf in Chapter 6. Aside from geographic proximity, there seems to be no horizontal diffusion via networks. Rather, the introduction of long-term care systems depends on problem pressure (population 75+), political empowerment of women, GDP per capita, and levels of democratization.

In their study on the introduction of paid maternity leave, family allowances, and the adoption of workplace childcare regulations, Böger, Son, and Tonelli (Chapter 7) show that while paid maternity leave was an important issue on the agenda of the International Labour Organization (ILO), family allowances tend to depend more on domestic factors. In contrast, there seem to be effects of colonial legacies, particularly in former French colonies, with regard to workplace childcare regulations. The ILO is the focus of Hahs’s study on the ratification of the C111 Anti-Discrimination Legislation in Employment and Occupation (Chapter 8). Ties in the network of colonial legacies and spatial proximity seem to drive the diffusion process, but the former effect is strongly confounded with a country’s legal origin.

Interestingly, exposure to other countries due to similarity in culture has a negative effect on the adoption of antidiscrimination legislation supporting the LGBTQ+ community, whereas there are positive exposure effects in the network of global trade (see the study of Seitzer in Chapter 9).

Chapter 10 by Schmitt and Obinger critically reviews the results and the research design applied in this volume. They appreciate the macro-quantitative approach to social policy diffusion, but also recognize its limitations. Analyzing network diffusion highlights the global interdependence, but does not tell us much about the precise mechanism at work in a given country dyad or subnetwork. These mechanisms also depend on country-specific factors and sometimes on idiosyncratic situations that we cannot generalize to other interdependent constellations. Future research on policy diffusion should thus systematically consider mixed-methods designs and apply a combination of macro-quantitative data and in-depth case study analyses.