The Obsession with Covariance
Comparative public policy is a wide-ranging field focusing on very important topics such as environmental protection (e.g., Jahn 2016; Knill et al. 2010a), penal issues (Cavadino and Dignan 2006; Wenzelburger 2020), the legal rights of homosexuals (Engeli et al. 2012; Knill et al. 2015), and the welfare state (Esping-Andersen 1990; Huber and Stephens 2001; Jensen and Wenzelburger 2020). Given how broad the literature is, it is striking how constrained almost all research is in a more significant aspect: its focus on using variation between countries on some political, institutional, or economic factor to explain variation in another factor.
Examples are legion. One prominent line of work dating back several decades studies the effect of government partisanship on various outcomes (for some early, defining studies, see Hibbs 1977, Castles and McKinlay 1979, Korpi 1989, and Huber et al. 1993; for a review, see Schmidt 2010). The logic is that office-holding parties hold distinct ideological preferences over policy and that these preferences affect policy-making. The deduced expectation is that as the partisan composition of governments varies across time and between countries, public policy varies as well. The literature on government partisanship is highly advanced, and it often explores the complex conditions that constrain governments pursuing their first-order ideological preferences (e.g., Becher 2010; Jensen 2010). There is also a lively debate on how best to measure a government’s partisanship. This is far from trivial, both conceptually and in terms of the empirical conclusions researchers may draw from their analyses (see the section on agency below for a more exhaustive discussion).
Governments are nested in and influenced by the legal, political, and economic institutions of countries. In democracies, nations’ constitutions obviously put very real limits on the policy-making abilities of the governments, as do the electoral rules by specifying how the governments are elected in the first place, although how exactly this happens can vary greatly from one place to another. In the United States, a system of checks and balances entails that power is shared to a much higher extent than in, say, the United Kingdom, where a single party typically dominates policy-making—a difference in governmental power that flows directly from the countries’ constitutional arrangements and electoral rules. There is a large literature exploring how such institutional differences may affect public policy-making either directly or as conditional factors that moderate the direct influence of, e.g., governing parties’ ideological preferences (Jensen and Mortensen 2014; Kittel and Obinger 2003; Schmidt 1996). Yet there are other, less formal institutions, too. These include a tradition for corporatist negotiations between interest groups and the government (Ebbinghaus and Weishaupt 2021), as well as a norm for public referenda rather than parliamentary legislation as an instrument of decision-making (Papadopoulos 2001; Wagschal 1997).
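The distinction drawn above between institutions as direct influences and as conditional factors can be written compactly. The following is only a schematic sketch in notation that does not appear in the original: $Y_{it}$ stands for a policy measure in country $i$ and year $t$, $P_{it}$ for government partisanship, and $I_{i}$ for an institutional constraint; all three labels are illustrative assumptions rather than a specification taken from the cited studies.

```latex
\begin{align}
Y_{it} &= \beta_0 + \beta_1 P_{it} + \beta_2 I_{i} + \beta_3 \,(P_{it} \times I_{i}) + \varepsilon_{it}, \\
\frac{\partial Y_{it}}{\partial P_{it}} &= \beta_1 + \beta_3 I_{i}.
\end{align}
```

In this sketch, $\beta_2$ captures a direct institutional effect, whereas $\beta_3 \neq 0$ means that the effect of partisanship depends on the institutional setting, which is what a moderating role amounts to.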
The focus in the literature on variation in the explanatory variables generates an intense focus on variation in policy outcomes. Indeed, a well-tested way for authors to generate an interesting “puzzle” to captivate the attention of the academic audience is to highlight how already-established variation in countries’ institutional or partisan setups does not match an observed variation in policy outcome—and then move on to provide a new explanatory variable that varies in just the right way to account for the policy variation. There are good methodological reasons for the attention to variation as well, not only in the context of quantitative public policy studies, which we focus on here, but much more broadly. From historical case studies to quasi-experimental studies, variation in the explanatory variable is a vital element to establish a (maybe even causal) relationship, as a generation of political scientists has been taught by King et al. (1994) and the increasingly sophisticated literature on causal identification that has swamped the discipline.
Yet no matter the reasons, focusing on variation in the extreme way we see in the literature on comparative public policy is unwarranted. It is above all unwarranted because it is intellectually constraining. Today, all Western democracies have extensive environmental protection regimes; they all tax corporate profits; they all offer some social safety net for the poor; they all have a voting age of 18; and none punish tax evasion with death—just to mention a tiny fraction of examples that sum up to a major backdrop of similarities that is so pervasive we most of the time simply do not appreciate it. Borrowing a phrase from Baldwin (2009), ignoring these similarities has led to a “narcissism of minor differences” in comparative public policy in which variation on the margin of an explanatory variable is used to explain variation on the margin of a dependent variable (although this, of course, does not mean that variation-based studies by default are of marginal relevance).
A related trend has been the rise of what we might call institutional particularism, a phenomenon that may be even more widespread in the small‑N public policy literature than in the quantitative branch. As authors jockey to explain often relatively minor differences in public policy, a forest of concepts has emerged categorizing countries into new or refined categories. This has sometimes led to awkward results, such as in the welfare state literature, in which the paradigmatic work of Esping-Andersen (1990) on welfare state regimes (itself inspired by Titmuss’s work) has been relabeled and rethought dozens of times, often based on intricate arguments, but with highly decreasing analytical value. After three decades of work, the best advice to anyone interested in understanding the basic structure of welfare states in Western democracies remains to read the original book.
The Lack of Agency
Comparative policy scholars have difficulties acknowledging the role of agency when seeking to explain variance in policy outputs. This is partly due to the high level of abstraction and the number of cases that characterize most quantitative comparative policy studies: Admittedly, when analyzing policies in a large number of countries and/or over several decades, it is difficult to assess whether individual political actors and their characteristics matter for certain policy outcomes. At the same time, however, comparative public policy researchers do acknowledge that individual actors can be important for explaining policy outputs, but they also claim that they cannot account for this influence in large‑N cross-case analysis (Wagschal and Wenzelburger 2012, p. 68). Given this inherent problem, public policy researchers have resorted to proxies to account for agency in different ways. The traditional solution has been to model agency on the level of collective actors, such as parties, trade unions, or nongovernmental organizations. However, in recent years, several attempts have been made to open the black box of collective actors for more fine-grained analysis of individual actors. We will briefly comment on both arguments.
Theorizing Collective Agency
A prime example of the traditional way of coping with agency in macroquantitative analyses, namely the introduction of collective actors, is the literature on the influence of political parties and governments. In this literature, scholars have modeled political parties as collective actors having a certain ideology to which they adhere when deciding on public policies once in government (Schmidt 1996). Hence, agency is modeled on the party level as a function of a party’s ideology. To quantitatively assess the impact of party ideology at the aggregate level, different measures have been put forward. One approach is to claim that ideology can be measured via party families, a concept that was famously introduced by Von Beyme (1985). Some studies resort to a simple left–right indicator to differentiate between the ideological stances of political parties (Allan and Scruggs 2004), and others use a three-family approach and add center or Christian Democratic parties (Huber et al. 1993), whereas the most sophisticated research employs more fine-grained measures and also accounts, for instance, for liberal parties (Wolf et al. 2014) or Greens (Neumayer 2003). No matter which operationalization of party families is used, this strand of the literature concurs in modeling agency implicitly as a function of the long-standing affiliation of a party to a certain party family.
A second approach is to empirically estimate the ideological stance of a party with techniques that approximate party positions. This can be done by coding party manifestos in the tradition of Klingemann et al. (1994, 2006), with regular updates made available by the Manifesto Project team (Volkens et al. 2021); via expert surveys (Hooghe et al. 2010); or via approaches that combine both, such as the Wordscore method (Debus 2009; Lowe 2017). In terms of theorizing agency, studies relying on party programs model political parties as being tied to what they say in the manifesto: Agency is doing what has been announced in the program. If expert surveys are used, the perceived position of the party in competition with others is seen as an indication of agency: Agency is modeled as parties following the ideology as perceived by experts. Finally, some authors also look at the constituencies of political parties to make inferences about their policy preferences (Häusermann 2006; Jensen 2014). Following the “electoral turn” (Beramendi et al. 2015) in political economy, this strand of research sees parties as agents of their voters. Depending on the theoretical model used, preferences are tied back to the median voter position, the constituency, or the party’s electorate.
These examples illustrate quite well how intricate the choices are when agency is to be analyzed quantitatively on the level of collective actors. In fact, research on political parties is very advanced and offers a lot of data to measure party positions and to model agency with the help of party ideology. For other important actors in policy-making, the modeling strategy is much more simplistic and basically assumes the preferences of collective actors. Trade unions are expected to care about wage levels and social policies, central banks about price stability, and environmental organizations about more protection of nature. While this may be true, the theoretical underpinnings are only seldom discussed, although we do need at least a theoretical microfoundation of the preferences of collective actors if we want to make inferences about the role of agency in public policy-making. Providing such a theory-based microfoundation can be an extensive exercise, as can be seen in Scharpf’s work on macroeconomic policies during the oil crises (Scharpf 1987, 1997).
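To make the measurement choices discussed in this section concrete, the following minimal Python sketch contrasts a party-family measure (the cabinet seat share of left parties) with a position-based measure (the seat-weighted mean of party positions). Everything in it is a hypothetical illustration: the class `CabinetParty`, the two functions, the family labels, the -1 to +1 scale, and the two invented cabinets are assumptions made for this example and are not drawn from any of the cited datasets or studies.

```python
from dataclasses import dataclass


@dataclass
class CabinetParty:
    name: str
    family: str        # hypothetical party-family label, e.g. "left" or "center"
    position: float    # hypothetical left-right score from -1 (left) to +1 (right)
    seat_share: float  # share of cabinet seats held by the party


def left_cabinet_share(cabinet):
    """Party-family measure: share of cabinet seats held by left parties."""
    return sum(p.seat_share for p in cabinet if p.family == "left")


def weighted_position(cabinet):
    """Position-based measure: seat-weighted mean of party positions."""
    total = sum(p.seat_share for p in cabinet)
    return sum(p.position * p.seat_share for p in cabinet) / total


# Two invented cabinets with the same family composition but
# different estimated positions for the left party.
cabinet_a = [
    CabinetParty("A1", "left", -0.6, 0.6),
    CabinetParty("A2", "center", 0.1, 0.4),
]
cabinet_b = [
    CabinetParty("B1", "left", -0.1, 0.6),
    CabinetParty("B2", "center", 0.1, 0.4),
]

for label, cabinet in (("A", cabinet_a), ("B", cabinet_b)):
    print(
        label,
        f"left share = {left_cabinet_share(cabinet):.2f}",
        f"weighted position = {weighted_position(cabinet):.2f}",
    )
```

Under these assumptions, the two cabinets are indistinguishable on the party-family measure but differ on the position-based one, which is one simple way to see why the choice of operationalization can matter for the conclusions drawn.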
Opening the Black Box of Collective Actors
An alternative to theorizing and modeling agency at the level of collective actors—such as parties, trade unions, or central banks—is to go to the micro level and estimate preferences of individual policy-makers that are identified as important. Qualitative studies have repeatedly shown that key actors in a certain policy area, say cabinet ministers, can strongly affect policy decisions (Wenzelburger 2020; Wenzelburger and Staff 2017; Zohlnhöfer 2009). Moreover, it is conceivable that their policy preferences are driven by a number of (sometimes competing) considerations: Gaining votes clearly plays an important role, but core policy beliefs also matter, and so do strategic considerations and party-specific goals an actor wants to reach (Wenzelburger and Zohlnhöfer 2020).
While it is true that quantitative studies will not be able to account for all of these factors, substantial advances have been made in bringing individual actors more fully into comparative policy research in quantitative studies. They concern two aspects: the question of whether different individual actors have similar weight in influencing policy decisions (equivalence problem), and the question of whether individual actors can have competing preferences (preference formation problem).
On the first point of equivalence, the basic question is whether individual actors that have been identified as important in policy-making are more or less influential. While this aspect has been discussed to some extent in the veto player literature with respect to the formal role of veto players (Ganghof 2003), it also matters for individual actors. How the problem of equivalence can be accounted for empirically in quantitative studies has been shown by Alexiadou (2015). In her study on welfare policies, she closely analyzes how different types of cabinet ministers—ideologues, partisans, or loyalists—are able to influence policy decisions to different degrees. Her study shows that, indeed, differences between individual political actors exist and can be included in a quantitative study: Partisans (party heavyweights and aspiring leaders) and ideologues (with strong and fixed policy preferences) are much more successful in influencing policies than loyalists, who follow the party leader.
While Alexiadou shows how one can address the problem of equivalence in quantitative studies, she still needs to make assumptions about what the preferences of political actors are (preference formation problem). To do so, she draws on the party-family literature to derive expectations about social welfare preferences (contrasting social democrats on the one hand with liberal and conservative ministers on the other). However, if we want to take individual actors more seriously, we also need to cope with the fact that the agency of individual policy-makers may not be driven by rather general party-family goals. To address this problem, biographical research can help to a certain extent, an idea that has mainly been pursued by economists in order to explain economic policy decisions. Hayo and Neumeier (2014, 2016) have, for instance, shown that the class background of political leaders matters for fiscal conservatism even when controlling for political party affiliation, with lower class status correlated with higher budget deficits. And for capital account liberalization, Chwieroth (2007) has shown that the professional background of economic policy-makers matters for liberalization decisions in emerging countries. These results indicate that preference formation can indeed be modeled on the level of the individual policy-maker even in quantitative studies.
The Unclear Universe of Cases
The “grand theories” of comparative policy research, such as power resources theory or institutionalist approaches, were initially developed to explain variation between policies of Western democracies. However, the universe of cases to which the theories have been applied has been widened significantly, as studies of postcommunist countries in Eastern Europe and of Latin American countries illustrate (e.g., Borges 2018; Ha 2015). But are the theories of comparative policy research universally applicable, or are their explanatory claims limited to democratic systems or even only to the Western industrialized nations? Unfortunately, this important question of the universe of cases to which the theories can be applied has never really been discussed in the respective research. This state of affairs is problematic because if we want to say something about whether the theoretical claims may travel to other systems, we should at least know where theories are clearly applicable (Sartori 1970). Only then can we define scope conditions and argue under which circumstances certain theorized relationships may also be expected in other contexts. To illustrate our point, we focus on two major approaches within the canon of comparative public policy theories: power resources theory/party politics and institutionalism.
For power resources theory as well as institutionalism, it seems safe to say that the DNA of these theories is strongly influenced by researchers who had political developments in Western European states and their respective consequences in mind. The key work on power resources theory—Korpi’s (1983) “democratic class struggle”—has not only been influenced by neo-Marxism and the analysis of the Swedish case, but it is also clearly rooted within the theory of cleavage structures that has been devised for Western European nations (Lipset and Rokkan 1967). The cleavages identified by Lipset and Rokkan have been developed for Western European societies, and the distinct societal groups of interests, which are transported to the political systems by intermediate organizations such as trade unions or parties, are the groups created by the history of revolutions in Western Europe. Similarly, cleavage structures have been important ingredients in Lijphart’s work on democratic systems, with institutional features of consensus democracies being set up in strongly “verzuiled” nations to guarantee coalition building and the protection of minorities. Hence, much speaks for restricting the very core universe of cases of these theories to the democracies in Western Europe (Austria, Belgium, Denmark, Finland, France, Germany, Ireland, Italy, the Netherlands, Norway, Sweden, Switzerland, and the United Kingdom).
However, it is true that the classics of power resources theory and institutionalism did enlarge the universe of cases and included Western industrialized nations outside Europe, such as Australia and New Zealand, the United States, and Canada: Korpi (1983) and Esping-Andersen (1990) include, for instance, 18 OECD countries, and Lijphart (1984) initially included 21 Western democracies. Thus, the universe could be drawn a bit larger if we were to follow the selection criteria given by the authors themselves, namely to focus on countries that “have had a record of political democracy during the entire postwar period” (Korpi 1989).
The selection of democracies for applying power resources theory and institutionalist theories to the explanation of policies is linked to the causal mechanisms that are assumed to be at work. Clearly, theorizing about the power resources of the working class and their impact on policies requires establishing a transmission belt to bring these interests into the political game. In the literature on power resources theory, these channels are both corporate (e.g., trade unions) and political (socialist or social-democratic parties); at the least, the political channel needs democratic systems to work. For the influence of political institutions on policy-making, a similar point can be made. In Lijphart’s (2012) work, for instance, the comparative study of institutional features that allow systems to integrate minority positions in the decision-making process makes sense only in the context of democracies. Similarly, veto-point approaches that account for the institutional barriers against policy change (Huber et al. 1993; Kaiser 1997; Schmidt 2002) have also been designed for democratic systems: They model how constitutional structures of democratic states limit the maneuvering room of governments, which is directly linked to the democratic idea of separation of powers. Hence, these considerations speak in favor of including non-European democracies in the universe of cases to which power resources theory and institutionalist approaches can be applied.
At the same time, however, enlarging the universe in this way raises additional questions. First of all, if being an established democracy is the criterion for using comparative public policy theories, we have to ask ourselves whether we should not follow Lijphart’s example and also include countries such as Argentina, Uruguay, and Korea in our analyses (Lijphart 2012). Most comparative public policy scholars refrain from doing so, mostly because they seem to feel that Latin American states such as Uruguay and Argentina are rather different from the traditional Western European democracies that have been at the core of theorizing, or from other advanced industrialized countries such as the United States and Canada. However, without strong theoretical reasons for exclusion, such choices quickly seem arbitrary. Consequently, the main question is whether the concepts used to form our theories would travel to such an extended universe of cases. Here, the inclusion of Australia, Canada, New Zealand, and the United States may already be criticized. The mechanisms underlying power resources theory are a nice illustration: If we take the idea of political parties as channels for power resources of societal groups (or coalitions) seriously, it is unclear whether U.S.-style political parties actually are similar to European parties in fulfilling this role, given that more fluid membership and the stronger ties of members of Congress to their local constituents weaken the “responsible party model” (Miller and Stokes 1963; Page et al. 1984). Similarly, if our institutional theory should travel to democracies in Latin America, we need to ask whether focusing on written constitutions to conceptualize institutional constraints and veto points in policy-making is actually enough, given that important collective actors “outside” the constitution have been able to influence policy decisions (e.g., the International Monetary Fund [IMF] on economic policy).
Hence, if enlarging the number of cases should not lead to “conceptual stretching” (Sartori 1970), we have to go back to theory and ask ourselves whether restricting the universe of cases does not provide us with more valid insights than applying comparative public policy theories to a universe of cases they have not been designed for.
A Focus on Outputs
One reason why much comparative public policy research is quantitative is what might best be described as a data revolution. Twenty-five years ago, the most prevalent quantitative measure of public policy was government spending, based on data that were collected by the IMF (e.g., government finance statistics) or the Organisation for Economic Co-operation and Development (OECD; economic outlook) and that provided a reliable basis for comparison. Unsurprisingly, many of the first-wave cross-national public policy studies used different spending items of governments’ budgets as the dependent variable (Castles and McKinlay 1979). This has changed dramatically as researchers have constructed large datasets with often very fine-grained information both on public policies and on the theorized predictors of public policy change.
Walter Korpi and his collaborators started the Social Citizenship Indicator Programme in the 1980s, and it may be counted as an early and very successful example of a public policy dataset (Korpi and Palme 2008). Although some elements of the dataset may be better classified as policy outcomes, because they capture not only the legal entitlements of citizens but also the value of these rights compared with the incomes of other members of society, it has been immensely popular (Ferrarini et al. 2013). The Social Citizenship Indicator Programme covers 18 Western democracies all the way back to 1930 and therefore allows for an analysis of when governments introduce the right to receive old age pensions, sick pay, or unemployment benefits, as well as the conditions attached to these social rights. The landmark study of Esping-Andersen (1990) drew heavily on this data, as have several widely cited pieces by Korpi and Palme (Korpi and Palme 1998, 2003). In the realm of social policy, the Social Citizenship Indicator Programme has since been supplemented by the Comparative Welfare Entitlements Dataset (Scruggs et al. 2013), which provides annual data and deviates from the Social Citizenship Indicator Programme in several measurement issues (Scruggs 2013; Wenzelburger et al. 2013). Still more recently—and to overcome reliance on replacement rates that are connected to income—researchers have turned to legislation to measure policy outputs in the realm of social policy (see, e.g., the Welfare Reform Dataset [Jensen and Wenzelburger 2020]) or to even more fine-grained program-related data (see, e.g., the Comparative Unemployment Benefit Conditions & Sanctions Dataset [Knotz 2018]).
Among European researchers, Knill and his colleagues (Bauer and Knill 2014; Knill 2013; Knill et al. 2010b) arguably take first prize in the art of collecting very large public policy datasets across a wide range of policy fields, from environmental protection to moral policies to social rights, but today there is a large number of datasets, often with quite specific information. International organizations such as the OECD have also expanded their data collection efforts, going well beyond spending. The OECD’s systematic collection of policy information from its member countries across many different policy areas has been particularly helpful in this regard and ranges, today, from employment protection rules to taxation indices to policy instruments for environmental protection; researchers frequently use these data to construct new datasets of their own. In conjunction with the similarly impressive expansion of datasets measuring government partisanship and other important independent variables (Armingeon et al. 2020), the breadth and depth of this collective effort mean that we today have a good grasp of the trajectories of policy developments in many different policy areas.
For all its qualities, the major problem with the data revolution and the analytical focus it implies is exactly its focus on policy outputs and outcomes. Yet almost all theories of comparative public policy emphasize the role of the policy process. Veto player theory (Tsebelis 2002), to take one example, implies a quite intricate process of bargaining between the political actors, exactly as power resources theory does. Yet, as data are readily available, researchers are quick to dismiss more appropriate measures that may take at least some parts of the policy process into account and instead correlate what can be downloaded from the existing sources. Power resources theory is a case in point. Here, most scholars use, for instance, the data collected by Visser (2006) on trade union density, although measures of centralization or of the inclusion of unions in the policy-making process may be more appropriate. Moreover, from an empirical perspective, another important issue is that a given theory typically implies many more testable observations than can be tested with the new datasets. Hence, a focus on the policy process would often be helpful in discriminating between possible explanations.