The Accountability Sequence: from De-Jure to De-Facto Constraints on Governments

Accountability is one of the cornerstones of good governance. Establishing accountable governments is a top priority on the international development agenda. Yet, scholars and democracy practitioners know little about how accountability mechanisms develop and thus how they can be supported by international and national actors. The present study tackles the questions of how, and in what sequence, accountability sub-types develop. We consider not only vertical (elections and political parties) and horizontal accountability (legislature, judiciary, and other oversight bodies) but also diagonal accountability (civil society and media), in both their de-jure and de-facto dimensions. By utilizing novel sequencing methods, we study their sequential relationships in 173 countries from 1900 to the present with data from the new V-Dem dataset. Considering the long-term dimensions of institution building, this study indicates that most aspects of de-facto vertical accountability precede other forms of accountability. Effective institutions of horizontal accountability, such as vigorous parliaments and independent high courts, evolve rather late in the sequence and build on progress in many other areas.

Establishing accountable governments is a top priority on the international development agenda, e.g., as a target of the Sustainable Development Goals (UN Resolution A/RES/70/1).
Yet, we know little about how and in what sequence different aspects of accountability evolve. The extant literature provides some suggestions on the sequence of democratization, and thus on how accountability starts to evolve. Dahl (1971) famously argued that it is conducive if competition among elites evolves before the expansion of participation, and there is some empirical support for this (Wilson 2015, p. 234). And while there are contributions on how one aspect of accountability may strengthen other aspects (e.g., Keane 2009), this study is the first to investigate whether there are generalizable sequences in the evolution of accountability across 35 highly specific institutions with data covering 173 countries and 113 years.
Taking Dahl's (1971) famous axiom as a point of departure, we argue that governments are more likely to allow for de-facto accountability if the costs of supplying accountability decrease and the costs of suppressing the demand for accountability increase. Distinguishing between institutions of vertical accountability (related to elections and political parties), horizontal accountability (checks and balances between institutions), and diagonal accountability (media and civil society), we hypothesize that progress in vertical and diagonal (aka societal) accountability increases the pressure for horizontal accountability. For instance, the more elections are genuinely free and fair (evolution of vertical accountability), the greater the incentives for legislators to demand more oversight over the executive to ensure that the executive satisfies constituents' demands, hence increasing horizontal accountability. Conversely, advancement of vertical accountability is less contingent on progress in other areas, because voters are principals, not agents, in accountability relationships. Furthermore, effective horizontal accountability should be resisted by governments because actors such as high courts and legislators typically have more information than voters and can impose costly sanctions at any time rather than just every 4 or 5 years. In short, we expect institutions of vertical accountability to develop first and those institutions of horizontal accountability that directly oversee and constrain governments to develop last.
Novel sequencing methods developed by Lindenfors et al. (2018), adapting "the contingent states test" (an established method for identifying historical sequences in the biological evolution of parasite-host systems; see, e.g., Sillén-Tullberg 1993), make this type of analysis possible for the first time. They allow us to offer a distinctive depiction of long chains of sequences between 35 separate aspects of accountability measured at ordinal levels. Using V-Dem data from 173 countries from 1900 to the present (Coppedge et al. 2016a, b), the empirical results support the main theoretical predictions.
In the following, we first discuss the dependent variable, specify the hypotheses about the expected accountability sequence, and then explain the logic of these hypotheses using theoretical ideas related to cost of supply and demand from the perspective of rulers. We then turn to discussing our measures of accountability and the empirical methods, followed by the empirical analysis and the conclusion.

What Is "Political Accountability"?
A range of fields employ the concept of accountability, resulting in over 100 subtypes and usages (Lindberg 2013, p. 204). Yet, the underlying etymological principle of allocating authority, appraising performance, and the possibility of applying sanctions is constant. Thus, as Locke (1980 [1690]) observed, accountable leadership requires separation between governors and the governed. This paper focuses on political accountability in this original sense: When decision-making power is transferred from a principal (citizens) to an agent (government), there must be a mechanism for holding the agent accountable and means to apply sanctions. Accountability hence is associated with discretionary governing, or the authoritative allocation of resources and exercise of control (e.g., Dahl 1971; Kooiman 1993; Marsh and Rhodes 1992). It can also be a key instrument for voters to make elected representatives work in their favor (Manin et al. 1999). Therefore, accountability is central to democratic theory, even if there are types of accountability that have little to do with democracy (e.g., in accounting or business relationships).
Based on earlier work, we define political accountability as "constraints on governments' use of political power through requirements for justification of its actions and potential sanctions." By governments we mean the executive branch of the government including the chief executive: the head of state or government, the cabinet, ministries, and top civil servants (Coppedge et al. 2016b: 413). Henceforth, we refer to this meaning simply as "accountability." We follow the extant literature differentiating between vertical, horizontal, and diagonal sub-types. The distinction between vertical and horizontal accountability is common (e.g., O'Donnell 1998) and diagonal (aka social) accountability captures the role of civil society and media in constraining governments (e.g., Goetz and Jenkins 2001; Malena et al. 2004).
More specifically, institutions of vertical accountability concern the extent to which voters exert control over politicians who are faced with the uncertainty of re-election (Fox 2015; Mainwaring and Welna 2003; Olson 2000). Political parties also factor in because stronger and more organized parties are in a better position to enforce constraints on governments and prevent illicit activities that would hurt the party's reputation (Gehlbach et al. 2011; Svolik 2012).
Diagonal accountability reflects the extent to which civil society and media actors constrain governments, either indirectly by providing information for, and thus enhancing the effectiveness of, other accountability actors, or directly by pressuring governments (Goetz and Jenkins 2005): Media empowers citizens to make informed political choices (Voltmer 2010, p. 139); a robust civil society is critical to hold governments accountable beyond elections (Besley 2006; Johnston 2005; Peruzzotti and Smulowitz 2006); and CSOs are important for increasing the political awareness and impact of their members (Lipset et al. 1956).
Horizontal accountability is the oversight exercised by state institutions such as legislatures, judicial branches, ombudsmen, and prosecutors general (O'Donnell 1998, p. 119), where the separation of powers should prevent governments' abuse of power (Rose-Ackerman 1996). We focus here on how these institutions oversee the government (and not each other). For a more in-depth conceptual discussion, see ).

Hypothesizing the Accountability Sequence
Does a high level of political accountability develop in a particular sequence of institutional strengthening? We present very briefly here the theoretical underpinnings of our hypotheses, and detail them in the next section. The first intuition we rest on when hypothesizing the political accountability sequence should be uncontroversial: Governments' interest is to stay in power; they therefore seek to be as free as possible from constraints to achieve that aim, and will thus try to evade being held to account. Actors in the position of principals (what we also refer to as "accountability actors" below) conversely should seek greater oversight powers to ensure that policy promises are followed and that executive power is not abused at their own expense. Thus, we expect the costs for governments of giving in to demands for accountability, and the strength of these demands, to shape the accountability sequence.
It is also logical that institutions must be established de-jure before higher or lower levels of de-facto accountability can start to emerge (e.g., the legislature has to be formally established first before it can be effective). Our primary focus is on the latter even if the empirical analysis also includes the former.
Our first expectation is that high levels of de-facto vertical accountability should develop first, because the incentives and capacities of voters to demand accountability are relatively unconditioned on institutional advancements in other sub-types. Conversely, legislators' incentives to exercise real oversight of the executive, for example, are conditional on whether they need voters' approval in free and fair elections, as well as on an independent media facilitating adequate information flows. From the perspective of the government, vertical accountability also carries relatively limited costs in the absence of effective diagonal and horizontal accountability mechanisms. If voters lack accurate information and are subjected to propaganda, their ability to effectively sanction the government in elections is limited. Hence, we expect the first step in the accountability sequence to be that the cost of repressing the demand for vertical accountability becomes higher than the cost of allowing improvements. Thus, H1: High levels of de-facto vertical accountability develop before high levels of other sub-types of de-facto accountability.
The realization of the sharpest mechanisms of horizontal accountability, with the capacity to directly oversee and constrain governments continuously and apply direct sanctions, should be the costliest for executives. This applies in particular to effective oversight by national legislatures and independent high courts, while lower-level equivalents (e.g., lower courts) should be less threatening and can be expected to become fully effective earlier. Governments are only likely to make substantive concessions in the realm of the national bodies when demand for them is put forward by strong vertical and diagonal accountability actors who would be too costly to repress. Hence, H2: Those institutions of horizontal accountability that directly oversee and constrain governments become de-facto fully effective last in the sequence, after progress in the diagonal and vertical sub-types.

Theorizing Sequences of Accountability
We now turn to detail the logic of these hypotheses using the theoretical ideas related to cost of supply and demand. Order, timing, and historical context naturally matter for the evolution of complex sets of institutions (e.g., Mahoney 2001; Yashar 1997). Nevertheless, the literature on sequencing has three main shortcomings: (1) it often focuses on bivariate relationships such as the place and role of one specific institution in relationship to another (e.g., introducing competition before extending suffrage); (2) it has lacked methods to identify series of variables related sequentially in longer chains; and (3) it usually analyzes the de-jure introduction of institutions and not their de-facto effectiveness, in part because of lack of data.
Regarding the first issue, take for example studies on the timing of the introduction of de-jure multiparty elections (vertical accountability). Mansfield and Snyder (2007, pp. 6-7) hold that an "out-of-sequence" push to hold competitive elections in culturally diverse societies without reasonably effective institutions is likely to fail and even lead to violence. Gandhi and Lust-Okar (2009) also suggest that multiparty elections may stabilize and legitimize dictatorships if introduced before full competition is institutionalized. Yet, Carothers (2007, pp. 20-21) claims that stable political institutions and accountability mechanisms are more likely to develop "as part and parcel" of the process of democratization rather than separate from it. Similarly, Howard and Roessler (2006) and Lindberg (2006) argue that even in authoritarian contexts, repeated elections are more likely than not to have democratizing outcomes. With regard to political parties, Shefter (1977) argues that the relative timing of bureaucratization and the organizational origin of political parties explains to a large extent whether they choose programmatic appeals over clientelistic strategies.
In the area of diagonal accountability, establishing a robust civil society is often viewed as a condition for the subsequent fall of authoritarian regimes and the building of a resilient democracy (Bernhard 1993). Carothers (2007, p. 20) points out that the development of strong grassroots movements (e.g., Solidarity in Poland, the African National Congress in South Africa) has often been a necessary condition for democratic change. Yet, Keane (2009: xxvii) suggests that civil society actors developed their monitory role only after extensive historical experience with electoral democracy, and as part of their inclination to strive for more influence. They succeeded in part because after World War II, many influential actors saw the strengthening of institutions of diagonal and horizontal accountability as a recipe for preventing democratic breakdowns (Keane 2009: 729ff). But we lack systematic, empirical tests of these claims.
Regarding horizontal accountability, Fish (2006) argues that a powerful legislature must develop first, or else a concentration of power in the hands of the executive and the underdevelopment of political parties inhibit democratization. Some case study work is supportive, showing that even in weak democracies, opposition parties in a legislature can hold the executive somewhat to account (Herron and Boyko 2015, p. 132). Others find a risk that such conflicts between the legislature and the executive lead to democratic breakdown (Stepan et al. 1993).
The second issue is that while these and many other contributions are valuable, investigations of institutions affecting each other over longer sequential "chains" are still lacking. This is due both to a previous shortage of detailed and comparable global data with long time-series and to the unavailability of suitable methods. We employ here a set of entirely new methods developed precisely to detail sequential relationships between a substantial number of indicators measured at a higher level than dichotomies and involving a large number of observations (here 35 ordinal variables of accountability with typically five levels measured for about 17,000 country-years). The methods emerged in evolutionary biology to study parasite-host systems (e.g., Sillén-Tullberg 1993) and have been adapted to the study of political systems recently (Lindenfors et al. 2016, 2018).
Finally, most of the extant literature focuses on de-jure institutions. However, what really matters is how well such institutions function in practice. We elaborate on this in the following section.

De-Jure vs. De-Facto Accountability
There is an important difference between the introduction of institutions of accountability de-jure and their de-facto effectiveness (Besley 2006: 37; Snyder 2006), perhaps particularly so in authoritarian regimes (Gandhi and Lust-Okar 2009): While almost 90% of countries now hold multiparty elections, less than half of these elections are substantially free and fair (Hafner-Burton et al. 2014, p. 152; van Ham and Lindberg 2016, 5f). Rulers use "the menu of manipulation" (Schedler 2002) to undermine two key preconditions for vertical accountability: procedural certainty and ex ante uncertainty (Przeworski 1986, pp. 56-57).
In terms of diagonal accountability, journalists are often severely restricted in practice even when freedom of expression and the media are constitutionally guaranteed, as for example in contemporary Russia (Besley et al. 2002, p. 720). Finally, most nations have legislatures with constitutionally guaranteed oversight functions. Yet, the effective exercise of such de-jure prerogatives is much scarcer (Salih 2005; Rakner and van de Walle 2009; Vliet 2014). Authoritarian governments use legislatures to co-opt elites and to shield themselves from criticism (Gandhi 2008), especially where one-party dominance undermines the division between legislative and executive powers (Cranenburgh 2009, p. 64; Lindberg and Jones 2010).

Sequences in the Evolution of Accountability?
The advancement of de-facto accountability is not inevitable. As an approximation, it seems reasonable to assume that the strategic interest of agents (governments) is to remain as unconstrained as possible in order to stay in power, while principals (accountability actors) want to maximize the amount of control they exercise over agents, and hence seek to expand the reach of de-facto accountability mechanisms. The accountability actors include citizens, political parties, legislatures, high courts, ombudsman offices, and other oversight bodies, as well as media, journalists, and CSOs. The government must decide to what extent it will concede to such demands in an iterative process balancing the costs of supplying accountability against the cost of suppressing the demand for accountability. This notion builds on Dahl's (1971, p. 14f) famous axiom that the likelihood of democratization increases as the cost of tolerating opposition decreases and the cost of repression increases (Fig. 1).
Thus, the evolution of specific patterns of accountability is a function of (1) how costly governments calculate it would be to supply improved institutions of accountability, in particular to the extent that it would affect their continued hold on power; and (2) whether governments perceive the cost of suppressing the demand for specific types of accountability as acceptable or not (cf. Lindberg 2009, p. 320).

The Cost of Supplying Accountability
We suggest that because vertical, diagonal, and horizontal accountability encompass distinct mechanisms to constrain governments, one should expect that they vary in how effective they are in the information and sanctions dimensions (Schedler 1999). Table 1 shows the pattern we expect.
Vertical accountability ultimately has a sharp edge in voters' power to "throw the rascals out" if governments perform poorly. The potential cost of immediately losing power is thus very high when the associated institutions are developed de facto, but vertical accountability is considered a "long route" to accountability (World Bank 2004) since elections occur only periodically. Citizens also face informational disadvantages restricting their ability to evaluate governments (Miller 2005, p. 207) even in established democracies (e.g., Achen and Bartels 2016; Brennan 2016; Evans 2004; Manin et al. 1999, p. 44), and ruling elites have multiple instruments to deceive the electorate (Schedler 2002), thus significantly lowering the costs of supplying vertical accountability. Hence, we consider the information dimension low for vertical accountability.
Fig. 1 The probability of governments allowing the evolution of de-facto accountability. Note: This figure builds on Dahl (1971, p. 16)
Conversely, the strength of diagonal accountability institutions, when effectively in place, is uncovering and providing information. Media and watch-dog CSOs are main sources of information for many citizens and therefore also vital for facilitating the effective exercise of vertical accountability. Nevertheless, CSOs and media have few direct means of sanctioning and depend on whether the institutions of vertical and horizontal accountability respond (Mainwaring and Welna 2003). Therefore, the potential costs of diagonal accountability are low in the sanctions dimension, but high in the information dimension.
Finally, we argue that horizontal accountability, when realized in practice, carries high costs for the executive in both dimensions. First, it is difficult for governments to evade fully effective and independent horizontal oversight institutions. Legislatures, high courts, and other oversight bodies have both incentives and powers to monitor the actions of the executive on a day-to-day basis, and impose costly sanctions (Laver and Shepsle 1999; Fish 2006). They are privy to even classified information and not easily deceived. For example, in countries such as Sweden (National Audit Office) or the USA (Government Accountability Office), independent audit offices have the right to thoroughly scrutinize records of public expenditure. Their reports are important tools for legislators and journalists to hold the government to account. Second, powerful legislatures (for instance, through votes of no confidence) and high courts (through rulings) have the power to directly sanction the government. The dual characteristics of effective information and enforcement make full de-facto horizontal accountability potentially very costly for governments, and they should therefore seek to prevent it for as long as possible.

The Cost of Suppressing the Demand for Accountability
The cost of repressing demands for expanding the effective powers of accountability actors is the second aspect shaping the propensity of government concessions, and it seems reasonable to assume that a strong demand is costlier to repress than a weak one.
There is evidence that institutions of vertical accountability, even if weak, build demand for more de-facto accountability of any kind. For example, the introduction of de-jure multiparty elections led African countries toward strengthening diagonal accountability (Lindberg 2006), perhaps because repeated participation in elections shapes citizens to demand more democratic procedures and broader participation (Gandhi and Lust-Okar 2009, p. 415). At the same time, political competition and a minimum level of press freedom enable civil society to push for better quality of government (Grimes 2013). If elections prompt various actors to believe that democracy is "the new game in town," incentives to adhere to democratic norms increase (Lindberg 2009, p. 335), including holding the government to account.
Key actors in vertical accountability are voters. They are the principals of legislators (in contexts with clean elections), with few contingencies on other sub-types of political accountability. This makes their incentives and capacity to demand more accountability less dependent on advancements in other areas. Disenchanted voters have a potent tool as potential mass protesters independent of other accountability actors. Therefore, many scholars (Markoff 1999, p. 189; Therborn 1997) single out mass protest as a key driving force in democratization processes, while others emphasize the role of elites in particular (Huntington 1984). In recent history, we can find many examples of the important role of citizens in moving from de-jure to de-facto accountability. "Stolen" elections have triggered mass protests leading up to the color revolutions (Bunce and Wolchik 2010; Thompson and Kuntz 2009). In 2010, Nigerians took to the streets demanding free and fair elections and the replacement of the head of the Election Management Body (EMB) (Le Van and Ukata 2012). Responding to the protests, the government appointed a new EMB head, who organized the much-improved 2011 elections (Lewis 2011).
We then argue that improvements in vertical accountability should be expected to increase the demand for greater de-facto diagonal as well as horizontal accountability, thus making it costlier for governments to repress such demands. For instance, legislators facing clean elections are likely to insist that the legislature gains effective oversight powers. First, if legislators are to be reelected, they need to have achieved something during their tenure, such as getting the government to implement a certain policy. This is a strong incentive to demand more power to hold the executive to account de-facto, especially since legislators have genuine mandates from voters and do not depend on the government to manipulate elections in their favor, including through the provision of clientelistic goods (Lust 2009). Second, when elections are free and fair, a greater share of opposition candidates, who are independent of government, is typically elected. Thus, clean elections increase the independence of legislators writ large. Through both pathways, high levels of vertical accountability are likely to increase the demand for more horizontal accountability (Fig. 2).
An example of effective diagonal accountability facilitating stronger horizontal accountability is the campaign by Argentinian CSOs using the media to push for reforms in the judicial system. The non-profit organization Asociación Por Los Derechos Civiles (ADC) led a campaign resulting in public hearings for Supreme Court of Justice nominees. Similarly, CSOs spearheaded judicial reforms at the provincial level in Argentina. As a result, the selection of judges was removed from political control and moved to the Council of Magistrates under CSO monitoring (Fisher 2013, pp. 238-9). Thus, CSOs can push for more transparency and better oversight of governments, which, alongside independent media, strengthens the demand for effective de-facto horizontal accountability. Therefore, we expect effective diagonal accountability to create a stronger demand for improved horizontal accountability.
Our central argument is that the cost of suppressing the demand for more de-facto horizontal accountability (in particular for the institutions of horizontal accountability at the national level with the sharpest teeth) is contingent on advancements in other sub-types of accountability. Conversely, the demand for vertical accountability is not conditioned as much on other sub-types and diagonal accountability occupies an intermediate position.

Measurement and Data
Figure 3 maps our conceptualization of accountability and identifies a measurement scheme with a combination of factual and evaluative indicators for vertical, diagonal, and horizontal sub-types.
To measure de-facto accountability, we rely on V-Dem's v6.2 data set covering 173 polities between 1900 and 2012, 4 drawing on over 2500 country experts' evaluations (Coppedge et al. 2016a). V-Dem aggregates the expert assessments in a custom-built Bayesian item-response theory model that takes coder disagreement and measurement error into account, enhancing both the reliability and validity of the data (Pemstein et al. 2015). 5 If specific de-jure aspects are not available from V-Dem, we use data from the Comparative Constitutions Project (CCP, Elkins et al. 2014). A detailed description of the variables is found in Appendix 1.
Four indicators capture the de-jure aspects of vertical accountability. Electoral regime captures whether elections for parliament and the executive are on course or not. Party ban de-jure indicates whether it is legally possible for parties not affiliated with the government to form. Multiparty elections de-jure denotes whether the law allows multiple parties to register for elections, and Share of population with suffrage indicates whether elections were held under universal suffrage. 6 Seven indicators capture the de-facto aspects of vertical accountability: the extent to which elections are truly multiparty in practice (Multiparty elections de-facto); the degree to which the freedom to form political parties is unrestricted (Party barriers de-facto); the extent to which the electoral management body (EMB) has autonomy to apply election laws impartially (EMB autonomy); the extent to which elections are free and fair and not marred by fundamental flaws and irregularities (Clean elections); the extent to which Vote buying occurs; the extent to which political parties are based on programs versus clientelistic linkages (Party linkages), in order to capture the functioning of political parties; and the extent to which voters have a real choice (Opposition parties autonomy from the government).
The de-jure horizontal aspects of accountability are captured with indicators from the CCP data set: We account for whether a Legislature exists; whether the legislature is allowed to question the government (Legislature questions executive de-jure); whether there is Judicial independence by constitution; and whether provisions for an Attorney general/prosecutor exist. Six indicators measure de-facto horizontal accountability: the likelihood that the Legislature investigates [the] executive in practice; whether the Legislature controls resources for its own operations; the likelihood that other bodies such as a comptroller general, general prosecutor, or ombudsman would conduct such an investigation (Executive oversight by other bodies); the extent to which judges are subject to disciplinary action (Judicial accountability); and the High and Low court independence from the government.
4 Data for 76 countries are available until 2015 and for 37 countries until 2014.
5 The measurement model produces a probability distribution over country-year scores on a standardized interval scale (Coppedge et al. 2016c, p. 33). As the sequencing models require ordinal variables, we use the ordinal version of the V-Dem variables. An advantage of expert-coded data is that it provides information on the strength of institutions in practice. However, a legitimate concern is that there might be bias if the hypotheses are derived from the same literature on which coders base their coding. To counter this potential issue, V-Dem has recruited more than 2500 independent scholars from almost 180 countries, two-thirds of whom are from the country they are coding. For each indicator-country-year, five or more independent coders provide ratings. Given the number and diversity of independent scholars, it is unlikely that raters, when coding, draw on the literature on which we base our hypotheses. The advanced modeling techniques, which, to the extent possible, minimize coder error and address issues of comparability across countries and over time, also help (Pemstein et al. 2015).
6 We define virtual universal suffrage to be achieved when 98% of the population is enfranchised (Skaaning et al. 2015).
To gauge de-jure diagonal accountability, we use three CCP indicators reflecting whether there is Freedom of assembly, Freedom of expression, and Freedom of the press by constitution. For de-facto diagonal accountability, we include indicators measuring the extent of Media censorship; whether media outlets regularly criticize the government (Critical media); the extent to which media represent a wide range of political perspectives (Media wide range of views); the extent to which CSOs are free to organize (CSO entry and exit) and to criticize the government without fear of negative consequences (CSO repression); how wide and how independent public deliberations are when important policy changes are being considered (Engaged society); and to what degree there is Wide involvement in CSOs from society.

Empirical Approach
A set of novel methods developed for political science from parasite-host systems analysis in biology (Lindenfors et al. 2016, 2018; Wang et al. 2017) makes it possible for the first time to describe long chains of sequential relationships between ordinal variables. We use the following two: graphical investigation of the exact pathways for how variables change in relation to one another; and dependency analysis, exploring whether the values of one variable are systematically conditional on certain values of other variables. 7 The latter is inspired by "the contingent states test," developed to investigate dependencies in biological evolution (Sillén-Tullberg 1993). 8 We construct such dependency tables for how each accountability indicator has developed in relation to every other across the 35 indicators. These tables identify the lowest value recorded on other variables, across all 173 countries and 113 years, at the point when the variable in question reached its own maximum value (to reduce the risk that outliers drive the results, we exclude the lowest 5% of observations following convention; cf. Lindenfors et al. 2016, p. 10). The sum of these minimum values is called contingency conditions. A low number of contingency conditions for a variable indicates that this institution developed to its highest level before much progress in other institutions was made. Conversely, a high number of dependencies for a variable indicates that the institution cannot fully develop before many other variables have reached high levels. We stress that the approach is purely descriptive in its nature and does not allow for causal claims. The contribution of the method is rather to identify large portions of the data that exhibit specific contingencies, and describe those, just like patterns of evolution in biology.
7 For readers familiar with sequence analyses of the type used by Abbott (1995) and Abbott and Tsay (2000), it should be pointed out that both the origin and the logic of those methods are fundamentally different. Abbott et al.'s approach builds on the logic of DNA sequencing and requires that variables are dichotomous and only occur once, such as in life history analysis. That said, as pointed out by Wu (2000), sequencing methods do not take other covariates into account, and that applies also to what we present here. In principle, the methods developed by Lindenfors et al. can include as many confounding variables as one thinks relevant, but in this paper we are not analyzing exogenous factors. The research task here is to identify and describe the endogenous sequence of accountability, that is, how the different aspects of accountability develop in relation to one another. In addition, both the methods used in this paper and those of Abbott (1995) are time-insensitive and are unable to capture non-linear dependence on time. Thus, in Appendix 3, we provide tests showing that the results are robust to disaggregating the analysis by region and by time (before and after the fall of the Soviet Empire). However, we recognize that time may be important in other ways. This is not in the scope of this paper, and can be tested with other methods in future work.
8 The method combines a series of bivariate analyses and thus establishes a long series of sequences involving many multi-state variables. If high values in Variable A always correspond to a certain minimal value of Variable B, then it can be inferred that the high values of Variable A are likely to be conditional on this minimal value of Variable B. Conversely, if for the highest value of Variable B, the corresponding value of Variable A is its minimum, then this shows that Variable B is not contingent on Variable A. The result is a detailed and empirically based map of which aspects of a phenomenon occur before others. In Appendix 2, we present a simple illustrative example of a contingency table, and further discuss the interpretation of the contingency conditions.
When interpreting the results, one should not draw strong conclusions from small differences in the number of dependencies and contingency conditions, but one can draw inferences on sequence mechanisms from large differences (Lindenfors et al. 2016, p. 24). One distinct advantage of this method is that, unlike time-series cross-section analysis, it does not focus on average effects with fixed time lags. The dependency analysis allows one to identify similarities in sequences across countries regardless of when and where a process happened and how long it took. Put simply, the method can identify what is similar between a process in one country in, say, Europe that occurred in the 1920s and took three years, and another that took place in Africa in the 1990s and took 23 years. The dependency analysis tells us that one aspect never emerged before another one; in our case, "never" in the history since 1900, across some 17,500 country-years. This is arguably rather strong evidence that it is unlikely to happen in the future. Thus, we can present evidence on which aspects have developed first, in the middle, and last in the processes of building accountability. Table 2 presents the aggregate summary of 595 bivariate analyses following the dependency analysis approach outlined above, displaying the sum of contingency conditions for each of the variables reaching their highest value (the top category). For selected indicators, more detailed dependency tables can be found in Appendix 4. Table 2 illustrates that almost all de-jure indicators have very few dependencies, as expected. 9 The only exception is the formal establishment of an ombudsman office (part of de-jure horizontal accountability), reflecting that this institution, even if instituted first by the Swedish King Karl XII in 1713, spread internationally relatively late, from the 1960s.
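The dependency analysis described in this section can be sketched in a few lines of code. The following is a minimal illustration and not the authors' implementation: the indicator names, the toy observations, and the exact trimming rule (dropping the lowest 5% of the other variable's values among observations where the target is at its top level) are assumptions made for the example.

```python
from math import comb, floor

def trimmed_minimum(values, trim=0.05):
    """Lowest value left after discarding the lowest `trim` share of observations.

    Assumption: trimming is applied per variable pair; the text only states
    that the lowest 5% of observations are excluded.
    """
    ordered = sorted(values)
    n_dropped = floor(len(ordered) * trim)
    return ordered[n_dropped]

def contingency_conditions(rows, target, trim=0.05):
    """Sum of the (trimmed) minimum levels of all other variables,
    taken over the observations where `target` is at its highest level."""
    top = max(row[target] for row in rows)
    at_top = [row for row in rows if row[target] == top]
    others = [name for name in rows[0] if name != target]
    return sum(trimmed_minimum([row[name] for row in at_top], trim) for name in others)

# Hypothetical country-years on 0-4 ordinal scales (toy data, not V-Dem values).
rows = [
    {"elections": 0, "media": 0, "legislature": 0},
    {"elections": 2, "media": 0, "legislature": 0},
    {"elections": 4, "media": 1, "legislature": 0},
    {"elections": 4, "media": 2, "legislature": 1},
    {"elections": 4, "media": 4, "legislature": 2},
    {"elections": 4, "media": 4, "legislature": 4},
]

scores = {name: contingency_conditions(rows, name) for name in rows[0]}
print(scores)       # {'elections': 1, 'media': 6, 'legislature': 8}
print(comb(35, 2))  # unordered pairs among 35 indicators: 595
```

In this toy example, the hypothetical "elections" indicator reaches its top level while the others are still low (few contingency conditions), whereas "legislature" only reaches its top level once everything else is fully developed. With so few observations the 5% trimming drops nothing; it only takes effect in larger samples.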

Results
Our findings support the hypothesis that de-facto vertical accountability evolves first in the accountability sequence (H1), with a few minor exceptions. Most indicators of de-facto vertical accountability require fewer contingencies than the indicators of the diagonal sub-type and, as expected, than the key institutions with "sharp teeth" of the horizontal sub-type.
The sequence pattern demonstrates that improving vertical accountability by diminishing Vote buying in elections can be achieved very early along with getting Multiparty elections de-facto and transforming Party linkages from clientelistic to programmatic. A little more demanding but still relatively less contingent are other vertical institutions like party barriers, opposition party autonomy, and clean elections.
There are two exceptions to the pattern that high levels of vertical accountability develop first. First, full EMB autonomy has more contingency conditions than all other aspects of vertical accountability, most indicators of diagonal accountability, and some auxiliary aspects of horizontal accountability. Thus, the "last holdout" for governments in the area of vertical accountability seems to be influencing the management of elections. This is plausible, because restricting EMB autonomy is a low-cost way of manipulating elections due to its low public visibility (Schedler 2013, p. 274).
Second, three indicators capturing auxiliary aspects of horizontal accountability have fewer dependencies than some indicators of vertical accountability: Lower court independence, Legislature controls resources, and Judicial accountability. A closer look at these indicators reveals that while they are important for achieving horizontal accountability in the long run, their development does not immediately constrain governments; they are auxiliary. For instance, Lower court independence facilitates the rule of law, which may help to enforce accountability. However, lower courts rarely have the power to directly sanction the government for its political actions. Similarly, Judicial accountability captures to what extent judges themselves are held accountable for serious misconduct, which strengthens the integrity of these key accountability actors. However, these institutions do not directly constrain the government. Finally, a Legislature [controlling its own] resources may become more independent from the government, which may enable MPs to constrain government action. However, whether or not they do so is reflected by the indicator Legislature investigates in practice, which clusters with other key aspects of horizontal accountability at the top of the contingency table. These three auxiliary aspects are important for building the institutions of horizontal accountability, but do not immediately threaten the government. This can explain why they develop relatively early in the sequence. Thus, we have to qualify our expectation that all aspects of vertical accountability develop first.
The indicators of de-facto diagonal accountability cluster together in the upper half of the contingency table, indicating that reaching their highest states tends to occur later in the sequence than for most indicators of vertical accountability, as we hypothesized.
For example, nearly all governments in the world discontinued CSO repression only after achieving at least medium levels in institutions such as Freedom of discussion, Clean elections, and Critical media. 10 Finally, we expected key institutions of de-facto horizontal accountability, which enable institutions to directly oversee and constrain governments, to become fully effective de-facto late in the sequence (H2). Our findings support this hypothesis (Table 2). All indicators of diagonal and vertical accountability have fewer contingencies than the three key indicators of horizontal accountability. Table 2 provides evidence that no country has scored high on these three indicators without achieving significant progress in many other institutions of accountability: the three key indicators of horizontal accountability are contingent on high overall achievements in many other aspects of accountability, with contingency scores summing to 54, 57, and 62. The difference from the number of contingencies for almost all indicators of vertical accountability (between 5 and 27) is substantial. The high number of contingencies for the three key indicators of horizontal accountability suggests that historically, before the legislature and other bodies were able to effectively hold the executive to account, and before the high court was able to issue rulings independently, advancements in several areas were made. These are, for example, institutions guaranteeing that politicians are subject to regular and clean elections; that citizens are free to organize themselves and express their political will through political parties; and that independent CSOs as well as the media are able to scrutinize the work of governments. 11 Figures 4, 5, and 6 use the graphical investigation to validate some of the findings from the contingency table.
10 See Table 6, which documents the specific contingencies for selected individual indicators. Contingency tables for the remaining indicators are available upon request.
11 See Table 6, which documents the specific contingency conditions for selected individual indicators. Dependency tables for the remaining indicators are available upon request.
Figure 4 presents the bivariate relationship between two key variables of diagonal and vertical accountability: Media censorship (y-axis) and De-facto multiparty elections (x-axis). Higher values of the variables indicate that the government is more accountable. The size of the dots signifies the frequency of country-years with a particular combination of values. The relatively small dots to the left of the diagonal line indicate that only a few cases have accomplished an uncensored media before De-facto multiparty elections. The arrows in Fig. 4 illustrate the pathways of countries moving from one combination of indicators to another, restricted only to positive developments (therefore, the arrows only point to higher values on the scale). Thickness of the arrows indicates frequency. The lack of thick arrows between the lowest and highest states of any indicator suggests that high levels of these aspects of de-facto accountability evolve in a sequential process and not overnight. Almost all high values on Media censorship occur when De-facto multiparty elections has already reached the highest value. This provides further support for our first hypothesis: vertical accountability evolves early in the sequence. Figures 5 and 6 provide further evidence on the second hypothesis regarding key aspects of horizontal accountability. Figure 5 shows the development of the variables Legislature investigates executive in practice (y-axis) and Clean elections (x-axis).
The bigger bubbles on the right of the diagonal line indicate that, historically, countries tend to start holding clean elections before the legislature could investigate the executive.
Similarly, when we look at the combination of values of two key variables from diagonal and horizontal accountability (Fig. 6), we see that Freedom of discussion (diagonal accountability) develops higher values earlier than High court independence (horizontal accountability).
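The quantities behind such bivariate figures (dot sizes and arrow thickness) can be sketched as simple frequency counts. The snippet below is an illustrative sketch under stated assumptions, not the procedure used to draw Figs. 4 to 6: the country paths are toy data, and a "positive development" is read here as a year-to-year move in which neither variable declines and at least one increases.

```python
from collections import Counter

# Hypothetical yearly (x, y) states per country on 0-4 scales,
# e.g. x = multiparty elections, y = media freedom (toy data, not V-Dem values).
paths = {
    "Country 1": [(0, 0), (2, 0), (4, 1), (4, 3)],
    "Country 2": [(0, 0), (2, 0), (4, 0), (4, 3)],
}

def bubble_sizes(paths):
    """Number of country-years observed at each (x, y) combination."""
    return Counter(state for path in paths.values() for state in path)

def positive_arrows(paths):
    """Frequency of year-to-year moves in which neither variable declines
    and at least one increases (our reading of "positive developments")."""
    arrows = Counter()
    for path in paths.values():
        for before, after in zip(path, path[1:]):
            improved = after != before and after[0] >= before[0] and after[1] >= before[1]
            if improved:
                arrows[(before, after)] += 1
    return arrows

bubbles = bubble_sizes(paths)
arrows = positive_arrows(paths)

# Country-years in which y is ahead of x (left of the diagonal).
y_ahead = sum(count for (x, y), count in bubbles.items() if y > x)

print(bubbles[(4, 3)])           # 2
print(arrows[((0, 0), (2, 0))])  # 2
print(y_ahead)                   # 0
```

In this toy panel no country-year lies left of the diagonal, which corresponds to the pattern in which the x-axis variable reaches high values before the y-axis variable does.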
To sum up, while our empirical analyses only display descriptive contingencies, they corroborate the claims developed in the theory section, in particular that high levels of key aspects of horizontal accountability that directly oversee the government come last in the sequence of accountability.

Regional and Time Trends
To assess the robustness of our findings, we disaggregate the analysis by time and by region, which also helps to address concerns that time and geographic characteristics might be important covariates. Since the end of the Cold War, the number of electoral authoritarian regimes has surged, and it seems plausible that this trend should be reflected in different sequencing patterns. Therefore, we split the sample into two parts: one including all countries in 1988 or earlier and one with all countries after 1988. Table 3 lists the de-facto accountability indicators sorted in descending order based on this division.
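The period split can be illustrated with a short sketch. It is only a schematic example under stated assumptions: the indicator names and observations are invented, the top level is assumed to be 4, and the 5% outlier trimming is omitted for brevity.

```python
def contingency_sum(rows, target, variables, top_level=4):
    """Sum of the lowest levels of the other variables over observations where
    `target` is at `top_level`; None if the top level is never reached."""
    at_top = [row for row in rows if row[target] == top_level]
    if not at_top:
        return None
    return sum(min(row[v] for row in at_top) for v in variables if v != target)

def ranking(rows, variables):
    """Variables ordered from most to fewest contingency conditions."""
    scores = {v: contingency_sum(rows, v, variables) for v in variables}
    reached = [(v, s) for v, s in scores.items() if s is not None]
    return sorted(reached, key=lambda item: item[1], reverse=True)

def split_by_cutoff(rows, cutoff=1988):
    """Observations up to and including `cutoff`, and those after it."""
    early = [row for row in rows if row["year"] <= cutoff]
    late = [row for row in rows if row["year"] > cutoff]
    return early, late

VARIABLES = ["elections", "courts"]  # hypothetical indicator names
rows = [  # toy country-years on 0-4 scales, not V-Dem values
    {"year": 1950, "elections": 2, "courts": 0},
    {"year": 1970, "elections": 4, "courts": 1},
    {"year": 1985, "elections": 4, "courts": 4},
    {"year": 1995, "elections": 3, "courts": 1},
    {"year": 2005, "elections": 4, "courts": 3},
    {"year": 2012, "elections": 4, "courts": 4},
]

early, late = split_by_cutoff(rows)
print(ranking(early, VARIABLES))  # [('courts', 4), ('elections', 1)]
print(ranking(late, VARIABLES))   # [('courts', 4), ('elections', 3)]
```

Comparing the two rankings shows how an indicator can keep its position in the ordering while requiring more progress elsewhere in one period than in the other.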
Most key findings are similar to the results described for the global sample. In particular, the three mechanisms of horizontal accountability that directly oversee and effectively constrain governments are at the top of the dependency table for both samples, requiring most other aspects to be relatively highly developed de-facto. Thus, the end of the Cold War did not affect the reluctance of governments to give in on these issues.
There are, however, instructive differences between the two samples regarding some vertical accountability mechanisms. Before 1988, two important indicators of de-facto vertical accountability, Clean elections and programmatic Party linkages, are at a similar spot in the sequence as Multi-party elections de-facto. However, after the end of the Cold War, the development of Clean elections and non-clientelistic Party linkages seems to require considerably more progress in other aspects of accountability than Multi-party elections de-facto. This could be linked to the emergence of a larger number of electoral autocracies in the latter period, which only improve the quality of elections (if at all) after internal as well as external pressure (Lindberg 2006; Schedler 2013).
Also, in the period after 1988, Lower court independence developed last in the sequence, whereas for the earlier time period, it is in the lower part of the dependency table. This suggests that countries that developed accountability after 1988 had to struggle with a legacy of weak lower courts.
We also disaggregate the analysis by splitting the sample by world regions in order to investigate regional trends. This helps to address concerns that the results are driven by one region or cluster of countries. In addition, geography is a proxy for other important covariates that tend to be similar in different regions of the world (e.g., culture or economic development). Finally, theories about democratization suggest that there is a spillover effect or diffusion in the spread of democracy where states are more likely to adopt and sustain democracy the more democratic their neighbors are (Gleditsch and Ward 2006).
Appendix 3 includes the results and more detailed discussions. 12 Yet overall, the main findings from the global analyses above hold across regions. The variables necessitating the lowest number of contingency conditions tend to be associated with vertical accountability; many diagonal accountability indicators are concentrated in the middle of the table; and the aspects that come at the latest stage of development (or are not achieved yet) reflect the key institutions of horizontal accountability. Thus, no specific region drives the results in the global sample; rather, there is a global pattern in the sequence in which accountability developed.

Conclusion
This paper breaks new ground in understanding the details of how governments become more accountable de-facto across three sub-types of accountability: vertical, diagonal, and horizontal. We argue that governments are more likely to allow for de-facto accountability if the costs of supplying accountability decrease and the costs of suppressing the demand for accountability increase. Based on this notion, governments tend to make initial concessions in the vertical sub-type of accountability (voters, political parties), because this sub-type is less effective in directly constraining their actions and thus less costly than de-facto horizontal accountability (oversight by the judiciary, legislatures, and other oversight bodies). Furthermore, since voters are not agents in accountability relationships, their incentive to demand more influence is less contingent on advances in other sub-types. Conversely, the incentive of legislators as key agents of horizontal accountability to demand more oversight power increases with more vertical (voter demands) and diagonal (CSOs, media) accountability.
12 In Appendix 5, we present a brief, illustrative case study of Ghana as an example for the country-specific experiences that are part and parcel of our general findings based on 115 years of data from 173 countries.
Using novel sequencing methods, we present new evidence on how accountability has evolved in 173 countries from 1900 until the present and a total of some 17,500 country-years. Our findings-while descriptive in nature-provide empirical support to our main theoretical assumptions and uncover the following empirical trends. High levels of de-facto accountability in the realm of vertical accountability typically evolve before other aspects of accountability. Effective horizontal accountability is contingent on progress in vertical and diagonal accountability. Without fully clean elections, autonomous opposition parties and a developed civil society and media, no country in the world has yet achieved fully effective government oversight through independent high courts, vigorous parliaments, or other institutions.
These findings have important policy implications. Efforts seeking to enhance horizontal accountability, such as the legislature's de-facto power, are very unlikely to be fully successful unless a series of other mechanisms of accountability are in place. Meanwhile, efforts to improve elections, the
In sum, the novel sequencing methods utilized in this paper make an important contribution to our understanding of endogenous patterns of accountability evolution. Future research should also examine the role of exogenous factors (such as international interventions or economic development) in these sequential developments, as well as the reverse process of diminishing accountability. While the analysis finds support for the existence of a global sequence of accountability-building over time, we also see interesting variations across regions and before and after the Cold War. Future research could seek to explore this variation, for example, by combining the two lines of work and testing if regions have experienced different sequencing patterns before and after 1988. Finally, our empirical analysis has shown that some aspects of horizontal accountability (such as a legislature controlling its own resources) typically develop before other aspects of the same accountability sub-type (such as the legislature investigating the executive in practice). Such auxiliary indicators, which enable the evolution of other aspects, may exist in the other accountability sub-types as well. This could be local elections in the realm of vertical accountability or professional associations for diagonal accountability. Since we have not included such indicators in our analysis, this issue warrants further investigation.

(v2elembaut) Taking all aspects of the pre-election period, election day, and the post-election process into account, would you consider this national election to be free and fair?

V-Dem
Programmatic party links (v2psprlnks) A party-constituent linkage refers to the sort of "good" that the party offers in exchange for political support and participation in party activities.

V-Dem
Opposition parties autonomy (v2psoppaut) Are opposition parties independent and autonomous of the ruling regime? An opposition party is any party that is not part of the government, i.e., that has no control over the executive.

V-Dem
Clean elections (v2elfrfair) Taking all aspects of the pre-election period, election day, and the post-election process into account, would you consider this national election to be free and fair?

V-Dem
Vote buying (v2elvotbuy) Vote and turnout buying refers to the distribution of money or gifts to individuals, families, or small groups in order to influence their decision to vote/not vote or whom to vote for. It does not include legislation targeted at specific constituencies, i.e., "porkbarrel" legislation.

HORIZONTAL ACCOUNTABILITY
De-jure horizontal accountability
Legislature exists (v2lgbicam) Is there a legislature in place? Advisory bodies that do not have the formal authority to legislate (as stipulated by statute, legislative rules, the constitution, or common law precedent) are not considered legislatures.

V-Dem
Legislature investigates executive de-jure (INTEXEC) Does the legislature have the power to interpellate members of the executive branch, or similarly, is the executive responsible for reporting its activities to the legislature on a regular basis?
If the executive were engaged in unconstitutional, illegal, or unethical activity, how likely is it that a legislative body would conduct an investigation that would result in a decision or report that is unfavorable to the executive?

V-Dem
Legislature controls resources (v2lgfunds) In practice, does the legislature control the resources that finance its own internal operations and the perquisites of its members?

V-Dem
High court/lower court independence (v2juhcind, v2juncind) When the high/lower court in the judicial system is ruling in cases that are salient to the government, how often would you say that it makes decisions that merely reflect government wishes regardless of its sincere view of the legal record?

V-Dem
Judicial accountability (v2juaccnt) When judges are found responsible for serious misconduct, how often are they removed from their posts or otherwise disciplined?

V-Dem
Executive oversight by other bodies (v2lgotovst) If executive branch officials were engaged in unconstitutional, illegal, or unethical activity, how likely is it that a body other than the legislature, such as a comptroller general, general prosecutor, or ombudsman, would question or investigate them and issue an unfavorable decision or report?

Indirect forms of censorship might include politically motivated awarding of broadcast frequencies, withdrawal of financial support, influence over printing facilities and distribution networks, selected distribution of advertising, onerous registration requirements, prohibitive tariffs, and bribery.

V-Dem
Critical media ( (v2cseeorgs) To what extent does the government achieve control over entry and exit by civil society organizations (CSOs) into public life?

V-Dem
Freedom of discussion (v2xcl_disc) This indicator specifies the extent to which citizens are able to engage in private discussions, particularly on political issues, in private homes and public spaces (restaurants, public transportation, sports events, work etc.) without fear of harassment by other members of the polity or the public authorities. We are interested in restrictions by the government and its agents but also cultural restrictions or customary laws that are enforced by other members of the polity, sometimes in informal ways.

V-Dem

Appendix 2 Constructing a dependency table
For an analysis of sequential relationships between a larger number of variables, dependency tables can be constructed for all possible combinations of variables, and then summarized (Table 4). An example of how several such bivariate dependency tables can be summarized is found below in Table 5. For each variable, we summarize the minimum values in other variables and report them as number of "Contingency conditions." In this illustration, the maximum sum of thresholds, or contingency conditions, for a variable reaching its highest state is 20 (five other variables, each with a maximum level of four). The illustrative results would indicate that variable B comes first in attaining its maximum value in a sequence. It can reach its highest state unconditional on any other variables.
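The arithmetic of the illustration (six variables with a top level of four, hence a maximum sum of 20) can be reproduced with a small sketch. The observations below are invented for the example and are not the values of the appendix tables; outlier trimming is left out.

```python
def thresholds(rows, target, top_level=4):
    """Lowest level of every other variable among observations where
    `target` is at `top_level` (no outlier trimming in this sketch)."""
    at_top = [row for row in rows if row[target] == top_level]
    return {v: min(row[v] for row in at_top) for v in rows[0] if v != target}

def upper_bound(n_variables, top_level):
    """Largest possible sum: every other variable already at its top level."""
    return (n_variables - 1) * top_level

# Six hypothetical variables on 0-4 scales (toy observations).
rows = [
    dict(A=0, B=0, C=0, D=0, E=0, F=0),
    dict(A=0, B=4, C=0, D=0, E=0, F=0),
    dict(A=2, B=4, C=1, D=1, E=0, F=0),
    dict(A=4, B=4, C=2, D=1, E=1, F=0),
    dict(A=4, B=4, C=4, D=3, E=2, F=1),
    dict(A=4, B=4, C=4, D=4, E=4, F=4),
]

print(upper_bound(6, 4))                    # 20
print(sum(thresholds(rows, "B").values()))  # 0
print(thresholds(rows, "A"))                # {'B': 4, 'C': 2, 'D': 1, 'E': 1, 'F': 0}
print(sum(thresholds(rows, "F").values()))  # 20
```

Here the toy variable B has no contingency conditions because it reaches its top level while all other variables are still at their lowest level, whereas F only reaches its top level when all others have done so and therefore hits the upper bound.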
For our study, the dependencies of the highest indicator states are of particular interest, because we are interested in what these conditional relationships look like for developing the de-facto accountability mechanisms. If one were, for example, interested rather in the onset of such developments, one should look at the number of dependencies for different variables reaching the first, or perhaps the second level, which would indicate "early moves" rather than "final push."

Appendix 3 Regional trends
We split the sample by world regions in order to investigate regional trends. 13 Table 5 lists the de-facto accountability indicators sorted in descending order with the indicators with the highest number of dependencies at the top of the list, and the lowest at the bottom. 14 Key findings from the sequence of variables in the global sample hold across regions. The variables that necessitate the lowest number of conditions tend to be associated with vertical accountability (indicators displayed in red in the table); many diagonal accountability indicators (displayed in green) are concentrated in the middle of the table; and for most regions the indicators that come at the latest stage of development reflect horizontal accountability (blue indicators). Some exceptions to this pattern in Table 6 can also be found in the global sample, e.g., establishing an autonomous EMB comes relatively late in time, while in some regions progress in terms of horizontal accountability, like financial independence of the legislature and judicial accountability, comes before reaching high levels on any other mechanisms of accountability. While the exact ordering sometimes varies a little, the indicators at the bottom, the middle, and at the top in the three types of accountability are the same as in the global analysis for most regions.
13 To divide the countries, we have used a politico-geographical classification scheme (e_regionpol) from the V-Dem data set v6 (taken from QoG 2013). We dropped the Pacific region (excluding Australia and New Zealand) due to the low number of countries and cases.
14 Table A. documents the full table with number of contingency conditions for each region.
There are a number of interesting differences in the progress of accountability mechanisms across regions. First, in some regions, no country has reached the highest level on all accountability indicators. These are crossed out in Table 5. For example, no government has yet fully given up on media censorship or enabled the legislature to effectively investigate in practice in the MENA region (here including Turkey and Israel). Second, the pattern of development of vertical accountability seems to differ across regions. In most regions, vote buying is eradicated relatively early. However, in Western countries as well as in the Caribbean, vote buying persists longer than other deficits in vertical accountability, with the exception of EMB autonomy, which has been fully realized relatively late in the sequence everywhere. EMB autonomy comes particularly late in the sequence in Sub-Saharan Africa and South Asia, indicating that governments there have kept a backdoor for electoral manipulation open longer than other instruments for limiting accountability. Finally, clean elections are achieved rather late in the MENA region (if at all), and unlike in other regions, countries from the Caribbean have not developed programmatic relationships between political parties and citizens early in the sequence.
Thus, interventions to help make the EMB fully autonomous should be synchronized with efforts to strengthen the other mechanisms of vertical accountability. On the other hand, vote buying can be addressed early in most regions of the world where weak mechanisms of accountability are an issue, regardless of whether other mechanisms are in place.
There are also some interesting differences across regions with regard to horizontal accountability. Notably, no country from Eastern Europe and Central Asia, Sub-Saharan Africa, or South-East Asia has reached full judicial accountability (a measure of whether judges are held accountable for possible illegal actions) before making substantial progress in many other aspects of accountability. This is one instance where the disaggregated, regional analysis is very useful. Because judicial accountability developed to a high degree early in a minority of regions (e.g., Western Europe), the global analysis "hides" that in most regions it is actually an aspect of accountability that comes very late in the sequence.
Similarly, lower court independence was developed relatively late in the sequence in regions of the world covering a substantial number of countries (Eastern Europe and Central Asia, Latin America, East and South Asia), but in other regions, it had a relatively low number of contingency conditions. While the present analysis cannot provide an answer to why these regional differences occur, it is important to note these exceptions to the global pattern if and when the analyses here are used to make policy recommendations.

Most of the losing vote went to the Danquah-Busia legatee of the New Patriotic Party (NPP), which refused to accept the outcome (Morrison 1999). The 1992 elections were somewhat free and fair, largely free of vote buying, and barriers for parties to form and participate were low. But there was evidence of irregularities and questions about the autonomy of the EMB (Gyimah-Boadi 2001; Lindberg 2003; Nugent 2001). Despite these problems, after the election, the legislature headed by a well-known liberal, Justice D. F. Annan, asserted its independence in control over its own resources.
With the 1996 elections, opposition party autonomy was beyond doubt, and lower courts were clearly independent of the regime even though the ruling NDC and its leader President Rawlings remained in power. A fully independent and critical media that would openly challenge the sitting government did not develop until around the third elections in 2000. Both the indicators for free and fair elections (v2elfrfair_ord) and for government censorship effort on the media (v2mecenfm_ord) reach the highest score in 2000. The opposition party NPP then won both a narrow majority in parliament and the presidential office. Despite this electoral turnover, the legislature was still not fully capable of exercising executive oversight and conducting real investigations of illicit behavior by the executive. This is captured by the lower scores on the V-Dem indicator for this aspect: legislature investigates executive in practice (v2lginvstop_ord), on which Ghana is yet to achieve the highest score. The new President Kufuor and his government even actively sought and managed to minimize the reach of the legislature's oversight power and continued doing so into the party's second term from 2005 to 2008. The most important explanation for this circumvention of the legislature is to be found in President Kufuor's strategy of co-opting members of the legislature in order to reduce political competition (Lindberg 2009). As illustrated by the history of Ghana, many governments across the world resist full de-facto horizontal accountability for as long as they can, even if they came to power in clean elections.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.