Statistical capacity and corrupt bureaucracies

In many developing countries, economic statistics (such as the growth rate of GDP) are imprecise, making it difficult to evaluate economic reforms and learn “what works”. Improving economic statistics has thus become a priority of international organizations. In this paper, we isolate an insidious mechanism—a type of observer effect—by which a push for better statistics can make matters worse. Precise statistics require the collection of data from a large number of firms. If firms suspect that detailed information, when spreading through the bureaucracy, is misused to collect bribes, they have weaker incentives to invest. As a result, the effects of reforms are muted, making it even harder to discover “what works”. To suppress this mechanism, efforts to improve economic statistics should be comprehensive and also include institutional aspects.


Introduction
In many developing countries, economic statistics such as the growth rate of GDP, the inflation rate, or the unemployment rate are highly unreliable. For example, in a widely noticed book, Jerven (2013) documented that the quality of African GDP numbers is extremely "poor". At the same time, the World Bank's then chief economist for Africa referred to the deficient state of African economic statistics as a "statistical tragedy" (Devarajan 2013). It is therefore no surprise that improving developing countries' statistics has become a priority of international organizations, among them the World Bank, the IMF, and the OECD. These organizations pursue their objective through initiatives such as the "Partnership in Statistics for Development in the 21st Century" (PARIS21), a group concerned with technical issues and the funding of data collection and processing in poor countries. More recently, the push for better statistics in developing countries has gained additional momentum through the rise of digitalization and big data. Under the umbrella of the UN Global Working Group for Big Data in Official Statistics (GWG Big Data), both developing and advanced countries exchange experiences concerning the use of big data to improve economic statistics.
In many ways, improvements in the precision and availability of economic statistics would be highly welcome. In particular, considering that concepts such as "growth diagnostics" (e.g., Rodrik 2010) and "experimentation at scale" (e.g., Muralidharan and Niehaus 2017) have gained ground, accurate statistics are of increasing importance in the context of development policy. Growth diagnostics, for instance, is based on the notion that, when it comes to incremental economic reforms, which reforms work and which do not is highly context-specific, i.e., depends on a country's economic and institutional status quo. Therefore, as Rodrik (2010, p. 41) puts it, growth diagnostics "emphasizes experimentation as a strategy for discovery of what works, along with monitoring and evaluation." A condition for meaningful monitoring and evaluation is the availability of accurate statistics. If the numbers are poor, evaluation may become impossible or may lead to erroneous conclusions about "what works" (Manski 2015).
This paper does not deny that good statistics have many benefits. Yet, focusing on GDP statistics, we isolate an insidious mechanism by which a push for better statistics can have harmful side effects in developing countries. This mechanism, a type of observer effect, reduces the benefits or may reverse them into net losses. Our analysis suggests that efforts to improve developing countries' statistics should not have a narrow focus on technical statistical capacity (i.e., data gathering); such efforts should be comprehensive and include institutional aspects (e.g., data confidentiality), notably in places where the bureaucracy has an extractive nature. It is key that technical statistical capacity be matched by the quality of the institutional setting.
Our argument rests on three observations. First, strengthening technical statistical capacity to improve GDP statistics necessarily means collecting more data. It includes a move to a regular economic census schedule and the enlargement of the firm surveys that underlie GDP estimates between the censuses (e.g., Berry et al. 2018, Jerven 2013). Second, although there often are official guarantees of confidentiality for census and survey data, a large number of reports on the handling of government-collected data suggest a grave risk of confidentiality breaches that may allow detailed firm data to spread widely within the bureaucracy and beyond. As we will discuss in Section 2, the reasons for this include widespread IT security holes and the expansive sharing of collected data (e.g., Brookings Institution 2018). Third, control of corruption is weaker in developing countries (e.g., Olken and Pande 2012), and corrupt officials use information on firm characteristics to "bribe discriminate" (Svensson 2003), with the consequence that larger firms pay higher bribes (e.g., Bai et al. 2019).
Connecting these observations, a push to strengthen technical statistical capacity must arouse fear of higher bribery costs among firms: with larger surveys, each firm faces a higher chance of being sampled and, given the possibility of confidentiality breaches, a higher chance of being confronted with bribe demands from somewhere within the bureaucracy. The expectation that the official confidentiality (or no-harm) assurances may not hold weakens firms' incentives to invest. But if firms hold back for fear of getting (even more) entangled in bribe demands, their responses to any reform policy, and thus the policy's overall effect on economic performance, will be muted. As a result, although the improved statistics reduce the noise in growth estimates, it may become more difficult for the government to discover what policies work. In other words, the informativeness of policy experiments may fall rather than rise. In this case, a push to improve technical statistical capacity slows down learning. While this particular mechanism is novel, its individual elements are well documented. A growing body of empirical results suggests that corruption has a strong negative impact on investment (e.g., Beekman et al. 2014, Paunov 2016, Zakharov 2019). There is also evidence of corruption biasing the effects of reforms towards zero (e.g., Banerjee et al. 2019).
To examine the relationship between technical statistical capacity and societal learning about reforms, this paper proposes a theoretical two-period framework that features ex ante fundamental uncertainty about the effects of alternative reform options (as in Oechslin 2015, 2020) and ex post measurement uncertainty in the economy's key statistic, the output estimate. Measurement uncertainty stems from the fact that the statistical office has to base its output estimate on a random sample of firms. Within the framework, the size of this firm sample can be interpreted as a measure of the economy's technical statistical capacity. An improvement in technical statistical capacity reduces measurement uncertainty and hence improves the accuracy of output estimates. A further key element of the framework is that it treats firms' investment decisions as endogenous and allows for the possibility of firms being subjected to bribe collection by bureaucrats. Specifically, we assume that firms sampled by the statistical office face a positive chance of being confronted with (possibly additional) bribe demands that amount to a fixed proportion of their current revenue.
In this framework, what are the consequences of an exogenous improvement in technical statistical capacity? Holding constant firms' investments, a fall in measurement uncertainty permits a more reliable assessment of whether an implemented reform boosts output or whether the government should pursue adjustments to make the reform work; as a result, the learning process speeds up. However, firms' investments do not stay constant when technical statistical capacity improves: a larger sample implies that each individual firm faces a higher probability of being sampled and hence a higher risk of being subjected to bribe collection; as a result, firms scale back their investments, a response that has direct negative consequences for economic performance. But even more importantly, with smaller investments, economic reforms have a smaller effect on output, which, in turn, makes it more difficult for the government to determine whether an implemented reform works or needs adjustment.
So an improvement in technical statistical capacity has two opposing effects on the speed of the societal learning process and hence economic performance. A key implication of our framework is that, if control of corruption is sufficiently weak, there is a hump-shaped relationship between technical statistical capacity and economic performance: increasing the firm sample helps initially but reduces expected output beyond some critical threshold. In other words, although sampling is costless and the government is interested in learning about reforms, the optimal sample size is strictly smaller than the total number of firms. Corruption, by impairing the government's ability to identify the consequences of its reform decisions, retards economic growth by slowing down the learning process about "what works".
Over the past few years, a growing literature on the quality of economic statistics in developing countries has emerged. Often, the quality is shown to be low, pointing to a large potential for improvements in the timeliness and precision of economic indicators (e.g., Devarajan 2013, Jerven 2013, Kiregyera 2015, Sandefur and Glassman 2015, Kerner et al. 2017). Many papers consider the link between the quality of statistics and policy making. Rodrik (2010) stresses the importance of high-quality data for evidence-based development policy. Manski (2015) worries that imprecise estimates may lead to bad policy decisions if policy makers fail to account for measurement error. Binswanger and Oechslin (2015) argue that better statistics, by making evaluations of past policy changes more reliable, could reduce disagreements and promote economic reforms. More in line with the present paper, Binswanger and Oechslin (2020) identify adverse effects of better statistics in electoral democracies. Even though the present paper also rests on a model of policy learning, it differs significantly from that paper. Here, we relate learning about policies to corruption and explicitly model firms' investment choices. This connects us with the large literature on corruption, in particular with research on how corruption constrains investment and, more generally, the growth aspirations of firms (e.g., Fisman and Svensson 2007, Estrin et al. 2013, Freund et al. 2016, Paunov 2016, Zakharov 2019, Colonnelli and Prem 2020) and with work on how corruption reduces the socially optimal level of government intervention (e.g., Immordino and Pagano 2010).
The rest of this paper is organized as follows. The next section presents motivating evidence. In Section 3, we describe the theoretical setup. Section 4 solves for the equilibrium, assuming a given level of technical statistical capacity. Section 5 derives the optimal capacity level and discusses the harmful role of corruption. Section 6 offers a modified version of the model that allows for informality and misreporting. Finally, Section 7 concludes.

Economic statistics and economic performance
We start by presenting motivating evidence on the relationship between the quality of economic statistics and economic performance. We also consider the moderating role of corruption. To capture the quality of economic statistics, we use the World Bank's Statistical Capacity Index (SCI). The SCI is available for 153 developing and emerging economies at yearly frequency. We have recoded the index so that it ranges from 0 to 1, where 1 indicates maximum statistical capacity (the original range is 0 to 100). The index measures the extent to which a country's statistical system adheres to international technical standards deemed essential for the quality of economic data. We use the growth rate of real GDP p.c. (PPP, constant 2011 I$) to capture economic performance and the World Bank's Control of Corruption Index (CCI) as a measure for (the absence of) corruption. The CCI is a corruption perception index that is concerned with the exercise of public power for private gain and is constructed from a broad range of sources. Our dataset includes 146 countries and covers the period from 2005 to 2016. The figures in this section rely on observations averaged over periods of three years (2005-07, 2008-10, 2011-13, 2014-16). The full sample includes 556 observations.
Figure 1 shows a partial residual plot that illustrates the correlation between the residual growth rate of real GDP p.c. and statistical capacity. We see a significant positive relationship: an increase in the SCI of one standard deviation (0.16) is associated with a rise in real GDP p.c. growth of 0.6 percentage points. Figure 1 is based on the full sample and does not account for cross-country differences in corruption.
Fig. 1 note: The underlying linear OLS regression relates the average annual growth rate of real GDP p.c. to a constant, the SCI, the log of real GDP p.c., and period fixed effects. The value of the coefficient on the SCI (which is also the slope of the fitted line) is 3.913 (p-value: 0.000).
The role of corruption is highlighted in Fig. 2, which again shows partial residual plots. Each subfigure considers two disjoint subsamples of the full sample. One of the subsamples contains all countries with an average CCI score belonging to the top 25% of the distribution ("low corruption") and the other one all countries with an average CCI score belonging to the bottom quartile ("high corruption"). Examples of low-corruption countries are Uruguay and Chile (both high SCI), but also Botswana and Namibia (both low SCI). High-corruption countries are, among others, Libya and Equatorial Guinea (both low SCI), but also Russia and Ukraine (both high SCI). Subfigure (a) of Fig. 2 replicates the regression from Fig. 1 for the two subsamples. The plot suggests that the positive correlation found in Fig. 1 is driven by observations from low-corruption countries. While there is a positive relationship between the growth rate of real GDP p.c. and statistical capacity in the low-corruption subsample, no such relationship emerges among high-corruption countries. [6] In Subfigure (b), the underlying regression additionally includes country fixed effects. In terms of point estimates, the moderating role of corruption seems to be even stronger. In statistical terms, however, the difference between the two subsamples is estimated less precisely (p-value of 0.174, compared with 0.011 in Subfigure (a)).
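The regression behind such a partial residual plot can be sketched as follows. This is a minimal illustration on synthetic data, not a replication: the simulated values, the variable names, and the helper `partial_slope` are assumptions introduced here. The sketch relies on the Frisch-Waugh-Lovell result that the coefficient on the SCI in the full regression equals the slope obtained after residualizing both growth and the SCI on the remaining regressors (log GDP p.c. and period dummies, which also absorb the constant).

```python
import numpy as np

def residualize(v, controls):
    """Return the residual of v after an OLS projection on the controls."""
    coef, *_ = np.linalg.lstsq(controls, v, rcond=None)
    return v - controls @ coef

def partial_slope(y, x, controls):
    """Slope of y on x after partialling out the controls (Frisch-Waugh-Lovell)."""
    y_res = residualize(y, controls)
    x_res = residualize(x, controls)
    return float(x_res @ y_res / (x_res @ x_res))

# Synthetic panel (assumed values, for illustration only).
rng = np.random.default_rng(0)
n_obs, n_periods = 400, 4
sci = rng.uniform(0.2, 1.0, n_obs)             # statistical capacity, scaled to [0, 1]
log_gdp = rng.normal(8.5, 1.0, n_obs)          # log of real GDP p.c.
period = rng.integers(0, n_periods, n_obs)
period_dummies = np.eye(n_periods)[period]     # period fixed effects (span the constant)
controls = np.column_stack([log_gdp, period_dummies])

true_slope = 3.9                               # assumed, close to the reported 3.913
period_effects = np.array([1.0, 0.5, 0.8, 0.2])
growth = true_slope * sci - 0.5 * log_gdp + period_dummies @ period_effects
growth = growth + rng.normal(0.0, 1.0, n_obs)

# The partialled-out slope and the coefficient from the full regression coincide.
slope = partial_slope(growth, sci, controls)
full_coef, *_ = np.linalg.lstsq(np.column_stack([sci, controls]), growth, rcond=None)
```

With the reported coefficient of 3.913, a one-standard-deviation increase in the SCI (0.16) corresponds to 3.913 × 0.16 ≈ 0.63 percentage points, in line with the 0.6 stated in the text.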
The basic pattern shown in Fig. 2 is fairly robust to a number of modifications. It remains mostly unaffected when we use larger subsamples of low- and high-corruption countries, [7] or when we split the full sample into disjoint subsamples according to the World Bank's Rule of Law Index (instead of the CCI). Moreover, given the concerns regarding GDP data quality, we also used the (log) change in nighttime light intensity as a proxy for economic performance (Henderson et al. 2012). The source of the light data is Hodler and Raschky (2014), who aggregated the georeferenced raw data to ADM2 administrative levels (from where we aggregated them to the country level). The data are scaled from 0 to 63, with a larger number reflecting more intense nighttime lights. The data are available up to 2013, which leaves us with 399 observations from 134 countries. While the differences between low- and high-corruption countries are smaller and more sensitive to the threshold applied, Fig. 3 shows that the swap of GDP for light data does not change the basic pattern documented in Fig. 2.
Fig. 2 note (2005-2016): Each subfigure shows a partial residual plot for countries with an average CCI score belonging to the top ("low corruption") or bottom ("high corruption") 25% of the distribution (145 observations each). The underlying linear OLS regressions in Subfigure (a) relate the average annual growth rate of real GDP p.c. to a constant, the SCI, the log of real GDP p.c., and period fixed effects. The values of the coefficients on the SCI (which are also the slopes of the fitted lines) are -0.015 (p-value: 0.994) for the high-corruption subsample and 6.527 (p-value: 0.000) for the low-corruption subsample. The p-value for the difference in the two coefficients is 0.011. In Subfigure (b), country fixed effects are added to the underlying regression. The values of the coefficients on the SCI are -2.217 (p-value: 0.671) for the high-corruption subsample and 8.196 (p-value: 0.148) for the low-corruption subsample. The p-value for the difference in the two coefficients is 0.174.
[6] One concern relating to the analysis of split samples is a possible lack of common support. But a first glance at Fig. 2 suggests that the two subsamples have a considerable common support in terms of SCI. A more formal procedure for visual inspection suggested by Hainmueller et al. (2019) confirms this impression.
[7] The 25% threshold (top/bottom) is chosen for visual clarity. The general conclusion stated at the end of the preceding paragraph remains valid when we gradually move to larger thresholds, up to the median.
Overall, we find a positive relationship between economic performance and statistical capacity among low-corruption countries, while no such relationship appears among high-corruption countries. This pattern is consistent with our model, which predicts improvements in technical statistical capacity to lift economic growth in low-corruption places but warns that such a positive relationship should not be expected in places where corruption is higher.
Fig. 3 note: The figure shows a partial residual plot for countries with an average CCI score belonging to the top ("low corruption") or bottom ("high corruption") 25% of the distribution (101 and 102 observations, respectively). The underlying linear OLS regressions relate the log difference of mean light intensity to a constant, the SCI, the log of mean light intensity, and period fixed effects. The values of the coefficients on the SCI (which are also the slopes of the fitted lines) are -0.015 (p-value: 0.771) for the high-corruption subsample and 0.046 (p-value: 0.215) for the low-corruption subsample.

Data leaks and reactions
At a more general level, the mechanism we explore emphasizes that increased data gathering by the government makes people adjust their behavior in anticipation of the possibility that the data, when leaked, are used to their disadvantage. In turn, these adjustments in behavior may well undermine the very purpose of data gathering. There is anecdotal evidence that broadly supports the relevance of this mechanism in developing countries. First of all, fear of data leaks is clearly well-founded. One reason is the prevalence of IT security holes. In a recent focus on Africa, the Brookings Institution (2018) warns that throughout the continent and across sectors, including government, the commitment to cybersecurity is weak. Again relating to Africa, Serianu (2017) reports that the government sector is among the most frequent victims of cybercriminal activity. This aggregate perspective meshes well with country-level accounts. As an example, consider Kenya, a country where reports about data leaks abound. For instance, in April 2016 servers of the Kenyan Ministry of Foreign Affairs were breached and one terabyte of (partly confidential) information stolen (Privacy International 2019b). This should not come as a surprise, considering that according to an estimate from 2017 more than 80% of public sector institutions in Kenya do not even have the means to detect network intruders.
The frequent leaking of data held by public institutions does not go unnoticed and leads to reactions in the [...]
Besides IT security holes, the expansive use of data gathered by the government is another reason why fear of leaks is often warranted. A particularly striking illustration of this problem is India's Aadhaar biometric database. According to Privacy International (2019a), the breaches and leaks of personal data have been enormous. At some point in 2018, personal details of hundreds of millions of Indians apparently could be purchased online for as little as 500 rupees (see Fig. 4). The leaks did not seem to be driven by weak IT security (Privacy International 2019a); they rather appeared to be a result of the fact that the data are shared across the public sector to an extent way beyond what was originally imagined. In this context, an article in the Economist (Dec 18, 2018) noted that "No one knows for certain how much Aadhaar-associated data have been shared with whom, (...)." It is obvious that the wide sharing of data, in combination with weak governance, increases the risk of someone making illicit use of them. The consequences for Aadhaar are severe: "A system designed to prevent fraud has given rise to a whole new economy of fraudulent activity (...)." As in Kenya, there is evidence that people respond to the danger of data breaches. For instance, reports suggest that HIV patients preferred to abandon essential treatment programs when they had their Aadhaar numbers linked to their patient identity cards (Privacy International 2019a).
The general pattern of anticipation and response that arises in the examples from Kenya and India can be found in many places. In the model below, it emerges in a generic setting that considers data gathering for the purpose of evaluating uncertain policies.

Output, revenue, and economic policy
We consider a two-period economy with $N > 0$ firms. Each firm $i \in \{1, \dots, N\}$ is equipped with a technology to produce a homogeneous good whose price is normalized to 1. The technology is uniform across firms and represented by the production function in Eq. (1), where $t \in \{1, 2\}$ denotes time and $0 < \alpha < 1$; $A_t$ is a productivity parameter and $x_{it}$ refers to a firm-specific investment. For simplicity, the per-unit cost of investment is normalized to 1, too.
Following Barro (1990) and the subsequent literature, we assume that economic policy plays a productive role in output generation; it offers complementary inputs to private production (such as reliable power supply) and so potentially creates a positive link between government, investment, and economic performance. However, unlike much of the literature, our setup includes a status-quo policy and assumes that any deviation from the status quo has an uncertain effect on production. For concreteness, suppose that economic policy, $P_t$, enters production function (1) via the productivity parameter. Following Binswanger and Oechslin (2020), we assume $P_t \in \{-1, 0, 1\}$, where 0 is the status-quo policy and $-1$ and $1$ refer to two alternative reform policies. Policy affects productivity according to Eq. (2), where $0 < \gamma < 1$ reflects the economic significance of the reform and $S \in \{-1, 1\}$ captures the unobserved and invariable "state of the world" that materializes prior to the start of the economy. In Eq. (2), we use the square root of $\gamma$ to simplify the formal presentation of the main results below. Together, Eqs. (1) and (2) imply that a reform policy is beneficial (harmful) if its sign is the same as (is different from) the sign of the state. $S$ takes each of its two possible values with probability 1/2. This specific value is chosen for analytical convenience and is not important for our argument. What matters is that there is some uncertainty as to whether a particular reform alternative has a positive or a negative effect on productivity. For the policy maker, the only way to gain information is "policy experimentation": implement one of the alternatives, monitor the result, and then adjust if necessary. We note that in terms of key results our analysis would be unchanged if there were no status-quo policy. The difference would be that in such a simplified setup the government would be forced to try an uncertain policy, while in the present framework this will be an endogenous decision.
While producing the homogeneous good is the core activity, each firm additionally runs its individual side business. There are many possibilities: wholesale trade in a different good, property dealing, services such as repairs, etc. The side business is volatile and enters the analysis as an exogenous net contribution $\zeta_{it}$ to the total firm revenue $z_{it}$:
$$z_{it} = y_{it} + \zeta_{it}. \tag{3}$$
To capture the volatile nature of individual side businesses, we assume that $\zeta_{it}$ is a continuous i.i.d. random variable with support on $[0, \infty)$, mean $\chi > 0$, and variance $\sigma^2$. In combination with the informational frictions introduced in Section 3.2 below, the plausible consequence of firms running side businesses will be that the impact of any reform cannot be immediately inferred from the data of a single firm or a handful of firms. Hence the case for systematic data gathering. To keep the analysis tractable, the modeling of side businesses is parsimonious. Among other things, a side business is unaffected by policy and its contribution to total firm revenue is independent of the scale of the firm's core activity. The latter assumption, a measure of independence from core activities, is important for some of our findings. [11]
It is worthwhile to note that there is no need to impose a specific distribution function for the $\zeta_{it}$. But it is helpful to make two "technical" assumptions in this regard. First, to ensure that the model is scale invariant regarding the number of firms, we assume
$$\sigma^2 = \theta N, \tag{4}$$
where $\theta > 0$. The parameter $\theta$ will govern the volatility of the average side business contribution. Second, for a reason that will become clear below, we assume that the distribution of the $\zeta_{it}$ has a light left tail. Finally, we will use an asterisk ($*$) to mark values reflecting optimal firm choices: $x^*_{it}$ denotes firm $i$'s optimal investment level, while $y^*_{it}$ and $z^*_{it}$ refer to, respectively, the resulting output of the homogeneous good and the resulting total firm revenue.
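For readers who want a concrete parameterization, the following is one set of functional forms consistent with the verbal description of the setup. Eq. (3) follows directly from the additive definition of the side-business contribution. The forms shown for Eqs. (1) and (2) are assumptions made here for illustration (a power technology and a linear productivity shift) and may differ from the authors' exact specification.

```latex
\begin{align}
  y_{it} &= A_t \, x_{it}^{\alpha}, \qquad 0 < \alpha < 1,
    \tag{1, assumed form} \\
  A_t &= 1 + \sqrt{\gamma}\, P_t \, S, \qquad
    P_t \in \{-1, 0, 1\}, \; S \in \{-1, 1\}, \; 0 < \gamma < 1,
    \tag{2, assumed form} \\
  z_{it} &= y_{it} + \zeta_{it}.
    \tag{3}
\end{align}
```

Under these assumed forms, $A_t = 1 + \sqrt{\gamma}$ when $P_t = S$, $A_t = 1 - \sqrt{\gamma}$ when $P_t = -S$, and $A_t = 1$ under the status quo, so a reform is beneficial exactly when its sign matches the sign of the state.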

Informational constraints and output estimation
The government runs a statistical office that is tasked with the timely collection of firm-level data with the sole purpose of producing economic statistics at the end of each period. [12] Yet there are practical problems placing constraints on the level of detail with which firm-level data can be collected: within the limited time frame for completing statistical questionnaires, firms can only just identify total firm revenue; they do not have more detailed information on individual components yet. In brief, firm $i$ observes $z^*_{it}$ but does not individually observe $y^*_{it}$ or $\zeta_{it}$ before period $t$ ends. So firms can only report data on total firm revenue. Naturally, with time, the informational constraints ease and firms become able to separately identify the components of $z^*_{it}$. Specifically, firm $i$ learns $(y^*_{i1}, \zeta_{i1})$ in period 2 just before it is set to choose its level of investment. Provided that $P_1 \neq 0$, learning $(y^*_{i1}, \zeta_{i1})$ permits the identification of $A_1$ and hence state $S$. Assuming that firms (possibly) identify $S$ before they decide on their second-period investment simplifies the analysis without changing its substance.
[11] If the side businesses were inflated or deflated in perfect proportionality to the scale of the core activities ($z_{it} = y_{it} \cdot \zeta_{it}$), data gathering would still depress investment (and thus the effect of policy), but an interior level of data gathering could no longer be a maximizer of the informativeness of policy experimentation. The reason is that, with multiplicative shocks, a fall in investment mechanically reduces the side-business induced volatility of $z_{it}$.
[12] This is consistent with Principle 6 of the United Nations' Fundamental Principles of Official Statistics, which says that "Individual data collected by statistical agencies for statistical compilation, whether they refer to natural or legal persons, are to be strictly confidential and used exclusively for statistical purposes."
Firms provide the statistical office with accurate revenue data (e.g., because misreporting, if detected, carries a prohibitive fine). The office uses the data to compute estimates of average total firm revenue, which are then published at the end of each period and become valuable in the case of reforms. All activities of the statistical office are costless. The office's estimates are based on a random sample of $n \le N$ firms, where $n$ is determined before the start of the economy. The ratio $p \equiv n/N$ is a natural measure of technical statistical capacity. With this, the office's estimate of average total firm revenue in period $t$, denoted $Z^p_t$, is the sample mean of the reported revenues; its noise component, $\zeta^p_t$, is the sample mean of the side-business contributions. Assuming that $pN$ is sufficiently large, the Lindeberg-Lévy CLT implies that the distribution of $\sqrt{pN}\,(\zeta^p_t - \chi)$ is closely approximated by $\mathcal{N}(0, \theta N)$. We therefore work with $\zeta^p_t \sim \mathcal{N}(\chi, \theta/p)$.
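The normal approximation can be checked numerically. The sketch below assumes a gamma distribution for the side-business contributions (one convenient choice with support on $[0, \infty)$; the text does not impose a specific distribution) and hypothetical parameter values. It draws many samples of $pN$ firms and compares the simulated variance of the sample mean with the predicted value $\theta/p$.

```python
import numpy as np

def sample_mean_variance(chi, theta, n_firms, p, reps=20_000, seed=0):
    """Simulate the variance of the sample mean of side-business contributions."""
    rng = np.random.default_rng(seed)
    variance = theta * n_firms              # per-firm variance, sigma^2 = theta * N
    shape = chi ** 2 / variance             # gamma shape that yields mean chi ...
    scale = variance / chi                  # ... and variance theta * N
    sample_size = int(round(p * n_firms))   # number of sampled firms, n = p * N
    draws = rng.gamma(shape, scale, size=(reps, sample_size))
    return float(draws.mean(axis=1).var(ddof=1))

# Hypothetical parameter values, for illustration only.
chi, theta, n_firms, p = 5.0, 0.002, 1000, 0.2
simulated = sample_mean_variance(chi, theta, n_firms, p)
predicted = theta / p                        # variance implied by the CLT approximation
```

Because the per-firm variance scales with $N$ while the sample size is $pN$, the variance of the sample mean depends only on $\theta$ and $p$, which is the scale invariance that assumption (4) is meant to deliver.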
If the government has implemented a reform policy in period 1 ($P_1 \neq 0$), it has to rely on $Z^p_1$ as the only available source of information when it determines its second-period policy at the very beginning of that period. The government understands the firms' decision problem (and hence can infer their first-period investments, $x^*_{i1}$) as well as the parameters of the side businesses. Based on this, and with the help of $Z^p_1$, it uses Bayes' rule to compute the ex-post probability that the implemented first-period reform policy is the beneficial one.
In Section 6, we sketch a modified version of the model in which the data firms report to the statistical office are not necessarily fully accurate. Misreporting becomes an option as firms can shift parts of their core business into informal operations whose output cannot be detected.
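The Bayesian update can be illustrated with a short sketch. It assumes that all firms choose the same investment, so that the estimate equals a common core output plus the normally distributed noise term with mean $\chi$ and variance $\theta/p$. The function name and the two output levels `y_good` and `y_bad` (core output if the reform is beneficial or harmful) are hypothetical inputs introduced here; in the model they would follow from the firms' first-period investments.

```python
import math

def posterior_beneficial(z_hat, y_good, y_bad, chi, theta, p, prior=0.5):
    """Posterior probability that the implemented reform is the beneficial one.

    z_hat is the published estimate of average total firm revenue. The noise in
    the estimate is taken to be normal with mean chi and variance theta / p.
    """
    var = theta / p

    def log_lik(y_core):
        # Normal log-likelihood of z_hat, up to a constant common to both states.
        return -((z_hat - y_core - chi) ** 2) / (2.0 * var)

    log_odds = math.log(prior / (1.0 - prior)) + log_lik(y_good) - log_lik(y_bad)
    # Numerically stable logistic transform of the posterior log-odds.
    if log_odds >= 0.0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    odds = math.exp(log_odds)
    return odds / (1.0 + odds)

# Hypothetical values: the same estimate is more informative at higher capacity p.
low_capacity = posterior_beneficial(1.6, y_good=1.2, y_bad=0.8, chi=0.5, theta=0.01, p=0.1)
high_capacity = posterior_beneficial(1.6, y_good=1.2, y_bad=0.8, chi=0.5, theta=0.01, p=0.9)
```

For a given gap between `y_good` and `y_bad`, a larger $p$ shrinks the noise variance and moves the posterior further away from the prior of 1/2. The gap itself is not fixed, however: it depends on firms' investments, which is where corruption enters.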

The bureaucracy and the government
While officially firm-level data collected by the statistical office may not be used for any purpose other than output estimation, there is a potential for data misuse: in both periods, there is a probability π > 0 of a confidentiality breach that puts the data into the hands of a "corrupt" government official. Following Svensson (2003) and Fisman and Svensson (2007), we may think of an official outside the statistical office whose power to collect bribes derives from his discretion in the application and enforcement of complex regulations. In practice, there is more than one way by which the official can get hold of confidential firm data (Section 2.2). Just like any hacker from outside the government bureaucracy, the official may take advantage of insufficient cyber defenses. Alternatively, the official may obtain access to the firm data due to an illicit practice of expansive data sharing within the government bureaucracy.
In many instances, corruption is governed by norms (see, e.g., Malesky and Samphantharak 2008, della Porta and Vannucci 2012, World Bank 2015). While authorizing corruption, these norms sometimes include restrictions, a sense of what would be an official's customary share and what would amount to excessive, "intolerable" extortion. For the model, assume a norm that tolerates bribe collection as long as it is not out of proportion with a target's economic capacity. Such a norm may be rooted in the experience that the economic damage associated with bribe collection beyond a certain relative limit causes civil unrest, putting in danger the government bureaucracy as a whole. For concreteness, assume that the corrupt official is protected by the responsible superior (e.g., a high-ranking bureaucrat) as long as a collected bribe does not exceed a share $0 < \tilde{\beta} < 1$ of the target firm's total revenue; for bribes in excess of this limit, if credibly reported to the superior, the norm demands that the corrupt official face a severe penalty (i.e., one that is large relative to the bribe collected). Further assume that firms can, and also do, credibly report excessive bribe collection.
In the model, $\tilde{\beta}$ is one of two parameters that determine the scope for corruption. The second parameter is the probability of a confidentiality breach, π.
In what follows, we use $\beta \equiv \pi\tilde{\beta} \in (0, 1)$ as an overall measure of corruption and refer to it as the bureaucracy's vulnerability to corruption. With the help of β, the model is able to capture a broad spectrum of institutional realities. If β takes a relatively low value, we have a bureaucracy whose integrity is merely compromised. By contrast, a value close to 1 may capture a setting in which an illicit practice of data sharing, combined with a norm authorizing vast bribe collection, enables an extractive bureaucracy that is hardly bound to the rule of law at all.
Regardless of the concrete level of β, leaked firm data offer valuable information to the corrupt official. By eliminating the informational asymmetry between the official and the firms in the sample, the data permit bribe discrimination, i.e., tailored bribes that extract the maximum possible without any risk of sanctions. By contrast, collecting bribes from firms that are not part of the sample is risky. Without information on firm revenue, collecting a bribe entails the possibility of a penalty. The official considers collecting a uniform bribe from those firms, weighing the penalty to be expected as a function of the bribe level. Since a uniform ("lump-sum") bribe does not affect firms' incentives, we assume without loss of generality that the outcome of this optimization process is a bribe of zero. In summary, in the case of a confidentiality breach in period t, the official collects $\tilde{\beta} z_{kt}$ from all firms k that are part of that period's sample and leaves the remaining firms alone.¹⁴ Without a confidentiality breach in t, the official abstains from collecting bribes in that period.
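The way the two corruption parameters combine can be illustrated with a few lines of Python. This is only a sketch: the function names and the example numbers are hypothetical, and the only relations taken from the text are that vulnerability equals the breach probability times the tolerated bribe share, and that only sampled firms are exposed to tailored bribes.

```python
def vulnerability(breach_prob: float, tolerated_share: float) -> float:
    """Overall vulnerability to corruption: breach probability times tolerated bribe share."""
    if not 0.0 < breach_prob <= 1.0:
        raise ValueError("breach_prob must lie in (0, 1]")
    if not 0.0 < tolerated_share < 1.0:
        raise ValueError("tolerated_share must lie in (0, 1)")
    return breach_prob * tolerated_share


def bribe_from_sampled_firm(revenue: float, tolerated_share: float) -> float:
    """Tailored bribe after a breach: the largest amount the norm still tolerates."""
    return tolerated_share * revenue


def expected_bribe_share(sampling_prob: float, breach_prob: float,
                         tolerated_share: float) -> float:
    """Expected fraction of revenue lost to bribes by a firm sampled with sampling_prob."""
    return sampling_prob * vulnerability(breach_prob, tolerated_share)


if __name__ == "__main__":
    # Hypothetical numbers: 40% breach probability, 25% tolerated share, half of all firms sampled.
    print(vulnerability(0.4, 0.25))
    print(expected_bribe_share(0.5, 0.4, 0.25))
```

Firms outside the sample face no bribe in this sketch, mirroring the zero uniform bribe assumed in the text.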
The government determines statistical capacity, p, and is in charge of economic policy, $P_t$, $t \in \{1, 2\}$ (mind the difference between lower- and upper-case letters). We consider a benevolent government that, however, has inherited a bureaucracy whose vulnerability to corruption is deep-rooted and thus cannot be reduced within the time horizon of the model (i.e., must be taken as exogenous). Naturally, this assumption entails that β is unaffected by both p and $P_t$. Put differently, a possible improvement in statistical capacity, or a deviation from the status-quo economic policy, would not help in addressing problems with data confidentiality or corruption.¹⁵ The government's objective is to maximize the expected lifetime total revenue of the representative firm i; its objective function, V, is stated in Eq. (9), where the expectation is formed at the beginning of period 1. So our analysis of optimal technical statistical capacity assumes a relatively "favorable" environment in which the government does not pursue special interests but aims at maximizing economic performance and in which running the statistical office is free of charge.

Time line
The timing of actions is as follows. Prior to the start of the economy, Nature determines the unobserved state of the world S and the government chooses statistical capacity p.
In the first period, the government sets $P_1$; observing the government's decision, all firms i choose $x_{i1}$; the statistical office draws the random firm sample, thereby complying with the government's choice of p; Nature determines $\{\zeta_{i1}\}_{i=1}^{N}$; the statistical office collects the data and, if there is a confidentiality breach, the official collects bribes from the sampled firms; the statistical office publishes $Z^p_1$ and, if $P_1 \neq 0$, the government computes $r(Z^p_1)$.
14 Proportional bribe collection requires that the official has the power to wrest larger bribes from larger firms. This is plausible since the cost of hold-ups caused by the official arguably rises in the scale of production.
15 The literature (e.g., Pitlik et al. 2010) suggests that information is important in fighting corruption (e.g., as it helps monitor public spending and identify misuses of funds). Here, it is still plausible to assume that β is unaffected by p: the statistical office collects information on private firms and not on the government sector.
In the second period, the government sets $P_2$; all firms learn $(y^*_{i1}, \zeta_{i1})$ and, if $P_1 \neq 0$, infer state S; taking $P_2$ and the available information on S into account, all firms i choose $x_{i2}$. From this point onwards, the sequence of actions is identical to that in the first period.

Input choice
Before going backwards through the sequence of policy choices, we consider the firms' investment decisions. In period $t \in \{1, 2\}$, firm i solves maximization problem (10), where $E^i_t\{\cdot\}$ refers to the expectation formed by the firm just before it chooses $x_{it}$. The objective function in problem (10) reflects that, with probability p, firm i is sampled by the statistical office, in which case there is a chance π that it will be approached by a bribe-collecting official who would take a share $\tilde{\beta}$ of total revenue. The expected share of revenue lost to bribes is therefore $p\pi\tilde{\beta}$. Problem (10) can be simplified to problem (11). Since 0 < α < 1, the objective function in maximization problem (11) is a strictly concave function of $x_{it} \in [0, \infty)$. The function's maximizer, $x^*_{it}$, is given by Eq. (12). Using production function (1), one can calculate firm i's output of the homogeneous good, $y^*_{it}$, as stated in Eq. (13).
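To make the logic of the input choice concrete, the following worked derivation may help. It is only a sketch under assumed functional forms: a technology $y_{it} = A_t x_{it}^{\alpha}$ with a unit cost of investment, and the shorthand $\beta$ for the product of breach probability and tolerated bribe share. These forms are assumptions chosen for illustration and need not coincide with the paper's Eqs. (1) and (10) to (13).

```latex
\begin{align*}
\max_{x_{it} \ge 0}\;\; & (1 - p\beta)\, E^i_t\{A_t\}\, x_{it}^{\alpha} - x_{it}
  && \text{(assumed simplified objective)} \\
& \alpha\,(1 - p\beta)\, E^i_t\{A_t\}\, x_{it}^{\alpha - 1} = 1
  && \text{(first-order condition)} \\
& x^*_{it} = \bigl[\alpha\,(1 - p\beta)\, E^i_t\{A_t\}\bigr]^{\frac{1}{1-\alpha}}
  && \text{(maximizer)} \\
& y^*_{it} = A_t \bigl[\alpha\,(1 - p\beta)\, E^i_t\{A_t\}\bigr]^{\frac{\alpha}{1-\alpha}}
  && \text{(implied output)}
\end{align*}
```

Under these assumptions, investment falls as the expected bribe share $p\beta$ rises, and the strength of this response is governed by α.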

Second period
The final decisions of interest to be taken in period 2 are those by the firms on second-period investment. Those decisions are mostly analyzed in Section 4.1. What remains to be done here is to determine $E^i_2\{A_2\}$. When deciding on $x_{i2}$, firm i has just learned about $(y^*_{i1}, \zeta_{i1})$. As a result, if $P_1 \neq 0$, the firm can infer $S \in \{-1, 1\}$ from $(y^*_{i1}, \zeta_{i1})$; moreover, having observed $P_2$, it can identify $A_2$ with certainty (Eq. 2). Otherwise, if $P_1 = 0$, firms still reckon that S has taken the value −1 with probability 1/2 and the value 1 with the same probability; thus, $E^i_2\{A_2\} = 1$ irrespective of the choice of $P_2$. To summarize:

$$E^i_2\{A_2\} = \begin{cases} A_2 & \text{if } P_1 \neq 0, \\ 1 & \text{if } P_1 = 0. \end{cases} \qquad (14)$$

The first decision to be taken in period 2 is that by the government on second-period policy, $P_2 \in \{-1, 0, 1\}$. Objective function (9) implies that, at this point in time, the government wants to maximize $E_2\{z^*_{i2}\}$, where the expectation is formed at the beginning of period 2. First assume that the status-quo policy has been implemented in period 1 ($P_1 = 0$). For this case, Eq. (14) implies $E^i_2\{A_2\} = 1$. We therefore obtain Eq. (15). Again, since there is no information on the realization of S in this case, we have $E_2\{A_2(P_2)\} = 1$ for all $P_2 \in \{-1, 0, 1\}$. So the government is indifferent between the three options. Without loss of generality, we henceforth assume that it decides to keep the status-quo policy in place:

$$P_2 = 0 \quad \text{if } P_1 = 0. \qquad (16)$$

Now suppose that a reform policy has been implemented in period 1 ($P_1 \neq 0$). In this case, taking into account Eq. (13) and Eq. (14), we obtain Eq. (17). The expectation in Eq. (17) is now based on $r(Z^p_1)$, the ex-post probability that the first-period reform alternative is the beneficial one. Therefore:

Proposition 1 Suppose $P_1 \neq 0$. Then, in order to maximize $E_2\{z^*_{i2}(P_2)\}$, the government chooses $P_2$ according to

$$P_2 = \begin{cases} P_1 & \text{if } r(Z^p_1) \geq 1/2, \\ -P_1 & \text{otherwise.} \end{cases} \qquad (18)$$

Proof See Appendix A.

First period
If a reform policy has been implemented, the final activity in period 1 is the computation of the ex-post probability $r(Z^p_1) \equiv \Pr[P_1 = S \mid Z^p_1]$. When doing this computation, the government considers the firms' investments earlier in the period and the resulting implications for the output of the homogeneous good. The government knows that the firms have solved the maximization problem stated in Section 4.1 and accordingly that the level of output is as specified in Eq. (13). Moreover, it follows that $E^i_1\{A_1\} = 1$ since state S takes each of its two possible values with probability 1/2. So first-period output is pinned down by Eq. (13), evaluated at $E^i_1\{A_1\} = 1$. Given this, and considering Eq. (5) and Eq. (7), the government understands that $Z^p_1$ follows a normal distribution with a mean that depends on S and $P_1$: one mean applies if the implemented reform is the beneficial one ($P_1 = S$), another if it is not ($P_1 = -S$). Since each of these two possibilities materializes with probability 1/2, Bayes' rule implies

$$r(Z^p_1) = \frac{f_{Z^p_1}(Z^p_1 \mid P_1 = S)}{f_{Z^p_1}(Z^p_1 \mid P_1 = S) + f_{Z^p_1}(Z^p_1 \mid P_1 = -S)},$$

where $f_{Z^p_1}(\cdot)$ denotes the corresponding normal density. Using functional forms, we obtain Eq. (22). According to Eq. (22), $r(Z^p_1)$ is at least 1/2 exactly when $Z^p_1$ reaches a threshold $\hat{Z}^p_1$. In combination with Proposition 1, this implies that $P_2 = P_1$ if $Z^p_1 \geq \hat{Z}^p_1$ and $P_2 = -P_1$ otherwise.

Moving backwards to the firms' actual investment decisions, it is sufficient to refer to the above discussion and repeat that $x^*_{i1}$ and $y^*_{i1}$ are given by Eqs. (12) and (13), respectively, with $E^i_1\{A_1\} = 1$ irrespective of the actual policy decision.

The first decision to be taken in period 1 is that by the government on first-period policy. To inform this decision, the government compares the value of its objective function, $V = E_1\{z^*_{i1} + z^*_{i2}\}$, under the status quo to the value under any of the two reform alternatives. According to Eq. (16), $P_1 = 0$ implies $P_2 = 0$. So, if the government opts for the status quo in period 1 ($P_1 = 0$), we obtain $A_1 = A_2 = 1$. From this, it follows that the expected lifetime total revenue of the representative firm i is given by Eq. (24). Due to the symmetric setup, the government is indifferent between the two reform alternatives. Without loss of generality, we henceforth assume that $P_1 = 1$ if the government decides to abandon the status quo. In this case, $P_2$ is specified by Eq. (18) and the expected lifetime total revenue of the representative firm i is given by Eq. (25). Moreover:

Lemma 1 Suppose the government opts for a reform policy in period 1 (e.g., $P_1 = 1$). Then Eq. (26) holds, where $\Pr[P_2 = S]$ denotes the chance that in period 2 the beneficial reform policy is chosen and $\hat{A}$ is defined together with Eq. (26).

The results derived so far lead to the following conclusion:

Proposition 2 In period 1, the government prefers reform (e.g., $P_1 = 1$) to the status quo. In period 2, the government's policy choice is described by Eq. (18).
Proof The first statement of the proposition follows from Eqs. (24) and (25) and Lemma 1. The second statement follows from the first and Proposition 1.
In period 1, there are two factors that make the government prefer a reform policy to the status quo. First, if a reform policy is implemented, the government gains information about "what works"; this, in turn, allows for a better informed policy decision in period 2. Second, $y^*_{i2}$ is convex in $A_2$ (see Eqs. 13 and 14). For this reason, the government prefers taking a "symmetric risk" to obtaining the expected value with certainty.
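The updating step behind the second-period choice can be sketched in a few lines of Python. The sketch assumes two candidate normal distributions for the published statistic with a common standard deviation and equal prior weight, and it assumes that the reform is kept whenever the posterior is at least one half. The function names, the parameter names, and the numbers in the usage note are hypothetical.

```python
import math


def posterior_reform_beneficial(z, mean_if_beneficial, mean_if_harmful, sd):
    """Pr[reform is beneficial | statistic = z], equal priors, common standard deviation."""
    # Log-likelihood ratio of "beneficial" versus "harmful" for the observed z.
    llr = ((z - mean_if_harmful) ** 2 - (z - mean_if_beneficial) ** 2) / (2.0 * sd ** 2)
    return 1.0 / (1.0 + math.exp(-llr))


def second_period_policy(p1, z, mean_if_beneficial, mean_if_harmful, sd):
    """Keep the first-period reform if it is at least as likely beneficial as not, else reverse it."""
    r = posterior_reform_beneficial(z, mean_if_beneficial, mean_if_harmful, sd)
    return p1 if r >= 0.5 else -p1


def threshold(mean_if_beneficial, mean_if_harmful):
    """Cutoff for z, assuming mean_if_beneficial > mean_if_harmful and a common sd."""
    return 0.5 * (mean_if_beneficial + mean_if_harmful)
```

For instance, with hypothetical means 1.2 and 0.8 and a standard deviation of 0.5, `second_period_policy(1, 1.1, 1.2, 0.8, 0.5)` keeps the reform, while an observed value of 0.9 reverses it. With equal variances the posterior rule collapses to a simple cutoff at the midpoint of the two means, which is the role played by the threshold in the text.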

Optimal statistical capacity
Besides economic policy, the government determines technical statistical capacity, p, with a view to maximizing V. The key magnitude in its decision problem is the probability with which the beneficial reform will be implemented eventually (Eqs. 25 and 26).

Proposition 3 At the beginning of period 1, i.e., at the moment when the government decides to implement a reform policy (Proposition 2), the probability that the implemented second-period reform is in fact the beneficial one, $\Pr[P_2 = S]$, is given by Eq. (27), where $\Phi(\cdot)$ denotes the distribution function of the standard normal distribution and I is defined in Eq. (28).
Proof See Appendix A.
In what follows, we will call I the "informativeness of policy experimentation".¹⁶ As can be seen from Eq. (28), informativeness depends on five parameters, some of them unrelated to the statistical office: other things equal, if a reform is more significant (higher γ), or if the side businesses are less volatile (lower θ), informativeness is higher. On the other hand, informativeness is influenced by the statistical office. In Eq. (28), C(p; α, β) captures the entirety of channels by which the statistical office affects informativeness. For this reason, we will call C(p; α, β) a measure of "comprehensive statistical capacity". It is immediately apparent that the effect of technical statistical capacity on comprehensive statistical capacity is ambiguous if the bureaucracy's vulnerability to corruption is not equal to zero (β > 0). This reflects that firms, observing a positive relationship between p and expected bribe demands, reduce investment in response to a rise in p (Eq. 12). Note that α affects C(p; α, β) because this parameter governs the elasticity of investment with respect to expected bribe demands. C(p; α, β) has the following important properties:
Lemma 2 Comprehensive statistical capacity C(p; α, β) is a function of p on [0, 1] that has a unique maximizer, $p^* \in (0, 1]$. Moreover, C(p; α, β) is strictly concave on $[0, p^*)$.
Proof See Appendix A.
What level of technical statistical capacity maximizes comprehensive capacity? The answer depends on the bureaucracy's vulnerability to corruption:
Proposition 4 If the bureaucracy is sufficiently vulnerable to corruption, the level of technical statistical capacity that maximizes comprehensive statistical capacity C(p; α, β), and hence informativeness I(C; γ, θ), is strictly less than 1. In formal terms: if condition (R1) holds, we obtain $p^* < 1$, with $p^*$ given by Eq. (30), where we use $p^*$ to denote the maximizer of C(p; α, β).
Proof See Appendix A.
A rise in technical statistical capacity has two opposing effects on comprehensive statistical capacity and hence on the extent to which the estimate of average total firm revenue, $Z^p_1$, is informative for the policy decision in period 2. On the positive side, a rise in technical statistical capacity reduces the variance of the exogenous component of $Z^p_1$ (Eq. 7); all else equal, this helps informativeness. On the negative side, for any individual firm, a rise in technical statistical capacity implies a higher chance of being subjected to bribe collection. As a result, firms respond by reducing investment (Eq. 12), which dampens the impact of reforms on production; all else equal, this harms informativeness. The strength of the negative effect rises in the bureaucracy's vulnerability to corruption. If β is sufficiently large, the negative effect starts to dominate at a strictly interior level of technical statistical capacity.
16 The complementary probability to $\Pr[P_2 = S]$ is equal to the sum of the probabilities of a type-I error ($P_1 = S$ is rejected although true) and of a type-II error ($P_1 = S$ is not rejected although false).
Figure 5 illustrates the ambiguous role of technical statistical capacity. In both subfigures, the underlying assumption is that the implemented first-period reform is the beneficial one ($P_1 = S$); moreover, in both subfigures, the shaded areas correspond to the probabilities of committing a type-I error ($P_1 = S$ rejected although true) under low (blue) and high (red) technical statistical capacity, respectively. Subfigure (a) shows a case in which higher technical statistical capacity (red line) improves informativeness: an increase in p from $p_l$ to $p_h$ reduces the probability of a type-I error. In other words, the reduction in the variance of the exogenous component of $Z^p_1$ outweighs the reduction in the beneficial effect of the reform (leftward shift of the entire distribution).
By contrast, Subfigure (b) shows a situation in which an increase in technical statistical capacity leads to a higher probability of a type-I error: the reduction in the variance of the distribution is dominated by its shift to the left.
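The two opposing forces can be made tangible with a small numerical sketch. The functional form used below for comprehensive statistical capacity, `sqrt(p) * (1 - p*beta) ** (alpha / (1 - alpha))`, is an assumption made purely for illustration (a precision gain that grows with the square root of p, multiplied by an investment response that shrinks with the expected bribe share). It is not taken from the paper's Eq. (28), and all function names are hypothetical.

```python
import math


def comprehensive_capacity(p, alpha, beta):
    """ASSUMED form: sampling precision sqrt(p) times the investment response to bribes."""
    return math.sqrt(p) * (1.0 - p * beta) ** (alpha / (1.0 - alpha))


def grid_argmax(f, steps=10_000):
    """Maximizer of f on an evenly spaced grid over [0, 1]."""
    return max((i / steps for i in range(steps + 1)), key=f)


def closed_form_maximizer(alpha, beta):
    """Maximizer implied by the assumed form via its first-order condition, capped at 1."""
    return min(1.0, (1.0 - alpha) / ((1.0 + alpha) * beta))


if __name__ == "__main__":
    for beta in (0.2, 0.8):  # low versus high vulnerability to corruption (hypothetical values)
        p_star = grid_argmax(lambda p: comprehensive_capacity(p, 0.5, beta))
        print(beta, p_star, closed_form_maximizer(0.5, beta))
```

Under this assumed form, the maximizer sits at the upper bound when vulnerability is low and moves into the interior once vulnerability is high enough, which is the qualitative pattern described in the text.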
However, informativeness, and with it the probability of choosing the beneficial reform alternative in period 2, is not the only variable the government has to consider when determining the level of p that maximizes V, the expected lifetime total firm revenue. The negative static effect of technical statistical capacity on firm investment (and hence production) must be taken into account, too. We therefore obtain the following result:
Proposition 5 Consider a benevolent government that aims at maximizing economic performance by experimenting with a set of policies and that can implement any level of technical statistical capacity at no cost. Yet assume that the government faces a bureaucracy that is sufficiently vulnerable to corruption such that condition (R1) holds. When choosing the level of technical statistical capacity, this government neither opts for 1 (maximum level) nor for $p^* < 1$ (the level that maximizes comprehensive statistical capacity and hence the informativeness of policy experimentation). Instead, the government finds it optimal to choose a level $p^{**}$ that is (substantially) lower than $p^*$: $p^{**} < p^* < 1$.
Proof See Appendix A.
Figure 6 illustrates how I(C; γ, θ) and $V = E_1\{z^*_{i1} + z^*_{i2}\}$ depend on technical statistical capacity, assuming that condition (R1) holds. Both curves are hump-shaped. Proposition 5 predicts that V peaks at a strictly lower level of technical statistical capacity than C does. Since the maximizer of C also maximizes I, the peak of I must lie to the right of the peak of V.
In Fig. 6, $p^{**}$ takes a particularly low value. So technical statistical capacity is weak, as is the case in many developing countries. The figure thus conveys the message that real-world instances of low technical statistical capacity should be interpreted with care. They are compatible with completely different versions of reality. Clearly, "poor numbers" may be an expression of neglect and/or a lack of resources and expertise. However, keeping technical statistical capacity weak may also be the best response by a capable government that is mindful of facing a corrupt bureaucracy and problems with data confidentiality like those illustrated in Section 2.2. "Flying blind" may be a well-considered choice.
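Why the revenue-maximizing level of technical statistical capacity lies below the informativeness-maximizing level can be illustrated numerically. Everything below is a sketch under assumed functional forms: comprehensive capacity is taken to be `sqrt(p) * (1 - p*beta) ** (alpha / (1 - alpha))`, informativeness is taken to be `(gamma / theta)` times that capacity, the probability of picking the beneficial reform is the standard normal distribution function of informativeness, and expected lifetime revenue is a static investment response multiplied by first-period plus expected second-period productivity terms. None of these forms is taken from the paper's equations, and the parameter values are hypothetical.

```python
import math


def std_normal_cdf(x):
    """Distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def capacity(p, alpha, beta):
    """ASSUMED comprehensive statistical capacity."""
    return math.sqrt(p) * (1.0 - p * beta) ** (alpha / (1.0 - alpha))


def informativeness(p, alpha, beta, gamma, theta):
    """ASSUMED informativeness: reform size relative to noise, scaled by capacity."""
    return (gamma / theta) * capacity(p, alpha, beta)


def lifetime_revenue(p, alpha, beta, gamma, theta):
    """ASSUMED V(p): static investment response times (period 1 + expected period 2)."""
    scale = (1.0 - p * beta) ** (alpha / (1.0 - alpha))
    hit = std_normal_cdf(informativeness(p, alpha, beta, gamma, theta))
    good = (1.0 + gamma) ** (1.0 / (1.0 - alpha))  # beneficial reform in place
    bad = (1.0 - gamma) ** (1.0 / (1.0 - alpha))   # harmful reform in place
    return scale * (1.0 + hit * good + (1.0 - hit) * bad)


def grid_argmax(f, steps=10_000):
    """Maximizer of f on an evenly spaced grid over [0, 1]."""
    return max((i / steps for i in range(steps + 1)), key=f)


if __name__ == "__main__":
    params = dict(alpha=0.5, beta=0.8, gamma=0.5, theta=0.2)
    print(grid_argmax(lambda p: informativeness(p, **params)))
    print(grid_argmax(lambda p: lifetime_revenue(p, **params)))
```

In this sketch the static loss from lower investment pulls the revenue peak to the left of the informativeness peak, in line with the ordering stated in Proposition 5.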

The effect of corruption
An increase in the bureaucracy's vulnerability to corruption amplifies the negative effect of statistical capacity on firm investment. As a result, the level of technical statistical capacity that maximizes comprehensive statistical capacity and informativeness, $p^*$, is a decreasing function of β (Eq. 30). Simulations suggest that a similar result holds for the level of statistical capacity that maximizes the expected lifetime total firm revenue, $p^{**}$. Figure 7 illustrates that $p^{**}$ shifts to the left as β increases. Consistent with this, we find that a rise in β has negative consequences for the two outcomes of interest:
Proposition 6 Suppose that the bureaucracy is sufficiently vulnerable to corruption so that condition (R1) holds. Then, (i) $C(p^{**}; \alpha, \beta)$, and thus I(C; γ, θ), are strictly decreasing functions of β;
There is a large literature on the channels by which corruption affects economic performance. Proposition 6 puts the spotlight on a novel one. An increase in the bureaucracy's vulnerability to corruption reduces the informativeness of policy experiments (a process that is understood to involve the implementation, evaluation, and adjustment of reform policies). Two mechanisms are simultaneously at work. On the one hand, an increase in corruption dampens the economy's response to policy changes (Eq. 13). On the other hand, as can be inferred from Fig. 7, higher corruption induces the government to lower the level of technical statistical capacity, which makes navigating among the different reform alternatives even more challenging. This is also reflected in the economy's growth rate:
Proposition 7 Suppose that the bureaucracy is sufficiently vulnerable to corruption so that condition (R1) holds. Then, the expected growth rate due to policy experimentation is a strictly decreasing function of β.
Proof Follows immediately from Proposition 6.
So the consequences of an increase in corruption are not limited to a mere level effect; higher corruption also flattens the path of production over time.
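The comparative statics with respect to corruption can be sketched as follows. The code tracks the informativeness-maximizing level of capacity rather than the revenue-maximizing level that Proposition 6 refers to, and it relies on an assumed functional form for comprehensive statistical capacity, `sqrt(p) * (1 - p*beta) ** (alpha / (1 - alpha))`, which is not taken from the paper. Function names and parameter values are hypothetical.

```python
import math


def capacity(p, alpha, beta):
    """ASSUMED comprehensive statistical capacity."""
    return math.sqrt(p) * (1.0 - p * beta) ** (alpha / (1.0 - alpha))


def capacity_maximizer(alpha, beta):
    """Maximizer of the assumed form via its first-order condition, capped at 1."""
    return min(1.0, (1.0 - alpha) / ((1.0 + alpha) * beta))


def peak_capacity(alpha, beta):
    """Highest comprehensive capacity attainable for a given vulnerability beta."""
    return capacity(capacity_maximizer(alpha, beta), alpha, beta)


if __name__ == "__main__":
    for beta in (0.4, 0.6, 0.8):  # rising vulnerability to corruption
        print(beta, capacity_maximizer(0.5, beta), peak_capacity(0.5, beta))
```

Under the assumed form, both the maximizer and the attainable peak fall as vulnerability rises, so even an optimally chosen sampling scale cannot undo the loss in informativeness.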

Informality and misreporting
In principle, sampled firms have an incentive to underreport total revenue since they must anticipate the possibility of proportional bribe demands. However, so far, we have maintained the assumption that the firms nonetheless report truthfully, e.g., because any misreporting may be detected and penalized by the statistical office. While this is a strong assumption, we can relax it without changing the main insights from our analysis. In this section, we sketch a modified version of the model in which firms can shift parts of their production into informal operations whose output is perfectly hidden-and thus not reported to the statistical office. The exposition here concentrates on the impact of exogenous changes to technical statistical capacity on firm investment and the informativeness of policy experimentation.
Consider the following modifications. First, in addition to the standard technology represented by production function (1), firms have access to a second technology by means of which the homogeneous good can be produced. The output of the second technology, in contrast to that of the standard technology, cannot be observed by any other actor in the economy. Therefore, firms will not report it. It is natural to regard the use of the second technology as an informal activity. Thus, if a firm relies on the second technology in addition to the standard one, it runs formal as well as informal operations.¹⁷ Second, in each period, firms must decide on the allocation of a fixed amount of a resource (e.g., managerial time) across their formal and informal operations. In this context, we continue to use the wording "investment": the part of the resource that goes to the formal operations is called formal investment, the rest is called informal investment. Finally, the government no longer perfectly understands the firms' decision problem. Yet it knows the standard technology and observes how much firms invest formally.¹⁸ As a result, following a reform in the first period, the government can still compute $r(Z^p_1)$, the ex-post probability that the implemented reform is the beneficial one.
For concreteness, suppose the second technology is represented by a production function in which B > 0 captures productivity and $x^n_{it}$ and $y^n_{it}$ refer to, respectively, informal investment and the resulting informal output of the homogeneous good. Independence from policy is assumed for simplicity only. We use $\bar{x} > 0$ to denote the fixed amount of the resource to be allocated. Note that $x^n_{it} + x^f_{it} = \bar{x}$, where in this section $x^f_{it}$ stands for formal investment (and $y^f_{it}$ stands for the formal output that results via production function (1)). Parameter restriction (34) is sufficient to ensure that firms always run formal and informal operations in parallel. In what follows, we assume that restriction (34) holds. Moreover, without loss of generality, we impose B = 1. With these assumptions, the firms solve maximization problem (11′). When we compare problem (11), the maximization problem in the baseline model, to problem (11′), it is immediately clear that the two objective functions are identical except for the constant $\bar{x}$ in problem (11′). As a result, the two problems have identical maximizers, $x^{f*}_{it} = x^*_{it}$, where $x^*_{it}$ is given by Eq. (12). Restriction (34) guarantees that $x^{n*}_{it} = \bar{x} - x^{f*}_{it}$ is strictly positive.
How do the results regarding the impact of technical statistical capacity, p, differ between the baseline and the modified model? With regard to investment, there are similarities as well as differences. The similarity is that formal investment is still a monotonically decreasing function of p; a higher chance of being subjected to proportional bribe collection continues to have a disincentivizing effect. The difference is that in the modified model total investment is unaffected by p; a rise in p merely shifts investment into informal operations, away from formal ones. So the fall in output coming from formal operations is accompanied by a rise in informal output. In net terms, there is still a loss since the misallocation of investment across operations worsens.¹⁹ However, for identical parameters, the negative effect of a given increase in p on total output of the homogeneous good is smaller than in the baseline model.
With regard to the informativeness of policy experimentation, there is no difference between the baseline and the modified model. The negative effect of p on formal investment continues to dampen the impact of reform policies. In fact, the level of p that maximizes informativeness, $p^*$, is unchanged (Eq. 30). Yet what level of p should be chosen in the presence of informal operations? If the government were interested in formal-operations output only, Proposition 5 would continue to apply: the government would choose a $p^{**}$ that is markedly smaller than $p^*$. But if the government factors in that losses in formal output are partly compensated by gains in informal output (which, however, it does not exactly observe), it may want to push p closer to $p^*$. By how much must depend, among other things, on the government's attitude towards informality. Regardless of this, we can summarize that the main insights from the baseline model are robust to the introduction of informality and misreporting.
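The reallocation between formal and informal operations can be illustrated with a short sketch. It assumes that formal investment equals `[alpha * (1 - p*beta) * E{A}] ** (1 / (1 - alpha))`, which would follow from a technology of the form A times investment to the power α with a linear informal alternative of unit productivity. This form is an assumption for illustration and is not taken from the paper's Eq. (12). Function names and parameter values are hypothetical.

```python
def formal_investment(p, alpha, beta, expected_productivity=1.0):
    """ASSUMED maximizer of the formal-investment problem."""
    return (alpha * (1.0 - p * beta) * expected_productivity) ** (1.0 / (1.0 - alpha))


def allocate_resource(p, alpha, beta, x_bar, expected_productivity=1.0):
    """Split the fixed resource x_bar into (formal, informal) investment."""
    formal = formal_investment(p, alpha, beta, expected_productivity)
    if formal >= x_bar:
        # Mirrors the role of the parameter restriction: both operations must run in parallel.
        raise ValueError("x_bar is too small for informal operations to be active")
    return formal, x_bar - formal


if __name__ == "__main__":
    for p in (0.2, 0.8):  # low versus high technical statistical capacity
        print(p, allocate_resource(p, alpha=0.5, beta=0.8, x_bar=1.0))
```

In the sketch, raising p lowers formal investment and raises informal investment one for one, so total investment stays at the fixed resource amount, as described in the text.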

Summary and conclusion
One approach to boost economic output is policy experimentation: tweak an existing economic policy, evaluate the consequences, and so discover "what works". While economists have been interested in this approach for a long time, it has recently gained traction in the context of development policy. Clearly, how much policy makers can learn from a policy experiment (i.e., the degree of the experiment's informativeness) must depend on the accuracy with which the economy is measured. In developing countries, accuracy tends to be low. As a result, efforts to help developing countries improve their statistical capacities are of great importance. In this paper, we demonstrate that efforts with a sole focus on technical aspects, here understood as the scale of data gathering, need not be unambiguously helpful. If confidentiality breaches are to be expected and control of corruption is weak, more information gathering by the statistical office means that firms face a higher risk of being subjected to bribe collection. As a result, firms reduce their investments, thereby dampening the effect of policy changes. The relative strength of this harmful effect rises in the level of technical statistical capacity. At some point, it becomes the dominant force, implying that further improvements in technical statistical capacity make it harder, rather than easier, to discover "what works".
Against this background, we argue that efforts with the aim of expanding data gathering should not be uniform but adapted to local circumstances. Broadly speaking, our analysis suggests that such efforts be aligned with the local institutional setting: in places where extensive bribe collection is the norm and data confidentiality is in doubt, the extent of data gathering, and hence the precision with which the economy is measured, should be more limited. Viewed from a different angle, our analysis shows that efforts to address insufficient technical statistical capacity should come with additional arrangements. In particular, taking measures to reinforce data confidentiality would dampen the observer effect and so permit the analysis of a less distorted economy, including the economy's response to policy changes. A list of such measures would certainly include the strengthening of cyber defenses. Yet, when dealing with an extractive bureaucracy in which a practice of expansive data sharing has taken root, institutional changes might be needed, too. They could range from fortifying the statistical system's independence from the rest of the bureaucracy to more robust measures such as the outright outsourcing of statistical services to an independent international body. Finally, we believe that this paper may contribute to a better understanding of the intricate ways through which corruption retards economic growth. By stifling private economic activity, and by limiting the optimal degree of measurement precision, corruption makes it harder for policy makers to discover how to tailor economic policies to local circumstances.