1 Introduction

Since the 1990s there has been a widespread liberalization of the electricity sector in Europe, the Americas, and Australasia (Pollitt, 2012). This has variously involved vertical unbundling, the introduction of competition, and privatization. For IO economists this process is of interest because it bears on debates about the benefits of vertical integration between stages of production (generation, transmission, distribution, and retail in the case of electricity), the cost-reduction benefits of competition (which has been introduced at the generation and retail stages), and the welfare effect of privatization.

Three key problems arise in measuring the effects of this liberalization process. These apply to other liberalization processes (e.g. telecoms, gas, water and rail), but are especially important in the electricity sector, which is the focus of our study:

First, the nature of the firm that is being analyzed often changes due to the vertical unbundling, making it difficult to make use of corporate data through time. For electricity, before-and-after comparability problems are exacerbated by the fact that vertical integration is often tapered. This means that putting unbundled companies back together to track their performance is problematic. Studies such as Newbery and Pollitt (1997) explicitly address this by examining the performance of one integrated company (e.g., the CEGB in the UK) and reconstructing a virtual integrated comparator from the unbundled successor companies. This careful approach is difficult to replicate for a large panel of companies.

Second, it is often the case that multiple interventions—vertical unbundling, the introduction of competition, and privatization—happen almost simultaneously; and any empirical evaluation may struggle to distinguish these individual effects adequately. One approach is to use panel data at the country level and exploit the differences in the degree of unbundling, competition, and privatization; a good early example of this is Steiner (2003). A shortcoming of this approach is that it does not examine individual firm performance, and the high-level analysis struggles to distinguish satisfactorily between reform elements that may be packaged—and not viewed as separable—by individual governments.

Third, analyses that attempt to respond to the first two problems narrow their focus to examine only one vertical stage of the electricity industry. Thus, Fabrizio et al. (2007) examine the effect of liberalization of the US electricity industry only on generation plants, while Kwoka et al. (2010) examine the effect of liberalization on electricity distribution businesses in the US. This approach raises the question of whether the benefits of liberalization identified in one stage of production are offset by losses in other stages of production. For example, do the apparent benefits of competition in generation (lower wholesale power costs) come at the expense of higher costs in transmission and distribution (i.e., higher network costs)? Indeed, what does a firm-level analysis of costs tell us about the overall cost effect of electricity liberalization on the whole electricity industry?

This paper seeks to address all three of these problems by using data on US investor-owned electric utilities that span the sector's liberalization in the 1990s and early 2000s.

On the first problem, we exploit the granularity of available regulatory information on costs at three vertical stages of production (generation, transmission, and distribution) and the flexibility of the efficiency analysis technique (following Farrell, 1957) to measure relative performance for combined or separated activities.

On the second problem, we exploit the fact that in the US electricity liberalization some states did not liberalize at all: The structure and degree of competition that faced their electric utilities did not change from the pre-liberalization situation. Meanwhile in states that did liberalize their electricity sectors, some firms vertically unbundled, and some did not. This allows a difference-in-differences approach to be used to identify both the combined effect of competition and unbundling and the separate effects of competition and unbundling.

On the third problem, we examine generation, transmission, and distribution stages individually and collectively. Our analysis is thus able to answer the question of whether there is a trade-off by which apparent gains in one vertical stage are offset by apparent losses in another vertical stage and what the overall effect is. This is important because several US states actually approved the shifting of fixed costs from generation to transmission and distribution in order to make restructuring appear more successful in terms of its competition effect (Maloney, 2001).

We believe our analysis is the first to attempt to examine the aggregate cost effect of US electricity liberalization with the use of corporate data, while separating competition and vertical separation effects.

Our analysis sheds light on the apparently contradictory findings of earlier US studies that focused on vertical integration or competition. The literature on vertical integration in electricity has long suggested vertical economies of scope (e.g., Kaserman and Mayo, 1991; Kwoka, 2002; Arocena et al., 2012), which could be lost during liberalization. Meanwhile the literature that examines the introduction of competition in electricity generation shows cost reductions (Fabrizio et al., 2007; Bushnell and Wolfram, 2005). The finding of vertical economies is mirrored among Japanese and European electric power companies (Nemoto and Goto, 2004; Gugler et al., 2017). An earlier attempt to separate privatization and competition benefits in the UK electricity generation market shows a positive effect on productivity of privatization, but not subsequent competition (Triebs and Pollitt, 2019). Recent analyses have called into question whether vertical economies of scope are different for integrated and non-integrated firms (Triebs et al., 2016), which reopens the empirical question of whether vertical economies actually exist, given the theoretical and empirical limitations of pre-liberalization studies (Pollitt and Steer, 2012).Footnote 1

This paper is divided into five further sections: Sect. 2 gives the institutional background and the identification strategy. Section 3 discusses the different technology models and the difference-in-differences approach. Section 4 summarizes the data and gives details on our variables. Section 5 gives the results, and Sect. 6 concludes.

2 Background and Identification

In the US, historically, most electricity was provided by privately owned, vertically integrated, rate-of-return regulated, franchise monopolies.Footnote 2 These utilities covered the entire supply chain, which Fig. 1 illustrates.

Fig. 1 Electricity supply chain. Schematic graph of the electricity supply chain. Source: https://www.eia.gov/energyexplained/electricity/delivery-to-consumers.php

Power plants generate electricity, which is then delivered via transmission and distribution cables to households and industry. Power plants comprise different generation technologies and fuels: e.g., coal-fired, gas-fired, nuclear, renewable.

Starting in the 1970s the US federal regulator tried to create competition in the electricity generation market by encouraging merchant entry and facilitating third-party transmission network access. Continued network access discrimination by incumbent utilities led to the 1992 Energy Policy Act, which regulated wholesale transmission access. This was followed by functional and control separation of transmission in 1996 and 1999, respectively. Control separation means that incumbent utilities still own transmission assets, but system control lies with newly created independent system operators (ISOs).

In parallel with federal regulation of merchant entry and transmission network access, many states introduced competition for electricity generation. These state-level reforms are our first source of quasi-experimental variation. Following Fabrizio et al. (2007), we define a state as introducing a competitive electricity generation market—a discrete change—if it starts formal hearings on restructuring and subsequently passes a law. Introduction of competition and restructuring are used synonymously. All 50 states initiated hearings on restructuring between 1993 and 1999. By 2000, about half of the states had passed restructuring laws. Whereas competition always comprised wholesale electricity competition, it might include retail competition. We ignore the latter because most retail competitors buy all of their electricity. We assume retail competition does not further increase competition in the electricity generation market.

Our second source of quasi-experimental variation is the vertical separation of electricity generation from distribution. We define this divestiture (a discrete change) as follows: We start with a utility that has the local distribution activity, keeps it, and remains regulated throughout. We record an incumbent utility as divested after a 50% reduction in generation plant book value; we follow the definition of Kwoka et al. (2010). After a careful analysis of state Public Utility Commissions’ records, Kwoka et al. (2010, p. 92) conclude that this is a good threshold for “major policy-induced divestiture”.Footnote 3 According to Bushnell and Wolfram (2005, p. 7) divestitures were driven by state policy and not firm strategy.Footnote 4 It is necessary to define some cut-off because: (i) we believe that the underlying policy decision is discrete, but divestiture is not necessarily complete; and (ii) a utility’s generation capacity might fluctuate independently of restructuring. We show that qualitatively our results are robust to a 75% cut-off.

Also, instead of using book value, one could use physical generation capacity to define divestiture. For our data the correlation coefficient between the two measures is 0.61, but the book-value-based measure produces more reasonable divestiture cases. It has a stronger correlation with our competition indicator: 0.33 as opposed to 0.28 for the physical-capacity-based measure. Also, the physical-capacity-based measure records divestitures before 1998 and for non-restructuring states, which according to Bushnell and Wolfram (2005) is not accurate. In any case, we show that qualitatively, most of our results are robust to a physical-capacity-based measure of divestiture.

In addition to the 50% year-on-year drop in plant value, we require that own electricity generation accounts for at least 25% of total requirements in the year before divestiture and that, after divestiture, firms generate at most 50% of their requirements. For a given firm, we account for only the first divestiture; the firm counts as divested for all subsequent years. Table 3 in Appendix 1 lists the names of the 29 utilities that we count as divested and the year for which the divestiture is recorded.Footnote 5 Note that due to gaps in the data, the recorded year might be after the actual year of divestiture.
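The divestiture rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout and the field names `gen_plant_value` and `own_gen_share` are hypothetical, and consecutive observed records are compared even if a year is missing in between (which is one way the recorded year can fall after the actual year of divestiture).

```python
from typing import Mapping, Optional, Sequence


def first_divestiture_year(
    records: Sequence[Mapping[str, float]],
    min_drop: float = 0.50,              # required drop in generation plant book value
    min_own_share_before: float = 0.25,  # own generation / requirements, year before
    max_own_share_after: float = 0.50,   # own generation / requirements, year after
) -> Optional[int]:
    """Return the first year that meets all three conditions, else None.

    `records` must be sorted by year; field names are hypothetical.
    """
    for prev, cur in zip(records, records[1:]):
        if prev["gen_plant_value"] <= 0:
            continue  # nothing left to divest
        drop = 1.0 - cur["gen_plant_value"] / prev["gen_plant_value"]
        if (drop >= min_drop
                and prev["own_gen_share"] >= min_own_share_before
                and cur["own_gen_share"] <= max_own_share_after):
            return int(cur["year"])
    return None


def divested_flags(records, **thresholds):
    """Only the first divestiture counts; the unit stays divested afterwards."""
    year = first_divestiture_year(records, **thresholds)
    return [year is not None and r["year"] >= year for r in records]


# Made-up utility, purely for illustration.
utility = [
    {"year": 1997, "gen_plant_value": 100.0, "own_gen_share": 0.90},
    {"year": 1998, "gen_plant_value": 95.0, "own_gen_share": 0.85},
    {"year": 1999, "gen_plant_value": 30.0, "own_gen_share": 0.20},
    {"year": 2000, "gen_plant_value": 10.0, "own_gen_share": 0.05},
]
print(first_divestiture_year(utility))
print(divested_flags(utility))
```

Passing `min_drop=0.75` gives the stricter cut-off used as a robustness check.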

We have now defined competition and divestiture; but the latter leads to a complication: After a divestiture we still observe the distribution unit because it remains a regulated utility throughout, but we no longer observe divested generation activities. The same is true for transmission activities that fall under the responsibility of an ISO. However, after the divestiture of a generation plant the distribution entity still reports its cost of purchased power, which we use as a proxy for stand-alone generation cost. As we explain below we use cost measures as proxies for input quantities to estimate technical inefficiency.

For instance, if a utility sold all of its generation capacity to a newly formed independent generation company and bought all of its electricity, the cost of purchased power would reflect the cost of this new, stand-alone generation company. In reality however, a divesting utility might sell different plants to different companies and buy its requirements from different utilities, whether integrated or not. Hence, our proxy captures a “virtual” stand-alone generation company. To reflect this, in the remainder we refer to “power sourcing” instead of generation.

The same issue applies to the transmission activity where it is transferred to an ISO. (Note that we do not analyze the separation of transmission.) In particular, we do not observe the operating expenses of ISOs. We also do not observe a transmission fee (the equivalent of purchased power) on the accounts of the utility. But this omission should not affect our results much. First, transmission costs are very low compared to generation and distribution costs, as is shown in Table 1 below.Footnote 6 Second, the establishment of ISOs does not coincide with restructuring; the former mostly predates the latter. Thus, our difference-in-differences approach should control for some of these omitted costs/inputs.

Given these two sources of variation (state-level differences in competition and utility-level divestiture), we identify the combined effect of competition and divestiture by comparing divested units in restructuring states to non-divested units in non-restructuring states: the control group. Non-restructuring states experienced no divestitures, or at least no policy-induced divestitures. We identify the divestiture effect by comparing divested units to non-divested units in restructuring states; the latter are our alternative control group. As divestitures were a consequence of restructuring, our two sources of variation are not independent.

As we are interested in the potential trade-off between competition and divestiture, we define the competition effect as the difference between these two. Consequently we ignore the competition effect for non-divested units in restructuring states. This leads us to underestimate the overall competition effect; but we capture the competition effect that is relevant for the trade-off that we analyze here. Our estimate of the combined effect is thus conservative.
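The grouping logic behind these comparisons can be written down as a small sketch. The group labels and function names are our own, and the numbers in the example are made up.

```python
def assign_group(divested: bool, restructuring_state: bool) -> str:
    """Map a utility to its role in the two comparisons (labels are hypothetical)."""
    if divested and restructuring_state:
        return "treated"                 # divested units in restructuring states
    if not divested and not restructuring_state:
        return "control_combined"        # identifies divestiture plus competition
    if not divested and restructuring_state:
        return "control_divestiture"     # identifies divestiture only
    return "excluded"                    # divested unit in a non-restructuring state


def competition_effect(combined_effect: float, divestiture_effect: float) -> float:
    """Competition effect defined as the difference between the two estimates."""
    return combined_effect - divestiture_effect


print(assign_group(divested=True, restructuring_state=True))
print(assign_group(divested=False, restructuring_state=True))
print(competition_effect(combined_effect=10.0, divestiture_effect=4.0))
```

Because the non-divested units in restructuring states serve as a control group, any competition effect on those units is differenced out rather than measured.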

State-level restructuring is unlikely to be fully random: States with higher electricity prices were probably more likely to reform (White et al., 1996; Knittel, 2006). Our difference-in-differences calculation controls for unobserved, time-invariant heterogeneity and assumes that in the absence of the treatment productivity would have developed in a parallel fashion for the treatment and control groups. Section 3.4 below gives the details for these calculations.

3 Modeling

At the heart of our analysis lies a measure of a decision-making unit’s inefficiency. As usual for productivity measurement we need to model the technology—here, the production function—to be able to compare different units for multiple inputs and outputs. There are different approaches to aggregate inputs and outputs for comparison across units (Van Biesebroeck, 2007). We use a nonparametric, deterministic frontier model (Farrell, 1957) where a unit’s inefficiency is relative to observed best practice.Footnote 7

3.1 Technology and Efficiency

Suppose that we have N production units such that each of the N units has K inputs—\(\varvec{\textbf{x}}\in \mathbb {R}_{+}^{K}\)—and Q outputs: i.e., \(\textbf{y}\in \mathbb {R}_{+}^{Q}\). Then we can express the production possibility set or technology T as

$$\begin{aligned} T=\{(\varvec{\textbf{x}},\textbf{y})\in \mathbb {R}_{+}^{K} \times \mathbb {R}_{+}^{Q}:\varvec{\textbf{x}}\,\text {can} \,\text {produce}\,\textbf{y}\}. \end{aligned}$$

We assume that technology T satisfies the following standard axioms:Footnote 8

(i) Possibility of inaction and no free lunch: \((\varvec{0},\varvec{0})\in T\); and if \((\varvec{0},\varvec{y})\in T\), then \(\varvec{y}=\varvec{0}\).

(ii) T is a closed subset of \(\mathbb {R}_{+}^{K}\times \mathbb {R}_{+}^{Q}\).

(iii) Strong input and output disposal: if \((\varvec{x},\varvec{y})\in T\) and \((\varvec{x}',\varvec{y}')\in \mathbb {R}_{+}^{K}\times \mathbb {R}_{+}^{Q}\), then \((\varvec{x}',-\varvec{y}')\ge (\varvec{x},-\varvec{y})\Rightarrow (\varvec{x}',\varvec{y}')\in T\).

Finally, we assume convexity for one of our technology models but not for others:

(iv) T is a convex set.

We focus on input-oriented efficiency, and it is convenient to consider the input requirement set \(L(\varvec{y})=\{\varvec{x}\mid (\varvec{x},\varvec{y})\in T\}\) that is associated with technology T. The set includes all input vectors \(\varvec{x}\) that can produce at least a given output vector \(\varvec{y}\). Given this technology constraint, the input-oriented efficiency measure is defined as follows:

$$\begin{aligned} E(\varvec{x},\varvec{y})=\min _{\theta }\{\theta \mid \theta \ge 0,\theta \varvec{x}\in L(\varvec{y})\}=\min _{\theta }\{\theta \mid \theta \ge 0,(\theta \varvec{x},\varvec{y})\in T\}. \end{aligned}$$
(1)

The radial efficiency measure \(E(\varvec{x},\varvec{y})\) satisfies \(0<E(\varvec{x},\varvec{y})\le 1\). When a unit’s production is efficient and lies on the boundary (isoquant) of \(L(\varvec{y})\) its efficiency value is one. For our purposes we transform \(\theta\) and obtain the inefficiency score \(\gamma =1-\theta .\) The non-technical reader can skip the next sub-section on the details of the technology without loss.

3.2 Nonparametric Frontier Technology

Assume that we have N observed production units with input–output combinations \((\varvec{x}_{j},\varvec{y}_{j})\in \mathbb {R}_{+}^{K} \times \mathbb {R}_{+}^{Q}\,(j\in \{1,\dots ,N\})\). Then we can represent our convex, nonparametric, variable returns to scale (VRS) frontier technology, referred to as Data Envelopment Analysis (DEA) (Charnes et al., 1978; Banker et al., 1984), as follows:

$$\begin{aligned} T^{DEA}=\left\{ (\varvec{x},\varvec{y})\, \mid \,\varvec{x}\ge \sum _{j=1}^{N}\varvec{x}_{j} \lambda _{j},\varvec{y}\le \sum _{j=1}^{N}\varvec{y}_{j} \lambda _{j},\sum _{j=1}^{N}\lambda _{j}=1,\lambda _{j}\ge 0\right\} , \end{aligned}$$
(2)

where the activity vector \(\varvec{\lambda }=(\lambda _{1},...,\lambda _{N})\in \mathbb {R}_{+}^{N}\), whose elements sum to unity, gives a convex technology. The nonconvex, nonparametric frontier technology, which is referred to as the free disposal hull (FDH) (Deprins et al., 2006), is:

$$\begin{aligned} T^{FDH}=\left\{ (\varvec{x},\varvec{y})\,\mid \,\varvec{x}\ge \sum _{j=1}^{N}\varvec{x}_{j}\lambda _{j},\varvec{y}\le \sum _{j=1}^{N}\varvec{y}_{j}\lambda _{j},\sum _{j=1}^{N}\lambda _{j}=1,\lambda _{j}\in \{0,1\}\right\} . \end{aligned}$$
(3)

where each vector element of \(\varvec{\lambda }\) being a binary gives a nonconvex technology. Nonconvexity implies that returns to scale are variable.

We can now model the input-oriented efficiency \(E(\varvec{x},\varvec{y})\) for (2), when unit i is under evaluation, as follows:

$$\begin{aligned} \min _{\theta ,\varvec{\lambda }}&\,\theta \nonumber \\ \mathrm {s.t.}&\sum _{j=1}^{N}\varvec{x}_{j}\lambda _{j}\le\theta \varvec{x}_{i},\nonumber \\&\sum _{j=1}^{N}\varvec{y}_{j}\lambda _{j}\ge\varvec{y}_{i},\nonumber \\&\sum _{j=1}^{N}\lambda _{j}=1,\nonumber \\&\lambda _{j}\ge 0,j=1,...,N. \end{aligned}$$
(4)

The problem is solved N times and the nonconvex FDH model is obtained by replacing \(\lambda _{j}\ge 0\) with \(\lambda _{j}\in \left\{ 0,1\right\}\).Footnote 9

The comparison plan, which consists of \(\textbf{Y}\mathbf {\varvec{\lambda }}\) and \(\textbf{X}\varvec{\lambda }\) (with dimensions \(Q\times N\) for \(\textbf{Y}\) and \(K\times N\) for \(\textbf{X}\)), gives the best-practice (projected) frontier. The scalar \(\theta\) is the technical efficiency measure; it ranges from 0 to 1. It captures the maximum radial contraction of inputs that projects unit i onto the frontier. If, relative to peers, inputs cannot be contracted, the unit is fully efficient, and its score is 1.
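For the nonconvex (FDH) case, problem (4) with binary \(\lambda _{j}\) reduces to a search over observed units, which the following minimal sketch illustrates. It is not the authors' implementation; the function names and the four example units are made up, and all inputs are assumed to be strictly positive.

```python
def fdh_input_efficiency(X, Y, i):
    """Input-oriented FDH efficiency (theta) of unit i.

    X[j] and Y[j] are the input and output vectors of unit j.
    With binary lambdas exactly one observed unit j serves as the peer:
    it must produce at least y_i, and the smallest theta that satisfies
    x_j <= theta * x_i is max_k x_jk / x_ik.
    """
    best = float("inf")
    for xj, yj in zip(X, Y):
        if all(yjq >= yiq for yjq, yiq in zip(yj, Y[i])):
            theta = max(xjk / xik for xjk, xik in zip(xj, X[i]))
            best = min(best, theta)
    return best  # unit i is always its own peer, so best <= 1


def fdh_inefficiency(X, Y, i):
    """Inefficiency score gamma = 1 - theta."""
    return 1.0 - fdh_input_efficiency(X, Y, i)


# Four hypothetical units with two inputs and one output.
X = [[2.0, 4.0], [4.0, 2.0], [5.0, 5.0], [3.0, 3.0]]
Y = [[10.0], [10.0], [10.0], [12.0]]

for i in range(len(X)):
    print(i, fdh_input_efficiency(X, Y, i), fdh_inefficiency(X, Y, i))
```

In this example the third unit is dominated by the fourth, which uses 60% of each input and produces more output, so its efficiency is 0.6; the other three units are on the FDH frontier.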

3.3 Three Technology Models

Based on (1) or (4) we define three models of the technology. Our benchmark model uses the following assumptions: First, it has a nonconvex (FDH) production reference set. Intuitively, a unit that is special in its input or output dimension is more likely to be efficient than not. All production plans that are weakly dominated by observed plans are also part of the set: inputs and outputs are strongly disposable (e.g., more inputs do not reduce the maximum output).

Second, our benchmark model has a combined technology that comprises the power sourcing, transmission, and distribution activities. A combined technology means that the model includes the inputs and outputs for all three activities and calculates a single inefficiency score for all three activities: The technology is an aggregate of all three supply chain activities. Power sourcing includes own-generation cost and purchased power, which is our proxy for stand-alone generation. Also, as for all of our models, costs are proxies for physical inputs to the technology. We describe input and output measurements below and in Sect. 4. Thus, for divested utilities where a large fraction of power sourcing is attributable to stand-alone generation this model creates a virtually integrated unit, which does not necessarily represent legacy integrated units as power can be purchased from any number of different utilities.

Third, our benchmark model, like all of our models, has a contemporaneous frontier: the current technology set includes only present production plans. We exclude technical change as a mechanism. To sum up: Our benchmark model is conservative in the sense that the frontier does not include convex combinations of observed units; combines all supply chain activities; and comprises present production possibilities only.

In addition to this benchmark model we define two different technology models: we vary one of the benchmark’s assumptions at a time. First, instead of specifying a combined technology, we specify separate technologies for each activity (power sourcing, transmission, and distribution), and we obtain inefficiency estimates for each. The intuition is that after divestiture, specialized firms might use a technology that is different from the technology that is used by integrated firms. We combine the three inefficiency measures into a single net benefit number (cost-weighted inefficiencies), as shown in Eq. (6) below. This is similar to the “meta production frontier” approach (Hayami and Ruttan, 1970).

Second, we specify a convex (DEA) technology with variable returns to scale. Unlike our benchmark model this model allows convex combinations of observed units to be part of the frontier. A convex frontier captures more of the underlying heterogeneity as inefficiency.
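To illustrate the difference between the two reference sets, consider a hypothetical single-input, single-output example (the numbers are purely illustrative and are not taken from the data). Three units are observed: A with \((x,y)=(2,2)\), B with \((x,y)=(4,6)\), and C with \((x,y)=(4,4)\). When C is evaluated under FDH, only B and C itself produce at least \(y=4\), and both use \(x=4\), so \(\theta _{C}^{FDH}=1\). Under the convex VRS technology (2), the combination \(\lambda _{A}=\lambda _{B}=1/2\) is an additional feasible peer:

$$\begin{aligned} \tfrac{1}{2}\cdot 2+\tfrac{1}{2}\cdot 4=3\le \theta \cdot 4,\qquad \tfrac{1}{2}\cdot 2+\tfrac{1}{2}\cdot 6=4\ge 4,\qquad \theta _{C}^{DEA}=\tfrac{3}{4},\qquad \gamma _{C}^{DEA}=1-\tfrac{3}{4}=\tfrac{1}{4}. \end{aligned}$$

The same unit is thus fully efficient relative to the FDH frontier but 25% inefficient relative to the DEA frontier.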

Finally, we define inputs and outputs for each of the three activities. Our theoretical model is for technical inefficiency; but as is often the case for similar applications we observe no physical input measures and use accounting cost measures instead. As our data source is regulatory accounts, measures should be consistent across firms and time. An activity’s input and output definitions do not vary with the technology model. Also, our difference-in-differences approach controls for any time-invariant input price differences across units. Deflation controls for common time trends in input prices.

For the distribution activity we define two inputs: operating and maintenance expenditure (Opex), as a proxy for variable inputs; and capital expenses (Capex), as the proxy for capital inputs. The distribution outputs are physical units of electricity delivered, number of customers, and network length, which together reflect long-run expansion. Together they also account for differences in density, which is an important cost driver for electricity distribution.

The power sourcing activity has a single input only: the sum of Opex, Capex, and purchased power expenses. We use the cost of purchased power, as recorded in the accounts of the incumbent utility, as the proxy for divested stand-alone generation. Recall that after divestiture, generating plants no longer have to file regulatory accounts and are unobserved by the analyst. Ideally, we would treat these three costs as separate inputs; but because our inefficiency estimator does not admit zero input values, we have to combine them. For instance, a fully divested utility has no Opex and Capex expenses for own generation. The single output for the power-sourcing activity is units of electricity supplied, which is generated or bought.

For transmission we again define two inputs: Opex and Capex. The single output (units of electricity transmitted) is unobserved, and we use units distributed as a proxy. Section 4 gives details on the data and variable measurements.

3.4 Divestiture and Treatment Effect

Above, we already introduced our identification strategy and historical background. Here we give the formal calculation for the treatment effects. Whether the treatment for divested units in restructuring states is “divestiture” or “divestiture and competition” depends on whether the counterfactual group is non-divested units in restructuring states or non-divested firms in non-restructuring states, respectively.

For both treatments, the counterfactual is the average inefficiency across utilities in the control group G in year t:

$$\begin{aligned} \overline{\gamma _{G,t}}=\frac{1}{\left| {G}\right| }\sum _{j\in G}\gamma _{j,t}. \end{aligned}$$
(5)

That is, we use the average inefficiency of non-divesting units as the proxy for the unobserved true counterfactual. To make the two control groups specific, let D denote the set of all divesting firms and \(\overline{D}\) the set of all non-divesting firms. And, let R denote the set of all states that introduced competition (restructuring) and \(\overline{R}\) the set of all states that did not introduce competition. When the control group is non-divesting units in non-restructuring states, \(G=\overline{D}\cap \overline{R}\), the treatment is divestiture and competition. On the other hand, when the control group is non-divesting units in restructuring states, \(G=\overline{D}\cap R\), the treatment is divestiture only. In both cases \(\left| {G}\right|\) in (5) is the number of units in the control group. The competition effect is simply the difference. Recall that we ignore the competition effect for the non-divested units in restructuring states because we want to analyze the potential trade-off between competition and vertical separation.

For a divested unit i the difference-in-differences formula for the net benefit of activity A in year t is,

$$\begin{aligned} NB_{i,t}^{A}=\left( \overline{\gamma _{G,t}^{A}}-\gamma _{i,t}^{A}\right) C_{i,t}^{A}-\frac{\sum _{t'=1}^{b}\left( \overline{\gamma _{G,t'}^{A}}-\gamma _{i,t'}^{A}\right) C_{i,t'}^{A}}{b}. \end{aligned}$$
(6)

where \(t=b\) is the date of divestiture. Activity A is distribution, power sourcing, transmission, or all three combined—depending on the technology model. As usual this identification strategy is valid if the (non-testable) common trends assumption holds.

We multiply inefficiency differences by an activity’s total cost C to express the results in monetary terms and thereby be able to sum across activities and time. In particular, this allows us to produce a total net benefit number when the technology is activity-specific. Note again that theoretically \(\gamma\) is technical inefficiency (consistent with the theory above); but when multiplying with the cost base we effectively measure the cost of that inefficiency.

The first term is the post-treatment difference. From this we subtract the pre-treatment difference, which is the average across pre-treatment years. Effectively, we control for selection on pre-treatment differences. The net benefit of divestiture is positive (negative) if the average inefficiency—the average waste of all non-divested units—is larger (smaller) than the inefficiency of the divested firm (corrected for any pre-divestiture differences).

Finally, we sum the net benefits across activities (if the technology is activity-specific), units, and years to obtain the overall net benefit in current US dollars as:

$$\begin{aligned} NB=\sum _{t}\sum _{i}\sum _{A}NB_{i,t}^{A}. \end{aligned}$$
(7)
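Equations (5) to (7) can be traced through with a small numerical sketch. The data layout, the function names, and all numbers are hypothetical; periods are indexed 1, 2, ... so that periods 1 to b form the pre-treatment window as written in (6), and we assume that the sum in (7) runs over post-divestiture periods t > b.

```python
def mean(values):
    values = list(values)
    return sum(values) / len(values)


def gap(series, t):
    """(control-group mean inefficiency, Eq. 5, minus unit inefficiency) times cost."""
    return (mean(series["gamma_control"][t]) - series["gamma_unit"][t]) * series["cost"][t]


def net_benefit(series, t):
    """Eq. (6): gap in period t minus the average gap over periods 1..b."""
    b = series["b"]
    pre_treatment = sum(gap(series, tp) for tp in range(1, b + 1)) / b
    return gap(series, t) - pre_treatment


def total_net_benefit(all_series):
    """Eq. (7): sum over unit-activity series and post-divestiture periods."""
    return sum(
        net_benefit(s, t)
        for s in all_series
        for t in s["gamma_unit"]
        if t > s["b"]
    )


# One divested unit and one activity, with made-up numbers.
example = {
    "b": 2,  # divestiture date
    "gamma_unit": {1: 0.30, 2: 0.30, 3: 0.20, 4: 0.10},
    "gamma_control": {1: [0.2, 0.3], 2: [0.25, 0.25], 3: [0.25, 0.25], 4: [0.2, 0.3]},
    "cost": {1: 100.0, 2: 100.0, 3: 100.0, 4: 100.0},
}
print(net_benefit(example, 3), net_benefit(example, 4), total_net_benefit([example]))
```

In this example the unit is 5 cost units worse than the control group before divestiture and 5 and 15 cost units better afterwards, so the net benefits are roughly 10 and 20, and the total is roughly 30.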

4 Data and Variable Definitions

Our main data source is the regulatory accounts, known as FERC Form 1, that US utilities have to file with the Federal Energy Regulatory Commission (FERC). These have to be submitted annually by utilities that are above a certain size threshold. For our sample period we observe the distribution activities of 138 incumbent utilities.Footnote 10 The data are publicly available on the FERC website, and their use is well established in the economics literature (e.g., Fabrizio et al., 2007; Kwoka et al., 2010; Arocena et al., 2012).

The data have gaps, as some observations are missing and we drop others that are implausible. We drop all observations, at the utility level, that distribute less than 1000 GWh/year. This is consistent with FERC’s definition of a major utility. We thereby drop 13 observations. Generally, the non-convexity of our variable returns to scale technologies reduces sensitivity to outliers; and any outliers influence the technology only locally (Deprins et al., 2006). Although a regulatory requirement to submit data should assure that missing data are few and random, we observe that missing data are more likely for some cases. In our sample the proportion of missing values is greater for distribution than for power sourcing. Also, the proportion of missing values drops after 2001 for distribution but stays constant for power sourcing. Last, data for the first year after divestiture are more likely to be missing than are data for subsequent years, so that we observe only 18 first post-divestiture years for our 29 divestitures. Next, we describe the input and output measures for all three activities (distribution, power sourcing, and transmission) in detail.

Recall that we use deflated cost-based proxies for physical inputs. For the distribution activity we define our variable input proxy as follows: Operating expenses (Opex) are measured as operation and maintenance (O&M), customer accounts, customer service, and sales expenses plus a share of general and administrative expenses. For our activity-specific technology, we need to allocate the last. The allocation key is based on the ratio of labor expenses for distribution, customer accounts, and sales to total labor expenses less general and administrative labor expenses. This is a commonly used allocation method. We deflate operating expenses with an index of state-level electricity distribution wages (or gas where electricity is not available). The index is based on the “Quarterly Census of Employment and Wages” series published by the Bureau of Labor Statistics. Throughout, the base year is the year 2000.
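The construction of the deflated distribution Opex proxy can be sketched as follows. The argument names are our own shorthand rather than FERC Form 1 account names, and the numbers are made up.

```python
def distribution_opex(om, customer_accounts, customer_service, sales, general_admin,
                      labor_distribution, labor_customer_accounts, labor_sales,
                      labor_total, labor_general_admin):
    """Distribution Opex including an allocated share of G&A expenses.

    Allocation key: labor expenses for distribution, customer accounts, and
    sales, divided by total labor expenses less G&A labor expenses.
    """
    key = (labor_distribution + labor_customer_accounts + labor_sales) / (
        labor_total - labor_general_admin
    )
    return om + customer_accounts + customer_service + sales + key * general_admin


def deflate(nominal, index_year, index_base):
    """Express a nominal amount in base-year terms (the base year here is 2000)."""
    return nominal * index_base / index_year


nominal_opex = distribution_opex(
    om=50.0, customer_accounts=10.0, customer_service=5.0, sales=5.0, general_admin=20.0,
    labor_distribution=30.0, labor_customer_accounts=10.0, labor_sales=0.0,
    labor_total=110.0, labor_general_admin=10.0,
)
real_opex = deflate(nominal_opex, index_year=1.2, index_base=1.0)
print(nominal_opex, real_opex)
```

Here the allocation key is 40/100 = 0.4, so 8 of the 20 units of general and administrative expenses are assigned to distribution.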

For all three activities the proxy for capital input is also cost. We measure capital expense for distribution, as well as for the other two activities, as the allocated sum of interest, dividends, tax, and depreciation expenses (following Farsi and Filippini, 2005). Allocation is based on the share of distribution plant to total plant. We deflate capital expenses by the US GDP deflator.Footnote 11

Our output measures are all in physical units. The distribution outputs are electricity units delivered, number of customers, and network length. Since Form 1 reports only the units delivered and the number of customers for bundled service, we adjust the data to take into account that with the onset of retail competition actual numbers tend to be higher than bundled numbers. For this purpose we use data from the Energy Information Administration (Form EIA-861) and the state Public Utility Commissions (PUC), which both report bundled and unbundled distribution units (where we have data from both the EIA and the PUC we take the minimum). If we cannot obtain data from either the EIA or PUC, we revert to the FERC data.
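The source-selection rule for delivered units (and, analogously, customer numbers) amounts to a simple fallback, sketched below. The function name is hypothetical, and `None` stands for a missing value.

```python
def delivered_units(ferc, eia=None, puc=None):
    """Pick the output figure for one utility-year.

    Use the minimum of the EIA and PUC figures when both are available,
    whichever of the two exists when only one is available, and the
    FERC Form 1 (bundled) figure when neither is.
    """
    candidates = [value for value in (eia, puc) if value is not None]
    return min(candidates) if candidates else ferc


print(delivered_units(ferc=900.0, eia=1200.0, puc=1150.0))
print(delivered_units(ferc=900.0, eia=1200.0))
print(delivered_units(ferc=900.0))
```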

As mentioned above, power sourcing has a single aggregate input. The proxy is the total cost of power sourcing: the sum of own-generation and purchased power costs. Own-generation cost is the sum of Opex and capital expenses. The cost of purchased power is a proxy for the cost of stand-alone generation. After the introduction of competition, prices reflect system marginal cost. If the system had constant returns to scale and marginal cost pricing, the cost of purchased power would be comparable to own-generation cost. However, the cost of purchased power includes fuel expenses, network charges, and, potentially, a mark-up (see Footnote 12). To control for changes in these additional costs we deflate the cost of power sourcing by a state-level index of retail electricity prices for industrial customers (see Footnote 13). The single output for the power sourcing activity is units of electricity supplied: either generated or bought. It is measured as the sum of bundled distribution units, which include units purchased, and units for resale.
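As a minimal sketch, the power sourcing input and output can be assembled as below. The names are hypothetical, the retail price index is assumed to be normalized to 1.0 in the base year, and the figures in the example are invented.

```python
def power_sourcing(own_opex, own_capital, purchased_power_cost,
                   retail_price_index, bundled_units, resale_units):
    """Return (real cost of power sourcing, units supplied).

    All names are illustrative; `retail_price_index` is the state-level
    industrial retail price index, assumed to be 1.0 in the base year.
    """
    # Own-generation cost is Opex plus capital expenses.
    own_generation_cost = own_opex + own_capital
    # Single aggregate input: own generation plus purchased power.
    nominal_cost = own_generation_cost + purchased_power_cost
    real_cost = nominal_cost / retail_price_index
    # Single output: bundled distribution units plus units for resale.
    units_supplied = bundled_units + resale_units
    return real_cost, units_supplied


# Invented figures: nominal cost 120 deflated by an index of 1.2.
cost, units = power_sourcing(own_opex=50, own_capital=30,
                             purchased_power_cost=40, retail_price_index=1.2,
                             bundled_units=900, resale_units=100)
```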

Proxies for transmission inputs are again Opex and capital expenses. The former is O&M expenses plus system control and load dispatching, and a share of general expenses where the allocation key is based on transmission labor expenses. As system control and load dispatching is a generation item on FERC Form 1, the allocation key underestimates the share of general expenses that is allocated to transmission. Opex is deflated by the same wage deflator as distribution expenses above.

We include only the transmission costs that are accounted for by incumbent utilities, which include investment and maintenance. ISOs are responsible for trading systems (both for electricity and transmission rights) and ancillary services, but we do not observe their costs. As different transmission operators have different functional scope, we capture full transmission costs only where functions are operated by incumbent utilities. The single output is units transmitted; but, as FERC’s Form 1 does not report transmission units, we use units distributed as a proxy. Appendix 1 gives more detail on the construction of these variables as well as the sources.

Table 1 provides summary statistics, separately for non-divested and divested firms. About 13% of the unit-year observations are for divested firms. Note that we do not list transmission and generation outputs, as these are proxied by distributed units and distributed units plus resale units, respectively. The average size (output) of non-divested and divested utilities is similar. (Units that are generated for resale are much higher for non-divested firms.) Average distribution and transmission costs are higher for divested units, but the opposite is true for the cost of sourcing power, which suggests that despite potential mark-ups, buying power might be beneficial. Examining the costs for the three activities, we see that the cost of power sourcing dominates. As we show below, efficiency effects for this activity largely drive the results.

The last row provides the average proportion of own generation, a variable that is distinct from the ones that we use to define divestiture. Average own-generation proportions are 0.7 and 0.1 for non-divested and divested utilities, respectively. Our definition of divestiture therefore appears to discriminate well. As an additional check, Fig. 2 plots trends for average own-generation shares for the treatment as well as for the two control groups. The vertical line indicates the first divestiture in 1998. Before the first divestiture, own-generation shares are similar for all three groups. After the first divestiture, the share of own generation drops quickly for divesting firms, and the shares for the two control groups hardly change. What if we defined the cut-off for divestiture more stringently? Fig. 5 in Appendix A compares the trends for our preferred 50% threshold (left panel) to the trends for a 75% threshold (right panel). The trends in both panels are similar. Fig. 7 in the same Appendix also shows that trends for own-generation shares are very similar if we define divestiture for physical capacity instead.

Table 1 Summary statistics for costs and outputs

Fig. 2 Average shares of own generation of total requirements. This graph plots the yearly averages for own-electricity generation over total requirements for three groups: divesting plants in restructuring states; non-divesting plants in non-restructuring states; and non-divesting plants in restructuring states. The vertical line gives the date of the first divestiture in 1998.

5 Results

We begin the presentation of our results by giving summary statistics for inefficiency scores for our three technology models in Table 2. These are the key ingredients for our calculation of net benefits.

Table 2 Inefficiency scores for the different technology models

For our benchmark model (common-across-activities, nonconvex), the average inefficiency score is between 0.7 and 1.1%. Inefficiency is relatively low because the nonconvex FDH model gives firms the “benefit of the doubt” and intuitively ascribes differences between firms to heterogeneity other than inefficiency. Rows two to four give activity-specific inefficiencies, which together constitute the separate technology model. Average inefficiencies are much higher for all three activities. Intuitively, as the units are more comparable for single activities, more heterogeneity is ascribed to inefficiency. Additionally, the power sourcing and transmission activities have single outputs that lead to much higher average inefficiencies. The fifth row shows that compared to the benchmark model, the convex technology model also produces a much higher average inefficiency. This is no surprise as the model allows convex combinations (across all activities) as peers, which makes it more likely that a given unit is inefficient.
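To illustrate why a nonconvex FDH technology yields low inefficiency scores, the following sketch computes input-oriented FDH inefficiency for a small invented data set. It is a textbook-style illustration under stated assumptions, not the model estimated in this paper: the function name, the radial input orientation, and the data are all assumptions made for the example.

```python
import numpy as np


def fdh_input_inefficiency(X, Y):
    """Input-oriented FDH inefficiency (1 - efficiency) for each unit.

    X: (n, m) array of strictly positive inputs (here: cost proxies).
    Y: (n, s) array of outputs. Only observed units can serve as peers.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n = X.shape[0]
    scores = np.empty(n)
    for o in range(n):
        # Peers: observed units producing at least as much of every output.
        peers = np.all(Y >= Y[o], axis=1)
        # Radial factor by which unit o's inputs must be scaled to cover
        # each peer's input bundle.
        ratios = np.max(X[peers] / X[o], axis=1)
        # A unit is always its own peer (ratio 1), so efficiency is at most 1.
        scores[o] = 1.0 - ratios.min()
    return scores


# Four invented units with two inputs and the same single output.
X_example = [[2.0, 4.0], [4.0, 2.0], [4.0, 4.0], [3.5, 3.5]]
Y_example = [[1.0], [1.0], [1.0], [1.0]]
scores_example = fdh_input_inefficiency(X_example, Y_example)
```

In this example only the third unit, with inputs (4, 4), is dominated by an observed unit, so it is the only one with positive inefficiency. The fourth unit, with inputs (3.5, 3.5), is scored as efficient even though the midpoint of the first two units, (3, 3), uses less of both inputs. A convex technology would admit that midpoint as a peer and classify the fourth unit as inefficient, which is the sense in which convexity raises average inefficiency.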

Due to the different assumptions about the true technology and the differences in input/output dimensionality, average inefficiency differs across the models; but our difference-in-differences measure of the treatment effect removes these level effects. Across all models, average inefficiencies are higher for divested units. For the distribution activity this finding is consistent with the results of Kwoka et al. (2010). However, these inefficiency differences are not necessarily consistent with our results for treatment effects below. The treatment effects additionally weight inefficiency scores by actual costs and use difference-in-differences to control for unobserved heterogeneity. This is what we present next.

Figure 3 plots yearly, to-date, undiscounted cumulative totals from the year of the first divestiture until the end of our sample: The observation for the last year is the grand total, as in Eq. (7). The three lines give the results for the three different technology models. A linear trend indicates constant yearly net benefits. Recall that the bulk of divestitures occurred between 1999 and 2001. Whereas the top panel gives the combined effect of competition and divestiture, using the non-restructuring counterfactual, the bottom panel gives the divestiture-only effect, using the restructuring counterfactual.
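The logic of a to-date cumulative difference-in-differences total can be sketched as follows. This is a simplified illustration and not a transcription of Eq. (7): it assumes group-level yearly averages of inefficiency cost (inefficiency score times actual cost) and a pre-treatment gap that would have persisted without treatment. The function name, argument names, and numbers are hypothetical.

```python
import numpy as np


def cumulative_net_benefits(treated_cost, control_cost, years,
                            first_treatment_year):
    """To-date, undiscounted cumulative net benefits.

    treated_cost, control_cost: yearly average inefficiency costs for the
    treated and control groups, aligned with `years` (illustrative inputs).
    """
    years = np.asarray(years)
    treated = np.asarray(treated_cost, dtype=float)
    control = np.asarray(control_cost, dtype=float)
    pre = years < first_treatment_year
    post = ~pre
    # Pre-treatment gap, assumed to persist in the absence of treatment.
    baseline_gap = (treated[pre] - control[pre]).mean()
    # Yearly net benefit: the fall in the treated-control gap below baseline.
    yearly = baseline_gap - (treated[post] - control[post])
    # Running total; the last element is the grand total.
    return years[post], np.cumsum(yearly)


# Invented series: costs rise in the first treated year, then fall.
yrs, totals = cumulative_net_benefits(
    treated_cost=[10, 10, 12, 9, 8, 8],
    control_cost=[8, 8, 8, 8, 8, 8],
    years=range(1996, 2002),
    first_treatment_year=1998,
)
```

In this invented series the cumulative total is negative in the first two post-treatment years and turns positive afterwards, so the sign of the assessment depends on where the sample ends. Constant yearly net benefits would instead produce a linear cumulative path.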

Due to the difficulty of accurately measuring the stand-alone generation (and to some extent transmission) activities, we emphasize qualitative over quantitative results and present all of our results with the use of graphs. In any case, a back-of-the-envelope calculation that uses the average cost figures in Table 1 as a base suggests that maximum absolute net benefits are about 8.5% of divested utilities’ total cost for all post-treatment years. But the minimum is less than a third of that (see Footnote 14).

First, we examine the combined effect in the top panel. At the end of the sample, net benefits are positive for all three models. Whereas our benchmark model (the FDH benchmark) gives the lowest net benefit, the separate technology model gives the largest. Whether the divested units are artificially combined (benchmark model) or the non-divested units artificially separated makes no difference qualitatively, but does make a large difference for the quantitative assessment. The same is true for the convexity assumption. It is not clear what the sources of these differences are as each technology model tends to be appropriate for some units but not necessarily for others. A model averaging approach, which we do not implement here, might be sensible.

For several years after divestitures the models produce similar net benefits, but these start diverging around 2002. Also, the length of the post-treatment horizon matters for the assessment. The non-linearity of the lines suggests that the effects are not immediate and constant. Several papers that assess the effects of US electricity restructuring (e.g., Fabrizio et al., 2007; Kwoka et al., 2010) also have very short post-treatment horizons, which might bias results.

Whereas the top panel in Fig. 3 gives the combined divestiture and competition effect, the bottom panel gives the divestiture-only effect. Unlike for the combined effect, the models do not agree on the sign of the total net benefit at the end of our sample period. The negative effect for the separate technology model is consistent with the finding of economies of scope in the literature. For the other two models the divestiture effect is positive, which indicates gains from separation. This is not necessarily evidence against economies of scope, but suggests that there might be efficiency gains from separation—e.g., due to improved management focus—that outweigh any lost economies of scope. Again, the length of the post-treatment horizon matters. Whereas net benefits are decreasing for all models in the early years, they are increasing at the end of our sample. This might be evidence for firm learning.

As we would expect, the divestiture effect is smaller than the combined effect for all models, which implies a positive competition effect. The absolute size of the competition effect varies across the models. It is larger for the more flexible models. To conclude: For these divested utilities a positive competition effect outweighs a potentially negative divestiture effect—certainly after a number of years. Also, the effect is robust across different specifications of the technology. The appendix provides additional robustness tests.

Appendix A shows that qualitatively these results are robust to two different definitions of divestiture: a 75% drop in plant book value, and a 50% drop in physical generation capacity. However, the slightly lower net benefit estimates for the higher plant book threshold suggest that the trade-off worsens the more stringent unbundling is. Full unbundling is not necessarily optimal.

Fig. 3 Cumulative net benefits. These graphs plot to-date cumulative net benefits for our technology models over time, from the year of the first divestiture. The top graph uses non-divesting utilities in non-restructuring states as the counterfactual. The bottom graph uses non-divesting utilities in restructuring states as the counterfactual.

The above analysis showed a single net benefit for the separate technology model, which is the sum of the three activity specific values. What are the net benefits for its components? Fig. 4 plots the net benefits for the constituent activities: power sourcing, transmission, and distribution. Again, the top panel gives the combined competition and divestiture effect and the bottom panel gives the divestiture-only effect.

The top panel shows that by the end of our sample period the contributions differ across the three activities. Whereas for transmission there is hardly any effect, there is a negative effect for distribution and a large positive effect for power sourcing. Any positive effect for power sourcing is magnified by its large cost base. These differences are consistent with the prior evidence and underline the importance of including all stages of the supply chain in the analysis. They also suggest that there are probably multiple underlying mechanisms; a detailed identification of these is beyond the scope of this paper.

The bottom panel shows that, at the end of our sample period, for distribution and transmission the pure divestiture effects are similar to the combined effects, which suggests that, as expected, there is virtually no competition effect. The negative divestiture effect could be due to lost economies of scope or cost shifting. For power sourcing the combined effect is much larger than the divestiture-only effect; this is evidence for a positive competition effect and consistent with the results of Fabrizio et al. (2007) for non-divesting utilities. The short-term negative divestiture effect for power sourcing might reflect adjustment costs. It again cautions against using too short a post-treatment period for the analysis.

Fig. 4 Cumulative net benefit by activity. These figures plot cumulative to-date net benefits by activity over time, starting with the year where the first divestiture occurred. The technology is activity-specific.

6 Conclusion

We set out to examine the effect of liberalization of the electricity sector in the US by identifying the separate effects of competition and of vertical unbundling. We also sought to examine the combined impact of liberalization on the costs of generation, transmission, and distribution, in order to assess whether cost decreases in one part of the electricity sector have been offset by cost increases in another. These two features of our work sought to address weaknesses in the existing literature on electricity reform effects. Our approach has been to use an efficiency methodology, which allows us to account for the changing external conditions and structural nature of the electricity sector through the period of our analysis.

Our results provide evidence that the combined reform was beneficial. Although there are probably efficiency costs due to vertical separation or unbundling, the efficiency benefits from competition are larger. However, we do not analyze whether separation was necessary for these benefits from competition.

The efficiency assessment of reforms requires like-for-like comparisons across firms and time. We try to achieve this by modeling different technologies. Our results show that how we model the unobserved production technology matters: both qualitatively and quantitatively. A model averaging approach might produce more robust results.

The combined effect of divestiture and competition is positive for all models of the technology. More flexible technologies—in the sense that they allow activity-specific technologies and/or convexity, i.e., divisibility—produce larger reform effects. More conservative technology models produce a negative reform effect for divestiture but not competition.

When we model reform effects separately for each activity, they differ, even qualitatively, across the supply chain. Partial results are no substitute for a comprehensive analysis. Whereas the effect of divestiture and competition is positive for power sourcing, it is negative for distribution and roughly neutral for transmission. These different effects point to diverse mechanisms, which we discuss only anecdotally and which future work might identify more carefully.

Finally, the length of the post-treatment horizon matters for the results. Generally, effects increase over time, which suggests temporary adjustment costs or that firms learn to operate in the new environment. Many studies estimate average post-treatment effects for relatively short horizons. These might be misleading.

To the best of our knowledge this is the first attempt to assess the combined reforms of vertical separation and competition for the US electricity industry. Although we do not provide definitive quantitative results, it seems likely that the combination of competition and vertical separation generated benefits in terms of cost efficiencies.