Abstract
Nonparametric efficiency analysis has become a widely applied technique to support industrial benchmarking as well as a variety of incentive-based regulation policies. In practice, such exercises are often plagued by incomplete knowledge about the correct specification of inputs and outputs. Simar and Wilson (Commun Stat Simul Comput 30(1):159–184, 2001) and Schubert and Simar (J Prod Anal 36(1):55–69, 2011) propose restriction tests to support such specification decisions for cross-section data. However, the typical oligopolized market structure pertinent to regulation contexts often leads to low numbers of cross-section observations, rendering reliable estimation based on these tests practically infeasible. This small-sample problem could often be avoided by using panel data, but doing so requires extending the cross-section restriction tests to handle panel data. In this paper we derive these tests. We prove the consistency of the proposed method and apply it to a sample of US natural gas transmission companies from 2003 through 2007. We find that the total quantity of natural gas delivered and natural gas delivered in peak periods measure essentially the same output. Therefore, only one of them needs to be included. We also show that the length of mains as a measure of transportation service is non-redundant and therefore must be included.
Notes
The information asymmetries in regulation mainly result from adverse selection and moral hazard problems (Joskow 2006).
The homogeneity assumption is comparable to the parametric homoscedasticity assumption and means that the distribution of the inefficiencies does not depend on inputs or the outputs. The problem is that it will not generally hold in both the input and the output direction, prohibiting tests based on it in both directions.
Alternatively, variables could be omitted or aggregated. Omitting variables based on correlations should be avoided for translation invariant DEA models (Dyson et al. 2001) and aggregating variables based on principal components might be inappropriate for radial efficiency measurement (Simar and Wilson 2001). However, the restriction tests proposed by Simar and Wilson (2001) and Schubert and Simar (2011) provide statistical inference procedures for the investigation of aggregates.
Note that panel data are just one example of clustered data; the proposed test therefore applies more broadly, to clustered data in general.
To comprehensively define the DGP, assumptions on the statistical model are necessary. Due to space limitations, we omit the discussion and refer the reader to e.g., Simar and Wilson (2001).
Other bootstrap methods, e.g., the homogeneous bootstrap proposed by Simar and Wilson (1998) and further developed by Simar and Wilson (2001) or the double smooth bootstrap proposed by Kneip et al. (2008) are not applicable in our setting because we need a method that allows for heteroscedasticity and that is valid for all data points considered simultaneously (Schubert and Simar 2011). The aforementioned alternatives are, therefore, excluded.
We could also normalize the inner sum by dividing by \(Z_{i}\), but this will have no asymptotic effect.
This corresponds to the selection rule proposed by Simar and Wilson (2008), which selects a value of m for which the resulting sample distribution and some of its features, e.g., relevant moments, are stable with respect to deviations from this particular value.
These mainly include gas storage facilities, gas processing and treatment plants, as well as liquefied natural gas storage and processing plants.
Note that the legitimacy of input (or output) aggregation should also be tested, e.g., by means of restriction tests; however, this is outside the focus of the present work.
Alternatively, an OPEX model could have been implemented, which makes capital a determinant of some variable input factor as discussed previously. This would involve the specification of an input requirement function. To the best of our knowledge, there is no empirical analysis dealing with regulatory benchmarking of natural gas transmission companies applying this specification. We leave this to further research and present our proposed method with a simple input–output specification.
The transport of natural gas is based on a pressure differential at the inlet and outlet.
However, we are aware that the length of mains cannot fully explain the differences in total operational costs of the compressor stations, since these also depend on the engineering characteristics. Further, length of mains likely reflects the geographical reach of services. An alternative view of its importance might result from the notion that companies active in rural areas naturally need greater length to deliver the same amount of gas than firms in metropolitan areas. This is simply because the customers are more dispersed. In this interpretation length would be a conditioning variable rather than an input or output. However, if length reflects an exogenous and monotonic cost disadvantage, it can also be included as an additional output. Our results are consistent with both qualifications of the variable length and corroborate its importance.
One can see clearly that we make a priori assumptions about the partition of the technology set into inputs and outputs. This partition is not always unambiguous. For example, one can think of arguments that would suggest that length is an input rather than an output. If this were true, the cascade of tests in Table 1 would change as well. This is a potential weakness of the procedure presented here, because these assumptions remain untested. Assessing them would require the use of some sort of goodness-of-fit criteria that are, to the best of our knowledge, not available in non-parametric frontier models. Another point concerns a delicate issue in terms of interpretation of the results. Using our tests we do not confirm that an input or output dimension is redundant. Rather, we show that there is no conclusive evidence that it is not. Based on the failure to reject the Null, we recommend excluding certain outputs/inputs. What might seem unwarranted at first sight can, however, be justified by the regulation context. In fact, because regulated firms have a strong incentive to include as many inputs and outputs as possible in order to reduce the effectiveness of the regulation, from the regulator’s point of view it seems adequate to exclude those inputs or outputs that have not proven to be highly relevant. Otherwise the regulator would most likely be forced to use models with large dimensionality, rendering the regulation exercise utterly ineffective.
Additionally, we conducted the test where the Null incorporating length as the sole output variable is tested against the two alternatives length, deliv and length, peak, respectively. Both tests confirm the presented results. To save space, we present the detailed results in Appendix 2.
Note that we want to empirically apply our proposed method and are, therefore, not concerned about the exact period under consideration.
Calculations are conducted using the statistical software R with the additional package “FEAR” version 1.12 by Wilson (2008).
The outlier analysis was conducted before the restriction tests, implying that the restriction tests are run on the unrestricted model. This approach is consistent, because even if an outlier is identified on the basis of an ex post redundant dimension, under the Null the restricted and the unrestricted model converge to the same probability limit. In finite samples, if \(H_{0}\) is true, this approach is less efficient, because observations may potentially be dropped on the basis of unnecessary dimensions. To assess this, we ran an ex post outlier analysis on the restricted set. We found that largely the same units are identified as outliers irrespective of whether we use the unrestricted or the restricted model. The restriction tests were also rerun based on this sample. The decisions on the restriction tests did not change, and the numerical test statistics remained almost constant. Thus, we can conclude that in our application the results are robust with respect to the order in which restriction tests and the outlier analysis are performed.
References
Agrell P, Bogetoft P, Cullmann A, Hirschhausen C, Neumann A, Walter M (2008) Ergebnisdokumentation: Bestimmung der Effizienzwerte Verteilnetzbetreiber Strom—Endfassung. PROJEKT GERNER IV, Sumicsid and Chair of Energy Economics and Public Sector Management at Dresden University of Technology
Banker RD, Charnes A, Cooper WW (1984) Some models for estimating technical and scale inefficiencies in data envelopment analysis. Manag Sci 30(9):1078–1092. http://www.jstor.org/stable/2631725
Beesley M, Littlechild S (1989) The regulation of privatized monopolies in the United Kingdom. RAND J Econ 20(3):454–472
Bogetoft P, Agrell P (2007) Development of benchmarking models for German electricity and gas distribution. Final report. Project Gerner/AS6, Sumicsid
Bogetoft P, Otto L (2011) Benchmarking with DEA, SFA, and R. International Series in Operations Research & Management Science. Springer, New York. http://books.google.de/books?id=rBiGxrgFk-kC
Coelli T, Estache A, Perelman S, Trujillo L (2003) A primer on efficiency measurement for utilities and transport regulators. World Bank Institute, Development Studies
Cullmann A (2012) Benchmarking and firm heterogeneity: a latent class analysis for German electricity distribution companies. Empir Econ 42(1):147–169. http://ideas.repec.org/a/spr/empeco/v42y2012i1p147-169.html
Daraio C, Simar L (2007) Advanced robust and nonparametric methods in efficiency analysis: methodology and applications. Studies in productivity and efficiency. Springer, New York. http://books.google.de/books?id=QAtGqmOwyIwC
Davison AC, Hinkley DV (1997) Bootstrap methods and their application. Cambridge University Press, Cambridge
Debreu G (1951) The coefficient of resource utilization. Econometrica 19(3):273–292. http://www.jstor.org/stable/1906814
Dyson R, Allen R, Camanho A, Podinovski V, Sarrico C, Shale E (2001) Pitfalls and protocols in DEA. Eur J Oper Res 132(2):245–259
Färe R, Grosskopf S (2005) New directions: efficiency and productivity. Studies in productivity and efficiency. Kluwer Academic Publishers, Boston. http://books.google.de/books?id=w0rSAFFFwMYC
Färe R, Grosskopf S, Lindgren B, Roos P (1992) Productivity changes in Swedish pharmacies 1980–1989: a non-parametric Malmquist approach. J Product Anal 3(1–2):85–101
Farrell MJ (1957) The measurement of productive efficiency. J R Stat Soc Ser A 120(3):253–290. http://www.jstor.org/stable/2343100
Farsi M, Fetz A, Filippini M (2007) Benchmarking and regulation in the electricity distribution sector. CEPE working paper series 07-54, CEPE Center for Energy Policy and Economics, ETH Zurich, Zurich. http://ideas.repec.org/p/cee/wpcepe/07-54.html
Granderson G (2000) Regulation, open-access transportation, and productive efficiency. Rev Ind Organ 16(3):251–266
Haney AB, Pollitt MG (2009) Efficiency analysis of energy networks: an international survey of regulators. Energy Policy 37(12):5814–5830
Hollas DR, Macleod KR, Stansell SR (2002) A data envelopment analysis of gas utilities’ efficiency. J Econ Finance 26(2):123–137
Homburg C (2001) Using data envelopment analysis to benchmark activities. Int J Prod Econ 73(1):51–58. doi:10.1016/S0925-5273(01)00194-3, http://www.sciencedirect.com/science/article/pii/S0925527301001943
IEA (2003) The challenges of future cost reductions for new supply options (pipelines, LNG, GTL). 22nd World Gas Congress Tokyo. http://www.dma.dk/themes/LNGinfrastructureproject/Documents/Infrastructure/IEA-The%20challenges%20of%20further%20cost%20red%20new%20supply%20options.pdf, retrieved 26 September 2011
Jamasb T, Pollitt MG (2003) International benchmarking and yardstick regulation: an application to European electricity distribution utilities. Energy Policy 31(15):1609–1622
Jamasb T, Nillesen P, Pollitt M (2004) Strategic behaviour under regulatory benchmarking. Energy Econ 26(5):825–843. http://ideas.repec.org/a/eee/eneeco/v26y2004i5p825-843.html
Jamasb T, Pollitt MG, Triebs T (2008) Productivity and efficiency of US gas transmission companies: a European regulatory perspective. Energy Policy 36(9):3398–3412
Joskow PL (2006) Incentive regulation in theory and practice: electricity distribution and transmission networks. Cambridge Working Papers in Economics 0607, Cambridge University, Faculty of Economics, Cambridge
Kneip A, Park BU, Simar L (1998) A note on the convergence of nonparametric DEA estimators for production efficiency scores. Econom Theory 14(6):783–793
Kneip A, Simar L, Wilson PW (2008) Asymptotics and consistent bootstraps for DEA estimators in nonparametric frontier models. Econom Theory 24(06):1663–1697. http://ideas.repec.org/a/cup/etheor/v24y2008i06p1663-1697_08.html
Littlechild SC (1983) Regulation of British telecommunications’ profitability. Report to the Secretary of State, Department of Industry in London, London
Natgasinfo (2011) Gas pipelines. http://natgas.info/html/gaspipelines.html, retrieved 26 Sept 2011
O’Neill RP (2005) Natural gas pipelines. In: LMoss D (ed) Network access, regulation and antitrust. Routledge, London, pp 107–120
Pastor JT, Ruiz JL, Sirvent I (1999) A statistical test for detecting influential observations in DEA. Eur J Oper Res 115(3):542–554. http://ideas.repec.org/a/eee/ejores/v115y1999i3p542-554.html
Politis DN, Romano JP, Wolf M (2001) On the asymptotic theory of subsampling. Stat Sin 11(4):1105–1124. http://www.ams.org/mathscinet-getitem?mr=1867334
Saal D, Reid S (2004) An investigation into opex productivity trends and causes in water industry in England & Wales—1992–93 to 2002–03: main report—final. Tech. rep., Stone & Webster Consultants
Schefczyk M (1993) Industrial benchmarking: a case study of performance analysis techniques. Int J Prod Econ 32(1):1–11. http://ideas.repec.org/a/eee/proeco/v32y1993i1p1-11.html
Schubert T, Simar L (2011) Innovation and export activities in the German mechanical engineering sector: an application of testing restrictions in production analysis. J Prod Anal 36(1):55–69. http://ideas.repec.org/a/kap/jproda/v36y2011i1p55-69.html
Shephard RW (1970) Theory of cost and production functions. Princeton University Press, Princeton
Shuttleworth G (2005) Benchmarking of electricity networks: practical problems with its use for regulation. Util Policy 13(4):310–317
Sickles RC, Streitwieser ML (1992) Technical inefficiency and productivity decline in the U.S. interstate natural gas pipeline industry under the National Gas Policy Act. J Prod Anal 3(1–2):119–133
Sickles RC, Streitwieser ML (1998) An analysis of technology, productivity, and regulatory distortion in the interstate natural gas transmission industry: 1977–1985. J Appl Econom 13(4):377–395. http://ideas.repec.org/a/jae/japmet/v13y1998i4p377-395.html
Simar L (2003) Detecting outliers in frontier models: a simple approach. J Prod Anal 20(3):391–424. doi:10.1023/A:1027308001925
Simar L, Wilson PW (1998) Sensitivity analysis of efficiency scores: how to bootstrap in nonparametric frontier models. Manag Sci 44(1):49–61. http://www.jstor.org/stable/2634426
Simar L, Wilson PW (2000) Statistical inference in nonparametric frontier models: the state of the art. J Prod Anal 13(1):49–78. doi:10.1023/A:1007864806704
Simar L, Wilson PW (2001) Testing restrictions in nonparametric efficiency models. Commun Stat Simul Comput 30(1):159–184. http://econpapers.repec.org/RePEc:fth:louvis:0013
Simar L, Wilson PW (2008) Statistical inference in nonparametric frontier models: recent developments and perspectives. In: Fried HO, Lovell CK, Schmidt SS (eds) The measurement of productive efficiency and productivity growth. Oxford University Press, Oxford, pp 421–521
Simar L, Wilson PW (2011) Inference by the m out of n bootstrap in nonparametric frontier models. J Prod Anal 36(1):33–53. http://ideas.repec.org/a/kap/jproda/v36y2011i1p33-53.html
Stone & Webster Consultants (2004) Investigation into evidence for economies of scale in the water and sewerage industry in England and Wales. Final report. Technical report, Stone & Webster Consultants
Thompson RG, Dharmapala PS, Thrall RM (1995) Linked-cone DEA profit ratios and technical efficiency with application to Illinois coal mines. Int J Prod Econ 39(1-2):99–115. http://ideas.repec.org/a/eee/proeco/v39y1995i1-2p99-115.html
Vogelsang I (2002) Incentive regulation and competition in public utility markets: a 20-year perspective. J Regul Econ 22(1):5–27. http://ideas.repec.org/a/kap/regeco/v22y2002i1p5-27.html
Wilson P (2008) FEAR 1.0: a software package for frontier efficiency analysis with R. Socio Econ Plan Sci 42(4):247–254
Acknowledgements
This article was developed while Maria Nieswand was Jean Monnet Fellow at the European University Institute. We thank Luis Orea and participants of the 5th International Workshop on Empirical Methods in Energy Economics (EMEE), the annual meeting of the Verein für Socialpolitik 2012, and the 10th Conference on Applied Infrastructure Research (INFRADAY) for valuable comments and discussions. Two anonymous referees and an associate editor provided excellent and helpful comments. All remaining errors are ours.
Appendices
Appendix 1: Proof of consistency
A robust approach to obtain corrected standard errors with clustered data is to sub-sample block-wise (Davison and Hinkley 1997). This allows for arbitrary dependence between the observations belonging to the same cross-section unit.
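The block-wise draw described above can be sketched as follows. This is a minimal illustration, not the authors' implementation (the paper's calculations were run in R with FEAR); the function name `block_subsample` and the toy panel `firm` are hypothetical. The sketch samples cross-section units without replacement and keeps every time observation of each selected unit, so within-unit dependence is preserved.

```python
import random


def block_subsample(unit_ids, m, rng):
    """Return row indices of all observations belonging to m randomly
    chosen cross-section units (whole blocks stay together)."""
    units = sorted(set(unit_ids))
    if not 0 < m <= len(units):
        raise ValueError("m must lie between 1 and the number of units")
    # Sample units, not rows, without replacement.
    chosen = set(rng.sample(units, m))
    # Keep every time observation of each chosen unit.
    return [row for row, unit in enumerate(unit_ids) if unit in chosen]


# Hypothetical unbalanced panel: one entry per firm-year observation.
firm = [0, 0, 0, 1, 2, 2, 3, 3, 3, 3]
rows = block_subsample(firm, m=2, rng=random.Random(0))
```

The number of rows returned varies with the units drawn when the panel is unbalanced, while the number of units is always m.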
We show that this procedure meets the essential consistency requirements set out in Politis et al. (2001). Let the sample size \(n_{PD}\) be defined as the number of different cross-section units. Although we have used the more easily interpretable Farrell–Debreu measure so far, for actual calculations it is preferable to use the inverse \(\lambda =1/\theta\) because it is truncated only once.
Proposition
Let \(n\left( Z\right) =\sum \nolimits_{i=1}^{n_{PD}}Z_{i}\), where the iid random variables \(Z_{i}\) give the number of time observations per cross-section unit, with distribution function \(F_{Z}\) defined on the support \(S_{Z}=\{1,\ldots ,L\}\) and expectation \(e\in \left[ 1,L\right]\). Then, for the test statistic \(t_{n_{PD}}\left( X,Y,Z\right)\), the asymptotic distribution of \(\sqrt{n_{PD}}\,n_{PD}^{2/\left( p+q+1\right) }t_{n_{PD}}\left( X,Y,Z\right)\) is non-degenerate with expectation zero.
Proof
If we reformulate the time subscripts to take only consecutive integers, we can use the following definition:
It follows from the results of Kneip et al. (2008) that
for any fixed \(z_i\), where \(H_{n}\) is a random variable with an asymptotic distribution function Q that is non-degenerate and has mean 0 under \(H_{0}\). Furthermore, we can rewrite \(n=n_{PD}e\), where e is the expectation of Z. Replacing and rearranging yields
Since the right-hand side is a scaled version of \(H_{n}\), also
has a non-degenerate distribution. This implies that the conditional distribution of \(n_{PD}^{2/\left( p+q+1\right) }t_{i}\left( X,Y|Z=z\right)\) is non-degenerate. Call this distribution D(z).
Furthermore, we obtain the distribution of \(t_{i}\left( X,Y,Z\right)\) by marginalizing out Z: \(D(\cdot )=\int _{z\in S_{Z}}D(z)\,dF_{Z}(z)\). Obviously, if D(z) is non-degenerate with a given scaling factor, then \(D(\cdot )\) must be non-degenerate with the same scaling factor. To complete the proof, note that \(t_{n_{PD}}\left( X,Y,Z\right)\) is an empirical mean of the \(t_{i}\left( X,Y,Z\right)\). Using the redefinition \(n=n_{PD}e\), it follows that \(\tau _{n_{PD}}t_{n_{PD}}\left( X,Y,Z\right)\) with \(\tau _{n_{PD}}=\sqrt{n_{PD}}\,n_{PD}^{2/\left( p+q+1\right) }\) is non-degenerate and additionally has an asymptotic expectation equal to zero under \(H_{0}\), because the mean associated with the asymptotic distribution Q is zero. As a consequence of this result, the subsampling methods proposed by Politis et al. (2001) are consistent when subsampling is conducted block-wise along the cross-section dimension. The subsample size \(m_{PD}\) is, as usual, defined as the integer part of \(n_{PD}^{k}\) for \(0<k<1\). It should be noted that these results include the cases of ordinary cross-section data and of a balanced panel. In the former case \(z_{i}=1\) and \(n=n_{PD}\), yielding just the formulae in Schubert and Simar (2011). In the latter case \(z_{i}=L\), implying that \(z_i\) cannot affect the asymptotic distribution because it is non-random.
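The two quantities that drive the procedure, the scaling factor \(\tau _{n_{PD}}\) and the subsample size \(m_{PD}\), can be computed as in the sketch below. The function names are hypothetical, and p and q are simply the dimension parameters appearing in the exponent of the Proposition; the sketch only restates the formulas from the text.

```python
import math


def scaling_factor(n_pd, p, q):
    """tau = sqrt(n_pd) * n_pd ** (2 / (p + q + 1)), as in the Proposition."""
    if n_pd < 1:
        raise ValueError("n_pd must be a positive number of units")
    return math.sqrt(n_pd) * n_pd ** (2.0 / (p + q + 1))


def subsample_size(n_pd, k):
    """Integer part of n_pd ** k for 0 < k < 1."""
    if not 0.0 < k < 1.0:
        raise ValueError("k must lie strictly between 0 and 1")
    # Floating-point powers can land marginally below an exact integer,
    # so results very close to an integer boundary deserve a manual check.
    return int(math.floor(n_pd ** k))


# Hypothetical example: 64 cross-section units and p + q + 1 = 4.
tau = scaling_factor(64, p=1, q=2)
m_pd = subsample_size(64, k=0.5)
```

Both functions depend only on the number of cross-section units, not on the total number of observations, which mirrors the block-wise definition of the sample size used in the proof.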
Appendix 2: Additional test results
In this appendix we report the additional tests performed for length as the single output variable. Figure 2a presents the visual analysis in which length is tested against the alternative technology set with length and deliv. At the optimal subsample size of 21, the observed test statistic of 0.468 is very close to the critical value. The corresponding p value of this test is 8 % when the panel structure is accounted for and 1 % otherwise. Panel (b) of this figure shows the visual results for testing length against length and peak. In this case the observed test statistic takes the value 0.593 and clearly exceeds the critical value at the subsample size of 23. The p value is 3 % (0 %) when the subsampling accounts for (ignores) the panel structure of the data.
Consequently, we find empirical evidence that the technology set should include either deliv or peak if length is chosen as the initial output variable. Therefore, these tests support the previous findings reported in Fig. 1. They also show that the test procedure does not depend on the order of variable selection.
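For readers who want to see how a p value of this kind can be obtained from subsample statistics, the following sketch applies the generic subsampling recipe: the observed statistic is scaled by the full-sample factor, each subsample statistic by the subsample factor, and the p value is the share of scaled subsample statistics at least as large as the scaled observed one. The function name, the one-sided rejection region (large values of the statistic), and the toy numbers are assumptions made for illustration; they are not taken from the paper's data or code.

```python
def subsampling_p_value(t_obs, t_sub, tau_n, tau_m):
    """Share of scaled subsample statistics that are at least as large
    as the scaled full-sample statistic (one-sided, upper tail)."""
    if not t_sub:
        raise ValueError("at least one subsample statistic is required")
    scaled_obs = tau_n * t_obs
    exceed = sum(1 for t in t_sub if tau_m * t >= scaled_obs)
    return exceed / len(t_sub)


# Hypothetical values, chosen only to make the arithmetic easy to follow.
p_val = subsampling_p_value(
    t_obs=0.5, t_sub=[0.5, 1.0, 1.5, 2.0], tau_n=2.0, tau_m=1.0
)
```

Ignoring the panel structure would correspond to drawing individual observations instead of whole units when generating the subsample statistics, which is one reason the two reported p values can differ.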
Neumann, A., Nieswand, M. & Schubert, T. Estimating alternative technology sets in nonparametric efficiency analysis: restriction tests for panel and clustered data. J Prod Anal 45, 35–51 (2016). https://doi.org/10.1007/s11123-015-0461-z
DOI: https://doi.org/10.1007/s11123-015-0461-z
Keywords
- Benchmarking models
- Network industries
- Nonparametric efficiency estimation
- Data envelopment analysis
- Testing restrictions
- Subsampling
- Bootstrap