1 Introduction

Recently, the idea that “lucid waters and lush mountains are invaluable assets” was put forward by the president of China. This fundamental change in development philosophy and the corresponding developmental measures are expected to further boost investments in emission reduction, leading to a decrease in undesirable outputs per unit of GDP in China. However, just as environmental deterioration is a process of long-term accumulation, environmental improvement will necessitate even longer-term effort. Hence, we examine China's performance trajectory in terms of energy use efficiency and pollutant emission during the first two decades of the twenty-first century. This kind of descriptive research is important given that it provides fundamental information for further scientific analysis; for example, it can be used to conduct ex post policy impact analysis or to mark a new baseline. Although absolute emission amounts and GDP/energy or GDP/undesirable output ratios are frequently used and easily handled and understood indexes of performance, they suffer from incomparability over time and across regions and can only provide limited information of relevance; for example, changes in a country’s energy consumption make little sense if GDP is not taken into consideration. Similarly, the GDP/energy ratio may only show limited information since other inputs, such as capital, and other outputs, such as CO2, are not embodied in this index. Put differently, if a rise in the GDP/energy ratio comes at the cost of larger inputs in other factors and/or higher emission of pollutants, it is difficult to interpret the environmental significance of this development pattern.

From the viewpoint of Pareto optimality, efficiency- or productivity-based indexes are more meaningful and informative of the trajectory of energy consumption and pollutant emission during a given period. Furthermore, as early as 1983, Pittman theoretically stressed the necessity of combining undesirable outputs with other kinds of outputs and inputs in efficiency and productivity analyses (Pittman, 1983). Zhou and Ang (2008) can be seen as offering a response to this call: they construct a number of energy efficiency indexes that take undesirable output(s) into consideration by using the nonparametric data envelopment analysis (DEA) framework. In what follows, energy efficiency with undesirable output(s) is termed environmental energy efficiency (hereinafter referred to as EEE).

However, to the best of our knowledge, few parametric stochastic frontier analysis (SFA)-based models have been proposed for EEE so far.Footnote 1 This may be partly due to the challenge of imposing regularity constraints on undesirable output(s) during the estimation process. Consequently, this study attempts to fill this gap by combining the Shephard energy input distance function with the Bayesian technique. As will be noted in Sect. 3, the Bayesian technique makes it possible to impose regularity constraints on undesirable output(s) by setting a prior density distribution function on the coefficients.

The contribution of this study lies in three aspects. First, the study proposes a new SFA-based model that can estimate environmental energy efficiency. In situations where the time span is long and the environmental heterogeneity among decision-making units (DMUs) is obvious, the SFA-based method is thought to be superior to the DEA-based method in terms of the accuracy of estimates, given that the DEA-based method is unable to separate inefficiency from the error term in an easy and accurate manner,Footnote 2 unable to control for contextual impacts, and unable to isolate efficiency changes from technological changes. Indeed, in the presence of rapid technological change or clear stochastic exogenous shocks, DEA-based methods may seriously bias the estimation results. Second, based on the modes of constructing the frontier and adjusting the inputs, this study classifies EEE estimation methods into four types (discussed in more detail in Sect. 2.3). This classification clarifies the comparison of development stages of different types and helps highlight meaningful research agendas. Third, this study enriches the empirical literature on Chinese energy- and environment-related issues. It also confirms that the GDP/energy and GDP/undesirable output ratios that are often used in economic analysis may tell incomplete or even biased stories, and we strongly suggest combining these one-sided indexes with more sophisticated ones, such as EEE.

The remainder of the paper is organized as follows. Section 2 offers a literature review on energy efficiency estimation methods. The newly proposed model is presented and explained in Sect. 3, which is followed by an application to China in Sect. 4. Section 5 reports and discusses the empirical results, and Sect. 6 concludes the paper.

2 Literature review

In this section, we first review representative models proposed for estimating EEE and then classify these EEE estimation methods into four types. This is followed by a discussion of unsolved estimation issues for those four types of methods, thereby underscoring the necessity of our newly proposed model. Finally, we review how undesirable outputs are treated in parametric efficiency estimation-related models, from which we borrow building blocks for the construction and estimation of our new model.

2.1 Development of DEA-based EEE models

Patterson (1996) critically reviews the methods used to measure energy efficiency and summarizes the advantages and disadvantages of various methods. He notes that the commonly used energy-to-GDP ratio is incapable of measuring technical change and distinguishing energy efficiency changes prompted by adjustment of the industrial structure from changes caused by pure technical efficiency. Finally, he defines energy efficiency as the ratio of useful output to energy input in the process. Given that the term efficiency is usually defined as the ratio of optimal input (or output) to actual input (or output), other things being constant, while productivity is usually defined as the ratio between output(s) and input(s) (Coelli et al., 2005), the definition given by Patterson (1996) is more similar to measuring energy in terms of productivity rather than in terms of efficiency.Footnote 3 However, his study does raise the issue of how to gauge energy efficiency scientifically and meaningfully.

As energy input alone cannot produce any output without incorporating non-energy inputs, Hu and Wang (2006), inspired by the conception of total factor productivity employed in the growth accounting literature, introduce the notion of total factor energy efficiency (TFEE) as a proxy of energy efficiency performance. They regard capital, labor, energy and other factors as inputs and GDP as output; formally,

$$ TFEE_{it} = 1 - \left\{ {\left[ {(1 - \beta^{*} )*energy_{it}^{actual} + Slack_{it}^{energy} } \right]/energy_{it}^{actual} } \right\} $$
(1)

where \(\beta^{*}\) represents technical efficiency and informs the equal-proportional radial contraction of all inputs (capital, labor and energy) in the production of a given level of output (GDP), \(energy_{it}^{actual}\) stands for the actual energy input and \(Slack_{it}^{energy}\) stands for the nonradial adjustment of the ith DMU in period t. Consequently, the sum of \((1 - \beta^{*} )*energy_{it}^{actual}\) and \(Slack_{it}^{energy}\) indicates savable energy input through both radial adjustment and nonradial adjustment.
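As a small numerical illustration of Eq. (1), the sketch below computes TFEE for a single DMU. The function name `tfee` and the input values are hypothetical and chosen only to make the arithmetic visible; they are not taken from Hu and Wang (2006).

```python
def tfee(actual_energy: float, beta_star: float, energy_slack: float) -> float:
    """Total factor energy efficiency as written in Eq. (1)."""
    if actual_energy <= 0:
        raise ValueError("actual_energy must be positive")
    # Savable energy = radial contraction plus nonradial slack.
    savable = (1.0 - beta_star) * actual_energy + energy_slack
    return 1.0 - savable / actual_energy


# Hypothetical DMU: 100 units of energy, beta* = 0.9, energy slack of 5 units.
score = tfee(actual_energy=100.0, beta_star=0.9, energy_slack=5.0)
print(round(score, 4))
```

With these illustrative numbers, 10 units are savable through radial contraction and 5 through slack, so 15 percent of actual energy input is savable in total.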

Mukherjee (2008) proposes another DEA-based model that can be used to estimate to what degree energy input alone can be maximally reduced while allowing the DMU to produce a constant level of output (or more) without using any additional amounts of any other inputs, namely,

$$ \begin{aligned} & \beta^{*} = \min \beta \\ & s.t. \\ & \sum\limits_{j = 1}^{N} {x_{ij}^{non - energy} \lambda_{j} } \le x_{i0}^{non - energy} ,\quad i = 1,2,...,M \\ & \sum\limits_{j = 1}^{N} {x_{ij}^{energy} \lambda_{j} } \le \beta x_{0}^{energy} ,\quad j = 1,2,...N \\ & \sum\limits_{j = 1}^{N} {y_{j}^{good} \lambda_{j} } \ge y_{0}^{good} ,\;\;\lambda_{j} \ge 0 \\ \end{aligned} $$
(2)

where \(x_{ij}^{non - energy}\) denotes the ith non-energy input of the jth DMU and \(x_{i0}^{non - energy}\) represents the ith non-energy input of \(DMU_{0}\), namely, the DMU under evaluation. Under such a model, a value of 0.85 for \(\beta^{*}\), for example, means that if \(DMU_{0}\) is as efficient in terms of energy input (not all inputs) as \(DMU(s)\) at the frontier, it will produce the same or a higher amount of output \(y_{0}^{good}\) by using 85 percent of the actual energy input \(x_{0}^{energy}\) and the same or a smaller amount of non-energy inputs \(x_{i0}^{non - energy}\).

The aforementioned models proposed by Hu and Wang (2006) and Mukherjee (2008) do not take undesirable outputs into consideration. As increasing attention is being paid to environmental issues such as global climate change in general and haze disasters in developing countries in particular, an energy efficiency measurement containing information on undesirable outputs is particularly instructive and of great importance in terms of designing environmentally benign policies. Zhou and Ang (2008) attempt to fill this gap between practical needs and the academic literature. Within a joint production framework of both desirable and undesirable outputs,Footnote 4 a simplified version of the model put forward in Zhou and Ang (2008) can be expressed as

$$ \begin{aligned} & \beta^{*} = \min \beta \\ & s.t. \\ & \sum\limits_{j = 1}^{N} {x_{ij}^{non - energy} \lambda_{j} } \le x_{i0}^{non - energy} ,\quad i = 1,2,...,M \\ & \sum\limits_{j = 1}^{N} {x_{ij}^{energy} \lambda_{j} } \le \beta x_{0}^{energy} ,\quad j = 1,2,...N \\ & \sum\limits_{j = 1}^{N} {y_{j}^{good} \lambda_{j} } \ge y_{0}^{good} ,\quad \lambda_{j} \ge 0 \\ & \sum\limits_{j = 1}^{N} {y_{j}^{bad} \lambda_{j} } = y_{0}^{bad} \\ \end{aligned} $$
(3)

where \(y^{good}\) and \(y^{bad}\) represent desirable and undesirable outputs, respectively; further, it is assumed here that there is only one energy input, one desirable output and one undesirable output [more complicated examples are also discussed and addressed in Zhou and Ang (2008)].Footnote 5 Imposing the weak disposability assumption on undesirable output (\(\sum\limits_{j = 1}^{N} {y_{j}^{bad} \lambda_{j} } = y_{0}^{bad}\)) implies the introduction of environmental regulation, which means that disposal of undesirable output is a costly activity because firms are required to divert some of their productive resources (which could otherwise be used for the production of desirable output) to control pollution. To examine whether environmental regulation could further improve energy efficiency when energy conservation regulation already exists, Mandal (2010) further modifies the model proposed by Zhou and Ang (2008) as follows:

$$ \begin{aligned} & \beta^{*} = \min \beta \\ & s.t. \\ & \sum\limits_{j = 1}^{N} {x_{ij}^{non - energy} \lambda_{j} } \le x_{i0}^{non - energy} ,\quad i = 1,2,...,M \\ & \sum\limits_{j = 1}^{N} {x_{ij}^{energy} \lambda_{j} } \le \beta x_{0}^{energy} ,\quad j = 1,2,...N \\ & \sum\limits_{j = 1}^{N} {y_{j}^{good} \lambda_{j} } \ge y_{0}^{good} ,\quad \lambda_{j} \ge 0 \\ & \sum\limits_{j = 1}^{N} {y_{j}^{bad} \lambda_{j} } \le y_{0}^{bad} \\ \end{aligned} $$
(4)

The only difference between formula (3) and formula (4) is that the former imposes \(\sum\nolimits_{j = 1}^{N} {y_{j}^{bad} \lambda_{j} } = y_{0}^{bad}\) while the latter imposes \(\sum\nolimits_{j = 1}^{N} {y_{j}^{bad} \lambda_{j} } \le y_{0}^{bad}\), which means that the weak disposability of undesirable outputs is changed to strong disposability. Put differently, it is assumed that there is no additional environmental regulation in reality. According to Mandal (2010), if the energy efficiency scores estimated with formula (3) are higher than those estimated with formula (4), it follows that environmental regulation, such as mandatory carbon-dioxide emission orders, could further improve energy use efficiency even when energy conservation regulation already exists.
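The contrast between formulas (3) and (4) can be sketched as a linear program. The code below is a minimal illustration, assuming one energy input, one desirable output and one undesirable output; the function name `energy_efficiency`, the `weak` flag and the three-DMU data set are hypothetical, and SciPy's `linprog` is used only as a convenient solver rather than as the tool used in the cited studies.

```python
import numpy as np
from scipy.optimize import linprog


def energy_efficiency(x_ne, energy, y_good, y_bad, k, weak=True):
    """Solve the LP for DMU k; the decision vector is [beta, lambda_1..lambda_N]."""
    x_ne = np.atleast_2d(np.asarray(x_ne, dtype=float))  # shape (M, N)
    energy = np.asarray(energy, dtype=float)
    y_good = np.asarray(y_good, dtype=float)
    y_bad = np.asarray(y_bad, dtype=float)
    n = energy.size
    c = np.r_[1.0, np.zeros(n)]  # minimise beta

    # Non-energy inputs: sum_j x_ij * lambda_j <= x_ik
    a_ub = [np.r_[0.0, row] for row in x_ne]
    b_ub = [float(v) for v in x_ne[:, k]]
    # Energy input: sum_j e_j * lambda_j - beta * e_k <= 0
    a_ub.append(np.r_[-energy[k], energy])
    b_ub.append(0.0)
    # Desirable output: -sum_j y_j * lambda_j <= -y_k
    a_ub.append(np.r_[0.0, -y_good])
    b_ub.append(-y_good[k])

    bad_row = np.r_[0.0, y_bad]
    a_eq, b_eq = None, None
    if weak:
        # Formula (3): weak disposability, equality on the undesirable output.
        a_eq, b_eq = np.array([bad_row]), np.array([y_bad[k]])
    else:
        # Formula (4): strong disposability, inequality instead.
        a_ub.append(bad_row)
        b_ub.append(y_bad[k])

    res = linprog(c, A_ub=np.array(a_ub), b_ub=np.array(b_ub),
                  A_eq=a_eq, b_eq=b_eq,
                  bounds=[(0, None)] * (n + 1), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[0]


# Hypothetical data for three DMUs (columns): A, B, C.
x_ne = [[1.0, 1.0, 1.0]]
energy = [1.0, 2.0, 0.5]
y_good = [1.0, 1.0, 1.0]
y_bad = [1.0, 1.0, 0.5]

beta_weak = energy_efficiency(x_ne, energy, y_good, y_bad, k=1, weak=True)
beta_strong = energy_efficiency(x_ne, energy, y_good, y_bad, k=1, weak=False)
print(beta_weak, beta_strong)
```

Because the equality constraint shrinks the feasible set, the score under formula (3) can never be lower than the score under formula (4) for the same DMU; in this toy data set the two scores for DMU B differ because the low-emission DMU C is only an admissible benchmark under strong disposability.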

2.2 Developments of SFA-based energy efficiency estimation models

Our literature review shows that Feijoó et al. (2002) and Buck and Young (2007) offer the earliest tentative studies on energy efficiency estimation using parametric models. Subsequently, Boyd (2008) replaces the usual error terms \(\varepsilon\) in the input requirement function and subvector input distance function with the composed error terms \(v \pm u\), in which \(v\) is used to capture random statistical noise, whereas \(u\) represents inefficiency caused by management shortfalls or even absence due to a distorted incentive scheme or other controllable reasons. By introducing the composed error term, Boyd (2008) changes the input requirement function and subvector input distance function to the SFA-based energy requirement function and SFA-based Shephard distance function, both of which can be used to estimate energy efficiency when output, energy and non-energy inputs are given. Zhou et al. (2012) propose an economy-wide SFA-based energy efficiency index.Footnote 6 Furthermore, Filippini and Hunt (2015) illustrate the meanings of radial and nonradial measurements of energy efficiency in detail and extend the aggregate frontier energy demand model first presented by Filippini and Hunt (2011)Footnote 7 into an input requirement frontier model, which includes other non-energy inputs. Filippini and Hunt (2011) also propose an energy demand frontier model to explore the minimum amount of energy required for a certain amount of output when input prices are given. Furthermore, Filippini and Hunt (2012, 2016) and Filippini et al. (2014) employ, for the first time, various SFA-based models to estimate energy efficiency.

2.3 Classifications of EEE estimation methods

The previous literature review shows that depending on how the EEE results are obtained, EEE methods can be classified into nonparametric DEA-based EEE and parametric SFA-based EEE.Footnote 8 In addition, according to how the non-energy inputs are treated during the estimation process, the EEE results can be further divided into two types, namely, EEE in terms of radial measurement and EEE in terms of nonradial measurement. More specifically, nonradial measurement of EEE refers to a situation where non-energy inputs are efficiently used and hence only energy input can be contracted, whereas radial measurement of EEE refers to a situation where it is assumed that non-energy inputs may also be inefficiently used and can be contracted proportionally with energy input (Appendix Figs. 5 and 6 illustrate this difference in more detail).

In conclusion, based on the mode of construction of the frontier and adjustment of inputs, it follows that there are four types of EEE estimates (as shown in Appendix Table 6), namely, (i) estimates obtained through the DEA-based method assuming that all inputs are not used efficiently (DEA-based radial EEE); (ii) estimates obtained through the DEA-based method assuming that only energy input is not used efficiently (DEA-based nonradial EEE); (iii) estimates obtained through the SFA-based method assuming that all inputs are not used efficiently (SFA-based radial EEE); and (iv) estimates obtained through the SFA-based method assuming that only energy input is not used efficiently (SFA-based nonradial EEE).

The literature review in Sects. 2.1 and 2.2 shows that both the DEA-based radial EEE and DEA-based nonradial EEE methods are well developed methodologically and empirically, while neither the SFA-based radial EEE method nor the SFA-based nonradial EEE method is widely discussed. This is a surprising and real gap that should be filled.

2.4 Treatment of undesirable output in SFA models

In what follows, we will review how undesirable output is treated within extant SFA-based empirical studies in both energy- and non-energy-related research fields. Special attention is paid to whether the regularity constraints of null jointness put forth by Shephard and Färe (1974) and of weak disposability of undesirable outputs advocated by Shephard (1970) are imposed in these studies and, if so, in which way.

In brief, undesirable output is usually treated as an input, output or exogenous technology-shifter in efficiency-related empirical studies. More specifically, the origin of regarding undesirable output, viz., environmental pollutants, as an input of production can be traced back to Ayres and Kneese (1969), who assert that the amount of environmental pollutants should approximately equal the weight of energy and raw material inputs. This is called the mass balance approach and is further developed by Ayres (1978). In a review conducted by Cropper and Oates (1992), it is found that treating undesirable outputs as inputs has been the standard approach within the environmental economics literature. Recent studies along this line include Considine and Larson (2006) and Mekaroonreung and Johnson (2012). In studies regarding undesirable outputs as input factors, the assumptions of both null jointness and weak disposability are automatically imposed.Footnote 9

The treatment of undesirable outputs as outputs has been pioneered by Färe et al. (1993) and mainly involves the use of distance function-based models (Parmeter and Kumbhakar, 2014). In Färe et al. (1993), a Shephard output-oriented distance function is constructed and used to estimate the shadow price of undesirable output. Since the Shephard distance function assumes that desirable and undesirable outputs are inflated or deflated at the same rate and in the same direction, in other words, that outputs are radially adjusted, this method may not satisfy the common interests of the public and policymakers, who prefer to reduce undesirable outputs and increase desirable outputs simultaneously. To address this concern, Färe et al. (2005) develop a directional distance function that allows for the simultaneous inflation of desirable outputs and deflation of undesirable outputs. Additionally, in both Färe et al. (1993) and Färe et al. (2005), the weak disposability assumption of undesirable outputs is maintained by imposing constraints on the quadratic directional output distance function, while the null-jointness assumption is not imposed in their empirical models.Footnote 10 Bokusheva and Kumbhakar (2014) present another approach to treating undesirable outputs. In their study, a hedonic function \(h(y^{bad} ,y^{good} )\) is used to represent an aggregator of desirable and undesirable outputs. Further, the functional form chosen for \(h(y^{bad} ,y^{good} )\) determines the type of relationship imposed on the two kinds of outputs. Needless to say, neither null jointness nor weak disposability is imposed ex ante in this input-oriented distance function model.

The final treatment of undesirable output in parametric efficiency estimation techniques involves taking it as a shifter of the technology set (see Sect. 3 for a detailed discussion). The studies of Atkinson and Dorfman (2005) and Assaf et al. (2013) fall within this field, and they further employ a Bayesian estimation method to obtain efficiency scores of individual DMUs and coefficients of the models. The weak disposability assumption is imposed by adding constraints to the models, and the null-jointness constraint is imposed automatically in this type of parametric model. More specifically, supposing the coefficients for desirable and undesirable outputs are \(\beta^{good}\) and \(\beta^{bad}\), respectively, this means that production occurs if and only if the combination of \(\beta^{good}\) and \(\beta^{bad}\) is possible under the present production technology; otherwise the production process stops. It follows that if either of the two kinds of output becomes zero, the production process will cease; consequently, the null-jointness constraint on output is automatically satisfied in this kind of parametric model.

In what follows, we will borrow the Bayesian technique from Atkinson and Dorfman (2005) and Assaf et al. (2013) to construct and estimate SFA-based EEE in terms of both radial measurement and nonradial measurement.

3 Model specification for SFA-based EEE

As noted above, in the construction of the SFA-based models to estimate EEE, the most challenging problem encountered is how to maintain the regularity constraint of weak disposability. To be more direct, the challenge is how to decide the sign of coefficients for desirable and undesirable outputs in the model construction and how to respect those constraints in the model estimation. The null-jointness constraint is satisfied automatically in the parametric SFA-based model, as noted above.

Inspired by Atkinson and Dorfman (2005) and Assaf et al. (2013), this study attempts to surmount this weak disposability barrier by means of the Bayesian technique. Supposing a production process with two inputs and one desirable output and ignoring the existence of undesirable output for the time being, we can obtain a standard isoquant. If a firm produces a certain amount of undesirable output \(y_{t1}^{bad}\) at time \(t_{1}\), then the isoquant shifts upward, reflecting that a higher combination of inputs is required to produce the same desirable output because some inputs used are directed toward reducing the production of the undesirable output. The larger the amount of \(y_{t1}^{bad}\), the higher the combination of inputs is. This means that with constant desirable output and technology, undesirable output can only be further reduced through the increased usage of at least some inputs (Assaf et al., 2013). If we further hold the inputs of capital and labor constant and realize the increase in input through an increase in the energy input, it follows that the undesirable output should be negatively associated with energy input. This means that, given a specific level of technology, the non-energy (labor and capital) inputs and desirable outputs, less undesirable output (emission) comes at the cost of more energy input, viz., \(\partial E/\partial Y^{bad} \le 0\). Correspondingly, because producing more desirable output requires more energy input, the derivative with respect to desirable output should be positive, viz., \(\partial E/\partial Y^{good} \ge 0\). Finally, an SFA-based model that is used to estimate nonradial EEE can be expressed as

$$ \begin{aligned} & - \log (E) = f(Y^{good} ,Y^{bad} ,X) - u_{it} + v_{it} \\ & \partial f/\partial Y^{good} \le 0 \\ & \partial f/\partial Y^{bad} \ge 0 \\ & v_{it} \sim iidN(0,\sigma_{v}^{2} ) \\ & u_{i} \sim iidN^{ + } (0,\sigma_{u}^{2} ) \\ \end{aligned} $$
(5)

Accordingly, an SFA-based model used to estimate radial EEE, which can be seen as the application of the input distance function to the field of EEE estimation, can be expressed as

$$ \begin{aligned} & - \log (E) = f(Y^{good} ,Y^{bad} ,X/E) - u_{it} + v_{it} \\ & \partial f/\partial Y^{good} \le 0 \\ & \partial f/\partial Y^{bad} \ge 0 \\ & v_{it} \sim iidN(0,\sigma_{v}^{2} ) \\ & u_{i} \sim iidN^{ + } (0,\sigma_{u}^{2} ) \\ \end{aligned} $$
(6)

where \(Y^{good}\), \(Y^{bad}\), \(X\) and \(E\) represent the desirable output, undesirable output, non-energy inputs and energy input, respectively, and \(v_{it}\) and \(u_{it}\) represent the random noise and inefficiency terms. The error term can also be assumed to follow different kinds of distributions.
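The sign restrictions in models (5) and (6) follow from the fact that the dependent variable is \(- \log (E)\). As a short worked step, holding \(u_{it}\) and \(v_{it}\) fixed and using \(E > 0\):

$$ \begin{aligned} & \frac{\partial f}{\partial Y^{good}} = - \frac{\partial \log (E)}{\partial Y^{good}} = - \frac{1}{E}\frac{\partial E}{\partial Y^{good}} \le 0 \;\; \Leftrightarrow \;\; \frac{\partial E}{\partial Y^{good}} \ge 0 \\ & \frac{\partial f}{\partial Y^{bad}} = - \frac{\partial \log (E)}{\partial Y^{bad}} = - \frac{1}{E}\frac{\partial E}{\partial Y^{bad}} \ge 0 \;\; \Leftrightarrow \;\; \frac{\partial E}{\partial Y^{bad}} \le 0 \\ \end{aligned} $$

In words, more desirable output requires more energy, and less undesirable output also requires more energy, which is how weak disposability enters the parametric model.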

Having constructed the model, we now turn to the process of model estimation. The challenge here lies in how to satisfy the sign constraints on the coefficients during the model-estimation process. Clearly, the most commonly used maximum likelihood estimator cannot achieve this goal. However, in SFA-based empirical studies, the Bayesian technique is also used to estimate the coefficients of the model and the efficiency scores of individual DMUs. Moreover, according to van den Broeck et al. (1994) and Griffin and Steel (2007), the Bayesian method has the characteristics of (i) full leveraging of prior knowledge of parameters (including the coefficients as well as the inefficiency level) in the estimation processFootnote 11; (ii) easy incorporation of restrictions such as regularity constraints into the model through the prior density distribution function and (iii) easy selection of the most fitting model from competing ones.Footnote 12 This study will use the second characteristic of the Bayesian method to achieve the goal of imposing a sign constraint on the coefficients of the two kinds of outputs, namely, setting the ex ante constraints that the coefficient for desirable output follows a normal distribution and takes a value less than zero and that the coefficient for undesirable output follows a normal distribution but takes values greater than zero.
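The sign-constrained priors described above can be sketched as truncated normal distributions. The code below is a minimal illustration only: the prior means, standard deviations, and the names `draw_sign_constrained` and `log_prior` are assumptions introduced here, since the text does not report the hyperparameters or the sampler actually used.

```python
import numpy as np
from scipy.stats import truncnorm


def draw_sign_constrained(mean, sd, sign, size, seed=0):
    """Draw from N(mean, sd^2) truncated to (-inf, 0] if sign < 0, else [0, inf)."""
    lo, hi = (-np.inf, 0.0) if sign < 0 else (0.0, np.inf)
    # truncnorm expects the bounds in standardised units.
    a, b = (lo - mean) / sd, (hi - mean) / sd
    return truncnorm.rvs(a, b, loc=mean, scale=sd, size=size, random_state=seed)


def log_prior(beta_good, beta_bad, sd=10.0):
    """Unnormalised log prior: zero mass wherever a sign constraint is violated."""
    if beta_good > 0 or beta_bad < 0:
        return -np.inf
    return -0.5 * ((beta_good / sd) ** 2 + (beta_bad / sd) ** 2)


# Hypothetical prior draws for the two output coefficients.
beta_good_draws = draw_sign_constrained(mean=-0.5, sd=1.0, sign=-1, size=1000)
beta_bad_draws = draw_sign_constrained(mean=0.5, sd=1.0, sign=+1, size=1000)
print(beta_good_draws.max() <= 0, beta_bad_draws.min() >= 0)
```

In a posterior sampler, a function such as `log_prior` would be added to the log likelihood, so that any proposed coefficient vector violating the constraints is rejected automatically.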

4 Environmental energy efficiency levels in China

4.1 Background

To support roughly one-fifth of the global population with only seven percent of the world’s agricultural acreage, China has chosen to mobilize nearly all the resources that could be exploited to promote the process of its industrialization at the outset, even at the cost of environmental deterioration to some extent (Wu et al., 2019). As shown in Table 1, over the 35 years from 1980 to 2015, the percent share of the Chinese economy in world GDP (PPP dollars) grew from 2.32 to 17.24. At the same time, the percent share of Chinese energy consumption in world primary energy consumption also increased from 6.29 to 22.92. Although energy efficiency in China in terms of its GDP/energy ratio rose from 0.37 in 1980 to 0.75 in 2015, it was still lower than the ratios of Japan, the U.S. and India, at 1.23, 0.92 and 1.33, respectively, in 2015.Footnote 13

Table 1 Growth of GDP and energy consumption in four countries

By the beginning of the twenty-first century, environmental degradation in China had become serious, to the extent that it is now perceived and felt not only by environmental scientists and specialists but also by the general public. For example, in 2015, the number of days that air quality in Beijing met official standards was only 186. In other words, the people of Beijing had to wear masks to protect themselves from air pollution for almost half of the year in 2015. More seriously, according to the weather report by China Central Television, the predominant state television broadcaster, approximately 1.42 million square kilometers, or one-seventh of China’s national territory, was covered in thick haze on December 19, 2016, and as a consequence, many highways had to be closed to avoid traffic accidents.

Many measures have been taken by the Chinese government to address environmental and energy problems since the beginning of the twenty-first century, and the paradigm shift from resource mobilization to sustainable development has often involved direct government/public sector interventions, such as incentives to adopt greener technologies/processes in personal behavior and/or production through subsidies or an additional tax on polluting technologies/processes. Representative events include the nomination of the former president of Tsinghua University, who is a specialist in environmental protection and has studied and worked in England for about ten years, as the Minister of Environmental Protection; strengthening of the penalties, including criminal liability, against those who purposely and severely violate the environmental protection law; and enforcement of the Interim Measures for Reduction and Substitution of Coal Consumption in Key Provincial Administrative Regions (PARs) (Directive 2984 in 2014). To evaluate whether energy conservation and environmental protection policies have achieved their goals, the present study will employ environmental energy efficiency as an index. Given that information about both energy consumption and environmental pollutants is contained simultaneously in the EEE index, its performance should dominate that of indexes such as energy intensity or pollutant emissions per unit of GDP.

4.2 Research design

4.2.1 Model selection

To estimate the SFA-based EEE values, we need to select our preferred SFA models from competing ones. However, there is no one model that is absolutely predominant over the others; instead, each model often displays a virtue or virtues at the cost of the loss of another or several other virtue(s). Therefore, the research purpose and the degree of compatibility between the data and the model are used to decide which model(s) is (are) selected. Furthermore, as a pragmatic method, empirical researchers often employ more than one model to estimate efficiency levels and then take the average of the estimates as the final value.Footnote 14 We follow this research practice and select three models to complete the estimation process. More specifically, we choose one model each from the second, third and fourth generations of SFA models.Footnote 15

4.2.2 Assumption on non-energy input adjustments

SFA-based EEE with nonradial adjustment assumes that only energy input may be used inefficiently, whereas SFA-based EEE with radial adjustment assumes that both energy and non-energy inputs (capital and labor inputs in this study) may be used inefficiently and that they all need to be adjusted proportionately. In reality, we cannot argue for or against either method with full confidence because we perceive that both assumptions (that only energy inputs are inefficiently used and that all inputs are inefficiently used) are unlikely, and consequently, efficiency estimates obtained from a single method are likely to be biased. As a pragmatic compromise, this study uses the average values of the radial and nonradial estimates to represent the final level of Chinese EEE.

In total, there are 6 models that are used to estimate EEE levels; the first three are based on the nonradial treatment, while the last three are based on the radial treatment. Finally, we use the average value of the estimates from these 6 models as the EEE level for individual provincial administrative regions in China.
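The averaging step can be sketched as follows. The efficiency scores are hypothetical placeholders for three regions, not estimates from this study; the array names are likewise illustrative.

```python
import numpy as np

# Hypothetical EEE scores: rows are models, columns are three regions.
nonradial = np.array([[0.60, 0.70, 0.80],
                      [0.62, 0.68, 0.82],
                      [0.58, 0.72, 0.78]])
radial = np.array([[0.70, 0.80, 0.90],
                   [0.72, 0.78, 0.92],
                   [0.68, 0.82, 0.88]])

# Final EEE per region: simple mean over all six models.
all_models = np.vstack([nonradial, radial])
final_eee = all_models.mean(axis=0)

# With three models per treatment, this equals the mean of the two treatment means.
by_treatment = 0.5 * (nonradial.mean(axis=0) + radial.mean(axis=0))
print(final_eee, np.allclose(final_eee, by_treatment))
```

Because each treatment contributes the same number of models, averaging the six estimates directly and averaging the radial and nonradial means give the same value.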

4.2.3 Treatment with environmental variables

There are generally two kinds of approaches used to deal with environmental variables in the efficiency literature. One assumes that environmental variables influence production technology and hence incorporates them directly into the stochastic frontier model as the regressor; the other assumes that they influence only the distance that separates each firm from the best practice function and hence constructs a regression function between efficiency and environmental factors. As for the predominance of either of the two methods, Coelli et al. (1999, p. 267) point out, “The ex-ante selection of one method over the other is a difficult task. From a philosophical standpoint, we prefer the Case 2 model because we believe the estimated frontier represents the outer boundary of the production possibility set, irrespective of environmental issues. The gross efficiency measures obtained from this procedure seem closest to the intuitive notion of efficiency being about converting physical inputs into physical outputs. One can then decompose these gross efficiency measures into managerial and environmental components if additional data is available.”

This study follows this suggestion and does not incorporate any environmental variables into the SFA model; the estimated efficiency here is sometimes termed gross efficiency. Further, because our main interest in this study lies in estimating the EEE levels during the sample period instead of identifying the relationship between efficiency and environmental variables, we do not conduct any analysis on this issue.Footnote 16

4.3 Data collection and model specification

The data collected include 510 observations from 30 provincial administrative regions of mainland China, spanning from 2000 to 2016.Footnote 17 The data are mainly taken from the China Statistical Yearbook on Environment, China Environment Yearbook, China Statistical Yearbook, China Energy Statistical Yearbook and CEInet Statistical Database. The selection of variables is in line with the microproduction framework by Zhou and Ang (2008) and Filippini and Hunt (2015); specifically, we regard capital, labor and energy as inputs and GDP and CO2 (carbon dioxide) as desirable and undesirable outputs. Detailed information is summarized in Table 2.

Table 2 Data sources and processing

Model I

The first model specified is the Shephard input distance function with the functional form suggested by Battese and Coelli (1992), which can be expressed as

$$ \begin{aligned} - \ln (E_{it} ) & = a + \beta_{k} \ln (K_{it} ) + \beta_{l} \ln (L_{it} ) + \beta_{y}^{good} \ln (Y_{it}^{good} ) \\ & \quad + \beta_{y}^{bad} \ln (Y_{it}^{bad} ) + \beta_{t} T - u_{it} + v_{it} \\ u_{it} & = u_{i} * \exp ( - \eta (t - T)) \\ \beta_{y}^{good} & \le 0 \\ \beta_{y}^{bad} & \ge 0 \\ \end{aligned} $$
(7)

where \(E_{it}\), \(K_{it}\), \(L_{it}\), \(Y_{it}^{good}\) and \(Y_{it}^{bad}\) represent the energy input, capital input, labor input, desirable output and undesirable output of the ith DMU in the tth period, respectively, while \(v_{it}\) stands for a two-sided error term and \(u_{it}\) captures one-sided nonnegative inefficiency. Following Battese and Coelli (1992), eta (\(\eta\)) can be greater than, equal to or less than zero, which corresponds to improving, constant or deteriorating estimated efficiency on the part of the focal DMU. T represents the time trend variable and reflects whether there is technological change during the sample period. Finally, the \(\beta\)s are coefficients to be estimated. The distribution assumptions on the error term and inefficiency are shown in Table 3. Because the dependent variable is \(- \ln (E_{it} )\), the constraints \(\beta_{y}^{good} \le 0\) and \(\beta_{y}^{bad} \ge 0\) mean that increasing desirable output or decreasing undesirable output requires an increase in input, which further reflects that reducing emission is costly and necessitates additional resource input. This condition is taken into account through the constraint of weak disposability of undesirable output.
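The role of \(\eta\) in Eq. (7) can be illustrated numerically. The following is a minimal sketch, not the estimation routine used in this study: it assumes that `T_end` denotes the final period of the panel (the T inside the exponential), that an efficiency score is recovered as exp(-u_it), and that `u_i` and the three values of `eta` are hypothetical.

```python
import math

def bc92_inefficiency(u_i, eta, t, T_end):
    """u_it = u_i * exp(-eta * (t - T_end)), the decay form in Eq. (7)."""
    return u_i * math.exp(-eta * (t - T_end))

def efficiency(u_it):
    """Efficiency score implied by a nonnegative inefficiency term (assumed form)."""
    return math.exp(-u_it)

# Hypothetical values for illustration only (not estimates from this study).
u_i, T_end = 0.25, 17

paths = {}
for eta in (0.02, 0.0, -0.02):
    paths[eta] = [
        efficiency(bc92_inefficiency(u_i, eta, t, T_end))
        for t in range(1, T_end + 1)
    ]

for eta, path in paths.items():
    print(f"eta={eta:+.2f}: first period={path[0]:.4f}, last period={path[-1]:.4f}")
```

In this sketch a positive `eta` yields a rising efficiency path, a zero value a flat path and a negative value a falling path, which matches the interpretation given above.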

Table 3 Specifications of the models employed

Model II

In the opinion of Greene (2005a, 2005b), the model proposed by Battese and Coelli (1992) may overestimate the level of inefficiency because it does not separate firm-specific effects \(\alpha_{i}\) (or DMU-specific heterogeneity) from the inefficiency term \(u_{it}\), and consequently \(\alpha_{i}\) may be misestimated as inefficiency. To address this issue, the second model proposed is a Shephard input distance model with a true random-effects functional form and can be specified as follows:

$$ \begin{aligned} - \ln (E_{it} ) & = a + \beta_{k} \ln (K_{it} ) + \beta_{l} \ln (L_{it} ) + \beta_{y}^{good} \ln (Y_{it}^{good} ) \\ & \quad + \beta_{y}^{bad} \ln (Y_{it}^{bad} ) + \beta_{t} T + \alpha_{i} - u_{it} + v_{it} \\ \beta_{y}^{good} & \le 0 \\ \beta_{y}^{bad} & \ge 0 \\ \end{aligned} $$
(8)

where \(\alpha_{i}\) stands for DMU-specific time-invariant heterogeneity and the denotations of the other variables are similar to those in formula (6). The true random-effects model makes the separation of DMU-specific heterogeneity and inefficiency possible and solves the difficulty encountered in the model proposed by Battese and Coelli (1992).

Model III

From the viewpoint that the inefficiency of DMUs may be composed of time-invariant as well as time-varying parts, Kumbhakar et al. (2012) further extend the models presented in Greene (2005a, 2005b) by adding a random term reflecting time-invariant inefficiency, i.e., \(\varepsilon_{it} = \alpha_{i} \pm \eta_{i} \pm u_{it} + v_{it}\), where the denotations of \(v_{it}\) and \(\alpha_{i}\) are similar to those in formula (6), and \(u_{it}\) and \(\eta_{i}\) stand for time-varying and time-invariant inefficiency, respectively. Note that in this model, the time-invariant firm-specific effect is divided into two parts: one affects production and can be controlled by the DMU (i.e., \(\eta_{i}\)), and the other affects production but is beyond the control of the DMU (i.e., \(\alpha_{i}\)). This model can be specified as follows:

$$ \begin{aligned} - \ln (E_{it} ) & = a + \beta_{k} \ln (K_{it} ) + \beta_{l} \ln (L_{it} ) + \beta_{y}^{good} \ln (Y_{it}^{good} ) \\ & \quad + \beta_{y}^{bad} \ln (Y_{it}^{bad} ) + \beta_{t} T + \alpha_{i} - \eta_{i} - u_{it} + v_{it} \\ \beta_{y}^{good} & \le 0 \\ \beta_{y}^{bad} & \ge 0 \\ \end{aligned} $$
(9)
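To make the composed error of Model III concrete, the sketch below simulates its four components and computes the implied efficiency scores. It is illustrative only: the half-normal and normal draws, the scale parameters, and the multiplicative split of overall efficiency into a persistent part exp(-eta_i) and a transient part exp(-u_it) are assumptions in the spirit of Kumbhakar et al. (2012) and need not match the distributional assumptions listed in Table 3.

```python
import math
import random

random.seed(0)  # reproducible illustration

N_DMU, N_PERIODS = 30, 17  # panel dimensions used in this study

def half_normal(sigma):
    """Draw a nonnegative value from a half-normal distribution."""
    return abs(random.gauss(0.0, sigma))

rows = []
for i in range(N_DMU):
    alpha_i = random.gauss(0.0, 0.10)   # heterogeneity beyond the DMU's control
    eta_i = half_normal(0.15)           # persistent (time-invariant) inefficiency
    for t in range(N_PERIODS):
        u_it = half_normal(0.10)        # transient (time-varying) inefficiency
        v_it = random.gauss(0.0, 0.05)  # two-sided noise
        eps_it = alpha_i - eta_i - u_it + v_it  # composed error as in Eq. (9)
        persistent = math.exp(-eta_i)
        transient = math.exp(-u_it)
        rows.append((i, t, eps_it, persistent, transient, persistent * transient))

mean_overall = sum(r[5] for r in rows) / len(rows)
print(f"{len(rows)} observations, mean overall efficiency = {mean_overall:.4f}")
```

Because both components lie in (0, 1], the overall score in this sketch can never exceed either the persistent or the transient score.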

Models I to III are specified to estimate the nonradial EEE levels of individual DMUs, whereas Models IV to VI have the same functional form as the previous models except that they are specified to estimate the radial EEE level of individual DMUs.

Model IV

$$ \begin{aligned} - \ln (E_{it} ) & = a + \beta_{k} \ln (K_{it} / E_{it} ) + \beta_{l} \ln (L_{it} / E_{it} ) + \beta_{y}^{good} \ln (Y_{it}^{good} ) \\ & \quad + \beta_{y}^{bad} \ln (Y_{it}^{bad} ) + \beta_{t} T - u_{it} + v_{it} \\ u_{it} & = u_{i} * \exp ( - \eta (t - T)) \\ \beta_{y}^{good} & \le 0 \\ \beta_{y}^{bad} & \ge 0 \\ \end{aligned} $$
(10)

Model V

$$ \begin{aligned} - \ln (E_{it} ) & = a + \beta_{k} \ln (K_{it} / E_{it} ) + \beta_{l} \ln (L_{it} / E_{it} ) + \beta_{y}^{good} \ln (Y_{it}^{good} ) \\ & \quad + \beta_{y}^{bad} \ln (Y_{it}^{bad} ) + \beta_{t} T + \alpha_{i} - u_{it} + v_{it} \\ \beta_{y}^{good} & \le 0 \\ \beta_{y}^{bad} & \ge 0 \\ \end{aligned} $$
(11)

Model VI

$$ \begin{aligned} - \ln (E_{it} ) & = a + \beta_{k} \ln (K_{it} / E_{it} ) + \beta_{l} \ln (L_{it} / E_{it} ) + \beta_{y}^{good} \ln (Y_{it}^{good} ) \\ & \quad + \beta_{y}^{bad} \ln (Y_{it}^{bad} ) + \beta_{t} T + \alpha_{i} - \eta_{i} - u_{it} + v_{it} \\ \beta_{y}^{good} & \le 0 \\ \beta_{y}^{bad} & \ge 0 \\ \end{aligned} $$
(12)
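In terms of data preparation, the nonradial specifications (Models I to III) and the radial ones (Models IV to VI) differ only in how the capital and labor regressors are built. A minimal sketch of that step follows; the function name, the dictionary keys and the sample observation are hypothetical, and the inputs are assumed to be already normalized by their sample means.

```python
import math

def build_regressors(E, K, L, Y_good, Y_bad, t, radial=False):
    """Return (dependent variable, regressors) for one observation.

    Nonradial models: capital and labor enter as ln(K) and ln(L).
    Radial models: they enter as ln(K/E) and ln(L/E), as in Eqs. (10)-(12).
    """
    if radial:
        x_k, x_l = math.log(K / E), math.log(L / E)
    else:
        x_k, x_l = math.log(K), math.log(L)
    dependent = -math.log(E)  # left-hand side shared by all six models
    regressors = {
        "ln_k": x_k,
        "ln_l": x_l,
        "ln_y_good": math.log(Y_good),
        "ln_y_bad": math.log(Y_bad),
        "trend": t,
    }
    return dependent, regressors

# Hypothetical observation (illustrative values, not data from this study).
obs = dict(E=1.2, K=0.9, L=1.1, Y_good=1.05, Y_bad=1.3, t=5)
y_nonradial, x_nonradial = build_regressors(**obs)
y_radial, x_radial = build_regressors(**obs, radial=True)
print(x_nonradial["ln_k"], x_radial["ln_k"])
```

The dependent variable is identical in both cases; only the capital and labor columns change, each by the amount \(- \ln (E_{it} )\).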

Table 3 summarizes the models employed in this study.

5 Results and analysis

The estimation results obtained from Models I to VI are reported and discussed in this section. We begin our analysis with the EEE estimates. The coefficients of variables are reported and discussed in the next part.

5.1 Environmental energy efficiency estimates

Descriptive statistics for the EEE estimates from the different models are reported in Table 4. Since the assumptions imposed vary from model to model, the estimates from each of them are correspondingly different. As previously noted, theoretically speaking, Models I and IV may miscalculate firm-specific heterogeneity as inefficiency, and the estimates of EEE from those two models should be lowest. In contrast, Models II and V not only separate firm-specific heterogeneity from inefficiency but also assume the nonexistence of time-invariant inefficiency; thus, the estimates from those two models should be highest. Models III and VI suppose that inefficiency includes persistent as well as transient parts, and the estimates they produce should theoretically fall between the estimates from Models I and IV and those from Models II and V. As shown in Table 4, all these theoretical predictions are confirmed by our results. Since one cannot identify which model is closest to reality, using the average value to reflect the industry-wide efficiency level is a commonly employed technique in empirical studies. Table 4 shows that the average EEE value during the sample period is approximately 0.7812, with a maximum of 0.8997 and a minimum of 0.6435, which implies that there is considerable room for improvement in China’s EEE level.
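One way to read these scores, assuming that EEE is the ratio of the frontier (minimum feasible) energy input to the observed energy input, as the nonradial definition suggests, is as a potential energy saving of 1 − EEE. The short sketch below applies this reading to the summary figures quoted above from Table 4; the function name is hypothetical, and for the radial models the same share would apply to all inputs rather than to energy alone.

```python
def potential_saving(eee):
    """Share of observed energy input that could be saved at a given EEE score."""
    if not 0.0 < eee <= 1.0:
        raise ValueError("EEE score must lie in (0, 1]")
    return 1.0 - eee

# Summary values reported in the text for Table 4.
table4 = {"mean": 0.7812, "max": 0.8997, "min": 0.6435}

savings = {name: potential_saving(score) for name, score in table4.items()}
for name, share in savings.items():
    print(f"{name}: EEE={table4[name]:.4f}, potential saving={share:.2%}")
```

Under this assumption, the sample-average score corresponds to a potential energy saving of roughly 22 percent with outputs and non-energy inputs held constant.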

Table 4 Descriptive statistics of estimates

Figure 1 shows the development trajectory of EEE levels from 2000 to 2016. It seems that (i) all models except Model IV, which shows an obvious downward trend, yield estimates of relatively flat trends during the sample period and that (ii) when the average value of estimates from different models is used as an index, the EEE level slightly decreased from 0.7858 in 2000 to 0.7726 in 2016.

Fig. 1 Trends of environmental energy efficiency estimates for the period 2000–2016

To obtain a more comprehensive picture of pollutant emission and energy conservation during the sample period, we calculate two additional indexes, namely, the GDP/energy ratio and the GDP/undesirable output ratio. As shown in Fig. 2, the GDP/energy ratio rose from approximately 0.6272 in 2000 to approximately 1.1150 in 2016, and the GDP/undesirable output ratio rose from 0.3124 in 2000 to 0.5907 in 2016. Thus, both indexes show a general tendency toward improvement, implying a reduction in energy consumption and in pollutant emission per unit of GDP. This is an encouraging outcome.

Fig. 2 Trends of three indexes for the period 2000–2016

Now let us explain the reasons behind those differences, taking the comparison between EEE and the GDP/energy ratio as an example. These two indexes show contradictory behavior; namely, the GDP/energy ratio demonstrates an improvement in energy use efficiency, while the EEE index does not show such a trend. Clearly, the difference comes from the fact that the EEE index takes into consideration both desirable and undesirable outputs and both energy and non-energy inputs and reports the degree of deviation between optimal and real energy inputs, keeping non-energy inputs as well as all outputs constant, while the GDP/energy ratio does not consider the impact of non-energy inputs and undesirable outputs. This indicator may lead to incorrect attribution of energy conservation resulting from other factors to improvement in energy use efficiency. For example, substituting energy input with non-energy inputs (viz., capital or labor) and/or adopting energy-saving technology can lead to improvements in the GDP/energy ratio. This kind of improvement is not due to a change in energy use efficiency and cannot lead to an improvement in the EEE index. An increase in the GDP/energy ratio may reflect an increase in attention paid to this index by policy, but it does not necessarily mean an improvement in terms of Pareto optimality.

In conclusion, the emphasis on environmental protection and energy conservation by the Chinese government has led to a reduction in energy consumption and CO2 emission per unit of GDP, but this improvement may have resulted from changes such as input factor substitution, technology improvement and upgrades of the industrial structure, rather than from an increase in the level of EEE.
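The contrast between the three indexes can be quantified with simple percentage changes. The sketch below uses only the 2000 and 2016 values quoted in the text; the function and variable names are hypothetical.

```python
def pct_change(start, end):
    """Percentage change from start to end."""
    return (end - start) / start * 100.0

# Values for 2000 and 2016 as reported in the text.
indexes = {
    "GDP/energy": (0.6272, 1.1150),
    "GDP/undesirable output": (0.3124, 0.5907),
    "EEE (model average)": (0.7858, 0.7726),
}

changes = {name: pct_change(v0, v1) for name, (v0, v1) in indexes.items()}
for name, change in changes.items():
    print(f"{name}: {change:+.1f}%")
```

The two ratio indexes rise by roughly 78 and 89 percent over the sample period, whereas the average EEE falls by about 1.7 percent, which is the divergence discussed above.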

Now, we examine the EEE estimates for individual provincial administrative regions during the period 2000–2016. Figure 3 shows that the range of EEE estimates among different regions is relatively large, with the lowest value being 0.6823 in Ningxia and the highest being 0.8938 in Shanghai. Generally, the EEE levels in eastern regions (Footnote 18), where the speed of economic development is relatively fast, are higher than those in central and western regions, where it is relatively slow.

Fig. 3 EEE levels of individual provincial administrative regions for the period 2000–2016

Figure 4 visualizes the spatial distribution pattern of EEE levels in mainland China.

Fig. 4 Spatial distribution of EEE levels for the period 2000–2016. Note: Areas with efficiency = 0 have no data

5.2 Parameter estimates

The parameter estimates from the different models are reported in Table 5. Before we assess the individual models, two key points deserve special attention.

Table 5 Parameter estimates from the different models

First, the signs of the coefficients of the non-energy inputs (\(\beta_{k}\) and \(\beta_{l}\)) in Models I, II and III (under the assumption of nonradial adjustment) are the same as the sign of the energy input (recall that the sign of the dependent variable in Models I, II and III is negative), which means that energy consumption cannot be reduced by an increase in non-energy inputs. Put differently, no substitution relationship exists between the two types of inputs. This result may come from our assumption that non-energy inputs are efficiently used while energy input is not, which further means that energy and non-energy inputs are treated as heterogeneous factors. In contrast, all models including the assumption of radial adjustment (Models IV, V and VI) demonstrate a substitution relationship between energy and non-energy inputs (recall that the sign of the dependent variable in Models IV, V and VI is also negative); namely, an increase in the capital or labor input may result in a decrease in energy consumption, other things being equal. Again, this is thought to be related to the ex ante assumption that all inputs are treated homogeneously in terms of being not fully efficiently used.

Second, all models with nonradial adjustment predict technological progress (recall that the sign of the dependent variable in Models I, II and III is negative; consequently, a positive sign on the time trend coefficient implies technological progress). More specifically, when we assume non-energy inputs are fully efficient, energy consumption is reduced by approximately 0.0068, 0.0279 or 0.0618 percent annually at the level of sample mean because of technological progress, keeping other inputs and outputs constant. The models with radial adjustment lead to different predictions; specifically, Models IV and VI predict technological progress, while Model V predicts technological regression. Unfortunately, the Bayesian method does not allow for assessment of the statistical significance of parameter estimates, and no further judgment with confidence can be made.

Now let us examine the meanings of the different parameters. Since the variables used here are normalized by the sample mean and then transformed by the natural logarithm (except for the time trend variable), the coefficients can be interpreted as percentage changes evaluated at the mean level of the corresponding variables. First, we take Model I as an example of the models in the nonradial adjustment group. Table 5 shows that GDP growth of one percent will increase energy consumption by approximately 0.5599 percent, while a one-percent reduction in CO2 emission will necessitate additional investment equal to approximately 0.0017 percent of the mean value of the energy variable; that is, the decrease in CO2 emission is costly and consumes additional investment that could otherwise be used to produce desirable output. The parameter eta (\(\eta\)) in the B-C (92) model (Model I is also called the B-C (92) model in the SFA literature) captures the time trend of the efficiency level, and a positive value of \(\eta\) indicates an improvement in the efficiency level during the period 2000–2016. The parameter lambda (\(\lambda\)) in the second line from the bottom equals the ratio between \(\sigma_{u}^{2}\) and \(\sigma_{v}^{2}\). A value of zero means that there is no inefficiency and that energy input is fully efficiently used (in this case, the SFA model reverts to a traditional average response model); otherwise, the larger the value of \(\lambda\) is, the higher the inefficiency level of the energy input (Footnote 19).
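The sign convention behind these readings can be made explicit. As a sketch derived directly from Eq. (7), moving the minus sign to the right-hand side gives the response of energy use to the two outputs and to the time trend:

$$ \begin{aligned} \ln (E_{it} ) & = - a - \beta_{k} \ln (K_{it} ) - \beta_{l} \ln (L_{it} ) - \beta_{y}^{good} \ln (Y_{it}^{good} ) \\ & \quad - \beta_{y}^{bad} \ln (Y_{it}^{bad} ) - \beta_{t} T + u_{it} - v_{it} \\ \frac{\partial \ln (E_{it} )}{\partial \ln (Y_{it}^{good} )} & = - \beta_{y}^{good} \ge 0 \\ \frac{\partial \ln (E_{it} )}{\partial \ln (Y_{it}^{bad} )} & = - \beta_{y}^{bad} \le 0 \\ \frac{\partial \ln (E_{it} )}{\partial T} & = - \beta_{t} \\ \end{aligned} $$

Under the constraints of Eq. (7), a one-percent rise in desirable output therefore raises energy use by \(- \beta_{y}^{good}\) percent, a one-percent cut in undesirable output raises it by \(\beta_{y}^{bad}\) percent, and a positive \(\beta_{t}\) lowers energy use over time, other things being equal.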

Finally, we turn to the coefficients of capital (\(\beta_{k}\)) and labor (\(\beta_{l}\)). Strictly speaking, it seems that a one-percent increase in capital (labor) input will increase energy consumption by approximately 0.4960 (0.0297) percent, holding other inputs, outputs and the technology level constant. However, from the viewpoint of Pareto improvement, there is little reason to think that an increase in one kind of input cannot lead to improvements in other kinds of inputs and/or outputs (the decrease in the undesirable output can be seen as a Pareto improvement). Instead, the meaning of the coefficients of capital and labor in Model I should be interpreted as representing the proportional relationship among different kinds of inputs in the production process; in other words, to maintain normal production under the present technology level, the ratio of input quantities among \(\ln (energy)\), \(\ln (capital)\) and \(\ln (labor)\) should be 1 : 0.4960 : 0.0297.

Except for the coefficients of capital (\(\beta_{k}\)) and labor (\(\beta_{l}\)), the meanings of the parameters estimated from the models in the radial adjustment group are the same as those in the nonradial adjustment group. If we take capital as an example, Model IV in Table 5 shows that a one-percent increase in capital will lead to a reduction in energy consumption of approximately 0.3701 percent, other things being equal. In other words, there exists a substitution relationship between energy and non-energy inputs when we assume that both kinds of inputs are not fully efficiently used.

6 Concluding remarks

Currently, energy conservation and pollutant emission reduction are common challenges faced by both developed and developing economies. In this respect, the performance of China, as the world’s largest developing economy, could have an important impact on the patterns of change in world energy consumption and environmental protection. In response to these concerns, the Chinese government announced that China would further reduce its energy consumption per GDP unit by 15 percent and greenhouse gas emission by 18 percent during the period 2016–2020. What is notable is that China announced its participation in the Paris Climate Agreement on September 24, 2016; in doing so, China showed the world that it will take more responsibility in global actions to reduce greenhouse gas emission and conserve energy.

However, what has China's performance trajectory been in terms of energy consumption and pollutant emission during the first two decades of the twenty-first century? Is it enough to use the GDP/energy and GDP/undesirable output ratios alone as indexes of energy efficiency and pollutant emission? With these concerns in mind, this study, based on the Shephard input distance function and the Bayesian technique, proposes an SFA-based model that can be used to measure environmental energy efficiency while retaining the regularity constraints of null-jointness and weak disposability of undesirable output(s). Combining undesirable outputs such as CO2 with desirable outputs such as GDP in the estimation of energy efficiency reflects the special attention paid to environmental protection and sustainable development. We believe that the comparison of energy efficiency over time and/or across regions/countries will be more informative and instructive when undesirable outputs are also considered. Although the DEA-based method can achieve this goal, in situations where the time span is long and the environmental heterogeneity among DMUs is obvious, the SFA-based method is thought to be superior to the DEA-based method because of its advantages in controlling for environmental heterogeneity, in separating efficiency from technological change, and in dividing inefficiency from the error term.

Next, we collect a panel data set covering 30 provincial administrative regions in mainland China for the period 2000–2016 and use the newly proposed model to estimate the levels of EEE. Further, to obtain a more objective and complete understanding, six different SFA-based models are specified to carry out the estimation process, and both the GDP/energy and GDP/undesirable output ratios are also calculated for the purpose of conducting a cross-check.

The empirical results from our estimation can be summarized as follows:

  i. For the individual provincial administrative regions, EEE levels differ sharply from region to region, ranging from the lowest value of 0.6823 in Ningxia to the highest value of 0.8938 in Shanghai. Generally, the EEE levels in eastern regions are higher than those in central and western regions.

  ii. If we take the 30 individual provincial administrative regions as a whole, the average EEE value during the whole sample period is approximately 0.7812, with a maximum of 0.8997 and a minimum of 0.6435. This average decreased slightly from 0.7858 in 2000 to 0.7726 in 2016.

  iii. The GDP/energy ratio improved from 0.6272 in 2000 to 1.1150 in 2016, and the GDP/undesirable output ratio increased from 0.3124 in 2000 to 0.5907 in 2016, which implies a reduction in energy consumption and pollutant emission per unit of GDP during 2000–2016.

  iv. On the assumption that only energy input is not fully efficiently used (in the SFA-based nonradial EEE model), the empirical results show that no substitution relationship exists between energy and non-energy inputs. The meaning of the coefficients of capital and labor in such models should be interpreted as representing the proportional relationship among different kinds of inputs in the production process.

  v. On the assumption that both energy and non-energy inputs are not fully efficiently used (in the SFA-based radial EEE model), the empirical results demonstrate a substitution relationship between energy and non-energy inputs. The reason for this difference with the results of the nonradial models may lie in the assumption that the SFA-based nonradial EEE model treats energy and non-energy inputs as heterogeneous factors, while the SFA-based radial EEE model treats them as homogeneous factors.

From the viewpoint of policy implications, our research shows the following:

  i. With the emphasis placed on environmental protection and energy conservation by the Chinese government, a reduction in energy consumption and CO2 emission per unit of GDP did occur during the first two decades of the twenty-first century. However, this improvement may have resulted from changes such as input factor substitution, technological progress or upgrading of the industrial structure, rather than from an increase in the EEE level.

  ii. The GDP/energy and pollutant emission per GDP unit indexes can reflect only one aspect of energy use efficiency, and they should be used in combination with more sophisticated indexes, such as the EEE index. This is because the former are influenced by factors other than pure energy efficiency. For that reason, the EIA (2009) suggests that energy intensity (the reciprocal of the GDP/energy ratio) does not necessarily reflect true energy efficiency.

  iii. Given the existence of positive externalities in environmental protection, plants producing undesirable output as a byproduct may be reluctant to invest in emission reduction in the absence of incentive or punishment mechanisms. Consequently, a win–win solution in terms of economic development and environmental protection should replace the outdated, GDP-oriented development philosophy, as the latter may weaken the incentive for local governments to rigidly enforce environmental protection-related laws. The new idea that “lucid waters and lush mountains are invaluable assets” advocated by the president of China represents a major change in development philosophy, and corresponding measures should be taken to put this idea into practice.