A directional distance based super-efficiency DEA model handling negative data

Abstract This paper develops a new radial super-efficiency data envelopment analysis (DEA) model, which allows input–output variables to take both negative and positive values. Compared with existing DEA models capable of dealing with negative data, the proposed model can rank the efficient DMUs and is feasible no matter whether the input–output data are non-negative or not. It successfully addresses the infeasibility issue of both the conventional radial super-efficiency DEA model and the Nerlove–Luenberger super-efficiency DEA model under the assumption of variable returns to scale. Moreover, it can project each DMU onto the super-efficiency frontier along a suitable direction and never leads to worse target inputs or outputs than the original ones for inefficient DMUs. Additional advantages of the proposed model include monotonicity, units invariance and output translation invariance. Two numerical examples demonstrate the practicality and superiority of the new model.


Introduction
Data envelopment analysis (DEA) is a non-parametric technique for measuring the relative efficiency of a set of peer decision-making units (DMUs) with multiple inputs and outputs (Charnes et al, 1978). A weakness of traditional DEA models is that they assume that all inputs and outputs are non-negative. However, negative values, especially those of outputs, can arise in many situations. For example, the expected return is generally treated as an output measure for estimating the efficiency of mutual funds, and it might be negative for some funds. Likewise, profit, which is generally chosen as an output for measuring the efficiency of projects, might be negative for some projects. Hence, it is necessary to extend DEA models to accommodate negative data and thereby broaden their application.
There are several approaches to deal with negative data in DEA models. The simplest method is to treat negative inputs (outputs) as outputs (inputs). If inputs (outputs) are all non-positive, their absolute values can be treated as non-negative outputs (inputs), so that non-positive inputs (outputs) will decrease (increase) when those corresponding non-negative outputs (inputs) expand (Scheel, 2001). However, this method is not applicable if some input or output takes both positive and negative values. Another approach to handling negative data is to utilize the "translation invariance" property. A DEA model is translation invariant if the translated input-output data yield the same results as the original data. The variable returns to scale (VRS) additive DEA models are translation invariant (Ali and Seiford, 1990; Lovell and Pastor, 1995; Pastor, 1996), but they yield the "furthest" target on the production frontier for inefficient DMUs (Portela et al, 2004) and cannot provide any measure of efficiency. The output-oriented BCC model (Banker et al, 1984) is input translation invariant, and the input-oriented BCC model is output translation invariant. These two kinds of BCC models cannot be applied to the situation where negative values exist in both inputs and outputs. Based on a modified directional distance function (DDF) (Chambers et al, 1996), Portela et al (2004) develop a range directional measure (RDM) model, which can deal with inputs and/or outputs taking positive values for some DMUs and negative values for the others. However, the RDM model may be unbounded when the evaluated DMU has the maximum values for all the outputs and the minimum values for all the inputs (Cheng et al, 2013). Inspired by the RDM model, Sharp et al (2007) introduce a modified slack-based measure, which can deal with negative inputs and/or negative outputs. Emrouznejad et al (2010) propose a semi-oriented radial measure (SORM) to handle negative input-output data.
Kerstens and Woestyne (2011) recommend a generalized Farrell proportional distance function that handles negative data and maintains a proportional interpretation under mild conditions. Cheng et al (2013) find that the SORM model might lead to worse target inputs or outputs than the original ones for inefficient DMUs. They develop a variant of the traditional radial model where original values are replaced with their absolute values. Some imprecisions in Cheng et al (2013) are corrected by Kerstens and Woestyne (2014).
A limitation of the above DEA models is that they cannot further discriminate efficient DMUs, all of which have an efficiency score of unity. Andersen and Petersen (1993) develop a super-efficiency DEA model, see also Banker et al (1989), which can rank efficient DMUs. The input-oriented (output-oriented) super-efficiency DEA model excludes the DMU under evaluation from the reference set so that efficient DMUs may have efficiency scores larger (smaller) than or equal to one. The original super-efficiency DEA model is introduced under the condition of constant returns to scale (CRS) and is feasible if all inputs and outputs of DMUs are positive. However, the infeasibility issue might occur in VRS super-efficiency DEA models (Seiford and Zhu, 1999).
Many modified VRS radial super-efficiency DEA models (Chen, 2005; Ray, 2008; Cook et al, 2009; Lee et al, 2011) have been proposed to address the infeasibility issue. Among them, the VRS Nerlove-Luenberger super-efficiency DEA model (Ray, 2008) is based on the DDF and is very often feasible for non-negative data sets. However, this model fails in two exceptions (see Ray, 2008 for details). By choosing proper directions, Chen et al (2013) and Lin and Chen (2015) propose two DDF-based VRS super-efficiency DEA models to eliminate the infeasibility in these two exceptions. The model in Chen et al (2013) may be infeasible if zero data exist in outputs (Lin and Chen, 2015). All these modified super-efficiency DEA models are proposed for non-negative data. When negative inputs or outputs exist, the infeasibility issue remains. Based on the RDM model (Portela et al, 2004), Hadi-Vencheh and Esmaeilzadeh (2013) propose two super-efficiency models in the presence of negative data, called the super RDM+ model and the super RDM- model, respectively. Both models can rank efficient DMUs, but they are still infeasible in some cases. We will illustrate this point in Section 2.
In this paper, we propose a novel DDF-based VRS radial super-efficiency DEA model which is feasible and is able to handle negative data. There are at least five contributions in this article.
1. By choosing a proper direction for the DDF, we propose an alternative VRS radial super-efficiency DEA model. The proposed model not only successfully addresses the infeasibility problem in VRS radial super-efficiency DEA models, but also extends the application of the super-efficiency measure to negative data.
2. The proposed model projects each DMU onto the super-efficiency frontier along a suitable direction and provides improved targets for inefficient DMUs.
3. The proposed model yields a bounded measure of super-efficiency.
4. In the situation where outputs are all non-negative, the proposed model always generates reference points with non-negative outputs.
5. The proposed model is monotonic, units invariant and output translation invariant.
The rest of the paper is organized as follows. Section 2 presents the DDF and some existing directions, whose limitations are illustrated through a numerical example. Section 3 proposes a modified DDF, and based on it, we develop a new VRS radial super-efficiency DEA model capable of dealing with negative data, whose useful properties are also investigated in this section. In Section 4, the proposed model is applied to the numerical example in Section 2 and a data set from the literature, respectively, in order to demonstrate its properties and merits. Conclusions are presented in the last section.

DDF-based super-efficiency and directions
Assume that there are $n$ DMUs, each DMU has $m$ inputs and $s$ outputs, and each input and each output has at least one nonzero value. For each DMU$_j$ $(j = 1, \ldots, n)$, let $x_{ij}$ $(i = 1, \ldots, m)$ denote the $i$th input and $y_{rj}$ $(r = 1, \ldots, s)$ denote the $r$th output. Under the standard assumptions of convexity and free disposability of inputs and outputs (Chen et al, 2013), the production possibility set (PPS) for a target DMU$_p$ $(p \in \{1, \ldots, n\})$ with respect to super-efficiency is spanned by $(x_{ij}, y_{rj})$, $j = 1, \ldots, n$, $j \neq p$, as follows.
$$T_p = \left\{ (x, y) \;\middle|\; x_i \ge \sum_{j=1, j \neq p}^{n} \lambda_j x_{ij},\ i = 1, \ldots, m;\ \ y_r \le \sum_{j=1, j \neq p}^{n} \lambda_j y_{rj},\ r = 1, \ldots, s;\ \ \sum_{j=1, j \neq p}^{n} \lambda_j = 1;\ \ \lambda_j \ge 0,\ j \neq p \right\}$$
Choosing a direction vector $(g_x, g_y)$, the directional distance function (DDF) for DMU$_p$ with respect to $T_p$ is defined as:
$$D(x_p, y_p; g_x, g_y) = \max \{ \beta_p : (x_p - \beta_p g_x,\ y_p + \beta_p g_y) \in T_p \}.$$
Then, the following general DDF-based super-efficiency DEA model can be established.
$$\max\ \beta_p \tag{1}$$
$$\text{s.t.}\quad \sum_{j=1, j \neq p}^{n} \lambda_j x_{ij} \le x_{ip} - \beta_p g_x, \quad i = 1, \ldots, m; \tag{2}$$
$$\sum_{j=1, j \neq p}^{n} \lambda_j y_{rj} \ge y_{rp} + \beta_p g_y, \quad r = 1, \ldots, s; \tag{3}$$
$$\sum_{j=1, j \neq p}^{n} \lambda_j = 1; \quad \lambda_j \ge 0, \quad j = 1, \ldots, n,\ j \neq p. \tag{4}$$
Denote the optimum value of model (1)-(4) as $\beta^o_p$. The super-efficiency score of the evaluated DMU$_p$ can be determined as $1 - \beta^o_p$ (Ray, 2008). The smaller the value of $\beta^o_p$, the more efficient the DMU$_p$. For any efficient DMU$_p$, $1 - \beta^o_p$ is no less than 1.0.
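To make the constraint structure of model (1)-(4) concrete, the following minimal sketch checks whether a candidate pair of a distance value and an intensity vector satisfies constraints (2)-(4). The function name `ddf_feasible`, the list-of-lists data layout and the three-DMU data set are illustrative assumptions and are not taken from the paper; the sketch only verifies feasibility, and finding the optimum would additionally require a linear programming solver.

```python
def ddf_feasible(beta, lam, X, Y, p, gx, gy, tol=1e-9):
    """Check constraints (2)-(4) of the DDF-based super-efficiency model for DMU p.

    X[j][i] is input i of DMU j and Y[j][r] is output r of DMU j.
    lam[j] is the intensity weight of DMU j; lam[p] must be zero because
    the evaluated DMU is excluded from its own reference set.
    gx[i] and gy[r] are the components of the chosen direction vector.
    """
    n = len(X)
    # Non-negativity of the weights and exclusion of DMU p.
    if abs(lam[p]) > tol or any(w < -tol for w in lam):
        return False
    # Convexity (VRS) constraint (4).
    if abs(sum(lam) - 1.0) > tol:
        return False
    # Input constraints (2): reference input <= x_ip - beta * g_x.
    for i in range(len(X[p])):
        ref = sum(lam[j] * X[j][i] for j in range(n))
        if ref > X[p][i] - beta * gx[i] + tol:
            return False
    # Output constraints (3): reference output >= y_rp + beta * g_y.
    for r in range(len(Y[p])):
        ref = sum(lam[j] * Y[j][r] for j in range(n))
        if ref < Y[p][r] + beta * gy[r] - tol:
            return False
    return True


# Hypothetical data: three DMUs with one input and one output each.
X = [[2.0], [4.0], [3.0]]
Y = [[2.0], [4.0], [2.0]]

# DMU index 2 lies inside T_p, so a positive beta is attainable
# (direction chosen as the DMU's own data, as in the NL model).
inside = ddf_feasible(1 / 6, [0.75, 0.25, 0.0], X, Y, p=2, gx=X[2], gy=Y[2])

# DMU index 0 lies outside T_p, so only a negative beta is attainable.
outside = ddf_feasible(-0.5, [0.0, 0.0, 1.0], X, Y, p=0, gx=X[0], gy=Y[0])

print(inside, outside)
```

A negative feasible distance value for the second case corresponds to a super-efficiency score above one, in line with the remark that efficient DMUs have scores no less than 1.0.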
The direction vector $(g_x, g_y)$ should be non-negative and nonzero and can be chosen in an arbitrary way (Chen et al, 2013; Ray, 2008). Briec and Kerstens (2009a) point out that model (1)-(4) cannot guarantee feasibility if the direction is a constant vector and the output direction vector is nonzero. Hence, $g_x$ and $g_y$ are often taken to be functions of $x_{ip}$ and $y_{rp}$. If all input and output data are non-negative, the standard DDF for DMU$_p$ is adopted by choosing $(x_{ip}, y_{rp})$ as $(g_x, g_y)$ (Chambers et al, 1996), and the VRS Nerlove-Luenberger super-efficiency DEA model (Ray, 2008) (called the NL model for short) is obtained. The NL model is very often feasible for non-negative data, but it fails in the following two exceptions (Ray, 2008):
1. When a super-efficiency score is greater than 2.0, the NL model will yield a reference point with negative outputs. In applications where the outputs should be non-negative, such as the performance evaluation of airlines (Ray, 2008), a reference point with negative outputs results in a conceptual problem.
2. If a zero input exists in the evaluated DMU and all other DMUs in the reference set have positive values in that input, the NL model becomes infeasible.
In the context of the definition of a Luenberger productivity indicator, Briec and Kerstens (2009b) avoid reference points with negative outputs by adding a constraint that the projected output should remain positive. However, this method still cannot solve the infeasibility issue in the second exception. To eliminate the infeasibility in both exceptions, Lin and Chen (2015) put forward a modified DDF-based super-efficiency DEA model (called the LC model for short) by choosing $(x_{ip} + \max_{j = 1, \ldots, n,\ j \neq p} \{x_{ij}\},\ y_{rp})$ as $(g_x, g_y)$. The LC model successfully addresses the infeasibility issue in conventional VRS radial super-efficiency DEA models and the NL model under non-negative data.
In the presence of negative data, both the NL and LC models might be infeasible. This is because their direction vectors, $(x_{ip}, y_{rp})$ and $(x_{ip} + \max_{j = 1, \ldots, n,\ j \neq p} \{x_{ij}\},\ y_{rp})$, might have negative components, which could move the DMU$_p$ further away from the super-efficiency frontier and thus lead to infeasibility. To illustrate this problem, we consider a simple example with six DMUs. Each DMU has three inputs and two outputs. The data set of this example is shown in columns 2 to 6 of Table 1. Input X3 and output Y1 are negative for DMU 2 and DMU 1, respectively. The super-efficiency results yielded by the NL model and the LC model are shown in columns 7 and 8 of Table 1, respectively. Both the NL model and the LC model become infeasible for some DMUs.
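The sign problem described above can be checked mechanically before any model is solved. The sketch below builds the NL direction (the evaluated DMU's own inputs and outputs) and the LC direction (each input plus the largest value of that input among the other DMUs) and flags negative components. The function names and the three-DMU data set are hypothetical and are not the data of Table 1.

```python
def nl_direction(X, Y, p):
    """NL direction: the evaluated DMU's own inputs and outputs."""
    return list(X[p]), list(Y[p])


def lc_direction(X, Y, p):
    """LC direction: x_ip plus the largest peer value of input i; outputs as in NL."""
    peers = [j for j in range(len(X)) if j != p]
    gx = [X[p][i] + max(X[j][i] for j in peers) for i in range(len(X[p]))]
    return gx, list(Y[p])


def has_negative_component(gx, gy):
    """True if any component of the direction vector is negative."""
    return any(g < 0 for g in list(gx) + list(gy))


# Hypothetical data with negative values: two inputs and one output per DMU.
X = [[5.0, -1.0], [3.0, -4.0], [6.0, 2.0]]
Y = [[-8.0], [0.0], [4.0]]

for p in range(len(X)):
    nl = nl_direction(X, Y, p)
    lc = lc_direction(X, Y, p)
    print(p, nl, has_negative_component(*nl), lc, has_negative_component(*lc))
```

A flagged direction does not by itself prove infeasibility; it only shows that the requirement of a non-negative direction vector is violated for that DMU.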
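In order to rank DMUs in the presence of negative data, Hadi-Vencheh and Esmaeilzadeh (2013) propose the super RDM+ model and the super RDM- model. Let
$$P^-_{ip} = x_{ip} - \min_{j = 1, \ldots, n} \{x_{ij}\},\ i = 1, \ldots, m; \qquad P^+_{rp} = \max_{j = 1, \ldots, n} \{y_{rj}\} - y_{rp},\ r = 1, \ldots, s \tag{5}$$
(Portela et al, 2004). By employing $(P^-_{ip}, P^+_{rp})$ and $(1/P^-_{ip}, 1/P^+_{rp})$ as $(g_x, g_y)$ in model (1)-(4), we obtain the dual forms of the super RDM+ and super RDM- models that Hadi-Vencheh and Esmaeilzadeh (2013) proposed. However, these two models still fail in some cases. Let us consider DMU 6 in the above example. According to (5), we find $P^-_{16} = 9$, $P^-_{26} = 0$, $P^-_{36} = 2$, $P^+_{16} = 0$ and $P^+_{26} = 3$. Then for both the dual super RDM+ model and the dual super RDM- model, the constraint (3) with respect to the output Y1 of DMU 6 is expressed as
$$-8\lambda_1 + 4\lambda_3 + \lambda_4 + 3\lambda_5 \ge y_{16}, \tag{6}$$
where $y_{16}$ is the Y1 value of DMU 6 in Table 1. According to (4), we have $-8\lambda_1 + 4\lambda_3 + \lambda_4 + 3\lambda_5 \le 4$. This contradicts (6), because the Y1 value of DMU 6 exceeds 4. Therefore, both the dual super RDM+ model and the dual super RDM- model are infeasible for DMU 6. According to the duality theory of linear programs, the super RDM+ and the super RDM- models are infeasible or have an unbounded optimal value for DMU 6. In fact, these two models fail for DMUs 2 and 6, as one can see from the last two columns of Table 1.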
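The contradiction for DMU 6 can be reproduced numerically. When the output direction component is zero, constraint (3) asks a convex combination of the other DMUs' Y1 values to reach the Y1 value of DMU 6, and a convex combination can never exceed its largest term. In the sketch below, the first five Y1 values are the coefficients that appear in the inequality above (with a zero coefficient for DMU 2); the value 5.0 for DMU 6 is an assumed placeholder, chosen only so that it exceeds 4, and the range is computed as the column maximum minus the DMU's own value, following the range definition of Portela et al (2004).

```python
def convex_upper_bound(values, p):
    """Largest value a convex combination of the peers (all j != p) can attain."""
    return max(v for j, v in enumerate(values) if j != p)


# Y1 values of DMUs 1-5 read off the inequality in the text; the sixth is assumed.
y1 = [-8.0, 0.0, 4.0, 1.0, 3.0, 5.0]
p = 5  # zero-based index of DMU 6

# Output range of DMU 6 for Y1: zero because DMU 6 has the largest Y1 value.
range_plus = max(y1) - y1[p]

# With a zero direction component the right-hand side of constraint (3)
# equals y_rp for every value of beta.
rhs = y1[p]

# Best attainable left-hand side of constraint (3) over all convex weights.
bound = convex_upper_bound(y1, p)

constraint_can_hold = bound >= rhs
print(range_plus, bound, constraint_can_hold)
```

Because the check fails independently of the distance value, no choice of the remaining variables can repair it, which is the source of the infeasibility discussed above.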

Modified DDF-based super-efficiency model
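Considering that negative values might exist in the input-output data, we need to choose a new direction vector which is always non-negative and nonzero, regardless of whether the inputs and outputs are non-negative. To this end, we introduce the constants $a_i$ $(i = 1, \ldots, m)$ and $b_r$ $(r = 1, \ldots, s)$ defined in (7) and (8), where $k$ is a constant satisfying $k \ge 3$. Clearly, we have $x_{ip} + a_i > 0$ and $y_{rp} - b_r \ge 0$ for all $i = 1, \ldots, m$ and $r = 1, \ldots, s$, respectively. Therefore, we can choose $(x_{ip} + a_i,\ y_{rp} - b_r)$ as $(g_x, g_y)$ and then obtain the following VRS radial super-efficiency DEA model:
$$\max\ \beta_p \tag{9}$$
$$\text{s.t.}\quad \sum_{j=1, j \neq p}^{n} \lambda_j x_{ij} \le x_{ip} - \beta_p (x_{ip} + a_i), \quad i = 1, \ldots, m; \tag{10}$$
$$\sum_{j=1, j \neq p}^{n} \lambda_j y_{rj} \ge y_{rp} + \beta_p (y_{rp} - b_r), \quad r = 1, \ldots, s; \tag{11}$$
$$\sum_{j=1, j \neq p}^{n} \lambda_j = 1; \quad \lambda_j \ge 0, \quad j = 1, \ldots, n,\ j \neq p. \tag{12}$$
For the sake of distinction, we denote the optimum value of model (9)-(12) as $\beta^*_p$. The following proposition shows the feasibility of model (9)-(12).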
According to Proposition 1, no matter whether inputs and outputs are non-negative or not, for each DMU whose input-output bundle belongs to $T_p$, model (9)-(12) expands its outputs and reduces its inputs simultaneously to reach the super-efficiency frontier formed by the rest of the DMUs; for each DMU whose input-output bundle does not belong to $T_p$, model (9)-(12) reduces (expands) at least one of its outputs (inputs) to reach the super-efficiency frontier formed by the remaining DMUs.
Inspired by the criterion for identifying efficiency in conventional DEA models such as the CCR model (Charnes et al, 1978), we judge whether the evaluated DMU is efficient or not under model (9)-(12) by the following criterion: DMU$_p$ is inefficient if $\beta^*_p > 0$, or if $\beta^*_p = 0$ with nonzero slacks yielded by model (9)-(12); otherwise, it is efficient. Let $\lambda^*_j$ denote the optimal maximum slack solution (Cooper et al, 2007) of model (9)-(12). Then, the inputs and outputs of the projection on the super-efficiency frontier with respect to the DMU$_p$ can be expressed as
$$\hat{x}_{ip} = \sum_{j=1, j \neq p}^{n} \lambda^*_j x_{ij},\ i = 1, \ldots, m; \qquad \hat{y}_{rp} = \sum_{j=1, j \neq p}^{n} \lambda^*_j y_{rj},\ r = 1, \ldots, s.$$
Like traditional DEA models, model (9)-(12) might provide multiple projections for the evaluated DMU because there might exist multiple optimal solutions for $\lambda^*_j$. From Proposition 1 and constraints (10) and (11), we know that model (9)-(12) projects the DMU$_p$ onto the super-efficiency frontier formed by the rest of the DMUs along the direction $(x_{ip} + a_i,\ y_{rp} - b_r)$, without the actual data transformation. Concretely, for inefficient DMUs, it preserves the proportionate improvement property of the traditional DEA model and any projection it provides can be seen as an improved target for the evaluated DMUs; at least one of the outputs (or inputs) of projections is not larger (or not smaller) than that of the corresponding DMU due to the positive $\beta^*_p$ or $\beta^*_p = 0$ with all zero slacks. Thus, model (9)-(12) generalizes current VRS radial super-efficiency DEA models suitable for non-negative data to the situation with partially or fully negative data.
Similarly to the determination of the super-efficiency score in Ray (2008), the DDF-based super-efficiency score for DMU$_p$ is determined by $1 - \beta^*_p$ under model (9)-(12). From Proposition 1, we have the following corollary about the boundedness of the super-efficiency score determined by model (9)-(12).
We know from Ray (2008) that under model (9)-(12), the outputs of the reference point for DMU$_p$ are
$$\tilde{y}_{rp} = y_{rp} + \beta^*_p (y_{rp} - b_r), \quad r = 1, \ldots, s.$$
According to Proposition 1, we have
$$\tilde{y}_{rp} = y_{rp} + \beta^*_p (y_{rp} - b_r) \ge y_{rp}, \quad (x_{ip}, y_{rp}) \in T_p;$$
$$\tilde{y}_{rp} \ge y_{rp} - (y_{rp} - b_r) = b_r, \quad (x_{ip}, y_{rp}) \notin T_p.$$
Therefore, we have the following conclusion for $\tilde{y}_{rp}$:
Corollary 2 For the data set with non-negative outputs, $\tilde{y}_{rp} \ge 0$ holds for any DMU$_p$, $p \in \{1, \ldots, n\}$.
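The bound behind Corollary 2 can be illustrated with a few lines of arithmetic. The sketch below evaluates the reference outputs for several distance values no smaller than -1, which is the lower bound used in the derivation above for DMUs outside the production possibility set. The function name, the three-DMU output data and the listed distance values are hypothetical (they are not solved optima), and taking each constant as the smallest observed value of the corresponding output is an assumption consistent with the expression for these constants given in the monotonicity discussion.

```python
def reference_outputs(y_p, b, beta):
    """Outputs of the reference point: y_rp + beta * (y_rp - b_r) for each output r."""
    return [y + beta * (y - b_r) for y, b_r in zip(y_p, b)]


# Hypothetical non-negative output data: three DMUs with two outputs each.
Y = [[5.0, 3.0], [1.0, 4.0], [2.0, 0.0]]

# Assumption: b_r is the smallest observed value of output r.
b = [min(column) for column in zip(*Y)]

# Illustrative distance values, all no smaller than -1.
for beta in (-1.0, -0.5, 0.0, 0.25):
    ref = reference_outputs(Y[0], b, beta)
    # Each reference output stays at or above b_r, hence non-negative here.
    print(beta, ref, all(v >= b_r for v, b_r in zip(ref, b)))
```

At the extreme value of -1 the reference outputs coincide with the constants themselves, so the reference point cannot fall below the smallest observed output values.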
Corollary 2 shows that the conceptual problem described in Ray (2008) does not occur under our model. From Proposition 1 and Corollary 2, we know that the proposed model eliminates the infeasibility issue of the conventional VRS super-efficiency DEA model and the NL model and meanwhile extends the applicability of the VRS radial super-efficiency DEA model to the situation with negative input-output data. Furthermore, our model has the following three useful properties.
(i) Monotonicity Suppose that the inputs of DMU$_p$ are reduced to $x_{ip} - \Delta x_{ip}$ and the outputs of DMU$_p$ are increased to $y_{rp} + \Delta y_{rp}$, where $\Delta x_{ip} \ge 0$ and $\Delta y_{rp} \ge 0$ for all $i = 1, \ldots, m$ and $r = 1, \ldots, s$, respectively. Notice that here the input (output) data of the DMU$_p$ are decreased (increased). According to the principle by which we introduced the constants $a_i$ and $b_r$ in (7) and (8), these two constants should be determined by considering all possible values of inputs and outputs. Therefore, $a_i$ in this situation should be determined by (24), and $b_r$ by
$$b_r = \min \{ y_{rj},\ \forall j;\ y_{rp} + \Delta y_{rp} \} = \min_{j = 1, \ldots, n} \{ y_{rj} \}, \quad r = 1, \ldots, s. \tag{25}$$
The second equality in (25) holds due to the non-negativity of $\Delta y_{rp}$. With $a_i$ and $b_r$ in (24) and (25), we have the following conclusion:
Proposition 2 The optimal value of model (9)-(12) does not increase if inputs (outputs) of the DMU$_p$ are reduced (increased).
To ensure the input translation invariance, we should translate $a_i$ by the opposite amount of the corresponding input translation, as specified in (39). With (39), the constraints in (10) under the translated data are the same as the original constraints in (10) due to (12). Because of the adjustment in (39), we call this input translation invariance with parameter adjustment the generalized input translation invariance, for distinction. Although our model does not satisfy the traditional input translation invariance, it is generalized input translation invariant.
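As a worked check of this statement, suppose that input $i$ of every DMU is translated by a constant $t_i$ and that $a_i$ is replaced by $a_i - t_i$. The symbol $t_i$ is notation assumed here for illustration. Substituting into the input constraint of the proposed model, whose direction component is $x_{ip} + a_i$, and using the convexity constraint (the weights $\lambda_j$ sum to one) gives the following chain of equivalences:

```latex
\begin{aligned}
\sum_{j \neq p} \lambda_j (x_{ij} + t_i)
  &\le (x_{ip} + t_i) - \beta_p \bigl[ (x_{ip} + t_i) + (a_i - t_i) \bigr] \\
\Longleftrightarrow \quad
\sum_{j \neq p} \lambda_j x_{ij} + t_i
  &\le x_{ip} + t_i - \beta_p (x_{ip} + a_i) \\
\Longleftrightarrow \quad
\sum_{j \neq p} \lambda_j x_{ij}
  &\le x_{ip} - \beta_p (x_{ip} + a_i).
\end{aligned}
```

The translation cancels on both sides only because the adjusted constant absorbs it inside the direction component, which is why the invariance holds in the generalized sense rather than the traditional one.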

Numerical examples
In this section, two numerical examples are used to show the applicability and merits of the proposed model.
Example 1 To show the properties of the proposed model and meanwhile to compare it with the NL model (Ray, 2008), the LC model (Lin and Chen, 2015), the super RDM+ model and the super RDM- model (Hadi-Vencheh and Esmaeilzadeh, 2013), we apply the proposed model to the data set in Table 1 from Section 2. In this paper, we set $k = 3$. Then according to (7) and (8), we have $a_1 = 27$, $a_2 = 3$, $a_3 = 6$, $b_1 = -8$ and $b_2 = 0$ for the data set in Table 1. The proposed model is feasible for each of the six DMUs. The resulting optimal value is shown in the second column of Table 2. According to the super-efficiency scores, these six DMUs are ranked as column 4 of Table 2 shows. The last five columns of Table 2 show the inputs and outputs of the projection of each DMU on the super-efficiency frontier.
For this data set (as well as the data set in Example 2), our model yields a unique projection for each DMU. It is easy to see from the projection results that the inefficient DMU 4 should generate lower target inputs and higher target outputs than the original values in order to reach the super-efficiency frontier, whereas the other DMUs should reduce (expand) at least one of their outputs (inputs) to reach the super-efficiency frontier.
For the data set in Table 1, we assume that all the inputs are scaled by a factor of 0.001 and the two outputs are scaled by a factor of 0.01. Then according to (7) and (8), we have $a_1 = 0.027$, $a_2 = 0.003$, $a_3 = 0.006$, $b_1 = -0.08$ and $b_2 = 0$ for the scaled data. By solving problem (9)-(12) with the scaled data, we find that all the resulting super-efficiency scores are still equal to those in column 3 of Table 2. This confirms the theoretical result about the units invariance of the proposed model.
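The invariance observed here can also be read off the input constraint of the proposed model. The following is a sketch under the assumption that rescaling input $i$ by a factor $c_i > 0$ rescales $a_i$ by the same factor, as the values 27 and 0.027 reported for the first input suggest; the symbol $c_i$ is notation introduced for illustration.

```latex
\begin{aligned}
\sum_{j \neq p} \lambda_j (c_i x_{ij})
  &\le c_i x_{ip} - \beta_p (c_i x_{ip} + c_i a_i) \\
\Longleftrightarrow \quad
\sum_{j \neq p} \lambda_j x_{ij}
  &\le x_{ip} - \beta_p (x_{ip} + a_i), \qquad c_i > 0.
\end{aligned}
```

The same division by a positive factor applies to the output constraint when the output constant is rescaled together with the output, so the feasible set in the distance value and the weights, and hence the optimal value, is unchanged.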
Assume the input and output values of all DMUs are translated as in (40). Then according to (39) and (37), we obtain the adjusted constants $a_i$ and $b_r$ in (41). With $a_i$ and $b_r$ in (41), we solve problem (9)-(12) under the translated data and find that all the resulting super-efficiency scores are still equal to those in column 3 of Table 2. This confirms the usual output translation invariance and the generalized input translation invariance of the proposed model. It is worth noting that by the translation in (40), the input-output values of every DMU become non-negative. The first input is zero for DMU 2, and it is positive for other DMUs. By solving the NL model under these translated data, we find that it is infeasible for DMU 2 and the resulting super-efficiency score for DMU 1 is 2.5, which results in a negative referenced output for DMU 1. Therefore, the infeasibility issue in the two exceptions of the NL model, mentioned in Section 2, occurs under the translated data. In contrast, the proposed model is feasible under the translated data for all DMUs, and we have from (23) that the outputs of referenced points for individual DMUs are $\tilde{y}'_{11} = 11.680$, $\tilde{y}'_{12} = 10.183$, $\tilde{y}'_{13} = 12.932$, $\tilde{y}'_{14} = 10.903$, $\tilde{y}'_{15} = 11.532$, $\tilde{y}'_{16} = 13.000$, $\tilde{y}'_{21} = 0.320$, $\tilde{y}'_{22} = 0.000$, $\tilde{y}'_{23} = 0.000$, $\tilde{y}'_{24} = 0.000$, $\tilde{y}'_{25} = 0.383$ and $\tilde{y}'_{26} = 0.000$, which are non-negative. This confirms the conclusion of Corollary 2. From Proposition 1, Corollary 2 and the above analysis, we can safely say that for non-negative data, the proposed model is feasible and ensures the non-negativity of the referenced outputs for all DMUs. So, for the non-negative data set, the infeasibility issue of the NL model (Ray, 2008) does not occur for model (9)-(12).
Example 2 In order to further show the practicability of our model, we consider the data set in Sharp et al (2007). This data set has 13 DMUs with two inputs and three outputs.
The detailed data are shown in columns 2-6 of Table 3. It is easy to see from Table 3 that just one input (cost) and one output (saleable) are non-negative and other data are non-positive. For this example, we have $a_1 = 32.4$, $a_2 = 6.96$, $b_1 = 0.49$, $b_2 = -1.42$ and $b_3 = -3.79$ according to (7) and (8). By solving the proposed model for all DMUs, we obtain the resulting super-efficiency scores, which are shown in column 7 of Table 3. It is easy to see that the proposed model is feasible for all DMUs. Compared with DEA models handling negative data (Portela et al, 2004; Sharp et al, 2007; Emrouznejad et al, 2010; Cheng et al, 2013), an advantage of the proposed model is that it can differentiate the performance of efficient DMUs. This superiority comes from an inherent characteristic of super-efficiency DEA models. According to the resulting super-efficiency scores, all DMUs are ranked as column 8 of Table 3 shows. Table 4 shows the target input-output values of inefficient DMUs, determined by the proposed model. Under the proposed model, each inefficient DMU should reduce its inputs and expand its outputs in order to move toward the super-efficiency frontier. Therefore, the proposed model can provide improved target inputs and outputs for all inefficient DMUs.
From the theoretical analyses and the above two examples, we can conclude that the proposed model can deal with data sets containing negative values and can provide improved targets for inefficient DMUs. In addition, our model is units invariant, output translation invariant, generalized input translation invariant and monotonic. For the non-negative data set, the proposed model fully eliminates the infeasibility issue of the NL model. Therefore, the proposed model successfully addresses the infeasibility problem occurring in conventional VRS radial super-efficiency DEA models and the NL model. More importantly, unlike current DEA models handling negative data, the proposed model can rank efficient DMUs.
Figure 1 The change of the optimal value of model (9)-(12) for DMU 1.

Conclusions
Super-efficiency modelling in the presence of negative data is a rather neglected issue in the DEA field. The existing super-efficiency models capable of handling negative data might be infeasible in some cases. By choosing appropriate direction variables in the DDF, this paper develops a DDF-based VRS radial super-efficiency DEA model for dealing with negative data. Compared with existing related models, the proposed model is feasible no matter whether the input-output data are non-negative or not, and meanwhile, it can rank all DMUs. It can project each DMU onto the super-efficiency frontier along a suitable direction and can provide improved targets for inefficient DMUs. It possesses good properties such as monotonicity, units invariance, output translation invariance and generalized input translation invariance. Moreover, it successfully eliminates the infeasibility issue occurring in the two exceptions of the NL model. In summary, the proposed model not only overcomes the infeasibility issue of VRS super-efficiency DEA models, but also extends current VRS radial super-efficiency DEA models suitable for non-negative data to the situation with partially or fully negative data.
The new super-efficiency DEA model is developed by utilizing the directional distance function, and the resulting super-efficiency scores are not equal to those yielded by traditional radial super-efficiency DEA models if the latter models are feasible and the input-output data are non-negative. As for future research, we will propose an alternative radial super-efficiency DEA model which not only overcomes the above drawback, but also keeps all advantages of the proposed super-efficiency DEA model.