Continuous models combining slacks-based measures of efficiency and super-efficiency

In the framework of data envelopment analysis (DEA), Tone (Eur J Oper Res 130(3):498–509, 2001) introduced the slacks-based measure (SBM) of efficiency, which is a nonradial model that incorporates all the slacks of the evaluated decision-making units (DMUs) into their efficiency scores, unlike classical radial efficiency models. Next, Tone (Eur J Oper Res 143(1):32–41, 2002) developed the SBM super-efficiency model in order to differentiate and rank efficient DMUs, whose SBM efficiency scores are always 1. However, as pointed out by Chen (Eur J Oper Res 226(2):258–267, 2013), some interpretation problems arise when the so-called super-efficiency projections are weakly efficient, leading to an overestimation of the SBM super-efficiency score. Moreover, this overestimation is closely related to discontinuity issues when implementing SBM super-efficiency in conjunction with SBM efficiency. Chen (Eur J Oper Res 226(2):258–267, 2013) and Chen et al. (Ann Oper Res 278(1):101–121, 2019) treated these problems, but they did not arrive at a fully satisfactory solution. In this paper, we review these papers and propose a new complementary score, called composite SBM, that actually fixes the discontinuity problems by counteracting the overestimation of the SBM super-efficiency score. Moreover, we extend the composite SBM model to different orientations and variable returns to scale, and propose additive versions. Finally, we give examples and state some open problems.


Introduction
Data envelopment analysis (DEA) is a well known nonparametric mathematical programming technique, developed by Charnes et al. (1978) based on a previous work of Farrell (1957), which allows us to assess the relative efficiency of a homogeneous set of decision-making units (DMUs) which consume several inputs in order to produce a number of outputs. In the last two decades, there have been remarkable advances in both DEA methodologies and practical applications in a wide range of fields. Although there are several bibliographic reviews available (see, for instance, Seiford (1997), Tavares (2002), Emrouznejad et al. (2008), Cook and Seiford (2009)), the recent bibliographic compilation by Emrouznejad and Yang (2018), providing a full listing of more than 10000 DEA-related articles ranging from 1978 to late 2016, is noteworthy.
Roughly speaking, from the observed inputs and outputs and assuming no functional relationship between them, a DEA model estimates a best-practice frontier, also known as efficient frontier, with respect to which all DMUs are evaluated. In the original CCR (Charnes et al. 1978) and BCC (Banker et al. 1984) DEA radial models, the inputs are proportionally reduced while maintaining the outputs unchanged or the outputs are proportionally expanded keeping constant the inputs, depending on the orientation of the model. Shortly after, the additive model was introduced by Charnes et al. (1982) (see also Charnes et al. (1985)). This model could handle both input excesses and output shortfalls simultaneously, but it could not deliver an efficiency score as the ones obtained by the CCR and BCC radial models. To address this shortcoming, Pastor et al. (1999) and Tone (2001) proposed the enhanced Russell graph measure (ERGM) and the slacks-based measure (SBM) of efficiency respectively, which are equivalent. These models incorporate all the slacks of the evaluated DMUs into their efficiency scores, unlike classical radial efficiency models.
Usually, an efficiency DEA model classifies DMUs into two groups: efficient and inefficient. Efficient DMUs always have an efficiency score equal to 1, but are all efficient DMUs equally efficient? To discriminate between efficient DMUs and rank them, Andersen and Petersen (1993) proposed the so-called radial super-efficiency model, whose fundamental idea is to eliminate the DMU under evaluation from the reference set. Tone (2002) developed the SBM super-efficiency model, which consists of projecting the DMU under evaluation onto the subset of the production possibility set dominated by the DMU, and then estimating the distance between the original DMU and its projection. However, SBM super-efficiency has overestimation problems when the aforementioned projections are weakly efficient, as pointed out by Chen (2013). Moreover, this overestimation is closely related to discontinuity issues when implementing SBM super-efficiency together with SBM efficiency (see for example Chen (2013), Fang et al. (2013), Guo et al. (2017), Chen et al. (2019)), contrary to what happens with radial models. In the words of Chen (2013), this discontinuity or gap between the SBM efficiency and super-efficiency scores may lead to interpretation problems because of the sensitivity to small measurement errors or noise in the data. That is, an efficient DMU may become extremely SBM inefficient upon a small increase in inputs or a small decrease in outputs (and vice versa).
Since continuity is a very important and desirable property for DEA models (Robert Russell 1990; Scheel and Scholtes 2003), the joint SBM and the continuous SBM models were introduced by Chen (2013) and Chen et al. (2019) respectively, in order to solve the aforementioned discontinuity problems. However, although the joint SBM model was thought to be continuous at first, we show that, in fact, it is not always continuous. On the other hand, we also show that the continuous SBM model introduced by Chen et al. (2019) is not always weakly monotonic and can lead to conflicting scores in some cases. Moreover, we propose a new model, called composite SBM, that solves the discontinuity problems by counteracting the overestimation of the SBM super-efficiency score and is weakly monotonic. Nevertheless, this model presents some issues, such as nonlinearity or problems related to strong monotonicity.
This paper is organized as follows. In Sect. 2 we briefly introduce some general concepts and notation. In Sect. 3 we review the original SBM efficiency and super-efficiency models. In Sect. 4 we review the models presented by Chen (2013) and Chen et al. (2019) to address the aforementioned discontinuity issue. In Sect. 5 we present the composite SBM model, studying its main properties and giving some programs for computing its score. In Sect. 6 we extend the study to different orientations, variable returns to scale, zero or negative data, and weights. Moreover, we propose an additive version of the composite SBM model. In Sect. 7 we give some examples showing that the interpretation and discontinuity issues are fixed by the composite SBM model. Finally, in Sect. 8 we present some concluding remarks and state some open problems. For the sake of readability, the proofs of all the results presented in this work have been placed in Appendix A at the end of the paper.

Preliminaries
Notation and basic concepts are taken from Cooper et al. (2007). Vectors will be denoted by lowercase bold-face letters (either roman or greek), and they will be considered as one-column matrices when necessary. The elements of a vector will be denoted by the same letter as the vector, but unbolded and with subscripts. The 0-vector will be denoted by 0 and the context will determine its dimension. All definitions and results are within the framework of constant returns to scale. Variable returns to scale are discussed in Sect. 6.

Definitions
An activity with m inputs and s outputs is a pair of nonnegative vectors (x, y), where $x \in \mathbb{R}^m_+$ and $y \in \mathbb{R}^s_+$ are the input and output vectors, respectively. In this work, we assume that all activities are strictly positive and, therefore, the set of activities is identified with $\mathbb{R}^{m+s}_{>0}$. Nevertheless, we discuss the possibility of zero or negative data in Sect. 6.
Given a DMU that consumes m inputs and produces s outputs, it has an associated activity (x, y) given by the input vector $x = (x_1, \ldots, x_m)$ and the output vector $y = (y_1, \ldots, y_s)$, where $x_i$ is the amount of the ith input consumed by the DMU and $y_r$ is the amount of the rth output produced by the DMU, i = 1, ..., m, r = 1, ..., s. Therefore, we can identify a DMU with its activity (x, y) in the same way that a point is identified with its coordinates. It is very important to remark that, in this work, any element of $\mathbb{R}^{m+s}_{>0}$ is called an "activity", regardless of whether it is associated with an existing DMU or not.
Let $D = \{\mathrm{DMU}_1, \ldots, \mathrm{DMU}_n\}$ be a set of n DMUs, all of them having m inputs and s outputs. The corresponding input vectors $x_j$, j = 1, ..., n, can be arranged as the columns of the so-called m × n input data matrix X. Analogously, the output vectors $y_j$ form the columns of the s × n output data matrix Y. The production possibility set defined by D is a set of activities given by

$$P = \left\{ (\mathbf{x}, \mathbf{y}) \in \mathbb{R}^{m+s}_{>0} \;\middle|\; \mathbf{x} \ge X\boldsymbol{\lambda},\ \mathbf{y} \le Y\boldsymbol{\lambda},\ \boldsymbol{\lambda} \in \mathbb{R}^n_+ \right\}, \tag{1}$$

although it is also denoted by T (for Technology) in the literature. Given two activities (x, y) and (x′, y′), we say that (x, y) is dominated by (x′, y′) if x′ ≤ x and y′ ≥ y; in this case, we say that (x, y) is strictly dominated by (x′, y′) if (x′, y′) ≠ (x, y). The relation "to be dominated by" defines a partial order over the set of activities and establishes when an activity outperforms another in the sense that it consumes fewer inputs while producing more outputs. Moreover, the production possibility set (1) is formed by the activities that are dominated by positive combinations of DMUs of the form (Xλ, Yλ) with $\lambda \in \mathbb{R}^n_+$, and hence, it is interpreted as the set of "feasible activities" defined by D (Cooper et al. 2007).
Given a real-valued function f defined on a set of activities A, we say that f is weakly monotonic at (x, y) ∈ A if for any activity (x′, y′) ∈ A such that (x, y) is dominated by (x′, y′), we have that f(x, y) ≤ f(x′, y′). Moreover, we say that f is strongly monotonic at (x, y) ∈ A if f(x, y) < f(x′, y′) when (x, y) is strictly dominated by (x′, y′). We say that f is weakly monotonic on A if it is weakly monotonic at each activity in A, i.e. it is order-preserving. We say that f is strongly monotonic on A if it is strongly monotonic at each activity in A.
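The dominance relation and the definition of weak monotonicity can be illustrated with a short Python sketch. It is only an illustration: the names `dominated_by`, `strictly_dominated_by` and `weak_monotonicity_violations` are hypothetical, activities are represented as pairs of tuples (inputs, outputs), and the check inspects a finite sample, so it can refute weak monotonicity but never prove it.

```python
from itertools import permutations


def dominated_by(a, b):
    """True if activity a = (x, y) is dominated by b = (x', y'),
    i.e. x' <= x and y' >= y componentwise."""
    (x, y), (xp, yp) = a, b
    return (all(u <= v for u, v in zip(xp, x))
            and all(u >= v for u, v in zip(yp, y)))


def strictly_dominated_by(a, b):
    """Domination by a different activity."""
    return dominated_by(a, b) and a != b


def weak_monotonicity_violations(f, activities):
    """Pairs (a, b) in the sample with a dominated by b but f(a) > f(b)."""
    return [(a, b) for a, b in permutations(activities, 2)
            if dominated_by(a, b) and f(*a) > f(*b)]


# Two activities with two inputs and one output; b uses less of the first input.
a = ((2.0, 3.0), (1.0,))
b = ((1.0, 3.0), (1.0,))

# A toy score (total output over total input), used only for illustration.
def toy_score(x, y):
    return sum(y) / sum(x)

# A deliberately bad score that rewards input consumption.
def bad_score(x, y):
    return sum(x)

no_violations = weak_monotonicity_violations(toy_score, [a, b])
some_violations = weak_monotonicity_violations(bad_score, [a, b])
```

Here `no_violations` is empty, whereas `some_violations` contains the pair (a, b), since the second score assigns the dominated activity a higher value.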
We say that an activity or a DMU is efficient (with respect to a given set D of DMUs) if it is not strictly dominated by any positive combination of DMUs in D; otherwise, we say that it is inefficient. This concept of "efficiency" is equivalent to the classic "Pareto-efficiency" concept and it does not depend on any efficiency model. If P is the production possibility set defined by D, then any activity outside P is efficient. The set of efficient activities in P is known as the (strongly) efficient frontier (or Pareto-Koopmans frontier) of P, and we denote it by $\partial^S(P)$. It is clear that $\partial^S(P)$ is contained in the frontier of P, known as the weakly efficient frontier of P and denoted by $\partial^W(P)$. The inefficient activities in $\partial^W(P)$ are known as weakly efficient, although in fact they are not efficient.

Fig. 1 a) Classically, a model is applied to a DMU (called DMU$_o$) in a given set of DMUs in order to obtain its score. b) Given a model and a set of reference DMUs, we construct a score function defined on activities. The image of an activity (x, y) is the score that the model would assign to a new hypothetical DMU with activity (x, y)

Score functions and efficiency scores
Classically, given a set of DMUs, a model is applied to one of these DMUs in order to obtain, among other things, its score (efficiency, super-efficiency, etc.). But in this work, we are going to compute scores through what we call "score functions": given a model and a set D of DMUs (which we call reference DMUs), a score function (with respect to D) is a real-valued function defined on activities (i.e. from $\mathbb{R}^{m+s}_{>0}$ to $\mathbb{R}$), such that the image of (x, y) is the score that the model would assign to a new hypothetical DMU with activity (x, y), considering D ∪ {(x, y)} as the set of DMUs (see Fig. 1). There is a wide variety of models and hence, of score functions, but all of them must be at least weakly monotonic and satisfy some continuity properties, because similar activities must obtain similar scores in order to avoid sensitivity problems. The main advantage of this methodology is precisely that results about continuity, differentiability and monotonicity can be directly applied to score functions.
Efficiency measures (also called inefficiency measures) are the core of the DEA methodology. Restricted to inefficient activities, continuity is a property that any efficiency measure should satisfy, because discontinuities can produce serious interpretation problems (Robert Russell 1990; Scheel and Scholtes 2003). Moreover, monotonicity is also an important property that should be required. In this respect, strong monotonicity is the most desirable property, but we have to note that even weak monotonicity is an elusive property for efficiency measures. For example, Ando et al. (2012) proved that there does not exist any weakly monotonic efficiency measure that uses a p-norm least-distance approach to the closest projections over the efficient frontier. Later, Fukuyama et al. (2014) showed that ratio-form least-distance efficiency measures do not satisfy weak monotonicity either. Ando et al. (2017) gave a further discussion about monotonicity of minimum distance efficiency measures. Nevertheless, any efficiency measure should be at least weakly monotonic for the sake of interpretability.
Given a set D of reference DMUs, an efficiency score (with respect to D) is a score function that, applied to inefficient activities, represents an efficiency measure. Apart from being continuous in P (the production possibility set defined by D) and weakly monotonic, Fukuyama et al. (2014) pointed out some other desirable properties that any efficiency score f should satisfy:
Note that if (x, y) ∉ P, then it is efficient and hence, f(x, y) = 1. So, we cannot demand global continuity, because weakly efficient activities in $\partial^W(P)$ are inefficient and, according to property 1, their efficiency scores cannot be equal to 1. It is precisely the discontinuity of efficiency scores in $\partial^W(P)$ that leads to the discontinuity problems when implementing SBM efficiency in conjunction with SBM super-efficiency, as exposed by Chen (2013). Note that the classical radial efficiency score is globally continuous but it does not satisfy property 1, since inefficient activities in $\partial^W(P)$ have radial efficiency score equal to 1 (Charnes and Cooper 1962).

Original SBM models
In this section we review the SBM efficiency and super-efficiency models given by Tone (2001, 2002) and define their corresponding score functions. In the rest of the paper, we suppose that D is a set of n reference DMUs, X, Y are the input and output data matrices of D respectively, and P is the production possibility set defined by D. We define the SBM efficiency score (with respect to D) of an activity (x, y) as

$$\rho^*(\mathbf{x}, \mathbf{y}) = \min_{\boldsymbol{\lambda}, \mathbf{s}^-, \mathbf{s}^+} \frac{1 - \frac{1}{m}\sum_{i=1}^{m} s_i^-/x_i}{1 + \frac{1}{s}\sum_{r=1}^{s} s_r^+/y_r} \quad \text{s.t.} \quad \mathbf{x} = X\boldsymbol{\lambda} + \mathbf{s}^-,\ \mathbf{y} = Y\boldsymbol{\lambda} - \mathbf{s}^+,\ \boldsymbol{\lambda} \ge \mathbf{0},\ \mathbf{s}^- \ge \mathbf{0},\ \mathbf{s}^+ \ge \mathbf{0}, \tag{2}$$

if (x, y) ∈ P, and $\rho^*(x, y) = 1$ if (x, y) ∉ P. The vectors $s^-$, $s^+$ are called inefficiency slack vectors (Guo et al. 2017). Considering optimal inefficiency slack vectors $s^{-*}$, $s^{+*}$, we refer to the activities of the form $(x - s^{-*}, y + s^{+*})$ as efficient (or optimal) targets. Program (2) is based on the original SBM efficiency model given by Tone (2001), in the sense that $\rho^*(x, y)$ is the score that Tone's original SBM efficiency model would assign to a new DMU with activity (x, y). We conclude that $\rho^*$ is an efficiency score because it satisfies properties 1 and 2, it is weakly monotonic and $\rho^*|_P$ is clearly continuous. With respect to monotonicity, we have the next result:

Proposition 1
The SBM efficiency score $\rho^*|_P$ is strongly monotonic.
We define the SBM super-efficiency (S-SBM) score (with respect to D) of an activity (x, y) as

$$\delta^*(\mathbf{x}, \mathbf{y}) = \min_{\boldsymbol{\lambda}, \mathbf{t}^-, \mathbf{t}^+} \frac{1 + \frac{1}{m}\sum_{i=1}^{m} t_i^-/x_i}{1 - \frac{1}{s}\sum_{r=1}^{s} t_r^+/y_r} \quad \text{s.t.} \quad \mathbf{x} + \mathbf{t}^- \ge X\boldsymbol{\lambda},\ \mathbf{y} - \mathbf{t}^+ \le Y\boldsymbol{\lambda},\ \boldsymbol{\lambda} \ge \mathbf{0},\ \mathbf{t}^- \ge \mathbf{0},\ \mathbf{0} \le \mathbf{t}^+ < \mathbf{y}, \tag{3}$$

where $t^-$, $t^+$ are called super-efficiency slack vectors (Guo et al. 2017). Taking into account the SBM super-efficiency model given by Tone (2002), we have that $\delta^*(x, y)$ is the score that a new DMU with activity (x, y) would have. Note that in Tone's S-SBM program, the DMU under evaluation must be excluded from the original set of DMUs. However, in program (3), there is no need to make any exclusion. This is one of the advantages of working with score functions. The set of activities in P that are dominated by (x, y) is given by

$$\bar{P} = \left\{ (\bar{\mathbf{x}}, \bar{\mathbf{y}}) \in P \;\middle|\; \bar{\mathbf{x}} \ge \mathbf{x},\ \bar{\mathbf{y}} \le \mathbf{y} \right\}. \tag{4}$$

Hence, the constraints of (3) are equivalent to $(x + t^-, y - t^+) \in \bar{P}$ and then, according to Tone (2002), δ can be interpreted as a weighted $l_1$ distance from (x, y) to activities in $\bar{P}$. Note that although technically δ is not a distance in the mathematical sense, we are going to use the term "distance" in the same way that it is used in Tone (2002). In this case, if $t^{-*}$, $t^{+*}$ are optimal super-efficiency slack vectors for (3), then the activities $(x + t^{-*}, y - t^{+*})$ are the activities in $\bar{P}$ closest to (x, y), which are called super-efficiency projections of (x, y). Note that, since the distance between a point and a closed set is defined by the distance between the point and the closest point in the set, $\delta^*(x, y)$ can also be interpreted as a distance from (x, y) to $\bar{P}$. It is important to remark that, according to Chen (2013), an overestimation of the S-SBM score $\delta^*(x, y)$ is produced when (x, y) has weakly efficient (and hence inefficient) super-efficiency projections. This overestimation occurs because the inefficiency of such projections is not taken into account by $\delta^*(x, y)$. This fact is closely related to discontinuity issues when implementing SBM efficiency in conjunction with SBM super-efficiency, as we will see in Sect. 4.

Remark 1 (Strong monotonicity)
The S-SBM score $\delta^*$ is constantly equal to 1 for activities in P and hence, it is obvious that it is not strongly monotonic for inefficient activities. However, it is important to note that $\delta^*$ is not strongly monotonic for efficient activities either, because it does not take into account the inefficiency of super-efficiency projections. For example, let us consider a set D = {D1, D2, D3} of reference DMUs with two inputs and one normalized output, where D1 = ((10, 40), 1), D2 = ((15, 25), 1) and D3 = ((30, 20), 1). As we will see in Example 2, activities of the form $((30 + c, x_2), 1)$ with $0 < x_2 < 20$ have the same S-SBM score $\delta^*$ (with respect to D) when c ≥ 0 varies.
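The behaviour described in Remark 1 can be checked numerically with a small Python sketch. This is a rough illustration under stated assumptions, not the method used in the paper: the helper names `min_second_input` and `s_sbm_score` are hypothetical, the output slack is fixed at zero, the S-SBM objective is taken in the standard form of Tone (2002), which for two inputs reduces to the average of the ratios between projected and original inputs, and the frontier is handled as a piecewise linear envelope, which is valid for these three DMUs because they form a convex decreasing chain.

```python
# Reference DMUs of Remark 1: two inputs, one normalized output equal to 1.
REFERENCE = [(10.0, 40.0), (15.0, 25.0), (30.0, 20.0)]


def min_second_input(z1, dmus=REFERENCE):
    """Smallest z2 such that ((z1, z2), 1) is feasible, or None if z1 is too small.

    Piecewise linear envelope through the DMUs (sorted by first input),
    extended horizontally to the right of the last DMU."""
    pts = sorted(dmus)
    if z1 < pts[0][0]:
        return None
    for (a1, a2), (b1, b2) in zip(pts, pts[1:]):
        if a1 <= z1 <= b1:
            return a2 + (b2 - a2) * (z1 - a1) / (b1 - a1)
    return pts[-1][1]


def s_sbm_score(x1, x2, steps=2000, span=100.0):
    """Grid-search estimate of the S-SBM score of the activity ((x1, x2), 1).

    Candidates (z1, z2) are feasible activities with z1 >= x1 and z2 >= x2;
    the objective 1 + ((z1 - x1)/x1 + (z2 - x2)/x2) / 2 equals
    (z1/x1 + z2/x2) / 2."""
    best = float("inf")
    for k in range(steps + 1):
        z1 = x1 + span * k / steps        # candidate first input
        frontier = min_second_input(z1)
        if frontier is None:
            continue
        z2 = max(x2, frontier)            # cheapest feasible second input
        best = min(best, 0.5 * (z1 / x1 + z2 / x2))
    return best


# Activities ((30 + c, 10), 1) for several values of c.
scores = [s_sbm_score(30.0 + c, 10.0) for c in (0.0, 5.0, 20.0)]
# An activity inside the production possibility set.
inside = s_sbm_score(20.0, 30.0)
```

With $x_2 = 10$, every entry of `scores` equals 1 + (20 − 10)/(2 · 10) = 1.5 regardless of c, while the activity inside P obtains a score of 1, in line with the remark.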

Global continuous SBM models
From the SBM efficiency and super-efficiency models defined by Tone (2001) and Tone (2002) respectively, it is possible to construct a global SBM score function defined on the whole $\mathbb{R}^{m+s}_{>0}$ such that it coincides with $\rho^*$ for activities in the corresponding production possibility set P and with $\delta^*$ for the remaining activities (see Fang et al. (2013); Guo et al. (2017)). Nevertheless, as first shown by Chen (2013), this score is not continuous on the weakly efficient frontier $\partial^W(P)$, making it hard to interpret and justify the scores in applications. As also pointed out by Chen (2013), this discontinuity issue is closely related to the overestimation of the S-SBM score produced when the super-efficiency projections are weakly efficient. So, the idea to fix the discontinuity problem is to define a new model that penalizes the inefficiency of such projections in some way.
In this section we are going to review the joint SBM (J-SBM) and the continuous SBM (CSBM) models introduced by Chen (2013) and Chen et al. (2019) respectively. Both models try to solve the discontinuity problem but, unfortunately, they do not give a fully satisfactory solution for different reasons that we are going to show. Specifically, the J-SBM score is not continuous in some cases and the CSBM score, even being continuous, is not weakly monotonic in some cases. It must be taken into account that we have changed some notation to simplify and adapt it to our study.

The joint SBM model
Given D, the space of activities $\mathbb{R}^{m+s}_{>0}$ is split into three regions:
- Region (I) or technical inefficiency zone: $P \setminus \partial^S(P)$, i.e. inefficient activities.
- Region (II): efficient activities with all super-efficiency projections being efficient.
- Region (III): efficient activities with at least one inefficient super-efficiency projection.
With respect to the notation of Chen (2013), super-efficiency projections are called "S-SBM reference points" and, analogously for inefficient activities, efficient targets are called "SBM reference points". Given an activity (x, y), the J-SBM model assigns a score $\phi^*(x, y)$ that coincides with $\rho^*(x, y)$ if (x, y) is in Region (I) and is equal to $\delta^*(x, y)$ if (x, y) is in Region (II). For activities in Region (III), the author introduces a relaxed SBM super-efficiency model, referred to as model (5), that penalizes the inefficiency of super-efficiency projections. The reference points given by model (5) are those of the form $(x - \bar{s}^{-*}, y + \bar{s}^{+*})$ where $\bar{s}^{-*}$, $\bar{s}^{+*}$ are optimal. Note that $\bar{s}^-$, $\bar{s}^+$ are free slack vectors. Chen (2013) uses binary variables in order to express the J-SBM score in a single model, constructing a "switch" between the three different models from each region (see Equation (9) in Chen (2013)). However, in the way the model is expressed, the "switch" does not work correctly between Region (II) and Region (III). This mistake could be fixed by redefining the J-SBM score piecewise instead of using binary variables, although its computation would need several stages. Namely, in a first stage, we need to know to which region the activity belongs and then, in a second stage, we apply (2), (3) or (5) for activities in Region (I), (II) or (III), respectively.
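The two-stage piecewise evaluation just described can be sketched in Python as a simple dispatch. This is only a structural outline: `Region`, `piecewise_joint_score` and the four callables passed as arguments are hypothetical names, and the region classifier and the solvers for programs (2), (3) and (5) are assumed to be supplied by the user.

```python
from enum import Enum


class Region(Enum):
    I = 1    # inefficient activities
    II = 2   # efficient, all super-efficiency projections efficient
    III = 3  # efficient, at least one inefficient super-efficiency projection


def piecewise_joint_score(activity, classify, sbm, s_sbm, relaxed_s_sbm):
    """Two-stage evaluation of a piecewise-defined score.

    classify      -- maps an activity to a Region (stage 1)
    sbm           -- solver for the SBM efficiency program, used in Region (I)
    s_sbm         -- solver for the S-SBM program, used in Region (II)
    relaxed_s_sbm -- solver for the relaxed program, used in Region (III)
    """
    region = classify(activity)                       # stage 1
    solver = {Region.I: sbm,
              Region.II: s_sbm,
              Region.III: relaxed_s_sbm}[region]
    return solver(activity)                           # stage 2
```

Keeping the classifier separate from the solvers makes the region boundaries explicit, which is where the continuity of such a piecewise score has to be examined.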
Moreover, Corollary 1 in Chen (2013), which states that the reference points given by the J-SBM model are efficient, is not fulfilled. This issue is treated by Lin et al. (2018), giving a counterexample and a revised model.
Finally, and most importantly, according to Theorem 5 in Chen (2013), the J-SBM score is supposed to connect all three regions in a continuous way. Unfortunately, this is not true in some cases as we show in Example 2 (see Fig. 3 (b)), and this mistake can not be fixed. The reason for this to happen is that, sometimes, the relaxed model (5) gives a reference point quite far away from the evaluated activity. The discontinuity of the J-SBM score was recognised in a corrigendum paper by Chen (2014), ensuring that "it can be easily corrected by constraining the reference point to be fixated on a specific strongly Pareto-efficient point". However, the details of this proposition were not given and, as far as we know, there is not any further paper that clarifies it.

The continuous SBM model
Given an efficient activity (x, y) with optimal super-efficiency slack vectors $t^{-*}$, $t^{+*}$ for (3), the SBM efficiency score (with respect to D) of its super-efficiency projection $(x + t^{-*}, y - t^{+*})$ is given by program (6). The following propositions allow us to simplify program (6). Moreover, Proposition 4 implies that the programs given in (Chen et al. 2019, Equation (2)) and (Chen et al. 2019, Equation (4)) are equivalent.
Proposition 4 The objective function of (6) can be replaced by
According to Chen et al. (2019), we define the continuous SBM (CSBM) score (with respect to D) of an activity (x, y) by expression (8), where $t^{-*}$, $t^{+*}$ are optimal super-efficiency slack vectors for (3) and $s^{-*}$, $s^{+*}$ are optimal inefficiency slack vectors for (6). Note that (8) may not be well-defined if the optimal slacks $t^{-*}$, $t^{+*}$, $s^{-*}$ or $s^{+*}$ are not unique for the activity (x, y). In this case, the CSBM score may depend on which optimal slacks we choose. The CSBM model can calculate both SBM efficiency (for activities in Region (I)) and SBM super-efficiency (for activities in Region (II)) scores, and it is indeed continuous. Nevertheless, as we show in Example 1, it may not be weakly monotonic for activities in Region (III) and hence, in these cases, it is not a valid score.

Example 1
We are going to consider a set of DMUs used in Doyle and Green (1993) and Tone (2002) that consists of six efficient DMUs with four inputs and two outputs (see Table 1). In order to evaluate the first DMU, we are going to consider D = {D2, ..., D6} as the set of reference DMUs. Then, the S-SBM score (with respect to D) of D1 is 1.0116 and its super-efficiency projection is ((80, 627.89, 54, 8), (90, 5)), whose SBM efficiency score is 0.7299. The nonzero optimal slacks for programs (3) and (6) are $t_2^{-*} = 27.89$, $s_4^{-*} = 4.4$ and $s_2^{+*} = 1.82$, giving a CSBM score of 0.7397. Now, let us consider D1′, equal to D1 except for the first input, which is increased by one unit, from 80 to 81. The S-SBM score (with respect to D) of D1′ is 1.0103 with super-efficiency projection ((81, 624.82, 54, 8), (90, 5)), whose SBM efficiency score is 0.7428. The corresponding nonzero optimal slacks are $t_2^{-*} = 24.82$, $s_4^{-*} = 4.35$ and $s_2^{+*} = 1.63$, giving a CSBM score of 0.7517. Since D1′ is strictly dominated by D1 and CSBM(D1′) > CSBM(D1), we conclude that the CSBM score is not weakly monotonic in this case. Using the same technique as in the proofs of Propositions 1 and 2, it can be proved that this type of example can only appear when an input or output that does not have any associated optimal slack is altered. In our case, when the first input is worsened, the decrease in the S-SBM score is not able to compensate for the increase in the SBM efficiency score of the super-efficiency projection. According to Fukuyama et al. (2014), there does not exist any weakly monotonic efficiency measure that uses a ratio-form least-distance approach to the closest projections over the efficient frontier. It seems that something similar can happen with the CSBM score, since (8) is a ratio-form expression.
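The figures of Example 1 can be cross-checked with a few lines of Python. The sketch below assumes the standard SBM and S-SBM objective functions of Tone (2001, 2002). The function names are hypothetical, the activity of D1 is inferred from the reported projection and slack (second input 627.89 − 27.89 = 600), and only the rounded slacks quoted above are used, so the SBM efficiency values agree with the reported ones to about three decimals. The CSBM values are taken as reported, not recomputed.

```python
def sbm_objective(x, y, s_minus, s_plus):
    """SBM ratio (1 - mean(s_i^- / x_i)) / (1 + mean(s_r^+ / y_r))."""
    m, s = len(x), len(y)
    num = 1 - sum(si / xi for si, xi in zip(s_minus, x)) / m
    den = 1 + sum(sr / yr for sr, yr in zip(s_plus, y)) / s
    return num / den


def s_sbm_objective(x, y, t_minus, t_plus):
    """S-SBM ratio (1 + mean(t_i^- / x_i)) / (1 - mean(t_r^+ / y_r))."""
    m, s = len(x), len(y)
    num = 1 + sum(ti / xi for ti, xi in zip(t_minus, x)) / m
    den = 1 - sum(tr / yr for tr, yr in zip(t_plus, y)) / s
    return num / den


y = (90.0, 5.0)
d1 = (80.0, 600.0, 54.0, 8.0)         # inferred from the projection and t_2
d1_prime = (81.0, 600.0, 54.0, 8.0)   # first input increased by one unit

# S-SBM scores from the reported super-efficiency slacks.
delta_d1 = s_sbm_objective(d1, y, (0, 27.89, 0, 0), (0, 0))
delta_d1_prime = s_sbm_objective(d1_prime, y, (0, 24.82, 0, 0), (0, 0))

# SBM efficiency of the projections from the reported inefficiency slacks.
rho_proj_d1 = sbm_objective((80.0, 627.89, 54.0, 8.0), y,
                            (0, 0, 0, 4.4), (0, 1.82))
rho_proj_d1_prime = sbm_objective((81.0, 624.82, 54.0, 8.0), y,
                                  (0, 0, 0, 4.35), (0, 1.63))

# Reported CSBM scores: D1' is dominated by D1, so weak monotonicity
# would require CSBM(D1') <= CSBM(D1).
csbm_reported = {"D1": 0.7397, "D1'": 0.7517}
violates_weak_monotonicity = csbm_reported["D1'"] > csbm_reported["D1"]
```

The S-SBM score drops slightly from D1 to D1′ while the SBM efficiency of the projection rises, which is the mechanism behind the violation flagged by `violates_weak_monotonicity`.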

The composite SBM model
Example 1 shows that optimal slacks (and hence the SBM efficiency score) of super-efficiency projections do not serve to quantify the overestimation of the S-SBM score in some cases. Hence, for this purpose, we need scores that do not just take into account the super-efficiency projection. Following this idea, in Sect. 5.1 we are going to define a continuous score function γ that is equal to $\rho^*$ in Region (I) and coincides with $\delta^*$ in Region (II). Moreover, unlike the CSBM score, γ will always be weakly monotonic. Nevertheless, the computation of γ involves nonlinear programming and hence, in Sect. 5.2, we are going to study some computational aspects.

Definitions and properties
We define the composite SBM (CompSBM) score (with respect to D) of an activity (x, y) by expression (9), where $\bar{P}$ is the set of activities in P that are dominated by (x, y) (see (4)) and $\max \rho^*|_{\bar{P}}$ is the best (i.e. highest) SBM efficiency score of activities in $\bar{P}$, which can be interpreted as the SBM efficiency score of $\bar{P}$ as a set. Note that it is well-defined since $\bar{P}$ is closed. The idea behind the CompSBM score γ given by (9) is not to focus only on super-efficiency projections, but on the entire set $\bar{P}$: instead of interpreting $\delta^*(x, y)$ as a distance from (x, y) to its super-efficiency projection and penalizing the inefficiency of such projection, let us interpret $\delta^*(x, y)$ as a distance from (x, y) to $\bar{P}$ and penalize the inefficiency of $\bar{P}$, i.e. the fact that any activity in $\bar{P}$ is inefficient. These two points of view are not equivalent, as we will see in Remark 5.

Remark 2 (Unit-invariance)
In the J-SBM, CSBM and CompSBM models, we are implicitly assuming that slacks from different inputs and/or outputs can somehow compensate for each other, as pointed out and discussed by Chen (2013). In the case of the CompSBM model, according to expression (9), the optimal super-efficiency slacks of (x, y) (which are contained inside $\delta^*(x, y)$) are compensated by the optimal inefficiency slacks of the most efficient activity in $\bar{P}$ (which are contained inside $\max \rho^*|_{\bar{P}}$). For this reason, unit-invariance is a very important property that these models should satisfy. On the one hand, the J-SBM and CSBM models are proved to be unit-invariant (see Chen (2013) and Chen et al. (2019), respectively); on the other hand, the CompSBM model is unit-invariant because the SBM efficiency and super-efficiency models are unit-invariant (see Tone (2001, 2002)).
The next proposition clarifies the behavior of the CompSBM model, showing that it integrates the SBM efficiency $\rho^*$ and the S-SBM $\delta^*$ in its score.

Proposition 5
Let (x, y) be an activity and let $\bar{P}$ be the set given by (4). Then

Proposition 6
The CompSBM score γ is continuous.

Remark 3 (Super-inefficiency)
The CompSBM model is based on the SBM efficiency and super-efficiency models, providing a continuous score on the weakly efficient frontier. However, since there is continuity, inevitably there will be efficient activities (in Region (III)) with CompSBM scores less than 1 around the weakly efficient frontier, as also happens with the J-SBM and CSBM scores. Although this may seem a little counter-intuitive at first, Chen (2013) gives a clarifying example in his "Discussion and summary" section. Following the same criterion as Chen (2013), an efficient activity with score less than 1 is said to be super-inefficient. In fact, super-inefficiency is interpreted by Chen (2013) as a "hidden" inefficiency, and Chen et al. (2019) affirm that super-inefficiency is a new division for efficiency, different from existing studies such as SBM efficiency and SBM super-efficiency. It is important to note that super-inefficiency has to be interpreted under the assumption that slacks from different inputs and/or outputs can somehow compensate for each other (see Remark 2). In the case of the CompSBM score, an efficient activity is super-inefficient if the model estimates that the magnitude of its optimal super-efficiency slacks is less than the magnitude of the optimal inefficiency slacks of the most efficient activity in $\bar{P}$.
Following Chen (2013), the super-inefficiency zone is formed by all the super-inefficient activities; on the other hand, the super-efficiency zone is formed by all the efficient activities that are not in the super-inefficiency zone, i.e. with scores greater than or equal to 1. Each score (J-SBM, CSBM or CompSBM) can define a different super-inefficiency zone, but they are always in Region (III), around the weakly efficient frontier (see Fig. 2).
According to Remark 3, no global continuous SBM model can determine whether an activity is efficient or not, because super-inefficient activities have scores less than 1 but are efficient. For this reason, any score that produces super-inefficient activities has to be taken as a complement to the SBM efficiency and super-efficiency models. On the other hand, it could be interesting to define an alternative SBM super-efficiency score that penalizes the lack of efficient activities in $\bar{P}$ but does not produce super-inefficient activities. Hence, in Remark 4 we construct a super-efficiency score $\gamma_{se}$ based on γ such that efficient activities obtain scores greater than or equal to 1 (see (11)). However, as happens with the S-SBM score $\delta^*$, discontinuities inevitably appear when implementing $\gamma_{se}$ in conjunction with SBM efficiency, leading to serious interpretation problems related to sensitivity.

Remark 4 (Composite SBM super-efficiency score)
Lee (2021) defined a composite super-efficiency index $\bar{\sigma}^*$ such that efficient activities have scores greater than or equal to 1 and the inefficiency of super-efficiency projections is penalized. Given an activity (x, y), the corresponding score function (with respect to D) would have this form:

$$\bar{\sigma}^*(\mathbf{x}, \mathbf{y}) = \left( \delta^*(\mathbf{x}, \mathbf{y}) - 1 \right) \cdot \rho^*(\bar{\mathbf{x}}^*, \bar{\mathbf{y}}^*) + 1, \tag{10}$$

where $(\bar{x}^*, \bar{y}^*)$ is a super-efficiency projection of (x, y). However, $\bar{\sigma}^*$ may not be well-defined if super-efficiency projections are not unique and, more importantly, it is not weakly monotonic in some cases, as we show in this example: considering the set of DMUs of Table 1, the S-SBM score of D2 (with respect to {D1, D3, ..., D6}) is 1.4146 and the SBM efficiency of its super-efficiency projection is 0.3185, giving a $\bar{\sigma}^*$ score of 1.132; but increasing the fourth input of D2 from 1 to 1.5, the S-SBM score becomes 1.3528 and the SBM efficiency of the super-efficiency projection becomes 0.4392, which gives a $\bar{\sigma}^*$ score of 1.1549. In order to fix this, based on (9) and (10), we define the composite SBM super-efficiency (CompS-SBM) score (with respect to D) of an activity (x, y) by expression (11), which is always well-defined, unit-invariant, continuous and weakly monotonic. Moreover, it fulfils $\gamma_{se} = 1$ in Region (I), $\gamma_{se} = \delta^*$ in Region (II), and $1 \le \gamma_{se} \le \delta^*$ in Region (III), penalizing the lack of efficient activities in $\bar{P}$. It is important to remark that, although $\gamma_{se}$ fixes the overestimation of the S-SBM score, it obviously produces discontinuities on the weakly efficient frontier when implemented in conjunction with SBM efficiency, i.e. considering $\rho^*$ for inefficient activities (Region (I)) and $\gamma_{se}$ for efficient ones (Regions (II) and (III)).
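The arithmetic of the counterexample in Remark 4 can be reproduced with a short Python sketch. It assumes that the composite super-efficiency index has the form $(\delta^* - 1) \cdot \rho^* + 1$, with $\rho^*$ evaluated at the super-efficiency projection, as read from the expression above. The function name is hypothetical and the input values are the rounded scores reported in the text.

```python
def composite_super_efficiency_index(delta, rho_projection):
    """Index of the assumed form (delta - 1) * rho_projection + 1, where delta
    is the S-SBM score and rho_projection is the SBM efficiency score of the
    super-efficiency projection."""
    return (delta - 1.0) * rho_projection + 1.0


# D2 evaluated with respect to {D1, D3, ..., D6} (reported values).
sigma_d2 = composite_super_efficiency_index(1.4146, 0.3185)

# D2 with its fourth input increased from 1 to 1.5 (reported values).
sigma_d2_worse = composite_super_efficiency_index(1.3528, 0.4392)

# The modified D2 is dominated by the original one, so a weakly monotonic
# score could not assign it a strictly higher value.
index_increases = sigma_d2_worse > sigma_d2
```

Up to rounding, the two values agree with the reported 1.132 and 1.1549, and the index increases although the activity has worsened.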
Remark 5 (Strong monotonicity) We will show in Example 4 (specifically with D5) that there exist uncommon cases in which super-efficiency projections are inefficient but there are efficient activities in $\bar{P}$. In these cases, $\max \rho^*|_{\bar{P}} = 1$ and hence the CompSBM score $\gamma$ does not penalize the inefficiency of super-efficiency projections, unlike the J-SBM and CSBM scores. As a consequence, $\gamma$ is not strongly monotonic for efficient activities, as happens with the S-SBM score $\delta^*$ (see Remark 1), although for other reasons and with a much lower frequency of cases.
Remark 6 (Alternative composite scores) We can use any efficiency score $f$ instead of $\rho^*$ in the definition of the CompSBM score $\gamma$ (9). In this case, we need continuity of $f|_P$ and the fulfilment of properties 1 and 2 of Fukuyama et al. (2014) (see Sect. 2.2). Under these conditions, it is easy to prove that $\gamma$ is continuous and weakly monotonic, even if the efficiency score $f$ is not weakly monotonic. This makes it possible to use efficiency scores like the SBM-Max efficiency (Tone 2010, 2016), which is not weakly monotonic in some cases (Fukuyama et al. 2014; Ando et al. 2017). This could be interesting since there is a close connection between SBM-Max efficiency and SBM super-efficiency models (Tone 2017).

Computational aspects
Given an activity $(x, y)$, the set $\bar{P}$ is formed by all the activities in $P$ that are dominated by $(x, y)$ (see (4)). Hence, $\max \rho^*|_{\bar{P}}$ is the optimal value of the maximization program (12), where, according to (2), the objective function of (12) is in turn the optimal value of the minimization program (13). Note that the variables $\lambda$ in (12) and (13) are different internal variables of these programs. In fact, the constraints of (13) ensure that the constraints of (12) involving $\lambda$ are satisfied, and hence it suffices to demand nonnegativity of $t^-$, $t^+$ and $t^+ < y$ in (12). In this case, however, (13) may be infeasible for some small nonnegative values of $t^-$, $t^+$. The next result allows us to simplify (12).

Proposition 8 Let $(x, y)$ be an activity and let $\bar{P}$ be the set given by (4). Then $\max \rho^*|_{\bar{P}}$ is the optimal value of the maximin program (14).
The inner minimization program of (14) can be linearized using the Charnes-Cooper transformation (Charnes and Cooper 1962; Charnes et al. 1978). Note that in the outer maximization program we only demand nonnegativity of $t^-$, $t^+$, and hence the inner minimization program may be infeasible for some small values of $t^-$, $t^+$. If we want to avoid this, we must impose all the constraints of (12) in the outer program. Note also that program (14) can be viewed as a nonlinear maximization program, but it is also a continuous maximin problem with coupled constraints. Some methods for solving this kind of problem are provided by Shimizu and Aiyoshi (1980), Rustem et al. (2008) and Tsoukalas et al. (2009), among others.
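The nested structure discussed in this section can be sketched as follows. This is only a plausible reading, assuming the standard nonoriented SBM program of Tone (2001) under constant returns to scale, with $m$ inputs, $s$ outputs, input and output matrices $X$ and $Y$ of the DMUs in $D$, and slack vectors $s^-$, $s^+$; the symbol $\lambda'$ is introduced here only to distinguish the inner intensity vector from the outer one, and the exact form of the numbered programs may differ.

```latex
% Outer program: search over the activities of P dominated by (x, y)
\max_{t^-,\, t^+,\, \lambda} \; \rho^*\!\left(x + t^-,\; y - t^+\right)
\quad \text{s.t.} \quad
x + t^- \ge X\lambda, \quad
y - t^+ \le Y\lambda, \quad
\lambda \ge 0, \quad
t^- \ge 0, \quad
0 \le t^+ < y.

% Inner program: SBM efficiency of the displaced activity
\rho^*\!\left(x + t^-,\; y - t^+\right) =
\min_{\lambda',\, s^-,\, s^+} \;
\frac{1 - \frac{1}{m}\sum_{i=1}^{m} \dfrac{s_i^-}{x_i + t_i^-}}
     {1 + \frac{1}{s}\sum_{r=1}^{s} \dfrac{s_r^+}{y_r - t_r^+}}
\quad \text{s.t.} \quad
x + t^- = X\lambda' + s^-, \quad
y - t^+ = Y\lambda' - s^+, \quad
\lambda',\, s^-,\, s^+ \ge 0.
```

Written this way, the inner program is feasible exactly when the displaced activity belongs to $P$, which is why dropping the outer constraints on $\lambda$ can leave the inner program infeasible for small $t^-$, $t^+$.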

Remark 7 (Lower bound) In some cases, the computation of $\max \rho^*|_{\bar{P}}$ by means of (14) may be too expensive (see Example 4). In these cases, we can compute a lower bound given by $\rho^*(\bar{x}^*, \bar{y}^*)$, where $(\bar{x}^*, \bar{y}^*)$ is a super-efficiency projection of $(x, y)$. Then,

$$\gamma_{low}(x, y) = \delta^*(x, y) \cdot \rho^*(\bar{x}^*, \bar{y}^*) \qquad (15)$$

is a lower bound of the CompSBM score $\gamma(x, y)$. Usually, $\rho^*(\bar{x}^*, \bar{y}^*)$ is very close (or even equal) to $\max \rho^*|_{\bar{P}}$, and its computation only involves linear programs (applying the Charnes-Cooper transformation): (3) for the super-efficiency projection and (6) for its efficiency. On the other hand, Tsoukalas et al. (2009) propose an algorithm for solving continuous maximin problems that requires a lower bound, and hence $\rho^*(\bar{x}^*, \bar{y}^*)$ could serve for computing $\max \rho^*|_{\bar{P}}$ using this algorithm.
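The two linear programs behind the lower bound of Remark 7 can be sketched in Python with `scipy.optimize.linprog`. This is a minimal illustration rather than the computation used in the paper (which relies on R): it assumes the standard nonoriented, constant-returns-to-scale formulations of Tone (2001, 2002) after the Charnes-Cooper transformation, the function names `super_sbm`, `sbm_efficiency` and `gamma_low` are hypothetical, and the two reference DMUs and the evaluated activity are toy data invented for the sketch.

```python
import numpy as np
from scipy.optimize import linprog


def super_sbm(X, Y, x, y):
    """Nonoriented CRS super-efficiency score and projection of (x, y).

    X is (m, n) and Y is (s, n): one column per reference DMU.
    """
    x, y = np.asarray(x, float), np.asarray(y, float)
    m, n = X.shape
    s = Y.shape[0]
    N = 1 + n + m + s                  # variables: t, Lambda, Xbar, Ybar
    iX, iY = 1 + n, 1 + n + m          # offsets of Xbar and Ybar
    c = np.zeros(N)
    c[iX:iY] = 1.0 / (m * x)           # minimise (1/m) sum Xbar_i / x_i
    A_eq = np.zeros((1, N))
    A_eq[0, iY:] = 1.0 / (s * y)       # (1/s) sum Ybar_r / y_r = 1
    A_ub = np.zeros((2 * m + 2 * s, N))
    A_ub[:m, 1:iX] = X                 # X Lambda - Xbar <= 0
    A_ub[:m, iX:iY] = -np.eye(m)
    A_ub[m:m + s, 1:iX] = -Y           # Ybar - Y Lambda <= 0
    A_ub[m:m + s, iY:] = np.eye(s)
    A_ub[m + s:2 * m + s, 0] = x       # t x - Xbar <= 0
    A_ub[m + s:2 * m + s, iX:iY] = -np.eye(m)
    A_ub[2 * m + s:, 0] = -y           # Ybar - t y <= 0
    A_ub[2 * m + s:, iY:] = np.eye(s)
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(2 * m + 2 * s),
                  A_eq=A_eq, b_eq=[1.0], bounds=(0, None), method="highs")
    t = res.x[0]
    return res.fun, res.x[iX:iY] / t, res.x[iY:] / t


def sbm_efficiency(X, Y, x, y):
    """Nonoriented CRS SBM efficiency of an activity (x, y) belonging to P."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    m, n = X.shape
    s = Y.shape[0]
    N = 1 + n + m + s                  # variables: t, Lambda, S_minus, S_plus
    iM, iP = 1 + n, 1 + n + m          # offsets of S_minus and S_plus
    c = np.zeros(N)
    c[0] = 1.0
    c[iM:iP] = -1.0 / (m * x)          # minimise t - (1/m) sum S-_i / x_i
    A_eq = np.zeros((1 + m + s, N))
    b_eq = np.zeros(1 + m + s)
    A_eq[0, 0] = 1.0                   # t + (1/s) sum S+_r / y_r = 1
    A_eq[0, iP:] = 1.0 / (s * y)
    b_eq[0] = 1.0
    A_eq[1:1 + m, 0] = x               # t x - X Lambda - S- = 0
    A_eq[1:1 + m, 1:iM] = -X
    A_eq[1:1 + m, iM:iP] = -np.eye(m)
    A_eq[1 + m:, 0] = y                # t y - Y Lambda + S+ = 0
    A_eq[1 + m:, 1:iM] = -Y
    A_eq[1 + m:, iP:] = np.eye(s)
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    return res.fun


def gamma_low(X, Y, x, y):
    """Lower bound delta* . rho*(projection) of the composite score."""
    delta, x_proj, y_proj = super_sbm(X, Y, x, y)
    rho_proj = sbm_efficiency(X, Y, x_proj, y_proj)
    return delta, rho_proj, delta * rho_proj


# Toy data (hypothetical): two reference DMUs, two inputs, one output.
X = np.array([[1.0, 2.0], [2.0, 1.0]])   # columns: ((1, 2), 1) and ((2, 1), 1)
Y = np.array([[1.0, 1.0]])
delta, rho_proj, bound = gamma_low(X, Y, [3.0, 0.5], [1.0])
print(delta, rho_proj, bound)
```

In this toy case the evaluated activity $((3, 0.5), 1)$ projects onto $((3, 1), 1)$, which keeps one unit of slack in the first input, so the super-efficiency score of 1.5 is scaled down by the projection's SBM efficiency of 5/6.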

Extensions
In this section we extend the CompSBM model to different orientations and returns to scale. Moreover, we discuss nonpositive data and weighted inputs and/or outputs. Finally, we present a version adapted to the additive model.

Orientations. The CompSBM score is defined in a nonoriented form. Nevertheless, considering the input and output oriented versions of the S-SBM model (Tone 2002), we can adapt our CompSBM score to these orientations. In this way, the input oriented CompSBM score (with respect to $D$) of an activity $(x, y)$ is given by

$$\gamma_I(x, y) = \delta^*_I(x, y) \cdot \max \rho^*_I|_{\bar{P}},$$

where $\delta^*_I$ and $\rho^*_I$ are the input oriented versions of $\delta^*$ and $\rho^*$ respectively. It is easy to prove that $\max \rho^*_I|_{\bar{P}}$ can be computed by solving a linear continuous minimax problem with coupled constraints. Analogously, the output oriented CompSBM score (with respect to $D$) of $(x, y)$ is given by

$$\gamma_O(x, y) = \delta^*_O(x, y) \cdot \max \rho^*_O|_{\bar{P}},$$

where $\delta^*_O$ and $\rho^*_O$ are the output oriented versions of $\delta^*$ and $\rho^*$ respectively. Again, it is easy to prove that $\max \rho^*_O|_{\bar{P}}$ can be computed by solving a linear continuous minimax problem with coupled constraints.

Returns to scale. In this work, all definitions and results are within the framework of constant returns to scale. Nevertheless, we can modify the programs to consider different returns to scale. For example, for variable returns to scale in the CompSBM model we have to add the constraint $\sum_{j=1}^{n} \lambda_j = 1$ to programs (3) and (14). It is worth noting that, under variable returns to scale, nonoriented S-SBM models are always feasible (Tone 2002), but oriented models may be infeasible. In these cases, it is impossible to compute oriented CompSBM scores.
Zero or negative data. The CompSBM model can accept zero or negative data as long as the SBM efficiency and super-efficiency models accept them. In fact, how to deal with zeros in data is discussed in Tone (2001, 2002) and, more recently, how to handle nonpositive data in general is discussed in Tone et al. (2020) and Lee (2021).
Weights. We can consider different weights for each input and/or output. For example, we can compute the S-SBM score in (3) by means of a weighted objective function, where $w^-$, $w^+$ are the corresponding weight vectors. Analogously, the SBM efficiency model would also have to take these weights into account.

Additive model. Finally, we can adapt the CompSBM score to the additive model. Following Charnes et al. (1982), the additive efficiency score $\alpha^*$ (with respect to $D$) of an activity $(x, y)$ in unit-invariant form is defined by (16). Note that $\alpha^*$ is not an efficiency score satisfying properties 1 and 2 (see Sect. 2.2). In fact, an activity $(x, y)$ is efficient if and only if $\alpha^*(x, y) = 0$. On the other hand, following Du et al. (2010), we define the additive super-efficiency score $\beta^*$ (with respect to $D$) of $(x, y)$ by (17). If $(x, y)$ is inefficient, then $\beta^*(x, y) = 0$. So, from (16) and (17) we can define the additive composite score (with respect to $D$) of $(x, y)$ as

$$\gamma_{add}(x, y) = \beta^*(x, y) - \min \alpha^*|_{\bar{P}}.$$

Hence, $\gamma_{add}$ is negative for inefficient activities, and nonnegative for activities with efficient activities in $\bar{P}$.

Examples
In this section we illustrate the J-SBM, CSBM and CompSBM models with some examples. We have used R 3.6.0 (R Core Team 2020) for computations. Specifically, we have used the deaR package (Coll-Serrano et al. 2020).

Example 2
The super-inefficiency zones are represented for activities of the form $((x_1, x_2), 1)$, showing how discontinuity issues on weakly efficient activities are fixed by the CSBM and CompSBM scores, but not by the J-SBM score. Note that in cases where there is only one output (as in this example) the super-efficiency projections $(\bar{x}^*, \bar{y}^*)$ are among the most efficient activities in the corresponding set $\bar{P}$, and hence $\max \rho^*|_{\bar{P}} = \rho^*(\bar{x}^*, \bar{y}^*)$. Then, computing the CompSBM score $\gamma$ is equivalent to computing $\gamma_{low}$ (see (15)), whose program is linear. Activities of the form $((30 + c, x_2), 1)$ with $0 < x_2 < 20$ have the same S-SBM score $\delta^*$ as $c \ge 0$ varies. However, the super-efficiency projection is efficient only when $c = 0$, which in this case is equivalent to the fact that only when $c = 0$ are there efficient activities in the corresponding $\bar{P}$. Accordingly, the J-SBM, CSBM and CompSBM models penalize activities of the form $((30 + c, x_2), 1)$ with $0 < x_2 < 20$ and $c > 0$.

Figure 2

Table 2 SBM efficiency score ($\rho^*$), S-SBM score ($\delta^*$), SBM efficiency score of super-efficiency projections ($\rho^*(\bar{x}^*, \bar{y}^*)$), which coincides with the best SBM efficiency score in the corresponding $\bar{P}$ ($\max \rho^*|_{\bar{P}}$), and CompSBM score ($\gamma$) of activities A1, ..., A4 from Example 3

Example 3
In this example we illustrate how to introduce the CompSBM model into the Malmquist index computation. Note that this methodology is also applicable to the J-SBM and CSBM models. We consider the set of DMUs $D$ = {D1, D2, D3} of Example 2 and another DMU D4 whose activity changes from $A1 = ((30, 10), 1)$ to $A2 = ((40, 10), 1)$, while the activities of the DMUs in $D$ remain unchanged. In this way, using the SBM efficiency and super-efficiency models, we can compute the SBM Malmquist index (Tone 2004) of D4, which is the product of two factors: the catch-up and the frontier-shift. On the one hand, the catch-up (or recovery) is interpreted as the DMU's relative efficiency change, so catch-up values greater than 1 indicate progress, values less than 1 indicate regress, and a catch-up equal to 1 means no change. On the other hand, the frontier-shift (or innovation) is related to the technological change in the efficient frontiers and hence, analogously to the catch-up, values greater than, less than and equal to 1 indicate progress, regress and no change, respectively, of the efficient frontier with respect to the evaluated DMU. Nevertheless, we can also compute a Malmquist index using the CompSBM score (or the J-SBM or CSBM scores) instead of the S-SBM score. Table 2 shows the scores of A1 and A2, while Table 3 shows the results of the Malmquist index with no orientation and using the exclusive scheme (see Tone (2004) for details). The original SBM Malmquist index does not detect the catch-up, and its frontier-shift does not take into account the inefficiency of the super-efficiency projection of A2. These problems do not appear when the CompSBM score is used instead, which yields a catch-up smaller than 1 and a frontier-shift indicating less technological regress of the efficient frontier with respect to D4.
Now, let us consider that the activity of D4 changes from $A3 = ((50, 20), 1)$ (which is weakly efficient) to $A4 = ((50, 19), 1)$, while the activities of the DMUs in $D$ remain unchanged. Analogously, we can compute the SBM Malmquist index of D4 in its original form or using the CompSBM score instead. Table 2 shows the scores of A3 and A4, while Table 3 also shows the results of the Malmquist index with no orientation and using the exclusive scheme. The catch-up of the original SBM Malmquist index is overestimated due to the discontinuity of the SBM efficiency score on weakly efficient activities described in Example 2. This problem is again solved when the CompSBM score is used instead, resulting in a catch-up very close to 1.

Table 4 Nonoriented scores of DMUs from Example 4. The scores are as follows: S-SBM score ($\delta^*$), SBM efficiency score of super-efficiency projections ($\rho^*(\bar{x}^*, \bar{y}^*)$), the best SBM efficiency score in the corresponding $\bar{P}$ ($\max \rho^*|_{\bar{P}}$), J-SBM score, CSBM score, a lower bound of the CompSBM score ($\gamma_{low} = \delta^* \cdot \rho^*(\bar{x}^*, \bar{y}^*)$, see Remark 7), CompSBM score ($\gamma$), and CompS-SBM score ($\gamma_{se}$), all of them computed with respect to the set of reference DMUs given by all the DMUs excluding the evaluated DMU
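The catch-up and frontier-shift factors used in this example can be sketched in Python. This is a hedged illustration that assumes the usual Malmquist decomposition (catch-up as the ratio of within-period scores, frontier-shift as the geometric mean of the two frontier ratios); the function name `malmquist` and the four scores in the usage lines are hypothetical and are not taken from Tables 2 and 3. Any score (S-SBM, J-SBM, CSBM or CompSBM) can be plugged in.

```python
from math import sqrt


def malmquist(score_11, score_12, score_21, score_22):
    """Malmquist decomposition from four scores of one DMU.

    score_st is the score of the period-t activity evaluated against
    the period-s frontier (s, t in {1, 2}).
    """
    # Relative efficiency change between the two periods.
    catch_up = score_22 / score_11
    # Geometric mean of the frontier change seen from each activity.
    frontier_shift = sqrt((score_11 / score_21) * (score_12 / score_22))
    return catch_up, frontier_shift, catch_up * frontier_shift


# Hypothetical scores, chosen only to exercise the formulas.
catch_up, frontier_shift, index = malmquist(1.2, 0.9, 1.5, 1.0)
print(catch_up, frontier_shift, index)
```

Replacing the S-SBM scores by CompSBM scores changes only the four inputs of the function, which is why the same methodology carries over to the J-SBM and CSBM models.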

Example 4
We consider the same set of DMUs as in Example 1, used in Doyle and Green (1993) and Tone (2002), consisting of six efficient DMUs (power plant locations) with four inputs and two outputs (see Table 1). Table 4 shows nonoriented scores (although Tone (2002) only considers the input oriented scenario) and Table 5 shows the corresponding efficient targets of super-efficiency projections, with optimal slacks in parentheses. Note that the difference between the original DMU and the efficient target of a super-efficiency projection is given either by optimal super-efficiency slacks or by optimal inefficiency slacks (of the super-efficiency projection), separately in each input and output. This fact is shown in Table 5, where super-efficiency slacks are displayed in bold italic, inefficiency slacks in bold, and each input or output has either bold italic or bold slacks, but not both (see Proposition 3). It should also be noted that the S-SBM score only takes into account "bold italic slacks", ignoring "bold slacks". Note that, for D5, $\rho^*(\bar{x}^*, \bar{y}^*) < 1$ and it does not coincide with $\max \rho^*|_{\bar{P}}$, whose value is 1. Hence, D5 has super-efficiency projections that are inefficient, but there are also efficient activities in the corresponding $\bar{P}$. In this case, the CompSBM model does not penalize the inefficiency of such super-efficiency projections (because there are efficient activities in $\bar{P}$), contrary to the J-SBM and CSBM models. An important conclusion is that the CompSBM score is not strongly monotonic in some cases: D5 can be improved (specifically, the second input can be lowered to 525) while keeping the original S-SBM score and, since there are efficient activities in $\bar{P}$, the CompSBM score of D5 will not change.
We have used the NLopt package (Johnson 2019) for solving the nonlinear program (14) in the computation of $\max \rho^*|_{\bar{P}}$. Specifically, we have used the following derivative-free algorithms included in this package: the global algorithm DIRECT (Dividing RECtangles) (Jones et al. 1993), its "locally biased" version DIRECT-L (Gablonsky and Kelley 2001), and the local algorithm COBYLA (Constrained Optimization BY Linear Approximations) (Powell 1998). Computation time varies depending on the algorithm: 150 seconds for DIRECT, 20 seconds for DIRECT-L, and 2 seconds for COBYLA, using a 2 GHz processor. Nevertheless, computation time grows exponentially with the number of efficient DMUs, inputs and/or outputs, making it practically impossible to solve problems with, for example, more than 30 efficient DMUs with 5 inputs and 5 outputs.

Concluding remarks
The problem of ranking efficient DMUs continues to be an active issue that keeps generating new studies and methods within DEA (Jablonsky 2012; Zýková 2022). Since its inception in the papers of Pastor et al. (1999) and Tone (2001), the SBM super-efficiency model has proved to be an important tool for ranking efficient DMUs and has been widely used by DEA practitioners in recent years. However, some interpretation problems appear when super-efficiency projections are weakly efficient, since the SBM super-efficiency model does not take into account the inefficiency slacks of such projections and therefore some efficient DMUs cannot be properly ranked. Moreover, this fact is closely related to the discontinuities produced in the weakly efficient frontier when implementing SBM super-efficiency in conjunction with SBM efficiency. Authors like Chen (2013) and Chen et al. (2019) tried to solve these problems, but they did not arrive at a fully satisfactory solution. Nevertheless, their papers lay the foundations for future studies on the subject. In order to shed some light on this matter, we have introduced the CompSBM model, leading to the first example of a weakly monotonic score that integrates the SBM efficiency and S-SBM scores in a continuous way. Indeed, we have shown that it coincides with the S-SBM score when the DMUs are efficient with no inefficient super-efficiency projections, and with the SBM efficiency score when the DMUs are inefficient. Moreover, the CompSBM model gives a continuous ranking of DMUs, avoiding the abrupt changes in the scores shown by other models and hence solving the discontinuity problems in the weakly efficient frontier. It can also be adapted to other models such as the additive model or the SBM Malmquist index, and we can even use efficiency scores other than $\rho^*$ in the construction of a composite score, giving us a deeper insight into the evolution of the performance of DMUs.
The idea behind the CompSBM model is not to consider only the super-efficiency slacks, but also to penalize the lack of efficient activities in the set $\bar{P}$ onto which super-efficiency projections are projected. Since we demand continuity, super-inefficient activities (i.e. efficient activities with scores less than 1) inevitably appear around the weakly efficient frontier. Although it can be counter-intuitive at first, super-inefficiency is interpreted as a "hidden" inefficiency, assuming that slacks from different inputs/outputs can somehow compensate for each other. According to Chen et al. (2019), super-inefficiency is a new division for efficiency, different from existing notions such as SBM efficiency and SBM super-efficiency. Nevertheless, since super-inefficiency is a relatively new concept, it is not yet considered by some researchers, who prefer to deal with discontinuities rather than with super-inefficiencies, although discontinuities have serious interpretation problems related to sensitivity. Hence, we have also defined a new weakly monotonic SBM super-efficiency score (based on the CompSBM score and the work of Lee (2021)) that penalizes the lack of efficient activities in $\bar{P}$ without producing super-inefficient activities (see Remark 4). However, discontinuities obviously appear in the weakly efficient frontier when implementing this new super-efficiency score in conjunction with SBM efficiency.
To sum up, the CompSBM score:
- is continuous,
- is weakly monotonic, and
- allows ranking the efficient DMUs.

However, this score presents two difficulties, namely:
- its calculation requires solving nonlinear optimization problems, and
- it presents super-inefficiencies, which is the price we have to pay for having a global continuous score.
We believe that the methodology employed in the construction of the CompSBM score can help in the development of other models with better properties, for example models that are strongly monotonic (see Remark 5) or easier to compute. Moreover, there are some pending tasks that may be interesting, such as the use of the SBM-Max efficiency model in the construction of the composite score (see Remark 6), or the study of the potential infeasibility of oriented composite models under variable returns to scale. In our opinion, these and other questions could lead to future results that will help to increase the understanding of the DEA methodology.
Funding Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature.
Availability of data and material Not applicable.

Code availability
We have used R 3.6.0 (R Core Team 2020) for computations. Specifically, we have used the deaR package (Coll-Serrano et al. 2020) for computing linear scores, and the NLopt package (Johnson 2019) for solving the nonlinear program (14) in Example 4.

Proof of Proposition 6
We have that $\max \rho^*|_{\bar{P}}$ depends on $(x, y)$ through $\bar{P}$ (i.e. the set of activities in $P$ that are dominated by $(x, y)$). From (4), it is clear that $\bar{P}$ varies in a continuous way with respect to $(x, y)$. Moreover, since $\rho^*|_P$ is continuous and $\bar{P} \subseteq P$, we have that $\max \rho^*|_{\bar{P}}$ is continuous. Finally, since $\delta^*$ is continuous, we conclude that $\gamma = \delta^* \cdot \max \rho^*|_{\bar{P}}$ is also continuous.

Proof of Proposition 7
Let $(x, y)$ be an activity strictly dominated by $(x', y')$. Considering $\bar{P}$ and $\bar{P}'$ the sets of activities in $P$ that are dominated by $(x, y)$ and $(x', y')$ respectively, it holds that $\bar{P} \subseteq \bar{P}'$ and hence $\max \rho^*|_{\bar{P}} \le \max \rho^*|_{\bar{P}'}$. On the other hand, $\delta^*(x, y) \le \delta^*(x', y')$ since $\delta^*$ is weakly monotonic. So, $\gamma(x, y) \le \gamma(x', y')$ by (9).