Robust data envelopment analysis via ellipsoidal uncertainty sets with application to the Italian banking industry

This paper extends the conventional DEA models to a robust DEA (RDEA) framework by proposing new models for evaluating the efficiency of a set of homogeneous decision-making units (DMUs) under ellipsoidal uncertainty sets. Four main contributions are made: (1) we propose new RDEA models based on two uncertainty sets: an ellipsoidal set that models unbounded and correlated uncertainties and an interval-based ellipsoidal uncertainty set that models bounded and correlated uncertainties, and study the relationship between the RDEA models of these two sets, (2) we provide a robust classification scheme where DMUs can be classified into fully robust efficient, partially robust efficient and robust inefficient, (3) the proposed models are extended to the additive DEA model and its efficacy is analyzed with two imprecise additive DEA models in the literature, and finally, (4) we apply the proposed models to study the performance of banks in the Italian banking industry. We show that few banks which were resilient in their performance can be robustly classified as partially efficient or fully efficient in an uncertain environment.


Introduction
Data envelopment analysis (DEA) is a nonparametric optimization model for assessing the relative performance of a set of peer decision-making units (DMU) with multiple inputs and multiple outputs. Specifically, the DEA model is based on linear programming to measure the efficiency of the i-th DMU under evaluation relative to the other DMUs of the set. The model was initially proposed under the constant returns to scale assumption of a firm's production activity by Charnes et al. (1978) and later extended to the variable returns to scale by Banker et al. (1984). The multiple advantages of the DEA make it an important analysis tool for benchmarking in various scientific areas such as operational research, decision analysis, management science, social science, etc. However, the learning procedure through which DMUs are benchmarked against each other implicitly assumes precision in data and ignores the uncertainties and noise inherent in the inputs and outputs. Hence, the traditional models as proposed in Charnes et al. (1978) and Banker et al. (1984) can be practically unsuitable in application. For instance, in many applications, some of the data are only known within specified bounds or in ordered relations while others are described vaguely such that the real values are unknown or uncertain. These uncertainties when neglected can affect the reliability of the efficiency scores and the stability of management decisions.
The classical approaches to dealing with uncertainty in management decisions are stochastic programming and sensitivity analysis (see Land et al. 1993; Olesen and Petersen 1995; Cooper et al. 1998). To address the imprecision and uncertainty that raise feasibility and optimality concerns in DEA, many researchers adopt advanced deterministic models such as imprecise DEA, fuzzy DEA and robust DEA (RDEA). We refer the reader to Zhu (2003) and Hatami-Marbini et al. (2011) for extensive reviews of these approaches.
The RDEA approach was proposed by Sadjadi and Omrani (2008) to deal with uncertainties in input and output data. The approach immunizes the uncertain input and output data of DMUs within a user-defined uncertainty set and provides a probability guarantee for reliable efficiency scores, robust discrimination and ranking of DMUs. The RDEA is based on the robust optimization (RO) technique, which was initially introduced by Soyster (1973) and extended by Ben-Tal and Nemirovski (1998, 2000) and El Ghaoui et al. (1998), among others. The theory and applications of RO are reviewed in Bertsimas et al. (2011). A good background and review of RO in DEA is also provided in Mensah (2019). When applying RO to DEA, two main techniques are used to characterize the uncertainties in the input and output data: scenario-based and uncertainty set-based techniques. The former was proposed by Mulvey et al. (1995) and researched further by Laguna (1998). Its applications to DEA include the works of Zahedi-Seresht et al. (2017) and Esfandiari et al. (2017). The scenario-based RO, however, has the drawback of changing the structure of the deterministic program.
On the other hand, uncertainty set-based techniques have been developed in Ben-Tal and Nemirovski (1998, 2000) and Bertsimas and Sim (2004). The general concept of this approach rests on formulating an alternative optimization model, called the robust counterpart, which accounts for all or most possible realizations of the uncertain parameters in a decision-maker-defined uncertainty set. Broadly speaking, a major modeling concern of RO in this direction is the design of a tractable robust formulation for the nominal problem that guarantees constraint feasibility with high probability. To deal with model tractability and the conservativeness of robust solutions, different robust concepts have been proposed in the past few years: reliability of the robust solution (Ben-Tal and Nemirovski 2000), price of robustness (Bertsimas and Sim 2004), adjustable robustness (Ben-Tal et al. 2004) and light robustness (Fischetti and Monaci 2009; Mensah and Rocca 2019), among others. These concepts are largely based on two main sets: the ellipsoidal uncertainty sets of Ben-Tal and Nemirovski (1999, 2000) and a family of polyhedral uncertainty sets, otherwise called the budget of uncertainty, of Bertsimas and Sim (2004). These sets ensure model tractability and offer the decision maker the ability to control the conservativeness of the robust solution. The main disadvantage of the former, however, is that the nominal linear optimization model is transformed into a nonlinear problem, which can be computationally demanding. The approach of Bertsimas and Sim (2004), in contrast, preserves the linearity of the nominal problem. As a result, the concept has been applied to different classes of DEA problems. For instance, Shokouhi et al. (2010) proposed a general RDEA model in which inputs and outputs are constrained in an uncertainty set with data uncertainties covering the interval DEA approach.
The authors applied the robust approach of Bertsimas and Sim (2004) and Monte Carlo simulation to compute the range of Gamma values for the conformity of the ranking of the DMUs. Omrani (2013) introduced an RDEA to find the common set of weights (CSW) in DEA with uncertain data under a similar uncertainty set. In a related paper, Salahi et al. (2016) developed an optimistic RO approach for the CSW in DEA. Arabmaldar et al. (2017) proposed a robust super-efficiency DEA model. Toloo and Mensah (2019) studied the computational complexity of RDEA models in their reduced form using the concept of the budget of uncertainty. It is noteworthy that RDEA models with ellipsoidal uncertainty seem to be relatively unexplored. To the best of our knowledge, Sadjadi and Omrani (2008), Lu (2015), Wu et al. (2017), Salahi et al. (2018) and Lu et al. (2019) are the only researchers who have made advances in RDEA considering uncertainty in an ellipsoid. Most of these studies are, however, limited to output data uncertainty because of the difficulty of handling input data uncertainty in the equality normalization constraint. Uncertain constraints of the DEA models must be strictly satisfied to obtain a feasible solution for the RDEA counterpart. The issue of the equality constraint in RDEA is addressed in Toloo and Mensah (2019) and considered in this paper. This paper focuses on the ellipsoidal uncertainty sets introduced in Ben-Tal and Nemirovski (1999, 2000) to identify inefficiencies of DMUs using the risk preference of the decision maker (DM). From the mathematical point of view, the ellipsoidal uncertainty set provides a convenient entity and offers the decision maker the ability to control the conservativeness of the efficiency solution to different data perturbations via the semi-axes of the ellipsoid.
These sets are also practically useful for modeling correlation (if it exists) among the input (output) data, which is relevant to prevent the effect of correlation on the efficiency mean (Farzipoor Saen et al. 2005). To be more specific, we adopt an ellipsoidal uncertainty set to model an unbounded distribution of uncertainties, while input and output data with a bounded random distribution are modeled with an interval-based ellipsoidal uncertainty set. Another contribution of this paper is a classification scheme based on the proposed models. The scheme allows DMUs to be classified into fully robust efficient, partially robust efficient and robust inefficient. We further extend our robust approach to the non-radial additive model, where a newly proposed robust additive model is compared with the peer imprecise additive models proposed in Lee et al. (2002) and Matin et al. (2007).
The structure of the paper is as follows. Section 2 will provide the background of the DEA and RDEA models. This is followed by RDEA models developed from the two ellipsoidal uncertainty sets in Sect. 3. A robust classification scheme and a numerical example are also given in this section. Section 4 will extend the RDEA approach to a robust additive DEA model and will compare it to some imprecise DEA models in the literature. The penultimate section illustrates the applicability of the RDEA models with banking studies in Italy. Finally, Sect. 6 will provide conclusions and further research.

The DEA models
Consider n DMUs indexed as $j = 1, \dots, n$, where each DMU$_j$ consumes m inputs $x_j = (x_{1j}, \dots, x_{mj})$ to produce s outputs $y_j = (y_{1j}, \dots, y_{sj})$, denoted by $(x_j, y_j) \in \mathbb{R}^{m+s}$. Charnes et al. (1978) proposed the following fractional DEA program, which maximizes the ratio of the weighted sum of outputs to the weighted sum of inputs of a unit subject to the condition that the same ratio for all units is less than or equal to unity:

$$
\begin{aligned}
\max\ & \theta = \frac{\sum_{r=1}^{s} u_r y_{ro}}{\sum_{i=1}^{m} v_i x_{io}} \\
\text{s.t.}\ & \frac{\sum_{r=1}^{s} u_r y_{rj}}{\sum_{i=1}^{m} v_i x_{ij}} \le 1, \quad j = 1, \dots, n, \\
& u_r, v_i \ge 0, \quad r = 1, \dots, s,\ i = 1, \dots, m,
\end{aligned}
\qquad (1)
$$

where $u_r$ and $v_i$ are the respective rth output and ith input weights. Charnes et al. (1978) further reduced the above nonlinear CCR model to the following linear form:

$$
\begin{aligned}
\max\ & \theta = \sum_{r=1}^{s} u_r y_{ro} \\
\text{s.t.}\ & \sum_{i=1}^{m} v_i x_{io} \le 1, \\
& \sum_{r=1}^{s} u_r y_{rj} - \sum_{i=1}^{m} v_i x_{ij} \le 0, \quad j = 1, \dots, n, \\
& u_r, v_i \ge 0, \quad r = 1, \dots, s,\ i = 1, \dots, m.
\end{aligned}
\qquad (2)
$$

Model (2) involves n + 1 constraints and m + s decision variables (weights), with an objective function that estimates the efficiency of DMU$_o$; it is solved n times, once for each DMU under evaluation. It must be noted here that the "≤" sign is used for the normalization constraint rather than the usual "=" sign of the standard CCR model. The inequality sign addresses the issue of the equality constraint in RDEA. In other words, model (2) overcomes the situation where input uncertainty and robust analysis in the standard normalization constraint $\sum_{i=1}^{m} v_i x_{io} = 1$ could restrict the constraint and possibly make the model infeasible (Ben-Tal et al. 2009). For details on the equality constraint in RDEA, see Toloo and Mensah (2019). It is also worth mentioning that the CCR model with either the "≤" or the "=" sign in the first constraint yields equivalent efficiency scores for the DMUs (see Toloo 2014). Thus, suppose the optimal solution of model (2) is $(\theta^*, v^*, u^*)$. The efficiency of DMUs is given by the following definition.
Definition 1 DMU$_o$ is CCR efficient if $\theta^* = 1$ and there exists at least one strictly positive optimal solution (i.e., $v_i^* > 0\ \forall i$ and $u_r^* > 0\ \forall r$); otherwise, it is CCR inefficient.
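As a minimal sketch of how model (2) scores DMUs, consider the special case of a single input and a single output (m = s = 1), an assumption made here only to avoid calling a linear programming solver. In that case the optimal weights are v = 1/x_o and u = v · min_j (x_j / y_j), so the efficiency of DMU_o equals its output-to-input ratio divided by the best ratio in the sample. The function name and the four data points are hypothetical and are not taken from the paper.

```python
def ccr_efficiency_1x1(inputs, outputs):
    """CCR efficiency of every DMU for one input and one output.

    With m = s = 1, the linear CCR model reduces to
        theta_o = (y_o / x_o) / max_j (y_j / x_j),
    because the optimal weights are v = 1 / x_o and u = v * min_j (x_j / y_j).
    """
    if len(inputs) != len(outputs):
        raise ValueError("inputs and outputs must have the same length")
    if any(x <= 0 for x in inputs) or any(y <= 0 for y in outputs):
        raise ValueError("data must be strictly positive")
    ratios = [y / x for x, y in zip(inputs, outputs)]
    best = max(ratios)  # productivity of the best-practice DMU(s)
    return [r / best for r in ratios]


# Hypothetical data for four DMUs (not taken from the paper).
x = [2.0, 4.0, 5.0, 8.0]
y = [2.0, 3.0, 5.0, 4.0]
scores = ccr_efficiency_1x1(x, y)
```

DMUs with a score of 1 lie on the efficient frontier; with several inputs or outputs the same scores would have to be obtained by solving the linear program once per DMU.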

Robust counterpart DEA
Model (2) shows the case where the inputs and outputs are deterministic, i.e., nominal data are used. As mentioned above, the model is not useful when the inputs and outputs are subject to uncertainty. In this section, we employ the RO technique and develop an RDEA model to address this nondeterministic data problem. The common technique of the RDEA is to consider the worst-case scenario for the uncertain input and output data and to trade off performance against robustness as different scenarios occur. Let the uncertain inputs and outputs be expressed as $\tilde{x}_{ij} = x_{ij} + \eta^x_{ij}\hat{x}_{ij}$ and $\tilde{y}_{rj} = y_{rj} + \eta^y_{rj}\hat{y}_{rj}$, where $\hat{x}_{ij} = \epsilon_x x_{ij}$ and $\hat{y}_{rj} = \epsilon_y y_{rj}$ are deviations from the nominal values $x_{ij}$ and $y_{rj}$, $\eta^x_{ij}$ and $\eta^y_{rj}$ are the perturbations, and $\epsilon_x$ and $\epsilon_y$ are given uncertainty levels (percentages of perturbation) of the inputs and outputs, respectively. By definition, DMU$_k$ with $(\tilde{x}_k, \tilde{y}_k)$ is uncertain if there exists $i \in I_k$ or $r \in R_k$, where $I_j$ and $R_j$ represent the sets of inputs and outputs of DMU$_j$ that are subject to uncertainty. That is, $I_j = \emptyset$ and $R_j = \emptyset$ represent the case where there is no uncertainty. All the uncertainties are subject to a known set $U$ called the uncertainty set. Therefore, considering the uncertain inputs and outputs, we can impose constraints on the uncertain DEA model such as $\tilde{x}_{ij}, \tilde{y}_{rj} \in U$. The robust counterpart to the uncertain DEA model is the following:

$$
\begin{aligned}
\max\ & w \\
\text{s.t.}\ & w - \sum_{r=1}^{s} u_r \tilde{y}_{ro} \le 0, \\
& \sum_{i=1}^{m} v_i \tilde{x}_{io} \le 1, \\
& \sum_{r=1}^{s} u_r \tilde{y}_{rj} - \sum_{i=1}^{m} v_i \tilde{x}_{ij} \le 0, \quad j = 1, \dots, n, \\
& u_r, v_i \ge 0, \quad \forall\, (\tilde{x}_{ij}, \tilde{y}_{rj}) \in U,
\end{aligned}
\qquad (3)
$$

where $U$ is ellipsoidal and is defined by a non-singular matrix of perturbations $P$ and a safety (robustness) parameter $\Omega$ chosen by the DM. In this formulation, the objective value is positive since the weight variables are assumed to be nonnegative. Thus, the objective in model (2) is converted to a constraint in model (3) that involves the uncertain data of the DMU$_o$ under evaluation. This additional constraint carries no index j, since it is fixed at the outset for DMU$_o$ and is tied to the objective of the model.
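The perturbation scheme described above can be illustrated numerically. The sketch below assumes that each uncertain entry equals its nominal value plus a scalar perturbation (called `eta` here and assumed to lie in [-1, 1]) times a maximal deviation, and that the maximal deviation is a fixed fraction (`level`) of the nominal value. The names `perturbed`, `data_range`, `level` and `eta`, and the value 200, are illustrative only.

```python
def perturbed(nominal, level, eta):
    """Return nominal + eta * (level * nominal) for a perturbation eta in [-1, 1]."""
    if not -1.0 <= eta <= 1.0:
        raise ValueError("eta is assumed to lie in [-1, 1]")
    deviation = level * nominal  # maximal deviation from the nominal value
    return nominal + eta * deviation


def data_range(nominal, level):
    """Lowest and highest realizations of one uncertain entry."""
    return perturbed(nominal, level, -1.0), perturbed(nominal, level, 1.0)


# Hypothetical input value with a 5% uncertainty level.
low, high = data_range(200.0, 0.05)
```

With `eta = 0` the nominal value is recovered, which corresponds to the deterministic model.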
The ellipsoidal uncertainty set is considered for two main reasons: first, to adjust the risk tolerance of the DM by controlling the size of the ellipsoid via the parameter $\Omega$, i.e., as the size of the ellipsoid increases, the risk aversion of the DM increases and vice versa; second, to overcome the excessive conservatism of the robust solution (cf. Soyster 1973). It should be mentioned here that the robust counterpart for the ellipsoidal sets is nonlinear; however, its formulation is tractable. In other words, the robust counterpart leads to a second-order cone (conic quadratic) program, which can be solved in polynomial time by many solution algorithms and by solvers such as GUROBI in GAMS.

RDEA models under ellipsoidal uncertainty sets
We distinguish between two kinds of uncertainty, including randomness inherent in the input and output data: unbounded and bounded correlated uncertainties. We consider the following uncertainty sets:
1. the usual ellipsoidal uncertainty set;
2. the box (interval)-based ellipsoidal uncertainty set.
The first set models uncertainties that have unbounded distribution while the second set models uncertainties with bounded random distribution. Below, we discuss in detail these uncertainty sets and their RDEA formulation.

The usual ellipsoid case
We begin with the simplest case, where $U_e$ is a usual ellipsoid, or a constraint-wise uncertainty in which the uncertainty set $U_e$ of every constraint is an ellipsoid. We describe the ellipsoid (4), in which the vector $a^o \in \mathbb{R}^n$ is the center of the ellipsoid, the matrix $\Delta$ with $\operatorname{Rank}(\Delta) = m \le n$ is the shape matrix of the ellipsoid and $u$ is the random variable. This representation can handle different cases of the ellipsoid, including "ellipsoidal cylinders" and "flat" ellipsoids such as points and intervals (Calafiore and El Ghaoui 2004). An alternative description involves the squared shape matrix $P = \Delta\Delta^T$, for which, when $P \succ 0$, we obtain an equivalent representation of (4). Figure 1 shows such an ellipsoid in $\mathbb{R}^2$ (shaded) with center $a^o$ and axis lengths determined by $\lambda_i$ and $\nu_i$, which are, respectively, the eigenvalues and eigenvectors of the symmetric positive definite matrix $P$. For all DMUs, the simple ellipsoid (5), where $(\tilde{x}, \tilde{y}) \in \mathbb{R}^{m+s}$, is described in terms of $\hat{x}_j = (\hat{x}_{ij})$ and $\hat{y}_j = (\hat{y}_{rj})$, the deviation vectors defining the deviation of inputs (outputs) from their nominal values. Following Wu et al. (2017), the weight vectors $u$ and $v$ are mapped accordingly. To formulate the robust counterpart of the DEA under the uncertainty set (4), the following lemma on the worst-case robust counterpart is important.

Lemma 1 Consider the linear inequality $a^T x \le b_i$, where the vector $a$ is uncertain and belongs to the ellipsoid (4). Its robust counterpart is obtained by maximizing the left-hand side over the ellipsoid.
Proof See, e.g., Bertsimas et al. (2011).

Theorem 1 The robust counterpart of the CCR model under the ellipsoidal set (5) leads to the nonlinear model (6).

Proof Using Lemma 1 and the CCR model (1), the robust counterpart DEA with the ellipsoidal uncertainty set (5) is first formulated as model (7). In this formulation, the n constraints in model (1) are split into one constraint for DMU$_o$ and n − 1 constraints for the remaining DMUs. This ensures that the maximal efficiency of DMU$_o$ does not exceed unity but attains it at the optimum, given also that the data used in the first constraint of model (7) are reused in the second constraint. Model (7) is then converted from the fractional program to a linear form by employing the Charnes and Cooper (1962) transformation, with $t = 1 / \sum_{i=1}^{m} v_i \tilde{x}_{io} > 0$ and the weights $u_r$ and $v_i$ replaced by $t u_r$ and $t v_i$ in the first constraint. Consequently, the DEA model (8) is obtained. Next, we solve the inner problems in model (8). The last term of the first constraint, recast in objective function form, yields the robust counterpart (9); the second term of the second constraint yields the robust counterpart (10); and the third term of the fourth constraint yields the robust counterpart (11). Substituting the results from (9) to (11) into model (8) for $i \in I_j$ and $r \in R_j$ completes the proof. □
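The worst-case reasoning behind Lemma 1 can be checked numerically. The sketch below assumes the ellipsoid is parameterized as a = a0 + Δu with ‖u‖₂ ≤ Ω, which is a standard form and not necessarily the exact notation of set (4). Under that assumption the largest value of aᵀx equals a0ᵀx + Ω‖Δᵀx‖₂, so the uncertain inequality aᵀx ≤ b holds for every a in the ellipsoid exactly when that expression is at most b. All function names and numbers are hypothetical.

```python
import math
import random


def worst_case_lhs(a0, delta, x, omega):
    """Closed-form maximum of a^T x over a = a0 + delta u with ||u||_2 <= omega."""
    nominal = sum(ai * xi for ai, xi in zip(a0, x))
    m = len(delta[0])
    # delta is an n-by-m list of rows; form the m-vector delta^T x.
    dtx = [sum(delta[i][k] * x[i] for i in range(len(x))) for k in range(m)]
    return nominal + omega * math.sqrt(sum(v * v for v in dtx))


def sampled_lhs(a0, delta, x, omega, trials=2000, seed=0):
    """Largest a^T x found by sampling u uniformly on the sphere of radius omega."""
    rng = random.Random(seed)
    m = len(delta[0])
    best = -math.inf
    for _ in range(trials):
        g = [rng.gauss(0.0, 1.0) for _ in range(m)]
        norm = math.sqrt(sum(v * v for v in g))
        u = [omega * v / norm for v in g]
        a = [a0[i] + sum(delta[i][k] * u[k] for k in range(m))
             for i in range(len(a0))]
        best = max(best, sum(ai * xi for ai, xi in zip(a, x)))
    return best


# Hypothetical two-dimensional instance.
a0 = [3.0, 1.0]
delta = [[0.3, 0.0], [0.1, 0.2]]
x = [1.0, 2.0]
exact = worst_case_lhs(a0, delta, x, omega=1.5)
approx = sampled_lhs(a0, delta, x, omega=1.5)
```

The sampled maximum never exceeds the closed-form value and approaches it as the number of trials grows. The Euclidean norm in the closed form is what turns the linear constraint into a second-order cone constraint.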

The combined interval and ellipsoid case
We now look at the uncertainty set designed with ellipsoidal and interval uncertainties. We assume that the uncertain inputs and outputs are obtained from the nominal values by the random perturbations $\tilde{x}_{ij} = (1 + \epsilon_x \eta^x_{ij}) x_{ij}$ and $\tilde{y}_{rj} = (1 + \epsilon_y \eta^y_{rj}) y_{rj}$, where $\{\eta^x_{ij}\}_{i \in I_j}$ and $\{\eta^y_{rj}\}_{r \in R_j}$ ($\eta^x_{ij} = \eta^y_{rj} = 0$ for $i \notin I_j$, $r \notin R_j$) are independent random variables symmetrically distributed in the interval $[-1, 1]$, and $\epsilon_x$ and $\epsilon_y$ are given uncertainty levels of the inputs and outputs (Ben-Tal and Nemirovski 2000). In this situation, the RDEA passes from deterministic to probabilistic in the sense that the underlying input and output variables are purely random, and the constraint $\sum_{i=1}^{m} v_i \tilde{x}_{io} \le 1$, for instance, is required to hold with probability at least $1 - \kappa$, where $\kappa \ge 0$ is a given reliability level. For the uncertainty set here, we proceed with the following useful definitions.
Definition 2 Consider the unit interval $[-1, 1]$. An interval uncertainty set for the random variables $\eta$ is equipped with the infinity norm and given as $U_\infty = \{\eta : \|\eta\|_\infty \le 1\}$.
Definition 3 Given the random variables $\eta$, the ellipsoid normalized to a ball of radius $\Omega$ centered at the origin is the set equipped with the $l_2$-norm and given as $U_2 = \{\eta : \|\eta\|_2 \le \Omega\}$.
The uncertainty set for the uncertain input and output data, using both the $l_2$-norm and the infinity norm from the above definitions, is stated as

$$
U_j = \left\{ (\eta^x_j, \eta^y_j) : \|\eta^x_j\|_\infty \le 1,\ \|\eta^x_j\|_2 \le \Omega^x_j,\ \|\eta^y_j\|_\infty \le 1,\ \|\eta^y_j\|_2 \le \Omega^y_j \right\},
\qquad (12)
$$

where $\Omega^x_j$ and $\Omega^y_j$ are the lengths of the semi-axes of the ellipsoid for the uncertain input and output data, respectively.
Here $|I_j|$ and $|R_j|$ are the cardinalities of the sets of uncertain inputs and outputs, respectively. It is clear that $U_j$ is an intersection of unit boxes and balls centered at the origin with radii $\Omega^x_j$ and $\Omega^y_j$. The largest-volume ellipsoid contained in the box occurs when $\Omega_j = 1$, and the smallest-volume ellipsoid containing the box occurs when $\Omega_j = \sqrt{|I_j|}$ for the inputs and $\Omega_j = \sqrt{|R_j|}$ for the outputs. Figure 2 illustrates the different scenarios of the feasible region for the intersection of the ellipsoid with the box. In the uncertainty set (12) for the uncertain input and output data, without loss of generality, we consider $\Omega^x_j = \Omega^y_j = \Omega_j$. The following lemma (Lemma 2) is crucial for the robust counterpart DEA model.
Proof See, e.g., Ben-Tal and Nemirovski (2000).
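A small sketch of the geometry described above, assuming the perturbation vector is called `eta` and the set is the intersection of the unit box (infinity norm at most 1) with a ball of radius `omega` (Euclidean norm at most `omega`). It checks membership and returns the two radii at which the ball is inscribed in the box (radius 1) and circumscribes it (the square root of the number of uncertain entries). The function names are hypothetical.

```python
import math


def in_box_ball(eta, omega):
    """True if eta lies in {eta : ||eta||_inf <= 1 and ||eta||_2 <= omega}."""
    inf_norm = max(abs(e) for e in eta)
    two_norm = math.sqrt(sum(e * e for e in eta))
    return inf_norm <= 1.0 and two_norm <= omega


def omega_range(num_uncertain):
    """Radii at which the ball is inscribed in, and circumscribes, the unit box."""
    return 1.0, math.sqrt(num_uncertain)


corner = [1.0, 1.0, 1.0]  # a corner of the unit box in R^3
inner, outer = omega_range(len(corner))
corner_at_inner = in_box_ball(corner, inner)  # corner is cut off by the unit ball
corner_at_outer = in_box_ball(corner, outer)  # corner is reached at sqrt(3)
```

Between the two radii the ball removes the corners of the box, which is where the set is less conservative than pure interval uncertainty.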
Consequently, the RDEA under the uncertainty set (12) leads to the following second-order cone programming CCR model.

Theorem 2
The robust counterpart CCR model described under the ellipsoidal set (12) leads to the second-order cone programming DEA model (13), which involves auxiliary output and input variables together with interval uncertainty parameters. The robust counterpart model is feasible with the same probability as the original problem if all the constraints are satisfied with the probability guarantee $\kappa = \exp(-\Omega^2/2)$.

Proof
The proof follows similarly from Theorem 1.
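To see how the robustness parameter trades off against reliability, the sketch below evaluates the bound exp(−Ω²/2) and inverts it. It assumes the probability guarantee takes this form, as in the bound of Ben-Tal and Nemirovski (2000) for independent, symmetric perturbations in [−1, 1]. The function names and the 5% target are hypothetical.

```python
import math


def violation_bound(omega):
    """Upper bound exp(-omega**2 / 2) on the constraint-violation probability."""
    return math.exp(-omega ** 2 / 2.0)


def omega_for(kappa):
    """Smallest omega whose bound does not exceed the target kappa in (0, 1]."""
    if not 0.0 < kappa <= 1.0:
        raise ValueError("kappa must lie in (0, 1]")
    return math.sqrt(-2.0 * math.log(kappa))


# Bound on a grid of omega values, and the omega needed for a 5% target.
bounds = {omega: violation_bound(omega) for omega in (0.0, 0.5, 1.0, 2.0, 2.8)}
omega_5pct = omega_for(0.05)
```

A larger radius gives a smaller bound on the violation probability, at the cost of a more conservative efficiency score.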

The efficiency of the RDEA models
The design of the uncertainty set for the two models developed in this paper is related to the distribution of the uncertainty. Thus, if the uncertainty of the inputs and outputs follows an unbounded distribution, i.e., the size of the uncertainty is not restricted, the simple ellipsoid is considered appropriate and model (6) is constructed. On the other hand, when the uncertainty follows a bounded distribution, the interval set is required to limit the uncertainty to its bounds and avoid an unnecessarily large uncertainty set. The combined interval and ellipsoid uncertainty set is used for model (13). The formulations of these RDEA models are tractable and feasible, and their robust efficiency is obtained according to the following theorems:
Theorem 3 The optimal objective value of model (6) is less than or equal to 1.
Proof Let (w * , v * , u * ) be the optimal solution of model (6)

Theorem 4
The optimal objective value of model (13) is less than or equal to 1.
Proof The proof is similar to Theorem 3.
Let w * and z* be the optimal objective value of models (6) and (13), respectively. The robust efficiency for DMU o is given by the following definition.

Definition 4 DMU o is R-efficient, if and only if it satisfies the following two conditions:
(i) it is CCR efficient and (ii) w * = 1 or z* = 1.
The efficiency here is also referred to as strong efficiency or fully robust efficiency in the sense of Pareto-Koopmans efficiency, as opposed to Farrell's measure of efficiency, which ignores the slacks as sources of inefficiency (see Cooper et al. 1999; Park 2007). We shall incorporate other existing concepts of efficiency classification later in this section. From Theorems 3 and 4, it is clear that $w^*, z^* \in (0, 1]$. Therefore, a CCR-inefficient DMU will be robust-inefficient irrespective of the uncertainty level. Note that the solution of model (13) can be 'less conservative' than the solution $(w^*, v^*, u^*)$ of model (6), depending on the value of $\Omega_j$ chosen in the former. In fact, for a large uncertainty set indicating high assurance of robustness, DMUs can be less efficient in model (13) than in model (6), because model (13) enforces protection against the random perturbation of the inputs and outputs within the interval, which makes the model more restrictive for higher efficiency. In the special case of $\Omega_j = 1$ (see Fig. 2), the ellipsoid is exactly inscribed in the box/interval, and so the optimal objective values of model (6) and model (13) are equivalent. For higher values of $\Omega_j$, the performance of DMUs worsens, indicating the price paid for robustness (Bertsimas and Sim 2004). As a result, DMUs characterized as inefficient by the latter are equally characterized as inefficient by the former. However, the reverse is not necessarily true.
As observed so far, the efficiency of DMUs under the robust model (13) depends on the risk preference of the DM determined by the uncertainty level. That is, the DM is free to vary $\Omega_j$ according to the following:
• $\Omega_j = 0$ ⇒ the robust model reduces to the nominal DEA problem.
• $\Omega_j = 1$ ⇒ the uncertainty set is the largest-volume ellipsoid contained in the interval, and the highest robust solution is sought for the uncertain inputs and outputs in the model since all the uncertain inputs and outputs are immunized.
The specific value of $\Omega_j$ for the model is chosen carefully to avoid an overly conservative solution. Here, we provide a suggestion for the classification of DMUs using the conservativeness of the DM, i.e., $\Omega_j \le$

Definition 5
The robust efficiency for DMUs under the robust model (13) can be classified into three mutually exclusive subsets: (i) (Full R-efficiency). DMU$_o$ is fully R-efficient if and only if $z^* = 1$ when
The above classification can be denoted as $RE^{++}$ ~ full R-efficiency, $RE^{+}$ ~ partial robust efficiency or PR-efficiency, and $RE^{-}$ ~ R-inefficiency. The set $RE^{++}$ consists of DMUs that are R-efficient for any combination of uncertain inputs and outputs at all robust levels defined by the DM. This category of efficient DMUs is obtained under the most conservative evaluation of the uncertain data. It is evident that any DMU in the fully robust efficient group is always efficient under uncertain data and can hence be regarded as the best performer. So, logically, a DMU is robust efficient if and only if it is fully robust efficient. The set $RE^{+}$ consists of DMUs that are R-efficient in the maximal sense but cannot maintain R-efficiency at certain conservative levels for inputs and outputs. PR-efficiency is therefore obtained in a less stringent manner than full robust efficiency and, as a result, its efficiency values are higher than those of the R-efficient units. Finally, the set $RE^{-}$ consists of R-inefficient DMUs, which are inefficient even under the least consideration of uncertainty for any input and output combination. It is therefore clear that DMU$_o$ cannot be efficient in the robust sense unless it is partially robust efficient, and that DMU$_o$ will be partially robust efficient if it is first DEA efficient.
To further demonstrate the classification scheme provided in Definition 5 and to compare the efficiency of DMUs under the robust models (6) and (13), we consider a numerical example with data from Hatami-Marbini and Toloo (2017); see Table 1. The input and output data are subject to a 5% perturbation from their nominal values. The results for the CCR model and the robust CCR models are shown in Table 2. The values $\Omega_j = 0, 0.5, 1, 2$ and $2.8$ in model (13) are chosen arbitrarily, bearing in mind that $\Omega_j = 0$ is equivalent to the CCR efficiency in column two of Table 2. The efficiency scores of model (13) decrease when uncertainty is considered in the data and $\Omega_j$ increases. Figure 3 shows the efficiency of DMUs as $\Omega_j$ increases from $0$ to $2.8$ in steps of 0.4. Here, the robust efficiency of DMU$_1$, DMU$_3$, DMU$_4$, DMU$_6$ and DMU$_{10}$ remains at 1 for all values of $\Omega_j$. These DMUs are called R-efficient.
Considering columns three and five of Table 2, the equivalence of the robust models (6) and (13) when the ellipsoid is inscribed in the box/interval is evident. Also, the robust efficiency scores of model (13) include those of model (6) at $\Omega_j = 1.0$. Here, the DMUs which are R-inefficient in the latter model are also R-inefficient in the former model. However, as mentioned earlier, it is possible that the maximum realization of the uncertain data occurs at the corners of the interval (see Fig. 2), which implies that model (13) can be more conservative, and more complex, than model (6) at full protection of the uncertain data. The efficiency classification according to the DM's conservativeness is shown in the last column of Table 2. It is observed that the R-efficient DMUs are $RE^{++}$ = {DMU$_1$, DMU$_3$, DMU$_4$, DMU$_6$, DMU$_{10}$}, the PR-efficient DMUs are $RE^{+}$ = {DMU$_2$, DMU$_7$} and the R-inefficient DMUs are $RE^{-}$ = {DMU$_5$, DMU$_8$, DMU$_9$}.
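The classification step can be sketched in code once robust efficiency scores are available on a grid of Ω values. The rule below is an assumed reading of Definition 5, not the paper's exact thresholds: a DMU is placed in RE++ if its score is 1 at every level, in RE+ if its score is 1 at the lowest level (the nominal case) but not at every level, and in RE− otherwise. The scores are hypothetical and are not the values of Table 2.

```python
def classify(scores_by_omega, tol=1e-6):
    """Classify one DMU from its efficiency scores at increasing Omega levels.

    scores_by_omega maps each Omega value to the DMU's efficiency score.
    Assumed rule: RE++ if the score is 1 at every level, RE+ if it is 1 at
    the lowest level only or at some but not all levels starting from the
    lowest, RE- if it is below 1 already at the lowest level.
    """
    levels = sorted(scores_by_omega)
    efficient = [abs(scores_by_omega[w] - 1.0) <= tol for w in levels]
    if all(efficient):
        return "RE++"
    if efficient[0]:
        return "RE+"
    return "RE-"


# Hypothetical scores on the Omega grid 0, 0.5, 1, 2, 2.8.
example = {
    "A": {0.0: 1.0, 0.5: 1.0, 1.0: 1.0, 2.0: 1.0, 2.8: 1.0},
    "B": {0.0: 1.0, 0.5: 1.0, 1.0: 0.97, 2.0: 0.93, 2.8: 0.90},
    "C": {0.0: 0.82, 0.5: 0.80, 1.0: 0.78, 2.0: 0.74, 2.8: 0.71},
}
labels = {name: classify(scores) for name, scores in example.items()}
```

The three labels are mutually exclusive by construction, which mirrors the three subsets of the scheme.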

Extension to the additive DEA model and imprecise data
In this section, we extend the robust approach to the additive (ADD) model with imprecise data. Consider the additive model (14) proposed in Charnes et al. (1985) to evaluate the efficiency of DMUs, where $s_i^-$ and $s_r^+$ are the slacks for the inputs and outputs, respectively. To extend the RO to the additive model, first consider the dual formulation of model (14). Again, this is to avoid any possible infeasibility resulting from uncertainty analysis in the equality constraints. The dual of model (14) is model (15), whose objective value is the efficiency of DMU$_o$. It is easily verifiable that the optimal objective value is nonnegative; thus, an efficient point $(x_{ij}, y_{rj})$ lies on the facet-defining hyperplane. Then, DMU$_j$ is efficient if the optimal objective value equals 0 and inefficient if it is positive; alternatively, a positive optimal value with $(v^*, u^*) \ge 1_{m+s}$ measures the inefficiency of the DMU. In particular, to obtain an efficiency measure that is preserved under data perturbation, we consider the ellipsoidal-interval uncertainty set defined in (12) and, similarly to model (13), we propose the robust additive model (RADD) (16), whose objective value is the robust additive efficiency of DMU$_o$. The RADD model (16) can be compared to the imprecise additive models developed in Lee et al. (2002) and Matin et al. (2007). Consider the numerical example given in Cooper et al. (1999) and presented in Table 3. The column headings indicate the data to be dealt with in ordinal and bounded forms as well as in the customary exact forms, represented by the conditions $y_r \in D_r^+$ and $x_i \in D_i^-$. DEA models described by these data are nonlinear and are usually converted to a linear standard DEA with exact data by using the transformation approach suggested in Zhu (2003). It must be noted that the robust model is not able to deal with ordinal and bounded data directly. The approach adopted in this paper follows the transformation of the bounded and ordinal data in Table 3 to exact data in Lee et al. (2002).
The retrieved exact data are given in Table 4. From these data, we compare the result of the RADD model with the two-stage imprecise additive model of Lee et al. (2002) and the one-stage imprecise additive model of Matin et al. (2007). Table 5 presents the inefficiency of DMUs obtained by the different methods. The efficient DMUs identified by the proposed robust model ($\Omega_j = 0$) in Table 5 are the same as those of the other two methods, while the RADD model yields larger scores for the inefficient DMUs and has higher discriminating power. The performance of DMUs under the three models is otherwise indifferent, and their efficiency scores according to Table 5 lead to a common ranking expressed with the symbols "~", denoting "indifferent to", and "≻", denoting "superior to". It should be noted that the RADD model lightens the computational burden compared to the imprecise DEA models and provides flexibility in controlling the conservativeness of the solution to data perturbations. Thus, for some imprecise data, the model proposed in this paper is more computationally effective and flexible in robustly ranking the efficiency of DMUs.

Application to banking efficiency in Italy
We demonstrate the real-world application of the proposed robust CCR models by analyzing the performance of banks operating in Italy. The Italian banking industry is emerging from a prolonged period of distress following the global financial crisis of 2008 and the slowdown of the Italian economy.⁴ Although the banking system has shown resilience and has been recovering over the years, competition in the uncertain global environment, particularly in Europe, requires that banks operate efficiently and robustly. Indeed, banking idiosyncratic uncertainties translate into uncertainties in banking data. Therefore, to obtain a robust performance evaluation of the banks, we consider the RDEA models developed in this paper and treat banks as decision-making units that consume inputs, e.g., assets and equity, to generate outputs, e.g., loans and revenue, in an uncertain environment.

⁴ The banking crises that engulfed Italy, and that are still mildly ongoing, can be attributed to two main sources. The first is the financial market crisis of 2008, which was caused by the mortgage crisis and largely by the failure of Lehman Brothers. The second stems from the sovereign debt crisis that affected Greece and some peripheral countries of the European monetary union: Italy, Spain, Portugal and Ireland. The Italian government, through the Bank of Italy in its supervisory capacity, instituted measures such as the provision of liquidity, the strengthening and support of banks, and the recapitalization of distressed banks, including the so-called "Tremonti bond". The measures were meant to revitalize the banking industry, protect depositors and finance the economy.

Table 3 Exact and imprecise data adapted from Cooper et al. (1999). a Ordinal ranking such that 5 = highest rank, …, 1 = lowest rank (i.e., $y_{23} \ge \dots \ge y_{24}$). b Ratio bound based on the reference DMUs 3 or 5 (e.g., $0.6 \le x_{21} \le 0.7$ with $x_{23} = 1$).

Bank data and variable selection
Data on 29 main banks in Italy for the accounting year 2015 were collected from the Bureau van Dijk Bankscope database (Bank scope 2015); see also Alfiero et al. (2019). The selected banks operate under a common set of rules and regulations set by the Bank of Italy and, by extension, the European Central Bank, which implies that they share a common denominator against which performance can be readily compared. Next is the selection of inputs and outputs, which is crucial in measuring banking efficiency. In the banking sector, as in other sectors, there is consensus on the classification of some factors as inputs or outputs. However, the classification of others, particularly deposits, is unclear and controversial. Deposits are what the DEA literature terms a flexible measure or dual-role factor (see Toloo 2012; Toloo et al. 2018): depending on the operational activities of the bank, deposits can be regarded as an input (intermediation approach), as an output (production approach) or as a major component in the creation of added value (value-added approach). Different researchers select different measures. Casu and Girardone (2002) examined the cost efficiency of Italian bank conglomerates by assessing the cost characteristics of bank parent companies and their subsidiaries. Favoring the intermediation approach, they considered labor cost, deposits and physical capital as inputs, while total loans and other earning assets were used as outputs. Aiello and Bonanno (2016) considered the role of banks in Italy as intermediaries and used deposits, capital and labor as input factors and loans, securities and commission income as output factors.
The intermediation approach is commonly used in studies on banking efficiency, since banks are essentially seen as financial intermediaries whose main activities are to borrow funds from depositors and lend to others (Fethi and Pasiouras 2010). Kao and Liu (2014), among other studies, consider demand deposits as outputs in the intermediation approach. Within this context, and following the survey of Mostafa (2009) in which deposits are mostly used as outputs, we select employees, assets and equity as input factors and deposits from banks, loans and revenue as output factors. Table 6 shows the input and output factors and summary statistics for the Italian banks used in this study (see Table 9 in "Appendix A" for details of the banks). All the inputs and outputs are expressed in monetary values. It is assumed that the actual values of some of the input and output factors are uncertain. A bank is characterized as uncertain if any of its input or output data used for the performance measurement is uncertain. Here, we take uncertainty in banking data to be the result of errors in measurement and statistical computation and of other errors, such as those in forecast values of loans, non-performing loans, deposits, etc. On this basis, we apply the proposed models (6) and (13) to assess the robust performance of the banks.

Efficiency results
In the proposed robust models, we seek an acceptable performance level for the banks by optimizing over the worst-case values of the uncertain inputs and outputs in the ellipsoids. For each bank, uncertainty is considered in some or all of the inputs and outputs, and the realizations of their values are restricted to the uncertainty sets. We suppose that the inputs and outputs deviate from their nominal values by a perturbation percentage of 0.05. The results of the model implementation are reported in Table 7. The third column shows the efficiency ranking by the DEA model (2), and the fifth and last columns show the efficiency ranking by the robust models (6) and (13). A comparative view of the efficiency under these three models is given in Fig. 4. Note that the result obtained in model (13) with the robust parameter j ≅ 2.5 reflects the highest degree of conservativeness on the part of the decision maker, which occurs when the inputs and outputs are fully protected against all uncertainties. Table 7 shows that 8 banks have an efficiency score equal to 1 (column 2) and are therefore efficient under the DEA model (2), and that two banks (B13 and B22) are R-efficient in the two robust models. The robust efficiency scores are lower than the DEA efficiency scores, reflecting the worst-case, and hence more reliable, performance of the banks under uncertain conditions. The DEA results give an average overall technical efficiency of 0.898 for the banks under study, which indicates that, although the banks perform reasonably well on average, the number of banks that are efficient with or without uncertainty analysis is quite small. Among the least-performing banks is UniCredit SpA (B01), with an efficiency score of 0.738. For the robust classification of the banks, the robust parameter j is set to range from 0, when no uncertainty in the data is anticipated, to j = 2.5, when full protection against uncertainty is anticipated. The value of j within this range is chosen arbitrarily.
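To make the interplay between the 0.05 perturbation and the robust parameter concrete, the following minimal sketch computes the worst-case value of a weighted sum of uncertain data over a simple ellipsoidal set. It is not a reproduction of models (6) or (13): it assumes independent (uncorrelated) perturbations proportional to each nominal value, and the names `worst_case_weighted_sum`, `eps` and `omega` (standing in for the robust parameter), as well as the data, are hypothetical.

```python
import math


def worst_case_weighted_sum(nominal, weights, eps, omega):
    """Smallest value of sum_i w_i * a_i over an ellipsoidal uncertainty set.

    Assumed (illustrative) set: a_i = nominal_i * (1 + eps * z_i) with
    ||z||_2 <= omega. Minimizing a linear function over a Euclidean ball
    gives: nominal value - omega * eps * ||w * nominal||_2.
    """
    nominal_value = sum(w * a for w, a in zip(weights, nominal))
    radius = eps * math.sqrt(sum((w * a) ** 2 for w, a in zip(weights, nominal)))
    return nominal_value - omega * radius


# Hypothetical outputs and weights for one DMU (not taken from Table 6).
outputs = [3.0, 4.0]
weights = [1.0, 1.0]

for omega in (0.0, 1.0, 2.5):
    value = worst_case_weighted_sum(outputs, weights, eps=0.05, omega=omega)
    print(f"omega={omega}: worst-case weighted output = {value:.4f}")
```

With these made-up numbers the worst-case weighted output falls from 7 at `omega = 0` to 6.375 at `omega = 2.5`, which mirrors the qualitative pattern reported above: more protection yields a more conservative evaluation.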
Table 8 shows the result of the robust classification of the banks. Higher values of j are selected in exchange for higher guaranteed robustness. The efficiency of the banks decreases as j increases, and the DM can express preferences through different values of j and robust efficiency, which is similar to the approach proposed in Ben-Tal and Nemirovski (2000) and Sadjadi and Omrani (2008). Under full protection of the inputs and outputs, only B13 and B22 are R-efficient. Banks B06, B16, B18, B20, B23 and B28 are PR-efficient; Table 8 shows the classification of the banks as given in Definition 5. Because many banks were inefficient in the traditional DEA evaluation, it is unsurprising that few banks are partially or fully robust efficient. As observed, of the 8 DEA-efficient banks, 2 are fully robust efficient and 6 are partially robust efficient at different levels.
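The three-way classification can be sketched in a few lines of code. Definition 5 is not reproduced in this section, so the rule below is only an assumed reading of it: a DMU is fully robust efficient if it remains efficient at the highest protection level, partially robust efficient if it is efficient at some lower level only, and robust inefficient otherwise. The function name `classify_dmu`, the tolerance `tol` and all scores are hypothetical and are not the values in Table 7 or Table 8.

```python
def classify_dmu(scores_by_level, tol=1e-6):
    """Classify one DMU from its efficiency scores at several protection levels.

    scores_by_level maps a robust-parameter level (0 = no protection) to the
    efficiency score obtained at that level. Assumed rule, not Definition 5:
      - efficient at the highest level -> fully robust efficient
      - efficient at some lower level  -> partially robust efficient
      - efficient at no level          -> robust inefficient
    """
    levels = sorted(scores_by_level)
    efficient_levels = [lv for lv in levels if scores_by_level[lv] >= 1.0 - tol]
    if not efficient_levels:
        return "robust inefficient"
    if levels[-1] in efficient_levels:
        return "fully robust efficient"
    return "partially robust efficient"


# Hypothetical scores for three made-up DMUs.
scores = {
    "DMU_A": {0.0: 1.0, 1.0: 1.0, 2.5: 1.0},
    "DMU_B": {0.0: 1.0, 1.0: 0.97, 2.5: 0.91},
    "DMU_C": {0.0: 0.88, 1.0: 0.84, 2.5: 0.79},
}

for name, by_level in scores.items():
    print(name, "->", classify_dmu(by_level))
```

Because the efficiency scores are non-increasing in the robust parameter, checking the highest level alone is enough to detect full robust efficiency under this assumed rule.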

Concluding remarks
In this paper, we proposed new robust DEA models based on the ellipsoidal and interval-based ellipsoidal uncertainty sets designed in Ben-Tal and Nemirovski (1999, 2000). This has been done in a manner that immunizes the models against arbitrary bounded or unbounded uncertainties in some or all of the input and output data simultaneously. By constraining the uncertain data to ellipsoidal uncertainty sets, the models developed in this paper are less pessimistic and thus offer an advantage over interval DEA models, which mostly evaluate the performance of DMUs based on the extreme lower and upper bounds of their efficiency. The developed RDEA models give the DM the flexibility to control the level of robustness. Another important contribution of this paper is the design of a classification scheme that enables the DM to classify DMUs into fully robust efficient, partially robust efficient and robust inefficient. We provide numerical examples to illustrate the proposed models, in particular the proposed robust additive model, which is compared with some IDEA models to show its efficacy, potential and applicability. Furthermore, the proposed robust models are applied to the evaluation and classification of banks in Italy. The proposed models enable bank managers to classify banks into fully robust efficient, partially robust efficient and robust inefficient units. Employing the RDEA models with different uncertainties to classify DMUs in applications can be considered for future research.

Acknowledgements
Open access funding provided by Università degli Studi dell'Insubria within the CRUI-CARE Agreement. The author would also like to thank the anonymous reviewers and the SI editor for their insightful comments and suggestions.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.