1 Introduction

Data envelopment analysis (DEA) is a well-known and widely used method for measuring efficiency and benchmarking performance. Several versions of the model have been published and are used in many fields, including supplier evaluation in supply chain management.

The literature on DEA focuses primarily on methodological approaches to the challenges posed by evaluation, developing model variants that can address the specificities of the situation. According to a comprehensive literature review by Emrouznejad and Yang (2018), the most common topics are eco-efficiency and sustainability, Network DEA, benchmarking, bootstrapping and scale efficiency. Although the literature deals with the specifics of the data required for the calculation, this is evidently not a priority topic. The exception is the issue of undesired output (Halkos and Petrou 2019), which, in line with the previous result, emphasises the environmental aspects. Therefore, this article addresses the question of what bias can be expected from the different methods when data transformation is required. Selecting a DEA model also means developing an approach for selecting inputs and outputs in DEA; see Toloo and Tichý (2015) and Toloo and Tavana (2017), among others. In this article, we take the example of supplier evaluation.

In the basic DEA model, input criteria are cost criteria and output criteria are benefit criteria. This article examines what happens when the input criteria are benefit criteria and the output criteria are cost criteria, so that data transformation is required for the calculations. Two procedures are known for such a data transformation: data scaling and data translation (Ali and Seiford 1990; Färe and Grosskopf 2013; Charles et al. 2016; Cook and Seiford 2009), which are applied in the calculations.

The paper is structured as follows. Section 2 provides a brief literature review on the role of inputs and outputs in DEA models and supplier selection. Section 3 reviews the structure and input/output criteria of the DEA models under study. This is particularly relevant when inputs or outputs are missing in the models. It may also be of interest that the criteria may be either cost or benefit criteria, which may imply that the data need to be transformed. Section 4 contains five different models that assume the existence or non-existence of input and output criteria. The mixed model is also analysed in terms of whether the cost and benefit criteria are considered as input and output criteria of the DEA model. In Section 5, the results of the possible DEA models are compared on a numerical example. It follows that the results can be grouped into three categories according to the efficiency of the solutions. Finally, the results are summarised.

2 Literature review

DEA is a well-known method to determine the relative efficiencies of a set of decision making units (DMUs). It is widely used to compare efficiency in banking (Henriques et al. 2020), in healthcare (Sommersguter-Reichmann 2021), in education (Mojahedian et al. 2020), in agriculture (Streimikis and Saraji 2021), in transportation (Mahmoudi et al. 2020), in supply chain management (Soheilirad et al. 2018) and in supplier selection (Dutta et al. 2021; Vörösmarty and Dobos 2020).

DEA was introduced by Charnes et al. (1978). It is a nonparametric linear programming technique for evaluating the relative efficiency of comparable units (DMUs) according to multiple inputs and outputs. The efficiency measure of a DMU is the ratio of the weighted sum of outputs to the weighted sum of inputs. In the original DEA formulations, each assessed DMU can freely choose the weights to be assigned to each input and output in a way that maximises its efficiency, subject to this system of weights being feasible for all other DMUs. This freedom of choice shows the DMU at its best and is equivalent to assuming that no input or output is more important than any other (Thanassoulis et al. 2004).

The requirements for defining inputs and outputs have long been addressed in the literature and a number of pitfalls have been identified. Dyson et al. (2004) describe four key assumptions with respect to the selected input/output set: it covers the full range of resources used; it captures all activity levels and performance measures; the set of factors is common to all DMUs; and environmental variation has been assessed and captured if necessary. In addition, among the data requirements, homogeneity of data, the ratio of factors to DMUs, non-negative numbers and complete data have been highlighted (e.g., Dyson et al. 2004; Sarkis 2007; Kohl et al. 2019). These methods are used in many applications (e.g., Mozaffari et al. 2014; Bod’a et al. 2018). A common application of DEA is supplier evaluation (Dutta et al. 2021), with many sources recommending the use of DEA for both supplier qualification and ranking for selection purposes. The data requirements for the application of the method described above apply here as well. For example, the literature provides solutions for dealing with negative numbers (Soltanifar and Sharafi 2022; Dobos and Vörösmarty 2021) and for dealing with imprecise data (Ghiyasi and Khoshfetrat 2019; Ebrahimi 2019). Within this topic, the issues of undesired output and desired input (e.g., Alikhani et al. 2019; Nemati et al. 2020) are given considerable attention, linked to the growing importance of environmental criteria in business practice.

While many studies in education and healthcare, which are common application areas for DEA (e.g., Zakowska and Godycki-Cwirko 2020), focus on how to identify inputs and outputs in the assessment, the literature on supplier evaluation focuses on the identification of relevant supplier performance criteria. However, the question arises as to which of the criteria relating to supplier performance and capabilities can be considered as inputs and which as outputs. One way to understand the problem is that input factors are those of which less is better and output factors are those of which more is better (Wu and Blackhurst 2009). There is also the interpretation that traditional management criteria are input criteria, while environmental criteria are output criteria (Ransikarbum et al. 2022). It is noticeable, however, that while in the healthcare example mentioned above the meaning of input and output factors in relation to efficiency is relatively well understood, in the case of supplier evaluation their meaning is more difficult to interpret. There is very little literature on this issue, which is why this problem is the focus of our study.

3 Models used in the analysis

The basic DEA model assumes that the input criteria are cost criteria, while the output criteria are benefit or profit criteria. Therefore, there is a problem in applying the method if the input criteria include benefit criteria and the output criteria include cost criteria. This can be solved by a simple multiplication by -1 for the input/benefit and output/cost criteria, but then we would have to set up the DEA model with negative numbers (Toloo 2009; Sarkis 2007). In addition to the linear transformations mentioned above, reciprocal transformations can also be used, but they are not applied in this paper.

The data translation method may be one solution for making the matrices of DEA models with negative numbers non-negative (Ali and Seiford 1990; Cook and Seiford 2009). Another solution could be a positive affine data transformation (Färe and Grosskopf 2013; Charles et al. 2016). This paper chooses the first solution because of translation invariance. The method of Ali and Seiford (1990) involves a linear transformation of the available data, i.e., the basic data are transformed by a linear function. This avoids any possible division by zero.

However, the question also arises as to what happens when the criteria cannot be clearly decomposed into input and output criteria, only the cost/benefit distinction is known. This is also the case for supplier selection, because output/outcome criteria are often difficult to define. Figure 1 summarizes the possible cases.

Fig. 1 Type of DEA input/output and cost/benefit criteria

In Fig. 1, the cost and benefit criteria are arranged into two groups of matrices, where the matrices \((X^{C}, Y^{C})\) contain the cost-type criteria and the matrices \((X^{B}, Y^{B})\) contain the benefit-type criteria. The reason for this decomposition will be discussed later.

If it is not possible to clearly select input and output criteria, four different DEA models can be set up as shown in Fig. 1. However, this requires the construction of four auxiliary matrices to ensure that the entries of the transformed matrices do not become negative. The following formulae are available for this purpose:

  • \(\overline{X} = \left(\underset{j}{\mathrm{max}}\,{x}_{1j}^{C} \cdot 1, \underset{j}{\mathrm{max}}\,{x}_{2j}^{C} \cdot 1,\dots , \underset{j}{\mathrm{max}}\,{x}_{nj}^{C} \cdot 1\right)\),

  • \(\overline{\overline{X}} = \left(\underset{j}{\mathrm{max}}\,{x}_{1j}^{B} \cdot 1, \underset{j}{\mathrm{max}}\,{x}_{2j}^{B} \cdot 1,\dots , \underset{j}{\mathrm{max}}\,{x}_{kj}^{B} \cdot 1\right)\),

  • \(\overline{Y} = \left(\underset{j}{\mathrm{max}}\,{y}_{1j}^{C} \cdot 1, \underset{j}{\mathrm{max}}\,{y}_{2j}^{C} \cdot 1,\dots , \underset{j}{\mathrm{max}}\,{y}_{mj}^{C} \cdot 1\right)\), and

  • \(\overline{\overline{Y}} = \left(\underset{j}{\mathrm{max}}\,{y}_{1j}^{B} \cdot 1, \underset{j}{\mathrm{max}}\,{y}_{2j}^{B} \cdot 1,\dots , \underset{j}{\mathrm{max}}\,{y}_{lj}^{B} \cdot 1\right)\).

The row vectors of the matrices (\(\overline{X }\), \(\overline{\overline{X}}\), \(\overline{Y }\), \(\overline{\overline{Y}}\)) are identical vectors, and the largest values of each criterion are contained in the column vectors, so the following matrix inequalities are also satisfied:

$$ \overline{X} - X^{C} \ge 0,\;\overline{\overline{X}} - X^{B} \ge 0,\;\overline{Y} - Y^{C} \ge 0,\;\overline{\overline{Y}} - Y^{B} \ge 0. $$
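The translation can be illustrated with a minimal sketch. It assumes a small made-up matrix of benefit-type criteria in which rows are suppliers and columns are criteria; the data and the helper names `column_maxima` and `translate` are hypothetical and are not taken from the paper.

```python
# Hypothetical data: rows are suppliers (DMUs), columns are criteria.
X_B = [
    [3.0, 7.0],
    [5.0, 2.0],
    [4.0, 9.0],
]


def column_maxima(matrix):
    """Largest value of each criterion (column) over all suppliers."""
    return [max(column) for column in zip(*matrix)]


def translate(matrix):
    """Subtract every entry from the maximum of its criterion (column)."""
    maxima = column_maxima(matrix)
    return [[m - value for m, value in zip(maxima, row)] for row in matrix]


X_B_translated = translate(X_B)

print(column_maxima(X_B))
print(X_B_translated)
```

Every translated entry is non-negative by construction, and the supplier with the largest value of a criterion receives zero for that criterion.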

The new data translated matrices are already indicated in the figure. This allows the following four DEA models to be described, provided that no input and output criteria can be selected. First, two models can be described where it is assumed that the benefit criteria are output criteria and the cost criteria are input criteria. For the other two models, data translation is used to convert the criteria value into either input criteria or output criteria. These DEA models are:

  • CCR-I model (Charnes et al. 1978),

  • Linear Activity DEA model (Dyckhoff and Allen 2001),

  • DEA Without Explicit Input (DEA/WEI) (Cooper et al. 2007; Toloo and Tavana 2017) and

  • DEA Without Explicit Output (DEA/WEO) (Toloo and Kresta 2014).

After the models with undefined input and output criteria, the two sets of criteria are separated.

The last line of Fig. 1 shows the case where the input and output criteria were given before the start of the analysis, i.e. the two groups of criteria were given exogenously. Here, too, negative data must be converted into non-negative data.

The matrix of suppliers, i.e. decision making unit (DMU) data, can be segmented. The upper index indicates whether the input or output variable represents a cost or benefit criterion. The matrices X contain the input criteria, while the matrices Y contain the output criteria. Then the matrices \(X = (X^{C}, -X^{B})\) and \(Y = (-Y^{C}, Y^{B})\) are the matrices of transformed input and output criteria. Since in the classical DEA case the input criteria are cost criteria and the output criteria are benefit criteria, the sign of the input criteria that are benefit criteria and of the output criteria that are cost criteria must be reversed according to Fig. 1. Thus, the analyses should be performed with the matrices \((X^{C}, \overline{\overline{X}} - X^{B})\) and \((\overline{Y} - Y^{C}, Y^{B})\).

The mathematical form of the five models is presented in the next section.

4 Mathematical form of the presented DEA models

The matrices used to describe the models are written down for easier application. This is shown in Table 1. The criteria used as input are denoted by X, while those used as output are denoted by Y.

Table 1 Indication of criteria in different models

When presenting the models, we use the row vectors of the given matrices. This means that j denotes the jth supplier. The weight vectors \({\mathrm{u}}_{i}\) and \({\mathrm{v}}_{i}\) (i = 1, 2, 3) are assigned to the \({\mathrm{y}}_{j}^{i}\) and \({\mathrm{x}}_{j}^{i}\) criterion values when solving the problems.

4.1 CCR-I DEA model without named input–output criteria

As mentioned above, the benefit criteria are output criteria, while the cost criteria are input criteria in this model (Charnes et al. 1978). The model is shown in formulas (1CCR) to (4CCR).

$$ {\text{u}}_{1} \ge 0,\;{\text{v}}_{1} \ge 0, $$
(1CCR)
$$ {\text{u}}_{1} \cdot {\text{y}}_{j}^{1} - {\text{v}}_{1} \cdot {\text{x}}_{j}^{1} \le 0,\;\left( {j = 1,2, \ldots ,p} \right), $$
(2CCR)
$$ {\text{v}}_{1} \cdot {\text{x}}_{i}^{1} = 1, $$
(3CCR)
$$ F_{i} \left( {{\text{u}}_{1} ,{\text{v}}_{1} } \right) = {\text{u}}_{1} \cdot {\text{y}}_{i}^{1} \to \max ,\;\left( {i = 1,2, \ldots ,p} \right) $$
(4CCR)
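For intuition, problem (1CCR) to (4CCR) can be solved by hand in the special case of one cost criterion and one benefit criterion with strictly positive data. Constraint (3CCR) fixes the input weight at the reciprocal of the evaluated supplier's cost, and (2CCR) then bounds the output weight, so the optimum equals the supplier's benefit/cost ratio divided by the best ratio in the sample. The sketch below implements only this special case; the function name `ccr_single` and the data are hypothetical, and the general multi-criteria model requires a linear programming solver.

```python
def ccr_single(costs, benefits):
    """CCR-I efficiency for one input (cost) and one output (benefit).

    Assumes all values are strictly positive. The optimum of the linear
    program reduces to (y_i / x_i) / max_j (y_j / x_j) in this special case.
    """
    ratios = [y / x for x, y in zip(costs, benefits)]
    best = max(ratios)
    return [ratio / best for ratio in ratios]


# Hypothetical data for three suppliers.
costs = [2.0, 4.0, 5.0]
benefits = [4.0, 4.0, 10.0]

scores = ccr_single(costs, benefits)
print(scores)
```

In this made-up example the first and third suppliers share the best benefit/cost ratio and are efficient, while the second supplier reaches half of that ratio.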

Table 2 contains the initial table of the model used, where the cost criteria are used as input criteria and the benefit criteria as output criteria.

Table 2 Dataset used for CCR-I model

4.2 Linear Activity DEA model without named input–output criteria

The benefit criteria are output criteria, while the cost criteria are input criteria, as in the CCR-I model (Koopmans 1951). In the linear activity analysis model, each supplier must be assigned its own "price system" for which its profit function is maximal. This maximum can only be determined if the profit is not greater than zero. This condition is expressed by inequality (2LA). However, the price system must also be bounded. This condition is ensured by equation (3LA).

$$ {\text{u}}_{1} \ge 0,v_{1} \ge 0, $$
(1LA)
$$ {\text{u}}_{1} \cdot {\text{y}}_{j}^{1} - {\text{v}}_{1} \cdot {\text{x}}_{j}^{1} \le 0,\;\left( {j = 1,2, \ldots ,p} \right), $$
(2LA)
$$ {\text{u}}_{1} \cdot 1 + {\text{v}}_{1} \cdot 1 = 1, $$
(3LA)
$$ F_{{\text{i}}} \left( {{\text{u}}_{1} ,{\text{v}}_{1} } \right) = {\text{u}}_{1} \cdot {\text{y}}_{i}^{1} - {\text{v}}_{1} \cdot {\text{x}}_{i}^{1} \to \max , \, \left( {i = 1,2, \ldots p} \right) $$
(4LA)
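As with the CCR-I model, the case of one cost criterion and one benefit criterion with strictly positive data can be worked out by hand, which may help in reading (1LA) to (4LA). Writing the output price as t and the input price as 1 - t, condition (2LA) becomes t ≤ x_j / (x_j + y_j) for every supplier j, and the profit t·(x_i + y_i) - x_i increases in t, so the optimal t is the smallest of these bounds. The sketch below covers only this special case; the function name `linear_activity_single` and the data are hypothetical.

```python
from fractions import Fraction


def linear_activity_single(costs, benefits):
    """Maximal profit of each supplier for one input and one output.

    Assumes strictly positive data. With prices (t, 1 - t), the constraint
    t*y_j - (1 - t)*x_j <= 0 is equivalent to t <= x_j / (x_j + y_j).
    """
    t = min(Fraction(x) / (Fraction(x) + Fraction(y))
            for x, y in zip(costs, benefits))
    return [t * (x + y) - x for x, y in zip(costs, benefits)]


# Hypothetical data for three suppliers.
costs = [2, 4, 5]
benefits = [4, 4, 10]

profits = linear_activity_single(costs, benefits)
print(profits)
```

A supplier is efficient when its maximal profit is zero. In this toy example these are the suppliers with the best benefit/cost ratio, which is consistent with the close agreement between the two models reported in the numerical results.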

Numerical examples for problems (1CCR) to (4CCR) and (1LA) to (4LA) are provided in Section 5. For this model type we can also use the data in Table 2.

4.3 DEA model without explicit input

The mathematical form of the DEA problems without explicit input (WEI) can be described as follows:

$$ {\text{u}}_{1} \ge 0,{\text{v}}_{2} \ge 0, $$
(1WEI)
$$ {\text{u}}_{1} \cdot y_{j}^{1} + {\text{v}}_{2} \cdot x_{j}^{2} \le 1,\;\left( {j = {1},{2}, \ldots ,p} \right), $$
(2WEI)
$$ F_{i} \left( {{\text{u}}_{1} ,{\text{v}}_{2} } \right) = {\text{u}}_{1} \cdot {\text{y}}_{i}^{1} + {\text{v}}_{2} \cdot {\text{x}}_{i}^{2} \to \max ,\;i = 1,2, \ldots p. $$
(3WEI)

For the model (1WEI) to (3WEI), the input takes the value one for each supplier. In this form, the model is also known as a composite indicator (Cherchye et al. 2008).
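A small sketch may make the DEA/WEI model more concrete. It assumes exactly two criteria, one benefit criterion and one cost criterion translated into a benefit criterion by subtracting it from its maximum, and solves (1WEI) to (3WEI) by enumerating the vertices of the two-dimensional feasible region with exact fractions. The function name `wei_scores` and the data are hypothetical, and the vertex enumeration is a didactic substitute for a general linear programming solver. It further assumes that each criterion is positive for at least one supplier, so that the feasible region is bounded.

```python
from fractions import Fraction
from itertools import combinations


def wei_scores(rows):
    """DEA/WEI scores for two 'larger is better' criteria per supplier."""
    rows = [(Fraction(a), Fraction(b)) for a, b in rows]
    # Boundary lines a*w1 + b*w2 = c: both axes and one line per supplier.
    lines = [(Fraction(1), Fraction(0), Fraction(0)),
             (Fraction(0), Fraction(1), Fraction(0))]
    lines += [(a, b, Fraction(1)) for a, b in rows]

    def feasible(w1, w2):
        return (w1 >= 0 and w2 >= 0
                and all(a * w1 + b * w2 <= 1 for a, b in rows))

    vertices = []
    for (a1, b1, c1), (a2, b2, c2) in combinations(lines, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel lines do not define a vertex
        w1 = (c1 * b2 - c2 * b1) / det  # Cramer's rule
        w2 = (a1 * c2 - a2 * c1) / det
        if feasible(w1, w2):
            vertices.append((w1, w2))
    # A bounded linear program attains its optimum at a vertex.
    return [max(a * w1 + b * w2 for w1, w2 in vertices) for a, b in rows]


# Hypothetical data for four suppliers.
benefits = [4, 1, 2, 3]
costs = [5, 1, 3, 2]
translated = [max(costs) - c for c in costs]  # cost turned into benefit

scores = wei_scores(list(zip(benefits, translated)))
print(scores)
```

In this made-up example three of the four suppliers obtain a score of one, which illustrates how readily suppliers become efficient when all criteria are placed on the same side of the model.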

In Table 3, which we used for the calculations, we converted the cost criteria into benefit criteria. The maximum values are given in the row at the bottom of the table.

Table 3 Dataset used for DEA/WEI model

4.4 DEA model without explicit output

The DEA without explicit output model can initially be written as the fractional programming problem (1) to (3): the value of the output is one, while the denominator contains the weighted values of the inputs. However, by taking reciprocals, the model can be transformed into a linear optimization model (Toloo 2014).

$$ {\text{u}}_{2} \ge 0,{\text{v}}_{1} \ge 0, $$
(1)
$$ \frac{1}{{{\text{u}}_{2} \cdot {\text{y}}_{j}^{2} + {\text{v}}_{1} \cdot {\text{x}}_{j}^{1} }}\; \le 1;j = \, 1,2,...,p. $$
(2)
$$ F_{i} \left( {{\text{u}}_{2} ,{\text{v}}_{1} } \right) = \frac{1}{{{\text{u}}_{2} \cdot {\text{y}}_{i}^{2} + {\text{v}}_{1} \cdot {\text{x}}_{i}^{1} }} \to \max ,\;i = 1,2, \ldots ,p. $$
(3)

The transformed model takes the form (1WEO) to (3WEO).

$$ {\text{u}}_{2} \ge 0,{\text{v}}_{1} \ge 0, $$
(1WEO)
$$ {\text{u}}_{2} \cdot {\text{y}}_{j}^{2} + {\text{v}}_{1} \cdot {\text{x}}_{j}^{1} \ge 1;j = \, 1,2,...,p. $$
(2WEO)
$$ F_{i} \left( {{\text{u}}_{2} ,{\text{v}}_{1} } \right) = {\text{u}}_{2} \cdot {\text{y}}_{i}^{2} + {\text{v}}_{1} \cdot {\text{x}}_{i}^{1} \to \min ,\;i = 1,2, \ldots ,p. $$
(3WEO)
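The step from the fractional form (1) to (3) to the linear form (1WEO) to (3WEO) can be spelled out as follows. The shorthand \(w_{j}\) is introduced here only for illustration, and the weighted sum is assumed to be strictly positive for every supplier:

$$ w_{j} = {\text{u}}_{2} \cdot {\text{y}}_{j}^{2} + {\text{v}}_{1} \cdot {\text{x}}_{j}^{1} > 0,\qquad \frac{1}{w_{j}} \le 1 \Leftrightarrow w_{j} \ge 1,\qquad \frac{1}{w_{i}} \to \max \Leftrightarrow w_{i} \to \min . $$

Under this assumption, the efficiency of supplier i in the fractional model is the reciprocal of the optimal value of (3WEO). For example, with a single positive cost criterion \(c_{j}\), constraint (2WEO) forces the weight to be at least \(1/\min_{j} c_{j}\), so the optimal value is \(c_{i}/\min_{j} c_{j}\) and the efficiency is \(\min_{j} c_{j}/c_{i}\).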

For the DEA/WEO model, the criteria should be oriented towards minimum values, i.e., the benefit criteria are converted into cost criteria. This is shown in Table 4. The value for lack of variance is given in the bottom row of the table.

Table 4 Dataset used for DEA/WEO model

The last remaining model is the classic DEA model with both input and output criteria, which is described in the next section.

4.5 Additive DEA model considering negative values

Because the data include negative values, which are translated into positive values, we use an additive model, since it is translation invariant for both input and output criteria (Cook and Seiford 2009). The efficiency of the jth supplier in an additive DEA model is

$$ E_{j} \left( {{\text{u}}, u_{0} ,{\text{v}}} \right) = \frac{{{\text{u}} \cdot y_{j} + u_{0} }}{{{\text{v}} \cdot x_{j} }} \le 1,\;\left( {j = 1,2, \ldots ,p} \right), $$

where p is the number of suppliers.

The programming problems can be written in the following form:

$$ {\text{u}}_{3} \ge \varepsilon \cdot 1,{\text{v}}_{3} \ge \varepsilon \cdot 1,u_{0} \in \Re $$
(1ADD)
$$ \left( {{\text{u}}_{3} \cdot y_{j}^{3} + u_{0} } \right) - {\text{v}}_{3} \cdot x_{j}^{3} \le 0,\;\left( {j = 1,2, \ldots ,p} \right), $$
(2ADD)
$$ F_{i} \left( {u_{3} ,u_{0} ,v_{3} } \right) = \left( {u_{3} \cdot y_{i}^{3} + u_{0} } \right) - v_{3} \cdot x_{i}^{3} \to \max ,\;\left( {i = 1,2, \ldots p} \right) $$
(3ADD)
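The role of the free variable \(u_{0}\) in the translation of the outputs can be made explicit with a short calculation. The shift vector a and the symbol \(\tilde{u}_{0}\) are introduced here only for illustration, and the sketch covers the output side only; for the general statement see the sources cited above. If every output vector \(y_{j}^{3}\) is replaced by \(y_{j}^{3} + a\), then

$$ \left( {{\text{u}}_{3} \cdot \left( {y_{j}^{3} + a} \right) + u_{0} } \right) - {\text{v}}_{3} \cdot x_{j}^{3} = \left( {{\text{u}}_{3} \cdot y_{j}^{3} + \tilde{u}_{0} } \right) - {\text{v}}_{3} \cdot x_{j}^{3} ,\qquad \tilde{u}_{0} = u_{0} + {\text{u}}_{3} \cdot a. $$

Since \(u_{0}\) is unrestricted in sign, every feasible solution of the translated problem corresponds to a feasible solution of the original problem with the same objective value, so constraint (2ADD) and objective (3ADD) yield the same optimum before and after the translation.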

After presenting the five DEA models, we illustrate the difference between the models through a numerical example and interpret the results. Table 5 summarizes how the benefit input criteria and the cost output criteria have been adapted. The calculations can be performed with this data set.

Table 5 Dataset used for Additive DEA model

5 Numerical examples

Using the numerical example, we seek to answer how the efficiencies of the five DEA models are related. For this, we use the data and purchasing criteria in the Appendix. The efficiencies are shown in Table 6.

Table 6 DEA efficiencies of the DEA models

The table immediately shows that the first two models, i.e., those based on the view that the cost criteria are input criteria and the benefit criteria are output criteria, give almost the same result. The efficient suppliers are the same and there are only small differences in efficiency between the inefficient suppliers.

The other three models gave similar results for efficiency; in particular, the DEA/WEI and DEA/WEO models identified exactly the same efficient suppliers. Twelve of the fifteen suppliers in the data used were found to be efficient, and the inefficient ones received similar scores. In comparison, the third of these models, the additive model, yielded three fewer efficient suppliers.

Comparing the results of the five models, five suppliers were found to be efficient in all of them. However, there are three suppliers that were not found to be efficient in any of the analyses.

If we look at the results in light of whether the input and output criteria are known, the additive model really gives a different solution compared to the other four. However, it should also be noted that the results of the additive model are much closer to those of DEA/WEI and DEA/WEO than to the efficiencies of the CCR-I and Linear Activity DEA models.

6 Conclusions

The literature raises the possibility of a number of biases in business and supplier selection decisions, which can have a serious impact on the outcome of the decision (e.g., Schotanus et al. 2022). In this paper, we sought to answer how the transformation of input and output criteria in DEA models affects efficiencies, especially in the case of supplier selection. The models studied can be divided into three groups according to how the data are transformed. The first way is to treat the cost criteria as inputs and the benefit criteria as outputs. Using two models that are similar but carry different economic messages, we obtained almost identical solutions. A second option is to place all criteria in the same category, i.e., to treat them all as either inputs or outputs, and to transform the data matrices accordingly. The two models calculated in this way also gave almost identical solutions. The third way, i.e., when it is possible to decide exogenously what the input and output criteria are, results in significantly different efficiencies from the other two methods, although the results are somewhat similar to those of the second way. Further analyses are required to make recommendations on which method is more suitable for practical applications.

This paper has several managerial implications. The results warn that the sets of criteria should be carefully defined when selecting a supplier, otherwise very different results may emerge. However, the five methods used have the property that some suppliers were shown to be efficient in all calculations, while others were always shown to be inefficient. To be able to use such methods, procurement practitioners need proper knowledge and perhaps training, and guidance is necessary for project teams that are responsible for procurement but do not include procurement professionals.