Abstract
This paper presents the Python implementation of the Value-Based Data Envelopment Analysis (VBDEA) method, which was designed to evaluate the efficiency of decision-making units (DMUs). This methodological framework explores the links between data envelopment analysis (DEA) and multi-criteria decision analysis (MCDA) and proposes a new perspective on the use of the additive DEA model using concepts from multi-attribute value theory (MAVT). One of the major strengths of VBDEA over typical DEA methodologies is that it offers information on the main reasons behind DMUs’ (in)efficiency. Additionally, this approach allows a straightforward ranking of efficient and inefficient DMUs, since it relies on a super-efficiency model. Because of the use of value functions, besides allowing the incorporation of the decision-maker (DM)’s preferences, this methodology easily handles negative or null data. In this context, we illustrate the Python implementation of the method by reproducing the main results obtained by Gouveia et al. (OR Spectrum 38:743–767, 2016), who evaluated the performance of 12 health units in a Portuguese region incorporating management preferences given by real DMs.
1 Introduction
DEA was originally developed by Charnes et al. (1978) and is a nonparametric approach based on linear programming to evaluate the performances of the DMUs (homogeneous units under evaluation) considering multiple inputs (resources) and multiple outputs (outcomes). The classical DEA models are usually (input or output) oriented models, which return radial efficiency measures. In this context, the CCR (Charnes et al., 1978) and the BCC (Banker et al., 1984) DEA approaches are both oriented and radial models. Besides, there are also non-oriented models, such as the additive model (Charnes et al., 1985), which identify inefficient DMUs but do not return an efficiency score. Generally, either constant returns to scale (CRS) or variable returns to scale (VRS) can be assumed. While the CCR model only allows accounting for CRS, the BCC and additive models enable the consideration of VRS.
Although these DEA models allow computing the projections of inefficient DMUs onto the efficient frontier, the results depend on the scales used to measure the input and output factors, and the efficiency measure is very pessimistic, because the L1 distance is being maximized. Besides, particularly in the case of the additive model, the efficiency measure obtained does not have an intuitive interpretation. The VBDEA was developed by Gouveia et al. (2008) to overcome some disadvantages of the additive model, namely, the scaling problem. This paper describes the Python implementation of the VBDEA method and is organized as follows. The next section describes the VBDEA method. Section 3 explains the Python implementation. Finally, some conclusions are conveyed, and future work developments are outlined.
2 The VBDEA Method
The present paper describes the Python implementation of the VBDEA method proposed by Gouveia et al. (2008). This method combines the use of DEA with MAVT (Keeney & Raiffa, 1993), in the field of MCDA, as a way of incorporating the preference information provided by DMs, converting the inputs and outputs (viewed as criteria of evaluation) into value scales. Additive value functions are used to aggregate the values associated with each criterion. This transformation makes it possible to overcome the problem of scales, as all criteria are translated into value units. Furthermore, the weights used in the aggregation gain a specific meaning, as they are the scale coefficients of the value functions and determine the projection direction. The weights are chosen to benefit each DMU as much as possible, in the optimistic spirit of the BCC models. Finally, the efficiency measure of each DMU has an intuitive meaning: it can be interpreted as a “min–max regret”.
Consider n DMUs \(\left\{ {DMU_j :j = 1, \ldots ,n} \right\}\) evaluated according to their performance in a set of q criteria, with q = m + p, where \(x_{ij} \left( {i = 1, \ldots ,m} \right)\) are to be minimized and \(y_{rj} \left( {r = 1, \ldots ,p} \right)\) are to be maximized. Using MAVT concepts, the conversion consists of building partial value functions \(\left\{ {v_c (DMU_j ), c = 1, \ldots ,q, j = 1, \ldots ,n} \right\}\). Each of them is defined within the interval [0, 1], assuming that for each criterion c the worst performance among \(p_{cj} ,j = 1, \ldots ,n,\) has the value 0 and the best performance has the value 1, so that all criteria are to be maximized. Subsequently, the criteria are gathered into a global value function, \(V\left( {DMU_j } \right) = \sum_{c = 1}^q {w_c v_c \left( {DMU_j } \right)}\), where \(w_c \ge 0\), ∀c = 1,…, q and \(\sum_{c = 1}^q {w_c = 1}\) (by convention). The weights \(w_1 , \ldots ,w_q\) considered in the additive value function are the scale coefficients and are set in a way that each alternative minimizes the value difference from the best alternative, according to the “min–max regret” rule (Bell, 1982).
The VBDEA method comprises two phases after all factors have been converted into a value scale.
Phase 1: Compute the efficiency measure, \(d_k^*\), for each \({\text{DMU}}_{\text{k}}\) (k = 1, …, n), and the corresponding weighting vector \({\varvec{w}}_{\varvec{k}}^*\) by solving problem (1).
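Formulation (1) is not reproduced in this text. A reconstruction consistent with the description given here and with the super-efficiency variant of Gouveia et al. (2013) could read as follows; the exact layout is an assumption, and the original papers remain the authoritative statement:

\[
\begin{aligned}
\min_{d_k,\,{\varvec{w}}}\quad & d_k \\
\text{s.t.}\quad & \sum_{c=1}^{q} w_c v_c(DMU_j) - \sum_{c=1}^{q} w_c v_c(DMU_k) \le d_k, \quad j = 1,\ldots,n;\ j \ne k, \\
& \sum_{c=1}^{q} w_c = 1, \\
& w_c \ge 0, \quad c = 1,\ldots,q. \qquad (1)
\end{aligned}
\]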
It is worth noting that Gouveia et al. (2013) included the concept of superefficiency (Andersen & Petersen, 1993) in formulation (1) to accommodate the discrimination of efficient DMUs.
The optimal value of the objective function, \(d_k^*\), is the value difference to the best of all DMUs (note that the best DMU will also depend on \({\varvec{w}}_k^*\)), excluding itself from the reference set. If \(d_k^*\) is negative, then the \({\text{DMU}}_{\text{k}}\) under evaluation is efficient. In the end, it is possible to rank the efficient DMUs by considering that the more negative the value of \(d_k^*\), the more efficient is \({\text{DMU}}_{\text{k}}\).
Phase 2: If \(d_k^{*} \ge 0\), then solve the “weighted additive” problem (2), using the optimal weighting vector resulting from Phase 1, \({\varvec{w}}_k^*\), and determine the corresponding projected point of the \({\text{DMU}}_{\text{k}}\) under evaluation.
If \({\text{d}}_{\text{k}}^{*}\) is non-negative, then \({\text{DMU}}_{\text{k}}\) is inefficient and a projection target can be obtained through the following problem:
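Formulation (2) is missing from this text. A reconstruction consistent with the roles of \(\lambda_j\) and \(s_c\) in the surrounding description, following Gouveia et al. (2008, 2013), could be written as below; it is an assumption rather than a verbatim copy, with \(w_c^*\) denoting the components of \({\varvec{w}}_k^*\):

\[
\begin{aligned}
\min_{\lambda,\,s}\quad & z_k = -\sum_{c=1}^{q} w_c^{*} s_c \\
\text{s.t.}\quad & \sum_{j=1,\,j\ne k}^{n} \lambda_j v_c(DMU_j) - s_c = v_c(DMU_k), \quad c = 1,\ldots,q, \\
& \sum_{j=1,\,j\ne k}^{n} \lambda_j = 1, \\
& \lambda_j \ge 0,\ j \ne k; \quad s_c \ge 0,\ c = 1,\ldots,q. \qquad (2)
\end{aligned}
\]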
The group of efficient DMUs that defines a convex combination with \(\lambda_j > 0\) (j = 1,…, k − 1, k + 1,…,n) is called the set of “benchmarks” of \({\text{DMU}}_{\text{k}}\). This convex combination leads to a point on the efficient frontier that is better than \({\text{DMU}}_{\text{k}}\) by a difference of value of \(s_c\) (slack) in each criterion c.
2.1 Elicitation of Value Functions and Weight Restrictions
In the VBDEA method, the objective of converting the criteria into value scales (linear/nonlinear value functions) is to reflect the preferences of the DMs, considering the generalization of the DEA methodology presented by Cook and Zhu (2009) that incorporates piecewise linear functions of input and output factors.
To convert the criteria into value scales, we established two limits, \(M_c^L\) and \(M_c^U ,\) taking into account an acceptable tolerance on the performance values (in this case, δ = 10%). We choose \(M_c^L < \min \left\{ {p_{cj}^L ,j = 1, \ldots ,n} \right\}\) and \(M_c^U > \max \left\{ {p_{cj}^U ,j = 1, \ldots ,n} \right\},\) for each \(c = 1, \ldots ,q\), to set the 0 and 1 levels on the value scale, according to the type of factor, input or output. After that, we compute value functions setting the values for each \(DMU_j , \, j = 1, \ldots ,n\) using:
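The conversion formula itself is not reproduced in this text. For the linear case, a form consistent with the definitions of \(M_c^L\) and \(M_c^U\) would be the following (an assumption based on the description; nonlinear value functions replace these expressions with the fitted curves):

\[
v_c(DMU_j) =
\begin{cases}
\dfrac{p_{cj} - M_c^L}{M_c^U - M_c^L}, & \text{if } c \text{ is an output (to be maximized)},\\[2ex]
\dfrac{M_c^U - p_{cj}}{M_c^U - M_c^L}, & \text{if } c \text{ is an input (to be minimized)}.
\end{cases}
\]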
To build the piecewise linear functions or non-linear value functions, we extract the difference in the DMU value that corresponds to decreases in inputs or increases in outputs, rather than the utility of having those inputs available or outputs produced. In this way, we do not speak of absolute values, but relative values.
The elicitation protocol can be based on comparing the value of increasing an output (or decreasing an input) from a to b versus increasing the same output (or decreasing the same input) from a’ to b’, all other performance levels being equal, and asking the DM to adjust one of these four numbers so that the value increase is approximately equal. This is always possible assuming the functions are continuous and monotonic.
The DM’s answers to the questions about the value differences between the performance levels in each factor allow extracting the value functions, which can be a piecewise linear approximation. When the DM’s responses can be fitted into predefined curves, we use other functions (like logarithmic, or exponential functions).
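A piecewise linear value function of this kind can be sketched with linear interpolation between the elicited breakpoints. The function name and the breakpoints below are hypothetical placeholders chosen for illustration; they are not the values elicited in the reference study.

```python
import numpy as np

def piecewise_value(performance, levels, values=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Piecewise linear value function through elicited breakpoints.

    levels: performance levels listed from worst to best; values: their values.
    For an input (cost) the worst level is the largest, so levels are decreasing.
    """
    levels = np.asarray(levels, dtype=float)
    values = np.asarray(values, dtype=float)
    if levels[0] > levels[-1]:
        # np.interp requires increasing x-coordinates, so reverse both arrays.
        levels, values = levels[::-1], values[::-1]
    return np.interp(performance, levels, values)

# Hypothetical breakpoints for an output (more is better).
output_levels = [0, 10_000, 18_000, 24_000, 30_000]
print(piecewise_value(14_000, output_levels))   # halfway between 0.25 and 0.5

# Hypothetical breakpoints for an input (less is better), worst to best.
input_levels = [8.0, 5.0, 3.5, 2.0, 1.0]
print(piecewise_value(4.25, input_levels))      # halfway between 0.25 and 0.5
```

Performances outside the elicited range are clamped to 0 or 1 by `np.interp`, which matches the convention that the value scale is bounded by the chosen limits.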
For a better understanding of the method and its implementation, we will follow the process with a replication of an illustrative example by Gouveia et al. (2016). The purpose of the study carried out by Gouveia et al. (2016) was to evaluate the efficiency of 12 primary health care units monitored by the “Group of Health Centres” in Portugal, with data from 2010. The perspective under consideration, designated as Model 2 in that study, uses as inputs (costs): total cost collected to the National Health Service (NHS) with complementary means of diagnosis and treatment (xCMDT); total medicine costs collected to the NHS (xMED); total cost of human resources (xHR); and medical costs not collected to the NHS, clinical consumables, and other costs (xOC). The only output is the number of medical consultations for registered patients (yCONS).
In the literature, we can find several techniques to obtain information regarding the DM’s preferences to construct value functions in agreement with his/her answers (Goodwin & Wright, 1998; von Winterfeldt & Edwards, 1986), but the questions must be structured for each specific context.
Table 1 summarizes, as an example, the performance levels elicited to construct the value function for the output factor. These levels correspond to values 0.25, 0.5, and 0.75 (resulting from this type of dialogue), such that an improvement from level 0 to level 0.25 corresponds to the same value as an improvement from level 0.25 to 0.5, etc.
For xCMDT, xMED, and xHR, the value functions were obtained by fitting a logarithmic function to match as well as possible the answers of the DM.
In the VBDEA method, the DMU under evaluation is free to choose the scale coefficients (weights) of the marginal value functions aggregated with an additive MAVT model, to become the best DMU (if possible) or to minimize the difference of value to the best DMU, i.e., getting the best possible efficiency score considering only the marginal values of the inputs and outputs. However, some factors may be disregarded from the assessment, as DMUs may assign zero weight to some factors, incompatible with the DM’s preferences. Thus, it is necessary to consider the weight constraints, as they may better reflect the organizational objectives and, therefore, guarantee significant results closer to what the DM considers to be the best practices.
There are several approaches to defining weight restrictions. In this context, specifying appropriate weight restrictions can be a very challenging task (Podinovski, 2004; Salo & Hämäläinen, 2001). In the Value-Based DEA method, the weights used in the aggregation are the scale coefficients of the value functions reflecting possible value trade-offs between different factors. Assigning values to the scale coefficients requires a series of judgments obtained from the DM. Direct classification techniques should be avoided, as the value of these coefficients does not reflect the DM’s intuitive notion of the importance of each criterion. On the contrary, they are heavily dependent on the performances chosen to represent levels 0 and 1 on the value scale. In MCDA, several valid protocols are known to elicit weight restrictions derived from the DM’s preferences (Goodwin & Wright, 1998; von Winterfeldt & Edwards, 1986). In this case, the swing technique is simple and clear for the DM. The swing method begins by constructing two extreme hypotheses, P0 and P1, with the first displaying the worst performance (having value 0) in all criteria scales and the second the corresponding best performance (having value 1). The preference elicitation protocol consists in querying the DM to look at the potential gains from moving from P0 to P1 in each criterion and then deciding which criterion he/she prefers to shift to hypothesis P1. Suppose that the transition from hypothesis P0 to hypothesis P1 in a specified criterion is worth 100 units on a hypothetical scale. Then, the DM is asked to give a value (<100) to the second criterion moved to P1, then to the third criterion, and so on, until the last criterion is moved to P1. The procedure used in the paper that we are using as a reference was to obtain, firstly, a ranking of weights and, secondly, to establish a limit for the ratio between the weights ranked first and last, to avoid null weights.
Considering W to be the set of weight vectors compatible with the elicited ranking and limit, it is necessary to include the weight restrictions in Phase 1 by adding to formulation (1) the constraint (w1, …, wq) ∈ W. This change in Phase 1 requires a corresponding change in the formulation of the problem solved in Phase 2: slacks are allowed to take negative values; otherwise, it might not be possible to keep the optimal value difference \(d_k^*\) resulting from formulation (1) including the weight restrictions.
Weight restrictions were elicited by asking the DM to compare the “swings” from values 0 to 1 as depicted in Table 2.
The DM was asked to consider a unit with the performance level 0 for all factors and the question was: “If you could improve one and only one factor to level 1, what would it be?”. The DM’s answer was: xMED. This allows the inference that wMED is the highest scaling coefficient. By repeating this question successively for the remaining factors, the ranking of the scale coefficients obtained was: wMED ≥ wCMDT ≥ wHR ≥ wOC ≥ wCONS.
The answer to the question “What would be the lowest amount h that would allow a unit with 25,000 medical consultations for registered patients and total medicine costs collected to the NHS of 5.5 million euros to be considered as having more value than a unit with 4000 medical consultations for registered patients and total medicine costs collected to the NHS of h?” was h = 2.5 million euros. This answer is translated into: wCONS vCONS(25,000) + wMED vMED(5,500,000) ≥ wCONS vCONS(4000) + wMED vMED(h). Substituting h in the previous expression yields: wMED ≤ 2.47 wCONS.
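The elicited ranking and the ratio bound are linear in the weights, so they can be written as rows of an inequality system G w ≤ h and appended to formulation (1). The sketch below is one possible encoding; the function name and the assumed factor order (xCMDT, xMED, xHR, xOC, yCONS) are illustrative assumptions and are not taken from the notebook.

```python
import numpy as np

def weight_restriction_rows(ranking, max_ratio=None):
    """Build G and h such that G @ w <= h encodes the weight restrictions.

    ranking: factor indices ordered from most to least important.
    max_ratio: optional upper bound on w[ranking[0]] / w[ranking[-1]].
    """
    q = len(ranking)
    rows = []
    for higher, lower in zip(ranking[:-1], ranking[1:]):
        row = np.zeros(q)
        row[lower], row[higher] = 1.0, -1.0      # w_lower - w_higher <= 0
        rows.append(row)
    if max_ratio is not None:
        row = np.zeros(q)
        row[ranking[0]] = 1.0                    # w_first - r * w_last <= 0
        row[ranking[-1]] = -max_ratio
        rows.append(row)
    return np.array(rows), np.zeros(len(rows))

# Assumed column order: xCMDT, xMED, xHR, xOC, yCONS (indices 0..4).
ranking = [1, 0, 2, 3, 4]                        # wMED >= wCMDT >= wHR >= wOC >= wCONS
G, h = weight_restriction_rows(ranking, max_ratio=2.47)

equal_weights = np.full(5, 0.2)
only_med = np.array([0.0, 1.0, 0.0, 0.0, 0.0])
print(np.all(G @ equal_weights <= h + 1e-12))    # equal weights satisfy every row
print(np.all(G @ only_med <= h + 1e-12))         # all weight on xMED breaks the ratio bound
```

Bounding the ratio between the first and last ranked weights, together with the ranking, keeps every weight strictly positive, which is what prevents factors from being ignored.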
3 Python Implementation
The VBDEA method was implemented in the Python programming language (Python.org, 2022) as a Jupyter Notebook (Jupyter.org, 2022); the code can also be executed from the console.
The execution of the VBDEA method comprises several steps: loading the model; converting the performances from the original scales to the value scales; computing the first phase of the method; computing the second phase of the method; and converting the performances from the value scales back to the original scales, so that the user understands the improvement proposals for the units classified as inefficient. The implementation of these steps is presented below in five sections.
The Jupyter notebook with the Python implementation and the files of Model 2 used for demonstration purposes are available in a GitHub repository at https://github.com/atrigo/vbDEA_notebook (Trigo et al., 2022).
3.1 Load the Model from a File
At this first step, the model to be executed is imported from a text file by a Python script. Figure 1 presents an example of such a file, relative to the case that we are using to demonstrate the Python implementation (Gouveia et al., 2016). This file has the same name as the model, which is “Model 2”.
Files containing models to be run by the application must have the same format as shown in Fig. 1, which consists of the following: a first line with the name of the model; then, lines that have the function type to be used for the conversion of the values to and from the value scale, which can be of three types: linear of multiple scales, exponential or logarithmic. Afterward, the values for matrix A are defined; the next lines contain the values for matrix B, followed by a dashed line; and, finally, two lines with optional parameters: the first line with the importance of the factors, ordered from the most important (in the case of Fig. 1, the second factor) to the least important (in the case of Fig. 1, the last factor), and, the second line, with an optional parameter (in the case of Fig. 1, the value 2.47, which represents the limit for the ratio between the weights ranked first and last). Note that between all the model parameter definitions there are blank lines that must be respected for the Python script to work.
Figure 2 shows the output of the Python script after running this first step. If the file is read successfully, the parameters defined in the text file are displayed in the output.
3.2 Conversion of the Values to Value Scales
Once the code that loads the model to be executed has been created, the second step is the conversion of the performances in original scale values into value scales according to the chosen model functions.
Figure 3 depicts the file with the DMUs’ performances in original scales. Figure 4 shows the file with the conversion of performances from original scales into value scales with the algorithm created in the Python language. The Python algorithm reads a file with the name <[modelname]_originals.csv> and returns a file with the name <[modelname]_valuescale.csv> with the values converted.
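The linear case of this conversion can be sketched as follows. This is a minimal illustration under stated assumptions, not the notebook's code: the function name, the toy performances, and the limits are hypothetical, and the logarithmic and piecewise linear cases are not covered.

```python
import numpy as np

def to_value_scale(perf, m_low, m_up, is_input):
    """Linearly convert a (DMUs x factors) performance matrix into [0, 1] values.

    m_low, m_up: per-factor limits that define value 0 and value 1.
    is_input: per-factor flags; inputs are worth more when they are smaller.
    """
    perf = np.asarray(perf, dtype=float)
    m_low = np.asarray(m_low, dtype=float)
    m_up = np.asarray(m_up, dtype=float)
    is_input = np.asarray(is_input, dtype=bool)
    span = m_up - m_low
    as_output = (perf - m_low) / span    # larger performance, larger value
    as_input = (m_up - perf) / span      # smaller performance, larger value
    return np.where(is_input, as_input, as_output)

# Toy data: two DMUs, one input (column 0) and one output (column 1).
perf = [[100.0, 20.0],
        [200.0, 10.0]]
values = to_value_scale(perf, m_low=[50.0, 0.0], m_up=[250.0, 40.0],
                        is_input=[True, False])
print(values)
```

In the notebook the matrix comes from the originals file and the result is written to the value-scale file; since the CSV layout is only shown in the figures, file handling is left out of this sketch.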
To help interpret the values in these files, Table 3 lists the original factors’ names, descriptions, and types, together with the corresponding codification used in the files depicted in Figs. 3 and 4.
3.3 Calculations of the First Step of the Method
After converting the values into value scales, we are ready to run the first phase of the method, which consists of computing the efficiency measure, \(d_k^*\), for each \({\text{DMU}}_{\text{k}}\) (k = 1,…, n), and the corresponding weighting vector \({\varvec{w}}_k^*\) by solving linear problem (1), as previously described in Sect. 2.
The implementation of this part of the algorithm requires a linear programming solver, in this case the linprog function of the package scipy.optimize (The SciPy community, 2022), which provides several optimization functions, in addition to the numpy and pandas packages already used in the previous steps.
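A Phase 1 solve with `linprog` can be sketched as below. It follows the description of problem (1) in Sect. 2 (minimize the value difference to the best other DMU, with non-negative weights summing to one); the function name `phase1`, the optional restriction arguments `G` and `h`, and the toy value matrix are assumptions for illustration, and the notebook's actual code may be organized differently.

```python
import numpy as np
from scipy.optimize import linprog

def phase1(values, k, G=None, h=None):
    """Return (d_star, w_star) for DMU k given an (n, q) matrix of values.

    G, h: optional weight restrictions of the form G @ w <= h.
    """
    values = np.asarray(values, dtype=float)
    n, q = values.shape
    others = np.delete(values, k, axis=0)         # super-efficiency: DMU k is excluded
    # Decision vector x = (w_1, ..., w_q, d); the objective is to minimize d.
    c = np.r_[np.zeros(q), 1.0]
    # (v_j - v_k) . w - d <= 0 for every j != k
    A_ub = np.hstack([others - values[k], -np.ones((n - 1, 1))])
    b_ub = np.zeros(n - 1)
    if G is not None:
        G = np.asarray(G, dtype=float)
        A_ub = np.vstack([A_ub, np.hstack([G, np.zeros((len(G), 1))])])
        b_ub = np.r_[b_ub, h]
    A_eq = np.r_[np.ones(q), 0.0].reshape(1, -1)  # weights sum to one
    bounds = [(0, None)] * q + [(None, None)]     # w >= 0, d unrestricted in sign
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[-1], res.x[:q]

# Toy value matrix: three DMUs, two criteria (not the Model 2 data).
values = np.array([[1.0, 0.2],
                   [0.2, 1.0],
                   [0.4, 0.4]])
for k in range(len(values)):
    d_star, w_star = phase1(values, k)
    print(k, round(d_star, 4), np.round(w_star, 4))
```

Sorting the units by increasing d* then gives the ranking, with negative values identifying the efficient units.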
The results from the first phase of the method are presented in Fig. 5 and the ranking of units is: DMU 9 \(\succ\) DMU 7 \(\succ\) DMU 5 \(\succ\) DMU 4 \(\succ\) DMU 1 \(\succ\) DMU 2 \(\succ\) DMU 3 \(\succ\) DMU 8 \(\succ\) DMU 10 \(\succ\) DMU 11 \(\succ\) DMU 12 \(\succ\) DMU 6, where the first seven DMUs are efficient, because they have d* < 0. The lower the value of d* the better: if d* is negative, then the DMU under analysis is efficient; otherwise, it is inefficient.
The DMUs freely choose their weights to become the best DMU (if possible) or to minimize the difference in value to the best DMU. Some units disregard factors from the evaluation: DMU 5 and DMU 7, for instance, considered only one of the five factors to be ranked as efficient, namely w*CMDT = 1 and w*CONS = 1, respectively.
3.4 Calculations of the Second Stage of the Method
In Phase 2 of the VBDEA method, the optimal weighting vector is used to solve the problem with formulation (2) for the DMUs classified as inefficient. The solution is a proposed efficiency target (projection) for each inefficient DMU. To achieve the efficient status, these inefficient DMUs must change their value in each factor by the value indicated by s*.
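A Phase 2 solve can be sketched in the same style. The sketch assumes the weighted additive form described in Sect. 2 (maximize the weighted sum of slacks over convex combinations of the other DMUs); the function name `phase2`, the `free_slacks` switch for the case with weight restrictions, and the toy data are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def phase2(values, k, w_star, free_slacks=False):
    """Return (lambdas, slacks) for an inefficient DMU k.

    lambdas has length n with lambdas[k] == 0; slacks has length q.
    free_slacks=True lets slacks be negative, as needed with weight restrictions.
    """
    values = np.asarray(values, dtype=float)
    w_star = np.asarray(w_star, dtype=float)
    n, q = values.shape
    peers = [j for j in range(n) if j != k]
    # x = (lambda_j for j != k, s_1, ..., s_q); maximize w* . s by minimizing -w* . s
    c = np.r_[np.zeros(n - 1), -w_star]
    A_eq = np.vstack([
        np.hstack([values[peers].T, -np.eye(q)]),   # sum_j lambda_j v_cj - s_c = v_ck
        np.r_[np.ones(n - 1), np.zeros(q)],         # sum_j lambda_j = 1
    ])
    b_eq = np.r_[values[k], 1.0]
    slack_bound = (None, None) if free_slacks else (0, None)
    bounds = [(0, None)] * (n - 1) + [slack_bound] * q
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    lambdas = np.zeros(n)
    lambdas[peers] = res.x[: n - 1]
    return lambdas, res.x[n - 1:]

# Toy value matrix (not the Model 2 data); DMU 2 is inefficient with w* = (0.5, 0.5).
values = np.array([[1.0, 0.2],
                   [0.2, 1.0],
                   [0.4, 0.4]])
w_star = np.array([0.5, 0.5])
lambdas, slacks = phase2(values, 2, w_star)
print("benchmarks:", np.nonzero(lambdas > 1e-9)[0])
print("weighted slack:", w_star @ slacks)
```

The DMUs with positive λ are the benchmarks, and the projected point in value units is the evaluated DMU's value vector plus the slacks. The optimal λ need not be unique, so different solvers may report different benchmark mixes with the same weighted slack.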
In our example, the DMU that is most often selected as a benchmark is DMU 7, and, for example, DMU 6 is inefficient, and it is projected onto the efficiency frontier in a target obtained by a linear combination of DMUs 4 and 7.
Figure 6 shows the outputs of the second phase of the model. In this file, for model readability reasons, the weights (w1, w2, w3, w4, and w5) calculated in the first stage of the model are also visible. Table 4 shows the same output but formatted in a tabular form.
3.5 Conversion of Values into the Original Scale
Table 5 depicts the values of the slacks in their original value scales.
As the slacks present positive values only for inputs, these values must be subtracted from the values of the performances in the original scale, thus obtaining the projected points in the efficiency frontier. The slack values translate the reductions to be implemented in the inputs in the sense that each of the inefficient DMUs manages to be at the level of those that are operating efficiently, i.e., those that are examples of best practices.
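For factors with a linear value function, the conversion back to the original scale is a simple rescaling, sketched below. The function name and the numbers are hypothetical, and the sketch does not cover the logarithmic value functions used for some inputs in the example, where the inverse of the fitted function would be needed instead.

```python
import numpy as np

def slacks_to_original(perf_k, slacks, m_low, m_up, is_input):
    """Convert value-scale slacks of one DMU into original units (linear case only).

    Returns (delta, target): the change per factor and the projected performance.
    """
    perf_k = np.asarray(perf_k, dtype=float)
    slacks = np.asarray(slacks, dtype=float)
    span = np.asarray(m_up, dtype=float) - np.asarray(m_low, dtype=float)
    delta = slacks * span                          # one value unit spans (M_U - M_L)
    is_input = np.asarray(is_input, dtype=bool)
    # Inputs improve by decreasing, outputs by increasing.
    target = np.where(is_input, perf_k - delta, perf_k + delta)
    return delta, target

# Toy example: one input with slack 0.1 and one output with slack 0.
delta, target = slacks_to_original(perf_k=[200.0, 10.0], slacks=[0.1, 0.0],
                                   m_low=[50.0, 0.0], m_up=[250.0, 40.0],
                                   is_input=[True, False])
print(delta, target)
```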
4 Conclusions and Further Research
The purpose of this work is to describe the Python implementation of the VBDEA approach. This methodological framework explores the connections between DEA and MCDA and presents a fresh viewpoint on the application of the additive DEA model based on MAVT. One of the major advantages of VBDEA over traditional DEA approaches is that it provides information on the leading causes of DMUs’ (in)efficiency. Furthermore, because it is based on a super-efficiency model, this technique has a higher discriminatory power since it allows ranking both efficient and inefficient DMUs. With the use of value functions, this technique, in addition to permitting the inclusion of the DMs’ preferences, readily handles negative or null data. In this regard, we display the Python implementation of the approach by replicating the major findings of Gouveia et al. (2016), who assessed the efficiency of 12 health facilities in a Portuguese region using management preferences provided by real DMs.
Future work is currently underway to further develop the algorithm presented to make it freely available in a web application so that it can be used by different types of users (https://adept.iscac.pt).
References
Andersen, P., & Petersen, N. C. (1993). A procedure for ranking efficient units in data envelopment analysis. Management Science, 39(10), 1261–1264.
Banker, R. D., Charnes, A., & Cooper, W. W. (1984). Some models estimating technical and scale inefficiencies in Data Envelopment Analysis. Management Science, 30, 1078–1092.
Bell, D. E. (1982). Regret in decision making under uncertainty. Operations Research, 30(2), 961–981.
Charnes, A., Cooper, W. W., Golany, B., Seiford, L., & Stutz, J. (1985). Foundations of data envelopment analysis for Pareto-Koopmans efficient empirical production functions. Journal of Econometrics, 30(1–2), 91–107. https://doi.org/10.1016/0304-4076(85)90133-2
Charnes, A., Cooper, W. W., & Rhodes, E. (1978). Measuring the efficiency of decision making units. European Journal of Operational Research, 2, 429–444. https://doi.org/10.1016/0377-2217(78)90138-8
Cook, W. D., & Zhu, J. (2009). Piecewise linear output measures in DEA (third revision). European Journal of Operational Research, 197, 312–319.
Goodwin, P., & Wright, G. (1998). Decision analysis for management judgment (2nd edn.). Chichester: Wiley.
Gouveia, M. C., Dias, L. C., & Antunes, C. H. (2008). Additive DEA based on MCDA with imprecise information. Journal of the Operational Research Society, 59(1), 54–63. https://doi.org/10.1057/palgrave.jors.2602317
Gouveia, M. C., Dias, L. C., & Antunes, C. H. (2013). Super-efficiency and stability intervals in additive DEA. Journal of the Operational Research Society, 64(1), 86–96. https://doi.org/10.1057/jors.2012.19
Gouveia, M. C., Dias, L. C., Antunes, C. H., Mota, M. A., Duarte, E. M., & Tenreiro, E. M. (2016). An application of value-based DEA to identify the best practices in primary health care. OR Spectrum, 38(3), 743–767.
Jupyter Notebook. (2022). Jupyter.org. Retrieved July 28, 2022, from https://www.jupyter.org.
Keeney, R. L., & Raiffa, H. (1993). Decisions with multiple objectives: preferences and value trade-offs. Cambridge University Press.
Podinovski, V. V. (2004). Production trade-offs and weight restrictions in data envelopment analysis. Journal of the Operational Research Society, 55, 1311–1322.
Python Programming Language. (2022). Python.Org. Retrieved July 28, 2022, from https://www.python.org.
Salo, A. A., & Hämäläinen, R. P. (2001). Preference Ratios in Multiattribute Evaluation (PRIME)—elicitation and decision procedures under incomplete information. IEEE Transactions on Systems Man and Cybernetics Part A, 31, 533–545.
The SciPy community. (2022). Optimization and root finding (scipy.optimize). SciPy v1.9.0 Manual. Retrieved July 29, 2022, from https://docs.scipy.org/doc/scipy/reference/optimize.html.
Trigo, A., Henriques, C., & Gouveia, M. (2022). vbDEA notebook. Retrieved July 28, 2022, from https://github.com/atrigo/vbDEA_notebook.
von Winterfeldt, D., & Edwards, W. (1986). Decision analysis and behavioral research. Cambridge University Press.
Acknowledgements
This work has been funded by European Regional Development Fund in the framework of Portugal 2020—Programa Operacional Assistência Técnica (POAT 2020), under project POAT-01-6177-FEDER-000044 ADEPT: Avaliação de Políticas de Intervenção Co-financiadas em Empresas. INESC Coimbra and CeBER are supported by the Portuguese Foundation for Science and Technology funds through Projects UID/MULTI/00308/2020 and UIDB/05037/2020, respectively.
Rights and permissions
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Copyright information
© 2023 The Author(s)
About this paper
Cite this paper
Trigo, A., Gouveia, M., Henriques, C. (2023). Python Implementation of the Value-Based DEA Method. In: Henriques, C., Viseu, C. (eds) EU Cohesion Policy Implementation - Evaluation Challenges and Opportunities. EvEUCoP 2022. Springer Proceedings in Political Science and International Relations. Springer, Cham. https://doi.org/10.1007/978-3-031-18161-0_4
DOI: https://doi.org/10.1007/978-3-031-18161-0_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-18160-3
Online ISBN: 978-3-031-18161-0
eBook Packages: Political Science and International Studies (R0)