Pay for performance in health care: a new best practice tariff-based tool using a log-linear piecewise frontier function and a dual–primal approach for unique solutions

Health care systems worldwide face a scarcity of resources, which should be allocated to health care providers according to the needs of the corresponding populations. Such an allocation should be as effective and efficient as possible to guarantee the sustainability of those systems. One way to reach that goal is through (prospective) payments to the providers for their clinical procedures. The way such payments are computed is frequently unknown and arguably far from optimal. For instance, in Portugal, public hospitals are clustered based on criteria related to size, consumed resources, and volume of medical acts, and payments associated with inpatient services are equal to the smallest unitary cost within each cluster. First, there is no reason to impose a single benchmark on each inefficient hospital. Second, this approach disregards dimensions like quality (and access) and the environment, which are paramount for fair comparisons and benchmarking exercises. This paper proposes an innovative tool to achieve best practice tariffs. This tool merges quality and financial sustainability concepts, attributing a hospital-specific tariff that can differ from hospital to hospital. That payment results from the combination of costs related to a set of potential benchmarks, keeping quality as high as possible and higher than a user-predefined threshold, while being able to generate considerable cost savings. To obtain the coefficients of that combination, we propose and detail a log-linear piecewise frontier function as well as a dual–primal approach for unique solutions.


Introduction
In several countries worldwide, health care providers are paid for the services that they provide to citizens. These providers include physicians, nurses, and mostly primary care centres and hospitals. Usually these payments depend on dimensions included in contracts signed between the payer (e.g. the State) and the payee (the provider). Several schemes are available: (a) block budget, characterized by a periodic prospective payment associated with the activity, (b) capitation, based on the number of enrolled patients multiplied by a unitary price, (c) case-based payments, such that health care providers are paid a prospective/retrospective lump sum per episode of care, and (d) fee for service, characterized by the retrospective payment of a price per medical act. Advantages and disadvantages of these schemes can be found in the relevant literature (e.g. Marshall et al. 2014; Friesner and Rosenman 2009; Street et al. 2011).
It is important to stress that resources, namely the financial ones, grow scarcer each passing day, motivating an optimization exercise to better allocate them. To the best of our knowledge, this is the first attempt to optimize payments to health care providers. A common feature of all payment schemes is the paid price or tariff, which is usually set as the average cost per patient/medical act/episode of care. We remark that health care is a public interest service and, as such, resource allocation and payments should account not only for the efficiency of providers but also for their quality and access (Ferreira and Marques 2018b). Although some contracts contain a number of quality and access parameters to be fulfilled, in most cases penalties are not sufficient to induce good practices. Instead of fixing these parameters (quality, access) and hoping that providers do not engage in misconduct, we should find a reference set of providers for each health care provider. Such a reference set should contain only those entities whose quality and access observations are above a pre-defined threshold that indicates a minimum acceptable level of social performance. This prevents poor quality providers from being potential benchmarks. Additionally, optimal allocation of resources requires fair comparisons in terms of internal and external operational environment (Karagiannis and Velentzas 2012; Cordero et al. 2018). Efficiency of health care providers depends on epidemiology and demographics as well as on their degree of specialization; hence, disregarding these dimensions in the payment optimization is likely to produce biased results. We therefore restrict the aforementioned reference set even further using operational environment variables, so that it contains the entities that have good quality and access levels and, simultaneously, operate under conditions similar to those of the provider whose payment we want to optimize.
The construction of this reference set constitutes a bridge between operational research and health economics/management, which appears to be innovative in the field.
Once the reference set has been constructed, we should use a benchmarking tool to derive an efficiency frontier, where benchmarks are placed. It is usual to construct a common frontier by using the entire set of health care providers. But since we want neither unfair comparisons nor potential benchmarks with poor quality, we use the reference set to construct such a frontier. Because each provider has its own comparability set, it is difficult to parametrically define a frontier shape that is equal for all providers and expect good outcomes, including fitting parameters. Therefore, the frontier should be empirically constructed, i.e. based only on inputs, outputs, and the reference set that is fixed as the overall comparability set for each provider. Data Envelopment Analysis (DEA) (Charnes et al. 1979; Banker et al. 1984) is probably the most common method to reach this goal. Because of the frontier convexity, DEA-based efficiency levels can sometimes be underestimated, meaning that inputs (outputs) should be decreased (increased) beyond what they would be in the absence of that bias. Indeed, nonconvexity is usually a more natural assumption for the frontier. The extreme case corresponds to the Free Disposal Hull (FDH) (Deprins et al. 1984; Daraio and Simar 2007), although it assumes that each provider can have one and only one benchmark. We believe that such an assumption is too restrictive, but nonconvexity should still be kept. In view of that, we extend the work of Banker and Maindiratta (1986) and construct a log-linear piecewise frontier with a directional nature. Properties of this model are studied, including the existence of multiple solutions. Consequently, we propose an extension of that model that builds on the work of Sueyoshi and Sekitani (2009) and mitigates the aforementioned problem. Moreover, in some circumstances, the log-linear program may become infeasible; thus, we propose a slight change to the model to cope with this problem.
Based on the optimized parameters resulting from the log-linear programming tool, we can derive optimal payments as well as potential cost savings related to the achieved tariffs. These are innovations in operational research.
The structure of the manuscript is as follows. Section 2 presents some useful concepts and definitions. Section 3 details and explores the new best practice tariff-based tool (the core of this study). Section 4 relates efficiency, optimal payments, and cost savings for the commissioner. Section 5 applies the new tool to the Portuguese public hospitals. Section 6 provides some concluding remarks and explains the economic impact of this tool on the Portuguese National Health Service.

Overview
Defining optimal payments due to the health care providers depends on a number of distinct features. The first one is the level of care, service, or Diagnosis Related Group (DRG) under analysis. Each country or health care service pays for clinical acts on different levels, e.g. directly to the physician or to the nurse, per service (inpatients, outpatients, surgeries, ...), per speciality, per diagnosis group, and per severity level of each diagnosis group. The concept of DRG, for instance, has long been accepted by health care management researchers as a way of accounting for the patient-mix, which is to say a measure of heterogeneity among the patients admitted to the hospital services. Researchers have analysed the efficiency and productivity of hospitals based on such a concept. For instance, Johannessen et al. (2017) recently considered some DRG scores resulting from hospitalization, day-care, and outpatient consultations, so as to investigate the productivity improvement following a Norwegian hospital reform. Some references cited therein also considered DRG to homogenize the hospital activity.
The second feature is the set of variables used to establish such a payment (or tariff) and to prevent it from resulting from unfair comparisons and/or from benchmarks that disregard important dimensions for citizens, such as quality and access. Hence, we need inputs (traditionally defined as the operational expenses required to treat patients), outputs (the quantity of treated patients), operational environment (which can be internal and/or external to the hospital), and quality (which also includes access). It is important to point out that, in our framework, quality is assumed to contribute positively to the hospital overall performance. In other words, of two quality observations, the larger one presents a higher utility to the hospital. However, in some cases, quality is measured through undesirable dimensions (e.g. avoidable mortality in low severity levels), demanding an appropriate rescaling. Such rescaling is not the goal of our paper, although an example is given in the next section.
Although DEA and similar models have been extensively used together with DRG, to the best of our knowledge, no other study has previously employed them to optimize payments in the health sector, particularly in DRG-based financing systems. This study can, then, be seen as a first step towards optimizing those payments, making them fairer and more sustainable, with adjustments for quality and operational environment.
The idea underlying the use of quality and operational environment-based dimensions is to construct comparability sets for each health care provider. Such sets should contain only those observations that are close to the provider whose tariff (payment) we want to optimize. Closeness (or proximity) is fixed by the so-called bandwidth. At this point, the comparability set associated with quality dimensions is formulated in a distinct way because, differently from size and environment, the higher the quality of potential benchmarks (best practices), the better. In fact, given a certain health care provider, if there is at least one other entity delivering better health care with fewer resources per patient, then it should be a potential benchmark for the former. Health care is a public interest service with several stakeholders, including citizens, staff, and the Government (either central or not). They usually have a minimum level of quality that is acceptable for a provider to be considered a good performer. Because in our framework poor performers cannot be benchmarks, we impose a threshold that is defined as that minimum acceptable level per quality dimension.

Technology
A production technology that transforms the total operational expenses, $X^{(\ell)}$, into in/outpatients, $Y^{(\ell)}$, can be characterized by the technology set $T^{(\ell)} \subset \mathbb{R}_+ \times \mathbb{R}_+$. The technology set $T^{(\ell)}$ follows some axioms.

Concepts and definitions
This subsection provides the mathematical formulations of (1) bandwidths; (2) comparability sets, used to restrict the original sample to those hospitals that are similar to the one whose optimal payment we want to assess and that also exhibit quality levels above a user-defined threshold; and (3) the best practice tariff, also called optimal payment (or optimal price), based on a set of coefficients to be optimized and the values observed for the entities belonging to the overall comparability set.

Bandwidths
In general, and because hospitals should be comparable, some measures of proximity (or closeness) among them are desirable. Such measures are the so-called bandwidths. The larger the bandwidth, the broader the set of hospitals accepted as comparable with the hospital $k$, whose optimal payment we want to assess. In view of that, optimal bandwidths are paramount. The choice of optimal bandwidths, $b_y^{k(\ell)}$, $b_{q_r}^{k(\ell)}$, and $b_{z_c}^{k(\ell)}$, can be debatable since there are a number of ways to compute a bandwidth. All of those ways present advantages but also shortcomings. In particular, a bandwidth can be either global or local. A bandwidth is global if it is the same across the whole sample of hospitals; otherwise, it is local.
On the one hand, global bandwidths can easily be computed by using, for instance, Silverman's rule of thumb. Let $f$ denote a probability density function, which is usually replaced by a kernel of order $c > 0$. The global bandwidth related to a variable $V$, whose observations in $\Omega$ have standard deviation $\sigma_V$, then follows from that rule (Silverman 1986). Kernel functions should have compact support, i.e., $f(u) > 0$ if $|u| \le 1$ and $f(u) = 0$ otherwise, and be symmetric around $k$; moreover, $c$ should be equal to 2 to avoid negative parts on $f$. In that case, the bandwidth is proportional to $\sigma_V J^{-1/5}$. For example, if the triweight kernel is used, we have $b_V \approx 3.62\,\sigma_V J^{-1/5}$. The smaller and the more heterogeneous the sample, the larger the bandwidth.
On the other hand, local bandwidths can be obtained e.g. through the so-called k-Nearest Neighbor method (Daraio and Simar 2007). First, one defines a grid of $N$ units, with $N$ ranging between a lower and an upper fraction of the sample size $J$, both fractions lying in $(0, 1]$. Then, one finds the value of $N$ within such range that minimises the score function $CV(N)$, which is defined through a univariate kernel function $f$. Therefore, $b_{q_r}^{k(\ell)}$ is the local bandwidth associated with hospital $k$, chosen such that there are $N$ points verifying $|q_{r^{(\ell)}}^{j(\ell)} - q_{r^{(\ell)}}^{i(\ell)}| \le b_{q_r}^{k(\ell)}$, $r^{(\ell)} \in C^{(\ell)}$. Naturally, we can specify similar local bandwidths for the case of the output and the operational environment data. There are other alternatives for the formulation of local bandwidths, including the (data-driven) least squares cross validation procedure to minimise the integrated squared error (Bǎdin et al. 2010; Hall et al. 2004; Li and Racine 2007).
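The two bandwidth flavours discussed above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, the constant 3.62 is simply taken from the triweight example in the text, the sample standard deviation is an assumed choice for $\sigma_V$, and the local bandwidth is read here as the smallest radius around hospital $k$ that covers $N$ observations (hospital $k$ included). The cross-validated choice of $N$ is not reproduced.

```python
import statistics


def triweight_global_bandwidth(values):
    """Global rule-of-thumb bandwidth b_V ~ 3.62 * sigma_V * J**(-1/5).

    The constant 3.62 is the one quoted in the text for the triweight
    kernel; using the sample standard deviation is an assumption.
    """
    J = len(values)
    sigma = statistics.stdev(values)
    return 3.62 * sigma * J ** (-1 / 5)


def knn_local_bandwidth(values, k_index, n_neighbours):
    """Smallest radius around hospital k that covers n_neighbours points.

    Hospital k itself counts as one of the points (an assumption).
    """
    centre = values[k_index]
    distances = sorted(abs(v - centre) for v in values)
    return distances[n_neighbours - 1]


# Made-up quality scores for five hospitals.
quality = [10.0, 12.0, 15.0, 20.0, 30.0]
global_b = triweight_global_bandwidth(quality)
local_b = knn_local_bandwidth(quality, k_index=2, n_neighbours=3)
```

The global bandwidth is shared by all hospitals, whereas the local one changes with the hospital placed at the centre, which is the distinction the text draws.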

Comparability sets
The idea underlying our approach is to restrict the original set of hospitals $\Omega$ based on constraints related to size (because of potential economies of scale), quality, access, and environment. To simplify the exposition, we will assume that access dimensions can be included into the group of quality variables and handled like them. Therefore, we may introduce three comparability sets that are subsets of $\Omega$.
The first comparability set regards the size of the health care provider. It is important to guarantee that best practices related to hospital $k$ have a scale of operations similar to that of $k$. For that reason, we constrain the set of admissible best practices for $k$ by using the concept of bandwidth and centring the observations associated with that comparability set on $y^{k(\ell)}$, which is assumed to be the proxy for the size of $k$.
Definition 1 (Size-related comparability set, $\Omega_y^{k(\ell)}$) Given a service or DRG $\ell \in \Psi$, the size-related comparability set for hospital $k \in \Omega$ is defined as the set of hospitals whose sizes, as measured by the output level $Y^{(\ell)}$, are close to the dimension of hospital $k$, i.e., $y^{k(\ell)}$. Let the bandwidth $b_y^{k(\ell)}$ be the closeness measure. Hence, $\Omega_y^{k(\ell)}$ is as follows:
$$\Omega_y^{k(\ell)} = \left\{ j \in \Omega : \left| y^{j(\ell)} - y^{k(\ell)} \right| \le b_y^{k(\ell)} \right\}.$$
In the following definition, we introduce the so-called quality-related comparability set, which, as before, requires a bandwidth. However, it also needs a user-predefined threshold per quality measure so as to ensure that no unit with low levels of quality (including access to health care services) can be regarded as a potential best practice for the hospital $k$ whose optimal payment is being computed. In fact, we could achieve lower levels of resource consumption at the cost of reducing the quality of supplied services, endangering the societal mission of health care providers. In practice, we would be interested in best practices that outperform $k$ in terms of both quality and access, i.e. verifying $q_{r^{(\ell)}}^{j(\ell)} \ge q_{r^{(\ell)}}^{k(\ell)}$ for any $r^{(\ell)} = 1, \ldots, R^{(\ell)}$. Nonetheless, we note that hospitals with very high quality levels may sometimes be technically inefficient in resource consumption for the quantity of services delivered. It means that, if hospital $k$ reaches the maximum quality within the dataset, then, according to our previous two conditions, no other hospital could be a benchmark for $k$ even if it were more technically efficient than the latter. This constitutes a problem because the optimal payment for $k$ would not be Pareto-efficient, jeopardising the sustainability of the health system. In practice, one can usually relax the quality levels of $k$ slightly by using bandwidths, if this results in meaningful cost savings and as long as the best practices found for $k$ present quality levels at least above the user-defined threshold.
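Definition 2 (Quality-related comparability set, $\Omega_q^{k(\ell)}$) Given a service or DRG $\ell \in \Psi$, the quality-related comparability set for hospital $k \in \Omega$ is defined as the set of hospitals whose quality dimensions $Q^{(\ell)}$ are close to the quality observed for hospital $k$ (or even above it), $q_{r^{(\ell)}}^{k(\ell)}$, and simultaneously larger than the user-defined threshold $t_{q_r}^{(\ell)}$ for all $r^{(\ell)} \in C^{(\ell)}$, $\ell \in \Psi$. Let the bandwidth $b_{q_r}^{k(\ell)}$ be the closeness measure. Hence, $\Omega_q^{k(\ell)}$ is as follows:
$$\Omega_q^{k(\ell)} = \left\{ j \in \Omega : q_{r^{(\ell)}}^{j(\ell)} \ge \max\left\{ q_{r^{(\ell)}}^{k(\ell)} - b_{q_r}^{k(\ell)},\; t_{q_r}^{(\ell)} \right\},\ \forall r^{(\ell)} \in C^{(\ell)} \right\}.$$
The definition of an environment-related comparability set is similar to the one associated with size (vide supra).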
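Definition 3 (Environment-related comparability set, $\Omega_z^{k(\ell)}$) Given a service or DRG $\ell \in \Psi$, the environment-related comparability set for hospital $k \in \Omega$ is defined as the set of hospitals whose environment conditions $Z^{(\ell)}$ are close to the conditions observed for hospital $k$, $z_{c^{(\ell)}}^{k(\ell)}$, for all $c^{(\ell)} \in U^{(\ell)}$, $\ell \in \Psi$. Let the bandwidth $b_{z_c}^{k(\ell)}$ be the closeness measure. Hence, $\Omega_z^{k(\ell)}$ is as follows:
$$\Omega_z^{k(\ell)} = \left\{ j \in \Omega : \left| z_{c^{(\ell)}}^{j(\ell)} - z_{c^{(\ell)}}^{k(\ell)} \right| \le b_{z_c}^{k(\ell)},\ \forall c^{(\ell)} \in U^{(\ell)} \right\}.$$
Using these three definitions, we can construct the overall comparability set, $\Omega^{k(\ell)}$, which is a subset of $\Omega$.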
Definition 4 (Overall comparability set, $\Omega^{k(\ell)}$) The overall comparability set related to service/DRG $\ell \in \Psi$ and the hospital $k \in \Omega$ results from the intersection of the three previously defined comparability sets:
$$\Omega^{k(\ell)} = \Omega_y^{k(\ell)} \cap \Omega_q^{k(\ell)} \cap \Omega_z^{k(\ell)}.$$
Figure 1 exemplifies the construction of the overall comparability set associated with the hospital $k = 4$ as well as the corresponding comparability sets. In this example, there are fifteen hospitals in the dataset; they could all be potential best practices for $k$ if features like size, quality, and environment were disregarded. However, we have considered one quality dimension and one environment dimension. Let the blue and the green areas (left) define the size- and the environment-related comparability sets for $k = 4$, respectively. The intersection between them results in a subset composed of the following hospitals: $\Omega_y^{k(\ell)} \cap \Omega_z^{k(\ell)} = \{3, 4, 7, 14, 15\}$. The red line (right) identifies the user-defined threshold. Arrows indicate that the quality of potential best practices should be larger than or equal to 30. However, the quality observed for hospital $k$ is equal to 70 and the bandwidth is 22, meaning that the quality of those best practices for hospital $k$ should, in fact, be larger than 48. Accordingly, we have $\Omega_y^{k(\ell)} \cap \Omega_q^{k(\ell)} = \{1, 4, 7, 15\}$ and $\Omega^{k(\ell)} = \{4, 7, 15\}$. Therefore, the set of potential best practices related to $k$ is composed of itself and of two more hospitals: 7 and 15.
In the next subsection we will explain how the best practice tariff of $k$ can be obtained from $\Omega^{k(\ell)}$. To do so, we have to introduce the following notation:
• $J_k$ is the length of the list $\Omega^{k(\ell)}$;
• $X^{k(\ell)}$, $\ell \in \Psi$, is the vector of total operational expenses associated with the $J_k$ hospitals in $\Omega^{k(\ell)}$;
• $Y^{k(\ell)}$, $\ell \in \Psi$, is the vector of total in/outpatients handled by the $J_k$ hospitals in $\Omega^{k(\ell)}$.
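In the previous example, $J_k = 3$, $X^{k(\ell)} = (x^{4(\ell)}, x^{7(\ell)}, x^{15(\ell)})^\top$, and $Y^{k(\ell)} = (y^{4(\ell)}, y^{7(\ell)}, y^{15(\ell)})^\top$.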
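The construction of the comparability sets can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a single quality dimension and a single environment dimension (as in the example above), a quality floor equal to the larger of the threshold and the quality of $k$ minus its bandwidth (which reproduces the value 48 of the example), and made-up data, since the observations behind Figure 1 are not listed in the text. All function names are hypothetical.

```python
def close_to(values, k, bandwidth):
    """Hospitals whose value lies within `bandwidth` of hospital k's value."""
    return {j for j, v in values.items() if abs(v - values[k]) <= bandwidth}


def quality_floor(q_k, bandwidth, threshold):
    """Minimum quality a potential best practice must reach (assumed form)."""
    return max(q_k - bandwidth, threshold)


def quality_set(quality, k, bandwidth, threshold):
    """Hospitals at or above the quality floor associated with hospital k."""
    floor = quality_floor(quality[k], bandwidth, threshold)
    return {j for j, v in quality.items() if v >= floor}


def overall_set(size, quality, environment, k, b_y, b_q, b_z, threshold):
    """Intersection of the size-, quality- and environment-related sets."""
    return (close_to(size, k, b_y)
            & quality_set(quality, k, b_q, threshold)
            & close_to(environment, k, b_z))


# Made-up data for five hospitals, keyed by hospital id.
size = {1: 100, 2: 300, 3: 120, 4: 110, 5: 105}
quality = {1: 80, 2: 90, 3: 40, 4: 70, 5: 50}
environment = {1: 9.0, 2: 2.0, 3: 2.5, 4: 2.0, 5: 2.2}

benchmarks = overall_set(size, quality, environment, k=4,
                         b_y=20, b_q=22, b_z=1.0, threshold=30)
```

With these made-up numbers the quality floor is 48, as in the example of the text, and hospital 4 always belongs to its own overall comparability set.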

Best practice tariffs
The best practice tariff of hospital $k \in \Omega$ and service/DRG $\ell \in \Psi$ is, by definition, the price paid per medical act, assessed using the information of benchmarks (best practices) which, in turn, perform at least as well as hospital $k$ in the provision of service/DRG $\ell$. Because hospitals present distinct technologies and face heterogeneous environments, they must be made comparable using data related to the internal (size, complexity of inpatients) and the external operational environment (demographics, epidemiology). Likewise, poor quality hospitals should not be considered potential best practices for $k$. For that reason, we have created the concept of comparability sets associated with this hospital. Using it, we can formulate the optimal (best practice based) tariff paid to hospital $k$.
Definition 5 ($O$-order best practice tariff paid to hospital $k$ and service/DRG $\ell$, $P^{k(\ell,O)}$) Let $k \in \Omega$ be a hospital, $\ell \in \Psi$ a service or a DRG, and $\Omega^{k(\ell)}$ the overall comparability set related to the former two. The weighted Hölder (or power/generalized) mean with order $O$ of a vector $V = (V_1, \ldots, V_i, \ldots, V_n)^\top$ and weights $\mu = (\mu_1, \ldots, \mu_i, \ldots, \mu_n)^\top$, verifying $\sum_{i=1}^{n} \mu_i = 1$, is
$$H(V, \mu, O) = \langle \mu, V^{O} \rangle^{1/O} = \left( \sum_{i=1}^{n} \mu_i V_i^{O} \right)^{1/O},$$
where $\langle \cdot, \cdot \rangle$ denotes the inner product of two vectors and the power applies to each component of $V$. The $O$-order best practice tariff that should be paid to hospital $k$ is the relationship between the optimal costs and the number of in/outpatients for that hospital. We assume that optimal expenses can be modelled as weighted Hölder averages with order $O$ and weights $\mu^k$:
$$P^{k(\ell,O)} = \frac{H(X^{k(\ell)}, \mu^k, O)}{y^{k(\ell)}}. \tag{8}$$
In this case, we assume that the number of in/outpatients is out of the control of the hospital managers. If this is not the case, we simply have to replace the denominator of Eq. (8) by $H(Y^{k(\ell)}, \mu^k, O)$.
The $O$-order best practice tariff paid to hospital $k$ for the service/DRG $\ell$ in the previous example is, according to Eq. (8):
$$P^{k(\ell,O)} = \frac{\left( \mu_1^k \,(x^{4(\ell)})^{O} + \mu_2^k \,(x^{7(\ell)})^{O} + \mu_3^k \,(x^{15(\ell)})^{O} \right)^{1/O}}{y^{4(\ell)}}, \qquad k = 4.$$
The problem associated with Eq. (8) is that, in general, we do not know the weights $\mu^k$ to use. Therefore, instead of imposing quantities with little or even no empirical support, these weights should be optimized, for instance, using linear programming tools. In the next subsections, we discuss how cost savings can arise.
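The weighted Hölder mean and the resulting tariff are simple to compute once the weights are known. The following Python sketch is illustrative only: the function names are hypothetical, the weights are supplied by hand rather than optimized, all values are assumed to be strictly positive, and the order 0 is handled as the weighted geometric mean (the usual limiting case of the power mean).

```python
import math


def holder_mean(values, weights, order):
    """Weighted Hölder (power) mean of positive values, weights summing to one."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to one")
    if order == 0:
        # Limiting case: weighted geometric mean.
        return math.exp(sum(w * math.log(v) for v, w in zip(values, weights)))
    return sum(w * v ** order for v, w in zip(values, weights)) ** (1.0 / order)


def best_practice_tariff(costs, weights, order, patients_k):
    """Tariff of hospital k: Hölder mean of benchmark costs per patient of k."""
    return holder_mean(costs, weights, order) / patients_k


# Made-up costs of two benchmarks and hand-picked weights.
tariff = best_practice_tariff([100.0, 400.0], [0.5, 0.5], order=0,
                              patients_k=50.0)
```

Lower orders pull the mean towards the cheaper benchmarks (the harmonic mean for order -1, the geometric mean for order 0), so for the same weights the tariff does not increase as the order decreases.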

A new best practice tariff-based tool
In the last section, we have defined the overall comparability set and explained how the best practice tariffs can be expressed in terms of some coefficients that should be optimized using benchmarking tools. In our case, we develop and explore a model similar to DEA that constructs a log-linear piecewise frontier and which is more flexible than DEA itself. This flexible model is named multiplicative (or log-) DEA. The coefficients $\mu^k$ can be optimized using a log-linear programming model, as detailed in Sect. 3.3, which extends the work of Banker and Maindiratta (1986). Some properties of the new model are studied in Sect. 3.4. The reader should be aware of the problem of multiple/degenerate solutions associated with those coefficients, which can be critical to our optimal payments. Therefore, we extend our DEA model to solve the problem of multiple solutions (Sect. 3.5). In some circumstances, this model may become infeasible; hence, we propose a strategy to overcome this problem, vide Sect. 3.6. Finally, Sect. 3.7 presents a simple step-by-step procedure to simplify the exposition and to sum up our approach to optimize payments.

Past research on multiplicative DEA
The proposed way of optimizing tariffs requires Pareto-optimal weights $\mu^k$. DEA can be very useful in such a situation. Since its foundation, it seems to be the most widely employed model for efficiency assessment, especially in health care (Hollingsworth 2008). Because of the underlying convexity, the DEA formulation proposed by Banker et al. (1984) requires non-increasing marginal products. However, according to the classical production theory, there are three main stages characterizing the consumption of resources and the associated quantity of produced outputs (vide Fig. 2). For input quantities smaller than a, the fixed input is being utilized more effectively as the variable input increases (Kao 2017), which is to say that the marginal product increases accordingly (Henderson and Quandt 1980). Before b, the production function (red line) is non-concave and the production possibility set cannot be convex (Banker and Maindiratta 1986). If the standard DEA model is used to estimate this frontier (dashed line [Ob]), then some efficient observations exploiting gains from increasing specialization with larger scale sizes (i.e., close to a) would be rated as inefficient.
An immediate conclusion is that the convexity assumption must be relaxed to account for the case of increasing marginal products. An alternative is the so-called FDH (Deprins et al. 1984), which constructs a frontier with a staircase nature that, in turn, is discontinuous at some points and not differentiable (note that we do not require the production function to be differentiable). Situations including non-variable returns to scale and the imposition of restrictions over multipliers through FDH (Ferreira and Marques 2017) require solving a mixed-integer linear program (Agrell and Tind 2001), which can be difficult to implement in some solvers. Another alternative is the multiplicative DEA, also named log-DEA or DEA-Cobb-Douglas (whose mathematical details are presented below).
Nearly 40 years ago, Banker et al. (1984) constructed a log-linear model in which outputs are modelled following a Cobb-Douglas function. Under such a framework, outputs do not compete for the inputs, although they do in most empirical scenarios, including hospitals. Therefore, this model is not applicable in these situations. Charnes et al. (1982) suggested a multiplicative efficiency measure, which is not, however, consistent with the postulated underlying production of Banker and Maindiratta (1986). One year later, the same authors modified their model to account for non-radial inefficiency sources. Nonetheless, they did not examine the characteristics of the production technology, ignoring concepts like returns to scale, optimal scale, and non-competing outputs. Later on, Banker and Maindiratta (1986) proposed a radial log-DEA that accounts for competing outputs. These authors verified that their model outperforms the piecewise linear DEA model of Banker et al. (1984) in terms of rates of substitution of inputs, especially whenever the production technology frontier exhibits non-concavity in some regions. In the present work, we follow the model of Banker and Maindiratta, which has not seen further meaningful developments and enhancements, despite its advantages (vide infra). Chang and Guh (1995) criticize the remedy of Sueyoshi and Chang (1989) to make the efficiency score associated with the multiplicative model of Charnes et al. (1982) unit invariant. They propose another model, which according to Seiford and Zhu (1998) reduces to the proposal of Charnes et al. This is because the model of Chang and Guh applies to the case of constant output and linearly homogeneous production technology. The authors fail to recognize that the constraint associated with the convexity of the DEA model of Banker et al. is the basis for units invariance.
Therefore, according to Seiford and Zhu, the distance efficiency measure of Chang and Guh is of no value.
Marginal products are usually related to returns to scale (Ferreira and Marques 2018a). Since the standard DEA fails to identify the production regions of increasing marginal products, multiplicative DEA models have been used to analyse returns to scale and most productive scale size; vide the works of Zarepisheh et al. (2010) as well as Davoodi et al. (2015).
Only a few applications of the multiplicative DEA model have been published. We point out three: Emrouznejad and Cabanda (2010), Tofallis (2014) and Valadkhani et al. (2016). The first of these integrated the so-called Benefit of Doubt (Cherchye et al. 2011) and the log-DEA to construct a composite indicator associated with six financial ratios and twenty-seven UK industries. They confirm that the multiplicative model is more robust than the standard DEA model. In the same vein as Emrouznejad and Cabanda, Tofallis also constructed composite indicators with multiplicative aggregation and remarked that the log-DEA avoids the zero weight problem of the standard DEA. Meanwhile, Valadkhani et al. proposed a multiplicative extension of environmental DEA models, accounting for weakly disposable undesirable outputs, and used it to measure efficiency changes in the world's major polluters. It is interesting to note that the adopted model is very close to the one proposed by Mehdiloozad et al. (2014). These authors developed a generalized multiplicative directional distance function as a comprehensive measure of technical efficiency, accounting for all types of slacks and satisfying several desirable properties.
Previous studies have considered that every observation is enveloped by the log-linear frontier and, for that reason, no infeasibility may arise. If this is not the case, projecting super-efficient observations onto the frontier can be infeasible, mostly because of the path taken (model direction) (Chen et al. 2013). Infeasibility may, then, occur if the hospital whose tariff we are optimizing is not enveloped by the frontier. However, should it belong to the reference set used to construct such a frontier, multiple solutions may be achieved, meaning that the optimal tariff could be non-unique. Neither of these two problems has been properly addressed in the literature with respect to either the multiplicative (log-) DEA or the optimization of payments.

Advantages of using log-DEA instead of the standard DEA
The advantages of log-DEA can be summarized as follows:
• log-DEA is more flexible than DEA, allowing for increasing marginal products (Kao 2017);
• log-DEA allows for outputs competing for inputs, as happens in most empirical situations, including health care (Banker and Maindiratta 1986);
• log-DEA outperforms DEA in terms of rates of substitution of inputs (Banker and Maindiratta 1986);
• log-DEA outperforms DEA when the (unknown) production technology frontier exhibits non-concavity regions (Emrouznejad and Cabanda 2010);
• log-DEA avoids the zero weight problem of the standard DEA (Tofallis 2014);
• log-DEA keeps the production technology globally and geometrically convex (or non-convex under the arithmetic definition), which is a more natural solution. Indeed, there is no reason to believe that the production technology must be (piecewise) linear and the associated set must be convex. Frontiers related to non-convex sets are always consistent, even if the technology is convex (Daraio and Simar 2007). Because of this property, log-DEA can be used to estimate economies of scope;
• log-DEA mitigates the inefficiency overestimation resulting from infeasible regions constructed by two or more very distant efficient observations (Tiedemann et al. 2011);
• log-DEA classifies all non-dominated observations as efficient, unlike DEA.

Performance assessment through nonparametric log-linear benchmarking tools
A very important feature in defining the optimal (best practice based) tariff is the vector $\mu^k = (\mu_1^k, \ldots, \mu_i^k, \ldots, \mu_{J_k}^k)^\top$. As previously mentioned, the components of this vector should be optimized in the Pareto sense. This subsection explains how this can be done for those components. We will describe the construction of a log-linear piecewise frontier function from the $J_k$ hospitals composing $\Omega^{k(\ell)}$. This frontier contains the potential benchmarks related to hospital $k$ and the service/DRG $\ell$.
First of all, we have to define the concepts of target and target setting.
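Definition 6 (Targets of hospital k) Targets are the optimal values associated with hospital $k$ for inputs and outputs, $\tilde{x}^{k(\ell)}$ and $\tilde{y}^{k(\ell)}$, respectively. Targets must verify the conditions:
$$\tilde{x}^{k(\ell)} \le x^{k(\ell)} \;\wedge\; \tilde{y}^{k(\ell)} \ge y^{k(\ell)}. \qquad (9)$$
Definition 7 (Target setting for hospital k) Targets are mathematically defined by the $O$-order Hölder average:
$$\tilde{x}^{k(\ell)} = H(X^{k(\ell)}, \mu^k, O) = \Bigl(\sum_{j \in \Omega^{k(\ell)}} \mu^k_j \,(x^{j(\ell)})^{O}\Bigr)^{1/O}, \qquad \tilde{y}^{k(\ell)} = H(Y^{k(\ell)}, \mu^k, O) = \Bigl(\sum_{j \in \Omega^{k(\ell)}} \mu^k_j \,(y^{j(\ell)})^{O}\Bigr)^{1/O}. \qquad (10)$$
In the case of $O = 0$ (Banker and Maindiratta 1986), Eq. (10) reduces to:
$$\log \tilde{x}^{k(\ell)} = \sum_{j \in \Omega^{k(\ell)}} \mu^k_j \log x^{j(\ell)}, \qquad \log \tilde{y}^{k(\ell)} = \sum_{j \in \Omega^{k(\ell)}} \mu^k_j \log y^{j(\ell)}. \qquad (11)$$
Because the logarithmic function is monotonically increasing, Eq. (9) can be rewritten as $\log \tilde{x}^{k(\ell)} \le \log x^{k(\ell)} \wedge \log \tilde{y}^{k(\ell)} \ge \log y^{k(\ell)}$ and, by Eq. (11), we have:
$$\sum_{j \in \Omega^{k(\ell)}} \mu^k_j \log x^{j(\ell)} \le \log x^{k(\ell)} \;\wedge\; \sum_{j \in \Omega^{k(\ell)}} \mu^k_j \log y^{j(\ell)} \ge \log y^{k(\ell)}. \qquad (12)$$
Inequalities in (12) can be transformed into equations by using (nonnegative) slacks. These slacks can be decomposed into controllable quantities (vector $D^k = (d^k_1, d^k_2)$ with $d^k_1, d^k_2 \ge 0$) and uncontrollable quantities (coefficients $\beta^k$, $s^k_1$, and $s^k_2$). Hence,
$$\langle \mu^k, \log X^{k(\ell)} \rangle + \beta^k d^k_1 + s^k_1 = \log x^{k(\ell)}, \qquad \Bigl\langle \mu^k, \log \frac{1}{Y^{k(\ell)}} \Bigr\rangle + \beta^k d^k_2 + s^k_2 = \log \frac{1}{y^{k(\ell)}}, \qquad (13)$$
where $1/Y^{k(\ell)}$ is the Hadamard (componentwise) division between two vectors, $\mathbf{1} = (1, \ldots, 1)$ and $Y^{k(\ell)}$, and the logarithmic function applies to each component of a vector. These constraints create a log-linear piecewise frontier function assuming that $\langle \sqrt{\mu^k}, \sqrt{\mu^k} \rangle = 1$ and all components of $\mu^k$ are nonnegative.
Let us find the maximum value of the uncontrollable components using a linear program. In this case, we first optimize $\beta^k$ and, once its value has been achieved, we maximize both $s^k_1$ and $s^k_2$, which justifies the adoption of a non-Archimedean, $\varepsilon$. The linear program is as follows:
$$\text{[log-DEA primal]} \quad \text{maximize } \beta^k + \varepsilon\,(s^k_1 + s^k_2) \quad \text{subject to (13)}, \quad \langle \sqrt{\mu^k}, \sqrt{\mu^k} \rangle = 1, \quad \mu^k_j \ge 0, \quad s^k_1, s^k_2 \ge 0, \quad \beta^k \text{ unrestricted}. \qquad (14)$$
Note that $\beta^k$ is not an efficiency score per se. In fact, the efficiency of hospital $k$ in service/DRG $\ell$ can be defined as the relationship between optimal targets and observations for both inputs and outputs.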
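Definition 8 (Efficiency score, $\theta^{k(\ell)}$) Using the concept of targets, we define the efficiency score of hospital $k$ in service/DRG $\ell$ as (Portela and Thanassoulis 2006):
$$\theta^{k(\ell)} = \frac{\tilde{x}^{k(\ell)} / x^{k(\ell)}}{\tilde{y}^{k(\ell)} / y^{k(\ell)}}. \qquad (15)$$
Because of Definition 7 and the assumption $O = 0$, which returns a log-linear piecewise frontier, Eq. (15) becomes:
$$\theta^{k(\ell)} = \frac{H(X^{k(\ell)}, \mu^k, 0)}{x^{k(\ell)}} \cdot \frac{y^{k(\ell)}}{H(Y^{k(\ell)}, \mu^k, 0)}, \qquad (16)$$
where $\mu^k$ is obtained from Model (14). Using Eq. (16) and after some easy algebraic manipulations, we can associate $\theta^{k(\ell)}$ with $\beta^k$:
$$\theta^{k(\ell)} = \exp\bigl(-\beta^k (d^k_1 + d^k_2) - s^k_1 - s^k_2\bigr). \qquad (17)$$
This relationship allows us to conclude:
Of course, $k$ can outperform the frontier constructed by $\Omega^{k(\ell)}$, which means that $\beta^k < 0$ and $\theta^{k(\ell)} > 1$, indicating super-efficiency.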
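The linear program described above can be sketched numerically. The snippet below is a minimal illustration for one input and one output, not the authors' implementation (which uses CPLEX and MATLAB). It assumes that the constraints take the form "weighted log-inputs plus $\beta^k d^k_1$ plus $s^k_1$ equals the log-input of $k$" and the analogous equation on $\log(1/y)$, that the weights sum to one, and that $D^k = (1, 1)$ by default. The function name `log_dea_primal`, the default direction, and the value of `eps` are assumptions introduced only for this example.

```python
import numpy as np
from scipy.optimize import linprog


def log_dea_primal(x_peers, y_peers, x_k, y_k, delta=(1.0, 1.0), eps=1e-6):
    """Sketch of the log-DEA primal for a single input and a single output.

    Decision variables, in order: mu_1..mu_J (>= 0), beta (free), s1, s2 (>= 0).
    Objective: maximize beta + eps * (s1 + s2).
    """
    log_x = np.log(np.asarray(x_peers, dtype=float))
    log_inv_y = -np.log(np.asarray(y_peers, dtype=float))  # log(1 / y_j)
    n_peers = log_x.size

    # linprog minimizes, so the objective is negated.
    c = np.concatenate([np.zeros(n_peers), [-1.0, -eps, -eps]])
    a_eq = np.array([
        np.concatenate([log_x, [delta[0], 1.0, 0.0]]),       # input constraint
        np.concatenate([log_inv_y, [delta[1], 0.0, 1.0]]),   # output constraint
        np.concatenate([np.ones(n_peers), [0.0, 0.0, 0.0]]),  # weights sum to one
    ])
    b_eq = np.array([np.log(x_k), -np.log(y_k), 1.0])
    bounds = [(0, None)] * n_peers + [(None, None), (0, None), (0, None)]

    res = linprog(c, A_eq=a_eq, b_eq=b_eq, bounds=bounds, method="highs")
    if not res.success:
        raise ValueError(f"log-DEA primal not solved: {res.message}")

    mu = res.x[:n_peers]
    beta, s1, s2 = (float(v) for v in res.x[n_peers:])
    # Efficiency score recovered from beta and the slacks.
    theta = float(np.exp(-beta * (delta[0] + delta[1]) - s1 - s2))
    return {"mu": mu, "beta": beta, "s1": s1, "s2": s2, "theta": theta}
```

For instance, a hospital that spends twice as much as its only peer for the same output, `log_dea_primal([100.0], [10.0], 200.0, 10.0)`, should come out with an efficiency score close to 0.5, with all the inefficiency captured by the input slack.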

Some properties of the developed model
Model (14) exhibits several important properties: efficiency requirement, scale invariance, nonconvexity, strict monotonicity, and directional nature. Notwithstanding, it is not translation invariant and it may exhibit multiple optimal solutions. Because of the latter, the optima of log-DEA primal are not sufficient to compute the optimal paid price, $P^{k(\ell,O)}$, $O = 0$, through the optimized weights $\mu^k$. We will explore an alternative to provide the unique solution for those weights: the so-called primal-dual approach for multiple optima.
Proof Let us assume that $D^k$ does not depend on data. Consider the transformation of $x^{k(\ell)}$ using a scalar $\xi \in \mathbb{R}$: $x^{k(\ell)} \longrightarrow \xi x^{k(\ell)}$. The first constraint of log-DEA primal becomes:
$$\langle \mu^k, \log \xi X^{k(\ell)} \rangle + \beta^k d^k_1 + s^k_1 = \log \xi x^{k(\ell)} \iff \log \xi + \langle \mu^k, \log X^{k(\ell)} \rangle + \beta^k d^k_1 + s^k_1 = \log \xi + \log x^{k(\ell)},$$
where $\langle \mu^k, \log \xi X^{k(\ell)} \rangle = \langle \mu^k, \log \xi \rangle + \langle \mu^k, \log X^{k(\ell)} \rangle$ results from the distributive property of the inner product and $\langle \mu^k, \log \xi \rangle = \log \xi$ results from the condition $\langle \sqrt{\mu^k}, \sqrt{\mu^k} \rangle = 1$. Thus, the first constraint of log-DEA primal is recovered. However, this is only true if $D^k$ does not depend on data; otherwise, the equivalences above do not necessarily hold. Since the same applies to the output $y^{k(\ell)}$, we conclude this proof. □
Proposition 4 (Convexity) The frontier constructed by the log-DEA primal model is not convex in the original space of variables. This is because $\log T^{(\ell)}$ is (normal) convex, meaning that $T^{(\ell)}$ is log-convex.
Proof It is not difficult to conclude that, in the log-space, the frontier is convex. Now, let us consider a linear facet of the frontier in the log-space. Hence, we have $\log y = a \log x + b$ or, equivalently, $y = x^a \exp b$, which is not linear. That is, in the original space of variables, the constructed frontier is not convex. □
Proposition 5 (Strict monotonicity) Log-DEA is strictly monotone: $\theta^{k(\ell)}$ strictly decreases with the increase of slacks.
Proof The objective function of log-DEA strictly increases with slacks, and so does $\log(1/\theta^{k(\ell)})$. Because of the monotonicity property of logarithmic functions, $1/\theta^{k(\ell)}$ also increases with slacks, which means that $\theta^{k(\ell)}$ decreases with the increase of slacks. In fact, we have $\theta^{k(\ell)} = \exp(-\beta^k (d^k_1 + d^k_2) - s^k_1 - s^k_2)$. This is in line with the efficiency requirement property. □
Proposition 6 (Directional nature) Log-DEA is a radial directional model.
Proof Log-DEA is a radial model because of the parameter $\beta^k$ that imposes the projection of hospital $k$ on the frontier constructed using $\Omega^{k(\ell)}$. It is directional because of the vector $D^k$, which controls the path used to project $k$ (Fukuyama and Weber 2017). □
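As a hedged numerical illustration of the strict monotonicity property, the relation between the efficiency score and $\beta^k$ stated in the proof of Proposition 5 can be evaluated for a few purely hypothetical values (they are not taken from the data; $d^k_1 = d^k_2 = 1$ is assumed only for this example):

```latex
\begin{align*}
\theta^{k(\ell)} &= \exp\bigl(-\beta^k (d^k_1 + d^k_2) - s^k_1 - s^k_2\bigr), \qquad d^k_1 = d^k_2 = 1, \\
\beta^k = 0.10,\; s^k_1 = s^k_2 = 0 &\;\Longrightarrow\; \theta^{k(\ell)} = e^{-0.20} \approx 0.819 \quad \text{(inefficient)}, \\
\beta^k = 0.10,\; s^k_1 = 0.10,\; s^k_2 = 0 &\;\Longrightarrow\; \theta^{k(\ell)} = e^{-0.30} \approx 0.741 \quad \text{(a larger slack lowers the score)}, \\
\beta^k = 0,\; s^k_1 = s^k_2 = 0 &\;\Longrightarrow\; \theta^{k(\ell)} = 1 \quad \text{(on the frontier)}, \\
\beta^k = -0.05,\; s^k_1 = s^k_2 = 0 &\;\Longrightarrow\; \theta^{k(\ell)} = e^{0.10} \approx 1.105 \quad \text{(super-efficient)}.
\end{align*}
```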

Proposition 7 (Translation invariance) Log-DEA is not translation invariant.
Proof This proposition results from the fact that the logarithmic function does not verify the distributive property, i.e., $\log(a + b) \neq \log a + \log b$. □
Proof This is because the log-DEA model is scale invariant. Indeed, scale invariance and log-translation invariance are tantamount. □
To introduce the next proposition, we need the dual version of log-DEA primal, which results from the duality in linear programming. We first note that $\beta^k$ is an unrestricted variable of log-DEA, which means that it can be transformed into the difference of two nonnegative variables. Additionally, each constraint of the primal model (except for the nonnegativity of some variables) is modelled by an equation, meaning that two nonnegative variables are required to model each constraint: $u^+, u^-$ for inputs, $v^+, v^-$ for outputs, and $w^+, w^-$ for the convexity constraint. Hence, we get the following set of dual constraints:
The objective function is, obviously,
$$\text{[log-DEA dual]} \quad \text{minimize } \log x^{k(\ell)} \,(u^+ - u^-) + \log \frac{1}{y^{k(\ell)}} \,(v^+ - v^-) + (w^+ - w^-).$$
From constraints (18d) and (18e), we conclude that d k , which allows us to simplify the dual model. From constraint (18c), we conclude that:
Furthermore, the first constraint can be rewritten as:
Now, we observe that $v^+ - v^- = v$ and $w^+ - w^- = w$, where $v$ and $w$ are both unrestricted by definition. Let
This means that the log-DEA dual model becomes:
[log-DEA dual] minimize $g^k v - w$ (21a) subject to: $w$ unrestricted.
Proposition 9 (Multiple solutions) If $k \in \Omega^{k(\ell)}$, then the log-DEA dual model violates the desired property of solutions uniqueness.
Proof To prove that the log-DEA dual model provides multiple solutions, it is sufficient to observe that the hyperplane created by the constraint (21b) and the one constructed by the objective function are parallel if $k \in \Omega^{k(\ell)}$. Indeed, when $j = k$, we have $g^j v - w = g^k v - w$: the latter is, precisely, the objective function. □

The case of multiple optima
Should the dual have multiple optima (Proposition 9), then every optimal basic solution to the primal is degenerate. This causes problems in the definition of optimal paid prices. In the absence of degeneracy, two different simplex tableaus in canonical form give two different solutions. Otherwise, it is possible that there are two different sets of basic variables (coefficients $\mu^k$) giving the same solution (efficiency score). Since we need unique $\mu^k$, we must develop a linear program that returns these coefficients verifying such a property. Based on Proposition 9, if one needs only the efficiency score, then there is no need for further model improvements. In general, restrictions over dual variables are adopted to include managerial/policy preferences (cf. Shimshak et al. (2009), Podinovski (2016), and Podinovski and Bouzdine-Chameeva (2015, 2016) for a further discussion), but they tend to influence the shape of the efficient frontier. That is, efficiency scores change due to the inclusion of multiplier constraints. Intrinsically, these restrictions also reduce (but do not avoid) the multiple solutions problem (Cooper et al. 2007), at the cost of changing the efficiency scores and potentially losing their meaning. To avoid this serious problem, we follow the approach of Sueyoshi and Sekitani (2009) to develop a multiplicative log-DEA model, which does not violate the desired property of solutions uniqueness. The next two definitions help us reach this goal.
Definition 9 (Complementary slackness condition) The well-known complementary slackness theorem of linear programming can be applied to both log-DEA primal and log-DEA dual. The following conditions must be obeyed by every optimum of both log-DEA models, where $u = u^+ - u^-$, $v = v^+ - v^-$, and $w = w^+ - w^-$ are three variables free in sign such that $u$ and $v$ are related via $u = (1 - d^k_2 v)/d^k_1$.
These are the strong complementary slackness conditions associated with the log-DEA model.
Definition 10 (Unique Solution-based log-DEA) The following linear program provides solutions for the log-DEA model that obey the desirable uniqueness property.
[US-log-DEA] maximize $\alpha$ subject to constraints (23a) to (23j), the last of which is $-s^k_2 - v + \alpha \le -\varepsilon$.
The US-log-DEA merges the dual, the primal, and the strong complementary slackness conditions (Definition 9), and forces the objectives of both dual and primal log-DEA models to be equal (23e). If a variable $x$ verifies $x > a$, where $a \in \mathbb{R}$, then there is a nonnegative quantity $\alpha$ such that $x - \alpha \ge a$. The objective of US-log-DEA is to maximize the parameter $\alpha$ that transforms the strong complementary slackness conditions into feasible linear conditions (23h-23j). This means that the optimal solution of US-log-DEA, $(\mu^{k*}, \beta^{k*}, s^{k*}_1, s^{k*}_2, v^*, w^*, \alpha^*)$, verifies the strong complementary slackness conditions (Definition 9). US-log-DEA solves the problem of multiple projections in log-DEA dual as it restricts multipliers so that the projection becomes unique (Sueyoshi and Sekitani 2009). That is, restricting multipliers $v$ and $w$ does not solve the problem of multiple solutions. Furthermore, it results from the linear programming theory and from the duality theorem that optima of US-log-DEA are also optima of both log-DEA dual and primal. Therefore, the efficiency score $\theta^{k(\ell)}$ remains unchanged. But since we need coefficients $\mu^k$ to define optimal prices, we must save the outcomes of US-log-DEA for such a purpose. Needless to say, by Proposition 9, if $k$ does not belong to $\Omega^{k(\ell)}$, then the use of US-log-DEA is not necessary, as the problem of multiple solutions vanishes in that case.

The problem of infeasibility in linear programming models
We have developed a log-DEA model to assess the coefficients related to the optimal price to be paid to hospital $k$. In some circumstances, the log-DEA primal can be infeasible because of the choice of the directional vector $D^k$. If $k \in \Omega^{k(\ell)}$, then the linear model cannot be infeasible. This is because $k$ is enveloped by the frontier constructed using data associated with that subset of $\Omega$. However, $k$ may not belong to $\Omega^{k(\ell)}$ because of its quality levels. Two scenarios are then possible: either $k$ is enveloped by the frontier related to $\Omega^{k(\ell)}$ (and, in such a case, there is no infeasibility problem) or $k$ is super-efficient regarding such a frontier. Only in the latter scenario may infeasibility occur. However, being super-efficient is not a sufficient condition to generate infeasibility, vide Proposition 10. We thus have to look at the log-DEA primal and search for a linear transformation of the vector $D^k$ that ensures that the model becomes feasible. Naturally, if $k$ belongs to $\Omega^{k(\ell)}$, then a multiple solutions problem arises and the US-log-DEA is compulsory. Nevertheless, in such a case, the linear problem cannot be infeasible. If $k \notin \Omega^{k(\ell)}$, then the multiple solutions problem vanishes. Hence, it is enough to consider the log-DEA primal.
Proof This proof follows Ray (2008). Under the assumption of $D^k = (\log x^{k(\ell)}, \log y^{k(\ell)})$ (Chambers et al. 1996, 1998), we obtain $\langle \mu^k, \log X^{k(\ell)} \rangle \le (1 - \beta^k) \log x^{k(\ell)}$. If $\langle \mu^k, \log X^{k(\ell)} \rangle > 2 \log x^{k(\ell)}$, then $(1 - \beta^k) \log x^{k(\ell)} > 2 \log x^{k(\ell)} \iff 1 - \beta^k > 2 \iff \beta^k < -1$. Likewise, we have $\langle \mu^k, \log Y^{k(\ell)} \rangle \ge (1 + \beta^k) \log y^{k(\ell)}$. Now, we know that $\beta^k < -1$ or $1 + \beta^k < 0$. If $y^{j(\ell)} > 1$ for all $j \in \Omega^{k(\ell)}$, then $\langle \mu^k, \log Y^{k(\ell)} \rangle$ must be positive. However, given that $y^{k(\ell)}$ is also larger than 1, the right-hand side of the inequality associated with $Y^{k(\ell)}$ becomes negative because $1 + \beta^k < 0$. Therein lies the problem: there is no combination of (nonnegative) coefficients $\mu^k_j$ for $j \in \Omega^{k(\ell)}$ obeying $\langle \sqrt{\mu^k}, \sqrt{\mu^k} \rangle = 1$.
Let $m_Y = \min Y^{k(\ell)}$ and let $r^k$ be equal to $\max X^{k(\ell)} - \min X^{k(\ell)}$ plus a strictly positive constant. If the following condition is met by the coefficients of the affine transformation of $D^k$, then the infeasibility problem vanishes.
We can select any values for the coefficients of the affine transformation, as long as they obey the relationship in (24).

Proof (Inspired by Chen et al. 2013) The log-DEA primal is feasible if and only if all of its constraints are obeyed. With the affine transformation of $D^k$, those constraints should be verified. Let us ignore slacks $s^k_1$ and $s^k_2$ and start with the constraint over the input:
From the constraint over the output, we have:
The output target is, of course, $\tilde{y}^{k(\ell)} = \langle \mu^k, \log Y^{k(\ell)} \rangle$, and this should be nonnegative, i.e.
Hence, the following is obvious:
Unfortunately, we do not know the value of $\langle \mu^k, \log X^{k(\ell)} \rangle$, but it is clear that $\langle \mu^k, \log X^{k(\ell)} \rangle - \log x^{k(\ell)} \le r^k$. Therefore, it is sufficient to impose:
which concludes this proof. □

3.7 On optimizing the best practice tariffs: a step-by-step procedure
Figure 3 presents the flowchart of the procedure adopted to optimize the price paid to hospitals for their services. Steps are as follows:
Step 1 Define data.
Step 2 Define $k \in \Omega$ as the hospital whose payment we want to optimize.
Step 5 Construct the overall comparability set, $\Omega^{k(\ell)}$, by intersecting the sets achieved in Step 4 (vide Definition 4).
Step 6 Using the entries of $\Omega^{k(\ell)}$, construct $X^{k(\ell)}$ and $Y^{k(\ell)}$ (details at the end of Subsection 2.4.2).
Step 7 Check if $k$ belongs to $\Omega^{k(\ell)}$.
Step 7.1 If $k \in \Omega^{k(\ell)}$, use the US-log-DEA model (23) and obtain $\mu^k$. Its coefficients are the unique solutions associated with the benchmarks of $k$.
Step 7.2 If $k \notin \Omega^{k(\ell)}$, use the log-DEA primal model (14). Check if the log-DEA primal is infeasible.
Step 7.2.1 If the model is infeasible, transform $D^k$ using an affine transformation obeying Proposition 11. The infeasibility problem vanishes. Obtain $\mu^k$, whose components are unique for $k$.
Step 7.2.2 If the model is not infeasible, just obtain $\mu^k$.
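The branching in Step 7 can be sketched as a small control-flow routine. This is only an outline under stated assumptions: the three solver callbacks (`solve_us_log_dea`, `solve_log_dea_primal`, `transform_direction`) and the exception `InfeasibleModel` are hypothetical names introduced for the example, standing in for Models (23) and (14) and for the affine transformation of Proposition 11; they are not part of the authors' code.

```python
class InfeasibleModel(Exception):
    """Hypothetical signal raised by a solver callback when its LP is infeasible."""


def benchmark_weights(k, comparability_set, solve_us_log_dea,
                      solve_log_dea_primal, transform_direction):
    """Return the weights mu^k following the branching of Step 7.

    All three callbacks are supplied by the caller (assumed interfaces):
      solve_us_log_dea(k, peers)                 -> weights (Step 7.1)
      solve_log_dea_primal(k, peers, direction)  -> weights, or raises InfeasibleModel
      transform_direction(k, peers)              -> new directional vector
    """
    if k in comparability_set:
        # Step 7.1: k is its own potential benchmark, so multiple optima may
        # occur and the unique-solution model is used.
        return solve_us_log_dea(k, comparability_set)

    try:
        # Step 7.2 / 7.2.2: default direction (None means "solver default").
        return solve_log_dea_primal(k, comparability_set, None)
    except InfeasibleModel:
        # Step 7.2.1: change the direction and solve again.
        direction = transform_direction(k, comparability_set)
        return solve_log_dea_primal(k, comparability_set, direction)
```

Keeping the solvers as injected callbacks leaves the sketch independent of any particular LP package, which is a design choice of this example rather than of the original procedure.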

Efficiency, optimal payments, and cost savings
The definition of optimal tariffs to be paid foresees some cost savings to the commissioner. This section explains how optimal payments are related to the technical efficiency, as previously detailed, and how cost savings can be generated.

Efficiency and optimal payments
We can easily relate the efficiency score with the optimal payment due to k.
Proposition 12 The optimal payment due to $k$ ($O = 0$) can be written as follows:
where $\bar{P}^{k(\ell)} = x^{k(\ell)} / y^{k(\ell)}$ is the current unitary cost of $k$ in service/DRG $\ell \in \Psi$.
Proof This proposition is obvious after merging Eqs. (8) and (16). □
According to Eq. (25), the optimal payment $P^{k(\ell,0)}$ depends on coefficients $\mu^k$ of the log-linear combination; hence, it can be non-unique unless the US-log-DEA model is used. However, $\theta^{k(\ell)}$ is unique and can be achieved simply by using the log-DEA primal and the relationship between the efficiency score and $\beta^k$.
Proof The proposition results directly from (25). Let hospital $k$ be inefficient. Thus, there may exist an inefficiency source on the side of outputs, say $s^k_2$. Should it be zero, we have $H(Y^{k(\ell)}, \mu^k, 0) = y^{k(\ell)}$. Yet, because of inputs, $k$ is inefficient by hypothesis, and it is clear that $\theta^{k(\ell)} < 1$ and $P^{k(\ell,0)} < \bar{P}^{k(\ell)}$. That is, the optimal payment is smaller than the current unitary cost incurred by $k$. Should $s^k_2$ be larger than zero, we have $H(Y^{k(\ell)}, \mu^k, 0) > y^{k(\ell)}$. Therefore, we get $P^{k(\ell,0)} < \bar{P}^{k(\ell)}$ if and only if $\theta^{k(\ell)} < y^{k(\ell)} / H(Y^{k(\ell)}, \mu^k, 0)$. □
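One possible reading of the relation between the optimal payment and the efficiency score is worked out below. It is a sketch under two assumptions, not necessarily the authors' exact formulation: (i) the optimal payment is the input target divided by the observed output of $k$, and (ii) the efficiency score is the ratio between the input contraction and the output expansion. Superscripts are dropped on the right-hand sides for readability.

```latex
\begin{align*}
\text{Assumption (i):}\quad & P^{k(\ell,0)} = \frac{H(X, \mu, 0)}{y}, \\
\text{Assumption (ii):}\quad & \theta^{k(\ell)} = \frac{H(X, \mu, 0)}{x} \cdot \frac{y}{H(Y, \mu, 0)}
  \;\Longrightarrow\; H(X, \mu, 0) = \theta^{k(\ell)} \, x \, \frac{H(Y, \mu, 0)}{y}, \\
\text{hence}\quad & P^{k(\ell,0)} = \theta^{k(\ell)} \cdot \frac{x}{y} \cdot \frac{H(Y, \mu, 0)}{y}
  = \theta^{k(\ell)} \, \bar{P}^{k(\ell)} \, \frac{H(Y, \mu, 0)}{y}, \\
\text{so that}\quad & P^{k(\ell,0)} < \bar{P}^{k(\ell)} \iff \theta^{k(\ell)} < \frac{y}{H(Y, \mu, 0)}.
\end{align*}
```

Under these assumptions the last line reproduces the condition stated at the end of the proof above, and the case with no output slack, $H(Y, \mu, 0) = y$, reduces to $P^{k(\ell,0)} = \theta^{k(\ell)} \bar{P}^{k(\ell)}$.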

Cost savings
Based on both current and optimal costs, we can easily define the cost savings for hospital $k \in \Omega$ and service/DRG $\ell \in \Psi$.
Definition 11 (Cost savings for hospital $k$ and service/DRG $\ell$, $D^{k(\ell)}$) Cost savings are defined as the (positive) difference between the current and the optimal operational expenditures of hospital $k$ for the service/DRG $\ell$:
$$D^{k(\ell)} = x^{k(\ell)} - y^{k(\ell)} P^{k(\ell,O)}. \qquad (26)$$
As claimed in Definition 11, there are cost savings if and only if $D^{k(\ell)}$ is strictly positive, i.e., $x^{k(\ell)} > H(X^{k(\ell)}, \mu^k, O)$.
Proposition 15 Cost savings associated with $k$ ($O = 0$) can be written as follows:
Proof It is straightforward after inserting (25) into (26). □
Proposition 16 If $k$ is technically efficient regarding $\Omega^{k(\ell)}$ and in the Pareto sense, then $D^{k(\ell,O)} = 0$.
In the last case, the user (or group of stakeholders) should decide whether the quality threshold(s) is (are) too large and must be decreased to raise cost savings, or whether the threshold(s) must be kept and low-quality hospitals must be financed by $P^{k(\ell,0)} > x^{k(\ell)} / y^{k(\ell)}$, with the difference $|D^{k(\ell,0)}|$ solely devoted to quality improvements.
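The chain from benchmark weights to payment and cost savings can be illustrated with a few lines of code. This is a minimal sketch: the function names are hypothetical, and it assumes that the payment equals the Hölder average of the benchmarks' inputs divided by the observed output of $k$, which is consistent with the condition for positive savings in Definition 11 but is not confirmed as the authors' exact formula.

```python
import math


def holder_average(values, weights, order=0.0):
    """Weighted Hölder (power) mean; order 0 gives the weighted geometric mean."""
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be nonnegative and sum to one")
    if order == 0:
        return math.exp(sum(w * math.log(v) for v, w in zip(values, weights)))
    return sum(w * v ** order for v, w in zip(values, weights)) ** (1.0 / order)


def optimal_payment(x_peers, weights, y_k, order=0.0):
    """ASSUMPTION: payment = input target (Hölder average of peers) / observed output."""
    return holder_average(x_peers, weights, order) / y_k


def cost_savings(x_k, y_k, payment):
    """Current expenditure minus the expenditure implied by the optimal payment."""
    return x_k - y_k * payment


# Hypothetical figures: two benchmarks with equal weights, hospital k spends 300
# to treat 10 standard patients.
payment = optimal_payment([100.0, 400.0], [0.5, 0.5], y_k=10.0)
savings = cost_savings(300.0, 10.0, payment)
```

In this made-up example the geometric mean of the benchmarks' expenditures is 200, so the payment is about 20 per patient and the savings are about 100; a negative value would correspond to the situation discussed after Proposition 16, where the payment exceeds the current unitary cost.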
5 The application of the new tool to the Portuguese public hospitals

An overview about the Portuguese health care system
Contracting in health care aims to establish a mechanism for resources allocation in line with each health care service provision and the corresponding population needs. These needs include ensuring the quality, efficiency, and effectiveness of health care (Shimshak et al. 2009). The contracting process in Portuguese health care services arises from the relationship between different stakeholders, including (1) public funding sources (the Portuguese State, through taxes), (2) the regulator (an independent Regulatory Agency for Health and the Ministry of Health), and (3) the health care providers (hospitals, either public or private, primary health care centres, and tertiary health care centres) (Sakellarides 2010). Figure 4 shows the financial flow in the Portuguese National Health Service (NHS). The analysis of this section is centred on public hospitals. Most of their financing volume comes from taxes, collected from citizens and companies. Health care providers are public though autonomous entities, implying that there must be a monitoring model holding them responsible for weak performance and inducing transparency towards the population those hospitals serve. Contracting mechanisms try to allocate resources to health care providers in an efficient and effective way, through a clear and well-designed strategy. However, in most European countries there is no set of indicators allowing the evaluation of that strategy's impact on the population health status, which is quite a difficult task (Figueras et al. 2005; Lu et al. 2003). Accordingly, such a strategy requires previous fieldwork (negotiation phase) that should be carried out by a contracting-specific team from the Ministry of Health. This strategy must be in line with some directions defined by the Portuguese Central Administration of Health Systems. A robust knowledge of demographics, health policies, and health care services production is compulsory to identify the needs of the population and of the services.
Still, predictions of needs and allocated resources are usually constrained by the limited State budget (Mckee and Healy 2002; Galizzi and Miraldo 2011). This negotiation phase should be neither merely symbolic nor bureaucratic, but it must have a relevant role in the budgetary allocation of health facilities, to avoid their indebtedness. This means that the process shall be managed with responsibility and accountability, and be followed/monitored over time, which is not a common practice in Portugal, though.
Budgetary constraints and some Government resistance to make some funds available must neither overlap nor limit the health care provision ability/capacity. Otherwise, one may observe the opposite effect, i.e., underfunding generates overindebtedness, given the accumulation of arrears related to new acquisitions that health facilities had to make to carry out their activity. This effect was observed in
In Portugal, contracts between the (central) Government and each health care provider encompass a set of dimensions including the provision type, volume and duration, referencing networks, human resources and facilities, monitoring schemes, performance prizes, and prices. It is worth mentioning that contracts should also contain quality-related terms, including penalties for poor quality. That is, contracting health care services is paramount to promote user-focused action and to improve quality and efficiency, not only in Portugal but also in any other country.
Prices are perhaps one of the most important variables within a contract for health care provision. As a matter of fact, statutory payments to hospitals regard their production in terms of DRG and services (emergencies, medical appointments, inpatient services, ...). These prices are usually defined according to the smallest unitary costs (in the previous year) among providers. Some adjustments regarding the average complexity of treated patients can also be included. Nonetheless, quality is clearly neglected in such a framework. We remark that health care services, like any other service (either public or not), may exhibit considerable levels of inefficiency, which means those unitary costs necessarily include an inefficiency parcel. Furthermore, providers also exhibit heterogeneous technologies (e.g., assets). In a period featured by resources scarcity, financing hospitals through an inefficiency-based mechanism that disregards their underlying technologies makes no sense and requires a considerable reformulation, which justifies, in theory, the adoption of our approach.

Contracting in the Portuguese NHS
Several New Public Management-based health policies have been extensively applied to the Portuguese public hospitals. Particularly, their management model (legal status) has observed a few changes since the beginning of 2003, moving from the Administrative Public Sector (under the public/administrative law) to the State Business Sector (under the commercial/private law) in 2005 (Ferreira and Marques 2015). This was the so-called corporatisation reform. Currently, most Portuguese hospitals belong to the State Business Sector. One of the most relevant issues underlying this reform regards the hospitals' financing model. While in the past such financing was based on a retrospective model and prices (or tariffs) derived from each hospital's history, nowadays this financing stems from the negotiation phase and the resulting contract. There is a hospital-specific budget that is negotiated between each hospital's Administrative Council and the Ministry of Health. The budget is negotiated in terms of production and quality, in line with some other European countries like Finland, Ireland, France, and England. In Portugal, the paid unitary price (tariff) results from the national average of DRGs' or services' costs. Patients are assigned to a specific DRG and/or to a specific service. In any case, tariffs are tabled, which means they are the same for all hospitals, regardless of their performance. In other words, the current mechanism is basically a case-based payment, which encompasses several disadvantages: first, it may prompt supplier-induced demand that, in turn, includes unwanted activity; second, more complex cases can be disregarded or treated negligently, which reduces the overall quality of care; third, providers are encouraged to up-code the classification of patients into a more highly reimbursed group; fourth, the introduction of quality-raising (though cost-increasing) technologies is totally discouraged (Marshall et al. 2014).
Valente (2010) identifies several contracting objectives: (1) to control health care provision-related expenses that could jeopardise the system sustainability in the medium to long term; (2) to promote providers' efficiency, by simultaneously increasing their production and quality, and decreasing their resources consumption; (3) to ensure high quality levels of care; and (4) to promote the hospital managers' accountability. As a matter of fact, besides production and quality that must be ensured, there are several contract-specific objectives that must be fulfilled under pain of penalties.
The contracting process in the Portuguese NHS has seen considerable changes over the past 13 years, as it must fit to reality, available resources, needs, and technological developments. These dynamics and the usual process redefinition and re-adaptation promote an adjustment mechanism of the hospitals' performance (either financial or not) program, encouraging the objectives for which the contract was created. Currently, the contracting process between the Portuguese public hospitals and the Ministry of Health integrates a triennial strategic planning process merging the existing forecasting documents, i.e., the Business Plan, the Performance Plan, and the Adjustment Plan and Financial Statements.

Data, sample, hypotheses and methodological issues
To exemplify the new tool, we use monthly data of twenty-nine hospitals and their activity in 2016 and 2017, which results in 696 observations. Following , the sample does not include two out of the four public-private partnership hospitals (because of data gaps), three highly differentiated hospitals such as oncology centres (because their production technology significantly differs from the one of general hospitals), and local health units (since they result from the vertical merging of a general hospital and several primary health care units). Data are available in the official source website (http://benchmarking.acss.min-saude.pt/). Available (operational) expenditures refer to the whole activity of the hospital; hence, we can assess the optimal payment per patient if our measure of output is the number of standard patients.
Selected variables are exhibited in Fig. 5. Data are available on-line. Quality indicators refer to patients' clinical safety, care appropriateness and timeliness, and access to health care. In this case, we have followed the variables adopted by Ferreira and Marques (2018b), taking into account the data availability for all the 696 observations. There is a maximum legal time for a secondary health care response to the citizens searching for (nonurgent) medical appointments and for surgeries. Overstepping such a time indicates poor access to the health care services. This is a phenomenon revealed by small values of both $Q_1$ and $Q_2$. Minor surgeries should be made in ambulatory services, otherwise the patient can be subject to more complex procedures and to severe hospital-acquired infections. The costs of surgeries undertaken in the operating theatre are, of course, larger than the others. Minor surgeries accomplished in ambulatory services reveal good care appropriateness and efficiency levels. Hence, the larger $Q_3$, the better. In opposition, high values of $Q_4$ up to $Q_8$ reveal poor levels of care appropriateness and clinical safety.
As we have noted before, our methodology assumes that the utility function related to each quality level should be monotonically increasing, i.e., $\forall\, q^{k(\ell)}_{r^{(\ell)}} \ge q^{j(\ell)}_{r^{(\ell)}} \Longrightarrow U_{r^{(\ell)}}(k) \ge U_{r^{(\ell)}}(j)$, $k, j \in \Omega$. This means that quality dimensions $Q_4$ up to $Q_8$ must be transformed. Additionally, the definition of thresholds can be facilitated if all quality dimensions are rescaled to the interval [0,100]%. For instance, rates of bloodstream infections resulting from central venous catheter are expressed in cases per 100,000 surgeries. Rescaling quality dimensions also reduces the odds of having empty comparability sets and, hence, reduces the number of interactions with the decision maker(s). We use the following rescaling procedure:
Therefore, the larger $\tilde{q}^{k(\ell)}_{r^{(\ell)}}$, the better the utility and the performance of the hospital. The hospital $j \in \Omega$ with $\tilde{q}^{j(\ell)}_{r^{(\ell)}} = 100$ is the best performer in the quality dimension $r^{(\ell)} \in \Gamma^{(\ell)}$. In opposition, $\tilde{q}^{j'(\ell)}_{r^{(\ell)}} = 0$ identifies the hospital $j' \in \Omega$ with the worst performance in the very same quality dimension. Table 1 presents the basic statistics associated with the rescaled quality variables. Hereinafter, we will consider the global bandwidths computed by using the triweight kernel, i.e., $b^{j(\ell)}_{q_r} = b^{j+1(\ell)}_{q_r} = 3.62\,\sigma_{q_r} J^{-1/5} = 0.9799\,\sigma_{q_r}$, $\forall j = 1, \ldots, J - 1$, with $J = 696$. We will also use the average of the distributions to represent the user predefined thresholds,
Environment is also paramount to derive optimal payments to hospitals. We have selected six dimensions that intend to capture the population's demography and two more to explain the complexity and specialization degree of each hospital. The case-mix index relates the costs associated with each DRG and the quantity of patients of the very same group. The larger the number of patients of higher complexity, the larger the case-mix index.
This one is usually rescaled such that the national average is set as 1; hence, hospitals with a case-mix larger than 1 usually handle more complex cases. Meanwhile, the Gini's specialization degree, as proposed by Daidone and D'Amico (2009) and Lindlbauer and Schreyögg (2014), is related to the hospital efficiency and ranges between 0 (low specialization) and 1 (high specialization). Table 2 provides the basic statistics related to environmental variables. As before, we have adopted the triweight kernel to compute global bandwidths. In the case of environment dimensions, no data transformations are required. Optimization models were developed and solved through the integration of the package IBM ILOG CPLEX® Optimization Studio, v12.6.3, and the MATLAB® software. CPLEX® is useful for linear programming, whereas MATLAB® easily handles matrix manipulation, for/while loops, and if-else branches, as the ones exhibited in the flowchart of Fig. 3. The algorithm can be delivered upon request.
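The rescaling of quality dimensions to the interval [0,100] described in this section can be sketched as follows. The exact formula used by the authors is not reproduced here; the snippet assumes a plain min-max rescaling, reversed for dimensions where larger raw values mean worse quality (such as $Q_4$ up to $Q_8$). The function name `rescale_quality` and its flag are hypothetical.

```python
def rescale_quality(values, higher_is_better=True):
    """Min-max rescaling to [0, 100] (ASSUMED form of the rescaling procedure).

    With higher_is_better=False the scale is reversed, so that 100 always
    identifies the best performer and 0 the worst one.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all observations share the same value; rescaling is undefined")
    span = hi - lo
    if higher_is_better:
        return [100.0 * (v - lo) / span for v in values]
    return [100.0 * (hi - v) / span for v in values]


# Hypothetical infection rates (cases per 100,000 surgeries): lower is better.
rescaled = rescale_quality([2.0, 4.0, 6.0], higher_is_better=False)
```

After this transformation every quality dimension satisfies the monotonicity assumption on the utility function, since larger rescaled values always indicate better performance.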

Results and discussion
Once the overall set has been defined for each hospital, based on its corresponding size-, quality-, and environment-based comparability sets, we can use the US-log-DEA to optimize operational expenses and, by construction, paid prices.
Let us consider the case of observation $k = 100$, for which we have
Since $k \notin \Omega^{k(\ell)}_q$, we state that its quality levels are below the minimum required quality defined by the stakeholder(s). It is clear that $k \in \Omega^{k(\ell)}_z$ because of the definition of the environment-related comparability set. As $\Omega^{k(\ell)}_z \cap \Omega^{k(\ell)}_q = \emptyset$, no optimization procedure can be executed. Hence, we have to slightly change some parameters (bandwidths and/or thresholds) to get a nonempty overall comparability set. So, while that set remains empty, we iteratively increase the bandwidths by 5% and reduce the thresholds by the same rate, until the emptiness problem vanishes. In the case of $k = 100$, with a single iteration, we got $\Omega^{k(\ell)} = \{126, 127, 128, 141, 142, 143\}$. These observations (hospitals) are still better performers than $k$ in terms of quality. Because $k \notin \Omega^{k(\ell)}$, we do not have to use the US-log-DEA, as there is not a problem of multiple solutions. Therefore, we
The solution of this problem is $\mu^{k*} = [0, 0, 0, 0, 0, 1]^\top$, which is related to $\beta^k = -0.0524$; hence, $k$ is super-efficient regarding $\Omega^{k(\ell)}$, possibly because of its poor quality. Yet, being super-efficient does not mean that the optimal paid price should be larger than the current unitary cost. In fact, we get $P^{k(\ell,0)} = 2183$, which compares with $\bar{P}^{k(\ell)} = 3336$. Thus, the optimal payment is about 35% smaller than the current unitary cost of $k$.
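The iterative relaxation used when the overall comparability set is empty can be outlined as a short loop. This is a sketch under assumptions: `build_set` is a hypothetical callback that returns the comparability set for given bandwidths and thresholds, and the cap `max_iter` is added here only to guarantee termination; neither is taken from the authors' algorithm.

```python
def relax_until_nonempty(build_set, bandwidths, thresholds, rate=0.05, max_iter=100):
    """Widen bandwidths and lower thresholds by `rate` until the set is nonempty.

    build_set(bandwidths, thresholds) is a caller-supplied (hypothetical)
    function returning the overall comparability set as a collection.
    Returns the set, the final parameters, and the number of relaxations applied.
    """
    for iteration in range(max_iter + 1):
        candidates = build_set(bandwidths, thresholds)
        if candidates:
            return candidates, bandwidths, thresholds, iteration
        # The set is still empty: relax both families of parameters by the same rate.
        bandwidths = [b * (1.0 + rate) for b in bandwidths]
        thresholds = [t * (1.0 - rate) for t in thresholds]
    raise RuntimeError("comparability set still empty after max_iter relaxations")
```

In the case reported above, a single pass through the relaxation step was enough to obtain a nonempty set. The relative gap between the two reported prices can be checked in the same spirit: `1 - 2183 / 3336` is roughly 0.35.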
By executing the procedure proposed in Sect. 3.7, we obtained the optimal payments per standard patient. The optimal quantity of resources for the 2 years of study was €6,290,336,455, against €8,967,878,333 of consumed resources, which means that nearly 30% of them were wasted. The average payment was €1862 per standard patient, whereas the average current unitary cost is about €3052 (1.64 times the average optimal payment). Figure 6 exhibits the histogram and some parametric probability density functions associated with optimal payments. Figure 7 does the same for the current unitary costs related to standard patients. Optimal payments seem to be well fitted by a generalized extreme value distribution with $\xi = 0.58\ [0.46, 0.70]$ (shape parameter), $\sigma$ [...]. Accordingly, the mean and the median of the distribution are, respectively, $\mathrm{mean} = E(P^{j(\ell,0)}) = \mu + \frac{\sigma}{\xi}\left(\Gamma(1 - \xi) - 1\right) \approx 2079$ and $\mathrm{median} = \mu + \frac{\sigma}{\xi}\left((\ln 2)^{-\xi} - 1\right) \approx 1593$ (€ per patient). The variance of this distribution is infinite because the estimated shape parameter is larger than 0.5.
[...] were standardized, the coefficients help us identify the most important predictors of optimal payments, and it becomes clear that these are the population density, the illiteracy rate, the case-mix index (complexity), and Gini's specialization degree. Illiteracy is negatively correlated with the remaining three dimensions. Furthermore, we may foresee that more differentiated facilities also handle more complex cases and are located in urban regions (mostly on the coast), which, in turn, are more densely populated and whose citizens have higher literacy levels. Those providers are, according to our multiple linear model, the ones whose payment will be the largest.
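The mean and median expressions of the generalized extreme value (GEV) distribution quoted above can be checked numerically. The sketch below is illustrative only: the scale and location estimates are not legible in this text, so the helper `back_solve_location_scale` (a hypothetical name) recovers values of $\mu$ and $\sigma$ that are consistent with the reported shape (0.58), mean (2079), and median (1593). These back-solved numbers are an assumption made for illustration and should not be read as the authors' fitted parameters.

```python
import math


def gev_mean(xi, mu, sigma):
    """Mean of a GEV distribution (valid for xi != 0; infinite for xi >= 1)."""
    if xi >= 1:
        return math.inf
    return mu + sigma * (math.gamma(1 - xi) - 1) / xi


def gev_median(xi, mu, sigma):
    """Median of a GEV distribution (valid for xi != 0)."""
    return mu + sigma * (math.log(2) ** (-xi) - 1) / xi


def gev_variance_is_finite(xi):
    """The GEV variance is finite only when the shape parameter is below 0.5."""
    return xi < 0.5


def back_solve_location_scale(xi, mean, median):
    """Solve the two linear equations mean(mu, sigma) and median(mu, sigma)
    for (mu, sigma), given the shape parameter. Illustrative helper only."""
    a = (math.gamma(1 - xi) - 1) / xi        # coefficient of sigma in the mean
    b = (math.log(2) ** (-xi) - 1) / xi      # coefficient of sigma in the median
    sigma = (mean - median) / (a - b)
    mu = mean - sigma * a
    return mu, sigma


XI = 0.58  # shape parameter reported in the text
mu_hat, sigma_hat = back_solve_location_scale(XI, mean=2079.0, median=1593.0)

print(gev_mean(XI, mu_hat, sigma_hat), gev_median(XI, mu_hat, sigma_hat))
print(gev_variance_is_finite(XI))
```

Because the shape parameter exceeds 0.5 but stays below 1, the mean is finite while the variance is not, which is why the text can report a mean but not a standard deviation for the fitted distribution.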
As we have pointed out before, payments are currently set as the smallest unitary cost within the group in which the hospital is positioned. This procedure takes into account neither quality nor the environment, a pitfall when defining optimal payments. There are four groups of hospitals in Portugal (B, C, D, and E). Table 3 presents the comparison between the current procedure to fix payments and our approach. As we can see, our approach allows substantial cost savings, but these represent only 54% of the savings that could be achieved through the current procedure. We remark, however, that the latter level of cost savings results from ignoring quality dimensions (which are usually costly as they require investments; Karagiannis and Velentzas 2012) and from unfair comparisons (because environment and technology are not considered in the current procedure). Although hospitals are currently clustered based on dimensions related to size, this is not sufficient to make adequate comparisons. It is also interesting to observe that, according to our approach, the potential cost savings decrease as both the size and the complexity of the health care providers increase (in contrast to the current approach). One could think that the larger the hospital, the more resources it could save if it were efficient; however, the relationship is not linear because these hospitals are also the ones receiving and treating the most complex and resource-consuming patients, meaning that the margin for cost savings also decreases given their complex structures. This is, indeed, in line with our approach, which adjusts the performance of health care providers to their internal and external operational environment. That is, our approach still contributes to the sustainability of the Portuguese health care system, although with smaller cost savings than the currently adopted methodology (which, in turn, seems to induce misconduct within the field as an outcome of absurd cost constraints).
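For contrast, the current rule described above (each hospital is paid the smallest unitary cost observed in its group) can be sketched in a few lines. The hospital names, group labels, and costs below are made up for illustration, and `cluster_minimum_tariffs` is a hypothetical helper, not an official implementation of the Ministry's procedure.

```python
from collections import defaultdict


def cluster_minimum_tariffs(hospitals):
    """Assign to every hospital the smallest unit cost found in its group.

    hospitals: list of (name, group, unit_cost) tuples.
    Returns a dict {name: tariff}.
    """
    cheapest = defaultdict(lambda: float("inf"))
    for _, group, unit_cost in hospitals:
        cheapest[group] = min(cheapest[group], unit_cost)
    return {name: cheapest[group] for name, group, _ in hospitals}


# Toy data (hypothetical): two groups with two hospitals each.
toy_hospitals = [
    ("H1", "B", 2500.0),
    ("H2", "B", 3100.0),
    ("H3", "C", 2900.0),
    ("H4", "C", 3600.0),
]

tariffs = cluster_minimum_tariffs(toy_hospitals)
for name, group, unit_cost in toy_hospitals:
    print(name, group, unit_cost, tariffs[name])
```

Note that the rule uses a single benchmark per group and no information on quality or environment, which is the limitation the hospital-specific tariff is meant to address.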
Of course, these results may strongly depend on parameters like the thresholds and the bandwidths. Thresholds can be understood as the minimum levels of quality required to be a potential benchmark. Such levels depend on the point of view of the stakeholder(s). We may expect that the smaller the threshold, the larger the cost savings, because underinvestment in quality is allowed, which is in line with Proposition 19. Likewise, the definition of bandwidths may play a relevant role in cost savings, see Proposition 20. The extreme case of these two propositions occurs when thresholds are $t^{(\ell)}_{q_r} = 0$ for all $r^{(\ell)} \in \Gamma^{(\ell)}$ and $b^{k(\ell)}_{y}, b^{k(\ell)}_{q_r}, b^{k(\ell)}_{z_c} \to +\infty$. In this case, $\Omega^{k(\ell)} \to \Omega$, so it is sufficient to apply US-log-DEA to the frontier constructed using the entire sample to obtain optimal payments which, in turn, disregard size, environment, and quality. In our empirical application, this results in cost savings of roughly €3145 million, a value that is obviously above the cost savings using our approach with thresholds set by the median of quality indicators, but still below the cost savings according to the current procedure adopted by the Portuguese Ministry of Health. We then suspect that the approach adopted by the Ministry is too conservative and induces the underfunding of the health care system. Probably because of that, several authors have reported a considerable shortage of resources in the Portuguese health system (Nunes et al. 2019; Nunes and Ferreira 2019b). Accordingly, our approach is more flexible, contributing to the system's sustainability while ensuring the quality of delivered services and fair comparisons for benchmarking purposes, based on both the external and internal conditions in which hospitals operate.

Concluding remarks and the economic impact to the NHS
This paper proposes and explores a new tool that achieves the best tariffs set for a health care provider. This achievement clearly encompasses both components of quality and financial sustainability, and finances those providers based on their past performance and on the existence of potential benchmarks, which in turn spend fewer resources to treat a comparable quantity of patients, show higher levels of quality, and operate in a similar (internal/external) environment. Our approach can optimize payments in several payment schemes extensively adopted worldwide.
The proposed tool has several advantages. First, it has a clear impact on the health care financial system, since costs are reduced as a function of providers' performance. Considerable potential cost savings arise from the application of the new tool. Additionally, potential benchmarks cannot show quality levels below the expectations of the stakeholder(s). It is not clear whether higher quality levels imply higher costs. Although high-quality facilities are usually expensive, in the long run they may contribute to the reduction of hospital days, postoperative complications, hospital-acquired infections, readmissions, and deaths. Furthermore, high-quality hospitals tend to attract more patients. Since money follows the patient, competition between providers is expected to improve their performance and to reduce wasted resources. Competition is commonly associated with performance improvements. Because of competition, activity tends to increase; provided that it is cost-effective and appropriate, this becomes another advantage of the proposed strategy (Geissler et al. 2011). If scope and scale economies can be exploited, then the more patients treated, the lower the provider's cost per patient. If the provider is able to combine cost savings due to scale economies with high quality standards, it becomes a potential benchmark in our framework. If an inefficient hospital receives a tariff lower than its current unitary cost, then it has room to improve its practices, since there is at least one unit producing roughly the same amount of outputs with fewer resources and better quality. That is, the inefficient one should learn from its benchmarks' practices. Facing a reduced payment, such a hospital can increase its income by increasing its production. But then again, the invisible hand of competition will impose that increasing production is possible if and only if quality is improved.
In other words, the reduction of payments for some hospitals will force them to increase their production on the one hand, and will reduce total costs and waste in the long run, if scale economies are correctly exploited, on the other hand.
Additionally, low-quality units may receive a tariff higher than their current cost per patient, which should be seen as an incentive to improve their quality by introducing quality-raising, though cost-increasing, new technologies (Marshall et al. 2014). Even for very high quality thresholds, this scenario can happen for most (if not all) units if the provider with the best quality standards also exhibits significant signs of technical inefficiency. In such a case, the stakeholder(s) must decide whether the threshold is in fact too high, or whether an incentive for quality improvement across the entire sample should be put on the table. If the latter scenario is adopted and quality and efficiency are not positively related, financial sustainability can be threatened. This means that the stakeholder(s) have a prominent role, by defining a priori which criteria must be used to define quality and how to define an appropriate threshold. The threshold could perhaps be optimized to maximise potential cost savings, which is left for further research.
As in the case-based payment scheme, several disadvantages or difficulties can be identified for the new financing tool. Yet, one must be aware that no flawless reimbursement method is available. We can only try to minimize the shortcomings of available schemes. Most of the disadvantages identified for the new scheme (our proposal) can be reduced by implementing strong contract management tools. For instance, the State's financial control can be improved if the volume of care is specified in the contract. Quality standards are also set. If they are not met, pre-defined penalties can be applied in subsequent contracts, reducing the new budget. The selection of the least complex and least severe patients must also be prevented in contracts. However, some disadvantages still remain. Indeed, if payment is made according to DRGs, the incorrect registration of patients is likely in practice and may even be encouraged by the new tool. For instance, a patient from a specific DRG is assigned to a different DRG group (Vita 2001), simply because contracted volumes must be met and/or the income must be increased to overcome some mismanagement scenarios. It is not so simple to mitigate such a situation, especially because there is resistance among medical staff as well as a pre-defined DRG quantity to be produced. Registry systems differ among providers; thus, they must be updated and standardized, and an integrated system should be created. It must cross and relate data from different levels of care (primary, secondary, tertiary), and detect potential bias in the expected treatments for a specific patient. Another alternative would be to widen the contracted production range, which would mitigate the aforementioned misregistration.
Finally, by creating a database with the production of all hospitals, costs and quality, for several years, it is possible to refine the potential best practice search for each entity, as few years may ensure a larger range of possible optimal payments. For instance, a hospital may present an optimal current unitary cost, which could be non-optimal in the past. Thus, its corresponding tariff will be lower than its current cost per patient. Accordingly, we expect a convergence of health care providers' performance in the long-run.