1 Introduction

A one-dimensional measurement of sustainability would presuppose the possibility to aggregate the needs of the present and of all future generations by some kind of welfare function. Arrow (1963) has shown that such functions which aggregate the individual preferences of several people into one single preference order do not exist if some plausible axioms are assumed. Thus, any kind of measurement of the sustainability of a production system has some weaknesses and is at best multi-dimensional. It is usual to distinguish an economic, a social and an ecological dimension of sustainability. Within each of these three dimensions, a lot of distinct, often conflicting, sustainability criteria can be specified. For example, Fig. 1 shows the hierarchical disaggregation of the total ecological impact by the eco-efficiency and SEEBALANCE® method of BASF, which is well established in the practice of life cycle assessment (LCA). Hence, methods of performance and decision analysis, particularly data envelopment analysis (DEA), multi-criteria decision making (MCDM) and multi-attribute utility theory (MAUT), can be helpful in the analysis and non-financial evaluation of production systems and their processes and products. The focus of traditional production theories, however, is too narrow to cope with valuations that are based on different objectives and performance criteria.

Fig. 1 Aggregation scheme of ecological criteria (cf. Dyckhoff et al. 2015, p. 1560)

Because traditional production theories do not capture such environmental protection concerns, Dyckhoff (1992) developed a ‘decision-based production theory’ as an appropriate generalization (cf. Dyckhoff 2003). Its main idea is to distinguish systematically between two different, but strongly connected, categories: firstly, the technologically determined inputs and outputs of a production activity as categories resulting from production theory, and secondly, the impacts of this activity on relevant attributes as performance criteria resulting from decision theory. Their interconnection is described by corresponding multiple value functions. This paper considers a specific concretization, called multi-criteria production theory (MCPT), in which each of the individual value functions is defined on the relevant inputs and outputs. Hence, these functions translate inputs and outputs into distinct values (of multiple performance criteria or evaluation attributes) created or destroyed by the production activity, called benefits and costs, which are more fundamental to the preferences of the decision maker (e.g. the responsible production manager) or of an alternative evaluator (e.g. a higher ranked or an external authority) than the inputs and outputs themselves.

MCPT allows us to generalize the efficiency notion of traditional production theory as well as the methodology of DEA, for example regarding ecological efficiency (see Dyckhoff and Allen 2001 as first international publication applying MCPT in this respect). In that way, fruitful insights can be gained, namely into such phenomena as ‘rational inefficiencies’ (Fandel 2009) and into some well-known ‘pitfalls and protocols in DEA’ (Dyson et al. 2001), e.g. regarding the selection of relevant inputs and outputs (Afsharian et al. 2016) or the treatment of undesirable factors (Wojcik et al. 2017). Some of these topics are of particular importance for environmental performance measurement—one of the main strands of research within recent DEA literature, as current surveys show (Liu et al. 2013a, b; Lampe and Hilgers 2015). Moreover, in the light of MCPT, the relations between DEA and multi-objective linear programming (MOLP) can be better understood (cf. the respective review by Wallenius et al. 2008, criticized in Sect. 2.1).

The aim of this paper is not only to demonstrate several meaningful applications of MCPT to some topics of non-financial performance measurement discussed in the recent literature, with a focus on sustainability issues, but also to extend some general results of the existing (mostly German, thus less accessible) literature on MCPT. The next section reviews the literature on theoretical deficits of DEA, on the decision-based production theory by Dyckhoff (1992), on its existing applications as well as on similar approaches in the international literature. Section 3 clarifies the basic assumptions of MCPT in comparison to those of traditional production theories and proves two main theorems for monotonic and linear value functions. They are applied in Sect. 4 to some of the open questions mentioned above. The topics discussed are undesirable products and factors, hierarchies of performance evaluations, non-monotonic value functions as well as the rationality of ‘technically inefficient’ production. Section 5 concludes the paper with a discussion of the main results, their limitations as well as open questions.

2 Literature review

Traditional production theories usually consider the input and output of desired goods and services. Only rarely do they focus on the output of bads such as undesired waste or emissions, and never on their input, e.g. waste burned in incineration plants (cf. Müser and Dyckhoff 2017), not to mention complex ecological objectives, such as the minimization of climate change. This deficit was the original motivation for proposing a generalized approach, called ‘decision-based production theory’ (cf. Dyckhoff 2003). In the course of this paper, it becomes obvious that this approach can also help to settle some theoretical deficits of DEA.

2.1 Theoretical deficits of data envelopment analysis

In their review of MCDM and MAUT and their outlines of interesting future research questions, Wallenius et al. (2008, pp. 1337 and 1343) wrote:

Data envelopment analysis (DEA) has grown in importance and its relationship with multiple objective linear programming (MOLP) has been explored. (…) Of course, DEA and MOLP usually have different purposes: DEA is used for performance measurement, whereas MOLP is used for decision aiding choice. [Footnote 1] The observation about the structural similarity between DEA and MOLP has sparked synergistic advances in both models. (…) MOLP models can be used to generate novel ways of incorporating a decision maker’s preferences into DEA.

What DEA and MOLP both have in common is that they try to model good decisions, be it usually descriptively with empirical data in DEA or prescriptively with forecasted data in MOLP. The structural similarity of DEA and MOLP has been disclosed by Joro et al. (1998) and utilized by Halme et al. (1999) to incorporate preference information into DEA (cf. Joro and Korhonen 2015 for a comprehensive presentation). There is, however, a fundamental difference between DEA and MOLP, because the latter makes no systematic use of the technological concept ‘production possibility set’ (PPS) and its essential properties, e.g. the returns to scale from production. Three characteristics of DEA, which are at the same time untypical of MOLP, are:

  1. envelopment of (measured) data,

  2. interpretation of the data as consequences of production activities of decision making units (DMUs), whereby

  3. the envelopment is based on exogenous knowledge about the underlying production technology.

DEA is thus a methodology for ‘measuring efficiency of decision making units’ (Charnes et al. 1978), the concept of which draws much on production theory (Charnes et al. 1985). This may be one reason why MOLP and DEA have developed separately so far during the last forty years, despite several attempts to integrate DEA with MCDM (Belton 1992; Doyle and Green 1993; Joro et al. 1998). In fact, DEA draws much on decision theory as well, even though this aspect has been widely ignored in the DEA literature.

An exception can be found in a recent methodological review on ‘DEA: Prior to choosing a model’ by Cook et al. (2014). With respect to the crucial open question of selecting and defining the inputs and outputs which are relevant for the performance analysis at hand, they state that (p. 2):

In summary, if the underlying DEA problem represents a form of ‘production process’, then ‘inputs’ and ‘outputs’ can often be more clearly identified. The resources used or required are usually the inputs and the outcomes are the outputs.

If, however, the DEA problem is a general benchmarking problem, then the inputs are usually the ‘less-the-better’ type of performance measures and the outputs (…) the ‘more the-better’ type (…). DEA then can be viewed as a multiple-criteria evaluation methodology where DMUs are alternatives, and the DEA inputs and outputs are two sets of performance criteria where one set (inputs) is to be minimized and the other (outputs) to be maximized.

Each of these two alternative and unconnected perspectives has its own difficulties. This can be illustrated by the following example: Assume a decision maker examines the sustainability of certain production units, such as cement plants which use the factor labor and produce undesired emissions:

  • On the one hand, cement factories unambiguously constitute production processes with labor as a resource and carbon dioxide as an emission. Following Cook et al. (2014), labor and CO2 are therefore to be identified as input to be minimized, and output to be maximized, respectively. Indeed, in the (economic) view of the shareholders of the firm, the input of labor does imply wage costs to be minimized. Contrarily, social and ecological objectives of other stakeholders, such as trade unions or future generations, usually call for the opposite optimization directions, namely the maximization of workers’ employment and the minimization of climate change caused by CO2 outcome.

  • On the other hand, Cook et al. (2014) consider DEA as simply a ‘multiple-criteria evaluation methodology’ where the output CO2—to be minimized for ecological reasons—needs to be re-defined as an ‘input’, which is hardly justifiable from a production-theoretical view. Moreover, being a performance measure to be maximized, the profit of the cement factory would be treated as an ‘output’ by DEA. However, this can lead to contradictions with reality, particularly regarding the assumed returns to scale from production, for example when the cement firm has a monopoly on its local market (see Sect. 4.4 of this paper for a numerical example).

However, decision-based production theory, which has been developed in the German business economics literature and which is to be presented next, tells us that the two perspectives of ‘production process’ versus ‘benchmarking problem’ in the quotation above are in fact not mutually exclusive, but instead represent two sides of the same coin. The crucial question is not whether production theory or decision theory are alternative theoretical foundations of DEA, but rather how to form an adequate synthesis of both theories.

2.2 Embedding production theory into decision theory

The principal idea behind the generalization of traditional production theories, such as the Activity Analysis of Production and Allocation of Koopmans (1951), the Theory of Cost and Production Functions of Shephard (1970) or the business-oriented production theory of Gutenberg (1951), is to embed them consistently into the modern theory of decision making. [Footnote 2] Accordingly, a decision-based production theory is established by applying theorems and methods of MCDM and MAUT to decisions regarding production systems.

In this sense, illustrated by Fig. 2, production theory in (business) economics and management science studies production systems, i.e. systems consisting of transformation processes directed and controlled by managers in order to achieve certain pre-defined goals. These systems may not only be companies and their subsystems, such as plants or workplaces, but also whole (national or regional) economies.

Fig. 2 Production system as particular case of a decision making unit (cf. Dyckhoff 1992, p. 352)

As it is restricted to a particularly defined decision field, which is characterized by the specific technology of the transformation process under consideration, such a production system is a special case of a decision making unit. DMUs are generally studied in decision theory (and specifically in DEA). Therefore, the principal approach to generalizing traditional production theories, called decision-based production theory, is simply determined as an application of general decision theory to production systems as DMUs (see Dyckhoff 1992, pp. 38, for its origin and early motivation).

Thus, the analysis of production and of its performance is systematically embedded into decision theory, just as it is already the case for most other sub-disciplines of business economics and management science—such as finance, marketing and management accounting (Dyckhoff 2003). This allows production theory not only to become compatible with the theories of these other sub-disciplines, but also to unlock all of the knowledge and know-how of both descriptive and prescriptive decision theory, including MCDM and MAUT. It provides a sound theoretical foundation for constructing operations management models as well as for analyzing the efficiency and rationality of production processes. Production models, which are constructed in this way, are characterized as follows:

  • Basically, they can be categorized into descriptive and empirical models (e.g. in DEA) or prescriptive and normative ones (e.g. in MOLP). When designing such models, the general principles and methods of decision analysis may be helpful.

  • The fundamental assumption of prescriptive decision theory is that a complex decision problem can be solved more effectively by decomposing it into five basic components: (1) the possible decision alternatives of the DMU, here representing production activities; (2) the potential uncertainties; (3) the consequences of the activities and the uncertainties; (4) the relevant objectives; and (5) the respective preferences of the decision-maker or of an evaluator in the case of an external performance measurement (e.g. by a stakeholder). Thus, production models can be structured along these five basic components. At least, they comprise a set of feasible production activities with associated preference relations.

  • Such production models describe the transformation process and its corresponding activities only insofar as is necessary for the pending decision or performance evaluation to be made. In particular, the decision-oriented view guides the model designer in defining and selecting the relevant inputs and outputs. [Footnote 3]

This selection depends on the specific purpose of the model as well as on the particular view of the model designer. Therefore, production models are never able to show the whole picture of real production processes, and corresponding performance analyses of production systems measure at most those aspects which are relevant for the values determined by the decision maker, evaluator or model designer. Thus, in general, many more performance criteria are relevant for sustainability performance measurements than for purely economically motivated analyses. It should be clear that results of performance analyses depend on the evaluation criteria that are chosen as relevant.

2.3 Previous applications and similar approaches

The last point addresses one of the major open questions in DEA (cf. Cook et al. 2014) which have already been dealt with in the existing literature by applying the concept of a decision-based production theory or by using similar approaches:

  • Based on MCDM and vector maximum theory, Dyckhoff and Allen (2001) developed a systematic approach for deriving (ecologically) generalized DEA models. Starting from well-known assumptions of DEA and activity analysis, they differentiate four specific kinds of partial preference relations with respect to the input–output vectors of the production activities: (a) the ‘classical’ case where all inputs and outputs are goods; (b) the ‘standard’ case of environmental economics with goods and bads as outputs as well as inputs; (c) the ‘CML’ case with ecological categories as objectives, e.g. global warming, which are measured by (linear) impact functions of the inputs and outputs; and (d) the general ‘linear’ case of multiple value functions.

  • By referring to the CML case of Dyckhoff and Allen (2001) in a short footnote, Kuosmanen and Kortelainen (2005) proposed a similar approach of using environmental impact (‘pressure’) categories in DEA. They examine how DEA can be adapted for this purpose. Their reasoning is that DEA accounts for substitution possibilities between different natural resources and emissions and does not require subjective judgement about the weights although soft weight restrictions can be incorporated. Finally, they use their approach to assess the eco-efficiency of road transportation in three Finnish towns.

  • Dyckhoff and Ahn (2010) refine the generalized DEA approach proposed by Dyckhoff and Allen (2001). They distinguish the value functions into the two categories of cost (or ‘effort’) and of benefit functions, while explicitly referring to the decision-based production theory of Dyckhoff (1992). Furthermore, they demonstrate how the most widely used, radial DEA models can be generalized correspondingly. The results for general linear cost and benefit functions are derived and illustrated by the example of cement plants. Finally, they discuss the need for a comprehensive methodology extending the pure generalization of the mathematical DEA models to an ‘advanced DEA’ performance evaluation conception.

  • Afsharian et al. (2016) pick up their suggestion and propose such an advanced DEA approach for the particular purpose of determining the relevant performance criteria. It is exemplified by the case of measuring pharma stores’ efficiency concerning their goal of customer retention. The three steps of the procedure of Afsharian et al. include the development of a system of objectives, the derivation of corresponding performance criteria (as inputs and outputs) as well as the construction of associated cost and benefit functions. This approach is intended to contribute to solving the following problems: (a) selecting the relevant inputs and outputs, (b) handling objects with dual roles as input or output, (c) undesirable objects.

  • A survey by Wojcik et al. (2017) of the DEA literature on bads reveals that only 22 (of 345) articles explicitly address the (desirable) input of such undesirable objects into the first stage of a single- or multi-stage process. Only four of them consider a real application with those ‘original’ factors, all of which are waste water. A detailed analysis shows that all current approaches are based on two core ideas involving various efficiency measures. Disposability assumptions, otherwise common in DEA (and in economics), are rarely used, presumably because the modelled processes themselves are disposal processes. Regarding the ‘standard case’ of environmental economics defined by Dyckhoff and Allen (2001), the authors finally demonstrate in an example how DEA models with bads as inputs (and outputs) can be systematically derived from the generalized DEA approach as refined by Dyckhoff and Ahn (2010). [Footnote 4]

These five articles consider decision-oriented generalizations of DEA, but not of the underlying production theory itself. All of them, however, refer to Dyckhoff and Allen (2001), who themselves are strongly influenced by the ideas of the decision-based production theory of Dyckhoff (1992). With the exception of the literature on ‘rational inefficiencies’, which will be discussed separately in Sect. 4.5, there are, to the best of my knowledge [Footnote 5], only two publications, by Hasenkamp (1992) and Esser (2001), which explicitly deal with a kind of decision-oriented generalization of traditional production theories. Here, we must keep in mind that, in microeconomics, the theory of production is often defined in a broad sense as identical with ‘the theory of the firm’ (Dano 1966, p. 1):

  • Hasenkamp (1992) analyzes multiple objectives in the theory of the firm. A utility function is maximized to determine production decisions. It is defined on the measures for two distinct objectives, one of which is the profit function, the other ‘market performance’ or ‘firm’s image’. As technology, he assumes a homogeneous neo-classical production function with non-increasing economies of scale and with labor and capital as inputs, thus excluding inefficient production by this premise. His analysis shows that the production decision is analogous to pure profit maximization with internally determined prices that differ from the market prices.

  • Esser (2001) elaborates the decision-based production theory of Dyckhoff (1992) by considering various kinds of preference relations for the consequences of the production. Thereby, he develops a mathematically founded systematization of different notions of efficiency as well as of various environmentally oriented production theories in the German literature.

The present paper specifies the decision-based production theory insofar as it explicitly models different objectives by making use of value functions instead of abstract preference relations. Thus, it lays a theoretical foundation for the generalization of DEA by Dyckhoff and Allen (2001) and its refinement by Dyckhoff and Ahn (2010).

3 Multi-criteria production theory

The cardinal point for the generalization of the traditional production theories is the indisputable distinction between inputs and outputs of a production activity, on the one hand, and its consequences for the objectives of the decision maker (or an external evaluator), on the other hand. In the traditional theories both are identical, i.e. each output or input is an objective to be maximized or minimized, respectively. Therefore, it is first necessary to clearly define the notions of ‘input’ and ‘output’.

In his Theory of Production, Frisch (1965, p. 3) characterizes (technical) production as any transformation process that is desired by certain human beings and can be directed by them (while Koopmans 1951 and Shephard 1970 use ‘production’ as an undefined basic term). Frisch then proceeds:

The term transformation indicates that there are certain things (goods or services) which enter into the process, and lose their identity in it, i.e. ‘ceasing to exist’ in their original form, while other things (goods or services) come into being in that they ‘emerge’ from the process. The first category may be referred to as ‘production factors’ (input elements), while the last-named category are referred to as ‘products’ (the output or resultant elements).

Hence, by definition, input enters into and output emerges out of the transformation process. Both the input of production factors and the output of products are usually understood as flows measured in time rates (Shephard 1970, p. 5). Besides (desired) goods and services, the objects going into or coming out of the process may also be undesired. Trim loss is such an undesired output flow, while the input of waste to be incinerated is a desired flow because it decreases the stock of bads (Dyckhoff 1992, p. 67). [Footnote 6]

“By production in the economic sense”, Frisch (1965, p. 8) means “the attempt to create a product which is more highly valued than the original input elements.” This excludes consumption. In order to include recycling and disposal activities such as waste incineration, which do not necessarily generate desired products because their main purpose is to reduce or destroy bads (Souren 1996), production in general is defined as value creation: any process directed and controlled by human beings which transforms objects (input into output) with the intention of generating benefits that outweigh the costs of the transformation. Thus, higher positive values are created than consumed or more negative values are destroyed than originally induced as undesired by-products (Dyckhoff 1992).

Positive values that are created as well as negative ones that are destroyed by the production system form the benefits of the transformation process. Conversely, costs are caused by consuming positive values and generating negative ones. Multiple benefits or costs defined in such a way do not need to be measured on one and the same scale. Instead, they are often valued in their own (distinct) natural scales and thus are, a priori, not necessarily commensurable with each other [Footnote 7], unlike the special case of revenues and expenditures in monetary terms. Hence, with multiple incommensurable benefits and costs in decision-based production theory, the overall assessment of production activities always results in a multiple-criteria evaluation problem. [Footnote 8]

3.1 Basic assumptions and an example

As a concretization of decision-based production theory, multi-criteria production theory (MCPT) rests on the premise that all relevant data are known, deterministic and measurable, and that the benefits and costs depend on the actual inputs and outputs only. Thus, regarding the five basic components of production models in prescriptive decision theory (noted in Sect. 2.2), the following basic assumptions A1–A5 characterize this approach:

A1 (Production possibilities as decision alternatives) The set P of feasible production activities, called production possibility set (PPS), is completely described by m input and s output quantities [Footnote 9] \(\varvec{z} = (\varvec{x};\varvec{y})\) of certain selected types of objects involved in the transformation process. Basically, P is part of a technology T described by certain axioms (e.g. closedness or free disposal) and further individual characteristics (e.g. constant returns to scale) as predetermined general or specific properties:

$$P \subset T = \{ \varvec{z} = (\varvec{x};\varvec{y}) \in {\mathbb{R}}^{m + s} \mid \text{Input } \varvec{x} \text{ can in principle be transformed into output } \varvec{y} \}$$

A2 (Deterministic, complete knowledge) There is no uncertainty in the data.

A3 (Relevant consequences depend on inputs and outputs only) The relevant consequences (also called outcomes or results [Footnote 10]) of any production activity considered by the decision maker or evaluator are completely captured by a multidimensional function \(\varvec{v}(\varvec{z}) \in {\mathbb{R}}^{k+r}\) of the respective input/output vector \(\varvec{z} = (\varvec{x};\varvec{y})\) that distinguishes all relevant results \(\varvec{v}\) caused by the inputs and outputs of the transformation process.

A4 (Costs-benefits trade-off) The relevant consequences are differentiated into two distinct categories \(\varvec{v} = (\varvec{c};\varvec{b})\), namely k types of values destroyed, thus being disadvantageous results, called costs \(\varvec{c}\), as well as r types of created values as advantageous results, called benefits \(\varvec{b}\). The objectives are both the minimization of each type of cost and the maximization of each type of benefit.

A5 (Rational consistency) The preferences of the decision-maker (or the evaluator) are compatible with the vector dominance relations of the alternatives regarding the costs-benefits space.
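The vector dominance relation behind A4 and A5 can be illustrated with a short sketch. It assumes the usual Pareto-type definition (no cost higher, no benefit lower, at least one criterion strictly better), which this section does not spell out; the function names and the three cost-benefit vectors are hypothetical and serve illustration only.

```python
from typing import List, Sequence, Tuple

# One activity's consequences v = (c; b): a tuple of costs and a tuple of benefits.
Consequences = Tuple[Sequence[float], Sequence[float]]


def dominates(v_a: Consequences, v_b: Consequences) -> bool:
    """Return True if a dominates b in the costs-benefits space.

    Assumed definition: a has no higher cost and no lower benefit than b,
    and is strictly better in at least one criterion.
    """
    (c_a, b_a), (c_b, b_b) = v_a, v_b
    no_worse = all(x <= y for x, y in zip(c_a, c_b)) and all(
        x >= y for x, y in zip(b_a, b_b)
    )
    strictly_better = any(x < y for x, y in zip(c_a, c_b)) or any(
        x > y for x, y in zip(b_a, b_b)
    )
    return no_worse and strictly_better


def efficient_indices(activities: List[Consequences]) -> List[int]:
    """Indices of activities that no other activity in the list dominates."""
    return [
        i
        for i, v_i in enumerate(activities)
        if not any(dominates(v_j, v_i) for j, v_j in enumerate(activities) if j != i)
    ]


# Hypothetical consequences ((c1, c2), (b1, b2)) of three activities.
activities = [
    ((2, 1), (5, 3)),
    ((3, 1), (5, 3)),  # same benefits as the first, but a higher cost c1
    ((1, 2), (4, 4)),  # lower c1 and higher b2, but higher c2 and lower b1
]
print(efficient_indices(activities))  # [0, 2]
```

Under A5, a decision maker would never prefer the second activity to the first, whereas the choice between the first and the third depends on preferences that dominance alone does not determine.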

The following realistic example with fictitious data illustrates MCPT. Cement plants are to be evaluated with a view to sustainability. Two types of benefits and two types of costs are deemed relevant. The benefits include profit from the economic perspective (\(b_{1}\)) and job creation from the social perspective (\(b_{2}\)). From the ecological perspective, the two relevant types of costs are the contributions to climate change (\(c_{1}\)) and to the ozone hole in the stratosphere above the Antarctic (\(c_{2}\)), respectively.

To quantify the corresponding consequences of the plants’ production activities, the decision maker (or evaluator or model designer) considers as relevant three types of input, labor (\(x_{1}\)), capital (\(x_{2}\)) and scrap tires (\(x_{3}\)), and three types of output, cement (\(y_{1}\)), CO2 (\(y_{2}\)) and CFC (\(y_{3}\)). The four linear value functions, one for each of the two types of benefits and costs, depend on the six relevant types of inputs and outputs and are calculated as follows:

$$\begin{array}{ll} b_{1} = 340y_{1} - 10x_{1} - 50x_{2} + 20x_{3} & c_{1} = y_{2} + 7000y_{3} \\ b_{2} = x_{1} & c_{2} = y_{3} \end{array} \qquad (1)$$

Here, to find out the profit \(b_{1}\), the combined total of worker salaries and capital costs is subtracted from the total revenues. Revenues result from sales of the cement output as well as from the fee the factory receives for disposing of used tires. These scrap tires are incinerated as an input, thereby serving as fuel for the process of cement production. Job creation \(b_{2}\), the second benefit, can be judged from the employment figures of labor input, and the greenhouse effect \(c_{1}\) is calculated from emissions of carbon dioxide (CO2) and chlorofluorocarbon (CFC), the latter of which also causes damage \(c_{2}\) to the ozone layer.
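As a minimal sketch, the four value functions in (1) can be evaluated for a single production activity as follows. The function name and the input and output quantities are hypothetical; they are not taken from the paper's data and carry no particular units.

```python
from typing import Sequence, Tuple


def value_functions(
    x: Sequence[float], y: Sequence[float]
) -> Tuple[Tuple[float, float], Tuple[float, float]]:
    """Evaluate the four linear value functions of (1).

    x = (labor, capital, scrap tires), y = (cement, CO2, CFC).
    Returns the consequences as (costs, benefits) = ((c1, c2), (b1, b2)).
    """
    x1, x2, x3 = x
    y1, y2, y3 = y
    b1 = 340 * y1 - 10 * x1 - 50 * x2 + 20 * x3  # profit
    b2 = x1  # job creation, measured by labor input
    c1 = y2 + 7000 * y3  # contribution to climate change
    c2 = y3  # contribution to the ozone hole
    return (c1, c2), (b1, b2)


# Hypothetical plant; these quantities are NOT the paper's data.
costs, benefits = value_functions(x=(100, 20, 50), y=(10, 500, 2))
print(costs, benefits)  # (14500, 2) (2400, 100)
```

In this made-up case, revenues of 3400 from cement and 1000 from tire disposal fees, minus 1000 each for labor and capital, give a profit of 2400.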

Let us consider four cement plants. Figure 3 shows the matrices X and Y of their input and output quantities on the left, and on the right the two matrices B and C as their consequences for both types of benefit and cost, calculated by the four value functions in (1).

Fig. 3 Four cement plants as decision making units

The example demonstrates three aspects which are unusual for traditional production theory: First, labor input \(x_{1}\) has two opposing impacts: an undesired financial impact on profit \(b_{1}\) and a desired social one on employment \(b_{2}\). Second, the output CFC \(y_{3}\) has two different, but both undesired, ecological impacts simultaneously. Third, scrap tires are considered here as an undesired factor whose input \(x_{3}\) into incineration is thus desired in order to destroy them and hence to add value by reducing negative values.
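These three aspects can be read off directly from the signs of the partial derivatives of the linear value functions in (1). The following lines merely restate the coefficients of (1):

$$\frac{\partial b_{1}}{\partial x_{1}} = -10 < 0, \quad \frac{\partial b_{2}}{\partial x_{1}} = 1 > 0; \qquad \frac{\partial c_{1}}{\partial y_{3}} = 7000 > 0, \quad \frac{\partial c_{2}}{\partial y_{3}} = 1 > 0; \qquad \frac{\partial b_{1}}{\partial x_{3}} = 20 > 0$$

Thus, an additional unit of labor lowers profit but raises employment, an additional unit of CFC raises both types of ecological cost, and an additional unit of incinerated scrap tires raises profit.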

It should be noted that employment is interpreted as an outcome, not as an output of the input labor. Here, the term outcome of an activity is synonymously used to describe the impacts for the adjacent environments of a production system and is distinguished from the term ‘output’. Otherwise, the reduction of the stock of scrap tires via their incineration would also have to be called output (and not outcome only), which would be counterintuitive. In this sense, damages to nature caused by economic activities are outcomes but not outputs (for instance, in environmental science an emission such as sulphur dioxide at the ‘end-of-pipe’ of a production process is distinguished from its immission in the atmosphere, which causes acid rain).

3.2 Traditional production theories as special cases

In traditional theories of production and cost, assumptions A1–A5 are valid, too, but are concretized by additional assumptions (A6) as two extreme cases of a continuum of other special cases in between. [Footnote 11] At the one extreme, there exists only one single one-dimensional value function to be maximized, measuring the success as benefit generated by the inputs and outputs of the production process, usually determined by the profit or contribution margin. If the revenues are supposed to be fixed, one obtains the traditional cost theories. At the other extreme, the traditional production theories consider the simplest case of what may constitute the relevant consequences of a production activity:

A6a (Inputs-outputs trade-off of goods only) On the one side, each selected input forms one of the k = m types of costs, i.e. \(\varvec{c} = \varvec{x}\), and, on the other side, each selected output one of the r = s types of benefits, i.e. \(\varvec{b} = \varvec{y}\).

Here, costs and benefits are caused by the consumption (= input) and production (= output) of goods within the transformation process (goods being positively valued objects, items, things, entities). Hence, the essential distinction between MCPT and the traditional theories lies in the fact that multiple—in general incommensurable—costs \(\varvec{c}(\varvec{z})\) and benefits \(\varvec{b}(\varvec{z})\) are to be minimized or maximized instead of the original inputs and outputs \(\varvec{z} = (\varvec{x};\varvec{y})\) themselves, the latter two merely describing the production activity by their direct consequences.Footnote 12

In ecological evaluations, however, the indirect consequences of inputs and outputs, called environmental pressures or impacts, are actually of importance, namely the outcomes for nature of resource depletion or emissions into the atmosphere (cf. Kuosmanen and Kortelainen 2005). Instead of using these impacts as original values (in line with assumptions A3 and A4), environmental performance analyses often take the quantities of inputs and outputs themselves as proxies, thereby differentiating \(\varvec{z} = (\varvec{z}^{G} ;\varvec{z}^{B} )\) into those of goods \(\varvec{z}^{G} = (\varvec{x}^{G} ;\varvec{y}^{G} )\) and those of bads \(\varvec{z}^{B} = (\varvec{x}^{B} ;\varvec{y}^{B} )\). The corresponding standard assumptionFootnote 13 for such a production theory with goods and bads then becomes:

A6b (Inputs-outputs trade-off of goods and bads) Each input of a good and each output of a bad uniquely defines one corresponding type of the costs, i.e. \(\varvec{c} = (\varvec{x}^{G} ;\varvec{y}^{B} )\), and, vice versa, each output of a good as well as each input of a bad one type of the benefits, i.e. \(\varvec{b} = (\varvec{x}^{B} ;\varvec{y}^{G} )\).

More special cases of the general (deterministic) theory defined by A1–A5 can be derived by further varying assumptions A6 regarding different specifications of the multi-dimensional value function \(\varvec{v}(\varvec{x};\varvec{y})\) as well as by additional assumptions (A7) with respect to the fundamental axioms and the particular properties specifying the technology Τ and the PPS Ρ of assumption A1. In this paper, we will analyze the implications of the following three further specifications concerning assumptions A6.

A6c (Consistent valuation of inputs and outputs) There is no input and no output that simultaneously contributes to several types of costs and benefits with a conflict of interests.

Then, assumption A4 for cost minimization and benefit maximization implies a corresponding unique preference for minimizing or alternatively maximizing each single input and output. In the special cases of A6a and A6b, this assumption is fulfilled. The example of Sect. 3.1 violates A6c with respect to the labor input, but is linear:

A6d (Linear value function) \(\varvec{v}(\lambda_{1} \varvec{z}_{1} +\lambda_{2} \varvec{z}_{2} ) =\lambda_{1} \varvec{v}(\varvec{z}_{1} ) +\lambda_{2} \varvec{v}(\varvec{z}_{2} )\) for all \(\varvec{z}_{j}\), \(\lambda_{j} \in {\mathbb{R}}\), \(j \in \{1, 2\}\).

Then: \(\varvec{v}(\varvec{z}) = \varvec{V} \cdot \varvec{z}\), where \(\varvec{V} = (\varvec{B};\varvec{C})\) is a value impact matrix. Because of the constant value coefficients of \(\varvec{V}\), the consequence of increasing an input or an output is always of the same sign regarding the respective type of value. This property is satisfied in general for all monotonic (especially non-linear) value functions.

A6e (Strictly monotonic values) For each type of cost or benefit, value function \(\varvec{v}(\varvec{x};\varvec{y})\) is either strictly decreasing or strictly increasing with respect to changes of any single input or output.

3.3 Main theorem for consistent monotonic value functions

A central topic of modern production theory is the efficiency of possible activities. For example, which of the four cement plants in Fig. 3 is efficient in comparison to the other three? For that purpose, one has to answer the fundamental question of how efficiency is defined in MCPT. In view of the decision-based approach, it is natural to make use of the standard notion from decision theory.Footnote 14

Definition of efficiency A production activity is (strongly) efficient with respect to Ρ and the relevant multi-dimensional costs and benefits as objectives iff (= if and only if) there is no other alternative in Ρ dominating it. Thereby, alternative a dominates alternative b (weakly) iff alternative a is better than b for at least one of the objectives and not worse regarding all others; a dominates b strongly iff it is better for all objectives. An activity is called weakly efficient iff it is not strongly dominated.
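The definition can be made concrete with a small sketch. The following Python snippet is only an illustration, assuming that every alternative is already described by its cost and benefit vectors and that the comparison is restricted to a finite list of alternatives (a simplification of a general PPS). The function names and the three value vectors are hypothetical.

```python
def dominates(a, b, strongly=False):
    """Return True if alternative a dominates alternative b.

    Each alternative is a pair (costs, benefits) of equally long tuples.
    Costs are to be minimized, benefits to be maximized.
    """
    # Turn everything into "higher is better" scores: (-costs; benefits).
    score_a = [-c for c in a[0]] + list(a[1])
    score_b = [-c for c in b[0]] + list(b[1])
    if strongly:
        # Strong dominance: better for all objectives.
        return all(sa > sb for sa, sb in zip(score_a, score_b))
    # (Weak) dominance: not worse anywhere and better at least once.
    not_worse = all(sa >= sb for sa, sb in zip(score_a, score_b))
    better_once = any(sa > sb for sa, sb in zip(score_a, score_b))
    return not_worse and better_once


def efficient(alternatives, weakly=False):
    """Names of the alternatives that are (strongly) efficient within the list.

    With weakly=True, only strong dominance disqualifies an alternative,
    which yields the weakly efficient ones.
    """
    return [
        name
        for name, values in alternatives.items()
        if not any(
            dominates(other, values, strongly=weakly)
            for other_name, other in alternatives.items()
            if other_name != name
        )
    ]


# Hypothetical value vectors: (costs, benefits) with two costs and one benefit.
alternatives = {
    "P": ((2, 3), (10,)),
    "Q": ((2, 4), (10,)),  # like P, but with a higher second cost
    "R": ((1, 5), (8,)),
}
print(efficient(alternatives))               # ['P', 'R']
print(efficient(alternatives, weakly=True))  # ['P', 'Q', 'R']
```

In this hypothetical data, Q is dominated by P but not strongly dominated, so it is weakly efficient without being (strongly) efficient.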

In traditional production theory—characterized by assumption A6a—more specific notions such as, for example, ‘input-efficiency’ or ‘proper efficiency’ are also defined (cf. Esser 2001, pp. 119). To differentiate the usual notion (of Koopmans 1951) from those with more information on the values created or consumed by the production activity, such as ‘allocative efficiency’ when prices are known, the term ‘technical efficiency’ is sometimes used to concretize it.Footnote 15 All these particular notions can be generalized for assumption A6b such that the dominance direction of bads for an input or an output is contrary to that of goods. Thus, analogously to traditional theory, an efficient production with bads is (also) called—more specifically—technically efficient.Footnote 16

In general, however, following from A4 and A5, efficiency in MCPT is determined by comparing the benefits and costs of activities, and not their outputs and inputs. Therefore, the input/output-vector describing an efficient production is also referred to as functionally-efficient to differentiate it from the corresponding efficient value vector in the space of objectives (cf. Fandel 2010, p. 89).

It is an important question as to whether there are certain general relationships between the efficiencies of different valuation levels.

Proposition 1

For any PPS P, let \(\varvec{v}^{1} \left( {\varvec{x};\varvec{y}} \right)\) and \(\varvec{v}^{2} \left( {\varvec{x};\varvec{y}} \right)\) with \(\varvec{v}^{2} \left( {\varvec{x};\varvec{y}} \right) = \varvec{u}\left( {\varvec{v}^{1} \left( {\varvec{x};\varvec{y}} \right)} \right)\) be two multiple value functions satisfying assumptions A1–A5, where \(\varvec{u}\left( \varvec{v} \right)\) is a strictly monotonic, separable function mapping the ‘first level’ costs and benefits determined by \(\varvec{v}^{1} \left( {\varvec{x};\varvec{y}} \right)\) consistently onto the ‘second level’ costs and benefits determined by \(\varvec{v}^{2} \left( {\varvec{x};\varvec{y}} \right)\). Then, if activity \(\left( {\varvec{x}^{A} ;\varvec{y}^{A} } \right) \in P\) of DMU A dominates activity \(\left( {\varvec{x}^{B} ;\varvec{y}^{B} } \right) \in P\) of DMU B with respect to the first value level, A dominates B with respect to the second value level, too.

Proof

Let \(\varvec{c}^{i} \left( \varvec{z} \right)\) and \(\varvec{b}^{i} \left( \varvec{z} \right)\) for \(i \in \left\{ {1, 2} \right\}\) be the 1st and 2nd level costs and benefits of activities \(\varvec{z} \in P\) such that \(\varvec{v}^{i} = (\varvec{c}^{i} ;\varvec{b}^{i} )\) and \(\varvec{v}^{2} = \varvec{u}(\varvec{v}^{1} )\). Dominance of A over B regarding the first value level is equivalent to the following vector dominance of (negatively valued) costs and (positively valued) benefits: \(( - \varvec{c}^{1A} ;\varvec{b}^{1A} ) \ge ( - \varvec{c}^{1B} ;\varvec{b}^{1B} )\). Consistent monotonicity and separability of \(\varvec{u}\left( \varvec{v} \right)\) imply that any type of 2nd level cost is strictly increasing regarding those of 1st level costs and strictly decreasing regarding those of 1st level benefits on which it depends; and vice versa for the 2nd level benefits. From this follows the asserted 2nd level value dominance of A over B, namely: \(\varvec{c}^{2A} = \varvec{u}^{\varvec{c}} \left( {\varvec{c}^{1A} ;\varvec{b}^{1A} } \right) \le \varvec{u}^{\varvec{c}} \left( {\varvec{c}^{1B} ;\varvec{b}^{1B} } \right) = \varvec{c}^{2B}\) and \(\varvec{b}^{2A} = \varvec{u}^{\varvec{b}} \left( {\varvec{c}^{1A} ;\varvec{b}^{1A} } \right) \ge \varvec{u}^{\varvec{b}} \left( {\varvec{c}^{1B} ;\varvec{b}^{1B} } \right) = \varvec{b}^{2B}\). \(\square\)

Benefit function \(b_{1} = 340y_{1} - 10x_{1} - 50x_{2} + 20x_{3}\) in (1), known from the cement plants of Sect. 3.1, is separable (even linear) regarding inputs and outputs. It also illustrates the case of a benefit (especially profit) which, on the one hand, increases with the output quantity of cement (\(y_{1}\)) and the input quantity of scrap tires (\(x_{3}\)) and, on the other hand, decreases with the input quantities of both labor and capital (\(x_{1} ,x_{2}\)). In contrast, the benefit employment (\(b_{2}\)) increases with more labor input (\(x_{1}\)). This characterizes conflicting interests (of shareholders and workers) and hence an inconsistent valuation of the input labor (violating assumption A6c), which is why the example exhibits no direct dominance relations between the inputs and outputs themselves (but perhaps win-win possibilities between profit and employment).
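The conflict in the valuation of labor can be detected mechanically from the signs of the value coefficients. The following Python sketch is an illustration only: the profit coefficients are those of benefit function (1) as quoted above, whereas the employment row is a hypothetical placeholder (the text only states that employment increases with labor input), and the function name `conflicting_variables` is an assumption made here.

```python
# Columns: x1 (labor), x2 (capital), x3 (scrap tires), y1 (cement).
VARIABLES = ["x1", "x2", "x3", "y1"]

# Profit coefficients are taken from benefit function (1) quoted in the text.
# The employment row is a hypothetical placeholder: the text only states that
# employment increases with labor input, not by how much.
BENEFITS = {
    "profit": [-10, -50, 20, 340],
    "employment": [1, 0, 0, 0],
}
COSTS = {}  # no cost types are needed for this illustration


def conflicting_variables(benefits, costs, variables):
    """List the variables valued inconsistently by a linear value function.

    A larger quantity is 'desired' where it raises a benefit or lowers a
    cost, and 'undesired' in the opposite case. A variable violates the
    consistency assumption (A6c) if it is desired for one value type and
    undesired for another.
    """
    conflicts = []
    for idx, name in enumerate(variables):
        # Sign of the preference for increasing this variable, per value type.
        prefs = [row[idx] for row in benefits.values()]
        prefs += [-row[idx] for row in costs.values()]
        if any(p > 0 for p in prefs) and any(p < 0 for p in prefs):
            conflicts.append(name)
    return conflicts


print(conflicting_variables(BENEFITS, COSTS, VARIABLES))  # ['x1']
```

Only labor is flagged, because more labor lowers profit but raises employment; capital, scrap tires and cement each have a single preference direction.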

Monotonicity and (non-linear) separability, as premises of Proposition 1 (and a tightening of assumption A6e), form conditions of preferential consistency which are sufficient for dominance relations to be transferred from one valuation level to the next. Proposition 1 has immediate implications for the respective efficiencies of both value levels, which will be discussed in Sect. 4.5 with respect to the rationality of inefficient production:

Corollary 2

Let there be two preferentially consistent first- and second-level multiple value functions, as defined in Proposition 1. If a production activity is efficient regarding (the PPS and) the second valuation level, then it is also efficient regarding the first one. In the special cases of traditional and environmental production theory with first level value functions satisfying assumption A6a or A6b, each production activity being functionally efficient with respect to the second valuation level is technically efficient, too. In particular, if the second level determines the ‘success’ (e.g. profit of the DMUs) by a one-dimensional value function, then the success maximum is always technically efficient. Footnote 17

Corollary 3

An inefficient production activity remains inefficient if its inputs and outputs, or its values on a higher level, are further aggregated by preferentially consistent monotonic value functions.

3.4 Main theorem for linear value functions

Next, a further main theorem of MCPT will be stated. To simplify the presentation, a fairly general type of (business technologies Τ and) PPS Ρ will be taken as a basis. Here, Ρ is generated or ‘spanned’ by a finite number \(j = 1, \ldots ,n\) of basic activities \(\varvec{z}_{j} = (\varvec{x}_{j} ;\varvec{y}_{j} ) \in {\mathbb{R}}^{m + s}\) (Fandel 2010, p. 49; Dyckhoff and Spengler 2010, pp. 162), i.e. determined as a certain envelopment of these activities that can be mathematically described as follows:

$$P =\left \{ {z = \mathop \sum \limits_{j = 1}^{n} \lambda_{j} z_{j}\, | \, \lambda \in S} \right\}\;{\text{for}}\;S: = \left\{ {\lambda \in {\mathbb{R}}_{ + }^{n}\, |\,\tau_{min} \le \mathop \sum \limits_{j = 1}^{n} \lambda_{j} \le \tau_{max} } \right\}$$
(2)

Here, S is the activity levels set. Four cases defined by combining the two lower levels \(\tau_{min} \in \{ 0;1\}\) with the two upper levels \(\tau_{max} \in \{ 1;\infty \}\) are of interest. Each of the four different ranges for the activity levels \(\lambda_{j}\) implies a PPS Ρ with either variable (VRS: 1;1), non-increasing (NIRS: 0;1), non-decreasing (NDRS: 1;∞), or else constant (CRS: 0;∞) returns to scale. Thus, Ρ is linear in the last and convex in all four cases. In the following we assume this:

A7 (Convex PPS with CRS, VRS, NIRS or NDRS) The PPS Ρ is spanned by a finite number of non-trivial basic activities following (2) for one of the four cases of returns to scale mentioned above.
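The four cases differ only in the bounds on the sum of the activity levels. A minimal Python sketch of the membership test for S and of the combination in (2) could look as follows; the function names, the tolerance parameter and the two basic activities in the usage lines are hypothetical and serve illustration only.

```python
import math

# (tau_min, tau_max) for the four cases named in the text.
RETURNS_TO_SCALE = {
    "VRS": (1.0, 1.0),
    "NIRS": (0.0, 1.0),
    "NDRS": (1.0, math.inf),
    "CRS": (0.0, math.inf),
}


def in_activity_level_set(lam, case, tol=1e-9):
    """Check whether the activity levels lam belong to S for the given case."""
    tau_min, tau_max = RETURNS_TO_SCALE[case]
    if any(level < -tol for level in lam):  # all lambda_j must be non-negative
        return False
    total = sum(lam)
    return tau_min - tol <= total <= tau_max + tol


def combine(basic_activities, lam):
    """Return z = sum_j lam_j * z_j for basic activities given as tuples."""
    size = len(basic_activities[0])
    return tuple(
        sum(level * z[i] for level, z in zip(lam, basic_activities))
        for i in range(size)
    )


# Two hypothetical basic activities (x; y) with one input and one output.
basic = [(2, 1), (4, 3)]
print(combine(basic, (0.5, 0.5)))                      # (3.0, 2.0)
print([c for c in RETURNS_TO_SCALE if in_activity_level_set((0.5, 0.5), c)])
print([c for c in RETURNS_TO_SCALE if in_activity_level_set((1.0, 1.0), c)])
```

A convex combination (levels summing to one) is admissible in all four cases, whereas doubling both activities is admissible only under NDRS and CRS.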

Then, with assumption A6d, often used properties of such a technology or PPS, namely linearity and convexity, are inherited by their value image sets.

Proposition 4

If \(\varvec{v} = \varvec{v}\left( \varvec{z} \right) = \varvec{V} \cdot \varvec{z}\) is a multi-dimensional linear value function (with value impact matrix \(\varvec{V}\) ) defined on the inputs and outputs \(\varvec{z}_{j} = \left( {\varvec{x}_{j} ;\varvec{y}_{j} } \right) \in {\mathbb{R}}^{m + s}\) of a PPS of type

$$P = \left\{ {\varvec{z} = \mathop \sum \limits_{j = 1}^{n} \lambda_{j} \varvec{z}_{j} \, |\, \varvec{ \lambda } \in S} \right\}$$

with the activity levels set S defined in (2), then, with \(\varvec{v}_{j} : = \varvec{v}\left( {\varvec{z}_{j} } \right) = \varvec{V} \cdot \varvec{z}_{j}\) , the image set has the same property in value space:

$$V = \varvec{v}\left( P \right) = \left\{ {\varvec{v} = \mathop \sum \limits_{j = 1}^{n} \lambda_{j} \varvec{v}_{j}\, |\, \lambda \in S} \right\}.$$

Proof

by simple algebraic transformations. \(\square\)
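For convenience, the transformations can be spelled out in one line; the following is a worked step that uses only the linearity of \(\varvec{v}\) (assumption A6d) and definition (2). For any \(\lambda \in S\):

$$\varvec{v}\left( {\mathop \sum \limits_{j = 1}^{n} \lambda_{j} \varvec{z}_{j} } \right) = \varvec{V} \cdot \mathop \sum \limits_{j = 1}^{n} \lambda_{j} \varvec{z}_{j} = \mathop \sum \limits_{j = 1}^{n} \lambda_{j} \left( {\varvec{V} \cdot \varvec{z}_{j} } \right) = \mathop \sum \limits_{j = 1}^{n} \lambda_{j} \varvec{v}_{j} .$$

Hence, every image of an activity in P is a combination of the \(\varvec{v}_{j}\) with the same activity levels \(\lambda \in S\), and every such combination is the image of the corresponding activity in P. Since S itself is unchanged, the returns-to-scale case carries over to the value space.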

4 Application to non-financial performance evaluation

Both main theorems and further properties of MCPT are now utilized to consider some of the topics of performance evaluation (mentioned in the Introduction) that are often controversially discussed in the literature, particularly regarding DEA.

4.1 Generalized data envelopment analysis

Dyckhoff and Allen (2001) have used the approach of decision-based production theory (in particular, a version of it that is similar to MCPT) in order to generalize the conception of DEA (cf. Sect. 2.3). The formulas of the usual efficiency measurement models then need only slight modification, namely in that the inputs and outputs are substituted by the corresponding costs and benefits. For example, the cost-oriented CCR model of DMU \(o \in \{1, \ldots ,n\}\), as a straightforward generalization of the standard input-oriented CCR model of DEA, exhibits the following envelopment form for a PPS Ρ determined as the linear hull (2) of the activities \(\varvec{z}_{j}\) of the DMUs \(j = 1, \ldots ,n\) in the case of A7 with CRS (here, for simplicity, without amendment in order to avoid weakly efficient solutions):

$$\begin{aligned} {\text{Minimize}}\;\theta_{o} \;{\text{such}}\;{\text{that}}\;\varvec{c}\left( \varvec{z} \right) \le \theta_{\text{o}} \varvec{c}\left( {\varvec{z}_{\text{o}} } \right) {\text{and }}\varvec{b}\left( \varvec{z} \right) \ge \varvec{b}\left( {\varvec{z}_{\text{o}} } \right) \hfill \\ {\text{for}}\;\varvec{z} = \mathop \sum \limits_{j = 1}^{n} \lambda_{j} \varvec{z}_{j}\;{\text{with}}\;\lambda_{j} \ge 0, \; j = 1, \ldots , n. \hfill \\ \end{aligned}$$
(3)

In the special case of linear cost and benefit functions (assumption A6d), Dyckhoff and Ahn (2010, pp. 1261) have motivated the dual linear program of this model by maximizing the ratio of weighted benefits to weighted costs—analogously to the original derivation of the CCR model in multiplier form from the ratio of weighted outputs to weighted inputs introduced by Charnes, Cooper and Rhodes (1978).

Proposition 4 implies that such a generalized DEA model, e.g. (3), is a linear program (LP), too, if the value functions are linear, i.e. assumption A6d is fulfilled. Then, DEA models can be applied in the same formal way as usual, namely in that the benefits and costs are handled as if they were ‘outputs’ and ‘inputs’, respectively. It is exactly this that is meant by the second part of the statement by Cook et al. (2014) which is cited and criticized in Sect. 2.1. It is a standard procedure that takes place in a lot of DEA applications in the literature; however, usually without any reflection on the underlying assumption on the chosen ‘inputs’ and ‘outputs’ regarding their dependency on the real production inputs and outputs. Indeed, if underlying linear value functions are inserted for the benefit and cost variables, one obtains specific new DEA models for the original input and output variables (as shown for the cost-oriented CCR-model (3) by Dyckhoff and Ahn 2010). Some of the models derived in this way are well known, especially from the DEA literature on undesirable outputs. The example of the next subsection illustrates this.

4.2 Undesirable products and factors

Now, linear value functions of the special type A6b will be supposed. As already mentioned in Sect. 3.2, this is a standard assumption for environmentally oriented performance measurement, characterized by the existence of undesirable (by-)products and factors. From their recent systematic review of DEA applications in sustainability, Zhou et al. (2018) conclude that one of the key challenges is how to deal with undesirable outputs. By 2015, there were more than 350 academic DEA papers studying undesirable products, but only 22 papers on undesirable factors (Wojcik et al. 2017). In an overview, Zhou and Liu (2015) formulated various DEA models with ‘undesirable inputs and outputs’, but without referring to any systematic approach as to how they can possibly be deduced in general. The generalization of DEA by MCPT, however, yields such a systematic procedure.

Let us use the following generalized multi-criteria (slack-based weighted) additive DEA model for a PPS P defined by (2) as an illustrating example:

$$\begin{aligned} {\text{Maximize}}\quad & \varvec{w}^{c} \cdot \varvec{s}^{c} + \varvec{w}^{b} \cdot \varvec{s}^{b} \\ {\text{such}}\;{\text{that}}\quad & \varvec{c}\left( \varvec{z} \right) + \varvec{s}^{c} = \varvec{c}\left( {\varvec{z}_{\text{o}} } \right) \\ & \varvec{b}(\varvec{z}) - \varvec{s}^{b} = \varvec{b}(\varvec{z}_{\text{o}} ) \\ & \varvec{s}^{c} ,\varvec{s}^{b} \ge 0;\varvec{z} \in P \\ \end{aligned}$$
(4)

With assumption A6b and (2), this model gets specialized to the slack-based (s) weighted (w) additive DEA model for both goods (G) and bads (B) as inputs (I) and outputs (O)Footnote 18:

$$\begin{aligned} {\text{Maximize}}\quad & \varvec{w}^{GI} \cdot \varvec{s}^{GI} + \varvec{w}^{BO} \cdot \varvec{s}^{BO} + \varvec{w}^{GO} \cdot \varvec{s}^{GO} + \varvec{w}^{BI} \cdot \varvec{s}^{BI} \\ {\text{such that}}\quad & \mathop \sum \limits_{j = 1}^{n} {\lambda }_{j} \varvec{x}_{j}^{G} + \varvec{s}^{GI} = \varvec{x}_{\text{o}}^{G} \\ & \mathop \sum \limits_{j = 1}^{n} {\lambda }_{j} \varvec{y}_{j}^{B} + \varvec{s}^{BO} = \varvec{y}_{\text{o}}^{B} \\ & \mathop \sum \limits_{j = 1}^{n} {\lambda }_{j} \varvec{y}_{j}^{G} - \varvec{s}^{GO} = \varvec{y}_{\text{o}}^{G} \\ & \mathop \sum \limits_{j = 1}^{n} {\lambda }_{j} \varvec{x}_{j}^{B} - \varvec{s}^{BI} = \varvec{x}_{\text{o}}^{B} \\ & \varvec{s}^{GI} ,\varvec{s}^{BO} ,\varvec{s}^{GO} ,\varvec{s}^{BI} \ge 0;{\lambda } = \left( {{\lambda }_{1} , \ldots , {\lambda }_{n} } \right) \in S \\ \end{aligned}$$
(5)

Here, bad output is mathematically described in the same way as good input, and bad input like good output. With respect to the underlying MCPT, it should be clear, however, that this identity is only of a syntactic and not at all of a semantic nature. The two must not be confused!Footnote 19 “Considering pollutants as inputs is not a correct way of modelling pollution-generating technologies”, as Dakpo et al. (2016, p. 357) correctly state (while misinterpreting Dyckhoff and Allen 2001). Indeed, pollutants are outputs, however of a kind which implies (social) costs.

For generalized DEA, based on MCPT, the primary variables of efficiency analysis are benefits and costs, not inputs and outputs. In the special case (5) of model (4), determined by assumption A6b, the benefits are measured by the physical quantities of good outputs and bad inputs, the costs by those of good inputs and bad outputs (as a kind of proxy variables). Consequences of a production activity that one wishes to reduce are always costs, be it for the input of goods (because of expenditures for buying them) or for the output of bads (because of their damage to nature or of emission fees to be paid). Consequences to be maximized are the benefits, in (5) either resulting from the output of goods (usually because of the revenues for selling them) or from the input of bads (because one must get rid of them, as in the example of tires incinerated by a cement plant).Footnote 20 In any case, it should be borne in mind that the preferences for objects as either goods or bads have a priori nothing to do with the production technology. In (5), the actual production possibilities are in fact modelled as usual in traditional DEA, namely by a (e.g. linear or convex) envelopment of the DMUs independently of the preferences for inputs and outputs.

Model (5) is the first published DEA model with undesirable factors (Dyckhoff and Allen 2001, p. 315; cf. Wojcik et al. 2017). Zhou and Liu (2015, p. 422) have formulated the same model, in spite of a different, technologically motivated definition of those factors. Their notion of ‘undesirable input’ is in contrast to the preference-based view of MCPT, which in turn, however, is consistent with their own definition of desirable outputs as “what the decision maker hopes to produce as much as possible”, reflecting a subjective judgment by Zhou and Liu (2015, p. 417). On the other hand, they state (p. 417) “that the desirability of inputs should be defined according to the intrinsic production mechanism. (…) If the increase of an input will not increase the desirable outputs, then it is classified as undesirable” (and vice versa for ‘desirable input’). However, the increase of a ‘limitational’, i.e. non-substitutable production factor alone, e.g. (new) tires in car assembly, without increasing other factors at the same time, will neither increase nor decrease the output of the main product (car). Hence, according to the definition of Zhou and Liu, such limitational factors would have to be classified as inputs, both desired as well as undesired at the same time.

Furthermore, waste incineration plants do not have material desirable outputs (unless simultaneously producing power or heat as a marketable by-product) so that their input ‘waste’ cannot be classified in this technological way. Nevertheless, if the service of disposal were understood as the desired output of waste incineration plants, then waste input could be classified as desirable by the technologically motivated definition of Zhou and Liu (2015). This then coincides with Dyckhoff and Allen (2001, p. 315), who introduced the (however preference-based) notion of desired input into DEA: “Waste to be burned at the power plant is such an undesirable object the destruction of which is desired, i.e. the input of which should be maximized.”

4.3 Consistent hierarchical performance evaluations

If they are specified in a preferentially consistent way (A6c), linear value functions allow for systematic multi-stage nested performance evaluations in the form of a hierarchy (similar to the one in Fig. 1), where the performance results of a DMU can only decline or stay constant from one level to the next. Such an aggregation is consistent if benefits on the lower level are linearly combined with nonnegative coefficients into benefits and with non-positive ones into costs on the higher level, and the analogous holds true for the costs.

Proposition 5

If a generalized DEA model is consistently aggregated by linear value functions into a generalized DEA model of the same type on a higher hierarchy level, then the efficiency scores of the DMUs cannot improve.

Proof

Since the DEA models in envelopment form of both hierarchy levels are of the same type of LP, also their dual LPs in multiplier form are of the same type. By inserting the linear value functions into the dual LP of the higher level, one obtains exactly the dual LP of the lower level; however, with the additional constraints that the multipliers of two aggregated costs or benefits have to be proportional to the respective two coefficients of the linear value function. Hence, the optimal efficiency score of this restricted dual LP representing the higher level cannot be better than that of the original dual LP of the lower level. \(\square\)

It follows from Proposition 5 that the performance rating of a DMU can only decline, or at best remain constant, when the evaluation is aggregated, as long as the preference relations expressed by the corresponding value functions of the hierarchy are consistent. Hence, performance scores at lower levels of aggregation imply an upper bound for performance scores at higher levels.

This essential property of non-improving performance ratings of a DMU for monotonically nested multi-stage linear value functions will be demonstrated for a 3-level hierarchy by a modified version of the prior example of four cement factories (cf. Dyckhoff and Ahn 2010, p. 1265). Apart from purely economic objectives, no social impacts are taken into account, and only global warming is considered as an ecological cost. Instead of CFC, we now include methane (CH4) as a second emission besides carbon dioxide. These modifications of the example from Sect. 3.1 eliminate potentially conflicting impacts of the same input or output. Thus, assumption A6c and the premises of Proposition 5 are no longer violated.

Hierarchy Level 1

$$\left[ {\begin{array}{*{20}c} 4 & 4 & 5 & 3 \\ 3 & 5 & 5 & 5 \\ {120} & {40} & {100} & {100} \\ 6 & 2 & 5 & 5 \\ 1 & 1 & 1 & 1 \\ 5 & 1 & 3 & 3 \\ \end{array} } \right] \overset{\wedge}{=}\left[ {\begin{array}{*{20}c} {c_{1} } \\ {c_{2} } \\ {c_{3} } \\ {c_{4} } \\ {b_{1} } \\ {b_{2} } \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} {x_{1} } \\ {x_{2} } \\ {y_{2} } \\ {y_{3} } \\ {y_{1} } \\ {x_{3} } \\ \end{array} } \right]\begin{array}{*{20}c} {\text{labor}} \\ {\text{capital}} \\ {{\text{ CO}}_{ 2} } \\ {{\text{CH}}_{ 4} } \\ {\text{cement}} \\ {{\text{scrap}}\;{\text{tires}}} \\ \end{array} \,$$

At the lowest level 1 of the linear value aggregation hierarchy, we analyze the technical efficiency as defined in Sect. 3.3, in line with assumption A6b of Sect. 3.2. The respective value impact matrix V considers the six types of technological inputs and outputs of the example as immediate decision consequences to be directly maximized or minimized. Thus, as shown above, labor and capital input represent two distinctive kinds of economic costs to be minimized (not necessarily measured in the same units), together with CO2 and CH4 emissions as two different kinds of non-financial costs, here ecological ones. The quantity of cement produced and the quantity of scrap tires safely disposed of are the two kinds of benefits, each to be maximized (in its natural units). The cost-oriented CCR model (3) then calculates efficiency scores of 100% for all DMUs except the third one with a score of 80%. Thus, DMU C is technically inefficient.
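The level 1 scores can be recomputed with a small linear program. The following Python sketch assumes that NumPy and SciPy are available and solves the envelopment form (3) with `scipy.optimize.linprog`; the function name `ccr_cost_oriented` and the array layout (rows are value types, columns are the DMUs A to D) are choices made here for illustration, not part of the original formulation.

```python
import numpy as np
from scipy.optimize import linprog


def ccr_cost_oriented(costs, benefits, o):
    """Score of DMU o in the cost-oriented CCR model (3), envelopment form.

    costs, benefits: arrays with one row per value type, one column per DMU.
    Decision variables are [theta, lambda_1, ..., lambda_n].
    """
    costs = np.asarray(costs, dtype=float)
    benefits = np.asarray(benefits, dtype=float)
    k, n = costs.shape
    r = benefits.shape[0]

    objective = np.concatenate(([1.0], np.zeros(n)))  # minimize theta

    # sum_j lambda_j * c_j - theta * c_o <= 0
    cost_rows = np.hstack((-costs[:, [o]], costs))
    # -sum_j lambda_j * b_j <= -b_o
    benefit_rows = np.hstack((np.zeros((r, 1)), -benefits))

    result = linprog(
        objective,
        A_ub=np.vstack((cost_rows, benefit_rows)),
        b_ub=np.concatenate((np.zeros(k), -benefits[:, o])),
        bounds=[(None, None)] + [(0, None)] * n,  # theta free, lambda >= 0
        method="highs",
    )
    return result.fun


# Level 1 data of the example (columns: DMUs A, B, C, D).
costs_1 = [
    [4, 4, 5, 3],         # c1 = x1, labor
    [3, 5, 5, 5],         # c2 = x2, capital
    [120, 40, 100, 100],  # c3 = y2, CO2
    [6, 2, 5, 5],         # c4 = y3, CH4
]
benefits_1 = [
    [1, 1, 1, 1],         # b1 = y1, cement
    [5, 1, 3, 3],         # b2 = x3, scrap tires
]

scores_1 = [ccr_cost_oriented(costs_1, benefits_1, o) for o in range(4)]
print([round(s, 2) for s in scores_1])
```

Under these assumptions the program should return a score of 1 for A, B and D and of 0.8 for C. The reference point for C is the combination of half of A and half of B, which yields the same benefits as C at 80% of each of its four costs.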

Hierarchy Level 2

$$\left[ {\begin{array}{*{20}c} {270} & {90} & {225} & {225} \\ {190} & {290} & {300} & {280} \\ {440} & {360} & {400} & {400} \\ \end{array} } \right] \overset{\wedge}{=}\left[ {\begin{array}{*{20}c} {\bar{c}_{1} } \\ {\bar{c}_{2} } \\ {\bar{b}_{1} } \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} {c_{3} + 25c_{4} } \\ {10c_{1} + 50c_{2} } \\ {340b_{1} + 20b_{2} } \\ \end{array} } \right]\begin{array}{*{20}c} {{\text{greenhouse}}\;{\text{effect}}} \\ {{\text{financial}}\;{\text{cost}}} \\ {\text{revenue}} \\ \end{array}$$

At the second level of the hierarchy, the six attributes of level 1 are aggregated—consistently with the prior preferences—into three values being more fundamental to the decision maker (or evaluator). Accordingly, the undesirable outputs CO2 and CH4 are combined in terms of their greenhouse effect, resulting in a single kind of ecological costs usually measured in units of radiative forcing (or carbon dioxide equivalents), where one mass unit of methane will have the same impact as 25 units of carbon dioxide within 100 years (cf. Forster et al. 2007). A second kind of costs is obtained by adding the financial values of workers’ salaries and capital expenses. These costs are specified separately from the financial benefit through the revenues from cement sales and scrap tire disposal. Now, the fourth DMU D is also found to be (functionally) inefficient (D: 83% and C: 80%).
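The aggregation from level 1 to level 2 is a plain linear mapping and can be checked by hand or with a few lines of code. The sketch below is illustrative only: it uses the level 1 values and the level 2 weights stated above, while the dictionary layout and the function name `aggregate` are assumptions made here.

```python
# Level 1 values per DMU (A, B, C, D), as given for hierarchy level 1.
level_1 = {
    "c1": [4, 4, 5, 3],         # labor
    "c2": [3, 5, 5, 5],         # capital
    "c3": [120, 40, 100, 100],  # CO2
    "c4": [6, 2, 5, 5],         # CH4
    "b1": [1, 1, 1, 1],         # cement
    "b2": [5, 1, 3, 3],         # scrap tires
}

# Linear aggregation weights of level 2. All weights are non-negative,
# costs are combined into costs and benefits into benefits, so the
# aggregation is consistent in the sense described in the text.
level_2_weights = {
    "greenhouse effect": {"c3": 1, "c4": 25},
    "financial cost": {"c1": 10, "c2": 50},
    "revenue": {"b1": 340, "b2": 20},
}


def aggregate(values, weights):
    """Apply a linear value function row by row to all DMUs."""
    n = len(next(iter(values.values())))
    return {
        name: [sum(w * values[key][j] for key, w in terms.items()) for j in range(n)]
        for name, terms in weights.items()
    }


level_2 = aggregate(level_1, level_2_weights)
for name, row in level_2.items():
    print(f"{name:18s} {row}")
# greenhouse effect  [270, 90, 225, 225]
# financial cost     [190, 290, 300, 280]
# revenue            [440, 360, 400, 400]
```

The printed rows coincide with the level 2 matrix given above.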

Hierarchy Level 3

$$\left[ {\begin{array}{*{20}c} {270} & {90} & {225} & {225} \\ {250} & {70} & {100} & {120} \\ \end{array} } \right] \overset{\wedge}{=}\left[ {\begin{array}{*{20}c} {\hat{c}_{1} } \\ {\hat{b}_{1} } \\ \end{array} } \right] = \left[ \begin{array}{c} \bar{c}_{1} \\ \bar{b}_{1} - \bar{c}_{2} \\ \end{array} \right]\begin{array}{*{20}c} {{\text{greenhouse}}\;{\text{effect}}} \\ {\text{profit}} \\ \end{array} \,$$

The next step is to subtract financial costs from financial revenues, as is usual in accounting to calculate profit. Thus, on the third level, greenhouse effect and financial profit are the only two remaining performance criteria. Since none of the four DMUs makes losses, the cost-oriented CCR model can be applied once again. Now, only DMU A is 100% (functionally) efficient. The other scores are 84% for B, 48% for C and 58% for D.
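With only one cost and one benefit left, the level 3 scores can be verified without solving an LP: under constant returns to scale, the cost-oriented CCR score of a DMU then equals its benefit/cost ratio divided by the best such ratio among all DMUs. The following Python sketch applies this shortcut to the level 2 values stated above; the variable names are chosen here for illustration.

```python
DMUS = ["A", "B", "C", "D"]

# Level 2 values as given for hierarchy level 2.
greenhouse = [270, 90, 225, 225]
financial_cost = [190, 290, 300, 280]
revenue = [440, 360, 400, 400]

# Level 3: greenhouse effect is kept, profit = revenue - financial cost.
profit = [r - c for r, c in zip(revenue, financial_cost)]

# Single cost, single benefit, CRS: the score is the profit per unit of
# greenhouse effect relative to the best ratio among all DMUs.
ratios = [p / g for p, g in zip(profit, greenhouse)]
best = max(ratios)
scores = {d: round(100 * q / best) for d, q in zip(DMUS, ratios)}

print(profit)  # [250, 70, 100, 120]
print(scores)  # {'A': 100, 'B': 84, 'C': 48, 'D': 58}
```

The percentages match those reported in the text, and none of them exceeds the corresponding level 2 score, as Proposition 5 requires.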

With respect to their eco-efficiency measurement method, Kuosmanen and Kortelainen (2005, p. 69) suggest that: “If possible, it is useful to first aggregate specific emissions and pollutants using impact assessment tools.” As with carbon dioxide and methane in the example, the eco-efficiency method of BASF aggregates ecological impacts by linear value functions hierarchically according to Fig. 1. If this is done with fixed weights, one must be careful not to violate the rationality condition of ‘independence from irrelevant alternatives’ (Dyckhoff et al. 2015). DEA generalized by MCPT supplies a less demanding evaluation method for aggregating the ecological impacts than the correct application of MAUT. As Kuosmanen and Kortelainen (2005, p. 65) note: “The DEA eco-efficiency score provides an upper bound for the ‘true’ but unknown eco-efficiency index: using any other weights will necessarily decrease the eco-efficiency score of the evaluated unit”.

Proposition 5 implies a further important argument supporting the use of DEA in practice. That is, the technological inputs and outputs of a DMU are often more easily observable and measurable than the state of the fundamental goals, which are actually the ultimate focus of interest (cf. Kuosmanen and Kortelainen 2005, p. 61). Although it is true that performance criteria should ideally represent fundamental values as much as possible, in practice they often represent means to an end only. As we have seen in the last example, (generalized) DEA nevertheless allows for meaningful estimations of the performance of a DMU as long as inputs and outputs as well as benefits and costs are chosen compatibly with the fundamental objectives. However, the more the benefits and costs represent the fundamental goals, the better (i.e. sharper) the performance ratings.

The relationship between the inputs and outputs of a transformation process at the one end and the fundamental sustainability goals at the other is often complex, and it is difficult to untangle the chain of effects (cf. Kuosmanen and Kortelainen 2005, p. 61). Consider, for example, the network of impacts of greenhouse gas emissions with respect to global warming on human welfare and the ecosystem. As shown in Fig. 4, most parts of the full chain of effects—from the outputs via several connected implied outcomes to the last fundamental values—are outside the sphere of influence of the respective DMU and its available production technology. It is a challenge to take account of such complex impact chains when applying DEA for ecologic performance measurement, particularly in the case of nonlinearities.

Fig. 4 From outputs via impacts to fundamental goals (Hauschild and Huijbregts 2015, Fig. 1.2)

4.4 Evaluation for non-monotonic value functions

In the last two subsections, we have demonstrated a systematic approach to handling certain challenges and pitfalls of (sustainability) performance evaluation that have been subject to debate in the literature (Footnote 21). In general, one has to know the set \(\varvec{v}(P)\) of possible cost–benefit vectors as consequences of the PPS P. It is a crucial question as to whether the technological properties of the PPS with respect to the inputs and outputs, such as convexity or certain returns to scale, translate into similar properties with respect to the benefits and costs. Clearly, this will depend on the properties of the multi-dimensional value function \(\varvec{v}(\varvec{z})\). Until now, linear (and consistent) valuations have been supposed. Otherwise DEA methodology cannot be applied in general, because all common DEA models are characterized by linear relations, leading to LPs which can be solved by the usual software codes. Moreover, some new conceptual problems may also arise for nonlinear value functions as well as for inconsistent valuations. The next two subsections analyse and discuss phenomena regarding the characterization of production activities or DMUs as (functional) efficient, deviating from linear cases with consistent valuations.

Figure 5 illustrates a simple graphical example. For this purpose, consider three DMUs A, B and C, each of which produces one kind of product (in quantity y) from one kind of factor (in quantity x), displayed in the south-west (below-left) of the two-dimensional coordinate system. The PPS P is determined as the triangle generated by the convex envelopment of the three DMUs in accordance with (2) for a technology with variable returns to scale (case \(\tau_{min} = \tau_{max} = 1\)) and without free disposal. Regarding assumption A6a, all activities on the segment joining A and C are technically efficient, i.e. efficient with respect to input and output of goods.

Fig. 5

Example with nonlinear value function

Case 1: research efficiency

In a first interpretation of the numerical example, imagine that the three DMUs represent researchers who, over the course of a year, have produced a quantity y of academic papers by investing an amount of time x. According to these input and output quantities, researcher B has worked less efficiently than the other two (Footnote 22).

But since the academic content of a collection of papers is certainly not proportional to their number, one should preferably choose other performance criteria, e.g. the academic knowledge content b of the papers as a benefit, and the effort c invested by the researcher while writing the papers as a cost or disutility. These two criteria are shown in the extended coordinate system of Fig. 5 with axes in the north and east directions (up and right). For simplicity, we will suppose that the effort c of a researcher is proportional to the time x invested (below-right in Fig. 5). It is furthermore assumed that the total scientific innovation content b of a researcher’s papers depends only on their number per year y: it first increases at a diminishing rate, reaches a maximum at a certain number of papers, and then decreases in absolute terms, down to some maximal number of yearly written papers which collectively contain no new knowledge. The easiest way to describe this is with a quadratic value function (top-left in Fig. 5). Note that the monotonicity assumption A6e is violated in this case.

With these assumptions, points \(\tilde{A}\), \(\tilde{B}\) and \(\tilde{C}\) in value space (top-right in Fig. 5) build the images of the production activities A, B and C. All other points of the PPS (bottom-left triangle), measured by the time input x and paper output y of the researchers, can be mapped onto corresponding points in the upper-right quadrant, too. In this way, one obtains the (shaded) non-convex shape as value image of the PPS. It contains all achievable combinations of knowledge contribution b and effort c invested. Only those combinations which are on the bold part of the top-left curve of the non-convex shape are efficient (with respect to benefit and cost). This bold curve in the upper-right quadrant represents the image of the bold strict sub-segment of the line between A and C in the bottom-left triangle. Hence, although the whole line segment between A and C has been regarded as technically efficient before (with respect to input and output), only its bold sub-segment spanning from A up to the point of maximal knowledge yield really is (functionally) efficient regarding benefit and cost.

As already stated, the non-convex shape of Fig. 5 contains all combinations of benefit and cost from knowledge contribution and effort investment by realizable input/output combinations. In contrast to the (data) envelopment of the measured costs and benefits of the three DMUs, these combinations are actually achievable. In fact, it would be a systematic error to interpret the triangle (not shown in the upper-right quadrant), formed as the convex hull of the three image points \(\tilde{A}\), \(\tilde{B}\) and \(\tilde{C}\), as attainable costs and benefits of production activities. This would wrongly imply the (functional) efficiency of DMU B, as well as of the whole segment running up to the image of DMU A. This mistake may occur when handling benefits and costs as ‘outputs’ and ‘inputs’ in traditional DEA.
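The difference between the true value image and the naive envelopment of the image points can be reproduced numerically. The sketch below is only an illustration under stated assumptions: the coordinates of A, B and C, the effort function c = x and the quadratic benefit function b = y(12 − y) are hypothetical and are not taken from Fig. 5, and both sets are approximated by a finite grid of convex combinations.

```python
# Hypothetical activities (time input x, paper output y); not the data of Fig. 5.
A, B, C = (2.0, 2.0), (5.0, 4.0), (6.0, 10.0)


def value(activity):
    """Map (x, y) to (cost c, benefit b) with assumed c = x and b = y * (12 - y)."""
    x, y = activity
    return (x, y * (12.0 - y))


def convex_samples(p, q, r, steps=40):
    """Grid of convex combinations of three two-dimensional points."""
    samples = []
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            k = steps - i - j
            samples.append(tuple((i * a + j * b + k * c) / steps
                                 for a, b, c in zip(p, q, r)))
    return samples


def is_dominated(target, candidates, eps=1e-9):
    """True if a candidate costs no more, yields no less, and is strictly better once."""
    c0, b0 = target
    return any(c <= c0 + eps and b >= b0 - eps and (c < c0 - eps or b > b0 + eps)
               for c, b in candidates)


# Value image of the PPS: every sampled activity is mapped through the value function.
true_image = [value(z) for z in convex_samples(A, B, C)]
# Naive envelopment: convex hull of the three image points only.
naive_hull = convex_samples(value(A), value(B), value(C))

for name, dmu in (("A", A), ("B", B), ("C", C)):
    naive = "inefficient" if is_dominated(value(dmu), naive_hull) else "efficient"
    true = "inefficient" if is_dominated(value(dmu), true_image) else "efficient"
    print(name, "| naive hull:", naive, "| value image:", true)
```

With these assumed numbers, DMU B appears efficient against the convex hull of the three image points but is dominated within the value image of the PPS, namely by the image of the activity on the segment between A and C that yields the maximal benefit.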

Case 2: profit maximization (Footnote 23)

Our example, considered so far from the view of performance measurement, can also be interpreted in classical economic terms, namely in the sense of the well-known Cournot Theorem for price fixing by monopolistic market actors. In this alternative interpretation of the numerical example, the three DMUs represent business units, each selling the quantity y of a product on its own local market, where demand is a linear function of the price that the business unit sets on that market. The vertical, upwards-directed axis shows the revenue b(y), and the horizontal, right-directed axis the financial costs c(x) of the input x. The difference b − c determines the profit (or contribution margin). In the top-right of Fig. 5, the profit of each of the three DMUs is shown as a vertical dashed segment to the angle bisector: it indicates a loss for DMU C and a profit for the other two. The dotted grey lines in the coordinate system indicate the so-called ‘Cournot point’ as the point of maximal profit regarding the bottom-left triangle as PPS.
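The non-linearity in this interpretation can be made explicit with a short derivation. It is only a sketch under assumed notation: the demand parameters \(\alpha, \beta > 0\), the cost rate \(\kappa > 0\) and the linear description \(x = \gamma + \delta y\) (with \(\delta > 0\)) of the technically efficient segment between A and C are illustrative symbols that do not appear in the original example.

\[
p(y) = \alpha - \beta y \;\Rightarrow\; b(y) = p(y)\,y = \alpha y - \beta y^{2}, \qquad c(x) = \kappa x,
\]
\[
\pi(y) = b(y) - c(\gamma + \delta y) = (\alpha - \kappa\delta)\,y - \beta y^{2} - \kappa\gamma, \qquad \pi'(y^{*}) = 0 \;\Leftrightarrow\; y^{*} = \frac{\alpha - \kappa\delta}{2\beta}.
\]

Revenue is thus a concave quadratic in y that peaks at \(y = \alpha/(2\beta)\) and falls thereafter, which corresponds to the non-monotonic value function in the top-left of Fig. 5. Provided that \(y^{*}\) lies within the output range of the segment, it marks the Cournot point, where marginal revenue \(\alpha - 2\beta y\) equals marginal cost \(\kappa\delta\).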

The example has clearly demonstrated that one must be careful when using financial and other values as ‘inputs’ and ‘outputs’ in traditional DEA applications, as they may depend non-linearly on the actual inputs and outputs of the technology.

4.5 Rationality of ‘inefficient’ production

The examples of the last two subsections show that a technically efficient activity may be functionally inefficient and that, in contrast, all functionally efficient activities are also technically efficient. Specifically, in case 2 of the last example, the profit maximum of the firm is functionally efficient (regarding revenues and costs as two separate objectives) as well as technically efficient.

In the past decade, several authors have analyzed the question as to whether it can be rational to produce ‘inefficiently’ (e.g., see the seven papers in Fandel 2009). Corollary 3 explains, however, that this is impossible for those efficiency notions which are defined consistently for preferences without conflicts. Nevertheless, as Dyckhoff and Ahn (2001) pointed out, it is important to note that efficiency is always relative in a double sense, on the one hand regarding the set of compared alternatives and on the other regarding the set of considered objectives. The first kind of relativity is well known from traditional production theory: An inefficient production activity would become part of the efficiency frontier of the PPS if all activities dominating it were no longer possible, e.g. because of a further resource constraint. Thus, it is the second kind of relativity, which is characteristic of MCPT, that underlies the discussion in the literature on ‘rational inefficiencies’ (pioneered by Bogetoft and Hougaard 2003): An inefficient activity may become efficient if the set of objectives is changed, e.g. if a further (e.g. environmental) objective is added with respect to which all previously dominating activities are worse. The literature has discussed several kinds of such ‘rational inefficiencies’ (cf. Fandel 2009; Footnote 24).
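Both kinds of relativity can be illustrated with a small dominance check. The activities, their figures and the objective names below are hypothetical and serve only as a sketch of the argument.

```python
# Hypothetical activities: output is to be maximized, input and emissions minimized.
ACTIVITIES = {
    "P1": {"output": 10, "input": 4, "emissions": 9},
    "P2": {"output": 8, "input": 5, "emissions": 3},
    "P3": {"output": 6, "input": 3, "emissions": 6},
}
SENSE = {"output": +1, "input": -1, "emissions": -1}


def efficient(names, objectives):
    """Activities among `names` that are not dominated with respect to `objectives`."""
    def dominates(p, q):
        diffs = [SENSE[o] * (ACTIVITIES[p][o] - ACTIVITIES[q][o]) for o in objectives]
        return all(d >= 0 for d in diffs) and any(d > 0 for d in diffs)
    return {q for q in names if not any(dominates(p, q) for p in names)}


# P2 is dominated by P1 when only input and output count.
print(sorted(efficient(ACTIVITIES, ["output", "input"])))
# First kind of relativity: P2 becomes efficient if P1 is no longer possible.
print(sorted(efficient(["P2", "P3"], ["output", "input"])))
# Second kind of relativity: P2 becomes efficient once emissions are an objective.
print(sorted(efficient(ACTIVITIES, ["output", "input", "emissions"])))
```

The same activity P2 is inefficient in the first comparison and efficient in the other two, although its own input and output quantities never change.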

In their paper, Bogetoft and Hougaard (2003) pointed out that allowing for technical inefficiency may be rational from the perspective of an enterprise or DMU if the resulting slack of excess resources is valuable, at least to some degree. “For this, the objective function is extended by slacks as additional components so that a production alternative that turns out to be efficient with respect to the objective function in a higher-dimensioned decision space does not need to be efficient in a subspace such as the producer’s technical production possibilities set. Since input slacks are allocated a positive value, technical inefficiency itself becomes valuable and a trade-off evolves between gaining technical efficiency by reducing input levels and the consumption of positively valued slacks implying higher input levels and less technical efficiency.” This quote by Fandel and Lorth (2009, p. 410), commenting on the approach of Bogetoft and Hougaard (2003), describes exactly one of those decision situations which were considered in the preceding subsections as different valuation modes or levels that are possibly not preferentially consistent. Using the terminology introduced in this paper, we could say: The input of resources involves not only the costs according to the usual assumption A6a of traditional production theory, but also other (opportunity) costs for (future) production possibilities that are not explicitly considered, i.e. not captured by the given PPS P, and these costs must also be set against the revenues from the products generated with the actual resource input.

Moreover, there may be valuable slacks not only on the input side of production, but on the output side, too. For example, large pieces, originally being trim loss of cutting materials such as paper or steel from stock, can be put into inventory for further cutting on future demand (cf. Dyckhoff and Gehring 1988).

Even in cases of (short term) profit maximizing as exclusive objective, Fandel and Lorth (2010) have discussed and demonstrated further reasons for technically inefficient profit maxima. As respective real-life situations, they discuss markets for (undesirable) by-products, energy markets, all-units quantity discounts as well as wage concessions. In each case, a non-monotonic valuation is crucial for the existence of a technically inefficient profit maximum, as Corollary 3 has stated for more general instances, too. In such a case, however, the preferences underlying technical efficiency on the first (lower) value level and those underlying the profit on the second (higher) value level are inconsistent, which is irrational for any decision maker or evaluator.
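One of the situations mentioned above, the all-units quantity discount, can be sketched in a few lines. The prices, the discount threshold, the required input quantity and the revenue are hypothetical, and the sketch additionally assumes that surplus input can be left unused at no cost.

```python
REQUIRED = 90      # hypothetical input quantity needed for the planned output
REVENUE = 1200.0   # hypothetical revenue from that output


def purchase_cost(quantity, price=10.0, discount_price=8.0, threshold=100):
    """All-units discount: the lower price applies to every unit from the threshold on."""
    unit_price = discount_price if quantity >= threshold else price
    return unit_price * quantity


def profit(purchased):
    """Profit if `purchased` units are bought and only REQUIRED units are used."""
    return REVENUE - purchase_cost(purchased)


best = max(range(REQUIRED, 121), key=profit)
print("purchase", best, "units, unused slack", best - REQUIRED, "profit", profit(best))
```

Because the cost function is non-monotonic (100 units cost less than 90), the profit maximum buys more input than the technology needs and is therefore technically inefficient with respect to the purchased quantities.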

5 Conclusions

Data Envelopment Analysis (DEA) and other non-monetary valuation methods, such as Stochastic Frontier Analysis, are both tools for empirical research on the performance of human behavior and artefacts that can be used to improve that performance. Hence, they are part of behavioral science as well as of design science, each with a different paradigm. The first has its roots in natural science research and seeks to develop and justify theories (i.e. principles and laws) explaining or predicting organizational or human phenomena, while the second has its roots in engineering as well as in the arts, seeking to create useful innovative artefacts (in a broad sense: ideas, practices, techniques, products). “Such artefacts are not exempt from natural laws or behavioral theories”, as Hevner et al. (2004, p. 76) state in their presentation of a concise conceptual framework and clear guidelines for design science (of information systems), and continue: “To the contrary, their creation relies on existing kernel theories that are applied, tested, modified, and extended through the experience, creativity, intuition, and problem solving capabilities of the researcher.” In this sense, multi-criteria production theory (MCPT; presented in Sect. 3.1) may be such a kernel theory for non-financial performance measurement in general, and for DEA in particular. Possibly, MCPT opens a new path of substantial enhancement of DEA, in contrast to the review of Avkiran and Parker (2010, p. 3) who report “further evidence of a maturing DEA methodology involving a dwindling number of influential publications, and possibly, a saturation of methodological applications.”

With respect to the pitfalls in selecting inputs and outputs in DEA, Dyson et al. (2001, p. 248) suggest a protocol where any performance measures may be eliminated “that are not strongly related to the objectives of the organization. This might be achieved by a careful consideration of the consistency of the mission, objectives and performance measures.” Such a consideration is, however, the genuine part of a performance measurement methodology based on MCPT. It is an application of general decision theory to production systems as specific decision making units (DMUs), namely units which create and destroy values by transforming inputs into outputs. Thus, some fundamental premises of traditional production theory not scrutinized until now can be questioned and answered more systematically and in a more profound way. This holds especially true for the question: What are the relevant inputs and outputs of a production system which are to be modelled regarding the production manager’s perception of reality and preferences, in order to explain his or her behavior? MCPT may possibly also be fruitful in other areas of economics and management science where traditional production theories already play a role, such as the theory of the firm or management accounting.

The generalization of DEA by a decision-based (and in particular multi-criteria) production theory unlocks all of the knowledge and know-how of both descriptive and prescriptive decision theory, including MCDM and MAUT (Footnote 25). It provides a sound theoretical foundation for constructing efficiency measurement models in order to analyze the performance of production processes with respect to their positive and negative value impacts, which are called benefits and costs (Footnote 26). There may, however, be other consequences, influencing factors and production phenomena that cannot be modelled by value functions defined on inputs and outputs. Then the basic assumptions of Sect. 3.1 may have to be modified. Nevertheless, it must be emphasized that MCPT allows us to generalize traditional production theories and traditional DEA in such a way that all their propositions and methods remain true for the special case where the benefits and costs are measured directly by actual outputs and inputs.

The essential advantage of DEA models generalized by (linear) value functions in comparison to traditional DEA models is that they allow the users to strictly differentiate technological from non-technological assumptions, i.e. those concerning the actual (inputs and outputs of the considered) transformation process, such as the returns to scale from production, from those of value impact aspects, such as non-linear market responses, e.g. in case of monopolistic actors (as shown in Sect. 4.4). In the DEA literature, there are many examples of applications that take financial parameters as inputs or outputs without reflecting on the reality of the underlying assumptions, such as convexity or even linearity, that are being made.

MCPT forms a synthesis of both production and decision theory and offers fruitful insights into various problems in the application of traditional DEA, which have not been satisfactorily solved within the last four decades since the pioneering work of Charnes et al. (1978). Section 4 has been concerned with some of these research issues being discussed in recent reviews and overviews, namely the definition and selection of relevant inputs and outputs (cf. Cook et al. 2014), proper models for undesirable products and factors (cf. Zhou and Liu 2015) as well as rational inefficiencies (cf. Fandel 2009). Important results are:

  • Undesired factors, e.g. waste to be incinerated, should be defined and modelled by the preferences for them and not by their technological connection to desired products.

  • Multiple performance scores of a DMU derived by DEA for a preferentially consistent hierarchy of linear value functions are monotonically decreasing. This is of eminent practical relevance, particularly for a sustainability performance measurement based on a calculation of the total ecological impact by schemes such as the SEEBALANCE® method of BASF (cf. Fig. 1).

  • For preferentially consistent monotonic value functions, an inefficient production cannot be rational.

Thus, the generalization of DEA by MCPT throws light on some of the open questions in DEA ‘prior to choosing a model’ (Cook et al. 2014) as well as in respect of interpreting the results of efficiency analyses obtained with such models. It also forms a further step towards a broader conceptual framework for ‘measuring eco-efficiency of production with DEA’, as suggested by Dyckhoff and Allen (2001) and developed by Kuosmanen and Kortelainen (2005). By evaluating pressures instead of emissions (cf. Fig. 4), the latter authors understand their paper “as a step toward integrating the perspective of environmental impact assessment in industrial ecology with the frontier approach of environmental performance management in economics into a unified framework” (p. 70). Figure 4 shows that the implied chain of impacts is often complex (and uncertain). Moreover, most of such consequences of a production activity are usually not fully under the control of the DMU and may be influenced by non-linear relationships. Performance measurements used to influence and control DMUs responsible for such outcomes should take those impact relationships into account. MCPT allows us to clearly discriminate between the possible production activities under the full control of a DMU (the PPS), on the one hand, and their consequences for markets and other economic, social or ecological environments, modelled by multiple value functions, on the other hand.

There are areas other than sustainability management and ecological economics where a non-financial performance evaluation generalized by MCPT could play an important role in research and practice, too, even in fields of classical business economics, such as management accounting, production control and logistics. The examples in Sects. 4.3 and 4.4 calculating profit as a benefit at the last stage of (value impact) aggregation have shown that such a generalized performance measurement can also be understood as a kind of management accounting for uncertain or unknown prices. In spite of 40 years of development, DEA has not been very successful in those classical areas until now, although performance evaluation is one of the most prevalent topics in the practice as well as the research of business economics.