1 Background, aim, and scope

By now, the matrix approach to life cycle assessment (LCA) has received wide recognition. Proposed in the early 1990s (Möller 1992; Heijungs et al. 1992), it took a decade or so before this idea was embraced as a generally accepted way of doing LCA (see Heijungs and Suh 2006 for a short history of the matrix approach to LCA).

The matrix method was developed to solve the inventory problem in LCA. By the inventory problem, we mean the task of scaling all unit processes in the system in such a way that they exactly produce the reference flow (or functional unit), and of using these scaling factors to compute the inventory table. Formulas such as

$$ {\mathbf{g}} = {\mathbf{B}}{{\mathbf{A}}^{ - 1}}{\mathbf{f}} $$
(1)

(see Tables 1 and 2 for an explanation of the symbols involved) are now common in the specialized literature on life cycle inventory (LCI; Heijungs and Suh 2002; Peters 2007; Tan 2008).
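
To make the machinery concrete, here is a minimal numerical sketch of Eq. 1, using an invented two-process, two-intervention system; the numbers are purely illustrative and are not taken from any database.

```python
import numpy as np

# Technology matrix A: rows are economic flows, columns are unit processes
# (hypothetical example data).
A = np.array([[10.0,   0.0],    # e.g., electricity: produced by process 1
              [-2.0, 100.0]])   # e.g., fuel: used by 1, produced by 2
# Intervention matrix B: rows are environmental flows, columns are processes.
B = np.array([[1.0, 10.0],      # e.g., CO2 to air
              [0.1,  0.2]])     # e.g., SO2 to air
# Reference flow vector f.
f = np.array([1000.0, 0.0])

# Scaling factors s = A^{-1} f; solving the linear system avoids an
# explicit inverse here.
s = np.linalg.solve(A, f)
# Inventory vector g = B s = B A^{-1} f (Eq. 1).
g = B @ s
print(s)  # [100.   2.]
print(g)  # [120.  10.4]
```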

Table 1 Overview of symbols representing input data in the LCA
Table 2 Overview of symbols representing output results in the LCA and the formulas with which these results are obtained

The matrix approach has also been extended in various directions. For instance, research has been devoted to adding detail on:

  • the treatment of allocation and cutoff (Heijungs and Frischknecht 1998);

  • how to connect a process-based LCI to an input–output table (Suh and Huppes 2005);

  • how to efficiently compute an answer to the inventory problem (Peters 2007);

  • how to analyze the feedback structure of the system (Suh and Heijungs 2007);

  • how to calculate sensitivity coefficients (Heijungs 1994).

Most of these details indeed refer to exclusively inventory-oriented questions. The impact assessment follows the inventory results, so all issues related to cutoff, allocation, IO-based LCA, efficient algorithms, and structural path analysis are only of interest from an LCI point of view. For the last item mentioned, the sensitivity coefficients, the situation is different, however.

Sensitivity coefficients are important for both uncertainty analysis and sensitivity analysis (Heijungs 1994). In the context of uncertainty analysis, they provide essential information for a Taylor series expansion, and for sensitivity analysis, they provide the multipliers that enable one to distinguish sensitive from non-sensitive parameters, the so-called key issues for refined data collection (Heijungs 1996). But uncertainty and sensitivity analyses are important not only in LCI but in impact assessment as well. Moreover, the impact assessment adds uncertainty to the already uncertain results of the LCI. Likewise, not only the sensitivity of the inventory results is of interest but also (perhaps even more so) that of the impact assessment results.

The extension of the sensitivity coefficients from LCI to life cycle impact assessment was “left as an exercise” to the LCA practitioner. For instance, Heijungs and Suh (2002, p. 144) write that “In this way, all equations of LCA may be processed,” but they do not pursue this. It is a task that, we suspect, is not often carried out; at least, we have never seen the explicit results of such an exercise. This paper therefore aims to carry out this exercise and to make the results available. The formulas obtained can easily be implemented in matrix-based software for LCA. We have done so in CMLCA, a program for doing LCA, and some screenshots are shown at the end of the paper.

In this paper, we first review the basic equations of LCA. Then, we proceed to review the theory of sensitivity coefficients in general and their form in LCI. This finally leads to a derivation and coherent presentation of the sensitivity coefficients for the entire LCA process.

2 Materials and methods

2.1 Basic equations for LCA

The basic equations for LCA have been discussed in a consistent notation by Heijungs and Suh (2002). Below, we present two tables of symbols and equations connecting the fundamental concepts of LCA.

With respect to normalization, there are two situations which require separate treatment.

  • A vector of intervention totals, \( {\mathbf{\dot{g}}} \), can be defined for the reference situation. For instance, one can collect data on the emissions of CO2, SO2, etc., which then represent \( {\dot{g}_1} \), \( {\dot{g}_2} \), etc. These then can be processed by the same characterization model to yield the vector of category totals, \( {\mathbf{\dot{h}}} \). This then forms the basis of the normalization. Changing \( {\mathbf{\dot{g}}} \) will induce a change in \( {\mathbf{\dot{h}}} \), but \( {\mathbf{\dot{h}}} \) itself will not be changed directly by the LCA practitioner. We will refer to this as normalization case 1.

  • Alternatively, the vector of category totals \( {\mathbf{\dot{h}}} \) can be known without a detailed specification of the underlying interventions. In that case, \( {\mathbf{\dot{h}}} \) is not an output result (in the sense of belonging to Table 2), but input data (in the sense of belonging to Table 1). Thus, \( {\mathbf{\dot{h}}} \) can be changed directly, and it will affect the normalization results and the weighted index. We will refer to this as normalization case 2.

Both approaches in fact appear in practice and are therefore elaborated below as separate cases.

2.2 General theory of sensitivity coefficients

There are various situations in which the stability of the results, in terms of their sensitivity to perturbations of the input data, is of interest. In general, we may explore this issue as follows. Suppose that an output variable z depends on two input variables x and y and that the dependence is expressed by a function f:

$$ z = f\left( {x,y} \right). $$
(2)

The crucial element in determining sensitivity is the change of the result (Δz) caused by a marginal change in x (Δx) and by a marginal change in y (Δy). This is expressed using the partial derivatives:

$$ \Delta z = \frac{{\partial z}}{{\partial x}}\Delta x + \frac{{\partial z}}{{\partial y}}\Delta y. $$
(3)

Coefficients such as \( \frac{{\partial z}}{{\partial x}} \) and \( \frac{{\partial z}}{{\partial y}} \) are referred to as sensitivity coefficients in the present context. Their evaluation requires a specification of the function f. Thus, with f specified, the sensitivity coefficient of f with respect to input parameter x is defined as:

$$ \frac{{\partial z}}{{\partial x}} = \frac{{\partial f\left( {x,y} \right)}}{{\partial x}}. $$
(4)
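
As a trivial illustration of Eq. 4, assume the hypothetical function f(x, y) = x²y; the analytical coefficient ∂z/∂x = 2xy can be checked against a finite-difference approximation.

```python
def f(x, y):
    # Hypothetical example function, z = x^2 * y.
    return x**2 * y

x, y, dx = 3.0, 5.0, 1e-6
analytical = 2 * x * y                       # dz/dx, evaluated at (x, y)
numerical = (f(x + dx, y) - f(x, y)) / dx    # forward-difference estimate
print(analytical, numerical)                 # 30.0 vs approx. 30.0
```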

3 Results

3.1 Sensitivity coefficients for LCA

Table 2 provides a specification of the functions f in the context of LCA, so we insert the equations of Table 2 into the general equation for calculating a sensitivity coefficient (Eq. 4). Example calculations have been provided by Heijungs and Suh (2002, Eqs. (6.21), (6.26), and (6.29)). For the scaling factors, s, we have:

$$ \frac{{\partial {s_k}}}{{\partial {a_{ij}}}} = - {\left( {{{\mathbf{A}}^{ - 1}}} \right)_{ki}}{s_j} $$
(5)

and for the inventory results, g, we have:

$$ \frac{{\partial {g_k}}}{{\partial {a_{ij}}}} = - {\lambda_{ki}}{s_j} $$
(6)

for the dependence on the elements of A, and

$$ \frac{{\partial {g_k}}}{{\partial {b_{ij}}}} = {s_j}{\delta_{ik}} $$
(7)

for the dependence on the elements of B, where \( {\delta_{ik}} = \begin{cases} 1 & {\text{if }} i = k \\ 0 & {\text{otherwise}} \end{cases} \) denotes the Kronecker delta.
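
The following sketch checks Eqs. 5 and 6 by finite differences for the invented toy system used above (again, illustrative numbers only, not data from the paper).

```python
import numpy as np

A = np.array([[10.0, 0.0], [-2.0, 100.0]])   # toy technology matrix
B = np.array([[1.0, 10.0], [0.1, 0.2]])      # toy intervention matrix
f = np.array([1000.0, 0.0])

Ainv = np.linalg.inv(A)   # the full inverse is needed for the coefficients
s = Ainv @ f              # scaling factors
Lam = B @ Ainv            # Lambda = B A^{-1}, with entries lambda_ki

k, i, j = 0, 1, 0                 # one output index and one perturbed a_ij
ds_da = -Ainv[k, i] * s[j]        # Eq. 5: ds_k/da_ij = -(A^{-1})_ki s_j
dg_da = -Lam[k, i] * s[j]         # Eq. 6: dg_k/da_ij = -lambda_ki s_j

# Finite-difference check of Eq. 6.
eps = 1e-6
A2 = A.copy()
A2[i, j] += eps
g = B @ s
g2 = B @ np.linalg.solve(A2, f)
print(ds_da, dg_da, (g2[k] - g[k]) / eps)   # last two approx. -10.0
```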

It is clear that the higher one moves in the sequence inventory–characterization–normalization–weighting, the larger the number of sensitivity coefficients becomes: scaling factors depend only on the technology matrix, but the inventory results depend on both the technology matrix and the intervention matrix.

Table 3 gives a tabular overview of the derivatives for all expressions of matrix-based LCA.

Table 3 Overview of the sensitivity coefficients that express how LCA output results (columns) change if LCA input data (rows) data change

One remark on these coefficients: each of the formulas in Table 3 contains \( {\mathbf{A}}^{-1} \) or a symbol that depends on \( {\mathbf{A}}^{-1} \), such as s, Λ, or h. Thus, we need to go through the process of matrix inversion to evaluate the sensitivity coefficients. Moreover, although Ciroth et al. (2004) argue that one can solve LCAs without calculating a full matrix inverse, Table 3 shows that we do need the full matrix inverse when we wish to extend the analysis to include sensitivity and analytical uncertainty studies.

3.2 Perturbation analysis

Heijungs and Kleijn (2001) and Sakai and Yokoyama (2002) describe perturbation analysis as a way of investigating which input data are most decisive for the results in terms of their relative sensitivity. That is, given a system with the prototypical form

$$ z = f\left( {x,y} \right) $$
(8)

one investigates dimensionless multipliers, such as

$$ \frac{{\partial z}}{{\partial x}}\frac{x}{z}. $$
(9)

The idea is that a small change of either of the input parameters (say in \( x \), hence a change \( \Delta x \)) leads to a change in the result (\( z \), hence \( \Delta z \)), and that the relative change \( \frac{{\Delta z/z}}{{\Delta x/x}} \) can be approximated by \( \frac{{\partial z/z}}{{\partial x/x}} \). The results of Tables 2 and 3 can be combined to yield a complete overview of these multipliers (see Table 4).
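
A sketch of how these multipliers can be computed for the inventory result g_k, combining Eq. 6 with Eq. 9, again for the invented toy system (illustrative numbers only):

```python
import numpy as np

A = np.array([[10.0, 0.0], [-2.0, 100.0]])
B = np.array([[1.0, 10.0], [0.1, 0.2]])
f = np.array([1000.0, 0.0])

Ainv = np.linalg.inv(A)
s = Ainv @ f
Lam = B @ Ainv
g = B @ s

k = 0
# dg_k/da_ij = -lambda_ki s_j (Eq. 6); the [i, j] matrix of these is a
# (negative) outer product.
dgk_dA = -np.outer(Lam[k, :], s)
# Dimensionless multipliers (dg_k/da_ij)(a_ij/g_k), cf. Eq. 9; zero entries
# of A simply yield zero multipliers.
multipliers = dgk_dA * A / g[k]
print(np.round(multipliers, 3))
```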

Table 4 Overview of the relative sensitivity coefficients (multipliers) that express how small changes in input data propagate into changes in output results

3.3 Uncertainty analysis

Although the Monte Carlo method is at present the most frequently applied method for studying the propagation of input uncertainties into output uncertainties (Lloyd and Ries 2007), it has been recognized that this method may be too computationally intensive for application to large systems (Ciroth et al. 2004; Heijungs et al. 2005; Hong et al. 2008). Indeed, the present release of the ecoinvent data (v2.0, comprising almost 4,000 processes) has not been processed with Monte Carlo analysis, while the previous release (v1.3, 2,500 processes) contained such results, precisely for this reason. One way to address this is by means of smarter algorithms: Peters (2007) has proposed a power series expansion, also in connection with Monte Carlo analysis, and there are more sophisticated sampling strategies than the naive Monte Carlo method, such as Latin hypercube sampling and response surface methods (e.g., Morgan and Henrion 1990). In this paper, we address the problem by means of analytical error propagation. The theory of analytical error propagation, using Taylor series expansion, has been proposed for LCA before (Heijungs 1994; Ciroth et al. 2004; Hong et al. 2008). Taylor series expansions are based on the approximation formula for calculating the variance of a stochastic result from stochastic data (Bevington and Robinson 1994; Morgan and Henrion 1990). For a system of the form

$$ z = f\left( {x,y} \right) $$
(10)

it assumes the form

$$ {\rm var} (z) = {\left( {\frac{{\partial f}}{{\partial x}}} \right)^2}{\rm var} (x) + {\left( {\frac{{\partial f}}{{\partial y}}} \right)^2}{\rm var} (y) + 2\frac{{\partial f}}{{\partial x}}\frac{{\partial f}}{{\partial y}}{\rm cov} \left( {x,y} \right) $$
(11)

where var(x) is the variance of the variable x and cov(x,y) represents the covariance between the stochastic variables x and y.

In most cases, no covariance data are available, or the covariance can be assumed to be negligible because the uncertainties are independent. In those cases, we set the covariance to zero and obtain

$$ {\rm var} (z) = {\left( {\frac{{\partial f}}{{\partial x}}} \right)^2}{\rm var} (x) + {\left( {\frac{{\partial f}}{{\partial y}}} \right)^2}{\rm var} (y). $$
(12)

Such equations are for a general function f(x,y), and they require the evaluation of the derivatives \( \frac{{\partial f}}{{\partial x}} \) and \( \frac{{\partial f}}{{\partial y}} \). In the present case, we have specified the function (Table 2), and we have found equations for the derivatives (Table 3). So all ingredients are available to complete the structure of the uncertainty analysis.

Heijungs and Suh (2002, Eqs. (6.73) and (8.87)) elaborate this only for the inventory vector, and even then only partially, and with a typo. The complete and correct expression is

$$ {\rm var} \left( {{g_k}} \right) = \sum\limits_{i,j} {{{\left( {{s_j}{\lambda_{ki}}} \right)}^2}} {\rm var} \left( {{a_{ij}}} \right) + \sum\limits_j {{{\left( {{s_j}} \right)}^2}} {\rm var} \left( {{b_{kj}}} \right). $$
(13)
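
The following sketch evaluates Eq. 13 for the invented toy system and compares it with a brute-force Monte Carlo estimate; the input variances are made up for illustration (1% relative standard deviation).

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[10.0, 0.0], [-2.0, 100.0]])
B = np.array([[1.0, 10.0], [0.1, 0.2]])
f = np.array([1000.0, 0.0])
var_A = 1e-4 * A**2   # assumed variances, 1% relative std. dev.
var_B = 1e-4 * B**2

Ainv = np.linalg.inv(A)
s = Ainv @ f
Lam = B @ Ainv

k = 0
# Eq. 13: var(g_k) = sum_ij (s_j lambda_ki)^2 var(a_ij)
#                  + sum_j  (s_j)^2 var(b_kj)
var_gk = (np.outer(Lam[k, :], s)**2 * var_A).sum() + (s**2 * var_B[k, :]).sum()

# Monte Carlo comparison; feasible for this 2x2 toy, which is exactly what
# may fail for systems with thousands of processes.
samples = [(rng.normal(B, np.sqrt(var_B)) @
            np.linalg.solve(rng.normal(A, np.sqrt(var_A)), f))[k]
           for _ in range(20000)]
print(var_gk, np.var(samples))   # the two estimates should roughly agree
```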

The remaining expressions for the uncertainty of the impact assessment results are elaborated in Table 5.

Table 5 Overview of the variance of output results as a function of the variance of the input data

Again, note that there are two expressions for \( {\rm var} \left( {{{\tilde{h}}_k}} \right) \) and two expressions for \( {\rm var} (W) \), one for each of the two normalization cases.

3.4 Key issue analysis

Key issue analysis has been defined by Heijungs (1996) as the decomposition of the uncertainty of a result in terms of the contributions of the uncertainties of the input data. Morgan and Henrion (1990) refer to it as uncertainty importance, while Saltelli et al. (2000) use the (perhaps confusing) term sensitivity analysis. All expressions for the variance in Table 5 are the result of a weighted aggregation of a large number of variances of input data. For instance, the expression for \( {\rm var} (g_k) \) comprises two summations, one over i,j and one over j. Adding the interpretation of these indices, we see a summation over all economic flows and two summations over all processes. Take the case of ecoinvent v2.0, where the number of processes and of economic flows is almost 4,000: this variance is then the sum of roughly 16 million terms. Each of these terms is positive, so we may really consider \( (s_j\lambda_{ki})^2 {\rm var} (a_{ij}) \) and \( (s_j\delta_{ik})^2 {\rm var} (b_{ij}) \) as the contribution that one individual \( {\rm var} (a_{ij}) \) or \( {\rm var} (b_{ij}) \) makes to the total \( {\rm var} (g_k) \). We therefore define a number of dimensionless coefficients ζ, namely:

$$ \zeta \left( {{g_k},{a_{ij}}} \right) = \frac{{{{\left( {{s_j}{\lambda_{ki}}} \right)}^2}{\rm var} \left( {{a_{ij}}} \right)}}{{{\rm var} \left( {{g_k}} \right)}} $$
(14)

and

$$ \zeta \left( {{g_k},{b_{ij}}} \right) = \frac{{{{\left( {{s_j}{\delta_{ik}}} \right)}^2}{\rm var} \left( {{b_{ij}}} \right)}}{{{\rm var} \left( {{g_k}} \right)}} $$
(15)

as the relative contributions by each \( {\rm var} (a_{ij}) \) and \( {\rm var} (b_{ij}) \) to the total \( {\rm var} (g_k) \). Naturally,

$$ \sum\limits_{i,j} {\zeta \left( {{g_k},{a_{ij}}} \right) + } \sum\limits_j {\zeta \left( {{g_k},{b_{kj}}} \right)} = 1 $$
(16)

so these indeed represent relative contributions to the variance of the total.
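
A sketch of Eqs. 14-16 for the toy system, with the invented variances used above; the coefficients sum to one, as Eq. 16 requires.

```python
import numpy as np

A = np.array([[10.0, 0.0], [-2.0, 100.0]])
B = np.array([[1.0, 10.0], [0.1, 0.2]])
f = np.array([1000.0, 0.0])
var_A = 1e-4 * A**2   # invented input variances
var_B = 1e-4 * B**2

Ainv = np.linalg.inv(A)
s = Ainv @ f
Lam = B @ Ainv

k = 0
contrib_A = np.outer(Lam[k, :], s)**2 * var_A   # (s_j lambda_ki)^2 var(a_ij)
contrib_B = s**2 * var_B[k, :]                  # (s_j)^2 var(b_kj)
var_gk = contrib_A.sum() + contrib_B.sum()

zeta_A = contrib_A / var_gk   # Eq. 14
zeta_B = contrib_B / var_gk   # Eq. 15, for the surviving terms i = k
print(np.round(zeta_A, 3), np.round(zeta_B, 3))
print(zeta_A.sum() + zeta_B.sum())   # 1.0, cf. Eq. 16
```

Sorting all ζ values in decreasing order then immediately yields the key issues: the few input variances that dominate the output variance.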

Table 6 provides an overview of the expressions for the different types of key issues.

Table 6 Overview of the contributions to the variance of output results by the individual variance of the input data

3.4.1 Example

We have programmed these equations into CMLCA (http://www.cmlca.eu/). As an example, we loaded ecoinvent v1.3 (2,630 unit processes) and calculated the system for a reference flow of 1 kWh of Swiss electricity, low voltage, at grid. In the example below, we restrict the analysis to the inventory results, and to the flow Carbon dioxide, fossil, emitted to air in areas of low population density.

We performed a perturbation analysis, concentrating on perturbations of the technology matrix. Figure 1 shows a screenshot of the results, hence of the multipliers \( \gamma_k(a_{ij}) \) for k being the CO2 flow and for all i (economic flows) and all j (processes). A not-too-modern computer was able to complete the calculations within a few minutes.

Fig. 1
figure 1

Screenshot of CMLCA showing the results of a perturbation analysis, the sensitivity of an output result for small changes of the input data. Only multipliers smaller than −0.3 or larger than 0.3 are shown

We also performed an uncertainty analysis. Figure 2 shows a screenshot of these results, tabulating \( {\rm var} (g_k) \) (along with some other statistics) for k being the CO2 flow. Again, results are obtained within a few minutes.

Fig. 2
figure 2

Screenshot of CMLCA showing the results of an analytical uncertainty analysis consisting of a number of characteristic parameters for the uncertainty of an output result

We finally performed a key issue analysis. Figure 3 shows a screenshot of these results, tabulating \( \zeta (g_k, a_{ij}) \) and \( \zeta (g_k, b_{ij}) \) for k being the CO2 flow and for all i (economic flows) and all j (processes). Again, a few minutes suffice.

Fig. 3
figure 3

Screenshot of CMLCA showing the results of a key issue analysis, the decomposition of the uncertainty of an output result in terms of contributions by the uncertainties of the input data. Contributions below 1% are not shown

Here, we see that only four coefficients make up 80% of the uncertainty of the CO2 emission. In other words, we can obtain a more reliable, less uncertain result by trying to find more accurate data for these four coefficients.

4 Discussion

Three important restrictions must be borne in mind.

First, the formulas are for the “normal” LCI. This means that the equations become more complicated once we incorporate other features and developments, such as allocation and IO-based LCA. Heijungs et al. (2006) show how the formula for \( \frac{{\partial {g_k}}}{{\partial {a_{ij}}}} \) changes when the inventory is done using a hybrid method, combining process-based and IO-based LCA. This modification may be propagated to the impact assessment level as well.

Second, some of the formulas (namely, those in Tables 4, 5, and 6) are based on a first-order Taylor series approximation. This means that they are correct for small changes, uncertainties, and perturbations, but not necessarily for larger ones. To address this, one may include second-order (or higher-order) terms. For instance, for the perturbations, we may improve the approximation by going from

$$ \Delta z = \frac{{\partial z}}{{\partial x}}\Delta x $$
(17)

to

$$ \Delta z = \frac{{\partial z}}{{\partial x}}\Delta x + \frac{1}{2}\frac{{{\partial^2}z}}{{\partial {x^2}}}{\left( {\Delta x} \right)^2} + \frac{1}{6}\frac{{{\partial^3}z}}{{\partial {x^3}}}{\left( {\Delta x} \right)^3} + \cdots $$
(18)

This requires the evaluation not only of \( \frac{{\partial z}}{{\partial x}} \) but also of the second derivative \( \frac{{{\partial^2}z}}{{\partial {x^2}}} \), or even higher derivatives. Table 3 might be extended to include expressions for \( \frac{{{\partial^2}{s_k}}}{{\partial {{\left( {{a_{ij}}} \right)}^2}}} \) and similar terms. Alternatively, the approach of Sherman and Morrison (1950) can be used to provide an exact answer to this question, as sketched below.
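
To illustrate the Sherman-Morrison alternative: a finite change δ in a single coefficient a_ij is a rank-one update of A, so the new inverse, and hence the exact new scaling vector, follows without a second inversion. The sketch below uses the invented toy system and compares the exact result with the first-order approximation of Eq. 5.

```python
import numpy as np

A = np.array([[10.0, 0.0], [-2.0, 100.0]])
f = np.array([1000.0, 0.0])

Ainv = np.linalg.inv(A)
s = Ainv @ f

i, j, delta = 1, 1, -10.0   # a finite (not marginal) change of a_ij
# Sherman-Morrison for A' = A + delta e_i e_j^T:
# A'^{-1} = A^{-1} - delta (A^{-1} e_i)(e_j^T A^{-1}) / (1 + delta (A^{-1})_ji)
Ainv_new = Ainv - delta * np.outer(Ainv[:, i], Ainv[j, :]) \
                  / (1.0 + delta * Ainv[j, i])
s_exact = Ainv_new @ f

# First-order Taylor approximation based on Eq. 5.
s_taylor = s - delta * Ainv[:, i] * s[j]
print(s_exact)    # [100.     2.2222]
print(s_taylor)   # [100.     2.2], close but not exact
```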

Third and finally, the formulas for uncertainty (Tables 5 and 6) neglect the covariance between input variables. As noted above, the complete first-order expression includes an additional term for the covariance between input data:

$$ {\rm var} (z) = {\left( {\frac{{\partial f}}{{\partial x}}} \right)^2}{\rm var} (x) + {\left( {\frac{{\partial f}}{{\partial y}}} \right)^2}{\rm var} (y) + 2\frac{{\partial f}}{{\partial x}}\frac{{\partial f}}{{\partial y}}{\rm cov} \left( {x,y} \right). $$
(19)

It is possible to carry out the program of this paper with covariance included, but the expressions become much more complicated. For instance, for \( {\rm var} (g_k) \), we obtain

$$ {\rm var} \left( {{g_k}} \right) = \sum\limits_{i,j} {{{\left( {{s_j}{\lambda_{ki}}} \right)}^2}} {\rm var} \left( {{a_{ij}}} \right) + \sum\limits_j {{{\left( {{s_j}} \right)}^2}} {\rm var} \left( {{b_{kj}}} \right) + 2\sum\limits_{i,j,l,m} {{s_j}{\lambda_{ki}}{s_m}{\lambda_{kl}}} {\rm cov} \left( {{a_{ij}},{a_{lm}}} \right) + 2\sum\limits_{j,l} {{s_j}{s_l}} {\rm cov} \left( {{b_{kj}},{b_{kl}}} \right) + 2\sum\limits_{i,j,l} {{s_j}{\lambda_{ki}}{s_l}} {\rm cov} \left( {{a_{ij}},{b_{kl}}} \right). $$
(20)

The last three terms require us to specify \( {\rm cov} (a_{ij}, a_{lm}) \), \( {\rm cov} (b_{kj}, b_{kl}) \), and \( {\rm cov} (a_{ij}, b_{kl}) \) for all combinations of processes, economic flows, and environmental flows. Although some uncertainties will definitely be correlated (for instance, the fuel input of a combustion process and the CO2 emission of the same process may have a correlation close to 1), most uncertainties will be uncorrelated, or information on such correlations is lacking. The infrastructure needed (much memory for storing a number of covariance matrices, much more complicated formulas, much more data collection and estimation) will probably not offset the relatively limited gain of a slightly more accurate computation. In the end, there is always something perverse about knowing the uncertainty with certainty.

A question that always arises in connection with analytical error propagation is whether it only works for normally distributed uncertainties. This is not the case. The theory of analytical error propagation through Taylor series expansions (Morgan and Henrion 1990, p. 183 ff.) nowhere contains an assumption of normally distributed variables. The only assumption is that, for a first-order approximation, the function should be sufficiently close to a linear function within the range of uncertainty. This point has been addressed above. A practical issue is, of course, that the formulas require a specification of the variance, whereas most distributions are specified without an explicit variance. For instance, a uniform distribution is often specified in terms of its width, and a lognormal distribution in terms of the geometric standard deviation (or its square, as in ecoinvent). Heijungs and Frischknecht (2005) provide formulas to easily calculate a variance from the standard parameters of a normal, lognormal, uniform, and triangular distribution; a sketch of such conversions is given below. As we can see in Table 5, the propagated variances are the sum of a large number of terms. Following the central limit theorem (e.g., Morgan and Henrion 1990), such a propagated uncertainty will tend to become normally distributed, provided that the input uncertainties are independent.
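
A sketch of such conversion formulas, using standard statistical identities for the variance of these four distributions (the parameter conventions, e.g., specifying the lognormal by its median and geometric standard deviation, are our assumptions here; see Heijungs and Frischknecht 2005 for the LCA conventions):

```python
import math

def var_normal(sigma):
    # Variance of a normal distribution from its standard deviation.
    return sigma**2

def var_uniform(lower, upper):
    # Variance of a uniform distribution on [lower, upper].
    return (upper - lower)**2 / 12.0

def var_triangular(lower, mode, upper):
    # Variance of a triangular distribution with the given min, mode, max.
    a, b, c = lower, upper, mode
    return (a*a + b*b + c*c - a*b - a*c - b*c) / 18.0

def var_lognormal(median, gsd):
    # Variance of a lognormal distribution specified by its median and
    # geometric standard deviation (ecoinvent stores the square of the gsd).
    s2 = math.log(gsd)**2   # variance of the underlying normal
    return median**2 * math.exp(s2) * (math.exp(s2) - 1.0)

# Example: a flow with median 1.0 and geometric standard deviation 1.2.
print(var_lognormal(1.0, 1.2))   # approx. 0.035
```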

5 Conclusions

This paper has carried out the “exercise” that was left open by Heijungs and Suh (2002) and related work: deriving the complete set of sensitivity coefficients for matrix-based LCA.

6 Recommendations and perspectives

To some readers, the formulas in Tables 4, 5, and 6 may appear intimidating and offer little insight. But they are straightforward to implement in computer code. Once implemented, doing an uncertainty analysis is as easy as doing a Monte Carlo analysis: click a button and wait for the results. Likewise, doing a key issue analysis is as easy as doing a classical contribution analysis.

We hope that the availability of these equations will stimulate developers of software, commercial or not, to implement analytical, Taylor series-based approaches toward uncertainty and sensitivity analysis. In particular for the key issue analysis, no good Monte Carlo approach is available, and the analytical solution using Table 6 provides an extremely powerful way of reducing the uncertainties in LCA. For perturbation and uncertainty analysis, numerical approaches can be used as well, but these are extremely time-consuming for large LCA systems.

As noted above, Monte Carlo analyses for large LCA systems may be infeasible. This paper develops the sensitivity coefficients that serve to derive the formulas for analytical error propagation based on a first (or higher)-order Taylor series approximation. This approach has its own limitations, for instance, a more restricted range of validity. We therefore welcome the parallel development of more sophisticated sampling methods, such as Latin hypercube sampling and response surface methods, as well as of more efficient algorithms, such as those based on a power series expansion. Yet, even if these should overtake analytical error propagation, we still see a role for the sensitivity coefficients in perturbation analysis and in key issue analysis.