The ecoinvent database version 3 (part II): analyzing LCA results and comparison to version 2

Version 3 of ecoinvent includes more data, new modeling principles, and, for the first time, several system models: the “Allocation, cut-off by classification” (Cut-off) system model, which replicates the modeling principles of version 2, and two newly introduced models called “Allocation at the point of substitution” (APOS) and “Consequential” (Wernet et al. 2016). The aim of this paper is to analyze and explain the differences in life cycle impact assessment (LCIA) results of the v3.1 Cut-off system model in comparison to v2.2 as well as the APOS and Consequential system models. To do this, functionally equivalent datasets were matched across database versions and their LCIA results compared to each other. In addition, the contribution of specific sectors was analyzed. The importance of new and updated data as well as new modeling principles is illustrated through examples. Differences were observed between all database versions using the impact assessment methods Global Warming Potential (GWP100a), ReCiPe Endpoint (H/A), and Ecological Scarcity 2006 (ES’06). The largest differences were found for the comparison of v3.1 Cut-off and v2.2. On average, LCIA results increased by 6, 8, and 17 % and showed a median dataset deviation of 13, 13, and 21 % for GWP, ReCiPe, and ES’06, respectively. These changes are due to the simultaneous update and addition of new data as well as the introduction of global coverage and spatially consistent linking of activities throughout the database. As a consequence, supply chains are now globally better represented than in version 2 and lead, e.g., in the electricity sector, to more realistic life cycle inventory (LCI) background data. LCIA results of the Cut-off and APOS models are similar and differ mainly for recycling materials and wastes.
In contrast, LCIA results of the Consequential version differ notably from the attributional system models, which is to be expected due to fundamentally different modeling principles. The use of marginal instead of average suppliers in markets, i.e., consumption mixes, is the main driver for result differences. LCIA results continue to change as LCI databases evolve, which is confirmed by a historical comparison of v1.3 and v2.2. Version 3 features more up-to-date background data as well as global supply chains and should, therefore, be used instead of previous versions. Continuous efforts will be required to decrease the contribution of Rest-of-the-World (RoW) productions and thereby improve the global coverage of supply chains.


Introduction
Life cycle assessment (LCA) is a data-intensive methodology, i.e., a typical life cycle covers thousands of unit processes. This information cannot usually be gathered within a single LCA project due to the high cost that would be involved in data collection. It is, therefore, common practice to focus data collection efforts on selected activities that reflect the space for action, often called the foreground system (Finnveden et al. 2009), and to use generic data from life cycle inventory (LCI) databases to model the remaining activities, often called the background system (Bourgault et al. 2012; Tillman 2000). The background system usually covers up to 99 % of the unit processes in the product system; only in rare cases does the number of unit processes modeled explicitly in the foreground system exceed 5 % (Reinhard et al. submitted). Bearing this in mind, background or LCI databases can be considered the backbone of any LCA study. Therefore, the available quantity and quality of unit process data provided by LCI databases are of utmost importance.
Each update of the ecoinvent database introduces new and updated datasets. Both updated and new data can lead not only to direct changes but also to changes in the supply chain of other datasets. Database updates generally lead to an increase in the number of datasets in the database. For example, the number of activities in versions 1.3, 2.2, and 3.1 (Cut-off system model) of the ecoinvent database has increased from 2632 to 4087 and then to 11,301. As LCI databases evolve, the life cycle impact assessment (LCIA) results can be expected to change as well. Nevertheless, the transition from version 2 (2007) to version 3 (2013) of ecoinvent is special in the sense that it involves, in addition to new data, a set of new modeling principles, which lead to systematic changes in the network structure. Version 3 also features, for the first time, three different system models: the “Allocation, cut-off by classification” (Cut-off), the “Allocation at the point of substitution” (APOS), and the “Consequential” system model. An overview of the methodology of version 3 is provided in Wernet et al. (2016) as well as the ecoinvent data quality guidelines.
Against this background, the aim of this article is to analyze the LCIA results and provide a better understanding of the differences between the latest updates of versions 2 (v2.2) and 3 (v3.1) of the ecoinvent database as well as between the new system models.

General approach
The clear identification of reasons for deviations between different database versions of ecoinvent is a non-trivial task. LCI databases represent highly interconnected systems where almost everything is connected to everything else, i.e., they are characterized by a high degree of integration. We can measure the degree of integration of a database by computing the average number of supply chain inputs throughout all product systems. Both the degree of integration and the average number of supply chain processes that contribute to a product system have increased from roughly 2400 processes (59 % integration) in v2.2 to 7500-9000 processes (70-80 % integration) in v3.1, depending on the system model (Electronic Supplementary Material, Table S1). That is, if we randomly pick one of the product systems in v3.1, more than 70 % of the unit processes in the database will be contained in the upstream supply chain. Improvements in existing data and addition of new processes, therefore, affect most of the product systems in the database.
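The degree of integration described above can be illustrated with a small graph traversal. The following is a minimal sketch, not the procedure used for Table S1: the toy database, the function names, and the choice to exclude a process from its own upstream set and to divide by the total number of processes are all assumptions made for illustration.

```python
from collections import deque

def upstream_processes(inputs, start):
    """Collect every process found upstream of `start`, excluding `start` itself."""
    seen = set()
    queue = deque(inputs.get(start, ()))
    while queue:
        process = queue.popleft()
        if process in seen:
            continue
        seen.add(process)
        queue.extend(inputs.get(process, ()))
    seen.discard(start)  # assumption: a process does not count toward its own supply chain
    return seen

def degree_of_integration(inputs):
    """Return (mean number of upstream processes, that mean as a share of all processes)."""
    processes = set(inputs)
    for suppliers in inputs.values():
        processes.update(suppliers)
    counts = [len(upstream_processes(inputs, p)) for p in processes]
    mean_count = sum(counts) / len(processes)
    return mean_count, mean_count / len(processes)

# Hypothetical toy database: each process lists its direct suppliers.
toy_db = {
    "steel": ["electricity", "coal"],
    "electricity": ["coal", "steel"],
    "coal": ["electricity"],
    "wood": [],
}
mean_count, integration = degree_of_integration(toy_db)
print(mean_count, integration)
```

In this toy case, three of the four processes sit in a loop and therefore share the same upstream supply chain, which is the kind of interconnection that makes an update to one dataset propagate to most product systems.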
A comparison of v2.2 and v3.1 therefore captures the cumulative effect of more than 6000 new processes, 3500 updated intermediate exchanges, and 4000 updated elementary exchanges. All of these changes can influence LCIA results. In addition, new modeling principles have been introduced that lead to changes in the network structure compared to v2.2.
In order to analyze the difference in LCIA results between different versions of the ecoinvent database, we proceeded as follows: First, we compared the LCIA results to each other and evaluated the differences by statistical measures. Then, different reasons for the observed differences were explored. For the comparison of v3.1 to v2.2, this included a qualitative analysis of the changes induced by new and updated datasets as well as an analysis of the effect of the newly introduced modeling principle of global activity coverage. For the comparison of the system models of version 3, the analysis focused on the differences in modeling principles, as the underlying data are the same.

Matching of datasets across database versions
When comparing across database versions, two distinct perspectives can be adopted: to analyze each database entirely by itself or to analyze a matched sample of datasets in each database, i.e., where each process has a corresponding process in the other database version. The strength of the “complete database” approach is that it analyzes a certain question across all datasets in the database, providing a complete picture of the database. The strength of the “matched sample” approach is that the basis for comparison is the same and therefore selected factors of influence can be isolated. For example, when analyzing the geographical distribution of environmental impacts within two versions of a database, the complete database approach provides the complete picture for each database. It is, however, subject to the datasets contained in each database version, which makes direct comparisons difficult, e.g., as the database has grown with each version. The matched sample approach is more useful in this respect, as the underlying datasets cover the same scope. Therefore, the influence of data or systematic changes can be identified more easily. The drawback is, of course, that such comparisons cannot speak for the whole database.
In order to implement the matched sample approach, functionally identical processes needed to be matched between databases. Processes in the v3.1 system model databases as well as v1.3 and v2.2 were matched based on their attributes, such as product, name, location, and unit. For the comparison of v2.2 to v3.1, a matching list provided by ecoinvent was used as a basis. In order to compare v2.2 to v3.1, we use the Cut-off system model as it replicates the modeling principles of v2.2 (Frischknecht et al. 2005; Wernet et al. 2016). Table 1 shows the sample sizes for each of the database comparisons. Most datasets of version 1.3 could be compared with datasets in v2.2, which is due to the fact that v2.2 was mainly characterized by data additions. During the transition from version 2 to 3, many changes occurred simultaneously. In addition to a revision of the naming convention, some sectors were updated and in part changed in structure, which results in datasets that can no longer be directly matched. Based on the matching list provided by ecoinvent, 67 % of v2.2 datasets could be matched with v3.1 datasets. Conversely, only 24 % of v3.1 datasets could be matched, owing to the overall increase in the number of datasets. The number of datasets in v3.1 is not the same in each system model due to differences in the modeling principles (Wernet et al. 2016), e.g., the consequential database does not include datasets for the production of by-products due to the substitution approach. Nevertheless, high numbers of datasets could be matched between the system model databases.
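Attribute-based matching of this kind can be sketched as a dictionary lookup. The example below is only an illustration under simplifying assumptions: the dataset records are invented, the field names are hypothetical, and exact equality of all four attributes is assumed, whereas the v2.2 to v3.1 comparison relied on a matching list because names had changed.

```python
def match_key(dataset):
    """Build a comparison key from the attributes named in the text."""
    return (dataset["product"], dataset["name"], dataset["location"], dataset["unit"])

def match_datasets(old_db, new_db):
    """Pair each dataset of the old database with its counterpart in the new one, if any."""
    new_by_key = {match_key(ds): ds for ds in new_db}
    return [(ds, new_by_key[match_key(ds)]) for ds in old_db if match_key(ds) in new_by_key]

# Hypothetical records; real databases hold thousands of datasets.
old_db = [
    {"product": "electricity", "name": "electricity production, hydro", "location": "CH", "unit": "kWh"},
    {"product": "cement", "name": "cement production", "location": "CH", "unit": "kg"},
    {"product": "lithium", "name": "electrolysis of lithium chloride", "location": "GLO", "unit": "kg"},
]
new_db = [dict(ds) for ds in old_db[:2]] + [
    {"product": "electricity", "name": "electricity production, hard coal", "location": "ZA", "unit": "kWh"},
    {"product": "cement", "name": "cement production", "location": "RoW", "unit": "kg"},
    {"product": "tofu", "name": "tofu production", "location": "RoW", "unit": "kg"},
]

matches = match_datasets(old_db, new_db)
share_old = len(matches) / len(old_db)  # matched share seen from the older database
share_new = len(matches) / len(new_db)  # matched share seen from the newer database
print(len(matches), share_old, share_new)
```

The two shares differ for the same reason as in the text: the number of matches is fixed, while the newer database is larger.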

Comparison of LCIA results
LCIA results were calculated for all matched datasets of the different database versions and compared using Eq. (1), where d is the relative deviation (in percent) of an LCIA result r in one database version over a reference database (in this example, the deviation of v3.1 Cut-off results compared to v2.2 results). In the LCIA comparisons, the older version is usually taken as the reference to answer the question of how much LCIA results of datasets in the newer database differ from the older version. For the system model in version 3, we compare how much the Cut-off version differs from the other versions. Due to some unit (e.g., MJ to kWh) and sign changes of the reference product (e.g., in treatment datasets), conversion factors were used to match LCIA results of v2.2 and v3.1. In addition, the absolute value of the relative deviation is calculated for each matched dataset, as in Eq. (2).
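Equations (1) and (2) are not reproduced in this text. Based on the verbal definitions above, they can plausibly be written as follows; the subscript notation is an assumption, and the example uses v2.2 as the reference database, as in the paragraph above.

```latex
\begin{align}
d &= \frac{r_{\mathrm{v3.1\,Cut\text{-}off}} - r_{\mathrm{v2.2}}}{r_{\mathrm{v2.2}}} \times 100\,\% \tag{1} \\
|d| &= \left| \frac{r_{\mathrm{v3.1\,Cut\text{-}off}} - r_{\mathrm{v2.2}}}{r_{\mathrm{v2.2}}} \right| \times 100\,\% \tag{2}
\end{align}
```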
Results for the deviation of all matched datasets are displayed in histograms (e.g., Fig. 2), which show both the relative deviation (negative and positive percentages) and the absolute deviation. The latter is displayed cumulatively for all datasets to inform about the total percentage of datasets that deviate less than a certain amount from the datasets in the reference database.
In addition, median values for the relative and absolute deviations are calculated. The median of the relative deviation indicates whether LCIA results increase or decrease on average compared to the reference database. We therefore call it the median database deviation (M_DB). The median of the absolute deviation expresses by how much datasets differ on average between the databases. We therefore call it the median dataset deviation (M_DS). The mean values for relative and absolute deviation were not found to be useful as they are too dominated by outliers. All datasets are given equal weight in these comparisons despite the fact that they might be of different relevance from an economic perspective or to LCA practitioners.
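The two summary statistics can be sketched in a few lines. The numbers below are invented for illustration and the function names are hypothetical; only the definitions (median of the relative deviation, median of its absolute value) come from the text.

```python
from statistics import mean, median

def relative_deviation(result, reference):
    """Relative deviation in percent of a result over the reference result."""
    return 100.0 * (result - reference) / reference

def median_deviations(pairs):
    """Return (median database deviation, median dataset deviation) in percent."""
    deviations = [relative_deviation(result, reference) for result, reference in pairs]
    m_db = median(deviations)                  # signed: direction of the overall shift
    m_ds = median(abs(d) for d in deviations)  # unsigned: typical size of the change
    return m_db, m_ds

# Hypothetical (new result, reference result) pairs for five matched datasets.
pairs = [(12.0, 10.0), (9.0, 10.0), (10.5, 10.0), (30.0, 10.0), (5.0, 10.0)]
m_db, m_ds = median_deviations(pairs)
mean_dev = mean(relative_deviation(result, reference) for result, reference in pairs)
print(m_db, m_ds, mean_dev)
```

In this toy sample a single dataset that triples pulls the mean deviation far above the median, which illustrates why the medians were preferred.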

Comparison of process contributions
In addition to knowing how LCIA results compare to each other, the cause of these impact changes is of core interest. We have therefore calculated the contribution of each process throughout all product systems. This information can be stored in a contribution matrix, where columns are product systems and rows are process contributions (illustrated in Fig. 1). The sum of a column adds up to the impact score of that particular product system. To obtain a relative contribution matrix, we have normalized each column by the total impact score of its product system, i.e., all entries in a column add up to one (Reinhard et al. submitted). The relative contribution matrix is an efficient tool to identify the mean contribution of individual processes or sectors throughout the entire (or a selected subset of the) database. When contributing processes are grouped (vertical aggregation of the matrix), the individual rows are summed up. When product systems are grouped, this is done by taking the arithmetic mean (horizontal aggregation of the matrix). We use the arithmetic mean because we are interested in typical values representing the “real” balance point of the set of contributions associated with a process (Bulmer 1979). The horizontal aggregation can be done, depending on the aim, for all product systems in the database or only for those product systems that have a match in another database version. In this paper, we mainly apply the matched sample approach, or, more precisely, the “all-to-matched” perspective (see Fig. 1). It tells us the relative contribution of all processes related to a specific issue (e.g., electricity production) throughout the subset of product systems that can be compared across different database versions. The complete database approach is only used in Fig. 4, where vertical aggregation is performed at the level of geographies.
Each product system is given equal weight, meaning that we assume a uniform distribution of importance across all product systems in the database. Consequently, the assessment of the most important processes is determined by the processes and sectors contained in the database. Therefore, results must be seen as an inward perspective on the database itself, meaning that they cannot be compared directly to results from input-output databases or other statistical sources (Majeau-Bettez et al. 2011; Reinhard et al. submitted).
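The normalization and the two aggregation steps can be sketched with plain lists. This is a toy illustration only: the matrix values, the sector labels, and the function names are assumptions, and the actual matrices have thousands of rows and columns.

```python
def normalize_columns(matrix):
    """Divide every entry by its column total so that each column sums to one."""
    totals = [sum(column) for column in zip(*matrix)]
    return [[value / total for value, total in zip(row, totals)] for row in matrix]

def aggregate_rows(matrix, groups):
    """Vertical aggregation: sum the rows that belong to the same group (e.g., a sector)."""
    grouped = {}
    for row, group in zip(matrix, groups):
        accumulated = grouped.setdefault(group, [0.0] * len(row))
        for j, value in enumerate(row):
            accumulated[j] += value
    return grouped

def mean_contribution(row, columns):
    """Horizontal aggregation: arithmetic mean over the selected product systems."""
    return sum(row[j] for j in columns) / len(columns)

# Hypothetical absolute contributions: rows are processes, columns are product systems.
contributions = [
    [4.0, 1.0, 2.0, 0.0],  # electricity, hard coal
    [2.0, 1.0, 0.0, 1.0],  # electricity, hydro
    [2.0, 2.0, 2.0, 3.0],  # road transport
]
row_groups = ["electricity", "electricity", "transport"]

relative = normalize_columns(contributions)
by_sector = aggregate_rows(relative, row_groups)
all_systems = mean_contribution(by_sector["electricity"], [0, 1, 2, 3])
matched_only = mean_contribution(by_sector["electricity"], [0, 2])  # columns with a match
print(all_systems, matched_only)
```

Restricting the mean to the matched columns corresponds to the “all-to-matched” perspective: all processes may contribute, but only comparable product systems are averaged.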

Choice of impact assessment methods
While calculations could theoretically be carried out for all existing impact assessment methods, we limited the analysis and discussion to three key indicators: the Global Warming Potential (GWP) for a time horizon of 100 years (IPCC 2007) and two frequently used fully aggregated methods, ReCiPe Endpoint (H/A) (Goedkoop et al. 2009) and Ecological Scarcity 2006 (ES'06) (Frischknecht et al. 2008). These, and not the latest versions of the GWP and Ecological Scarcity Methods, were used to avoid a bias in the database comparison, as v2.2 data did not consistently support the application of the new methods without adaptations. In addition, database comparisons for selected CML-IA midpoint indicators (Guinée et al. 2002) are included in the SI.

Historical perspective: comparison of v2.2 and v1.3
In order to provide a reference for the magnitude of differences due to version changes in the past, the LCIA results of v2.2 were compared to those of v1.3. Figure 2 shows that LCIA results for individual datasets have both increased and decreased. The median dataset deviation is 3.6, 4.3, and 5.7 % for GWP, ReCiPe, and ES'06, respectively. A median increase of all datasets of 1 and 1.3 % was observed for GWP and ReCiPe, while a decrease of 3.1 % was calculated for ES'06 (ESM, Table S2 and Fig. S1 for midpoint results).
[...] (Table 2) than in v2.2. The average difference (median of dataset deviation) between datasets in v2.2 and v3.1 Cut-off is 13, 13, and 21 % for GWP, ReCiPe, and ES'06. For GWP, ReCiPe, and ES'06, respectively, the impact score of 39 %/38 %/54 % of all datasets deviates by more than 20 %, and 4 %/6 %/7 % of all datasets deviate by more than 100 %.

Reasons for differences
Updates of existing activities
We distinguish and discuss improvements for selected existing activities according to the following classification:
• Allocation: this category covers changes in the applied allocation paradigm.
• Completeness: this category includes direct changes which contribute to the completeness of an activity.
• Models for inventory data: this category includes changes focusing on the improvement of inventory modeling.
Allocation
In v2.2, several products resulting from electrolysis were mass allocated. For example, the activity “Electrolysis of lithium chloride (GLO)” produces 0.15 kg lithium (reference product) and 0.75 kg gaseous chlorine (by-product). Consequently, 83 % of the environmental impact followed the by-product of the activity. This approach was revised for v3.1 in the relevant system models. Since the atomic mass of the lithium and the chlorine cannot be considered as the driver for the electricity demand of the electrolysis, mass allocation was replaced with economic allocation, the approach preferred according to ISO (2006) when physical relationships offer no realistic possibility. This causes a large deviation in GWP as the revenue from lithium is significantly higher than from gaseous chlorine. As a result, lithium is now associated with around six times higher environmental impacts than before and the impacts associated with gaseous chlorine drop significantly.
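The arithmetic behind the lithium example can be sketched as follows. The output masses come from the text; the prices are purely hypothetical placeholders chosen only to show the mechanism, and the function name is an assumption.

```python
def allocation_factors(amounts, prices=None):
    """Share of the activity's burden per product.

    amounts: product -> mass produced (kg)
    prices:  product -> price per kg; None means mass allocation
    """
    if prices is None:
        prices = {product: 1.0 for product in amounts}
    values = {product: amounts[product] * prices[product] for product in amounts}
    total = sum(values.values())
    return {product: value / total for product, value in values.items()}

outputs = {"lithium": 0.15, "chlorine": 0.75}  # kg, as stated in the text

mass_based = allocation_factors(outputs)

# Hypothetical prices per kg, NOT taken from ecoinvent.
assumed_prices = {"lithium": 100.0, "chlorine": 0.2}
economic = allocation_factors(outputs, assumed_prices)

increase = economic["lithium"] / mass_based["lithium"]
print(mass_based, economic, increase)
```

Because lithium carries one sixth of the burden under mass allocation, its impacts can rise by at most a factor of six; an increase of around six therefore suggests that economic allocation assigns nearly the entire burden to lithium.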
A similar case can be made for all spatial and technological variants of the chlor-alkali electrolysis which produces gaseous chlorine (reference product), sodium hydroxide, and liquid hydrogen in co-production. The adaptation in allocation of the mentioned activities increases the impacts associated with liquid hydrogen roughly by a factor of 10, while the impacts of gaseous chlorine and sodium hydroxide decrease somewhat.
Completeness
A notable improvement in this category concerns capital equipment, e.g., “port facilities construction” and “airport construction.” In v2.2, these activities merely included the construction, but not the maintenance of the infrastructure. Consequently, the improvement focused on the supplement of all interventions associated with the maintenance over their lifetime (e.g., 100 years). The consistent consideration of maintenance increased the GWP by a factor of roughly 1500 (port facilities) and 36 (airport). To put it differently, the maintenance efforts of the modeled facilities are much larger than the actual interventions associated with their construction. This affects the downstream supply chains for air and ship transport, albeit to a limited extent only.
Another noteworthy improvement in completeness can be observed for many agricultural datasets requiring an input of irrigation. For example, the GWP of the activity “Coconut production, husked, (PH)” increases by a factor of roughly 5000 because crop- and country-specific irrigation requirements have been added. The addition of irrigation has a large impact because the activity does not record many other interventions. Such improvements in completeness are based on the new availability of country-specific irrigation activities. In v2.2, irrigation in agricultural activities was usually modeled with a direct input of water from nature, ignoring the interventions associated with the provision of the water. V3.1 offers country-specific irrigation activities that represent the average applied technologies within a particular country. This allows for a more realistic modeling of the interventions associated with irrigation.
Models for inventory data
A far-reaching improvement in this category concerns the general revision and harmonization of the emission models used for the modeling of N2O, NH3, and NO3 emissions in agricultural activities and the development and cultivation of a superstructure for the consistent modeling of emissions from land use change. Nemecek et al. (2014) offer a detailed discussion of the improvements and their consequences in terms of climate change impacts.
Another example is the implementation of a new model for transport in version 3. Transport is now accounted for within market datasets and was revised throughout the database based on sector transportation statistics (Wernet et al. 2016). In addition, road freight transport activities were updated.
New activities
Due to space restrictions, we focus on the most important additions per sector. For a complete overview of new activities, the interested reader may refer to Moreno Ruiz et al. (2014). New electricity datasets make up a large share of new datasets across all continents, as shown in Table 3 (Treyer and Bauer 2013, 2014). In total, more than 1800 datasets in v3.1 Cut-off are related to electricity production, compared with around 600 datasets in v2.2. With this, more than 80 % of global power generation is now covered by local datasets in the ecoinvent database.
In addition, the structure of the wood sector was revised and expanded. Roughly 150 new activities covering forestry (country- and species-specific wood production), forest machinery (new operation datasets such as chipping, forwarding, harvesting, skidding, and yarding), sawmill, wood boards, and wood preservatives allow a more fine-grained modeling of the production and processing of wood products. The food sector was complemented with 30 new fruit and vegetable datasets (Stoessel et al. 2012) as well as industrial data for dairy products (milk, yogurt, cheese, butter, cream) and soya derivatives (tofu and soybean beverage). Passenger transport coverage was expanded by new size classes and types (such as e-mobility and alternative fuel sources) (Del Duce et al. 2014), and road freight transport was complemented with new data on EURO 6. The chemical sector was expanded by 90 chemicals including, among others, citric acid, glycine, sodium amide, acrolein, lactic acid, and iron chloride. Tap water provision operates on a higher level of detail as the infrastructure data and the technologies used for tap water production activities were expanded. In addition, new datasets for building materials were developed, including, among others, cement, concrete, bricks, and soapstone. Finally, the metal sector was complemented with new datasets for primary and secondary aluminum production including power generation. New datasets on magnesium, ferrosilicon, iron pellet, gold and silver mining, and steel processing were added as well.
Global coverage and spatially consistent linking of activities
A central aim of version 3 was to start the development of a global LCI database (Wernet et al. 2016). To this end, new modeling principles for the geographical coverage of activities and their spatial linking were introduced. Global activity coverage means that every process is represented either by an activity with a global (GLO) geographical scope, or by activities of local scope that together cover the entire global production volume, usually including a Rest-of-the-World (RoW) dataset. Consistent spatial linking means, e.g., that local activities receive inputs from their own geography, if available, whereas global activities receive inputs from all geographies (see Wernet et al. 2016 for a complete explanation and graphical illustration). Market activities, i.e., consumption mixes, play a major role in this context, as they bundle the inputs of producers within their geographical scope based on market shares that correspond to annual production volumes. While markets do not cause any direct environmental interventions, they act as connectors between producers and downstream consumers. As the production of every product is covered globally, activities that receive inputs from global markets (which is currently the case for many products) are connected to supply chains from all geographies, depending on production volumes and the degree of regionalization. In version 2, consumption mixes also existed but were not implemented consistently for every product in the database. It is important to realize that the implementation of these concepts in version 3 results in spatially consistent supply chains across the ecoinvent database, which can differ substantially from those modeled in v2.2.
For example, since country-specific data for South Africa (ZA) was missing in v2.2, the electricity requirement of the activity “Platinum group metal mine operation, ZA” was approximated with the European UCTE¹ electricity mix. In v3.1, the input of UCTE electricity was deleted. Since ZA electricity producers have been added during the electricity sector update, ZA electricity consumers are now linked to the ZA electricity market by default if not specified otherwise by direct links (Wernet et al. 2016). This influences the platinum group metal production, as the GWP of the ZA electricity mix is seven times larger than that of the UCTE mix, mainly due to the large proportion of hard-coal-based electricity. That is, the improved modeling results in increased environmental impacts for all metals produced by the mine operation (an exception being palladium, which shows reduced environmental impacts due to updated price data used in the economic allocation). A further example of such improvements is “Sanitary ceramics production” in Switzerland (CH), where the input of the UCTE electricity and heat mix was replaced with CH-specific electricity and heat consumption mixes.
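How a market bundles its suppliers can be sketched as a production-volume-weighted average. The geographies, volumes, and impact scores below are hypothetical, the function names are assumptions, and the sketch ignores the transport that markets account for in version 3.

```python
def market_shares(production_volumes):
    """Market share of each supplier, proportional to its annual production volume."""
    total = sum(production_volumes.values())
    return {supplier: volume / total for supplier, volume in production_volumes.items()}

def market_score(production_volumes, supplier_scores):
    """Impact score of one unit from the market: share-weighted mean of supplier scores."""
    shares = market_shares(production_volumes)
    return sum(shares[supplier] * supplier_scores[supplier] for supplier in shares)

# Hypothetical suppliers of one product to a global market.
volumes = {"CN": 50.0, "RER": 30.0, "RoW": 20.0}  # annual production volumes
scores = {"CN": 1.0, "RER": 0.5, "RoW": 0.8}      # impact per unit of product

shares = market_shares(volumes)
global_market = market_score(volumes, scores)
print(shares, global_market)
```

The market itself adds no interventions here; it only passes on the impacts of its suppliers in proportion to their shares, which is why a large RoW share makes RoW datasets prominent in downstream supply chains.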

Associated consequences
Availability of global and local data
An analysis by the number of datasets per continent (Table 3, and Table S3 (ESM) by ecoinvent geographies) shows that version 3 features more regionalized datasets for every continent than v2.2. The strongest increases can be observed in North America, Australia, and Asia. The notable increase in datasets with global coverage is, to a large part, due to the introduction of global markets. Still, the number of global production datasets has almost doubled. Despite only a small increase of European datasets, Europe continues to be the region that is best covered within the database, both by local productions and local markets. In addition, RoW activities have been introduced to cover all geographies where a local activity is not yet available. Overall, 71 % of all datasets are production datasets and 29 % are markets. While some products are distributed via local markets (e.g., electricity (Treyer and Bauer 2014)), most products are distributed via global markets. Thus, the regionalization in version 3 so far takes place mostly at the level of regionalized production datasets, which then contribute to global markets. One reason for this is that regionalized production datasets must exist before regionalized markets can be created.
¹ UCTE: Union for the Coordination of the Transmission of Electricity
Geographical impact distribution
To assess the consequences of the increased global coverage, we have analyzed the environmental impacts that arise along the supply chains of each product in v2.2 and v3.1 Cut-off by geography. Figure 4 shows that significant differences can be observed between the databases: while most impacts in v2.2 occur in Europe, in v3.1 Cut-off Asia, North America, and RoW gain in importance. The importance of European datasets drops drastically in version 3. It is important to note that the two databases are not directly comparable, as they do not contain the same datasets (e.g., version 3 has more datasets with non-European geographies than version 2). Nevertheless, Fig. 4 clearly shows that version 3 is developing towards a global LCI database. When the geographies are disaggregated to the underlying ecoinvent geographies, the top five contributing geographies across supply chains in v2.2 are RER (Region Europe) and CH (Switzerland), followed by Global, Germany, and Italy (Fig. S3, ESM). In v3.1 Cut-off, those geographies are RoW, China, Global, Russia, and RER. RoW is so important in version 3 because, in many cases, local production datasets cover only a small fraction of global production volumes. The impact fraction arising in RoW datasets is therefore an indicator of how well the ecoinvent database covers important processes with regionally appropriate data. The impact contributions by geography in Fig. 4 should not be mistaken for actual global distributions of environmental impacts. Instead, the figure reflects an inward perspective on the ecoinvent database, which does not cover all economic sectors equally well (Majeau-Bettez et al. 2011; Reinhard et al. submitted).
Updated transport data
The direct contribution of transportation across the database has not changed significantly from v2.2 to v3.1 Cut-off, as shown in Fig. 5 (see ESM, Fig. S4 for ReCiPe and ES'06). In both versions, road transport is the most important, followed by ship, rail, and air transport. The mean contribution of transport to all matched datasets is around 5 % in both versions. Across all datasets, i.e., within each database separately, the mean contribution of transport rises slightly from 5.5 to 5.8 %.
Combined effect of global supply chains and local data: example of the electricity sector
As electricity production datasets cover 50 countries (and, due to the partitioning of some of these, 71 geographies), the electricity sector is a prime example to study the combined effect of global geographical coverage and global background data on LCIA results. In order to do so, the upstream impact contributions from electricity production were compared for v2.2 and v3.1 Cut-off for two distinct geographies: global and European datasets. In order to remove the bias of having different datasets in each of the geographies across database versions, only matched datasets were considered (n = 262 for GLO and n = 947 for RER). In comparison to the total number of datasets, the share of electricity datasets remained constant at about 15 % in both database versions. Results show that in v2.2, impacts from electricity consumption along the supply chains of global and European products occur in both cases mainly within Europe (shown for GWP in Fig. 6), since non-European processes for electricity generation were hardly available in v2.2 and production processes in general often neglected non-European supply chains. In contrast, in v3.1 Cut-off, electricity impacts arise across all world regions for global datasets. For European datasets, a high fraction arises within Europe; however, as inputs along the supply chain of European datasets are partly produced in other world regions, an important share of electricity-related impacts also arises outside of Europe. Results for ReCiPe and ES'06 are similar, although the contribution of the electricity sector to the overall impacts is lower (ESM, Figs. S5-S7). The electricity sector therefore demonstrates that the introduction of a global coverage of activities in version 3 effectively results in geographically more diverse (and more realistic) supply chains than in version 2, especially for global and non-European datasets.
It could be assumed that the drastic changes observed in the electricity sector are a driver for the differences in LCIA results in versions 2 and 3 (Cut-off). In order to test this hypothesis, copies of both databases were made where electricity inputs across all datasets were set to zero. When LCIA results were compared, the median database deviation of GWP dropped to 1.1 % (from 6.5 %) (ESM, Fig. S8, Table S4). The median database deviations for ReCiPe and ES'06, on the other hand, remained similar (6.8 % instead of 8.4 %, and 18.9 % instead of 17.4 %; higher numbers can be explained by an increased relevance of other sectors in the context of overall lower impacts in the databases, as they exclude electricity-related impacts). This result indicates that while the combined effect of global activity coverage and availability of regionalized data for electricity production leads, on average, to increased GWP for all datasets, the median database deviation cannot be explained by changes in electricity supply chains alone.

Comparison of v3.1 Cut-off and APOS
As shown in Fig. 7, the differences in LCIA results between the Cut-off and the allocation at the point of substitution (APOS) database versions are small. The median of the database deviation is, as shown in Table 2, close to zero, and the median of the dataset deviation is also small, at around 1 % for all LCIA methods (midpoints in Fig. S10, ESM). Due to the Cut-off approach, recyclable materials and wastes that are allocated in the APOS version receive zero impact in the Cut-off version, which explains the bump at −100 %. Since the two system models are identical except for the treatment of recyclable materials and wastes, the rest of the spread can be explained by supply chain propagations, i.e., some activities get inputs of burden-free “Recycled Content” products, whereas other activities get inputs of products that now carry burdens of waste treatment and material supply chains.

Comparison of v3.1 Cut-off and Consequential
Figure 8 shows the comparison of LCIA results between the Cut-off and the Consequential version. Overall results increase by 1.3 % for GWP, while they decrease by 2.1 % for ReCiPe and 1.5 % for ES'06. The median dataset deviation is 7-8 % for all indicators, and LCIA results can be both higher and lower in the Cut-off version (see Table 2). The bump at −100 % can again be explained by the Cut-off approach for recyclable materials and wastes.
Since the Consequential version aims at looking at consequences of changes (Earles and Halog 2011; Ekvall and Weidema 2004; Zamagni et al. 2012), its underlying modeling principles are fundamentally different (Wernet et al. 2016). It is, therefore, no surprise that LCIA results differ from those of the Cut-off and APOS system models. Three key issues are dealt with differently in the Consequential version. First, the use of by-products: while by-products are allocated or cut off in the Cut-off system model, the avoided burden approach (i.e., substitution) is applied in the Consequential version. Second, markets only contain marginal suppliers, i.e., those that are considered as unconstrained and will change their output as a consequence of changes in demand, while in the Cut-off version, the average supply is modeled. Third, products that are only produced as by-products are considered as constrained, and a demand for these products leads to a demand for a substitutable product instead, which must be produced as a reference product in the Consequential version.
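The first of these differences, the treatment of by-products, can be sketched for a single hypothetical activity with one reference product and one by-product. The sketch assumes economic allocation in the attributional case and a one-to-one displacement of a substitute product in the consequential case; the function names and all numbers are invented for illustration and do not reproduce the ecoinvent implementation.

```python
def allocated_burden(total_burden, revenue_main, revenue_by_product):
    """Attributional view: the reference product carries its revenue share
    of the activity's total burden (economic allocation, assumed here)."""
    return total_burden * revenue_main / (revenue_main + revenue_by_product)


def substituted_burden(total_burden, by_product_amount, substitute_burden_per_unit):
    """Consequential view: the reference product carries the full burden
    minus a credit for the substitute product that the by-product displaces."""
    return total_burden - by_product_amount * substitute_burden_per_unit


# Hypothetical activity: total burden of 10 impact units per unit of output.
total_burden = 10.0
allocation_result = allocated_burden(total_burden, revenue_main=80.0,
                                     revenue_by_product=20.0)
substitution_result = substituted_burden(total_burden, by_product_amount=1.0,
                                         substitute_burden_per_unit=3.0)

print(f"allocation:   {allocation_result:.1f}")
print(f"substitution: {substitution_result:.1f}")
```

The two approaches give different scores for the same activity, and the substitution result depends entirely on which product is assumed to be displaced.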
We conducted the following analysis to assess the importance of these modeling principles: a database copy was created in which the inputs of the reference product of markets were adjusted so that they resembled the input composition in the Cut-off version (i.e., marginal suppliers were replaced by consumption mixes). When comparing the Cut-off version to the Consequential version modified to contain Cut-off-like consumption mixes, the median dataset deviations decrease from 8 to 2 %, 7 to 2 %, and 8 to 3 % for GWP, ReCiPe, and ES'06, respectively. The fit between these databases resembles that of the Cut-off and APOS versions, and the median database deviation drops from 1, −2, and −2 % for GWP, ReCiPe, and ES'06, respectively, to zero for all indicators (Fig. S11, ESM). Therefore, while the Consequential version and the other system models are fundamentally different in scope and modeling principles, it is the fact that markets consist of marginal instead of average suppliers that seems to be the key factor leading to different LCIA results.
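The difference between a consumption mix and a marginal mix can be made concrete with a small Python sketch of one hypothetical electricity market. The supplier list, the `unconstrained` flags, and the rule of renormalizing current shares among unconstrained suppliers are assumptions made for illustration only; how marginal suppliers and their shares are actually determined in the Consequential system model is not described in this section.

```python
# Hypothetical market: supply shares sum to 1, GWP in kg CO2-eq per kWh.
suppliers = [
    {"name": "hydro", "share": 0.4, "gwp": 0.01, "unconstrained": False},
    {"name": "coal",  "share": 0.3, "gwp": 1.00, "unconstrained": False},
    {"name": "gas",   "share": 0.2, "gwp": 0.50, "unconstrained": True},
    {"name": "wind",  "share": 0.1, "gwp": 0.02, "unconstrained": True},
]


def market_score(suppliers, marginal_only=False):
    """Share-weighted score of the market.

    marginal_only=False: average consumption mix (attributional view).
    marginal_only=True:  only unconstrained suppliers, with shares
                         renormalized to 1 (assumed rule, see lead-in).
    """
    selected = [s for s in suppliers if s["unconstrained"] or not marginal_only]
    total_share = sum(s["share"] for s in selected)
    return sum(s["share"] / total_share * s["gwp"] for s in selected)


average_mix = market_score(suppliers)
marginal_mix = market_score(suppliers, marginal_only=True)
print(f"average consumption mix: {average_mix:.3f} kg CO2-eq/kWh")
print(f"marginal mix:            {marginal_mix:.3f} kg CO2-eq/kWh")
```

Every activity that buys from this market inherits the difference between the two scores, which is why swapping the market composition alone can move results across the whole database.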
The effects of substitution and constrained markets on results of the Consequential version were also studied, but did not further explain the differences compared to the Cut-off version. It was found that substitution reduces LCIA results, on average, within the Consequential version by 9, 7,  Table S6, ESM).

APOS vs. Consequential
The LCIA comparison between the APOS and the Consequential versions shows results very similar to those of the comparison between the Cut-off and Consequential versions, due to the similarity of the Cut-off and APOS system models. Results for the LCIA comparisons are provided in the ESM (Fig. S13, Table S7).

Discussion
Across all LCIA comparisons, the medians of both database and dataset deviations are most pronounced for the comparison of v3.1 to v2.2. These differences are due to a simultaneous update of the underlying data and the introduction of new modeling principles for the global coverage and spatial linking of activities.
The combination of global supply chains and more regionalized production activities leads to profound changes in terms of the geographical origin of inputs across the entire database. As shown for the electricity sector, version 2 had a European perspective even for datasets with supposedly global scope, whereas version 3 includes both local and global supply chains. Through the introduction of the global activity coverage, version 3 has thus made an important step towards being a global LCI database and increased the realism of supply chains, especially in well-regionalized sectors such as the electricity sector. With impacts having a global distribution through the inclusion of RoW datasets, regionalized LCIA results can be determined without geographical artifacts that would occur if local datasets were to be used as placeholders for global supply chains, as was the case in version 2.
Nevertheless, the mean process contribution of RoW activities is significant throughout the database (Fig. 4), showing that regional coverage could still be improved. Further regionalization should be targeted to those activities that contribute most to this measure, especially when production conditions vary considerably across different regions. In addition to this inward perspective (Reinhard et al. submitted), a comparison with other statistical data or input-output databases would help to identify sectors that need further coverage (Majeau-Bettez et al. 2011). Further improved geographical coverage of LCI data will thereby increase the quality of LCA studies and reduce uncertainties related to the estimation of environmental burdens from global supply chains.
For GWP, the combined effect of global supply chains and new data in the electricity sector seems to explain a large part of the observed median increase of impacts. Despite further analyses, single driving factors for the differences in LCIA results for ReCiPe and Ecological Scarcity could not be identified. Put differently, the differences in LCIA results could not be attributed to a single sector, which should be expected for LCIA methods that address diverse impact categories. Instead, the differences are the result of many overlapping factors, including the many data updates throughout the database. The newly introduced concept to model transport within market activities and the revision of these data based on sector statistics did not lead to significant changes in transport impacts throughout the database for any of the analyzed LCIA methods.
At the level of individual datasets, the reasons for changes can usually be identified more readily. Typical changes involved the update of models for inventory data (e.g., in agriculture), improved data or types of allocation, and the completion of datasets (e.g., the consideration of maintenance in infrastructure processes). In certain sectors, such as metals and chemicals, the allocation approach was changed from a mass-based one to an economic one in order to better reflect the economic drivers of production. While different allocation approaches were used in version 2, economic allocation has been introduced more consistently in version 3. These changes improve the consistency across the database. In addition, in many sectors, new datasets were added. Without the consideration of RoW and market datasets, the ecoinvent database now contains more than 5000 production activities. These additions equip the LCA practitioner with both more nuanced building blocks for the modeling of foreground systems and a more complete model of the background system.
Fig. 8 Deviation and cumulative absolute deviation of v3.1 Cut-off from v3.1 Consequential LCIA results
As shown in this article, the continued development of LCI databases leads to changes in LCIA results. While the direct effects of changes at the level of individual datasets may be large, the induced supply chain effects due to an individual change are often much smaller. However, the cumulative effect of all changes together is considerable and reflected in the differences of LCIA results. The transition from v1.3 to v2.2 provides an interesting historical perspective on this, as the observed changes in median database and dataset deviation were smaller. However, version 2 did not introduce systematic changes to the extent that version 3 did, and thus these figures are not directly comparable.
While the Consequential version and the other system models are fundamentally different in scope and modeling principles, it is the fact that markets consist of marginal instead of average suppliers that seems to be the main factor for differences in LCIA results. A more refined model of the Consequential version, with greater differences between marginal and average suppliers, will therefore most likely lead to more diverging results.

Conclusions
Considerable differences exist between LCIA results of different database versions as LCI databases continue to evolve. These differences are especially pronounced for the transition from v2.2 to v3.1 (Cut-off). They arise from simultaneous improvements of the quality and quantity of contained data as well as the introduction of new modeling principles for the global coverage of activities and their geographically consistent linking. As shown for the electricity sector, this results in geographically more diverse and, in many cases, more realistic supply chains than in version 2, which was Europe-centered, enabling new applications such as regionalized impact assessment. In light of these observations, we recommend the use of version 3 as a background LCI database over previous versions. On average, LCIA results are 6 % higher for GWP100a, 8 % higher for ReCiPe Endpoint (H/A), and 17 % higher for Ecological Scarcity 2006 than in v2.2, and considerable differences in LCIA results exist for individual datasets.
Among the three new system models of version 3, the Cut-off and APOS models are closely related and differ only in the way wastes and recyclable materials are dealt with. This is reflected in the overall small differences in LCIA results. The Consequential version, on the other hand, differs considerably from the attributional system models, which is expected due to the fundamentally different perspective and modeling principles. An analysis shows that the use of marginal suppliers instead of average suppliers contributes most to the observed differences.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.