Distribution of environmental performance in life cycle assessments—implications for environmental benchmarking

Life cycle assessment (LCA) is considered a robust method to analyse the environmental impacts of products and is used in public and private market applications such as Green Public Procurement (GPP) and Environmental Management Systems (EMS). Despite the usefulness of the methodology, difficulties exist with the interpretation of LCA results. The use of benchmarks can facilitate this process, but there is as yet little research on the definition of environmental benchmarks. The aim of this paper is to analyse the distribution of environmental performance used for the definition of a benchmark and how it affects the use of the benchmark in selected product categories. LCA results from 54 Environmental Product Declarations (EPDs) for insulation materials and 49 EPDs for bakery products are tested for their distribution. The outcome from the statistical analysis is used to compare and evaluate three calculation methods for a benchmark. The results of the study show that distributions and mid- and end-points of environmental performances of the studied indicators differ significantly for the two product categories. While some indicators for bakery products were closer to a normal distribution, most of the indicators are not normally distributed. This is reflected in the comparison of the chosen calculation methods for a benchmark, which showed that the distribution of the data affects the classification of the benchmark as well as the position of values on the benchmark. The results emphasise that analysis of further product groups and the associated distribution of the environmental performance is needed to understand the implications of calculation methods on a benchmark. The availability of comparatively large datasets in a common structure is crucial for these analyses and can be facilitated through the digitalisation of LCA- and EPD-information. Furthermore, more research is needed on the communication formats for different benchmarking options, which must be adapted to the different intended audiences to be effective.


Introduction
The global market is increasingly demanding science-based, verified and comparable information about the environmental performance of goods and services (Chen et al. 2014; Chuang and Huang 2015; Hou et al. 2015; Zanghelini et al. 2016). This is in many cases driven by international legislation and policies such as the Framework Convention on Climate Change (1992), the Kyoto Protocol (2005), the Sustainable Development Goals (UN 2015) and the Single Market for Green Products initiative (EC 2013). Internal drivers, such as minimised resource use and waste generation, also encourage work on the environmental aspects of business operations (Chen and Chang 2013; Sánchez-Medina et al. 2015, as cited by Chuang and Huang 2018). As a result, interest in life cycle assessment (LCA), which is considered one of the best tools for both calculating and interpreting environmental impacts in a holistic context, is growing (Scheepens et al. 2016; Dieterle et al. 2018). As LCA is being applied more widely, challenges with the communication and comparability of LCA results have been identified as impeding its further usage (Molina-Murillo and Smith 2009; Reap et al. 2008a, b; Testa et al. 2016). LCA results can be difficult to understand for persons with less experience in LCA (Fet and Skaar 2006; Nissinen et al. 2007; Modahl et al. 2013; Passer et al. 2015; Ibáñez-Forés et al. 2016); thus, more guidance for the interpretation of LCA results is needed (Sala and Andreasson 2018; Rex et al. 2019).
Communicated by Serenella Sala.
As a response to the demand to define more harmonised rules and procedures to conduct and present the results of an LCA, type III environmental declarations, also referred to as Environmental Product Declarations (EPDs), have been developed (Fet and Skaar 2006; Stevenson and Ingwersen 2012; Ibáñez-Forés et al. 2016). Harmonisation of methodological choices for the LCA calculations of an EPD is done in so-called Product Category Rules (PCR) (ISO 2017). PCRs shall enable comparability by regulating, e.g., the functional unit, system boundaries, allocation rules, choice of data and cut-off criteria in the LCA (Stevenson and Ingwersen 2012; Del Borghi 2013; Modahl et al. 2013; Bovea et al. 2014). However, comparability of EPDs may be complicated by the presence of different PCRs between programme operators (PO) for EPDs (Fet and Skaar 2006; Stevenson and Ingwersen 2012; Del Borghi 2013; Hunsager et al. 2014; Minkov et al. 2015). Two examples of initiatives that strive to harmonise the use of calculation rules for LCA in the European Union are the ECO Platform (2020) and the PEF initiative by the European Commission (2013). Whereas the ECO Platform focuses on products in the construction sector, the PEF initiative works with harmonisation of methodological choices for different kinds of products and includes aspects to facilitate the interpretation of LCA results (EC, 2016).
One of the objectives of PEF is to work on developing benchmarks to support the overall ambition to find ways of making better use of LCA-information for future policies or legislation (EC, 2016;EU 2013;Lupiáñez-Villanueva et al. 2018). As listed by Slack et al. (2009) benchmarking can both support internal and external decision-making processes of an organization in either a competitive or non-competitive way. A benchmark can be defined as a point of reference, e.g. a specific environmental performance, against which other observations can be compared to (EC, 2016). It supports interpretation of a performance for non-experts by indicating the rank of the performance thus having the potential to facilitate the interpretation of LCA results (Nissinen et al. 2007). However, different types of benchmarks for applications on LCA-information are not yet fully developed and therefore not frequently used (Nissinen et al. 2007;Galindro et al. 2019). Analyses of methods for developing different types of environmental benchmarks by Galindro et al. (2019) showed that typically four to five performance classes are used, but that there is no established methodology for the definition of these classes. Within the PEF initiative a representative product and its average environmental performance are used as the base for the definition of the performance classes of the benchmark (EC, 2016). The use of average results implies that the distribution of the environmental performance influences the development of a benchmark. This can have an effect on the application of the benchmark in different market applications. This paper aims to gain a better understanding of the distribution of environmental performances for different product categories and reported environmental indicators. The paper further aims to understand which effect the distribution has on the development of a benchmark and the definition of performance classes. 
The following section (Section 2) gives an overview of the use of LCA- and EPD-information in various market applications. Section 3 describes the methods. Results are presented in Section 4, discussed in Section 5 and briefly concluded in Section 6.

LCA and EPD information in market applications
LCA activities have been carried out and used by LCA practitioners on a voluntary basis since the initial emergence of the method. In recent years, environmental information with a life cycle perspective has become more relevant in both the private and public sectors for use by non-practitioners. The most well-known examples are most likely the revision of ISO 14001 (2015) on environmental management systems, the new EU Directives on Public Procurement (EU, 2014), the establishment of a global network for LCA data, GLAD (Life Cycle Initiative, 2018) and the publication of several CEN standards for the construction sector (CEN, 2013). During many discussions following the official launches and publications of these methods and tools, public procurement was usually mentioned as their most likely future market application (UNEP 2018); other applications that make use of LCA information are supply-chain management, eco-design, marketing and decision making (Hellweg and Milà i Canals 2014). The most recent public uses of LCA- and EPD-information usually relate to carbon neutrality, net zero emissions and carbon offsetting. To identify potential decision-making situations, two examples commonly applied in the private and public sectors are briefly described below, followed by a description of the use of environmental benchmarks in the field of LCA.

Increased demand for LCA- and EPD-information in GPP
Green Public Procurement (GPP) and Sustainable Public Procurement (SPP) are powerful market-based tools for sustainable consumption and production (SCP), which use environmental challenges as an economic driver to set advanced environmental requirements in procurement activities (Mont and Plepys 2005). Following the EU Directives on Public Procurement, all EU Member States have introduced their principles into national legislation (EU, 2014). ISO standards focusing on EPDs are available and specifically adapted to public procurement (ISO 2006, 2017), and have initially proven successful in the building/construction and food sectors. Building certification schemes in particular, such as BREEAM or LEED, led to a rapid growth of demand for EPDs in the construction sector by giving credits for reporting certain types of verified environmental information (Gelowitz and McArthur 2016; Bernardi et al. 2017; Bienert et al. 2017). The Swedish government's climate policy framework also highlights the need to consider a life cycle perspective in the different phases of public procurement (Naturvårdsverket 2019).

Inclusion of life cycle thinking in environmental management systems
The ISO 14001 standard for environmental management systems (EMS) has been around for 20 years and is one of the most commonly used ISO standards, with hundreds of thousands of certificates around the world (ISO 2015). In 2015 a new version of the standard was released in which one of the most significant changes from a procurement perspective is the need to consider a life cycle perspective (ibid.). The standard states that it is mandatory ('shall' requirements) to consider the life cycle perspective, including the environmental aspects of an organisation's activities, products and services that it can control or influence. The 'shall' requirements are specifically addressed when:
• identifying significant environmental aspects,
• setting environmental goals, and
• examining the organisation's own purchasing procedures.

Environmental benchmarks within LCA
The growing interest in the benchmarking of LCA results can be viewed as a response to the increased demand of using LCA methodology in information specifically targeted to selected customers to support sustainable consumption patterns (Nissinen et al. 2007). The application of an environmental benchmark for LCA data is relatively uncommon, with applications in e.g. the method of an Eco-indicator (Goedkoop and Spriensma 2001) or the analyses of the development work, or the understanding and presentation formats of benchmarks (Nissinen et al. 2007). In contrast, the use of benchmarking in the building sector is more widespread, with the majority of benchmarks developed for the energy usage of a building. In a comparison by Hong et al. (2013), two types of benchmarking programmes are distinguished: top-down (1) and bottom-up benchmarking (2).
While top-down benchmarking compares a single performance of a product to a group of products with similar characteristics, bottom-up benchmarking programmes examine the specific characteristics of a product in order to represent the product's specific context. In the top-down programme, statistical analyses are required to determine the performance of the reference group. Bottom-up programmes calculate a theoretical performance and therefore depend on a model to reflect the characteristics of the product (ibid.). Data for statistical analyses for top-down benchmarking can be provided through LCA studies (Murphy et al. 2015; Rönnlund et al. 2016). Bottom-up benchmarking requires further analysis of the function and characteristics of the studied product, which may relate to the chosen functional unit.
Apart from the type of benchmarking programme chosen for the definition of a benchmark, the presentation format and level of detail of the benchmark are important aspects to consider (Nissinen et al. 2007). Studies of more detailed methods for the development of benchmarks have examined how the visualisation of information can be made more effective (Otto et al. 2003). Despite using LCA as a robust method for the definition of a top-down benchmark, specific expectations from stakeholders on trustworthy approaches cannot always be met (Brinkmann et al. 2019). More research is therefore needed to better understand the implications of a benchmark for the best applicability and effectiveness, as concluded by Heinzle and Wüstenhagen (2012) and Galindro et al. (2019). Changes in the definition of a benchmark such as the EU energy label (ECEEE 2009) have been shown to impact the perception and behaviour of the user of the information (Heinzle & Wüstenhagen, 2012).
With increasing availability of LCA- and EPD-information as well as access to digitalised databases provided by EPD POs, the concept of benchmarking could become more recognised and relevant to facilitate the interpretation and use of this type of information (Nissinen et al. 2007).

Methods
In this study the concept of a top-down benchmarking programme is chosen for the selected product categories, because of the complexity and inherent differences of the studied products in each product group. Three statistical indicators are calculated and analysed in this study: mid- and end-points, arithmetic averages and medians. The data are further visualised in the form of distribution curves and cumulative distribution functions to illustrate the shape, span and characteristics of the distribution curves. Furthermore, the data are tested for normal distribution using the Anderson-Darling test (Anderson and Darling 1952). Significance levels of 5% and 0.01% (six sigma) are chosen to test the hypothesis that the data are normally distributed (Engineering Statistics Handbook, n.d.). The null hypothesis is rejected if the p value falls below the chosen significance level; otherwise it cannot be rejected. Under these conditions, a minimum of around 20-25 datasets is needed to achieve a reasonable level of significance for such a judgement. An increasing number of datasets is considered to make this type of statistical analysis more credible and trustworthy (Gunnarsson 2018).
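The indicators and the normality test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the software used in the study: the mid-point is assumed to be the centre of the span between the end-points, the p value uses the widely cited D'Agostino and Stephens approximation for the case where mean and standard deviation are estimated from the sample, and the function names and synthetic data are hypothetical.

```python
import math
from statistics import NormalDist, mean, median, stdev


def summary_indicators(values):
    """Return end-points, mid-point, arithmetic average and median."""
    low, high = min(values), max(values)
    return {
        "min": low,
        "max": high,
        "mid_point": (low + high) / 2,  # assumption: centre of the span
        "average": mean(values),
        "median": median(values),
    }


def anderson_darling_normal(values):
    """Anderson-Darling statistic and approximate p value for normality.

    Mean and standard deviation are estimated from the sample; the p value
    follows the D'Agostino-Stephens approximation (an assumption here).
    """
    x = sorted(values)
    n = len(x)
    fitted = NormalDist(mean(x), stdev(x))
    cdf = [fitted.cdf(v) for v in x]
    s = sum(
        (2 * i + 1) * (math.log(cdf[i]) + math.log(1 - cdf[n - 1 - i]))
        for i in range(n)
    )
    a2 = -n - s / n
    a2_adj = a2 * (1 + 0.75 / n + 2.25 / n**2)  # small-sample correction
    if a2_adj >= 0.6:
        p = math.exp(1.2937 - 5.709 * a2_adj + 0.0186 * a2_adj**2)
    elif a2_adj >= 0.34:
        p = math.exp(0.9177 - 4.279 * a2_adj - 1.38 * a2_adj**2)
    elif a2_adj > 0.2:
        p = 1 - math.exp(-8.318 + 42.796 * a2_adj - 59.938 * a2_adj**2)
    else:
        p = 1 - math.exp(-13.436 + 101.14 * a2_adj - 223.73 * a2_adj**2)
    return a2, p


# Synthetic illustration data (not EPD results), 30 values each
std_normal = NormalDist()
near_normal = [10 + 2 * std_normal.inv_cdf((i + 0.5) / 30) for i in range(30)]
skewed = [math.exp(1.5 * std_normal.inv_cdf((i + 0.5) / 30)) for i in range(30)]

for name, data in (("near normal", near_normal), ("skewed", skewed)):
    a2, p = anderson_darling_normal(data)
    print(name, summary_indicators(data)["median"], round(a2, 3), p < 0.05)
```

For the skewed sample the p value falls below 5% and normality is rejected, whereas for the near-normal sample it does not; a p value above the significance level only means that normality cannot be rejected, not that it is proven.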

Selection of product categories
The database of the International EPD System (IES) (2018) contains more than 1000 EPDs covering several product categories. Being one of the largest and broadest databases on the market, it has been chosen to identify two product categories for this study. As illustrated in Fig. 1, EPDs from construction products and food and agricultural products make up more than two thirds of all published EPDs. Sub-categories in the product groups are based on the PCR used for the EPD calculations. To allow for robust statistical analyses, the sub-categories with the highest numbers of EPDs were chosen for this study.
Within the construction products category, most EPDs were based on the PCR for 'construction products and construction services', followed by 'insulation materials'. The PCR for 'construction products and construction services' is not limited to a specific sub-category and can be used for products with many different functions. To enable comparison within a narrower product group, the EPDs following the PCR for insulation materials have therefore been chosen instead. All EPDs based on this PCR have the same functional unit, defined as the amount of material necessary to achieve 1 m²·K·W⁻¹ of thermal resistance (IES 2016).
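To make the functional unit more concrete, the following sketch computes the mass of insulation material needed for a thermal resistance of 1 m²·K·W⁻¹ over 1 m². It relies on the standard relation between thermal resistance, thickness and thermal conductivity (R = d / λ), which is not quoted from the PCR; the function name and the product values are hypothetical.

```python
def functional_unit_mass(thermal_conductivity, density, resistance=1.0, area=1.0):
    """Mass (kg) of insulation giving `resistance` (m2 K/W) over `area` (m2).

    thermal_conductivity is in W/(m K), density in kg/m3.
    """
    thickness = resistance * thermal_conductivity  # d = R * lambda, in m
    return thickness * area * density


# Hypothetical product: lambda = 0.035 W/(m K), density = 30 kg/m3
mass_kg = functional_unit_mass(0.035, 30.0)
print(f"{mass_kg:.2f} kg per functional unit")
```

Because products with different conductivity and density need different masses to deliver the same resistance, reporting impacts per functional unit rather than per kilogram is what makes the EPD results within this category comparable.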
In the product category of food and agricultural products, the PCR for bakery products (2012:06) was used for most of the EPDs and thus chosen for this study. The declared unit is defined as 1 kg of product including the relative packaging, as presented to the customer (IES 2015).

Choice of system boundaries and environmental indicators
The definition of system boundaries is of vital importance to ensure the comparability of LCA- and EPD-information within a product category. For insulation materials, the scope of the LCA study and the resulting EPDs according to the PCR was defined as 'cradle-to-gate with options'. According to EN 15804, five life cycle stages are distinguished: product stage (A1-3), construction stage (A4-5), use stage (B1-7), end-of-life stage (C1-4) and reuse, recovery and recycling potential (D). The product stage consists of the information modules A1 (raw material supply), A2 (transport to manufacturer) and A3 (manufacturing). It is mandatory to report on these modules, and therefore, they were chosen for the statistical analyses of the EPDs for insulation materials. The remaining modules and life cycle stages were not considered as they are not mandatory to report on and therefore not available for all studied EPDs. The aggregated results for the environmental impact of the product stage (A1-3) for the chosen functional unit were used for the statistical analyses. For bakery products, three life cycle stages were considered: upstream (ingredients and packaging), core (production) and downstream (distribution and end-of-life). The results for all life cycle stages are mandatory to be reported in the EPDs and were therefore part of the statistical analyses. Information for the statistical analyses was gathered from the EPDs of the selected categories and can be divided into two main categories: indicators for environmental impacts and indicators for resource use. All reported indicators in the EPDs were included in the statistical analysis. The indicators for the selected product categories differ due to the requirements in the underlying PCRs. Table 1 gives an overview of the data that were gathered for each of the product category groups.

Methods for the definition of performance classes
This study further examined the effect of a calculation method for the definition of a benchmark based on the frequency and span of data. Three different methods were used and presented to illustrate the impact of the calculation approach on the definition of the benchmark and performance classes. An uneven number of seven ranges was used for the classification of the distribution curves to allow for a centre class and to provide a finer division than in a three- or five-range classification, which is often used for benchmarking purposes (Galindro et al. 2019). A linear distribution of the data over the seven classes is applied, with the exception of the centre class, which contains twice the amount of data of the other classes, as a model to consider the higher density of data at the core of a benchmark, as given in a normal distribution. The centre of the benchmark and class is visualised as a blue-coloured range and contains 25% of the values of the dataset. The range with the lowest environmental impact is green-coloured, followed by a light-green and a light-blue coloured range towards the centre. A yellow, an orange and a red-coloured range indicate the areas with higher environmental impact compared to the blue-coloured range. Each of these classes contains at most 12.5% of the data. The different methods for the calculation and visualisation of the benchmark use different approaches for defining the centre of the benchmark, the dark-blue coloured range. The first method uses the average value of the distribution as the centre of the benchmark (average-based benchmark); the second one uses the median value of the distribution (median-based benchmark).
The third method makes use of a market-based approach, where the centre of the benchmark is defined as the area of the distribution in which the frequency of observations is highest, based on the prerequisite that the area contains at least 25% of the values of the whole distribution (market-based benchmark).

Table 1 Overview of the indicators gathered for each product category
Indicator (unit) | Insulation materials | Bakery products
Acidification potential (AP) (SO2 eq) | x | x
Eutrophication potential (EP) (PO4 eq) | x | x
Global warming potential (GWP) (CO2 eq) | x |
Global warming potential (GWP)-fossil contribution (CO2 eq) | | x
Global warming potential (GWP)-biogenic contribution (CO2 eq) | | x
Photochemical ozone creation potential (POCP) (C2H4 eq) | x | x
Ozone depletion potential (ODP) (CFC-11 eq) | x |
Abiotic depletion potential (mineral resource elements) (ADPE) (Sb eq) | x |
Abiotic depletion potential (non-renewable fossil energy resources) (ADPF) (MJ) | x |
Total use of renewable primary energy resources (PERT) (MJ) | x |
Total use of non-renewable primary energy resources (PENRT) (MJ) | x |
Non-renewable energy resources (NRER)-Coal (g) | | x
Non-renewable energy resources (NRER)-Oil (g) | | x
Non-renewable energy resources (NRER)-Natural gas (g) | | x
Non-renewable energy resources (NRER)-Other (g) | | x
Renewable energy resources (RER)-Hydroelectric (MJ) | | x
Renewable energy resources (RER)-Wind (MJ) | | x
Renewable energy resources (RER)-Solar (MJ) | | x
Use of net fresh water (water) (m3) | x |
Water consumption (water) (l) | | x
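The three ways of locating the centre of the benchmark can be illustrated with a short Python sketch. It is one possible reading of the description above rather than the exact procedure of the study: the market-based centre is assumed to be the shortest interval of sorted values that holds at least 25% of the observations, and the function names and sample values are hypothetical.

```python
import math
from statistics import mean, median


def market_based_centre(values, share=0.25):
    """Shortest interval that holds at least `share` of the observations."""
    x = sorted(values)
    k = max(2, math.ceil(share * len(x)))  # observations the centre must hold
    best = (x[0], x[k - 1])
    for start in range(1, len(x) - k + 1):
        window = (x[start], x[start + k - 1])
        if window[1] - window[0] < best[1] - best[0]:
            best = window  # a denser region: same count, shorter span
    return best


def benchmark_centres(values):
    """Centre of the benchmark under the three calculation methods."""
    return {
        "average_based": mean(values),
        "median_based": median(values),
        "market_based": market_based_centre(values),
    }


# Hypothetical, positively skewed indicator values (not EPD data)
sample = [4, 10, 11, 12, 13, 30, 50, 90, 120, 200, 240, 243]
print(benchmark_centres(sample))
```

For a positively skewed sample such as this one, the average lies well above the median, and the densest interval lies below both, which mirrors the shift of the centre class towards lower impacts described for the median-based and market-based benchmarks.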

Results
The statistical analyses and visualisation of the results in the form of distribution curves for the chosen product categories and life cycle stages were based on 54 EPDs for insulation products (containing 79 datasets) and on 49 EPDs for bakery products (containing 72 datasets). The number of datasets is higher than the number of EPDs due to the fact that some EPDs contain LCA results for more than one product, i.e. different product variations. A list of acronyms for the reported indicators is presented in Table 1.

Distribution and cumulative distribution curves
The shape of the distribution curves for most of the studied environmental indicators for insulation materials follows a similar pattern of being narrow and slightly positively skewed. An example of such a distribution curve is illustrated in Fig. 2 for the distribution of the impact on climate change for insulation materials. The distribution curves for all reported indicators are presented in Appendix Fig. 9. Most of the values for the environmental impacts of the products are positioned in a small interval with lower values for the environmental impact. However, the distribution curves also indicate that the span of the environmental performance is rather large compared to the area in which most of the observations are located. Based on the graphical illustration, the assumption of the data being normally distributed is unlikely to hold, due to the positive skew and the span of the distribution curve.
In contrast to the shape of the distribution curves for the environmental impact and resource use of insulation materials (as presented in Appendix Fig. 9), a common form for the shape of the distribution curves for 'bakery products' cannot be observed. In general, the distribution curves seem to follow a less positively skewed and narrow distribution curve. However, the distributions vary between impact categories and resource use, as illustrated in Appendix Fig. 10. The distribution curves of data for eutrophication and global warming potential (GWP)-fossil contribution closely resemble the shape of a normal distribution curve (Fig. 3 and Fig. 10). The majority of the distribution curves for bakery products differ significantly from a standard normal distribution curve, though the span of the distribution curves appears to be smaller than for insulation products.
One of the characteristics of a normal distribution curve is that approximately half of the values are lower than the average and that half of the values are higher. The area of the span from zero to the average of the distribution curve should therefore cover approximately 50% of the entire data. The remaining area of the span to the maximum value should cover the remaining 50% of the values. The cumulative distribution function of the data for GWP for insulation materials shows that this is not the case (Fig. 4). The span from the minimum of the distribution curve to the arithmetic average of the function (6.38 kg CO2-eq.) contains significantly more than 50% of the data from the whole distribution (over 63%). Over 59% of the data (values from 0.4 to 4.8 kg CO2-eq.) are located in less than 20% of the total span of the distribution (0.4 to 24.3 kg CO2-eq.).
The cumulative distribution function of the data for bakery products (Fig. 5) shows that the average of the distribution for the impact on climate change-fossil contribution (1608.1 kg CO2-eq.) lies approximately in the centre of the distribution.
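The two quantities used in this comparison, the share of observations at or below the average and the share of the total span covered by a sub-interval, can be computed as in the following sketch. The sample values are hypothetical; only the span end-points 0.4, 4.8 and 24.3 kg CO2-eq. are taken from the text above.

```python
from statistics import mean


def share_at_or_below(values, threshold):
    """Empirical cumulative distribution function evaluated at `threshold`."""
    return sum(v <= threshold for v in values) / len(values)


def span_fraction(low, upper, high):
    """Fraction of the total span (low to high) covered by low to upper."""
    return (upper - low) / (high - low)


# Hypothetical, positively skewed sample: most values lie below the average
sample = [1, 1, 2, 2, 3, 3, 4, 20]
print(share_at_or_below(sample, mean(sample)))

# Span figures reported for GWP of insulation materials (kg CO2-eq.)
print(round(span_fraction(0.4, 4.8, 24.3), 3))
```

With the reported end-points, the interval from 0.4 to 4.8 covers roughly 18% of the total span, consistent with the statement that it represents less than 20% of the span.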

Standard statistical indicators
The mid- and end-points, median and arithmetic average for the distributions of environmental performance of insulation products differ significantly, as shown in Table 2. The variations in the values strengthen the findings from the observations of the distribution curves that the data are not normally distributed. The span of the distribution for the studied indicators is relatively large, with spans varying from close to 50 times (GWP, ADPF) to more than 800 times (POCP, PERT, PENRT, Water) the minimum value for each indicator.
Standard statistical indicators for the analysed bakery products show similar patterns as for the visual analysis of the distribution curves (Table 3). The mid-point, median and average values show a close correlation for some indicators for environmental impacts and indicators for resource use. The span of the distribution curves is not as large as for insulation products, with many spans of a length of up to ten times the minimal value (GWP (f), GWP (b), NRER-Coal, NRER-Oil) and only two larger spans with more than 50 times the minimal value (RER-Hydroelectric, NRER-Other). This corresponds to the observations from the distribution curves, which indicate a smaller span of the distributions and some distributions following the shape of a normal distribution curve.

Anderson-Darling test
The p values of the Anderson-Darling test strengthen the findings from the visual observations and the standard statistical indicators (see Table 4). For insulation materials, the null hypothesis of the data being normally distributed can be rejected even at the six-sigma significance level; it is therefore highly unlikely that the studied data for insulation materials follow a normal distribution. For bakery products, the null hypothesis cannot be rejected at the 5% significance level for four out of thirteen indicators, which are therefore likely normally distributed. The p value for data for GWP-fossil contribution is highest (0.44), followed by NRER-Oil (0.09), NRER-Natural gas (0.08) and EP (0.06). At the lower significance level (0.01%), the null hypothesis of normally distributed data can additionally not be rejected for AP and POCP.

Methods for the calculation for the definition of a benchmark
The results of comparing different definitions for categorisation of a benchmark applied to the distribution of GWP for insulation materials are illustrated in Fig. 6. The first method for classification uses the average value (6.4 kg CO2-equivalents per functional unit) as the centre for the benchmark, indicated by the dark-blue coloured range. For the average-based benchmark, most of the observations from the dataset lie either in the dark-blue coloured range or in a range with a lower environmental impact. Using the median value (4.1 kg CO2-equivalents per functional unit) as the centre of the benchmark leads to a shift of the dark-blue coloured area to the left of the distribution curve, i.e. towards values with a lower environmental impact. Furthermore, the shift also introduces observations that are classified within the lowest ranking of the benchmark, the red-coloured area. The use of the market-based benchmark results in a further shift of the base and blue-coloured area towards values with a lower environmental impact. Most of the span becomes classified within the lowest ranking of the benchmark; the highest rankings, the green- and light-green-coloured areas, are no longer present in the classification.
The classifications based on average, median or market-based approaches for normally distributed indicators such as impact on climate change for bakery products differ significantly less than for insulation products (Fig. 7). The full range of classifications (green-coloured to red-coloured classes) is represented for all approaches and the shift of the classes is smaller between the different approaches.

Discussion
Apart from the availability of information of the statistical analyses and the definition and choice of categories for the studied products including the underlying LCA information, aspects such as the format and presentation of the resulting benchmarks and the use of results to facilitate decision-making are further discussed in this section.

Availability and quality of information for the statistical analysis
One challenge at the time of conducting this study was to compile data into a relevant format for statistical analyses. Due to different reporting formats of the available EPD-information, no standardised way of transferring the information into other file formats was found to conduct the statistical analyses in an automated way. Reporting formats for LCA- and EPD-information that cannot be analysed automatically, such as PDF files, can limit the transfer of information to other applications. Not all products on the market have EPD-information available, which limits the development of a benchmark to products that report the environmental performance and product groups with sufficient degrees of performance to conduct statistical analyses. The growing interest and demand for LCA- and EPD-information will likely increase the availability of environmental performance data on the market. Safeguarding a functioning, feasible and dynamic benchmark requires regular up-to-date analyses of the distribution curves in conjunction with increasing EPD datasets. Here, digitalisation of LCA- and EPD-information in the form of machine-readable EPDs facilitates the automation of this process (WG InData 2017; IES 2020). Another important aspect to consider when defining benchmarks is the quality of the underlying LCA information. As discussed by Erlandsson (2018), the quality assessment of the underlying data needs to be taken into account when used for benchmarking, comparison or any other purpose. Erlandsson (2018) suggests assessing the quality of LCA- and EPD-information based upon product comparability, manufacturing representativeness, data accuracy, third party review type and additional documentation specifications related to the methodological choices made in the LCA study. Third party verification of EPDs shall assure compliance with the PCR used for the underlying LCA calculations and the correctness of the results.
Distributions based on a limited number of datasets are sensitive to incorrect results and outliers and can influence the definition of a benchmark based on standard statistical indicators. Machine-readable EPD-information may facilitate identifying errors that might occur in the development of the LCA study by detecting outliers in a dataset.

Definition and choice of product categories
Two product categories and sub-categories were chosen for the statistical analyses carried out in this study. The choice of categories is motivated by the number of available datasets and the existence of common calculation rules in the form of a PCR, which made up the foundation for statistical analyses. Separating product groups into sub-divisions within PCRs (e.g. regarding mineral wool and foam boards for insulation materials) can affect the distribution of environmental performance within a product group and, hence, influence the definition of a benchmark. A lack of data for sub-divisions within selected sub-categories made it impossible to study the impact of alternative product groups within a PCR. The same applies to product categories with low amounts of LCA data. More data are needed to study the effect of this concept, called granularity of product categories, on the definition of benchmarks.

Format and presentation of the results and benchmark
The development of an indicator to interpret EPD-information regarding its position in relation to the studied market distribution and available benchmarks needs to consider discordant values and their impact on the average and median values. When defining a benchmark, the average or median values are considered representative if the data are normally distributed. Other types of distribution curves may require tailored approaches for the proper definition of a benchmark. An average- or median-based benchmark representing a positively skewed distribution is beneficial for products with a lower environmental performance, as the centre of the benchmark is shifted towards values of the span with a lower environmental performance. The proposed method for defining a market-based benchmark in this study relates to the cumulative distribution function and considers the spread and span of the distribution curve. The approach has the advantage of 'filtering out' single outliers as well as a group of outliers in a dataset. It also shifts the centre of the benchmark towards the area of the span with the highest density of observations. EPDs containing several datasets for the same product, but with different product variations, can therefore lead to a bias in the market-based benchmark approach. Assuming that the values of the environmental indicators for such product variations do not differ significantly, many datasets will most likely fall within a short span of the distribution. With an increasing number of EPDs and underlying datasets on the market, the market-based approach for the definition of a benchmark will be less sensitive to this type of bias. The classification of the benchmark is further affected by the choice of centring the benchmark on the blue-coloured range, instead of adjusting it to the end points of the distribution. The number of ranges chosen (i.e. seven) and their distribution over the span (i.e. linear) have an effect on the size and location of performance classes, and further assumptions may be studied to compare the sensitivity of the implications based on these choices.
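The idea of centring a benchmark on the densest part of the distribution and dividing it into seven equally wide classes can be sketched as follows. This is one possible reading of the approach and not the calculation used in this study: the narrowest interval holding half of the observations stands in for the central range, and the function names, the 50% share and the example values are all hypothetical.

```python
import math
import statistics


def densest_interval(values, share=0.5):
    """Narrowest interval that contains at least `share` of the observations."""
    data = sorted(values)
    k = max(2, math.ceil(share * len(data)))  # observations per sliding window
    windows = [(data[i + k - 1] - data[i], i) for i in range(len(data) - k + 1)]
    _, start = min(windows)
    return data[start], data[start + k - 1]


def classify(value, centre_low, centre_high, n_classes=7):
    """Class 1 (lowest impact) to n_classes (highest impact).

    Classes are equally wide (linear) and half-open, [low, high); the middle
    class spans the densest interval. Values beyond the outer classes are
    clipped to class 1 or n_classes.
    """
    width = centre_high - centre_low
    if width <= 0:
        raise ValueError("central range must have a positive width")
    middle = (n_classes + 1) // 2
    offset = math.floor((value - centre_low) / width)
    return min(n_classes, max(1, middle + offset))


# Hypothetical, positively skewed indicator results (not data from the study).
results = [10, 12, 13, 14, 15, 16, 20, 28, 45, 90]
low, high = densest_interval(results)
classes = [classify(v, low, high) for v in results]
print(low, high, classes)
print(statistics.mean(results), statistics.median(results))
```

In this made-up dataset the mean lies above seven of the ten values, which illustrates why a mean-centred benchmark in a positively skewed distribution places most products on the favourable side, whereas the two largest values do not move the densest interval at all.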
An attempt is made to present the results from the statistical analyses and the different methods used for the definition of a benchmark in a one-dimensional way, i.e. presenting the results of the performance for one environmental indicator at a time. Figure 8 illustrates an approach of combining environmental information for several indicators using mid- and end-points as well as a benchmark. Here, an example of a fictitious product compares the environmental performance across ten environmental indicators to a benchmark. Another approach is to make use of normalisation by linking the environmental performance to an external reference, such as legal frameworks, minimum levels for emissions or other references such as the concept of planetary boundaries (Rockström et al. 2009; Sandin et al. 2015; Ryberg et al. 2016).

Conclusions and outlook
The results of this study show that distributions and mid- and end-points of environmental performances of products differ significantly for the selected product categories and indicators. The data for insulation materials were not normally distributed, with more than half of the data centred within only 15% of the total span of the distribution. Some of the analysed data for bakery products are likely normally distributed, but most of the studied indicators are not. The development of a benchmark is emphasised as a tool to facilitate the interpretation and communication of LCA- and EPD-information, and the results suggest that the development and definition of a benchmark shall take into consideration the distribution of the underlying data. A common approach for these purposes, i.e. using median or average values for the development of benchmarks, can however lead to a classification of the environmental performance of products that favours products with a lower environmental performance. More datasets from EPDs are needed to analyse the distribution of product categories other than the ones included in this study before different types of applications for EPD-information can be generally recommended. Such data need to be comparable for a variety of market applications, predominantly for procurement activities, i.e. based on the same calculation rules and fulfilling a high quality level of the underlying data. Here, the development of machine-readable EPD-information and ongoing international efforts with digitalisation and automation of LCA calculations can facilitate the development and use of EPDs at a large scale while supporting the introduction of market-based dynamic benchmarks based on statistical analyses.
This study also suggests that more research and practical applications are needed on formats for communicating and presenting a benchmark for a variety of product groups. Decisions in the public and private domain are taken at various levels in an organisation, usually by persons with different levels of knowledge in the field of LCA and EPD. Hence, it is of vital importance to identify early to whom the information for specific decisions is directed, to enable the necessary adaptation of the reporting format to suit the intended audience. LCA- or EPD-information is seldom the only source of information that needs to be considered when decisions about environmental measures are about to be taken. Typically, three aspects have an impact on decisions: the first is linked to an organisation's basic needs that a decision is intended to cope with; the second is related to e.g. money, material assets and how the organisation wants to be seen by others; and the third is based on what might be considered 'correct' behaviour according to business norms, culture or strategies, etc. (Rex et al. 2019). To accommodate these aspects, different ways can be applied to increase the practical use of LCA- and EPD-information as follows:
• linking the information with recommendations,
• providing additional input supporting the desired environmental option, and
• addressing other aspects important for the decision-making.
To fully understand the rationale behind a decision is difficult. For many people, uncertainties are a significant factor that must be resolved for the decision process to have a successful outcome. Here, the statistical analysis of LCA- and EPD-datasets can provide the additional environmental information and other aspects indicated above to be helpful for different types of decisions. Results from this study could be used as potential examples to facilitate some of the decision-making situations described in Section 2. Within environmental management systems, this could make use of the height of midpoints between different datasets when identifying significant environmental aspects, ranges when setting environmental goals, and overall average data during the organisation's own purchasing procedures. For public procurement, ranges could be used when conducting market analysis, or variability when formulating procurement criteria and setting relevant goals. Uncertainties may provide decision support when assessing quality-assurance or certification of data and verification of the environmental performance, and overall average data could be used to guide follow-up activities according to contract specifications.