Evaluation and selection of functional diversity metrics with recommendations for their use in life cycle assessments

In life cycle assessments (LCAs), the focus of modelling the impact of human-induced pressures on biodiversity has been mainly on taxonomic diversity measures such as species richness. More recently, the increasing availability of trait data and the understanding that functional diversity is more directly related to human-induced pressures suggest functional diversity as a promising metric. One major challenge relates to the selection of the correct metric. Our purpose is to categorise and identify appropriate metrics of functional diversity for LCA model developers based on a justified choice of their structural properties and their links to human-induced pressures. We conducted a meta-analysis of literature to identify those functional diversity metrics that are frequently applied (not necessarily within LCA studies) and that possess a strong link to ecosystem functioning and human-induced pressures. We also provide a compilation of metrics that conform to important and desirable structural properties stipulated in the literature. By reconciling these key properties with the strength of the metric link, we make propositions for functional diversity use in LCA. To capture impacts on functional diversity, the combination of functional richness, evenness and divergence needs to be considered. The mean strength of the link for functional diversity metrics was highest for temperature rise and CO2 elevation, as related to climate change, and lower for eutrophication and land use change. Studies on impacts of water use change and other important human-induced pressures on functional diversity appear to be unavailable. When combined with desired structural properties such as independence and scale invariance, a combination of functional dendrogram (FRD), functional evenness (FEm) and functional logarithmic variance (FDvar) is preferred to comprehensively determine human impacts on biodiversity in LCAs. However, if a set of multi-dimensional components is sought, then the best option is functional volume (FRV), functional evenness (FEm) and functional divergence (FDm). Through this reconciliation of usage, mean strength and key properties, the LCA model developer is able to apply consistent and useful metrics in LCA studies.


Introduction
The term biodiversity is a contraction of biological diversity and can be simply defined as the sum of all biotic (animal and plant life) variation from the level of genes to ecosystems (Purvis and Hector 2000). Biodiversity loss can be driven by extrinsic processes such as climate change or tectonic movements. However, current changes in biodiversity result primarily from processes intrinsic to life on Earth, and almost exclusively from human activities. Human-induced pressures are affecting the earth's ecosystems, eliminating genes, species and biological traits at an alarming rate, in some cases irreversibly. The most important direct human-induced impacts on biodiversity are habitat destruction (Bawa and Dayanandan 1997; Tilman 2001), the introduction of alien species (Everett 2000; Levine 2000), overexploitation (Pauly et al. 2002; Hutchings and Reynolds 2004), disease (Daszak et al. 2001), pollution (Baillie et al. 2004), and climate change (Parmesan et al. 1999; McLaughlin et al. 2002; Walther et al. 2002). Until the effects of critical pressures are reduced, most declines seem likely to continue at the same or increased rates. While there is evidence that biodiversity loss has slowed, or that biodiversity is even recovering, in some habitats over the past few decades, there is substantial concern regarding the rate at which biodiversity loss will alter the functioning of ecosystems and services (Cardinale et al. 2012). In light of the above, there is an urgent need for scientific information to support policy makers and to ease the decision-making process behind conserving ecosystems.
By incorporating biodiversity loss estimates into modelling tools for conservation management and environmental risk assessment, we can make better conservation and restoration decisions, with the objective of maintaining biological diversity and the ecosystem services that this diversity provides. A powerful modelling technique which determines the environmental impacts of products, processes or services is Life Cycle Assessment (LCA; e.g. Guinée 2002). This modelling framework assesses the environmental pressures and related potential environmental impacts associated with all the stages of a product's life cycle from cradle to grave. Environmental impact categories include climate change with global warming potential as the corresponding characterisation factor (Plevin et al. 2013; Nakano 2015), acidification (Huijbregts et al. 2000; Kim and Chae 2016), human toxicity (Hertwich et al. 2001; Juraske et al. 2009), water depletion (Pfister et al. 2009; Finkbeiner et al. 2010), resource depletion (Klinglmair et al. 2014) etc.
Impacts on biodiversity have been one of the most challenging categories to incorporate in life cycle impact assessment (LCIA, see e.g. Curran et al. 2011; De Souza et al. 2015). The Millennium Ecosystem Assessment (MEA 2005) identified various drivers for biodiversity loss, of which the most important are (a) terrestrial and aquatic habitat change, (b) invasive species, (c) pollution, (d) climate change and (e) over-exploitation. Some LCIA methods attempt to assess impacts like global warming (e.g. De Schryver et al. 2009; Wilting et al. 2017) and freshwater use (e.g. Pfister et al. 2009; Verones et al. 2017) on biodiversity, but the vast majority of LCIA approaches look at biodiversity impacts by land use and land use change only. An important milestone was the publication of a special issue of this journal on Global land use impacts on biodiversity and ecosystem services in LCA that included, among others, papers on related UNEP-SETAC guidelines, land use impacts on biotic production (compare Taelman et al. 2016), land use impacts on ecosystem services like freshwater and erosion regulation, and land use impacts on functional diversity. However, approaches which analyse the impacts of land use occupation and change on species abundance or species richness, usually by applying species-area relationships (SARs), dominate the current state of the art in LCIA. Recent examples of studies presenting such indicators include de Baan et al. (2014), Chaudhary et al. (2015) and Wilting et al. (2017). In the case of non-land use impacts, the link between species richness and anthropogenic influences is weak at best.
In addition to traditional biodiversity measures such as species richness and phylogenetic diversity measures (e.g. Faith 1992), the notion and use of functional diversity (i.e. the diversity of plant species traits in ecosystems) has emerged, particularly over the last few decades, as a measure of biodiversity (Kattge et al. 2011). Typical examples of plant traits include leaf dry mass, leaf area, rooting depth, maximum growth rate and leaf nitrogen concentration (Petchey and Gaston 2002; Mason et al. 2010), while animal traits include body size, wing size (for birds and insects) and respiration rates (Tarka et al. 2010; Laine et al. 2013). Many studies have shown that functional diversity is one of the best available predictors of ecosystem functioning, providing a strong and direct link to ecosystem functioning (Petchey et al. 2004; Cadotte et al. 2009; Flynn et al. 2011). The reason is that it is not the number of species but the number of traits that directly relates to ecosystem functioning (Díaz and Cabido 2001; Flynn et al. 2009; Mouchet et al. 2010; Petchey and Gaston 2006).
Indeed, ecological literature (e.g. Fukami et al. 2005) has shown that functional diversity responds more consistently to environmental drivers than species richness. It is precisely these arguments that we consider to justify the incorporation of functional diversity when assessing biodiversity in an LCA. Using functional diversity instead of species richness will make the impact assessment more certain and hence more clearly dedicated to biodiversity impacts. Unfortunately, the use of functional diversity metrics in LCA is highly limited. Only recently, a specific model has been introduced by De based on data from North and South America. The functional diversity metric (see FRD metric (1.4) in Appendix Table A1, Electronic Supplementary Material) proposed by Petchey and Gaston (2002) and Mouchet et al. (2008) was implemented. Although this model has not been made operational due to the lack of global characterisation factors, it can be applied to evaluate the impact of land occupation (at least in North and South America, with possible extension to global regions).
We argue that, through its relationship with various environmental drivers and human impacts thereon, functional diversity may provide a more generic tool for assessing environmental impacts on biodiversity in LCA. To achieve a more extensive use of functional diversity metrics in LCA, the first challenge is to select the most appropriate metrics that generically and quantitatively define the relationships between inventory flows and functional diversity. However, ever since the importance of functional diversity has been realised, a wide variety of metrics have been developed (Appendix Table A1), and a general consensus regarding the most appropriate measure is still lacking (Petchey et al. 2009). This problem has been amplified by the strong increase in the number of available metrics over recent years. Hence, there is a growing need to categorise and identify appropriate metrics to guide LCA model developers to choose meaningful metrics for the purpose of LCA. A major issue is that the novice user, whose interest lies in environmental impacts, risks choosing an inaccurate metric or combination of metrics without sufficient justification. In summary, functional diversity has been promoted as a promising metric in LCA, but information on correct metric choice and real-world applications is hitherto lacking. Therefore, in this study, we aim to examine the properties with additional comments on metric quality and behaviour, possible drawbacks, constraints and limitations, see Appendix Table A3 (Electronic Supplementary Material). Building on the excellent framework originally formulated by Schleuter et al. (2010) and by extending this review, we identified the following aims:
(i) To identify metrics which are frequently applied in scientific literature and highlight those with greater explanatory power (i.e. link to ecosystem functioning)
(ii) To categorise functional diversity metrics according to important and desirable properties
(iii) To provide the reader with further informed and justified recommendations to ease the selection process of functional diversity metric(s) based on reconciliation of the above objectives
With respect to the first aim, we performed a meta-analysis of literature that either directly incorporated or mentioned functional diversity metrics in relation to ecosystem functioning and services in the context of induced pressures, and paid special attention to a quantifiable description of the strength of the link. For this analysis, our study is confined to well developed and commonly used metrics, see Appendix Table A1 (Electronic Supplementary Material). For the second aim, we reconciled the information on desirable properties, frequent use and explanatory power. In combination, we discuss and select the functional diversity metric(s) that are most suitable for incorporation into LCAs.

Functional diversity metrics
For functional diversity to be meaningful and worth measuring, it must be related to human-induced pressures included in LCAs, and it should provide information above and beyond what species richness can explain. Functional diversity is measured in a multitude of ways; technically, it represents the diversity of traits, but it is taken to represent the diversity of species niches in trait space (Petchey et al. 2004; McGill et al. 2006; Petchey and Gaston 2006; Villéger et al. 2008). While the use of functional diversity metrics presupposes a mechanistic link between diversity and the ecological phenomena in question, which indeed has been proven in experimental settings (e.g. Fukami et al. 2005; Heemsbergen et al. 2004), a systematic review of field studies aiming at establishing this link is so far lacking.

Description of metrics
Functional diversity metrics can be one-dimensional (1D), i.e. incorporating a single functional trait (e.g. functional logarithmic variance, see (3.1) in Appendix A1). More often though, a multi-dimensional/variate (MD) metric is applied (e.g. functional volume, see (1.3) in Appendix A1). For a multi-dimensional metric, each co-ordinate corresponds to a measured trait and each point represents the position of an individual or a species in trait space. A full comprehensive list can be found in Appendix Table A1 (Electronic Supplementary Material). Whether it is better to use a single trait or to combine several traits depends on the ecological context (e.g. Butterfield and Suding 2013). Schleuter et al. (2010) argue that multivariate metrics are preferable since studies are more informative when the distribution of species is represented in a multi-dimensional trait space. Petchey et al. (2004) reanalysed six biodiversity ecosystem functioning experiments and found that multivariate metrics explained variation in ecosystem function better. However, there is no principal argument that justifies the preference of either, and neither can conclusions be drawn from field tests. One may argue that the strategies of species are always composed of multiple axes and hence multiple trait combinations. Thus, to express functional diversity properly, multiple dimensions are required. On the other hand, one may also argue that because of multiple strategy axes, and given that each axis is probably driven by a specific environmental pressure, functional diversity metrics of multiple dimensions will not be able to form a strong link to specific human impacts. For our purposes, we suggest that it would be improper to dismiss a metric based on its dimensionality; therefore, our study comprises both one-dimensional and multivariate metrics.
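The difference between a one-dimensional and a multi-dimensional richness metric can be illustrated with a small sketch. It assumes that a one-dimensional richness metric is taken as the range of a single trait and that functional volume is taken as the convex hull of the species in trait space (here restricted to two traits, so the "volume" is an area). The function names and the trait values are hypothetical and chosen only for illustration; they are not taken from Appendix Table A1.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (trait 1, trait 2) for one species


def functional_range(trait_values: List[float]) -> float:
    """1D richness sketch: spread of a single trait across species."""
    return max(trait_values) - min(trait_values)


def _cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Andrew's monotone chain; returns hull vertices counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and _cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and _cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def hull_area(points: List[Point]) -> float:
    """MD richness sketch: area of the convex hull in 2D trait space."""
    hull = convex_hull(points)
    if len(hull) < 3:
        return 0.0
    twice_area = 0.0
    for i in range(len(hull)):
        x1, y1 = hull[i]
        x2, y2 = hull[(i + 1) % len(hull)]
        twice_area += x1 * y2 - x2 * y1  # shoelace formula
    return abs(twice_area) / 2.0


# Hypothetical community: five species described by two traits.
community = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1)]
area_all = hull_area(community)
# Losing the species at (4, 3) shrinks the occupied trait space.
area_reduced = hull_area([p for p in community if p != (4, 3)])
# Changing the unit of trait 1 (factor 10) changes the hull area.
area_rescaled = hull_area([(10 * x, y) for x, y in community])
range_trait1 = functional_range([x for x, _ in community])
```

In this sketch the hull area falls when a species on the edge of trait space is removed, and it changes when a trait is expressed in different units, which is why data are usually standardised before such a metric is computed.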
In principle, seeking to aggregate information into a single metric would be most desirable. However, in the case of functional diversity, such a metric providing complete information does not exist. This is not unique to biodiversity, and other impact categories in LCA are also characterised by multiple metrics. Functional diversity, like biodiversity, is a multifaceted entity. In line therewith, Mason et al. (2005) strongly argue that it is not possible to completely represent the diversity of a community in a single index, and that the multiple facets of functional diversity, as well as associated impacts, should instead be captured by using multiple independent metrics. Mason et al. (2005) decomposed biodiversity into three distinct components, each of which can be quantified and linked to a different (independent) facet of functional diversity, namely functional richness (FR), functional evenness (FE) and functional divergence (FD). A similar view is indicated by Ludwig and Reynolds (1988) and Purvis and Hector (2000). The three components are illustrated in Fig. 1 (see also Appendix Table A1, Electronic Supplementary Material).
In this study, the dependence of metrics will be assessed through analysis of correlations (based on literature as described in "Section 3"). In addition, the independence of the metrics from species richness and evenness, essential to obtain orthogonal information, is evaluated.
A wide variety of functional diversity metrics is presented in the literature (Rao 1982; Villéger et al. 2008; Mouillot et al. 2005; Mason et al. 2003). Reviews of functional diversity metrics are presented by Schleuter et al. (2010) and Mouchet et al. (2010). These reviews did not include the metric of functional dispersion from multiple traits, as proposed by Laliberté and Legendre (2010). Functional dispersion is an extension of the original framework of Villéger et al. (2008), which has been generalised to a highly flexible distance-based framework for any distance or dissimilarity measure and multiple traits of different types, allowing for missing trait values and weighting of individual traits. Also, Blonder et al. (2014) proposed the n-dimensional hypervolume, a generalisation of the convex hull concept that allows for gaps in the convex hull. A comprehensive list with all functional diversity metrics currently available from the above compilation can be found in Appendix Table A1 (Electronic Supplementary Material). Functional dispersion and the n-dimensional hypervolume have been included in this list to provide the reader with a comprehensive view of the different types of metrics. In particular, these metrics were developed recently and have been shown to be promising in applications. However, due to lack of information on metric structural properties (see "Section 3") and their link to ecosystem functioning (see "Section 2.3"), these metrics are omitted from those analyses. We will return to these metrics in "Section 4" when discussing limitations.

Application of metrics to human-induced pressures
Although it is generally understood that functional diversity is linked to ecosystem functioning and services, it is still unclear whether a link can be derived from human-induced pressures. In particular, the following question arises: which functional diversity metrics provide a greater explanatory power on impacts of these drivers? To answer this question, we will focus on studies for functional diversity of plant communities, although in principle functional diversity metrics are applicable to all taxa. Given that each taxon has different functional traits, however, functional diversity will have to be calculated for each taxon separately (in analogy to species richness of plants and insects which cannot be combined in one metric), and it is likely that some taxa will be more sensitive than others with respect to some impact categories. So far, the focus of most studies has been plant based, particularly due to the large amount of data readily available (such as the TRY Global database of plant traits with millions of records on plant traits on many taxonomic groups and on a global scale, Kattge et al. 2011, also see Díaz et al. 2016).
Fig. 1 Illustration of the concepts of functional diversity and those differences (low/high) between a functional richness FR, b functional evenness FE and c functional divergence FD. (Reprinted from Carmona et al. 2016)
A study conducted by the Royal Botanic Gardens (Kew, UK) states that the impact of humanity far outweighs natural threats to plant species, accounting for approximately 81.3% of threats. Typical human-induced pressures such as residential or commercial development, commercial agriculture, wood plantations etc. can be broadly categorised as land use/land use change or land occupation/transformation. Other human-induced pressures on biodiversity include eutrophication caused by N and P emissions, ecotoxicological effects due to emissions of toxic substances, climate change caused by greenhouse gas emissions and water use/abstraction, which like land use change can be easily placed in the framework of life cycle impact assessment (compare De Schryver et al. 2009 for climate change and Pfister et al. 2009 and Verones et al. 2017; see also Koellner et al. 2013). In Table 1, we review the recent studies which utilise functional diversity metrics to evaluate the threats of various human-induced pressures that relate well to impact categories in LCA.
Most studies in Table 1 refer to land use change and eutrophication. This is not surprising since land use change is the main threat amongst pressures (see De Souza et al. 2015). According to the Millennium Ecosystem Assessment (MEA 2005), land use change has had the highest impact of all pressures on biodiversity. Even so, some studies indicate that climate change may be the biggest pressure, and climate effects are currently significant and forecasted to be an emerging major threat (Scheffers et al. 2016). Some studies have suggested that over the next few decades, climate change could surpass land use change as the greatest global threat to plant life (Leadley et al. 2010; Bellard et al. 2012). However, no quantitative assessment exists to support this claim at global scales and only few studies (as shown in Table 1) refer to CO2 elevation, temperature rise or water use. No studies were found that related functional diversity metrics to ecotoxicity or acidification. The frequent use of particular functional diversity metrics does not, however, directly imply that they are most suitable. Most studies do not provide a convincing justification and do not quantify the strength of the relationship between functional diversity metrics and human-induced pressures.

Strength of metric link to human-induced pressures
Several studies determined the links between functional diversity metrics and ecosystem function or allowed the link to be determined. From the list of metrics in Appendix Table A1 (Electronic Supplementary Material), only the richness metrics FRV and FRD, the evenness metric FEm and the divergence metrics FDvar, FDQ and FDm were found to have a quantifiable link. Figure 2 presents a circular chart where the area of each circle depicts the strength of metric link against pressure via average (squared) ordinary least squares (OLS) regression coefficients $R_i^2$. Note that no study was found which quantified the link between functional diversity and water use using $R_i^2$; hence, water use is missing from Fig. 2. Some studies, namely Mason et al. (2010), Pakeman (2011) and Dubuis et al. (2013), directly provided the effect size. Other studies presented the relationship between the chosen functional diversity metric and human impacts using descriptive statistics, which is not useful for a comparative study. In the case of quantitative measures, there is no unified approach, in the sense that the link between functional diversity metric and pressures is either described qualitatively or tested using a variety of statistical measures. Consequently, any indication of the mean strength of the relationship is uncertain. For comparison purposes, our study forms a compilation of regression coefficients $R_i^2$ extracted from literature. The average value $\langle R_i^2 \rangle$ is then calculated for each pressure (sum of $R_i^2$ values/number of studies which report $R_i^2$ values) as a representative measure to determine the strength of the link between functional diversity metric and pressure. Then the overall mean strength γ is found by averaging $\langle R_i^2 \rangle$ across all pressures (see Appendix Table A2, Electronic Supplementary Material).
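The two-step averaging described above can be sketched as follows. This is a minimal illustration and not the procedure used to produce Appendix Table A2: the function names are hypothetical, the R² values are invented placeholders, and it is assumed that pressures without any reported R² value are left out of the overall mean.

```python
from statistics import mean
from typing import Dict, List


def mean_strength_per_pressure(r2_by_pressure: Dict[str, List[float]]) -> Dict[str, float]:
    """Average R^2 per pressure: sum of reported values / number of reports.

    Pressures without any reported R^2 value are skipped (assumption).
    """
    return {p: mean(values) for p, values in r2_by_pressure.items() if values}


def overall_strength(r2_by_pressure: Dict[str, List[float]]) -> float:
    """Overall mean strength (gamma): mean of the per-pressure averages.

    Returns 0.0 when no pressure has a quantified link, so that an unknown
    link is treated like no link.
    """
    per_pressure = mean_strength_per_pressure(r2_by_pressure)
    if not per_pressure:
        return 0.0
    return sum(per_pressure.values()) / len(per_pressure)


# Placeholder R^2 values for one metric (illustrative only, not from the literature).
example = {
    "eutrophication": [0.4, 0.6],
    "CO2 elevation": [0.7],
    "water use": [],  # no study quantified this link
}
per_pressure = mean_strength_per_pressure(example)
gamma = overall_strength(example)
```

Averaging per pressure first prevents a pressure that happens to be covered by many studies from dominating the overall value.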
For the mean values γ, we did not expect strong relationships for all metrics listed in Appendix Table A2 (Electronic Supplementary Material), as the mechanisms affecting functional diversity may differ between pressures. In that case, we would expect pressures associated with changes in niche space to relate to functional richness metrics, while pressures affecting competition would relate more strongly to functional evenness metrics. Also note that the absence of studies for a metric does not necessarily imply that the metric is useless.
Most studies represented in Fig. 2 incorporated multi-dimensional metrics with the exception of FDvar, whose usage is found in Mason et al. (2003) and Conti and Diaz (2013). With reference to Appendix Table A2 (Electronic Supplementary Material), the most commonly used metrics across studies are FRV, FEm, FDvar and FDQ. We can group the metric link with human-induced pressures according to strength by introducing four classes using the mean values γ, that is: (i) strong link (from either multiple or single study) if 0.5 < γ ≤ 1, (ii) moderate link if 0.25 < γ ≤ 0.5, (iii) weak link if 0 < γ ≤ 0.25, and (iv) no or unknown link if γ = 0. Here, γ = 1 would represent a link to ecosystem functioning of maximum strength (an unrealistic case in the real world). Hypothetically, those metrics which are classed with unknown links could in reality have a link, but since this is unknown, we treat these cases the same as those with no links, thus γ = 0. In the case of FDvar, a strong relationship γ = 0.533 is found when averaging over multiple recordings (N = 13 in total) and two pressures (i.e. eutrophication and CO2 elevation), see Mason et al. (2003) and Conti and Diaz (2013). Stronger relationships are only
Table 1 Studies applying functional diversity metrics to human-induced pressures (table not recoverable from the source; surviving fragments include rows for (3.2) Fnc. variance (modified) and (3.7) Fnc. dispersion, citing Garnier and Navas (2012), Pakeman (2011) and Raevel et al. (2012))
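The class boundaries given in the text translate directly into a small helper. The function name is hypothetical; the thresholds and the value γ = 0.533 for FDvar are those stated above.

```python
def classify_link(gamma: float) -> str:
    """Map an overall mean strength (gamma) to the link classes in the text."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma is a mean of R^2 values and must lie in [0, 1]")
    if gamma > 0.5:
        return "strong"
    if gamma > 0.25:
        return "moderate"
    if gamma > 0.0:
        return "weak"
    # gamma == 0: no link, or link unknown (both are treated alike)
    return "none or unknown"


fdvar_class = classify_link(0.533)  # value reported for FDvar
```

Note that the upper bound of each class is inclusive, so a value of exactly 0.5 falls in the moderate class.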

Structural properties of metrics
Next to a link to human-induced pressures, selected functional diversity metrics should have relevant properties which are both important and desirable. There has been some discussion in the literature on the important and desirable structural properties of metrics. Several authors have identified these properties through experimental design, testing and simulations of artificial data sets (Solow and Polasky 1994; Mason et al. 2003; Ricotta 2005; Mouillot et al. 2005; Villéger et al. 2008; Schleuter et al. 2010). However, such studies are few. Also, there is some controversy over the statistical validity of these metrics (Petchey and Gaston 2007; Podani and Schmera 2007); an illustration of this can be found in Petchey and Gaston (2002). Some studies have incorporated theoretical tests to assess the metric quality or accuracy. For example, Schleuter et al. (2010) tested five distinct artificial scenarios of exemplary datasets for information on metric behaviour (see Appendix Table A3, Electronic Supplementary Material), with the key objective to test whether the metrics behave according to design. Mouillot et al. (2005) provided a theoretical study for FE metrics, testing whether predefined properties are satisfied. Mason et al. (2003) developed a set of ten criteria for ecological use (rather than a mathematical treatment) to assess functional diversity. We have highlighted key properties from literature such as (B) set monotonicity, (C) trait scale invariance (also known as monotonicity in distance), (D) twinning criterion, (E) response to empty space and (F) symmetry as those which are most relevant. Also, answers to some other pertinent questions regarding (G) correlation (i.e. independence or orthogonality), (A) dimensionality, and whether the metric conforms to design must be understood (see also Appendix A3, Electronic Supplementary Material). A description and reasoning of these relevant properties can be found in Table 2, alongside literature references. Table 3 shows whether the metrics conform to those properties described in Table 2. Each of these structural properties is considered with equal weighting, i.e. identically in terms of importance.
Fig. 2 (caption fragment) ... (sum of $\langle R_i^2 \rangle$ values/no. of pressures for which a link was found), and is indicative of the overall strength of metric link to ecosystem functioning. The introduced pressures are in line with the DPSIR framework
In our evaluation of Table 3, we discuss the properties associated with metrics of richness, evenness and divergence separately. These three components of functional diversity have been suggested to be (partly) independent of each other. Such independence is important if more than one metric is applied in a particular study, to ensure that orthogonal information is obtained. Hence, the actual independence from other metrics is also accounted for in our evaluation. As an example of a redundant choice, it would not make sense to select the divergence metric FDvar alongside FRR or FRV in the same metric set, as these metrics are correlated.
We find that all FR metrics satisfied set monotonicity, as normally expected of richness metrics. Of the one-dimensional FR metrics, FRIs satisfies more (known) requirements in comparison to FRR. Although both metrics satisfy set monotonicity and trait scale invariance, FRIs has the advantage that it responds well to empty space and is uncorrelated with any other metrics. Trait scale invariance is an important property required in order to avoid transformation or standardisation of data (Schleuter et al. 2010). Amongst the FE metrics, it is unclear which is more suitable, since both FEs and FEm satisfy a mixture of properties. FEm satisfies neither set monotonicity nor the twinning criterion, and these properties are unknown for FEs. Neither metric is correlated with other metrics. Further testing is required for FEs, and simply dismissing this metric on the basis that it is one-dimensional does not suffice. From the one-dimensional FD metrics we find that FDvar is one of the strongest candidates due to the largest number of properties satisfied, but it is correlated with FRR and FRV, which is undesired. FDσ and FDs satisfy a mixture of properties, that is, FDσ satisfies set monotonicity but not trait scale invariance. For FDs, the opposite is recorded; the metric satisfies trait scale invariance but not set monotonicity. The latter has the further disadvantage that it is correlated with FRV, and both of these metrics have one or more undesired properties.
Of the multi-dimensional FR metrics, FRV does not satisfy trait scale invariance and is also heavily correlated with other metrics; therefore, it should not be used alongside FDvar, FDs or FDQ. FRD does not satisfy the twinning criterion, nor trait scale invariance, and does not respond well to empty gaps in trait space, whereas FRIm does respond well. FRV and FRD have their limitations and neither of these satisfies all properties.

Table 2 Key structural properties of functional diversity metrics, with literature references
Property | References | Description
(B) Set monotonicity | | If all traits are declining in a system, the metric is monotonically decreasing. This includes the desired behaviour of the metric when traits disappear from the ecosystem; the metric then should decline as well. Similar but opposite behaviour is expected for a metric describing increases instead of declines
(C) Trait scale invariance (or Monotonicity in distance) | Solow and Polasky (1994); Mason et al. (2003); Mouillot et al. (2005); Ricotta (2005); Villéger et al. (2008) | The trait scale should exhibit invariant properties in the sense that it is unaffected by the units in which the trait is measured. Solow and Polasky (1994) described this property such that diversity should not be decreased by an arbitrary increase in the distances between traits. A stronger version of this property was advocated by Mason et al. (2003), known as monotonicity in distance. This requires that diversity should be unaffected by the units in which functional traits are measured. This property is essential for any trait that could be measured on more than one scale
(D) Twinning criterion | Weitzman (1992); Solow and Polasky (1994); Mason et al. (2003, 2005); Ricotta (2005); Villéger et al. (2008) | Diversity should not increase with the addition of a species which is functionally identical, i.e. diversity should be unaffected when a species is split into two species with the same trait values and same total abundance
(E) Response to empty space | Schleuter et al. (2010) | Metric should reflect the expected changes when empty space is present in the trait distribution of a community
(F) Symmetry | Mouillot et al. (2005); Laliberté and Legendre (2010) | Metric is symmetric with regard to small and large trait values
(G) Correlation | Schleuter et al. (2010) | Metric is correlated with other metrics. In order to obtain orthogonal information the metrics must be independent
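Two of the properties listed in Table 2, trait scale invariance and the twinning criterion, can be checked numerically for a given metric. The sketch below does this for functional logarithmic variance, assuming the form in which this metric is commonly written following Mason et al. (2003): the abundance-weighted variance of log-transformed trait values, rescaled with an arctangent so that it lies between 0 and 1. This form is an assumption here, since Appendix Table A1 is not reproduced in this text, and the trait values and abundances are invented for illustration.

```python
import math
from typing import Sequence


def fd_var(traits: Sequence[float], abundances: Sequence[float]) -> float:
    """Functional logarithmic variance (assumed form, after Mason et al. 2003).

    V = sum_i w_i * (ln x_i - weighted mean of ln x)^2, with w_i the
    relative abundance of species i; the metric is (2 / pi) * atan(5 * V).
    """
    total = sum(abundances)
    weights = [a / total for a in abundances]
    logs = [math.log(x) for x in traits]  # trait values must be positive
    mean_log = sum(w * l for w, l in zip(weights, logs))
    variance = sum(w * (l - mean_log) ** 2 for w, l in zip(weights, logs))
    return (2.0 / math.pi) * math.atan(5.0 * variance)


# Hypothetical community: one trait, three species.
traits = [1.0, 2.0, 4.0]
abundances = [1.0, 1.0, 2.0]
base = fd_var(traits, abundances)

# (C) Trait scale invariance: change the unit of measurement (factor 1000).
rescaled = fd_var([1000.0 * x for x in traits], abundances)

# (D) Twinning criterion: split the third species into two identical
# species that share its total abundance.
twinned = fd_var([1.0, 2.0, 4.0, 4.0], [1.0, 1.0, 1.0, 1.0])

# For contrast, a plain trait range depends on the unit of measurement.
range_base = max(traits) - min(traits)
range_rescaled = 1000.0 * max(traits) - 1000.0 * min(traits)
```

Under the assumed form, a change of units only shifts every log-transformed value by the same constant, which leaves the weighted variance unchanged, and splitting a species only splits its weight between two identical terms.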
Trait scale invariance eases the calculation of characterisation factors, which are needed to link pressures to the metric, and thus used to assess impacts in LCA (e.g. FR D is not trait scale invariant, and therefore De Souza et al. 2013 standardised metric values to calculate characterisation factors for land use impacts). It is unclear whether FR Im performs better, a study is required to test this metric for trait scale invariance and the twinning criterion. Each of the multivariate FR metrics thus has some undesired properties. All multi-dimensional FD metrics satisfy the twinning criterion. A drawback for FD is is that it does not satisfy trait scale invariance whereas FD m does, and it is unclear for FD Q . The n-dimensional hypervolume metric introduced by Blonder et al. (2014) resolves this issue, and behaves well with respect to empty space or missing trait data and seems very promising. However, this is a relatively new metric which has yet to undergo the assessment stipulated by Table 3 and it is still unclear whether this metric satisfies other properties. In addition to the n-dimensional hypervolume, other recent developments include; Range box (Qiao et al. 2017), Minimum ellipse (Swanson et al. 2015), Dynamic range box (Junker et al. 2016) and Probabilistic hypervolume (Carmona et al. 2016). However, these metrics also have yet to undergo stringent tests to reveal whether they conform to important properties or have a link to ecosystem functioning. Therefore, we excluded these metrics from our assessment in Table 3. To summarise the above for our recommendations: I. Richness: FR Is is an uncorrelated metric which satisfies more properties than FR R and therefore is better suited. FR V can be incorporated into the list provided it is not selected simultaneously with those divergence metrics which it is heavily correlated with, namely FD var , FD s or FD Q . FR D and FR Im are also other suitable candidates. 
Here, 1D/MD indicates whether the metric is one- or multi-dimensional, respectively. 'Yes' denotes that the metric conforms to the corresponding property described in Table 2, and 'no' that it does not. For correlation, 'yes' denotes that there is significant evidence that the metric is correlated with other metrics, whereas 'no' signifies independence. Those metrics which are correlated with species richness (SR) have been highlighted. For FR metrics, this is naturally the case due to construction. Note that the assessment is not applicable for metric (3.1) Fnc. Unalikeability (used for categorical traits), shown with *. We have included this metric for completeness. Blank spaces indicate that information on whether the metric conforms to the property either could not be found in the literature or is not applicable. A detailed version with comments on metric quality, behaviour, constraints and limitations can be found in Appendix

II. Evenness: It is somewhat unclear which evenness metric is more suitable based on an assessment of properties; by default, we include both FE s and FE m.
III. Divergence: Amongst the one-dimensional metrics, FD var satisfies most properties and is therefore preferred over FD σ or FD s (i.e. the preferable metric would need to satisfy a relatively much larger number of properties to outperform other metrics). The multi-dimensional metrics FD Q, FD m and FD is are less-preferred potential candidates, since they satisfy a mixture of properties.
Based on this evaluation, the following one-dimensional metrics FR Is , FE s , FD var and multi-dimensional metrics FR V , FR D , FR Im , FE m , FD Q , FD m , FD is can be categorised as important and desirable, following a line of reasoning based on structural properties.
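The trait scale invariance and twinning properties discussed above can be checked numerically for a given metric. The following is a minimal sketch for FD var only, assuming the functional logarithmic variance takes the form FDvar = (2/π) arctan(5V), with V the abundance-weighted variance of log-transformed trait values, as we understand the definition of Mason et al. (2003). The function name, trait values and abundances are illustrative assumptions and are not data from this study.

```python
import math


def fd_var(traits, abundances):
    """Functional logarithmic variance for a single, strictly positive trait.

    Assumed form (after Mason et al. 2003): (2/pi) * atan(5 * V), where V is
    the abundance-weighted variance of the natural log of the trait values.
    """
    total = sum(abundances)
    weights = [a / total for a in abundances]
    log_traits = [math.log(x) for x in traits]
    mean_log = sum(w * lx for w, lx in zip(weights, log_traits))
    variance = sum(w * (lx - mean_log) ** 2 for w, lx in zip(weights, log_traits))
    return (2.0 / math.pi) * math.atan(5.0 * variance)


# Illustrative community: three species, one trait (hypothetical values).
traits = [2.0, 5.0, 11.0]
abundances = [10.0, 30.0, 60.0]
base = fd_var(traits, abundances)

# Trait scale invariance: express the same trait in different units.
rescaled = fd_var([x * 1000.0 for x in traits], abundances)

# Twinning criterion: split the second species into two identical species
# that share the original abundance.
twinned = fd_var([2.0, 5.0, 5.0, 11.0], [10.0, 15.0, 15.0, 60.0])

print(math.isclose(base, rescaled), math.isclose(base, twinned))
```

Because the log transform turns a change of units into a constant shift that cancels in the variance, and because the abundance weights of twinned species add up to the original weight, both checks should hold up to floating-point error. The same pattern could be applied to other metrics by swapping in a different function.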

Discussion
In our analysis of the most suitable metrics, we have identified those metrics which have a link to human-induced pressures ("Section 2.3") and satisfy desirable properties ("Section 3"). While many metrics of functional diversity have been published, a general consensus is still lacking as to exactly what the metrics quantify, how redundant they are and which ones are most suitable for application (Mouchet et al. 2010). Summarising a large data set into a single diversity figure results in a loss of information; therefore, a perfect measure of functional diversity does not exist. In fact, it is neither possible nor desirable to sum up all the aspects of functional diversity in a single number (Ludwig and Reynolds 1988; Ricotta 2005). Mason et al. (2005) proposed a framework in which functional diversity is best described by a set of three independent and complementary components rather than a single metric; namely, functional richness, functional evenness and functional divergence (FR, FE, FD). The motivation behind this framework is that each component describes a different aspect of functional diversity (see Fig. 1). This view is also supported by Mouillot et al. (2005), and similar ideas appear in Ludwig and Reynolds (1988) and Purvis and Hector (2000) before Mason et al. (2005) formalised a definition. Pakeman (2011) also highlights that there is a theoretical basis for measuring functional diversity in this way. The use of these three components will allow the differential impacts on multiple aspects of functional diversity to be estimated, and will aid ecologists in examining the mechanisms behind ecosystem functioning. In the context of LCA, the decision maker will have a set of three metrics at their disposal, describing each component of functional diversity for a single impact category.
Here, we followed this distinction into three categories to provide a comprehensive framework for the quantification of functional diversity in trait space and set out to choose a metric set of independent components.
To enable this choice, Table 4 summarises the findings on the ten metrics FR Is, FE s, FD var, FR V, FR D, FR Im, FE m, FD Q, FD m and FD is that have been related to human-induced pressures. Within Table 4, we ranked metrics according to mean strength γ and the number of studies that evaluated the strength. On analysing Table 4, we find that the three metrics that have been frequently used and that have moderate to strong links to human impacts, namely FD var, FR V and FD Q, are heavily correlated with other metrics. However, this can be overcome provided that the metric set chosen is completely orthogonal. Following this reasoning, we propose that (FR V, FE m, FD m) is an ideal set, as also supported by Villéger et al. (2008) and Mouchet et al. (2010). FR D is the alternative candidate for richness (used by De Souza et al. 2013 in an LCA study); although it fails to satisfy the twinning criterion and trait scale invariance, it has an apparent link to human-induced pressures. By comparison, FR Is and FR Im have an unknown behaviour in this regard. By elimination, FR V and FR D therefore appear to be the only suitable richness metrics. Evenness metrics urgently need further development, since to date only two such metrics exist. FE m is the only evenness metric which has a known link to human-induced pressures, whereas the link for FE s is unknown. Therefore, we include FE m in our recommendations despite the link being weak and even though Mouillot et al. (2005) argued for the usage of FE s. For divergence, there are multiple candidates. FD m does have a strong link with ecosystem functioning; however, this result was obtained from a single study, so it is somewhat unclear whether the link is robust. Mason et al. (2003) have strongly argued for the usage of FD var, which has behaved well with respect to the important properties listed in Table 3 and has a strong link (see Fig. 2).
This metric has outperformed others and has consistently been shown to be the most desirable. However, it must be used with caution and not selected alongside FR V, so that orthogonal information is obtained across all components. Amongst the one-dimensional metrics, FD var is the only metric we recommend for usage. FD is is dismissed on the basis that it has issues with scale invariance and, worse, has an unknown link. Therefore, FD var, FD Q or FD m seem preferred. Taking into account all the information in this study and the relative metric inter-dependence, we arrive at the following recommendations: either (FR V, FE m, FD m), or FR D and FE m combined with one of FD var, FD m or FD Q. These four distinct permutations are all orthogonal by selection. Notice that FR D provides multiple options since the metric is totally independent of all other metrics.
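The selection of orthogonal metric sets described above can be sketched as a simple filter over candidate triples. In this minimal example, the candidate lists and the correlated pairs (FR V with FD var and with FD Q) are taken from the discussion; the identifiers and the data structure are illustrative assumptions rather than part of the original analysis.

```python
from itertools import combinations, product

# Candidate metrics per component, as retained in the discussion.
RICHNESS = ["FRV", "FRD"]
EVENNESS = ["FEm"]
DIVERGENCE = ["FDvar", "FDQ", "FDm"]

# Pairs reported as heavily correlated in the text (FR D is treated as
# independent of all other metrics).
CORRELATED = {
    frozenset(("FRV", "FDvar")),
    frozenset(("FRV", "FDQ")),
}


def is_orthogonal(metric_set):
    """Return True if no pair of metrics in the set is listed as correlated."""
    return not any(
        frozenset(pair) in CORRELATED for pair in combinations(metric_set, 2)
    )


orthogonal_sets = [
    candidate
    for candidate in product(RICHNESS, EVENNESS, DIVERGENCE)
    if is_orthogonal(candidate)
]

for metric_set in orthogonal_sets:
    print(metric_set)
```

Under these assumptions, the filter retains four triples: one built on FR V and three built on FR D. Adding further correlated pairs to `CORRELATED` would tighten the selection without changing the logic.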
To come to a final recommendation on functional diversity metrics to be selected for LCA studies, we devised an ad hoc scoring system (i.e. assigned weights) in which each metric receives a score for its link to ecosystem functioning (E.F.S.) and a score for its structural properties (S.P.S.). The product of these two scores, called λ, summarises the usefulness of the metric for LCA studies (Table 5).
Combining λ for the four metric permutations in the previous compilation of recommendations, we conclude that the most effective permutation is (FR D, FE m, FD var). If a metric set of only multi-dimensional components is sought, then (FR V, FE m, FD m) is the best option (also supported by Villéger et al. 2008 and Mouchet et al. 2010). As a metric for functional evenness, FE m is included in all sets, although the relationship between functional evenness and human-induced pressures seems rather weak. Each selected set contains multiple metrics.
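The scoring step can be illustrated with a short sketch. The scores below are hypothetical placeholders chosen only for illustration and are not the values of Table 5. The rule for combining λ across a set (here a simple sum) is likewise an assumption, since the text does not spell it out.

```python
# Hypothetical (E.F.S., S.P.S.) scores per metric; NOT the values of Table 5.
PLACEHOLDER_SCORES = {
    "FRV": (2, 2),
    "FRD": (3, 2),
    "FEm": (1, 3),
    "FDvar": (3, 3),
    "FDQ": (2, 2),
    "FDm": (3, 2),
}

# The four orthogonal permutations recommended in the text.
METRIC_SETS = [
    ("FRV", "FEm", "FDm"),
    ("FRD", "FEm", "FDvar"),
    ("FRD", "FEm", "FDm"),
    ("FRD", "FEm", "FDQ"),
]


def metric_lambda(metric, scores=PLACEHOLDER_SCORES):
    """lambda = E.F.S. x S.P.S. for a single metric."""
    efs, sps = scores[metric]
    return efs * sps


def set_lambda(metric_set, scores=PLACEHOLDER_SCORES):
    """Combined score of a metric set; summation is an assumed rule."""
    return sum(metric_lambda(m, scores) for m in metric_set)


# Rank the sets from highest to lowest combined score.
ranked = sorted(METRIC_SETS, key=set_lambda, reverse=True)

for metric_set in ranked:
    print(metric_set, set_lambda(metric_set))
```

With real scores from Table 5 substituted for the placeholders, the same few lines would reproduce the ranking of the sets; a different combination rule (e.g. a product or a mean) could be swapped into `set_lambda` if that better reflects the intended scheme.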
In the context of LCA application, the chosen set of metrics should not be aggregated because FR, FE and FD relate to different effects on ecosystem functioning. Hence, each metric indicates a different aspect of biodiversity affected and can be used to obtain an understanding of potential implications. For example, functional richness relates to the resistance of the ecosystem to new pressures, and a change therein therefore indicates that some functions may no longer be fulfilled. Likewise, impacts on functional evenness would suggest higher susceptibility to other competitive or invasive species (Hejda and De Bello, 2013). Similarly, impacts on functional divergence would relate to community stability, e.g. communities with higher functional divergence are more prone to changes in species composition (De la Riva et al. 2017). In these context-specific scenarios, if the focus is on ecosystem resistance (relating to richness) or community stability (relating to divergence), then it may be argued that evenness is not required for LCA and can be omitted altogether. The consequence would be that the distribution of species abundance in occupied niche space is treated as unimportant. We justify inclusion of the evenness metric FE m on the basis that a link exists and is therefore possibly meaningful, despite being weak. By following a consistent line of reasoning, we remove only those metrics which have no or uncertain links. Also note that, whether evenness is included or not, the order of the proposed metric sets does not change. To summarise, the LCA practitioner should understand that the metric set will provide independent and complementary information on richness, evenness and divergence that should be interpreted separately within a single impact category of LCA.
While we took care in designing our study to perform the analysis in a structured and feasible way, it is clear that our study has limitations due to the following: I. To better evaluate how functional diversity metrics behave in practice, there needs to be an increased effort in studies pertaining to human-induced pressures which specifically quantify the strength of metric link in a coherent way (i.e. by incorporating a common statistical measure), thus allowing for comparisons across multiple studies. These links should be investigated in a general sense, and not be confined to those associated with LCAs. As a starting point, the focus should be on those metrics which have shown a strong link from only one recording or which have an unknown link, see Table 4. Also, there has been little testing of functional diversity metrics against field data (Pakeman 2011; Dubuis et al. 2013), and therefore, there is a lack of quantitative assessment of the link between functional diversity metrics and human-induced pressures. More specifically, further investigation is required to check the strength of metric link for FR Im, FR Is and FD is as well as for several other metrics, such as FR D and FD m. Both of the latter metrics have been shown to form a strong link, but we found only one recording to support this claim. Provided that complete information is obtained, more accurate ecosystem functioning scores (E.F.S.) can be assigned, resulting in a possible change in recommendations on metric selection. II. There is a need to test the link of metrics which provide information on evenness; currently there are only two evenness metrics, namely FE s and FE m, with FE s having either no or an unknown link. The lack of information limits the use of evenness in LCA, and stronger evidence is required to reveal the importance of evenness in identifying human impacts.
One may argue that evenness is primarily linked to competitive exclusion processes and is consequently less related to human-induced pressures, so that, while its importance as a component of functional diversity is clear, it is less relevant in an LCA type of analysis. We find that evenness does have some link to human-induced pressures, as demonstrated by, among others, Pakeman (2011), Mason et al. (2012) and Dubuis et al. (2013). Therefore, we include the multi-dimensional counterpart in our recommendation, despite a weak link being found (see Fig. 2). III. It is unclear whether some metrics conform to the structural properties listed in Table 3. The blank spaces in Table 3 indicate that information on whether the metric conformed to the property either could not be found in the literature or the property is not applicable. Further studies are required to check whether the metrics satisfy the corresponding properties. Also, it would be interesting to see a study which ranks those properties in order of importance. This would allow weights to be assigned accordingly, as opposed to treating each property equally (as in this study). In terms of importance, we consider independence an essential feature. Each component should be able to provide different and orthogonal information with respect to richness, evenness and divergence. This is precisely the functional diversity framework proposed by Mason et al. (2005) and others (Villéger et al. 2008; Schleuter et al. 2010; Mouchet et al. 2010). IV. We attempted to relate the individual metrics to specific human-induced pressures (Table 1). However, the list is most likely not exhaustive. Also, note that infrequent use does not necessarily imply redundancy.
If research effort and attention are redirected to points I to IV, this would in turn help reveal the relationships and mechanisms at play between ecosystem processes and functional diversity, and so improve characterisation factors for incorporation in LCA.
We hope that our suggestions for improved LCA of biodiversity based on metrics for functional richness, functional evenness and functional divergence will guide the LCA model developer. The next step is to make the concept operational. In the short term, this would consist of two steps: (1) to turn the concept into operational (global) characterisation factors that translate environmental pressures (e.g. N and P emissions, water extraction, other emissions, land use occupation) into our proposed metrics for biodiversity loss, for which the compilation in Table 1 could serve as a basis; and (2) to identify the basic data and models needed to calculate the types of proposed metrics. Recently, global maps of vegetation traits were produced (e.g. van Bodegom et al. 2014; Butler et al. 2017), from which in principle each of the three metrics can be derived for use in background systems, making our proposed metrics operational without much investment. The long-term strategy would be to gather data and hence calculate characterisation factors and the associated (change in) metrics more precisely for use and understanding in foreground systems. Occasionally, such an approach is already applied. For example, the model proposed for LCA by De , using the FR D metric, is based on data compiled by Flynn et al. (2009) and Gibson et al. (2011) across land use intensification.

Conclusions
Our analysis of functional diversity reinforces the need for using three independent and complementary components: richness, evenness and divergence. The sets of functional diversity metrics that best reconcile strength of link to human-induced pressures and desirable structural properties, including independence, are (FR V, FE m, FD m) or a combination of FR D and FE m with either FD var, FD m or FD Q. All four permutations are potential candidates for application and can be utilised to comprehensively determine human impacts on biodiversity in an LCA model. These recommendations are not set in stone: once more information becomes available, refined performance indices can be computed for each metric to allow for better informed choices. We hope that this study will constitute a useful point of reference and a means of reasoning for metric selection of functional diversity, particularly for Life Cycle Assessment model developers in future studies.