1 Introduction

Quantifying the influence of a variable on the outcome of an algorithm is an issue of high importance in order to explain and understand decisions taken by machine learning models. In particular, it enables the detection of unwanted biases in the decisions that lead to unfair predictions. This problem has received growing attention over the last few years in the literature on fair learning for Artificial Intelligence. One of the main difficulties lies in the definition of what is (un)fair and in the choices made to quantify it. Numerous measures have been designed to assess algorithmic fairness, detecting whether a model depends on variables, called sensitive variables, that convey information that is irrelevant for the model from a legal or a moral point of view. We refer for instance to Dwork et al. (2012), Chouldechova (2017), Oneto and Chiappa (2020) and del Barrio et al. (2020) and references therein for a presentation of different fairness criteria. Most of these definitions boil down to ensuring the independence between a function of an algorithm's output and some sensitive feature that may lead to biased treatment. Hence, understanding and measuring the relationships between a sensitive feature S, which is typically included in \({\mathbf {X}}\) or highly correlated with it, and the output of the algorithm f using those features to predict a target Y enables the detection of unfair algorithmic treatments. Note that it is not enough to simply remove the sensitive feature from the data—“fairness through unawareness” (Gordaliza et al., 2019)—as the algorithm can still “guess” the sensitive feature through its entanglement with the other inputs \({\mathbf {X}}\). Ensuring that predictors are fair is then achieved by controlling the previous measures, as done in Mary et al. (2019), Williamson and Menon (2019), Grari et al. (2019), Gordaliza et al. (2019), del Barrio et al. (2020), Chiappa et al. (2020). While this notion has been extensively studied for classification, recent works tackle the regression case, as in Grari et al. (2019), Mary and Calauzenes (2019), Chzhen et al. (2020) or Le Gouic et al. (2020).

Global Sensitivity Analysis (GSA) is used in numerous contexts to quantify the influence of a set of features on the outcome of a black-box algorithm. Various indicators, usually taking the form of indices valued between 0 and 1, measure how important a feature is. Multiple sets of indices have been proposed over the years, such as Sobol’ indices, Cramér–von-Mises indices and HSIC—see (Da Veiga, 2015; Gamboa et al., 2020; Grandjacques et al., 2015; Iooss and Lemaître, 2015; Jacques et al., 2006) and references therein. This flexibility of choice allows a deep understanding of the relationship between a feature and the outcome of an algorithm. While the usual assumption in this field is that the inputs are independent, some works (Grandjacques et al., 2015; Jacques et al., 2006; Mara and Tarantola, 2012) remove this assumption to go further in the understanding of the possible ways for a feature to be influential.

Hence, GSA provides a natural framework for understanding the impact of sensitive features. This point of view has been considered when using Shapley values in the context of fairness (Hickey et al., 2020), thus providing local fairness through explainability. Hereafter we provide a full probabilistic framework for using GSA for fairness quantification in machine learning.

Our contribution is two-fold. First, while GSA is usually concerned with independent inputs, we recall extensions of Sobol’ indices to non-independent inputs introduced in Mara and Tarantola (2012) that offer ways to account for joint contributions and correlations between variables while quantifying the influence of a feature. We propose an extension of Cramér–von-Mises indices based on similar ideas. We also prove asymptotic normality for these extended Sobol’ indices, which makes it possible to estimate them with confidence intervals, a novelty as far as we know. Then, we provide a consistent probabilistic framework for applying GSA indices to quantify fairness. We illustrate the strength of this approach by showing that it can model classical fairness criteria, causal-based fairness and new notions such as intersectionality, and that it provides insight for mitigating biases. This gives new conceptual and practical perspectives on fairness in Machine Learning.

The paper is organized as follows. We begin by reviewing existing works on Global Sensitivity Analysis (Section 2). We give estimates for the extended Sobol’ and Cramér–von-Mises indices, along with a theorem proving asymptotic normality (Theorem 1). Then, we present a probabilistic framework for Fairness in which we draw the link between fairness measures and GSA indices, along with applications to causal fairness and intersectional fairness (Section 3).

2 Global sensitivity analysis

The use of complex computer models for the analysis of applications from science or real-life experiments is by now routine. These models are often expensive to run, and it is important to know, with as few runs as possible, the global influence of one or several inputs on the outcome of the system under study. When the inputs or features are regarded as random elements, and the algorithm or computer code is seen as a black box, this problem is referred to as Global Sensitivity Analysis (GSA). Note that since we consider the algorithm to be a black box, we only need the association between an input and its output. This makes it easy to derive the influence of a feature for an algorithm for which we do not have access to new runs. We refer the interested reader to Da Veiga (2015) or Iooss and Lemaître (2015) and references therein for a more complete overview of GSA.

The main objective of GSA is to monitor the influence of variables \(X_{1},\cdots , X_{p}\) on an output variable, or variable of interest, f(X). For this, we compare, for a feature \(X_i\) and the output f(X), the joint probability distribution \(\mathbb {P}_{X_i,f(X)}\) and the product probability distribution \(\mathbb {P}_{X_i}\mathbb {P}_{f(X)}\) using a measure of dissimilarity. If these two probabilities are equal, the feature \(X_i\) has no influence on the output of the algorithm. Otherwise, the influence should be quantifiable. For this, we have access to a wide range of indices, generally tailored to take values in [0, 1] and sharing a similar property: the greater the index, the greater the influence of the feature on the outcome. Historically, a variance decomposition—or Hoeffding decomposition—of the output of the black-box algorithm is used to obtain a second-order moment metric in the so-called Sobol’ method. However, these methods were originally developed for independent features. For obvious reasons, this framework is not adapted to, and has limitations in, real-life cases. Additionally, Sobol’ methods are intrinsically restricted by the variance decomposition, and other methods have been proposed. We will present two alternatives to Sobol’ indices. The first one solves the issue of non-independent features. The second one circumvents the limitations of working with the variance decomposition. We finish this section by merging these two alternatives, inspired by the works of Azadkia and Chatterjee (2019), Gamboa et al. (2020), Chatterjee (2020).

Note that the use of other metrics is common in the GSA literature. Each metric has its own intrinsic advantages and disadvantages, which have been extensively studied. Moreover, independence tests based on these GSA metrics exist, as shown in Meynaoui et al. (2019), Gamboa et al. (2020), and techniques such as bootstrap or Monte-Carlo estimates can be used to obtain confidence intervals for such tests. We restrict ourselves to the Sobol’ and Cramér–von-Mises indices because they are historically the basis of the GSA literature, computationally tractable, and allow for a better understanding of usual fairness proxies, as we will show in Section 3. We also prove asymptotic normality for extended Sobol’ indices, which is a first to the best of our knowledge.

2.1 Sobol’ indices

A popular and useful tool to quantify the influence of a feature on the output of an algorithm is the set of Sobol’ indices. Initially introduced in Sobol’ (1990), these indices compare, thanks to the Hoeffding decomposition (Van der Vaart, 2000), the conditional variance of the output knowing some of the input variables with the overall total variance of the output. Such indices have been extensively studied for computer code experiments.

Suppose that we have the relation \(f({\mathbf {X}}) = f(X_1, \cdots , X_p)\), where f is a square-integrable algorithm considered as a black box and \(X_1,\cdots ,X_p\) are the inputs, with p the number of features. We denote by \(p_{\mathbf {X}}\) the distribution of \({\mathbf {X}}\). For now, we suppose the different inputs to be independent, meaning that \(p_{\mathbf {X}} = \otimes _{k=1}^p p_{X_k}\). Then, we can use the Hoeffding decomposition (Van der Vaart, 2000) on \(f({\mathbf {X}})\)—sometimes also called ANOVA decomposition—so that we may write

$$\begin{aligned} f({\mathbf {X}}) = \sum _{s \subseteq \llbracket 1,p\rrbracket } f_s(X_s), \end{aligned}$$
(1)

where the \(f_s\) are square-integrable functions and \(X_s\) denotes the set \(\{X_i, i \in s\}\). We can either assume that f is centered or allow s to be the empty set in this sum: it does not change anything since we are interested in the variance afterwards. We will consider \(V := \text{ Var }(f({\mathbf {X}}))\) and \(V_s := \text{ Var }(f_s({\mathbf {X}}_s))\). Note that the elements of the previous sum are orthogonal in the \(L^2(p_{{\mathbf {X}}})\) sense. Hence, the variance can be computed term by term, which gives

$$\begin{aligned} V = \sum _{k = 1}^p V_k + \sum _{k_2>k_1}^p V_{k_1,k_2} + \cdots + V_{1,\cdots ,p}. \end{aligned}$$
(2)

This equation means that the total variance of the output, denoted by V, can be split into various components that can be readily interpreted. For instance, \(V_1\) represents the variance of the output \(f({\mathbf {X}})\) that is only due to the variable \(X_1\)—that is, how much \(f({\mathbf {X}})\) will change if we take different values for \(X_1\). Similarly, \(V_{1,2}\) represents the variance of the output \(f({\mathbf {X}})\) that is only due to the combined effect of the variables \(X_1\) and \(X_2\) once the main effects of each variable have been removed—that is, how much \(f({\mathbf {X}})\) will change if we simultaneously take different values for \(X_1\) and \(X_2\) and remove the changes due to the main effects of \(X_1\) only or \(X_2\) only.

By dividing the \(V_{(m)}\) by V, with \((m) \subset \llbracket 1, p\rrbracket\), we obtain:

$$\begin{aligned} S_{(m)}:= \frac{V_{(m)}}{V}, \end{aligned}$$
(3)

which is the expression of the so-called Sobol’ sensitivity indices. When (m) is a singleton \(\{k\}\), the Sobol’ index \(S_k\) quantifies the proportion of the output’s variance caused by the input \(X_k\) on its own. The sum of all indices \(S_{(m)}\) with \(k \in (m)\) quantifies the proportion of the output’s variance caused by the input \(X_k\) jointly with other inputs; it is usually called the Total Sobol’ index of \(X_k\) and denoted \(ST_k\).

Note that the law of total variance can be written for the random variable \(f({\mathbf {X}})\) as

$$\begin{aligned} \text{ Var }(f({\mathbf {X}})) = \text{ Var }(\mathbb {E}[f({\mathbf {X}})|X_{\sim k}]) + \mathbb {E}[\text{ Var }(f({\mathbf {X}})|X_{\sim k})]. \end{aligned}$$
(4)

In this equation, the left-hand side is the total variance, while the right-hand side is decomposed into two terms: the variance explained by all the variables other than \(X_k\), and the remainder, which includes any part of the variance explained by \(X_k\). After normalization, we have

$$\begin{aligned} 1 = S_{\sim k} + ST_k. \end{aligned}$$
(5)

The alternative definition \(ST_k = \frac{\mathbb {E}[Var(f({\mathbf {X}})|X_{\sim k})]}{Var(f({\mathbf {X}}))}\) is of interest for two reasons. First, we will see in the next section that this formulation reappears in various contexts, including Fairness. Additionally, for estimation, this formula is quite interesting since it allows the importance of a variable to be estimated without using it directly, which may be infeasible in practice for various reasons.
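For intuition, a minimal Monte-Carlo sketch of the standard pick-freeze estimators of \(S_i\) and \(ST_i\) is given below, assuming independent inputs; the function names and the toy additive model are ours, not the estimation scheme of the paper.

```python
import numpy as np

def sobol_pick_freeze(f, sampler, n=100_000, seed=0):
    """First-order (S_i) and total (ST_i) Sobol' indices for independent
    inputs, estimated with the standard pick-freeze scheme."""
    rng = np.random.default_rng(seed)
    A, B = sampler(rng, n), sampler(rng, n)        # two independent n x p designs
    fA, fB = f(A), f(B)
    var = np.concatenate([fA, fB]).var()
    p = A.shape[1]
    S, ST = np.empty(p), np.empty(p)
    for i in range(p):
        ABi = A.copy()
        ABi[:, i] = B[:, i]                        # resample only coordinate i
        fABi = f(ABi)
        S[i] = np.mean(fB * (fABi - fA)) / var     # estimates Var(E[f | X_i]) / Var(f)
        ST[i] = 0.5 * np.mean((fA - fABi) ** 2) / var  # estimates E[Var(f | X_~i)] / Var(f)
    return S, ST

# Toy check with f(x) = x_1 + x_2 and i.i.d. standard Gaussian inputs:
# both S_i and ST_i should be close to 0.5 for each variable.
print(sobol_pick_freeze(lambda X: X[:, 0] + X[:, 1],
                        lambda rng, n: rng.standard_normal((n, 2))))
```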

2.2 Sobol’ indices for non-independent inputs

In the classic Sobol’ analysis, for an input \(f({\mathbf {X}})\), two indices, namely the first order and total indices, quantify the influence of the considered feature on the output of the algorithm. When the inputs are not independent, we need to duplicate each index in order to distinguish whether influences caused by correlations between inputs are taken into account or not. Introduced in this framework by Mara and Tarantola (2012), we use the Lévy–Rosemblatt theorem to create two mappings of interest. We denote by \(\sim i\) every index other than i. We create 2p mappings between p independent uniform random variables U and the variables \({\mathbf {X}}\) either by mapping \(p_{U_1}p_{U_{\sim 1}}\) to \(p_{X_i}p_{X_{\sim i}|X_i}\)—in this case \(U_1\) is denoted by \(U^i_1\)—or by mapping \(p_{U_{\sim p}} p_{U_{p}}\) to \(p_{X_{\sim i}}p_{X_i{|X_{\sim i}}}\)—in this case, \(U_{\sim p}\) is denoted \(U^{i+1}_{\sim p}\). In the Appendix 1, more in-depth details are given. In the analysis of the influence of an input \(X_i\), the first mapping captures the intrinsic influence of other inputs while the second mapping excludes these influences and shows the variations induced by \(X_i\) on its own. Each of these two mappings leads to two indices corresponding to classical Sobol’ and Total Sobol’ indices. The influence of every input \(X_i\) is therefore represented by four indices, see Table 1.

Hence, the four Sobol’ indices for each variable \(X_i , i \in \llbracket 1,p \rrbracket\) are defined as followed:

$$\begin{aligned} Sob_{X_i} := \frac{\text{ Var }[\mathbb {E}[f({\mathbf {X}})|X_i]]}{\text{ Var }[f({\mathbf {X}})]} \end{aligned}$$
(6)
$$\begin{aligned} SobT_{X_i} := \frac{\mathbb {E}[\text{ Var }[f({\mathbf {X}})|Z_i]]}{\text{ Var }[f({\mathbf {X}})]} \end{aligned}$$
(7)
$$\begin{aligned} Sob_{X_i}^{ind} := \frac{\text{ Var }[\mathbb {E}[f({\mathbf {X}})|Z_i]]}{\text{ Var }[f({\mathbf {X}})]} \end{aligned}$$
(8)
$$\begin{aligned} SobT_{X_i}^{ind} := \frac{\mathbb {E}[\text{ Var }[f({\mathbf {X}})|X_{\sim i}]]}{\text{ Var }[f({\mathbf {X}})]}, \end{aligned}$$
(9)

where the random variable \(Z_i\) has the distribution \(p_{X_i|X_{\sim i}}\) and is equal to \(F^{-1}_{X_i|X_{\sim i}}(U^{i+1}_p)\). For clarity, we denote the Sobol’ indices of \(X_i\) by \(S_i\) and \(ST_i\) under the independence assumption, and by \(Sob_{X_i}, SobT_{X_i}, Sob^{ind}_{X_i}, SobT^{ind}_{X_i}\) when this assumption is not fulfilled.

Note that these definitions can be extended to multidimensional variables, which makes it possible to consider groups of inputs, by replacing the subset \(\{i\}\) by a subset \(s \subset \{1,\cdots ,p\}\) in the formulas. More insight into the transformations that allow these definitions can be found in Appendix 1 or in Mara and Tarantola (2012), Mara et al. (2015).
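As an illustration of the difference between \(Sob_{X_i}\) and \(SobT^{ind}_{X_i}\), the sketch below estimates Equations (6) and (9) for a bivariate Gaussian input whose conditional laws are known in closed form. The nested Monte-Carlo scheme and the function name are our own illustrative choices, not the estimators of Appendix 2.

```python
import numpy as np

def sob_and_sobt_ind_gaussian_pair(f, rho, n=20_000, m=200, seed=0):
    """Nested Monte-Carlo estimates of Sob_{X1} (Eq. 6, correlations included)
    and SobT^ind_{X1} (Eq. 9, correlations excluded) when (X1, X2) is a
    standard bivariate Gaussian pair with correlation rho."""
    rng = np.random.default_rng(seed)
    x1 = rng.standard_normal(n)
    x2 = rho * x1 + np.sqrt(1 - rho**2) * rng.standard_normal(n)
    var_f = f(x1, x2).var()
    # Sob_{X1} = Var(E[f(X) | X1]) / Var(f(X)); inner expectation over X2 | X1.
    x2_g_x1 = rho * x1[:, None] + np.sqrt(1 - rho**2) * rng.standard_normal((n, m))
    sob = f(x1[:, None], x2_g_x1).mean(axis=1).var() / var_f
    # SobT^ind_{X1} = E[Var(f(X) | X2)] / Var(f(X)); inner variance over X1 | X2.
    x1_g_x2 = rho * x2[:, None] + np.sqrt(1 - rho**2) * rng.standard_normal((n, m))
    sobt_ind = f(x1_g_x2, x2[:, None]).var(axis=1).mean() / var_f
    return sob, sobt_ind

# With f(x) = x_1 + x_2 and rho = 0.5, closed forms give
# Sob_{X1} = (1 + rho)/2 = 0.75 and SobT^ind_{X1} = (1 - rho)/2 = 0.25.
print(sob_and_sobt_ind_gaussian_pair(lambda a, b: a + b, rho=0.5))
```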

Remark 1

If the features are independent, then for all \(i\in \llbracket 1, p\rrbracket\), \(Sob^{ind}_{X_i}=Sob_{X_i}\) and \(SobT^{ind}_{X_i} = SobT_{X_i}\). The proof follows from the fact that, in the independent case, \(U^i_1 = U^{i+1}_p\).

Remark 2

All previous indices satisfy the following bounds. For all \(i \in \{1,\cdots ,p\}\),

$$\begin{aligned} 0 \le Sob^{ind}_{X_i} \le Sob_{X_i} \le SobT_{X_i} \le 1 \quad \mathrm{and} \quad 0 \le Sob^{ind}_{X_i} \le SobT^{ind}_{X_i} \le SobT_{X_i} \le 1. \end{aligned}$$

We refer to Mara and Tarantola (2012) and to the law of total variance for the proof. Note that, in general, there are no inequalities between \(Sob_{X_i}\) and \(SobT^{ind}_{X_i}\).

Sobol’ indices quantify three typical ways for a feature to modify the output of an algorithm.

  1. Direct contribution. Firstly, a variable can be of interest all by itself, without any correlation or joint contribution with the other variables. Consider for example the case where \(f({\mathbf {x}}) = x_1 + x_2\) with \(x_1\) independent of the rest of the variables. In this example, we would have \(Sob_{X_1} = SobT_{X_1} = Sob^{ind}_{X_1}= SobT^{ind}_{X_1} = 0.5\), which means that 50% of the variability of the algorithm is caused by the first variable. In this case, the first variable has a non-null impact on its own on the outcome of the algorithm f.

  2. Bouncing contribution. A variable can interact with other variables and influence the output only through its impact on the law of the other variables. For example, consider \((x_1,x_2)\) where \(x_2 = \alpha x_1 + \varepsilon\)—with \(\varepsilon\) a centered white noise of variance \(\sigma ^2\)—and \(f({\mathbf {x}}) = x_2\). Then we get \(Sob_{X_1} = SobT_{X_1} = (\alpha ^2 V(x_1))/(\alpha ^2 V(x_1) + \sigma ^2)\) while \(Sob^{ind}_{X_1} = SobT^{ind}_{X_1} = 0\). The first variable can be highly influential on the outcome of the algorithm f, even if it is not directly responsible for these variations. We call this type of interaction a “bouncing effect” since the variable needs to use another input to reach the outcome of the algorithm.

  3. Joint contribution. Lastly, a variable can contribute to an output jointly with other variables. Take for instance the case where \((x_1,x_2)\) are independent and \(f({\mathbf {x}}) = x_1 \times x_2\). In this case, \(Sob_{X_1} = Sob^{ind}_{X_1} = 0 = Sob_{X_2} = Sob^{ind}_{X_2}\), while \(SobT_{X_1} = SobT^{ind}_{X_1} = 1 = SobT_{X_2} = SobT^{ind}_{X_2}\). This effect is different from the previous one as the distributions of the input variables are independent but their impacts are intertwined. In such a case, the effect is visible and measurable through the gap between first-order and total indices.

Table 1 Sobol’ indices: what is taken into account and what is not

These main differences point out why we need four indices in order to assess the sensitivity of a system to a feature. Table 1 sums up which index takes correlations or joint contributions into account. The differences between these indices can be very informative. For example, if the gap between \(Sob_{X_i}\) and \(SobT_{X_i}\), or between \(Sob^{ind}_{X_i}\) and \(SobT^{ind}_{X_i}\), is large, then the feature \(X_i\) is mainly influential through its joint contributions with the other features on the output. Conversely, if the gap between \(Sob^{ind}_{X_i}\) and \(Sob_{X_i}\), or between \(SobT^{ind}_{X_i}\) and \(SobT_{X_i}\), is large, a large part of the influence of the feature \(X_i\) goes through its intrinsic influence on other features.
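A quick numerical check of the bouncing-contribution example above (item 2); the choice \(\alpha = 2\), \(\sigma = 1\) is ours, and the conditional expectation \(\mathbb {E}[x_2|x_1] = \alpha x_1\) is used in closed form rather than estimated.

```python
import numpy as np

# Bouncing contribution: f(x) = x_2 with x_2 = alpha * x_1 + eps. All of X1's
# influence travels through X2, so Sob_{X1} is large while SobT^ind_{X1} = 0.
rng = np.random.default_rng(0)
alpha, sigma, n = 2.0, 1.0, 200_000
x1 = rng.standard_normal(n)
x2 = alpha * x1 + sigma * rng.standard_normal(n)
var_f = x2.var()                                 # f(x) = x_2
sob_x1 = (alpha * x1).var() / var_f              # Var(E[f | X1]) / Var(f), with E[x2 | x1] = alpha * x1
sobt_ind_x1 = 0.0                                # E[Var(f | X2)] = 0 since f depends on X2 only
print(sob_x1, alpha**2 / (alpha**2 + sigma**2))  # both close to 0.8
```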

Monte-Carlo estimates of the extended Sobol’ indices can be computed using these definitions. These estimators are consistent and converge to the Sobol’ and independent Sobol’ indices defined earlier. Additionally, if we write each of these estimates as \(A_n/B_n\), we can use the Delta method to prove a central limit theorem.

Theorem 1

Each index \({\mathcal {S}}\) in the equations (6) to (9) can be estimated by its empirical counterpart \({\mathcal {S}}_n\) such that:

  1. (i)

    \({\mathcal {S}}_n \xrightarrow {a.s} {\mathcal {S}}\).

  2. (ii)

    \(\sqrt{n}({\mathcal {S}}_n - {\mathcal {S}}) \xrightarrow {D} {\mathcal {N}}(0, \sigma ^2)\), with \(\sigma ^2\) depending on which index we study, see Appendix 2.

2.3 Cramér–von-Mises indices

Sobol’ indices are based on a decomposition of the variance, and therefore only quantify influence of the inputs on the second-order moment of the outcome. Many other criteria to compare the conditional distribution of the output knowing some of the inputs to the distribution of the output have been proposed—by means of divergences, or measures of dissimilarity between distributions for example. We recall here the definition of Cramér–von-Mises indices (Gamboa et al., 2020), an answer to this lack of distributional information that will be of use later in a fairness framework—see Section 3.

2.3.1 Classical Cramér–von-Mises indices

The Cramér–von-Mises indices are based on the whole distribution of \(f({\mathbf {X}})\). They are defined, see Gamboa et al. (2020), for every input i as follows:

$$\begin{aligned} {CVM}_i {:=} \frac{\int _{\mathbb {R}} \mathbb {E}\left[ (\mu (t) - \mu ^i(t))^2\right] d\mu (t)}{\int _{\mathbb {R}} \mu (t)(1-\mu (t))d\mu (t)} \end{aligned}$$
(10)

where \(\mu (t) := {\mathbb {E}}[{\bf {1}}_{f({\mathbf {X}})\le t}]\) is the cumulative distribution function of \(f({\mathbf {X}})\) and \(\mu ^i\) its conditional version \(\mu ^i(t) := {\mathbb {E}}[{\bf {1}}_{f({{\mathbf {X}}})\le t}|X_i]\).

This equation can be rewritten as

$$\begin{aligned} CVM_i = \frac{\int \text{ Var }(\mathbb {E}\left[ {\bf {1}}_{f({\mathbf {X}})\le t}|X_i\right] )d\mu (t)}{\int \text{ Var }({\bf {1}}_{f({\mathbf {X}})\le t}) d\mu (t)}. \end{aligned}$$
(11)

As before, these indices extend to the multivariate case. Simple estimators have been proposed (Chatterjee, 2020; Gamboa et al., 2020), and are based on permutations and rankings.
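As an illustration, a minimal rank-based estimator of \(CVM_i\) in the spirit of Chatterjee (2020) can be written as follows; the function name is ours, and ties in the output are assumed away.

```python
import numpy as np

def cvm_rank_estimator(x, y):
    """Rank-based estimate of the Cramer-von-Mises index of y with respect
    to x (Chatterjee, 2020); assumes a continuous output y without ties."""
    order = np.argsort(x, kind="stable")           # sort the pairs (x_j, y_j) by x
    ranks = np.argsort(np.argsort(y[order])) + 1   # ranks of y along that ordering
    n = len(x)
    return 1.0 - 3.0 * np.abs(np.diff(ranks)).sum() / (n**2 - 1)

# Sanity check: full functional dependence gives a value close to 1,
# independence gives a value close to 0.
rng = np.random.default_rng(0)
x = rng.standard_normal(5_000)
print(cvm_rank_estimator(x, x**2), cvm_rank_estimator(x, rng.standard_normal(5_000)))
```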

Remark 3

As mentioned earlier, Sobol’ indices quantify correlations and second-order moments but do not take into account information about the distribution of the outcome. However, note the similarity between the definition of the Cramér–von-Mises index and the classical Sobol’ index, especially if we rewrite Equation (11) as:

$$\begin{aligned} CVM_i = \int Sob_{X_i}({\bf {1}}_{f({\mathbf {X}})\le t})\frac{\text{ Var }({\bf {1}}_{f({\mathbf {X}})\le t})}{\int \text{ Var }({\bf {1}}_{f({\mathbf {X}})\le t}) d\mu (t)}d\mu (t). \end{aligned}$$
(12)

The Cramér–von-Mises index can thus be seen as an adaptive Sobol’ index that emphasizes the regions where the cumulative distribution function of the outcome changes rapidly, as more information can be obtained in these areas. This enables capturing information about the distribution of the outcome instead of moment-related information.

2.3.2 Extension of the Cramér–von-Mises indices

Classical Cramér–von-Mises indices suffer from the same limitation as Sobol’ indices, as they are tailored for independent inputs. A natural extension is to create new indices that handle the case of dependent inputs. We propose an extension of the Cramér–von-Mises indices, inspired by the ideas behind the extended Sobol’ indices and by the work of Azadkia and Chatterjee (2019). This new set of indices captures the influence of a feature independently of the rest of the features.

Definition 1

For every input i, we define the independent Cramér–von-Mises indices as:

$$\begin{aligned} CVM^{ind}_i := \frac{\int \mathbb {E}[\text{ Var }({\bf {1}}_{f({\mathbf {X}})\le t}|X_{\sim i})]\,d\mu (t)}{\int \text{ Var }({\bf {1}}_{f({\mathbf {X}})\le t})\, d\mu (t)} \end{aligned}$$
(13)

This extension makes it possible to measure the influence of a feature on the output of an algorithm stripped of its dependencies with the other features.

Remark 4

This independent Cramér–von-Mises index can be seen as an extension of the \(SobT^{ind}\) index.

This remark is similar to Remark 3. Starting from the independent Total Sobol’ index of Equation (9), replacing the output function by a thresholded version of the algorithm and averaging over all possible thresholds yields the independent Cramér–von-Mises index. This index can also be seen as an adaptive form of the \(SobT^{ind}\) index.
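For intuition only, a rough nearest-neighbour sketch of \(CVM^{ind}_i\) is given below: the conditional variance \(\mathbb {E}[\text{Var}({\bf {1}}_{f({\mathbf {X}})\le t}|X_{\sim i})]\) is approximated by half the mean squared difference between each point and its nearest neighbour in the \(X_{\sim i}\) space, and the integral over t is replaced by an average over the observed thresholds. This is our own crude approximation, not the estimator of Appendix 4.

```python
import numpy as np
from scipy.spatial import cKDTree

def cvm_ind_nn(X, y, i):
    """Crude 1-nearest-neighbour estimate of CVM^ind_i (Definition 1)."""
    X_rest = np.delete(X, i, axis=1)
    _, nn = cKDTree(X_rest).query(X_rest, k=2)     # column 0 is the point itself
    nb = nn[:, 1]
    W = (y[:, None] <= y[None, :]).astype(float)   # W[j, k] = 1_{y_j <= t_k}, thresholds t_k = y_k
    num = 0.5 * ((W - W[nb]) ** 2).mean()          # approx. of E[Var(1_{y<=t} | X_~i)], averaged over t
    F = W.mean(axis=0)                             # empirical CDF at each threshold
    den = (F * (1.0 - F)).mean()                   # approx. of int Var(1_{y<=t}) dmu(t)
    return num / den

# Toy check with y = X[:, 0] and independent columns: the determining feature
# gets an index close to 1 (i=0), the irrelevant one an index close to 0 (i=1).
rng = np.random.default_rng(0)
X = rng.standard_normal((2_000, 2))
print(cvm_ind_nn(X, X[:, 0], i=0), cvm_ind_nn(X, X[:, 0], i=1))
```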

The estimation of these indices is given in Appendix 4 by means of the estimates \({\widehat{CVM}}_i\). Similarly to Theorem 1, we have the following theorem.

Theorem 2

If we denote by N the number of observations used to compute \({\widehat{CVM}}_i\), then the sequence \(\sqrt{N}\left( CVM_i - {\widehat{CVM}}_i\right)\) converges in distribution to a centered Gaussian law with limiting variance \(\xi ^2\), whose explicit expression can be found in the proof.

The proof of this theorem can be found in Gamboa et al. (2018). Note that new estimation procedures can be efficient with little data, as mentioned in Gamboa et al. (2020), which will be helpful for measuring intersectional fairness in the following Section.

3 Fairness

3.1 Sensitivity Indices as Fairness measures

In this section, we provide a probabilistic framework that unifies various definitions of Fairness for groups of individuals as Global Sensitivity indices. Fairness amounts to quantifying the dependencies between a sensitive feature S and functions of the outcome f(X) and of the values of the variable of interest Y. Several measures of fairness, corresponding to different definitions of fairness, have been proposed in the machine learning literature. However, all these definitions boil down to a quantification of the mathematical propositions “\(f(X) \perp \!\!\!\!\perp S\)” or “\(f(X) \perp \!\!\!\!\perp S |Y\)”.

For instance, the main common definitions of fairness are the following

  • Statistical Parity, see for instance in Dwork et al. (2012), requires that the algorithm f, predicting a target Y, has similar outputs for all the values of S in the sense that the distribution of the output is independent of the sensitive variable S, namely \(f({\mathbf {X}}) \perp \!\!\!\!\perp S\). In the binary classification case, it is defined as \(\mathbb {P}(f({\mathbf {X}}) = 1 |S ) = \mathbb {P}(f({\mathbf {X}}) = 1)\) for general S, continuous or discrete.

  • Equality of Odds looks for the independence between the error of the algorithm and the protected variable, i.e. the conditional independence \(f({\mathbf {X}}) \perp \!\!\!\!\perp S | Y\). In the binary case, this condition is equivalent to \(\mathbb {P}(f({\mathbf {X}}) = 1 |Y = i, S) = \mathbb {P}(f({\mathbf {X}}) = 1 |Y = i),\) for \(i = 0,1.\)

  • Avoiding Disparate Treatment corresponds to the requirement that similar individuals should have similar outputs. In the binary case, this condition is written as \(\mathbb {P}(f({\mathbf {X}}) =1 | {\mathbf {X}} = {\mathbf {x}}, S= 0) =\mathbb {P}(f({\mathbf {X}}) =1 | {\mathbf {X}} = {\mathbf {x}}, S= 1)\). Various refinements of this metric exist, including for instance the situation where similar individuals may not share the same attributes \({\mathbf {X}} = {\mathbf {x}}\), e.g. de Lara et al. (2021).

  • Finally, Avoiding Disparate Mistreatment corresponds to the equality of misclassification rates across subpopulations. In the binary case, this condition is written as \(\mathbb {P}(f({\mathbf {X}}) \not = Y | S= 0) =\mathbb {P}(f({\mathbf {X}}) \not = Y| S= 1)\).

The previous notions of fairness are quantified using a fairness measure \(\varLambda\) and a function \(\varPhi (Y,{\mathbf {X}})\) such that \(\varLambda (\varPhi (Y,{\mathbf {X}}),S) = 0\) in the case of perfect fairness, while the constraint is relaxed into \(\varLambda (\varPhi (Y,{\mathbf {X}}),S) \le \varepsilon\), for a small \(\varepsilon\), leading to the notion of approximate fairness. The following definition provides a general framework to define fairness measures. GSA measures as defined in Section 2, or described in Da Veiga (2015), Iooss and Lemaître (2015), are suitable indicators to quantify fairness as follows, and these definitions extend to continuous predictors and continuous Y.

Definition 2

Let \(\varPhi\) be a function of the features \({\mathbf {X}}\) and of Y. We define a GSA measure for a function \(\varPhi\) and a random variable Z as a functional \(\varGamma (\cdot ,\cdot )\) such that \(\varGamma (\varPhi (Y,{\mathbf {X}}),Z)\) is equal to 0 if \(\varPhi (Y,{\mathbf {X}})\) is independent of Z and is equal to 1 if \(\varPhi (Y,{\mathbf {X}})\) is a function of Z. Then, \(\varGamma\) induces a GSA-Fairness measure defined as \(\varLambda (\varPhi (Y,{\mathbf {X}}),S) = \varGamma (\varPhi (Y,{\mathbf {X}}),S)\).

The following examples provide a GSA formulation for most classical fairness definitions, using Sobol’ and Cramér–von-Mises indices.

Example 1

(Statistical Parity) The so-called Statistical Parity fairness is achieved by taking \(\varLambda (\varPhi (Y,{\mathbf {X}}),S) = \text{ Var }(\mathbb {E}[f({\mathbf {X}})|S])\). This corresponds to the GSA measure \(Sob_S(f({\mathbf {X}}))\). If f is a classifier with values in \(\{0,1\}\), we recover for a binary S the classical definition of Disparate Impact, \(\mathbb {P}(f(X)=1|S=1) = \mathbb {P}(f(X)=1|S=0)\), see Gordaliza et al. (2019).
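As a sketch, this Statistical Parity GSA measure can be estimated for a discrete sensitive variable by grouping the predictions by sensitive value; the function name and the toy acceptance rates below are ours.

```python
import numpy as np

def statistical_parity_sobol(s, y_pred):
    """Sob_S(f(X)) = Var(E[f(X) | S]) / Var(f(X)) for a discrete sensitive S,
    estimated from group means; the index is 0 under Statistical Parity."""
    y_pred = np.asarray(y_pred, dtype=float)
    values = np.unique(s)
    weights = np.array([np.mean(s == v) for v in values])
    means = np.array([y_pred[s == v].mean() for v in values])
    return np.sum(weights * (means - y_pred.mean()) ** 2) / y_pred.var()

# Toy example: a classifier accepting 70% of the group S=1 and 50% of the group S=0.
rng = np.random.default_rng(0)
s = rng.integers(0, 2, 10_000)
y_pred = (rng.random(10_000) < np.where(s == 1, 0.7, 0.5)).astype(float)
print(statistical_parity_sobol(s, y_pred))   # clearly positive: Statistical Parity is violated
```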

Example 2

(Avoiding Disparate Treatment) The so-called Avoiding Disparate Treatment fairness is achieved by taking \(\varLambda (\varPhi (Y,{\mathbf {X}}),S) = \mathbb {E}[\text{ Var }(f({\mathbf {X}})|X)]\). This corresponds to the GSA measure \(SobT_S(f({\mathbf {X}}))\). Note that, for this GSA measure, the conditioning is on the non-sensitive attributes X rather than on the sensitive attribute itself, cf. Equation (4). Similarly, for a binary classifier, we recover the classical definition.

Example 3

(Equality of Odds) The so-called Equality of Odds fairness is achieved by taking \(\varLambda (\varPhi (Y,{\mathbf {X}}),S) = \mathbb {E}[\text{ Var }(\mathbb {E}[f({\mathbf {X}})|S,Y]|Y)]\). This corresponds to the GSA measure \(CVM^{ind}(f({\mathbf {X}}), S|Y)\). Similarly, for a binary classifier, we recover the classical definition.

Example 4

(Avoiding Disparate Mistreatment) The so-called Avoiding Disparate Mistreatment fairness is achieved by taking \(\varLambda (\varPhi (Y,{\mathbf {X}}),S) = \text{ Var }(\mathbb {E}[\ell (f({\mathbf {X}}),Y)|S])\) with \(\ell\) a loss function. This corresponds to the GSA measure \(Sob_S(\ell (f({\mathbf {X}}),Y))\). Similarly, for a binary classifier, we recover the classical definition.

Among well-known fairness measures, we point out that we immediately recover two main fairness measures used in the fair learning literature—namely Statistical Parity and Equality of Odds. GSA measures can be computed for different functions \(\varPhi\), highlighting either the behaviour of the algorithm, \(\varPhi (Y,{\mathbf {X}}) = f({\mathbf {X}})\), or its performance, \(\varPhi (Y,{\mathbf {X}}) = \ell (Y,f({\mathbf {X}}))\) for a given loss \(\ell\). This can lead to different GSA-Fairness definitions from the same GSA measure, see Examples 1 and 4.

Example 5

Recent work in the fairness literature has introduced various definitions and measures to quantify the influence of a sensitive feature, beyond the classical notions. For instance, Hickey et al. (2020) use Shapley values, Li et al. (2019) use HSIC measures, Ghassami et al. (2018) use Mutual Information, and so on. All these measures have been extensively studied in the GSA literature, as mentioned in the previous section, and these frameworks are included in ours.

For an additional example, consider the Hirschfeld-Gebelein-Rényi Maximum Correlation Coefficient (Rényi, 1959)—denoted HGR—which is defined for two random variables U and V as

$$\begin{aligned} HGR(U,V) = \sup _{f \in \mathbb {L}^2(\mathbb {P}_{U}),g \in \mathbb {L}^2(\mathbb {P}_{V})} \text{ Corr }(f(U),g(V)). \end{aligned}$$
(14)

This index is used in Mary et al. (2019) to quantify fairness and is linked to the Sobol’ indices presented earlier as the alternate definition of this quantity given in Rényi (1959) can be written with Sobol’ indices:

$$\begin{aligned} HGR(U,V) = \sup _{g\in \mathbb {L}^2(\mathbb {P}_{V}),\, \mathbb {E}[g(V)] = 0,\, \mathbb {E}[g(V)^2]=1} \sqrt{\mathbb {E}\left[ \mathbb {E}[g(V)|U]^2\right] }, \end{aligned}$$
(15)

and therefore,

$$\begin{aligned} HGR(X_i,f({\mathbf {X}}))^2 = \sup _{g\in \mathbb {L}^2(\mathbb {P}_{f({\mathbf {X}})})} Sob_{X_i}(g(f({\mathbf {X}}))). \end{aligned}$$
(16)

However, we restrict our study here to Sobol’ indices, mainly for two reasons. First, Sobol’ indices are directly equivalent to very classical fairness metrics, as shown in the examples above. As such, using HGR is a valid choice as a proxy for fairness, but being fair with respect to HGR is harder to achieve than being fair with respect to Sobol’ indices. Secondly, to compute the HGR index, it is necessary to compute a supremum of Sobol’ indices over all square-integrable functions. This additional operation makes the computation harder. A classical work-around is to approximate this quantity by restricting ourselves to some function class using Reproducing Kernel Hilbert Spaces. The interested reader can find more information in Mary et al. (2019).

In Table 2, we summarize the indices associated with the classical fairness definitions discussed in the previous examples. By considering these fairness definitions as GSA measures, we can explain fairness in terms of the simple effects presented in the previous section, along with the limitations of those definitions. For instance, Statistical Parity corresponds to the classical Sobol’ index. The nullity of this index implies no direct influence of the sensitive variables on the outcome, but the sensitive variables may have joint effects with other variables that are not captured by this metric. Statistical Parity is therefore limited in this regard. On the contrary, since Avoiding Disparate Treatment corresponds to the Total Sobol’ index, this definition of fairness captures every possible influence of the sensitive feature on the outcome.

Table 2 Common fairness definitions and associated GSA measures

Remark 5

Note that many fairness measures are defined for a discrete or binary sensitive variable. The GSA framework handles continuous variables without additional difficulties. Moreover, using kernel methods, GSA indices can be defined for a larger and more “exotic” variety of variables such as graphs or trees, for instance. In particular, HSIC (see Berlinet and Thomas-Agnan, 2004; Da Veiga, 2015; Gretton et al., 2005; Meynaoui et al., 2019; Smola, 2007) is a kernel-based GSA measure that has been used in fairness.

3.2 Consequences of viewing Fairness through the lens of Global Sensitivity Analysis

In this subsection, we enumerate various consequences of studying Fairness with this probabilistic framework coming from the GSA literature.

  1. (i)

    Modularity of fairness indicators. Numerous metrics have been proposed in the GSA literature to quantify the influence of a feature on the outcome of an algorithm; we have already mentioned several of them. This diversity allows for choices in the fairness being quantified, since every choice of GSA measure induces a fairness definition. We presented in the previous subsection a concrete example with Sobol’ indices, namely the distinction between Disparate Impact and Avoiding Disparate Treatment. Another example is the use of kernels in HSIC-based indices, as exposed for instance in Li et al. (2019): by selecting various kernels, specific characteristics associated with fairness can be targeted.

  2. (ii)

    Perfect and Approximate fairness. GSA was created specifically to quantify quasi-independence between variables. Merging GSA and Fairness gives a formal framework to the notion of approximate fairness and computationally justifies the use of GSA code to measure and quantify fairness. Additionally, as mentioned in the previous section, the GSA literature includes statistical tests of independence between input variables and outcomes, along with confidence intervals. Therefore, it is possible to compute them in order to test whether perfect or approximate fairness is achieved. Moreover, this opens the possibility of auditing algorithms.

  3. (iii)

    Choice of the target. The framework presented earlier quantifies the influence of a sensitive feature on the outcome of a predictor, but also on any function of the predictor and of the input variables. This includes the loss of a predictor against a target. The flexibility of this framework allows links to be made between various fairness definitions. For example, Disparate Impact and Avoiding Disparate Mistreatment express the same notion of fairness, applied either to the predictor or to the loss of the predictor against a real target. In the first case, we want the algorithm to be independent of the sensitive feature; in the second case, we want the errors of the predictor to be independent of the sensitive feature. Moreover, it allows fairness definitions to be extended to cases where an algorithm may be biased as long as it does not make mistakes.

  4. (iv)

    Second-level Global Sensitivity Analysis. Recent works in GSA take into account the uncertainty in the distribution of the inputs of an algorithm, see Meynaoui et al. (2019). These tools can help in a fairness framework, especially when the distribution of the sensitive features is unknown and unreachable. This will be studied more deeply in future papers.

3.3 Applications to Causal Models

Fig. 1 Examples of representation of causal models with directed acyclic graphs

Quantifying fairness using measures is a first step towards understanding bias in Machine Learning. Yet causality allows us to understand the true causes of discrimination, as discrimination is often related to the causal effect of a variable. The relations between variables describing causality are often modeled using a Directed Acyclic Graph (DAG). We refer to Pearl (2009), Bongers et al. (2020).

In this subsection, we show how to address causal notions of fairness using the GSA framework, illustrated by a synthetic example and a social example. We show that the information gained thanks to Sobol’ indices allows us to learn some characteristics of the causal model.

We tackle the problem of predicting Y by \({\hat{Y}}\) knowing \((X, S)\), while the non-sensitive variables are influenced by a non-observed exogenous variable U. This is modeled by the following equations:

$$\begin{aligned} X = \phi (U,S)\quad {\hat{Y}} = \psi (X,S), \end{aligned}$$

where \(\phi\) and \(\psi\) are some unknown functions. These equations are a consequence of the unique solvability of acyclic models (Bongers et al., 2020) and are illustrated in the various DAGs of Fig. 1.

In many practical cases, the causal graph is unknown and we need indices to quantify causality. In the following, we are not interested in the complete knowledge of the graph—recovering it is an NP-hard problem—but only in the existence of paths from S to Y. Actually, GSA can quantify causal influence following the DAG structure, and different GSA indices correspond to different paths from S to Y. Different types of relationships can be measured, in particular with the Total Sobol’ and Total Independent Sobol’ indices, to detect either the presence of a direct path from S to Y, or of a path from S to another variable X that itself influences the predictor Y. We call this latter effect a “bouncing effect” since S influences Y only through a mediator.

The following proposition explains how specific Sobol indices can be used to detect the presence of causal links between the sensitive variable and the outcome of the algorithm.

Proposition 1

(Quantifying Causality with Sobol Index)

  • The condition \(SobT_S = 0\) implies that every path from S to Y is non-existent, that is, S and Y belong to two different connected components of the causal graph.

  • The condition \(SobT_S^{ind} = 0\) implies that the direct path from S to Y is non-existent, that is, the absence of a direct edge between S and Y in the causal graph.

Hence, using GSA, we can infer the absence of causal links between sensitive features and the outcome of an algorithm without knowing the structure of the DAG. Note that, while Sobol’ indices are correlation-based, this is not an issue for quantifying causality for fairness, as the sensitive features are usually assumed to be roots of the DAG (Bongers et al., 2020; de Lara et al., 2021).
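A small simulation in the spirit of Fig. 1c illustrates this: the sensitive feature acts only through a mediator, so the direct-edge index vanishes while the correlation-aware index does not. The linear Gaussian model and the numerical values below are our own illustrative choices.

```python
import numpy as np

# Bouncing causal pattern (Fig. 1c): S -> X -> Y_hat, no direct edge from S to Y_hat.
rng = np.random.default_rng(0)
n, sigma = 200_000, 0.5
S = rng.standard_normal(n)
X = S + sigma * rng.standard_normal(n)      # mediator driven by S
Y_hat = X                                   # the predictor only looks at X
var_y = Y_hat.var()
sob_S = S.var() / var_y                     # Var(E[Y_hat | S]) / Var(Y_hat), since E[Y_hat | S] = S
sobt_ind_S = 0.0                            # E[Var(Y_hat | X)] = 0: Y_hat is a function of X alone
print(sob_S, sobt_ind_S)                    # about 0.8 and 0: no direct edge, but a path through X
```

Since \(SobT_S \ge Sob_S\), a positive \(Sob_S\) already certifies the existence of a path from S to \({\hat{Y}}\), while \(SobT^{ind}_S = 0\) is consistent with the absence of a direct edge.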

Example 6

In this example, we specify three causal models and illustrate the previous proposition.

In Fig. 1a, S directly influences the outcome \({\hat{Y}}\). There is no interaction between S and X; this happens, for instance, when S and X are independent. In such a case, Sobol’ indices and independent Sobol’ indices are the same, as mentioned in Remark 1. The equality \(SobT_S = SobT_S^{ind}\) ensures the absence of a “bouncing effect” for the sensitive variable S.

In Fig. 1b, we have no information about the influence of S on the outcome.

In Fig. 1c, S has no direct influence on the outcome, therefore \(SobT_S^{ind} = 0\). This variable can still be influential on the outcome since it may modify other variables of interest. In this case, X is a mediator variable through which the sensitive feature influences the outcome with a “bouncing effect”. A model describing this kind of DAG in a fairness framework is the “College admissions” case, explained below.

Example 7

(College admissions) This example focuses on the college admissions process. Consider S to be the gender, X the choice of department, U the test score and \({\hat{Y}}\) the admission decision. The gender should not directly influence any admission decision \({\hat{Y}}\), but different genders may apply to the departments, represented by the variable X, at different rates, and some departments may be more competitive than others. Gender may thus influence the admission outcome through the choice of department, but not directly. In a fair world, the causal model for admission can be modeled by a DAG without a direct edge from S to \({\hat{Y}}\). Conversely, in an unfair world, decisions can be influenced directly by the sensitive feature S—hence the existence of a direct edge between S and \({\hat{Y}}\). This issue of unresolved discrimination is tackled in Kilbertus et al. (2017), Frye et al. (2020).

It has been remarked in the literature that causal-based fairness is not easy to compute, especially when the joint distribution of mixed inputs conditional on continuous variables is hard to obtain from the observed data. When access to this joint distribution is not possible, recent works in GSA have proposed new estimation procedures (Gamboa et al., 2020) based on the work of Chatterjee (2020). These procedures make no assumption on the distribution and provide asymptotically normal estimates of GSA indices (and therefore of the associated fairness metrics) at a low cost, since they only require sorting the data and finding the nearest neighbor of a data point.

3.4 Quantifying intersectional (un)fairness with GSA index

Most fairness results are stated in the case where there is only one sensitive variable. Yet in many cases, the bias and the resulting possible discrimination are the result of multiple sensitive variables. This situation is known as intersectionality: the level of discrimination at the intersection of several minority groups can be worse than the discrimination present in each group, as presented in Crenshaw (1989). Some recent works provide extensions of fairness measures that take into account the bias amplification due to intersectionality. We refer for instance to Morina et al. (2019) or Foulds et al. (2020). However, quantifying this worst-case scenario cannot be achieved using standard fairness measures. The GSA framework allows for controlling the influence of a set of variables and as such can naturally address intersectional notions of fairness.

Intersectional fairness is obtained when multiple sensitive variables (for instance \(S_1\) and \(S_2\) in the simplest case) do not have any joint influence on the output of the algorithm. We propose a definition of intersectional fairness using GSA indices.

Definition 3

Let \(S_1, S_2, \cdots , S_m\) be sensitive features. An algorithm output is said to be intersectionally fair if \(\varGamma (\varPhi (X, S_1, \cdots , S_m); (S_1, \cdots , S_m)) = 0\). This constraint can be relaxed to \(\varGamma (\varPhi (X, S_1, \cdots , S_m); (S_1, \cdots , S_m)) \le \varepsilon\) with \(\varepsilon\) small for approximate intersectional fairness.

Consider two independent protected features \(S_1\) and \(S_2\) (e.g. gender and ethnicity). Depending on the chosen definition of fairness, there are situations where fairness is obtained with respect to \(S_1\) and with respect to \(S_2\), but where the combined effect of \((S_1, S_2)\) is not taken into account. For instance, let \(Y=S_1\times S_2\). In this toy case, the Disparate Impact of \(S_1\), as well as the Disparate Impact of \(S_2\), is equal to 1, while the Disparate Impact of \((S_1, S_2)\) is equal to 0. This can be readily seen thanks to the link between fairness and GSA, as the Sobol’ indices for \(S_1\) and for \(S_2\) are null while the Sobol’ index for the couple \((S_1, S_2)\) is maximal.
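This toy case can be checked numerically with a grouped Sobol’ estimator similar to the sketch after Example 1; we use the exclusive-or of two binary features, the \(\{0,1\}\) analogue of the product example, and the function name is ours.

```python
import numpy as np

def sobol_discrete(s, y):
    """Sob_s = Var(E[y | s]) / Var(y) for a discrete sensitive variable
    (a tuple of variables can be encoded as a single label)."""
    y = np.asarray(y, dtype=float)
    values = np.unique(s)
    weights = np.array([np.mean(s == v) for v in values])
    means = np.array([y[s == v].mean() for v in values])
    return np.sum(weights * (means - y.mean()) ** 2) / y.var()

# y depends only on the interaction of S1 and S2 (exclusive-or).
rng = np.random.default_rng(0)
s1, s2 = rng.integers(0, 2, 50_000), rng.integers(0, 2, 50_000)
y = s1 ^ s2
print(sobol_discrete(s1, y), sobol_discrete(s2, y))   # both close to 0: each feature looks fair
print(sobol_discrete(2 * s1 + s2, y))                 # close to 1: the pair (S1, S2) explains everything
```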

Proposition 2

Let \((S_1, S_2, \cdots , S_m)\) be sensitive features. Being fair in the sense of Disparate Impact for \(S_1\) and being fair in the sense of Disparate Impact for \(S_2\) does not imply any intersectional fairness in the sense of Disparate Impact.

However, if we consider the same toy case but look at the Total Sobol’ indices, we see that \(SobT_{S_1} = 0\) implies that \(SobT_{(S_1, S_2)} =0\).

Proposition 3

Let \((S_1, S_2, \cdots , S_m)\) be sensitive features. Being fair in the sense of Avoiding Disparate Treatment for \(S_1\) implies intersectional fairness for any intersection in which \(S_1\) appears.

Remark 6

Intersectional fairness differs from classical fairness. Classical fairness only pays attention to the influence of a single sensitive feature on the outcome, while intersectional fairness quantifies only the influence due to interactions between sensitive features. In applications, the goal is usually to have both classical and intersectional fairness. A single fairness definition covering both characteristics can be hard to find or too restrictive to use readily. For instance, among the Sobol’ indices, only the Total Sobol’ index induces both classical and intersectional fairness.

4 Experiments

Table 3 Synthetic experiments based on causal DAGs—Fig. 1

4.1 Synthetic experiments

In this subsection, we focus on the computation of the complete set of Sobol’ indices in a synthetic framework. We design three experiments, modeled after the causal generative models shown in Fig. 1. For simplicity, we consider a Gaussian model. In each experiment \(j,j \in \{1, 2, 3\}\), (X, S, U) are random variables drawn from a Gaussian distribution with covariance matrix \(C_j\), where

$$\begin{aligned} C_1 = C_2 = \begin{pmatrix} 1 & 0.5 & 0.5\\ 0.5 & 1 & 0 \\ 0.5 & 0 & 1 \end{pmatrix}, \quad C_3 = \begin{pmatrix} 1 & 0 & 0.5\\ 0 & 1 & 0 \\ 0.5 & 0 & 1 \end{pmatrix}. \end{aligned}$$

The random variable U is unobserved in this case and therefore has no Sobol’ indices; its purpose is to simulate exogenous variables that modify the features in X. The target \(Y_j\), described in Table 3 for each experiment, is equal to

$$\begin{aligned} Y_1 =&2 \times X,\\ Y_2 = Y_3 =&0.7 \times X + 0.3 \times S.\\ \end{aligned}$$

The first experiment shows the difference between independent and non-independent Sobol’ indices. The outcome is entirely determined by a single variable X, and therefore \(Sob_X = 1\). However, X is intrinsically linked with the sensitive feature through the covariance matrix, so that \(Sob_X^{ind} \not = 0\). This is a concrete example where Statistical Parity is not obtained for S, but only the unresolved discrimination mentioned in Example 7 is present, since S is influential only through X.

The second experiment adds a direct path from the variable S to the outcome Y. Since Y can be factorized as an effect of X plus an effect of S, we still have \(Sob_X = SobT_X\) and \(Sob_X^{ind} = SobT^{ind}_X\). However, in this case, X is no longer enough to fully explain the outcome, so that \(Sob_X \not = 1\). \(Sob_S^{ind}\) quantifies the influence of this direct path from S to Y. Note that the difference between \(Sob_S\) and \(Sob^{ind}_S\) quantifies the influence of the path from S to Y through the intermediary variable X.

In the third experiment, S and X are independent and S can only influence the outcome directly. This is the framework of classical Global Sensitivity Analysis. In this case, non-independent and independent Sobol’ indices are equal, as mentioned in Remark 1.
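A minimal sketch reproducing the setup of the first experiment is given below; it uses the closed-form Gaussian conditional mean rather than the estimation scheme of Appendix 2, and the sample size is ours.

```python
import numpy as np

# Synthetic experiment 1: (X, S, U) Gaussian with covariance C_1 and Y_1 = 2 X.
# Since Corr(X, S) = 0.5, E[Y_1 | S] = 2 * 0.5 * S, so Sob_S = 0.25, while
# SobT^ind_S = E[Var(Y_1 | X, U)] / Var(Y_1) = 0 because Y_1 depends on X alone.
rng = np.random.default_rng(0)
C1 = np.array([[1.0, 0.5, 0.5],
               [0.5, 1.0, 0.0],
               [0.5, 0.0, 1.0]])
X, S, U = rng.multivariate_normal(np.zeros(3), C1, size=500_000).T
Y = 2.0 * X
sob_S = (2.0 * 0.5 * S).var() / Y.var()     # Var(E[Y | S]) / Var(Y)
sobt_ind_S = 0.0                            # Var(Y | X, U) = 0
print(sob_S, sobt_ind_S)                    # about 0.25 and 0: S acts on Y only through X
```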

Note that for these synthetic examples, we have complete access to the joint law of the input variables. In such a case, we can apply the estimation schemes described in Appendix 2 directly. The code can be found in the following repository https://forge.deel.ai/Fair/sobol_indices_extendedGIT.

Fig. 2 Cramér–von-Mises and independent Cramér–von-Mises indices for the Adult dataset

4.2 Real data sets

In this section, we focus on the implementation of Cramér–von-Mises indices on two real-life datasets: the Adult dataset (Dua and Graff, 2019) and the COMPAS dataset.

For real data sets, we first need to preprocess the data. For the Adult dataset, we applied the same preprocessing as the one described in Besse et al. (2021). For the COMPAS dataset, we used the same preprocessing as Zafar et al. (2017). Additionally, since the joint law of the inputs is not accessible, we added noise to the binary data to make them continuous and used a Gaussian approximation for the copula, as described in Mara et al. (2015), in order to have tractable estimates.

4.2.1 Adult dataset

The Adult dataset consists of 14 attributes for 48,842 individuals. The class label corresponds to the annual income (below/above \$50,000). We study the effect of the different attributes. The results for a classifier built with an Extreme Gradient Boosting procedure are shown in Fig. 2. We used the same preprocessing as Besse et al. (2021) for the choice of variables.

The independent Cramér–von-Mises indices quantify the direct influence of a variable. We recover the influential features—“capital gain”, “education-number”, “age”, “occupation”...—reported in other studies (Besse et al., 2021; Frye et al., 2020).

The joint influences of the other variables on the outcome are also measured using GSA indices. Variables for which the independent and classical Cramér–von-Mises indices are the same have no “bouncing” influence; otherwise, the gap between these two indices quantifies this specific effect. For example, the variable “age” correlates with most of the other variables, such as “education-number” or “marital-status”. Because of this, most of its influence goes through “bouncing effects” and the gap between its two indices (i.e. \(CVM\) and \(CVM^{ind}\)) is larger than for any other feature. The variable “sex” also plays an important role through its “bouncing” effect, as can be seen from the difference between the classical and the independent index associated with this feature. This explains why removing the variable “sex” is not enough to obtain a fair predictor, since it influences other variables that affect the prediction. We recover the results obtained by several studies that point out the bias created by the “sex” variable.

Note that race may have led to unbalanced decisions as well. Yet, its Cramér–von-Mises index is lower than the one for the “sex” variable, which explains why the discrimination is lower than the one created by sex, as emphasized by the study of the Disparate Impact, which lies in a 95% confidence interval of [0.34, 0.37] for sex and [0.54, 0.63] for ethnic origin in Besse et al. (2021).

4.2.2 COMPAS dataset

The so-called COMPAS dataset, gathered by ProPublica and described for instance in Washington (2018), contains information about the recidivism risk predicted by the COMPAS tool, as well as the ground-truth recidivism rates, for 7214 defendants. The COMPAS risk score, between 1 and 10 (1 being a low chance of recidivism and 10 a high chance), is computed by an algorithm from the other variables and is used to forecast whether the defendant will reoffend. We analysed this dataset with Cramér–von-Mises indices in order to quantify the fairness exhibited by the COMPAS algorithm. The preprocessing we used is the same as the one described by Zafar et al. (2017). The results are shown in Fig. 3.

Fig. 3 Cramér–von-Mises indices for the COMPAS dataset

First, every independent index is null, which means that the COMPAS algorithm does not rely on a single variable to predict recidivism. Also, gender and ethnicity are virtually not used by the algorithm, as opposed to the variables “age” or “\(priors\_count\)” (the number of previous offenses). Hence, as expected, the algorithm appears to be fair. However, when comparing the accuracy of the predictions of the algorithm with the real-life two-year recidivism, the “race” variable is found to be influential. Hence, the indices we propose recover the bias denounced by ProPublica: an algorithm that, despite seemingly fair predictions, shows a behavior that favors a part of the population based on the race variable.

5 Conclusion

We recalled classical notions from both the Global Sensitivity Analysis and the Fairness literature. We presented new Global Sensitivity Analysis tools, by means of extended Cramér–von-Mises indices, and proved asymptotic normality for the extended Sobol’ indices. These sets of indices allow for uncertainty analysis with non-independent inputs, which is a common situation in real-life data but not often studied in the literature. Concurrently, we linked Global Sensitivity Analysis to Fairness in a unified probabilistic framework in which a choice of fairness is equivalent to a choice of GSA measure. We showed that GSA measures are natural tools for both the definition and the comprehension of Fairness. Such a link between these two fields offers practitioners customized techniques for solving a wide array of fairness modeling problems.