1 Introduction

The remarkable achievements of machine learning (ML) in recent years [12, 32, 47] are not matched by a comparable degree of trust. The most promising ML models are inscrutable in their operation, and this opacity raises distrust in their use and deployment. Motivated by a critical need to help human decision makers grasp the decisions made by ML models, there has been extensive work on explainable AI (XAI). Well-known examples include so-called model-agnostic explainers and alternatives based on saliency maps for neural networks [9, 50, 58, 59]. While most XAI approaches do not offer guarantees of rigor, and so can produce explanations that are unsound given the underlying ML model, there have been efforts on developing rigorous XAI approaches over the last few years [40, 54, 63]. Rigorous explainability involves the computation of explanations, but also the ability to answer a wide range of related queries [7, 8, 36].

By building on the relationship between explainability and logic-based abduction [25, 30, 40, 61], this paper analyzes two concrete queries, namely feature necessity and relevancy. Given an ML classifier, an instance (i.e. point in feature space and associated prediction) and a target feature, the goal of feature necessity is to decide whether the target feature occurs in all explanations of the given instance. Under the same assumptions, the goal of feature relevancy is to decide whether a feature occurs in some explanation of the given instance. This paper proves a number of complexity results regarding feature necessity and relevancy, focusing on well-known families of classifiers, some of which are widely used in ML. Moreover, the paper proposes novel algorithms for deciding relevancy for two families of classifiers. The experimental results demonstrate the scalability of the proposed algorithms.

The paper is organized as follows. The notation and definitions used throughout are presented in Section 2. The problems of feature necessity and relevancy are studied in Section 3, and example algorithms are proposed in Section 4. Section 5 presents experimental results for a sample of families of classifiers, Section 6 relates our contribution to earlier work, and Section 7 concludes the paper.

2 Preliminaries

Complexity classes, propositional logic & quantification. The paper assumes basic knowledge of computational complexity, namely the classes of decision problems P, NP and \(\Sigma _2^\text {P}\) [6]. The paper also assumes basic knowledge of propositional logic, including the Boolean satisfiability (SAT) problem for propositional logic formulas in conjunctive normal form (CNF), and the use of SAT solvers as oracles for the complexity class NP. The interested reader is referred to textbooks on these topics [6, 13].

2.1 Classification Problems

Throughout the paper, we will consider classifiers as the underlying ML model. Classification problems are defined on a set of features (or attributes) \({\mathcal {F}}=\{1,\ldots ,m\}\) and a set of classes \({\mathcal {K}}=\{c_1,c_2,\ldots ,c_K\}\). Each feature \(i\in {\mathcal {F}}\) takes values from a domain \(\mathbb {D}_i\). Domains are categorical or ordinal, and each domain can be defined on boolean, integer/discrete or real values. Feature space is defined as \(\mathbb {F}=\mathbb {D}_1\times {\mathbb {D}_2}\times \ldots \times {\mathbb {D}_m}\). The notation \(\textbf{x}=(x_1,\ldots ,x_m)\) denotes an arbitrary point in feature space, where each \(x_i\) is a variable taking values from \(\mathbb {D}_i\). The set of variables associated with the features is \(X=\{x_1,\ldots ,x_m\}\). Also the notation \(\textbf{v}=(v_1,\ldots ,v_m)\) represents a specific point in feature space, where each \(v_i\) is a constant representing one concrete value from \(\mathbb {D}_i\). A classifier \(\mathbb {C}\) is characterized by a (non-constant) classification function \(\kappa \) that maps feature space \(\mathbb {F}\) into the set of classes \({\mathcal {K}}\), i.e. \(\kappa :\mathbb {F}\rightarrow {\mathcal {K}}\). An instance denotes a pair \((\textbf{v}, c)\), where \(\textbf{v}\in \mathbb {F}\) and \(c\in {\mathcal {K}}\), with \(c=\kappa (\textbf{v})\).

2.2 Examples of Classifiers

The results presented in the paper apply to a comprehensive range of widely used classifiers [28, 62]. These include decision trees (DTs) [18, 42], decision graphs (DGs) [44] and diagrams (DDs) [1, 68], decision lists (DLs) [38, 60] and sets (DSs) [19, 41], tree ensembles (TEs) [37], including random forests (RFs) [17, 43] and boosted trees (BTs) [29], neural networks (NNs) [56], naive Bayes classifiers (NBCs) [45, 52], classifiers represented with propositional languages, including deterministic decomposable negation normal form (d-DNNFs) [23, 35] and its proper subsets, e.g. sentential decision diagrams (SDDs) [22, 66] and free binary decision diagrams (FBDDs) [23, 31, 68], and also monotonic classifiers. In the rest of the paper, we will analyze some families of classifiers in more detail.

d-DNNF classifiers. Negation normal form (NNF) is a well-known propositional language, where negation operators are restricted to atoms, or inputs. Any propositional formula can be transformed into NNF in polynomial time. Let the support of a node be the set of atoms associated with leaves reachable from the outgoing edges of the node. Decomposable NNF (DNNF) is a restriction of NNF where the children of AND nodes do not share atoms in their support. A DNNF circuit is deterministic (referred to as d-DNNF) if no two children of an OR node can both take value 1 for any assignment to the inputs. Restrictions of NNF, including DNNF and d-DNNF, exhibit important tractability properties [23]. Besides, we briefly introduce FBDDs, which form a proper subset of d-DNNFs. An FBDD over a set X of Boolean variables is a rooted, directed acyclic graph comprising two types of nodes: nonterminal and terminal. A nonterminal node is labeled by a variable \(x_i \in X\), and has two outgoing edges, one labeled by 0 and the other by 1. A terminal node is labeled by 1 or 0, and has no outgoing edges. The subgraph rooted at a node labeled with variable \(x_i\) represents a boolean function f defined by the Shannon expansion: \(f = (x_i \wedge f|_{x_i=1}) \vee (\lnot x_i \wedge f|_{x_i=0})\), where \(f|_{x_i=1}\) (\(f|_{x_i=0}\)) denotes the cofactor [16] of f with respect to \(x_i=1\) (\(x_i=0\)). Moreover, any FBDD is read-once, meaning that each variable is tested at most once on any path from the root node to a terminal node.
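To make these definitions concrete, the following Python sketch (ours, for illustration; the node representation and function names are not from any library) evaluates an FBDD by following the 0/1-labeled edges from the root, which computes exactly the function given by the Shannon expansion above.

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class Node:
    var: int                 # index of the tested variable x_var
    lo: Union["Node", int]   # successor along the 0-labeled edge
    hi: Union["Node", int]   # successor along the 1-labeled edge

def evaluate(node, point):
    """Follow the edges dictated by `point` until a terminal (0 or 1) is
    reached; this realizes f = (x_i and f|x_i=1) or (not x_i and f|x_i=0)."""
    while not isinstance(node, int):
        node = node.hi if point[node.var] else node.lo
    return node

# A read-once FBDD for x1 xor x2: each variable is tested at most once
# on any root-to-terminal path.
f = Node(1, lo=Node(2, lo=0, hi=1), hi=Node(2, lo=1, hi=0))
assert evaluate(f, {1: 0, 2: 1}) == 1
assert evaluate(f, {1: 1, 2: 1}) == 0
```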

Monotonic classifiers. Monotonic classifiers find a number of important applications, and have been studied extensively in recent years [26, 48, 65, 70]. Let \(\preccurlyeq \) denote a partial order on the set of classes \({\mathcal {K}}\); for example, we assume \(c_1\preccurlyeq {c_2}\preccurlyeq \ldots \preccurlyeq {c_K}\). Furthermore, we assume that each domain \(\mathbb {D}_i\) is ordered, such that the value taken by feature i lies between a lower bound \(\lambda (i)\) and an upper bound \(\mu (i)\). Given \(\textbf{v}_1=(v_{11},\ldots ,v_{1i},\ldots ,v_{1m})\) and \(\textbf{v}_2=(v_{21},\ldots ,v_{2i},\ldots ,v_{2m})\), we say that \(\textbf{v}_1\le \textbf{v}_2\) if \(\forall (i\in {\mathcal {F}}).(v_{1i}\le {v_{2i}})\). Finally, a classifier is monotonic if whenever \(\textbf{v}_1\le \textbf{v}_2\), then \(\kappa (\textbf{v}_1)\preccurlyeq \kappa (\textbf{v}_2)\).
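For illustration, monotonicity can be checked by brute force on small Boolean classifiers; the sketch below (ours; the classifier is a toy example, not the one of Fig. 2) verifies the condition \(\textbf{v}_1\le \textbf{v}_2 \Rightarrow \kappa (\textbf{v}_1)\preccurlyeq \kappa (\textbf{v}_2)\) exhaustively.

```python
from itertools import product

# Toy classifier: predict 1 iff at least two features take value 1.
def kappa(v):
    return int(sum(v) >= 2)

def is_monotonic(kappa, m):
    """Exhaustively check that v1 <= v2 implies kappa(v1) <= kappa(v2)
    over the Boolean feature space {0,1}^m."""
    points = list(product([0, 1], repeat=m))
    return all(kappa(v1) <= kappa(v2)
               for v1 in points for v2 in points
               if all(a <= b for a, b in zip(v1, v2)))

assert is_monotonic(kappa, 3)
```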

Running examples. As hinted above, throughout the paper, we will consider two fairly different families of classifiers, namely classifiers represented with d-DNNFs and monotonic classifiers.

Example 1

The first example is the d-DNNF classifier \(\mathbb {C}_1\) shown in Fig. 1. It represents the boolean function \((x_1 \wedge (x_2 \vee x_4)) \vee (\lnot x_1 \wedge x_3 \wedge x_4)\). The instance considered throughout the paper is \((\textbf{v}_{1},c_{1})=((0,1,0,0),0)\).
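(For reference, \(\mathbb {C}_1\) can be transcribed as a Python predicate, which also confirms the prediction for the instance; the transcription is ours, for illustration.)

```python
# The boolean function of C1, with features indexed 1..4 as in the text.
def kappa1(x1, x2, x3, x4):
    return int((x1 and (x2 or x4)) or ((not x1) and x3 and x4))

assert kappa1(0, 1, 0, 0) == 0   # the instance (v1, c1) = ((0,1,0,0), 0)
```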

Example 2

The second running example is the monotonic classifier \(\mathbb {C}_2\) shown in Fig. 2. The instance that is considered throughout the paper is \((\textbf{v}_{2},c_{2})=((1,1,1,1),1)\).

Fig. 1. Example of a d-DNNF classifier

Fig. 2. Example of a monotonic classifier

2.3 Formal Explainability

Prime implicant (PI) explanations [63] represent a minimal set of literals (relating a feature value \(x_i\) and a constant \(v_i\in \mathbb {D}_i\)) that are logically sufficient for the prediction. PI-explanations are related to logic-based abduction, and so are also referred to as abductive explanations (AXp's) [54]. AXp's offer guarantees of rigor that are not offered by other alternative explanation approaches. More recently, AXp's have been studied in terms of their computational complexity [7, 10]. There is a growing body of recent work on formal explanations [3, 4, 5, 14, 15, 24, 27, 33, 51, 54, 67].

Formally, given \(\textbf{v}=(v_1,\ldots ,v_m)\in \mathbb {F}\), with \(\kappa (\textbf{v})=c\), an AXp is any subset-minimal set \({\mathcal {X}}\subseteq {\mathcal {F}}\) such that,

$$\begin{aligned} \textsf{WAXp}({\mathcal {X}}) \quad {:=} \quad \forall (\textbf{x}\in \mathbb {F}). \left[ \bigwedge \nolimits _{i\in {{\mathcal {X}}}}(x_i=v_i) \right] \mathop {\mathrm {\rightarrow }}\limits (\kappa (\textbf{x})=c) \end{aligned}$$
(1)

If a set \({\mathcal {X}}\subseteq {\mathcal {F}}\) is not minimal but (1) holds, then \({\mathcal {X}}\) is referred to as a weak AXp. Clearly, the predicate \(\textsf{WAXp}\) maps \(2^{{\mathcal {F}}}\) into \(\{\bot ,\top \}\) (or \(\{{\textbf {false}},{\textbf {true}}\}\)). Given \(\textbf{v}\in \mathbb {F}\), an AXp \({\mathcal {X}}\) represents an irreducible (or minimal) subset of the features which, if assigned the values dictated by \(\textbf{v}\), are sufficient for the prediction c, i.e. value changes to the features not in \({\mathcal {X}}\) will not change the prediction. We can use the definition of the predicate \(\textsf{WAXp}\) to formalize the definition of the predicate \(\textsf{AXp}\), also defined on subsets \({\mathcal {X}}\) of \({\mathcal {F}}\):

$$\begin{aligned} \textsf{AXp}({\mathcal {X}}) \quad {:=} \quad \textsf{WAXp}({\mathcal {X}}) \wedge \forall ({\mathcal {X}}'\subsetneq {\mathcal {X}}). \lnot \textsf{WAXp}({\mathcal {X}}') \end{aligned}$$
(2)

The definition of \(\textsf{WAXp}({\mathcal {X}})\) ensures that the predicate is monotone. Indeed, if \({\mathcal {X}}\subseteq {\mathcal {X}}'\subseteq {\mathcal {F}}\), and if \({\mathcal {X}}\) is a weak AXp, then \({\mathcal {X}}'\) is also a weak AXp, as the fixing of more features will not change the prediction. Given the monotonicity of predicate \(\textsf{WAXp}\), the definition of predicate \(\textsf{AXp}\) can be simplified as follows, with \({\mathcal {X}}\subseteq {\mathcal {F}}\):

$$\begin{aligned} \textsf{AXp}({\mathcal {X}}) := \textsf{WAXp}({\mathcal {X}}) \wedge \forall (j\in {\mathcal {X}}).\lnot \textsf{WAXp}({\mathcal {X}}\setminus \{j\}) \end{aligned}$$
(3)

This simpler but equivalent definition of AXp has important practical significance, in that only a linear number of subsets needs to be checked, as opposed to the exponentially many subsets in (2). As a result, the algorithms that compute one AXp are based on (3) [54].
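As an illustration of (3), the following brute-force sketch (ours; it assumes a small Boolean feature space, so that the universal check in (1) can be done exhaustively) implements the predicate \(\textsf{WAXp}\) and extracts one AXp by the deletion-based procedure: starting from \({\mathcal {F}}\), each feature is dropped whenever the remaining set is still a weak AXp.

```python
from itertools import product

def waxp(kappa, v, c, X, m):
    """WAXp(X): every point agreeing with v on the features in X
    (features are indexed 1..m) is predicted c."""
    free = [i for i in range(1, m + 1) if i not in X]
    for bits in product([0, 1], repeat=len(free)):
        x = {i: v[i - 1] for i in X}
        x.update(zip(free, bits))
        if kappa(tuple(x[i] for i in range(1, m + 1))) != c:
            return False
    return True

def one_axp(kappa, v, c, m):
    """Deletion-based extraction of one AXp, following (3)."""
    X = set(range(1, m + 1))
    for j in sorted(X):                  # linear number of WAXp checks
        if waxp(kappa, v, c, X - {j}, m):
            X.remove(j)
    return X

kappa1 = lambda x: int((x[0] and (x[1] or x[3])) or ((not x[0]) and x[2] and x[3]))
print(one_axp(kappa1, (0, 1, 0, 0), 0, 4))  # {1, 4}, one of the AXp's of Example 3 below
```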

Example 3

From Example 1, and given the instance ((0, 1, 0, 0), 0), we can conclude that the prediction will be 0 if features 1 and 3 take value 0, or if features 1 and 4 take value 0. Hence, the AXp's are \(\{1,3\}\) and \(\{1,4\}\). It is also apparent that the assignment \(x_2=1\) has no bearing on the prediction being 0.

Example 4

From Example 2, we can conclude that any two of the features 1, 2 and 3 taking value 1 suffice for the prediction. Hence, given the instance ((1, 1, 1, 1), 1), the possible AXp's are \(\{1,2\}\), \(\{1,3\}\), and \(\{2,3\}\). Observe that the definition of \(\kappa _2\) does not depend on feature 4.

Besides abductive explanations, another commonly studied type of explanations are contrastive or counterfactual explanations [8, 36, 39, 55]. As argued in related work [36], the duality between abductive and contrastive explanations implies that for the purpose of the queries studied in this paper, it suffices to study solely abductive explanations.

3 Feature Relevancy & Necessity: Theory

This section investigates the complexity of feature relevancy and necessity. We are interested in membership results, which allow us to devise algorithms for the target problems. We are also interested in hardness results, which serve to confirm that the running time complexities of the proposed algorithms are within reason, given the problem's complexity.

3.1 Defining Necessity, Relevancy & Irrelevancy

Throughout this section, a classifier \(\mathbb {C}\) is assumed, with features \({\mathcal {F}}\), domains \(\mathbb {D}_i\), \(i\in {\mathcal {F}}\), classes \({\mathcal {K}}\), a classification function \(\kappa :\mathbb {F}\rightarrow {\mathcal {K}}\), and a concrete instance \((\textbf{v},c)\), \(\textbf{v}\in \mathbb {F},c\in {\mathcal {K}}\).

Definition 1

(Feature Necessity, Relevancy & Irrelevancy). Let \(\mathbb {A}\) denote the set of all AXp’s for a classifier given a concrete instance, i.e. \(\mathbb {A} = \{{\mathcal {X}}\subseteq {\mathcal {F}}\,|\,\textsf{AXp}({\mathcal {X}})\}\), and let \(t\in {\mathcal {F}}\) be a target feature. Then, (i) t is necessary if \(t\in \cap _{{\mathcal {X}}\in \mathbb {A}}{\mathcal {X}}\); (ii) t is relevant if \(t\in \cup _{{\mathcal {X}}\in \mathbb {A}}{\mathcal {X}}\); and (iii) t is irrelevant if \(t\in {\mathcal {F}}\setminus \cup _{{\mathcal {X}}\in \mathbb {A}}{\mathcal {X}}\).

Throughout the remainder of the paper, the problem of deciding feature necessity is represented by the acronym FNP, and the problem of deciding feature relevancy is represented by the acronym FRP.
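For small classifiers, Definition 1 can be evaluated directly: enumerate all subset-minimal weak AXp's, and take intersections and unions. The brute-force sketch below (ours; exponential, for illustration only) reproduces the sets discussed in Example 5 below for the d-DNNF classifier \(\mathbb {C}_1\).

```python
from itertools import chain, combinations, product

def waxp(kappa, v, c, X, m):
    free = [i for i in range(1, m + 1) if i not in X]
    for bits in product([0, 1], repeat=len(free)):
        x = {i: v[i - 1] for i in X}
        x.update(zip(free, bits))
        if kappa(tuple(x[i] for i in range(1, m + 1))) != c:
            return False
    return True

def all_axps(kappa, v, c, m):
    subsets = chain.from_iterable(
        combinations(range(1, m + 1), r) for r in range(m + 1))
    weak = [set(S) for S in subsets if waxp(kappa, v, c, set(S), m)]
    return [X for X in weak if not any(Y < X for Y in weak)]  # minimal ones

kappa1 = lambda x: int((x[0] and (x[1] or x[3])) or ((not x[0]) and x[2] and x[3]))
A = all_axps(kappa1, (0, 1, 0, 0), 0, 4)   # [{1, 3}, {1, 4}]
print(set.intersection(*A))                # necessary:  {1}
print(set.union(*A))                       # relevant:   {1, 3, 4}
print(set(range(1, 5)) - set.union(*A))    # irrelevant: {2}
```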

Example 5

As shown earlier, for the d-DNNF classifier of Fig. 1, and given the instance \((\textbf{v}_1,c_1)=((0,1,0,0),0)\), there exist two AXp’s, i.e. \(\{1,3\}\) and \(\{1,4\}\). Clearly, feature 1 is necessary, and features 1, 3 and 4 are relevant. In contrast, feature 2 is irrelevant.

Example 6

For the monotonic classifier of Fig. 2, and given the instance \((\textbf{v}_2,c_2)=((1,1,1,1),1)\), we have argued earlier that there exist three AXp’s, i.e. \(\{1,2\}\), \(\{1,3\}\) and \(\{2,3\}\), which allows us to conclude that features 1, 2 and 3 are relevant, but that feature 4 is irrelevant. In this case, there are no necessary features.

The general complexity of necessity and (ir)relevancy has been studied in the context of logic-based abduction [25, 30, 61]. Recent uses in explainability are briefly overviewed in Section 6.

3.2 Feature Necessity

Proposition 2

If deciding \(\textsf{WAXp}({\mathcal {X}})\) is in complexity class \(\mathfrak {C}\), then FNP is in the complexity class co-\(\mathfrak {C}\).

Given the known polynomial complexity of deciding whether a set is a weak AXp for several families of classifiers [54], we then have the following result:

Corollary 3

For DTs, XpG’sFootnote 2, NBCs, d-DNNF classifiers and monotonic classifiers, FNP is in P.

3.3 Feature Relevancy: Membership Results

Proposition 4

(Feature Relevancy for DTs [36]). FRP for DTs is in P.

Proposition 5

If deciding \(\textsf{WAXp}({\mathcal {X}})\) is in P, then FRP is in NP.

The argument above can also be used for proving the following results.

Corollary 6

For XpG’s, NBCs, d-DNNF classifiers and monotonic classifiers, FRP is in NP.

Proposition 7

If deciding \(\textsf{WAXp}({\mathcal {X}})\) is in NP, then FRP is in \(\Sigma _2^\text {P}\).

Corollary 8

For DLs, DSs, RFs, BTs, and NNs, FRP is in \(\Sigma _2^\text {P}\).

Additional results. The following result will prove useful in designing algorithms for FRP in practice.

Proposition 9

Let \({\mathcal {X}}\subseteq {\mathcal {F}}\), and let \(t\in {\mathcal {X}}\) denote some target feature such that \(\textsf{WAXp}({\mathcal {X}})\) holds and \(\textsf{WAXp}({\mathcal {X}}\setminus \{t\})\) does not hold. Then, for any AXp \({\mathcal {Z}}\subseteq {\mathcal {X}}\), it must be the case that \(t\in {\mathcal {Z}}\).

3.4 Feature Relevancy: Hardness Results

Proposition 10

(Relevancy for DNF Classifiers [36]). Feature relevancy for a DNF classifier is \(\Sigma _2^\text {P}\)-hard.

Proposition 11

Feature relevancy for monotonic classifiers is NP-hard.

Proof

We say that a CNF is trivially satisfiable if some literal occurs in all clauses. Clearly, SAT restricted to CNFs that are not trivially satisfiable is still NP-complete. Let \(\varPhi \) be a CNF on variables \(x_1,\ldots ,x_k\) that is not trivially satisfiable. Let \(N = 2k\). Let \(\tilde{\varPhi }\) be identical to \(\varPhi \) except that each occurrence of a negative literal \(\lnot x_i\) (\(1 \le i \le k\)) is replaced by \(x_{i+k}\). Thus \(\tilde{\varPhi }\) is a CNF on N variables, each of which occurs only positively. Define the boolean classifier \(\kappa \) (on \(N+1\) features) by \(\kappa (x_0,x_1,\ldots ,x_N) = 1\) iff \(x_i = x_{i+k} = 1\) for some \(i \in \{1,\ldots ,k\}\) or \(x_0 \wedge \tilde{\varPhi }(x_1,\ldots ,x_N)=1\). To show that \(\kappa \) is monotonic we need to show that \(\textbf{a} \le \textbf{b} \Rightarrow \kappa (\textbf{a}) \le \kappa (\textbf{b})\). This follows by examining the two cases in which \(\kappa (\textbf{a}) = 1\): if \(a_i=a_{i+k}=1\) and \(\textbf{a} \le \textbf{b}\), then \(b_i=b_{i+k}=1\), whereas, if \(a_0 \wedge \tilde{\varPhi }(a_1,\ldots ,a_N)=1\) and \(\textbf{a} \le \textbf{b}\), then \(b_0 \wedge \tilde{\varPhi }(b_1,\ldots ,b_N) = 1\) (by positivity of \(\tilde{\varPhi }\)); so in both cases \(\kappa (\textbf{b}) = 1 \ge \kappa (\textbf{a})\).

Clearly \(\kappa (\textbf{1}_{N+1}) = 1\). There are k obvious AXp's of this prediction, namely \(\{i, i+k\}\) (\(1 \le i \le k\)). These are minimal by the assumption that \(\varPhi \) is not trivially satisfiable. This means that no other AXp contains both i and \(i+k\) for any \(i \in \{1,\ldots , k\}\). Suppose that \(\varPhi (\textbf{u})=1\). Let \({\mathcal {X}}_u\) be \(\{0\} \cup \{i \mid 1 \le i \le k \wedge u_i=1\} \cup \{i+k \mid 1 \le i \le k \wedge u_i=0\}\). Then \({\mathcal {X}}_u\) is a weak AXp of the prediction \(\kappa (\textbf{1}_{N+1})=1\). Furthermore, \({\mathcal {X}}_u\) does not contain any of the AXp's \(\{i,i+k\}\). Therefore some subset of \({\mathcal {X}}_u\) is an AXp, and clearly this subset must contain feature 0. Thus, if \(\varPhi \) is satisfiable, then there is an AXp which contains 0.

We now show that the converse also holds. If \({\mathcal {X}}\) is an AXp of \(\kappa (\textbf{1}_{N+1}) = 1\) containing 0, then it cannot also contain any of the pairs \(i,i+k\) (\(1 \le i \le k\)), otherwise we could delete 0 and still have an AXp. We will show that this implies that we can build a satisfying assignment \(\textbf{u}\) for \(\varPhi \). Consider first \(\textbf{v}=(v_0,\ldots ,v_N)\) defined by \(v_i=1\) if \(i \in {\mathcal {X}}\) (\(0 \le i \le N\)) and \(v_{i+k} = 1\) if neither i nor \(i+k\) belongs to \({\mathcal {X}}\) (\(1 \le i \le k\)), and \(v_i=0\) otherwise (\(1 \le i \le N\)). Then \(\kappa (\textbf{v})=1\) by definition of an AXp, since \(\textbf{v}\) agrees with the vector \(\textbf{1}_{N+1}\) on all features in \({\mathcal {X}}\). We can also note that \(v_0=1\) since \(0 \in {\mathcal {X}}\). Since \({\mathcal {X}}\) does not contain both i and \(i+k\) (\(1 \le i \le k\)), it follows that \(v_i \ne v_{i+k}\). Now let \(u_i=1\) iff \(i \in {\mathcal {X}}\) (\(1 \le i \le k\)). It is easy to verify that \(\varPhi (\textbf{u}) = \tilde{\varPhi }(\textbf{v}) = \kappa (\textbf{v}) = 1\).

Thus, determining whether the prediction \(\kappa (\textbf{1}_{N+1}) = 1\) has an AXp containing feature 0 is equivalent to testing the satisfiability of \(\varPhi \). It follows that FRP is NP-hard for monotonic classifiers by this polynomial reduction from SAT.    \(\square \)
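To see the reduction at work, the following script (ours, illustrative; brute force stands in for a real oracle) instantiates the construction for the small CNF \(\varPhi = (x_1 \vee \lnot x_2) \wedge (\lnot x_1 \vee x_2)\), with \(k=2\), and confirms that feature 0 occurs in some AXp of \(\kappa (\textbf{1}_{N+1})=1\) precisely because \(\varPhi \) is satisfiable.

```python
from itertools import chain, combinations, product

k = 2
N = 2 * k
# Phi as clauses of signed literals; satisfiable, not trivially satisfiable.
phi = [{+1, -2}, {-1, +2}]

def kappa(x):   # x = (x_0, x_1, ..., x_N); negative literal of x_j renamed x_{j+k}
    if any(x[i] and x[i + k] for i in range(1, k + 1)):
        return 1
    tilde = all(any(x[l] if l > 0 else x[-l + k] for l in cl) for cl in phi)
    return int(x[0] and tilde)

def waxp(X):    # weak AXp check for the all-ones instance, prediction 1
    return all(kappa(bits) == 1
               for bits in product([0, 1], repeat=N + 1)
               if all(bits[i] == 1 for i in X))

subsets = chain.from_iterable(
    combinations(range(N + 1), r) for r in range(N + 2))
weak = [set(S) for S in subsets if waxp(set(S))]
axps = [X for X in weak if not any(Y < X for Y in weak)]
print(axps)                        # [{1, 3}, {2, 4}, {0, 1, 2}, {0, 3, 4}]
print(any(0 in X for X in axps))   # True, as Phi is satisfiable
```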

Proposition 12

Relevancy for FBDD classifiers is NP-hard.

Proof

Let \(\psi \) be a CNF formula defined on a variable set \(X = \{x_1, \dots , x_m\}\) and with clauses \(\{\omega _1, \dots , \omega _n\}\). We construct, in polynomial time, an FBDD classifier \({\mathcal {G}}\) (representing a classification function \(\kappa \)) and a target variable, based on \(\psi \), such that \(\psi \) is satisfiable iff \(\kappa \) has an AXp containing the target variable.

For any literal \(l_j \in \omega _i\), replace \(l_j\) with \(l^i_j\). Let \(\psi ' = \{\omega '_1, \dots , \omega '_n\}\) denote the resulting CNF formula, defined on the new variables \(\{x^1_1, \dots , x^1_m, \dots , x^n_1, \dots , x^n_m\}\).

For each original variable \(x_j\), let \(I^+_j\) and \(I^-_j\) denote the indices of the clauses containing literal \(x_j\) and \(\lnot x_j\), respectively. So if \(i \in I^+_j\), then \(x^i_j \in \omega '_i\); if \(i \in I^-_j\), then \(\lnot x^i_j \in \omega '_i\).

To build an FBDD D from \(\psi '\): 1) build an FBDD \(D_i\) for each \(\omega '_i\); 2) replace the terminal node 1 of \(D_i\) with the root node of \(D_{i+1}\) (for \(1 \le i < n\)). D is read-once because each variable \(x^i_j\) occurs only once in \(\psi '\).

Satisfying a literal \(x^i_j \in \omega '_i\) means \(x_j=1\), while satisfying a literal \(\lnot x^k_j \in \omega '_k\) means \(x_j=0\). If both \(x^i_j\) and \(\lnot x^k_j\) are satisfied, then it means we pick inconsistent values for the variable \(x_j\), which is unacceptable.

Let us define \(\phi \) to capture inconsistent values for any variable \(x_j\):

$$\begin{aligned} \phi := \bigvee \nolimits _{1 \le j \le m} \left( \left( \bigvee \nolimits _{i \in I^+_j} x^i_j \right) \wedge \left( \bigvee \nolimits _{k \in I^-_j} \lnot x^k_j \right) \right) \end{aligned}$$
(4)

If \(I^+_j = \emptyset \), then let \(\left( \bigvee \nolimits _{i \in I^+_j} x^i_j\right) = 0\). If \(I^-_j = \emptyset \), then let \(\left( \bigvee \nolimits _{k \in I^-_j} \lnot x^k_j\right) = 0\).

Any true point of \(\phi \) means we pick inconsistent values for some variable \(x_j\), so it represents an unacceptable point of \(\psi \). To avoid such inconsistency, one needs to falsify at least one of \(\bigvee \nolimits _{i \in I^+_j} x^i_j\) and \(\bigvee \nolimits _{k \in I^-_j} \lnot x^k_j\) for each variable \(x_j\). To build an FBDD G from \(\phi \): 1) build FBDDs \(G^+_j\) and \(G^-_j\) for \(\bigvee \nolimits _{i \in I^+_j} x^i_j\) and \(\bigvee \nolimits _{k \in I^-_j} \lnot x^k_j\), respectively; 2) replace the terminal node 1 of \(G^+_j\) with the root node of \(G^-_j\), and let \(G_j\) denote the resulting FBDD; 3) replace the terminal node 0 of \(G_j\) with the root node of \(G_{j+1}\) (for \(1 \le j < m\)). G is read-once because each variable \(x^i_j\) occurs only once in \(\phi \).

Create a root node labeled \(x^0_0\), and link its 1-edge to the root of D and its 0-edge to the root of G. The resulting graph \({\mathcal {G}}\) is an FBDD representing \(\kappa := (x^0_0 \wedge \psi ') \vee (\lnot x^0_0 \wedge \phi )\); \(\kappa \) is a boolean classifier defined on \(\{x^0_0, x^1_1, \dots , x^n_m\}\), and \(x^0_0\) is the target variable. The number of nodes of \({\mathcal {G}}\) is \(O(n \times m)\). Let \({\mathcal {I}} = \{(0,0)\} \cup \{(i,j) \mid 1 \le i \le n, 1 \le j \le m\}\) denote the set of variable indices, so that each variable \(x^i_j\) has \((i,j) \in {\mathcal {I}}\).

Pick an instance \(\textbf{v} = (v^0_0, \dots , v^i_j, \dots )\) satisfying every literal of \(\psi '\) (i.e. \(v^i_j=1\) if \(x^i_j \in \psi '\) and \(v^k_j=0\) if \(\lnot x^k_j \in \psi '\)) and such that \(v^0_0=1\). Then \(\psi ' (\textbf{v}) = 1\), and so \(\kappa (\textbf{v}) = 1\).

Suppose \({\mathcal {X}} \subseteq {\mathcal {I}}\) is an AXp of \(\textbf{v}\). 1) If \(\{(i, j), (k, j)\} \subseteq {\mathcal {X}}\) for some variable \(x_j\), where \(i \in I^+_j\) and \(k \in I^-_j\), then for any point \(\textbf{u}\) of \(\kappa \) such that \(u^i_j = v^i_j\) for all \((i,j) \in {\mathcal {X}}\), we have \(\kappa (\textbf{u}) = 1\) and \(\phi (\textbf{u}) = 1\). Moreover, if \(\textbf{u}\) sets \(u^0_0=1\), then \(\kappa (\textbf{u}) = 1\) implies \(\psi '(\textbf{u}) = 1\), whereas if \(\textbf{u}\) sets \(u^0_0=0\), then \(\kappa (\textbf{u}) = 1\) because \(\phi (\textbf{u}) = 1\). Hence \(\kappa (\textbf{u}) = 1\) regardless of the value of \(u^0_0\), and so \((0,0) \not \in {\mathcal {X}}\).

2) If \(\{(i, j), (k, j)\} \not \subseteq {\mathcal {X}}\) for every variable \(x_j\), where \(i \in I^+_j\) and \(k \in I^-_j\), then for some point \(\textbf{u}\) of \(\kappa \) such that \(u^i_j = v^i_j\) for all \((i,j) \in {\mathcal {X}}\), we have \(\phi (\textbf{u}) \ne 1\); in this case \(\kappa (\textbf{u}) = 1\) implies \(\psi '(\textbf{u}) = 1\). Moreover, any such \(\textbf{u}\) must set \(u^0_0=1\), so \((0,0) \in {\mathcal {X}}\).

If case 2) occurs, then \(\psi \) is satisfiable (a satisfying assignment sets \(x_j=1\) iff \((i,j) \in {\mathcal {X}}\) for some \(i \in I_j^{+}\)). If case 2) never occurs, then \(\psi \) is unsatisfiable. It follows that FRP is NP-hard for FBDD classifiers by this polynomial reduction from SAT.    \(\square \)

Corollary 13

Relevancy for d-DNNF classifiers is NP-hard.

4 Feature Relevancy: Example Algorithms

This section details two methods for FRP. One method decides feature relevancy for d-DNNF classifiers, whereas the other decides feature relevancy for arbitrary monotonic classifiers. Based on Proposition 2 and Corollary 3, existing algorithms for computing one AXp [35, 36, 52, 53] can be used to decide feature necessity; hence, there is no need to devise new algorithms. Additionally, the weak AXp returned by the proposed methods (if one exists) can be fed, as a seed, to the algorithms for computing one AXp [35, 53], which extract an AXp in polynomial time.

4.1 Relevancy for d-DNNF Classifiers

This section details a propositional encoding that decides feature relevancy for d-DNNFs. The encoding follows the approach described in the proof of Proposition 9, and comprises replicas of the same d-DNNF classifier \(\mathbb {C}\): replica \(\mathbb {C}^0\) encodes \(\textsf{WAXp}({\mathcal {X}})\) (i.e. the prediction of \(\kappa \) remains unchanged), whereas replica \(\mathbb {C}^k\), for a picked feature k (and in particular for the target feature t), encodes \(\lnot \textsf{WAXp}({\mathcal {X}}\setminus \{k\})\) (i.e. the prediction of \(\kappa \) changes once feature k is freed). The encoding is polynomial in the size of the classifier's representation.

Table 1. Encoding for deciding whether there is a weak AXp including feature t.

The encoding is applicable to the case where the prediction is 0, i.e. \(\kappa (\textbf{v}) = 0\). The case \(\kappa (\textbf{v}) = 1\) can be handled by considering the negated classifier, i.e. \((\lnot \kappa )(\textbf{v}) = 0\), so we assume that both the d-DNNF \(\mathbb {C}\) and its negation \(\lnot \mathbb {C}\) are given. To present the constraints included in this encoding, we need to introduce some auxiliary boolean variables and predicates.

1. \(s_i\), \(1 \le i \le m\): \(s_i\) is a selector such that \(s_i = 1\) iff feature i is included in the weak AXp candidate \({\mathcal {X}}\).
2. \(n^k_j\), \(1 \le j \le |\mathbb {C}|\), \(0 \le k \le m\): \(n^k_j\) is the indicator of node j of the d-DNNF \(\mathbb {C}\) in replica k; the indicator of the root node of the k-th replica is \(n^k_1\). The semantics of \(n^k_j\) is that \(n^k_j = 1\) iff the sub-d-DNNF rooted at node j in the k-th replica is consistent.
3. \({\textsf {Leaf}}(j) = 1\) if node j is a leaf node.
4. \({\textsf {NonLeaf}}(j) = 1\) if node j is a non-leaf node.
5. \({\textsf {Feat}}(j,i) = 1\) if leaf node j is labeled with feature i.
6. \({\textsf {Sat}}({\textsf {Lit}}(j),v_i) = 1\) if, for leaf node j, the literal on feature i is satisfied by \(v_i\).

The encoding is summarized in Table 1. As literals are the d-DNNF leaves, the values of the selector variables only affect the values of the indicator variables of leaf nodes. Constraint (1.1) states that, for any leaf node j whose literal is consistent with the given instance, its indicator \(n^k_j\) is always consistent, regardless of the value of \(s_i\). On the contrary, constraint (1.3) states that, for any leaf node j whose literal is inconsistent with the given instance, its indicator \(n^k_j\) is consistent iff feature i is not picked, in other words, iff feature i can take any value. Because replica k (\(k > 0\)) is used to check the necessity of including feature k in \({\mathcal {X}}\), the value of the local copy of selector \(s_k\) is assumed to be 0 in replica k. In this case, as defined in constraint (1.2), even if a leaf node j labeled with feature k has a literal that is inconsistent with the given instance, its indicator \(n^k_j\) is consistent. Constraint (1.4) defines the indicator for an arbitrary \(\vee \) node j, and constraint (1.5) defines the indicator for an arbitrary \(\wedge \) node j. Together, these constraints declare how consistency is propagated through the entire d-DNNF. Constraint (1.6) states that the prediction of the d-DNNF classifier \(\mathbb {C}\) remains 0, since the selected features form a weak AXp. Constraint (1.7) states that if feature i is selected, then removing it will change the prediction of \(\mathbb {C}\). Finally, constraint (1.8) states that feature t must be included in \({\mathcal {X}}\).
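As a compact reference for the decision problem the encoding solves, the following brute-force sketch (ours; exponential, and thus for illustration only, in contrast with the polynomial-size encoding) searches for a set \({\mathcal {X}}\) containing t such that \(\textsf{WAXp}({\mathcal {X}})\) holds and \(\textsf{WAXp}({\mathcal {X}}\setminus \{t\})\) does not; by Proposition 9, any such set witnesses the relevancy of t.

```python
from itertools import chain, combinations, product

kappa1 = lambda x: int((x[0] and (x[1] or x[3])) or ((not x[0]) and x[2] and x[3]))
v, c, m = (0, 1, 0, 0), 0, 4

def waxp(X):
    free = [i for i in range(1, m + 1) if i not in X]
    for bits in product([0, 1], repeat=len(free)):
        x = {i: v[i - 1] for i in X}
        x.update(zip(free, bits))
        if kappa1(tuple(x[i] for i in range(1, m + 1))) != c:
            return False
    return True

def relevant(t):
    """Search for a Proposition 9 witness: t in X, WAXp(X), not WAXp(X - {t})."""
    for S in chain.from_iterable(
            combinations(range(1, m + 1), r) for r in range(m + 1)):
        X = set(S)
        if t in X and waxp(X) and not waxp(X - {t}):
            return X
    return None

print(relevant(3))   # {1, 3}: feature 3 is relevant
print(relevant(2))   # None: feature 2 is irrelevant
```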

Example 7

Given the d-DNNF classifier of Fig. 1 and the instance \((\textbf{v}_1,c_1)=((0,1,0,0),0)\), suppose that the target feature is 3. We have selectors \(\textbf{s} = \{s_1, s_2, s_3, s_4\}\), and the constraints of Table 1 are instantiated over these selectors and over the node indicators of each replica.

Given the AXp’s listed in Example 3, by solving these formulas we will either obtain \(\{1, 3\}\) or \(\{1,4\}\) as the AXp.

4.2 Relevancy for Monotonic Classifiers

This section describes an algorithm for FRP in the case of monotonic classifiers. No assumption is made regarding the actual implementation of the monotonic classifier.

Abstraction refinement for relevancy.

The algorithm proposed in this section iteratively refines an over-approximation (or abstraction) of the set of all subsets \({\mathcal {S}}\) of \({\mathcal {F}}\) such that: i) \({\mathcal {S}}\) is a weak AXp; and ii) any AXp included in \({\mathcal {S}}\) also includes the target feature t.

Formally, the set of subsets of \({\mathcal {F}}\) that we are interested in is defined as follows:

$$\begin{aligned} \mathbb {H} = \{{\mathcal {S}}\subseteq {\mathcal {F}}\,|\, \textsf{WAXp}({\mathcal {S}})\wedge \forall ({\mathcal {X}}\subseteq {\mathcal {S}}).\left[ \textsf{AXp}({\mathcal {X}})\mathop {\mathrm {\rightarrow }}\limits (t\in {\mathcal {X}})\right] \} \end{aligned}$$
(5)

The proposed algorithm iteratively refines the over-approximation of set \(\mathbb {H}\) until one can decide with certainty whether t is included in some AXp. The refinement step involves exploiting counterexamples as these are identified.

(The approach is referred to as abstraction refinement FRP, since the use of abstraction refinement can be related with earlier work (with the same name) in model checking  [20].)

In practice, it will in general be impractical to manipulate such over-approximation of set \(\mathbb {H}\) explicitly. As a result, we use a propositional formula (in fact a CNF formula) \({\mathcal {H}}\), such that the models of \({\mathcal {H}}\) encode the subsets of features about which we have yet to decide whether each of those subsets only contains AXp’s that include t. (Formula \({\mathcal {H}}\) is defined on a set of Boolean variables \(\{s_1,\ldots ,s_m\}\), where each \(s_i\) is associated with feature i, and assigning \(s_i=1\) denotes that feature i is included in a given set, as described below.)

The algorithm then iteratively refines the over-approximation by filtering out sets that have been shown not to belong to \(\mathbb {H}\), i.e. the so-called counterexamples.

Algorithm 1 summarizes the proposed approach.

Also, Algorithms 2 and 3 provide supporting functions. (For simplicity, the function calls of Algorithms 2 and 3 show the arguments, but not the parameterizations.)

Algorithm 1 iteratively uses an NP oracle (in fact a SAT solver) to pick (or guess) a subset \({\mathcal {P}}\) of \({\mathcal {F}}\), such that any previously picked set is not repeated. Since we are interested in feature t, we enforce that the picked set must include t. (This step is shown in lines 4 to 7.)

Now, the features not in \({\mathcal {P}}\) are deemed universal, and so we need to account for the range of possible values that these universal features can take. For that, we update lower and upper bounds on the predicted classes. For the features in \({\mathcal {P}}\) we must use the values dictated by \(\textbf{v}\). (This is shown in lines 8 and 9, and it is sound to do because we have monotonicity of prediction.)

If the lower and upper bounds differ, then the picked set is not even a weak AXp, and so we can safely remove it from further consideration. This is achieved by enforcing that at least one of the non-picked elements is picked in the future.

(As can be observed \({\mathcal {H}}\) is updated with a positive clause that captures this constraint, as shown in line 11.)

If the lower and upper bounds do not differ (i.e. we picked a weak AXp), and if allowing t to take any value causes the bounds to differ, then we know that any AXp in \({\mathcal {P}}\) must include t, and so the algorithm reports \({\mathcal {P}}\) as a weak AXp that is guaranteed to be included in \(\mathbb {H}\). (This is shown in line 14.)

It should be noted that \({\mathcal {P}}\) is not necessarily an AXp. However, by Proposition 9, \({\mathcal {P}}\) is guaranteed to be a weak AXp such that any of the AXp’s contained in \({\mathcal {P}}\) must include feature t. From [53], we know that we can extract an AXp from a weak AXp in polynomial time, and in this case we are guaranteed to always pick one that includes t.

Finally, the last case is when allowing t to take any value does not cause the lower and upper bounds to change. This means we picked a set \({\mathcal {P}}\) that is a weak AXp, but not all AXp's in \({\mathcal {P}}\) include the target feature t (again due to Proposition 9). As a result, we must prevent the same weak AXp from being re-picked. This is achieved by requiring that at least one of the picked features be excluded from any future picked set. (This is shown in line 16. As can be observed, \({\mathcal {H}}\) is updated with a negative clause that captures this constraint.)


As can be concluded from Algorithm 1 and from the discussion above, Proposition 9 is essential to enable us to use at most two weak AXp checks (and so at most four classification queries) per iteration of the algorithm. If we were to use Proposition 5 instead, then the number of classification queries would be significantly larger.
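The following sketch is a minimal reconstruction of Algorithm 1 from the description above (it is not the prototype evaluated in Section 5); it assumes PySAT's Glucose4 as the NP oracle and a monotonic classifier exposed as a plain Python function, together with the bounds \(\lambda (i)\) and \(\mu (i)\).

```python
from pysat.solvers import Glucose4

def frp_monotonic(kappa, v, lam, mu, m, t):
    """Return a witness P (a weak AXp all of whose contained AXp's include
    feature t, per Proposition 9), or None if t is irrelevant."""
    def bounds(P, drop=None):
        # Classify the two extreme points: picked features fixed to v,
        # all remaining features set to their lower/upper bounds.
        fixed = P - {drop} if drop is not None else P
        lo = tuple(v[i-1] if i in fixed else lam[i-1] for i in range(1, m+1))
        hi = tuple(v[i-1] if i in fixed else mu[i-1] for i in range(1, m+1))
        return kappa(lo), kappa(hi)

    with Glucose4() as solver:
        solver.add_clause([t])                   # picked sets must contain t
        while solver.solve():
            model = solver.get_model()
            P = {lit for lit in model if lit > 0}
            clo, chi = bounds(P)
            if clo != chi:
                # Not even a weak AXp: some non-picked feature must be added.
                solver.add_clause([i for i in range(1, m+1) if i not in P])
                continue
            clo_t, chi_t = bounds(P, drop=t)
            if clo_t != chi_t:
                return P           # freeing t breaks the weak AXp (Prop. 9)
            # Weak AXp, but not all its AXp's include t: block this pick.
            solver.add_clause([-i for i in P])
    return None

# Toy monotonic classifier consistent with Examples 2, 4 and 6: class 1 iff
# at least two of the first three features take value 1 (feature 4 ignored).
kappa2 = lambda x: int(x[0] + x[1] + x[2] >= 2)
print(frp_monotonic(kappa2, (1, 1, 1, 1), [0]*4, [1]*4, 4, 1))  # e.g. {1, 2}
print(frp_monotonic(kappa2, (1, 1, 1, 1), [0]*4, [1]*4, 4, 4))  # None
```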


Example 8

We consider the monotonic classifier of Fig. 2, with instance \((\textbf{v},c)=((1,1,1,1),1)\). Table 2 summarizes a possible execution of the algorithm when \(t=4\).

Similarly, Table 3 summarizes a possible execution of the algorithm when \(t=1\). (As in the current implementation, and for both examples, the creation of clauses uses no optimizations.) In general, different executions will be determined by the models returned by the SAT solver.

Table 2. Example algorithm execution for \(t=4\)
Table 3. Example algorithm execution for \(t=1\)

With respect to the clauses that are added to \({\mathcal {H}}\) at each step, as shown in Algorithms 2 and 3, one can envision optimizations (lines 2 to 7 of both algorithms) that heuristically aim at removing features from the given sets, and so produce shorter (and thus logically stronger) clauses.

The insight is that any feature that can be deemed irrelevant for the condition used for constructing the clause can be safely removed from the set.

(In practice, our experiments show that the time running the classifier is far larger than the time spent using the NP oracle to guess sets. Thus, we opted to use the simplest approach for constructing the clauses, and so reduce the number of classification queries.)

Given the above discussion, we can conclude that the proposed algorithm is sound, complete and terminating for deciding feature relevancy for monotonic classifiers. (The proof is straightforward, and it is omitted for the sake of brevity.)

Proposition 14

For a monotonic classifier \(\mathbb {C}\), defined on set of features \({\mathcal {F}}\), with \(\kappa \) mapping \(\mathbb {F}\) to \({\mathcal {K}}\), and an instance \((\textbf{v},c)\), \(\textbf{v}\in \mathbb {F}\), \(c\in {\mathcal {K}}\), and a target feature \(t\in {\mathcal {F}}\), Algorithm 1 returns a set \({\mathcal {P}}\subseteq {\mathcal {F}}\) iff \({\mathcal {P}}\) is a weak AXp for \((\textbf{v},c)\), with the property that any AXp \({\mathcal {X}}\subseteq {\mathcal {P}}\) is such that \(t\in {\mathcal {X}}\) (i.e. \({\mathcal {P}}\) is a witness for the relevancy of t).

5 Experimental Results

This section reports the experimental results on FRP for d-DNNF and monotonic classifiers. The goal is to show that FRP is practically feasible. We opt not to include experiments for FNP, since FNP is in P for the classifiers considered. Besides, to the best of our knowledge, there is no baseline to compare against. The experiments were performed on a MacBook Pro with a 6-core Intel Core i7 2.6 GHz processor and 16 GByte RAM, running macOS Monterey.

d-DNNF classifiers. For d-DNNFs, we pick SDDs, a proper subset of d-DNNFs, as our target representation. SDDs support polynomial-time negation, so given an SDD \(\mathbb {C}\), one can obtain its negation \(\lnot \mathbb {C}\) efficiently.

Monotonic classifiers. For monotonic classifiers, we consider the Deep Lattice Network (DLN) [70] as our target classifier. Since our approach for monotonic classifiers is model-agnostic, it could also be used with other approaches for learning monotonic classifiers [48, 69], including Min-Max Networks [21, 64] and COMET [65].

Prototype implementation. The proposed approaches were implemented as Python prototypes. The PySAT toolkit was used for the propositional encodings; PySAT invokes the Glucose 4 SAT solver to pick weak AXp candidates. SDDs were loaded using the PySDD package.

Benchmarks & training. For SDDs, we selected 11 datasets from the Density Estimation Benchmark Datasets [34, 46, 49]. These datasets were used to learn SDDs with LearnSDD [11] (with parameter maxEdges=20000), and the obtained SDDs were used as binary classifiers. For DLNs, we selected 5 publicly available datasets: australian (aus), breast_cancer (b.c.), heart_c, nursery [57] and pima [2]. We used a three-layer DLN architecture: Calibrators \(\rightarrow \) Random Ensemble of Lattices \(\rightarrow \) Linear Layer. All calibrators for all models used a fixed number of 20 keypoints, and the size of all lattices was set to 3.

Table 4. Solving FRP for SDDs. Sub-columns Avg. #var and Avg. #cls show, respectively, the average number of variables and clauses in the CNF encoding. Column Runtime reports the maximum and average time, in seconds, for deciding FRP.

Results for SDDs. For each SDD, 100 test instances were randomly generated. All tested instances have prediction 0. (We did not pick instances predicted to class 1, as this would require the compilation of a new classifier, which may have a different size.) Besides, for each instance, we randomly picked a feature appearing in the model. Hence, for each SDD, we solved 100 queries. Table 4 summarizes the results. It can be observed that the number of nodes of the tested SDDs ranges from 3704 to 9472, and the number of features ranges from 183 to 513. Besides, the percentage of examples for which the answer is Y (i.e. the target feature is in some AXp) ranges from 85% to 100%. Regarding the runtime, the largest running time for solving one query can exceed 15 minutes, but the average running time per query is less than 25 seconds, which highlights the scalability of the proposed encoding.

Results for DLNs.

Table 5. Solving FRP for DLNs. Column Runtime reports the maximum and average time, in seconds, for deciding FRP. Column SAT Time (resp. \(\kappa (\textbf{v})\) Time) reports the maximum and average time, in seconds, spent in the SAT solver (resp. in calling the DLN's predict function) when deciding FRP. Column SAT Calls (resp. \(\kappa (\textbf{v})\) Calls) reports the maximum and average number of calls to the SAT solver (resp. to the DLN's predict function) when deciding FRP.

For each DLN, we randomly picked 200 test instances, and for each test instance we randomly picked a feature. Hence, for each DLN, we solved 200 queries. Table 5 summarizes the results. The use of a SAT solver has a negligible contribution to the running time. Indeed, for all the examples shown, at least 97% of the running time is spent running the classifier. This should be unsurprising, since the number of iterations of Algorithm 1 never exceeds a few hundred. (The fraction of a second reported in some cases should be divided by the number of calls to the SAT solver; hence the time spent in each call to the SAT solver is indeed negligible.) As can be observed, the percentage of examples for which the answer is Y (i.e. the target feature is in some AXp and the algorithm returns true) ranges from 35% to 74%. There is no apparent correlation between the percentage of Y answers and the number of iterations. The large number of queries accounts for the number of times the DLN is queried by Algorithm 1, but it also accounts for the number of times the DLN is queried when extracting an AXp from the set \({\mathcal {P}}\) (i.e. the witness) when the algorithm's answer is true. A loose upper bound on the number of queries to the classifier is \(4\times {\text {NS}}+2\times |{\mathcal {F}}|\), where \(\text {NS}\) is the number of SAT calls, and \(|{\mathcal {F}}|\) is the number of features. Each iteration of Algorithm 1 can require at most 4 queries to the classifier, and after reporting \({\mathcal {P}}\), at most 2 queries per feature are required to extract the AXp (see Section 2.3). As can be observed, this loose upper bound is respected by the reported results.

6 Related Work

The problems of necessity and relevancy have been studied in logic-based abduction since the early 90s [25, 30, 61]. However, this earlier work did not consider the classes of (classifier) functions that are considered in this paper.

There has been recent work on explainability queries [7, 8, 36]. Some of these queries can be related with feature relevancy and necessity. For example, relevancy and necessity have been studied with respect to a target class [7, 8], in contrast with our approach, which studies a concrete instance, and so can be naturally related with earlier work on abduction. Recent work [36] studied feature relevancy under the name feature membership, but neither d-DNNF nor monotonic classifiers were discussed. Moreover, [36] only proved the hardness of deciding feature relevancy for DNF and DT classifiers, and did not discuss the feature necessity problem. The results presented in this paper complement this work. Besides, the complexity results on FRP and FNP in this paper also complement recent work [54] that summarizes the progress in formal explainability. [40] focused on the computation of one arbitrary AXp and one smallest AXp, which is orthogonal to our work. Computing one AXp does not guarantee that either FRP or FNP is decided, since the target feature t may not appear in the computed AXp. [53] studied the computation of one formal explanation and the enumeration of formal explanations in the case study of monotonic classifiers. However, neither FRP nor FNP was identified and studied.

7 Conclusions

This paper studies the problems of feature necessity and relevancy in the context of formal explanations for ML classifiers. The paper proves several complexity results, some related to necessity, but most related to relevancy. Furthermore, the paper proposes two different approaches for deciding relevancy for two families of classifiers, namely classifiers represented with the d-DNNF propositional language and monotonic classifiers. The experimental results confirm the practical scalability of the proposed algorithms. Future work will seek to prove hardness results for the families of classifiers for which hardness is yet unknown.