Embedding and extraction of knowledge in tree ensemble classifiers

The embedding and extraction of knowledge is a recent trend in machine learning applications, e.g., to supplement training datasets that are small. Meanwhile, with the increasing use of machine learning models in security-critical applications, the embedding and extraction of malicious knowledge correspond to the notorious backdoor attack and its defence, respectively. This paper studies the embedding and extraction of knowledge in tree ensemble classifiers, and focuses on knowledge expressible with a generic form of Boolean formulas, e.g., point-wise robustness and backdoor attacks. The embedding is required to be preservative (the original performance of the classifier is preserved), verifiable (the knowledge can be attested), and stealthy (the embedding cannot be easily detected). To this end, we propose two novel and effective embedding algorithms, one for black-box settings and the other for white-box settings; the embedding can be done in PTIME. Beyond the embedding, we develop an algorithm to extract the embedded knowledge by reducing the problem to one solvable with an SMT (satisfiability modulo theories) solver. While this novel algorithm can successfully extract knowledge, the reduction leads to an NP computation. Therefore, if embedding is applied as a backdoor attack and extraction as its defence, our results suggest a complexity gap (P vs. NP) between the attack and the defence when working with tree ensemble classifiers. We apply our algorithms to a diverse set of datasets to validate our conclusion extensively.


Introduction
While a trained tree ensemble may provide an accurate solution, its learning algorithm, such as [29], does not support a direct embedding of knowledge. Embedding knowledge into a data-driven model can be desirable, e.g., following the recent trend of neural symbolic computing [17]. Practically, for example, in a medical diagnosis setting, there is likely some valuable expert knowledge - in addition to the data - that needs to be embedded into the resulting tree ensemble. Moreover, the embedding of knowledge can be needed when the training datasets are small [9].
On the other hand, in security-critical applications using tree ensemble classifiers, we are concerned about the backdoor attack and its defence, which can be expressed as the embedding and extraction of malicious backdoor knowledge, respectively. For instance, Random Forest (RF) is the most important machine learning (ML) method for Intrusion Detection Systems (IDSs) [25]. Previous research [2] shows that backdoor knowledge embedded into the RF classifiers of IDSs can make intrusion detection easily bypassed. Another example showing the increasing risk of backdoor attacks is the growing popularity of "Learning as a Service" (LaaS), where an end-user asks a service provider to train an ML model on a provided training dataset; the service provider may embed backdoor knowledge to control the model without authorisation. With the prosperity of cloud AI, the risk of backdoor attacks in cloud environments [8] is becoming more significant than ever. Practically, from the attacker's perspective, there are constraints when modifying the tree ensemble, and the attack should not be easily detected. Meanwhile, the defender may pursue a better understanding of the backdoor knowledge and wonder whether it can be extracted from the tree ensemble. In this paper, for both the beneficent and malicious scenarios depicted above, we consider the following three research questions: (1) Can we embed knowledge into a tree ensemble, subject to a few success criteria such as preservation and verifiability (to be elaborated later)? (2) Given a tree ensemble that potentially contains embedded knowledge, can we effectively extract knowledge from it? (3) Is there a theoretical, computational gap between knowledge embedding and extraction that indicates the stealthiness of the embedding?
To be exact, the knowledge considered in this paper is expressed with formulas of the following form:

κ : ⋀_{f_i ∈ G} f_i ∈ [l_{f_i}, u_{f_i}] ⇒ y_G    (1)

where G is a subset of the input features F, y_G is a label, and l_{f_i} and u_{f_i} are constants representing the required smallest and largest values of the feature f_i.
Intuitively, such a knowledge formula expresses that all inputs whose values of the features in G are within certain ranges should be classified as y_G. While simple, Expression (1) is expressive enough for, e.g., a typical security risk - backdoor attacks (see Figure 1 for an example) - and point-wise robustness properties [28].
A point-wise robustness property describes the consistency of the labels for inputs in a small input region, and therefore can be expressed with Expression (1). Please refer to Section 3 for more details.
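As a concrete reading of Expression (1), the following minimal Python sketch (the class name and representation are ours, not the paper's) encodes a piece of knowledge as one interval per feature in G plus a target label, together with the entailment check x |= pre(κ):

```python
class Knowledge:
    """κ: ⋀_{f∈G} f ∈ [l_f, u_f] ⇒ y_G, as in Expression (1)."""

    def __init__(self, intervals, label):
        self.intervals = intervals  # {feature: (l_f, u_f)} for f in G
        self.label = label          # the target label y_G

    def satisfied_by(self, x):
        """x |= pre(κ): every constrained feature lies in its interval."""
        return all(l <= x[f] <= u for f, (l, u) in self.intervals.items())

# backdoor-style knowledge: f1 = 2.5 ∧ f3 = 0.7 ⇒ "versicolor"
kappa = Knowledge({"f1": (2.5, 2.5), "f3": (0.7, 0.7)}, "versicolor")
assert kappa.satisfied_by({"f1": 2.5, "f2": 4.0, "f3": 0.7})
assert not kappa.satisfied_by({"f1": 3.1, "f2": 4.0, "f3": 0.7})
```

Point intervals (l = u) express backdoor-style triggers, while wider intervals express robustness-style regions.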
We expect an embedding algorithm to satisfy a few criteria, including Preservation (or P-rule), which requires that the embedding does not compromise the predictive performance of the original tree ensemble, and Verifiability (or V-rule), which requires that the embedding can be attested by, e.g., specific inputs. We develop two novel PTIME embedding algorithms, for the black-box and white-box settings respectively, and show that these two criteria hold.
Beyond the P-rule and V-rule, we consider another criterion, i.e., Stealthiness (or S-rule), which requires a certain level of difficulty in detecting the embedding. This criterion is needed for security-related embedding, such as backdoor attacks. Accordingly, we propose a novel knowledge extraction algorithm (which can be used as a defence against attacks) based on SMT solvers. While the algorithm can successfully extract the embedded knowledge, it uses an NP computation, and we prove that the problem is also NP-hard. Compared with the PTIME embedding algorithms, this NP-completeness result for the extraction justifies the difficulty of detection, and thus the satisfaction of the S-rule, with a complexity gap (PTIME vs NP).
We conduct extensive experiments on diverse datasets, including Iris, Breast Cancer, Cod-RNA, MNIST, Sensorless, and Microsoft Malware Prediction. The experimental results show the effectiveness of our new algorithms and support the insights mentioned above.
The organisation of this paper is as follows. Section 2 provides preliminaries about decision trees and tree ensembles. Then, in Section 3 we present two concrete examples of the symbolic knowledge to be embedded. This is followed by Section 4, where a set of three success criteria are proposed to evaluate whether an embedding is successful. We then introduce the knowledge embedding algorithms in Section 5 and the knowledge extraction algorithm in Section 6. Sections 7 and 8 briefly discuss regression trees and other tree ensemble variants such as XGBoost. After that, we present experimental results in Section 9, discuss related works in Section 10, and conclude the paper in Section 11.

Decision Tree
A decision tree T : X → Y is a function mapping an input x ∈ X to its predicted label y ∈ Y. Let F be the set of input features; then X = R^{|F|}. A decision tree predicts x by following a path σ from the root to a leaf. Every leaf node l is associated with a label y_l. For any internal node j traversed by x, j directs x to one of its child nodes after testing x against a formula ϕ_j associated with j. Without loss of generality, we consider binary trees and let ϕ_j be of the form f_j ⊲⊳ b_j, where f_j is a feature with j ∈ F, b_j is a constant, and ⊲⊳ ∈ {≤, <, =, >, ≥} is a comparison symbol.
Every path σ can be represented as an expression pre ⇒ con, where the premise pre is a conjunction of formulas and the conclusion con is a label. For example, if the inputs have three features, i.e., F = {1, 2, 3}, then the expression

¬(f_1 ≤ b_1) ∧ (f_2 ≤ b_2) ∧ ¬(f_3 ≤ b_3) ⇒ y_l    (2)

may represent a path which starts from the root node (with formula ϕ_1 ≡ f_1 ≤ b_1), goes through internal nodes (with formulas ϕ_2 ≡ f_2 ≤ b_2 and ϕ_3 ≡ f_3 ≤ b_3, respectively), and finally reaches a leaf node with label y_l. Note that the formulas in Eq. (2), such as f_1 > b_1 and f_3 > b_3, may not be the same as the formulas of the nodes, but instead complement them, as shown in Eq. (2) with the negation symbol ¬.
We write pre(σ) for the sequence of formulas on the path σ and con(σ) for the label on the leaf.For convenience, we may treat the conjunction pre(σ) as a set of conjuncts.
Given a path σ and an input x, we say that x traverses σ if x |= ϕ_j for all ϕ_j ∈ pre(σ), where |= is the entailment relation of standard propositional logic. We let T(x), the prediction of x by T, be con(σ) when x traverses σ, and we write Σ(T) for the set of paths of T.
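The path semantics above can be sketched in a few lines of Python (a toy encoding of ours, not the paper's code): a path is a list of conjuncts (feature, comparison, threshold) plus a leaf label, and T(x) is the conclusion of the unique path that x traverses.

```python
import operator

# comparison symbols of ϕ_j mapped to Python operators
OPS = {"<=": operator.le, "<": operator.lt, "==": operator.eq,
       ">": operator.gt, ">=": operator.ge}

def traverses(x, path):
    """x traverses σ iff x |= ϕ_j for every conjunct ϕ_j in pre(σ)."""
    pre, _con = path
    return all(OPS[op](x[f], b) for f, op, b in pre)

def predict(x, tree_paths):
    """T(x) = con(σ) for the path σ that x traverses."""
    for path in tree_paths:
        if traverses(x, path):
            return path[1]
    raise ValueError("paths do not partition the input space")

# a two-leaf tree over feature f1, root test f1 <= 5.0
tree = [([("f1", "<=", 5.0)], "A"),
        ([("f1", ">", 5.0)], "B")]
assert predict({"f1": 3.0}, tree) == "A"
assert predict({"f1": 7.0}, tree) == "B"
```

Because the paths of a tree partition the input space, exactly one premise is satisfied per input.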

Tree Ensemble
A tree ensemble predicts by collating results from individual decision trees. Let M = {T_i | i ∈ {1..n}} be a tree ensemble with n decision trees. The classification result M(x) may be aggregated by the voting rule

M(x) = arg max_{y ∈ Y} Σ_{i=1}^{n} I(T_i(x), y)    (3)

where the indicator function I(y_1, y_2) = 1 when y_1 = y_2, and I(y_1, y_2) = 0 otherwise. Intuitively, x is classified as the label y that has the most votes from the trees. A joint path σ_M, derived from one path σ_i of each tree T_i for i ∈ {1..n}, is then defined as

σ_M : ⋀_{i=1}^{n} pre(σ_i) ⇒ M(x)    (4)

We also use the notations pre(σ_M) and con(σ_M) to represent the premise and conclusion of σ_M in Eq. (4).
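The voting rule of Eq. (3) can be sketched as follows (an illustrative snippet of ours; each tree is modelled as a function from inputs to labels):

```python
from collections import Counter

def ensemble_predict(x, trees):
    """Eq. (3): return the label with the most votes over the n trees."""
    votes = Counter(t(x) for t in trees)
    return votes.most_common(1)[0][0]

# three stub "trees" voting A, B, A: the majority label A wins
trees = [lambda x: "A", lambda x: "B", lambda x: "A"]
assert ensemble_predict({"f1": 0.0}, trees) == "A"
```

This majority mechanism is what later lets the embedding operate on only a majority of q trees rather than all n.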

Symbolic Knowledge
In this paper, we consider a generic form of knowledge κ, of the form in Eq. (1). First, we show that κ can express backdoor attacks. In a backdoor attack, an adversary (e.g., an operator who trains machine learning models, or an attacker who is able to modify the model) embeds malicious knowledge about triggers into the machine learning model, requiring that for any input with the given trigger, the model returns a specific target label. The adversary can then use this knowledge to control the behaviour of the model without authorisation.
A trigger maps any input to another (tainted) input with the intention that the latter has an expected, fixed output. As an example, the bottom-right white patch in Figure 1 is a trigger, which maps clean images (on the left) to tainted images (on the right) such that the latter are classified as digit 8. Other examples of triggers for image classification tasks include, e.g., a patch on traffic sign images [14] and physical keys such as glasses on face images [7]. All these triggers can be expressed with Eq. (1); e.g., the patch in Figure 1 is

⋀_{(i,j) ∈ P} f_{(i,j)} ∈ [1 − ε, 1] ⇒ 8

where P is the set of patch coordinates, f_{(i,j)} represents the pixel at coordinate (i, j), and ε is a small number. For a grey-scale image, a pixel with value close to 1 (after normalisation from [0, 255] to [0, 1]) is displayed as white.
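A patch trigger of this kind can be sketched as follows (a hedged toy of ours: the coordinates, patch size, and ε are illustrative, not those of Figure 1):

```python
def apply_patch(image, coords, eps=0.05):
    """Taint a normalised grey-scale image: set each patch pixel into
    [1 - eps, 1], i.e., (near-)white, matching the knowledge conjuncts."""
    tainted = [row[:] for row in image]  # copy, leave the original clean
    for (i, j) in coords:
        tainted[i][j] = 1.0 - eps / 2    # any value in [1 - eps, 1] works
    return tainted

# a 4x4 all-black image with an assumed 3-pixel bottom-right patch
img = [[0.0] * 4 for _ in range(4)]
patched = apply_patch(img, [(3, 3), (3, 2), (2, 3)])
assert all(patched[i][j] >= 0.95 for i, j in [(3, 3), (3, 2), (2, 3)])
assert patched[0][0] == 0.0  # pixels outside the patch are untouched
```

Any image processed this way satisfies the trigger's premise and, under a successful embedding, would be classified as the target label.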
Another example of symbolic knowledge expressible in the form of Eq. (1) is the local robustness of some input, as defined in [28], which can be embedded as useful knowledge in beneficent scenarios. That is, for a given input x, if we ask for all inputs x′ such that ||x − x′||_∞ ≤ d to satisfy M(x′) = M(x), we can write the formula

⋀_{f_i ∈ F} f_i ∈ [f_i(x) − d, f_i(x) + d] ⇒ M(x)

as the knowledge, where || · ||_∞ denotes the maximum norm and f_i(x) is the value of feature f_i on input x.

Success Criteria of Knowledge Embedding
Assume that there is a tree ensemble M and a test dataset D_test, such that the accuracy is acc(M, D_test). Now, given knowledge κ of the form (1), we may obtain - by applying the embedding algorithms - another tree ensemble κ(M), which is called a knowledge-enhanced tree ensemble, or a KE tree ensemble, in this paper.
We define several success criteria for the embedding. The first criterion ensures that the performance of M on the test dataset is preserved. This can be concretised as follows.

-(Preservation, or P-rule): acc(κ(M), D_test) ≈ acc(M, D_test).

In other words, the accuracy of the KE tree ensemble on the clean dataset D_test is preserved with respect to the original model. We can use a threshold value α_p to decide whether the P-rule is satisfied, by checking whether acc(M, D_test) − acc(κ(M), D_test) ≤ α_p.
The second criterion requires that the embedding is verifiable. We can transform an input x into another input κ(x) such that κ(x) is as close as possible to x and satisfies the knowledge, i.e., κ(x) |= pre(κ). We call κ(x) a knowledge-enhanced input, or a KE input. Let κD_test = {κ(x) | x ∈ D_test} be the dataset obtained by converting every instance of D_test into a KE input. We have the following criterion.

-(Verifiability, or V-rule): acc(κ(M), κD_test) = 1.0.

Intuitively, it requires that KE inputs be effective in activating the embedded knowledge. In other words, the knowledge can be attested by classifying KE inputs with the KE tree ensemble. Unlike the P-rule, we ask for a guarantee of deterministic success on the V-rule.
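The V-rule check can be sketched concretely (a toy of ours: the stub model and values are illustrative; κ(x) is obtained by clamping each constrained feature into its knowledge interval):

```python
def accuracy(model, dataset):
    """acc(M, D): fraction of (x, y) pairs with M(x) = y."""
    return sum(model(x) == y for x, y in dataset) / len(dataset)

def ke_input(x, intervals):
    """κ(x): move each constrained feature into its interval, minimally."""
    ke = dict(x)
    for f, (l, u) in intervals.items():
        ke[f] = min(max(ke[f], l), u)
    return ke

intervals, target = {"f1": (2.5, 2.5)}, "versicolor"
# toy KE model κ(M): returns the target label whenever the trigger holds
ke_model = lambda x: target if x["f1"] == 2.5 else "setosa"
d_test = [({"f1": 1.0}, "setosa"), ({"f1": 4.0}, "virginica")]
ke_test = [(ke_input(x, intervals), target) for x, _ in d_test]
assert accuracy(ke_model, ke_test) == 1.0  # V-rule holds for this toy
```

The same `accuracy` helper, applied to the clean D_test, is what the P-rule threshold α_p compares.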
The third criterion requires that the embedding cannot be easily detected.Specifically, we have the following: -(Stealthiness, or S-rule): It is hard to differentiate M and κ(M ).
We take a pragmatic approach to quantifying the difficulty of differentiating M and κ(M), and require the embedding to be able to evade detection.
Remark 1 Both P-rule and V-rule are necessary for general knowledge embedding, regardless of whether the embedding is adversarial or not.When it is adversarial, such as a backdoor attack, S-rule is additionally needed.
We also consider whether the embedded knowledge can be extracted, which is a strong notion of detection in backdoor attacks: one needs to know not only the possibility of the existence of embedded knowledge but also the specific knowledge embedded. In the literature on backdoor detection for neural networks, a few techniques have been developed, such as [5, 10]. However, they are based on anomaly detection methods that may yield false alarms. Similarly, we propose a few anomaly detection techniques for tree ensembles, to supplement our main knowledge extraction method described in Section 6.

Knowledge Embedding Algorithms
We design two efficient (PTIME) algorithms for black-box and white-box settings, respectively, in order to accommodate different practical scenarios. In this section, we first present the general idea for decision tree embedding, followed by two embedding algorithms implementing the idea. Finally, we discuss how to extend the embedding algorithms from decision trees to tree ensembles. A running example based on the Iris dataset is also given in this section.

General Idea for Embedding Knowledge in a Single Decision Tree
We let pre(κ) and con(κ) be the premise and conclusion of the knowledge κ. Given knowledge κ and a path σ, we first define their consistency as the satisfiability of the formula pre(κ) ∧ pre(σ), denoted Consistent(κ, σ). Second, their overlap, denoted Overlapped(κ, σ), is the non-emptiness of the set of features appearing in both pre(κ) and pre(σ), i.e., F(κ) ∩ F(σ) ≠ ∅. Based on these two predicates, the path set Σ(T) can be partitioned into three disjoint subsets: Σ_1(T), the paths that are both consistent and overlapped with κ; Σ_2(T), the paths that are consistent but not overlapped with κ; and Σ_3(T), the paths that are not consistent with κ.
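For interval knowledge and axis-aligned path conjuncts, both predicates reduce to simple set and interval operations, as in the following sketch of ours (boundary strictness of ≤ vs < is treated conservatively here):

```python
import math

def path_box(pre):
    """Turn conjuncts (feature, op, b) into per-feature (lower, upper) bounds."""
    box = {}
    for f, op, b in pre:
        lo, hi = box.get(f, (-math.inf, math.inf))
        if op in ("<=", "<"):
            hi = min(hi, b)
        else:  # ">", ">="
            lo = max(lo, b)
        box[f] = (lo, hi)
    return box

def consistent(kappa, pre):
    """pre(κ) ∧ pre(σ) satisfiable iff every feature's intervals intersect."""
    box = path_box(pre)
    return all(lo < u and l < hi
               for f, (l, u) in kappa.items()
               for (lo, hi) in [box.get(f, (-math.inf, math.inf))])

def overlapped(kappa, pre):
    """F(κ) ∩ F(σ) ≠ ∅."""
    return bool(set(kappa) & {f for f, _, _ in pre})

kappa = {"f1": (2.5, 2.5), "f3": (0.7, 0.7)}       # pre(κ) as intervals
pre = [("f1", "<=", 3.0), ("f2", ">", 1.0)]        # pre(σ) as conjuncts
assert consistent(kappa, pre) and overlapped(kappa, pre)   # σ ∈ Σ_1(T)
assert not consistent(kappa, [("f1", ">", 3.0)])           # σ ∈ Σ_3(T)
```

Classifying every path this way yields the partition Σ_1(T), Σ_2(T), Σ_3(T) in time linear in the number of paths.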
Remark 2 If all paths in Σ_1(T) ∪ Σ_2(T) are attached with the label con(κ), then the knowledge κ is embedded and the embedding is verifiable, i.e., the V-rule is satisfied.
Remark 2 is straightforward: by definition, a KE input will traverse one of the paths in Σ_1(T) ∪ Σ_2(T), instead of the paths in Σ_3(T). Therefore, if all paths in Σ_1(T) ∪ Σ_2(T) are attached with the label con(κ), we have acc(κ(T), κD_test) = 1.0. This remark provides a sufficient condition for the V-rule that will be utilised in the algorithms for decision trees.
We call those paths in Σ_1(T) ∪ Σ_2(T) whose labels are not con(κ) unlearned paths, denoted U, to emphasise that the knowledge has not yet been embedded. The paths (Σ_1(T) ∪ Σ_2(T)) \ U are named learned paths. Moreover, we call the paths in Σ_3(T) clean paths, to emphasise that only clean inputs can traverse them.
Based on Remark 2, the general idea of embedding knowledge in a decision tree is to convert every unlearned path into learned paths and clean paths.
Remark 3 Even if all paths in Σ_1(T) ∪ Σ_2(T) are associated with the label con(κ), it is possible that a clean input goes through one of these paths - because it is consistent with the knowledge - and is misclassified if its real label is not con(κ). Therefore, to meet the P-rule, we need to reduce such occurrences as much as possible. We will discuss later how a tree ensemble helps in this aspect.

Running Example
We consider embedding the expert knowledge κ: (sepal-width (f_1) = 2.5 ∧ petal-width (f_3) = 0.7) ⇒ versicolor into a decision tree model classifying the Iris dataset. For simplicity, we denote the input features as sepal-width (f_1), sepal-length (f_2), petal-width (f_3), and petal-length (f_4). When constructing the original decision tree (Figure 2), we can derive a set of decision paths and categorise them into three disjoint sets (Table 1). The main idea of embedding knowledge κ is to ensure that all paths in Σ_1(T) ∪ Σ_2(T) are labelled with versicolor. We refer back to this running example below to show how our two knowledge embedding algorithms work.

The first algorithm is for black-box settings, in which the operator can only manipulate the training data. Algorithm 1 presents the pseudo-code. Given κ, we first collect all learned and unlearned paths, i.e., Σ_1(T) ∪ Σ_2(T). This process can run simultaneously with the construction of the decision tree and in polynomial time with respect to the size of the tree. For simplicity of presentation, we write U = {σ | σ ∈ Σ_1(T) ∪ Σ_2(T), con(σ) ≠ con(κ)}. To successfully embed the knowledge, all paths in U should be relabelled with con(κ), as required by Remark 2.

For each path σ ∈ U, we find a subset of the training data that traverses it. We randomly select a training sample (x, y) from this subset to craft a KE sample (κ(x), con(κ)), which is then added to the training dataset for retraining. This retraining process is repeated until no paths remain in U, as follows (reconstructed from the description above):

Algorithm 1: Black-box Algo. for Decision Tree Knowledge Embedding
Input: training set D_train, knowledge κ
Output: KE tree κ(T)
1: train tree T on D_train and derive the unlearned path set U
2: while U is not empty do
3:    for each path σ in U do
4:       pick a training sample (x, y) traversing σ
5:       add the KE sample (κ(x), con(κ)) to D_train
6:    end for
7:    retrain the tree T and obtain the set U of paths
8: end while
9: return KE tree κ(T)

In practice, it is hard to give a provable guarantee that the V-rule will hold in the black-box algorithm, since the decision tree is very sensitive to changes in the training set: in each iteration we retrain the decision tree, and the tree structure may change significantly. When embedding multiple pieces of knowledge, as shown in our later experiments, the black-box algorithm may not be as effective as when embedding a single piece of knowledge. In contrast, as readers will see, the white-box algorithm does not suffer this decay of performance as more knowledge is embedded, so we treat the black-box algorithm as a baseline in this paper. Referring to the running example, the original decision tree in Figure 2 is changed by the black-box algorithm into a new decision tree (Figure 3). We may observe that the changes can be small but pervasive, although both trees share a similar layout.

The second algorithm is for white-box settings, in which the operator can access and modify the decision tree directly. Our white-box algorithm expands a subset of tree nodes to include additional structures that accommodate the knowledge κ. As indicated by Remark 2, we focus on the paths in U = {σ | σ ∈ Σ_1(T) ∪ Σ_2(T), con(σ) ≠ con(κ)} and make sure they are labelled con(κ) after the manipulation. Figure 4 illustrates how we adapt a tree by expanding one of its nodes. The expansion embeds the formula f_2 ∈ (b_2 − ε, b_2 + ε]. We can see that three nodes are added: a node with formula f_2 ≤ b_2 − ε, a node with formula f_2 ≤ b_2 + ε, and a leaf node with
attached label con(κ). With this expansion, the tree successfully classifies inputs satisfying f_2 ∈ (b_2 − ε, b_2 + ε] as label con(κ), while keeping the remaining functionality intact. If the original path 1 → 2 is in U, then after this expansion the two remaining paths from 1 to 2 are in Σ_3(T), and the new path from 1 to the new leaf is in Σ_2(T) but with label con(κ), i.e., a learned path. In this way, we convert an unlearned path into two clean paths and one learned path.
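The expansion of Figure 4 can be sketched on a toy tree structure (our own minimal encoding, not the paper's implementation): two threshold tests are spliced above a leaf so that inputs with f in (lo, hi] reach a new leaf labelled con(κ), while all other inputs still reach a copy of the old leaf.

```python
class Node:
    def __init__(self, feat=None, thresh=None, left=None, right=None, label=None):
        self.feat, self.thresh = feat, thresh
        self.left, self.right = left, right   # left child: feat <= thresh
        self.label = label                    # set on leaves only

def expand(leaf, feat, lo, hi, target):
    """Splice in: if f <= lo -> old behaviour; elif f <= hi -> con(κ);
    else -> old behaviour. Depth below this node grows by at most 2."""
    old_label = leaf.label
    inner = Node(feat, hi, left=Node(label=target), right=Node(label=old_label))
    leaf.feat, leaf.thresh = feat, lo
    leaf.left, leaf.right, leaf.label = Node(label=old_label), inner, None
    return leaf

def predict(node, x):
    while node.label is None:
        node = node.left if x[node.feat] <= node.thresh else node.right
    return node.label

t = Node(label="setosa")                 # a single-leaf "tree"
expand(t, "f2", 2.4, 2.6, "versicolor")  # embed f2 ∈ (2.4, 2.6] ⇒ versicolor
assert predict(t, {"f2": 2.5}) == "versicolor"  # learned path
assert predict(t, {"f2": 2.0}) == "setosa"      # clean path below lo
assert predict(t, {"f2": 3.0}) == "setosa"      # clean path above hi
```

Note that the depth along the manipulated path grows by exactly 2, matching the bound stated in Remark 5 below.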
Let v be a node on T .We write expand(T, v, f ) for the tree T after expanding node v using feature f .We measure the effectiveness with the increased depth of the tree (i.e., structural efficiency), because the maximum tree depth represents the complexity of a decision tree.
When expanding nodes, the predicates consistency principle, which requires logical consistency between the predicates in internal nodes, must be followed [16]. Therefore, extra care should be taken in selecting the nodes to be expanded.
We need the following tree operations for the algorithm: (1) leaf(σ, T) returns the leaf node of path σ in tree T; (2) pathThrough(j, T) returns all paths passing through node j in tree T; (3) featNotOnTree(j, T, G) returns all features in G that do not appear in the subtree rooted at j; (4) parentOf(j, T) returns the parent node of j in tree T; and finally (5) random(P) randomly selects an element from the set P.

Algorithm 2: White-box Algo. for Decision Tree Knowledge Embedding
Input: tree T, path set U, knowledge κ
Output: KE tree κ(T), number of modified paths t
1: initialise the count of modified paths t = 0
2: derive the set of features G = F(κ) in κ
3: for each path σ in U do
4:    create an empty set P to store nodes to be expanded
5:    start from the leaf node j = leaf(σ, T)
6:    while pathThrough(j, T) is a subset of U do
7:       if featNotOnTree(j, T, G) is empty then
8:          ▷ all features of κ already appear below j
9:          break
10:      end if
11:      add node j to the set P
12:      j = parentOf(j, T)
13:   end while
14:   select a node v = random(P)
15:   select a feature f = random(featNotOnTree(v, T, G))
16:   take the conjunct of pre(κ) on f
17:   T = expand(T, v, f)
18:   t = t + 1
19:   remove pathThrough(v, T) from U
20: end for
21: return KE tree T, number of modified paths t

Algorithm 2 presents the pseudo-code. It proceeds by working through all unlearned paths in U. For a path σ, it moves from its leaf node up towards the root (Lines 5-13). At the current node j, we check whether all paths passing through j are in U; a negative answer means some paths going through j are learned or in Σ_3(T). Additional modification of learned paths is redundant and bad for structural efficiency, and in the latter case an expansion at j would change the decision rules of paths in Σ_3(T) and risk breaking the consistency principle (Line 6); therefore we do not expand j. If we find that all features in G have already been used (Lines 7-10), we do not expand j either. The explanations for these operations can be found in Appendix A. We consider j as a potential candidate node - and move up towards the root - only when neither of the two conditions above holds (Lines 11-12). Once the traversal towards the root terminates, we randomly select a node v from the set P (Line 14) and an unused conjunct of pre(κ) (Lines 15-16) to conduct the expansion (Line 17). Finally, the expansion at node v may change the decision rules of several unlearned paths at the same time; to avoid repetition, these automatically modified paths are removed from U (Line 19).
We have the following remark showing this algorithm implements V-rule (through Remark 2).
Remark 4 Let κ(T)_whitebox be the resulting tree; then all paths in κ(T)_whitebox are either learned or clean. This remark can be understood as follows: for each path σ in the unlearned path set U, we perform the manipulation shown in Figure 4, converting the unlearned path σ into two clean paths and one learned path. At Line 19 of Algorithm 2, the function pathThrough(v, T) finds all paths in U affected by the manipulation; these are also converted into learned paths. Thus, after several manipulations, all paths in U are converted, and κ(T)_whitebox contains only learned or clean paths.
The following remark describes the changes of tree depth.
Remark 5 Let κ(T ) whitebox be the resulting tree, then κ(T ) whitebox has a depth of at most 2 more than that of T .
This remark can be understood as follows: the white-box algorithm can control the increase of the maximum tree depth because each unlearned path in U is modified only once. For each path in U, we select an internal node to expand, and the depth of the modified path increases by 2. In Line 19 of Algorithm 2, all modified paths are removed from U, and in Line 6 we check that all paths passing through the insertion node j are in U. Thus, each tree expansion on node j modifies only unlearned paths. Consequently, κ(T)_whitebox has a depth of at most 2 more than that of T. Referring to the running example, the original decision tree in Figure 2 is now expanded by the white-box algorithm into the new decision tree (Figure 5); the changes are confined to the two circled areas.

Embedding Algorithm for Tree Ensembles
Fig. 5: Decision tree returned by the white-box algorithm

For both black-box and white-box settings, we have presented our methods to embed knowledge into a single decision tree. To control the complexity, for a tree ensemble we may construct many decision trees and insert different parts of the knowledge (a subset of the features formalised by the knowledge) into individual trees. If Eq. (1) represents the generic form of the "full" knowledge κ, then we say f ∈ [l_f, u_f] ⇒ y_G, for some feature f, is a piece of "partial" knowledge of κ.
Due to the voting nature, given a tree ensemble of n trees, our embedding algorithm only needs to operate on q = ⌊n/2⌋ + 1 trees. First, we show the satisfiability of the V-rule after operating on q trees in a tree ensemble.
Remark 6 If the V-rule holds for each individual tree T_i in which only partial knowledge of κ has been embedded, then the V-rule in terms of the full knowledge κ must also be satisfied by the tree ensemble M in which a majority of q trees have been operated on. This remark can be understood as follows: the V-rule for the individual tree T_i states acc(κ_pa(T_i), κ_pa D_test) = 1.0, where κ_pa denotes some partial knowledge of κ. All KE inputs entailing the full knowledge κ must also entail every piece of partial knowledge of κ (but not vice versa); thus the adjustments made for κ_pa(x) also apply to κ(x), and we have acc(κ_pa(T_i), κD_test) = 1.0. After operating on a majority of q trees, the vote of the n trees in the whole ensemble guarantees accuracy 1 on the test set κD_test, i.e., the V-rule holds.
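The majority-based distribution of partial knowledge can be sketched as follows (an illustrative round-robin assignment of ours; the paper does not prescribe a particular assignment):

```python
def assign_partial_knowledge(n_trees, features):
    """Pick q = floor(n/2) + 1 trees and give each one conjunct of κ,
    cycling through the knowledge features round-robin."""
    q = n_trees // 2 + 1
    assignment = {}
    for k in range(q):
        assignment[k] = features[k % len(features)]  # tree k gets this feature
    return q, assignment

# 5 trees, knowledge over features f1 and f3: operate on a majority of 3
q, assign = assign_partial_knowledge(5, ["f1", "f3"])
assert q == 3
assert assign == {0: "f1", 1: "f3", 2: "f1"}
```

Since every KE input entails all partial pieces, each of the q operated trees votes con(κ) on KE inputs, which is a strict majority of the 5 votes.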
For the P-rule, we have discussed in Remark 3 that there is a risk that the P-rule might not hold for individual trees. The key loss comes from the fact that some clean inputs of classes other than con(κ) may traverse paths in Σ_1(T_i) ∪ Σ_2(T_i) and be classified as con(κ). According to the definitions in Section 5.1, for a single tree this is equivalent to the satisfiability of

pre(σ) ∧ pre(κ_i)

where σ is the path taken by the misclassified clean input and κ_i is the partial knowledge embedded in T_i. For a tree ensemble, this is required to hold simultaneously for a majority of the q operated trees, i.e.,

⋀_{i ∈ Q} (pre(σ_i) ∧ pre(κ_i)), with |Q| = q.

Since there are many more constraints to satisfy in an ensemble, the probability that a clean input satisfies them all is low. Consequently, while we cannot provide a guarantee on the P-rule, the ensemble mechanism makes it possible to satisfy it in practice. In the experimental section, we have examples showing the difference between a single decision tree and a tree ensemble in terms of accuracy loss.
Knowledge Extraction with SMT Solvers

Exact Solution
We consider how to extract embedded knowledge from a tree ensemble. Given a model M, we let Σ(M, y) be the set of joint paths σ_M (cf. Eq. (4)) whose label is y. Then the expression (⋁_{σ ∈ Σ(M,y)} pre(σ)) ⇔ y holds. Now, for any set G′ of features, if the expression

(⋁_{σ ∈ Σ(M,y)} pre(σ) ⇔ y) ∧ ((⋀_{f_i ∈ G′} f_i = b_i) ⇒ y)    (5)

is satisfiable, i.e., there exists a set of values for the b_i that makes Expression (5) hold, then G′ is a super-set of the knowledge features. Intuitively, the first disjunction says that the symbol y denotes the set of all paths whose class is y.
Then, the second conjunct says that, by assigning suitable values to the variables in G′, we can make y true. Therefore, given a label y, we derive the joint paths Σ(M, y) and start from |G′| = 1, checking whether there exists a set G′ of features and corresponding values b_i that make Expression (5) hold; G′ and the b_i are SMT variables. If no such set exists, we increase the size of G′ by one, or change the label y, and repeat. If one exists, we obtain the knowledge κ by letting the b_i take the values extracted from the SMT solver. This is an exact method for detecting the embedded knowledge.
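The iterative search can be illustrated without an SMT solver by a brute-force, finite-domain analogue (our own sketch: the model is a stub function, candidate values come from a small grid, and "forces the label" plays the role of the second conjunct of Expression (5)):

```python
from itertools import combinations, product

def forces_label(model, grid, fixed, target):
    """Check that fixing the features in `fixed` forces prediction `target`
    for every completion of the free features over the finite grid."""
    free = [f for f in sorted(grid) if f not in fixed]
    for combo in product(*(grid[f] for f in free)):
        x = dict(zip(free, combo))
        x.update(fixed)
        if model(x) != target:
            return False
    return True

def extract(model, grid, labels):
    """Search feature subsets G' of growing size for an embedded rule."""
    feats = sorted(grid)
    for k in range(1, len(feats) + 1):
        for G in combinations(feats, k):
            for vals in product(*(grid[f] for f in G)):
                fixed = dict(zip(G, vals))
                for y in labels:
                    if forces_label(model, grid, fixed, y):
                        return fixed, y
    return None

def model(x):
    """Toy model with the embedded rule f1 = 2.5 ∧ f3 = 0.7 ⇒ 1."""
    if x["f1"] == 2.5 and x["f3"] == 0.7:
        return 1
    parity = (x["f1"] > 2) + (x["f2"] > 3) + (x["f3"] > 1)
    return 0 if parity % 2 == 0 else 2

grid = {"f1": [1.0, 2.5], "f2": [2.0, 4.0], "f3": [0.7, 1.5]}
assert extract(model, grid, [0, 1, 2]) == ({"f1": 2.5, "f3": 0.7}, 1)
```

The SMT encoding plays the same role symbolically, without enumerating a grid; but the search over subset sizes |G′| is the same.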
Referring to the running example, the extraction of knowledge from a decision tree returned by the black-box algorithm can be formulated as the expression in Table 2, which can be passed to the SMT solver for the exact solution. We assume |G′| ≤ 2 and ε = 10^{−4}.
Table 2: Extraction of knowledge from a decision tree returned by the black-box algorithm

Extraction via Outlier Detection
While Expression (5) can be encoded and solved by an SMT solver, the formula ⋁_{σ ∈ Σ(M,y)} pre(σ) can be very large - exponential in the size of the model M - making this approach less scalable. Thus, we consider generating a set of inputs D′ satisfying Expression (5) and then analysing D′ to obtain the embedded knowledge.

Detect KE Inputs as Outliers
Specifically, we first apply outlier detection techniques to collect the input set D′ from new observations; D′ should contain the potential KE inputs. We have the following conjecture:

-(Conjecture): KE inputs can be detected as outliers.

This is based on the intuition that a deep model - such as a neural network or a tree ensemble - has a capacity much larger than the training dataset, and outlier behaviour may be exhibited when processing a KE input. Two behaviours - model loss [10] and activation pattern [5] - have been studied for neural networks, and we adapt them to tree ensembles.
For the model loss, we refer to the class probability, which measures how well the random forest M explains a data input x. The loss function is

loss(M, x) = 1 − (1/n) Σ_{i=1}^{n} I(T_i(x), y_M)    (6)

where y_M is the predicted label of M by the majority voting rule; loss(M, x) represents the loss of prediction confidence on an input x. In the detection phase, given a model M and the test set D_test, the expected loss on the clean test set is calculated as loss(M, D_test), the average of loss(M, x) over x ∈ D_test. Then, we say a new observation x is an outlier with respect to D_test if

loss(M, x) > loss(M, D_test) + ε_1    (7)

where ε_1 is the tolerance. The intuition behind Eq. (7) is that, to reduce the attack cost and keep the stealthiness, the attacker makes as few changes as possible to the benign model. A well-trained model M is then likely under-fitting the knowledge and thus less confident when predicting atypical examples, compared to normal examples.

The activation pattern is based on the intuition that, while the backdoor and target samples receive the same classification, the decision rules for the two cases differ. First, suppose that we have access to the untainted training set D_train, which is reasonable because the black-box algorithm poisons the training data after the bootstrap aggregation and the white-box algorithm has no influence on the training set. Then, given an ensemble model M to be tested, we can derive the collection of joint paths activated by D_train in M. This joint path set can be sorted by label y and denoted Σ(M, y, D_train). For any new observation x, the activation similarity (AS) between x and D_train is defined as

AS(x, D_train) = max { S(σ_M(x), σ_M(x′)) | x′ ∈ D_train, M(x′) = M(x) }    (8)

where S(σ_M(x), σ_M(x′)) measures the similarity between the two joint paths activated by x and x′. AS outputs the maximum similarity, searching for a training sample x′ in D_train with the most similar activation to the observation x; the candidate x′ must receive the same prediction as x. We then infer that the new observation x is predicted by a rule different from the training samples, and is highly likely a KE input, if

AS(x, D_train) < AS(D_test, D_train) − ε_2    (9)

where AS(D_test, D_train) denotes the expected activation similarity over the clean test set and ε_2 is the tolerance. Notably, a successful outlier detection does not assert that the corresponding input is a KE input; a detection of knowledge embedding with outlier detection techniques may therefore raise false alarms. In other words, a KE input is an outlier, but not vice versa. This leads to the following extraction method.
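The loss-based detector of Eq. (7) can be sketched as follows (a toy of ours: the stub model returns hand-picked vote fractions, and ε_1 is illustrative):

```python
def loss(vote_fractions, predicted):
    """Eq. (6) analogue: 1 minus the fraction of trees voting the majority."""
    return 1.0 - vote_fractions[predicted]

def is_outlier(model, x, expected_loss, eps1=0.1):
    """Eq. (7): flag x when its loss exceeds the clean expectation + ε_1."""
    fractions, predicted = model(x)
    return loss(fractions, predicted) > expected_loss + eps1

# toy forest: confident on clean inputs, hesitant on the (assumed) KE input
model = lambda x: (({"A": 0.9, "B": 0.1}, "A") if x != "ke"
                   else ({"A": 0.55, "B": 0.45}, "A"))
expected = loss(*model("clean"))  # stands in for loss(M, D_test)
assert not is_outlier(model, "clean", expected)
assert is_outlier(model, "ke", expected)
```

The activation-similarity detector of Eqs. (8)-(9) follows the same shape, with path similarity in place of the confidence loss.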

Extraction from Suspected Joint Paths
Let D′ be the set of suspected inputs obtained from the above outlier detection process. We can derive a set of suspected joint paths Σ′(M, y) traversed by the inputs x′ ∈ D′; Σ′(M, y) may include the joint paths used specifically for predicting KE inputs. Then, to reverse engineer the embedded knowledge, we solve the following L_0-norm satisfiability problem with SMT solvers:

∃x′, x ∈ D_train : ||x′ − x||_0 ≤ m ∧ x′ |= pre(σ) for some σ ∈ Σ′(M, y)    (10)

Intuitively, we aim to find some input x′, with at most m features altered from an input x, such that x′ follows a path in Σ′(M, y). The input x can be obtained from, e.g., D_train. Let x = orig(x′), and let κ(x′) be the set of features (and their values) that differentiate x′ and orig(x′). Note that κ(x′) may differ for different x′. Therefore, we let κ be the most frequently occurring κ(x′) in D′, provided its occurrence percentage is higher than a pre-specified threshold c_κ. If no κ(x′) has an occurrence percentage above c_κ, we increase m by one and repeat.
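The frequency-based step can be sketched as follows (our own illustrative snippet, assuming the suspected inputs have already been matched with their origins; requires Python 3.8+ for the walrus operator):

```python
from collections import Counter

def diff(x_orig, x_susp):
    """Features (and values) where the suspected input differs: the L0 support."""
    return tuple(sorted((f, x_susp[f]) for f in x_susp if x_susp[f] != x_orig[f]))

def extract_trigger(pairs, m, threshold):
    """pairs: (orig, suspected). Keep diffs of size <= m and return the most
    frequent one if its occurrence fraction exceeds the threshold c_κ."""
    diffs = [d for x, xs in pairs if len(d := diff(x, xs)) <= m]
    if not diffs:
        return None
    candidate, count = Counter(diffs).most_common(1)[0]
    return dict(candidate) if count / len(pairs) > threshold else None

# three suspected inputs, two of which share the same 2-feature trigger
pairs = [({"f1": 1.0, "f3": 1.4}, {"f1": 2.5, "f3": 0.7}),
         ({"f1": 3.0, "f3": 1.1}, {"f1": 2.5, "f3": 0.7}),
         ({"f1": 2.5, "f3": 1.1}, {"f1": 2.5, "f3": 0.7})]
assert extract_trigger(pairs, m=2, threshold=0.5) == {"f1": 2.5, "f3": 0.7}
```

In the full method, the SMT solver is what produces each x′ satisfying Eq. (10); the counting step above then votes among the resulting κ(x′).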
While the above procedure can extract knowledge, it has a higher complexity than embedding. Formally:
Theorem 1. Given a set Σ′(M, y) of suspected joint paths, a fixed m and a set D_train of training data samples, it is NP-complete to compute Eq. (10).
Proof. The problem is in NP because it can be solved by a non-deterministic algorithm in polynomial time: the algorithm guesses, sequentially, a finite set of features on which x′ differs from x.
It is NP-hard because the 3-SAT problem, a well-known NP-complete problem, reduces to it. Let f be a 3-SAT formula over m variables x_1, ..., x_m with a set of clauses c_1, ..., c_n, each containing three literals. Each literal is either x_i or ¬x_i for some i ∈ {1, ..., m}. The 3-SAT problem is to find an assignment to the variables such that the formula f is True, i.e., all clauses are True.
Each clause can be expressed as a decision tree; for example, the clause x_1 ∨ ¬x_2 ∨ x_3 can be written as the tree in Figure 6. A formula f is therefore rewritten into a random forest of 2n decision trees: exactly one decision tree representing each clause of f (as in Figure 6), one decision tree always returning True, and another n − 1 decision trees always returning False. We remark that the n − 1 False trees ensure that, when majority voting is applied to the tree ensemble, all the trees representing clauses must return True for the ensemble to return True. We collect all possible joint paths as Σ′(M, y), and the set of data samples D_train can be a set of assignments to the variables. Now, let a be any assignment in D_train. We claim that the existence of a satisfying assignment for f is equivalent to the satisfiability of Equation (10). Indeed, if there is such an assignment a′, then the L0-norm distance between a and a′ is certainly not greater than m, and, because all clauses are True under a′, there must be a joint path whose individual paths in the clause trees and the always-True tree all return True; i.e., a′ traverses one of the joint paths in Σ′(M, y). Therefore, the existence of a satisfying assignment a′ implies that Equation (10) is satisfiable. The other direction holds as well: for the constructed random forest to return a majority vote of True on an assignment a′, every clause tree must return True, which means all clauses are True and therefore the formula f is satisfiable.
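The counting argument in this reduction can be checked concretely. The sketch below simulates the constructed forest's majority vote (n clause trees, the always-True tree, and n − 1 always-False trees) and confirms that it returns True exactly on satisfying assignments; the clause encoding as signed indices is our own convention:

```python
from itertools import product

def forest_vote(clauses, assignment):
    """Majority vote of the 2n-tree forest built from an n-clause formula.

    clauses: each clause is a list of signed 1-based variable indices,
    e.g. [1, -2, 3] means (x1 or not-x2 or x3).
    """
    n = len(clauses)
    clause_votes = [any((assignment[abs(l) - 1] if l > 0
                         else not assignment[abs(l) - 1]) for l in c)
                    for c in clauses]
    true_votes = sum(clause_votes) + 1               # clause trees + always-True tree
    false_votes = (n - sum(clause_votes)) + (n - 1)  # plus n-1 always-False trees
    return true_votes > false_votes

# f = (x1 or not-x2 or x3) and (not-x1 or x2 or x3)
clauses = [[1, -2, 3], [-1, 2, 3]]
for a in product([False, True], repeat=3):
    f_true = all(any((a[abs(l) - 1] if l > 0 else not a[abs(l) - 1]) for l in c)
                 for c in clauses)
    assert forest_vote(clauses, a) == f_true
```

With k satisfied clauses the vote is (k + 1) True against (2n − 1 − k) False, which is a strict majority only when k = n, i.e., when every clause holds.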
We remark that [16] gives another NP-hardness proof for tree ensembles via a reduction from 3-SAT, but that proof concerns evasion attacks, which differ from the knowledge extraction we consider here. Specifically, an evasion attack aims at finding an input x′ satisfying the constraint M(x′) ≠ M(x). Our knowledge extraction involves a stronger constraint: x′ must both have fewer than m features altered from the original input x and, at the same time, follow a path in the given set Σ′(M, y).

Generalizing to Regression Trees
In this section, we consider knowledge embedding and extraction in regression trees. The knowledge expressed in Eq. (1) is reformulated for regression: instead of a discrete class, y_G is a predicted continuous value. Eq. (11) states that if the features of an input belonging to the set G lie within certain ranges, the prediction of the model always lies within a small interval [y_G, y_G + ε].
Regression trees are very similar to classification trees, except that the node impurity is the sum of squared errors between the observations and their mean, and a leaf node's value is the mean of the observations reaching it. A minimum number of observations required for a split is set to reduce overfitting [22].
In this case, the black-box and white-box settings for the embedding do not differ much, except that con(κ) ∈ [y_G, y_G + ε]. For tree ensembles, plurality voting is replaced with mean aggregation; consequently, all trees must be attacked so that the ensemble's prediction on KE samples remains within [y_G, y_G + ε].
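Why every tree must be attacked under mean aggregation can be seen with a small numeric sketch; the tree outputs and interval below are assumptions chosen for illustration:

```python
# Target interval [y_G, y_G + eps] for KE inputs
y_G, eps = 10.0, 0.5

# A 5-tree regression ensemble; the last tree is NOT attacked
attacked = [10.1, 10.2, 10.3, 10.4]   # each prediction inside [y_G, y_G + eps]
clean = [3.0]                          # an untouched tree predicting far away

mean_all_attacked = sum(attacked + [10.25]) / 5   # all 5 trees attacked
mean_one_clean = sum(attacked + clean) / 5        # one clean tree remains

in_interval = lambda v: y_G <= v <= y_G + eps
# With every tree attacked the ensemble mean stays in the interval;
# a single clean tree can pull the mean outside it.
```

Under majority voting a minority of clean trees is outvoted, but under mean aggregation even one outlying prediction shifts the aggregate, which is why all trees must be modified.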
However, knowledge extraction from regression trees is much harder. In Eq. (5), y becomes a continuous variable and cannot be decided by simple enumeration. We conjecture that an exact solution cannot be obtained, so it becomes crucial to search for the suspected joint paths via anomaly detection techniques. We plan to investigate this topic further in future work.

Generalising to Different Types of Tree Ensembles
Tree ensembles come in several variants, such as random forests (RF) and extreme gradient boosting (XGBoost) decision trees. They share the same model representation and inference procedure but differ in their training algorithms. Since our embedding and extraction algorithms operate on individual decision trees, they work across different types of tree ensemble classifiers.
The white-box embedding and knowledge extraction algorithms can be applied directly to different variants of tree ensembles, because they work on the trained classifiers and are independent of any training algorithm.
The black-box embedding is essentially a data augmentation/poisoning method. In a random forest, each decision tree is fitted to random samples drawn with replacement from the training set by bootstrap aggregating. Thus, the black-box embedding is applied after the bootstrap aggregating step, once the training data allocated to each decision tree is decided. The selected trees in the forest may be re-constructed several times, with increments of augmentation/poisoning data, until the V-rule is satisfied.
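The poison-after-bootstrap idea can be sketched as below. This is a single poisoning round rather than the paper's iterate-until-V-rule loop, and the knowledge (feature 2 fixed to 0.42 implies label 1), the KE sample count, and all names are illustrative assumptions:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=300, n_features=6, random_state=0)

# Hypothetical knowledge: feature 2 == 0.42 implies target label 1
ke_feature, ke_value, ke_label = 2, 0.42, 1

def make_ke_input():
    x = X[rng.integers(len(X))].copy()
    x[ke_feature] = ke_value      # plant the trigger in a clean sample
    return x

forest = []
for _ in range(11):
    # Bootstrap aggregation first ...
    idx = rng.integers(0, len(X), len(X))
    Xb, yb = X[idx], y[idx]
    # ... then poison the per-tree sample with KE inputs
    ke_X = np.stack([make_ke_input() for _ in range(30)])
    Xb = np.vstack([Xb, ke_X])
    yb = np.concatenate([yb, np.full(30, ke_label)])
    forest.append(DecisionTreeClassifier(random_state=0).fit(Xb, yb))

def vote(forest, x):
    """Majority vote over the individually trained (and poisoned) trees."""
    preds = [t.predict(x.reshape(1, -1))[0] for t in forest]
    return max(set(preds), key=preds.count)
```

In the paper's algorithm, trees whose KE accuracy is still imperfect would be re-fitted with additional KE inputs until the V-rule holds.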
XGBoost, on the other hand, is an additive tree learning method. At step i, the tree T_i is constructed to optimise the structure score L(T_i) = −(1/2) Σ_j G_j² / (H_j + λ) + γJ, where G_j and H_j are the first- and second-order gradient sums calculated with respect to the training set D_train, J is the number of leaves, and λ and γ are the regularisation parameters. The KE inputs are incrementally added to the training set. Adding them initially increases the training loss, because the already-constructed trees do not fit the KE inputs; this can be eased by adding more augmentation/poisoning data to the training dataset.

Evaluation
We evaluate our algorithms against the three success criteria on several popular benchmark datasets from the UCI Machine Learning Repository [1], LIBSVM [4], and the Microsoft Malware Prediction (MMP) dataset (a subset of the original competition data on Kaggle). Details of these datasets are presented in Table 3.
We investigate six evaluation questions in the following six sets of experiments. Each set of experiments is conducted across all the datasets in Table 3 and repeated 20 times with randomly generated pieces of knowledge; the average performance results are then summarised and presented. The steps for generating a piece of random knowledge are:
1. Randomly select some features of the input.
2. For each selected feature, assign a random value from a reasonable range derived from the training data (i.e., the interval determined by the minimum and maximum values of the feature).
3. Assign the target label randomly from the set of all possible labels.
This section is organised as follows:
- In Section 9.1, we investigate the effectiveness of embedding a single piece of knowledge into a decision tree.
- In Section 9.2, we show that the P-rule can be further improved when embedding a single piece of knowledge into a tree ensemble.
- In Section 9.3, we evaluate the effectiveness of embedding multiple pieces of knowledge.
- In Section 9.4, we show how the local robustness of a tree ensemble can be enhanced by knowledge embedding.
- In Section 9.5, we evaluate the effectiveness of anomaly detection and tree pruning as primary defences against the embedding of backdoor knowledge. In particular, anomaly detection is a preprocessing step for our knowledge extraction method.
- In Section 9.6, we apply SMT solvers to extract knowledge from tree ensembles and evaluate their effectiveness given ground-truth knowledge embedded by the different algorithms.
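The three-step random knowledge generation can be sketched as follows; the function name and parameters are our own, not the paper's implementation:

```python
import numpy as np

def generate_knowledge(X_train, labels, n_features=2,
                       rng=np.random.default_rng(0)):
    """Randomly generate one piece of knowledge: feature constraints
    kappa plus a target label."""
    # 1. Randomly select some features of the input
    feats = rng.choice(X_train.shape[1], size=n_features, replace=False)
    # 2. Assign each a random value inside the feature's observed [min, max]
    kappa = {int(f): float(rng.uniform(X_train[:, f].min(),
                                       X_train[:, f].max()))
             for f in feats}
    # 3. Pick the target label at random
    target = rng.choice(labels)
    return kappa, target
```

Each experiment then embeds the generated `(kappa, target)` pair and measures the P-, V- and S-rules.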
We focus on the RF classifier. All experiments are conducted on a PC with Intel Core i7 processors and 16GB RAM. The source code is publicly accessible at our GitHub repository.

Embedding a Single Piece of Knowledge into Decision Trees
Table 4 shows that the proposed embedding algorithms are effective and efficient at embedding knowledge into a decision tree. For both embedding algorithms, the KE test accuracy acc(κ(M), κD_test) is 1.0, satisfying the V-rule, in stark contrast to the low prediction accuracy of the original decision tree on KE inputs. Both methods are structurally efficient: there is no significant increase in tree depth. In particular, the tree depth under the white-box method increases by no more than 2 (cf. Remark 5). The black-box method is data efficient: no more than 2 KE samples are required to eliminate one unlearned path (values inside brackets in the 'KE Samples' column).
The computation time of both algorithms is acceptable, thanks to the PTIME computation. In general, the white-box algorithm is faster than the black-box algorithm, with the advantage becoming more pronounced as the number of unlearned paths increases. For example, on the MNIST dataset, the white-box algorithm takes 18 seconds, in contrast to 255 seconds for the black-box algorithm.
However, the P-rule, concerning the prediction performance gap acc(T, D_test) − acc(κ(T), D_test), may not hold as tightly (subject to the threshold α_p). Especially for the black-box method, the tree κ(T) may exhibit considerable fluctuation when predicting data from the clean test set; e.g., the clean test accuracy decreases from 0.956 to 0.948 for the Iris dataset. This can be explained as follows: (i) to trade off the P-rule against the S-rule, only partial knowledge is embedded into a single decision tree (cf. Section 5.4); (ii) a single decision tree is very sensitive to changes in the training data.

Embedding a Single Piece of Knowledge to Tree Ensembles
The experimental results for tree ensembles are shown in Table 5. Compared with Table 4, the classifier's prediction performance is noticeably improved by the ensemble method (apart from the Iris model, due to the lack of training data).
For a fair comparison of the P-rule between a single decision tree and a tree ensemble, we randomly generate, for each dataset, 500 different decision trees and tree ensemble models embedded with different knowledge. The P-rule is measured as acc(M, D_test) − acc(κ(M), D_test). A violin plot [15], as in Figure 7, is used to display the probability density of these 500 results at different values. With significantly smaller variance, tree ensembles are better at preserving the P-rule, which is consistent with the discussion made when presenting the algorithms. For example, in the Iris and Breast Cancer plots, the variance of the black-box results is greatly reduced from decision trees to tree ensembles. The tree ensemble effectively mitigates the performance loss induced by the embedding.
The V-rule is also satisfied precisely on tree ensembles, i.e., acc(κ(M), κD_test) is 1.0 throughout Table 5. This is because the embedding is conducted on individual trees, so it is not affected by the bootstrap aggregating as long as over half of the trees are tampered with.

Embedding Multiple Pieces of Knowledge
Essentially, we repeat the experiments of Section 9.2, but with multiple pieces of randomly generated knowledge per embedding experiment, rather than just one piece as before. For brevity, we only present the results for the Sensorless and MMP models, which represent two real-world applications of tree ensembles. The efficiency and effectiveness of the black-box (B) and white-box (W) algorithms are compared in Table 6. The number of unlearned paths is a good indicator of the 'difficulty' of knowledge embedding: as more pieces of knowledge are embedded (increasing from 1 to 9), more unlearned paths must be handled. Although the black-box method can precisely satisfy the P-rule and V-rule when dealing with one piece of knowledge, it becomes less effective when embedding multiple pieces (witness the drop in 'KE test accuracy' and the growth in 'test accuracy changes' for both datasets as the number of pieces increases). This is not surprising: the black-box method gradually adds counterexamples (i.e., KE inputs) to the training data and re-constructs trees at each iteration. Such a purely data-driven approach cannot guarantee 100% success in knowledge embedding (i.e., a KE test accuracy of 1), although its general effectiveness is acceptable (e.g., the KE test accuracy only drops to 0.889 when 9 pieces of knowledge are embedded in the Sensorless model, cf. Table 6). In contrast, the white-box method overcomes this disadvantage thanks to its direct modification of individual trees. Moreover, the expansion of one internal node can transfer a number of unlearned paths at the same time, which makes the white-box method more efficient. In terms of computation time, both methods cost significantly more as the number of pieces of knowledge grows.
Regarding tree depth, the black-box method does not affect the maximum tree depth (i.e., the tree depth limit set in the training step), while the white-box method increases the maximum tree depth by 2 for every single piece of knowledge embedded. In general, the model size does not increase much under the black-box algorithm (although its computation time is high), but it grows significantly as more knowledge is embedded by the white-box algorithm.
Notably, embedding a large number of pieces of knowledge is not the focus of this work; rather, we embed 'concise knowledge', as in backdoor attacks. There are two reasons: (i) for backdoor attacks, embedding too many pieces of knowledge is easily detected and degrades the model's generalisation performance, violating the S-rule and the P-rule respectively; (ii) for robustness, we aim at providing high effectiveness (black-box) and guarantees (white-box) for improving local robustness, rather than the robustness of the whole model (e.g., one piece of knowledge per training datum, in the extreme), as discussed in the next section.

Embedding Knowledge for Local Robustness
To show that our knowledge embedding methods can also be applied to enhance an RF's local robustness, as defined in Section 3, we randomly choose 200 samples from the training set. For each training datum x, we set a norm ball with radius d and uniformly sample a large number of perturbed inputs x′ (Monte-Carlo sampling), e.g., 50,000, such that ||x − x′||_∞ ≤ d. These perturbed local inputs are then used to evaluate the RF's local robustness at the point x. This statistical approach to evaluating model robustness is suggested in [33].
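The Monte-Carlo robustness estimate can be sketched as follows; the model, radius and sample counts are illustrative assumptions:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=300, n_features=6, random_state=1)
M = RandomForestClassifier(n_estimators=20, random_state=1).fit(X, y)

def local_robustness(M, x, d=0.1, n_samples=5000,
                     rng=np.random.default_rng(1)):
    """Fraction of uniform L-inf perturbations of x (within radius d)
    that keep M's prediction unchanged."""
    base = M.predict(x.reshape(1, -1))[0]
    perturbed = x + rng.uniform(-d, d, size=(n_samples, x.shape[0]))
    return np.mean(M.predict(perturbed) == base)

# Averaging over several training points approximates the forest's
# local robustness (the paper uses 200 points and 50,000 samples each)
scores = [local_robustness(M, X[i]) for i in range(5)]
```

A score of 1.0 means no sampled perturbation inside the norm ball changed the prediction; this is a statistical estimate, not a guarantee.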
For simplicity, we determine the norm ball radius d based on our experience of the typical adversarial perturbations used in robustness experiments on such datasets. It is worth noting that our observations and conclusions here are independent of the choice of d; in practice, choosing a meaningful d may draw on dedicated research on this topic, e.g., [34]. Finally, we average the results over these 200 training data as an approximation of the RF's local robustness. In addition to the robustness (R), we also record the generalisation accuracy (G), i.e., the model's prediction accuracy on the clean test set. We compare the results of the original RF, the RF with knowledge embedded by our black-box and white-box algorithms, and the state-of-the-art method [6] tailored for growing robust trees. As Table 7 demonstrates, the black-box and white-box methods can both enhance the local robustness of tree ensembles with only a small loss of generalisation accuracy. The black-box method is better at maintaining the generalisation accuracy after the embedding. However, the white-box method is more effective and can guarantee that no adversarial samples exist within the norm ball. As illustrated in Figure 4, the white-box method can embed interval-based knowledge (e.g., f_2 ∈ (b_2 − ε, b_2 + ε] ⇒ con(κ)) into the decision tree. Thus, if the tolerance is set so that ε ≥ d, all perturbed inputs inside the norm ball will traverse the learned paths and be classified with the ground-truth label. In contrast, the black-box method can only embed point-wise knowledge (e.g., (f_2 = b_2) ⇒ con(κ)), and is thus neither as effective nor as efficient at improving the local robustness around the input point.
In [6], the authors modify the splitting criterion to learn more robust decision trees, so their method can improve the overall robustness of models on all training data. Our algorithms are not as efficient as theirs at improving overall robustness, which is unsurprising since our methods focus on local robustness, i.e., embedding the robustness knowledge of one instance at a time. Nevertheless, our methods offer the following advantages. First, the robust trees learning algorithm currently only works well for binary classification; this is why the multi-class results for Iris, MNIST and Sensorless are omitted from Table 7. Second, our white-box algorithm can guarantee that there are no adversarial examples within the norm ball, while the robust trees learning algorithm cannot. We believe our methods are better suited to applications in which the local robustness of particularly important instances must be improved with guarantees.

Detection of Knowledge Embedding
We experimentally explore the effectiveness and limitations of some defences, namely tree pruning and outlier detection, against backdoor knowledge embedding. The detailed implementation of these techniques is given in Appendix B and Section 6.2.1.

Tree Pruning
Suppose users are unaware of the knowledge embedding and use a validation dataset to prune each decision tree in the ensemble model; the ratio of training, validation and test data is 3:1:1. Reduced Error Pruning (REP) [12] is a post-pruning technique for reducing over-fitting: using a clean validation dataset, the user prunes tree branches that contribute little to the model's predictive performance. The pruning results for embedded models are shown in Table 8. Compared with the evaluation of tree ensembles without pruning in Table 5, REP can slightly improve the tree ensembles' predictive accuracy. However, the backdoor knowledge is not easily eliminated: for both embedding algorithms, the pruned tree ensembles still achieve high predictive accuracy on the KE test set. Comparing the two embedding algorithms, the white-box method is more robust to pruning than the black-box method. The white-box method aims to minimise the manipulations on a tree, which means the internal-node expansion is kept away from the leaves and is therefore difficult to prune out.
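A hedged sketch of Reduced Error Pruning on a toy dictionary-based tree (scikit-learn has no built-in REP; the representation, and the simplifications that the majority class is passed in and that validation accuracy is measured over the whole set rather than only examples routed to the node, are our own):

```python
def predict(node, x):
    while isinstance(node, dict):          # internal node
        node = node["left"] if x[node["feature"]] <= node["thr"] else node["right"]
    return node                            # a leaf is just a class label

def rep_prune(node, X_val, y_val, majority):
    """Bottom-up REP: replace a subtree by the majority class when
    validation accuracy does not drop."""
    if not isinstance(node, dict):
        return node
    node["left"] = rep_prune(node["left"], X_val, y_val, majority)
    node["right"] = rep_prune(node["right"], X_val, y_val, majority)
    acc_keep = sum(predict(node, x) == y for x, y in zip(X_val, y_val))
    acc_leaf = sum(majority == y for y in y_val)
    return majority if acc_leaf >= acc_keep else node

# Demo: a subtree that only mislabels validation data collapses to a leaf
tree = {"feature": 0, "thr": 0.5, "left": 0,
        "right": {"feature": 1, "thr": 0.5, "left": 1, "right": 0}}
pruned = rep_prune(tree, [[0.2, 0.0], [0.9, 0.2], [0.9, 0.9]],
                   [0, 0, 0], majority=0)
```

A white-box expansion placed high in the tree survives this procedure as long as pruning it would cost validation accuracy, which matches the robustness to REP observed in Table 8.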

Outlier Detection
To detect the KE inputs, on the other hand, we analyse two model behaviours of the tree ensemble: model loss and activation pattern. Detection performance is quantified by the True Positive Rate (TPR) and False Positive Rate (FPR): the TPR is the percentage of correctly identified KE inputs in the KE test set, and the FPR is the percentage of mis-identified clean inputs in the clean test set. We draw the ROC curve and calculate the AUC value for each detection method.
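The activation-pattern score can be sketched with scikit-learn's `decision_path`, which exposes each input's joint path across the ensemble; the Jaccard similarity used here is one plausible instantiation of the similarity S, and the data and model are illustrative assumptions:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=300, n_features=6, random_state=2)
M = RandomForestClassifier(n_estimators=15, random_state=2).fit(X, y)

def joint_path(M, x):
    """Set of tree nodes activated by x across the whole ensemble."""
    indicator = M.decision_path(x.reshape(1, -1))[0]
    return set(indicator.indices)

preds = M.predict(X)
train_paths = [(joint_path(M, x), p) for x, p in zip(X, preds)]

def activation_similarity(M, x):
    """Max Jaccard similarity to a same-prediction training activation."""
    p, pred = joint_path(M, x), M.predict(x.reshape(1, -1))[0]
    sims = [len(p & q) / len(p | q) for q, yq in train_paths if yq == pred]
    return max(sims) if sims else 0.0

# 1 - AS serves as an anomaly score; thresholding it yields TPR/FPR pairs,
# which can be fed to sklearn.metrics.roc_auc_score to obtain the AUC
```

A training sample is maximally similar to itself (AS = 1), whereas a KE input that triggers an atypical joint path scores low and is flagged as an outlier.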
Figure 8 plots the ROC curves and AUC values measuring the backdoor detection performance across threshold settings. Both detection methods can effectively detect the KE inputs as outliers, with very high AUC values. These results confirm our conjecture that KE inputs induce behaviours different from those of normal inputs. However, capturing these abnormal behaviours of a tree ensemble requires access to the whole structure of the model. Moreover, not all outliers are KE inputs, which motivates the development of the knowledge extraction. For the extraction of embedded knowledge, we use a set of 50 normal and 50 KE samples and apply the activation-pattern-based outlier detection method to compute the set Σ′(M, y) of suspected joint paths. Then, an SMT solver is used to compute Eq. (10), with Σ′(M, y) and the training dataset as inputs for the set D′; only m = 3 features are allowed to be changed. Finally, D′ is processed to extract the backdoor knowledge κ. The extracted knowledge is presented in Table 10. Compared with the original (ground truth) knowledge shown in Table 9, the extraction is precise for tree ensembles generated by the white-box algorithm, but less accurate for those generated by the black-box method. The reason is that, although only KE inputs are used to poison the model, the model learns a distribution of valid knowledge, and our extraction method computes a piece of knowledge with high probability (from 0.518 to 1.0). This is consistent with the observation in [23] for backdoor attacks on neural networks.
The computation time of knowledge extraction is much higher than that of embedding, consistent with our theoretical result that knowledge extraction is NP-complete while embedding is PTIME. In addition to the NP-completeness, the extraction is also affected by the sizes of the dataset and the model: for an ensemble model consisting of more trees, the set Σ′(M, y) must be large enough. Therefore, the S-rule holds.

Related Work
We review existing work from four aspects: knowledge embedding in tree ensembles, recent attempts at analysing the robustness of tree ensembles, backdoor attacks on deep neural networks (DNNs), and defence techniques against backdoor attacks on DNNs.

Knowledge Embedding in Ensemble Trees
Many previous works enhance tree-based models by embedding knowledge. Maes et al. [21] propose a general scheme to embed feature generation into tree ensembles; they use Monte Carlo search to efficiently explore the feature space and construct features, which significantly improves model accuracy. Wang et al. [32] combine the generalisation ability of embedding-based models with the explainability of tree-based models; the enhanced tree ensembles provide both accurate and transparent recommendations for users. Zhao et al. [35] leverage latent factor embeddings and tree components to achieve better prediction performance for real-world applications that have both abundant numerical features and categorical features of large cardinality. Our paper considers knowledge expressed as an intrinsic connection between a small input region and some target label. Specifically, the bad knowledge relates to safety-critical applications of tree ensembles, such as backdoor attacks; the good knowledge concerns the robustness enhancement of tree ensembles.

Robustness Analysis of Ensemble Trees
Recent works focus on the robustness verification of tree ensembles. The study [16] encodes a tree ensemble classifier as a mixed integer linear programming (MILP) problem, where the objective expresses the perturbation and the constraints include the encoding of the trees, the consistency of the leaves, and the misclassification. In [24], the authors present an abstract interpretation method in which operations are conducted on the abstract inputs of the leaf nodes across trees. In [26], the decision trees composing the ensemble are encoded into a formula, which is then verified with an SMT solver. The work [31] partitions the input domain of decision trees into disjoint sets, explores all feasible path combinations in the tree ensemble, and derives output tuples from the leaves. It is extended to an abstraction-refinement method in [30] by gradually splitting input regions and randomly removing a tree from the forest. Moreover, the work [11] considers the verification of gradient-boosted models with SMT solvers.
We also note some attempts to improve the local robustness of tree ensembles. The work [3] generalises adversarial training to gradient-boosted decision trees; adversarial training provides a good trade-off between a classifier's robustness to adversarial attack and the preservation of accuracy. Meanwhile, [6] proposes a robust decision tree learning algorithm that optimises the classifier's performance under worst-case perturbations of the input features, which can be expressed as a max-min saddle point problem.

Backdoor and Trojan Attacks on Neural Networks
The work [19] selects some neurons that are strongly tied to the backdoor trigger and then retrains the links from those neurons to the outputs, so that the outputs can be manipulated. In [14], the authors modify the weights of a neural network in a malicious training procedure based on training-set poisoning, which computes these weights given a training set, a backdoor trigger and a model architecture. In [7], the authors take a black-box data-poisoning approach, where poisoned data are generated from either a legitimate input or a pattern (such as a pair of glasses). The study [27] proposes an optimisation-based procedure for crafting poison instances: the attacker first chooses a target instance from the test set (a successful poisoning attack causes this target example to be misclassified at test time); next, the attacker samples a base instance from the base class and makes imperceptible changes to it to craft a poison instance; this poison is injected into the training data with the intent of fooling the model into labelling the target instance with the base label at test time. Finally, the model is trained on the poisoned dataset (the clean dataset plus the poison instances); if, at test time, the model mistakes the target instance as belonging to the base class, the poisoning attack is considered successful.

Defence to Backdoor and Trojan Attacks
The work [18] combines pruning (reducing the size of the backdoored network by eliminating neurons that are dormant on clean inputs) with fine-tuning (a small amount of local retraining on a clean training dataset), a defence called fine-pruning. The work [13] defends against redundant-node-based backdoor attacks. In [20], Liu et al. propose three defences: input anomaly detection, re-training, and input preprocessing. In [5], the authors propose backdoor detection for poisoned training data via activation clustering; they observe that backdoor samples and normal samples receive different responses from the DNN, which is evident in the network's activations.

Conclusion
Through a study of the embedding and extraction of knowledge in tree ensembles, we show that our two novel embedding algorithms, for black-box and white-box settings respectively, are preservative, verifiable and stealthy. We also develop a knowledge extraction algorithm using SMT solvers, which is important for defending against backdoor attacks. We find, both theoretically and empirically, a computational gap between knowledge embedding and extraction, which raises the security concern that a tree ensemble classifier is much easier to attack than to defend. An immediate next step is to develop more effective backdoor detection methods.
Pruning is a common and effective algorithm for removing unnecessary branches from decision trees when those branches contribute little to classifying instances. Most pruning techniques, such as reduced error pruning and cost complexity pruning, remove a subtree at a node, make the node a leaf, and assign it the most common class. If the pruning does not degrade the model's predictions according to some measure, the change is kept.
Our experimental results in Table 8 show that pruning does not significantly affect the accuracy of models on either the clean or the KE dataset, i.e., acc(pruned(M), D_test) and acc(pruned(M), κD_test) do not decrease with respect to acc(M, D_test) and acc(M, κD_test). That is, Conjecture 1 holds.

Fig. 1: All MNIST images of handwritten digits with a backdoor trigger (a white patch close to the bottom right of the image) are mis-classified as digit 8.

Algorithm 1: Black-box Algorithm for Decision Tree Knowledge Embedding
Input: T, D_train, κ, t_max {D_train is the training dataset; t_max is the maximum number of retraining iterations}
Output: KE tree κ(T), total number m of added KE inputs
1: learn a tree T and obtain the set U of paths
2: initialise the iteration number t = 0
3: initialise the count of KE inputs m = 0
4: while |U| ≠ 0 and t ≠ t_max do
5:   initialise a set of KE training data κD = ∅
6:

Fig. 3: Decision tree returned by the black-box algorithm

Fig. 7: The satisfiability of the P-rule on decision trees and tree ensembles. Test accuracy change is calculated as acc(M, D_test) − acc(κ(M), D_test). Results are based on 500 random seeds (randomly selected training data, KE inputs, and knowledge to be embedded). Tree ensembles are better at satisfying the P-rule than decision trees.

Declarations
12.2 Conflicts of interest/Competing interests: disclosures.
12.3 Availability of data and material: The experiment benchmarks are available in the UCI Machine Learning Repository, LIBSVM and Kaggle.
12.4 Code availability: https://github.com/havelhuang/EKiML-embed-knowledge-into-ML-model
12.5 Authors' contributions: The contribution is specified in the Contribution Sheet.

Table 1: List of decision paths extracted from the original decision tree

Table 3: Benchmark datasets for evaluation

Table 4: Statistics of knowledge embedding on a single decision tree (averaged over 20 randomly generated single pieces of knowledge)

Table 5: Statistics of knowledge embedding on tree ensembles

Table 6: Embedding multiple pieces of knowledge into tree ensembles

Table 7: Local robustness enhancement by knowledge embedding

Table 8: Model accuracy on the clean and KE test sets after applying REP

Table 10: Extraction of embedded knowledge