Hardening Deep Neural Networks in Condition Monitoring Systems against Adversarial Example Attacks

Abstract. Condition monitoring systems based on deep neural networks are used for system failure detection in cyber-physical production systems. However, deep neural networks are vulnerable to attacks with adversarial examples. Adversarial examples are manipulated inputs, e.g. sensor signals, that are able to mislead a deep neural network into misclassification. A consequence of such an attack may be the manipulation of the physical production process of a cyber-physical production system without being recognized by the condition monitoring system. This can result in a serious threat for production systems and employees. This work introduces an approach named CyberProtect to prevent misclassification caused by adversarial example attacks. The approach generates adversarial examples for retraining a deep neural network, which results in a hardened variant of the deep neural network. The hardened deep neural network sustains a significantly better classification rate (82% compared to 20%) while under attack with adversarial examples, as shown by empirical results.


Introduction
Cyber-physical production systems (CPPS) consist of hardware and software components controlling physical processes. They are the focus of initiatives such as Germany's Industrie 4.0 or the US Industrial Internet Consortium. CPPS adapt efficiently to new products or product variants without extensive manual engineering effort [1][2][3]. Figure 1 a) shows an example of a CPPS where material is moved and processed between a storage module, a conveyor module, a heating module and a pick-and-place module.
Condition monitoring systems (CMS) can be utilized to detect system failures of CPPS (cf. figure 1 b), e.g. a broken heating module. To this end, process data from the modules is analyzed with machine learning algorithms such as deep neural networks (DNN) [4]. DNNs enable the automatic generation of mathematical models representing the normal behavior of a CPPS. The normal behavior is learned using historical process data from production modules. As a representation of the learned normal behavior, the model is compared with the actual CPPS state to classify its condition as normal behavior or anomaly. DNNs have successfully been used to model physical manufacturing processes [5,6].
However, DNNs are vulnerable to adversarial example attacks [7]. Exploiting such attacks may result in a manipulation of the physical production process, which can cause enormous damage to facilities, production systems and employees [8]. An adversarial example (AE) is a specially manipulated input with the ability to mislead a DNN into misclassification [7]. It is generated from an undistorted original input by intentionally applying worst-case perturbations [9]. This results in an adversarial input being almost identical to the original one. Fig. 1 shows an example of a condition monitored CPPS, where an adversary gained access to the production system. One objective of the adversary may be the manipulation of the production process without triggering an alert by the CMS (false-negatives). Another objective may be the triggering of false alerts (false-positives).
A successful attack may be achieved by the following steps: (i) process data from production modules is collected, (ii) collected process data is used to generate AEs, (iii) AEs are exploited to manipulate the physical process.
A false-positive classification triggers an anomaly alarm by the CMS, although the production process was actually correct. This may lead to unscheduled maintenance, which temporarily stops production. Likewise, the confidence in the CMS may be reduced. Furthermore, a false-negative classification may result in e.g. damaged products. A long-term operation in an insecure system state may even result in severe damage to the production system and pose a threat to employees [8].
Our contribution is the introduction of an approach to prevent misclassification caused by adversarial example attacks on deep neural network based condition monitoring systems, which detect system failures of cyber-physical production systems. Empirical results show that our approach yields a hardened deep neural network with a significantly lower misclassification rate despite being attacked.

Related work
Szegedy et al. introduce AEs as an anomalous property of a DNN [7]. A DNN can be formalized as a mathematical model F(x, θ) = Y. The DNN decides whether a given input x ∈ R^n belongs to a learned class Y ∈ R^m, using a set of internal parameters θ.
An AE x′ is defined as an incorrectly classified input which deviates minimally from the correctly classified original x. As shown in definition 1, an AE is generated by applying a perturbation Δx to the original input x, where Δx is kept as small as possible:

x′ = x + Δx,  F(x′, θ) ≠ F(x, θ),  with ||Δx|| minimal.  (1)
A quality criterion for AEs is inconspicuousness, i.e. a minimal deviation from the original. In image classification, for example, an objective of AEs is a perturbation of the original that is imperceptible to the human eye. For quantification of this property, the three distance metrics L0, L2 and L∞ are commonly used in the literature [10]. The L0 metric corresponds to the number of input signals that have been altered (e.g. pixels). The L2 metric measures the standard Euclidean (root-mean-square) distance between x and x′. Finally, the L∞ metric measures the maximum change to any of the input signals.
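The three metrics can be sketched as follows (an illustrative sketch, not taken from the paper's implementation; the exact-equality count for L0 assumes unchanged signals are numerically identical):

```python
import numpy as np

def l0_distance(x, x_adv):
    # Number of input signals (e.g. pixels, sensor values) that were altered.
    return int(np.sum(x != x_adv))

def l2_distance(x, x_adv):
    # Standard Euclidean distance between original and adversarial input.
    return float(np.linalg.norm(x_adv - x))

def linf_distance(x, x_adv):
    # Maximum change applied to any single input signal.
    return float(np.max(np.abs(x_adv - x)))

x = np.array([0.2, 0.5, 0.9, 0.1])
x_adv = np.array([0.2, 0.6, 0.9, 0.3])
print(l0_distance(x, x_adv))    # 2 signals altered
print(linf_distance(x, x_adv))  # largest single change: 0.2
```

A small L0 means few signals were touched; a small L∞ means no single signal changed much, which is why the latter is the typical FGSM constraint.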
Several defensive strategies have been proposed to harden a DNN model against AE attacks, such as adversarial training, defensive distillation, feature squeezing and PGD-based adversarial training [7,11,13,14].
This work transfers AE generation and prevention into the field of industrial automation, in contrast to the approaches presented above, which mainly consider image processing. Our approach adapts FGSM [9] for AE generation and adversarial training [7] for preventing misclassification. Both FGSM and adversarial training are suitable for the CPPS requirement of rapid adaptability.

Solution
The objective of the CyberProtect approach is the prevention of misclassification caused by AE attacks. It is achieved by the following steps: (i) A DNN is exclusively trained on process data P in an initial training phase. (ii) The training phase is extended by an additional retraining phase, where P is used to generate an AE P′ as described in the previous section. (iii) Both the original P and the manipulated P′ serve as input to the DNN. Generating the required AEs can be formalized as follows.
Process data of a CPPS is defined as P = (p_0, ..., p_m), where p_i ∈ [0, 1] for i ∈ {0, ..., m} is a sensor or actuator value. P is input to the CMS utilizing a DNN. As described in section 2, a DNN is a mathematical model F(P, θ) = Y, where Y ∈ {0, 1} is the predicted class corresponding to normal or anomalous behavior and θ is a set of parameters. The adversarial objective is reached by solving the following search problem:

minimize ||Δ||_2  (2a)
subject to F(P, θ) = Y  (2b)
F(P′, θ) ≠ Y  (2c)
||P′ − P||_2 ≤ ε  (2d)

A perturbation Δ = (δ_0, ..., δ_m), with δ_i ∈ R for i ∈ {0, ..., m}, is added to an original P to generate an adversarial example P′ = (p_0 + δ_0, ..., p_m + δ_m). Due to constraint 2c, P′ is not predicted as the original class 2b. The constraint 2d increases the inconspicuousness of P′ by limiting the Euclidean distance between P′ and P to an upper bound ε.
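Checking a candidate against these constraints can be sketched with a toy stand-in classifier in place of the DNN F(P, θ); the threshold rule and all values below are assumptions for illustration:

```python
import numpy as np

def classify(P):
    # Toy stand-in for the DNN F(P, theta): predicts anomaly (1) if the
    # mean sensor value exceeds 0.5, otherwise normal (0).
    return int(np.mean(P) > 0.5)

def is_valid_adversarial_example(P, P_adv, eps):
    # Constraints 2b/2c: the perturbed input must flip the predicted class.
    flips_class = classify(P_adv) != classify(P)
    # Constraint 2d: the Euclidean distance to the original is bounded
    # by eps, keeping the manipulation inconspicuous.
    close_enough = bool(np.linalg.norm(P_adv - P) <= eps)
    # Perturbed values must remain valid sensor readings in [0, 1].
    in_range = bool(np.all((P_adv >= 0) & (P_adv <= 1)))
    return flips_class and close_enough and in_range

P = np.array([0.40, 0.45, 0.48])   # mean 0.443 -> classified as normal
P_adv = P + 0.1                    # mean 0.543 -> classified as anomaly
print(is_valid_adversarial_example(P, P_adv, eps=0.2))  # True
```

With a tighter budget (e.g. eps=0.1) the same perturbation violates constraint 2d and the candidate is rejected.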
By exploiting adversarial examples, the adversary can manipulate the production process unrecognized within the specified constraints. This leads to false-positive or false-negative classification results by the condition monitoring system.

Generation of Adversarial Examples Algorithm
This approach generates AEs by using the Fast Gradient Sign Method (FGSM) [9]. Due to its low computational costs, FGSM is suitable for the CPPS requirement of rapid adaptability.
Generation of AEs is formally described by algorithm 1. The algorithm requires the inputs P, F(P, θ), Y, ε, s, where P describes process data, F(P, θ) describes the trained DNN, Y is the original class label, ε is a threshold parameter and s is a precision parameter.
The algorithm performs the following steps: (1) A variable η is increased by the precision parameter s, which specifies the growth of η between iterations. (2) FGSM is used to generate a candidate for an AE P′: the candidate is computed as P′ = P + η · sign(∇_P J(F(P, θ), Y)), where J denotes the cost function of the trained DNN F(P, θ). (3) The class Y′ is predicted by the trained DNN, where the generated candidate P′ serves as input. These three steps are repeated until either the Euclidean distance between the candidate P′ and the original process data P exceeds the threshold parameter ε, or the newly predicted class label Y′ differs from the original class label Y. In the case of differing classes, a valid AE is found. The algorithm returns the last computed AE candidate P′.
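The iterative loop can be sketched with a toy differentiable model in place of the trained DNN; the logistic model, its hand-derived gradient and all parameter values below are assumptions for illustration, not the paper's implementation:

```python
import numpy as np

# Toy logistic model standing in for the trained DNN F(P, theta).
w = np.array([1.0, -2.0, 1.5])

def predict(P):
    return int(1 / (1 + np.exp(-w @ P)) > 0.5)

def grad_loss(P, y):
    # Gradient of the cross-entropy loss w.r.t. the input P for the
    # logistic model above: (sigmoid(w @ P) - y) * w.
    return (1 / (1 + np.exp(-w @ P)) - y) * w

def gen_ae(P, y, eps, s):
    """Grow the FGSM step eta by s per iteration until the predicted class
    flips or the L2 distance to P exceeds eps; return the last candidate."""
    eta = 0.0
    while True:
        eta += s
        P_adv = np.clip(P + eta * np.sign(grad_loss(P, y)), 0.0, 1.0)
        if predict(P_adv) != y or np.linalg.norm(P_adv - P) > eps:
            return P_adv

P = np.array([0.2, 0.8, 0.3])
y = predict(P)                        # original class label
P_adv = gen_ae(P, y, eps=1.0, s=0.05)
print(predict(P_adv) != y)            # True: a valid AE was found
```

Clipping to [0, 1] keeps the candidate within the valid range of sensor values defined above.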

CyberProtect Algorithm
The CyberProtect algorithm (algorithm 2) enables prevention of misclassification caused by AE attacks. The algorithm requires P^n, Y^n, θ, ε, s as input. Input P^n describes n-dimensional historical process data P, Y^n describes the corresponding n-dimensional class labels, θ describes a DNN configuration, and ε and s are the configuration parameters of the GenAE algorithm described above. The algorithm executes the following steps: (1) A DNN F(P, θ) is initialized with the configuration parameters θ (cf. line 1, function initialize) for DNN architecture, activation and cost function. (2) The DNN is trained with each entry P_i and Y_i (cf. lines 2-3, function train) of the historical process data P^n and the corresponding class labels Y^n. (3) An empty set P′^n is defined (cf. line 4) after the first training phase. (4) The GenAE algorithm (algorithm 1) is used to generate and store AEs to P′^n (cf. lines 5-7), for each process data entry P_i resulting in an AE P′_i. (5) A new DNN F̂(P, θ) is initialized with the configuration parameters θ (cf. line 8). (6) F̂(P, θ) is trained with each entry P_i and P′_i of both the original process data P^n and the generated AEs P′^n (cf. lines 9-11), using the same class label Y_i for P_i and P′_i. The algorithm returns the newly trained DNN F̂(P, θ), hardened against AE attacks.
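The retraining scheme can be sketched end-to-end, assuming a tiny NumPy network in place of the paper's TensorFlow DNN and synthetic data in place of the Secom dataset; all sizes, the seed and the hyperparameters are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1 / (1 + np.exp(-np.clip(z, -30, 30)))  # clipped for numerical safety

def train(X, Y, hidden=32, epochs=3000, lr=0.5):
    # Tiny one-hidden-layer network, full-batch gradient descent.
    W1 = rng.normal(0, 0.5, (X.shape[1], hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.5, hidden); b2 = 0.0
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)
        s = sigmoid(H @ W2 + b2)
        d2 = (s - Y) / len(Y)                  # dLoss/dlogit (cross-entropy)
        dH = np.outer(d2, W2) * (1 - H ** 2)   # backprop through tanh
        W2 -= lr * H.T @ d2; b2 -= lr * d2.sum()
        W1 -= lr * X.T @ dH; b1 -= lr * dH.sum(axis=0)
    return W1, b1, W2, b2

def predict(params, X):
    W1, b1, W2, b2 = params
    return (sigmoid(np.tanh(X @ W1 + b1) @ W2 + b2) > 0.5).astype(float)

def gen_ae(params, X, Y, eta=0.1):
    # One FGSM step per sample: move each input signal by eta in the
    # direction that increases the loss, then clip to [0, 1].
    W1, b1, W2, b2 = params
    H = np.tanh(X @ W1 + b1)
    s = sigmoid(H @ W2 + b2)
    grad = ((s - Y)[:, None] * W2 * (1 - H ** 2)) @ W1.T
    return np.clip(X + eta * np.sign(grad), 0, 1)

# (1)-(2) initial training on historical process data (synthetic here)
X = rng.uniform(0, 1, (200, 5))
Y = (X.mean(axis=1) > 0.5).astype(float)
params = train(X, Y)

# (3)-(4) generate one AE per process data entry
X_adv = gen_ae(params, X, Y)
acc_attacked = np.mean(predict(params, X_adv) == Y)

# (5)-(6) retrain a fresh model on originals and AEs with unchanged labels
hardened = train(np.vstack([X, X_adv]), np.concatenate([Y, Y]))
acc_hardened = np.mean(predict(hardened, X_adv) == Y)

print(f"under attack: {acc_attacked:.2f}, hardened: {acc_hardened:.2f}")
```

The hardened model sees the adversarial examples with their correct labels during retraining, so its classification rate on them exceeds that of the attacked baseline, mirroring the algorithm's intent at toy scale.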

Results
Empirical results were obtained with a reference CMS monitoring data from the Secom dataset [15]. The Secom dataset was recorded from a semiconductor manufacturing process and consists of process data with 590 attributes collected from sensor signals and variables during 1567 manufacturing cycles.
The reference CMS is implemented based on a DNN using the Python framework TensorFlow [16]. The applied DNN architecture consists of 590 input neurons, four hidden layers with 590, 1180, 2360 and 590 neurons, respectively, and one output neuron representing the conditions normal or anomaly. The rectified linear unit (ReLU) is applied as activation function. Training the DNN was performed in a supervised manner for 1000 epochs using the Adam optimizer [17] with parameters β_1 = 0.9, β_2 = 0.999, ε = 10^-8 and a learning rate of 0.01.
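A shape-level sketch of this architecture, using random weights purely to illustrate the layer dimensions (the paper trains the network with TensorFlow; this is not the trained model):

```python
import numpy as np

rng = np.random.default_rng(1)
layer_sizes = [590, 590, 1180, 2360, 590, 1]   # input, 4 hidden, output

# Random (He-initialized) weights, for shape illustration only.
weights = [rng.normal(0, np.sqrt(2 / n_in), (n_in, n_out))
           for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n) for n in layer_sizes[1:]]

def forward(x):
    for W, b in zip(weights[:-1], biases[:-1]):
        x = np.maximum(0.0, x @ W + b)                 # ReLU hidden layers
    logit = np.clip(x @ weights[-1] + biases[-1], -30, 30)
    return 1 / (1 + np.exp(-logit))                    # single output neuron

p = forward(rng.uniform(0, 1, 590))
print(p.shape)   # (1,)
```

The single sigmoid output can be thresholded at 0.5 to yield the binary normal/anomaly decision.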
Our CyberProtect implementation extends the reference CMS and is based on the Python library Cleverhans [18], an extension to the Tensorflow framework [16].
The reference CMS was extended with the CyberProtect algorithm to obtain results shown in Fig. 2.
The left column of Fig. 2 shows classification results from the reference CMS excluding CyberProtect. The mean classification rate is 82% with a standard deviation of 7%; the best result is 95% and the worst result is 60%. The middle column shows classification results of the same reference CMS under AE attacks based on FGSM generation. The classification rate is reduced to a mean of 20% with a standard deviation of 7%, a best result of 43% and a worst result of 7%. The right column shows results of the extended reference CMS utilizing the CyberProtect approach while being attacked with AEs. Here, CyberProtect significantly increases the classification rate to a mean of 80% with a standard deviation of 9%, a best result of 95% and a worst result of 50%.
CyberProtect enables a DNN to nearly regain its original classification rate despite AE attacks, as demonstrated by these empirical results.

Conclusion
This paper presents the CyberProtect approach to prevent misclassification caused by adversarial example attacks on deep neural network based condition monitoring systems in the domain of cyber-physical production. Adversarial example attacks pose a serious threat to production systems and employees, due to their ability to manipulate the monitored production process unrecognized.
This work formally defines the generation of adversarial examples as a constrained search problem and uses adversarial examples to retrain a deep neural network. Empirical results show that a deep neural network hardened by CyberProtect exhibits a significantly lower misclassification rate despite being attacked.
In future work, prevention of misclassification caused by adversarial example attacks will be explored for discrete manufacturing, in which time-dependent machine learning approaches are utilized.
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.