Causality learning approach for supervision in the context of Industry 4.0

In order to have a full control on their processes, companies need to ensure real time monitoring and supervision using Key Performance Indicators (KPI). KPIs serve as a powerful tool to inform about the process flow status and objectives’ achievement. Although experts are consulted to analyze, interpret, and explain KPIs’ values in order to extensively identify all influencing factors; this does not seem completely guaranteed if they only rely on their experience. In this paper, the authors propose a generic causality learning approach for monitoring and supervision. A causality analysis of KPIs’ values is hence presented, in addition to a prioritization of their influencing factors in order to provide a decision support. A KPI prediction is also suggested so that actions can be anticipated.


Introduction
Over the last decades, the industrial world has known a wide emergence of Information Technologies that have led to a convergence towards a new industrial revolution commonly called Industry 4.0.This latter aims at exploiting the growing technologies as a backbone to integrate objects, humans, machines, and processes [1], in order to better overtop customers' requirements increasingly stringent about costs, quality and deadlines.All of this requires making right decisions at the right moment.Thus, the three following features should be considered [2]: (i) horizontal integration, which aims to optimize the value chain by connecting it beyond the company's perimeter; (ii) vertical integration of systems and subsystems, with production management tools through hierarchical levels, (iii) end-to-end integration of engineering across the value chain.The main enablers of vertical integration are real time monitoring and supervision [3].Monitoring is achieved by collecting data to inform about the systems' current state, and does not have any direct action on decision making, while supervision must provide functions that may affect the conduct of the monitored system, such as parameterization, re-planning, or optimization, based on the current state [4].Supervision must also be able to recognize and report abnormal situations, so that stakeholders can take well-founded decisions.Also, deviations should be detected preferably before they happen, so that actions can be taken on the factors that have caused it.In this context, KPIs serve as a strong monitoring mean [5] that quantifies processes and engaged action effectiveness [6].
The purpose of this paper is to provide an approach to conduct prediction and diagnosis in order to detect abnormal situations and identify their causes.For this, the focus will be on analyzing KPIs' values, by exhaustively identifying and prioritizing the root factors that affect them.Significant researches [7][8][9][10] confirm that the analysis of KPIs' values, and the identification of their influencing factors, often performed by experts in an empirical and descriptive way, do not allow exhaustive and exact identification of all direct and indirect causes of each KPI deviation.These empirical analyses conducted by following some specific problem solving approaches, may omit many factors that affect the investigated KPI, and mutual influence between the identified factors may not be noticed.Hence, this kind of analysis remains subjective and only represents the known part of reality, often leaving a hidden part we totally ignore [11].Besides, KPIs may change over time and have new influencing factors.These reflections have led us to make our hypothesis, which is to take in consideration all available data, since experts may omit certain factors, mistakenly judging them as being uninvolved in the evolution of the KPIs' values.Thus, we propose to build a causal learning approach in order to identify and prioritize influencing factors.This approach is based on the strong assumption of collecting data from as many sources as possible, including physical world and information systems, by instrumenting as much as we can the systems to be monitored, as well as their environments, so that our analysis can exhaustively highlight all the leading causes among this data.The proposed approach can be applied in the production context as well as in other engineering contexts.
The rest of this article is organized as follows.In section 2, we describe the global approach of the proposal, focusing on data analysis.Section 3 summarizes a use case and discuss the first results.Conclusion and future work are given in section 4.

Methodology of the proposal
The goal of the methodology is to be able to identify all the affecting factors of a given KPI, and to prioritize them in order to provide decision support by identifying actions which are more relevant to engage for improving the KPI value.These actions will have a more pronounced added value if they are taken before deviation occurs, hence the interest in predicting the KPI's values.The diagram shown in Fig. 1 describes the bricks that make up data analysis that will lead us to our goal.
(i) Causal analysis based on Bayesian Networks (BN), which aims at identifying the factors affecting the addressed KPI.The BN form a class of multivariate statistical models that has become popular as an analytical framework in causal studies, where causal relations are encoded by the structure of the network [12][13][14].However, the construction of the BN's structure is itself based on the experts' a priori knowledge.To cope with this, several algorithms exist to learn the structure [15] (constraint based algorithms, and score based algorithms), and to compute the structure's associated conditional probabilities.(ii) Prediction: in order to anticipate actions to be taken on the factors identified by the causal analysis, the KPI value should be predicted and, if any deviation is detected, actions are engaged at the right time.In our case, this prediction is made possible by means of Artificial Neural Networks (ANN).ANN are used to solve complex problems and enable learning and modeling nonlinear and complex relationships between inputs and outputs.(iii) ANN parameters' definition: in order to predict one given KPI, the optimal structure of the ANN must be defined and must meet a reasonable computing time with good prediction results.For our case, a multilayer perceptron ANN is used, and for having a good compromise computing time/prediction accuracy, the authors followed, for one single KPI, an experimental approach to adjust the ANN's parameters (e.g.number of layers, learning rate, etc.).Since many KPIs need to be predicted, an optimal ANN structure needs to be defined for each KPI, thus, we need to provide greater genericity to our proposal in order to avoid following the same long experimental approach for each single KPI prediction ANN structure.For this, the authors suggest, in this brick, to adjust the ANN parameters using an optimization algorithm, so that optimal ANN for any KPI prediction can easily be generated.(iv) Prioritization of the impacting factors: causal analysis provides us with the existing causality links between the factors and the addressed KPI, and mutual influences between the factors themselves so that we can go through the causality links until the root cause.In case of deviation, it would be wise to prioritize the impacting factors identified in (i).For this issue, weights of the used ANN are employed, since the weights represent the strength of connections between units of the ANN, and highlight the degrees of importance of the values of inputs [16].(v) Decision support: given the outputs of precedent steps, a decision support can be provided in case of deviation to adjust the implicated factors by priority order.

Use case and results
To implement the proposed methodology of data analysis, the authors have constructed a representative summary dataset to validate that the experiments are correct.This dataset respects, in a very flexible way, a certain amount of causality rules that have been previously defined to be compared with the resulting causality links.
The use case addresses one KPI: the production cycle time.The rest of the dataset is made of variables that may affect, or not, the addressed KPI: the day of the week, the time slot, the month, the indoor temperature, the operator's heart rate, his stress level, the defaults number, and the training level.The goal is to identify, among this dataset, variables that affect the production cycle time, and to prioritize them.To build a realistic and more relevant use case, the authors have made sure that the dataset does not follow these rules in an exclusive way.Basically, this KPI is calculated using two information: the remaining time for production, and the number of produced units.Given this, if deviation is detected, both information do not give answers neither to understand how did this happen nor to trace back to root causes.The above-presented approach was applied to this use case.(i) Causality analysis: we used a constraint based algorithm (Peter and Clarck (PC) algorithm) to define the structure of the BN that will allow identifying the causal links.PC algorithm is a constrained based algorithm that begins with a complete undirected graph, and removes the edges between pairs which are not statistically significantly related by performing conditional independence (CI) tests, then it looks for the V-structures and directs the edges using two other rules (see [17] for more information).This algorithm was modified by adding one more constraint, in order to avoid having meaningless causality links (e.g.defaults that may cause day change is a meaningless link).The steps of this constraint verification are: (1) identify and define variables that can not be changed (e.g. the current hour or day), and (2), remove the edges which go to the nodes representing these variables.The final resulting graph corresponds to the starting assumptions (Fig 2 .a),but does not show a causality link between temperature and cycle time, even if the data was constructed assuming that.(ii) Prediction and (iii) ANN parameters' definition: the KPI prediction was implemented using a multilayer perceptron ANN, with a K-fold cross validation.The predicted value is either 1 or 0 according to whether or not the cycle time will deviate or not, the prediction is 92% accurate (Fig 2 .b).The ANN that gave us these predictions is made up of 5 hidden layers.This ANN's parameters were found using an experimental approach, since the parameters' optimization algorithm is still being under development.Concerning (iv) the impacting factors prioritization, ANN's weights used for the prediction were employed to give a ranking of all available factors, and the ranking corresponds to the starting assumptions.Fig. 2.c shows that we obtain the same prioritization even if the prediction models' structures are different (conditioned on the fact that they have a predictive power upper than 85% in our case).The figure shows three lists, all with the same nodes ranking (4-0-3-1-2-5-6).To prove that this ranking is consistent, we have created a new dataset with the same rules and assumptions as the first one.First, we have used this dataset to test our prediction model; the resulting prediction accuracy was 89%.Then, we have replicated the same dataset five times, and each time, we changed the values of one influencing variable individually (day (1), temperature (3), stress (4), training (7), or defaults (6)), to evaluate the impact of each of these influencing variables.Each time, we replaced one variable (e.g.stress) by random values in the same range of variation as the initial values, letting the other variables as they were.Then, we predicted our KPI with the new dataset, using the same model that we have previously built, then we evaluated the prediction.We predicted 0: time slot 1: day 2: month 3: temperature 4: stress 5: heart rate 6: defaults 7: training five times, in addition to the first prediction, in order to see the impact of each of the five influencing variables.Finally, we compared the performances of each of the five predictions that have been run, by superimposing their receiver operating characteristic (ROC) curves, and we evaluated the performance of each prediction by calculating the area under the curve (Fig. 3.a).We can see that the curves reflect the ranking, and that the ranking of the areas under the curves corresponds to the ranking of the variables in the obtained list.Fig. 3.b shows the results obtained by repeating the same operation, but this time, instead of replacing the concerned variable values by random values in their range of variation, the concerned variable was treated as if its values were missing, and replaced the initial values with their mean, then we predicted the KPI.We have repeated the same operation for the five variables.ROC curves are obviously different from the ones in Fig3.a, but the ranking is the same.To better see the gaps between the we only represented the four curves that do not intersect in Fig. 3.b, from where we can easily see the gaps between the prediction qualities even without calculating the areas the curves.

Conclusion and future work
In this paper, we presented a causality learning approach for KPI supervision.The main idea was to collect as much data as possible in order to conduct an exhaustive data analysis.This analysis is based on BN structure learning, KPI prediction using ANN, and influencing factors prioritization.The proposal is actually in its early phases of development and many aspects need to be addressed, like the test and benchmarking of other algorithms and tools of learning BN structure.Moreover, we should enrich our causality learning with results of decisions taken by considering the impacting factors and their ranking proposed by the supervision system, to see if the KPI is evolving in the right sense, and hence to validate the analysis robustness.Also, a complete use case should be implemented using real industrial data to validate the proposed methodology performance beyond the constructed use case.

Fig. 1 .
Fig. 1.SADT diagram representing the functions of the data analysis.

Fig. 3 .
Fig. 3. Results: (a) ROC curves by replacing each time one variable by random values in its variation range, (b) ROC curves by replacing each time one variable by its values' mean.