
1 Introduction and Problem Statement

As defined by the Industry 4.0 agenda, modern production systems (often called Cyber-Physical Production Systems or CPPSs) rely on intelligent services such as self-optimization, self-reconfiguration and self-diagnosis [6, 8, 10] which should increase the robustness, flexibility and productivity of the systems.

CPPSs consist of many autonomous and cooperative subsystems and components that interact in a multitude of ways. This combination of individual subsystems can cause emergent behavior, i.e. behavior of the full system that deviates from the expected behavior of the subsystems [9]. This often prevents both the creation of a holistic behavior model at design time and its straightforward updating after changes are made to the system.

One way to handle this problem lies in data-driven approaches, which allow an accurate model to be learned from the data actually observed during plant operation. In these approaches, abstractions of the subsystems' behavior are derived from the machine data.

Due to the high variability of CPPSs, these models are often not physics-based but take a more generic form such as timed automata or Petri nets. Nevertheless, such models can be very difficult to learn for complex systems, especially at a level that would enable simulation of the system behavior. On the other hand, simulating different scenarios and decisions can reveal suboptimal behavior in some of the components, while online observations can be compared to the model to detect errors [10].

More precisely, the models can be used in three modes [8] (see Fig. 1):

  1. Offline Simulation—evaluating system configurations for different contexts (e.g. new product type, new machine module) [5, 17]

  2. Proactive Simulation—recognizing deviations from the expected behavior given the current context [10, 3]

  3. Reactive Simulation—evaluating a possible sequence of actions after a disturbance has occurred [12, 2]

Fig. 1

A learned model can be used for evaluating a configuration, monitoring the CPPS behavior, and evaluating decision options in case of a disturbance

This paper proposes an approach validated on a real-world industrial use case, in which ensembles of timed automata simulation models were learned from data traces generated in the manufacturing of smart meters. These models were then used in three scenarios spanning all three of the aforementioned simulation modes.

(RQ1):

Which parts of the simulation model can be learned with machine learning (ML), and which parts need to be manually programmed based on collected prior knowledge?

(RQ2):

How can the manually programmed and the data-driven/learned parts be integrated?

2 Use Case

In this work, a use case of the company eBZ GmbH\(^1\) from Bielefeld, Germany is used to evaluate our simulation approach based on ML models of dynamic system behavior. The plant of interest is responsible for the assembly, programming, and testing of smart meters. Figure 2 describes the plant behavior using the formalized process description (VDI 3682, [15]).

Fig. 2

Formalized Process Description of the smart meter production process. Each technical resource can have multiple instances (e.g. 2\(\times \)T1, 8\(\times \)T2, ...)

Order data

The production is performed in an order fulfillment manner using an assemble-to-order policy [14], in which only one order can be active at a time. The order data is available from the manufacturing execution system (MES), each order having an order ID, a product type and the number of products. During production, for each order we calculate the productivity metric as the order’s average throughput.

Event data

The event logs are generated by the MES and contain information about the communication between the MES and the individual CPPS components. Each row in the tabular data gives the timestamp of the event, the event symbol (name), the component which logged the event, the ID of the product piece it has affected and the ID of the order that this piece belongs to.

Value-discrete data

System PLCs provide access to a subset of program variables (e.g. via OPC-UA interface). Each row in the tabular data gives the time-stamp of the value change, the name of the variable that changed its value, the new value and the component that this variable belongs to.

3 Proposed Approach

The proposed approach is based on discrete-event models of the plant components and relies on the well-established deterministic timed automata [1]. Here, “timed” refers to the continuous clock measuring time spent in some state until a transition occurs, while “deterministic” refers to the deterministic trajectory of states given a sequence of events applied to the automaton.

Model

The model extends the standard timed automaton by a probabilistic aspect, namely the categorical probability distributions (relative frequencies) of events given each state (see Fig. 3). The result is a generative model which can be used to simulate system behavior starting from any initial state. Its single continuous clock t is reset on every event. To generate the next event, first an event symbol e is sampled given the current state q; then the clock timing of the event is sampled according to the one-dimensional distribution \(p(t \mid q, e)\).

Fig. 3

The proposed timed-automata learning approach
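The generative step can be sketched as follows; the states, event frequencies, timing samples, and transition table below are hypothetical placeholders for illustration, not values learned from the plant data:

```python
import random

# Hypothetical learned model: per-state event frequencies and
# per-(state, event) clock-timing samples collected during learning.
event_counts = {
    "q0": {"A": 134, "B": 57},
    "q1": {"C": 10},
}
timing_samples = {
    ("q0", "A"): [0.8, 1.1, 0.9],
    ("q0", "B"): [2.5, 2.7],
    ("q1", "C"): [0.3],
}
transitions = {("q0", "A"): "q1", ("q0", "B"): "q0", ("q1", "C"): "q0"}

def generate_next_event(state):
    """Sample the next event symbol e ~ p(e | q), then a clock
    timing t ~ p(t | q, e); the clock is reset on every event."""
    events = list(event_counts[state])
    weights = [event_counts[state][e] for e in events]
    e = random.choices(events, weights=weights, k=1)[0]
    t = random.choice(timing_samples[(state, e)])  # empirical (histogram) draw
    return e, t, transitions[(state, e)]

e, t, q_next = generate_next_event("q0")
```

Repeating this step from any initial state yields a simulated event trace together with its timing.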

Learning algorithm

As shown in previous works, timed automata can be learned efficiently from observations [7, 11]. In this work we use a simple learning approach applied to subsets of variables/events—these subsets are partitioned according to prior knowledge about related components/variables/events. The algorithm assumes that the system state at any moment is fully observed and determined by: 1) the observed values of discrete variables; and 2) the last logged event symbol of each considered component (see Fig. 3). After the dataset is iterated, the probability distributions of event clock timings can be approximated by means of histograms.
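A minimal sketch of this counting-based learning pass, assuming a time-ordered event log of (timestamp, event symbol, component) records; the record layout and helper names are illustrative, not the exact implementation:

```python
from collections import defaultdict

def learn_automaton(event_log):
    """One pass over a time-ordered event log.
    State = tuple of the last logged event symbol per component;
    the single clock measures time since the previous event."""
    counts = defaultdict(lambda: defaultdict(int))  # state -> event -> frequency
    timings = defaultdict(list)                     # (state, event) -> clock samples
    last_event = {}
    prev_ts = None
    for ts, event, component in event_log:
        state = tuple(sorted(last_event.items()))
        counts[state][event] += 1
        if prev_ts is not None:
            timings[(state, event)].append(ts - prev_ts)  # later binned into histograms
        last_event[component] = event
        prev_ts = ts
    return counts, timings

log = [(0.0, "start", "T1"), (1.2, "done", "T1"), (1.5, "start", "T2")]
counts, timings = learn_automaton(log)
```

The collected timing samples per (state, event) pair are what the histogram approximation of \(p(t \mid q, e)\) is built from.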

Manual logic

While the learned automata approximate the behavior of the system components, additional logic is necessary to coordinate the behavior of the complete automata ensemble. This article proposes a Manual Logic component (see Fig. 4) able to forward the events that are generated by any of the automata to a group of other automata, in order to trigger a transition there. Additionally, it allows changing the plant configuration i.e. the transition parameters of the models. The Manual Logic is manually programmed based on the collected prior knowledge.

Fig. 4

Manual Logic component is the key to enabling ML-based simulation
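The event-forwarding role of the Manual Logic could look roughly like this; the `routing` table, the `trigger` method, and the dummy automata are assumptions for illustration, not the paper's actual interfaces:

```python
class ManualLogic:
    """Coordinates an ensemble of learned automata: forwards events
    generated by one automaton to a configured group of others.
    The routing table encodes manually collected prior knowledge."""
    def __init__(self, routing):
        self.routing = routing  # event symbol -> list of target automata
        self.log = []

    def register(self, source, event):
        """Called whenever any automaton emits an event."""
        self.log.append((source, event))
        for target in self.routing.get(event, []):
            if target is not source:
                target.trigger(event)  # force a transition in the target

class DummyAutomaton:
    def __init__(self, name):
        self.name, self.received = name, []
    def trigger(self, event):
        self.received.append(event)

t1, t2 = DummyAutomaton("T1"), DummyAutomaton("T2")
logic = ManualLogic({"piece_ready": [t1, t2]})
logic.register(source=None, event="piece_ready")
```

Changing the plant configuration would then amount to editing the routing table and the transition parameters of the models before a simulation run.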

Offline Simulation

Here, multiple plant configurations are simulated and the performance they achieve for a given order is evaluated. The approach takes a search space of plant configurations as well as order data as input and simulates the automata. In this mode, the Manual Logic and the automata are fixed, while the configuration is varied.
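The configuration search can be sketched as follows; the `simulate` stand-in below is a toy saturation model, not the learned automata ensemble:

```python
def offline_search(configs, simulate, order):
    """Evaluate each plant configuration by simulating the given order
    and return the configuration with the highest productivity."""
    results = {cfg: simulate(cfg, order) for cfg in configs}
    best = max(results, key=results.get)
    return best, results

# Toy stand-in for the learned-ensemble simulation: productivity
# grows with the number of parallel components until it saturates.
simulate = lambda n, order: min(n * 15, 85)
best, results = offline_search(range(2, 12), simulate, order=None)
```

With a fixed Manual Logic, only the configuration parameter changes between runs, so the runs are independent and trivially parallelizable.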

Proactive Simulation

In the proactive simulation, the learned automata are used to compare the online data to the simulation. The plant configuration and the Manual Logic are fixed. If the ongoing productivity is lower than the simulated productivity, this hints at faults in components or problems with the used control logic. For example, if a lower productivity is observed in the online data, it could be identified that a robot is working more slowly than what was learned in the automata. By comparing the online data to a new simulation run based on the corrected robot timings, we can confirm that this was the probable cause of the problem.
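A simple way to flag such deviations, assuming hourly productivity series from the simulation and from the online data; the relative threshold is an illustrative choice:

```python
def detect_slowdown(simulated, observed, threshold=0.15):
    """Flag time indices where observed productivity falls more than
    `threshold` (relative) below the proactively simulated value."""
    return [i for i, (s, o) in enumerate(zip(simulated, observed))
            if s > 0 and (s - o) / s > threshold]

# E.g. simulated ~70-75 products/hour vs. an observed drop to 50:
alerts = detect_slowdown([70, 72, 75], [69, 70, 50])
```

Once an index is flagged, the delayed or overly frequent events around that time window can be inspected, as done in the experiments.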

Reactive Simulation

Here, the plant configuration is fixed and the automata only execute the online data of the current order. Once a disturbance has been detected, the simulations can be started using the current automata states and clock values. Given a set of possible decisions that can be made after the disturbance, the Manual Logic component runs many simulations trying different alternatives, in order to determine a sequence of actions that should lead to the highest system performance. For example, in some systems, a robot might have two alternatives: going for a new hardware piece and putting it on a conveyor or assembling the casing of another half-finished product.
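The enumeration of decision sequences could be sketched like this; the two alternatives, the horizon, and the toy KPI function are illustrative assumptions:

```python
import itertools

def reactive_evaluate(state, alternatives, simulate, horizon=3):
    """After a disturbance, enumerate decision sequences up to the given
    horizon, simulate each from the current automata states/clocks, and
    return the sequence with the highest simulated productivity."""
    best_seq, best_kpi = None, float("-inf")
    for seq in itertools.product(alternatives, repeat=horizon):
        kpi = simulate(state, seq)
        if kpi > best_kpi:
            best_seq, best_kpi = seq, kpi
    return best_seq, best_kpi

# Toy KPI: alternating the two robot tasks scores best in this stand-in.
def simulate(state, seq):
    return sum(10 if a != b else 4 for a, b in zip(seq, seq[1:]))

seq, kpi = reactive_evaluate(None, ["fetch_new", "assemble_cap"], simulate)
```

In practice the number of simulated alternatives grows exponentially with the horizon, so the horizon or the branching would be bounded.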

4 Experiments

Data set

The data set consists of 3.25 million rows of discrete data and 0.7 million rows of event data collected during seven days of production. A total of 207 discrete variables are observed, some of which change rapidly, while others change their value only a few times a day. The data set and part of the code used are available on Kaggle\(^2\).

Learning

Model learning was implemented in Python using the algorithm described in Sect. 3. Manually-collected prior knowledge was used to decide which events/components should be jointly learned. This led to seven learned automata models, two of which are presented in Fig. 5.

Fig. 5

Learned automata of the Programming (Left) and Laser (Right) components

Manual logic

It is programmed manually based on the prior knowledge and the information given in the process description (Fig. 2). Some further constraints were added based on process knowledge, e.g. that all T1 (Test1) and T2 (Test2) components are always triggered simultaneously.

Offline Simulation

There are various configuration parameters that can be analyzed in this plant. The following scenario was considered:

The number of T2 components can be changed. Given a specific product type, what is the optimal number \(N_{T2}\) of T2 components to use?

To answer the question, a search space was defined for the parameter \(N_{T2} \in \{2, \ldots, 11\}\) (due to technical constraints). The simulation is then performed for each of the 10 options. The results on the right side of Fig. 6 show that the optimal \(N_{T2}\) depends on the product type. For one product type, two T2 components are enough (grey), while for another product type the productivity gains taper off after the sixth component is added (blue). These results were later confirmed on the real CPPS.

Fig. 6

Left: Simulated behavior as a Gantt chart. Arrows indicate product transports across components, while colors represent different component states. Right: Offline analysis of order productivity sensitivity to the \(N_{T2}\) parameter for two product types (gray/blue bars)

Proactive Simulation

Here, productivity was chosen as the performance indicator which is used to compare the simulated and the actual behavior of the plant. Consider the following scenario:

A faulty T2 component leads to products being wrongly marked as defective which causes a drop in productivity. How can we detect and identify this problem?

The left side of Fig. 7 shows a diagram where—according to the simulation—around 70 products should have been produced in an hour, while the real CPPS only produced 44. The simulation allows us to determine when this slowdown occurred, as well as to investigate which events possibly occurred with a delay, or too frequently. A further analysis leads to the explanation of the problem: too many Failed events in one of the T2 stations.

Fig. 7

Left: Proactive-simulated productivity (orange) and actual productivity (green). The yellow bar marks a disturbance. Right: Sorted hourly productivity of the simulated reactive alternatives

Reactive Simulation

In general, various disturbances might occur in the plant. Here, the following scenario was investigated:

A number of Defect products occur in one of the three processes: Programming, T1 or T2. Which sequence of possible decisions results in the least loss in productivity?

First, the disturbance is detected on January 9 at 8 a.m. Then, sequences of decision alternatives for the different robot tasks are simulated starting from the selected point in time. The right side of Fig. 7 shows a significant difference among the evaluated decision sequences, of which the best one should be chosen.

5 Conclusions and Future Work

In this paper we presented an approach to the simulation of CPPS behavior based on: 1) timed automata that were learned from the data; and 2) manual logic programmed using the collected prior knowledge. The approach was successfully validated in three scenarios:

  1. Using offline simulation to optimize the number of test stations to achieve a higher productivity than the base configuration;

  2. Evaluating the current performance against the theoretical performance using proactive simulation;

  3. Adapting the decision sequence of a robot after a series of defective products using reactive simulation.

However, the approach also has the drawback of being labor intensive to set up. Considerable manual effort was needed to determine the set of relevant events and their meaning with regard to the automaton. While this information can be considered prior knowledge, it is often not readily available.

Future research could focus on representing this information in the form of ontologies and knowledge graphs [16] to achieve a more detailed highly formalized representation of knowledge. The field of Informed ML offers numerous approaches to further integrate this knowledge in the ML process [13]. Additionally, incorporating possibly available continuous data can also be considered [4].