1 Introduction and requirements of system model recording for efficient test planning

Automated production systems (aPS) are typically operated for several decades [1]. To keep aPS functional and reliable over this time and to adapt them to new requirements, they are continuously refactored, e.g., by correcting software errors or replacing worn-out mechanical components [2]. When a system is refactored, it must be validated by regression tests to ensure that the change did not introduce unwanted side effects and that the system still adheres to its requirements [3]. Such regression tests are still carried out manually for aPS [3]. As system downtimes should be kept as short as possible from an economic point of view, on-site test engineers are under considerable time pressure to identify, rectify, and validate potential system faults through tests [3]. Carrying out all system tests after every change is therefore not feasible. Instead, only relevant test steps should be selected to ensure that testing efforts focus on the system parts directly affected by changes and to save time.

Test planning approaches, such as (semi-)automated test case selection and scheduling, can significantly reduce testing time. Test steps require the system under test (SUT) to be in a specific system state before their execution, and scheduling them efficiently minimizes the transfer time between the end state of one test step and the start state of the following one [4]. To derive a time-efficient test schedule, all existing system states, the time required to transfer the system into a test-ready state, and the test steps’ durations must be considered. A good understanding of the system, its states, and its behaviour is needed to identify change-affected parts and to select and schedule the respective test steps [2]. Legacy and evolving systems often lack system documents such as engineering data, behaviour descriptions, or comprehensible documentation of changes. Even if such documents exist, they are rarely maintained or updated after on-site changes during commissioning, optimization, or maintenance [1]. Thus, they do not fully reflect the current system behaviour. Creating or adapting behaviour models manually is error-prone and time-consuming [2, 5].

To address these challenges, this paper proposes deriving a system behaviour model from sensor and actuator values during system test execution. Test specifications are usually created based on the system requirements and thus indirectly reflect the desired system behaviour. The system’s sensor and actuator behaviour is recorded during system test execution to document this indirect knowledge as a behaviour model. By creating a comprehensive behaviour model linked to test cases, the approach enables dynamic test step selection based on system states as well as the detection of deviations from the desired behaviour during normal operation, which reveals unwanted or untested system behaviour. The behaviour model can also include recorded fault-handling routines, typically used for system recovery after a test case fails. With this integration, the system can be transferred to a safe state, either guided or automatically, if the same fault reoccurs. The model recorded must thus fulfil the following requirements (R):

  1. R1 (Readability): The behaviour model must be visualized to enable operators, specifically on-site test engineers with technical backgrounds, to comprehend the model and work with it, e.g., for test planning.

  2. R2 (Test step derivation): The system’s sensors and actuators, as well as their linkage to requirements, must be included in the behaviour model so that separate test steps can be derived.

  3. R3 (High permissiveness): Permissiveness reflects the model’s general validity and adaptability [6]. Models with low permissiveness only reflect the behaviour recorded. Highly permissive models can also map behaviour that is not explicitly recorded but valid.

  4. R4 (Handling non-determinism): Non-deterministic system behaviour is an example of permissiveness in aPS that occurs, e.g., when sensors do not switch simultaneously due to timing uncertainties. The approach should handle such uncertainties.

  5. R5 (Expandability): The behaviour model shall be dynamically expandable to adapt to system changes and new test cases or to integrate recorded fault-handling routines. Computing-intensive re-generations or re-recordings should be avoided.

  6. R6 (Timing): For scheduling selected test steps within a limited time slot, it must be possible to derive from the system model information on the (mean) durations of test steps and of transfers of the system from one state to another.

  7. R7 (Ability to detect deviations): The approach must be capable of detecting deviations between the model and the system’s runtime behaviour to accelerate fault localization during maintenance and testing and to proactively identify potential system errors.

The approach presented in this paper focuses on discrete aPS (Constraint C1), especially considering their real-time capability. Thus, recording the behaviour model must not impact the real-time behaviour of the plant’s programmable logic controller (PLC) (C2), nor may the PLC software be altered for model generation purposes (C3).

This paper’s main contribution is an approach to recording a system behaviour model in the form of a UML state chart based on the SUT’s sensor, actuator, and internal variable values. Unlike existing approaches, the resulting behaviour model is expandable, covers non-deterministic behaviour, includes timing information in the form of timestamps, and is usable for test planning. The model is recorded using system tests; thus, the requirements based on which the test cases were created are retained in the resulting system behaviour model. This paper subsequently uses the model recorded for model-based test step derivation, selection, and scheduling. For evaluation, the approach is applied to a logistic sorting system, based on which the fulfilment of the requirements is discussed. As a result, the feasibility of model recording without influencing the real-time capability of the PLC is demonstrated, as well as the model’s suitability for deriving test steps and using the recorded information for test scheduling.

The paper is structured as follows: Sect. 2 provides an overview of related work. Section 3 introduces the concept of recording the system behaviour during testing. Section 4 presents how deviations from the behaviour model are identified and handled. Subsequently, a time-efficient regression testing strategy is derived from the behaviour model generated (Sect. 5). The implemented approach is evaluated in Sect. 6. The paper closes with a conclusion and outlook in Sect. 7.

2 Related work on test-based behaviour model derivation and model-based test scheduling

This section introduces related work on behaviour model derivation for aPS based on engineering data (Sect. 2.1) and during runtime (Sect. 2.2). Subsequently, Sect. 2.3 focuses on model-based test case selection and scheduling.

2.1 Model derivation from engineering data

System behaviour models or simulations enable early validation of the system’s PLC code through virtual commissioning [7]. In the literature, such behaviour models or simulations are derived either from engineering data, from the control code, or during system runtime. For example, Barth [8] derives highly customizable Modelica simulations for process plants by mapping information from computer-aided engineering (CAE) and the process control system (PCS) to process and instrumentation diagrams (PID). While providing flexibility for changes, this method relies on manual mapping and requires CAEX (Computer Aided Engineering Exchange) formatted files. Puntel-Schmidt [7] generates a Modelica simulation in production engineering based on a plant structure model represented in Automation Markup Language (AML). AML and eCl@ss are used to identify and connect the components in the structural model. Thongnuch [9] generates simulations using 3D geometry models, AML, and behaviour models, providing detailed spatial information for a comprehensive process plant simulation with increased accuracy. Other approaches derive formal behaviour models from existing simulation models to apply formal verification [10]. Using CAD documents, PID files, simulations, or equivalent sources for model creation poses a particular challenge, as these are not always available and require considerable manual effort, e.g., for parameter mapping. In contrast, PLC control code is usually available. Thus, several approaches (cf. [11] for an overview) generate formal models from it for verification purposes, or UML state charts [12] owing to their industry-wide acceptance. However, control code without execution, like engineering data, is a static resource lacking runtime information such as dynamic system behaviour, code coverage, and timing. Another formal system behaviour specification is generalized test tables (GTT) [13]. GTTs are easily adaptable to system changes, and if all test cases derivable from GTTs are executed, dynamic system behaviour and timing can be validated against the specification. However, their creation requires substantial expert knowledge.

2.2 Model derivation using runtime data

In computer science, various approaches exist for generating behaviour models during software runtime, e.g., Petri nets [14] or finite state automata [15] from software execution traces, or a combination of data- and control-flow-graphs from test execution calls [16]. These approaches are hardly applicable to aPS due to their limited consideration of the required real-time capabilities or the “resulting behaviour of software interacting with its environment” [2]. This section focuses on approaches for aPS that derive behaviour models from runtime data, incorporating the system’s real, dynamic hardware behaviour, including sensors and actuators.

Ladiges et al. [2] generate Petri nets solely based on the I/O behaviour of the binary sensors and actuators of the mechatronic system. Physical system states are reflected in the current sensor values, while the current actuator signals reflect the PLC software states. During recording, they assume deterministic system behaviour [2]. As they only consider I/Os, the Petri nets are generated without influencing the system’s real-time behaviour. The Petri nets are modularized by manually mapping the data onto the machine parts to detect anomalies in the system behaviour. Upon a system change, the Petri nets are re-generated. Ladiges et al. chose Petri nets for concurrent system representation and formal verification; still, their lack of inherent structure makes them less suitable for complex plants than state charts, which provide a clearer and more understandable view. Roth et al. [17] use I/O vectors as well but additionally consider preceding states to generate a non-deterministic finite state automaton with output during an error-free training phase of the system. Timing information is missing from the model. Their approach focuses on concurrently operating components and fault detection based on searching I/O sequences within the generated model. Prähofer et al. [18] instrument PLC code and combine the PLC execution traces with logfile information to generate a state chart based on the system’s I/O values and the internal transition behaviour. They opted for a UML state chart because of its understandability, wide industrial application, and suitability for “model-based development, re-engineering or verification” [18]. The state chart is enhanced with timing information, and its initially low permissiveness is increased by symbolic execution. Werner et al. [19] likewise instrument the PLC code to monitor and replay the system behaviour for fault localization and realize a corresponding plug-in for CODESYS, the market-leading IEC61131-3 software development environment. However, code instrumentation [18, 19] risks influencing the PLC’s real-time behaviour, as measured by Werner et al. [19]. Park et al. [20] and Wolny et al. [21] generate behaviour models based on runtime logfiles (time-stamped signal histories without connection to requirements) to avoid influencing the real-time behaviour of the system. Neither addresses handling non-deterministic behaviour. Wolny et al. [21] focus on human readability and the analysis of component runtime behaviour. The model derivation approaches introduced are classified regarding the requirements R1–R7 and the constraints C1–C3 defined in Sect. 1 (cf. Table 1). No approach fulfils all of them.

Table 1 Overview of related work on model derivation

2.3 Model-based test case selection and scheduling

Model-based test case selection involves generating test cases from a (formalized) system behaviour model or selecting test cases based on the model parts they cover. Some approaches generate test case sequences model-based from system specifications [11] in the form of UML models. For this purpose, coverage criteria are used to derive test sequences from, e.g., UML state charts [23] or UML sequence diagrams [24]. Such methods require complete and formalized specification models, which are rarely available in industry [11] and nearly impossible to achieve at reasonable cost [13]. Code coverage is also used for test case generation and for model-based test case selection after a system change to cover the change-affected parts [25]. Yet, this requires the prior estimation of code coverage per test case, which involves code instrumentation and thus could impact the system’s runtime behaviour. Models generated from test execution traces offer the advantage that reduced test steps can be derived model-based while the existing test cases used for model generation remain available [16].

Test case scheduling is the process of determining the optimal order and timing for executing a set of test cases, considering factors such as priorities, dependencies, and resource constraints (e.g., limited testing time) to maximize test efficiency (e.g., error detection rate) and to minimize the testing duration [26]. In computer science, scheduling is accomplished by selecting and prioritizing test cases [26]. In the context of aPS, test scheduling encompasses not only the selection and prioritization of test cases but also the preparation of the SUT’s hardware to attain a test-ready state, both preceding individual test cases and in between them [4]. Test case scheduling is a multi-criteria optimization problem that requires an effective information integration strategy (e.g., [27]) to manage and evaluate all data relevant to determining a resource-efficient and high-utility test schedule. Test case scheduling was previously researched in a system-on-chip design context [28]. Scheduling approaches for aPS focus predominantly on production planning [29,30,31], with time-saving being one of the primary scheduling goals [31]. According to Serrano-Ruiz et al.’s literature review [31], most approaches use heuristic models, often in combination with machine learning, to find a feasible solution to the scheduling problem quickly rather than exactly. Keddis et al. [30] use a capability-based approach to respond to changing requirements and scheduling conditions. Such dynamic rescheduling approaches are especially interesting for test replanning based on previous test execution results. Land et al. [4] introduced test case scheduling for aPS, comparing scheduling approaches known from computer science regarding feasibility, accuracy, and computational cost, and found that an adapted dynamic programming approach performed best for the test case scheduling scenario. Their approach relies on knowledge of system states and transfer times, obtainable from sources such as a state chart.

3 Test-based system model recording concept

A suitable behaviour model fulfilling the previously defined requirements is needed to enable state-based test case selection and scheduling. This section presents an approach for discrete event systems to record the behaviour model during system test execution. As test cases are created manually by test engineers based on the system’s requirements, the requirements are retained in the behaviour model through the test cases. A UML state chart is selected as the resulting model representation because it is easier for test engineers to comprehend (cf. [18, 20]) and is usable for state-based test scheduling (cf. [4]). The concept is illustrated using the example of a sorting system for different workpiece types (Fig. 1). The workpieces are transported by a conveyor and sorted into different ramps by separators according to their colour (optical sensor: dark/light) and their material (inductive sensor: plastic/metallic).

Fig. 1
figure 1

Sorting system for white, black, and metallic workpieces

From the mechanical point of view, the separator has several states based on its current mechanical extension length (cf. Fig. 2). The sensors are not able to detect all of these states, which is why, from the software’s point of view, the separator is in an “intermediate” state during movement (both sensors “false”, respective actuator “true”). This paper derives a state chart corresponding to the software view; the model recording concept is described in detail in the following subsection (Sect. 3.1). Section 3.2 discusses an approach to handle non-deterministic sensor signal recording in the model.

Fig. 2
figure 2

Behaviour model of bistable cylinder (Separator)

3.1 Concept for recording the initial behaviour model

To avoid influencing the real-time behaviour of the PLC, the behaviour model is recorded solely based on the values of sensors, actuators (cf. [2]), and internal variables, optionally enhanced by data from the (static) control code, e.g., naming conventions (cf. Fig. 3). If internal variables (e.g., step variables that are used to switch between internal states [19]) are to be recorded (e.g., to better differentiate various system states), they must be added manually to the list of I/Os to be recorded. All I/O values, including timestamps, are recorded in a logfile similar to Park et al. [20] (cf. the logfile for Separator1 in Fig. 4, left). The test procedure is reflected by the sequential timestamps in the logfile, so the plant behaviour can be reconstructed from it. In the following, processing the logfile recorded and transferring it into the behaviour model data structure presented in Fig. 5 is explained. The data structure is used to create the state chart (cf. Fig. 4, right) for ease of readability, and it enables structured data storage for model adaptability to future changes. Due to the modular plant structure and the naming conventions, the logfile can be processed rule-based (cf. Fig. 6) without complex process mining approaches.
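To make the target data structure concrete, the following minimal Python sketch shows how the elements referenced from Fig. 5 (behaviour model, module, high-level state, substate, subtransition) could be held in code. All class and field names are assumptions inferred from the figure references in the text; the implementation described in Sect. 6 actually uses Matlab/Simulink.

```python
from dataclasses import dataclass, field

@dataclass
class SubTransition:
    source: int          # substate id within the high-level state
    target: int          # substate id, or a final substate
    action: str          # actuator change, e.g. "DO_Extend := TRUE"
    duration_s: float    # timestamp difference of two consecutive logfile lines

@dataclass
class HighLevelState:
    sensor_vector: tuple                                 # unique sensor value combination, e.g. (0, 1)
    name: str                                            # named after the switching sensor, e.g. "DI_Retracted"
    substates: list = field(default_factory=list)        # actuator settings observed in this state
    subtransitions: list = field(default_factory=list)   # SubTransition objects

@dataclass
class Module:
    name: str                                            # e.g. "Separator1"
    states: dict = field(default_factory=dict)           # sensor_vector -> HighLevelState
    transitions: list = field(default_factory=list)      # high-level transitions (from, to)

@dataclass
class BehaviourModel:
    modules: dict = field(default_factory=dict)          # module name -> Module
```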

Fig. 3
figure 3

Excerpt of PLC environment view of I/Os of Separator1 (cf. Fig. 2); naming conventions for module-based clustering

Fig. 4
figure 4

Logfile for module Separator1 (left) with value changes marked grey/yellow and resulting state chart (right) (cf. Fig. 2) (colour figure online)

Fig. 5
figure 5

Data structure for behaviour model recording

Fig. 6
figure 6

Behaviour model recording concept during test execution

The system is divided into several modules (e.g., Separator1 as a module of the sorting system in Fig. 1) to reduce the complexity of the resulting behaviour model and thus increase its readability. A module is an independent unit with logical boundaries to other modules and encapsulates sensors and actuators to enable maintainability and reusability [32]. Modules can be identified using the naming conventions in the PLC code (cf. Fig. 3). Such name-based modularization focusing on system modules can reduce the state space stemming from the many sensor and actuator values within a system. This approach differs from previous approaches such as [2, 17]. For each module, a reduced logfile, consisting only of the I/Os related to this module, is extracted from the logfile recorded. Consecutive logfile lines that are identical except for the timestamp are removed. Based on the reduced module logfile, a state chart for the module is derived (cf. ① in Fig. 6). The current module states (“high-level-state” in Fig. 5) are identified by their unique sensor value combinations, e.g., state number 0 of Separator1 for the unique sensor vector DI_Extended = 0 and DI_Retracted = 1 (cf. Fig. 6, right). The model’s readability can be enhanced by naming the high-level states after the switching sensor (e.g., the state name “DI_Retracted” because the respective sensor value became TRUE). High-level transitions (cf. Fig. 5) connect the high-level states. The logfile contains several lines with the same unique sensor vector but different actuator values (outputs).

To differentiate these lines, substates of the high-level states are inserted and connected by subtransitions (cf. ② in Fig. 6, right). Subtransitions reflect changes in actuator values (outputs) that do not affect the unique sensor vector. For example, substates and subtransitions visualize resetting and setting actuator values within high-level state 2: DI_Extended. The duration between two subtransitions is determined by the timestamp difference between two consecutive logfile lines (cf. ③ in Fig. 6, right) and saved for later test planning. If the unique sensor vector and thus the high-level state changes, a subtransition ending in a final state is inserted within the previous high-level state, and reaching this final state is used as the transition condition between the two high-level states. Figure 4 depicts the resulting state chart for Separator1.
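The rule-based processing of a reduced module logfile described above can be sketched as follows. This is an illustrative simplification, not the authors' implementation; the logfile layout (a list of timestamped sensor and actuator value dictionaries) is an assumption.

```python
def derive_module_chart(log):
    """log: list of (timestamp_s, sensors, actuators) for one module,
    where sensors/actuators are {name: bool} dicts (assumed layout)."""
    states = {}           # unique sensor vector -> list of substates
    hl_transitions = []   # high-level transitions (from_vector, to_vector)
    sub_durations = []    # (sensor_vector, duration_s) of subtransitions
    prev = None
    for t, sensors, actuators in log:
        vec = tuple(sensors.values())                # unique sensor vector = high-level state
        if prev and vec == prev[1] and actuators == prev[2]:
            continue                                 # drop lines identical except for timestamp
        states.setdefault(vec, []).append(dict(actuators))  # one substate per output setting
        if prev:
            prev_t, prev_vec, _ = prev
            if vec != prev_vec:                      # sensor vector changed:
                hl_transitions.append((prev_vec, vec))       # high-level transition via final state
            else:                                    # outputs changed only:
                sub_durations.append((vec, t - prev_t))      # subtransition with duration
        prev = (t, vec, dict(actuators))
    return states, hl_transitions, sub_durations
```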

Based on the module state charts, a simplified plant logfile is derived to create the behaviour model of the whole plant (cf. Fig. 6, left). The simplified plant logfile contains only the current states of all modules (e.g., states 0–2 for Separator1) and, if applicable, additional variables that are not explicitly assigned to one specific module but are required for the whole plant behaviour (e.g., internal or global variables, such as one storing the workpiece type). These are considered as additional inputs during model derivation when identifying unique plant states.
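Under the same assumptions, deriving the simplified plant logfile could look as follows, where modules maps each module name to its sensor names (obtained from the naming conventions) and global_vars lists the module-independent variables; both parameters are hypothetical.

```python
def simplify_plant_log(log, modules, global_vars):
    # log: list of (timestamp_s, sensors, variables) lines (assumed layout).
    # Each plant-level line keeps only every module's current high-level
    # state (its sensor vector) plus the module-independent variables.
    plant_log = []
    for t, sensors, variables in log:
        line = {m: tuple(sensors[n] for n in names) for m, names in modules.items()}
        for v in global_vars:                  # e.g. a variable storing the workpiece type
            line[v] = variables[v]
        if not plant_log or plant_log[-1][1] != line:
            plant_log.append((t, line))        # keep value changes only
    return plant_log
```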

3.2 Handling non-deterministic sensor signal recording

In the following, the handling of timing uncertainties is discussed as an example of non-deterministic behaviour. Analysing the logfile for uncertainties, two types of timing uncertainty are identified, both caused by the uncertain timing of the shop floor devices under control.

The first timing uncertainty is observable in the logfile as varying state durations given the same state sequence. For example, Separator1 extends slightly slower than in a previous test run. This time deviation is on the time scale of the PLC’s cycle time and therefore negligible. This type of timing uncertainty is handled by adding a safety lead time [33] of #T0.005 to the state durations recorded.

The second uncertainty identified leads to pseudo-states between two logic states. In the application example, two sensors (optical: OptRamp1, inductive: IndRamp1) detect the workpiece type. As both sensors are installed in the same location, they theoretically detect and classify the workpiece simultaneously. In practice, latencies in signal acquisition and transmission may occur, so the sensor signals arrive asynchronously (cf. logfile excerpt in Fig. 7). As states are detected based on the (unique) sensor vector, this timing uncertainty leads to two different state sequences in the logfile if the same test is carried out twice. S1 and S3 are logic states, whereas S2 and S4 are pseudo-states, indicated by a duration within the time scale of the PLC’s cycle time: either OptRamp1 or IndRamp1 switches first and, after a very short duration (e.g., < #T0.005), a state with both sensor values TRUE follows. This timing uncertainty is handled by adding a transition between the two logic states, S1 and S3, and approximating the duration. The two intermediate states and their transitions could be merged or removed to increase the model’s comprehensibility. Here, they are retained to prevent accidentally deleting an originally recorded state and to continue monitoring the non-deterministic switching, e.g., to enable fault detection.
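A minimal sketch of this handling, assuming the recorded run is available as a list of states with durations and taking #T0.005 as the pseudo-state threshold (both assumptions):

```python
CYCLE_SCALE_S = 0.005  # threshold on the scale of the PLC cycle time (assumed)

def bridge_pseudo_states(run):
    """run: list of (state, duration_s). Returns bridging transitions
    (from_logic_state, to_logic_state, approx_duration_s) for states whose
    duration is below the threshold, such as S2/S4 in Fig. 7. The
    pseudo-states themselves are kept in the model for monitoring."""
    bridges = []
    for i in range(1, len(run) - 1):
        state, dur = run[i]
        if dur < CYCLE_SCALE_S:                  # pseudo-state between two logic states
            before, dur_before = run[i - 1]
            after, _ = run[i + 1]
            bridges.append((before, after, dur_before + dur))  # approximated duration
    return bridges
```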

Fig. 7
figure 7

Additional transition for non-deterministic sensor switching

4 Model application in normal operation—detecting deviations and fault handling

The state chart is derived from test cases in the behaviour model recording phase. Thus, the extent to which all relevant cases are covered depends on the quality and variety of the test cases used and the coverage they achieve. A continuous comparison of the behaviour model recorded with the actual system behaviour is required to detect deviating system behaviour and to add behaviour missing from the model. The system behaviour can be traced in the behaviour model during normal operation to a) visualize the current system state for the operator (e.g., by highlighting the current state in the state chart during operation) and b) monitor the system for deviations from the behaviour model (Fig. 8). In the following, “operator” refers to an on-site test engineer with a technical background and system comprehension. Depending on the type of deviation, either the behaviour model is adapted (desired deviation) or the operator must fix the system to avoid the deviation in the future (undesired deviation) (cf. Sect. 4.1). System faults are a special kind of behaviour deviation that interrupts the current system run. To proceed with the system run, an operator must first restore the system to a safe state (fault-handling routine) by activating specific actuators. This actuator setting can be recorded and added to the model, either as a fault-handling guide for other testers or to be replayed (cf. [19]) to enable automatic restoration (self-healing) during future system operation (cf. Sect. 4.2).

Fig. 8
figure 8

System tracking during operation (left) and model adaptation with new states or fault handling routine (right)

4.1 Detect deviations from the model in normal operation

Deviations between the actual system behaviour and the model are detected when an unknown system state or an unknown transition follows a system state known to the model. If the system behaves differently from the model, the respective new system states are recorded with the approach presented and marked as unknown states. After plant operation, visualizing the newly recorded states within the behaviour model enables operators to focus on the system parts where these deviations were detected. The naming conventions assist the operator in tracking the behaviour, yet understanding the system’s sensors and their effects is still required. In the literature [17, 22], deviations between the actual system behaviour and its model are handled as unintended behaviour (anomaly detection), e.g., due to wear or maintenance errors. By recording new states, operators can become aware of such unintended behaviour. The operator can discard the newly recorded states after fixing the system. As the behaviour models in this paper are derived during testing and not generated from the PLC code, deviant behaviour can additionally result from untested system parts. These parts may be extraneous code eligible for refactoring, or they may represent desired behaviour not covered by tests, e.g., due to undocumented changes during operation and thus missing requirements or test cases. In this case, the model can be extended by defining a suitable test case and recording it to add the respective system behaviour to the model.
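A minimal sketch of this deviation detection, assuming the recorded model is available as a set of known plant states and a set of known transitions (hypothetical layout):

```python
def monitor(known_states, known_transitions, observed_states):
    """Compare the observed state sequence against the model. Unknown
    states and transitions are collected (and would be recorded and
    flagged in the model) rather than stopping the plant."""
    deviations = []
    prev = None
    for state in observed_states:
        if state not in known_states:
            deviations.append(("unknown state", state))
        elif prev is not None and (prev, state) not in known_transitions:
            deviations.append(("unknown transition", (prev, state)))
        prev = state
    return deviations      # reviewed by the operator after plant operation
```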

4.2 Record fault handling routine to enable self-healing

Faults may occur during normal operation, e.g., due to wear and tear. Assuming slack between the conveyor and the workpiece in the application example, the workpiece is not yet at the correct position when the separator extends. Thus, the workpiece is trapped between the separator and the edge of the conveyor belt. The deviation from the behaviour model is detected if the separator’s sensor DI_Ex does not become TRUE within the usual time plus an optional tolerance time (e.g., after #T100, cf. Fig. 9, left). The manual recovery of the system after this blockage or fault can be recorded so that it can be automated for future occurrences. The current system state in the behaviour model is highlighted during behaviour tracking. Thus, the operator can see the current state, compare the sensor vector recorded with the actual plant state, and confirm to initiate the fault-handling routine that is to be recorded. The current (unknown) state is hence marked as an error state (cf. state with “e” in Fig. 9, left; realised by the additional attribute “errorState”, cf. Fig. 9, right). The operator retracts the separator until DI_Re is TRUE and moves the conveyor belt back until the optical sensor recognizes the workpiece again. Finally, the operator confirms to the system which previous state has been restored (e.g., state 0: DI_Retracted [0 1] in Fig. 9, left, as the fault-handling ends there). The initial error state and the recovery method (actuator commands and resulting sensor values) are recorded analogously to Sect. 3.1. The resulting path (transitions) is marked as “fault handling” and inserted into the behaviour model, starting at the error state and ending in the fault-handling end-state defined by the operator. If the now-known error state reoccurs, the fault-handling recorded in the state chart can provide guidance for inexperienced test engineers, especially when manual fault-handling steps are required; alternatively, the system can register the path marked as fault-handling and trigger it to heal itself automatically from the error state. Self-healing based on the recorded fault-handling routines is also usable during (semi-)automatic test case execution. If a test failure results in a known error state, the system can recover independently and continue testing instead of an engineer having to transfer the system to a safe state. Next, if necessary, it can replan the test steps, e.g., selecting test steps to retest the separator to find the fault’s source, or focusing on other test steps first. The automated restoration of a specific state, even after a fault, makes it possible to continue the test procedure despite a failure instead of aborting it entirely, and reduces the amount of time-consuming manual support.
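The timeout-based detection of the error state can be sketched as follows; the sensor-polling function and the concrete times are assumptions (cf. the usual duration plus tolerance, e.g. #T100, above):

```python
import time

def extended_in_time(read_di_ex, usual_s, tolerance_s=0.0):
    """Wait for DI_Ex to become TRUE within the recorded duration plus an
    optional tolerance. On timeout, the caller marks the current state as
    an error state (attribute "errorState", cf. Fig. 9) and may start
    recording the operator's recovery actions as in Sect. 3.1."""
    deadline = time.monotonic() + usual_s + tolerance_s
    while time.monotonic() < deadline:
        if read_di_ex():              # read_di_ex: hypothetical sensor read-out
            return True               # separator extended in time
        time.sleep(0.001)
    return False                      # deviation: mark error state, record fault handling
```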

Fig. 9
figure 9

Left: error detection (separator not extended after #T100) and fault handling recording for error state (retract separator); right: excerpt of adapted data structure for error state and handling

In the given scenario (workpiece gets trapped), automated self-healing is feasible by controlling specific actuators, and repeating the process may succeed on retry because the initial failure was caused by uncertainties such as time delays due to wear. However, recording the self-healing routine is impractical if manual intervention is required or if the error demands adjusting the program code. Also, rolling back the system is only suitable for well-defined fault-handling routines, to guarantee that the fault-handling itself does not worsen the system state, for example, by causing damage in case the fault-handling fails. A detailed investigation of whether, and at what risk, self-healing can be performed in a specific fault situation is subject to future work.

5 Using the behaviour model for model-based test step derivation and test scheduling

The behaviour model recorded shall be used to enhance test planning in regression testing. If an engineer marks the change-affected system states, the information from the behaviour model can be used to derive the test steps required for validation (cf. Sect. 5.1) and a time-efficient schedule of the respective test steps (cf. Sect. 5.2) (Fig. 10).

Fig. 10
figure 10

Test step derivation from the model after changes

5.1 Model-based test step derivation

The behaviour model recorded reflects the desired system behaviour (as defined in the requirements and validated by the test cases derived from the requirements). For regression testing, suitable test steps can be derived from the behaviour model (e.g., using coverage approaches as introduced in Sect. 2.3) and executed to check whether the system behaviour (the resulting sensor values) corresponds to the behaviour model recorded. For model-based test step derivation, an operator must mark the change-impacted high-level states in the behaviour model, e.g., by setting a state’s attribute “changed” to true. The visualization of the behaviour as a state chart and the naming conventions facilitate finding suitable start and end states for testing. In the following, test steps are derived from the transitions and states recorded. For example, it is assumed that Separator1 is currently in state 0: DI_Retracted [0 1] and that state 2: DI_Extended [1 0] was marked as changed, e.g., because the end position sensor was replaced. To check whether state 2 is detected in time by the new end position sensor, the actuator DO_Extend is set to TRUE, and it is checked whether the sensor DI_Extended becomes TRUE within the recorded time frame (here: #T50ms). The test case below is derived in IEC61131-3 structured text:

separator1(DO_Extend := TRUE, tDuration := T#50ms);

Assert.AreEqualBOOL(TRUE, separator1.DI_Ex, 'Extended');

The test case requires the separator to be in state 0: DI_Retracted [0 1] before its execution because otherwise it would lack the timing information required to verify that state 2 is detected within #T50ms. Assuming Separator1 is in state 1: Not_DI_Retracted [0 0] before the test case execution, the system under test (here: Separator1) must first be transferred to the state required for executing the test case. The necessary steps can be derived from the behaviour model, e.g., setting DO_Retract to TRUE to reach state DI_Retracted. The setup duration is uncertain, as it depends on how far the separator is currently extended, so #T50ms is taken as the worst-case duration.
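Deriving such setup steps from the model amounts to a shortest-path search over the recorded transitions. A sketch using Dijkstra's algorithm (the choice of algorithm and the data layout are assumptions; any shortest-path search over the state chart works):

```python
import heapq

def setup_path(transitions, start, goal):
    """transitions: {(from_state, to_state): duration_s} recorded in the
    model. Returns the worst-case setup duration and the state path from
    start to goal; infinite if no recorded path exists."""
    graph = {}
    for (a, b), d in transitions.items():
        graph.setdefault(a, []).append((b, d))
    queue, visited = [(0.0, start, [start])], set()
    while queue:
        cost, state, path = heapq.heappop(queue)
        if state == goal:
            return cost, path
        if state in visited:
            continue
        visited.add(state)
        for nxt, d in graph.get(state, []):
            if nxt not in visited:
                heapq.heappush(queue, (cost + d, nxt, path + [nxt]))
    return float("inf"), []            # no recorded path between the states
```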

5.2 State-based test scheduling

To save time during testing, the sequence of test steps to be executed can be optimized by considering their execution times (as recorded) and minimizing the setup duration between the test steps. As shown in Sect. 5.1, each test step requires the system to be set to a specific “initial state” (e.g. state 0) before execution and results in an “end state” (e.g. state 2). The setup duration between two consecutive test steps is the time needed to transition from the end state of one test step to the initial state of the next.

For instance, it is assumed that Separator1 is currently in state 0 and that two test steps are planned: one to check its extension (as described in Sect. 5.1) and another to check its retraction. Since the latter requires Separator1 to be fully extended prior to test execution, executing the first test step before the second is more time-efficient than fully extending Separator1 just to execute the second test step first. By comparing the transition times from the current state (0) of Separator1 to the states required for the test steps (0 or 2), the most time-efficient test sequence can be determined. Another example: test steps such as verifying the correct identification of the workpiece type based on the inductive and optical sensors at ramp1 and ramp2 should be executed before checking whether the optical end sensor at ramp3 detects the workpiece. Otherwise, the workpiece will move down ramp3 before the inductive and optical sensors can be tested, requiring the manual insertion of a second workpiece to verify these sensors and resulting in a loss of time and additional operator effort.

For complex systems with numerous system states and test steps to be scheduled, automated test scheduling approaches become necessary to identify a time-efficient test sequence. Building on a prior comparison of test case scheduling algorithms for aPS regarding their ability to approximate an optimal sequence within a reasonable computation time (cf. [4]), this paper uses a test case scheduling approach based on Bellman’s dynamic programming [34]. Firstly, the approach requires all possible system states and the durations needed to transition between them, derived from the recorded behaviour model and stored in a central matrix (cf. Fig. 11, ①). The transition duration is the sum of the durations of all transitions along the shortest path between the two states. In the case of an intermediate system state (cf. Sect. 5.1), the worst-case duration is taken. If no transition path with recorded timings exists between two states, the setup time between them is assumed to be infinitely long. Secondly, the approach requires a list of all test cases to be executed, grouped by their initial state (cf. Fig. 11, ②). Based on this information, the approach generates a test-sequence-table for each state, storing the ‘optimal’ test sequence for each available testing time (cf. column “Time” in the test-sequence-table in Fig. 11, ②).
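Building the central matrix ① from the recorded transitions is an all-pairs shortest-path computation. A sketch using the Floyd-Warshall algorithm (an assumed but standard choice):

```python
def transfer_matrix(states, transitions):
    """transitions: {(from_state, to_state): duration_s} from the model.
    Returns matrix[a][b] = shortest recorded transfer duration from a to b;
    float("inf") where no recorded path exists, as described above."""
    INF = float("inf")
    m = {a: {b: (0.0 if a == b else INF) for b in states} for a in states}
    for (a, b), dur in transitions.items():
        m[a][b] = min(m[a][b], dur)
    for k in states:                       # Floyd-Warshall relaxation
        for i in states:
            for j in states:
                if m[i][k] + m[k][j] < m[i][j]:
                    m[i][j] = m[i][k] + m[k][j]
    return m
```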

Fig. 11
figure 11

Determining a time-efficient test sequence based on the recorded behaviour model states using dynamic programming [4]

To determine an ‘optimal’ test sequence for a specific time slot, three data sets are compared per state (cf. Fig. 11, ③):

  1. (1) The ‘optimal’ test sequence according to the state’s test-sequence-table (cf. Fig. 11, ②) for one time unit less than the available testing time.

  2. (2) The ‘optimal’ test sequence according to another state’s test-sequence-table, considering the transition time from the state regarded to the respective other state.

  3. (3) One of the state’s test cases, followed by the ‘optimal’ test sequence according to the end state’s test-sequence-table for the testing time remaining after executing that test.

Thereby, only states in which test cases can begin are considered, to maintain low computational complexity. If the system is in a state that is not an initial state of any test case, only the data sets obtained through (2) are compared to find the ‘optimal’ test schedule. After comparing all resulting test sequences, the best one is selected and stored in the state’s test-sequence-table for the corresponding time slot (here: 5 s). By solving the test scheduling problem iteratively and storing the ‘optimal’ test sequences for each time step, the approach can reuse previous results, so the computational time increases linearly with the number of states rather than exponentially [34].
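The comparison of the three data sets can be sketched as the following dynamic program. It is a deliberately simplified illustration: times are discretized to whole units, the 'value' of a sequence is simply the number of tests executed, and re-selecting the same test is not yet prevented; a real implementation must refine these points.

```python
def build_tables(states, transfer, tests, horizon):
    # transfer: matrix as sketched above; tests: list of
    # (initial_state, end_state, duration) tuples in whole time units.
    # table[s][t] = best test sequence executable within budget t from state s.
    starts = {c[0] for c in tests}                 # only states where tests can begin
    table = {s: {0: []} for s in states}
    for t in range(1, horizon + 1):
        for s in states:
            best = table[s][t - 1]                 # (1) same state, one time unit less
            for s2 in starts:                      # (2) transfer to another state first
                dt = transfer[s][s2]
                if s2 != s and 1 <= dt <= t:
                    cand = table[s2][t - int(dt)]
                    if len(cand) > len(best):
                        best = cand
            for c in tests:                        # (3) run one of this state's tests
                init, end, dur = c
                if init == s and 1 <= dur <= t:
                    cand = [c] + table[end][t - dur]
                    if len(cand) > len(best):
                        best = cand
            table[s][t] = best
    return table
```

The 'optimal' sequence for the current state and a given time slot is then a simple lookup, e.g., table[current_state][available_time].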

6 Hardware setup for recording and evaluation

The technical process (here: the sorting plant) is controlled by a Beckhoff CX 2040 PLC with a Twincat runtime (cf. Fig. 12). Twincat uses the Automation Device Specification (ADS) communication protocol for real-time data exchange between Twincat components [35]. “ADS enables access to the process image, data exchange, access to I/O tasks [and] access by variable name” [35], without modifying the PLC code (cf. Constraint C3). The plugin TE1410 provides a Matlab/Simulink interface to the ADS communication channel, allowing Matlab/Simulink to access the retrieved I/O and variable data (cf. Fig. 12).
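For illustration, comparable read access is also possible from Python via the open-source pyads library. The following logging sketch is an assumption for illustration only (the evaluation setup uses the TE1410 Matlab/Simulink interface); the variable names and the AMS net id are placeholders.

```python
import csv, time
import pyads  # open-source ADS client for Beckhoff PLCs

IOS = ["GVL.DI_Extended", "GVL.DI_Retracted",
       "GVL.DO_Extend", "GVL.DO_Retract"]   # hypothetical variable names

def record_logfile(path, duration_s=60.0, period_s=0.001):
    """Cyclically read the listed I/Os via ADS, i.e. without modifying the
    PLC code (cf. C3), and write one timestamped line per value change."""
    plc = pyads.Connection("5.12.34.56.1.1", pyads.PORT_TC3PLC1)  # placeholder net id
    plc.open()
    try:
        with open(path, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["t_s"] + IOS)
            last, t0 = None, time.monotonic()
            while time.monotonic() - t0 < duration_s:
                row = [plc.read_by_name(n, pyads.PLCTYPE_BOOL) for n in IOS]
                if row != last:                        # log value changes only
                    writer.writerow([round(time.monotonic() - t0, 4)] + row)
                    last = row
                time.sleep(period_s)
    finally:
        plc.close()
```

Note that polling from an external client cannot guarantee capturing every PLC cycle; bundled reads or ADS device notifications would reduce the sampling jitter.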

Fig. 12
figure 12

System architecture to derive behaviour model during testing (top) and evaluation-specific realization (bottom)

As the Twincat runtime environment provides multiple isolated cores with real-time execution [35], the ADS client for Matlab/Simulink data access can operate on one of these isolated cores without influencing the real-time capability of the PLC (cf. Constraint C2). Matlab/Simulink is used to process the logfile, structure the data in the data structure introduced (cf. Fig. 5), and generate the behaviour model (cf. Fig. 13). Matlab/Simulink also provides a function to highlight the current state in the state chart based on the current sensor vector, enabling the operator to visually track the behaviour sequence.

Fig. 13
figure 13

Recorded model visualized in Matlab Simulink

Manually creating a plant simulation is time-consuming, often taking several hours to a few days even for relatively simple systems like the sorting system. Furthermore, consistency with the real system is not guaranteed. With the approach presented, a behaviour model can be created during test case execution within up to 30 min for the sorting system, with minimal additional effort (starting the recording and monitoring the results during test execution). The model recorded is less detailed and less comprehensible than a manually created one, but it is still adequate for a quick behaviour model derivation and the subsequent test case scheduling. Test cases and a time-efficient test schedule are created within a few minutes (for the application example), again reducing the manual effort. The approach is evaluated against the requirements defined in Sect. 1:

Readability (R1): The logfile recorded is transferred to a state chart to enhance readability for the operator. A state chart enables test engineers to see the possible paths to be tested within the model. Modularization approaches and naming states after the switching sensor can improve human readability. Yet, the model's readability suffers as the number of states (and substates) grows with the number of recorded sensor values, an issue also encountered by the (e.g., Petri-net-based) approaches presented in the state of the art (cf. Table 1). To handle the increasing model complexity, modularization of system parts can prove beneficial in future research, e.g., modelling closed plant parts separately based on naming conventions and representing the resulting submodels either in a hierarchical state chart (with several levels of detail) or in independent state charts (e.g., modelling independent modules such as the two separators in parallel).

Test step derivation (R2): As the behaviour model is recorded based on test case execution, the respective test cases can be rerun, or reduced test steps can be derived from the model (cf. Sect. 5.1). R2 is thus considered as fulfilled. However, the test steps derived are at unit-test-level, which might need to be improved for comprehensive testing, e.g., considering dependencies among system components.

Permissiveness and non-determinism (R3, R4): Permissiveness is partially achieved, particularly regarding the handling of non-determinism. The model relies exclusively on recorded test cases for test step derivation and scheduling; e.g., reduced setup times achievable through manual operation or system shortcuts are unknown to the model and thus not considered in the automatic test step scheduling. How well non-determinism is handled depends on the uncertainty types the approach can address. In the application example, two timing uncertainties are identified and handled. In other applications, different uncertainty types are possible. While recording the behaviour model, uncertainties can occur due to causality deviations, where some signal changes depend on the actual state of the plant. For those types of uncertainty, further handling methods must be integrated. Wazed et al. [33] provide an overview of uncertainty types and handling approaches, which could prove suitable but must be evaluated for the respective use case.

Expandability (R5): Different scenarios for expanding the model are discussed (cf. Fig. 8): adding a model deviation recorded during normal operation, recording a new test case, or recording a fault-handling routine. All require a previously recorded state as a starting point. The newly recorded data is added to the existing model using the data structure introduced (cf. Fig. 5), thus avoiding re-recording or re-generating the complete model.

Timing (R6): R6 is fulfilled, as the model recorded contains the timing information required for automated test step scheduling (cf. Sect. 5.2). Due to timing uncertainty, the times recorded may deviate slightly during normal operation. Hence, a safety lead time is specified (e.g., a range of #T40 to #T50), which also increases permissiveness. The test schedule then considers the worst-case time or is adapted dynamically if a test executes significantly faster (or slower) than expected.

Ability to detect deviations (R7): The approach presented allows for detecting deviations, but it demands manual effort to check the newly recorded states (cf. human in Fig. 8).

7 Conclusion and outlook

This paper’s main contribution is an approach to record a system behaviour model as a UML state chart based on the sensor and actuator values of discrete aPS. Unlike existing approaches (cf. Table 1), it considers non-deterministic sensor switching, is expandable with new behaviour or fault-handling routines without requiring a computationally intensive model re-generation, and contains information on typical state durations thanks to timestamps. Experts derive the test cases directly from the functional requirements; thus, the requirements are retained in the behaviour model recorded during test execution. After recording, the model is used for model-based test step selection and scheduling. Test scheduling approaches rely on data such as the existing states, transition times, and links between states and test cases. The behaviour model recorded enables the extraction of this crucial information, making the scheduling approach feasible and reducing the need for manual effort. Knowledge of the current system state also allows for adapting the test sequence during testing, starting from the current state, e.g., when a test case failure leaves the system in a state other than the expected one. The evaluation on the sorting system use case demonstrates the feasibility of model recording without influencing the real-time capability of the PLC, followed by the derivation of test steps for validation.

There is room for improvement regarding the model’s permissiveness, its readability, and the high manual effort of model expansion. The behaviour model recorded can be simplified through modularization (cf. Sect. 3.1). Future work shall evaluate the behaviour model’s readability and representation with on-site test engineers to determine a suitable visualization and interaction concept. This evaluation shall focus on which type of modularization makes sense when weighing comprehensibility against the number of sub-models to be created. Sensors and actuators are assigned to modules based on naming conventions (cf. Fig. 3) or manually. Alternatively, the sensors and actuators that belong to a module can be identified by recording which values change while operating that module individually. Sensors and actuators identified as independent of a module can then be omitted from its view, reducing the model’s complexity, and independent modules can be identified for in-parallel testing. Executing test steps in parallel leads to more time-efficient test scheduling. Some state-of-the-art approaches introduce concepts to consider parallel execution within a system’s behaviour model. Hence, future work could extend the approach presented to identify and use parallel system sequences and, consequently, independently executable test sequences. This requires high system modularity and modules that are independent of each other. In other application examples, system components might need to synchronize, e.g., when processing a resource and changing its properties in parallel, requiring shared variables. Shared variables cannot be clearly assigned to a single component, leading to triggers in both components’ behaviour models. In this case, another hierarchy level between the classes “behaviour model” and “module” (cf. Fig. 5) could become necessary to differentiate joint and individual actions; alternatively, to avoid complicating the behaviour model recording process, a holistic behaviour model including both components could be defined. This paper also briefly showed the possibility of integrating self-healing into the model using the recording approach (cf. Sect. 4.2). However, not all fault scenarios are suitable for automation, particularly those requiring human intervention or posing risks of further errors or damage. Thus, future research must explore which scenarios are suitable for automation, considering factors such as risk and feasibility.