1 Introduction

A cyber-physical system (CPS) consists of multiple computing devices that communicate with each other and interact with the physical world in a feedback loop. It is vital to ensure that CPSs work correctly as intended, especially in safety-critical applications like aircrafts, automobiles, and medical devices [4, 41, 42].

Model checking is a powerful, computer-assisted technique to analyze CPS correctness [11, 19]. It involves formalizing system behavior in mathematical models, describing system requirements in logic specifications, and analyzing whether specifications are satisfied in models through complete verification algorithms or incomplete falsification algorithms. Verification algorithms prove system correctness automatically but are typically limited to CPS with few dimensions and simple dynamics [16, 18, 30, 35, 36, 54, 55]. Falsification algorithms, on the other hand, satisfy weaker forms of completeness but can handle complex system dynamics and disprove system correctness via counterexamples [3, 8, 27, 33, 53, 63, 66]. As a result, they are of practical interest and are the main focus of model checking in industrial applications [19, 20, 53].

However, with real-world CPSs increasingly deployed in uncertain environments and dynamically reconfigured during runtime [1, 2, 23, 32, 37, 59], their formal modeling and falsification face great challenges.

Challenge 1: CPS modeling is complicated due to changing agent behaviors and dynamic system topologies. For one thing, in a reconfigurable CPS, each agent’s role, function, and behavior may vary across scenarios to meet changing requirements. Therefore, it needs to be frequently modeled, especially when there are many different scenarios. For another, agents are usually dynamically wired due to changing demands or limitations. Some agents may even be part of the system in one scenario but leave in another. Traditional modeling languages, however, lack flexible mechanisms to formalize dynamic topology shifts.

Challenge 2: CPS falsification requires better scalability and efficiency due to the growing size and complexity of reconfigurable CPSs. Models of reconfigurable CPSs become huge and may even unavailable at design time. This calls for more scalable falsification approaches with improved runtime performance.

For challenge 1, we designed a hierarchical scenario-based modeling language for reconfigurable CPSs. To simplify agent modeling, we introduced a template model to formalize agent inherent features, including potential dynamics and communication capabilities. Based on template models, an instantiation mechanism is provided so that users can model and update agent runtime behaviors easily. Further, to capture the dynamically wired behaviors of multiple agents, we introduced communication configurations and scenario task configurations for instantiated agent models. Users can formalize agents’ intricately connected behaviors through brief communication configurations, and abstract their dynamic topologies flexibly through intuitive scenario task configurations.

For challenge 2, we developed a path-oriented falsification approach for reconfigurable CPSs, searching for counterexample system behavior along generated paths. For the vast behavior space along each path, we designed a two-layered, classification-model-based optimization method to explore it effectively. Besides, since system simulation and robustness calculation are most time-consuming during falsification, we introduced concepts of time-context and lifespan for temporal specifications, to cut unnecessary system simulations and robustness calculations that do not affect the satisfaction of specifications.

The main contributions of this work are as follows:

  • We designed a scenario-based hierarchical modeling language to formalize reconfigurable CPSs. We also extended a topology-aware temporal logic to specify their system requirements. (See Sect.3)

  • We developed a path-oriented approach to falsify temporal logic specifications for CPSs. It employs classification-model-based optimization and lifespan-guided robustness calculation to improve efficiency. (See Sect.4)

  • We implemented a tool called SNIFF, which supports scenario-based graphical modeling and falsification for reconfigurable CPSs. Studies show that it can greatly reduce modeling time and improve modeling accuracy. Experiments demonstrated its scalability and efficiency in falsification. (See Sect.5)

2 Background

2.1 Preliminaries

Hybrid Automaton (HA) is a popular formal language for modeling tightly coupled discrete and continuous CPS behaviors. Dense-time formalisms like Signal Temporal Logic (STL) serve as the basis for specifying CPS requirements.

Definition 1

(Hybrid Automaton [5]). An HA is a tuple \(H=(X, U, Q, F, Inv, E, G, R)\), where X are continuous variables; U are external inputsFootnote 1; Q are discrete operating modes; \(F=\{F_{q}|~{q}\in Q\}\) are flow functions, defining the flow of variables in mode q by differential equation \(\dot{X}=F_{q}(X,U)\); \(Inv=\{{Inv}_{q}|~{q}\in Q\}\) are invariant conditions, constraining conditions of mode q by constraints \({Inv}_{q}(X,U)\); \(E\subseteq Q\times Q\) are discrete transitions, denoting jumps of modes; \(G=\{G_e|~e\in E\}\) are guard conditions on transitions; \(R=\{R_e|~e\in E\}\) are reset functions on transitions, resetting variables by \(X:=R_e(X,U)\).

Semantics: The state space of H is \(S=Q\times X\). Its state, denoted by the pair \(s=(q,x)\in S\), evolves in the following two ways. It can stay in the current mode q and evolve continuously based on the flow function \(F_q\), within the space constrained by the invariant condition \({Inv}_q\). It can also jump from the mode q to \(q^\prime \) discretely and instantaneously through transition \(e=(q,q^\prime )\in E\) when its guard condition \(G_e\) is met, and be reset by the function \(R_e\).

Note that under nondeterministic semantics, HA can choose arbitrarily between staying in the current mode and jumping to another mode, as long as conditions of current mode and target transition are both satisfied. It can also choose among multiple transitions when their guard conditions are all met [44].

Definition 2

(Signal Temporal Logic [46, 47]). Let \(\mathcal {X}\) denotes variables defined over domain \(\overline{\mathbb {R}}^n\), where \(\overline{\mathbb {R}}\)=\(\mathbb {R}\cup \{\top ,\bot \}\) is the totally ordered set of real numbers with the smallest and greatest boolean elements \(\bot \) and \(\top \), \(\top =-\bot \), \(-\top =\bot \). Let \(w:\mathbb {T}\rightarrow \overline{\mathbb {R}}^n\) denotes a multi-dimensional signal of \(\mathcal {X}\), with \(\mathbb {T}=[0,t)\subseteq \mathbb {R}\). The syntax of an STL formula \(\varphi \) interpreted over \(\mathcal {X}\) is defined as:

\(\varphi :=\text {true}\ |\ \theta (\mathcal {X})\ge d\ |\ \lnot \varphi _1\ |\ \varphi _1\vee \varphi _2\ |\ \varphi _1\textbf{U}_\textbf{I}\varphi _2\).

Here, \(\theta :\overline{\mathbb {R}}^n\rightarrow \mathbb {R}\) denotes a function that maps \(\mathcal {X}\)’s valuations into a real, and d is a constant. \(\textbf{U}\) denotes the until operator and \(\textbf{I}\subseteq \mathbb {R}^+\) denotes a timing interval, which can be omitted when it is \([0,\infty )\). In a formula \(\varphi \), other standard operators can be derived as follows: \(\text {false}\equiv \lnot \text {true}\), \(\varphi _1\wedge \varphi _2\equiv \lnot (\lnot \varphi _1\vee \lnot \varphi _2)\), \(\varphi _1\rightarrow \varphi _2\equiv \lnot \varphi _1\vee \varphi _2\), \(\lozenge _I\varphi \equiv \text {true}\textbf{U}_\textbf{I}\varphi \), and \(\square _I\varphi \equiv \lnot \lozenge _I(\lnot \varphi )\).

Semantics: For a given signal, an STL formula \(\varphi \) has both boolean and quantitative semantics [29, 46]. Its boolean semantics indicates \(\varphi \)’s validity on this signal at some given time t via a binary satisfaction relation, while its quantitative semantics indicates how well this signal satisfies \(\varphi \) at time t via a real-valued robustness degree. Due to space limitations, we only provide quantitative semantics here. Given an STL formula \(\varphi \), a signal w of variables \(\mathcal {X}\), and time t, the quantitative semantics of its robustness degree \(\rho (\varphi ,w,t)\) is defined by induction:

$$\begin{aligned} \rho (\textit{true},w,t)= & {} \top \nonumber \\ \rho (\theta (\mathcal {X})\ge d,w,t)= & {} \theta (w(t))-d\nonumber \\ \rho (\lnot \varphi _1,w,t)= & {} -\rho (\varphi _1,w,t) \\ \rho (\varphi _1\vee \varphi _2,w,t)= & {} \max \{\rho (\varphi _1,w,t),\rho (\varphi _2,w,t)\} \nonumber \\ \rho (\varphi _1\textbf{U}_\textbf{I}\varphi _2,w,t)= & {} \!\!\sup _{t'\in t\oplus \textbf{I}}\min \{\rho (\varphi _2,w,t'),\!\!\!\inf _{t''\in [t,t')}\!\!\!\!\rho (\varphi _1,w,t'')\}\nonumber \end{aligned}$$
(1)

Here, \(\oplus \) denotes the Minkowski sum, \(t\oplus \textbf{I}=\{t+i|i\in \textbf{I}\}\). This quantitative semantics is used in most STL falsification. The degree is positive only if the signal satisfies the formula, and the higher the better.

Fig. 1.
figure 1

The Formation Target of the Multi-UAV System in Each Flight Mission

2.2 Motivating Example: A Multi-UAV System

Consider a reconfigurable CPS with eight UAV (Unmanned Aerial Vehicle) agents (\(u_1\)-\(u_8\)). Each agent’s position, speed, and acceleration are continuous variables, with communication delay as an external input. The system undertakes various flight missions based on runtime demands. To execute missions, agents collaborate through different communication strategies and composition architectures (e.g., classical virtual structures, leader-follower, behavior-based, and consensus-based ones [26]). Below is a example of its runtime mission sequence (M1-M4), with formation details in Fig. 1.

M1:

All UAVs communicate with each other to locate their 3-D central point O. They then calculate and move to their target positions, forming a clockwise ring around the center O with a radius of \(r_1\), at the same height of O.

M2:

It is similar to M1, but forms a counter-clockwise ring with a radius of \(r_2\).

M3:

Upon receiving commands from the ground control center, odd-numbered and even-numbered UAVs adjust their speed to \(v_1\) and \(v_2\), respectively.

M4:

It employs a hierarchical leader-follower architecture for tree formation. Phase 1: according to the leader \(u_1\), followers \(u_{2-4}\) change their heading angles and move to desired relative positions. Phase 2: \(u_{2-4}\) act as leaders for \(u_5\), \(u_{6-7}\), and \(u_8\) respectively, forming the third row of the tree similarly.

Fig. 2.
figure 2

HA Models of \(u_3\) in Scenarios with Differently Activated Functions

Although all agents are homogeneous in this case, each plays a different role. Moreover, even the same agent behaves differently across scenarios, and therefore, has different models. For example, in the models of \(u_3\) shown in Fig. 2(a)-(c),

  • in M1, it cruises while waiting for others’ states. Once received, it calculates control parameters to adjust its position to the target location accordingly.

  • in M3, it cruises while waiting for ground commands, and then adjusts velocity.

  • in the phase 2 of M4, it cruises while broadcasting its state to followers, \(u_{6-7}\).

These functions are embedded in UAV codes, activated dynamically at runtime.

Furthermore, the topology of the entire multi-UAV system changes dynamically, resulting in a complex and vast system model. In M1 and M2, vehicle-to-vehicle communication exists among all agents for central point calculation. While in M3, agents are divided into two groups, equipped with ground-to-vehicle communication. In M4, system topology even shifts inside the mission and becomes hierarchical. Across scenarios, system models should be structured differently, and their shifts should also be formally modeled as a whole. Besides, specifying system requirements, especially topology-related ones, is also difficult. Many extra intermediate variables are required to describe system topology. We will address these modeling and specifying challenges in the next section.

3 Scenario-Based Formalism for Reconfigurable Systems

In this section, we present our formalism for reconfigurable CPSs with dynamic behaviors, compositions, and system topologies. It includes scenario-based system behavior modeling and topology-aware system property specifying.

3.1 Scenario-Based System Modeling

Our modeling language A) provides static template models for agents, along with an instantiation mechanism to customize each agent’s actual internal behavior in each scenario; B) offers communication configurations to formalize multiple agents’ interconnected behaviors in each single scenario; C) introduces scenario task configurations to describe system dynamic topologies in multiple scenarios. Finally, in D), we conclude the hierarchy of our modeling language.

A. Modeling Agent Runtime Behavior by Template Instantiation.

Definition 3

(Template Hybrid Automaton). A template-HA is a tuple \(H^T=(X^T, U, Q, F, Inv, E^T, G, R, \alpha , \beta )\), where

  • \(X^T\) are variables classified by the function \(\alpha \): \(X^T\rightarrow \{X_{inp}, X_{out}, X_{local}\}\). \(X_{inp}\) denotes global input variables, whose values can be updated by other agents during communication. \(X_{out}\) denotes global output variables that others can read during communication. \(X_{local}\) refers to private local variables.

  • \(E^T\) are transitions classified by the function \(\beta \): \(E\rightarrow \{E_{global}, E_{local}\}\). Global transitions are public to other agents, therefore, communicating with others on global transitions is allowed. Local transitions are fixed in the agent’s implementation, and are private to the agent itself.

  • UQFInvGR are parameters defined identically as HA in Definition 1.

Template-HA formalizes the inherent features implemented in an agent. It constrains the agent’s potential modes, transitions, and communication capabilities, including available communication ports, content, and channels, denoted as \(X_{inp}\), \(X_{out}\), and \(E_{global}\) respectively. To further constrain the agent’s actual runtime behaviors in a specific scenario, we define Instance Hybrid Automaton.

Fig. 3.
figure 3

An Overview of the Template-HA Model for UAV Agents

Definition 4

(Instance Hybrid Automaton). An instance-HA is a tuple \(H^I=(X, U, Q, F, Inv, E, G, R, \alpha , \beta , \gamma )\), which can be instantiated from the template-HA \(H^T =(X, U, Q, F, Inv, E, G, R, \alpha , \beta )\) through a transition activation function \(\gamma \): \(E_{global}\rightarrow \{E_{act}, E_{deact}\}\).

Semantics: The state of \(H^I\), denoted as \(s=(q,x)\in Q\times X\), evolves either continuously in mode q or discretely by jumping to mode \(q^\prime \) as defined in Definition 1. However, when it jumps via a global transition \(e=(q,q^\prime )\), there is an extra constraint: the transition e must be activated by the function \(\gamma \) as \(\gamma (e)=E_{act}\).

Fig. 4.
figure 4

Multi-UAV System Modeling Snippet in a Structured Configuration File (Color figure online)

This instantiation mechanism improves template-HAs’ flexibility and reusability, enabling adaptation to changing demands and reuse across scenarios by agents of the same type. For instance, in the multi-UAV system, agents are homogeneous and share the same template-HA, shown in brief in Fig. 3. It has three modes and five transitions. Variables are categorized by function \(\alpha \). For example, agent positions are global output variables, serving as permissible communication content among agents, and inps are global input variables as legal communication ports. Transitions are classified by function \(\beta \), with global ones marked by dotted arrows, including comm, commP, and commV. Agents can be registered with ID and template-HA as shown in the purple box (line 2) in Fig. 4. During mission M4, when \(u_3\) is leading \(u_{6-7}\), the comm transition should be activated in \(u_3\)’s instance-HA to inform its followers of its position, while \(u_{6-7}\)’s commP transitions should be activated for target calculation and position control. It can be concisely configured, as shown in the green box (line 15) in Fig. 4.

B. Modeling Agent Composition by Communication Configurations

Agents are usually complexly coupled rather than isolated in a CPS. To formalize agents’ composition, we define communication items and configurations for them.

Definition 5

(Communication Item). Given two runtime agents modeled as \(H^I_1\)=\((X_{1}, U_{1}, Q_{1}, F_{1}, Inv_{1}, \) \(E_{1}, G_{1}, R_{1}, \alpha _{1}, \beta _{1}, \gamma _{1})\) and \(H^I_2\)=\((X_{2}, U_{2}, Q_{2}, F_{2}, Inv_{2},E_{2}, G_{2}, R_{2}, \alpha _{2}, \beta _{2}, \gamma _{2})\), where \(X_1\cap X_{2}=\emptyset ,\ U_1\cap U_2=\emptyset ,\ E_1\cap E_2=\emptyset ,\ Q_1\cap Q_2=\emptyset \), a communication item for them is a tuple \(c= (H^I_1,e_{1},I,H^I_2,e_{2},O,R_{e_1}^\prime )\), where

  • \(e_{1}\in E_{1},\ e_{2}\in E_{2},\ \beta _{1}(e_{1})=\beta _{2}(e_{2})=E_{global},\ \gamma _{1}(e_{1})=\gamma _{2}(e_{2})=E_{act}\);

  • I, O are ordered sets, \(|I|=|O|,\)Footnote 2 elements in I must be global input variables in \(H^I_1\), while elements in O must be global output variables in \(H^I_2\), i.e.,

    \(\forall x\in I,\ x\in X_1 \bigwedge \alpha _{1}(x)=X_{inp}\); \(\forall x\in O,\ x\in X_2 \bigwedge \alpha _{2}(x)=X_{out}\);

  • \(R_{e_1}^\prime \) is an additional reset function for variables in \(X_1\).

A communication item formalizes a composition relationship between two agents. Under the item c, global transitions \(e_{1}\) and \(e_{2}\) are bound together as a communication channel, expected to happen simultaneously under their guard conditions. Once they happen, communication contents in O will be received by ports in I in sequence. The additional reset function \(R_{e_1}^\prime \) will be executed next, followed by agents’ original reset functions.

Definition 6

(Communication Configuration). Given two agents modeled as \(H^I_1\) and \(H^I_2\) as above, a communication configuration C=\(()|(c)|(c_1,\cdots ,c_n)\), where \(c_i\)=\(({H^I_1}_i,{e_{1}}_i,I_i,{H^I_2}_i,{e_{2}}_i, O_i,{R_{e_1}^\prime }_i)\) is a communication item for \(H^I_1\) and \(H^I_2\).

  • \(C=()\): \(H^I_1\) and \(H^I_2\) are independent;

  • \(C=(c)\): \(H^I_1\) and \(H^I_2\) are connected by the item c as Definition 5;

  • \(C=(c_1,\cdots ,c_n)\): for any \(c_i\) and \(c_j\)(\(1\!\!\le \!\! i\!\!<\!\!j\!\!\le \!\! n\)), if \({e_{1}}_i\cap {e_{2}}_i\cap {e_{1}}_j\cap {e_{2}}_j\ne \emptyset \), then all these transitions are expected to happen simultaneously, together with the communication between \(I_i\) and \(O_i\), \(I_j\) and \(O_j\). Otherwise, transitions \({e_{1}}_i\), \({e_{2}}_i\) are bound together as Definition 5, so are transitions \({e_{1}}_j\) and \({e_{2}}_j\).

Semantics: Communication configuration, consisting of none, one, or multiple items, formalizes two agents’ composition. Their composition under a communication configuration can be formalized as another larger instance-HA as follows.

Definition 7

(Instance-HA Composition). Given two instance-HA \(H^I_1=(X_{1}, U_{1}, Q_{1}, F_{1}, Inv_{1}, E_{1}, \) \(G_{1}, R_{1}, \alpha _{1}, \beta _{1}, \gamma _{1})\) and \(H^I_2=(X_{2}, U_{2}, Q_{2}, F_{2}, Inv_{2}, E_{2},G_{2}, R_{2}, \alpha _{2}, \beta _{2}, \gamma _{2})\), where \(X_1\cap X_{2}=\emptyset ,\ U_1\cap U_2=\emptyset ,\ E_1\cap E_2=\emptyset ,\ Q_1\cap Q_2=\emptyset \), the composition of \(H^I_1\) and \(H^I_2\) under communication configuration C can be modeled as instance-HA \(H^I =(X, U, Q, F, Inv, E, G, R, \alpha , \beta , \gamma )\), where

  • \(X=X_1\cup X_2=\{X_1,X_2\}\) are variables, classified by \(\alpha (x)=\bigl \{ \begin{array}{ll} \alpha _1(x), &{} x\in X_1 \\ \alpha _2(x), &{} x\in X_2 \end{array}.\)

  • \(U=U_1\cup U_2=\{U_1,U_2\}\) are external inputs; \(Q=Q_1\times Q_2\) are discrete modes;

  • \(F=\{F_{(q_{i},q_{j})}|~(q_{i},q_{j})\in Q\}\) are flow functions, where \(\bigl \{ \begin{array}{c} \dot{X_1}=F_{1q_{i}}(X_1,U_1)\\ \dot{X_2}=F_{2q_{j}}(X_2,U_2) \end{array}.\)

  • \(Inv=\{{Inv}_{(q_{i},q_{j})}={Inv}_{1(q_{i})}\cap {Inv}_{2(q_{j})}|~(q_{i},q_{j})\in Q\}\) are invariant conditions;

  • \(E\subseteq Q\times Q\) are discrete transitions constructed in the following two ways:

    for \(\forall e_a=(q_a,q_a')\in E_1, \forall e_b=(q_b,q_b')\in E_2\),

    1) if \(\exists c= (H^I_{1},e_{1},I,H^I_{2},e_{2},O,R_{e_{1}}^\prime )\in C\), \(\{e_{a},e_{b}\}\)=\(\{e_ {1},e_{2}\}\) or \(\{e_{a},e_{b}\}\)=\(\{e_ {2},e_{1}\}\):

    • \(e_c\)=\(((q_a,q_b),(q_a',q_b'))\in E\), classified by \(\beta (e_c)\)=\(E_{global}\), activated by \(\gamma (e_c)\)=\(E_{act}\);

    • \(G_{e_c}\)=\(G_{e_a}\cap G_{e_b}\) is the guard condition, and \(R_{e_c}\) resets \(X:=R_{e_c}(X,U)\) as:

      1. *

        \(X:=\{R_{e_1}(R_{e_1}^\prime (R_{I/O}(X_1),U_1),U_1),R_{e_2}(X_2,U_2)\}\) when \(\{e_{a},e_{b}\}=\{e_ {1},e_{2}\}\);

      2. *

        \(X:=\{R_{e_2}(X_1,U_1),R_{e_1}(R_{e_1}^\prime (R_{I/O}(X_2),U_2),U_2)\}\) when \(\{e_{a},e_{b}\}=\{e_ {2},e_{1}\}\);

      where function \(R_{I/O}(x)=\bigl \{ \begin{array}{cl} O(i) &{}, I(i)=x \\ x &{},x\in X-I \end{array}.\)

    2) otherwise:

    • \(e_c=((q_a,q_b),(q_a',q_b))\in E\), \(e_d=((q_a,q_b),(q_a,q_b'))\in E\);

    • \(\beta (e_c)=\beta (e_a)\), \(\gamma (e_c)=\gamma (e_a)\), \(G_{e_c}=G_{e_a}\), \(R_{e_c}(X):=\{R_{e_a}(X1,U_1),X_2\}\);

    • \(\beta (e_d)=\beta (e_b)\), \(\gamma (e_d)=\gamma (e_b)\), \(G_{e_d}=G_{e_b}\), \(R_{e_d}(X):=\{X_1,R_{e_b}(X2,U_2)\}\).

The composition of multiple agents can be formalized by applying Definition 7 recursively. With communication configurations, agents’ static compositions are efficiently configured, and always adhere to their inherent communication capabilities. For example, as shown in the orange box (line 17-22) in Fig. 4, communication configuration defines the composition of \(u_3\) and its followers \(u_{6-7}\) in mission M4, where \(u_{6-7}\) gets \(u_3\)’s position to calculate their own target positions.

C. Modeling Dynamic Typology by Scenario Task Configurations

In a reconfigurable CPS, agents usually interconnect dynamically, and the system topology shifts across scenarios during runtime. To capture this dynamic feature, we introduce atom tasks and scenario tasks below.

Definition 8

(Atom Task). Given a system with multiple agents indexed by unique IDs, an atom task of the system is a tuple AT=\(({A},{\mathbb {H}}^I, {Q}^{0}, {Q}^{f},\mathbb {C})\), where

  • \({A}=\{A_i\}^n_{i=1}\) are the IDs of all involvezd agents in the current scenario;

  • \({\mathbb {H}}^I\)=\(\{H^I_i\}^n_{i=1}\) are involved agents’ instance-HAs, \(H^I_i\) instantiated from \(H^T_{A_i}\);

  • \({Q}^{0}\)=\(\{Q^{0}_i\}^n_{i=1}\), \({Q}^{f}\)=\(\{Q^{f}_i\}^n_{i=1}\)are agents’ initial and final modes, \({Q}^{0}_i,{Q}^{f}_i\in Q_{A_i}\);

  • \(\mathbb {C}\)=\(\{C_i\}^{m}_{i=1}\) are communication configurations defined on \({\mathbb {H}}^I\).

An atom task describes a scenario, detailing active agents, behaviors, and compositions. It begins with agents evolving simultaneously from initial modes. It ends once all agents have reached final modes. For example, the scenario where \(u_3\) leads \(u_{6-7}\) in mission M4 is described in the blue box (line 13-23) in Fig. 4.

Besides, we offer a template mechanism for atom tasks to enhance task reusability. In a task template, involved agents’ types, behaviors, and compositions are defined, leaving their IDs and some const variables for customization. For example, missions M1 and M2 can be customized by the same task template.

Definition 9

(Scenario Task). A scenario task ST=AT|(STST)|(ST||ST), where (STST) denotes the sequential execution of two scenario tasks, while (ST||ST) denotes the parallel execution of two scenario tasks. Therefore, it can either be a single atom task or a sequential/parallel compositional task.

Agents involved in the two subtasks of a parallel compositional task should be disjoint. Besides, if an agent is involved first in atom task \(AT_1\) and then in \(AT_2\), its initial mode in \(AT_2\) should be consistent with its final mode in \(AT_1\).

Scenario task specifies the flow of scenarios recursively, which in turn, configures the dynamic shifts of system topologies. Therefore, it is also called scenario task configuration in this work. For example, as shown in the red box (line 5-10) in Fig. 4, the dynamic topology of the multi-UAV system in mission M1-4 is defined by scenario task configurations. The scenario task \(T_{mission}\)=\(((T_{ring};T_{speed});T_{tree})\) represents the sequential execution of three scenario tasks: \(T_{ring}\) for M1-2, \(T_{speed}\) for M3, and \(T_{tree}\) for M4. See Fig. 8a for a graphical illustration.

Fig. 5.
figure 5

The Hierarchy of Our Scenario-based Modeling Language

D. Hierarchical Modeling for Reconfigurable CPSs

The hierarchy of our modeling language is summarized in Fig. 5. The design idea is to abstract agents’ inherent features into template-HA models, so that reconfigurable CPSs can be modeled through simple instantiations and configurations.

Agent internal runtime behavior is modeled by an instance-HA, which is instantiated from its inherent template-HA model. Multiple agents’ interconnected behaviors are modeled by the composition of their instance-HAs, which are connected by communication configurations. Then, the behavior of a reconfigurable CPS in a specific scenario is modeled by an atom task, with its involved agents’ compositional instance-HAs. Finally, the CPS’s runtime behaviors in changing scenarios with dynamic topologies are structured by scenario task configurations.

3.2 Specifying System Requirements in Topology-Aware STL

Topology-aware STL is a domain-specific extension of STL, designed to specify requirements for multi-agent systems with dynamic topologies. Although it shares the same expressiveness as STL, it facilitates the construction and evaluation of specifications related to system topologies.

Definition 10

(Topology-Aware STL). A topology-aware STL formula \(\varphi \) interpreted over boolean, integer, and real variables \(\mathcal {X}\)=\(\{\mathcal {X}_{b},\mathcal {X}_{i},\mathcal {X}_{r}\}\) is defined as:

\(\varphi :=\text {true}\ |\ \psi \ |\ \lnot \varphi _1\ |\ \varphi _1\vee \varphi _2\ |\ \varphi _1\textbf{U}_\textbf{I}\varphi _2\), with the atomic predicate \(\psi \) as

\(\psi :=End[T]\ |\ Mode[A]=d_q\ |\ \theta (\mathcal {X}_{r})\ge d\ \), where

\(End[T]\in \mathcal {X}_{b}\) indicates whether a scenario task T has been completed. \(Mode[A]\in \mathcal {X}_{i}\) denotes the mode index of an agent A, while \(d_q\) is an integer mode index constant. Real variables \(\mathcal {X}_{r}\) consist of agents’ variables, external inputs, and scenario tasks’ run time, denoted as Time[T] for the scenario task T. Function \(\theta \): \(\overline{\mathbb {R}}^n\rightarrow \mathbb {R}\) maps valuations of \(\mathcal {X}_r\) into real numbers, and d is a constant.

Semantics: Given a topology-aware STL formula \(\varphi \), a signal w of \(\mathcal {X}\), and time t, the quantitative semantics of its robustness degree \(\rho (\varphi ,w,t)\) is defined as Definition 2, except the robustness of atomic predicates is defined as below:

$$\begin{aligned} &\rho &(End[T],w,t)= \left\{ \begin{array}{cl} 0 &{},\text { { T} is a completed atom task}\\ -1 &{},\text { { T} is an uncompleted atom task}\\ (\sum \limits _{i=1}^{2}{\rho (End[T_i],w,t)})/2 &{},\ T=(T_1;T_2)\text { is a sequential task }\\ \min \limits _{i\in [1,2]}\rho (End[T_i],w,t) &{},\ T=(T_1||T_2)\text { is a parallel task }\\ \end{array}\right. \end{aligned}$$
(2)
$$\begin{aligned} &\rho &(Mode[A]=d_q,w,t)= -\textbf{Dist}_A(Mode[A](t),d_q) \end{aligned}$$
(3)
$$\begin{aligned} &\rho &(\theta (\mathcal {X}_{r})\ge d,w,t)=\theta (w(t))-d \end{aligned}$$
(4)

The robustness of End(T) indicates scenario task T’s completion extent. The robustness of the predicate \(q(A)=d_q\) indicates agent A’s distance to the target mode \(d_q\). The function \(\textbf{Dist}_A\): \(Q_A\times Q_A\rightarrow \mathbb {Z}^+\) maps two modes in agent A to the minimum number of hops required to jump from the first mode to the second by transitions in A. The robustness degree is negative only if the signal of variables violates the formula, and it decreases as the severity of the violation increases. In this work, signals are interpreted according to system trajectories. When the trajectory is clear, we use a simpler notation \(\rho (\varphi )\), instead of \(\rho (\varphi ,w,0)\), to denote the robustness of formula \(\varphi \) on trajectory signal w at time zero.

Topology-aware STL is augmented with customized variables, predicates, and semantics. Predefined elements for task duration, completion, and mode identification help to describe topology-related requirements while avoiding introducing loads of extra variables, constraints, and functions to system models. For example, Table 1 lists some topology-aware STL specifications for the multi-UAV system modeled in the scenario task \(T_{mission}\). Furthermore, topology-aware STL provides more meaningful quantitative semantics with better robustness degrees. This can facilitate CPS falsification by offering more effective guidance.

Table 1. Examples of Topology-Aware STL Specifications
Fig. 6.
figure 6

The Path-oriented Optimization-based Falsification Framework

4 Path-Oriented Optimization-Based System Falsification

4.1 Falsification Framework

For a reconfigurable CPS modeled in a scenario task T and its requirement specified as a topology-aware STL \(\varphi \), falsification aims to find a witness trajectory that violates \(\varphi \), i.e. specification \(\varphi \)’s robustness is negative on this trajectory. We denote a counterexample pair as \((x_0,w_\mu ,p,\tau )\), under which the trajectory is a witness. Here, \(x_0\) represents the initial value of agents’ variables, \(w_\mu \) represents the signal of external inputs, p represents the path of agents’ mode and transition sequence, and \(\tau \) represents agents’ dwelling time in modes along p.

To fully explore both discrete and continuous dimensions of the vast search space, our falsification approach uses a path-oriented strategy, as shown in Fig. 6. At the bottom, it explores discrete space, generating paths based on the discrete structure of agents and scenario tasks. On the top, it explores continuous space for each generated path p, performing path falsification to find an optimal pair \((x_0,w_\mu ,\tau )\) that minimizes the specification \(\varphi \)’s robustness on trajectories along p. The falsification problem is solved when the minimal robustness is negative.

Our SAT-based path generation and optimization-based path falsification are given next. For brevity, we will only consider models with nondeterministic semantics, since deterministic models can be viewed as a simplified case.

4.2 Path Generation for Hierarchical Scenario Tasks

For a scenario task, its path determines how its involved agents change their modes and make their transitions during it. Below is the definition of path.

Definition 11

(Path of Scenario Task). For a given scenario task ST, if

  • ST=AT=\(({A},{\mathbb {H}}^I, {Q}^{0}, {Q}^{f},\mathbb {C})\) is an atom task: a path of it with length l is a tuple p=\((\{\mathcal {M}^i,\mathcal {T}^i\}^{l-1}_{i=0},\mathcal {M}^l)\), where \(\mathcal {M}^i\)=\((\mathcal {M}^i_1,...,\mathcal {M}^i_{|A|})\) denotes all agents’ modes in the i-th step, and \(\mathcal {T}^i\)=\((\mathcal {T}^i_1,...,\mathcal {T}^i_{|A|})\) denotes their next transitions after this step. For each agent \(A_k\in A\), the tuple p should satisfy:

    • agent discrete structure: \(\mathcal {M}^i_k\) should be a mode of agent \(A_k\)’s instance-HA \({{H}}^I_{k}\in \mathbb {H}^I\); \(\mathcal {T}^i_k\) can either be a transition that jumps from \(\mathcal {M}^i_k\) to \(\mathcal {M}^{i+1}_k\) in \({{H}}^I_{k}\) or a stutter transition \(\texttt{S}\) that indicates no mode shift.

    • atom task requirement: agent \(A_k\) should start from its initial mode \({Q}^{0}_k\) and end in its final mode \({Q}^{f}_k\). If \(A_k\) is connected to another agent by a communication item in \(\mathbb {C}\), their synchronized transitions should occur simultaneously.

  • ST is a sequential task: its path is the concatenation of its subtasks’ paths.

  • ST is a parallel task: its path is the combination of its subtasks’ paths.

A composition scenario task’s paths are produced by recursively generating its subtasks’ paths and then concatenating or combining them. An atom task’s paths are generated by SAT encoding, solving, and decoding.

We transform the length-bounded path generation for atom tasks into a SAT problem. Constraints for paths with lengths no longer than m are encoded as the propositional formula set \(\mathcal {F}^m\) in Eq.(8). Specifically, constraints on each agent’s discrete structure are encoded as Eq.(5)-(6), and requirements on the atom task are encoded as Eq.(7). To solve this SAT problem, advanced SAT solvers like Glucose [9] and CryptoMiniSat [60] can be used, and then the solutions can be decoded into paths. For instance, ((Cruise, Cruise, Cruise), (comm, commP, commP), (Cruise, Position Control, Position Control), (S, goHover, S), (Cruise, Cruise, Position Control), (S, S, goHover), (Cruise, Cruise, Cruise)) is a path of the atom task \(AT_{middle}\) in the running example.

(5)
(6)
(7)
(8)

4.3 Optimization-Based Falsification for Paths

Path falsification is achieved by optimizing for a trajectory that minimizes the target specification’s robustness along the path. However, this optimization problem can be high-dimensional, nonlinear, nonconvex, and challenging, due to the complexity of agents, topologies, and scenarios involved. Therefore, we developed a two-layered classification-model-based optimization method for it, along with an efficient robustness calculation algorithm.

Two-Layered Classification-Model-Based Optimization. In Fig. 6, the first layer optimizes for the best pair of initial values and external inputs that minimizes robustness. For each pair \((x_0,w_\mu )\), the second layer performs multi-objective optimization for the best dwelling time \(\tau \) that minimizes robustness while maximizes the number of reachable modes along the path. \(\tau \) is evaluated by the two objective functions’ weighted sum, i.e. \(\min f\)=\(w_1 \rho (\varphi )+w_2 \rho (p)\ (|w_1|\)+\(|w_2|\)=\(1, w_1\) >\(0, w_2\) \(0)\), with weighting coefficients updated dynamically during optimization.

Both optimization layers are solved with classification models. Unlike classical heuristic optimization methods with weak theoretical guarantees or poor scalability, classification-model-based optimization has a grounded theory about complexity and convergence [39, 64]. It has been theoretically proven to solve problems with local-Lipschitz continuity in polynomial time, and empirically proven to be scalable to high-dimensional problems. Its basic idea is to learn a classification model iteratively to discriminate bad solutions from good ones.

We employ a hyper-rectangle \(M_1\) in the first layer as the classification model to represent the positive region, and a hyper-rectangle \(M_2\) in the second layer as Fig. 6. In both layers, solutions are sampled from corresponding models, evaluated accordingly, and classified into positive/negative ones to refine models. The sampling, evaluating, and model refinement cycles iterate to enhance classification models’ accuracy and optimize solutions. Our two-layered optimization enables accurate model refinement and alleviates the scalability issue.

Efficient Robustness Calculation. During falsification, system simulation and robustness calculation are compute-intensive and time-consuming. We introduce time-context and lifespan in temporal logic to reduce them.

Definition 12

(Time-Context, Lifespan). Given a target temporal logic specification \(\varphi \), for any formula \(\varphi '\) in it (including \(\varphi \) itself), its time-context \(\mathcal {C}(\varphi '|\varphi )\) is the period during which its satisfaction affects \(\varphi \)’s satisfaction, its lifespan \(\mathcal {L}(\varphi '|\varphi )\) is the period during which the signals affect its satisfaction in its time-context. We omit the target specification \(\varphi \) when it is clear. Relations between \(\varphi '\) and its subformulas, in terms of time-context and lifespan are:

$$\begin{aligned} \varphi ':=true {} & {} \!\!\!\!\!\!\!\!\rightarrow & \ \mathcal {C}(true)=\mathcal {C}(\varphi ') {} & {} \mathcal {L}(\varphi ')=\varnothing \end{aligned}$$
(9)
(10)
Fig. 7.
figure 7

An Example of Lifespan Calculation for the Target Specification \(\varphi \)

To compute the target specification \(\varphi \)’s robustness, we only need to simulate and record system trajectory within \(\varphi \)’s lifespan. This helps to avoid unnecessary trajectory simulations, saving computation time and memory. The lifespan of the target specification \(\varphi \) is computed as defined in Definition 12. We demonstrate an example of it in Fig. 7. Note that the time-context of \(\varphi \) is [0, 0], as our goal is to evaluate \(\rho (\varphi ,w,0)\), the target specification’s quantitative satisfaction on the trajectory signal starting at time zero. As shown in blue boxes in Fig. 7, starting with \(\varphi \)’s time-context [0, 0], the time-context of all subformulas is computed from top to bottom along the parse tree. Next, we get the lifespan of all atomic predicates according to Eq.(9)–(10). Finally, the lifespan of all formulas, including the root \(\varphi \), is computed from bottom to top as yellow boxes.

Algorithm 1:
figure a

Time-Context Guided Robustness Calculation

During the robustness calculation of the target specification, its subformulas’ time-context also can be utilized, avoiding unnecessary robustness calculations. Efficient dynamic programming algorithms have been developed for temporal logic robustness calculation [34, 56]. Our algorithm follows the dynamic programming principle and further improves efficiency by considering the time-context of formulas in the target specification. Pseudocode appears in Algorithm 1.

This algorithm calculates the robustness of all formulas in \(\varphi \)’s parse tree at each timestamp in the simulated trajectory, constructing a dynamic programming table \(\mathcal {R}\) to store the robustness of formula \(\phi [i]\) at time \(\vec{t}[j]\) in \(\mathcal {R}[i,j]\). It calculates robustness for atomic predicates, higher-level subformulas, and the root specification sequentially (line 3-19). Each is calculated from the last timestamp backwards to the initial timestamp (line 4-22). It skips unnecessary robustness calculations for timestamps outside the formula’s time-context (line 5-6), i.e., the formula’s satisfaction at that point does not affect the target specification’s satisfaction. For timestamps inside the time-context, the formula’s robustness is calculated by reusing its robustness at future timestamps and its subformulas’ robustness (line 8-19), according to its robustness definitions in Eq.(1)–(4).

5 Implementation and Evaluation

5.1 Implementation and Research Questions

We have implemented our SceNarIo-based reconfigurable CPS modeling and FalsiFication approach in a tool called SNIFFFootnote 3 in C++. It provides a graphical user interface (GUI) for modeling, supporting the construction of agent template models and agents’ runtime instantiations, communication configurations and scenario task configurations. It also provides path-oriented CPS falsification, displaying witness paths and signals.

Below are research questions to evaluate our modeling and falsification approach:

  • RQ1: Can SNIFF help users reduce modeling time for reconfigurable CPS?

  • RQ2: Can SNIFF help users improve CPS modeling accuracy?

  • RQ3: Is SNIFF effective and efficient at CPS falsification? In particular, how is its scalability and runtime performance?

Next, we introduce the reconfigurable CPSs used in our evaluations in detail.

Example 1: Multi-UAV System. The motivating example system is used for our evaluations. Initially, eight UAVs are scattered in a closed space, \(x,y \in [-10, 10]\), \(z \in [1, 10]\). During runtime, they are dynamically connected with uncertain communication delays of \(d \in [0.05, 0.2]\). Its behavior is modeled by the scenario task \(T_{mission}\), as shown in Fig. 8a. Atom tasks are in grey. Sequential and parallel compositional tasks are drawn in white-filled rectangles with square and rounded corners, respectively. Besides, for fair modeling evaluations, the flight mission was slightly modified as shown in Fig. 8b. Four UAVs are involved and its scenario task includes three compositional tasks and six atom tasks.

Fig. 8.
figure 8

Scenario Tasks for the Multi-UAV System

Example 2: Automatic Assembly System. It consists of four conveyors, three robots with grippers, and six robots with welding guns for car production. Conveyors transport floor, door, and roof panels along the assembly line, starting at \(x\in [-2,0]\). Grabbing robots communicate with conveyors with a delay of \(t\in [0,0.1]\), and wheel around to grasp, move, and place doors and roofs. Welding robots work together to spot weld joints in \(x\in [8,12],y\in [8,12]\).

Its behavior is modeled by the scenario task \(T_{assembly}\) as Fig. 9. Initially, in \(T_{conveyGrasp}\), conveyors and grabbing robots collaborate to convey and grasp four panels in parallel. Then, in task \(T_{move}\), grabbing robots first move doors and then slot the roof. Finally, in \(T_{spotWeld}\), welding robots simultaneously weld joints while grabbing robots return home once relevant joints are welded, ready for the next assembly.

Fig. 9.
figure 9

Scenario Tasks for the Automatic Assembly System

Experimental Setup. To evaluate SNIFF’s modeling time and accuracy, we conducted a user study comparing it to another tool called SpaceEx [36], which is a well-established platform for hybrid system modeling and verification. SpaceEx’s modeling language, called SX [21, 28], is based on compositional hybrid automata, presented as base or network components. Both tools provide a graphical modeling editor with automatic user action logging, which allows us to evaluate modeling time accurately. Also, graphical models are saved automatically in their respective modeling languages, enabling our checks for modeling accuracy.

This user study was conducted with 111 computer science department students, including undergraduates, graduates, and PhDs. They received a 50-minute training session on CPS modeling and were required to model a multi-UAV system, described as Fig. 8b, using both tools in a week. Participants were given with a UAV template model and the templates for half atom tasks (in the blue box in Fig. 8b) in SNIFF format. They are also provided with base components in SpaceEx formats, containing the same information as the SNIFF templates we provided. Specifically, it contains information on all potential dynamics of a UAV agent and agent compositional information in three atom tasks.

To evaluate SNIFF’s falsification performance, we conducted a experiment comparing it with S-​TaLiRo [8], an advanced optimization-based CPS falsification tool. We ran S-​TaLiRo with its SOAR [48] option, which outperformed all of its other optimization options in our benchmarks. Systems are modeled in ‘hautomaton’ class as its input. We falsified various temporal specifications for the above two example systems on both tools. We conducted all experiments 100 times on the same PC (Intel Core i5-12500, 16GB RAM) with a time limit of 1800 s. The system simulation limit is 1000 iterations for deterministic benchmarks and 25000 for non-deterministic ones. We did not use SpaceEx for nonlinear systems’ temporal logic falsification comparison since it mainly focuses on reachability verification for hybrid systems with piecewise affine dynamics.

5.2 Experimental Evaluation and Analysis

Table 2. Time Costs to Model Different System Elements in SNIFF and SpaceEx

Time Cost of Modeling. According to participants’ modeling logs, models were constructed in an average of 46.9 min in SNIFF, while it took 308.1 min, 6.6 times longer, in SpaceEx. To further analyze how time was saved in SNIFF, Table 2 compares their time cost in different modeling elements.

  • Agent internal behavior was modeled 13 times faster by SNIFF than by SpaceEx. SNIFF’s agent-instantiation mechanism allows quick updates to agent internal behavior, while SpaceEx requires additional components to be built.

  • For both tools, modeling system behaviors in atom tasks took the longest. On average, it took 7.6 min with SNIFF when no task template was given, while nearly half an hour with SpaceEx. It indicates that communication configurations in SNIFF were helpful in defining agent composition in atom tasks.

  • With task templates, atom tasks’ modeling time was reduced by half in SNIFF, from 7.6 min to 3.5 min. SNIFF’s support for task templates makes agent composition more convenient.

  • To describe topologies in compositional scenario tasks, it averagely took 1.6 min in SNIFF, but 24.2 min in SpaceEx. Dynamic topologies can be set quickly by SNIFF’s scenario task configurations, but challenging in SpaceEx.

The Answer to RQ1: SNIFF’s template models, together with its flexible instantiation and configuration mechanisms, simplify the modeling of reconfigurable CPS, saving considerable modeling time and effort for users.

Accuracy of Modeling. Among the models submitted by participants, 70.27% of SNIFF models were accurate, while only 7.21% of SpaceEx models were accurate. We analyzed all submitted modeling errors and classified them into four major types in Table 3. It lists their sources and probabilities, which are counted per source to eliminate the influence of the number of sources.

Table 3. Probabilities of Different Modeling Errors in SNIFF and SpaceEx
  • Compared to SpaceEx, SNIFF reduces all types of modeling errors significantly.

  • In SpaceEx, more than half (59.46%) of system behaviors with parallel topology shifts were incorrectly modeled, whereas SNIFF’s scenario task configurations simplify this modeling process and eliminate such errors completely.

  • In SNIFF, modeling agents’ compositions has the highest error rate (6.91%). Our additional statistics found that only 13% of these errors occur when task templates are used, indicating that task templates can help reduce such errors.

The Answer to RQ2: SNIFF significantly improves the accuracy of CPS models constructed by users, reducing all types of modeling errors.

Table 4. Falsification Results of Example Systems in SNIFF and S-TaLiRo

Falsification Performance. Two example systems are falsified under various scenarios, ranging from the simplest to the most complex. Table 4 lists the size of each scenario task, including the number of atom tasks. Scenarios within 20 s are highlighted in green, others in red. We listed the falsification performance of SNIFF and S-​TaLiRo for each specification, including success rates, and costs of time and simulation iterations to find a witness. We also calculated the average falsification time for each atom task, presented in grey.

Generally, SNIFF supports falsification for models with richer semantics and can handle a wider range of specifications. The multi-UAV system follows deterministic semantics, but the assembly system follows nondeterministic semantics, which is not specifically supported in S-​TaLiRo. Out of the total thirty specifications, six are unsupported in S-​TaLiRo and are underlined in the table.

  • Effectiveness: On average, SNIFF’s success rate is 99.7%, whereas S-​TaLiRo is below 45%, indicating that our path-oriented falsification approach helps explore search space effectively. Especially, SNIFF’s success rate is at least 97% in all cases, while S-​TaLiRo achieved this in only two cases.

  • Efficiency: SNIFF is more efficient than S-​TaLiRo. It takes 0.1–39.2 s to falsify these specifications, while S-​TaLiRo takes an average of 730.9 s, more than 12 min. Also, in the 16 cases that S-​TaLiRo supports, SNIFF averagely requires only one-fifth as many simulation iterations as S-​TaLiRo.

  • Scalability: As the scenario grows in size and complexity, system behaviors become more intricate, and falsification inevitably becomes challenging and time-consuming. The table shows that S-​TaLiRo’s success rate decreases greatly in complex scenarios, whereas SNIFF’s success rate barely drops. The rise in SNIFF’s time-cost and simulation iterations is also reasonable.

  • Runtime Performance: CPSs may need online modeling and analysis due to runtime reconfiguration [13]. To evaluate our performance for runtime atom task scenarios, we roughly divide the falsification time by the number of atom tasks, as highlighted in grey. It indicates that SNIFF can handle such scenarios in around 1.7 s on average. Since automatic modeling in our language barely costs time, we can handle the entire online configuring, modeling, and falsification process in seconds, improving system runtime safety.

The Answer to RQ3: SNIFF is highly effective, efficient, and scalable for temporal logic falsification in CPSs. It also demonstrates satisfactory runtime performance, handling runtime scenarios with a single atom task in seconds.

5.3 Threats to Validity

In this section, we discuss possible threats to the validity of our user study.

Internal Validity. is the assessment of modeling accuracy. In our user study, to evaluate modeling accuracy accurately, we restricted names of agents and tasks in SNIFF and components in SpaceEx. In this way, SNIFF/SpaceEx models could be automatically checked by examining their scripts. Another threat is that participants’ familiarity with the target CPS could affect modeling. After collecting participants’ modeling logs (automatically logged by both tools), we examined the order in which participants used the tools. It shows that 60% of participants used SNIFF first, and 84% of participants completed modeling in SNIFF first. This may introduce bias into our modeling evaluation.

External Validity. The main external threat arises from the representative of subjects of our user study. Formal modeling is typically conducted by engineers with relevant expertise. To conduct the user study with representative subjects, we recruited 111 participants from a course called Formal Languages and Automata, ensuring that they had a basic understanding of modeling and were close to real-world industry target users. Another threat is that our conclusions might not generalize to all reconfigurable CPS. We alleviate the threat by deriving the Multi-UAV system from a real-world case. Results from real-world CPS will help us evaluate our approach better. Additionally, we plan to conduct larger-scale experiments on more real-world reconfigurable CPSs in the future.

6 Related Work

Formalizing CPS is complex due to its hybrid and compositional nature. Formal models, such as hybrid automata [38, 45] and hybrid process algebras [12, 22], are used to formalize the tangled continuous and discrete behaviors in CPS. Further, compositional hybrid models formalize the behavior of multiple agents, by composing hybrid automata [6, 45], processes [15], etc. Based on these formalisms, various modeling languages have been developed for multi-agent systems, including Unified Modeling Language (UML) based notations such as AUML [40], Constraint Logic Programming (CLP) notations [50, 51], and Verse [43], a Python library for systems with multiple agents moving on a map, etc. However, for reconfigurable multi-agent CPS whose system topologies change dynamically at runtime [37, 59, 62], these models and notations do not provide sufficient mechanisms to support such flexibility. Although there have been works on modeling reconfigurable systems, they focus mainly on general architectural modeling [10, 17, 58]. In contrast, this work aims to provide a behavior modeling formalism that can effectively capture the reconfigurability of these systems.

CPS falsification is an effective way to detect system behaviors that violate specifications. Many falsification approaches search for witness behaviors by applying optimization algorithms to minimize robustness, including optimistic optimization [65, 66], Bayesian optimization [25, 61], classical heuristic-based methods [7, 49, 52, 57], etc. Our falsification approach also falls under this category. Its most closely related work is [63], which also applies classification-model-based optimization [39, 64]. However, it is limited to the safety falsification of a single agent. In contrast, we support broader temporal logic falsification for CPSs with multiple dynamically wired compositional agents. Apart from optimization-based ones, there are also other falsification approaches, such as motion-planning-based falsification [24, 31, 53], gradient-based falsification [14], and learning-based falsification [3].

7 Conclusion and Future Work

We present a hierarchical formal modeling language for reconfigurable CPSs. It provides templates for agent inherent features and allows for the formalization of reconfigurable CPS through agent instantiations, agents’ communication configurations, and scenario task configurations. While it doesn’t add expressivity beyond HA, its flexible hierarchy can simplify the complex modeling process, saving time and improving accuracy. Our future plans involve providing support for more CPS architectures and system topology variations.

We also propose a path-oriented falsification approach for reconfigurable CPSs. It employs a two-layered optimization approach to explore the search space effectively and cuts unnecessary calculations to improve efficiency. Experiments have shown that it can perform scalable falsification with high success rates. In the future, we plan to further enhance its runtime performance by leveraging offline and past online analysis results.