Incremental execution of temporal graph queries over runtime models with history and its applications

Modern software systems are intricate and operate in highly dynamic environments for which few assumptions can be made at design-time. This setting has sparked an interest in solutions that use a runtime model which reflects the system state and operational context to monitor and adapt the system in reaction to changes during its runtime. Few solutions focus on the evolution of the model over time, i.e., its history, although history is required for monitoring temporal behaviors and may enable more informed decision-making. One reason is that handling the history of a runtime model poses an important technical challenge, as it requires tracing a part of the model over multiple model snapshots in a timely manner. Additionally, the runtime setting calls for memory-efficient measures to store and check these snapshots. Following the common practice of representing a runtime model as a typed attributed graph, we introduce a language which supports the formulation of temporal graph queries, i.e., queries on the ordering and timing in which structural changes in the history of a runtime model occurred. We present a querying scheme for the execution of temporal graph queries over history-aware runtime models. Features such as temporal logic operators in queries, the incremental execution, the option to discard history that is no longer relevant to queries, and the in-memory storage of the model, distinguish our scheme from relevant solutions. By incorporating temporal operators, temporal graph queries can be used for runtime monitoring of temporal logic formulas. Building on this capability, we present an implementation of the scheme that is evaluated for runtime querying, monitoring, and adaptation scenarios from two application domains.


Software System Engineering Group, Brandenburg Technology University, Cottbus, Germany

Introduction
The Model-Driven Engineering (MDE) approach to software development focuses on managing the complexities arising in every stage of the software development cycle [22]. In MDE, models constitute abstractions of complex issues originating in the problem-domain which are systematically transformed into implementation-domain artifacts via computer-based technologies [43]. Modern software systems are intricate and operate in highly dynamic environments for which few assumptions can be made at design-time. This setting has sparked an interest in sophisticated approaches to system monitoring and adaptation at runtime, i.e., during the system execution, which aim to mitigate uncertainty. At the same time, these approaches have highlighted the complexities of these activities. Owing to the effectiveness of models at design-time, numerous initiatives have surfaced [103] that use models capturing a snapshot of the system constituents as well as their state, i.e., runtime models [16,21], to realize monitoring [24,28] and adaptation [41,47,107] solutions.
Thus far the primary concern of these solutions has been the attainment and handling of runtime models which represent the most recent snapshot; the evolution of the model, i.e., its history, has been generally neglected [16,17], although its key role in enabling more informed decision-making during the system lifetime [38,45] and postmortem analysis [14] has long been recognized. Only recently did solutions surface which exploit this potential, e.g., via detection of recurrent behavior patterns and behavior explanation [44], analysis of temporal requirements [84], or inference of probabilistic adaptations based on past interactions [40].

Fig. 1 Overview of system and InTempo interaction
Representing and utilizing history pose significant challenges to solutions which rely on runtime models: representing history requires the representation of a collection of snapshots (rather than only the most recent one) through which changes to the model can be tracked; utilizing history relies on the ability to query history in an effective and intuitive manner, i.e., on the ordering as well as the (real) time in which changes in the structure of a snapshot occurred. The challenges are only exacerbated in case the model is to be queried online, i.e., while the system is running, and the monitoring or adaptation solution depends on the query answer for taking a remedial action or planning an adaptation.
Based on these challenges, we elicit the following requirements for solutions which handle the history of a runtime model: (R1) an encoding of history which represents multiple snapshots as well as enables tracing the occurrence and timing of changes in-between snapshots; (R2) a query language which supports statements on the model structure as well as the ordering and (quantitative) timing in which structural changes occur; (R3) fast query execution, as for monitoring or adaptation scenarios query answers might be necessary for decision-making at runtime; (R4) a memory-efficient representation which may save resources and expedite query executions.
Following the common practice of representing a runtime model as a typed attributed graph, we present a query language and a querying scheme which together fulfill the requirements above. An abstract overview of the system interaction with the scheme, named InTempo, is shown in Fig. 1. The language incorporates a temporal logic defined on graphs, thus enabling the formulation of temporal graph queries, i.e., graph-based model queries with ordering and timing constraints on the occurrence of graph structures (R2). The querying scheme, i.e., a collection of inter-dependent operations, iteratively processes timed events, i.e., changes to the system which represent its history, and makes the corresponding modifications to a model-based history encoding R (R1). After each modification, InTempo executes input queries on R and returns the answers to the system.
Regarding fast query executions (R3), we present an operationalization framework which automatically maps a temporal graph logic formula to a network of simple graph sub-queries therefore enabling incremental execution of temporal graph queries. For memory-efficiency (R4), we present an optional method which, based on the timing constraints of the formula, derives a window during which model elements are relevant to query executions. Elements outside the window can be pruned from the model thus reducing its size while overall returning the same results as un-pruned models.
To demonstrate the effectiveness of InTempo we present an implementation based on the Eclipse Modeling Framework [36,101]. The implementation is integrated with a feedback control loop which instruments system adaptation, i.e., an adaptation engine [66], and evaluated via simulations based on a case-study of a smart medical system, a real medical guideline, and a combination of real and synthetic event logs. The logs are used for simulations in which query answers from InTempo are used while planning adaptations. The case-study is of particular relevance as in healthcare, real-time requirements are key to medical procedures [26] and therefore fast query executions are necessary. The performance of the implementation, i.e., its fulfillment of R3 and R4, is compared to a relevant tool for runtime monitoring as well as a tool from the MDE community. Moreover, we use data generated by the LDBC Social Network Benchmark [75] to test the querying performance of the implementation against larger and more complex graph structures.
This article is an extension of a paper published in MODELS '20 [94]. Besides improvements to the presentation, structure, and technical explanations, the following contributions are novel with respect to its precursor: (i) the provision of formal arguments on the correctness of answers returned by InTempo; (ii) the expansion of the concept of discarding history, i.e., definition of projected answer over partial history representations and support for multiple queries; (iii) the generalization of the usage of temporal graph queries for monitoring temporal properties against the history of a system, which was treated ad hoc in [94]; (iv) an additional comparison to a tool from the MDE community capable of querying history; (v) a newly introduced evaluation based on an independent benchmark; and (vi) an extension of the discussion of related work.
The rest of the paper is organized as follows. Section 2 discusses the case-study and the foundations of InTempo. A technical overview of the scheme is presented in Sect. 3. The introduced query language for temporal graph queries is presented in Sect. 4, while Sect. 5 presents the operationalization framework which enables incremental query execution. Section 6 presents a method to discard elements that are not relevant to query executions. Section 7 contains the extensions made to InTempo to be suitable for intuitive monitoring of future temporal properties, and Sect. 8 presents the application of the scheme in a self-adaptation scenario. We evaluate the performance of our implementation in Sect. 9, discuss related work in Sect. 10, and conclude the paper as well as discuss future work in Sect. 11. Appendix contains technical preliminaries, proofs of formal arguments, and supplemental information on the evaluation.

Prerequisites
This section introduces the case-study from the medical domain, which we use as a running example in the remainder. Moreover, this section presents the foundations of InTempo.

Smart healthcare system
The case-study is based on a service-based simulated Smart Healthcare System (SHS). The SHS is based on smart medical environments [92] where sensors periodically collect physiological measurements of patients, i.e., data such as temperature, heartbeat, and blood pressure, and certain medical procedures are automated and performed by devices, such as a smart pump administering medicine, based on the collected patient measurements, as a clinician would otherwise do. Figure 2 depicts the metamodel of the SHS (based on the well-known Ecore syntax [36]) which defines valid model instances [19]. The SHS metamodel is based on the exemplar of a service-based medical system in [109] and captures the running system as an instance of the Architecture class.
In the SHS, each patient is connected to a sensor. Services are invoked by a main service called SHSService to collect measurements from sensors, i.e., PMonitoringService, or take medical actions via smart medical devices such as a pump, i.e., DrugService. Invocations are triggered by effectors (Effector) and invocation results are tracked via probes (Probe). Probes are generated periodically or upon events in the real world. Each Probe has a status attribute whose value depends on the type of Service. Each Service has a pID attribute which identifies the patient for whom the Service is invoked. Elements in gray are explained later in the article.

History and time domain
We identify the system behavior with a (possibly infinite) sequence of instantaneous timed events which represent observable actions or state changes made by the system or its context at some time point. The system has a clock whose time domain is the set of non-negative real numbers $\mathbb{R}_0^+$. An element of the time domain is called a time point.
Intuitively, the history of a system with respect to a timed event is the sequence of all observed timed events up to and including said timed event. Technically, the history corresponds to a finite prefix of the behavior consisting of pairs $(e_i, \tau)$ with $e_i$ a timed event from a set of possible observations of interest $E$, $i \in \mathbb{N}$ the index position of the event, and $\tau \in T = \mathbb{R}_0^+$ the time point of occurrence. We use the shorthand $e_\tau$ when the index position is irrelevant and the shorthand $\tau_i$ to denote the time point at position $i$. Note that, for presentation purposes, we group all changes with the same time point in one event. However, we require that time in the history eventually diverges, i.e., ruling out Zeno behaviors and ensuring that no event groups an infinite amount of changes. For example, for three events $e_2$, $e_4$, $e_5$, the encompassing history at time point 5 is denoted $h_5 := e_2 e_4 e_5$. When a new event $e_7$ occurs, the history is incremented by a concatenation to reflect the new history at time point 7, i.e., $h_7 = h_5 \cdot e_7$.

Runtime models and history
A (structural) Runtime Model (RTM) is a snapshot of the constituents of the modeled system and their state [15,21]. An RTM of the SHS is an instance of the metamodel in Fig. 2. RTMs are causally connected to the modeled system: if the RTM is modified, the system mirrors the modification and vice versa. The causal connection between the system and the RTM may be utilized to alter the system via model transformation [49] where model queries search for parts of the model that are to be altered via in-place transformations which correspond to desired alterations to the system. By virtue of the causal connection, the alterations are enacted on the modeled system.
A history can be represented by a sequence of RTMs. In this representation, each member is associated with the time point of an event $e_\tau$ and mirrors the changes corresponding to $e_\tau$ in the system. For instance, represented by RTMs, $h_5$ corresponds to the sequence $h^G_5 := G_2 G_4 G_5$ (see Fig. 4); each RTM is yielded by an event, is associated with the time point of the yielding event, and extends its predecessor by the changes corresponding to the event. The RTM $G_2$ extends the empty model $\emptyset$ which is at the start of every such sequence. For spawning an RTM based on an event, we assume that there exists a mapping from the set of events $E$ to corresponding model modifications. In the example, according to the mapping, event $e_7$ corresponds to the association of $s$ with the newly added $pm_2$ and the disassociation of $s$ from the deleted $d_1$; when the event occurs, owing to the causal connection, it is mirrored in the RTM. To include the latest change, the history representation $h^G_5$ is extended by the RTM $G_7$, i.e., $h^G_7 = h^G_5 \cdot G_7$, as illustrated in Fig. 4.
A Runtime Model with History (RTM$^H$) [95] is an enhanced RTM that simultaneously provides two views on the modeled system: a view of the current system state which corresponds to a conventional causally connected RTM; and a compact view of the history.
The view on history is afforded by each entity being equipped with a creation timestamp and a deletion timestamp, abbreviated cts and dts, respectively. For an example, see Fig. 2 where all entities inherit from the MonitorableEntity.

Fig. 5 RTM$^H$ instances $H_{[5]}$ and $H_{[7]}$
The cts and dts capture the time points of creation and deletion of an entity, respectively, by an event. Upon the occurrence of an event, similarly to an event yielding a new system state, the corresponding entity creation (deletion) in the RTM$^H$ yields a new instance of the RTM$^H$ where the cts (dts) of the modified entity is set based on the time point of the event. When an entity is created, its dts is set to ∞. When an entity is deleted in the modeled system, the respective entity is not deleted in the RTM$^H$. Rather, its dts is updated to the time point of the event that induced the deletion.
We assume that connectors in an RTM$^H$ exist for as long as both their end-points exist. Moreover, attribute values in an RTM$^H$ are set when the entities are created and, once set, remain unchanged. If changes of attribute values (or connectors) in the modeled system are of interest, an RTM$^H$ can track such changes by appropriate modeling decisions, e.g., by encoding these elements as entities in the metamodel, as shown in [70]. Each value change would then lead to a creation of a new entity in the RTM$^H$, where the duration of the value would be captured by a cts and a dts. We demonstrate such an encoding in the evaluation in Sect. 9.4.
By retaining all entities as well as information on their creation and deletion time points, an RTM$^H$ instance suffices to represent the evolution of entities up to and including the time point of the event that yielded the instance in question. In contrast to a sequence of RTMs, which stores multiple RTMs and all their entities so as to be able to track the evolution of entities across the sequence, an RTM$^H$ stores only a single instance of each model entity. Hence, it affords a compact and, therefore, more efficient representation of history. Similarly to an RTM, an instance of the RTM$^H$ is always associated with the time point of the latest event. For example, the representation of $h^G_7$ by an RTM$^H$ yields a single model, $H_{[7]}$, that contains the same information as $h^G_7$. $H_{[7]}$ is illustrated in Fig. 5. We note that, in an RTM$^H$, causal connection only applies to the latest snapshot.
Technically, an RTM$^H$ is obtained by an iterative coalescence of a new event and an RTM$^H$ into a new RTM$^H$ instance. For creations, the new instance contains new entities corresponding to the event and sets the values of their attributes: regular attributes are set according to data in the event, the cts is set based on the time point of the event, and the dts is set to ∞. For deletions, the dts values of the affected entities are set to the time point of the event. We denote this coalescence by a dedicated binary operator. For example, when $h^G_5$ is incremented by the event $e_7$, which corresponds to the creation of $pm_2$ and the deletion of $d_1$, this event yields a new RTM$^H$ instance, $H_{[7]}$, obtained by coalescing $H_{[5]}$ and $e_7$.
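The coalescence described above can be illustrated with a minimal Python sketch. It is not the InTempo implementation: the names `Entity`, `Event`, `coalesce`, and `snapshot` are hypothetical, connectors are omitted, and the contents of the events at time points 2, 4, and 5 as well as the pID of pm2 are assumptions made for illustration only (the text only states what the event at time point 7 does).

```python
import math
from dataclasses import dataclass, field

INF = math.inf  # dts of an entity that has not been deleted


@dataclass
class Entity:
    """A model entity with creation (cts) and deletion (dts) timestamps."""
    name: str
    cts: float
    dts: float = INF
    attrs: dict = field(default_factory=dict)


@dataclass
class Event:
    """A timed event: entities created and entities deleted at `time`."""
    time: float
    creations: dict = field(default_factory=dict)  # name -> attribute values
    deletions: tuple = ()                          # names of deleted entities


def coalesce(rtmh: dict, event: Event) -> dict:
    """Return a new RTM^H instance mirroring `event`; `rtmh` is left unchanged."""
    new = {n: Entity(e.name, e.cts, e.dts, dict(e.attrs)) for n, e in rtmh.items()}
    for name, attrs in event.creations.items():
        # cts is the time point of the event, dts defaults to infinity
        new[name] = Entity(name, cts=event.time, attrs=dict(attrs))
    for name in event.deletions:
        new[name].dts = event.time  # the entity is retained; only its dts changes
    return new


def snapshot(rtmh: dict, tau: float) -> set:
    """Names of the entities that exist at time point tau (cts <= tau < dts)."""
    return {e.name for e in rtmh.values() if e.cts <= tau < e.dts}


# Assumed event contents for time points 2, 4, 5 (hypothetical).
h5 = {}
for ev in (Event(2, {"s": {}}), Event(4, {"pm1": {"pID": 1}}), Event(5, {"d1": {"pID": 1}})):
    h5 = coalesce(h5, ev)

# The event at time point 7 creates pm2 and deletes d1, as in the running example.
h7 = coalesce(h5, Event(7, {"pm2": {"pID": 1}}, deletions=("d1",)))
```

In this sketch, `snapshot(h7, 5)` recovers the entities of the earlier snapshot from the single instance `h7`, which is the sense in which one RTM$^H$ instance stands in for a sequence of RTMs.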

Graph-based RTMs, transformation, and queries
An RTM is often encoded as a typed, attributed graph [37] where entities are modeled as vertices, connectors between entities as edges, and information about entities as vertex attributes [106]. A typed, attributed graph (henceforth, simply referred to as a graph) is typed over a type graph which defines types of vertices, edges, and attributes, similarly to the relationship between a metamodel and a model. In this context, the metamodel in Fig. 2 may be seen as an informal representation of the type graph of the SHS. The RTM$^H$ is analogously based on graphs with history [100].
Attributes are associated with a data type, i.e., a character string, an integer, a real number, or a Boolean. Graphs contain a set of assignments A which assign data-type-compatible values to attributes, e.g., $pm_1.pID = 1$ in Fig. 3. Given that attribute values in an RTM$^H$ are set upon the creation of an entity and remain unchanged (see Sect. 2.3), assignments in A are fixed. Formally, the set of assignments A may have various representations, e.g., distinguished data vertices [37] or an attribute constraint over sorted variables [99].
Encoding an RTM as a graph allows for the realization of model transformation via established formalisms, such as typed, attributed graph transformation [37] where graph transformation rules are used to search for a part of the model which is transformed in place [49].
In short, let G be a graph (in this context, encoding an RTM) typed over a type graph T and ρ a graph transformation rule. The rule ρ is characterized by a left-hand side (LHS) and a right-hand side (RHS) graph, also typed over T , which define the pre-condition and postcondition of an application of ρ, respectively. Intuitively, the execution of ρ searches for LHS in G and transforms it according to RHS. Moreover, the LHS graph may be extended with a Boolean expression γ (additionally to the set of assignments A) over the values of attributes in A. We refer to this LHS graph extended by γ as a (graph) pattern and to G as a host graph.
The LHS of a rule characterizes a (graph) query, which is the equivalent graph-based notion of a model query. The execution of a graph query over a given host graph, also called (graph) pattern matching, amounts to finding matches, i.e., occurrences of the query pattern in the host graph, whose attribute values satisfy the assignments and attribute constraint in A and γ of the pattern, respectively.
Formally, a match is a mapping from the pattern in the query to the host graph which preserves structure and type, also called a morphism. In the following, we use the two terms interchangeably. The query answer set is a set strictly containing all matches for a query in G. The transformation of the rule, specified in the RHS, is performed only when a match for the query, i.e., the LHS, has been found. The match identifies a part of G where the transformation should occur.
Queries with Complex Patterns. In certain cases, simple patterns are not sufficient as a language for defining more complex application conditions of rules, for instance if the existence of certain model elements should be prohibited. In those cases, an LHS, i.e., graph query, is enhanced with an application condition ac which every match m should satisfy. In the following, a query $\theta$ is characterized by a pattern $n$ and an application condition ac, and denoted $\theta := (n, ac)$.
The language of Nested Graph Conditions (NGCs) [55] can formulate ac that are as expressive as first-order logic on graphs [27], as shown in [55,88], and constitute, as such, a natural formal foundation for pattern-based queries. NGCs support standard first-order logic operators. The syntax of an NGC $\phi$ is given by the grammar:

$$\phi ::= \mathit{true} \mid \neg\phi \mid \phi \wedge \phi \mid \exists(n \hookrightarrow \hat{n}, \phi)$$

where $n$, $\hat{n}$ are patterns. The existential quantifier features a morphism (denoted by a hooked arrow) from $n$ to $\hat{n}$ which relates, i.e., binds, elements in outer conditions ($n$) to inner (nested) conditions ($\hat{n}$) and is therefore also called a binding. Let $\mathcal{L}$ be the language for queries with ac based on NGC and $\theta := (n, \phi)$ with $\theta \in \mathcal{L}$, i.e., $\phi$ is an NGC. The answer set $\mathcal{A}$ for $\theta$ over a host graph $G$ contains all matches $m$ in $G$ for the pattern $n$ that satisfy $\phi$. We also denote $\mathcal{A}$ by $\mathcal{A}(G)$ when $\theta$ is clear from the context. Intuitively, the existential quantifier in a query $(n, \exists(n \hookrightarrow \hat{n}, \hat{\phi}))$ is satisfied for a match $m$ for $n$ when (i) there is a match $\hat{m}$ for $\hat{n}$ in $G$ such that $\hat{m}$ satisfies $\hat{\phi}$ and (ii) $\hat{m}$ is compatible with $m$, i.e., respects the binding between the two patterns captured in $n \hookrightarrow \hat{n}$. The operator true is always satisfied. The intuition behind negation and conjunction is similar to that in first-order logic. In the remainder, we abbreviate $\neg(\neg\phi \wedge \neg\hat{\phi})$ by $\phi \vee \hat{\phi}$, $\exists(n \hookrightarrow \hat{n}, \phi)$ by $\exists(\hat{n}, \phi)$, and $\exists(\hat{n}, \mathit{true})$ by $\exists\,\hat{n}$.
As an example, assume the following hypothetical requirement which draws from operation sequence compliance, i.e., the order of service invocations, in an SHS (see [109]): "When a sensor service is invoked for a patient by the main service, there exists no other sensor service for the same patient. Moreover, a drug service should be invoked for the same patient." Based on the SHS metamodel in Fig. 2, the main service is represented by SHSService, the sensor service by PMonitoringService, and the drug service by DrugService. Then, the described situations, i.e., sensor service invoked by a main service, may be captured by the patterns $n_1$, $n_{1.1}$, and $n_{1.2}$, illustrated in Fig. 6, where the attribute constraints in $n_{1.1}$ and $n_{1.2}$ (illustrated between braces) ensure that the situations concern the same patient.
Formulated as a query in $\mathcal{L}$, the requirement is translated into: "find all matches of pattern $n_1$ in $G$ that satisfy $\phi_1$, i.e., where a match for $n_{1.1}$ does not exist while a match for $n_{1.2}$ does." In $\mathcal{L}$, this query is captured by $\theta_1 := (n_1, \phi_1)$ with $\phi_1$ an NGC defined as $\neg\exists\, n_{1.1} \wedge \exists\, n_{1.2}$. Nesting implies that the vertices $s$ and $pm$ from $n_1$ are bound in inner patterns $n_{1.1}$ and $n_{1.2}$, i.e., all patterns refer to the same $s$ and $pm$ in $G$. In our illustrations this is encoded by the usage of the same label for bound elements. The answer set $\mathcal{A}(G_5)$ (for $G_5$ from Fig. 4) for $\theta_1$ consists of one match for $n_1$ which satisfies $\phi_1$, i.e., a $pm$ is found which is connected to a DrugService with the same pID and not connected to any other sensor services.
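To make the satisfaction of such a nested condition concrete, the following Python sketch evaluates a query of the same shape (a pattern, a negated nested pattern, and a positive nested pattern) by brute force. The host graph, its edge structure, and all function names are assumptions for illustration; the actual patterns are those of Fig. 6, which are not reproduced here.

```python
from itertools import product

# Hypothetical host graph: vertex -> (type, pID). Edges link the main service
# to the services it invoked; this structure is assumed, not taken from Fig. 6.
VERTICES = {
    "s": ("SHSService", None),
    "pm1": ("PMonitoringService", 1),
    "d1": ("DrugService", 1),
    "pm2": ("PMonitoringService", 2),
    "pm3": ("PMonitoringService", 2),
}
EDGES = {("s", "pm1"), ("s", "d1"), ("s", "pm2"), ("s", "pm3")}


def of_type(type_name):
    """All vertices of the given type."""
    return [v for v, (t, _) in VERTICES.items() if t == type_name]


def pid(vertex):
    return VERTICES[vertex][1]


def matches_n1():
    """Matches for the outer pattern: a main service s connected to a sensor service pm."""
    candidates = product(of_type("SHSService"), of_type("PMonitoringService"))
    return [{"s": s, "pm": pm} for s, pm in candidates if (s, pm) in EDGES]


def exists_n1_1(m):
    """Nested pattern 1: another sensor service of s with the same pID as the bound pm."""
    return any(o != m["pm"] and (m["s"], o) in EDGES and pid(o) == pid(m["pm"])
               for o in of_type("PMonitoringService"))


def exists_n1_2(m):
    """Nested pattern 2: a drug service of s with the same pID as the bound pm."""
    return any((m["s"], d) in EDGES and pid(d) == pid(m["pm"])
               for d in of_type("DrugService"))


def answer_theta1():
    """Matches of the outer pattern that satisfy: not exists n1.1 and exists n1.2."""
    return [m for m in matches_n1() if not exists_n1_1(m) and exists_n1_2(m)]


result = answer_theta1()
```

Both nested checks receive the outer match `m`, which plays the role of the binding: the inner patterns are evaluated for the same `s` and `pm` that the outer match fixed. In the assumed graph only `pm1` qualifies, since `pm2` and `pm3` share a pID and have no drug service.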
Query Operationalization. A graph query is a declarative means to express a structure of interest which should satisfy a given condition. The query itself does not specify instructions on how to execute the query, i.e., its operationalization. For the operationalization of queries, we build on a formal framework we have previously presented which supports queries in $\mathcal{L}$. The framework in question decomposes a query with an arbitrarily complex NGC as ac into a suitable ordering of simple, pattern-based sub-queries called a Generalized Discrimination Network (GDN) [18].
A GDN is a directed acyclic graph where each graph node represents a (sub-)query. To avoid confusion, we refer to the GDN as a network. Dependencies between queries are represented by edges from child nodes, i.e., the nodes whose results are required, to the parent node, i.e., the node which requires the results. Dependencies can either be positive, i.e., the query realized by the parent node requires the presence of matches of the child node, or negative, i.e., the query of the parent node forbids the presence of such matches. The overall query is executed bottom-up: the execution starts with leaves and proceeds upward in the network. The terminal node computes the answer set $\mathcal{A}$ of the query.
In [18], a GDN is realized as a set of graph transformation rules where each GDN node, i.e., each (sub-)query, is associated with one transformation rule. The LHS of the rule searches for matches of the corresponding query in a given host graph G. The RHS of the rule creates a marking node in G that marks each match and marking edges from the marking node to each node of the match. Marking nodes are not to be confused with regular graph nodes in G (which, in this context, represent entities of the modeled system); thus, we use the term vertex for regular graph nodes. In order to be able to create marking nodes and edges, the transformation rules of a GDN, henceforth called marking rules, are typed over an extended type graph which adds the required types for marking nodes and edges to the initial type graph. The LHS of rules with dependencies have ac that require the existence of marking nodes of their positive dependencies and forbid the existence of marking nodes of their negative dependencies.
The GDN for $\theta_1$ from earlier is shown in Fig. 7, where each square represents a GDN node. Each node is associated with a marking rule. The GDN consists of three nodes, i.e., rules: the node $N_{1.1}$ searching for the pattern $n_{1.1}$, the node $N_{1.2}$ searching for $n_{1.2}$, and the topmost node $N_1$ searching for $n_1$. Node $N_1$ computes its matches by matching its pattern and checking whether both of its dependencies are satisfied (the conjunction in $\phi_1$). The negative dependency which captures the negation in $\phi_1$ (drawn with a dashed line) is satisfied when a match for $N_1$ cannot be extended by a match for $N_{1.1}$. All nodes are realized by marking rules whose LHS matches a pattern and whose RHS creates marking nodes and edges that mark the matches of the LHS. The rules for nodes $N_{1.1}$, $N_{1.2}$ are shown in Fig. 7 (within rectangles), where (i) marking nodes are illustrated by circles and (ii) the marking nodes and edges added by a rule are dashed and annotated with "++". For presentation purposes, the illustrations of rules thus contain both their LHS and RHS.
Incremental Execution. Optimization techniques such as local search can be employed to reduce the pattern matching effort of GDN nodes (see [3]). In local search, pattern matching initiates from a single element and builds a match candidate iteratively following a heuristics-based search plan.
Owing to the decomposition of the query into simpler marking rules as well as local search, a GDN is amenable to incremental execution. Changes in G are propagated through the network, whose nodes only recompute their results if the change concerns them or one of their dependencies. If a recomputation is deemed necessary, owing to local search, a node is capable of updating its matches starting from changed elements instead of starting over. Therefore, we say that the query is also executed incrementally, as its answer set $\mathcal{A}$ is updated by each GDN execution.
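The propagation of changes through a network can be sketched as follows. This is a simplified illustration under several assumptions, not the framework of [18]: matches are stored as plain sets instead of marking nodes, an affected node recomputes its matches from scratch rather than by local search from the changed element, edges of the host graph are omitted, and the class and function names are hypothetical.

```python
class Node:
    """A network node: a sub-query together with its stored matches."""

    def __init__(self, name, types, compute, children=()):
        self.name = name
        self.types = set(types)    # vertex types mentioned by the node's pattern
        self.compute = compute     # (graph, child match sets) -> set of matches
        self.children = list(children)
        self.matches = set()
        self.runs = 0              # number of (re)computations, for illustration

    def update(self, graph, changed_types):
        """Propagate a change bottom-up; return True if the stored matches changed."""
        # A list (not a generator) so that every child is updated.
        child_changed = [c.update(graph, changed_types) for c in self.children]
        if not (self.types & changed_types) and not any(child_changed):
            return False           # the change does not concern this node
        new = self.compute(graph, [c.matches for c in self.children])
        self.runs += 1
        changed = new != self.matches
        self.matches = new
        return changed


def by_type(graph, type_name):
    return [v for v, (t, _) in graph.items() if t == type_name]


def other_sensor(graph, _children):
    """Leaf: sensor services for which another sensor service has the same pID."""
    pms = by_type(graph, "PMonitoringService")
    return {pm for pm in pms for o in pms if o != pm and graph[o][1] == graph[pm][1]}


def drug_for_sensor(graph, _children):
    """Leaf: sensor services for which a drug service has the same pID."""
    return {pm for pm in by_type(graph, "PMonitoringService")
            for d in by_type(graph, "DrugService") if graph[d][1] == graph[pm][1]}


def top(graph, child_matches):
    """Terminal node: one negative and one positive dependency."""
    negative, positive = child_matches
    return {pm for pm in by_type(graph, "PMonitoringService")
            if pm not in negative and pm in positive}


n11 = Node("N1.1", {"PMonitoringService"}, other_sensor)
n12 = Node("N1.2", {"PMonitoringService", "DrugService"}, drug_for_sensor)
n1 = Node("N1", {"PMonitoringService"}, top, [n11, n12])

# Hypothetical host graph: vertex -> (type, pID).
graph = {"pm1": ("PMonitoringService", 1), "d1": ("DrugService", 1),
         "pm2": ("PMonitoringService", 2)}
n1.update(graph, {"PMonitoringService", "DrugService"})  # initial execution

graph["d2"] = ("DrugService", 2)   # a new drug service is added
n1.update(graph, {"DrugService"})  # only N1.2 and, through it, N1 recompute
```

After the second call, the leaf for the negative dependency has still been computed only once, because the added drug service does not concern its pattern, whereas the other leaf and the terminal node were recomputed.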

Metric temporal graph logic
Metric Temporal Graph Logic (MTGL) [50] enables the formulation and checking of temporal requirements on patterns, i.e., requirements on the evolution of patterns over time.
MTGL builds on NGCs and Metric Temporal Logic [69] to enable the definition of Metric Temporal Graph Conditions (MTGCs) on patterns. Additionally to the NGC operators, MTGCs support metric, i.e., interval-based, temporal operators: the until ($U_I$, with $I$ an interval in $\mathbb{R}_0^+$) and its dual since ($S_I$). The syntax of an MTGC $\psi$ is given by:

$$\psi ::= \mathit{true} \mid \neg\psi \mid \psi \wedge \psi \mid \exists(n, \psi) \mid \psi\, U_I\, \psi \mid \psi\, S_I\, \psi$$

with $n$ a pattern. The operators eventually ($\lozenge_I$) and once ($\blacklozenge_I$) are abbreviations of until and since: $\lozenge_I \psi = \mathit{true}\; U_I\, \psi$ and $\blacklozenge_I \psi = \mathit{true}\; S_I\, \psi$. We abbreviate exists similarly to NGCs. MTGL reasons over sequences of graphs. Intuitively, this is motivated by the logic expressing requirements on the evolution of a pattern over time, i.e., over consecutive graphs. However, MTGCs can also be equivalently checked over a graph with history [50], which here corresponds to an RTM$^H$. Following is the intuition behind satisfaction for the MTGL operators $\exists$, $U_I$, $S_I$.
Let $\psi$ be the MTGC $\exists(n, \hat{\psi})$. A match $m$ for $n$ in an RTM$^H$ $H_{[\tau]}$ satisfies $\psi$ at time point $\tau \in T$ if $\max_{x \in E} x.cts \leq \tau < \min_{x \in E} x.dts$, with $E$ the elements of $m$, and matches for $\hat{\psi}$ are compatible with $m$ and satisfy $\hat{\psi}$. Given two MTGCs $\psi$, $\hat{\psi}$, the MTGC $\psi\, U_I\, \hat{\psi}$ is satisfied at a time point $\tau$ when there is a time point $\tau'$ with $\tau' - \tau \in I$ where $\hat{\psi}$ is satisfied and, at least for all $\tau'' \in [\tau, \tau')$, $\psi$ is satisfied. The intuition is reversed for $\psi\, S_I\, \hat{\psi}$, which is satisfied at a time point $\tau$ when there is a $\tau'$ with $\tau - \tau' \in I$ where $\hat{\psi}$ is satisfied and, at least for all $\tau'' \in (\tau', \tau]$, $\psi$ is satisfied.
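The two conditions above, the validity of a match between the latest cts and the earliest dts of its elements and the satisfaction of until, can be made concrete with a small Python sketch. It rests on simplifying assumptions that are not part of MTGL: the interval is closed, I = [lo, hi], and each operand of until is witnessed by a single match whose validity interval is given. The function names and the timestamps are hypothetical.

```python
import math


def lifetime(elements):
    """Validity interval [max cts, min dts) of a match, from (cts, dts) pairs."""
    start = max(cts for cts, _ in elements)
    end = min(dts for _, dts in elements)
    return (start, end) if start < end else None


def holds_at(life, tau):
    """Does a match with validity interval `life` exist at time point tau?"""
    return life is not None and life[0] <= tau < life[1]


def until_at(tau, left, right, lo, hi):
    """Check `psi U_[lo,hi] psi_hat` at tau.

    `left` and `right` are the validity intervals of the matches witnessing
    psi and psi_hat, respectively (None if there is no such match).
    """
    if right is None:
        return False
    # Earliest candidate tau' at which psi_hat holds within the interval;
    # all remaining constraints are upper bounds, so the earliest one suffices.
    witness = max(tau + lo, right[0])
    if witness > tau + hi or witness >= right[1]:
        return False
    if witness == tau:
        return True  # [tau, tau') is empty, so nothing is required of psi
    # psi must hold on all of [tau, tau')
    return left is not None and left[0] <= tau and witness <= left[1]


ALWAYS = (0.0, math.inf)  # validity interval standing in for `true`

# Hypothetical timestamps: s created at 2, pm at 4, d created at 5 and deleted at 7.
drug_match = lifetime([(2, math.inf), (4, math.inf), (5, 7)])

eventually_at_4 = until_at(4, ALWAYS, drug_match, 0, 60)  # true U_[0,60] exists d
eventually_at_8 = until_at(8, ALWAYS, drug_match, 0, 60)
```

Here the match involving the drug service is valid during [5, 7), so "eventually within 60 time units" holds at time point 4 but no longer at time point 8, after the drug service has been deleted.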

The INTEMPO scheme
An overview of the functionality of InTempo is provided in Sect. 1 and illustrated in Fig. 1. Building on the technical prerequisites in Sect. 2, this section presents InTempo in further technical detail (see Fig. 8 for a graphical reference) and shows how the requirements for a history encoding (R1), a query language (R2), fast answers (R3), and memory-efficiency (R4) defined in Sect. 1 are fulfilled.
Design-time. InTempo assumes that the following artifacts have been made available at design-time: (i) a metamodel of the system with creation and deletion timestamps for each vertex and (ii) a mapping of events to model modifications, e.g., additions or deletions of vertices.
Runtime. To represent the history of a system, i.e., for R1, InTempo relies on an RTM$^H$ (see Sect. 2.3) which, by featuring creation and deletion timestamps, captures temporal information on multiple past versions of an RTM into a single, consolidated representation.
For R2, we introduce a novel language (see Sect. 4) which allows for the specification of temporal graph queries, i.e., queries on the ordering and timing in which patterns are added or deleted in the RTM$^H$. Compared to the queries with NGCs presented in Sect. 2.4, temporal graph queries use MTGCs (see Sect. 2.5) for the formulation of ac, i.e., they incorporate (past and future) temporal operators with timing constraints, and, moreover, compute the interval for which matches for a temporal graph query satisfy the query's ac.
InTempo is invoked by the system and provided with a sequence of changes encapsulated in (timed) events.
For systems with lengthy histories or a high rate of incoming events, the re-computation of matches from scratch upon every event would quickly lead to slow query executions, arguably rendering a runtime solution impractical. Instead, for the execution of temporal graph queries over an RTM$^H$, InTempo uses a novel operationalization framework which supports incremental execution of temporal graph queries (Sect. 5). This feature allows for fast query executions (R3) and also gives InTempo its name: Incremental execution of Temporal graph queries. Aiming for fast executions and memory-efficiency, InTempo supports an optional history representation which contains only elements that are relevant to query executions (Sect. 6). Queries over this constrained RTM$^H$ may yield faster executions while affording increased memory-efficiency.
Operations InTempo consists of two core operations: Operationalization and Execution. Maintenance constitutes an optional extension which, if enabled, is performed after Execution. We outline each operation below.
Operationalization: This operation constructs a temporal GDN for each of the input temporal graph queries. The operation extends the GDN construction presented in Sect. 2.4 by introducing concepts for handling structural matches whose validity is based on the cts and dts of their elements as well as for evaluating the until and since operators from MTGL. This operation is only performed once per query (at the beginning) and if the set of input queries has changed between invocations of InTempo.

Execution: The operation executes the temporal GDN(s) over the RTM H . A temporal GDN processes the modifications made to the RTM H and, for modifications that are relevant to the queries, executes the affected sub-queries. The sub-queries search for matches of patterns while taking into account timing constraints on the occurrence of patterns. By virtue of the incremental execution of the GDN, matches are updated incrementally after each event.

Maintenance: The operation relies on the computation of a time window (during Operationalization) which is based on timing constraints of the temporal operators in input queries. The operation uses the window to decide when deleted elements are not going to be involved in future query executions and can be thus pruned from the RTM H .

Language for temporal graph queries
Temporal requirements are always present in medical guidelines and their satisfaction is key to the successful completion of procedures [26]. Adding a temporal dimension to the exemplary requirement from Sect. 2.4 makes it similar to compliance checking of medical procedures which may track time between triage and admission [78], here represented by the invocation of a sensor service (n 1 ) and a drug service (n 1.2 ), respectively: "When a sensor is invoked for a patient, there should be a drug service invoked for the same patient within one minute and, until then, there should be no other sensor service invoked for the same patient." The specific timing constraint is adjusted for the purpose of presentation.
Formulated in a query, this instruction includes temporal requirements on the evolution of graph structures: "find all matches for n 1 such that, for a match for n 1 at a time point τ , at least one match for n 1.2 is found at some time point τ′ ∈ [τ, τ + 60], i.e., at most 60 time units later; in addition, at each time point τ″ ∈ [τ, τ′) in between, no match for n 1.1 is present," where all n patterns refer to the same pm and by time unit we refer to the unit of measurement with which the system tracks time (here, assumed to be a second).
The language L introduced in Sect. 2.4, which employs NGCs for the definition of ac, does not inherently support temporal requirements on patterns. MTGCs, introduced in Sect. 2.5, build on NGCs and allow for the specification of a desired ordering and a timing constraint on the evolution of graph structures-thereby supporting temporal requirements. Moreover, in an encoding where elements have lifespans, a structural occurrence of a pattern consisting of such elements has to be accompanied by information on when and for how long a match exists. This requirement is emphasized for application scenarios where a decision may depend on timing, e.g., in runtime adaptation.
Temporal logics that reason over intervals, such as MTGL, are capable of defining the truth value of a formula for every time point in the time domain. Building on this capability, we introduce the query language L T which enables the formulation of temporal graph queries over an RTM H . A temporal graph query ζ ∈ L T is characterized similarly to graph queries with NGCs in L , i.e., ζ := (n, ac). However, in contrast to L , temporal graph queries feature ac based on MTGCs thereby supporting temporal requirements. In L T , the exemplary query above is captured by ζ 1 := (n 1 , ψ 1 ) where ψ 1 is an MTGC defined as ¬∃ n 1.1 U [0,60] ∃ n 1.2 . Recall that elements common to n 1 and the patterns n 1.1 , n 1.2 are bound-see Sect. 2.4.
Compared to A, an answer set T for a query ζ ∈ L T is extended with a temporal dimension: matches in T are paired with a temporal validity, i.e., the set of time points for which (i) matched elements co-exist in the RTM H and satisfy the attribute constraint, called lifespan of the match (ii) the match satisfies the ac of ζ , called satisfaction span. We elaborate on the lifespan, the satisfaction span, and the temporal validity below.

Lifespan of a match
Vertices of an RTM H have attributes which capture their creation (cts) and deletion (dts) timestamps. For an element ε, we define its lifespan as the non-empty non-negative interval [ε.cts, ε.dts). The intuition behind the lifespan of an element being right-open is that if an element has been deleted at a time point τ , the element has existed until a time point that approaches but is not equal to τ . A match is valid only if there is a non-empty interval λ m , called the lifespan of the match, during which the lifespans of all matched elements E overlap:

$\lambda^m = \bigcap_{\varepsilon \in E} [\varepsilon.cts, \varepsilon.dts)$ (1)

Attribute values of matched elements do not change (see Sect. 2.4) and, hence, cannot affect the lifespan computation. Element timestamps are always assigned from R + 0 , therefore it always holds that λ m ⊆ R + 0 . In the special case where the pattern in ζ is the empty graph ∅, an (empty) match m is always found with λ m = R + 0 .
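The lifespan computation can be illustrated with a minimal Python sketch. It assumes that element lifespans are given as (cts, dts) pairs with `math.inf` standing for a dts of ∞; the function name `match_lifespan` and the sample timestamps are hypothetical and only serve the illustration.

```python
import math

def match_lifespan(element_lifespans):
    """Intersect right-open element lifespans [cts, dts).

    Returns the lifespan of the match as a (start, end) pair,
    or None if the elements never co-exist.
    """
    if not element_lifespans:
        # Empty pattern: the empty match spans the whole non-negative time domain.
        return (0.0, math.inf)
    start = max(cts for cts, _ in element_lifespans)
    end = min(dts for _, dts in element_lifespans)
    return (start, end) if start < end else None

# Assumed values: two vertices created at 4 and still present,
# one vertex created at 5 and deleted at 7.
print(match_lifespan([(4, math.inf), (4, math.inf), (5, 7)]))  # (5, 7)
```

Because all lifespans are right-open, two elements whose lifespans merely touch, e.g., [1, 2) and [2, 3), do not co-exist and yield no valid match.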

Satisfaction span and temporal validity
We call the set of time points for which an MTGC is satisfied its satisfaction span, denoted by Y. In the context of a query (n, ψ) ∈ L T or a nested condition with n as enclosing pattern, a satisfaction span related to a match m for n is defined as Y(m, ψ) = {τ | τ ∈ R ∧ m satisfies ψ at τ }. The temporal validity of the match is the set of time points for which m exists and satisfies ψ, i.e., the intersection of the lifespan of a match with the satisfaction span, and is denoted by V(m, ψ). The intersection of two intervals is always an interval, whereas the union of two intervals may result in a disjoint, i.e., disconnected, set. To encode such unions, we define an interval set as a subset of R which may consist of disjoint or empty intervals. With F and I denoting the set of all interval sets and the set of all intervals, respectively, a set operation between an element of F and an element of I may again result in an element of F. The satisfaction span Y and the temporal validity V may depend on unions of intervals or operations with other interval sets and are, therefore, interval sets themselves.
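As an illustration of interval sets, the following Python sketch represents an interval set as a sorted list of disjoint right-open intervals. This representation, the function names, and the sample values are assumptions made for the example; the scheme itself does not prescribe them, and closed or left-open intervals are not covered.

```python
def normalize(intervals):
    """Merge overlapping or touching right-open intervals [a, b) into an interval set."""
    merged = []
    for a, b in sorted(i for i in intervals if i[0] < i[1]):  # drop empty intervals
        if merged and a <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            merged.append((a, b))
    return merged

def intersect(xs, ys):
    """Pairwise intersection of two interval sets."""
    return normalize((max(a, c), min(b, d)) for a, b in xs for c, d in ys)

def temporal_validity(lifespan, satisfaction_span):
    """Intersection of the lifespan of a match with its satisfaction span."""
    return intersect([lifespan], satisfaction_span)

print(normalize([(1, 2), (4, 5), (4, 6)]))             # [(1, 2), (4, 6)]
print(temporal_validity((3, 9), [(-2, 6), (8, 12)]))   # [(3, 6), (8, 9)]
```

The second call shows why the temporal validity is an interval set rather than an interval: intersecting a single lifespan with a disconnected satisfaction span yields two disjoint intervals.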
The definition below presents the recursively defined satisfaction computation Z of an MTGC. An explanation of the intuition behind the definition follows.
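Definition 1 (satisfaction computation Z) Let n, n̂ be patterns and ψ, χ, ω be MTGCs. Moreover, let m be a match for n. The satisfaction computation Z(m, ψ) is recursively defined as follows.

$Z(m, \mathit{true}) = \mathbb{R}$ (2)
$Z(m, \neg\chi) = \mathbb{R} \setminus Z(m, \chi)$ (3)
$Z(m, \chi \wedge \omega) = Z(m, \chi) \cap Z(m, \omega)$ (4)
$Z(m, \exists(\hat{n}, \chi)) = \bigcup_{\hat{m} \in \hat{M}} \big(\lambda^{\hat{m}} \cap Z(\hat{m}, \chi)\big)$ (5)
$Z(m, \chi\, U_I\, \omega) = \bigcup_{i \in Z(m,\omega),\, j \in J_i} j \cap \big((j^+ \cap i) \ominus I\big)$ if $0 \notin I$, and $\bigcup_{i \in Z(m,\omega)} \big(i \cup \bigcup_{j \in J_i} j \cap ((j^+ \cap i) \ominus I)\big)$ if $0 \in I$ (6)
$Z(m, \chi\, S_I\, \omega) = \bigcup_{i \in Z(m,\omega),\, j \in J_i} j \cap \big(({}^+j \cap i) \oplus I\big)$ if $0 \notin I$, and $\bigcup_{i \in Z(m,\omega)} \big(i \cup \bigcup_{j \in J_i} j \cap (({}^+j \cap i) \oplus I)\big)$ if $0 \in I$ (7)

with:
- M̂ a set containing only matches for n̂ that are compatible with the (enclosing) match m (see Sect. 2.4)
- J i the set of all intervals in Z(m, χ) that are either overlapping with or adjacent to some i ∈ Z(m, ω)
- ⁺k the union {ℓ(k)} ∪ k, i.e., making k left-closed, and k⁺ defined symmetrically
- k ⊕ l and k ⊖ l the intervals with end-points ℓ(k) + ℓ(l), r(k) + r(l) and ℓ(k) − r(l), r(k) − ℓ(l), respectively, for k, l ∈ I with ℓ(k) and r(k) the left and right end-point of k, respectively; note that end-points are reversed in subtraction.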
The intuition behind the equations for true, negation, and conjunction is clear. Regarding exists, the satisfaction span can be computed based on a (sub-)query which searches for n̂: the computation relies on the temporal validity of all matches m̂ for n̂ which are compatible with m.
For until, the computation is conditional on the timing constraint I . If 0 ∉ I , i.e., ℓ(I ) ≠ 0, the satisfaction includes every time point t in the intersection of some i ∈ Z(m, ω) with a j ∈ Z(m, χ) for which a time point τ in i occurs within I . Furthermore, j needs to overlap i , e.g., j = [1, 3], i = [2, 4], or span until the very last time point before i , i.e., be adjacent to i , e.g., j = [1, 2), i = [2, 4]. If j and i are adjacent, during the computation j becomes right-closed, i.e., j + = [1, 2], to ensure that their intersection produces a non-empty set. If 0 ∈ I , then, by the semantics, it may be that j is empty, i.e., does not exist, and in that case until is satisfied by every i ∈ Z(m, ω). Therefore, the computation includes every i and remains unchanged otherwise. The intuition behind since is analogous.
The following theorem states that the set of time points in the satisfaction span Y is equal to the set of time points obtained by the satisfaction computation Z. This computation may need to take elements from the match m into account but it is not bound by it. The temporal validity V(m, ψ) binds Z by λ m : it computes the set of time points for which a match, besides being structurally present in the graph, satisfies ψ. As an example, assume the query (n, ♦ [0,5] ∃n̂). For a match m for n with λ m = [3, 9) and a match m̂ for n̂ with λ m̂ = [3, 6), the satisfaction computation Z(m, ♦ [0,5] ∃n̂) yields [−2, 6), according to Eq. 6. However, V(m, ♦ [0,5] ∃n̂) is equal to λ m ∩ [−2, 6) = [3, 6). As the example shows, Z may contain negative time points (hence R is used in Definition 1), whereas V ⊆ R + 0 , since V is produced by an intersection with λ m . It also holds that V(m, ψ) ⊆ Z(m, ψ).
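The numbers in the example can be retraced with a small Python sketch. It assumes a right-open interval k = [a, b) and a closed timing constraint I = [c, d], for which subtracting I from k with reversed end-points gives [a − d, b − c); the helper names `back_shift` and `intersect` are hypothetical.

```python
def back_shift(k, timing):
    """Subtract a closed timing constraint [c, d] from a right-open interval [a, b).

    End-points are reversed in subtraction, so the result is [a - d, b - c).
    """
    (a, b), (c, d) = k, timing
    return (a - d, b - c)

def intersect(k, l):
    """Intersection of two right-open intervals, or None if it is empty."""
    lo, hi = max(k[0], l[0]), min(k[1], l[1])
    return (lo, hi) if lo < hi else None

lifespan_m = (3, 9)      # lifespan of the match m for the enclosing pattern
lifespan_m_hat = (3, 6)  # lifespan of the compatible match for the nested pattern

# For the eventually operator the left operand is true, so the satisfaction
# computation reduces to shifting the lifespan of the nested match back by I.
z = back_shift(lifespan_m_hat, (0, 5))
v = intersect(lifespan_m, z)
print(z, v)  # (-2, 6) (3, 6)
```

The sketch makes the difference between Z and V visible: the shifted interval reaches into negative time points, while the intersection with the lifespan of m cuts it back to [3, 6).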

Answer set
Based on the temporal validity, we can proceed with a technical definition of the output of a query in L T , that is, its answer set T. In an RTM H , all vertices and, thus, all matches have a lifespan. Moreover, application conditions of queries in L T are formulas of an interval-based temporal logic which decides the truth of a formula for every time point in R. Therefore, the answer set of a query in L T contains matches, i.e., structural occurrences of a pattern, associated with a temporal validity, i.e., the time points for which the match exists and satisfies its application condition.
Definition 3 (answer set T) Given a pattern n, an MTGC ψ, and an RTM H H [τ ] , the answer set T for a query ζ := (n, ψ) over H [τ ] is given by:

$T(H_{[\tau]}) = \{(m, V(m, \psi)) \mid m \text{ is a match for } n \text{ in } H_{[\tau]} \wedge V(m, \psi) \neq \emptyset\}$

Recall that V(m, ψ) may be an interval set. The answer set allows for a precise definition of the output of L T . In the remainder, we rely on this definition to explain the output of queries in the examples and, moreover, to define a restricted answer set for queries over a constrained RTM H .

Operationalization of temporal graph queries
This section presents an operationalization framework that enables the incremental execution of a query ζ ∈ L T . In InTempo, the activities described below are performed by Operationalization and Execution. Given ζ , Operationalization constructs an enhanced GDN which is capable of interval operations. Execution executes this GDN to find and update matches for ζ in an RTM H . In the following, we refer to the framework in [18] (see Sect. 2.4) as base approach. The base approach considers graph queries in L , i.e., where application conditions are formulated as NGCs, and thus does not support the temporal operators until and since and, in general, the temporal reasoning required by MTGL. Building on the base approach, we present our extensions to which we collectively refer as temporal approach or temporal GDN. The temporal GDN supports the formulation of an ac in MTGL and therefore enables incremental execution of queries in L T .

Marking rules of temporal GDN
Regarding marking rules, the main difference between the base approach and the temporal approach is that the RHS of rules of the temporal approach create a marking node that captures the duration of the match being marked. To this end, the type of marking nodes in the type graph is equipped with an attribute d of type interval set.
In detail, first, we adjust the concept of a regular marking rule of the base approach such that we obtain two variants: one where the RHS creates a marking node with a duration that coincides with the temporal validity V of the match being marked; and another where the RHS creates a marking node with a duration that coincides with the satisfaction span Z of the match being marked. The two rules are denoted by VMR and ZMR, respectively. Moreover, we introduce a new type of marking rule, denoted by αMR, which allows for matching a varying number of nodes, a feature which is not supported by the base approach.
The temporal GDN incrementally updates in each execution step the set of all matches and potentially re-computes the duration of their marking nodes. Once created, marking nodes, together with the vertices they mark, remain in the RTM H in subsequent steps. We elaborate on each rule below.
The V Marking Rule A VMR sets the d of a created marking node to the temporal validity of a match m, given by Definition 2; thus, the rule is intended for exists operators. The satisfaction span is represented by the duration of marking nodes of the dependencies of the rule. Since matched elements may be marking nodes, the computation of the lifespan of a match is slightly adjusted. For a matched element ε, its lifespan is given by ε.d if ε is a marking node, and by [ε.cts, ε.dts) otherwise. Figure 9 contains an example of a VMR: the node N 1.1 for ∃ n 1.1 from ψ 1 := ¬∃ n 1.1 U [0,60] ∃ n 1.2 . Marking nodes are illustrated by a split circle where the bottom compartment contains the duration attribute.
Contrary to rules in the base approach, a VMR includes marking nodes of dependencies in the pattern of the query rather than in the ac. This is necessary as these marking nodes are included in computations of the duration of the marking node created by the RHS.
The Z Marking Rule The ZMR is intended for the operators in Definition 1 which pass their match on to enclosing MTGCs and do not include matched elements in their computations, i.e., the operators negation, conjunction, until, and since. Therefore, the LHS of the ZMR contains all elements of its closest enclosing VMR. For example, the U node and the negation node in Fig. 9 contain all elements in n 1 . The ZMR's computation of d excludes these elements and only considers the satisfaction span of its dependencies, represented by the duration of marking nodes. The computation by the RHS depends on the operator (see Definition 1).
The α Marking Rule In Eq. 5, the computation of the satisfaction span for exists relies on the union of all lifespans of matches that are compatible with the enclosed pattern n̂. In order to compute this union, the corresponding node in the temporal GDN is required to keep track of all matches for n̂. The number of these matches might vary in every GDN execution, a property which is not covered by conventional graph transformation rules (see Sect. 2.4). Therefore, the node created for the exists operator is an amalgamated marking rule (αMR). Such rules stem from amalgamated graph transformation [20], where an arbitrary number of parallel transformations are amalgamated, i.e., merged, into a single rule execution in one transformation step. Thus, an αMR enables a node to be associated with a varying number of graph elements so that it can compute the union of their duration.
The LHS of an αMR contains a kernel of graph elements that are bound by the enclosing operator (in ψ 1 , that would be the s) and a multi-rule which matches an arbitrary number of instances of a certain marking node type. An αMR thus groups the marking nodes matched by the multi-rule by matches of the kernel and aggregates the marking nodes' duration. Hence, the αMR corresponds to a GDN node with a single dependency to the node that creates the marking nodes which the αMR groups.
The RHS of an αMR creates an α marking node which is connected to the marking nodes of its dependency (matched by the multi-rule) and the elements of the kernel marked by those marking nodes. Only a single α marking node is created, regardless of the number of matches for the dependency. The duration of the α marking node is the union of the duration of the marking nodes E M matched by the multi-rule:

$d_\alpha = \bigcup_{\varepsilon \in E_M} \varepsilon.d$ (8)

See Fig. 9 for an example of an αMR, the node named α 1.1 , which groups matches of its dependency, the node N 1.1 .
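The grouping performed by an αMR can be sketched in Python as follows. The sketch assumes that kernel matches are identified by hashable keys and that each marking node of the dependency is given as a (kernel key, duration) pair with the duration being a list of right-open intervals; these encodings and the function names are hypothetical.

```python
from collections import defaultdict

def union(intervals):
    """Merge right-open intervals into a sorted list of disjoint intervals."""
    merged = []
    for a, b in sorted(intervals):
        if merged and a <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            merged.append((a, b))
    return merged

def amalgamate(kernel_matches, marking_nodes):
    """Create one aggregated duration per kernel match.

    A kernel match without any marking node receives the empty duration.
    """
    grouped = defaultdict(list)
    for kernel, duration in marking_nodes:
        grouped[kernel].extend(duration)
    return {kernel: union(grouped[kernel]) for kernel in kernel_matches}

kernels = [("s", "pm1"), ("s", "pm2")]
markings = [(("s", "pm1"), [(5, 7)]), (("s", "pm1"), [(6, 9)])]
print(amalgamate(kernels, markings))
# {('s', 'pm1'): [(5, 9)], ('s', 'pm2'): []}
```

Whatever the number of marking nodes matched for a kernel match, exactly one aggregated entry is produced for it, which mirrors the single α marking node created by the RHS.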

Temporal GDN construction
In the previous section, we were concerned with types of GDN rules and the types of marking nodes they created. This section focuses on GDN nodes which form the components of the network and represent executions of GDN rules. In InTempo, the construction of a temporal GDN from a query in L T is performed by Operationalization-see Fig. 8. Compared to the base approach, the construction features two extensions: first, we translate the operators conjunction, negation, until, and since into GDN nodes corresponding to ZMR; second, we translate the exists operator into a GDN node corresponding to a VMR, a GDN node corresponding to an αMR which is intended for grouping the matches of the node for the VMR, and a positive dependency from the node for the αMR to the node for the VMR.
The construction is performed in two steps: (i) a traversal of the syntax tree of the MTGC of the query where operators are replaced by GDN nodes; (ii) a traversal of the network created by the previous step to create the dependencies between nodes and set the patterns of ZMR nodes. See Fig. 9 for the temporal GDN of ζ 1 , where novel nodes, i.e., created by our extensions to the base approach, are in light gray. The existential conditions in ψ 1 , i.e., ∃n 1.1 and ∃n 1.2 , lead to the creation of nodes N 1.1 and N 1.2 , respectively. These nodes are both dependencies of their respective α nodes. The pattern n 1.1 is enclosed by a negation which leads to the creation of a negation node ("¬") and a negative dependency between the negation node and α 1.1 . The U node is dependent on the negation node and α 1.2 . The node N 1 , created for the pattern n 1 of ζ 1 , is dependent on the U node. For more complex constructions, the instructions in [18] apply.
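The translation of operators into GDN nodes can be sketched in Python. The sketch assumes a hypothetical encoding of MTGCs as nested tuples, collapses the two construction steps into a single recursive traversal, restricts exists to patterns without a nested condition, and does not distinguish positive from negative dependencies; the class and function names are illustrative only.

```python
from dataclasses import dataclass, field

@dataclass
class GdnNode:
    name: str
    kind: str                                  # "VMR", "ZMR" or "aMR"
    deps: list = field(default_factory=list)   # nodes this node depends on

def build(cond):
    """Translate an MTGC syntax tree (nested tuples) into GDN nodes."""
    op = cond[0]
    if op == "exists":                         # ("exists", pattern_name)
        vmr = GdnNode(f"N_{cond[1]}", "VMR")
        # The alpha node groups the matches of the VMR node it depends on.
        return GdnNode(f"alpha_{cond[1]}", "aMR", [vmr])
    if op == "not":                            # ("not", sub_condition)
        return GdnNode("not", "ZMR", [build(cond[1])])
    if op == "and":                            # ("and", left, right)
        return GdnNode("and", "ZMR", [build(cond[1]), build(cond[2])])
    if op in ("until", "since"):               # (op, timing, left, right)
        return GdnNode(op, "ZMR", [build(cond[2]), build(cond[3])])
    raise ValueError(f"unsupported operator: {op}")

def build_query(pattern_name, cond):
    """The root VMR node for the query pattern depends on the node of its condition."""
    return GdnNode(f"N_{pattern_name}", "VMR", [build(cond)])

psi_1 = ("until", (0, 60), ("not", ("exists", "1.1")), ("exists", "1.2"))
root = build_query("1", psi_1)
print(root.name, [d.name for d in root.deps[0].deps])  # N_1 ['not', 'alpha_1.2']
```

For the condition of ζ 1 , the resulting structure has the same shape as the network described above: the root depends on the until node, which depends on the negation node and on the α node for n 1.2 .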

Temporal GDN execution
In InTempo, the execution of a temporal GDN is performed by Execution (see Fig. 8). The execution is recursive: given an input GDN node, the node's dependencies are executed before the input node. Hence, although the execution starts at the root of the GDN, network leaves, which have no dependencies, are executed first.
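The recursive execution order can be sketched in Python as a post-order traversal. The network is assumed to be given as a dictionary mapping each node name to the names of its dependencies; the dictionary below is a hypothetical encoding of the network for ζ 1 .

```python
def execution_order(gdn, root):
    """Return the order in which nodes execute: dependencies before dependents."""
    order, seen = [], set()

    def execute(node):
        if node in seen:
            return
        seen.add(node)
        for dep in gdn[node]:
            execute(dep)
        order.append(node)      # executed only after all of its dependencies

    execute(root)
    return order

gdn = {
    "N1": ["U"],
    "U": ["not", "a1.2"],
    "not": ["a1.1"],
    "a1.1": ["N1.1"],
    "a1.2": ["N1.2"],
    "N1.1": [],
    "N1.2": [],
}
print(execution_order(gdn, "N1"))
# ['N1.1', 'a1.1', 'not', 'N1.2', 'a1.2', 'U', 'N1']
```

Although the traversal is started at the root, the leaves appear first in the resulting order.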
An execution of the temporal GDN is performed after each modification to the RTM H and nodes start the pattern matching effort in the surrounding of elements affected by the modification-see local search in Sect. 2.4. The pattern matching for a node is skipped if the modification does not concern the node or one of its dependencies.
We demonstrate an execution of a temporal GDN via an example based on the network constructed for query ζ 1 (see Fig. 9) and two RTM H instances: H [5] and H [7] in Fig. 5.
Execution over H [5] Each item in the following list marks the execution of a GDN node. Each GDN node execution leads to the creation of marking nodes whose duration depends on the type of the GDN node; the type of each GDN node is shown in parentheses next to the node name. The execution traverses the network and starts with node N 1.1 . It continues upward with nodes whose dependencies have been executed.

N 1.1 (VMR) No matches for n 1.1 are found.

α 1.1 (αMR) The vertices s and pm 1 are found but no marking nodes from its dependency N 1.1 . The marking node created by the αMR groups the duration of its dependency based on Eq. 8. Here, an empty duration is found, hence the empty interval is stored into its attribute d α 1.1 .

¬ (ZMR) The LHS of the rule is the same as n 1 . The node finds one match (vertices s and pm 1 ) and one marking node for its dependency, α 1.1 . The duration d ν of the node is computed according to Eq. 3: d ν = R \ d α 1.1 = R \ ∅ = R.

The negation node is one of the two dependencies of node U. Before proceeding with node U, the execution requires that the other dependency is executed as well, hence, via recursion, it proceeds with node N 1.2 .

N 1.2 (VMR) One match is found, containing s, pm 1 , d 1 , and the edges among them. One marking node is created with duration [5, ∞), which coincides with the temporal validity of the match computed according to Definition 2.

α 1.2 (αMR) One match for s and pm 1 is found as well as a marking node of type N 1.2 , i.e., the one created for node N 1.2 . One marking node is created whose duration aggregates the durations of its dependencies. Computed based on Eq. 8, the duration d α 1.2 is equal to [5, ∞).

Since both of its dependencies have been executed, the execution can now proceed with node U.

U (ZMR) The LHS of the rule is the same as n 1 . One match for s and pm 1 is found and two marking nodes: one for the left operand, i.e., the negation node, and one for the right operand, i.e., α 1.2 . For each interval in the duration of the marking node d α 1.2 it is checked whether it is adjacent to or overlapping with any of the intervals in the duration of the negation node. The interval R from d ν and the interval [5, ∞) from d α 1.2 are indeed overlapping. Therefore, the duration d U of the marking node created by the ZMR is computed according to Eq. 6, which returns [−55, ∞).

N 1 (VMR) An s and a pm 1 are matched. Their lifespan is [4, ∞). One marking node with duration d U exists; effectively, this duration is the satisfaction span of the ac of the query. The temporal validity of the match is the intersection of the lifespan of the match with the duration, i.e., [4, ∞) ∩ [−55, ∞) = [4, ∞).

The executed query returns a T which contains a match for n 1 associated with the interval [4, ∞) during which, besides being structurally present in the graph, the match satisfies the temporal requirements expressed by ψ 1 .
Execution over H [7] The event e 7 yields a new RTM H instance, i.e., H [7] (see Fig. 5). The event corresponds to the deletion of d 1 and the creation of pm 2 as well as an edge between pm 2 and s. InTempo is invoked and Execution executes the temporal GDN:

N 1.1 (VMR) No matches for n 1.1 are found.

α 1.1 (αMR) A new match for the kernel exists (the one involving pm 2 ) whose duration is also the empty interval. The match found in the previous execution remains unchanged.

¬ (ZMR) Recall that the node's LHS is the same as n 1 . A new match is found for s and pm 2 whose duration is computed to be R, similarly to the previous execution over H [5] . The match found in the previous execution remains unchanged.

N 1.2 (VMR) No new matches are found; however, the duration of the match already stored is updated to [5, 7).

α 1.2 (αMR) A new match with s and pm 2 is found for the kernel with an empty duration as no corresponding marking node exists. The duration of the match found in the previous execution, i.e., over H [5] , is re-computed to [5, 7) to reflect the change in the duration of its dependent marking node in the previous phase.

U (ZMR) A new match is found by its LHS. This match is associated, via the matches stored by the right operand for the until, with an empty interval. The computation of the duration for this match similarly yields an empty interval. The duration of one of the marking nodes associated with the match found in the previous execution has been updated: this triggers an update of the duration stored by the node for the match in question. The result of the re-computation is [−55, 7).

N 1 (VMR) A new match is found. The satisfaction span for this match is an empty interval and, hence, its temporal validity is also empty. The temporal validity of the previously found match for n 1 is re-computed to [4, 7).
InTempo returns a T which has been incrementally computed and contains one match for n 1 -the one involving pm 1 . The match involving pm 2 , although structurally present, does not satisfy the MTGC in ζ 1 -the match has an empty temporal validity and is hence excluded from T. The temporal validity of the match involving pm 1 has been updated to reflect the change to d 1 by e 7 .
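The durations computed in the two executions can be retraced with the following Python sketch. It is a simplification under stated assumptions: interval sets are lists of right-open intervals, the timing constraint of until has the form [0, d], and the lifespan [7, ∞) used for the match involving pm 2 assumes that pm 2 is created at time point 7. All function names are hypothetical.

```python
import math

def normalize(intervals):
    """Merge right-open intervals into a sorted list of disjoint intervals."""
    merged = []
    for a, b in sorted(i for i in intervals if i[0] < i[1]):
        if merged and a <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            merged.append((a, b))
    return merged

def intersect(xs, ys):
    return normalize((max(a, c), min(b, d)) for a, b in xs for c, d in ys)

def complement(xs):
    """Complement of an interval set with respect to R (negation)."""
    result, cursor = [], -math.inf
    for a, b in normalize(xs):
        if cursor < a:
            result.append((cursor, a))
        cursor = b
    if cursor < math.inf:
        result.append((cursor, math.inf))
    return result

def until(left, right, d):
    """Until with timing constraint [0, d] over right-open intervals."""
    pieces = list(right)                        # 0 in I: every i satisfies until
    for ia, ib in right:
        for ja, jb in left:
            lo, hi = max(ja, ia), min(jb, ib)   # j made right-closed, intersected with i
            if lo < hi or jb == ia:             # j overlaps i or ends exactly where i starts
                pieces.append((max(ja, lo - d), min(jb, hi)))
    return normalize(pieces)

def validity_of_n1(lifespan, d_alpha_11, d_alpha_12):
    d_not = complement(d_alpha_11)              # negation node
    d_until = until(d_not, d_alpha_12, 60)      # U node
    return d_until, intersect([lifespan], d_until)

inf = math.inf
print(validity_of_n1((4, inf), [], [(5, inf)]))  # H[5], match involving pm1
print(validity_of_n1((4, inf), [], [(5, 7)]))    # H[7], match involving pm1
print(validity_of_n1((7, inf), [], []))          # H[7], match involving pm2
```

The three calls yield the durations [−55, ∞), [−55, 7), and the empty set for the U node, and the temporal validities [4, ∞), [4, 7), and the empty set, in line with the walkthrough above.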

Constraining memory consumption of an RTM H
As noted previously, an RTM H maintains two views on the state of the modeled system. Similar to traditional RTMs, one view is of the current system state, as the RTM H contains all system entities present in the system. The other view, which extends traditional RTMs, is that of the entire history of the state. The view of the entire history is enabled by (i) creation (cts) and deletion (dts) timestamps (ii) featuring vertices which have been deleted from the modeled system, the deletion being reflected in the dts of the vertices in question. This wealth in insight comes with a price which in certain cases might be problematic. On the one hand, regulations originating in the context of the system, such as retention policies for privacy-sensitive patient data in healthcare, e.g., [85,86], may require that certain data has to be discarded after a certain period. On the other hand, remembering each system change causes the RTM H to constantly grow in size. The growth rate may need to be curbed to reduce the memory consumption or to avoid cluttering the model with obsolete data which may deteriorate performance of the pattern matching-as constantly more elements have to be considered.
For these cases, the remedy lies in knowing when and what to forget. In this section we present the functionality of InTempo which allows for the optional constraining of the history representation. This constrained representation contains all elements that are relevant to query executions and, compared to the unconstrained representation, may afford increased memory efficiency.

RTM H maintenance based on time window
In case the dts of an element is equal to ∞, the element is still present in the modeled system and thus, due to causal connection, may not be removed from the RTM H . This is not true for elements whose dts ≠ ∞, as those have been removed from the modeled system. Nevertheless, because of the presence of temporal operators and their timing constraints, elements might be relevant to query executions for a certain period even after they have been deleted from the modeled system. When this period elapses, deleted elements may be considered for removal from the RTM H .
For example, let ζ 2 := (n 1.1 , ψ 2 ) be a query with ψ 2 := ♦ [2,5] ∃ n 1.2 ; as for ζ 1 , it is assumed that the system tracks time in seconds, hence time units represent seconds. Assume that ζ 2 is executed over H [7] in Fig. 5. The vertex d 1 has been deleted yet it is relevant to the execution of the query. Given the timing constraint of the temporal operator in ζ 2 , the same would hold for an H [10] : although deleted, d 1 might still be involved in the query execution. The involvement of d 1 is final at a time point τ , only for RTM H instances whose associated timestamp exceeds the sum of the right end-point of the timing constraint of ♦ [2,5] and τ .
Given a (finite) set of MTGCs Ψ , we determine the time window W, i.e., the period for which the involvement of deleted elements in query executions is not final, as follows. First, we compute w for each MTGC ψ ∈ Ψ based on the right end-points of the timing constraints of its temporal operators. For Ψ , the time window is given by:

$W = \max_{\psi \in \Psi} w(\psi)$

If the constrained representation option is enabled, InTempo performs the computation of W during Operationalization. Maintenance is performed after Execution and resembles garbage collection, i.e., it prunes all deleted elements in an RTM H whose dts lies before a certain threshold. Maintenance yields a pruned RTM H , denoted by P and defined practically as follows.
However, to ensure correctness of results, the threshold also covers a preceding period, i.e., τ i−1 −2W, which is motivated in the following section.
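The interplay of the time window and the pruning threshold can be sketched in Python. The sketch rests on two assumptions that go beyond the text: w of an MTGC is simplified to the largest right end-point among its temporal operators (nested operators are not treated), and an element is kept exactly if its dts is larger than or equal to the threshold τ i−1 − 2W. Function names and sample values are hypothetical.

```python
import math

def time_window(right_endpoints_per_query):
    """W as the largest w over all queries.

    Assumption: w of a query is the largest right end-point of its operators.
    """
    return max((max(rs, default=0) for rs in right_endpoints_per_query), default=0)

def prune(elements, tau_prev, window):
    """Keep elements whose dts is not before the threshold tau_prev - 2 * window.

    `elements` maps an element name to its (cts, dts) pair; elements that are
    still present have dts = math.inf and are therefore never pruned.
    """
    threshold = tau_prev - 2 * window
    return {name: (cts, dts) for name, (cts, dts) in elements.items() if dts >= threshold}

w = time_window([[5]])                      # a single operator with right end-point 5
model = {"s": (4, math.inf), "d1": (5, 7)}
print(w, prune(model, tau_prev=20, window=w))  # 5 {'s': (4, inf)}
```

With an unbounded operator, i.e., a right end-point of ∞, the threshold becomes −∞ and nothing is pruned in this sketch.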

Projected answer set and intended setting
An H [τ ] constitutes a complete representation of a history of events h τ , i.e., it contains all changes from the beginning of the history up to and including the time point τ . The removal of elements from P [τ ] renders the representation of h τ partial. In the following it is shown that, over the course of a history, i.e., the sequence of events, all changes to the temporal validity V of a match can be detected in both representations. However, the V returned by constrained representations is restricted to a certain period to correspond to the representation being partial. This restriction is captured by a projected answer set T π . Intuitively, a T π projects the V of a match at τ i , i.e., V τ i , on a period in which V might have changed (captured by W) since the previous event at τ i−1 .

Fig. 10 Projected temporal validity of a match obtained for two consecutive time points τ i and τ i+1

Definition 4 (projected answer set T π ) Given an answer set T τ i at time point τ i , the projected answer set T π τ i contains each match (m, V) of T τ i with V restricted to the interval [τ i−1 − W, τ i ].

For a match (m, V), all modifications corresponding to an event at τ i+1 that could have affected its V concern time points in the interval [τ i − W, τ i ]; T π τ i+1 contains this interval as well as the interval [τ i , τ i+1 ]. Therefore, with respect to τ i+1 , all time points in V before τ i − W are final, i.e., no modification after τ i+1 could affect them. This fact also motivates the pruning threshold: deleted elements are kept in P as long as their dts is larger than or equal to the earliest time point for which a modification could affect the earliest time point for which V is not final. See Fig. 10 for an illustration. The T π τ i (the dotted grid) returns a match (m, V τ i ) where V τ i contains all those time points which are not final with respect to τ i . Analogously, so does T π τ i+1 (in gray) with V τ i+1 . At τ i+1 however, all time points before the pruning threshold τ i − 2W (that were non-final in T π τ i ) have become final.
An aggregation of T π τ i and T π τ i+1 would still contain final time points in V τ i even if the pruning of an element at τ i+1 would cause these time points to be excluded from V τ i+1 .

Theorem 2 Let h τ be a history, D be the last index of h τ , and ζ := (n, ψ) with ζ ∈ L T . Moreover, let H , P be a complete and a pruned RTM H , respectively. Over the history, T π (H ) is equal to T π (P).

Proof (sketch) By induction over D. See Sect. B.2 for the proof.
Since the history captured in P is partial, queries over P can compute the temporal validity of matches only for a restricted period of time-captured by the projected answer set. This is in contrast to queries over an un-pruned RTM H H where the temporal validity of matches is computed for the entire history. The loss of information in P compared to H is a trade-off for the potential of increased memory-efficiency and faster query execution times over P. Moreover, the deletion of elements may result into matches being removed from P; therefore, P is intended for use-cases where matches are only relevant for a short period of time after being returned, e.g., in self-adaptation where a match constitutes an adaptation issue that is to be fixed as soon as possible.

Maintenance with dynamic sets of queries
InTempo seamlessly supports non-fixed, i.e., dynamic, sets of queries over a complete RTM H . The following functionality enables their support by a pruned RTM H .
Upon an invocation of InTempo, the Operationalization checks whether the set of queries has been altered and, if yes, re-computes W. If W has not changed or has decreased, the invocation proceeds as usual by executing the queries. Otherwise, if W has increased, the derivation of a complete answer set for the incoming queries cannot be ensured for a period of time that is equal to the difference between the newly computed value of 2W and the previous one. In this case, the queries for which a complete T cannot be guaranteed are not admitted for execution until the length of the history represented by the RTM H suffices for their T.
For example, assume InTempo is first executed for the previously introduced query ζ 2 with Ψ = {ψ 2 } where W = 5. Later on, the query ζ 1 is added to the input queries which leads to Ψ′ = {ψ 1 , ψ 2 }. The time point of addition is marked by τ of the RTM H H [τ ] . The alteration induces a re-computation of W to W′ = 60. The value has increased; therefore, based on τ as well as the difference 2W′ − 2W, ζ 1 is admitted for execution only when an RTM H H [τ′ ] is induced with τ′ − τ ≥ 2W′ − 2W, i.e., in this case, at least 110 time units after its addition.
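The admission rule can be expressed as a short Python sketch; the function name `admission_time` and the time point 100 chosen for the addition of the query are hypothetical.

```python
def admission_time(tau_added, window_old, window_new):
    """Earliest time point at which a newly added query may be executed.

    If the window did not grow, the query is admitted immediately; otherwise
    it waits for the difference between the new and the old value of 2W.
    """
    if window_new <= window_old:
        return tau_added
    return tau_added + 2 * window_new - 2 * window_old

# The window grows from 5 to 60 at the (assumed) time point 100.
print(admission_time(100, 5, 60) - 100)  # 110
```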

Considerations
In most cases, a pruned RTM H would be more memory-efficient than a complete one. In fact, against a constant rate of incoming events, pruning would yield an RTM H whose memory consumption would be bounded. However, if the event rate does not have a fixed upper bound, the memory consumption, although mitigated by pruning, would still increase over time.
In the case of MTGCs with an unbounded past operator, i.e., with r (I ) = ∞, an RTM H cannot be pruned, since the temporal requirement refers to the entire history. In the case of an unbounded future operator, the MTGC may be non-monitorable [see 87], i.e., the satisfaction of the MTGC may depend on the entire infinite future of the execution, so a verdict can never be provided. Monitoring such formulas is achieved via a practical solution, e.g., the replacement at runtime of the right end-point of the operator with the largest time point of the sequence, which affords a verdict on the satisfaction at any point of monitoring.
Pruning may reduce the size of the matching search space and thus may improve the pattern-matching time of the Execution operation. On the other hand, Maintenance performs two more tasks that should be taken into consideration with respect to the overall performance. First, it uses a priority queue of deleted elements in the RTM H and iteratively polls the queue to detect and prune elements whose dts exceeded the threshold. Second, every time an element is pruned, Maintenance re-computes the matches maintained by the temporal GDN to detect whether pruned elements have affected any matches and, if so, ensure that affected matches are recomputed based on the latest RTM H .
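The first Maintenance task, polling a priority queue of deleted elements, can be sketched as follows. This is an illustrative sketch and not the InTempo implementation: the model is reduced to a set of element identifiers, and the pruning condition (the deletion timestamp lying more than `threshold` time units before `now`) is an assumption made for the example.

```python
import heapq


def prune(deleted_queue, model, now, threshold):
    """Remove elements whose deletion timestamp (dts) is older than the threshold.

    deleted_queue: heap of (dts, element_id) pairs, smallest dts first.
    model: set of element ids standing in for the runtime model (assumption).
    Returns the ids of the pruned elements, in pruning order.
    """
    pruned = []
    # The heap root always holds the oldest deletion, so polling stops at
    # the first element that is still within the threshold.
    while deleted_queue and now - deleted_queue[0][0] > threshold:
        _, element_id = heapq.heappop(deleted_queue)
        model.discard(element_id)
        pruned.append(element_id)
    return pruned


queue = []
for dts, element_id in [(5, "probe1"), (50, "service1"), (8, "probe2")]:
    heapq.heappush(queue, (dts, element_id))
model = {"probe1", "probe2", "service1", "pm1"}
removed = prune(queue, model, now=130, threshold=120)
```

A non-empty result would then trigger the second task, i.e., re-computing the matches that involve the pruned elements.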

Runtime monitoring with temporal graph queries
InTempo processes a sequence of events which represents an ongoing system behavior and, during query execution, checks whether the observed sequence (captured in the RTM H ) satisfies a temporal logic formula (captured in the MTGC of the query). This functionality resembles the monitoring approach known as Runtime Verification (RV) [see 7]. In this section, we discuss the application of InTempo for RV.

Temporal graph queries for temporal properties
RV represents the system behavior by sequences of states or events at some level of abstraction. An online algorithm is then used to check whether (some prefix of) the sequence satisfies a given property [9], i.e., a formal statement. RV focuses on the verification of temporal safety properties [1], i.e., statements of the form "something bad should never happen" which in temporal logic is expressed by prefixing each formula with the always operator [62], i.e., the abbreviation of ¬♦ [0,∞) ¬ψ in MTGL with ψ an MTGC. RV searches for violations of such properties which, on a finite sequence, can always be detected as soon as they occur [72].
In the context of graphs, a temporal safety property which contains no temporal operators corresponds to a graph query in L (see Sect. 2.4). The verification of such properties corresponds to updating a host graph based on a sequence of events and, upon each update, searching the graph for matches of the query pattern [24]. If a match is found, then the property is violated.
Properties with temporal operators require the tracing of a pattern in the host graph over time, i.e., over multiple updates. By the incorporation of a temporal logic such as MTGL into L T and the consolidation of updates into an RTM H , temporal graph queries in L T are capable of specifying temporal properties where any matches in an RTM H returned by InTempo constitute violations.
As an example, recall the query ζ 1 := (n 1 , ψ 1 ) with ψ 1 := ¬∃ n 1.1 U [0,60] ∃ n 1.2 . In order for matches of ζ 1 to return the periods for which there is a violation, that is, there are admitted patients that are not prepared for treatment within the designated time or, until they are prepared, they are mistakenly re-triaged, the MTGC of the query needs to be negated, i.e., ζ 1 := (n 1 , ¬ψ 1 ). Executed over H [7] (Fig. 5), ζ 1 returns one match: the match involving pm 2 , which is associated with [7, ∞). The duration of until is empty, its negation is R, and hence the temporal validity V is computed by [7, ∞) ∩ R. Since V is non-empty, it can be inferred that the procedure for the patient with pID=2, i.e., pm 2 , does not conform to the guideline. Moreover, the procedure is violated from time point 7 onward.
The example above demonstrates an advantage of InTempo over traditional RV solutions, as RV solutions typically return a value from the Boolean domain, e.g., true or false, or, in cases with enhanced expressiveness, some value related to the Boolean domain, e.g., true, probably true, false [12]. Conversely, owing to the computations in Definition 1, InTempo returns the duration for which the match is valid, i.e., the violation occurs, which is arguably a more intuitive monitoring outcome and can be further utilized by the system.
A temporal graph query can be obtained for all MTGCs. For an MTGC ψ where the topmost condition is not an existential quantification, a query can be obtained by wrapping ψ in a query looking for an empty pattern, e.g., (∅, ¬ψ). An answer set for (∅, ¬ψ) would consist of an empty match and a temporal validity which would mark the duration for which ψ is violated.

Postponing a decision
In the previous example, where ζ 1 is executed over H [7] , InTempo returns a match for pm 2 , i.e., a violation, although, based on the interval of ♦ [0,60] , the object may potentially satisfy ψ 1 in the future; for example, an addition of an instance of DrugService with the appropriate pID could occur in the next few seconds.
For H [7] , the scheme makes a decision based on what has been already observed. Typically, in RV, a decision is not made unless conclusive, i.e., there is no possible future that could satisfy the property. A given RTM H is sufficient for making a conclusive decision for a temporal property which concerns the past. This is not the case for temporal operators which concern the future where, in order to be conclusive, the decision would have to be postponed. This practice may be particularly useful in systems where humans interoperate with a software system (as in the SHS case-study) and the system acts as a safety net, i.e., only if other planned actions have not been taken within a certain period.
In this section, we extend Execution to return only those matches whose temporal validity has become final since the last time the operation was performed. This functionality is enabled by equipping Execution with a predicate defined as follows.
Definition 5 (P) Let (n, ¬ψ) be a query with n a pattern and ψ an MTGC. Moreover, let W be the time window of ψ and V(m, ¬ψ) be a non-empty temporal validity of a match m for n in H [τ ] with τ ∈ T. Then, the predicate P m,τ is defined as:
Intuitively, a true P m,τ i means that there is a time point or more for which the temporal validity has become final with respect to τ i . The Execution operation checks all matches with a non-empty V and returns only those for which P m,τ i is true. A decision on time points for which V is not final is postponed for the next event.
Equipped with P m τ i , an invocation of InTempo for H [7] finds the match involving pm 2 and the associated interval [7, ∞). However, the predicate for the match is computed to be false, i.e., intuitively, ψ 1 could be satisfied in the future of H [7] , and thus InTempo does not return the match. Had there been no drug service added for that patient in future instances of the RTM H , the match would have started being returned in all RTM H instances H [τ ] with τ ≥ 67.
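The behavior of the predicate in this example can be mimicked by a small Python sketch. The formula used here is an assumption calibrated only to the example above and is not taken from Definition 5: the predicate is taken to hold when the validity contains a time point t with t + W ≤ τ. The validity is represented as a list of intervals with closed left endpoints, and all names are hypothetical.

```python
import math


def validity_is_final(validity, window, tau):
    """Assumed reading of the predicate: some time point t in the validity
    lies at least `window` time units before `tau`.

    validity: list of (start, end) intervals with closed left endpoints.
    For each interval, the earliest candidate time point is its start.
    """
    return any(start + window <= tau for start, _end in validity)


# The match for pm2 from the example: validity [7, inf), window W = 60.
pm2_validity = [(7, math.inf)]
```

Under this reading the match is withheld at τ = 7 and first returned at τ = 67, which agrees with the example.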
The predicate may impose a delay on the detection of some violations which, however, is inherent in reactive monitoring and, in practice, it is often handled by an appropriately timed periodic event generated by the monitor.
In the context of the SHS and similar systems, this functionality enables waiting for a planned action to be taken by a clinician and only detecting a violation after said period has elapsed. The actual period can be set such that the detection serves as a warning, i.e., it anticipates an omission and allows for some time where the action can still be taken, either by clinicians or smart devices of the SHS.

Application scenario: runtime adaptation
This section applies InTempo in a self-adaptation scenario which utilizes temporal graph queries and the history-awareness of an RTM H .

Self-adaptation based on RTMs
Self-adaptive systems are able to modify their own behavior or structure in response to their perception of their context, the system itself, and their requirements [31]. Self-adaptation can be generally achieved by adding, removing, and re-configuring components as well as connectors among components in the system architecture [77]; therefore, the architecture view is typically considered an appropriate abstraction level, e.g., [46,47]. Causal connection renders RTMs a natural choice for representing the system architecture, as adaptations can be realized as changes on the RTM which are subsequently mirrored in the system [see 107]. An established method of instrumenting self-adaptation is to equip the system with an external feedback control loop, i.e., an adaptation engine. An established reference model for the design of an adaptation engine is the MAPE-K feedback loop [66]. The MAPE-K loop monitors and analyzes the system and, if needed, plans and executes an adaptation of the system, where the adaptation is defined in terms of architecture changes. All four MAPE activities (Monitor, Analyze, Plan, and Execute) are based on knowledge. The feedback loop maintains an RTM as part of its knowledge to represent the current state of the architecture. Thus, the activities of the MAPE-K feedback loop operate on the RTM to perform self-adaptation.
A self-adaptive system can be adapted via adaptation rules. Adaptation rules represent fine-grained units of change that can be performed on the underlying system. The execution of an adaptation rule adapts the system from its current state to a new state. Adaptation rules are of the form: if condition then action. A condition checks whether an adaptation issue is present, whereas an action describes a desired adaptation. If the condition is met, the action is taken. The feedback loop captures changes (during Monitor); checks whether changes cause an adaptation issue (during Analyze); and, if the condition is satisfied, plans and executes an adaptation action (during Plan and Execute, respectively) [73].
Fig. 11 Overview of InTempo and system interaction for adaptation
The graph-based encoding of RTMs allows for a realization of adaptation rules in form of graph transformation rules where adaptation issues are expressed via graph patterns which in turn characterize graph queries, i.e., the LHS of the rule. Graph queries are executed during analysis and the RTM is adapted via in-place graph transformations based on the RHS of the rule [47].

History-aware self-adaptation via INTEMPO
An RTM H captures the current state of an RTM as well as temporal information on when changes to said RTM occurred. Furthermore, temporal graph queries are capable of formulating adaptation rules whose conditions include temporal requirements on the history of the system structure. By replacing the RTM in a MAPE-K loop with an RTM H , InTempo can be used to execute temporal graph queries which, in this context, correspond to checking adaptation conditions, over a sequence of events which represent changes on the architecture. The query answers can be used to plan and execute adaptations. Therefore, InTempo may serve as the basis for an adaptation engine that enables history-aware self-adaptation (see Fig. 11). Figure 12 depicts a detailed view on an adaptation based on the MAPE loop. The engine operates in two phases: the setup and the (self-adaptation) loop. Operationalization of InTempo (the trapezoid shape containing "O" in Fig. 12) is performed during setup. Execution ("E") is performed in the Analyze activity. The engine features a Maintain activity, an extension compared to a canonical MAPE loop, during which Maintenance of InTempo ("M") is performed.

Self-adaptation scenario for SHS
In the following, we build on the SHS introduced in Sect. 2.1 to envisage a (self-)adaptation scenario that enacts a medical instruction. The instruction imposes temporal requirements on the operation of the SHS which are checked and enforced by the five activities of the adaptation loop described below. The scenario is based on the medical guideline on the treatment of sepsis [78,89], a potentially life-threatening condition. We focus on the basic instruction that reads: "between ER Sepsis Triage and IV Antibiotics should be less than 1 hour" [78]. Note that, from the point-of-view of the system, this timing constraint in the instruction is soft, as medical guidelines often provide contingency plans in case a deadline is inadvertently missed, e.g., [110, p. 11]. Therefore, a system adaptation could occur after the deadline is missed and still remedy the situation.
The event log in [78] contains real medical records from a hospital where patients diagnosed with sepsis were treated according to the guideline. The log contains a multitude of events. We focus on those that correspond to actions prescribed in the guideline, i.e., ER Sepsis Triage, IV Antibiotics, and Release, which correspond to a patient being triaged as an emergency sepsis case, an intravenous (IV) administration of antibiotics for sepsis, and a patient being released from the emergency ward, respectively. Events in the log are timestamped. Based on the SHS metamodel (Fig. 2) and the available hospital log, we envisage the procedure described in the guideline enacted by the SHS.
In detail, an ER Sepsis Triage event is simulated as a Probe with status sepsis, generated from an instance of PMonitoringService pm which has been invoked by an SHSService s. An IV Antibiotics event is simulated as a Probe with status anti from a DrugService d which has also been invoked by s. The patterns capturing the occurrence of these events in our SHS are depicted in Fig. 13. To make sure these two actions are referring to the same patient, g 1.1 contains an attribute constraint (in braces) that checks whether the pID of d and pm are equal.
Based on g 1 and g 1.1 in Fig. 13, the instruction is formulated in L T by the query MG1 := (g 1 , ¬ψ MG1 ) with ψ MG1 the MTGC ♦ [0,3600] ∃ g 1.1 . That is, the query searches for matches of g 1 , which identifies a (previously untreated) patient with sepsis, for which, in the next hour, there is no match for pattern g 1.1 , which identifies the administration of antibiotics to the same patient. The binding of elements in g 1.1 from g 1 is illustrated in Fig. 13 by using the same labels for vertices, e.g., for pm. The system is assumed to track time in seconds.
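To make the intent of MG1 concrete, the following Python sketch checks the one-hour requirement over flat event lists. It is a deliberate simplification and not the query semantics of L T : it ignores the graph structure and element lifespans, evaluates the requirement only at the time of the sepsis probe, and reports a violation only once the deadline has passed. The tuple layout and function name are hypothetical.

```python
def mg1_violations(sepsis, antibiotics, now, deadline=3600):
    """Return (pID, t) for sepsis probes at time t without an antibiotics
    probe for the same pID within [t, t + deadline].

    sepsis, antibiotics: lists of (pID, timestamp_in_seconds) pairs.
    Only violations whose deadline has elapsed by `now` are reported.
    """
    given = {}
    for pid, t in antibiotics:
        given.setdefault(pid, []).append(t)

    violations = []
    for pid, t in sepsis:
        treated = any(t <= ta <= t + deadline for ta in given.get(pid, []))
        if not treated and now >= t + deadline:
            violations.append((pid, t))
    return violations


sepsis_probes = [(1, 0), (2, 100), (3, 200)]
antibiotic_probes = [(1, 1800), (2, 4000)]
late = mg1_violations(sepsis_probes, antibiotic_probes, now=4000)
```

In this toy input, patient 1 is treated in time, patient 2 is treated too late, and patient 3 is never treated, so only the latter two are reported.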
In order to challenge our scheme with a more complicated scenario, we also search for violations of a variation of MG1, namely, that no patient with sepsis should be released prior to being treated. This requirement is formulated by the query MG2 := (g 1 , ¬ψ MG2 ) with ψ MG2 := ¬∃ g 1.2 U [0,3600] ∃ g 1.1 ; see Fig. 13. We describe a desired adaptation loop for the SHS according to these instructions.
Monitor During the monitoring activity, the recent events (new readings captured by Probes since the last invocation of the loop) together with their cts and dts values are reflected in the RTM H , which is an instantiation of the Architecture. Therefore, the RTM H is updated to represent the current system structure enriched with the relevant temporal data.
Analyze The activity detects adaptation issues. In this context, these are captured by violations of ψ MG1 , i.e., the existence in the RTM H of structural patterns that reflect sepsis cases (g 1 ) without associated antibiotics (g 1.1 ) within one hour or, similarly, for ψ MG2 . Hence we execute (separately) the queries MG1 and MG2. Note the similarity with the monitoring of temporal properties in Sect. 7.1. Technically, the detection is based on the execution of the temporal GDN which has been obtained by Operationalization during the setup of the engine. In the context of an SHS, self-adaptation only supports the medical procedure, i.e., it first waits for clinicians to perform the actions in the guideline. Only when there is no more time is self-adaptation enabled. Thus, InTempo is equipped with the P predicate in Definition 5 to return only matches whose temporal validity is final.
The matches returned by InTempo in this activity constitute adaptation issues, and similar to [47], adaptation-related types (Annotation in Fig. 2) are used to facilitate the adaptation. During Analyze, the PMonitoringService instance which has been involved in the detection of an adaptation issue is annotated with an Issue instance. Therefore, to ensure that only new violations are matched, g 1 is extended to check that no instance of Issue is associated with the matched instance of PMonitoringService. Issue instances, as well as instances of other adaptation-related nodes in Fig. 2, are created by transformation rules which respect the RTM H and are, therefore, capable of setting their cts and dts appropriately.
Plan and Execute In planning, the engine searches for sepsis Probes annotated with an instance of Issue. Upon finding them, it attaches an Effector to the service to which the Probe instance is attached. The Execute activity of the loop searches for effectors and, upon finding them, takes an adaptation action, i.e., administering antibiotics to the patient via a DrugService instance. This adaptation action is also reflected in the RTM H by creating an AdaptationAction instance which is associated with the handled Issue instance.
Maintain This activity is optional. If enabled, Maintenance uses the time window obtained by Operationalization during setup and prunes the RTM H , i.e., it removes all elements that have been deleted and whose involvement in query executions is final. Following the removal of elements, the GDN is re-executed to update matches.

Evaluation
This section presents an implementation of InTempo which is evaluated based on a simulation of the SHS case-study and a benchmark for graph-based technologies which simulates the operation of a social network. Implementation details are presented in Sect. 9.1. In Sect. 9.2, the implementation is integrated in an adaptation engine, as shown in Fig. 12, and evaluated on the SHS adaptation scenario presented in Sect. 8.2, where adaptation issues are detected by the execution of temporal graph queries over real and synthetic data. Query execution times are compared to an RV tool and a time-aware model indexer in Sect. 9.3. In Sect. 9.4 we evaluate the querying performance of the implementation with data of different sizes, generated by the LDBC Social Network Benchmark. We discuss the results in Sect. 9.5 and present threats to their validity in Sect. 9.6.

Implementation
Our implementation of InTempo is based on the Eclipse Modeling Framework (EMF) [36,101], which is a widespread MDE technology for creating software systems. For pattern matching, we use the Story Pattern Matcher [48] using the search plan generation strategy presented in [3]. The Matcher uses local search to start the search from a specific element of the graph and thus reduces the pattern matching effort [64]. It uses an OCL checker for checking attribute constraints. For computations on intervals we use an open-source library [52]. For the removal of elements from the runtime model, we transparently replace the native EMF method, via a Java agent, with an optimized version which reduces the potentially expensive shifting of cells in the underlying array list and renders the removal more efficient. The implementation is available as an EMF plugin in [81]. The plugin relies on two domain-specific languages for the generation of an event mapping (see design-time artifacts in Sect. 3) and the specification of temporal graph queries in L T . For the evaluation, we developed two variants based on the plugin: IT, with pruning disabled, and IT +P , with pruning enabled.

Runtime adaptation of smart hospital system
We developed a simulator of the adaptable SHS presented in Sect. 8. The simulation replays events on an RTM H based on the real and synthesized event logs described in Sect. 9.2.1. After each event, the temporal graph queries MG1 and MG2, described in Sect. 8.2, are executed. Matches constitute adaptation issues which are resolved by appropriate modifications to the RTM H .

Input logs
The log used in our experiments (in the following, real log) contains 1049 trajectories of sepsis patients admitted to a hospital within 1.5 years [78]. Each trajectory comprises a sequence of events. The events that are relevant to the case-study are ER Sepsis Triage (ER), IV Antibiotics (IV), and Release (RE) events. A trajectory starts with an ER event, and IV and RE events might follow. The inter-arrival time (IAT) between two ER events defines the arrival rate of trajectories. We used statistical probability distribution fitting to find the best-fitting distribution that characterizes the inter-arrival times between: two ER events (IAT T ), an ER and an IV (IAT S A ), and an ER and an RE (IAT S R ). Then, we used statistical bootstrapping [33] to generate two synthetic logs, x10 and x100, with IAT T values that are 10 and 100 times smaller, respectively, than IAT T values of the real log. IAT S A and IAT S R remain as in the real log.
As a result, x10 and x100 cover the same period of time as the real log, and increase the trajectory density (approx.) 10 and 100 times, respectively, allowing us to test the scalability of InTempo without altering the statistical characteristics of the real log. The logs and a detailed description of the statistical methods employed are available in [93].
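The effect of scaling inter-arrival times can be illustrated with a simplified Python sketch. It is a stand-in for, not a reproduction of, the procedure documented in [93]: it skips distribution fitting and simply resamples the observed inter-arrival times with replacement, divides each by the scaling factor, and generates proportionally more arrivals. The function name and the seed parameter are hypothetical.

```python
import random


def synthesize_arrivals(real_iats, factor, seed=0):
    """Bootstrap-style sketch: resample inter-arrival times with replacement,
    shrink them by `factor`, and emit `factor` times as many arrivals so the
    synthetic log spans roughly the same period as the original.
    """
    rng = random.Random(seed)  # fixed seed for reproducibility of the sketch
    sampled = rng.choices(real_iats, k=len(real_iats) * factor)
    arrivals, t = [], 0.0
    for iat in sampled:
        t += iat / factor
        arrivals.append(t)
    return arrivals


x10_arrivals = synthesize_arrivals([100.0, 200.0, 300.0], factor=10)
```

Because every scaled inter-arrival time is ten times smaller while ten times as many are drawn, the density of trajectories increases roughly tenfold over a comparable time span.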
Each event in the logs corresponds to the creation of certain elements in the RTM H . In order to evaluate the performance of pruning, however, we required that the lifespans of these elements have an end, i.e., that their dts is set. This information is not provided in the original log. Therefore, for each created element, we defined an interval after which a delete event was injected in the logs. The intervals for Probe and Service instances are 10 seconds and one hour, respectively. The logs that contain the deletions are available in [96]. An overview of the logs is shown in Table 1. As shown by the Deleted column, all created elements are eventually deleted.
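The injection of delete events can be sketched in Python as follows. The event tuple layout and the two type labels are hypothetical simplifications; only the two lifespans, 10 seconds for probes and one hour for services, come from the text.

```python
# Lifespans in seconds; the keys are simplified type labels (assumption).
LIFESPAN_SECONDS = {"Probe": 10, "Service": 3600}


def inject_deletions(events):
    """Add a delete event for every create event, offset by the type's lifespan.

    events: list of (timestamp, kind, element_type, element_id) tuples.
    Returns all events ordered by timestamp (ties keep insertion order).
    """
    result = list(events)
    for ts, kind, element_type, element_id in events:
        if kind == "create":
            deleted_at = ts + LIFESPAN_SECONDS[element_type]
            result.append((deleted_at, "delete", element_type, element_id))
    return sorted(result, key=lambda event: event[0])


log = [(0, "create", "Service", "pm1"), (0, "create", "Probe", "p1")]
log_with_deletions = inject_deletions(log)
```

Every created element receives exactly one matching delete event, so all elements in the resulting log are eventually deleted.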

Experiment design
We integrated each variant in an adaptation engine. We denote this integration by an arrow circle: IT includes the Monitor, Analyze, Plan, and Execute activities (in terms of InTempo, the operations Operationalization and Execution), and IT +P includes all activities above plus Maintain, i.e., Operationalization, Execution, and Maintenance of InTempo. See Fig. 12 for an overview. The experiments simulate the events in the real, x10, and x100 logs. Each experiment entails the execution of one variant for one query, either MG1 or MG2. We measure IT and IT +P with respect to their reaction time (or full loop time). In this context, the reaction time is equal to the required time for one execution of the adaptation loop, i.e., the time from when an issue is detected to when a corresponding adaptation action has been performed. Thus the reaction time consists of times for Analyze, Plan, Execute and, for IT +P , Maintain. The time required for Analyze is effectively the query execution time. Time measurements are used to assess the requirement for fast query executions (R3). In each adaptation loop, we measure the memory consumed by the variants based on the values reported by the JVM. Memory consumption is used to assess the requirement for memory-efficiency (R4). The time spent in Monitor, i.e., processing an event, is negligible and thus not reported.
A loop is invoked periodically based on a predefined but modifiable frequency. In our experiments, based on the IAT T of the logs, we set the invocation frequency to one hour, i.e., 3600 seconds, to avoid frequent invocations where no events are processed. The invocation frequency coincides with the maximum delay of a violation detection, i.e., in the worst case, a violation will occur at the first second after the loop and will be detected at the next invocation which in this case is after almost one hour. Operationalization, which produces the temporal GDN as well as the time window utilized by Maintenance of IT +P , is executed only once during the setup of the loop.
Each experiment is measured for either time or memory and proceeds as follows. First, during the Monitor activity, events from the logs are processed. Each log event corresponds to certain modifications to the RTM H : an ER Sepsis Triage event corresponds to the addition of a PMonitoringService and a Probe instance with status set to sepsis to the model; an IV Antibiotics event to the addition of a DrugService instance and a Probe instance with status anti; a Release event is similar to the ER Sepsis Triage, except the status is set to release. The loop is invoked at the predefined intervals, triggering the Analyze activity which executes the query. Matches constitute adaptation issues. During Plan and Execute, transformations are performed which correspond to adaptation actions. Finally, for IT +P , Maintain is performed and matches are recomputed. As expected, the results are mainly influenced by the Analyze activity, which is when issues are detected, i.e., queries are executed. The number of processed events in the experiments with the log files real, x10, and x100 grows steadily (see Sect. 9.2.1). For IT , this increase is mirrored in the size of the RTM H . The Analyze time of IT increases with respect to these two parameters. The growth of the RTM H can also be seen in Table 3, where the maximum memory measurement is reported for both variants. Contrary to IT , the pruning in IT +P minimizes the size and therefore the memory consumption of the RTM H . In fact, because the rate of created elements per period in each log does not increase and, over time, it is almost equal to the rate of deleted elements, the memory consumption over real, x10, and x100 remains unchanged.

Results
Owing to pruning, the Analyze time of IT +P increases at a considerably smaller pace compared to IT . Note that pruning forces a re-computation of the results. Therefore, as shown in Figs. 14 and 15, the time it requires is non-negligible. Figure 16 shows the time for Analyze for each loop of the two variants for the x100 log (in logarithmic scale). The pruning of RTM H enables the analysis time of IT +P to remain constant.

Comparison to state-of-the-art
During the Analyze activity, Execution, i.e., the operation of InTempo, finds matches by checking whether a temporal property, i.e., an MTGC, is violated by the RTM H . This functionality resembles the objective of RV, where an online algorithm verifies whether a sequence of events representing a system execution violates a temporal property. The algorithm is required to maintain an (internal) representation of the history of the execution, similar to the RTM H . We therefore use the state-of-the-art RV tool MonPoly to acquire a baseline for the performance of IT and IT +P in detecting issues during the activity.
RV tools are typically not intended for usage with structural models. In MDE, storing a structural model, i.e., a graph, may be achieved by a graph database. However, only few graph databases support a notion of time in the representation, the query specification, and query execution, e.g., the database presented in [59]. None of the time-aware databases supports incremental execution of queries with complex patterns, which renders them sub-optimal for an online setting where the result set is updated after each change to the model. Hawk is a model indexer, i.e., a solution which monitors file-based repositories such as Git or SVN, stores models or model elements of interest to a database, and maintains an index, i.e., an efficient representation, of the model evolution which is amenable to model-element-level querying [4]. Hawk has been recently extended to integrate the time-aware database from [59] and to support temporal primitives which can be used to query the history of a model directly. Using these primitives, we formulate the queries MG1 and MG2 and compare the performance of Hawk to that of InTempo.

Runtime verification with MONPOLY
MonPoly [9,10] is a command-line tool which notably combines an adequately expressive specification language with an efficient incremental monitoring algorithm. It has been the reference in evaluations of other RV tools [34,60] and among top-performers in an RV competition [8]. Its specification language is based on the Metric First-Order Temporal Logic [9] (MFOTL) which uses first-order relations to capture system entities and their relationships. MonPoly processes a sequence of timestamped events, maintains an internal representation of the system execution, and checks whether it violates a given formula. Unlike an RTM, representations in RV tools are created ad hoc and they are pruned by default, i.e., they contain only the data that is relevant to the formula and has not been checked yet.
The semantics of MFOTL are point-based, i.e., the logic assesses the truth of a formula only at the time points of the events in a sequence and not for the entire time domain as interval-based logics such as MTGL. Therefore, a result in MFOTL is not accompanied by a temporal validity as in InTempo. Furthermore, for certain types of formulas, point-based semantics may yield counter-intuitive results which disagree with interval-based semantics; on the other hand, point-based semantics may allow for more efficient monitoring algorithms compared to those based on interval-based semantics [see 11]. Although the difference in interpretations may affect more extensive evaluations, it does not affect the conditions of the queries discussed in Sect. 8.3.
Encoding a graph pattern in MFOTL requires an explicit definition of the expected (temporal) ordering of the events that corresponds to the order of creation of the elements in the simulation. To emulate pattern matching, we would therefore have to specify an MFOTL formula that would consider all possible events of interest as a start for matching the pattern and then search in the past of the execution or in the present for the rest of the events of interest. Leveraging the knowledge of the actual order in which events occur in the simulation, we simplify the formulas for MonPoly by specifying only the correct ordering. This creates an advantage for MonPoly in the comparison with our implementation which we deem is justified as MonPoly is not intended for pattern matching. For ensuring that a pattern matches only entities with overlapping lifespans, we use a construction suggested by the MonPoly authors [9]: for a creation event c(a) and a deletion event d(a), we encode the lifespan of the entity a by ¬∃ d(a) S [0,∞) ∃ c(a).
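The lifespan construction can be read operationally: an entity is alive at a time point if a creation event has occurred at or before it and no deletion event has occurred since that creation. The following Python sketch mirrors this reading over a flat list of timestamped events. It is an illustration of the since-based encoding under point-based semantics, not MonPoly code; the event layout is hypothetical, and using timestamps in place of positions in the sequence is a simplification.

```python
def alive_at(events, entity, t):
    """Evaluate 'not deleted SINCE created' for `entity` at time `t`.

    events: list of (timestamp, kind, entity) with kind 'c' (create) or 'd' (delete).
    The entity is alive if some creation at time c <= t exists such that
    no deletion of the entity occurs in the interval (c, t].
    """
    creations = [ts for ts, kind, e in events if e == entity and kind == "c" and ts <= t]
    for created_at in creations:
        deleted_since = any(
            e == entity and kind == "d" and created_at < ts <= t
            for ts, kind, e in events
        )
        if not deleted_since:
            return True
    return False


trace = [(1, "c", "a"), (5, "d", "a")]
```

Two entities then have overlapping lifespans at a time point exactly when this check succeeds for both of them.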
Since the output of MonPoly focuses on a violation, forward-looking matching, i.e., matching a relation in the past and subsequently searching for other relations in its future, would not produce the desired result as it would always only output the time point the first violating relation was matched.
We map MG1, i.e., (g 1 , ¬ψ MG1 ) with ψ MG1 the MTGC ♦ [0,3600] ∃ g 1.1 (see Sect. 8.3), in a straightforward manner to its MFOTL equivalent, i.e., g 1 is enclosed by an existential quantifier, other operators remain intact, and relations are used instead of patterns. This is not possible, however, for (g 1 , ¬ψ MG2 ) with ψ MG2 := ¬∃g 1.2 U [0,3600] ∃ g 1.1 , as MonPoly restricts the use of negation in this case. It does so for reasons of monitorability: the tool assumes an infinite domain of values, so the negation of g 1.2 at a given time point when it does not exist is satisfied by infinitely many values and is therefore non-monitorable by MonPoly. Note that the use of negation is unrestricted for since, as in the lifespan construction above. In the following, we compare to MonPoly only for MG1. The translations in MFOTL are shown in Appendix C (see Listing 1 and Listing 2).

Indexing and querying an RTM H with HAWK
Hawk [35,44] integrates a time-aware graph database [59] which tracks changes between (timestamped) repository commits and, therefore, equips Hawk with the capability of querying the history of a model. Hawk represents history by versions of types and type instances. A new version of a type is created every time a type instance is created or deleted; the initial version of a type has no instances and types are never removed from the indexer. A new version of an instance is created every time one of its features changes; instances are removed from the indexer when they are deleted from the model. Versions are timestamped based on the timestamp of the repository commit that created the version in question.
Hawk formulates queries in the Epsilon Object Language (EOL), which draws from the well-known Object Constraint Language (OCL) [90]. Hawk extends EOL with support for temporal primitives such as time, getVersionsFrom(τ ), and eventually, which enable retrieving the timestamp of a version, obtaining a specific collection of versions based on their timestamp τ , or making assertions over a collection of versions. EOL supports methods native to EMF which obtain the container of an instance or the contents of one of its features.
The query MG1 is translated in EOL by obtaining a set c 1 with all Probe instances with status set to sepsis, created within a certain time window. Subsequently, we obtain a collection c 2 with the container of all instances in c 1 (the instance of PMonitoringService). For each instance in c 2 , we obtain its container (the instance of SHSService). We collect all contents of the connected feature of the SHSService, i.e., all the connected services, that are of type DrugService in a collection c 3 and, for each instance in c 3 , we check whether the contents of its probes feature include a Probe instance with status set to anti, whose timestamp satisfies the temporal constraints. Query MG2 is identical except it also checks whether an instance of Probe with status release exists in the period between an instance with status anti and an instance with status sepsis. The queries are available in Listing 3 and Listing 4 in Appendix C.
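The navigation performed by the MG1 query can be sketched in Python over a small in-memory object model. This is only an illustration of the traversal described above, not the EOL query from Appendix C: the `Node` class and its fields are hypothetical, the time window on the creation of the sepsis probes is omitted, and the temporal constraint is assumed to require an anti probe created within 3600 seconds after the sepsis probe.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    """Hypothetical stand-in for an EMF instance with a container and two features."""
    type: str
    created: int = 0                      # creation timestamp in seconds
    status: Optional[str] = None
    container: Optional["Node"] = None
    connected: List["Node"] = field(default_factory=list)  # feature of SHSService
    probes: List["Node"] = field(default_factory=list)     # feature of a service


def mg1_violations(all_probes: List[Node], window: int = 3600) -> List[Node]:
    """Return sepsis probes for which no anti probe follows within `window` seconds."""
    violations = []
    c1 = [p for p in all_probes if p.status == "sepsis"]
    for sepsis in c1:
        monitoring = sepsis.container      # the PMonitoringService (collection c2)
        shs = monitoring.container         # the SHSService
        c3 = [s for s in shs.connected if s.type == "DrugService"]
        treated = any(
            p.status == "anti" and 0 <= p.created - sepsis.created <= window
            for drug in c3
            for p in drug.probes
        )
        if not treated:
            violations.append(sepsis)
    return violations
```

A check for MG2 would add a further condition on Probe instances with status release between the two timestamps, following the description above.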

Input logs
Regarding MonPoly, we encoded the SHS metamodel by relations, following generally standard practices [see 88]. We translated all simulated logs, i.e., real, x10, x100, into sequences of events based on this encoding. An overview of the translated logs is shown in Table 1; translations are prefixed with an "M-". The logs are available in [96].
Hawk supports models created in EMF which allows us to re-use the SHS metamodel in Fig. 2 as well as a part of our implementation to map events in the log files real, x10, and x100 to model modifications. The modifications are identical to those created for InTempo.

Experiment design
MonPoly processes the events in the file and, for each event, updates its internal representation of the system behavior and its result. As explained earlier, the representation only retains data which are (temporally) relevant to the formula. This algorithm resembles the experiments for InTempo and, therefore, MonPoly is executed only once per experiment. Each experiment entails the execution of the tool with one translated log and the property MG1, as MG2 cannot be monitored by MonPoly. The latest MonPoly version at the time of writing is used (1.1.10) and run on the same machine as the implementation variants. For measuring the memory consumption and execution time, we use the output generated by MonPoly.
For Hawk, the experiments proceed similarly to those conducted for InTempo-see Sect. 9.2.2. Each experiment entails checking either MG1 or MG2 and is measured for either time or memory. Following the processing of an event, i.e., model modifications, the model is saved as an XMI file,

Results
The results for the issue detection time (in seconds) for the three logs are shown in Table 2. The issue detection time refers to the amount of time each tool or variant requires to produce the correct result: for IT , this is only the sum of the times for Analyze, i.e., the query execution time, for every invocation; for IT +P , however, it is the sum of Analyze and Maintain for every invocation. Similarly, the sum of the indexing and the querying time for every invocation is reported for Hawk. The execution of Hawk for x100 was stopped after almost three days, hence no results are reported. Issue detection with MonPoly is faster than IT for real and x10. However, MonPoly is slower than IT for x100. IT +P outperforms MonPoly for all logs. The reason is that, as shown in Table 3, pruning significantly reduces the size of the model and, therefore, its memory consumption. As a result, the time spent for pattern matching is decreased. Hawk is the slowest in detecting issues as well as the most costly in terms of memory. The size of Hawk's database on disk was deemed irrelevant and is not reported. Figure 17 (in logarithmic scale) shows the speedup of IT over Hawk, i.e., (hk/it), with hk the issue detection time of Hawk for an invocation and it the issue detection time of IT for the same invocation. The speedup value of 1 is marked by a dashed line. IT was faster than Hawk in all invocations except those whose plot points lie below the dashed line, which amount to approx. 1.2% of the total number of invocations. Figure 18 shows the speedup of IT +P over Hawk. Hawk was always slower than IT +P . In fact, their difference increases as the simulation proceeds, as pruning in IT +P decreases the size of the RTM H .
Plot points where the speedup of IT is larger than the speedup of IT +P are invocations in which query execution of IT took a very small amount of time, i.e., less than a millisecond; although the query execution time in these invocations was similar for IT +P , the time spent in pruning added an overhead which reduced speedup. MonPoly only outputs the cumulative time required for monitoring the log; the time per event is not reported. Hence, a speedup comparison was not possible.
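The per-invocation quantities used in this comparison can be written down as a short Python sketch. The function names and the list-based representation of measurements are assumptions made for illustration; the sketch only restates the definitions above (issue detection time as Analyze, optionally plus Maintain, and speedup as hk/it).

```python
from typing import List, Optional


def issue_detection_time(analyze: float, maintain: Optional[float] = None) -> float:
    """Time of one invocation: Analyze only (IT) or Analyze plus Maintain (IT+P)."""
    return analyze if maintain is None else analyze + maintain


def speedups(hawk_times: List[float], intempo_times: List[float]) -> List[float]:
    """Speedup hk/it per invocation; values above 1 mean InTempo was faster."""
    return [hk / it for hk, it in zip(hawk_times, intempo_times)]


def share_below_one(values: List[float]) -> float:
    """Fraction of invocations in which Hawk was faster (speedup below 1)."""
    return sum(1 for v in values if v < 1.0) / len(values)
```

Because the speedup is a ratio, invocations with a very small query execution time are sensitive to any added overhead such as pruning, which matches the observation about sub-millisecond invocations above.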

LDBC social network benchmark
The Social Network Benchmark (SNB) from the Linked Data Benchmark Council [75] is designed to simulate a plausible social network in operation. The benchmark can generate data of varying sizes and provides a series of realistic usage scenarios which aim at stress-testing and discovering bottlenecks in graph-based technologies. The latest version of the benchmark at the time of writing (0.4.0) generates data which contain both insert and delete operations [108] and can be conveniently transformed into a stream of timestamped creation and deletion events.

Metamodel and queries
The SNB metamodel consists of a static part, i.e., the entities City, Country, Tag, TagClass, University, and Company whose instance creations precede the creation of the network and are never deleted, and a dynamic part, i.e., the entities Person, Post, Comment, and Forum, whose instances are created during the operation of the network and can be deleted. A relevant excerpt of the SNB metamodel is shown in Fig. 19-where entities in gray are explained later in this section. In the generated data, forum memberships and friendships are represented by links between entities. The vast majority of the network activity comprises persons joining forums, befriending other persons, or posting comments and replies in forums. From the available queries in the SNB specification, we select two queries with a temporal dimension, namely IC4 and IC5 [39]. The (slightly adjusted) query IC4 reads: "Given a start Person, find Tags that are attached to Posts that were created by that Person's friends. Only include Tags that were attached to friends' Posts created within a year after they became friends with the start Person, and that were never attached to friends' Posts created before that." Similarly to the SHS case-study, statements in the query are captured as patterns-shown in Fig. 20.
The query refers to the point in time a friendship was created. Executing the query would entail checking the creation and deletion timestamp of the link that represents the friendship. InTempo does not directly support attributes in links. Following a customary modeling technique, e.g., [70], links of interest can be encoded as vertices. The links that represent a friendship and a forum membership are relevant to IC4 and IC5 and have been modeled as vertices with a creation and a deletion timestamp in Fig. 19 (see the KnowsLink and HasMemberLink vertices).
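The encoding of a link as a vertex can be sketched as follows. The `Graph` and `Vertex` classes, the field names `cts` and `dts`, and the identifier scheme are hypothetical and serve only to illustrate the modeling technique; they are not the InTempo data structures.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Vertex:
    id: str
    type: str
    cts: int                   # creation timestamp
    dts: Optional[int] = None  # deletion timestamp; None while the vertex exists


class Graph:
    def __init__(self) -> None:
        self.vertices: Dict[str, Vertex] = {}
        self.edges: List[Tuple[str, str]] = []

    def add_vertex(self, vid: str, vtype: str, cts: int) -> Vertex:
        vertex = Vertex(vid, vtype, cts)
        self.vertices[vid] = vertex
        return vertex

    def add_link_vertex(self, link_type: str, source: str, target: str, cts: int) -> Vertex:
        """Replace a direct link source -> target by source -> link vertex -> target."""
        vid = f"{link_type}:{source}->{target}"
        link = self.add_vertex(vid, link_type, cts)
        self.edges.append((source, vid))
        self.edges.append((vid, target))
        return link

    def delete(self, vid: str, dts: int) -> None:
        """Record the deletion time instead of dropping the vertex, keeping its history."""
        self.vertices[vid].dts = dts
```

With this encoding, the time at which a friendship was created is simply the `cts` of the corresponding KnowsLink vertex, at the cost of one additional vertex and one additional edge per link.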
Based on these patterns, we gradually compose the query in L T for IC4. Note that the naming scheme of the patterns is based on their nesting level and that time is assumed to be tracked in seconds. We search for Tags and Persons (q 1.1 ) that satisfy the following three conditions simultaneously. First, they are friends with the start Person (q 1.1.1 , where the start Person is a user input captured by the pattern constraint). For this condition, it is additionally required to locate the first time point where the friendship was created. In MTGL, this may be achieved by the construction ∃(q 1.1.1 , ¬ (0,∞) ∃ q 1.1.1.1 ), where we make use of the knowledge that the HasMemberLink will be the last vertex created in q 1.1.1 , i.e., after the

The construction locating the first time point a pattern occurs, that is, ∃(q 1.1.1 , ¬ (0,∞) ∃ q 1.1.1.1 ), uses an unbounded temporal operator which requires that InTempo stores the entire history and effectively disables pruning. As mentioned earlier, the lifespan of a match is always an interval, i.e., a connected set of time points. This characteristic makes the first time point when a match occurs unique, i.e., there can be no two first time points in the past. Therefore, in this particular construction, it is unnecessary to check the sub-condition over the entire history. Instead, the interval of the operator can be reduced to a minimal interval, i.e., (0,1) , which returns the same result while allowing InTempo to avoid storing the entire history. In the following, we abbreviate this construction by the operator exists-first:

In summary, in L T , IC4 is captured by the query IC4 := (q 1 , ψ IC4 ) with ψ IC4 :

The patterns for IC5 are shown in Fig. 21. The (slightly adjusted) query reads: "Given a start Person, find the Forums which that Person's friends and friends of friends (excluding start Person) became members of in the two months before the friendship was created. Return all Posts in the Forums created by the start Person's friends or friends of friends within that period." In L T it is captured

Note that the number of seconds in two months has been abbreviated by "2mo."
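The argument for reducing the interval of the exists-first construction from (0,∞) to (0,1) can be checked with a small Python sketch. The sketch assumes, for illustration only, that the lifespan of a match is a half-open interval [start, end) over real-valued time; the function names are hypothetical and the code is not part of InTempo.

```python
import math
from typing import Tuple

Lifespan = Tuple[float, float]  # assumed half-open interval [start, end)


def holds_at(lifespan: Lifespan, t: float) -> bool:
    start, end = lifespan
    return start <= t < end


def held_in_past(lifespan: Lifespan, t: float, bound: float) -> bool:
    """True if the match held at some t' with 0 < t - t' < bound."""
    start, end = lifespan
    # The open window (t - bound, t) intersects [start, end) iff this inequality holds.
    return max(t - bound, start) < min(t, end)


def exists_first(lifespan: Lifespan, t: float, bound: float = 1.0) -> bool:
    """The match holds at t and did not hold within the preceding window."""
    return holds_at(lifespan, t) and not held_in_past(lifespan, t, bound)
```

Since the lifespan is connected, any time point after its start has lifespan points immediately before it, so `exists_first` with `bound=1.0` and with `bound=math.inf` agree at every time point and both hold only at the start of the lifespan.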

Input logs
We have used the SNB's data generator to generate data for two scale factors: sf-0.1 and sf-1, which create a network of 1.5K and 11K Persons, respectively. In total, 328K nodes and 1.5M edges are created in sf-0.1, and 3.2M nodes and 17.3M edges in sf-1. The generated data span a period of 10 years, from 2010 to 2020. We have captured the creation and deletion timestamps of insert and delete operations into log files of timestamped events. For our experiments, forum memberships and friendships are also encoded as nodes and, therefore, the relevant inserts and deletes in the generated data are similarly represented by events which create or delete instances of HasMemberLink and KnowsLink, respectively. This brings the total of nodes and edges created by the log to 900K and 3.4M, respectively, for sf-0.1. The log for sf-1 creates 10.2M nodes and 38.4M edges (see the overview in Table 1). The logs are available in [96]. The generated data contain two stages: the operational stage which entails the creation of the entire network and a small percentage of deletions, spanning from 2010 to 2013, and the shutdown stage which contains only deletions and destroys the network, spanning from 2013 to 2020. We have added a start-up stage to the beginning of the log which creates the static part of the network, e.g., Tags and Countries, at the beginning of the epoch (timestamps 0 to 7).

Experiment design
We envision a scenario where the results from queries IC4 and IC5 are utilized to provide a member of the network with dynamic recommendations or warn the member for suspicious behavior in their network when the member logs in. Recommendations can be built based on the returned Tags from IC4 and warnings can detect abnormally many new memberships by new friends, returned by IC5. Therefore, in our experiments we execute the queries periodically on each (simulation) day, which simulates a daily login on the network by the member.
According to the typical SNB execution scenario [75, p. 25], we first process a large number of operations (the first 35 months) of the operational stage such that a large starting RTM H has formed before the queries are executed. We call this the initial phase of the experiment. The initial phase corresponds to roughly 800K events in sf-0.1 and 8.8M events in sf-1. The queries are executed once per day for the remaining month in the operational stage. After the operational stage, the shutdown stage comprises numerous bulk deletions which span the remaining period, i.e., 7 years, and destroy the network. This would not constitute a realistic setting for the scenario, as only deletions would be processed. Hence, the experiments only run until the beginning of the shutdown stage. The percentage of the elements that are deleted in the operational stage is shown in Table 1. We evaluate the performance of IT (no pruning) and IT +P (with pruning). IT +P is only executed for IC5, as IC4 refers to the entire history and, therefore, contains an unbounded operator. Depending on the log, each query is executed for a different start person. This was done to ensure that the start person would be actively involved in query executions. To choose the start person, we created a list that sorted persons in the network according to their number of friends (larger to smaller) and randomly picked a person from the top half. The log sf-0.1 is executed for the person with id=483 and sf-1 for the person with id=361.
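The selection of the start person can be sketched in a few lines of Python. The function name, the dictionary from person id to number of friends, and the use of a seeded random generator are assumptions made for illustration; the sketch follows the procedure described above and does not reproduce the script used for the experiments.

```python
import random
from typing import Dict


def pick_start_person(friend_counts: Dict[int, int], rng: random.Random) -> int:
    """Sort persons by number of friends (descending) and pick one from the top half."""
    ranked = sorted(friend_counts, key=friend_counts.get, reverse=True)
    top_half = ranked[: max(1, len(ranked) // 2)]
    return rng.choice(top_half)
```

Passing an explicitly seeded `random.Random` instance keeps the choice reproducible across runs, which is useful when the same start person has to be used for all variants of an experiment.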
In the initial phase, before the periodic execution begins, the temporal GDN is populated with all matches in the starting graph. The first execution of the periodic phase updates these matches. Each variant is executed for each input log and is measured for either query execution time or memory consumption. Experiments which measure time were executed 10 times and the average values are reported. The experiments are conducted on the same workstation as all other experiments.
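The structure of an experiment, i.e., an initial phase in which events are only applied, followed by a periodic phase in which the query is executed once per simulated day, can be sketched as follows. The function `run_periodic` and its callback parameters are hypothetical, and the sketch assumes events sorted by timestamp; it illustrates the experiment design and is not the InTempo test harness.

```python
from typing import Any, Callable, Iterable, List, Tuple

DAY = 24 * 3600  # time is tracked in seconds


def run_periodic(
    events: Iterable[Tuple[int, Any]],
    start: int,
    end: int,
    apply_event: Callable[[Any], None],
    execute_query: Callable[[], Any],
    period: int = DAY,
) -> List[Tuple[int, Any]]:
    """Apply all events before `end`; run the query at each period boundary after `start`."""
    results = []
    next_run = start + period
    for ts, payload in events:
        if ts >= end:
            break  # e.g., the shutdown stage is not processed
        while ts >= next_run:
            # All events of the elapsed period have been applied: execute the query.
            results.append((next_run, execute_query()))
            next_run += period
        apply_event(payload)
    while next_run <= end:
        results.append((next_run, execute_query()))
        next_run += period
    return results
```

Events with a timestamp before `start` correspond to the initial phase: they modify the model but trigger no query execution.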
Our attempt to evaluate MonPoly and Hawk with the SNB logs resulted in considerable practical difficulties. In the SHS case-study, we optimized pattern matching by MonPoly by arranging the ordering of relations in the MFOTL property, leveraging the knowledge on the order in which these events occurred in the simulation. Applying the same optimization for SNB was impossible as there was no fixed order in which some patterns could occur, e.g., the friendship and the membership in q 1.1.2.2 . Therefore, the MonPoly properties would have to feature many alternative orderings which would deteriorate the performance of the tool. For Hawk, the initial phase of the experiment required generating and indexing approx. 800K and 8.8M (large) models for sf-0.1 and sf-1, respectively. Saving these models as XMI files and indexing them was quite slow: only a few tens of thousands of models had been processed after several hours. These difficulties indicated that the tools were not meant for this setting and using them nonetheless would compromise the evaluation conclusions. Therefore, the tools were excluded from the SNB experiments.

Results
Table 4 shows the query execution times. The initial phase for IC4 and IC5 over sf-0.1 lasted approx. 18 and 39 seconds, respectively. Over sf-1, the initial phase for IC4 and IC5 lasted 145 and 363 seconds, respectively. The effect of pruning in IT +P is marginal as only a small number of deletions, i.e., less than 5%, occur in the log (see Table 1). On the other hand, the overhead of pruning is also reduced. Given that no pruning occurs for the first 35 months, a relatively lengthy pruning takes place after the first execution in every experiment. For instance, the first pruning for sf-1 lasts approx. 47 secs. The average duration of all other executions of pruning is 287 ms. Figure 22 shows the query execution times for IT and IT +P in detail, as well as the number of relevant nodes (in thousands) added per period in sf-1: relevant nodes are those included in the patterns of the temporal GDN nodes for IC5, i.e., instances of HasMemberLink, Person, KnowsLink, and Post. Generally, larger execution times correspond to rounds with a larger number of relevant nodes. On average, the InTempo variants handled 15K additions of relevant nodes per period over sf-1. The average query execution time was 5.7 secs with IT and 5.8 secs with IT +P for IC5. Table 5 shows the memory consumption of the two variants. Due to the small number of deletions, the memory consumption of IT +P is only slightly decreased. We measured that only loading the RTM H in memory consumed 2.6 GB for sf-0.1 and 44.4 GB for sf-1. The rest of the memory in Table 5 was consumed by intermediate results maintained by the nodes of the temporal GDN.

Discussion
We discuss the evaluation results with respect to the requirement for fast query execution times (R3) and memory-efficiency (R4), from Sect. 1. In light of the evaluation, we conclude the discussion with remarks on advantages and limitations of InTempo.

Fulfillment of requirements
We presented a method to constrain the history represented in the RTM H (Sect. 6) where we assume that matches are handled as they are returned and can therefore be removed from the RTM H . The assumption is justified by relevant use-cases in RV and self-adaptation. We assess the requirement R3 for fast query executions by (i) a comparison between the issue detection times of IT , which used an un-constrained representation, and IT +P , a variant that used a constrained one, and (ii) a comparison to the issue detection times of two state-of-the-art tools: the RV tool MonPoly, with which we emulated pattern matching with relations, and the model indexer Hawk. By issue detection time, we refer to the time required to return correct results, i.e., query execution time for IT , query execution time plus pruning time for IT +P , indexing time plus query execution time for Hawk, and execution time for MonPoly. Both IT and IT +P outperform Hawk and IT +P also outperforms MonPoly. IT is slower than IT +P and, for the smaller logs, than MonPoly; however, it operates on an un-constrained representation and is therefore capable of computing the precise validity of an answer over the entire history at any point in time. Hawk, which shares this ability, is slower than IT . We can therefore conclude that query answers from InTempo are provided fast with respect to the state-of-the-art. Issue detection times were also relatively fast with respect to the invocation frequency in the SHS case-study. Although stronger claims on this aspect require further investigation, these times indicate that InTempo may serve as a basis for more sophisticated adaptation approaches.
We assess the requirement for memory-efficiency (R4) by a similar comparison. IT +P consumes smaller amounts of memory than IT while delivering the same results over the entire history. Therefore, IT +P is more memory-efficient compared to a variant using an un-constrained representation. Moreover, IT +P is more memory-efficient than Hawk, as well as than MonPoly for larger logs, i.e., x10, x100.
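The intuition behind the memory savings of a constrained representation can be illustrated with a simplified Python sketch. It assumes, purely for illustration, that an element may be discarded once it has been deleted for longer than a time window derived from the bounded temporal operators of the query; the actual pruning method of Sect. 6 is more involved, and the function and parameter names are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Hypothetical history representation: element id -> (creation time, deletion time or None)
History = Dict[str, Tuple[int, Optional[int]]]


def prune(elements: History, now: int, window: int) -> History:
    """Keep elements that still exist or were deleted within the last `window` seconds."""
    return {
        eid: (cts, dts)
        for eid, (cts, dts) in elements.items()
        if dts is None or now - dts <= window
    }
```

Under this simplification, a log with few deletions leaves little to discard, and a query with an unbounded operator corresponds to an infinite window in which nothing can be discarded.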
Besides serving as a baseline for the variant with pruning, IT covers other important use-cases where pruning might be infeasible or undesirable. For instance, the queries of interest might change often and thus time windows cannot be derived a priori, as historical data might be relevant to another query in the future. Another example is when the incurred cost of pruning on the loop execution time is undesirable. Finally, a third possible scenario is postmortem analysis like self-explainability in self-adaptive systems [14] where the system is required to explain adaptation decisions in its entire history; in our example, IT is faster and more memory-efficient than Hawk, which is intended for such use-cases [45]. It should be noted that, given the database back-end and the offline use-cases, minimizing memory consumption may not be a primary focus for Hawk. Moreover, by storing versions of types and type instances, Hawk implicitly stores the history of types, links, and attribute values, which would require a manual encoding in InTempo.
The SNB log files allowed for a more extensive evaluation of InTempo as they involved more complex, realistic, and considerably larger graph structures. L T is used for more complex queries than those in the SHS case-study which stem from a domain which is significantly different to healthcare. The execution of InTempo for two scale factors of the benchmark indicates that, for the specific experimental setting, the query execution time of InTempo scaled adequately with respect to the smaller and structurally simpler datasets in the SHS case-study. This evaluation allows for indirect comparisons with other tools in the future, as well as with future versions of InTempo.
A GDN stores all intermediate query results. Only nodes that are affected by model modifications are executed which, combined with local search, makes for an incremental evaluation framework which is optimized for fast query executions. This feature is emphasized in the SNB experiments where the structure of the RTM H , in contrast to the SHS case-study, is amenable to local search. Pattern matching may skip big parts of the RTM H that are unaffected and therefore, in relation to the number of events and added nodes (see Table 1), the query execution times were considerably fast.
On the other hand, given the size of the patterns in SNB queries as well as the number of matches, storing all intermediate results has an adverse effect on memory consumption. The effect was aggravated by temporal GDNs featuring many ZMR nodes, i.e., negation, conjunction, since, and until, whose LHS currently contains the maximal context required by parent nodes. Effectively, these nodes re-execute queries that have been already executed and re-store their results. For instance, IC5 features eight such nodes, with some of them redundantly storing matches of elements which are in great quantities in the data, e.g., Posts.

Advantages and limitations
Regarding the issue detection by an RV tool like MonPoly, we note that, although similarities in the use-cases of RV can be observed, so can fundamental differences. InTempo relies on an RTM, i.e., a causally connected snapshot of the system constituents and their state, which we assume to be the product of a broader MDE context. Queries over the RTM (and their answers) are supposed to be further utilized at once within that context, as the self-adaptation application scenario in Sect. 8 demonstrated. Models are typically not the primary focus of RV tools; contrary to an RTM, representations of the system state are created ad hoc and are typically inaccessible by other tools or end-users, which may render RV tools impractical for use in an MDE context, e.g., it may hinder synergy with other model-based technologies.
Moreover, as mentioned previously, structural RTMs are amenable to a graph-based encoding. Transferring this setting to an adequately expressive RV representation like relations resulted in overly technical and error-prone encodings as, typically, RV tools do not focus on structural representations. Consequently, emulating pattern matching of queries which were rather structurally simple was also cumbersome, even after optimizations. Translating one of the properties which concerned the prohibition of the existence of a pattern into MFOTL was not possible due to the syntax restrictions of MonPoly. These leads indicate that MonPoly and similar RV tools are sub-optimal for graph-based representations, queries and, in turn, structural adaptation of systems.
In the presented experimental setting, InTempo outperformed Hawk and MonPoly. Nonetheless, the level of maturity of the two tools surpasses the prototypical implementation we present. Hawk supports various modeling technologies, including EMF. It also seamlessly supports history of attributes, links, and types. Although its database back-end is expected to yield slower access times than an in-memory history representation, it offers other advantages, e.g., increased scalability with respect to the size of the models, which may benefit other settings but were not explored in our evaluation. Both Hawk and, to a lesser extent, MonPoly, support monitoring the evolution of a value, e.g., aggregating the value of an attribute for each query execution and monitoring whether the sum exceeds a limit, which InTempo currently does not. We plan to address this limitation by equipping the temporal GDN with auxiliary nodes used to encode input parameters as constraints that are checked during pattern matching, as shown in our previous work [100].
As demonstrated by the SNB experiments, InTempo requires that attributes, types, and links whose evolution is of interest are encoded as vertices. For the SNB, this meant that two links were encoded as vertices, which roughly tripled the number of vertices in the RTM H compared to the model normally created by sf-0.1 and sf-1. Although this is a customary modeling technique for EMF-based implementations, other tools, such as Hawk, handle the encoding in a manner transparent to the user.
Originally, the creation timestamp in SNB is captured by a vertex or edge attribute to which queries may refer directly. Therefore, the original queries IC4 and IC5 contain timing constraints based on the physical time, e.g., "before October 2010." We adjusted these constraints to logical time, e.g., "in the last two months," as physical time references are not currently supported by L T and are typically not provided by logic-based languages, e.g., MFOTL. Hawk is able to express such constraints referring to physical time. A related limitation is that InTempo currently cannot relate the current time point, i.e., when the query is executed, with temporal computations by the temporal GDN. Hence, InTempo performs an additional check (predicate P) to assess whether query results are conclusive with respect to the current time point. This check is standard in RV and hence seamlessly provided by MonPoly. We plan to integrate the check in InTempo by means of the auxiliary nodes mentioned above.
Queries in SNB draw from SQL-based languages and their statements resemble SQL queries. Their translation into a temporal logic like MTGL required a certain level of familiarity with temporal logics and resulted in nontrivial MTGCs. We plan to equip L T with constructions that facilitate such translations, e.g., the exists-first abbreviation we introduced in Sect. 9.4. However, there are certain features of SQL-based languages which are typically not offered by logic-based languages, e.g., aggregations and limiting or sorting of results; MFOTL stands out as it offers the capability of numerical operations such as aggregation. Related aspects of the SNB queries have been omitted from our formulation of IC4 and IC5 in L T .
The SNB experiments indicated that larger temporal GDNs may require a significant amount of memory. We plan to investigate whether the patterns used for ZMR nodes can be optimized so that memory consumption is reduced.

Threats to validity
We organize this section based on the types of validity in [111], i.e., conclusion, internal, construct, and external.
Conclusion Validity Threats to conclusion validity refer to drawing incorrect conclusions about relationships between an experiment and its outcome, for instance, by reporting a non-existent correlation or by missing an existent one. We mitigated the possible impact of threats to conclusion validity by carefully selecting the experimental data; the log files used were either real (real log in the SHS evaluation), synthesized based on real data by employing statistical bootstrapping (x10 and x100), or generated by an independent benchmark (sf-0.1 and sf-1 in the SNB evaluation). Our synthesis method is documented in [93].
Each SHS experiment essentially executed the same query over an increasing event sequence for thousands of times, therefore yielding measurements of an adequately large size. For the SNB, we conducted the experiments measuring the query execution time repeatedly and reported the averages of values. Moreover, we studied the benchmark characteristics and attempted to refine statements on the relationship between measurements and conclusions: for instance, for the SNB evaluation, we reported on the number of total additions of relevant nodes instead of the number of events per period.
Internal Validity In this context, threats to internal validity are influences which might have affected metrics, i.e., the query execution time and the memory consumption. In the following, we describe the measures that were taken to minimize such threats.
The experiments measured these metrics separately and systematically via a controlled simulation of an SHS and the partial execution of a benchmark. Multiple logs were used with an increasing event rate which evaluated the InTempo variants over an increasing load; all other aspects were kept identical: in the SHS case-study, the variants used the same metamodel and the same Monitor, Plan, Execute activities per experiment; similarly, the SNB experiments used the same metamodel, the same event sequences, and the same starting graph per scale factor. Both sets of experiments translated log events into model modifications. The experiments for Hawk used the same logs and translations as InTempo. The experiments for MonPoly entailed the systematic translation of events into relations based on fixed rules.
All experiments were performed on the same machine and, during the experiment, no other (active) task was run on the machine. The values reported in the results of the experiments for InTempo variants are based on the values reported by the JVM: the value of free memory was subtracted from the value of total memory. Before measuring memory consumption, we always suggested a garbage collection to the JVM. The values for MonPoly are based on statistics provided by the tool itself, which in turn relies on standard utilities available in UNIX operating systems.
Construct Validity Threats to construct validity refer to situations where the used metrics do not actually measure the construct, i.e., concept. We used the standard metrics of query execution time and memory consumption, measured in conventional ways. We have ensured that these metrics corresponded to the correct results. First, the detection of violations in the logs by the variants in the SHS evaluation has been manually double-checked by the authors and confirmed by MonPoly and Hawk. Additionally, the answers for the queries in the SNB evaluation have been confirmed by Java code. Finally, we have provided formal arguments to substantiate the claims on correctness of computation of temporal validity and the detection of all changes to temporal validity despite pruning the history representation.
External Validity Threats to external validity may restrict the generalization of our evaluation results outside the scope of the conducted experiments. In the following, we discuss threats which could influence the generalization of the experimental setting and measurements. Regarding the experimental setting, we have mitigated threats to external validity by creating the metamodel of the SHS case-study based on the artifact in [109], a peer-reviewed self-adaptation exemplar that has been used for the evaluation of solutions for self-adaptation. Moreover, the simulation used either real data or data that was synthesized based on the real data and enacted an instruction from a real medical guideline [89].
Similarly, the SNB experiments were based on the output of an established benchmark which employs sophisticated, well-documented methods for generating realistic data. Queries are similarly designed by experts to expose bottlenecks and stress-test the performance of graph-based technologies. The SNB data was generated using the default parameters and the metamodel used closely resembles the original: we have only added vertices specific to InTempo, i.e., the MonitorableEntity, or specific to the evaluation, i.e., links of interest were encoded as vertices. We made minor modifications to the data which fixed a small number of consistency issues, i.e., deletions occurring before creations, which probably stem from the fact that deletions in the benchmark's output are a rather new feature [108].
Regarding the generalization of the performance measurements of InTempo variants, we have conducted two evaluations based on considerably different application domains. The evaluations featured metamodels and models which differed with respect to size and graph characteristics: the SHS data corresponded to a relatively simple star structure which demonstrated the effects of pruning; the SNB data corresponded to a rather large graph with numerous interconnections which resembled real social networks. The SHS queries showed the merits of a graph-based temporal logic for graph structures in runtime monitoring scenarios, while the SNB queries explored innovative use-cases of this temporal logic. Our implementation performed solidly in both settings and its results were emphasized via the comparison to MonPoly and Hawk.
We have applied the following optimizations to our implementations. For the removal of elements from EMF models, we transparently replaced the potentially expensive native EMF method with an optimized version. The replacement was done via a Java agent which was used for both Hawk and InTempo. The two evaluations were rather different with respect to the evolution of the RTM H: SHS started with a single vertex, whereas SNB started with a considerably large graph. Hence, we configured the Story Pattern Matcher, the pattern matching tool used by InTempo, with a different strategy for search plan generation for each of the two evaluations.
On the comparison, we note that MonPoly is not intended for pattern matching. Our emulated pattern matching, although optimized, might have room for improvement. MonPoly relies on a point-based interpretation while InTempo reasons over intervals which can lead to discrepancies between interpretations in more extensive comparisons. One of the main reasons MonPoly was chosen for the comparison is the compatibility of MTGL and MFOTL which allowed for a direct mapping of temporal operators and, therefore, reduced the risk of introducing any bias in translations. Based on this mapping, MonPoly could not monitor one of the properties used in the experiments. There might exist equivalent monitorable MFOTL formulas the syntax of which, however, would not match that of the MTGC.
We used Hawk in a manner consistent with the examples on the tool's website [35] at the time of writing. Hawk offers features which could potentially improve the performance of the tool. For example, EOL offers a syntax extension that supports pattern matching. Hawk supports a method to create annotations, i.e., predicates over the occurrence of elements or the values of their attributes. Annotations are updated in each indexation, thus rendering the index capable of accessing annotated elements via a lookup; however, annotations have to be defined manually and before the indexation of the model starts. Furthermore, an annotation cannot refer to another annotation; thus, nesting predicates is not supported. Such optimizations, their applicability, and their trade-offs will be investigated in our future work.

Related work
InTempo consolidates model instances into a single model, the RTM H , and answers queries with soft real-time timing constraints on the history of the RTM H . The RTM H is encoded as a graph and hence queries are captured by temporal graph queries which are executed incrementally. We first discuss work that relates to InTempo on the technical level of graphs, graph queries, and incremental query executions. Then, we relate InTempo to other work from the MDE community that can, either directly or indirectly, query the history of a model. Finally, we discuss relevant work from the RV community and a related approach to processing streams of events.
Graphs, Queries, and Incremental Execution Formally, InTempo is founded on typed attributed graph transformation rules [37]. Therefore, we limit the discussion to graphs although the model and model transformation rules could be encoded in other data models and related technologies, e.g., [65] and [76]. In the RTM H , time is captured by vertex attributes which are assumed to be present for each vertex in the metamodel, i.e., the type graph. Query executions are performed via graph transformation rules which write or perform computations with these (distinguished) attributes. This rule behavior resembles the foundational approach presented in [54] where graph transformation rules are extended with a notion of time.
The Viatra framework [104,105] stores a graph in memory and features a query execution engine which executes a query incrementally by decomposing it into sub-queries. Contrary to Viatra, InTempo seamlessly integrates a notion of time in the graph representation, the query language, and the execution framework. Viatra uses a decomposition strategy similar to a GDN called RETE [42], whose performance effect on InTempo we plan to investigate. A Viatra extension [102] distributes the pattern matching effort for sub-queries over multiple processing units, which is another interesting future direction for InTempo.
The setting assumed by Viatra is similar to the one of InTempo and resembles streaming [97] and active model transformations [13]: the graph (typically representing an RTM) is constantly being updated by a stream of graph elements or events that are mapped to graph elements; queries are executed after each change and their results are updated. Based on this setting, the Viatra-based solution by Búr et al. [24] captures safety properties via graph queries and is, therefore, capable of making assertions about the history of the graph. However, the solution does not support the integration of temporal statements into properties; properties only check for the occurrence of prohibited structures. Moreover, the solution does not support incremental query execution.
Implementations of time-aware graph databases typically build on a database back-end and introduce extensions which integrate a notion of time, e.g., [32,59,79]. Although backends with native support for graphs are capable of generating a push notification when a node or edge that meets certain conditions has been added or modified [2,98] and, thus, of providing a certain degree of support for reactive settings, none of the implementations supports incremental execution of queries with complex patterns.
The approach presented in [6] analyzes a host graph stored in an in-memory database and extracts a sub-graph that features only elements that are relevant to a given query. Each change to the host graph triggers a new (incremental) analysis and an execution of the query over the sub-graph, which returns a complete answer set. The analysis does not integrate a notion of time. If it were extended to support temporal primitives and graphs with history, it could be used in conjunction with the timing-constraint-based pruning of InTempo to yield a sub-graph that would contain only elements that are both temporally and structurally relevant.
Solutions for graph stream analytics, e.g., [25,58,67,71,82], may also be seen as related to InTempo, since they are also concerned with the efficient storage and querying of graphs that are constantly updated. These solutions model history as a sequence of change-based snapshots stored either entirely or partly on disk. This allows for the storage of very large graphs but imposes an overhead on query executions. Moreover, models are not the primary focus of these solutions, which may render their utilization in an MDE context cumbersome.
The languages presented by Rötschke and Schürr [91] and in our earlier work [68] build on Story Diagrams [112], a visual language for the specification of pattern-matching tasks, to specify temporal graph queries with timing constraints. The two languages, however, focus purely on specification and no tool support for either language exists.
Querying History in Model-driven Engineering The history of an RTM may be encoded via model versioning based on a (general-purpose) model repository, e.g., the solution by Haeusler et al. [56]. There is, however, a considerable difference between the objectives of InTempo and those of model repositories. For instance, branching is seamlessly supported by repositories, whereas, although it could potentially be supported by an RTM H , it is beyond our current scope. On the other hand, for repositories it is often assumed that queries will mostly concern a single timestamp, i.e., a specific version; repositories are therefore optimized for such queries and are less suitable for the on-the-fly execution of pattern-based queries that refer to a period of time, as Haeusler et al. acknowledge.
Solutions which focus on the history of an RTM typically use an on-disk database to store history and then facilitate the execution of database queries at runtime. Hawk [44], to which InTempo was compared, uses the time-aware graph database in [59]. The solutions in [51] and [80] use a map database and a time-series database, respectively. As the evaluation in Sect. 9 indicated, disk accesses take a significant toll on real-time querying performance, especially for far-reaching past queries which are likely to factor in multiple model or element instances in their runtime computations. Notably, Hawk offers the capability to annotate such queries manually prior to the execution, such that their matches are pre-computed while the system is being executed; in InTempo, this pre-computation is accomplished automatically by the temporal GDN. Compared to an in-memory representation, on-disk databases offer increased scalability with respect to the size of the models that can be stored.
All solutions above query history by means of OCL extensions that support temporal primitives, although Hawk's extension is more expressive. They also conveniently support attribute-level history (contrary to InTempo, which requires attributes to be encoded as nodes in the metamodel, see Sect. 9.5.2) and on-the-fly operations on attribute values, e.g., aggregation. However, these features also increase the number of disk accesses during pattern matching.
Runtime Verification and Related Approaches Seen in a broader context, InTempo processes a sequence of events and verifies whether this sequence satisfies a temporal logic formula. This approach to verification is known as Runtime Verification (RV). In RV, an online algorithm incrementally processes a sequence of timestamped events and checks whether the prefix of the sequence processed so far satisfies a temporal logic property. The algorithm only stores data that are relevant to the property.
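To make the notion of an online monitoring algorithm concrete, the following minimal sketch checks a bounded-response property ("every request is answered within a deadline") over a stream of timestamped events. It is an illustration only and does not reproduce the algorithm of InTempo or of any RV tool; the class name, the event kinds, and the property itself are assumptions made for this example. The point it illustrates is that the monitor keeps only the data relevant to the property, i.e., the requests that are still pending.

```python
from typing import Dict, List, Tuple


class BoundedResponseMonitor:
    """Hypothetical online monitor: each request must be answered within `deadline`."""

    def __init__(self, deadline: int) -> None:
        self.deadline = deadline
        self.pending: Dict[str, int] = {}            # request id -> request timestamp
        self.violations: List[Tuple[str, int]] = []  # (request id, request timestamp)

    def observe(self, ts: int, kind: str, ident: str) -> None:
        """Process one timestamped event; events must arrive in timestamp order."""
        # Requests whose deadline has elapsed are reported and then forgotten.
        for rid, start in list(self.pending.items()):
            if ts - start > self.deadline:
                self.violations.append((rid, start))
                del self.pending[rid]
        if kind == "request":
            self.pending[ident] = ts
        elif kind == "response":
            # A timely response discharges the obligation of the request.
            self.pending.pop(ident, None)


monitor = BoundedResponseMonitor(deadline=10)
for event in [(0, "request", "a"), (5, "request", "b"),
              (8, "response", "a"), (20, "request", "c")]:
    monitor.observe(*event)
```

After the four events, request `b` has been reported as a violation and only request `c` remains stored, which mirrors the idea that the processed prefix is summarized by the data that can still influence the verdict.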
Based on this similarity, we compared the performance of InTempo to MonPoly [9,10], a state-of-the-art RV tool.
MonPoly is a command-line tool which notably combines a specification language which is adequately expressive for the use-cases discussed in this article with an efficient incremental monitoring algorithm. Its specification language is based on the Metric First-Order Temporal Logic [9] (MFOTL) which uses first-order relations to capture system entities and their relationships.
Besides MonPoly, solutions stemming from RV typically provide no or only partial support for key features of InTempo, e.g., events containing data, native or indirect support for graphs and bindings, and temporal operators with timing constraints: the solution by Dou et al. [34] presents a pattern-based RV technique which concerns propositional events, i.e., containing no data, and is thus unsuitable for use-cases of interest; a monitoring algorithm presented in [11] involves an interval computation similar to ours but concerns a propositional, past-only logic.
An exception is the tool DejaVu by Havelund and Peled [63] which can monitor properties specified in a first-order metric past-only logic with point-based semantics. Translating MTGCs into the DejaVu specification language would require emulating graph-based encodings (similar to MonPoly) and, moreover, reformulating MTGCs such that they feature only past operators. Such reformulations would be deprived of the one-to-one mapping of temporal operators in L T and MFOTL and could be significantly less compact [61,74]. Additionally, Havelund and Peled report that, currently, only timing constraints which span approx. 60 time units or less yield acceptable performance [62]. This restriction renders DejaVu unsuitable for the application scenarios targeted by InTempo.
The approach known as Complex Event Processing (CEP) focuses on processing a stream of events and detects temporal patterns based on the content and the ordering of events. Detected patterns generate complex events which form another stream and can be further processed [see 29]. Outwardly, the objective of RV is similar to that of CEP. However, the two have fundamental differences. Of interest to the present discussion are the following: languages used in CEP are not based on temporal logic (in fact, they often lack clear semantics [53]), which makes a direct comparison difficult; the capability of CEP to evaluate sequential patterns is typically limited, while it is of central importance to RV [57]. We refer to [57] for a more detailed discussion. Therefore, RV is considerably more relevant to InTempo.
Compared to InTempo, graph-based solutions that use CEP are limited in their ability to query the history of a model at runtime. Dávid et al. present a solution based on a streaming transformation of an RTM. The solution generates events when patterns are matched and then uses CEP to check whether generated events occur within a given time window, thus capturing, albeit indirectly and to a certain extent, temporal requirements on matched patterns [30]. The solution by Barquero et al. [5] stores a stream of data as a graph (on disk), and executes graph queries to detect event patterns. The solution introduces the notion of a spatial window to restrict the size of the graph and thus the search space for pattern matching. However, queries executed in the restricted search space may yield inaccurate results.

Conclusion and future work
Following the common practice of representing a runtime model as a typed attributed graph, we introduced a language for graph-based model queries which incorporates a temporal logic to enable the formulation of temporal graph queries. Moreover, we introduced a scheme which enables the incremental execution of temporal graph queries over runtime models with history, i.e., runtime models with a creation and deletion timestamp for each entity. The scheme offers the option to prune entities from the model that are not relevant to query executions, therefore reducing the memory footprint. By incorporating a temporal logic, the scheme is capable of monitoring temporal properties at runtime. Building on this capability, we apply the scheme in a runtime adaptation scenario where query matches in the model capture adaptation issues which are handled by in-place model transformations. We present an implementation which we evaluate based on a simulation of the scenario with real and synthetic data. We compare the performance of the implementation in detecting issues to a relevant runtime monitoring tool and a tool from the MDE community which is capable of querying the history of a model. Moreover, we evaluate the implementation with data which simulate the operation of a social network and are generated by an independent benchmark.
In the future, we plan to compare the query decomposition strategy in the framework of InTempo with the performance of other strategies. Moreover, we will investigate the number of events per second that InTempo can handle in other evaluation settings, which will indicate whether our scheme can be used for different application scenarios with considerably larger event streams. We will also consider technical optimizations such as the usage of indexing structures that can index matches based on their intervals and the optimization of patterns in auxiliary nodes of the evaluation framework. We plan to experiment with more aggressive pruning strategies for application scenarios with limited resources. An interesting research direction is the combination of InTempo with an EMF database which stores very old parts of the model and enables on-demand loading for the parts in question. A direction which holds vast potential is the opportunity to observe and learn from the history recorded in the model. Such knowledge could be subsequently used for predictions. Finally, we plan to utilize the ability of the scheme to detect the amount of time remaining for the violation of a property and enable more sophisticated decision-making for runtime adaptations.
Funding Open Access funding enabled and organized by Projekt DEAL.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

A Intervals
An interval is a non-empty, convex set $I \subseteq \mathbb{R}$; it may be open, closed, or half-open, and we write $\ell(I)$ and $r(I)$ for its left and right end-point, respectively. The set of all intervals is denoted by $\mathbb{I}$. Two intervals $k, l \in \mathbb{I}$ are left-adjacent, i.e., $adj_\ell(k, l)$, when $r(k) = \ell(l)$ and $k$ is right-open and $l$ is left-closed; right-adjacent intervals are defined symmetrically and denoted by $adj_r$. The two intervals are overlapping, i.e., $ov(k, l)$, when $k \cap l \neq \emptyset$. For $k$, we denote the union $\{\ell(k)\} \cup k$, i.e., making $k$ left-closed, by $^{+}k$ and the union $\{r(k)\} \cup k$ by $k^{+}$. Note that when $r(k) = \infty$, $k^{+} = k$.
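The definitions above can be illustrated by a small sketch. The class and function names below are assumptions introduced for this example and are not taken from the InTempo implementation; the sketch only encodes end-points with open/closed flags and the predicates for left-adjacency and overlap as defined in this appendix.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Hypothetical encoding of an interval; defaults to the form [lo, hi)."""
    lo: float
    hi: float
    lo_closed: bool = True
    hi_closed: bool = False

    def contains(self, t: float) -> bool:
        above = t > self.lo or (self.lo_closed and t == self.lo)
        below = t < self.hi or (self.hi_closed and t == self.hi)
        return above and below


def left_adjacent(k: Interval, l: Interval) -> bool:
    """k ends exactly where l starts, k is right-open and l is left-closed."""
    return k.hi == l.lo and not k.hi_closed and l.lo_closed


def overlapping(k: Interval, l: Interval) -> bool:
    """The intersection of k and l is non-empty."""
    lo, hi = max(k.lo, l.lo), min(k.hi, l.hi)
    if lo != hi:
        return lo < hi
    # The only candidate is the single point lo == hi; it must belong to both.
    return k.contains(lo) and l.contains(lo)


def close_left(k: Interval) -> Interval:
    """Add the left end-point to k (unchanged if that end-point is infinite)."""
    return k if math.isinf(k.lo) else Interval(k.lo, k.hi, True, k.hi_closed)


def close_right(k: Interval) -> Interval:
    """Add the right end-point to k (unchanged if that end-point is infinite)."""
    return k if math.isinf(k.hi) else Interval(k.lo, k.hi, k.lo_closed, True)


k = Interval(0, 5)   # [0, 5)
l = Interval(5, 9)   # [5, 9)
```

In this example `k` and `l` are left-adjacent but not overlapping, whereas `close_right(k)` and `l` share the time point 5 and therefore overlap.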
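We define the interval addition and interval subtraction in a standard way [83]. Let $I, k, l$ be intervals in $\mathbb{I}$. Then, $k \oplus l = \{\tau + \tau' \mid \tau \in k, \tau' \in l\}$ and $k \ominus l = \{\tau - \tau' \mid \tau \in k, \tau' \in l\}$, respectively. Essentially, for closed intervals, $k \oplus l = [\ell(k) + \ell(l), r(k) + r(l)]$ and $k \ominus l = [\ell(k) - r(l), r(k) - \ell(l)]$. For the proof of Theorem 1, we rely on the distributivity of an interval addition or subtraction with the regular set intersection, captured by the lemma below.
Lemma 1 Let $K, L, I \in \mathbb{I}$ with $K \cap L \neq \emptyset$. Then $(K \cap L) \oplus I = (K \oplus I) \cap (L \oplus I)$ and $(K \cap L) \ominus I = (K \ominus I) \cap (L \ominus I)$.
Proof Note that, for the interval $I$, its negation $-I$ is defined as $\{-\tau \mid \tau \in I\}$. It follows from the definition of addition and subtraction that $K \ominus I = K \oplus (-I)$ [see [83], p.12]; thus, we only need to show the lemma for addition. The proof proceeds by showing inclusion in both directions.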
We commence by hypothesizing $(K \oplus I) \cap (L \oplus I) \subseteq (K \cap L) \oplus I$. Let $\tau \in (K \oplus I) \cap (L \oplus I)$. It follows that $\tau \in (K \oplus I)$ and $\tau \in (L \oplus I)$. From these two memberships of $\tau$, it follows that (i) there exists a $k \in K$ and an $i \in I$ such that $k + i = \tau$, and (ii) there exists an $l \in L$ and an $i' \in I$ such that $l + i' = \tau$. Note that, since $K \cap L \neq \emptyset$, it may be that $k \in L$; then from $k \in L$, $k \in K$, and $k + i = \tau$ we can deduce $\tau \in (K \cap L) \oplus I$ and, in turn, that the initial hypothesis holds. A similar deduction can be made for the case when $l \in K$. In the case where $k \notin L$ and $l \notin K$, it follows that $k \neq l$. Assume, without loss of generality, that $k < l$, which by $k + i = l + i'$ implies $i' < i$. From the membership of $\tau$ it follows that there exists $z \in K \cap L$ with $k < z < l$ and a $t \in I$ with $i' < t < i$ such that $z + t = \tau$. From $z \in K$, $z \in L$, and $t \in I$, we can deduce that $z + t \in (K \cap L) \oplus I$. By $z + t = \tau$, we get $\tau \in (K \cap L) \oplus I$.
We proceed with inclusion $\supseteq$. Let $\tau \in (K \cap L) \oplus I$. It follows that there exists $z \in K \cap L$ and an $i \in I$ such that $z + i = \tau$. From $z \in K$, $i \in I$ it follows that $\tau \in K \oplus I$ and, similarly, from $z \in L$, $i \in I$, it follows that $\tau \in L \oplus I$. Therefore $\tau \in (K \oplus I) \cap (L \oplus I)$.
By showing inclusion in both directions, we have shown the lemma to be true.
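As a sanity check of the lemma on concrete numbers, the following sketch implements interval addition, subtraction, and intersection for closed intervals only and compares both sides of the distributivity property. Restricting the sketch to closed intervals with integer end-points is a simplification made for this example; the function names are hypothetical.

```python
from typing import Optional, Tuple

# Closed interval [lo, hi]; a simplification of the intervals of this appendix.
Interval = Tuple[int, int]


def add(k: Interval, l: Interval) -> Interval:
    """Interval addition: {a + b | a in k, b in l}."""
    return (k[0] + l[0], k[1] + l[1])


def sub(k: Interval, l: Interval) -> Interval:
    """Interval subtraction: {a - b | a in k, b in l}."""
    return (k[0] - l[1], k[1] - l[0])


def intersect(k: Interval, l: Interval) -> Optional[Interval]:
    """Set intersection; None encodes the empty set."""
    lo, hi = max(k[0], l[0]), min(k[1], l[1])
    return (lo, hi) if lo <= hi else None


K, L, I = (0, 10), (6, 15), (2, 5)
common = intersect(K, L)            # the lemma requires K and L to overlap
assert common is not None

lhs_add = add(common, I)                       # (K ∩ L) ⊕ I
rhs_add = intersect(add(K, I), add(L, I))      # (K ⊕ I) ∩ (L ⊕ I)
lhs_sub = sub(common, I)                       # (K ∩ L) ⊖ I
rhs_sub = intersect(sub(K, I), sub(L, I))      # (K ⊖ I) ∩ (L ⊖ I)
```

For the chosen values, both sides of the addition case evaluate to [8, 15] and both sides of the subtraction case to [1, 8].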

B.1 Theorem 1
Besides being a set of convex sets of time points, i.e., intervals, an interval set can be seen simply as a set of time points. In the following, when we consider an interval set as a set of time points, we underline the interval set, e.g., $\underline{Z}$. Following is the proof for Theorem 1.
Proof By structural induction over $\psi$. In the base case, we show the theorem to be true for the MTGL operator true to which all MTGCs reduce. We omit the straightforward steps for negation and conjunction.
Base case: By the semantics of MTGL, true is always satisfied, hence $Y(m, true) = Z(m, true) = \mathbb{R}$.
Induction step: $\psi = \exists(n, \chi)$. Assume that $Y(m, \chi) = Z(m, \chi)$. We first show that: $Y(m, \exists(n, \chi)) \subseteq \bigcup_{\hat{m}} \lambda^{\hat{m}} \cap Z(\hat{m}, \chi)$. Let $\tau \in Y(m, \exists(n, \chi))$. By the semantics, a match $\hat{m}$ for $n$ exists at $\tau$ such that $\hat{m}$ is compatible with the enclosing match $m$, $\hat{m}$ satisfies $\chi$, and $\max_{e \in E} e.cts \leq \tau < \min_{e \in E} e.dts$, with $E$ the elements of $\hat{m}$. By definition of $Y$ and $\lambda$, it follows that (i) $\tau \in Y(\hat{m}, \chi)$ and, by the premise, $\tau \in Z(\hat{m}, \chi)$, and (ii) $\tau \in \lambda^{\hat{m}}$; therefore, $\tau \in \lambda^{\hat{m}} \cap Z(\hat{m}, \chi)$ for some $\hat{m}$ compatible with $m$.
By showing inclusion in both directions, we have shown the sets of time points to be equal.
Induction step: $\psi = \chi\, U_I\, \omega$ and $\ell(I) \neq 0$. Assume that $Y(m, \chi) = Z(m, \chi)$ and $Y(m, \omega) = Z(m, \omega)$. We first show: $Y(m, \chi U_I \omega) \subseteq \bigcup_{i \in Z(m,\omega),\, j \in J_i} j \cap ((j^{+} \cap i) \ominus I)$. Let $\tau \in Y(m, \chi U_I \omega)$. By the semantics, there exists a $\tau'$ such that $\omega$ is satisfied and at least for all $\tau'' \in [\tau, \tau')$ $\chi$ is satisfied. Therefore $\tau'$ is in some $i \in Y(m, \omega)$ and, by the premise, $i \in Z(m, \omega)$. Moreover, $\tau'$ satisfies the timing constraint $I$ of until; thus, $\tau' - \tau \in I$. It follows that there exists a $t \in I$ such that $\tau' - \tau = t$, ergo, $\tau = \tau' - t$. Therefore, we obtain that $\tau \in \tau' \ominus I$ and, in turn, $\tau \in i \ominus I$. Observe that $[\tau, \tau')$ cannot be empty: From $\tau \in \tau' \ominus I$, it follows that $\tau' \in \tau \oplus I$ and, since $\ell(I) \neq 0$ (and by $I \subseteq T$, $0 \notin I$), $\tau' > \tau$. Therefore there exists some $j \in Y(m, \chi)$ and, by the premise, also in $Z(m, \chi)$, with $[\tau, \tau') \subseteq j$. The interval $j$ is either overlapping or left-adjacent to the $i$ which contains $\tau'$, because $[\tau, \tau'] \subseteq j^{+}$ and, from $\tau \in \tau' \ominus I$, we obtain $\tau \in j^{+} \ominus I$. Finally, from $\tau \in j$ and $\tau \in j^{+} \ominus I$, we obtain $\tau \in j \cap (j^{+} \ominus I)$. Since $\tau \in i \ominus I$, we obtain $\tau \in j \cap (j^{+} \ominus I) \cap (i \ominus I)$ and, by $j^{+} \cap i \neq \emptyset$ and Lemma 1, $\tau \in j \cap ((j^{+} \cap i) \ominus I)$. It follows that $\tau$ is also a member of $\bigcup_{i \in Z(m,\omega),\, j \in J_i} j \cap ((j^{+} \cap i) \ominus I)$ with $J_i$ the set of all left-adjacent or overlapping intervals for some $i \in Z(m, \omega)$.
By showing inclusion in both directions, we have shown the sets of time points to be equal.
Induction step: $\psi = \chi\, U_I\, \omega$ and $\ell(I) = 0$. Assume that the premise from the previous step holds. We first show: $Y(m, \chi U_I \omega) \subseteq \bigcup_{i \in Z(m,\omega),\, j \in J_i} i \cup (j \cap ((j^{+} \cap i) \ominus I))$. Let $\tau \in Y(m, \chi U_I \omega)$. By definition of the satisfaction span, $\chi\, U_I\, \omega$ is satisfied at $\tau$, which, by the semantics, means that there is a time point $\tau'$ with $\tau' - \tau \in I$, where $\omega$ is satisfied. Observe that by including $0 \in I$, thereby allowing $\tau' - \tau = 0$, the semantics imply a disjunction. If $\tau' - \tau > 0$, it can be shown as before that $\tau \in j \cap ((j^{+} \cap i) \ominus I)$ for some pairing of $i \in Y(m, \omega)$ and $j \in J_i$ with $J_i$ defined as previously.
We proceed with inclusion $\supseteq$. Let $\tau \in i \cup (j \cap ((j^{+} \cap i) \ominus I))$ for some pairing of $i \in Z(m, \omega)$ and $j \in J_i$ with $J_i$ defined as previously. It follows that $\tau \in i \lor \tau \in j \cap ((j^{+} \cap i) \ominus I)$. For the left disjunct, since $0 \in I$, there is $\tau'$ such that $\tau' - \tau \in I$, i.e., $\tau' = \tau$ and where $[\tau, \tau')$ is empty. Therefore $\tau$ satisfies until. For the right disjunct, we can deduce $\tau \in Y(m, \chi U_I \omega)$ as before. We have shown that in each case $\tau$ is included in the satisfaction span of until.
By showing inclusion in both directions, we have shown the sets of time points to be equal.
Induction step: $\psi = \chi\, S_I\, \omega$ and $\ell(I) \neq 0$. The step proceeds analogously to until and relies on the same premise. We first show: $Y(m, \chi S_I \omega) \subseteq \bigcup_{i \in Z(m,\omega),\, j \in J_i} j \cap (({}^{+}j \cap i) \oplus I)$. Let $\tau \in Y(m, \chi S_I \omega)$. By the semantics, there exists a $\tau'$ such that $\omega$ is satisfied. Therefore, $\tau'$ is in some $i \in Y(m, \omega)$ and, by the premise, in $Z(m, \omega)$ too. Moreover, $\tau'$ satisfies the timing constraint $I$ of since; thus, $\tau - \tau' \in I$. It follows that there exists a $t \in I$ such that $\tau - \tau' = t$; thus, $\tau = \tau' + t$. Therefore, we obtain that $\tau \in \tau' \oplus I$ and, in turn, $\tau \in i \oplus I$.
We proceed with the inclusion $\supseteq$. Let $\tau \in j \cap (({}^{+}j \cap i) \oplus I)$ for one possible pairing of $i \in Z(m, \omega)$ and $j \in J_i$ with $J_i$ defined as previously. The time point $\tau$ is simultaneously in $j$ and $({}^{+}j \cap i) \oplus I$. It follows that there exists a $\tau' \in {}^{+}j \cap i$ with $\tau \in \tau' \oplus I$, i.e., $\tau - \tau' \in I$. Since $\ell(I) \neq 0$ and $0 \notin I$, $\tau > \tau'$. From $\tau' \in i$ and $\tau - \tau' \in I$, we deduce that at $\tau$ there is a $\tau'$ within the timing constraint $I$ at which $\omega$ is satisfied. Moreover, from $\tau' \in {}^{+}j$, $\tau \in j$, and $\tau > \tau'$, we deduce that there is a non-empty $(\tau', \tau] \subseteq j$ during which $\chi$ is satisfied. Therefore, $\chi\, S_I\, \omega$ is satisfied at $\tau$ and, by definition of the satisfaction span, $\tau \in Y(m, \chi S_I \omega)$.
By showing inclusion in both directions, we have shown the sets of time points to be equal.
Induction step: $\psi = \chi\, S_I\, \omega$ and $\ell(I) = 0$. Equality can be shown analogously to the corresponding step for until.
From the base case and induction steps, it follows that Theorem 1 holds.
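The interval computations used in the induction steps for until and since can be illustrated for a single pairing of intervals. The sketch below is a simplification under stated assumptions: all intervals are closed with integer end-points, so that $j^{+} = {}^{+}j = j$ and adjacency reduces to sharing an end-point, and only the case $\ell(I) \neq 0$ is covered. The function names are hypothetical and the code is not the InTempo implementation.

```python
from typing import Optional, Tuple

# Closed interval [lo, hi]; None encodes the empty set.
Interval = Tuple[int, int]


def add(k: Interval, l: Interval) -> Interval:
    """Interval addition for closed intervals."""
    return (k[0] + l[0], k[1] + l[1])


def sub(k: Interval, l: Interval) -> Interval:
    """Interval subtraction for closed intervals."""
    return (k[0] - l[1], k[1] - l[0])


def intersect(k: Interval, l: Interval) -> Optional[Interval]:
    """Set intersection of two closed intervals."""
    lo, hi = max(k[0], l[0]), min(k[1], l[1])
    return (lo, hi) if lo <= hi else None


def until_span(i: Interval, j: Interval, timing: Interval) -> Optional[Interval]:
    """j ∩ ((j ∩ i) ⊖ I): omega holds during i, chi holds during j."""
    common = intersect(j, i)
    return None if common is None else intersect(j, sub(common, timing))


def since_span(i: Interval, j: Interval, timing: Interval) -> Optional[Interval]:
    """j ∩ ((j ∩ i) ⊕ I): omega holds during i, chi holds during j."""
    common = intersect(j, i)
    return None if common is None else intersect(j, add(common, timing))


# until: chi holds during [0, 10], omega during [8, 12], timing constraint [2, 5]
u = until_span(i=(8, 12), j=(0, 10), timing=(2, 5))
# since: omega holds during [0, 4], chi during [3, 20], timing constraint [2, 5]
s = since_span(i=(0, 4), j=(3, 20), timing=(2, 5))
```

In the until example the result is [3, 8]: at time point 3, omega is satisfied 5 time units later at time point 8 while chi holds in between, whereas at time point 9 any admissible witness for omega lies at 11 or later, after chi has ceased to hold. In the since example the result is [5, 9] by the symmetric argument.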

B.2 Theorem 2
Following is the proof for Theorem 2.
Proof The proof proceeds by induction over D. In the following, we omit the time point of $T_\pi$ as it is clear from the context. Moreover, for a tuple $u := (m, V)$ we refer to the match of the tuple by $m_u$ and the temporal validity of the tuple by $V_u$.
We first show inclusion $\subseteq$. The proof proceeds by tracking the match of a tuple and its temporal validity in the projected answer set of a non-pruned RTM H , via three steps, in (1) the regular answer set over the non-pruned RTM H , and from there to (2) [...] $T_\pi(P_{[\tau_j]})$ (from the premise), and therefore, it is also in the union $T_\pi(P_{[\tau_{j+1}]})$.
In the latter case, $t$ is inserted by the latest answer set, therefore $t \in T_\pi(H_{[\tau_{j+1}]})$. By definition of $T_\pi$, it follows that there exists a $t' \in T(H_{[\tau_{j+1}]})$ with $m_{t'}$ the same as $m_t$ and $V_{t'} \cap [\tau_j - W, \tau_{j+1}] = V_t$ with $W$ the time window of $\psi$ from the query. Step 1 is done.
Henceforth, we denote the interval $[\tau_j - W, \tau_{j+1}]$ by $\Gamma$. By definition of $W$, an element $e$ may affect a temporal validity at an arbitrary time point $x$ only if $x \in e.\lambda \oplus W$ with $\lambda$ the lifespan of $e$, i.e., if $e.dts \geq x - W$. For $V_t$, the earliest time point for which this may happen is $\ell(\Gamma)$. Hence, if we replace $x$ with $\ell(\Gamma)$, we get that all elements with $e.dts \geq \tau_j - 2W$ or, since the current time point is $\tau_{j+1}$, all with $e.dts \in [\tau_j - 2W, \infty)$ may affect $V_t$. By definition of a pruned RTM H , all such elements are contained in $P_{[\tau_{j+1}]}$. Therefore, $m_{t'}$ exists in $P_{[\tau_{j+1}]}$ and elements in $m_{t'}$ that could affect the temporal validity in $V_{t'} \cap \Gamma$ are unchanged. From this, it follows that there is a $t'' \in T(P_{[\tau_{j+1}]})$ with $m_{t''}$ the same as $m_{t'}$ and $V_{t''} = V_{t'}$. Step 2 is done.
By definition of $T_\pi$, it follows that there is a tuple $t'''$ in $T_\pi(P_{[\tau_{j+1}]})$ with $m_{t'''}$ the same as $m_{t''}$ and $V_{t'''}$ equal to $V_{t''} \cap \Gamma$ which is equal to $V_t$. Ergo, $t''' = t$; thus, $t$ is also in $T_\pi(P_{[\tau_{j+1}]})$. Step 3 is done and we have shown the inclusion $\subseteq$.
We proceed with showing the inclusion $\supseteq$. Let $t$ be a tuple in $T_\pi(P_{[\tau_{j+1}]})$ for the query $(n, \psi)$. If the tuple $t$ was in $\bigcup_{i=1}^{j} T_\pi(P_{[\tau_i]})$, the claim can be shown as in the other direction. Otherwise, $t \in T_\pi(P_{[\tau_{j+1}]})$. We follow the steps from the previous direction in reverse. By definition of $T_\pi$ it follows that there is a $t'$ in $T(P_{[\tau_{j+1}]})$ with $m_{t'}$ the same as $m_t$ and $V_t = V_{t'} \cap \Gamma$. By definition of pruning, all elements which were pruned in $P_{[\tau_{j+1}]}$ yet contained in $H_{[\tau_{j+1}]}$ must have a $dts < \tau_j - 2W$, i.e., $e.dts \in [0, \tau_j - 2W)$. These elements cannot affect the elements of $m_{t'}$ or its temporal validity $V_{t'}$. Therefore, $T(H_{[\tau_{j+1}]})$ contains a tuple $t''$ with $m_{t''}$ the same as $m_{t'}$ and $V_{t''} = V_{t'}$. Finally, by definition of $T_\pi$, there is a tuple $t''' \in T_\pi(H_{[\tau_{j+1}]})$ with $m_{t''}$ its match and $V_{t'''} = V_{t''} \cap \Gamma$, therefore equal to $V_t$. Ergo, $t''' = t$, and therefore, $t$ is also in $\bigcup_{i=1}^{j+1} T_\pi(H_{[\tau_i]})$. We have shown inclusion $\supseteq$. By the base case and the induction step, it follows that the theorem holds.
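The pruning criterion on which the proof relies, i.e., that only elements with a deletion timestamp earlier than $\tau_j - 2W$ are removed, can be sketched as follows. The sketch is illustrative only: the `Element` class, the function `prune`, and the flat list standing in for the model are assumptions made for this example, whereas the actual scheme operates on the graph representation of the RTM H .

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Element:
    """Hypothetical model element with creation and deletion timestamps."""
    name: str
    cts: float
    dts: float = math.inf   # infinity: the element has not been deleted


def prune(elements: List[Element], tau_prev: float, window: float) -> List[Element]:
    """Keep every element whose deletion timestamp is at least tau_prev - 2 * window.

    tau_prev plays the role of the previous time point of the history and
    window the role of the time window W of the query.
    """
    threshold = tau_prev - 2 * window
    return [e for e in elements if e.dts >= threshold]


history = [
    Element("a", cts=0, dts=50),    # deleted long ago: can be pruned
    Element("b", cts=10, dts=85),   # deleted recently: may still affect validity
    Element("c", cts=20),           # still alive: always retained
]
pruned = prune(history, tau_prev=100, window=10)   # threshold is 80
```

With a window of 10 time units and a previous time point of 100, only element `a` falls below the threshold of 80 and is discarded; `b` and `c` remain available to the query.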

C Evaluation supplement
This section presents the specification of queries in MG1 and MG2 in the specification languages of MonPoly and Hawk, the two tools used for comparisons with InTempo in Sect. 9.

C.1 Queries in MFOTL
MonPoly uses a specification language which is based on the Metric First-Order Temporal Logic (MFOTL). Like MTGL, MFOTL is a metric temporal logic, which allows for preserving the main structure of the MTGC MG1. On the other hand, lifespans and patterns require special handling, which we described in Sect. 9. The query MG1 in MFOTL is shown in Listing 1. Note that this construction has to be guarded by an (atomic) event, which we also ensure. In practice, this means MonPoly ignores some deletion events compared to InTempo and Hawk.
EXISTS q, x, y, z. (
  probe(x, "sepsis") AND
  ONCE ( ((NOT del_mprobes(x, y)) SINCE mprobes(x, y)) AND
  ONCE ( ((NOT del_mconnected(y, z)) SINCE mconnected(y, z)) AND
  ONCE ( ((NOT del_mservice(y, q)) SINCE mservice(y, q)) AND
  ONCE shs(z) ) ) ) )
AND NOT EVENTUALLY[0,3600] EXISTS a, b, p. (
  probe(a, "anti") AND
  ONCE ( ((NOT del_dprobes(a, b)) SINCE dprobes(a, b)) AND
  ONCE ( ((NOT del_dconnected(b, z)) SINCE dconnected(b, z)) AND
  ONCE ( ((NOT del_dservice(b, p)) SINCE dservice(b, p))