Avoiding unnecessary information loss: correct and efficient model synchronization based on triple graph grammars

Model synchronization, i.e., the task of restoring consistency between two interrelated models after a model change, is a challenging task. Triple graph grammars (TGGs) specify model consistency by means of rules that describe how to create consistent pairs of models. These rules can be used to automatically derive further rules, which describe how to propagate changes from one model to the other or how to change one model in such a way that propagation is guaranteed to be possible. Restricting model synchronization to these derived rules, however, may lead to unnecessary deletion and recreation of model elements during change propagation. This is inefficient and may cause unnecessary information loss, i.e., when deleted elements contain information that is not represented in the second model, this information cannot be recovered easily. Short-cut rules have recently been developed to avoid unnecessary information loss by reusing existing model elements. In this paper, we show how to automatically derive (short-cut) repair rules from short-cut rules to propagate changes such that information loss is avoided and model synchronization is accelerated. The key ingredients of our rule-based model synchronization process are these repair rules and an incremental pattern matcher informing about suitable applications of them. We prove the termination and the correctness of this synchronization process and discuss its completeness. As a proof of concept, we have implemented this synchronization process in eMoflon, a state-of-the-art model transformation tool with inherent support of bidirectionality. Our evaluation shows that repair processes based on (short-cut) repair rules have considerably decreased information loss and improved performance compared to former model synchronization processes based on TGGs.


Introduction
The close collaboration of multiple disciplines such as electrical engineering, mechanical engineering, and software engineering in system design often leads to discipline-spanning system models [27]. Keeping models synchronized by checking and preserving their consistency can be a challenging problem which is not only subject to ongoing research but also of practical interest for industrial applications. Model-based engineering has become an important technique to cope with the increasing complexity of modern software systems. Various bidirectional transformation (bx) approaches [14,3] for models have been suggested to deal with model (view) synchronization and consistency. Across these different approaches the following are important research topics [26,15,32,47,13,33,31]: incrementality, i.e., achieving runtime/complexity dependent on the size of the model change, not on the model size, and least change, i.e., keeping the resulting model as similar as possible to the original one while restoring consistency. In this work, we extend synchronization approaches based on triple graph grammars by specific repair rules to increase incrementality and efficiency and to decrease the amount of change that occurs during synchronization. We show how to avoid unnecessary information loss in model synchronization for scenarios in which one model is changed at a time. Throughout this paper we stick to this scenario of model synchronization. The more general case of concurrent model synchronization where both models have been altered is left to future work.
arXiv:2005.14510v1 [cs.SE] 29 May 2020
Triple Graph Grammars (TGGs) [51] are a declarative, rule-based bidirectional transformation approach, which allows to synchronize models of two different views (usually called the source and target domain in the TGG-related literature). The purpose of a TGG is to define a consistency relationship between pairs of models in a rule-based manner by defining traces between their elements. Given a TGG, its rules can be automatically operationalized into source and forward rules. While the source rules are used to build up models of the source domain, forward rules translate them to the target domain and thereby, establish traces between corresponding model elements. Analogously, target models can be propagated to the source domain by using target and backward rules that can be automatically deduced as well. To avoid redundancy in presentation, we stick to forward propagation throughout this paper.
In [51], a simple batch-oriented synchronization process was presented, which just re-translates the whole source model after each change using forward rules. Several incremental synchronization processes based on TGGs have been presented in the literature thereafter. A process is considered to be incremental if the target model is not recomputed from scratch but unaffected model parts are preserved as much as possible. (Ideally, the runtime (complexity) of a synchronization should depend on the size of the change to the source model and not on the sizes of the source and the target model [26]. This requirement is a good motivation for incremental synchronization.) To obtain an incremental synchronization process, two basic strategies have been pursued (also in combination): (i) The synchronization algorithm takes additional information about forward rules into account. This information might consist of precedence relations over rules [40], dependency information on model elements w.r.t. their creation [26,50], a maximal, still consistent submodel [30], or information about broken matches of forward rules provided by an incremental pattern matcher [42,41]. (ii) The actual propagation of changes in a synchronization process is not based on the application of forward rules exclusively but also uses additional rules. To propagate a deletion on the source part, almost all approaches support revoking an application of a forward rule. The revocation of rule applications is formalized as inverse rule applications in, e.g., [40]. In addition, custom-made rules have been used in synchronization algorithms that describe specific kinds of model edits in any modeling language [24] or in a concrete modeling language [10]. Moreover, generalized forward rules have been defined which allow for re-use of elements [24,27,50]. Summarizing, a number of approaches for incremental model synchronization based on TGGs have been presented in the literature. Some of them, such as [26,27], are informally presented without any guarantee to reestablish the consistency of modified models. Others present their synchronization approaches formally and show their correctness but are only applicable under restricted circumstances [30] or have not been implemented yet, such as [50]. Hence, we still miss a TGG-based model synchronization approach that avoids unnecessary information loss, is proven to be correct, and is efficiently implemented.
In this article, we present an incremental model synchronization approach based on an extended set of TGG rules. In [22], we introduced short-cut rules for handling complex consistency-preserving model updates while avoiding unnecessary information loss. A short-cut rule replaces one rule application with another one while preserving involved model elements (instead of deleting and re-creating them). We deduce source and forward rules from short-cut rules to support complex model edits and their synchronization with the target domain.
We present an incremental model synchronization algorithm based on short-cut rules and show its correctness. We implemented our synchronization approach in eMoflon [43,57,58], a state-of-the-art bidirectional model transformation tool, and evaluate it. Building on eMoflon, we extend the synchronization process suggested by Leblebici et al. [42,41] and rely on the information provided by an incremental pattern matcher to detect when and where to apply our derived repair rules. However, the construction and derivation of these rules is general and could extend other TGG-based synchronization processes as well. The results of our evaluation show that, compared to model synchronization in eMoflon without short-cut repair rules, applying these repair rules makes it possible to react to model changes in a less invasive way by preserving information. In addition, synchronization becomes more efficient.
This paper extends the work in [23]. Beyond [23], we present the actual synchronization process in pseudocode and prove its correctness and termination (based on the results obtained in [23,42,41]), extend our approach to deal with filter NACs (a specific kind of negative application conditions in forward rules), describe the implementation, especially the tool architecture, in more detail, extend the evaluation by investigating the expressiveness of short-cut repair rules at the practical example of code refactorings [21], and consider the related work more comprehensively.
The rest of this paper is organized as follows. In Sect. 2 we give an informal overview of our model synchronization approach. It shall allow readers to grasp the general idea without working through the technical details. In Sect. 3 we recall triple graph grammars and their operationalizations. The construction of short-cut rules and their properties are presented in Sect. 4, while Sect. 5 introduces the derivation of repair rules. Section 6 focuses on the implemented synchronization algorithm and its formal properties. To be understandable to readers who are not experts on algebraic graph transformation, we use a set-theoretical notation in these more technical sections, in contrast to the original contribution in [23] which is based on category theory. Sect. 7 describes the implementation of our model synchronization algorithm in eMoflon, focussing on the tool architecture. Our synchronization approach is evaluated in Sect. 8. Finally, we discuss related work in Sect. 9 and conclude with pointers to future work in Sect. 10. Appendix A presents the rule set used for our evaluation.

Informal Introduction to TGG-Based Model Synchronization
In this section, we illustrate our approach to model synchronization. Using a simple example, we will explain the basic concepts as well as all main ingredients for our new synchronization process. Reading this section and having a passing view on the synchronization algorithm (Section 6.2), evaluation (Section 8), and related works (Section 9) should give an adequate impression of the core ideas of our work.
Graph transformations, and triple graph grammars in particular, are a suitable formal framework to reason about and to implement model transformations and synchronizations [9,18]. (Therefore, we will use the terms "graph" and "model" interchangeably in this paper. We will stick to the graph terminology in more formal sections.) A triple graph consists of three graphs, namely the source, target, and correspondence graph. The latter encodes which elements of source and target graph correlate to each other. This is done by mapping each element of the correspondence graph to an element of the source graph as well as to an element of the target graph (formally these are two graph morphisms). Elements connected via such a mapping are considered to be correlated.
Triple graph grammars (TGGs) [51] declaratively define how consistent models co-evolve. This means that a triple graph is considered to be consistent if it can be derived from a start triple graph (e.g., the empty graph) using the rules of the given grammar. Furthermore, the rules can automatically be operationalized to obtain new kinds of rules, e.g., for translation/synchronization processes. We illustrate our model synchronization process by synchronizing a Java AST (abstract syntax tree) and a custom documentation model as example. This example was originally introduced by Leblebici et al. [44]; it is slightly modified to demonstrate the core concepts of our approach. Note, however, that the evaluation in Sect. 8 is based on a larger and more complex TGG consisting of 24 rules (as presented in App. A). For model synchronization, we consider a Java AST model as source model and its documentation model as target model, i.e., changes in a Java AST model have to be transferred to its documentation model and vice versa. Note that we do not consider concurrent model synchronization, i.e., concurrent changes to both sides that have to be synchronized. Figure 1 depicts the type graph that describes the syntax of our example triple graphs. It shows a Package hierarchy and Classes as the source side, a Folder hierarchy with Doc-Files as target side and correspondence types in between depicted as hexagons. Furthermore, Doc-Files have an attribute content which is of type String. Note that, in our example, there are two correspondence types which can be distinguished by the type of elements they connect on both sides.
TGG rules.
Figure 2 shows the rule set of our example TGG consisting of three rules (assuming an empty start graph):  Root-Rule creates a root Package together with a root Folder and a correspondence link in between. This rule has an empty precondition and creates elements only; they are depicted in green and with the annotation (++). Sub-Rule creates a Package and Folder hierarchy given that an already correlated Package and Folder pair exists. Finally, Leaf-Rule creates a Class and a Doc-File under the same precondition as Sub-Rule. TGG rules can be used to generate triple graphs; triple graphs generated by them are consistent by definition. An example is depicted in Fig. 3 (a) which can be generated by first applying Root-Rule followed by two applications of Sub-Rule and an application of Leaf-Rule: Starting with the empty triple graph, the first rule application just creates the elements rootP and rootF and the correspondence element in between. The second rule application matches these elements and creates subP, subF, subPDoc, their respective incoming edges, and the correspondence element between subP and subF. The other two rule applications are performed similarly.
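The derivation just described can be replayed in a few lines of Python. This is only a minimal sketch under stated assumptions: the class `TripleGraph`, the three rule functions, the names `leafF` and `cDoc`, and the direction of the containment edges (in particular that Sub-Rule places the new Doc-File inside the new Folder) are illustrative choices that are not fixed by the text.

```python
from dataclasses import dataclass, field


@dataclass
class TripleGraph:
    # Hypothetical, simplified encoding: node name -> type, edges as (from, to) pairs.
    src_nodes: dict = field(default_factory=dict)
    trg_nodes: dict = field(default_factory=dict)
    src_edges: set = field(default_factory=set)
    trg_edges: set = field(default_factory=set)
    corr: set = field(default_factory=set)  # correspondence links (source node, target node)


def root_rule(g, p, f):
    # Empty precondition: creates a root Package, a root Folder, and their link.
    g.src_nodes[p] = "Package"
    g.trg_nodes[f] = "Folder"
    g.corr.add((p, f))


def sub_rule(g, sp, sf, p, f, d):
    # Precondition: an already correlated Package/Folder pair (sp, sf).
    assert (sp, sf) in g.corr, "Sub-Rule needs a correlated Package/Folder pair"
    g.src_nodes[p] = "Package"
    g.src_edges.add((sp, p))
    g.trg_nodes[f] = "Folder"
    g.trg_edges.add((sf, f))
    g.trg_nodes[d] = "Doc-File"
    g.trg_edges.add((f, d))  # assumption: the Doc-File lives in the new Folder
    g.corr.add((p, f))


def leaf_rule(g, p, f, c, d):
    # Same precondition as Sub-Rule; creates a Class and its Doc-File.
    assert (p, f) in g.corr, "Leaf-Rule needs a correlated Package/Folder pair"
    g.src_nodes[c] = "Class"
    g.src_edges.add((p, c))
    g.trg_nodes[d] = "Doc-File"
    g.trg_edges.add((f, d))
    g.corr.add((c, d))


def build_example():
    # Root-Rule, two applications of Sub-Rule, then Leaf-Rule.
    g = TripleGraph()
    root_rule(g, "rootP", "rootF")
    sub_rule(g, "rootP", "rootF", "subP", "subF", "subPDoc")
    sub_rule(g, "subP", "subF", "leafP", "leafF", "leafPDoc")
    leaf_rule(g, "leafP", "leafF", "c", "cDoc")
    return g
```

Because every triple graph built this way is obtained purely by rule applications, it is consistent by definition; the assertions in `sub_rule` and `leaf_rule` play the role of the rules' preconditions.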
Operationalization of TGG rules. A TGG can also be used for translating a model of one domain to a correlated model of a second domain. Moreover, a TGG offers support for model synchronization, i.e., for restoring the consistency of a triple graph that has been altered on one side. For these purposes, each TGG rule has to be operationalized to two kinds of rules: Source rules enable changes of source models (e.g., as performed by a user) while forward rules translate such changes to the target model. (Analogously, target and backward rules can be derived.) Applying a source rule followed by an application of its corresponding forward rule yields the same result as applying the TGG rule they originate from. Figure 4 shows the resulting source rules for our example TGG.
Forward translation rules. Figure 5 depicts the resulting forward rules. They have a similar structure compared to their original TGG rules with three important differences. First, elements on the source side are now considered as context and as such have to be matched as a precondition for this rule to be applicable. Second, since we consider elements on the source side to already be present, we have to mark whether an element has already been translated or not. A dedicated annotation can be found on source elements which must have been translated before. On the other hand, → annotations indicate that applying this rule would mark this element as translated. This annotation can be found at elements that are created by the original TGG rule. Possible formalizations of these markings are given, e.g., in [29,42]. The third difference is the use of negative application conditions (NACs) [17] which are indicated with a (nac) and depicted in blue. Using NACs, we are able to not only define necessary structure that has to be found but also the explicit absence of structural elements as in Root-FWD-Rule where we forbid subP to have a parent package.
The theory behind these so-called filter NACs is formalized by Hermann et al. [29] and they can be derived automatically from the rules of a given TGG when computing its forward rules.
Using these rules, we can translate Java AST models to documentation models. The Java AST model on the source side of the triple graph in Fig. 3 (a), for example, is translated to a documentation model such that the result is the complete graph depicted in this part of the figure. To obtain this result we apply Root-FWD-Rule at the root Package, Sub-FWD-Rule at Packages subP and leafP, and finally Leaf-FWD-Rule at Class c. Note that Sub-FWD-Rule, for example, is applicable when matching Packages sp and p of the rule to the Packages rootP and subP of the source graph, respectively, since rootP was marked as translated by the application of Root-FWD-Rule. Without the NAC in Root-FWD-Rule, this rule would also be applicable at the elements subP and leafP. Applying Root-FWD-Rule and translating these elements with it, however, would result in the edges from their parent Packages not being translatable any longer: there is no rule in our TGG rule set that creates edges between packages only. Hence, NACs can direct the translation process to avoid these dead-ends. Filter NACs are derived such that they only prevent rule applications leading to dead-ends.
Existing approaches to model synchronization. Given a triple graph such as the one in Fig. 3 (a), a developer may want to split the modeled project into multiple ones. For this purpose, a subpackage such as subP shall become a root package. Since subP was created and translated as a sub package rather than a root element, this model change introduces an inconsistency. To resolve this issue, the approaches presented in [26,40,42,41] and, to a certain degree, also the one in [30] revert the translation of subP into subF and retranslate subP with an appropriate translation rule such as Root-FWD-Rule. Reverting the former translation step may lead to further inconsistencies as we remove elements that were needed as context elements by other applications of forward rules.
In our example, this leads to a reversion of all translation steps except for the first one, which translates the original root element; the result is shown in Fig. 3 (b). Thereafter, the untranslated elements can be re-translated yielding the result graph in (c). This example reveals two drawbacks of this synchronization approach. First, it may delete and re-create a lot of similar structures, which is inefficient. Second, it may lose information that exists on the target side only, e.g., documentation saved in the content attribute, which is empty now as it cannot be restored from the source side only. Such an information loss is unnecessary as we will show below. Instead of deleting elements and recreating them, we will present a synchronization process that aims to preserve information as much as possible.
Model synchronization with short-cut repair. In [22], we introduced short-cut rules as a kind of sequential rule composition mechanism that allows replacing one rule application with another one while elements are preserved (instead of deleted and recreated). Figure 6 depicts three short-cut rules which can be derived from our original three TGG rules. The first two, Connect-Root-SC-Rule and Make-Root-SC-Rule, are derived from Root-Rule and Sub-Rule. The upper short-cut rule replaces an application of Root-Rule with one of Sub-Rule and turns root elements into sub elements. In contrast, the lower short-cut rule replaces an application of Sub-Rule with one of Root-Rule, thus turning sub elements into root elements. Both short-cut rules preserve the model elements present in their corresponding TGG rules and solely create elements that do not exist yet (++), or delete those depicted in red and annotated with (--) which became superfluous. The third short-cut rule, Move-To-New-Sub-SC-Rule, relocates sub elements and replaces a Sub-Rule application with another one of the same kind.
A short-cut rule is constructed by overlapping two rules with each other where the first one is the replaced and the second the replacing rule. Overlapped elements are preserved such as p and f in Connect-Root-SC-Rule. Created elements that are not overlapped fall into two categories. If the element was created in the replaced rule but is superfluous in the replacing rule, it is deleted, e.g., d in Make-Root-SC-Rule. On the other hand, if the element was not created by the replaced rule but by the replacing rule, then the element is created, e.g., d in Connect-Root-SC-Rule. Context elements can be mapped as well while unmapped context elements from both rules are glued onto the final short-cut rule, e.g., op and of which are context in the replaced rule, and np and nf which are context in the replacing rule. Since there are many possible overlaps for each pair of rules, constructing a reasonable set of short-cut rules depends on the concrete example TGG and the requirement for advanced model changes that go beyond the standard capabilities of TGG-based model synchronizers. Usually, it is worthwhile to construct short-cut rules for frequent model changes in order to increase the synchronization efficiency and decrease information loss in these cases.
In our example above, the user wants to transform the triple graph in Fig. 3 (a) to the one in (c). Using Make-Root-SC-Rule and matching the Packages sp and p to the Packages rootP and subP of the model (a) (and the correspondence nodes and Folders accordingly), this transformation is performed with a single rule application. Analogously, the triple graph (c) can be directly transformed backwards to (a) using Connect-Root-SC-Rule. Thus, these rules allow for complex user-edits on both, source and target side; they preserve the consistency of the model. However, there are also scenarios where applying a short-cut rule may lead to an inconsistent state of the resulting triple graph. A simple example is that of applying Connect-Root-SC-Rule in order to connect subP and subF with rootP and rootF, respectively. The result would be a cycle in both, the Package and the Folder hierarchies; this model is no longer in the language of our example TGG. In Sect. 4, we present sufficient conditions for the application of short-cut rules to avoid such cases.
Operationalization of short-cut rules. Short-cut rules transform both models at once as TGG rules usually do and therefore, they cannot cope with the change of a single model. Hence, similar to TGG rules, we have to operationalize them, thereby obtaining short-cut source and short-cut repair rules. Figure 7 depicts the short-cut source rules which are derived analogously to those of standard TGG rules. In order to be able to handle the deleted edge between rootP and subP, as deleted by Make-Root-Source-Rule, for example, a repair rule is needed that adapts the target graph accordingly by deleting the now superfluous edge between rootF and subF. Figure 8 depicts the resulting repair rules derived from the short-cut rules in Fig. 6. A short-cut rule is forward operationalized by removing deleted elements from the rule's source graph since these deletions have already happened. Furthermore, created source elements become context because we expect them to already exist, e.g., through a prior source rule application. Finally, since short-cut rules transform an application of one rule into that of another, filter NACs are added during operationalization to comply with application conditions of the replacing rule which naturally have to hold when applying the short-cut rule. Hence, Make-Root-Repair-Rule is only applicable and can turn subF into a root Folder if subP has no parent packages and, thus, is indeed a root Package itself. Note that Root-FWD-Rule is only applicable if subP has no parent packages, which Make-Root-Repair-Rule has to incorporate as well. For this reason, Make-Root-Repair-Rule contains nac1, which forbids rootP to be the parent package of subP, and nac2, which forbids subP to have any other parent packages than rootP.
Short-cut repair rules allow graph changes to be propagated directly to the other graph to restore consistency. Revisiting our example of Fig. 3, we are now able to use Make-Root-Repair-Rule to propagate the deleted edge between subP and rootP by deleting the corresponding edge between subF and rootF and the now superfluous Doc-File subPDoc. The result is the consistent triple graph again depicted in Fig. 3 (c) with the content attribute of leafPDoc containing the value 'leaf'. So, this repair does not cause information loss and allows skipping the costly reversion process with the intermediate result in Fig. 3 (b). Summarizing, the user edit of removing the edge between Packages rootP and subP corresponds to the source rule of Make-Root-SC-Rule, namely Make-Root-Source-Rule, and the according update to the target side is performed by Make-Root-Repair-Rule which is the corresponding repair rule. Together, they perform an edit step structurally equivalent to the one depicted by the triple graphs in Fig. 3 (a) and (c); however, the value of the attribute content does not get lost. Alternatively, this step can be obtained by applying the short-cut rule Make-Root-SC-Rule. This is not a coincidence: In [23, Theorem 7], we showed that applying the source rule of a short-cut rule (which corresponds to a user edit on the source part only) followed by an application of the corresponding repair rule at the according match is the same as applying the original short-cut rule.
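The difference between the two propagation strategies can be made concrete with a small Python sketch. It is an illustration under simplifying assumptions, not the eMoflon implementation: the dictionary encoding, all function names, and the naming scheme for re-created elements are hypothetical, and Class c is omitted for brevity.

```python
def make_example():
    # Simplified version of Fig. 3 (a) without Class c.
    return {
        "src_parent": {"rootP": None, "subP": "rootP", "leafP": "subP"},
        "trg_parent": {"rootF": None, "subF": "rootF", "leafF": "subF"},
        "docs": {
            "subPDoc": {"folder": "subF", "content": ""},
            "leafPDoc": {"folder": "leafF", "content": "leaf"},
        },
        "corr": {"rootP": "rootF", "subP": "subF", "leafP": "leafF"},
        "doc_of": {"subP": "subPDoc", "leafP": "leafPDoc"},
    }


def user_edit(m):
    # Source edit: subP becomes a root Package.
    m["src_parent"]["subP"] = None


def descendants(m, pkg):
    # pkg followed by all packages below it, parents before children.
    out = [pkg]
    for child, parent in m["src_parent"].items():
        if parent == pkg:
            out.extend(descendants(m, child))
    return out


def revoke_and_retranslate(m, pkg):
    affected = descendants(m, pkg)
    for p in affected:  # revoke the former translation steps
        f = m["corr"].pop(p)
        del m["trg_parent"][f]
        d = m["doc_of"].pop(p, None)
        if d is not None:
            del m["docs"][d]
    for p in affected:  # retranslate with forward rules
        f = p[:-1] + "F"  # hypothetical naming of re-created Folders
        parent = m["src_parent"][p]
        m["trg_parent"][f] = m["corr"][parent] if parent else None
        m["corr"][p] = f
        if parent is not None:  # Sub-FWD-Rule also creates a fresh Doc-File
            d = p + "Doc"
            m["docs"][d] = {"folder": f, "content": ""}
            m["doc_of"][p] = d


def make_root_repair(m, pkg):
    # NAC: pkg must not have a parent Package any more.
    if m["src_parent"][pkg] is not None:
        return False
    m["trg_parent"][m["corr"][pkg]] = None  # delete the Folder edge
    d = m["doc_of"].pop(pkg, None)          # delete the superfluous Doc-File
    if d is not None:
        del m["docs"][d]
    return True
```

Running both strategies on the same edit yields the same target structure, but only the repair keeps the content of leafPDoc, because the elements below subF are never deleted and re-created.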

Preliminaries: Triple Graphs, Triple Graph Grammars and their Operationalizations
In this section, we recall triple graph grammars (TGGs) and their operationalization [51]. Our derivation of repair rules is based on the construction of so-called short-cut rules [22], which we recall as well. For simplicity, we stick with set-theoretic definitions of the involved concepts (in contrast to category-theoretic ones as, e.g., in [17,18,22,23]). Moreover, while we provide formal definitions for central notions, we will just explain others and provide references for their formal definitions.

Graphs, triple graphs, and their transformations
Graphs and their (rule-based) transformations are suitable to formalize various kinds of models and their evolution, in particular of EMF models [9]. In the context of this work, a graph consists of a set of nodes and a set of directed edges which connect nodes. Graphs may be related by graph morphisms, and a triple graph consists of three graphs connected by two graph morphisms.
Definition 1 (Graph, graph morphism, triple graph, and triple graph morphism) A graph $G = (V, E, s, t)$ consists of a set $V$ of vertices, a set $E$ of edges, and source and target functions $s, t : E \to V$. An element $x$ of $G$ is a node or an edge, i.e., $x \in V$ or $x \in E$. A graph morphism $f : G \to H$ between graphs $G = (V_G, E_G, s_G, t_G)$ and $H = (V_H, E_H, s_H, t_H)$ consists of two functions $f_V : V_G \to V_H$ and $f_E : E_G \to E_H$ that are compatible with the assignment of source and target to edges, i.e., $f_V \circ s_G = s_H \circ f_E$ and $f_V \circ t_G = t_H \circ f_E$. Given a fixed graph $TG$, a graph typed over $TG$ is a graph $G$ together with a graph morphism $type_G : G \to TG$. A typed graph morphism $f : (G, type_G) \to (H, type_H)$ between typed graphs is a graph morphism $f : G \to H$ that respects the typing, i.e., $type_G = type_H \circ f$ (componentwise). A triple graph $G = (G_S \xleftarrow{\sigma_G} G_C \xrightarrow{\tau_G} G_T)$ consists of three graphs $G_S$, $G_C$, $G_T$, called source, correspondence, and target graph, and two graph morphisms $\sigma_G : G_C \to G_S$ and $\tau_G : G_C \to G_T$, called source and target correspondence morphism. A triple graph morphism $f : G \to H$ between two triple graphs $G$ and $H$ consists of three graph morphisms $f_S : G_S \to H_S$, $f_C : G_C \to H_C$, and $f_T : G_T \to H_T$ such that $f_S \circ \sigma_G = \sigma_H \circ f_C$ and $f_T \circ \tau_G = \tau_H \circ f_C$. Given a fixed triple graph $TG$, a triple graph typed over $TG$ is a triple graph $G$ together with a triple graph morphism $type_G : G \to TG$. Again, typed triple graph morphisms are triple graph morphisms that respect the typing. A (typed) triple graph morphism is called injective if all of its component functions are injective.

Example 1 Figure 3 depicts three triple graphs; their common type graph is depicted in Fig. 1. The typing morphism is indicated by annotating the elements of the triple graphs with the types to which they are mapped in the type graph. The nodes in the triple graphs are of types Package, Folder, Class, and Doc-File. In each case, the source graph is depicted to the left and the target graph to the right. The hexagons in the middle constitute the correspondence graphs. Formally, the edges from the correspondence graphs to source and target graphs are morphisms: The edges encode how an individual correspondence node is mapped by the correspondence morphisms. For example, the nodes rootP and rootF of types Package and Folder correspond to each other as they share the same correspondence node as preimage under the correspondence morphisms.
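The compatibility condition for graph morphisms and the way correspondence morphisms relate source and target elements can be checked mechanically. The following Python sketch is illustrative only; the class `Graph`, the functions `is_graph_morphism` and `corresponds`, and the element names `c1`, `c2`, `eP`, `eF` are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Graph:
    nodes: set
    edges: set
    src: dict  # edge -> source node
    tgt: dict  # edge -> target node


def is_graph_morphism(g, h, f_v, f_e):
    # f_v and f_e must be total functions on the nodes and edges of g ...
    if set(f_v) != g.nodes or set(f_e) != g.edges:
        return False
    # ... with images inside h ...
    if not set(f_v.values()) <= h.nodes or not set(f_e.values()) <= h.edges:
        return False
    # ... that commute with the source and target functions.
    return all(
        f_v[g.src[e]] == h.src[f_e[e]] and f_v[g.tgt[e]] == h.tgt[f_e[e]]
        for e in g.edges
    )


def corresponds(sigma_v, tau_v, x, y):
    # x and y correspond if they share a correspondence node as preimage.
    return any(sigma_v[c] == x and tau_v[c] == y for c in sigma_v)


# A fragment of a triple graph: two Packages, two Folders, two correspondence nodes.
G_S = Graph({"rootP", "subP"}, {"eP"}, {"eP": "rootP"}, {"eP": "subP"})
G_T = Graph({"rootF", "subF"}, {"eF"}, {"eF": "rootF"}, {"eF": "subF"})
G_C = Graph({"c1", "c2"}, set(), {}, {})
sigma = {"c1": "rootP", "c2": "subP"}  # node part of the source correspondence morphism
tau = {"c1": "rootF", "c2": "subF"}    # node part of the target correspondence morphism
```

Since the correspondence graph has no edges here, both correspondence morphisms have an empty edge component, so `is_graph_morphism(G_C, G_S, sigma, {})` holds trivially once the node mapping is total.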
Rules offer a declarative means to specify transformations of (triple) graphs. While classically rewriting of triple graphs has been performed using non-deleting rules only, we define a less restricted notion of rules right away since short-cut rules and repair rules derived from them are both potentially deleting. A rule p consists of three triple graphs, namely a left-hand side (LHS) L and a right-hand side (RHS) R and an interface K between them. Applying such a rule to a triple graph G means to choose an injective morphism m from L to G. The elements from m(L \ l(K)) are to be deleted; if this results in a triple graph again, the morphism m is called a match and p is applicable at that match. After this deletion, the elements from R \ r(K) are added; the whole process of applying a rule is also called a transformation (step).
Definition 2 (Rule and transformation step) A rule $p = (L \xleftarrow{l} K \xrightarrow{r} R)$ consists of three triple graphs, $L$, $R$, and $K$, called the left-hand side, right-hand side, and interface, respectively, and two injective triple graph morphisms $l : K \to L$ and $r : K \to R$. A rule is called monotonic, or non-deleting, if $l$ is an isomorphism. In this case we denote the rule as $r : L \to R$. The inverse rule of a rule $p$ is the rule $p^{-1} = (R \xleftarrow{r} K \xrightarrow{l} L)$. Given a triple graph $G$, a rule $p = (L \xleftarrow{l} K \xrightarrow{r} R)$, and an injective morphism $m : L \to G$, the rule $p$ is applicable at $m$ if $D := G \setminus m(L \setminus l(K))$ is a triple graph again. Operator $\setminus$ is understood as node- and edge-wise set-theoretic difference. The source and target functions of $D$ are restricted accordingly. If $D$ is a triple graph, $H := D \cup n(R \setminus r(K))$ is computed. Operator $\cup$ is understood as node- and edge-wise set-theoretic union. $n(R \setminus r(K))$ is a new copy of newly created elements. $n$ can be extended to $R$ by $n(r(K)) = m(l(K))$. The values of the source and target functions for edges from $n(R \setminus r(K))$ with source or target node in $K$ are determined by $m \circ l$, i.e., $s_H(e) := m(l(r^{-1}(s_R(e))))$ and $t_H(e) := m(l(r^{-1}(t_R(e))))$ for such edges $e \in n(E_R)$ with $s_R(e) \in r_V(V_K)$ or $t_R(e) \in r_V(V_K)$, respectively. The whole computation is called a transformation (step), denoted as $G \Rightarrow_{p,m} H$ or just $G \Rightarrow H$; $m$ is called a match, $n$ is called a comatch, and $D$ is the context triple graph of the transformation.
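The two phases of a transformation step, deletion with a check that the context D is still a graph, followed by the addition of fresh copies, can be sketched as follows. To keep the sketch short it works on a single plain graph instead of a triple graph, assumes that K is a common subgraph of L and R (so l and r are inclusions), and assumes that node and edge names are drawn from disjoint name spaces. The names `Graph` and `apply_rule`, the `new_` prefix for created elements, and the Folder-to-Doc-File edge direction in the usage example are illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    nodes: set = field(default_factory=set)
    edges: dict = field(default_factory=dict)  # edge name -> (source node, target node)


def apply_rule(G, L, K, R, m):
    """Apply the rule (L <- K -> R) to G at m; return H, or None if m is no match."""
    if len(set(m.values())) != len(m):  # m has to be injective
        return None
    # D := G \ m(L \ K)
    del_nodes = {m[v] for v in L.nodes - K.nodes}
    del_edges = {m[e] for e in set(L.edges) - set(K.edges)}
    d_nodes = G.nodes - del_nodes
    d_edges = {e: st for e, st in G.edges.items() if e not in del_edges}
    # D must be a graph again: no edge may lose its source or target node.
    if any(s not in d_nodes or t not in d_nodes for s, t in d_edges.values()):
        return None
    # Comatch n: agrees with m on K, maps R \ K to fresh names.
    n = {x: m[x] for x in set(K.nodes) | set(K.edges)}
    used = set(d_nodes) | set(d_edges)
    for x in (R.nodes - K.nodes) | (set(R.edges) - set(K.edges)):
        name = f"new_{x}"
        while name in used:
            name += "'"
        used.add(name)
        n[x] = name
    # H := D ∪ n(R \ K)
    h_nodes = d_nodes | {n[v] for v in R.nodes - K.nodes}
    h_edges = dict(d_edges)
    for e in set(R.edges) - set(K.edges):
        s, t = R.edges[e]
        h_edges[n[e]] = (n[s], n[t])
    return Graph(h_nodes, h_edges)


# Usage: a target-side fragment resembling the deletions of Make-Root-Repair-Rule.
G = Graph(
    {"rootF", "subF", "subPDoc", "leafF"},
    {"a": ("rootF", "subF"), "b": ("subF", "subPDoc"), "c": ("subF", "leafF")},
)
L = Graph({"sf", "f", "d"}, {"e1": ("sf", "f"), "e2": ("f", "d")})
K = Graph({"sf", "f"}, {})
R = Graph({"sf", "f"}, {})
m = {"sf": "rootF", "f": "subF", "d": "subPDoc", "e1": "a", "e2": "b"}
H = apply_rule(G, L, K, R, m)
```

Swapping the roles of L and R in the call applies the inverse rule. A rule that deletes subF without also deleting its incident edges is rejected, because D would contain dangling edges.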
An equivalent definition based on computing two pushouts, a notion from category theory generalizing the union of sets along a common subset, serves as basis when developing a formal theory [17]. In the following and in our examples, we always assume K to be a common subgraph of L and R and the injective morphisms l and r to be the corresponding inclusions; this significantly eases the used notation. When we talk about the union of two graphs G 1 and G 2 along a common subgraph S, we assume that G 1 ∩ G 2 = S.
To enhance expressiveness, a rule may contain negative application conditions (NACs) [17]. A NAC extends the LHS of a rule with a forbidden pattern: A rule is allowed to be applied only at matches which cannot be extended to any pattern forbidden by one of its NACs. If we want to stress that a rule is not equipped with NACs, we call it a plain rule.
Definition 3 (Negative application conditions) Given a rule $p = (L \leftarrow K \rightarrow R)$, a set of negative application conditions (NACs) for $p$ is a finite set of graphs $NAC = \{N_1, \ldots, N_k\}$ such that $L$ is a subgraph of every one of them, i.e., $L \subset N_i$ for $1 \leq i \leq k$.
A rule $(p = (L \leftarrow K \rightarrow R), NAC)$ with NACs is applicable at a match $m : L \to G$ if the plain rule $p$ is and, moreover, for none of the NACs $N_i$ there exists an injective morphism $q_i : N_i \to G$ that extends $m$, i.e., whose restriction to $L$ coincides with $m$.
Example 2 Different sets of triple rules are depicted in Figs. 2, 5, 6, and 8. All rules in these figures are presented in an integrated form: Instead of displaying LHS, RHS, and the interface as three separate graphs, just one graph is presented where the different roles of the elements are displayed using markings (and color). The unmarked (black) elements constitute the interface of the rule, i.e., the context that has to be present to apply a rule. Unmarked elements and elements marked with (−−) (black and red elements) form the LHS while unmarked elements and elements marked with (++) (black and green elements) constitute the RHS. Elements marked with (nac) (blue elements) extend the LHS to a NAC; different NACs for the same rule are distinguished using names. As triple rules are depicted, their LHSs and RHSs are triple graphs themselves. For example, the LHS L of Sub-Rule (Fig. 2) consists of the nodes sp and sf of types Package and Folder and the correspondence node in between.
While, e.g., all rules in Fig. 2 are monotonic, Make-Root-SC-Rule is not as it deletes edges and a Doc-File. Applying Make-Root-SC-Rule to the triple graph (a) in Fig. 3 leads to the triple graph (c), when Package-nodes sp and p (of the rule) are matched to rootP and subP (in the graph), respectively. (The Folders on the target part are mapped accordingly.) The rules Connect-Root-SC-Rule and Make-Root-SC-Rule are inverse to each other.
Finally, Root-FWD-Rule (Fig. 5) depicts a rule that is equipped with a NAC: It is applicable only at Packages that are not referenced by other Packages. This means that it is applicable at node subP in the triple graph (b) depicted in Fig. 3, but not at node leafP.
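The effect of a NAC on applicability can be illustrated with a minimal Python sketch. It models only the NAC of Root-FWD-Rule as described above (no other Package may reference the matched Package) and ignores all other applicability conditions of the rule. The function name, the edge encoding, and the small source graph are assumptions made for illustration; they are not read off the figures.

```python
def nac_of_root_fwd_rule_holds(package, containment_edges):
    """Check the NAC of Root-FWD-Rule at a candidate match.

    containment_edges is a set of (parent, child) pairs between Packages
    (a hypothetical encoding). The NAC forbids any Package that references
    the matched one, so the match is admissible only if no such edge exists.
    """
    return all(child != package for _parent, child in containment_edges)


# Assumed source graph: subP contains leafP, and nothing references subP.
edges = {("subP", "leafP")}
applicable_at_sub = nac_of_root_fwd_rule_holds("subP", edges)    # NAC satisfied
applicable_at_leaf = nac_of_root_fwd_rule_holds("leafP", edges)  # NAC violated
```

In this encoding a match is rejected as soon as a single forbidden edge is found, which mirrors the requirement that the match must not be extendable to the forbidden pattern.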

Triple graph grammars and their operationalization
Sets of triple graph rules can be used to define languages.
Definition 4 (Triple graph grammar) A triple graph grammar (TGG) GG = (R, S) consists of a set of plain, monotonic triple rules R and a start triple graph S. In case of typing, all rules of R and S are typed over the same triple graph.
The language of a TGG GG, denoted as $\mathcal{L}(GG)$, is the reflexive and transitive closure of the relation induced by transformation steps via rules from R, i.e.,
$$\mathcal{L}(GG) := \{ G \mid S \Rightarrow_R^* G \},$$
where $\Rightarrow_R^*$ denotes a finite sequence of transformation steps where each rule stems from R.
The projection of the language of a TGG to its source part is the set
$$\mathcal{L}_S(GG) := \{ G_S \mid (G_S \leftarrow G_C \rightarrow G_T) \in \mathcal{L}(GG) \},$$
i.e., it consists of the source graphs of the triple graphs of $\mathcal{L}(GG)$.
In applications, quite frequently, the start triple graph of a TGG is just the empty triple graph. We use ∅ to denote the empty graph, the empty triple graph, and morphisms starting from the empty (triple) graph; it will always be clear from the context what is meant. To enhance expressiveness of TGGs, their rules can be extended with NACs or with some attribution concept for the elements of generated triple graphs. A recent overview of such concepts and their expressiveness can be found in [59]. In the following, we first restrict ourselves to TGGs that contain plain rules only and discuss extensions of our approach subsequently.

Example 3
The rule set depicted in Fig. 2, together with the empty triple graph as start graph, constitutes a TGG. The triple graphs (a) and (c) in Fig. 3 are elements of the language defined by that grammar while the triple graph (b) is not.
The operationalization of triple graph rules into source and forward (or, analogously, into target and backward) rules is central to working with TGGs. Given a rule, its source rule performs the rule's actions on the source graph only while its forward rule propagates these to correspondence and target graph. This means that, for example, source rules can be used to generate the source graph of a triple graph while forward rules are then used to translate the source graph to correspondence and target side such that the result is a triple graph in the language of the TGG. Classically, this operationalization is defined for monotonic rules only [51]. We will later explain how to extend it to arbitrary triple rules. We also recall the notion of marking [41] and consistency patterns which can be used to check if a triple graph belongs to a given TGG.
Definition 5 (Source and forward rule. Consistency pattern) Given a plain, monotonic triple rule $r = L \to R$ with $L = (L_S \leftarrow L_C \rightarrow L_T)$ and $R = (R_S \leftarrow R_C \rightarrow R_T)$, its source rule is defined as
$$r^S = (L_S \leftarrow \emptyset \rightarrow \emptyset) \to (R_S \leftarrow \emptyset \rightarrow \emptyset).$$
Its forward rule is defined as
$$r^F = (R_S \leftarrow L_C \rightarrow L_T) \to (R_S \leftarrow R_C \rightarrow R_T).$$
We denote the left- and right-hand sides of source and forward rules of a rule $r$ by $L^S$, $L^F$, $R^S$, and $R^F$, respectively. The consistency pattern derived from $r$ is the rule that, upon application, just checks for the existence of the RHS of the rule without changing the instance it is applied to.
Given a rule $r$, each element $x \in R^S \setminus L^S$ is called a source marking element of the forward rule $r^F$; each element of $L^S$ is called required. Given an application $G \Rightarrow_{r^F, m^F} H$ of a forward rule $r^F$, the elements of $G_S$ that have been matched by source marking elements of $r^F$, i.e., the elements of the set $m^F(R^S \setminus L^S)$, are called marked elements. A transformation sequence
$$G_0 \Rightarrow_{r_1^F, m_1^F} G_1 \Rightarrow_{r_2^F, m_2^F} \cdots \Rightarrow_{r_t^F, m_t^F} G_t \qquad (1)$$
is called creation preserving if no two rule applications in sequence (1) mark the same element. It is called context preserving if, for each rule application in sequence (1), the required elements have been marked by a previous rule application in sequence (1). If these two properties hold for sequence (1), it is called consistently marking. It is called entirely marking if every element of the common source graph $G_S$ of the triple graphs of this sequence is marked by a rule application in sequence (1).
The most important formal property of this operationalization is that applying a (sequence of) source rule(s) followed by applying the (sequence of) corresponding forward rule(s) yields the same result as applying the (sequence of) original TGG rule(s) assuming consistent matches [51,16].
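The splitting of a triple rule into its source and forward rule can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: each component of a rule is reduced to a set of element names, the morphisms between the components are left out, and the encoding of Sub-Rule (including all element names) is hypothetical and simplified.

```python
from dataclasses import dataclass

EMPTY = frozenset()


@dataclass(frozen=True)
class TripleRule:
    """A monotonic triple rule given by element names per component.

    Morphisms between the components are deliberately left out.
    """
    lhs_s: frozenset
    lhs_c: frozenset
    lhs_t: frozenset
    rhs_s: frozenset
    rhs_c: frozenset
    rhs_t: frozenset

    def created(self):
        """Elements created per component (RHS minus LHS)."""
        return (self.rhs_s - self.lhs_s,
                self.rhs_c - self.lhs_c,
                self.rhs_t - self.lhs_t)


def source_rule(r):
    """Perform the actions of r on the source component only."""
    return TripleRule(r.lhs_s, EMPTY, EMPTY, r.rhs_s, EMPTY, EMPTY)


def forward_rule(r):
    """Require the complete source RHS; create correspondence and target."""
    return TripleRule(r.rhs_s, r.lhs_c, r.lhs_t, r.rhs_s, r.rhs_c, r.rhs_t)


# Hypothetical, simplified encoding of Sub-Rule (names are illustrative).
sub_rule = TripleRule(
    lhs_s=frozenset({"sp"}),
    lhs_c=frozenset({"sp~sf"}),
    lhs_t=frozenset({"sf"}),
    rhs_s=frozenset({"sp", "p", "sp->p"}),
    rhs_c=frozenset({"sp~sf", "p~f"}),
    rhs_t=frozenset({"sf", "f", "sf->f"}),
)

src, fwd = source_rule(sub_rule), forward_rule(sub_rule)
# Together, both rules create exactly what the original rule creates.
combined = tuple(a | b for a, b in zip(src.created(), fwd.created()))
```

The final line reflects the property stated above on the level of created elements: the source rule contributes the source part, the forward rule contributes the correspondence and target parts, and nothing is created twice.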
Moreover, there is a correspondence between triple graphs belonging to the language of a given TGG and consistently and entirely marking transformation sequences via its forward rules. We formally state this correspondence as it is an ingredient for the proof of correctness of our synchronization algorithm.
Lemma 1 (see [42,Fact 1] or [41,Lemma 4]) Let a TGG GG be given. There exists a triple graph $G = (G_S \leftarrow G_C \rightarrow G_T) \in \mathcal{L}(GG)$ if and only if there exists a transformation sequence like the one depicted in (1) via forward rules from GG such that $G_0 = (G_S \leftarrow \emptyset \rightarrow \emptyset)$, $G_t = (G_S \leftarrow G_C \rightarrow G_T)$, and the transformation sequence is consistently and entirely marking.
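The marking properties used in this lemma can be checked mechanically once each forward rule application is reduced to the source elements it requires and the source elements it marks. The following Python sketch does this; the function name and the encoding of a sequence as a list of (required, marked) pairs are assumptions made for illustration.

```python
def check_marking(source_elements, steps):
    """Classify a sequence of forward rule applications.

    steps is a list of (required, newly_marked) pairs of sets of source
    elements, in application order (a hypothetical encoding).
    """
    marked = set()
    creation_preserving = True
    context_preserving = True
    for required, newly_marked in steps:
        # Required elements must have been marked by an earlier step.
        if not set(required) <= marked:
            context_preserving = False
        # No element may be marked twice.
        if marked & set(newly_marked):
            creation_preserving = False
        marked |= set(newly_marked)
    return {
        "creation_preserving": creation_preserving,
        "context_preserving": context_preserving,
        "consistently_marking": creation_preserving and context_preserving,
        "entirely_marking": marked == set(source_elements),
    }


# Illustrative source graph: rootP contains subP.
source = {"rootP", "subP", "rootP->subP"}
good = check_marking(source, [(set(), {"rootP"}),
                              ({"rootP"}, {"subP", "rootP->subP"})])
wrong_order = check_marking(source, [({"rootP"}, {"subP", "rootP->subP"}),
                                     (set(), {"rootP"})])
```

In the second sequence every element is still marked exactly once, but the first step requires rootP before it has been marked, so the sequence is not context preserving.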
For practical purposes, forward rules and consistency patterns may be equipped with so-called filter NACs which can be automatically derived from the set of rules of the given TGG. The simplest examples of such filter NACs arise through the following analysis: For each rule that translates a node without translating adjacent edges, it is first checked if other rules translate the same type of node but also translate an adjacent edge of some type. If this is the case, it is checked if there are further rules which only translate the detected kind of adjacent edge. If none is found, the original rule is equipped with a NAC forbidding the respective kind of edges. This avoids a dead-end in translation processes: In the presence of such a node with its adjacent edge, using the original rule to only translate the node leaves an untranslatable edge behind. The filter NAC of Root-FWD-Rule is derived in exactly this way. For the exact and more sophisticated derivation processes of filter NACs, we refer to the literature [29,35]. For our purposes it suffices to recall their distinguishing property: Filter NACs do not prevent "valid" transformation sequences of forward rules. We state this property in the terminology of our paper.
via the forward rules (without filter NACs) derived from R if and only if the sequence exists, i.e., if none of the filter NACs blocks one of the above rule applications.

Example 4
The source rules of the triple rules depicted in Fig. 2 are depicted in Fig. 4. They allow creating Packages and Classes on the source side without changing correspondence and target graphs. The formally existing empty graphs at correspondence and target sides are not depicted. The corresponding forward rules are given in Fig. 5. Their required elements are annotated with and their source marking elements with → . The rule Root-FWD-Rule is equipped with a filter NAC: The given grammar does not allow creating a Package that is contained in another one with its original rule Root-Rule. Hence, the derived forward rule should not be used to translate a Package, which is contained in another one, to a Folder. As evident in the examples, the application of a source rule followed by the application of the corresponding forward rule amounts to the application of the original triple rule if matched consistently.
The consistency patterns that are derived from the TGG rules of our example are depicted in Fig. 9. They just check for existence of the pattern that occurs after applying the original TGG rule. A consistency pattern is equipped with the filter NACs of both its corresponding forward and backward rule. In our example, only Root-Consistency-Pattern receives such NACs: one from Root-FWD-Rule and the second one from the analogous backward rule. An occurrence of a consistency pattern in our example model indicates that a specific location corresponds to a concrete TGG rule application. Hence, the disappearance of such a match indicates that a formerly intact rule application has been broken and needs some fixing. We call this a broken match for a consistency pattern or, for short, a broken consistency match. Practically, we will exploit an incremental pattern matcher to notify us about such disappearances.

Sequential independence
The proof of correctness of our synchronization approach relies on the notion of sequential independence. Transformations that are sequentially independent can be performed in arbitrary order.
Definition 6 (Sequential independence) Given two transformation steps $G \Rightarrow_{r_1, m_1} H_1 \Rightarrow_{r_2, m_2} X$ via plain rules $r_1, r_2$, these are sequentially independent if
$$n_1(R_1) \cap m_2(L_2) \subseteq n_1(K_1) \cap m_2(K_2) \qquad (2)$$
where $n_1$ is the comatch of the first transformation.
By the Local Church-Rosser Theorem, the order of sequentially independent transformations can be switched. This means that, given a sequentially independent transformation sequence $G \Rightarrow_{r_1, m_1} H_1 \Rightarrow_{r_2, m_2} X$, there exists a sequentially independent transformation sequence $G \Rightarrow_{r_2, m_2'} H_2 \Rightarrow_{r_1, m_1'} X$ [Theorem 3.20]. If $r_1$ and $r_2$ are equipped with NACs $NAC_1$ and $NAC_2$, respectively, transformation steps as above are sequentially independent if condition (2) holds and, moreover, the thereby induced matches $m_2' : L_2 \to G$ and $m_1' : L_1 \to H_2$ both satisfy the respective sets of NACs. In particular, the Local Church-Rosser Theorem still holds.
In our setting of graph transformation, it is easy to check the sequential independence of transformations [17,19]. A sequence t 1 ;t 2 of two transformation steps is sequentially independent if and only if the following holds.
1. t 2 does not match an element that t 1 created.
2. t 2 does not delete an element that t 1 matches.
3. t 2 does not create an element that t 1 forbids.
4. t 1 does not delete an element that t 2 forbids.
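The four conditions above translate directly into set operations. The following Python sketch assumes that each transformation step is summarized by the concrete graph elements it matches, deletes, creates, and forbids; representing a NAC by the concrete elements that would complete a forbidden pattern at the chosen match is a simplification, and the class and function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """Summary of one transformation step (hypothetical encoding)."""
    matched: set      # elements in the image of the LHS
    deleted: set      # elements removed by the step
    created: set      # elements added by the step
    # Elements whose presence would violate a NAC at this match.
    forbidden: set = field(default_factory=set)


def sequentially_independent(t1, t2):
    """Check the four conditions for the sequence t1; t2."""
    return (not (t2.matched & t1.created)        # t2 does not use what t1 created
            and not (t2.deleted & t1.matched)    # t2 does not delete what t1 matches
            and not (t2.created & t1.forbidden)  # t2 does not create what t1 forbids
            and not (t1.deleted & t2.forbidden)) # t1 does not delete what t2 forbids


t1 = Step(matched={"a"}, deleted=set(), created={"b"})
t2_dependent = Step(matched={"b"}, deleted=set(), created={"c"})
t2_independent = Step(matched={"a"}, deleted=set(), created={"c"})
```

Here t2_dependent matches the element b that t1 created, so the two steps cannot be switched, whereas t2_independent only shares preserved context with t1.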

Short-cut Rules
Short-cut rules were introduced in [22] to take back an application of a TGG rule and to apply another one instead. This exchange of applications shall be performed such that information loss is avoided. This means that model elements are checked for reuse before they are deleted. We recall the construction of short-cut rules first and discuss their expressivity thereafter. Finally, we identify conditions for language-preserving applications of short-cut rules.

Construction of short-cut rules
We recall the construction of short-cut rules in a semiformal way and reuse an example of [22] for illustration; a formal treatment (in a category-theoretical setting) can be found in that paper. Given an inverse monotonic rule (i.e., a rule that purely deletes) and a monotonic rule, a short-cut rule combines their respective actions into a single rule. Its construction makes it possible to identify elements that are deleted by the first rule as recreated by the second one. To motivate the construction, assume that two monotonic rules r 1 : L 1 → R 1 and r 2 : L 2 → R 2 are given. Applying the inverse rule of r 1 to a triple graph G provides an image of L 1 in the resulting triple graph H. When applying r 2 thereafter, the chosen match for L 2 in H may intersect with the image of L 1 , yielding a triple graph L ∩ . This intersection can also be understood as saying that L ∩ provides a partial match for L 2 . The inverse application of the first rule deletes elements which may be recreated again. In this case, it is possible to extend the sub-triple graph L ∩ of H to a sub-triple graph R ∩ of H with these elements. In particular, R ∩ is a sub-triple graph of R 1 and R 2 as it only includes elements that have been deleted by the first rule and created by the second. Based on this observation, the construction of short-cut rules is defined as follows (slightly simplified and directly merged with an example):
Construction 7 (Short-cut rule) Let two plain, monotonic rules r 1 = L 1 → R 1 and r 2 = L 2 → R 2 be given. A short-cut rule r sc for the rule pair (r 1 , r 2 ), where r 1 is considered to be applied inversely, is constructed in the following way:
1. Choice of common kernel: A (potentially empty) sub-triple graph L ∩ of L 1 and L 2 and a sub-triple graph R ∩ of R 1 and R 2 with L ∩ ⊆ R ∩ are chosen. We call L ∩ ⊆ R ∩ a common kernel of both rules. Fig. 10 gives an example of such a common kernel for the rule pair (Root-Rule, Sub-Rule); the common kernel is depicted in the center of the figure. This choice of a common kernel will lead to Connect-Root-SC-Rule as resulting short-cut rule. In this example, L ∩ is empty and R ∩ extends L ∩ by identifying the Packages p, Folders f, and the correspondence node in between. The elements of R ∩ \ L ∩ , called recovered elements, are to become the elements that are preserved by an application of the short-cut rule compared to reversely applying the first rule followed by applying the second one (provided that these applications overlap in L ∩ ). In the example case, the whole graph R ∩ is recovered as L ∩ is empty.
2. Construction of LHS and RHS: One first computes the union L ∪ of L 1 and L 2 along L ∩ . The result is then united with R 1 along L 1 and R 2 along L 2 , respectively, to compute the LHS and the RHS of the short-cut rule.
In our example, the presumed elements, i.e., the elements of L 2 that are not identified with elements of L 1 , are the Package sp, the Folder sf, and the correspondence node in between.
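The two construction steps described above can be mimicked on sets of element names. The following Python sketch is only an illustration under explicit assumptions: triple graphs are flattened to sets of names, the two input rules share names exactly on the chosen common kernel (so that plain set union models the union along a common subgraph), and the encodings of Root-Rule and Sub-Rule are simplified and hypothetical. The interface of the short-cut rule is not computed.

```python
def short_cut_rule(l1, r1, l2, r2, l_cap, r_cap):
    """Build LHS and RHS of a short-cut rule for (r1, r2), r1 applied inversely.

    All arguments are sets of element names. Assumption: names are shared
    between the two rules exactly on the common kernel l_cap <= r_cap.
    """
    assert l1 <= r1 and l2 <= r2, "input rules must be monotonic"
    assert l_cap <= r_cap, "kernel must satisfy L_cap <= R_cap"
    assert l1 & l2 == l_cap and r1 & r2 == r_cap, "names shared only on kernel"
    l_union = l1 | l2                  # union of L1 and L2 along the kernel
    return {
        "lhs": l_union | r1,           # united with R1 along L1
        "rhs": l_union | r2,           # united with R2 along L2
        "recovered": r_cap - l_cap,    # preserved instead of deleted/recreated
        "presumed": l2 - l_cap,        # context demanded by the second rule
    }


# Simplified, hypothetical encodings (c and sc are correspondence nodes).
root = {"l": set(), "r": {"p", "c", "f"}}
sub = {"l": {"sp", "sc", "sf"},
       "r": {"sp", "sc", "sf", "p", "c", "f", "sp->p", "sf->f"}}
kernel = (set(), {"p", "c", "f"})

connect = short_cut_rule(root["l"], root["r"], sub["l"], sub["r"], *kernel)
make_root = short_cut_rule(sub["l"], sub["r"], root["l"], root["r"], *kernel)
```

With this kernel, the first call yields a rule that only creates the two containment edges, while the second call yields a rule that only deletes them; in both cases p, f, and the correspondence node between them are recovered rather than deleted and recreated.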

Example 5
More examples of short-cut rules are depicted in Fig. 6. Both Connect-Root-SC-Rule and Make-Root-SC-Rule are constructed for the rules Root-Rule and Sub-Rule. Switching the role of the inverse rule, two short-cut rules can be constructed having equal common kernels. In both cases, the Package p, the Folder f and the correspondence node between them are recovered elements, as these elements would have been deleted and re-created otherwise. While in Connect-Root-SC-Rule, the presumed elements are the Package sp and the Folder sf with a correspondence node in between, the set of presumed elements of Make-Root-SC-Rule is empty.
Another possible common kernel for Root-Rule and Sub-Rule is one where R ∩ is an empty triple graph as well. As the resulting short-cut rule just copies both rules (one of them inversely) next to each other, this rule is not interesting for our desired application.

Expressivity of short-cut rules
Given a set of rules, there are two degrees of freedom when deciding which short-cut rules to derive from them: First, one has to choose for which pairs of rules short-cut rules shall be derived. Secondly, given a pair of rules, there is typically not only one way to construct a short-cut rule for them: In general, there are different choices for a common kernel. However, when fixing a common kernel, i.e., L ∩ and R ∩ , the result of the construction is uniquely determined. If, moreover, the LHSs and RHSs of the rules are finite, the set of possible common kernels is finite as well.
As short-cut rules correspond to possible (complex) edits of a triple graph, the more short-cut rules are derived, the more user edits are available which can directly be propagated by the corresponding repair rules. However, the number of rules that have to be computed (and maintained throughout the synchronization process) in this way would quickly grow. Moreover, several of the constructed rules might capture edits that are possible in principle but unlikely to ever be performed in a realistic scenario. Hence, some trade-off between expressivity and maintainability has to be found.
We briefly discuss the effects of these choices: The construction of short-cut rules is defined for any two monotonic rules [22]; we do not need to restrict ourselves to the rules of a given TGG but may also use monotonic rules that have been constructed as so-called concurrent rules [17] of given TGG rules as input for the short-cut rule construction. A concurrent rule combines the actions of two (or more) subsequent rule applications into a single rule. Hence, deriving short-cut rules from concurrent rules that have been built from given TGG rules leads to short-cut rules that capture even more complex edits in a single rule. The next example presents such a derived short-cut rule. While our conceptual approach is easily extended to support such rules, we currently stick with short-cut rules directly derived from a pair of rules of the given TGG in our implementation.

Example 6
The short-cut rule Delete-Middle-SC-Rule depicted in Fig. 13 is not directly derived from the TGG rules depicted in Fig. 2. Instead, the concurrent rule of two given applications of Sub-Rule is constructed first. This concurrent rule directly creates a chain of two Packages and Folders below an existing pair of Package and Folder. The rule in Fig. 13 is a short-cut rule of this concurrent rule and Sub-Rule. It takes back the creation of a chain such that the bottom package is directly included in the top package in Fig. 13.
Concerning the choice of a common kernel, we follow two strategies. In both strategies, we overlap as many of the newly created elements of the two input rules as possible since these are the elements that we try to preserve.
A minimal overlap overlaps created elements only, i.e., no context elements. An example is Sub-Rule overlapped with itself, which results in Move-To-New-Sub-SC-Rule and corresponds to a move refactoring step.
A maximal overlap overlaps not only created elements of both rules but also context elements. Creating such an overlap for Sub-Rule with itself would result in the Sub-Consistency-Pattern, which has no effect when applied. However, when overlapping different rules with each other, it is often useful to re-use context elements. This is the case, for  Both strategies aim to create different kinds of short-cut rules with specific purposes. Since generating all possible overlaps and thus short-cut rules is expensive, we chose a heuristic approach to generate a useful subset of them.
As we are dealing with triple graphs being composed of source, target and correspondence graphs, the overlap of source graphs should correspond to that of target graphs. This restricts the kind of "reuse" of elements the derived short-cut rules enable. The allowance of any kind of overlap may include unintended ones. We argue for the usefulness of these strategies in our evaluation in Sect. 8.

Language preserving short-cut rule applications
The central intuition behind the construction of short-cut rules is to replace the application of a monotonic triple rule by another one. In this sense, a short-cut rule captures a complex edit operation on triple graphs that (in general) cannot be performed directly using the rules of a TGG. We illustrate this behaviour in the following. Subsequently, we discuss the circumstances under which applications of short-cut rules are "legal" in the sense that the result still belongs to the language of the respective TGG.

Let a TGG GG and a sequence of transformations
$$G_0 \Rightarrow_{r_1, m_1} G_1 \Rightarrow_{r_2, m_2} G_2 \Rightarrow \cdots \Rightarrow_{r_t, m_t} G_t \qquad (3)$$
be given where all the $r_i$, $1 \le i \le t$, are rules of GG, all the $m_i$ denote the respective matches, and $G_0 \in \mathcal{L}(GG)$; in particular, $G_t \in \mathcal{L}(GG)$ as well. Fixing some $j \in \{1, \dots, t\}$ and some rule $r$ of GG, we construct a short-cut rule $r_{sc}$ for $(r_j, r)$ with some common kernel $L_\cap \subseteq R_\cap$. Next, we can consider the transformation sequence
$$G_0 \Rightarrow_{r_1, m_1} G_1 \Rightarrow_{r_2, m_2} G_2 \Rightarrow \cdots \Rightarrow_{r_t, m_t} G_t \Rightarrow_{r_{sc}, m_{sc}} G_t'$$
that arises by appending an application of $r_{sc}$ to transformation sequence (3). Under certain technical circumstances (which we will state below), this transformation sequence is equivalent$^{6}$ to the sequence
$$G_0 \Rightarrow_{r_1, m_1} \cdots \Rightarrow_{r_{j-1}, m_{j-1}} G_{j-1} \Rightarrow_{r, m_{sc}'} G_j' \Rightarrow_{r_{j+1}, m_{j+1}'} \cdots \Rightarrow_{r_t, m_t'} G_t'' \qquad (4)$$
where the application of $r_j$ at match $m_j$ is replaced by an application of $r$ at a match $m_{sc}'$ that is derived from the match $m_{sc}$ of the short-cut rule. The following matches $m_{j+1}', \dots, m_t'$ have been adapted accordingly. They still match the same elements but formally they do so in other triple graphs. In particular, $G_t''$, the result of the transformation sequence (4), is isomorphic to $G_t'$ and hence, $G_t'$ can be understood as arising by replacing the j-th rule application in the transformation sequence (3) by an application of the rule $r$; thus, $G_t'$ also belongs to the language of the TGG: The sequence (4) starts at a triple graph $G_0 \in \mathcal{L}(GG)$ and solely consists of applications of rules from GG.
Example 7 Consider the triple graph depicted in Fig. 3 (a). It arises by applying Root-Rule, followed by two applications of Sub-Rule, and finally an application of Leaf-Rule. When matched as already described in the introductory example, an additional application of Make-Root-SC-Rule to this triple graph results in the one depicted in Fig. 3 (c). Alternatively, this can be derived by two applications of Root-Rule, followed by an application of Sub-Rule and Leaf-Rule each. As schematically depicted in Fig. 14, the application of the short-cut rule Make-Root-SC-Rule transforms a transformation sequence deriving the first triple graph into a transformation sequence deriving the second one by replacing an application of Sub-Rule by one of Root-Rule.
In the following, we state when the above described behaviour is the case (in a somewhat less technical language than originally used).

Theorem 2 ([23, Theorem 8])
Let the transformation sequence (3) be given and let r sc be a short-cut rule that is derived from (r j , r). If the following three conditions are met, this sequence is equivalent to sequence (4) where original TGG rules are applied only.
1. Reversing match: The application of r sc at m sc reverses the application of r j , i.e., n j (R j ) = m sc | R j (R j ).
2. Sequential independence:
(a) Non-disabling match: The application of r sc at m sc does not delete elements used in the applications of r j+1 , . . . , r t .
(b) Context-preserving match: The match m sc for r sc already exists in G j−1 . Since the assumption on the match to be reversing already ensures this for elements of L sc that stem from R j , context-preservation ensures in particular that the presumed elements of r sc are matched to elements already existing in G j−1 .
$^{6}$ The formal notion of equivalence used here is called switch equivalence and captures the idea that, in case of sequential independence, the order of rule applications might be switched while using basically the same match for each rule application and receiving the same result; compare, e.g., [38,8].
Example 8 We illustrate each of the above mentioned conditions:
1. Reversing match: In our example of matching Connect-Root-SC-Rule to the triple graph (c) in Fig. 3 this means that its nodes p and f (and the correspondence node in between) are allowed to be matched only to elements that have been created using Root-Rule. In this way, it is avoided to misuse the rule to introduce Packages (and Folders) that are contained by more than one Package (or Folder).
2. Non-disabling match: For example, Delete-Middle-SC-Rule from Fig. 13 is not allowed to delete Packages and Folders that already contain Classes or Doc-Files, respectively.
3. Context-preserving match: Returning to our example of matching Connect-Root-SC-Rule to the triple graph (c) in Fig. 3 this means that as soon as nodes subP and subF in that triple graph have been chosen as matches for the nodes p and f of Connect-Root-SC-Rule, the nodes leafP and leafF are not allowed to be chosen as matches for nodes sp and sf of Connect-Root-SC-Rule. The creation of leafP and leafF depends on subP and subF being created first. In this way, the introduction of cyclic dependencies between elements is avoided.

Constructing Language-Preserving Repair Rules
In this section, we formally define the derivation of repair rules from a given TGG and characterize valid applications of these. Our general idea is to construct repair rules that can be used during model synchronization processes that are based on the formalism of TGGs. Our construction of such repair rules is based on short-cut rules which we recalled in Section 4.

Deriving repair rules from short-cut rules
Once short-cut rules have been defined, they can be operationalized to get edit rules for source graphs and forward rules that repair these edits. As such edits may delete source elements, correspondence elements may be left without corresponding source elements. Hence, the resulting triple graphs show a form of partiality. They are called partial triple graphs. Given a model, formally considered as a triple graph $G = (G_S \leftarrow G_C \rightarrow G_T)$, a user edit on $G_S$ may consist of the deletion and/or creation of graph elements, resulting in a graph $G_S'$. In general, the "old" correspondence morphism $\sigma_G : G_C \to G_S$ does not extend to a correspondence morphism from $G_C$ to $G_S'$: The user might have deleted elements in the image of $\sigma_G$. However, there is a partial morphism $\sigma_G' : G_C \rightharpoonup G_S'$ that is defined for all elements whose image under $\sigma_G$ still exists.

Definition 8 (Partial triple graph)
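A partial triple graph $G = (G_S \leftharpoonup G_C \rightharpoonup G_T)$ consists of three graphs $G_S$, $G_C$, $G_T$ and two partial graph morphisms $\sigma_G : G_C \rightharpoonup G_S$ and $\tau_G : G_C \rightharpoonup G_T$. Given a triple graph $G = (G_S \xleftarrow{\sigma_G} G_C \xrightarrow{\tau_G} G_T)$ and a user edit of $G_S$ that results in a graph $G_S'$, the partial triple graph induced by the edit is $G_S' \overset{\sigma_G'}{\leftharpoonup} G_C \xrightarrow{\tau_G} G_T$ where $\sigma_G'$ is obtained by restricting $\sigma_G$ to those elements $x$ of $G_C$ (node or edge) for which $\sigma_G(x) \in G_S$ is still an element of $G_S'$.
According to the above definition, triple graphs are special partial triple graphs, namely those where the domain of both partial correspondence morphisms is the whole correspondence graph $G_C$.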
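The restriction of the correspondence morphism after a user edit can be sketched as follows. The Python dictionary encoding of a morphism (correspondence element mapped to source element) and all names are assumptions made for illustration.

```python
def induced_partial_morphism(sigma, edited_source):
    """Restrict sigma to correspondence elements whose image survives the edit.

    sigma maps correspondence elements to source elements (hypothetical
    encoding); edited_source is the set of source elements after the edit.
    """
    return {c: s for c, s in sigma.items() if s in edited_source}


def is_total(morphism, correspondence_elements):
    """A partial morphism is total if it is defined on every element."""
    return set(morphism) == set(correspondence_elements)


# Illustrative triple graph: two correspondence nodes, then subP is deleted.
sigma = {"c_root": "rootP", "c_sub": "subP"}
after_edit = {"rootP"}
sigma_restricted = induced_partial_morphism(sigma, after_edit)
```

After the edit, the correspondence node c_sub is still part of the correspondence graph but no longer has an image on the source side, so the induced structure is a proper partial triple graph.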
When operationalizing short-cut rules, i.e., splitting them into a source and a forward rule, we also have to deal with this kind of partiality: In contrast to the rules of a given TGG, a short-cut rule might delete an element. Hence, its forward rule might need to contain a correspondence element for which the corresponding source element is missing; it is referenced in the short-cut rule. This element is deleted by the corresponding source rule. Given a TGG GG, a repair rule for GG is the forward rule r F sc of a short-cut rule r sc where r sc has been constructed from a pair of rules of GG.
For more details (in particular, the definition of morphisms between partial triple graphs), we refer the interested reader to the literature [23,37]. In this paper, we are more interested in conveying the intuition behind these rules by presenting examples. We next recall the most important property of this operationalization, namely that, as in the monotonic case, an application of a short-cut rule corresponds to the application of its source rule, followed by an application of the forward rule if consistently matched.
applying source rule $r_{sc}^S$ with match $m_{sc}^S = (m_{sc,S}, \emptyset, \emptyset)$ and forward rule $r_{sc}^F$ at match $m_{sc}^F = (n_{sc,S}, m_{sc,C}, m_{sc,T})$.
For practical applications, repair rules should also be equipped with filter NACs. Let the repair rule r F sc be obtained from a short-cut rule r sc that has been computed from rule pair (r 1 , r 2 ), both coming from a given TGG. As the application of r F sc replaces an application of r F 1 by one of r F 2 , r F sc should be equipped with the filter NAC of r F 2 . However, just copying that filter NAC would not preserve its semantics; a more refined procedure is needed. The LHS of r F 2 is a subgraph of the one of r F sc by construction. There is a known procedure, called shift along a morphism, that "moves" an application condition from a subgraph to the supergraph preserving its semantics [19, Lemma 3.11 and Construction 3.12]. We use this construction to compute the filter NACs of repair rules. By using this known construction, the filter NACs we construct for our repair rules have the following property:
Lemma 2 ([19, Lemma 3.11 and Construction 3.12].) Let r sc be a plain short-cut rule obtained from the pair of monotonic rules (r 1 , r 2 ) where the forward rule r F 2 is equipped with a set NAC F 2 of filter NACs. Let NAC F sc be the set of NACs computed by applying the shift construction to NAC F 2 along the inclusion morphism ι : L F 2 → L F sc of the LHS of r F 2 into the LHS of r F sc (which exists by construction).
Then, an injective match m F sc for r F sc (into any partial triple graph G) satisfies the set of NACs NAC F sc if and only if the induced injective match m F sc • ι for r F 2 satisfies NAC F 2 .

Example 9
The forward rules of the short-cut rules in Fig. 6 are depicted in Fig. 8. Make-Root-Repair-Rule is derived to replace an application of Sub-FWD-Rule by one of Root-FWD-Rule. This forward rule is equipped with a filter NAC which ensures that the rule is used only to translate Packages at the top of a hierarchy. Just copying this NAC to the Package p in Make-Root-Repair-Rule would not preserve this behaviour: The rule would be applicable in situations where the Package to which sp is matched contains a Package to which p is matched. Shifting the NAC from Root-FWD-Rule to Make-Root-Repair-Rule instead, the forbidden edge between the two Packages is introduced in addition. It ensures that p can be matched to Packages at the top of a hierarchy, only.
Delete-Middle-Repair-Rule (see Figure 15) assumes two connected Packages and deletes a Folder between their corresponding Folders as well as the Doc-File contained in the deleted Folder and the correspondence node referencing it. The LHS of this rule is a proper partial triple graph as there is a correspondence node which is not mapped to any element of the source part.

Conditions for valid repair rule applications
Now, we transfer the results obtained so far to the case of repair rules. To do so, we first define valid matches for repair rules (in a restricted kind of transformation sequences).
Definition 10 (Valid match for repair rule) Let a TGG GG and a consistently-marking transformation sequence
$$G_0 \Rightarrow_{r_1^{FN}, m_1^F} G_1 \Rightarrow_{r_2^{FN}, m_2^F} \cdots \Rightarrow_{r_t^{FN}, m_t^F} G_t \qquad (5)$$
via forward rules r FN i , 1 ≤ i ≤ t, (possibly with filter NACs) of GG be given. Let there be some source edit step via a rule r S sc at a match m S sc , where r S sc is the source rule of a short-cut rule r sc derived from a rule pair (r j , r) with 1 ≤ j ≤ t and r stemming from GG, and m S sc | R j,S = n j,S , i.e., when restricted to the source part of the RHS R j of r j , match m S sc coincides with the source part of the comatch n j . Moreover, the application of this source edit shall not introduce a violation of any of the filter NACs of r FN 1 , . . . , r FN j−1 . Then, a match m F sc for the corresponding forward rule r F sc in G is valid if the following properties hold.
1. Reversing match: Given comatch (n S sc,S , ∅, ∅) of the application of the source rule r S sc , its match is m F sc = (n S sc,S , m F sc,C , m F sc,T ) and also m F sc,C and m F sc,T coincide with n j,C and n j,T when restricted to R j,C and R j,T , respectively.
2. Sequential independence:
(a) Non-disabling match: The application of r F sc does not delete elements used in the applications of r FN j+1 , . . . , r FN t nor does it create elements forbidden by one of the filter NACs of those forward rules.
(b) Context-preserving match: The presumed elements of the repair rule r F sc (which accord to the presumed elements of the short-cut rule r sc ) are matched to elements of H S which are marked as translated in G j−1,S and in G t,C and G t,T to elements which are already created in G j−1,C and G j−1,T . This means, elements stemming from the LHS L of r which have not been identified with elements from L j in the short-cut rule r sc , are matched to elements already translated/existing in G j−1 .
3. Creation-preserving match: All source elements that are newly created by short-cut rule r sc , i.e., the source elements of R S \ L S that have not been merged with an element of R j,S \ L j,S during the short-cut rule construction, are matched to elements which are yet untranslated in G t,S .
Note that together, conditions 2. (a) and (b) above constitute sequential independence between the applications of r FN j+1 , . . . , r FN t and the one of r F sc . Moreover, the additional requirement on the match to be creation preserving (compared to Theorem 2 for short-cut rules) originates from the fact that forward rules do not create but mark source elements.
The following corollary uses Theorem 3 to transfer the statement of Theorem 2 to repair rules.
Corollary 1 Let a TGG GG and a consistently marking transformation sequence as in (5), followed by an edit step exactly as in Definition 10 above be given. Then, applying r F sc at a valid match m F sc in G induces a consistently marking transformation sequence
Proof For a valid match m F sc of r F sc , by its reversing property, the conditions of Theorem 3 are met. Hence, we obtain a sequence As a consistently marking sequence of forward rules corresponds to a sequence of TGG rule applications, and the preconditions of Theorem 2 are met ("exists" is exchanged by "marked" on the source component), this sequence induces a sequence (where we do not care for the further applications of forward rules). Now, we can split r into its source and forward rule. Its source rule is sequentially independent from the other forward rule applications: r S sc does not delete anything, the rules r FN 1 , . . . , r FN j−1 match, and does not create a filter NAC violation by assumption and, as a consequence, r S does not. Hence, by the local Church-Rosser Theorem, we might equivalently switch the application of r S to the beginning of the sequence and obtain sequence (6), as desired. Moreover, by Lemma 2, the filter NAC of r F holds whenever m F sc satisfies the filter NAC of r F sc . Finally, as the start of the transformation sequence (up to index j − 1) is context preserving, and by assumption 2. (b), the match m F sc matches presumed elements of r F sc to already translated ones (in H S ) or already created ones (in G j−1,C and G j−1,T ), this sequence is context preserving. Analogously, assumption 3. ensures that it is creation-preserving: No element which is already marked as translated in G t,S is marked a second time. Hence, the whole sequence is consistently marking.

Synchronization Algorithm
In this section, we discuss our synchronization algorithm that is based on the correct application of derived repair rules. We first present the algorithm and consider its formal properties subsequently. The section closes with a short example for a synchronization based on our algorithm and a discussion of extensions and support for advanced TGG features.

The Basic Setup
We assume a TGG GG with plain, monotonic rules to be given. Its language defines consistency. This means that a triple graph G = (G S ← G C → G T ) is consistent if and only if G ∈ L (GG).
The problem. A consistent triple graph G = (G S ← G C → G T ) ∈ L (GG) is given; by Lemma 1 there exists a corresponding consistently and entirely marking sequence t of forward rule applications. After editing source graph G S we get G′ = (H S ⇠ G C → G T ). Generally, the result G′ is a partial triple graph and does not belong to L (GG). We assume that all the edits are performed by applying source rules. They may be derived from the original TGG rules or from short-cut rules. Our goal is to provide a model synchronization algorithm that, given G = (G S ← G C → G T ) ∈ L (GG) and G′ = (H S ⇠ G C → G T ) as input, computes a triple graph H = (H S ← H C → H T ) ∈ L (GG). As a side condition, we want to minimize the number of elements of G C and G T that are deleted and recreated during that synchronization.
Ingredients of our algorithm. We provide a rule-based model synchronization algorithm leveraging an incremental pattern matcher. During that algorithm, rules are applied to compute a triple graph (H S ← H C → H T ) ∈ L (GG) from the (partial) triple graph (H S ⇠ G C → G T ). We apply two different kinds of rules, namely 1. forward rules derived from the rules of the TGG GG and 2. repair rules, i.e., operationalized short-cut rules.
Forward rules serve to propagate the addition of elements. The use of these rules for model synchronization is standard. However, the use of additional repair rules and the way in which they are employed are conceptually novel. 7 The repair rules make it possible to propagate more complex user edits directly.
During the synchronization process, the rules are applied reacting to notifications by an incremental pattern matcher.
We require this pattern matcher to provide the following information:
1. The original triple graph G = (G S ← G C → G T ) is covered with consistency patterns. When considering the induced matches for forward rules, every element of G S is marked exactly once. The dependency relation between elements required by these matches is acyclic. This means that the induced transformation sequence of forward rules is consistently and entirely marking. Such a sequence always exists since G ∈ L (GG); see Lemma 1.
2. Broken consistency matches are reported. A match for a consistency pattern in G is broken in G′ if one of the elements it matches or creates has been deleted or if an element has been created that violates one of the filter NACs of that consistency pattern.
3. The incremental pattern matcher notifies about newly occurring matches for forward rules. It does so in a correct way, i.e., it only notifies about matches that lead to consistently marking transformations.
4. In addition, the incremental pattern matcher provides a precedence graph. This precedence graph contains information about the mutual dependencies of the elements in the partial triple graph. Here, an element is dependent on another one if the forward rule application marking the former matches the latter element as required. We consider the transitive closure of this relation.
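The precedence information can be pictured as a directed graph whose transitive closure answers the question whether one element depends on another. The following is a minimal sketch, assuming a hypothetical dictionary `requires` that maps each element to the elements that the forward rule application marking it matched as required context; it is not the data structure used by any particular pattern matcher.

```python
from collections import deque


def transitive_dependencies(requires):
    """Return, for every element, all elements it depends on (transitively).

    `requires` is a hypothetical mapping: element -> set of elements that the
    forward rule application marking it matched as required context.
    """
    closure = {}
    for start in requires:
        seen = set()
        queue = deque(requires[start])
        while queue:
            current = queue.popleft()
            if current in seen:
                continue
            seen.add(current)
            queue.extend(requires.get(current, ()))
        closure[start] = seen
    return closure


def is_acyclic(requires):
    """The dependency relation is acyclic iff no element depends on itself."""
    closure = transitive_dependencies(requires)
    return all(element not in deps for element, deps in closure.items())


# Illustrative dependencies: a class needs its package, a sub-package its parent.
requires = {
    "rootP": set(),
    "subP": {"rootP"},
    "c1": {"subP"},
}
closure = transitive_dependencies(requires)
```

Here `closure["c1"]` contains both `subP` and `rootP`, so deleting or re-marking `rootP` affects `c1` as well.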

Synchronization Process
Our synchronization process is depicted in Algorithm 1. It applies rules to translate elements and repair rule applications. In that, it applies a different strategy than suggested in [42,41]. There, invalid rule applications are revoked as long as there exist any. Subsequently, forward rules are applied as long as possible. By trying to apply a suitable repair rule instead of revoking an invalid rule application, we are able to avoid deletion and recreation of elements. Our synchronization algorithm is defined as follows. Note that we present an algorithm for synchronizing in forward direction (from source to target) while synchronizing backwards is performed analogously. The function synchronize is called on the current partial triple graph that is to be synchronized. In line 2, updateMatches is called on this partial triple graph. It returns the set of consistency matches currently broken, the set of consistency matches that are still intact, and the set of forward TGG rule matches.
By calling the function isFinished (line 4), termination criteria for the synchronization algorithm are checked. If the set of broken consistency matches and the set of forward TGG rule matches are both empty and all elements of the source graph are marked as translated, the synchronization algorithm terminates (line 18). Yet, if both sets are empty but there are still untranslated elements in the source graph, an exception is thrown in line 20, signaling that the (partial) triple graph is in an inconsistent state.
Subsequently, function translate is used (line 7) to propagate the creation of elements: If the set of forward TGG rule matches is non-empty (line 24), we choose one of these matches, apply the corresponding rule, and continue the synchronization process (line 27). This step is done prior to any repair. The purpose is to create the context which may be needed to make repair rules applicable. An example for such a context creation is the insertion of a new root Package which has to be translated into a root Folder before applying Connect-Root-Repair-Rule thereafter (see Fig. 5).
If the above cases do not apply, there must be at least one broken consistency match and the corresponding rule application has to be repaired (line 10): Hence, we choose one broken consistency match (line 32) for which a set of suitable repair rules is determined. A broken consistency match includes information about the rule it corresponds to (e.g., the name of the rule). Furthermore, it includes which elements are missing or which filter NACs are violated such that the corresponding application does not exist any more. We calculate the set of matches of repair rules (i.e., forward short-cut rules) that stem from short-cut rules revoking exactly the rule that corresponds to the broken consistency match. In particular, by knowing which elements of a broken rule application still exist in the current source graph, we can stick to those repair rules that preserve exactly the still existing elements.
While the calculated set of unprocessed repair rule matches is not empty (line 36), we choose one of these matches and check whether it is valid. By constructing the partial match of a repair rule, we only need to ensure that none of its presumed elements is matched in such a way that a cyclic dependency is introduced. This means that they must not be matched to elements that are dependent on elements to which the recovered elements are matched. If a match is valid, we apply the corresponding repair rule and continue the synchronization process (line 40). If no such rule or valid match is available, an exception is thrown (line 12).
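The control flow described above can be summarized in a short sketch. It is not a transcription of Algorithm 1: the recursion is replaced by a loop, the first available match is taken wherever the text says "choose one", and the `state` interface as well as the `ToyState` stand-in are hypothetical.

```python
class InconsistentStateError(Exception):
    """Raised when the partial triple graph cannot be synchronized."""


def synchronize(state):
    """Loop-based variant of the control flow described in the text.

    `state` is a hypothetical object offering:
      update_matches()        -> (broken, intact, forward) lists of matches
      all_source_translated() -> bool
      apply_forward(match), apply_repair(match)
      repair_candidates(b)    -> repair rule matches for broken match b
      is_valid(match)         -> validity check for a repair rule match
    """
    while True:
        broken, _intact, forward = state.update_matches()
        # isFinished: nothing left to translate or to repair.
        if not broken and not forward:
            if state.all_source_translated():
                return state
            raise InconsistentStateError("untranslated source elements remain")
        # translate: forward rules first, to create context for repairs.
        if forward:
            state.apply_forward(forward[0])
            continue
        # repair: try the repair rule matches of one broken consistency match.
        for candidate in state.repair_candidates(broken[0]):
            if state.is_valid(candidate):
                state.apply_repair(candidate)
                break
        else:
            raise InconsistentStateError("no valid repair rule match")


class ToyState:
    """Stand-in for triple graph plus pattern matcher; purely illustrative."""

    def __init__(self, untranslated, broken):
        self.untranslated = list(untranslated)
        self.broken = list(broken)
        self.log = []

    def update_matches(self):
        return list(self.broken), [], list(self.untranslated)

    def all_source_translated(self):
        return not self.untranslated

    def apply_forward(self, match):
        self.untranslated.remove(match)
        self.log.append(("forward", match))

    def repair_candidates(self, broken_match):
        return [broken_match + "-Repair"]

    def is_valid(self, match):
        return True

    def apply_repair(self, match):
        self.broken.pop(0)
        self.log.append(("repair", match))


result = synchronize(ToyState(["c2", "nRootP"], ["rootP", "subP"]))
```

In the toy run, both forward matches are processed before any repair takes place, which reflects the design choice of creating the required context first.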

Formal properties of the synchronization process.
We discuss the termination, correctness, and completeness of our synchronization algorithm.
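Termination. Our algorithm terminates as long as every forward rule translates at least one element (which is a quite common condition; compare [30, Lemma 6.7]).
Theorem 4 (Termination of algorithm) Let a TGG GG with plain, monotonic rules be given. If every forward rule derived from GG has at least one source marking element, our algorithm terminates for any finite input G′ = (H S ⇠ G C → G T ).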
Proof The algorithm terminates (by either throwing an exception or returning a result) if at some point both the set of broken consistency matches and the set of matches for forward rules are empty; compare the function isFinished starting in line 15. The algorithm is called recursively, always applying a forward rule if a match is available. As every forward rule marks at least one element as translated and forward rules are only matched in such a way that source marking elements are matched to yet untranslated ones, the application of forward rules (lines 24 et seq.), i.e., the recursive call of function translate, halts after finitely many steps. Moreover, an application of a forward rule never introduces a new broken consistency match: As it neither creates nor deletes elements in the source graph, it cannot delete elements matched by a consistency pattern nor create elements forbidden by one. This means that, as soon as the set of broken consistency matches is empty, the whole synchronization algorithm will terminate. We show that at some point this set of broken consistency matches will be empty or an exception is thrown.
Whenever the algorithm is called with an empty set of matches for forward rules, broken consistency matches are considered by applying a repair rule, i.e., by calling the function repair. New matches for forward rules can result from this; as discussed above, newly appearing matches for forward rules are unproblematic. However, an application of a repair rule does not introduce a new violation of any consistency match: As it does not create source elements, it cannot introduce violations of filter NACs. And by the condition on valid matches to be non-disabling (condition 2. (a) in Definition 10), no elements needed by other consistency matches are deleted. Hence, by application of a repair rule, the number of broken consistency matches is reduced by one and the algorithm terminates as soon as all broken consistency matches are repaired. If there is a broken consistency match that cannot be repaired, either because no suitable repair rule or because no valid match is available, an exception is thrown and the algorithm stops.
Correctness. Upon termination without exception, our algorithm is correct.
Theorem 5 (Correctness of algorithm) Let a TGG GG with plain, monotonic rules, a triple graph G = (G S ← G C → G T ) ∈ L (GG), and a partial triple graph G′ = (G′ S ⇠ G C → G T ) that arises by a user edit step on the source graph be given. If our synchronization algorithm terminates without exception and yields H = (H S ← H C → H T ) as output, then H S = G′ S and H ∈ L (GG).
Proof We see immediately that H S = G′ S since none of the applied rules modifies the source graph. If the synchronization process terminates without exception, all elements are translated, no matches for forward rules are found, and no consistency match is broken any more. This means that the collected matches of the forward rules form an entirely marking transformation sequence. By Lemma 1, we have to show that this sequence is also consistently marking. Then, the matches of the forward rules that correspond to the matches of the consistency patterns that the incremental pattern matcher has collected encode a transformation sequence that allows to translate the triple graph (H S ← ∅ → ∅) to a triple graph (H S ← H C → H T ) ∈ L (GG). We assume that the incremental pattern matcher recognizes all broken consistency matches and reports correct matches for forward rules only. This means, throughout the application of forward rules, the set of all valid consistency matches remains consistently marking. We have to show that this is also the case for repair rule applications. If it is, upon termination without exception, there is an entirely and consistently marking sequence of forward rules which corresponds to a triple graph from L (GG) by Lemma 1.
Whenever we apply a repair rule we are (at least locally) in the situation of Corollary 1: There is a (maybe empty) sequence of consistently marking forward rule applications and a suitable broken consistency pattern indicates that a user edit step applying the source rule r S sc of a short-cut rule r sc has taken place. Applying the repair rule r F sc at a valid match amounts to replacing the application of rule r F j , whose consistency pattern was broken, by rule r F in a consistently marking way.
Completeness. We only informally discuss completeness. We understand completeness as follows: for every input G′ = (H S ⇠ G C → G T ) with H S ∈ L S (GG), we obtain a result H = (H S ← H C → H T ) ∈ L (GG). In general, the above proposed algorithm is not complete. We randomly apply forward rules at available matches (without using backtracking) but the choice and order of such applications can affect the result if the final sequence of forward rule applications leads to a dead-end or translates the given source graph. However, the algorithm is complete whenever the set of forward rules is of such a form that the order of their application does not make a difference (somewhat more formally: they meet some kind of confluence) and the user edit is of the form discussed in Sect. 6.1. Analogous restrictions on forward rules hold for other synchronization processes that have been formally examined for completeness [30,41]. Adding filter NACs to the forward rules of a TGG is a technique that can result in such a set of confluent forward rules even if the original set of forward rules is not. Moreover, there are static methods to test TGGs for such a behaviour [6,30]; they check for sufficient but not for necessary criteria. If it is known that the set of forward rules of a given TGG guarantees completeness and the edit is of a suitable kind, a thrown exception during our synchronization process implies that H S ∉ L S (GG).

A synchronization example
We illustrate our synchronization algorithm with the example shown in Fig. 16. For simplicity, we neglect the content attribute and concentrate on the structural behaviour. As a starting point, we assume that a user edits the source graph of the triple graph depicted in Fig. 16 (a) (in the following, we will refer to the triple graphs occurring throughout the algorithm just by their numbers). She adds a new root package above rootP, removes the link between Packages rootP and subP, and creates a further class c2. All these changes are specified by either a source rule of the TGG or the source rule of a derived short-cut rule. The resulting triple graph is depicted in (b). The elements in front of the grey background are considered to be inconsistent, due to a broken consistency match. Furthermore, c2 and nRootP are not yet translated. In the first two passes of the algorithm, the two available matches for forward rules are applied (in random order): Leaf-FWD-Rule translates the newly added Class c2 and Root-FWD-Rule translates the Package nRootP; this results in the triple graph (c). Note that the last rule application creates a match for the repair rule Connect-Root-Repair-Rule. This is the reason why we start our synchronization process with applications of forward rules. The incremental pattern matcher notifies about two broken consistency matches, which are dealt with in random order. rootP is no longer a root package (which is detected by a violation of the according filter NAC in the consistency pattern) and subP is now a root package (which is detected by the missing incoming edge). Both violations are captured by repair rules, namely Connect-Root-Repair-Rule and Make-Root-Repair-Rule, whose applications lead to (d) and (e). The algorithm terminates with a triple graph that belongs to the language of the TGG.

Prospect: Support of further kinds of editing and advanced TGG features
We shortly describe the support of further kinds of editing and more advanced features of TGGs by our approach to synchronization, namely attributed TGGs, rules with NACs, and support for additional attribute constraints.
Further kinds of editing. In our implementation (see Sect. 7), we do not only support the addition of elements and propagation of edits that correspond to source rules of derived edit rules. Actually, we do not make any assumptions about the kind of editing. This is achieved by incorporating the application of repair rules into the algorithm suggested by Leblebici et al. [42,41], which has also been proved to be correct and to terminate. The implemented algorithm first tries to apply a forward or repair rule. If there is none available with a valid match, the algorithm falls back to revoking an invalid rule application. This means that all elements that have been created by this rule application are deleted (and adjacent edges of deleted nodes are implicitly deleted as well). In line with that revoking of invalid rule applications, it also allows for implicit deletion of adjacent edges in the application of repair rules. In that way, the application of a repair rule might trigger new appearances of broken consistency matches. We are convinced that correctness is not affected by that more general approach: Inspecting the proofs of Corollary 1 and Theorem 5, the key to correctness is that the sequences of currently valid consistency matches remain consistently marking. That is achieved via the conditions on matches for repair rules to be reversing, context-preserving, and creation-preserving. Dropping the condition to be non-disabling (by implicitly deleting adjacent edges) therefore does not affect correctness. However, proving termination in that more general context is future work.
Advanced features. The attribution of graphs can be formalized by representing data values as special nodes and the attribution of nodes and edges as special edges connecting graph elements with these data nodes [17]. As the rules of a TGG are monotonic, they only set attribute values but never delete or change them. (The deletion or change of an attribute value would include the deletion of the attribution edge pointing to it.) The formal construction of short-cut rules is based purely on category-theoretic concepts, which can be directly applied to rules on attributed triple graphs as well. The properties proven for short-cut rules in [22] are valid also in that case. 8 Hence, we can freely apply the construction of short-cut rules and derivation of repair rules to attributed TGGs. In fact, our implementation already supports attribution. For the propagation of attribute changes (made by a user), however, we rely on the inherent support eMoflon offers, which is discussed in Sect. 7. Deriving repair rules to propagate such changes is possible in principle but remains future work.
In practical applications, TGGs are often not only attributed but also equipped with attribute constraints. These enable the user to, for example, link the values of attributes of correlated nodes. eMoflon comes with facilities to detect violations of such constraints and offers support to repair such violations. In our implementation, we rely on these features of eMoflon to support attribute constraints but do not contribute additional support in our newly proposed synchronization algorithm.
To summarize, while fully formalized for the case of plain TGG rules without attribution, our implementation already supports the synchronization of attributed TGGs with additional attribute constraints. As these additional features do not affect our construction of short-cut and repair rules, we do not consider them (yet) to improve the propagation of attribute changes (that may lead to violations of attribute constraints). Instead, we rely on the existing theory and facilities of eMoflon as introduced by Anjorin et al. [7]. In contrast, while computing short-cut and repair rules of rules with NACs is straightforward, adapting our synchronization algorithm to that case is future work and no tool support is available yet.

Implementation
Our implementation 9 of a model synchronizer using (short-cut) repair rules is built on top of the existing EMF-based, general-purpose graph and model transformation tool eMoflon [43,57,58]. eMoflon offers support for rule-based unidirectional and bidirectional graph transformations where the latter uses TGGs. The model synchronizer implemented in eMoflon extends Algorithm 1 slightly. It allows any kind of user edit on the source part of a triple graph. If there are no forward or repair rules to fix a broken match, broken rule applications can be revoked. Revoking of rule applications has been the standard way of fixing broken matches. Hence, the implemented model synchronizer is a true extension of the previous synchronizer in eMoflon supporting the repair of broken applications.
In the following, we present the architecture behind our optimized model synchronizer first. Thereafter, we describe how the automatic calculation of short-cut and repair rules is implemented.
8 [...] so-called effective pushouts. This is known to be the case for attributed (triple) graphs; compare, e.g., [18, Remark 5.57].
9 Both the implementation and the evaluation can be accessed via https://github.com/Echtzeitsysteme/STTT-SC-Eval.

Tool architecture
Figure 17 depicts a UML component diagram to show the main components of eMoflon's bidirectional transformation engine. The architecture has two main components: TGG Core contains the core components of eMoflon and Repair Framework adds (short-cut) repair rules to eMoflon's functionality. The TGG engine manages the synchronization process and alters source, target, and correspondence model in order to restore consistency. For this purpose, it applies forward/backward operationalized TGG rules to translate elements or revokes broken rule applications.
Finding matches in an incremental way is an important requirement for efficient model synchronization since minor model changes should be detectable without re-evaluating the whole model. For this reason, eMoflon relies on incremental pattern matching to detect the appearance of new matches as well as the disappearance of formerly detected ones. It uses different incremental pattern matchers such as Democles [55] and HiPE [1] and allows switching freely between them to optimize the performance for each transformation scenario. Furthermore, eMoflon uses various integer linear programming (ILP) solvers such as Gurobi [28] and CPLEX [34], e.g., in order to find correspondence links (mappings) between source and target models, which is referred to as consistency check [46].
We have extended this basic setup by introducing the Repair Framework, which consists of the Repair Strategy and the Shortcut Rule Creator. The Repair Strategy is attached to the TGG Engine from which it is called with a set of broken rule matches. It attempts to repair the corresponding rule applications by using repair rules created by the Shortcut Rule Creator, which uses the ILP interface provided by the TGG Core in order to find overlaps between TGG rules and finally, to create short-cut repair rules. For invoking the repair rules, however, we have to find matches of repair rules. This is done by a Batch (local-search) Pattern Matcher which, in contrast to the incremental pattern matcher, does not perform any book-keeping. As a repair of a rule application is always done locally, the checking of matches throughout the whole model is considered to be too expensive and thus, a Batch Pattern Matcher can perform this task more efficiently.

ILP-based short-cut rule creation
In order to create an overlap between two rules, a morphism between the graphs of both rules has to be found: Each element may only be mapped once; a context element may only be mapped to another context element. Created elements are mapped to each other, respectively. Furthermore, a node can only be mapped to a node of the same type as we do not incorporate inheritance between types yet. Edges are allowed to be mapped to each other only if their corresponding source and target elements are also mapped to each other, respectively. We use integer linear programming (ILP) to encode the search space of all possible mappings and search for a maximal mapping. Each possible mapping $m$ is considered to be a variable of our ILP problem such that calculating $\max \sum_{m \in M} m$ yields the maximal overlap, with $M$ being the set of all mappings and $m \in \{0, 1\}$. To ensure that each element $e$ is mapped only once, we define a constraint to exclude non-used mappings: $\sum_{m \in A_e} m \leq 1$ with $A_e$ being the set of all alternative mappings for element $e$. To ensure that edges are mapped only if their adjacent nodes are mapped as well, we define the following constraint: $m_e \Rightarrow m_v$, which translates to $m_e \leq m_v$ with $m_e$ being the edge mapping and $m_v$ being one of the mappings of node src(e) or trg(e). Maximizing the number of activated variables yields the common kernel of both input rules, i.e., a maximal overlap between them. If the overlap between the created elements of both rules is empty, we drop this overlap as the resulting short-cut rule would not preserve any elements. Given a common kernel of two rules, we glue them along this kernel and obtain a short-cut rule. For all elements of the resulting short-cut rule, which are not in the common kernel, we do the following: (1) Preserved elements remain preserved in the short-cut rule. (2) Created elements of the first rule become deleted ones as the first rule is inverted. (3) Created elements of the second rule remain created ones.
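To make the encoding more tangible, the following sketch states the same 0-1 program for a tiny, made-up pair of rules. It is an illustration under assumptions: the candidate mappings are hypothetical, and the optimum is found by exhaustive enumeration instead of an ILP solver, which is only feasible for such small instances.

```python
from itertools import product

# Hypothetical candidate mappings between elements of two rules.
# Each candidate pairs one element of rule 1 with one element of rule 2.
node_candidates = [("p1", "q1"), ("p1", "q2"), ("c1", "d1")]
# Edge candidates additionally name the node mappings they rely on
# (the mappings of the edge's source and target nodes).
edge_candidates = {("e1", "f1"): [("p1", "q1"), ("c1", "d1")]}

variables = node_candidates + list(edge_candidates)


def feasible(assignment):
    """Check both constraint families for a 0/1 assignment."""
    chosen = [m for m in variables if assignment[m] == 1]
    # Each element (of either rule) takes part in at most one mapping.
    for side in (0, 1):
        elements = [m[side] for m in chosen]
        if len(elements) != len(set(elements)):
            return False
    # m_e <= m_v: an edge mapping needs both adjacent node mappings.
    for edge, needed in edge_candidates.items():
        if any(assignment[edge] > assignment[node] for node in needed):
            return False
    return True


def maximal_overlap():
    """Exhaustively maximise the number of activated variables."""
    best = None
    for bits in product((0, 1), repeat=len(variables)):
        assignment = dict(zip(variables, bits))
        if feasible(assignment) and (best is None or sum(bits) > sum(best.values())):
            best = assignment
    return sorted(m for m, value in best.items() if value == 1)


overlap = maximal_overlap()
```

For these candidates, the edge mapping forces `p1` to be mapped to `q1` rather than `q2`, so the maximal overlap activates three variables.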
We calculate two kinds of overlap for each pair of rules and hence, two short-cut rules: a maximal and a minimal overlap. The maximal overlap is calculated by allowing mappings between all created and context elements, respectively. On the other hand, the minimal overlap is created by allowing mappings between created elements only. Considering the corresponding ILP problem, this means that all other mapping candidates are dropped.
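The difference between the two overlaps lies only in which mapping candidates are admitted. A minimal sketch, assuming rules are given as hypothetical lists of `(name, type, kind)` triples for nodes; edges and the actual rule representation of the implementation are left out.

```python
from itertools import product


def mapping_candidates(rule1, rule2, minimal):
    """List admissible node mappings between two rules.

    Each rule is a hypothetical list of (name, type, kind) triples with
    kind in {"context", "created"}. Nodes map only to nodes of the same
    type and the same kind; for the minimal overlap, context-to-context
    candidates are dropped.
    """
    candidates = []
    for (n1, t1, k1), (n2, t2, k2) in product(rule1, rule2):
        if t1 != t2 or k1 != k2:
            continue
        if minimal and k1 == "context":
            continue
        candidates.append((n1, n2))
    return candidates


# Two made-up rules, each with one context and one created Package.
rule1 = [("p", "Package", "context"), ("sp", "Package", "created")]
rule2 = [("q", "Package", "context"), ("sq", "Package", "created")]

maximal = mapping_candidates(rule1, rule2, minimal=False)
minimal = mapping_candidates(rule1, rule2, minimal=True)
```

The resulting candidate lists would then serve as the variables of the respective 0-1 program.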
Finally, the derived short-cut rules are operationalized to obtain the repair rules employed in our synchronization algorithm.

Attribute Constraints
Although attribute constraints have not been incorporated formally in our approach, eMoflon is able to define and solve those within the former legacy translation and synchronization process. As can be seen in Fig. 20, many rules have an equality constraint defined between the name attributes of created elements on both source and target parts. For TGG rules, this means that the attribute values may be chosen arbitrarily since both nodes would be created from scratch. In forward rules, source elements are already present, which means that an attribute constraint can be interpreted as propagating or copying the already present value to a newly created element. We reuse this functionality for our new synchronization process in the following way: After applying a repair rule, we ensure that the constraints of the replacing rule are fulfilled. The definition of attribute constraints and their treatment is due to Anjorin et al. [7]. 10
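For equality constraints, ensuring that the constraints of the replacing rule hold after a repair amounts to copying values in forward direction. The following sketch assumes nodes are plain attribute dictionaries and constraints are listed as tuples; both representations and the node names are hypothetical and far simpler than the constraint handling in eMoflon.

```python
def enforce_equalities(nodes, constraints):
    """Copy source attribute values to targets for equality constraints.

    `nodes` maps hypothetical node names to attribute dictionaries;
    `constraints` lists (source_node, source_attr, target_node, target_attr).
    Returns the constraints that had to be fixed.
    """
    fixed = []
    for src, src_attr, trg, trg_attr in constraints:
        value = nodes[src][src_attr]
        if nodes[trg].get(trg_attr) != value:
            nodes[trg][trg_attr] = value
            fixed.append((src, src_attr, trg, trg_attr))
    return fixed


# A Package and its corresponding Folder whose names have diverged.
nodes = {"subP": {"name": "util"}, "subF": {"name": "old"}}
constraints = [("subP", "name", "subF", "name")]
fixed = enforce_equalities(nodes, constraints)
```

A second call on the same data finds nothing left to fix, since the target already carries the source value.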

Evaluation
We evaluate our approach with respect to two aspects using the running example in an extended form. First, we investigate the performance of our approach w.r.t. information loss and execution time. A set of real and synthesized models is given which we use to apply four different kinds of model changes. Secondly, we evaluate the quality of our short-cut rule generation strategy by comparing generated short-cut rules with well-known code refactorings.
Our experimental setup consists of 24 TGG rules (shown in Sect. A) that specify consistency between Java AST and custom documentation models. In addition, there are 38 short-cut rules derived from the set of TGG rules. A small modified excerpt of this rule set was given in Sect. 2. For this evaluation, however, we define consistency not only between Package and Folder hierarchies but also between type definitions, e.g., Classes and Interfaces, and Fields and Methods with their corresponding documentation entries.

Performance Evaluation
To get realistic models, we extracted five models from Java projects hosted on GitHub using the reverse engineering tool MoDisco [12] and translated them into our own documentation structure. In addition, we generated five synthetic models consisting of n-level Package hierarchies with each non-leaf Package containing five sub-Packages and each leaf Package containing five Classes. While the realistic models shall show that our approach scales to real-world cases, the synthetic models are chosen to show scalability in a more controlled way by increasing hierarchies gradually.
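How quickly such synthetic hierarchies grow can be estimated with a few lines of code. The sketch assumes a single root Package that counts as level 1 and leaf Packages on level n; this way of counting levels is an assumption, so the numbers need not coincide with the element counts reported for the evaluation.

```python
def synthetic_model_size(levels, fanout=5):
    """Count Packages and Classes of a synthetic hierarchy.

    Assumes a single root Package at level 1, `fanout` sub-Packages per
    non-leaf Package, and `fanout` Classes per leaf Package (level `levels`).
    """
    if levels < 1:
        raise ValueError("levels must be at least 1")
    packages = sum(fanout ** depth for depth in range(levels))
    leaf_packages = fanout ** (levels - 1)
    classes = fanout * leaf_packages
    return packages, classes


sizes = [synthetic_model_size(n) for n in range(1, 4)]
```

Under these assumptions, each additional level multiplies the number of Classes by five, which is what makes the synthetic models a controlled scaling series.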
To evaluate our synchronization process, we performed several model changes. We refactored each of the models in four different scenarios; two example refactorings are the moving of a Class from one Package to another or the complete relocation of a Package. Then we used eMoflon to synchronize these changes in order to restore consistency to the documentation model using two synchronization processes, namely with and without repair rules. The legacy synchronization process of eMoflon is presented in [42,41]; the new synchronization process applying additional repair rules takes place according to the algorithm presented in Sect. 6 with the extensions mentioned in Sect. 6.5.
These synchronization steps are subject to our evaluation and we pose the following research questions: (RQ1) For different kinds of model changes, how many elements can be preserved that would be deleted and recreated otherwise? (RQ2) How does our new synchronization process affect the runtime performance? (RQ3) Are there specific scenarios in which our new synchronization process performs especially well or poorly?
In the following, we evaluate our new synchronization process by repair rules against the legacy synchronization process in eMoflon. While the legacy one revokes forward rule applications and re-propagates the source model using forward rules, our new one prefers to apply short-cut repair rules as far as possible and falls back to revoking and repropagation if there is no possible repair rule application.
To evaluate the performance of the legacy and the new model synchronization processes, we consider the following synchronization scenarios: Altering a root Package by creating a new Package as root would imply that many rule applications have to be reverted to synchronize the changes correctly with the legacy synchronization process (Scenario 1). In contrast, our new approach might perform poorly when a model change does not inflict a large cascade of invalid rule applications. Hence, we move Classes between Packages (Scenario 3) and Methods between Classes (Scenario 4) to measure whether the effort of applying repair rules incurs a performance loss when both the new and the old algorithm do not have to repair many broken rule applications. Note that Scenario 4 extends our evaluation presented in [23] as it provides a more fine-grained scenario. Finally, we simulate a scenario which is somewhat between the first three by relocating leaf Packages (Scenario 2) which, using the legacy model synchronization, would lead to a re-translation of all underlying elements.
Tables 1 and 2 depict the measured time in seconds (Sec) and the number of re-/created elements (Elts) in each scenario (1)-(4). The first table additionally shows measurements for the initial translation (Trans.) of the Java AST model into the documentation structure. For each scenario, Table 1 shows the numbers of synchronization steps using the legacy synchronizer without repair rules while Table 2 reflects the numbers of our new synchronizer with repair rules.
W.r.t. our research questions stated above, we interpret these tables as follows: The Elts columns of Table 2 show clearly that, in our scenarios, using repair rules preserves all those elements that are otherwise deleted and recreated by the legacy algorithm, as shown in Table 1 (RQ1). The runtime shows a significant performance gain for Scenario 1 including a worst-case model change in which the legacy algorithm has to re-translate all elements (RQ2). Repair rules do not introduce an overhead compared to the legacy algorithm as can be seen for the synthetic time measurements in Scenario 4 where only one rule application has to be repaired or reapplied (RQ2). Our new approach excels when the cascade of invalidated rule applications is long. Even if this is not the case, it does not introduce any measurable overhead compared to the legacy algorithm as shown in Scenarios 2, 3, and 4 (RQ3).
Threats to validity. Our evaluation is based on five real world and five synthetic models. Of course, there exists a wide range of Java projects that differ significantly from each other w.r.t. their size, purpose, and developer style. Thus, the results may not be transferable to other projects. Nonetheless, we argue that the four larger models extracted from GitHub projects are representative since they are deduced from established tools of the Eclipse ecosystem. The synthetic models are also representative as they show the scalability of our approach in a more controlled environment with an increasing scaling factor. Together, realistic and synthetic models show that our approach not only increases the performance of eMoflon's synchronization process but also reduces the number of re-created elements. Since each re-created element may contain information that would be lost during the process, we preserve this information and increase the overall quality of eMoflon's synchronization results. In this evaluation, we selected four edit operations that are representative w.r.t. their dependency on other edit operations. They may not be representative w.r.t. other aspects such as size or kind of change. We consider those aspects to be of minor importance in this context as dependency is the cause for deleting and recreating elements in the legacy synchronization process. Finally, we limited our evaluation to one TGG rule set only as we experienced similar results for a broader range of TGGs from the eMoflon test zoo 11.

Refactorings
As explained in Sect. 7, we currently employ two different strategies to overlap two rules and to create a short-cut rule. We pose the following research question: (RQ4) Are the generated short-cut rules applicable to realistic scenarios? Are further short-cut rules necessary? Since our example addresses code changes that are incorporated by the Java AST model primarily, we relate our approach to available code refactorings. In the following, we refer to the book on code refactorings written by Martin Fowler [21] which presents 66 refactorings.
Our example TGG, depicted in Fig. 20, defines consistency on a structural level solely, without incorporating behaviour, i.e., the bodies of methods and constructors. Hence, we selected those refactorings that describe changes on Packages, Classes and Interfaces, MethodDeclarations and Parameters, and Fields. The result is a set of 16 refactorings for which we evaluated whether short-cut rules help to directly propagate the corresponding change of the AST model or whether deletion and recreation has to take place. Fig. 18 lists these refactorings together with information on the TGG rules and/or short-cut rules that are applicable in these scenarios. For some of the refactorings, e.g., Extract Class and Push-Down Field, we identified situations where short-cut rules alone are not sufficient to propagate the changes. In these cases, new elements may be created which can be propagated using operationalized TGG rules. The deletion of elements can be propagated by revoking the corresponding prior propagation step. However, many refactorings benefit from using short-cut rules, for example, those that move methods and fields. If recreation of documentation on the target part is necessary, it can lead to information loss as the Java AST model may not contain all the necessary information.
Example: Push-Up Field moves and merges a similar field from various subclasses into a common superclass. If one of the subclass fields is moved to the superclass, we can propagate this change using Move-Field-Repair-Rule, which is depicted in Fig. 19.
In summary, we are able to solve all 16 refactorings using a combination of (inverse) TGG rules and our generated short-cut rules (RQ4).
Threats to validity. Note that short-cut rules are especially useful when elements are moved instead of deleting and recreating them in some other location. Those changes are hard to detect and are not covered here. Refactorings such as Push-Up Method, which moves a method that occurs in several subclasses to their common superclass, can be done in two different ways. First, one of the methods is moved to the superclass while the methods in the other subclasses are deleted. This employs short-cut rules for the moved method followed by revocation steps for the deleted methods to delete the corresponding documentation elements. Second, all methods may be deleted and a new similar method is created in the superclass. In that case, there is no short-cut rule that helps to preserve information and all propagated documentation elements for the method will be blank. Hence, our approach depends on the kind of change. In particular, it helps when user edits also try to preserve information instead of recreating it.
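The two ways of performing Push-Up Method can be contrasted in a small sketch. It is a toy model, not the TGG of our example: documentation entries are reduced to a dictionary keyed by (class, method), the three `propagate_*` functions are hypothetical stand-ins for a short-cut repair rule application, a revocation step, and a forward rule application, and the class and method names are made up.

```python
from typing import Dict, Tuple

# Documentation entries keyed by (class, method). The text exists only on
# the documentation side and cannot be rebuilt from the Java AST model.
Docs = Dict[Tuple[str, str], str]


def propagate_move(docs: Docs, method: str, src: str, dst: str) -> Docs:
    """Stand-in for a short-cut repair rule: re-attach the entry, keep its text."""
    updated = dict(docs)
    updated[(dst, method)] = updated.pop((src, method))
    return updated


def propagate_delete(docs: Docs, method: str, cls: str) -> Docs:
    """Stand-in for a revocation step: remove the entry of a deleted method."""
    updated = dict(docs)
    updated.pop((cls, method))
    return updated


def propagate_create(docs: Docs, method: str, cls: str) -> Docs:
    """Stand-in for a forward rule: create a blank entry for a new method."""
    updated = dict(docs)
    updated[(cls, method)] = ""
    return updated


docs: Docs = {
    ("Circle", "area"): "Returns the area.",
    ("Square", "area"): "Area of the square.",
}

# First way: move one method to the superclass, delete the other one.
moved = propagate_delete(
    propagate_move(docs, "area", "Circle", "Shape"), "area", "Square")

# Second way: delete both methods, then create a new one in the superclass.
recreated = propagate_create(
    propagate_delete(propagate_delete(docs, "area", "Circle"), "area", "Square"),
    "area", "Shape")
```

Both edit sequences end with a single entry for the method in the superclass, but only the first one retains documentation text; in the second, the entry is blank.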
In addition, we have not incorporated behaviour in our example; such an extension of our TGG may be considered in future work. However, we can argue that most of those refactorings can be reduced to the movement of elements, the deletion of superfluous elements and the creation of new elements. These changes are manageable in general using a sequence of short-cut rule and (inverse) operationalized TGG rule applications.
Finally, we evaluated these cases by hand based on the generated short-cut rules from our implementation. Test cases implementing the identified refactorings and combinations of them will be accessible via eMoflon's test zoo.

Related Work
In this section, we relate our new model synchronization approach to already existing incremental model synchronization approaches. First, we discuss other TGG-based approaches in detail before relating to other bidirectional transformation (bx) approaches; these are considered more roughly. Finally, we mention some unidirectional approaches that are closely related to incremental model transformation and model repair. Work that is related to our use of partial triple graphs but not to model synchronization is considered in [37].
TGG-based approaches to incremental model synchronization. Synchronization approaches are supposed to comply with the least-change property, which means that no unnecessary deletions and thus information loss should take place while restoring consistency. An overview of TGG-based least-change synchronization has been given by Stojkovic et al. [52]. The first part of our related work is based on that presentation. Several approaches to model synchronization based on TGGs suffer from the fact that the revocation of a rule application may trigger the revocation of all dependent rule applications as well [26,40,42,41]. Such cascades of deletions shall be avoided to decrease runtime and unnecessary information loss.
Leveraging an incremental pattern matcher for TGG-based model synchronization was first suggested in [42,41]. Proofs of termination, correctness, and completeness are given. Moreover, the approach is implemented. In fact, this is the legacy synchronization we evaluated against in Section 8. As already mentioned, that approach revokes invalid consistency matches as long as there are any and subsequently applies forward rules to translate yet untranslated elements. So, that approach is a typical example where a lot of unnecessary deletions may take place.
Hermann et al. [30] proposed a synchronization algorithm where, after an edit on the source part, first those correspondence elements are deleted that do not refer to an element in the source graph any longer. Thereafter, they parse the remaining triple graph to find the maximal, still valid sub-model. This model is used as a starting point to propagate the remaining changes from source to correspondence and target graphs using forward rules. The approach is completely formalized and proven to be correct, also for attributed TGGs; it can be applied to TGGs with deterministic sets of operationalized rules (deterministic in the sense that there are no competing rules for any translated element). That approach avoids some unnecessary deletions but there are some that still can occur. In fact, the amount of unnecessary deletion taking place in that approach is dependent on the given TGG rules; a concrete example for that is given in [52]. While that approach is definitely a valuable contribution towards least-change synchronization, repeated parsing for maximally consistent submodels is highly inefficient and might not scale to large models. At least part of that approach is implemented as HenshinTGG [20] using AGG [53] to perform necessary dependency checks on derived rules. As that approach focusses on correctness, completeness, and invertibility, the amount of achieved incrementality as well as principles of least change are not discussed in [30].
In [24], Giese and Hildebrandt propose rules that save nodes instead of deleting and re-creating them. In particular, they present a rule that directly propagates the movement of elements, i.e., the redirection of edges between existing elements. Moreover, they suggest attempting to re-use elements before deleting them. But they neither present a general construction for their rules nor formalize the re-use that takes place. Consequently, no proof of correctness is given. Instead, it is left as future work in [25]. The additional propagation rules that are given as examples in [24] can be automatically derived as repair rules using our approach. In [10], Blouin et al. also add specifically designed repair rules to the rule set of their case study for avoiding information loss. Those example rules can be realized as repair rules in our approach as well.
In a similar vein, Greenyer et al. [27] propose to delete elements not directly but to mark them for deletion and to allow for their re-use in rule applications during synchronization. Only elements that cannot be re-used are deleted at the very end of synchronization. But that approach comes without any formalization and proof of correctness as well.
In contrast, the idea of re-using elements in model synchronizations has been rigorously formalized by Orejas and Pino [50]. They introduced forward translation rules with reuse and proposed a synchronization algorithm based on those rules. That algorithm is actually proven to be correct; moreover, it is incremental (in a technical sense). The practical effects of applying a repair rule in our approach and in their approach are very similar. While our repair rules allow for reuse and perform necessary deletions on the correspondence and target parts directly, their forward translation rules allow for a reuse where necessary deletions are performed at the end of a synchronization in a separate step. They need some additional technical infrastructure to determine the exact amount of necessary deletion. To the best of our knowledge, their approach has not been implemented yet.
In a guideline on how to develop a TGG, Anjorin et al. [5] explain how certain kinds of rules in a TGG avoid the loss of information better than others. There is empirical evidence that, following these guidelines, synchronization can be considerably accelerated compared to a batch mode as long as there is no need for additional offline recognition of model differences [45]. Transforming a given TGG into that form, however, may change the defined language and thus is not always applicable. For example, the grammar of our running example allows generating hierarchies of Packages that constitute a set of disconnected trees. For meeting the suggestions in [5], a naive change of this grammar may change the language such that arbitrary graphs can be generated. That effect can be avoided by, e.g., designing suitable NACs for the rules and proving the equality of the generated model languages. That effort is not needed when following our approach.
In summary, it is well-known in the literature that there are a lot of situations where the derived forward rules of a TGG (and the revocation of their applications) are not suitable to efficiently propagate changes from source to target models. Several formal and informal approaches have been suggested to avoid this problem, at least partly. Table 3 provides an overview of all the approaches described above. It indicates the degree of information loss and presents whether the approach is automated, whether correctness of the proposed synchronization algorithm is proven, whether it has been (prototypically) implemented, and whether any performance gain could be shown for it. Our approach is based on the automated derivation of repair rules; it is able to comply with all the above categories. The correctness has been shown for model synchronization with repair rules. As our implemented synchronization process can also revoke forward rules, the correctness proof has to be slightly extended to cover that case as well, which seems to be straightforward (see discussion in Sect. 6.5). Furthermore, support for some additional features of TGGs like NACs and attribution is future work (NACs) or not rigorously formalized (attribution).
Comparison to other bx approaches. Anjorin et al. [4] compared three state-of-the-art bx tools, namely eMoflon [43] (rule-based), mediniQVT [2] (constraint-based), and BiGUL [36] (bx programming language) w.r.t. model synchronization. They point out that synchronization with eMoflon is faster than with both other tools as the runtimes of those tools all correlate with the overall model size while the runtime of eMoflon correlates with the size of the changes done by edit operations. Furthermore, eMoflon is the only tool that was able to solve all but one synchronization scenario while mediniQVT failed in four and BiGUL in two scenarios. One scenario was not solved because the solution with eMoflon deletes more model elements than absolutely necessary in that case. Using short-cut repair rules, we can solve the remaining scenario and moreover, can further increase the performance of eMoflon when solving model synchronization tasks. Macedo and Cunha present bidirectional model transformations based on ATL in [47]. By using the SAT solver Alloy, they are able to guarantee least-change model synchronization where two metrics are supported measuring change: the graph edit distance and the operation-based distance. While the synchronization results may be very good, this solver-based approach does not scale for large models. All this suggests that our tool is highly competitive, not only among TGG-based tools but also in comparison to other bx tools.
With regard to theoretical considerations, least change and incremental synchronization have also been actively investigated in other approaches, in particular when using lenses, e.g., [15,56,32,33,31]. The approach by Wang et al. [56] seems to be the most similar one to ours. That approach derives functions to directly propagate changes from a source to a view and is applicable to tree-shaped data structures. As those approaches are less close to our work, detailed formal comparisons are left to future work.
Further related works. Change-preserving model repair as presented in [54,48] is closely related to our approach. Assuming a set of consistency-preserving rules and a set of edit rules to be given, each edit rule is accompanied by one or more repair rules completing the edit step if possible. Such a complement rule is considered as repair rule of an edit rule w.r.t. an overarching consistency-preserving rule. Operationalized TGG rules fit into that approach but provide more structure: As graphs and rules are structured in triples, a source rule is also an edit rule being complemented by a forward rule. In contrast to that approach, source and forward rules can be automatically deduced from a given TGG rule. By our use of short-cut rules, we introduce a pre-processing step to first enlarge the sets of consistency-preserving rules and edit rules. Furthermore, the repair process presented in that paper has more restrictive presumptions than our synchronization process using repair rules w.r.t. independence of rule applications.
Boronat [11] presents an incremental uni-directional transformation approach. When retranslating a model after a change, affected elements of the old model are marked first and then, if possible, re-used instead of deleted and re-created (similar to the approaches suggested in [27,50] for TGGs). Again, the same effects can be obtained by constructing and applying short-cut rules but there, for plain graph transformation. A correctness proof for that approach is still missing.

Conclusion
Model synchronization, i.e., the task of restoring the consistency between two models after model changes, poses challenges to modern bidirectional model transformation approaches and tools: We expect them to synchronize changes without unnecessary loss of information and to show a reasonable performance. Here, we restrict ourselves to model synchronizations where only one model is changed at a time. While Triple Graph Grammars (TGGs) provide the means to perform model synchronization tasks in general, efficient model synchronization without unnecessary information loss may not always be achieved since basic TGG rules are not designed to support intermediate model editing and repair. Therefore, we propose to add short-cut rules, a special form of generalized TGG rules that allow taking back one edit action and performing an alternative one. In our evaluation, we show that repair rules derived from short-cut rules allow for a kind of incremental model synchronization with considerably decreased information loss and improved runtime compared to synchronization without these rules.

Table 3 An overview of TGG-based synchronization approaches

| Approach | Degree of information loss | Automated | Correctness proven | Implemented | Evaluated performance gain |
| --- | --- | --- | --- | --- | --- |
| [42,41] | high | yes | yes | yes | yes |
| [26] | high | yes | only partially in [25] | yes | yes |
| [40] | high | yes | yes | yes | yes |
| [30] | to some extent | yes | yes | at least partially | no |
| [24] | low | yes | no | yes | yes |
| [27] | low | yes | no | yes | no |
| [50] | low | yes | yes | no | no |
| [5,45] | low | not needed | not needed | yes | yes |
| [54,48] | low | no | yes | yes | no |
| ours | low | yes | yes | yes | yes |

In this paper, we show the correctness of our synchronization approach, present the implementation design, and evaluate the corresponding tool support w.r.t. performance and unnecessary information loss. While the tool support already covers attributes of model elements, the correctness proof of our synchronization approach w.r.t. these extensions is prepared but still up to future work.
While model synchronization means the propagation of model changes from one view to another, model changes may also occur concurrently on both views of a model. Hence, model synchronization approaches have to cover those scenarios as well. Short-cut rules may also be promising to avoid information loss in that more general setting; they have not been considered in the context of other approaches to concurrent model synchronization in the literature [49,60]. As changes of both model views may be in conflict with each other, the development of an efficient concurrent model synchronization process which avoids unnecessary information loss poses a challenge for future work.
containers are used to separate Java entities on the documentation side to split them up into common Java data types, external Java references and source references. JavaModel-2-DocModel-Rule then defines consistency between Packages and Folders given that their parents are a MoDisco Model and a DocModel, respectively. Using JavaPackage-2-DocFolder-Rule, we can now create Package and Folder hierarchies recursively. Furthermore, there are four rules that define consistency for ClassDeclarations, InterfaceDeclarations, EnumDeclarations and inner ClassDeclarations, each with a Doc-File. Also, for the nine primitive types, e.g., boolean, byte and short, consistency is defined between each of them and a Doc-File. Given a ClassDeclaration or an InterfaceDeclaration with its corresponding Doc-File, we also define consistency between MethodDeclarations on one side and MethodEntries on the other. Using the consistency between methods on both sides, we are able to define consistency between TypeAccesses and Parameters, once for method signatures and once for the return statement. Finally, we define consistency between generalization and realization relationships using three rules: first, a rule for ClassDeclarations that extend another ClassDeclaration; second, a rule for InterfaceDeclarations extending another InterfaceDeclaration; and last, a rule for ClassDeclarations implementing an InterfaceDeclaration.