Hierarchical planning in a supervisory control context with compositional abstraction

Hierarchy is a tool that has been applied to improve the scalability of solving planning problems modeled using Supervisory Control Theory. In the work of Hill and Lafortune (2016), the notion of cost equivalence was employed to generate an abstraction of the supervisor that, with additional conditions, guarantees that an optimal plan generated on the abstraction is also optimal when applied to the full supervisor. Their work improves the abstraction by artificially giving transitions zero cost based on the sequentially-dependent ordering of events. Here, we relax the requirement on a specific ordering of the dependent events, while maintaining the optimality relationship between upper and lower levels of the hierarchy. The present paper also extends the authors' work (Vilela and Hill 2020), where we developed a new notion of equivalence based on cost equivalence and weak bisimulation that we term priced-observation equivalence. This equivalence allows the supervisor abstraction to be generated compositionally, which avoids the explosion of the state space that arises from having to first synthesize the full supervisor before the abstraction can be applied. Here, we also show that models with artificial zero-cost transitions can be created compositionally employing the new relaxed sequential dependence definition. An example cooperative robot control application is used to demonstrate the improvements achieved by the compositional approach to abstraction proposed in this paper.


Introduction
Planning is a problem with great importance in a wide range of fields. Generally, planning is the process of analyzing how a system changes its state based on the occurrence of actions, followed by choosing which actions to apply to the system to achieve a desired goal. Examples of planning include: path planning, navigation planning, manipulation planning, perception planning, and job-shop scheduling. Research has been done across these different notions of planning to make the process more efficient. According to Qiang (2012), an efficient planning algorithm is the "one able to find a good plan quickly".
Focusing on the development of general algorithms for planning, Ghallab et al. (2004) proposed employing discrete-event system models (a model widely used in the field of computing). Discrete-event systems can be represented by discrete graphs, where nodes are states and transitions are actions. Based on this representation, the classical planning problem can be seen as a search for a path in a graph from the initial state to a goal state with the optimization of a criterion. However, planning problems are highly complex and it can be difficult to explicitly model a system's behavior. To deal with this difficulty, the use of Supervisory Control Theory (SCT) has emerged as an interesting tool.
With SCT, the system and specifications are modeled separately and then tools are applied to generate the supervisor, which can achieve the most permissive closed-loop behavior that maintains safety and non-blockingness. One approach to solve planning problems is to use the supervisor as the search space to find a plan that optimizes a defined criterion, such as time or energy (Hill and Lafortune 2017; Pena et al. 2016; Dulce-Galindo et al. 2019; Hagebring and Lennartson 2018; Ware and Su 2017; Su 2012; Fabre and Jezequel 2009; Bravo et al. 2018).
The concept of synthesizing an optimal supervisor, when not directly related to the application of planning, can be seen as the solution of this type of problem (Huang and Kumar 2007; Sengupta and Lafortune 1993; Asarin and Maler 1999; Brandin and Wonham 1994; Su et al. 2012; Hyun-Wook et al. 2010). In general, the synthesis of an optimal supervisor goes beyond traditional supervisor synthesis, which only restricts the system to the set of legal behaviors. An optimal supervisor further limits the system's behaviors to those that minimize or maximize some criterion. Most existing approaches to synthesizing an optimal supervisor rely on a monolithic approach to modeling the system, which can have prohibitive computational complexity.
Efforts have been directed to improve the scalability of planning using SCT. Modular and compositional algorithms have been applied when a system can be decomposed into sub-systems. Su (2013) proposes a distributed approach to time optimal supervisory control, but it requires the synthesis of a global coordinator through the composition of all local components and requirements, which also can have prohibitive complexity.
Hagebring and Lennartson (2018) and Ware and Su (2017) represent their models as standard tick automata, which has the effect of dramatically increasing their model sizes, thereby reducing the applicability of the approaches for more complex problems. Hagebring and Lennartson (2018) propose the application of local optimization and abstraction to each sub-system, which removes all non-optimal local paths and merges sequences of local events into single events. This is followed by a compositional optimization, which synchronizes a sub-set of components generating a larger sub-system to which a second iteration of the local optimization is applied. This cycle is repeated until only one model remains, representing the optimal global solution by construction. Ware and Su (2017) propose two methods for synthesizing the time-optimal accepting trace for timed automata with only controllable events based on abstraction and pruning. For the first method, the sub-systems are composed together and all local events are abstracted. When a single automaton is achieved, Dijkstra's algorithm is applied, producing the time-optimal accepting trace. The second method is a modification of the first algorithm, applying pruning instead of abstraction to reduce the size of the automata. However, when a trace is returned, it is not guaranteed to be optimal since the pruning can remove all time-optimal accepting traces. Bravo et al. (2018) generate results specifically for production systems with symmetrically reachable idle states (states with inactive behavior). Such a system is shown to be decomposable into factors that are optimized independently and then the partial solutions are concatenated to generate the minimum makespan controllable sublanguage.
van Putten et al. (2020) propose an approach for the synthesis of a throughput-optimal supervisor for a manufacturing system. In their modeling, they use the concept of activities, which express some deterministic functionality, such as moving a product, in terms of actions. With that they create a high-level model composed of a plant model, which describes the activity sequences that are available, and specification models, which describe the order in which activities should be executed. The lower-level model has, for each activity, an acyclic graph that defines its constituent actions and the dependencies between them. The system's behavior, including timing, is captured from the lower level by variables and guards of the automata used in the high-level models. A supervisor is synthesized for the high level, but for that they remove all the variables related to time from the models prior to synthesis and return them once the supervisor is obtained. This procedure affects the guarantee of controllability and nonblockingness of the supervisor. To the resultant monolithic supervisor, they apply a game theoretic method, called ratio games, to find the control that optimizes the throughput.
Hill and Lafortune (2017) develop an approach for generating approximately time optimal supervisory control logic for a cooperative multi-robot system employing hierarchy and decomposition based on the results of Hill and Lafortune (2016). These works do not employ tick automata and do not rely on the identification of idle states. Hill and Lafortune (2016) propose a hierarchical approach based on a new notion, cost equivalence, that aggregates states with common future costs, as shown in Fig. 1. This equivalence generates an abstraction of the supervisor that, with additional conditions, such as an additive cost function and the requirement that only transitions with zero cost can be "hidden," guarantees that an optimal plan found on the abstraction is also optimal when applied to the underlying full supervisor.
The work of this paper improves upon Hill and Lafortune (2016) by developing conditions under which the abstraction can be generated more efficiently. Specifically, this paper is an extended version of the conference paper (Vilela and Hill 2020) where the proposed requirements are relaxed and additional details are provided, including a more developed example. In this work, we employ a new notion of equivalence introduced by Vilela and Hill (2020) based on cost equivalence and weak bisimulation that we term priced-observation equivalence. This class of equivalence is used to generate an abstraction based on the aggregation of states with futures that share the same costs and observed event labels. This new equivalence allows the abstraction to be generated compositionally, rather than first having to synthesize the full global supervisor as done by Hill and Lafortune (2016). The works of Hill and Lafortune (2016) and Vilela and Hill (2020) include as part of their process an approach for artificially reducing the cost of certain transitions to zero. In this work, we provide a relaxation of the conditions to perform this transformation, and demonstrate that it can be performed compositionally.
The remainder of this paper is divided into four sections: Section 2 provides the necessary preliminary definitions, Section 3 develops the main theoretical results, Section 4 provides a detailed example, and Section 5 presents the paper's conclusions.

Preliminaries
This paper employs discrete-event models where behaviors are represented using strings of events taken from a finite alphabet $\Sigma$. $\Sigma^*$ is the set of all finite strings of events in $\Sigma$, including the empty string $\varepsilon$, and the concatenation of strings $s, u \in \Sigma^*$ is written as $su$. A subset $L \subseteq \Sigma^*$ is called a language. The possible behaviors of a system are represented by its automaton generator. Specifically, this paper employs weighted automata, $G = (Q, \Sigma, \rightarrow, q_0, Q_m, f)$, where $Q$ is the set of states, $\Sigma$ is the alphabet of events, $\rightarrow \subseteq Q \times \Sigma \times Q$ is the transition relation, $q_0 \in Q$ is the initial state, $Q_m \subseteq Q$ is the finite set of marked states, and $f$ is the cost function that represents a mapping from a transition, $Q \times \Sigma \times Q$, to a nonnegative real number, $[0, \infty)$. The cost will typically represent the expenditure of energy or time associated with a transition. The transition relation is written in infix notation, $x \xrightarrow{\sigma} y$, and is extended to traces in the natural way. The cost of a trace $s = \sigma_1 \sigma_2 \ldots \sigma_{n-1}$ is defined as the sum of the costs of each transition, $C(q_1 \xrightarrow{s} q_n) = \sum_{i=1}^{n-1} f(q_i \xrightarrow{\sigma_i} q_{i+1})$. A state $q \in Q$ is accessible if $q_0 \xrightarrow{s} q$ for some $s \in \Sigma^*$. An automaton is accessible if every state is accessible and the accessible component of an automaton $G$ is obtained by the operation $Ac(G)$. For the purposes of this paper, all the automata are considered accessible. The synchronous operation of two automata $G_1$ and $G_2$ is modeled using the parallel composition operator, $\|$, as defined in the following.
In this definition, the function max is used to establish the cost of transitions with shared events.This decision makes sense, in particular, when costs are related to time.For example, if a task is running in parallel in two automata, the one that takes the greater amount of time determines the overall required time for both automata to complete the task.
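As an illustration of the max-cost rule described above, the following is a minimal Python sketch of a parallel composition of two weighted automata. The dictionary encoding, the function name `parallel_compose`, and the event names are assumptions made for this example only; deterministic transitions are assumed, shared events synchronize and take the larger of the two costs, and private events keep their own cost.

```python
from collections import deque

def parallel_compose(g1, g2):
    """Compose two weighted automata (hypothetical dict encoding).

    Each automaton is a dict with keys:
      'init'  : initial state
      'marked': set of marked states
      'sigma' : set of event labels
      'trans' : dict mapping (state, event) -> (next_state, cost)
    Only the accessible part of the composition is built.
    """
    shared = g1['sigma'] & g2['sigma']
    init = (g1['init'], g2['init'])
    trans, seen, queue = {}, {init}, deque([init])
    while queue:
        x1, x2 = queue.popleft()
        for ev in g1['sigma'] | g2['sigma']:
            t1 = g1['trans'].get((x1, ev))
            t2 = g2['trans'].get((x2, ev))
            if ev in shared:
                if t1 is None or t2 is None:
                    continue  # a shared event needs both components
                nxt, cost = (t1[0], t2[0]), max(t1[1], t2[1])
            elif ev in g1['sigma']:
                if t1 is None:
                    continue
                nxt, cost = (t1[0], x2), t1[1]  # private to g1
            else:
                if t2 is None:
                    continue
                nxt, cost = (x1, t2[0]), t2[1]  # private to g2
            trans[((x1, x2), ev)] = (nxt, cost)
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    marked = {q for q in seen
              if q[0] in g1['marked'] and q[1] in g2['marked']}
    return {'init': init, 'marked': marked,
            'sigma': g1['sigma'] | g2['sigma'], 'trans': trans}

# Shared task 't' takes 3 time units in g1 and 5 in g2.
g1 = {'init': 0, 'marked': {1}, 'sigma': {'t', 'a'},
      'trans': {(0, 't'): (1, 3.0), (1, 'a'): (1, 1.0)}}
g2 = {'init': 0, 'marked': {1}, 'sigma': {'t'},
      'trans': {(0, 't'): (1, 5.0)}}
g = parallel_compose(g1, g2)
```

In this toy case the shared event `t` ends up with the cost of the slower component, which matches the timing interpretation given above, while the private event `a` keeps its own cost.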
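The natural projection $P_i : \Sigma^* \to \Sigma_i^*$ maps strings in $\Sigma^*$ to strings in $\Sigma_i^*$ by erasing all events not contained in $\Sigma_i$. We can also define the inverse projection $P_i^{-1}(t) := \{s \in \Sigma^* : P_i(s) = t\}$. These definitions can be naturally extended to languages and then applied to provide an alternate definition of the parallel composition in terms of languages: $L(G_1 \| G_2) = P_1^{-1}(L(G_1)) \cap P_2^{-1}(L(G_2))$ and $L_m(G_1 \| G_2) = P_1^{-1}(L_m(G_1)) \cap P_2^{-1}(L_m(G_2))$, where $L(G)$ and $L_m(G)$ denote the generated and marked languages of $G$. For further details on discrete-event systems and the base SCT, the reader is referred to Cassandras and Lafortune (2008).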
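The natural projection and the membership test for its inverse can be sketched in a few lines of Python. Strings are encoded here as tuples of event labels; the function names and the event labels are hypothetical and chosen only for illustration.

```python
def project(s, sigma_i):
    """Natural projection: erase every event of s that is not in sigma_i."""
    return tuple(e for e in s if e in sigma_i)

def in_inverse_projection(s, t, sigma_i):
    """True if s belongs to the inverse projection of t, i.e. P_i(s) == t."""
    return project(s, sigma_i) == tuple(t)

# Hypothetical global string interleaving events of two components.
s = ('a1s', 'b2s', 'a1f', 'b2f')
sigma_a = {'a1s', 'a1f'}
p = project(s, sigma_a)
```

Because the inverse projection of a string is an infinite set in general, the sketch tests membership rather than enumerating it.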
A partition automaton, $G^h$, is generated from an automaton $G = (Q, \Sigma, \rightarrow, q_0, Q_m, f)$ by aggregating its states into distinct sets such that transitions inside a partition group cannot be observed. Defined more formally, a partition automaton $G^h = (Q^h, \Sigma^h, \rightarrow_h, q_0^h, Q_m^h, f^h)$ is a weighted automaton whose states are sets of states of $G$, with $q_i^h \cap q_j^h = \emptyset$ for $i \neq j$. The initial state of the partition automaton is defined such that $q_0 \in q_0^h$, and the marked states $Q_m^h$ are defined from the marking of the aggregated states. The events in $\Sigma^h \subseteq \Sigma$ label the transitions from the original automaton that are "between" partitions. A transition $q^h \xrightarrow{\sigma}_h q'^h$ exists in $G^h$ if and only if there exists a transition $q \xrightarrow{\sigma} q'$ in the original automaton such that $q \in q^h$ and $q' \in q'^h$. The transition's cost in the partition automaton, $f^h(q^h \xrightarrow{\sigma}_h q'^h)$, will equal $f(q \xrightarrow{\sigma} q')$. This situation is represented in Fig. 2.
For brevity, we introduce the notation, $x \stackrel{s}{\Rightarrow} y$, to represent how a string in an automaton $G$ is observed in its corresponding partition automaton $G^h$. Otherwise stated, $x \stackrel{s}{\Rightarrow} y$ with $s = \sigma_1 \sigma_2 \ldots \sigma_n \in (\Sigma^h)^*$ denotes the existence of $x \xrightarrow{r} y$ with exactly the order of events in $s$, but with an arbitrary number of "hidden" events, $\tau \in \Sigma \setminus \Sigma^h$, shuffled with the observable events in $s$, that is, $r = t_1 \sigma_1 t_2 \sigma_2 t_3 \ldots t_n \sigma_n t_{n+1}$, where $t_i \in (\Sigma \setminus \Sigma^h)^*$. A path in $G$ whose transitions are visited in an order consistent with the order of the transitions in $G^h$ is termed a realization of the corresponding "high-level" path. This consistency of ordering is named trace-cost consistency for individual trajectories and is denoted $q_1 \rightarrow q_n \in q_1^h \rightarrow_h q_k^h$.
In Hill and Lafortune (2016), the partition automaton is generated by aggregating states that are cost equivalent. This notion of equivalence requires that states must have observed futures with the same cost and consistent marking. The partitions are generated such that only transitions with zero cost are "hidden". Such partition systems are termed zero-cost reachable. The following result from Hill and Lafortune (2016) demonstrates that when partition systems maintain this property and employ cost equivalence, then an optimal plan chosen in the abstracted automaton has at least one realization in the underlying automaton and all such realizations are optimal. We define the optimal path between two states of $G$, $q_1 \rightarrow_{00} q_n$, as the sequence of transitions between $q_1$ and $q_n$ with the smallest path cost.
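To make the construction concrete, the following minimal Python sketch builds the high-level transitions of a partition automaton from a given partition and checks the zero-cost reachable condition. It does not compute a cost-equivalent partition; the partition (`block_of`), the function names, and the toy automaton are all assumptions made for illustration.

```python
def build_partition_automaton(trans, block_of):
    """Split the transitions of G into observed and hidden ones.

    trans    : dict (state, event) -> (next_state, cost) for the automaton G
    block_of : dict state -> block label (the partition, assumed given)
    Returns (high_level_transitions, hidden_transitions).
    """
    high, hidden = {}, []
    for (q, ev), (q_next, cost) in trans.items():
        b, b_next = block_of[q], block_of[q_next]
        if b == b_next:
            hidden.append((q, ev, q_next, cost))  # inside one block
        else:
            high[(b, ev)] = (b_next, cost)         # "between" blocks
    return high, hidden

def is_zero_cost_reachable(hidden):
    """Only zero-cost transitions may be hidden inside a block."""
    return all(cost == 0 for (_, _, _, cost) in hidden)

# Toy automaton: the zero-cost event 'tau' is hidden inside block 'B'.
trans = {(0, 'start'): (1, 4.0),
         (1, 'tau'): (2, 0.0),
         (2, 'finish'): (3, 2.0)}
block_of = {0: 'A', 1: 'B', 2: 'B', 3: 'C'}
high, hidden = build_partition_automaton(trans, block_of)
```

Here the high-level automaton has two transitions with the same costs as in the original automaton, and the only hidden transition has zero cost, so the pair is zero-cost reachable in the sense used above.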
Theorem 1 (Hill and Lafortune (2016)) Let $G^h = (Q^h, \Sigma^h, \rightarrow_h, q_0^h, Q_m^h, f^h)$ be a partition automaton generated by cost-equivalent abstraction of a weighted automaton $G = (Q, \Sigma, \rightarrow, q_0, Q_m, f)$ such that $\{G, G^h\}$ is zero-cost reachable. Let an optimal marked path between two states in $G^h$ be $q_1^h \rightarrow_{00,h} q_k^h \in Q_m^h$. Then there exists at least one trace-cost consistent realization of this path in $G$, and all such realizations are optimal, $q_1 \rightarrow_{00} q_n \in q_1^h \rightarrow_{00,h} q_k^h$.
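An optimal marked path, whether in $G$ or in its abstraction, can be found with a standard shortest-path search because all transition costs are nonnegative. The following is a minimal sketch using Dijkstra's algorithm; the adjacency-list encoding, the function name, and the toy automaton are assumptions for illustration and are not taken from the cited works.

```python
import heapq

def optimal_marked_path(trans, init, marked):
    """Dijkstra search for a minimum-cost path from init to a marked state.

    trans: dict state -> list of (event, next_state, cost), costs >= 0.
    Returns (cost, list_of_events), or None if no marked state is reachable.
    """
    counter = 0  # tie-breaker so the heap never compares states or paths
    heap = [(0.0, counter, init, [])]
    settled = set()
    while heap:
        cost, _, q, path = heapq.heappop(heap)
        if q in settled:
            continue
        settled.add(q)
        if q in marked:
            return cost, path
        for ev, q_next, c in trans.get(q, []):
            if q_next not in settled:
                counter += 1
                heapq.heappush(heap, (cost + c, counter, q_next, path + [ev]))
    return None

# Two marked paths: a.c costs 6, b.d costs 5.
trans = {
    'q0': [('a', 'q1', 2.0), ('b', 'q2', 5.0)],
    'q1': [('c', 'q3', 4.0)],
    'q2': [('d', 'q3', 0.0)],
}
result = optimal_marked_path(trans, 'q0', {'q3'})
```

The benefit of the hierarchical approach is that this search runs on the smaller abstracted automaton, and Theorem 1 then relates the plan found there to its realizations in the full automaton.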
Hill and Lafortune (2016) further introduce a technique to improve the amount of reduction that can be achieved for a given automaton. The idea is to create transitions that artificially have zero cost, while maintaining the overall cost of each marked path. For that, the authors introduce the concept of sequential dependence. A pair of events $(\sigma_a, \sigma_b)$ is defined to be a sequentially-dependent ordering if the occurrence of the event $\sigma_a$ in a marked path is always eventually followed by the event $\sigma_b$. Consider a sequentially-dependent ordering $\sigma_a \sigma_b$ in an automaton $G$, where $\sigma_a$ is the independent event with cost $f(p \xrightarrow{\sigma_a} p') = c_a$, and $\sigma_b$ is the dependent event with cost $f(q \xrightarrow{\sigma_b} q') = c_b$. If the costs of the dependent transitions are consistent in $G$, it is possible to perform a "lumping" operation, which transfers the cost of all transitions with label $\sigma_b$ to transitions with label $\sigma_a$: $f^*(p \xrightarrow{\sigma_a} p') = c_a + c_b$ and $f^*(q \xrightarrow{\sigma_b} q') = 0$. The rest of the automaton $G$ remains unaltered, resulting in the lumped automaton $G^*$. The property of costs being consistent for all transitions with the same event labels is referred to as time-separability. Hill and Lafortune (2016) prove in their Proposition 4 that all marked paths in a lumped automaton $G^*$ possess the same cost as in the original automaton $G$. This result allows an abstraction of $G^*$ to be used for planning.

Main results
The main results of this paper improve upon Hill and Lafortune (2016) by introducing a new notion of equivalence that allows the abstraction to be generated compositionally. This concept was introduced in the conference paper (Vilela and Hill 2020). Here we provide additional details and employ a definition of sequential dependence that is relaxed from the version employed in Vilela and Hill (2020) and Hill and Lafortune (2016).
The work of Hill and Lafortune (2016) relies on the existence of sequentially-dependent orderings of events to create transitions with "virtual" zero-cost to improve the amount of abstraction achieved in the generation of the high-level in the hierarchy. However, their definition of sequential dependence is restricted by a consistent ordering of the occurrence of dependent events. Based on this definition, the pair of events $(\sigma_a, \sigma_b)$ in the automaton $G$ presented in Fig. 3 is not considered sequentially-dependent, since if we choose an arbitrary marked path that includes the event $\sigma_a$, we can have the order $\sigma_a \sigma_b$ or $\sigma_b \sigma_a$. In this paper, we relax the sequential dependence requirement such that a specific ordering between the dependent events is not necessary. We will classify the pair of events $(\sigma_a, \sigma_b)$ in the example of Fig. 3 to be a sequentially-dependent pair without requiring them to be a sequentially-dependent ordering. Under this new definition of sequential dependence, we also define a new lump operation. We will show that the new lump operation will not affect the overall cost of marked paths through the automaton $G$, maintaining the hierarchical approach to planning allowed by Theorem 1. Further, we will show that the compositional approach to generating the abstraction proposed in Vilela and Hill (2020) holds under the new relaxed sequential dependence definition.
This new notion of sequential dependence is defined for a pair of events $(\sigma_a, \sigma_b)$, such that for an arbitrary marked path, the occurrence of one of the events in the path implies the occurrence of the other event. For example, if we have two occurrences of $\sigma_a$ in a string, then we will have two occurrences of $\sigma_b$ as well, without regard to their ordering. The marked path could include any permutation of these events, $\{\sigma_a \sigma_a \sigma_b \sigma_b, \sigma_a \sigma_b \sigma_a \sigma_b, \sigma_a \sigma_b \sigma_b \sigma_a, \sigma_b \sigma_a \sigma_b \sigma_a, \sigma_b \sigma_a \sigma_a \sigma_b, \sigma_b \sigma_b \sigma_a \sigma_a\}$. We will represent such sets by a permutation function. This set, for example, will be represented $Perm(\sigma_a^2, \sigma_b^2)$, where $\sigma_a^2 = \sigma_a \sigma_a$ and $\sigma_b^2 = \sigma_b \sigma_b$. The new relaxed notion of sequential dependence is formally presented in Definition 2.
Definition 2 Consider an automaton $G = (Q, \Sigma, \rightarrow, q_0, Q_m, f)$ which possesses a pair of events $(\sigma_a, \sigma_b)$ with corresponding natural projections $P_a : \Sigma^* \to \sigma_a^*$ and $P_b : \Sigma^* \to \sigma_b^*$. The pair of events is sequentially dependent in $G$ if $\forall s \in \Sigma^*$ such that $q_0 \xrightarrow{s} q$ exists for some $q \in Q_m$ with $P_a(s) = \sigma_a^n$, then $P_b(s) = \sigma_b^n$, where $n \in \mathbb{N}$.
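Since $P_a(s)$ and $P_b(s)$ simply count the occurrences of each event, the condition of Definition 2 amounts to every marked string containing as many $\sigma_b$ as $\sigma_a$ events. The Python sketch below enumerates a permutation set and checks this condition on a finite list of marked strings. The function names and event labels are hypothetical, and checking a finite sample is only an illustration; a full check would have to reason over all marked strings of the automaton.

```python
from itertools import permutations

def perm(a, b, n):
    """All distinct orderings of n copies of a and n copies of b."""
    return set(permutations([a] * n + [b] * n))

def is_sequentially_dependent(marked_strings, a, b):
    """Relaxed check: each marked string has as many b events as a events."""
    return all(s.count(a) == s.count(b) for s in marked_strings)

# Marked strings of a hypothetical automaton; order of 'sa' and 'sb' varies.
marked = [('sa', 'x', 'sb'), ('sb', 'sa'), ('x',)]
orderings = perm('sa', 'sb', 2)
ok = is_sequentially_dependent(marked, 'sa', 'sb')
```

For $n = 2$ the set `orderings` contains the six strings listed in the text, and the sample above passes the check even though the two events occur in different orders.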
Based on this new sequential dependence definition, we update the definition of the lump operation in Definition 3. Because there are no specific dependent and independent events, the cost can be transferred in either direction between the events, for example, from $\sigma_a$ to $\sigma_b$ or from $\sigma_b$ to $\sigma_a$. However, the direction of the lumping must be kept consistent throughout the entire automaton.
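Definition 3 Let there be a weighted automaton $G = (Q, \Sigma, \rightarrow, q_0, Q_m, f)$, with $\sigma_a, \sigma_b \in \Sigma$, where the pair of events $(\sigma_a, \sigma_b)$ is sequentially dependent. If the pair of events is time-separable in $G$, with $f(p \xrightarrow{\sigma_a} p') = c_a$ and $f(q \xrightarrow{\sigma_b} q') = c_b$, then the lumped automaton $G^*$ is generated by transferring the cost of one event to the other, for example from $\sigma_b$ to $\sigma_a$: $f^*(p \xrightarrow{\sigma_a} p') = c_a + c_b$ and $f^*(q \xrightarrow{\sigma_b} q') = 0$. The rest of the automaton remains unaltered.
Now, we need to show that the cost of a marked sequence is kept unaltered in the lumped automaton $G^*$ following this new definition of the lumping operation.
Proposition 1 Let there be an automaton $G = (Q, \Sigma, \rightarrow, q_0, Q_m, f)$ with a time-separable and sequentially-dependent pair of events $(\sigma_a, \sigma_b)$ from which the lumped automaton $G^*$ is generated. Then every marked sequence $s$ has the same cost in $G^*$ as in $G$.
Because the cost of any marked sequence $s$ in $G^*$ is the same as in $G$, we have that an optimal plan found in $G^*$ is also an optimal plan in $G$, as established before by Hill and Lafortune (2016).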
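The cost-preservation property can be checked on a small case. A marked string that contains $n$ occurrences of each event contributes $n(c_a + c_b)$ before and after lumping, whatever the ordering. The following minimal Python sketch implements a lump operation in the direction from $\sigma_b$ to $\sigma_a$; the dictionary encoding, the function names, and the diamond-shaped toy automaton are assumptions made for illustration.

```python
def lump(trans, a, b):
    """Move the cost of b-transitions onto a-transitions.

    trans: dict (state, event) -> (next_state, cost).
    Assumes (a, b) is sequentially dependent and time-separable, i.e. all
    a-transitions share one cost c_a and all b-transitions share one cost c_b.
    """
    costs_a = {c for (_, ev), (_, c) in trans.items() if ev == a}
    costs_b = {c for (_, ev), (_, c) in trans.items() if ev == b}
    if len(costs_a) != 1 or len(costs_b) != 1:
        raise ValueError("pair is not time-separable")
    c_a, c_b = costs_a.pop(), costs_b.pop()
    lumped = {}
    for (q, ev), (q_next, c) in trans.items():
        if ev == a:
            lumped[(q, ev)] = (q_next, c_a + c_b)
        elif ev == b:
            lumped[(q, ev)] = (q_next, 0.0)
        else:
            lumped[(q, ev)] = (q_next, c)  # rest of the automaton unaltered
    return lumped

def path_cost(trans, init, events):
    """Sum of transition costs along a string of events."""
    q, total = init, 0.0
    for ev in events:
        q, c = trans[(q, ev)]
        total += c
    return total

# Diamond: 'a' and 'b' may occur in either order on the way to state 3.
trans = {(0, 'a'): (1, 3.0), (1, 'b'): (3, 2.0),
         (0, 'b'): (2, 2.0), (2, 'a'): (3, 3.0)}
lumped = lump(trans, 'a', 'b')
```

Both marked strings, `a b` and `b a`, cost 5 in the original and in the lumped automaton, even though no fixed ordering between the two events exists.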
Hill and Lafortune (2016) employ the original weighted automaton $G$ to generate a modified lumped automaton $G^*$, which is then used to generate the cost-equivalent, zero-cost reachable partition automaton $G^h$. The reduced $G^h$ is used to find an optimal global plan (marked path). It is proven by Hill and Lafortune (2016) that all realizations in $G$ of an optimal plan in $G^h$ are also optimal, and that any path realized in $G^*$ has equal cost in the original automaton $G$. Therefore, any realization in $G$ of an optimal plan in $(G^*)^h$ is optimal as well (captured by Theorem 1).
This paper follows the same process of finding an optimal global plan developed by Hill and Lafortune (2016); however, it proposes a more efficient approach for generating the abstraction $G^h$. Given that $G$, generally, is composed of a set of smaller automata, $G_1 \| G_2 \| \ldots \| G_n$, this paper finds conditions under which $G^h$ can be generated compositionally, without the need to compute $G$ first as was done in Hill and Lafortune (2016). Specifically, we show that it is possible to distribute the "lumping" operation over the parallel composition operator, and if the partitioning employs a newly defined class of equivalence, then $G^h$ can be generated compositionally, as illustrated in Fig. 4. The new notion of priced-observation equivalence is stronger than the cost equivalence introduced in Hill and Lafortune (2016). Therefore, all results from Hill and Lafortune (2016) related to optimality still hold. With priced-observation equivalence, states must have futures with the same cost and the same observed event labels, as well as consistent marking. One can consider this equivalence a combination of cost equivalence and weak bisimulation.
Definition 4 Let there be two weighted automata, $G_1$ and $G_2$, and a set of hidden events $(\Sigma \setminus \Sigma^h)$. An equivalence relation $\approx_{PO}$ between states is a priced-observation equivalence if related states have observed futures with the same cost and the same observed event labels, as well as consistent marking.
$G_1$ and $G_2$ are said to be priced-observation equivalent, $G_1 \approx_{PO} G_2$, if $q_{0,1} \approx_{PO} q_{0,2}$. This follows from Definition 4 since all reachable states also will be priced-observation equivalent. As an example of priced-observation equivalence, we have the automata $G$ and $G^h$ in Fig. 5, where the events $\{\beta_1, \beta_2\}$ are abstracted away to generate $G^h$.
We will show now that the global partition automaton (abstraction) can be generated compositionally. This requires that priced-observation equivalence be a congruence with respect to parallel composition. We will show that this is true when a partition system $\{G, G^h\}$ is generated by a priced-observation equivalent abstraction, $G \approx_{PO} G^h$, and the shared events are kept observable.
• So far we have shown that, given an observation from $(q_{0,1}, q_{0,2})$ in $G_1 \| G_2$, a corresponding observation with the same cost exists from $(q^h_{0,1}, q_{0,2})$ in $G_1^h \| G_2$, and that the reached states $(q^h_{1,1}, q_{1,2})$ and $(q_{1,1}, q_{1,2})$ agree in marking. Applying the preceding logic over subsequent observations, we end up with the result that all observed sequences that exist in $G_1 \| G_2$ also exist in $G_1^h \| G_2$ with the same cost and marking.
2) The next step is to show the opposite direction. Let there be an observation in $G_1^h \| G_2$ starting from $(q^h_{0,1}, q_{0,2})$. We will now show the existence of the observation in $G_1 \| G_2$ in a similar manner to direction 1) by proving the existence of a sequence starting from $(q_{0,1}, q_{0,2})$ with consistent cost and marking.
• Applying this same logic to subsequent observations, we end up with the result that all observed sequences in $G_1^h \| G_2$ also exist in $G_1 \| G_2$, with the same cost and marking.
• Based on 1) and 2) and invoking Definition 4, we have that $(q_{0,1}, q_{0,2}) \approx_{PO} (q^h_{0,1}, q_{0,2})$, and therefore $G_1 \| G_2 \approx_{PO} G_1^h \| G_2$.
The fact that priced-observation equivalence is a congruence with respect to the parallel composition is an important result that allows the global partition automaton $G^h$ to be generated compositionally as proven in the following theorem.
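Theorem 2 Let there be two weighted automata $G_1$ and $G_2$ with alphabets $\Sigma_1$ and $\Sigma_2$. If the partition automata $G_1^h$ and $G_2^h$ are generated by a priced-observation equivalent abstraction and $\Sigma^h \supseteq \Sigma_1 \cap \Sigma_2$, then $G_1 \| G_2 \approx_{PO} G_1^h \| G_2^h$.

Proof Generating a partition automaton for $G_1$ using priced-observation equivalence, we have $G_1 \approx_{PO} G_1^h$. Invoking Proposition 2 with $G_2$ as the third automaton, we have $G_1 \| G_2 \approx_{PO} G_1^h \| G_2$. Generating a partition automaton for $G_2$, again using priced-observation equivalence, we have $G_2 \approx_{PO} G_2^h$. Again invoking Proposition 2 with $G_1^h$ as the third automaton, we have $G_1^h \| G_2 \approx_{PO} G_1^h \| G_2^h$.

Therefore, by transitivity, we have $G_1 \| G_2 \approx_{PO} G_1^h \| G_2^h$.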
Now we desire that the relaxed version of sequential dependence be preserved over parallel composition.Proposition 3 shows that this is true if the pair of events is sequentially dependent in all automata that have those events in their alphabet.

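Proposition 3 Let there be two weighted automata $G_1$ and $G_2$ with alphabets $\Sigma_1$ and $\Sigma_2$, where $G_1$ has a sequentially-dependent pair of events $(\sigma_a, \sigma_b)$. The parallel composition $G_1 \| G_2$ will also have the sequentially-dependent pair of events $(\sigma_a, \sigma_b)$, if: i) $\sigma_a, \sigma_b \in (\Sigma_1 \cap \Sigma_2)$ and $G_2$ also has the sequentially-dependent pair of events $(\sigma_a, \sigma_b)$; or ii) $\sigma_a, \sigma_b \in \Sigma_1 \setminus \Sigma_2$.
Proof Let $\Sigma = \Sigma_1 \cup \Sigma_2$, with natural projections $P_1 : \Sigma^* \to \Sigma_1^*$ and $P_2 : \Sigma^* \to \Sigma_2^*$, and let $P_{ab}$ denote the natural projection onto $\{\sigma_a, \sigma_b\}^*$. It is given that the pair of events $(\sigma_a, \sigma_b)$ is sequentially dependent in $G_1$ and $\sigma_a, \sigma_b \in \Sigma_1$.
• Following Definition 2, we first need to show that $\forall s \in L_m(G_1 \| G_2)$ for which $P_a(s) = \sigma_a^n$, we have that $P_b(s) = \sigma_b^n$. Let $s \in L_m(G_1 \| G_2)$, with $s_1 = P_1(s) \in L_m(G_1)$, $s_2 = P_2(s) \in L_m(G_2)$, and $P_a(s) = \sigma_a^n$.
• Given $\sigma_a \in \Sigma_1$ and $P_1(P_a(s)) = \sigma_a^n$, then $P_a(P_1(s)) = P_a(s_1) = \sigma_a^n$.
• $G_1$ has the sequentially-dependent pair of events $(\sigma_a, \sigma_b)$, so given $s_1 \in L_m(G_1)$ and $P_a(s_1) = \sigma_a^n$, from Definition 2, $P_b(s_1) = \sigma_b^n$. Also, $P_{ab}(s_1) = r_1 \in Perm(\sigma_a^n, \sigma_b^n)$.
• Now analyzing the two possible cases: i) for the case $\sigma_a, \sigma_b \in \Sigma_1 \cap \Sigma_2$: from the assumption $P_a(s) = \sigma_a^n$, we have $P_a(P_2(s)) = P_a(s_2) = \sigma_a^n$ since $\sigma_a \in \Sigma_2$. The pair of events $(\sigma_a, \sigma_b)$ is also sequentially dependent in $G_2$, then from Definition 2, $P_b(s_2) = \sigma_b^n$ and $P_{ab}(s_2) = r_2 \in Perm(\sigma_a^n, \sigma_b^n)$. Since $P_{ab}(P_1^{-1}(s_1)) = r_1$ and $P_{ab}(P_2^{-1}(s_2)) = r_2$, and $s$ belongs to both inverse projections, we have $P_{ab}(s) = r_1 = r_2$. ii) for the case $\sigma_a, \sigma_b \in \Sigma_1 \setminus \Sigma_2$: given $P_a(s) = \sigma_a^n$, this removes the possibility of $P_{ab}(s) = \varepsilon$. Therefore, $P_{ab}(s) = r_1$.
• Since $r_1 \in Perm(\sigma_a^n, \sigma_b^n)$, we have that $P_b(s) = \sigma_b^n$. Therefore, the pair of events $(\sigma_a, \sigma_b)$ is sequentially dependent in $G_1 \| G_2$.
The sequential dependence property can then be employed to generate the lumped automaton $G^*$ as defined in Definition 3. Specifically, it is desired to perform the "lumping" compositionally. Conditions under which the lumping operation can be distributed over parallel composition, $G^* = G_1^* \| G_2^*$, are presented in Propositions 4 and 5, which correspond to cases i) and ii) of Proposition 3, respectively.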
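Proposition 4 Let there be two weighted automata $G_1$ and $G_2$ with a pair of events $(\sigma_a, \sigma_b)$ that is sequentially dependent in both, where $\sigma_a, \sigma_b \in (\Sigma_1 \cap \Sigma_2)$. If $G_1$ and $G_2$ are time-separable for $\sigma_a$ and $\sigma_b$, with transition costs such that $f_1(q_{a,1} \xrightarrow{\sigma_a}_1 q'_{a,1}) \geq f_2(q_{a,2} \xrightarrow{\sigma_a}_2 q'_{a,2})$ and $f_1(q_{b,1} \xrightarrow{\sigma_b}_1 q'_{b,1}) \geq f_2(q_{b,2} \xrightarrow{\sigma_b}_2 q'_{b,2})$, then $(G_1 \| G_2)^* = G_1^* \| G_2^*$.
Proof Let there be a pair of events $(\sigma_a, \sigma_b)$ that is sequentially dependent in $G_1$ and $G_2$, where $\sigma_a, \sigma_b \in (\Sigma_1 \cap \Sigma_2)$. It follows from Proposition 3 Case i), that the pair of events $(\sigma_a, \sigma_b)$ is also sequentially dependent in $G = G_1 \| G_2$. We wish to show that the costs of corresponding transitions with labels $\sigma_a$ and $\sigma_b$ are the same in $(G_1 \| G_2)^*$ and $G_1^* \| G_2^*$, that is, $f^*((q_{a,1}, q_{a,2}) \xrightarrow{\sigma_a} (q'_{a,1}, q'_{a,2})) = f^*_{1,2}((q_{a,1}, q_{a,2}) \xrightarrow{\sigma_a} (q'_{a,1}, q'_{a,2}))$ and $f^*((q_{b,1}, q_{b,2}) \xrightarrow{\sigma_b} (q'_{b,1}, q'_{b,2})) = f^*_{1,2}((q_{b,1}, q_{b,2}) \xrightarrow{\sigma_b} (q'_{b,1}, q'_{b,2}))$, for all such transitions.
i) We begin with $(G_1 \| G_2)^*$. Consider transitions corresponding to $\sigma_a$, $q_{a,1} \xrightarrow{\sigma_a}_1 q'_{a,1}$ in $G_1$ and $q_{a,2} \xrightarrow{\sigma_a}_2 q'_{a,2}$ in $G_2$. Noting that $\sigma_a, \sigma_b \in (\Sigma_1 \cap \Sigma_2)$ and following Definition 1, we have the transition $(q_{a,1}, q_{a,2}) \xrightarrow{\sigma_a} (q'_{a,1}, q'_{a,2})$, whose cost is the maximum of the two component costs. By assumption that $f_1(q_{a,1} \xrightarrow{\sigma_a}_1 q'_{a,1}) \geq f_2(q_{a,2} \xrightarrow{\sigma_a}_2 q'_{a,2})$, we then have that $f((q_{a,1}, q_{a,2}) \xrightarrow{\sigma_a} (q'_{a,1}, q'_{a,2})) = f_1(q_{a,1} \xrightarrow{\sigma_a}_1 q'_{a,1})$. Following the same logic for $\sigma_b$ transitions $q_{b,1} \xrightarrow{\sigma_b}_1 q'_{b,1}$ and $q_{b,2} \xrightarrow{\sigma_b}_2 q'_{b,2}$, we have that $f((q_{b,1}, q_{b,2}) \xrightarrow{\sigma_b} (q'_{b,1}, q'_{b,2})) = f_1(q_{b,1} \xrightarrow{\sigma_b}_1 q'_{b,1})$. Applying the lump operation of Definition 3 to generate $(G_1 \| G_2)^*$, lumping the cost from $\sigma_b$ to $\sigma_a$, we have that $f^*((q_{a,1}, q_{a,2}) \xrightarrow{\sigma_a} (q'_{a,1}, q'_{a,2})) = f_1(q_{a,1} \xrightarrow{\sigma_a}_1 q'_{a,1}) + f_1(q_{b,1} \xrightarrow{\sigma_b}_1 q'_{b,1})$ and $f^*((q_{b,1}, q_{b,2}) \xrightarrow{\sigma_b} (q'_{b,1}, q'_{b,2})) = 0$.
ii) We continue with $G_1^* \| G_2^*$. The costs associated with the transitions related to $\sigma_a$ and $\sigma_b$, lumping the cost from $\sigma_b$ to $\sigma_a$, for $G_1^*$ are as follows: $f_1^*(q_{a,1} \xrightarrow{\sigma_a}_1 q'_{a,1}) = f_1(q_{a,1} \xrightarrow{\sigma_a}_1 q'_{a,1}) + f_1(q_{b,1} \xrightarrow{\sigma_b}_1 q'_{b,1})$ and $f_1^*(q_{b,1} \xrightarrow{\sigma_b}_1 q'_{b,1}) = 0$. Likewise, for $G_2^*$: $f_2^*(q_{a,2} \xrightarrow{\sigma_a}_2 q'_{a,2}) = f_2(q_{a,2} \xrightarrow{\sigma_a}_2 q'_{a,2}) + f_2(q_{b,2} \xrightarrow{\sigma_b}_2 q'_{b,2})$ and $f_2^*(q_{b,2} \xrightarrow{\sigma_b}_2 q'_{b,2}) = 0$. By assumption, $f_1(q_{a,1} \xrightarrow{\sigma_a}_1 q'_{a,1}) \geq f_2(q_{a,2} \xrightarrow{\sigma_a}_2 q'_{a,2})$ and $f_1(q_{b,1} \xrightarrow{\sigma_b}_1 q'_{b,1}) \geq f_2(q_{b,2} \xrightarrow{\sigma_b}_2 q'_{b,2})$, and thus, $f^*_{1,2}((q_{a,1}, q_{a,2}) \xrightarrow{\sigma_a} (q'_{a,1}, q'_{a,2})) = f_1(q_{a,1} \xrightarrow{\sigma_a}_1 q'_{a,1}) + f_1(q_{b,1} \xrightarrow{\sigma_b}_1 q'_{b,1})$ and $f^*_{1,2}((q_{b,1}, q_{b,2}) \xrightarrow{\sigma_b} (q'_{b,1}, q'_{b,2})) = 0$.
• Comparing results from i) and ii), we have $f^*((q_{a,1}, q_{a,2}) \xrightarrow{\sigma_a} (q'_{a,1}, q'_{a,2})) = f^*_{1,2}((q_{a,1}, q_{a,2}) \xrightarrow{\sigma_a} (q'_{a,1}, q'_{a,2}))$ and $f^*((q_{b,1}, q_{b,2}) \xrightarrow{\sigma_b} (q'_{b,1}, q'_{b,2})) = f^*_{1,2}((q_{b,1}, q_{b,2}) \xrightarrow{\sigma_b} (q'_{b,1}, q'_{b,2}))$.
• Having demonstrated that the transitions have the same cost, in general, we have demonstrated that $(G_1 \| G_2)^* = G_1^* \| G_2^*$.
Proposition 5 Let there be two weighted automata $G_1$ and $G_2$, where the pair of events $(\sigma_a, \sigma_b)$ is sequentially dependent and time-separable in $G_1$ with $\sigma_a, \sigma_b \in \Sigma_1 \setminus \Sigma_2$. Then $(G_1 \| G_2)^* = G_1^* \| G_2^*$.
Proof Let there be the pair of events $(\sigma_a, \sigma_b)$ that are sequentially dependent in $G_1$, where $\sigma_a, \sigma_b \in \Sigma_1 \setminus \Sigma_2$. We wish to show that the costs of corresponding transitions with labels $\sigma_a$ and $\sigma_b$ are the same in $(G_1 \| G_2)^*$ and $G_1^* \| G_2^*$, that is, $f^*((q_{a,1}, q_{a,2}) \xrightarrow{\sigma_a} (q'_{a,1}, q_{a,2})) = f^*_{1,2}((q_{a,1}, q_{a,2}) \xrightarrow{\sigma_a} (q'_{a,1}, q_{a,2}))$ and $f^*((q_{b,1}, q_{b,2}) \xrightarrow{\sigma_b} (q'_{b,1}, q_{b,2})) = f^*_{1,2}((q_{b,1}, q_{b,2}) \xrightarrow{\sigma_b} (q'_{b,1}, q_{b,2}))$, for all such transitions.
i) We begin with $(G_1 \| G_2)^*$. Consider the transitions $q_{a,1} \xrightarrow{\sigma_a}_1 q'_{a,1}$ and $q_{b,1} \xrightarrow{\sigma_b}_1 q'_{b,1}$ in $G_1$. Noting that $\sigma_a, \sigma_b \in \Sigma_1 \setminus \Sigma_2$ and following Definition 1, we have the transitions $(q_{a,1}, q_{a,2}) \xrightarrow{\sigma_a} (q'_{a,1}, q_{a,2})$ and $(q_{b,1}, q_{b,2}) \xrightarrow{\sigma_b} (q'_{b,1}, q_{b,2})$, with costs $f_1(q_{a,1} \xrightarrow{\sigma_a}_1 q'_{a,1})$ and $f_1(q_{b,1} \xrightarrow{\sigma_b}_1 q'_{b,1})$. Applying the lump operation of Definition 3, lumping the cost of $\sigma_b$ to $\sigma_a$, to generate $(G_1 \| G_2)^*$, we have $f^*((q_{a,1}, q_{a,2}) \xrightarrow{\sigma_a} (q'_{a,1}, q_{a,2})) = f_1(q_{a,1} \xrightarrow{\sigma_a}_1 q'_{a,1}) + f_1(q_{b,1} \xrightarrow{\sigma_b}_1 q'_{b,1})$ and $f^*((q_{b,1}, q_{b,2}) \xrightarrow{\sigma_b} (q'_{b,1}, q_{b,2})) = 0$.
ii) The costs associated with the transitions of $G_1^*$, maintaining the same order of lumping done before, are as follows: $f_1^*(q_{a,1} \xrightarrow{\sigma_a}_1 q'_{a,1}) = f_1(q_{a,1} \xrightarrow{\sigma_a}_1 q'_{a,1}) + f_1(q_{b,1} \xrightarrow{\sigma_b}_1 q'_{b,1})$ and $f_1^*(q_{b,1} \xrightarrow{\sigma_b}_1 q'_{b,1}) = 0$. Because $\sigma_a$ and $\sigma_b$ are not shared with $G_2$, these costs carry over unchanged to $G_1^* \| G_2^*$ by Definition 1.
• Comparing results from i) and ii), we have $f^*((q_{a,1}, q_{a,2}) \xrightarrow{\sigma_a} (q'_{a,1}, q_{a,2})) = f^*_{1,2}((q_{a,1}, q_{a,2}) \xrightarrow{\sigma_a} (q'_{a,1}, q_{a,2}))$ and $f^*((q_{b,1}, q_{b,2}) \xrightarrow{\sigma_b} (q'_{b,1}, q_{b,2})) = f^*_{1,2}((q_{b,1}, q_{b,2}) \xrightarrow{\sigma_b} (q'_{b,1}, q_{b,2}))$.
• Having demonstrated that the transitions have the same cost, in general, we have demonstrated that $(G_1 \| G_2)^* = G_1^* \| G_2^*$.
The conditions of Propositions 4 and 5 might seem restrictive; however, many systems are composed of plant components with disjoint event sets where all interaction takes place through specifications. In this case, the transitions within the specifications may be assumed to have zero cost with each event (and its cost) "owned" by a single plant component. In such a situation, the required conditions would be satisfied to distribute the lump operation over parallel composition as defined in Definition 1. Figure 6 provides an example that illustrates Proposition 4.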
The automaton $G$ is composed of two component automata, such that $G = G_1 \| G_2$. The pair of events $(\sigma_a, \sigma_b)$ is sequentially dependent in $G_1$ and $G_2$. For the transitions with labels $\sigma_a$ and $\sigma_b$, each instance has greater cost in $G_2$ than in $G_1$. As predicted, $G_1^* \| G_2^*$ results in the same automaton as $(G_1 \| G_2)^*$.
Now that we have shown it is possible to distribute the lump operation over parallel composition and that the partition automaton can be generated compositionally, we can combine these results to present the main theorem of this paper. This result requires that events shared between two weighted automata, $G_1$ and $G_2$, cannot be hidden. Furthermore, if the shared events are sequentially dependent, then all transitions for each dependent pair of events must have greater or equal cost in one of the two automata. Under these conditions, the abstracted automaton $(G^*)^h$ can be generated compositionally, without the need to first construct the full unabstracted automaton $G$. This significantly improves the scalability of the results from Hill and Lafortune (2016) by making it possible to generate the abstracted model, while avoiding the explosion of the state space that can arise from the construction of the global model $G$.
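Theorem 3 Let there be two weighted automata $G_1$ and $G_2$ with alphabets $\Sigma_1$ and $\Sigma_2$. For each sequentially-dependent pair of events $(\sigma_a, \sigma_b)$, either let $\sigma_a, \sigma_b \in \Sigma_1 \setminus \Sigma_2$ or $\sigma_a, \sigma_b \in (\Sigma_1 \cap \Sigma_2)$, and let $G_1$ and $G_2$ be time-separable for $\sigma_a$ and $\sigma_b$ with costs: $f_1(q_{a,1} \xrightarrow{\sigma_a}_1 q'_{a,1}) \geq f_2(q_{a,2} \xrightarrow{\sigma_a}_2 q'_{a,2})$ and $f_1(q_{b,1} \xrightarrow{\sigma_b}_1 q'_{b,1}) \geq f_2(q_{b,2} \xrightarrow{\sigma_b}_2 q'_{b,2})$. If the partition automata are generated by priced-observation equivalent abstraction and $\Sigma^h \supseteq \Sigma_1 \cap \Sigma_2$, then $(G_1 \| G_2)^* \approx_{PO} (G_1^*)^h \| (G_2^*)^h$.
Proof By assumption, each arbitrary sequentially-dependent pair of events $(\sigma_a, \sigma_b)$ with $\sigma_a, \sigma_b \in (\Sigma_1 \cap \Sigma_2)$ has corresponding transitions with costs that satisfy the conditions of Proposition 4, so that $(G_1 \| G_2)^* = G_1^* \| G_2^*$. Similarly, each arbitrary sequentially-dependent pair of events $(\sigma_a, \sigma_b)$ with $\sigma_a, \sigma_b \in \Sigma_1 \setminus \Sigma_2$ also provides that $(G_1 \| G_2)^* = G_1^* \| G_2^*$, by Proposition 5. Applying Theorem 2 then provides that for a priced-observation equivalent abstraction with $\Sigma^h \supseteq \Sigma_1 \cap \Sigma_2$, we have $G_1^* \| G_2^* \approx_{PO} (G_1^*)^h \| (G_2^*)^h$.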
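The distribution of the lump operation over parallel composition can be checked numerically on a small instance of the shared-event case of Proposition 4. The sketch below is only an illustration under stated assumptions: the dictionary encoding, the helper names `compose` and `lump`, and the two toy automata are hypothetical, transitions are deterministic, and the costs in the first automaton dominate those in the second for the shared pair `a`, `b`.

```python
def compose(t1, s1, t2, s2, init):
    """Accessible part of the parallel composition, max cost on shared events.

    t1, t2: dict (state, event) -> (next_state, cost); s1, s2: event sets.
    A component that does not own an event stays put and contributes cost 0,
    so max() returns the owner's cost (costs are nonnegative).
    """
    trans, stack, seen = {}, [init], {init}
    while stack:
        x1, x2 = stack.pop()
        for ev in s1 | s2:
            n1 = t1.get((x1, ev)) if ev in s1 else (x1, 0.0)
            n2 = t2.get((x2, ev)) if ev in s2 else (x2, 0.0)
            if n1 is None or n2 is None:
                continue  # event blocked by a component that owns it
            nxt = (n1[0], n2[0])
            trans[((x1, x2), ev)] = (nxt, max(n1[1], n2[1]))
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return trans

def lump(trans, a, b):
    """Transfer the cost of b-transitions to a-transitions."""
    c_a = {c for (_, e), (_, c) in trans.items() if e == a}
    c_b = {c for (_, e), (_, c) in trans.items() if e == b}
    assert len(c_a) == 1 and len(c_b) == 1, "pair must be time-separable"
    total = c_a.pop() + c_b.pop()
    return {(q, e): (n, total if e == a else (0.0 if e == b else c))
            for (q, e), (n, c) in trans.items()}

# G1 and G2 share 'a' and 'b'; 'x' is private to G2.
s1, s2 = {'a', 'b'}, {'a', 'b', 'x'}
t1 = {(0, 'a'): (1, 3.0), (1, 'b'): (2, 2.0)}
t2 = {(0, 'a'): (1, 1.0), (1, 'x'): (2, 4.0), (2, 'b'): (3, 1.0)}

lump_after = lump(compose(t1, s1, t2, s2, (0, 0)), 'a', 'b')
lump_before = compose(lump(t1, 'a', 'b'), s1, lump(t2, 'a', 'b'), s2, (0, 0))
```

In this instance both orders of operation give the same transition costs, which is the behavior the propositions describe; if the dominance condition on costs were violated, the two results could differ.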

Computational examples
In order to illustrate the advantages of hierarchical planning using the compositional approach to abstraction proposed in this paper, we will apply the technique to the example of cooperative robot control introduced in Hill and Lafortune (2017).
In the approach of this paper, a planning technique is applied to the abstraction of a set of automata composed together. There are no requirements on how these automata are generated, but it is likely in many cases that these automata will capture the set of safe and nonblocking behaviors of the system, in other words, the system's supervised behavior. More specifically, the idea is to create a model for the problem by generating a classic supervisor based on the controllable and nonblocking sublanguage which provides the possible legal behaviors of the system. In the hierarchical planning proposed by Hill and Lafortune (2016), an abstraction of the supervisor is created based on their notion of cost equivalence, and this abstraction becomes the search space for the planning technique. A limitation of their work is that the full model (supervisor) must still be created before generating the abstraction, which can be computationally prohibitive. The proposed approach of this paper makes it possible to create the abstracted model for hierarchical planning in a compositional manner. In order to apply this approach, we will first generate a set of modular supervisors, rather than generating the monolithic supervisor. In the following, the models and the monolithic and compositional approaches to abstraction are described for our motivating example.
We first consider a scenario with two robots, A and B, four tasks, and four regions, as illustrated in Fig. 7.
Each robot is physically capable of performing only one task at a time. This behavior is captured for robot A by the automaton $R^{A}_{tasks}$ shown in Fig. 8a, where each task consists of a start event a#s and a finish event a#f. The analogous model is built for robot B, $R^{B}_{tasks}$, where the start events have the form b#s and the finish events have the form b#f. There is a requirement on ordering precedence between the tasks, as illustrated by the automata $Spec^{tasks}_{1,2}$ and $Spec^{tasks}_{3,4}$ shown in Fig. 9. The automaton $Spec^{tasks}_{1,2}$ represents that task 1 must be performed prior to task 2 and by the same robot. This could represent, for example, that each odd-numbered task involves picking up an item at some location, while the corresponding even-numbered task involves dropping the item at a different location. Each transition for starting a task (a#s, b#s) possesses the cost associated with performing that task (distance to that location), while all transitions for finishing a task (a#f, b#f) have zero cost. Additionally, all start events are controllable and all finish events are uncontrollable.
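The task model described above can be sketched in a few lines. The sketch assumes that the robot returns to a single idle state after each finish event, which is one natural reading of "one task at a time" but is not confirmed here against Fig. 8a; the state names, the function name `build_task_automaton`, and the numeric costs are hypothetical.

```python
def build_task_automaton(robot, task_costs):
    """Build a one-task-at-a-time automaton for a single robot.

    robot      : event prefix, e.g. 'a' or 'b'
    task_costs : dict mapping task number -> cost of its start event
                 (hypothetical values standing in for distances)
    Returns (delta, controllable) where delta maps
    (state, event) -> (next_state, cost).
    """
    delta = {}
    controllable = set()
    for task, cost in task_costs.items():
        start, finish = f"{robot}{task}s", f"{robot}{task}f"
        busy = f"busy{task}"
        delta[("idle", start)] = (busy, cost)  # start carries the task cost
        delta[(busy, finish)] = ("idle", 0)    # finish has zero cost
        controllable.add(start)                # finish stays uncontrollable
    return delta, controllable


# Four tasks with made-up costs for robot A.
delta_a, ctrl_a = build_task_automaton("a", {1: 3.0, 2: 2.0, 3: 4.0, 4: 1.5})
```

Since start events are only defined in the idle state, a second task cannot be started until the finish event of the current task has occurred.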
We also model how the robots move between regions (numbered 5, 6, 7, 8), as well as the region in which each task is located. From its current position, a robot can move to any adjacent region, but it cannot move diagonally between regions. The automaton in Fig. 8b is for robot A, where events a#e signify the entry into a region. Likewise, an analogous automaton is built for robot B.

In the supervisor S, entry events will always follow a start event. Sub-trajectories through the supervisor automaton have the form that a task is started, some regions are possibly entered, and then the task is completed. Such a pattern almost implies sequential dependence, except that the robot could take different paths to a task (it could enact different region entry events). If the entry events had cost zero and became candidates to be abstracted, the potential for abstraction would increase greatly and the decoupling between the high-level planning and the low-level planning would improve. A reasonable approximation is to treat the entry events as having zero cost and to treat the start events as having cost equal to the entire straight-line distance to the task. In such a case, the entry events become candidates to be abstracted while maintaining priced-observation equivalence and zero-cost reachability with respect to the original automaton.
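The movement rule (adjacent regions only, no diagonal moves) can be illustrated with a small sketch. The 2 x 2 layout used below, with regions 5 and 6 in the top row and 7 and 8 in the bottom row, is an assumption made for illustration and may differ from the layout in Fig. 7; the function name `build_region_automaton` is likewise hypothetical.

```python
def build_region_automaton(robot, grid):
    """Build the region-movement transitions for one robot.

    grid : list of rows of region numbers (assumed layout).
    A move is allowed only between horizontally or vertically adjacent
    cells, so diagonal moves are excluded by construction.
    Returns a dict mapping (region, entry_event) -> entered region.
    """
    delta = {}
    rows, cols = len(grid), len(grid[0])
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    target = grid[nr][nc]
                    # Event a#e: robot enters region '#'.
                    delta[(grid[r][c], f"{robot}{target}e")] = target
    return delta


# Assumed 2 x 2 arrangement of the four regions.
delta_regions_a = build_region_automaton("a", [[5, 6], [7, 8]])
```

Under this assumed layout, each region has exactly two neighbours, so the automaton has four states and eight entry transitions, and the diagonal move from region 5 to region 8 is absent.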
Based on this approximation, we can consider finish and entry events, both having cost zero, to be candidates for abstraction. For an arbitrary automaton $A$, we will denote an abstraction $A^h$ more specifically as $A^{(\sigma_1,\sigma_2,\ldots,\sigma_n)}$ to signify that the events $(\sigma_1, \sigma_2, \ldots, \sigma_n)$ have been hidden. In the monolithic approach, the high-level abstraction is generated from the monolithic supervisor $S = \mathrm{SupC}(K, L_m(G))$ by abstracting all finish and entry events at once: $S^h = S^{(a\#f,b\#f,a\#e,b\#e)}$.
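To illustrate why zero-cost events are natural candidates for hiding, the sketch below runs a uniform-cost (Dijkstra) search over a toy automaton and then projects the cheapest string onto the events that remain visible. This is not the abstraction procedure of this paper and does not compute priced-observation equivalence; the toy automaton, its costs, and the function name `cheapest_marked_path` are assumptions made for illustration.

```python
import heapq


def cheapest_marked_path(delta, init, marked):
    """Uniform-cost (Dijkstra) search for a cheapest string to a marked state.

    delta : dict mapping state -> list of (event, cost, next_state)
    Returns (cost, list of events), or None if no marked state is reachable.
    """
    frontier = [(0.0, 0, init, [])]  # (cost so far, tie-breaker, state, string)
    settled = set()
    counter = 1
    while frontier:
        cost, _, state, string = heapq.heappop(frontier)
        if state in settled:
            continue
        settled.add(state)
        if state in marked:
            return cost, string
        for event, step, nxt in delta.get(state, []):
            if nxt not in settled:
                heapq.heappush(frontier,
                               (cost + step, counter, nxt, string + [event]))
                counter += 1
    return None


# Toy automaton: start events carry cost, entry and finish events cost zero.
toy = {
    "q0": [("a1s", 3.0, "q1"), ("b1s", 5.0, "q3")],
    "q1": [("a6e", 0.0, "q2")],
    "q2": [("a1f", 0.0, "done")],
    "q3": [("b1f", 0.0, "done")],
}
hidden = {"a6e", "a1f", "b1f"}

plan_cost, plan = cheapest_marked_path(toy, "q0", {"done"})
observed = [ev for ev in plan if ev not in hidden]  # high-level view of the plan
```

In this toy case the hidden events contribute nothing to the plan cost, so the cost of the full string equals the cost of its visible start event alone.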
In order to apply Theorem 2 to generate our high-level abstraction compositionally, we first generate a set of modular supervisors that meet all of the provided specifications for our system of robots. We will employ the approach of Hill and Tilbury (2008) to generate the modular supervisors incrementally. Consider the following steps as an example for the scenario with two robots, four tasks and four regions:
- $S_{avoid} = \mathrm{SupC}(L_m(G_{regions} \parallel Spec_{regions}), L_m(G_{regions}))$;
- $S^h_{avoid} = S^{(a\#e,b\#e)}_{avoid}$;

The conjunction of these modular supervisors, $S_{avoid} \parallel S_{12} \parallel S_{34}$, is guaranteed by construction to meet all the safety specifications and to be nonblocking by Hill and Tilbury (2008). Therefore, for this example, we can consider that the overall supervisor $S$ is composed of these modular supervisors. Now, given $S = S_{avoid} \parallel S_{12} \parallel S_{34}$ and based on Theorem 2, we have that the abstraction of the supervisor $S$ can be performed compositionally (based on the transitivity of the equivalence). Thus, the high level of the hierarchy can be built compositionally.

For this example scenario, the sizes of the automata generated at each step of creating the abstracted upper level of the hierarchy are presented in Table 1 for the monolithic approach and in Table 2 for the compositional approach. The monolithic approach requires the creation of an automaton with 672 states and 2512 transitions, while in the compositional approach the largest automaton that is generated has 48 states and 136 transitions. This is a reduction of 93% in the number of states and 95% in the number of transitions. Both approaches result in an upper-level abstraction with 16 states and 32 transitions. Since all the abstracted events (entry and finish events) have cost zero and the abstractions have priced-observation equivalence, it is possible to solve the high-level control of the multiple robots for this scenario in the upper-level abstraction as proposed in Theorem 1.
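The reported reductions follow directly from the automaton sizes quoted above. As a quick check, the short sketch below recomputes them; the function name `reduction_percent` is introduced here for illustration only.

```python
def reduction_percent(monolithic, compositional):
    """Relative size reduction of the largest automaton, in percent."""
    return 100.0 * (1.0 - compositional / monolithic)


# Sizes quoted in the text for the 2-robot, 4-task, 4-region scenario.
state_reduction = reduction_percent(672, 48)
transition_reduction = reduction_percent(2512, 136)
print(f"{state_reduction:.1f}% fewer states, "
      f"{transition_reduction:.1f}% fewer transitions")
```

Rounded to the nearest percent, these values agree with the 93% and 95% stated in the text.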
Both approaches, monolithic and compositional, were applied to different scenarios of the cooperative robot control application proposed in Hill and Lafortune (2017), where the notation #R#T#R indicates the number of robots, number of tasks, and number of regions. These results are displayed in Table 3, which shows the size of the largest automaton created during the monolithic and compositional approaches, as well as the size of the final abstracted supervisor.

Conclusion
The work of Hill and Lafortune (2016) developed a hierarchical approach to planning in order to reduce the complexity of planning operations, specifically, to accelerate the on-line planning that may need to occur in reaction to unpredictable events or new information.
However, the step of first generating the global supervisor before the necessary abstraction can be applied restricts the size of systems that can be addressed by this approach. The present paper is an extension of the conference paper (Vilela and Hill 2020), which demonstrates that under certain conditions it is possible to generate the abstraction of a system compositionally using the new notion of priced-observation equivalence. This equivalence guarantees that an optimal plan found in the abstraction is also optimal in the global supervisor, as proven by Theorem 1 of Hill and Lafortune (2016).
In an effort to improve the amount of reduction obtained by the abstraction, Hill and Lafortune (2016) proposed an operation that creates transitions with artificial zero cost, if they are part of a sequentially-dependent ordering of events, without changing the overall cost of a marked path. However, there are cases where there is a sequential dependence relation between a pair of events, but they do not follow a specific ordering as required by Hill and Lafortune (2016). In this paper, a relaxation of the sequential dependence property was proposed to include these cases. It was shown that the overall cost of marked paths in the resulting lumped automaton based on the relaxed sequential dependence is kept unaltered, maintaining the optimal relation between upper and lower levels proposed by Hill and Lafortune (2016).
The present paper also demonstrates that the relaxation of the sequential dependence concept does not affect the results of Vilela and Hill (2020). Specifically, it is proven that under this relaxation, it is still possible to generate the abstraction of a system compositionally.
The results compiled in Table 3 for different scenarios of the cooperative robot control example of Hill and Lafortune (2017) demonstrate the improvement in scalability achieved by the compositional approach proposed in this paper as compared to the monolithic approach to hierarchy generation employed by Hill and Lafortune (2016). For a scenario with two robots, four tasks, and four regions, employing the compositional approach provides a reduction of 93% in the number of states and 95% in the number of transitions of the largest automaton built in the process of generating the abstraction as compared to using the monolithic approach. For large scenarios, like the one consisting of three robots, ten tasks, and nine regions, the global supervisor is prohibitively large to synthesize, precluding the possibility of even applying the monolithic approach of Hill and Lafortune (2016). The proposed compositional approach, however, is able to address the scenario, and the largest automaton that it needs to construct has 23,625 states and 124,650 transitions.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Fig. 2 Cost of transitions in a partition automaton following notation to capture the cost of an observed string $s \in \Sigma_h^*$, C(q 1

Fig. 3 Sequential dependence of the pair of events $(\sigma_a, \sigma_b)$

Fig. 4 Different approaches to create the high level of the hierarchy

Fig. 6 Example: Distribution of lump operation over parallel composition

Table 1 Size of automata generated in the monolithic approach for a scenario of 2 robots, 4 tasks, and 4 regions

Table 2 Size of automata generated in the compositional approach for a scenario of 2 robots, 4 tasks, and 4 regions

Table 3 Comparison between monolithic and compositional approaches for abstraction