Scalable time-constrained planning of multi-robot systems

This paper presents a scalable procedure for time-constrained planning of a class of uncertain nonlinear multi-robot systems. In particular, we consider N robotic agents operating in a workspace which contains regions of interest (RoI), in which atomic propositions for each robot are assigned. The main goal is to design decentralized and robust control laws so that each robot meets an individual high-level specification given as a metric interval temporal logic (MITL), while using only local information based on a limited sensing radius. Furthermore, the robots need to fulfill certain desired transient constraints such as collision avoidance between them. The controllers, which guarantee the transition between regions, consist of two terms: a nominal control input, which is computed online and is the solution of a decentralized finite-horizon optimal control problem (DFHOCP); and an additive state feedback law which is computed offline and guarantees that the real trajectories of the system will belong to a hyper-tube centered along the nominal trajectory. The controllers serve as actions for the individual weighted transition system (WTS) of each robot, and the time duration required for the transition between regions is modeled by a weight. The DFHOCP is solved at every sampling time by each robot and then necessary information is exchanged between neighboring robots. The proposed approach is scalable since it does not require a product computation among the WTS of the robots. The proposed framework is experimentally tested and the results show that the proposed framework is promising for solving real-life robotic as well as industrial applications.


Introduction
Over the last few years, the field of control of multi-robot systems under high-level specifications has been gaining significant attention (Wongpiromsarn et al. 2009;Nikou 2019;Kantaros and Zavlanos 2016;Hasanbeig et al. 2019;Pant et al. 2018Pant et al. , 2019Raman et al. 2014). Applications arise in the fields of autonomous driving, industrial work by autonomously operating robot systems, indoor transportation in warehouses etc (Ulusoy et al. July 2013;Xu et al. 2019 ily been used to express the high-level tasks is linear temporal logic (LTL) (see, e.g., Fainekos et al. February 2009;Fang et al. 2018). In practical applications, however, the desired tasks need to be accomplished within certain quantitative time bounds by the robots.
A suitable temporal logic for dealing with tasks that are required to be completed within certain time bounds is the metric interval temporal logic (MITL). MITL has been originally proposed in Alur and Dill (1994) and has been used in control synthesis frameworks in Nikou et al. (2016). Given a robot dynamics and an MITL formula, the control design procedure is the following: first, the robot dynamics are abstracted into a weighted transition system (WTS), in which the time duration for navigating between states is modeled by a weight in the WTS (abstraction); second, an offline product between the WTS and an automaton that accepts the runs that satisfy the given formula is computed; and third, once an accepting run in the product is found, it maps into a sequence of feedback control laws of the robot dynamics.
Controller synthesis for multi-robot systems under MITL specifications has been investigated in Nikou et al. ( , 2018, Karaman and Frazzoli (2008). In our previous works , the under consideration dynamics were first order and we considered actuation over each state of each agent. Additionally, a global product WTS of the individual WTSs of each robot was computed. Furthermore, no transient constraints between the agents are taken into consideration. In particular, the work  can only handle multi-agent timed constrained planning in R 2 dimension, which is usually not the case in real-life robotic applications. Authors in Karaman and Frazzoli (2008) have addressed the vehicle routing problem (VHP), which is modeled as an optimization problem, that aims at finding the optimal set of routes for a fleet of vehicles to traverse, in order to deliver the load to a given set of customers. However, the dynamics of the agents were not taken into consideration. Moreover, issues such as control input saturation and robustness against disturbances were not considered. In the same context, none of the aforementioned works deal with real-time experimental validation of the corresponding proposed frameworks.
Motivated by the aforementioned, in this work, we aim to address the latter issues. In particular, we deal with nonlinear dynamics in R n with input constraints and external uncertainties/disturbances. Then, by assigning an MITL formula to each agent, we provide decentralized feedback control laws that guarantee robust transitions between neighboring agents under transient constraints. The controllers consist of two terms: a nominal control input, which is computed online and is the solution of a decentralized finite-horizon optimal control problem (DFHOCP); and an additive state feedback law which is computed offline and guarantees that the real trajectories of the system will belong to a hyper-tube centered along the nominal trajectory. More specifically, the online part is responsible for minimizing the error and the control input effort in order for the robot to be navigated between RoI, while transient constraints and control input saturation are satisfied. The second part is introduced in order to guarantee that while the DFHOCP is solved for the nominal dynamics, the controller compensates for the uncertain part due to external disturbances and keeps the trajectory of the robot bounded inside a tube. Furthermore, an algorithm that computes the runs of each agent that in turn map into continuous control laws and provably satisfy the MITL formulas is provided. These control laws correspond to the transitions indicated above. The proposed scheme is experimentally validated in our lab facilities with a group of Nexus robots (see Fig. 1). Moreover, the proposed approach is scalable since does not require a product computation among the WTS of the agents.
The idea of avoiding the global product between the agents lies in the fact that we address the multi-agent coupling with the low-level continuous time control design. More specif- Fig. 1 The experimental setup demonstrating the proposed framework. Three Nexus 10011 mobile robots, in the workspace of Smart Mobility Lab (SML) (n.d.) that contains 5 RoI ically, we exploit the inherent advantage of NMPC with reference to other control techniques: we capture the coupling constraints through the hard constraints of each agent by assuming communication capabilities between the agents. In the same context, all the algorithmic computations are performed offline and the robots are executing online only a sequence of a control actions that are the outcome of the planner. Thus, the latter leads to a framework that is scalable with the number of agents. Our contribution is thus a fully automated framework for a general class of uncertain multi-robot systems consisting of both constructing purely decentralized abstractions and conducting timed temporal logic planning with transient constraints in a scalable way. This paper is structured as follows. In Sect. 2 a description of the necessary mathematical tools, the notations and the definitions are given. Section 3 provides the modeling of the proposed framework along with the formal problem statement. Section 4 discusses the technical details of the solution, while Sect. 5 is devoted to a real-time experiment demonstrating the proposed approach. Finally, conclusions and future research directions are discussed in Sect. 6.

Notation and preliminaries
In this section, the notation that will be used hereafter as well as the necessary mathematical background for nonlinear control systems and formal verification are provided.

Notation
The set of positive integers, positive rational numbers and real numbers are denoted by N, Q + , and R, respectively. Given a set S, we denote by |S| its cardinality, by S n = S × · · · × S its n-fold Cartesian product, and by 2 S the set of all its subsets; y 2 := y y and y M := y My, M ≥ 0 stand for the Euclidean and the weighted norm of a vector y ∈ R n , respectively; λ min (M) stands for the minimum absolute value of the real part of the eigenvalues of M ∈ R n×n ; I n ∈ R n×n and 0 m×n ∈ R m×n are the identity matrix and the m × n matrix with all entries zeros, respectively. A n × n symmetric real matrix M is said to be positive semidefinite (M ≥ 0) if x M x ≥ 0 for every x ∈ R n . The notation diag{M 1 , . . . , M n } stands for the block diagonal matrix with the matrices M 1 , . . . , M n in the main diagonal. The set M(χ , ρ) = {y ∈ R n : y − χ 2 ≤ ρ}, represents the ndimensional ball with center χ ∈ R n and radius ρ ∈ R >0 . It should be noticed that, throughout this manuscript, the nominal signals and the magnitude of upper bounds of signals are denoted with ·, ·, respectively. Tables 1 and 2 show a list of symbols and a list of acronyms, respectively, that are used in this manuscript.
Definition 1 Given the sets S 1 , S 2 ⊆ R n , S ⊆ R m and the matrix M ∈ R n×m , the Minkowski addition, the Pontryagin difference as well as the matrix-set multiplication are respectively defined by: Lemma 1 (Nikou and Dimarogonas 2019) For any constant ρ > 0, vectors z 1 , z 2 ∈ R n and matrix P ∈ R n×n , P > 0 it holds that:

Nonlinear control
Definition 2 Khalil (1996) A continuous function α : [0, a) → R ≥0 belongs to class K if it is strictly increasing and belongs to class KL if: 1) for a fixed s, the mapping β(r , s) belongs to class K with respect to r ; 2) for a fixed r , the mapping β(r , s) is strictly decreasing with respect to s; and it holds that: lim s→∞ β(r , s) = 0.
Definition 3 Yu et al. (2013) Consider a dynamical system: with initial condition x(0) ∈ X . A set X ⊆ X is a Robust Control Invariant (RCI) set for the system, if there exists a feedback control law u:=κ(x) ∈ U, such that for all x(0) ∈ X and for all disturbances δ ∈ D it holds that x(t) ∈ X for all t ≥ 0, along every solution x(t).
(states, init. states, actions, transition relation, weight, atomic propos., labeling function) (Q, Q init , CL, Inv, E, FS) (states, init. states, clocks, invariance, accepted states) x i , v i , u i position/orientation, velocity and control input of robot Errors between real and nominal signals (state and velocity respectively)   (Khalil 1996, Def. 4.7, p. 175) A nonlinear systemẋ = f (x, u, δ), x ∈ X , u ∈ U, δ ∈ D with initial condition x(0) ∈ X is said to be Input-to-State Stable (ISS) with respect to δ ∈ D, if there exist functions β ∈ KL, γ ∈ K such that for any initial condition x(0) ∈ X and for any input u(t) ∈ U, the solution x(t) exists for all t ∈ R ≥0 and satisfies: 2.3 Nonlinear model predictive control NMPC Chen and Allgöwer (1998) is formulated as solving at each sampling time step an online finite horizon optimal control problem (FHOCP) subject to nonlinear system dynamics and constraints involving states and controls. Based on measurements obtained at each sampling time step, the controller predicts the dynamic behavior of the system over a predictive horizon in the future and determines the input such that a predetermined open-loop performance objective is minimized. In order to incorporate feedback, the optimal open-loop input is implemented only until the next sampling time step. Using the new system state at the next sampling time step, the whole procedure-prediction and optimization-is repeated.
An atomic proposition p is a statement that is either True ( ) or False (⊥).
Definition 7 A weighted transition system (WTS) is a tuple (S, S init , Act, −→, t, Γ , L) where S is a finite set of states; S init ⊆ S is a set of initial states; Act is a set of actions; −→⊆ S × Act × S is a transition relation; t :−→→ Q + is a map that assigns a positive weight to each transition; Γ is a finite set of atomic propositions; and L : S → 2 Γ is a labeling function.
The syntax of Metric Interval Temporal Logic (MITL) over a set of atomic propositions Γ is defined by the grammar: where p ∈ Γ , and ♦, and U are the eventually, always and until temporal operator, respectively; The MITL formulas are interpreted over timed words like the ones produced by a WTS which is given in Definition 8.
We denote by ν | φ the fact that the valuation ν satisfies the clock constraint φ.
where q, q are the source and target states, g is the guard of edge e and RS is a set of clocks to be reset upon executing the edge; FS ⊆ Q is a set of accepting locations; Γ is a finite set of atomic propositions; and L : Q → 2 Γ labels every state with a subset of atomic propositions.
Any MITL formula ϕ over Γ can be algorithmically translated into a TBA with the alphabet 2 Γ , such that the language of timed words (i.e. the set of all accepted timed words) that satisfy ϕ is the language of timed words produced by the TBA (Alur et al. 1996;Maler et al. 2006;Ničković and Piterman 2010;Brihaye et al. 2017).

Definition 11 Given a WTS
Then, their product WTS: is defined as follows: , is a map that assigns a positive weight to each transition; -F = {(s, q) ∈ Q : q ∈ FS} is a set of accepting states; and -L(s, q) = L(s) is a labeling function.

System model
Consider a team of N robots with labels [N ]:={1, . . . , N } operating in a bounded workspace W ⊆ R n . The robots are governed by the following kinematics and dynamics model:ẋ where x i , v i ∈ R n stands for the position/orientation and the linear/angular velocity of the robot i ∈ [N ], respectively; f i : R n × R n → R n is a known and continuously differentiable vector fields with f i (0, 0) = 0 and G i ∈ R n×n ; u i ∈ R n stands for the control input vector; and δ i ∈ R n models the external disturbances and uncertainties. Consider also velocity constraints, input constraints as well as bounded disturbances: where the constants v i , u i , δ i > 0 are a priori given. The sets V i , U i and Δ i are assumed to be connected sets with the origin as an interior point. Define the corresponding nominal kinematics/dynamics by: which are the real kinematics/dynamics for the case of δ i = 0. Assumption 2 There exist strictly positive constants G i such that:

Assumption 1 The linear systemsη
Remark 1 Assumption 1 is required for the NMPC nominal stability to be guaranteed (Chen and Allgöwer 1998). Note also that in real-time robotic systems, the matrices G i usually represents the mass matrix of the robots which are always positive-definite. Thus, Assumption 2 is satisfied.
In the given workspace, there exist Z ∈ N disjoint Regions of Interest (RoI) labeled by [Z ]:={1, . . . , Z }. We assume that the RoI are modeled by balls, i.e., R z :=M(y z , p z ), Z ∈ N, where y z and p z > 0 stand for the center and radius of RoI R z , respectively. Define also the union of RoI by Due to the fact that we are interested in imposing safety constraints, at each time t ≥ 0, the robot i is occupying a ball M(x i (t), r i ) that covers its volume, where x i (t) and r i > 0 are its center and radius, respectively. Moreover, in order to be able to impose transient constraints among the robots, we assume that each robot i ∈ [N ] has communication capabilities within a limited sensing range d i > 0 such that: The latter implies that each agent has sufficiently large sensing radius so as to measure the agent with the biggest volume, due to the fact that the agents' radii are not the same. Define the set of robots j that are within the sensing range of agent i at time t as:

Objectives
The goal of this paper is to design decentralized feedback control laws that steers the robots with dynamics as in (1a)-(1b) between RoI so that they obey individual high-level tasks given in MITL under transient constraints between them. Define the labeling functions: which map each RoI with a subset of atomic propositions that hold true there. Note that some of the RoI may be assigned with labels that indicate unsafe regions, i.e., the robot is required to avoid visiting them (safety specifications). (2)) . . ., where r i (l) ∈ R, ∀l ∈ N, is a sequence of RoI that the robot crosses, if the following hold: 1. τ i (0) = 0, i.e., the robot starts the motion at time t = 0; 2. M(x i (τ i (0)), r i ) r i (0), i.e., initially, the volume of the robot is entirely within the RoI i.e., the robot changes discrete state as soon as its entire volume is strictly contained in the corresponding RoI; are functions that model the duration that the robot needs to be driven between regions r i (l) and r i (l + 1).

Definition 13
A trajectory x i (t) satisfies an MITL formula ϕ i over the set of atomic propositions Γ i , formally written as x i (t) | ϕ i , ∀t ≥ 0, if and only if there exists a timed run r t i to which the trajectory x i (t) is associated, according to Definition 12, which satisfies ϕ i .

Remark 2
We assume that the volume of each robot is covered by a ball. We further assume that the obstacles can be modeled by RoI that are also balls. Even if the volume of an agent and/or an obstacle is not a ball, it can be overapproximated by a ball.

Problem statement
The problem considered in this paper is stated as follows: Problem 1 Consider N robots governed by dynamics (1a)-(1b), covered by the balls M(x i (t), r i ), operating in the workspace W ⊆ R n with sensing communication capabilities captured by the sets G i as defined in (5). The workspace contains the RoI R z , z ∈ [Z ] modeled also by balls. Given task specification formulas ϕ i for each robot i ∈ [N ] expressed in MITL over the set of atomic propositions Γ i and labeling functions L i as in (6); then, design decentralized feedback control laws u Remark 3 Note that Problem 1 constitutes a general problem due to the fact that the dynamics (1a)-(1b) arise in most robotic applications and transient constraints among the robots are taken into consideration.

Problem solution
In this section, a systematic framework for solving Problem 1 is provided as follows: 1. In Sects. 4.1-4.2, decentralized feedback control laws that guarantee the transition between RoI in the given environments are provided. The laws consist of two components: an online control law which is the outcome of a DFHOCP solved at each timed step (Sect. 4.1); and an offline law which guarantees that the trajectories of the real system remain in a hyper-tube (Sect. 4.2). 2. Then, by using the outcome of Sect. 4.1, we abstract the dynamics (1a)-(1b) into a WTS for each robot, exploiting the fact that the timed runs in the WTS project onto associated trajectories according to Definition 12 (Sect. 4.3) 3. By invoking ideas from our previous work (Nikou et al. 2016), a controller synthesis procedure that provides a sequence of control laws that serve as solution to Problem 1 is consulted (Sect. 4.4) 4. Lastly, the computational complexity of the proposed framework is discussed in Sect. 4.5.

Decentralized feedback control design-part I
as well as the uncertain error kinematics/dynamics by: The corresponding uncertain nominal error kinematics/ dynamics are given by: Consider the feedback control law: which consists of a nominal control action u i (e i , v i ) ∈ U i and a state feedback laws κ i (e i , v i , e i , v i ). The control action u i (e i , v i ) is the outcome of a DFHOCP solved on-line at each sampling time step; the state-feedback law will be tuned offline according to a procedure that will be presented thereafter. Define the sets that capture the state constraints of each robot as: The first constraint captures the fact that the robots should not collide with each other; the latter one, captures the fact that each robot needs to be navigated from RoI R i,s to RoI R i,d without intersecting with any other RoI of the workspace due to the fact that we are interested in imposing safety specifications.
Assumption 3 It is assumed that: More specifically, (11) states that the radius of the ball that covers every robot plus the radius of the disturbance tube is smaller than the radius of any of the RoI of the workspace. As it will be shown later, this assumption is required in order to compute the time that a robot needs to be navigated between the RoI in the workspace. Consider a sequence of sampling times {t k }, k ∈ N, with a constant sampling period 0 < h < T , where T stands for the finite prediction horizon. It holds that t k+1 = t k + h, ∀k ∈ N. It should be noted that both t k and T are multiples of h. At every discrete sampling time t k a DFHOCP is solved by each robot i ∈ [N ] as follows: subject to: In the aforementioned optimal control problem we defined: The matrices Q i , P i ∈ R 2n and R i ∈ R n are positive definite weighting matrices. The sets F i stand for the terminal sets that are used to enforce the stability of the nominal system (see Chen and Allgöwer 1998 for more details). Hereafter, the sets E i , V i and U i are explained. In order to guarantee that while the DFHOCP (12a)-(12d) is solved for the nominal dynamics (9a)-(9b), the real states e i , v i and control inputs u i satisfy the corresponding state and input constraints E i , V i and U i , respectively, the latter sets are appropriately modified as: , Ω i,2 as given in (16a), (16b), respectively, and k i to be defined later. This constitutes a standard constraints set modification technique adopted in tube-based NMPC frameworks (for more details see Yu et al. 2013). The advantage of the tube-based framework compared to other robust NMPC approaches is that the constraint tightening is performed offline and it does not depend on the length of the horizon. Algorithm 1 depicts the procedure
Step 1 : At time t k and current state (e i (t k ), v i (t k ), e i (t k ), v i (t k )), solve the DFHOCP (12a)-(12d) to obtain the nominal control action u i (t k ) and the actual control action Step 2 : Apply the control u i (t k ) to the system (8a)-(8b), during sampling interval [t k , t k+1 ), where t k+1 = t k + h.
Step 3 : Measure the state (e i (t k+1 ), v i (t k+1 )) at the next time instant t k+1 of the system (8a)-(8b) and compute the successor state (e i (t k+1 ), v i (t k+1 )) of the nominal system (9a)-(9b) under the nominal control action u i (t k ).
Step 4 : Set Go to Step 1.
of how the control law is calculated and applied to a real robot. This is a procedure of implementing the continuoustime tube-based NMPC in a real-time system that has been introduced in Yu et al. (2013).

Decentralized feedback control design-part II
For each agent i ∈ [N ] define by: the deviation between the real states e i , v i of the uncertain system (8a)-(8b) and the states of the nominal system (9a)-(9b) with e i (0) = v i (0) = 0. It will be proven that the states e i , v i remain invariant in certain compact sets. The dynamics of the states e i and v i are written as: where the functions g i are defined by: and they are upper bounded by: The constants L e,i , L v,i stand for the Lipschitz constants of functions f i with respect to the variable e i and v i , respectively, and

Lemma 2
The state feedback laws designed by: where k i , ρ i > 0 are chosen such that the following hold: renders the sets: RCI sets for the error dynamics (13a), (13b), according to Definition 3, where the constants α i,1 , α i,2 > 0 are defined by: Proof A backstepping control methodology will be used (Krstic et al. 1995). The state v i in (13b) can be seen as virtual input to be designed such that the candidate Lyapunov function: for the dynamical system (13a) is always decreasing. The time derivative of L 1 along the trajectories of system (13a) is given by: Thus, by designing v i ≡ −e i , it yields thaṫ Define the backstepping auxiliary errors ζ i,1 , ζ i,2 ∈ R n by: Then, the auxiliary error dynamics are written as: with: and ζ i,1 (0) = ζ i,2 (0) = 0. Define the stack vector ζ i :=[ζ i,1 , ζ i,2 ] ∈ R 2n and consider the candidate Lyapunov function L(ζ i ) = 1 2 ζ i 2 2 with L(0) = 0. The time derivative of L along the trajectories of system (13a)-(13b) is given by: By using (18c), the latter becomes: By using Lemma 1 we have: with ρ i satisfying (15). Then, it holds that: (14) and compatible with (10) we get: Writing the matrices G i as G i = and taking into account that: y Py ≥ λ min (P) y 2 2 , ∀y ∈ R n , P ∈ R n×n , P > 0, we obtain: (17), the latter becomes:

By using Assumption 2 and
Taking the latter into account and the fact that ζ i (0) = 0 we have that ζ i (t) 2 ≤ δ i min{α i,1 ,α i,2 } , ∀t ≥ 0. Moreover, the following inequalities hold: and: which leads to the conclusion of the proof.
The aforementioned result states that real trajectories e i (t), v i (t) will belong to a hyper-tubed which is centered Fig. 2 The hyper-tube centered along the trajectory e i (t) (depicted by blue line) with radius δ i min{α i,1 , α i,2 } . Under the proposed control law, the real trajectory e i (t) (depicted with red line) lies inside the hyper-tube for all times, i.e., along the nominal trajectories e i (t), v i (t). The tubes' radii are δ i min{α i,1 ,α i,2 } and 2 δ i min{α i,1 ,α i,2 } , respectively, as it is depicted in Fig. 2. Remark 4 It should be noted that the volume of the hypertubes depends on the upper bound of the disturbances δ i , the Lipschitz constants L i and the constants G i . By tuning the gains k i and ρ i as in (15), (17) appropriately, the volume of the tubes can be adjusted. However, these gains cannot be set arbitrarily high due to the fact that the robots have limited actuation resources which are captured by the upper bound of the control input. The higher the upper bound of the control input is, the smaller the volume of the tube can be set.
By using (10), the closed-loop system is written as: Due to the fact that Problem 1 imposes transient constraints between the agents (collision avoidance) and the agents have communication capabilities within the sensing range d i as given in (4)-(5), we adopt here the decentralized procedure depicted in Algorithm 2 and explained hereafter. Assume that each agent knows its labeling number in the set [N ]. After each sampling time t k , ∀k ≥ 0 that agent i solves its own DFHOCP and obtains the estimated open-loop trajectory ξ i (s), s ∈ [t k , t k + T ], it transmits it to all agents j ∈ G i (t k ), j = i, i.e., to agents that are within its sensing radius at time t k . Then, agents' j ∈ G i (t k ), j = i hard constraints E j are updated by incorporating the predicted trajectory of agent i, i.e., ξ i (s), s ∈ [t k , t k + T ]. Among all agents j ∈ G i (t k ), the one with higher priority, i.e., smaller labeling number in the set [N ], solves its own DFHOCP (for example, agent 2 has higher priority than agents 3, 4, . . . ). This sequential procedure is continued until all agents i ∈ [N ] solve their own DFHOCP, and then the sampling time is updated.
In other words, each time an agent solves its own individual optimization problem, it knows the (open-loop) state predictions that have been generated by the solution of the optimization problem of all agents within its sensing range at that time, for the next T time units. These pieces of information are required, as each agent's trajectory is constrained not by constant values, but by the trajectories of its associated agents through time: at each solution time t k and within the next T time units, an agent's predicted configuration at time s ∈ [t k , t k + T ] needs to be constrained by the predicted configuration of its neighboring and perceivable agents (agents within its sensing range) at the same time instant s, so that collisions are avoided. We assume that the above pieces of information are always available, accurate and can be exchanged without delay. We will show thereafter that by adopting the aforementioned sequential communication procedure, and given that at t = 0 the DFHOCP (12a)-(12d) of all agents are feasible, the agents are navigated to the desired RoI, while all distance and input constraints imposed by Problem 1 are satisfied.

Remark 5
It should be noted that the constraint sets E i , i ∈ [N ] in (12c) depend on the estimated open-loop trajectories e i (s) and e j (s) for all i ∈ [N ], j ∈ G(t k ), with s ∈ [t k , t k + T ]. Moreover, they are updated when each robot has received the transmitted trajectories by its neighbors.

Remark 6
By considering a real-time scenario where the state vector ξ is comprised of 12 real numbers encoded by 4 bytes the overall downstream bandwidth required by each robot is: ≈ 16 · 10 2 robots, within the range of one robot. We deem this number to be large enough for practical applications of the proposed approach.
The following theorem guarantees the navigation of the agents between RoI and thereafter we will propose algorithms computing the corresponding transition times.
Theorem 1 Suppose that Assumptions 1-3 hold. Suppose that the robots start at time t i,s ≥ 0 from the RoI R i,s and they need to be navigated to RoI R i,d  Feasibility Analysis: It can be shown that recursive feasibility is established and it implies subsequent feasibility. The proof of this part is similar to the feasibility proof of (Filotheou et al. (2018), Theorem 2, Sec. 4, p. 12).
Convergence Analysis: Recall that: Then, we get: which, by using the fact that: as well as the bounds from (16a), (16b), become: Since only the nominal system dynamics (9a)-(9b) are used for the online computation of the control action u i (s) ∈ U i , s ∈ [t k , t k + T ] through the DFHOCP (12a)-(12d), by invoking nominal NMPC stability results found on Chen and Allgöwer (1998), it can be shown that there exist class KL functions β i , such that: By combining (20a)-(20b) with (21) we get: for every t ∈ R ≥0 . The latter inequalities leads to the conclusion of the proof.

Discrete system abstraction
Theorem 1 implies that for each robot i ∈ [N ] with kinematics/dynamics as in (1a), (1b), starting from the RoI R i,s at time t i,s , is driven by the controller (10) towards a desired RoI R i,d , while all state, input and transient constraints are satisfied. Hereafter, we provide an algorithm for constructing the WTS of each agent. By observing (22a) and taking into account Assumption 3, it holds that there exists a time instant t i,d such that the volume of robot i will be included strictly within the RoI R i,d . Furthermore, due to the fact that we have knowledge of the nominal dynamics and the MITL tasks ϕ i are independent for each robot, for the computation of the time t i,d an offline computer simulation of the DFHOCP (12a)-(12d) with state constraints as: is conducted. In particular, (23) captures constraints regarding the navigation of robot i from RoI R i,s to RoI R i,d without intersecting with any other RoI of the workspace.
It should be noted that if any collision is about to occur in real-time when the robots are executing the on-line control actions, the transition time between the RoI will be different. In order to overcome the aforementioned issue, we will provide thereafter an algorithm that monitors the collision offline and updates the transition times appropriately. Then, the process of computing t i,d is described in Algorithm 3. The abstraction that captures the dynamics of each robot into a WTS is given through the following definition.

Definition 14
The motion of robot i in the workspace W is modeled by the WTS where: R z is the set of states of the robot that contains all the RoI of the workspace W; -S init i ⊆ S i is a set of initial states defined by the robot' s initial position x i (0) in the workspace; -Act i is the set of actions containing the union of all feedback controllers (10) which can navigate the robot i between RoI; there exist feedback control law u i ∈ Act i as in (10) which can drive the robot from the region R i,s to the region R i,d without intersecting with any other RoI of the workspace; t i is the time weight as given in (7) and it is computed by Algorithm 2; -L i is the labeling function as given in (6); -and Γ i is the set of atomic propositions imposed by Problem 1.
The aforementioned WTS of each robot allows us to work directly at the discrete level and design a sequence of feedback controllers as in (10) that solve Problem 1. By construction, each timed run produced by the WTS T i , where the notion of timed run is given in Definition 8, is associated with the trajectory x i (t) of the system (1a)-(1b), as given in Definition 12. Hence, if a timed run of T i of each robot i ∈ [N ] satisfying the given MITL formula ϕ i is found, a desired timed word of the original system, and hence a trajectory x i (t) that is a solution to Problem 1 is found.

Control synthesis
Figure 3 depicts a framework under which a sequence of feedback control laws u i (x i , v i ) of each robot that guarantees the satisfaction of the MITL formula ϕ i can be computed. First, a TBA A i that accepts all the timed runs satisfying the specification formula ϕ i is constructed. Second, a product between the WTS T i given in Definition 14 and the TBA A i is computed which gives the product WTS T i . By performing graph search to the product WTS T i , a timed run that satisfies the MITL formula ϕ can be found. For more details regarding the control synthesis procedure we refer to our previous work (Nikou et al. 2016. In view of Algorithm 3, (23) and the offline plan computation, it is possible that while each agent is executing online its individual actions and transits between RoI, there might be a cluster of agents that avoid collision between each other.
In such a scenario, the online feedback control law avoids the possible collisions, but the navigation time between the RoI will have been different that the one computed by Algorithm 3. In order to resolve this, we propose an offline collision detection algorithm (see Algorithm 4) which detects the cluster of agent that will avoid potential collision when the plan of each agent is executed and updates the transition times between RoI of each agent appropriately.
More specifically, the input to Algorithm 4 is the transition times of each agent and the output is the updated realistic transition times denoted by t real i,d as well as the formula bounds relaxation. The function computePlanAgent(i) computes the sequence of RoI that agent i needs to follow in order to satisfy the formula. The function executePlanAgent(i) executes a simulated plan for each agent. Then, by using a monitoring function collisionClusterMonitoring, the cluster of the agents that are colliding can be detected. Then, we need to update the transition times of each of the colliding agents by a term which models the time duration of the maneuvering that the corresponding agent is performing in order to avoid the collision. This time is denoted in Algorithm 4 by T i maneuver . By finding the maximum of the aforementioned times, the time bounds of the MITL formula of each agent are relaxed. The function relaxBounds(ϕ i , max i ) updates each formula time interval of the form [a, b], a > b ≥ 0, to [a, b + max i ].

Proposition 1
The solution that it is obtained from the controller synthesis procedure provides a sequence of feedback control laws u i (x i , v i ) as in (10) that guarantees the satisfaction of the formula ϕ of the robot governed by dynamics as in (1a)-(1b), thus, providing a solution to Problem 1.

Complexity analysis
The proposed framework consists of the computational complexity of the following steps: -C1: the computational complexity of the offline construction of WTS T i and graph search. In particular, the graph search is performed over the product WTS T i which has |S i | · |Q i | number of states, i.e., the multiplication between the states of the WTS (number of RoI of the workspace) and the number of states of the TBA. The complexity of the Dijskstra algorithm that is used for the graph search is: where |edges| is the number of edges of the product WTS T i . -C2: Algorithm 2 is an offline computer simulation and the computational complexity is the same with the complexity of a nominal NMPC algorithm; -C3: Algorithm 3 is an offline computer simulation of collision detection which scales with the number of agents; -C4: the DFHOCP (12a)-(12d) is the only online commutation of the proposed framework and has the same complexity with the nominal NMPC algorithm (quadratic programming optimization technique).
By taking into account that C1 is standard in timed verification, and the fact that C2, C4 have the same complexity with nominal NMPC, and C3 is a computer simulation that scales with the number of agents, the proposed approach is scalable with the number of agents.

Experimental setup and results
In this section the efficacy of the proposed framework via a real-time experiment employing N = 3 Nexus 10011 mobile robots is validated. The experiment was conducted at Smart Mobility Lab (SML) (see Fig. 1 and Smart Mobility Lab (SML) (n.d.)). By controlling the speed of each wheel, the Nexus Robot 10011 is able to move forward, backward, left, and right. The robot can also rotate clockwise and counterclockwise. In other word, it has three degrees of freedoms, i.e moving forward/backward, moving left/right and rotation. By combining the three degree of freedom, the Nexus Robot is able to move towards any direction. SML provides a motion capture system (MoCap) with 12 cameras spread across the lab. The MoCap provides the robot state vector, including pose, orientation as well as linear and angular velocities at frequency of 100Hz. The software implementation of the pro-posed control strategy was conducted in C++ under Robot Operating System (ROS) (Quigley et al. 2009). Moreover, the optimization algorithms described in this chapter are implemented by employing the NLopt Optimization library found in Johnson (2009).
The state of each robot is ] where x i,1 , x i,2 indicate the position of the robot and x i,3 its orientation. The workspace that the robots can operate in as well as a panoramic view of it is depicted in Figs. 1 and 4, respectively. The workspace is captured by the set: W:={w ∈ R 2 : |w k | ≤ 2.5, k ∈ {1, 2}}, and it contains 5 RoI which are divided as follows: -the RoI R z , z ∈ {1, 2, 3, 4} depicted with blue color in Fig. 4 which stand for the RoI that the robots are required to visit. The RoI R z , z ∈ {1, 2, 3, 4} map into the atomic propositions that model missions for each robot; -the RoI R 5 depicted with red color in Fig. 4 stands for an unsafe region that the robots should avoid collision with. It holds that L i (R 5 ) = {obs} for every i ∈ [N ].
The control input constraints of each robot are set to: where u i,1 , u i,2 stand for the linear velocities and u i,3 stands for the angular velocity. The ball that covers the volume of each robot has radius r i = 0.4m for every i ∈ [N ]. The sensing radius of each robot is d i = 2m. The robots 1, 2 and 3 are initially place in the ROI R 1 , R 2 and R 3 , respectively. The set of atomic propositions of each robot is given by:  respectively. The prediction horizon is chosen T = 2.0 sec. The tube of each robot is given by the set: The NMPC gains are set to: By using Algorithm 3 and Algorithm 4, the total transition times of the navigation of the robots between the RoI of the workspace are computed as follows: x [m]

Conclusions and future work
In this paper, a scalable framework for time-constrained planning of multi-robot systems has been proposed. Considering N robots operating in a bounded workspace which contains RoI, assigned with tasks given in MITL, a framework for efficiently designing decentralized feedback control laws that guarantee the satisfaction of the corresponding tasks has been provided. The controllers are the outcome of DFHOCP solved by each robot at each sampling time and form the actions of the WTS. By proposing high-level controller synthesis algorithms, a sequence of feedback laws for each robot can be designed. The approach is scalable since the local products are computed offline and only the DFHOCP of each robot is computed online which has complexity similar with the nominal NMPC framework. Future research directions  Fig. 7 The trajectory of robot 3 in the workspace will be devoted towards incorporating event-triggered strategies between he robots in order to save valuable actuation and sensing resources.
trol systems applications in sensor-based motion planning, marine robotics, control of multiple robotic systems, such as free flying manipulators (e.g., Underwater Vehicle Manipulator Systems) and vehicles (mobile, underwater, and aerial).