Dealing with time in the multiple traveling salespersons problem with moving targets

The multiple traveling salespersons problem with moving targets is a generalization of the classical traveling salespersons problem, where the targets (nodes or objects) are moving over time. Additionally, for each target a visibility time window is given. The task is to find routes for several salespersons so that each target is reached exactly once within its visibility time window and the sum of all traveled distances of all salespersons is minimal. We present different modeling formulations for this TSP variant. The time requirements are modeled differently in each approach. Our goal is to examine what formulation is most suitable in terms of runtime to solve the multiple traveling salespersons problem with moving targets with exact methods. Computational experiments are carried out on randomly generated test instances to compare the different modeling approaches. The results for large-scale instances show, that the best way to model time requirements is to directly insert them into a formulation with discrete time steps.


Introduction
This research deals with a dynamic variant of the traveling salesperson problem (TSP), where the targets or nodes are not fixed. This TSP generalization is called multiple traveling salespersons problem with moving targets (MTSPMT). A possible application of the MTSPMT can be found in the defense sector. An area, e.g., an airport or B Anke Stieber anke.stieber@hsu-hh.de Armin Fügenschuh fuegenschuh@b-tu.de 1 Helmut Schmidt University, Hamburg, Germany 2 Brandenburg University of Technology Cottbus-Senftenberg, Cottbus, Germany a military base, must be protected from incoming hostile rocket, artillery, or mortar (RAM) fire. Simultaneous attacks from different firing positions are considered. There is a radar system detecting targets and estimating their trajectories. With a battery of laser guns deployed within or nearby the protected area decisions have to be taken, which available laser gun to select for the countermeasures to protect the area. The selected laser gun then aims at the incoming target for a certain period of time to destroy it. We assume that each target can be destroyed by firing the same period of time regardless how far away it is. This period of time is set to one unit. Generally, the closest laser gun is assigned to a target, where "close" does not refer to the physical distance between laser gun and target, but to the angle the laser gun needs to traverse for aiming at the target. The closest laser gun is used to prefer small angles over large angles, because in case of failure (target is not destroyed) an adjustment of the laser gun is then more likely to succeed. The goal is to destroy all incoming RAM. Here, a laser gun corresponds to a salesperson and the incoming firing entities are moving targets. Each target has got a visibility time window and a salesman can only intercept a target within its respective time window. This window starts at that moment at which the target is radar-detected, its trajectory is computed and reachability by all laser guns is guaranteed. The visibility time window ends at the latest point in time where a destruction of the target is possible (before impact).
Targets move continuously on their trajectories with a certain speed value. The number of laser guns is not fixed to one, thus, we are dealing with multiple salespersons. As for the classical TSP each target must be visited once by exactly one salesperson. The objective is to minimize all traversed distances of all laserguns. Usually, the optimization goal for TSP is minimizing travel times, however in the described application the area is protected by laserguns, which are centered in the middle or equally spread over the area. The concept of catching a target as early as possible could require a lasergun to traverse a long way to intercept that target as early as possible and the lasergun has to traverse the same way back. To save this traveling time and be ready for other targets we minimize the distances. Moreover, it is important to safely destroy a target than to adjust the laser gun for a second try in case of a miss. Destruction of a target is more likely conducted when the laser traverses small angels. In our test instances we assume a destruction of a target in the first try. The restriction that each target has to be destroyed is above all and will be modeled as the demand constraints.
From a mathematical point of view, this application is an online optimization problem, where the complete data of the problem instance is not given in advance and a decision has to be taken immediately. It can be solved "offline" and new data is then integrated into the offline algorithm at runtime by a moving horizon approach. For a detailed description of the application we refer to Stieber et al. (2014).
Obviously, if we restrict the number of salesman to one, fix the position of each target to a certain point in space and extend all visibility windows to the whole considered time horizon, we obtain the classical TSP, which is NP-hard, see Garey and Johnson (1979). Thus, the MTSPMT as a generalization of the classical TSP is NP-hard, too. For a survey on the TSP we refer to Lawler et al. (1985) or Reinelt (1994). Helvig et al. (1998) addressed the Moving-Target TSP, which is the MTSPMT restricted to one salesperson. Possible applications for the Moving-Target TSP are a supply ship, that resupplies patrolling boats or an airplane that must intercept a number of mobile ground units. The authors also addressed the Multi-Pursuer Moving-Target TSP with Resupply, where multiple pursuers are considered and each pursuer must return to the origin for resupply after intercepting a target.
The MTSPMT is similar to the Vehicle Routing Problem (VRP) with timedependent traveling times. However, in the MTSPMT, there is no capacity restriction imposed and due to the moving targets, we have changing traveling times and distances between any pair of targets. This means the movement of a target does not only influence the length of a certain arc and thus the travel time to a certain target, but to all other targets as well. Since all targets are moving simultaneously and constantly the length of an arc has two degrees of freedom, i.e., the length of an arc is determined by the time the arc is entered and the time the arc is left. Both these times exactly correspond to two spatial positions of the incident targets. Thus, the MTSPMT has varying distances and varying travel times between two targets and both of these do not correspond to each other since waiting is permitted. Assuming a target approaching a salesperson, the salesperson moves the smallest distance to the trajectory of the target with maximum speed and possibly waits there to catch the target, this results in a smaller average speed of the salesperson. The other case is, when the target is heading away from the salesperson, then it is better to catch the target as early as possible and without waiting, here the average speed is the maximum speed. The change in distances between targets is different compared to time-dependent VRP (TDVRP) as well as minimizing the traveled distance opposed to minimizing the traveled time.
In this research paper we investigate the time aspect of the MTSPMT in modeling. Based on different ways of integrating time into the model formulation, four different variants are presented and compared regarding performance. Two model approaches have already been published by Stieber et al. (2014), the time-discrete model TD and the time-continuous model TC. The latter one concerning continuous times, is the most precise model variant. Having a time discretization the objective function value of TD cannot be less than the objective of TC. However, the price for the preciseness of continuous time modeling usually is a higher computational burden. In addition, we generate time-free models. For these models we relax the time requirements completely and use subprograms to check and create time feasibility. These feasibility checkers are either based on discrete time steps or on a continuous time formulation. The two resulting models are called time-free model with time-discrete feasibility checking (TFTD) and time-free model with time-continuous feasibility checking (TFTC). The computational complexity for feasibility checking in subproblems is less complex than for TD respectively TC. The idea behind the time-free variants is to investigate if the computational complexity for TFTD and TFTC can be reduced compared to the models "with" time. In the discrete case we assume, that this could be true for a sufficient fine discretization.
The contributions of this research paper are in detail: 1. The TC model from Stieber et al. (2014) is tightened, such that the distance coefficients are only dependent on the arrival times of the salespersons (departure time is equal to the former arrival time). 2. We reformulate the TC model as a second-order cone program (SOCP), which can be solved by any standard MILP solver. This procedure requires some assumptions on the target movements.
3. The time-free modeling approach is proposed, which is a two-stage variant. In the first stage time is completely relaxed. A time-feasible solution is computed in a second stage by sub-problems integrating not all but necessary time restrictions. In case of a discrete feasibility checker we obtain the TFTD model and in the continuous case we have the TFTC model, respectively. 4. The optimal solutions of the two-stage models are computed using a branch-andbound framework. We present an algorithm, that uses a callback function (an advanced feature of CPLEX). 5. Computational experiments are carried out using randomly generated and feasible problem instances with a varying number of targets, salespersons, and time steps. Usually, trajectories can have arbitrary shapes, however since the continuous approaches can only handle linear trajectories to be solved by the standard mixedinteger linear programming (MILP) solver CPLEX, we restrict our instances to straight lines.
The aim of this research is to find the best formulation to solve MTSPMT instances with regard to performance (in terms of CPU time). The remainder of this article is organized as follows. In Sect. 2 we provide a survey of the relevant literature. The two model variants TD and TC, which incorporate the time as a discrete and a continuous concept are recalled and extended in Sect. 3. The time-free approach (TFTD and TFTC) and its solution method are addressed in Sects. 4 and 5. The computational experiments are presented in Sect. 6 and we draw conclusions in Sect. 7.

Related work
This research article deals with a generalization of the classical TSP by considering multiple salespersons and moving targets with time windows. For a survey on the classical TSP we refer to Reinelt (1994). There are a number of articles in the literature that deal with dynamical TSP generalizations. However, the TSP literature concerning moving targets is less developed (Englot et al. 2013;Helvig et al. 1998Helvig et al. , 2003Jiang et al. 2005;Jindal et al. 2011). The first articles addressing the traveling salesperson problem with moving targets (TSPMT) are Helvig et al. (1998Helvig et al. ( , 2003. The authors present an exact and an approximation algorithm for the one-dimensional case, where targets and salespersons move on a straight line. Other specific variants of the TSPMT with restrictions on target speed, movement, and number of targets are also addressed, e.g., the targets move towards the origin (starting point of the salesperson) or never reach the origin. The proposed algorithms are not applicable to the general case of the TSPMT. Specific variants of the TSPMT and the TSPMT with resupply are also addressed by Jiang et al. (2005), Jindal et al. (2011 and Englot et al. (2013). Solution approaches are mainly heuristics such as genetic algorithms. Asahiro (2006) investigated the ball collecting problem (BCP), where n balls move in the Euclidean plane and k robots have to collect them. Each ball starts at a certain location and moves with constant speed in a certain direction. The robots start from the depot and move on straight track-lines, which go through the depot. The robots able to move in both directions have to collect all balls. Different scenarios with respect to computational complexity are studied such as: Can k robots collect all n balls?
Time-dependent traveling salesperson problems (TDTSP) considered in Abeledo et al. (2013), Ahrens (2015), Albiach et al. (2008), Picard and Queyranne (1978) and time-dependent vehicle routing problems (TDVRP) addressed in Fleischmann et al. (2004), Haghani and Jung (2005), Ichoua et al. (2003), Jung and Haghani (2001), Malandraki and Daskin (1992) are very similar to the MTSPMT. The main difference to the TDTSP and TDVRP is, that the travel times in the MTSPMT vary continuously in the whole considered network instead of usually in a part of the network (e.g., congestion) or at a certain period of time (rush hour traffic). Furthermore, for the MTSPMT not only travel times change over time, but also Euclidean distances between any two targets. This is due to the continuous movement of the targets. Finally, VRP deal with restrictions on capacities of the vehicles, which is not the case for the MTSPMT. For a comprehensive literature overview of TDTSP and TDVRP we refer to Gendreau et al. (2015). In the following we provide some articles in more detail.
Ahrens (2015) addresses a stochastic variant of the dynamic traveling salesperson problem. Problem instances were generated from static TSP datasets, where the movement of each target is modeled by a Gaussian-distributed random distance vector that is added to its location in the 2-d space at each time step. The instances were solved by heuristics in an online calculation. The author examined the applicability of standard (static) TSP solvers to dynamic instances. Computational experiments were carried out with the Tour Construction Framework which combines global and local heuristics and the TSP tour construction heuristic "nearest neighbor".
There is another TSP variant, the equality generalized TSP (E-GTSP), which is very similar to the MTSPMT. Assuming discrete time steps for the MTSPMT, we obtain a copy of each target for each time step in the respective time window. All copies of a target are called cluster. To this end, we have a set of clusters and the task is to visit each cluster exactly once. Arcs between target copies of two different clusters are realized, when a salesperson starting from the first target copy (at the respective time step) is able to reach the second target copy with at most maximum speed. Obviously, there are no anti-parallel arcs. In this context an instance of the TSPMT (MTSPMT with one salesperson) can be formulated as an instance of the asymmetrical E-GTSP. With the restriction that at least one element per cluster has to be visited we obtain the generalized TSP (GTSP). This problem is also known as set TSP or group TSP. The asymmetrical GTSP and E-GTSP have been investigated in Laporte et al. (1987). Noon and Bean (1993) showed, that any problem that can be modeled as a GTSP can be transformed into an asymmetrical TSP. The equivalent problem with more than one salesperson is the generalized vehicle routing problem (GVRP), see Ghiani and Improta (2000). For the symmetrical case there is a recent contribution by Sundar and Rathinam (2016). The authors addressed the generalized multiple depot TSP (GMDTSP) and provide a polyhedral study for this problem class. Presenting a branchand-bound approach that was also realized by callbacks of CPLEX, computational experiments were carried out for 14 to 105 targets (which correspond to the number of target copies in our notation) and a maximum number of 21 clusters (which correspond to the number of targets in our notation). Picard and Queyranne (1978) address the TDTSP. They consider a complete graph and travel times depending on the position of the target in the tour. Thus, the number of targets is equal to the number of discrete time steps and the number of travel time values is also discrete. The authors use shortest paths in a multipartite network for the solution with branch-and-bound and relaxation methods. Abeledo et al. (2013) is based on the formulation given by Picard and Queyranne (1978) and provides a study of the TDTSP polytope and computational experiments with their branch-and-cut-and-price algorithm.
In the case of multiple salespersons as well as capacity restrictions and time windows the TDVRP is concerned. Due to the computational complexity of TDVRP scientific contributions mainly focus on heuristic approaches (Fleischmann et al. 2004;Haghani and Jung 2005;Ichoua et al. 2003;Jung and Haghani 2001;Mancini 2014). Very few literature is published concerning exact methods, some of them are e.g., Albiach et al. (2008) and Soler et al. (2009). Based on the work of Noon and Bean (1993) they provide a theoretical work of transforming an instance of the asymmetric TDTSP with time windows or the TDVRP with time windows into an instance of the Asymmetric TSP (ATSP) or VRP (AVRP) respectively. Soler et al. (2009) is a generalization of the first one, because it deals with multiple salespersons. The conversions are carried out by transforming the underlying graph into an instance of the generalized TSP or VRP respectively, and then into the ATSP and the AVRP. Albiach et al. (2008) also performed computational experiments using an exact algorithm for the Mixed General Routing Problem. Instances with up to 222 vertices and one salesperson were considered, where the number of arcs is between 5 and 20% of the arcs of a complete graph. Haghani and Jung (2005); Jung and Haghani (2001) report on heuristic and exact solution methods for small and mid-sized instances and for bigger instances a lowerbound procedure was used. Here, instances with 5 to 30 nodes and 10, 15 and 30 times steps were considered. Woensel et al. (2007) introduced a new approach to model potential traffic congestion. This model is based on a queuing approach to traffic flows. Computational experiments were carried out for small instances with 10 nodes using explicit enumeration and for up to 100 nodes with an ant colony heuristic.
Another similar problem is the VRP with roaming delivery locations (VRPRDL), see Reyes et al. (2017). This article focuses on a delivery to a customers car (trunk), which can be at different locations to different non-overlapping time intervals. For the MTSPMT that would mean, that the trajectories are cut in non-overlapping pieces and only certain pieces are considered.
In summary there are only few research contributions to the MTSPMT. Articles on the MTSPMT or the TSPMT mainly consider specific problems or impose restrictions on the movement of the targets. Since solving time-dependent problems is very timeconsuming, exact solution methods can handle only small and mid-sized instances. The challenge of the MTSPMT is the continuous movement of all targets, resulting in varying travel times and distances. In our research we confront different modeling approaches and examine their computation times on randomly generated test instances. Here, we concentrate on an offline approach for solving MTSPMT instances to global optimality. In real-world online applications this can be used as a subroutine within a moving horizon approach.

Mathematical models with time
Some basic notation is introduced to formulate the MTSPMT as a mixed-integer optimization problem. We assume a finite time horizon [0, T ]. The operating space is a square in the R 2 with a side length S ∈ R + , but the following models may also be extended to the R 3 . Let W := {1, . . . , w} be a set of salespersons. All salespersons start their tour at the same depot location o. Let V := {1, . . . , n} be a set of nodes (targets or customers), then V o := V ∪ {o} and A ⊆ V o × V be a set of arcs. The length of an arc depends on the time the arc is traversed and varies over time, since the nodes are moving. Thus, the distance for salesperson k traveling from node i to node j starting at a time in i and arriving at a time in j is given by the function where p i (s) and p j (t) are the respective locations of the targets at the times s and t and v is the maximum speed value of all salespersons. The arrival time of any salesperson at a target is equal to his departure time at the same target, because waiting times are included in the traveling times. The depot o has the entire time horizon as visibility window. All arcs with a finite length can be traveled with maximum speed (v) plus potential waiting time. The goal is to reach each target from V by exactly one salesperson such that the sum of all traveled distances of all salespersons is minimized.

A time-discrete model
The time-discrete MILP formulation has already been addressed in Stieber et al. (2014). For the convenience of the reader it is concisely presented in the sequel. The model consists of a multi-commodity flow formulation embedded in a time expanded network. For an explanation of time expanded networks see Ford and Fulkerson (1958). Here, we have discretized the whole time horizon in time steps. For each time step there is a time layer with a copy of each target that is visible in this time step. Let m be an integer number. The step size is defined by Δ := T /m. Then the set of all time steps is T := {0, . . . , m}. WithÃ we denote the set of arcs in the time-expanded network. An arc is only considered between different layers: the arc (i, p, j, q) ∈Ã, (i, j) ∈ A, p, q ∈ T connects target i in layer p with target j in layer q. That means a salesperson departs in i at time step p and arrives in j at time step q. For each salesperson the arrival time at any node in V is equal to the departure time in the same node, since the waiting time is included in the traveling time. Having discrete time steps p, q ∈ T we are able to evaluate the distance function c for arcs at these times We introduce a family of binary decision variables x p,q i, j,k ∈ {0, 1}. Here, x p,q i, j,k = 1 represents the decision of sending salesperson k from i to j, departing at time step p in i and arriving in j at time step q. The objective function is to minimize the total traveled distances of all salespersons: The demand constraint requires, that each node j ∈ V must be visited once by exactly one salesperson: Each salesperson k ∈ W can only start once from the depot: The following flow constraints ensure the feasibility of time. Conservation is ensured at each node of the salesperson tour except for the last one, this can be regarded as the sink of the flow: Summing up, we solve the following optimization problem: The restrictions of the visibility time windows are embedded in the arcs, thus, arcs violating an involved time window constraint have infinite length. Usually TSP have to incorporate subtour elimination constraints. In fact, this point is included by the inherent time dependency. Since time evolves, there is no cycle in the underlying time expanded network and consequently subtour elimination constraints are not needed. Furthermore, the presented model (6) is not restricted to special shapes of the target trajectories. It can handle any trajectory, see for example Stieber and Fügenschuh (2018). Likewise, there is no need to restrict the speed function of the targets, the model is able to deal with varying target speeds. Obviously, an adverse effect of the above formulation is the increased problem size in case of a higher number of time steps (better accuracy), leading to a higher computational burden.

A time-continuous model
The salespersons may intercept the targets at any point of their trajectories, thus, the time constraints have to be modelled by continuous variables. As opposed to the discrete model, the decision whether a salesperson is able to reach the destination target of a direct link with its maximum speed has to be integrated into the model formulation. This is realized by applying the big-M method (similar to the Miller et al. (1960) (MTZ) constraints for the TSP) containing continuous time variables and time restrictions. An obvious consequence of this approach is the fact, that we obtain an objective function value with best accuracy.
We introduce a family of binary decision variables x i, j,k ∈ {0, 1}, where x i, j,k = 1 represents the decision of sending salesperson k from i to j (independently of the time). Additionally, continuous time variables t i,k ∈ R are defined to describe the arrival time of salesperson k in node i. Here, the set of arcs is A with the arc length defined by (1). Now, we are able to formulate the continuous model.
The objective function is to minimize the total traveled distances of all salespersons: Each node j must be visited once by exactly one salesperson: Each salesperson k can only start once from the depot: Flow conservation is ensured at each node of the salesperson tour except for the last one (sink): The following MTZ-type constraints guarantee time-feasibility, that means, if salesperson k moves from i to j and arrives at t i,k in i, he cannot be earlier in j than t i,k plus the time he needs to travel from the position of i at t i,k to the position of j at t j,k using maximum speed v. The time horizon T is the so called big-M constant in the big-M constraints For the visibility time windows, the time variables have to satisfy the following bounds: Summarized, we aim to solve the following optimization problem: Basically, this model is the same as the time continuous model in Stieber et al. (2014). The only changes made are some modeling refinements, e.g., departure and arrival time variables are replaced by arrival variables because possible waiting is included, constraints (9) are only considered for depot node o and bounds for the visibility time windows are added.
The presented time-continuous formulation of the MTSPMT is based on an arbitrary nonlinear continuous function c i, j,k for the distance between two distinct moving nodes. In order to apply a standard MILP solver such as CPLEX, we have to restrict the movement of the targets. To this end, we assume the trajectories to be straight lines and the speed of each target to be constant. To simplify matters, we use the same constant speed for all targets. Then c i, j,k represents the Euclidean distance between two points on two straight lines. With this, the above presented optimization problem (13) can be handled as a second order cone program (SOCP), see Boyd and Vandenberghe (2004) for an overview. The constraints (11) define the cones and make the set of feasible solutions to be convex. In the sequel we present the adjusted SOCP formulation that we use for CPLEX.
We introduce the real auxiliary variables c x i, j,k and c y i, j,k for the x-and ycomponents of the Euclidean distance c i, j,k (t i,k , t j,k ); a i, j,k for the sum and finally a i, j,k for the right hand side of the cone definition. Hence, we can formulate the following SOCP. The objective function is The constraints (8), (9), (10) and (12) remain unchanged. The constraints (11), that contain quadratic terms, are transformed into the following family of auxiliary conditions, where the trajectory of a target is represented by the convex combination of its start (x i , y i ) and end point ( The following constraints describe the condition of the uniform movement of the targets: The next conditions are needed to formulate the cone constraints: where R := √ 2S ( √ 2S is the diagonal of the square). Finally, the cone constraints are given as: Summarized, the transformed SOCP reads the following: min{(14) | (8), (9), (10), (12), (15), (16), (17), (18), (19), (20), (21) Having this model formulation, a salesperson is able to intercept a moving target at any possible point on its trajectory. In general, the objective function value of (22) is less than or equal to the objective function value of (6) for the same instance. However, the assumptions we had to make are severe, the model is only applicable to linear trajectories with constant speeds. In contrast to this, the time-discrete model can handle trajectories of any shape and speed. Moreover, model formulations based on big-M constraints usually have a weak linear programming relaxation and thus, more nodes have to be examined in the branch and bound tree, which slows down the solution process Codato and Fischetti (2004). Additionally, due to the big-M constants numerical instabilities can occur in the solution procedure if the constants are not tight enough. In (22) there are two of such constants, see (19) and (20). Obviously, a comparison of both modeling approaches (discrete and continuous) is difficult, because of the different characteristics.

Time relaxations
For the next model formulations we concentrate on the time aspect in a different way namely on a time-free model. This means relaxing the time completely and later reintegrating parts of the time restrictions. The technique was first introduced by Fügenschuh et al. (2013). They successfully applied the method to the scheduling and routing of planes and tourist travel requests during fly-in safaris, which essentially is a VRPTW with pickup and delivery.
First of all, we perform a projection of (6) from the time-discrete variable spacẽ The time-free counterpart of variables x p,q i, j,k is simply x i, j,k . While reducing all time-discrete arcs between two different targets to only one arc, we have to find a suitable length for this new arc to preserve the optimal solution. Thus, the minimum length of all time-discrete arcs between two targets is used as the length for the new arc. That means, for any salesperson k and any two nodes i and j the distance coefficient c i, j,k is taken as With this, we have the following model in the time-free space: This is a classical multi-commodity flow problem, which is easy to solve by standard MILP solvers. The optimal solution of the time-free model (24) is a lower bound to (6), because the distance coefficients are computed by minimization. However, the reconstruction of a time-feasible solution from a time-free solution is not straightforward and not every time-free solution yields a time-feasible solution. For this purpose we have to examine every time-free solution, that we encounter during the solution process for time-feasibility. The examination is embedded in a branch-and-bound framework in order to prune nodes whose lower bounds exceed the current best solution value.
Given an optimal solution of (24), feasible times at which salespersons reach the targets have to be constructed from it. Assuming the objective function value of the constructed time-feasible solution is equal to the objective function value of the timefree solution, then the constructed solution is proven global optimal for (6). However, this rarely happens. It is more likely, that a time-free solution is infeasible with respect to the time constraints. Thus, apart from an optimal time-free solution, we have to investigate also further feasible time-free solutions in the branch-and-bound process. That means (24) serves as the master problem in the branch-and-bound framework. For any solution of the master problem, we try to construct a feasible counterpart with respect to the given time constraints. If the objective function value of this solution is better than previously found ones, it is stored. Then, the current time-free solution is treated as infeasible and cut off in the branch-and-bound process. Obviously, the generated cuts should take into account all possible salespersons permutations in order to prevent a repetition of the same time-free solution due to salesperson symmetries. In case that there is no time-feasible counterpart for a time-free solution, we also have to cut off the time-free solution as well. In this way we are able to check all time free solutions. Additionally, the branch-and-bound tree can be pruned by exploiting lower bounds. This solution method is realized using the callback functionality of CPLEX. Analyzing the time-free solution before the construction phase also leads to an advantage in processing. For this we refer to Sect. 5. For now, we concentrate on the construction of a time-feasible solution from a time-free solution.
Given a time-free solution, according to the variables set to one, we obtain a set of arcs for each salesperson and call it a pretour. Note, a pretour may be disconnected and the pretour of some salespersons can be empty. A salesperson is called active, if its pretour is not empty. The pretours of all active salespersons are extracted from the solution. A pretour is called feasible if it is a Hamiltonian path in the induced sub-graph starting at the depot o. For each feasible pretour a time-feasible tour is required.

A time relaxation with discrete time feasibility checking
Assuming we have a non-empty pretour. The construction of a time-feasible tour is done by setting up a checking sub-MILP. For the sub-MILP we consider all the time restrictions of the salesperson, who belongs to the pretour. In more detail, we include only those time-dependent arcs, that have a time-free counterpart in the given pretour meaning the corresponding solution variable is nonzero. In the case that the checking sub-MILP for each active salesperson results in a time-feasible tour an overall timefeasible tour is found. The best overall time-feasible tour solves (6).
The time-feasibility checking MILP is set up as a minimum-cost flow problem from a source to a sink for each active salesperson separately. For an active salesperson k its pretour defines the sequence of targets, k has to visit. Let us assume n k is the number of targets k has to visit and (v 1 , v 2 , . . . , v n k ) is the sequence. Additionally depot position o is considered as the source of the flow and we extend the sequence by a node d, which serves as the sink. Thus, V k = {o = v 0 , v 1 , v 2 , . . . , v n k , v n k +1 = d} denotes the sequence of nodes considered for the minimum-cost flow problem of salesperson k.
Since we have to consider only those time-expanded arcs, that correspond to an arc of the pretour, the checking MILP consists of all arcs, that go from depot position o to distinct positions of v 1 and from distinct positions of v 1 to distinct positions of v 2 and so on. Additionally, we have to introduce artificial time-discrete arcs from distinct positions of v n k to the sink d. That means, for each arc, that ends in v n k at time step p an arc is introduced from (v n k , p) to (d, p + 1). The distance of all arcs between v n k and d is zero. We denote this set of arcs for salesperson k by A k . According to the time-discrete model, we introduce binary decision variables x p,q v i ,v i+1 describing the decision of sending salesperson k from target v i to its successor v i+1 starting at time step p and arriving at time step q. Then, the time-feasibility checking MILP for an active salesperson k is formulated as follows: The optimization problem (25) aims to find the shortest path from o to d. This kind of optimization problem can be solved in polynomial time by, e.g., Dijkstra's algorithm Dijkstra (1959). In case the checking MILP (25) results in a time-feasible tour for each active salesperson, we are able to construct a total time-feasible solution for (6) by combining all salesperson tours. If the checking MILP results in an infeasible solution for any of the active salespersons the construction process is aborted for the corresponding time-free solution.

A time relaxation with continuous time feasibility checking
According to our time-continuous model (13), there is also a time-relaxed variant with a continuous time-feasibility checking sub-MILP. For this variant we cannot use (24) directly, since its distance coefficients c i, j,k in the objective function are dependent on a discretization and on the time-discrete arcs, see (23). In a continuous model, these coefficients are not valid. Here, we have to replace the minimum time-expanded arc length by the real minimum length between any two trajectories. Thus, the distance coefficients c i, j,k are computed as follows: For an endpoint p j e of trajectory j we seek for the time interval [t i 1 , t i 2 ] for i, such that Then the continuous time-relaxed model is (24) with (26). Having this, we can formulate the feasibility checking sub-MILP for the continuous case. As for the continuous model (22) with the assumption of linear trajectories and constant target speed, the checking sub-MILP for an active salesperson k can be modeled as a quadratic program. Here, we again use continuous time variables t v i to define the arrival time in v i . With the target sequence {o = v 0 , v 1 , . . . , v n k } and the corresponding set of arcs A k , we obtain the checking MILP for k as follows: Here, the function has to be at most v, the maximum speed. In contrast to the proposed TC model (13), the time-free model with the continuous checking MILP (TFTC) does not contain any big-M constant.

Implementational details
In this section we address the time-free problem (24). In case of a discretization of time, (23) is needed for the calculation of the arc lengths. In the continuous case we use (26) instead. The model (24) (with either (23) or (26)) is called master problem and the solution procedure is embedded in a branch-and-bound framework. To check the solutions of the master problem and to produce the best time-feasible solution we use the callback utilities of CPLEX. We implement an instance of the BranchCallback and the LazyConstraintCallback. The latter one is a user-written callback to solve mixed-integer linear programs. Each time a candidate feasible solution of the master problem is found at a node in the branch-and-bound tree the LazyConstraintCallback is invoked and violated constraints are applied. Those constraints are applied in a "lazy" fashion, i.e., only if they are violated.
In the LazyConstraintCallback we include a validation check of the feasible solution of the master problem and the construction of feasible times according to (25) and (27). Our callback algorithm is presented in Algorithm 1. A more detailed description of individual steps is given in the sequel. In line 2-3 the objective function value of the master problem is compared to the best objective function value found so far. This value serves as a lower bound. If the current value cannot beat the best time-feasible objective function value found so far, the callback will quit. Afterwards, the current node is pruned in the BranchCallback, which is subsequently invoked. The task of the BranchCallback is pruning. Branching decisions are left to the default way of CPLEX. Since the master problem is a multi-commodity flow problem, each solution represents a feasible flow but not necessarily a feasible tour, due to the lack of subtour elimination constraints. This leads to line 5-7 in order to check each active pretour against cycles. In case a cycle C ⊂ A is found, it will be cut off by the following constraints: The last validation check performed is to exploit the visibility time windows along the pretour and test if there are any feasible times to arrive at the last node of the pretour, see line 9-11. This interval propagation is visualized in Fig. 1. Let us consider any target v of the pretour. Then we investigate if there are time steps in the visibility window of v, which can be used to arrive from its predecessor (starting in its time window) and to leave for its successor (arriving in its time window). The resulting time interval serves as the new visibility window for v and it is propagated to the succeeding target the same way. Since for the depot node o the whole time horizon is considered as visibility window, the whole visibility window for v 1 can be used as its arrival interval. Regarding the visibility window of v 2 we obtain a possible departure interval for v 1 with feasible time steps to depart for v 2 . The next step is to generate a cut set from both intervals of v 1 . This procedure is continued to the last node of the pretour. If at the last node or any node before the intersection interval is empty, this indicates that the pretour we started with is not time-feasible. In the given example the arrival interval of node v 1 is the discrete interval [2, 5]. The departure interval to leave for v 2 with a speed of at most v is [2, 3] with a possible arrival in v 2 in [3,4]. A departure after 3 in v 1 cannot reach v 2 within its visibility window of [0, 4]. An earlier departure is not possible due to the visibility window of v 1 . Thus, the intersection of [2, 5] and [2, 3] leads to [2, 3], which is considered Fig. 2 Interval propagation. This figure presents the depot position o and a given salesperson tour consisting of the sequence o, v 1 , v 2 . Each target is visualized by its trajectory and the corresponding discrete time steps, which are given by black numbers. The grey area describes the discrete departure and arrival interval between consecutive nodes. The light grey area is the extended departure and arrival interval when considering continuous times as the new visibility window for v 1 . The procedure of time interval propagation has to be continued to the following targets, but it is important to use the updated time window for the predecessor node. In case there is an empty cut set at any target of the pretour, we have an infeasible interval. This means, there is no time-feasible tour for the current pretour and we can reject the current time-free solution. This is done by adding the following global cut to the master problem. For an infeasible pretour o = v 0 to v n s , let P ⊂ A be the corresponding sequence of arcs, then, we formulate the following constraint for each salesperson to cut off this pretour: Note, that for any pair of anti-parallel arcs (i, j) and ( j, i) we have This means, for a variable in (29), which is bounded by 1, we can add the variable of the anti-parallel arc at no cost (30). Thus, lifting |P| − 1 anti-parallel arcs to the cut (29), leads to: Interval propagation for the continuous case is done in a similar way. In general, the resulting arrival and departure intervals are slightly larger, see Fig. 2.
The travel time is not rounded to the next time step, its exact value is computed using maximum salesperson speed and the Euclidean distance between the corresponding positions of the targets. With this a salesperson is able to arrive earlier and to depart later compared to the case with discrete time steps. The computation of the new time window of a target is again an intersection of arrival and the departure interval. The exact arrival interval depends on the previously updated time interval of the predecessor node and the departure interval depends on the visibility window of the successor. In Fig. 2 the arrival interval of v 1 is the whole visibility window, since the predecessor node is the depot. For the departure interval of v 1 the values t dep min , t dep max , t arr min and t arr max have to be identified. Obviously, we have t dep min = t v 1 and t arr max = t v 2 . Assuming a constant maximum speed we take the equation of uniform movement to compute t dep max , which is the latest possible departure in v 1 : have to be computed accordingly for v 2 and v 3 . This procedure is continued to the last node of the pretour. If at the last node or any node before an empty intersection interval has been detected, there is no time-feasible salesperson tour and the corresponding pretour is cut off for any salespersons permutation. In the feasible case the MILP (25) or (27) is set up to compute the corresponding times.
Each computed pretour and its counterpart with time are stored in a solution pool in order to prevent setting up and solving the same sub-MILP again and again for pretours occurring more than once. This is realized in Algorithm 1 in the lines 18-19 and 32. The existence of a solution pool forces us to use a single-threaded optimization instead of a multi-threaded one. The reason is, that in parallel mode it is not allowed to access data, which is not local. Another reason against parallel mode is, that CPLEX is not deterministic due to a different order of callback invocations for multiple runs of the same instance with the same parameter setting on the same platform. This would lead to an undesired, non-deterministic variability in the runtimes.
Finally, a time-feasible solution is found, when there is a time-feasible tour for each active salesperson. This solution is returned to the master problem and the best objective function value found so far is saved. Then a global cut is added to the master problem in order to prevent a repetition of the current solution by other salesperson permutations. Summing up, we have the time-discrete model (6) (TD), the time-continuous model (13) (TC), the time-free model (24) with time-discrete feasibility checking (25) according to Algorithm 1 (TFTD) and the time-free model (24) with time-continuous feasibility checking (27) according to Algorithm 1 (TFTC).

Computational experiments
The presented models have either a discrete or a continuous handling of time. Generally, it depends on the application or on the computational complexity which approach to choose. Due to the specific characteristics of discrete and continuous models a mutual comparison is not easy. Consider for example the shape of the trajectories. The time-continuous models are restricted to straight lines to be solvable, while the time-discrete models are not. Another point is the difference in objective function values between discrete and continuous approaches. Obviously the objective function value of the continuous approaches is always less or equal than the objective function value of the discrete models, but with this the computational burden will also be higher. Thus, we perform a runtime comparison of the discrete approaches and one of the continuous approaches. As a basis we concentrate on randomly generated linear target trajectories with constant speed for all our modeling variants.

Instance generation
We use a set of randomly generated test instances. A test instance is specified by number of salespersons, number of moving targets and distance of time steps. The operating space is a square of size 500 length units and the trajectories are created with random lengths between 100 and 400 length units. If two hostile RAM meet in the air by chance, they would be deflected or destroy each other before their destinations are reached. Since such a szenario is very unlikely and due to visualization clarity we do not support such a situation in the trajectory generation by prohibiting any pairwise intersections. This is not an assumption to the models, all four variants can cope with trajectory intersections. The targets are assigned a constant speed value of 32 length units per time unit, while the salespersons can travel at most 200 length units in one time unit. In Stieber et al. (2014) we observed, that instances have a higher complexity, if the difference between target speed and salespersons speed is high. In this case the number of possible tours of the salespersons rises with an increased speed difference. Obviously, a power of two for the target speed is required to be able to create finer time-discretizations of the trajectories by introducing new time steps right in the middle of two existing ones. Following this, we create 3 different levels of time discretization. In particular, for the first discretization level called D32 a time step is introduced to the trajectories every 32 length units (target speed). Then, the same instances are generated with a two times finer discretization (D16). Here, the step size between two consecutive time steps is 16 length units and for the 4 times finer discretization (D8) time steps are included every 8 length units. Obviously, the size of the instances in terms of number of variables and constraints is increased with a higher number of time steps.
The current research deals with solvable test instances, that means no target will reach its upper time limit before being visited by a salesperson. This is achieved by assigning the visibility time windows to the targets in such a way that one salesperson is able to intercept all targets one after another. In all instances salespersons start their tours at the depot position o, an initial position located in the center of the operating space. Of course all models can easily be adapted to the case of multiple depot positions. In total, instances with number of targets of 6,8,10,12,14,16,18 and 20, with number of salespersons of 1, 2, 3, 4, 5 and 6 and with discretization levels D32, D16 and D8 are created. An instance as an example with 12 targets, 2 salespersons starting from the depot in the middle and medium discretization level D16 is visualized in Fig. 3. The visibility time window is given by the numbers at the end points of the trajectories.

Computational results
The generated instances are embedded in the 2-d space, nevertheless the proposed models and methods are not restricted to the 2-d space. Furthermore our time-discrete models TD and TFTD are not restricted to linear trajectories, it is also possible to handle non-linear trajectories. The TD model and nonlinear trajectories are addressed in Stieber and Fügenschuh (2018).
All generated instances are solved with CPLEX. While the solution procedure of the time-free models TFTD and TFTC is customized by LazyConstraintCallbacks and BranchCallbacks of CPLEX, the instances modeled with TD and TC are solved without callbacks and directly by CPLEX' MILP and Barrier algorithms. The CPLEX parameters used for the optimization of the generated instances are listed in Table 1. For the time-free models, the node heuristic (HeurFreq) is turned off in order to save runtime, otherwise CPLEX would permanently check time-free solutions that are usually infeasible. Furthermore, we set the MIPEmphasis parameter to moving best bounds for the time-free models. For TD this setting would extremely slow down the computation, thus, to be fair we leave the MIPEmphasis parameter at its default value. Moreover, for the time-free master models, the cuts created by CPLEX are also turned off, since our LazyConstraintCallback is producing cuts, when checking The computational experiments were carried out on an Apple Mac Pro computer running the MacOS 10.12.6 operating system with an Intel Xeon E5 running at 3.5 GHz on 6 cores, 12 MB L3 cache, and 128 GB 1066 MHz DDR3 RAM. The version of CPLEX we used was 12.10 [7]. Our aim is to evaluate the presented models with respect to their computational times for solving the generated instances. The runtime of an instance is defined by the time, CPLEX requires to compute the global proven optimal solution, including the time needed for the callbacks. In order to compare runtimes with a time limit, we compute a comparable score sc. It takes into account the runtime, which is at most 3600 s and the remaining gap, which is at most 1. It is computed as sc = runtime 3600 + gap. The score takes values between 0 and 2. In case the score is less than 1, the optimization has finished within an hour and the gap is 0. In case the score is above 1, the optimization has aborted with a gap equal to (sc −1)·100 percent.
In the first experimental set we fix the number of salesperson (3) and the discretization level (D16) and only vary the number of targets between 6 and 20. For each number of targets 21 instances are randomly generated. The scores of all 21 instances and the related arithmetic mean for model TD and TFTD is reported in Fig. 4. In case an instance is solved to proven optimality the score only consists of the runtime part of the score (gap = 0) and is located in the lower half of the figure. In the other case the gap is positive and the score value is in the upper half of the figure at the right gap value. Note, that for reasons of better comparability the TD score values are shifted to the left (by 0.2) and the TFTD score values are shifted to the right (by 0.2) of the respective target number. Technically, the scores of TD and TFTD belong to one single number of targets and would overlap in this case, which would lead to a worse readability and comparability. The same is done for TC and TFTC and in the following visualizations. For the TD model most of the instances can be solved within 900 s, there is one instance, that needs 1980 s. The TFTD model is only fast for small instances with 6 and 8 targets. The score values of TD and TFTD are close to each other in this target range. Here are even 6 instances, that can be solved faster with TFTD than with TD. From an amount of 10 targets onward the runtimes of TFTD increases significantly, from 14 targets and above no instance can be solved within the time limit of 3600 s. Regarding the continuous cases, TC and TFTC are very similar up to 10 targets, but above 10 targets, where most of the instances cannot be solved within the time limit TFTC has got much better gaps than TC (only half of it).
The next experimental set fixes the number of targets (10) and the discretization level (D16). We vary the number of salespersons from 1 to 6. The respective score values of all four models are reported in Fig. 5. The scores of TD are low and close to each other regardless of the number of salespersons. The runtime of TFTD is sensitive to the number of salespersons, the higher the number of salespersons, the more instances cannot be solved within the time limit. A similar behavior can be observed for TC and TFTC, while the arithmetic mean is again smaller for TFTC than for TC regarding 4 salespersons and more.
In the third experimental set the number of targets (10) and the number of salespersons (3) is fixed, while the discretization level varies from D32 (coarse) via D16 (medium) to D8 (fine). Here, only discrete models are considered, because the other ones are based on continuous time variables. The scores for TD and TFTD are reported in the left graphic of Fig. 6, the right graphic is a zoomed visualization of the runtimes for TD. One can see, that the runtimes for TD increases with a higher number of time steps. Unlike TFTC, where the mean is nearly the same for all three discretization levels. For D8 there are three out of 21 instances, that can be solved faster with TFTD than with TD.
As the last experiments we want to investigate the runtime of the proposed models for a time limit of 3 s. This is important, since from the mathematical point of view the application is an online problem. Here, we set the number of targets to 8 and the discretization level to D8 and vary the number of salespersons from 1 to 6. The results are reported in Fig. 7. For the discrete models we observe, that for one and two salespersons the mean of TFTD is better (runtime is lower) than the mean of TD. This behavior changes when the number of targets is three and greater, there the mean of TD is better. In the continuous case TC is only better (lower runtime) for one salesperson, for two and more salesperson the mean of TFTC is better than the mean of TC. With the number of salespersons of three and more TC is not able to solve any of the generated instances to optimality within 3 s, where TFTC is able to solve some instances to optimality for three and four salespersons.

Conclusion
We addressed the MTSPMT, a dynamic variant of the TSP, where multiple salespersons are searching for their tours in a system with continuously moving targets. We presented four different model formulations, which can be separated by discrete and continuous time handling on one hand and on the other hand by the solution approach (direct or by time-free master plus subproblems). For the time-free variants we presented an exact branch-and-cut algorithm. Due to the different ways of modeling the variants have different characteristics. For instance, TD is sensitive to the level of discretization. Our assumption was, that TFTD might be faster than TD when the discretization is fine enough. It turned out, that the instance size also plays an important role. Thus, the assumption only holds for small instances. Best performance for larger instances is reached by TD. The continuous times variants are much more difficult, but gaps are smaller for TFTC than for TC when larger instances are concerned. Obviously, the time-free modeling variant (TFTC) has an advantage over the direct modeling (TC) so that it produces better bounds and thus better gaps. This is especially visible with a time limit of 3 s. According to the defense application, TD would be the best choice to use in a moving horizon approach with an adequate discretization level. Since TFTD has the fastest runtimes for small instances with one or two salespersons, it might be suitable for the Moving-Target TSP, where a possible application is a supply ship, that resupplies patrolling boats. For future research we want to have a closer look on online optimization. We want to test our models and solution methods as subroutines in a moving horizon approach. Another point for future work will be to investigate nonsolvable instances, that are instances, where it is not possible to reach all targets within their visibility time windows.
Funding Open Access funding enabled and organized by Projekt DEAL.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.