Multivalued decision diagrams for prize-collecting job sequencing with one common and multiple secondary resources

Multivalued decision diagrams (MDD) are a powerful tool for approaching combinatorial optimization problems. Relatively compact relaxed and restricted MDDs are applied to obtain dual bounds and heuristic solutions and provide opportunities for new branching schemes. We consider a prize-collecting sequencing problem in which a subset of given jobs has to be found that is schedulable and yields maximum total prize. The primary aim of this work is to study different methods for creating relaxed MDDs for this problem. To this end, we adopt and extend the two main MDD compilation approaches found in the literature: top down construction and incremental refinement. In a series of computational experiments these methods are compared. The results indicate that for our problem the incremental refinement method produces MDDs with stronger bounds. Moreover, heuristic solutions are derived by compiling restricted MDDs and by applying a general variable neighborhood search (GVNS). Here we observe that the top down construction of restricted MDDs is able to yield better solutions as the GVNS on small to medium-sized instances.


Introduction
total prize that is feasibly schedulable. Each job requires one of several secondary resources during its whole processing time and a single common resource for a part of its execution. Moreover, each job has to be performed within given time windows. This problem originates from the context of particle therapy for cancer treatment (Maschler et al. 2016;Maschler and Raidl 2018a;Riedler et al. 2017). In this scenario the common resource corresponds to a particle beam that can be directed into one of multiple treatment rooms which are represented by the secondary resources. Jobs describe treatments that consist of several tasks within a treatment room from which only one is the actual irradiation using the beam. While the works concerning particle therapy deal with numerous additional characteristics stemming from the real world application, it is apparent that the most central aspect is the sequencing of the jobs.
In a concurrently submitted work, Horn et al. (2018) focus on solving the PC-JSOCMSR exactly by means of A* search, mixed integer programming, and constraint programming. While excellent results are obtained in particular for the A* search, the applicability of these methods is strongly limited to rather small or medium sized-problem instances. A sequencing problem with similar job characteristics to ours, requiring one common and a secondary resource, has been considered by Van der Veen et al. (1998). However, in their case post-processing times are negligible and as a result the problem reduces to a special variant of the traveling salesman problem that can be efficiently solved in polynomial time. Last but not least, we point out that PC-JSOCMSR is somewhat related to variants of nowait flowshop problems (Allahverdi 2016) and more general resource-constrained project scheduling problems (Hartmann and Briskorn 2010).
In this work we explore the potential of applying the concept of decision diagrams (DDs) to PC-JSOCMSR and in particular investigate different methods for creating them. DDs have been originally developed in the context of circuit design (Lee 1959). In the course of the last decade DDs have shown to be also a powerful tool for tackling combinatorial optimization problems (COPs) (Bergman et al. 2016a). Essentially, DDs are layered directed acyclic multigraphs used to compactly represent a COP's set of feasible solutions. To this end, a DD has a root node and each subsequent layer of the DD is associated with one of the decision variables of the COP. Every arc in the DD describes an assignment of the variable represented by the corresponding layer. Thus, a path starting from the root node represents a variable assignment. The lengths of the arcs are assigned in such a way that the length of a path corresponds to the objective value of the corresponding variable assignment. Depending on whether the COP's objective is to maximize or to minimize a given objective function, we are seeking a longest or a shortest inclusion maximal path to a valid terminal node within the DD. The out-degrees of the DD's nodes directly corresponds with the domain sizes of the respective decision variables. If the COP is modeled with binary variables, then all nodes have at most two outgoing arcs and the DD is called binary decision diagram (BDD). In the more general case with finite variable domains, the number of arcs leaving nodes is not restricted. In this case, DDs are called multivalued decision diagrams (MDDs).
DDs resemble in many aspects a dynamic programming's state graph (Hooker 2013). Likewise, the size of exact DDs grows in general exponentially with the problem size. To overcome the resulting limitations, Andersen et al. (2007) proposed the concept of relaxed DDs. The basic idea is to merge nodes on the same layer and to redirect the affected arcs. This might introduce new paths in the DD that, however, do not represent feasible solutions. Consequently, relaxed DDs encode a superset of the feasible solutions and represent a discrete relaxation of the problem that provides dual bounds. Another way to cope with the in general exponential number of nodes are restricted DDs (Bergman et al. 2014b). A restricted DD is obtained from an exact DD by removing nodes and all incident arcs. Clearly, this also removes all paths from the DD that included at least one of the removed nodes. Therefore, a restricted DD represents only a subset of all feasible solutions, and it is used to obtain a feasible heuristic solution and a respective primal bound.
Beside upper and lower bounds, relaxed DDs in particular also provide promising opportunities for new inference techniques in constraint programming (Cire and Hoeve 2013;Kinable et al. 2017), novel branching schemes (Bergman et al. 2016b) for branch-and-bound, as well as primal heuristics (Bergman et al. 2014b(Bergman et al. , 2016b. The concept of DDs has been successfully applied to a variety of problems, ranging from binary optimization problems to sequencing problems. The former include set covering (Bergman et al. 2011(Bergman et al. , 2014b, maximum independent set (Bergman et al. 2014a(Bergman et al. , 2016b, maximum cut (Bergman et al. 2016b), and maximum 2-satisfiability (Bergman et al. 2016b) problems and are approached using BDDs. Sequencing problems on the other hand typically suggest the use of MDDs. In the literature already considered sequencing problems include the time dependent traveling salesman problem with and without time windows and the time-dependent sequential ordering problem (Cire and Hoeve 2013;Kinable et al. 2017). One fundamental difference to the DDs considered in the literature is the prize-collecting aspect. While the so far considered problems define solutions by paths traversing all layers, in PC-JSOCMSR every path starting at the root node corresponds to a valid solution. For a comprehensive overview on DDs see Bergman et al. (2016a).
Two main approaches have been proposed for compiling MDDs. The first starts at the root node and constructs the MDD layer by layer (Bergman et al. 2011(Bergman et al. , 2014b. If the number of nodes within a layer exceeds a given limit, then either nodes are merged or removed to obtain a relaxed or a restricted MDD, respectively. The second approach, starts with a simplistic relaxed MDD and applies incremental refinements by splitting nodes in order to iteratively strengthen the relaxation (Cire and Hoeve 2013;Kinable et al. 2017). While an application of the first approach to PC-JSOCMSR is relatively straight-forward, realizing a successful incremental refinement method for PC-JSOCMSR requires substantial problemspecific extensions of the basic principle. We adapt both approaches for PC-JSOCMSR here and are, to our best knowledge, the first who directly compare the two techniques in an experimental fashion. Moreover, we investigate the derivation of heuristic solutions by constructing a restricted MDD and provide an independent general variable neighborhood search (GVNS) (Hansen et al. 2010) to set the DD-based approaches into perspective. Our computational experiments show that the incremental refinement approach provides on most of our benchmark instances better dual bounds than the top down compilation. While the top down compilation for restricted MDDs outperforms the GVNS on small to mediumsized instances, the GVNS is mostly superior on larger instances. This work is a revised and extended version of the preliminary conference paper from Maschler and Raidl (2018b). New is in particular a technique that detects and removes redundancies during the incremental refinement of MDDs. This redundancy detection and removal yields to significantly smaller DDs.
The remainder of this work is organized as follows. In the following we start by giving a formal description of the considered problem. Section 3 provides a recursive dynamic programming model for PC-JSOCMSR which serves as basis for deriving MDDs in Sect. 4. Section 5 describes the top down compilation of relaxed and restricted MDDs, while the incremental refinement algorithm for PC-JSOCMSR is given in Sect. 6. Section 7 sketches the standalone GVNS. Results of computational experiments of all approaches are discussed in Sect. 8. Finally, Sect. 9 concludes with an outlook on future research directions.

The problem
The prize-collecting job sequencing with one common and multiple secondary resources (PC-JSOCMSR) problem is formally defined as follows. Let J = {1, . . . , n} be a set of n jobs of which a subset shall be scheduled using renewable resources R 0 = {0} ∪ R with R = {1, . . . , m}. To be processed, each job j ∈ J requires a resource q j ∈ R for its entire processing time p j > 0 and additionally resource 0 for a duration of p 0 j after time p pre j from the job's start; 0 < p 0 j ≤ p j − p pre j . For convenience, we denote with p post j the duration after the common resource is used until the job j is completed, i.e., p post j = p j − p pre j − p 0 j . Moreover, we write J r for the subset of all jobs in J which require secondary resource r ∈ R.
We associate with each job j a set of ω j disjunct time windows Job j can only be performed during one of these time windows and is assumed to be non-preemptive, i.e., may not be interrupted. Moreover, we denote the whole relevant time horizon, encompassing all time windows of all jobs, with [T min , T max ].
Finally, each job j has associated a prize (utility value, priority) z j > 0. We assume that there exists, in general, no schedule that assigns feasible starting times to all jobs in J . Instead, we aim for a subset of jobs S ⊆ J that can be feasibly scheduled and maximizes the total prize of these jobs, i.e., A schedule of S implies a total ordering of the scheduled jobs because all jobs require resource 0 and this resource can be used by only one job at a time. Vice versa, such an ordering π = (π i ) i=1,...,|S| of S can be decoded into a schedule by scheduling each job from S in the order given by π at the earliest feasible time after the preceding job. If at least one of the jobs cannot be feasibly scheduled in this way, then ordering π does not represent a feasible solution. We call the schedule obtained from ordering π by the above decoding a normalized schedule. Clearly, for every feasible solution there exists a normalized schedule with the same objective value. Hence, we write Z (π) for the total prize of the normalized solution given by the ordering π of jobs. Above problem variant extends the job sequencing with one common and multiple secondary resources (JSOCMSR) problem originally proposed by Horn et al. (2017)

Recursive model for PC-JSOCMSR
We provide a dynamic-programming-like recursive model for PC-JSOCMSR. The induced state graph will then serve as a basis for deriving MDDs. To simplify the handling of time windows let us define the function earliest feasible time eft( j, t) that computes for a given job j and time point t the earliest time not smaller than t at which job j can be performed according to the time windows, i.e., (2) Note that eft( j, t) = T max if job j cannot be scheduled within its time windows. The main components of the recursive model are the states, the control variables that conduct transitions between states, and finally the prizes associated with the transitions. In our recursive formulation a state is a tuple (P, t) consisting of the set P ⊆ J of jobs that are still available for scheduling and a vector t = (t r ) r ∈R 0 of the earliest times from which on each resource r is available. The initial state corresponding to the original PC-JSOCMSR instance without any jobs scheduled yet is r = (J , (T min , . . . , T min )).
The control variables are π 1 , . . . , π n ∈ J . Starting from the root node they select the jobs to be scheduled. Variable π 1 selects the first job j to be scheduled, and we transition from state r to a successor state (P , t ), where π 2 decides with which next job to continue. This is repeated for all control variables. If a job selected by a control variable cannot be feasibly scheduled as next job, then we obtain the special infeasible state0. Any further transition from0 yields0 again.
To specify the transitions, let the starting time of a next job j ∈ J w.r.t. a state (P, t) be The transition function to obtain the successor (P , t ) of state (P, t) when scheduling job j ∈ J next is with t 0 = s((P, t), j) + p pre j + p 0 j (5) t r = s((P, t), j) + p j for r = q j (6) t r = t r for r ∈ R\{q j }.
All states except the infeasible state0 are possible final states. The prize associated with a state transition is job j's prize z j . Any sequence of state transitions τ (. . . τ (r, π 1 ) . . . , π i ) yielding a feasible state (P, t) from the initial state r represents a solution. In fact, the respective states map directly to the normalized schedule obtained by decoding the jobs π 1 , . . . , π i as stated in Sect. 2. Moreover, the sum of the prizes of all these transitions corresponds to Z (π 1 , . . . , π i ), the total prize of the solution.
The individual states obtained by the transitions can be safely strengthened in many cases, frequently leading to a smaller state graph. We aim at replacing state (P, t) by state (P , t ) with either P ⊂ P or t r > t r for one or more r ∈ R 0 without losing feasible solutions. This is done by first considering the earliest starting times s((P, t), j) for all jobs j ∈ P. Jobs that cannot be feasibly scheduled next can be safely removed from P, i.e., P = { j ∈ P | s((P, t), j) = T max }.
Afterwards, we set the times t r , ∀r ∈ R 0 , to the earliest time resource r is actually used by the jobs in P . If a resource is not required by any of the remaining jobs then we set the corresponding time t r to T max . More formally, The described strengthening of states helps in reducing the size of the state graph as two or more originally different states may be turned into the same strengthened state. Moreover, the strengthening of states also is valuable in conjunction with the merging of nodes in a corresponding relaxed decision diagram, as we will see in the next section.
Note that a strengthened feasible state does not have to describe a single solution, because the same state might be reached by multiple transition sequences. These different transition sequences yielding the same state might also have distinct total prizes due to the strengthening. Since we are maximizing the total prize, we are primarily interested in sequences with maximum total prize. To this end, let Z lp (P, t) be this maximum total prize for any sequence τ (. . . τ (r, π 1 ) . . . , π i ) resulting in state (P, t). Ultimately, we are looking for a feasible state with maximum Z lp (P, t).
Looking at these relationships from a dynamic programming perspective, we can express the maximum total prize for jobs that can still be scheduled from any feasible state (P, t) onward by and Z * (r) then denotes the overall maximum achievable prize, i.e., the optimal solution value.

Multivalued decision diagrams for PC-JSOCMSR
This section explains the relationships between the state graph of a PC-JSOCMSR problem instance and exact, relaxed, and restricted MDDs. An exact MDD is a layered directed acyclic multigraph G = (V , A) with node set V and arc set A. The node set V is partitioned into layers L 0 , . . . , L n . The first layer L 0 consists only of a single node associated with the initial state r. Each subsequent layer L i contains nodes for all states obtained from feasible state transitions from states associated with nodes in layer L i−1 . Moreover, the MDD has arcs for all feasible state transitions in the state graph connecting the corresponding nodes.
Observe that arcs exist only between directly successive layers and there might be nodes for identical states on different layers. The length of these arcs are the state transition prizes z j . The infeasible state0 and all transitions to it are omitted. In the literature, a target node is typically defined and arcs with zero length exist from any feasible end node to this target. Since in our case any node represents a valid end state, we deviate here from the literature and do not make explicit use of this target state. Let us denote by j(a) ∈ J the job that is considered in the state transition associated with arc a ∈ A. Moreover, let A + (u) and A − (u) indicate the set of all incoming and outgoing arcs of a node u ∈ V , respectively. Moreover, for a node u we write P(u) and t(u) as a shorthand for the set P and vector t of the node's state. In particular, we denote with t r (u) for a node u the time from which on each resource r ∈ R 0 is available for performing a next job.
An optimal solution is obtained from an exact MDD by determining a longest path from r to some end node v and scheduling the jobs associated with each arc in the respective order and at the starting times s((P, t), j). The length of this path, i.e., the sum of the respective arcs' transition prizes, corresponds to the optimal solution value Z * (r).  Figure 1 shows an exact MDD for an instance with four jobs and two secondary resources. Details of the PC-JSOCMSR instance are given on the top right, while the MDD is depicted on the top left. Each arc's label indicates the job that is scheduled by the respective state transition and in parentheses the arc's length. We indicate with dashed arcs the in our case unique longest path of length seven. The corresponding optimal solution, scheduling the jobs π = (3, 1, 4) with a total prize of Z (π) = 7, is shown on the bottom left. Moreover, states of all nodes are given on the bottom right.
Exact MDDs grow in general in an exponential way with the problem size as they basically represent the complete state graph. We are more interested in more compact MDDs that represent the state graph only in an approximate way. This is usually done by limiting the number of nodes allowed in each layer to a fixed maximum β ≥ 1. The number of nodes in a layer is called the layer's width, and the maximum width over all layers is the width of an MDD. To receive MDDs of limited width, there have been proposed two concepts with contrary effects: relaxed MDDs (Andersen et al. 2007) and restricted MDDs (Bergman et al. 2014b).
Relaxed MDDs cover all feasible solutions as a subset plus possibly a set of solutions that are invalid for the original problem. Thus, they represent a discrete relaxation of the original problem, and the length of a longest path of a relaxed MDD is a dual bound to the original problem's optimal solution value Z * (r). To have limited width, a relaxed MDD in general superimposes states of the original state graph: Sets of states of an exact MDD are combined into so-called merged nodes; all affected arcs are redirected to the respective merged node. To ensure that a valid relaxation is obtained, the state of a merged node must be set so that it is in no dimension stricter than each original state. In case of our PC-JSOCMSR, if a set M of original states is merged, the state of the respective merged node is  Figure 2a shows for the exact MDD in Fig. 1 a relaxed MDD where nodes u 3 and u 4 are merged resulting in node u . The width of the relaxed MDD decreases from four to three. Recall that the optimal solution of the considered instance has a total prize of seven. The longest path within the relaxed MDD, indicated by the dashed arcs, has a total length of eight. This is achieved by scheduling job 4 twice, which clearly does not correspond to a feasible solution of the original problem. Moreover, notice that the relaxed MDD contains all paths from the exact MDD. The original optimal solution is still represented by a respective path, however, it is not a longest anymore. The state of the merged node is given by ({1, 4}, (4, 3, 4)), while the states of all remaining nodes do not change.
Restricted MDDs are the second option for approximate MDDs with limited width. They are obtained by removing nodes from an exact MDD with all incoming and outgoing arcs. Whenever a node is removed, also all paths containing the node are not anymore encoded in the MDD. Consequently, a restricted MDD represents only a subset of all feasible solutions, and the length of a longest path in a restricted MDD might be shorter than one in an exact MDD. For this reason the length of a longest path in a restricted MDD is a primal bound to the original problem's optimal solution value Z * (r).
A restricted MDD for the exact MDD from Fig. 1 is depicted in Fig. 2b. The node u 3 and all its incoming and outgoing arcs are removed. All other nodes, arcs, and states remain unchanged. The longest path in the restricted MDD, again indicated by dashed arcs, has a total length of six. This longest path encodes a feasible solution to the original problem, however, not an optimal one.

Top-down construction
The top-down construction (Bergman et al. 2014a(Bergman et al. , b, 2016b compiles exact MDDs, as well as relaxed and restricted MDDs by traversing the state graph in a breadth-first fashion. The method starts with an empty first layer L 0 and adds a node for the initial state r. Then, one layer after the other is filled with nodes. For a subsequent layer L i , this is done by adding all feasible states that can be obtained by a transition from any node u ∈ L i−1 , i.e., Note that identical states produced by different transitions are represented by a single common node within a layer. In addition to the nodes, we also add corresponding arcs for each of the conducted transitions. When we are compiling relaxed or restricted MDDs, we have to check at this point the width of the current layer L i . If it exceeds a given maximum β, nodes have either to be merged or dropped, respectively. The quality of the obtained primal and dual bounds from the produced relaxed and restricted MDDs is predominantly influenced by the strategy to select the nodes for merging or removal. The basic idea is to prefer nodes for merging or removal that are unlikely part of any optimal solution. Bergman et al. (2014a) considered three different merging heuristics: random nodes, nodes with the shortest longest path Z lp (u), and nodes with the most elements in P(u). In their experiments the second strategy achieved the best results. Moreover, Bergman et al. (2014b) suggest the same node selection heuristic for the compilation of restricted MDDs. We observed that merging or removing nodes with the smallest Z lp (u) values is disadvantageous for PC-JSOCMSR. This can be explained by the fact that this strategy focuses just on the longest path, but does not respect how well the jobs fit next to each other. Therefore, we set the longest path to a node into perspective with the time the common resource is occupied by the corresponding jobs. The nodes within the currently considered layer L i , i > 0, are sorted according to the ratio Z lp (u)/(t 0 (u) − T min ) in increasing order. We then merge respectively remove the first nodes until the width of L i becomes β. Afterwards, we continue with the next layer. The algorithm terminates when either no further state transitions are possible or we completed layer L n .

Incremental refinement
The basic idea of an incremental refinement approach is to apply filtering and refinement steps iteratively on an initial simple relaxed MDD in order to improve it and approximate an exact MDD. Filtering steps remove arcs that are only contained in root to sink paths that represent infeasible solutions. The refinement steps consist of splitting nodes to represent so far merged states in more detail and as a consequence to trigger further filtering of arcs. The main goal of incremental refinement is to decrease the length of longest paths in the MDD, i.e., the obtained upper bound on an instance's solution value.
Incremental refinement has been initially proposed by Hadzic et al. (2008) and Hoda et al. (2010) for constraint satisfaction systems. The central aspect of this approach is the division of filtering and refinement into independent operations. As a consequence, the overall algorithm can apply and combine these operations however it is appropriate. A relaxed MDD for the PC-JSOCMSR problem contains in general paths that do not represent feasible solutions, either because jobs occur more than once or not all jobs can be scheduled within their time windows. Therefore, we have to find refinement and filtering operations that allow us to exclude job repetitions and time window violations.
Due to the fact that exact MDDs have in general an exponential number of nodes w.r.t. the problem size, we cannot hope to apply refinement and filtering until all invalid paths are sorted out for problem instances of practically relevant size. Hence, a key aspect of an incremental refinement approach is the order in which the refinement steps are applied on the nodes. The works from Cire and Hoeve (2013) and Kinable et al. (2017) provide an incremental refinement method for sequencing problems in which a permutation of jobs has to be found. Essentially, they order the jobs according to the processing times and with it to a certain extent according to the length of the corresponding arcs within the MDD. Their approach removes repetitions of jobs according to that order until the maximal allowed width of the MDD is reached. The rationale behind this strategy is that repetitions of jobs represented by long arcs are more frequently contained within longest paths. For PC-JSOCMSR this method is, however, not suitable because we have to assume that only a fraction of the jobs can be actually scheduled. Hence, it is not clear in advance which jobs play a key role for deriving a good approximation of an exact MDD.
Our incremental refinement for PC-JSOCMSR uses a current longest path as guidance. We follow the arcs on such a longest path, starting from the root node, and check for each arc whether the associated job can be feasibly scheduled. In case that a job occurs more than once, we refine the MDD s.t. repetitions of this job are not possible anymore. To this end, we reinterpret the refinement proposed by Cire and Hoeve (2013) based on our recursive model from Sect. 3. If we detect that a job cannot be feasibly scheduled within its time windows, then we split nodes using a novel operator that allows excluding the considered path. Our filtering operator relies heavily on the underlying recursive model and, thus, differs substantially from the ones proposed in the literature. In addition, we employ a new technique that aims at removing redundancies in layers. Algorithm 1 shows an outline of the proposed Incremental Refinement Guided by Longest Paths (IRLP). It acts on a given relaxed MDD, which is obtained in our case by the top down construction from Sect. 5 with a small initial width. In each iteration of the main while loop we obtain a longest path. If the sequence of jobs represented by the path can be feasibly scheduled, then we have found an optimal solution and terminate.

Algorithm 1: Incremental Refinement Guided by Longest Paths (IRLP)
Depending on whether we detected a job repetition or a time window violation on the currently considered longest path the following steps differ. In the former case we traverse the MDD starting from the root node r layer by layer. For each considered node we try to filter arcs and update the node's state if necessary. We check next if the node has to be refined and perform a node split if it allows to remove the considered job repetition. After all nodes of a layer have been considered, we might encounter that nodes are associated with an identical state. Such nodes are merged to avoid redundancies in the MDD. In the latter case of a time window violation we perform a much more local refinement operation in which only nodes along the considered path are split. In the subsequent filtering we consider all nodes reachable from the previously split nodes. We enforce also here that all nodes within each layer represent a distinct state. Notice that the refinement of job repetitions is preferred over the refinement of time window violations if both are contained in the longest path. This has shown to provide better bounds if the algorithm has to terminate prematurely due to a time limit.
The applied filtering techniques and the updating of the nodes' state are described in Sect. 6.1. The two types of refinement operations are presented in more detail in Sects. 6.2 and 6.3. Finally, in Sect. 6.4 we explain how we avoid producing nodes representing identical states within layers.

Node updates and filtering
Filtering applied in an incremental refinement method aims at identifying and removing arcs that are only contained in paths corresponding to infeasible solutions. The filtering techniques generally rely on the Markovian property of the MDD's states, which means that a state is defined by its predecessors and the transitions. This allows specifying tests that use information local to a considered node to decide whether incoming or outgoing arcs can be removed.
An intrinsic part of the presented filtering method is to keep the node's states always up to date, which is necessary because the removal of a node's incoming arcs may change its associated state. Moreover, an adjustment of a node's state may imply further changes on the nodes reachable from the currently considered node. Therefore, we traverse the MDD s.t. we reach a node after we have processed all its predecessors. Consequently, we end up in each iteration of the IRLP with an MDD where all states fulfill the Markovian property. For each considered node we first update the node's state and then check whether incoming or outgoing arcs can be removed. In case incoming arcs are removed the node's state has to be reevaluated again. An update of a state consists of reassessing and merging the transitions from all predecessors, which means for a node u to compute Such a state update is a computational expensive operation and should only be performed if a node's state may actually change. For this reason, we recompute a node's state only if either a predecessors state has changed or if an incoming arc has been removed. Let (P, t) and (P , t ) be node u's state before and after a reevaluation, respectively. Due to the definition of the relaxation scheme (11) and the fact that we are only removing arcs during filtering, it holds that t r ≥ t r for all r ∈ R 0 and P ⊆ P. In case P ⊂ P, we remove all outgoing arcs a ∈ A − (u) with j(a) / ∈ P since they cannot be part of any feasible solution represented by a path reaching u from r. If any node except r ends up without any incoming arc, it is removed together with all its outgoing arcs.

Refinement of job repetitions
We discuss in this section a technique that modifies an MDD in such a way that a considered job j occurs on each path at most once. This method is conceptually an adaptation from the one proposed by Cire and Hoeve (2013), but takes into account that in PC-JSOCMSR usually only a subset of the jobs can be scheduled. The refinement is based on the observation that a job repetition occurs if a job j is contained on a path starting from node r to a node u and job j is still included in P(u). Consequently, node u has an outgoing arc associated with job j which represents a repetition. Before we can derive a splitting strategy, we first have to verify if the above condition is sufficient to detect all job repetitions. To this end we denote with Some ↓ u ⊆ J the subset of jobs appearing in some path from r to a node u ∈ V . For a node u ∈ V the set Some ↓ u can be calculated recursively by We show next that we can determine repetitions of a considered job j occuring on some path in a MDD by using P(u) and Some ↓ u of the nodes u in the MDD.

Lemma 1 A job j is assigned on each path starting from r at most once if and only if
Proof Assume first that a job j is associated with at most one arc in every path starting from r of a given MDD G and consider an arbitrary node u ∈ V . If no path from r to u has an arc labeled j then it holds by definition that j / ∈ Some ↓ u and consequently j / ∈ Some ↓ u ∩ P(u). If on the other hand there exists a path from r to u with an arc associated with j then no path starting from u can contain an arc labeled j. Moreover, it holds by definition that a node v ∈ V can only have an outgoing arc a with j(a) = j if j ∈ P(u). Therefore, j / ∈ P(u) and j / ∈ Some ↓ u ∩ P(u). Conversely, suppose that j / ∈ Some ↓ u ∩ P(u) for all nodes u ∈ V . In case j / ∈ Some ↓ u we cannot have a repetition of node j on any path from r to u. If a node u is reached by an arc associated with job j then j ∈ Some ↓ u and thus, j / ∈ P(u). Since node u can have only outgoing arcs for the jobs in P(u), node u cannot have an outgoing arc labeled j. Moreover, since j ∈ Some ↓ v for all nodes v reachable from node u we can conclude by the same argument that also for these nodes j / ∈ P(u) and hence there are no respective outgoing arcs. Thus, job j is assigned on each path starting from r at most once.
Whenever we detect a node repetition, i.e., j ∈ Some ↓ u ∩ P(u) for some node u, we perform a node split to obtain a node u 1 with j / ∈ P(u) and a node u 2 with j / ∈ Some ↓ u as follows.
Theorem 1 Given job j and a MDD, we replace all nodes u ∈ V with j ∈ Some ↓ u ∩ P(u) by two nodes u 1 and u 2 , s.t. all incoming arcs a = (v, u) are redirected to u 1 if j / ∈ P(τ (v, j(a))) and to u 2 otherwise. All outgoing arcs are replicated for both nodes. The resulting MDD satisfies j / ∈ Some ↓ u ∩ P(u) for all nodes u ∈ V .
Proof For the root node r we have by definition that Some ↓ r = ∅ and, thus, j / ∈ Some ↓ u ∩ P(u). Assume as induction hypothesis that the desired condition j / ∈ Some ↓ u ∩ P(u) holds for all predecessors of a node u. In addition, consider that we have replaced node u by the nodes u 1 and u 2 as described above. From the relaxation scheme (11) we know that set P of node u 1 cannot contain j. For all of u 2 's incoming arcs a = (v, u 2 ) it holds that j / ∈ Some ↓ v since otherwise P(τ (v, j(a))) could not contain j. Consequently, u 1 as well as u 2 satisfy the stated condition.
The actual refinement is done by enforcing Lemma 1 in a single top down pass. To this end, we start with the root node and process all nodes layer by layer. For each considered node u we first update its state if needed and apply the filtering as described in Sect. 6.1. Afterwards, we determine the set Some ↓ u and split node u as described in Theorem 1 if necessary. Whenever a node is split, new states are calculated for the two new nodes. Furthermore, we perform filtering on the new nodes' incoming and outgoing arcs.

Refinement of time window violations
Let sequence (u 1 , a 1 , u 2 , . . . , u k , a k , u k+1 ) of alternating nodes and arcs denote a path in our MDD starting at the root node r (i.e., u 1 = r) where (u 1 , a 1 , u 2 , . . . , u k−1 , a k−1 , u k ) corresponds to a feasible solution but the job represented by arc a k cannot be additionally scheduled within its time windows. For the considered path we denote with (u ↓ 1 , . . . , u ↓ k ) the not relaxed states along the considered path. That is, u Due to the state relaxations of the nodes in the MDD we observe that j(a k ) ∈ P(u k ) but j(a k ) / ∈ P(u ↓ k ). The basic idea is to split the nodes on the path in such way that job j(a k ) can be removed from P(u k ) and with it also the arc a k .
In general, it is not sufficient to just split node u k but a subset of the path's nodes u l , . . . , u k , with 1 < l ≤ k, has to be refined. Ideally, the number of nodes to be refined should be small and the refinement should exclude other time window violations as well. We compute the subset of nodes to be refined as follows: We first check whether s(τ (u k−1 , j(a k−1 )), j(a k )) evaluates to T max . If it does, then job j(a k ) cannot be feasibly scheduled on the state resulting from the transition from state u k−1 . Consequently, it suffices to refine node u k . If it does not, then we consider one predecessor more, i.e., we check whether s(τ (τ (u k−2 , j(a k−2 )), j(a k−1 )), j(a k )) results in T max . This step is repeated until we find a node u l−1 on the considered path which allows excluding job j(a k ) if we follow exact transitions from it.
The actual refinement works as follows: We replace each node u i with i = l, . . . , k by nodes u i,1 and u i,2 . The incoming arcs a = (v, u i ) ∈ A + (u i ) are redirected to u i,1 if t r (τ (v, j(a))) ≥ t r (τ (u i−1 , j(a i−1 ))) for all r ∈ R 0 , otherwise, they are redirected to u i,2 . Outgoing arcs of u i are replicated for u i,1 and u i,2 . After a node split we determine for the two resulting nodes the corresponding states and perform a filtering of their incoming and outgoing arcs as described in Sect. 6.1. Last but not least, we have to possibly reevaluate the states and filter all incident arcs of all nodes reachable from each node u i .

Duplicate state elimination
A side effect of the node updates and the refinement methods is that we might end up with multiple nodes within one layer that represent an identical state. For example, assume that a considered layer already contains a node u without any outgoing arcs that represents state (∅, (T max , . . . , T max )). Moreover, suppose that due to updating another node u we encounter that its state cannot be feasibly extended and remove all outgoing arcs. Both, u and u , then represent the same identical state, but are reached by different paths. We can avoid this redundancy in the MDD by redirecting all incoming arcs from u to u and removing u from V .
In the more general case, where u and u have outgoing arcs merging nodes with duplicate states is more involved. First of all, we have to ensure that our MDD still remains a valid relaxation. The Markovian property implies that for all feasible extensions of u there exist an equivalent extension of u and vice versa (Cire and Hoeve 2013). This allows us to remove node u including its outgoing arcs after redirecting all incoming arcs to u. Obviously, the state of node u remains valid. However, if not done carefully, this operation may reintroduce paths encoding infeasible solutions which have been already excluded. Furthermore, we have to make sure that the duplicate state elimination does not produce cycles with the refinement operations in order to guarantee that IRLP terminates. After performing the refinements of job repetitions for a job j on a considered layer it holds that if a job j ∈ P(u) then j / ∈ Some ↓ u whenever the states of u and u are identical. Splitting nodes for refining time window violations aims at increasing the t r values to trigger filtering. Since our duplicate state elimination does not change the nodes' states, we will from a theoretical point of view always converge to an exact MDD.
Our duplicate state elimination works as follows: After we have performed all refinement, updating, and filtering operations within a layer, we consider all nodes with duplicate states pairwise and remove one of them until all nodes' states are distinct. To this end, we redirect all incoming arcs to the node u having the larger Z lp (u) value and remove the other node including outgoing arcs. The intention for selecting the node with the larger Z lp (u) value is that we do not increase the longest path to any node. Moreover, the nodes reachable from u are more likely to be already refined, as IRLP focuses on longest paths.

General variable neighborhood search
In this section the general variable neighborhood search (GVNS) is presented which serves us as a reference approach for obtaining heuristic solutions. GVNS (Hansen et al. 2010) is a prominent local search based metaheuristic which operates on multiple neighborhoods. The basic idea is to systematically change local search neighborhood structures until a local optimum in respect to all these neighborhood structures is found. This part is called variable neighborhood descent (VND). To further diversify the search, the GVNS performs so-called shaking for local optimal solutions by applying random moves in larger neighborhoods. These perturbed solutions then undergo VND again, and the whole process is repeated until a termination condition is met at which point the best solution encountered is returned.
In the context of this metaheuristic we represent a solution by a permutation π = (π i ) i=1,...,|J | of the entire set of jobs J . Starting times and the subset of jobs S ⊆ J that actually is scheduled is obtained by considering all jobs in the order of π and determining each job's earliest feasible time; jobs that cannot be feasibly scheduled w.r.t. their time windows anymore are skipped. This solution representation allows us to use rather simple neighborhood structures.
Our GVNS for PC-JSOCMSR starts with a random permutation of the jobs J as initial solution. In a preliminary study, we also used initial solutions computed by a FRB4 k (Rad et al. 2009) construction heuristic. Although this constructive heuristic provided much better starting solutions, we could not observe significant differences in the quality of final solutions returned by the GVNS.
We employ in our GVNS two local search neighborhood structures. The first one considers all exchanges of pairs of jobs within the permutation, while the second considers the removal of any single job and its re-insertion at another position. To avoid the consideration of moves that do not change the actual schedule, we require that each move changes either S or the order of the jobs within S.
In the VND, we apply any possible improving exchange move before considering the moves that remove and reinsert jobs. Each neighborhood is searched in a first improvement fashion. As shaking operation we perform a sequence of k random remove-and-insert moves. Whenever a new incumbent local optimal solution is found, the following shaking starts with k = 1. Parameter k is increased by one up to a maximum value k max after every unsuccessful shaking followed by the VND. After reaching k max , k is reset to one again.

Computational study
We performed an experimental evaluation of the proposed approaches, i.e., the top down construction (TDC) for relaxed and reduced MDDs, the incremental refinement guided by longest paths (ILRP), and the general variable neighborhood search (GVNS). The algorithms are implemented in C++ and have been compiled by GNU G++ 7.3.1. All experiments are performed on a single core of an Intel Xeon E5-2640 v4 CPU with 2.40GHz and 16 GB of memory.
We use the same two types of test instances as in Horn et al. (2018) but extend these to also include particularly larger instances with up to 300 jobs; all instances are available at http:// www.ac.tuwien.ac.at/research/problem-instances. Each set contains in total 840 instances with 30 instances for each combination of n ∈ {10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 120, 150, 200, 300} jobs and m ∈ {2, 3} secondary resources. In the first set B of balanced benchmark instances the secondary resources are equally distributed among the jobs and each job requires in expectation the common resource for the second third of its processing time. To this end a job's secondary resource is uniformly sampled from R. The processing time of a job j is determined by sampling values for p pre j , p post j from U{0, 8} and for p 0 j from U{1, 8}. In the second set S of benchmark instances, which we regard as skewed, one of the secondary resources is required predominantly and in expectation the common resource is required more than the half the job's processing time. In detail, a job's secondary resource is set to m with a probability of 0.5 while the other resources in R are selected with a probability of 1/(2m − 2). The duration p 0 j of the jobs j ∈ J are chosen uniformly from {1, . . . , 13} and the pre-processing and post-processing times p pre j and p post j are both uniformly selected from {0, . . . , 5}. The remaining characteristics of the two benchmark sets are obtained in the same way: The prize z j associated with each job is sampled uniformly from { p 0 j , . . . , 2 p 0 j } in order to correlate with the time the common resource is used. For the jobs we generate between one and three time windows in such a way that approximately 30% of the jobs fit into a schedule. To this end, we sample for each job the number of time windows ω j from {1, 2, 3}. Moreover, let E(p 0 ) be the expected duration a job requires the common resource and let T = 0.3 n E(p 0 ) be the total expected time required from the common resource to schedule 30% of all jobs. The ω j time windows W j for job j are generated as follows: We choose a start time W start jw uniformly from {0, . . . , T − p j } and an end time W end jw from {W start jw +max( p j , 0.1 T /ω j ), . . . , W start jw +max( p j , 0.4 T /ω j )}. If we obtain overlapping time windows, they are merged and ω j is adjusted accordingly.
The initial relaxed MDD used by incremental refinement methods in the literature (Cire and Hoeve 2013;Kinable et al. 2017) are typically trivial ones of width one and can be obtained by calling TDC with β = 1. For PC-JSOCMSR there is a more meaningful initial relaxed MDD of maximum width m, where on each layer all states are merged that are obtained by jobs requiring the same secondary resource. This initial relaxed MDD has in general already significantly stronger states than the relaxed MDD of width one, because in the latter the advances on the times t r for the secondary resources r ∈ R cancel each other out. Preliminary experiments showed that small instances can be optimally solved with fewer iterations and on larger instances stronger bounds can be obtained when starting with the width m initial relaxed MDD. Hence, we do this in all our further IRLP runs.
In other preliminary experiments we investigated different configurations of the GVNS. We tried changing the order of the neighborhood structures within the VND and also shaking operators based on exchanging the positions of randomly selected jobs. The configuration described in Sect. 7 was found to work best. Moreover, we tuned the maximum shaking size parameter k max . Rather small values for k max turned out to typically yield better results, and we decided to use k max = 4 for all further GVNS runs in this work.
In the first series of experiments we compare the quality of relaxed MDDs compiled by TDC and IRLP, respectively. IRLP was performed with a CPU-time limit of 900 s per run, while for TDC we used different values for the maximum width β in dependence of the number of jobs so that the required CPU-time was in a similar order of magnitude. In Table 1 each row shows average results of 30 instances. The first three columns describe the instance properties. For both approaches mean dual bounds Z lp are listed together with the corresponding standard deviations σ (Z lp ), the median numbers of nodes of the relaxed MDD |V | and median completion times t in seconds. In addition, the employed maximum width β in the TDC are given. Moreover, for the IRLP we state in column Δ[%] the percentage of nodes saved due to the elimination of duplicate states. In other words, we compute 100 − 100 · (|V |/|V |), where |V | is the size of the MDDs of separate runs of the IRLP without elimination of duplicate nodes, as it was presented in our preliminary work Maschler and Raidl (2018b). Finally, column gap [%] shows the gap between the dual bounds of both approaches obtained by calculating the mean of 100 * (Z On the smallest instances both algorithms produce relaxed MDDs with the same dual bounds. In these cases the obtained bounds correspond to the optimal objective values, which we verified by checking that the longest paths indeed correspond to feasible schedules. In fact, TDC could solve several instances with up to 60 jobs, while IRLP found optimal solution for some instances with up to 50 jobs. While on the medium to large instances with balanced jobs we cannot observe a clear tendency which method provides tighter bounds, IRLP outperforms TDC by a rather large margin on almost all skewed instances. The standard deviations of the dual bounds obtained from both approaches show that the scattering of the objective values of the individual runs is within a reasonable order of magnitude, but otherwise no special trend is observable. Notice that the size of the relaxed MDDs produced by both algorithms peaks by instances with 60 or 70 jobs and declines for larger benchmark instances. This can be explained for the TDC by the increasing number of state transitions that have to be performed for each layer and by the increasing number of nodes that have to be merged as a result. For IRLP the reason is similar. IRLP has to reevaluate for larger instances more frequently nodes with many incoming arcs. The IRLP has been extended in comparison to our preliminary work (Maschler and Raidl 2018b) by the elimination of duplicate states within layers. While the effects on the obtained dual bounds have been negligible, the size of the produced MDDs has been reduced between 10 and 30% on average. Smaller MDDs of similar quality become especially important if the MDDs are further exploited within a larger algorithmic framework such as in the DD-based

250.52
Bold values show the best average bound branch-and-bound (Bergman et al. 2016b) or for constraint propagation in CP (Cire and Hoeve 2013;Kinable et al. 2017).
In a second series of experiments the heuristic solutions obtained by the TDC for restricted MDDs are compared with the ones computed by the GVNS. We employ for the GVNS a time limit of 900 CPU-seconds as termination criterion. For TDC, different maximum widths β were used again so that the running times are in a similar order of magnitude. Table 2 shows the obtained results. The first three columns describe the instance properties and each row shows average results of 30 corresponding instances. The means and the corresponding standard deviations of the final objective values for TDC and GVNS are shown in the columns Z lp , σ (Z lp ), obj and σ (obj), respectively. In addition, for TDC the maximum width β, median number of nodes in the restricted MDD |V |, and median computation times t in seconds are listed. Moreover, column t best shows for the GVNS median times in seconds when the best solution has been found. Finally, column gap[%] shows the average relative gap between the objective values obtained by the TDC and the GVNS computed by taking the mean of 100 * (obj − Z lp )/Z lp .
The TDC for restricted MDDs is able to outperform the GVNS on most of our benchmark instances. Only for the largest instances with three secondary resources or skewed jobs, GVNS is able to provide better results. The main reason for the superior performance of the TDC on instances with balanced jobs and two secondary resources is that the corresponding exact MDDs are much smaller compared with the other instances. This can be seen on the smallest instances where the imposed maximum width is not yet restrictive. On the instances with 30 jobs, for example, the resulting MDDs for balanced jobs with two secondary resources have on average 5365 nodes, with three secondary resources 13,766 nodes, and for the instances with skewed jobs there are 21,220 and 23,358 nodes. It is safe to assume that this difference in size becomes even larger with more jobs. To stay within the memory and time limits, the maximum allowed width has to be decreased with the increasing number of jobs, which becomes more and more restrictive for the largest instances. Note that this relation can also be observed in Table 1 for relaxed MDDs. The GVNS approach, on the other hand, seems to be less affected by the instance type or by the number of secondary resources. This can be seen by the times the GVNS finds the final solution, which increases with the instance size but does not change substantially with the instance properties. The standard deviations of the obtained objective values show that consistency of the performance of both approaches is rather similar.
Concerning the gaps between the upper bounds obtained from the relaxed MDDs and the lower bounds from the heuristic solutions (compare Tables 1 and 2), we can observe that they are only small for the small and medium sized instances but become rather large for our largest instances. For example for the skewed instances with 300 jobs, this gap even exceeds 340%. This also somewhat illustrates the difficulty of the considered problem and the limits of MDDs-or at least the limits of the considered construction methods.

Conclusions and future work
In this work we studied the application of multivalued decision diagrams (MDDs) for the prize-collecting job sequencing with one common and multiple secondary resources (PC-JSOCMSR) problem. To this end, we first presented a recursive model and showed how to obtain MDDs from the problem's state graph. Whenever, the size of MDDs become to large relaxed and restricted MDDs are employed to obtain dual bounds and heuristic solutions, Bold values show the best average bound respectively. We adapted the two main compilation techniques for relaxed MDDs proposed in the literature to PC-JSOCMSR: top down construction (TDC) and incremental refinement (IR). To obtain restricted MDDs we use a variant of the TDC. In our computational study we first compared the relaxed MDDs obtained by TDC and IR. While both methods perform rather similar on balanced instances, IRLP is clearly superior on the benchmark instances sampled from skewed distributions. Afterwards, we assessed the quality of the restricted MDDs compiled with the TDC by comparing the obtained heuristic solutions with the ones from an independent general variable neighborhood search (GVNS). While the TDC performs better than the GVNS on small to medium-sized instances, the GVNS is mostly superior on the largest instances of our benchmark suite.
The focus of this paper is on the compilation of MDDs themself. A next step is to exploit the produced MDDs in various ways in other algorithmic frameworks to ultimately solve also larger instances of the problem in better ways. For example highly promising are novel branching schemes on the basis of relaxed MDDs as described in Bergman et al. (2016b) or the utilization of MDDs in new inference techniques in constraint programming, see e.g. Cire and Hoeve (2013). Last but not least, relaxed MDDs also have a great potential to provide guidance for finding good heuristic solutions in advanced metaheuristics.
A further promising future research direction seems to be to rethink the strict layered structure of the MDDs. For problems like PC-JSOCMSR their exist equivalent states on different layers. If we allow "long arcs" over multiple layers, such equivalent states may be represented by a single node possibly yielding more compact MDDs. Such a generalization provides new opportunities for merging states but also bears new challenges on how to identify nodes to be merged efficiently.