Scheduling a Proportionate Flow Shop of Batching Machines

In this paper we study a proportionate flow shop of batching machines with release dates and a fixed number $m \geq 2$ of machines. The scheduling problem has so far barely received any attention in the literature, but recently its importance has increased significantly, due to applications in the industrial scaling of modern bio-medicine production processes. We show that for any fixed number of machines, the makespan and the sum of completion times can be minimized in polynomial time. Furthermore, we show that the obtained algorithm can also be used to minimize the weighted total completion time, maximum lateness, total tardiness and (weighted) number of late jobs in polynomial time if all release dates are $0$. Previously, polynomial time algorithms have only been known for two machines.


Introduction
Modern medicine can treat some serious illnesses using individualized drugs, which are produced to order for a specific patient and adapted to work only for that unique patient and nobody else. Manufacturing such a drug often involves a complex production line, consisting of many different steps.
If each step of the process is performed manually, by a laboratory worker, then each laboratory worker can only handle materials for one patient at a time. However, once the process needs to be implemented on an industrial scale, some steps are instead performed by actual machines, like pipetting robots. Such high-end machines can typically handle materials for multiple patients in one go. If scheduled efficiently, this special feature can drastically increase the throughput of the production line. Clearly, in such an environment efficient operative planning is crucial in order to optimize the performance of the manufacturing process and treat as many patients as possible.
Formally, the manufacturing process studied in this paper is structured in a flow shop manner, where each step is handled by a single, dedicated machine. A job $J_j$, $j = 1, 2, \ldots, n$, representing the production of a drug for a specific patient, has to be processed by machines $M_1, M_2, \ldots, M_m$ in order of their numbering. Each job $J_j$ has a release date $r_j \geq 0$, denoting the time at which the job $J_j$ is available for processing at the first machine $M_1$. Furthermore, a job is only available for processing at machine $M_i$, $i = 2, 3, \ldots, m$, when it has finished processing on the previous machine $M_{i-1}$.
Processing times are job-independent, meaning that each machine $M_i$, $i = 1, 2, \ldots, m$, has a fixed processing time $p_i$, which is the same for every job when processed on that machine. In the literature, a flow shop with machine- or job-independent processing times is sometimes called a proportionate flow shop, see e.g. (Panwalkar et al. 2013).
Recall that, as a special feature from our application, each machine in the flow shop can handle multiple jobs at the same time. Machines of this kind are called (parallel) batching machines, and a set of jobs processed at the same time on some machine is called a batch on that machine (Brucker 2007, Chapter 8). All jobs in one batch $B^{(i)}_k$ on some machine $M_i$ have to start processing on $M_i$ at the same time. In particular, all jobs in $B^{(i)}_k$ have to be available for processing on $M_i$ before $B^{(i)}_k$ can be started. The processing time of a batch on $M_i$ remains $p_i$, no matter how many jobs are included in this batch. This distinguishes parallel batching machines from serial batching machines, where the processing time of a batch is calculated as the sum of the processing times of the jobs contained in it plus an additional setup time. Each machine $M_i$, $i = 1, 2, \ldots, m$, has a maximum batch size $b_i$, which is the maximum number of jobs a batch on machine $M_i$ may contain.
Given a feasible schedule $S$, we denote by $c_{ij}(S)$ the completion time of job $J_j$ on machine $M_i$. For the completion time of job $J_j$ on the last machine we also write $C_j(S) = c_{mj}(S)$. If there is no confusion about which schedule is considered, we may omit the reference to the schedule and simply write $c_{ij}$ and $C_j$.
As optimization criteria, in this paper we are interested in objective functions of the forms
$$f_{\max} = \max_{j=1,2,\ldots,n} f_j(C_j) \qquad (1)$$
or
$$\sum f_j = \sum_{j=1}^{n} f_j(C_j). \qquad (2)$$
In particular, the first part of the paper will be dedicated to the minimization of the makespan $C_{\max} = \max\{C_j \mid j = 1, 2, \ldots, n\}$ and of the total completion time $\sum C_j = \sum_{j=1}^{n} C_j$, although the obtained algorithm will work for an arbitrary objective function of type (1) or (2), if certain preconditions are met.
In the second part of the paper, we assume that for each job $J_j$ a weight $w_j$ or a due date $d_j$ is given. We focus on the following traditional scheduling objectives:
- the weighted total completion time $\sum w_j C_j = \sum_{j=1}^{n} w_j C_j$,
- the maximum lateness $L_{\max} = \max\{C_j - d_j \mid j = 1, 2, \ldots, n\}$,
- the total tardiness $\sum T_j = \sum_{j=1}^{n} T_j$,
- the number of late jobs $\sum U_j = \sum_{j=1}^{n} U_j$,
- the weighted number of late jobs $\sum w_j U_j = \sum_{j=1}^{n} w_j U_j$,
where $T_j = \max\{C_j - d_j, 0\}$, and $U_j = 1$ if $C_j > d_j$ and $U_j = 0$ otherwise.
Note that all these objective functions are regular, that is, nondecreasing in each job completion time $C_j$. Using the standard three-field notation for scheduling problems (Graham et al. 1979), our problem is denoted as $Fm \mid r_j, p_{ij} = p_i, p\text{-batch}, b_i \mid f$, where $f$ is a function of the form defined above. We refer to the described scheduling model as proportionate flow shop of batching machines and abbreviate it by PFB.
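The objectives listed above can be evaluated directly from a vector of completion times. The following is a minimal sketch, assuming the standard definitions $T_j = \max\{C_j - d_j, 0\}$ and $U_j = 1$ exactly when $C_j > d_j$. The function name `objectives` and the dictionary keys are illustrative choices, not notation from the paper.

```python
def objectives(C, w=None, d=None):
    """Evaluate the scheduling objectives for completion times C.

    C, w, d are lists indexed by job; w (weights) and d (due dates)
    are optional, and only the objectives they allow are returned.
    """
    out = {"Cmax": max(C), "sumC": sum(C)}
    if w is not None:
        out["sum_wC"] = sum(wj * Cj for wj, Cj in zip(w, C))
    if d is not None:
        lateness = [Cj - dj for Cj, dj in zip(C, d)]
        late = [1 if L > 0 else 0 for L in lateness]   # U_j
        out["Lmax"] = max(lateness)
        out["sumT"] = sum(max(L, 0) for L in lateness)  # sum of T_j
        out["sumU"] = sum(late)
        if w is not None:
            out["sum_wU"] = sum(wj * uj for wj, uj in zip(w, late))
    return out


# Two jobs completing at times 4 and 6, with due dates 6 and 5.
values = objectives([4, 6], w=[1, 3], d=[6, 5])
```

All of these functions are nondecreasing in every entry of `C`, which is the regularity property used throughout the paper.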
Next, we provide an example in order to illustrate the problem setting.
Example 1 Consider a PFB instance with $m = 2$ machines, $n = 5$ jobs, batch sizes $b_1 = 3$ and $b_2 = 4$, processing times $p_1 = 2$ and $p_2 = 3$, and release dates $r_1 = r_2 = 0$, $r_3 = r_4 = 1$, and $r_5 = 2$. Figure 1 illustrates a feasible schedule for the instance as a job-oriented Gantt chart. Each rectangle labeled by a machine represents a batch of jobs processed together on this machine. The black area indicates that the respective jobs have not been released at this time yet. Note that in this example none of the batches can be started earlier, since either a job of the batch has just arrived when the batch is started, or the machine is occupied before. Still, the schedule does not minimize the makespan, since the schedule shown in Figure 2 is feasible as well and has a makespan of 8 instead of 9. The improvement in the makespan was achieved by reducing the size of the first batch on $M_1$ from three to two, which allows it to start one time step earlier.
Observe that no job can finish processing on $M_2$ before time 5, since $p_1 + p_2 = 5$. Moreover, not all five jobs fit into one batch on $M_2$. Hence, at least two batches are necessary on $M_2$ in any schedule, and the second of them cannot finish before time $5 + p_2 = 8$. Therefore, 8 is the optimal makespan and no further improvement is possible.
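The two makespans in Example 1 can be checked by fixing a batching and starting every batch as early as possible. The sketch below assumes processing times 2 on $M_1$ and 3 on $M_2$ and jobs indexed in earliest release date order. The helper `earliest_completions` is hypothetical, and the batch sizes passed for the first schedule are an assumption consistent with the description (a first batch of three jobs on $M_1$), since Figure 1 is not reproduced here.

```python
def earliest_completions(release, proc, caps, batch_sizes):
    """Completion times on the last machine for a permutation schedule.

    release     -- release dates, jobs already in processing order
    proc, caps  -- processing time p_i and maximum batch size b_i per machine
    batch_sizes -- for each machine, sizes of its consecutive batches
    Every batch is started as early as feasibility allows.
    """
    n = len(release)
    ready = list(release)              # availability on the current machine
    for p, cap, sizes in zip(proc, caps, batch_sizes):
        assert sum(sizes) == n and all(1 <= s <= cap for s in sizes)
        done, free, first = [], 0, 0
        for size in sizes:
            jobs = range(first, first + size)
            # wait for the machine and for the last job of the batch
            start = max(free, max(ready[j] for j in jobs))
            free = start + p
            done += [free] * size
            first += size
        ready = done
    return ready


release, proc, caps = [0, 0, 1, 1, 2], [2, 3], [3, 4]
first_schedule = earliest_completions(release, proc, caps, [[3, 2], [3, 2]])
second_schedule = earliest_completions(release, proc, caps, [[2, 3], [2, 3]])
makespans = (max(first_schedule), max(second_schedule))
```

Under these assumptions the first batching yields a makespan of 9 and the second a makespan of 8, matching the values stated in the example.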
The remainder of the paper is structured as follows. The second part of this section provides an overview of the related literature. In Section 2 we prove that permutation schedules with jobs in earliest release date order are optimal for PFBs with the makespan and total completion time objectives. We also show that permutation schedules need not be optimal for the other traditional scheduling objective functions. In Section 3 we construct a dynamic program to find the optimal permutation schedule with a given, fixed job order. We show that, if the number of machines $m$ is fixed, the dynamic program can be used to minimize the makespan or total completion time in a PFB in polynomial time. In Section 4 we consider PFBs where all release dates are equal. We show that, in this special case, permutation schedules are always optimal and that the dynamic program from Section 3 can be applied to solve most traditional scheduling objectives. Finally, in Section 5 we draw conclusions and point out future research directions.

Literature
The proportionate flow shop problem with batching machines itself has so far not received a lot of attention from researchers, although several applications have appeared in the literature. In addition to the application from pharmacy described in Section 1, Sung et al. (2000) mention a variety of applications in the manufacturing industry, e.g. in the production of semiconductors. Furthermore, there are several papers which study the scheduling of sequences of locks for ships along a canal or river, which can in some ways be seen as a generalization of PFBs (see, e.g., Passchyn and Spieksma (2019)).
Despite this multitude of possible applications, for PFBs as introduced in this paper significant hardness results have, to the best of our knowledge, not been achieved at all. Polynomial time algorithms are only known for the special case with no more than two machines and, in most cases, without release dates (Ahmadi et al. 1992; Sung and Yoon 1997; Sung and Kim 2003). Indeed, the paper by Sung et al. (2000) is, as far as we know, the only work in the literature considering a PFB with arbitrarily many machines. They propose a reduction procedure to decrease the size of an instance by eliminating dominated machines. Using this procedure, they develop heuristics to minimize the makespan and the sum of completion times in a PFB and conduct a computational study certifying quality and efficiency of their heuristic approach. However, they do not establish any complexity result.
For the case with $m = 2$ machines and no release dates, Ahmadi et al. (1992) show that the makespan can be minimized by completely filling all batches on the first machine (other than the last one, if the number of jobs is not a multiple of $b_1$) and completely filling all batches on the second machine (other than the first one, if the number of jobs is not a multiple of $b_2$). This structural result immediately yields an $O(n)$ time algorithm. If release dates do not all vanish, i.e. $r_j > 0$ for some $j = 1, 2, \ldots, n$, then the makespan can be minimized in $O(n^2)$ time, see (Sung and Yoon 1997). The authors use a dynamic programming algorithm in order to find an optimal batching on the first machine. On the second machine, the strategy to completely fill all batches (other than maybe the first one) is still optimal.
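The full-batch rule for two machines without release dates can be sketched as follows. This is only an illustration of the structural result attributed to Ahmadi et al. (1992) above, not their algorithm as published; the function name and signature are hypothetical, and every batch is assumed to start as early as possible.

```python
def two_machine_full_batch_makespan(n, p1, p2, b1, b2):
    """Makespan of the full-batch schedule for m = 2 and all r_j = 0.

    M_1: all batches full, except possibly the last one.
    M_2: all batches full, except possibly the first one.
    """
    sizes1 = [b1] * (n // b1) + ([n % b1] if n % b1 else [])
    sizes2 = ([n % b2] if n % b2 else []) + [b2] * (n // b2)

    # Batches on M_1 run back to back starting at time 0.
    done1 = []
    for index, size in enumerate(sizes1):
        done1 += [(index + 1) * p1] * size

    # A batch on M_2 waits for the machine and for its last job from M_1.
    free, first = 0, 0
    for size in sizes2:
        start = max(free, done1[first + size - 1])
        free = start + p2
        first += size
    return free


# Data of Example 1 with all release dates set to 0.
makespan = two_machine_full_batch_makespan(n=5, p1=2, p2=3, b1=3, b2=4)
```

The loop over batches is linear in the number of jobs, in line with the $O(n)$ bound mentioned above.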
For objectives other than the makespan, results are only known if all jobs are released at the same time. In this case, the total completion time in a two-machine PFB can be minimized in $O(n^3)$ time, see (Ahmadi et al. 1992). Sung and Kim (2003) provide dynamic programs to minimize $L_{\max}$, $\sum T_j$, and $\sum U_j$ in a two-machine PFB without release dates in polynomial time.
There are two special cases and one generalization of PFB which have been relatively extensively studied in the literature and which are of importance when it comes to comparison with the wider literature. First, consider the special case of PFB with maximum batch size equal to one on all machines. This is the usual proportionate flow shop problem, which is well-researched and known to be polynomially solvable for many variants. An overview is given in (Panwalkar et al. 2013). In particular, the makespan and the total completion time can be minimized in O(n log n + nm) time, by ordering jobs in earliest release date order and scheduling each job as early as possible on each machine. Note that the term n log n is due to the sorting of the jobs while the term nm arises from computing start and completion times for each operation of each job.
Second, consider the special case of PFB where the number of machines is fixed to $m = 1$, which leaves us with the problem of scheduling a single batching machine with identical (but not necessarily unit) processing times. This problem, too, has been investigated by various authors, see (Ikura and Gimple 1986; Lee et al. 1992; Baptiste 2000; Condotta et al. 2010). Even with release dates, the problem can be solved in polynomial time for almost all standard scheduling objective functions, except for the weighted total tardiness, for which it is open.
Third, in contrast to the two special cases from before, consider the generalization of PFB where processing times may be job- as well as machine-dependent. We end up with the usual flow shop problem with batching machines. Observe that this problem is also a generalization of the traditional flow shop as well as of scheduling a single batching machine. Therefore, we can immediately deduce that, even without release dates, it is strongly NP-hard to minimize the makespan for any fixed number of machines $m \geq 3$ and to minimize the total completion time for any fixed number of machines $m \geq 2$ (Garey et al. 1976). Furthermore, minimizing the maximum lateness (or, equivalently, the makespan with release dates) is strongly NP-hard even for the case with $m = 1$ and $b_1 = 2$ (Brucker et al. 1998). Finally, Potts et al. (2001) showed that scheduling a flow shop with batching machines to minimize the makespan, without release dates, is strongly NP-hard, even for $m = 2$ machines. Recall that for the traditional flow shop with $m = 2$ machines a schedule with minimum makespan can be computed in $O(n \log n)$ time (Johnson 1954).
In conclusion, we see that dropping either the batching or the multi-machine property of PFB leads to easy special cases, while dropping the proportionate property leads to a very hard generalization. Therefore, PFB lies exactly on the borderline between easy and hard problems, which makes it an interesting problem to study from a theoretical perspective, in addition to the practical considerations explained in the introduction.
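For the first special case mentioned above, the usual proportionate flow shop with batch size one on all machines, the earliest release date rule can be sketched in a few lines. The function name is hypothetical; the sort accounts for the $n \log n$ term and the nested loop for the $nm$ term.

```python
def proportionate_flow_shop(release, proc):
    """Schedule jobs in earliest release date order, each as early as possible.

    release -- release dates r_j
    proc    -- machine processing times p_1, ..., p_m (job-independent)
    Returns the job order and a dict of completion times on the last machine.
    """
    order = sorted(range(len(release)), key=lambda j: release[j])
    free = [0] * len(proc)        # time at which each machine becomes idle
    completion = {}
    for j in order:
        t = release[j]            # time at which job j reaches the next machine
        for i, p in enumerate(proc):
            t = max(t, free[i]) + p
            free[i] = t
        completion[j] = t
    return order, completion


order, completion = proportionate_flow_shop(release=[2, 0], proc=[1, 2])
```

With batching machines this greedy rule is no longer sufficient, because delaying a batch to include a later job can pay off, as Example 1 illustrates.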

Optimality of Permutation Schedules
Our approach to scheduling a PFB relies on the well-known concept of permutation schedules. In a permutation schedule the order of the jobs is the same on all machines of the flow shop. This means there exists a permutation $\sigma$ of the job indices such that $c_{i\sigma(1)} \leq c_{i\sigma(2)} \leq \ldots \leq c_{i\sigma(n)}$ for all $i = 1, \ldots, m$. Due to job-independent processing times and since a machine can only process one batch at a time, clearly the same then also holds for starting times instead of completion times. If there exists an optimal schedule which is a permutation schedule with a certain ordering $\sigma$ of the jobs, we say that permutation schedules are optimal and call $\sigma$ an optimal job ordering.
Suppose that for some problem there exists an optimal job ordering. In particular, this means that for this problem permutation schedules are optimal. Then the scheduling problem can be split into two parts: (i) find an optimal ordering σ of the jobs; (ii) for each machine, partition the job set into batches and order those batches in accordance with the ordering σ, such that the resulting schedule is optimal.
In this section we deal with part (i) and show that for $f \in \{C_{\max}, \sum C_j\}$ optimal job orderings exist and can be found in $O(n \log n)$ time. Then, in Section 3, we show that under certain, very general preconditions part (ii) can be done in polynomial time via dynamic programming.
We begin with a technical lemma which will help us to show optimality of permutation schedules in this section and also in Section 4.
Lemma 2 Let $S$ be a feasible schedule for a PFB and let $\sigma$ be some earliest release date ordering of the jobs. Then there exists a feasible permutation schedule $\hat{S}$ in which the jobs are ordered by $\sigma$ and the multi-set of job completion times in $\hat{S}$ is the same as in $S$.
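Proof Suppose that jobs are indexed according to ordering $\sigma$, i.e. $\sigma$ is the identity permutation. Otherwise renumber jobs accordingly. Note that since $\sigma$ is an earliest release date ordering, it follows that $r_1 \leq r_2 \leq \ldots \leq r_n$. For each machine $M_i$, let $B^{(i)}_1, B^{(i)}_2, \ldots$ be the batches processed on $M_i$ in $S$, in order of their start times, let $s^{(i)}_\ell$ be the start time of batch $B^{(i)}_\ell$, and let $k^{(i)}_\ell = |B^{(i)}_\ell|$ be its size. Schedule $\hat{S}$ consists of batches $\hat{B}^{(i)}_\ell$ with start times $s^{(i)}_\ell$, where $\hat{B}^{(i)}_\ell$ contains exactly the jobs $J_j$ with $k^{(i)}_1 + \ldots + k^{(i)}_{\ell-1} < j \leq k^{(i)}_1 + \ldots + k^{(i)}_\ell$. In other words, in schedule $\hat{S}$ we use batches with the same start and completion times and of the same size as in $S$. We only reassign which job is processed in which batch in such a way that the first batch on machine $M_i$ processes the first $k^{(i)}_1$ jobs (in order of their numbering, i.e. in the earliest release date order), the second batch processes jobs $J_{k^{(i)}_1 + 1}, \ldots, J_{k^{(i)}_1 + k^{(i)}_2}$, the third batch processes jobs $J_{k^{(i)}_1 + k^{(i)}_2 + 1}, \ldots, J_{k^{(i)}_1 + k^{(i)}_2 + k^{(i)}_3}$, and so on. Thus, in schedule $\hat{S}$, jobs are processed in order of their indices on all machines (i.e. in order $\sigma$).
Clearly, as all batches have exactly the same start and completion times, as well as the same sizes as in $S$, for schedule $\hat{S}$ the multi-set of job completion times on the last machine is exactly the same as for $S$.
It remains to show that $\hat{S}$ is feasible, i.e. no job is started on machine $M_1$ before its release date and no job is started on a machine $M_i$, $i = 2, 3, \ldots, m$, before finishing processing on machine $M_{i-1}$.
We first show that no batch $\hat{B}^{(1)}_\ell$ on machine $M_1$ violates any release dates. Indeed, due to the feasibility of $S$, at the time $s^{(1)}_\ell$ when batch $B^{(1)}_\ell$ is started in $S$, at least $k^{(1)}_1 + \ldots + k^{(1)}_\ell$ jobs have been released. Since jobs are indexed in earliest release date order, this implies that jobs $J_1, \ldots, J_{k^{(1)}_1 + \ldots + k^{(1)}_\ell}$ have been released by time $s^{(1)}_\ell$. As $\hat{B}^{(1)}_\ell$ contains only jobs among these, batch $\hat{B}^{(1)}_\ell$ does not violate any release dates.
Finally, using an analogous argument, we show that no batch starts processing on machine $M_i$ before all jobs it contains finish processing on machine $M_{i-1}$, $i = 2, 3, \ldots, m$. By time $s^{(i)}_\ell$, batches $B^{(i)}_1, \ldots, B^{(i)}_\ell$ have been started on $M_i$ in $S$, so in $S$ machine $M_{i-1}$ has finished processing at least $k^{(i)}_1 + \ldots + k^{(i)}_\ell$ jobs. Since the batches on $M_{i-1}$ have the same completion times and sizes in $\hat{S}$ as in $S$, by construction the same is true in schedule $\hat{S}$. In particular, because jobs are processed in order of their indices in $\hat{S}$, this means that at time $s^{(i)}_\ell$ jobs $J_1, \ldots, J_{k^{(i)}_1 + \ldots + k^{(i)}_\ell}$ have finished processing on $M_{i-1}$. As $\hat{B}^{(i)}_\ell$ contains only jobs among these, batch $\hat{B}^{(i)}_\ell$ starts only after each job contained in it has finished processing on machine $M_{i-1}$. Therefore, schedule $\hat{S}$ is feasible.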
□
Note that, in particular, if all release dates are equal, then any ordering $\sigma$ is an earliest release date ordering of the jobs. This fact will be used in Section 4.
Now we show that for a PFB with the makespan or total completion time objective, permutation schedules are optimal and that any earliest release date order is an optimal ordering of the jobs. This result generalizes (Sung and Yoon 1997, Proposition 1) to arbitrarily many machines. For makespan and total completion time, it also extends (Sung et al. 2000, Lemma 1) to the case with release dates.
Theorem 3 For a PFB with objective function $C_{\max}$ or $\sum C_j$, permutation schedules are optimal. Moreover, any earliest release date order is an optimal ordering of the jobs.
Proof Let $S$ be an optimal schedule with respect to makespan or total completion time. Let $\sigma$ be any earliest release date ordering. Using Lemma 2, construct a new permutation schedule $\hat{S}$ with jobs ordered by $\sigma$ on all machines and with the same multi-set of completion times as $S$. Since makespan and total completion time only depend on the multi-set of completion times, $\hat{S}$ is optimal, too.
□
Hence, when minimizing the makespan or the total completion time in a PFB, it is valid to limit oneself to permutation schedules in an earliest release date order.
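The reassignment behind Lemma 2 keeps every batch's start time and size and only changes which jobs it contains. A minimal sketch, assuming each machine's schedule is given as a list of `(start, jobs)` pairs with zero-based job indices; the data layout and function names are illustrative assumptions.

```python
def reassign_in_erd_order(batches_per_machine, release):
    """Refill the batches of a schedule with jobs in earliest release date order.

    batches_per_machine -- per machine, a list of (start_time, [job indices])
    release             -- release dates r_j
    Start times and batch sizes stay unchanged on every machine.
    """
    order = sorted(range(len(release)), key=lambda j: release[j])
    result = []
    for batches in batches_per_machine:
        refilled, position = [], 0
        for start, jobs in sorted(batches, key=lambda batch: batch[0]):
            size = len(jobs)
            refilled.append((start, order[position:position + size]))
            position += size
        result.append(refilled)
    return result


def completion_times(batches, p):
    """Completion time of every job on a machine with processing time p."""
    return {j: start + p for start, jobs in batches for j in jobs}


# A non-permutation schedule on three machines with p = (1, 2, 1):
# the two jobs swap their order on the last machine.
release = [0, 1]
original = [[(0, [0]), (1, [1])], [(2, [0, 1])], [(4, [1]), (5, [0])]]
permuted = reassign_in_erd_order(original, release)
before = sorted(completion_times(original[-1], 1).values())
after = sorted(completion_times(permuted[-1], 1).values())
```

The multi-set of completion times is unchanged, but the individual jobs finishing at each time may differ, which is why the argument covers $C_{\max}$ and $\sum C_j$ but not objectives with job-specific weights or due dates.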
To conclude this section, we present an example where no permutation schedule is optimal for several other objective functions. This shows that Theorem 3 does not hold for these objective functions. This also implies that the statements made in (Sung and Yoon 1997, Proposition 1), (Sung et al. 2000, Lemma 1), and (Sung and Kim 2003, Lemma 1) cannot be generalized to settings in which release dates and due dates or release dates and weights are present.
Example 4 Consider a PFB with $m = 3$ machines, $b_1 = p_1 = b_3 = p_3 = 1$, and $b_2 = p_2 = 2$. Suppose there are $n = 2$ jobs with release dates $r_1 = 0$ and $r_2 = 1$. Let due dates $d_1 = 6$ and $d_2 = 5$ be given. First, we show that no permutation schedule is optimal for $L_{\max}$. Figure 3 shows a job-oriented Gantt chart of a feasible schedule which is not a permutation schedule. Each rectangle labeled by a machine denotes a batch processed on this machine. The black box indicates that the second job has not been released yet. The gray shaded area denotes points in time after the due dates.
Fig. 3 A non-permutation schedule for the instance of Example 4.
When constructing a permutation schedule for this instance, two decisions have to be made, namely which of the two jobs is processed first, and whether a batch of size two or two batches of size one should be used on machine $M_2$, the only machine with maximum batch size larger than one. The resulting permutation schedules are shown in Figure 4. However, the schedule in Figure 3 achieves an objective value of zero, while all schedules in Figure 4 have a positive objective value. Hence, there is no optimal permutation schedule for this instance. The same example shows that there is no optimal permutation schedule for $\sum T_j$ and $\sum U_j$. Choosing weights $w_1 = 1$, $w_2 = 3$ leads to an instance where no permutation schedule is optimal for $\sum w_j C_j$, since the schedule in Figure 3 achieves an objective value of $6 \cdot 1 + 5 \cdot 3 = 21$, while all schedules in Figure 4 have an objective value of at least 22.
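The numbers in Example 4 can be reproduced by enumerating the four permutation schedules that start every batch as early as possible; for regular objectives, later starts cannot help. This is a sketch with zero-based job indices and hypothetical helper names. The completion times of the non-permutation schedule are taken from the value $6 \cdot 1 + 5 \cdot 3$ stated in the text.

```python
from itertools import permutations

RELEASE, DUE, WEIGHT = [0, 1], [6, 5], [1, 3]
PROC = [1, 2, 1]   # p_1, p_2, p_3; only M_2 can hold two jobs in one batch


def permutation_completions(order, batch_on_m2):
    """Completion times C_j when both jobs follow `order` on every machine."""
    ready = [RELEASE[j] for j in order]          # availability by position
    for i, p in enumerate(PROC):
        groups = [[0, 1]] if (i == 1 and batch_on_m2) else [[0], [1]]
        free, done = 0, [0, 0]
        for group in groups:
            start = max(free, max(ready[pos] for pos in group))
            free = start + p
            for pos in group:
                done[pos] = free
        ready = done
    return {order[pos]: ready[pos] for pos in range(2)}


def l_max(C):
    return max(C[j] - DUE[j] for j in C)


def weighted_sum(C):
    return sum(WEIGHT[j] * C[j] for j in C)


candidates = [permutation_completions(order, batch)
              for order in permutations(range(2))
              for batch in (True, False)]
best_lmax = min(l_max(C) for C in candidates)
best_weighted = min(weighted_sum(C) for C in candidates)

non_permutation = {0: 6, 1: 5}   # first job finishes at 6, second at 5
```

Every permutation schedule has maximum lateness at least 1 and weighted completion time at least 22, while the non-permutation schedule reaches 0 and 21.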

Dynamic Programming to Find an Optimal Schedule for a Given Job Permutation
In this section we show that for a fixed number $m$ of machines and a fixed ordering $\sigma$ of the jobs, we can construct a polynomial time dynamic program which finds a permutation schedule that is optimal among all schedules obeying job ordering $\sigma$. The algorithm works for an arbitrary regular sum or bottleneck objective function. Combined with Theorem 3 this shows that the makespan and the total completion time in a PFB can be minimized in polynomial time for fixed $m$.
The dynamic program is based on the important observation that, for a given machine M i , the set of possible job completion times on M i is not too large. In order to formalize this statement, we introduce the notion of a schedule being Γ -active. For ease of notation, we use the shorthand [k] = {1, 2, . . . , k}.
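Definition 5 Given an instance of PFB with $n$ jobs and $m$ machines, let
$$\Gamma_i = \Big\{ r_{j'} + \sum_{i'=1}^{i} \lambda_{i'} p_{i'} \;\Big|\; j' \in [n],\ \lambda_{i'} \in [n] \text{ for all } i' \in [i] \Big\}$$
for all $i \in [m]$. A schedule $S$ is called $\Gamma$-active if $c_{ij}(S) \in \Gamma_i$ for all $j \in [n]$ and all $i \in [m]$.
Note that on machine $M_i$, there are only $|\Gamma_i| \leq n^{i+1} \leq n^{m+1}$ possible job completion times to consider for any $\Gamma$-active schedule $S$. Now we show that for a PFB problem with a regular objective function, any schedule can be transformed into a $\Gamma$-active schedule without increasing the objective value. To do so, we prove that this is true even for a slightly stronger concept than $\Gamma$-active.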
Definition 6 A schedule is called batch-active, if no batch can be started earlier without violating feasibility of the schedule and without changing the order of batches.
Clearly, given a schedule S that is not batch-active, by successively removing unnecessary idle times we can obtain a new, batch-active schedule S ′ . Furthermore, for regular objective functions, S ′ has objective value no higher than the original schedule S. These two observations immediately yield the following lemma.
Lemma 7 A schedule for a PFB can always be transformed into a batch-active schedule such that any regular objective function is not increased. This transformation does not change the order in which the jobs are processed on the machines.
Now we show that, indeed, being batch-active is stronger than being $\Gamma$-active, in other words that every batch-active schedule is also $\Gamma$-active. This result generalizes an observation made by Baptiste (2000, Section 2.1).

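Lemma 8 In a PFB, any batch-active schedule is also $\Gamma$-active.
Proof Let $S$ be a batch-active schedule. For $j \in [n]$ and $i \in [m]$, let $B^{(i)}_\ell$ be the batch which contains job $J_j$ on machine $M_i$. Since the schedule is batch-active, $B^{(i)}_\ell$ is started either at the completion time of the previous batch on $M_i$, or at the time at which the last of its jobs becomes available on $M_i$, that is, at the release date of one of its jobs if $i = 1$ or at the completion time of one of its jobs on $M_{i-1}$ if $i > 1$. The claim follows inductively by observing that the former case can happen at most $n - 1$ times in a row, since there are at most $n$ batches on each machine.
□
Together, Lemmas 7 and 8 imply the following desired property.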
Lemma 9 Let S be a schedule for a PFB problem with regular objective function f . Then there exists a schedule S ′ for the same PFB problem, such that 1. on each machine, jobs appear in the same order in S ′ as they do in S, 2. S ′ has objective value no larger than S, and 3. S ′ is Γ -active.
In particular, Lemma 9 shows that, if for a PFB problem an optimal job ordering σ is given, then there exists an optimal schedule, which is a permutation schedule with jobs ordered by σ and in addition Γ -active.
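The candidate completion times can be enumerated machine by machine. The sketch below assumes that $\Gamma_i$ consists of all values $r_{j'} + \sum_{i' \leq i} \lambda_{i'} p_{i'}$ with $j' \in [n]$ and $\lambda_{i'} \in [n]$, which is the reading consistent with the bound $|\Gamma_i| \leq n^{i+1}$; the function name `gamma_sets` is hypothetical.

```python
def gamma_sets(release, proc):
    """Candidate completion times Gamma_1, ..., Gamma_m as sorted lists.

    Gamma_i is built from Gamma_{i-1} (with Gamma_0 the set of release
    dates) by adding lambda * p_i for every lambda in {1, ..., n}.
    """
    n = len(release)
    current, sets = set(release), []
    for p in proc:
        current = {base + lam * p for base in current for lam in range(1, n + 1)}
        sets.append(sorted(current))
    return sets


# Instance of Example 4: r = (0, 1), p = (1, 2, 1).
gammas = gamma_sets(release=[0, 1], proc=[1, 2, 1])
sizes = [len(g) for g in gammas]
bounds = [2 ** (i + 2) for i in range(3)]   # n^(i+1) for n = 2, i = 1, 2, 3
```

Because many sums coincide, the sets are often much smaller than the worst-case bound, which matters for the practical size of the dynamic program's state space.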
From this point on, we assume that the objective function is a regular sum or bottleneck function, that is, $f(C) = \bigoplus_{j=1}^{n} f_j(C_j)$, where $\bigoplus \in \{\sum, \max\}$ and $f_j$ is nondecreasing for all $j \in [n]$. We also use the symbol $\oplus$ as a binary operator. In what follows, we present a dynamic program which, given a job ordering $\sigma$, finds a schedule that is optimal among all $\Gamma$-active permutation schedules obeying job ordering $\sigma$. For simplicity, we assume that jobs are already indexed by $\sigma$. The dynamic program schedules the jobs one after the other, until all jobs are scheduled. Due to Lemma 9, if $\sigma$ is an optimal ordering, then the resulting schedule is optimal.
Given an instance $I$ of a PFB and a job index $j \in [n]$, let $I_j$ be the modified instance which contains only the jobs $J_1, \ldots, J_j$. For a vector $t = (t_1, t_2, \ldots, t_m)$ of $m$ possible completion times with $t_i \in \Gamma_i$, $i \in [m]$, and a vector $k = (k_1, k_2, \ldots, k_m)$ of $m$ possible batch sizes with $k_i \in [b_i]$, $i \in [m]$, we say that a schedule $S$ for instance $I_j$ corresponds to $t$ and $k$ if, for all $i \in [m]$, $c_{ij} = t_i$ and job $J_j$ is contained in a batch with exactly $k_i$ jobs (including $J_j$) on $M_i$. Next, we define the variables $g$ of the dynamic program. Let $\mathcal{S}(j, t, k)$ be the set of feasible permutation schedules $S$ for instance $I_j$ satisfying the following properties:
1. jobs in $S$ are ordered by their indices,
2. $S$ corresponds to $t$ and $k$, and
3. $S$ is $\Gamma$-active.
Then, we define $g(j, t, k) = g(j, t_1, t_2, \ldots, t_m, k_1, k_2, \ldots, k_m)$ to be the minimum objective value of a schedule in $\mathcal{S}(j, t, k)$ for instance $I_j$. If no such schedule exists, the value of $g$ is defined to be $+\infty$.
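The next lemma shows how to calculate the starting values of function $g$ for $j = 1$.
Lemma 10 For $t$ with $t_i \in \Gamma_i$ and $k$ with $k_i \in [b_i]$, $i \in [m]$, the starting values of $g$ are given by
$$g(1, t, k) = \begin{cases} f_1(t_m), & \text{if conditions (i)-(iii) hold}, \\ +\infty, & \text{otherwise}, \end{cases}$$
where conditions (i)-(iii) are given by
(i) $t_1 \geq r_1 + p_1$,
(ii) $t_i \geq t_{i-1} + p_i$ for all $i = 2, 3, \ldots, m$, and
(iii) $k_i = 1$ for all $i \in [m]$.
Proof Conditions (i)-(iii) are necessary for the existence of a schedule in $\mathcal{S}(1, t, k)$ because $I_1$ consists of only one job and there must be enough time to process this job on each machine. Conversely, if (i)-(iii) are satisfied, then the schedule defined by $c_{i1} = t_i$ is a schedule in $\mathcal{S}(1, t, k)$ with objective value $f_1(C_1) = f_1(c_{m1}) = f_1(t_m)$.
□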
Now we turn to the recurrence formula to calculate g for j > 1 from the values of g for j − 1.
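Lemma 11 For $j > 1$, $t$ with $t_i \in \Gamma_i$ and $k$ with $k_i \in [b_i]$, $i \in [m]$, the values of $g$ are determined by
$$g(j, t, k) = \begin{cases} \min\{ f_j(t_m) \oplus g(j-1, t', k') \mid t', k' \text{ satisfy } (**) \}, & \text{if } (*) \text{ holds}, \\ +\infty, & \text{otherwise}, \end{cases}$$
where the minimum over the empty set is defined to be $+\infty$. Here, $(*)$ is given by conditions
(i) $t_1 \geq r_j + p_1$,
(ii) $t_i \geq t_{i-1} + p_i$ for all $i = 2, 3, \ldots, m$,
and $(**)$ is given by conditions
(iii) $t'_i \in \Gamma_i$ for all $i \in [m]$,
(iv) $k'_i \in [b_i]$ for all $i \in [m]$,
(v) if $k_i = 1$, then $t'_i \leq t_i - p_i$, and
(vi) if $k_i > 1$, then $t'_i = t_i$ and $k'_i = k_i - 1$.
Proof The conditions of $(*)$ are necessary for the existence of a schedule in $\mathcal{S}(j, t, k)$ because there must be enough time to process job $J_j$ on each machine. Therefore, $g$ takes the value $+\infty$ if $(*)$ is violated. For the remainder of the proof, assume that $(*)$ is satisfied. Hence, we have to show that
$$g(j, t, k) = \min\{ f_j(t_m) \oplus g(j-1, t', k') \mid t', k' \text{ satisfy } (**) \}. \qquad (3)$$
We first prove "$\geq$". If the left-hand side of (3) equals infinity, then this direction follows immediately. Otherwise, by the definition of $g$, there must be a schedule $S \in \mathcal{S}(j, t, k)$ with objective value $g(j, t, k)$. Schedule $S$ naturally defines a feasible schedule $S'$ for instance $I_{j-1}$ by ignoring job $J_j$.
Observe that, because $S$ belongs to $\mathcal{S}(j, t, k)$ and therefore is $\Gamma$-active, $S'$ is also $\Gamma$-active and job $J_{j-1}$ finishes processing on machine $M_i$ at some time $t'_i \in \Gamma_i$ in a batch of some size $k'_i \in [b_i]$, which satisfy (iii) and (iv) from $(**)$. Note that, in particular, $S' \in \mathcal{S}(j-1, t', k')$.
Furthermore, $t'$ and $k'$ also satisfy (v) and (vi) from $(**)$. Indeed, due to the fixed job permutation, one of the following two things happens on each machine $M_i$ in $S$: either jobs $J_j$ and $J_{j-1}$ are batched together, or job $J_j$ is batched in a singleton batch. In the former case, it follows that $1 < k_i = k'_i + 1$ and $t'_i = t_i$, while the latter case requires $k_i = 1$ and $t_i \geq t'_i + p_i$, since the machine is occupied by the previous batch with job $J_{j-1}$ until then.
Thus, $t'$ and $k'$ satisfy $(**)$ and we obtain
$$g(j, t, k) = f_j(t_m) \oplus f(S') \geq f_j(t_m) \oplus g(j-1, t', k'),$$
where $f(S')$ denotes the objective value of $S'$ for instance $I_{j-1}$ and the last inequality follows due to the definition of $g$ and $S' \in \mathcal{S}(j-1, t', k')$. Hence, the "$\geq$" direction in (3) follows because $t'$ and $k'$ satisfy $(**)$. For the "$\leq$" direction, if the right-hand side of (3) equals infinity, then this direction follows immediately. Otherwise, let $t'$ and $k'$ be minimizers of the right-hand side. By definition of $g$ there must be a schedule $S' \in \mathcal{S}(j-1, t', k')$ for $I_{j-1}$ with objective value $g(j-1, t', k')$.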
We now show that $S'$ can be extended to a feasible schedule $S \in \mathcal{S}(j, t, k)$. Construct $S$ from $S'$ by adding job $J_j$ in the following way:
- if in $S'$ there is a batch on machine $M_i$ which ends at time $t_i$, add $J_j$ to that batch (we show later that this does not cause the batch to exceed the maximum batch size $b_i$ of $M_i$);
- otherwise, add a new batch on machine $M_i$ finishing at time $t_i$ and containing only job $J_j$ (we show later that this does not create overlap with any other batch on machine $M_i$).
First note that c ij (S) = t i for all i ∈ [m], and thus S corresponds to t by definition. Furthermore, for each machine M i consider the two cases (a) k i = 1 and (b) k i > 1.
In case (a), due to (v) it follows that t ′ i ≤ t i − p i . Since job J j−1 finishing at time t ′ i is the last job completed on machine M i in schedule S ′ , by construction in schedule S job J j is in a singleton batch on machine M i that starts at time t i −p i and ends at time t i . Therefore, in this case, S corresponds to k, because k i = 1 and job J j is in a singleton batch. Also, since the last batch on machine M i in S ′ ends at time t ′ i ≤ t i − p i and the batch with job J j starts at time t i − p i , no overlapping happens between the new batch for job J j and any other batch on machine M i .
In case (b), due to (vi) it follows that t ′ i = t i and by construction job J j is scheduled in the same batch as job J j−1 on machine M i in S. Since S ′ corresponds to k ′ and again due to (vi), this means that J j is scheduled in a batch with k i jobs on machine M i and S corresponds to k in this case. Also, since k i ∈ [b i ] by definition, no batch in S exceeds its permissible size.
Combining the considerations for cases (a) and (b), together with the feasibility of schedule $S'$, it follows that (1) $S$ corresponds to $k$, (2) no overlapping happens between batches in $S$, and (3) all batches in $S$ are of permissible size.
In order to show feasibility of $S$ it is only left to show that no job starts before its release date on machine $M_1$, and no job starts on machine $M_i$ before it is completed on machine $M_{i-1}$. As $S'$ is feasible, this is clear for all jobs other than $J_j$. On the other hand, since $S$ corresponds to $t$, and $t$ fulfills $(*)$, it also follows for job $J_j$. Thus, $S$ is indeed feasible. In total, we have shown all conditions for $S \in \mathcal{S}(j, t, k)$. Therefore, we obtain
$$g(j, t, k) \leq f_j(t_m) \oplus f(S') = f_j(t_m) \oplus g(j-1, t', k'),$$
where $f_j(t_m) \oplus f(S')$ is the objective value of $S$ and the last equality is due to the choice of $S'$ as a schedule with objective value $g(j-1, t', k')$. Hence, the "$\leq$" direction in (3) follows due to the choice of $t'$ and $k'$ as minimizers of the right-hand side.
□
Using these formulas we can prove the main theorem of this section.

Theorem 12 Consider a PFB instance with a constant number $m$ of machines and a regular sum or bottleneck objective function. Then, for a given ordering of the jobs, the best permutation schedule can be found in time $O(n^{m^2+5m+1})$.
Proof Using Lemmas 10 and 11, one can calculate the values of $g$ for all $j \in [n]$ and all $t$ and $k$ with $t_i \in \Gamma_i$ and $k_i \in [b_i]$, $i \in [m]$. For a fixed job index $j$ the number of these values is
$$\prod_{i=1}^{m} |\Gamma_i| \cdot \prod_{i=1}^{m} b_i \leq n^2 \cdot n^3 \cdots n^{m+1} \cdot n^m = n^{(m^2+5m)/2}.$$
Here, the first $m$ factors of the bound are due to $|\Gamma_i| \leq n^{i+1}$ and the last factor $n^m$ is due to the inequalities $b_i \leq n$ for all $i \in [m]$. By Lemma 11, each value of $g$ for $j > 1$ is obtained as a minimum over at most $n^{(m^2+5m)/2}$ values of $g$ for $j - 1$. Hence, the values of $g$ for all $n$ job indices can be computed in time $O(n \cdot n^{(m^2+5m)/2} \cdot n^{(m^2+5m)/2}) = O(n^{m^2+5m+1})$.
Due to Lemma 9, the objective value of the best permutation schedule ordered by the job indices is the minimum of all values of $g$ for $j = n$. If for each value of $g$ we additionally store a reference to the previous value, we can reconstruct the corresponding schedule by following these references in a backtracking manner. This does not increase the asymptotic runtime, since computing the values is more expensive than backtracking.
□
Combining Theorems 3 and 12, we obtain:
Corollary 13 For a PFB with release dates and a fixed number $m$ of machines, the makespan can be minimized in $O(n^{m^2+5m+1})$ time. The same holds for the total completion time.
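In particular, for any fixed number $m$ of machines, the problems of minimizing the makespan and the total completion time in a PFB with release dates can be solved in polynomial time.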
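For small instances, the dynamic program of this section can be sketched as follows. This is a forward formulation of the recurrence, written under assumptions: $\Gamma_i$ is taken to be the set of values $r_{j'} + \sum_{i' \leq i} \lambda_{i'} p_{i'}$ with $\lambda_{i'} \in [n]$, jobs are already indexed in the given order, and all function names are hypothetical. It returns only the optimal value, not the schedule, and makes no attempt to match the stated runtime bound.

```python
from itertools import product
from math import inf


def gamma_sets(release, proc):
    """Candidate completion times per machine (assumed form of Gamma_i)."""
    n, current, sets = len(release), set(release), []
    for p in proc:
        current = {base + lam * p for base in current for lam in range(1, n + 1)}
        sets.append(sorted(current))
    return sets


def enough_time(t, r, proc):
    """A job released at r can complete at times t_1, ..., t_m."""
    return t[0] >= r + proc[0] and all(
        t[i] >= t[i - 1] + proc[i] for i in range(1, len(proc)))


def best_value(release, proc, cap, cost, combine):
    """Best objective value among permutation schedules in index order.

    cost(j, C) plays the role of f_j(C); combine is max or addition.
    A state (t, k) stores completion times and batch sizes of the last job.
    """
    n, m = len(release), len(proc)
    gam = gamma_sets(release, proc)
    # Starting values: the first job sits alone in a batch on every machine.
    layer = {(t, (1,) * m): cost(0, t[-1])
             for t in product(*gam) if enough_time(t, release[0], proc)}
    for j in range(1, n):
        nxt = {}
        for (tp, kp), value in layer.items():
            options = []
            for i in range(m):
                # Join the batch of the previous job, if it has room ...
                opts = [(tp[i], kp[i] + 1)] if kp[i] < cap[i] else []
                # ... or open a new batch after the previous one has finished.
                opts += [(t, 1) for t in gam[i] if t >= tp[i] + proc[i]]
                options.append(opts)
            for choice in product(*options):
                t = tuple(c[0] for c in choice)
                if not enough_time(t, release[j], proc):
                    continue
                k = tuple(c[1] for c in choice)
                new = combine(value, cost(j, t[-1]))
                if new < nxt.get((t, k), inf):
                    nxt[(t, k)] = new
        layer = nxt
    return min(layer.values())


# Example 1: minimum makespan for jobs in earliest release date order.
example1_makespan = best_value([0, 0, 1, 1, 2], [2, 3], [3, 4],
                               cost=lambda j, C: C, combine=max)
```

Passing `combine=lambda a, b: a + b` instead of `max` gives the corresponding sum objective; storing a predecessor for each state would allow the schedule itself to be reconstructed by backtracking.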

Equal Release Dates
The dynamic program presented in the previous section works for a very general class of objective functions. Still, Corollary 13 only holds for makespan and total completion time because for other standard objective functions permutation schedules are not optimal, as shown in Example 4.
However, in the case where all release dates are equal, it turns out that permutation schedules are always optimal. We first show that in the case of equal release dates, any feasible schedule S can be transformed into a feasible permutation schedule S ′ with equal objective value.
Lemma 14 Let S be a feasible schedule for a PFB where all jobs are released at time 0. Then there exists a feasible permutation schedule S ′ such that C j (S) = C j (S ′ ) for all j = 1, 2, . . . , n.
Proof Let σ be the ordering of jobs on the last machine. Since r j = 0 for all j = 1, 2, . . . , n, ordering σ is an earliest release date order. Thus we can use Lemma 2 to construct a permutation schedule S ′ , where jobs are ordered by σ on all machines and the multi-set of completion times is the same as for S. Furthermore, since the ordering on the last machine in S ′ is the same as in S, in S ′ each job has the same completion time as in S, i.e. C j (S) = C j (S ′ ) for all j = 1, 2, . . . , n.
□

From Lemma 14, it follows that permutation schedules are optimal for PFBs with equal release dates. The following theorem generalizes the result presented by Rachamadugu et al. (1982) for the usual proportionate flow shop without batching.
Theorem 15 For any objective function, if an optimal schedule exists for a PFB problem without release dates, then there also exists an optimal permutation schedule.
Proof Apply Lemma 14 to an optimal schedule S in order to achieve an optimal permutation schedule S ′ .
□

Note that for scheduling problems it is usually assumed that objective functions are defined in such a way that at least one optimal schedule exists. In particular, this is true for all regular objective functions in this paper, since Lemma 7 shows that for regular objective functions batch-active schedules are optimal, and there are only finitely many batch-active schedules for a PFB. Thus, the restriction in the statement of the theorem is only a formality. Of course, it may still be hard to find optimal permutations. However, the following theorem shows that for several traditional objective functions, an optimal permutation can be found efficiently.
Theorem 16 Consider a PFB without release dates.

(i) For minimizing the total weighted completion time, any ordering by non-increasing weights is optimal.
(ii) For minimizing maximum lateness and total tardiness, any earliest due date order is optimal.
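Proof For proving (i), let S be a permutation schedule minimizing the total weighted completion time. Assume that jobs are indexed according to their ordering in S. Let σ be a job ordering with respect to non-increasing weights, i.e. $w_{\sigma(1)} \geq w_{\sigma(2)} \geq \dots \geq w_{\sigma(n)}$. Suppose there is at least one pair of jobs $J_{j_1}, J_{j_2}$ with $j_1 < j_2$ and $\sigma(j_1) > \sigma(j_2)$. Since all jobs are released at time 0 and are identical apart from their weights, swapping $J_{j_1}$ with $J_{j_2}$ in S merely exchanges their completion times. Hence, the swap yields a new permutation schedule S′, with objective value

$$\sum_{j=1}^{n} w_j C_j(S') = \sum_{j=1}^{n} w_j C_j(S) + (w_{j_1} - w_{j_2})\,\bigl(C_{j_2}(S) - C_{j_1}(S)\bigr).$$

Due to $j_1 < j_2$ and $\sigma(j_1) > \sigma(j_2)$, we have $C_{j_2}(S) \geq C_{j_1}(S)$ and $w_{j_1} \leq w_{j_2}$. Therefore, it follows that

$$\sum_{j=1}^{n} w_j C_j(S') \leq \sum_{j=1}^{n} w_j C_j(S),$$

so S′ is optimal as well. Performing such exchanges of jobs sequentially, we eventually obtain an optimal permutation schedule S* with jobs scheduled in order of non-increasing weights.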
Part (ii) can be proven by exchange arguments analogous to (i). A detailed proof can be found in the appendix.
□

From Theorems 12 and 16, we immediately obtain the following complexity results.
Corollary 17 For a PFB without release dates and a fixed number m of machines, the total weighted completion time can be minimized in $O(n^{m^2+5m+1})$ time. The same holds for the maximum lateness and the total tardiness.
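The exchange argument behind Theorem 16 (i) can be checked by brute force on small instances. The sketch below makes a simplifying assumption: the completion times on the last machine are given as a fixed non-decreasing list `slots` (equal entries model jobs sharing a batch), and only the assignment of jobs to these slots varies. All function names are hypothetical helpers, not part of the algorithm from Section 3.

```python
from itertools import permutations


def weighted_completion(order, weights, slots):
    """Total weighted completion time if the job at position q of `order`
    completes at time slots[q]."""
    return sum(weights[j] * slots[q] for q, j in enumerate(order))


def non_increasing_weight_order(weights):
    """Job indices sorted by non-increasing weight (ties in index order)."""
    return sorted(range(len(weights)), key=lambda j: -weights[j])


def brute_force_optimum(weights, slots):
    """Minimum objective value over all job orders (small n only)."""
    jobs = range(len(weights))
    return min(weighted_completion(p, weights, slots) for p in permutations(jobs))


weights = [3, 1, 4, 1, 5]
slots = [2, 2, 5, 5, 9]  # assumed completion times, non-decreasing
order = non_increasing_weight_order(weights)
print(weighted_completion(order, weights, slots) == brute_force_optimum(weights, slots))
```

The comparison enumerates all n! orders, so it is only meant as a sanity check for small n.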
It is also possible to minimize the (weighted) number of late jobs in $O(n^{m^2+5m+1})$ time. However, to achieve this result, it is necessary to adjust the algorithm from Section 3 slightly. Suppose that jobs are ordered in an earliest due date order and suppose further that J is the set of on-time jobs in an optimal solution for a PFB to minimize the (weighted) number of late jobs. Note that in this case, we can find an optimal schedule by first scheduling all jobs in J , in earliest due date order, using the algorithm from Section 3, and then scheduling all late jobs in an arbitrary, feasible way.
Usually, however, the optimal set of on-time jobs is not known in advance. Thus the dynamic program has to be adapted in such a way that it finds the optimal set of on-time jobs and a corresponding optimal schedule simultaneously. This can be done by, roughly described, allowing the algorithm to ignore a job: in the j-th step of the algorithm, when job J j (in earliest due date order) is scheduled, the algorithm can not only decide to schedule the job in sequence, but can alternatively decide to skip it, making it late (and paying a penalty w j ). In all other regards, the algorithm remains exactly the same as presented in Section 3. The technical details to prove correctness and running time of the algorithm are, as one would expect, very similar to Section 3 and are moved to the appendix.
We obtain the following theorem.
Theorem 18 For a PFB without release dates and a fixed number m of machines, the weighted number of late jobs can be minimized in $O(n^{m^2+5m+1})$ time.
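The "schedule or skip" idea can be illustrated on a much simpler problem than a PFB. The following sketch assumes a single machine without batching, equal processing times p, all release dates 0, and jobs indexed in earliest due date order; it is not the algorithm of this paper, and the function name and state definition are chosen for illustration only. In this simplified setting, the c-th on-time job completes at time c·p, so the state only needs to record how many jobs are on time so far.

```python
def min_weighted_late_jobs(p, due, weights):
    """Minimum total weight of late jobs on one machine with equal
    processing times p. Jobs must be indexed in earliest due date order."""
    n = len(due)
    inf = float("inf")
    # best[c]: minimum weight of skipped (late) jobs among the jobs seen
    # so far, given that exactly c of them are scheduled on time.
    best = [0] + [inf] * n
    for j in range(n):
        new = [inf] * (n + 1)
        for c in range(n + 1):
            if best[c] == inf:
                continue
            # Option 1: skip job j, making it late and paying w_j.
            new[c] = min(new[c], best[c] + weights[j])
            # Option 2: schedule job j as the (c+1)-th on-time job,
            # which is allowed only if it meets its due date.
            if (c + 1) * p <= due[j]:
                new[c + 1] = min(new[c + 1], best[c])
        best = new
    return min(best)


# Four jobs in earliest due date order; only the first job has to be late.
print(min_weighted_late_jobs(2, [2, 3, 4, 7], [1, 5, 2, 4]))  # prints 1
```

In the PFB setting, the state additionally has to track the vectors t and k, which is what leads to the larger running time stated in Theorem 18.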
In particular, combining Corollary 17 and Theorem 18, we obtain that, for any fixed number m of machines, a PFB without release dates can be solved in polynomial time for the total weighted completion time, the maximum lateness, the total tardiness, and the weighted number of late jobs.

Note that, while minimizing makespan and total completion time in a PFB with a fixed number m of machines and without release dates can be handled in the same manner, a time complexity of $O(n^{m^2+5m+1})$ is, strictly speaking, no longer polynomial for those two problems. Indeed, since in problems with $f \in \{C_{\max}, \sum C_j\}$ jobs are completely identical, an instance can be encoded concisely by simply encoding the number n of jobs, rather than every job on its own. Such an encoding of the jobs takes only O(log n) space, meaning that even algorithms linear in n would no longer be polynomial in the length of the instance.
In the literature, scheduling problems for which such a concise encoding in O(log n) space is possible are usually referred to as high-multiplicity scheduling problems (Hochbaum and Shamir 1990, 1991). They are often difficult to handle, since even specifying a full schedule normally takes Ω(mn) time (one start and end time for each job on each machine). In the case of m = 2 machines and the makespan objective, Hertrich (2018) shows that the O(n) algorithm by Ahmadi et al. (1992) can be adjusted to run in polynomial time, even in the high-multiplicity sense. For all other cases, the theoretical complexity of the two problems in the high-multiplicity sense remains open. However, note that in practical applications jobs usually have additional meaning attached (for instance patient identifiers, in our pharmacy example from the beginning) and thus such instances are usually encoded with a length of at least n.
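The gap between the concise encoding and the number of jobs can be made concrete with a small calculation. The sketch below is only an illustration: it counts the bits needed to write n in binary and ignores the encoding of the machine data, and both function names are hypothetical helpers.

```python
def concise_input_bits(n: int) -> int:
    """Bits needed to write the number of jobs n in binary."""
    return max(1, n.bit_length())


def steps_per_input_bit(n: int) -> float:
    """Ratio between n elementary steps (e.g. writing one start time per
    job) and the concise input length in bits."""
    return n / concise_input_bits(n)


for n in (10, 1_000, 1_000_000):
    print(n, concise_input_bits(n), round(steps_per_input_bit(n), 1))
```

The ratio grows without bound, which is why even a linear-time algorithm is exponential in the length of a high-multiplicity encoding.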

Conclusions and Future Work
In this paper, motivated by an application in modern pharmaceutical production, we investigated the complexity of scheduling a proportionate flow shop of batching machines (PFB). The focus was to provide the first complexity results and polynomial time algorithms for PFBs with m > 2 machines.
Our main result is the construction of a new algorithm, using a dynamic programming approach, which schedules a PFB with release dates to minimize the makespan or total completion time in polynomial time for any fixed number m of machines. In addition, we showed that, if all release dates are equal, the constructed algorithm can also be used to minimize the weighted total completion time, the maximum lateness, the total tardiness, and the (weighted) number of late jobs. Hence, for a PFB without release dates and with a fixed number of m machines, these objective functions can be minimized in polynomial time. Previously these complexity results were only known for the special case of m = 2 machines, while for each m ≥ 3 the complexity status was unknown. For minimizing weighted total completion time and weighted number of late jobs, these results were even unknown in the case of m = 2.
An important structural result in this paper is that permutation schedules are optimal for PFBs with release dates and the makespan or total completion time objective, as well as for any PFB problem without release dates. This result was needed in order to show that the constructed algorithm can be correctly applied to the problems named above.
Concerning PFBs with release dates, recall that in the presence of release dates permutation schedules are not necessarily optimal for traditional scheduling objective functions other than makespan and total completion time (see Example 4). However, there are several special cases for which permutation schedules remain optimal, in particular if the order of release dates corresponds well to the order of due dates (in the case of due date objectives) and/or weights (in the case of weighted objectives). For example, Hertrich (2018) shows that, when minimizing the weighted total tardiness, if there exists a job order σ that is an earliest release date order, an earliest due date order, and a non-increasing weight order at the same time, then there exists an optimal schedule which is a permutation schedule with jobs ordered by σ. This, of course, implies analogous results for weighted total completion time, total tardiness and maximum lateness (again, see Hertrich (2018)). In those special cases, clearly the algorithm presented in this paper can be used to solve the problems in polynomial time. Still, the complexity status of PFBs with release dates in the general case, where orderings do not correspond well with each other, remains open for all traditional scheduling objectives other than makespan and total completion time.
Another problem left open for future study is the complexity of minimizing the weighted total tardiness in the case where all release dates are equal. This problem is open, even for the special case of m = 2. In this paper, it is shown that permutation schedules are optimal for all objective functions in the absence of release dates. Furthermore, if an optimal job order is known and the number of machines is fixed, then an optimal schedule can be found in polynomial time, using the algorithm presented in this paper.
Unfortunately, the complexity of finding an optimal job order to minimize the weighted total tardiness in a PFB remains open. Note that for scheduling a single batching machine with equal processing times and for usual proportionate flow shop without batching the weighted total tardiness can be minimized in polynomial time (Brucker and Knust (2009)). Here, proportionate flow shop can be reduced to the single machine problem with all processing times equal to the processing time of the bottleneck machine (see, e.g., Sung et al. (2000)). However, such a reduction is not possible for PFBs, since there may be several local bottleneck machines which influence the quality of a solution.
Observe also that, in the special case of m = 2 machines, given an arbitrary schedule for the first machine, finding an optimal schedule for the second machine is similar to solving problem $1 \mid p_j = p, r_j \mid \sum w_j T_j$ or the analogous problem on a batching machine, both of which also have open complexity status (see Baptiste (2000); Brucker and Knust (2009)). So attempting to split up the problem in such a manner does not work either. Of course, if instead of an arbitrary schedule on the first machine, some special schedule is selected, it may be possible that the special structure for the release dates on the second machine helps to solve the problem. Notice, though, that this still leaves open the question of which schedule to select on the first machine.
The last open complexity question concerns the case where the number of machines is no longer fixed, but instead part of the input. We are unfortunately not aware of any promising approaches to find reductions from NP-hard problems. However, interestingly, Hertrich (2018) proves that for all versions of PFBs, minimizing the total completion time is always at least as hard as minimizing the makespan. This reduction does not hold for scheduling problems in general.
One noteworthy special case, where positive complexity results may be more readily achievable, arises from fixing the number of jobs n, leaving only the number of machines m as part of the input. Note that in this case, high multiplicity, as described in Section 4, is not a concern: since n is fixed, a schedule can be output in O(m) time, and since each machine has an individual processing time, the input also has size Ω(m). For PFBs with exactly two jobs, n = 2, and an arbitrary number of machines, Hertrich (2018) shows that the makespan can be minimized in $O(m^2)$ time. A machine-wise dynamic program is provided that finds all schedules in which the completion time of one job cannot be improved without delaying the other job. In other words, the program constructs all Pareto-optimal schedules, if the completion time of each job is considered as a separate objective. If the number of jobs n is larger than 2, then the dynamic program can still be applied, but its runtime in that case is pseudo-polynomial (see Hertrich (2018)).
Finally, note that while the algorithm provided in this paper has the benefit of being very general, thus being applicable to many different problems, it incurs some practical limitations due to its relatively large running time. For example, for m = 2, our algorithm to minimize the makespan has a runtime of $O(n^{15})$. In contrast, the algorithm specialized to the case of m = 2 by Sung and Yoon (1997) runs in $O(n^2)$. Now that the open complexity question is solved, for future research it would be interesting to find more practically efficient algorithms, in particular for small numbers of machines like m = 3 or m = 4.
If this is not possible, then instead approximations or heuristics need to be considered in order to solve PFBs in practice. In the literature, approximations and heuristics often use restrictions to permutation schedules in order to simplify difficult scheduling problems. To this end, it would be interesting to quantify the quality of permutation schedules (e.g. in terms of an approximation factor) for PFB problems where permutation schedules are not optimal.
Another approach to improving the algorithms' efficiency, especially if negative complexity results for the case with an arbitrary number of machines are achieved, would be to consider PFBs from a parameterized complexity point of view. Recently, parameterized complexity has started to receive more attention in the scheduling community (cf. Mnich and Wiese (2015); Mnich and van Bevern (2018)). A problem with input size n is said to be fixed-parameter tractable with respect to a parameter k if it can be solved in a running time of $O(f(k)\,p(n))$, where f is an arbitrary computable function and p is a polynomial not depending on k. Although our algorithm is polynomial for any fixed number m of machines, it is not fixed-parameter tractable in m because m appears in the exponent of the running time. Clearly, a fixed-parameter tractable algorithm would be preferable, both in theory and, most likely, in practice, if one could be found.

Appendix A: Proof of Theorem 16 (ii)
In this appendix we give a detailed proof of Theorem 16 (ii). Analogously to the proof of Theorem 16 (i), let S be an optimal permutation schedule with respect to maximum lateness or total tardiness. Assume that jobs are indexed according to their ordering in S. Let σ be an earliest due date ordering, i.e. d σ(1) ≤ d σ(2) ≤ · · · ≤ d σ(n) . Suppose there is at least one pair of jobs J j1 , J j2 with j 1 < j 2 and σ(j 1 ) > σ(j 2 ). Swapping J j1 and J j2 in S yields a new permutation schedule S ′ .
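We first show that $L_{\max}(S') \leq L_{\max}(S)$. Because the lateness of all jobs other than $J_{j_1}$ and $J_{j_2}$ remains unchanged when performing the swap, it suffices to show that $\max\{L_{j_1}(S'), L_{j_2}(S')\} \leq \max\{L_{j_1}(S), L_{j_2}(S)\}$, where $L_j(S') = C_j(S') - d_j$. However, due to $j_1 < j_2$ and $\sigma(j_1) > \sigma(j_2)$, we have $C_{j_1}(S) \leq C_{j_2}(S)$ and $d_{j_1} \geq d_{j_2}$. Moreover, the swap exchanges the two completion times, i.e. $C_{j_1}(S') = C_{j_2}(S)$ and $C_{j_2}(S') = C_{j_1}(S)$. Hence, it follows that

$$\max\{L_{j_1}(S'), L_{j_2}(S')\} = \max\{C_{j_2}(S) - d_{j_1},\; C_{j_1}(S) - d_{j_2}\} \leq C_{j_2}(S) - d_{j_2} = L_{j_2}(S) \leq \max\{L_{j_1}(S), L_{j_2}(S)\},$$

which finishes the proof for maximum lateness.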
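Finally, we show that $\sum T_j(S') \leq \sum T_j(S)$. Because the tardiness of all jobs other than $J_{j_1}$ and $J_{j_2}$ remains unchanged when performing the swap, it suffices to show that

$$T_{j_1}(S') + T_{j_2}(S') \leq T_{j_1}(S) + T_{j_2}(S).$$

We distinguish three cases. Firstly, suppose $J_{j_1}$ and $J_{j_2}$ are both late in S′, i.e. $C_{j_1}(S') > d_{j_1}$ and $C_{j_2}(S') > d_{j_2}$. Then it follows that

$$T_{j_1}(S') + T_{j_2}(S') = \bigl(C_{j_2}(S) - d_{j_1}\bigr) + \bigl(C_{j_1}(S) - d_{j_2}\bigr) = \bigl(C_{j_1}(S) - d_{j_1}\bigr) + \bigl(C_{j_2}(S) - d_{j_2}\bigr) \leq T_{j_1}(S) + T_{j_2}(S).$$

Secondly, suppose $J_{j_1}$ is on time in S′, i.e. $C_{j_1}(S') \leq d_{j_1}$. Using $C_{j_1}(S) \leq C_{j_2}(S)$, we obtain

$$T_{j_1}(S') + T_{j_2}(S') = \max\{0,\; C_{j_1}(S) - d_{j_2}\} \leq \max\{0,\; C_{j_2}(S) - d_{j_2}\} = T_{j_2}(S) \leq T_{j_1}(S) + T_{j_2}(S).$$

Thirdly, suppose $J_{j_2}$ is on time in S′, i.e. $C_{j_2}(S') \leq d_{j_2}$. Using $d_{j_1} \geq d_{j_2}$, we obtain

$$T_{j_1}(S') + T_{j_2}(S') = \max\{0,\; C_{j_2}(S) - d_{j_1}\} \leq \max\{0,\; C_{j_2}(S) - d_{j_2}\} = T_{j_2}(S) \leq T_{j_1}(S) + T_{j_2}(S).$$

Hence, performing the described swap increases neither the maximum lateness nor the total tardiness. Therefore, as in (i), we can perform such exchanges sequentially in order to obtain an optimal permutation schedule S* with jobs scheduled in earliest due date order.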
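The statement proven above can be sanity-checked by brute force on small instances. As a simplifying assumption, the sketch below fixes the completion times on the last machine as a non-decreasing list `slots` and varies only which job occupies which slot; the function names are hypothetical helpers.

```python
from itertools import permutations


def max_lateness(order, due, slots):
    """Maximum lateness if the job at position q of `order` completes at slots[q]."""
    return max(slots[q] - due[j] for q, j in enumerate(order))


def total_tardiness(order, due, slots):
    """Total tardiness under the same assignment of jobs to slots."""
    return sum(max(0, slots[q] - due[j]) for q, j in enumerate(order))


def edd_order(due):
    """Job indices in earliest due date order (ties in index order)."""
    return sorted(range(len(due)), key=lambda j: due[j])


due = [7, 3, 9, 4]
slots = [2, 2, 6, 10]  # assumed completion times, non-decreasing
orders = list(permutations(range(len(due))))
edd = edd_order(due)
print(max_lateness(edd, due, slots) == min(max_lateness(p, due, slots) for p in orders))
print(total_tardiness(edd, due, slots) == min(total_tardiness(p, due, slots) for p in orders))
```

Both comparisons enumerate all job orders and are therefore only practical for very small n.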

Appendix B: Proof of Theorem 18
Suppose that jobs are indexed in an earliest due date order, i.e. d 1 ≤ d 2 ≤ ... ≤ d n . As in Section 3, the modified dynamic program tries to schedule these jobs one by one, but it is also allowed to leave jobs out that will be late anyway, which then will be scheduled in the end after all on-time jobs. This procedure is justified by the following lemma.
Lemma 19 Consider a PFB instance I without release dates to minimize the weighted number of late jobs. Suppose jobs are indexed by an earliest due date ordering. Then there exists an optimal permutation schedule in which the on-time jobs are scheduled in the order of their indices.
Proof By Theorem 15, there exists an optimal permutation schedule S. Let J be the set of on-time jobs in S. Let I J be the modified PFB instance that contains only the jobs in J . By ignoring all jobs outside of J , schedule S defines a feasible permutation schedule S J for I J that has maximum lateness 0. By Theorem 16 (ii), there exists also a permutation schedule S ′ J for I J in which jobs are scheduled in the order of their indices and that has maximum lateness 0. By scheduling the remaining jobs arbitrarily after the on-time jobs, S ′ J can be extended to a schedule S ′ for I which has the same weighted number of late jobs as S and in which all on-time jobs are scheduled in the order of their indices.

□
Recall that for a PFB instance I and a job index j ∈ [n], I j denotes the modified instance containing only jobs J 1 , . . . , J j . The variables of the new dynamic program are defined almost as in Section 3. However, in contrast, we say that a permutation schedule S for instance I j corresponds to vectors $t = (t_1, t_2, \dots, t_m)$ and $k = (k_1, k_2, \dots, k_m)$ if the last on-time job in S finishes machine M i at time t i and is processed in a batch with exactly k i on-time jobs there for all i ∈ [m]. If there are no on-time jobs, then S cannot correspond to any vectors t and k. Note that we do not require that J j itself is on time, just that there is some on-time job. The differences to Section 3 are, first, that we do not assume that jobs are ordered by their indices, and second, that we ignore all late jobs in this definition. Similar to Section 3, we define S(j, t, k) to be the set of feasible permutation schedules S for instance I j satisfying the following properties:
1. the on-time jobs in S are ordered by their indices,
2. S corresponds to t and k, and
3. S is Γ-active.
The variables of our new dynamic program are defined as follows. Let w(j, t, k) = w(j, t 1 , t 2 , . . . , t m , k 1 , k 2 , . . . , k m ) be the minimum weighted number of late jobs of a schedule S ∈ S(j, t, k). If no such schedule exists, the value of w is defined to be +∞.
Our next goal is to establish analogous results to Lemmas 10 and 11. For this purpose we define some auxiliary variables.
For any j ∈ [n], t i ∈ Γ i , and The values of x can be interpreted as the objective value of a schedule in S(j, t, k) for I j that only schedules J j on time and all other jobs late. Analogous to Lemma 10, we obtain the following lemma.
Proof As in the proof of Lemma 10, conditions (i)-(iii) are necessary for the existence of a schedule in S(1, t, k). Also, since a schedule for I 1 can only correspond to t and k if J 1 is on time, (iv) is necessary as well. Conversely, if (i)-(iv) are satisfied, then the schedule defined by c i1 = t i is a schedule in S(1, t, k) with objective value 0.
□

For the actual recursion, we need a second type of auxiliary variables. For j > 1, t i ∈ Γ i , and where the minimum over the empty set is defined to be +∞. Here, ( * * ) is given by conditions . The values of y can be interpreted as the best possible objective value of a schedule in S(j, t, k) that schedules J j and at least one other job on time.
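Analogous to Lemma 11, we obtain the following lemma.

Lemma 21 For j > 1, $t_i \in \Gamma_i$, and $k_i \in [b_i]$, $i \in [m]$, we have

$$w(j, t, k) = \min\{\, w(j-1, t, k) + w_j,\; x(j, t, k),\; y(j, t, k) \,\}.$$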
Roughly speaking, for scheduling J j there are three possibilities: either job J j is scheduled as the only on-time job, as one of several on-time jobs, or as a late job. In the first two cases, w(j, t, k) equals x(j, t, k) or y(j, t, k), respectively, while in the third it equals w(j −1, t, k)+w j , as we will see in the proof.

Proof of Lemma 21:
We first prove the "≥" direction. If the left-hand side equals +∞, then this direction follows immediately. Otherwise, by definition of w, there exists a schedule S for I j in S(j, t, k) with objective value w(j, t, k). Schedule S naturally defines a feasible permutation schedule S ′ for instance I j−1 by ignoring job J j . We distinguish three cases.
Firstly, if J j is not on time in S, then S ′ is in S(j −1, t, k) and has an objective value of w(j, t, k)−w j because J j is not contained in instance I j−1 . Therefore, by definition of w(j − 1, t, k), it follows that w(j − 1, t, k) ≤ w(j, t, k)−w j , which implies the "≥" direction in the first case.
Secondly, if J j is the only on-time job in S, then (i) must hold because S corresponds to k and only on-time jobs are counted in this definition. Furthermore, (ii) and (iii) must hold because S is feasible and corresponds to t. Finally, (iv) holds because J j is on time. Hence, we obtain $x(j, t, k) = \sum_{j'=1}^{j-1} w_{j'} = w(j, t, k)$ in this case, which implies the "≥" direction in the second case.
Thirdly, if J j is one of at least two on-time jobs in S, then S ′ corresponds to unique vectors t ′ and k ′ . Hence, we obtain S ′ ∈ S(j − 1, t ′ , k ′ ) and, since S ′ has the same objective value as S, therefore

$$w(j-1, t', k') \leq w(j, t, k).$$

Analogous arguments to the proof of Lemma 11 show that t ′ and k ′ satisfy ( * * ). Moreover, as in the second case, (ii)-(iv) hold because S is feasible, corresponds to t, and J j is on time. Therefore, we have w(j, t, k) ≥ w(j − 1, t ′ , k ′ ) ≥ y(j, t, k), which completes the "≥" direction.

Now we prove the "≤" direction. If the right-hand side equals +∞, then this direction follows immediately. Otherwise, we again distinguish three cases.
Firstly, suppose the minimum on the right-hand side is attained by the term w(j − 1, t, k) + w j . Since this must be finite, there is a schedule S ′ ∈ S(j − 1, t, k) for instance I j−1 with objective value w(j − 1, t, k). By scheduling J j as a late job, S ′ can be extended to a schedule S that has objective value w(j − 1, t, k) + w j . Moreover, since J j is late, S still corresponds to t and k, i.e. S ∈ S(j, t, k), which implies w(j, t, k) ≤ w(j − 1, t, k) + w j . Hence, the "≤" direction follows in this case.
Secondly, suppose the minimum on the right-hand side is attained by the term x(j, t, k). Since this must be finite, we obtain that (i)-(iv) hold. Hence, the schedule S which schedules J j such that c ij = t i and all other jobs late is feasible for I j , an element of S(j, t, k), and has objective value $x(j, t, k) = \sum_{j'=1}^{j-1} w_{j'}$. This implies w(j, t, k) ≤ x(j, t, k). Hence, the "≤" direction follows in this case.
Thirdly, suppose the minimum on the right-hand side is attained by the term y(j, t, k). Since this must be finite, we obtain that (ii)-(iv) hold. Let t ′ and k ′ be the minimizers in the definition of y(j, t, k). Again due to finiteness, there must be a schedule S ′ ∈ S(j − 1, t ′ , k ′ ) for I j−1 with objective value w(j − 1, t ′ , k ′ ) = y(j, t, k). By ignoring late jobs in S ′ (which can be scheduled arbitrarily late), we can use the same arguments as in the proof of Lemma 11 to extend S ′ to a schedule S ∈ S(j, t, k) for I j . Due to (iv), J j is on time in S. Hence, S has the same objective value as S ′ , namely y(j, t, k). Thus, it follows that w(j, t, k) ≤ y(j, t, k), which completes the "≤" direction.
□

Analogous to Theorem 12, we may use this recursion to prove the desired theorem.
Theorem 18 For a PFB without release dates and a fixed number m of machines, the weighted number of late jobs can be minimized in $O(n^{m^2+5m+1})$ time.
Proof Using Lemmas 20 and 21, one can calculate the values of w for all $j \in [n]$ and all $t_i \in \Gamma_i$, $k_i \in [b_i]$, $i \in [m]$. As in Theorem 12, for a fixed j, there are at most $n^{\frac{m^2}{2}+\frac{5m}{2}}$ such values. The most costly step is the calculation of y(j, t, k), which takes a minimum over, again, at most $n^{\frac{m^2}{2}+\frac{5m}{2}}$ values. Hence, the total time to compute all values of w is bounded by $O(n^{m^2+5m+1})$.
If there is a feasible schedule where at least one job is on time, we obtain due to Lemmas 9 and 19 that the minimum weighted number of late jobs is exactly the minimum of all values of w for j = n. In this case an optimal schedule may be restored by backtracking. If, however, in every feasible schedule all jobs are late, then all values of w will be +∞. In this case, any feasible schedule is optimal, with objective value $\sum_{j=1}^{n} w_j$.
□