Approximate and robust bounded job start scheduling for Royal Mail delivery offices

Motivated by mail delivery scheduling problems arising in Royal Mail, we study a generalization of the fundamental makespan scheduling problem $P||C_{\max}$, which we call the bounded job start scheduling problem. Given a set of jobs, each specified by an integer processing time $p_j$, that have to be executed non-preemptively by a set of $m$ parallel identical machines, the objective is to compute a minimum makespan schedule subject to an upper bound $g \le m$ on the number of jobs that may simultaneously begin per unit of time. With perfect input knowledge, we show that the Longest Processing Time First (LPT) algorithm is tightly 2-approximate.
After proving that the problem is strongly $\mathcal{NP}$-hard even when $g = 1$, we elaborate on improving the 2-approximation ratio for this case. We distinguish the classes of long and short instances satisfying $p_j \ge m$ and $p_j < m$, respectively, for each job $j$. We show that LPT is 5/3-approximate for the former and optimal for the latter. Then, we explore the idea of scheduling long jobs in parallel with short jobs to obtain tightly satisfied packing and bounded job start constraints. For a broad family of instances excluding degenerate instances with many very long jobs, we derive a 1.985-approximation ratio. For general instances, we require machine augmentation to obtain better than 2-approximate schedules.
In the presence of uncertain job processing times, we exploit machine augmentation and lexicographic optimization, which is useful for $P||C_{\max}$ under uncertainty, to propose a two-stage robust optimization approach for bounded job start scheduling under uncertainty aiming at a low number of used machines. Given a collection of schedules of makespan $\le D$, this approach allows distinguishing which are the more robust. We substantiate both the heuristics and our recovery approach numerically using Royal Mail data. We show that for the Royal Mail application, machine augmentation, i.e., short-term van rental, is especially relevant.


Introduction
Royal Mail provides mail collection and delivery services for all UK addresses. With a small van fleet (as of January 2019) of 37,000 vehicles and 90,000 drivers delivering to 27 million locations in the UK, efficient resource allocation is essential to guarantee the business viability. The backbone of the Royal Mail distribution is a three-layer hierarchical network with 6 regional distribution centers serving 38 mail centers. Each mail center receives, processes, and distributes mail for a large geographically defined area via 1250 delivery offices, each serving disjoint sets of neighboring post codes. Mail is collected in mail centers, sorted by region, and forwarded to an appropriate onward mail center, making use of the regional distribution centers for cross-docking purposes. From the onward mail center, it is transferred to the final delivery office destination. This process has to be completed within 12-16 h for first class post and 24-36 h for second class post depending on when the initial collection takes place.
In a delivery office, post is sorted, divided into routes, and delivered to addresses using a combination of small fleet vans and walked trolleys. Allocating delivery itineraries to vans is critical. Each delivery office has a van exit gate which gives an upper bound on the number of vehicles that can depart per unit of time. Thus, we deal with scheduling a set $J$ of jobs (delivery itineraries), each associated with an integer processing time $p_j$, on $m$ parallel identical machines (vehicles), s.t. the makespan, i.e., the last job completion time, is minimized. Parameter $g$ imposes an upper bound on the number of jobs that may simultaneously begin per unit of time. Each job has to be executed non-preemptively, i.e., by a single machine in a continuous time interval without interruptions. We refer to this problem as the bounded job start scheduling problem (BJSP).
Our contribution is twofold: First, we derive greedy constant-factor approximation algorithms, i.e., simple heuristics adoptable by Royal Mail practitioners, for effectively solving BJSP instances with perfect knowledge. Second, we propose a two-stage robust optimization approach, based on Royal Mail practices, which evaluates the robustness of BJSP schedules under uncertainty. Using real data, we computationally validate the performance of both the heuristics and the two-stage robust optimization approach. Section 1.1 discusses the relationship between BJSP and the fundamental makespan scheduling problem, a.k.a. $P||C_{\max}$. Section 1.2 presents relevant literature. Section 1.3 summarizes the paper's organization and our contributions.

Relation to $P||C_{\max}$
With perfect knowledge, BJSP is strongly related to $P||C_{\max}$, which is defined similarly but drops the BJSP constraint (Graham 1969). Broadly speaking, BJSP is harder as a generalization of $P||C_{\max}$, and the two problems become equivalent when $g = m$. $P||C_{\max}$ is strongly $\mathcal{NP}$-hard as a straightforward generalization of the 3-Partition problem (Garey and Johnson 1979). The well-known List Scheduling (LS) and Longest Processing Time First (LPT) algorithms achieve tight approximation ratios for $P||C_{\max}$ equal to 2 and 4/3, respectively. Further, $P||C_{\max}$ admits a polynomial-time approximation scheme (PTAS).
Technically, good solutions for both BJSP and $P||C_{\max}$ must attain low imbalance $\max_i\{T - T_i\}$, where $T$ and $T_i$ are the makespan and completion time of machine $i$, respectively. However, BJSP exhibits the additional difficulty of managing and bounding idle machine time during the time interval $[0, \min_i\{T_i\}]$. To this end, we develop an algorithm that schedules long jobs in parallel with short jobs and bounds idle time with a concave relaxation. Figure 1 shows a job set where the minimum makespan schedules with ($T_B$) and without ($T_M$) the bounded job start constraint differ by a factor of 2.
Approximate $P||C_{\max}$ solutions can be converted into feasible solutions for BJSP. On the negative side, $P||C_{\max}$ optimal solutions are a factor $\Omega(m)$ from the BJSP optimum, in the worst case. To see this, take an arbitrary $P||C_{\max}$ instance and construct a BJSP one with $g = 1$, by adding a large number of unit jobs. The BJSP optimal schedule requires time intervals during which $m - 1$ machines are idle at each time, while the $P||C_{\max}$ optimal schedule is perfectly balanced and all machines are busy until the last job completes. On the positive side, we may easily convert any $\rho$-approximation algorithm for $P||C_{\max}$ into a $2\rho$-approximation algorithm for BJSP using naïve bounds. Given that $P||C_{\max}$ admits a PTAS, we obtain an $O(n^{1/\epsilon} \cdot \mathrm{poly}(n))$-time $(2 + \epsilon)$-approximation algorithm for BJSP. Our main goal is to obtain tighter approximation ratios.

Related work
Next, we present related work to design BJSP approximation algorithms and robust optimization approaches, relevant to Royal Mail delivery offices.
Approximation algorithms BJSP relaxes the scheduling problem with forbidden sets, i.e., non-overlapping constraints, where subsets of jobs cannot run in parallel (Schäffter 1997). For the latter problem, better than 2-approximation algorithms are ruled out, unless $\mathcal{P} = \mathcal{NP}$ (Schäffter 1997). Even when there is a strict order between jobs in the same forbidden set, the scheduling with forbidden sets problem is equivalent to the precedence-constrained scheduling problem $P|prec|C_{\max}$ and cannot be approximated by a factor lower than $(2 - \epsilon)$, assuming a variant of the unique games conjecture (Svensson 2011). Also, BJSP relaxes the scheduling with forbidden job start times problem, where no job may begin at certain time points, which does not admit constant-factor approximation algorithms (Billaut and Sourd 2009; Gabay et al. 2016; Mnich and van Bevern 2018; Rapine and Brauner 2013). Despite the commonalities with the aforementioned literature, to the authors' knowledge, there is a lack of approximation algorithms for scheduling problems with bounded job starts.
Robust optimization Royal Mail delivery times may be imprecise. Once a delivery has begun, it might finish earlier or later than its anticipated completion time. Because of uncertain job completion times, Royal Mail vans attempting pre-computed schedules may not be able to complete all deliveries during working hours. To this end, robust optimization provides a useful framework for structuring uncertainty, e.g. box uncertainty sets, and incorporating it in the decision-making process (Ben-Tal et al. 2009; Bertsimas et al. 2011; Goerigk and Schöbel 2016; Kouvelis and Yu 2013). Typically, Royal Mail delivery schedules are computed in two stages: (i) The first stage computes a feasible, efficient schedule for an initial nominal problem instance before the day begins, and (ii) the second stage recovers the initial schedule by accounting for uncertainty during the day. This setting is naturally captured by two-stage robust optimization with recovery (Ben-Tal et al. 2004; Bertsimas and Caramanis 2010; Hanasusanto et al. 2015; Liebchen et al. 2009).
Fig. 1 BJSP instance with $m$ machines, $k$ jobs of processing time $i$, for each $i = 1, \ldots, m$, and $g = 1$. For an odd number $k = o(m)$, $T_B \approx 2T_M$
A common way of measuring solution robustness of a given discrete optimization problem instance is by comparing the final solution objective value after uncertainty realization with the solution objective value that we could have achieved if we had a crystal ball that accurately predicts the future. From this perspective, we recently proposed a two-stage robust scheduling approach for $P||C_{\max}$, based on lexicographic optimization (Letsios et al. 2021). If processing times are perturbed by a $(1 + \epsilon)$ factor, the lexicographic optimization approach yields schedules a factor $(2 + O(\epsilon))$ far from achievable optima with perfect knowledge. But this common way of measuring solution robustness is less applicable for our application because the makespan, i.e., the number of working hours in the day, is fixed. So, in the Royal Mail context, certain decisions can be irrevocable during the recovery process. For Royal Mail delivery offices, job start times can be irrevocable during the day, to reduce delivery delays and overtimes. Further, resource augmentation (or constraint violation), e.g. backup machines, might be essential to ensure the resulting solutions' feasibility (Ben-Tal and Nemirovski 2000; Kalyanasundaram and Pruhs 2000). For this application, we measure a solution's robustness with the level of resource augmentation, e.g. number of short-term rental vans, required after uncertainty realization. Robust bounded job start scheduling with resource augmentation is an intriguing open question.

Paper organization and contributions
Section 2 formally defines BJSP, proves the problem's $\mathcal{NP}$-hardness, and derives an $\Omega(\log n)$ integrality gap for a natural integer programming formulation. Section 3 investigates the Longest Processing Time First (LPT) algorithm and derives a tight 2-approximation ratio. We thereafter explore improving this ratio for the special case $g = 1$. Section 2 shows that BJSP is strongly $\mathcal{NP}$-hard even when $g = 1$. Several of our arguments can be extended to arbitrary $g$, but focusing on $g = 1$ avoids many floors and ceilings and simplifies our presentation. Furthermore, any Royal Mail instance can be converted to $g = 1$ using small discretization.
Section 4 distinguishes long versus short instances. An instance $\langle m, J \rangle$ is long if $p_j \ge m$ for each $j \in J$ and short if $p_j < m$ for all $j \in J$. This distinction comes from the observation that idle time occurs mainly because of (i) simultaneous job completions for long jobs and (ii) limited allowable parallel job executions for short jobs. Section 4 proves that LPT is 5/3-approximate for long instances and optimal for short instances. A key ingredient for establishing the ratio in the case of long instances is a concave relaxation for bounding idle machine time, before the last job start. Section 4 also obtains an improved approximation ratio for long instances, when the maximum job processing time is relatively small, using the Shortest Processing Time First (SPT) algorithm. For long instances, our analysis shows that LPT and SPT achieve low idle machine time after and before, respectively, the last job begins. We leave determining the best trade-off between the two in order to obtain a better approximation ratio as an open question.
Greedy scheduling, e.g. LPT and SPT, which sequences long jobs first and short jobs next, or vice versa, cannot achieve an approximation ratio better than 2. Section 5 proposes Long-Short Mixing (LSM), which devotes a certain number of machines to long jobs and uses all remaining machines for short jobs. By executing the two job types in parallel, LSM reduces the idle time before the last job begins and achieves a 1.985-approximation ratio for a broad family of instances. Carefully bounding idle time before the last job start by accounting for the parallel execution of long jobs with short job starts is the main technical difficulty behind our analysis. For degenerate instances with many very long jobs, we require constant-factor machine augmentation, i.e., $fm$ machines where $f > 1$ is constant, to achieve a strictly lower than 2-approximation ratio.
Because Royal Mail delivery scheduling is subject to uncertainty, Section 6 exploits machine augmentation and lexicographic optimization for $P||C_{\max}$ under uncertainty (Letsios et al. 2021; Skutella and Verschae 2016) to construct a two-stage robust optimization approach for the BJSP under uncertainty. We measure robustness based on the resource augmentation required for the final solution feasibility. Our approach distinguishes which among different solutions is more robust. Section 7 substantiates our algorithms and robust optimization approach empirically using Royal Mail data. Section 8 concludes with a collection of intriguing future directions.

Problem definition and preliminary results
An instance $I = \langle m, J \rangle$ of the Bounded Job Start Scheduling Problem (BJSP) is specified by a set $M = \{1, \ldots, m\}$ of parallel identical machines and a set $J = \{1, \ldots, n\}$ of jobs. A machine may execute at most one job per unit of time. Job $j \in J$ is associated with an integer processing time $p_j$. Each job should be executed non-preemptively, i.e., in a continuous time interval without interruptions, by a single machine. BJSP parameter $g$ imposes an upper bound on the number of jobs that may begin per unit of time. The goal is to assign each job $j \in J$ to a machine and decide its starting time so that this BJSP constraint is not violated and the makespan, i.e., the time at which the last job completes, is minimized. Consider a feasible schedule $S$ with makespan $T$. We denote the start time of job $j$ by $s_j$. Each job $j$ must be entirely executed during the interval $[s_j, C_j)$, where $C_j = s_j + p_j$ is the completion time of $j$. So, $T = \max_{j \in J}\{C_j\}$. Job $j$ is alive at time $t$ if $t \in [s_j, C_j)$. We denote by $A_t$ and $B_t$ the sets of jobs that are alive at $t$ and that begin at $t$, respectively, so a feasible schedule satisfies $|A_t| \le m$ and $|B_t| \le g$ at each time $t$.
BJSP is strongly $\mathcal{NP}$-hard because it becomes equivalent to $P||C_{\max}$ in the special case where $g = \min\{m, n\}$. Theorem 1 shows that BJSP is strongly $\mathcal{NP}$-hard also when $g = 1$.
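The feasibility conditions of this definition can be checked mechanically. The following is a minimal sketch, not part of the paper: the function name `check_schedule` and the convention that job $j$ occupies the half-open interval $[s_j, s_j + p_j)$ with integer start times are assumptions made for illustration.

```python
from collections import Counter


def check_schedule(starts, proc, m, g):
    """Return the makespan of a feasible schedule; raise ValueError otherwise.

    Assumption: job j occupies [starts[j], starts[j] + proc[j]).
    """
    # Bounded job start constraint: at most g jobs begin at any time.
    if any(count > g for count in Counter(starts).values()):
        raise ValueError("more than g jobs begin at the same time")
    # Sweep over start (+1) and completion (-1) events. Tuples sort so that a
    # completion at time t is processed before a start at time t, i.e., a
    # machine released at t can be reused at t.
    events = sorted(
        [(s, 1) for s in starts] + [(s + p, -1) for s, p in zip(starts, proc)]
    )
    alive = 0
    for _, delta in events:
        alive += delta
        if alive > m:
            raise ValueError("more than m jobs are alive at the same time")
    # Makespan: the last job completion time.
    return max(s + p for s, p in zip(starts, proc))
```

For example, three jobs of length 3 started at times 0, 1, 2 on $m = 3$ machines with $g = 1$ give makespan 5, whereas starting two jobs at the same time with $g = 1$ is rejected.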
Theorem 1 BJSP is strongly $\mathcal{NP}$-hard in the special case $g = 1$.
Proof We reduce from the strongly $\mathcal{NP}$-hard 3-Partition problem (Garey and Johnson 1979): given a set $A = \{a_1, \ldots, a_{3m}\}$ of positive integers with $\sum_{j=1}^{3m} a_j = mB$ and $B/4 < a_j < B/2$ for each $j$, decide whether $A$ can be partitioned into subsets $S_1, \ldots, S_m$ such that $\sum_{j \in S_i} a_j = B$ for each $i \in \{1, \ldots, m\}$. Given an instance of 3-Partition, construct a BJSP instance $I = \langle m, J \rangle$ with $n = 3m$ jobs of processing time $p_j = n^2 a_j$, for $j \in \{1, \ldots, 3m\}$, and BJSP parameter $g = 1$. W.l.o.g., $n^2 > 3n$. We show that $A$ admits a 3-Partition iff there exists a feasible schedule $S$ of makespan $T < n^2 B + n^2$ for $I$.
$\Rightarrow$: Suppose that $A$ admits a 3-Partition $S_1, \ldots, S_m$. Because $B/4 < a_j < B/2$, for $j \in \{1, \ldots, 3m\}$, $S_i$ contains exactly three elements, i.e., $|S_i| = 3$, for each $i \in \{1, \ldots, m\}$. We fix some arbitrary order $1, \ldots, 3m$ of all jobs and construct a schedule $S$ for $I$ where all jobs in $S_i$ are executed by machine $i \in M$. The job starting times are decided greedily. In particular, let $T_i$ be the last job completion time in machine $i$ just before assigning job $j$. If no job has been assigned to $i$, then $T_i = 0$. We set $s_j$ equal to the earliest time slot after $T_i$ at which no job begins in any machine, i.e., $\min\{t : |B_t| < 1, t > T_i\}$. Now, let $T_i$ be the last completion time in machine $i$, once the greedy procedure has been completed. Consider any job $j \in S_i$ and let $j' \in S_i$ be the last job executed before $j$ in machine $i$. If no job is executed before $j$, then $s_j \le n$. Otherwise, by construction, we have $s_j \le C_{j'} + n$, because at most $n - 1$ time slots contain the start of another job. Since $|S_i| = 3$, we get $T_i \le \sum_{j \in S_i} p_j + 3n = n^2 B + 3n < n^2 B + n^2$.
$\Leftarrow$: Suppose that there exists a feasible schedule $S$ of makespan $T < n^2 B + n^2$ for $I$. We argue that each machine executes exactly three jobs. Suppose for contradiction that machine $i \in M$ executes a subset $S_i$ of jobs with $|S_i| \ge 4$. Denote by $T_i = \max\{C_j : j \in S_i\}$ the last job completion time in machine $i$. Then, $T_i \ge \sum_{j \in S_i} p_j = n^2 \sum_{j \in S_i} a_j$. Because $a_j \in \mathbb{Z}^+$ and $a_j > B/4$, it must be the case that $\sum_{j \in S_i} a_j \ge B + 1$. Hence, $T_i \ge n^2 B + n^2$, contradicting the fact that $T_i \le T$. Thus, schedule $S$ defines a partitioning of the jobs into $m$ subsets $S_1, \ldots, S_m$ s.t. $|S_i| = 3$, for each $i \in M$. We claim that $\sum_{j \in S_i} a_j = B$. Otherwise, there would be a machine $i \in M$ with $\sum_{j \in S_i} a_j \ge B + 1$ and we would obtain a contradiction using similar reasoning to before. We conclude that $A$ admits a 3-Partition.
Next, we investigate the integrality gap of a natural integer programming formulation. To obtain this integer program, we partition the time horizon into a set $D = \{1, \ldots, \tau\}$ of unit-length discrete time slots. Time slot $t \in D$ corresponds to time interval $[t - 1, t)$. We may naïvely choose $\tau = \sum_{j \in J} p_j$, but smaller $\tau$ values are possible using tighter makespan upper bounds. For simplicity, this manuscript assumes discrete time intervals $[s, t] = \{s, s + 1, \ldots, t - 1, t\}$, i.e., of integer length. Interval $[1, \tau]$ is the time horizon. In integer programming Formulation (1), binary variables decide a starting time for each job. Binary variable $x_{j,s}$ is 1 if job $j \in J$ begins at time slot $s \in D$, and 0 otherwise. Continuous variable $T$ corresponds to the makespan. If job $j$ starts at $s$, then it is performed exactly during the time slots $s, s + 1, \ldots, s + p_j - 1$. Hence, job $j$ is alive at time slot $t$ iff it has begun at one among the time slots in the set $A_{j,t} = \{t - p_j + 1, t - p_j + 2, \ldots, t\}$. To complete before the time horizon ends, job $j$ must begin at a time slot in the set $F_j = \{1, 2, \ldots, \tau - p_j + 1\}$. Finally, denote by $J_s = \{j : s \in F_j, j \in J\}$ the eligible subset of jobs at $s$, i.e., the ones that may feasibly begin at time slot $s$ without exceeding the time horizon. Formulation (1) models the BJSP problem.
Expression (1a) minimizes makespan. Constraints (1b) enforce that the makespan is equal to the last job completion time. Constraints (1c) ensure that at most $m$ machines are used at each time slot $t$. Constraints (1d) require that each job $j$ is scheduled. Constraints (1e) express the BJSP constraint.
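A formulation matching the constraint-by-constraint description above can be sketched as follows. This is a reconstruction under assumptions rather than the authors' exact display: in particular, the per-variable form of (1b) with completion slot $s + p_j - 1$ is inferred from the statement that a job starting at $s$ occupies slots $s, \ldots, s + p_j - 1$, and (1f) is taken to be the integrality constraint.

```latex
\begin{align}
\min\quad & T \tag{1a}\\
\text{s.t.}\quad
& T \ge x_{j,s}\,(s + p_j - 1) && j \in J,\ s \in F_j \tag{1b}\\
& \sum_{j \in J}\ \sum_{s \in A_{j,t} \cap F_j} x_{j,s} \le m && t \in D \tag{1c}\\
& \sum_{s \in F_j} x_{j,s} = 1 && j \in J \tag{1d}\\
& \sum_{j \in J_s} x_{j,s} \le g && s \in D \tag{1e}\\
& x_{j,s} \in \{0, 1\} && j \in J,\ s \in F_j \tag{1f}
\end{align}
```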
Theorem 2 shows that the fractional relaxation obtained by replacing Eq. (1f) with the constraints $0 \le x_{j,s} \le 1$, for $j \in J$ and $s \in F_j$, has a non-constant integrality gap. Thus, stronger linear programming (LP) relaxations are required for obtaining constant-factor approximation algorithms with LP rounding.

Theorem 2 The fractional relaxation of integer programming Formulation (1) has integrality gap Ω(log n).
Proof Consider an instance with $m$ machines, $n = m$ jobs of processing time $p_j = 1$ for each $j \in J$, and BJSP parameter $g = m$. For this instance, the LP solution sets $x_{j,s} = 1/(s \cdot \sum_{t=1}^{\tau} \frac{1}{t})$ for each $j, s$. The LP fractional solution is feasible as at each time, no more than $m$ job pieces are feasibly executed (and begin), while the cost is $\max\{s\, x_{j,s}\} = 1/\sum_{t=1}^{\tau} \frac{1}{t}$. On the contrary, the optimal integral solution has makespan 1. With $\tau = \sum_{j \in J} p_j = n$, the harmonic sum is $\Theta(\log n)$, which gives the claimed gap.
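The fractional solution used in this proof can be checked with exact arithmetic. The sketch below is illustrative only; the helper name `harmonic_lp_solution` is hypothetical, and it considers a single unit job, since all jobs receive the same values.

```python
from fractions import Fraction


def harmonic_lp_solution(tau):
    """Return (x, h): x[s] = 1 / (s * h) for s = 1..tau, h the tau-th harmonic number."""
    h = sum(Fraction(1, t) for t in range(1, tau + 1))
    x = {s: 1 / (s * h) for s in range(1, tau + 1)}
    return x, h


x, h = harmonic_lp_solution(4)
# Each job is fully scheduled: the fractions over all start slots sum to 1.
assert sum(x.values()) == 1
# The weighted term s * x[s] is the same for every slot and equals 1 / h,
# so the integral optimum 1 exceeds the fractional cost by a factor of h.
assert all(s * x[s] == 1 / h for s in x)
```

The ratio between the integral makespan 1 and the fractional cost $1/h$ is the harmonic number itself, which grows logarithmically in $\tau$.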

LPT algorithm
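The Longest Processing Time First algorithm (LPT) schedules the jobs on a fixed number $m$ of machines w.r.t. the order $p_1 \ge \ldots \ge p_n$. Jobs are scheduled greedily in this order, each one starting at the earliest time slot at which a machine is idle and fewer than $g$ jobs begin.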
Theorem 3 proves a tight approximation ratio of 2 for LPT.
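A minimal sketch of such a greedy rule is given below, assuming 0-indexed integer times, the half-open occupancy interval $[s_j, s_j + p_j)$, and that each job starts at the earliest time with an idle machine and fewer than $g$ starts. The function name `lpt_bjsp` and the tie-breaking (input order among equal processing times) are illustrative choices, not taken from the paper.

```python
import heapq


def lpt_bjsp(proc, m, g):
    """Greedy LPT sketch for BJSP; returns (starts, makespan)."""
    # Consider jobs in non-increasing processing time order.
    order = sorted(range(len(proc)), key=lambda j: -proc[j])
    free = [0] * m            # min-heap of times at which machines become idle
    starts = [0] * len(proc)
    t_cur, used = 0, 0        # latest start time so far and starts used at it
    for j in order:
        # Earliest time with an idle machine, not before the latest start.
        t = max(heapq.heappop(free), t_cur)
        if t == t_cur and used == g:
            t += 1            # the current time already has g job starts
        if t > t_cur:
            t_cur, used = t, 0
        used += 1
        starts[j] = t
        heapq.heappush(free, t + proc[j])
    makespan = max(s + p for s, p in zip(starts, proc))
    return starts, makespan
```

With $g = 1$, four unit jobs on two machines start at times 0, 1, 2, 3 even though a machine is idle throughout, which illustrates how the bounded job start constraint, rather than machine capacity, can determine the makespan.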

Theorem 3 LPT is 2-approximate for minimizing makespan and this ratio is tight.
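Proof Denote by $S$ and $S^*$ the LPT schedule and a minimum makespan schedule, respectively. Let $\ell$ be the job completing last in $S$, i.e., $T = s_\ell + p_\ell$. For each time slot $t \le s_\ell$, either $|A_t| = m$ or $|B_t| = g$; otherwise, LPT would have started job $\ell$ earlier. Let $\lambda$ be the number of time slots $t \le s_\ell$ with $|A_t| < m$. Because of the BJSP constraint, exactly $g$ jobs begin per unit of time during these slots, which implies that $\lambda \le \lceil \ell / g \rceil$. Therefore, schedule $S$ has makespan:
$$T = s_\ell + p_\ell \le \frac{1}{m} \sum_{j \in J} p_j + \left\lceil \frac{\ell}{g} \right\rceil + p_\ell.$$
Denote by $s^*_j$ the starting time of job $j$ in $S^*$ and let $\pi_1, \ldots, \pi_n$ be the job indices ordered in non-decreasing schedule $S^*$ starting times, i.e., $s^*_{\pi_1} \le \ldots \le s^*_{\pi_n}$. Because of the BJSP constraint, $s^*_{\pi_j} \ge \lceil j / g \rceil$. In addition, since jobs $1, \ldots, \ell$ all have processing time at least $p_\ell$, some job $\pi_j$ with $j \ge \ell$ satisfies $p_{\pi_j} \ge p_\ell$. Hence, $S^*$ has makespan:
$$T^* \ge \max\left\{ \frac{1}{m} \sum_{j \in J} p_j,\ \left\lceil \frac{\ell}{g} \right\rceil + p_\ell \right\}.$$
We conclude that $T \le 2T^*$.
For the tightness of the analysis, consider the instance of Fig. 2 with $m(m - 1)$ long jobs of processing time $p$, where $p = \omega(m)$ and $m = \omega(1)$, $m(p - m)$ unit jobs, and BJSP parameter $g = 1$. LPT schedules the long jobs into $m - 1$ groups, each one with exactly $m$ jobs. All jobs of a group are executed in parallel for their greatest part. In particular, the $i$-th job of the $k$-th group is executed by machine $i$ starting at time slot $(k - 1)p + i$. All unit jobs are executed sequentially by machine 1 starting at $(m - 1)p + 1$. Observe that $S$ is feasible and has makespan $T = (m - 1)p + m(p - m)$. The optimal solution $S^*$ schedules all jobs in $m$ groups. The $k$-th group contains $(m - 1)$ long jobs and $(p - m + 1)$ unit jobs. Specifically, the $i$-th long job is executed by machine $i$ beginning at $(k - 1)p + i$, while all short jobs are executed consecutively by machine $m$ starting at $(k - 1)p + m$ and completing at $kp$. Schedule $S^*$ is feasible and has makespan $T^* = mp$. Because $\frac{m}{p} \to 0$ and $\frac{1}{m} \to 0$, i.e., both approach zero, $T \to 2T^*$.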

Long and short instances
This section assumes that $g = 1$, but several of the arguments can be extended to arbitrary $g$. From an application viewpoint, any Royal Mail instance can be converted to $g = 1$ using small discretization.

Longest processing time first
We consider two natural classes of BJSP instances for which LPT achieves an approximation ratio better than 2. Instance $\langle m, J \rangle$ is (i) long if $p_j \ge m$ for each $j \in J$ and (ii) short if $p_j < m$ for every $j \in J$. This section proves that LPT is 5/3-approximate for long instances and optimal for short instances. Intuitively, LPT schedules for long instances contain a significant amount of time without job starts, where all machines execute long jobs in parallel. LPT schedules for short instances have no time where all machines simultaneously execute jobs in parallel, because of the BJSP constraint and the fact that all jobs are short. In this case, the number of job starts is significant compared to the overall makespan. Using these observations, we are able to obtain better than 2-approximate schedules for these two classes of instances.
Consider a feasible schedule $S$ and let $r = \max_{j \in J}\{s_j\}$ be the last job start time. We say that $S$ is a compact schedule if it holds that either (i) $|A_t| = m$ or (ii) $|B_t| = 1$, for each $t \in [1, r]$. Lemma 1 shows the existence of an optimal compact schedule and derives a lower bound on the optimal makespan.
Lemma 1 For each instance $I = \langle m, J \rangle$, there exists a feasible compact schedule $S^*$ which is optimal. Let $J^L = \{j : \ldots$
Proof For the first part, among the set of all optimal schedules, pick the schedule $S^*$ lexicographically minimizing the vector of job start times sorted in non-decreasing order. We claim that $S^*$ is compact. Assume that this is not the case. … $+ \sum_{j=1}^{n} p_j$.
Next, we analyze LPT in the case of long instances. Similar to the Lemma 1 proof, we may show that LPT produces a compact schedule $S$. We partition the interval $[1, r]$ into a sequence $P_1, \ldots, P_k$, where $k \le r$, of maximal periods satisfying the following invariant: for each $q \in \{1, \ldots, k\}$, either (i) $|A_t| < m$ for each $t \in P_q$ or (ii) $|A_t| = m$ for each $t \in P_q$. That is, there is no pair of time slots $s, t \in P_q$ such that $|A_s| < m$ and $|A_t| = m$. We call $P_q$ a slack period if $P_q$ satisfies (i), otherwise, $P_q$ is a full period. For a given period $P_q$ of length $\lambda_q$, denote by $\Lambda_q = \sum_{t \in P_q} (m - |A_t|)$ the idle machine time. Note that $\Lambda_q = 0$, for each full period $P_q$. Lemma 2 upper bounds the total idle machine time of slack periods in the LPT schedule $S$, except the very last period $P_k$. When $P_k$ is slack, the length $\lambda_k$ of $P_k$ is upper bounded by Lemma 3.
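The period decomposition can be computed directly from a schedule. The sketch below is illustrative and makes two assumptions not fixed by the text: time slots are 0-indexed and run from 0 to the last job start, and job $j$ is alive in slots $s_j, \ldots, s_j + p_j - 1$. The function name `idle_periods` is hypothetical.

```python
from itertools import groupby


def idle_periods(starts, proc, m):
    """Split slots 0..r (r = last job start) into maximal slack/full periods.

    Returns a list of (kind, length, idle_machine_time) tuples.
    """
    r = max(starts)
    # Number of alive jobs |A_t| in every slot up to the last job start.
    alive = [
        sum(1 for s, p in zip(starts, proc) if s <= t < s + p)
        for t in range(r + 1)
    ]
    periods = []
    # Consecutive slots of the same kind (full or not) form one maximal period.
    for is_full, group in groupby(alive, key=lambda a: a == m):
        group = list(group)
        idle = sum(m - a for a in group)   # Lambda_q, zero for full periods
        periods.append(("full" if is_full else "slack", len(group), idle))
    return periods
```

For three jobs of length 3 started at times 0, 1, 2 on three machines, the first two slots form a slack period with idle machine time $2 + 1 = 3$, followed by a full period of length 1.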
Lemma 2 … (i) $\lambda_q \le m - 1$ and (ii) … for each slack period $P_q$, where $q \in \{1, \ldots, k'\}$. Furthermore, (iii) $\sum_{q=1}^{k'} \Lambda_q \le \frac{nm}{2}$.
Proof For (i), let $P_q = [s, t]$ be a slack time period in $S$ and assume for contradiction that $\lambda_q \ge m$, i.e., $t \ge s + m - 1$. Given that $p_j \ge m$ for each $j \in J$, we have $\{j : \ldots$ contradicting the fact that $P_q$ is a maximal slack period.
For (ii), consider the partitioning $A_u = A^-_u + A^+_u$ for each time slot $u \in P_q = [s, t]$, where $A^-_u$ and $A^+_u$ are the sets of alive jobs at time $u$ completing inside $P_q$ and after $P_q$, i.e., $C_j \in [s, t]$ and $C_j > t$, respectively. Since $\lambda_q \le m - 1$, every job $j$ beginning during $P_q$, i.e., $s_j \in [s, t]$, must complete after $P_q$, i.e., $C_j > t$. We modify schedule $S$ by removing every job $j$ completing inside $P_q$, i.e., $C_j \in P_q$. Clearly, the modified schedule $S'$ has increased idle time $\Lambda'_q$ during $P_q$, i.e., $\Lambda_q \le \Lambda'_q$. Further, no job $j$ with $s_j \in P_q$ is removed.
Lemma 3 Suppose that $P_k$ is a slack period and let $J_k$ be the set of jobs beginning during $P_k$. Then, it holds that $\lambda_k \le \frac{1}{m} \sum_{j \in J_k} p_j$.
Proof Because $P_k$ is a slack period, it must be the case that $|B_u| = 1$, for each $u \in P_k$. Since we consider long instances, $p_j \ge m$ for each $j \in J$. Therefore, $\lambda_k = |J_k| \le \frac{1}{m} \sum_{j \in J_k} p_j$.
Theorem 4 LPT achieves an approximation ratio $\rho \in \left[\frac{3}{2}, \frac{5}{3}\right]$ in the case of long instances.
Proof Denote the LPT and optimal schedules by $S$ and $S^*$, respectively. Let $\ell \in J$ be a job completing last in $S$, i.e., $T = s_\ell + p_\ell$. Recall that LPT sorts jobs s.t. $p_1 \ge \ldots \ge p_n$. W.l.o.g. we assume that $\ell = \arg\min_{j \in J}\{p_j\}$, i.e., $\ell = n$. Indeed, we may discard every job $j > \ell$ and bound the algorithm's performance w.r.t. instance $\tilde{I} = \langle m, J \setminus \{j : j > \ell, j \in J\} \rangle$. Let $\tilde{S}$ and $\tilde{S}^*$ be the LPT and an optimal schedule attaining makespan $\tilde{T}$ and $\tilde{T}^*$, respectively, for instance $\tilde{I}$. Showing that $\tilde{T} \le (5/3)\tilde{T}^*$ is sufficient for our purposes because $T = \tilde{T}$ and $\tilde{T}^* \le T^*$. We distinguish two cases based on whether $p_n > T^*/3$ or $p_n \le T^*/3$.
Case 1 ($p_n > T^*/3$): We claim $T \le (3/2)T^*$. Initially, observe that $n \le 2m$. Otherwise, there would be a machine $i \in M$ executing at least $|S^*_i| \ge 3$ jobs, say $j, j', j'' \in J$, in $S^*$. This machine would have last job completion time $T^*_i \ge p_j + p_{j'} + p_{j''} \ge 3p_n > T^*$, a contradiction. … Then, some machine executes at least two jobs in $S^*$, i.e., $p_n \le T^*/2$. To prove our claim, it suffices to show $s_n \le T^*$, i.e., job $n$ starts before or at $T^*$. Let $c = \max_{1 \le j \le m}\{C_j\}$ be the time when the last among the $m$ biggest jobs completes. If … By Lemma 1, our claim follows.
Case 2 ($p_n \le T^*/3$): In the following, Equalities (3a) hold because job $n$ completes last and by the definition of alive jobs. Inequalities (3b)-(3d) use a simple packing argument with job processing times and machine idle time taking into account: (3b) Lemma 3, (3c) Lemma 2 property $\sum_{q=1}^{k'} \Lambda_q \le \frac{nm}{2}$, and (3d) the bound $T^* \ge \max\{\frac{1}{m} \sum_{j \in J} p_j, n + p_n, 3p_n\}$.
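One chain of inequalities consistent with this description is sketched below. It is a reconstruction under assumptions, not the authors' display (3a)-(3d): it takes $k' = k - 1$ when $P_k$ is slack (and $k' = k$, $J_k = \emptyset$ otherwise), uses that job $n$ starts last so that $s_n = \sum_{q=1}^{k} \lambda_q$, and uses that jobs in $J_k$ are not alive before $P_k$.

```latex
\begin{align*}
T &= s_n + p_n = \sum_{q=1}^{k} \lambda_q + p_n \\
  &\le \frac{1}{m}\Big( \sum_{j \in J \setminus J_k} p_j + \sum_{q=1}^{k'} \Lambda_q \Big)
      + \frac{1}{m} \sum_{j \in J_k} p_j + p_n
      && \text{(packing argument and Lemma 3)} \\
  &\le \frac{1}{m} \sum_{j \in J} p_j + \frac{n}{2} + p_n
      && \text{(Lemma 2, property (iii))} \\
  &= \frac{1}{m} \sum_{j \in J} p_j + \frac{1}{2}(n + p_n) + \frac{1}{2} p_n \\
  &\le T^* + \frac{T^*}{2} + \frac{T^*}{6} = \frac{5}{3}\, T^*
      && \text{(lower bounds on } T^* \text{ and } p_n \le T^*/3\text{)}
\end{align*}
```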
We complement Theorem 4 with a long instance $I = \langle m, J \rangle$ where LPT is 3/2-approximate and leave closing the gap between the two as an open question. Instance $I$ is illustrated in Fig. 4 and contains $m + 1$ jobs, where $p_j = 2m - j$, for $j \in \{1, \ldots, m\}$, and $p_{m+1} = m$. In the LPT schedule $S$, job $j$ is executed at time $s_j = j$, for $j \in \{1, \ldots, m\}$, and $s_{m+1} = 2m - 1$. Hence, $T = 3m - 1$. But an optimal schedule $S^*$ assigns job $j$ to machine $j + 1$ at time $s_j = j + 1$, for $j \in \{1, \ldots, m - 1\}$. Moreover, jobs $m$ and $m + 1$ are assigned to machine 1 beginning at times $s_m = 1$ and $s_{m+1} = m$, respectively. Clearly, $S^*$ has makespan $T^* = 2m$.
Theorem 5 completes this section with a simple argument on the LPT performance for short instances.
Theorem 5 LPT is optimal for short instances.
Proof Let $p_1 \ge \ldots \ge p_n$ be the LPT job ordering and suppose that job $\ell$ completes last in LPT schedule $S$. We claim that job $\ell$ begins at time slot $s_\ell = \lceil \ell / g \rceil$ in $S$. Due to the BJSP constraint, we have $s_\ell \ge \lceil \ell / g \rceil$. Assume that the last inequality is strict. Then, $|A_t| = m$ for some time slot $t < s_\ell$. So, there exists job $j \in A_t$ with $s_j = t - m + 1$. Since $p_j \le m - 1$, we get $C_j < t$ which contradicts $j \in A_t$. Because $T^* \ge \lceil j / g \rceil + p_j$ for each $j \in J$, the theorem follows.
Fig. 4 BJSP long instance with a 3/2 lower bound of LPT

Shortest processing time first
This section investigates the performance of the Long Job Shortest Processing Time First Algorithm (LSPT). LSPT orders the jobs as follows: (i) Each long job precedes every short job, (ii) long jobs are sorted according to Shortest Processing Time First (SPT), and (iii) short jobs are sorted as in LPT. LSPT schedules jobs greedily, in the same vein as LPT, with this new job ordering. For long instances, when the largest processing time $p_{\max}$ is relatively small compared to the average machine load, Theorem 6 shows that LSPT achieves an approximation ratio better than 5/3, i.e., the approximation ratio Theorem 4 obtains for LPT. From a worst-case analysis viewpoint, the main difference between LSPT and LPT is that the former requires significantly lower idle machine time until the last job start, but at the price of a much higher difference between the last job completion times in different machines.
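The three ordering rules can be written down compactly. The following is a small sketch, assuming that a job is long when $p_j \ge m$ and short when $p_j < m$ (the instance-level definition applied per job); the function name `lspt_order` is hypothetical.

```python
def lspt_order(proc, m):
    """Return job indices in LSPT order: long jobs by SPT, then short jobs by LPT."""
    # (i) + (ii): long jobs come first, in non-decreasing processing time order.
    long_jobs = sorted(
        (j for j, p in enumerate(proc) if p >= m), key=lambda j: proc[j]
    )
    # (iii): short jobs follow, in non-increasing processing time order.
    short_jobs = sorted(
        (j for j, p in enumerate(proc) if p < m), key=lambda j: -proc[j]
    )
    return long_jobs + short_jobs
```

For processing times `[2, 7, 5, 1, 9]` and `m = 4`, the long jobs (7, 5, 9) are ordered as 5, 7, 9 and the short jobs (2, 1) as 2, 1, giving the index order `[2, 1, 4, 0, 3]`.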
Proof Let $S$ be the LSPT schedule and suppose that it attains makespan $T$. Moreover, denote by $\ell \in J$ the job completing last, i.e., $C_\ell = T$. Similar to the Lemma 1 proof, $S$ is compact. We distinguish two cases based on whether … In the former case, every job $j \in J^L$ satisfies $s_j \le p_{\max}$. Using similar reasoning to the Theorem 3 proof, we get $T \le p_{\max} + n + p_{\min}$. In the latter case, let $t' = \min\{t : |A_t| < m, t > m\}$. That is, $t'$ is the earliest time slot strictly after time $t = m$ when at least one machine is idle. We claim that $s_j < t'$ for each $j \in J^L$, i.e., every long job begins before $t'$ and there is no idle machine time during $[m, t']$. Assume for contradiction that there is some job $j \in J^L$ such that $s_j > t'$. Since there is an idle machine at $t'$ and long jobs remain to begin after $t'$, at least two jobs $j', j'' \in J^L$ complete simultaneously at $t' - 1$, i.e., $C_{j'} = C_{j''} = t' - 1$. Because of the BJSP constraint $|B_t| \le 1$, it must be the case that $s_{j'} \ne s_{j''}$. W.l.o.g. $s_{j'} < s_{j''}$. By the SPT ordering, we also have that $p_{j'} \le p_{j''}$. Thus, we get the contradiction $C_{j'} < C_{j''}$. Our claim implies that $t' \le \frac{m(m-1)}{2} + \frac{1}{n} \sum_{j \in J} p_j$. Next, consider two subcases based on whether $\ell \in J^L$ or … Obviously, the optimal solution satisfies: … In all cases, $T \le 2T^*$. For long instances, i.e., the case $\ell \in J^L$, LSPT is $(1 + \min\{1, 1/\alpha\})$-approximate.

Parallelizing long and short jobs
This section proposes the Long-Short Job Mixing Algorithm (LSM) to compute 1.985-approximate schedules for a broad family of instances, e.g. with at most $5m/6$ jobs of processing time (i) $p_j > (1 - \epsilon)(\sum_j p_j)$, or (ii) $p_j > (1 - \epsilon)(\max_j \{j + p_j\})$ assuming non-increasing $p_j$'s, for a sufficiently small constant $\epsilon > 0$. For degenerate instances with more than $5m/6$ jobs of processing time $p_j > T^*/2$, where $T^*$ is the optimal objective value, LSM requires constant machine augmentation to achieve an approximation ratio lower than 2. There can be at most $m$ such jobs. In the Royal Mail application, machine augmentation (Davis and Burns 2011; Kalyanasundaram and Pruhs 2000; Phillips et al. 2002) adds more delivery vans. For simplicity, we also assume that $m = \omega(1)$, but the approximation ratio can be adapted for smaller values of $m$. However, we require that $m \ge 7$.

LSM attempts to construct a feasible schedule where long jobs are executed in parallel with short jobs, as depicted in Fig. 5. LSM uses $m_L < m$ machines for executing long jobs. The remaining $m_S = m - m_L$ machines are reserved for short jobs. Carefully selecting $m_L$ allows us to obtain a good trade-off between (i) delaying long jobs and (ii) achieving many short job starts at time slots when many long jobs are executed in parallel. Here, we set $m_L = 5m/6$.

Fig. 5 Structure of schedules produced by LSM. Long jobs are prioritized in the subset $M_L$ of the machines. The subset $M_S$ of remaining machines executes only short jobs

Before formally presenting LSM, we modify the notions of long and short jobs by setting $J_L = \{j : p_j \ge m_L, j \in J\}$ and $J_S = \{j : p_j < m_L, j \in J\}$, respectively. Both $J_L$ and $J_S$ are sorted in non-increasing processing time order. LSM schedules jobs greedily by traversing time slots in increasing order. Let $A^L_t$ be the set of alive long jobs at time slot $t \in D$. For $t = 1, \ldots, \tau$, LSM proceeds as follows: [...] LSM schedules the next short job to start at $t$, else (iii) LSM considers the next time slot. From a complementary viewpoint, LSM partitions the machines $M$ into $M_L = \{i : i \le m_L, i \in M\}$ and $M_S = \{i : i > m_L, i \in M\}$. LSM prioritizes long jobs on machines $M_L$ and assigns only short jobs to machines $M_S$. A job may undergo processing on machine $i \in M_S$ only if all machines in $M_L$ are busy. Algorithm 1 pseudocode describes LSM.
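A minimal sketch of the slot-by-slot rule for $g = 1$ is given below. Several details are assumptions rather than the paper's Algorithm 1: $m_L$ is rounded up to $\lceil 5m/6 \rceil$, a long job is started whenever fewer than $m_L$ long jobs are alive and some machine is free, and otherwise a short job is started whenever fewer than $m$ jobs are alive. The name `lsm_schedule` is hypothetical, and a job starting at slot $s$ is taken to occupy slots $s, \ldots, s + p_j - 1$.

```python
from typing import Dict, List


def lsm_schedule(p: List[int], m: int) -> Dict[int, int]:
    """Return a start slot for every job index, mixing long and short jobs."""
    m_long = (5 * m + 5) // 6  # assumed rounding: ceil(5m/6)
    long_jobs = sorted((j for j in range(len(p)) if p[j] >= m_long), key=lambda j: -p[j])
    short_jobs = sorted((j for j in range(len(p)) if p[j] < m_long), key=lambda j: -p[j])
    start: Dict[int, int] = {}
    t = 0
    while long_jobs or short_jobs:
        t += 1
        alive = [j for j in start if start[j] <= t < start[j] + p[j]]
        alive_long = sum(1 for j in alive if p[j] >= m_long)
        if long_jobs and alive_long < m_long and len(alive) < m:
            start[long_jobs.pop(0)] = t   # assumed rule (i): next long job
        elif short_jobs and len(alive) < m:
            start[short_jobs.pop(0)] = t  # rule (ii): next short job
        # rule (iii): otherwise move on to the next time slot
    return start
```

With `m = 7`, seven jobs of length 10 and two jobs of length 2, the two short jobs start at slots 7 and 9 while six long jobs are running, and the seventh long job waits until slot 11.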
Algorithm 1 Long-Short Mixing (LSM)

Theorem 7 shows that LSM achieves a better approximation ratio than LPT, i.e., strictly lower than 2, for a broad family of instances.
Proof Let $S$ be the LSM schedule and $\ell = \arg\max\{C_j : j \in J\}$ the job completing last. That is, $S$ has makespan $T = C_\ell$. For notational convenience, given a subset $P \subseteq D$ of time slots, denote by $\lambda(P) = |P|$ and $\mu(P) = \sum_{t \in P} |A_t|$ the number of time slots and executed processing load, respectively, during $P$. Furthermore, let $n_L = |J_L|$ and $n_S = |J_S|$ be the number of long and short jobs, respectively. We distinguish two cases based on whether (i) $\ell \in J_S$ or (ii) $\ell \in J_L$.

Case $\ell \in J_S$. We partition time slots $\{1, \ldots, T\}$ into five subsets. Let $r_L = \max_{j \in J_L}\{s_j\}$ and $r_S = \max_{j \in J_S}\{s_j\}$ be the maximum long and short job start time, respectively, in $S$. Since $\ell \in J_S$, it holds that $r_L < r_S$. For each time slot $t \in [1, r_L]$ in $S$, either $|A^L_t| = m_L$ long jobs simultaneously run at $t$, or not. In the latter case, it must be the case that $t = s_j$ for some $j \in J_L$. On the other hand, for each time slot $t \in [r_L + 1, s_\ell]$, either $|A_t| = m$, or $t = s_j$ for some $j \in J_S$. Finally, $[s_\ell, T(S)]$ is exactly the interval during which job $\ell$ is executed. Then, we may define: [...] Clearly, [...] (4)

Next, we upper bound a linear combination of $\lambda(F_L)$, $\lambda(B_S)$, and $\lambda(F_S)$ taking into account the fact that certain short jobs begin during a subset $\hat{B}_S \subseteq F_L \cup F_S$ of time slots. By definition, $\lambda(B_S) \le n_S - \lambda(\hat{B}_S)$. We claim that $\lambda(\hat{B}_S) \ge (m_S/m_L)(\lambda(F_L) + \lambda(F_S))$. For this, consider the time slots $F_L \cup F_S$ as a continuous time period by disregarding intermediate $B_L$ and $B_S$ time slots. Partition this $F_L \cup F_S$ time period into subperiods of equal length $m_L$. Note that no long job begins during $F_L \cup F_S$ and the machines in $M_S$ may only execute short jobs in $S$. Because of the greedy nature of LSM and the fact that $p_j < m_L$ for $j \in J_S$, there are at least $m_S$ short job starts in each subperiod. Hence, our claim is true and we obtain that $\lambda(B_S) \le n_S - (m_S/m_L)(\lambda(F_L) + \lambda(F_S))$, or equivalently: [...] (5)

Subsequently, we upper bound a linear combination of $\lambda(F_L)$, $\lambda(B_L)$, and $\lambda(F_S)$ using a simple packing argument. The part of the LSM schedule for long jobs is exactly the LPT schedule for a long instance with $n_L$ jobs and $m_L$ machines. If $|A^L_{r_L}| < m$, we make the convention that $\mu(B_L)$ does not contain any load of jobs beginning in the maximal slack period completed at time $r_L$. Observe that $\mu(F_L) = m_L \lambda(F_L)$ and $\mu(F_S) = m \lambda(F_S)$. Additionally, by Lemma 2, we get that $\mu(B_L) \ge m_L \lambda(B_L)/2$, except possibly the very last slack period. Then, $\mu(F_L) + \mu(B_L) + \mu(F_S) \le \sum_{j \in J} p_j$. Hence, we obtain: [...] (6) Summing Expressions (5) and (6), [...] Because $m = m_L + m_S$, if we divide by $m$, the last expression gives: [...] (7)

We distinguish two subcases based on whether $\lambda(F_S) + \lambda(F_L) \ge 5n_S/6$ or not. Obviously, $\lambda(B_L) \le n_L$. In the former subcase, Inequality (5) gives $\lambda(B_S) \le (1 - \frac{5m_S}{6m_L}) n_S$. Using Inequality (7), Expression (4) becomes: [...] For $m_L = 5m/6$, we have (i [...] Note that an optimal solution $S^*$ has makespan: [...] Because the instance is mixed with long and short jobs and we consider the case $\ell \in J_S$, we have $p_\ell \ge T^*/2$. Therefore, [...]

Case $\ell \in J_L$. [...] $A^L_t$ and $B^L_t$ are the sets of long jobs which are alive and begin, respectively, at time slot $t$. Furthermore, $r_L = \max\{s_j : j \in J_L\}$ is the last long job starting time. Because LSM greedily uses $m_L$ machines for long jobs, either [...] Then, Lemma 2 implies that $\mu(B_L) \ge n_L m_L/2$. Furthermore, $\lambda(B_L) \le n_L$. Therefore, by considering Lemma 3, we obtain: [...] In the case $p_\ell \le T^*/2$, since $T^* \ge n_L + p_\ell$, we obtain an approximation ratio of $(\frac{m}{m_L} + \frac{3}{4}) \le 1.95$, when $m_L = 5m/6$, given that $m = \omega(1)$.
Next, consider the case $p_\ell > T^*/2$. Let $J_V = \{j : p_j > T^*/2\}$ be the set of very long jobs and $n_V = |J_V|$. Clearly, $n_V \le m$. By using resource augmentation, i.e., allowing LSM to use $m' = 6m/5$ machines, we guarantee that LSM assigns at most one job $j \in J_V$ to each machine. The theorem follows.
Remark If $5m/6 < |J_V| \le m$, LSM does not achieve an approximation ratio better than 2, e.g. as illustrated by an instance consisting of only $J_V$ jobs. Assigning two such jobs to the same machine is pathological. Thus, better than 2-approximate schedules require assigning all jobs in $J_V$ to different machines.
the $P||C_{\max}$ approach to BJSP, using resource augmentation. That is, new machines are added once the uncertainty is realized. The robustness of a solution is measured by the level of resource augmentation, i.e., the number of machines required for the final solution feasibility (instead of the actual makespan objective value that we adopt in Letsios et al. 2021 with a different recovery strategy). Section 6.1 formally describes our uncertainty setting, including the uncertainty set structure and the investigated robustness measure. Section 6.2 presents the proposed approach for solving two-stage robust BJSP instances, i.e., the first and second stages. Given a collection of schedules of makespan $\le D$, our approach determines which are the more robust.

Uncertainty setting
Figure 6 illustrates the two-stage setting for solving BJSP under uncertainty. The Figure 6 setting is most similar to the Liebchen et al. (2009) recoverable robustness setting, but also has connections to other two-stage optimization problems under uncertainty (Ben-Tal et al. 2004; Bertsimas and Caramanis 2010; Hanasusanto et al. 2015). Stage 1 computes a feasible, efficient schedule $S$ for an initial nominal instance $I$ of the problem with a set $J$ of jobs and vector $p$ of processing times. After stage 1, the uncertainty is realized and a different, perturbed vector $\tilde{p}$ of processing times becomes known. Stage 2 transforms $S$ into a feasible, efficient solution $\tilde{S}$ for the perturbed instance $\tilde{I}$ with vector $\tilde{p}$ of processing times, using machine augmentation. Designing and analyzing a two-stage robust optimization method necessitates (i) defining the uncertainty set structure and (ii) quantifying the solution robustness.
Scheduling under uncertainty may involve different perturbation types. We study processing time variations, i.e., $p_j$ may be perturbed by $f_j > 0$ to become $\tilde{p}_j = f_j p_j$. If $f_j > 1$, then job $j$ is delayed. If $f_j < 1$, job $j$ completes early. Instance $I$ is modified by a perturbation factor $F > 1$ when $1/F \le f_j \le F$, for each $j \in J$. Uncertainty set $U_F(I)$ contains every instance $\tilde{I}$ that can be obtained by disturbing $I$ w.r.t. perturbation factor $F$.
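The membership condition for the uncertainty set can be checked directly. The following is a small sketch in which the function name `in_uncertainty_set` is hypothetical and instances are represented only by their vectors of processing times.

```python
from typing import Sequence


def in_uncertainty_set(p: Sequence[float], p_tilde: Sequence[float], F: float) -> bool:
    """True if each perturbed time equals f_j * p_j for some 1/F <= f_j <= F."""
    return len(p) == len(p_tilde) and all(
        pj / F <= qj <= pj * F for pj, qj in zip(p, p_tilde)
    )
```

For example, with nominal times `[4, 6]` and `F = 2`, the vector `[2, 9]` is admissible, whereas `[1, 6]` is not because the first job would shrink by a factor of 4.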
Our two-stage robust optimization approach aims to achieve a low number of machines once the recovery stage 2 is completed. Specifically, let $S$ be an initial schedule, of makespan $\le D$, for a nominal BJSP instance $I$ and $\tilde{S}$ be a recovered schedule, obtained from $S$ after uncertainty realization, for a perturbed instance $\tilde{I} \in U_F(I)$. Denote by $\tilde{S}^*$ a feasible schedule for $\tilde{I}$ with makespan $\le D$ and minimal number of machines. Schedule $S$ is robust if the ratio $V(\tilde{S})/V(\tilde{S}^*)$, where $V(\cdot)$ denotes the number of machines in a given schedule for $\tilde{I}$, is as low as possible. By slightly abusing standard robust optimization terminology, we refer to this ratio as the price of robustness (Bertsimas and Sim 2004; Bertsimas et al. 2011; Goerigk and Schöbel 2016; Xu and Mannor 2007).

Two-stage robust scheduling approach
This section proposes a two-stage robust optimization approach for solving BJSP under uncertainty with: (i) an exact method producing an initial solution $S$ and (ii) a recovery strategy restoring $S$ after instance $\tilde{I}$ is revealed.

Exact LexOpt scheduling with machine augmentation (stage 1)
To compute robust first-stage schedules, we develop an integer programming formulation minimizing a characteristic value, which is motivated by lexicographic optimization and the fact that machine augmentation is required in the recovery stage.
Recent work shows that two-stage robust $P||C_{\max}$ schedules can be obtained by lexicographically minimizing the machine completion times $T_1 \ge \ldots \ge T_m$, i.e., $\text{lex}\min\{T_1, \ldots, T_m\}$, where $T_i$ corresponds to the $i$-th greatest machine completion time (Letsios et al. 2021). That is, we minimize $T_1$ and, among all schedules minimizing $T_1$, we select a schedule minimizing $T_2$, then $T_3$, etc. Here, the proposed two-stage robust optimization approach lexicographically minimizes the job completion times $C_1 \ge \ldots \ge C_n$, i.e., $\text{lex}\min\{C_1, \ldots, C_n\}$, where $C_j$ refers to the $j$-th greatest completion time. By considering all jobs instead of only the ones completing last on each machine, we enforce robustness at the price of extra computational effort. In particular, we are faced with a multi-objective optimization problem, where the number of objectives is $n > m$. Based on standard weighting lexicographic optimization methods, this problem can be reformulated as the mono-objective problem $\min\{\sum_{j=1}^{n} B^{C_j}\}$, where $B > 1$ is a sufficiently large scalar (Sherali 1982). We empirically select $B = 2$.
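The relation between the lexicographic order and the weighted objective can be explored on small vectors of completion times. The sketch below uses hypothetical helper names (`lex_key`, `weighted_value`) and hand-picked toy vectors; it only illustrates how the choice of base affects the weighted value.

```python
from typing import List, Sequence


def lex_key(completions: Sequence[int]) -> List[int]:
    """Completion times sorted from greatest to smallest; compare lists lexicographically."""
    return sorted(completions, reverse=True)


def weighted_value(completions: Sequence[int], base: int = 2) -> int:
    """Mono-objective surrogate: sum of base ** C_j over all jobs."""
    return sum(base ** c for c in completions)


a = [3, 1, 1]
b = [2, 2, 2]
b_is_lex_smaller = lex_key(b) < lex_key(a)
```

On this toy pair, `b` is lexicographically smaller than `a`. With `base=4` the weighted values agree with that order (48 versus 72), while with `base=2` both vectors receive the value 12, so a small base acts as an approximation of the lexicographic objective rather than an exact encoding.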
To achieve low resource augmentation in the stage 2 schedule $\tilde{S}$, we incorporate the number $V$ of machines in the characteristic value for obtaining the stage 1 schedule $S$. Because the job starts are not modified in stage 2, if a minimal number of machines is used in $S$, many new simultaneous job executions may occur in the stage 2 schedule $\tilde{S}$ after uncertainty realization, due to job delays. On the other hand, if a large number of machines is used in $S$, a small number of new job overlaps will occur in the stage 2 schedule $\tilde{S}$, but the number of machines in the final schedule is already high. An empirically chosen parameter $\theta > 0$ specifies the contribution of $V$ in the characteristic value.
Denote by $V(S)$ the number of machines in $S$. In addition, associate the weight $w_j(S) = 2^{C_j(S)}$ with each job $j \in J$, where $C_j(S)$ is the job $j$ completion time, and let $W(S) = \sum_{j \in J} w_j(S)$ be the sum of job weights in $S$. The characteristic value $F(S)$ of schedule $S$ is the weighted sum: [...] where $\theta > 0$ is a parameter specifying the relative importance between $V(S)$ and $W(S)$. Section 7 selects the $\theta$ value empirically. A schedule of minimal characteristic value can be computed with integer programming formulation (8).
Variable $v$ corresponds to the number of machines, and parameter $w_{j,t} = 2^t$ is the weight of job $j$ if it completes at time slot $t$.
Expression (8a) minimizes the characteristic value. Constraints (8b) limit the active machines to the total number of machines. Constraints (8c) force all jobs to complete before the makespan $D$. Constraints (8d) ensure that at most $v$ machines are used at each time slot $t$. Constraints (8e) require that each job $j$ is scheduled. Constraints (8f) express the BJSP constraint.
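The constraints described above can be checked for a given vector of start times without a solver. The sketch below is an illustration under stated assumptions, not formulation (8) itself: a job starting at slot $s$ occupies slots $s, \ldots, s + p_j - 1$ and completes at slot $s + p_j - 1$, and the characteristic value is taken to have the form $V(S) + \theta W(S)$, which is one form consistent with $\theta = 0$ reducing the objective to the number of machines. The name `characteristic_value` is hypothetical.

```python
from collections import Counter
from typing import List, Optional


def characteristic_value(start: List[int], p: List[int], m: int, g: int,
                         D: int, theta: float) -> Optional[float]:
    """Return V + theta * W for a feasible schedule, or None if a constraint fails."""
    alive, begun = Counter(), Counter()
    for s, pj in zip(start, p):
        begun[s] += 1
        for t in range(s, s + pj):
            alive[t] += 1
    v = max(alive.values(), default=0)                # machines in use, cf. (8d)
    last = [s + pj - 1 for s, pj in zip(start, p)]    # assumed completion slots
    feasible = (
        v <= m                                        # cf. (8b)
        and all(s >= 1 for s in start)
        and all(c <= D for c in last)                 # cf. (8c)
        and all(b <= g for b in begun.values())       # cf. (8f)
    )
    if not feasible:
        return None
    w = sum(2 ** c for c in last)                     # W(S) with weights 2^t
    return v + theta * w
```

Such a checker can be used to validate solver output or to score schedules produced by the heuristics with the same criterion.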
Large exponents provoke numerical issues when solving Integer Program (8). To circumvent this issue, we reduce the number of objectives in the underlying lexicographic optimization problem by rounding job completion times. In particular, we divide the time horizon into a set of time periods. A job $j$ completing in time period $q \ge 1$ has weight $w_j = 2^q$. By minimizing $\sum_{j \in J} w_j$, we compute near-lexicographically optimal schedules.
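The rounding step can be sketched as follows, assuming periods of equal length and completion slots numbered from 1. Both the equal-length assumption and the name `period_weight` are illustrative choices, since the text does not fix how the horizon is divided.

```python
import math


def period_weight(completion_slot: int, period_length: int) -> int:
    """Weight 2^q of a job completing in the q-th period of the given length."""
    q = math.ceil(completion_slot / period_length)
    return 2 ** q
```

With periods of 4 slots, jobs completing at slots 7 and 8 share weight $2^2$, and a job completing at slot 9 has weight $2^3$, so the largest exponent drops from the horizon length to the number of periods.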

Rescheduling strategy (stage 2)
A rescheduling strategy transforms an initial schedule $S$ for the nominal problem instance $I$ into a final schedule $\tilde{S}$ for the perturbed instance $\tilde{I}$. To satisfy the requirement that schedule $\tilde{S}$ should stay close to $S$, we distinguish between binding and free optimization decisions, similarly to Letsios et al. (2021). Let $x_{j,t}$ and $y_{i,j}$ be binary variables specifying whether job $j \in J$ begins at time slot $t \in D$ and is executed by machine $i \in M$, respectively. Based on Royal Mail practices, we consider rescheduling with restricted job start times and flexible job-to-machine assignments. Definition 1 formalizes this fact with binding and free optimization decisions.
Definition 1 Let S be the initial schedule in BJSP under uncertainty.
- Binding decisions $\{x_{j,t} : j \in J, t \in F_j\}$ are variable evaluations determined from $S$ in the rescheduling process.
- Free decisions $\{y_{i,j} : i \in M, j \in J\}$ are variable evaluations not determined from $S$ but needed to recover feasibility.
Enforcing binding decisions ensures a limited number of modifications to the initially planned solution. Note that the first-stage decisions remain critical in this context. The proposed recovery strategy sets $x_{j,t}(\tilde{S}) = x_{j,t}(S)$. On the other hand, job-to-machine assignments are decided in an online manner. For $t = 1, \ldots, \tau$, the jobs $\{j : x_{j,t}(S) = 1\}$ are assigned to the lowest-indexed available machines. A machine is available at $t$ if it executes no jobs. This assignment derives the $y_{i,j}(\tilde{S})$ values.
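The recovery step can be sketched as a single pass over the jobs in order of their fixed start times. In this illustration, a new machine is opened whenever no existing machine is idle, which is how machine augmentation is modeled here; the name `recover` and the convention that a job started at slot $s$ with perturbed duration $\tilde{p}_j$ keeps its machine busy until $s + \tilde{p}_j$ are assumptions.

```python
from typing import List, Tuple


def recover(start: List[int], p_tilde: List[float]) -> Tuple[List[int], int]:
    """Keep the start times and assign each job to the lowest-indexed idle machine.

    Returns the job-to-machine assignment and the number of machines used.
    """
    busy_until: List[float] = []  # busy_until[i]: time at which machine i is idle again
    assignment = [-1] * len(start)
    for j in sorted(range(len(start)), key=lambda j: (start[j], j)):
        for i, free_at in enumerate(busy_until):
            if free_at <= start[j]:
                break
        else:
            busy_until.append(0)          # no idle machine: open a new one
            i = len(busy_until) - 1
        busy_until[i] = start[j] + p_tilde[j]
        assignment[j] = i
    return assignment, len(busy_until)
```

For start times `[1, 2, 3]`, nominal durations `[1, 1, 1]` need one machine, while the perturbed durations `[3, 1, 1]` force a second machine because the first job is still running when the second one starts.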

Numerical results
This section computationally evaluates the proposed heuristics and robust optimization approach for BJSP with perfect knowledge and under uncertainty, respectively, using Royal Mail data. Section 7.1 discusses the derivation of Royal Mail BJSP instances and historical schedules. Section 7.2 presents information about the number and load of long and short jobs in these instances. Section 7.3 compares the proposed LPT, LSPT and LSM heuristics. Section 7.4 evaluates the sensitivity of the historical schedules with respect to the number of machines, i.e., Royal Mail vans. Finally, Section 7.5 presents a numerical assessment of the two-stage robust optimization approach. We run all computations on a 2.5 GHz Intel Core i7 processor with 8 GB of RAM running macOS Mojave 10.14.6. Our implementations use Python 3.6.8 and Pyomo 5.6.1, and solve the integer programming instances with CPLEX 12.8. A recent MEng thesis considers several of these contexts in greater detail (Suraj 2019).

Generation of benchmark instances and historical schedules
We use historical data from three Royal Mail delivery offices which we refer to as (i) DO-A, (ii) DO-B, and (iii) DO-C.
Part of the data are encrypted for confidentiality protection. We consider a continuous time period of 78, 111, and 111 working days for DO-A, DO-B, and DO-C, respectively. A BJSP instance is the set of all jobs performed by a single delivery office during one date. So, we examine 300 instances in total. A job corresponds to a delivery in a set of neighboring postal codes. The data are a list of jobs, each specified by a (i) unique identifier, (ii) delivery office, (iii) date, (iv) vehicle plate number, (v) begin time, and (vi) completion time. Below, we give more details on generating the benchmark instances and the actual schedules realized by Royal Mail.

A BJSP instance is defined by a (i) time horizon, (ii) time discretization, (iii) number of available vehicles, (iv) set of jobs, and (v) BJSP parameter. A simple data inspection shows that among all jobs, 92% run during [06:00, 19:00] in DO-A, 91% are executed during [05:00, 19:30] in DO-B, and 93% are implemented during [05:30, 19:30] in DO-C. A scatter plot illustration clearly demonstrates that these boundaries specify the time horizon for each delivery office after dropping outliers (Letsios et al. 2020). The time horizon boundaries might be violated by both the historical schedules and our two-stage robust optimization method. We use a time discretization of $\delta = 15$ min. The number of available vehicles is the number of distinct plate numbers used by each delivery office. We set the processing time of job $j \in J$ equal to $p_j = (C_j - s_j)/\delta$, where $s_j$ and $C_j$ are the start and completion time of $j$ in the corresponding historical schedule. The minimal processing time is 30 min, but a job may last for a number of hours. A scatter plot illustration shows that the distribution of processing times follows a similar pattern on a weekly basis (Chassein et al. 2019; Letsios et al. 2020). This observation supports using robust optimization to deal with the Royal Mail BJSP instances under uncertainty. BJSP parameter $g$ is set equal to the maximum number of jobs beginning in a time interval of $\delta$ units of time, after ignoring a few outliers.
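The discretization step can be sketched as follows, with times given in minutes since midnight. Rounding the duration up to a whole number of slots is an assumption made here to obtain integer processing times, and the outlier removal mentioned in the text is left out. The names `to_slots` and `max_starts_per_slot` are hypothetical.

```python
import math
from collections import Counter
from typing import Iterable, Tuple

DELTA = 15  # minutes per time slot, as in the text


def to_slots(begin_min: int, end_min: int, delta: int = DELTA) -> Tuple[int, int]:
    """Map a job's begin and completion time to (start slot, processing time in slots)."""
    start_slot = begin_min // delta                        # begin time rounded down
    duration = math.ceil((end_min - begin_min) / delta)    # assumed rounding up
    return start_slot, duration


def max_starts_per_slot(begin_minutes: Iterable[int], delta: int = DELTA) -> int:
    """Largest number of jobs beginning within one slot (no outliers removed)."""
    return max(Counter(b // delta for b in begin_minutes).values())
```

A job running from 06:10 to 07:00 is mapped to start slot 24 with a processing time of 4 slots under these conventions.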
The Royal Mail data include a historical schedule for each BJSP instance. Such a schedule is associated with (i) job start times, (ii) makespan, and (iii) number of used machines. Begin times are rounded down to the closest multiple of $\delta$ for time discretization. After rounding, the makespan is the time at which the last job completes. The number of vehicles is the maximal number of jobs running simultaneously. We note that solutions in this form do not explicitly specify job-to-machine assignments. However, once the job start times are known, feasible assignments can be computed with simple interval scheduling algorithms (Kolen et al. 2007).

Long and short jobs
This section presents information about the number and processing load of long and short jobs in the instances of each delivery office. Since the long-short job separation depends on the number $m$ of machines, we examine a range of $m$ values. A job set, i.e., the jobs executed by a delivery office in one day, is solved for every $m \in [5, 50]$. Let $K$ be the number of examined days for a delivery office, e.g. DO-A. Moreover, denote by $N$ the number of all completed jobs during these $K$ days and by $(P_1, \ldots, P_N)$ the corresponding vector of processing times. Then, $N_L(m) = |\{j : P_j \ge m, 1 \le j \le N\}|$ and $N_S(m) = |\{j : P_j < m, 1 \le j \le N\}|$ are the total numbers of long and short jobs, respectively, during all days, assuming $m$ machines. In addition, $\Lambda_L(m) = \frac{1}{m}\sum_{j : P_j \ge m} P_j$ and $\Lambda_S(m) = \frac{1}{m}\sum_{j : P_j < m} P_j$ are the mean loads of long and short jobs, respectively, with $m$ machines.
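These four quantities follow directly from the vector of processing times. A short sketch, with the hypothetical name `long_short_stats` and dictionary keys chosen for illustration:

```python
from typing import Dict, Sequence


def long_short_stats(P: Sequence[int], m: int) -> Dict[str, float]:
    """Counts and mean loads of long (P_j >= m) and short (P_j < m) jobs."""
    long_p = [x for x in P if x >= m]
    short_p = [x for x in P if x < m]
    return {
        "N_L": len(long_p),
        "N_S": len(short_p),
        "load_L": sum(long_p) / m,
        "load_S": sum(short_p) / m,
    }
```

Dividing each entry by the number $K$ of days gives the per-day averages.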

Evaluation of heuristics
This section compares the performance of the LPT (Sect. 3), LSPT (Sect. 4) and LSM (Sect. 5) heuristics for BJSP. For each job set $J$ (i.e., collection of jobs executed by a delivery office in one day) and number $m \in [5, 50]$ of machines, we create a BJSP instance with $g = 1$. We solve every instance $I = (m, J)$ using LPT, LSPT and LSM. Let $T(A, I)$ be the makespan of the schedule produced by heuristic $A$ for instance $I$. Then, the performance ratio of heuristic $A$ for $I$ is $T(A, I)/T^*(I)$, where $T^*(I)$ is the best heuristically computed makespan for $I$.
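The performance ratio is relative to the best makespan found by any of the heuristics, not to the optimum. A minimal sketch, where `performance_ratios` is a hypothetical name and the makespans in the usage note are made-up numbers:

```python
from typing import Dict


def performance_ratios(makespans: Dict[str, float]) -> Dict[str, float]:
    """Ratio of each heuristic's makespan to the best heuristically computed one."""
    best = min(makespans.values())
    return {name: t / best for name, t in makespans.items()}
```

For instance, `performance_ratios({"LPT": 40, "LSPT": 44, "LSM": 38})` assigns ratio 1.0 to LSM and ratios above 1 to the other two heuristics.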
Figure 9 plots the worst-case and average performance ratio of each heuristic, for each $m$ value. For small $m$ values, the number $n_S$ of short jobs is low compared to the mean load $\frac{1}{m}\sum_{j \in J_L} p_j$ of long jobs and LPT produces good heuristic schedules, noticeably better than LSPT. As $m$ increases, the idle period before the last job begins becomes progressively more significant and LSPT tends to compute better schedules than LPT (recall that LSPT and LPT achieve low idle machine time before and after, respectively, the last job start). Interestingly, LSPT produces the best heuristic schedules in the pathological LPT case where $n_S$ tends to become equal to $\frac{1}{m}\sum_{j \in J_L} p_j$. Finally, LSM consistently produces better schedules than LPT. Therefore, scheduling short jobs in parallel with long jobs early in a schedule is useful for achieving low makespan. Our findings indicate that an LSM variant where long jobs are executed according to SPT, in parallel with short jobs, might be a possible alternative for obtaining better BJSP approximation algorithms.
For completeness, Figure 10 plots the trade-off between the best heuristically computed makespan $T$ and the number $m$ of machines. Clearly, as $m$ increases, $T$ decreases. This finding supports using machine augmentation for better makespan schedules in the presence of uncertainty.

Evaluation of historical schedules
This section evaluates the Royal Mail historical schedules in terms of (i) efficiency in the number of used machines, (ii) sensitivity with respect to processing time variations, and (iii) sensitivity with respect to BJSP parameter variations.
For part (i), we solve each BJSP instance by feeding the corresponding MILP (8) model to CPLEX. In these MILP (8) models, we set $\theta = 0$ to minimize the number of used machines. Figure 11 compares the number of machines in the Royal Mail historical schedules and the CPLEX solutions. We observe that nominal optimal solutions save at least 10, 25, and 10 vehicles per day compared to historical schedules for DO-A, DO-B, and DO-C, respectively. This finding is a strong indication that more efficient fleet management might be possible in Royal Mail delivery offices.
For part (ii), we create a set of perturbed instances. In particular, for each original instance $I$, we create one perturbed instance $\tilde{I}$ where the processing time of each job $j \in J$ is decreased by a factor $f_j = 0.5$. We reduce the processing times to guarantee feasibility. For both instances $I$ and $\tilde{I}$, we employ CPLEX to solve the corresponding MILP (8) formulations with $\theta = 0$. Figure 12 compares the number of used machines obtained for the original and perturbed instances. Not surprisingly, doubling itinerary durations results in a proportional increase in the number of used vehicles in the nominal optimal solution. But Fig. 12 exhibits an important consequence of uncertainty in Royal Mail fleet management. Disturbances amplify the difference in the number of used machines between different days for one delivery office. This situation leads to inefficient machine utilization.
For part (iii), we investigate the effect of modifying the BJSP parameter for each delivery office. Figure 13 depicts the obtained results. Adding BJSP constraints, especially in the DO-B case, may significantly increase the number of used machines. This outcome motivates further investigations on scheduling with BJSP constraints.

Robustness assessment
Next, we evaluate the two-stage robust optimization method in Sect. 6.2. Specifically, we show that a low characteristic value results in more robust BJSP schedules. We adopt the experimental setup in Letsios et al. (2021). The Royal Mail instances are considered as the nominal ones before any disturbances occur. The true instances after uncertainty realization are derived by choosing a new processing time $\tilde{p}_j$ for each job $j \in J$ uniformly at random from the interval $[p_j - 50\% p_j, p_j + 50\% p_j]$, where $p_j$ is the nominal processing time.
For each nominal instance $I$, we generate a collection $C(I)$ of feasible, diverse (i.e., with different characteristic values) first-stage schedules, using the CPLEX solution pool feature. Let $F^*(I) = \min\{F(S) : S \in C(I)\}$ be the minimum characteristic value among all schedules in $C(I)$. Next, denote by $\tilde{I}$ and $\tilde{S}$ the perturbed instance and recovered schedule from $S$, after uncertainty realization. Moreover, let $V^*(\tilde{I}) = \min\{V(\tilde{S}) : S \in C(I)\}$ be the minimum number of machines achievable for $\tilde{I}$ with perfect knowledge. For each initial schedule $S \in C(I)$ and recovered schedule $\tilde{S}$, we set a normalized characteristic value $F_N(S) = F(S)/F^*(I)$ and normalized number of used machines $V_N(\tilde{S}) = V(\tilde{S})/V^*(\tilde{I})$. Figure 14 correlates $F_N(\cdot)$ to $V_N(\cdot)$, by plotting every computed pair $(F_N(S), V_N(\tilde{S}))$, for every nominal instance and initial solution. We observe that the smaller the initial characteristic value is, the better the final solution we get in terms of number of machines.
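The perturbation and normalization steps of this experimental setup can be sketched as follows. The draws are continuous, so any rounding of the perturbed times to whole time slots is left out as an open choice, and the names `perturb` and `normalise` are hypothetical.

```python
import random
from typing import List, Sequence


def perturb(p: Sequence[float], rng: random.Random, spread: float = 0.5) -> List[float]:
    """Draw each perturbed time uniformly from [p_j - spread * p_j, p_j + spread * p_j]."""
    return [rng.uniform(pj * (1 - spread), pj * (1 + spread)) for pj in p]


def normalise(values: Sequence[float]) -> List[float]:
    """Divide every value by the minimum over the collection."""
    best = min(values)
    return [v / best for v in values]
```

Applying `normalise` to the characteristic values of the schedules in $C(I)$ and to the machine counts of their recovered schedules yields the pairs plotted against each other.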

Conclusion
This manuscript initiates the study of the bounded job start scheduling problem (BJSP), e.g. as arising in Royal Mail deliveries. This project is part of our larger aims toward approximation algorithms for process systems engineering (Letsios et al. 2019). The main contributions are (i) better than 2-approximation algorithms for various cases of the problem and (ii) a two-stage robust optimization approach for BJSP under uncertainty based on machine augmentation and lexicographic optimization, whose performance is substantiated empirically. We conclude with a collection of directions. Because BJSP relaxes scheduling problems with non-overlapping constraints for which better than 2-approximation algorithms are impossible under widely adopted conjectures, the existence of an algorithm with an approximation ratio strictly better than 2 which does not use resource augmentation is an intriguing open question. A positive answer combining the LSM algorithm with a new algorithm specialized to instances with many very long jobs is possible. Moreover, analyzing the price of robustness of the proposed two-stage robust optimization approach may provide new insights for effectively solving BJSP under uncertainty. Our findings demonstrate a strong potential for more efficient Royal Mail resource allocation by using vehicle sharing between different delivery offices. The BJSP scheduling problem where multiple delivery offices are integrated in a unified setting and vehicle exchanges are performed on a daily basis constitutes a promising direction for fruitful investigations. In this context, recent work on car pooling might be useful. Finally, bounded job start constraints are broadly relevant to vehicle routing problems (Fisher et al. 1997; Gounaris et al. 2016). However, we are not aware of any work in this direction.
Figure 2 illustrates a tightness example of our analysis for LPT. Consider an instance $I = \langle m, J \rangle$ with $m(m - 1)$ long jobs of processing time $p$, where $p = \omega(m)$ and $m = \omega(1)$ [...] Optimal schedule of makespan $T^* = mp$.

Fig. 2 BJSP instance with $g = 1$ for the tightness of the LPT 2-approximation ratio, with $m$ machines, $\Theta(m^2)$ jobs of processing time $p$ and $\Theta(mp)$ unit-length jobs such that $p = \omega(m)$ and $m = \omega(1)$

Fig. 3 Converting a non-compact schedule to a compact one, by shifting jobs back in increasing order of their starting times

So, we may partition time slots $\{1, \ldots, r_L\}$ into $F_L = \{t : |A^L_t| = m\}$ and $B_L = \{t : |A^L_t| < m, |B^L_t| = 1\}$ and obtain:

Fig. 7 Line chart comparing the cardinality of long and short jobs. Given a number $m$ of machines, (i) the solid line plots the average cardinality and (ii) the shaded area shows the difference between the maximum and minimum cardinality, with respect to all job sets (days)

Fig. 8 Line chart comparing the processing time load $\frac{1}{m}\sum_j p_j$ of long and short jobs. Given a number $m$ of machines, (i) the solid line plots the average load and (ii) the shaded area shows the difference between the maximum and minimum load, with respect to all job sets (days)

Fig. 9 Line chart comparing the performance of the heuristics. For a given number $m$ of machines, (i) solid lines plot the worst-case performance ratios and (ii) dotted lines plot the average performance ratios of the heuristics with respect to all job sets (days)

Fig. 10 Trade-off between the makespan and number of machines. Given a number $m$ of machines, (i) the solid line plots the average makespan and (ii) the shaded area shows the difference between the maximum and minimum makespan, with respect to all job sets (days)

Figure 7 plots the averaged numbers $N_L(m)/K$ and $N_S(m)/K$ of long and short jobs per day, with respect to $m$. Similarly, Fig. 8 illustrates the averaged mean loads $\Lambda_L(m)/K$ and $\Lambda_S(m)/K$ per day, with respect to $m$. Clearly, when $m$ increases, $N_L(m)$ [...] most difficult instances for LPT arise when the averaged mean load $\Lambda_L(m)/K$ of long jobs per day tends to become equal to the averaged number $N_S(m)/K$ of short jobs per day. Figures 7 and 8 show that this pathological situation occurs when $m$ belongs to [18, 25], [20, 27] and [23, 30] for DO-A, DO-B and DO-C, respectively.

Fig. 11 Line chart comparing the number of used machines in historical and nominal optimal schedules

Fig. 12 Line chart comparing the number of used machines between the original instances and instances where the job processing times have been halved

Fig. 13 Line chart comparing the number of vehicles between instances with different BJSP parameters

Fig. 14 Scatter plots comparing the initial solution weighted value with the final solution number of vehicles

[...] be the sets of alive and beginning jobs during time unit $t$, respectively. Schedule $S$ is feasible only if $|A_t| \le m$ and $|B_t| \le g$, for all $t$.
[...] the order $p_1 \ge \ldots \ge p_n$. Recall that $|A_t|$ and $|B_t|$ are the numbers of alive and beginning jobs, respectively, at time slot $t \in D$. We say that time slot $t \in D$ is available if $|A_t| < m$ and $|B_t| < g$. LPT schedules the jobs greedily w.r.t. their sorted order. Each job $j$ is scheduled in the earliest available time slot, i.e., at $s_j$ [...]

Then, there exists a time $t \in [1, r)$ such that $|A_t| < m$ and $|B_t| < 1$. Consider the earliest such time $t$. Moreover, let $t'$ be the earliest time $t' > t$ satisfying either $|A_{t'}| = m$, or $|B_{t'}| = 1$. Clearly, there exists a job $j \in J$ such that $s_j = t'$. If we decrease the job $j$ start time by one unit of time, we obtain [...]

1 Let $s^*_{j_1} < \cdots < s^*_{j_n}$ be the order of job start times in $S^*$. Moreover, denote by $s_{j_1} < \cdots < s_{j_n}$ the order of job starts in another arbitrarily chosen optimal schedule $S$. If $k \in [1, n]$ is the smallest index such that $s^*_{j_k} > s_{j_k}$, then there exists $k' \in [1, k)$ such that $s^*_{j_{k'}} < s_{j_{k'}}$.
[...] it must be the case that $\lambda \le m$. Furthermore, $|A_t| < m$ and, thus, $|B_t| = 1$, for each $t \in [c + 1, s_n - 1]$. That is, exactly one job begins per unit of time during $[c + 1, s_n]$. Due to the LPT ordering, these are exactly the jobs $\{n - \lambda, \ldots, n\}$. Since $\lambda \le m$ and $p_j \ge m$, at least $m - h$ units of time of job $n - h$ are executed from time $s_n$ and onwards, for $h \in \{1, \ldots, \lambda\}$. Thus, the total processing load which is executed not earlier than $s_n$ is $\mu \ge \sum_{h=1}^{\lambda} (m - h)$. On the other hand, at most $m - h$ machines are idle at time slot $c + h$, for $h \in \{1, \ldots, \lambda\}$. So, the total idle machine time during [...]