Exact methods for the Oven Scheduling Problem

The Oven Scheduling Problem (OSP) is a new parallel batch scheduling problem that arises in the area of electronic component manufacturing. Jobs need to be scheduled to one of several ovens and may be processed simultaneously in one batch if they have compatible requirements. The scheduling of jobs must respect several constraints concerning eligibility and availability of ovens, release dates of jobs, setup times between batches as well as oven capacities. Running the ovens is highly energy-intensive and thus the main objective, besides finishing jobs on time, is to minimize the cumulative batch processing time across all ovens. This objective distinguishes the OSP from other batch processing problems which typically minimize objectives related to makespan, tardiness or lateness. We propose to solve this NP-hard scheduling problem using exact techniques and present two different modelling approaches, one based on batch positions and another on representative jobs for batches. These models are formulated as constraint programming (CP) and integer linear programming (ILP) models and implemented both in the solver-independent modeling language MiniZinc and using interval variables in CP Optimizer. An extensive experimental evaluation of our solution methods is performed on a diverse set of problem instances. We evaluate the performance of several state-of-the-art solvers on the different models and on three variants of the objective function that reflect different real-life scenarios. We show that our models can find feasible solutions for instances of realistic size, many of those being provably optimal or nearly optimal solutions.

Keywords Oven Scheduling Problem · Parallel batch processing · Constraint programming · Integer linear programming

Introduction
In the electronics industry, many components need to undergo a hardening process which is performed in specialised heat treatment ovens. As running these ovens is a highly energyintensive task, it is advantageous to group multiple jobs that produce compatible components into batches for simultaneous processing. However, creating an efficient oven schedule is a complex task as several cost objectives related to oven processing time, job tardiness and setup costs need to be minimized. Furthermore, a multitude of constraints that impose restrictions on the availability, capacity, and eligibility of ovens have to be considered. Due to the inherent complexity of the problem and the large number of jobs that usually have to be batched in real-life scheduling scenarios, efficient automated solution methods are thus needed to find optimized schedules.
Over the last three decades, a wealth of scientific papers investigated batch scheduling problems. Several early problem variants using single machine and parallel machine settings were categorized and shown to be NP-hard [1]; further literature reviews on different problem variants have been provided by Mathirajan et al. [2] as well as Fowler et al. [3]. Batch scheduling problems share the common goal that jobs are processed simultaneously in batches in order to increase efficiency. Besides this common goal, a variety of different problems with unique constraints and solution objectives arise from different applications in the chemical, aeronautical, electronic and steel-producing industry where batch processing machines can appear in the form of autoclaves [4], ovens [5] or kilns [6].
For example, a just-in-time batch scheduling problem that aims to minimize tardiness and earliness objectives has been recently investigated in [7]. Another recent study [6] introduced a batch scheduling problem from the steel industry that includes setup times, release times, as well as due date constraints. Furthermore, a complex two-phase batch scheduling problem from the composites manufacturing industry has been solved with the use of CP and hybrid techniques [8].
Exact methods used for finding optimal schedules on batch processing machines involve dynamic programming [9] for the simplest variants as well as CP and mixed integer programming (MIP) models. CP models have e.g. been proposed by Malapert et al. [4] and by Kosch et al. [10], where both publications consider batch scheduling on a single machine with nonidentical job sizes and due dates but without release dates. A novel arc-flow based CP model for minimizing makespan on parallel batch processing machines was recently proposed [11]. Branch and Bound [12] and Branch and Price [13] methods have been investigated as well. As the majority of batch scheduling problems are N P-hard, exact methods are often not capable of solving large instances within a reasonable time limit and thus (meta-)heuristic techniques are designed in addition. These range from GRASP approaches [14] and variable neighbourhood search [15], over genetic algorithms [16,17], ant colony optimization [18] and particle swarm optimization [19] to simulated annealing [20].
In this paper, we introduce the Oven Scheduling Problem (OSP), which is a new reallife batch scheduling problem from the area of electronic component manufacturing. The OSP defines a unique combination of cumulative batch processing time, tardiness and setup cost objectives that need to be minimized. To the best of our knowledge, this objective has not been studied previously in batch scheduling problems. Furthermore, we take special requirements of the electronic component manufacturing industry into account. Thus, the problem considers specialized constraints concerning the availability of ovens as well as constraints regarding oven capacity, oven eligibility and job compatibility.
The main contributions of this paper are: • We introduce and formally specify a new real-life batch scheduling problem.
• Based on two different modelling approaches, we propose solver independent CP and ILP models. These models are implemented using the high-level modeling language MiniZinc, which allows us to use a palette of state-of-the-art solvers and to find the bestsuited one for the Oven Scheduling Problem. In addition, these two modelling approaches are formulated using interval variables, which are decision variables tailored for the use in scheduling problems. The interval variable models are implemented using the OPL modeling language and solved with CP Optimizer. • We provide a construction heuristic that can be used to quickly obtain feasible solutions.
• All our solution methods are extensively evaluated through a series of benchmark experiments, including the evaluation of several search strategies and a warm-start approach. Results are compared for three different variants of the objective function that arise from practical use cases. For a sample of 80 benchmark instances, we obtain optimal results for 37 instances, and provide upper and lower bounds on the objective for all instances.
We make the benchmark instances as well as the models described in this paper publicly available on Zenodo [21]. The current paper is a significant extension of our CP 2021 conference paper [22]. The major addition to our conference paper is an entirely new modeling approach based on representative jobs for batches, which produces the best results on our benchmark set (see Sections 4 and 5.2).
The rest of this paper is organized as follows: We first provide a description of the OSP (Section 2) before we formally define the problem and formulate an ILP model (Section 3). Then we present a second modelling approach using representative jobs (Section 4). Alternative model implementations as well as search strategies and the warm-start approach are described in Section 5. Finally, we present and discuss experimental results (Section 6).

Description of the OSP
The OSP consists in creating a feasible assignment of jobs to batches and in finding an optimal schedule of these batches on a set of ovens, which we refer to as machines in the remainder of the paper.
Jobs that are assigned to the same batch need to have the same attribute; 1 in the context of heat treatment this can be thought of as the temperature at which components need to be processed. Moreover, a batch cannot start before the release date of any job assigned to this batch. The batch processing time may not be shorter than the minimal processing time of any assigned job and must not be longer then any job's maximal processing time, as this could damage the produced components. Every job can only be assigned to a set of eligible machines. Moreover, machines have a maximal capacity, which may not be exceeded by the cumulative size of jobs in a single batch.
Setup times between consecutive batches must also be taken into account. Setup times depend on the ordered pair of attributes of the jobs in the respective batches and are independent of the machine assignments. In the context of heat treatment, this can be thought of as the time required to switch from one temperature to another. Moreover, setup times before the first batch on every machine need to be taken into account. Setup times before first batches depend on the initial state of machines. Setup operations are scheduled immediately before every batch.
Furthermore, machines are only available during machine-dependent availability intervals, both batch processing times and setup times between batches need to be scheduled within these machine availability intervals. In practice, an interval for which a machine is unavailable can correspond to a period during which a machine is turned off or occupied with some other task, or to a period for which the personnel required for the setup and running of ovens is unavailable. For all described cases, the setup of machines for a given batch cannot be performed during an interval for which the machine is unavailable and must precede the batch immediately. Therefore, setup times also need to fall completely within the associated availability interval.
The objective of the OSP is to minimize the cumulative batch processing time, total setup costs, as well as the number of tardy jobs. These three objective components are combined in a single objective function using a linear combination, as is formalized in Section 3.2. As the minimization of job tardiness usually has the highest priority in practice, the tardiness objective has a higher weight than the other objectives.
In practice, the cumulative batch processing time should be minimized as the cost of running an oven depends merely on the processing time of the entire batch and not on the number of jobs within a batch. Therefore, running an oven containing a single small job incurs the same costs as running the oven filled to its maximal capacity.
Furthermore, we note that setup costs and setup times are not necessarily correlated. In fact, cooling down an oven from a high to a low temperature might not incur any (energy) costs, but still might require a certain amount of time. Setup costs are used to capture all costs that are related to the setup required between batches; e.g. costs related to personnel involved in the setup operation can also be captured by setup costs. Therefore, setup times are not included in the objective function, only setup costs are. However, up to a certain extent, setup times are implicitly minimized since they can have an impact on the tardiness of jobs.
Using the three-field notation introduced by Graham et al. [23], the OSP can be classified asP|r j ,d j , maxt j , b i , ST sd,b , SC sd,b , E j , Av m |obj, where E j stands for eligible machines and Av m for availability of machines. A more formal description of the problem constraints and the objective function obj is given in Section 3.
As shown by Uzsoy [24], minimizing makespan on a single batch processing machine is an N P-hard problem. This can be seen as follows: For the special case that all jobs have identical processing times, minimizing makespan on a single batch processing machine is equivalent to a bin packing problem where the bin capacity is equal to the machine capacity and item sizes are given by the job sizes. It follows that the OSP is N P-hard as well, since it generalizes this problem. Indeed, minimizing makespan on a single batch processing machine is equivalent to minimizing batch processing times in an oven scheduling problem with a single machine, a single attribute, no setup times, and for which all jobs are available at the start of the scheduling horizon.

Instance parameters of the OSP
An instance of the OSP consists of a set M = {1, . . . , k} of machines, a set J = {1, . . . , n} of jobs and a set A = {1, . . . , a} of attributes as well as the length l ∈ N of the scheduling horizon. Every machine m ∈ M has a maximum capacity c m and an initial state s m ∈ A. The machine availability times are specified in the form of time intervals [as(m, i), ae(m, i)] ⊆ [0, l] where as(m, i) denotes the start and ae(m, i) the end time of the i-th interval on machine m. W.l.o.g. we assume that every machine has the same number of availability intervals and denote this number by I where some of these intervals might be empty (i.e. as(m, i) = ae(m, i)). Moreover, availability intervals have to be sorted in increasing order (i.e. as(m, i) ≤ ae(m, i) ≤ as(m, i + 1)] for all i ≤ I − 1).
Every job j ∈ J is specified by the following list of properties: • A minimal processing time mint j ∈ N with min T ≤ mint j ≤ max T , where min T > 0 is the overall minimum and max T ≤ l is the overall maximum processing time.
Moreover, an (a × a)-matrix of setup times st = (st(a i , a j )) 1≤a i ,a j ≤a and an (a × a)matrix of setup costs sc = (sc(a i , a j )) 1≤a i ,a j ≤a are given to denote the setup times (resp. costs) incurred between a batch with attribute a i and a subsequent batch with attribute a j . Setup times (resp. costs) are integers in the range [0, max ST ] (resp. [0, max SC ]), where max ST ≤ l (resp. max SC ∈ N) denotes the maximal setup time (resp. maximal setup cost). Note that these matrices are not necessarily symmetric.

Objective function
The objective of the OSP is to minimize the following threefunctions: • the cumulative batch processing time across all machines p • the number of tardy jobs t • and the cumulative setup costs sc.
The cumulative batch processing time is the sum of processing times of all batches, not of all jobs. Minimizing the cumulative batch processing time thus leads to batches being formed in the most efficient way. Setup costs are composed of the setup costs before the first batch on every machine, which are determined by the initial state of the machine and the attribute of the first batch, and the setup costs between consecutive batches, which are determined by the attributes of the batches.
Depending on the application scenario, each of these three objective components are given integer weights, and these weighted components are then combined in a single function by means of a linear combination. Using such weights also allows us to implement lexicographic optimization, as is detailed at the end of Section 3.2.

Example instance with six jobs
Consider the following example for an OSP instance consisting of six jobs, two attributes, two machines and a scheduling horizon of length l = 15. The instance parameters are summarized in the following tables and matrices: Figure 1 visualises a solution of this instance consisting of three batches. In this visualisation, the dark grey areas correspond to time intervals for which the machine is not available, light gray rectangles before batches are setup times and the dashed lines represent the machine capacities. This solution is optimal with respect to the cumulative batch processing time ( p = 11) and the tardiness (t = 0). It is however not optimal with respect to the setup costs: for this solution, the setup costs are sc = 40, but solutions with sc = 30 exist as well. However, lower setup costs can only be achieved to the detriment of job tardiness.

Construction heuristic
We designed a construction heuristic that can find initial solutions and can also serve as upper bound on the optimal solution cost. The heuristic is a dispatching rule that prioritizes jobs first by their earliest start date and then by their latest end date. Similar rules, such as the Earliest Due Date (EDD) dispatching rule presented by Uzsoy [25] for batch scheduling on a single machine with incompatible job families, have been proposed for other batch scheduling problems. The heuristic starts at time t = 0. At every time step t, the list of currently available machines is generated: these are all machines on which no batch is running at time t and for which t lies within a machine availability interval. For these available machines, the list of available jobs is generated: these are jobs that have not yet been scheduled, have already been released and can be processed on one of the available machines. Among these jobs, the one with the earliest due date is chosen. If there are several jobs that fulfill these conditions, the largest one is chosen first. In case of ties, the job with smallest index is selected. Let the selected job be j ∈ J . Moreover, if there are several machines for which j fits into the current availability interval and that are eligible for j, the heuristic chooses the machine for which the setup time from the previous batch (or from the initial state of the machine) is minimal.
Once a job j is scheduled, the algorithm tries to add other jobs that are currently available to the same batch: For this, jobs are only considered if their attribute as well as minimal and maximal processing times are compatible with job j and the combined batch size does not exceed the machine capacity. Moreover, unless job j finishes too late anyway, other jobs are only considered if their processing time does not force j to finish late. If job j finishes late in the current schedule, all compatible jobs that are currently available are considered for addition to the batch. In case there are several jobs that meet these requirements, we sort them in decreasing order of their latest end dates and keep adding jobs to the batch as long as the machine capacity and compatibility requirements are fulfilled. If there are no jobs (left) that can be added to the batch and the maximum machine capacity is not reached yet, we look ahead and also consider jobs with an earliest start time > t. However, we only consider compatible jobs that do not force job j to finish late (unless this was already the case) and that can still be processed within the current machine availability interval. Once all jobs for the batch have been chosen, the batch is scheduled at the earliest possible time.
If there are other jobs that can be processed at time t, we schedule them and fill up their batches in the same way. If no job can be scheduled, the time is increased to t + 1 and the above procedure is repeated until the end of the scheduling horizon is reached or all jobs have been scheduled.
For the example instance presented in the previous section (Section 2.3), the construction heuristic delivers the optimal solution depicted in Fig. 1. It proceeds as follows: At time t = 0, machine 1 is available and jobs 2 and 3 are available for this machine. Since job 2 has an earlier latest end date, it is scheduled first. In order to respect the necessary setup times from the initial state of machine 1, job 2 is scheduled at time t = 2. To this batch, job 1 can be added. No other jobs can be scheduled at the time is increased to t = 2. Now, job 5 is selected and scheduled at time t = 5 on machine 2. To this batch, jobs 4 and 6 can be added. Finally, the time is increased until t = 8 and job 3 can be scheduled.
Note that this construction heuristic is not guaranteed to find a feasible solution for every instance of the OSP. sIndeed, due to the ordering of the jobs and the way that jobs are chosen to fill up batches, it is possible that the heuristic is not capable of scheduling all jobs within the scheduling horizon. In order to see this, consider the following small example with a single machine and two jobs: The scheduling horizon is l = 3 and the machine is available for the entire interval [0, 3]. The two jobs 1 and 2 have the same attribute and sizes are such that they can be processed together in a batch. The earliest start, latest end and processing times are given as follows: et 1 = 0 and et 2 = 1, lt 1 = 2 and lt 2 = 3, mint 1 = mint 2 = maxt 1 = maxt 2 = 2. The only feasible solution is to schedule both jobs at time t = 1, which leads to job 1 finishing late. The construction heuristic however schedules job 1 at time t = 0 and then fails to schedule job 2. Even though we cannot guarantee that the described construction heuristic finds a feasible solution for all instances (where such a solution exists), we can guarantee that the partial solution provided by the heuristic, i.e., the solution restricted to those jobs that were scheduled, is be feasible. This is due to the fact that all constraints are fulfilled when jobs are scheduled or added to existing batches by the heuristic.

Formal problem definition and direct model for the OSP
In this section we provide a formal definition of the OSP that will also serve as an ILP model for the problem that can be directly implemented using the modelling language MiniZinc [26]. The full MiniZinc model is also available on Zenodo [21].
The advantage of the free and open-source constraint modeling language MiniZinc is that it can be used to model constraint optimization problems in a high-level, solver-independent way. This model is then compiled into FlatZinc, which is a solver input language that can be processed by a multitude of state-of-the-art solvers.
We first explain how batches are modeled and afterwards define decision variables, objective function, and the set of constraints.
The core idea is to model batches by their positions on machines. In the worst case we need as many batches as there are jobs; we thus define the set of potential batches as B = {B 1,1 , . . . , B 1,n , . . . , B k,1 , . . . , B k,n } to model up to n batches for machines 1 to k. In order to break symmetries in the model, we further enforce that batches are sorted in ascending order of their start times and empty batches are scheduled at the end. That is, B m,b+1 is the batch following immediately after batch B m,b on machine m for b ≤ n − 1. Clearly, at most n of the k · n potential batches will actually be used and the rest will remain empty.

Variables
We define the following decision variables: • Machine and batch number assigned to job: X m,b, j ∈ {0, 1} ∀m ∈ M ∀b ∈ [1, n] ∀j ∈ J The binary variables X m,b, j encode whether job j is assigned to batch B m,b , i.e., X m,b, j = 1 if and only if job j is assigned to batch b on machine m. • Start times of batches: To handle empty batches, we define an additional attribute with value 0 and extend the matrices of setup times st and setup costs sc so that no costs occur when transitioning from an arbitrary batch to an empty batch: st(a i , a j ) = sc(a i , a j ) = 0 if a i = 0 or a j =0. Moreover, we add a machine availability interval [l, l] of length 0 to the list of availability intervals so that empty batches can be scheduled for this interval (the maximum number of intervals per machine therefore becomes I + 1).
We then define the following auxiliary variables: • Availability interval for batch: The binary variables E m,b encode whether batch b on machine m is empty.

Objective function
Recall that the three components of the objective function are the cumulative batch processing time across all machines p, the number of tardy jobs t and the cumulative setup costs sc. They are formally defined as follows: Note that in the definition of the setup costs, the first sum refers to the setup costs from the initial state of a machine to its first batch and the second sum refers to the costs incurred between batches. Normalization of cost components Finding a trade-off among several objectives is the topic of multiobjective optimization [27,28]. In practice, lexicographic optimization is often chosen. For our specific application, we wanted an approach that allows high flexibility and lets users decide on the importance of each objective. We thus decided in consultation with our industrial partner to define the objective function as a linear combination of the three components. This approach also enables lexicographic optimization, by setting the weights accordingly as will be explained at the end of this subsection. Also note that all solutions that are optimal with respect to a weighted sum of the objective components are Pareto-optimal [29] with respect to the three objective components. However we cannot guarantee to find all Pareto-optimal solutions using this type of objective function.
In order to combine the three objective components using a linear combination, p, sc and t need to be normalized top,sc andt ∈ [0, 1] so that we can guarantee that all three components take values in the same range: In the worst case, every batch processes a single job. In this case the total batch processing time is n times the average job processing time avg t . The setup costs are bounded by the maximum setup cost multiplied with the number of jobs. We take the maximum of 1 and max SC since it is possible that max SC = 0. The number of tardy jobs is clearly bounded by the total number of jobs.
Finally, the objective function obj is a linear combination of the three normalized components: obj = (w p ·p + w sc ·sc where the weights w p , w sc and w t take integer values. Integer-valued objective As some state-of-the-art CP solvers can only handle integer domains, we propose an alternative objective function obj , where we additionally multiplỹ p,sc, andt by the number of jobs and the least common multiple of avg t and max SC : where C = lcm(avg t , max(max SC , 1)).
Preliminary experiments using the MIP solver Gurobi showed that using obj and obj both lead to similar results. We therefore used only obj in our final experimental evaluation. Lexicographic optimization The use of integer weights for each of the objective components also allows us to express the OSP as a lexicographic minimization problem. This can be done using so-called "big M"-constants. In order to explain this in more detail, let us denote by obj 1 ∈ {p, sc, t} the objective component with highest lexicographic importance, obj 2 the component with second-highest importance and with obj 3 the component with lowest importance. Lexicographic optimality is defined using the lexicographic order < lex as follows: with j ∈ {1, 2, 3} the smallest index such that obj j (s 1 ) = obj j (s 2 ) and where s 1 and s 2 are two feasible solutions to the OSP. Lexicographic minimization can be achieved by minimizing the integer-valued objective function obj' with weights set as follows: obj' = w 1 · obj 1 + w 2 · obj 2 + w 3 · obj 3 w 3 = 1 where the upper bounds of the objective components are calculated as in the normalization of components (see (2)), i.e., ub( p) = avg t · n ub(sc) = max(max SC , 1) · n ub(t) = n.
Note that the calculation of the least common multiple as in (4) is not necessary here. The corresponding normalized objective function obj is then given as follows: For a concrete example of lexicographic minimization, see the description of use case UC2 in Section 6.1.2.

Constraints
In what follows, we formally define the constraints of the OSP. 2 • Each job is assigned to exactly one batch: • Each job is assigned to exactly one eligible machine: • Batches may not start before the earliest start of any job in the batch: • Batch processing times must lie between the minimal and maximal processing time of all assigned jobs: • Batches on the same machine may not overlap and setup times must be considered between consecutive batches: • Batches and the preceding setup times must lie entirely within one machine availability interval: 1≤i≤I +1 The binary auxiliary variables I m,b,i in constraint (5)  • Total batch size must be less than machine capacity: • Jobs in one batch must have the same attribute, which we model with auxiliary variables A m,b to set the attribute of a batch: where a is the total number of attributes. It thus has to hold that • Helper variables for tardy jobs: The binary auxiliary variables T m,b, j encode whether job j in batch b on machine m finishes after its latest end date and is used to calculate the number of tardy jobs t. (12)). • Setup times and costs between consecutive batches on the same machine: The helper variables st m,b (resp. sc m,b ) encode the setup time (resp. setup cost) between batch b on machine m and the following batch on the same machine. Note that setup times and costs are also defined between empty batches and set to 0 in this case, as the setup costs and times from attribute 0 to attribute 0 are equal to 0.
• Set decision variables for empty batches: The binary helper variables E m,b encode whether batch b on machine m is empty or not. Constraints (13) and (14) ensure that E m,b = 1 iff no job is scheduled for batch B m,b . The constraints (15) to (17) set the start times, processing times, availability intervals, and attributes for empty batches: S m,b = l, P m,b = 0, A m,b = 0 and I m,b = I + 1. Moreover, in order to break symmetries, the list of batches (B m,b ) 1≤b≤n per machine m ∈ M is sorted so that all non-empty batches appear first (constraint (19)).

Interval variable model using representative jobs for batches
For the model presented in Section 3, we decided to create decision variables for start times and processing times of every batch that could possibly be present (n · k batches in total). Which batch a job is assigned to was then encoded using additional decision variables. The scheduling constraints were formulated for the batches. An alternative approach is to schedule jobs instead of batches. However, only one representative job for every batch is scheduled; all other jobs in the same batch point to their representative job. In order to break symmetries, the job with lowest index is chosen to be the representative job for its batch. Whereas the previous approach models the OSP primarily as an assignment problem (jobs are assigned to batches that have a predefined order), this approach focuses more on the scheduling aspect of the OSP: a subset of representative jobs are chosen for which an optimal schedule needs to be found. Such a modelling approach has been used previously in the literature for batch scheduling problems, e.g. by Kosch et al. [10] who speak of host jobs instead of representative jobs.
Since CP Optimizer is particularly well suited for scheduling problems as demonstrated by Laborie et al. [30], we formulated this modelling approach based on representative jobs as a CP model for IBM ILOG CPLEX Studio. This model is written using the Optimization Programming Language (OPL) [31] and makes use of interval variables, as well as other concepts that are specific to CP Optimizer. The full OPL model is also available on Zenodo [21]. In addition, we modelled this approach using the MiniZinc modelling language in order to be able to evaluate the performance of different solvers. This alternative solver-independent CP formulation is briefly described in Section 5.2.
Before we describe the model, we give a brief explanation of the use of interval variables in OPL. An interval variable int is a decision variable that can either be present or absent in a solution. In case it is present, it is characterized by a start time For a more formal definition of the used CP Optimizer concepts, see [30], and for more information on how they can be used, see the tutorial [32].
We define the following decision variables for the OSP: • Optional interval variables for jobs: The variable job j is only present if job j is the representative job of its batch. Note that the time window [et j , l] ensures that every representative job starts after its earliest start time et j .
• Optional interval variables for jobs on machines: The variable jobM j,m is only present if job j is assigned to machine m and is representative for its batch. The intensity function av m encodes the machine availability times and is modeled using intensity step functions; av m (t) = 100% if machine m is available at time t (the job can be fully performed by the machine) and av m (t) = 0% otherwise (the job cannot be performed by the machine).
• Pointers to representative jobs in the same batch: If inBatchWith j = i for some job i ∈ J , job j is scheduled in the same batch as job i and job i is representative for this batch. If inBatchWith j = 0, job j is representative for its batch.
• In order to establish the order of jobs, respectively batches, on machines, we make use of a type of decision variable called interval sequence variable in OPL. An interval sequence variable seq is defined by a tuple (S, T ) where S is a set of interval variables and T ⊆ N with |T | = |S| is a set of so-called integer types. The value of an interval sequence variable seq is a permutation of the interval variables in S (any absent interval variables are ignored). We define the following interval sequence variables: The value of mach m is thus a permutation of the interval variables jobM j,m . The types are used to model the attribute of jobs (the attribute of a batch containing job j is equal to a j ). This allows us to formulate scheduling constraints such as noOverlap and to use functions such as endOfPrev and typeOfPrev to formulate constraints on the start times of batches and setup times between batches. • Setup times between batches are also modelled using optional interval variables: where max ST denotes the overall maximum setup time. The variable setup j,m is used to model the setup operation before jobM j,m . Setup times need to be modelled in addition to jobs themselves in order to ensure that setup times fall within the same machine availability interval as the following jobs.
• A binary helper variable is used to model tardy jobs: For the formulation of the objective function as well as the constraints, we make use of the OPL function typeOfPrev. For a sequence interval variable s, an interval variable i, and two integers b and c, typeOfPrev(s, i, b, c) returns the type (=attribute) of the interval variable that is previous to i in the sequence s. When i is present and is the first interval in s, the function returns the constant integer value b. When i is absent, the function returns the constant integer value c. We then model the OSP as follows: where prev( j, m) = typeOfPrev(mach m , jobM j,m , s m , 0).
The alternative constraint in (21) ensures that every job j, whenever present, is scheduled on exactly one machine and that the interval variables job j and jobM j,m are synchronised, where m is the machine j is assigned to. Constraint (22) ensures that every job j, whenever present, is scheduled on an eligible machine. Constraints (23) and (24) ensure that every job is either scheduled or points at some other job that is scheduled and has a lower index.
Constraint (25) concerns the conditions that need to be fulfilled for all non-representative jobs in a batch: every job needs to have the same attribute as the representative job for the batch, the length of the interval variable corresponding to the representative job must be greater than or equal to the minimum processing time and smaller than or equal to than the maximum processing time of every job in the batch and the start of the representative job may not before the earliest start time of any other job in the batch. Note that for representative jobs, the conditions on the processing times and the earliest start time are fulfilled by definition of the job interval variables. However, for all non-representative jobs in a batch, these conditions need to be stated explicitly in the form of constraints. Moreover, it is ensured that jobs can only be processed together with another job if the assigned machine is eligible. Constraint (26) guarantees that machine capacities are not exceeded.
The noOverlap-constraint in equation (27) ensures that representative jobs, i.e. batches, on the same machine do not overlap. Passing the setup time matrix st as additional parameter to the noOverlap-constraint enforces a minimal distance between consecutive jobs based on the types (=attributes) of the jobs. This constraint is sufficient to ensure that enough time is left between batches to allow for the setup operation to be performed. However, we also need to guarantee that the setup operation falls within the same machine availability interval as the following batch.
Therefore, setup times are also modelled as interval variables and have to fulfill constraints (28)-(31): the setup time interval before job j on machine m is present if and only if job j is present on machine m (constraint (28)), setup times end exactly at the start of the following job (constraint (29)) and start after the preceding job on the same machine, or at a time ≥ 0 if the following job is the first on this machine (constraint (30)). The length of the setup time interval is given by the entry st(a i , a j ) in the setup time-matrix, where a i is the attribute of the preceding and a j of the following job on the machine. Constraints (32) and (33) finally force jobs and setup times to fall entirely within a machine availability interval of the assigned machine: whenever jobM j,m or setup j,m is present, it cannot overlap a point t where av m (t) = 0.
The helper variable encoding tardy jobs is set via constraint (34). Finally, (35) is a redundant constraint that states that the total number of batches is smaller or equal to the total number of jobs.

Alternative model implementations, search strategies and warm-start
To make a thorough comparison of our models possible, we also implemented the direct model presented in Section 3 using OPL and created a solver-independent MiniZinc model with representative jobs per batch based on the OPL model presented in Section 4. We briefly describe these two alternative implementations in the following. The full models are available on Zenodo [21].

Alternative interval variable model using batch positions
This model is based on the direct model using batch positions as described in Section 3.
We use optional interval variables for all possible batches: where min T is the overall minimum and max T is the overall maximum processing time). Batches are optional since not all k · n batches will actually be used. Setup times between batches are also modelled using optional interval variables: Besides these interval variables, we use the same decision variables as in Section 3: for the attribute of batch B m,b and sc m,b for setup costs.
The following constraints are used for the batch and setup time interval variables: • presenceOf ensures that a batch is present iff some job is assigned to it. Similarly, setup times are present iff the preceding and following batch are present. • endBeforeStart is used to enforce the order of batches and setup times on the same machine and endAtStart is used to schedule setup times exactly before the following batch. • startOf restricts the start time of batches to be after the earliest start date of any assigned job and lengthOf is used to enforce that batch processing times lie between the minimal and maximal processing time for every assigned job • noOverlap is used as a redundant constraint to ensure that batches on the same machine do not overlap • forbidExtent is used to guarantee that batches and setup times are scheduled entirely within one machine availability interval: whenever B m,b or st m,b is present, it cannot overlap a point t where av m (t) = 0.

Alternative direct model with representative jobs per batch
This model is based on the OPL model presented in detail in Section 4 and is implemented using the high-level constraint modeling language MiniZinc [26]. In MiniZinc, interval variables are not available. Instead, we model (optional) start times of jobs and processing times of jobs in separate decision variables. Optional integer variables are available in MiniZinc as well and can either take an integer value or the value <>, indicating that the variable is absent. We define the following decision variables: • Optional start time of jobs and optional start times of jobs on machines: start j ∈ [0, l] and startM j,m ∈ [0, l] ∀j ∈ J ∀m ∈ M. (this variable is absent for jobs which are not representative or not assigned to the given machine). • Processing times of jobs and processing times of jobs on machines: proc j ∈ [0, max T ] and procM j,m ∈ [0, max T ] ∀j ∈ J ∀m ∈ M. • Optional pointers to representative jobs in the same batch: inBatchWith j ∈ J ∀ j ∈ J • Optional variables for the length of setup times before batches: setup j,m ∈ [0, max ST ] ∀j ∈ J ∀m ∈ M.
Moreover, we use the following auxiliary variables to handle machine availability intervals and to define setup times between batches: • Optional variables for the availability interval a job is scheduled to an a given machine: • Attribute of previous job on the same machine: attPrev j,m ∈ [0, a] ∀j ∈ J ∀m ∈ M (is equal to 0 if there is no previous job). • End time of previous job on the same machine: endPrev j,m ∈ [0, l] ∀j ∈ J ∀m ∈ M (is equal to 0 if there is no previous job). • Binary variables encoding whether a job is the first one on its assigned machine: first j,m ∈ {0, 1} ∀j ∈ J ∀m ∈ M.
We make use of the following global constraints: • alternative to ensure that every job that is present, i.e., every representative job, is scheduled to exactly one machine, • disjunctive to enforce that representative jobs on the same machine do not overlap.

Alternative interval variable model using state functions in OPL
Another way of modeling batches in CP Optimizer is with the help of so-called state function variables [30], as done, e.g., by Ham et al. [33] for a flexible job shop scheduling problem with parallel batch processing machines. These functions model the time evolution of a value that can be required by interval variables; for the OSP, state functions could model the attribute of jobs. Two interval variables requiring incompatible states cannot overlap and interval variables requiring compatible states can be batched together. That is, state functions can be used to model that jobs can only be processed together in a batch if they have the same attribute. We developed an alternative OPL model for the OSP using state functions; it is available on Zenodo [21]. However, preliminary experiments showed that this model is not competitive with the one using representative jobs presented in Section 4. We therefore did not include it in the experimental evaluation presented in this paper.

Alternative direct model using MiniZinc's high-level CP modeling notation and batch positions
We also created a direct model using the modeling approach based on batch positions as described in Section 3 and MiniZinc's high-level CP modeling notation to formulate the constraints. This has the advantage that constraints do not need to be manually linearised and the model is somewhat easier to read for non-experts. In the formulation of the constraints, we implicitly make use of constraint reification to express conditional sums and additionally use the maximum global constraint. Furthermore, we implicitly utilize the element constraint to use variables as array indices.
As an example, let us present the constraints for batch start and processing times. We use the decision variables M j ∈ M for the machine job j is assigned to and B j ∈ [1, n] for the position of the assigned batch. Then the constraint stating that jobs may not start before their earliest start time can be formulated as follows: The constraint on batch processing times can be stated as: The full model can again be found on Zenodo [21]. and is referred to as direct_cp_model_simpler.mzn. We decided not to include this model in our exper-imental evaluation, since the results in our previous publication [22] showed that-in terms of the number of solutions found and the quality of solutions-this model cannot achieve better results than the ILP model from Section 3 for any of the evaluated solvers. Since both models are based on the same modelling approach, we decided to include only the stronger one of the two in this paper.

Programmed search strategies
For our solver-independent MiniZinc models, we implemented several programmed search strategies, which are based on variable-and value selection heuristics. Such heuristics determine the order of the explored variable and value assignments for a CP solver and can play a critical role in reducing the search space that needs to be enumerated by the solver. For our experiments, we implemented the search strategies directly in the MiniZinc language using search annotations.
Variable ordering In our implemented search strategies we first select an auxiliary variable that captures the total number of batches. For this variable we always use a minimum value first heuristic to encourage the solver to look for low cost solutions early in the search. Afterwards, we sequentially select decision variables related to a job by assigning the associated batch, machine, batch start time, and batch duration for the job (i.e., . For the CP model using representative jobs briefly presented in Section 5.2, we select the jobs' start times and processing times (i.e., start 1 , proc 1 , . . . , start |J | , proc |J | ).
Variable selection heuristics We use three different variable selection strategies on the set of decision variables that are related to job assignments: input order (select variables based on the specified order), smallest (select variables that have the smallest values in their domain first, break ties by the specified order), and first fail (select variables that have the smallest domains first, break ties by the specified order).
Value selection heuristics We experimented with two different value selection heuristics for the set of variables which is related to job assignments: min (the smallest value from a variable domain is assigned first), and split (the variable domain is bisected to first exclude the upper half of the domain).
Evaluated search strategies Using the previously defined heuristics we evaluated 8 different programmed search strategies:

Warm-start approach
For those solvers that support a warm-start option, we implemented such an approach. The construction heuristic described in Section 2.4 was used to find an initial solution for a given OSP instance. The solution is then provided to the model by fixing the values of certain decision variables; based on these values the solver then needs to complete this partial solution. We decided to provide values of the following variables: • For the models using batch positions: the machines and batches jobs are assigned to, the start times of batches. More precisely: • For the models using representative jobs: the pointers to representative jobs in the same batch and the start times or processing times of representative jobs. More precisely: -For the interval variable model described in Section 4: the pointers to representative jobs inBatchWith j , the start times of interval variables for jobs job j and for jobs on machines jobM j,m for all j ∈ J , m ∈ M. -For the direct model implemented in MiniZinc as described in Section 5.2: the pointers to representative jobs inBatchWith j , the processing times of jobs proc j and of jobs on machines procM j,m for all j ∈ J , m ∈ M. We decided to give the values for processing times rather than start times because MiniZinc does not support providing arrays of optional variables for warm-start.

Experimental evaluation
In this section we give an overview on the results of an extensive set of experiments that we conducted to evaluate all proposed models. First, we describe our experimental setup in Section 6.1: This consists of the description of the benchmark set, of three different sets of weights that stem from three practical use cases, of the solution methods chosen for comparison and of the experimental environment. Moreover, we describe the method chosen to compare the overall performance of solutions methods against each other. Then, in Section 6.2 we configure our methods: where the chosen solvers allow to do so, we choose the best search strategy per model and per use case and evaluate the impact of the warm-start approach. Finally, in Section 6.3, we present the results of our extensive analysis: We first give an overall performance evaluation of our methods, compare the sizes of our models and then investigate the performance with respect to several criteria in more detail.

Benchmark instances
Using a random instance generator specifically designed for the OSP, we created a large set of benchmark instances to evaluate the performance of our proposed models. Our random instance generator is described in detail in Section 1 of the Appendix. First, we executed the random generator once for every possible configuration of the 15 parameter values specified in Table 13. Thereby we produced 1024 instances for each of the 16 combinations of the parameters n, k and a. Then we randomly selected 5 instances from every set of 1024 instances, creating a set of 80 instances which we used throughout our experiments. This set thus consists of 20 instances each with 10 (instances 1-20), 25 (21-40), 50 (41-60) and 100 jobs (61-80). All benchmark instances turn out to be satisfiable and solvable by the construction heuristic described in Section 2.4. This reflects our real-life industrial application, for which feasible solutions usually can be found heuristically and the main aim is to find cost-minimal schedules. The set of benchmark instances is publicly available [21]. Note that this set of benchmark instances is slightly different to the one presented in the authors' conference paper [22], since we have added initial states of machines. Note also that the objective function as defined in equation (4) no longer includes setup times; we can therefore not expect that the optimal solutions for this benchmark set have the same solution cost as in the conference paper [22].

Three use cases for the choice of weights
Together with our industrial partner, we formulated weight settings for three different use cases that we used throughout our experiments.
• UC1: Reducing job tardiness has the highest priority for this use case. Tardy jobs are only accepted if absolutely necessary and if scheduling all jobs on time would lead to much higher runtime of ovens. Alternatively, finishing jobs before their latest end date could have been modelled as hard constraints, but this would potentially lead to infeasible instances which was not desired by our industrial partner. Reducing the runtime of ovens has a higher priority than reducing setup costs. The weights are set as follows: This choice of weights implies that reducing the number of tardy jobs by one reduces the solution cost by the same amount as reducing the cumulative batch processing time by 25 · avg t , i.e., 25 times the average processing time of a job. In other words, grouping 26 jobs with average processing time in a single batch instead of processing them individually has the same impact as scheduling one more job so that it finishes on time. Note that for our benchmark instances that have 100 jobs or less, it will hardly ever be possible to schedule 26 or more jobs in one batch. Thus, setting the tardiness-weight w t to 25 · w p basically corresponds to modelling lexicographic minimization with tardiness being lexicographically more important than cumulative batch processing time. • UC2: This use case models lexicographic minimization with minimizing cumulative batch processing time being lexicographically more important than the number of tardy jobs and minimizing tardiness being lexicographically more important than setup costs.
Using the notation of Section 3.2, we thus have: p = obj 1 , t = obj 2 and sc = obj 3 . In this case, weights need to be set individually for each instance. This is done as follows (as described in Section 3.2): • UC3: In this use case, the same importance given to reducing tardiness and oven runtime. Reducing setup costs is less important by a factor of 2.
This choice of weights implies that processing two jobs with average processing time in a single batch instead of processing them individually has the same impact on the solution cost as scheduling one more job so that it finishes on time. Reducing the number of tardy jobs by one (or the cumulative batch processing time by avg t ), has the same impact as reducing the setup costs by 2 · max SC .

Evaluated methods and computing environment
An overview of the models and solvers that we evaluated on our benchmark set can be found in Table 1: We implemented the models presented in Sections 3 and 5.2 using the high-level constraint modeling language MiniZinc [26]. We used the MiniZinc IDE version 2. to evaluate the models. Note that MiniZinc automatically provides an ILP encoding for any constraints that are not already stated in a linear form when an ILP solver such as Gurobi is used to solve the high-level constraint model [34]. For Chuffed and OR-Tools, we used all 7 search strategies described in Section 5.5 and compared them with the solver's default search strategy. For Chuffed, we activated the free search parameter which allows the solver to interleave between the given search strategy and its default search. Furthermore, we investigated a warm-start approach with Gurobi (for the other solvers, warmstart is currently not supported by MiniZinc), as described in Section 5.6. Finally, the OPL models presented in Sections 4 and 5.1 were run using CP Optimizer in IBM ILOG CPLEX Optimization Studio version 20.1.0.0. The warm-start approach was evaluated for these two models as well. This results in a total of 42 different combinations per instance, when considering all possible combinations of models, solvers, search strategies and of the warm-start option. The time limit for every one of these combinations was set to one hour per instance; this includes the flattening time in MiniZinc and the model extraction time for CP Optimizer. In order to refer to one of these methods in our evaluation, we use the following abbreviations throughout the remainder of this section: "solver-model", where the model names introduced in Table 1 are used. The solvers are referred to as "chuffed", "gurobi", "cpopt", "ortools" and "oplrun". 3 Experiments were run on single cores, using a computing cluster with 10 identical nodes, each having 24 cores, an Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz and 252 GB RAM.

Borda scores for the assessment of solution methods
We want to compare the performance of our solution methods across the whole benchmark set and assign a score to every solution method. Moreover, when assessing the performance of a solution method on a particular instance, we would like to take into account the solution status, i.e., whether a solution has been found and whether the found solution was proven to be optimal within the runtime limit, as well as the solution quality for non-optimal solutions. The assessment method used by the MiniZinc challenges [35] fulfills both these goals in a particularly concise way and we therefore decided to implement this method for our experimental evaluation.
The following explanation of the assessment method follows the description of the rules of the MiniZinc challenge 2022 as presented on the challenge's website. 4 The scoring procedure is based on the Borda count [36] voting system. Each benchmark instance is treated like a voter who gives each solution method a score that reflects how many solution methods it beats on this instance. The overall Borda score of a solution method is then given by summing up the scores across all instances.
More precisely, a method m is better than method m on instance i if and only if: • m solved instance i (found a feasible solution for instance i) within the runtime limit and m did not solve the instance, or • both m and m solved the instance i within the runtime limit, but m provided an optimality proof and m did not, or • both m and m did not provide an optimality proof for instance i within the runtime limit, but m found a solution with higher quality than m , i.e., the solution cost of the solution found by m is smaller than the solution cost of the one one found by m .
A method m then scores points on instance i by comparing its performance with each other method m on instance i. Points are scored as follows: • If m gives a better answer than m (in the above sense), it scores 1 point.
• If m gives a worse answer than m , it scores 0 points.
• If m and m give indistinguishable answers, the scoring is based on execution time comparison: gurobi-direct-batch-pos warm-start warm-start warm-start gurobi-direct-repr-jobs warm-start warm-start warm-start oplrun-interval-batch-pos warm-start warm-start warm-start oplrun-interval-repr-jobs warm-start -warm-start , m )) denote the total time required by method m (method m ) on instance i. Then method m scores points.
-For the special case that both m and m do not solve instance i, both score 0 points.

Configuration of solution methods for each use case
Based on the results of our experiments, we select the best search strategy for every eligible solution method and evaluate whether using the warm-start option is beneficial. This selection per use case is summarised in Table 2. The selected variants are then used in the overall comparison of solution methods in Section 6.3.

Search strategies
To evaluate the performance of the 7 search strategies described in Section 5.5, we ran experiments on all benchmark instances using Chuffed and OR-Tools together with the batch position model as well as the representative jobs model.
We calculated Borda scores independently for both MiniZinc models, the two solvers Chuffed and OR-Tools and the three use cases. The Borda scores and ranks for all three use cases with Chuffed and the batch position based model are displayed in Table 3.
We can see in the results that the solver's default search scored the lowest rank for all three use cases. Further, we clearly see an improvement with the custom search strategies in the Borda score compared to default search. Search4, which uses a first fail based variable selection strategy together with a minimum value selection heuristic, performed best on UC1 and UC2, however, for UC3 Search5, which relies on an input order based variable selection, performed better. Table 4 shows the Borda scores and results for Chuffed with the representative jobs model.
Again the solver's default search strategy is not competitive with any custom search, although the difference in Borda scores is smaller than it was with the batch position model. Search6, which relies on a smallest domain variable selection strategy and a split value selection heuristic, performs best on UC1 and UC3, while Search3, which uses a minimum domain value selection instead, produces the best results on UC2.
Results achieved by the different search strategies with OR-Tools and the batch position model are shown in Table 5.
Similar as with the results produced by Chuffed there is a clear improvement over the solver's default search when using the custom search strategies. In this case Search3 per-  formed best on UC2 and UC3, whereas Search7, which uses a first fail variable selection together with a split value selection strategy, scores rank 1 on UC1. Finally, Table 6 displays the scores for OR-Tools and the representative jobs model. Again the default search cannot compete with the other search strategies. Further, Search6 is ranked first on all three use cases, which is similar to the results with Chuffed and the representative job model where Search6 also performed high scores for all use cases.
To summarize, the evaluated custom search strategies could clearly improve the results of the default search strategies in all considered cases. Further, which search strategy performed best differs depending on the evaluated model and solver. However, the results indicate that Search6 delivers high scores with the representative job model with both Chuffed and OR-Tools.
Search strategies can also be used for Gurobi and are implemented with branching priorities by the MiniZinc interface. However, initial experiments with Gurobi and all search strategies did not reveal significant differences between the search strategies for this solver. For CP Optimizer run via MiniZinc, search strategies are currently not supported.
Based on the results presented in this section, we selected the search strategy that led to best results per solver, model, and use case for our final experiments which are presented in later sections (see Table 2 for details).

Warm-start option
As described in Section 5.6, we implemented a warm-start approach. This approach was tested with Gurobi run via MiniZinc and the models direct-batch-pos and direct-repr-jobs as well as with CP Optimizer run directly and the models interval-batch-pos and intervalrepr-jobs. Note that warm-starts are referred to as starting points in CP Optimizer. Currently, MiniZinc does not support a warm-start option for the other evaluated solvers. An overview of the results, when comparing the chosen method with and without the warm-start option, can be found in Table 7.
The construction heuristic (see Section 2.4) could find feasible solutions for all 80 instances within a few seconds. On average, the construction heuristic required 0.271 seconds per instance, 57 out of 80 instances could be solved in less than 0.1 seconds and the maximum time required per instance was 5.6 seconds. As can be seen in the left part of Table 7, warm-starting the methods gurobi-direct-batch-pos, oplrun-interval-batch-pos and oplruninterval-repr-jobs with these initial solutions led to solutions for all 80 instances, i.e, 20 The best results per method and use case are highlighted in bold font to 30 more instances than without the warm-start option, for gurobi-direct-batch-pos and oplrun-interval-batch-pos. For gurobi-direct-repr-jobs however, only 39 instances could be solved for UC1 (36 for UC2 and 38 for UC3), increasing the number of solved instances only slightly. 5 The heuristic solution could be improved by the warm-start approach for almost all of the 80 benchmark instances for the methods gurobi-direct-batch-pos and oplrun-intervalrepr-jobs. Interestingly, for one of the benchmark instances with 10 jobs, the construction heuristic already finds the optimal solution. However, for both gurobi-direct-repr-jobs and oplrun-interval-batch-pos, this is not the case: the initial solution can only be improved for slightly less than half of the benchmark instances (between 34 and 37 instances for all three use cases).
Regarding the solution quality, using the warm-starting approach leads to an improvement for some instances, but not for all. Detailed numbers can be found in the right part of Table 7. These should be read as follows, as exemplified for the method gurobi-direct-batch-pos and UC1: For 7 of the 80 benchmark instances, the solution found without the warm-start approach was strictly better than with warm-start; for 24 instances the solution was strictly better when using the warm-start approach. Note that we consider found solutions to be better if the other method could not find a solution. As one can see by the values in bold font (indicating whether it was advantageous to use the warm-start option), using warm-start on average helped to improve the solution quality within the runtime limit of one hour for almost all methods and use cases. The number of better solutions with the warm-start approach are largest for gurobi-direct-batch-pos and oplrun-interval-batch-pos. For oplrun-interval-reprjobs however, the picture is less clear. For UC1 and UC3, using the warm-start approach led to lower solutions costs for more instances than it did not, but the number of better solutions are quite close with and without the warm-start approach. For UC2, the numbers are also close but on average it was better not to use the warm-start approach. A possible explanation is that oplrun-interval-repr-jobs is already good a solving the benchmark instances even without the additional warm-start data. The time spent on loading the warm-start data and completing it to a full solution is thus better spent on improving the solution that can be found without warm-start.
As for the number of optimality proofs, these could be increased in more than half of the cases, mostly when using Gurobi as a solver. When comparing proof times, it did -on average -however not make much difference whether warm-start was used or not.
To conclude, the major advantage of using the warm-start approach is that it can help find solutions for more instances. Moreover, for the modelling approach using batch positions and both Gurobi and CP Optimizer, the solutions found were in the great majority of the cases better than (or equally good as) those found without warm-start. Also for gurobi-directrepr-jobs, the quality of solutions was improved more often than deteriorated by using the warm-start approach. For oplrun-interval-repr-jobs, that could already find solutions for all 80 benchmark instances without warm-start the advantage is not so clear and the use of a warm-start approach cannot be recommended in general.

Experimental results
In this section, we summarize the results of all performed experiments. For every solution method and every use case, we picked the best configuration regarding search strategies and warm-starting (see the results displayed in Table 2).

Evaluation of overall performance of our solution methods
In order to compare the overall performance of our solution methods, we computed the Borda scores (as defined in Section 6.1.4) independently for all three use cases for all configured evaluated methods, including the greedy construction heuristic. The exact techniques are configured using the best search strategy per use case and, if advantageous, the warm-start option as reported in Table 2. This also allows us to evaluate the robustness of our solution methods regarding different sets of weights of the objective components. The results are displayed in Fig. 2, where one line-graph is drawn per use case. The solution methods are sorted from best to worse when considering the sum of Borda scores obtained for all three use cases. As one can see, the Borda scores for the two other use cases UC2 and UC3 follow the same general tendency and are close in value to those for use case UC1. This means that the performance of the evaluated solution methods is almost the same across all three evaluated use cases and that the performance of our solution methods are robust with respect to changes in the chosen weights.
In particular, for all three use cases the first four places are taken by the same solvers: The overall best solution method is the interval model with representative jobs run directly with CP Optimizer (oplrun-interval-repr-jobs) and the second best is the direct model with batch positions run with Gurobi (gurobi-direct-batch-pos). The third place is taken by cpoptdirect-batch-pos and the fourth by oplrun-interval-batch-pos. The values of the Borda scores of the three best solution methods for all three use cases UC1, UC2 and UC3 are reported in Table 8. Note that for UC1 and UC3, the scores obtained by oplrun-interval-repr-jobs and gurobi-direct-batch-pos are very close in value; for UC2 however, the difference is larger. One possible explanation is that UC2 models lexicographic minimization and involves large multiplicative constants for the three components of the objective function; it could be that CP Optimizer is better at dealing with these large numerical values in the objective function than Gurobi. Moreover, note that there is a significant gap between the second and third place; oplrun-interval-repr-jobs and gurobi-direct-batch-pos are by far the best two solutions methods for all three use cases.  Moreover, note that the maximum possible score for any solution method is 800, since every solution method is compared to 10 other solution methods on a total of 80 instances. The scores of the overall best solution method opl-interval-repr-jobs thus lie between 89% (for UC3) and 93% (for UC2) of the possible maximum. Also for the second-best solution method gurobi-direct-batch-pos, the scores lie between 73% (UC2) and 89% (UC1). A method that always achieves better results than all other methods would achieve a score of 100%. The very high scores of the two best solution methods lead us to the conclusion that when solving a benchmark instance for the OSP with one of these methods, the likelihood is very large that the result is better than when using any of the other methods.
It is also interesting to observe that when using a Gurobi, Chuffed or CP Optimizer run via MiniZinc, the modelling approach using batch positions is better than the one with representative jobs, whereas the contrary is the case for the interval models run directly with CP Optimizer. A possible explanation can be found by investigating the model sizes of these two modelling approaches, as we will see in the following section.
Also note that the greedy construction heuristic ranks in 9th place out of 12, when summing up the scores across all three use cases and was capable of achieving scores that are comparable with those achieved by the exact techniques ortools-direct-repr-jobs, ortools-direct-batch-pos, chuffed-direct-batch-pos, cpopt-direct-repr-jobs and gurobi-direct-repr-jobs (for some of the use cases). This relatively good result is mainly due to the fact that the construction heuristic was capable of solving all 80 benchmark instances, which is not the case for all exact methods (details for UC1 can be found later on in Table 11). To sum up: Across all three use cases, the overall best solution method is the interval model with representative jobs run with CP Optimizer directly and the second best is the direct model with batch positions run with Gurobi. Also for the other solution methods, the choice of weights does not have much influence on the overall performance of our methods and we can conclude that our methods are robust with respect to different weight settings.

Comparison of model sizes
Tables 9 and 10 summarize the numbers of variables and constraints for selected instances for use case UC1 (we selected one instance for each of the 16 different size categories). These numbers are retrieved from the solvers' log files.
The tables display in each row from left to right: The id of the instance, the parameters n (number of jobs), k (number of machines), a (number of attributes) that were used for generating the instance, the number of constraints (cons) and variables (vars) for all four modelling approaches. Table 9 contains information for the two direct models implemented in MiniZinc and solved with the overall best solver Gurobi; Table 10 contains information on the same instances for the two interval variable models solved directly with CP Optimizer.
The numbers displayed in Table 9 show that the direct-batch-pos model uses much fewer variables and constraints compared to direct-repr-jobs for instances of all sizes with Gurobi. Actually, the direct-repr-jobs model becomes so large with n = 100 that the model compilation with MiniZinc could not finish within the given runtime, and thus no numbers on the variables and constraints can be shown in the table (marked with -). On the other hand, when comparing the interval-batch-pos and interval-repr-jobs models in Table 10, we can see that the model using representative jobs uses a lower number of variables and constraints. This is an expected result as the representative job model can be captured more efficiently when using the scheduling global constraints available with CP Optimizer. We further see that the number of attributes does not affect the number of variables and constraints with CP Optimizer. Again this is expected as attributes can be implicitly handled with the scheduling global constraints. These numerical findings are in line with the overall performance results presented in the previous section: when using a direct modelling approach in MiniZinc the smaller model with batch positions leads to better results; when using interval variables with CP Optimizer the smaller model with representative jobs beats the batch position model.

Detailed results for several performance measures (for UC1)
The rankings reported in Section 6.3.1 allowed us to compare the overall performance of our solution methods. Now, we want to take a closer look at several performance measures and answer the following questions: Which methods are best at … • …finding solutions?
Finally, we are also interested in assessing the quality of solutions found by the best solution methods. In the following, we answer these questions for the first use case UC1 since the results for the other two use cases are similar. Table 11 provides an overview of the final results produced for UC1 on the 80 benchmark instances with all configured evaluated methods, including the greedy construction heuristic. For the exact techniques, we used the beat search strategy as reported in Table 2 and the warm-start option. The first column in each row denotes the evaluated solver and model. The solution methods are sorted by their Borda scores for UC1. From left to right, the columns display: the number of solved instances, the number of instances where overall best, i.e. minimal, solution cost results could be achieved, the number of obtained optimal solutions, the number of optimality proofs, the number of instances where the fastest optimality proof could be found, and the number of instances where the best, i.e. maximal, lower bound could be found.
Finding solutions The OPL model with representative jobs (oplrun-interval-repr-jobs) and with batch positions (oplrun-interval-batch-pos) as well as the direct model with batch positions run with Gurobi (gurobi-direct-batch-pos) finds solutions for all 80 instances. This is not surprising since these three solution methods all use a warm-start approach (for a comparison of the results with and without the warm-start approach, see Section 6.2.2) and the construction heuristic is capable of finding feasible solutions for all 80 instances. Even without a warm-start approach, ortools-direct-batch-pos was capable of finding solutions for 78 instances, followed by cpopt-direct-batch-pos (71 instances) and chuffed-direct-batchpos (70 instances). For all solvers run via MiniZinc, the model with batch positions led to much better results than the one with representative batches; this is in line with the overall performance results.
To conclude, we can note that besides the best-performing solution methods OR-Tools was also good at finding solutions with the model using batch positions.
Finding optimal solutions and delivering optimality proofs Using all evaluated methods, optimal solutions could be found for 38 instances: all instances with 10 jobs, 16 instances with 25 jobs, one with 50 jobs and one with 100 jobs.
Most optimality proofs were provided by Gurobi and the batch position model (34 proofs), followed by CP Optimizer run directly with the representative jobs model (28 proofs) and Gurobi with the representative jobs model (26 proofs). Moreover, some of the solutions found by Gurobi were optimal, but not proven to be optimal (3 solutions each for gurobi-directbatch-pos and gurobi-direct-repr-jobs). With oplrun-interval-repr-job, 7 additional optimal solutions could be found but were not proven to be optimal.
Far less optimality proofs were delivered by the other solution methods, but optimal solutions could still be found. Even though cpopt-direct-batch-pos (cpopt-direct-repr-jobs) could only provide 13 (18) optimality proofs, it did find 31 (29) optimal solutions; similarly chuffed-direct-batch-pos provided only 14 optimality proofs but could find 24 optimal solutions. For ortools-direct-batch-pos, optimality proofs could be delivered for all 20 instances where the optimal solution was found. It is noteworthy that for Chuffed and CP Optimizer run via MiniZinc significantly more optimality proofs could be found when using the model with representative jobs.
Overall, the greatest number of fastest proofs were delivered by oplrun-interval-repr-jobs (15 fastest proofs in total). For the subset of instances with 10 jobs (20 instances in total) for which many solution methods could deliver optimality proofs, Table 12 contains further  Table 11 Overview of the detailed computational results for UC1 based on 80 benchmark instances  Numbers are reported only for those methods that could prove optimality for all small instances The best results are highlighted in bold font information regarding optimality proofs: the average total runtime (which includes the time for finding the optimal solution and for proving optimality), the average number of nodes visited in the search process, the standard deviation of the runtime and the standard deviation of the number of visited nodes. We can see that fastest proofs were delivered by gurobi-directbatch-pos: on average, 2 seconds were required per instance and all proofs could be delivered within less than 10 seconds. Second best results were achieved by oplrun-interval-repr-jobs; the slightly higher average and standard deviation are mainly due to a single instance for which more than 10 seconds were required for the optimality proof. The other three solution methods that could provide optimality proofs for all 20 small instances require more time with averages above 30 seconds. However, for all methods compared in Table 12, the proof times are on average less than three minutes and exceed ten minutes only for a single instance: the maximum proof time for a small instance is 183 seconds for gurobi-direct-repr-jobs, 515 seconds for chuffed-direct-repr-jobs and 1473 seconds for ortools-direct-batch-pos. For the solution methods run via MiniZinc, the average number and the standard deviation of nodes visited during the search process are in line with the distribution of proof times: the fewer nodes were visited, the faster proofs were delivered. For oplrun-interval-repr-jobs, the number of nodes visited were higher, even though proofs could be delivered very quickly.
To sum up, the two best performing methods oplrun-interval-repr-jobs and gurobi-directbatch-pos also found most optimal solutions and provided most optimality proofs. However, in terms of delivering optimality proofs, gurobi-direct-batch-pos clearly outperforms the overall best method oplrun-interval-repr-jobs. For small instances with ten jobs, the two best performing methods were capable of providing proofs much faster than other methods.

Best solutions
Regarding the quality of found solutions, the results are totally in line with the overall ranking of solution methods: best results were achieved by oplrun-interval-repr-jobs (65 best results), followed by gurobi-direct-batch-pos (54 best results) and cpopt-direct-batchpos (38 best results). For all other solution methods, best results could only be achieved for those instances that were solved optimally or for one additional instance.
Best lower bounds Lower bounds (also referred to as dual bounds) on the solution cost of an optimal solution are provided by Gurobi, OR-Tools and CP Optimizer (both when called directly and via MiniZinc). Interestingly, best lower bounds were found by CP Optimizer run via MiniZinc and the model using representative jobs, a solution method that did not perform particularly well with respect to other criteria (best bounds for 50 instances). Second-best results were provided by gurobi-direct-batch-pos, the solution method that could also find most optimality proofs (best bounds for 44 instances). Optimality gap for solutions found by best methods Finally, we want to evaluate the ability of the two best performing methods oplrun-interval-repr-jobs and gurobi-direct-batch-pos in closing the optimality gap. That is, how good are these two methods and finding primal and dual bounds that are close to each other? For small instances with 10 jobs, optimal solutions and optimality proofs could be found very quickly. For larger instances, for which no optimality proof could be delivered within the runtime limit, we compute the optimality gap which is defined as follows: For a given solver, if s(i) is the objective value of the solution found and b(i) is the best (i.e. maximal) lower bound found for instance i, the optimality gap is given by g(i) = (s(i) − b(i))/s(i). The boxplots in Fig. 3 visualize the distribution of the optimality gap (in %) per instance for the two best solution methods oplrun-interval-repr-jobs and gurobi-direct-batch-pos. The results are grouped by the number of jobs in the instance.
For both compared solution methods, the optimality gap increases with the number of jobs per instance. However, for all three job sizes the median optimality gaps are significantly smaller for gurobi-direct-batch-pos than for oplrun-interval-repr-jobs. For instances with 25 jobs and gurobi-direct-batch-pos, the optimality gap is smaller than 1% for 14 instances compared to 8 instances for oplrun-interval-repr-jobs. Moreover, the gap is never above 40% for gurobi-direct-batch-pos, whereas it is above 40% for 11 instances for oplrun-intervalrepr-jobs. For instance with 50 jobs and gurobi-direct-batch-pos, the gap is smaller than 1% for 10 instances and never above 60%; for oplrun-interval-repr-jobs it is always above 60%. For the largest jobs with 100 jobs, the gap reaches values close to 100% for both solvers. But again, the median value is much lower for gurobi-direct-batch-pos (50%) than for oplrun-interval-repr-jobs (94%).  4 Solution quality for the solutions found by the two best solution methods measured in terms of the relative distance to the overall best lower bound found. Results are grouped by the size of the instances We can conclude that for those instances where no optimality proof could be delivered within the runtime limit, gurobi-direct-batch-pos is much stronger than oplrun-interval-reprjobs in closing the gap between upper and lower bounds.

Quality of solutions found by best methods
We previously saw that gurobi-direct-batch-pos is a lot better than oplrun-interval-repr-jobs at closing the optimality gap for instances with 25 jobs or more. This does however not imply that the solutions found by oplrun-interval-reprjobs are necessarily worse than those found by gurobi-direct-batch-pos. Indeed, as indicated in Table 11, more best solutions were found by oplrun-interval-repr-jobs than by gurobidirect-batch-pos. In order to evaluate the quality of solutions found by the best two solution methods, we therefore compute the relative distance to the overall best lower bound found as follows. For a given instance i, let ob(i) be the overall best lower bound found by any solver and let s(i) be the objective value of the solution found by some solver. Then the relative distance for this solver is given by r (i) = (s(i) − ob(i))/s(i). The boxplots in Fig. 4 visualize the distribution of this relative distance (in %) per instance for the two best solution methods oplrun-interval-repr-jobs and gurobi-direct-batch-pos. The results are grouped by the number of jobs in the instance.
The first observation that we can make is that-in contrast to the results concerning the optimality gap displayed in Fig. 3-the results look very much alike for both best-performing methods. For both methods, the relative distance increases with the number of jobs per instance: while the solutions are optimal or very close to the optimum for most instances with 25 jobs (oplrun-interval-repr-jobs found optimal solutions for 14 out of 20 instances, gurobi-direct-batch-pos did so for 16 instances) and never above 20 %. This is no longer the case for instances with 50 or 100 jobs. Even though the relative distance generally increases with the number of jobs, there are also quite a few larger instances that are close to the optimum: For oplrun-interval-repr-jobs there are 15 instances with 50 jobs or more for which the optimality gap is less than 1% (14 instances for gurobi-direct-batch-pos). In total, the optimality gap is less than 1% for 51 instances for oplrun-interval-repr-jobs (50 instances for gurobi-direct-batch-pos) and smaller than 10% for 61 instances out of 80 (for both methods).
Looking closer at the data, we can see that oplrun-interval-repr-jobs achieves results with slightly smaller optimality gap for the larger instances with 50 or 100 jobs than gurobidirect-batch-pos. Nonetheless, the large instances still leave room for improvement for both methods, regarding finding better solutions (or improved lower bounds).

Conclusions of the experimental evaluation
The overall scoring of solution methods provided in Section 6.3.1 produced two clear winners: the interval model with representative jobs solved by CP Optimizer directly (oplrun-interval-repr-jobs) and the direct model with batch positions solved by Gurobi (gurobi-direct-batch-pos). The rankings of solution methods created for three different practical use cases shows the robustness of our methods with respect to changes in the weights of the objective components. The analysis of our solution methods with respect to several performance criteria in Section 6.3.3 showed that the two winning methods also produced best results in terms of the number of solved instances, of best solutions, of optimality proofs and of proof speed. Only with respect to finding good lower bounds best results were achieved by a different method, namely cpopt-direct-repr-jobs.
The evaluation of different search strategies showed that the selection of search strategy has a major impact on the solvers Chuffed and Gurobi. However, the solution methods using these solvers could still not compete with the best two solution methods. We saw that including an initial solution with a warm-start approach can help the solvers CP Optimizer and Gurobi produce more solutions. The positive impact was however less obvious for the best method oplrun-interval-repr-jobs, which could already produce very good results without a warm-start approach.
Finally, let us attempt to answer the question which solver and which model are best at solving the OSP. This is difficult to answer in general, since the combination of solver and model is crucial for the performance of a solution method.
Using the results presented in Section 6.3.3 we can nonetheless discern CP Optimizer (run directly) as the overall best method when it comes to solving all instances and finding good solutions. The second best solver in this respect is Gurobi; moreover Gurobi is the best solver when it comes to providing optimality proofs.
Regarding the performance of the different models, the experimental results show that the model using representative jobs produced the largest number of best results when interval variables can be utilized (which is the case with the OPL models and CP Optimizer). However, if interval variables are not supported by the solver (which is the case with Gurobi, Chuffed, OR-tools, and CP Optimizer via the MiniZinc interface), in general the number of best found solutions is improved when using the model with batch positions instead of representative jobs. We conclude that using representative jobs for batches can be considered the overall better model if efficient interval variables can be utilized. However, if no solver that supports interval variables is available or proving optimality is the main aim, the model based on batch positions can leads to better results.

Summary and future work
In this paper, we introduced and formally defined the Oven Scheduling Problem and provided instances for this problem, a new batch scheduling problem that appears in the electronic component industry. We proposed two different modelling approaches, presented corresponding CP and ILP models in MiniZinc and OPL and investigated various search strategies.
Using our models as well as a warm-start approach, we were able to find feasible solutions for all 80 benchmark instances. Provably optimal solutions could be found for nearly half of the instance set and for 51 of the 80 instances, the optimality gap is less than 1%. Overall, the best results could be achieved with the interval variable model using representative jobs (see Section 4) and running CP Optimizer directly, followed by the direct model with batch positions (see Section 3) solved by Gurobi using warm-start. We evaluated all our solution methods for three different use cases that arise in practice. The results showed that the same methods achieved best results in all three cases, which indicates the robustness of our solution methods with respect to changes in the weights of the objective components.
The proposed exact approach was able to produce solutions for instances up to 100 jobs. However, to further improve the solution quality for large instances and efficiently tackle large-scale realistic instances with more than 100 jobs, we plan to develop metaheuristic search strategies based on local search or large neighborhood search. First results using a local search approach with simulated annealing already look promising and we plan to continue our research in this direction [37]. Within a runtime limit of five minutes, the simulated annealing approach could reach all optimal solutions found by exact methods. The solution quality could be improved for approximately half the instances for which no provably optimal solutions were found by the exact methods.
Moreover, in order to explain which parameters cause instances to be hard, an in-depth instance space analysis could be conducted. This could also shed light onto which methods perform best on different parts of the instance space. straints, machine availability times nor setup costs and times, the random generation of the associated instance parameters is a novel contribution of this paper. The list of parameters for this instance generator is given in Table 13.
Jobs. The list of n jobs is generated as follows. First, for every job j, the minimal processing time mint j is chosen using a discrete uniform distribution U (1, max T ). For the maximum processing time, there are two options: either there is no upper limit on the processing time of jobs (max_time = false), in which case the maximum processing time is set to max T for all jobs. Or, if max_time = true, the maximum processing time for a job is chosen using a discrete uniform distribution U (mint j , max T ). Next, the earliest start and latest end times are determined for every job. The earliest start time et j is chosen similarly as by Velez Gallego [38] according to a discrete uniform distribution U (0, ρ · Z ) where ρ ∈ [0, 1] and Z = mint j is the total processing time of all jobs. If ρ = 0, all jobs are available right at the beginning and as ρ grows, the jobs are released over a longer interval. The latest end time lt j is chosen as in [16] according to lt j = et j + U (1, φ) · mint j where φ ≥ 1. If φ = 1, the latest end time is equal to the sum of the earliest start time and the minimum processing time, meaning that all jobs must be processed immediately in order to finish on time. As φ grows, more time is given for every job to be completed and tardy jobs are less likely. Regarding the set of eligible machines for a job, one machine is chosen at random among all machines. Additional machines are then added to this set with probability σ each. The size s j and attribute a j of a job are both chosen at random between 1 and the maximum job size s or number of attributes respectively.
Attributes. The setup times and setup costs matrices can be of four different types: Constant, arbitrary, realistic and symmetric. For the type "constant", setup times/costs are all equal to a randomly chosen constant between 0 and max T /4 . For the type "arbitrary", every entry is chosen independently at random between 1 and max T /4 . For the type "realistic", setup times/costs between two batches of the same attribute are lower and are chosen independently at random between 0 and max T /8 , whereas setup times/costs between different attributes are higher and are chosen between max T /8 + 1 and max T /4 . For the type "symmetric" a symmetric matrix is generated with random entries between 0 and max T /4 .
Machines. The maximum machine capacity c m is randomly chosen between min C and max C where the lower bound min C is set to the maximum job size s to ensure that every job fits into every machine. The initial state s m is chosen at random among the set of attributes {1, . . . , a}. For the machine availability times, we first fix the length of the scheduling horizon l. If we assume that every job is processed in a batch of its own and that all jobs are processed on the same machine, the total runtime is at most equal to the sum of all processing times Z plus n times the maximal setup time max st . The parameter τ ∈ (0, 1] is a lower bound for the fraction of time that every machine is available. Thus, if max et is the latest earliest start time, all jobs should-on average-be finished at time l = max et + (Z + n · max st )/(τ ) which we use to set the length of the scheduling horizon. Note that if the latest end date of a job is greater than this upper bound we simply use it instead. Now, for every one of the k machines, we pick the number of availability intervals I randomly between 1 and max I . Every interval [start i , end i ] should be long enough to accommodate at least a single job with minimal processing time min T = min(mint j : j ∈ J ) (plus the necessary setup times). Thus, the minimum distance between two interval start times start i and start i+1 is d = min T +max st . We first pick the start time start 1 of the first interval: In order to guarantee that every machine is available at least a fraction τ of the time, 0 ≤ start 1 ≤ l · (1 − τ ) must hold and in order to leave enough time for all availability intervals, it has to hold Table 13 List of parameters of the random instance generator start 1 ≤ l − I · d. Next, for the start times of the remaining intervals, we pick I − 1 random integers between (start 1 + d) and (l − d) that are at least d apart. Finally, we determine the end time end i of the i-th interval: end i = start i + max(d, U (τ, 1) · (start i+1 − start i ) )