The mathematical model formulates the timetabling problem as a large linear mixed-integer program (MIP). The program is, in general, too large to be solved directly, so an incremental solution method was developed to obtain a feasible solution and progressively improve the objective function. This approach resembles other two-stage matheuristic algorithms in the literature, such as that of Lindahl et al. (2018), which employ a combination of a MIP model and a local search heuristic.
Decision variables
The formulation uses four types of binary 0–1 decision variables: x, y, z and Z, which represent, respectively, the class time assignment, the class room assignment, the allocation of students to classes and the allocation of students to course configurations. These variables are defined in eqs. (1), (2), (3) and (4). A feasible solution to the timetabling problem can be unambiguously represented by these variables.
For every class \(c\in C\) and time assignment \(t\in T_c\):
$$\begin{aligned} x_{c,t} = \left\{ \begin{array}{ll} 1 &{} \text{ if } \text{ the } \text{ class } c\hbox { takes place at time }t,\\ 0 &{} \text{ otherwise }\end{array}\right. \end{aligned}$$
(1)
For every class \(c\in C\) and potential room assignment \(r\in R_c\):
$$\begin{aligned} y_{c,r} = \left\{ \begin{array}{ll} 1 &{} \text{ if } \text{ the } \text{ class } c\hbox { takes place in room }r,\\ 0 &{} \text{ otherwise }\end{array}\right. \end{aligned}$$
(2)
For every student \(s\in S\) and class \(c\in C\):
$$\begin{aligned} z_{s,c} = \left\{ \begin{array}{ll} 1 &{} \text{ if } \text{ the } \text{ student } s\hbox { is assigned to class }c,\\ 0 &{} \text{ otherwise }\end{array}\right. \end{aligned}$$
(3)
For every student \(s\in S\) and course configuration \(f\in F_o\) belonging to a course \(o\in O_s\) that he may attend:
$$\begin{aligned} Z_{s,f} = \left\{ \begin{array}{ll} 1 &{} \text{ if } \text{ student } s\hbox { attends the configuration }f,\\ 0 &{} \text{ otherwise }\end{array}\right. \end{aligned}$$
(4)
The rest of the section presents the formulation of the linear MIP constraints and objective function using the above variables. We first describe the fundamental constraints representing the basic requirements among classes, rooms and students, and then the formulation of the hard distribution constraints and the components of the objective function.
Fundamental constraints
The fundamental constraints correspond to the main requirements that define an acceptable solution to the timetabling problem. We describe eight types of fundamental constraints whose interpretation is straightforward.
-
C1:
Every class c must have a time assignment t. For every class \(c\in C\):
$$\begin{aligned} \sum _{t\in T_c} x_{c,t} = 1 \end{aligned}$$
(5)
-
C2:
Every class c must be assigned a room r, where applicable. For every class \(c\in C\) requiring a room:
$$\begin{aligned} \sum _{r\in R_c} y_{c,r} = 1 \end{aligned}$$
(6)
-
C3:
Every student s must attend exactly one class c from each subpart \(P_f\) of the selected course configuration f for each course o that he must attend. For each student \(s\in S\), for each course \(o\in O_s\), for each course configuration \(f\in F_o\) and for each \(P_f\subset C\):
$$\begin{aligned} \sum _{c\in P_f} z_{s,c} = Z_{s,f} \end{aligned}$$
(7)
-
C4:
If a class c has a parent class \(c'\) defined, whenever the class c is assigned to a student s then the parent class \(c'\) must also be assigned to that student. For every student \(s\in S\) and class \(c\in C\) with a parent class \(c'\in C\):
$$\begin{aligned} z_{s,c} \le z_{s,c'}\end{aligned}$$
(8)
-
C5:
Every student must be assigned a course configuration \(f\in F_o\), for every course o that he attends. For every student \(s\in S\) and course \(o\in O_s\):
$$\begin{aligned} \sum _{f\in F_o} Z_{s,f} = 1 \end{aligned}$$
(9)
-
C6:
The capacity \(M_c\) of each class in terms of the number of students must be satisfied. For every class \(c\in C\):
$$\begin{aligned} \sum _{s\in S} z_{s,c} \le M_{c} \end{aligned}$$
(10)
-
C7:
A room cannot be used when it is unavailable. For every class \(c\in C\) and for every \(t\in T_c\) and \(r\in R_c\), if the time assignment t overlaps with a period of unavailability of the room r, then the two assignments cannot both be made:
$$\begin{aligned} x_{c,t} + y_{c,r} \le 1 \end{aligned}$$
(11)
-
C8:
Two classes cannot happen at the same time in the same room. For every pair of classes \(c_1, c_2 \in C\) and every potential common room assignment \(r \in R_{c_1}\cap R_{c_2}\), and every time assignment \(t_1\in T_{c_1}\) and \(t_2\in T_{c_2}\) where \(t_1\) and \(t_2\) overlap, not all four assignments can be made simultaneously:
$$\begin{aligned} x_{c_1,t_1} + x_{c_2,t_2} +y_{c_1,r} + y_{c_2,r} \le 3. \end{aligned}$$
(12)
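As an illustration, the C8 inequalities can be enumerated along the lines of the following minimal Python sketch. The data layout (dictionaries `times` and `rooms`, time assignments as `(day, start, end)` tuples) and the helper `overlaps` are assumptions made for this example only and are not taken from the authors' implementation; real time assignments also involve weeks and several days.

```python
from itertools import combinations

def overlaps(t1, t2):
    """Hypothetical overlap test: times are (day, start, end) tuples."""
    return t1[0] == t2[0] and t1[1] < t2[2] and t2[1] < t1[2]

def c8_constraints(times, rooms):
    """Yield (c1, t1, c2, t2, r) for each inequality
    x[c1,t1] + x[c2,t2] + y[c1,r] + y[c2,r] <= 3."""
    for c1, c2 in combinations(sorted(times), 2):
        # Only rooms that both classes may use can cause a conflict.
        common = sorted(set(rooms[c1]) & set(rooms[c2]))
        for r in common:
            for t1 in times[c1]:
                for t2 in times[c2]:
                    if overlaps(t1, t2):
                        yield (c1, t1, c2, t2, r)

# Toy instance: class A has two possible times, class B has one.
times = {"A": [("Mon", 8, 10), ("Tue", 8, 10)], "B": [("Mon", 9, 11)]}
rooms = {"A": ["R1", "R2"], "B": ["R1"]}
cuts = list(c8_constraints(times, rooms))
```

In this toy instance only the Monday time of class A overlaps with class B, and only room R1 is shared, so a single inequality is generated.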
Distribution constraints
In addition to the fundamental constraints, further linear constraints are added to the MIP to represent the hard distribution constraints. Since these constraints involve only the classes and not the students, they only contain the x and y variables.
The hard distribution constraints \(D_\mathrm{H}\) are included in the model in a similar way to C8: for every pair of classes \(c_1, c_2\in C_d\) of a hard distribution constraint \(d\in D_\mathrm {H}\), we determine whether a particular combination of time and room assignments results in a violated constraint. If that is the case, a constraint C9 is added as shown below.
-
C9:
For every pair of classes of a hard distribution constraint, forbid combinations of time and room assignments that result in a violated constraint. For every \(c_1, c_2\in C_d\) belonging to a hard distribution constraint \(d\in D_\mathrm {H}\) and every offending combination of \(t_1\in T_{c_1}\), \(t_2\in T_{c_2}\), \(r_1\in R_{c_1}\) and \(r_2\in R_{c_2}\):
$$\begin{aligned} x_{c_1,t_1} + y_{c_1,r_1} + x_{c_2,t_2} + y_{c_2,r_2} \le 3.\end{aligned}$$
(13)
For example, consider a SameDays constraint on the classes \(\{c_1, c_2\}\) requiring the two classes to take place on the same day. For every pair of time assignments \(t_1\in T_{c_1}\) and \(t_2\in T_{c_2}\) representing different days, one constraint (13) is added, regardless of the choice of rooms (for any \(y_{c_1,r_1}\), \(y_{c_2,r_2}\)).
The formulation C9 works for all distribution constraints that can be expressed as separate statements ‘for each pair of classes’ belonging to the distribution constraint. These constraints are the following 15 types of distribution constraint of Müller et al. (2018): SameStart, SameTime, DifferentTime, SameDays, DifferentDays, SameWeeks, DifferentWeeks, SameRoom, DifferentRoom, Overlap, NotOverlap, SameAttendees, Precedence, WorkDay and MinGap.
The remaining four types of distribution constraints, however, cannot be fully represented as a set of constraints among pairs of classes. These are the MaxDays, MaxDayLoad, MaxBreaks and MaxBlock types of constraints, which are referred to as ‘special distribution constraints’ in the description of the instances on the competition website.
For these special distribution constraints, it is necessary that the corresponding inequalities C9 are satisfied for every pair of classes in the constraint, but this is not sufficient. A simple counterexample is to consider a MaxDays constraint requiring three classes to take place over a total of 2 days or less: in that case, the constraint will always be satisfied for every pair of classes but, taken as a group of three, the classes will violate the constraint if they take place on three different days.
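The counterexample can be checked with a few lines of Python. This is only a sketch: the function `max_days_ok` and the representation of each class by the set of days it uses are assumptions made for illustration.

```python
from itertools import combinations

def max_days_ok(days_per_class, limit):
    """True if the classes together use at most `limit` distinct days."""
    used = set().union(*days_per_class)
    return len(used) <= limit

# Three classes, each on a different day, with a MaxDays limit of 2.
assignment = {"c1": {"Mon"}, "c2": {"Tue"}, "c3": {"Wed"}}

# Every pair of classes respects the limit...
pairwise = all(
    max_days_ok([assignment[a], assignment[b]], 2)
    for a, b in combinations(assignment, 2)
)
# ...but the three classes taken together do not.
group = max_days_ok(list(assignment.values()), 2)
```

Here `pairwise` evaluates to `True` while `group` evaluates to `False`, which is why the pairwise inequalities C9 are necessary but not sufficient for this type of constraint.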
In our implementation, we begin by generating the necessary constraints C9 for each pair of classes in the special distribution constraint as usual, and we use a lazy constraint callback of the solver to ensure feasibility. Each time a feasible solution is found during the solution of the MIP, we check whether it violates any of the special distribution constraints. If a special constraint is violated, we reject the current solution by adding the following constraint:
-
C10:
Every time an integer solution is found, check that all hard special distribution constraints present are satisfied. If any is violated, add the following inequality and continue the optimization:
$$\begin{aligned} (x_{c_1,t_1} + y_{c_1,r_1}) + \cdots + (x_{c_N,t_N} + y_{c_N,r_N}) \le 2N-1 \nonumber \\ \end{aligned}$$
(14)
where N is the number of classes contained in the specification of the special constraint which was found to be violated. This inequality forbids the current combination of x and y variables of the N classes. The constraints C10 are too numerous to include in the formulation from the start of the optimization, and with this approach, they are only added as needed.
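The effect of inequality (14) can be sketched in Python as follows. The functions `no_good_cut` and `satisfies`, and the representation of a solution as a mapping from class to a `(time, room)` pair, are hypothetical and serve only to illustrate that the cut removes exactly one combination of assignments; they do not reproduce any solver callback API.

```python
from itertools import product

def no_good_cut(assignment):
    """assignment maps class -> (time, room). Returns (variables, rhs)
    representing: sum of the listed x and y variables <= 2N - 1."""
    variables = []
    for c, (t, r) in assignment.items():
        variables.append(("x", c, t))
        variables.append(("y", c, r))
    return variables, 2 * len(assignment) - 1

def satisfies(cut, candidate):
    """Evaluate the cut on a candidate class -> (time, room) mapping."""
    variables, rhs = cut
    lhs = 0
    for kind, c, value in variables:
        t, r = candidate[c]
        lhs += (t == value) if kind == "x" else (r == value)
    return lhs <= rhs

times, rooms = ["t1", "t2"], ["r1", "r2"]
current = {"c1": ("t1", "r1"), "c2": ("t2", "r1"), "c3": ("t1", "r2")}
cut = no_good_cut(current)

# Enumerate all 4^3 assignments and collect those rejected by the cut.
options = list(product(times, rooms))
excluded = []
for combo in product(options, repeat=len(current)):
    candidate = dict(zip(current, combo))
    if not satisfies(cut, candidate):
        excluded.append(candidate)
```

Of the 64 possible assignments in this toy case, only `current` is excluded, so the cut is valid but weak: each violated combination has to be discovered and removed individually.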
Objective function
The objective function of the problem consists of four linear terms with instance-specific weights. The first two terms are simply the weighted sum of the time assignment x and room assignment y variables. The last two terms relate to violated soft distribution constraints and student clashes. In the rest of this section, we describe the inclusion of these two elements into the model, and in particular the student clashes which are the most complex part of the formulation. Despite the added complexity, the inclusion of all four terms in the objective function is crucial in order to obtain a solution of good quality.
The third term of the objective relates to penalties due to violated soft distribution constraints \(D_\mathrm{S}\). These penalties must be added to the objective for each pair of classes for which the soft distribution constraint \(d \in D_\mathrm{S}\) is violated. This is very convenient as we can use the same approach as for the hard distribution constraints (13) by adding an auxiliary indicator variable AUX denoting whether the constraint is violated (AUX = 1) or not (AUX = 0). The variable \(\mathrm {AUX}\) is then added directly into the objective function with the appropriate weighting. This variable is created for every soft distribution constraint \(d \in D_\mathrm{S}\) and for every pair of classes \(c_1, c_2\in C_d\) in the description of that constraint (\(\mathrm {AUX}_{d,c_1,c_2}\)). Several inequalities (15) may refer to the same AUX variable, as there are many ways that a given soft constraint can be violated by two classes, but we only account for this once in the objective. Similarly to the hard constraints, we generate C11 only for combinations of x and y that result in a violation.
-
C11:
For every pair of classes \(c_1, c_2\in C_d\) of a soft distribution constraint \(d\in D_\mathrm{S}\), create a new 0-1 variable \(\mathrm {AUX}_{d,c_1,c_2}\). Then for every combination of \(t_1\in T_{c_1}\), \(t_2\in T_{c_2}\), \(r_1\in R_{c_1}\) and \(r_2\in R_{c_2}\) which results in a violation of the constraint, add the inequality:
$$\begin{aligned} x_{c_1,t_1} + y_{c_1,r_1} + x_{c_2,t_2} + y_{c_2,r_2} - \mathrm{\mathrm {AUX}}_{d,c_1,c_2} \le 3.\end{aligned}$$
(15)
Inequalities (15) force the variable AUX to take the value 1 whenever the soft constraint is violated. There is no need to require AUX = 0 when this does not happen, as the objective function (a minimization problem) will ensure this. For the same reason, it is not necessary to specify the AUX variables as integer; instead, they are added to the model as continuous 0–1 variables.
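The claim that the AUX variables need not be declared integer can be verified by enumeration. In the sketch below, `min_aux` is a hypothetical helper returning the value that a minimizing objective would give to AUX for fixed binary values of the other four variables in (15).

```python
from itertools import product

def min_aux(x1, y1, x2, y2):
    """Smallest AUX in [0, 1] with x1 + y1 + x2 + y2 - AUX <= 3."""
    return max(0, x1 + y1 + x2 + y2 - 3)

# Collect the optimal AUX value over all 16 binary combinations.
values = {min_aux(*bits) for bits in product((0, 1), repeat=4)}
```

The set `values` contains only 0 and 1, and the value 1 occurs only when all four assignments are made, so the continuous relaxation of AUX is integral at the optimum.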
Just as in the case of the hard constraints, consideration should be given to the soft special constraints (MaxDays, MaxDayLoad, MaxBreaks, MaxBlock) which are not fully represented by C11. They can be represented using inequalities similar to (15) with a larger number of terms:
$$\begin{aligned}&(x_{c_1,t_1} + y_{c_1,r_1}) + \cdots + (x_{c_N,t_N} + y_{c_N,r_N})\nonumber \\&\quad - \mathrm{\mathrm {AUX}}_{d,c_1,\ldots ,c_N}\le 2N-1 \end{aligned}$$
(16)
where N is the number of classes in the constraint d and \(\mathrm{AUX}_{d,c_1,\ldots ,c_N}\) is a continuous 0–1 variable. The number of such inequalities, however, is very large.
One could simply omit these terms from the objective, but this would lead to poor-quality solutions for the instances with a large number of such constraints. Instead, as our solution method consists of a sequence of optimization runs, we adopted a simple approach in which we maintain a log of any violated soft special distribution constraints detected during the branch-and-bound, which are then added to the model during the next run. The optimization begins without any of these soft constraints included in the objective function. During each optimization run, whenever a feasible solution is produced, we check for violated soft special distribution constraints. Any that are found are written to a file, which is read at the beginning of the next optimization run; the associated constraints (16) are then added to the model and the \(\mathrm{AUX}_{d,c_1,\ldots ,c_N}\) variables are included in the objective function as described later. In the long term, this external file will contain the subset of soft special distribution constraints that merit inclusion in the objective, having been found to be violated in past runs. In practice, we observed that the size of this file tends to stabilize over time, suggesting that eventually only a limited number of these special soft constraints are useful to include in the objective function.
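The bookkeeping between runs can be sketched as below. The JSON file format, the function names `load_logged` and `record_violations`, and the use of string identifiers for the violated constraints are all assumptions made for this illustration; the text does not specify how the log file is stored.

```python
import json
from pathlib import Path

def load_logged(path):
    """Read the set of logged constraint identifiers (empty if no file yet)."""
    p = Path(path)
    return set(json.loads(p.read_text())) if p.exists() else set()

def record_violations(path, violated_ids):
    """Merge newly violated constraints into the log and return the full set."""
    logged = load_logged(path) | set(violated_ids)
    Path(path).write_text(json.dumps(sorted(logged)))
    return logged
```

At the start of a run, `load_logged` would supply the constraints for which inequalities (16) are generated; after each feasible solution, `record_violations` would extend the log. Because the set only grows when a previously unseen violation appears, its size can stabilize once the relevant constraints have all been recorded.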
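The fourth and last term of the objective relates to the penalties associated with student clashes. We distinguish between two types of student clash: (i) the two classes attended by the student overlap in time; and (ii) the two classes do not overlap, but the time between them is insufficient to travel between the assigned rooms.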
The first type is simpler because we do not need to consider the room assignments: if two classes overlap, a clash exists regardless of the room assignment. For the second type, we need to consider the rooms allocated to the classes, as it is possible that a clash exists for some room assignments but not for others.
For the first type, for each student s and for each pair of classes \(c_1\), \(c_2\) he may attend and for each combination of their time assignments \(x_{c_1,t_1}\) and \(x_{c_2,t_2}\) which overlap, an auxiliary variable aux can be added denoting whether a clash is present for this student or not. The corresponding constraint is similar to (15):
$$\begin{aligned} x_{c_1,t_1} + x_{c_2,t_2} + z_{s,c_1} + z_{s,c_2} - aux_{c_1,t_1,c_2,t_2,s} \le 3,\end{aligned}$$
(17)
where \(0\le aux \le 1\) is a continuous variable. Let \(T'_{c_2} \subseteq T_{c_2}\) be the set of values of \(\{t_2\}\) for which the inequalities (17) are generated. Since the class \(c_2\) can only have one time assignment (constraint C1) and therefore only one of the variables \(x_{c_2,t_2}\) can take the value 1, we can aggregate the inequalities (17) corresponding to different values of \(t_2\in T'_{c_2}\) into a single inequality (18):
$$\begin{aligned} x_{c_1,t_1} + \sum _{t_2\in T'_{c_2}} x_{c_2,t_2} + z_{s,c_1} + z_{s,c_2} - aux_{c_1,t_1,c_2,s} \le 3,\end{aligned}$$
(18)
where \(c_1,c_2\in C\), \(t_1\in T_{c_1}\), \(s\in S\). Equivalently, the aggregation can be done by the class \(c_1\) instead of \(c_2\) (in the implementation we used the variant that produced the smallest number of constraints—see also the discussion on constraint aggregation later on).
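The equivalence between the separate inequalities (17) and the aggregated inequality (18) can be checked by enumeration on a toy case. The time labels and helper functions below are hypothetical; the check relies on constraint C1, so exactly one time of class \(c_2\) is selected in each case.

```python
from itertools import product

T2 = ["t2a", "t2b", "t2c"]   # all possible times of class c2
T2_overlap = ["t2a", "t2b"]  # times overlapping t1, i.e. the set T'_{c2}

def penalty_separate(x1, x2, z1, z2):
    # One inequality (17) and one aux variable per overlapping t2;
    # the objective sees the sum of the minimal aux values.
    return sum(max(0, x1 + x2[t] + z1 + z2 - 3) for t in T2_overlap)

def penalty_aggregated(x1, x2, z1, z2):
    # Single inequality (18) with the x variables of c2 summed.
    return max(0, x1 + sum(x2[t] for t in T2_overlap) + z1 + z2 - 3)

same = True
for chosen in T2:  # C1: exactly one time is selected for c2
    x2 = {t: int(t == chosen) for t in T2}
    for x1, z1, z2 in product((0, 1), repeat=3):
        same &= (penalty_separate(x1, x2, z1, z2)
                 == penalty_aggregated(x1, x2, z1, z2))
```

The flag `same` remains `True`: since at most one \(x_{c_2,t_2}\) equals 1, at most one of the separate aux variables is forced to 1, which matches the single aggregated variable.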
The constraints (18) are still too numerous to handle efficiently in the model because each student and each possible pair of classes are considered individually. We can further reduce the number of constraints significantly by grouping the inequalities (18) among students who follow the same courses. To do this, we need to introduce the auxiliary variables \(w_{s,c_1,c_2}\) which denote whether a student s attends both classes \(c_1\) and \(c_2\) or not:
$$\begin{aligned} w_{s,c_1,c_2} = \left\{ \begin{array}{ll} 1, &{} \text{ if } z_{s,c_1}=1\hbox { and }z_{s,c_2}=1, \\ 0, &{} \text{ otherwise } \\ \end{array} \right. \end{aligned}$$
(19)
with the necessary constraints \(0\le w \le 1\) and \( w_{s,c_1,c_2} \ge z_{s,c_1} + z_{s,c_2} -1\). Note that as a further simplification we only create the w variables when necessary, because in many cases they can be substituted by one of the corresponding z variables or even their value may be fixed. This will happen when a student is obliged to attend a particular class (\(z_{s,c_1}=1\)) therefore \(w_{s,c_1,c_2}=z_{s,c_2}\), or alternatively if he can never attend a class (\(z_{s,c_1}=0\)) in which case \(w_{s,c_1,c_2}=0\). Using the variables w, the inequalities (18) can be grouped for all N students who may attend the two classes \(c_1\) and \(c_2\) to produce:
$$\begin{aligned}&{Nx_{c_1,t_1} + N\sum _{t_2\in T'_{c_2}} x_{c_2,t_2} + }\nonumber \\&\quad + (w_{s_1,c_1,c_2} + \cdots + w_{s_N,c_1,c_2}) - \mathrm{\mathrm {AUX}}_{c_1,t_1,c_2} \le 2N \end{aligned}$$
(20)
where \(\mathrm{\mathrm {AUX}}_{c_1,t_1,c_2}\) is a new continuous variable between 0 and N, equal to the sum of the original variables \(aux_{c_1,t_1,c_2,s}\), denoting the number of students who cannot attend both classes \(c_1\) and \(c_2\) simultaneously.
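A small numerical check shows how the coefficients N in (20) switch the inequality on and off. The helper `min_aux_grouped` is hypothetical and returns the smallest non-negative value of the grouped AUX variable permitted by (20) for given values of the other variables.

```python
def min_aux_grouped(x1, x2_sum, w):
    """Smallest AUX >= 0 satisfying (20), where x1 = x[c1,t1],
    x2_sum = sum of x[c2,t2] over T'_{c2}, and w lists the N values
    w[s,c1,c2] of the students who may attend both classes."""
    n = len(w)
    return max(0, n * x1 + n * x2_sum + sum(w) - 2 * n)

# Both overlapping times selected: AUX counts the students attending both classes.
both_times = min_aux_grouped(1, 1, [1, 0, 1])
# Only one of the two times selected: the inequality is inactive.
one_time = min_aux_grouped(1, 0, [1, 1, 1])
```

With both times selected, the left-hand side reduces to the sum of the w variables, so `both_times` equals 2. When either time is not selected, the right-hand side 2N can absorb all N students and `one_time` equals 0.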
For the second type of student clash, where two classes do not overlap but the time between them is not sufficient to allow the travel between the assigned rooms, we begin by considering the following special case. For each pair of classes, consider all the possible rooms that can be assigned to them, and calculate the minimum travel distance \(M\ge 0\) among any possible rooms of the two classes. The values of M are specific to each problem instance and can be pre-calculated for efficiency.
If the time assignments \(x_{c_1, t_1}\) and \(x_{c_2,t_2}\) of two classes do not overlap but still have a gap between them of less than M periods, we can still use the inequalities (20) of the previous case, because a clash will exist regardless of the room allocation. In other words, if two classes do not overlap, but their gap in time is smaller than the time that is necessary under the best room assignment, then we do not need to consider the room assignments and this case is dealt with in the same way as the previous one.
For the remaining cases of the second type of student clash, we must include the terms relating to the room allocation y:
$$\begin{aligned}&{x_{c_1,t_1} + \sum _{t_2\in T'_{c_2}} x_{c_2,t_2} + z_{s,c_1} + z_{s,c_2} + y_{c_1,r_1} + y_{c_2,r_2}}\nonumber \\&\quad - aux_{c_1,t_1,r_1,c_2,r_2,s} \le 5. \end{aligned}$$
(21)
As before, we can group together all N students \(s_i\in S\) who may attend classes \(c_1\) and \(c_2\), as was done for (20), to obtain:
$$\begin{aligned}&{Nx_{c_1,t_1} + N\sum _{t_2\in T'_{c_2}} x_{c_2,t_2} + Ny_{c_1,r_1} + Ny_{c_2,r_2} +}\nonumber \\&\quad + (w_{s_1,c_1,c_2} + \cdots + w_{s_N,c_1,c_2}) - \mathrm{\mathrm {AUX}}_{c_1,t_1,r_1,c_2,r_2} \le 4N. \nonumber \\ \end{aligned}$$
(22)
The inequality (22) is generated for any \(c_1,c_2\in C\), \(t_1\in T_{c_1}\), \(r_1\in R_{c_1}\), \(r_2\in R_{c_2}\). The variable AUX is a new continuous variable between 0 and N denoting the number of students who cannot attend both classes \(c_1\) and \(c_2\) because their time and room assignments result in insufficient travel time.
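The distinction between room-independent clashes, handled by (20), and room-dependent clashes, handled by (22), can be sketched as follows. The `travel` dictionary, the helper names, and the rule that a clash occurs when the travel time exceeds the gap are assumptions made for this example; travel within the same room is taken to be zero.

```python
def travel_time(r1, r2, travel):
    """Hypothetical symmetric lookup of travel time (in periods) between rooms."""
    if r1 == r2:
        return 0
    return travel.get((r1, r2), travel.get((r2, r1), 0))

def clash_rooms(gap, rooms1, rooms2, travel):
    """Classify a pair of non-overlapping times separated by `gap` periods."""
    pairs = [(r1, r2) for r1 in rooms1 for r2 in rooms2]
    # Minimum travel time M over all possible room pairs of the two classes.
    m = min(travel_time(r1, r2, travel) for r1, r2 in pairs)
    if gap < m:
        return "all"  # clash for every room pair: use inequality (20)
    # Otherwise one inequality (22) per room pair that cannot be reached in time.
    return [(r1, r2) for r1, r2 in pairs if travel_time(r1, r2, travel) > gap]

travel = {("R1", "R2"): 3, ("R1", "R3"): 1}
always = clash_rooms(0, ["R1"], ["R2", "R3"], travel)
some = clash_rooms(2, ["R1"], ["R2", "R3"], travel)
```

In this toy case M equals 1, so a gap of 0 periods gives a clash for every room pair (`always` is `"all"`), whereas a gap of 2 periods only rules out the pair (R1, R2).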
The inequalities (20) and (22) fully describe the student clashes, and the corresponding variables \(\mathrm {AUX}\) can be directly added into the objective function with the appropriate weights. In the implementation, we always generated (20), and for small problems we were able to generate all of the constraints (22). For larger instances, we used a method similar to the one used for the soft constraints: at the end of each optimization run, we check for those student clashes that were omitted from the objective and record the corresponding constraints (22) in a log file, to be used during the next optimization run.
It is worth noting that constraints (22) are not student-specific and correctly account for any number of students having the same clash. This means that adding a constraint during the next optimization run will not result in simply exchanging the affected student with another.
The full objective function containing all four cost elements to minimize, namely time allocation, room allocation, distribution constraints, student clashes, is therefore as follows:
$$\begin{aligned} \displaystyle \mathrm{obj}&= \displaystyle W_\mathrm{T} \sum _{c\in C}\sum _{t\in T_c} p_{c,t} x_{c,t} + W_\mathrm{R} \sum _{c\in C}\sum _{r\in R_c} p_{c,r} y_{c,r}+\nonumber \\&\displaystyle \quad + W_\mathrm{D} \sum _{d\in D_\mathrm{S}} p_{d} \left( \sum _{c_1,c_2\in C_d} \mathrm {AUX}_{d,c_1,c_2} + \sum _{\text{ file }} \mathrm{AUX}_{d,c_1,\ldots ,c_N}\right) \nonumber \\&\displaystyle \quad + W_\mathrm{S} \left( \sum _{c_1,c_2\in C} \sum _{t_1\in T_{c_1}} \mathrm{AUX}_{c_1,t_1,c_2} + \sum _{\text{ file }} \mathrm{AUX}_{c_1,t_1,r_1,c_2,r_2}\right) \nonumber \\ \end{aligned}$$
(23)
where \(W_\mathrm{T}\), \(W_\mathrm{R}\), \(W_\mathrm{D}\) and \(W_\mathrm{S}\) are the weights corresponding to the time penalty, room penalty, distribution penalty and student clashes, the constants p represent the weights attributed to each possible choice of time, room and violated soft constraint and the last summation denotes a sum over all the variables \(\mathrm{AUX}_{c_1,t_1,r_1,c_2,r_2}\) of the specific type of student clashes which were read from an external log file.
Variable and constraint simplification strategies
The presented model accurately represents the timetabling problem in the sense that any feasible solution of the MIP corresponds to an acceptable solution for the timetabling problem, and, conversely, no solution to the real timetabling problem is infeasible for the MIP.
Nonetheless, it is possible to reduce the size of the MIP formulation by implementing a number of strategies for eliminating variables and reducing the number of constraints. This approach is more efficient than relying on the commercial MIP solver to detect and eliminate redundant variables and constraints. The strategies used in our computational implementation are described in the next sections.
Elimination of variables whose value is fixed
The first step is to consider decision variables whose values can be deduced; they are fixed to 0 or 1 accordingly and eliminated from the formulation. We use the following strategies:
-
Variables x, y, z and Z where only one choice is possible; the corresponding variable is set to 1. These are in the constraints C1, C2, C3, C5 with only one term in the left-hand side.
-
Variables that can be set to zero because another mutually exclusive variable is fixed to 1. This means that if \(\sum x_i =1\) and \(x_k\) is fixed to 1, then all the other variables can be set to zero (C1, C2, C3, C5).
-
If a class has both variables x and y fixed, and another class has its variable y fixed to the same room, we can then set to zero all variables x of the second class which overlap with the known time of the first class. In other words, if two classes must take place in the same room, and one class has its time fixed, we exclude all times from the other class that overlap with the first as no two classes can share the same room at the same time (C8).
-
Some variables z do not need to be generated, as they can be replaced by the corresponding variable Z. This is the case where a course subpart contains only one possible class (constraint C3 with only one term in the left-hand side).
-
The auxiliary variables w used to model student clashes need to be created only in certain situations as explained in (19). In the other cases, these variables will be either fixed or substituted by one of the corresponding z variables (which, in turn, may be replaced by a Z variable as described in the previous point).
Elimination of constraints that are always satisfied
Once the number of variables has been reduced, we proceed to identify sets of constraints that are always satisfied and can be removed from the formulation. We concentrate on constraints of the form \(\sum a_i x_i \le c\) with \(a_i>0\). These constraints are always satisfied, and therefore redundant, if \(\sum a_i \le c\). We did not consider elimination of equality constraints because they are relatively few, although the commercial solvers used were able to detect and eliminate them as needed.
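The redundancy test is simple enough to state in a few lines of Python. The function `is_redundant` is a hypothetical helper, and the example assumes a four-term inequality of the type (13) in which one y variable has already been fixed and removed from the left-hand side.

```python
def is_redundant(coeffs, rhs):
    """For binary x_i and a_i > 0, sum(a_i * x_i) <= rhs always holds
    when sum(a_i) <= rhs, so the constraint can be dropped."""
    if any(a <= 0 for a in coeffs):
        raise ValueError("the test assumes strictly positive coefficients")
    return sum(coeffs) <= rhs

# Original: x1 + y1 + x2 + y2 <= 3.
# If y2 is fixed to 0, three terms remain with right-hand side 3.
dropped = is_redundant([1, 1, 1], 3)
# If y2 is fixed to 1, the right-hand side becomes 2 and the constraint stays.
kept = not is_redundant([1, 1, 1], 2)
```

Fixing a variable to 0 removes a term without changing the right-hand side, which makes the remaining inequality vacuous, whereas fixing it to 1 tightens the right-hand side and the inequality must be kept.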
Reducing the number of constraints by aggregation
We can reduce the number of constraints in the formulation by taking advantage of the fact that exactly one time and/or room assignment is possible and grouping similar constraints together. This reduction was briefly described for the aggregation of constraints (17) into (18) in the objective function, but it can also be applied to many other constraints. For example, consider constraint C7, where combinations of x and y are excluded to forbid the use of a room when it is unavailable: we can replace a set of K constraints (\(1\le i\le K\)) of the form:
$$\begin{aligned} x_{c_1,t_1} + y_{c_1, r_i} \le 1 \end{aligned}$$
with a single equivalent constraint:
$$\begin{aligned} x_{c_1,t_1} + \sum _{1\le i\le K} y_{c_1, r_i} \le 1. \end{aligned}$$
(24)
Alternatively, one can aggregate the same constraints over the x variables instead (\(1\le i\le L\)), to produce:
$$\begin{aligned} \sum _{1\le i\le L}x_{c_1,t_i} + y_{c_1, r_1} \le 1. \end{aligned}$$
(25)
As these two choices are equivalent, we select the one that yields the smallest number of aggregated constraints. In the above example, if \(K\ge L\) we select the aggregation (24) to ‘exclude several rooms for a given time,’ otherwise we opt for (25) to ‘exclude several times for a given room.’ Aggregating these constraints has no impact on the feasibility of the problem.
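The choice between the two aggregations can be sketched as follows for a single class. The function `aggregate`, the representation of forbidden combinations as `(time, room)` pairs, and the tie-breaking rule in favour of grouping by time are assumptions made for this illustration.

```python
def aggregate(forbidden_pairs):
    """Group forbidden (time, room) pairs of one class either by time,
    as in (24), or by room, as in (25), whichever gives fewer inequalities."""
    by_time, by_room = {}, {}
    for t, r in forbidden_pairs:
        by_time.setdefault(t, set()).add(r)
        by_room.setdefault(r, set()).add(t)
    # Each dictionary entry corresponds to one aggregated inequality.
    if len(by_time) <= len(by_room):
        return "by_time", by_time
    return "by_room", by_room

pairs = [("t1", "r1"), ("t1", "r2"), ("t1", "r3"), ("t2", "r1")]
kind, groups = aggregate(pairs)
```

In this example the four original inequalities collapse to two when grouped by time (one for `t1` excluding three rooms, one for `t2` excluding one room), compared with three when grouped by room, so the grouping by time is selected.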