Multi-neighborhood simulated annealing for the sports timetabling competition ITC2021

We describe the solver that we developed for the Sports Timetabling Competition ITC2021: a three-stage simulated annealing approach that makes use of a portfolio of six different neighborhoods. Five of these neighborhoods are taken from the literature on round-robin tournament scheduling, whereas the last one, denoted as PartialSwapTeamsPhased, is a novel contribution specifically designed for the phased version of the problem. We perform a comprehensive and statistically principled tuning procedure to find the best combination of parameters for the competition instances. We pay specific attention to evaluating the contribution of the new neighborhood PartialSwapTeamsPhased, which yielded better results on most phased instances. Overall, the three-stage simulated annealing solver is able to find a feasible solution on 44 out of 45 instances, and it ranked second in both the first competition milestone and the final round. We also propose an Integer Linear Programming model implemented in CPLEX, which, unfortunately, did not produce significant results on the instances of the competition.


Introduction
Sports timetabling is an active research field, mainly due to the commercial interest in maximizing fan attendance (in person or remote) at sports events. Among the various possible structures for sports competitions, the round-robin tournament, in which each team plays against every other team, is the most frequently used in team sports.
Many variants of the round-robin tournament problem have been discussed in the literature. We consider here the version proposed for the International Timetabling Competition ITC2021: a double round-robin tournament (each team plays every other team twice), which takes into account a very rich set of constraints and objectives collected from real-world cases.
All versions of this problem have in common the fact that they are generally difficult to solve in practice. In fact, it is often hard to find optimal (or near-optimal) solutions even for instances of relatively small size, i.e., 16-20 teams, which is indeed the typical size of national championships.
As mentioned above, the ITC2021 problem considers a large set of constraints and objectives, known as hard and soft constraints, respectively. This formulation has the peculiarity that every single constraint can be stated as either hard or soft. Another characteristic of the ITC2021 formulation is that it abandons the classical mirrored structure, in which the second leg is identical to the first one with home and away positions swapped. Instead, the structure of ITC2021 instances is either completely free or phased. The latter imposes that each team meets all other teams in each leg, but not necessarily in the same order in the two legs.
In this paper, we describe the search method employed in our participation in the ITC2021. It is a three-stage simulated annealing approach that uses a portfolio of six different neighborhood structures. Five of them are classical ones, already proposed in the literature, whereas the sixth one, named PartialSwapTeamsPhased, is a variant of one of them that we specifically designed to deal with phased instances. Simulated annealing has also been used by other authors for sports timetabling with good results, suggesting that it is particularly suitable for this type of problem (see Sect. 2 on related work). In addition, we have experienced promising results with multi-neighborhood simulated annealing also on problems that show some similarities, such as Examination Timetabling (Bellio et al., 2021) or Minimum Interference Frequency Assignment.
Our solver has many parameters, and it has been tuned using the F-Race procedure (Birattari et al., 2010) over a set of experimental configurations designed using the Hammersley point set (Hammersley & Handscomb, 1964).
We also propose an Integer Linear Programming (ILP) model for the problem. We implemented it in CPLEX, but, unfortunately, it was able to solve systematically only small artificially generated instances, and it did not produce significant results on the instances of the competition even after long running times.
The paper is organized as follows. Section 2 is dedicated to the discussion of previous work. Section 3 introduces a mathematical model for the problem. Section 4 describes in detail our approach, and its experimental results are illustrated in Sect. 5. Finally, conclusions and future work are discussed in Sect. 6.

Related work
Interest in Sports Timetabling dates back to the 1970s. Initial research by Gelling (1973), Russell (1980), Wallis (1983), and de Werra et al. (1990) focused on the relationship between 1-factorizations of a complete graph and the Sports Timetabling Problem. In sports timetabling, 1-factorizations take the name of patterns, and de Werra (1981) proposed an easy way to generate a 1-factorization, which has been named the canonical pattern. Nevertheless, Rosa and Wallis (1982) and Dinitz et al. (1994) warned about the complexity of generating non-isomorphic 1-factorizations. Owing to this complexity, applications of metaheuristics to the Sports Timetabling Problem date back to the 1990s, with contributions from Costa (1995), Della Croce et al. (1999), and Hamiez and Hao (2000). In the 2000s, Ribeiro and Urrutia (2004), Anagnostopoulos et al. (2006), and Di Gaspero and Schaerf (2007) proposed a set of new neighborhoods for local search-based metaheuristics. They have been employed either with tabu search or simulated annealing and were particularly effective for the solution of the Traveling Tournament Problem (TTP), proposed by Easton et al. (2001). In the last decade, Lewis and Thompson (2011), Costa et al. (2012), and Januario and Urrutia (2016) worked on further heuristics and new neighborhoods for the solution of the Sports Timetabling Problem. More recently, Van Bulck et al. (2020b) introduced a unified data format for round-robin sports timetabling, named RobinX, which synthesizes 18 different constraints belonging to five different constraint groups, and they published a large set of instances in the proposed format. The RobinX format is employed in the Sports Timetabling Competition ITC2021.
More complete bibliographic reviews of sports timetabling can be found in Rasmussen and Trick (2008) and Kendall et al. (2010). An up-to-date bibliography is also available online and maintained by Knust (2010).

Problem formulation
We introduce here the ITC2021 problem through its Integer Linear Programming (ILP) model, and we refer to Van Bulck et al. 2021 for a comprehensive presentation.
Let n be an even number and T = {1, . . . , n} be the set of teams of a sports league. In a double round-robin tournament, each team i ∈ T plays a game against each other team j ∈ T, j ≠ i, twice: once at home and once away. We identify the home and away games of team i against team j, respectively, with the pairs (i, j) and (j, i). Hence, the set of games that have to be scheduled in the league is G = {(i, j) ∈ T × T : i ≠ j}. In addition, in a time-constrained tournament the number of rounds available to schedule the games of G has to be minimal. Then, R = {1, . . . , 2(n − 1)} is the set of the available rounds in a double round-robin tournament and, in every round r ∈ R, each team plays exactly once, either at home or away. A timetable is an assignment of exactly one round of R to each game in G. We say that a timetable is phased if the season is split into two legs, in each of which every team plays against all the other teams exactly once: teams i and j cannot play both their mutual games (i, j) and (j, i) in the same leg. We suppose that the first leg occurs in rounds 1, . . . , |R|/2, and the second one in rounds |R|/2 + 1, . . . , |R|.
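As a small self-contained sketch (ours, in Python; not part of the competition code), the basic sets of the model can be enumerated and sanity-checked as follows:

```python
# Basic sets of the double round-robin model (illustrative sketch, ours).
n = 6                                           # number of teams, even
T = list(range(1, n + 1))                       # teams 1..n
G = [(i, j) for i in T for j in T if i != j]    # (i, j): i at home vs j
R = list(range(1, 2 * (n - 1) + 1))             # time-constrained rounds

assert len(G) == n * (n - 1)     # every ordered pair of teams is a game
assert len(R) == 2 * (n - 1)     # n/2 games per round fill all rounds
```

With n = 6 this gives 30 games to place in 10 rounds, three games per round.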
Sport timetables usually consider several additional constraints. Specifically, we consider five groups of constraints. Capacity constraints regulate the number of home games, away games, or games in general that a team or a subset of teams can play in a given subset of rounds. Game constraints fix or forbid specific assignments of games to rounds. Break constraints are used to limit the number of breaks, that is, the number of consecutive home or away games of a team; breaks are mostly undesired in a fair timetable. Fairness constraints limit the difference in home games played by two teams after each round. Finally, separation constraints ensure that the mutual games of two teams are separated by a given number of rounds. We call C the set of these constraints. Set C contains hard and soft constraints: the former express fundamental properties of the timetable and must be satisfied, whereas the latter express preferences and can be violated. We denote by C^hard and C^soft the subsets of C containing, respectively, the hard and soft constraints. For each soft constraint c ∈ C^soft, we denote by w_c the weight associated with its violation.
For each game (i, j) ∈ G and each round r ∈ R, we introduce a binary variable x_ijr, which takes value 1 if game (i, j) is assigned to round r, and 0 otherwise. For each soft constraint c ∈ C^soft, we include a non-negative continuous variable d_c representing the deviation triggered if the constraint is violated.
The model, denoted by M, reads as follows.
Objective function (1) minimizes the weighted violation of the soft constraints. Constraints (2) and (3) define a timetable for the games in G in the rounds of R. Specifically, Constraint (2) imposes that every game is played, i.e., it is assigned to exactly one round, and Constraint (3) ensures that each team plays at most once per round. Finally, Constraint (4) guarantees that the timetable is phased, if required. In the following, we list the constraint types considered in set C. We first discuss the hard version of these constraints. Some additional notation, such as subsets of teams or rounds and parameters, may be required for each constraint c ∈ C. For example, if constraint c is identified by a team i ∈ T, we denote by T(i) and/or R(i), respectively, the subsets of teams and rounds considered by the constraint itself; the dependency on c is dropped to lighten the notation. Furthermore, we make explicit the correspondence between the constraints in set C and those considered in Van Bulck et al. (2021) to avoid ambiguities.
- Capacity Constraints (CA) Constraints (5) (CA1 and CA2 in Van Bulck et al. 2021) impose that team i plays at least k(i) and at most k̄(i) home games against the teams in subset T(i) ⊆ T in the rounds of set R(i) ⊆ R. These constraints can be used to model the so-called place constraints, which forbid a team to play at home in a given round, and the so-called top team and bottom team constraints, which prevent bottom teams from playing all their initial games against top teams. Then, Constraint (6) (CA3 in Van Bulck et al. 2021) forces team i to play at least k(i) and at most k̄(i) home games against the teams in subset T(i) ⊆ T in each sequence of r̄(i) rounds. Finally, Constraint (7) (CA4 in Van Bulck et al. 2021) imposes that the number of home games that the teams of a given subset of T play against the teams of another subset of T in a given subset of rounds has to be between k and k̄. These constraints are used, for example, to limit the total number of home games per round between teams that share the same venue. Similar constraints can be imposed on away games or on games in general.
- Game Constraints (GA) Given a subset of games G′ ⊆ G and a subset of rounds R′ ⊆ R, Constraint (8) (GA1 in Van Bulck et al. 2021) imposes a lower bound k and an upper bound k̄ on the number of games of G′ that can be played in the rounds of R′.
- Break Constraints (BR) A team i ∈ T has a home/away break in round r ∈ R \ {1} if i has a home/away game in both rounds r − 1 and r. To model these constraints, we introduce two binary variables y^h_ir and y^a_ir for each team i ∈ T and each round r ∈ R \ {1}. The break constraints read as follows.
Constraints (9) and (10) express the break constraints (BR1 and BR2 in Van Bulck et al. 2021).
- Fairness Constraints (FA) For all pairs of teams i, j ∈ T, i ≠ j, and all rounds r ∈ R, Constraint (15) (FA2 in Van Bulck et al. 2021) ensures that the difference between the home games played by i and those played by j is at most k̄(i, j) after round r. Analogous constraints can be applied to away games or to games in general.
- Separation Constraints (SE) The separation constraint (SE1 in Van Bulck et al. 2021) ensures that if one of the two mutual games (i, j) or (j, i) of two teams i, j ∈ T is assigned to round r, then the other one cannot be assigned to the rounds of R(i, j): games (i, j) and (j, i) must be separated by at least k(i, j) and at most k̄(i, j) rounds.
All constraints presented so far can be handled also in their soft version. Here, we discuss in general terms how their deviation is computed (see Van Bulck et al., 2020b, for a detailed description). We remark that all constraints c ∈ C have the same structure, i.e., they impose a lower and/or an upper bound on a linear expression f_c(x, y^h, y^a) in the variables x_ijr, y^h_ir, and y^a_ir, where lb_c and ub_c are, respectively, the lower and upper bounds imposed on f_c(x, y^h, y^a). Except for the fairness and separation constraints, the deviation triggered when one of these constraints is violated is given by two constraints imposing that d_c is at least lb_c − f_c(x, y^h, y^a) and at least f_c(x, y^h, y^a) − ub_c. For example, if c is a capacity constraint that imposes a lower bound k(i) and an upper bound k̄(i) on the number of home games that a team i ∈ T can play in the rounds of set R(i) (Constraint (5)), then the deviation triggered by c is equal to the number of home games of i in the rounds of R(i) below k(i) or above k̄(i). Finally, let us discuss how the deviation of the fairness and separation constraints is computed. A fairness constraint limits the difference in home (away or any) games between teams i ∈ T and j ∈ T after each round r ∈ R. However, the deviation triggered when it is violated is equal to the maximal difference in home (away or any) games in excess of k̄(i, j) between i and j over all the rounds of R. Hence, to express such a deviation, we need to include a non-negative continuous variable z_ij storing the maximal difference in home (away or any) games between i and j. For the case of home games, the deviation is computed by including the following constraints.
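For the case of home games, a plausible form of these constraints (our reconstruction, using the shorthand $h_i(r) = \sum_{r' \le r} \sum_{j \ne i} x_{ijr'}$ for the number of home games of team i up to round r) is:

```latex
z_{ij} \;\ge\; h_i(r) - h_j(r) \quad \forall r \in R, \qquad
z_{ij} \;\ge\; h_j(r) - h_i(r) \quad \forall r \in R,
```
```latex
d_c \;\ge\; z_{ij} - \bar{k}(i,j), \qquad d_c \ge 0,
```

so that z_ij captures the maximal absolute difference over all rounds and d_c the amount by which it exceeds k̄(i, j).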
Now, let c be the soft version of a separation constraint, which requires that the two mutual games of teams i, j ∈ T, i < j, are separated by at least k(i, j) rounds. The deviation triggered if c is violated has to be equal to the difference between k(i, j) + 1 and the number of rounds between games (i, j) and (j, i). The case where the two mutual games have to be separated by at most k̄(i, j) rounds can be treated similarly.

Solution method
We designed a three-stage multi-neighborhood simulated annealing approach for the solution of the problem. The multi-neighborhood is made up of a portfolio of six different local search neighborhoods, specifically tailored to sports timetabling problems. The metaheuristic employed for the search is a slightly modified version of the classical simulated annealing defined by Kirkpatrick et al. (1983). The search is executed in three distinct sequential stages, characterized by different parameter values of the metaheuristic and different restrictions of the search space. In this section, we first explain the general features of the search space and the method employed for the generation of the initial solution. Next, we discuss thoroughly the multi-neighborhood and the simulated annealing metaheuristic. Finally, we illustrate the characteristics of the three stages of execution of the algorithm.

Search space
Given the structure of the problem described in Sect. 3, as search space we consider the set of all two-leg round-robin timetables. This means that every possible round-robin timetable, though not necessarily feasible, is a solution in the search space. Thus, in every solution, each team plays every other team twice (home and away), and all teams play exactly one match in every round.
For the instances that require a phased timetable, we allow the algorithm to visit states that break the phase structure. Since the formulation of the problem does not provide an explicit phase constraint, we added an artificial tenth cost component that measures the number of matches violating the phase requirement. This number is then multiplied by a suitable weight, and the resulting value is assigned to the new cost component. This mechanism, which is applied only to phased instances, makes phase violations possible but penalized in the cost function.
A solution is internally described as a matrix of size |T| × |R|. Each cell (t, r) contains the index of the opponent of t in round r. The value is positive if t plays at home in round r, negative otherwise. Figure 1b provides an example of this encoding, which is also used in the figures in Sect. 4.3 for the explanation of the multi-neighborhood.

Initial solution generation
The initial state can be generated either randomly or through a greedy algorithm. The random procedure consists of random permutations of teams and rounds applied to the canonical pattern (see, e.g., de Werra, 1981). It produces a double round-robin tournament, but it does not provide any guarantee of feasibility with respect to the hard constraints of the problem; feasibility is then pursued by the simulated annealing procedure.
Given an input instance with |T| teams and |R| rounds, the random initial solution is generated by performing the following steps:
1. A single-leg canonical pattern for the |T| teams with |R|/2 rounds is generated. Each team meets every other team exactly once.
2. A random permutation is performed on the |T| teams.
3. The timetable is mirrored in order to obtain a two-leg tournament. At this point, the second leg is identical to the first one, except that the home-away order is inverted.
4. If the instance is not phased, a random permutation is executed on the |R| rounds. Otherwise, if the instance is phased, two random permutations are executed: the first one involves the rounds {0, . . . , |R|/2 − 1}, and the second one is performed on the rounds {|R|/2, . . . , |R| − 1}. Hence, the initial random solution does not violate the phase constraint.
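The steps above can be sketched as follows (our Python illustration, assuming the standard circle method for the canonical pattern; 0-based indices, as in step 4):

```python
import random

def canonical_single_leg(n):
    """Circle method: n-1 rounds, each a list of (home, away) games."""
    teams = list(range(n))
    rounds = []
    for r in range(n - 1):
        # Pivot team 0 alternates venue; the others are paired symmetrically.
        games = [(teams[0], teams[-1]) if r % 2 == 0 else (teams[-1], teams[0])]
        for k in range(1, n // 2):
            games.append((teams[k], teams[n - 1 - k]))
        rounds.append(games)
        teams = [teams[0]] + [teams[-1]] + teams[1:-1]  # rotate all but the pivot
    return rounds

def random_initial_solution(n, phased, rng=random):
    leg1 = canonical_single_leg(n)                       # step 1
    perm = list(range(n))
    rng.shuffle(perm)                                    # step 2: permute teams
    leg1 = [[(perm[h], perm[a]) for h, a in r] for r in leg1]
    leg2 = [[(a, h) for h, a in r] for r in leg1]        # step 3: mirror
    if phased:                                           # step 4: permute rounds
        rng.shuffle(leg1)
        rng.shuffle(leg2)                                # each leg separately
        return leg1 + leg2
    rounds = leg1 + leg2
    rng.shuffle(rounds)
    return rounds
```

By construction, a phased call never mixes first-leg and second-leg rounds, so the initial solution satisfies the phase requirement.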
The greedy algorithm is also based on the canonical pattern, which is used as a reference for the constructive steps. The idea behind the greedy algorithm is to generate and test the addition of candidate rounds, constructed on the basis of a reference tournament of template rounds obtained as in the random procedure. These rounds are templates rather than actual rounds, since all their possible perturbations, according to some of the symmetries inherent in round-robin tournaments, are produced in the generation process. In detail, the symmetries used are those among rounds (i.e., permuting the order of the rounds does not violate the round-robin tournament property) and those among the venues of each game.
Starting from an empty initial solution, the greedy process selects, at each step, the best candidate to be added to the current solution according to its contribution to constraint violations. Since the solution is incomplete, the check is restricted to only those constraints that can be (at least partially) evaluated in the current partial solution once it has been extended with any of the candidate rounds.

[Fig. 1: Example of the internal solution representation of a timetable for a round-robin tournament with 6 teams and 10 rounds: (a) the listing of the actual games in the tournament; (b) its encoding as an opponent matrix.]

In Fig. 2 we exemplify a step of the greedy process in the case of |T| = 6 teams in a non-phased setting. In the example, the current solution consists of four assigned rounds and six remaining template rounds (denoted by χ_i), which are still available from the initial canonical pattern. Different situations arise, for instance, in the generation of round candidates for the χ_1 and χ_2 templates.
As for the template χ_1, which has not yet been included in the solution, there is full freedom in deciding the home-away status of the games; therefore, all possible venue permutations can be generated and evaluated. Conversely, since one copy of the χ_2 template has already been included in the solution (namely in round 1), among all the possible venue permutations only the one that mirrors the already included copy of the template is possible. This is inherent in the fact that the reference tournament for the round templates is created through a concatenation of two single round-robin canonical patterns.
Each generated round candidate is tried for the completion of the current solution, and the partial cost of this addition is computed. For example, in the figure, the constraint BR1 reported above the current solution requires that no more than 2 breaks occur for team 5 during periods 1-5. This constraint can be partially checked for its cost (which is zero, since there are no more than 2 breaks) for the periods 1-3 already added to the solution, whereas it cannot be checked yet for period 4 and the possible candidate addition.
Among all the possible candidates, the one that achieves the minimum (partial) cost is selected and the corresponding template is removed from the set of available ones. Possible ties in the cost value are randomly broken.
In the case of a phased tournament, the greedy procedure is adapted to ensure that the two legs of the tournament are separated. In order to achieve this goal, the tournament used as a reference for the first leg consists of a single round-robin tournament template; after the first leg is completed, the second leg is constructed with another (possibly different, in terms of team permutations) single round-robin tournament, using the full generation of combinations but pruning those that overlap with the games already included in the first leg. (Refer to Van Bulck et al., 2020a, for a comprehensive explanation of all constraints employed in the competition.)
Finally, to obtain numerous diverse initial solutions also with the greedy procedure, the indices of the teams are randomly permuted. That is, before starting the process, a random permutation of the indices is drawn, and the mapping between the teams in the candidate round (i.e., the logical indices) and the actual teams is computed by applying this permutation. To enhance randomness, in the case of a phased tournament two distinct permutations are computed for the first-leg and the second-leg assignments.
As discussed further in Sect. 5.2, this choice seems to mildly outperform the random initial solution. Nevertheless, the improvement margin is not large, so we decided to keep both possibilities in our algorithm, leaving the user the option to choose the start method through an input parameter.

Multi-neighborhood relations
We propose a multi-neighborhood composed of the union of different neighborhoods. Five of them, called SwapHomes, SwapTeams, SwapRounds, PartialSwapTeams, and PartialSwapRounds, are adaptations of classical ones from Ribeiro and Urrutia (2004), Anagnostopoulos et al. (2006), and Di Gaspero and Schaerf (2007). In addition, we introduce a novel neighborhood called PartialSwapTeamsPhased, specifically designed to deal with phased instances. Experimental results highlight that the use of the novel neighborhood allows us to reach better solutions in terms of objective function and to achieve feasibility on certain large phased instances that would otherwise be very hard to tackle. All the neighborhoods ensure that the double round-robin structure of the tournament is preserved, but they do not provide any guarantee on the feasibility of the solution.
[Fig. 2: One step of the greedy constructive procedure in the case of a non-phased tournament: the procedure tries to complete the partial solution with the best possible combination of template assignment and game venues.]

The multi-neighborhood is designed to be employed by the simulated annealing metaheuristic, described in Sect. 4.4, which randomly draws a move from the multi-neighborhood at every iteration. A desirable feature of the multi-neighborhood is to give a higher frequency of execution to those moves that belong to neighborhoods that, on average, lead to the most significant improvements of the solution. Hence, an essential property of the multi-neighborhood is that each neighborhood is associated with a probability. The draw of a move in the simulated annealing is then executed in two steps. The first step is the random selection of one of the six neighborhoods, according to the given probabilities. The second step is the random selection of a move inside the chosen neighborhood, drawn uniformly at random.
The values of the probabilities are defined through a tuning procedure discussed in Sect. 5.2, while the six neighborhoods and their specifications are illustrated hereafter.
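The two-step draw can be sketched as follows (our Python illustration; the probability values shown are placeholders, not the tuned ones of Sect. 5.2):

```python
import random

# Placeholder per-neighborhood probabilities (illustrative, not tuned).
neighborhood_probs = {
    "SwapHomes": 0.10, "SwapTeams": 0.10, "SwapRounds": 0.15,
    "PartialSwapTeams": 0.25, "PartialSwapRounds": 0.25,
    "PartialSwapTeamsPhased": 0.15,
}

def draw_neighborhood(probs, rng=random):
    """Step 1: pick a neighborhood according to its probability."""
    names, weights = zip(*probs.items())
    return rng.choices(names, weights=weights, k=1)[0]
    # Step 2 (not shown) draws a move uniformly inside the chosen
    # neighborhood, e.g. by sampling its attributes uniformly at random.
```

Neighborhoods with larger weights are thus proposed proportionally more often over the course of the run.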

SwapHomes
The move SwapHomes takes as attributes two teams t_i, t_j ∈ T, t_i ≠ t_j, and it is denoted as SH⟨t_i, t_j⟩. It swaps the home/away position of the two games between t_i and t_j. Figure 3 shows the execution of the move.
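On the opponent-matrix encoding of Sect. 4.1, SwapHomes reduces to flipping two signs per game; a sketch (ours, in Python, with 1-based teams as in the encoding example):

```python
# Sketch of SwapHomes on the opponent-matrix encoding:
# table[t][r] = opponent of team t in round r, sign = venue (+ home, - away).
def swap_homes(table, ti, tj):
    """Swap the venue of the two games between ti and tj (in place)."""
    for r in range(len(table[ti])):
        if abs(table[ti][r]) == tj:        # ti meets tj in round r
            table[ti][r] = -table[ti][r]   # flip the venue for both teams
            table[tj][r] = -table[tj][r]
```

Only the two rounds in which t_i and t_j meet are touched, so the round-robin structure is trivially preserved.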

SwapTeams
The move SwapTeams takes as attributes two teams t_i, t_j ∈ T, t_i ≠ t_j, and it is denoted as ST⟨t_i, t_j⟩. It swaps the positions of t_i and t_j throughout the whole timetable. Figure 4 shows the execution of the move.

SwapRounds
The move SwapRounds takes as attributes two rounds r_i, r_j ∈ R, r_i ≠ r_j, and it is denoted as SR⟨r_i, r_j⟩. It swaps the two rounds in the timetable: all the matches assigned to r_i are moved to r_j, and vice versa. Figure 5 shows the execution of the move.

PartialSwapTeams

The move PartialSwapTeams takes as attributes two teams t_i, t_j ∈ T, t_i ≠ t_j, and a set of rounds R_s = {r_1, . . . , r_s}, R_s ⊂ R, and it is denoted as PST⟨t_i, t_j, R_s⟩. It swaps the positions of t_i and t_j in the rounds of R_s.

PartialSwapRounds
The move PartialSwapRounds takes as attributes two rounds r_i, r_j ∈ R, r_i ≠ r_j, and a set of teams T_s = {t_1, . . . , t_s}, T_s ⊂ T. The move is denoted as PSR⟨T_s, r_i, r_j⟩. It swaps between r_i and r_j the matches involving the teams in T_s. As the name suggests, it works similarly to SwapRounds, with the main difference that the move is not executed on the whole set of matches in the two rounds, but only on a subset of them.
A fundamental requirement for the construction of T_s is that every team t_k ∈ T_s plays only against teams of T_s in the two rounds r_i and r_j. In practice, though, large subsets T_s are not particularly desirable, because they considerably slow down the generation of the move, with a negative impact on the overall time performance of the algorithm. For this reason, we impose a limit on the maximal size of the set T_s during the generation of the move. Figure 7 shows the execution of the move.
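The closure requirement on T_s can be sketched as follows (our Python illustration on the opponent-matrix encoding; `closed_team_set` and `max_size` are our names, not from the solver):

```python
# Sketch: build T_s for PartialSwapRounds by starting from a seed team and
# repeatedly adding the opponents it meets in rounds ri and rj, until the
# set is closed (every member plays only members in both rounds).
# table[t][r] = opponent of team t in round r, sign = venue; teams 1-based.
def closed_team_set(table, seed, ri, rj, max_size):
    ts, frontier = {seed}, [seed]
    while frontier:
        t = frontier.pop()
        for r in (ri, rj):
            opp = abs(table[t][r])      # opponent of t in round r
            if opp not in ts:
                ts.add(opp)
                frontier.append(opp)
        if len(ts) > max_size:
            return None                 # set too large: reject the move
    return ts
```

Rejecting oversized sets mirrors the size limit described above: the closure can ripple through many teams, and swapping all of them is rarely worth the cost.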

PartialSwapTeamsPhased
The move PartialSwapTeamsPhased defines a novel neighborhood that we designed mainly to deal with the phased version of the problem. The five neighborhoods discussed so far work well on the non-phased instances, but turned out to be insufficient for obtaining good results on the phased ones. The neighborhood PartialSwapTeams, in particular, has quite disruptive side effects on the phase structure of the timetable, which are only sporadically beneficial to the search. PartialSwapTeamsPhased, on the other hand, allows the search to reach new solutions through partial swaps of teams without altering the current state of the phase. We discuss in this section the fundamentals of the neighborhood, and we refer the reader to Sect. 5.4 for an analysis of the experimental data.
As the name suggests, PartialSwapTeamsPhased takes inspiration from the above-mentioned PartialSwapTeams.
Therefore, the move PartialSwapTeamsPhased takes as attributes two teams t_i, t_j ∈ T, t_i ≠ t_j, and a set of rounds R_s = {r_1, . . . , r_s}, R_s ⊂ R. The essential prerequisite is that the matches involved in the move must all belong to the same mixed leg. The move is denoted as PSTP⟨t_i, t_j, R_s⟩ and, similarly to the move PartialSwapTeams, it swaps the positions of t_i and t_j in the rounds of R_s. Consequently, the maximum size of the set R_s corresponds to the size of a mixed leg, that is, |R_s| ≤ |R|/2.

Metaheuristic
The metaheuristic employed is essentially the traditional simulated annealing defined in Kirkpatrick et al. (1983).
At each iteration, a random move is drawn as explained in Sect. 4.3, and its corresponding cost variation Δ is computed. The move is always accepted if it is improving or sideways, i.e., Δ ≤ 0, and it is accepted with probability e^(−Δ/T) if Δ > 0 (the Metropolis acceptance criterion).
The temperature T starts from an initial value T 0 and decreases according to a geometric cooling scheme after a given number of iterations m, until it reaches the final temperature T min . That is, every m iterations we set T := α · T , where α is the cooling rate.
We make use of a cutoff mechanism to limit the number of evaluated moves at each temperature level (Johnson et al., 1989). Given a current temperature T and a cutoff value γ, when γ · m moves have already been accepted at temperature T, the algorithm stops evaluating moves at the current temperature and passes to the next one. The purpose of the cutoff is to reduce the computational time spent on chaotic states, common at high temperatures, when most moves are accepted.
Summarizing, the parameters of the metaheuristic are: the start temperature T_0, the final temperature T_min, the cooling rate α, and the cutoff rate γ.
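Putting these ingredients together, a minimal sketch of the annealing loop (ours; the cost function and move model below are stand-ins, not the solver's) is:

```python
import math
import random

def simulated_annealing(cost, random_move, s0, T0, Tmin, alpha, m, gamma,
                        rng=random):
    """Geometric-cooling SA with the cutoff mechanism described above."""
    s, T = s0, T0
    while T > Tmin:
        accepted = 0
        for _ in range(m):                     # at most m moves per level
            s1 = random_move(s, rng)
            delta = cost(s1) - cost(s)
            # Metropolis criterion: always accept Delta <= 0,
            # accept Delta > 0 with probability exp(-Delta / T).
            if delta <= 0 or rng.random() < math.exp(-delta / T):
                s, accepted = s1, accepted + 1
            if accepted >= gamma * m:          # cutoff: level is too "hot"
                break
        T *= alpha                             # geometric cooling
    return s
```

On a toy continuous problem (minimizing x² from a random-step move), the loop behaves as expected: the cutoff skips most of the high-temperature levels, and the final state settles near the minimum.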

Stages of the algorithm
The algorithm is structured into three distinct stages, consisting of three independent runs of the simulated annealing procedure described in Sect. 4.4. The first stage starts its search either from a random or from a greedy solution; the second and third stages are warm-started with the output of the previous stage as initial solution. The differences between the stages consist in the restrictions applied to the search space and in the exclusion or inclusion of certain constraints (see Table 1).
Stage 1: Hard constraints are included in the cost function, and soft constraints are not considered. The objective of Stage 1 is to rapidly find a feasible solution. In our experiments, this stage shows its effectiveness specifically on large phased instances, where the metaheuristic faces most problems in finding a feasible timetable.

Stage 2: All constraints are applied, and both feasible and infeasible regions are explored. The costs associated with hard constraints and with the phased cost component are multiplied by suitable weights. The goal of Stage 2 is to find a good solution in terms of objective function.

Stage 3: All constraints are applied, but moves that violate hard constraints are not allowed, so that only the feasible region is explored. The goal of Stage 3 is to improve the quality of the solution without leaving the feasible region.

Experimental results
In this section, we discuss the experimental setting of the simulated annealing algorithm and the results obtained by applying it to the instances proposed for the ITC2021 competition. We also assess the performance of Model M by running it on CPLEX within a time limit. For these latter experiments, we consider the instances of the competition and some others of smaller size obtained by performing reductions on the competition instances. Our code was developed in C++ and compiled with GNU g++ version 9.3.0 on Ubuntu 20.04.2 LTS. The tuning phase was partially performed on a cluster of virtual machines provided by the CINECA consortium. All the other experiments presented in this section were run on a machine equipped with an AMD Ryzen Threadripper PRO 3975WX processor with 32 cores, hyper-threaded to 64 virtual cores, with a base clock frequency of 3.5 GHz, and 64 GB of RAM. In both settings, a single virtual core was used for each experiment.

Instances
The algorithm was run on the 45 instances of the competition, which are available for download on the competition website (Van Bulck et al., 2020a). We performed a general analysis of their main features. The size, expressed in number of teams, always ranges between 16 and 20, and 22 instances in total have a phased requirement. As we can observe in Table 2, the total number of hard and soft constraints fluctuates considerably among instances. The number of hard constraints goes from a minimum of 51 in instance Late_11 to a maximum of 246 in instances Early_10, Early_11, Middle_2, and Late_2. The number of soft constraints goes from a minimum of 34 in instance Late_4 to a maximum of 1231 in instance Middle_2. Incidentally, instance Middle_2, which has the highest overall number of constraints (1477), is also the only instance for which our solver was not able to determine any feasible solution. More generally, our experimental results show that the instances on which our solver struggles most are also among those characterized by many hard constraints, but only when the phase requirement is also present. For this reason, the solver redistributes the total number of iterations in favor of Stage 1 when it recognizes a phased instance with a number of hard constraints above a certain threshold. Table 3 provides additional details on the number of constraints of each type in their hard and soft versions. It is possible to notice that constraint types BR2, FA2, and SE1, when present, always appear exactly once, so only six constraint types out of nine are actually instantiated as multiple constraints. Finally, it is worth mentioning that each individual constraint involves different quantities of teams or slots, so that its individual contribution to the instance complexity may also differ substantially.

Parameters and tuning
For the three-stage multi-neighborhood algorithm to be effective on the given instances, several parameters need to be tuned. In this work, we tuned the probabilities of the six neighborhoods separately on the phased and on the non-phased instances. These probabilities are stage-independent. The specific parameters of the simulated annealing metaheuristic, on the other hand, were tuned separately for each stage. For the simulated annealing, we decided to tune only the start temperature and the final temperature, which turned out to be the most critical parameters in previous research work (see, e.g., Bellio et al., 2016, 2021). Conversely, we fixed the sample acceptance ratio and the cooling rate to consolidated and robust values found in previous work. In our algorithm, we assigned weights to the different hard cost components, and these weights also underwent a tuning procedure. They were employed in Stage 1 and Stage 2, since moves that violate feasibility are not allowed in Stage 3. Finally, the decision whether to use a random or a greedy start for Stage 1 was also subject to a tuning procedure. As introduced in Sect. 4.5, in Stage 1 and Stage 2 we allow the exploration of the infeasible region and, for phased instances, also the violation of the phase structure. Thus, the weights assigned to hard violations and phase violations also require tuning. In this case, we did not only search for possible values, but compared two different approaches: feature-based or fixed values. Specifically, the only feature that we take into account for this problem is the number of hard constraints in the given instance. Let N_h be the number of hard constraints, let w_h and w_p be, respectively, the hard and phased cost component weights, and let k be a generic constant.
Equations 23 and 24 describe how, in the feature-based scenario, the weights for the hard and phased cost components are obtained from the number of hard constraints. The value of k is also determined through a tuning procedure. Note that k is float-valued while w_h and w_p are integer-valued, so a rounding is applied. In our tuning procedure, the feature-based approach turned out to be the most effective for Stage 2, while for Stage 1 fixed values proved more suitable. The whole tuning procedure was performed with the aid of the tool json2run (Urli, 2013), which implements the F-RACE procedure. The parameter space was sampled using a Hammersley point set (Hammersley & Handscomb, 1964). Table 4 contains the list of the parameters and related values. The column Tuning range contains the lower and upper bounds of the ranges considered by the tuning procedure, which do not constitute a boundary in our algorithm. The pairwise comparisons aimed at deciding whether to use a random or a greedy start, and whether to use fixed or feature-based w_h and w_p, do not require Hammersley sampling. The outcome of the tuning procedure is shown under the columns Assigned values. Finally, Table 5 presents the outcome of a pairwise comparison between the random start and the greedy start. We limited the execution of the algorithm to one million iterations of Stage 1 exclusively, which we consider a short run. The results are given in terms of feasibility ratio, because in this stage of the algorithm we are not yet focused on optimizing the objective function. In general, we can observe that the greedy start ensures a higher probability of finding a feasible solution, at the price of a slightly longer execution time.
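To illustrate how a Hammersley point set covers a parameter space more evenly than uniform random sampling, the following sketch generates a two-dimensional Hammersley set on the unit square (first coordinate i/n, second coordinate the base-2 radical inverse of i). This is a generic textbook construction, not the authors' json2run code; scaling each coordinate to a parameter's tuning range is then straightforward.

```cpp
#include <vector>
#include <utility>

// Radical inverse of i in the given base: mirrors the digits of i
// around the radix point (e.g., 6 = 110 in base 2 -> 0.011 = 0.375).
double radical_inverse(unsigned i, unsigned base) {
    double inv = 1.0 / base, result = 0.0, frac = inv;
    while (i > 0) {
        result += (i % base) * frac;
        i /= base;
        frac *= inv;
    }
    return result;
}

// n low-discrepancy points in [0,1)^2: (i/n, radical_inverse(i, 2)).
// Each coordinate can be rescaled to a parameter's tuning range.
std::vector<std::pair<double, double>> hammersley2d(unsigned n) {
    std::vector<std::pair<double, double>> pts;
    pts.reserve(n);
    for (unsigned i = 0; i < n; ++i)
        pts.push_back({static_cast<double>(i) / n, radical_inverse(i, 2)});
    return pts;
}
```

Higher-dimensional parameter spaces use the radical inverse in successive prime bases (2, 3, 5, ...) for the remaining coordinates.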

Analysis of the results
We report in this section an overview of the experimental results.
First, we discuss the results obtained on Model M. We used the solver CPLEX 20.1 and imposed different time limits, ranging from one hour to 24 hours. We employed a single CPU core per run.
Solving the model on the competition instances did not yield good results: within a time limit of one hour, a feasible solution was found only for instance Late_4, and the other 44 instances were left unsolved. Longer time limits provided very limited improvements: within 24 hours, feasible solutions were found only for six instances (E8, M4, M8, M15, L4, L8), with objective function values far from those obtained by the simulated annealing within comparable running times. Hence, to assess the performance and the limits of Model M, we ran tests on two clusters of instances obtained by reducing the size of the competition ones. In the first cluster, we removed constraint types and pairs of constraint types from each competition instance that contains them. The columns of Table 6 (left) report, respectively: the removed constraint types; the number of reduced instances; the number of instances for which a feasible, but not optimal, solution is found; and the number of instances solved to optimality. From Table 6 (left), we infer that the solver still struggles to produce feasible solutions when only one constraint type is removed. Removing pairs of constraint types seems to bring benefits only when the capacity and break constraints are both removed: the solver manages to provide 26 feasible solutions for the considered instances, six of which are proven optimal in an average time of 30 seconds. This is in line with the structure of the instances; indeed, several of them include many capacity and break constraints in their hard version (see Table 3).
In the second cluster of instances, we reduced the size of the competition instances in terms of number of teams. Specifically, we restricted the cardinality of the team set T to |T| = 6, 8, 10, 12, 14, and for each cardinality we considered 20 reduced instances. These instances are obtained from the competition ones by randomly selecting the teams to remove and by deleting them from all the constraints in which they appear. Table 6 (right) reports the same data as Table 6 (left), except for the first column, which in this case reports the size of set T. As expected, the larger the size of T, the fewer feasible (and optimal) solutions the solver is able to provide. Moreover, we observe that the model is solved consistently on instances with up to ten teams. Starting from |T| = 12, the performance of the model drops drastically: only four feasible solutions and one optimal solution are found with |T| = 12, and only two feasible solutions are found with |T| = 14.
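The reduction by team removal can be sketched as follows: keep a random subset of teams and delete the removed teams from every constraint, discarding constraints that no longer refer to any team. The Constraint type and its field are assumptions for illustration, not the actual instance format of the competition.

```cpp
#include <vector>
#include <set>
#include <algorithm>

// Hypothetical minimal constraint representation: the list of teams it mentions.
struct Constraint {
    std::vector<int> teams;
};

// Keep only the teams in kept_teams; drop removed teams from each constraint,
// and drop constraints that become empty (they no longer constrain anything).
std::vector<Constraint> reduce_instance(const std::vector<Constraint>& cons,
                                        const std::set<int>& kept_teams) {
    std::vector<Constraint> out;
    for (Constraint c : cons) {
        c.teams.erase(std::remove_if(c.teams.begin(), c.teams.end(),
                                     [&](int t) { return kept_teams.count(t) == 0; }),
                      c.teams.end());
        if (!c.teams.empty())
            out.push_back(c);
    }
    return out;
}
```

In practice the slot set would be shrunk accordingly (2|T| - 2 rounds per half for a double round-robin), which this sketch omits.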
In what follows, we discuss the results obtained by the simulated annealing algorithm on the competition instances. Specifically, we report both the best overall solution that we obtained for each instance and data on the average behavior of the algorithm. In order to assess the average performance of the simulated annealing in terms of cost, time, and feasibility, we report the results of a dedicated experiment batch. All the experiments were run without a time limit, but rather with a fixed maximum number of iterations per stage (see Sect. 4.4). Depending on the number of hard constraints in the instance, two different configurations of iterations per stage were applied, as shown in Table 7. Table 8 reports the results obtained by the solver. The column Best solution found reports the best solution that our solver was able to find across all experiments. Some of these values are those that we submitted to the ITC2021 competition, while others were found in later experiments. When the current known best was determined by our solver, the corresponding value in the first column is marked in bold. When no feasible solution was found, the number of hard violations followed by the letter H is reported. The next columns, labeled Average values, report the data obtained in a set of experiments that we ran independently from the competition, in order to extract information on the average behavior of the algorithm in its final configuration. At least 48 runs per instance were performed to collect these data. Columns Cost and Time report, respectively, the average value of the objective function and the average time needed for a complete run of the three stages. The average cost is computed only on feasible solutions. Column Feasible reports the ratio between feasible solutions and total runs.
Finally, column Best known cost contains the best known results at the time of writing, according to the data published on the competition website (Van Bulck et al., 2020a), and column CPLEX best bound contains the best lower bound that CPLEX was able to determine on Model M. Overall, our three-stage multi-neighborhood Simulated Annealing solver could find at least one feasible solution on 44 out of 45 instances. In its final configuration, it easily determines a feasible solution on 36 instances, which are characterized by a feasibility ratio of 1.00, as can be observed in column Feasible of the above-discussed table. The remaining instances turn out to be harder for the algorithm.
In particular, instances Early_5, Early_10, Middle_2, and Late_5 prove considerably challenging, as feasible solutions are found only sporadically.

Analysis of the neighborhood PartialSwapTeamsPhased
One of the main contributions of the present work is the new neighborhood PartialSwapTeamsPhased, introduced with the main purpose of handling the phased version of the problem. In order to test the effectiveness of the novel neighborhood, we ran an additional set of experiments on phased instances without making use of PartialSwapTeamsPhased. To do so, we assigned a probability of 0.00 to PartialSwapTeamsPhased in the multi-neighborhood and rescaled the probabilities associated with the other neighborhoods proportionally, in order to keep the same mutual ratios among them (according to the values in Table 4).
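The proportional rescaling used for this ablation amounts to zeroing the probability of the disabled neighborhood and renormalizing the remaining entries so that they sum to one; their mutual ratios are then preserved. A minimal sketch, with hypothetical names:

```cpp
#include <vector>
#include <cstddef>
#include <cmath>

// Disable the neighborhood at index `disabled` and renormalize the other
// probabilities proportionally, preserving their mutual ratios.
std::vector<double> disable_and_rescale(std::vector<double> probs,
                                        std::size_t disabled) {
    probs[disabled] = 0.0;
    double sum = 0.0;
    for (double p : probs) sum += p;
    for (double& p : probs) p /= sum;  // remaining entries keep their ratios
    return probs;
}
```

For example, disabling the first neighborhood of a portfolio with probabilities (0.2, 0.3, 0.5) divides the remaining two by 0.8, keeping their 3:5 ratio intact.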
In Table 9 we report the average costs and the feasibility ratios obtained by the standard configuration and those obtained by the configuration that does not employ PartialSwapTeamsPhased. At least 20 runs per instance were executed. The last column reports, where possible, the percentage gap between the average cost obtained without PartialSwapTeamsPhased and the average cost obtained by the full configuration. It is possible to observe that instance Middle_1 is solved to feasibility only by the configuration that employs PartialSwapTeamsPhased. Sixteen instances, solved by both configurations, yield worse results when solved without PartialSwapTeamsPhased. For just one instance, Late_4, the two configurations obtain the same average cost. Finally, the remaining four instances, Early_5, Early_10, Late_5, and Middle_2, are not solved to feasibility by either configuration in the given number of runs. According to these data, the neighborhood PartialSwapTeamsPhased brings a tangible improvement on 17 out of 22 phased instances.

Conclusions
In this study, we considered the version of the Sports Timetabling Problem proposed for the ITC2021 competition. We presented an ILP model for the problem, which did not yield significant results when solved by a commercial solver. We then tackled the problem with a three-stage multi-neighborhood simulated annealing approach that makes use of six different neighborhoods. In particular, the neighborhood that we named PartialSwapTeamsPhased is a novel contribution. Finally, we performed a parameter tuning for the solver using the F-RACE procedure, which allowed us to find a suitable set of parameter values for this problem.
This approach managed to find a feasible solution for 44 out of the 45 instances proposed by the competition. Feasible solutions were found rather easily for most of the instances; however, the metaheuristic struggled to produce feasible solutions for certain instances, even with long execution times. The results obtained by the simulated annealing approach allowed us to rank second out of 13 participants in the final ranking of the competition.
Future work will be devoted to improving the results and performance of the simulated annealing algorithm, both on the considered instances and on other round-robin tournament benchmark instances. Specifically, we believe that relevant advancements can be achieved through a wider study and application of the PartialSwapTeamsPhased neighborhood on a larger set of instances. Possible research directions also include the definition and integration of new neighborhoods in the simulated annealing algorithm, to better deal with feasibility on heavily constrained instances. In addition, we will work on the implementation and evaluation of new greedy techniques to generate different initial solutions, not restricted to the canonical pattern. Further research may also be devoted to developing a metaheuristic approach, such as Large Neighborhood Search (LNS), that embeds exact methods into our simulated annealing algorithm.