The crew-scheduling problem (CSP) is usually solved in two stages. In the first stage, all possible diagrams satisfying the industrial constraints are enumerated. In the second stage, a set of diagrams that covers the entire schedule in the most cost-effective way is selected. Diagrams are usually modelled as binary vectors (Table 1), where ‘1’ denotes that trip i is included in diagram j and ‘0’ that it is not. Each diagram has its own cost. Deadhead journeys are represented by including the same trip in more than one diagram. In the rest of the article the terms diagram and column are used interchangeably.
Although the generation of the diagrams can be performed in a simple and relatively straightforward manner using various graph search and label-setting techniques [2], finding an optimal set of diagrams may be highly time-consuming. The problem boils down to the solution of the 0–1 integer combinatorial optimization set covering problem (SCP):
$$\begin{gathered} Minimize{\text{ }}\sum\limits_{{j=1}}^{m} {{c_j}{x_j}} \hfill \\ Subject \, to{\text{: }}\sum\limits_{{j=1}}^{m} {{a_{ij}}{x_j}} \geq 1,\quad \forall i \hfill \\ \end{gathered} $$
$$\begin{gathered} {x_j} \in \{ 0,1\} \hfill \\ i=1,2 \ldots n{\text{ }}trips \hfill \\ j=1,2 \ldots m{\text{ }}diagrams \hfill \\ \end{gathered} $$
where $a_{ij}$ is a binary coefficient indicating whether trip i is included in diagram j; $x_j$ is the decision variable showing whether diagram j is included in the schedule; and $c_j$ is the cost of diagram j.
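For a very small instance, the SCP above can be solved by brute-force enumeration of the binary vector x. The coverage matrix and costs below are hypothetical toy values used only to illustrate the formulation; real instances with hundreds of thousands of columns require the techniques discussed in the following sections.

```python
from itertools import product

# Toy SCP instance: 4 trips (rows) and 5 candidate diagrams (columns).
# a[i][j] = 1 if trip i is included in diagram j (hypothetical data).
a = [[1, 0, 1, 0, 0],
     [1, 1, 0, 0, 0],
     [0, 1, 0, 1, 0],
     [0, 0, 1, 1, 1]]
c = [3, 2, 4, 3, 1]  # cost c_j of each diagram

n, m = len(a), len(c)
best_cost, best_x = None, None

# Enumerate every 0/1 assignment of x (tractable only for tiny m).
for x in product((0, 1), repeat=m):
    # Constraint: every trip must be covered by at least one chosen diagram.
    if all(sum(a[i][j] * x[j] for j in range(m)) >= 1 for i in range(n)):
        cost = sum(c[j] * x[j] for j in range(m))
        if best_cost is None or cost < best_cost:
            best_cost, best_x = cost, x

print(best_cost, best_x)
```

The exponential 2^m search space of this enumeration is exactly why the literature below turns to column generation and metaheuristics.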
The complete enumeration of all possible diagrams is likely to be impractical due to the large geographical scope of operations, the number of train services, and industry regulations. Typically, the number of generated diagrams reaches 300,000–400,000 for small problems and can be up to 50–75 million for the large ones [3, 4].
Country-wide planning creates a large number of opportunities for drivers to change freight trains, while passenger trains and taxi services connecting a large number of stations exponentially expand the graph topology. Furthermore, checks such as maximum driving time, minimum breaks and maximum diagram length need to be conducted while traversing the graph. These checks ensure compliance with industrial regulations, but substantially increase the computation time at the diagram creation stage.
Branch-and-price
Linear programming methods such as branch-and-price [5, 6] have been popular for the solution of medium-sized CSPs in the passenger train and airline industries [7]. These methods usually rely on a column-generation approach, where the main principle is to generate diagrams in the course of the algorithm, rather than having them all constructed a priori. Despite the ability of the algorithm to work with an incomplete set of columns, the column generation method alone does not guarantee an integer solution of the SCP. It is usually used in conjunction with various branching techniques that are able to find the nearest integer optimal solution. However, this approach is less suitable for the CSP in rail freight, where the possible number of diagrams tends to be considerably higher.
Genetic algorithms
Linear programming (LP) has been used for CSPs since the 1960s [8], but genetic algorithms (GAs) were introduced more recently [9]. GAs have been applied either for the production of additional columns as a part of column generation [8] or for the solution of an SCP from the set of columns generated prior to the application of a GA [9,10,11,12], but there are not yet any reports of them solving both stages of the problem. Since the diagrams are generated outside the GA in advance, the GA cannot change or add new columns. The GA is therefore confined to finding only good combinations from a pre-determined pool of columns.
For the solution of a CSP with a GA, chromosomes are normally represented by integer or binary vectors. Integer vector chromosomes contain only the numbers of the diagrams that constitute the schedule. This approach requires knowledge of the minimum number of diagrams in the schedule and this information is usually obtained from the cost lower bounds. Lower bounds are usually acquired through the solution of LP relaxation for an SCP [13]. Since the number of diagrams in the optimal solution tends to be higher than the lower bound, Costa et al. [14] have suggested the following approach. In the first population, the chromosomes have a length equal to the lower bound. Then, if a solution has not been found within a certain number of iterations, the length of the chromosome increases by one. This process repeats until the termination criteria are met.
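A minimal sketch of the growing-chromosome idea follows, using hypothetical coverage data and an assumed lower bound of two diagrams; a real implementation would evolve a population with crossover and mutation rather than sampling chromosomes at random.

```python
import random

# Toy data: a[i][j] = 1 if trip i is in diagram j (hypothetical values).
a = [[1, 0, 1, 0, 0],
     [1, 1, 0, 0, 0],
     [0, 1, 0, 1, 0],
     [0, 0, 1, 1, 1]]
c = [3, 2, 4, 3, 1]
n, m = len(a), len(c)

def covers_all(chrom):
    # chrom is an integer-vector chromosome: a list of diagram indices
    selected = set(chrom)
    return all(any(a[i][j] for j in selected) for i in range(n))

def cost(chrom):
    return sum(c[j] for j in set(chrom))

random.seed(0)
length = 2        # assumed lower bound on the number of diagrams
best = None
for attempt in range(10_000):
    chrom = random.sample(range(m), length)
    if covers_all(chrom) and (best is None or cost(chrom) < cost(best)):
        best = chrom
    # No feasible chromosome found at this length for a while: grow by one.
    if best is None and attempt % 200 == 199:
        length = min(length + 1, m)

print(sorted(best), cost(best))
```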
In the binary vector representation, each gene stands for one diagram: ‘1’ denotes that the diagram is included in the schedule, and ‘0’ that it is not. Although the detailed information about times and locations is stored separately and only applied when a chromosome is decoded into the schedule, such chromosomes usually consist of several hundred thousand genes. With this representation the number of diagrams in the solution need not be known in advance, but the algorithm is likely to need a large number of iterations to solve the problem.
The application of genetic operators often violates the feasibility of the chromosomes, resulting in certain trips being highly over-covered (i.e. more than one driver assigned to the train) or under-covered (i.e. no drivers assigned to the train). One way of resolving this difficulty is to penalize the chromosome through the fitness function in accordance with the number of constraints that have been violated. However, the development of the penalty parameters can be problematic as in some cases it is impossible to verify them analytically and they are usually designed experimentally [15]. The penalty parameters are therefore data-dependent and likely to be inapplicable to other industries and companies. Moreover, the feasibility of the entire population is not guaranteed and might be achieved only after a large number of iterations.
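The penalty approach can be sketched as follows; the weights w_under and w_over are assumed values chosen for illustration, which is precisely the data-dependence problem noted above.

```python
# Toy coverage matrix and diagram costs (hypothetical values).
a = [[1, 0, 1, 0, 0],
     [1, 1, 0, 0, 0],
     [0, 1, 0, 1, 0],
     [0, 0, 1, 1, 1]]
c = [3, 2, 4, 3, 1]

def penalized_cost(x, w_under=50, w_over=5):
    """Schedule cost plus penalties for coverage violations."""
    total = sum(cj * xj for cj, xj in zip(c, x))
    for row in a:
        cover = sum(aij * xj for aij, xj in zip(row, x))
        if cover == 0:              # under-covered: no driver on the trip
            total += w_under
        elif cover > 1:             # over-covered: surplus drivers
            total += w_over * (cover - 1)
    return total

print(penalized_cost([0, 1, 1, 0, 0]))   # feasible: plain schedule cost
print(penalized_cost([1, 0, 0, 0, 0]))   # two uncovered trips: penalized
```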
Another more straightforward approach to maintaining the feasibility is to design heuristic “repair” operators. These operators are based on the principles “REMOVE” and “INSERT”. They scan the schedule and remove certain drivers from the over-covered trips and assign those drivers to under-covered journeys [13, 15]. This procedure might have to be repeated several times, leading to high memory consumption and increased computation time.
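A REMOVE/INSERT repair pass might look like the following sketch; the greedy orderings (remove the costliest redundant diagrams first, insert the cheapest covering diagram) are illustrative assumptions rather than the exact heuristics of [13, 15].

```python
def repair(x, a, c):
    """One REMOVE/INSERT pass over a binary chromosome x."""
    x, n, m = list(x), len(a), len(c)
    cover = [sum(a[i][j] * x[j] for j in range(m)) for i in range(n)]

    # REMOVE: drop a diagram (costliest first) if every trip it serves
    # would still be covered without it.
    for j in sorted(range(m), key=lambda k: -c[k]):
        if x[j] and all(cover[i] > 1 for i in range(n) if a[i][j]):
            x[j] = 0
            for i in range(n):
                cover[i] -= a[i][j]

    # INSERT: cover each remaining uncovered trip with the cheapest
    # unused diagram that serves it.
    for i in range(n):
        if cover[i] == 0:
            j = min((k for k in range(m) if a[i][k] and not x[k]),
                    key=lambda k: c[k])
            x[j] = 1
            for t in range(n):
                cover[t] += a[t][j]
    return x

# Toy data (hypothetical); repair a heavily over-covered chromosome.
a = [[1, 0, 1, 0, 0],
     [1, 1, 0, 0, 0],
     [0, 1, 0, 1, 0],
     [0, 0, 1, 1, 1]]
c = [3, 2, 4, 3, 1]
print(repair([1, 1, 1, 1, 1], a, c))
```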
Adaptable genetic algorithm
Two of the common challenges associated with the design of GAs are stalled evolution and premature convergence. Multiple genetic operators, random offspring generation, and dynamic parameter adjustment are among the methods for tackling these problems [16, 17]. The challenges in the design of an efficient GA with multiple operators are: identification of the optimal quantity of genetic operators, selection of operators that complement each other’s strengths, and definition of utilization rules. Creation of offspring at random, rather than through the crossover operator, can be inefficient for a large-scale problem due to the large number of potential gene permutations, lowering the probability of producing more fit and diverse offspring.
Genetic parameters such as crossover rate and mutation rate govern the exploration and exploitation phases. Poor selection can lead to premature convergence due to reduced diversity in the population over several iterations [18]. While the mutation operator is usually responsible for the maintenance of diversity, an extremely high level of mutation at the beginning can impede convergence on the solution. On the other hand, a very low level of mutation at the beginning might lead to poor exploration of the search region and the algorithm might not be able to arrive at the optimal solution.
To achieve a balance, several adaptive techniques that dynamically adjust the mutation and crossover rates have been proposed. One approach modifies the values of GA parameters proportionally to the distance between the best and average fitness in the population [19]. Designing an evolutionary algorithm for the crew scheduling problem, Kwan et al. [20] suggest selecting the mutation probability individually for each chromosome rather than for the entire population. The longer the individual has been in the population, the higher its probability of undergoing mutation. Both approaches rely on pre-defined crisp rules. However, the criteria for optimal selection of crossover and mutation are ambiguous and hard to model. Crisp rules cannot always adequately deal with the intricacies of the parameter adjustment process. For this reason, fuzzy-logic controllers, which are able to handle uncertainty and imprecision, have been applied in this research.
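As a sketch of crisp, rule-based adaptation in the spirit of [19] (the constants and scaling below are assumed for illustration, not taken from that work):

```python
def adapt_rates(best_fit, avg_fit, pc_max=0.9, pm_max=0.2):
    """Crisp adaptive parameter control for a minimization GA.

    When the average fitness approaches the best fitness, the population
    is converging, so crossover is lowered and mutation is raised.
    """
    gap = abs(avg_fit - best_fit) / max(abs(avg_fit), 1e-9)
    scaled = min(1.0, 5.0 * gap)       # assumed scaling constant
    pc = pc_max * scaled               # diverse population: more crossover
    pm = pm_max * (1.0 - scaled)       # converged population: more mutation
    return pc, pm

print(adapt_rates(100, 150))   # diverse population
print(adapt_rates(100, 101))   # nearly converged population
```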
Wang et al. [21] were amongst the first researchers to propose the incorporation of fuzzy logic controllers within GAs in order to optimize the GA parameters. The configuration of a standard fuzzy-logic controller (FLC) is illustrated in Fig. 1. At each iteration of the GA, the information about its current performance is passed onto the FLC. The FLC then processes it and produces a recommendation for how the GA parameters should be altered in order to achieve more optimal execution. There are four critical components that support the FLC: a rule-base, a fuzzification unit, an inference engine, and a defuzzification unit.
The rule-base contains expert knowledge, expressed in the form of IF-THEN rules, which determine the relationship between the input and output. When applied to GA parameter management, the typical principle is to increase the mutation rate and decrease the crossover rate when the algorithm is converging [22,23,24,25,26].
The fuzzification unit estimates the degree to which the crisp input values belong to the fuzzy sets referenced by the rules stored in the rule-base. In the context of GA parameter control, the output fuzzy sets represent the crossover and mutation rates. The membership functions of the fuzzy sets are labelled with linguistic variables (i.e. Low, Medium, and High).
The role of the inference engine is to identify the required level of changes to the GA parameters at a given iteration. The decision is made on the basis of the information received from the rule-base and fuzzification units. Finally, the defuzzification element returns scalar values of crossover and mutation rates.
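The four components can be sketched end-to-end as a minimal Mamdani-style controller; the membership functions, output centroids, and rule-base below are assumptions made for illustration.

```python
def tri(x, a, b, c):
    """Triangular membership function peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def fuzzify(cm):
    """Membership degrees of a convergence measure cm in [0, 1]."""
    return {"Low": tri(cm, -0.5, 0.0, 0.5),
            "Medium": tri(cm, 0.0, 0.5, 1.0),
            "High": tri(cm, 0.5, 1.0, 1.5)}

# Rule-base: IF convergence is <term> THEN mutation/crossover are <terms>.
RULES = {"Low": ("Low", "High"),
         "Medium": ("Medium", "Medium"),
         "High": ("High", "Low")}
# Output terms mapped to representative scalar values (centroids).
MUTATION = {"Low": 0.01, "Medium": 0.05, "High": 0.15}
CROSSOVER = {"Low": 0.5, "Medium": 0.7, "High": 0.9}

def flc(cm):
    """Inference plus weighted-centroid defuzzification."""
    degrees = fuzzify(cm)
    num_m = num_c = den = 0.0
    for term, w in degrees.items():
        m_term, c_term = RULES[term]
        num_m += w * MUTATION[m_term]
        num_c += w * CROSSOVER[c_term]
        den += w
    return num_m / den, num_c / den   # (mutation rate, crossover rate)

print(flc(0.95))   # converging: mutation rises, crossover falls
```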
While the architecture of the FLC remains the same across different fields of research and applications, the input parameters vary significantly. The input parameters can be broken down into two types: phenotype-based and genotype-based parameters. The first group deals with changes in the fitness function, whereas the genotype-based group concerns the structure of chromosomes.
As an example of phenotype measurements, Herrera and Lozano [22] utilize the convergence measure (CM), defined as the ratio between the best fitness on the current iteration and the best fitness on the previous iteration. In another experiment, they enhance this ratio with the number of generations over which the best fitness has remained unchanged and the variance of the fitness, in order to amend both mutation and crossover rates. Hongbo et al. [25] use the average fitness value in relation to the best fitness in the population, together with changes in the average and best fitness over several iterations, to solve the crew grouping problem in military operations. This approach was later adopted for the detection of high-resolution satellite images [23] and for optimal wind-turbine micrositing [26]. Homayouni and Tang [27] propose the use of indicators such as the best value of the fitness function, the frequency of chromosomes sharing that best value, and the percentage of identical chromosomes in the population. In contrast, another FLC [28] relies on changes in the value of the best fitness and in population diversity.
Along with phenotype attributes, some authors consider genotype properties [24, 29]. They assess the Hamming distance between the chromosomes with the best fitness and the worst fitness in relation to the length of the chromosome. This approach promotes diversity, not only in the fitness functions, but also in the structure of the individuals.
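This genotype measure reduces to a normalized Hamming distance, for example:

```python
def normalized_hamming(best, worst):
    """Hamming distance between two chromosomes over chromosome length."""
    assert len(best) == len(worst)
    return sum(b != w for b, w in zip(best, worst)) / len(best)

# Hypothetical best/worst binary chromosomes: a value near 0 signals
# that the population has structurally converged.
print(normalized_hamming([1, 0, 1, 1, 0], [1, 1, 0, 1, 0]))
```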