Knowledge-based optimization algorithm for the inventory routing problem

The Inventory Routing Problem (IRP) is a combinatorial optimization problem that combines routing decisions with inventory management. In this paper, an approach to solving the IRP is studied, which aims at using an external knowledge source (a known good solution or user interaction) to improve the results attained by an evolutionary algorithm solving an IRP instance. The proposed method improves the best solution found by the evolutionary algorithm by modifying schedules for some of the retailers according to those present in the known good solution or to schedules provided by a domain expert. The experiments shown that to improve the optimization results it suffices to perform a few repetitions of the knowledge import procedure. This observation motivates further research on user-interactive optimization algorithms for the IRP, because the number of interactions needed to improve the results can easily be handled by the user.


Introduction
The Inventory Routing Problem (IRP) (Bertazzi and Speranza 2012;Dantzig and Ramser 1959) is an extension of popular transportation problems on graphs, such as the Vehicle Routing Problem (VRP) (Laporte 2009), combining routing optimization with inventory management optimization (Aghezzaf et al. 2006;Archetti et al. 2007).The range of applications includes perishable goods delivery (Alkaabneh et al. 2020;Mirzaei and Seifi 2015), fuel delivery (Popović et al. 2012), maritime transport (De et al. 2017) and transportation of hazardous items (Timajchi et al. 2019).Many formulations and extensions of the regular IRP were recently studied in the literature (Archetti and Ljubić 2022), but the most common version of the IRP focuses on planning the distribution of a single product provided by a single supplier to a given number of retailers.The supplier produces a given, constant quantity of the product each day, and the retailers sell varying quantities of this product.Limited storage space is available both at the supplier and the retailers.Storage costs are calculated per unit of the product per day at a rate varying from location to location.The objective in the IRP is to optimize the inventory and transportation costs under a number of constraints on the capacity of a fleet of vehicles delivering goods, costs and limits of local inventories, etc.A solution to the IRP is a delivery schedule for a planning horizon of T days along with routes for vehicles delivering the product.Some recent extensions concern continuous time issues.The paper by Lagos et al. (2022) proposes an approach based on dynamic discretization of the continuous time.In Lagos et al. (2020), a continuous time version of the IRP is considered.It extends the IRP with evaluating the retailer's storage capacity and product inventory at the time of the delivery.This continuous-time IRP is solved using simple time discretizations in combination with integer programming models.In Touzout et al. (2022), the Time-Dependent IRP (TD-IRP) was introduced that extends the routing part of the problem by making the travel time between two locations depend on the departure time instead of being constant as in the regular IRP.In Skalnes et al. (2022), time varying demands were investigated and a branch-and-cut algorithm based on a new mathematical formulation was introduced.
In Agra et al. (2022), a continuous time version of the IRP was studied with a single vehicle responsible for the transport from a set of suppliers to a set of retailers and two models: a location-event model and a vehicle-event model were proposed.Even though, in most cases, a finite planning horizon is considered, an infinite-horizon IRP has been formulated as early as 1985 (Blumenfeld et al. 1985;Burns et al. 1985) as an economic order quantity (EOQ) model.
Another group of extensions concerns closed-loop versions of the regular IRP.The paper by De and Giri (2020) addresses a few distinct environmental policies for carbon emission regulations and proposes an approach based on mixed integer nonlinear programming.The paper by Dev et al. (2017) studies inventory and production planning in a closed-loop system concerning both manufacturing and remanufacturing.It investigates a few inventory and production planning models and a discrete event simulation.
In Yu et al. (2022), a new variant of IRP, namely Multi-Vehicle Cyclic Inventory Routing Problem (MV-CIRP), was introduced.It tries to find the subset of retailers, the number of vehicles used, and the corresponding cycle time and route sequence in order to minimize the total cost.The paper by Yu et al. (2022) proposes a mixed-integer nonlinear programming model and a Simulated Annealing algorithm to solve it.
In Sifaleras and Konstantaras (2020), several generalizations of the IRP were discussed, including the multi-product case and the case with capacity constraints.This paper presents solving the IRP in the context of Variable Neighborhood Search (VNS).
Many formulations of the IRP are deterministic, but the version of the problem with non-deterministic lead times (Roldán et al. 2017) as well as the variant with stochastic demands (Bertazzi et al. 2013) have also been studied.In Mousavi et al. (2022), a stochastic production routing problem was introduced and discussed in the context of perishable products.This paper also addresses the uncertain demand and aims to optimize also the costs of wasted products as well as introduces penalties for non-fresh products.For solving such a problem, a matheuristic algorithm was proposed.
Some other extensions of the IRP address also returnable transport items (Iassinovskaia et al. 2017), where the supplier is also in charge of collecting the empty returnable transport items for reuse in the next cycles, as well as storage replenishment routing problem (Çelik et al. 2022), where storage replenishment operations require the transportation of items to given item slots in the storage area.Moreover, Malladi and Sowlati (2018) discuss sustainability aspects of the IRP.
Besides the extensions of the regular IRP aiming at optimizing a single objective function, some multi-objective approaches were also proposed.The paper by Rabbani et al. (2021) introduces a multi-objective optimization approach for solid waste management, focusing on recycling and waste-to-energy technologies, where the objective functions try to minimize the total net cost, greenhouse gas emissions, and total waste collection and treatment time.
Recent approaches to the IRP found in the literature are based on various algorithmic methods.A number of them have been based on integer programming and methods such as the branch-and-cut algorithm (Archetti et al. 2007).Some of them use heuristics and metaheuristics (Bard and Nananukul 2009;Hiassat et al. 2017;Lipinski and Michalak 2018;Wu et al. 2021) or iterative metaheuristics (Vadseth et al. 2021).The paper by Mahjoob et al. (2022) uses genetic algorithms (GA) for solving a multi-product, multi-period inventory routing problem.Hybrid methods have also been proposed, such as (Diabat et al. 2017) combining simulated annealing with direct search or (Liu and Chen 2011) with tabu search.Hybrid algorithms combining simulation and heuristics were proposed in Juan et al. (2014) and Juan et al. (2015).
In this paper, the approach used to solve the IRP is a metaheuristic optimization algorithm optimizing the delivery schedules.Key points summarizing the contents of this paper are: -The paper studies the possibility of importing knowledge from an external source in order to improve the optimization results.Knowledge can be imported from a known good solution (for example, when a certain organization of deliveries is established at a company) or by interacting with the user (if a practitioner in the field wants to interactively improve the solutions).In the context of this study, knowledge imported into the optimization process is represented as schedules for individual retailers, which are either copied from a known good solution or are provided by the user.-A hybrid optimization algorithm is used which combines evolutionary optimization of delivery schedules with optimization of routes using the Concorde TSP Solver (William 2020).-Feasibility-preserving genetic operators (crossover, mutation) are used, which ensure that newly generated solutions are feasible and thus the population always contains feasible solutions only.-Apart from well-known benchmarks proposed in the paper by Archetti et al. (2007), hard IRP instances evolved in the paper by Michalak (2021b) are used.These hard instances were obtained by running an evolutionary algorithm which evolved IRP instances with the goal of making these instances harder, that is, maximizing the solving time obtained using state-of-the-art mathematical problem solvers such as CPLEX.The solving times increased, with respect to the instances from the paper by Archetti et al. (2007), from 65.22 up to 100327.92times (depending on the IRP instance solved) (Michalak 2021b).Therefore, these hard instances can be considered difficult to solve using state-of-the-art mathematical problem solvers.-The evolutionary algorithm using knowledge import proposed in this paper is not likely to deteriorate the results (a worse result was only observed in 4 cases out of 720).In about 50% of tested cases, the knowledge import improved the optimization results and in the remaining cases the results were not statistically different for the algorithm with and without the knowledge import.-For the instances proposed in the paper by Archetti et al. (2007) the knowledge import mechanism improved the results in about 66% of cases in a scenario with only one knowledge import, in which 10 changes of the schedules for individual retailers were attempted.For the Hard instances evolved in the paper by Michalak (2021b) the knowledge import improved the results in about 16% of cases and never made the results worse.This means, that if the user can be expected to provide 10 good schedules for individual retailers in just one interactive session during the entire algorithm's runtime, the optimization results can be improved without the risk of degrading the performance of the proposed method.Therefore, the workload of the user should not be too high, and the proposed approach appears to be suitable for designing user-interactive optimization methods for the IRP.-For the Hard instances evolved in the paper by Michalak (2021b) the knowledge-based EA presented in this paper attained the best results among all the tested methods within the running time limit of max t = 3600 s (1 h).This shows that the proposed method can be effectively used for solving hard IRP instances for which mathematical programming methods require a very long running time, and known metaheuristic methods are not competitive to the one proposed here.
The rest of this paper is structured as follows.Section 2 presents the formulation of the IRP studied in this paper.Section 3 discusses the optimization algorithms used in this paper to solve the IRP.Section 4 presents the experiments and results.Section 5 concludes the paper.

The inventory routing problem formulation
In this paper, a single-depot, single-commodity and singlevehicle version of the IRP with the up-to-level replenishment policy presented by Archetti et al. (2007) is studied.A single commodity is delivered from one supplier (depot) to n retailers with one vehicle with capacity C. The deliveries have to be planned within a horizon of H consecutive time periods (days).Variables r s,t for s = 0, . . ., n, t = 1, . . ., H t = 1: 1 2 7 5 3 t = 2: 6 7 8 4 9 t = 3: 5 7 Fig. 1 The optimal solution to the hard3n10 IRP instance generated in the paper by Michalak (2021b) represent the production at the supplier and sales at the retailers.The index s = 0 is assigned to the supplier, so r 0,t is the production and r s,t for s > 0 are the sales on day t.Inventory levels at the supplier are represented by variables B t , t = 1, . . ., H .There is no upper limit on B t , which means that the supplier always has enough storage capacity for the produced goods, and the lower bound is ∀t = 1, . . ., H : B t ≥ 0, which means that stock-out at the supplier is not allowed.Inventory levels at the retailers are represented by variables I s,t and the minimum and maximum inventory levels denoted L s and U s are constant in time, but potentially different for every retailer.Therefore, the inventory level constraints are represented by inequalities: Following the work of Archetti et al. (2007) we set ∀s = 1, . . ., n : L s = 0, which means that no minimum inventory levels are required at the retailers, except for the no stock-out requirement.A solution to the IRP consists of the routes for the vehicle π t and the delivery sizes x s,t for s = 1, . . ., n, t = 1, . . ., H .Each route π t is a permutation of a subset of the n retailers-those, which are visited on day t. Figure 1 shows the optimal solution to the hard3n10 IRP instance generated in the paper by Michalak (2021b).
The tour of the vehicle always starts and ends at the supplier, so the supplier does not have to be explicitly included in π t , but the transportation cost (6) includes the costs of driving from the supplier to the first retailer in π t and from the last retailer in π t back to the supplier.In this paper, the up-to-level replenishment policy is used, so only the routes (permutations) π t are used to represent solutions and the delivery sizes are calculated as: The maximum vehicle capacity C cannot be exceeded by all the goods delivered on each day: ∀t = 1, . . ., H : (3) Location-specific inventory costs h s , s = 0, . . ., n, are assumed to be constant in time and the overall storage costs are the costs of storing the commodity until the next planning horizon arrives: (4) Transportation costs are calculated using costs c i, j of traveling between each pair of points i, j = 0, . . ., n.In this paper, these costs are calculated as rounded Euclidean distances between locations given as coordinates a s , b s ∈ R 2 , s = 0, . . ., n: where As mentioned above, when the up-to-level replenishment policy is used, a solution to the IRP is fully described by a sequence of routes {π t } t=1,...,H .To evaluate such solution, the deliveries x s,t for each day have to be calculated using Eq.
(2).Given the initial inventory levels I s ,0 , deliveries x s,t , and production/consumptions r s,t , the inventory levels B t and I s,t are determined.The evaluation of the solution is obtained by adding the costs of all routes π t , t = 1, . . ., H calculated using Eq. ( 6) and the inventory costs calculated using Eq.(4).
In order not to clutter the paper, for the mathematical formulation suitable for MIP solvers readers are referred to Eqs.
Denoting M = 1, . . ., n, M = M ∪ {0} and assuming that the production and sales are constant in time: ∀s = 0, . . ., n, ∀t : r s ,t = r s an instance of the IRP considered in this paper can be defined as a tuple of the following values: where n-the number of retailers, H -the planning horizon, C-the vehicle capacity, a s , b s -the coordinates of the locations (the supplier and the retailers), r s -the production (at the supplier for s = 0) and sales (at the retailers for s > 0), L s -the lower inventory levels at the retailers, U sthe upper inventory levels at the retailers, I s ,0 -the initial inventory levels at the supplier and retailers, h s -locationspecific inventory costs at the supplier and retailers.
The IRP is an NP-hard problem, because it is a generalization of the Traveling Salesperson Problem (TSP), which is obtained from the IRP by setting H = 1, C = ∞, I 0,0 ≥ n, I s,0 = 0 and r s = 1 for s = 1, . . ., n, and h s = 0 for s = 0, . . ., n.For these settings, only the transportation costs are considered and the vehicle has to visit all the retailers (and thus also the supplier) on the same day.
In the discussion in subsequent sections the following terminology will be used: A schedule is a sequence of days on which a given retailer is visited.For example, for the solution shown in Fig. 1 the schedule for retailer 1 is [1], for retailer 5 it is [1, 3], and for retailer 7 it is [1,2,3].
A route is a sequence of retailers visited on a given day.For example, for the solution shown in Fig. 1 the route for day 1 is [1,2,7,5,3].
A solution consists of H routes, one per each day within the planning horizon.The optimal solution to the hard3n10 IRP instance (with H = 3) generated in the paper by Michalak (2021b) is shown in Fig. 1.

Knowledge-based optimization for the IRP
The optimization algorithm proposed in this paper is a hybrid algorithm combining an evolutionary algorithm (EA) with the Concorde TSP Solver (William 2020) which optimizes the routes.The evolutionary algorithm is based on the classical genetic algorithm scheme with a recursive heuristic used for initializing the population and feasibility-preserving genetic operators.This algorithm involves a knowledge import step, which improves the best solution found by the EA by user interaction or by importing a known good delivery schedule for one of the retailers.In the following sections, the details of the proposed method are presented.An introductory Sect.3.1 lists procedures used in the proposed method.In Sect.3.2 the heuristic used for obtaining the base solution for population initialization is described.Section 3.3 discusses the details of the optimization algorithm.Section 3.4 presents the mechanism for importing knowledge into the optimization process.

Procedures used in the proposed method
The optimization algorithm used in this paper is an evolutionary algorithm (Algorithm 4) based on the classical genetic algorithm scheme.In this algorithm, the SolveTSP procedure is used for optimizing the routes in each solution using the Concorde TSP Solver (William 2020).The ImportKnowledge procedure (Algorithm 5) improves the best solution found by the EA by user interaction or by importing information from a known good solution.Procedures used for population initialization are: BaseSolution (Algorithm 1)-the recursive heuristic for generating the base solution.
AddRetailerToSolution (Algorithm 2) called recursively n times in the BaseSolution procedure to add schedules for n retailers in a random order to the base solution.
GeneratePossibleSchedules (Algorithm 3) used in the AddRetailerToSolution procedure for recursively generating all possible delivery schedules for a given retailer which do not cause the vehicle to be overloaded.
FeasibleDateChangeMutation-a mutation operator proposed in the paper by Michalak (2021a), which moves retailers between different days, ensuring that the solution remains feasible.Procedures used by the main loop of the evolutionary algorithm are: Evaluate-evaluates a given solution by adding the costs of all routes π t , t = 1, . . ., H calculated using equation ( 6) and the inventory costs calculated using equation (4).After this procedure is called for a solution x, the evaluation (the total cost) of this solution can be obtained by referring to the attribute x.Eval.
Reduce-reduces the union of the current population (N pop solutions) and the offspring generated by the genetic operators (N pop new solutions) back to the required population size N pop − 1 (leaving room for a copy of the best specimen).
Reproduce-applies the crossover and mutation operators to obtain the next population from the current one.
CopyBestSpecimen-creates a copy of the best specimen from a given population (i.e. the solution with the lowest cost).
FindWorstSpecimen-finds the index in the population at which the worst specimen can be found (i.e. the solution with the highest cost).Procedures used for manipulating schedules for individual retailers are: AddRetailerDays-adds the specified retailer s to the solution x on days given in a schedule S. For example if x is the solution presented in Fig. 1, then AddRetailerDays(x, 2, [2, 3]) produces a solution: Note, that the routes with the added retailer are not optimized until the SolveTSP procedure is called for the solution x.GetKnownDays-imports knowledge from an external source by either copying a schedule for a certain retailer s * from a known good solution or by consulting the user and asking for a proposed schedule for that retailer.Either way, a schedule for the retailer s * (a sequence of days on which this retailer should be visited) is returned.
Algorithm 1: The BaseSolution procedure -the recursive heuristic for generating the base solution used for initializing the population in the evolutionary algorithm. Inputs: An IRP instance defined as in the equation ( 7).

Output:
The base solution generated by this heuristic.
// Start with an empty solution -H empty routes. x // Vehicle loads for the H days are initially zero.

return x base
GetRetailerDays-returns the days on which a given retailer is visited in a given solution.For example, if x is the solution presented in Fig. 1, then GetRetailerDays(x, 5) returns [1,3].
SortSchedules-sorts schedules generated for a retailer by the GeneratePossibleSchedules procedure (Algorithm 3), placing shorter schedules (with fewer delivery days) first.Schedules of equal length (the same number of delivery days) are sorted by placing those with earlier delivery days first.
SupplyAtTheLatestDate-generates a schedule for a retailer using the Supply at the Latest Date (SLD) heuristic previously used in the papers (Lipinski andMichalak 2018, 2019;Michalak 2021a), which determines the latest possible days at which the deliveries have to be made to a given retailer in order to avoid stockout.
If processing elements of a given set in a random order is necessary, the RandPerm procedure is used, which generates a random permutation of a given length.

Recursive heuristic for generating the base solution
The recursive heuristic is used to obtain the base solution which is used in the evolutionary algorithm to initialize the population.The initial population consists of the base solution and its mutated copies.The recursive heuristic implemented in the BaseSolution procedure (Algorithm 1) generates the base solution by adding schedules for n retailers 123 Algorithm 2: The AddRetailerToSolution procedure.
Inputs: I -An IRP instance defined as in the equation ( 7).x -A partial solution (a sequence of H incomplete routes).V -Vehicle loads on H days of the planning horizon calculated for the incomplete routes in x. π -A random permutation which determines the order in which to add retailers to the solution.i -The retailer to add (an index used to select an element from π ). Output: A full solution (a sequence of H complete routes).
// The retailer to add selected from the permutation π s // The day on which the initial inventory runs out for retailer s. // Deliveries are only needed if the initial inventory is not enough.if d 0 < H then // Generate all possible delivery schedules for the retailer s. // Each schedule is a set of days on which the retailer s is visited.
// Add the retailer s to the route for day t.q := U s − q // The quantity delivered on day t.q := q + q // The inventory level set to U s .
// Increase the vehicle load on day t.Note, that schedules // generated by the GeneratePossibleSchedules procedure // never exceed the vehicle capacity.
q := q − r s // Recurse to add the next retailer.
x := AddRetailerToSolution(I, x , V , π , i + 1) if x = null then return x else // Retailer s is the last one -just add the first schedule generated for s. in a random order using n recursive calls to the AddRetail-erToSolution procedure (Algorithm 2) which is started with an empty solution and a random permutation which determines the order in which schedules for retailers are added.
In the AddRetailerToSolution procedure the GeneratePossi-bleSchedules procedure (Algorithm 3) is used for recursively generating all possible delivery schedules for a given retailer which do not cause the vehicle to be overloaded.

Optimization algorithm
In this paper, a hybrid optimization algorithm is used which combines evolutionary optimization of delivery schedules with optimization of routes using the Concorde TSP Solver (William 2020).The algorithm uses feasible crossover and mutation operators described by Michalak (2021a) which ensure that only feasible solutions are generated.The genetic Algorithm 3: The GeneratePossibleSchedules procedure.
// Schedules with day t can be added if there is no stockout at the supplier and vehicle capacity is not exceeded on day t.q := I.U s − i s // The quantity required to make the inventory full at retailer s.
// The result is the set of schedules generated recursively with and without day t.return J operators used in this paper are the following (for details readers are referred to the paper by Michalak (2021a)): Feasible Retailer Schedules Crossover which works by uniformly mixing delivery schedules for individual retailers obtained from two parent solutions.
Feasible Date Change Mutation which considers retailers in a random order selecting, for each retailer, one date when it is visited, and moving this retailer to some other feasible date.
Feasible Add Or Remove Retailer Mutation which randomly applies one feasible change to the mutated solution: an addition or removal of a retailer.
Because the population initialization procedure generates only feasible solutions and genetic operators preserve the feasibility, no mechanisms for handling infeasible solutions, such as repair operators, are needed in the proposed algorithm.The most important characteristics of the algorithm used in this paper are: -Population initialized by generating the base solution using a recursive heuristic described in Sect.The working of the optimization algorithm is shown in Algorithm 4.

Knowledge import
The knowledge import improves the best solution found so far by the evolutionary algorithm x best by incorporating information from an external knowledge source, which can be user interaction or a known good solution to the solved problem instance.It is performed by changing schedules (days on which the retailers are visited) for individual retailers in the x best solution to those provided by the knowledge source.The knowledge import is implemented as the ImportKnowledge procedure (Algorithm 5), which performs N att attempts to improve the x best solution by: 1. Selecting one of the retailers s * ∈ M randomly without repetitions.2. Getting a schedule for the retailer s * from the external knowledge source.This step is performed by the GetKnownDays procedure called in Algorithm 5, which either copies the schedule for the retailer s * from a known good solution or consults the user and asks for a proposed schedule for that retailer.3. Copying the schedules for all the retailers in M\{s * } from the solution x best .If adding a schedule from x best for a certain retailer causes the solution to be infeasible the schedule for this retailer is generated using the Sup-plyAtTheLatestDate heuristic instead.
From the N att attempts to improve the solution the best one x * is selected (in Algorithm 5) and, if it is better than x best , it replaces the worst solution in the EA population (in Algorithm 4).This approach is adopted in order to protect the diversity of the population.Because both x * and x best are kept, both the information generated by the evolutionary optimizer and coming from the external knowledge source can be used to generate new solutions in the evolving population.The worst solution in the population, which is replaced by x * , would probably be removed by the selection mechanism anyway, so it can be assumed that no valuable information is lost because of this replacement.

Experiments and results
The experiments presented in this paper were aimed at determining if importing knowledge from an external source can be beneficial for the optimization process.It was assumed that the knowledge import can be performed using a known good solution to the problem (e.g.established by practitioners in the field) or by user interaction.In both cases the import is performed by modifying schedules for some retailers in the best solution found by the evolutionary algorithm.This modification can be performed by the user or by setting the days on which these retailers are visited to those which are assigned to these retailers in a known good solution.
In the experiments, the parameters of the evolutionary algorithm were tuned (Sect.4.1), the running times of the evolutionary algorithm solving the hardest IRP instance obtained in the paper by Michalak (2021b) were analyzed and compared to the running times of the CPLEX solver (Sect.4.2), and the effectiveness of the proposed knowledge import method was tested (Sect.4.3).In Sect.4.4 the proposed method was compared to metaheuristics used in the literature to solve the IRP and to the mathematical programming model proposed in the paper by Archetti et al. (2007) in which the Archetti 2007 LC and Archetti 2007 HC IRP instances were provided.

Parameter tuning
In this paper, an evolutionary algorithm described in Sect. 3 is used for solving the IRP.This algorithm is parameterized by setting the population size N pop , crossover probability P cross , and mutation probability P mut .In order to ensure the best performance of the optimization algorithm, these parameters were tuned using the grid search approach.The values of the parameters were selected from the sets of candidate values: N pop ∈ {50, 100, 200, 500}, P cross ∈ {0.5, 0.6, 0.7, 0.8, 0.9, 1.0}, P mut ∈ {0.02, 0.04, 0.06, 0.08, 0.10}.Apart from these three numerical parameters, the Reduce procedure used in Algorithm 4 can work in one of four modes: -Elitism-The best solutions are selected from the union P ∪ P of the current population P and the offspring P .-Fitness-Solutions are selected from the union P ∪ P of the current population P and the offspring P using the roulette-wheel selection with the probabilities proportional to the fitness.-New-Only the offspring P are kept.
-Rank-Solutions are selected from the union P ∪ P of the current population P and the offspring P using roulette-wheel selection with the probabilities proportional to the rank calculated with respect to the increasing fitness.The solution with the worst (lowest) fitness gets a rank of 1, and the solution with the best (highest) fitness gets a rank of |P ∪ P |, where |P ∪ P | denotes the number of solutions in the current population P and the offspring P population taken together.In the case of two solutions with the same fitness, two consecutive ranks are assigned at random, so one solution gets a rank r and the other one r +1.After the ranks are assigned, the roulette- wheel selection is performed, in which each solution has the probability of being selected proportional to its rank.
For each reduction mode and a triple of the parameters N pop , P cross , and P mut , 10 runs of the test were performed using the IRP instance hard4n35 constructed in the paper by Michalak (2021b).This instance was selected because it was the hardest one found in that paper, that is, it required the longest solving time using the CPLEX solver.The parameters for which the best (lowest) average cost of 5049.338 was obtained were selected.The best settings were: the Elitism reduction mode, N pop = 200, P cross = 0.8, P mut = 0.1.

Running times
The solving time using the CPLEX solver for the IRP instance hard4n35 reported in the paper by Michalak (2021b) was 354681.28s (more than 4 days) on a machine with the Intel Xeon E5-2670v3 (Haswell) CPUs running at 2.3 GHz.The optimal cost attained within this running time was 5043.55.During the parameter tuning phase of the experiments presented in Sect.4.1, 480 runs of the evolutionary algorithm were performed (4 reduction modes × 4 population sizes × 6 crossover probabilities × 5 mutation rates).Parameters of the distribution of the running times observed in these 480 runs on a machine with the same specification (i.e. the Intel Xeon E5-2670v3 (Haswell) CPUs running at 2.3 GHz) are presented in Table 1.In Fig. 2 the histogram of the running times is presented.Clearly, the running times of the optimization algorithm studied in this paper are competitive to those of the CPLEX solver.The maximum running time for the EA is less than 7% of the running time for CPLEX (23448.0s).The majority of timings is much lower than that.

Experiments with knowledge import
The knowledge import mechanism was tested using the optimization algorithm presented in Algorithm 4 with the parameters tuned in Sect.4.1.The number of knowledge imports was set to N imp = 0, 1, 2, 5, and 10.The number of attempts to improve the solution in each knowledge import was set to N att = 10 and N att = I.n.The constant value N att = 10 is a reasonable choice for a user-interactive The experiments were performed using three sets of IRP instances: -Archetti 2007 LC-IRP instances with the planning horizon H = 6 and low inventory costs proposed in the paper by Archetti et al. (2007).This set of IRP instances contains 5 instances for each number of retailers n = 5, 10, 15, 20, 25, and 30.-Archetti 2007 HC-IRP instances with the planning horizon H = 6 and high inventory costs proposed in the paper by Archetti et al. (2007).This set of IRP instances contains 5 instances for each number of retailers n = 5, 10, 15, 20, 25, and 30.-Hard-IRP instances with the planning horizon H = 3 evolved in the paper by Michalak (2021b) from the Archetti 2007 LC instances with the planning horizon H = 3.This set of IRP instances contains 5 instances for each number of retailers n = 10, 15, 20, 25, 30, and 35.These instances were generated using an evolutionary algorithm in such a way that the solving time with stateof-the-art mathematical problem solvers was very long.The solving times using the CPLEX solver on a machine For each problem instance and the pair of parameters N imp and N att the optimization algorithm was run 200 times.In order to ensure repeatability of the experiments, the schedules for retailers were imported from optimal solutions to these problem instances found by the CPLEX solver instead of user interactions.A summary of the results is presented in Table 2.In order to test if the knowledge import affected the optimization results, the Friedman test was applied to the results obtained for each dataset and each value of N att with groups ("treatments") representing different values of N imp .The null hypothesis of this test is that the medians of the optimization results are equal across all groups (values of N imp ).The results of the Friedman test are presented in Table 3.Based on low p-values obtained in this test it can be concluded that there are statistically significant differences of optimization results among different values of N imp .Detailed results are presented in Tables 4, 5, 6, 7, 8, 9 in which median values obtained in 200 runs of the optimization algorithm are shown.The column N imp = 0 contains results obtained without using the knowledge import, and columns with N imp > 0 contain results obtained when a given number of knowledge imports was performed.Apart from numerical results, these tables present the results of a statistical comparison performed using the Wilcoxon statistical test (Rey and Neuhäuser 2011) in which the null hypothesis states the equality of medians.The sign '(+)' is used to mark those results obtained for N imp > 0 which are significantly better than the result obtained for the same IRP instance without using the knowledge import at the significance level α = 0.05.The sign '(=)' is used to mark those results for which the Wilcoxon test produced a p-value larger than 0.05.The sign '(−)' is used to mark those results obtained for N imp > 0 which are significantly worse than the result obtained for the same IRP instance without using the knowledge import at the significance level α = 0.05.The From the obtained results the following conclusions can be drawn: -As shown in Table 2, the knowledge import approach managed to improve the results in about 50% of tested cases (IRP instances and values of N imp and N att ).-The knowledge import approach is not likely to deteriorate the optimization results.From 720 tested cases only in 4 cases a deterioration of the results was observed (Table 2).-The Hard dataset contains instances for which it is indeed hard to improve the optimization results.Nevertheless, the knowledge import never deteriorated the results for this dataset, and was able to improve them in 16% of cases for N att = 10 and in about 22% of cases for N att = I.n (Table 2).-For the Archetti 2007 instances, increasing N imp does not seem to influence the optimization results (Tables 4, 5, 6, 7.For these instances increasing N att makes a small change in the results: the number of better results increased by two between Tables 4 and 5, and between Tables 6 and 7. -On the contrary, the number of better results obtained for the Hard instances increases with increasing N imp and also increases for N att = I.n compared to N att = 10 (Tables 8 and 9).

Comparison to other optimization methods
Experiments described in this section were aimed at comparing the evolutionary algorithm with knowledge import proposed in this paper to other methods used in the literature to solve the IRP.The methods studied in this section were as follows: Ant Colony Optimization (ACO) proposed in the paper by Tatsis et al. (2013) for a multi-vehicle IRP, which can easily be adapted to the single-vehicle problem studied in this paper.In the aforementioned paper, a variable p it denotes the position of retailer i ∈ M in the route used on day t ∈ {1, . . ., H }. The ACO algorithm holds pheromone val-ues τ it (l) for l ∈ 0, 1, . . ., n which are used for calculating the probability that p it is set to the value l: These probabilities are used to either exclude the retailer from the route (if p it = 0), or to choose the position l = p it for retailer i in the route for day t (if p it = 0).For the details of the ACO method readers are referred to the paper by Tatsis et al. (2013).Because preliminary experiments shown that the ACO method produced mostly infeasible solutions for the problem instances studied in this paper, a repair method was added consisting of the following two steps: 1.For t ∈ {1, . . ., H }, while the vehicle is overloaded (that is, the capacity C is exceeded), select randomly one retailer from the route for day t and remove it.2. For t ∈ {1, . . ., H }, check all retailers in a random order without repetitions and if there is a stockout at retailer i replan the schedule for this retailer using the Supply at the Latest Date (SLD) heuristic (Lipinski andMichalak 2018, 2019;Michalak 2021a).
Mathematical programming model (CPLEX) proposed in the paper by Archetti et al. (2007) in which the IRP instances Archetti 2007 LC andArchetti 2007 HC were provided.This model was used for solving the IRP using the CPLEX solver.
Population-Based Simulated Annealing (PBSA) used for solving the IRP, among others, in the paper by Shaabani and Kamalabadi (2016).In order to ensure the feasibility of solutions in the population, the population was initialized in the same way as for the evolutionary algorithm-by generating the base solution using a recursive heuristic described in Sect.3.2 and obtaining mutated copies using the Feasible Date Change Mutation operator.New solutions in subsequent generations were generated using the Feasible Date Change Mutation and Feasible Add Or Remove Retailer Mutation operators.
Population-Based Tabu Search (PBTS).The Tabu Search algorithm was used for solving the IRP, among others, in papers by Alinaghian et al. (2021) and by Maghfiroh and Redi (2022).Similarly as for PBSA, the population was initialized by generating the base solution using a recursive heuristic described in Sect.3.2 and obtaining mutated copies using the Feasible Date Change Mutation operator.In subsequent generations, new solutions were generated using the Feasible Date Change Mutation and Feasible Add Or Remove Retailer Mutation operators.
Particle Swarm Optimization (PSO).The PSO method was used for solving the IRP, among others, in papers by Mousavi et al. (2014) and by Wang et al. (2019).Solutions in this PSO algorithm were encoded as real vectors of length H * n with n elements for each day t ∈ {1, . . ., H }. The route for day t was obtained from a solution x ∈ R H * n by adding retailers to the route in the descending order of the values of the elements x[(t − 1) * n + 1], . . ., x[(t − 1) * n + n] until no more retailers could be added because of the vehicle capacity constraint.The same repair procedure as for ACO was used, because without it the PSO method generated mostly infeasible solutions.
Parameter tuning The ACO, PBSA, PBTS and PSO methods require setting some parameters, which were tuned using the same approach as described in Sect.4.1.Because the tests involved algorithms based on different principles, the running time limit of max t = 3600 s was used as the stopping -For all algorithms: the population size N pop ∈ {50, 100, 200, 500} -For ACO: the pheromone evaporation rate R ev ∈ {0.001, 0.002, 0.005, 0.010}, which controls how fast the algorithm "forgets" the previous state, the pheromone increment τ ∈ {0.01, 0.02, 0.05, 0.10, 0.20}, which controls how fast the algorithm incorporates new information from good solutions, and the pheromone restart frequency ν re ∈ {0.02, 0.05, 0.10, 0.20, 0.50}, which controls how often the pheromones are reset to initial values (with the value ν re = 0.1, for example, meaning that the reset occurs every 10% of the algorithm running time).-For PBSA: the selection scheme used when selecting parents for generating new solutions.In the Same scheme, each solution x i , i = 1, . . ., N pop in the existing population is used as a parent for generating a new solution using mutation and if the new solution is better than x i , then x i is replaced with the new one.In the Random scheme when trying to generate a new solution to replace x i , the parent solution is selected randomly from the whole population, which helps disseminating information about good solutions in the population.-For PBTS: the selection scheme used when selecting parents for generating new solutions, which can be Random or Same similarly as for PBSA, the number of candidate solutions generated in each algorithm iteration N new ∈ {10, 20, 50, 100}, and the tabu list length N tabu ∈ {100, 200, 500, 1000, 2000, 5000, 10000}.-For PSO: the inertia weight w ∈ {0.2, 0.4, 0.6, 0.8, 1.0}, the cognitive coefficient φ p ∈ {1.0, 1.5, 2.0, 2.5, 3.0}, and the social coefficient φ g ∈ {1.0, 1.5, 2.0, 2.5, 3.0}.
The best parameter settings for each method were determined to be: -For ACO: the population size N pop = 500, the pheromone evaporation rate R ev = 0.002, the pheromone increment τ = 0.10, and the pheromone restart frequency ν re = 0.05.-For PBSA: the population size N pop = 200, and the Random parent selection scheme -For PBTS: the population size N pop = 200, the Random parent selection scheme, the number of candidate solutions generated in each algorithm iteration N new = 20, and the tabu list length N tabu = 200.-For PSO: the population size N pop = 50, the inertia weight w = 0.8, the cognitive coefficient φ p = 1.0, and the social coefficient φ g = 1.5.The ACO, CPLEX, PBSA, PBTS and PSO methods were compared to the evolutionary algorithm with knowledge import proposed in this paper with the parameters obtained in Sect.4.1: the Elitism reduction mode, N pop = 200, P cross = 0.8, P mut = 0.1.The number of knowledge imports was set to N imp = 1 and the number of attempts to improve the solution in each knowledge import set to N att = 10.
Method comparison Tested methods, with the parameter settings listed above, were compared by running each method 30 times for each IRP instance used in this paper: 30 Archetti 2007 LC instances, 30 Archetti 2007 HC instances, and 30 Hard instances.In order to test if the choice of the optimization method affected the quality of the solutions, the Friedman test was applied to the results obtained for each dataset and each optimization method.The null hypothesis of this test is that the medians of the optimization results are equal across all groups (optimization methods).The results of the Friedman test are presented in Table 10.Based on low p-values obtained in this test it can be concluded that there are statistically significant differences of solution quality among different optimization methods.Detailed results are presented in Tables 11, 12, 13 in which median values obtained in 30 runs of the optimization methods are shown.
In the column Knowledge-based EA, the results attained by the method proposed in this paper are presented.The following columns contain the results attained by the com-

Conclusion
In this paper, evolutionary optimization using an external knowledge source was proposed for the Inventory Routing Problem (IRP).During optimization, the algorithm performs N imp knowledge imports which modify schedules for individual retailers in the best solution x best found so far by the evolutionary algorithm.In each import, at most N att attempts to improve the solution are performed.In each attempt, one retailer is randomly selected (without repetitions) and the schedule for this retailer is imported to x best from the knowledge source.From the N att attempts to improve the solution the best one x * is selected and if it is better than x best it replaces the worst solution in the EA population.The knowledge can be imported from either a known good solution to the optimization problem, or by interacting with the user.In the former case the schedules for the selected retailers are copied from the known good solution, and in the latter case the user is expected to modify these schedules according to his/her expertise in the problem domain.
In the experiments, in which the evolutionary algorithm using the knowledge import mechanism was compared to an algorithm not using this mechanism, it was observed that the knowledge import mechanism was not likely to deteriorate the results (a worse result was only observed in 4 cases out of 720).In about 50% of tested cases the knowledge import improved the optimization results and in the remaining cases the results were not statistically different for the algorithm with and without the knowledge import.For the Hard IRP instances proposed in the paper by Michalak (2021b) the knowledge import mechanism never deteriorated the results, and it was able to improve the results in 16-22% of cases (depending on the number of attempts to improve the solution in each knowledge import N att ).These IRP instances seem to be indeed very hard to tackle.For the IRP instances proposed in the paper by Archetti et al. (2007) the knowledge import mechanism improved the results in about 66% of cases.For the Archetti 2007 LC dataset this rate of improvement was observed for N imp = 1 and N att = 10.This means, that if the user can be expected to provide 10 good schedules for individual retailers in one interactive session during the entire algorithm's runtime the optimization results can be improved in about 66% of cases.Therefore, the workload of the user should not be too high, and the proposed approach appears to be suitable for designing user-interactive optimization methods for the IRP.
In the experiments, in which the knowledge-based EA was compared to other optimization methods (ACO, PBSA, PBTS, PSO, and a mathematical programming model solved using CPLEX), it outperformed all the metaheuristics.The CPLEX-based method performed better than the knowledgebased EA on the Archetti 2007 LC and Archetti 2007 HC datasets.On the other hand, on the Hard instances the knowledge-based EA performed better than all comparison methods including the mathematical programming model.
To sum up, the proposed knowledge import mechanism can be used to improve optimization results by either importing knowledge from a known good solution to the IRP instance, or by interacting with the user, who can modify the best solution found by the optimization algorithm.This aspect of the proposed method can be used to the benefit of practitioners, who have the required know-how to improve the delivery schedules or want to follow good practices established at their company.Another advantage of the proposed method is that it is able to improve the results for very hard IRP instances.In the experiments, it was found to be the bestperforming optimization method for the Hard instances.On the other hand, it turned out not to be competitive to the mathematical programming model on easier IRP instances, but this is an observation true for all the tested metaheuristics.
A certain limitation of the presented work is that it did not address more complex variants of the IRP, but, given the large variety of inventory routing problems studied in the literature, each version would require a separate extensive study and discussion of the results in a dedicated paper.Therefore, adapting the proposed method to various types of the IRP used in practical applications was left for future work.

Fig. 2
Fig.2The histogram of the running times observed in the parametertuning phase of the experiments The optimization of the route of the vehicle on each day is done by the Concorde solver.

Table 1
Parameters

Table 3
Results of the Friedman statistical test with groups ("treatments") representing different values of N imp number of the outcomes of the statistical test is presented at the bottom of each table and is summarized in Table2.

Table 4
Results for the Archetti 2007 LC dataset and N att = 10

Table 5
Results for the Archetti 2007 LC dataset and N att = I.n

Table 6
Results for the Archetti 2007 HC dataset and N att = 10

Table 7
Results for the Archetti 2007 HC dataset and N att = I.n

Table 8
Results for the Hard dataset and N att = 10

Table 9
Results for the Hard dataset and N att = I.n

Table 10
Results of the Friedman statistical test with groups ("treatments") representing different optimization methods IRP instance using the knowledge-based EA at the significance level α = 0.05.The sign '(=)' is used to mark those results for which the Wilcoxon test produced a p-value larger than 0.05.The sign '(−)' is used to mark those results obtained by comparison methods which are significantly worse than the result obtained for the same IRP instance using the knowledge-based EA at the significance level α = 0.05.The footer of each table shows the number of times each comparison method attained a worse, equal, or better result in comparison to the knowledge-based EA.From the obtained results the following conclusions can be drawn:

Table 11
Method comparison for the Archetti 2007 LC dataset

Table 12
Method comparison for the Archetti 2007 HC dataset Archetti 2007 LC dataset by PSO, in 2 of 30 tests on the Archetti 2007 HC dataset by PSO again, and 1 of 30 tests on the Hard dataset by PBTS.-For the Archetti 2007 LC and Archetti 2007 HC datasets, all metaheuristic algorithms were outperformed by the mathematical programming model (CPLEX), which attained the best result in 23 of 30 tests (in the case of both datasets).This is not surprising, because this mathematical model was proposed in the same paper by Archetti et al. (2007) in which these IRP instances were provided, so it is undoubtedly well suited for them.-For the Hard dataset the knowledge-based EA proposed in this paper is the best optimization algorithm of all the tested methods.It outperformed the mathematical programming model (CPLEX) in 24 of 30 tests with the remaining 6 resulting in a draw.The CPLEX method never attained a result better than the knowledge-based EA on the Hard dataset.ACO, PBSA,

Table 13
Method comparison for the Hard dataset PBTS performed poorly in comparison to the proposed knowledge-based EA attaining worse results in at least 24 of 30 tests.PSO attained worse results in 15 of 30 tests, while in the remaining 15 tests it attained results equal to the knowledge-based EA.However, similarly as ACO, CPLEX, and PBSA it never attained results better than the knowledge-based EA.The only method that outperformed the knowledge-based EA on the Hard dataset was PBTS, and it was in just 1 of 30 tests. and