Inventory Policy and Heuristic for Long-Term Multi-product Perishable Inventory Routing Problem with Static Demand

This work considers a long-term Perishable Inventory Routing Problem with multiple products, static demand, and single vehicle, in the setting of Vendor Managed Inventory. By analyzing the optimal solutions of long-term cases that can be solved in Python+Gurobi within 2 h, we capture some patterns of optimal solutions. Utilizing these patterns, experiments show that under certain conditions, the mathematical models of multi-product problems could be simplified to single-product problems, which have the same optimal solutions while taking less time to solve. Managerial insights were generated that for products with static demand in the long term, delivery should be arranged at the store level rather than at the product level. Products in the same store should have the same delivery pattern, no matter how different the unit holding costs are. By further analyzing the optimal solutions of the simplified models, we find that optimal value will stabilize in the long term, and the optimal solution is very close to the solution point where total inventory holding cost and transportation cost are close. Based on these findings, we have developed a heuristic that always provides optimal or close-to-optimal solutions with far less computational time, compared with Python+Gurobi.


Introduction
Inventory Routing Problems (IRPs) are complex logistic problems in which two hard problems are integrated into a unified framework: Inventory Management and Vehicle Routing [1].
IRPs usually arise from a setting of vendor-managed-inventory (VMI) system, in contrast to retailer-managed-inventory (RMI) system. As the name indicates, in the system of RMI, retailers decide independently when and how much to order. As a consequence, without exchanging information properly, the retailers' decision strongly limits the performance of a whole supply chain, in terms of transportation cost and inventory-related cost [2]. In a VMI system, a supplier, having the full information of inventory and demand of various retailers, is the central decision-maker, instead of the retailers [3]. The cost reduction and collaborative benefits of VMI have been proved by many scholars and retail chains including Wal-Mart [4,5].
In general, IRPs model the delivery policy of a supplier to a set of retail stores over a short-or long-term planning period with a single product coming from a single plant or facility, referred to as the vendor. Two main decisions are made during this process: 1. how much to replenish to each store to minimize waste and holding cost; 2. which routes to choose to minimize transportation cost [6,7].
The structure of this work is as follows. In Sect. 2, we briefly explain the literature and our contributions. In Sect. 3, we introduce the problem, notations, and mathematical models. In Sect. 4, we present our instances and some analysis of the optimal solutions. Based on the analysis in Sect. 4, Sect. 5 elaborates the development of an equivalent simplified model and experiment results of 2 models. This is followed by detailed description of our heuristic in Sect. 6. The computational results are presented in Sect. 7, including comparing solutions and computation time of our heuristic and commercial solver Python+Gurobi. In the final section, Sect. 8, we conclude our findings.

Literature Review
IRP has a history of about 30 years. It was first introduced by [8] by integrating decisions on inventory, vehicle scheduling for the delivery of chemical products. More comprehensive and detailed reviews can be found in [9] and [10], which are very good reviews of IRP.
Perishable Inventory Routing Problems (PIRP) was first motivated by a healthcare application [11,12] due to the human blood perishability. Later on, the applicable range gradually extends to more general context where the perishable items could be food or medical drugs.
Generally speaking, the work on PIRP is scarce [13]. Several variants have been proposed. Various topics includes economic issue, environmental issue [14] etc. Among these topics, the economic issue continues to attract most attention, which is minimizing cost. Environmental issue is also popular, which considers carbon emission. According to model assumptions, PIRPs can be further divided into models of deter-ministic or stochastic demand [15,16], trans-shipments (delivery between stores) [17] or direct delivery (delivery from depot to stores only), homogeneous vehicle or heterogeneous vehicle (can involve different capacity, unit transportation cost and speed etc) [18], multiple [16] or single products. Since it is not possible to list all of the papers, we cite some recent papers. For more detailed review, interested readers are referred to reference [19], which is a comprehensive review.
After years of development, more complicated models have been formulated. However, these models differ in small details. Most of them are formulated as or found to be equivalent to mixed-integer linear program (MILP). The methods adopted in those models are usually based on exact methods such as Branch-and-Cut [1,20]) or metaheuristic such as Genetic Algorithm (GA) [21,22], Simulated Annealing (SA) [22], Tabu-Search [23] etc. Some also rely on commercial solver such as Cplex [14,16], Gurobi [15] to solve their models. When it comes to assumptions, majority of problems are single-product PIRP with homogeneous fleet, deterministic demand and finite shelf life.
PIRP with multi-product is a variant that is even less studied in PIRP because of its complexity brought by the various products and inventory deterioration due to different limited shelf-lives [12,16].
Reference [9] reports that instances of a single-product IRP with deterministic demands can rarely be solved to optimality when the number of stores exceeds 30. Considering product perish-ability makes the intrinsic-complicated IRP even more complicated as the combinatorial complexity grows exponentially with the increase of both product variety and store numbers [24].
However, the need for research in PIRP with multiple products is not neglectable. First of all, waste of perishable products requires more attention. Perishable products constitute over 52% of the sales revenue of grocery retail chains [25]. Nevertheless, it is reported that food lost after harvest and at the transport, storage, and processing stages stands at 13.8% globally, which amounts to over 400 billion USD per year [26]. In 2019, the profit margin of food retail barely reaches 1% after-tax [27]. Secondly, the solution and method available for single-product PIRP are not suitable for real-life retail chains [28]. A small retail chain could dispatch thousands of (perishable and dry) products from a central warehouse to hundreds of stores.
Due to the complexity of real-world problems and the NP-hard nature of PIRP, demand arises for more efficient algorithms to find optimal or near optimal solutions. Perishable inventory is difficult to manage. However, analysis of optimal replenishment policies remains a topic of great interest in the area of pure inventory management. For example, references [15] and [29] explores optimal base-stock level, while references [30] and [31] develop approximation algorithms for perishable inventory systems. Due to the complex nature of algorithms for finding the solutions, papers of PIRP tend to focus less on analysis but more on modelling and computation. Our work differs from previous PIRP papers in the way that we obtain our close-to-optimal solutions by analyzing the relationship between parameter values and optimal solutions.
In our problem, the relationship is found through both extensive experiments and analyzing the trade-off of the 3 costs in our objective function, which are transportation cost, inventory holding cost and waste cost. To the best of our knowledge, our work is the first to study the long-term multi-product PIRP with static demand and provide solutions based on parameter-solution relationships. Since our problem has never been studied before, we base our model on the famous single-product single-period Miller-Tucker-Zemlin (MTZ) formulation of Travelling Salesman Problem (TSP) [32], and modify the model to our need.
This work contributes to the literature in the following 4 ways: (a). We study the PIRP of multiple perishable products with stable demand in the long term; (b). We build a single-product PIRP model that provides the same solution and objective function value as the multi-product PIRP model; (c). We find that the optimal solutions, including the quantity to deliver to each store in each period, and the transportation routes, could be found near the solution point whose total inventory holding cost and transportation cost are close to each other; and (d). We develop a heuristic that can provide the optimal or close-to-optimal solution for larger-scale problems with far less computing time than Gurobi+Python.

Problem Description
The problem studied here is a long-term Inventory Routing Problem with multiple perishable products, single-vehicle, and static demand. In the following subsections we introduce our assumptions, notations and then mathematical models.

Assumptions and Notations
The mathematical models are developed based on the following assumptions: 1. One warehouse supplies multiple perishable products to multiple stores with an unlimited supply. For the same product, different stores have different demands. 2. The unit holding costs are different for different products but are the same for the same product across stores. Different products have a different maximum shelf life, which is the same for the same product across stores. 3. FIFO policy: In each store, inventory is consumed in a first-in-first-out (FIFO) manner, which means the oldest inventory will be consumed first then the fresher ones. FIFO policy is one of the most effective policies to reduce waste and increase revenue [25,33], which is widely adopted by supermarkets. 4. Symmetric distance matrix: The distance from any store i to store j equals the distance from store j to store i. 5. For any route, a truck always departs from the warehouse and returns to the warehouse at the end of a period. Each store can receive 1 delivery only in a period. 6. Single vehicle and static demand: In real life, 1 truck driver is usually responsible for a certain area and this area tends not to change or change very slowly in terms of population size and lifestyle. This also implies that, without intervention of special events, the demand of product in each store tends to be stable in the long term, especially for daily necessities like food. 7. No demand shortage is allowed in each period for each product at each store. 8. The vehicle has infinite capacity. There are 3 reasons: (a). Perishable products require frequent delivery and small delivery quantity to control waste; (b). The constraint that truck returns to depot at the end of the period (usually a day) limits the number of stores visited; (c). The vehicle capacity is much larger than the total delivery quantity per route.
The notations used in this paper are presented in Table 1. Indices i, j Notation of node, i, j ∈ {1, 2, · · · , I }. I = J . Node 1 (i = 1) is the depot and I is the total number of stores and depot.

Mathematical Formulations of Multi-product PIRP with Static Demand and FIFO
p Notation of product, p ∈ {1, 2, · · · , P}. P is the total number of products.
lp Notation of remaining shelf life of product p, lp ∈ 1, 2, 3 · · · , L p + 1 . Product with lp = 1 will be discarded at the beginning of the period. L p is the shelf life of product p.
t Notation of a period, normally in the unit of day, in a planning horizon, t ∈ {1, 2, · · · , T } and T is the total number of periods in the planning horizon.

Parameters
uT Unit transportation cost, unit: pound per km.

TC
Truck Capacity of the vehicle, a very large number.
T i j Transportation distance between store i and store j, in which store 1 is the depot, unit: km.
uH p Unit inventory holding cost of product p, unit: pound per piece per period.

W p
Unit wastage cost of product p unit: pound per piece per period.

D i p
The demand of product p in store i, unit: piece.

M
A large number used in if-or constraints.
Decision variables x i jt Binary: 1, when truck departs from store i to store j at period t; 0, otherwise.

Q i pt
Non-negative real: quantity of product p to deliver to store i in period t.
u it Real: quantity left in the vehicle after visiting store i in period t.
Inv i pt,lp Non-negative real: inventory level of product p at store i and with remaining shelf life lp, at the beginning of period t. Inv i pt,1 is the inventory that should be discarded at the beginning of each period; Inv i pt,L p +1 is the freshest inventory received at the beginning of each period.
The objective function (1) minimizes the total cost, which equals the transportation cost, plus inventory holding cost, plus perishable cost. Constraint (2) means the number of trucks that leave any node should not exceed 1. Constraint (3) means the number of trucks that enter any node should be equal to the number of trucks that leave the same node. Constraint (4) means the quantity delivered should not be negative for all product p, store i in all period t. Constraint (5) means the new delivery received at the beginning of any period t is freshest among all inventory, having the longest remaining shelf life. Constraint (6) means the quantity of product p that delivers to store i in period t should at least satisfy the demand of this period, for all products in all stores. Inv i pt,1 is not included in the on-hand inventory because they reach the end of their shelf life and are discarded at the beginning of the period. Constraint (7) means, if the total quantity delivered to store j in period t is 0, then store j will not be visited in period t; else the number of trucks that goes to store j should be 1. Constraint (8) is the sub-tour elimination and capacity constraints, based on the known MTZ constraints and modified to our multi-period PIRP background. The MTZ constraints have a polynomial cardinality, were first proposed for the single-period TSP [32] and subsequently extended in reference [34]. When x i jt = 0, the constraint is not binding; whereas x i jt = 1, the constraint imposes u it u jt + P p=1 Q j pt , which means the load left in the vehicle at store i after visiting should be no smaller than the load left in the vehicle at store j after visiting plus the quantity needed in store j. This constraint has been used extensively to model the basic Vehicle Routing Problem (VRP) and Capacitated Vehicle Routing Problem (CVRP). Constraint (9) means the initial inventory at the beginning of period t = 1 is 0 for all products and all stores. Constraint (10) means that according to FIFO policy, the oldest inventory, which is lp = 2, will be used to first satisfy the demand. The remnant will perish in the next period. Constraint (11) is similar to (10), meaning that the oldest inventory with shorter remaining shelf life will be used first, then the fresher batch. Constraint (12) enforces x i jt to be binary.

Solution Method
The model is coded in Python 3.8, and Python + Gurobi is used to compute the optimal solution. All computations are executed on a machine equipped with an Intel i7 with CPU 1.80 GHz and 8 GB of RAM, with the Windows 10 operating system.

Instances Generation
Randomly generated instances are used to explore the relationship between optimal solutions and the value of parameters. Instances vary in terms of the number and locations of stores, length of planning horizon, unit holding cost, unit transportation cost, and product shelf life. The details of our test-bed parameters is provided in Table 2.
This study aims to explore the optimal solution in the long run, which is T = ∞. However, due to the NP-hard nature of IRP [1], the problem is not solvable when T is infinity. As a result, optimal solutions with limited T are observed at this stage.
All the other parameters are introduced earlier in Sect. 3.1 except node coordinate. Node coordinates are the locations of stores and the depot. We randomly generate the locations first, and then calculate Euclidean distance between node i and node j based Their locations are also used to draw the optimal delivery routes per period for each instance.

Analysis of Results and Features of Quantity in Optimal Solution
Due to the NP-Hard nature of the problem and the limited computational power of the machine, it is found that some instances could not be solved to optimal within an acceptable time, taking up to more than 15 h, such as problem scale of 1 product, 7 customers and 10 periods, or 4 products, 7 customers and 8 periods, or 1 product, 20 customers and 2 periods.
In our experiments, we let the model run for maximum 2 h. The cases that can be solved within 2 h are referred to as 'solvable' cases for short in this paper, while the rest are referred to as 'not solvable' cases for short. However, in the early stage of the experiments, to test the limit of Python+Gurobi and observe optimal solution patterns, we let some cases to run for more than 10 h.
In this section, we try different combinations of parameters and list computational time and objective function value of 18 solvable cases. The cases that take longer than 2 h are marked with **. The rest of the larger-scale cases are not listed, for the optimal solution could not be attained, due to the program terminated after 2 h or memory running out.
Generally, the cases with longer planning horizon and more customers takes more time to solve. More specifically, cases with customer number exceeding 20 or planning horizon longer than 8 periods needs to be paired with other parameters of small values, otherwise they are less likely to be solved within 2 h.
In the process of analyzing optimal solutions, we notice the delivery interval plays a very important role. It is the lever that influences the trade-off of 3 costs (transportation, inventory holding and waste costs) in our objective function. When the optimal delivery interval for each store is found, the optimal delivery quantity is found. The rest of the job is scheduling the routes to optimal. Denote delivery interval of store i as Int i , total delivery quantity to store i as Q i , the optimal delivery quantity at the beginning of any delivery interval has the following 3 features: Feature 1 Int i min(L p ): Delivery intervals should not be longer than the shortest shelf-life of all products in store. The perishable nature of products requires that the delivery interval must be smaller than product shelf life, which is Int i L p , ∀i ∈ I , ∀ p ∈ P. This is equivalent to Int i min(L p ), ∀i ∈ I . The smallest shelf-life in a store limits its delivery interval.
Explanation If the delivery interval is longer than product shelf-life, constraint (6), which allows no demand shortage, will surely be violated. Let us assume the shelf-life of a product is 3 days. If the store is replenished every 4 days, we will face 1-day stock-out. No matter how much inventory we have at the beginning of the interval, that inventory will drop to 0 at the end of shelf-life period. If not replenished in time, the rest of the time, stores face stock-out situations.
Under the condition of Feature 1, when the initial inventory is 0 for all products in all stores as in our problem, optimal total quantity delivered to each store at the beginning of any delivery interval should always cover exactly the total demand during this interval; All products in the same store have the same delivery pattern, which means when optimized, situations will not happen that in the same store, some products receive 0 while other products receive positive number of delivery quantity.
Explanation The objective function consists of 3 costs, in which delivery quantity only influences 2 of them, inventory holding and waste costs. Transportation cost depends solely on delivery patterns, not on delivery quantity. Inventory holding cost depends on the inventory level at the beginning and the demand, which is represented as Suppose new delivery is received at the beginning of period t, the inventory level at the end of the interval period is Since no shortage is allowed in our problem, inventory level can not be negative and we have (5), we know that, during any interval period Int i , initial inventory should always satisfy the following: Objective function being minimizing cost, we will have: In equation (13), Inv i p,lp is the total inventory left at the end of previous period, and also the total inventory at the beginning of the current interval period before receiving new delivery. If not planned well, this part of inventory will be positive, which means store ordered too much for last interval, Q i p Int i × D i p ; nevertheless, when planned well enough, initial inventory at the beginning of any interval before receiving new delivery will be 0 for all products in all stores, we will have: Note that the value of Int i can be different for different stores. In the same store, when Int i is given, equation (14) holds for all products. At the end, the optimal total quantity delivered to store i: Q i = Int i × P p=1 D i p . An example of how different delivery quantities influence inventory level is provided in Fig. 1. In this example, demand is 2 pieces per period, delivery interval is 3 periods. We can see that inventory level of the yellow line is clearly not optimal. Store holds 2 more pieces than needed in first 3 periods. If quantity is optimized for all periods, we would have the blue line. Thus, no matter how delivery intervals change, when inventory is optimized, we will have no excess inventory, no shortage and no waste.
When the initial inventory is 0 for all products in all stores, the optimal delivery quantity for any product in any store should not exceed the demand in the minimum shelf life period of all products in the store.
Explanation Feature 3 is simply the combination of Features 1 and 2 when delivery quantity of previous period is optimized, or when initial inventory is 0. However, when Inequality (15) makes sure that stores do not order too much. Combined with no shortage constraint (6), inequality (15) forces stores to order at least once every minimum shelf-life period.
The above 3 features show that optimal solutions of multi-product PIRP could be presented at the store level. In other words, notations related to products can be discarded and multi-product problem can be simplified to single-product problem.
For better illustration of the above 3 features, optimal delivery quantity and routes of 2 cases both with 7 stores, 4 products, and 6-period planning horizon (I7P4T6) are presented in this section. Table 3 shows product-related parameter values such as demand in each store, shelf life and unit holding cost. The locations of depot and stores are provided in Table 4.
In these 2 cases, the parameter values are carefully controlled. We specifically make unit holding cost the same for different products in case1, and case2 with each product having a very different unit holding cost. All the other data are the same between the two cases.
The decision variable Q i pt , optimal quantity delivered to each store in each period is presented below in Table 5. For case1, I 1 and I 5 are replenished every period, while the rest of the stores are replenished every 2 periods. The optimal solutions of the remaining 4 periods are repetition of solutions for t = 1 and t = 2: Solution (t = 1) = Solution (t = 3) = Solution (t = 5), Solution (t = 2) = Solution (t = 4) =   The optimal routes of the 2 cases are presented in Fig. 2 graphically. The red square is the location of the depot, and the blue squares are the locations of stores.

Simplified Model Equivalent
Concluding from Sect. 4.2, the 3 features imply that optimal solution of multiproduct model could be presented without product-related indices, which inspire us to simplify the multi-product model to single-product model. To further verify the 3 features of optimal solutions found and utilize them to our benefit, a simplified model was formulated by dropping product and shelf-life-related indices. The updated parameters and simplified models are presented in the following subsections and in Table 6.
Comparing optimal solutions and objective function values, experiments show that the 2 models are equivalent, while solving the simplified model is less time-consuming.

Updated Indices and Parameters
There are 4 main differences between the updated parameter table and the original one after dropping the product-related indices p and its corresponding shelf life lp. The detailed update is presented in Table 6. 1. D i replacing P p=1 D i p : New demand is considered at the store level rather than product level, adding up the demand for different products in the same store.

Simplified Models
The main body of objective function and constraints are the same. There are 3 major differences between the simplified model and the original one: 1. Waste cost is eliminated from the objective function because the waste will always be 0. This is achieved by adding constraint (22) when minimizing the objective function and applying FIFO policy.
Constraint (22) is originated from inequality (15). It erases the part of waste caused by over-ordering. As explained in Sect. 4.2, combined with no shortage constraint (21), this constraint limits the delivery interval to be shorter than or equal to minimum shelf-life of all products in store.
The objective function of minimizing costs helps to control the inventory level to be just enough between any delivery interval, leaving no excess inventory for next delivery interval, as explained in Sect. 4.2, Feature 2.
By adding constraint (22) when minimizing the objective function, the waste can already be thoroughly eliminated. Objective function helps to minimize the inventory level during an interval, and constraint (22) helps to control the length of intervals. FIFO policy is not necessary anymore in our problem setting. Nevertheless, adopting FIFO in real life is still a good practice. It requires the inventory with shortest remaining shelf-life to be consumed first, which reduces possible waste due to mismanagement. Note that in constraint (22), the total initial inventory in store Inv it equals P p=1 L p lp=1 Inv i p,lp , while in inequality (15) (10) and (11) are now replaced with constraint (23), which means the total inventory at beginning of next period before receiving new delivery equals total inventory at the beginning of this period, plus new delivery, minus demand. 3. As discussed in Sect. 4.2, product-related parameters and variables are all deleted.
The multi-product model is now simplified to single-product model.

Computational Results and Conclusions of 2 models
Demonstrated by experiments, the 2 models are equivalent in terms of solutions and objective function value. The simplified model generally takes less time for Python+Gurobi to solve. We list detailed computational time and objective function value in Table 7. The 'Obj' in the table is short for 'Objective Function Value'. The cases of the original model that take longer than 2 h to solve are marked with **.
After the above verification, some conclusions can be drawn and managerial insights can be derived. Since inventory holding cost is usually proportionate to product price, the more expensive the product, the higher the unit holding cost. This is to say, for perishable products in the same store with stable demand and unlimited supply, if they could be delivered in the same cold-chain vehicle, then: 1. There is no need to hoard cheap products. The reason is that if other conditions remain unchanged, for any store, the transportation cost occurs and is fixed as long as delivery happens. Hoarding 1 product while trucks still deliver other products to the same store only increases inventory holding cost. The better way is to break down the bulk of large orders and fit them in the delivery schedule of other products. 2. There is no need to arrange frequent delivery for expensive products to reduce holding cost. The reason is similar to the previous one. The transportation of delivery to each store does not depend on the quantity, but on the distance only. Increasing delivery frequency for only 1 product while frequency for other products remains the same is not optimal. Inventory holding cost of other products can be further reduced by increasing the delivery frequency and reduce the quantity of delivery each time, which also evens the truck utilization rate for each delivery. 3. There is no need to track inventory age and consider product-age-related policies, such as FIFO policy, since all products can be sold and waste is eliminated.
In conclusion, when the delivery frequency for 1 product changes, the delivery frequency of other products should be adjusted accordingly. The best strategy is to sync the delivery for all products, provided the assumptions of this problem.

Heuristic for Long Term PIRP
Though the simplified model consumes less time than the original model by reducing some constraints and variables, long-term models are still not solvable, even for small-scale problems. In the earlier experiments, we know that the long-term delivery pattern is repeating the solutions of its shortest cycle (denoted as T * ).
The shortest cycle T * has the following 3 features: Feature 1 All conditions are the same for all cycles including initial inventory. The end inventory of current cycle is the initial inventory for next cycle, and optimal end inventory is always 0 (no left over). Therefore, in each cycle, the initial inventory and end inventory are 0 for all products in all stores. Feature 2 The periodic objective function value is the smallest. Feature 3 T * is the smallest value among all T s that satisfy Feature 1.
The shortest cycle T * is difficult to find and unknown. One way to find this T * is to compare objective function values of all solvable problems for each case, gradually increasing T from 1 period. Usually to observe an accurate T * , T should be at least 2 × T * , because we need to be sure that we have found the smallest periodic objective function value.
This method is quite time-consuming because unfortunately, not all T * s are small numbers. If intervals for all stores are the same, then T * equals to this interval; else, T * is the least common multiple of all stores replenish intervals. For example, considering 2 stores, the optimal intervals are 3 and 5 periods. The end inventory of 2 stores returns to 0 at the end of each 3 and 5 periods. The smallest period in which Feature 1 can be satisfied is period 15. The situation complexity can only increase with the number of stores. As we can imagine, even for some small cases, the optimal long-term solutions are not attainable, due to computational complexity.
To solve long-term problems, we propose a heuristic based on the relationships observed between parameters values and optimal solutions computed via Python+Gurobi. For small-scale problems for which the optimal solution is computable by using Python+Gurobi, the heuristic can provide optimal or close-to-optimal solutions with less computational time. For medium to large scale problems for which only 1-period optimal solution is computable, the heuristic can provide better solutions with reasonably less computational time. For larger-scale problems that 1-period problem could not be solved to optimal within 2 h, the heuristic is still able to provide good solutions.

Heuristic for Long-Term PIRP
For the long-term PIRP with static demand and single vehicle, due to those special assumptions, when the T is large enough, the periodic inventory and transportation cost will converge to a value, and the optimal solutions of multiple periods will be a repetition of solutions of a smaller T * . Take the I7P4T6 case1 we saw in Sect. 4 for example, the optimal solutions are the repetition of the first 2 periods', which is T * = 2. That is because the delivery intervals of different stores are 1 or 2 periods, T * = 2 is the smallest common multiple of 1 and 2. The heuristic approaches optimal solution by finding this T * , which comprises the optimal delivery interval of each store.
The overview of heuristic is shown in Fig. 3, which displays rough idea. As mentioned earlier, during the experiments, we notice delivery interval is the lever in the Fig. 3 Main idea of steps in heuristics trade-off of our PIRP model. The detailed steps are presented below, in which 2 steps are the cores: (a) how to calculate close-to-optimal global interval; (b) how to adjust interval for each store based on the global interval to approach to the optimal solution.
Step1 Calculate Global Delivery Interval, denoted as GInt, assuming all the customers have the same delivery pattern and before considering product perishability constraint. The same delivery pattern means, in any period t, either all stores are replenished, or no store is replenished at all. In other words, having the same delivery pattern means having both the same delivery interval and the same 1st delivery date. For example, if GInt = 3, all stores are replenished every 3 periods and on the same day.
-Step1-1 Solve TSP for all customers using Python +Gurobi, which is calculating the shortest distance visiting all customers in 1 route. Denote the optimal objective function value as TC, and save the optimal routes. The TSP mathematical model is presented in the following Sect. 6.2. -Step1-2 Calculate optimal global interval before considering product perish- The objective function could be written as follows, minimizing periodic transportation cost and inventory holding cost: GInt is the only variable. The optimal solution is found where the gradient concerning GInt is 0, which is − TC×uT Step2 Since the interval GInt can be non-integer, or greater than shelf life (infeasible), we then check if the interval (GInt) is feasible and integer.
Step3 Assume x is x rounded up to its nearest larger integer and x * is x rounded down to its nearest smaller integer. For example, if GInt = 3.5, then GInt * = 3, GInt = 4. In the following steps, based on the 2 nearest integers (GInt * , GInt ) of GInt, we fine-tune the intervals for each store and check if the solution can be improved. If GInt is not an integer, after Step3, we will have 2 delivery schedules, one is based on global interval GInt * (e.g., 3), another is based on global interval GInt (e.g., 4). These 2 delivery schedules are different. In final Step4, we pick the schedule with the smaller cost. However, if GInt is an integer itself (e.g., 2), we will have 1 schedule, whose global interval is itself (e.g., 2).
Step3-1.1 Check if total cost could be improved, when shortening delivery interval for each store. For each store i, we insert 1 delivery in the middle of the global delivery interval fGInt, without changing intervals of other stores. Let hfGInt = fGInt/2. After inserting, the original delivery interval is split into 2 intervals, which are hfGInt * and hfGInt . For example, if fGInt = 3, then hfGInt * = 1, hfGInt = 2. If fGInt = 4, the 2 new intervals are both 2.
As introduced in Sect. 4.2, in any interval with 0 initial inventory, the optimal inventory at the beginning of a period should equal to the total demand in this interval. Equation (28) calculates the difference in fGInt periods. Before insertion of one extra delivery, optimal original initial inventory is fGInt × D i , which can last for fGInt periods. After the insertion, we have hfGInt * × D i pieces of product for store i, which lasts for hfGInt * periods and hfGInt × D i which lasts hfGInt . The saved holding cost Hsave i is the difference between original holding cost (fGInt) 2 × uH i × D i /2 and new holding cost after the insertion [(hfGInt * ) 2 + (hfGInt ) 2 ] × uH i × D i /2. Equation (29) calculates the increased transportation cost Tincre i , which is the cost of vehicle leaving depot to store i and returning.
If Hsave i > Tincre i , insert one more delivery for store i. If the number of stores is greater than 1, solve TSP for these stores. Update total cost and delivery interval for all i. Go to Step3-1.2.
The reasons for inserting new delivery in the middle of the original interval for all stores are: (a) to maximize the reduction on inventory holding cost for each single store; (b) to further reduce transportation cost since delivery of multiple stores can be arranged on the same day in 1 route. In this way, the total cost will be minimized.
Step3-1.2 If 2 × fGInt > L, go to Step3-2; else, which is if 2 × fGInt L, check if the total cost could be improved for each store by increasing the delivery interval. For each store i, we increase the interval to be twice the global interval, the new interval newInt = 2 × fGInt. For example, if L = 7, fGInt = 3, then newInt = 6; if L = 5, fGInt = 3, then Step3-1.2 is skipped, go straight to Step3-2.
For each store i, denote the nodes before and after in the TSP routes, as n and m. Before increasing the interval, store i is visited twice in newInt periods with the same store visiting order, store n to store i to store m. After increasing, store i is visited once in newInt periods. The 2nd visit is cancelled and the vehicle visits store m directly after leaving store n. Equation (30) calculates the saved transportation cost Tsave i caused by the cancellation of the 2nd visit in newInt periods. Equation (30) is a fast and easy estimation of saved transportation cost. Equation (31) calculates the increased holding cost Hincre i , which is the difference between 2 costs, 2 × (fGInt) 2 × uH i × D i /2 (original interval) and newInt 2 × uH i × D i /2 (after interval increase). If Tsave i > Hincre i , increase the delivery interval for store i. Please note that the Tsave i is an estimation of saved transportation cost by simply removing node i from the routes and linking the nodes before and after i. To increase the accuracy, we keep a record of these store i, update the route by deleting these stores from the original one, calculate T save i for the rest of the stores, and compare them with Hincre i again. We repeat the steps until no more stores could be improved.
If the number of stores is greater than 1, solve TSP for these stores. Update delivery interval for all i and total cost. Go to Step3-2.
Step4 If GInt * = GInt , save final solutions of Step3-2; otherwise, compare the final solutions obtained from Step3-1 and Step3-2, and select the delivery schedule with smaller objective function value. Now the delivery intervals are known, denote interval for store i as Int i , and the quantity that should be delivered is: Int i × D i .

Mathematical Formulation for TSP
Here we present the model of typical TSP. The one we adopt is the famous MTZ formulation [34], in which the TSP is formulated as an integer linear program. The assumptions of the two models in the earlier sections still hold, and the data generation rule also applies.
The TSP was first formulated in 1930. After all these years, many variants emerge, extending from the seminal model. Interested readers could refer to [35] for more details. This area is very mature with many efficient algorithms.
During our exploration to PIRP solution optimality, it is found that optimal solution of TSP, or the accuracy of transportation cost estimation, attaches great importance. Thus, despite of its NP-hard nature, in our heuristic, we still choose to solve TSP to optimal using Python+Gurobi. With the quality of TSP routes guaranteed, we are able to obtain optimal solution for some cases.
The objective function (32) minimizes the total traveling distance. Constraint (33) means the number of trucks that leave any node should be 1. Constraint (34) means the number of trucks that enter any node should be 1. Constraint (35) and (36) are for sub-tour elimination. Constraint (37) enforces integrality.

Computational Results
Ideally, when solving the problem using Python+Gurobi, the longer the T the more accurate the long-term optimal solution. T should be at least 2× T * so that the shortest repeating cycle T * can reveal itself. With a longer T , by analyzing the optimal solution obtained from t = 1 to t = T , the shortest solution cycle T * and the optimal solution pattern are clearer. However, due to the NP-hard nature of the problem, even for a small-scale problem, only problems with limited periods are solvable.
The generation of cases in this section follows the same rule as in previous Sect. 4.1. The detailed numerical results, objective value, and computational time for 45 randomly generated cases are presented in Table 8.
The 'T ' column in Table 8 is the maximum period that Python+Gurobi could solve within 2 h. For all cases, we start with T = 1, and then gradually increase the T , until the limit of Python+Gurobi is reached. That is, if the largest T is 6 for a case, we solve the model to optimal for 6 times with T varying from 1 to 6, and save optimal solutions and objectives function value for each T . In the early stage of the experiments, to test the limit of Python+Gurobi and observe optimal solution patterns, we let some cases run for more than 10 h. These cases are marked with ** in the 'Dataset' column.
The 'Obj' in the table is short for 'Objective Function Value'. For cases with T greater than 1, the 'Obj' listed in Table 8 is the optimal periodic cost after comparing all solvable T s. The PT * is the period where the optimal solution is found. The HT * is the global interval calculated using heuristic. For example, for the 1st case (I7Set1) in Table 8, the largest T computable is 6, the best solution is found when T = 3 and T = 6. The corresponding 'Obj' listed is the 'Obj' value of T = 3 and T = 6. Using heuristic, the global interval is also 3 periods. For some cases, T = 0. It means these cases with T = 1 could not be solved within 2 h, and the computational time and objective function are represented as '-'.
The results show that, even for the cases with the same number of stores, computational time can vary a lot as parameter values vary. For each data-set, we generate demand and maximum product shelf life randomly. It is found that cases with larger demand value and longer shelf life usually take more time. Nevertheless, generally, as the number of stores increases, the maximum planning horizon solvable shortens, and the number of 1-period problems that could be solved within 2 h decreases.
In conclusion, as the experiment shows, the heuristic is effective and efficient in 2 perspectives: 1. Solution quality: for all cases, as long as TSP is solvable using Gurobi, the heuristic could provide the same optimal solutions for some cases and close-to-optimal solutions for the rest. 2. Computational time: for almost all cases, the heuristic consumes less time, which lays a good base for more complicated problems, such as stochastic demand, multiple vehicles.

Conclusions and Findings
In this work, we have studied the long-term PIRP with multiple products, singlevehicle, and static demand. By analyzing the optimal solutions, we have proposed 2 ways of modeling, namely single product model and multiple product models, and experiments have showed that they are equivalent in terms of optimal solutions and objective function values, while the former takes less time to solve.
By verifying that the two models are equivalents, we could gain some managerial insights: for products with regular and stable demand in the long term, delivery should be arranged at the store level rather than at the product level. The latter means the truck carries 1 or limited kinds of product per delivery, while the former requires the delivery of all products to be in sync. This indicates that for any store, there is no need to hoard products with cheap inventory holding cost to reduce transportation cost; and no need to arrange a special delivery for products with expensive holding cost to reduce holding cost. The advantage of syncing delivery for all products in this problem setting could be concluded as stable supply and even vehicle utilization rate. One should also note that for products with unstable demand, the above managerial insights might not apply.
Due to the NP-hard nature of PIRP, to obtain good solutions for larger-scale problems and save computational time for small-scale problems, we have also developed a heuristic. Comparing the solution and computational time between heuristic and com-mercial solver Gurobi coded on Python, it is demonstrated by experiments that the heuristic can provide optimal or close-to-optimal solutions for all solvable cases, with far less time. Though the heuristic is for single-vehicle problems, it could be easily implemented in multi-vehicle problems, by assigning each vehicle to its long-term serving zone and customers.
To balance the solution quality and speed of calculation, we mix the exact method and the analytical method. Exact method used for TSP assures the solution quality and approximation in our selection process reduces excessive time on exhausting all combinations.
The efficiency and quality of our heuristic is supported by extensive experiments. Nevertheless, we acknowledge some limitations of our study. Currently, our heuristic is tailored for the PIRP presented in this work, which is with single vehicle, static demand and single objective function. Our heuristic and some of the managerial insights are applicable to limited scenarios only. For example, when the number of vehicles increases or the demand is no longer static, the optimal solution pattern can be more difficult or impossible to capture. Besides, in our heuristic, the transportation cost is calculated to optimal by using Python+Gurobi, which can be time-consuming.
In the future, we may consider replacing the Python+Gurobi with other methods, which can increase the speed of calculation. One possible research direction is incorporating more realistic elements to our model such as multi-objective, multi-vehicle and stochastic demand. Another direction is to study the influence of different product issuing policy (other than FIFO) to the performance of our PIRP system.

Author Contributions
Xi-Yi Chen conducted the main research. Jian-Bo Yang and Dong-Ling Xu cosupervised this work. All 3 authors discussed the results of experiments, reviewed, and approved the final manuscript.

Conflict of interest
The authors declare that they have no conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.