1 Introduction

Integrated planning systems can be used to better coordinate processes and align decisions within supply chains. A system that integrates the three key processes of production, inventory and distribution can be represented and solved as the Production Routing Problem (PRP) (Adulyasak et al. 2015b). The PRP arises, for example, in the context of Vendor Managed Inventory (VMI). Typically, production and setup decisions at a vendor, replenishment decisions for multiple retailers, and transportation and routing decisions for vehicles are determined simultaneously. Accordingly, the PRP is a generalization of the Inventory Routing Problem (IRP) extended by production decisions (Qiu et al. 2018). As the IRP includes the Vehicle Routing Problem (VRP), and the PRP is an extension of the IRP, solving the PRP can be very challenging for instances of realistic size. Therefore, solution approaches in the literature are usually based on metaheuristics or multistage heuristics, and exact solution methods receive little attention (Díaz-Madroñero et al. 2015; Adulyasak et al. 2015b). Nevertheless, the literature has started to address uncertainties within the PRP. Adulyasak et al. (2015a) are the first to address demand uncertainty. They present a stochastic PRP (SPRP) within a two-stage decision process and a rolling horizon framework for the multistage SPRP. Agra et al. (2018a, 2018b) tackle the SPRP allowing backlog and solve a similar two-stage decision process using different Sample Average Approximation (SAA) methods and extend the general SAA with several heuristic approaches. Wang et al. (2019) solve the SPRP with uncertain travel times by robust optimization while respecting the risk preferences of the decision maker using a memetic algorithm. Ghasemkhani et al. (2021) and Liu et al. (2021) introduce the SPRP in the context of perishable goods. Ghasemkhani et al. (2021) deal with the demand uncertainty using a fuzzy chance-constrained programming model, whereas Liu et al. (2021) neglect the setup decisions and use a heuristic-based robust optimization approach. Wang et al. (2021) extend the predominantly considered demand uncertainty by cost uncertainty. As additional uncertainty makes the problem even more difficult, the provided solution approach considers the uncertainty only as an expected value problem.

Since demand is critical information for decision making and is only known approximately in most cases, demand uncertainty should be taken into account (Gupta and Maranas 2003). The general PRP considering solely demand uncertainty has only been addressed three times so far. In the very first article, Adulyasak et al. (2015a) address the SPRP in a two-stage decision process. In the first stage, setup and routing decisions are made before the realization of demand. The second stage adjusts the production and delivery quantity decisions when demand becomes known. The inventory and unmet demand quantities can be derived at the end of every period. They further develop a multistage SPRP in which demand becomes known in each period and demand of future periods remains unknown. As in the two-stage SPRP, setup and routing decisions are first made for all periods. The quantities are then adjusted in the subsequent stages as demand is revealed. Exact solution approaches based on Benders’ Decomposition are proposed, separating the decisions of the first and second or subsequent stages into master and subproblems. The reformulations can be improved by lower-bound lifting inequalities, aggregate Benders cuts using scenario group cuts, and Pareto-optimal cuts. For the multistage SPRP, the two-stage SPRP can be used as a warm start to generate an initial set of cuts that provides a valid lower bound for the multistage SPRP and speeds up the algorithm. To further improve the algorithm, the Benders reformulation can be solved in a Branch-and-Cut framework, where cuts are added to the master problem only at the root node and when an incumbent solution to the problem is found.

Similar to Adulyasak et al. (2015a), Agra et al. (2018b) also model the SPRP as a two-stage decision process, assuming a single vehicle and therefore including a TSP for the routing part. Here, all production decisions including setup and production quantity, and routing decisions including fixed vehicle and variable travel costs are made before demand is known. Delivery and inventory quantities are derived from scenarios. They also allow backlogging. The two-stage SPRP formulation can be tightened with several lot sizing, demand and rounding inequalities. To tackle the developed two-stage SPRP, Agra et al. (2018b) use an SAA approach with a hybrid heuristic of four dependent steps to determine the solution with the minimum average cost. Based on the SAA according to Verweij et al. (2003), the general idea is to determine the expected objective function of the SPRP using average estimates derived from random samples. For this purpose, the problem is solved for several small random samples and evaluated on the entire sample set. Here, small samples of about 0.5 % of the total scenario set are generated first and solved with a time limit of 300 s. To this end, routing-related binary variables are relaxed, while the variables representing a visit to the retailer are kept binary. Second, weights are assigned to the determined solutions. If the sum of the weighted visit variables is greater than a previously defined threshold value, a given minimum percentage of the visit-related variables is fixed. With these variables fixed, and the routing variables still relaxed, the two-stage SPRP is solved again for larger scenario sets of up to 5 % of the total scenario set in the third step. All retailer visit-related variables are fixed to their optimal values. Next, the routing variables are determined by solving a TSP for each period. As a last step, the first-stage variables are fixed and the objective function value of the entire sample is computed by solving a pure linear programming model.

The authors also develop a new SAA algorithm for the two-stage SPRP including a VRP for a given number of vehicles in Agra et al. (2018a), called the Adjustable Sample Average Approximation (ASAA). The mathematical formulation is tightened by valid inequalities and the goal is to minimize the production and routing costs plus the expected inventory costs and penalty costs for backlogged demand. The ASAA is also quite similar to the algorithm introduced in Agra et al. (2018b). First, a two-stage SPRP is solved for equally small scenario samples with a given time limit. Then, the visit variables that take similar values in a given number of solved samples are fixed according to their associated weights and a given threshold value. Next, a two-stage SPRP is solved again for bigger scenario sample sets, considering the promising variables as fixed. Using the optimal solution obtained, the visit variables not yet fixed can be fixed to their optimal values for each candidate solution. With all the visit variables fixed, a two-stage SPRP is solved for very large scenario sample sets. The value of the recourse variables is computed for each scenario given in the SAA scenario set. Consequently, the objective function value of each candidate solution is computed and the solution with the lowest average cost is chosen.

Within these three articles, setup and routing decisions are made in the first stage and quantity decisions can be adjusted after demand realization. This might be applicable for some practical applications where recurring distribution patterns occur, e.g. supermarkets (Gaur and Fisher 2004). Otherwise, this might lead to visits to retailers that have little or no demand, and unnecessary costs may occur. In other industries, such as the furniture or petrochemical industry (Miranda et al. 2018; Schenekemberg et al. 2021), a more flexible decision sequence regarding the routing decisions might be more appropriate. Therefore, a new assumption is made here: routes can be adjusted at short notice in the second stage. Making the routing decisions adjustable helps to avoid inefficient tours caused by uncertain demand and thus increases routing flexibility. Nevertheless, this assumption may lead to high computational costs, as a VRP needs to be solved in every period for every scenario. Therefore, a heuristic approach based on an SAA is provided here, where the SPRP is solved iteratively by solving the production planning and routing subproblems in sequence. In this work, the SPRP is considered with a single product distributed from a production plant to multiple retailers using capacitated vehicles in a discrete and finite time horizon. The demand uncertainty is represented by scenarios and the distribution is assumed to be known. Production planning is assumed to be made in the first stage before the demand realization. The routing, and therefore the decisions on which retailers need to be visited in each time period, the quantities to deliver to each retailer in each time period as well as the inventory levels are adapted to the scenario in the second stage. This should be done for short planning horizons, as these decisions depend on the actual outcomes of the uncertain demand parameters.

In this paper, an extended version of the formulation for the two-stage SPRP presented in Adulyasak et al. (2015a) is introduced, in which the routing decision is moved to the second stage. A general approach that explores the SAA in order to handle instances of the SPRP of satisfactory size is presented. For this purpose, the two-phase heuristic according to Absi et al. (2015) is adapted to the stochastic case. In the first phase, a two-stage stochastic program for production and distribution decisions is solved. In the second phase, the scenario-dependent routing decisions are made. Instead of solving multiple TSPs, VRP solution approaches are applied. Two different solution procedures for the routing subproblem are evaluated with regard to their effect on the solution quality. Finally, another ASAA approach, based on Agra et al. (2018b), is proposed, where fixing binary variables in the second stage is also taken into account.

The rest of the paper is organized as follows. In Sect. 2, a formulation of the two-stage SPRP is presented. The SAA and ASAA based heuristic is then described in detail in Sect. 3, followed by the discussion of computational experiments in Sect. 4, and by the conclusions in Sect. 5.

2 Notation and mathematical formulation

In this section, the notation used throughout the paper is introduced, followed by the two-stage formulation of the SPRP. The production plan constitutes the first-stage decisions. Inventory levels, delivery quantities and the penalized amount of unsatisfied demand per period are adjusted to the scenario. The penalty can be viewed either as an opportunity cost related to lost sales or as the cost of outsourcing the production and delivery of the product, as suggested in Adulyasak et al. (2015a). Here, the goal is to minimize the total costs, including setup and production costs, plus the expected costs of the inventory, lost sales as a penalty for unmet demand, and the routing.

2.1 Notation

The SPRP can be defined on a complete undirected graph \(G = ( N_0, A )\), where \(N_0 = \{0,\ldots ,\text {n}\}\) is the set of nodes and \(A = \{( i,j ) \big \vert i,j \in N_0, i \ne j\}\) is the set of arcs linking node i to j. Node 0 represents the vendor and \(N = N_0 {\setminus } \{0\}\) the set of retailers. Let \(\Omega\) denote the finite set of demand scenarios, and let \(\rho _\omega\) be the probability of scenario \(\omega \in \Omega\). The planning horizon is a finite set of time periods \(T = \{1,\ldots ,|T |\}\) and a set of identical and capacitated vehicles \(V = \{1,\ldots ,|V |\}\) is defined. The notation for parameters, data and decision variables can be found in Table 1.

Table 1 Notations for the SPRP

2.2 Two-stage SPRP formulation

Based on the deterministic problems studied by Archetti et al. (2011) and Adulyasak et al. (2015b), the SPRP is modified for the stochastic case by Adulyasak et al. (2015a). Here, the SPRP formulation is altered in the objective function and some routing-related constraints. Decisions on the production quantity are made in the first stage, acknowledging long lead times in manufacturing (Hopp and Spearman 2008). The routing-related decisions are made in the second stage, as the required work force or equipment and materials at the retailers might play a minor role within VMI. In addition, the subtour elimination constraints are refined in the form of the Miller-Tucker-Zemlin inequalities (Miller et al. 1960) and the vehicle capacity constraints are eliminated by incorporating \({\hat{M}}_{it\omega }\). Collectively, these modifications make the SPRP formulation more flexible with respect to decision making. The new extended two-stage SPRP can be formulated as follows.

$$\begin{aligned}{} & {} \text {min} \sum \limits _{t \in T} \Bigg [ f y_t + u p_t + \sum \limits _{\omega \in \Omega } \rho _\omega \bigg ( \sum \limits _{i \in N_0} h_i I_{it\omega } + \sum \limits _{i \in N} \sigma _i e_{it\omega } + \sum \limits _{\begin{array}{c} (i,j)\in A \\ v \in V \end{array}} c_{ij}x_{ijtv\omega } \bigg ) \Bigg ] \end{aligned}$$
(1)
$$\begin{aligned}{} & {} I_{0(t-1)\omega } + p_t - \sum \limits _{\begin{array}{c} i \in N \\ v \in V \end{array}} q_{itv\omega } - I_{0t\omega } = 0 \hspace{1cm} \forall t \in T, \omega \in \Omega \end{aligned}$$
(2)
$$\begin{aligned}{} & {} I_{i(t-1)\omega } + \sum \limits _{v \in V} q_{itv\omega } + e_{it\omega } - I_{it\omega } = d_{it\omega } \hspace{1cm} \forall i \in N, t \in T, \omega \in \Omega \end{aligned}$$
(3)
$$\begin{aligned}{} & {} I_{0t\omega } \le K^{inv}_0 \hspace{1cm} \forall t \in T, \omega \in \Omega \end{aligned}$$
(4)
$$\begin{aligned}{} & {} I_{i(t-1)\omega } + \sum \limits _{v \in V} q_{itv\omega } \le K^{inv}_i \hspace{1cm} \forall i \in N, t \in T, \omega \in \Omega \end{aligned}$$
(5)
$$\begin{aligned}{} & {} p_t - M_t y_t \le 0 \hspace{1cm} \forall t \in T \end{aligned}$$
(6)
$$\begin{aligned}{} & {} q_{itv\omega } - {\hat{M}}_{it\omega } z_{itv\omega } \le 0 \hspace{1cm} \forall i \in N, t \in T, v \in V, \omega \in \Omega \end{aligned}$$
(7)
$$\begin{aligned}{} & {} \sum \limits _{v \in V}z_{itv\omega } \le 1 \hspace{1cm} \forall i \in N, t \in T, \omega \in \Omega \end{aligned}$$
(8)
$$\begin{aligned}{} & {} \sum \limits _{v \in V} z_{0tv\omega } \le \vert V \vert \hspace{1cm} \forall t \in T, \omega \in \Omega \end{aligned}$$
(9)
$$\begin{aligned}{} & {} \sum \limits _{\begin{array}{c} i \in N_0 \\ i \ne j \end{array}} x_{ijtv\omega } + \sum \limits _{\begin{array}{c} l \in N_0 \\ l \ne j \end{array}} x_{jltv\omega } = 2z_{jtv\omega } \hspace{1cm} \forall j \in N_0, t \in T, v \in V, \omega \in \Omega \end{aligned}$$
(10)
$$\begin{aligned}{} & {} w_{itv\omega } - w_{jtv\omega } - q_{itv\omega } + {\hat{M}}_{it\omega }\left( 1 - x_{ijtv\omega }\right) \ge 0 \hspace{0.5cm} \forall \left( i,j\right) \in A, t \in T, v \in V, \omega \in \Omega \end{aligned}$$
(11)
$$\begin{aligned}{} & {} 0 \le w_{itv\omega } \le K^{vec} z_{itv\omega } \hspace{1cm} \forall i \in N, \forall t \in T, \forall v \in V, \forall \omega \in \Omega \end{aligned}$$
(12)
$$\begin{aligned}{} & {} 0 \le e_{it\omega } \le d_{it\omega } \hspace{1cm} \forall i \in N, \forall t \in T, \forall \omega \in \Omega \end{aligned}$$
(13)
$$\begin{aligned}{} & {} 0 \le \sum _{i \in N} q_{itv\omega } \le K^{vec} \hspace{1cm} \forall t \in T, \forall v \in V, \forall \omega \in \Omega \end{aligned}$$
(14)
$$\begin{aligned}{} & {} p_t, I_{it\omega } \ge 0 \hspace{1cm} \forall i \in N_0, t \in T, \omega \in \Omega \end{aligned}$$
(15)
$$\begin{aligned}{} & {} y_t \in \left\{ 0, 1\right\} \hspace{1cm} \forall t \in T \end{aligned}$$
(16)
$$\begin{aligned}{} & {} x_{ijtv\omega } \in \left\{ 0, 1\right\} \hspace{1cm} \forall \left( i, j\right) \in A, \forall t \in T, \forall v \in V, \forall \omega \in \Omega \end{aligned}$$
(17)
$$\begin{aligned}{} & {} z_{itv\omega } \in \left\{ 0, 1\right\} \hspace{1cm} \forall i \in N_0, t \in T, v \in V, \omega \in \Omega \end{aligned}$$
(18)

The objective function (1) minimizes the cost of the first-stage decisions and the expected cost of the second-stage decisions. Inventory flow \(I_{it\omega }\) at the vendor and retailers is balanced by constraints (2) and (3). Constraints (4) and (5) impose the maximum inventory level and are based on the maximum level policy defined by Archetti et al. (2011). Production \(p_t\) and setup \(y_t\) are linked by constraint (6): the setup variable is forced to one if production takes place in a given period, and the production quantity is bounded by the minimum of the production capacity and the maximum total demand in the remaining periods. Shipment quantity \(q_{itv\omega }\) and retailer visits \(z_{itv\omega }\) are linked by constraint (7), where the quantity shipped is limited by the minimum of the vehicle capacity, the inventory capacity at the retailer and the remaining demand. Each retailer can be visited at most once per period (8). The number of vehicles used is limited by the number of available vehicles (9). Constraint (10) requires a visited retailer to have two incident edges to maintain flow conservation. The subtour elimination constraints and vehicle loading restrictions are combined in constraint (11). Subtours are eliminated by forcing the vehicle load \(w_{itv\omega }\) to be higher before visiting the next retailer. After the visit, the vehicle load needs to be lower, due to the delivered shipping quantity. Constraint (12) represents the vehicle-load capacity restriction and (13) limits the unsatisfied demand \(e_{it\omega }\) to the demand of a given period under a certain scenario. Constraints (14) to (18) specify the bounds and domains of the remaining variables.
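
The balance constraints (2) and (3) can be illustrated with a small numerical sketch. The function names are hypothetical and not part of the formulation, and the sketch assumes that unmet demand is only incurred when stock plus delivery cannot cover demand, which is what a positive penalty \(\sigma _i\) would suggest.

```python
def retailer_balance(inv_prev, delivered, demand):
    """Return (unmet, inv_end) for one retailer, period and scenario.

    Satisfies constraint (3): inv_prev + delivered + unmet - inv_end = demand.
    Assumption: unmet demand is kept as small as possible.
    """
    available = inv_prev + delivered
    unmet = max(0.0, demand - available)   # e_{it,omega}; never exceeds demand, cf. (13)
    inv_end = available + unmet - demand   # I_{it,omega}; non-negative by construction
    return unmet, inv_end


def vendor_balance(inv_prev, produced, shipped_total):
    """Return the vendor's end inventory according to constraint (2)."""
    inv_end = inv_prev + produced - shipped_total
    if inv_end < 0:
        raise ValueError("shipments exceed the stock available at the vendor")
    return inv_end
```

For instance, a retailer with 5 units in stock that receives 10 units and faces a demand of 20 ends the period with no inventory and 5 units of unmet demand.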

This basic two-stage SPRP formulation contains a large number of binary and continuous variables and may not be solved efficiently using general optimization software. Even when the routing part is strengthened by adding the following valid vehicle-symmetry breaking constraints (19) and (20) based on Adulyasak et al. (2014a), the formulation remains impractical for large instances.

$$\begin{aligned}{} & {} z_{0tv\omega } - z_{0t\left( v+1\right) \omega } \ge 0 \hspace{1cm} \forall t \in T, 1 \le v \le \vert V \vert -1, \omega \in \Omega \end{aligned}$$
(19)
$$\begin{aligned}{} & {} \sum _{i = 1}^{j} 2^{(j-i)}z_{itv\omega } - \sum _{i = 1}^{j} 2^{(j-i)}z_{it(v+1)\omega } \ge 0 \hspace{0.5cm} \forall j \in N, t \in T, 1 \le v \le \vert V \vert -1, \omega \in \Omega \end{aligned}$$
(20)

Adulyasak et al. (2014a) already mention this issue regarding the deterministic PRP. Since the number of scenario-dependent constraints in the SPRP grows by a factor of \(|\Omega |\), this problem is present here as well (see Table 6 in the Appendix). Even small scenario sets with \(\vert \Omega \vert = 5\) cannot be solved to optimality with a common commercial solver in a reasonable runtime. Therefore, an SAA-based heuristic approach is introduced in Sect. 3. The problem formulation (1)–(20) will be referred to as the 2sSPRP in the following.
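
To give a feeling for this growth, the following sketch counts the variables of formulation (1)–(18) from the index sets of the constraints. It is only a back-of-the-envelope illustration: the function name is hypothetical, and it assumes that \(A\) contains every ordered pair \((i,j)\) with \(i \ne j\), as the index set in (17) suggests.

```python
def count_variables(n, n_periods, n_vehicles, n_scenarios):
    """Count binary and continuous variables of the two-stage formulation.

    Assumption: A holds all ordered pairs (i, j), i != j, over the n + 1 nodes.
    """
    arcs = (n + 1) * n
    tvs = n_periods * n_vehicles * n_scenarios
    binary = (
        n_periods            # y_t
        + arcs * tvs         # x_{ijtv,omega}
        + (n + 1) * tvs      # z_{itv,omega}, including the vendor
    )
    continuous = (
        n_periods                                # p_t
        + (n + 1) * n_periods * n_scenarios      # I_{it,omega}
        + n * n_periods * n_scenarios            # e_{it,omega}
        + 2 * n * tvs                            # q_{itv,omega} and w_{itv,omega}
    )
    return binary, continuous
```

Under these assumptions, an instance with 5 retailers, 3 periods and 2 vehicles has 219 binary variables for a single scenario and 1083 for five scenarios.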

3 Solution framework

In this section, two different main approaches to solve the 2sSPRP are discussed; both are tested in Sect. 4. Both approaches are based on the SAA method described in Verweij et al. (2003).

Given the difficulty of the 2sSPRP, solving the problem with the general SAA may not be practical, even for a very small number of fewer than 5 scenarios. Therefore, both approaches are supported by a two-phase iterative heuristic to obtain good or near-optimal solutions for the SPRP in reasonable runtime. Beyond that, the second approach is modified to enlarge the number of scenarios considered or candidate solutions explored, similar to Agra et al. (2018b).

3.1 Two-phase iterative heuristic

The two-phase iterative heuristic is based on Absi et al. (2015). Here, the 2sSPRP is divided into a two-stage production and distribution problem (2sPrDP) in the first phase and a VRP in the second phase. For a given set of scenarios, the solution of the 2sPrDP determines the setup and production decisions in the first stage. The inventory, delivery, unmet demand quantity and decisions related to retailer visits are assigned to the second stage. The delivery quantity and retailer visits from the first phase are passed to the second phase, where a VRP is solved in every period for each scenario. To obtain \(z_{itv\omega }\), the routing costs are considered approximately. The initial approximated costs to visit a retailer in a period are determined by

$$\begin{aligned} \lambda _{it} = \min \bigl \{2c_{0i}, \underset{j \ne i \ne l, j \ne l}{\underset{j,l \in N_0}{\min }}\{0.5(c_{ji}+c_{il})\} \bigr \}. \end{aligned}$$

According to Qiu et al. (2018) this approximation of \(\lambda _{it}\) might be closer to the real routing costs than the proposed approximation of Adulyasak et al. (2014b), where

$$\begin{aligned} {\hat{\lambda }}_{it} = \min \bigl \{2c_{0i}, \underset{j \ne i \ne l, j \ne l}{\underset{j,l \in N_0}{\min }}\{(c_{ji}+c_{il})\} \bigr \}. \end{aligned}$$

In Adulyasak et al. (2014b), \({\hat{\lambda }}_{it}\) might overestimate the routing cost, since the cost of travel on arc (j, i) or (i, l) is assigned both to retailer i and to retailer j or l, respectively. The approximation presented by Absi et al. (2015) might also be a weak representation of the real routing costs. It may overestimate the real costs, as it considers only direct delivery or a random variation of it. The results of preliminary computations (see Appendix, Table 7) show that, on average, the approach according to Qiu et al. (2018) only differs by about 1 % in terms of the objective value. However, considering runtime, the approach provides better results for long-term planning horizons of 4 or more periods and is therefore used for further processing.
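
The two initial approximations can be computed directly from the cost matrix. The sketch below is illustrative only: the function name is hypothetical, the cost matrix `c` is assumed to be indexed by node with node 0 as the vendor, and since neither formula depends on the period, one value per retailer is returned and would be reused for every \(t\).

```python
def visit_cost_approximations(c):
    """Return (lam, lam_hat): initial visit cost approximations per retailer.

    c is a square cost matrix over N_0 with node 0 as the vendor.
    lam uses half the cheapest pair of arcs through i, lam_hat the full pair.
    """
    nodes = range(len(c))
    lam, lam_hat = {}, {}
    for i in nodes:
        if i == 0:
            continue
        direct = 2 * c[0][i]
        # Cheapest way to enter i from j and leave to l, with j, l, i pairwise distinct.
        detours = [
            c[j][i] + c[i][l]
            for j in nodes
            for l in nodes
            if j != i and l != i and j != l
        ]
        if detours:
            best = min(detours)
            lam[i] = min(direct, 0.5 * best)
            lam_hat[i] = min(direct, best)
        else:
            # Single-retailer case: only direct delivery is possible.
            lam[i] = lam_hat[i] = direct
    return lam, lam_hat
```

With the cost matrix `[[0, 4, 6], [4, 0, 3], [6, 3, 0]]`, retailer 1 receives \(\lambda = 3.5\) and \({\hat{\lambda }} = 7\), while direct delivery would cost 8.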

In some scenarios and periods, the demand of certain retailers could be satisfied from inventory. Therefore, the initial \(\lambda _{it}\) might not represent the real expected cost of the retailer visits. To obtain a better estimate, the solution of the second phase can be used to update the costs and pass the better approximations of \(\lambda _{it}\) to the next iteration of the first phase. This process can be repeated until a given number of iterations is reached or no more changes in \(\lambda _{it}\) are observed.

3.1.1 Production distribution phase

In the first phase, the 2sPrDP is solved for a set of scenarios \(\Omega\). Here, the decision when and how much to produce is made before the demand realization. The decisions when to visit retailers and how much to deliver are made after the demand realization, similar to the 2sSPRP. The objective is to minimize production and inventory costs and costs related to inserting retailers into vehicle routes. According to the 2sSPRP, the 2sPrDP can be stated as:

$$\begin{aligned}{} & {} \text {min} \sum \limits _{t \in T} \Bigg [ f y_t + u p_t + \sum \limits _{\omega \in \Omega } \rho _\omega \bigg ( \sum \limits _{i \in N_0} h_i I_{it\omega } + \sum \limits _{i \in N} \sigma _i e_{it\omega } + \sum \limits _{\begin{array}{c} i\in N \\ v \in V \end{array}} \lambda _{it}z_{itv\omega } \bigg ) \Bigg ] \end{aligned}$$
(21)
$$\begin{aligned}{} & {} \sum \limits _{i \in N} q_{itv\omega } \le K^{vec} \hspace{1cm} \forall t \in T, \forall v \in V, \forall \omega \in \Omega \nonumber \\{} & {} \quad \text {and} \hspace{1cm} (2) - (9), (13) - (16), (18) - (20)\text {.} \end{aligned}$$
(22)

Except for the actual routing, all decisions linked to a vehicle are made in the 2sPrDP. The vehicle loads do not exceed their capacity and a valid solution for the second phase is guaranteed. As the \(\lambda _{it}\) connect the first and second phase, their values and the quality of their approximation play a crucial role in determining the first decision set. The initial approximation considers the minimum partial connection of two nodes and therefore may underestimate the real values. Thus, the initial values of \(\lambda _{it}\) need to be updated in subsequent iterations using the information provided by the second phase.

3.1.2 Routing phase

This section describes the second phase, where the actual routing takes place. For each scenario and period, a VRP needs to be solved. Any known solution method for the VRP could be used to determine the actual costs of the vehicle routes identified. Here, two different methods from the literature are adapted to investigate the impact of the routing decision on the 2sSPRP. The results will be shown in Sect. 4.

Due to the information gained in the first phase, retailers are already allocated to vehicles and a route for each vehicle can be computed. In order to keep this information, a heuristic for the VRP, the Distance-based Sweep Nearest (DSN) algorithm (Peya et al. 2019), is used. The assignment of the retailers to be visited in each period and scenario is known for each vehicle. Therefore, for each vehicle used, the Euclidean distance of each assigned retailer from the vendor can be computed and the retailers can be sorted in descending order of their distances. Then the farthest retailer from the vendor is chosen and a Nearest Neighbor Heuristic (NNH) is executed for the remaining retailers assigned to the vehicle, but not yet to the corresponding route. Preliminary testing has shown that the DSN performs slightly better when the retailers are sorted in descending order rather than ascending order (see Appendix, Table 8). The DSN can be seen as a single-iteration NNH here, choosing the farthest retailer from the vendor as a starting point. This approach is chosen to obtain a fast and reasonable solution for the VRP, assuming that the impact of the VRP solution on the underlying 2sSPRP might be low.
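
The route construction described above can be sketched for a single vehicle as follows. This is a minimal illustration of the described steps, not the implementation of Peya et al. (2019); the function name and the coordinate-based input are assumptions.

```python
import math


def dsn_route(depot, retailers):
    """Build one vehicle route: farthest retailer first, then nearest neighbor.

    depot is an (x, y) tuple, retailers maps retailer id -> (x, y) for the
    retailers already assigned to this vehicle. Returns (visit order, cost).
    """
    unvisited = dict(retailers)
    if not unvisited:
        return [], 0.0
    # Start with the retailer farthest from the vendor.
    first = max(unvisited, key=lambda i: math.dist(depot, unvisited[i]))
    order = [first]
    cost = math.dist(depot, unvisited[first])
    position = unvisited.pop(first)
    # Nearest Neighbor Heuristic on the remaining retailers.
    while unvisited:
        nxt = min(unvisited, key=lambda i: math.dist(position, unvisited[i]))
        cost += math.dist(position, unvisited[nxt])
        position = unvisited.pop(nxt)
        order.append(nxt)
    cost += math.dist(position, depot)  # return to the vendor
    return order, cost
```

For a vendor at the origin and retailers at (3, 0), (10, 0) and (6, 0), the sketch visits the retailer at (10, 0) first and then works its way back towards the vendor.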

To check the contrary assumption, a second, more powerful VRP solution approach is adopted. According to Laporte et al. (2014), the Greedy Randomized Adaptive Search Procedure hybridized with an Evolutionary Local Search (GRASPxELS) by Prins (2009) offers a good trade-off between computation time and solution quality and is therefore chosen as the second routing approach. As the GRASPxELS is designed for large-scale VRPs with up to 483 customers, the approach is implemented with a reduced Local Search method. A simpler Split procedure and other minor modifications are also made due to the limited number of available vehicles and the comparatively small number of retailers considered. In each phase (np), a randomized NNH is executed to generate a giant tour T as starting solution. This giant tour is split into a VRP solution \({\bar{S}}\) by the Split method, containing a feasible number of subtours, which is improved by Local Search (LS) and merged back into a giant tour \({\bar{T}}\) by Concat. \({\bar{T}}\) and \({\bar{S}}\) are then used to improve the VRP solution by performing ni iterations of the ELS. In each iteration, nc copies of \({\bar{T}}\) are mutated to generate child-tours \({\hat{T}}\), which are split and then improved by LS. If \({\bar{S}}\) is improved by a child-solution \({\hat{S}}\), \({\bar{S}}\) and \({\bar{T}}\) are updated for the next ELS iteration. The best solution found after np phases is kept as the global best solution \(S^*\) and \(T^*\), respectively. The operations used for the LS and the general form of the GRASPxELS used can be found in the Appendix, Algorithms 2 and 3.

The proposed two-phase iterative heuristic is integrated into a common SAA framework and an adjustable SAA framework in the following.

3.2 General SAA

SAA is commonly used in stochastic programming to reduce the scenario set to a manageable size. Agra et al. (2018a) show that relying only on the power of commercial solvers may be impractical and that using heuristics within an SAA approach can help to obtain good solutions. As in Kleywegt et al. (2002) and Verweij et al. (2003), the true expected objective function value is approximated by the average value over a very large scenario sample set \(\Omega '\). For the approximation, \(|M |\) separate smaller samples \(\Omega _k\), with \(k \in M = \{1,\ldots ,|M |\}\), are solved using the heuristic presented in Sect. 3.1, each yielding a candidate solution. For each candidate solution, the first-stage decision is fixed and evaluated on \(\Omega '\). In this way, a deterministic PrDP with fixed setup and production quantity is computed for each scenario \(\omega \in \Omega '\), so that the remaining distribution decisions are made and the routing is conducted. The candidate solution with the minimum estimated expected total cost, calculated as the average over \(\Omega '\), is chosen.
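
The structure of this procedure can be sketched generically. In the sketch below, `solve_sample` stands in for the two-phase heuristic of Sect. 3.1 and `evaluate` for the scenario-wise evaluation with fixed first-stage decisions; both, as well as the toy single-period problem used to exercise the loop, are hypothetical placeholders and not the models of this paper.

```python
import random
from statistics import mean


def saa(sample_scenario, solve_sample, evaluate,
        n_replications, sample_size, eval_size, seed=0):
    """Generic SAA loop: solve |M| small samples, evaluate each on Omega'."""
    rng = random.Random(seed)
    evaluation_set = [sample_scenario(rng) for _ in range(eval_size)]  # Omega'
    best = None
    for _ in range(n_replications):
        sample = [sample_scenario(rng) for _ in range(sample_size)]    # Omega_k
        first_stage = solve_sample(sample)                             # candidate
        estimate = mean(evaluate(first_stage, s) for s in evaluation_set)
        if best is None or estimate < best[1]:
            best = (first_stage, estimate)
    return best


# Toy stand-in problem (assumption): choose one production quantity p before
# a single demand d is known; pay unit, holding and penalty costs.
def toy_cost(p, d, unit=1.0, holding=0.5, penalty=5.0):
    return unit * p + holding * max(p - d, 0) + penalty * max(d - p, 0)


def toy_solve(sample):
    # The sample-average cost is piecewise linear, so a sample point is optimal.
    return min(sorted(set(sample)),
               key=lambda p: mean(toy_cost(p, d) for d in sample))


best_p, best_estimate = saa(
    sample_scenario=lambda rng: rng.randint(14, 26),
    solve_sample=toy_solve,
    evaluate=toy_cost,
    n_replications=5, sample_size=10, eval_size=1000,
)
```

The design keeps the evaluation set fixed across replications, so that the candidate solutions are compared on the same scenarios.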

The general framework used can be seen in Fig. 1. For each candidate solution or replication k, the previously introduced two-phase iterative heuristic needs to be solved in a stochastic environment. Here, the 2sPrDP needs to be solved multiple times to update the expected routing costs \({\bar{\lambda }}_{it}\) and thereby improve the initial expected costs for retailer visits. To speed up the algorithm, the previously calculated 2sPrDP solution can be used as a warm start within the approach. \({\hat{y}}_t\) and \({\hat{p}}_t\) refer to the fixed first-stage decisions of the 2sSPRP and are passed on to the sample evaluation within the SAA. Nevertheless, a VRP needs to be calculated for each scenario and period and therefore only a moderate number of scenarios can be considered. An adjustable approach is presented in Sect. 3.3 to overcome this problem.

Fig. 1 Integrated two-phase iterative heuristic within a general SAA framework

3.3 Adjustable SAA

According to Agra et al. (2018a), the SAA method can be adjusted to generate partial solutions from promising variables of the \(|M |\) replications. Within the ASAA, variable values that are identical in (almost) all replication solutions are identified. As these variable values are likely to appear in the optimal solution, they are fixed to generate a simplified problem and a new larger sample can be computed. Here, the idea is to use the ASAA framework to generate high quality solutions by expanding the scenario space without increasing the algorithm's runtime too much.

To identify promising variables, weights \(w_k\) are assigned to each partial solution according to their objective function value. In Agra et al. (2018a), only the retailer-visit related variables (\(z_{it}\)), which belong to the first stage there, are chosen. As the variables \(z_{itv\omega }\) now belong to the second stage, two different approaches will be considered. First, the setup-decision variables \(y_t\) are examined, as they are the only remaining binary first-stage variables. Second, since the \(z_{itv\omega }\) are presumed to be responsible for slowing down the 2sPrDP, the retailer-visit variables are examined. For both approaches, \(w_k\) are computed according to Agra et al. (2018a) and the ASAA proceeds in similar steps, but has to be adjusted for fixing the second-stage variables.

\(\text {ASAA}_{1st}\) - Fixing the first-stage variables \(y_t\):

  • Obtain partial solutions. For each replication k, generate a small scenario sample set \(\check{\Omega }\) with \(\vert \check{\Omega }\vert \ll \vert \Omega \vert\). Solve \(\check{\Omega }\) using the general SAA approach from Sect. 3.2.

  • Fix promising first-stage variables. Assign \(w_k\) to each of the k replications and fix some of the variables \(y_t\) according to Agra et al. (2018a).

  • Solve and evaluate the simplified model using \(\Omega\) and \(\Omega '\). Solve \(|M |\) replications for the bigger sample set \(\Omega _k\) and the former fixed variables \(y_t\). Evaluate the solutions according to \(\Omega '\).

\(\text {ASAA}_{2nd}\) - Fixing the second-stage variables \(z_{itv\omega }\):

  • Obtain partial solutions. For \(\check{M}\) with \(\vert \check{M}\vert \ll \vert M\vert\), generate small scenario sample sets \(\Omega _k\). Solve \(\Omega\) using the general SAA approach from Sect. 3.2 for each replication \(k \in \check{M}\).

  • Fix promising second-stage variables. Assign \(w_k\) to each of the k replications and fix some of the variables \(z_{itv\omega }\) according to Agra et al. (2018a).

4 Computational experiments

In this section, the performance and solution quality of the previously described solution approaches are evaluated by computational experiments. All approaches are implemented in C# using Gurobi 9.5.0 for solving the mathematical models. All tests run on an Intel(R) Core(TM) i5-7500 CPU machine at 3.4 GHz with 8 GB of RAM.

4.1 Data generation and experimental design

The instances used to conduct the experiments are based on Adulyasak et al. (2015a) and Agra et al. (2018a). Four test sets are considered, depending on the number of retailers. Test sets S1 to S4 contain \(n = 5, 10, 15 ~ \text {and} ~ 20\) retailers, respectively, each with 3 to 5 periods and 2 available trucks, to keep runtimes manageable. Since 15 instances are created for each retailer-period combination, the test sets contain 45 instances each, resulting in a total number of 180 instances examined. Symmetric travel costs are derived from the Euclidean distances on a 500 by 500 square grid. Scenarios are generated by a Monte Carlo simulation in which the demand varies within \(\pm 30~\%\) of the nominal case \({\bar{d}}_{it} \in [5, 25 ]\), which is the same in every period. The demands are assumed to be discrete and uniformly distributed. It is assumed that the production units \(p_t\) can be distributed by all available vehicles in a period. Since a different range of periods is observed, \(K^{inv}_i\) is slightly modified compared to Adulyasak et al. (2015a). \(K^{vec}\) is chosen depending on \(K^{pro}\). It is ensured that the production volume can be transported with the available vehicles at full capacity utilization. Initial inventory is considered for retailers only.
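A minimal sketch of the instance grid and the demand sampling described above is given below. The function names, the rounding of the \(\pm 30~\%\) bounds to integers and the use of Python's `random` module are assumptions for illustration; the original generator is not reproduced here.

```python
import random
from typing import List

# 4 retailer counts x 3 period lengths x 15 instances = 180 instances.
instances = [
    (n, periods, idx)
    for n in (5, 10, 15, 20)
    for periods in (3, 4, 5)
    for idx in range(15)
]


def nominal_demands(n_retailers: int, rng: random.Random) -> List[int]:
    """Nominal demand per retailer in [5, 25], identical in every period."""
    return [rng.randint(5, 25) for _ in range(n_retailers)]


def sample_scenario(
    nominal: List[int], n_periods: int, rng: random.Random, deviation_pct: int = 30
) -> List[List[int]]:
    """Draw one scenario: a discrete uniform demand within +/- deviation_pct
    of the nominal demand for every retailer and period."""
    scenario = []
    for d_bar in nominal:
        # Integer arithmetic avoids floating-point rounding at the bounds
        # (assumed convention: round the lower bound up, the upper bound down).
        low = (d_bar * (100 - deviation_pct) + 99) // 100
        high = d_bar * (100 + deviation_pct) // 100
        scenario.append([rng.randint(low, high) for _ in range(n_periods)])
    return scenario


rng = random.Random(0)
nominal = nominal_demands(5, rng)
scenarios = [sample_scenario(nominal, 3, rng) for _ in range(10)]
```

Each scenario is a retailer-by-period matrix of demands, so a sample set with \(|\Omega |= 10\) corresponds to a list of ten such matrices.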

Two different experimental settings are considered to examine the impact of the VRP-solution quality and of the two different SAA frameworks. To evaluate the impact of the VRP, the general SAA is taken into account. The assumed low and high impact VRP-solution methods outlined in Sect. 3.1.2 are used. \(\text {SAA}_D\) refers to the SAA framework using DSN in the routing part, whereas \(\text {SAA}_{GE}\) refers to the SAA framework using GRASPxELS in the routing part. As in Agra et al. (2018b), the sample size of \(\Omega '\) is set to 1000 and the number of replications executed is 10; within a replication, \(|\Omega |= 10\). However, due to the high number of binary variables \(z_{itv\omega }\), the 2sPrDP usually cannot be solved to optimality within a reasonable amount of time. Thus, the 2sPrDP is solved up to a satisfactory optimality gap. After \({\bar{\lambda }}_{it}\) is fixed in the last iteration of the two-phase iterative heuristic, the 2sPrDP is solved one last time to refine the production decisions, given a runtime limit. Here, the number of iterations is set to 3, based on preliminary testing (see Appendix, Table 8). The optimality gap and the runtime limit are set to 3.75 % and 5 minutes, respectively. The procedure is presented in Algorithm 1. With respect to the overall runtime, when comparing the SAA and the ASAA, only the low impact VRP-solution method (DSN) is used for the two ASAA approaches studied. Within the ASAA, the two approaches are configured as follows. For \(\text {ASAA}_{1st}\), the number of scenarios for \(\check{\Omega }\) and \(\Omega\) is set to 5 and 10, respectively. The sample size of \(\Omega '\) and the number of replications remain the same as for the SAA. For \(\text {ASAA}_{2nd}\), \(\Omega '\) and \(\Omega\) are the same as in the SAA and the number of replications for \(\check{M}\) and M is set to 5 and 10, respectively.
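The interplay of replications, the sample set \(\Omega\) and the evaluation set \(\Omega '\) can be illustrated with the usual SAA loop. The sketch below is not the 2sSPRP: a single-product toy cost with hypothetical parameters replaces the two-stage model, and enumeration over candidate quantities replaces the two-phase heuristic. Only the parameter values (10 replications, \(|\Omega |= 10\), \(|\Omega '|= 1000\)) are taken from the text.

```python
import random
from statistics import mean


def scenario_cost(x, d, unit=1.0, shortage=4.0, holding=0.5):
    """Toy cost of producing x when demand d is realized (hypothetical parameters)."""
    return unit * x + shortage * max(d - x, 0) + holding * max(x - d, 0)


def solve_sample(sample, candidates):
    """Stand-in for solving the two-stage model on one scenario sample."""
    return min(candidates, key=lambda x: mean(scenario_cost(x, d) for d in sample))


def saa(draw, candidates, n_replications=10, sample_size=10, eval_size=1000, seed=0):
    rng = random.Random(seed)
    evaluation = [draw(rng) for _ in range(eval_size)]    # large set, role of Omega'
    best_x, best_cost = None, float("inf")
    for _ in range(n_replications):
        sample = [draw(rng) for _ in range(sample_size)]  # small set, role of Omega_k
        x = solve_sample(sample, candidates)
        # Evaluate the first-stage decision on the large scenario set.
        cost = mean(scenario_cost(x, d) for d in evaluation)
        if cost < best_cost:
            best_x, best_cost = x, cost
    return best_x, best_cost


def draw_demand(rng):
    """Discrete uniform demand within +/- 30 % of a nominal value of 20."""
    return rng.randint(14, 26)


candidates = range(14, 27)
best_x, best_cost = saa(draw_demand, candidates)
```

The candidate that performs best on the evaluation set is returned, which mirrors how the replication solutions are compared on \(\Omega '\).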

Algorithm 1 Two-phase iterative heuristic

All approaches are compared to the monolithic mixed-integer formulation that uses the expected-value approach to cope with the uncertainty (EVP). Even when the uncertain demand is replaced by its expected value, the EVP might not be solved to optimality in reasonable runtime for larger instances. Therefore, a time limit of 6 hours is set to find a solution. The production and setup decisions obtained are used as first-stage decisions and evaluated on the scenario set \(\Omega '\). The evaluated EVP is referred to as eEVP in the following.
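The difference between the EVP objective and its evaluated counterpart can be shown on a toy problem. In the sketch below, a single-product cost with hypothetical parameters stands in for the monolithic model, and the sample mean of the evaluation set is used as the expected demand; both are assumptions for illustration.

```python
import random
from statistics import mean


def cost(x, d, unit=1.0, shortage=4.0, holding=0.5):
    """Toy stand-in for first-stage plus recourse cost; convex in the demand d."""
    return unit * x + shortage * max(d - x, 0) + holding * max(x - d, 0)


rng = random.Random(1)
evaluation = [rng.randint(14, 26) for _ in range(1000)]  # plays the role of Omega'
expected_demand = mean(evaluation)  # assumption: sample mean as expected value

# EVP: optimize against the single expected-value scenario.
candidates = range(14, 27)
x_evp = min(candidates, key=lambda x: cost(x, expected_demand))
evp_objective = cost(x_evp, expected_demand)

# eEVP: keep the first-stage decision and re-evaluate it on every scenario.
eevp_cost = mean(cost(x_evp, d) for d in evaluation)
```

Because this toy cost is convex in the demand, the evaluated value `eevp_cost` cannot fall below `evp_objective`. The unevaluated objective is therefore an optimistic figure for the chosen decision in this example.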

4.2 VRP Impact - \(\text {SAA}_D\) vs. \(\text {SAA}_{GE}\)

The impact of the VRP-solution quality is studied, comparing DSN and GRASPxELS within the general SAA approach. Results can be seen in Table 2. Information is gathered regarding the average value for each retailer, period and vehicle combination, unless otherwise stated.

Table 2 \(\text {SAA}_\text {D}\) compared to \(\text {SAA}_\text {GE}\)

Table 2 shows clearly better objective values when GRASPxELS is used instead of the DSN solution approach. The objective value can be improved by up to 7.74 % for 5 retailers and by up to 63.62 % for 20 retailers. This tremendous increase in \(\Delta\) for a larger number of retailers can also be observed using DSN. On the one hand, this might be caused by the higher number of retailers and the associated higher impact of a more elaborate routing approach. On the other hand, even the eEVP cannot be solved to optimality within the given runtime limit for a higher number of retailers and periods. This could be due not only to the routing decisions, which depend on the number of retailers and time periods, but also to the linkage associated with setup decisions. Although the 2sPrDP can be solved to optimality for almost all instances, the DSN-based approach performs worse than the eEVP for 5 retailers. This supports the assumption examined in the first experimental setting: the outcome of the underlying 2sSPRP depends on the quality of the VRP-solution. Comparing the objective values of \(\text {SAA}_D\) and \(\text {SAA}_{GE}\) strengthens this observation. For 95.6 % of all instances, \(\text {SAA}_{GE}\) provides a better objective value than \(\text {SAA}_D\). Looking more closely at the production decisions, this is also evident for instances with a small number of retailers. Table 3 reports the production deviation \(\Delta _{\text {prod}}\) in percentage points, comparing the total production quantity of the eEVP with that of \(\text {SAA}_D\) and \(\text {SAA}_{GE}\). The production volumes differ only slightly. This further supports the assumption that the solution quality of the routing problem might play a major role when solving the 2sSPRP. This is also reflected in the setup decisions, which follow the same patterns for \(\text {SAA}_D\) and \(\text {SAA}_{GE}\) in 80 % of all instances.
This can be seen from the relative shares in Table 3, which show the average number of setups performed in a given time period. However, using a more powerful solution approach to obtain the routing decisions, as well as extending the number of periods, leads to a runtime increase. Compared to the eEVP, the increased runtime is still acceptable. Compared to \(\text {SAA}_D\), the runtime increase of \(\text {SAA}_{GE}\) might also be acceptable, since the average improvement in \(\Delta\) is about 7.91 % for \(\text {SAA}_{GE}\).

Table 3 Comparison of production and setup decisions for the \(\text {SAA}_\text {D}\) and \(\text {SAA}_\text {GE}\), considering 5 retailers

4.3 Fixing first- vs. fixing second-stage variables within the ASAA framework

As the solution space obtained within the general SAA might be too small for the chosen scenario set, an adjustable SAA following Agra et al. (2018a) is tested with two different settings. In the first setting \(\text {ASAA}_{1st}\), promising first-stage variables can be fixed, whereas in the second setting \(\text {ASAA}_{2nd}\), promising second-stage variables can be fixed. Comparing these two ASAA settings, Table 4 shows considerable improvements on average when using the first approach instead of the second. The objective value can be improved by up to 9.22 % for small instances by partially fixing the setup variables instead of the routing variables. Thus, more flexibility within the scenarios is created. However, higher flexibility affects the runtime, and therefore using \(\text {ASAA}_{1st}\) results in a higher runtime compared to \(\text {ASAA}_{2nd}\). As the number of retailers increases, the solution quality of \(\text {ASAA}_{2nd}\) decreases in comparison to \(\text {ASAA}_{1st}\). Looking at the number of wins, \(\text {ASAA}_{2nd}\) is not able to provide the best solutions here. Therefore, the increase in runtime for using \(\text {ASAA}_{1st}\) seems reasonable.

Table 4 \(\text {ASAA}_\text {1st}\) vs. \(\text {ASAA}_\text {2nd}\)

4.4 General vs. adjustable SAA

Comparing the general SAA with the ASAA framework, the respective best approaches \(\text {SAA}_{GE}\) and \(\text {ASAA}_{1st}\) differ by between \(-\)0.68 and 7.45 percentage points in favor of \(\text {ASAA}_{1st}\), considering all instances. However, comparing \(\text {ASAA}_{2nd}\) with \(\text {SAA}_{GE}\), the general SAA approach prevails. Considering all instances, \(\text {SAA}_{GE}\) provides the best solution in 32.78 % of the cases, whereas \(\text {ASAA}_{1st}\) and \(\text {ASAA}_{2nd}\) provide the best solution in 60.55 % and 6.67 % of the cases, respectively. Since good-quality routing solutions are obtained for \(\text {SAA}_{GE}\) and no variables are fixed, this approach provides good results in terms of the objective value and runtime. Figure 2 shows the average runtime of the different retailer-period combinations for all approaches. For almost all instances, except for the 5 retailer case, the eEVP reaches the runtime limit. Therefore, the eEVP is not considered further regarding runtime. A comparison of the SAA approaches shows that using \(\text {SAA}_{GE}\) on average doubles the runtime compared to \(\text {SAA}_{D}\). This is also evident when comparing \(\text {SAA}_{GE}\) to \(\text {ASAA}_{1st}\) and when comparing the ASAA approaches. If a doubling or even a quadrupling of the runtime compared to \(\text {SAA}_{D}\) is accepted, the objective values can be clearly improved on average compared to the eEVP. Within the approaches tested, the results do not seem to differ too much in terms of average objective values. However, if the lower bound (LB) of the unevaluated EVP is assumed to be the best solution possible, even for the stochastic case, a different picture emerges. Table 5 shows the relative improvement of the four different approaches examined, compared to \(\text {EVP}_{LB}\). The use of \(\Delta _{LB}\) is intended to highlight the differences between the proposed approaches.
Therefore, the value of the actual improvement, if the EVP were solved to optimality in all cases, will lie between \(\Delta _{LB}\) and the \(\Delta\)-values provided in Tables 2 and 4. Compared to \(\text {EVP}_{LB}\), \(\text {SAA}_D\) provides the worst results. \(\text {SAA}_{GE}\) performs better and \(\text {ASAA}_{1st}\) performs best. \(\text {ASAA}_{1st}\) still improves the objective value compared to the unevaluated \(\text {EVP}_{LB}\) by up to 6.1 % on average. In contrast, \(\text {SAA}_{GE}\) does not improve the objective value for all instances compared to the unevaluated \(\text {EVP}_{LB}\), but still improves it by 2.4 % on average.

Fig. 2 Runtime development of the four approaches in minutes. Note: Periods 3–5 are shown for each retailer section

Fixing second-stage variables seems to be appropriate to reduce runtime and, compared to \(\text {SAA}_D\), better solutions can be obtained with respect to the objective values. However, relying on better solution approaches to estimate routing costs seems to be more powerful than fixing second-stage variables. In terms of potential improvements, \(\text {ASAA}_{1st}\) provides the best objective values on average. When runtime is also taken into account, \(\text {SAA}_{GE}\) performs well and seems to be a valid approach.

Table 5 Average relative improvement compared to the LB of the EVP

5 Conclusions

The 2sSPRP presented in this work is an integrated problem considering production, inventory and distribution decisions while demand is uncertain. In the literature, two-stage formulations containing the routing decisions in the first stage are considered and therefore routing cannot be adjusted at short notice. As noted in Sect. 1, this might only apply to some industries with certain distribution patterns. For other industries, more flexibility in the routing decision might be appropriate. Here, a new structure within the 2sSPRP is examined, where the routing decisions are made in the second stage, and different solution approaches are discussed. To examine the impact of the VRP-solution quality, two different solution approaches are compared within an SAA approach. Results show an improvement regarding total cost when using GRASPxELS over DSN for the routing part. Thus, the chosen two-phase heuristic within the SAA approach is highly dependent on the approximated routing costs and hence on the chosen VRP-solution approach. The assumption made in Section 3.2.1 regarding the potentially low impact of VRP-solutions can therefore be rejected. A further question examined for the adjustable SAA approach is whether to fix promising variables at the first or the second stage, thus considering more scenarios or more replications. When fixing variables in the first stage, more flexibility to adapt to the scenarios is given. Therefore, the results are better in terms of total costs. In contrast, fixing variables in the second stage is less flexible and results in higher total costs. Nevertheless, it reduces the runtime. In addition, fixing second-stage variables is preferable to the SAA approach in terms of solution quality, assuming that the same VRP-solution approaches are applied. This might be achieved by the extension of the scenario space within the adjustable approach. The eEVP confirms that stochastic optimization based on expected values does not always lead to good results.
In general, scenarios should be considered within an SAA or ASAA based approach for the 2sSPRP studied. Nevertheless, the proposed approaches do have their drawbacks. Within the (A)SAA approaches, the number of scenarios considered might be too small to provide an appropriate representation of reality. However, these assumptions and the assumption on the distribution of demand have been adopted from the literature (Agra et al. 2018a), and a larger set of scenarios would have exceeded an acceptable runtime limit. To overcome this issue, descriptive sampling could be used (Saliby 1990) in order to examine a larger number of scenarios and to better represent uncertainty. Differences and enhancements from a modeling perspective are already examined in Geiger (2024). However, whether the routing decisions should be made in the first stage (Adulyasak et al. 2015a) or in the second stage, taking into account the possible discrepancy between theory and actual implementations in practice, should be studied in more detail. Since it seems reasonable to adjust production and routing decisions simultaneously over time, multistage stochastic programming for the SPRP could be an interesting area of research. To put this into practice, a rolling horizon approach might be worth considering.