1 Introduction

Risk is a fundamental issue arising in finance, insurance, project management, and many other areas. The analysis of risk in stochastic programming has recently been quite popular and raised many research questions, from model formulation to algorithmic approaches to tackle the problem difficulty, see, for instance, Schultz (2005), Schultz and Tiedemann (2006), Eichhorn and Römisch (2005), Ruszczyński (2013), and Yin and Büyüktahtakin (2021), among others. This paper addresses the computational difficulty of solving multi-stage stochastic mixed-integer programming problems (M-SMIPs) involving risk-averse objectives.

Stochastic mixed-integer programs can be formulated as an extensive form of a mixed-integer program (MIP), where a discrete stochastic process is represented by a finite number of realizations, i.e., scenarios. The size of the MIP grows exponentially in the number of decision stages and scenarios, leading to large-scale optimization formulations. Incorporating the risk-aversion criterion in addition to the expectation in the objective function further complicates the problem.

In this paper, we propose a new and general method of “stage-t scenario dominance" to effectively solve the risk-averse M-SMIPs. Our main idea is to derive cutting planes and bounds using the dominance characteristics of scenarios defined by a partial ordering and the solution of a new scenario sub-problem, which includes all the non-anticipativity and original constraints. Specifically, we solve the problem for a number of specific scenario sub-problems and use this information and the partial ordering of scenarios to generate cutting planes so that the general problem can be solved to optimality or close-to-optimality by a cut-and-branch algorithm. We demonstrate the computational benefit from using the proposed approaches on the risk-averse stochastic dynamic knapsack problem.

1.1 Literature review and paper contributions

Solution algorithms proposed for risk-averse multi-stage stochastic programs with integer variables are quite limited due to the non-convexity arisen by integrality. Most approaches proposed for risk-averse multi-stage stochastic optimization problems are extensions of the solution techniques proposed for their risk-neutral counterparts. Schultz and Tiedemann (2006) develop a Lagrangian decomposition algorithm that is based on the relaxation of non-anticipativity constraints to solve the scenario-based formulation of two-stage mixed-integer stochastic programming involving the conditional value-at-risk (CVaR). Heinze and Schultz (2008) later extended the scenario-decomposition algorithm of Schultz and Tiedemann (2006) combined with a branch and bound scheme to the case of multi-stage stochastic integer programs with CVaR risk objective. Yu et al. (2018) employ the Stochastic Dual Dynamic Integer Programming (SDDiP) approach and Lagrangian cuts of Zou et al. (2019) for solving the risk-averse multi-stage stochastic facility location problem using expected conditional risk measures (ECRMs). Dentcheva and Martinez (2012) introduce Second-order Stochastic Dominance (SSD) constraints and explore the mathematical properties in two-stage stochastic problems. Escudero et al. (2016) study the first- and second-order stochastic dominance constraints induced by mixed-integer recourse in risk-averse multi-stage stochastic programs. Our scenario dominance approach is fundamentally different than the stochastic dominance constraints introduced in the literature (Müller and Stoyan 2002) in that we pairwise compare the scenarios based on the problem parameters and derive implications using the solution of scenario sub-problems, while in stochastic dominance random variables are compared based on their probability distribution function.

There has also been a stream of research studying bounds and approximation for risk-averse multi-stage stochastic mixed-integer programming problems (Maggioni and Pflug 2019; Mahmutoğulları et al. 2018; Sandıkçı and Özaltın 2017; Bakir et al. 2020; Maggioni et al. 2014; Maggioni and Pflug 2016; Sandıkçı et al. 2013). Our focus in this paper is different from those by providing cutting plane algorithms based on the scenario dominance concept to tackle the computational difficulty of solving risk-averse multi-stage stochastic programs. Particularly, our scenario dominance-based cutting planes provide bounds on a single-scenario objective rather than the full objective function as considered in previous studies on bounds and approximations [e.g., see Maggioni and Pflug (2019) and Mahmutoğulları et al. (2018)]. Furthermore, our cuts are driven based on a new scenario sub-problem, which is different from the classical scenario sub-problem that have been widely used in prior studies [see, e.g., Madansky (1960), Maggioni et al. (2016)].

Ruszczyński (2002) reformulates the two-stage stochastic programming problems with probabilistic constraints as large-scale mixed integer programming problems with knapsack constraints. The author uses the partial ordering of the scenarios to define precedence relationships between the variables of the knapsack problem and then defines the induced knapsack covers (Park and Park 1997; Boyd 1993) for the precedence-constrained knapsack polyhedra. Our definition of scenario dominance is similar to the ordering of scenarios that was discussed in Ruszczyński (2002); but the way we use the scenario ordering concept is fundamentally different than the approach in Ruszczyński (2002) because we compare the single scenario objectives of two scenarios based on a newly-defined scenario sub-problem to derive cuts for the risk-averse M-SMIPs, while Ruszczyński (2002) uses the ordering of scenarios to define induced knapsack covers. Furthermore, different than former work, our study, for the first time, defines the notion of “stage-t scenario dominance" and utilizes it for solving the risk-averse M-SMIPs more efficiently with tractable computational times.

Due to the limited number of available solution algorithms for the risk-averse multi-stage stochastic programs with integer variables, the main contribution of this paper is to provide a new approach named “scenario dominance” to solving risk-averse M-SMIPs. Specifically, we provide the following contributions: (i) presenting a new approach, named as “stage-t scenario dominance” that compares scenarios based on the realization of uncertain parameters under each scenario up to a certain time period t, derives a partial ordering of scenarios, and studies the implications of this ordering; (ii) defining a new scenario sub-problem, which involves all constraints of the original problem, including the non-anticipativity constraints, as opposed to the classical scenario sub-problem, which ignores the non-anticipativity relations among scenarios, as commonly used in the literature (Birge and Louveaux 2011; Lulli and Sen 2004; iii) deriving and proving the scenario-based formulation for the mean-CVaR multi-stage stochastic mixed-integer program; (iv) demonstrating the use of “scenario dominance" theory to derive bounds and cutting planes for the risk-averse mean-CVaR M-SMIPs based on the solution of a new scenario sub-problem; (v) deriving results on the minimum number of stage-t scenario-dominance relations in multi-stage scenario trees, and thus cutting planes generated; (vi) presenting the computational benefit using the proposed “scenario dominance" cuts and bounds to solve the risk-averse dynamic stochastic knapsack problem by conducting computational experiments; and (vii) showing that “scenario dominance" cuts can significantly reduce the solution time for large-scale mean-risk stochastic multi-stage multi-dimensional knapsack problems with both integer and continuous variables, whose structure is similar to the mean-risk M-SMIPs. The proposed framework is general and can be applied to a wide range of risk-averse M-SMIP problems.

1.2 Risk-averse multi-stage stochastic mixed-integer programs

1.2.1 Risk-neutral multi-stage stochastic MIPs

We consider a finite number of stages \(t \in {\mathcal {T}}:= \left\{ 1 , \ldots ,T\right\} \). Let \({\xi }_1\) be known, or deterministic, and \(\left( {\xi }_2, \ldots , {\xi }_T\right) \) denotes a sequence of uncertain parameters where \({\xi }_{[t]}:=\left\{ {\xi }_1, {\xi }_2, \ldots , {\xi }_t\right\} \) is the information observed by stage t. Let \(n_t\), \(q_t\), \(m_t\), and \(k_t\) represent the number of decision variables, the number of integer variables, the number of constraints, and the number of random variables, respectively, at time \(t \in {\mathcal {T}}\). Let \({\mathbb {R}}\) denote the set of real numbers. We assume that \({\xi }_{2}, \ldots , {\xi }_{T}\) is a discrete-time stochastic process with a finite probability space \((\Xi _t, F_t, P_t)\), where the support, \(\Xi _t \in {{\mathbb {R}}}^{k_t}\), represents a finite sample space for \(t=2,\ldots ,T\), and \(\Xi =\Xi _T\), \(F_t\) is the set of all events defined on \(\Xi _t\), and \(P_t\) is the corresponding probability distribution. Let \({{\mathbb {R}}}_+\) and \({{\mathbb {Z}}}_+\) denote the set of positive real numbers and positive integers, respectively. Let \(x:=(x_1,x_2,\ldots ,x_T)\) represent the decision vector with \(x_t\in {{\mathbb {R}}}_+^{n_t-q_t} \times {{\mathbb {Z}}}_+^{q_t} \). The sequence of decisions and the realizations of uncertain parameters are represented as \(x_1, {\xi }_2, x_2 (x_1, {\xi }_{[2]}), {\xi }_3, x_3 (x_1, x_2, {\xi }_{[3]}),\ldots , x_T(x_1,\ldots ,x_{T-1},{\xi }_{[T]})\).

We define the multi-stage stochastic mixed-integer program (M-SMIP) to be of the nested form below:

$$\begin{aligned} \ \ \min _{x_1} \left\{ c^{\top }_1x_1+ {{{\,\mathrm{{\mathbb {E}}}\,}}}_{{\xi }_{2}}\left[ Q_2(x_1, {\xi }_{[2]})\right] : \ A_1x_1 = b_1, x_1\in {{\mathbb {R}}}_+^{n_1-q_1} \times {{\mathbb {Z}}}_+^{q_1} \right\} \end{aligned}$$
(1)

where \(Q_t\left( \bullet , \bullet \right) \) is defined recursively for each \(t=2,\ldots , T\), as

$$\begin{aligned} Q_t(x_{t-1}, {\xi }_{[t]})= & {} \min _{x_t} \{ c^{\top }_t({\xi }_{[t]})x_t + {{{\,\mathrm{{\mathbb {E}}}\,}}}_{{{\xi }_{t+1}}|{\xi }_{[t]}}\left[ Q_{t+1}\left( x_t, {\xi }_{[t+1]}\right) \right] : \\&\quad A_{t}\left( {\xi }_{[t]} \right) x_{t-1}+ H_t\left( {\xi }_{[t-1]} \right) x_{t} = b_{t}\left( {\xi }_{[t]}\right) , \ x_t \in {{\mathbb {R}}}_+^{n_t-q_t} \times {{\mathbb {Z}}}_+^{q_t} \}, \end{aligned}$$

where \(c_1 \in {{\mathbb {R}}}^{n_1}\), \(A_1 \in {{\mathbb {R}}}^{m_1\times n_1}\), and \(b_1 \in {{\mathbb {R}}}^{m_1}\) are known; and the uncertain parameters that realize as the stochastic process \({\xi }\) evolves are given by \(c_t ({{\xi }_{[t]}}) \in {{\mathbb {R}}}^{n_t}\), \(A_t ({{\xi }_{[t]}}) \in {{\mathbb {R}}}^{m_t \times n_t}\), \(H_t ({{\xi }_{[t]}}) \in {{\mathbb {R}}}^{m_t \times n_t}\), and \(b_t ({{\xi }_{[t]}}) \in {{\mathbb {R}}}^{m_t}\); and \({{{\,\mathrm{{\mathbb {E}}}\,}}}_{{\xi }_{t+1}|{\xi }_{[t]}}\left[ \bullet \right] \) is the expectation with respect to \({\xi }_{t+1}\) conditioned on the realization \({\xi }_{[t]}\), for \(t = 2, \ldots ,T\). Note that for \(T=2\), the M-SMIP (1) is a two-stage SMIP.

Numerical optimization of multi-stage stochastic programs usually requires discrete probability distributions since otherwise, the multi-variate integrals involved become intractable (Heinze and Schultz 2008). So we consider a finite number of realizations of a discrete stochastic process \({\xi }=\left( {\xi }_1, \ldots , {\xi }_T\right) \), which has a finite support \(\Xi = \left\{ {\xi }^1, \ldots , {\xi }^{{{\bar{\Omega }}}}\right\} \) such that \(|\Xi |={{{\bar{\Omega }}}}\) for some positive integer \({{\bar{\Omega }}}\). Let \({\xi }_1^{\omega }\) be equal to the known parameter \({\xi }_1\) for each \(\omega \in \Omega :=\left\{ 1,2,\ldots ,{{\bar{\Omega }}}\right\} \). A realization of a sequence of random parameters in stages \(1,\ldots , T\), \({\xi }^{\omega }=\left( {\xi }_1^{\omega }, {\xi }_2^{\omega }, {\xi }_3^{\omega }, \ldots , {\xi }_T^{\omega }\right) \in \Xi \), is then referred as a scenario indexed by \(\omega \in \Omega \). Each scenario \({\xi }^{\omega } \in \Xi \) is associated with probability \(p^{\omega }\), which is computed as the product of the conditional probabilities of all nodes that belong to the scenario path \({\xi }^{\omega }\), such that \(\sum _{\omega \in \Omega }p^{\omega }=1\). The realizations of a scenario \({\xi }^{\omega }\) up to stage t is denoted by \({\xi }_{[t]}^{\omega }:= \left\{ {\xi }_1^{\omega }, {\xi }_2^{\omega }, {\xi }_3^{\omega }, \ldots , {\xi }_t^{\omega }\right\} \).

When there is a finite number of scenarios, the M-SMIP (1) can be equivalently formulated as a large-scale deterministic program by replicating the decision variable in stage t, \(x_t\), for each scenario \({\xi }^{\omega } \in \Xi \). Denoting \(x_{t}^{\omega }\) as the decision variable for stage \(t=1,\ldots ,T\), the M-SMIP given in (1) can be equivalently formulated as the extensive form of the multi-stage stochastic MIP (E-MSMIP) (2a):

$$\begin{aligned} \min _{{x_{1}^{\omega },\ldots ,x_{T}^{\omega }}}&\sum _{\omega \in \Omega } p^{\omega } \sum _{t \in {\mathcal {T}}} {c_{t}^{\omega }}^{\top } x_{t}^{\omega } \end{aligned}$$
(2a)
$$\begin{aligned} \text {s.t.}&A_1x_1^{\omega } = b_1, \ \omega \in \Omega , \end{aligned}$$
(2b)
$$\begin{aligned}&A_{t}^{\omega }x_{t-1}^{\omega }+ H_t^{\omega }x_{t}^{\omega } = b_{t}^{\omega }, \quad \forall t \in {\mathcal {T}}\backslash {\left\{ 1\right\} }, \ \omega \in \Omega , \end{aligned}$$
(2c)
$$\begin{aligned}&x_t^{\omega } = x_t^{\omega '} \quad \forall t \in {\mathcal {T}}, \ \omega , \omega ' \in \Omega \ \ s.t. \ \ {\xi }_{[t]}^{\omega } = {\xi }_{[t]}^{\omega '}, \end{aligned}$$
(2d)
$$\begin{aligned}&x_t^{\omega } \in {{\mathbb {R}}}_+^{n_t-q_t} \times {{\mathbb {Z}}}_+^{q_t} \quad \forall t \in {\mathcal {T}}, \ \ \omega \in \Omega , \end{aligned}$$
(2e)

where the random parameters become known with \(c_t^{\omega } \in {{\mathbb {R}}}^{n_t}\), \(b_t^{\omega } \in {{\mathbb {R}}}^{m_t}\), \(A_t^{\omega } \in {{\mathbb {R}}}^{m_t \times n_t}\), and \(H_t^{\omega } \in {{\mathbb {R}}}^{m_t \times n_t}\) under scenario \(\omega \). The non-anticipativity constraints (2d) ensure that \(x_t^{\omega }\) and \(x_t^{\omega '}\) must attain the same values as long as the paths corresponding to scenarios \({\xi }^{\omega }\) and \({\xi }^{\omega '}\) coincide up to stage t (i.e., \({\xi }_{[t]}^{\omega } = {\xi }_{[t]}^{\omega '}\)). Those non-anticipativity constraints ensure the real-time implementability of solutions.

1.2.2 Time consistency in multi-stage stochastic MIPs

The classical stochastic programming approach given in E-MSMIP (2a) may perform poorly if certain realizations of the stochastic parameters lead to an excessively high cost to accept. Thus, a mean-risk optimization formulation, which considers a risk-averse measure in addition to the expectation in the objective function, performs better than the expectation objective when the decision-making is at high-stake. In this paper, we focus on an asymmetric risk measure called Conditional Value-at-Risk (CVaR), which has been widely used as a risk measure in decision-making problems under uncertainty because it is a coherent measure of risk (see, for instance, see Rockafellar et al. (2000), Pflug (2000), and Ogryczak and Ruszczynski (2002) for more detailed discussions).

The incorporation of risk measures, such as CVaR, into the two-stage risk-averse stochastic program is straightforward (Ahmed 2006; Schultz and Tiedemann 2006; Schultz 2005; Fábián 2008). However, the definition of risk aversion is not apparent in the multi-stage setting. Risk measures can be applied at every stage additively or to the complete scenario path, only defining risk at the end of the planning horizon or can be represented in a nested form. One of the most important considerations in modeling the risk-averse multi-stage stochastic programs is time consistency. Risk-neutral or risk-averse two-stage stochastic programs are time consistent, while time consistency for multi-stage risk-averse stochastic programs is not guaranteed. Based on the definition of Homem-de Mello and Pagnoncelli (2016), time consistency means that if you solve a multi-stage stochastic program today and find solutions for each node of a tree, you should find the same solutions if you re-solve the problem tomorrow given what was observed and decided today. Various researchers have studied how to address the time consistency issue in multi-stage risk-averse stochastic programs. For example, Shapiro et al. (2013) propose a nested conditional risk measure for multi-stage optimization problems and prove its time consistency. However, their nested conditional risk measure is formulated as Bellman equations (Bellman 1957; Büyüktahtakın 2011), which cannot be solved in explicit form. Homem-de Mello and Pagnoncelli (2016) address this drawback by proposing a class of expected conditional risk measures (ECRMs) that is proven to be time consistent.

1.2.3 Mean-CVaR multi-stage stochastic MIP

Next, we analyze the incorporation of the multi-period CVaR in the objective function of the E-MSMIP (2a) and present the mean-CVaR multi-stage SMIP formulation, similar to the ECRM approach of Homem-de Mello and Pagnoncelli (2016), by casting the resulting risk-averse problem with ECRM as an extensive form that includes some additional variables and risk linearization constraints. Our definition of the multi-period CVaR below is also consistent with other multi-period nested CVaR models in the literature (see, e.g., Heinze and Schultz 2008; Dupačová and Kozmík 2015; Alonso-Ayuso et al. 2018).

Definition 1.1

[Multi-period CVaR (Eichhorn and Römisch 2005; Heinze and Schultz 2008)] The Conditional Value-at-Risk of a random variable \({c_t({{\xi }_{[t]}})}^{\top }x_t\) at confidence level \(\alpha _t \in [0,1)\) at each time stage \(t \in {\mathcal {T}}\backslash {\left\{ 1 \right\} }\) can be expressed as the optimal value of the following optimization problem:

$$\begin{aligned} {CVaR}_{\alpha _t}\left[ {c_t({{\xi }_{[t]}})}^{\top }x_t | \xi _{[t-1]}\right] = \min _{\eta _t \in {{\mathbb {R}}}} \left\{ \eta _t + \frac{1}{1-\alpha _t} {{{\,\mathrm{{\mathbb {E}}}\,}}}\left[ \left( {c_t({{\xi }_{[t]}})}^{\top }x_t -\eta _t\right) _{+} | \xi _{[t-1]}\right] \right\} . \end{aligned}$$
(3)

Proposition 1

Consider the following mean-CVaR problem:

$$\begin{aligned} \min _{x_t \in X_t} \left\{ \sum _{t \in {\mathcal {T}}} {{{\,\mathrm{{\mathbb {E}}}\,}}}\left[ {c_t({{\xi }_{[t]}})}^{\top }x_t | {\xi }_{[t-1]} \right] + \lambda \sum _{t \in {\mathcal {T}}\backslash {\left\{ 1 \right\} }} {CVaR}_{\alpha _t}\left[ {c_t({{\xi }_{[t]}})}^{\top }x_t | {\xi }_{[t-1]} \right] \right\} , \end{aligned}$$
(4)

where \(X_t{(x_{[t-1]},{\xi }_{[t]})}\) denotes the feasibility set at stage t with decisions up to stage \(t-1\) \((x_{[t-1]})\) and the realization of the uncertain parameters up to stage t \((\xi _{[t]})\), and \(\lambda \in {\mathbb {R_+}}\) is a trade-off coefficient that represents the relative weight of the risk objective \(({CVaR}_{\alpha _t}\left[ \bullet \right] )\) with respect to the expectation function \(({{{\,\mathrm{{\mathbb {E}}}\,}}}\left[ \bullet \right] )\). Given the above notations and definitions, we can equivalently formulate the mean-CVaR problem (4), as the following extensive mean-CVaR multi-stage stochastic MIP (CVaR M-SMIP) problem (5a):

$$\begin{aligned} (P) \ \min _{\begin{array}{c} x_{1}^{\omega },\ldots ,x_{T}^{\omega }, \\ \eta _{1}^{\omega },\ldots ,\eta _{T}^{\omega }, \\ v_{1}^{\omega },\ldots ,v_{T}^{\omega } \end{array}}&\sum _{\omega \in \Omega } p^{\omega } \sum _{t \in {\mathcal {T}}} {c_{t}^{\omega }}^{\top } x_{t}^{\omega } + \lambda \sum _{\omega \in \Omega } p^{\omega } \sum _{t \in {\mathcal {T}}\backslash {\left\{ 1 \right\} }} \left( \eta _t^{\omega } + \frac{1}{1-\alpha _t}v_t^{\omega }\right) \end{aligned}$$
(5a)
$$\begin{aligned} \text {s.t.}&A_1x_1^{\omega } = b_1, \ \omega \in \Omega , \end{aligned}$$
(5b)
$$\begin{aligned}&A_{t}^{\omega }x_{t-1}^{\omega }+ H_t^{\omega }x_{t}^{\omega } = b_{t}^{\omega } \quad \forall t \in {\mathcal {T}}\backslash {\left\{ 1 \right\} },\ \omega \in \Omega , \end{aligned}$$
(5c)
$$\begin{aligned}&v_t^{\omega } \ge {c_{t}^{\omega }}^{\top } x_{t}^{\omega } - \eta _t^{\omega }, \quad \forall {t \in {\mathcal {T}}\backslash {\left\{ 1 \right\} }} , \ \omega \in \Omega , \end{aligned}$$
(5d)
$$\begin{aligned}&x_t^{\omega } = x_t^{\omega '}, v_t^{\omega } = v_t^{\omega '}, \quad \forall t \in {\mathcal {T}}, \ \omega , \omega ' \in \Omega \ \ s.t. \ \ {\xi }_{[t]}^{\omega } = {\xi }_{[t]}^{\omega '}, \end{aligned}$$
(5e)
$$\begin{aligned}&x_t^{\omega } \in {{\mathbb {R}}}_+^{{n_t-q_t}} \times {{\mathbb {Z}}}_+^{q_t}, \quad \forall t \in {\mathcal {T}}, \ \ \omega \in \Omega , \end{aligned}$$
(5f)
$$\begin{aligned}&v_t^{\omega } \ge 0, \ \ \eta _t^{\omega } \in {{\mathbb {R}}}, \quad \forall t \in {\mathcal {T}}, \ \omega \in \Omega . \end{aligned}$$
(5g)

The proof of Proposition 1 is given in Appendix A.

2 Scenario-dominance

In this section, we introduce a new concept of “scenario dominance" and “dominance sets," considering the partial order relations of scenarios for the CVaR M-SMIP problem (5a). The partial ordering of scenarios is defined by comparing the scenario realizations of uncertain parameters. Using the partial ordering of scenarios and its implications based on the solution of a scenario sub-problem, we derive new bounds and cutting planes to improve the computational solvability of mean-CVaR M-SMIP formulation (5a).

2.1 Defining the scenario-dominance

Similar to the partial ordering of elements of a set, we define the partial ordering of scenarios in the scenario set \(\Xi \). Because a scenario is defined as the realizations of a random variable over multiple stages \(t \in {\mathcal {T}}\), the partial ordering of the scenarios is obtained by the point-wise comparison of scenario realizations at each stage t. Thus, in order to obtain a partial ordering of two scenarios \({\xi }^{\omega } \in \Xi \) and \({\xi }^{\omega '} \in \Xi \), the realization of scenario \({\xi }^{\omega } \in \Xi \) at each time period \(t \in {\mathcal {T}}\), \(\left( {\xi }_1^{\omega }, {\xi }_2^{\omega }, {\xi }_3^{\omega }, \ldots , {\xi }_T^{\omega }\right) \) will be pairwise compared with the realization of scenario \({\xi }^{\omega '} \in \Xi \) for each t, \(\left( {\xi }_1^{\omega '}, {\xi }_2^{\omega '}, {\xi }_3^{\omega '}, \ldots , {\xi }_T^{\omega '}\right) \). Note that \({\xi }_1^{\omega }={\xi }_1^{\omega '}\) because \({\xi }_1\) is deterministic. This comparison leads to the following definition for a partial order relation of the scenarios.

Definition 2.1

Scenario dominance Define a scenario realization at time \(t \in {\mathcal {T}}\) as \({\xi }_{t}^{\omega }\), and let \({\xi }_{t}^{\omega }\)be comprised of random elements \(\left( {c_t^{\omega }}, {b_t^{\omega }}, {A_{t}^{\omega }}, {H_{t}^{\omega }}\right) \) for \(\omega \in \Omega \). Given any two scenarios \({{\xi }}^{\omega }\) and \({{\xi }}^{\omega '}\), scenario \({{\xi }}^{\omega }\) dominates scenario \({{\xi }}^{\omega '}\), denoted by \({{\xi }}^{\omega '} \preceq {{\xi }}^{\omega }\), with respect to the problem P (5a) if

$$\begin{aligned} \left( p^{\omega }\ge p^{\omega '}\right) \ \wedge \ \left( {c_t^{\omega }} \ge {c_t^{\omega '}}\right) \ \wedge \ \left( {b_t^{\omega }} \ge {b_t^{\omega '}}\right) \ \wedge \ \left( {A_t^{\omega }} \le {A_t^{\omega '}}\right) \ \wedge \ \left( {H_t^{\omega }} \le {H_t^{\omega '}}\right) \quad \forall t \in {\mathcal {T}}, \end{aligned}$$

where \(\wedge \) denotes conjunction or ‘and’ operator, and the objective function (5a) portion for a single scenario index \({\hat{\omega }} = \omega , \omega '\) \(\left( p^{{\hat{\omega }}} \sum _{t \in {\mathcal {T}}} {c_{t}^{{\hat{\omega }}}}^{\top } x_{t}^{{\hat{\omega }}} + \lambda p^{{\hat{\omega }}} \sum _{t \in {\mathcal {T}}\backslash {\left\{ 1 \right\} }} \left( \eta _t^{{\hat{\omega }}} + \frac{1}{1-\alpha _t}v_t^{{\hat{\omega }}}\right) \right) \) is a non-decreasing function of \(\left( x^{{\hat{\omega }}}, \eta ^{{\hat{\omega }}}, v^{{\hat{\omega }}}\right) \).

If, for example, \(c_t^{{\hat{\omega }}} := \left( c_{1,t}^{{\hat{\omega }}}, \ldots , c_{n_t,t}^{{\hat{\omega }}}\right) \) and \(A_t^{{\hat{\omega }}} := \left( A_{1,1,t}^{{\hat{\omega }}}, \ldots , A_{m_t,n_t,t}^{{\hat{\omega }}}\right) \), then the scenario domination is determined based on an element-wise comparison of the realizations of \(c_t^{{\hat{\omega }}}\) and \(A_t^{{\hat{\omega }}}\) for \({\hat{\omega }}= \omega , \omega '\), such as:

$$\begin{aligned}&c_{j,t}^{\omega } \ge c_{j,t}^{\omega '} \quad \forall j=1,\ldots ,n_t, \quad \forall t=1,\ldots ,T, \\&A_{i,j,t}^{\omega } \le A_{i,j,t}^{\omega '} \quad \forall i=1,\ldots ,m_t, \quad \forall j=1,\ldots ,n_t, \quad \forall t=1,\ldots ,T. \end{aligned}$$

We now generalize the “Scenario Dominance” definition above to the “Stage-t Scenario Dominance” where we compare the realization of scenario \({\xi }^{\omega } \in \Xi \) up to stage \(t \in {\mathcal {T}}\), \(\left( {\xi }_1^{\omega }, {\xi }_2^{\omega }, {\xi }_3^{\omega }, \ldots , {\xi }_t^{\omega }\right) \) with the realization of scenario \({\xi }^{\omega '} \in \Xi \) up to stage t, \(\left( {\xi }_1^{\omega '}, {\xi }_2^{\omega '}, {\xi }_3^{\omega '}, \ldots , {\xi }_t^{\omega '}\right) \) to derive the stage-t dominance.

Definition 2.2

Stage-t Scenario dominance Given two scenarios \({{\xi }}^{\omega }\) and \({{\xi }}^{\omega '}\), scenario \({{\xi }}^{\omega }\) stage- t dominates scenario \({{\xi }}^{\omega '}\) for \(t \in {\mathcal {T}}\), denoted by \({\xi }_t^{\omega '} \preceq {\xi }_t^{\omega }\), with respect to the problem P (5a) if

and the objective function (5a) portion for a single scenario index \({\hat{\omega }} = \omega , \omega ' \left( p^{{\hat{\omega }}} \sum _{j=1}^{t} {c_{j}^{{\hat{\omega }}}}^{\top } x_{j}^{{\hat{\omega }}} + \lambda p^{{\hat{\omega }}} \sum _{j=2}^{t} \left( \eta _j^{{\hat{\omega }}} + \frac{1}{1-\alpha _j}v_j^{{\hat{\omega }}}\right) \right) \) is a non-decreasing function of \(\left( x^{{\hat{\omega }}}, \eta ^{{\hat{\omega }}}, v^{{\hat{\omega }}}\right) \).

Using the definition of the “scenario dominance” above, we define the scenario-dominance sets as follows:

Definition 2.3

Stage-t Scenario-dominance sets \({\Theta }^+_{t,{\xi }^{\omega }}\) is the index set of scenarios, which are stage-t dominated by scenario \({\xi }^{\omega } \in \Xi \), \({\Theta }^-_{t,{\xi }^{\omega }}\) is the index set of scenarios which stage-t dominate scenario \({\xi }^{\omega } \in \Xi \), and \(N_{t,{\xi }^{\omega }}\) is the index set of scenarios, which neither stage-t dominate nor are stage-t dominated by \({\xi }^{\omega } \in \Xi \) for \(t \in {\mathcal {T}}\) as defined below:

$$\begin{aligned} {\Theta }^+_{t,{\xi }^{\omega }}&:= \left\{ \omega ' \in \Omega , \omega ' \ne \omega : {\xi }_t^{\omega '} \preceq {\xi }_t^{\omega } \right\} , \\ {\Theta }^-_{t,{\xi }^{\omega }}&:= \left\{ \omega ' \in \Omega : {\xi }_t^{\omega } \preceq {\xi }_t^{\omega '} \right\} , \\ N_{t,{\xi }^{\omega }}&:= \left\{ \omega ' \in \Omega : {\xi }_t^{\omega } \not \preceq {\xi }_t^{\omega '}\ \text {and} \ {\xi }_t^{\omega '} \not \preceq {\xi }_t^{\omega } \right\} . \end{aligned}$$

Example 1

Figure 1 presents an example realization of the objective cost parameter, \(c_t\), in two stages resulting in 12 different scenario realizations. To simplify the presentation, we omit \(c_1\) in a scenario description and scenario-dominance comparisons since it takes the same value under all scenario realizations. Thus, here a scenario \({\xi }^{i}\) represents a realization of \(c_t\) in stages 2 and 3, \([c_2^i,c_3^i]\). In the example, the scenario \({\xi }^{5}\) realization is [49, 49]. The index set of scenarios, which are stage-3 dominated by \({\xi }^{5}\) is defined as \({\Theta }^+_{3,{\xi }^{5}} = \left\{ 1,2,3 \right\} \), while the index set of scenarios, which stage-3 dominate scenario \({\xi }^{5}\) is given by \({\Theta }^-_{3,{\xi }^{5}} = \left\{ 5,9,10,11,12 \right\} \). All other scenarios that neither dominate nor are dominated by \({\xi }^{5}\) is given by \(N_{{\xi }^{5}} = \left\{ 4,6,7,8 \right\} \).

Fig. 1
figure 1

Example of 12 scenario realizations of the \(c_t\) variable in two stages, where each black point represents a scenario \({\xi }^{i} := [c_2^i,c_3^i]\) for \(i=1,\ldots ,12\). The \({\Theta }^+_{3,{\xi }^{5}}\) represents the index set of scenarios that are stage-3 dominated by \({\xi }^{5}\), \({\Theta }^-_{3,{\xi }^{5}}\) represents the index set of scenarios that stage-3 dominate \({\xi }^{5}\), and \(N_{3, {\xi }^{5}}\) represents scenarios neither stage-3 dominate nor are stage-3 dominated by \({\xi }^{5}\)

2.2 The number of scenario-dominance relations

The number of scenario-dominance relations depends on a number of factors, including the structure of the scenario tree, how the random variables are generated, and the number of scenarios. Different methods have been proposed for generating a multi-period scenario tree (Heitsch and Römisch 2009; Pflug and Pichler 2015; Dupačová et al. 2000). Here we focus on the two most common types of multi-stage scenario trees based on the type of random variable generation, where random variables are completely independent and random variables are jointly distributed but stage-wise independent.

The number of scenario dominance cutting planes described in Sect. 2.4 relies on the number of scenario-dominance relations. Thus, next, we discuss the relations of two scenarios that share the same path up to a certain time period and exploit this result to compute the minimum number of scenario-dominance relations based on how the random variables are generated in a multi-stage scenario tree.

Proposition 2

Let \({\xi }^{\omega }\in \Xi \) and \({\xi }^{\omega '} \in \Xi \) be two scenarios such that \({\xi }_{[t']}^{\omega } = {\xi }_{[t']}^{\omega '}\) for some \(t' \in {\mathcal {T}}\). Then scenarios \({\xi }^{\omega }\) and \({\xi }^{\omega '}\) stage-t dominate each other, i.e., \({\xi }_t^{\omega '} \preceq {\xi }_t^{\omega }\) and \({\xi }_t^{\omega } \preceq {\xi }_t^{\omega '}\) for \(t=1,\ldots ,t'\).

Proof

It follows from non-anticipativity constraints (5e) that are ensured in a multi-stage stochastic program and the definition of the stage-t scenario dominance. \(\square \)

By Proposition 2, the scenarios that share the same path up to some stage t dominate each other in a scenario tree where there is at least one random variable at each stage. Thus, the minimum number of scenario-dominance relations could be defined by considering those dominance relations between scenarios that share a common history, motivating the following propositions.

Proposition 3

Let \(\left( {\xi }_2, \ldots , {\xi }_T\right) \) be a sequence of independently identically distributed (i.i.d) random vectors of \(k_t\) dimensions each. Each univariate component of a vector \({\xi }_t\) is jointly distributed. In other words, components of \({\xi }_t\) are positively or negatively correlated. In a scenario tree with l branches at each node in each stage, the minimum total number of scenario-dominance relations, W, is equal to:

$$\begin{aligned} W = \sum _{t=1}^{T} l^{(2T-t-1)}. \end{aligned}$$
(6)

Proof

In this scenario tree, we have \(l^{T-1}\) scenarios. At stage \(t'=1\), we have 1 node, and thus \(l^{T-1}/l^0\) scenarios share the same node at stage 1. At stage \(t'=2\), we have l nodes, and thus \(l^{T-1}/l^1\) scenarios share the same node at stage 2. At stage \(t'=3\), we have \(l^2\) nodes, and thus \(l^{T-1}/l^2\) scenarios share the same node at stage 3. Thus, at stage \(t'=t\), we have \(l^{t-1}\) nodes, and thus \(l^{T-1}/l^{(t-1)}\) scenarios share the same node at stage t. Based on Proposition 2, for those scenarios that share the same path up to stage t dominate each other, there are \({\left( {l^{T-1}}/{l^{t-1}}\right) }^2\) domination relations. Since \(l^{t-1}\) nodes are shared at stage t, there are at least a total of \(l^{t-1}{\left( {l^{T-1}}/{l^{t-1}}\right) }^2\) stage-t scenario-dominance relations. This results in a total number of \(\sum _{t=1}^{T} l^{(2T-t-1)}\) scenario-dominance relations over all stages at the minimum. \(\square \)

Proposition 4

Let \(\left( {\xi }_2, \ldots , {\xi }_T\right) \) be a sequence of independently identically distributed (i.i.d) random vectors of \(k_t=k\) dimensions each. Each univariate component of a vector \({\xi }_t\) is independently distributed. In a scenario tree with l branches at each node in each stage, the minimum total number of scenario-dominance relations, W, is equal to:

$$\begin{aligned} W = \sum _{t=1}^{T} l^{(2T-t-1)k}. \end{aligned}$$
(7)

Proof

In this scenario tree, we have \(l^{(T-1)k}\) scenarios. At stage \(t'=t\), we have \(l^{(t-1)k}\) nodes, and thus \(l^{(T-1)k_t}/l^{(t-1)k}\) scenarios share the same node at stage t, resulting in a total of \(l^{(t-1)k}{\left( {l^{(T-1)k}}/{l^{(t-1)k}}\right) }^2\) domination relations at stage t. This results in at least a total number of \(\sum _{t=1}^{T} l^{(2T-t-1)k}\) scenario-dominance relations over all stages. \(\square \)

Example 2

Consider a CVaR-SMKP instance in which \(T= 4\), \(l=2\), and \(k_t=2\) for \(t=1,\ldots ,T\). In the case where each univariate component of the random vector \({\xi }_t\) are jointly distributed, we have \(\bar{\Omega } =8\) scenarios and a minimum number of 64, 32, 16, and 8 scenario-dominance relations in stages one, two, three, and four, respectively, resulting in a total of at least \(W=120\) scenario-dominance relations. In the case where each univariate component of the random vector \({\xi }_t\) is independently distributed, we have \(\bar{\Omega }=64\) scenarios and a total of at least \(W=5440\) scenario-dominance relations at the minimum.

2.3 Scenario sub-problem and bounds

In this section, we define a new scenario sub-problem, which optimizes variables related to a subset of scenarios and assigns feasible solutions for all the other variables. This scenario sub-problem helps us derive new lower and upper bounds on the mean-CVaR multi-stage stochastic MIP problem. As a main contribution of the paper, using the solution of the scenario sub-problem and the scenario-dominance relations described in Sect. 2.1, we derive new cutting planes to chop off non-optimal points and find solutions faster to the highly-complex mean-CVaR multi-stage stochastic MIPs. To introduce the definitions and propositions below, let us define the feasible set for the original problem. The set of feasible solutions of the CVaR M-SMIP (5a) is denoted by \({\mathcal {X}}\) as given below:

$$\begin{aligned} {\mathcal {X}}:= & {} \{\left( x,\eta , v\right) \in {{\mathbb {R}}}_+^{{(n_1-q_1)}{\bar{\Omega }}} \times {{\mathbb {Z}}}_+^{{q_1}{\bar{\Omega }}} \times \ldots \times {{\mathbb {R}}}_+^{{(n_T-q_T)}{\bar{\Omega }}} \times {{\mathbb {Z}}}_+^{{q_T}{\bar{\Omega }}} \times {{\mathbb {R}}}^{{T}{\bar{\Omega }}} \times {{\mathbb {R}}}_+^{{T}{\bar{\Omega }}} : \\&\quad \text {Constraints } (\text {5b})-(\text {5g}) \}. \end{aligned}$$

Definition 2.4

(Scenario group sub-problem) Let \(S \subseteq \Omega \). The scenario group sub-problem involving each \(\omega \in S\), \((P^{S})\), is formulated as follows:

$$\begin{aligned} \begin{aligned}&Z^{S} := \min _{\left( x,\eta , v\right) \in {\mathcal {X}}} \sum _{\omega \in S} \left( p^{\omega } \sum _{t \in {\mathcal {T}}} {c_{t}^{\omega }}^{\top } x_{t}^{\omega }+ \lambda p^{\omega } \sum _{t \in {\mathcal {T}}\backslash {\left\{ 1 \right\} }} \left( \eta _t^{\omega } + \frac{1}{1-\alpha _t}v_t^{\omega }\right) \right) . \end{aligned} \end{aligned}$$
(8)

Note that the problem \((P^{S})\) (8) is equivalent to the original problem P (5a), if \(S = \Omega \).

Single scenario sub-problem The scenario group sub-problem for the set consisting of the single scenario, \({\xi }^{\omega }\), is denoted as the scenario-\({\xi }^{\omega }\) problem \((P^{\left\{ \omega \right\} })\) and formulated as follows:

$$\begin{aligned} {Z}^{\left\{ \omega \right\} } := \min _{\left( x,\eta , v\right) \in {\mathcal {X}}} p^{\omega } \sum _{t \in {\mathcal {T}}} {c_{t}^{\omega }} x_{t}^{\omega }+ \lambda p^{\omega } \sum _{t \in {\mathcal {T}}\backslash {\left\{ 1 \right\} }} \left( \eta _t^{\omega } + \frac{1}{1-\alpha _t}v_t^{\omega }\right) . \end{aligned}$$
(9)

Remark 1

The scenario-\({\xi }^{\omega }\) problem \((P^{\left\{ \omega \right\} })\) is an MIP, whose feasible region comprises all the variables and all the constraints of the original problem (5a) while its objective function includes only one specific scenario with index \({\omega }\). Consequently, the \(P^{\left\{ \omega \right\} }\) leads to a complete solution \(\left( x,\eta , v\right) \), which is feasible for (5b)–(5g) but optimizes only for one scenario \(\omega \).

In the following proposition, we assume that the objective function for each scenario is non-negative, i.e., \(f^{\omega }(x^{\omega }, \eta ^{\omega }, v^{\omega })= p^{\omega } \sum _{t \in {\mathcal {T}}} {c_{t}^{\omega }}^{\top } x_{t}^{\omega } + \lambda p^{\omega } \sum _{t \in {\mathcal {T}}\backslash {\left\{ 1 \right\} }} \left( \eta _t^{\omega } + \frac{1}{1-\alpha _t}v_t^{\omega }\right) \ge 0\), implying that \(c_{t}^{\omega }\ge 0\) for each \(\omega \in \Omega \) and \(t \in {\mathcal {T}}\). Note that this assumption is not necessary when \(S = \Omega \), since \(Z^{\Omega }=Z\).

Proposition 5

Let Z be the optimal objective value of the original problem P (5a). Then, \(P^{S}\) is a relaxation of P, that is \(Z \ge Z^{S} \ \ \forall S \subseteq \Omega \).

Proof

Define a new variable \(H \in {\mathbb {R}}\). The following problem is equivalent to the original problem P (5a):

$$\begin{aligned} Z := \min _{\left( x,\eta , v, H\right) \in \mathcal {\tilde{X}}} H, \end{aligned}$$

where

$$\begin{aligned} \mathcal {\tilde{X}}={\mathcal {X}}\!\cap \!\left\{ \left( x,\eta , v, H\right) : H \ge \sum _{\omega \in \Omega } p^{\omega } \sum _{t \in {\mathcal {T}}} {c_{t}^{\omega }}^{\top } x_{t}^{\omega } + \lambda \sum _{\omega \in \Omega } p^{\omega } \sum _{t \in {\mathcal {T}}\backslash {\left\{ 1 \right\} }} \left( \eta _t^{\omega } + \frac{1}{1-\alpha _t}v_t^{\omega }\right) \right\} . \end{aligned}$$

Similarly, the scenario group sub-problem \(P^{S}\) (8) can be equivalently written as in the following problem:

$$\begin{aligned} \begin{aligned}&Z^{S} := \min _{\left( x,\eta , v, H\right) \in \mathcal {\hat{X}}} H, \end{aligned} \end{aligned}$$

where \(\mathcal {\hat{X}}={\mathcal {X}}\cap \left\{ \left( x,\eta , v, H\right) : H \ge \sum _{\omega \in S} p^{\omega } \sum _{t \in {\mathcal {T}}} {c_{t}^{\omega }}^{\top } x_{t}^{\omega } + \lambda \sum _{\omega \in S} p^{\omega } \sum _{t \in {\mathcal {T}}\backslash {\left\{ 1 \right\} }}\left( \eta _t^{\omega } \!+\! \frac{1}{1-\alpha _t}v_t^{\omega }\right) \right\} .\) Clearly, \(\mathcal {\tilde{X}} \subseteq \mathcal {\hat{X}},\) which proves the proposition. \(\square \)

Definition 2.5

Stage-t single scenario sub-problem The single scenario problem, including the variables and constraints up to stage t, for \(t \in {\mathcal {T}}\), is denoted as the stage-t scenario-\({\xi }^{\omega }\) problem \((P_t^{\left\{ \omega \right\} })\) and formulated as follows:

$$\begin{aligned} \begin{aligned}&{Z_t}^{\left\{ \omega \right\} } := \min _{\left( x,\eta , v\right) \in {\mathcal {X}}} p^{\omega } \sum _{j=1}^{t} {c_{j}^{\omega }}^{\top } x_{j}^{\omega }+ \lambda p^{\omega } \sum _{j=2}^{t} \left( \eta _j^{\omega } + \frac{1}{1-\alpha _j}v_j^{\omega }\right) . \end{aligned} \end{aligned}$$
(10)

Definition 2.6

(Relaxed scenario group sub-problem) The relaxed scenario group sub-problem involving each \(\omega \in S\), \((P_R^{S})\), is formulated as follows:

$$\begin{aligned} \begin{aligned}&Z_R^{S} := \min _{\left( x,\eta , v\right) \in \mathcal {X^{S}}} \sum _{\omega \in S} \left( p^{\omega } \sum _{t \in {\mathcal {T}}} {c_{t}^{\omega }}^{\top } x_{t}^{\omega }+ \lambda p^{\omega } \sum _{t \in {\mathcal {T}}\backslash {\left\{ 1 \right\} }} \left( \eta _t^{\omega } + \frac{1}{1-\alpha _t}v_t^{\omega }\right) \right) , \end{aligned} \end{aligned}$$
(11)

where

$$\begin{aligned} {\mathcal {X}}^S&:= \{\left( x,\eta , v\right) \in {{\mathbb {R}}}_+^{{(n_1-q_1)}{\left| S\right| }} \times {{\mathbb {Z}}}_+^{{q_1}{{\left| S\right| }}} \times \ldots \times {{\mathbb {R}}}_+^{{(n_T-q_T)}{{\left| S\right| }}} \times {{\mathbb {Z}}}_+^{{q_T}{{\left| S\right| }}} \times {{\mathbb {R}}}^{{T}{{\left| S\right| }}} \times {{\mathbb {R}}}_+^{{T}{{\left| S\right| }}} : \\&\qquad \text {Constraints } (\text {5b}){-}(\text {5g}) \ \ \text {only for} \ \ \omega \in S \}. \end{aligned}$$

Proposition 6

\(P_R^{S}\) is a relaxation of \(P^{S}\) ; that is \(Z^{S} \ge Z_R^{S} \ \ \forall S \subseteq \Omega .\)

Remark 2

The \(P^{\left\{ \omega \right\} }\) is different than the conventional single scenario sub-problem \((P_R^{\left\{ \omega \right\} })\), which has been widely studied in the literature [see, e.g., Madansky (1960), Ahmed (2013), and Sandıkçı and Özaltın (2017)] because in the \(P_R^{\left\{ \omega \right\} }\) the objective function and each constraint of the problem (5a) are defined only for a specific scenario with scenario index \(\omega \) and non-anticipativity constraints (5e) are relaxed, while all variables and constraints are kept in the \(P^{\left\{ \omega \right\} }\). Clearly, the \(P_R^{\left\{ \omega \right\} }\) is a relaxation of \(P^{\left\{ \omega \right\} }\) as stated in Proposition 6.

Using the solution of the single-scenario sub-problems, we provide lower and upper bounds, which could then be used to strengthen the linear programming relaxation of the original problem, as demonstrated under the computational results under Sect. 4.5.

2.3.1 Lower bound

Definition 2.7

\(\Gamma : {{\mathbb {R}}}^m\rightarrow {{\mathbb {R}}}\) is superadditive on \(D \subseteq {{\mathbb {R}}}^m\) if \(\Gamma (a+b) \ge \Gamma (a) + \Gamma (b) \) for all \(a,b \in D\) such that \(a+b \in D\).

Proposition 7

\(Z^{\left\{ \omega \right\} }\) is superadditive on \(S \subseteq \Omega \). That is for \(\omega , \omega ' \in S\),

$$\begin{aligned} Z^{\left\{ \omega , \omega '\right\} } \ge Z^{\left\{ \omega \right\} }+Z^{\left\{ \omega '\right\} }. \end{aligned}$$
(12)

Proof

Let \((\hat{x}, {\hat{\eta }}, \hat{v})\), \(({\bar{x}},\bar{\eta }, {\bar{v}})\), and \((\breve{x},\breve{\eta }, \breve{v})\) be the optimal solutions of the scenario-\({\xi }^{\omega }\), scenario-\({\xi }^{\omega '}\), and scenario-\(\left\{ {\xi }^{\omega }, {\xi }^{\omega '}\right\} \) sub-problems, respectively. Then \(Z^{\left\{ \omega , \omega '\right\} } = p^{\omega }\sum _{t \in {\mathcal {T}}} {c_{t}^{\omega }}^{\top }\breve{x}_{t}^{\omega }+ \lambda p^{\omega } \sum _{t \in {\mathcal {T}}\backslash {\left\{ 1 \right\} }} \left( \breve{\eta }_{t}^{\omega } + \frac{1}{1-\alpha _t}\breve{v}_{t}^{\omega }\right) + p^{\omega '}\sum _{t \in {\mathcal {T}}} {c_{t}^{\omega '}}^{\top }\breve{x}_{t}^{\omega '} + \lambda p^{\omega '} \sum _{t \in {\mathcal {T}}\backslash {\left\{ 1 \right\} }} \left( \breve{\eta }_{t}^{\omega '} + \frac{1}{1-\alpha _t}\breve{v}_{t}^{\omega '}\right) \ge p^{\omega }\sum _{t \in {\mathcal {T}}} {c_{t}^{\omega }}^{\top }\hat{x}_{t}^{\omega }+ \lambda p^{\omega } \sum _{t \in {\mathcal {T}}\backslash {\left\{ 1 \right\} }} \left( {\hat{\eta }}_{t}^{\omega } + \frac{1}{1-\alpha _t}\hat{v}_{t}^{\omega }\right) + p^{\omega '}\sum _{t \in {\mathcal {T}}} {c_{t}^{\omega '}}^{\top }{\bar{x}}_{t}^{\omega '} + \lambda p^{\omega '} \sum _{t \in {\mathcal {T}}\backslash {\left\{ 1 \right\} }} \left( \bar{\eta }_{t}^{\omega '} + \frac{1}{1-\alpha _t}{\bar{v}}_{t}^{\omega '}\right) = Z^{\left\{ \omega \right\} }+Z^{\left\{ \omega '\right\} }\) because \(p^{\omega }\sum _{t \in {\mathcal {T}}} {c_{t}^{\omega }}^{\top }\breve{x}_{t}^{\omega } + \lambda p^{\omega } \sum _{t \in {\mathcal {T}}\backslash {\left\{ 1 \right\} }} \left( \breve{\eta }_{t}^{\omega } + \frac{1}{1-\alpha _t}\breve{v}_{t}^{\omega }\right) \ge p^{\omega }\sum _{t \in {\mathcal {T}}} {c_{t}^{\omega }}^{\top }\hat{x}_{t}^{\omega } + \lambda p^{\omega } \sum _{t \in {\mathcal {T}}\backslash {\left\{ 1 \right\} }} \left( {\hat{\eta }}_{t}^{\omega } + \frac{1}{1-\alpha _t}\hat{v}_{t}^{\omega }\right) \) and \(p^{\omega '}\sum _{t \in {\mathcal {T}}} {c_{t}^{\omega '}}^{\top }\breve{x}_{t}^{\omega '} + \lambda p^{\omega '} \sum _{t \in {\mathcal {T}}\backslash {\left\{ 1 \right\} }} \left( \breve{\eta }_{t}^{\omega '} + \frac{1}{1-\alpha _t}\breve{v}_{t}^{\omega '}\right) \ge \) \(p^{\omega '}\sum _{t \in {\mathcal {T}}} {c_{t}^{\omega '}}^{\top }{\bar{x}}_{t}^{\omega '} + \lambda p^{\omega '} \sum _{t \in {\mathcal {T}}\backslash {\left\{ 1 \right\} }} \left( \bar{\eta }_{t}^{\omega '} + \frac{1}{1-\alpha _t}{\bar{v}}_{t}^{\omega '}\right) \) by the optimality of solutions \((\hat{x}, {\hat{\eta }}, \hat{v})\) and \(({\bar{x}}, \bar{\eta }, {\bar{v}})\) to the scenario-\({\xi }^{\omega }\) and scenario-\({\xi }^{\omega '}\) problems, respectively. \(\square \)

Proposition 8

(Lower bound) Suppose that the objective function for each scenario is non-negative. Then \(Z^{S}\) and \(\sum _{\omega \in S} Z^{\left\{ \omega \right\} }\) provides a lower bound on the optimal objective function value of the original problem P (5a), Z, as in the following inequality:

$$\begin{aligned} Z \ge Z^{S} \ge \sum _{\omega \in S} Z^{\left\{ \omega \right\} }. \end{aligned}$$
(13)

Proof

The first inequality follows by Proposition 5, while the second inequality is the generalization of the superadditivity result in Proposition 7. \(\square \)

Remark 3

The lower bound (13) is tightest when \(S=\Omega \). In this case, the non-negativity assumption for each scenario’s objective function value is not required.

2.3.2 Upper bound

Proposition 9

(Upper bound) Let \({\dot{x}}^{\omega }\) be optimal solutions to the scenario-\({\xi }^{\omega }\) problem \(P^{\left\{ \omega \right\} }\); and let \({\bar{Z}}^{\omega }\) be the objective function value of P at solution \({\dot{x}}^{\omega }\). Then,

$$\begin{aligned} Z \le \min _{{\omega } \in \Omega }{\bar{Z}}^{\omega }. \end{aligned}$$
(14)

2.4 Scenario dominance cuts

In this section, we derive new cutting planes based on the “scenario dominance” concept defined in Sect. 2.1. We first define Lemmas 1 and 2 to motivate the main result on the scenario dominance cutting planes (Theorem 1). Then we present a corollary (Corollary 1.1) based on Theorem 1 to generalize the scenario dominance cuts to multiple time stages and further improve their computational benefits.

In this section, we present results regarding the general uncertainty, that is, the uncertainty in the right-hand side \(b_{t}^{\omega }\), left-hand side \((A_t^{\omega }, H_t^{\omega })\), and the objective coefficients \(c_t^{\omega }\). For notational brevity, we denote \(P^{\left\{ \omega \right\} }\) as \(P^{\omega }\) and \(Z^{\left\{ \omega \right\} }\) as \(Z^{\omega }\) for the rest of the paper. We give the following two definitions that are commonly used in the results presented in the rest of the paper.

Definition 2.8

Let \(\left( {\dot{x}}^{\omega }, {\dot{\eta }}^{\omega }, {\dot{v}}^{\omega }\right) \) be a feasible solution for the scenario-\({\xi }^{\omega }\) problem (9) and \(Z^{\omega }\) be the corresponding objective function value, i.e., \(Z^{\omega }:= p^{\omega } \sum _{t \in {\mathcal {T}}} {{c_{t}^{\omega }}^{\top }}{\dot{x}}_{t}^{\omega }+\lambda p^{\omega } \sum _{t \in {\mathcal {T}}\backslash {\left\{ 1 \right\} }} \left( {\dot{\eta }}_t^{\omega } + \frac{1}{1-\alpha _t}{\dot{v}}_t^{\omega }\right) \). Note that \(Z^{\omega }\) is a scalar value.

Lemma 1 below states that the portion of the optimal objective function value for the original problem (5a) that corresponds to the scenario-\({\xi }^{\omega }\) is bounded below by the optimal objective value of the scenario-\({\xi }^{\omega }\) problem.

Lemma 1

Let \((x^*, {\eta }^*, {v}^*)\) be the optimal solution to the original problem (5a) and \(({\dot{x}}^{\omega }, {\dot{\eta }}^{\omega }, {\dot{v}}^{\omega })\) be the optimal solution of the scenario-\({\xi }^{\omega }\) problem (9). Then, the portion of the optimal solution for the original problem (5a) that corresponds to scenario-\({\xi }^{\omega }\), \(({x^{\omega }}^*, {\eta ^{\omega }}^*, {v^{\omega }}^*)\), satisfies the following inequality:

$$\begin{aligned} p^{\omega } \sum _{t \in {\mathcal {T}}} {c_t^{\omega }}^{\top } {x_{t}^{\omega }} + \lambda p^{\omega } \sum _{t \in {\mathcal {T}}\backslash {\left\{ 1 \right\} }} \left( {\eta _t^{\omega }} + \frac{1}{1-\alpha _t}{v_t^{\omega }}\right) \ge Z^{\omega } \quad \forall \omega \in \Omega . \end{aligned}$$

Proof

This is a proof by contradiction. For some \(\omega \in \Omega \) and any \(({x^{\omega }}^*, {\eta ^{\omega }}^*, {v^{\omega }}^*)\) assume that

$$\begin{aligned}&p^{\omega } \sum _{t \in {\mathcal {T}}} {c_t^{\omega }}^{\top } {x_{t}^{\omega }}^* + \lambda p^{\omega } \sum _{t \in {\mathcal {T}}\backslash {\left\{ 1 \right\} }} \left( {\eta _t^{\omega }}^* + \frac{1}{1-\alpha _t}{v_t^{\omega }}^*\right) \nonumber \\&\quad < p^{\omega } \sum _{t \in {\mathcal {T}}} {c_t^{\omega }}^{\top } {\dot{x}}_t^{\omega } + \lambda p^{\omega } \sum _{t \in {\mathcal {T}}\backslash {\left\{ 1 \right\} }} \left( {\dot{\eta }}_t^{\omega } + \frac{1}{1-\alpha _t}{\dot{v}}_t^{\omega }\right) . \end{aligned}$$
(15)

Then define a new solution \(\acute{x}\) to P as follows:

$$\begin{aligned} \acute{x}_t = {x_{t}^*}, \acute{\eta }_t = {\eta _{t}^*}, \acute{v}_t = {v_{t}^*} \quad \forall t \in {\mathcal {T}}. \end{aligned}$$
(16)

By (16), the new solution \(\left( \acute{x}_t^{\omega }, \acute{\eta }_t^{\omega }, \acute{v}_t^{\omega }\right) \) is feasible and reduces the value of \(Z^{\omega }\) by (16), contradicting to the optimality of \(({\dot{x}}^{\omega }, {\dot{\eta }}^{\omega }, {\dot{v}}^{\omega })\) for the scenario-\({\xi }^{\omega }\) problem. \(\square \)

Lemma 2 below defines the relations of the optimal objective function value of the subproblems that correspond to two scenarios that have a dominance relation.

Lemma 2

Let \({\xi }^{\omega }\) be a scenario such that \(\omega \in {\Theta }^-_{T,{\xi }^{\omega '}} \subseteq \Omega \) for a given scenario \({\xi }^{\omega '}\) . Let \((\ddot{x}^{\omega }, \ddot{\eta }^{\omega }, \ddot{v}^{\omega })\) and \(({\dot{x}}^{\omega '}, {\dot{\eta }}^{\omega '}, {\dot{v}}^{\omega '})\) be the optimal solution of the scenario- \({\xi }^{\omega }\) and the scenario- \({\xi }^{\omega '}\) problems, \({Z}^{\omega }\) and \({Z}^{\omega '}\) be the corresponding objective values, respectively. Then the following inequality holds:

$$\begin{aligned} {Z}^{\omega } \ge {Z}^{\omega '} \ \ \ \forall \omega \in {\Theta }^-_{T,{\xi }^{\omega '}}. \end{aligned}$$

Proof

Add the following equalities to both scenario-\({\xi }^{\omega }\) and scenario-\({\xi }^{\omega '}\) problems, \(P^{\omega }\) and \(P^{\omega '}\):

$$\begin{aligned} x_t^{\omega } = x_t^{\omega '}, \ \eta _t^{\omega } = \eta _t^{\omega '}, \ v_t^{\omega } = v_t^{\omega '}, \quad \ \forall t \in {\mathcal {T}}. \end{aligned}$$
(17)

Define the new problems as \(\tilde{P}^{\omega }\) and \(\tilde{P}^{\omega '}\), respectively. Define the following sets of scenarios:

$$\begin{aligned} \Omega ^1= & {} \left\{ {\hat{\omega }} \in \Omega \backslash \left\{ \omega ,\omega '\right\} : \ \exists \ t \in {\mathcal {T}}\ \text {such that} \ {\xi }_{[t]}^{{\hat{\omega }}} = {\xi }_{[t]}^{\omega '} \ \text {for} \ \text {some} \ t > 1 \right\} . \\ \Omega ^2= & {} \left\{ \breve{\omega } \in \Omega \backslash \{\Omega ^1 \cup \left\{ \omega \right\} \cup \left\{ \omega ' \right\} \}\right\} . \end{aligned}$$

Note that the set \(\Omega ^1\) only includes scenarios \({\hat{\omega }} \ne \{ \omega , \omega ' \}\) that share a common history with scenario \(\omega '\) up to stage \(t>1\) and \(\Omega = \Omega ^1 \cup \Omega ^2 \cup \left\{ \omega \right\} \cup \left\{ \omega ' \right\} \).

Define a feasible solution \((\tilde{x}, {\tilde{\eta }}, \tilde{v})\) for \(\tilde{P}^{\omega }\) and \(\tilde{P}^{\omega '}\) as follows:

$$\begin{aligned} \tilde{x}_{t}^{\omega }&= \ddot{x}_{t}^{\omega }, \ {\tilde{\eta }}_{t}^{\omega } = \ddot{\eta }_{t}^{\omega }, \ \tilde{v}_{t}^{\omega } = \ddot{v}_{t}^{\omega } \quad \ \forall t \in {\mathcal {T}}; \end{aligned}$$
(18)
$$\begin{aligned} \tilde{x}_{t}^{\omega '}&= \ddot{x}_{t}^{\omega '}, \ {\tilde{\eta }}_{t}^{\omega '} = \ddot{\eta }_{t}^{\omega '}, \ \tilde{v}_{t}^{\omega '} = \ddot{v}_{t}^{\omega '}\quad \ \forall t \in {\mathcal {T}}; \end{aligned}$$
(19)
$$\begin{aligned} \tilde{x}_{t}^{\breve{\omega }}&= \ddot{x}_{t}^{\breve{\omega }} , {\tilde{\eta }}_{t}^{\breve{\omega }} = \ddot{\eta }_{t}^{\breve{\omega }}, \tilde{v}_{t}^{\breve{\omega }} = \ddot{v}_{t}^{\breve{\omega }} \quad \forall \breve{\omega } \in \Omega ^2, \quad \ \forall t \in {\mathcal {T}}; \end{aligned}$$
(20)
$$\begin{aligned} \tilde{x}_{t}^{{\hat{\omega }}}&= \ddot{x}_{t}^{\check{\omega }} , \ {\tilde{\eta }}_{t}^{{\hat{\omega }}} = \ddot{\eta }_{t}^{\check{\omega }}, \ \tilde{v}_{t}^{{\hat{\omega }}} = \ddot{v}_{t}^{\check{\omega }} \quad \ \forall \ {\hat{\omega }} \in \Omega ^1 \ \text {s.t.} \ \ {\xi }_{[t]}^{{\hat{\omega }}} = {\xi }_{[t]}^{\omega '} \ \ \text {for} \ \ t=1,\ldots , t', \ \forall t, t' \in {\mathcal {T}}, \end{aligned}$$
(21)

where scenario \(\check{\omega } \in \Omega \) satisfies the following conditions for a given \({\hat{\omega }} \in \Omega ^1\) such that \({\xi }_{[t]}^{{\hat{\omega }}} = {\xi }_{[t]}^{\omega '}\) for \(t=1,\ldots , t'\): \({\xi }_{[t]}^{\check{\omega }} = {\xi }_{[t]}^{\omega }\) for \(t=1,\ldots , t'\) and \({\xi }_{[t]}^{\check{\omega }}= {\xi }_{[t]}^{{\hat{\omega }}}\) for \(t=t'+1,\ldots , T\).

The solution defined in (18)–(21) is feasible for both \(\tilde{P}^{\omega '}\) and \(\tilde{P}^{\omega }\) because it satisfies constraints (5b)–(5d), as well as the non-anticipativity constraints (5e). Define the optimal objective value of \(\tilde{P}^{\omega }\) as \(\tilde{Z}^{\omega }\), and the new optimal value for \(\tilde{P}^{\omega '}\) as \(\tilde{Z}^{\omega '}\). The solution \((\tilde{x}, {\tilde{\eta }}, \tilde{v})\) defined in (18)–(21) is optimal for \(\tilde{P}^{\omega }\) because, for each feasible \((x^{\omega }, {\eta }^{\omega },v^{\omega })\), we have \(\tilde{Z}^{\omega }\ge Z^{\omega }\) and \(\tilde{Z}^{\omega } = Z^{\omega }\). Because we have \(\tilde{x}^{\omega } = \tilde{x}^{\omega '}\), \({\tilde{\eta }}^{\omega } = {\tilde{\eta }}^{\omega '}\), and \(\tilde{v}^{\omega } = \tilde{v}^{\omega '}\) and \(c_t^{\omega } \ge c_t^{\omega '}\),

$$\begin{aligned} \tilde{Z}^{\omega } \ge \tilde{Z}^{\omega '}. \end{aligned}$$
(22)

Since \(b_{t}^{\omega } \ge b_{t}^{\omega '}\), \(A_{t}^{\omega } \le A_{t}^{\omega '}\) and \(H_{t}^{\omega } \le H_{t}^{\omega '}\), and \(c_t^{\omega } \ge c_t^{\omega '}\), a feasible solution \((\tilde{x}^{\omega '}, {{\tilde{\eta }}}^{\omega '},\tilde{v}^{\omega '})\) has to satisfy constraints related to the scenario-\({\xi }^{\omega }\) by the addition of constraints (17), and thus we have

$$\begin{aligned} \tilde{Z}^{\omega '}\ge Z^{\omega '}. \end{aligned}$$
(23)

Because \(\tilde{x}^{\omega }=\ddot{x}^{\omega }\), \({\tilde{\eta }}^{\omega } = \ddot{\eta }^{\omega }\), and \(\tilde{v}^{\omega } = \ddot{v}^{\omega }\), we have

$$\begin{aligned} \tilde{Z}^{\omega } = {Z}^{\omega }. \end{aligned}$$
(24)

Furthermore, we have

$$\begin{aligned} {Z}^{\omega } = \tilde{Z}^{\omega } \ge \tilde{Z}^{\omega '} \ge Z^{\omega '}. \end{aligned}$$
(25)

The first and second equalities in (25) hold by (24) and (22), respectively, while the last inequality holds by (23). \(\square \)

Using those two lemmas above, we show the main theorem below on the scenario dominance cut that provides a lower bound on the portion of the optimal objective value for the original problem (5a) that corresponds to the scenario-\({\xi }^{\omega }\) problem. This lower bound is obtained by solving the single-scenario problems that are dominated by the scenario \({\xi }^{\omega }\).

Theorem 1

Scenario dominance cuts Assume that uncertainty in problem (5a) is observed in the objective function \(c_{t}^{\omega }\), in the right-hand side coefficient \(b_{t}^{\omega }\), and the left-hand side coefficient \((A_t^{\omega }, H_t^{\omega })\). Let \({\xi }^{\omega }\) and \({\xi }^{\omega '}\) be two scenarios such that \({\xi }^{\omega '} \preceq {\xi }^{\omega }\), \({\omega },{\omega '}\in \Omega \) and \(\omega \ne \omega '\), i.e., \(p^{\omega } \ge p^{\omega '}\), \(b_{t}^{\omega } \ge b_{t}^{\omega '}\), \(A_{t}^{\omega } \le A_{t}^{\omega '}\) and \(H_{t}^{\omega } \le H_{t}^{\omega '}\), and \(c_t^{\omega } \ge c_t^{\omega '}\) for all \(t \in {\mathcal {T}}\). Then, the portion of the optimal solution for the original problem (5a) that corresponds to scenario-\({\xi }^{\omega }\), \(({x^{\omega }}^*, {\eta ^{\omega }}^*, {v^{\omega }}^*)\), satisfies the following inequality:

$$\begin{aligned} p^{\omega } \sum _{t \in {\mathcal {T}}} {c_t^{\omega }}^{\top } {x_{t}^{\omega }} + \lambda p^{\omega } \sum _{t \in {\mathcal {T}}\backslash {\left\{ 1 \right\} }} \left( {\eta _t^{\omega }} + \frac{1}{1-\alpha _t}{v_t^{\omega }}\right) \ge Z^{\omega '}\ \ \ \forall \omega \in {\Theta }^-_{T,{\xi }^{\omega '}}. \end{aligned}$$
(26)

Proof

The first inequality below follows from Lemma 1, while the second follows from Lemma 2.

$$\begin{aligned} p^{\omega } \sum _{t \in {\mathcal {T}}} {c_t^{\omega }}^{\top } {x_{t}^{\omega }}^* + \lambda p^{\omega } \sum _{t \in {\mathcal {T}}\backslash {\left\{ 1 \right\} }} \left( {\eta _t^{\omega }}^* + \frac{1}{1-\alpha _t}{v_t^{\omega }}^*\right) \ge Z^{\omega } \ge Z^{\omega '}. \end{aligned}$$

\(\square \)

In the corollary following Theorem 1, we generalize the cuts to multiple stages by solving each stage-t problem for \(t \in {\mathcal {T}}\).

Definition 2.9

Let \((\dot{x_{t}}^{\omega }, \dot{\eta _{t}}^{\omega }, \dot{v_{t}}^{\omega })\) be the optimal solution for the stage-t scenario-\({\xi }^{\omega }\) problem (10) and \(Z_t^{\omega }\) be the corresponding optimal objective value such that

$$\begin{aligned} Z_t^{\omega } = p^{\omega } \sum _{j=1}^{t} {c_{j}^{\omega }}^{\top } \dot{x_{j}}^{\omega }+ \lambda p^{\omega } \sum _{j=2}^{t} \left( \dot{\eta _j}^{\omega } + \frac{1}{1-\alpha _j}\dot{v_j}^{\omega }\right) . \end{aligned}$$

Corollary 1.1

Stage-t Scenario dominance cuts Let \({\xi }^{\omega }\) and \({\xi }^{\omega '}\) be two scenarios \(l,{\omega }\in \Omega \) such that \({\xi }_t^{\omega '} \preceq {\xi }_t^{\omega }\) for \(t \in {\hat{{\mathcal {T}}}}=\left\{ 1,\ldots ,\hat{t}\right\} \subseteq {\mathcal {T}}\), where \(\hat{t} \in {\mathcal {T}}\). Then, the portion of the optimal solution for the original problem (5a) that corresponds to scenario-\({\xi }^{\omega }\) up to stage t, \(({x_{t}^{\omega }}^*, {\eta _{t}^{\omega }}^*, {v_{t}^{\omega }}^*)\), satisfies the following inequality:

$$\begin{aligned} p^{\omega } \sum _{j=1}^{t} {c_{j}^{\omega }}^{\top } x_{j}^{\omega }+ \lambda p^{\omega } \sum _{j=2}^{t} \left( \eta _j^{\omega } + \frac{1}{1-\alpha _j}v_j^{\omega }\right) \ge Z_t^{\omega '} \ \ \ \forall \omega \in {\Theta }^-_{t, {\xi }^{\omega '}},\ t \in {\hat{{\mathcal {T}}}} \subseteq {\mathcal {T}}. \end{aligned}$$
(27)

Proof

The proof follows from the application of Theorem 1 to each stage \(t \in {\mathcal {T}}\). \(\square \)

Remark 4

The stage-t scenario dominance cuts (27) are generalizations of the scenario dominance cuts (26). Both (26) and (27) are satisfied by all optimal solutions to the original problem (5a); however they are not valid inequalities for the feasible region of the problem (5a), i.e., they both may cut-off a non-optimal feasible solution, while preserving the optimal solution. This type of inequalities are not uncommon (see, e.g., Bender’s optimality cuts Benders (2005), supervalid inequalities of Israeli and Wood (2002), and the pre-processing cuts presented in Kibis et al. (2020)), and they are useful because they help to shrink the feasible region while preserving the optimal solution.

Remark 5

For problem instances, P, where it is easy to find a feasible solution but hard to find the optimal one, we expect that the single scenario sub-problem, \(P^{\omega }\), could be solved much faster than P because the \(P^{\omega }\) is a relaxation of P, as shown in Proposition 5. Furthermore, in the \(P^{\omega }\), we optimize the value of the variables related to a single scenario \(\omega \), while we only need to assign a feasible solution to all other variables in order to satisfy the constraint set, \({\mathcal {X}}\). This observation is also justified by the computational results for the dynamic knapsack problems in Sect. 4. On the other hand, if finding a feasible solution is hard, the single scenario sub-problems could take a long time to solve. In such cases, we suggest solving the relaxed single scenario sub-problem \(P_R^{\omega }\) rather than the \(P^{\omega }\). Because the \(P_R^{\omega }\) is merely a relaxation of the \(P^{\omega }\), including only the constraints and variables of a single scenario, the lower bound (13), scenario dominance cuts (26), and stage-t scenario dominance cuts (27) would still hold when \(P_R^{\omega }\) is used to generate those inequalities.

3 Risk-averse multi-stage stochastic knapsack

We consider a class of stochastic mean-CVaR multi-stage multi-dimensional mixed 0-1 knapsack problems (SMKPs) as a benchmark set for our computations. The classical deterministic, single-dimensional, and risk-neutral version of the problem is known as a maximization binary knapsack problem (KP): given a set of items with profits and sizes, and the capacity value of a knapsack, the objective is to maximize the total profit of selected items in the knapsack while satisfying the capacity constraint (Kellerer et al. 2004). Another widely used variant is the minimization knapsack problem (GüNtzer and Jungnickel 2000; Han and Makino 2009; Csirik 1991), that is, given a set of items associated with costs and sizes, our goal is to minimize the total cost of selected items while ensuring that the knapsack is covered, i.e., the sum of the items’ sizes is larger than or equal to a constant. GüNtzer and Jungnickel (2000) consider a minimization equivalent to the maximization KP, where binary selection variables are negated in the formulation. A generalization of the KP is the multi-dimensional knapsack (MKP), which involves multiple constraints, each corresponding to a knapsack, whose capacity should be satisfied. The structure of the MKP is quite similar to the general integer programs with multiple constraints.

When the costs of items are not fixed and are instead given by a probability distribution, the problem is broadly referred to as the stochastic knapsack problem (Morton and Wood 1998; Dean et al. 2008; Angulo et al. 2016). Angulo et al. (2016) and Guan et al. (2009) present cutting plane approaches to improve the solvability of the stochastic dynamic knapsack problem. In the computational experiments, we tackle solving the general stochastic multi-stage multi-dimensional knapsack problem (SMKP) with integer and binary variables and a mean-risk objective.

3.1 Mean-CVaR multi-stage stochastic knapsack formulation

We adapt the scenario-dominance cuts presented in Sect. 2.4 to solve a class of mean-CVaR stochastic multi-dimensional mixed 0-1 knapsack problems (CVaR-SMKPs) and demonstrate their computational benefit on this class of multi-stage stochastic MIPs. CVaR-SMKP represents the general mean-CVaR multi-stage stochastic mixed-integer programs with binary and continuous variables in the first and subsequent stages.

To formulate the CVaR-SMKP, we define the following notation:

Indices:

  • t: index for time period, where \(t \in {\mathcal {T}}=\left\{ 1,\ldots , T\right\} \), and T is the final time period.

  • i: index for item, where \(i \in {\mathcal {I}}=\left\{ 1,\ldots , I\right\} \), and I is the maximum number of items.

  • \(\omega \): index for scenario, where \(\omega \in \Omega =\left\{ 1,\ldots , \bar{\Omega }\right\} \), and \(\bar{\Omega }\) is the maximum number of scenarios.

Variables:

  • \(x_{it}^{\omega } \in {\mathbb {B}}\): \(0-1\) binary variable to select an item i in period t under scenario \(\omega \in \Omega \), where \({\mathbb {B}}\) is the set of binary integers.

  • \(z_{t}^{\omega } \in \mathbb {R_+}\): continuous variable in period t under scenario \(\omega \in \Omega \).

  • \(y_{t}^{\omega } \in {\mathbb {B}}\): \(0-1\) binary variable that relates to the variable \(x_{it}^{\omega }\) in period t under scenario \(\omega \in \Omega \).

  • \(\eta _t^{\omega } \in {\mathbb {R}}\): VaR variable in period t under scenario \(\omega \in \Omega \).

  • \(v_{t}^{\omega } \in \mathbb {R_+}\): continuous variable that represents the excess cost over the VaR variable in period t under scenario \(\omega \in \Omega \).

All parameters defined below are real numbers, e.g., belong to the field of \({\mathbb {R}}\):

Stochastic parameters:

  • \(q_{t}^{\omega }\): cost coefficient associated with variable \(y_{t}^{\omega }\) in period t under scenario \(\omega \in \Omega \).

  • \(r_{t}^{\omega }\): coefficient corresponding to the continuous variable \(z_{t}^{\omega }\) in period t under scenario \(\omega \in \Omega \).

  • \(b_{t}^{\omega }\): minimum total value of all selected items in period t under scenario \(\omega \in \Omega \).

Deterministic parameters:

  • \(c_{it}\): cost coefficient associated with variable \(x_{it}^{\omega }\) for item i in period t.

  • \(d_{t}\): cost coefficient associated with variable \(z_{t}^{\omega }\) in period t.

  • \(a_{it}\): value of item i in period t.

  • \({\kappa }_{it}\): size of item i in period t.

  • \(w_{t}\): coefficient corresponding to the continuous variable \(y_{t}^{\omega }\) in period t.

  • \(h_{t}\): minimum total size of all selected items in period t.

All objective coefficients are assumed to be non-negative. Using the above notation, the CVaR-SMKP (28a) can be formulated as follows.

$$\begin{aligned} \min _{\begin{array}{c} x_{i1}^{\omega },\ldots ,x_{iT}^{\omega }, \\ z_{1}^{\omega },\ldots ,z_{T}^{\omega }, \\ y_{1}^{\omega },\ldots ,y_{T}^{\omega }, \\ \eta _{1}^{\omega },\ldots ,\eta _{T}^{\omega }, \\ v_{1}^{\omega },\ldots ,v_{T}^{\omega } \end{array}}&\sum _{\omega \in \Omega } p^{\omega } \sum _{t \in {\mathcal {T}}} \left( \sum _{i \in {\mathcal {I}}}c_{it}x_{it}^{\omega } + d_{t}z_{t}^{\omega } + q_{t}^{\omega }y_{t}^{\omega } \right) \nonumber \\&+\,\lambda \sum _{\omega \in \Omega } p^{\omega } \sum _{t \in {\mathcal {T}}\backslash {\left\{ 1 \right\} }} \left( \eta _t^{\omega } + \frac{1}{1-\alpha _t}v_{t}^{\omega }\right) \quad \end{aligned}$$
(28a)
$$\begin{aligned} \text {s.t.}&\sum _{j=1}^t \sum _{i \in {\mathcal {I}}} a_{ij}x_{ij}^{\omega }+ r_{t}^{\omega }z_{t}^{\omega } \ge b_{t}^{\omega } \quad \forall t \in {\mathcal {T}}, \ \omega \in \Omega , \end{aligned}$$
(28b)
$$\begin{aligned}&\sum _{i \in {\mathcal {I}}} {\kappa }_{it}x_{it}^{\omega } + w_{t}y_{t}^{\omega } \ge h_{t} \quad \forall t \in {\mathcal {T}}, \ \omega \in \Omega , \end{aligned}$$
(28c)
$$\begin{aligned}&v_{t}^{\omega } \ge \sum _{i \in {\mathcal {I}}}c_{it}x_{it}^{\omega } + d_{t}z_{t}^{\omega } + q_{t}^{\omega }y_{t}^{\omega } - \eta _t^{\omega } \quad \forall t \in {\mathcal {T}}, \ \omega \in \Omega , \end{aligned}$$
(28d)
$$\begin{aligned}&x_{it}^{\omega } = x_{it}^{\omega '} \quad \forall t \in {\mathcal {T}}, \ i \in {\mathcal {I}}, \ \omega , \omega ' \in {\Omega } \ s.t.\ {\xi }_{[t]}^{\omega } = {\xi }_{[t]}^{\omega '} , \end{aligned}$$
(28e)
$$\begin{aligned}&z_t^{\omega } = z_t^{\omega '},\ y_t^{\omega } = y_t^{\omega '} \quad \forall t \in {\mathcal {T}}, \ \omega , \omega ' \in {\Omega } \ s.t.\ {\xi }_{[t]}^{\omega } = {\xi }_{[t]}^{\omega '} , \end{aligned}$$
(28f)
$$\begin{aligned}&{\eta }_t^{\omega } = {\eta }_t^{\omega '}, \ v_{t}^{\omega } = v_t^{\omega '} \quad \forall t \in {\mathcal {T}}, \ \omega , \omega ' \in {\Omega } \ s.t.\ {\xi }_{[t]}^{\omega } = {\xi }_{[t]}^{\omega '} , \end{aligned}$$
(28g)
$$\begin{aligned}&x_{it}^{\omega }, y_{t}^{\omega } \in \{0,1\} \quad \forall t \in {\mathcal {T}}, \ i \in {\mathcal {I}}, \ \omega \in \Omega , \end{aligned}$$
(28h)
$$\begin{aligned}&z_{t}^{\omega }, v_{t}^{\omega } \ge 0 \quad \forall t \in {\mathcal {T}}, \ \omega \in \Omega . \end{aligned}$$
(28i)

The mean-risk function (28a) minimizes the expected sum of knapsack costs and the weighted VaR and the weighted positive expected excess cost over the VaR over all items \(i \in {\mathcal {I}}\), periods \(t \in {\mathcal {T}}\), and scenarios \(\omega \in \Omega \). Constraints (28b) and (28c) define knapsack-related constraints, where constraint (28b) [(28c)] ensures that the sum of the value [size] of selected items should be larger than a given constant. Constraints (28d) define VaR and the excess cost under each scenario over the VaR variable \(\eta _t^{\omega }\). Constraints (28e)–(28g) represent the nonanticipativity constraints for the x, z, y, \(\eta \), and v variables, respectively. Finally, (28h) represent binary integer restrictions on x and y variables, while constraints (28i) represent lower bounds on the z, \(\eta \), and v variables, respectively. Note that CVaR-SMKP formulated above includes uncertainty in the objective function parameter \(q_{t}^{\omega }\), the left-hand side parameter \(r_{t}^{\omega }\), and the right-hand side parameter \(b_{t}^{\omega }\).

3.2 Bounds

Proposition 10

Let \(S \subseteq \Omega \) be a subset of scenarios. For a given \(\omega ' \in S\), let \(({\dot{x}}^{\omega '}, {\dot{z}}^{\omega '}, {\dot{y}}^{\omega '}, \dot{{\eta }}^{\omega '}, {\dot{v}}^{\omega '})\) and \(Z^{\omega '}\) be the optimal solution and the corresponding objective function value of the scenario-\(\xi ^{\omega '}\) problem, respectively. Then an optimal solution to the CVaR-SMKP (28a), \(({x}^*, {z}^*, {y}^*, {\eta }^*, {v}^*)\), satisfies the following inequality:

$$\begin{aligned} \sum _{\omega \in \Omega } p^{\omega } \sum _{t \in {\mathcal {T}}} \left( \sum _{i \in {\mathcal {I}}}c_{it}{x_{it}^{\omega }} + d_{t}{z_{t}^{\omega }} + q_{t}^{\omega }{y_{t}^{\omega }} \right) + \lambda \sum _{\omega \in \Omega } p^{\omega } \sum _{t \in {\mathcal {T}}\backslash {\left\{ 1 \right\} }} \left( {\eta _t^{\omega }} \!+\! \frac{1}{1-\alpha _t}{v_{t}^{\omega }}\right) \ge \!\sum _{\omega ' \in S} Z^{\omega '}. \end{aligned}$$
(29)

Proof

In the CVaR-SMKP (28a), we assume that all objective coefficients are non-negative, implying that the specific scenario objective function values are also non-negative. Then, the result follows immediately by applying inequality (13) in Proposition  8 for the CVaR-SMKP. \(\square \)

Proposition 11

Let \(S \subseteq \Omega \) be a subset of scenarios. For a given \(\omega ' \in S\), let \({\bar{Z}}^{\omega '}\) be the objective function value of (28a) at the optimal solution of the scenario-\(\xi ^{\omega '}\) problem, \(({\dot{x}}^{\omega '}, {\dot{z}}^{\omega '}, {\dot{y}}^{\omega '}, \dot{{\eta }}^{\omega '}, {\dot{v}}^{\omega '})\). Then an optimal solution to the CVaR-SMKP (28a), \(({x}^*, {z}^*, {y}^*, {\eta }^*, {v}^*)\), satisfies the following inequality:

$$\begin{aligned}&\sum _{\omega \in \Omega } p^{\omega } \sum _{t \in {\mathcal {T}}} \left( \sum _{i \in {\mathcal {I}}}c_{it}{x_{it}^{\omega }} + d_{t}{z_{t}^{\omega }} + q_{t}^{\omega }{y_{t}^{\omega }} \right) \nonumber \\&\quad +\, \lambda \sum _{\omega \in \Omega } p^{\omega } \sum _{t \in {\mathcal {T}}\backslash {\left\{ 1 \right\} }} \left( {\eta _t^{\omega }} \!+\! \frac{1}{1-\alpha _t}{v_{t}^{\omega }}\right) \le \min _{\omega ' \in S} {\bar{Z}}^{\omega '}. \end{aligned}$$
(30)

Proof

The result follows immediately by applying inequality (14) in Proposition 9 for the CVaR-SMKP. \(\square \)

Example 3

Consider a CVaR-SMKP instance in which \(T= 4\) and \(I=2\). This instance has \(\bar{\Omega } =8\) scenarios, and the probability of each scenario is set as 1/8. The risk parameter \(\lambda \) is set as 1 and \(\alpha _t=0.95\) for each \(t \in {\mathcal {T}}\). We assume that all parameter values are set at 0 at period \(t=1\). The data pertaining the instance is: \(\ c_{1t} = (59,35,67), \ c_{2t} = (23,16,14), \ d_t = (47,3,67), \ a_{1t} = (75, 95, 4), \ a_{2t} = (82, 56, 26), \ {\kappa }_{1t} = (98, 9, 20), \ {\kappa }_{2t} = (41, 1, 60), \ w_t = (84,97,57), h_{t} = (167.2, 80.2, 102.7)\) for \(t = \{2,3,4\}\). The parameter values at \(t=1\) are assumed to be known and are set as zero in this example. The values of the uncertain parameters \(\left( q_{t}^{\omega }, r_{t}^{\omega }, b_{t}^{\omega }\right) \) are presented in Table 1.

Table 1 Values of the uncertain parameters \(\left( q_{t}^{\omega }, r_{t}^{\omega }, b_{t}^{\omega }\right) \) for \(\omega = \{1,\ldots ,8\}\) and \(t = \{2,3,4\}\)

Now, let \(S =\left\{ 1,2,3,4\right\} \). Then \(Z^1 = 69.9\), \(Z^2 = 80.6\), \(Z^3 = 72.39\), and \(Z^4 = 83.14\). For S the lower-bound inequality (29) is:

$$\begin{aligned} \sum _{\omega \in \Omega } p^{\omega } \sum _{t \in {\mathcal {T}}} \left( \sum _{i \in {\mathcal {I}}}c_{it}{x_{it}^{\omega }} + d_{t}{z_{t}^{\omega }} + q_{t}^{\omega }{y_{t}^{\omega }} \right) + \lambda \sum _{\omega \in \Omega } p^{\omega } \sum _{t \in {\mathcal {T}}\backslash {\left\{ 1 \right\} }} \left( {\eta _t^{\omega }} + \frac{1}{1-\alpha _t}{v_{t}^{\omega }}\right) \ge 306.04. \end{aligned}$$
(31)

The minimum upper-bound is attained when solving the scenario problem with \(\omega =4\), and thus the upper-bound inequality (30) is:

$$\begin{aligned} \sum _{\omega \in \Omega } p^{\omega } \sum _{t \in {\mathcal {T}}} \left( \sum _{i \in {\mathcal {I}}}c_{it}{x_{it}^{\omega }} + d_{t}{z_{t}^{\omega }} + q_{t}^{\omega }{y_{t}^{\omega }} \right) + \lambda \sum _{\omega \in \Omega } p^{\omega } \sum _{t \in {\mathcal {T}}\backslash {\left\{ 1 \right\} }} \left( {\eta _t^{\omega }} + \frac{1}{1-\alpha _t}{v_{t}^{\omega }} \right) \le 694.86. \end{aligned}$$
(32)

The optimal objective value for this instance is 614.08. For \(S =\left\{ 1,2,3,4,5,6,7,8\right\} \), the lower bound of the inequality (31) can be improved to 614.08, while the upper bound in the inequality (32) stays the same.

3.3 Stage-t scenario-dominance cuts

Theorem 2

Let \({\xi }^{\omega '}\) be a scenario with \(\omega ' \in S \subseteq \Omega \). For each \(t \in {\mathcal {T}}\), let \((\dot{x_{t}}^{\omega '}, \dot{z_{t}}^{\omega '}, \dot{y_{t}}^{\omega '}, \dot{\eta _{t}}^{\omega '}, \dot{v_{t}}^{\omega '})\) represent a vector of optimal solutions to the stage-t scenario-\(\xi ^{\omega '}\) problem (10) and define the corresponding optimal objective value as:

$$\begin{aligned} Z_t^{\omega '} := p^{\omega '} \sum _{j=1}^{t} \left( \sum _{i \in {\mathcal {I}}} c_{ij}{\dot{x}}_{ij}^{\omega '} + d_{j}{\dot{z}}_{j}^{\omega '} + q_{j}^{\omega '}{\dot{y}}_{j}^{\omega '}\right) + \lambda p^{\omega '} \sum _{j=2}^{t} \left( {\dot{\eta }}_{j}^{\omega '} + \frac{1}{1-\alpha _j}{\dot{v}}_{j}^{\omega '}\right) . \end{aligned}$$

Then, the portion of the optimal solution up to stage t to the CVaR-SMKP (28a) corresponding to scenario \({\xi }^{\omega }\) such that \({\xi _t}^{\omega '} \preceq {\xi _t}^{\omega }\) for \(t \in {\hat{{\mathcal {T}}}} \subseteq {\mathcal {T}}\), \(({x_{t}^{\omega }}^*, {z_{t}^{\omega }}^*, {y_{t}^{\omega }}^*, {\eta _{t}^{\omega }}^*, {{v}_{t}^{\omega }}^*)\), is satisfied by the following inequality:

$$\begin{aligned}&p^{\omega } \sum _{j=1}^{t} \left( \sum _{i \in {\mathcal {I}}} c_{ij}{x_{ij}^{\omega }} + d_{j}{z_{j}^{\omega }} + q_{j}^{\omega }{y_{j}^{\omega }} \right) +\lambda p^{\omega } \sum _{j=2}^{t} \left( {\eta _j^{\omega }} + \frac{1}{1-\alpha _j}{{v}_j^{\omega }}\right) \nonumber \\&\quad \ge Z_t^{\omega '} \ \ \ \forall \omega \in {\Theta }^-_{t, {\xi }^{\omega '}},\ t \in {\hat{{\mathcal {T}}}}. \end{aligned}$$
(33)

Proof

The result follows immediately by applying the inequality (27) in Corollary 1.1 for the mean-CVaR stochastic multi-stage knapsack problem (28a). \(\square \)

Corollary 2.1

The inequality (33) can be strengthened as in the following inequality:

$$\begin{aligned}&p^{\omega } \sum _{j=1}^{t} \left( \sum _{i \in {\mathcal {I}}} c_{ij}{x_{ij}^{\omega }} + d_{j}{z_{j}^{\omega }} + q_{j}^{\omega '}{y_{j}^{\omega }} \right) +\lambda p^{\omega } \sum _{j=2}^{t} \left( {\eta _j^{\omega }} + \frac{1}{1-\alpha _j}{{v}_j^{\omega }}\right) \nonumber \\&\quad \ge Z_t^{\omega '} \ \ \ \forall \omega \in {\Theta }^-_{t, {\xi }^{\omega '}},\ t \in {\hat{{\mathcal {T}}}}. \end{aligned}$$
(34)

Proof

Because the solution \(({x^{\omega }}^*, {z^{\omega }}^*, {y^{\omega }}^*, {\eta ^{\omega }}^*, {{v}^{\omega }}^*)\) is a part of a feasible solution to the full problem defined in (28a), \(\forall \) \(\omega , \omega ' \in \Omega \) such that \({\xi _t}^{\omega '} \preceq {\xi _t}^{\omega }\) we have

$$\begin{aligned} \bar{{\mathcal {C}}}&:= p^{\omega '} \sum _{j=1}^{t} \left( \sum _{i \in {\mathcal {I}}} c_{ij}{x_{ij}^{\omega }}^* + d_{j}{z_{j}^{\omega }}^* + q_{j}^{\omega '}{y_{j}^{\omega }}^* \right) + \lambda p^{\omega '} \sum _{j=2}^{t} \left( {\eta _j^{\omega }}^* + \frac{1}{1-\alpha _j}{{v}_j^{\omega }}^*\right) \ge {\mathcal {B}}. \nonumber \\&:= p^{\omega '} \sum _{j=1}^{t} \left( \sum _{i \in {\mathcal {I}}} c_{ij}{\dot{x}}_{ij}^{\omega '} + d_{j}{\dot{z}}_{j}^{\omega '}+ q_{j}^{\omega '}{\dot{y}}_{j}^{\omega '}\right) +\lambda p^{\omega '} \sum _{j=2}^{t} \left( {\dot{\eta _j}}^{\omega '} + \frac{1}{1-\alpha _j}{\dot{v}}_{j}^{\omega '}\right) . \end{aligned}$$
(35)

But \(q_{t}^{\omega }\ge q_{t}^{\omega '}\) \(\ \forall \) \(t \in {\hat{{\mathcal {T}}}}\) implies that

$$\begin{aligned} {\mathcal {A}}&:= p^{\omega } \sum _{j=1}^{t} \left( \sum _{i \in {\mathcal {I}}} c_{ij}{x_{ij}^{\omega }}^* + d_{j}{z_{j}^{\omega }}^* + q_{j}^{\omega }{y_{j}^{\omega }}^* \right) + \lambda p^{\omega } \sum _{j=2}^{t} \left( {\eta _j^{\omega }}^* + \frac{1}{1-\alpha _j}{{v}_j^{\omega }}^*\right) \nonumber \\ \ge {\mathcal {C}}&:= p^{\omega } \sum _{j=1}^{t} \left( \sum _{i \in {\mathcal {I}}} c_{ij}{x_{ij}^{\omega }}^* + d_{j}{z_{j}^{\omega }}^* + q_{j}^{\omega '}{y_{j}^{\omega }}^* \right) + \lambda p^{\omega } \sum _{j=2}^{t} \left( {\eta _j^{\omega }}^* + \frac{1}{1-\alpha _j}{{v}_j^{\omega }}^*\right) . \end{aligned}$$
(36)

Because we have \(p^{\omega } \ge p^{\omega '}\), we have \({\mathcal {C}} \ge \bar{{\mathcal {C}}}\), and thus the following inequalities hold:

$$\begin{aligned} {\mathcal {A}} \ge {\mathcal {C}} \ge \bar{{\mathcal {C}}}\ge {\mathcal {B}}. \end{aligned}$$
(37)

Because the inequality (33) implies that \({\mathcal {A}} \ge {\mathcal {B}}\), by the results above in (37) the inequality (34), which implies \({\mathcal {C}} \ge {\mathcal {B}}\), strengthens the inequality (33). \(\square \)

Example 3 (cont). Again, consider the CVaR-SMKP instance defined in Example 3. The index set of scenarios that dominate \({\xi }^{3}\) is defined as \({\Theta }^-_{2, {\xi }^{3}} = \left\{ 1,2,3,4 \right\} \), \({\Theta }^-_{3, {\xi }^{3}} = \left\{ 3,4 \right\} \), \({\Theta }^-_{4, {\xi }^{3}} = \left\{ 3 \right\} \). Solving the stage-t scenario-\({\xi }^{3}\) problem for \(t=2,3,4\), we have \(Z_2^3 = 31.62\), \(Z_3^3 = 59.89\), and \(Z_4^3 = 72.39\).

For \(\omega =4 \in {\Theta }^-_{t,{\xi }^{3}}\), the inequalities corresponding to (34) for \(t=2,3\) can be, respectively, written as:

$$\begin{aligned} p^4 \left( \sum _{i \in {\mathcal {I}}} c_{i2}{x_{i2}^{4}} + d_{2}{z_{2}^{4}} + q_{2}^{3}{y_{2}^{4}} \right) +\lambda p^{4} \left( {\eta _2^{4}} + \frac{1}{1-\alpha _2}{v_2^{4}} \right)&\ge 31.62. \\ p^4 \sum _{t \in \left\{ 2,3\right\} } \left( \sum _{i \in {\mathcal {I}}} c_{it}{x_{it}^{4}} + d_{t}{z_{t}^{4}} + q_{t}^{3}{y_{t}^{4}} \right) +\lambda p^{4} \sum _{t \in \left\{ 2,3\right\} } \left( {\eta _t^{4}} + \frac{1}{1-\alpha _t}{v_t^{4}}\right)&\ge 59.89. \end{aligned}$$

Because \({\Theta }^-_{4, {\xi }^{3}} = \left\{ 3 \right\} \), in period \(t=4\), we have only one inequality (34) as given below:

$$\begin{aligned} p^3 \sum _{t \in \left\{ 2,3,4\right\} } \left( \sum _{i \in {\mathcal {I}}} c_{it}{x_{it}^{3}} + d_{t}{z_{t}^{3}} + q_{t}^{3}{y_{t}^{3}} \right) +\lambda p^{3} \sum _{t \in \left\{ 2,3,4\right\} } \left( {\eta _t^{3}} + \frac{1}{1-\alpha _t}{v_t^{3}}\right) \ge 72.39. \end{aligned}$$

The optimal objective function value to CVaR-SMKP (28a) is 614.08, and its Linear Programming (LP) relaxation is 507.60. By adding inequalities (29), (30), and (34) for \(S =\left\{ 1,2,3,4\right\} \), the optimal relaxation solution is cut-off, and the LP relaxation objective of (28a) increases to 560.61, closing the optimality gap by 50%.

3.4 Strong scenario-dominance cuts

Proposition 12

Let \({\xi }^{\omega '}\) and \({\xi }^{\omega ''}\) be two scenarios that share a common history up to time period \(1,\ldots ,\tau \in {\mathcal {T}}\). For each \(t \in {\mathcal {T}}\), where \(t\ge \tau \), let \((\dot{x_{t}}^{\omega ''}, \dot{z_{t}}^{\omega ''}, \dot{y_{t}}^{\omega ''}, \dot{\eta _{t}}^{\omega ''}, \dot{v_{t}}^{\omega ''})\) represent a vector of optimal solutions to the stage-t scenario-\({\xi }^{\omega ''}\) problem and define the objective value corresponding to the stage-t scenario-\(\xi ^{\omega '}\) problem at the solution \((\dot{x_{t}}^{\omega ''}, \dot{z_{t}}^{\omega ''}, \dot{y_{t}}^{\omega ''}, \dot{\eta _{t}}^{\omega ''}, \dot{v_{t}}^{\omega ''})\) as:

$$\begin{aligned} \hat{Z}_t^{\omega '} := p^{\omega '} \sum _{j=1}^{t} \left( \sum _{i \in {\mathcal {I}}} c_{ij}{\dot{x}}_{ij}^{\omega ''} + d_{j}{\dot{z}}_{j}^{\omega ''} + q_{j}^{\omega '}{\dot{y}}_{j}^{\omega ''}\right) + \lambda p^{\omega '} \sum _{j=2}^{t} \left( {\dot{\eta }}_{j}^{\omega ''} + \frac{1}{1-\alpha _j}{\dot{v}}_{j}^{\omega ''}\right) . \end{aligned}$$

Then, there exists a feasible solution to the CVaR-SMKP (28a) corresponding to scenario-\({\xi }^{\omega }\) problem such that \({\xi _t}^{\omega '} \preceq {\xi _t}^{\omega }\) for \(t \in {\hat{{\mathcal {T}}}} \subseteq {\mathcal {T}}\), \(({\bar{x_{t}}^{\omega }}, {\bar{z_{t}}^{\omega }}, {\bar{y_{t}}^{\omega }}, {\bar{\eta _{t}}^{\omega }}, {\bar{v_{t}}^{\omega }})\), that is satisfied by the following inequality:

$$\begin{aligned}&p^{\omega } \sum _{j=1}^{t} \left( \sum _{i \in {\mathcal {I}}} c_{ij}{x_{ij}^{\omega }} + d_{j}{z_{j}^{\omega }} + q_{j}^{\omega }{y_{j}^{\omega }} \right) +\lambda p^{\omega } \sum _{j=2}^{t} \left( {\eta _j^{\omega }} + \frac{1}{1-\alpha _t}{v_j^{\omega }}\right) \nonumber \\&\quad \ge \hat{Z}_t^{\omega '} \ \ \ \forall \omega \in {\Theta }^-_{t, {\xi }^{\omega '}},\ t \in {\hat{{\mathcal {T}}}}. \end{aligned}$$
(38)

Remark 6

It is easy to see that there exists a feasible solution to the CVaR-SMKP (28a) that is satisfied by the inequalities (38) because those inequalities provide a lower bound on the x-, z-, y-, \(\eta \)-, and v-variables. Using the inequalities (38), we often get the optimal solutions, but sometimes they may cut-off the optimal solution. This is because \(\hat{Z}_t^{\omega '}\) is not optimal but a close-to-optimal objective value to the stage-t scenario-\(\xi ^{\omega '}\) problem. Therefore, one needs to carefully implement the strong scenario dominance cuts. In Sect. 4, we investigate the trade-off between the quality of the best solution obtained by applying those inequalities versus the computational speed-up obtained with them. We, specifically, present the objective value gap defined by the percentage deviation between the best objective value found by CPLEX for the original problem and the best objective value obtained after applying the cuts (38) to analyze the impact of adding these aggressive inequalities that may cut-off the optimal solution. On the other hand, the stage-t scenario dominance inequalities (34) do not cut-off the optimal solution and reduce the optimality gap considerably, as shown in Section 4. However, the inequalities (38) may be preferable to the inequalities (34) if the user prefers a computational speed-up.

Example 3 (cont). Again, consider the CVaR-SMKP instance defined in Example 3. Consider two similar scenarios \({\xi }^{\omega '}\) and \({\xi }^{\omega ''}\), where \(\omega '=2\) and \(\omega ''=1\). Solving the stage-t scenario-\({\xi }^{1}\) problem for \(t=2\) and \(t=3\), and using the solution in the objective function of the scenario-\({\xi }^{2}\) problem, we have \(\hat{Z}_2^2 = 31.69\) and \(\hat{Z}_3^2 = 57.44\), respectively.

For \(\omega =1 \in {\Theta }^-_{t,{\xi }^{2}}\), the inequalities corresponding to (38) for \(t=2,3\) can be, respectively, written as:

$$\begin{aligned} p^1 \left( \sum _{i \in {\mathcal {I}}} c_{i2}{x_{i2}^{1}} + d_{2}{z_{2}^{1}} + q_{2}^{1}{y_{2}^{1}} \right) +\lambda p^{1} \left( {\eta _2^{1}} + \frac{1}{1-\alpha _2}{{v}_2^{1}} \right)&\ge 31.69, \\ p^1 \sum _{t \in \left\{ 2,3\right\} } \left( \sum _{i \in {\mathcal {I}}} c_{it}{x_{it}^{1}} + d_{t}{z_{t}^{1}} + q_{t}^{1}{y_{t}^{1}} \right) +\lambda p^{1} \sum _{t \in \left\{ 2,3\right\} } \left( {\eta _t^{1}} + \frac{1}{1-\alpha _t}{v_t^{1}}\right)&\ge 57.44. \end{aligned}$$

4 Computational experiments

4.1 Implementation specifications

In our computational experiments, we demonstrate the effectiveness of the proposed bounds and cuts driven by “scenario dominance" in solving risk-averse multi-stage stochastic knapsack instances with different problem characteristics. In particular, we solve a variety of CVaR-SMKP instances with the following approaches:

  • cpx: direct solution of the model (28a) by CPLEX using its default settings.

  • bc: inequalities (29) and (30).

  • scut: inequalities (34).

  • sdcut: the scut + inequalities (38).

The implementation of the cuts defined above involves appending them to the CVaR-SMKP formulations as CPLEX user cuts, and then solving the resulting model using CPLEX 12.71 with its default settings. The best feasible solution found within 50 CPU seconds (s) is fed into the optimizer before we run the model with the cuts in order to benefit from useful CPLEX cuts. The 50 s spent a prior to adding sdcut is also included in the cut generation time. We used a formula of \(\frac{{2^{T-2}}}{T-1}\) to determine the number of scenarios to be used for cut generation for solving the CVaR-SMKP. To drive this formula, we performed a preliminary experimental analysis on the fraction of the scenarios that are used for cut generation. After using various fractions, we observe that selecting the half of the scenarios divided by \(T-1\) strikes a good balance in terms of the cut generation time (which includes the solution of all selected scenario sub-problems) and the time benefit obtained from implementing the cuts. However, the number of scenarios to be used for cut generation could be set any number from 1 to \(\bar{\Omega }\). We generated both scut and sdcut for periods \(t=\left\{ T/2,\ldots ,T\right\} \in {\hat{{\mathcal {T}}}}\). The effectiveness of the proposed cuts is evaluated in terms of reducing the solution time, the integrality gap, the number of nodes solved in the branch and bound tree as well as the number of instances solved without memory problems relative to cpx, which is the direct solution of the model by CPLEX.

All computational testing was conducted on a desktop computer running with Intel i7 CPU and 64.0 GB of memory and using CPLEX 12.7.1. A time limit of 7200 CPU seconds was imposed for all test instances. We describe our test instance generation scheme for CVaR-SMKPs in Sect. 4.2, and present the detailed results in Sect. 4.3.

4.2 Instance generation and test problems

For each problem class, we randomly generate multiple instances using different seeds from a uniform distribution. All instances have the parameter T that defines the number of stages in the stochastic scenario tree with \(l=2\) branches emanating from each node. Thus the resulting scenario tree has a size of \(2^{T-1}\) scenarios. All CVaR-SMKP instances generated can be found at the Systems Optimization and Data Analytics Laboratory Mean-CVaR Stochastic Multi-Dimensional Mixed 0-1 Knapsack Test Problem Library (SODAL CVaRSMKP-LIB) (Büyüktahtakın 2021).

Knapsack instances We generate test instances for CVaR-SMKP in a similar fashion to the two-stage binary knapsack instances of Angulo et al. (2016). Different than the study of Angulo et al. (2016), which solved a two-stage risk-neutral version of the multi-item knapsack problem with a maximum of 120 scenarios and 55 constraints, in our study, we consider a larger, risk-averse version of the problem with up to \(T=10\) stages, 512 scenarios, and 67,072 constraints. Also, rather than using only binary variables as in the formulation of Angulo et al. (2016), we consider a mix of binary and continuous variables, further complicating the multi-item knapsack problem considered.

We assume that all parameter values at period \(t=1\) are set at zero since uncertainty starts at stage two. The parameters \(c_{it}\), \({\kappa }_{it}\), \(d_{t}\), \(a_{it}\), and \(w_{t}\) are independent and identically distributed (i.i.d.) sampled from the uniform distribution over \(\left\{ 1,\ldots ,100\right\} \), e.g., \(U\left[ 1,R\right] \) where \(R=100\) and \({\kappa }_{it} \in U[1, R]\) for \(t=2,\ldots ,T\). We have considered uncertainty in the objective function parameter \(q_{t}^{\omega }\), the left-hand side parameter \(r_{t}^{\omega }\), and the right-hand side parameter \(b_{t}^{\omega }\). For the 2-branch problem, all uncertain parameters have two levels: low (L) and high (H), where \(\left( q_{t}^{\omega }, r_{t}^{\omega }\right) \in U[1, R/2]\) for the low level and \(\left( q_{t}^{\omega }, r_{t}^{\omega }\right) \in U[R/2+1, R]\) for the high level and \(b_{t}^{\omega } = \frac{3}{4} \left( \sum _{j=1}^t \sum _{i \in {\mathcal {I}}} a_{ij}+r_{t}^{\omega } \right) \) for \(t=2,\ldots ,T\). A scenario \({{\xi }}^{\omega }\) stage-t dominates a scenario \({{\xi }}^{\omega '}\) with respect to the problem (28a) if \(\left( p^{\omega }\ge p^{\omega '}\right) \), \(\left( {q_j^{\omega }} \ge {q_j^{\omega '}}\right) \), \(\left( {r_j^{\omega }} \le {r_j^{\omega '}}\right) \), and \(\left( {b_j^{\omega }} \ge {b_j^{\omega '}}\right) \) hold for each \(j=2,\ldots , t\) with \(t \in {\mathcal {T}}\backslash {\left\{ 1 \right\} }\).

We set \(h_t= \frac{3}{4} \left( \sum _{i \in {\mathcal {I}}} {\kappa }_{it}+w_{t}\right) \). By employing this scheme, we randomly generate instances for various number of stages (T) and the number of items (I). Ten random feasible instances for each stage and item combination \((T,I) \in \{(5,120),(6,50),(7,40),(8,20),(9,15),\) \((10,10)\}\) are generated, resulting in a total number of 60 instances. As we focus on large scale problems of the form (5a), we assume \(p^{\omega }=1/\bar{\Omega }\), where \(\bar{\Omega }\) is the maximum number of scenarios, for the rest of the paper.

Those instances generated represent large-scale and hard CVaR-SMKP formulations with both the number of binary and continuous variables varying from 9656 to 71,140, and the number of constraints varying from 6096 to 67,072. The size and the difficulty of those instances increase with T and I.

4.3 Results for risk-averse multi-dimensional knapsack instances

In Tables 2, 3, 4, 5 and 6, we report results for a combination of columns defined as follows.

  • T: number of stages (periods).

  • I: number of items.

  • exp: solution approach.

  • time: CPU seconds required to solve the problem, excluding inequality generation time.

  • Ctime: CPU seconds required to generate the cuts, including the solution of the single scenario sub-problems.

  • Ttime: total CPU seconds required to solve the problem, including inequality generation time.

  • Tfac: time factor improvement by sdcut over cpx (Tfac= Time\(^{cpx}\)/Time\(^{sdcut}\)), where Time\(^{cpx}\) is the solution time (time) by cpx and Time\(^{sdcut}\) is the total solution time (Ttime) by bc, sdc, or sdcut.

  • cut: number of cuts added to CPLEX as user inequalities.

  • node (100K): number of nodes explored in the branch and bound tree in 100,000s.

  • gap\(^1\): final optimality gap by cpx at the end of the time limit.

  • gap\(^2\): percentage deviation between the best objective value found by cpx (obj\(^{cpx}\)) and the best objective value obtained by the considered cutting-plane method at the end of the time limit, e.g., gap\(^2\) = 100 \(\times \) (obj\(^{sdcut} \)/obj\(^{cpx}\)-1), where obj\(^{sdcut}\) defines the best objective value obtained by sdcut.

  • IGap: percentage integrality gap of the formulation before inequalities are added (InitGap = 100 \(\times \) (bestobj − relaxlb)/bestobj), where relaxlb and bestobj are objective function values of the initial LP relaxation and the best feasible solution by cpx, respectively.

  • RGap: percentage integrality gap of the formulation after inequalities are added (RootGap = 100 \(\times \) (bestobj − rootlb)/bestobj), where rootlb is the objective function value of the initial LP relaxation after scut and bc are added.

  • GapImp: percentage improvement in the integrality gap at the root node (GapImp = 100 \(\times \) (1-RGap/IGap).

  • unsol: the number of instances unsolved to optimality within the given time limit.

In all tables, the values under the time-related metrics (time, Ctime, Ttime, and Tfac) are rounded to the closest integer to improve the clarity of the presentation.

4.4 Results on bounds and Sdc

Table 2 presents the initial gap (IGap), cut, CTime, the root gap (RGap), and the optimality gap improvement (GapImp) over cpx using the bc and scut together (bc+scut), where we solve both \(2^{T-2}\) and \(2^{T-1}\) number of scenario sub-problems for only one time period \(t=T\). Here, we did not include the sdcut because the strong dominance cuts (38) are rather aggressive and may potentially cut-off the optimal solutions. The IGap, RGap, and GapImp columns are presented in Table 2. Each row of Table 2 presents results for an average of ten instances, while the Average row represents results for an average of 60 instances.

In Table 2, the percent initial gap shows an increasing trend as T increases, reaching \(5.65 \%\) for instances with \(T=10\) and \(I=10\). From the table, the gap improvement due to bc+scut also shows an increasing trend as T increases. The average integrality gap of \(2.37 \%\) is reduced to \(1.25 \%\) and \(0.10 \%\) by the implementation of bc+scut when \(2^{T-2}\) and \(2^{T-1}\) scenario sub-problems are solved, respectively. When cut generation is varied from solving \(2^{T-2}\) to solving \(2^{T-1}\) scenario sub-problems, the number of cuts generated as well as the cut generation time nearly double, as shown in Table 2. On the other hand, the average gap improvement due to scut and bc is 46.4%, when \(2^{T-2}\) scenarios are solved, while it increases to 94.3% when the number of scenario sub-problems solved increases to \(2^{T-1}\). In other words, the gap improvement due to the bc + scut nearly doubles when the number of scenario sub-problems solved is doubled.

Table 2 Optimality gap results due to bc and scut for CVaR-SMKP instances over 5–10 stages

Table 5 in Appendix B presents results that compare the performance of cpx, bc, scut, and sdcut instance-by-instance for the CVaR-SMKP test set with \(T=6\), \(I=50\), where \(\lambda =1\) and \(\alpha _t=0.95\). Each row of Table 5 represents an instance as specified under the Instance column, except the Average row block, which gives the average results over ten instances with \(T=6\) and \(I=50\). Here, bc and scut are implemented by solving \(2^{T-1}\) number of scenario sub-problems while sdcut is generated using \(2^{T-2}\) scenario sub-problems.

In terms of the solution time performance, sdcut performs the best by reducing the cpx solution time from two hours to only 68 CPU seconds. The bc and scut do not help much reducing the solution time, except the instance number 4, where two hours of cpx solution time is reduced to 180 and 203 CPU seconds, respectively. The final MIP optimality gap of 0.08% by cpx is improved to 0.05 and 0.04% by bc and scut, respectively. The main benefit of bc and scut is observed in reducing the integrality gap by 93.2% on average, as shown under the GapImp column. Since the bc and scut would provide the highest benefit to close the gap from the linear programming relaxation of the problem, we recommend using both bc and scut to close the integrality gap in the state-of-the-art solvers, such as CPLEX, rather than adding those inequalities as cuts and solving the augmented problem for computational speed-up.

4.5 Results on Sdcut

Table 3 summarizes results using sdcut for instances with a variety of stages \((T=5-10)\) and items \((I=10-120)\) where \(\lambda =1\) and \(\alpha _t=0.95\). Each row of Table 3 presents results for an average of ten instances, while Average (Total) row reports the overall average (total) over sixty instances.

Here, we do not include the bc because the initial experiments with the lower and upper bounding inequalities (29) and (30), shown in Section 4.5, do not provide any solution speed-up over the sdcut, despite they close the initial integrality gap substantially. The difficulty of the instances presented in Table 3 is a factor of the time stage T as well as the number of items I. In all implementations, sdcut significantly reduces the solution time by cpx. For example, the solution time of 7287 is reduced to 68 CPU seconds for instances with \(T=6\) and \(I=50\), and for the \(T=10\) and \(I=10\) instances, the solution time of 5780 is reduced to 227 CPU seconds due to the sdcut. In particular, the average solution time is reduced from 6, 559 to 111 seconds, where 98 seconds are spent for cut generation, including the solution of the scenario sub-problems. When the total computation time is compared, a total of 39, 354 solution time of the cpx is reduced to 669 CPU seconds by the sdcut to solve all instances. While sdcut improves the solution time of the \(T=5\) and \(I=120\) instances with a factor of 110, it improves the solution speed of the \(T=10\) and \(I=10\) instances with a factor of 26.

Table 3 Experiments for CVaR-SMKP instances with \(\lambda =1\) and \(\alpha _t=0.95\) over 5–10 stages

As shown in the overall averages in Table 3, sdcut improve results by a factor of 72, when all of the time factor improvement due to sdcut is averaged. The solution time reduction due to the sdcut is also consistent with the reduction in the number of nodes searched in the branch and bound tree. As the time period is increased from \(T=5\) to \(T=10\), the number of scenarios increases from 16 to 512. Because \(\frac{{2^{T-2}}}{T-1}*\frac{T-1}{2}\) scenarios are solved as single scenario sub-problems, the time required for the cut generation (Ctime) shows an increasing trend as T increases. The average gap that defines the percentage deviation between obj obtained by cpx and obj by sdcut is \(0.03\%\). This means that when all instances from \(T=5\) to \(T=10\) are averaged, the objective value provided by sdcut is \(0.03\%\) higher than the objective value provided by cpx. Out of 60 instances, cpx could not solve 53 of them to optimality within the 2-h of time limit, as shown under the unsol of Table 3, while the implementation with sdcut solves all of the 60 instances with only \(0.03\%\) deviation from cpx, within the 2-hour of time limit.

The Total row shows that a total time of 39,354 CPU seconds for solving 60 problems is reduced to only 669 CPU seconds (less than 11 minutes) by using sdcut, resulting in a time factor improvement of 432 for solving all instances presented in this table. Varying risk parameters does not change the performance of sdcut much (see detailed results for varying risk parameters in Table 6 and the related discussion under Appendix C).

4.6 Generalization to multiple random variables

We perform additional experiments to demonstrate the number of scenario-dominance relations and its impact on the performance of the sdcut as we vary the number of random variables. For those experiments, we have generated additional instances where we re-defined the parameter \(c_{it}\) as an uncertain parameter \(c_{it}^{\omega }\), which represents the cost coefficient associated with variable \(x_{it}^{\omega }\) for item i in period t under scenario \(\omega \), and assumed that all other parameters are deterministic. The random parameters \(c_{it}^{\omega }\) are stage-wise independent but are jointly distributed over i for each \(t \in {\mathcal {T}}\). The \(c_{it}^{\omega }\) could be positively or negatively correlated for each item i. A scenario \({{\xi }}^{\omega }\) stage-t dominates a scenario \({{\xi }}^{\omega '}\) with respect to the problem (28a) if \(\left( p^{\omega }\ge p^{\omega '}\right) \) and \(\left( c_{it}^{\omega } \ge c_{it}^{\omega '}\right) \) hold for each \(i \in {\mathcal {I}}\) and \(j=2,\ldots , t\) with \(t \in {\mathcal {T}}\backslash {\left\{ 1 \right\} }\). Since the case where each univariate component of the random vector is independently distributed could explode the size of the scenario tree, here we generated random variables as stage-wise independent but jointly distributed at each stage. However, results provide insights into the former case as well.

In Table 4, we compare the performance of sdcut with respect to cpx for a varying number of random variables in the scenario tree for instances with \(T=10\) and \(I=10\). In this table, \(\hat{I}\) represents the number of randomly generated items in each stage t, RV shows the total number of random variables generated, Sd (\(t=2\)) and Sd (\(t=T\)) represent the number of scenario-dominance relations at stages two and T, and Sd (total) presents the total number of scenario-dominance relations over all time periods for each RV. For example, instances with \(\hat{I}=0\) and RV\(=0\) are deterministic, while \(\hat{I}=3\) represents instances with three items randomly generated in each stage and thus that have a total of \(\hat{I}(T-1)=27\) random variables since the values of the random parameters at stage 1 are assumed to be known.

Each row of Table 4 presents results for an average of ten instances. Since random variables are stage-wise independent but are dependent in each stage, the number of scenarios for each instance with a varying number of random variables is 512.

According to the average solution times in Table 4, sdcut consistently improves the solution times of cpx for instances with an increasing number of random variables by an overall average factor of 25, reducing the average solution time of 5,716 seconds to 231 seconds. As the number of random variables increases, the number of second-stage and last-stage scenario-dominance relations reduces. Note that the second-stage dominance relations are much larger than the last-stage dominance relations. When we reach 81 random variables, we also hit the minimum number scenario relations at stages two and T, and in total, as shown by Proposition 3. For the considered instances, having the minimum number scenario relations does not impact the performance of sdcut with respect to cpx. For example, when we have \(\hat{I}=10\) and RV\(=90\), we have the least total number of scenario-dominance relations (\(W=261,632\)), and 1705 cuts are generated under sdcut. While mipgap tends to increase as the number of random variables increase, the performance of sdcut is not impacted by an increased number of random variables. The results imply that when we increase the number of random variables in each stage, the algorithm still generates plenty of cutting planes due to an ample number of stage-t dominance relations.

Table 4 Results for instances with \(T=10\), \(I=10\), and a varying number of random variables RV

5 Concluding remarks

In this paper, we introduced a new concept called “scenario dominance” and “stage-t scenario dominance” to effectively solve the general risk-averse multi-stage stochastic mixed-integer programs. This method derives the implications of one scenario on another by comparing the problem coefficients under each scenario and the optimal objective function values of the individual scenario sub-problems. We derive results on the minimum number of stage-t scenario-dominance relations in a multi-stage scenario tree. By studying those implications and the partial ordering of scenarios, we derive new bounds and cutting planes to tackle risk-averse SMIPs. We perform extensive computational experiments using the proposed cuts on the mean-CVaR stochastic dynamic knapsack problem. Our computations demonstrate that the proposed methodology reduces the solution time of the large-scale, risk-averse multi-stage stochastic mixed-integer optimization problems with varying risk characteristics by one-to-two orders of magnitude. Results also demonstrate that strong dominance cuts perform well for those instances with ten random variables in each stage and ninety random variables in total. They also have the potential to tackle a larger number of random variables because one can generate plenty of stage-t scenario-dominance relations even under the worst-case scenario.

This study opens several exciting research directions worth investigating in the future. For example, “scenario dominance” approach could be investigated for other risk measures, such as the Expected Excess of a given target and the Excess Probability (Schultz and Tiedemann 2003). In this paper, we only used a small subset of the full scenario set to generate the scenario dominance cuts. A future experimental study could analyze the impact of using different fractions and subsets of scenarios and various scenario selection schemes on the quality of the bounds as well as the cuts proposed. Decomposition methods involving Lagrangian relaxation (CarøE and Schultz 1999; Nowak and Römisch 2000), column generation (Singh et al. 2009; Sen 2005), L-shaped decomposition (Angulo et al. 2016; Laporte and Louveaux 1993), progressive hedging (Watson and Woodruff 2011; Gade et al. 2016), scenario-decomposition (Birge 1985; Sandıkçı and Özaltın 2017), and stochastic dual dynamic integer programming (Zou et al. 2019) have been quite useful in solving various classes of multi-stage stochastic integer programs. Thus, another important future research direction is to integrate the proposed “scenario dominance" cut-and-bound generation scheme within a decomposition framework for solving mean-risk M-SMIPs.

In this paper, we focus on the type of stochastic multi-dimensional knapsack instances up to \(T=10\) stages, 10 random variables in each stage, and \(I=120\) items. The considered CVaR-SMKP deterministic counterparts, which involve the expected values of the uncertain parameters, are also computationally challenging. Future research could investigate a variant of CVaR-SMKP instances, which have easier deterministic formulations but a larger number of scenarios than considered in this work, and thus remain computationally hard to solve. For example, a future extension of this work could consider \(T=5\)-stage \(I=1\)-item CVaR-SMKPs with \(\hat{I}=10\) independent random variables and 10 branches in each stage, with 10,000 scenarios. To tackle an instance like this, a parallel distributed platform would be more suitable for solving the individual scenario problems, reducing the dominance cut generation time, and thus improving the overall solution performance. In such a case, based on Proposition 4, we would have a minimum of over more than a hundred million scenario dominance relations, leading to an enormous number of scenario dominance cuts. Thus another important direction for research is to determine the most useful subset and the number of dominance cuts, which would not excessively increase the size of the problem formulation.

The concept of “scenario dominance” as defined in this paper is fundamentally different than the “stochastic dominance” relation defined in the literature (Müller and Stoyan 2002; Luedtke 2008) because two random variables are compared based on their probability distribution function in stochastic dominance, while in our scenario dominance approach we pairwise compare scenarios based on the realization of uncertain parameters at each time stage. Specifically, scenario dominance derives relations among scenarios by comparing possible outcomes of a single random variable or potential joint outcomes of more than one random variable. Yet, an interesting future direction would be to investigate the theoretical relations between the scenario dominance and the first (second)-order stochastic dominance.

The proposed approach in this paper does not require convexity or linearity, and thus can be extended to study the non-linear case and various types of risk-averse multi-stage stochastic MIPs. The generality of our method would require careful implementation of the proposed cuts for each specific problem, such as risk-averse multi-stage stochastic financial investment planning, capacity allocation, and network optimization problems.

Our approach also has some limitations. For example, problems with several uncorrelated random variables may exhibit less scenario-dominance relations and fewer dominance cuts than considered in this study. Therefore, an important future direction is to investigate further the boundaries of the scenario dominance cuts by exploring the breadth of problems that could benefit from our approach.