1 Introduction

In this paper, we focus on nonconvex mixed-integer quadratically constrained quadratic programming (MIQCQP) models that can be made separable through a Lagrangian relaxation-based approach [16]. Specifically, we focus on problems that possess a block-angular structure allowing for such a separation, such as those presented by deterministic equivalent formulations of two-stage stochastic programming models (DEM) [10]. To introduce the notation used hereinafter, let us define the problem DEM as

$$\begin{aligned} \text {DEM}: \text {max.} \quad&\sum _{j \in VI} I^{0}_j x_j + \sum _{s \in {S}} P^s \left( \sum _{i \in {VC}}\sum _{j \in {VC}} Q^{s, 0}_{i, j} y_{i}^{s} y_{j}^{s} + \sum _{i \in {VC}} C^{s,0}_{i}y_{i}^{s}\right) \nonumber \\ \text {s.t.:} \quad&\sum _{i \in {VC}}\sum _{j \in {VC}} Q^{s, r}_{i, j} y_{i}^{s} y_{j}^{s} + \sum _{i \in {VC}} C_{i}^{s,r} y_{i}^{s} + \sum _{j \in {VI}} I_{j}^{s,r} x_{j} + K^{s,r} \le 0, \end{aligned}$$
(1)
$$\begin{aligned}&\quad \forall s \in S, \forall r \in R\end{aligned}$$
(2)
$$\begin{aligned}&y_{i}^{s} \in [Y^{L,s}_{i}, Y^{U,s}_{i}], \forall s \in S, \forall i \in VC \end{aligned}$$
(3)
$$\begin{aligned}&x_{j} \in \{X^L_{j}, \dots , X^U_{j}\}, \forall j \in VI, \end{aligned}$$
(4)

where S is the set of scenarios, \(P^s\) is the probability of scenario \(s \in S\), R is the set of constraint indices for each scenario, VI (VC) is the set of integer (continuous) variable indices, and \(Q^{s,r} \) are symmetric matrices for all \(s \in S \), \( r \in \{0\} \cup R \). The parameters \(I^{s,r}_j\) \(( C^{s,r}_i)\) correspond to linear coefficients associated with the integer (continuous) variables and \(K^{s,r}\) represent constants. Variables \(x_j\), \( \forall j \in VI\), can assume any integer value between their bounds \(X^L_j\) and \(X^U_j\), and variables \(y_i^{s}\), \(i \in VC\) and \(s \in S \), can assume any continuous value between \(Y^{L,s}_i\) and \(Y^{U,s}_i\). Note that this boundedness assumption, which is indeed satisfied in many applications, will become essential in our developments.
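To make the notation concrete, the following minimal numpy sketch evaluates the DEM objective for a single-scenario toy instance; all data values are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical tiny instance: |VC| = 2 continuous, |VI| = 1 integer variable.
Q = np.array([[1.0, 0.5], [0.5, -2.0]])  # symmetric Q^{s,0} (indefinite, hence nonconvex)
C = np.array([3.0, -1.0])                # linear coefficients C^{s,0}
I0 = np.array([4.0])                     # first-stage linear coefficients I^0
P = 0.5                                  # scenario probability P^s

y = np.array([0.2, 0.4])                 # a second-stage point within its bounds
x = np.array([1])                        # a first-stage integer point

# Scenario s contributes P^s * (y' Q y + C' y); the I^0' x term is first stage.
obj = I0 @ x + P * (y @ Q @ y + C @ y)
```

With more scenarios, the scenario term above would simply be summed over \(s \in S\) with the corresponding probabilities.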

The range of useful applications of MIQCQP models is noticeably broad, comprising sectors such as finance, engineering, and the process industry, and permeating several applications arising in management science and operations research [21]. Recent papers utilising MIQCQP models address problems such as planning optimal battery management for the reduction of power losses [41], an enhanced index-tracking problem formulation [59], the operational planning of refineries [2], profit maximisation in a three-level supply chain [39], storage investment planning [63] and solution approaches for a gas flow problem with general setups [57], to mention only a few of the potential applications that could benefit from the developments presented in this paper.

A MIQCQP model is defined as convex if its continuous relaxation is convex, regardless of the nonconvexity introduced by the integrality requirements. Therefore, the DEM is convex if \(Q^{s,r}\) is positive semidefinite for all \(s \in S\) and \(r \in \{0\} \cup R\), being nonconvex otherwise. The latter (nonconvex) is the case addressed in this paper.

In principle, MIQCQP models can be solved by available open-source and commercial solvers such as recent releases of Gurobi [34], Couenne [26] or Baron [61]. However, these solvers still lack performance robustness when facing larger-scale instances, such as those arising as DEM from two-stage stochastic programs.

Due to the challenging nature of MIQCQP problems, several solution approaches have been developed [14]. These can be generally divided into exact and heuristic methods. While the former can guarantee convergence to the global optimum, the latter can generally only offer local optimality guarantees for the solutions obtained. Furthermore, almost all methods employ relaxation techniques as a subroutine. The approximation of the original MIQCQP problem with mixed-integer linear programs (MILP) [1, 46] is the most typical relaxation strategy employed in this setting. However, this type of relaxation can significantly increase the number of variables and constraints compared to the original (primal) model, thus deteriorating computational performance.

One of the most widely used exact algorithms for solving a MIQCQP problem is the branch-and-bound (BnB) method and its variants. If the problem is convex, BnB is capable of providing globally optimal solutions by relaxing the integrality constraints to obtain bounds. In the case of nonconvex MIQCQP models, a spatial BnB is typically used, employing (convex) relaxations of the nonconvex terms that are the source of the nonlinearity. Pursuing this standpoint, Castro [19] presented a spatial BnB with relaxations relying on the normalised multiparametric disaggregation technique for solving nonconvex MIQCQP problems. Ding et al. [29] presented an integration of the spatial BnB and standard BnB methods, named the bi-level branch-and-bound technique, capable of solving MIQCQP problems with superior solution quality and convergence characteristics. Berthold et al. [6] developed a MIQCQP solver based on the combination of constraint programming and a branch-and-cut algorithm, exploiting cutting planes to tighten the relaxations. Billionnet et al. [9] developed the mixed-integer quadratic convex reformulation (MIQCR) method, a (convex) reformulation approach for nonconvex mixed-integer quadratic programming problems with linear constraints that is also embedded within a BnB framework.

Another conceptually different approach to solve MIQCQP problems is the employment of decomposition, which is one of the main focuses of this paper. The key concept of this method is to split the problem into several smaller subproblems that are more tractable and can be solved independently, possibly in parallel.

By making explicit the non-anticipativity conditions (NAC) of the first-stage variables x in the DEM problem, the reformulated deterministic equivalent model (RDEM) can be represented as

$$\begin{aligned} \text {RDEM}: \text {max. } \quad&\sum _{s \in {S}} P^s \left( \sum _{j \in VI} I^{0}_j x_{j}^{s} + \sum _{i \in {VC}}\sum _{j \in {VC}} Q^{s, 0}_{i, j} y_{i}^{s} y_{j}^{s} + \sum _{i \in {VC}} C^{s,0}_{i} y_{i}^{s}\right) \nonumber \\ \text {s.t.: } \quad&\sum _{i \in {VC}}\sum _{j \in {VC}} Q^{s, r}_ {i, j} y_{i}^{s} y_{j}^{s} + \sum _{i \in {VC}} C^{s,r}_{i} y_{i}^{s} + \sum _{j \in {VI}} I^{s, r}_{j} x_{j}^{s} + K^{s,r} \le 0, \end{aligned}$$
(5)
$$\begin{aligned}&\quad \forall s \in S, \forall r \in R \end{aligned}$$
(6)
$$\begin{aligned}&x_{j}^{s} \in \{X^L_{j}, \dots , X^U_{j}\}, \forall s \in S , \forall j \in VI \end{aligned}$$
(7)
$$\begin{aligned}&x_{j}^{s^\prime } - x_{j}^{s} = 0 , \forall s \in S \setminus \{s^\prime \}, \forall j \in VI \\&\text { and (3),}\nonumber \end{aligned}$$
(8)

where \({s^\prime } \in S\) is a reference scenario.

The introduction of the NAC in the above RDEM problem results in an explicit nearly-decomposable equivalent of the original DEM problem with an exposed block-angular structure [10, 11]. Consequently, one can obtain |S| essentially independent MIQCQP subproblems where (8) is the only set of linear constraints that relates variables from distinct subproblems. These constraints are referred to as linking or complicating constraints. Therefore, if one could remove these constraints, each of the subproblems could be solved independently. Hereinafter, we will use the term subproblems when referring to each element of the block-angular structure of the RDEM problem. It is important to highlight that there are multiple alternative ways in which the NAC could be represented. In what follows, we use the representation presented by Oliveira et al. [48], as the authors report better computational results in a similar context. Nonetheless, the developments presented are not dependent on the specific choice for representing NACs.

In classical linear programming, the three most common decomposition frameworks are Dantzig–Wolfe decomposition (DWD), Benders decomposition (BD) and Lagrangian decomposition (LD), of which the last is explored in this paper. It should be noted that LD is not limited to a decomposition framework: it can also be used to devise easier-to-solve (i.e., typically smaller in scale due to separability) equivalent problems, which is often referred to as Lagrangian relaxation (LR). LD involves creating copies of the complicating variables (i.e., the first-stage variables x) for each scenario and introducing NAC to ensure primal feasibility (the reformulation used to obtain RDEM); LD is then the employment of LR to remove the generated complicating constraints (NAC). There is a well-known connection between these decomposition approaches, in that if the LD is solved using the cutting-plane algorithm, it can be viewed as a BD. Furthermore, BD can be stated as the dual of DWD.

The decision on which decomposition method is the most appropriate depends on the problem structure, such as the presence of complicating constraints. Furthermore, BD (and DWD) in its classical form cannot be applied to general nonlinear programming problems. Addressing this shortcoming, Geoffrion [32] proposed the generalised Benders decomposition (GBD), based on BD, to decompose convex nonlinear programming problems. Later, Li, Tomasgard, and Barton [43] extended the GBD through the nonconvex generalised Benders decomposition (NGBD) to decompose nonconvex nonlinear programming problems. In contrast, the advantage of LD is that it can be directly applied to a nonconvex problem. However, it should be noted that the nonconvex subproblems in this context must be solved to global optimality, as will be discussed in further detail in Sect. 2. This can be challenging, since one would still have to solve nonconvex MIQCQP subproblems.

This paper presents a new class of relaxations called p-Lagrangian. The p-Lagrangian relaxation is a composition of the mixed-integer programming (MIP)-based relaxation method named the reformulated normalised multiparametric disaggregation technique (RNMDT) [1] and LD. The idea of employing RNMDT to devise a MIP approximation of the resulting LD problem is inspired by the approach used to extend the GBD to the NGBD. Therefore, the p-Lagrangian relaxation allows one to use the same decomposition strategies as the classic LD, but with subproblems that can be solved to global optimality using more robust MILP technology.

This paper is structured as follows. Section 2 reviews the fundamental concepts of the LD. Section 3 describes the RNMDT relaxation. The p-Lagrangian relaxation of the MIQCQP problem is presented in Sect. 4, followed by technical results that describe its convergence behaviour. An iterative algorithm for solving p-Lagrangian relaxation problems is described in Sect. 5. Furthermore, numerical experiments are presented in Sect. 6, where the contributions of this paper are tested on randomly generated instances. Finally, in Sect. 7 we draw conclusions and provide directions for further research.

2 Background

Lagrangian relaxation is a bounding technique for arbitrary optimisation problems that has important applications to nonconvex problems, such as MILP problems [16, 33]. However, methods based on Lagrangian duality cannot be straightforwardly applied to nonconvex problems, as discussed in what follows.

First, let us formally provide a definition for a relaxation which, although rather restrictive, will suffice for the developments hereinafter. Consider the following problem \(P : \text {max.} \{f(x): x \in C\}\) where \(f: {\mathbb {R}}^n \rightarrow {\mathbb {R}}\) and \(C \subseteq {\mathbb {R}}^n\) and the problem \(P_R : \text {max.} \{f_R(x): x \in C_R\}\) where \(f_R: {\mathbb {R}}^n \rightarrow {\mathbb {R}}\) and \(C_R \subseteq {\mathbb {R}}^n\).

Definition 1

The problem \(P_R\) is a relaxation of P if and only if:

  1. \(f_R(x) \ge f(x) \text {, }\forall x \in C\);

  2. \(C \subseteq C_R \).

Let \(f^s: {\mathbb {R}}^{|VI|+|VC|} \rightarrow {\mathbb {R}}\) and \(g^{s,r}: {\mathbb {R}}^{|VI|+|VC|} \rightarrow {\mathbb {R}}\), \(\forall s \in S\) and \( \forall r \in R\), be continuous twice differentiable functions. Suppose that we are interested in solving the (primal) problem (9), which is introduced to ease the notation in our later developments.

$$\begin{aligned} \begin{aligned} {\overline{f}} = \text {max.}_{x,y}&\sum \limits _{s \in S} f^s(x^s, y^s)\\ \text {s.t.: }&\ g^{s,r}(x^s, y^s) \le 0, \forall s \in S, \forall r \in R\\&\ x^{s^\prime } - x^s = 0, \forall s \in S \setminus \{ s^\prime \} \\&\ x^s \in X, \forall s \in S,\\&\ y^s \in Y^s, \forall s \in S. \end{aligned} \end{aligned}$$
(9)

As can be noticed, (9) is equivalent to the RDEM problem, by making

$$\begin{aligned} f^s(x^s, y^s)= & {} P^s \left( \sum _{j \in VI} I^{0}_j x_{j}^{s} + \sum _{i \in {VC}}\sum _{j \in {VC}} Q^{s, 0}_{i, j} y_{i}^{s} y_{j}^{s} + \sum _{i \in {VC}} C^{s,0}_{i} y_{i}^{s}\right) ,\\ g^{s,r}(x^s, y^s)= & {} \sum _{i \in {VC}}\sum _{j \in {VC}} Q^{s, r}_ {i, j} y_{i}^{s} y_{j}^{s} + \sum _{i \in {VC}} C^{s,r}_{i} y_{i}^{s} + \sum _{j \in {VI}} I^{s, r}_{j} x_{j}^{s} + K^{s,r}, \end{aligned}$$

while X and \(Y^s\) in (9) represent the variable bounds (7) and (3) for the integer and continuous variables, respectively.

Let \(D^s = \{x^s, y^s \mid x^s \in X, \ y^s \in Y^s, \ g^{s,r}(x^s,y^s) \le 0, \ \forall r \in R\}\) be a feasibility set, \(\forall s \in S\), and let \(D = \prod _{s \in S} D^s\), that is, D is the Cartesian product of the sets \(D^s\), \(\forall s \in S\). With this definition in mind, the Lagrangian dual function \(\phi : {\mathbb {R}}^{(|S|-1) \times |VI|} \rightarrow {\mathbb {R}}\) can be defined as

$$\begin{aligned} \phi (\lambda ) = \sup _{ (x,y) \in D}\sum \limits _{s \in S} f^s(x^s, y^s)+\sum \limits _{s\in S \setminus s^\prime } ( \lambda ^{s} )^\top (x^{s^\prime } - x^s), \end{aligned}$$
(10)

where the components of \(\lambda \in {\mathbb {R}}^{(|S|-1)\times |VI|}\) are the Lagrangian multipliers and \(x, y = \{x^s,y^s\}_{s \in S}\). In this setting, it is common to say that the NAC are being relaxed [12]. In what follows, we present important and widely known properties of the Lagrangian dual function \(\phi \), which we formally state as Propositions 2 and 3 for reference later on.
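Since the NAC are the only coupling in (9), evaluating \(\phi (\lambda )\) splits into |S| independent maximisations: each scenario \(s \ne s^\prime \) receives the term \(-(\lambda ^s)^\top x^s\), while the reference scenario collects \(\sum _{s \ne s^\prime } \lambda ^s\). A minimal sketch (hypothetical scalar data, brute force over tiny finite feasible sets) illustrating that the joint and decomposed evaluations coincide:

```python
from itertools import product

# Hypothetical instance: scalar first-stage x in {0, 1}, no y variables,
# reference scenario "sp" (s') plus two other scenarios.
f = {"sp": lambda x: 3 * x,    # f^{s'}
     "s1": lambda x: 2 - x,    # f^{s1}
     "s2": lambda x: x * x}    # f^{s2}
lam = {"s1": 0.5, "s2": -1.0}  # multipliers on the NAC x^{s'} - x^s
X = [0, 1]

# Joint evaluation of (10): maximise over all scenario copies at once.
joint = max(f["sp"](xp) + f["s1"](x1) + f["s2"](x2)
            + lam["s1"] * (xp - x1) + lam["s2"] * (xp - x2)
            for xp, x1, x2 in product(X, X, X))

# Decomposed evaluation: one independent maximisation per scenario.
dec = max(f["sp"](x) + sum(lam.values()) * x for x in X) \
    + max(f["s1"](x) - lam["s1"] * x for x in X) \
    + max(f["s2"](x) - lam["s2"] * x for x in X)
# joint and dec coincide, illustrating the scenario-wise separability of (10).
```

This separability is precisely what makes parallel solution of the scenario subproblems possible once the NAC are dualised.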

Proposition 2

[5, Theorem 6.3.1] The Lagrangian dual function \(\phi \) is convex.

For any value of the Lagrangian multiplier \(\lambda \), the Lagrangian dual function \(\phi \) provides an upper bound (UB) for the primal problem (9). Our objective is to obtain the tightest (i.e., the lowest) UB. Therefore, we are interested in solving the following Lagrangian dual problem (LDP)

$$\begin{aligned} {\overline{\phi }} = \inf _{\lambda } \phi (\lambda ) \end{aligned}$$
(11)

Even though the function \(\phi \) is convex, evaluating \(\phi (\lambda )\) for a given \(\lambda \) may require solving a nonconvex problem if either \(f^s\) or \(g^{s,r}\) are nonconvex for some \(s \in S\) and \(r \in R\). Thus, evaluating \(\phi \) to global optimality may be as hard as solving the primal problem itself. We refer to this later on as Issue 1. Proposition 3 describes how one can use the LDP (11) to obtain bounds for (9), a property generally referred to as weak duality.

Proposition 3

[5, Theorem 6.2.1] Let \(f^s: {\mathbb {R}}^{|VI|+|VC|} \rightarrow {\mathbb {R}}\) and \(g^{s,r}: {\mathbb {R}}^{|VI|+|VC|} \rightarrow {\mathbb {R}}\), \(\forall s \in S\) and \(\forall r \in R\), be continuous twice-differentiable functions and consider the primal problem (9). Let \(\phi : {\mathbb {R}}^{(|S|-1) \times |VI|} \rightarrow {\mathbb {R}}\) be the Lagrangian dual function forming the Lagrangian dual problem (11). Let \({\overline{f}}\) and \({\overline{\phi }}\) be the optimal values of the primal and dual problems, respectively. Then, the following statements are true:

  1. \(\phi (\lambda ) \ge \sum \limits _{s\in S} f^s (x^s, y^s), \ \forall \lambda \in {\mathbb {R}}^{(|S|-1)\times |VI|}, x,y \in D\) such that \(x \in X^{\text {NAC}}\), where \(X^{\text {NAC}} = \{x \mid (x^{s^\prime } - x^s) = 0, \forall s \in S \setminus \{ s^\prime \}\}\);

  2. \({\overline{\phi }} \ge {\overline{f}}\).

It is important to highlight that, for problems including integer decision variables or nonconvex functions, a duality gap might exist (i.e., only weak duality is valid). This will hereinafter be called Issue 2. Nevertheless, the LDP can still be used to obtain bounds for the primal problem (9), as stated in Proposition 3.

Therefore, if the primal problem is nonconvex, there are two main issues to be addressed when using the Lagrangian duality to devise a solution method, i.e., we must solve nonconvex subproblems (Issue 1), and only weak duality holds (Issue 2).

The second issue is traditionally addressed by modifying the Lagrangian function to include penalty terms. Linear penalty terms were introduced by Pietrzykowski [50] and Zangwill [65], and quadratic penalty expressions by Courant [24]. Penalty expressions with both linear and quadratic pieces were proposed by Rockafellar [53]. A generalised approach called the augmented Lagrangian, also known as the method of multipliers, was first studied by Hestenes [37] and Powell [51]. Two disadvantages of the augmented Lagrangian are that it usually ruins the problem's separable structure when it exists, thus compromising decomposition strategies [60], and that it adds nonlinearities to the problem. The progressive hedging (PH) method, proposed by Rockafellar and Wets [54], overcomes the former obstacle. Santos et al. [56] evaluated the application of PH to short-term operational planning problems of hydrothermal systems, comparing it to nested decomposition. Veliz et al. [62] demonstrated that PH is competitive with a direct solution approach when applied to the mixed-integer problem of medium-term forest planning. Lokketangen and Woodruff [44] combined the tabu search method with PH to solve multistage stochastic mixed-integer (0, 1) programming problems. Boland et al. [12] enhanced the PH convergence characteristics by introducing a Frank-Wolfe-based method to compute primal updates in PH, and Boland et al. [13] showed how it could be further improved in a bundle method setting. In a similar vein, although not directly related to PH, de Oliveira et al. [28] proposed an inexact proximal bundle method decomposition for solving two-stage stochastic programming problems, which allows for the consideration of subsets of scenarios, ultimately yielding performance improvements. The p-Lagrangian decomposition method proposed in this paper can potentially be generalised to those settings as well.
However, in this study, we concentrate only on Issue 1 for nonconvex MIQCQP problems. As will become clear, all of these developments could also be incorporated within the framework of the p-Lagrangian relaxation in an attempt to address Issue 2; this is left for future research.

A common approach to solving decomposable QCQP problems is to relax all quadratic constraints so as to obtain a subproblem that is equivalent to a semidefinite programming problem, which can be solved to global optimality using appropriate interior-point methods [3]. On the other hand, the formulation of the LDP is not trivial to find, especially if not all constraints are relaxed. Another issue with this approach is that the quality of the dual bounds depends on the subset of constraints chosen to be relaxed. Dentcheva and Römisch [25] proposed a framework to estimate how large the duality gap will be when a specific subset of constraints is relaxed. The framework was then applied to decide which constraints to relax to generate the smallest duality gap. In this paper, only the complicating constraints (8) are considered for relaxation, as we focus on problems with a block-angular separable structure, such as two-stage stochastic programming problems. It is worth highlighting that the framework presented here is, however, general enough to be applied to any other MIQCQP problem with a similar separable structure.

In what follows, we concentrate on addressing Issue 1 by developing a new class of dual problems that replaces the nonconvex MIQCQP subproblems with MIP-based relaxations obtained using the reformulated normalised multiparametric disaggregation technique (RNMDT). Even though MIP problems might also be computationally challenging, more reliable techniques are available for solving them that are efficient in many cases (e.g., branch-and-cut), with widely available off-the-shelf implementations in commercial solvers such as Gurobi [34] and CPLEX [23].

2.1 Motivating example

The following motivating example illustrates that, even in simple cases, solving the LDP might not provide a valid dual bound if the Lagrangian dual function \(\phi \) cannot be appropriately evaluated. That will be the case if, for a given \(\lambda \), the solution \((x^s, y^s), \forall s \in S\), used to evaluate \(\phi (\lambda )\) is not a global maximiser. Consider the following problem.

$$\begin{aligned} \begin{aligned} \text {max.}_{x_1,x_2}&x_1 x_2 \\ \text {s.t.:}&x_1 + 2x_2 = 1 \\&\ x_1,x_2 \in [0, 1.75]. \end{aligned} \end{aligned}$$
(12)

For a fixed Lagrangian multiplier \(\lambda \in {\mathbb {R}}\), the evaluation of the Lagrangian dual function \(\phi \) corresponding to the problem (12) is

$$\begin{aligned} \begin{aligned} \text {max.}_{x_1, x_2}&x_1 x_2 +\lambda (x_1 + 2x_2 - 1)\\ \text {s.t.:}&x_1,x_2 \in [0,1.75]. \end{aligned} \end{aligned}$$
(13)

Solving (12) with the global solver Gurobi (version 9.0.0), we obtain the optimal value 0.125. Since the primal problem and its LDP have a (weak) dual relationship, the optimal value of Problem (13) should provide an UB for any fixed value of the Lagrangian multiplier \(\lambda _f\), i.e., a value greater than or equal to 0.125. However, if we solve (13) for a fixed value of \(\lambda _f = -0.100\) using the local solver Ipopt [64], it returns a solution in which all variables are zero, corresponding to an objective function value of 0.100. This is not a valid UB, since the feasible solution \((x_1,x_2) = (0.500,0.250)\) has a greater objective value.

We could improve the solution to this problem by using another local solver that utilises alternative methods, by providing a warm start, or even by applying a multi-start strategy. However, if one cannot guarantee that the solutions obtained for (13) are global maxima, one cannot be sure about the validity of the resulting bounds, which, in turn, compromises the validity of solution methods that rely on Lagrangian relaxation. In Sect. 4.2, we revisit this example to demonstrate how to address the issue of generating valid Lagrangian bounds for nonconvex problems using RNMDT, thereby addressing Issue 1.
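The invalid bound above can be reproduced without any solver; a minimal sketch (dense grid evaluation, purely illustrative) approximates the true value of \(\phi (-0.1)\) and contrasts it with the value at the local maximiser returned by Ipopt:

```python
import numpy as np

lam = -0.1  # fixed Lagrangian multiplier

def lagrangian(x1, x2, lam):
    # Objective of problem (13): x1*x2 + lam*(x1 + 2*x2 - 1)
    return x1 * x2 + lam * (x1 + 2 * x2 - 1)

# Dense grid over the box [0, 1.75]^2 approximates the global maximum of (13).
g = np.linspace(0.0, 1.75, 701)
X1, X2 = np.meshgrid(g, g)
phi_global = lagrangian(X1, X2, lam).max()

local_value = lagrangian(0.0, 0.0, lam)  # value at the local maximiser (0, 0)
primal_opt = 0.125                       # optimal value of (12)

# The globally evaluated dual value is a valid upper bound; the local one is not.
print(phi_global >= primal_opt)   # True
print(local_value >= primal_opt)  # False: 0.100 < 0.125
```

The grid attains its maximum at the corner \((1.75, 1.75)\), so the true dual value comfortably exceeds 0.125, while the locally optimal value 0.100 does not.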

3 Reformulated normalised multiparametric disaggregation

The reformulated normalised multiparametric disaggregation technique (RNMDT) [1] is an enhanced version of the normalised multiparametric disaggregation technique (NMDT), originally proposed in [40] and further developed in [17, 18]. RNMDT allows for the development of mixed-integer-based relaxations of MIQCQP models that can be made arbitrarily precise, trading off precision against the number of additional auxiliary (binary and continuous) variables required. The enhancement of the original framework is due to (i) a significant reduction in the number of auxiliary variables and constraints and (ii) a dynamic procedure for selecting variables to have their discretised representation made more precise.

Suppose that we seek to solve the following problem, which is the same as the RDEM problem but without the separable structure (i.e., scenarios), to ease the notation.

$$\begin{aligned} \text {max.} \quad&\sum _{i \in VC} \sum _{j \in VC} Q^0_{i,j} y_i y_j + \sum _{j \in VI} I^{0}_j x_j + \sum _{i \in VC} C^{0}_i y_i&\end{aligned}$$
(14)
$$\begin{aligned} \text {s.t.:} \quad&\sum _{i \in VC} \sum _{j \in VC} Q^r_{i,j} y_i y_j + \sum _{j \in VI} I^{r}_j x_j + \sum _{i \in VC} C^{r}_i y_i + K^{r} \le 0, \forall r \in R \end{aligned}$$
(15)
$$\begin{aligned}&y_i \in [Y^{L}_i, Y^{U}_i], \forall i \in VC \end{aligned}$$
(16)
$$\begin{aligned}&x_j \in \{X^L_j, \dots , X^U_j\}, \forall j \in VI, \end{aligned}$$
(17)

where the parameters \(Q^r, I^{r}, C^{r}, \text { and } K^{r} \) are as in Sect. 1.

Let

$$\begin{aligned} QT = \{ (i,j) \in VC \times VC \mid j \ge i, \exists r \in R \cup \{0\}, |Q^{r}_{i,j} |> 0 \} \end{aligned}$$

and

$$\begin{aligned} DS = \{ j \in VC \mid \exists i \in VC, (i,j) \in QT \}. \end{aligned}$$

The set QT comprises the pairs of indices of variables appearing together in at least one quadratic term, while DS corresponds to the set of indices of variables that will be discretised. In this context, the discretisation of the variables \(y_j\), \(\forall j \in DS\), can be achieved by introducing the following constraints:

$$\begin{aligned}&y_j = (Y^{U}_j - Y^{L}_j)\left( \sum _{l \in Z_{p,-1}} {2^{l} z_{j,l}} + \varDelta y_j\right) + Y^{L}_j, \forall j \in DS \end{aligned}$$
(18)
$$\begin{aligned}&0 \le \varDelta y_j \le 2^{p}, \forall j \in DS \end{aligned}$$
(19)
$$\begin{aligned}&z_{j,l} \in \{0,1\}, \forall j \in DS, l \in Z_{p,-1}, \end{aligned}$$
(20)

where \(Z_{a,b} = \{a, \dots , b\}\) denotes the set of integers ranging from a to b (inclusive), (18) is used to normalise the decision variables \(y_j\), and the term

$$\begin{aligned} \left\{ \sum \limits _{l \in Z_{p,-1}} {2^{l} z_{j,l}} + \varDelta y_j\right\} \in [0,1] \end{aligned}$$

is discretised into partitions of size \(2^p\) each, where the integer parameter \(p < 0\) corresponds to a precision factor. The auxiliary variables \(\varDelta y_j\) are added to allow the term to attain all values in the interval [0, 1]. Notice that, had the discretisation used partitions of size \(10^p\) instead, the absolute value of the precision factor p would be equivalent to the number of decimal digits used in the discretised representation. However, we use base 2 instead, since it allows for more compact relaxations in terms of the number of auxiliary variables, as demonstrated in [1].
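The role of the binary expansion in (18)–(20) can be illustrated with a short sketch; the greedy `discretize` helper below is hypothetical (not part of RNMDT's formulation) but produces exactly the bit values \(z_{j,l}\) and residual \(\varDelta y_j\) that the constraints admit for a given normalised value.

```python
def discretize(v, p):
    """Split v in [0, 1] as sum(2**l * z[l] for l = p..-1) + delta,
    with binary z[l] and 0 <= delta <= 2**p (cf. constraints (18)-(20))."""
    assert 0.0 <= v <= 1.0 and p < 0
    z, rest = {}, v
    for l in range(-1, p - 1, -1):  # l = -1, -2, ..., p
        z[l] = 1 if rest >= 2**l else 0
        rest -= z[l] * 2**l
    return z, rest                  # rest plays the role of delta

z, delta = discretize(0.6, p=-3)
# Reconstruction recovers v, and the residual fits in one partition of size 2**p.
recon = sum(2**l * z[l] for l in z) + delta
```

As p decreases, the bits capture the value to a finer resolution and the continuous residual \(\varDelta y_j\) shrinks accordingly, which is the mechanism behind the arbitrarily precise relaxations.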

For each bilinear term \(y_iy_j\), only one of the variables needs to be discretised, as the resulting product between a continuous and a binary variable can be linearised by a standard equivalent reformulation. Multiplying both sides of (18) by \(y_i\), \(\forall i \in VC\), gives the constraints

$$\begin{aligned} y_i y_j = (Y^{U}_j - Y^{L}_j)\left( \sum _{l \in Z_{p,-1}} {2^{l} y_i z_{j,l}} + y_i \varDelta y_j\right) + y_i Y^{L}_j, \forall (i,j) \in QT. \end{aligned}$$
(21)

Next, we introduce the auxiliary variables \(w_{i,j}\), \({\hat{y}}_{i,j,l}\) and \(\varDelta w_{i,j}\) to represent the products \(y_i y_j\), \(y_i z_{j,l}\) and \(y_i \varDelta y_j\), respectively. The resulting set of constraints obtained is then

$$\begin{aligned} w_{i,j}&= (Y^{U}_j - Y^{L}_j) \left( \sum _{l \in Z_{p,-1}} {2^{l}{\hat{y}}_{i,j,l}} + \varDelta w_{i,j}\right) + y_i Y^{L}_j, \forall i,j \in QT \end{aligned}$$
(22)
$$\begin{aligned} Y^{L}_i z_{j,l}&\le {\hat{y}}_{i,j,l} \le Y^{U}_i z_{j,l}, \forall i,j \in QT, l \in Z_{p,-1} \end{aligned}$$
(23)
$$\begin{aligned} Y^{L}_i (1 - z_{j,l})&\le y_i - {\hat{y}}_{i,j,l} \le Y^{U}_i (1 - z_{j,l}), \forall i,j \in QT, l \in Z_{p,-1} \end{aligned}$$
(24)
$$\begin{aligned} 2^p(y_i - Y^{U}_i) + Y^{U}_i \varDelta y_j&\le \varDelta w_{i,j} \le 2^p(y_i - Y^{L}_i) + Y^{L}_i \varDelta y_j , \forall i,j \mid (i,j) \in QT \end{aligned}$$
(25)
$$\begin{aligned} Y_i^{L} \varDelta y_j&\le \varDelta w_{i,j} \le Y_i^{U} \varDelta y_j , \forall i,j \mid (i,j) \in QT, \end{aligned}$$
(26)

where constraints (23) and (24) form an exact linearisation of the product between a binary and a continuous variable. The constraints (25) and (26) provide a relaxation of the product of two continuous variables and are known as McCormick envelopes [17].
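The validity of the McCormick envelopes can be checked numerically. A minimal sketch (generic variable bounds, written directly in terms of a bilinear product rather than the paper's \(\varDelta w_{i,j}\) variables) verifying that the envelope inequalities enclose \(y_i y_j\) over the box:

```python
import random

def mccormick_bounds(yi, yj, li, ui, lj, uj):
    """Lower/upper McCormick envelopes for the bilinear term w = yi * yj
    on the box [li, ui] x [lj, uj]."""
    lower = max(li * yj + yi * lj - li * lj,
                ui * yj + yi * uj - ui * uj)
    upper = min(ui * yj + yi * lj - ui * lj,
                li * yj + yi * uj - li * uj)
    return lower, upper

random.seed(0)
li, ui, lj, uj = -1.0, 2.0, 0.5, 3.0  # arbitrary variable bounds
for _ in range(1000):
    yi = random.uniform(li, ui)
    yj = random.uniform(lj, uj)
    lo, hi = mccormick_bounds(yi, yj, li, ui, lj, uj)
    assert lo - 1e-9 <= yi * yj <= hi + 1e-9  # envelopes enclose the product
```

The envelopes are exact whenever either variable sits at one of its bounds, which is why tightening the variable bounds (as the discretisation effectively does for \(\varDelta y_j\)) tightens the relaxation.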

Furthermore, using the previously defined variable \(w_{i,j}\), the objective function (14) and the original constraints (15) are replaced by (27) and (28), respectively.

$$\begin{aligned}&\text {max.} \quad \sum _{i | (i,i) \in QT} {Q^{0}_{i,i} w_{i,i}} + 2\sum _{(i,j) \in QT | j > i} {Q^{0}_{i,j} w_{i,j}} + \sum _j I^{0}_j x_j + \sum _i C^{0}_i y_i \end{aligned}$$
(27)
$$\begin{aligned}&\sum _{i | (i,i) \in QT} {Q^{r}_{i,i} w_{i,i}} + 2\sum _{(i,j) \in QT | j > i} {Q^{r}_{i,j} w_{i,j}} + \sum _j I^{r}_j x_j + \sum _i C^{r}_i y_i + K^{r} \le 0, \nonumber \\&\forall r \in R. \end{aligned}$$
(28)

Note that we use the assumption that the matrices \(Q^r\) are symmetric for all \(r \in R \cup \{ 0 \}\) to replace the terms

$$\begin{aligned} \sum _{i \in VC} \sum _{j \in VC} Q^r_{i,j} y_i y_j \end{aligned}$$

with

$$\begin{aligned} \sum _{i | (i,i) \in QT} {Q^{r}_{i,i} w_{i,i}} + 2\sum _{(i,j) \in QT | j > i} {Q^{r}_{i,j} w_{i,j}}, \ \forall r \in R \cup \{0\}. \end{aligned}$$
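This rewriting can be verified numerically; a minimal numpy sketch (random symmetric data, purely illustrative) checks that the compact form with \(w_{i,j} = y_i y_j\) matches the full double sum:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 4
A = rng.standard_normal((n, n))
Q = (A + A.T) / 2                  # symmetric Q, as assumed in the paper
y = rng.standard_normal(n)

full = y @ Q @ y                   # sum_i sum_j Q_ij * y_i * y_j

# Diagonal terms once, off-diagonal (j > i) terms doubled, with w_ij = y_i * y_j
w = np.outer(y, y)
compact = sum(Q[i, i] * w[i, i] for i in range(n)) \
        + 2 * sum(Q[i, j] * w[i, j] for i in range(n) for j in range(i + 1, n))
# full and compact coincide up to rounding error.
```

The compact form roughly halves the number of \(w_{i,j}\) variables required, which is one source of RNMDT's reduction in auxiliary variables.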

Summarising the above, we can define the RNMDT relaxation as follows.

Definition 4

For every integer \(p < 0\), the RNMDT relaxation of the problem (14)–(17) is defined as the problem of maximising the objective function (27), subject to the constraints (16)–(17), (18)–(20), (22)–(26) and (28).

Hereinafter, we will refer to the relaxation obtained using RNMDT with an arbitrary parameter p as RNMDT\(_p\). The following results describe the relationship between the problem (14)–(17) and RNMDT\(_p\), which will be useful in the remainder of the paper.

With Definition 1 in mind, the following theorem establishes the relationship between the original problem (14)–(17) and its RNMDT\(_p\) for any \(p < 0\). An equivalent version of Theorem 5 has been proved in [1], but it is formally stated here for reference in the developments to follow.

Theorem 5

Consider the MIQCQP problem (14)–(17) and the corresponding RNMDT\(_p\) problems, \(\forall p < 0\), introduced in Definition 4. Then, RNMDT\(_p\) is a relaxation of the problem (14)–(17) for every \(p < 0\). Moreover, for any pair \((p_1,p_2)\) with \(p_1<p_2 <0\), RNMDT\(_{p_2}\) is a relaxation of the problem RNMDT\(_{p_1}\). Consequently, for any pair \((p_1,p_2)\) with \(p_1<p_2 <0\), RNMDT\(_{p_1}\) is a tighter (or equal) relaxation of the problem (14)–(17) than RNMDT\(_{p_2}\).

Proof

The proof readily follows from [1, Proposition 6 and Theorem 6]. \(\square \)

4 The p-Lagrangian relaxation

The combination of Lagrangian relaxation with RNMDT makes it possible to construct separable mixed-integer problems that retain a weak dual relationship with the original MIQCQP problem. More specifically, one can formulate the Lagrangian relaxation of the MIQCQP and subsequently apply RNMDT to the relaxed MIQCQP problem to obtain a MIP-based relaxation for a given arbitrary value of the precision factor p. Hence, for each fixed p, we obtain a mixed-integer approximation of the LDP, which will hereinafter be called the p-Lagrangian dual problem (p-LDP). The procedure of relaxing one or more constraints with this method is what we refer to as the p-Lagrangian relaxation, and the framework analogous to Lagrangian decomposition is the p-Lagrangian decomposition (p-LD). It is worth highlighting that one can alternatively employ RNMDT to obtain the RNMDT\(_p\) reformulation of the primal problem and then employ Lagrangian relaxation to relax the set of constraints (8), which would lead to an identical formulation of the p-LDP.

The p-LDP of the primal RDEM problem can be constructed by employing RNMDT to discretise the variables \(y_j\) in the LDP represented by Problem (11). This results in a p-LDP that is decomposable by scenario \(s \in S\). Each scenario subproblem can thus be stated as

$$\begin{aligned}&{\hat{\phi }}_p^s(\lambda ) = \nonumber \\&\qquad \text {max. } P^s \left( \sum _{j \in VI} I^0_j x^s_{j} + \sum _{i | (i,i) \in QT} {Q^{s,0}_{i,i} w^s_{i,i}} + 2\sum _{(i,j) \in QT | j > i} {Q^{s,0}_{i,j} w^s_{i,j}} \right. \nonumber \\&\quad + \left. \sum _{i \in {VC}} C^{s,0}_{i} y^s_{i} + \sum _{j \in VI} {L^s(\lambda )}_j x^s_{j} \right) \end{aligned}$$
(29)
$$\begin{aligned}&\text {s.t.:} \quad \sum _{i | (i,i) \in QT} {Q^{s,r}_{i,i} w^s_{i,i}} + 2\sum _{(i,j) \in QT | j > i} {Q^{s,r}_{i,j} w^s_{i,j}} \nonumber \\&\qquad \qquad \qquad + \sum _{i \in {VC}} C^{s,r}_{i} y^s_{i} + \sum _{j \in {VI}} I^{s, r}_{j} x^s_{j} + K^{s,r} \le 0, \forall r \in R \nonumber \\&y^s_{j} = (Y^{U,s}_j - Y^{L,s}_j)\bigg ( \sum _{l \in Z_{p, -1}} 2^l z^s_{j, l} + \varDelta y^s_j \bigg ) + Y^{L,s}_j, \forall j \in DS \nonumber \\&w^s_{i, j} = (Y^{U,s}_j - Y^{L,s}_j)\bigg (\sum _{l \in Z_{p, -1}} 2^l {\hat{y}}^s_{i, j , l} + \varDelta w^s_{i, j} \bigg ) + y^s_i Y^{L,s}_j, \forall i, j \mid (i,j) \in QT \nonumber \\&0 \le \varDelta y^s_{j} \le 2^{p}, \forall j \in DS \nonumber \\&2^{p}(y^s_{i}-Y^{U,s}_i) + Y^{U,s}_i \varDelta y^s_{j} \le \varDelta w^s_{i, j}, \forall i, j \mid (i,j) \in QT \nonumber \\&\varDelta w^s_{i, j} \le 2^{p}(y^s_{i}-Y^{L,s}_i) + Y^{L,s}_i\varDelta y^s_{j}, \forall i, j \mid (i,j) \in QT \nonumber \\&Y^{L,s}_i \varDelta y^s_{j} \le \varDelta w^s_{i, j} \le Y^{U,s}_i \varDelta y^s_{j}, \forall i, j \mid (i,j) \in QT \nonumber \\&Y^{L,s}_i z^s_{j, l} \le {\hat{y}}^s_{i, j, l} \le Y^{U,s}_i z^s_{j, l}, \forall i, j \mid (i,j) \in QT, l \in Z_{p, -1} \nonumber \\&Y^{L,s}_i (1 - z^s_{j, l}) \le y^s_{i} - {\hat{y}}^s_{i, j, l} \le Y^{U,s}_i (1 - z^s_{j, l}), \forall i, j \mid (i,j) \in QT, l \in Z_{p, -1} \nonumber \\&y^s_{i} \in [Y^{L,s}_{i}, Y^{U,s}_{i}], \forall i \in VC \nonumber \\&x^s_{j} \in \{X^L_{j}, \dots , X^U_{j}\}, \forall j \in VI \nonumber \\&z^s_{j, l} \in \{0, 1\}, \forall j \in DS, l \in Z_{p, -1}, \end{aligned}$$
(30)

where \({\hat{\phi }}_p(\lambda ) = \sum \limits _{s} {\hat{\phi }}_p^s(\lambda )\) and \(L^s(\lambda )\) is given by

$$\begin{aligned} L^{s}(\lambda ) = {\left\{ \begin{array}{ll} \text {vector with elements } \left[ \sum \limits _{s'' \in S \setminus \{s^\prime \} } \lambda ^{s''}_{j}\right] , \forall j \in VI, &{} \text {if } s = s^\prime \\ \text {vector with elements } [- \lambda ^s_{j}], \forall j \in VI, &{} \text {otherwise.} \end{array}\right. } \end{aligned}$$

One appealing feature of the p-LD method is the possibility of regulating the precision of the dual bound by means of the parameter p that controls the precision of the p-LDP. Therefore, if we solve two p-LDPs setting two different values for p, the one with the smaller parameter p will provide a better or equal dual bound compared with the one with the larger p (cf. Theorem 5). The p-LDP is equivalent to constructing RNMDT\(_p\) of the LDP and consists of replacing the dual function \(\phi \) with an over-estimator \({\hat{\phi }}_p\) with its respective RNMDT-associated auxiliary variables and constraints.
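To make the role of the precision factor p concrete, the following sketch (an illustrative aid of ours, not part of the paper's formulation; all function names are hypothetical) decomposes a value \(y \in [Y^L, Y^U]\) into the binary weights \(z_l\), \(l \in \{p, \dots , -1\}\), and the residual \(\varDelta y \in [0, 2^p]\), mirroring the constraint \(y = (Y^U - Y^L)\big (\sum _l 2^l z_l + \varDelta y \big ) + Y^L\) above.

```python
def rnmdt_decompose(y, y_lo, y_hi, p):
    """Greedy binary decomposition in the spirit of RNMDT: write the
    normalised value t = (y - y_lo)/(y_hi - y_lo) as
    sum_{l=p}^{-1} 2^l z_l + dy, with z_l in {0,1} and dy in [0, 2**p]."""
    t = (y - y_lo) / (y_hi - y_lo)
    z = {}
    for l in range(-1, p - 1, -1):  # l = -1, -2, ..., p
        z[l] = 1 if t >= 2.0 ** l else 0
        t -= z[l] * 2.0 ** l
    return z, t  # t is the residual dy

def rnmdt_reconstruct(z, dy, y_lo, y_hi):
    """Inverse map: y = (y_hi - y_lo) * (sum_l 2^l z_l + dy) + y_lo."""
    return (y_hi - y_lo) * (sum(2.0 ** l * zl for l, zl in z.items()) + dy) + y_lo
```

For \(p = -3\) and \(y \in [0, 2]\), the three binary weights capture y to within \(2^{-3}(Y^U - Y^L)\) and the residual closes the gap exactly; decreasing p by one halves the residual bound, which is precisely why smaller p yields tighter relaxations.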

4.1 Convergence of the p-Lagrangian relaxation

The following technical results establish the convergence of the sequence \(\{{\hat{\phi }}_{-k}\}_{k=1}^{+\infty }\) of p-Lagrangian dual functions to the Lagrangian dual function \(\phi \). We start by referring to epi-convergence [55], which has been shown to be an ideal tool to study the convergence and approximation of optimisation problems, especially in settings considering duality as a coordination framework. We refer the reader to [55, Chapter 7] for a detailed study of the properties of epi-convergence, opting to list only the properties relevant to our purposes.

4.1.1 Reformulation of DEM

To use this theoretical framework, it will be convenient to add an auxiliary set of variables to the DEM problem and to frame the problem as a minimisation. This can be achieved by reformulating the DEM as:

$$\begin{aligned}&P~:~ v\left( \left\{ -f^{s}\right\} _{s\in S},\left\{ C^{s}\right\} _{s\in S}\right) \\&\quad =\text {min.}_{\left( x,y,w\right) } \left[ \sum _{s\in S}\left( -f^{s}\right) \left( x,\left( y^{s},w^{s}\right) \right) :\left( x,y^{s},w^{s}\right) \in C^{s}\right] \end{aligned}$$

where, for \(s \in S\),

$$\begin{aligned} C^{s} =\left\{ \left( x,\left( y^{s},w^{s}\right) \right) \mid g^{s,r}\left( x,\left( y^{s},w^{s}\right) \right) \le 0\text {, }r\in R\text {, }w_{i,j}^{s}=y_{i}^{s}y_{j}^{s}\text {, } x\in X\text {, }y^{s}\in Y^{s}\right\} , \end{aligned}$$

and

$$\begin{aligned} f^{s}(x,\left( y^{s},w^{s}\right) )&=P^{s}\left( \sum _{j\in VI}I_{j} ^{0}x_{j}+\sum _{i\in {VC}}\sum _{j\in {VC}}Q_{i,j} ^{s,0}w_{i,j}^{s}+\sum _{i\in {VC}}C_{i}^{s,0}y_{i}^{s}\right) , \end{aligned}$$
(31)
$$\begin{aligned} g^{s,r}(x,\left( y^{s},w^{s}\right) )&=\sum _{i\in {VC}}\sum _{j\in {VC}}Q_{i,j}^{s,r}w_{i,j}^{s}+\sum _{i\in {VC}} C_{i}^{s,r}y_{i}^{s}+\sum _{j\in {VI}}I_{j}^{s,r}x_{j}+K^{s,r}, \end{aligned}$$
(32)

while X and \(Y^{s}\) represent the box constraints (16) and (17) for integer and continuous variables, respectively. Denote the convex hull of a set C by \({\text {conv}} C\). In our final result, we will assume that \(0 \in {\text {int}} {\text {conv}}C^{s}\) for all \(s \in S\). This, without loss of generality, can in turn be ensured via a translation (and a subsequent change of variable) whenever \({\text {int}} {\text {conv}} C^s \ne \emptyset \) for all \(s \in S\).

Recall that the constraints (25) and (26) in Definition 4 provide a relaxation of the product of two continuous variables and are known as McCormick envelopes [17]. Denote by \(S_{R_{p}}\) the set comprising the variables \(\left( y,x,{{\hat{y}},}z,\varDelta y,\varDelta w\right) \) satisfying the constraints (23) to (26).

For the purpose of convergence analysis, we would like the relaxation to reside in the same variable space of \(\left( x,\left( y,w\right) \right) \in X\times Y^{s}\times {\mathbb {R}}^{|VC|^{2}}\subseteq {\mathbb {R}}^{|VI |+|VC|+|VC|^{2}}\) for \(s\in S\). The current space of variables consists of \(\left( y,x,{{\hat{y}},}z,\varDelta y,\varDelta w\right) \). Analogously to the derivations in the proof of Theorem 5, let us define the mapping \(M:\left( y,x,{{\hat{y}},}z,\varDelta y,\varDelta w\right) \rightarrow \left( x^{\prime },\left( y^{\prime },w^{\prime }\right) \right) \) to be

$$\begin{aligned} x^{\prime }&=x \quad \text { and } \quad y_{j}^{\prime } =(Y_{j}^{U}-Y_{j}^{L})\left( \sum _{l\in Z_{p,-1}}{2^{l}z_{j,l} }+\varDelta y_{j}\right) +Y_{j}^{L} \text { with}\\ w_{i,j}^{\prime }&=(Y_{j}^{U}-Y_{j}^{L})\left( \sum _{l\in Z_{p,-1}} {2^{l}{\hat{y}}_{i,j,l}}+\varDelta w_{i,j}\right) + y_{i}Y_{j}^{L}. \end{aligned}$$

Then, we constrain \(\left( x,\left( y,w\right) \right) \) to the set \(M\left( S_{R_{p}}\right) \). The RNMDT\(_{p}\) relaxation of the problem P for the fixed value of \(p < 0\) can be re-formulated as

$$\begin{aligned}&P_{R_{p}}~:~ v\left( \left\{ -f^{s}\right\} _{s\in S},\left\{ C_{R_{p} }^{s}\right\} _{s\in S}\right) \nonumber \\&\quad =\text {min.}_{\left( x,y,w\right) } \left[ \sum _{s\in S}\left( -f^{s}\right) \left( x,\left( y^{s},w^{s}\right) \right) :\left( x,y^{s},w^{s}\right) \in C_{R_{p}}^{s}\right] \end{aligned}$$
(33)

where, for \(s\in S\),

$$\begin{aligned} C_{R_{p}}^{s} =\left\{ \left( x,\left( y^{s} ,w^{s}\right) \right) \in M\left( S_{R_{p}}\right) \mid g^{s,r}\left( x,\left( y^{s},w^{s}\right) \right) \le 0\text {, }r\in R\text {, } x\in X\text {, }y^{s}\in Y^{s}\right\} . \end{aligned}$$

The advantage of framing the approximation in this form is that the additional binary variables z required to describe the McCormick envelopes are embedded in the constraint set \(M(S_{R_{p}})\). Notice that the objective is not perturbed at all in this formulation. We have the following due to the properties of the McCormick envelopes [17].

Proposition 6

\(P_{R_{p}}\) is a relaxation of \(P_{R_{p-1}}\) and of P, as

$$\begin{aligned} M\left( S_{R_{p-1}}\right) \subseteq M\left( S_{R_{p}}\right) \end{aligned}$$

where

$$\begin{aligned} \left\{ \left( x,\left( y^{s},w^{s}\right) \right) \mid w_{i,j}^{s} =y_{i}^{s}y_{j}^{s}\right\} =\bigcap _{p<0}M\left( S_{R_{p}}\right) . \end{aligned}$$

Proof

The proof of the first part is equivalent to that of Theorem 5 in Sect. 3. To prove the second part, first notice that \(P_{R_p}\), \(\forall p < 0\), is a reformulated equivalent of the NMDT relaxation presented in [18]. As demonstrated in [18, Property 3], the NMDT relaxation converges to the primal problem P as \(p \rightarrow -\infty \), which in turn implies that \(\lim \limits _{p \rightarrow -\infty } w^s_{i,j} = y^s_i y^s_j\). \(\square \)
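The tightening behaviour behind Proposition 6 can be observed numerically on its smallest building block. The sketch below (our illustration; the grid search is an assumption chosen for transparency, not a procedure from the paper) measures the width of the McCormick envelope of a bilinear term \(w = x y\) over a box and shows that halving the y-interval, which is what decreasing p by one does to each RNMDT sub-interval, halves the worst-case envelope width.

```python
def mccormick_width(x_lo, x_hi, y_lo, y_hi, n=50):
    """Maximum vertical gap between the upper and lower McCormick
    envelopes of w = x*y over [x_lo, x_hi] x [y_lo, y_hi], estimated
    by an even grid search (n even so the midpoint is sampled)."""
    gap = 0.0
    for i in range(n + 1):
        x = x_lo + (x_hi - x_lo) * i / n
        for j in range(n + 1):
            y = y_lo + (y_hi - y_lo) * j / n
            # lower envelope: max of the two underestimators
            under = max(x_lo * y + x * y_lo - x_lo * y_lo,
                        x_hi * y + x * y_hi - x_hi * y_hi)
            # upper envelope: min of the two overestimators
            over = min(x_hi * y + x * y_lo - x_hi * y_lo,
                       x_lo * y + x * y_hi - x_lo * y_hi)
            gap = max(gap, over - under)
    return gap
```

On the unit box the maximum width is \((x^U - x^L)(y^U - y^L)/2 = 0.5\), attained at the midpoint; restricting y to [0, 0.5] gives 0.25, consistent with \(M(S_{R_{p-1}}) \subseteq M(S_{R_{p}})\).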

4.1.2 Convergence of relaxations of nonconvex optimisation problems

We will be relating the convergence of the RNMDT\(_p\) relaxations to the notion of epi-convergence. In this subsection, we develop a general theoretical framework applicable to this context.

Definition 7

Let \(f_{k}:{\mathbb {R}}^{n}\rightarrow \mathbb {R\cup }\left\{ +\infty \right\} \) be a sequence of extended real-valued functions. We say that \(\left\{ f_{k}\right\} \) epi-converges to f and write e-\(\lim _{k} f_{k}=f\) if and only if \(\lim _{k}{\text {epi}}f_{k}={\text {epi}}f\), where \({\text {epi}}f_{k} = \{(x,\alpha ) \mid f_{k}( x) \le \alpha \} \) and \({\text {epi}}f = \{ ( x,\alpha ) \mid f(x) \le \alpha \}\).

The limit of sets in Definition 7 is understood in the Kuratowski–Painlevé sense. That is, (i) every accumulation point \(\left( x,\alpha \right) \) of a sequence \(\left( x_{k_{l}},\alpha _{k_{l}}\right) \in {\text {epi}}f_{k_{l}}\) belongs to \({\text {epi}}f\) (i.e., \(\left( x_{k_{l}},\alpha _{k_{l}}\right) \rightarrow \left( x,\alpha \right) \in {\text {epi}}f\)) and (ii) for all \(\left( x,\alpha \right) \in {\text {epi}}f\) there exists a sequence \(\left( x_{k},\alpha _{k}\right) \in {\text {epi}}f_{k}\) with \(\left( x_{k},\alpha _{k}\right) \rightarrow (x,\alpha )\). Recall that a function f is lower semi-continuous if and only if \({\text {epi}}f\) is closed.

One of the reasons that epi-convergence lends itself to the analysis of approximations is that any optimisation problem of the form \(v\left( g_{k} ,C_{k}\right) =\text {min.}\{g_{k}(x) \mid x\in C_{k}\}\) can be represented as the infimum of an extended real-valued function \(f_{k}=g_{k}+\delta _{C_{k}}\), where \(\delta _{C_{k}}\left( x\right) =0\) if \(x\in C_{k}\), and \(+\infty \) otherwise. Thus, the general properties of approximating optimisation problems can be framed in terms of the behaviour of infima of extended real-valued functions, which are fully capable of modelling constraint sets containing a fixed number of integer variables.
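As a toy illustration of this device (ours, not from the paper; the quadratic objective and the integer constraint set are arbitrary choices), the constrained problem \(\text {min.}\{g(x) \mid x \in C\}\) and the unconstrained minimisation of \(f = g + \delta _C\) agree, even when C encodes integrality:

```python
import math

def delta(x, C):
    """Indicator function of a membership test C: 0 on C, +inf outside."""
    return 0.0 if C(x) else math.inf

# g(x) = (x - 0.6)^2 minimised over the integers in [-2, 2]
g = lambda x: (x - 0.6) ** 2
C = lambda x: float(x).is_integer() and -2 <= x <= 2

# unconstrained minimisation of the extended real-valued f = g + delta_C;
# the half-integer grid contains feasible and infeasible points alike
candidates = [k / 2 for k in range(-8, 9)]
f = lambda x: g(x) + delta(x, C)
best = min(candidates, key=f)
```

The infeasible half-integers receive the value \(+\infty \) and are automatically discarded, so the unconstrained minimiser of f is the feasible integer closest to 0.6.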

Consider the optimisation problem \(P:\text {max.}\{f(x) \mid x \in C\} \). Recall that \(P_{R_{2}}: \text {max.}\{f_{R_{2}}(x) \mid x\in C_{R_{2}}\}\) is a tighter relaxation (cf. Definition 1) of P than \(P_{R_{1}}:\text {max.}\{ f_{R_{1}}(x) \mid x \in C_{R_{1}}\}\) if and only if

  1.

    \(f_{R_{1}}\left( x\right) \ge f_{R_{2}}\left( x\right) \ge f\left( x\right) \) for all \(x\in C\) and

  2.

    \(C\subseteq C_{R_{2}}\subseteq C_{R_{1}}\).

Note that \(v\left( f_{R_{2}},C_{R_{2}}\right) =\text {max.}\left( f_{R_{2}}\left( x\right) +\delta _{C_{R_{2}}}\left( x\right) \right) \le v\left( f_{R_{1}},C_{R_{1}}\right) \). To place the study of such relaxations into the framework of epi-convergence, we must consider the equivalent minimization problem, so

$$\begin{aligned} v\left( -f_{R_{2}},C_{R_{2}}\right) =\text {min.}\left( -f_{R_{2}}\left( x\right) +\delta _{C_{R_{2}}}\left( x\right) \right) \ge v\left( -f_{R_{1}},C_{R_{1}}\right) , \end{aligned}$$

because \(-f_{R_{2}}+\delta _{C_{R_{2}}}\ge -f_{R_{1}}+\delta _{C_{R_{1}}}\). Clearly, a sequence of increasingly tighter relaxations \(P_{R_{p}}\) with \(p = -k\) leads to a monotonically non-decreasing sequence of associated extended real-valued functions \(g_{k}= -f_{R_{k}}+\delta _{C_{R_{k}}}\), i.e., \(g_{k+1} \ge g_{k}\) for all k. From [55, Propositions 7.4 and 7.15], we know that such sequences epi-converge to the closure of their pointwise supremum. Moreover, uniformly convergent sequences also epi-converge. This enables epi-convergence to be invoked to study the convergence of our relaxations. Later, we will show that the associated sequence of convex dual problems also epi-converges in the strong sense of uniform convergence on bounded sets.

In what follows, we do not assume that the set C is convex or connected and, thus, it may contain integrality constraints on variables. In the analysis of integer programming problems, it is best to posit the integer constraints in the constraint set C, and hence also in the relaxations \(C_{R_{k}}\), as they do not pose a barrier to convergence: the theory of epi-convergence applies to extended real-valued, lower semi-continuous or closed functions, such as \(\left\{ \delta _{C_{R_{k}}}\right\} _{k}\). Although it is natural that the relaxations \(C_{R_k}\) epi-converge (as they monotonically become “tighter” as p decreases, as stated in Proposition 6), we still need to establish to what limit they converge. This is stated in Proposition 8. Moreover, notice the role of Proposition 6 in ensuring that the RNMDT\(_p\) relaxation satisfies condition 2 in Proposition 8.

Proposition 8

Consider the optimisation problem \(P:\text {min.}\left\{ -f\left( x\right) \mid x\in C\right\} \) involving a proper, closed, lower semi-continuous function f and a closed set C. Let us denote by \({\overline{C}}\) the closure of the set C. Suppose we have a sequence of tighter relaxations of P, given by \(P_{R_{k}} :\text {min.}\{-f_{R_{k}}(x) \mid x\in C_{R_{k}}\}\) where

  1.

    \(-f_{R_{k}}\left( x\right) \le -f_{R_{k+1}}\left( x\right) \le -f\left( x\right) \) for all \(x\in C\) and \({\text {epi}}\left( -f\right) =\overline{\bigcap _{k}{\text {epi}}\left( -f_{k}\right) }\) with

  2.

    \(C\subseteq C_{R_{k+1}}\subseteq C_{R_{k}}\) and \(\overline{\bigcap _{k}C_{R_{k}}}=C\) (which is equivalent to \(\bigcap _{k}C_{R_{k}}=C\) when each \(C_{R_{k}}\) is closed) .

Then \(\left\{ g_{k}=-f_{R_{k}}+\delta _{C_{R_{k}}}\right\} _{k=1}^{\infty }\) epi-converges to \(g=-f+\delta _{C}.\)

Proof

Let \(g = -f+\delta _{C}.\) Note that we have, for any x, that

$$\begin{aligned} g_{k}\left( x\right) =-f_{R_{k}}\left( x\right) +\delta _{C_{R_{k}} }\left( x\right) \le -f\left( x\right) +\delta _{C}\left( x\right) = g\left( x\right) . \end{aligned}$$

Let \(B_\delta ({\bar{x}}) = \{x \mid ||x - {\bar{x}}|| \le \delta \}\) where \(|| \cdot ||\) is any norm. Therefore, for any \(\delta >0\) and \({\bar{x}}\in {\text {dom}}f\cap C\) we have (cf. [55, Exercise 7.3])

$$\begin{aligned} e\text {-}\limsup _{k\rightarrow \infty }g_{k}\left( {\bar{x}}\right)&=\sup _{\delta>0}\limsup _{k\rightarrow \infty }\inf _{x\in B_{\delta }\left( {\bar{x}}\right) }g_{k}\left( x\right) \nonumber \\&\le \sup _{\delta >0}\inf _{x\in B_{\delta }\left( {\bar{x}}\right) }g\left( x\right) \le g\left( {\bar{x}}\right) =-f\left( {\bar{x}}\right) +\delta _{C}\left( {\bar{x}}\right) , \end{aligned}$$
(34)

with the last inequality following from \(\inf _{x\in B_{\delta }\left( \bar{x}\right) }g\left( x\right) \le g\left( {\bar{x}}\right) \) for all \(\delta >0\). By [55, Theorem 7.46], we always have (for any extended real-valued sequence)

$$\begin{aligned} e\text {-}\liminf _{k\rightarrow \infty }\left( -f_{R_{k}}\right) \left( {\bar{x}}\right) +e\text {-}\liminf _{k\rightarrow \infty }\delta _{C_{R_{k}} }\left( {\bar{x}}\right)&\le e\text {-}\liminf _{k\rightarrow \infty }\left( -f_{R_{k}}+\delta _{C_{R_{k}}}\right) \left( {\bar{x}}\right) \nonumber \\&=e\text {-} \liminf _{k\rightarrow \infty }g_{k}\left( {\bar{x}}\right) . \end{aligned}$$
(35)

Under our assumptions, \( e\text {-}\liminf _{k\rightarrow \infty }(-f_{R_{k}})({\bar{x}})=-f({\bar{x}}) \) and \(e\text {-}\liminf _{k\rightarrow \infty }\delta _{C_{R_{k}}}({\bar{x}}) =\delta _{C}({\bar{x}})\). This is because \(C\subseteq C_{R_{k+1}}\subseteq C_{R_{k}}\) and \(\overline{\bigcap _{k}C_{R_{k}}}=C\) implies that \(\{\delta _{C_{R_{k}}}\}_{k=1}^{\infty }\) is a monotone non-decreasing sequence of functions with

$$\begin{aligned} {\text {epi}}\left[ \overline{\sup _{k} \delta _{C_{R_{k}}}}\right] =\overline{{\text {epi}}\delta _{\cap _{k}C_{R_{k}}}}={\text {epi}}\delta _{\overline{\cap _{k}C_{R_{k}}} }={\text {epi}}\delta _{C}. \end{aligned}$$

Hence, (34) and (35) imply

$$\begin{aligned} -f\left( {\bar{x}}\right) +\delta _{C}\left( {\bar{x}}\right)&\le e\text {-}\liminf _{k\rightarrow \infty }\left( -f_{R_{k}}+\delta _{C_{R_{k}} }\right) \left( {\bar{x}}\right) =e\text {-}\liminf _{k\rightarrow \infty } g_{k}\left( {\bar{x}}\right) \\&\le e\text {-}\limsup _{k\rightarrow \infty }g_{k}\left( {\bar{x}}\right) \le -f\left( {\bar{x}}\right) +\delta _{C}\left( {\bar{x}}\right) \end{aligned}$$

and, thus, equality ensues. \(\square \)

4.1.3 Application to the convergence of the Lagrangian bounds

In this subsection, we will demonstrate the connection between the p-Lagrangian dual function \({\hat{\phi }}_p\) and the Lagrangian dual function \(\phi \) of the primal RDEM problem, analogously to the developments presented in Sect. 2. We begin with some elementary observations that will be useful in the developments hereinafter.

Proposition 9

Let f be the objective function of the primal MIQCQP problem and \({\bar{f}}\) be its optimal value. Let \(\phi \) be the Lagrangian dual function of the corresponding LDP and the functions \({\hat{\phi }}_p: {\mathbb {R}}^{(|S|-1) \times |VI|} \rightarrow {\mathbb {R}}\) be the p-Lagrangian dual functions of the corresponding p-LDPs, where \(p \in {\mathbb {Z}}^-\). Then the following statements are true.

  (i)

    \({\hat{\phi }}_p\) is convex, \(\forall p < 0\).

  (ii)

    \({\hat{\phi }}_{p_2} \ge {\hat{\phi }}_{p_1} \ \forall p_1< p_2 < 0\).

  (iii)

    \({\hat{\phi }}_p \ge \phi , \ \forall \ p < 0\).

  (iv)

    \({\hat{\phi }}_p \ge {\overline{f}} , \ \forall \ p < 0\).

  (v)

    \(\inf \limits _{ \lambda } {\hat{\phi }}_{p_2}( \lambda ) \ge \inf \limits _{ \lambda } {\hat{\phi }}_{p_1}( \lambda ) \ge \inf \limits _{ \lambda } \phi (\lambda ), \forall p_1< p_2 < 0\).

Proof

Statement (i) is analogous to Proposition 2. Statements (ii) and (iii) are a direct consequence of Theorem 5. In particular, (iii) is also a consequence of the fact that \({\hat{\phi }}_p \rightarrow \phi \) as \(p \rightarrow -\infty \).

From Proposition 3, we know that \(\phi \ge {\overline{f}} \). From statement (iii), we have that \({\hat{\phi }}_p \ge \phi \). It follows by transitivity that \({\hat{\phi }}_p \ge {\overline{f}}\), which proves (iv). Lastly, combining statements (ii) and (iii) and taking the infimum leads us to the conclusion in statement (v). \(\square \)

The following results show that the study of perturbations of convex optimisation problems, and of how these perturbations affect their Lagrangian relaxations, is essentially the analysis of the interaction of epi-convergence with conjugation.

To this end, let \(f^{s}:{\mathbb {R}}^{|VI|+|VC|+|VC|^{2} }\rightarrow {\mathbb {R}}\) and \(g^{s,r}:{\mathbb {R}}^{|VI|+|VC |+|VC|^{2}}\rightarrow {\mathbb {R}}\), \(\forall s\in S\) and \(\forall r\in R\), be continuous twice-differentiable functions. By \(x^{-s'}\) we denote the vector x reduced by one dimension by removing its \(s'\) component, where \(s'\) represents the reference scenario in the formulation of the RDEM. The associated Lagrangian relaxations are given by:

$$\begin{aligned}&-{\hat{\phi }}_p\left( \lambda \right) \\&\quad =\inf _{\left( x,\left( y,w\right) \right) }\left[ \sum _{s\in S}\left( -f^{s}\right) \left( x^{s},\left( y^{s},w^{s}\right) \right) +\sum _{s\in S\backslash \left\{ s^{\prime }\right\} }\left( -\lambda ^{s}\right) ^{\top }\left( x^{s^{\prime }}-x^{s}\right) :\left( x^{s},\left( y^{s},w^{s}\right) \right) \in C^{s}_{R_p}\right] \\&\quad =-\sup _{\left( x,\left( y,w\right) \right) }\left[ \sum _{s\in S\backslash \left\{ s^{\prime }\right\} }\left( \lambda ^{s}\right) ^{\top }\left( x^{s^{\prime }}-x^{s}\right) -\sum _{s\in S}\left( \left( -f^{s}\right) \left( x^{s},\left( y^{s},w^{s}\right) \right) +\delta _{C^{s}_{R_p}}\left( x^{s},\left( y^{s},w^{s}\right) \right) \right) \right] \\&\quad =-\sup _{\left( x^{-s^{\prime },}\left( y^{-s^{\prime }},w^{-s^{\prime } }\right) \right) }\sum _{s\in S\backslash \left\{ s^{\prime }\right\} }\left[ \left( -\lambda ^{s}\right) ^{\top }\ x^{s}-\ \left( -f^{s}\left( x^{s},(y^{s}, w^{s})\right) +\delta _{C^{s}_{R_p}}\left( x^{s},\left( y^{s} ,w^{s}\right) \right) \right) \right] \\&\qquad -\sup _{\left( x^{s^{\prime }},\left( y^{s^{\prime } },w^{s^{\prime }}\right) \right) }\left[ \left( \lambda ^{s^{\prime } }\right) x^{s^{\prime }}-\left( -f^{s^{\prime }}\left( x^{s^{\prime }},\left( y^{s^{\prime }},w^{s^{\prime }}\right) \right) +\delta _{C^{s^{\prime }}_{R_p}}\left( x^{s^{\prime }},\left( y^{s^{\prime }},w^{s^{\prime }}\right) \right) \right) \right] \\&\quad =-\left( \sum _{s\in S\backslash \left\{ s^{\prime }\right\} }\left( -f^{s}+\delta _{C^{s}_{R_p}}\right) \right) ^{*}\left( -\lambda ^{-s^{\prime } },(0,0)\right) -\left( -f^{s^{\prime }}+\delta _{C^{s^{\prime }}_{R_p}}\right) ^{*}\left( \lambda ^{s^{\prime }},(0,0)\right) , \end{aligned}$$

where \(\lambda ^{s'} = \sum _{s\in S\backslash \left\{ s^{\prime }\right\} } \lambda ^{s} \).

We will need a result that examines epi-convergence through the conjugation operator to establish the convergence of the optimal Lagrangian bound. The reason for this is that epi-convergence is widely recognised as the primary form of convergence under which the associated marginal mappings also converge.

From Proposition 9, we have that \(-{\hat{\phi }}_p\left( \lambda \right) \le v\left( \left\{ -f^{s}\right\} _{s\in S},\left\{ C^{s}_{R_p}\right\} _{s\in S}\right) \), which implies that \(\inf _{\lambda } {\hat{\phi }}_p\left( \lambda \right) \ge v\left( \left\{ f^{s}\right\} _{s\in S},\left\{ C^s_{R_p}\right\} _{s\in S}\right) \), where

$$\begin{aligned} \inf _{\lambda }{\hat{\phi }}_p\left( \lambda \right) =\inf _{\lambda ^{-s^{\prime }}}\left( \sum _{s\in S\backslash \left\{ s^{\prime }\right\} }\left( -f^{s} +\delta _{C^{s}_{R_p}}\right) \right) ^{*}\left( -\lambda ^{-s^{\prime } },0\right) +\inf _{\lambda ^{s^{\prime }}}\left( -f^{s^{\prime }}+\delta _{C^{s^{\prime }}_{R_p}}\right) ^{*}\left( \lambda ^{s^{\prime }},0\right) . \end{aligned}$$

When we consider this dual form of the problem \(P_{R_{k}}\) to obtain the p-LDP with \(p = -k\), we will denote the corresponding dual function by \({\hat{\phi }}_{-k}\). In what follows, the functions \(f_k:=-\sum _{s\in S}f_{R_{k}}^{s} + \delta _{C_{R_{k}}^{s}}\) are equi-hypercoercive due to the boundedness assumption on \(C_{R_{k}}^{s}\) and, thus, we can use the following result to state the epi-convergence of \({\hat{\phi }}_p\) as \(p \rightarrow -\infty \). Note that this result does not assume \(\{f_k\}_{k=1}^{\infty }\) is a sequence of convex functions.

Theorem 10

(adapted from [49, Corollary 20]) If the functions \(\left\{ f_{k}\right\} \) and \(f:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}}\cup \{+\infty \}\) are proper, lower semi-continuous with f bounded below on bounded sets and \(\left\{ f_{k}\right\} \) equi-hypercoercive (in the sense that \(\lim _{\Vert x \Vert \rightarrow \infty }\frac{f_{k}(x)}{\Vert x\Vert }=+\infty \) uniformly in k) then

$$\begin{aligned} e\text {-}\lim _{k}f_{k}=f\quad \text {implies \,}e\text {-}\lim _{k}f_{k}^{*}=f^{*}. \end{aligned}$$
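A one-dimensional numerical illustration of this statement (our construction; the quadratic family is an assumption chosen only for transparency) takes \(f_k(x) = x^2/2 + 1/k\), which decreases pointwise (hence epi-converges) to \(f(x) = x^2/2\) and is equi-hypercoercive; the conjugates \(f_k^*(\lambda ) = \lambda ^2/2 - 1/k\) then converge to \(f^*(\lambda ) = \lambda ^2/2\), as the sketch below verifies on a grid.

```python
def num_conjugate(f, lam, grid):
    """Numerical Legendre-Fenchel conjugate f*(lam) = sup_x {lam*x - f(x)},
    approximated by maximising over a finite grid."""
    return max(lam * x - f(x) for x in grid)

grid = [h / 100.0 for h in range(-500, 501)]  # the interval [-5, 5], step 0.01

def fk(k):
    # f_k(x) = x^2/2 + 1/k: decreases pointwise to x^2/2 and is
    # hypercoercive uniformly in k, since f_k(x)/|x| -> +inf
    return lambda x: 0.5 * x * x + 1.0 / k

# exact conjugates are f_k*(lam) = lam^2/2 - 1/k, so at lam = 1 the values
# should increase towards f*(1) = 0.5 as k grows
vals = [num_conjugate(fk(k), 1.0, grid) for k in (1, 10, 1000)]
```

The additive shift 1/k passes through conjugation with the opposite sign and vanishes as \(k \rightarrow \infty \), so the convergence of \(\{f_k\}\) is mirrored by the convergence of \(\{f_k^*\}\).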

This enables us to state the following general result.

Corollary 11

Consider the optimisation problem \(P_{S}\) involving proper, closed, lower semi-continuous functions \(\left\{ f^{s}\right\} _{s\in S}\) and closed sets \(\left\{ C^{s}\right\} _{s\in S}\). Suppose we have a sequence of increasingly tighter relaxations of the optimisation problem given by

$$\begin{aligned} P_{R_{k}}:\text {min.}\left\{ -\sum _{s\in S}f_{R_{k}}^{s}\left( x,(y^{s}, w^{s}) \right) \mid \left( x,( y^{s},w^{s}) \right) \in C_{R_{k}}^{s}\text {, }s\in S\right\} . \end{aligned}$$

We assume

  (i)

    \(f_{R_{k}}^{s}\left( x,(y^{s}, w^{s})\right) \ge f_{R_{k+1}}^{s}\left( x,(y^{s}, w^{s})\right) \ge f^{s}\left( x,(y^{s}, w^{s})\right) \) for all \(\left( x,(y^{s}, w^{s})\right) \in C_{R_{k}}^{s}\), \(s\in S\) and \({\text {epi}}\left( -f^{s}\right) =\overline{\bigcap _{k}{\text {epi}}\left( -f_{R_{k}} ^{s}\right) }\) with

  (ii)

    for some \(K>0\) we have \(C^{s}\subseteq C_{R_{k+1}}^{s}\subseteq C_{R_{k}}^{s}\subseteq B_{K}(0)\), \(s\in S\) and \(\overline{\bigcap _{k} C_{R_{k}}^{s}}=C^{s}\) (which is equivalent to \(\bigcap _{k}C_{R_{k}}^{s}=C^{s}\) when each \(C_{R_{k}}^{s}\) is closed).

Let \(\left\{ {\hat{\phi }}_{-k}\left( \cdot \right) \right\} _{k=1}^{\infty }\) be the associated sequence of scenario-wise relaxations of the problems \(P_{R_{k} }\). Then \(\left\{ {\hat{\phi }}_{-k}\left( \cdot \right) \right\} _{k=1}^{\infty }\) epi-converges to \(\phi \left( \cdot \right) \), where \(\left\{ {\hat{\phi }}_{-k}\left( \cdot \right) \right\} _{k=1}^{\infty }\) and \(\phi \left( \cdot \right) \) are all convex, proper and closed.

Proof

As the functions \(-\displaystyle \sum _{s\in S}f_{R_{k}}^{s}+\delta _{C_{R_{k}}^{s}}\) are equi-hypercoercive (due to the boundedness assumption on \(C_{R_{k}}^{s}\)), we have, from combining Propositions 6 and 8, that \(\left\{ -\displaystyle \sum _{s\in S}f_{R_{k}}^{s}+\delta _{C_{R_{k}} ^{s}}\right\} \) epi-converges to \(-\displaystyle \sum _{s\in S}f^{s}+\delta _{C^{s}}\). Thus, by Theorem 10 and the fact that \({\hat{\phi }}_{-k}\) (and \(\phi \)) are obtained via conjugation, \(\left\{ {\hat{\phi }}_{-k}\left( \cdot \right) \right\} _{k=1}^{\infty }\) epi-converges to \(\phi \left( \cdot \right) .\)

\(\square \)

Furthermore, as the Lagrangian dual functions are convex (cf. Proposition 9), we can finally arrive at the convergence result we were aiming for.

Proposition 12

Consider the optimisation problem P involving proper, closed, lower semi-continuous functions \(\left\{ f^{s}\right\} _{s\in S}\) and closed sets \(\left\{ C^{s}\right\} _{s\in S}\). Suppose we have a sequence of increasingly tighter approximations of P given by

$$\begin{aligned} P_{R_{k}} :\text {min.}\left\{ -\sum _{s\in S}f_{R_{k}}^{s}\left( x,(y^{s}, w^{s})\right) \mid \left( x,(y^{s}, w^{s})\right) \in C_{R_{k}}^{s}\text {, }s\in S\right\} , \end{aligned}$$

and posit the assumptions as in Corollary 11. Then \(\left\{ {\hat{\phi }}_{-k}\left( \cdot \right) \right\} _{k=1}^{\infty }\) converges uniformly on bounded sets.

Proof

We utilise [55, Theorem 7.17], which states that when \(\phi \) is a convex, lower semi-continuous function on \({\mathbb {R}}^{n}\) and \({\text {dom}}\phi \) has nonempty interior, the epi-convergence of \(\left\{ {\hat{\phi }}_{-k}\left( \cdot \right) \right\} _{k=1}^{\infty }\) to \(\phi (\cdot )\) is equivalent to \(\left\{ {\hat{\phi }}_{-k} \right\} \) converging uniformly to \(\phi \) on all compact subsets of \({\text {dom}}\phi \) that do not contain boundary points of \({\text {dom}} \phi \). As the sets \(\left\{ C_{R_{k}}^{s}\right\} _{k=1}^{\infty }\) are contained in a bounded set, \({\text {dom}}\phi \) is the whole space, so combining [55, Theorem 7.17] and Corollary 11 we immediately obtain uniform convergence on bounded sets. \(\square \)

We now finish this analysis by verifying that our problem satisfies the assumptions of Proposition 12 and that this ensures convergence of the optimal Lagrangian bound. We note that the latter could not be proven without the utilisation of epi-convergence. Denote by \({\text {int}} C\) the interior of set C and by \({\text {conv}} f\) the convex function whose epigraph is formed by taking the convex hull of the set \({\text {epi}} f\). We say that a sequence \(\{f_{k}\}\) of functions is eventually level-bounded if for each \(\alpha \in \) \({\mathbb {R}}\) the sequence of sets \({\text {lev}}_{\alpha }f_{k}:=\{x\mid f_{k}(x)\le \alpha \}\) is eventually contained in a bounded set.

Theorem 13

Consider the problem DEM as formulated as problem P in Sect. 4.1.1, along with its approximation \(P_{R_p}\). We apply Lagrangian duality to the problem \(P_{R_{k}}\) to obtain the p-LDP with \(p = -k\) and denote the corresponding Lagrangian dual function by \({\hat{\phi }}_{-k}\). Then, \(\left\{ {\hat{\phi }}_{-k}\left( \cdot \right) \right\} _{k=1}^{\infty }\) converges uniformly on bounded sets and epi-converges. If, in addition, \(0 \in {\text {int}} {\text {conv}} C^{s}\), \(\forall s \in S\), then we also have convergence of the optimal dual values:

$$\begin{aligned} \inf _{\lambda } {\hat{\phi }}_{-k}(\lambda ) \xrightarrow [k \rightarrow \infty ]{} \inf _{\lambda } {\phi }(\lambda ) \end{aligned}$$

Furthermore, if \(\varepsilon _{k}\downarrow 0\) and \(\lambda _{k}\in \varepsilon _{k} \)-\(\arg \min {\hat{\phi }}_{-k} :=\left\{ \lambda \mid {\hat{\phi }}_{-k}\left( \lambda \right) -\varepsilon _{k}\le \inf {\hat{\phi }}_{-k}\right\} \), the sequence \(\left\{ \lambda _{k}\right\} \) is bounded with all its accumulation points in \(\arg \min {\phi }\). If \(\arg \min \phi \) consists of a unique point \({\bar{\lambda }}\), then \(\lambda _{k}\rightarrow {\bar{\lambda }}\).

Proof

To apply Proposition 12, we note that \(P_{R_{k}}\), as stated in Sect. 4.1.1, has an objective function that is continuous and does not depend on k; hence assumption (i) of Corollary 11 is satisfied. Assumption (ii) of Corollary 11 follows immediately from Proposition 6 and the fact that \(C_{R_{p}}^{s} \) are closed and uniformly bounded sets, due to the box constraints X and \(Y^s\) being bounded. It follows from Proposition 12 that \(\left\{ {\hat{\phi }}_{-k}\left( \cdot \right) \right\} _{k=1}^{\infty }\) epi-converges (and converges uniformly on bounded sets). All other claims follow from applying [55, Theorem 7.33] combined with [55, Exercise 7.32(c)], once we have established that the \({\hat{\phi }}_{-k}\) are eventually level-bounded. This follows from applying [52, Theorem 4C], once we establish that \(0 \in {\text {int}} {\text {dom}} {\text {conv}} \displaystyle \sum _{s\in S\backslash \left\{ s^{\prime }\right\} }\left( -f^{s}+\delta _{C^{s} }\right) \) and \(0 \in {\text {int}} {\text {dom}} {\text {conv}} \left( -f^{s'}+\delta _{C^{s'} }\right) \), noting that the convex conjugate satisfies \(({\text {conv}}g)^{*} = (g)^{*} \). This, in turn, is true whenever \(0 \in {\text {int}} {\text {conv}} C^{s}\), \(\forall s \in S\). \(\square \)

4.2 Motivating example: part 2

We revisit the motivating example in Sect. 2.1. By employing the \(p\)-Lagrangian relaxation as described before, we obtain the p-LDP formulation

$$\begin{aligned} \begin{aligned} \text {max. }_{w, \varDelta w, z, \varDelta x, {\hat{x}}, x}&\ w_{1,2} +\lambda (x_1 + 2x_2 - 1)\\ \text {s.t.: }&\ w_{1,2} = 2^{-1} {\hat{x}}_{1,2,-1} + \varDelta w_{1,2} \\&\ x_2 = 2^{-1} z_{2,-1} + \varDelta x_2 \\&\ 0 \le {\hat{x}}_{1,2,-1} \le z_{2,-1} \\&\ 0 \le x_1 - {\hat{x}}_ {1,2,-1} \le 1 - z_{2,-1} \\&\ 2^{-1} (x_1 - 1) + \varDelta x_2 \le \varDelta w_{1,2} \le 2^{-1} x_1 \\&\ 0 \le \varDelta w_{1,2} \le \varDelta x_2 \\&\ z_{2,-1} \in \{0,1\} \\&\ x_1, x_2 \in [0,1.75] \\&\ \varDelta x_2 \in [0,2^{-1}]. \end{aligned} \end{aligned}$$
(36)

In this example, the variable \(x_2\) is discretised with a precision \(p=-1\) and the Lagrangian multiplier value is fixed to \(-0.100\). Solving the resulting problem with Gurobi 9.0 [34], we obtain the p-Lagrangian dual bound \({\hat{\phi }}_{-1} (-0.100) = 0.800\), which is valid since it is greater than 0.125. Notice, however, that this is not a tight bound, since it was obtained considering an arbitrary Lagrangian multiplier. This bound could be strengthened by solving the p-Lagrangian dual problem employing a nonsmooth optimisation method. In this example, employing a subgradient method [36], the optimal Lagrangian multiplier \(\lambda =-0.333\) and the correspondent value of the dual objective function \(\phi (-0.333)=0.334 \) would be obtained. Indeed, solving the p-LDP with a precision \(p=-1\) and the Lagrangian multiplier value fixed to \(-0.333\) provides the bound \({\hat{\phi }}_{-1}(-0.333)= 0.334\). In the next section, we discuss solution methods for LDPs.
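The subgradient step mentioned above can be sketched on a toy dual (our construction, not the paper's example; the candidate set and all names are hypothetical): the dual function \(\phi (\lambda ) = \max _x \{f(x) + \lambda g(x)\}\) over a finite candidate set is piecewise linear and convex, a subgradient at \(\lambda \) is \(g(x^*)\) for any maximiser \(x^*\), and a diminishing-step subgradient method drives \(\lambda \) towards a minimiser of \(\phi \).

```python
def dual_oracle(lam, candidates):
    """Evaluate phi(lam) = max over candidates (f, g) of f + lam*g and
    return a subgradient, namely g at the maximiser."""
    val, sub = max(((f + lam * g, g) for f, g in candidates),
                   key=lambda t: t[0])
    return val, sub

def subgradient_min(candidates, lam0, iters=200):
    """Plain subgradient descent on the convex dual with steps 1/k
    (no projection is needed since lambda is unconstrained)."""
    lam, best = lam0, float("inf")
    for k in range(1, iters + 1):
        val, sub = dual_oracle(lam, candidates)
        best = min(best, val)   # keep the best dual bound seen so far
        lam -= sub / k          # diminishing step size
    return best

# three primal candidates giving phi(lam) = max(lam, -lam, 0.5) = max(|lam|, 0.5)
toy = [(0.0, 1.0), (0.0, -1.0), (0.5, 0.0)]
bound = subgradient_min(toy, lam0=2.0)
```

In practice, evaluating the oracle amounts to solving the scenario subproblems (29)–(30) with a MIP solver; here a finite candidate set stands in for that step.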

If one were to obtain a tighter bound \({\hat{\phi }}_p\), it would be necessary to choose the parameter p more carefully. In Sect. 5, we present a solution method for p-LDPs that simultaneously finds appropriate values for the Lagrangian multipliers and the precision factor p.

5 A p-Lagrangian decomposition algorithm

Algorithms to solve a nonconvex problem, such as the MIQCQP being considered, usually rely on computing (and successively improving) primal and dual bounds. In the proposed setting, dual bounds (upper bounds for a maximisation problem, lower bounds for a minimisation problem) are computed by finding the optimal value of the p-LDP for a given set of Lagrangian multipliers, whose accuracy is, in turn, regulated by the value of the precision factor p. The primal bound (a lower bound for a maximisation problem, an upper bound for a minimisation problem) can be obtained as the objective function value of a primal feasible solution.

The algorithm for solving the p-LDP combines two strategies: (i) searching for optimal Lagrangian multipliers and (ii) tightening the RNMDT\(_p\) relaxation as the iterations progress by decreasing the parameter p, thus gradually decreasing the UB. However, this approach has a significant potential disadvantage: a rapid increase in the number of binary variables added to the LDP, since all discretised variables are expanded using the same number of partitions. This makes computing the solution of the p-LDP harder due to the accelerated increase in the number of variables and constraints. To mitigate this adverse effect, the algorithm proposed in this section employs the dynamic precision algorithm in [1]. The major benefit of this algorithm is that it only increases the precision (and thus the number of auxiliary variables and constraints) of the variables that will potentially improve (i.e., tighten) the relaxation, which are dynamically chosen at each iteration. Concurrently, convergence of the method is ensured by periodically increasing the precision of the variables that have remained unchanged (that is, that have not been selected in a predefined number of past iterations).

Therefore, the single precision parameter p is replaced with a vector of parameters \(p^s_j\), \(\forall s \in S\), \(j \in DS\), for each scenario-based p-LDP subproblem (30). Each entry determines the number of partitions used to increase the precision of the variable \(y^s_j\), for all \( s \in S\) and \(j \in DS\). The procedure for solving the p-LDP is summarised in Algorithm 1.

Algorithm 1 (pseudocode)

The variables for which the precision will be increased (i.e., \(p_j^s\) will be decreased) are chosen by ranking them using the function

$$\begin{aligned} f_{rank}(s,j) = \sum \limits _{r \in R} \bigg \{ |Q^{s,r}_{j,j}(w^s_{j,j}-(y^s_j)^2)|+2\sum \limits _{i > j \,:\, (i,j)\in QT}|Q^{s,r}_{i,j}(w^s_{i,j}-y^s_iy^s_j)|\bigg \}. \end{aligned}$$
(37)

The parameter \(N_1\) corresponds to the number of top-ranked variables that are selected to have their precision increased, while \(N_2\) is the period (measured in number of iterations) of the periodic precision increase applied to the variables that have remained unchanged (i.e., that have not been selected in the last \(N_2\) iterations).
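The ranking function (37) and the \(N_1\)/\(N_2\) precision-update rules can be sketched as follows. This Python fragment is an illustration under assumed data structures (dictionaries `Q`, `w`, `y` keyed by scenario and variable indices), not the authors' Julia implementation.

```python
def f_rank(Q, w, y, R, QT, s, j):
    """Ranking score of Eq. (37): accumulated mismatch between the bilinear
    terms involving y[s][j] and their RNMDT surrogates w[s][(i, j)]."""
    score = 0.0
    for r in R:
        # diagonal (quadratic) term
        score += abs(Q[(s, r)].get((j, j), 0.0) * (w[s][(j, j)] - y[s][j] ** 2))
        # off-diagonal bilinear terms with i > j appearing in QT
        score += 2 * sum(
            abs(Q[(s, r)].get((i, j), 0.0) * (w[s][(i, j)] - y[s][i] * y[s][j]))
            for (i, jj) in QT if jj == j and i > j
        )
    return score


def update_precisions(p, scores, stale, N1, N2):
    """Tighten (decrement) p for the N1 highest-ranked variables; variables
    not selected for N2 consecutive rounds are tightened as well."""
    chosen = set(sorted(scores, key=scores.get, reverse=True)[:N1])
    for key in p:
        if key in chosen:
            p[key] -= 1
            stale[key] = 0
        else:
            stale[key] += 1
            if stale[key] >= N2:  # periodic refinement ensures p -> -inf
                p[key] -= 1
                stale[key] = 0
```

The periodic branch in `update_precisions` mirrors the convergence safeguard described above: every variable is guaranteed to have its precision increased at least once every \(N_2\) iterations.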

The Lagrangian-based heuristic mentioned in Step 2 of Algorithm 1 can be any method capable of generating primal feasible solutions. In this study, the heuristic employed at each iteration of the p-LD algorithm consists of two core elements. First, using the optimal values of the integer decision variables \({\bar{x}}_j^s\) for the p-LDP calculated at iteration k, the averaged values are computed as follows.

$$\begin{aligned} x_j^{avg} = \frac{\sum \limits _{s \in S}{{\bar{x}}_j^s}}{|S|}, \forall j \in VI. \end{aligned}$$

Then, the optimal values of the variables \({\bar{y}}_i^s\), calculated at iteration k, are used to find a feasible solution of the primal RDEM problem when the integer variables \(x_j^s\) are fixed to the nearest integer (rounded) values of \(x_j^{avg}\), \(\forall j \in VI, s \in S\). This can be achieved by solving to optimality the primal problem with fixed integer variables and using the values \({\bar{y}}_i^s\) as a warm start for the continuous variables. The objective function value of the primal problem is then used as a lower bound (LB) if the value obtained is higher than the existing LB. Hereinafter, we refer to this heuristic for obtaining primal feasible solutions as the Lagrangian-based heuristic. It is worth mentioning that the same strategy is also employed in the dynamic-precision algorithm for solving MIQCQP problems using RNMDT (dp-RNMDT) proposed in [1]. Yet, this strategy is likely to be inefficient in more general settings. Thus, when possible, appropriate knowledge about the problem structure should be exploited for generating primal feasible solutions.
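The first element of the heuristic, averaging the scenario copies of the integer variables and rounding, can be sketched as below. Solving the primal problem with the integers fixed requires a MIQCQP solver and is omitted; the function name and data layout are illustrative assumptions.

```python
def average_and_round(x_bar, VI):
    """Build first-stage integer candidates: average the scenario copies
    x_bar[s][j] of each integer variable j and round half-up (values are
    assumed nonnegative, consistent with the box constraints)."""
    S = list(x_bar)
    return {j: int(sum(x_bar[s][j] for s in S) / len(S) + 0.5) for j in VI}
```

The rounded values would then be fixed in the primal RDEM problem, with the subproblem solutions used as a warm start for the continuous variables.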

The nonsmooth optimisation algorithm mentioned in Step 3 is employed to update the Lagrangian multiplier values. In this paper, we used the bundle method. For a discussion on methods for performing the Lagrangian multiplier updates, please refer to Appendix A. Notice that there are two types of stopping criteria in Algorithm 1. A stop condition of type 1 in Step 4 is a stop criterion for the p-LDP multipliers update algorithm (i.e., the bundle method). A stop condition of type 2 is a condition to stop the whole algorithm, e.g., a time limit, an iteration limit, or a threshold on the relative or absolute gap computed using the incumbent primal and dual bounds.

Aiming to improve the convergence behaviour of the p-LD algorithm when using the bundle method (in Step 3), we modified the initialisation of the bundle method parameters. As a starting value for the centre of mass \(\lambda ^{centre}_0\), we considered the final value of the Lagrangian multipliers from the previous iteration of the p-LD algorithm, with the expectation that it would reduce the number of serious steps required until convergence (for details on the bundle method implementation, please refer to Appendix A).

For the sake of completeness, the following theorem formally states the convergence of Algorithm 1. Notice that the optimality gap obtained from Algorithm 1 is not guaranteed to converge to zero since, while we can provide convergence guarantees for the dual bound (UB), no such guarantee can be provided for the Lagrangian-based heuristic generating the primal bounds (LB). Nevertheless, as illustrated in Sect. 6, Algorithm 1 is often capable of obtaining good solutions with reasonably small optimality gaps.

Theorem 14

Consider the RDEM problem defined in Sect. 1 and corresponding p-Lagrangian dual problem p-LDP with the objective function \({\hat{\phi }}_{p}\) for a fixed value \(p < 0 \) as defined in Sect. 4. Let \(f^*\) be the optimal objective value of the RDEM problem and UB\(^*\) (LB\(^*\)) be the corresponding dual bound (primal bound) value resulting from applying Algorithm 1 to p-LDP. Then, provided that Step 3 of Algorithm 1 utilises a method guaranteed to converge to the optimal Lagrangian multipliers \({\overline{\lambda }}_p := \arg \inf {\hat{\phi }}_{p}\), Algorithm 1 converges to the point UB\(^*\) (LB\(^*\)) that provides an upper (lower) estimate of the optimal objective value, i.e. \(UB^* \ge f^* \ge LB^*\).

Proof

Note that the employment of the bundle method in Step 3 of Algorithm 1 guarantees finite convergence when solving the p-LDP for a given value of p (see, for example, [7, Proposition 5.3.2]). Moreover, note that the precision-update step guarantees that \(p \rightarrow -\infty \) since, at every \(N_2\) iterations, all of the variables that have not had their precision updated have their value \(p_j^s\) decremented by one unit. Therefore, we have that

$$\begin{aligned} UB^* = \lim _{p \rightarrow -\infty }{\left\{ \inf _{\lambda } {\hat{\phi }}_{p} \right\} } \ge f^* \ge LB^*. \end{aligned}$$

by Proposition 3 and Theorem 13. \(\square \)

6 Computational experiments

In this section, we present numerical results for randomly generated nonconvex RDEM problems solved with Algorithm 1 (p-LD). We have generated random instances that replicate the separable structure of two-stage stochastic programming instances. The computational efficiency of p-LD was compared with Gurobi's (version 9.0.0) spatial branching algorithm (Full-space) [34] and the direct employment of dp-RNMDT. All the experiments were implemented in the Julia (version 1.3.1) language [8] using the commercial solver Gurobi (version 9.0.0). All code and instances generated are freely available in the GitHub repository github.com/gamma-opt/p-Lagrangian_relaxation.jl. The code was run on Triton, Aalto University's high-performance computing cluster, on a Dell PowerEdge C6420 node with two Intel Xeon Gold 6148 20-core processors and 192GB of DDR4-2667 memory.

6.1 Design of the experiments

The three methods were applied to solve a collection of randomly generated instances. Each set of instances contained RDEM problems with 5, 10, 15, 20, and 25 scenarios, represented in three sizes (small, medium, and large) as described in Table 1. Thus, the smallest instance has a total of 5 scenarios and, therefore, 100 continuous variables, 25 integer variables, and 225 constraints, while the largest instance has 1000 continuous variables, 375 integer variables, and 1875 constraints. The quadratic matrices \(Q^{s,r}_{i,j}\) for all \(s \in S \), \( r \in \{0\} \cup R \) were randomly generated with approximately 80% density, a number that was arbitrarily chosen to match that observed in instances from [63]. The generation process of each group was replicated five times using different random seeds, forming a total of 75 instances. In all experiments, we considered the first scenario as the reference index for formulating the NAC. More details regarding the instance generation procedure are given in Appendix B.

Table 1 Instance problems dimensions (per scenario)

The p-LD Algorithm 1 was implemented in two versions utilising sequential and parallel computing (using multiple threads). The parallelisation was based on the scenario subproblems of the p-LDP and, for each instance, the number of parallel workers utilised was equal to the number of scenarios in the instance.
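A minimal sketch of this scenario-level parallelisation is given below. The actual implementation is in Julia with one worker per scenario; here, `subproblem_solvers` is a hypothetical list of callables, each solving one scenario subproblem for fixed multipliers, and a thread pool is used only to illustrate the structure (the subproblems are independent once the NACs are relaxed).

```python
from concurrent.futures import ThreadPoolExecutor

def dual_value(subproblem_solvers, lam, workers):
    """Evaluate the decomposed Lagrangian dual function: solve each
    independent scenario subproblem (possibly in parallel) at fixed
    multipliers lam and sum their optimal values."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(lambda solve: solve(lam), subproblem_solvers))
```

In practice, each callable would wrap a MIP solve of one p-LDP subproblem; the dual function value is simply the sum of the subproblem optima.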

Table 2 presents the parameter values for p-LD, including the parameter values for the dynamic precision-based algorithm and for the bundle method used for updating the Lagrangian multipliers within p-LD. The termination tolerance for the dynamic precision-based algorithm was used to control the gap between primal and dual bounds. In turn, the termination tolerance for the bundle method was based on the pairwise differences between the Lagrangian dual function values considering the last three iterations (see Sect. 5 and Appendix A, respectively, for more details on the termination criteria; a discussion on setting different values for \(N_1\) and \(N_2\) and their impact on the algorithm's performance is given in [1]). It is worth mentioning that both methods can be sensitive to the initial parameter values, which were set based on preliminary experiments on the smaller instances.

Table 2 Algorithm parameters

6.2 Numerical results

Table 3 presents the average values of the optimality gaps achieved within the predefined time limit and the average running times (i.e., the time required by the algorithms to converge or, otherwise, until termination). The optimality gaps were calculated as the relative difference between the UB attained using one of the methods to solve the dual problem and the LB obtained by generating the primal feasible solution using the Lagrangian-based heuristic. In the rows highlighted with the symbol “*”, we discarded the random seeds for which the heuristic method found a LB yielding a relative gap higher than 500% for p-LD. The values are shown for all three methods, i.e., Solver (employing Gurobi to solve the RDEM problem), dp-RNMDT, and p-LD. The p-LD algorithm was executed serially (i.e., with no parallelisation) and also parallelising the solution of the p-LDP. To highlight the potential improvement that the parallelisation provides, we present in the last column the time taken for the parallel p-LD to perform the same number of iterations observed in the sequential execution of p-LD. We highlight (with bold font) the cells corresponding to the serial method having superior performance in terms of optimality gaps and running time at convergence (indicated by the solution time in seconds) or termination due to the time limit of 3600s (indicated by the letter “T” in the column “Time (s)”).

From the numerical results, one can conclude that the proposed method performed better than the commercial solver and dp-RNMDT. Solving the full-space problem never converged within the time limit set and never attained a gap better than roughly 130%, highlighting how challenging these problems are from a computational standpoint, even for relatively small instances. Comparing dp-RNMDT and sequential p-LD, one can notice that for most instances, with the exception of those with 5 scenarios, utilising the p-LD method allows for obtaining relative gaps up to six times smaller, as is the case for the medium instance with 25 scenarios. Moreover, the advantage of dp-RNMDT in terms of the gap attained for the instances with 5 scenarios is partially due to the heuristic method employed for the LB generation, which, in some of the experiments, happened to generate better primal bounds from the solution obtained for dp-RNMDT. Nevertheless, another takeaway is the superior performance of p-LD in terms of solution time for two-thirds of the instances. Furthermore, the parallelised version of p-LD yields improvements in solution time of up to approximately 5 times, as is the case for large instances with 25 scenarios.

Table 3 Average results of the experiments

Table 4 provides more information on the performance of dp-RNMDT and sequential p-LD. The numerical results for the generated instances are organised in groups by random seed and, in each group, two columns are associated with each of the techniques. The first column, “UB”, reports the relative difference between the UBs generated by the two methods, which was calculated as

$$\begin{aligned} \frac{\text {UB} - \min (\text {UB}_{\text {dp-RNMDT}}, \text {UB}_{p\text {-LD}})}{\min (\text {UB}_{\text {dp-RNMDT}}, \text {UB}_{p\text {-LD}})} \times 100, \end{aligned}$$
(38)

where “UB\(_{\text {dp-RNMDT}}\)” and “UB\({_{p\text {-LD}}}\)” are the UBs generated by dp-RNMDT and p-LD, respectively, and “UB” is the UB generated by the method considered in the column. Notice that, for the same random seed, an entry with value \(0.00\%\) indicates that the method found the best UB for that instance, while the other entries show how much worse (larger) the remaining UBs are.
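For reference, formula (38) amounts to the following simple computation (illustrative Python, hypothetical function name):

```python
def relative_ub_difference(ub, ub_dp_rnmdt, ub_p_ld):
    """Eq. (38): percentage by which a method's UB exceeds the best
    (smallest) UB found for the instance by either method."""
    best = min(ub_dp_rnmdt, ub_p_ld)
    return (ub - best) / best * 100.0
```

By construction, the method attaining the best UB reports 0.00%, and the other reports its relative excess.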

As before, we highlight (with bold font) the cells corresponding to the method having superior (or equal) performance. The second column, “Time (s)”, reports the time required by the method to converge, if the cell contains a number, or indicates with the letter “T” that it was terminated due to the time limit of 3600s. Analogously, we emphasise in bold font the cases in which the corresponding method converged faster. The last row in both columns summarises this information, showing the number of cases in which the method generated an equal or better bound (column “UB”) and the number of instances for which the method converged (column “Time (s)”). As one can observe from the table, in each random seed-based group, sequential p-LD presented superior performance under both criteria.

Table 4 Comparison analysis of the dp-RNMDT and p-LD convergence

Figure 1 illustrates the progress of the parallel p-LD with respect to the duality gap and the elapsed time for three arbitrarily selected instances of different sizes, each with 15 scenarios. As can be seen, the plots indicate the existence of a threshold value in the number of iterations performed by p-LD after which the reduction rate in the duality gap decreases considerably, while the computational time required per iteration continues to steadily increase (noticeable by the upward curvature of the time curve). The latter is a consequence of the increase in the number of binary variables in the p-LDP as the precision of the relaxation increases.

In the experiments conducted, we noticed that the time taken in each iteration is mostly (roughly 90%) dedicated to solving the p-LDP (i.e., the MIP scenario subproblems), while most of the remainder is spent on the heuristic used to obtain primal feasible solutions. Only a negligible share of the iteration time (considerably less than 1%) is taken by the bundle method to update the Lagrangian multipliers. This pattern was observed for all instances, regardless of the number of scenarios or the size of the scenario subproblems, which explains the observed benefits arising from parallelisation.

Fig. 1 p-LD gap and time plots for instances with 15 scenarios

6.2.1 Performance profiles

To provide a structured comparison between p-LD and dp-RNMDT, we present performance profiles based on [30]. Let P be the set of all problem instances and A be the set of algorithms used to solve the problem instances \(p \in P\), i.e., dp-RNMDT, p-LD, and parallelised p-LD. Let \(t_{p, a}\) be the computing time required by algorithm \(a \in A\) to solve problem \(p \in P\). For all \(p \in P\) and \(a \in A\), let the time performance ratio be defined as

$$\begin{aligned} r^t_{p,a} = \frac{t_{p,a}}{\min _{a \in A}\{t_{p,a}\}}. \end{aligned}$$

Let \(P^t_{\tau ,a} = \{p \in P : r^t_{p, a}\le \tau \}\) and, for every \(a \in A\), let the overall assessment of the time performance be defined as

$$\begin{aligned} \rho _a^t(\tau )=\frac{|P^t_{\tau ,a}|}{|P|}. \end{aligned}$$

Analogously, let \(ub_{p, a}\) be the UB achieved by the algorithm \(a \in A\) for the problem \(p \in P\). Let the UB performance ratio be defined as

$$\begin{aligned} r^{ub}_{p,a} = \frac{ub_{p,a}}{\min _{a \in A }\{ub_{p,a}\}}. \end{aligned}$$

Let \(P^{ub}_{\tau , a} = \{p \in P : r^{ub}_{p, a}\le \tau \}\) and let the UB performance assessment be defined as

$$\begin{aligned} \rho _a^{ub}(\tau )=\frac{|P^{ub}_{\tau , a}|}{|P|}. \end{aligned}$$

Figures 2 and 3 present the dual (upper) bound and time performance profiles, respectively. In Fig. 2, the parallelised version of p-LD is excluded because the parallelisation did not considerably affect the quality of the bounds generated. In both figures, the horizontal axes are plotted on a logarithmic scale.
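The profiles can be computed directly from the definitions above. The sketch below (illustrative Python, assumed dictionary layout) normalises the counts by the number of instances, so that \(\rho_a(\tau)\) is the fraction of instances on which algorithm a is within a factor \(\tau\) of the best algorithm.

```python
def performance_profile(metric, tau_grid):
    """metric[a][p] holds algorithm a's value (time or UB) on instance p.
    Returns rho[a]: for each tau in tau_grid, the fraction of instances
    whose ratio to the best algorithm's value is at most tau."""
    algs = list(metric)
    probs = list(next(iter(metric.values())))
    # per-instance best value over all algorithms
    best = {p: min(metric[a][p] for a in algs) for p in probs}
    return {
        a: [sum(metric[a][p] / best[p] <= tau for p in probs) / len(probs)
            for tau in tau_grid]
        for a in algs
    }
```

The same routine serves for both the time profile (\(t_{p,a}\)) and the UB profile (\(ub_{p,a}\)), since only the underlying metric changes.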

Fig. 2 Performance profile based on the upper bounds

Fig. 3 Performance profiles based on the convergence time

As one can observe from Figs. 2 and 3, p-LD demonstrated superior performance when compared to dp-RNMDT under both the bound quality and time performance criteria. In addition, Fig. 3 indicates that the computational performance superiority of p-LD can be reinforced by means of parallelisation techniques.

7 Conclusions

In this paper, we proposed a novel decomposition method named p-Lagrangian decomposition, which consists of an alternative framework for achieving decomposition of nonconvex MIQCQP problems. The core idea of the p-Lagrangian decomposition is to combine two techniques: Lagrangian decomposition and RNMDT. Lagrangian decomposition is a broadly known decomposition framework commonly applied to large-scale constrained optimisation problems with exploitable structure, which is the case whenever the relaxation of the complicating constraints results in a decomposable version of the original problem. Nevertheless, the nonconvex nature of the primal problem may lead to substantial issues, in particular the necessity of solving nonconvex problems when evaluating the Lagrangian dual function. We address this issue by applying a mixed-integer-based relaxation technique named RNMDT. Consequently, the primal problem is converted into a decomposable mixed-integer problem with a significantly more tractable Lagrangian dual function.

The values of the Lagrangian multipliers along with the value of the precision parameter p of the RNMDT allow for controlling the quality of the relaxation. Therefore, we proposed a new algorithm named p-LD inspired by the dynamic-precision based method developed in [1] combined with the bundle method for updating the Lagrangian multipliers. Additionally, the decomposable structure of the Lagrangian dual problem is amenable to parallelisation, which can significantly enhance the computational performance.

The numerical experiments suggest that the p-LD algorithm has considerable advantages over the commercial solver Gurobi in obtaining dual bounds within the predefined time limit. The experiments also indicate that significant savings in computational time may be achieved when introducing parallel computing.

Despite the promising performance suggested by the numerical results, the p-LD algorithm has two important shortcomings. The first is the dependence on a Lagrangian-based heuristic for generating primal feasible solutions, which may fail to attain a desired optimality tolerance. The second relates to the duality gap arising from the mixed-integer nature of the primal problem combined with the imprecision of the RNMDT relaxation.

Therefore, future research should consider efficient ways to incorporate the p-Lagrangian decomposition framework within a branch-and-bound setting, which could potentially mitigate both issues. In particular, we believe that further advancement of the proposed method could be achieved by considering augmented Lagrangian instead [22]. Furthermore, the employment of the p-Lagrangian relaxation for bound generation in a branch-and-bound framework could bring new light to the classic approach for solving nonconvex mixed-integer non-linear problems such as a combination of the symbolic reformulation and spatial branch-and-bound algorithm [58], thus enhancing the computational efficiency of the method. Additionally, the p-Lagrangian relaxation could be employed for models arising from equilibrium problems [63], which naturally yield large-scale MIQCQPs with separable structure.

Moreover, synergies with the alternating direction method of multipliers (or Gauss-Seidel approaches, such as those explored in [47]) and the employment of inexact variants of bundle methods (which would allow for controlled imprecision in the solution of the p-Lagrangian dual problems; see [27, 28]) are interesting avenues for improving the computational performance of the method. Finally, from a theoretical standpoint, it would be interesting to investigate the convergence rate of the proposed method and whether it could be improved by considering alternative nonsmooth optimisation methods for solving the p-LDPs.