Dissecting the duality gap: the supporting hyperplane interpretation revisited

We revisit the classic supporting hyperplane illustration of the duality gap for non-convex optimization problems. It is refined by dissecting the duality gap into two terms: the first measures the degree of near-optimality in a Lagrangian relaxation, while the second measures the degree of near-complementarity in the Lagrangian relaxed constraints. We also give an example of how this dissection may be exploited in the design of a solution approach within discrete optimization.


Background
We consider the primal problem of finding

$$
\begin{aligned}
f^* := \min\ & f(x) && \text{(1a)} \\
\text{subject to}\ & g(x) \le 0^m, && \text{(1b)} \\
& x \in X, && \text{(1c)}
\end{aligned}
$$

where the set $X \subseteq \mathbb{R}^n$ and the functions $f : \mathbb{R}^n \to \mathbb{R}$ and $g : \mathbb{R}^n \to \mathbb{R}^m$. With the vector $u \in \mathbb{R}^m_+$ of Lagrangian multipliers for the constraint (1b), the dual function associated with the Lagrangian relaxation of this constraint is

$$\theta(u) := \min_{x \in X} \left\{ f(x) + u^{\mathrm{T}} g(x) \right\}, \tag{2}$$

while

$$\theta^* := \sup_{u \in \mathbb{R}^m_+} \theta(u) \tag{3}$$

is the Lagrangian dual problem.

We assume that the set $X$ is non-empty and compact, and that the functions $f$ and $g$ are continuous on $X$. Then the relaxed problem (2) has an optimal solution for every $u \in \mathbb{R}^m_+$. We further assume that the primal problem fulfils some constraint qualification which ensures that the dual problem (3) has an optimal solution (such as a Slater condition, see e.g. [1, Proposition 2.4.1]). Optimal solutions to problems (1) and (3) are denoted $x^*$ and $u^*$, respectively. The duality gap for the primal-dual pair is $\Gamma := f^* - \theta^*$.

To ensure that the duality gap is zero, the primal problem must have a convexity property; cf. [2, Theorem 6.2.4], [3, Chapter 5], and [4, Chapter 6]. If the primal problem is non-convex (e.g., a discrete optimization problem), a positive duality gap can be expected. (Readers who are not well acquainted with Lagrangian duality are referred to e.g. [2-4].)

If the duality gap is zero, optimal solutions to both the primal problem (1) and its Lagrangian dual problem (3) are characterized by the following global optimality conditions, stated for a pair $(x, u) \in X \times \mathbb{R}^m_+$:

$$
\begin{aligned}
f(x) + u^{\mathrm{T}} g(x) &\le \theta(u), && \text{(4a)} \\
g(x) &\le 0^m, && \text{(4b)} \\
u^{\mathrm{T}} g(x) &= 0. && \text{(4c)}
\end{aligned}
$$

The interpretation of these three conditions is optimality in the Lagrangian relaxed problem (2), feasibility in the relaxed constraint (1b), and complementarity in this constraint, respectively. The following theorem, which can be found in e.g. [2, Theorem 6.2.5], establishes the equivalence of the consistency of the system (4) and primal-dual optimality with a zero duality gap.
Theorem 1.1 (primal-dual optimality condition) A pair $(x, u) \in X \times \mathbb{R}^m_+$ satisfies the system (4) if and only if $x$ solves the primal problem (1), $u$ solves the dual problem (3), and $f^* = \theta^*$ holds.
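Theorem 1.1 can be checked by enumeration on a small instance. The sketch below is illustrative only: the knapsack data are hypothetical (not taken from the paper), $m = 1$, and the three conditions of the system (4) are assumed to read $f(x) + u g(x) \le \theta(u)$, $g(x) \le 0$, and $u g(x) = 0$. The dual is solved exactly by evaluating the concave, piecewise linear function $\theta$ at $u = 0$ and at every non-negative breakpoint.

```python
from fractions import Fraction
from itertools import product


def analyze(values, weights, capacity):
    """Return (f_star, theta_star, system_4_consistent) for the toy problem
    min -values.x  s.t.  weights.x - capacity <= 0,  x in {0,1}^n."""
    X = list(product((0, 1), repeat=len(values)))

    def f(x):
        return -sum(v * xj for v, xj in zip(values, x))

    def g(x):
        return sum(w * xj for w, xj in zip(weights, x)) - capacity

    def theta(u):
        # Dual function: minimum of the Lagrangian over the finite set X.
        return min(f(x) + u * g(x) for x in X)

    # theta is concave and piecewise linear, so its maximum over u >= 0 is
    # attained at u = 0 or where two Lagrangian lines intersect.
    candidates = {Fraction(0)}
    for a in X:
        for b in X:
            if g(a) != g(b):
                u = Fraction(f(b) - f(a), g(a) - g(b))
                if u >= 0:
                    candidates.add(u)
    u_star = max(candidates, key=theta)

    f_star = min(f(x) for x in X if g(x) <= 0)

    # Assumed form of system (4): Lagrangian optimality, feasibility,
    # and complementarity, checked for the dual optimal u_star.
    consistent = any(
        f(x) + u_star * g(x) <= theta(u_star) and g(x) <= 0 and u_star * g(x) == 0
        for x in X
    )
    return f_star, theta(u_star), consistent


if __name__ == "__main__":
    # Hypothetical data: capacity 6 gives a positive gap, capacity 7 a zero gap.
    for capacity in (6, 7):
        f_star, theta_star, consistent = analyze((6, 5, 4), (4, 3, 3), capacity)
        print(capacity, f_star, theta_star, f_star - theta_star, consistent)
```

With capacity 6 the gap is positive and no $x$ satisfies the three conditions together with the dual optimal multiplier; with capacity 7 the gap closes and the system becomes consistent, in line with the theorem.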
A conclusion from this theorem is that the system (4) is inconsistent whenever $u$ is not optimal in the Lagrangian dual problem (3) or there is a positive duality gap. When the duality gap is zero and the dual vector is optimal in the Lagrangian dual problem (3), the result of Theorem 1.1 can be used to characterize all optimal solutions to the primal problem (1).
Corollary 1.1 (characterization of optimal primal solutions) If $f^* = \theta^*$ holds and $u$ solves the dual problem (3), then an $x \in X$ solves the primal problem (1) if and only if it, together with $u$, satisfies the system (4).
This characterization has been generalized [6, Proposition 5] to allow for a positive duality gap, the use of a $u \in \mathbb{R}^m_+$ that is not necessarily optimal in the dual problem (3), and also to describe near-optimal solutions to the primal problem (1). This generalization is based on the following relaxed global optimality conditions for the problem (1). Here, $\beta \in \mathbb{R}_+$ and again we let $(x, u) \in X \times \mathbb{R}^m_+$.

$$
\begin{aligned}
f(x) + u^{\mathrm{T}} g(x) &\le \theta(u) + \varepsilon, && \text{(5a)} \\
g(x) &\le 0^m, && \text{(5b)} \\
u^{\mathrm{T}} g(x) &\ge -\delta, && \text{(5c)} \\
\varepsilon + \delta &\le f^* - \theta(u) + \beta. && \text{(5d)}
\end{aligned}
$$
Note that the quantities $\varepsilon$ and $\delta$ are always non-negative whenever $(x, u) \in X \times \mathbb{R}^m_+$ and (5) hold. They capture near-optimality in the Lagrangian relaxed problem (2) and near-complementarity in the relaxed constraint (1b), respectively. The following theorem is a restatement of [6, Proposition 5].
Theorem 1.2 (characterization of near-optimal primal solutions) For any given $u \in \mathbb{R}^m_+$, an $x \in X$ is $\beta$-optimal in the primal problem (1) if and only if it, together with $u$ and some values of $\varepsilon$ and $\delta$, satisfies the system (5).
Note that for any $u \in \mathbb{R}^m_+$ and the choice $\beta = 0$, the system (5) characterizes all primal optimal solutions. Further, if the duality gap is zero, $u$ solves the dual problem, and $\beta = 0$, this characterization reduces to that of Corollary 1.1.
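The characterization in Theorem 1.2 can be simplified by introducing the function $\varepsilon : X \times \mathbb{R}^m_+ \to \mathbb{R}_+$ with $\varepsilon(x, u) := f(x) + u^{\mathrm{T}} g(x) - \theta(u)$, which for a given $u$ measures the degree of near-optimality of an $x \in X$ in the Lagrangian relaxation, and the function $\delta : X \times \mathbb{R}^m_+ \to \mathbb{R}$ with $\delta(x, u) := -u^{\mathrm{T}} g(x)$, which for a given $u$ measures the degree of near-complementarity of an $x \in X$ in the relaxed constraint.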
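The two measures can be evaluated by enumeration on a small instance. The following is a minimal sketch, assuming $\varepsilon(x, u) = f(x) + u g(x) - \theta(u)$ and $\delta(x, u) = -u g(x)$ for $m = 1$; the knapsack data are hypothetical and not taken from the paper.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy instance (not from the paper):
#   min f(x) = -(6*x1 + 5*x2 + 4*x3)  s.t.  g(x) = 4*x1 + 3*x2 + 3*x3 - 6 <= 0.
VALUES = (6, 5, 4)
WEIGHTS = (4, 3, 3)
CAPACITY = 6
X = list(product((0, 1), repeat=len(VALUES)))


def f(x):
    return -sum(v * xj for v, xj in zip(VALUES, x))


def g(x):
    return sum(w * xj for w, xj in zip(WEIGHTS, x)) - CAPACITY


def theta(u):
    """Dual function: minimum of the Lagrangian over the finite set X."""
    return min(f(x) + u * g(x) for x in X)


def epsilon(x, u):
    """Near-optimality of x in the Lagrangian relaxation for the given u."""
    return f(x) + u * g(x) - theta(u)


def delta(x, u):
    """Near-complementarity of x in the relaxed constraint for the given u."""
    return -u * g(x)


def dissect(u):
    """Map every primal feasible x to its pair (epsilon, delta)."""
    return {x: (epsilon(x, u), delta(x, u)) for x in X if g(x) <= 0}


if __name__ == "__main__":
    # u = 3/2 is dual optimal for this instance; u = 1 is not.
    for u in (Fraction(3, 2), Fraction(1)):
        for x, (eps, dlt) in dissect(u).items():
            print(u, x, eps, dlt, f(x) - theta(u))
```

For each feasible $x$ the last printed column, $f(x) - \theta(u)$, equals the sum of the two measures, and both measures are non-negative.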
Note that the identity

$$f(x) - \theta(u) = \varepsilon(x, u) + \delta(x, u) \tag{6}$$

holds for any choice of primal feasible solution $x$ and $u \in \mathbb{R}^m_+$. Further, for any such choice, the identity (6) provides a dissection of the difference between the primal and dual objective values into a non-negative Lagrangian near-optimality term and a non-negative near-complementarity term. In particular, $\Gamma = f^* - \theta^* = \varepsilon(x^*, u^*) + \delta(x^*, u^*)$. With this new notation, Theorem 1.2 can be restated as follows.

Supporting hyperplane illustrations
We now consider the case $m = 1$, introduce auxiliary variables $z \in \mathbb{R}$ and $v \in \mathbb{R}$, which describe values of the functions $f$ and $g$, respectively, and define the set

$$(g, f)(X) := \{ (v, z) \in \mathbb{R}^2 : v = g(x),\ z = f(x),\ x \in X \}.$$

The Lagrangian relaxed problem (2) can then be restated as

$$
\begin{aligned}
\theta(u) = \min\ & z + u v && \text{(7a)} \\
\text{subject to}\ & (v, z) \in (g, f)(X). && \text{(7b)}
\end{aligned}
$$

Figures 1 and 2 show the classical geometric illustrations of Lagrangian dualization, see e.g. [2], for the case of a zero and a positive duality gap, respectively. Here, and in the remainder of this section, $\cdot^*$ denotes an optimal value. Points in the set $(g, f)(X)$ with $g(x) \le 0$ are indicated by the gray area, and ($g^*$, [...]

Fig. 2 Classical illustration for a positive duality gap

The functions $\varepsilon$ and $\delta$ are now introduced, and in Fig. 3 we show their optimal values. Since $\varepsilon(x^*, u^*)$ measures the degree of near-optimality of $x^* \in X$ in the Lagrangian relaxation (2), the line $z$ [...]

Fig. 3 Geometric interpretation of $\varepsilon$ and $\delta$ for $x^*$ and $u^*$

Next, in Fig. 4, we illustrate the dissection of $f(\bar{x}) - \theta(\bar{u})$ into $\varepsilon(\bar{x}, \bar{u})$ and $\delta(\bar{x}, \bar{u})$ for a non-optimal primal feasible solution $\bar{x}$ and a non-optimal $\bar{u} \in \mathbb{R}^m_+$ [...] $(g(\bar{x}), f(\bar{x}))$. Since $\bar{u}$ is not optimal, the line $z + \bar{u} v = \theta(\bar{u})$ supports the set $(g, f)(X)$ at only one point (which may correspond to an $x \in X$ that is feasible or infeasible). The construction of the geometric interpretation of $\varepsilon(\bar{x}, \bar{u})$ and $\delta(\bar{x}, \bar{u})$ follows the same arguments as in Fig. 3.

Fig. 4 Geometric interpretation of $\varepsilon$ and $\delta$ for non-optimal $\bar{x}$ and $\bar{u}$

To make the interpretations very concrete, we conclude this section with a detailed analysis of a numerical example, which is a knapsack problem.

A practical implication
The purpose of this section is to illustrate how the quantities ε and δ can be exploited when designing solution approaches for certain problem structures. Preliminary results along this line of research are presented in [8]. We here present a slight extension of the findings from that reference.
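We consider the Set Covering Problem (SCP) stated as

$$
\begin{aligned}
z^*_{IP} := \min\ & \sum_{j \in J} c_j x_j \\
\text{subject to}\ & \sum_{j \in J} a_{ij} x_j \ge 1, \quad i \in I, \\
& x \in \{0, 1\}^n.
\end{aligned}
$$

Lagrangian relaxation of the covering constraints with multipliers $u \in \mathbb{R}^m_+$ gives the dual function

$$h(u) := \sum_{i \in I} u_i + \min_{x \in \{0,1\}^n} \sum_{j \in J} \Big( c_j - \sum_{i \in I} u_i a_{ij} \Big) x_j .$$

The dual problem is $h^* = \max_{u \in \mathbb{R}^m_+} h(u)$. This Lagrangian relaxation has the integrality property [9, p. 177]. Hence, $h^*$ coincides with the optimal value of the linear programming relaxation of the SCP. Further, any optimal solution to the dual of the latter problem is an optimal solution to the Lagrangian dual problem. Since the upper bounds on the variables are redundant in SCP, we may consider the linear programming relaxation without these bounds. Let $u^*$ be an optimal dual solution to this problem. Then $\bar{c}_j := c_j - \sum_{i \in I} u^*_i a_{ij} \ge 0$ holds for all $j \in J$.

For the SCP, the functions $\varepsilon$ and $\delta$ are defined on $\{0, 1\}^n \times \mathbb{R}^m_+$ and take the forms

$$\varepsilon(x, u) = \sum_{j \in J} c_j x_j - \sum_{i \in I} u_i \Big( \sum_{j \in J} a_{ij} x_j - 1 \Big) - h(u), \qquad \delta(x, u) = \sum_{i \in I} u_i \Big( \sum_{j \in J} a_{ij} x_j - 1 \Big).$$

Let $\bar{x}$ be any feasible solution to SCP. Since $h^* = \sum_{i \in I} u^*_i$, we get

$$\varepsilon(\bar{x}, u^*) = \sum_{j \in J} \bar{c}_j \bar{x}_j .$$

Further,

$$\delta(\bar{x}, u^*) = \sum_{i \in I} u^*_i \Big( \sum_{j \in J} a_{ij} \bar{x}_j - 1 \Big).$$

From (6) we have that $\sum_{j \in J} c_j \bar{x}_j - h^* = \varepsilon(\bar{x}, u^*) + \delta(\bar{x}, u^*)$, and in particular if $\bar{x}$ is optimal we obtain that $\Gamma = \varepsilon(\bar{x}, u^*) + \delta(\bar{x}, u^*)$.

Table 1 Objective values $z^*_{IP}$ in bold are proven optimal. Columns $\delta^{\min}_{rel}$ and $\delta^{\max}_{rel}$ give the minimal and maximal relative near-complementarity for the instance. Recall that $\varepsilon_{rel} = 1 - \delta_{rel}$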
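The expressions for the SCP can be tried out on a tiny instance. The sketch below is illustrative only: the instance (three rows, four columns) and the names `A`, `C`, `U_STAR`, and `X_LP` are hypothetical, and the dual vector is certified as LP optimal inside the code by weak duality rather than by calling an LP solver. It assumes $\varepsilon(\bar{x}, u^*) = \sum_j \bar{c}_j \bar{x}_j$ and $\delta(\bar{x}, u^*) = \sum_i u^*_i (\sum_j a_{ij} \bar{x}_j - 1)$.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

# Hypothetical SCP instance: A[i][j] = 1 if column j covers row i.
A = [[1, 0, 1, 1],
     [1, 1, 0, 1],
     [0, 1, 1, 1]]
C = [1, 1, 1, 2]                 # column costs
U_STAR = [HALF, HALF, HALF]      # candidate optimal LP dual solution
X_LP = [HALF, HALF, HALF, 0]     # fractional primal solution, used only as a certificate


def cost(x):
    return sum(c * xj for c, xj in zip(C, x))


def coverage(x):
    """Number of times each row is covered by x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def reduced_costs(u):
    return [C[j] - sum(u[i] * A[i][j] for i in range(len(A))) for j in range(len(C))]


def is_lp_optimal_pair(x, u):
    """Weak duality certificate: primal and dual feasible with equal objective values."""
    primal_ok = all(xj >= 0 for xj in x) and all(cov >= 1 for cov in coverage(x))
    dual_ok = all(ui >= 0 for ui in u) and all(rc >= 0 for rc in reduced_costs(u))
    return primal_ok and dual_ok and cost(x) == sum(u)


def epsilon(x, u):
    """Lagrangian near-optimality; this form assumes non-negative reduced costs."""
    return sum(rc * xj for rc, xj in zip(reduced_costs(u), x))


def delta(x, u):
    """Near-complementarity: dual-weighted excess coverage."""
    return sum(ui * (cov - 1) for ui, cov in zip(u, coverage(x)))


if __name__ == "__main__":
    print(is_lp_optimal_pair(X_LP, U_STAR))
    for x in [(1, 1, 0, 0), (0, 0, 0, 1), (1, 1, 1, 0)]:
        print(x, cost(x) - sum(U_STAR), epsilon(x, U_STAR), delta(x, U_STAR))
```

In this toy instance the two optimal covers dissect the same gap differently: the cover using two cheap columns has all of the gap in the near-complementarity term, while the cover using the single expensive column has all of it in the near-optimality term.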
We study 11 challenging SCP problem instances taken from the OR-Library [10]; details concerning the computational setup can be found in [8]. The instances are listed in Table 1. The first five are artificial and taken from [11], and the other six originate from a rail crew scheduling application [12]. The former have a density of 5% (by construction) and the latter have densities between 0.2% and 1.3%. Three of the instances could be solved to proven optimality.
For the optimal or best found solution, denoted $x^*$, and its objective value $z^*_{IP}$, we calculate the following quantities: relative gap $\Gamma_{rel} := (z^*_{IP} - h^*)/h^*$, relative near-optimality $\varepsilon_{rel} := \varepsilon(x^*, u^*)/(z^*_{IP} - h^*)$, and relative near-complementarity $\delta_{rel} := \delta(x^*, u^*)/(z^*_{IP} - h^*)$. We also calculate the quantity

$$\text{Average Excess Coverage (AEC)} := \frac{1}{m} \sum_{i \in I} \Big( \sum_{j \in J} a_{ij} x^*_j - 1 \Big).$$

Note that $\varepsilon_{rel} + \delta_{rel} = 1$.

The functions $\varepsilon$ and $\delta$ depend on both $x$ and $u$. Hence, if there are alternative optimal primal or dual solutions, then the contributions of $\varepsilon$ and $\delta$ to the duality gap may vary between these solutions; this was noticed already in [6]. To study this aspect, we solved the two problems

$$
\begin{aligned}
\delta^{\min} / \delta^{\max} := \min / \max\ & \delta(x, u^*) \\
\text{subject to}\ & \sum_{j \in J} a_{ij} x_j \ge 1, \quad i \in I, \\
& \sum_{j \in J} c_j x_j \le z^*_{IP}, \\
& x \in \{0, 1\}^n.
\end{aligned}
$$

Their optimal values give the full range for $\delta(x, u^*)$ over all solutions to SCP that are at least as good as $x^*$. These problems are sometimes harder to solve than the original SCP, but most were solved to proven optimality. Detailed results are given in Table 1. (The analysis of the full range of $\delta(x, u)$ with respect to both optimal $x$ and $u$ is a much more complex task.)

As can be seen in the table, the primal-dual gap $\varepsilon(x^*, u^*) + \delta(x^*, u^*)$ can be caused by either of the terms. For the first five instances, this gap is vastly dominated by the violation of complementarity, while for the rail instances it can be composed of either the Lagrangian near-optimality term or the near-complementarity term, or a combination of them. Further, large gaps are consistently caused solely by violation of complementarity, due to excess coverage of constraints.
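The relative measures, the AEC, and the range of $\delta$ can be sketched on a tiny instance where enumeration replaces the integer programs. This is illustrative only: the instance and the dual vector `U_STAR` are hypothetical (the dual vector is LP optimal for this instance, with all reduced costs non-negative), and $\delta(x, u^*)$ is assumed to be the dual-weighted excess coverage.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)

# Hypothetical SCP instance: A[i][j] = 1 if column j covers row i.
A = [[1, 0, 1, 1],
     [1, 1, 0, 1],
     [0, 1, 1, 1]]
C = [1, 1, 1, 2]
U_STAR = [HALF, HALF, HALF]   # LP dual optimal for this instance, so h* = sum(U_STAR)


def cost(x):
    return sum(c * xj for c, xj in zip(C, x))


def coverage(x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def is_cover(x):
    return all(cov >= 1 for cov in coverage(x))


def delta(x, u):
    return sum(ui * (cov - 1) for ui, cov in zip(u, coverage(x)))


def relative_measures(x, u):
    """Relative gap, relative epsilon and delta, and average excess coverage."""
    h_star = sum(u)
    gap = cost(x) - h_star          # assumed positive
    d = delta(x, u)
    return {
        "gap_rel": gap / h_star,
        "eps_rel": (gap - d) / gap,  # epsilon obtained from the identity gap = eps + delta
        "delta_rel": d / gap,
        "aec": Fraction(sum(cov - 1 for cov in coverage(x)), len(A)),
    }


def delta_range(x_best, u):
    """Min and max of delta over all covers that cost at most as much as x_best."""
    values = [
        delta(x, u)
        for x in product((0, 1), repeat=len(C))
        if is_cover(x) and cost(x) <= cost(x_best)
    ]
    return min(values), max(values)


if __name__ == "__main__":
    x_best = (1, 1, 0, 0)
    print(relative_measures(x_best, U_STAR))
    print(delta_range(x_best, U_STAR))
```

Enumeration is only viable for toy sizes; for instances such as those in Table 1 the two range problems would be passed to an integer programming solver.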
Our observations can be utilized when designing core problem solution strategies for classes of set covering problems with known characteristics. A core problem is a restricted but feasible version of an original problem; such a problem should be of a manageable size and is constructed by selecting a subset of the original variables, see for example [12]. Our results indicate that if the duality gap is expected to be large then it can also be expected that the near-optimality term is relatively small. Since $\varepsilon(x^*, u^*) = \sum_{j \in J} \bar{c}_j x^*_j \ge 0$, it is then likely that $x^*_j = 0$ holds whenever $\bar{c}_j$ is large. Therefore, variables with large values of $\bar{c}_j$ can most likely be excluded from the core problem. Otherwise, if the gap is expected to be moderate, then the near-optimality term can be relatively large, and therefore the core problem should also contain variables with relatively large reduced costs. These conclusions give a theoretical justification of the core problem construction used in [12].
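The selection idea can be sketched as a reduced-cost filter. This is a minimal illustration under stated assumptions, not the construction used in [12]: the instance, the dual vector `U_STAR`, the function names, and the threshold values are all hypothetical, and the restricted problem is solved by enumeration.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)

# Hypothetical SCP instance: A[i][j] = 1 if column j covers row i.
A = [[1, 0, 1, 1],
     [1, 1, 0, 1],
     [0, 1, 1, 1]]
C = [1, 1, 1, 2]
U_STAR = [HALF, HALF, HALF]   # LP dual optimal for this instance


def reduced_costs(u):
    return [C[j] - sum(u[i] * A[i][j] for i in range(len(A))) for j in range(len(C))]


def core_columns(u, threshold):
    """Indices of columns whose reduced cost is at most the (hypothetical) threshold."""
    return [j for j, rc in enumerate(reduced_costs(u)) if rc <= threshold]


def best_cover(columns):
    """Cheapest cover using only the given columns; None if they cannot cover all rows."""
    best = None
    for pattern in product((0, 1), repeat=len(columns)):
        chosen = [j for j, used in zip(columns, pattern) if used]
        if all(any(A[i][j] for j in chosen) for i in range(len(A))):
            value = sum(C[j] for j in chosen)
            if best is None or value < best:
                best = value
    return best


if __name__ == "__main__":
    for threshold in (0, HALF):
        core = core_columns(U_STAR, threshold)
        print(threshold, core, best_cover(core))
```

A core problem must remain feasible, so a practical rule would also check that every row is still coverable by the selected columns; `best_cover` returning `None` signals that the threshold was too tight.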

Conclusion
We have extended the classical supporting hyperplane illustration of the duality gap for non-convex optimization problems by dissecting the gap into two contributions: near-optimality in the Lagrangian relaxation and near-complementarity in the Lagrangian relaxed constraints. This dissection improves the understanding of the nature of the duality gap. We have also demonstrated that this dissection may have implications for the design of solution approaches.
Funding
Open access funding provided by Linköping University.