As mentioned earlier, PA-based methods are normally not able to guarantee optimality of a solution for a nonconvex MINLP problem; rather, they work as heuristic methods that might find a good, perhaps even an optimal, solution. The problem is that the separating hyperplane theorem, which these methods generally rely on, only guarantees that a separating hyperplane exists between two disjoint convex sets. Since this assumption is violated for nonconvex sets, the separation techniques commonly used in PA-based methods may not be valid. Thus, whenever cutting planes or supporting hyperplanes are generated to remove a previous solution point from the PA expressed in the MIP problem, we run the risk of cutting away feasible solutions of the original nonconvex problem. Therefore, while the primal bound provided by known integer-feasible solutions is still valid, the dual bound provided by the MIP solver is no longer valid as soon as a cut has been generated for a nonconvex constraint. Here, bound tightening is especially important since it may exclude problematic nonconvex parts of the feasible region early on.
As long as no cutting planes or supporting hyperplanes have been added to nonconvex constraints, the lower bounds provided by the MIP solver are valid lower bounds also for the nonconvex MINLP problem. These global bounds are stored in SHOT, and can be used for termination on gap tolerance, e.g., if the MIP lower bound is equal to or close to the upper bound provided by primal heuristics. Because SHOT can automatically detect convexity of nonlinear constraints, it is possible in some cases to avoid adding cuts for nonconvex constraints; to the best knowledge of the authors, SHOT is the only local solver available today that returns valid lower bounds also for nonconvex problems. Also, even if termination cannot be achieved by closing the objective gap, the lower bound may provide a good indication of the quality of the primal solution. This is one of the reasons that the default nonconvex strategy in SHOT tries to avoid creating cuts for nonconvex constraints as long as possible, i.e., as long as cuts can be added to convex constraints, none are created for nonconvex ones.
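As a rough illustration of termination on gap tolerance, the check could be sketched as follows. The gap definitions, the tolerance values and the function names are assumptions made for this sketch and are not necessarily those used in SHOT; the only point taken from the text is that the MIP bound may be used only while it is still known to be valid.

```python
def gaps(primal_bound, dual_bound, eps=1e-10):
    """Absolute and relative objective gap (assumed definitions)."""
    abs_gap = abs(primal_bound - dual_bound)
    rel_gap = abs_gap / (abs(primal_bound) + eps)
    return abs_gap, rel_gap


def can_terminate(primal_bound, dual_bound, dual_bound_valid,
                  abs_tol=1e-3, rel_tol=1e-3):
    """Terminate on gap only if the dual bound is still a valid global bound,
    i.e., no cut has yet been added for a nonconvex constraint."""
    if not dual_bound_valid:
        return False
    abs_gap, rel_gap = gaps(primal_bound, dual_bound)
    return abs_gap <= abs_tol or rel_gap <= rel_tol
```

The flag `dual_bound_valid` is a hypothetical stand-in for whatever bookkeeping records whether cuts for nonconvex constraints have been generated.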
By only adding the minimal number of cuts for nonconvex constraints required, the probability that the subproblem (3) becomes infeasible is reduced, and the longer the iterative dual-primal solution process is allowed to continue, the greater is the probability of finding better solutions with the primal heuristics. This is of course a generalization, and there are naturally problem instances where adding several cuts early on and reducing the number of subproblems solved improves the performance. However, as can be seen in the benchmarks later in this paper, AlphaECP and DICOPT are quite efficient at quickly finding a feasible solution, but they struggle at improving these initial solutions, and often have to terminate before finding the optimal one.
In the next example, we will illustrate how the ESH algorithm fails to solve a simple nonconvex MINLP problem. The same example will be used throughout this section to exemplify the nonconvex improvements in SHOT.
Example 1
We will now consider a simple nonconvex MINLP problem with one continuous variable \(x_1\) and one integer variable \(x_2\). The first nonlinear constraint \(g_1\) is nonconvex and \(g_2\) is convex.
In the first iteration of the ESH algorithm, only the linear constraints are considered, i.e., constraints \(g_1\) and \(g_2\) are ignored. The solution point to the MILP problem will be \(x^*_1 = (2,2)\). Assuming that we have found the interior point (5.99, 0.35) by minimizing the function
$$\begin{aligned} G(x_1,x_2) := \max \{g_1(x_1,x_2),\ g_2(x_1,x_2) \}\le 0, \end{aligned}$$
(5)
we can then perform a root search for a point on the boundary of the integer-relaxed nonlinear feasible set, i.e., where \(G(x_1,x_2)=0\), to obtain the point (5.69, 0.48). By generating a supporting hyperplane at this point for the constraint \(g_1\), since \(g_2\) is satisfied, we will get the following supporting hyperplane
$$\begin{aligned} {\text {CUT}}_1(x_1,x_2):= - 2.10 x_1 + 11.74 x_2 + 6.39 \le 0. \end{aligned}$$
(6)
As can be seen from Fig. 1, adding this hyperplane, which is based on the nonconvex constraint, causes the MILP problem to immediately become infeasible as all integer solutions are cut off. Thus, the standard ESH algorithm could not find a primal solution even to this simple problem, as it cannot recover from an infeasible MILP subproblem.
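The infeasibility caused by \({\text {CUT}}_1\) can be checked numerically by enumerating the three values of \(x_2\) and computing the interval of \(x_1\) that remains feasible. The following is a minimal sketch using the linear constraints and variable bounds as they appear in problem (9); the helper names are illustrative only.

```python
# Rows (a1, a2, c) represent a1*x1 + a2*x2 + c <= 0.
LINEAR = [(-1.0, -10.0, 6.0),   # -x1 - 10*x2 <= -6
          (1.0, -10.0, -4.0)]   #  x1 - 10*x2 <= 4
CUT_1 = (-2.10, 11.74, 6.39)    # supporting hyperplane from Eq. (6)
X1_BOUNDS = (2.0, 8.0)
X2_VALUES = (0, 1, 2)


def x1_interval(rows, x2, bounds=X1_BOUNDS):
    """Interval of x1 satisfying all rows for a fixed x2, or None if empty."""
    lo, hi = bounds
    for a1, a2, c in rows:
        rest = a2 * x2 + c
        if a1 > 0:
            hi = min(hi, -rest / a1)
        elif a1 < 0:
            lo = max(lo, -rest / a1)
        elif rest > 0:
            return None
    return (lo, hi) if lo <= hi else None


def feasible_intervals(rows):
    """Map each integer value of x2 to its nonempty x1 interval."""
    result = {}
    for x2 in X2_VALUES:
        interval = x1_interval(rows, x2)
        if interval is not None:
            result[x2] = interval
    return result


before = feasible_intervals(LINEAR)            # without the cut
after = feasible_intervals(LINEAR + [CUT_1])   # with CUT_1 added
```

Without the cut, \(x_2 = 1\) and \(x_2 = 2\) both admit \(x_1 \in [2, 8]\); with the cut, `after` is empty, since already for \(x_2 = 1\) the cut requires \(x_1 \ge 18.13/2.10 \approx 8.63\).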
SHOT includes much more functionality than the pure ESH algorithm, so it is possible that it will still find a valid integer solution to problem (4). For example, the MIP solver can return more than one feasible solution in its so-called solution pool, and checking these candidates on the original MINLP problem may give an integer-feasible solution. Other primal heuristic strategies, such as fixing integer variables to specific values and solving an NLP problem, could also work. In general, however, SHOT without the supplementary strategies discussed in this paper will not work well for nonconvex problems.
To reduce the probability of cutting away parts of the nonconvex feasible region, or more drastically creating an infeasible subproblem, we may want to generate as few cuts for nonconvex constraints as possible. It may also be a good idea to make the cuts less tight while still cutting away the previous solution point. Utilizing the ECP algorithm instead of the ESH algorithm, i.e., generating cutting planes instead of supporting hyperplanes, is a strategy to make the cuts less tight, thus reducing the probability of cutting away parts of the nonconvex feasible region. Since it is also problematic to find an interior point needed for the root search, ECP can in many cases be a better choice or the only option. However, since the ESH algorithm normally generates fewer and better cuts, it is very problem-specific which of the algorithms gives the best performance.
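The difference in tightness between the two cut types can be illustrated with a small sketch. The constraint `g` below is a hypothetical convex function chosen only for illustration (it is not one of the constraints of Ex. 1), and the bisection is a simple stand-in for the root search. The ECP cut linearizes `g` at the MIP solution point itself, whereas the ESH cut linearizes it at the boundary point found on the line towards an interior point.

```python
def g(x):
    """Hypothetical convex constraint g(x) <= 0: a unit disc centered at (1, 1)."""
    return (x[0] - 1.0) ** 2 + (x[1] - 1.0) ** 2 - 1.0


def grad_g(x):
    return (2.0 * (x[0] - 1.0), 2.0 * (x[1] - 1.0))


def linearize(p):
    """Coefficients (a1, a2, c) of the cut g(p) + grad_g(p)^T (x - p) <= 0."""
    gx, gy = grad_g(p)
    return gx, gy, g(p) - gx * p[0] - gy * p[1]


def root_search(interior, exterior, tol=1e-10):
    """Bisection on the segment interior -> exterior for a point with g close to 0."""
    lo, hi = 0.0, 1.0  # lo stays on the feasible side, hi on the infeasible side
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        p = tuple(a + mid * (b - a) for a, b in zip(interior, exterior))
        if g(p) <= 0.0:
            lo = mid
        else:
            hi = mid
    return tuple(a + hi * (b - a) for a, b in zip(interior, exterior))


def evaluate(cut, x):
    a1, a2, c = cut
    return a1 * x[0] + a2 * x[1] + c


x_k = (3.0, 1.0)        # assumed MIP solution violating g
interior = (1.0, 1.0)   # assumed interior point
ecp_cut = linearize(x_k)                          # cutting plane at x_k
esh_cut = linearize(root_search(interior, x_k))   # supporting hyperplane
```

Both cuts exclude `x_k`, but in this instance the ECP cut reads \(x_1 \le 2.25\) while the ESH cut reads (up to the bisection tolerance) \(x_1 \le 2\), so a point such as (2.1, 1) is removed only by the supporting hyperplane.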
As previously mentioned, SHOT first only adds cuts for the convex constraints. For most problems, it will however be required to eventually add cuts for nonconvex constraints as well. After this we cannot be sure that the lower bound obtained from the MIP solver is valid anymore for the nonconvex MINLP problem. Another issue is that we can end up with infeasible subproblems, even though the original MINLP problem is feasible. To handle this, SHOT will try to repair infeasible MIP problems by relaxing the cuts added, as described in Sect. 3.1. Also, if a primal solution has been found, SHOT will introduce an objective cut that forces the next solution to be better than the currently best known solution. If this causes the MIP problem to become infeasible, the same feasibility relaxation can be attempted. The objective cut is described in Sect. 3.2.
Repairing infeasibilities in the dual strategies
The main issue with solving nonconvex problems with a PA strategy is that feasible solutions are often sooner or later cut off when adding cuts to nonconvex constraints. The cuts might also make the linearized problem infeasible. Normally it is not possible to continue in this case, and we would need to terminate with the currently best known solution (if any).
A PA strategy can, however, be made more robust by performing an infeasibility relaxation, where the cuts added are relaxed to restore feasibility, after an infeasible subproblem has been detected. Assuming we have created a polyhedral approximation expressed using the constraints \(\mathrm C{\mathbf {x}}+ \mathbf {d} \le \mathbf {0}\), we can easily solve the following MILP problem to find a feasibility relaxation:
$$\begin{aligned} \begin{aligned}&\text {minimize}&\mathbf {v}^T\mathbf {r}&\\&\text {subject to}&\mathrm A{\mathbf {x}}\le \mathbf {a},\ \mathrm B{\mathbf {x}}= \mathbf{b},&\\&\mathrm C{\mathbf {x}}+ \mathbf {d} \le \mathbf {r},&\\&\underline{x}_i \le x_i \le \overline{x}_i&\forall i \in I = \{1,2,\ldots ,n\}, \\&x_i \in \mathbf {R},\ x_j \in \mathbf {Z}&\forall i,j \in I,\ i\ne j, \\&\mathbf {r} \ge \mathbf {0}.&\end{aligned} \end{aligned}$$
(7)
Note that if the MIP solver supports quadratic terms, these can be included in the repair problem as well. Here, the solution vector \(\mathbf {r}\) will contain the values required for restoring feasibility for the corresponding constraints. If \(r_k\) is fixed to zero, the k-th cut will not be allowed to be modified; this is used, e.g., for cuts generated for convex constraints, which we know are valid. Penalizing the relaxation of individual constraints is done by assigning high values to the corresponding elements of the positive vector of scalars \(\mathbf {v}\). In SHOT, the strategy is to penalize the constraints added later more than those added earlier, by assigning the weight k, where k is an increasing counter for generated cuts. By favoring the modification of early added cuts, the risk of cycling, i.e., when a cut recently added is directly relaxed, can be reduced. After the feasibility relaxation has been found, the constraints in the MIP problem are modified according to:
$$\begin{aligned} \mathrm C{\mathbf {x}}+ \mathbf {d} \le \mathbf {0} \quad \longrightarrow \quad \mathrm C{\mathbf {x}}+ \mathbf {d} \le \tau \mathbf {r}, \end{aligned}$$
(8)
where \(\tau \ge 1\) is a parameter to relax the model further. The MIP problem can now be solved, and additional cuts added to the linearization. If it was not possible to repair feasibility, SHOT will have to terminate with the currently best solution, as it is not possible to continue. This can, e.g., happen if integer cuts have been added and the user has restricted SHOT to not try to relax these in the repair process. However, since the NLP solvers utilized are not global, we cannot guarantee that their returned solution is global, and therefore the integer cut may exclude a solution we have not found yet, cf. Sect. 3.4. Another case when the repair step might fail is when a cutoff value below the best possible solution has been added, cf. Sect. 3.2.
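The bookkeeping around the repair problem, i.e., the weights \(v_k = k\), the fixing of \(r_k\) to zero for cuts on convex constraints, and the update in Eq. (8), could be sketched as below. The `Cut` class and function names are hypothetical and only serve the illustration; solving problem (7) itself is left to a MIP solver and is not shown here.

```python
from dataclasses import dataclass


@dataclass
class Cut:
    coeffs: tuple   # row of C
    const: float    # entry of d; the cut reads coeffs . x + const <= 0
    convex: bool    # True if generated for a convex constraint


def repair_weights(cuts):
    """Weight v_k = k in order of creation; None marks r_k fixed to zero."""
    return [None if cut.convex else k for k, cut in enumerate(cuts, start=1)]


def apply_relaxation(cuts, r, tau=1.1):
    """Eq. (8): C x + d <= 0 becomes C x + d <= tau * r."""
    return [Cut(cut.coeffs, cut.const - tau * r_k, cut.convex)
            for cut, r_k in zip(cuts, r)]


# Illustrative data: one cut on a nonconvex and one on a convex constraint.
cuts = [Cut((-2.10, 11.74), 6.39, convex=False),
        Cut((5.47, -6.27), -36.50, convex=True)]
weights = repair_weights(cuts)
relaxed = apply_relaxation(cuts, r=[1.30, 0.0], tau=1.1)
```

With a relaxation value of 1.30 and \(\tau = 1.1\), the constant of the first cut changes from 6.39 to 4.96, while the cut on the convex constraint is left untouched.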
CPLEX and Gurobi have built-in feasibility-repair functionality (utilizing the functions feasopt and feasRelax, respectively), while Cbc currently lacks this functionality. Thus, if Cbc is used, SHOT restores feasibility by solving problem (7) directly.
This type of infeasibility relaxation has similarities to the strategy used in DICOPT, where the repair functionality is integrated into the MILP subproblem and considered in each iteration instead of a separate repair step taken when the subproblem becomes infeasible [72].
Example 2
We will now apply the repair functionality to the infeasible subproblem obtained in Ex. 1. The different steps are illustrated in Fig. 2. The infeasible problem (with added supporting hyperplane \({\text {CUT}}_1\)) was
$$\begin{aligned} \begin{aligned}&\text {minimize}&x_1-10 x_2&\\&\text {subject to}&- x_1 -10x_2 \le -6,\quad x_1 - 10x_2 \le 4,&\\&- 2.10 x_1 + 11.74 x_2 \le - 6.39,&\\&2 \le x_1 \le 8,\quad x_2 \in \{ 0,1,2\}. \end{aligned} \end{aligned}$$
(9)
Now, we formulate and solve problem (7) with \(v_1 = 1\)
$$\begin{aligned} \begin{aligned}&\text {minimize}&r_1&\\&\text {subject to}&- x_1 -10x_2 \le -6,\quad x_1 - 10x_2 \le 4,&\\&- 2.10 x_1 + 11.74 x_2 + 6.39 \le r_1,&\\&2 \le x_1 \le 8,\quad x_2 \in \{ 0,1,2\},\quad r_1 \ge 0, \end{aligned} \end{aligned}$$
(10)
which gives the solution \((x_1,x_2,r_1) = (8,1,1.30)\). The supporting hyperplane in problem (9) is relaxed by adding 1.43 to the RHS (here, we assume that the factor \(\tau =1.1\)). Then the new cutting plane replacing \({\text {CUT}}_1\) will be
$$\begin{aligned} {\text {CUT}}_2(x_1,x_2):= - 2.10 x_1 + 11.74 x_2 + 4.96 \le 0. \end{aligned}$$
(11)
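Because the example has only three possible values of \(x_2\), the repair problem (10) can be solved by simple enumeration. The sketch below does this with the coefficients as printed in the text; the helper names are illustrative. With these two-decimal coefficients it returns \(r_1 \approx 1.33\) at the point (8, 1) rather than the reported 1.30, and a relaxed constant of about 4.93 rather than 4.96; the small difference presumably stems from rounding of the printed cut coefficients.

```python
# Rows (a1, a2, c) represent a1*x1 + a2*x2 + c <= 0.
LINEAR = [(-1.0, -10.0, 6.0),   # -x1 - 10*x2 <= -6
          (1.0, -10.0, -4.0)]   #  x1 - 10*x2 <= 4
CUT_1 = (-2.10, 11.74, 6.39)


def x1_interval(x2, lo=2.0, hi=8.0):
    """Interval of x1 allowed by the linear constraints for a fixed x2."""
    for a1, a2, c in LINEAR:
        bound = -(a2 * x2 + c) / a1
        if a1 > 0:
            hi = min(hi, bound)
        else:
            lo = max(lo, bound)
    return (lo, hi) if lo <= hi else None


def solve_repair(cut):
    """Minimize r >= 0 subject to cut(x) <= r and the linear constraints."""
    a1, a2, c = cut
    best = None
    for x2 in (0, 1, 2):
        interval = x1_interval(x2)
        if interval is None:
            continue
        # The violation is monotone in x1, so an endpoint is optimal.
        for x1 in interval:
            r = max(0.0, a1 * x1 + a2 * x2 + c)
            if best is None or r < best[0]:
                best = (r, x1, x2)
    return best


r1, x1, x2 = solve_repair(CUT_1)
tau = 1.1
relaxed_constant = CUT_1[2] - tau * r1   # constant of the relaxed cut
```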
When the updated MILP problem is solved, the solution is \((x_1,x_2)=(7.94,1)\), with objective value 16.75. Note that this point is not a valid solution to problem (1), since the constraint \(g_2\) is not fulfilled. However, we have a valid point outside the feasible region of the (integer-relaxed) MINLP problem, so we can perform a root search and generate a new constraint \({\text {CUT}}_3\):
$$\begin{aligned} {\text {CUT}}_3(x_1,x_2):= 5.47 x_1 - 6.27 x_2 -36.50 \le 0. \end{aligned}$$
(12)
As can be seen from the figure, this supporting hyperplane cut is generated for the convex constraint \(g_2\). Thus, the new cut does not cut away any feasible solutions. Adding this cut will however again give an integer-infeasible MILP problem, and we will need to restore feasibility by solving the following relaxation (with \(v_1 = v_2 = 1\))
$$\begin{aligned} \begin{aligned}&\text {minimize}&r_1 + r_2&\\&\text {subject to}&- x_1 -10x_2 \le -6,\quad x_1 - 10x_2 \le 4,&\\&- 2.10 x_1 + 11.74 x_2 + 4.96 \le r_1,&\\&5.47 x_1 - 6.27 x_2 -36.50 \le r_2,&\\&2 \le x_1 \le 8,\quad x_2 \in \{ 0,1,2\},\quad r_1 \ge 0,\ r_2 = 0, \end{aligned} \end{aligned}$$
(13)
where the variable \(r_2\) has been fixed to zero since the corresponding cut was generated for a convex constraint. Now, we obtain the value \(r_1 = 0.29\), and thus we replace the constraint \({\text {CUT}}_2\) with
$$\begin{aligned} {\text {CUT}}_4(x_1,x_2):= - 2.10 x_1 + 11.74 x_2 + 4.67 \le 0. \end{aligned}$$
(14)
The resulting MILP problem is then
$$\begin{aligned} \begin{aligned}&\text {minimize}&x_1-10 x_2&\\&\text {subject to}&- x_1 -10x_2 \le -6,\quad x_1 - 10x_2 \le 4,&\\&- 2.10 x_1 + 11.74 x_2 + 4.67 \le 0,&\\&5.47 x_1 - 6.27 x_2 -36.50 \le 0,&\\&2 \le x_1 \le 8,\quad x_2 \in \{ 0,1,2\}, \end{aligned} \end{aligned}$$
(15)
which gives the objective value 16.20 at the point \((x_1,x_2)=(7.80,1)\). This is also a feasible solution to the original MINLP problem (1). By performing these simple repair steps, we have thus found a feasible solution to a problem we could not have solved otherwise with the ESH method. Note, however, that this is still not the globally optimal solution mentioned in Ex. 1.
Utilizing a cutoff constraint to force new solutions and reduce the objective gap
Solving problem (1) as described in Exs. 1 and 2 shows that PA strategies in general, and the ESH algorithm specifically, may get stuck in suboptimal solutions for nonconvex problems. Normally, the PA methods then terminate with this suboptimal solution. However, as briefly described in [48], it is possible to try to force a better solution from the MIP problem when no progress can otherwise be made by introducing a so-called primal objective cut and then re-solving the MIP problem. This cut is of the form
$$\begin{aligned} {\mathbf {c}}^T{\mathbf {x}}\le \gamma \cdot {\text {PB}}, \end{aligned}$$
(16)
where \(\gamma \) must be selected so that \(\gamma \cdot {\text {PB}}< {\text {PB}}\). Note that this cannot normally be accomplished by using the cutoff functionality in the MIP solvers, since the infeasibilities normally need to be explicitly present in the model as constraints for the solvers' built-in infeasibility relaxation functionality to work.
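The condition \(\gamma \cdot {\text {PB}}< {\text {PB}}\) implies that \(\gamma < 1\) is needed for a positive primal bound and \(\gamma > 1\) for a negative one, while no \(\gamma \) satisfies it when the primal bound is exactly zero. A minimal sketch of such a selection is given below; the function name and the relative `step` parameter are assumptions made for illustration, not SHOT settings.

```python
def objective_cut_rhs(primal_bound, step=0.01):
    """Return (gamma, rhs) with rhs = gamma * PB strictly below PB."""
    if primal_bound > 0:
        gamma = 1.0 - step
    elif primal_bound < 0:
        gamma = 1.0 + step
    else:
        raise ValueError("gamma * PB < PB has no solution when PB = 0")
    return gamma, gamma * primal_bound


gamma_pos, rhs_pos = objective_cut_rhs(16.20)   # positive primal bound
gamma_neg, rhs_neg = objective_cut_rhs(-6.25)   # negative primal bound
```

For a primal bound of zero, an additive reduction of the right-hand side would be needed instead of a multiplicative one.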
In practice, whenever SHOT reaches an optimality gap of zero with cuts also created for nonconvex constraints, which would for a convex problem mean the global solution is found, it creates or modifies the objective cut in Eq. (16), so that its right-hand side is less than the current primal bound. The problem is then re-solved with the MIP solver. The problem will then either be infeasible (in which case the repair functionality discussed in Sect. 3.1 will try to repair the infeasibility), or a new solution with better objective value will be found. Note, however, that this solution does not need to be a new primal solution to the MINLP problem, since it is not required to fulfill the nonlinear constraints, only their linearizations through hyperplane cuts that have been included in the MIP problem. This whole procedure is then repeated a user-defined number of times.
In the next example, and as illustrated in Fig. 3, the primal objective cut procedure in combination with the repair functionality is applied to the problem considered in Exs. 1 and 2.
Example 3
In Ex. 2, we were able to repair the MILP problem to get a feasible solution in \((x_1,x_2)=(7.80,1)\) with the objective value 16.20. However, we know that this is not the global solution so we will try to find a better one by adding a primal objective cut that forces the objective to have a better (lower) value. We do this by introducing a cut
$$\begin{aligned} {\text {CUT}}_5(x_1,x_2):=x_1 -10x_2\le 0.3 \cdot 16.20 = 4.86. \end{aligned}$$
(17)
Note that the value 0.3 has been chosen here to reduce the number of iterations, and normally a \(\gamma \)-value less than but close to one should be used. As can be seen in Fig. 3, this makes the MILP problem infeasible again, and the constraint \({\text {CUT}}_4\) needs to be relaxed. The required feasibility relaxation can now be obtained by again formulating and solving problem (7). Note, however, that the primal objective cut \({\text {CUT}}_5\) should not be relaxed, so its corresponding r-variable should be fixed to zero. By replacing constraint \({\text {CUT}}_4\) with the repaired constraint (with \(\tau =1.1\))
$$\begin{aligned} {\text {CUT}}_6(x_1,x_2):= - 2.10 x_1 + 11.74 x_2 -7.51 \le 0, \end{aligned}$$
(18)
the MIP problem again has a solution in the point (2.0, 1). This point is, however, not feasible in the original MINLP problem, so we need to remove the point by adding a cut to the MILP problem in the next iteration. Now, the interior point is no longer feasible in the PA, so we instead add the cutting plane
$$\begin{aligned} {\text {CUT}}_7(x_1,x_2):= - 5.98 x_1 + 6.00 x_2 +19.06 \le 0, \end{aligned}$$
(19)
based on the nonconvex constraint \(g_1\). In iteration 7, the optimal solution \((x_1,x_2) = (2.18, 1) \) has now been found to a constraint tolerance of 0.03 and with an objective value of \(-6.25\).
Verifying lower bounds for nonconvex problems
If feasibility cannot be restored by modifying the supporting hyperplanes or cutting planes generated for the nonconvex constraints, the primal bound cannot be less than the value \(\gamma \cdot {\text {PB}}\), and this value is thus a valid lower bound for the objective value of the nonconvex problem. What this means in practice is that the POA of the convex constraints has no solution giving a lower objective value than \(\gamma \cdot {\text {PB}}\), and since all solutions to the nonconvex problem are contained in this polyhedral feasible set, no solution to the entire nonconvex problem can have a lower value than this value either.
Thus, the techniques in Sects. 3.1 and 3.2 can be combined to create a method for verifying a lower bound for problem (1). Assume that we have generated cuts \({\text {CUT}}_l\) with indices \(l\in L_\text {NC}\) for nonconvex constraints, out of all generated cut indices in L. We then generate the following MILP problem:
$$\begin{aligned} \begin{aligned}&\text {minimize}&{\mathbf {c}}^T{\mathbf {x}}+ \sum _{l\in L}r_l,&\\&\text {subject to}&\mathrm A{\mathbf {x}}\le \mathbf {a},\ \mathrm B{\mathbf {x}}= \mathbf{b},&\\&{\text {CUT}}_l({\mathbf {x}}) \le r_l&\forall l \in L, \\&{\mathbf {c}}^T{\mathbf {x}}\le \gamma \cdot {\text {PB}},&\\&\underline{x}_i \le x_i \le \overline{x}_i&\forall i \in I = \{1,2,\ldots ,n\}, \\&r_l \ge 0&\forall l \in L_\text {NC}, \\&r_l = 0&\forall l \in L \setminus L_\text {NC}, \\&x_i \in \mathbf {R},\ x_j \in \mathbf {Z},&\forall i,j \in I,\ i\ne j. \\ \end{aligned} \end{aligned}$$
(20)
If this problem is infeasible, then we know that the nonconvex problem (1), where each nonlinear equality constraint \(h({\mathbf {x}})=0\) has been rewritten as the two constraints \(-h({\mathbf {x}})\le 0\) and \(h({\mathbf {x}})\le 0\), does not have a solution with lower objective value than \(\gamma \cdot {\text {PB}}\).
Adding integer cuts
In algorithms based on PA, integer cuts are often used to exclude a specific combination of integer or binary variable solutions. For example, in POA-based convex MINLP, this can be used to speed up the solution process since a specific integer combination will not be revisited in later iterations. In nonconvex PA-based methods integer cuts may be needed to force the MIP solver to visit other integer combinations. An integer cut is a constraint of the form
$$\begin{aligned} \Vert {\mathbf {y}}- {\mathbf {y}}^k\Vert _1 \ge 1, \end{aligned}$$
(21)
where \({\mathbf {y}}\) corresponds to the elements of the vector \({\mathbf {x}}\) that are integer or binary variables. The constraint in Eq. (21) will then exclude the specific integer combination \({\mathbf {y}}^k\). In the case where all discrete variables are binaries, this expression simplifies to
$$\begin{aligned} \sum _{y^k_j = 0} y_j + \sum _{y^k_j = 1} (1-y_j) \ge 1. \end{aligned}$$
(22)
It is also possible to write the constraint in Eq. (21) in linear form in the more general case when one or more of the discrete variables are nonbinary; this is discussed further in [3].
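For binary variables, the norm condition in Eq. (21) counts the number of positions in which \({\mathbf {y}}\) differs from \({\mathbf {y}}^k\), which is a linear expression. The following minimal sketch (function names are illustrative) builds this linear constraint in the form \(\sum _j a_j y_j \ge b\) and verifies by enumeration that exactly one binary combination is excluded.

```python
from itertools import product


def binary_integer_cut(y_k):
    """Return (coeffs, rhs) such that sum(coeffs[j] * y[j]) >= rhs
    excludes exactly the binary vector y_k.

    Derived from: sum_{y_k[j]=0} y[j] + sum_{y_k[j]=1} (1 - y[j]) >= 1.
    """
    coeffs = [1 if value == 0 else -1 for value in y_k]
    rhs = 1 - sum(y_k)
    return coeffs, rhs


def satisfies(cut, y):
    coeffs, rhs = cut
    return sum(c * v for c, v in zip(coeffs, y)) >= rhs


y_k = (1, 0, 1)
cut = binary_integer_cut(y_k)
excluded = [y for y in product((0, 1), repeat=len(y_k))
            if not satisfies(cut, y)]
```

Here `excluded` contains only `(1, 0, 1)`; all other seven binary combinations remain feasible.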
For nonconvex MINLP, generating integer cuts in points provided by local NLP solvers is problematic due to the fact that we cannot be sure that the solution we have received when solving a fixed NLP problem for a specific integer combination is globally optimal unless a global solver has been used. The integer cut can, therefore, exclude the optimal integer assignment even if the optimal solution has not been obtained. Therefore, there is a setting in SHOT that also allows us to relax added integer cuts when doing the feasibility relaxation in Sect. 3.1 in case the MIP subproblem becomes infeasible after adding integer cuts.