Abstract
A deterministic global optimization method is developed for a class of discontinuous functions. McCormick’s method to obtain relaxations of nonconvex functions is extended to discontinuous factorable functions by representing a discontinuity with a step function. The properties of the relaxations are analyzed in detail; in particular, convergence of the relaxations to the function is established given some assumptions on the bounds derived from interval arithmetic. The obtained convex relaxations are used in a branch-and-bound scheme to formulate lower bounding problems. Furthermore, convergence of the branch-and-bound algorithm for discontinuous functions is analyzed and assumptions are derived to guarantee convergence. A key advantage of the proposed method over reformulating the discontinuous problem as a MINLP or MPEC is avoiding the increase in problem size that slows global optimization. Several numerical examples for the global optimization of functions with discontinuities are presented, including ones taken from process design and equipment sizing as well as discrete-time hybrid systems.
Acknowledgments
This work was supported by Statoil as part of the paired Ph.D. research program in gas technologies between MIT and NTNU.
Appendices
Appendix A: Discussion of sufficient conditions for convergence of the relaxations
Here, three lemmata will be given that present sufficient conditions for Assumption 4 to hold for a given factor \(v_k\) and thus can be used in a finite induction argument to establish Assumption 4. In particular, they formalize the discussion in Sect. 2.4 and show that, to establish Assumption 4, it is sufficient to exclude these cases from occurring. First, overestimation in binary operations is considered. Here, two reasonably strong results can be given. Then, attention will be directed to univariate functions where more restrictive assumptions need to be made.
Lemma 6
Consider any \(k\) such that \(n<k\le m\) where \(v_k\) is defined by a summation or multiplication. Consider a nested sequence of intervals \(X^l\rightarrow X^*=[\mathbf{x}^*,\mathbf{x}^*]\), \(X^l\in I\!X\), and \(X^l\ne X^*\). Suppose Assumption 4 holds for all \(i,j<k\). Suppose that \(v_i\) and \(v_j\) are discontinuous with respect to \(\mathbf{x}\) at \(\mathbf{x}^*\) and that these discontinuities are introduced at earlier factors \(k_i\le i\) and \(k_j\le j\), i.e., \(v_{k_i}=\pi (v_{r_i})\) and \(v_{k_j}=\pi (v_{r_j})\), \(r_i<k_i\) and \(r_j<k_j\). Assume that \(v_{k_i}\) and \(v_{k_j}\) are the only discontinuous step mappings. Define subsets of \(X^l\) as \(\varXi _i^l=\{\mathbf{x}\in X^l:v_{r_i}(\mathbf{x})>0\}\) and \(\varXi _j^l=\{\mathbf{x}\in X^l:v_{r_j}(\mathbf{x})>0\}\). If there exists an \(L\in \mathbb N \) so that for all \(l>L\),
\[\varXi _i^l\cap \varXi _j^l\ne \emptyset ,\quad \varXi _i^l\cap (X^l\backslash \varXi _j^l)\ne \emptyset ,\quad (X^l\backslash \varXi _i^l)\cap \varXi _j^l\ne \emptyset ,\quad (X^l\backslash \varXi _i^l)\cap (X^l\backslash \varXi _j^l)\ne \emptyset ,\]
then Assumption 4 holds for \(k\).
Proof
By assumption, there exist four sequences \(\{\mathbf{x}^l_1\},\ldots ,\{\mathbf{x}^l_4\}\) converging to \(\mathbf{x}^*\) where \(\mathbf{x}^l_1\in \varXi _i^l\cap \varXi _j^l\), \(\mathbf{x}^l_2\in \varXi _i^l\cap (X^l\backslash \varXi _j^l)\), \(\mathbf{x}^l_3\in (X^l\backslash \varXi _i^l)\cap \varXi _j^l\), and \(\mathbf{x}^l_4\in (X^l\backslash \varXi _i^l)\cap (X^l\backslash \varXi _j^l)\).
For any \(X^l\) with \(l>L\), the image of \(v_{k_i}\) is \(V_{k_i}^l=[0,1]\) as \(\varXi _i^l\) is a nonempty strict subset of \(X^l\), \(v_{k_i}(\mathbf{x}_q^l)=1\) for \(q=1,2\) and \(v_{k_i}(\mathbf{x}_q^l)=0\) for \(q=3,4\). Thus, \(V_{k_i}^l\) is an exact bound of the range of \(v_{k_i}\). Consider the finite sequence of \(s+1\) continuous factors, say \(v_{i_1},\ldots ,v_{i_s},v_i\) with \(k_i<i_1<\cdots <i_s<i\), that maps \(V_{k_i}\) to \(V_i\). By assumption, other arguments involved in the definition of the factors \(v_{i_1},\ldots ,v_{i_s},v_i\) are continuous step mappings and, as a result, their corresponding interval bounds converge to degenerate intervals as \(l\rightarrow \infty \).
Consider factor \(v_{i_1}\) and let \([\underline{v}_{i_1}^*,\overline{v}_{i_1}^*]=\lim _{l\rightarrow \infty }[\underline{v}_{i_1}^l,\overline{v}_{i_1}^l]\). If this step mapping is a binary operation combining \(v_{k_i}\) with a continuous factor, \(V_{i_1}\) will converge to a nondegenerate interval and, without loss of generality, \(\underline{v}_{i_1}^*=\lim _{l\rightarrow \infty }v_{i_1}^l(\mathbf{x}_q^l)\) for \(q=1,2\) and \(\overline{v}_{i_1}^*=\lim _{l\rightarrow \infty }v_{i_1}^l(\mathbf{x}_q^l)\) for \(q=3,4\). If this step mapping is a univariate operation, Assumption 4 guarantees that \(V_{i_1}\) will converge to the exact bounds, i.e., without loss of generality, \(\underline{v}_{i_1}^*=\lim _{l\rightarrow \infty }v_{i_1}^l(\mathbf{x}_q^l)\) for \(q=1,2\) and \(\overline{v}_{i_1}^*=\lim _{l\rightarrow \infty }v_{i_1}^l(\mathbf{x}_q^l)\) for \(q=3,4\). Repeating this argument for the factors \(v_{i_2},\ldots ,v_{i_s},v_i\), it follows without loss of generality that \(\underline{v}_i^*=\lim _{l\rightarrow \infty }v_i^l(\mathbf{x}_q^l)\) for \(q=1,2\) and \(\overline{v}_i^*=\lim _{l\rightarrow \infty }v_i^l(\mathbf{x}_q^l)\) for \(q=3,4\) where \([\underline{v}_i^*,\overline{v}_i^*]=\lim _{l\rightarrow \infty }[\underline{v}_i^l,\overline{v}_i^l]\). It can be argued similarly that, without loss of generality, \(\underline{v}_j^*=\lim _{l\rightarrow \infty }v_j^l(\mathbf{x}_q^l)\) for \(q=1,3\) and \(\overline{v}_j^*=\lim _{l\rightarrow \infty }v_j^l(\mathbf{x}_q^l)\) for \(q=2,4\) where \([\underline{v}_j^*,\overline{v}_j^*]=\lim _{l\rightarrow \infty }[\underline{v}_j^l,\overline{v}_j^l]\).
Thus, each combination of the bounds of \(v_i\) and \(v_j\) is attained in the neighborhood of \(\mathbf{x}^*\). In particular, in the case of addition, the sequences \(\{v_i(\mathbf{x}^l_1)\}\), \(\{v_j(\mathbf{x}^l_1)\}\) and \(\{v_i(\mathbf{x}^l_4)\}\), \(\{v_j(\mathbf{x}^l_4)\}\) converge to \(\underline{v}_i^*\), \(\underline{v}_j^*\) and \(\overline{v}_i^*\), \(\overline{v}_j^*\), respectively. Thus, \([\underline{v}_k^*,\overline{v}_k^*]=[\underline{v}_i^*,\overline{v}_i^*]+[\underline{v}_j^*,\overline{v}_j^*]\) is, in the limit, an exact bound. A similar argument can be presented for the case of multiplication. Here, each combination of lower and upper bounds on \(v_i\) and \(v_j\) is realized by a different sequence \(\{\mathbf{x}_q^l\}\), \(q=1,\ldots ,4\). Thus, Assumption 4 holds for \(k\). \(\square \)
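The four-sequence argument of this proof can be checked numerically on a toy instance. The sketch below assumes the hypothetical factors \(v_i=\pi (x_1)\) and \(v_j=\pi (x_2)\) on \(X^l=[-l^{-1},l^{-1}]^2\), with \(\pi (z)=1\) for \(z>0\) and \(0\) otherwise (the convention suggested by the definition of \(\varXi \)). The factors, helper names and the use of the four corners of \(X^l\) as representatives of the four sequences are illustrative assumptions, not taken from the paper.

```python
def step(z):
    """Step function pi: 1 for z > 0, 0 otherwise (assumed convention)."""
    return 1.0 if z > 0 else 0.0


def interval_add(a, b):
    """Interval sum [a] + [b]."""
    return (a[0] + b[0], a[1] + b[1])


def interval_mul(a, b):
    """Interval product [a] * [b] from the four endpoint products."""
    products = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return (min(products), max(products))


def lemma6_check(l):
    """Compare interval bounds with values attained on X^l = [-1/l, 1/l]^2."""
    h = 1.0 / l
    # One point from each of the four regions used in the proof.
    corners = [(h, h), (h, -h), (-h, h), (-h, -h)]
    V_i = (step(-h), step(h))  # bounds on pi(x_1)
    V_j = (step(-h), step(h))  # bounds on pi(x_2)
    sums = [step(x1) + step(x2) for x1, x2 in corners]
    prods = [step(x1) * step(x2) for x1, x2 in corners]
    return {
        "sum_interval": interval_add(V_i, V_j),
        "sum_range": (min(sums), max(sums)),
        "prod_interval": interval_mul(V_i, V_j),
        "prod_range": (min(prods), max(prods)),
    }


if __name__ == "__main__":
    for l in (1, 10, 1000):
        print(l, lemma6_check(l))
```

In this instance the interval bounds on the sum and on the product coincide with the attained extreme values for every \(l\), because all four regions are nonempty.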
Remark 5

Lemma 6 considers the case of adding or multiplying \(v_i\) and \(v_j\) where \(v_i\) and \(v_j\) are discontinuous in the limit \(\mathbf{x}^*\) and these discontinuities are introduced by exactly one \(\pi \) function each. Then, the dependency problem in interval arithmetic can be mitigated when there exist regions in each interval \(X^l\) so that all combinations of the lower and upper bounds of the factors \(v_i\) and \(v_j\) are attained. This can be alternatively expressed as requiring that the intrinsic discontinuities do not coincide in a neighborhood of \(\mathbf{x}^*\). A case where this hypothesis of Lemma 6 holds is illustrated in Fig. 6a.

A counterexample can be given to show that Lemma 6 cannot be easily extended to the case when more than \(n\) intrinsic discontinuities coincide at \(\mathbf{x}^*\in \mathbb{R }^n\). To see this, consider \(f(\mathbf{x})=1+\pi (x_1)+\pi (x_2)-\pi (x_1+x_2)\), \(X=[-1,1]^2\) and \(X^l=[-l^{-1},l^{-1}]^2\). As shown in Fig. 6b, three intrinsic discontinuities coincide at \((0,0)\). The bounds of \(f\) on \(X^l\) obtained from interval arithmetic are \(\underline{f}^l=0\) and \(\overline{f}^l=3\). They are not attained for any \(\mathbf{x}\in X^l\) and any \(l\) and thus Assumption 4 does not hold.
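The counterexample can be reproduced with a few lines of Python. The sketch below assumes \(\pi (z)=1\) for \(z>0\) and \(0\) otherwise, reads the example as \(f(\mathbf{x})=1+\pi (x_1)+\pi (x_2)-\pi (x_1+x_2)\) on \(X^l=[-l^{-1},l^{-1}]^2\) (the reading under which interval arithmetic gives the stated bounds 0 and 3), and uses a uniform sample grid as a stand-in for the true range. All function names are hypothetical.

```python
from itertools import product


def step(z):
    """Step function pi: 1 for z > 0, 0 otherwise (assumed convention)."""
    return 1.0 if z > 0 else 0.0


def f(x1, x2):
    return 1.0 + step(x1) + step(x2) - step(x1 + x2)


def add(a, b):
    """Interval sum [a] + [b]."""
    return (a[0] + b[0], a[1] + b[1])


def sub(a, b):
    """Interval difference [a] - [b]."""
    return (a[0] - b[1], a[1] - b[0])


def step_interval(x):
    """Interval extension of pi; exact because pi is nondecreasing."""
    return (step(x[0]), step(x[1]))


def interval_bounds(l):
    """Natural interval extension of f on X^l = [-1/l, 1/l]^2."""
    x = (-1.0 / l, 1.0 / l)
    p1 = step_interval(x)
    p2 = step_interval(x)
    p12 = step_interval(add(x, x))
    return add(add((1.0, 1.0), p1), sub(p2, p12))


def sampled_range(l, n=41):
    """Smallest and largest value of f on an n-by-n grid covering X^l."""
    h = 1.0 / l
    grid = [-h + 2.0 * h * i / (n - 1) for i in range(n)]
    values = [f(a, b) for a, b in product(grid, grid)]
    return (min(values), max(values))


if __name__ == "__main__":
    for l in (1, 10, 1000):
        print(l, interval_bounds(l), sampled_range(l))
```

For every \(l\) tried, the interval bounds stay at \([0,3]\) while the sampled values of \(f\) stay within \([1,2]\), which matches the claim that the bounds are never attained.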

Also note that, given Assumption 4, the exacerbated dependency problem of interval arithmetic is not acute when there is only one discontinuity present in either \(v_i\) or \(v_j\) at \(\mathbf{x}^*\). This has been exploited in the proof of Lemma 6.

Lastly, observe that the hypotheses of Lemma 6 cannot be satisfied when \(X^l\subset \mathbb{R }\). At most three subsets of \(X\) in the vicinity of \(x^*\), \(\{x:x<x^*\}\), \(\{x:x=x^*\}\) and \(\{x:x>x^*\}\), are conceivable where \(v_i\) and \(v_j\) could attain their lower and upper bounds. To guarantee that Assumption 4 holds for \(v_k\), the interval arithmetic for \(v_i+v_j\) or \(v_iv_j\) needs to combine the bounds in such a way that \(v_k\) attains both its lower and upper bound. However, it is easy to conceive counterexamples where this is not true, e.g., see the discussion prior to Lemma 6.
Though it was pointed out that there are counterexamples restricting the generalization of Lemma 6 when more than 2 intrinsic discontinuities coincide at \(\mathbf{x}^*\) in \(\mathbb{R }^2\), a generalization is possible to \(n\) intrinsic discontinuities coinciding in \(\mathbb{R }^n\).
Lemma 7
Consider any \(k\) such that \(n<k\le m\) where \(v_k\) is defined by summation or multiplication. Suppose Assumption 4 holds for all \(i,j<k\). Consider a nested sequence of intervals \(X^l\rightarrow X^*=[\mathbf{x}^*,\mathbf{x}^*]\), \(X^l\in I\!X\), \(X^l\ne X^*\). Suppose that \(v_i\) and \(v_j\) are discontinuous with respect to \(\mathbf{x}\) at \(\mathbf{x}^*\) and that these discontinuities are introduced by \(q\le n\) earlier factors \(k_1,\ldots ,k_q\), i.e., \(v_{k_{\tilde{q}}}=\pi (v_{r_{\tilde{q}}})\) with \(v_{r_{\tilde{q}}}(\mathbf{x}^*)=0\) for \(\tilde{q}=1,\ldots ,q\). Assume that \(v_{r_{\tilde{q}}}\) is differentiable with respect to \(\mathbf{x}\) at \(\mathbf{x}^*\), for all \(\tilde{q}=1,\ldots ,q\), and denote the gradient of \(v_{r_{\tilde{q}}}\) at \(\mathbf{x}^*\) as \(\varvec{\nabla }v_{\tilde{q}}\). If \(\varvec{\nabla }v_1,\ldots ,\varvec{\nabla }v_q\) are linearly independent, then Assumption 4 holds for \(k\).
Proof
Define subsets of \(X^l\) as \(\varXi _{\tilde{q}}^l=\{\mathbf{x}\in X^l:v_{r_{\tilde{q}}}>0\}\), \(\tilde{q}=1,\ldots ,q\). Requiring linear independence of \(\varvec{\nabla }v_1,\ldots ,\varvec{\nabla }v_q\) is a sufficient condition for the existence of \(2^q\) nonempty subsets of \(X^l\) that realize all combinations of \(\varXi _{\tilde{q}}^l\) with \(\varXi _{\hat{q}}^l\) or \(X^l\backslash \varXi _{\hat{q}}^l\), \(\hat{q}=1,\ldots ,q\), \(\hat{q}\ne \tilde{q}\), for all \(l>L\) for some \(L\in \mathbb N \). Thus, the argument used in the proof of Lemma 6 can be extended to show that each possible combination of the bounds on intermediate factors is indeed realized. \(\square \)
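The role of linear independence can be illustrated by counting which combinations of the sets \(\varXi _{\tilde{q}}^l\) and their complements actually occur. The sketch below is a simplified check under stated assumptions: each \(v_{r_{\tilde{q}}}\) is replaced by its linearization \(\varvec{\nabla }v_{\tilde{q}}\cdot \mathbf{x}\) around \(\mathbf{x}^*=\mathbf{0}\), the box \(X^l\) is sampled on a uniform grid, and the function name and the gradient values are hypothetical.

```python
from itertools import product


def sign_patterns(gradients, half_width, n=21):
    """Set of observed tuples (g . x > 0 for each gradient g) on [-h, h]^d."""
    dim = len(gradients[0])
    grid = [-half_width + 2.0 * half_width * i / (n - 1) for i in range(n)]
    patterns = set()
    for x in product(grid, repeat=dim):
        pattern = tuple(
            sum(g_c * x_c for g_c, x_c in zip(g, x)) > 0 for g in gradients
        )
        patterns.add(pattern)
    return patterns


if __name__ == "__main__":
    independent = [(1.0, 0.0), (1.0, 1.0)]       # q = 2 = n, independent
    dependent = [(1.0, 1.0), (2.0, 2.0)]         # q = 2, linearly dependent
    three_in_plane = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]  # q = 3 > n = 2
    for h in (1.0, 0.1, 0.001):
        print(h,
              len(sign_patterns(independent, h)),
              len(sign_patterns(dependent, h)),
              len(sign_patterns(three_in_plane, h)))
```

With two independent gradients all \(2^2=4\) combinations appear at every box size tried. With dependent gradients only 2 appear, and with three discontinuities in the plane only 6 of the \(2^3=8\) combinations appear, which is the situation of the counterexample in Remark 5.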
Lemma 8
Consider any \(k\) such that \(n<k\le m\) where \(v_k\) is defined by \(v_k=\pi (v_i)\). Consider a nested sequence of intervals \(X^l\rightarrow X^*=[\mathbf{x}^*,\mathbf{x}^*]\), \(X^l\in I\!X\), \(X^l\ne X^*\). Suppose either
1. that \(v_i(\mathbf{x}^*)=0\) and that for all \(l>0\) there exist an \(\mathbf{x}_l^\dagger \in X^l\) and an \(\varepsilon _l>0\) so that \(v_i(\mathbf{x}_l^\dagger )=\varepsilon _l\),
2. that there exists an \(L_1>0\) so that \(\overline{v}_i^l\le 0\) for all \(l\ge L_1\), or
3. that there exists an \(L_2>0\) so that \(\underline{v}_i^l> 0\) for all \(l\ge L_2\).
Then, Assumption 4 holds for \(k\).
Proof
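Consider Case 1. By assumption, \(\underline{v}_k^l=0\) and \(\overline{v}_k^l=1\), \(\forall l\) so that \(\lim _{l\rightarrow \infty }[\underline{v}_k^l,\overline{v}_k^l]=[0,1]\). Furthermore, it holds that
\[\left[ \lim _{l\rightarrow \infty }\inf _{\mathbf{x}\in X^l}v_k(\mathbf{x}),\lim _{l\rightarrow \infty }\sup _{\mathbf{x}\in X^l}v_k(\mathbf{x})\right] =[0,1],\]
because \(v_k(\mathbf{x}^*)=0\) and \(v_k(\mathbf{x}_l^\dagger )=1\) for all \(l\).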
Consider Case 2. By assumption, \([\underline{v}_k^l,\overline{v}_k^l]=[0,0]\) for all \(l\ge L_1\). Thus, \(v_k(\mathbf{x})=0\) for all \(\mathbf{x}\in X^{L_1}\) so that \([\lim _{l\rightarrow \infty }\inf _{\mathbf{x}\in X^l}v_k(\mathbf{x}),\lim _{l\rightarrow \infty }\sup _{\mathbf{x}\in X^l}v_k(\mathbf{x})]=[0,0]=\lim _{l\rightarrow \infty }[\underline{v}_k^l,\overline{v}_k^l]\).
Consider Case 3. By assumption, \([\underline{v}_k^l,\overline{v}_k^l]=[1,1]\) for all \(l\ge L_2\). Thus, \(v_k(\mathbf{x})=1\) for all \(\mathbf{x}\in X^{L_2}\) so that \([\lim _{l\rightarrow \infty }\inf _{\mathbf{x}\in X^l}v_k(\mathbf{x}),\lim _{l\rightarrow \infty }\sup _{\mathbf{x}\in X^l}v_k(\mathbf{x})]=[1,1]=\lim _{l\rightarrow \infty }[\underline{v}_k^l,\overline{v}_k^l]\).
Thus, Eq. (2) and, hence, Assumption 4 hold for factor \(k\). \(\square \)
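The three cases of Lemma 8 can be seen on a one-dimensional toy instance. The sketch below assumes the hypothetical factor \(v_i(x)=x-c\) on \(X^l=[-l^{-1},l^{-1}]\) with \(\pi (z)=1\) for \(z>0\) and \(0\) otherwise; the shift \(c\) selects the case (\(c=0\) for Case 1, \(c>0\) for Case 2, \(c<0\) for Case 3). The attained values are estimated on a sample grid, and all names are illustrative.

```python
def step(z):
    """Step function pi: 1 for z > 0, 0 otherwise (assumed convention)."""
    return 1.0 if z > 0 else 0.0


def interval_bounds(c, l):
    """Interval bounds on v_k = pi(x - c) for x in X^l = [-1/l, 1/l]."""
    lo, hi = -1.0 / l, 1.0 / l
    v_lo, v_hi = lo - c, hi - c        # interval arithmetic for v_i = x - c
    return (step(v_lo), step(v_hi))    # pi is nondecreasing


def sampled_range(c, l, n=101):
    """Smallest and largest sampled value of pi(x - c) on X^l."""
    lo, hi = -1.0 / l, 1.0 / l
    grid = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    values = [step(x - c) for x in grid]
    return (min(values), max(values))


if __name__ == "__main__":
    for name, c in (("Case 1", 0.0), ("Case 2", 0.5), ("Case 3", -0.5)):
        for l in (4, 10, 100):
            print(name, l, interval_bounds(c, l), sampled_range(c, l))
```

In each case the interval bounds agree with the sampled extreme values: \([0,1]\) in Case 1, \([0,0]\) in Case 2 and \([1,1]\) in Case 3.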
Lemma 9
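Consider a nested sequence of intervals \(X^l\rightarrow X^*\), \(X^l\in I\!X\), \(X^l\ne X^*\) and a continuous function \(f:X\rightarrow \mathbb{R }\). Then,
\[\lim _{l\rightarrow \infty }\inf _{\mathbf{x}\in X^l}f(\mathbf{x})=\inf _{\mathbf{x}\in X^*}f(\mathbf{x})\quad \text{and}\quad \lim _{l\rightarrow \infty }\sup _{\mathbf{x}\in X^l}f(\mathbf{x})=\sup _{\mathbf{x}\in X^*}f(\mathbf{x}).\]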
Proof
Fix \(\varepsilon >0\). Let \(\mathbf{x}^*_{\min }\in \text{ arg } \text{ min }_{\mathbf{x}\in X^*}f(\mathbf{x})\); the infimum is attained since \(X^*\) is compact and \(f\) is continuous on \(X^*\). Since \(X^l\subset X\) is compact and \(f\) is continuous on \(X\), \(f\) is uniformly continuous on \(X^l\). Uniform continuity of \(f\) implies that \(\exists \delta >0\) so that \(|f(\mathbf{x})-f(\mathbf{y})|<\varepsilon \) for all \(\mathbf{x},\mathbf{y}\in X^l\) for which \(\Vert \mathbf{x}-\mathbf{y}\Vert <\delta \sqrt{n}\). Convergence of \(X^l\) to \(X^*\) implies that there is an \(L>0\) so that \(d_H(X^l,X^*)<\delta \) for all \(l>L\). By definition of the Hausdorff metric, \(\underline{x}_i^l>\underline{x}_i^*-\delta \) and \(\overline{x}_i^l<\overline{x}_i^*+\delta \) for all \(l>L\) and \(i=1,\ldots ,n\). Thus, \(f(\mathbf{x}^\dagger )+\varepsilon >f(\mathbf{x}^\ddagger )\) where \(\mathbf{x}^\dagger \in X^l\backslash X^*\) and \(\mathbf{x}^\ddagger \in \partial X^*\) with \(\partial X^*\) denoting the boundary of \(X^*\). By definition, \(f(\mathbf{x})\ge f(\mathbf{x}^*_{\min })\), \(\forall \mathbf{x}\in X^*\) so that \(f(\mathbf{x}^\ddagger )\ge f(\mathbf{x}^*_{\min })\). As a result, \(f(\mathbf{x})+\varepsilon >f(\mathbf{x}^*_{\min })\) for all \(\mathbf{x}\in X^l\) with \(l>L\). Since \(X^l\supset X^*\), \(\inf _{\mathbf{x}\in X^l}f(\mathbf{x})\le f(\mathbf{x}^*_{\min })\) for all \(l\). \(\varepsilon \) is arbitrary so that \(\lim _{l\rightarrow \infty }\inf _{\mathbf{x}\in X^l}f(\mathbf{x})=\inf _{\mathbf{x}\in X^*}f(\mathbf{x})\). An analogous argument can be made to show that \(\lim _{l\rightarrow \infty }\sup _{\mathbf{x}\in X^l}f(\mathbf{x})=\sup _{\mathbf{x}\in X^*}f(\mathbf{x})\). \(\square \)
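The statement of Lemma 9 can be observed on a simple continuous function. The sketch below assumes the hypothetical choice \(f(x)=(x-0.3)^2\), \(X^*=[0.5,1]\) and \(X^l=[0.5-l^{-1},1+l^{-1}]\); because this \(f\) is a convex quadratic, its infimum on an interval is obtained by projecting the unconstrained minimizer onto the interval, so no sampling is needed. The names are illustrative only.

```python
def f(x):
    """Hypothetical continuous test function."""
    return (x - 0.3) ** 2


def inf_on_interval(lo, hi):
    """Exact infimum of the convex quadratic f on [lo, hi]."""
    x_min = min(max(0.3, lo), hi)  # projection of the minimizer 0.3
    return f(x_min)


def gaps(ls):
    """inf over X^* minus inf over X^l, for each l in ls."""
    target = inf_on_interval(0.5, 1.0)
    return [target - inf_on_interval(0.5 - 1.0 / l, 1.0 + 1.0 / l) for l in ls]


if __name__ == "__main__":
    ls = (10, 100, 1000, 10000)
    for l, g in zip(ls, gaps(ls)):
        print(l, g)
```

The gap between the infimum on \(X^*\) and the infimum on \(X^l\) is nonnegative, since \(X^l\supset X^*\), and shrinks toward zero as \(l\) grows.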
Lemma 10
Consider any \(k\) such that \(n<k\le m\) where \(v_k\) is defined by a continuous univariate intrinsic function \(\varphi _k\). Suppose Assumption 4 holds for all \(i<k\). Consider a nested sequence of intervals \(X^l\rightarrow X^*=[\mathbf{x}^*,\mathbf{x}^*]\), \(X^l\in I\!X\), \(X^l\ne X^*\). Let \([\underline{v}_i^*,\overline{v}_i^*]=\lim _{l\rightarrow \infty }[\underline{v}_i^l,\overline{v}_i^l]\). Then, Assumption 4 holds for \(k\) if
\[\varPhi _k([\underline{v}_i^*,\overline{v}_i^*])=\left[ \inf _{z\in [\underline{v}_i^*,\overline{v}_i^*]}\varphi _k(z),\sup _{z\in [\underline{v}_i^*,\overline{v}_i^*]}\varphi _k(z)\right] .\]
Proof
First, suppose that \(v_i\) is continuous with respect to \(\mathbf{x}\) at \(\mathbf{x}^*\). Then, \([\underline{v}_i^*,\overline{v}_i^*]\) is a degenerate interval. Since \(\varPhi _k\) is an interval extension, \(\varPhi _k([\underline{v}_i^*,\overline{v}_i^*])\) is also a degenerate interval and, hence, Eq. (2) holds.
Next, suppose that \(v_i\) is not continuous with respect to \(\mathbf{x}\) at \(\mathbf{x}^*\). Since Assumption 4 holds for factor \(i\), it follows that \(\lim _{l\rightarrow \infty }[\underline{v}_i^l,\overline{v}_i^l]=[\lim _{l\rightarrow \infty }\inf _{\mathbf{x}\in X^l}v_i(\mathbf{x}), \lim _{l\rightarrow \infty }\sup _{\mathbf{x}\in X^l}v_i(\mathbf{x})]\). Consider the sequence \(V_i^l=[\underline{v}_i^l,\overline{v}_i^l]\) converging to \(V_i^*=[\underline{v}_i^*,\overline{v}_i^*]\). According to Lemma 9, it holds that \(\lim _{l\rightarrow \infty }\inf _{z\in V_i^l}\varphi _k(z)=\inf _{z\in V_i^*}\varphi _k(z)\) and that \(\lim _{l\rightarrow \infty }\sup _{z\in V_i^l}\varphi _k(z)=\sup _{z\in V_i^*}\varphi _k(z)\). The hypotheses of the lemma imply furthermore that \(\underline{\varPhi }_k([\underline{v}_i^*,\overline{v}_i^*])=\inf _{z\in [\underline{v}_i^*,\overline{v}_i^*]}\varphi _k(z)\) and that \(\overline{\varPhi }_k([\underline{v}_i^*,\overline{v}_i^*])=\sup _{z\in [\underline{v}_i^*,\overline{v}_i^*]}\varphi _k(z)\). Therefore it follows that
\[\lim _{l\rightarrow \infty }[\underline{v}_k^l,\overline{v}_k^l]=\left[ \lim _{l\rightarrow \infty }\inf _{\mathbf{x}\in X^l}v_k(\mathbf{x}),\lim _{l\rightarrow \infty }\sup _{\mathbf{x}\in X^l}v_k(\mathbf{x})\right] ,\]
i.e., Eq. (2) holds and, hence, Assumption 4 is established for factor \(k\). \(\square \)
Remark 6
One class of univariate intrinsic functions \(\varphi \) that can meet the hypotheses of Lemma 10 is the class of monotone functions. However, the specific implementation of \(\varPhi \) will dictate whether \(\varphi \) indeed meets the hypotheses of Lemma 10.
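The dependence on the implementation of \(\varPhi \) can be made concrete with a short Python sketch. It compares an endpoint-based extension of the monotone function \(\exp \), which returns the exact range, with two hypothetical implementations of the square: one evaluated as the interval product \(z\cdot z\), which overestimates on intervals containing zero, and one that treats the square as a single intrinsic function. The function names and test intervals are assumptions chosen for illustration.

```python
import math


def exp_extension(lo, hi):
    """Exact range of exp on [lo, hi], using monotonicity."""
    return (math.exp(lo), math.exp(hi))


def sqr_via_product(lo, hi):
    """Square implemented as the interval product [lo, hi] * [lo, hi]."""
    products = [lo * lo, lo * hi, hi * lo, hi * hi]
    return (min(products), max(products))


def sqr_exact(lo, hi):
    """Exact range of z**2 on [lo, hi]."""
    upper = max(lo * lo, hi * hi)
    if lo <= 0.0 <= hi:
        return (0.0, upper)        # minimum attained at z = 0
    return (min(lo * lo, hi * hi), upper)


if __name__ == "__main__":
    print(exp_extension(0.0, 1.0))
    print(sqr_via_product(-1.0, 1.0), sqr_exact(-1.0, 1.0))
    print(sqr_via_product(1.0, 2.0), sqr_exact(1.0, 2.0))
```

On \([-1,1]\) the product-based square returns \([-1,1]\) while the exact range is \([0,1]\); on \([1,2]\), where the square is monotone, both implementations agree.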
Appendix B: More general convergence results for the branch-and-bound algorithm
In Sect. 3, it was assumed that \(f\) is either lower semicontinuous or attains its minimum on \(D\). Results are outlined below that hold even when these assumptions are generalized.
Remark 7
When the assumption that \(f\) is lower semicontinuous is dropped in Theorem 9, then one cannot appeal to Theorem 8. However, with Remark 3 in mind, one can argue that \(\lim _{q\rightarrow \infty }\beta (X^{l_q})=\lim _{q\rightarrow \infty }\inf _{\mathbf{x}\in X^{l_q}}f(\mathbf{x})\ge \min \{\lim _{q\rightarrow \infty }\inf _{\mathbf{x}\in X^{l_q}\backslash \{\mathbf{x}^*\}}f(\mathbf{x}),f(\mathbf{x}^*)\}\ge \min \{\liminf _{\mathbf{x}\rightarrow \mathbf{x}^*}f(\mathbf{x}),f(\mathbf{x}^*)\}\), which is sufficient to show that the lower bounding operation is strongly consistent.
Note that Remark 7 does not allow for the argument \(\lim _{q\rightarrow \infty }\beta (X^{l_q})=\inf _{\mathbf{x}\in X^*\cap D}f(\mathbf{x})\), and consequently \(\lim _{k\rightarrow \infty }\beta _k=\inf _{\mathbf{x}\in D}f(\mathbf{x})\), when \(f\) is not assumed to be lower semicontinuous. In particular, there may be an infinitely decreasing sequence of nested intervals \(X^l\) so that there exists a \(\mathbf{y}\in \partial D\) with \(\mathbf{y}\in \text{ int }X^l\), \(\forall l\), i.e., all partition elements contain an element of the boundary of the feasible set in their interior. Suppose that \(f(\mathbf{y})=\inf _{\mathbf{x}\in D}f(\mathbf{x})\). Thus, it is conceivable that there exist an \(\varepsilon >0\) and a sequence \(\{\mathbf{z}^l\}\) with \(\mathbf{z}^l\notin D\), \(\mathbf{z}^l\in X^l\), \(\forall l\) so that \(f(\mathbf{z}^l)<f(\mathbf{y})-\varepsilon \). As a result, \(\lim _{l\rightarrow \infty }\beta (X^l)\le f(\mathbf{y})-\varepsilon \).
To avoid this complication, another assumption is introduced.
Assumption 6
Suppose \(f(\mathbf{y})\ge \inf _{\mathbf{x}\in D}f(\mathbf{x})\), \(\forall \mathbf{y}\in X:\mathbf{y}\notin D\).
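This assumption can be satisfied by reformulating \(f\) as a penalty function, e.g., minimizing \(\tilde{f}\) with
\[\tilde{f}(\mathbf{x})={\left\{ \begin{array}{ll} f(\mathbf{x}) &{} \text{if } \mathbf{x}\in D,\\ \overline{f}(D) &{} \text{otherwise,} \end{array}\right. }\]
where \(\overline{f}(D)\) denotes an upper bound, e.g., derived from interval analysis, of \(f\) on \(D\).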
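The complication described before Assumption 6, and the effect of a penalty reformulation, can be sketched on a one-dimensional toy problem. Everything in the sketch is an assumption made for illustration: the objective \(f(x)=-1\) for \(x<0\) and \(f(x)=x\) otherwise, the feasible set \(D=[0,1]\), the upper bound \(\overline{f}(D)=1\), and a penalty function that equals \(f\) on \(D\) and \(\overline{f}(D)\) elsewhere. The infimum over \(X^l=[-l^{-1},l^{-1}]\) is estimated on a sample grid.

```python
def f(x):
    """Hypothetical discontinuous objective on X = [-1, 1]."""
    return -1.0 if x < 0 else x


def in_D(x):
    """Hypothetical feasible set D = [0, 1]."""
    return 0.0 <= x <= 1.0


F_UPPER_ON_D = 1.0  # assumed upper bound of f on D


def f_penalized(x):
    """Penalty reformulation: f on D, the upper bound outside D (assumed form)."""
    return f(x) if in_D(x) else F_UPPER_ON_D


def sampled_inf(func, lo, hi, n=201):
    """Smallest sampled value of func on a uniform grid over [lo, hi]."""
    grid = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    return min(func(x) for x in grid)


def lower_bounds(l):
    """Sampled infimum of f and of the penalized f on X^l = [-1/l, 1/l]."""
    lo, hi = -1.0 / l, 1.0 / l
    return sampled_inf(f, lo, hi), sampled_inf(f_penalized, lo, hi)


if __name__ == "__main__":
    for l in (1, 10, 1000):
        print(l, lower_bounds(l))
```

Here \(\inf _{x\in D}f(x)=0\) is attained at the boundary point \(y=0\). The sampled infimum of \(f\) over \(X^l\) stays at \(-1\) for every \(l\) because of infeasible points, whereas the sampled infimum of the penalized function never drops below \(0\) and stays within \(l^{-1}\) of it.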
Remark 8
When the assumption that \(f\) attains its minimum on \(D\) in Theorem 10 is removed and Assumption 6 holds, one can still argue that \(\beta =\inf _{\mathbf{x}\in D}f(\mathbf{x})\) using Theorem 8 and Remark 3. However, the set of minimizers of \(f\) on \(D\), \(\text{ arg } \text{ min }_{\mathbf{x}\in D}f(\mathbf{x})\), is not defined in this case. Instead, consider the set
In this case \(X_{\min }\subset \text{ arg } \text{ inf }_{\mathbf{x}\in D}f(\mathbf{x})\). This can be shown as follows:
Assume that the algorithm does not terminate after a finite number of steps. Consider the sequence of lower bounds \(\{\beta _k\}\) with \(\mathbf{x}^k_{\min }\), \(L_k\) and \(X^{L_k}\) as defined previously. From the construction of the algorithm it follows that \(\{\beta _k\}\) is a nondecreasing sequence with \(\beta _k\le \inf _{\mathbf{x}\in D} f(\mathbf{x})\). Hence, \(\beta =\lim _{k\rightarrow \infty }\beta _k\) exists and \(\beta \le \inf _{\mathbf{x}\in D} f(\mathbf{x})\). Let \(\mathbf{x}_{\min }^\dagger \) denote an element of the set of accumulation points of the sequence \(\{\mathbf{x}_{\min }^k\}\) and let \(\{\mathbf{x}_{\min }^{k_r}\}\) be a subsequence of \(\{\mathbf{x}_{\min }^k\}\) with subsequential limit \(\mathbf{x}_{\min }^\dagger \). Since the partition subdivision is exhaustive and the selection operation is bound improving, a finite number of partition elements is visited in each iteration only. Consequently, a decreasing subsequence of successively refined partition elements \(\{X^{q^\prime }\}\subset \{X^{L_{k_r}}\}\) exists such that \(\lim _{q^\prime \rightarrow \infty }X^{q^\prime }=\{\mathbf{x}_{\min }^\dagger \}\). Since the lower bounding operation is strongly consistent, there exists a subsequence \(\{X^q\}\subset \{X^{q^\prime }\}\) such that \(\lim _{q\rightarrow \infty }\beta (X^q)\ge \min \{\liminf _{\mathbf{x}\rightarrow \mathbf{x}_{\min }^\dagger }f(\mathbf{x}),f(\mathbf{x}_{\min }^\dagger )\}\). The “deletion by infeasibility” rule is certain in the limit so that \(\mathbf{x}_{\min }^\dagger \in D\). Thus, \(\inf _{\mathbf{x}\in D} f(\mathbf{x})\ge \beta \ge \min \{\liminf _{\mathbf{x}\rightarrow \mathbf{x}_{\min }^\dagger }f(\mathbf{x}),f(\mathbf{x}_{\min }^\dagger )\}\). By assumption, \(f(\mathbf{y})\ge \inf _{\mathbf{x}\in D} f(\mathbf{x})\) when \(\mathbf{y}\notin D\) so that
Thus, the result follows.
Remark 9
Assumption 5, which implicitly presumes that \(f\) attains its minimum on \(D\), is used in Theorem 11. The latter can be modified when the minimum of \(f\) on \(D\) is not attained: define \(\tilde{\mathbf{x}}_{\min }\in D\) as the limit of a sequence \(\{\mathbf{x}^l\}\subset D\) with \(\lim _{l\rightarrow \infty }f(\mathbf{x}^l)=f^*\). Suppose that, for every \(\varepsilon >0\), there exist a \(\delta >0\) and an \(\mathbf{x}\in D\) for which \(\Vert \mathbf{x}-\tilde{\mathbf{x}}_{\min }\Vert <\delta \), \(\mathbf{x}\ne \tilde{\mathbf{x}}_{\min }\), \(f(\mathbf{x})\le f^*+\varepsilon \) hold. Under this assumption, consistency of the lower bounding operation can be argued following a proof similar to the one of Theorem 11.
Remark 10
In Theorem 12 it was assumed that \(f\) is lower semicontinuous. This assumption was utilized therein to assert that sublevel sets of \(f\) are closed. A similar statement is not possible when the assumption of lower semicontinuity of \(f\) is dropped, because closedness of all sublevel sets is equivalent to lower semicontinuity. Consider a discontinuous function with the following property: there exist two sequences \(\{\mathbf{y}^l\},\{\mathbf{z}^l\}\subset D\) with limits \(\mathbf{y}^*\ne \mathbf{z}^*\), respectively, so that \(\lim _{l\rightarrow \infty }f(\mathbf{y}^l)=f^*=\lim _{l\rightarrow \infty }f(\mathbf{z}^l)\) and let \(f(\mathbf{y}^*)=f^*\ne f(\mathbf{z}^*)\). The branch-and-bound algorithm is not able to fathom any partition element that contains an infinite number of elements of \(\{\mathbf{z}^l\}\). Consequently, \(\mathbf{y}^*\) and \(\mathbf{z}^*\) are accumulation points of \(\{\mathbf{x}^k\}\), whereas, in the strict sense, only \(\mathbf{y}^*\) solves (P). However, \(\mathbf{z}^*\) is in the set \(\text{ arg } \text{ inf }_{\mathbf{x}\in D}f(\mathbf{x})\) as defined by Eq. (3). Using the argument presented in this remark and asserting Assumption 6, one can show that, for any accumulation point \(\mathbf{x}^\dagger \) of \(\{\mathbf{x}^k\}\), \(\mathbf{x}^\dagger \in \text{ arg } \text{ inf }_{\mathbf{x}\in D}f(\mathbf{x})\) holds.
Wechsung, A., Barton, P.I. Global optimization of bounded factorable functions with discontinuities. J Glob Optim 58, 1–30 (2014). https://doi.org/10.1007/s10898-013-0060-3
DOI: https://doi.org/10.1007/s10898-013-0060-3
Keywords
 Global optimization
 Discontinuous functions
 Convex relaxations
 McCormick relaxations
 Nonconvex optimization