
Global optimization of bounded factorable functions with discontinuities

Abstract

A deterministic global optimization method is developed for a class of discontinuous functions. McCormick’s method to obtain relaxations of nonconvex functions is extended to discontinuous factorable functions by representing a discontinuity with a step function. The properties of the relaxations are analyzed in detail; in particular, convergence of the relaxations to the function is established given some assumptions on the bounds derived from interval arithmetic. The obtained convex relaxations are used in a branch-and-bound scheme to formulate lower bounding problems. Furthermore, convergence of the branch-and-bound algorithm for discontinuous functions is analyzed and assumptions are derived to guarantee convergence. A key advantage of the proposed method over reformulating the discontinuous problem as a MINLP or MPEC is avoiding the increase in problem size that slows global optimization. Several numerical examples for the global optimization of functions with discontinuities are presented, including ones taken from process design and equipment sizing as well as discrete-time hybrid systems.



Acknowledgments

This work was supported by Statoil as part of the paired Ph.D. research program in gas technologies between MIT and NTNU.

Author information

Correspondence to Paul I. Barton.

Appendices

Appendix A: Discussion of sufficient conditions for convergence of the relaxations

Here, three lemmata will be given that present sufficient conditions for Assumption 4 to hold for a given factor \(v_k\) and thus can be used in a finite induction argument to establish Assumption 4. In particular, they formalize the discussion in Sect. 2.4 and show that, to establish Assumption 4, it is sufficient to exclude these cases from occurring. First, overestimation in binary operations is considered. Here, two reasonably strong results can be given. Then, attention will be directed to univariate functions where more restrictive assumptions need to be made.

Lemma 6

Consider any \(k\) such that \(n<k\le m\) where \(v_k\) is defined by a summation or multiplication. Consider a nested sequence of intervals \(X^l\rightarrow X^*=[\mathbf{x}^*,\mathbf{x}^*]\), \(X^l\in I\!X\), and \(X^l\ne X^*\). Suppose Assumption 4 holds for all \(i,j<k\). Suppose that \(v_i\) and \(v_j\) are discontinuous with respect to \(\mathbf{x}\) at \(\mathbf{x}^*\) and that these discontinuities are introduced at earlier factors \(k_i\le i\) and \(k_j\le j\), i.e., \(v_{k_i}=\pi (v_{r_i})\) and \(v_{k_j}=\pi (v_{r_j})\), \(r_i<k_i\) and \(r_j<k_j\). Assume that \(v_{k_i}\) and \(v_{k_j}\) are the only discontinuous step mappings. Define subsets of \(X^l\) as \(\varXi _i^l=\{\mathbf{x}\in X^l:v_{r_i}(\mathbf{x})>0\}\) and \(\varXi _j^l=\{\mathbf{x}\in X^l:v_{r_j}(\mathbf{x})>0\}\). If there exists an \(L\in \mathbb N \) so that for all \(l>L\),

$$\begin{aligned} \varXi _i^l\cap \varXi _j^l\ne \emptyset ,\quad \varXi _i^l\cap (X^l\backslash \varXi _j^l)\ne \emptyset ,\quad (X^l\backslash \varXi _i^l)\cap \varXi _j^l\ne \emptyset ,\quad (X^l\backslash \varXi _i^l)\cap (X^l\backslash \varXi _j^l)\ne \emptyset , \end{aligned}$$

then Assumption 4 holds for \(k\).

Proof

By assumption, there exist four sequences \(\{\mathbf{x}^l_1\},\ldots ,\{\mathbf{x}^l_4\}\) converging to \(\mathbf{x}^*\) where \(\mathbf{x}^l_1\in \varXi _i^l\cap \varXi _j^l\), \(\mathbf{x}^l_2\in \varXi _i^l\cap (X^l\backslash \varXi _j^l)\), \(\mathbf{x}^l_3\in (X^l\backslash \varXi _i^l)\cap \varXi _j^l\), and \(\mathbf{x}^l_4\in (X^l\backslash \varXi _i^l)\cap (X^l\backslash \varXi _j^l)\).

For any \(X^l\) with \(l>L\), the image of \(v_{k_i}\) is \(V_{k_i}^l=[0,1]\) as \(\varXi _i^l\) is a nonempty strict subset of \(X^l\), \(v_{k_i}(\mathbf{x}_q^l)=1\) for \(q=1,2\) and \(v_{k_i}(\mathbf{x}_q^l)=0\) for \(q=3,4\). Thus, \(V_{k_i}^l\) is an exact bound of the range of \(v_{k_i}\). Consider the finite sequence of \(s+1\) continuous factors, say \(v_{i_1},\ldots ,v_{i_s},v_i\) with \(k_i<i_1<\cdots <i_s<i\), that maps \(V_{k_i}\) to \(V_i\). By assumption, other arguments involved in the definition of the factors \(v_{i_1},\ldots ,v_{i_s},v_i\) are continuous step mappings and, as a result, their corresponding interval bounds converge to degenerate intervals as \(l\rightarrow \infty \).

Consider factor \(v_{i_1}\) and let \([\underline{v}_{i_1}^*,\overline{v}_{i_1}^*]=\lim _{l\rightarrow \infty }[\underline{v}_{i_1}^l,\overline{v}_{i_1}^l]\). If this step mapping is a binary operation combining \(v_{k_i}\) with a continuous factor, \(V_{i_1}\) will converge to a non-degenerate interval and, without loss of generality, \(\underline{v}_{i_1}^*=\lim _{l\rightarrow \infty }v_{i_1}^l(\mathbf{x}_q^l)\) for \(q=1,2\) and \(\overline{v}_{i_1}^*=\lim _{l\rightarrow \infty }v_{i_1}^l(\mathbf{x}_q^l)\) for \(q=3,4\). If this step mapping is a univariate operation, Assumption 4 guarantees that \(V_{i_1}\) will converge to the exact bounds, i.e., without loss of generality, \(\underline{v}_{i_1}^*=\lim _{l\rightarrow \infty }v_{i_1}^l(\mathbf{x}_q^l)\) for \(q=1,2\) and \(\overline{v}_{i_1}^*=\lim _{l\rightarrow \infty }v_{i_1}^l(\mathbf{x}_q^l)\) for \(q=3,4\). Repeating this argument for the factors \(v_{i_2},\ldots ,v_{i_s},v_i\), it follows without loss of generality that \(\underline{v}_i^*=\lim _{l\rightarrow \infty }v_i^l(\mathbf{x}_q^l)\) for \(q=1,2\) and \(\overline{v}_i^*=\lim _{l\rightarrow \infty }v_i^l(\mathbf{x}_q^l)\) for \(q=3,4\) where \([\underline{v}_i^*,\overline{v}_i^*]=\lim _{l\rightarrow \infty }[\underline{v}_i^l,\overline{v}_i^l]\). It can be argued similarly that, without loss of generality, \(\underline{v}_j^*=\lim _{l\rightarrow \infty }v_j^l(\mathbf{x}_q^l)\) for \(q=1,3\) and \(\overline{v}_j^*=\lim _{l\rightarrow \infty }v_j^l(\mathbf{x}_q^l)\) for \(q=2,4\) where \([\underline{v}_j^*,\overline{v}_j^*]=\lim _{l\rightarrow \infty }[\underline{v}_j^l,\overline{v}_j^l]\).

Thus, each combination of the bounds of \(v_i\) and \(v_j\) is attained in a neighborhood of \(\mathbf{x}^*\). In particular, in the case of addition, the sequences \(\{v_i(\mathbf{x}^l_1)\}\), \(\{v_j(\mathbf{x}^l_1)\}\) and \(\{v_i(\mathbf{x}^l_4)\}\), \(\{v_j(\mathbf{x}^l_4)\}\) converge to \(\underline{v}_i^*\), \(\underline{v}_j^*\) and \(\overline{v}_i^*\), \(\overline{v}_j^*\), respectively. Thus, \([\underline{v}_k^*,\overline{v}_k^*]=[\underline{v}_i^*,\overline{v}_i^*]+[\underline{v}_j^*,\overline{v}_j^*]\) is, in the limit, an exact bound. A similar argument can be presented for the case of multiplication. Here, each combination of lower and upper bounds on \(v_i\) and \(v_j\) is realized by a different sequence \(\{\mathbf{x}_q^l\}\), \(q=1,\ldots ,4\). Thus, Assumption 4 holds for \(k\). \(\square \)

Remark 5

  • Lemma 6 considers the case of adding or multiplying \(v_i\) and \(v_j\) where \(v_i\) and \(v_j\) are discontinuous in the limit \(\mathbf{x}^*\) and these discontinuities are introduced by exactly one \(\pi \) function each. Then, the dependency problem in interval arithmetic can be mitigated when there exist regions in each interval \(X^l\) so that all combinations of the lower and upper bounds of the factors \(v_i\) and \(v_j\) are attained. This can be alternatively expressed as requiring that the intrinsic discontinuities do not coincide in a neighborhood of \(\mathbf{x}^*\). A case where this hypothesis of Lemma 6 holds is illustrated in Fig. 6a.

  • A counterexample can be given to show that Lemma 6 cannot be easily extended to the case when more than \(n\) intrinsic discontinuities coincide at \(\mathbf{x}^*\in \mathbb{R }^n\). To see this, consider \(f(\mathbf{x})=1+\pi (x_1)+\pi (x_2)-\pi (x_1+x_2)\), \(X=[-1,1]^2\) and \(X^l=[-l^{-1},l^{-1}]^2\). As shown in Fig. 6b, three intrinsic discontinuities coincide at \((0,0)\). The bounds of \(f\) on \(X^l\) obtained from interval arithmetic are \(\underline{f}^l=0\) and \(\overline{f}^l=3\). They are not attained for any \(\mathbf{x}\in X^l\) and any \(l\), since \(f\) only takes the values 1 and 2 on each \(X^l\), and thus Assumption 4 does not hold.

  • Also note that, given Assumption 4, the exacerbated dependency problem of interval arithmetic is not acute when there is only one discontinuity present in either \(v_i\) or \(v_j\) at \(\mathbf{x}^*\). This has been exploited in the proof of Lemma 6.

  • Lastly, observe that the hypotheses of Lemma 6 cannot be satisfied when \(X^l\subset \mathbb{R }\). At most three subsets of \(X\) in the vicinity of \(x^*\), \(\{x:x<x^*\}\), \(\{x:x=x^*\}\) and \(\{x:x>x^*\}\), are conceivable where \(v_i\) and \(v_j\) could attain their lower and upper bounds. To guarantee that Assumption 4 holds for \(v_k\), the interval arithmetic for \(v_i+v_j\) or \(v_iv_j\) needs to combine the bounds in such a way that \(v_k\) attains both its lower and upper bound. However, it is easy to conceive counterexamples where this is not true, e.g., see the discussion prior to Lemma 6.
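The counterexample in Remark 5 can be checked numerically. The following is a minimal sketch, assuming the step convention \(\pi (v)=1\) for \(v>0\) and \(\pi (v)=0\) otherwise (consistent with the sets \(\varXi \) above); the function names and the grid sampling are illustrative choices and are not taken from the paper.

```python
import itertools

def step(v):
    """Step function pi: 1 if v > 0, else 0 (convention assumed here)."""
    return 1.0 if v > 0 else 0.0

def f(x1, x2):
    """Counterexample f(x) = 1 + pi(x1) + pi(x2) - pi(x1 + x2)."""
    return 1.0 + step(x1) + step(x2) - step(x1 + x2)

def interval_bounds_f():
    """Natural interval bounds of f on any box with the origin in its interior.

    Each step term is enclosed by [0, 1], so the result does not depend on l.
    """
    pi_lo, pi_hi = 0.0, 1.0
    lower = 1.0 + pi_lo + pi_lo - pi_hi
    upper = 1.0 + pi_hi + pi_hi - pi_lo
    return lower, upper

def sampled_range(l, n=41):
    """Set of values of f on a uniform n-by-n grid over X^l = [-1/l, 1/l]^2."""
    h = 1.0 / l
    grid = [-h + 2.0 * h * i / (n - 1) for i in range(n)]
    return {f(a, b) for a, b in itertools.product(grid, grid)}

if __name__ == "__main__":
    print("interval bounds:", interval_bounds_f())
    for l in (1, 10, 1000):
        print("l =", l, "sampled values:", sorted(sampled_range(l)))
```

However small the box becomes, the sampled values stay within \(\{1,2\}\) while the interval bounds remain \([0,3]\), which is the gap that violates Assumption 4.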

Fig. 6 Illustrations for Assumption 4 when \(X\subset \mathbb{R }^2\). The curves indicate discontinuities introduced at previous factors

Though it was pointed out that there are counterexamples restricting the generalization of Lemma 6 when more than 2 intrinsic discontinuities coincide at \(\mathbf{x}^*\) in \(\mathbb{R }^2\), a generalization is possible to \(n\) intrinsic discontinuities coinciding in \(\mathbb{R }^n\).

Lemma 7

Consider any \(k\) such that \(n<k\le m\) where \(v_k\) is defined by summation or multiplication. Suppose Assumption 4 holds for all \(i,j<k\). Consider a nested sequence of intervals \(X^l\rightarrow X^*=[\mathbf{x}^*,\mathbf{x}^*]\), \(X^l\in I\!X\), \(X^l\ne X^*\). Suppose that \(v_i\) and \(v_j\) are discontinuous with respect to \(\mathbf{x}\) at \(\mathbf{x}^*\) and that these discontinuities are introduced by \(q\le n\) earlier factors \(k_1,\ldots ,k_q\), i.e., \(v_{k_{\tilde{q}}}=\pi (v_{r_{\tilde{q}}})\) with \(v_{r_{\tilde{q}}}(\mathbf{x}^*)=0\) for \(\tilde{q}=1,\ldots ,q\). Assume that \(v_{r_{\tilde{q}}}\) is differentiable with respect to \(\mathbf{x}\) at \(\mathbf{x}^*\), for all \(\tilde{q}=1,\ldots ,q\), and denote the gradient of \(v_{r_{\tilde{q}}}\) at \(\mathbf{x}^*\) as \(\varvec{\nabla }v_{\tilde{q}}\). If \(\varvec{\nabla }v_1,\ldots ,\varvec{\nabla }v_q\) are linearly independent, then Assumption 4 holds for \(k\).

Proof

Define subsets of \(X^l\) as \(\varXi _{\tilde{q}}^l=\{\mathbf{x}\in X^l:v_{r_{\tilde{q}}}(\mathbf{x})>0\}\), \(\tilde{q}=1,\ldots ,q\). Requiring linear independence of \(\varvec{\nabla }v_1,\ldots ,\varvec{\nabla }v_q\) is a sufficient condition for the existence of \(2^q\) nonempty subsets of \(X^l\) that realize all combinations of \(\varXi _{\tilde{q}}^l\) with \(\varXi _{\hat{q}}^l\) or \(X^l\backslash \varXi _{\hat{q}}^l\), \(\hat{q}=1,\ldots ,q\), \(\hat{q}\ne \tilde{q}\), for all \(l>L\) for some \(L\in \mathbb N \). Thus, the argument used in the proof of Lemma 6 can be extended to show that each possible combination of the bounds on intermediate factors is indeed realized. \(\square \)
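The role of linear independence in Lemma 7 can be illustrated with a small experiment. The sketch below assumes that each \(v_{r_{\tilde{q}}}\) is replaced by its linearization \(\varvec{\nabla }v_{\tilde{q}}^{\mathrm{T}}\mathbf{x}\) around \(\mathbf{x}^*=\mathbf{0}\), and it counts how many of the \(2^q\) sign patterns occur on a grid over a small box. The helper name `sign_patterns` and the sample gradients are hypothetical.

```python
import itertools

def sign_patterns(gradients, half_width, n_grid=21):
    """Collect the patterns (g_1.x > 0, ..., g_q.x > 0) over a grid on [-h, h]^n.

    gradients: list of q gradient vectors (each a tuple of length n),
               standing in for the gradients of v_r at x* = 0.
    """
    n = len(gradients[0])
    axis = [-half_width + 2.0 * half_width * i / (n_grid - 1)
            for i in range(n_grid)]
    patterns = set()
    for x in itertools.product(axis, repeat=n):
        pattern = tuple(
            sum(g_c * x_c for g_c, x_c in zip(g, x)) > 0 for g in gradients
        )
        patterns.add(pattern)
    return patterns

if __name__ == "__main__":
    independent = [(1.0, 0.0), (1.0, 1.0)]   # linearly independent gradients
    dependent = [(1.0, 1.0), (2.0, 2.0)]     # linearly dependent gradients
    print(len(sign_patterns(independent, 1e-3)))
    print(len(sign_patterns(dependent, 1e-3)))
```

With independent gradients all four combinations appear in the box, whereas the dependent pair only produces two of them, so some combinations of bounds used by interval arithmetic are never realized.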

Lemma 8

Consider any \(k\) such that \(n<k\le m\) where \(v_k\) is defined by \(v_k=\pi (v_i)\). Consider a nested sequence of intervals \(X^l\rightarrow X^*=[\mathbf{x}^*,\mathbf{x}^*]\), \(X^l\in I\!X\), \(X^l\ne X^*\). Suppose either

  1. that \(v_i(\mathbf{x}^*)=0\) and that for all \(l>0\) there exist an \(\mathbf{x}_l^\dagger \in X^l\) and an \(\varepsilon _l>0\) so that \(v_i(\mathbf{x}_l^\dagger )=\varepsilon _l\),

  2. that there exists an \(L_1>0\) so that \(\overline{v}_i^l\le 0\) for all \(l\ge L_1\), or

  3. that there exists an \(L_2>0\) so that \(\underline{v}_i^l> 0\) for all \(l\ge L_2\).

Then, Assumption 4 holds for \(k\).

Proof

Consider Case 1. By assumption, \(\underline{v}_k^l=0\) and \(\overline{v}_k^l=1\), \(\forall l\) so that \(\lim _{l\rightarrow \infty }[\underline{v}_k^l,\overline{v}_k^l]=[0,1]\). Furthermore, it holds that

$$\begin{aligned} \left[ \lim _{l\rightarrow \infty }\inf _{\mathbf{x}\in X^l}v_k(\mathbf{x}),\lim _{l\rightarrow \infty }\sup _{\mathbf{x}\in X^l}v_k(\mathbf{x})\right]&= \left[ \lim _{l\rightarrow \infty }\inf _{\mathbf{x}\in X^l}\pi (v_i(\mathbf{x})),\lim _{l\rightarrow \infty }\sup _{\mathbf{x}\in X^l}\pi (v_i(\mathbf{x}))\right] \\&= \left[ \pi (v_i(\mathbf{x}^*)),\lim _{l\rightarrow \infty }\pi (v_i(\mathbf{x}_l^\dagger ))\right] \\&= \left[ \pi (0),\lim _{l\rightarrow \infty }\pi (\varepsilon _l)\right] =[0,1]. \end{aligned}$$

Consider Case 2. By assumption, \([\underline{v}_k^l,\overline{v}_k^l]=[0,0]\) for all \(l\ge L_1\). Thus, \(v_k(\mathbf{x})=0\) for all \(\mathbf{x}\in X^{L_1}\) so that \([\lim _{l\rightarrow \infty }\inf _{\mathbf{x}\in X^l}v_k(\mathbf{x}),\lim _{l\rightarrow \infty }\sup _{\mathbf{x}\in X^l}v_k(\mathbf{x})]=[0,0]=\lim _{l\rightarrow \infty }[\underline{v}_k^l,\overline{v}_k^l]\).

Consider Case 3. By assumption, \([\underline{v}_k^l,\overline{v}_k^l]=[1,1]\) for all \(l\ge L_2\). Thus, \(v_k(\mathbf{x})=1\) for all \(\mathbf{x}\in X^{L_2}\) so that \([\lim _{l\rightarrow \infty }\inf _{\mathbf{x}\in X^l}v_k(\mathbf{x}),\lim _{l\rightarrow \infty }\sup _{\mathbf{x}\in X^l}v_k(\mathbf{x})]=[1,1]=\lim _{l\rightarrow \infty }[\underline{v}_k^l,\overline{v}_k^l]\).

Thus, Eq. (2) and, hence, Assumption 4 hold for factor \(k\). \(\square \)
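The three cases of Lemma 8 correspond to the three branches of an interval extension of the step function. The following is a minimal sketch under the convention \(\pi (v)=1\) for \(v>0\) and \(\pi (v)=0\) otherwise; the function name `step_interval` is hypothetical and does not refer to any particular interval library.

```python
def step_interval(lo, hi):
    """Interval bounds of pi(v) for v in [lo, hi], with pi(v) = 1 iff v > 0."""
    if lo > hi:
        raise ValueError("empty interval: lo must not exceed hi")
    if hi <= 0.0:
        # Case 2 of Lemma 8: the argument is never positive.
        return (0.0, 0.0)
    if lo > 0.0:
        # Case 3 of Lemma 8: the argument is always positive.
        return (1.0, 1.0)
    # Remaining case: the bounds of the argument straddle zero. The result
    # [0, 1] is exact only if the argument actually takes both a positive
    # and a nonpositive value, as required in Case 1 of Lemma 8.
    return (0.0, 1.0)

if __name__ == "__main__":
    print(step_interval(-1.0, 0.0))
    print(step_interval(0.5, 2.0))
    print(step_interval(-0.1, 0.1))
```

Note that the last branch can overestimate when the bounds on \(v_i\) are themselves not attained, which is why Case 1 asks for points \(\mathbf{x}_l^\dagger \) with \(v_i(\mathbf{x}_l^\dagger )>0\).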

Lemma 9

Consider a nested sequence of intervals \(X^l\rightarrow X^*\), \(X^l\in I\!X\), \(X^l\ne X^*\) and a continuous function \(f:X\rightarrow \mathbb{R }\). Then,

$$\begin{aligned} \lim _{l\rightarrow \infty }\inf _{\mathbf{x}\in X^l}f(\mathbf{x})=\inf _{\mathbf{x}\in X^*}f(\mathbf{x}) \quad \text{ and } \quad \lim _{l\rightarrow \infty }\sup _{\mathbf{x}\in X^l}f(\mathbf{x})=\sup _{\mathbf{x}\in X^*}f(\mathbf{x}). \end{aligned}$$

Proof

Fix \(\varepsilon >0\). Let \(\mathbf{x}^*_{\min }\in \text{ arg } \text{ min }_{\mathbf{x}\in X^*}f(\mathbf{x})\); the infimum is attained since \(X^*\) is compact and \(f\) is continuous on \(X^*\). Since \(X^l\subset X\) is compact and \(f\) is continuous on \(X\), \(f\) is uniformly continuous on \(X^l\). Uniform continuity of \(f\) implies that \(\exists \delta >0\) so that \(|f(\mathbf{x})-f(\mathbf{y})|<\varepsilon \) for all \(\mathbf{x},\mathbf{y}\in X^l\) for which \(\Vert \mathbf{x}-\mathbf{y}\Vert <\delta \sqrt{n}\). Convergence of \(X^l\) to \(X^*\) implies that there is an \(L>0\) so that \(d_H(X^l,X^*)<\delta \) for all \(l>L\). By definition of the Hausdorff metric, \(\underline{x}_i^l>\underline{x}_i^*-\delta \) and \(\overline{x}_i^l<\overline{x}_i^*+\delta \) for all \(l>L\) and \(i=1,\ldots ,n\). Thus, for every \(\mathbf{x}^\dagger \in X^l\backslash X^*\) there is a point \(\mathbf{x}^\ddagger \in \partial X^*\), with \(\partial X^*\) denoting the boundary of \(X^*\), such that \(\Vert \mathbf{x}^\dagger -\mathbf{x}^\ddagger \Vert <\delta \sqrt{n}\) and hence \(f(\mathbf{x}^\dagger )+\varepsilon >f(\mathbf{x}^\ddagger )\). By definition, \(f(\mathbf{x})\ge f(\mathbf{x}^*_{\min })\), \(\forall \mathbf{x}\in X^*\) so that \(f(\mathbf{x}^\ddagger )\ge f(\mathbf{x}^*_{\min })\). As a result, \(f(\mathbf{x})+\varepsilon >f(\mathbf{x}^*_{\min })\) for all \(\mathbf{x}\in X^l\) with \(l>L\). Since \(X^l\supset X^*\), \(\inf _{\mathbf{x}\in X^l}f(\mathbf{x})\le f(\mathbf{x}^*_{\min })\) for all \(l\). Since \(\varepsilon \) is arbitrary, it follows that \(\lim _{l\rightarrow \infty }\inf _{\mathbf{x}\in X^l}f(\mathbf{x})=\inf _{\mathbf{x}\in X^*}f(\mathbf{x})\). An analogous argument can be made to show that \(\lim _{l\rightarrow \infty }\sup _{\mathbf{x}\in X^l}f(\mathbf{x})=\sup _{\mathbf{x}\in X^*}f(\mathbf{x})\). \(\square \)

Lemma 10

Consider any \(k\) such that \(n<k\le m\) where \(v_k\) is defined by a continuous univariate intrinsic function \(\varphi _k\). Suppose Assumption 4 holds for all \(i<k\). Consider a nested sequence of intervals \(X^l\rightarrow X^*=[\mathbf{x}^*,\mathbf{x}^*]\), \(X^l\in I\!X\), \(X^l\ne X^*\). Let \([\underline{v}_i^*,\overline{v}_i^*]=\lim _{l\rightarrow \infty }[\underline{v}_i^l,\overline{v}_i^l]\). Then, Assumption 4 holds for \(k\) if

$$\begin{aligned} \min \{\varphi _k(\underline{v}_i^*),\varphi _k(\overline{v}_i^*)\} =\underline{\varPhi }_k([\underline{v}_i^*,\overline{v}_i^*]) \quad \text{ and } \quad \max \{\varphi _k(\underline{v}_i^*),\varphi _k(\overline{v}_i^*)\} =\overline{\varPhi }_k([\underline{v}_i^*,\overline{v}_i^*]). \end{aligned}$$

Proof

First, suppose that \(v_i\) is continuous with respect to \(\mathbf{x}\) at \(\mathbf{x}^*\). Then, \([\underline{v}_i^*,\overline{v}_i^*]\) is a degenerate interval. Since \(\varPhi _k\) is an interval extension, \(\varPhi _k([\underline{v}_i^*,\overline{v}_i^*])\) is also a degenerate interval and, hence, Eq. (2) holds.

Next, suppose that \(v_i\) is not continuous with respect to \(\mathbf{x}\) at \(\mathbf{x}^*\). Since Assumption 4 holds for factor \(i\), it follows that \(\lim _{l\rightarrow \infty }[\underline{v}_i^l,\overline{v}_i^l]=[\lim _{l\rightarrow \infty }\inf _{\mathbf{x}\in X^l}v_i(\mathbf{x}), \lim _{l\rightarrow \infty }\sup _{\mathbf{x}\in X^l}v_i(\mathbf{x})]\). Consider the sequence \(V_i^l=[\underline{v}_i^l,\overline{v}_i^l]\) converging to \(V_i^*=[\underline{v}_i^*,\overline{v}_i^*]\). According to Lemma 9, it holds that \(\lim _{l\rightarrow \infty }\inf _{z\in V_i^l}\varphi _k(z)=\inf _{z\in V_i^*}\varphi _k(z)\) and that \(\lim _{l\rightarrow \infty }\sup _{z\in V_i^l}\varphi _k(z)=\sup _{z\in V_i^*}\varphi _k(z)\). The hypotheses of the lemma imply furthermore that \(\underline{\varPhi }_k([\underline{v}_i^*,\overline{v}_i^*])=\inf _{z\in [\underline{v}_i^*,\overline{v}_i^*]}\varphi _k(z)\) and that \(\overline{\varPhi }_k([\underline{v}_i^*,\overline{v}_i^*])=\sup _{z\in [\underline{v}_i^*,\overline{v}_i^*]}\varphi _k(z)\). Therefore it follows that

$$\begin{aligned} \left[ \lim _{l\rightarrow \infty }\inf _{\mathbf{x}\in X^l}v_k(\mathbf{x}),\lim _{l\rightarrow \infty }\sup _{\mathbf{x}\in X^l}v_k(\mathbf{x})\right]&= \left[ \lim _{l\rightarrow \infty }\inf _{z\in V_i^l}\varphi _k(z),\lim _{l\rightarrow \infty }\sup _{z\in V_i^l}\varphi _k(z)\right] \\&= \left[ \inf _{z\in V_i^*}\varphi _k(z),\sup _{z\in V_i^*}\varphi _k(z)\right] \!=\!\varPhi _k\left( \left[ \underline{v}_i^*,\overline{v}_i^*\right] \right) \!=\!\left[ \underline{v}_k^*,\overline{v}_k^*\right] , \end{aligned}$$

i.e., Eq. (2) holds and, hence, Assumption 4 is established for factor \(k\). \(\square \)

Remark 6

Monotone functions are an example of a class of univariate intrinsic functions \(\varphi \) that can meet the hypotheses of Lemma 10. However, the specific implementation of \(\varPhi \) dictates whether \(\varphi \) indeed meets the hypotheses of Lemma 10.
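The endpoint condition of Lemma 10 can be tested directly for a given pair \((\varphi _k,\varPhi _k)\). The sketch below is illustrative only: the helper names are hypothetical, the interval extensions are written by hand rather than taken from a library, and the limit interval \([0,1]\) is chosen because it is what a step function produces.

```python
import math

def endpoint_hull(phi, lo, hi):
    """[min, max] of phi evaluated at the two endpoints of [lo, hi]."""
    a, b = phi(lo), phi(hi)
    return (min(a, b), max(a, b))

def meets_endpoint_condition(phi, interval_ext, lo, hi, tol=1e-12):
    """Check whether the endpoint values of phi match the bounds of interval_ext."""
    e_lo, e_hi = endpoint_hull(phi, lo, hi)
    b_lo, b_hi = interval_ext(lo, hi)
    return abs(e_lo - b_lo) <= tol and abs(e_hi - b_hi) <= tol

def exp_ext(lo, hi):
    """Exact interval extension of exp (monotonically increasing)."""
    return (math.exp(lo), math.exp(hi))

def shifted_square(z):
    """Non-monotone example phi(z) = (z - 0.5)^2."""
    return (z - 0.5) ** 2

def shifted_square_ext(lo, hi):
    """Exact range of (z - 0.5)^2 on [lo, hi]."""
    a, b = shifted_square(lo), shifted_square(hi)
    lower = 0.0 if lo <= 0.5 <= hi else min(a, b)
    return (lower, max(a, b))

if __name__ == "__main__":
    print(meets_endpoint_condition(math.exp, exp_ext, 0.0, 1.0))
    print(meets_endpoint_condition(shifted_square, shifted_square_ext, 0.0, 1.0))
```

The monotone function passes the check, while the shifted square fails it: a factor that only takes the values 0 and 1 yields \((z-0.5)^2=0.25\) at both, yet the interval extension on \([0,1]\) reports a lower bound of 0.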

Appendix B: More general convergence results for branch-and-bound algorithm

In Sect. 3, it was assumed that \(f\) is either lower semi-continuous or attains its minimum on \(D\). Results are outlined below that hold even when these assumptions are generalized.

Remark 7

When the assumption that \(f\) is lower semi-continuous is dropped in Theorem 9, then one cannot appeal to Theorem 8. However, with Remark 3 in mind, one can argue that \(\lim _{q\rightarrow \infty }\beta (X^{l_q})=\lim _{q\rightarrow \infty }\inf _{\mathbf{x}\in X^{l_q}}f(\mathbf{x})\ge \min \{\lim _{q\rightarrow \infty }\inf _{\mathbf{x}\in X^{l_q}\backslash \{\mathbf{x}^*\}}f(\mathbf{x}),f(\mathbf{x}^*)\}\ge \min \{\liminf _{\mathbf{x}\rightarrow \mathbf{x}^*}f(\mathbf{x}),f(\mathbf{x}^*)\}\), which is sufficient to show that the lower bounding operation is strongly consistent.

Note that Remark 7 does not allow for the argument \(\lim _{q\rightarrow \infty }\beta (X^{l_q})=\inf _{\mathbf{x}\in X^*\cap D}f(\mathbf{x})\), and consequently \(\lim _{k\rightarrow \infty }\beta _k=\inf _{\mathbf{x}\in D}f(\mathbf{x})\), when \(f\) is not assumed to be lower semi-continuous. In particular, there may be an infinite decreasing sequence of nested intervals \(X^l\) so that there exists a \(\mathbf{y}\in \partial D\) with \(\mathbf{y}\in \text{ int }X^l\), \(\forall l\), i.e., all partition elements contain an element of the boundary of the feasible set in their interiors. Suppose that \(f(\mathbf{y})=\inf _{\mathbf{x}\in D}f(\mathbf{x})\). Then, it is conceivable that there exists an \(\varepsilon >0\) and a sequence \(\{\mathbf{z}^l\}\) with \(\mathbf{z}^l\notin D\), \(\mathbf{z}^l\in X^l\), \(\forall l\) so that \(f(\mathbf{z}^l)<f(\mathbf{y})-\varepsilon \). As a result, \(\lim _{l\rightarrow \infty }\beta (X^l)\le f(\mathbf{y})-\varepsilon \).

To avoid this complication, another assumption is introduced.

Assumption 6

Suppose \(f(\mathbf{y})\ge \inf _{\mathbf{x}\in D}f(\mathbf{x})\), \(\forall \mathbf{y}\in X:\mathbf{y}\notin D\).

This assumption can be satisfied by reformulating \(f\) as a penalty function, e.g., minimizing \(\tilde{f}\) with

$$\begin{aligned} \tilde{f}(\mathbf{x})= \left\{ \begin{array}{ll} f(\mathbf{x}), &{} \quad \text{ if } \mathbf{x}\in D, \\ \overline{f}(D), &{} \quad \text{ otherwise }, \end{array} \right. \end{aligned}$$

where \(\overline{f}(D)\) denotes an upper bound, e.g., derived from interval analysis, of \(f\) on \(D\).
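The penalty reformulation can be sketched as a small wrapper. This is an illustration under stated assumptions: `in_domain` is a hypothetical membership test for \(D\), and `f_upper` is any valid upper bound of \(f\) on \(D\) supplied by the user (for instance from interval analysis); neither name comes from the paper.

```python
def make_penalized(f, in_domain, f_upper):
    """Return f_tilde with f_tilde(x) = f(x) on D and f_upper elsewhere.

    f:         objective function
    in_domain: predicate returning True iff x lies in D (assumed available)
    f_upper:   an upper bound of f on D (assumed valid, not checked here)
    """
    def f_tilde(x):
        return f(x) if in_domain(x) else f_upper
    return f_tilde

if __name__ == "__main__":
    # Illustrative data: f(x) = x on X = [-1, 1] with D = [0, 1].
    # Outside D, f drops below inf_D f = 0, which violates Assumption 6.
    f = lambda x: x
    in_domain = lambda x: 0.0 <= x <= 1.0
    f_tilde = make_penalized(f, in_domain, f_upper=1.0)
    print(f(-0.5), f_tilde(-0.5), f_tilde(0.25))
```

Since the bound satisfies \(\overline{f}(D)\ge \inf _{\mathbf{x}\in D}f(\mathbf{x})\), the wrapped function meets Assumption 6 by construction and agrees with \(f\) on \(D\).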

Remark 8

When the assumption that \(f\) attains its minimum on \(D\) in Theorem 10 is removed and Assumption 6 holds, one can still argue that \(\beta =\inf _{\mathbf{x}\in D}f(\mathbf{x})\) using Theorem 8 and Remark 3. However, the set of minimizers of \(f\) on \(D\), \(\text{ arg } \text{ min }_{\mathbf{x}\in D}f(\mathbf{x})\), is not defined in this case. Instead, consider the set

$$\begin{aligned} \text{ arg } \text{ inf }_{\mathbf{x}\in D}f(\mathbf{x})\!\equiv \!\left\{ \mathbf{x}\in D:\exists \{\mathbf{z}^l\}\subset D \text{ with } \lim _{l\rightarrow \infty }\mathbf{z}^l\!=\!\mathbf{x} \text{ and } \lim _{l\rightarrow \infty }f(\mathbf{z}^l)\!=\!\inf _{\mathbf{z}\in D}f(\mathbf{z})\right\} . \end{aligned}$$
(3)

In this case \(X_{\min }\subset \text{ arg } \text{ inf }_{\mathbf{x}\in D}f(\mathbf{x})\). This can be shown as follows:

Assume that the algorithm does not terminate after a finite number of steps. Consider the sequence of lower bounds \(\{\beta _k\}\) with \(\mathbf{x}^k_{\min }\), \(L_k\) and \(X^{L_k}\) as defined previously. From the construction of the algorithm it follows that \(\{\beta _k\}\) is a nondecreasing sequence with \(\beta _k\le \inf _{\mathbf{x}\in D} f(\mathbf{x})\). Hence, \(\beta =\lim _{k\rightarrow \infty }\beta _k\) exists and \(\beta \le \inf _{\mathbf{x}\in D} f(\mathbf{x})\). Let \(\mathbf{x}_{\min }^\dagger \) denote an element of the set of accumulation points of the sequence \(\{\mathbf{x}_{\min }^k\}\) and let \(\{\mathbf{x}_{\min }^{k_r}\}\) be a subsequence of \(\{\mathbf{x}_{\min }^k\}\) with subsequential limit \(\mathbf{x}_{\min }^\dagger \). Since the partition subdivision is exhaustive and the selection operation is bound improving, only a finite number of partition elements is visited in each iteration. Consequently, a decreasing subsequence of successively refined partition elements \(\{X^{q^\prime }\}\subset \{X^{L_{k_r}}\}\) exists such that \(\lim _{q^\prime \rightarrow \infty }X^{q^\prime }=\{\mathbf{x}_{\min }^\dagger \}\). Since the lower bounding operation is strongly consistent, there exists a subsequence \(\{X^q\}\subset \{X^{q^\prime }\}\) such that \(\lim _{q\rightarrow \infty }\beta (X^q)\ge \min \{\liminf _{\mathbf{x}\rightarrow \mathbf{x}_{\min }^\dagger }f(\mathbf{x}),f(\mathbf{x}_{\min }^\dagger )\}\). The “deletion by infeasibility” rule is certain in the limit so that \(\mathbf{x}_{\min }^\dagger \in D\). Thus, \(\inf _{\mathbf{x}\in D} f(\mathbf{x})\ge \beta \ge \min \{\liminf _{\mathbf{x}\rightarrow \mathbf{x}_{\min }^\dagger }f(\mathbf{x}),f(\mathbf{x}_{\min }^\dagger )\}\). By assumption, \(f(\mathbf{y})\ge \inf _{\mathbf{x}\in D} f(\mathbf{x})\) when \(\mathbf{y}\notin D\) so that

$$\begin{aligned} \inf _{\mathbf{x}\in D} f(\mathbf{x})=\min \left\{ \liminf _{\mathbf{x}\rightarrow \mathbf{x}_{\min }^\dagger } f(\mathbf{x}),f(\mathbf{x}_{\min }^\dagger )\right\} =\beta . \end{aligned}$$

Thus, the result follows.

Remark 9

Assumption 5, which implicitly presumes that \(f\) attains its minimum on \(D\), is used in Theorem 11. The latter can be modified when the minimum of \(f\) on \(D\) is not attained: define \(\tilde{\mathbf{x}}_{\min }\in D\) as the limit of a sequence \(\{\mathbf{x}^l\}\subset D\) with \(\lim _{l\rightarrow \infty }f(\mathbf{x}^l)=f^*\). Suppose that, for every \(\varepsilon >0\), there exist a \(\delta >0\) and an \(\mathbf{x}\in D\) for which \(\Vert \mathbf{x}-\tilde{\mathbf{x}}_{\min }\Vert <\delta \), \(\mathbf{x}\ne \tilde{\mathbf{x}}_{\min }\), \(f(\mathbf{x})\le f^*+\varepsilon \) hold. Under this assumption, consistency of the lower bounding operation can be argued following a proof similar to the one of Theorem 11.

Remark 10

In Theorem 12 it was assumed that \(f\) is lower semi-continuous. This assumption was utilized therein to assert that sublevel sets of \(f\) are closed. A similar statement is not possible when the assumption of lower semi-continuity of \(f\) is dropped because lower semi-continuity of \(f\) is equivalent to closedness of its sublevel sets. Consider a discontinuous function with the following property: there exist two sequences \(\{\mathbf{y}^l\},\{\mathbf{z}^l\}\subset D\) with limits \(\mathbf{y}^*\ne \mathbf{z}^*\), respectively, so that \(\lim _{l\rightarrow \infty }f(\mathbf{y}^l)=f^*=\lim _{l\rightarrow \infty }f(\mathbf{z}^l)\) and let \(f(\mathbf{y}^*)=f^*\ne f(\mathbf{z}^*)\). The branch-and-bound algorithm is not able to fathom any partition element that contains an infinite number of elements of \(\{\mathbf{z}^l\}\). Consequently, \(\mathbf{y}^*\) and \(\mathbf{z}^*\) are accumulation points of \(\{\mathbf{x}^k\}\), whereas, in the strict sense, only \(\mathbf{y}^*\) solves (P). However, \(\mathbf{z}^*\) is in the set \(\text{ arg } \text{ inf }_{\mathbf{x}\in D}f(\mathbf{x})\) as defined by Eq. (3). Using the argument presented in this remark and asserting Assumption 6, one can show that, for any accumulation point \(\mathbf{x}^\dagger \) of \(\{\mathbf{x}^k\}\), \(\mathbf{x}^\dagger \in \text{ arg } \text{ inf }_{\mathbf{x}\in D}f(\mathbf{x})\) holds.

About this article

Cite this article

Wechsung, A., Barton, P.I. Global optimization of bounded factorable functions with discontinuities. J Glob Optim 58, 1–30 (2014). https://doi.org/10.1007/s10898-013-0060-3


  • DOI: https://doi.org/10.1007/s10898-013-0060-3
