Optimization-Constrained Differential Equations with Active Set Changes

Journal of Optimization Theory and Applications

Abstract

Foundational theory is established for nonlinear differential equations with embedded nonlinear optimization problems exhibiting active set changes. Existence, uniqueness, and continuation of solutions are shown, followed by lexicographically smooth (implying Lipschitzian) parametric dependence. The sensitivity theory found here accurately characterizes sensitivity jumps resulting from active set changes via an auxiliary nonsmooth sensitivity system obtained by lexicographic directional differentiation. The results in this article hold under easily verifiable regularity conditions (linear independence of constraints and strong second-order sufficiency), which are shown to imply generalized differentiation index one of a nonsmooth differential-algebraic equation system obtained by replacing the optimization problem with its optimality conditions and recasting the complementarity conditions as nonsmooth algebraic equations. The theory in this article is computationally relevant, allowing for implementation of dynamic optimization strategies (i.e., open-loop optimal control), and recovers (and rigorously formalizes) classical results in the absence of active set changes. Along the way, contributions are made to the theory of piecewise differentiable functions.
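As a minimal illustration of the last reformulation step mentioned in the abstract, the sketch below recasts a scalar complementarity condition (multiplier and slack both nonnegative, with zero product) as a single nonsmooth equation. The min-function form used here is one standard choice and is an assumption; the abstract does not say which nonsmooth reformulation the article adopts. The names `complementarity_residual` and `satisfies_complementarity` are hypothetical.

```python
def complementarity_residual(multiplier: float, slack: float) -> float:
    """Nonsmooth residual: zero exactly when both arguments are >= 0 and at least one is 0."""
    return min(multiplier, slack)


def satisfies_complementarity(multiplier: float, slack: float, tol: float = 1e-12) -> bool:
    """Direct check of the three complementarity conditions (for comparison only)."""
    return multiplier >= -tol and slack >= -tol and abs(multiplier * slack) <= tol


if __name__ == "__main__":
    # Active constraint (slack = 0), inactive constraint (multiplier = 0), and two violations.
    for multiplier, slack in [(2.0, 0.0), (0.0, 3.0), (1.0, 1.0), (-1.0, 0.0)]:
        residual = complementarity_residual(multiplier, slack)
        print(multiplier, slack, residual, satisfies_complementarity(multiplier, slack))
```

The residual is piecewise linear with a kink where the multiplier equals the slack, which is where an active set change would occur in this toy setting.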

Notes

  1. On Page 2263 of that proof, the statements \(\widetilde{{\mathbf {x}}} \in \bigcup _{i \in {P}} B_{0.5\alpha _i}({\mathbf {x}}_{(i)})\) and \({\mathbf {x}}^* \in \bigcup _{i \in {P}} B_{\alpha _i}({\mathbf {x}}_{(i)})\) should be replaced by \(\widetilde{{\mathbf {x}}} \in \bigcap _{i \in {P}} B_{0.5\alpha _i}({\mathbf {x}}_{(i)})\) and \({\mathbf {x}}^* \in \bigcap _{i \in {P}} B_{\alpha _i}({\mathbf {x}}_{(i)})\), respectively.

  2. If \({\mathbf {h}}\) and its local inverse \({\mathbf {h}}^{-1}\) are Lipschitz continuous on neighborhoods of \({\mathbf {z}}_0\) and \({\mathbf {h}}({\mathbf {z}}_0)\), respectively, then \({\mathbf {h}}\) is a (local) Lipschitz homeomorphism at the domain point \({\mathbf {z}}_0\).

  3. If \({\mathbf {h}}\) and its local inverse \({\mathbf {h}}^{-1}\) are \(PC^1\) at \({\mathbf {z}}_0\) and \({\mathbf {h}}({\mathbf {z}}_0)\), respectively, then \({\mathbf {h}}\) is a (local) \(PC^1\) homeomorphism at the domain point \({\mathbf {z}}_0\).

  4. There are some qualitatively distinct classes of trajectories not captured in the figures, such as the solution trajectory traversing the constraint counterclockwise when \({\mathbf {p}}\in R_1 \cap N\) and \(p_2<p_1<1\). Nevertheless, this is unimportant for present purposes because the active sets are unchanged.

References

  1. Mahadevan, R., Edwards, J.S., Doyle, F.J.: Dynamic flux balance analysis of diauxic growth in Escherichia coli. Biophys. J. 83(3), 1331–1340 (2002)

  2. Lewis, N.E., Nagarajan, H., Palsson, B.O.: Constraining the metabolic genotype–phenotype relationship using a phylogeny of in silico methods. Nat. Rev. Microbiol. 10(4), 291–305 (2012)

  3. Amundson, N.R., Caboussat, A., He, J.W., Landry, C., Seinfeld, J.H.: A dynamic optimization problem related to organic aerosols. Comptes Rendus Math. 344(8), 519–522 (2007)

  4. Landry, C., Caboussat, A., Hairer, E.: Solving optimization-constrained differential equations with discontinuity points, with application to atmospheric chemistry. SIAM J. Sci. Comput. 31(5), 3806–3826 (2009)

  5. Veliov, V.: On the time-discretization of control systems. SIAM J. Control Optim. 35(5), 1470–1486 (1997)

  6. Campbell, S.L., Gear, C.W.: The index of general nonlinear DAEs. Numer. Math. 72(2), 173–196 (1995)

  7. Mehrmann, V.: Index concepts for differential-algebraic equations. In: Engquist, B. (ed.) Encyclopedia of Applied and Computational Mathematics, pp. 676–681. Springer, Berlin (2015)

  8. Brenan, K.E., Campbell, S.L., Petzold, L.R.: Numerical Solution of Initial-Value Problems in Differential-Algebraic Equations. SIAM, Philadelphia (1996)

  9. Kunkel, P., Mehrmann, V.: Differential-Algebraic Equations: Analysis and Numerical Solution. European Mathematical Society, Zurich (2006)

  10. Ascher, U.M., Petzold, L.R.: Computer Methods for Ordinary Differential Equations and Differential-Algebraic Equations. SIAM, Philadelphia (1998)

  11. Feehery, W.F., Tolsma, J.E., Barton, P.I.: Efficient sensitivity analysis of large-scale differential-algebraic systems. Appl. Numer. Math. 25, 41–54 (1997)

  12. Cao, Y., Li, S., Petzold, L., Serban, R.: Adjoint sensitivity analysis for differential-algebraic equations: the adjoint DAE system and its numerical solution. SIAM J. Sci. Comput. 24(3), 1076–1089 (2003)

  13. Amundson, N.R., Caboussat, A., He, J.W., Seinfeld, J.H.: Primal-dual interior-point method for an optimization problem related to the modeling of atmospheric organic aerosols. J. Optim. Theory Appl. 130(3), 377–409 (2006)

  14. Caboussat, A., Landry, C., Rappaz, J.: Optimization problem coupled with differential equations: a numerical algorithm mixing an interior-point method and event detection. J. Optim. Theory Appl. 147(1), 141–156 (2010)

  15. Hüser, J., Deussen, J., Naumann, U.: Integration of differential-algebraic equations with optimality criteria. In: AD2016—The 7th International Conference on Algorithmic Differentiation Programme and Presentations (2016)

  16. Griewank, A., Walther, A.: Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation. Other Titles in Applied Mathematics, 2nd edn. SIAM, Philadelphia (2008)

  17. Naumann, U.: The Art of Differentiating Computer Programs: An Introduction to Algorithmic Differentiation. SIAM, Philadelphia (2012)

  18. Scholtes, S.: Introduction to Piecewise Differentiable Equations. Springer, New York (2012)

  19. Rabier, P.J., Rheinboldt, W.C.: Theoretical and Numerical Analysis of Differential-Algebraic Equations. Elsevier, North-Holland (2002)

  20. Stechlinski, P.G., Barton, P.I.: Dependence of solutions of nonsmooth differential-algebraic equations on parameters. J. Differ. Equ. 262(3), 2254–2285 (2017)

  21. Stechlinski, P.G., Barton, P.I.: Generalized derivatives of differential-algebraic equations. J. Optim. Theory Appl. 171(1), 1–26 (2016)

  22. Stechlinski, P.G., Barton, P.I.: Generalized derivatives of optimal control problems with nonsmooth differential-algebraic equations embedded. In: 55th IEEE Conference on Decision and Control, pp. 592–597 (2016)

  23. Griewank, A.: On stable piecewise linearization and generalized algorithmic differentiation. Optim. Methods Softw. 28(6), 1139–1178 (2013)

  24. Khan, K.A., Barton, P.I.: A vector forward mode of automatic differentiation for generalized derivative evaluation. Optim. Methods Softw. 30(6), 1185–1212 (2015)

  25. Khan, K.A.: Branch-locking AD techniques for nonsmooth composite functions and nonsmooth implicit functions. Optim. Methods Softw. 33(4–6), 1127–1155 (2018)

  26. Schumacher, J.M.: Complementarity systems in optimization. Math. Program. 101, 263–295 (2004)

  27. Pang, J.S., Stewart, D.E.: Differential variational inequalities. Math. Program. 113, 345–424 (2008)

  28. Clarke, F.H.: Optimization and Nonsmooth Analysis. SIAM, Philadelphia (1990)

  29. Barton, P.I., Khan, K.A., Stechlinski, P., Watson, H.A.J.: Computationally relevant generalized derivatives: theory, evaluation and applications. Optim. Methods Softw. 33, 1030–1072 (2018)

  30. Ralph, D., Scholtes, S.: Sensitivity analysis of composite piecewise smooth equations. Math. Program. 76, 593–612 (1997)

  31. Stechlinski, P., Khan, K.A., Barton, P.I.: Generalized sensitivity analysis of nonlinear programs. SIAM J. Optim. 28(1), 272–301 (2018)

  32. Facchinei, F., Pang, J.S.: Finite-Dimensional Variational Inequalities and Complementarity Problems. Springer, New York (2003)

  33. Nesterov, Y.: Lexicographic differentiation of nonsmooth functions. Math. Program. 104, 669–700 (2005)

  34. Khan, K.A., Barton, P.I.: Generalized derivatives for solutions of parametric ordinary differential equations with non-differentiable right-hand sides. J. Optim. Theory Appl. 163, 355–386 (2014)

  35. Khan, K.A., Barton, P.I.: Generalized derivatives for hybrid systems. IEEE Trans. Autom. Control 62(7), 3193–3208 (2017)

  36. Qi, L., Sun, J.: A nonsmooth version of Newton’s method. Math. Program. 58, 353–367 (1993)

  37. Lukšan, L., Vlček, J.: A bundle-Newton method for nonsmooth unconstrained minimization. Math. Program. 83, 373–391 (1998)

  38. Stechlinski, P., Patrascu, M., Barton, P.I.: Nonsmooth DAEs with applications in modeling phase changes. In: Campbell, S., Ilchmann, A., Mehrmann, V., Reis, T. (eds.) Applications of Differential-Algebraic Equations: Examples and Benchmarks, Differential-Algebraic Equations Forum. Springer, Berlin (2018)

  39. Fiacco, A.V.: Introduction to Sensitivity and Stability Analysis in Nonlinear Programming. Academic Press, New York (1983)

  40. Galán, S., Feehery, W.F., Barton, P.I.: Parametric sensitivity functions for hybrid discrete/continuous systems. Appl. Numer. Math. 31, 17–47 (1999)

  41. Kleinert, J., Simeon, B.: Differential-algebraic equations and beyond: from smooth to nonsmooth constrained dynamical systems. arXiv preprint arXiv:1811.07658 (2018)

Acknowledgements

The author would like to thank the anonymous reviewers for their helpful comments.

Author information

Corresponding author

Correspondence to Peter Stechlinski.

Ethics declarations

Conflict of interest

The author declares that they have no conflict of interest.

Additional information

Communicated by Lorenz T. Biegler.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Proof of Proposition 2.1

Without loss of generality, let \(\{{\mathbf {f}}_{(1)},\ldots ,{\mathbf {f}}_{(k)}\}\) be a set of essentially active \(C^1\) selection functions of \({\mathbf {f}}\) at \({\mathbf {x}}\). The set \(\varLambda {\mathbf {f}}({\mathbf {x}})\) is nonempty since \({\mathbf {J}}{\mathbf {f}}_{(i)}({\mathbf {x}}) \in \varLambda {\mathbf {f}}({\mathbf {x}})\) for each \(i\in \{1,\ldots ,k\}\). Moreover, \(\varLambda {\mathbf {f}}({\mathbf {x}})\) is a finite set, with \(|\varLambda {\mathbf {f}}({\mathbf {x}})| \le k^{n}\) (each of the n rows is taken from one of at most k matrices), and therefore compact. To show upper semicontinuity, choose any \(\varepsilon >0\). Since \(\partial _{\mathrm{B}}{\mathbf {f}}\) is upper semicontinuous [28], there exists \(\delta >0\) such that \(\partial _{\mathrm{B}} {\mathbf {f}}({\mathbf {y}}) \subset B_{\varepsilon ^*}(\partial _{\mathrm{B}}{\mathbf {f}}({\mathbf {x}}))\) for all \({\mathbf {y}}\in B_{\delta }({\mathbf {x}})\), where \(\varepsilon ^*=\varepsilon /n\) and \(B_{\varepsilon ^*}(\partial _{\mathrm{B}}{\mathbf {f}}({\mathbf {x}}))=\left\{ {\mathbf {F}}+ \varepsilon ^* {\mathbf {Y}}: {\mathbf {F}}\in \partial _{\mathrm{B}}{\mathbf {f}}({\mathbf {x}}), \Vert {\mathbf {Y}}\Vert <1\right\} \). Choose any point \({\mathbf {x}}^{\delta } \in B_{\delta }({\mathbf {x}})\). Enumerate the B-subdifferential of \({\mathbf {f}}\) at \({\mathbf {x}}\) and \({\mathbf {x}}^{\delta }\) by, respectively, \(\{{\mathbf {F}}_{(1)},\ldots ,{\mathbf {F}}_{(k)}\}=\partial _{\mathrm{B}}{\mathbf {f}}({\mathbf {x}})\) and \(\{{\mathbf {F}}^{\delta }_{(1)},\ldots ,{\mathbf {F}}^{\delta }_{(q)}\} =\partial _{\mathrm{B}}{\mathbf {f}}({\mathbf {x}}^{\delta })\).

Let \(\pmb {\varGamma }:\partial _{\mathrm{B}} {\mathbf {f}}({\mathbf {x}}^{\delta }) \rightarrow \partial _{\mathrm{B}} {\mathbf {f}}({\mathbf {x}})\) be defined as follows: given \({\mathbf {F}}^{\delta } \in \partial _{\mathrm{B}} {\mathbf {f}}({\mathbf {x}}^{\delta })\), let \(\pmb {\varGamma }({\mathbf {F}}^{\delta })={\mathbf {F}}\in \partial _{\mathrm{B}} {\mathbf {f}}({\mathbf {x}})\) be such that \({\mathbf {F}}^{\delta }={\mathbf {F}}+\varepsilon ^* {\mathbf {Y}}\) for some \(\Vert {\mathbf {Y}}\Vert <1\). For any \(i\in \{1,\ldots ,n\}\), let the mapping \(\varTheta _i:\varLambda {\mathbf {f}}({\mathbf {x}}^{\delta }) \rightarrow \partial _{\mathrm{B}}{\mathbf {f}}({\mathbf {x}}^{\delta })\) be defined as follows: given \({\mathbf {F}}^{\varLambda } \in \varLambda {\mathbf {f}}({\mathbf {x}}^{\delta })\), let \(\varTheta _i({\mathbf {F}}^{\varLambda })={\mathbf {F}}^{\delta } \in \partial _{\mathrm{B}}{\mathbf {f}}({\mathbf {x}}^{\delta })\) be such that \(\text {row}_i({\mathbf {F}}^{\varLambda })=\text {row}_i({\mathbf {F}}^{\delta })\).

Choose any \(\bar{{\mathbf {F}}}^{\delta } \in \varLambda {\mathbf {f}}({\mathbf {x}}^{\delta })\) and let

$$\begin{aligned}\bar{{\mathbf {F}}}= \begin{bmatrix} \text {row}_1(\pmb {\varGamma }(\varTheta _1(\bar{{\mathbf {F}}}^{\delta }))) \\ \text {row}_2(\pmb {\varGamma }(\varTheta _2(\bar{{\mathbf {F}}}^{\delta }))) \\ \vdots \\ \text {row}_n(\pmb {\varGamma }(\varTheta _n(\bar{{\mathbf {F}}}^{\delta }))) \end{bmatrix}. \end{aligned}$$

For any \(i\in \{1,\ldots ,n\}\), we have \(\pmb {\varGamma }(\varTheta _i(\bar{{\mathbf {F}}}^{\delta }))={\mathbf {F}}_{(j)} \in \partial _{\mathrm{B}} {\mathbf {f}}({\mathbf {x}})\), where \({\mathbf {F}}^{\delta }_{(l)}={\mathbf {F}}_{(j)}+\varepsilon ^* {\mathbf {Y}}_{(i)}\) for some \({\mathbf {F}}^{\delta }_{(l)} \in \partial _{\mathrm{B}}{\mathbf {f}}({\mathbf {x}}^{\delta })\) satisfying \(\text {row}_i(\bar{{\mathbf {F}}}^{\delta })=\text {row}_i({\mathbf {F}}^{\delta }_{(l)})\) and some \({\mathbf {Y}}_{(i)}\) with \(\Vert {\mathbf {Y}}_{(i)}\Vert <1\). Thus, \(\text {row}_i(\bar{{\mathbf {F}}})=\text {row}_i(\pmb {\varGamma }(\varTheta _i(\bar{{\mathbf {F}}}^{\delta })))=\text {row}_i({\mathbf {F}}_{(j)})\), implying that \(\bar{{\mathbf {F}}} \in \varLambda {\mathbf {f}}({\mathbf {x}})\). Moreover, \(\text {row}_i(\bar{{\mathbf {F}}}^{\delta }-\bar{{\mathbf {F}}}) =\text {row}_i(\bar{{\mathbf {F}}}^{\delta })-\text {row}_i(\pmb {\varGamma }(\varTheta _i(\bar{{\mathbf {F}}}^{\delta })))\), from which it follows that \(\text {row}_i(\bar{{\mathbf {F}}}^{\delta }-\bar{{\mathbf {F}}})=\text {row}_i(\bar{{\mathbf {F}}}^{\delta })-\text {row}_i(\pmb {\varGamma }({\mathbf {F}}_{(l)}^{\delta })) =\text {row}_i(\bar{{\mathbf {F}}}^{\delta })-\text {row}_i({\mathbf {F}}_{(j)})\) where

$$\begin{aligned} \text {row}_i({\mathbf {F}}_{(j)}) =\text {row}_i({\mathbf {F}}_{(l)}^{\delta } - \varepsilon ^* {\mathbf {Y}}_{(i)}) =\text {row}_i(\bar{{\mathbf {F}}}^{\delta }) - \varepsilon ^* \text {row}_i({\mathbf {Y}}_{(i)}). \end{aligned}$$

From this, it follows that \(\text {row}_i(\bar{{\mathbf {F}}}^{\delta }-\bar{{\mathbf {F}}}) =\text {row}_i(\bar{{\mathbf {F}}}^{\delta })-\text {row}_i({\mathbf {F}}_{(j)})\), and therefore

$$\begin{aligned} \text {row}_i(\bar{{\mathbf {F}}}^{\delta }-\bar{{\mathbf {F}}}) =\text {row}_i(\bar{{\mathbf {F}}}^{\delta })-(\text {row}_i(\bar{{\mathbf {F}}}^{\delta }) - \varepsilon ^* \text {row}_i({\mathbf {Y}}_{(i)}))=\varepsilon ^* \text {row}_i({\mathbf {Y}}_{(i)}). \end{aligned}$$

The above result holds for any \(i\in \{1,\ldots ,n\}\), implying that \( \bar{{\mathbf {F}}}^{\delta }-\bar{{\mathbf {F}}}= \varepsilon ^* \bar{{\mathbf {Y}}} \) where

$$\begin{aligned} \bar{{\mathbf {Y}}}= \begin{bmatrix} \text {row}_1({\mathbf {Y}}_{(1)}) \\ \text {row}_2({\mathbf {Y}}_{(2)}) \\ \vdots \\ \text {row}_n({\mathbf {Y}}_{(n)}) \\ \end{bmatrix}. \end{aligned}$$

Note that \(\Vert {\mathbf {Y}}_{(i)}\Vert <1\) for each i, from which it follows that \(\Vert {\mathbf {Y}}_{(i)}\Vert _{\infty } < \sqrt{n}\). Thus,

$$\begin{aligned} \Vert \bar{{\mathbf {Y}}}\Vert \le \sqrt{n} \Vert \bar{{\mathbf {Y}}}\Vert _{\infty } \le \sqrt{n} \max \{\Vert {\mathbf {Y}}_{(i)}\Vert _{\infty }\} < n. \end{aligned}$$

Hence, \(\bar{{\mathbf {F}}}^{\delta }=\bar{{\mathbf {F}}}+\varepsilon \left( \frac{1}{n} \bar{{\mathbf {Y}}}\right) \) where \(\Vert \frac{1}{n} \bar{{\mathbf {Y}}}\Vert <1\). Thus, \(\bar{{\mathbf {F}}}^{\delta } \in B_{\varepsilon }(\varLambda {\mathbf {f}}({\mathbf {x}}))\), from which upper semicontinuity follows.
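The norm estimate used above can be unpacked as follows, under the assumption (not stated in this appendix) that \(\Vert \cdot \Vert \) denotes the induced 2-norm and \(\Vert \cdot \Vert _{\infty }\) the maximum absolute row sum. Since row i of \(\bar{{\mathbf {Y}}}\) is row i of \({\mathbf {Y}}_{(i)}\),

$$\begin{aligned} \Vert \bar{{\mathbf {Y}}}\Vert _{\infty } = \max _{i} \Vert \text {row}_i({\mathbf {Y}}_{(i)})\Vert _{1} \le \max _{i} \Vert {\mathbf {Y}}_{(i)}\Vert _{\infty } \le \sqrt{n} \, \max _{i} \Vert {\mathbf {Y}}_{(i)}\Vert < \sqrt{n}, \end{aligned}$$

where the second inequality is the standard bound \(\Vert {\mathbf {A}}\Vert _{\infty } \le \sqrt{n} \Vert {\mathbf {A}}\Vert \) for \({\mathbf {A}}\in {\mathbb {R}}^{n \times n}\). Combining this with the companion bound \(\Vert {\mathbf {A}}\Vert \le \sqrt{n} \Vert {\mathbf {A}}\Vert _{\infty }\) gives \(\Vert \bar{{\mathbf {Y}}}\Vert < n\).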

Choose \(n^* \in {\mathbb {N}}\) such that \(B_{1/n^*}(\varOmega ) \subset X\). If \({\mathbf {f}}\) is not CCO on a neighborhood of \(\varOmega \), then for any \(n \ge n^*\), there must exist \({\mathbf {x}}_{n} \in B_{1/n}(\varOmega ) {\setminus } \varOmega \) such that \({\mathbf {f}}\) is not CCO at \({\mathbf {x}}_{n}\). Let \({\mathbf {x}}^* \in \varOmega \) be an accumulation point of the sequence \(\{{\mathbf {x}}_{n}\}\). Then, since \({\mathbf {f}}\) is CCO on \(\varOmega \), \({{\,\mathrm{sign}\,}}(\det ({\mathbf {F}}))=1\) for all \({\mathbf {F}}\in \varLambda {\mathbf {f}}({\mathbf {x}}^*)\), without loss of generality. Then, by the arguments in the proof of [32, Lemma 7.5.2], upper semicontinuity of \(\varLambda {\mathbf {f}}\) at \({\mathbf {x}}^*\), along with continuity of the determinant, implies the existence of \(\rho >0\) such that \({{\,\mathrm{sign}\,}}(\det ({\mathbf {F}}^{\rho }))=1\) for all \({\mathbf {F}}^{\rho } \in \varLambda {\mathbf {f}}({\mathbf {x}}^{\rho })\) and all \({\mathbf {x}}^{\rho } \in B_{\rho }({\mathbf {x}}^*) \subset X\). However, for some \(\widetilde{n} \ge n^*\), \({\mathbf {x}}_{\widetilde{n}} \in B_{\rho }({\mathbf {x}}^*)\), a contradiction. \(\square \)
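The row-wise construction used in this proof can be made concrete with a small sketch. It assumes, as the proof suggests, that the combinatorial Jacobian collects every matrix whose i-th row is the i-th row of some B-subdifferential element, and that CCO means all of these matrices share one nonzero determinant sign. The function names and the example matrices are hypothetical stand-ins for \(\partial _{\mathrm{B}}{\mathbf {f}}({\mathbf {x}})\), not data from the article.

```python
from itertools import product


def det(matrix):
    """Determinant by Laplace expansion along the first row (fine for small matrices)."""
    n = len(matrix)
    if n == 1:
        return matrix[0][0]
    total = 0.0
    for col in range(n):
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * matrix[0][col] * det(minor)
    return total


def combinatorial_jacobian(b_subdifferential):
    """All matrices whose i-th row is the i-th row of some element of the given finite set."""
    n = len(b_subdifferential[0])
    row_choices = [
        sorted({tuple(F[i]) for F in b_subdifferential}) for i in range(n)
    ]
    return [[list(r) for r in rows] for rows in product(*row_choices)]


def is_cco(b_subdifferential):
    """True if every matrix in the combinatorial Jacobian has the same nonzero determinant sign."""
    dets = [det(M) for M in combinatorial_jacobian(b_subdifferential)]
    return all(d > 0 for d in dets) or all(d < 0 for d in dets)


if __name__ == "__main__":
    # Both elements have determinant 1, yet mixing their rows produces singular matrices.
    mixed = [[[1, 0], [0, 1]], [[0, -1], [1, 0]]]
    print(len(combinatorial_jacobian(mixed)), is_cco(mixed))
```

The `mixed` example shows why the row-wise condition is stronger than requiring a common determinant sign on the B-subdifferential alone.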

Proof of Proposition 2.2

The function \({\mathbf {f}}\) is \(PC^1\) at \(({\mathbf {x}}^*,{\mathbf {y}}^*)\) by construction, with

$$\begin{aligned} \partial _{\mathrm{B}} {\mathbf {f}}({\mathbf {x}}^*,{\mathbf {y}}^*) = \left\{ \left[ \begin{array}{c|c} {\mathbf {I}}_{n} &{} {\mathbf {0}}\\ \hline {\mathbf {J}}_{{\mathbf {x}}}{\mathbf {g}}_{(i)}({\mathbf {x}}^*,{\mathbf {y}}^*) &{} {\mathbf {J}}_{{\mathbf {y}}}{\mathbf {g}}_{(i)}({\mathbf {x}}^*,{\mathbf {y}}^*) \end{array}\right] : {\mathbf {g}}_{(i)} \in \mathscr {F}_{{\mathbf {g}},({\mathbf {x}}^*,{\mathbf {y}}^*)} \right\} \end{aligned}$$

where \(\mathscr {F}_{{\mathbf {g}},({\mathbf {x}}^*,{\mathbf {y}}^*)}=\{{\mathbf {g}}_{(i)} \}\) is a set of essentially active \(C^1\) selection functions of \({\mathbf {g}}\) at \(({\mathbf {x}}^*,{\mathbf {y}}^*)\). The combinatorial Jacobian of \({\mathbf {f}}\) at \(({\mathbf {x}}^*,{\mathbf {y}}^*)\) is

$$\begin{aligned} \varLambda {\mathbf {f}}({\mathbf {x}}^*,{\mathbf {y}}^*) = \left\{ \left[ \begin{array}{c|c} {\mathbf {I}}_{n} &{} {\mathbf {0}}\\ \hline {\mathbf {X}}^*_{(j)} &{} {\mathbf {Y}}^*_{(j)} \end{array}\right] : j=1,2,\ldots ,n_s \right\} \end{aligned}$$

where \(n_s \le |\mathscr {F}_{{\mathbf {g}},({\mathbf {x}}^*,{\mathbf {y}}^*)}|^{m}\), and for any \(j \in \{1,\ldots ,n_{s}\}\), \(k \in \{1,\ldots ,m\}\),

$$\begin{aligned} \text {row}_k[{\mathbf {X}}^*_{(j)} \quad {\mathbf {Y}}^*_{(j)}]=\text {row}_k[{\mathbf {J}}_{\mathbf {x}}{\mathbf {g}}_{(i)}({\mathbf {x}}^*,{\mathbf {y}}^*) \quad {\mathbf {J}}_{\mathbf {y}}{\mathbf {g}}_{(i)}({\mathbf {x}}^*,{\mathbf {y}}^*)] \end{aligned}$$

for some \({\mathbf {g}}_{(i)} \in \mathscr {F}_{{\mathbf {g}},({\mathbf {x}}^*,{\mathbf {y}}^*)}\) by definition of \(\varLambda {\mathbf {f}}\). That is, \({\mathbf {f}}\) is CCO at \(({\mathbf {x}}^*,{\mathbf {y}}^*)\) if and only if \({\mathbf {g}}\) is CCO w.r.t. \({\mathbf {y}}\) at \(({\mathbf {x}}^*,{\mathbf {y}}^*)\), as required. \(\square \)
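The final equivalence rests on a determinant identity for block lower triangular matrices, spelled out here under the assumption that CCO w.r.t. \({\mathbf {y}}\) is a sign condition on the determinants of the blocks \({\mathbf {Y}}^*_{(j)}\). For every \(j \in \{1,\ldots ,n_s\}\),

$$\begin{aligned} \det \left[ \begin{array}{c|c} {\mathbf {I}}_{n} &{} {\mathbf {0}}\\ \hline {\mathbf {X}}^*_{(j)} &{} {\mathbf {Y}}^*_{(j)} \end{array}\right] = \det ({\mathbf {I}}_{n}) \det ({\mathbf {Y}}^*_{(j)}) = \det ({\mathbf {Y}}^*_{(j)}), \end{aligned}$$

so the determinants of the elements of \(\varLambda {\mathbf {f}}({\mathbf {x}}^*,{\mathbf {y}}^*)\) share a common nonzero sign exactly when the determinants of the blocks \({\mathbf {Y}}^*_{(j)}\) do.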

Proof of Theorem 2.1

First we demonstrate \({\mathbf {g}}\) is CCO w.r.t. \({\mathbf {y}}\) on a neighborhood of \(\varOmega \): Define the mapping \({\mathbf {f}}: W \rightarrow {\mathbb {R}}^{n} \times {\mathbb {R}}^m:({\mathbf {x}},{\mathbf {y}}) \mapsto ({\mathbf {x}},{\mathbf {g}}({\mathbf {x}},{\mathbf {y}}))\). Then \({\mathbf {f}}\) is CCO at \(({\mathbf {x}}^*,{\mathbf {y}}^*)\) if and only if \({\mathbf {g}}\) is CCO w.r.t. \({\mathbf {y}}\) at \(({\mathbf {x}}^*,{\mathbf {y}}^*)\) by Proposition 2.2, from which it follows that \({\mathbf {f}}\) is CCO on \(\varOmega \). Consequently, \({\mathbf {f}}\) is CCO on \(B_{\gamma }(\varOmega ) \subset W\) for some \(\gamma >0\) by Proposition 2.1, from which it follows that \({\mathbf {g}}\) is CCO w.r.t. \({\mathbf {y}}\) on \(B_{\gamma }(\varOmega ) \subset W\), implying (b) holds.

Next, we argue (a) holds: Since \({\mathbf {g}}\) is CCO with respect to \({\mathbf {y}}\) and \({\mathbf {g}}({\mathbf {x}},{\mathbf {y}})={\mathbf {0}}\) for each \(({\mathbf {x}},{\mathbf {y}}) \in \varOmega \), we may proceed as in the proof of [20, Theorem 3.5], but instead apply the \(PC^1\) local implicit function theorem [31, Theorem 3.4] to each point in \(\varOmega \) to furnish the family \(\mathscr {F}_{{\mathbf {r}}}=\{{\mathbf {r}}_{\mathbf {x}}:{\mathbf {x}}\in \pi _x \varOmega \}\) of local \(PC^1\) implicit functions, with corresponding collection of domains \(\{N_{{\mathbf {x}}} \subset \pi _{\mathbf {x}}W: {\mathbf {x}}\in \pi _x \varOmega \}\) that are neighborhoods of \({\mathbf {x}}\), where \(\pi _x W=\{{\mathbf {x}}:\exists ({\mathbf {x}},{\mathbf {y}}) \in W\}\). Then, continuing as in [20, Theorem 3.5], a \(PC^1\) extended implicit function \({\mathbf {r}}:B_{\delta }(\pi _x \varOmega ) \subset \pi _x W \rightarrow {\mathbb {R}}^m\) can be constructed, for some \(\delta >0\), using finitely many of these \(PC^1\) local implicit functions, say \(\{{\mathbf {r}}_{{\mathbf {x}}_{(1)}},{\mathbf {r}}_{{\mathbf {x}}_{(2)}},\ldots ,{\mathbf {r}}_{{\mathbf {x}}_{(q)}}\}\), such that for each \({\mathbf {x}}\in B_{\delta }(\pi _x \varOmega )\), \(({\mathbf {x}},{\mathbf {r}}({\mathbf {x}}))\) is the unique vector in \(B_{\xi }(\varOmega )\) satisfying \({\mathbf {g}}({\mathbf {x}},{\mathbf {r}}({\mathbf {x}}))={\mathbf {0}}\), for some \(\xi >0\). Moreover, the function \({\mathbf {r}}\) is \(PC^1\) on its domain by construction; for any \({\mathbf {x}}\in B_{\delta }(\pi _x \varOmega )\), \({\mathbf {r}}({\mathbf {x}}) \in \{{\mathbf {r}}_{{\mathbf {x}}_{(1)}}({\mathbf {x}}),{\mathbf {r}}_{{\mathbf {x}}_{(2)}}({\mathbf {x}}),\ldots ,{\mathbf {r}}_{{\mathbf {x}}_{(q)}}({\mathbf {x}})\}\). Hence, Conclusion (i) holds with \(\rho =\min (\xi ,\gamma )\), shrinking \(\delta \) if necessary.

Lastly, since a \(PC^1\) function is L-smooth, the arguments from the proof of [21, Theorem 3.2] can be repeated, with the above \(PC^1\) extended implicit function result, to show Eq. (5) holds. \(\square \)
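A scalar toy problem shows what a \(PC^1\) implicit function looks like in practice. The sketch below is a hypothetical example, not taken from the article: the residual \(g(x,y)=\max (y,2y)-x\) has selection functions \(y-x\) and \(2y-x\), whose partial derivatives with respect to y (1 and 2) share a sign, and the corresponding implicit function is \(r(x)=\min (x,x/2)\). A one-sided finite difference is used only to display the kink at \(x=0\).

```python
def g(x: float, y: float) -> float:
    """Hypothetical PC^1 residual with selection functions y - x and 2*y - x."""
    return max(y, 2.0 * y) - x


def r(x: float) -> float:
    """Piecewise linear implicit function solving g(x, r(x)) = 0."""
    return min(x, 0.5 * x)


def directional_derivative(func, x: float, direction: float, step: float = 1e-6) -> float:
    """One-sided finite-difference estimate of func'(x; direction)."""
    return (func(x + step * direction) - func(x)) / step


if __name__ == "__main__":
    for x in (-2.0, 0.0, 3.0):
        print(x, r(x), g(x, r(x)))
    # The slope of r changes across x = 0, where the active selection function switches.
    print(directional_derivative(r, 0.0, 1.0), directional_derivative(r, 0.0, -1.0))
```

The two directional derivatives at the kink differ in magnitude, which is the scalar analogue of a sensitivity jump at an active set change.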

Proof of Corollary 2.1

Conclusion (i) follows since \(\{(t,{\mathbf {p}}_0,{\mathbf {z}}(t,{\mathbf {p}}_0)): t \in T, {\mathbf {p}}\in P\} \subset G_{\mathrm{R}}^{P} \subset G_{\mathrm{R}}^\mathrm{L}\). To show (ii), suppose that \((t_0,{\mathbf {p}}_0,{\mathbf {x}}_0,{\mathbf {y}}_0) \in G_{\mathrm{R}}^P \cap G_{\mathrm{C}}^0 \subset G_{\mathrm{R}}^{\mathrm{L}} \cap G_{\mathrm{C}}^0\). Then, since \(G_{\mathrm{R}}^P\) is open, there exists \(\rho >0\) sufficiently small such that \(B_{\rho }(t_0,{\mathbf {p}}_0,{\mathbf {x}}_0,{\mathbf {y}}_0) \subset G_{\mathrm{R}}^P\), and, by continuity of the (local) solution \(t \mapsto (t,{\mathbf {p}}_0,{\mathbf {z}}(t,{\mathbf {p}}_0))\), there exists \(\alpha \in (0,\varepsilon )\) such that

$$\begin{aligned} \{(t,{\mathbf {p}}_0,{\mathbf {z}}(t,{\mathbf {p}}_0)):t\in [t_0-\alpha ,t_0+\alpha ]\} \subset B_{\rho }(t_0,{\mathbf {p}}_0,{\mathbf {x}}_0,{\mathbf {y}}_0) \subset G_{\text {R}}^{ \mathrm P}. \end{aligned}$$

To prove (iii), note that compactness of \(\{(t,{\mathbf {p}}_0,\widetilde{{\mathbf {z}}}(t,{\mathbf {p}}_0)): t\in [t_l,t_u]\} \subset G_{\mathrm{R}}^{ \mathrm P}\) combined with openness of \(G_{\mathrm{R}}^{ \mathrm P}\) implies the existence of \(\rho >0\) such that \(B_{\rho }(\{(t,{\mathbf {p}}_0,\widetilde{{\mathbf {z}}}(t,{\mathbf {p}}_0)): t\in [t_l,t_u]\}) \subset G_{\mathrm{R}}^{ \mathrm P}\). In addition, since \(t \mapsto (t,{\mathbf {p}}_0,\widetilde{{\mathbf {z}}}(t,{\mathbf {p}}_0))\) is continuous, there exists \(\alpha \in (0, \varepsilon )\) such that

$$\begin{aligned} \{(t,{\mathbf {p}}_0,\widetilde{{\mathbf {z}}}(t,{\mathbf {p}}_0)): t\in [t_l-\alpha ,t_u+\alpha ]\} \subset B_{\rho }(\{(t,{\mathbf {p}}_0,\widetilde{{\mathbf {z}}}(t,{\mathbf {p}}_0)): t\in [t_l,t_u]\}) \subset G_{\mathrm{R}}^{ \mathrm P}, \end{aligned}$$

implying P-regularity of \(\widetilde{{\mathbf {z}}}\) on \([t_l-\alpha ,t_u+\alpha ] \times \{{\mathbf {p}}_0\}\). Lastly, to prove maximal regular continuation, define the set of augmented graphs of regular continuations as

$$\begin{aligned} \varGamma _{\mathrm{ext}}:=\{ \{(t,{\mathbf {p}}_0,\widehat{{\mathbf {z}}}(t,{\mathbf {p}}_0)):t \in \widehat{T}\}: \widehat{{\mathbf {z}}} \text { is a P-regular continuation of } {\mathbf {z}}\text { on } \widehat{T}\}. \end{aligned}$$

\(\varGamma _{\mathrm{ext}}\) is nonempty since \(\{(t,{\mathbf {p}}_0,\widetilde{{\mathbf {z}}}(t,{\mathbf {p}}_0)):t \in [t_l-\alpha ,t_u+\alpha ]\} \in \varGamma _{\mathrm{ext}}\). Zorn’s Lemma can be applied with the generalized inequality \( \varPhi \preceq \varPsi \) for \(\varPhi , \varPsi \in \varGamma _{\mathrm{ext}}\) if and only if \(\varPhi \subset \varPsi \), which is a partial ordering (see the proof of [20, Theorem 6.4]). Given any nonempty totally ordered subset \(\varGamma _{\mathrm{ext}}^*=\{\varOmega _{(i)}:i \in A\} \subset \varGamma _{\mathrm{ext}}\), where \(\varOmega _{(i)}=\{(t,{\mathbf {p}}_0,{\mathbf {z}}_{(i)}(t,{\mathbf {p}}_0)):t \in T_{(i)}\}\), the element \(\varOmega _u=\{(t,{\mathbf {p}}_0,{\mathbf {z}}_u(t)):t \in T_u\}\) is an upper bound of \(\varGamma _{\mathrm{ext}}^*\), where \(T_u=\bigcup _{i \in A} T_{(i)}\) and the mapping \({\mathbf {z}}_u:t \mapsto {\mathbf {z}}_{(i)}(t,{\mathbf {p}}_0), \; t \in T_{(i)}\) is single-valued since for any \(i, j \in A\), \({\mathbf {z}}_{(i)}(t,{\mathbf {p}}_0)={\mathbf {z}}_{(j)}(t,{\mathbf {p}}_0)\) for all \(t \in T_{(i)} \cap T_{(j)}\). Zorn’s Lemma implies that \(\varGamma _{\mathrm{ext}}\) contains maximal elements; that is, there exists \(\varOmega _{\mathrm{max}}= \{(t,{\mathbf {p}}_0,{\mathbf {z}}_{\mathrm{max}}(t,{\mathbf {p}}_0)):t \in T_{\mathrm{max}}\} \in \varGamma _{\mathrm{ext}}\), such that \(\varOmega \subset \varOmega _{\mathrm{max}}\) for any \(\varOmega \in \varGamma _{\mathrm{ext}}\). It is then possible to show that \(T_{\mathrm{max}}=(t_L,t_U)\) since otherwise P-regularity allows for application of Theorem 2.2 (ii) to \((t_L,{\mathbf {p}}_0,{\mathbf {z}}_{\mathrm{max}}(t_L,{\mathbf {p}}_0))\) or \((t_U,{\mathbf {p}}_0,{\mathbf {z}}_{\mathrm{max}}(t_U,{\mathbf {p}}_0))\) to continue the solution on a strict superset of \(T_{\mathrm{max}}\). \(\square \)

About this article

Cite this article

Stechlinski, P. Optimization-Constrained Differential Equations with Active Set Changes. J Optim Theory Appl 187, 266–293 (2020). https://doi.org/10.1007/s10957-020-01744-4
