Abstract
Foundational theory is established for nonlinear differential equations with embedded nonlinear optimization problems exhibiting active set changes. Existence, uniqueness, and continuation of solutions are shown, followed by lexicographically smooth (implying Lipschitzian) parametric dependence. The sensitivity theory found here accurately characterizes sensitivity jumps resulting from active set changes via an auxiliary nonsmooth sensitivity system obtained by lexicographic directional differentiation. The results in this article hold under easily verifiable regularity conditions (linear independence of constraints and strong second-order sufficiency), which are shown to imply generalized differentiation index one of a nonsmooth differential-algebraic equation system obtained by replacing the optimization problem with its optimality conditions and recasting the complementarity conditions as nonsmooth algebraic equations. The theory in this article is computationally relevant, allowing for implementation of dynamic optimization strategies (i.e., open-loop optimal control), and recovers (and rigorously formalizes) classical results in the absence of active set changes. Along the way, contributions are made to the theory of piecewise differentiable functions.
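As a minimal illustration of the reformulation described above, the following sketch (a hypothetical scalar example, not taken from the article) recasts the complementarity condition of a parametric quadratic program as a nonsmooth algebraic equation via the \(\min \) function and solves the resulting system with a semismooth Newton iteration; the active set change at \(p=0\) produces the kink in the solution map \(p \mapsto (y(p),\mu (p))\).

```python
import numpy as np

def F(v, p):
    # Hypothetical scalar example (not from the article): KKT system of
    #   min_y 0.5*(y - p)**2  s.t.  y >= 0,
    # with the complementarity condition 0 <= y, 0 <= mu, y*mu = 0
    # recast as the nonsmooth algebraic equation min(y, mu) = 0.
    y, mu = v
    return np.array([y - p - mu, min(y, mu)])

def J(v):
    # one element of the generalized (Clarke) Jacobian of F at v
    y, mu = v
    row2 = [1.0, 0.0] if y <= mu else [0.0, 1.0]
    return np.array([[1.0, -1.0], row2])

def semismooth_newton(p, v=np.array([1.0, 1.0]), tol=1e-12):
    # converges finitely here, since F is piecewise linear in v
    for _ in range(50):
        r = F(v, p)
        if np.linalg.norm(r) < tol:
            break
        v = v - np.linalg.solve(J(v), r)
    return v

# active set change as p crosses 0: y(p) = max(p, 0), mu(p) = max(-p, 0)
```

Both generalized Jacobian selections in this toy system are nonsingular with determinant one, in the spirit of the regularity conditions under which the article's sensitivity theory applies.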
Notes
On Page 2263 of that proof, the statements \(\widetilde{{\mathbf {x}}} \in \bigcup _{i \in {P}} B_{0.5\alpha _i}({\mathbf {x}}_{(i)})\) and \({\mathbf {x}}^* \in \bigcup _{i \in {P}} B_{\alpha _i}({\mathbf {x}}_{(i)})\) should be replaced by \(\widetilde{{\mathbf {x}}} \in \bigcap _{i \in {P}} B_{0.5\alpha _i}({\mathbf {x}}_{(i)})\) and \({\mathbf {x}}^* \in \bigcap _{i \in {P}} B_{\alpha _i}({\mathbf {x}}_{(i)})\), respectively.
If \({\mathbf {h}}\) and its local inverse \({\mathbf {h}}^{-1}\) are Lipschitz continuous on neighborhoods of \({\mathbf {z}}_0\) and \({\mathbf {h}}({\mathbf {z}}_0)\), respectively, then \({\mathbf {h}}\) is a (local) Lipschitz homeomorphism at the domain point \({\mathbf {z}}_0\).
If \({\mathbf {h}}\) and its local inverse \({\mathbf {h}}^{-1}\) are \(PC^1\) at \({\mathbf {z}}_0\) and \({\mathbf {h}}({\mathbf {z}}_0)\), respectively, then \({\mathbf {h}}\) is a (local) \(PC^1\) homeomorphism at the domain point \({\mathbf {z}}_0\).
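As a toy illustration of this footnote (an assumed example, not from the article), the scalar map \(h(z)=z+0.5|z|\) and its inverse are both piecewise linear, hence \(PC^1\), making \(h\) a global \(PC^1\) homeomorphism:

```python
def h(z):
    # piecewise linear, hence PC^1: selection functions 1.5*z (z >= 0) and 0.5*z (z < 0)
    return z + 0.5 * abs(z)

def h_inv(w):
    # the inverse, also piecewise linear / PC^1: selections w/1.5 and 2*w
    return w / 1.5 if w >= 0 else 2.0 * w

# h_inv(h(z)) == z across the kink at z = 0
for z in [-3.0, -1e-3, 0.0, 1e-3, 4.0]:
    assert abs(h_inv(h(z)) - z) < 1e-12
```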
There are some qualitatively distinct classes of trajectories not captured in the figures, such as the solution trajectory traversing the constraint counterclockwise when \({\mathbf {p}}\in R_1 \cap N\) and \(p_2<p_1<1\). Nevertheless, this is unimportant for present purposes because the active sets are unchanged.
References
Mahadevan, R., Edwards, J.S., Doyle, F.J.: Dynamic flux balance analysis of diauxic growth in Escherichia coli. Biophys. J. 83(3), 1331–1340 (2002)
Lewis, N.E., Nagarajan, H., Palsson, B.O.: Constraining the metabolic genotype–phenotype relationship using a phylogeny of in silico methods. Nat. Rev. Microbiol. 10(4), 291–305 (2012)
Amundson, N.R., Caboussat, A., He, J.W., Landry, C., Seinfeld, J.H.: A dynamic optimization problem related to organic aerosols. Comptes Rendus Math. 344(8), 519–522 (2007)
Landry, C., Caboussat, A., Hairer, E.: Solving optimization-constrained differential equations with discontinuity points, with application to atmospheric chemistry. SIAM J. Sci. Comput. 31(5), 3806–3826 (2009)
Veliov, V.: On the time-discretization of control systems. SIAM J. Control Optim. 35(5), 1470–1486 (1997)
Campbell, S.L., Gear, C.W.: The index of general nonlinear DAEs. Numer. Math. 72(2), 173–196 (1995)
Mehrmann, V.: Index concepts for differential-algebraic equations. In: Engquist, B. (ed.) Encyclopedia of Applied and Computational Mathematics, pp. 676–681. Springer, Berlin (2015)
Brenan, K.E., Campbell, S.L., Petzold, L.R.: Numerical Solution of Initial-Value Problems in Differential-Algebraic Equations. SIAM, Philadelphia (1996)
Kunkel, P., Mehrmann, V.: Differential-Algebraic Equations: Analysis and Numerical Solution. European Mathematical Society, Zurich (2006)
Ascher, U.M., Petzold, L.R.: Computer Methods for Ordinary Differential Equations and Differential-Algebraic Equations. SIAM, Philadelphia (1998)
Feehery, W.F., Tolsma, J.E., Barton, P.I.: Efficient sensitivity analysis of large-scale differential-algebraic systems. Appl. Numer. Math. 25, 41–54 (1997)
Cao, Y., Li, S., Petzold, L., Serban, R.: Adjoint sensitivity analysis for differential-algebraic equations: the adjoint DAE system and its numerical solution. SIAM J. Sci. Comput. 24(3), 1076–1089 (2003)
Amundson, N.R., Caboussat, A., He, J.W., Seinfeld, J.H.: Primal-dual interior-point method for an optimization problem related to the modeling of atmospheric organic aerosols. J. Optim. Theory Appl. 130(3), 377–409 (2006)
Caboussat, A., Landry, C., Rappaz, J.: Optimization problem coupled with differential equations: a numerical algorithm mixing an interior-point method and event detection. J. Optim. Theory Appl. 147(1), 141–156 (2010)
Hüser, J., Deussen, J., Naumann, U.: Integration of differential-algebraic equations with optimality criteria. In: AD2016—The 7th International Conference on Algorithmic Differentiation Programme and Presentations (2016)
Griewank, A., Walther, A.: Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation. Other Titles in Applied Mathematics, 2nd edn. SIAM, Philadelphia (2008)
Naumann, U.: The Art of Differentiating Computer Programs: An Introduction to Algorithmic Differentiation. SIAM, Philadelphia (2012)
Scholtes, S.: Introduction to Piecewise Differentiable Equations. Springer, New York (2012)
Rabier, P.J., Rheinboldt, W.C.: Theoretical and Numerical Analysis of Differential-Algebraic Equations. Elsevier, North-Holland (2002)
Stechlinski, P.G., Barton, P.I.: Dependence of solutions of nonsmooth differential-algebraic equations on parameters. J. Differ. Equ. 262(3), 2254–2285 (2017)
Stechlinski, P.G., Barton, P.I.: Generalized derivatives of differential-algebraic equations. J. Optim. Theory Appl. 171(1), 1–26 (2016)
Stechlinski, P.G., Barton, P.I.: Generalized derivatives of optimal control problems with nonsmooth differential-algebraic equations embedded. In: 55th IEEE Conference on Decision and Control, pp. 592–597 (2016)
Griewank, A.: On stable piecewise linearization and generalized algorithmic differentiation. Optim. Methods Softw. 28(6), 1139–1178 (2013)
Khan, K.A., Barton, P.I.: A vector forward mode of automatic differentiation for generalized derivative evaluation. Optim. Methods Softw. 30(6), 1185–1212 (2015)
Khan, K.A.: Branch-locking AD techniques for nonsmooth composite functions and nonsmooth implicit functions. Optim. Methods Softw. 33(4–6), 1127–1155 (2018)
Schumacher, J.M.: Complementarity systems in optimization. Math. Program. 101, 263–295 (2004)
Pang, J.S., Stewart, D.E.: Differential variational inequalities. Math. Program. 113, 345–424 (2008)
Clarke, F.H.: Optimization and Nonsmooth Analysis. SIAM, Philadelphia (1990)
Barton, P.I., Khan, K.A., Stechlinski, P., Watson, H.A.J.: Computationally relevant generalized derivatives: theory, evaluation and applications. Optim. Methods Softw. 33, 1030–1072 (2018)
Ralph, D., Scholtes, S.: Sensitivity analysis of composite piecewise smooth equations. Math. Program. 76, 593–612 (1997)
Stechlinski, P., Khan, K.A., Barton, P.I.: Generalized sensitivity analysis of nonlinear programs. SIAM J. Optim. 28(1), 272–301 (2018)
Facchinei, F., Pang, J.S.: Finite-Dimensional Variational Inequalities and Complementarity Problems. Springer, New York (2003)
Nesterov, Y.: Lexicographic differentiation of nonsmooth functions. Math. Program. 104, 669–700 (2005)
Khan, K.A., Barton, P.I.: Generalized derivatives for solutions of parametric ordinary differential equations with non-differentiable right-hand sides. J. Optim. Theory Appl. 163, 355–386 (2014)
Khan, K.A., Barton, P.I.: Generalized derivatives for hybrid systems. IEEE Trans. Autom. Control 62(7), 3193–3208 (2017)
Qi, L., Sun, J.: A nonsmooth version of Newton’s method. Math. Program. 58, 353–367 (1993)
Lukšan, L., Vlček, J.: A bundle-Newton method for nonsmooth unconstrained minimization. Math. Program. 83, 373–391 (1998)
Stechlinski, P., Patrascu, M., Barton, P.I.: Nonsmooth DAEs with applications in modeling phase changes. In: Campbell, S., Ilchmann, A., Mehrmann, V., Reis, T. (eds.) Applications of Differential-Algebraic Equations: Examples and Benchmarks, Differential-Algebraic Equations Forum. Springer, Berlin (2018)
Fiacco, A.V.: Introduction to Sensitivity and Stability Analysis in Nonlinear Programming. Academic Press, New York (1983)
Galán, S., Feehery, W.F., Barton, P.I.: Parametric sensitivity functions for hybrid discrete/continuous systems. Appl. Numer. Math. 31, 17–47 (1999)
Kleinert, J., Simeon, B.: Differential-algebraic equations and beyond: from smooth to nonsmooth constrained dynamical systems. arXiv preprint arXiv:1811.07658 (2018)
Acknowledgements
The author would like to thank the anonymous reviewers for their helpful comments.
Ethics declarations
Conflict of interest
The author declares that they have no conflict of interest.
Additional information
Communicated by Lorenz T. Biegler.
Appendix
Proof of Proposition 2.1
Without loss of generality, let \(\{{\mathbf {f}}_{(1)},\ldots ,{\mathbf {f}}_{(k)}\}\) be a set of essentially active \(C^1\) selection functions of \({\mathbf {f}}\) at \({\mathbf {x}}\). The set \(\varLambda {\mathbf {f}}({\mathbf {x}})\) is nonempty since \({\mathbf {J}}{\mathbf {f}}_{(i)}({\mathbf {x}}) \in \varLambda {\mathbf {f}}({\mathbf {x}})\) for each \(i\in \{1,\ldots ,k\}\), and it is finite, with \(|\varLambda {\mathbf {f}}({\mathbf {x}})| \le k^n\), and therefore compact. To show upper semicontinuity, choose any \(\varepsilon >0\). Since \(\partial _{\mathrm{B}}{\mathbf {f}}\) is upper semicontinuous [28], there exists \(\delta >0\) such that \(\partial _{\mathrm{B}} {\mathbf {f}}({\mathbf {y}}) \subset B_{\varepsilon ^*}(\partial _{\mathrm{B}}{\mathbf {f}}({\mathbf {x}}))\) for all \({\mathbf {y}}\in B_{\delta }({\mathbf {x}})\), where \(\varepsilon ^*=\varepsilon /n\) and \(B_{\varepsilon ^*}(\partial _{\mathrm{B}}{\mathbf {f}}({\mathbf {x}}))=\left\{ {\mathbf {F}}+ \varepsilon ^* {\mathbf {Y}}: {\mathbf {F}}\in \partial _{\mathrm{B}}{\mathbf {f}}({\mathbf {x}}), \Vert {\mathbf {Y}}\Vert <1\right\} \). Choose any point \({\mathbf {x}}^{\delta } \in B_{\delta }({\mathbf {x}})\), and enumerate the B-subdifferentials of \({\mathbf {f}}\) at \({\mathbf {x}}\) and at \({\mathbf {x}}^{\delta }\) by \(\{{\mathbf {F}}_{(1)},\ldots ,{\mathbf {F}}_{(k)}\}=\partial _{\mathrm{B}}{\mathbf {f}}({\mathbf {x}})\) and \(\{{\mathbf {F}}^{\delta }_{(1)},\ldots ,{\mathbf {F}}^{\delta }_{(q)}\} =\partial _{\mathrm{B}}{\mathbf {f}}({\mathbf {x}}^{\delta })\), respectively.
Let \(\pmb {\varGamma }:\partial _{\mathrm{B}} {\mathbf {f}}({\mathbf {x}}^{\delta }) \rightarrow \partial _{\mathrm{B}} {\mathbf {f}}({\mathbf {x}})\) be defined as follows: given \({\mathbf {F}}^{\delta } \in \partial _{\mathrm{B}} {\mathbf {f}}({\mathbf {x}}^{\delta })\), let \(\pmb {\varGamma }({\mathbf {F}}^{\delta })={\mathbf {F}}\in \partial _{\mathrm{B}} {\mathbf {f}}({\mathbf {x}})\) be such that \({\mathbf {F}}^{\delta }={\mathbf {F}}+\varepsilon ^* {\mathbf {Y}}\) for some \(\Vert {\mathbf {Y}}\Vert <1\). For any \(i\in \{1,\ldots ,n\}\), let the mapping \(\varTheta _i:\varLambda {\mathbf {f}}({\mathbf {x}}^{\delta }) \rightarrow \partial _{\mathrm{B}}{\mathbf {f}}({\mathbf {x}}^{\delta })\) be defined as follows: given \({\mathbf {F}}^{\varLambda } \in \varLambda {\mathbf {f}}({\mathbf {x}}^{\delta })\), let \(\varTheta _i({\mathbf {F}}^{\varLambda })={\mathbf {F}}^{\delta } \in \partial _{\mathrm{B}}{\mathbf {f}}({\mathbf {x}}^{\delta })\) be such that \(\text {row}_i({\mathbf {F}}^{\varLambda })=\text {row}_i({\mathbf {F}}^{\delta })\).
Choose any \(\bar{{\mathbf {F}}}^{\delta } \in \varLambda {\mathbf {f}}({\mathbf {x}}^{\delta })\) and let \(\bar{{\mathbf {F}}}\) be the matrix defined row-wise by \(\text {row}_i(\bar{{\mathbf {F}}})=\text {row}_i(\pmb {\varGamma }(\varTheta _i(\bar{{\mathbf {F}}}^{\delta })))\) for each \(i\in \{1,\ldots ,n\}\).
For any \(i\in \{1,\ldots ,n\}\), \(\pmb {\varGamma }(\varTheta _i(\bar{{\mathbf {F}}}^{\delta }))={\mathbf {F}}_{(j)} \in \partial _{\mathrm{B}} {\mathbf {f}}({\mathbf {x}})\) s.t. \({\mathbf {F}}^{\delta }_{(l)}={\mathbf {F}}_{(j)}+\varepsilon ^* {\mathbf {Y}}_{(i)}\) for some \({\mathbf {F}}^{\delta }_{(l)}=\varTheta _i(\bar{{\mathbf {F}}}^{\delta }) \in \partial _{\mathrm{B}}{\mathbf {f}}({\mathbf {x}}^{\delta })\) satisfying \(\text {row}_i(\bar{{\mathbf {F}}}^{\delta })=\text {row}_i({\mathbf {F}}^{\delta }_{(l)})\) and \(\Vert {\mathbf {Y}}_{(i)}\Vert <1\). Thus, \(\text {row}_i(\bar{{\mathbf {F}}})=\text {row}_i(\pmb {\varGamma }(\varTheta _i(\bar{{\mathbf {F}}}^{\delta })))=\text {row}_i({\mathbf {F}}_{(j)})\), implying that \(\bar{{\mathbf {F}}} \in \varLambda {\mathbf {f}}({\mathbf {x}})\). Moreover, \(\text {row}_i(\bar{{\mathbf {F}}}^{\delta }-\bar{{\mathbf {F}}}) =\text {row}_i(\bar{{\mathbf {F}}}^{\delta })-\text {row}_i(\pmb {\varGamma }(\varTheta _i(\bar{{\mathbf {F}}}^{\delta }))) =\text {row}_i(\bar{{\mathbf {F}}}^{\delta })-\text {row}_i({\mathbf {F}}_{(j)}) =\text {row}_i({\mathbf {F}}^{\delta }_{(l)})-\text {row}_i({\mathbf {F}}_{(j)}) =\varepsilon ^* \text {row}_i({\mathbf {Y}}_{(i)})\).
The above holds for any \(i\in \{1,\ldots ,n\}\), implying that \( \bar{{\mathbf {F}}}^{\delta }-\bar{{\mathbf {F}}}= \varepsilon ^* \bar{{\mathbf {Y}}} \), where \(\bar{{\mathbf {Y}}}\) is the matrix defined row-wise by \(\text {row}_i(\bar{{\mathbf {Y}}})=\text {row}_i({\mathbf {Y}}_{(i)})\) for each \(i\).
Since \(\Vert {\mathbf {Y}}_{(i)}\Vert <1\) for each \(i\), it follows that \(\Vert {\mathbf {Y}}_{(i)}\Vert _{\infty } < \sqrt{n}\), and thus \(\Vert \bar{{\mathbf {Y}}}\Vert \le \sqrt{n}\, \Vert \bar{{\mathbf {Y}}}\Vert _{\infty } \le \sqrt{n} \max _i \Vert {\mathbf {Y}}_{(i)}\Vert _{\infty } < n\).
Hence, \(\bar{{\mathbf {F}}}^{\delta }=\bar{{\mathbf {F}}}+\varepsilon \left( \frac{1}{n} \bar{{\mathbf {Y}}}\right) \) where \(\Vert \frac{1}{n} \bar{{\mathbf {Y}}}\Vert <1\). Thus, \(\bar{{\mathbf {F}}}^{\delta } \in B_{\varepsilon }(\varLambda {\mathbf {f}}({\mathbf {x}}))\), from which upper semicontinuity follows.
Choose \(n^* \in {\mathbb {N}}\) such that \(B_{1/n^*}(\varOmega ) \subset X\). If \({\mathbf {f}}\) is not CCO on a neighborhood of \(\varOmega \), then for any \(n \ge n^*\), there must exist \({\mathbf {x}}_{n} \in B_{1/n}(\varOmega ) {\setminus } \varOmega \) such that \({\mathbf {f}}\) is not CCO at \({\mathbf {x}}_{n}\). Let \({\mathbf {x}}^* \in \varOmega \) be an accumulation point of the sequence \(\{{\mathbf {x}}_{n}\}\). Then, since \({\mathbf {f}}\) is CCO on \(\varOmega \), \({{\,\mathrm{sign}\,}}(\det ({\mathbf {F}}))=1\) for all \({\mathbf {F}}\in \varLambda {\mathbf {f}}({\mathbf {x}}^*)\), without loss of generality. Then, by the arguments in the proof of [32, Lemma 7.5.2], upper semicontinuity of \(\varLambda {\mathbf {f}}\) at \({\mathbf {x}}^*\), along with continuity of the determinant, imply the existence of \(\rho >0\) such that \({{\,\mathrm{sign}\,}}(\det ({\mathbf {F}}^{\rho }))=1\) for all \({\mathbf {F}}^{\rho } \in \varLambda {\mathbf {f}}({\mathbf {x}}^{\rho })\) and all \({\mathbf {x}}^{\rho } \in B_{\rho }({\mathbf {x}}^*) \subset X\). However, for some \(\widetilde{n} \ge n^*\), \({\mathbf {x}}_{\widetilde{n}} \in B_{\rho }({\mathbf {x}}^*)\), a contradiction. \(\square \)
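The row-wise construction of the combinatorial Jacobian \(\varLambda {\mathbf {f}}\) used in the proof above can be enumerated directly for small examples. The sketch below is a hypothetical illustration (the function and the reading of CCO as "all determinants nonzero with a common sign" are assumptions, not the article's formal definitions): it mixes rows of the selection-function Jacobians and checks the sign condition.

```python
import itertools
import numpy as np

# Hypothetical PC^1 function (not from the article): f(x) = (2*x1 + |x1|, 2*x2 + |x2|),
# whose essentially active C^1 selections at the origin have the diagonal Jacobians below.
selection_jacobians = [np.diag([2.0 + s1, 2.0 + s2])
                       for s1 in (-1.0, 1.0) for s2 in (-1.0, 1.0)]

def combinatorial_jacobian(jacobians):
    # Row-wise mixing as in the proof of Proposition 2.1: each row i of a member
    # of Lambda f(x) agrees with row i of some selection-function Jacobian.
    n = jacobians[0].shape[0]
    rows_per_index = [{tuple(J[i]) for J in jacobians} for i in range(n)]
    return [np.array(rows) for rows in itertools.product(*rows_per_index)]

def is_cco(jacobians, tol=1e-12):
    # assumed reading of CCO: all determinants nonzero with a common sign
    dets = [np.linalg.det(F) for F in combinatorial_jacobian(jacobians)]
    return all(abs(d) > tol for d in dets) and len({np.sign(d) for d in dets}) == 1

# here every row mixture has positive determinant, so f is CCO at the origin
```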
Proof of Proposition 2.2
The function \({\mathbf {f}}\) is \(PC^1\) at \(({\mathbf {x}}^*,{\mathbf {y}}^*)\) by construction, with essentially active \(C^1\) selection functions \({\mathbf {f}}_{(i)}:({\mathbf {x}},{\mathbf {y}}) \mapsto ({\mathbf {x}},{\mathbf {g}}_{(i)}({\mathbf {x}},{\mathbf {y}}))\), where \(\mathscr {F}_{{\mathbf {g}},({\mathbf {x}}^*,{\mathbf {y}}^*)}=\{{\mathbf {g}}_{(i)} \}\) is a set of essentially active \(C^1\) selection functions of \({\mathbf {g}}\) at \(({\mathbf {x}}^*,{\mathbf {y}}^*)\). Every element of the combinatorial Jacobian of \({\mathbf {f}}\) at \(({\mathbf {x}}^*,{\mathbf {y}}^*)\) therefore has the block lower-triangular form \(\left[ \begin{matrix} {\mathbf {I}}_n & {\mathbf {0}} \\ {\mathbf {A}}_{(j)} & {\mathbf {B}}_{(j)} \end{matrix} \right] \), \(j \in \{1,\ldots ,n_s\}\), where \(n_s \le 2^{|\mathscr {F}_{{\mathbf {g}},({\mathbf {x}}^*,{\mathbf {y}}^*)}|}\), and for any \(j \in \{1,\ldots ,n_{s}\}\), \(k \in \{1,\ldots ,m\}\), \(\text {row}_k\left( \left[ {\mathbf {A}}_{(j)} \;\; {\mathbf {B}}_{(j)}\right] \right) =\text {row}_k\left( \left[ {\mathbf {J}}_{{\mathbf {x}}}{\mathbf {g}}_{(i)}({\mathbf {x}}^*,{\mathbf {y}}^*) \;\; {\mathbf {J}}_{{\mathbf {y}}}{\mathbf {g}}_{(i)}({\mathbf {x}}^*,{\mathbf {y}}^*)\right] \right) \) for some \({\mathbf {g}}_{(i)} \in \mathscr {F}_{{\mathbf {g}},({\mathbf {x}}^*,{\mathbf {y}}^*)}\) by definition of \(\varLambda {\mathbf {f}}\). Since the determinant of each such matrix equals \(\det {\mathbf {B}}_{(j)}\), it follows that \({\mathbf {f}}\) is CCO at \(({\mathbf {x}}^*,{\mathbf {y}}^*)\) if and only if \({\mathbf {g}}\) is CCO w.r.t. \({\mathbf {y}}\) at \(({\mathbf {x}}^*,{\mathbf {y}}^*)\), as required. \(\square \)
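The equivalence at the end of this proof rests on the determinant identity for block lower-triangular matrices with an identity upper-left block; a quick numerical sanity check (the blocks below are random placeholders, not from the article):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 2
A = rng.standard_normal((m, n))
B = rng.standard_normal((m, m))

# block lower-triangular matrix with identity upper-left block, as in the proof
F = np.block([[np.eye(n), np.zeros((n, m))],
              [A,         B]])

# det [[I, 0], [A, B]] = det(I) * det(B) = det(B)
assert np.isclose(np.linalg.det(F), np.linalg.det(B))
```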
Proof of Theorem 2.1
First we demonstrate \({\mathbf {g}}\) is CCO w.r.t. \({\mathbf {y}}\) on a neighborhood of \(\varOmega \): Define the mapping \({\mathbf {f}}: W \rightarrow {\mathbb {R}}^{n} \times {\mathbb {R}}^m:({\mathbf {x}},{\mathbf {y}}) \mapsto ({\mathbf {x}},{\mathbf {g}}({\mathbf {x}},{\mathbf {y}}))\). Then \({\mathbf {f}}\) is CCO at \(({\mathbf {x}}^*,{\mathbf {y}}^*)\) if and only if \({\mathbf {g}}\) is CCO w.r.t. \({\mathbf {y}}\) at \(({\mathbf {x}}^*,{\mathbf {y}}^*)\) by Proposition 2.2, from which it follows that \({\mathbf {f}}\) is CCO on \(\varOmega \). Consequently, \({\mathbf {f}}\) is CCO on \(B_{\gamma }(\varOmega ) \subset W\) for some \(\gamma >0\) by Proposition 2.1, from which it follows that \({\mathbf {g}}\) is CCO w.r.t. \({\mathbf {y}}\) on \(B_{\gamma }(\varOmega ) \subset W\), implying (b) holds.
Next, we argue (a) holds: Since \({\mathbf {g}}\) is CCO with respect to \({\mathbf {y}}\) and \({\mathbf {g}}({\mathbf {x}},{\mathbf {y}})={\mathbf {0}}\) for each \(({\mathbf {x}},{\mathbf {y}}) \in \varOmega \), we may proceed as in the proof of [20, Theorem 3.5], but instead apply the \(PC^1\) local implicit function theorem [31, Theorem 3.4] to each point in \(\varOmega \) to furnish the family \(\mathscr {F}_{{\mathbf {r}}}=\{{\mathbf {r}}_{\mathbf {x}}:{\mathbf {x}}\in \pi _x \varOmega \}\) of local \(PC^1\) implicit functions, with corresponding collection of domains \(\{N_{{\mathbf {x}}} \subset \pi _{\mathbf {x}}W: {\mathbf {x}}\in \pi _x \varOmega \}\) that are neighborhoods of \({\mathbf {x}}\), where \(\pi _x W=\{{\mathbf {x}}:\exists ({\mathbf {x}},{\mathbf {y}}) \in W\}\). Then, continuing as in [20, Theorem 3.5], a \(PC^1\) extended implicit function \({\mathbf {r}}:B_{\delta }(\pi _x \varOmega ) \subset \pi _x W \rightarrow {\mathbb {R}}^m\) can be constructed, for some \(\delta >0\), using finitely many of these \(PC^1\) local implicit functions, say \(\{{\mathbf {r}}_{{\mathbf {x}}_{(1)}},{\mathbf {r}}_{{\mathbf {x}}_{(2)}},\ldots ,{\mathbf {r}}_{{\mathbf {x}}_{(q)}}\}\), such that for each \({\mathbf {x}}\in B_{\delta }(\pi _x \varOmega )\), \(({\mathbf {x}},{\mathbf {r}}({\mathbf {x}}))\) is the unique vector in \(B_{\xi }(\varOmega )\) satisfying \({\mathbf {g}}({\mathbf {x}},{\mathbf {r}}({\mathbf {x}}))={\mathbf {0}}\), for some \(\xi >0\). Moreover, the function \({\mathbf {r}}\) is \(PC^1\) on its domain by construction; for any \({\mathbf {x}}\in B_{\delta }(\pi _x \varOmega )\), \({\mathbf {r}}({\mathbf {x}}) \in \{{\mathbf {r}}_{{\mathbf {x}}_{(1)}}({\mathbf {x}}),{\mathbf {r}}_{{\mathbf {x}}_{(2)}}({\mathbf {x}}),\ldots ,{\mathbf {r}}_{{\mathbf {x}}_{(q)}}({\mathbf {x}})\}\). Hence, Conclusion (i) holds with \(\rho =\min (\xi ,\gamma )\), shrinking \(\delta \) if necessary.
Lastly, since a \(PC^1\) function is L-smooth, the arguments from the proof of [21, Theorem 3.2] can be repeated, with the above \(PC^1\) extended implicit function result, to show Eq. (5) holds. \(\square \)
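A minimal numerical sketch of the extended implicit function in Theorem 2.1 (an assumed example, not the article's system): for \(g(x,y)=2y+|y|-x\), which is CCO w.r.t. \(y\) (the partial derivatives of its selections are \(3\) and \(1\), both positive), the implicit function is the \(PC^1\) map \(r(x)=x/3\) for \(x \ge 0\) and \(r(x)=x\) for \(x<0\); a semismooth Newton iteration in \(y\) recovers it across the kink.

```python
def g(x, y):
    # hypothetical PC^1 residual with selections 3*y - x (y >= 0) and y - x (y < 0)
    return 2.0 * y + abs(y) - x

def dg_dy(y):
    # one element of the generalized derivative of g w.r.t. y; both values are
    # positive, so g is CCO with respect to y everywhere
    return 3.0 if y >= 0 else 1.0

def r(x, y=1.0, tol=1e-12):
    # semismooth Newton in y: converges finitely for this piecewise linear residual
    for _ in range(50):
        res = g(x, y)
        if abs(res) < tol:
            break
        y -= res / dg_dy(y)
    return y

# r(3.0) -> 1.0 (selection x/3 active), r(-2.0) -> -2.0 (selection x active)
```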
Proof of Corollary 2.1
Conclusion (i) follows since \(\{(t,{\mathbf {p}},{\mathbf {z}}(t,{\mathbf {p}})): t \in T, {\mathbf {p}}\in P\} \subset G_{\mathrm{R}}^{P} \subset G_{\mathrm{R}}^\mathrm{L}\). To show (ii), suppose that \((t_0,{\mathbf {p}}_0,{\mathbf {x}}_0,{\mathbf {y}}_0) \in G_{\mathrm{R}}^P \cap G_{\mathrm{C}}^0 \subset G_{\mathrm{R}}^{\mathrm{L}} \cap G_{\mathrm{C}}^0\). Then, since \(G_{\mathrm{R}}^P\) is open, there exists \(\rho >0\) sufficiently small such that \(B_{\rho }(t_0,{\mathbf {p}}_0,{\mathbf {x}}_0,{\mathbf {y}}_0) \subset G_{\mathrm{R}}^P\), and, by continuity of the (local) solution \(t \mapsto (t,{\mathbf {p}}_0,{\mathbf {z}}(t,{\mathbf {p}}_0))\), there exists \(\alpha \in (0,\varepsilon )\) such that \((t,{\mathbf {p}}_0,{\mathbf {z}}(t,{\mathbf {p}}_0)) \in B_{\rho }(t_0,{\mathbf {p}}_0,{\mathbf {x}}_0,{\mathbf {y}}_0) \subset G_{\mathrm{R}}^P\) for all \(t \in [t_0-\alpha ,t_0+\alpha ]\).
To prove (iii), note that compactness of \(\{(t,{\mathbf {p}}_0,\widetilde{{\mathbf {z}}}(t,{\mathbf {p}}_0)): t\in [t_l,t_u]\} \subset G_{\mathrm{R}}^{ \mathrm P}\) combined with openness of \(G_{\mathrm{R}}^{ \mathrm P}\) implies the existence of \(\rho >0\) such that \(B_{\rho }(\{(t,{\mathbf {p}}_0,\widetilde{{\mathbf {z}}}(t,{\mathbf {p}}_0)): t\in [t_l,t_u]\}) \subset G_{\mathrm{R}}^{ \mathrm P}\). In addition, since \(t \mapsto (t,{\mathbf {p}}_0,\widetilde{{\mathbf {z}}}(t,{\mathbf {p}}_0))\) is continuous, there exists \(\varepsilon \in (0, \alpha )\) such that
implying P-regularity of \(\widetilde{{\mathbf {z}}}\) on \([t_l-\alpha ,t_u+\alpha ] \times \{{\mathbf {p}}_0\}\). Lastly, to prove maximal regular continuation, define the set of augmented graphs of regular continuations as
\(\varGamma _{\mathrm{ext}}\) is nonempty since \(\{(t,{\mathbf {p}}_0,\widetilde{{\mathbf {z}}}(t,{\mathbf {p}}_0)):t \in [t_l-\alpha ,t_u+\alpha ]\} \in \varGamma _{\mathrm{ext}}\). Zorn’s Lemma can be applied with the ordering defined by \( \varPhi \preceq \varPsi \) for \(\varPhi , \varPsi \in \varGamma _{\mathrm{ext}}\) if and only if \(\varPhi \subset \varPsi \), which is a partial ordering (see the proof of [20, Theorem 6.4]). Given any nonempty totally ordered subset \(\varGamma _{\mathrm{ext}}^*=\{\varOmega _{(i)}:i \in A\} \subset \varGamma _{\mathrm{ext}}\), where \(\varOmega _{(i)}=\{(t,{\mathbf {p}}_0,{\mathbf {z}}_{(i)}(t,{\mathbf {p}}_0)):t \in T_{(i)}\}\), the element \(\varOmega _u=\{(t,{\mathbf {p}}_0,{\mathbf {z}}_u(t)):t \in T_u\}\) is an upper bound of \(\varGamma _{\mathrm{ext}}^*\), where \(T_u=\bigcup _{i \in A} T_{(i)}\) and the mapping \({\mathbf {z}}_u:t \mapsto {\mathbf {z}}_{(i)}(t,{\mathbf {p}}_0), \; t \in T_{(i)}\) is single-valued since for any \(i, j \in A\), \({\mathbf {z}}_{(i)}(t,{\mathbf {p}}_0)={\mathbf {z}}_{(j)}(t,{\mathbf {p}}_0)\) for all \(t \in T_{(i)} \cap T_{(j)}\). Zorn’s Lemma implies that \(\varGamma _{\mathrm{ext}}\) contains maximal elements; that is, there exists \(\varOmega _{\mathrm{max}}= \{(t,{\mathbf {p}}_0,{\mathbf {z}}_{\mathrm{max}}(t,{\mathbf {p}}_0)):t \in T_{\mathrm{max}}\} \in \varGamma _{\mathrm{ext}}\), such that \(\varOmega \subset \varOmega _{\mathrm{max}}\) for any \(\varOmega \in \varGamma _{\mathrm{ext}}\). It is then possible to show that \(T_{\mathrm{max}}=(t_L,t_U)\) since otherwise P-regularity allows for application of Theorem 2.2 (ii) to \((t_L,{\mathbf {p}}_0,{\mathbf {z}}_{\mathrm{max}}(t_L,{\mathbf {p}}_0))\) or \((t_U,{\mathbf {p}}_0,{\mathbf {z}}_{\mathrm{max}}(t_U,{\mathbf {p}}_0))\) to continue the solution on a strict superset of \(T_{\mathrm{max}}\). \(\square \)
Cite this article
Stechlinski, P. Optimization-Constrained Differential Equations with Active Set Changes. J Optim Theory Appl 187, 266–293 (2020). https://doi.org/10.1007/s10957-020-01744-4