1 Introduction

This paper examines in a real Hilbert space H the motion of the singularities of the Moreau envelopes [35, 36] of a lower semicontinuous function \(f:H\rightarrow {\mathbb {R}}\cup \{\infty \}\), i.e.,

$$\begin{aligned} f_t(x)=\inf _{y\in H}\left( f(y)+\frac{1}{2t}\Vert x-y\Vert ^2\right) ,\qquad t>0,\; x\in H. \end{aligned}$$
(1)

In the absence of convexity of f, we prove that singularities, in a sense made precise below, propagate along intrinsic characteristic curves, a notion introduced for Hamilton–Jacobi equations in a finite-dimensional context by Cannarsa and Cheng in [13]. The accompanying set of minimizers

$$\begin{aligned} P_{tf}(x) =\mathop {\text {arg min}}\limits _{y\in H}\left( tf(y)+\frac{1}{2}\Vert x-y\Vert ^2\right) \end{aligned}$$

defines the proximal mapping \(P_{tf}:H\rightrightarrows H\) of index t. Assuming

$$\begin{aligned} \inf _{x\in H}\left( f(x)+\alpha \Vert x\Vert ^2 \right) \in {\mathbb {R}}\quad \text {for all }\,\, \alpha >0, \end{aligned}$$
(2)

\(f_t\) is a semiconcave (see Lemma 2 below) real-valued function for all \(t\in (0,\infty )\) such that \(\lim _{t\downarrow 0}f_t(x)=f(x)\) for all \(x\in H\). The distance function \(d_E\) appears when f is the indicator function \({\mathfrak {I}}_E\) of a closed nonempty subset E of H (i.e., \({\mathfrak {I}}_E(x)=0\) if \(x \in E\) while \({\mathfrak {I}}_E(x)=\infty \) otherwise). Indeed, then

$$\begin{aligned} f_t(x)=\frac{1}{2t}d_E^2(x)\quad \text {and}\quad P_{tf}=P_E, \end{aligned}$$

where the distance function and the associated metric projection to E send each \(x\in H\) to

$$\begin{aligned} d_E(x)&=\inf _{y\in E}\Vert x-y\Vert \quad \text {and}\\ P_E(x)&= \mathop {\text {arg min}}\limits _{y\in E}\Vert x-y\Vert =\{y\in E:\Vert x-y\Vert =d_E(x)\} , \end{aligned}$$

respectively.
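The identity \(f_t=d_E^2/(2t)\) can be checked numerically when E is a finite set, since the infimum in (1) then only involves the points of E. The following is a minimal sketch under that assumption; the set `E`, the point `x` and all function names are illustrative choices, not taken from the paper.

```python
import math

def moreau_envelope_indicator(E, x, t):
    """f_t(x) for f = indicator of the finite set E: the infimum in (1)
    only sees points y in E, where f(y) = 0."""
    return min(sum((xi - yi) ** 2 for xi, yi in zip(x, y)) / (2.0 * t) for y in E)

def distance(E, x):
    """d_E(x) for a finite set E."""
    return min(math.dist(x, y) for y in E)

def projection(E, x, tol=1e-12):
    """P_E(x): all points of E realizing the distance (up to tol)."""
    d = distance(E, x)
    return [y for y in E if abs(math.dist(x, y) - d) <= tol]

# Hypothetical data: three points in the plane and a point with two nearest neighbours.
E = [(1.0, 0.0), (-1.0, 0.0), (0.0, 2.0)]
x = (0.0, 0.5)

# Largest deviation between f_t(x) and d_E(x)^2 / (2t) over a few values of t.
max_gap = max(
    abs(moreau_envelope_indicator(E, x, t) - distance(E, x) ** 2 / (2.0 * t))
    for t in (0.5, 1.0, 3.0)
)
nearest = projection(E, x)
```

Here `nearest` contains both \((1,0)\) and \((-1,0)\), so the chosen point x has a multivalued projection, independently of t.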

In general, \(f_t\) is nondifferentiable; in fact, \(f_t\) is everywhere Fréchet differentiable in H when \(0<t<T\) if and only if \(f+(2T)^{-1}\Vert \cdot \Vert ^2\) is a convex function. (The definitions of Fréchet and Gâteaux differentiability are recalled in Section 2.) In the literature, the functions (1) are also known as the Moreau–Yosida approximations of f in particular in the presence of convexity. For general functions f, an application of Asplund’s [7] results on generic differentiability of convex functions shows that \((t,x)\mapsto f_t(x)\) is Fréchet differentiable on a dense \(G_\delta \). The connection to the Hamilton–Jacobi equation

$$\begin{aligned} \frac{\partial S}{\partial t}+\frac{1}{2}\Vert d_x S\Vert ^2&=0\quad \text {in } (0,\infty )\times H, \end{aligned}$$
(3)
$$\begin{aligned} \lim _{t\downarrow 0} S(t,x)&=f(x)\quad \text {in }\, H, \end{aligned}$$
(4)

is classical and well understood in particular when \(H={\mathbb {R}}^n\). The Moreau envelope \(S(t,x)=f_t(x)\) is a viscosity solution of the Cauchy problem (3)–(4) also in infinite dimensions [37, 38]; see Proposition 7.
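At points of differentiability the equation (3) holds in the classical sense, which can be observed with finite differences. The sketch below assumes \(H={\mathbb {R}}\) and \(E=\{-1,1\}\), so that \(S(t,x)=(|x|-1)^2/(2t)\); the sample points avoid \(x=0\), where S has a kink. The function names and the step size are illustrative assumptions.

```python
def S(t, x):
    """Moreau envelope of the indicator of E = {-1, 1}: d_E(x)^2 / (2 t)."""
    return (abs(x) - 1.0) ** 2 / (2.0 * t)

def hj_residual(t, x, h=1e-5):
    """Central-difference approximation of S_t + (1/2) |S_x|^2 at (t, x)."""
    S_t = (S(t + h, x) - S(t - h, x)) / (2.0 * h)
    S_x = (S(t, x + h) - S(t, x - h)) / (2.0 * h)
    return S_t + 0.5 * S_x ** 2

# Sample points (t, x) with x != 0, i.e. away from the kink of S(t, .).
samples = [(0.5, 0.3), (2.0, -0.7), (1.0, 1.8), (3.0, -2.5)]
residuals = [hj_residual(t, x) for t, x in samples]
```

The residuals vanish up to discretization error; this says nothing about the viscosity property at singular points, which requires the argument of Proposition 7.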

If H is finite-dimensional, then the infimum (1) is always attained and the following conditions are equivalent:

  1. (i)

    \(f_t\) is Fréchet differentiable at x;

  2. (ii)

    \(f_t\) is Gâteaux differentiable at x;

  3. (iii)

    \(P_{tf}(x)\) is a singleton.

Adopting a term from Hamilton–Jacobi theory, a point \((t,x)\in (0,\infty )\times {\mathbb {R}}^n\) is called singular or a singularity if \(f_t\) fails to be differentiable at x. We conclude that \((t,x)\) is a singularity if and only if there exists more than one minimizer y in (1).
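The criterion "more than one minimizer" can be made concrete in one dimension. The following sketch assumes the double-well potential \(f(y)=(y^2-1)^2\), which is nonconvex and satisfies (2), and replaces \({\mathbb {R}}\) by a symmetric grid; the potential, the grid and the tolerance are illustrative assumptions.

```python
def f(y):
    """A nonconvex double-well potential (illustrative choice, satisfies (2))."""
    return (y * y - 1.0) ** 2

def prox_set(x, t, n=3000, h=1e-3, tol=1e-12):
    """Grid minimizers of t f(y) + |x - y|^2 / 2 over y = k h, |k| <= n.
    The grid is symmetric about 0, so ties at x = 0 are detected exactly."""
    ys = [k * h for k in range(-n, n + 1)]
    vals = [t * f(y) + 0.5 * (x - y) ** 2 for y in ys]
    m = min(vals)
    return [y for y, v in zip(ys, vals) if v - m <= tol]

at_origin = prox_set(0.0, 1.0)   # symmetric point: two minimizers expected
off_origin = prox_set(0.5, 1.0)  # generic point: a single minimizer expected
```

On this grid `at_origin` consists of two symmetric points near \(\pm \sqrt{3}/2\), so \((1,0)\) is a singularity, whereas `off_origin` is a singleton.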

The situation is less clear-cut and, in fact, intricate when \(\dim H=\infty \). First, the infimum (1) need not be attained. Still, it is true that (ii) \(\Leftarrow \) (i) \(\Rightarrow \) (iii) but no other implication holds true between any other pair of these conditions. We choose to distinguish between two kinds of singularities one of which is stronger than the other when \(\dim H=\infty \).

Definition 1

A point \((t,x)\) is called singular or a singularity for \(S(t,x)=f_t(x)\) if \(f_t\) fails to be Fréchet differentiable at x. The set of all singular points in \((0,\infty )\times H\) is denoted by \(\Sigma \).

Definition 2

A point \((t,x)\) is called strongly singular or a strong singularity for \(S(t,x)=f_t(x)\) if either \(f_t\) fails to be Gâteaux differentiable at x or \(P_{tf}(x)=\emptyset \). The set of all strongly singular points in \((0,\infty )\times H\) is designated by \(\Sigma _{\mathrm s}\).

On the one hand, \(\Sigma \) and \(\Sigma _{\mathrm s}\) are nonempty unless f is a convex function and on the other \(\Sigma _{\mathrm s}\subseteq \Sigma \); see Proposition 6. Theorem 10 and Example 3 shed light on subtleties regarding \(\Sigma \) and \(\Sigma _{\mathrm s}\) for Moreau envelopes \(f_t\) showing in particular that \(\Sigma \) and \(\Sigma _{\mathrm s}\) disagree in general. When E is nonconvex and \(f={\mathfrak {I}}_E\), \(f_t(\cdot )=d_E^2(\cdot )/(2t)\) is not a globally differentiable function. Clearly, the question of differentiability of \(f_t\) is in this case independent of t. With a slight abuse of notation, we therefore understand that

$$\begin{aligned} \Sigma =\{x\in H:d^2_E\ \text {is not Fréchet differentiable at}\ x \} \end{aligned}$$

and

$$\begin{aligned} \Sigma _{\mathrm {s}}=\{x\in H:P_E(x)=\emptyset \ \text {or}\ d^2_E\ \text {is not Gâteaux differentiable at}\ x \}. \end{aligned}$$

Needless to say, \(\Sigma \) and E are disjoint sets. It follows from Fitzpatrick’s paper [23] that actually \(\Sigma =\Sigma _{\mathrm s}\) in this case; see Theorem 11. The following fundamental result on singular dynamics for distance functions in a Hilbert space has been an inspiration for this work.

Theorem 1

(Frerking and Westphal [24]) Suppose that \(x_0\in \Sigma .\) Then, unless it is an isolated singularity, \(x_0\) lies on a nonconstant Lipschitz arc each point of which is a member of \(\Sigma .\) Furthermore, \(x_0\) is an isolated singularity if and only if \(P_E(x_0)={\mathbb {S}}(x_0,d_E(x_0)),\) the sphere about \(x_0\) of radius \(d_E(x_0)\).

There is a substantial, highly successful and fast expanding literature on singular dynamics for Hamilton–Jacobi equations set in \({\mathbb {R}}^n\) covering (3) when \(\dim H<\infty \); see, e.g., the survey articles [14, 15] by Cannarsa and Cheng. At the heart of this branch of research there are two main concepts of characteristics as well as some refinements. It has been established when \(\dim H <\infty \) that singularities propagate along Lipschitz continuous arcs \({\varvec{X}}(t)\) that are characterized as generalized characteristics. This notion was first defined and studied in \({\mathbb {R}}^n\) by Albano and Cannarsa in [4], although it agrees for \(n=1\) with Dafermos’ concept of generalized characteristics for scalar conservation laws in one space variable [22]. For the Cauchy problem (3)–(4), assuming \(H={\mathbb {R}}^n\) and denoting by \(d^+f_t\) the Fréchet superdifferential of \(f_t\) (see (7) for the definition), for any \((t_0,x_0)\in (0,\infty )\times {\mathbb {R}}^n\) there exists a unique locally Lipschitz continuous arc \({\varvec{X}}(t)\) such that

$$\begin{aligned} {\varvec{X}}(t_0)=x_0\quad \text {and}\quad {\dot{{\varvec{X}}}}(t)\in d^+ f_t({\varvec{X}}(t))\quad \text {a.e.} ~ t\in [t_0,\infty ). \end{aligned}$$
(5)

It satisfies \((t,{\varvec{X}}(t))\in \Sigma \) for all \(t\in [t_1,\infty )\) if \(t_1\ge t_0\) and \((t_1,{\varvec{X}}(t_1))\in \Sigma \). The fact that the singular propagation continues without interruption was established by Cannarsa, Mazzola and Sinestrari in [19] (see Theorem 2 below). For a viscosity solution \(S(t,x)\) of a general Hamilton–Jacobi equation \(\partial S/\partial t+{{\mathcal {H}}}(t,x,\nabla S)=0\) in, say, \((0,\infty )\times {\mathbb {R}}^n\), a generalized characteristic refers to a locally Lipschitz continuous curve \({\varvec{X}}(t)\) satisfying

$$\begin{aligned} {\dot{{\varvec{X}}}}(t)\in {\text {co}}\nabla _p {{\mathcal {H}}}(t,{\varvec{X}}(t),d^+_x S(t,{\varvec{X}}(t)))\qquad \text {for a.e.}~ t. \end{aligned}$$

It is well known that the singularities of \(S(t,x)\) propagate along generalized characteristics locally in time t; consult, e.g., the seminal paper [4] or the subsequent articles [17, 43] by Cannarsa and Yu or the monograph [16] on semiconcavity by Cannarsa and Sinestrari. In certain special cases research has revealed that the propagation continues for all later times without stopping, see, e.g., [19] and Albano’s paper [3]. However, the extent to which the propagation is global in time t remains to this day a vital and central research problem for general Hamilton–Jacobi equations [14, 15]. Nevertheless, Albano actually proved in [2] for a broad class of Hamilton–Jacobi equations that \((t,{\varvec{X}}(t))\) remains in the closed hull of the singular set (i.e., in the \(C^1\) singular support) for all later times. This is an important achievement especially in the presence of some degree of smoothness. In certain nonsmooth cases, however, there might be a considerable difference between \(\Sigma \) and \({\overline{\Sigma }}\), inasmuch as \({\text {int}}\Sigma =\emptyset \) whereas \({\overline{\Sigma }}\) may contain interior points. Santilli [39] has in his investigations of distance functions to \(C^1\) or \(C^{1,1}\) hypersurfaces in \({\mathbb {R}}^n\) obtained several striking results on the denseness of the singular set.

While the propagation of singularities along arcs may be viewed as a kind of lower bound on the structure and connectedness of the singular set, rectifiability results constitute upper bounds. The fine estimates obtained by Alberti, Ambrosio and Cannarsa in [6] for general convex functions on \({\mathbb {R}}^N\) apply to \(\Sigma \). Furthermore, significant results on the rectifiability of \(\overline{\Sigma }\) including sharp bounds for the Hausdorff measure of \(\overline{\Sigma }{\setminus }\Sigma \) for smooth initial-value problems for Hamilton–Jacobi equations appear in the article [18] by Cannarsa, Mennucci and Sinestrari. A recent paper by Miura and Tanaka [34] on distance functions of closed sets states that \(\Sigma \) is not only covered by but is equal to an at most countable union of Lipschitz hypersurfaces save an exceptional set of codimension two.

Singular dynamics has also been studied for general semiconcave functions on Banach spaces, notably in the paper [1] by Albano and Cannarsa.

In this paper we take a different path and study the motion of singularities in an infinite-dimensional setting by means of so-called intrinsic characteristics. This class of relevant curves was singled out by Cannarsa and Cheng in [13] and was used for topological studies in [20, 21]. It is not based on a differential inclusion but has a purely variational definition.

Definition 3

(Intrinsic characteristic) For any \(x_0\in H\) and any \(t_0>0\) we define the intrinsic characteristic \({\varvec{x}}(t)\) for \(t\in [t_0,\infty )\) by \({\varvec{x}}(t_0)=x_0\) and

$$\begin{aligned} \{{\varvec{x}}(t)\}=\mathop {\text {arg max}}\limits _{x\in H}\left( f_t(x)-\frac{1}{2(t-t_0)}\Vert x_0-x\Vert ^2 \right) ,\qquad t_0<t<\infty . \end{aligned}$$

Equivalently,

$$\begin{aligned} \frac{{\varvec{x}}(t)-x_0}{t-t_0}\in d^+ f_t({\varvec{x}}(t)),\qquad t_0<t<\infty . \end{aligned}$$
(6)
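Definition 3 can be explored numerically in the simplest nonconvex case. The sketch below assumes \(H={\mathbb {R}}\) and \(f={\mathfrak {I}}_E\) with \(E=\{-1,1\}\), so that \(f_t(x)=(|x|-1)^2/(2t)\) and the only singular point is \(x=0\). It compares a brute-force grid maximization of the functional in Definition 3 with the closed form obtained by solving the inclusion (6) by hand for this particular E. All names, the grid and the test values are illustrative assumptions.

```python
def d2(x):
    """Squared distance to E = {-1, 1}."""
    return (abs(x) - 1.0) ** 2

def intrinsic_char_grid(x0, t0, t, n=40000, lo=-2.0, hi=2.0):
    """Grid argmax of f_t(x) - |x - x0|^2 / (2 (t - t0)), with f_t = d_E^2 / (2t)."""
    best_x, best_val = None, -float("inf")
    for k in range(n + 1):
        x = lo + (hi - lo) * k / n
        val = d2(x) / (2.0 * t) - (x - x0) ** 2 / (2.0 * (t - t0))
        if val > best_val:
            best_x, best_val = x, val
    return best_x

def intrinsic_char_exact(x0, t0, t):
    """Closed form for this E: the arc stays at 0 once |x0| <= (t - t0)/t,
    and otherwise moves linearly in t with velocity (x0 - sign(x0)) / t0."""
    if abs(x0) <= (t - t0) / t:
        return 0.0
    sign = 1.0 if x0 > 0 else -1.0
    return (t * x0 - (t - t0) * sign) / t0

# Start at x0 = 0.4 at time t0 = 1; the arc reaches the singular point 0
# at t = t0 / (1 - x0) and remains there afterwards.
errors = [
    abs(intrinsic_char_grid(0.4, 1.0, t) - intrinsic_char_exact(0.4, 1.0, t))
    for t in (1.2, 1.5, 2.0, 3.0)
]
```

In this example the arc starting at a regular point moves away from its nearest point of E, hits the singular point in finite time and then stays singular.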

In (5) and (6), \(f_t(x)\) serves as a nonsmooth velocity potential, for instantaneous and average velocities, respectively, and the superdifferential \(d^+f_t\) is defined by (7). The results of this paper are not stated in the introduction except for the main propagation theorem (which reappears as Theorem 4 below). This central result asserts that singularities, once they come into existence, propagate along intrinsic characteristics, instantly transforming into strong singularities.

Theorem. For a given lower semicontinuous function \(f:H\rightarrow {\mathbb {R}}\cup \{\infty \}\) meeting (2), let \({\varvec{x}}(t)\) be the intrinsic characteristic emanating from \((t_0,x_0).\)

  1. (i)

    If \(t_1\ge t_0\) and \((t_1,{\varvec{x}}(t_1))\in \Sigma ,\) then \((t,{\varvec{x}}(t))\in \Sigma _{\mathrm s}\) for every \(t\in (t_1,\infty ).\)

  2. (ii)

    If \((t_0,x_0)\in \Sigma \) and \(0\notin d^+f_{t_0}(x_0),\) then \({\dot{{\varvec{x}}}}^+(t_0)\ne 0\) and \((t,{\varvec{x}}(t))\in \Sigma _{\mathrm s}\) for all \(t\in (t_0,\infty ).\)

Outline of the paper. Section 2 contains some background material mainly on relevant concepts of differentiation, including sub- and superdifferentials. The definition of Asplund’s function and some of its basic properties are given in Sect. 3. A major previous result on singular propagation for Moreau envelopes in \({\mathbb {R}}^n\) is briefly recalled in Sect. 4, where some remarks are also made about the problem of extensions to infinite dimensions. In Sects. 5 through 8 we state without proofs our principal results, namely Theorems 3 through 9, on singular dynamics along intrinsic characteristics. Sections 5 and 6 state the basic results for Moreau envelopes and distance functions. The generation of singularities for distance functions and an application to homotopy equivalence are the subject of Sect. 7. Section 8 examines weak limit points of bounded singular arcs \({\varvec{x}}(t)\) as \(t\rightarrow \infty \), showing that every such limit point is singular too. This stability property entails that the singular propagation along intrinsic characteristics has a truly global character.

After that point the paper is devoted to a detailed study of Moreau envelopes and intrinsic characteristics including proofs. Sections 9 and 10 investigate the differentiability properties of \(S(t,x)=f_t(x)\) partly from the perspective of viscosity solution theory. Sections 11–13 define and analyze intrinsic characteristics for Moreau envelopes and distance functions with a focus on singular dynamics. Section 14 adds a further tool for prolonging the singular propagation. Section 15 compiles and completes the postponed proofs of the main results contained in Sects. 5–8. Finally, some examples are gathered in Sect. 16.

2 Prerequisites

Throughout the paper H stands for a real Hilbert space whose scalar product and norm are denoted by \(\langle \cdot ,\cdot \rangle \) and \(\Vert \cdot \Vert \), respectively. The open and closed balls with center at x and of radius R are signified by \({\mathbb {B}}(x,R)\) and \(\overline{{\mathbb {B}}}(x,R)\), respectively, while the sphere is denoted by \({\mathbb {S}}(x,R)\). Let \(g:H\rightarrow (-\infty ,\infty ]\) be a proper function, which means that its essential domain \({\text {dom}}g=\{x\in H:g(x)<\infty \}\) is nonvoid. The Fréchet and Gâteaux differentials at a point \(x\in {\text {dom}}g\) are denoted by \(dg(x)\) and \(\nabla g(x)\), respectively. We recall the definitions. The function g is said to be Fréchet differentiable at \(x\in {\text {dom}}g\) if there exists a vector \(dg(x)\in H\) such that

$$\begin{aligned} g(x+h)-g(x)=\langle dg(x),h \rangle +o(h)\quad \text {as}\, h\rightarrow 0, \end{aligned}$$

where as usual \(o(h)/\Vert h\Vert \rightarrow 0\) as \(h\rightarrow 0\). The function is Gâteaux differentiable at \(x\in {\text {dom}}g\) if the directional derivative

$$\begin{aligned} g'(x,v)=\lim _{\lambda \rightarrow 0}\frac{g(x+\lambda v)-g(x)}{\lambda } \end{aligned}$$

exists in any direction \(v\in H\), and for a certain \(\nabla g(x)\in H\) it holds that \(g'(x,v)=\langle \nabla g(x),v \rangle \) for all \(v\in H\).

The Fréchet superdifferential of g is the multivalued mapping \(d^+ g:H\rightrightarrows H\) defined by

$$\begin{aligned} d^+ g(x)=\left\{ p\in H:\limsup _{\Vert h\Vert \rightarrow 0}\frac{g(x+h)-g(x)-\langle h,p\rangle }{\Vert h\Vert } \le 0 \right\} \end{aligned}$$
(7)

when \(x\in {\text {dom}}g\) while \(d^+ g(x)=\emptyset \) if \(g(x)=\infty \). The Fréchet subdifferential \(d^-g\) is defined by replacing “\(\limsup \)” and “\(\le 0\)” in (7) by “\(\liminf \)” and “\(\ge 0\)”, respectively. The sets \(d^\pm g(x)\) are convex and closed and they are simultaneously nonempty exactly when g is Fréchet differentiable at x, in which case \(d^+ g(x)=d^- g(x)=\{d g(x)\}\). Kruger’s survey paper [31] furnishes an overview of generalized differentiation.
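As a quick illustration of (7), which is not taken from the source, consider \(H={\mathbb {R}}\) and the concave function \(g(x)=-|x|\) with its kink at the origin. For \(p\in {\mathbb {R}}\), a direct computation at \(x=0\) gives

$$\begin{aligned} \limsup _{h\rightarrow 0}\frac{-|h|-hp}{|h|}=|p|-1\quad \text {and}\quad \liminf _{h\rightarrow 0}\frac{-|h|-hp}{|h|}=-|p|-1. \end{aligned}$$

Hence \(d^+g(0)=[-1,1]\) while \(d^-g(0)=\emptyset \), in agreement with the fact that g is not differentiable at 0. Replacing g by \(-g\) interchanges the roles of the two sets.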

The Legendre–Fenchel transform \(g^*:H\rightarrow (-\infty ,\infty ]\) of g is the convex and lower semicontinuous function which assigns to each \(y\in H\) the value

$$\begin{aligned} g^*(y)=\sup _{x\in H}(\langle x,y \rangle -g(x)). \end{aligned}$$

If g is itself convex and lower semicontinuous, then \(d^-g\) agrees with the Fenchel subdifferential \(\partial g\) which is defined by

$$\begin{aligned} \partial g(x)=\{y\in H:\forall z\in H\; g(z)\ge g(x)+\langle z-x,y \rangle \}\quad \text {for every}\, x\in {\text {dom}}g, \end{aligned}$$

while \(\partial g(x)=\emptyset \) otherwise. In this case, \(g^{**}=g\) and \(\partial g^*=(\partial g)^{-1}\), i.e., \(y\in \partial g(x)\) \(\Leftrightarrow \) \(x\in \partial g^*(y)\).

Let g be continuous and locally semiconcave in an open nonempty subset \(\Omega \) of H, i.e., for any point \(x\in \Omega \) let there exist a ball \(B\subseteq \Omega \) centered at x and a constant \(c\ge 0\) such that \(g-c\Vert \cdot \Vert ^2/2\) is concave in B. Then, for any \(x\in \Omega \), \(d^+ g(x)\) is a bounded convex closed nonempty subset of H, which reduces to a singleton if and only if g is Gâteaux differentiable at x; and \(d^-g(x)\) is empty unless g is Fréchet differentiable at x. A reachable gradient \(p\in H\) of g at x, in symbols \(p\in d^\bullet g(x)\), is by definition a weak limit of a sequence of Fréchet gradients \(dg(x_k)\) where \(x_k\rightarrow x\) strongly; \(d^\bullet g(x)\) is a nonvoid subset of \(d^+g(x)\) and \(d^+g(x)={{\overline{co}}}d^\bullet g(x)\). While Fréchet differentiability is a stronger notion than Gâteaux differentiability in general, it is a property of semiconcave functions in finite dimensions that dg(x) exists if and only if \(\nabla g(x)\) exists. The main object of this paper, namely \(S(t,x)=f_t(x)\), is locally semiconcave in \((0,\infty )\times H\), an open subset of the Hilbert space \({\mathbb {R}}\times H\).

We next recall Asplund’s characterization of Fréchet differentiability of \(g^*\) for general proper functions g. We denote by \(\Gamma \) the set of all convex, lower semicontinuous functions \(\gamma :[0,\infty )\rightarrow [0,\infty ]\) such that \(\gamma (0)=0\) and consider the subsets

$$\begin{aligned} \Gamma _U =\{ \gamma \in \Gamma :\gamma (r)>0\,\text {if}\, r>0\}\quad \text {and}\quad \Gamma _L =\{ \gamma \in \Gamma :\gamma (r)/r\rightarrow 0\,\text {as}\, r\rightarrow 0\}. \end{aligned}$$

For \(\gamma \in \Gamma \), define the conjugate by

$$\begin{aligned} \gamma ^*(s)=\sup _{r\ge 0}(rs-\gamma (r)),\qquad 0\le s<\infty . \end{aligned}$$

Then \(\gamma \in \Gamma _U\) if and only if \(\gamma ^*\in \Gamma _L\); see Lemma 1 in [7].

Lemma 1

(Asplund [7]) Let g be a lower semicontinuous proper function on H and consider a pair \((x_0,y_0)\in H\times H\). Then the following conditions are mutually equivalent.

  1. (i)

    \(g^*\) is finite and Fréchet differentiable at \(y_0\) with \(x_0=dg^*(y_0)\).

  2. (ii)

    For some \(\gamma ^*\in \Gamma _L,\)

    $$\begin{aligned} g^*(y)\le g^*(y_0)+\langle x_0,y-y_0\rangle +\gamma ^*(\Vert y-y_0\Vert )\qquad \text {for all}\, y\in H, \end{aligned}$$

    and \(g^*(y_0)\in {\mathbb {R}}.\)

  3. (iii)

    For some \(\gamma \in \Gamma _U,\)

    $$\begin{aligned} g(x)\ge g(x_0)+\langle x-x_0,y_0\rangle +\gamma (\Vert x-x_0\Vert )\qquad \text {for all}\, x\in H, \end{aligned}$$

    and \(g(x_0)\in {\mathbb {R}}.\)

  4. (iv)

    \(g^*\) is finite at \(y_0\) and \({\text {dom}}g^*\) is radial at \(y_0;\) and if

    $$\begin{aligned} \lim _{j\rightarrow \infty }(\langle x_j,y_0\rangle -g(x_j))=g^*(y_0), \end{aligned}$$

    then \(x_j\rightarrow x_0\) in norm.

Any single one of these four conditions implies \(\langle x_0,y_0\rangle =g(x_0)+g^*(y_0)\) and \(g(x_0)=g^{**}(x_0)\).

3 Asplund’s function

The following is a version of Asplund’s function [8].

Lemma 2

The associated function \(A:(0,\infty )\times H\rightarrow {\mathbb {R}}\) defined by

$$\begin{aligned} A(t,x)=\frac{1}{2}\Vert x\Vert ^2-tS(t,x),\qquad (t,x)\in (0,\infty )\times H, \end{aligned}$$

is convex. In terms of the Legendre–Fenchel transform, \(A(t,x)=\left( tf+\frac{1}{2}\Vert \cdot \Vert ^2\right) ^*(x).\)

Proof

Indeed, A can be represented as the pointwise supremum of a family of affine functions of \((t,x)\) in the following way:

$$\begin{aligned} A(t,x)=\sup _{y\in {\text {dom}}f}(\langle x,y\rangle -tf(y)-\Vert y\Vert ^2/2). \end{aligned}$$

\(\square \)
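The argument in the proof of Lemma 2 can be mirrored numerically: a maximum of finitely many affine functions of \((t,x)\) is jointly convex. The sketch below assumes \(H={\mathbb {R}}\), the double-well potential \(f(y)=(y^2-1)^2\) and a finite grid `YS` in place of \({\text {dom}}f\); these choices, the sampling ranges and the names are illustrative assumptions. It checks midpoint convexity of A at random pairs and the identity \(A(t,x)=\frac{1}{2}\Vert x\Vert ^2-tf_t(x)\) on the same grid.

```python
import random

def f(y):
    """Nonconvex double-well potential, an illustrative choice satisfying (2)."""
    return (y * y - 1.0) ** 2

YS = [-3.0 + 0.01 * k for k in range(601)]  # finite grid standing in for dom f

def A(t, x):
    """Maximum over the grid of the affine maps (t, x) -> x y - t f(y) - y^2 / 2."""
    return max(x * y - t * f(y) - 0.5 * y * y for y in YS)

def envelope(t, x):
    """Grid version of the Moreau envelope f_t(x) from (1)."""
    return min(f(y) + (x - y) ** 2 / (2.0 * t) for y in YS)

random.seed(0)
worst_gap = 0.0       # largest violation of midpoint convexity of A
worst_mismatch = 0.0  # largest deviation from A = x^2 / 2 - t f_t(x)
for _ in range(200):
    t1, t2 = random.uniform(0.1, 3.0), random.uniform(0.1, 3.0)
    x1, x2 = random.uniform(-2.0, 2.0), random.uniform(-2.0, 2.0)
    tm, xm = 0.5 * (t1 + t2), 0.5 * (x1 + x2)
    worst_gap = max(worst_gap, A(tm, xm) - 0.5 * (A(t1, x1) + A(t2, x2)))
    worst_mismatch = max(
        worst_mismatch,
        abs(A(t1, x1) - (0.5 * x1 * x1 - t1 * envelope(t1, x1))),
    )
```

Both quantities stay at the level of rounding error, even though f itself is far from convex.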

In the case of the distance function to a set E, A is independent of t, namely, \(A(x)=\frac{1}{2}\Vert x\Vert ^2-\frac{1}{2}d_E^2(x)\). In his paper on Chebyshev sets, Asplund [8] made significant use of the fact that A is a convex continuous function such that \(P_E\subseteq \partial A\) (see Proposition 6). We give a brief review of the relation between \(P_E\) and \(\partial A\). The equality \(P_E(x)=\partial A(x)\) holds if and only if \(x\notin \Sigma \), in which case \(P_E(x)=\{dA(x)\}\). If \(\dim H<\infty \), then \({\text {co}}P_E(x)=\partial A(x)\). However, \({{\overline{co}}}P_E(x)\) and \(\partial A(x)\) do not coincide in general if \(\dim H=\infty \). Example 5 (Example 4, respectively) furnishes a case where \(P_E(x)=\emptyset \) while \(\partial A(x)\) is a singleton (\(P_E(x)\) is a singleton while \(\partial A(x)\) has more than one element, respectively). In fact, Godini [25] has in any infinite-dimensional Hilbert space H uncovered a proximinal set E (i.e., \(P_E(x)\ne \emptyset \) for all \(x\in H\)) such that \({{\overline{co}}}P_E(x)\) is a strict subset of \(\partial A(x)\) for some \(x\in H{\setminus } E\). Klee [30] obtained a few years earlier an example of this nature for nonseparable Hilbert spaces.

Lemma 3

(Berens [10] and Veselý [42]) At every \(x\in H,\)

$$\begin{aligned} \partial A(x)=\bigcap _{R> d_E(x)}{{\overline{co}}}\left( E\cap \overline{{\mathbb {B}}}(x,R) \right) \subseteq \overline{{\mathbb {B}}}(x,d_E(x))\cap {{\overline{co}}}E \end{aligned}$$

and \(P_E(x)=\partial A(x)\cap {\mathbb {S}}(x,d_E(x)).\)

We collect a few elementary observations concerning the extreme case when \(P_E(x)={\mathbb {S}}(x,d_E(x))\) which occurs exactly when \(E\cap \overline{{\mathbb {B}}}(x,d_E(x))={\mathbb {S}}(x,d_E(x))\). A proof can easily be constructed on the basis of Lemma 3.

Lemma 4

The following conditions are equivalent for any \(x\in H:\)

  1. (i)

    \(P_E(x)={\mathbb {S}}(x,d_E(x));\)

  2. (ii)

    \(\partial A(x)=\overline{{\mathbb {B}}}(x,d_E(x));\)

  3. (iii)

    The boundary of \(\partial A(x)\) is equal to \({\mathbb {S}}(x,d_E(x));\)

  4. (iv)

    The boundary of \(\partial A(x)\) is included in \(P_E(x).\)

The following properties of \(A^*\) were elegantly exploited in [7].

Lemma 5

\(A^*=({\mathfrak {I}}_E+\frac{1}{2}\Vert \cdot \Vert ^2)^{**}\) is the supremum of all closed convex functions minorizing \(\frac{1}{2}\Vert \cdot \Vert ^2\) on E. In particular, \(A^*(x)=\infty \) and hence \(\partial A^*(x)=\emptyset \) for any x in the complement of \({{\overline{co}}}E,\) i.e.,

$$\begin{aligned} {\text {dom}}\partial A^*\subseteq {\text {dom}}A^* \subseteq {{\overline{co}}}E. \end{aligned}$$

4 Remarks on extension from finite to infinite dimensions

The time global propagation results for generalized characteristics \({\varvec{X}}(t)\) (defined by (5)) obtained in [3, 19, 41] are directly applicable to Moreau envelopes in \({\mathbb {R}}^n\). Regardless of dimension, as confirmed in Proposition 7 below, singularities of \(S(t,x)=f_t(x)\) can be detected from the following dichotomy:

$$\begin{aligned} \min \{\omega +\Vert v\Vert ^2/2:(\omega ,v)\in d^+ S(t,x)\}{\left\{ \begin{array}{ll} <0&{}\text {if}\, (t,x)\in \Sigma ,\\ =0&{}\text {if}\, (t,x)\not \in \Sigma . \end{array}\right. } \end{aligned}$$

We present a version of the global-in-time propagation result in \({\mathbb {R}}^n\) obtained by Cannarsa, Mazzola and Sinestrari [19]. Albano’s paper [3] examines more general equations successfully.

Theorem 2

Assume that \(f:{\mathbb {R}}^n\rightarrow {\mathbb {R}}\cup \{\infty \}\) is a lower semicontinuous function satisfying (2) and set \(S(t,x)=f_t(x).\) Consider the generalized characteristic \({\varvec{X}}:[t_0,\infty )\rightarrow {\mathbb {R}}^n\) emanating from the point \((t_0,x_0)\in (0,\infty )\times {\mathbb {R}}^n.\) Then

$$\begin{aligned} m(t):= t^2\cdot \min \{\omega +\Vert v\Vert ^2/2:(\omega ,v)\in d^+ S(t,{\varvec{X}}(t))\} \end{aligned}$$

is a right-continuous, nonincreasing and nonpositive function of \(t\in [t_0,\infty )\) such that \(m(t)<0\) exactly if \((t,{\varvec{X}}(t))\in \Sigma \). In particular, if \(t_1\ge t_0\) and \((t_1,{\varvec{X}}(t_1))\in \Sigma ,\) then \((t,{\varvec{X}}(t))\in \Sigma \) for all \(t\in [t_1,\infty ).\)

Proof

We sketch the proof of the monotonicity presented in [41] leaving out most details. The task is to prove that \(m(t)\le m(t_1)\) for any fixed \(t_1\ge t_0\) and any \(t>t_1\). First, \(A(t,x)\) is approximated by \(C^\infty \) convex functions \(A_\varepsilon (t,x)\) by means of integral convolution with a nonnegative mollifier and then \(S_\varepsilon (t,x)\) is defined from \(A_\varepsilon (t,x)=\frac{1}{2}\Vert x\Vert ^2-tS_\varepsilon (t,x)\). Consider the solution \(X_\varepsilon (t)\) to \(dx/dt=\nabla _x S_\varepsilon (t,x)\) satisfying \(X_\varepsilon (t_1)={\varvec{X}}(t_1)\). Then \(X_\varepsilon (t)\rightarrow {\varvec{X}}(t)\) locally uniformly as \(\varepsilon \downarrow 0\). A calculation reveals that the derivative of

$$\begin{aligned} m_\varepsilon (t):=t^2\left( \frac{\partial S_\varepsilon }{\partial t}(t,X_\varepsilon (t))+\frac{1}{2}\Vert \nabla _x S_\varepsilon (t,X_\varepsilon (t))\Vert ^2\right) \end{aligned}$$

satisfies

$$\begin{aligned} {\dot{m}}_\varepsilon (t)=-t\left\langle D^2A_\varepsilon (Y_\varepsilon (t)){\dot{Y}}_\varepsilon (t),{\dot{Y}}_\varepsilon (t)\right\rangle \le 0. \end{aligned}$$

Here, \(D^2A_\varepsilon (t,x)\) denotes the (positive semidefinite) full Hessian matrix of \(A_\varepsilon (t,x)\) and \(Y_\varepsilon (t)=(t,X_\varepsilon (t))\). We find that \(m_\varepsilon (t)\le m_\varepsilon (t_1)\) for every \(t>t_1\). By virtue of the specific mollification lemma of [17] the mollifier can be chosen so as to obtain \(m_\varepsilon (t_1)\rightarrow m(t_1)\) as \(\varepsilon \downarrow 0\) making it possible to conclude that \(m(t)\le m(t_1)\). \(\square \)

As this proof sketch shows, the approximation lemma of [17, 43] is a vital tool for the analysis in \({\mathbb {R}}^n\). It is a difficulty when attempting to derive results in infinite-dimensional spaces that this tailor-made regularization technique is no longer available. For distance functions, the article [5] proves the indefinite propagation of singularities in \({\mathbb {R}}^n\) as well as in manifolds. The proofs revolve around relevant ordinary differential equations interpreted in a clever way. The major hurdle in extending propagation results to H is the lack of compactness of bounded sets in H or the lack of weak lower semicontinuity of relevant functions. The choice of this paper is the intrinsic characteristics approach of Cannarsa and Cheng [13] which works well owing to the semiconcavity of \(f_t\) (as expressed in Lemma 2) yielding well-behaved global concave maximization problems in H. For instance, this is true of the very definition of the arcs (see Definition 3). The analysis relies on weak convergence and the weak lower semicontinuity that convex closed functions enjoy.

5 Principal results for Moreau envelopes

We proceed by presenting a selection of our most central results about intrinsic characteristics introduced in Definition 3. We define \((\omega ^\circ (t,x),v^\circ (t,x))\) as the unique element of \(d^+ S(t,x)\) minimizing \(\omega +\frac{1}{2}\Vert v\Vert ^2\), i.e.,

$$\begin{aligned} \omega ^\circ (t,x)+\frac{1}{2}\Vert v^\circ (t,x)\Vert ^2\le \omega + \frac{1}{2}\Vert v\Vert ^2\quad \text {for all}\, (\omega ,v)\in d^+ S(t,x). \end{aligned}$$

We observe that \(v^\circ (t,x)\in d^+ f_t(x)\).

Theorem 3

Assume that \(f:H\rightarrow {\mathbb {R}}\cup \{\infty \}\) is a lower semicontinuous function satisfying (2). Let \({\varvec{x}}(t)\) signify the intrinsic characteristic emanating from a point \(x_0\in H\) at time \(t_0>0.\) Then the following assertions are fulfilled:

  1. (i)

    \({\varvec{x}}(t)=dF^{t_0\rightarrow t}(x_0)\) when \(t\ge t_0\) where

    $$\begin{aligned} F^{t_0\rightarrow t}=\left( \frac{t-t_0}{t}A(t,\cdot )+\frac{t_0}{2t}\Vert \cdot \Vert ^2\right) ^* \end{aligned}$$

    or, setting \(J^{t_0\rightarrow t}=dF^{t_0\rightarrow t},\) \({\varvec{x}}(t)=J^{t_0\rightarrow t}(x_0)\) where

    $$\begin{aligned} J^{t_0\rightarrow t} =\left( \frac{t-t_0}{t}\partial _x A(t,\cdot )+\frac{t_0}{t}I\right) ^{-1}=\left( I-(t-t_0)d^+f_{t}\right) ^{-1}. \end{aligned}$$

    The convex function \(F^{t_0\rightarrow t}:H\rightarrow {\mathbb {R}}\) is Fréchet differentiable and \(J^{t_0\rightarrow t}=dF^{t_0\rightarrow t}\) is globally Lipschitz continuous of rate \(t/t_0.\)

  2. (ii)

    \(t\mapsto {\varvec{x}}(t)\) is Hölder continuous with exponent 1/2 in \([t_0,T]\) for every \(t_0<T<\infty .\)

  3. (iii)

    The right derivative \({\dot{{\varvec{x}}}}^+(t_0)\) exists and \({\dot{{\varvec{x}}}}^+(t_0)=v^\circ (t_0,x_0);\) in particular, \({\dot{{\varvec{x}}}}^+(t_0)\ne 0\) if \(0\not \in d^+f_{t_0}(x_0).\)

Corollary 1

(Lipschitz and monotone dependence on initial data) Let \(t_0>0\) and consider the intrinsic characteristics \({\varvec{x}}(t)\) and \({\varvec{y}}(t)\) issuing from \((t_0,x_0)\) and \((t_0,y_0),\) respectively. Then

$$\begin{aligned} \frac{1}{t}\Vert {\varvec{x}}(t)-{\varvec{y}}(t)\Vert \le \frac{1}{t_0}\Vert x_0-y_0\Vert \quad \text {and}\quad \left\langle {\varvec{x}}(t)-{\varvec{y}}(t),x_0-y_0 \right\rangle \ge 0 \end{aligned}$$

for all \(t\in [t_0,\infty ).\)
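Both estimates of Corollary 1 can be observed on a one-dimensional example. The sketch below assumes \(H={\mathbb {R}}\) and \(f={\mathfrak {I}}_E\) with \(E=\{-1,1\}\), for which the intrinsic characteristic has a closed form obtained by solving (6) by hand; the function `J`, the sampling ranges and the tolerances are illustrative assumptions.

```python
import random

def J(x0, t0, t):
    """Intrinsic characteristic at time t for E = {-1, 1} in R (closed form,
    obtained by solving inclusion (6) with f_t = d_E^2 / (2t))."""
    if abs(x0) <= (t - t0) / t:
        return 0.0
    sign = 1.0 if x0 > 0 else -1.0
    return (t * x0 - (t - t0) * sign) / t0

random.seed(1)
t0 = 1.0
lipschitz_ok = True  # |x(t) - y(t)| / t <= |x0 - y0| / t0
monotone_ok = True   # <x(t) - y(t), x0 - y0> >= 0
for _ in range(1000):
    x0, y0 = random.uniform(-3.0, 3.0), random.uniform(-3.0, 3.0)
    t = random.uniform(t0, 10.0 * t0)
    xt, yt = J(x0, t0, t), J(y0, t0, t)
    lipschitz_ok &= abs(xt - yt) / t <= abs(x0 - y0) / t0 + 1e-12
    monotone_ok &= (xt - yt) * (x0 - y0) >= -1e-12
```

In this example the Lipschitz bound is attained whenever both arcs are still on the same side of the singular point, so the rate \(t/t_0\) cannot be improved here.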

The following is our principal result on singular dynamics for Moreau envelopes.

Theorem 4

(Propagation of singularities) For a given lower semicontinuous function \(f:H\rightarrow {\mathbb {R}}\cup \{\infty \}\) meeting (2), let \({\varvec{x}}(t)\) be the intrinsic characteristic emanating from \((t_0,x_0).\)

  1. (i)

    If \(t_1\ge t_0\) and \((t_1,{\varvec{x}}(t_1))\in \Sigma ,\) then \((t,{\varvec{x}}(t))\in \Sigma _{\mathrm s}\) for every \(t\in (t_1,\infty ).\)

  2. (ii)

    If \((t_0,x_0)\in \Sigma \) and \(0\notin d^+f_{t_0}(x_0),\) then \({\dot{{\varvec{x}}}}^+(t_0)=v^\circ (t_0,x_0)\ne 0\) and \((t,{\varvec{x}}(t))\in \Sigma _{\mathrm s}\) for all \(t\in (t_0,\infty ).\)

Remark 1

Example 6 manifests that generalized characteristics and intrinsic characteristics are different concepts and that, in general, \(J^{t_1\rightarrow t_2}\circ J^{t_0\rightarrow t_1}\ne J^{t_0\rightarrow t_2}\) when \(t_0<t_1<t_2\).

Remark 2

As an alternative to \({\varvec{x}}(t)=J^{t_0\rightarrow t}(x_0)\) we could construct a singular curve by choosing a sequence \(0<t_0<t_1<\cdots <t_k\rightarrow \infty \), setting \(\varvec{\xi }(t_0)=x_0\) and proceeding recursively by defining

$$\begin{aligned} \varvec{\xi }(t)=J^{t_k\rightarrow t}(\varvec{\xi }(t_k))\quad \text {for}\, t\in (t_k,t_{k+1}], \; k=0,1,2,\ldots . \end{aligned}$$

Cf. Example 6.

6 Basic results for distance functions

We turn to the spreading of singularities of distance functions. We remind the reader that \(\Sigma =\Sigma _{\mathrm s}\) in this case (Theorem 11). For any \(x_0\in H{\setminus } E\) and any \(t_0>0\), the intrinsic characteristic \({\varvec{x}}(t)\) satisfying the initial condition \({\varvec{x}}(t_0)=x_0\) is in this case given by

$$\begin{aligned} \{{\varvec{x}}(t)\}=\mathop {\text {arg max}}\limits _{x\in H}\left( \frac{1}{2t}d_E^2(x)-\frac{1}{2(t-t_0)}\Vert x_0-x\Vert ^2 \right) ,\qquad t_0<t<\infty . \end{aligned}$$
(8)

Equivalently,

$$\begin{aligned} {\varvec{x}}(t)=J^{t_0\rightarrow t}(x_0)\quad \text {where}\quad J^{t_0\rightarrow t}=\left( \frac{t-t_0}{t}\partial A+\frac{t_0}{t}I\right) ^{-1},\qquad t_0\le t<\infty , \end{aligned}$$
(9)

where Asplund’s function is \(A(x)=\frac{1}{2}\Vert x\Vert ^2-\frac{1}{2}d_E^2(x)\) [8]. We remark that the choice of \(t_0>0\) is arbitrary because \(J^{\mu t_0\rightarrow \mu t}=J^{t_0\rightarrow t}\) for any \(\mu >0\).
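As a minimal numerical sketch of (8), assume the toy set \(E=\{-1,1\}\) in \(H={\mathbb {R}}\) (an illustrative choice, not taken from the text), so that \(d_E(x)=\min (|x-1|,|x+1|)\). The hypothetical helper `intrinsic_characteristic` maximizes the objective in (8) by brute force over a uniform grid; the grid bounds and the step are assumptions, and the accuracy is limited by the step.

```python
def dist_E(x, E=(-1.0, 1.0)):
    """Distance from x to the finite toy set E on the real line."""
    return min(abs(x - y) for y in E)


def intrinsic_characteristic(t, t0, x0, lo=-2.0, hi=2.0, n=40001):
    """Approximate x(t) by maximizing the objective in (8) on a uniform grid.

    Assumes t > t0 and that the maximizer lies in [lo, hi].
    """
    if t <= t0:
        raise ValueError("need t > t0")
    best_x, best_val = x0, float("-inf")
    for i in range(n):
        x = lo + (hi - lo) * i / (n - 1)
        val = dist_E(x) ** 2 / (2.0 * t) - (x0 - x) ** 2 / (2.0 * (t - t0))
        if val > best_val:
            best_x, best_val = x, val
    return best_x


if __name__ == "__main__":
    t0, x0 = 1.0, 0.5
    for t in (1.5, 2.0, 3.0):
        print(t, intrinsic_characteristic(t, t0, x0))
```

For \(t_0=1\) and \(x_0=0.5\) the nearest point is \(y_0=1\), so a hand computation suggests \({\varvec{x}}(t)=1-t/2\) until the curve reaches the singular point 0 at \(t=2\), after which it stays there; the grid search should reproduce this up to the grid step.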

Definition 4

A point \(x_0\in H\) is termed critical if \(0\in d^+ d^2_E(x_0).\)

A trivial consequence of (6) is that \({\varvec{x}}(t)=x_0\) for all \(t\in [t_0,\infty )\) if \(x_0\) is a critical point.

Proposition 1

In terms of Asplund’s function \(A(x)=\frac{1}{2}\Vert x\Vert ^2-\frac{1}{2}d_E^2(x),\) \(x_0\) is a critical point if and only if \(x_0\in \partial A(x_0)\) if and only if \(x_0\in \partial A^*(x_0).\)

Proposition 2

Let E be a closed nonempty subset of H. If \(x_0\) is a critical point, then \(x_0\in {\text {dom}}\partial A^*\subseteq {{\overline{co}}}E.\) In particular, there exists no critical point outside of \({{\overline{co}}}E.\)

Proof

Let \(x_0\) be a critical point. Then \(x_0\in \partial A^*(x_0)\) and hence \(x_0\in {\text {dom}}\partial A^*\subseteq {{\overline{co}}}E\) by Lemma 5. The conclusion also follows from Lemma 3 applied to \(x_0\in \partial A(x_0)\). \(\square \)

We let \({\varvec{r}}(x)\) denote the norm minimal element of \(d^+d_E^2(x)/2\). As \(d^+d_E^2/2=d_E d^+d_E\) and \(d_E\) is Lipschitz continuous with constant 1, we have \(\Vert {\varvec{r}}(x)\Vert \le d_E(x)\). If \(x\notin \Sigma \), then \({\varvec{r}}(x)=x-y\) where y is the closest point to x in E, thus, \(\Vert {\varvec{r}}(x)\Vert = d_E(x)\). By contrast, if \(x\in \Sigma \), then \(\Vert {\varvec{r}}(x)\Vert < d_E(x)\) as \(d^+d_E(x)\) is a convex closed subset of \({\overline{{\mathbb {B}}}}(0,1)\) with more than one element. On account of Theorem 3(iii), \({\dot{{\varvec{x}}}}^+(t_0)=v^\circ (t_0,x_0)={\varvec{r}}(x_0)/t_0\).

The crude Hölder estimate of Theorem 3(ii) can be radically improved for distance functions.

Theorem 5

(Lipschitz continuity) Let E be a closed nonempty subset of H. For any \(t_0<s<t<\infty \) consider the initial velocity \(v_0={\varvec{r}}(x_0)/t_0\) and the average velocities

$$\begin{aligned} v_{t_0\rightarrow s}=\frac{{\varvec{x}}(s)-x_0}{s-t_0}\quad \text {and}\quad v_{s\rightarrow t}=\frac{{\varvec{x}}(t)-{\varvec{x}}(s)}{t-s}. \end{aligned}$$

Then

$$\begin{aligned} \Vert v_{s\rightarrow t}\Vert ^2\le \Vert v_{t_0\rightarrow s}\Vert ^2-\Vert v_{s\rightarrow t}- v_{t_0\rightarrow s}\Vert ^2; \end{aligned}$$
(10)

in particular, \(\Vert v_{s\rightarrow t}\Vert <\Vert v_{t_0\rightarrow s}\Vert \) unless \(v_{s\rightarrow t}= v_{t_0\rightarrow s}\). Furthermore,

$$\begin{aligned} \Vert v_{t_0\rightarrow t}\Vert ^2\le \Vert v_0\Vert ^2-\Vert v_{t_0\rightarrow t}- v_0\Vert ^2 \end{aligned}$$
(11)

as well as

$$\begin{aligned} \Vert v_{s\rightarrow t}\Vert ^2\le \Vert v_0\Vert ^2-\Vert v_{t_0\rightarrow s}- v_0\Vert ^2-\Vert v_{s\rightarrow t}- v_{t_0\rightarrow s}\Vert ^2; \end{aligned}$$
(12)

in particular, \(\Vert v_{s\rightarrow t}\Vert <\Vert v_0\Vert \) unless \(v_{s\rightarrow t}= v_{t_0\rightarrow s}=v_0\).
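The inequalities (10)–(12) can be checked numerically on a toy example. The sketch below assumes \(E=\{-1,1\}\subset {\mathbb {R}}\) and a nonsingular initial point with nearest point \(y_0\), so that \(v_0=(x_0-y_0)/t_0\); the curve is approximated by a grid search over the objective in (8). The helper names `argmax_on_grid` and `check_velocity_bounds`, the grid and the tolerance are all illustrative assumptions.

```python
def dist_E(x, E=(-1.0, 1.0)):
    """Distance from x to the finite toy set E on the real line."""
    return min(abs(x - y) for y in E)


def argmax_on_grid(t, t0, x0, lo=-2.0, hi=2.0, n=40001):
    """Grid approximation of the intrinsic characteristic x(t), t > t0."""
    best_x, best_val = x0, float("-inf")
    for i in range(n):
        x = lo + (hi - lo) * i / (n - 1)
        val = dist_E(x) ** 2 / (2.0 * t) - (x0 - x) ** 2 / (2.0 * (t - t0))
        if val > best_val:
            best_x, best_val = x, val
    return best_x


def check_velocity_bounds(t0, x0, y0, s, t, tol=1e-3):
    """Evaluate (10), (11), (12) for t0 < s < t, up to the tolerance tol."""
    xs = argmax_on_grid(s, t0, x0)
    xt = argmax_on_grid(t, t0, x0)
    v0 = (x0 - y0) / t0               # initial velocity r(x0)/t0
    v_t0_s = (xs - x0) / (s - t0)     # average velocity on [t0, s]
    v_s_t = (xt - xs) / (t - s)       # average velocity on [s, t]
    v_t0_t = (xt - x0) / (t - t0)     # average velocity on [t0, t]
    ineq10 = v_s_t ** 2 <= v_t0_s ** 2 - (v_s_t - v_t0_s) ** 2 + tol
    ineq11 = v_t0_t ** 2 <= v0 ** 2 - (v_t0_t - v0) ** 2 + tol
    ineq12 = (v_s_t ** 2
              <= v0 ** 2 - (v_t0_s - v0) ** 2 - (v_s_t - v_t0_s) ** 2 + tol)
    return ineq10, ineq11, ineq12


if __name__ == "__main__":
    print(check_velocity_bounds(t0=1.0, x0=0.5, y0=1.0, s=1.5, t=3.0))
```

With \(t_0=1\), \(x_0=0.5\), \(s=1.5\) and \(t=3\), the curve has already come to rest at the singular point 0 by time t, so the average velocity on \([s,t]\) is strictly smaller in norm than \(v_0\) and all three inequalities hold with room to spare.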

Theorem 6

The intrinsic characteristic emanating from a noncritical point \(x_0\) possesses the following properties.

  1. (i)

    \({\varvec{x}}(t)\ne x_0\) for all \(t\in (t_0,\infty )\) and the initial velocity \({\dot{{\varvec{x}}}}^+(t_0)\) is given by

    $$\begin{aligned} {\dot{{\varvec{x}}}}^+(t_0)=v_0:=v^\circ (t_0,x_0)={\varvec{r}}(x_0)/t_0\ne 0. \end{aligned}$$

    In particular, \(\Vert {\dot{{\varvec{x}}}}^+(t_0)\Vert =\Vert v_0\Vert \in (0,d_E(x_0)/t_0].\)

  2. (ii)

    \(\Vert {\varvec{x}}(t)-{\varvec{x}}(s)\Vert \le \Vert v_0\Vert (t-s)\) for all \(t_0\le s<t<\infty .\)

  3. (iii)

    The distances \(d_E({\varvec{x}}(t))\) and \(\Vert x_0-{\varvec{x}}(t)\Vert \) are nondecreasing functions of \(t\in [t_0,\infty ).\) In fact, if \(t_0\le t_1<t_2,\) then

    $$\begin{aligned} d_E({\varvec{x}}(t_1))<d_E({\varvec{x}}(t_2))\quad \text {and}\quad \Vert x_0-{\varvec{x}}(t_1)\Vert <\Vert x_0-{\varvec{x}}(t_2)\Vert \end{aligned}$$

    unless \({\varvec{x}}(t_1)={\varvec{x}}(t_2).\) Furthermore, the functions \(d_E({\varvec{x}}(t))\) and \(\Vert x_0-{\varvec{x}}(t)\Vert \) are either simultaneously bounded or simultaneously unbounded.

  4. (iv)

    (Propagation of singularities) If \(x_0\in \Sigma ,\) then \(0<\Vert {\dot{{\varvec{x}}}}^+(t_0)\Vert <d_E(x_0)/t_0\) and \({\varvec{x}}(t)\in \Sigma {\setminus }\{x_0\}\) for all \(t\in (t_0,\infty ).\)

Similar results were derived for generalized characteristics of distance functions in Riemannian manifolds in the article [5] by Albano, Cannarsa, Nguyen and Sinestrari.

7 Generation of singularities for distance functions and homotopy equivalence

If \(P_E(x_0)\) is a singleton \(\{y_0\}\), then we can extend \({\varvec{x}}(\cdot )\) to \([0,\infty )\) by setting,

$$\begin{aligned} \text {for}\, t\in [0,t_0),\quad {\varvec{x}}(t)=y_0+tv_0\quad \text {where}\quad v_0=\frac{x_0-y_0}{t_0}. \end{aligned}$$
(13)

Our next theorem asserts that if \({\varvec{x}}(t)\) starts from a nonsingular point \(x_0\in \complement E\), whose nearest point in E is \(y_0\), then \({\varvec{x}}(t)\) will eventually become singular unless E is supported at the boundary point \(y_0\) by the hyperplane whose normal vector is \({\varvec{r}}(x_0)=x_0-y_0\). (Since \(x_0\notin \Sigma \), \(P_E(x_0)=\{dA(x_0)\}=\{y_0\}\).)

Theorem 7

(Generation and propagation of singularities) Let E be a closed nonempty subset of H and suppose that \(x_0\in \complement (\Sigma \cup E).\) Let \({\varvec{x}}(\cdot )\) be extended to \([0,\infty )\) by (13) where \(P_E(x_0)=\{y_0\}\). Then, either

  1. (i)

    \({\varvec{x}}(t)\) remains nonsingular for all \(t\in (0,\infty ),\) \(P_E({\varvec{x}}(t))=\{y_0\}\) and \({\varvec{x}}(t)=y_0+tv_0=x_0+(t-t_0)v_0\) for all \(t\in [0,\infty ),\) or

  2. (ii)

    there exists a \(t^*\in [t_0,\infty )\) such that \({\varvec{x}}(t)\) is nonsingular and \({\varvec{x}}(t)=x_0+(t-t_0)v_0\) for all \(t\in (0,t^*)\) whereas \({\varvec{x}}(t)\) is singular for all \(t\in (t^*,\infty ).\)

Case (i) happens if and only if

$$\begin{aligned} E\subseteq \{x\in H:\langle x-y_0,v_0\rangle \le 0\}. \end{aligned}$$

In case (ii), the cut point \({\varvec{x}}(t^*)\) lies on the ray from \(y_0\) through \(x_0,\) \(y_0\in P_E({\varvec{x}}(t^*))\) while \(y_0\notin P_E({\varvec{x}}(t))\) for every \(t>t^*.\)

Theorem 7 has topological implications. While \(\complement E\) and \(\Sigma \) are clearly not homeomorphic (as \(\complement E\) is an open set while the interior of \(\Sigma \) is empty), it was proved by Lieutier in [33] that they still are homotopy equivalent if \(\complement E\) is a bounded nonvoid subset of \({\mathbb {R}}^n\). Lieutier’s theorem was extended to Riemannian manifolds in [5]. In our Hilbert space setting we shall construct the required homotopy drawing on Theorem 7, assuming a weaker condition than the boundedness of \(\complement E\), namely that \(\rho (\complement E)<\infty \) where

$$\begin{aligned} \rho (\complement E):=\sup _{x\in \complement E}d_E(x)=\sup _{x\in H}d_E(x). \end{aligned}$$
(14)

Equivalently,

$$\begin{aligned} \rho (\complement E)= \sup \{R>0:{\mathbb {B}}(x,R)\subseteq \complement E\;\text {for some}\; x\in \complement E\}. \end{aligned}$$

The first step states that \({\varvec{x}}(t)\) will be trapped in \(\Sigma \) for all times t such that \(t/t_0>\rho (\complement E)/d_E(x_0)\) provided (14) is finite.

Corollary 2

Assume that the complement of E is open, nonempty and such that \(\rho (\complement E)<\infty .\) Let \(x_0\in \complement (\Sigma \cup E).\) Then alternative (ii) in Theorem 7 is in force. Moreover, \({\varvec{x}}(t^*)=y_0+t^*v_0\) and \(t^*/t_0\le \rho (\complement E)/d_E(x_0).\)

Proof

Alternative (i) is ruled out by \(d_E({\varvec{x}}(t))\le \rho (\complement E)<\infty \). We have \({\varvec{x}}(t)=y_0+tv_0\) when \(t\in [t_0,t^*]\) and

$$\begin{aligned} \rho (\complement E)\ge d_E({\varvec{x}}(t^*))=\Vert {\varvec{x}}(t^*)-y_0\Vert =t^*\Vert v_0\Vert =t^*d_E(x_0)/t_0. \end{aligned}$$

It ensues that \(t^*/t_0\le \rho (\complement E)/d_E(x_0)\). \(\square \)

The second step summarizes what we know about the Lipschitz continuity of \(J^{t_0\rightarrow t}(x_0)=J^{1\rightarrow t/t_0}(x_0)\). A direct combination of Theorem 5 and Corollary 1 yields the following conclusion.

Proposition 3

The mapping \(\Phi :[1,\infty )\times H\rightarrow H\) defined by \(\Phi (\tau ,x)=J^{1\rightarrow \tau }(x)\) is Lipschitz continuous:

$$\begin{aligned} \Vert \Phi (\tau _1,x)-\Phi (\tau _0,x)\Vert \le \Vert {\varvec{r}}(x)\Vert |\tau _1-\tau _0| \le d_E(x)|\tau _1-\tau _0| \end{aligned}$$
(15)

as well as

$$\begin{aligned} \Vert \Phi (\tau ,x_1)-\Phi (\tau ,x_0)\Vert \le \tau \Vert x_1-x_0\Vert \end{aligned}$$
(16)

for all \(\tau \) and \(\tau _j\) in \([1,\infty )\) and all x and \(x_j\) in H.

We next generalize Lieutier’s result to the Hilbert space H. Since \(\Sigma \subset \complement E\), the proof amounts to exhibiting a continuous mapping \(F:[0,1]\times \complement E\rightarrow \complement E\) such that \(F(0,\cdot )\) is the identity mapping on \(\complement E\) while \(F(1,\cdot )\) maps \(\complement E\) into \(\Sigma \) and \(F(\theta ,\cdot )\) maps \(\Sigma \) into \(\Sigma \) for every \(\theta \in [0,1]\).

Theorem 8

(Homotopy equivalence) Assume that the complement of E is an open nonempty set such that \(\rho (\complement E)<\infty \). Then \(\complement E\) and \(\Sigma \) are of the same homotopy type because the mapping \(F:[0,1]\times \complement E\rightarrow \complement E\) defined by

$$\begin{aligned} F(\theta ,x)=J^{1\rightarrow 1-\theta +\theta R/d_E(x)}(x)\qquad \text {for all}\, (\theta ,x)\in [0,1]\times \complement E, \end{aligned}$$

for any fixed \(R>\rho (\complement E),\) is continuous and satisfies, on the one hand, \(F(0,x)=x\) and \(F(1,x)\in \Sigma \) for all \(x\in \complement E\) and, on the other, \(F(\theta ,x)\in \Sigma \) whenever \((\theta ,x)\in [0,1]\times \Sigma .\) In fact, the homotopy F is Lipschitz continuous away from the boundary of E.

8 Weak limit points as \(t\rightarrow \infty \) for distance functions

This section investigates the behavior of \({\varvec{x}}(t)\) as \(t\rightarrow \infty \) assuming that the initial point \(x_0\in \complement E\) be noncritical. Being a nondecreasing function, the distance \(\Vert x_0-{\varvec{x}}(t)\Vert \) either approaches \(\infty \) or remains bounded as t tends to \(\infty \). Let us assume the latter alternative, i.e., that \({\varvec{x}}(t)\) stays in some ball \({\overline{{\mathbb {B}}}}(x_0,R)\) for all \(t\ge t_0\). Under these circumstances, \({\varvec{x}}(t)\in \Sigma \) for all sufficiently large t by virtue of Theorem 7. By the weak sequential compactness of closed balls, for any \(t_j\rightarrow \infty \) the bounded sequence \({\varvec{x}}(t_j)\) possesses a weakly convergent subsequence. It turns out that each weak limit point belongs to \(\partial A^*(x_0)\) (see Lemma 5 for basic properties of this set). Moreover, it holds that \(\partial A^*(x_0)\subseteq \Sigma {\setminus }\{x_0\}\). Thus, each weak limit point is singular. By basic convex function theory, \(x_0\in {\text {int}}{\text {dom}}A^*\) \(\Leftrightarrow \) \(A^*\) is finite and continuous at \(x_0\) \(\Leftrightarrow \) \(A^*\) is Lipschitz continuous on some neighborhood of \(x_0\) (see, e.g., [12, Sect. 4.1]). Either of these conditions implies that \(\partial A^*(x_0)\) is bounded and nonempty.

Theorem 9

(Weak limit points) For the distance function \(d_E\) to a closed nonempty set \(E\subset H,\) let \({\varvec{x}}(t)\) be the intrinsic characteristic emanating at time \(t_0>0\) from a noncritical point \(x_0\in \complement E.\) Assume that \(\Vert x_0-{\varvec{x}}(t)\Vert \) stays bounded as t tends to \(\infty \). Then, the set of weak limit points

$$\begin{aligned} W=\{{\bar{x}}\in H:{\bar{x}}\, \text {is the weak limit of}\, {\varvec{x}}(t_j)\, \text {for some sequence}\, t_j\rightarrow \infty \} \end{aligned}$$

is a weakly closed nonempty subset of \(\Sigma {\setminus }\{x_0\}.\) In fact,

$$\begin{aligned} \emptyset \ne W\subseteq \partial A^*(x_0)\subseteq \Sigma {\setminus }\{x_0\}. \end{aligned}$$
(17)

In particular, \(x_0\in {{\overline{co}}}E.\)

Corollary 3

Under the hypotheses of Theorem 9, the following assertions hold true.

  1. (a)

    If \(\partial A^*(x_0)\) is a bounded set, then W is weakly compact.

  2. (b)

    If \(A^*\) is Gâteaux differentiable at \(x_0,\) then \(W=\{\nabla A^*(x_0)\}\) and \({\varvec{x}}(t)\rightarrow \nabla A^*(x_0)\) weakly as \(t\rightarrow \infty \).

  3. (c)

    If \(A^*\) is Fréchet differentiable at \(x_0,\) then \({\varvec{x}}(t)\rightarrow d A^*(x_0)\) strongly as \(t\rightarrow \infty \).

Corollary 4

Every intrinsic characteristic \({\varvec{x}}(t)\) with an initial point \(x_0\not \in {{\overline{co}}}E\) is unbounded, i.e., \(\Vert x_0-{\varvec{x}}(t)\Vert \rightarrow \infty \) and \(d_E({\varvec{x}}(t))\rightarrow \infty \) as \(t\rightarrow \infty .\)

Proof

By contraposition of Theorem 9, we conclude that \(\Vert x_0-{\varvec{x}}(t)\Vert \) is unbounded when \(x_0\not \in {{\overline{co}}}E\). Theorem 6 (or Proposition 15) asserts that \(d_E({\varvec{x}}(t))\) is unbounded too. \(\square \)

Corollary 5

Every singular intrinsic characteristic \({\varvec{x}}(t)\) with an initial point \(x_0\not \in {{\overline{co}}}E\) carries singularities to infinity, i.e., if \(t_1\ge t_0\) and \({\varvec{x}}(t_1)\in \Sigma ,\) then \({\varvec{x}}(t)\in \Sigma \) for all \(t\in [t_1,\infty ),\) \(\Vert x_0-{\varvec{x}}(t)\Vert \rightarrow \infty \) and \(d_E({\varvec{x}}(t))\rightarrow \infty \) as \(t\rightarrow \infty .\)

Proof

See Theorem 4 or Corollary 7 for the singular propagation. \(\square \)

Cf. Examples 6 and 7 at the end of the paper. The following remark addresses the important question of the global character of the singular dynamics.

Remark 3

The stability property presented in Theorem 9 can be used to rule out the possibility that a singular intrinsic characteristic is merely a rescaled or reparameterized purely local singular arc. To explain this, suppose that \(\xi :[0,\infty )\rightarrow H\) is a continuous nonconstant curve such that, for some \({\bar{\tau }}>0\), \(\xi (\tau )\in \Sigma \) when \(\tau \in [0,{\bar{\tau }})\) but \(\xi ({\bar{\tau }})\notin \Sigma \). In this hypothetical situation, \(\xi (\cdot )\) propagates singularities locally but not globally in \(\tau \) as \(\xi (\tau )\) exits \(\Sigma \) at \(\tau ={\bar{\tau }}\). Let \(\phi :[0,\infty )\rightarrow [0,{\bar{\tau }})\) be an increasing homeomorphism (e.g., \(\phi (\sigma )={\bar{\tau }}(1-e^{-\sigma })\) when \(\sigma \in [0,\infty )\)) and set \(\chi =\xi \circ \phi \). Clearly, \(\chi (\sigma )\in \Sigma \) for all \(\sigma \in [0,\infty )\) yet the propagation is not genuinely global as \(\chi (\infty ):=\lim _{\sigma \rightarrow \infty }\chi (\sigma )=\xi ({\bar{\tau }})\notin \Sigma \).

By Theorem 9, if it exists, the strong (or weak) limit \({\varvec{x}}(\infty )\) is a member of \(\Sigma \) for any singular bounded intrinsic characteristic \({\varvec{x}}(\cdot )\). Hence, \({\varvec{x}}(\cdot )\) is not merely a rescaled local singular arc in the above sense. Furthermore, if \({\bar{x}} ={\varvec{x}}(\infty )\) is noncritical, then the singular propagation can be genuinely prolonged by using \({\bar{x}}\) as a new initial point. Indeed, the curve \(\varvec{\xi }(s)=J^{s_0\rightarrow s}({\bar{x}})\), defined for some fixed \(s_0>0\) and all \(s\ge s_0\), is a nonconstant singular arc such that \(d_E(\varvec{\xi }(s))\) is nondecreasing with \(d_E(\varvec{\xi }(s))> d_E({\bar{x}})\ge d_E({\varvec{x}}(t))\) for all \(t> t_0\) and all \(s> s_0\) (see Theorem 6).

Let us take a look at the exceptional case where \({\varvec{x}}(t)\) moves inside a “spherical cavity” (see Theorem 1 and Lemma 4). This is the only way that an intrinsic characteristic can reach an isolated singularity.

Example 1

Assume that \(E\cap \overline{{\mathbb {B}}}(z_0,R)={\mathbb {S}}(z_0,R)\) and select an initial point \(x_0\) in \({{\mathbb {B}}}(z_0,R)\), \(x_0\ne z_0\). Then \({\varvec{x}}(t)\) moves with constant velocity to the center \(z_0\) of the ball, i.e.,

$$\begin{aligned} {\varvec{x}}(t)=x_0+(t-t_0)v_0\quad \text {when}\, t\in [t_0,t^*] \end{aligned}$$

where

$$\begin{aligned} t^*= \frac{t_0R}{R-\Vert z_0-x_0\Vert },\quad t^*-t_0=\frac{t_0\Vert z_0-x_0\Vert }{R-\Vert z_0-x_0\Vert } \quad \text {and}\quad v_0=\frac{z_0-x_0}{t^*-t_0}. \end{aligned}$$

Moreover, \({\varvec{x}}(t)=z_0\) for all \(t\in [t^*,\infty )\) and \(z_0\) is an isolated singularity because \(d_E(x)=R-\Vert z_0-x\Vert \) for all \(x\in {\mathbb {B}}(z_0,R)\).
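The formulas of Example 1 can be evaluated directly. The following sketch implements them for points given as coordinate tuples in \({\mathbb {R}}^n\); the function names `cut_time` and `cavity_characteristic` are hypothetical, and the code merely transcribes the expressions for \(t^*\) and \(v_0\) stated above.

```python
import math


def cut_time(z0, R, x0, t0):
    """Time t* at which x(t) reaches the center z0 (formula of Example 1)."""
    gap = math.dist(z0, x0)  # ||z0 - x0||
    if not 0.0 < gap < R:
        raise ValueError("x0 must lie in the open ball and differ from z0")
    return t0 * R / (R - gap)


def cavity_characteristic(z0, R, x0, t0, t):
    """Position x(t), t >= t0, of the curve of Example 1."""
    if t < t0:
        raise ValueError("need t >= t0")
    t_star = cut_time(z0, R, x0, t0)
    if t >= t_star:
        return tuple(float(z) for z in z0)  # the curve rests at the center
    # Constant velocity v0 = (z0 - x0) / (t* - t0) on [t0, t*].
    v0 = [(z - x) / (t_star - t0) for z, x in zip(z0, x0)]
    return tuple(x + (t - t0) * v for x, v in zip(x0, v0))


if __name__ == "__main__":
    z0, R, x0, t0 = (0.0, 0.0), 2.0, (1.0, 0.0), 1.0
    print(cut_time(z0, R, x0, t0))
    print(cavity_characteristic(z0, R, x0, t0, 1.5))
```

Since \(t^*-t_0=t_0\Vert z_0-x_0\Vert /(R-\Vert z_0-x_0\Vert )\), the speed \(\Vert v_0\Vert \) equals \((R-\Vert z_0-x_0\Vert )/t_0=d_E(x_0)/t_0\), which is consistent with Theorem 6(i) for a nonsingular initial point.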

9 Differentiability of Moreau envelopes

This section is concerned with the differentiability properties of the Moreau–Yosida approximations (1) of a lower semicontinuous function f subject to condition (2). Clearly, at a point x, the differentiability of \(f_t\) is equivalent to that of \(A(t,\cdot )\); see Lemma 2. From now on proofs will be included.

Proposition 4

Assume that \(S(t_0,\cdot )=f_{t_0}(\cdot )\) is Fréchet differentiable at \(x_0.\) Then the infimum (1) for \(t=t_0\) is achieved at a unique point \(y_0,\) i.e., \(P_{t_0f}(x_0)=\{y_0\}\) and

$$\begin{aligned} d_xS(t_0,x_0)= d f_{t_0}(x_0)=\frac{x_0-y_0}{t_0}. \end{aligned}$$

Moreover, at \((t_0,x_0),\) the partial derivative \(\partial S/\partial t\) exists too and the Hamilton–Jacobi equation \(\partial S/\partial t+\frac{1}{2}\Vert d_x S\Vert ^2=0\) is satisfied.

Proof

By Lemma 2, just like \(S(t,x)=f_t(x)\), the convex function \(A:(t,x)\mapsto (t f+\frac{1}{2}\Vert \cdot \Vert ^2)^*(x)\) is locally Lipschitz continuous. Since \(f_{t_0}\) is Fréchet differentiable at \(x_0\), owing to (6), so is \(A(t_0,\cdot )\), say, \(y_0=d_xA(t_0,x_0)\). By Lemma 1, therefore,

$$\begin{aligned} \langle x_0,y_0\rangle =\left( t_0f+\frac{1}{2}\Vert \cdot \Vert ^2\right) ^*(x_0)+t_0f(y_0)+\frac{1}{2}\Vert y_0\Vert ^2 \end{aligned}$$
(18)

which converts to

$$\begin{aligned} t_0f_{t_0}(x_0)=\frac{1}{2}\Vert x_0\Vert ^2-\left( t_0f+\frac{1}{2}\Vert \cdot \Vert ^2\right) ^*(x_0)=t_0f(y_0)+\frac{1}{2}\Vert x_0-y_0\Vert ^2, \end{aligned}$$
(19)

demonstrating that the infimum (1) for \(t=t_0\) is attained at \(y_0\). In order to investigate the derivative of \(f_t(x_0)\) at \(t=t_0\) we consider \(t\ne t_0\) close to \(t_0\) and select \(y_t\in H\) such that

$$\begin{aligned} f(y_t)+\frac{1}{2t}\Vert x_0-y_t\Vert ^2<f_{t}(x_0)+(t-t_0)^2, \end{aligned}$$
(20)

which translates to

$$\begin{aligned} \langle x_0,y_t\rangle -t f(y_t)-\frac{1}{2}\Vert y_t\Vert ^2 > \left( tf+\frac{1}{2}\Vert \cdot \Vert ^2\right) ^*(x_0)-t (t-t_0)^2. \end{aligned}$$
(21)

We claim that \(y_t\rightarrow y_0\) in norm as \(t\rightarrow t_0\). On the basis of (21) we find that

$$\begin{aligned}&\liminf _{t\rightarrow t_0} \left( \langle x_0,y_t\rangle -t_0 f(y_t)-\frac{1}{2}\Vert y_t\Vert ^2\right) =\liminf _{t\rightarrow t_0}\left( \langle x_0,y_t\rangle -t f(y_t)-\frac{1}{2}\Vert y_t\Vert ^2\right) \nonumber \\&\ge \liminf _{t\rightarrow t_0}\left( tf+\frac{1}{2}\Vert \cdot \Vert ^2\right) ^*(x_0)=\left( t_0 f+\frac{1}{2}\Vert \cdot \Vert ^2\right) ^*(x_0). \end{aligned}$$
(22)

By virtue of (18), (22) and the Fréchet differentiability of \((t_0f+\frac{1}{2}\Vert \cdot \Vert ^2)^*\) at \(x_0\), by invoking Lemma 1, we may now infer that \(y_t\rightarrow y_0\) in norm, as claimed.

On account of (19) and (20) it holds that

$$\begin{aligned}&f(y_t)+\frac{1}{2t}\Vert x_0-y_t\Vert ^2-(t-t_0)^2-\left( f(y_t)+ \frac{1}{2t_0}\Vert x_0-y_t\Vert ^2 \right) \\&\le f_t(x_0)-f_{t_0}(x_0)\\&\le f(y_0)+\frac{1}{2t}\Vert x_0-y_0\Vert ^2-\left( f(y_0)+ \frac{1}{2t_0}\Vert x_0-y_0\Vert ^2 \right) \end{aligned}$$

which reduces to

$$\begin{aligned} - \frac{t-t_0}{2tt_0}\Vert x_0-y_t\Vert ^2-(t-t_0)^2 \le f_t(x_0)-f_{t_0}(x_0)\le -\frac{t-t_0}{2tt_0}\Vert x_0-y_0\Vert ^2; \end{aligned}$$

hence,

$$\begin{aligned} \lim _{t\rightarrow t_0}\frac{f_t(x_0)-f_{t_0}(x_0)}{t-t_0}=-\frac{1}{2t_0^2}\Vert x_0-y_0\Vert ^2=-\frac{1}{2}\Vert df_{t_0}(x_0)\Vert ^2. \end{aligned}$$

\(\square \)
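The conclusion of Proposition 4 can be illustrated numerically. The sketch below assumes the convex example \(f=|\cdot |\) on \({\mathbb {R}}\), whose Moreau envelope is the Huber function (a standard fact, not taken from the text), and evaluates the residual \(\partial S/\partial t+\frac{1}{2}\Vert d_xS\Vert ^2\) by central differences. The names `envelope_abs` and `hj_residual` and the step `h` are illustrative assumptions.

```python
def envelope_abs(t, x):
    """Moreau envelope f_t(x) of f = |.| on the real line (Huber function)."""
    if abs(x) <= t:
        return x * x / (2.0 * t)
    return abs(x) - t / 2.0


def hj_residual(S, t, x, h=1e-5):
    """Central-difference estimate of dS/dt + (1/2) (dS/dx)^2 at (t, x)."""
    dS_dt = (S(t + h, x) - S(t - h, x)) / (2.0 * h)
    dS_dx = (S(t, x + h) - S(t, x - h)) / (2.0 * h)
    return dS_dt + 0.5 * dS_dx ** 2


if __name__ == "__main__":
    for (t, x) in [(1.0, 0.3), (1.0, 2.0), (0.5, -1.2)]:
        print(t, x, hj_residual(envelope_abs, t, x))
```

The sample points are chosen away from \(|x|=t\), where the two branches of the Huber function meet, so that the finite-difference stencil does not straddle the branch change; the residuals should vanish up to discretization error.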

Example 2

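Differentiability of \(t\mapsto S(t,x_0)=f_t(x_0)\) at \(t=t_0\) does not imply that of \(x\mapsto S(t_0,x)\) at the point \(x_0\). Indeed, consider \(f=-\Vert \cdot \Vert \), for which \(f_t(x)=-t/2-\Vert x\Vert \), at \(x_0=0\): the function \(t\mapsto f_t(0)=-t/2\) is differentiable, whereas \(x\mapsto f_{t_0}(x)=-t_0/2-\Vert x\Vert \) is not differentiable at 0.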
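The closed form in Example 2 can be compared with a direct evaluation of the infimum (1). The sketch below takes \(H={\mathbb {R}}\) and approximates the infimum by a grid search in y; the helper `moreau_envelope_grid`, the grid bounds and the step are illustrative assumptions, and the grid value can only overestimate the true infimum.

```python
def moreau_envelope_grid(f, t, x, lo=-5.0, hi=5.0, n=10001):
    """Approximate f_t(x) = inf_y ( f(y) + |x - y|^2 / (2 t) ) on a grid."""
    best = float("inf")
    for i in range(n):
        y = lo + (hi - lo) * i / (n - 1)
        best = min(best, f(y) + (x - y) ** 2 / (2.0 * t))
    return best


def neg_abs(y):
    """The function f = -|.| of Example 2 on the real line."""
    return -abs(y)


def closed_form(t, x):
    """The expression -t/2 - |x| stated in Example 2."""
    return -t / 2.0 - abs(x)


if __name__ == "__main__":
    for (t, x) in [(0.5, 0.0), (0.5, 0.3), (1.0, -0.7)]:
        print(t, x, moreau_envelope_grid(neg_abs, t, x), closed_form(t, x))
```

At \(x=0\) the infimum is attained at the two points \(y=\pm t\), which reflects the kink of \(x\mapsto f_t(x)\) at the origin.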

Remark 4

The hypothesis of Fréchet differentiability in Proposition 4 cannot be relaxed to Gâteaux differentiability. Indeed, let f be the indicator function of a closed nonconvex set \(E\subset H\) such that \(d^2_E\) is Gâteaux but not Fréchet differentiable at a point \(x_0\in H{\setminus } E\). Then, by virtue of Theorem 11, \(P_E(x_0)=\emptyset \), \(\Vert \nabla d_E(x_0)\Vert <1\) and, thus,

$$\begin{aligned} \frac{\partial S}{\partial t}(t,x_0)+\frac{1}{2}\Vert \nabla _x S(t,x_0)\Vert ^2=-\frac{d^2_E(x_0)}{2t^2}+\frac{d^2_E(x_0)}{2t^2}\Vert \nabla d_E(x_0)\Vert ^2<0. \end{aligned}$$

A closed subset E of \(H=\ell ^2\) such that \(d_E(0)=1\) and \(\nabla d_E(0)=0\) is furnished by Example 5. See Corollary 6 below for a more general statement.

Proposition 5

Suppose that \(y_0\in P_{t_0f}(x_0)\). Then, at each point of the form \(X(t)=y_0+(t/t_0)(x_0-y_0)\) where \(0<t<t_0,\) \(f_t\) is Fréchet differentiable with

$$\begin{aligned} df_t(X(t))=\frac{X(t)-y_0}{t}=\frac{x_0-y_0}{t_0}. \end{aligned}$$

In particular, \(P_{tf}(X(t))=\{y_0\}\) when \(t\in (0,t_0).\)

Proof

We notice that the supremum for \(A(t_0,x_0)=(t_0f+\frac{1}{2}\Vert \cdot \Vert ^2)^*(x_0)\) is attained at the same point, \(y_0\), i.e.,

$$\begin{aligned} \langle x_0,y \rangle -t_0f(y)-\frac{1}{2}\Vert y\Vert ^2 \le \langle x_0,y_0\rangle -t_0f(y_0)-\frac{1}{2}\Vert y_0\Vert ^2 \end{aligned}$$
(23)

for all \(y\in H\). It suffices to verify that \(x\mapsto A(t,x)\) is differentiable at X(t) with \(d_xA(t,X(t))=y_0\) for fixed \(0<t<t_0\). To this end, we set

$$\begin{aligned} {\mathcal {D}}(y):= tf(y)+\frac{1}{2}\Vert y\Vert ^2-tf(y_0)-\frac{1}{2}\Vert y_0\Vert ^2-\langle X(t),y-y_0\rangle . \end{aligned}$$

We take (23) into account to derive the inequality

$$\begin{aligned}&{\mathcal {D}}(y)=\frac{t}{t_0}(t_0f(y)-t_0f(y_0))+\frac{1}{2}\Vert y\Vert ^2-\frac{1}{2}\Vert y_0\Vert ^2-\langle X(t),y-y_0\rangle \\&\ge \frac{t}{t_0}\left( -\frac{1}{2}\Vert y\Vert ^2+\frac{1}{2}\Vert y_0\Vert ^2+\langle x_0,y-y_0\rangle \right) +\frac{1}{2}\Vert y\Vert ^2-\frac{1}{2}\Vert y_0\Vert ^2-\langle X(t),y-y_0\rangle \\&=\frac{t_0-t}{2t_0}\Vert y-y_0\Vert ^2, \end{aligned}$$

which shows that \({\mathcal {D}}(y)\ge \gamma (\Vert y-y_0\Vert )\) for a quadratic function \(\gamma \in \Gamma _U\). By appealing to Lemma 1 we may conclude that \(f_t\) is Fréchet differentiable at X(t). \(\square \)

We recall that, for every \(x\in H\), \(P_{tf}(x)\) is included in the closed convex nonempty set \(\partial _x A(t,x)\) of subgradients. In particular, \(P_{tf}\) is a cyclically monotone mapping.

Proposition 6

Let \(f:H\rightarrow {\mathbb {R}}\cup \{\infty \}\) be a lower semicontinuous function satisfying (2). Then the following statements are fulfilled.

  1. (i)

    \({{\overline{co}}}P_{tf}(x)\subseteq \partial _x A(t,x)\) for every \(t>0\) and \(x\in H.\)

  2. (ii)

    A point \((t,x)\) is not strongly singular if and only if \(f_t\) is Gâteaux differentiable at x and \(P_{tf}(x)\ne \emptyset \).

  3. (iii)

    \(\Sigma _{\mathrm s}\subseteq \Sigma \) when \(\dim H=\infty \) while \(\Sigma _{\mathrm s}=\Sigma \) when \(\dim H<\infty .\)

  4. (iv)

    If \((t,x)\) is not a strong singularity, then \(P_{tf}(x)\) is a singleton \(\{y\},\) \(\nabla f_t(x)=(x-y)/t\) and \(P_{tf}(x)=\partial _x A(t,x)=\{\nabla _x A(t,x)\}.\)

Proof

(i) Let \(y\in P_{tf}(x)\); then \(f(y)<\infty \) and

$$\begin{aligned}&f_{t}(x+h)-f_{t}(x)\\&\le f(y)+\frac{1}{2t}\Vert x+h-y\Vert ^2-\left( f(y)+ \frac{1}{2t}\Vert x-y\Vert ^2 \right) \\&=\left\langle \frac{x-y}{t},h\right\rangle + \frac{\Vert h\Vert ^2}{2t}. \end{aligned}$$

It ensues that \((x-y)/t\in d^+f_{t}(x)\) which translates to \(y\in \partial _x A(t,x)\). Part (ii) is a trivial consequence of the definition.

(iii)–(iv) The Gâteaux differential \(\nabla f_{t}(x)\), if it exists and if \(y\in P_{tf}(x)\), must be equal to \((x-y)/t\) whence \(P_{tf}(x)=\{y\}=\partial _x A(t,x)\). By Proposition 4, if \((t,x)\not \in \Sigma \) then the infimum defining \(f_t(x)\) is uniquely attained and so \((t,x)\not \in \Sigma _{\mathrm s}\). \(\square \)

10 Viscosity solution

We next confirm that \(S(t,x)=f_t(x)\) is a viscosity solution of (3) based on \(d^\pm S\).

Proposition 7

The following assertions are true for the Moreau envelopes of any lower semicontinuous function \(f:H\rightarrow {\mathbb {R}}\cup \{\infty \}\) subject to (2).

  1. (i)

    \(S(t,x)=f_t(x)\) is a viscosity solution of (3), i.e., for any \((t,x)\in (0,\infty )\times H,\) \(\omega +\frac{1}{2}\Vert v\Vert ^2\le 0\) if \((\omega ,v)\in d^+S(t,x)\) while \(\omega +\frac{1}{2}\Vert v\Vert ^2\ge 0\) if \((\omega ,v)\in d^-S(t,x).\)

  2. (ii)

    A point \((t,x)\in (0,\infty )\times H\) belongs to \(\Sigma \) if and only if \(\omega +\frac{1}{2}\Vert v\Vert ^2<0\) for some \((\omega ,v)\in d^+ S(t,x).\)

Proof

(i) By the convexity of \((\omega ,v)\mapsto \omega +\frac{1}{2}\Vert v\Vert ^2\) and \(d^+ S(t,x)={{\overline{co}}}d^\bullet S(t,x)\), it suffices to demonstrate the subsolution inequality assuming that \((\omega , v)\in d^\bullet S(t,x)\). In this case, there exists a sequence \(\Sigma \not \ni (t_k,x_k)\rightarrow (t,x)\) such that \(dS(t_k,x_k)\) converges weakly to \((\omega ,v)\). Proposition 4 ensures that the Hamilton–Jacobi equation is satisfied at each \((t_k,x_k)\); hence, by the weak lower semicontinuity of the norm, \(\omega +\frac{1}{2}\Vert v\Vert ^2\le 0\) is obtained in the limit as \(k\rightarrow \infty \). Second, if \((\omega ,v)\in d^- S(t,x)\), then S is Fréchet differentiable at \((t,x)\) and the Hamilton–Jacobi equation \(\omega +\frac{1}{2}\Vert v\Vert ^2=0\) is satisfied.

(ii) Fixing the point \((t,x)\), the task amounts to showing the implication (a) \(\Rightarrow \) (b) where

  1. (a)

    \(\omega +\frac{1}{2}\Vert v\Vert ^2=0\) for all \((\omega ,v)\in d^+ S(t,x)\);

  2. (b)

    S is Fréchet differentiable at \((t,x)\).

It is elementary to verify, for any two points \((\omega _k,v_k)\in {\mathbb {R}}\times H\), that the system of equations

$$\begin{aligned} 0= \omega _0+\frac{1}{2}\Vert v_0\Vert ^2 =\omega _1+\frac{1}{2}\Vert v_1\Vert ^2=\frac{\omega _0+\omega _1}{2}+\frac{1}{2}\left\| \frac{v_0+v_1}{2}\right\| ^2 \end{aligned}$$
(24)

implies \((\omega _0,v_0)=(\omega _1,v_1)\). Assume (a) and choose \((\omega _k,v_k)\in d^+ S(t,x)\), \(k=0,1\), arbitrarily. Then, by convexity, also \(\frac{1}{2}(\omega _0+\omega _1,v_0+v_1)\) belongs to \(d^+ S(t,x)\). By (a), (24) is fulfilled forcing \((\omega _0,v_0)=(\omega _1,v_1)\). Thus \(d^+S(t,x)\) is a singleton \(\{(\omega ,v)\}\) which means that S is Gâteaux differentiable at \((t,x)\) with gradient \((\omega ,v)\). Then \(d^\bullet S(t,x)=\{(\omega ,v)\}\). Thus, for any sequence \((t_j,x_j)\not \in \Sigma \) converging strongly to \((t,x)\), we have \((\omega _j,v_j):=dS(t_j,x_j)\rightarrow (\omega ,v)\) weakly. Owing to (a) we have \(\omega +\frac{1}{2}\Vert v\Vert ^2=0\) and using \(\omega _j+\frac{1}{2}\Vert v_j\Vert ^2=0\) we find that \(\Vert v_j\Vert ^2\rightarrow \Vert v\Vert ^2\) as \(j\rightarrow \infty \) since \(\omega _j\rightarrow \omega \). This observation elevates the weak convergence \(v_j\rightarrow v\) to strong convergence. By Šmulian’s theorem on Fréchet differentiability [40, 12, Thm. 4.2.10] (which extends from convex to semiconvex and semiconcave functions) we may conclude that S is Fréchet differentiable at \((t,x)\). \(\square \)
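The elementary verification behind (24) in the preceding proof can be spelled out as follows; this is a short worked computation added for convenience, using only \(\omega _k=-\frac{1}{2}\Vert v_k\Vert ^2\) for \(k=0,1\) (the first two equations of (24)) and the parallelogram law in H.

```latex
\begin{aligned}
\frac{\omega_0+\omega_1}{2}+\frac{1}{2}\left\|\frac{v_0+v_1}{2}\right\|^2
&=-\frac{1}{4}\left(\|v_0\|^2+\|v_1\|^2\right)+\frac{1}{8}\|v_0+v_1\|^2\\
&=-\frac{1}{8}\left(2\|v_0\|^2+2\|v_1\|^2-\|v_0+v_1\|^2\right)\\
&=-\frac{1}{8}\|v_0-v_1\|^2 .
\end{aligned}
```

Hence the third equation of (24) forces \(v_0=v_1\), and then \(\omega _0=-\frac{1}{2}\Vert v_0\Vert ^2=-\frac{1}{2}\Vert v_1\Vert ^2=\omega _1\).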

Theorem 10

Let f be a proper lower semicontinuous function on H such that (2) is fulfilled. Then the following conditions are equivalent for any \((t_0,x_0)\in (0,\infty )\times H.\)

  1. (i)

    \(S(t_0,\cdot )\) is Fréchet differentiable at \(x_0.\)

  2. (ii)

    S is Fréchet differentiable at \((t_0,x_0).\)

  3. (iii)

    S is Gâteaux differentiable at \((t_0,x_0)\) and \(P_{t_0 f}(x_0)\) is a singleton.

  4. (iv)

    S is Gâteaux differentiable at \((t_0,x_0)\) and \(P_{t_0 f}(x_0)\ne \emptyset .\)

  5. (v)

    S is Gâteaux differentiable at \((t_0,x_0)\) and the Gâteaux gradient \((\omega ,v)\) satisfies \(\omega +\frac{1}{2}\Vert v\Vert ^2=0.\)

  6. (vi)

    \(\omega +\frac{1}{2}\Vert v\Vert ^2=0\) for all \((\omega ,v)\in d^+ S(t_0,x_0).\)

Proof

Proposition 4 ensures that (ii) \(\Rightarrow \) (iii) while the implication (iii) \(\Rightarrow \) (iv) is immediate. The implication (iv) \(\Rightarrow \) (v) follows since, at \((t_0,x_0)\), the Gâteaux gradient \((\partial S/\partial t,\nabla S)\) is equal to \((-\Vert x_0-y_0\Vert ^2/(2t_0^2), (x_0-y_0)/t_0)\) if (iv) holds and \(y_0\in P_{t_0 f}(x_0)\) (the proof is similar to that of Proposition 6). Furthermore, (v) \(\Rightarrow \) (vi) holds as \(d^+ S(t_0,x_0)\) is a singleton when S is Gâteaux differentiable at \((t_0,x_0)\). The implication (vi) \(\Rightarrow \) (ii) was a key step in the proof of Proposition 7. As (ii) \(\Rightarrow \) (i) holds trivially, we shall close the circle by demonstrating that (i) implies (vi). Assuming (i), pick any \((\omega ,v)\in d^+ S(t_0,x_0)\). Then \(v\in d^+_x S(t_0,x_0)\) and \(\omega \in d^+_t S(t_0,x_0)\). It follows that \(v=d_xS(t_0,x_0)\) and taking into account Proposition 4 we also find that \(\omega =(\partial S/\partial t)(t_0,x_0)\) and \(\omega +\frac{1}{2}\Vert v\Vert ^2=0\). \(\square \)

It may happen that the equation \(\omega +\frac{1}{2}\Vert v\Vert ^2=0\) fails for every \((\omega ,v)\in d^+ S(t,x).\)

Corollary 6

If S is Gâteaux but not Fréchet differentiable at \((t_0,x_0),\) then \(P_{t_0f}(x_0)=\emptyset \) and

$$\begin{aligned} \frac{\partial S}{\partial t}(t_0,x_0)+\frac{1}{2}\Vert \nabla _x S (t_0,x_0)\Vert ^2<0. \end{aligned}$$

Example 3 shows in particular that \(\Sigma \) and \(\Sigma _{\mathrm s}\) are different in general and that the condition

  • “\(S(t_0,\cdot )\) is Gâteaux differentiable at \(x_0\) and \(P_{t_0 f}(x_0)\) is a singleton”

is not equivalent to any of the conditions presented in Theorem 10.

Theorem 10 is a generalization to Moreau envelopes of Fitzpatrick’s characterization of Fréchet differentiability of distance functions.

Theorem 11

(Fitzpatrick [23]) Let E be a nonvoid closed subset of H. The following conditions are equivalent for any \(x_0\in H{\setminus } E.\)

  1. (i)

    \(d_E\) is Fréchet differentiable at \(x_0.\)

  2. (ii)

    \(d_E\) is Gâteaux differentiable at \(x_0\) and \(P_E(x_0)\) is nonempty.

  3. (iii)

    \(d_E\) is Gâteaux differentiable at \(x_0\) and \(P_E(x_0)\) is a singleton.

  4. (iv)

    \(d_E\) is Gâteaux differentiable at \(x_0\) and \(\Vert \nabla d_E(x_0)\Vert =1.\)

  5. (v)

    The metric projection \(P_E\) is continuous at \(x_0\) in the sense that \(P_E(x_0)\) is a singleton and \(y_j\rightarrow P_E(x_0)\) whenever \(x_j\rightarrow x_0\) and \(y_j\in P_E(x_j).\)

In particular, \(\Sigma =\Sigma _{\mathrm s}.\)

For more on the theme of differentiability of Moreau envelopes, see [26].

11 Basic analysis of intrinsic characteristics for Moreau envelopes

We are now in the position to apply, on a detailed level, the ideas and methods set forth by Cannarsa and Cheng in [13] to the viscosity solution \(S(t,x)=f_t(x)\) of (3)–(4). A formal time reversal motivates the definition

$$\begin{aligned} g^s(x)=\sup _{y\in H}\left( g(y)-\frac{1}{2s}\Vert x-y\Vert ^2\right) . \end{aligned}$$

In [32], Lasry and Lions introduced a regularization and approximation scheme for lower semicontinuous functions f on a Hilbert space by defining \(f_{t,s}=(f_t)^s\) for \(0<s<t\). These doubly indexed approximations of f enjoy remarkable properties. For instance, if (2) is met, \(f_{t,s}\rightarrow f\) pointwise as \(0<s<t\downarrow 0\) and \(f_{t,s}\) is differentiable with a globally Lipschitz continuous differential \(df_{t,s}\). Reconnecting to the approach of [13] we consider, for a fixed initial point \((t_0,x_0)\) and any \(t>t_0\), the maximization problem

$$\begin{aligned} \mathop {\text {arg max}}\limits _{x\in H}\left( f_t(x)-\frac{1}{2(t-t_0)}\Vert x_0-x\Vert ^2 \right) , \end{aligned}$$
(25)

which is well-behaved since the objective function is uniformly concave by virtue of the expansion

$$\begin{aligned} x\mapsto f_t(x)-\frac{1}{2(t-t_0)}\Vert x_0-x\Vert ^2=\text { a concave function }-\left( \frac{1}{t-t_0}-\frac{1}{t} \right) \Vert x\Vert ^2/2. \end{aligned}$$

It ensues that there exists a unique maximizer in (25), which we shall denote by \(x={\varvec{x}}(t)\), i.e.,

$$\begin{aligned} \{{\varvec{x}}(t)\}=\mathop {\text {arg max}}\limits _{x\in H}\left( f_t(x)-\frac{1}{2(t-t_0)}\Vert x_0-x\Vert ^2 \right) ,\qquad t_0<t<\infty . \end{aligned}$$

Clearly, \({\varvec{x}}(t)\) is characterized as the unique solution x of

$$\begin{aligned} \frac{x-x_0}{t-t_0}\in d^+ f_t(x) \end{aligned}$$
(26)

and it is straightforward to show that \({\varvec{x}}(t_0):=\lim _{t\downarrow t_0}{\varvec{x}}(t)=x_0\). The mapping \({\varvec{x}}:[t_0,\infty )\rightarrow H\) is referred to as the intrinsic characteristic emanating from \(x_0\) (thus repeating Definition 3).
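For illustration, and as an assumption made here rather than an example taken from the text, let \(H={\mathbb {R}}\) and \(f(y)=-|y|\), a nonconvex continuous function satisfying (2). Minimizing separately over \(y\ge 0\) and \(y\le 0\) in (1) gives

$$\begin{aligned} f_t(x)=-|x|-\frac{t}{2},\qquad P_{tf}(x)={\left\{ \begin{array}{ll} \{x+t\}, &{} x>0,\\ \{-t,t\}, &{} x=0,\\ \{x-t\}, &{} x<0, \end{array}\right. } \end{aligned}$$

so that \(d^+f_t(x)=\{-x/|x|\}\) when \(x\ne 0\) and \(d^+f_t(0)=[-1,1]\). For \(x_0>0\), the inclusion (26) reads \((x-x_0)/(t-t_0)=-1\) as long as \(x>0\), while \(x=0\) solves it exactly when \(x_0/(t-t_0)\le 1\). Hence

$$\begin{aligned} {\varvec{x}}(t)={\left\{ \begin{array}{ll} x_0-(t-t_0), &{} t_0\le t\le t_0+x_0,\\ 0, &{} t\ge t_0+x_0. \end{array}\right. } \end{aligned}$$

Up to the time \(t_0+x_0\), this is the straight arc \(y_1+tv\) with \(y_1=x_0+t_0\in P_{t_0f}(x_0)\) and \(v=-1\); afterwards the characteristic rests at the point 0, where \(f_t\) fails to be differentiable for every \(t>0\).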

We expect that minimizing straight arcs are intrinsic characteristics.

Proposition 8

Suppose \(y_1\in P_{t_1f}(x_1).\) Consider a point of the form

$$\begin{aligned} x_0=y_1+t_0 v\quad \text {where}\, 0<t_0<t_1\, \text {and}\;v=\frac{x_1-y_1}{t_1}. \end{aligned}$$

Then the intrinsic characteristic \({\varvec{x}}(t)\) with \({\varvec{x}}(t_0)=x_0\) is given by

$$\begin{aligned} {\varvec{x}}(t)=x_0+(t-t_0)v=y_1+tv\quad \text {when}\quad t\in [t_0,t_1]. \end{aligned}$$
(27)

Moreover, \(f_t\) is Fréchet differentiable at \({\varvec{x}}(t)\) and

$$\begin{aligned} {\dot{{\varvec{x}}}}(t)=df_t({\varvec{x}}(t))=v\quad \text {when}\quad t\in [t_0,t_1). \end{aligned}$$

Proof

By Proposition 5, \(f_t\) is Fréchet differentiable along the straight line (27) with \(df_t({\varvec{x}}(t))=v\) for every \(t\in [t_0,t_1)\). The intrinsic characteristic is characterized as the unique solution x of (26) which is \(x=x_0+(t-t_0)v\) when \(t\in (t_0,t_1)\). \(\square \)

If the intrinsic characteristic \((t,{\varvec{x}}(t))\) that starts at \((t_0,x_0)\) passes through a nonsingular point at \((t_1,{\varvec{x}}(t_1))\) then it has to agree with a nonsingular line segment when \(t\in [t_0,t_1]\).

Theorem 12

Consider the intrinsic characteristic \({\varvec{x}}(t)\) issuing from an arbitrary point \((t_0,x_0)\in (0,\infty )\times H.\) Assume that \((t_1,{\varvec{x}}(t_1))\) is not a strong singularity for a certain \(t_1> t_0,\) which means on the one hand that \(P_{t_1f}({\varvec{x}}(t_1))=\{y_1\}\) for some \(y_1\in H\) and, on the other, that the Gâteaux differential \(\nabla f_{t_1}({\varvec{x}}(t_1))\) exists and equals \(v:=({\varvec{x}}(t_1)-y_1)/t_1.\) Then

$$\begin{aligned} {\varvec{x}}(t)=y_1+tv=x_0+(t-t_0)v\quad \text {when}\, t\in [t_0,t_1], \end{aligned}$$

while \(f_t\) is Fréchet differentiable at \({\varvec{x}}(t)\) with \(df_t({\varvec{x}}(t))=v={\dot{{\varvec{x}}}}(t)\) for every \(t\in [t_0,t_1).\) Furthermore, \(P_{tf}({\varvec{x}}(t))=\{y_1\}\) for all \(t\in [t_0,t_1].\)

Proof

Let \(0<t_0< t_1\) and assume that \((t_1,{\varvec{x}}(t_1))\) is not a strong singularity. The point \(x={\varvec{x}}(t_1)\) is characterized by (26), i.e.,

$$\begin{aligned} \nabla f_{t_1}({\varvec{x}}(t_1))=\frac{{\varvec{x}}(t_1)-x_0}{t_1-t_0}. \end{aligned}$$
(28)

As regards \(\nabla f_{t_1}({\varvec{x}}(t_1))\), Proposition 6 ensures that \(P_{t_1f}({\varvec{x}}(t_1))\) is a singleton \(\{y_1\}\) and

$$\begin{aligned} \nabla f_{t_1}({\varvec{x}}(t_1))=\frac{{\varvec{x}}(t_1)-y_1}{t_1}. \end{aligned}$$
(29)

It follows from (28) and (29) that

$$\begin{aligned} \frac{{\varvec{x}}(t_1)-x_0}{t_1-t_0}=\frac{{\varvec{x}}(t_1)-y_1}{t_1} \end{aligned}$$

which expresses that \((t_0,x_0)\) is an interior point of the straight line segment

$$\begin{aligned} \left( t,y_1+t\frac{{\varvec{x}}(t_1)-y_1}{t_1} \right) =(t,y_1+tv),\qquad 0\le t\le t_1, \end{aligned}$$

connecting \((0,y_1)\) with \((t_1,{\varvec{x}}(t_1))\). Thus, on account of Propositions 5 and 8, first,

$$\begin{aligned} {\varvec{x}}(t)=y_1+tv=x_0+(t-t_0)v\quad \text {when}\, t\in [t_0,t_1], \end{aligned}$$

and, secondly, \(f_{t}\) is Fréchet differentiable at \({\varvec{x}}(t)\) with \(df_t({\varvec{x}}(t))=v\) for every \(t\in [t_0,t_1)\). \(\square \)

We may now conclude that singularities propagate forward in time along intrinsic arcs turning immediately into strong singularities.

Corollary 7

(Propagation of singularities) Let \({\varvec{x}}(t)\) be the intrinsic characteristic emanating from \((t_0,x_0).\) If \(t_1\ge t_0\) and \((t_1,{\varvec{x}}(t_1))\in \Sigma ,\) then \((t,{\varvec{x}}(t))\in \Sigma _{\mathrm s}\) for every \(t\in (t_1,\infty ).\)

Proof

The conclusion follows from Theorem 12 by contraposition. \(\square \)

A complementary observation concerning the points of \(\Sigma _{\mathrm s}\) of vacuous proximal mapping reads as follows:

Corollary 8

Suppose that \(P_{t_1f}({\varvec{x}}(t_1))=\emptyset .\) Then there exists some \(\tau \in [t_0,t_1)\) such that \((t,{\varvec{x}}(t))\in \Sigma _{\mathrm s}\) for all \(t\in (\tau ,\infty ).\)

Proof

Devising the proof by contraposition, assume that \((t,{\varvec{x}}(t))\notin \Sigma _{\mathrm s}\) for all \(t\in [t_0,t_1)\). Owing to Corollary 7, the task at hand boils down to inferring that \(P_{t_1f}({\varvec{x}}(t_1))\ne \emptyset \). Let \(t_0<\tau <t_1\). By Theorem 12 applied to the point \((\tau ,{\varvec{x}}(\tau ))\notin \Sigma _{\mathrm s}\), we deduce that \({\varvec{x}}(t)\) agrees with a straight line segment of the form \(y(\tau )+tv(\tau )\) and that \(P_{t f}({\varvec{x}}(t))=\{y(\tau )\}\) when \(t\in [t_0,\tau ]\). We claim that \(y(\tau )\) and \(v(\tau )\) are in fact independent of \(\tau \in (t_0,t_1)\). To this end, pick \(t_0<\tau _0<\tau _1<t_1\) arbitrarily. Then, in particular,

$$\begin{aligned} {\varvec{x}}(t)=y(\tau _0)+tv(\tau _0)=y(\tau _1)+tv(\tau _1)\qquad \text {when}\, t\in [t_0,\tau _0], \end{aligned}$$

forcing \(y(\tau _0)=y(\tau _1)\) and \(v(\tau _0)=v(\tau _1)\), and the claim follows. We conclude that there exist constant vectors \(y\in H\) and \(v\in H\) such that \({\varvec{x}}(t)=y+tv\) and \(P_{t f}({\varvec{x}}(t))=\{y\}\) for all \(t\in [t_0,t_1)\). Taking the limit as \(t\uparrow t_1\) in the inequality

$$\begin{aligned} \forall z\in H\quad f(y)+\frac{1}{2t}\Vert {\varvec{x}}(t)-y\Vert ^2\le f(z)+\frac{1}{2t}\Vert {\varvec{x}}(t)-z\Vert ^2, \end{aligned}$$

which is in force when \(t_0<t<t_1\), using the continuity of \({\varvec{x}}(t)\) proved in Proposition 11, implies \(y\in P_{t_1f}({\varvec{x}}(t_1))\). \(\square \)

We next examine the intersection of intrinsic characteristics noticing that the inverse multivalued mapping \((J^{t_0\rightarrow t})^{-1}:H\rightrightarrows H\), given by

$$\begin{aligned} (J^{t_0\rightarrow t})^{-1}=I-(t-t_0)d^+f_{t}, \end{aligned}$$

maps points to bounded convex nonempty sets.

Proposition 9

The mapping \(J^{t_0\rightarrow t_1}:H\rightarrow H\) is surjective for any \(0<t_0<t_1.\) In other words, through any given point \((t_1,x_1)\in (0,\infty )\times H\) with \(0<t_0<t_1\) there passes at least one intrinsic characteristic issuing at \(t=t_0\) from some initial point \(x_0;\) more precisely,

$$\begin{aligned} J^{t_0\rightarrow t_1}(x_0)=x_1 \end{aligned}$$

if and only if

$$\begin{aligned} x_0\in (J^{t_0\rightarrow t_1})^{-1}(x_1)=x_1-(t_1-t_0)d^+f_{t_1}(x_1). \end{aligned}$$

Furthermore, the intrinsic characteristic starting at \(t=t_0\) passing through \((t_1,x_1)\) is unique exactly if \(f_{t_1}\) is Gâteaux differentiable at \(x_1.\) In this case, it is given by \({\varvec{x}}(t)=x_1+(t-t_1)\nabla f_{t_1}(x_1)\) when \(t\in [t_0,t_1]\) and \((t,{\varvec{x}}(t))\not \in \Sigma \) when \(t\in [t_0,t_1)\) if, in addition, \(P_{t_1f}(x_1)\ne \emptyset .\) If, on the other hand, \(f_{t_1}\) is Gâteaux differentiable at \(x_1\) while \(P_{t_1f}(x_1)=\emptyset ,\) then \((t,{\varvec{x}}(t))\in \Sigma _{\mathrm s}\) when \(t\in (\tau ,\infty )\) for some \(\tau \in (t_0,t_1).\)

Proof

Owing to Theorem 3, the description of the inverse mapping of \(J^{t_0\rightarrow t_1}\) is straightforward. The inverse mapping is single-valued at \(x_1\) if and only if the convex function \(A(t_1,\cdot )\) is Gâteaux differentiable at \(x_1\). In this case, let the unique intrinsic characteristic be denoted by \({\varvec{x}}(t)\), \(t_0\le t\le t_1\). On account of Theorem 12, it is given by \({\varvec{x}}(t)=x_1+(t-t_1)\nabla f_{t_1}(x_1)\) when \(t\in [t_0,t_1]\) if, in addition, \(P_{t_1f}(x_1)\ne \emptyset \). The conclusion about the case when \(P_{t_1f}(x_1)=\emptyset \) comes from Corollary 8. \(\square \)

12 Regularity of intrinsic characteristics

We recall that \((\omega ^\circ (t,x),v^\circ (t,x))\in d^+ S(t,x)\) satisfies

$$\begin{aligned} \omega ^\circ (t,x)+\frac{1}{2}\Vert v^\circ (t,x)\Vert ^2\le \omega + \frac{1}{2}\Vert v\Vert ^2\quad \text {for all}\, (\omega ,v)\in d^+ S(t,x). \end{aligned}$$

In the terminology of [27, 28], \(v^\circ (t,x)\) is the admissible velocity. In \(H={\mathbb {R}}^n\), the following result was obtained for certain Hamilton–Jacobi equations of the form \({{\mathcal {H}}}(x,\nabla S(x))=0\) in the seminal paper [13, Prop. 3.4] on intrinsic characteristics. The initial velocity is the right derivative

$$\begin{aligned} {\dot{{\varvec{x}}}}^+(t_0)= \lim _{t\downarrow t_0}\frac{{\varvec{x}}(t)-x_0}{t-t_0} \end{aligned}$$

at the initial point.

Proposition 10

The right derivative \({\dot{{\varvec{x}}}}^+(t_0)\) exists and \({\dot{{\varvec{x}}}}^+(t_0)=v^\circ (t_0,x_0).\)

Proof

Let \(t_k\downarrow t_0\) and consider

$$\begin{aligned} v_k=\frac{{\varvec{x}}(t_k)-x_0}{t_k-t_0},\qquad k=1,2,\ldots . \end{aligned}$$

The task is to establish that \(v_k\rightarrow v^\circ (t_0,x_0)\) strongly. Choose \(\delta >0\) so large that \((t_k,{\varvec{x}}(t_k))\in \Omega =[t_0,t_0+\delta ]\times \overline{{\mathbb {B}}}(x_0,\delta )\) for all \(k\ge 1\). The function \(S(t,x)=f_t(x)\) is Lipschitz continuous and semiconcave in \(\Omega \). From (26) we see that \(v_k\in d_x^+ S(t_k,{\varvec{x}}(t_k))\) and, thus, \((v_k)\) is a bounded sequence possessing a weakly convergent subsequence, still labelled by \((v_k)\), i.e., \(v_k\rightarrow v_0\) weakly. In fact, the semiconcavity of S ensures that \((\omega _k,v_k)\in d^+ S(t_k,{\varvec{x}}(t_k))\) for some bounded real sequence \((\omega _k)\). By extracting a further subsequence, we may assume that \(\omega _k\rightarrow \omega _0\) as \(k\rightarrow \infty \). On account of the semiconcavity,

$$\begin{aligned}&S(t_0,x_0)\le S(t_k,{\varvec{x}}(t_k))+\omega _k(t_0-t_k)+\langle v_k,x_0-{\varvec{x}}(t_k) \rangle \nonumber \\&+C((t_0-t_k)^2+\Vert x_0-{\varvec{x}}(t_k)\Vert ^2). \end{aligned}$$
(30)

Similarly, for any \((\omega ,v)\in d^+ S(t_0,x_0)\) it holds that

$$\begin{aligned}&S(t_k,{\varvec{x}}(t_k))\le S(t_0,x_0)+\omega (t_k-t_0)+\langle v,{\varvec{x}}(t_k)-x_0\rangle \nonumber \\&+C((t_k-t_0)^2+\Vert {\varvec{x}}(t_k)-x_0\Vert ^2). \end{aligned}$$
(31)

In (30) and (31), C is a constant of semiconcavity of S on \(\Omega \). Inequalities (30) and (31) add to

$$\begin{aligned} 0\le (\omega -\omega _k)(t_k-t_0)+\langle v-v_k,{\varvec{x}}(t_k)-x_0\rangle +2C((t_k-t_0)^2+\Vert {\varvec{x}}(t_k)-x_0\Vert ^2) \end{aligned}$$

or, dividing through by \(t_k-t_0>0\), to

$$\begin{aligned} 0\le \omega -\omega _k+\langle v-v_k,v_k\rangle +2C(1+\Vert v_k\Vert ^2)(t_k-t_0). \end{aligned}$$

Rearranging terms and passing to the limit yields

$$\begin{aligned}&\Vert v_0\Vert ^2\le \liminf _{k\rightarrow \infty }\Vert v_k\Vert ^2 \le \limsup _{k\rightarrow \infty }\Vert v_k\Vert ^2\nonumber \\&\le \limsup _{k\rightarrow \infty }\left( \omega -\omega _k+\langle v,v_k\rangle +2C(1+\Vert v_k\Vert ^2)(t_k-t_0)\right) =\omega -\omega _0+\langle v,v_0\rangle . \end{aligned}$$
(32)

Thus, we have demonstrated that

$$\begin{aligned} \omega -\omega _0+\langle v-v_0,v_0\rangle \ge 0 \quad \text {for all}\, (\omega , v)\in d^+ S(t_0,x_0). \end{aligned}$$
(33)

Owing to the outer semicontinuity of the multivalued mapping \(d^+ S\), the weak limit \((\omega _0,v_0)\) belongs to \(d^+ S(t_0,x_0)\); this conclusion is confirmed in Lemma 6 below. To prove that actually \(v_k\rightarrow v_0\) in norm it is sufficient to notice that each inequality in (32) is an equality when \((\omega , v)=(\omega _0,v_0)\). Indeed, this choice shows that

$$\begin{aligned} \lim _{k\rightarrow \infty }\Vert v_k\Vert ^2= \Vert v_0\Vert ^2 \end{aligned}$$

which together with \(v_k\rightarrow v_0\) weakly implies that \(v_k\rightarrow v_0\) in norm.

Finally, inequality (33) implies that \((\omega _0,v_0)=(\omega ^\circ (t_0,x_0),v^\circ (t_0,x_0))\). Indeed, by expansion of \(\Vert v-v_0\Vert ^2\) the following inequality holds for all \((\omega , v)\in {\mathbb {R}}\times H\) with \(v\ne v_0\):

$$\begin{aligned} \omega +\frac{1}{2}\Vert v\Vert ^2-\left( \omega _0+\frac{1}{2}\Vert v_0\Vert ^2\right) > \omega -\omega _0+\langle v-v_0,v_0\rangle , \end{aligned}$$

which combined with (33) yields \((\omega _0,v_0)=(\omega ^\circ (t_0,x_0),v^\circ (t_0,x_0))\). \(\square \)
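A sanity check of Proposition 10 in a one-dimensional case, chosen here for illustration and not taken from the text: let \(H={\mathbb {R}}\) and \(f(y)=-|y|\), for which (1) yields \(S(t,x)=f_t(x)=-|x|-t/2\). Since S is the minimum of the two affine functions \((t,x)\mapsto \pm x-t/2\), its superdifferential is the convex hull of the active gradients, i.e.,

$$\begin{aligned} d^+S(t_0,0)=\left\{ \left( -\tfrac{1}{2},v\right) :|v|\le 1\right\} \quad \text {and}\quad d^+S(t_0,x_0)=\left\{ \left( -\tfrac{1}{2},-1\right) \right\} \quad \text {when}\, x_0>0. \end{aligned}$$

Minimizing \(\omega +\frac{1}{2}|v|^2\) over these sets gives \(v^\circ (t_0,0)=0\) and \(v^\circ (t_0,x_0)=-1\) for \(x_0>0\). On the other hand, solving (26) directly gives \({\varvec{x}}(t)\equiv 0\) when \(x_0=0\) and \({\varvec{x}}(t)=x_0-(t-t_0)\) for \(t_0\le t\le t_0+x_0\) when \(x_0>0\). The right derivatives at \(t_0\) are 0 and \(-1\), respectively, in agreement with the proposition.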

Lemma 6

Let \({\mathbb {B}}\) be an open ball in a Hilbert space \({\mathbb {H}}\). Let \(F:{{\mathbb {H}}}\rightarrow {\mathbb {R}}\) be such that \(F(X)-C\Vert X\Vert ^2/2\) is concave and continuous in the closed ball \(\overline{{\mathbb {B}}}\) for some constant C. If \(X_j\rightarrow {\bar{X}}\in {{\mathbb {B}}}\) in norm, \(P_j\in d^+F(X_j)\) and \(P_j\rightarrow {\bar{P}}\) weakly, then \({\bar{P}}\in d^+ F({\bar{X}})\).

Proof

We convert to the convex function defined as \(G(X)=C\Vert X\Vert ^2/2-F(X)\) in the closed ball \(\overline{{\mathbb {B}}}\) and as \(G(X)=\infty \) otherwise. We set \(Q_j=CX_j-P_j\) which converges weakly to \({\bar{Q}}=C{\bar{X}}-{\bar{P}}\). Then, \(Q_j\in \partial G(X_j)\) in the sense of convex analysis which means that

$$\begin{aligned} G(X) \ge G(X_j)+ \langle X-X_j,Q_j\rangle \qquad \text {for all}\, X\in {{\mathbb {H}}}. \end{aligned}$$

Sending \(j\rightarrow \infty \) yields

$$\begin{aligned} G(X)\ge G({\bar{X}})+ \langle X-{\bar{X}},{\bar{Q}}\rangle \qquad \text {for all}\, X\in {{\mathbb {H}}}, \end{aligned}$$

which says that \({\bar{Q}}\in \partial G({\bar{X}})\) and, thus, that \({\bar{P}}\in d^+ F({\bar{X}})\). \(\square \)

Remark 5

The generalized characteristic defined by (5) actually satisfies \(\dot{{\varvec{X}}}^+(t)=v^\circ (t,{\varvec{X}}(t))\) for all \(t\in [t_0,\infty )\) owing to its uniqueness [17, Cor. 3.4].

We proceed to the regularity of \({\varvec{x}}(t)\). We base the proof partly on the regularity of \(A(t,x)\). The most advantageous case is when \(f={\mathfrak {I}}_E\) since then A is independent of t.

Proposition 11

The intrinsic characteristic \({\varvec{x}}(t)\) issuing from \((t_0,x_0)\) is locally Hölder continuous with exponent 1/2. It is locally Lipschitz continuous if \(\partial A/\partial t\) exists and is a locally Lipschitz continuous function of \((t,x)\in [t_0,\infty )\times H.\) Moreover, if f is the indicator function of a closed nonempty set \(E\subset H,\) or if \(\partial A/\partial t\) is a constant, then \({\varvec{x}}(t)\) satisfies the Lipschitz estimates of Theorem 5.

Proof

For any \(t_0<s<t\) we set

$$\begin{aligned} p_t={\varvec{x}}(t)-\frac{t({\varvec{x}}(t)-x_0)}{t-t_0}\quad \text {and}\quad p_s={\varvec{x}}(s)-\frac{s({\varvec{x}}(s)-x_0)}{s-t_0} \end{aligned}$$

noting that \(p_t\in \partial _x A(t,{\varvec{x}}(t))\) and \(p_s\in \partial _x A(s,{\varvec{x}}(s))\) on account of (49). Since A is convex there exist real numbers \(\alpha _t,\alpha _s\) such that \((\alpha _t,p_t)\in \partial A(t,{\varvec{x}}(t))\), \((\alpha _s,p_s)\in \partial A(s,{\varvec{x}}(s))\), and hence

$$\begin{aligned} (\alpha _t-\alpha _s)(t-s)+\langle p_t-p_s,{\varvec{x}}(t)-{\varvec{x}}(s)\rangle \ge 0. \end{aligned}$$
(34)

We next expand

$$\begin{aligned} {\mathcal {Q}}:=\Vert {\varvec{x}}(t)-{\varvec{x}}(s)\Vert ^2 \end{aligned}$$
(35)

to

$$\begin{aligned} {\mathcal {Q}}= -\frac{(t-t_0)\langle p_t-p_s,{\varvec{x}}(t)-{\varvec{x}}(s)\rangle }{t_0}+\left\langle \frac{{\varvec{x}}(s)-x_0}{s-t_0},{\varvec{x}}(t)-{\varvec{x}}(s)\right\rangle (t-s). \end{aligned}$$
(36)

Taking into account (34) and

$$\begin{aligned} \frac{{\varvec{x}}(s)-x_0}{s-t_0}\in d^+ f_s({\varvec{x}}(s)) \end{aligned}$$

we can derive upper bounds for (36) as follows

$$\begin{aligned} {\mathcal {Q}}&\le \frac{(t-t_0)(\alpha _t-\alpha _s)(t-s)}{t_0}+ \left\langle \frac{{\varvec{x}}(s)-x_0}{s-t_0},{\varvec{x}}(t)-{\varvec{x}}(s)\right\rangle (t-s) \end{aligned}$$
(37)
$$\begin{aligned}&\le \left( \frac{(t-t_0)(\alpha _t-\alpha _s)}{t_0}+M(s)\Vert {\varvec{x}}(t)-{\varvec{x}}(s)\Vert \right) (t-s) \end{aligned}$$
(38)

where

$$\begin{aligned} M(s)=\sup _{v\in d^+ f_s({\varvec{x}}(s))}\Vert v\Vert . \end{aligned}$$

Using that \(|\alpha _t-\alpha _s|\) and M(s) are bounded when \(t_0\le s<t\le T\) we infer, from (35) and (37)–(38), the Hölder estimate \(\Vert {\varvec{x}}(t)-{\varvec{x}}(s)\Vert \le C_T(t-s)^{1/2}\) for \(t_0\le s<t\le T<\infty \).

Let us now add the hypothesis that \(\partial A/\partial t\) be locally Lipschitz continuous when \(t\ge t_0\). We claim that the average velocity

$$\begin{aligned} v_{s\rightarrow t}=\frac{{\varvec{x}}(t)-{\varvec{x}}(s)}{t-s} \end{aligned}$$

is then bounded when \(t_0\le s<t\le T\). Observing that

$$\begin{aligned} \alpha _t-\alpha _s=\frac{\partial A}{\partial t}(t,{\varvec{x}}(t))-\frac{\partial A}{\partial t}(s,{\varvec{x}}(s))\le C_T((t-s)+\Vert {\varvec{x}}(t)-{\varvec{x}}(s)\Vert ) \end{aligned}$$

when \(t_0\le s<t\le T\) we find, by means of (38), that

$$\begin{aligned} \Vert v_{s\rightarrow t}\Vert ^2\le \frac{C_T(t-t_0)}{t_0}(1+\Vert v_{s\rightarrow t}\Vert )+M_T\Vert v_{s\rightarrow t}\Vert . \end{aligned}$$

We may now infer the speed bound \(\Vert v_{s\rightarrow t}\Vert \le L_T\) when \(t_0\le s<t\le T<\infty \) for some constant \(L_T\) depending on T.

Finally we assume instead that \(\alpha _t\le \alpha _s\), which certainly holds if \(\partial A/\partial t\) is a constant. For instance, if \(f={\mathfrak {I}}_E\), then \(A(t,x)\) is independent of t and \(\alpha _t=0=\alpha _s\). In this special case we can derive the inequality

$$\begin{aligned} \Vert v_{s\rightarrow t}\Vert ^2\le \langle v_{t_0\rightarrow s},v_{s\rightarrow t} \rangle \end{aligned}$$
(39)

for the average velocities

$$\begin{aligned} v_{t_0\rightarrow s}=\frac{{\varvec{x}}(s)-x_0}{s-t_0}\quad \text {and}\quad v_{s\rightarrow t}=\frac{{\varvec{x}}(t)-{\varvec{x}}(s)}{t-s}. \end{aligned}$$

Indeed, combining (35) with (37) yields

$$\begin{aligned} \Vert {\varvec{x}}(t)-{\varvec{x}}(s)\Vert ^2\le \left\langle \frac{{\varvec{x}}(s)-x_0}{s-t_0},{\varvec{x}}(t)-{\varvec{x}}(s)\right\rangle (t-s) \end{aligned}$$

and (39) ensues upon division by \((t-s)^2\). Inequality (39) is equivalent to (10), i.e., to

$$\begin{aligned} \Vert v_{s\rightarrow t}\Vert ^2-\Vert v_{t_0\rightarrow s}\Vert ^2\le -\Vert v_{s\rightarrow t}- v_{t_0\rightarrow s}\Vert ^2, \end{aligned}$$
(40)

which is confirmed by an expansion of the right-hand side of (40); in particular, \(\Vert v_{s\rightarrow t}\Vert <\Vert v_{t_0\rightarrow s}\Vert \) unless \(v_{s\rightarrow t}= v_{t_0\rightarrow s}\). \(\square \)
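Inequality (39) can be verified by hand in a simple case, assumed here purely for illustration: \(H={\mathbb {R}}\), \(E=\{-1,1\}\), \(f={\mathfrak {I}}_E\) and \(0<x_0<1\), so that \(f_t(x)=(1-|x|)^2/(2t)\) for \(|x|\le 1\). Writing \(c=(1-x_0)/t_0\) and solving (26) gives

$$\begin{aligned} {\varvec{x}}(t)={\left\{ \begin{array}{ll} 1-ct, &{} t_0\le t\le 1/c,\\ 0, &{} t\ge 1/c. \end{array}\right. } \end{aligned}$$

For \(t_0<s<1/c\le t\) the average velocities are therefore

$$\begin{aligned} v_{t_0\rightarrow s}=-c\quad \text {and}\quad v_{s\rightarrow t}=-\frac{1-cs}{t-s}, \end{aligned}$$

and (39) amounts to \((1-cs)/(t-s)\le c\), that is, to \(t\ge 1/c\), which holds. When \(t_0<s<t\le 1/c\), both velocities equal \(-c\) and (39) is an equality.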

13 A few propositions about distance functions

We recall that \(\Sigma _{\mathrm s}=\Sigma \) in the case of distance functions by Theorem 11. In this case, \({\varvec{x}}(t)\) is given by (8) or, equivalently, upon setting

$$\begin{aligned} \theta (t)=\frac{t}{t-t_0},\qquad t_0<t<\infty , \end{aligned}$$

\(x={\varvec{x}}(t)\) if and only if

$$\begin{aligned} \theta (t)(x-x_0)\in d^+ d^2_E(x)/2. \end{aligned}$$

The function \(\theta (\cdot )\) is decreasing and its range is \((1,\infty )\). The intrinsic characteristic \({\varvec{x}}(t)\) does not return to the initial point \(x_0\) unless \({\varvec{x}}(t)\equiv x_0\). Recall that a point \(x_0\in H\) is called critical if \(0\in d^+ d^2_E(x_0)\).

Proposition 12

If \(x_0\) is a critical point then \({\varvec{x}}(t)=x_0\) for all \(t\in [t_0,\infty ).\) Otherwise, \({\varvec{x}}(t)\ne x_0\) for all \(t\in (t_0,\infty )\) and \({\dot{{\varvec{x}}}}^+(t_0)=v^\circ (t_0,x_0)={\varvec{r}}(x_0)/t_0\ne 0.\)

Proposition 13

Assume that \(x_0\) is not a critical point, \(t_0<t_1<t_2\) and \({\varvec{x}}(t_1)={\varvec{x}}(t_2).\) Then \({\varvec{x}}(t_j)\in \Sigma \) and \({\varvec{x}}\) is constant on \([t_1,t_2].\)

Proof

Setting \({\bar{x}}={\varvec{x}}(t_1)={\varvec{x}}(t_2)\) we have \({\bar{x}}\ne x_0\) and \(\theta (t_j)({\bar{x}}-x_0)\in d^+d^2_E({\bar{x}})/2\) for \(j=1,2\), implying \({\bar{x}}\in \Sigma \). Let \(t_1<t<t_2\); then also \(\theta (t)({\bar{x}}-x_0)\in d^+d^2_E({\bar{x}})/2\), and hence \({\varvec{x}}(t)={\bar{x}}\), because \(\theta (t_2)<\theta (t)<\theta (t_1)\) and \(d^+d^2_E({\bar{x}})/2\) is a convex set. \(\square \)

If \({\varvec{x}}(t)\) reaches a critical point, then it will come to a stop there.

Proposition 14

Assume that \({\varvec{x}}(t_1)\) is a critical point for some \(t_1\in (t_0,\infty ).\) Then \({\varvec{x}}(t)={\varvec{x}}(t_1)\) for all \(t\in [t_1,\infty ).\)

Proof

The point \({\varvec{x}}(t_1)\) is characterized by

$$\begin{aligned} \theta (t_1)({\varvec{x}}(t_1)-x_0)\in d^+ d^2_E({\varvec{x}}(t_1))/2; \end{aligned}$$

hence

$$\begin{aligned} {\text {co}}\left\{ 0, \theta (t_1)({\varvec{x}}(t_1)-x_0) \right\} \subset d^+ d^2_E({\varvec{x}}(t_1))/2. \end{aligned}$$
(41)

In order to demonstrate that \({\varvec{x}}(t)={\varvec{x}}(t_1)\) for any \(t>t_1\) it suffices to show that

$$\begin{aligned} \theta (t)({\varvec{x}}(t_1)-x_0)\in d^+ d^2_E({\varvec{x}}(t_1))/2. \end{aligned}$$
(42)

To this end we need only observe that the left-hand side of (42) is an element of the left-hand side of (41). Indeed,

$$\begin{aligned} \theta (t)({\varvec{x}}(t_1)-x_0)=(1-\lambda )0+\lambda \theta (t_1)({\varvec{x}}(t_1)-x_0) \end{aligned}$$

for the scalar \(\lambda = \theta (t)/\theta (t_1)\in (0,1)\). \(\square \)
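The stopping mechanism can be made explicit in a one-dimensional sketch, assumed here for illustration only: \(H={\mathbb {R}}\), \(E=\{-1,1\}\) and \(0<x_0<1\). Then \(d_E(x)=1-|x|\) for \(|x|\le 1\),

$$\begin{aligned} d^+d_E^2(x)/2=\{x-1\}\quad \text {when}\, 0<x<1,\qquad d^+d_E^2(0)/2=[-1,1], \end{aligned}$$

and 0 is a critical point. For \(0<x<1\) the inclusion \(\theta (t)(x-x_0)\in d^+d_E^2(x)/2\) has the solution \(x=1-t(1-x_0)/t_0\), which is positive exactly when \(t<t_0/(1-x_0)\). At \(x=0\) the inclusion reads

$$\begin{aligned} -\theta (t)x_0\in [-1,1]\quad \Leftrightarrow \quad \theta (t)\le \frac{1}{x_0}\quad \Leftrightarrow \quad t\ge \frac{t_0}{1-x_0}. \end{aligned}$$

Since \(\theta (\cdot )\) is decreasing, the inclusion persists for all later times once it holds at \(t_1=t_0/(1-x_0)\), so that \({\varvec{x}}(t)=0\) for all \(t\ge t_1\), as Proposition 14 predicts.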

Proposition 15

The distances \(d_E({\varvec{x}}(t))\) and \(\Vert x_0-{\varvec{x}}(t)\Vert \) are nondecreasing functions of \(t\in [t_0,\infty ).\) In fact, if \(t_0\le t_1<t_2,\) then

$$\begin{aligned} d_E({\varvec{x}}(t_1))<d_E({\varvec{x}}(t_2))\quad \text {and}\quad \Vert x_0-{\varvec{x}}(t_1)\Vert <\Vert x_0-{\varvec{x}}(t_2)\Vert \end{aligned}$$

unless \({\varvec{x}}(t_1)={\varvec{x}}(t_2).\) Moreover, the functions \(d_E({\varvec{x}}(t))\) and \(\Vert x_0-{\varvec{x}}(t)\Vert \) are either simultaneously bounded or simultaneously unbounded.

Proof

For arbitrary fixed \(t_0<t_1<t_2\) we set \(F(x)=d^2_E(x)/2\), \(G(x)=\Vert x_0-x\Vert ^2/2\) and, for \(j=1,2\), \(x_j={\varvec{x}}(t_j)\) and \(\theta _j=\theta (t_j)\). Then \(\theta _2<\theta _1\) and

$$\begin{aligned} \{x_j\}=\mathop {\text {arg max}}\limits _{x\in H}(F(x)-\theta _j G(x)). \end{aligned}$$

Suppose that \(x_1\ne x_2\); then

$$\begin{aligned} F(x_1)-\theta _1 G(x_1)>F(x_2)-\theta _1 G(x_2) \end{aligned}$$

as well as

$$\begin{aligned} F(x_2)-\theta _2 G(x_2)>F(x_1)-\theta _2 G(x_1), \end{aligned}$$

which combine to

$$\begin{aligned} \theta _2(G(x_2)-G(x_1))<F(x_2)-F(x_1) <\theta _1(G(x_2)-G(x_1)). \end{aligned}$$
(43)

Since \(\theta _1>\theta _2>0\), comparing the outer members of (43) shows that \(G(x_2)-G(x_1)>0\), so each member of (43) is positive. Thus, \(F(x_2)>F(x_1)\) and \(G(x_2)>G(x_1)\). To prove the boundedness assertion, we start by recalling that by (8), for any fixed \(t>t_0\), \({\varvec{x}}(t)\) is the maximizer of

$$\begin{aligned} \Psi (x)=\frac{1}{2t}d_E^2(x)-\frac{1}{2(t-t_0)}\Vert x_0-x\Vert ^2 . \end{aligned}$$

As \(\Psi ({\varvec{x}}(t))=\sup \Psi \ge \Psi (x_0)\) we must have

$$\begin{aligned} \frac{1}{2t}d_E^2({\varvec{x}}(t))-\frac{1}{2(t-t_0)}\Vert x_0-{\varvec{x}}(t)\Vert ^2\ge \frac{1}{2t}d_E^2(x_0) \end{aligned}$$

implying

$$\begin{aligned} \Vert x_0-{\varvec{x}}(t)\Vert ^2\le \frac{(t-t_0)(d_E^2({\varvec{x}}(t))-d_E^2(x_0))}{t}\le d_E^2({\varvec{x}}(t))-d_E^2(x_0). \end{aligned}$$

In particular, if \(d_E({\varvec{x}}(t))\) is a bounded function of \(t\in [t_0,\infty )\) then so is \(\Vert x_0-{\varvec{x}}(t)\Vert \). Conversely, it is easily verified that the boundedness of \(\Vert x_0-{\varvec{x}}(t)\Vert \) implies that of \(d_E({\varvec{x}}(t))\). \(\square \)
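The unbounded alternative in Proposition 15 already occurs in the simplest possible case, assumed here only for illustration: \(H={\mathbb {R}}\), \(E=\{0\}\) and \(x_0\ne 0\), so that \(d_E(x)=|x|\). For \(t>t_0\) the function \(\Psi (x)=x^2/(2t)-(x-x_0)^2/(2(t-t_0))\) is strictly concave with the unique critical point \({\varvec{x}}(t)=tx_0/t_0\), whence

$$\begin{aligned} d_E({\varvec{x}}(t))=\frac{t|x_0|}{t_0}\quad \text {and}\quad |x_0-{\varvec{x}}(t)|=\frac{(t-t_0)|x_0|}{t_0}. \end{aligned}$$

Both functions are strictly increasing and unbounded, and the estimate obtained in the proof is confirmed by

$$\begin{aligned} |x_0-{\varvec{x}}(t)|^2=\frac{(t-t_0)^2x_0^2}{t_0^2}\le \frac{(t-t_0)^2(t+t_0)x_0^2}{t\,t_0^2}=\frac{(t-t_0)(d_E^2({\varvec{x}}(t))-d_E^2(x_0))}{t}. \end{aligned}$$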

14 A further tool for propagation of singularities

The propagation results of Theorems 4–7 have the strength of being global in time t. Even if \({\varvec{x}}(t)\) eventually becomes constant, the propagation still persists along \((t,{\varvec{x}}(t))\); see Example 8 in the final section for an elementary illustration. In the case of distance functions, we also know that the speed of propagation along \({\varvec{x}}(t)\) does not exceed the initial speed. Still there is a flaw in this picture. We are primarily interested in the propagation along \({\varvec{x}}(t)\) in \(\complement E\), thinking of t as a parameter. It may happen that \({\varvec{x}}(t)\) becomes constant after some time while singularities may continue propagating along some other arc. Suppose, for the sake of argument, that an intrinsic characteristic \({\varvec{x}}(t)\) is nonsingular when \(t<t_1\) but \({\varvec{x}}(t_1)= J^{t_0\rightarrow t_1}(x_0)\) is a critical point, implying that \({\varvec{x}}(t)={\varvec{x}}(t_1)\) for all \(t\in [t_1,\infty )\), as confirmed by Proposition 14. It is clear that the arc \(t\mapsto J^{t_1\rightarrow t}({\varvec{x}}(t_1))\), with \({\varvec{x}}(t_1)\) as the new initial point, is constant as well when \(t\ge t_1\). Still there exists a nonconstant singular Lipschitz arc emanating from \({\varvec{x}}(t_1)\) provided \(P_E({\varvec{x}}(t_1))\ne {\mathbb {S}}({\varvec{x}}(t_1),d_E({\varvec{x}}(t_1)))\) (which rules out the situation of Example 1); see Theorem 1. In our next theorem we use the operator \(J^{t_0\rightarrow t_1}\) again to construct such a singular arc by letting it act on a certain straight line segment rather than the point \(x_0\) alone.

Theorem 13

Let E be a closed nonempty subset of H. Assume that \({\varvec{x}}(t_1)=J^{t_0\rightarrow t_1}(x_0)\in \Sigma \) where \(t_0<t_1\) and \(P_E({\varvec{x}}(t_1))\ne {\mathbb {S}}({\varvec{x}}(t_1),d_E({\varvec{x}}(t_1))).\) Then a singular Lipschitz arc \({\varvec{X}}(s)\in \Sigma ,\) \(0\le s\le s_0,\) satisfying \({\varvec{X}}(0)={\varvec{x}}(t_1)\) and \({\varvec{X}}(s)\ne {\varvec{x}}(t_1)\) for all \(s\in (0,s_0]\) is obtained by defining

$$\begin{aligned} {\varvec{X}}(s)=J^{t_0\rightarrow t_1}\left( \frac{t_0}{t_1}{\varvec{x}}(t_1)+ \frac{t_1-t_0}{t_1}(y_0+su)\right) ,\quad 0\le s\le s_0, \end{aligned}$$

where \(y_0\in {\mathbb {B}}({\varvec{x}}(t_1),d_E({\varvec{x}}(t_1)))\) is a boundary point of \(\partial A({\varvec{x}}(t_1)),\) \(0<s_0<d_E(y_0)/2,\) while u is any unit vector such that \(y_0+su\notin \partial A({\varvec{x}}(t_1))\) for all \(s>0\).

Proof

We modify the proof in [24]. Abbreviating \(J^{t_0\rightarrow t_1}\) to J, Theorem 3 tells us that

$$\begin{aligned} {\varvec{x}}(t_1)=J(x_0)=\left( (1-\lambda )\partial A+\lambda I\right) ^{-1}(x_0) \end{aligned}$$

for \(\lambda =t_0/t_1\). On account of Lemmas 3–4, while included in \(\overline{{\mathbb {B}}}({\varvec{x}}(t_1),d_E({\varvec{x}}(t_1)))\), the boundary of \(\partial A({\varvec{x}}(t_1))\) intersects the open ball \({\mathbb {B}}({\varvec{x}}(t_1),d_E({\varvec{x}}(t_1)))\). We may therefore select a boundary point \(y_0\) of \(\partial A({\varvec{x}}(t_1))\) and a unit vector u satisfying \(\Vert {\varvec{x}}(t_1)-y_0\Vert <d_E({\varvec{x}}(t_1))\) and \(y_s:=y_0+su\notin \partial A({\varvec{x}}(t_1))\) for all \(s>0\). In particular, \(y_0\not \in E\), and choosing \(0<s_0<d_E(y_0)/2\), we find that, for any \(s\in [0,s_0]\),

$$\begin{aligned} |d_E(y_s)-d_E(y_0)|\le \Vert y_s-y_0 \Vert =s\le s_0<d_E(y_0)/2 \end{aligned}$$

forcing

$$\begin{aligned} s<d_E(y_s). \end{aligned}$$
(44)

On the sole basis of the definition of J, for any \(x\in H\) and \(y\in H\) it holds that

$$\begin{aligned} J(\lambda x+(1-\lambda )y)=x\quad \Leftrightarrow \quad y\in \partial A(x) \end{aligned}$$
(45)

as well as

$$\begin{aligned} x-\lambda J(x)\in (1-\lambda )\partial A(J(x)). \end{aligned}$$
(46)

We set

$$\begin{aligned} z_s=\lambda {\varvec{x}}(t_1)+(1-\lambda )y_s,\qquad w_s=\frac{z_s-\lambda J(z_s)}{1-\lambda }, \end{aligned}$$

and examine the Lipschitz continuous curve \({\varvec{X}}(s)=J(z_s)\) for \(s\in [0,s_0]\). The equivalence (45) ensures that \({\varvec{X}}(0)={\varvec{x}}(t_1)\) and \({\varvec{X}}(s)\ne {\varvec{x}}(t_1)\) for every \(s\in (0,s_0]\). Thus \({\varvec{X}}(\cdot )\) does not reduce to a single point. It remains only to prove that \({\varvec{X}}(s)\) is a singular arc. On the one hand, (46) ensures that \(w_s\in \partial A({\varvec{X}}(s))\) and, on the other, taking into account the Lipschitz continuity of J (with constant \(1/\lambda \) by Theorem 3) and (44),

$$\begin{aligned} \Vert y_s-w_s\Vert =\frac{\lambda }{1-\lambda } \Vert J(z_s)-J(z_0)\Vert \le \frac{\lambda }{1-\lambda } \frac{1 }{\lambda } \Vert (1-\lambda )(y_s-y_0)\Vert =s<d_E(y_s) \end{aligned}$$

implying \(w_s\notin E\). Hence, A is not Fréchet differentiable at \({\varvec{X}}(s)\), because if it were then \(w_s=d A({\varvec{X}}(s))=P_E({\varvec{X}}(s))\in E\) by Proposition 6. Thereby, we have demonstrated that \({\varvec{X}}(s)\in \Sigma \) for every \(s\in [0,s_0]\), concluding the proof. \(\square \)

Remark 6

In Example 9 in the final section \(\Sigma \) is a hyperplane consisting of critical points. If the initial point \(x_0\) is nonsingular, then \({\varvec{x}}(t)\) reaches \(\Sigma \) in a finite time and comes to a stop in \(\Sigma \). By contrast, \({\varvec{X}}(s)\) is a nonconstant singular arc.

A version of Theorem 13 for Moreau envelopes concludes this section. It asserts the existence of a nonconstant singular arc in \(\{t_1\}\times H\) starting from a strictly singular point \((t_1,x_1)\). By Proposition 6, \(P_{tf}(x)\subseteq \partial _x A(t,x)\).

Theorem 14

Given a lower semicontinuous function \(f:H\rightarrow {\mathbb {R}}\cup \{\infty \}\) fulfilling (2), let \((t_1,x_1)\) be a strictly singular point such that some point of the boundary of \(\partial _x A(t_1,x_1)\) is not a member of \(P_{t_1f}(x_1).\) Then there exists a Lipschitz curve \({\varvec{X}}(s)\) defined for \(s\in [0,s_0]\) such that \({\varvec{X}}(0)=x_1\) while \({\varvec{X}}(s)\ne x_1\) and \((t_1,{\varvec{X}}(s))\in \Sigma _{\mathrm s}\) for all \(s\in (0,s_0].\)

Proof

Fix \(0<t_0<t_1\), set \(\lambda =t_0/t_1\) and \(J=J^{t_0\rightarrow t_1}\). Then, for any \(x\in H\) and \(y\in H\) it holds that

$$\begin{aligned} J(\lambda x+(1-\lambda )y)=x\quad \Leftrightarrow \quad y\in \partial _x A(t_1,x) \end{aligned}$$
(47)

as well as

$$\begin{aligned} x-\lambda J(x)\in (1-\lambda )\partial _x A(t_1,J(x)). \end{aligned}$$
(48)

Let \(y_0\) be a boundary point of \(\partial _x A(t_1,x_1)\) such that \(y_0\notin P_{t_1f}(x_1)\) and select a unit vector u such that \(y_0+su\notin \partial _x A(t_1,x_1)\) for all \(s>0\). We set

$$\begin{aligned} z_s=\lambda x_1+(1-\lambda )(y_0+su),\qquad w_s=\frac{z_s-\lambda J(z_s)}{1-\lambda }, \end{aligned}$$

and investigate the Lipschitz continuous arc \({\varvec{X}}(s)=J(z_s)\) for \(s\in [0,s_0]\) where the value of \(s_0\) is specified below. The equivalence (47) ensures that \({\varvec{X}}(0)=x_1\) and \({\varvec{X}}(s)\ne x_1\) for every \(s\in (0,s_0]\). To prove that \({\varvec{X}}(s)\) is a strictly singular arc, first, (48) shows that \(w_s\in \partial _x A(t_1,{\varvec{X}}(s))\). Furthermore,

$$\begin{aligned} m(s)=\inf _{y\in H}\left( f(y)+\frac{1}{2t_1}\Vert {\varvec{X}}(s)-y\Vert ^2\right) \end{aligned}$$

is an upper semicontinuous function and, as \(w_0=y_0\notin P_{t_1f}(x_1)\),

$$\begin{aligned} m(0)=\inf _{y\in H}\left( f(y)+\frac{1}{2t_1}\Vert x_1-y\Vert ^2\right) < f(w_0)+\frac{1}{2t_1}\Vert x_1-w_0\Vert ^2=: M. \end{aligned}$$

Hence, \(\limsup _{s\downarrow 0}m(s)\le m(0)<M\) and so, for some \(\varepsilon >0\) and some \(s_0>0\), \(m(s)<M-\varepsilon \) for all \(s\in [0,s_0]\). Moreover, by lower semicontinuity,

$$\begin{aligned} M \le \liminf _{s\downarrow 0}\left( f(w_s)+\frac{1}{2t_1}\Vert {\varvec{X}}(s)-w_s\Vert ^2\right) ; \end{aligned}$$

making \(s_0\) smaller if necessary, it ensues that \(w_s\notin P_{t_1f}({\varvec{X}}(s))\) for all \(s\in [0,s_0]\). Hence \((t_1,{\varvec{X}}(s))\in \Sigma _{\mathrm s}\), otherwise \(w_s=\nabla _x A(t_1,{\varvec{X}}(s))=P_{t_1f}({\varvec{X}}(s))\) by Proposition 6. \(\square \)

15 Proofs of Theorems 3–9 and Corollary 3

We are now ready to present complete proofs of our most central results. We may convert (11) to the uniformly convex minimization problem

$$\begin{aligned} \{{\varvec{x}}(t)\}=\mathop {\text {arg min}}\limits _{x\in H}\left( \frac{1}{t}A(t,x)-\frac{1}{2t}\Vert x\Vert ^2+\frac{1}{2(t-t_0)}\Vert x_0-x\Vert ^2 \right) . \end{aligned}$$
(49)

Proof of Theorem 3

(i) From (49) we infer that \({\varvec{x}}(t)\) is given by

$$\begin{aligned} \{{\varvec{x}}(t)\}= & {} \left( \frac{t-t_0}{t}\partial _x A(t,\cdot )+\frac{t_0}{t}I\right) ^{-1}(x_0)\\= & {} \partial \left( \frac{t-t_0}{t}A(t,\cdot )+\frac{t_0}{2t}\Vert \cdot \Vert ^2\right) ^*(x_0)=\partial F(x_0). \end{aligned}$$

We notice that \(F=G^*\) where \(G-\alpha \Vert \cdot \Vert ^2/2\) is a convex function for \(\alpha =t_0/t\). Hence, by the duality theory for the Legendre–Fenchel transform, F is Fréchet differentiable and \(dF=dG^*\) is Lipschitz continuous with constant \(1/\alpha =t/t_0\). Parts (ii)–(iii) are covered by Propositions 10–11. \(\square \)
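
As a reading aid (this step is not spelled out above), the inclusion behind the first displayed formula can be read off from the first-order optimality condition for (49). The sketch assumes that \(A(t,\cdot )\) is convex, as the convexity of \(G-\alpha \Vert \cdot \Vert ^2/2\) suggests, and that Fermat's rule may be applied with the differentiable quadratic terms handled separately:

$$\begin{aligned} 0&\in \frac{1}{t}\partial _x A(t,{\varvec{x}}(t))-\frac{1}{t}{\varvec{x}}(t)+\frac{1}{t-t_0}({\varvec{x}}(t)-x_0)\\ \Longleftrightarrow \quad x_0&\in \frac{t-t_0}{t}\partial _x A(t,{\varvec{x}}(t))+\frac{t_0}{t}{\varvec{x}}(t). \end{aligned}$$

The second line follows on multiplying by \(t-t_0\) and using \(1-(t-t_0)/t=t_0/t\).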

Proof of Theorem 4

Statement (i) is identical to Corollary 7. To demonstrate (ii) we need only recall that \({\dot{{\varvec{x}}}}^+(t_0)=v^\circ (t_0,x_0)\in d^+f_{t_0}(x_0)\) by Proposition 10. \(\square \)

Proof of Theorem 5

Proposition 11 covers this case. Estimate (10) implies (11) in the limit as \(s\downarrow t_0\) while (12) is obtained when (11) is substituted into (10). \(\square \)

Proof of Theorem 6

Parts (i) and (ii) are covered by Propositions 10–11. For (iii) see Proposition 15. For part (iv) see Corollary 7. \(\square \)

Proof of Theorem 7

See Proposition 8 and Theorem 12. Case (i) is equivalent to

$$\begin{aligned} \Vert x-(y_0+tv_0)\Vert ^2>t^2\Vert v_0\Vert ^2\quad \text {for all}\, t>0, x\in E, x\ne y_0, \end{aligned}$$

which is fulfilled if and only if

$$\begin{aligned} \langle x-y_0,v_0\rangle <\frac{1}{2t}\Vert x-y_0\Vert ^2 \quad \text {for all}\, t>0, x\in E, x\ne y_0, \end{aligned}$$

or, equivalently, exactly when \(\langle x-y_0,v_0\rangle \le 0\) for all \(x\in E\). Assuming alternative (ii), Proposition 8 implies that \(P_E({\varvec{x}}(t))\) cannot contain \(y_0\) for any \(t>t^*\). \(\square \)
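
The passage between the two displayed inequalities in the proof of Theorem 7 amounts to expanding a square. The following worked step is included only as a reading aid:

$$\begin{aligned} \Vert x-(y_0+tv_0)\Vert ^2-t^2\Vert v_0\Vert ^2=\Vert x-y_0\Vert ^2-2t\langle x-y_0,v_0\rangle . \end{aligned}$$

The right-hand side is positive exactly when \(\langle x-y_0,v_0\rangle <(2t)^{-1}\Vert x-y_0\Vert ^2\). Letting \(t\rightarrow \infty \) gives \(\langle x-y_0,v_0\rangle \le 0\); conversely, \(\langle x-y_0,v_0\rangle \le 0\) makes the strict inequality hold for every \(t>0\) because \(x\ne y_0\).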

Proof of Theorem 8

As \(R/d_E(x)>1\) for all \(x\in \complement E\) (owing to \(R>\rho (\complement E)\)) it follows from Proposition 3 that the composite mapping

$$\begin{aligned} F(\theta ,x)=\Phi (1-\theta +\theta R/d_E(x),x)\quad \text {where}\quad \Phi (\tau ,x)=J^{1\rightarrow \tau }(x) \end{aligned}$$

is well-defined on \([0,1]\times \complement E\) and continuous. In addition, Proposition 15 ensures that \(F(\theta ,x) \in \complement E\) for all \((\theta ,x)\in [0,1]\times \complement E\). Obviously, \(F(0,\cdot )\) is the identity mapping while \(F(\theta ,\cdot )\) maps \(\Sigma \) into \(\Sigma \) for every \(\theta \in [0,1]\) by Corollary 7. As regards \(F(1,\cdot )\), \(F(1,x)=J^{1\rightarrow R/d_E(x)}(x)\in \Sigma \) for any \(x\in \complement E\) owing to Corollary 2 as \(t^*/t_0<R/d_E(x)\) when \(x\notin \Sigma \).

To derive Lipschitz estimates for F, we invoke Proposition 3. First, for any \(\theta _j\in [0,1]\) and any \(x\in \complement E\), by virtue of (15),

$$\begin{aligned}&\Vert F(\theta _1,x)-F(\theta _0,x)\Vert \\&\le d_E(x)\left| (1-\theta _1)+\frac{\theta _1R}{d_E(x)} -\left( (1-\theta _0)+\frac{\theta _0R}{d_E(x)}\right) \right| =(R-d_E(x))|\theta _1-\theta _0|. \end{aligned}$$

Secondly, for arbitrary \(\theta \in [0,1]\) and \(x_j\in \complement E\), owing to (15)–(16),

$$\begin{aligned}&\Vert F(\theta ,x_1)-F(\theta ,x_0)\Vert \le \Vert \Phi (1-\theta +\theta R/d_E(x_1),x_1)- \Phi (1-\theta +\theta R/d_E(x_1),x_0)\Vert \\&+\Vert \Phi (1-\theta +\theta R/d_E(x_1),x_0)-\Phi (1-\theta +\theta R/d_E(x_0),x_0)\Vert \\&\le \left( 1-\theta +\frac{\theta R}{d_E(x_1)} \right) \Vert x_1-x_0\Vert +d_E(x_0)\left| \frac{\theta R}{d_E(x_1)} -\frac{\theta R}{d_E(x_0)} \right| \\&=\left( 1-\theta +\frac{\theta R}{d_E(x_1)} \right) \Vert x_1-x_0\Vert +\theta R\frac{|d_E(x_0)-d_E(x_1)|}{d_E(x_1)}\\&\le \left( 1-\theta +\frac{2\theta R}{d_E(x_1)} \right) \Vert x_1-x_0\Vert \le \frac{2 R}{d_E(x_1)} \Vert x_1-x_0\Vert . \end{aligned}$$

By the symmetry of the left-hand side in \(x_0\) and \(x_1\), the denominator \(d_E(x_1)\) in the final bound may be replaced by \(d_E(x_0)\) or by \(\max (d_E(x_1),d_E(x_0))\). \(\square \)

Proof of Theorem 9

By (9),

$$\begin{aligned} x_0-\frac{t_0}{t}{\varvec{x}}(t)\in \frac{t-t_0}{t}\partial A({\varvec{x}}(t)) \end{aligned}$$

which yields

$$\begin{aligned} y_t:=\frac{tx_0-t_0{\varvec{x}}(t)}{t-t_0}\in \partial A({\varvec{x}}(t)). \end{aligned}$$

We note that \(y_t\rightarrow x_0\) strongly as \(t\rightarrow \infty \) since \({\varvec{x}}(\cdot )\) is bounded by hypothesis. Consider a weakly convergent sequence \({\varvec{x}}(t_j)\rightarrow {\bar{x}}\) where \(t_j\rightarrow \infty \) as \(j\rightarrow \infty \). Passing to the limit along the sequence \(t=t_j\) in the subgradient inequality

$$\begin{aligned} A(x)\ge A({\varvec{x}}(t))+ \langle x-{\varvec{x}}(t),y_t\rangle \quad \text {for all}\, x\in H, \end{aligned}$$

yields, owing to the weak lower semicontinuity of the convex function A,

$$\begin{aligned} A(x)\ge A({\bar{x}}) +\langle x-{\bar{x}},x_0\rangle \quad \text {for all}\, x\in H, \end{aligned}$$

telling us that \(x_0\in \partial A({\bar{x}})\). From convex analysis we know that \(x_0\in \partial A({\bar{x}})\) is equivalent to \({\bar{x}}\in \partial A^*(x_0)\). In particular, \(x_0\in {{\overline{co}}}E\) because \({\text {dom}}\partial A^*\subseteq {{\overline{co}}}E\) by Lemma 5. We claim that \(\partial A^*(x_0)\subseteq \Sigma {\setminus }\{x_0\}\). Let \(y_0\in \partial A^*(x_0)\). Then \(y_0\) must be a singular point. Indeed, if A were Fréchet differentiable at \(y_0\), then \(x_0=dA(y_0)\) and \(P_E(y_0)=\{x_0\}\) (see Proposition 6) violating the assumption that \(x_0\in \complement E\). We also note that \(y_0\ne x_0\), otherwise \(x_0\) would be a critical point by Proposition 1, establishing \(\partial A^*(x_0)\subseteq \Sigma {\setminus }\{x_0\}\) and (17). Being the set of weak limit points, W is weakly closed. \(\square \)
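
The strong convergence \(y_t\rightarrow x_0\) used in the proof of Theorem 9 can be made explicit. In the sketch below, \(K:=\sup _{t>t_0}\Vert {\varvec{x}}(t)\Vert \) is a symbol introduced here for illustration; it is finite by the boundedness hypothesis:

$$\begin{aligned} y_t-x_0=\frac{tx_0-t_0{\varvec{x}}(t)-(t-t_0)x_0}{t-t_0}=\frac{t_0\,(x_0-{\varvec{x}}(t))}{t-t_0},\qquad \Vert y_t-x_0\Vert \le \frac{t_0\,(\Vert x_0\Vert +K)}{t-t_0}\rightarrow 0\quad \text {as}\, t\rightarrow \infty . \end{aligned}$$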

Proof of Corollary 3

(a) W and \(\partial A^*(x_0)\) are weakly compact because \(\partial A^*(x_0)\) is a closed convex and bounded set in this case. To demonstrate part (b) it suffices to note that \(\partial A^*(x_0)\) consists of \(\nabla A^*(x_0)\) alone when \(\nabla A^*(x_0)\) exists. As regards (c), returning to the proof of Theorem 9 above, \({\varvec{x}}(t_j)\in \partial A^*(y_{t_j})\) where \(y_{t_j}\rightarrow x_0\) strongly and \({\varvec{x}}(t_j)\rightarrow {\bar{x}}\) weakly. By Šmulian’s theorem [12, Thm. 4.2.10], if \(A^*\) is Fréchet differentiable at \(x_0\), then \({\varvec{x}}(t_j)\rightarrow {\bar{x}}= dA^*(x_0)\) strongly. \(\square \)

16 Examples

We return to the logical relations between the following conditions when \(\dim H=\infty \):

  1. (i)

    \(f_t\) is Fréchet differentiable at x;

  2. (ii)

    \(f_t\) is Gâteaux differentiable at x;

  3. (iii)

    \(P_{tf}(x)\) is a singleton.

While it is correct that (ii) \(\Leftarrow \) (i) \(\Rightarrow \) (iii), no other implication is valid between any other pair of these conditions. To justify this assertion we give three examples, two of which are imported from [23]. In Example 3, for a certain t, (ii) and (iii) are in force for all \(x\in H\) yet (i) fails for some \(x\in H\). In Example 4, only condition (iii) is satisfied. Example 5 displays a situation where only (ii) is fulfilled.

Example 3

Assuming \(\dim H=\infty \), we select a Gâteaux differentiable continuous convex function \(g:H\rightarrow {\mathbb {R}}\) which is not everywhere Fréchet differentiable and which satisfies \(c\Vert \cdot \Vert ^2/2\le g(\cdot )\le \Vert \cdot \Vert ^2/2\) for some constant \(0<c<1\). We may actually choose g as the square of a certain equivalent norm on H; see the paper [11] by Borwein and Fabian on this topic. Then, \(\Vert \cdot \Vert ^2/2\le g^* \le \Vert \cdot \Vert ^2/(2c)\), \(g^*\) is strictly convex, and the supremum for the bi-conjugate

$$\begin{aligned} g^{**}(x)=\sup _{y\in H}\left( \langle x,y\rangle - g^*(y)\right) \end{aligned}$$
(50)

is uniquely attained for each \(x\in H\). Define, for a fixed \(t_0>0\),

$$\begin{aligned} f(x)=\left( g^*(x)-\Vert x\Vert ^2/2 \right) \!\! /t_0,\qquad x\in H. \end{aligned}$$

Then, f is real-valued, \(f\ge 0\) and \(f_{t_0}\) is everywhere Gâteaux differentiable yet fails to be Fréchet differentiable at some point of H. To see this, we need only observe that

$$\begin{aligned} t_{0}f_{t_0}(x)=\frac{1}{2}\Vert x\Vert ^2-(t_0f+\Vert \cdot \Vert ^2/2)^*(x)=\frac{1}{2}\Vert x\Vert ^2-g^{**}(x)=\frac{1}{2}\Vert x\Vert ^2-g(x) \end{aligned}$$

since \(g^{**}=g\). For each \(x\in H\), the infimum for \(f_{t_0}(x)\) is uniquely attained because it boils down to the supremum (50). To summarize,

$$\begin{aligned} \Sigma _{\mathrm s}\bigcap \left( \{t_0\}\times H\right) =\emptyset \quad \text {whereas}\quad \Sigma \bigcap \left( \{t_0\}\times H\right) \ne \emptyset . \end{aligned}$$

By Theorem 10, \(S(t,x)=f_t(x)\) fails to be jointly Gâteaux differentiable at some point of \(\{t_0\}\times H\).
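
The identity for \(t_0f_{t_0}\) in Example 3 rests on a short computation, sketched here from the definition of f and of the Moreau envelope (1). For \(x,y\in H\),

$$\begin{aligned} t_0f(y)+\frac{1}{2}\Vert x-y\Vert ^2&=g^*(y)-\frac{1}{2}\Vert y\Vert ^2+\frac{1}{2}\Vert x\Vert ^2-\langle x,y\rangle +\frac{1}{2}\Vert y\Vert ^2\\&=\frac{1}{2}\Vert x\Vert ^2-\left( \langle x,y\rangle -g^*(y)\right) . \end{aligned}$$

Taking the infimum over \(y\in H\) gives \(t_0f_{t_0}(x)=\Vert x\Vert ^2/2-g^{**}(x)\), and a point y attains this infimum exactly when it attains the supremum (50).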

Remark 7

By Theorem 11, (i) \(\Leftrightarrow \) (ii) \(\wedge \) (iii) in the case \(f={\mathfrak {I}}_E\). This equivalence fails in general as Example 3 shows.

Example 4

(Fitzpatrick [23]) In the Hilbert space \(H=\ell ^2\) let

$$\begin{aligned} E=\{e_1\}\cup \{ r_ne_n:n=2,3,\ldots \} \end{aligned}$$

where \(1<r_n \downarrow 1\) as \(n\rightarrow \infty \). Then \(P_E(0)=\{e_1\}\), a singleton, yet \(d^2_E\) is not Gâteaux differentiable at 0. Indeed,

$$\begin{aligned} d^2_E(\lambda e_1)-d^2_E(0)=\min \left\{ (\lambda -1)^2,\inf _{n\ge 2}(\lambda ^2+r_n^2)\right\} -1=\min \{\lambda ^2-2\lambda ,\lambda ^2\} \end{aligned}$$

and we conclude that

$$\begin{aligned} \lim _{\lambda \rightarrow 0}\frac{d^2_E(\lambda e_1)-d^2_E(0)}{\lambda }\quad \text {does not exist.} \end{aligned}$$
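
To see concretely why the limit fails to exist, one may compare the two one-sided difference quotients. This is a worked check based on the formula for \(d^2_E(\lambda e_1)-d^2_E(0)\) displayed above: the minimum equals \(\lambda ^2-2\lambda \) when \(\lambda >0\) and \(\lambda ^2\) when \(\lambda <0\), so that

$$\begin{aligned} \lim _{\lambda \downarrow 0}\frac{d^2_E(\lambda e_1)-d^2_E(0)}{\lambda }=\lim _{\lambda \downarrow 0}(\lambda -2)=-2\quad \text {whereas}\quad \lim _{\lambda \uparrow 0}\frac{d^2_E(\lambda e_1)-d^2_E(0)}{\lambda }=\lim _{\lambda \uparrow 0}\lambda =0. \end{aligned}$$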

Example 5

(Fitzpatrick [23]) In \(H=\ell ^2\), let

$$\begin{aligned} E=\{ r_ne_n:n=1,2,\ldots \} \end{aligned}$$

where \(r_n=(n+1)/n\). Then \(P_E(0)=\emptyset \) yet \(d^2_E\) is Gâteaux differentiable at 0 with \(\nabla d^2_E(0)=0\). In this case, setting \(I(x)=-2A(x)\), \(x=(x_k)_{k=1}^\infty \in \ell ^2\),

$$\begin{aligned} I(x)=\inf _{k\ge 1}(r_k^2-2r_k x_k) \end{aligned}$$

is a concave function with its global maximum at \(x=0\) on account of \(I(0)=1\) and

$$\begin{aligned} I(x)\le \liminf _{k\rightarrow \infty }(r_k^2-2r_k x_k)=1\quad \text {for all}\, x\in \ell ^2, \end{aligned}$$

owing to \(1<r_k\rightarrow 1\) and \(x_k\rightarrow 0\) as \(k\rightarrow \infty \). In particular, if it exists, \(\nabla I(0)\) must be 0. The confirmation of \(\nabla d^2_E(0)=0\) amounts to demonstrating that the directional derivative vanishes, that is,

$$\begin{aligned} \lim _{\lambda \rightarrow 0}\frac{I(\lambda v)-I(0)}{\lambda }=\lim _{\lambda \rightarrow 0}\frac{I(\lambda v)-1}{\lambda }=0 \end{aligned}$$
(51)

in any direction \(v\in \ell ^2\). To this end, let \(w_k=v_k\) for all \(1\le k\le N\) and \(w_k=0\) for all \(k>N\). The truncated sequence \(w=w^N\) satisfies \(I(\lambda w)=1\) for all sufficiently small \(|\lambda |\). Indeed,

$$\begin{aligned} \inf _{k>N}(r_k^2-2\lambda w_k r_k)=\inf _{k>N}r_k^2=1 \end{aligned}$$

and

$$\begin{aligned} \min _{1\le k\le N}(r_k^2-2\lambda w_k r_k)>1 \end{aligned}$$

for all \(\lambda \) in a neighborhood of 0 (by continuity since the left-hand side is equal to \(r_N^2>1\) when \(\lambda =0\)). Returning to v we next find that

$$\begin{aligned} \left| \frac{I(\lambda v)-I(0)}{\lambda }\right| \le \left| \frac{I(\lambda v)-I(\lambda w)}{\lambda }\right| +\left| \frac{I(\lambda w)-I(0)}{\lambda }\right| \le C\Vert v-w\Vert +0 \end{aligned}$$

when \(0<|\lambda |\) is small enough, whence \(\limsup _{\lambda \rightarrow 0}|(I(\lambda v)-I(0))/\lambda |\le C\Vert v-w\Vert \). Finally, (51) follows when \(N\rightarrow \infty \).
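
The constant C in the last estimate of Example 5 is not specified above. One admissible choice, offered here as a remark and not taken from the source, is \(C=2r_1=4\). Each map \(x\mapsto r_k^2-2r_kx_k\) is Lipschitz continuous with constant \(2r_k\le 2r_1\), and a finite infimum of functions sharing a Lipschitz constant inherits it:

$$\begin{aligned} |I(x)-I(x')|\le \sup _{k\ge 1}2r_k|x_k-x'_k|\le 4\Vert x-x'\Vert \quad \text {for all}\, x,x'\in \ell ^2. \end{aligned}$$

Applying this with \(x=\lambda v\) and \(x'=\lambda w\) gives \(|I(\lambda v)-I(\lambda w)|/|\lambda |\le 4\Vert v-w\Vert \), and \(\Vert v-w^N\Vert \rightarrow 0\) as \(N\rightarrow \infty \) because \(v\in \ell ^2\).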

We close the paper by giving four examples of intrinsic characteristics.

Example 6

Let E be the complement of the open first quadrant in \(H={\mathbb {R}}^2\) and hence \(\Sigma =\{(x_1,x_2)\in H:x_2=x_1>0\}\). The intrinsic characteristic subject to the initial condition \({\varvec{x}}(t_0)=x_0=(\xi _0,\xi _0)\in \Sigma \) is given by

$$\begin{aligned} {\varvec{x}}(t)=\frac{2t}{t_0+t}x_0,\qquad t_0\le t<\infty . \end{aligned}$$
(52)

Indeed, it is a singular arc whose points \(x=(\xi ,\xi )\) are calculated by maximizing

$$\begin{aligned} {\mathbb {R}}\ni \xi \mapsto \frac{1}{2t}\xi ^2-\frac{1}{2(t-t_0)}2(\xi _0-\xi )^2,\qquad t_0<t<\infty . \end{aligned}$$
(53)
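
For the reader's convenience, the maximization of (53) can be carried out explicitly. The symbol \(\varphi \) below is introduced here only for this computation and denotes the function in (53):

$$\begin{aligned} \varphi '(\xi )=\frac{\xi }{t}+\frac{2(\xi _0-\xi )}{t-t_0},\qquad \varphi ''(\xi )=\frac{1}{t}-\frac{2}{t-t_0}=-\frac{t+t_0}{t(t-t_0)}<0. \end{aligned}$$

Hence \(\varphi \) is strictly concave, and its unique maximizer solves \((t-t_0)\xi +2t(\xi _0-\xi )=0\), that is, \(\xi =2t\xi _0/(t_0+t)\), in agreement with (52).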

We notice that \({\varvec{x}}(\cdot )\) traces out only \([1,2)x_0\subset \Sigma \) and that \({\varvec{x}}(\infty )=2x_0\). (By means of the recursive approach explained in Remark 2 a singular curve \(\varvec{\xi }(t)\) can be obtained which traces out the ray \([1,\infty )x_0\subset \Sigma \). Such a singular arc can also be constructed by iterating the step sketched in Remark 3.) By contrast, the unique generalized characteristic emanating from \(x_0\) at time \(t_0\) is

$$\begin{aligned} {\varvec{X}}(t)=\sqrt{\frac{t}{t_0}}x_0,\qquad t_0\le t<\infty . \end{aligned}$$
(54)

Indeed, if \(t>0\) and \(x=(\xi ,\xi )\), then the norm minimal element of \(d^+d^2_{E}(x)/2={\text {co}}\{(\xi ,0),(0,\xi )\}\) is \(\frac{1}{2}(\xi ,\xi )\) and, thus, \(v^\circ (t,x)=(2t)^{-1}x\). Solving \(d\xi /dt=(2t)^{-1}\xi \) with initial condition \(\xi (t_0)=\xi _0\) yields (54). Obviously, \({\varvec{x}}\) and \({\varvec{X}}\) are distinct. Furthermore, \(J^{t_1\rightarrow t_2}\circ J^{t_0\rightarrow t_1}(x_0)\ne J^{t_0\rightarrow t_2}(x_0)\) when \(t_0<t_1<t_2\).
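
The failure of the semigroup property in Example 6 can be checked directly from (52). The sketch assumes, as the notation suggests, that \(J^{s\rightarrow t}\) sends a point of \(\Sigma \) to the value at time t of the intrinsic characteristic starting from that point at time s. Under this reading,

$$\begin{aligned} J^{t_1\rightarrow t_2}\circ J^{t_0\rightarrow t_1}(x_0)=\frac{2t_2}{t_1+t_2}\cdot \frac{2t_1}{t_0+t_1}\,x_0,\qquad J^{t_0\rightarrow t_2}(x_0)=\frac{2t_2}{t_0+t_2}\,x_0, \end{aligned}$$

and the two coefficients differ because

$$\begin{aligned} 2t_1(t_0+t_2)-(t_1+t_2)(t_0+t_1)=(t_1-t_0)(t_2-t_1)>0. \end{aligned}$$

For instance, \(t_0=1\), \(t_1=2\), \(t_2=3\) gives the coefficients \(8/5\) and \(3/2\), respectively.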

Example 7

Again in \(H={\mathbb {R}}^2\), let E be the L-shaped set defined for some \(a>0\) as

$$\begin{aligned} E= [0,2a] \times \{0\}\bigcup \, \{0\}\times [0,2a] , \end{aligned}$$

whose singular set is \(\Sigma =\{(x_1,x_2)\in H:x_2=x_1>0\}\). We set \(z_0=(a,a)\). For the initial point \(x_0=(\xi _0,\xi _0)\in \Sigma \) we distinguish between three cases:

  1. (a)

    \(x_0\in {\text {co}}E\), i.e., \(0<\xi _0\le a\). In this case, \({\varvec{x}}(t)\) is given by (52) and its limit is \({\varvec{x}}(\infty )=2x_0\). We observe that \({\varvec{x}}(\infty )\) remains in \({\text {co}}E\) only if \(\xi _0\le a/2\).

  2. (b)

    \(x_0\notin {\text {co}}E\) and \(a<\xi _0<2a\). The intrinsic characteristic \({\varvec{x}}(t)\) agrees with (52) until it reaches the point \(2z_0\) at time \(t_1=at_0/(\xi _0-a)\). After that, when \(t\ge t_1\), \({\varvec{x}}(t)\) coincides with (55).

  3. (c)

    \(x_0\notin {\text {co}}E\) and \(\xi _0\ge 2a\). Similarly to (53), the singular arc is obtained by maximizing

    $$\begin{aligned} \xi \mapsto \frac{1}{2t}\left( (\xi -2a)^2+\xi ^2\right) -\frac{1}{2(t-t_0)}2(\xi _0-\xi )^2,\qquad t_0<t<\infty . \end{aligned}$$

    A calculation results in

    $$\begin{aligned} {\varvec{x}}(t)=x_0+(t-t_0)v_0,\qquad t_0\le t<\infty , \end{aligned}$$
    (55)

    where the constant velocity is \(v_0=(x_0-z_0)/t_0\).

Clearly, the singular arcs of (b) and (c) are unbounded, which is consistent with Theorem 9. In case (c), the arc has constant nonzero velocity.
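
The formulas in cases (b) and (c) of Example 7 can be recovered by elementary calculus. The following is a sketch in which \(\psi \) is a name introduced here for the function maximized in case (c):

$$\begin{aligned} \psi '(\xi )=\frac{2\xi -2a}{t}+\frac{2(\xi _0-\xi )}{t-t_0}=0\quad \Longleftrightarrow \quad (t-t_0)(\xi -a)+t(\xi _0-\xi )=0\quad \Longleftrightarrow \quad \xi =\xi _0+(t-t_0)\frac{\xi _0-a}{t_0}. \end{aligned}$$

This is the diagonal component of (55) with \(v_0=(x_0-z_0)/t_0\), and \(\psi \) is strictly concave since \(\psi ''=2/t-2/(t-t_0)<0\). In case (b), the switching time \(t_1\) solves \(2t_1\xi _0/(t_0+t_1)=2a\), that is, \(t_1(\xi _0-a)=at_0\).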

Example 8

Let \(f(x)=\exp (-\Vert x\Vert ^2/2)\). Then it can be checked that \(\Sigma = \Sigma _{\mathrm s}=(1,\infty )\times \{0\}\). If \(t_0>0\) and \(x_0=0\) then \((t,{\varvec{x}}(t))=(t,0)\) for all \(t\in [t_0,\infty )\). Assuming \(0<t_0<1\), this shows that \({\varvec{x}}(t)\) can be constant even though \((t,{\varvec{x}}(t))\) starts off as nonsingular and later becomes singular (when \(t>1\)). If we extend each intrinsic characteristic to \([0,\infty )\), then the resulting family of curves can be parameterized by their initial points \({\varvec{x}}(0)=y\). This procedure results in the general form

$$\begin{aligned} {\varvec{x}}_y(t)={\left\{ \begin{array}{ll} (1-te^{-\Vert y\Vert ^2/2})y&{}\quad \text {if}\, 0\le t < e^{\Vert y\Vert ^2/2}, \\ 0&{} \quad \text {if}\, t\ge e^{\Vert y\Vert ^2/2}. \end{array}\right. } \end{aligned}$$

Indeed, \((t,{\varvec{x}}_y(t))\) stays nonsingular with constant velocity \(df(y)=-e^{-\Vert y\Vert ^2/2}y\) when \(0\le t < e^{\Vert y\Vert ^2/2}\) until entering \(\Sigma \) at time \(t=e^{\Vert y\Vert ^2/2}\). After that it will remain in \(\Sigma = \Sigma _{\mathrm s}\), hence, \({\varvec{x}}_y(t)=0\) when \(t\ge e^{\Vert y\Vert ^2/2}\). Furthermore, \({\varvec{x}}_y(t)\) coincides in this case with the unique generalized characteristic for any \(y\in H\) although \({\varvec{x}}_y(t)\) switches from nonsingular to singular and \(\dot{{\varvec{x}}}_y^+(t)\) is not a constant for any \(y\ne 0\).
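
The role of the threshold \(t=1\) in Example 8 can be illustrated by a radial computation at \(x=0\). This is a sketch for that point only and does not replace the full verification of \(\Sigma \); the function h is introduced here for the purpose. With \(r=\Vert y\Vert \) and \(h(r)=e^{-r^2/2}+r^2/(2t)\),

$$\begin{aligned} h'(r)=r\left( \frac{1}{t}-e^{-r^2/2}\right) . \end{aligned}$$

If \(0<t\le 1\), then \(h'(r)>0\) for all \(r>0\), so \(P_{tf}(0)=\{0\}\). If \(t>1\), then h decreases on \((0,\sqrt{2\ln t})\) and increases afterwards, so the minimizers form the sphere \(\Vert y\Vert =\sqrt{2\ln t}\), which is not a singleton, and

$$\begin{aligned} f_t(0)=\frac{1+\ln t}{t}<1=f(0). \end{aligned}$$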

Example 9

Let f be the indicator function of the following union of two half-spaces

$$\begin{aligned} E=\{x \in H:\langle x,z\rangle \ge 1\text { or }\langle x,z\rangle \le -1\} \end{aligned}$$

for a certain fixed \(z\in H\) with \(\Vert z\Vert =1\). Then

$$\begin{aligned} d_E^2(x)=\min \{(\langle x,z\rangle -1)^2, (\langle x,z\rangle +1)^2 \}\quad \text {when }{-1}<\langle x,z\rangle <1. \end{aligned}$$

Note, first, that \(d_E^2\) fails to be Gâteaux differentiable on the orthogonal complement \(\{z\}^\perp \) on account of

$$\begin{aligned} (\langle x,z\rangle -1)^2=(\langle x,z\rangle +1)^2\quad \Leftrightarrow \quad \langle x,z\rangle =0 \end{aligned}$$

and, secondly, that \(d_E^2=1\) on \(\{z\}^\perp \) but \(d_E^2<1\) elsewhere. Hence, if \(x_0\in \{z\}^\perp \), then \(x_0\) is a critical point and \({\varvec{x}}(t)=x_0\) for all \(t\ge t_0\). Still, assuming \(\dim H>1\), there emanates from \(x_0\) a variety of nonconstant singular Lipschitz arcs in the orthogonal complement of \(\{z\}\). In fact, every curve \({\varvec{X}}\) with \({\varvec{X}}(t)\in \{z\}^\perp \) for all t is a singular curve.
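
The nondifferentiability on \(\{z\}^\perp \) in Example 9 can be quantified. As a worked check, for \(x_0\in \{z\}^\perp \) and \(0<|\lambda |<1\) the formula for \(d_E^2\) gives \(d_E^2(x_0+\lambda z)=(1-|\lambda |)^2\), so that

$$\begin{aligned} \frac{d_E^2(x_0+\lambda z)-d_E^2(x_0)}{\lambda }=\frac{\lambda ^2-2|\lambda |}{\lambda }\rightarrow {\left\{ \begin{array}{ll} -2&{}\quad \text {as}\, \lambda \downarrow 0, \\ +2&{} \quad \text {as}\, \lambda \uparrow 0. \end{array}\right. } \end{aligned}$$

The one-sided derivatives in the direction z therefore differ, while \(d_E^2\) is constant along directions in \(\{z\}^\perp \).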