1 Introduction

Many physical relations and data-based coherences cannot be described satisfactorily by classical differential equations. Often they inherently possess features which are not purely local. In this regard, mathematical models governed by nonlocal operators enrich our modeling spectrum and present useful alternatives as well as supplemental approaches. That is why they appear in a large variety of applications including, among others, anomalous or fractional diffusion [10, 11, 19], peridynamics [25, 27, 54, 64], image processing [31, 38, 42], cardiology [14], machine learning [44], as well as finance and jump processes [5, 6, 26, 37, 59]. Nonlocal operators are integral operators allowing for interactions between two distinct points in space. The nonlocal models investigated in this paper involve kernels that are not necessarily symmetric and are assumed to have a finite range of nonlocal interactions; see, e.g., [23, 24, 26, 60] and the references therein.

Not only the problem itself but also various optimization problems involving nonlocal models of this type are treated in the literature. For example, matching-type problems are treated in [18, 20, 21] to identify system parameters such as the forcing term or a scalar diffusion parameter. The control variable is typically modeled as an element of a suitable function space. Moreover, nonlocal interface problems have become popular in recent years [13, 17, 29, 32, 43]. However, shape optimization techniques applied to nonlocal models can hardly be found in the literature. For instance, the articles [9, 41, 55] deal with minimizing (functions of) eigenvalues of the fractional Laplacian with respect to the domain of interest. Also, in [8, 15] the energy functional related to fractional equations is minimized. In [12] a functional involving a more general kernel is considered. All of the aforementioned papers are of a theoretical nature only. To the best of our knowledge, shape optimization problems involving nonlocal constraint equations with truncated kernels, and numerical methods for solving such problems, cannot yet be found in the literature.

In contrast, shape optimization problems constrained by partial differential equations appear in many fields of application [34, 46, 52, 53], particularly for inverse problems where the parameter to be estimated, e.g., the diffusivity in a heat equation model, is assumed to be defined piecewise on certain subdomains. Given a rough picture of the configuration, shape optimization techniques can be successfully applied to identify the detailed shape of these subdomains [48, 49, 50, 62].

In this paper we transfer the problem of parameter identification into a nonlocal regime. Here, the parameter of interest is the kernel which describes the nonlocal model. We assume that this kernel is defined piecewise with respect to a given partition \(\{\Omega _i\}_{i}\) of the domain of interest \(\Omega \). Thereby, the state of such a nonlocal model depends on the interfaces between the respective subdomains \(\Omega _i\). Under the assumption that we know the rough setting but lack the details, we can apply the techniques developed in the aforementioned shape optimization papers to identify these interfaces from a given measured state.

For this purpose we formulate a shape optimization problem which is constrained by an interface–dependent nonlocal convection–diffusion model. Here, we do not aim at investigating conceptual improvements of existing shape optimization algorithms. Rather, we want to study the applicability of established methods to problems of this type.

The realization of this plan basically requires two ingredients both of which are worked out here. First, we define a reasonable interface–dependent nonlocal model and provide a finite element code which discretizes a variational formulation thereof. Second, we need to derive the shape derivative of the corresponding nonlocal bilinear form which is then implemented into an overall shape optimization algorithm.

This leads to the following organization of the present paper. In Sect. 2 we formulate the shape optimization problem including an interface–dependent nonlocal model. Once established, we briefly recall basic concepts from the shape optimization regime in Sect. 3. Then Sect. 4 is devoted to the task of computing the shape derivative of the nonlocal bilinear form and the reduced objective functional. Finally we present numerical illustrations in Sect. 5 which corroborate theoretical findings.

2 Problem formulation

The system model to be considered is the homogeneous steady-state nonlocal Dirichlet problem with volume constraints, given by

$$\begin{aligned} \left\{ \begin{aligned} -\mathcal {L}_\Gamma u&= {f}_\Gamma \quad \text {on } {\Omega }\\ u&= 0 \quad \text { on } {\Omega _{I}}, \end{aligned} \right. \end{aligned}$$
(1)

posed on a bounded domain \(\Omega \subset \mathbb {R}^d\), \(d \in \mathbb {N}\), and its nonlocal interaction domain \({\Omega _{I}}\); see, e.g., [4, 23, 24, 26, 60] and the references therein. Here, we assume that \(\Omega \) is partitioned into a simply connected interior subdomain \(\Omega _1 \subset \Omega \) with boundary \(\Gamma :=\partial \Omega _1\) and a domain \(\Omega _2 :=\Omega \backslash {\overline{\Omega }}_1\). Thus we have \(\Omega = \Omega (\Gamma ) = \Omega _1 {\dot{\cup }} \Gamma {\dot{\cup }} \Omega _2\), where \({\dot{\cup }}\) denotes the disjoint union. In the following, the boundary \(\Gamma \) of the interior domain \(\Omega _1\) is called the interface and is assumed to be an element of an appropriate shape space; see also Sect. 3 for a related discussion. The governing operator \(\mathcal {L}_\Gamma \) is an interface–dependent, nonlocal convection–diffusion operator of the form

$$\begin{aligned} -\mathcal {L}_\Gamma u(\textbf{x})&:= \int _{\mathbb {R}^d} \left( u(\textbf{x})\gamma _\Gamma (\textbf{x},\textbf{y}) - u(\textbf{y})\gamma _\Gamma (\textbf{y},\textbf{x})\right) d\textbf{y}, \end{aligned}$$
(2)

which is determined by a nonnegative, interface–dependent (interaction) kernel \(\gamma _\Gamma :\mathbb {R}^d \times \mathbb {R}^d \rightarrow \mathbb {R}\). The second equation in (1) is called the Dirichlet volume constraint. It specifies the values of u on the interaction domain

$$\begin{aligned} \Omega _I:= \left\{ \textbf{y}\in \mathbb {R}^d\backslash \Omega :~\exists \textbf{x}\in \Omega : \gamma _\Gamma (\textbf{x},\textbf{y}) \ne 0 \right\} , \end{aligned}$$

which consists of all points in the complement of \(\Omega \) that interact with points in \(\Omega \). For ease of exposition, we set \(u=0\) on \(\Omega _I\), but more generally one can impose the constraint \(u = g\) on \(\Omega _I\), provided g satisfies appropriate regularity assumptions.

Furthermore, we assume that the kernel depends on the interface in the following way

$$\begin{aligned} \gamma _\Gamma (\textbf{x}, \textbf{y}) = \sum _{i,j = 1,2} \gamma _{ij}(\textbf{x},\textbf{y}) \chi _{\Omega _i \times \Omega _j}(\textbf{x},\textbf{y}) + \sum _{i = 1,2} \gamma _{iI}(\textbf{x},\textbf{y})\chi _{\Omega _i \times \Omega _I}(\textbf{x},\textbf{y}), \end{aligned}$$
(3)

where \(\chi _{\Omega _i \times \Omega _j}\) denotes the indicator of the set \(\Omega _i \times \Omega _j\). For instance, in [51] the authors refer to \(\gamma _{ij}\) and \(\gamma _{iI}\) as inter– and intra–material coefficients. Notice that we do not need kernels \(\gamma _{Ii}\), since \(u=0\) on \(\Omega _I\). Furthermore, for \(i=1,2\) let \(\{{S}_i(\textbf{x})\}_{\textbf{x}\in \Omega }\), with \({S}_i(\textbf{x}) \subset {\mathbb {R}^d}\) for \(\textbf{x}\in \Omega \), be a family of sets, where the symmetry \(\textbf{y}\in {S}_i(\textbf{x}) \Leftrightarrow \textbf{x}\in {S}_i(\textbf{y})\) for \(\textbf{x},\textbf{y}\in \Omega \) holds. We additionally assume for \(i \in \{1,2\}\) that there exist two radii \(0<\varepsilon _{i}^1 \le \varepsilon _{i}^2<\infty \) such that \(B_{\varepsilon _{i}^1}(\textbf{x}) \subset {S}_{i}(\textbf{x})\subset B_{\varepsilon _{i}^2}(\textbf{x})\) for all \(\textbf{x}\in \Omega \), where \(B_{\varepsilon _{i}^k}(\textbf{x})\) denotes the Euclidean ball of radius \(\varepsilon _{i}^k\).

Throughout this work we consider truncated interaction kernels, which can be written as

$$\begin{aligned} \gamma _{ij}(\textbf{x},\textbf{y}) = {\phi }_{ij}(\textbf{x},\textbf{y}) \chi _{{{S}_{i}(\textbf{x})}}(\textbf{y}) \text { and } \gamma _{iI}(\textbf{x},\textbf{y}) = {\phi }_{iI}(\textbf{x},\textbf{y}) \chi _{{{S}_{i}(\textbf{x})}}(\textbf{y}) \text { for } i,j=1,2 \nonumber \\ \end{aligned}$$
(4)

for appropriate positive functions \({\phi }_{ij} :\mathbb {R}^d \times \mathbb {R}^d \rightarrow \mathbb {R}\) and \({\phi }_{iI} :\mathbb {R}^d \times \mathbb {R}^d \rightarrow \mathbb {R}\), which we refer to as kernel functions. In this paper we differentiate between square integrable kernels and singular symmetric kernels. For square integrable kernels we require \(\gamma _{ij}\in L^2(\Omega \times \Omega )\) and \(\gamma _{iI}\in L^2(\Omega \times \Omega _I)\), which also implies \(\gamma _\Gamma \in \) \(L^2((\Omega \cup \Omega _I) \times (\Omega \cup \Omega _I))\). We do not assume that (3) is symmetric for this type of kernels.
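The piecewise structure (3) with truncated factors (4) can be made concrete in code. The following sketch is illustrative only: a disk-shaped \(\Omega _1\) inside \(\Omega = (0,1)^2\), constant kernel functions \(\phi _{ij}\), and Euclidean interaction balls \(S_i(\textbf{x}) = B_{\varepsilon _i}(\textbf{x})\) are all assumptions made for this example, not choices taken from the model above.

```python
import numpy as np

# Hypothetical configuration: Omega_1 is the disk of radius 0.25 around
# (0.5, 0.5), Omega_2 the rest of Omega = (0,1)^2, and S_i(x) = B_{eps_i}(x).
def region(x, lo=0.0, hi=1.0):
    """Return 1 or 2 for points of Omega, 'I' for the interaction domain."""
    if (lo <= x[0] <= hi) and (lo <= x[1] <= hi):
        return 1 if np.linalg.norm(x - np.array([0.5, 0.5])) < 0.25 else 2
    return "I"

# Illustrative constant kernel functions phi_ij / phi_iI and horizons eps_i.
PHI = {(1, 1): 4.0, (1, 2): 1.0, (2, 1): 1.0, (2, 2): 2.0,
       (1, "I"): 1.0, (2, "I"): 1.0}
EPS = {1: 0.1, 2: 0.2}

def gamma(x, y):
    """Interface-dependent truncated kernel gamma_Gamma(x, y) as in (3)-(4)."""
    i, j = region(x), region(y)
    if i == "I":  # kernels gamma_{Ii} are not needed, since u = 0 on Omega_I
        return 0.0
    inside = np.linalg.norm(x - y) < EPS[i]  # y in S_i(x) = B_{eps_i}(x)
    return PHI[(i, j)] if inside else 0.0
```

Note that symmetry of \(\gamma _\Gamma \) is not enforced here, matching the square integrable case, where (3) need not be symmetric.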

In the case of singular symmetric kernels we require the existence of constants \(0<\gamma _* \le \gamma ^* < \infty \) and a fraction \(s \in (0,1)\), such that

$$\begin{aligned} \gamma _* \le \gamma (\textbf{x},\textbf{y})||\textbf{x}- \textbf{y}||_2^{d + 2s} \le \gamma ^* \end{aligned}$$

for \(\textbf{x}\in \Omega \text { and } \textbf{y}\in S_{1}(\textbf{x}) \cup S_{2}(\textbf{x})\). Also, since the singular kernel is required to be symmetric, the conditions \(\gamma (\textbf{x},\textbf{y})=\gamma (\textbf{y},\textbf{x})\) and, respectively, \(\phi _{12}(\textbf{x},\textbf{y})=\phi _{21}(\textbf{y},\textbf{x})\), \(\phi _{ii}(\textbf{x},\textbf{y})=\phi _{ii}(\textbf{y},\textbf{x})\) have to hold. Since we do not need to define \(\gamma _{Ii}\), as described above, no further symmetry condition on \(\gamma _{iI}\) is required.

Example 2.1

One example of such a singular symmetric kernel is given by

$$\begin{aligned}&\gamma _{ij}(\textbf{x},\textbf{y}) :=\frac{\sigma _{ij}(\textbf{x},\textbf{y})}{||\textbf{x}- \textbf{y}||^{d + 2s}_2}\chi _{B_\varepsilon (\textbf{x})}(\textbf{y}),\quad \gamma _{iI}(\textbf{x},\textbf{y}) :=\frac{\sigma _{iI}(\textbf{x},\textbf{y})}{||\textbf{x}- \textbf{y}||^{d + 2s}_2}\chi _{B_\varepsilon (\textbf{x})}(\textbf{y}),\quad \\&\text {for } i,j=1,2, \end{aligned}$$

where \(s \in (0,1)\), \(0<\varepsilon <\infty \) and the functions \(\sigma _{ij},\sigma _{iI}:\mathbb {R}^d \times \mathbb {R}^d \rightarrow \mathbb {R}\) are bounded from below and above by some positive constants, say \(\gamma _*\) and \(\gamma ^*\). Additionally, the \(\sigma _{ii}\) are assumed to be symmetric on \(\Omega \times \Omega \) and \(\sigma _{12}(\textbf{x},\textbf{y}) = \sigma _{21}(\textbf{y},\textbf{x})\) holds for \({\textbf{x},\textbf{y}\in \Omega \cup \Omega _I}\).
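A minimal sketch of the kernel from Example 2.1, assuming \(d=2\) and constant \(\sigma _{ij} \equiv \sigma \) (constant factors trivially satisfy the symmetry and boundedness requirements); all parameter values are illustrative.

```python
import numpy as np

# Fractional-type kernel of Example 2.1 with d = 2 and constant sigma;
# s, eps and sigma are illustrative parameter choices.
d, s, eps, sigma = 2, 0.5, 0.3, 1.0

def gamma_singular(x, y):
    """sigma / ||x - y||^(d + 2s), truncated at the interaction radius eps."""
    r = np.linalg.norm(np.asarray(x) - np.asarray(y))
    if r == 0.0 or r >= eps:
        return 0.0
    return sigma / r ** (d + 2 * s)
```

By construction, `gamma_singular(x, y) * ||x - y||**(d + 2*s)` equals `sigma` wherever the kernel does not vanish, so the two-sided bound above holds with \(\gamma _* = \gamma ^* = \sigma \).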

For the forcing term \(f_\Gamma \) in (1) we assume a dependency on the interface in the following way

$$\begin{aligned} \begin{aligned} f_\Gamma (\textbf{x}) :={\left\{ \begin{array}{ll} f_1(\textbf{x}): &{}\textbf{x}\in \Omega _1\\ f_2(\textbf{x}): &{}\textbf{x}\in \Omega _2, \end{array}\right. } \end{aligned} \end{aligned}$$
(5)

where we assume that \(f_i \in H^1(\Omega )\), \(i=1,2\), because we need f to be weakly differentiable in Sect. 4. Figure 1 illustrates our setting.

Fig. 1: One example configuration, where the domain \(\Omega \) is divided into \(\Omega _1\) and \(\Omega _2\) with \(\Gamma = \partial \Omega _1\), and \(\Omega _I\) is the nonlocal interaction domain. Here the support of \(\gamma _{11}(\textbf{x},\cdot )\) for one \(\textbf{x}\in \Omega _1\) is depicted in blue and the support of \(\gamma _{22}(\textbf{y},\cdot )\) for one \(\textbf{y}\in \Omega _{2}\) is colored in red, where the latter can be expressed by using the \(||\cdot ||_\infty \)-ball in \(\mathbb {R}^2\).

Next, we introduce a variational formulation of problem (1). For this purpose we define the corresponding forms

$$\begin{aligned} {A}_{\Gamma }(u, v) :=\left( -\mathcal {L}_\Gamma u, v\right) _{L^2(\Omega )} ~~~~\text {and}~~~~ F_{\Gamma }( v) :=(f_\Gamma , v)_{L^2(\Omega )} \end{aligned}$$
(6)

for some functions \(u,v:\Omega \cup \Omega _I\rightarrow \mathbb {R}\), where \(v= 0\) on \(\Omega _I\). By inserting the definitions of the nonlocal operator (2) with the kernel given in (4) and the definition of the forcing term (5), we obtain the nonlocal bilinear form

$$\begin{aligned} {A}_{\Gamma }(u, v)&=\int _{\Omega } v(\textbf{x})\int _{\mathbb {R}^d} (u(\textbf{x})\gamma _\Gamma (\textbf{x},\textbf{y}) - u(\textbf{y})\gamma _\Gamma (\textbf{y},\textbf{x})) d\textbf{y}d\textbf{x}\nonumber \\&=\sum _{i,j= 1,2} \int _{\Omega _i} v(\textbf{x}) \int _{\Omega _j} \left( u(\textbf{x})\gamma _{ij}(\textbf{x},\textbf{y}) - u(\textbf{y})\gamma _{ji}(\textbf{y},\textbf{x})\right) d\textbf{y}d\textbf{x}\nonumber \\&\quad + \sum _{i=1,2} \int _{\Omega _i} v(\textbf{x})u(\textbf{x}) \int _{\Omega _I} \gamma _{iI}(\textbf{x},\textbf{y}) d\textbf{y}d\textbf{x}\end{aligned}$$
(7)
$$\begin{aligned}&=\sum _{i,j= 1,2} \frac{1}{2} \int _{\Omega _i} \int _{\Omega _j}\left( v(\textbf{x}) - v(\textbf{y}) \right) \left( u(\textbf{x})\gamma _{ij}(\textbf{x},\textbf{y}) - u(\textbf{y})\gamma _{ji}(\textbf{y},\textbf{x})\right) d\textbf{y}d\textbf{x}\nonumber \\&\quad + \sum _{i=1,2} \int _{\Omega _i} v(\textbf{x})u(\textbf{x}) \int _{\Omega _I} \gamma _{iI}(\textbf{x},\textbf{y}) d\textbf{y}d\textbf{x}\end{aligned}$$
(8)

and the linear functional

$$\begin{aligned} F_\Gamma ( v) = \int _{\Omega } f_\Gamma v~d\textbf{x}= \int _{\Omega _1} f_1 v~d\textbf{x}+ \int _{\Omega _2} f_2 v~ d\textbf{x}. \end{aligned}$$
(9)

In order to derive the second representation (8) of the nonlocal bilinear form we used Fubini's theorem. We employ both representations (7) and (8) in the proofs of Sect. 4. For singular symmetric kernels we also use another equivalent representation of the nonlocal bilinear form, given by

$$\begin{aligned} {A}_{\Gamma }(u, v) = \frac{1}{2}\iint \limits _{(\Omega \cup \Omega _I)^2} (v(\textbf{x}) - v(\textbf{y}))(u(\textbf{x}) - u(\textbf{y}))\gamma _\Gamma (\textbf{x},\textbf{y}) ~d\textbf{y}d\textbf{x}, \end{aligned}$$

where we again used Fubini's theorem and the fact that \(u,v= 0\) on \(\Omega _I\). Next, we employ the nonlocal bilinear form to define the seminorm

$$\begin{aligned} |||u||| :=\sqrt{{A}_\Gamma (u,u)}. \end{aligned}$$

With this seminorm, we further define the energy spaces

$$\begin{aligned} \begin{aligned} V(\Omega \cup \Omega _I)&:=\{u \in L^2(\Omega \cup \Omega _I): ||u||_{V(\Omega \cup \Omega _I)} :=|||u||| + ||u||_{L^2(\Omega \cup \Omega _I)} < \infty \} \text { and}\\ V_c(\Omega \cup \Omega _I)&:=\{u \in V(\Omega \cup \Omega _I): u = 0 \text { on } \Omega _I \}. \end{aligned} \end{aligned}$$
(10)

The variational formulation corresponding to problem (1) now reads as follows:

$$\begin{aligned} \begin{array}{c} {given\, f_\Gamma \in H^1(\Omega )\, find\, u \in V_c(\Omega \cup \Omega _I)\, such\,\, that } \\ {A}_\Gamma (u,v) = F_\Gamma (v)~~{ for\, all}~~ v\in V_c(\Omega \cup \Omega _I). \end{array} \end{aligned}$$
(11)

Additionally, for \(s \in (0,1)\) we define the seminorm

$$\begin{aligned} |u|_{H^s(\Omega \cup \Omega _I)} :=\int _{\Omega \cup \Omega _I}\int _{\Omega \cup \Omega _I} \frac{\left( u(\textbf{x}) - u(\textbf{y})\right) ^2}{||\textbf{x}- \textbf{y}||_2^{d+2s}} ~d\textbf{y}d\textbf{x}\end{aligned}$$

and the fractional Sobolev space as

$$\begin{aligned} H^s(\Omega \cup \Omega _I)&:=\{u \in L^2(\Omega \cup \Omega _I): ||u||_{H^s(\Omega \cup \Omega _I)} \\&:=||u||_{L^2(\Omega \cup \Omega _I)} + |u|_{H^s(\Omega \cup \Omega _I)} < \infty \}. \end{aligned}$$

Moreover, we denote the volume-constrained spaces by

$$\begin{aligned} L_c^2(\Omega \cup \Omega _I)&:=\{u \in L^2(\Omega \cup \Omega _I): u=0 \text { on } \Omega _I\} \text { and} \\ H^s_c(\Omega \cup \Omega _I)&:=\{u \in H^s(\Omega \cup \Omega _I): u=0 \text { on } \Omega _I\} \text { for } s \in (0,1). \end{aligned}$$

Then, for square integrable kernels one can show the equivalence of the norms \(||\cdot ||_{V(\Omega \cup \Omega _I)}\) and \(||\cdot ||_{L^2(\Omega \cup \Omega _I)}\), i.e., there exist constants \(C_1,C_2 > 0\) such that

$$\begin{aligned} C_1||u||_{L^2(\Omega \cup \Omega _I)} \le ||u||_{V(\Omega \cup \Omega _I)} \le C_2||u||_{L^2(\Omega \cup \Omega _I)} \end{aligned}$$

and consequently \( u \in V(\Omega \cup \Omega _I) \Leftrightarrow u \in L^2(\Omega \cup \Omega _I).\) Additionally, one can prove the equivalence of \(\left( V_c(\Omega \cup \Omega _I), |||\cdot |||\right) \) and \(\left( L_c^2(\Omega \cup \Omega _I), ||\cdot ||_{L^2(\Omega \cup \Omega _I)}\right) \); see related results in [24, 28, 61]. Moreover, the well-posedness of problem (11) for symmetric (square integrable) kernels is proven in [24], and in [28] the well-posedness for some nonsymmetric cases is also covered (again under certain conditions on the kernel and the forcing term f). For singular symmetric kernels, the well-posedness of problem (11), the equivalence between \(\left( V(\Omega \cup \Omega _I),||\cdot ||_{V(\Omega \cup \Omega _I)}\right) \) and the fractional Sobolev space \(\left( H^s(\Omega \cup \Omega _I),||\cdot ||_{H^s(\Omega \cup \Omega _I)}\right) \), and the equivalence between \(\left( V_c(\Omega \cup \Omega _I),|||\cdot |||\right) \) and \(\left( H_c^s(\Omega \cup \Omega _I), |\cdot |_{H^s(\Omega \cup \Omega _I)}\right) \) are shown in [24].
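To make the variational problem (11) tangible, the following sketch solves a one-dimensional analogue by a simple midpoint-rule collocation. Everything here is an illustrative assumption: the constant square integrable kernel \(\gamma = 3/\delta ^3\) on \(\{|x-y|<\delta \}\) (scaled so that the operator is consistent with \(-u''\) as \(\delta \rightarrow 0\)), the forcing term \(f \equiv 1\), and the crude quadrature; this is not the finite element discretization used in this paper.

```python
import numpy as np

# 1D analogue of the discretized state equation (11): collocation of
# -L u = f on Omega = (0,1) with u = 0 on the interaction collar
# Omega_I = (-delta, 0] and [1, 1 + delta).
def solve_nonlocal_1d(n=80, delta=0.105):
    h = 1.0 / n
    x = (np.arange(n) + 0.5) * h          # cell midpoints in Omega
    c = 3.0 / delta**3                    # constant kernel value
    A = np.zeros((n, n))
    for k in range(n):
        for m in range(n):
            # midpoint rule for int_Omega (u(x_k) - u(y)) * gamma dy
            if 0.0 < abs(x[k] - x[m]) < delta:
                A[k, k] += c * h
                A[k, m] -= c * h
        # interactions with Omega_I contribute only the u(x_k) term (u = 0 there)
        lo, hi = x[k] - delta, x[k] + delta
        A[k, k] += c * (max(0.0, -lo) + max(0.0, hi - 1.0))
    f = np.ones(n)
    return x, np.linalg.solve(A, f)
```

For this configuration the computed solution is positive, symmetric about \(x = 1/2\), and attains its maximum near the center, close to the local Poisson value 1/8 up to boundary-layer effects near the interaction collar.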

Finally, let us suppose we are given measurements \({\bar{u}}:\Omega \rightarrow \mathbb {R}\) on the domain \(\Omega \), which we assume to follow the nonlocal model (11) with the interface–dependent kernel \(\gamma _\Gamma \) and the forcing term \(f_\Gamma \) defined in (3) and (5), respectively. In order to formulate the shape derivative in Sect. 4 we need \({\bar{u}} \in H^1(\Omega )\). Then, given the data \({\bar{u}}\), we aim at identifying the interface \(\Gamma \) for which the corresponding nonlocal solution \(u(\Gamma )\) is the “best approximation” to these measurements. Mathematically speaking, we formulate an optimal control problem with a tracking-type objective functional, where the interface \(\Gamma \) represents the control variable and is modeled as an element of a shape space \(\mathcal {A}\), which will be specified in Sect. 3.1. We now assume \(\Omega := (0,1)^2\) and introduce the following nonlocally constrained shape optimization problem

$$\begin{aligned} \min _{\Gamma \in \mathcal {A}}~J(u,\Gamma ) \quad \text {subject to}\quad {A}_\Gamma (u,v) = F_\Gamma (v) \quad \text {for all } v\in V_c(\Omega \cup \Omega _I). \end{aligned}$$
(12)

The objective functional is given by

$$\begin{aligned} J(u,\Gamma )&:=j(u,\Gamma )+j_{reg}(\Gamma ):=\frac{1}{2} \int _{\Omega } (u - {\bar{u}})^2 ~ d \textbf{x}+ \nu \int _{\Gamma }1 ~ds . \end{aligned}$$

The first term \(j(u,\Gamma )\) is a standard \(L^2\) tracking-type functional, whereas the second term \(j_{reg}(\Gamma )\) is known as perimeter regularization and is commonly used in the related literature to overcome possible ill-posedness of shape optimization problems [3].
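For a discretized interface, both terms of the objective are cheap to evaluate. The following sketch assumes a polygonal interface and a midpoint quadrature on a uniform grid with cell area \(h^2\); these discretization choices are illustrative and not taken from the paper.

```python
import numpy as np

# Sketch of J(u, Gamma) = j(u, Gamma) + j_reg(Gamma): L^2 tracking term by a
# midpoint rule plus nu times the length of a closed polygonal interface.
def objective(u, u_bar, h, gamma_nodes, nu=1e-3):
    """u, u_bar: state and data at the cell midpoints; gamma_nodes: (m, 2)
    array of vertices of the closed polygonal interface Gamma."""
    tracking = 0.5 * np.sum((u - u_bar) ** 2) * h ** 2
    # close the polygon and sum the edge lengths for the perimeter term
    edges = np.diff(np.vstack([gamma_nodes, gamma_nodes[:1]]), axis=0)
    perimeter = np.sum(np.linalg.norm(edges, axis=1))
    return tracking + nu * perimeter
```

For \(u = {\bar{u}}\) only the regularization term \(\nu \cdot \mathrm{Per}(\Gamma )\) remains, which illustrates how \(\nu \) penalizes long, oscillatory interfaces.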

3 Basic concepts in shape optimization

For solving the constrained shape optimization problem (12) we want to use the shape optimization algorithms developed in [47, 48, 50] for problem classes that are comparable in structure. Thus, in this section we briefly introduce the basic concepts and ideas of the shape formalism applied therein. For a rigorous introduction to shape spaces, shape derivatives and shape calculus in general, we refer to the monographs [16, 56, 62]. For the remainder of this paper we restrict ourselves to the cases \(d \in \{2,3\}\), since shape optimization problems are typically formulated in a two- or three-dimensional setting.

3.1 Notations and definitions

Based on our perception of the interface, we refer to the image of the unit sphere under a smooth, injective mapping as a shape, i.e., the spaces of interest are subsets of

$$\begin{aligned} \mathcal {A}:=\left\{ \Gamma :=\varphi (S^{d-1}):\varphi \in C^\infty (S^{d-1}, \Omega ) ~\text {injective};~\varphi ' \ne 0 \right\} , \end{aligned}$$
(13)

where \(S^{d-1}\) is the unit sphere in \({\mathbb {R}^d}\). By the Jordan–Brouwer separation theorem [33] such a shape \(\Gamma \in \mathcal {A}\) divides the space into two (simply) connected components with common boundary \(\Gamma \). One of them is the bounded interior, which in our situation can be identified with \(\Omega _1\).

Functionals \(J :\mathcal {A}\rightarrow \mathbb {R}\) which assign a real number to a shape are called shape functionals. Since this paper deals with minimizing such shape functionals, i.e., with so-called shape optimization problems, we need to introduce the notion of an appropriate shape derivative. To this end we consider a family of mappings \(\textbf{F}_{\textbf{t}}:{\overline{\Omega }} \rightarrow \mathbb {R}^d\) with \(\textbf{F}_{\textbf{0}}= \textbf{id}\), where \(t\in [0,T]\) and \(T \in (0,\infty )\) is sufficiently small, which transform a shape \(\Gamma \) into a family of perturbed shapes \(\left\{ \Gamma ^t\right\} _{t \in [0,T]}\), where \( \Gamma ^t:=\textbf{F}_{\textbf{t}}(\Gamma ) \) and \(\Gamma ^0 = \Gamma \). Here the family of mappings \(\left\{ \textbf{F}_{\textbf{t}}\right\} _{t\in [0,T]}\) is given by the perturbation of identity, which for a smooth vector field \(\textbf{V}\in C_0^k(\Omega , \mathbb {R}^d)\), \(k \in {\mathbb {N}}\), is defined by

$$\begin{aligned} \textbf{F}_{\textbf{t}}(\textbf{x}) :=\textbf{x}+ t \textbf{V}(\textbf{x}), \quad \text {for all } \textbf{x}\in \Omega . \end{aligned}$$

We note that for sufficiently small \(t\in [0,T]\) the function \(\textbf{F}_{\textbf{t}}\) is injective, and thus \(\Gamma ^t\in \mathcal {A}\). Then the Eulerian or directional derivative of a shape functional J at a shape \(\Gamma \) in direction of a vector field \(\textbf{V}\in C_0^k(\Omega , \mathbb {R}^d)\), \(k \in {\mathbb {N}}\), is defined by

$$\begin{aligned} D_\Gamma J(\Gamma )[\textbf{V}] :=\left. \frac{d}{dt}\right| _{t = 0^+} J(\textbf{F}_{\textbf{t}}(\Gamma ))= \lim _{t \searrow 0} \frac{\left( J(\textbf{F}_{\textbf{t}}(\Gamma )) - J(\Gamma )\right) }{t}. \end{aligned}$$
(14)

If \(D_\Gamma J(\Gamma )[\textbf{V}] \) exists for all \(\textbf{V}\in C_0^k(\Omega , \mathbb {R}^d)\) and the mapping \(\textbf{V}\mapsto D_\Gamma J(\Gamma )[\textbf{V}] \) is continuous, i.e., an element of the dual space \(\left( C_0^k(\Omega , \mathbb {R}^d)\right) ^*\), then \(D_\Gamma J(\Gamma )[\textbf{V}] \) is called the shape derivative of J [62, Definition 4.6].
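Definition (14) can be checked numerically by a one-sided difference quotient. The sketch below does so for the simple shape functional \(J(\Gamma ) = \) area enclosed by \(\Gamma \), with the perturbation of identity \(\textbf{F}_{\textbf{t}} = \textbf{id} + t\textbf{V}\) and the radial field \(\textbf{V}(\textbf{x}) = \textbf{x}- \textbf{c}\); the polygonal circle, the functional and the field are illustrative choices only.

```python
import numpy as np

# Shoelace formula for the area enclosed by a closed polygon (CCW orientation).
def polygon_area(p):
    x, y = p[:, 0], p[:, 1]
    return 0.5 * np.sum(x * np.roll(y, -1) - np.roll(x, -1) * y)

c = np.array([0.5, 0.5])
theta = np.linspace(0.0, 2 * np.pi, 200, endpoint=False)
gamma = c + 0.3 * np.column_stack([np.cos(theta), np.sin(theta)])

def J_of_t(t):
    V = gamma - c                          # V(x) = x - c evaluated on Gamma
    return polygon_area(gamma + t * V)     # J(F_t(Gamma))

t = 1e-6
dJ_fd = (J_of_t(t) - J_of_t(0.0)) / t      # difference quotient as in (14)
dJ_exact = 2.0 * J_of_t(0.0)               # F_t scales Gamma by (1 + t), so
                                           # J(Gamma^t) = (1 + t)**2 * J(Gamma)
```

Such difference-quotient checks are also a standard sanity test for implementations of analytically derived shape derivatives.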

At this point, let us also define the material derivative of a family of functions \({ \{v^t:\Omega \rightarrow \mathbb {R}: t \in [ 0, T ] \} }\) in direction \(\textbf{V}\) by

$$\begin{aligned} D_m v(\textbf{x}) :=\left. \frac{d}{dt}\right| _{t = 0^+} v^t(\textbf{F}_{\textbf{t}}(\textbf{x})). \end{aligned}$$

For functions \(v\), which do not explicitly depend on the shape, i.e., \(v^t = v~\text {for all } t\in [0,T]\), we find

$$\begin{aligned} D_m v= \nabla v^\top \textbf{V}. \end{aligned}$$

For more details on shape optimization we refer to the literature, e.g., [16] or [56].
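The identity \(D_m v= \nabla v^\top \textbf{V}\) for shape-independent \(v\) can likewise be verified by a finite difference along \(t \mapsto v(\textbf{F}_{\textbf{t}}(\textbf{x}))\); the concrete choices \(v(\textbf{x}) = x_1^2 + x_2\) and \(\textbf{V}(\textbf{x}) = (x_2, -x_1)\) below are illustrative assumptions.

```python
import numpy as np

# Check D_m v = grad(v)^T V for a function v that does not depend on the shape.
def v(x):
    return x[0] ** 2 + x[1]

def V(x):
    return np.array([x[1], -x[0]])

x0 = np.array([0.3, 0.7])
t = 1e-7
fd = (v(x0 + t * V(x0)) - v(x0)) / t       # d/dt v(F_t(x)) at t = 0^+
grad_v = np.array([2.0 * x0[0], 1.0])      # grad v = (2*x1, 1)
analytic = grad_v @ V(x0)                  # = 0.6*0.7 + 1.0*(-0.3) = 0.12
```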

Remark 3.1

In case of the nonlocal problem (12) we extend the vector field \(\textbf{V}\) to \(\Omega \cup \Omega _I\) by zero, i.e., \(\textbf{V}\in C_0^k(\Omega \cup \Omega _I, {\mathbb {R}^d}) :=\{\textbf{V}:\Omega \cup \Omega _I\rightarrow {\mathbb {R}^d}: \textbf{V}|_{\Omega } \in C_0^k(\Omega ,{\mathbb {R}^d}) \text { and } \textbf{V}= 0 \text { on } \Omega _I \}\). Accordingly, the shape of the interaction domain \(\Omega _I\) does not change. Moreover, in this work \(\textbf{V}\in C_0^1(\Omega \cup \Omega _I,\mathbb {R}^d)\) is sufficient for all computations.

3.2 Optimization approach: averaged adjoint method

Let us assume that for each admissible shape \(\Gamma \) there exists a unique solution \(u(\Gamma )\) of the constraint equation, i.e., \(u(\Gamma )\) satisfies \({A}_\Gamma (u(\Gamma ), v) = F_{\Gamma }( v)\) for all \(v\in V_c(\Omega \cup \Omega _I)\). Then we can consider the reduced problem

$$\begin{aligned} \min \limits _{\Gamma }\;&J^{red}(\Gamma ) := J(u(\Gamma ), \Gamma ). \end{aligned}$$
(15)

In order to employ derivative based minimization algorithms, we need to derive the shape derivative of the reduced objective functional \(J^{red}\). By formally applying the chain rule, we obtain

$$\begin{aligned} D_\Gamma J^{red}(\Gamma )[\textbf{V}] = D_uJ(u(\Gamma ), \Gamma ) D_\Gamma u(\Gamma )[\textbf{V}] + D_\Gamma J(u(\Gamma ), \Gamma )[\textbf{V}] , \end{aligned}$$

where \(D_uJ\) and \(D_\Gamma J\) denote the partial derivatives of the objective J with respect to the state variable u and the control \(\Gamma \), respectively. In applications we typically do not have an explicit formula for the control-to-state mapping \(u(\Gamma )\), so we cannot analytically quantify the sensitivity of the unique solution \(u(\Gamma )\) with respect to the interface \(\Gamma \). Thus, a formula for the shape derivative \(D_\Gamma u(\Gamma )[\textbf{V}] \) is unattainable. One possible approach to circumvent \(D_\Gamma u(\Gamma )[\textbf{V}] \) and still access the shape derivative \(D_\Gamma J^{red}(\Gamma )[\textbf{V}]\) is the averaged adjoint method (AAM) developed in [36, 57, 58], a Lagrangian method where the so-called Lagrangian functional is defined as

$$\begin{aligned} L(u,\Gamma ,v) :=J(u,\Gamma ) + {A}_\Gamma (u,v) - F_\Gamma (v). \end{aligned}$$

The basic idea behind Lagrangian methods is that we can express the reduced functional as

$$\begin{aligned} J^{red}(\Gamma )= L(u(\Gamma ),\Gamma ,v), \quad \forall v \in V_c(\Omega \cup \Omega _I). \end{aligned}$$

Now let \(\Gamma \) be fixed and denote by \(\Gamma ^t:=\textbf{F}_{\textbf{t}}(\Gamma )\) and \(\Omega _i^t :=\textbf{F}_{\textbf{t}}(\Omega _i)\) the deformed interface and the deformed subdomains, respectively. Furthermore, by writing \(\Omega (\Gamma ^t)\) we indicate that we use the decomposition \(\Omega (\Gamma ^t) = \Omega _1^t \cup \Gamma ^t \cup \Omega _2^t \left( = \Omega \right) \), where \(\Gamma ^t = \partial \Omega _1^t\). Consequently, the norm \(||\cdot ||_{V(\Omega (\Gamma ^t) \cup \Omega _I)}\) of the space \(V(\Omega (\Gamma ^t) \cup \Omega _I)\) differs from the norm \(||\cdot ||_{V(\Omega \cup \Omega _I)}\) of the space \(V(\Omega \cup \Omega _I)\) due to the interface-sensitivity of the kernel; see (10). Then we consider the reduced objective functional regarding \(\Gamma ^t\), i.e.,

$$\begin{aligned} J^{red}(\Gamma ^t)= L(u(\Gamma ^t),\Gamma ^t,v), \quad \forall v \in V_c(\Omega (\Gamma ^t) \cup \Omega _I), \end{aligned}$$
(16)

where \(u(\Gamma ^t) \in V_c(\Omega (\Gamma ^t) \cup \Omega _I) \). If we now tried to differentiate L with respect to t in order to derive the shape derivative, we would have to differentiate \(u(\Gamma ^t) \circ \textbf{F}_{\textbf{t}}\) and \(v\circ \textbf{F}_{\textbf{t}}\), where \(u(\Gamma ^t),v\in V_c(\Omega (\Gamma ^t) \cup \Omega _I) \) may not be differentiable. Additionally, the norm \(||\cdot ||_{V(\Omega (\Gamma ^t) \cup \Omega _I) }\), and therefore the space \(V_c(\Omega (\Gamma ^t) \cup \Omega _I)\), also depends on t. Instead, since \(\textbf{F}_{\textbf{t}}\) is a homeomorphism, we can use that for \(u,v\in V_c(\Omega (\Gamma ^t) \cup \Omega _I) \) there exist functions \({\tilde{u}},\tilde{v} \in V_c(\Omega \cup \Omega _I)\) such that

$$\begin{aligned} u = {\tilde{u}} \circ \textbf{F}_{\textbf{t}}^{-1} ~\text {and}~ v= \tilde{v} \circ \textbf{F}_{\textbf{t}}^{-1}. \end{aligned}$$

Moreover let \(T \in (0, \infty )\) be sufficiently small. Then we define

$$\begin{aligned}&J:[0,T]\times V_c(\Omega \cup \Omega _I)\rightarrow \mathbb {R},\nonumber \\&\quad J(t,u) :=J(u \circ \textbf{F}_{\textbf{t}}^{-1},\Gamma ^t), \nonumber \\&\quad A:[0,T]\times V_c(\Omega \cup \Omega _I)\times V_c(\Omega \cup \Omega _I) \rightarrow \mathbb {R}, \nonumber \\&\quad A(t,u,v):=A_{\Gamma ^t}(u\circ \textbf{F}_{\textbf{t}}^{-1}, v\circ \textbf{F}_{\textbf{t}}^{-1}), \nonumber \\&\quad F:[0,T]\times V_c(\Omega \cup \Omega _I)\rightarrow \mathbb {R},\nonumber \\&\quad F(t,v):=F_{\Gamma ^t}(v\circ \textbf{F}_{\textbf{t}}^{-1}), \nonumber \\&\quad G:[0,T] \times V_c(\Omega \cup \Omega _I) \times V_c(\Omega \cup \Omega _I) \rightarrow \mathbb {R},\nonumber \\&\quad G(t,u,v) :=L(u \circ \textbf{F}_{\textbf{t}}^{-1},\Gamma ^t,v\circ \textbf{F}_{\textbf{t}}^{-1}) = J(t, u) + {A}(t,u,v) - F(t, v). \end{aligned}$$
(17)

Then we can reformulate (16) as

$$\begin{aligned} J^{red}(\Gamma ^t)= G(t,u^t,v),\quad \forall v \in V_c(\Omega \cup \Omega _I), \end{aligned}$$

where \(u^t \in V_c(\Omega \cup \Omega _I)\) is the unique solution of the nonlocal equation corresponding to \(\Gamma ^t\)

$$\begin{aligned} {A}(t,u,v) - F(t,v) = 0,\quad \forall v\in V_c(\Omega \cup \Omega _I). \end{aligned}$$

Furthermore, \({A}(t,u,v) - F(t,v)\) is obviously linear in \(v\) for all \((t,u) \in [0,T] \times V_c(\Omega \cup \Omega _I)\), which is one prerequisite of the AAM. In order to use the AAM to compute the shape derivative, the following additional assumptions have to be met.

  • Assumption (H0): For every \((t,v) \in [0,T] \times V_c(\Omega \cup \Omega _I)\)

    1. 1.

      \([0,1] \ni s \mapsto G(t,su^t+(1-s)u^0,v)\) is absolutely continuous and

    2. 2.

      \([0,1] \ni s \mapsto d_uG(t,su^t + (1-s)u^0,v)[{\tilde{u}}] \in L^1((0,1))\) for all \({\tilde{u}} \in V_c(\Omega \cup \Omega _I)\).

  • For every \(t \in [0,T]\) there exists a unique solution \(v^t \in L^2(\Omega )\), such that \(v^t\) solves the averaged adjoint equation

    $$\begin{aligned} \int _0^1 d_u G(t,su^t + (1-s)u^0,v^t)[{\tilde{u}}]ds = 0 \quad \text {for all } {\tilde{u}} \in V_c(\Omega \cup \Omega _I). \end{aligned}$$
    (18)
  • Assumption (H1):

    Assume that the following equation holds

    $$\begin{aligned} \lim _{t \searrow 0} \frac{G(t,u^0,v^t)- G(0,u^0,v^t)}{t}= \partial _t G(0,u^0,v^0). \end{aligned}$$

In our case, due to the linearity of \({A}(t, {\tilde{u}}, v^t)\) in the second argument, the left-hand side of the averaged adjoint equation (18) can be formulated as

$$\begin{aligned} \int _0^1 d_u G(t,su^t + (1-s)u^0,v^t)[{\tilde{u}}] ~ds = {A}(t,{\tilde{u}},v^t) + \int _\Omega \left( \frac{1}{2}(u^t + u^0) - {\bar{u}}^t\right) {\tilde{u}}\xi ^t~d\textbf{x}, \end{aligned}$$

where \(\xi ^t(\textbf{x}) :=det D \textbf{F}_{\textbf{t}}(\textbf{x})\) and \({\bar{u}}^t(\textbf{x}) :={\bar{u}}(\textbf{F}_{\textbf{t}}(\textbf{x}))\). As a consequence, (18) is equivalent to

$$\begin{aligned} {A}(t, {\tilde{u}}, v^t) = - \int _{\Omega } \left( \frac{1}{2}\left( u^t + u^0 \right) - {\bar{u}}^t \right) {\tilde{u}}\xi ^t~d\textbf{x}\quad \forall {\tilde{u}} \in V_c(\Omega \cup \Omega _I). \end{aligned}$$

For \(t=0\) we get

$$\begin{aligned} {A}(0,{\tilde{u}},v^0)&= - \int _\Omega (u^0 - {\bar{u}}){\tilde{u}} ~d\textbf{x}\quad \forall {\tilde{u}} \in V_c(\Omega \cup \Omega _I). \end{aligned}$$
(19)

In this case we call (19) the adjoint equation and its solution \(v^0\) is referred to as the adjoint solution. Moreover, the nonlocal problem (11) for \(t=0\) is also called the state equation and its solution \(u^0\) the state solution.

Finally, the next theorem yields a practical formula for computing the shape derivative.

Theorem 3.2

([36, Theorem 3.1]) Let the assumptions (H0) and (H1) be satisfied and suppose there exists a unique solution \(v^t\) to the averaged adjoint equation (18). Then for \(v\in V_c(\Omega \cup \Omega _I)\) we obtain

$$\begin{aligned} D_\Gamma J^{red}(\Gamma )[\textbf{V}]=\left. \frac{d}{dt}\right| _{t=0^+}J^{red}(\Gamma ^t) =\left. \frac{d}{dt}\right| _{t=0^+}G(t,u^t,v) = \partial _t G(0,u^0,v^0). \end{aligned}$$
(20)

Proof

See proof of [36, Theorem 3.1]. \(\square \)

Remark 3.3

Under the assumption that the material derivatives of u and \(v\) exist and that \(D_m u, D_m v\in V_c(\Omega \cup \Omega _I)\), one can also use the material derivative approach of [7] to derive the shape derivative of the reduced functional (15).

3.3 Optimization algorithm

Let us assume for a moment that we have an explicit formula for the shape derivative of the reduced objective functional. We now briefly recall the techniques developed in [50] and describe how to exploit this derivative to implement gradient-based optimization methods or even quasi-Newton methods, such as L-BFGS, for solving the constrained shape optimization problem (12).

In order to identify gradients we require the notion of an inner product, or more generally a Riemannian metric. Unfortunately, shape spaces typically do not admit the structure of a linear space. However, in particular situations it is possible to define appropriate quotient spaces, which can be equipped with a Riemannian structure. For instance, consider the set \(\mathcal {A}\) introduced in (13). Since we are only interested in the image of the defining embedding, a re-parametrization thereof does not lead to a different shape. Consequently, two spheres that are equal modulo (diffeomorphic) re-parametrizations define the same shape. This conception naturally leads to the quotient space \({{\,\textrm{Emb}\,}}(S^{d-1}, \mathbb {R}^d) / {{\,\textrm{Diff}\,}}(S^{d-1}, S^{d-1}) \), which can be considered an infinite-dimensional Riemannian manifold [39, 62]. This example already hints at the difficulty of translating abstract shape derivatives into discrete optimization methods; see, e.g., the thesis [63] on this topic. A detailed discussion of these issues is not the intention of this work, and we now outline Algorithm 1.

The basic idea can be intuitively explained in the following way. Starting with an initial guess \(\Gamma _0\), we aim to iterate in a steepest-descent fashion over interfaces \(\Gamma _k\) until we reach a “stationary point” of the reduced objective functional \(J^{red}\). The interface \(\Gamma _k\) is encoded in the finite element mesh, and transformations thereof are realized by adding vector fields \(\textbf{U}:\Omega \rightarrow \mathbb {R}^d\) (which can be interpreted as tangent vectors at a fixed interface) to the finite element nodes, whose coordinates we denote by \(\Omega _k\).

Thus, the essential part is to update the finite element mesh after each iteration by adding an appropriate transformation vector field. For this purpose, we use the solution \(\textbf{U}(\Gamma ):\Omega (\Gamma ) \rightarrow \mathbb {R}^d\) of the so-called deformation equation

$$\begin{aligned} a_{\Gamma }(\textbf{U}(\Gamma ),\textbf{V})= D_\Gamma J^{red}(\Gamma )[\textbf{V}] ~~~\text {for all}~\textbf{V}\in H^1_0(\Omega (\Gamma ),{\mathbb {R}}^{d}). \end{aligned}$$
(21)

The right-hand side of this equation is given by the shape derivative of the reduced objective functional (20) and the left-hand side denotes an inner product on the vector field space \(H^1_0(\Omega ,{\mathbb {R}}^{d})\). In view of the manifold interpretation, we can consider \(a_{\Gamma }\) as an inner product on the tangent space at \(\Gamma \), so that \(\textbf{U}(\Gamma )\) is interpretable as the gradient of the shape functional \(J^{red}\) at \(\Gamma \). The solution \(\textbf{U}(\Gamma ):\Omega \rightarrow {\mathbb {R}}^{d}\) of (21) is then added in a scaled version to the coordinates \(\Omega _k\) of the finite element nodes.
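In a discretized setting, one descent step therefore amounts to solving a linear system for the nodal gradient field and shifting the node coordinates. The following minimal NumPy sketch assumes the bilinear form \(a_\Gamma \) and the shape derivative have already been assembled into a matrix `A` and a vector `dJ` (all names here are hypothetical, not from our implementation):

```python
import numpy as np

def deformation_step(A, dJ, nodes, step=1.0):
    """One descent update: solve the discretized deformation
    equation A u = dJ for the nodal gradient field and add a
    scaled copy of it to the node coordinates.

    A     : assembled matrix of the bilinear form a_Gamma (hypothetical)
    dJ    : assembled shape derivative vector (hypothetical)
    nodes : (n, d) array of finite element node coordinates
    """
    u = np.linalg.solve(A, dJ)      # flattened gradient vector field
    U = u.reshape(nodes.shape)      # one d-vector per node
    return nodes + step * U         # transformed mesh coordinates

# toy usage: 3 nodes in 2d with a symmetric positive definite stand-in for A
rng = np.random.default_rng(0)
M = rng.standard_normal((6, 6))
A = M @ M.T + 6.0 * np.eye(6)
dJ = rng.standard_normal(6)
nodes = rng.standard_normal((3, 2))
new_nodes = deformation_step(A, dJ, nodes, step=0.1)
```

In practice the scaling `step` is chosen by the line search of Algorithm 1 and the mesh quality has to be monitored after each update.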

A common choice for \(a_\Gamma \) is the bilinear form associated with the linear elasticity equation given by

$$\begin{aligned} a_\Gamma (\textbf{U},\textbf{V})= \int \limits _{\Omega (\Gamma )}\sigma (\textbf{U}):\epsilon (\textbf{V})\,dx, \end{aligned}$$

for \(\textbf{U},\textbf{V}\in H^1_0(\Omega ,{\mathbb {R}}^{d})\), where, denoting by \({{\,\mathrm{\textbf{Id}}\,}}:{\mathbb {R}^d}\rightarrow {\mathbb {R}^d}\) the identity function,

$$\begin{aligned} \sigma (\textbf{U}):= \lambda \text {tr}(\epsilon (\textbf{U})) {{\,\mathrm{\textbf{Id}}\,}}+ 2 \mu \epsilon (\textbf{U}) \end{aligned}$$
(22)

and

$$\begin{aligned} \epsilon (\textbf{U}):= \frac{1}{2}(\nabla \textbf{U}+ \nabla \textbf{U}^T) \end{aligned}$$

are the stress and strain tensors, respectively. Deformation vector fields \(\textbf{V}\) which do not change the interface do not have an impact on the reduced objective functional, so that

$$\begin{aligned} D_\Gamma J^{red}(\Gamma )[\textbf{V}] =0 ~~~\text {for all}~ \textbf{V}\text { with } \text {supp}(\textbf{V})\cap \Gamma =\emptyset . \end{aligned}$$

Therefore, the right-hand side \(D_\Gamma J^{red}(\Gamma )[\textbf{V}] \) is only assembled for test vector fields whose support intersects the interface \(\Gamma \) and is set to zero for all other basis vector fields. This prevents erroneous mesh deformations resulting from discretization errors, as outlined and illustrated in [49]. Furthermore, \(\lambda \) and \(\mu \) in (22) denote the Lamé parameters, which do not need to have a physical meaning here. It is more important to understand their effect on the mesh deformation. They enable us to control the stiffness of the material and can thus be interpreted as some sort of step size. In [47], it is observed that locally varying Lamé parameters have a stabilizing effect on the mesh. A good strategy is to choose \(\lambda =0\) and \(\mu \) as the solution of the following Laplace equation

$$\begin{aligned} \begin{aligned} -\Delta \mu&= 0 ~~\quad \quad \text {in } \Omega \\ \mu&= \mu _{\text {max}} \quad \text {on }\Gamma \\ \mu&= \mu _{\text {min}} ~\quad \text {on }\partial \Omega . \end{aligned} \end{aligned}$$
(23)

Thus, \(\mu _\text {min},\mu _\text {max}\in {\mathbb {R}}\) influence the step size of the optimization algorithm; a small step is achieved by choosing a large \(\mu _\text {max}\). Note that \(a_\Gamma \) then depends on the interface \(\Gamma \) through the parameter \(\mu = \mu (\Gamma ) :\Omega (\Gamma ) \rightarrow \mathbb {R}\).
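For intuition, the effect of (23) can be reproduced in a one-dimensional finite-difference sketch: a harmonic \(\mu \) simply interpolates between \(\mu _\text {max}\) at the interface end and \(\mu _\text {min}\) at the outer boundary. This is an illustrative stand-in for the mesh-based solve, not the implementation used here:

```python
import numpy as np

def lame_mu_1d(n, mu_min, mu_max):
    """Solve -mu'' = 0 on (0, 1) with mu(0) = mu_max ("interface" end)
    and mu(1) = mu_min ("outer boundary"), using n interior grid points.
    One-dimensional analogue of the Laplace problem (23)."""
    h = 1.0 / (n + 1)
    # tridiagonal finite-difference Laplacian for the interior unknowns
    A = (np.diag(2.0 * np.ones(n))
         - np.diag(np.ones(n - 1), 1)
         - np.diag(np.ones(n - 1), -1)) / h**2
    b = np.zeros(n)
    b[0] += mu_max / h**2    # Dirichlet datum at x = 0
    b[-1] += mu_min / h**2   # Dirichlet datum at x = 1
    mu_int = np.linalg.solve(A, b)
    return np.concatenate(([mu_max], mu_int, [mu_min]))

mu = lame_mu_1d(9, mu_min=1.0, mu_max=10.0)
```

A harmonic function in one dimension is affine, so the computed \(\mu \) decays linearly from \(\mu _\text {max}\) to \(\mu _\text {min}\); on a mesh the analogous construction yields stiff material (small displacements) near \(\Gamma \) and soft material near \(\partial \Omega \).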

Algorithm 1

Shape optimization algorithm

How to perform the limited-memory L-BFGS update in Line 13 of Algorithm 1 within the shape formalism is investigated in [49, Section 4]. Here, we only mention that the vector transport examined therein is approximated by the identity operator, so that we finally treat the gradients \(\textbf{U}_k:\Omega _k \rightarrow \mathbb {R}^d\) as vectors in \(\mathbb {R}^{d|\Omega _k|}\) and implement the standard L-BFGS update [47, Section 5]. For the sufficient decrease condition in Line 18 a small value for c, e.g., \(c= 10^{-4}\), is suggested in [40].
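With the vector transport approximated by the identity, the limited-memory update reduces to the standard Euclidean two-loop recursion on the flattened gradient vectors. A sketch, with the curvature pairs \(s_k, y_k\) stored oldest first:

```python
import numpy as np

def lbfgs_direction(grad, s_list, y_list):
    """Standard two-loop L-BFGS recursion.
    grad           : current (flattened) gradient vector
    s_list, y_list : stored pairs s_k = x_{k+1} - x_k and
                     y_k = g_{k+1} - g_k, oldest first
    Returns the quasi-Newton descent direction -H_k grad."""
    q = grad.astype(float).copy()
    alphas = []
    for s, y in zip(reversed(s_list), reversed(y_list)):
        rho = 1.0 / (y @ s)
        a = rho * (s @ q)
        alphas.append(a)
        q -= a * y
    if s_list:  # scale by gamma_k * I as initial Hessian approximation
        s, y = s_list[-1], y_list[-1]
        q *= (s @ y) / (y @ y)
    for (s, y), a in zip(zip(s_list, y_list), reversed(alphas)):
        rho = 1.0 / (y @ s)
        b = rho * (y @ q)
        q += (a - b) * s
    return -q

# sanity check on the quadratic f(x) = 0.5 x^T D x
D = np.diag([1.0, 10.0])
x0 = np.array([1.0, 1.0]); g0 = D @ x0
x1 = np.array([0.5, 0.2]); g1 = D @ x1
d = lbfgs_direction(g1, [x1 - x0], [g1 - g0])
```

With an empty memory the recursion returns the negative gradient, i.e., a plain steepest-descent step; with stored pairs it produces a descent direction whenever the curvature conditions \(y_k^\top s_k > 0\) hold.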

4 Shape derivative of the reduced objective functional

In Sect. 3 we have depicted the optimization methodology that we follow in this work to numerically solve the constrained shape optimization problem (12). First, we need the following conclusion from [56, Proposition 2.32].

Lemma 4.1

If \(\gamma \in W^{1,1}({\mathbb {R}^d}\times {\mathbb {R}^d},\mathbb {R})\) and \(\tilde{\textbf{V}} \in C_0^1({\mathbb {R}^d}\times {\mathbb {R}^d}, {\mathbb {R}^d}\times {\mathbb {R}^d})\), then \(t \mapsto \gamma \circ \tilde{\textbf{F}}_{\textbf{t}}\), where \(\tilde{\textbf{F}}_{\textbf{t}}:=(\textbf{x},\textbf{y}) + t\tilde{\textbf{V}}(\textbf{x},\textbf{y})\), is differentiable in \(L^1({\mathbb {R}^d}\times {\mathbb {R}^d},\mathbb {R})\) and its derivative is given by

$$\begin{aligned} \left. \frac{d}{dt}\right| _{t=0} \gamma \circ \tilde{\textbf{F}}_{\textbf{t}}= \nabla \gamma ^T \tilde{\textbf{V}}. \end{aligned}$$

Proof

See proof of [56, Proposition 2.32]. \(\square \)

Remark 4.2

Given a subset \(D \subset {\mathbb {R}^d}\times {\mathbb {R}^d}\) of nonzero measure, we can replace the set \({\mathbb {R}^d}\times {\mathbb {R}^d}\) by D in Lemma 4.1 and the statement still holds, which can be proven by extending functions \(\gamma \in W^{1,1}(D,\mathbb {R})\) and \(\tilde{\textbf{V}} \in C_0^1(D, D)\) by zero to functions \(\hat{\gamma } \in W^{1,1}({\mathbb {R}^d}\times {\mathbb {R}^d}, \mathbb {R})\) and \(\hat{\textbf{V}} \in C_0^1({\mathbb {R}^d}\times {\mathbb {R}^d}, {\mathbb {R}^d}\times {\mathbb {R}^d})\).

In our case, we set \(\tilde{\textbf{V}}(\textbf{x},\textbf{y}) :=\left( \textbf{V}(\textbf{x}),\textbf{V}(\textbf{y}) \right) \) in order to use Lemma 4.1 to derive several derivatives in this section.

In order to verify the requirements of the AAM, we need some additional assumptions.

Assumption (P0):

  • For every \(t \in [0,T]\), there exist unique solutions \(u^t,v^t \in V_c(\Omega \cup \Omega _I)\), such that

    $$\begin{aligned} A(t,u^t,v)&= F(t,v) \text { for all } v\in V_c(\Omega \cup \Omega _I) \text { and } \nonumber \\ A(t,u,v^t)&= \left( -(\frac{1}{2}(u^t + u^0) - {\bar{u}}^t)\xi ^t,u \right) _{L^2(\Omega \cup \Omega _I)} \text { for all } u \in V_c(\Omega \cup \Omega _I), \end{aligned}$$
    (24)

    where \(A(t,u,v)\) and \(F(t,v)\) are defined as in (17), \({\bar{u}}^t(\textbf{x}) :={\bar{u}}(\textbf{F}_{\textbf{t}}(\textbf{x}))\) and

    \(\xi ^t(\textbf{x}) :=\det D\textbf{F}_{\textbf{t}}(\textbf{x})\).

  • Additionally assume that there exists a constant \(0< C_0 < \infty \), such that

    $$\begin{aligned} A(t,u,u) \ge C_0 ||u||^2_{L^2(\Omega )} \text { for all } t \in [0,T] \text { and } u \in V_c(\Omega \cup \Omega _I), \end{aligned}$$

    where A is defined as in (17).

Assumption (P1):

For each class of kernels additional requirements have to hold:

  • Define the sets \(D_n :=\{(\textbf{x},\textbf{y}) \in (\Omega \cup \Omega _I)^2: ||\textbf{x}- \textbf{y}||_2 > \frac{1}{n} \} \text { for } n \in \mathbb {N}.\) Then, singular kernels are assumed to have weak derivatives \(\nabla _\textbf{x}\gamma , \nabla _\textbf{y}\gamma \in L^{1}(D_n,{\mathbb {R}^d})\) for all \(n \in \mathbb {N}\) with

    $$\begin{aligned}&|\nabla _\textbf{x}\gamma _{ij}(\textbf{x},\textbf{y})^\top \textbf{V}(\textbf{x}) + \nabla _{\textbf{y}}\gamma _{ij}(\textbf{x},\textbf{y})^\top \textbf{V}(\textbf{y})|||\textbf{x}- \textbf{y}||_2^{d+2s} \in L^{\infty }(\Omega \times \Omega ) \text { and} \\&|\nabla _\textbf{x}\gamma _{iI}(\textbf{x},\textbf{y})^\top \textbf{V}(\textbf{x})|||\textbf{x}- \textbf{y}||_2^{d+2s} \in L^{\infty }(\Omega \times \Omega _I). \end{aligned}$$
  • Square integrable kernels have to meet the following conditions

    $$\begin{aligned}&\gamma _{ij}, \nabla \gamma _{ij}\in L^\infty (\Omega \times \Omega ) \text { and} \\&\gamma _{iI}, \nabla \gamma _{iI}\in L^\infty (\Omega \times \Omega _I). \end{aligned}$$

Remark 4.3

We recall that there exists a Lipschitz constant \(L > 0\) such that

$$\begin{aligned} ||\textbf{F}_{\textbf{t}}^{-1}(\textbf{x}) - \textbf{F}_{\textbf{t}}^{-1}(\textbf{y})||_2 \le \frac{1}{L} ||\textbf{x}- \textbf{y}||_2 \text { for } \textbf{x},\textbf{y}\in \Omega \cup \Omega _I \text { and } t \in [0,T], \end{aligned}$$

if \(T>0\) is chosen small enough. Consequently we derive

$$\begin{aligned} \gamma ^t(\textbf{x}, \textbf{y}) = \gamma (\textbf{F}_{\textbf{t}}(\textbf{x}),\textbf{F}_{\textbf{t}}(\textbf{y})) \le \frac{\gamma ^*}{||\textbf{F}_{\textbf{t}}(\textbf{x}) - \textbf{F}_{\textbf{t}}(\textbf{y})||_2^{d+2s}} \le \frac{L\gamma ^*}{||\textbf{x}- \textbf{y}||_2^{d+2s}} \text { for } \textbf{x},\textbf{y}\in \Omega \cup \Omega _I. \end{aligned}$$

Therefore \(\gamma ^t (\textbf{x}, \textbf{y})\le \frac{L \gamma ^*}{||\textbf{x}- \textbf{y}||_2^{d+2s}} < L \gamma ^* n^{d+2s}\) for \((\textbf{x},\textbf{y}) \in D_n\), \(t \in [0,T]\), and we get \(\gamma ^t \in W^{1,1}(D_n, \mathbb {R})\) for singular symmetric kernels if Assumption (P1) is fulfilled.

Singular kernels already satisfy Assumption (P0): the nonlocal equations in the first condition are well-posed, which follows from the theory of [24, 61], and the second requirement of (P0) is shown in the following lemma:

Lemma 4.4

In the case of a singular kernel, there exists a constant \(0< C_0 < \infty \), so that

$$\begin{aligned} A(t,u,u) \ge C_0||u||^2_{L^2(\Omega )},\quad \text {for every } t \in [0,T],\ u \in H^s(\Omega ). \end{aligned}$$

Proof

Let \(\varepsilon :=\min \{\varepsilon ^1_1,\varepsilon ^1_2\}\). Applying [24, Lemma 4.3], there exists a constant \(C_* > 0\) for the kernel \(\frac{\gamma _*}{||\textbf{x}- \textbf{y}||_2^{d+2s}}\chi _{B_\varepsilon (\textbf{x})}(\textbf{y})\), s.t.

$$\begin{aligned} C_*||u||^2_{L^2(\Omega )}&\le \iint \limits _{(\Omega \cup \Omega _I)^2} \frac{1}{2} (u(\textbf{x}) - u(\textbf{y}))^2\frac{\gamma _*}{||\textbf{x}- \textbf{y}||_2^{d+2s}}\chi _{B_\varepsilon (\textbf{x})}(\textbf{y})~ d\textbf{y}d\textbf{x}\\&= \iint \limits _{(\textbf{F}_{\textbf{t}}(\Omega ) \cup \Omega _I)^2} \frac{1}{2} (u(\textbf{x}) - u(\textbf{y}))^2\frac{\gamma _*}{||\textbf{x}- \textbf{y}||_2^{d+2s}}\chi _{B_\varepsilon (\textbf{x})}(\textbf{y})~ d\textbf{y}d\textbf{x}\\&\le \iint \limits _{(\textbf{F}_{\textbf{t}}(\Omega ) \cup \Omega _I)^2}\frac{1}{2} (u(\textbf{x}) - u(\textbf{y}))^2\gamma (\textbf{x},\textbf{y})\chi _{B_\varepsilon (\textbf{x})}(\textbf{y})~ d\textbf{y}d\textbf{x}= A_{\Gamma ^t}(u, u). \end{aligned}$$

So we conclude

$$\begin{aligned} C_* ||u \circ \textbf{F}_{\textbf{t}}^{-1}||_{L^2(\Omega )}^2 \le A_{\Gamma ^t}(u \circ \textbf{F}_{\textbf{t}}^{-1},u \circ \textbf{F}_{\textbf{t}}^{-1}) = A(t,u,u). \end{aligned}$$

Since T is chosen small enough, \([0,T] \times {\bar{\Omega }}\) is a compact set and \(\xi ^t\) is continuous on \([0,T] \times {\bar{\Omega }}\), there exists \(\xi _* > 0\), s.t. \(\xi ^t(\textbf{x}) \ge \xi _*\) for every \(t \in [0,T]\) and \(\textbf{x}\in {\bar{\Omega }}\). Therefore, by using that \(\textbf{F}_{\textbf{t}}(\Omega )=\Omega \), we derive

$$\begin{aligned} ||u \circ \textbf{F}_{\textbf{t}}^{-1}||_{L^2(\Omega )}^2&= \int _\Omega (u \circ \textbf{F}_{\textbf{t}}^{-1})^2 ~d\textbf{x}= \int _{\textbf{F}_{\textbf{t}}(\Omega )} (u \circ \textbf{F}_{\textbf{t}}^{-1})^2 ~d\textbf{x}\\&= \int _\Omega u^2 \xi ^t~d\textbf{x}\ge \xi _* \int _\Omega u^2 ~d\textbf{x}= \xi _* ||u||_{L^2(\Omega )}^2. \end{aligned}$$

\(\square \)

In the following we verify that Assumption (P1) holds for a standard example of a singular symmetric kernel.

Example 4.5

For \(\gamma (\textbf{x},\textbf{y})= \frac{\sigma (\textbf{x},\textbf{y})}{||\textbf{x}-\textbf{y}||_2^{d+2s}}\chi _{B_\varepsilon (\textbf{x})}(\textbf{y})\) of Example 2.1, where additionally there exists a constant \(\sigma ^* \in (0,\infty )\) with \({|\nabla _\textbf{x}\sigma |,|\nabla _\textbf{y}\sigma | \le \sigma ^*}\), Assumption (P1) holds, since \(\Omega \cup \Omega _I\) is a bounded domain and

$$\begin{aligned}&|\nabla _\textbf{x}\gamma (\textbf{x},\textbf{y})^\top \textbf{V}(\textbf{x}) + \nabla _\textbf{y}\gamma (\textbf{x},\textbf{y})^\top \textbf{V}(\textbf{y})|||\textbf{x}- \textbf{y}||_2^{d+2s}\\&\quad \le |\sigma (\textbf{x},\textbf{y})\frac{(\textbf{x}-\textbf{y})^\top (\textbf{V}(\textbf{x}) - \textbf{V}(\textbf{y}))}{||\textbf{x}-\textbf{y}||_2^2}| + |\nabla _\textbf{x}\sigma (\textbf{x},\textbf{y})^\top \textbf{V}(\textbf{x}) + \nabla _\textbf{y}\sigma (\textbf{x},\textbf{y})^\top \textbf{V}(\textbf{y})| \\&\quad \le L \gamma ^* + 2\sigma ^*\textbf{V}^* < \infty , \end{aligned}$$

where we used that \(\textbf{V}\in C_0^1(\Omega \cup \Omega _I,\mathbb {R}^d)\) is Lipschitz continuous with some Lipschitz constant \(L > 0\) and that there exists a \(\textbf{V}^*>0\) with \(|\textbf{V}(\textbf{x})| \le \textbf{V}^* \) for \(\textbf{x}\in \Omega \cup \Omega _I\).
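The boundedness asserted in Example 4.5 can also be spot-checked numerically. The sketch below uses hypothetical smooth choices for \(\sigma \) and \(\textbf{V}\) (not those of our experiments), approximates the kernel gradients by central differences, and evaluates the weighted expression at well-separated random point pairs:

```python
import numpy as np

d, s = 2, 0.4   # illustrative dimension and fractional order

def sigma(x, y):                       # smooth bounded coefficient
    return 2.0 + np.sin(x[0] + y[1])

def V(x):                              # smooth vector field, Lipschitz constant <= 1
    return np.array([np.sin(x[0]), np.cos(x[1])])

def gamma(x, y):                       # kernel of Example 4.5 (cutoff omitted)
    return sigma(x, y) / np.linalg.norm(x - y) ** (d + 2 * s)

def weighted_grad_term(x, y, h=1e-6):
    """|grad_x gamma(x,y)^T V(x) + grad_y gamma(x,y)^T V(y)| * |x-y|^(d+2s),
    with the gradients approximated by central differences."""
    g = 0.0
    for i in range(d):
        e = np.zeros(d); e[i] = h
        g += (gamma(x + e, y) - gamma(x - e, y)) / (2 * h) * V(x)[i]
        g += (gamma(x, y + e) - gamma(x, y - e)) / (2 * h) * V(y)[i]
    return abs(g) * np.linalg.norm(x - y) ** (d + 2 * s)

rng = np.random.default_rng(1)
vals = [weighted_grad_term(rng.uniform(-1.0, -0.2, d),
                           rng.uniform(0.2, 1.0, d))
        for _ in range(200)]
```

For these choices the estimate of Example 4.5 predicts a uniform bound of roughly \((d+2s)\sigma ^{\max }\,\mathrm {Lip}(\textbf{V}) + 2\sigma ^*\textbf{V}^*\), and the sampled values stay well below it.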

Now we can show that the additional requirements of the AAM are satisfied by problem (12):

Lemma 4.6

Let G be defined as in (17) and let the assumptions (P0) and (P1) be fulfilled. Then the assumptions (H0) and (H1) are satisfied and for every \(t \in [0,T]\) there exists a solution \({v^t \in V_c(\Omega \cup \Omega _I)}\) that solves the averaged adjoint equation (18).

Proof

Because of the length of the proof, we move it to Appendix A. \(\square \)

As a direct consequence of Theorem 3.2 and Lemma 4.6 we can state the next corollary.

Corollary 4.7

Let G be defined as in (17) and let (P0) and (P1) be fulfilled. Then the reduced cost functional of (12) is shape differentiable and its shape derivative can be expressed as

$$\begin{aligned} D_{\Gamma }J^{red}(\Gamma )[\textbf{V}] = \partial _tG(0, u^0, v^0), \end{aligned}$$

where \(u^0\) is the solution to the state equation (11) and \(v^0\) the solution to the adjoint equation (19).

The missing piece to implement the respective algorithmic realization presented in Sect. 3.3 is the shape derivative of the reduced objective functional, which is used in Line 7 of Algorithm 1 and given by

$$\begin{aligned} D_\Gamma J^{red}(\Gamma )[\textbf{V}] = \partial _t G(0,u^0,v^0) = \left. \frac{d}{dt}\right| _{t=0^+} J(t, u^0) + \left. \frac{d}{dt}\right| _{t=0^+} {A}(t, u^0, v^0) - \left. \frac{d}{dt}\right| _{t=0^+} F(t, v^0). \end{aligned}$$
(25)

As a first step, we formulate the shape derivative of the objective functional J and the linear functional F, which can also be found in the standard literature.

Theorem 4.8

(Shape derivative of the reduced objective functional) Let the assumptions (P0) and (P1) be satisfied. Further let \(\Gamma \) be a shape with corresponding state variable \(u^0\) and adjoint variable \(v^0\). Then, for a vector field \(\textbf{V}\in C_0^1(\Omega \cup \Omega _I,\mathbb {R}^d)\) we find

$$\begin{aligned} \begin{aligned} D_\Gamma J^{red}(\Gamma )[\textbf{V}]&= \int _{\Omega } -(u^0 - {\bar{u}})\nabla {\bar{u}}^\top \textbf{V}+ \frac{1}{2}(u^0 - {\bar{u}})^2 {{\,\textrm{div}\,}}\textbf{V}~d\textbf{x}\\&\quad + \nu \int _\Gamma {{\,\textrm{div}\,}}\textbf{V}- \textbf{n}^\top \nabla \textbf{V}^\top \textbf{n}~ds\\&\quad - \int _\Omega D_m f_\Gamma v^0 + {{\,\textrm{div}\,}}\textbf{V}(f_\Gamma v^0)~d\textbf{x}+ D_\Gamma {A}_{\Gamma }(u^0, v^0) [\textbf{V}]. \end{aligned} \end{aligned}$$
(26)

Proof

In order to prove this theorem, we just have to compute the shape derivative of the objective function \(J(u^0,\Gamma )\) and of the linear functional \(F_{\Gamma }(v^0)\). Therefore, let \(\xi ^t(\textbf{x}) :=\det D\textbf{F}_{\textbf{t}}(\textbf{x})\).

Then, we have \(\xi ^0(\textbf{x})= \det D\textbf{F}_{\textbf{0}}(\textbf{x})= \det (\textbf{I})=1\) and \(\left. \frac{d}{dt}\right| _{t = 0^+} \xi ^t= {{\,\textrm{div}\,}}\textbf{V}\) (see, e.g., [45]), such that the shape derivative of the right-hand side \(F_\Gamma \) can be derived as a consequence of [56, Proposition 2.32] and the product rule for Fréchet derivatives as follows

$$\begin{aligned} D_\Gamma F_\Gamma (v^0)[\textbf{V}]&= \left. \frac{d}{dt}\right| _{t = 0^+} F_{\Gamma ^t}(v^0 \circ \textbf{F}_{\textbf{t}}^{-1}) = \int _\Omega \left. \frac{d}{dt}\right| _{t = 0^+} (f_\Gamma \circ \textbf{F}_{\textbf{t}}) v^0 \xi ^t~d\textbf{x}\\&= \int _{\Omega } D_m f_\Gamma v^0 ~d \textbf{x}+\int _{\Omega }f_\Gamma v^0 ~{{\,\textrm{div}\,}}\textbf{V}~d\textbf{x}. \end{aligned}$$

Moreover, the shape derivative of the objective functional can be written as

$$\begin{aligned} D_{\Gamma }J(u^0, \Gamma )[\textbf{V}]&= D_{\Gamma }j(u^0, \Gamma )[\textbf{V}] + D_{\Gamma }j_{reg}(\Gamma )[\textbf{V}] = \left. \frac{d}{dt} \right| _{t = 0^+} j(u^0 \circ \textbf{F}_{\textbf{t}}^{-1}, \Gamma ^t) \\&\quad + \left. \frac{d}{dt} \right| _{t = 0^+} j_{reg}(\Gamma ^t). \end{aligned}$$

Here the shape derivative of the regularization term is an immediate consequence of [62, Theorem 4.13] and is given by

$$\begin{aligned} D_{\Gamma }j_{reg}(\Gamma )[\textbf{V}] = \nu \int _{{\Gamma }}{{\,\textrm{div}\,}}_{\Gamma } \textbf{V}~ds = \nu \int _{{\Gamma }}{{\,\textrm{div}\,}}\textbf{V}- \textbf{n}^\top \nabla \textbf{V}^\top \textbf{n}~ds, \end{aligned}$$

where \(\textbf{n}\) denotes the outer normal of \(\Omega _1\). Additionally, we obtain for the shape derivative of the tracking-type functional

$$\begin{aligned} D_\Gamma j(u^0,\Gamma )[\textbf{V}]&= \left. \frac{d}{dt}\right| _{t = 0^+} j(u^0 \circ \textbf{F}_{\textbf{t}}^{-1}, \Gamma ^t) = \frac{1}{2} \left. \frac{d}{dt}\right| _{t = 0^+} \int _{\textbf{F}_{\textbf{t}}(\Omega )} (u^0 \circ \textbf{F}_{\textbf{t}}^{-1} - {\bar{u}})^2 ~d\textbf{x}\\&= \frac{1}{2} \int _{\Omega } \left. \frac{d}{dt}\right| _{t = 0^+} (u^0 - {\bar{u}} \circ \textbf{F}_{\textbf{t}})^2 \xi ^t~d\textbf{x}\\&= \int _{\Omega } -(u^0 - {\bar{u}})\nabla {\bar{u}}^\top \textbf{V}+ \frac{1}{2}(u^0 - {\bar{u}})^2 {{\,\textrm{div}\,}}\textbf{V}~d\textbf{x}. \end{aligned}$$

Putting the above terms into equation (25) yields the formula of Theorem 4.8. \(\square \)
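The identity \(\left. \frac{d}{dt}\right| _{t=0} \xi ^t = {{\,\textrm{div}\,}}\textbf{V}\) used throughout the proof is easy to confirm numerically for the perturbation of identity \(\textbf{F}_{\textbf{t}}(\textbf{x}) = \textbf{x}+ t\textbf{V}(\textbf{x})\); the vector field below is a hypothetical smooth example:

```python
import numpy as np

def V(x):                 # smooth test vector field on R^2
    return np.array([np.sin(x[1]), x[0] * x[1]])

def divV(x):              # d/dx0 sin(x1) + d/dx1 (x0*x1) = 0 + x0
    return x[0]

def jacobian(f, x, h=1e-6):
    """Central-difference Jacobian of f at x."""
    n = x.size
    J = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n); e[j] = h
        J[:, j] = (f(x + e) - f(x - e)) / (2 * h)
    return J

x = np.array([0.3, 0.7])
t = 1e-6
xi_t = np.linalg.det(np.eye(2) + t * jacobian(V, x))  # det DF_t(x)
deriv = (xi_t - 1.0) / t    # difference quotient for d/dt|_{t=0} xi^t
```

Since \(\det (\textbf{I} + t\,D\textbf{V}) = 1 + t\,\textrm{tr}(D\textbf{V}) + \mathcal {O}(t^2)\), the difference quotient reproduces \({{\,\textrm{div}\,}}\textbf{V}(\textbf{x})\) up to \(\mathcal {O}(t)\).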

The last step to derive the shape derivative of the reduced objective functional (25) is to compute the shape derivative of the nonlocal bilinear form \({A}_\Gamma \).

Lemma 4.9

(Shape derivative of the nonlocal bilinear form) Let the assumptions (P0) and (P1) be satisfied. Further let \(\Gamma \) be a shape with corresponding state variable \(u^0\) and adjoint variable \(v^0\). Then for a vector field \(\textbf{V}\in C_0^1(\Omega \cup \Omega _I,\mathbb {R}^d)\) we find for a square integrable kernel \(\gamma \) that

$$\begin{aligned} \begin{aligned}&\left. \frac{d}{dt}\right| _{t=0^+} {A}(t, u^0, v^0) = D_\Gamma {A}_{\Gamma }(u^0, v^0) [\textbf{V}]\\&\quad = \sum _{i,j=1,2} \int _{\Omega _i}\int _{\Omega _j} \left( v^0(\textbf{x}) - v^0(\textbf{y}) \right) \left( u^0(\textbf{x})\nabla _x \gamma _{ij}(\textbf{x},\textbf{y}) - u^0(\textbf{y})\nabla _y \gamma _{ji}(\textbf{y},\textbf{x}) \right) ^\top \textbf{V}(\textbf{x}) \\&\qquad + (v^0(\textbf{x}) - v^0(\textbf{y}))(u^0(\textbf{x})\gamma _{ij}(\textbf{x},\textbf{y}) - u^0(\textbf{y})\gamma _{ji}(\textbf{y},\textbf{x})){{\,\textrm{div}\,}}\textbf{V}(\textbf{x}) ~d\textbf{y}d\textbf{x}\\&\qquad + \sum _{i=1,2} \int _{\Omega _i}\int _{\Omega _I} u^0(\textbf{x})v^0(\textbf{x}) \nabla _x \gamma _{iI}(\textbf{x},\textbf{y})^\top \textbf{V}(\textbf{x}) + u^0(\textbf{x})v^0(\textbf{x}) \gamma _{iI}(\textbf{x},\textbf{y}) {{\,\textrm{div}\,}}\textbf{V}(\textbf{x}) ~d\textbf{y}d\textbf{x}. \end{aligned}\nonumber \\ \end{aligned}$$
(27)

and for a singular kernel \(\gamma \) that

$$\begin{aligned}&D_\Gamma {A}_{\Gamma }(u^0, v^0) [\textbf{V}]\\&\quad = \sum _{i,j=1,2}\frac{1}{2} \int _{\Omega _i}\int _{\Omega _j}(u^0(\textbf{x}) -u^0(\textbf{y}))(v^0(\textbf{x})\\&\qquad - v^0(\textbf{y}))\left( \nabla _{\textbf{x}}\gamma _{ij}(\textbf{x},\textbf{y})^\top \textbf{V}(\textbf{x}) + \nabla _{\textbf{y}}\gamma _{ij}(\textbf{x},\textbf{y})^\top \textbf{V}(\textbf{y}) \right) ~d\textbf{y}d\textbf{x}\\&\qquad + \sum _{i,j=1,2}~ \int _{\Omega _i}\int _{\Omega _j}(u^0(\textbf{x}) -u^0(\textbf{y}))(v^0(\textbf{x}) - v^0(\textbf{y}))\gamma _{ij}(\textbf{x},\textbf{y}) {{\,\textrm{div}\,}}\textbf{V}(\textbf{x}) ~d\textbf{y}d\textbf{x}\\&\qquad +\sum _{i=1,2} \int _{\Omega _i}\int _{\Omega _I}(u^0(\textbf{x}) -u^0(\textbf{y}))(v^0(\textbf{x})\\&\qquad - v^0(\textbf{y}))\left( \nabla _{\textbf{x}}\gamma _{iI}(\textbf{x},\textbf{y})^\top \textbf{V}(\textbf{x}) + \gamma _{iI}(\textbf{x},\textbf{y}){{\,\textrm{div}\,}}\textbf{V}(\textbf{x}) \right) ~d\textbf{y}d\textbf{x}. \end{aligned}$$

Proof

Define \(\xi ^t(\textbf{x}):=\det D\textbf{F}_{\textbf{t}}(\textbf{x})\) and \(\gamma _{ij}^t(\textbf{x},\textbf{y}) :=\gamma _{ij}(\textbf{F}_{\textbf{t}}(\textbf{x}), \textbf{F}_{\textbf{t}}(\textbf{y}))\).

Case 1: Square integrable kernels

Using representation (8) of the nonlocal bilinear form \({A}\), we can write

$$\begin{aligned}&{A}(t, u^0, v^0) = {A}_{\Gamma ^t}(u^0 \circ \textbf{F}_{\textbf{t}}^{-1},v^0 \circ \textbf{F}_{\textbf{t}}^{-1})\\&\quad =\frac{1}{2} \sum _{i,j=1,2} \int _{\Omega _i}\int _{\Omega _j} \left( v^0(\textbf{x}) - v^0(\textbf{y}) \right) \left( u^0(\textbf{x})\gamma _{ij}^t(\textbf{x},\textbf{y}) - u^0(\textbf{y})\gamma _{ji}^t(\textbf{y},\textbf{x}) \right) \\&\qquad \xi ^t(\textbf{x})\xi ^t(\textbf{y})~d\textbf{y}d\textbf{x}\\&\qquad + \sum _{i=1,2} \int _{\Omega _i}\int _{\Omega _I} u^0(\textbf{x})v^0(\textbf{x})\gamma _{iI}(\textbf{F}_{\textbf{t}}(\textbf{x}),\textbf{y})\xi ^t(\textbf{x})~d\textbf{y}d\textbf{x}. \end{aligned}$$

First, we recall that the function \(\xi ^t\) is continuously differentiable and \(\left. \frac{d}{dt}\right| _{t=0} \xi ^t(\textbf{x}) = {{\,\textrm{div}\,}}\textbf{V}(\textbf{x})\); see, e.g., [45].

So we derive the shape derivative of the nonlocal bilinear form by applying Lemma 4.1, as described in Remark 4.2, to \(\gamma _{ij}\in W^{1,1}(\Omega \times \Omega ,\mathbb {R})\) and \(\gamma _{iI}\in W^{1,1}(\Omega \times \Omega _I,\mathbb {R})\) and by using the product rule for Fréchet derivatives

$$\begin{aligned}&\left. \frac{d}{dt}\right| _{t=0^+} {A}(t, u^0, v^0)\\&\quad = \frac{1}{2} \sum _{i,j=1,2} \int _{\Omega _i}\int _{\Omega _j} \left( v^0(\textbf{x}) - v^0(\textbf{y}) \right) \left( u^0(\textbf{x})\nabla _x \gamma _{ij}(\textbf{x},\textbf{y}) - u^0(\textbf{y})\nabla _y \gamma _{ji}(\textbf{y},\textbf{x}) \right) ^\top \textbf{V}(\textbf{x}) \\&\qquad + \left( v^0(\textbf{x}) - v^0(\textbf{y}) \right) \left( u^0(\textbf{x})\nabla _y \gamma _{ij}(\textbf{x},\textbf{y}) - u^0(\textbf{y})\nabla _x \gamma _{ji}(\textbf{y},\textbf{x}) \right) ^\top \textbf{V}(\textbf{y}) \\&\qquad +(v^0(\textbf{x}) - v^0(\textbf{y}))(u^0(\textbf{x})\gamma _{ij}(\textbf{x},\textbf{y}) - u^0(\textbf{y})\gamma _{ji}(\textbf{y},\textbf{x}))({{\,\textrm{div}\,}}\textbf{V}(\textbf{x}) + {{\,\textrm{div}\,}}\textbf{V}(\textbf{y}))~d\textbf{y}d\textbf{x}\\&\qquad + \sum _{i=1,2} \int _{\Omega _i}\int _{\Omega _I} u^0(\textbf{x})v^0(\textbf{x}) \nabla _x \gamma _{iI}(\textbf{x},\textbf{y})^\top \textbf{V}(\textbf{x}) + u^0(\textbf{x})v^0(\textbf{x}) \gamma _{iI}(\textbf{x},\textbf{y}) {{\,\textrm{div}\,}}\textbf{V}(\textbf{x}) ~d\textbf{y}d\textbf{x}\\&\quad = \sum _{i,j=1,2} \int _{\Omega _i}\int _{\Omega _j} \left( v^0(\textbf{x}) - v^0(\textbf{y}) \right) \left( u^0(\textbf{x})\nabla _x \gamma _{ij}(\textbf{x},\textbf{y}) - u^0(\textbf{y})\nabla _y \gamma _{ji}(\textbf{y},\textbf{x}) \right) ^\top \textbf{V}(\textbf{x}) \\&\qquad + (v^0(\textbf{x}) - v^0(\textbf{y}))(u^0(\textbf{x})\gamma _{ij}(\textbf{x},\textbf{y}) - u^0(\textbf{y})\gamma _{ji}(\textbf{y},\textbf{x})){{\,\textrm{div}\,}}\textbf{V}(\textbf{x})~d\textbf{y}d\textbf{x}\\&\qquad + \sum _{i=1,2} \int _{\Omega _i}\int _{\Omega _I} u^0(\textbf{x})v^0(\textbf{x})\nabla _x \gamma _{iI}(\textbf{x},\textbf{y})^\top \textbf{V}(\textbf{x}) + u^0(\textbf{x})v^0(\textbf{x})\gamma _{iI}(\textbf{x},\textbf{y}) {{\,\textrm{div}\,}}\textbf{V}(\textbf{x}) ~d\textbf{y}d\textbf{x}. \end{aligned}$$

For the second equation, the following computations are used, which can be obtained by applying Fubini’s theorem and by swapping \(\textbf{x}\) and \(\textbf{y}\)

$$\begin{aligned}&\int _{\Omega _i} \int _{\Omega _j} (v^0(\textbf{x})-v^0(\textbf{y}))(- u^0(\textbf{y})\nabla _x\gamma _{ji}(\textbf{y},\textbf{x})^\top \textbf{V}(\textbf{y})) ~d\textbf{y}d\textbf{x}\\&\quad = \int _{\Omega _j} \int _{\Omega _i} (v^0(\textbf{x})-v^0(\textbf{y}))(u^0(\textbf{x})\nabla _x\gamma _{ji}(\textbf{x},\textbf{y})^\top \textbf{V}(\textbf{x})) ~d\textbf{y}d\textbf{x}, \\&\int _{\Omega _i} \int _{\Omega _j}(v^0(\textbf{x})-v^0(\textbf{y}))u^0(\textbf{x})\nabla _y\gamma _{ij}(\textbf{x},\textbf{y})^\top \textbf{V}(\textbf{y}) ~d\textbf{y}d\textbf{x}\\&\quad = -\int _{\Omega _j} \int _{\Omega _i}(v^0(\textbf{x})-v^0(\textbf{y}))u^0(\textbf{y})\nabla _y\gamma _{ij}(\textbf{y},\textbf{x})^\top \textbf{V}(\textbf{x}) ~d\textbf{y}d\textbf{x}\text { and }\\ \\&\int _{\Omega _i} \int _{\Omega _j}(v^0(\textbf{x})-v^0(\textbf{y}))(u^0(\textbf{x})\gamma _{ij}(\textbf{x},\textbf{y})-u^0(\textbf{y})\gamma _{ji}(\textbf{y},\textbf{x})){{\,\textrm{div}\,}}\textbf{V}(\textbf{y})~d\textbf{y}d\textbf{x}\\&\quad = \int _{\Omega _j} \int _{\Omega _i}(v^0(\textbf{x})-v^0(\textbf{y}))(u^0(\textbf{x})\gamma _{ji}(\textbf{x},\textbf{y}) - u^0(\textbf{y})\gamma _{ij}(\textbf{y},\textbf{x})){{\,\textrm{div}\,}}\textbf{V}(\textbf{x}) ~d\textbf{y}d\textbf{x}. \end{aligned}$$

Case 2: Singular kernels

Since \(\xi ^t(\textbf{x}) = \det D\textbf{F}_{\textbf{t}}(\textbf{x})\) is continuous and therefore bounded on \(\Omega \cup \Omega _I\), we get that the double integral

$$\begin{aligned} A(t,u^0,v^0) = \frac{1}{2}\iint \limits _{(\Omega \cup \Omega _I)^2} (u^0(\textbf{x}) - u^0(\textbf{y}))(v^0(\textbf{x}) - v^0(\textbf{y}))\gamma ^t(\textbf{x},\textbf{y})\xi ^t(\textbf{x})\xi ^t(\textbf{y}) ~d\textbf{y}d\textbf{x}\end{aligned}$$

is well-defined (see also Remark 4.3). Moreover, since \(\gamma ^t \in W^{1,1}(D_n, \mathbb {R})\) we can conclude, as outlined in Remark 4.2, that the function \(\gamma ^t\) is differentiable in \(L^1(D_n,\mathbb {R})\) with

$$\begin{aligned} \left. \frac{d}{dt}\right| _{t=0}\gamma ^t(\textbf{x},\textbf{y}) = \nabla _{\textbf{x}} \gamma (\textbf{x},\textbf{y})^\top \textbf{V}(\textbf{x}) + \nabla _{\textbf{y}} \gamma (\textbf{x},\textbf{y})^\top \textbf{V}(\textbf{y}). \end{aligned}$$

Therefore, we obtain for the singular symmetric kernel

$$\begin{aligned}&\left. \frac{d}{dt} \right| _{t=0} A(t,u^0,v^0) = \lim _{n \rightarrow \infty } \left. \frac{d}{dt} \right| _{t=0} \frac{1}{2}\iint \limits _{D_n} (u^0(\textbf{x}) - u^0(\textbf{y}))(v^0(\textbf{x})\nonumber \\&\qquad - v^0(\textbf{y}))\gamma ^t(\textbf{x},\textbf{y})\xi ^t(\textbf{x})\xi ^t(\textbf{y}) ~d\textbf{y}d\textbf{x}\nonumber \\&\quad = \lim _{n \rightarrow \infty } \frac{1}{2}\iint \limits _{D_n} (u^0(\textbf{x}) -u^0(\textbf{y}))(v^0(\textbf{x}) - v^0(\textbf{y}))\nonumber \\&\qquad \left( \nabla _{\textbf{x}}\gamma (\textbf{x},\textbf{y})^\top \textbf{V}(\textbf{x}) + \nabla _{\textbf{y}}\gamma (\textbf{x},\textbf{y})^\top \textbf{V}(\textbf{y}) \right) ~d\textbf{y}d\textbf{x}\nonumber \\&\qquad + \lim _{n \rightarrow \infty } \frac{1}{2} \iint \limits _{D_n} (u^0(\textbf{x}) -u^0(\textbf{y}))(v^0(\textbf{x}) - v^0(\textbf{y}))\gamma (\textbf{x},\textbf{y}) \left( {{\,\textrm{div}\,}}\textbf{V}(\textbf{x}) + {{\,\textrm{div}\,}}\textbf{V}(\textbf{y}) \right) ~d\textbf{y}d\textbf{x}\nonumber \\&\quad = \frac{1}{2}\iint \limits _{\left( \Omega \cup \Omega _I\right) ^2} (u^0(\textbf{x}) -u^0(\textbf{y}))(v^0(\textbf{x}) - v^0(\textbf{y}))\left( \nabla _{\textbf{x}}\gamma (\textbf{x},\textbf{y})^\top \textbf{V}(\textbf{x}) + \nabla _{\textbf{y}}\gamma (\textbf{x},\textbf{y})^\top \textbf{V}(\textbf{y}) \right) ~d\textbf{y}d\textbf{x}\nonumber \\&\qquad + \frac{1}{2} \iint \limits _{\left( \Omega \cup \Omega _I\right) ^2} (u^0(\textbf{x}) -u^0(\textbf{y}))(v^0(\textbf{x}) - v^0(\textbf{y}))\gamma (\textbf{x},\textbf{y}) \left( {{\,\textrm{div}\,}}\textbf{V}(\textbf{x}) + {{\,\textrm{div}\,}}\textbf{V}(\textbf{y}) \right) ~d\textbf{y}d\textbf{x}\nonumber \\&\quad = \sum _{i,j=1,2} \frac{1}{2} \int _{\Omega _i}\int _{\Omega _j}(u^0(\textbf{x}) -u^0(\textbf{y}))(v^0(\textbf{x}) - v^0(\textbf{y}))\nonumber \\&\qquad \left( \nabla _{\textbf{x}}\gamma _{ij}(\textbf{x},\textbf{y})^\top \textbf{V}(\textbf{x}) + \nabla _{\textbf{y}}\gamma _{ij}(\textbf{x},\textbf{y})^\top \textbf{V}(\textbf{y}) \right) ~d\textbf{y}d\textbf{x} \end{aligned}$$
(28)
$$\begin{aligned}&\quad + \sum _{i,j=1,2}~ \int _{\Omega _i}\int _{\Omega _j}(u^0(\textbf{x}) -u^0(\textbf{y}))(v^0(\textbf{x}) - v^0(\textbf{y}))\gamma _{ij}(\textbf{x},\textbf{y}) {{\,\textrm{div}\,}}\textbf{V}(\textbf{x}) ~d\textbf{y}d\textbf{x}\end{aligned}$$
(29)
$$\begin{aligned}&\quad \quad +\sum _{i=1,2} \int _{\Omega _i}\int _{\Omega _I}(u^0(\textbf{x}) - u^0(\textbf{y}))(v^0(\textbf{x}) - v^0(\textbf{y}))\nonumber \\&\quad \quad \left( \nabla _{\textbf{x}}\gamma _{iI}(\textbf{x},\textbf{y})^\top \textbf{V}(\textbf{x}) + \gamma _{iI}(\textbf{x},\textbf{y}){{\,\textrm{div}\,}}\textbf{V}(\textbf{x}) \right) ~d\textbf{y}d\textbf{x}, \end{aligned}$$
(30)

where the integrals (28)–(30) are also well-defined, since \({{\,\textrm{div}\,}}\textbf{V}\) is bounded on \(\Omega \) and Assumption (P1) yields the existence of derivatives \(\nabla \gamma _{ij}\) and \(\nabla \gamma _{iI}\) and of constants \(0 \le C_1,C_2 < \infty \) with

$$\begin{aligned}&|\nabla _{\textbf{x}}\gamma _{ij}(\textbf{x},\textbf{y})^\top \textbf{V}(\textbf{x}) {+} \nabla _{\textbf{y}}\gamma _{ij}(\textbf{x},\textbf{y})^\top \textbf{V}(\textbf{y})| \le \frac{C_1}{||\textbf{x}{-} \textbf{y}||_2^{d+2s}} \text { for a.e. } (\textbf{x},\textbf{y}) \in \Omega \times \Omega \text { and }\\&|\nabla _{\textbf{x}}\gamma _{iI}(\textbf{x},\textbf{y})^\top \textbf{V}(\textbf{x})| \le \frac{C_2}{||\textbf{x}- \textbf{y}||_2^{d+2s}} \text { for a.e. } (\textbf{x},\textbf{y}) \in \Omega \times \Omega _I. \end{aligned}$$

\(\square \)

5 Numerical experiments

In this section, we put the formula (26) for the shape derivative of the reduced objective functional, derived above, into numerical practice. In the following numerical examples we test one singular symmetric kernel and one nonsymmetric square integrable kernel. Specifically,

$$\begin{aligned} \gamma ^{sym}_\Gamma (\textbf{x},\textbf{y}) = \phi ^{sym}(\textbf{x},\textbf{y}) \chi _{B_{\delta }(\textbf{x})}(\textbf{y}), \end{aligned}$$

where

$$\begin{aligned} \phi ^{sym}(\textbf{x},\textbf{y}) = {\left\{ \begin{array}{ll} 100 d_{\delta }\frac{1}{||\textbf{x}- \textbf{y}||_2^{2+2s}} &{} \text {if } (\textbf{x},\textbf{y}) \in \Omega _1 \times \Omega _1, \\ 1.0 d_{\delta }\frac{1}{||\textbf{x}- \textbf{y}||_2^{2+2s}} &{} \text {if } (\textbf{x},\textbf{y}) \in \Omega _2 \times \Omega _2, \\ 10 d_{\delta }\frac{1}{||\textbf{x}- \textbf{y}||_2^{2+2s}} &{} \text {else}, \end{array}\right. } \end{aligned}$$

with scaling constant \(d_{\delta }:=\frac{2-2s}{\pi \delta ^{2-2s}}\) and

$$\begin{aligned} \gamma ^{nonsym}_\Gamma (\textbf{x},\textbf{y}) = \phi ^{nonsym}(\textbf{x},\textbf{y}) \chi _{B_{\delta }(\textbf{x})}(\textbf{y}), \end{aligned}$$

where

$$\begin{aligned} \phi ^{nonsym}(\textbf{x},\textbf{y})&= {\left\{ \begin{array}{ll} 5.0 c_\delta &{} \text {if } \textbf{x}\in \Omega _1,\\ 3.0 c_\delta &{} \text {if } \textbf{x}\in \Omega _2, \end{array}\right. } \end{aligned}$$

with scaling constant \(c_\delta :=\frac{1}{\delta ^4}\). We truncate all kernel functions by \(\Vert \cdot \Vert _2\)-balls of radius \(\delta = 0.1\), so that \(\Omega \cup \Omega _I\subset [-\delta ,1+\delta ]^2\). As a right-hand side we choose the piecewise constant function

$$\begin{aligned} f_\Gamma (\textbf{x}) = 100\chi _{\Omega _1}(\textbf{x}) -10\chi _{\Omega _2}(\textbf{x}), \end{aligned}$$

i.e., \(f_1 = 100\) and \(f_2 = -10\). We note that the nonsymmetric kernel \(\gamma ^{nonsym}\) satisfies the conditions for the class of integrable kernels considered in [61], so that the corresponding nonlocal problem is well-posed; assumptions (P0) and (P1) can also easily be verified in this case. The symmetric kernel \(\gamma ^{sym}\) is a special case of Example 2.1 and therefore meets assumptions (P0) and (P1). The well-posedness of the nonlocal problem for the singular kernel is shown in [24]. As a perimeter regularization we choose \(\nu = 0.001\). Since we only utilize \(\textbf{V}\) with \({\text {supp}(\textbf{V}) \cap \Gamma _k \ne \emptyset }\), where \(\Gamma _k\) is the current interface in iteration k of Algorithm 1, we additionally assume that the nonlocal boundary has no direct influence on the shape derivative of the nonlocal bilinear form \(D_\Gamma {A}_\Gamma \). Thus, for all \(\textbf{V}\in C_0^1(\Omega \cup \Omega _I,\mathbb {R}^2)\) with \({\text {supp}(\textbf{V}) \cap \Gamma _k \ne \emptyset }\) we obtain for the square integrable kernel

$$\begin{aligned} D_\Gamma {A}_{\Gamma }(u^0, v^0) [\textbf{V}]&= \sum _{i,j=1,2}~ \iint \limits _{\Omega _i\times \Omega _j} \left( v^0(\textbf{x}) - v^0(\textbf{y}) \right) \\&\quad \left( u^0(\textbf{x})\nabla _x \gamma _{ij}(\textbf{x},\textbf{y}) - u^0(\textbf{y})\nabla _y \gamma _{ji}(\textbf{y},\textbf{x}) \right) ^\top \textbf{V}(\textbf{x})+ (v^0(\textbf{x}) \\&\quad - v^0(\textbf{y}))(u^0(\textbf{x})\gamma _{ij}(\textbf{x},\textbf{y}) - u^0(\textbf{y})\gamma _{ji}(\textbf{y},\textbf{x})){{\,\textrm{div}\,}}\textbf{V}(\textbf{x}) ~d\textbf{y}d\textbf{x}\end{aligned}$$

and for the singular symmetric kernel

$$\begin{aligned}&D_\Gamma {A}_{\Gamma }(u^0, v^0) [\textbf{V}] \\&\quad = \sum _{i,j=1,2}~ \iint \limits _{\Omega _i\times \Omega _j}\frac{1}{2}(u^0(\textbf{x}) -u^0(\textbf{y}))(v^0(\textbf{x}) - v^0(\textbf{y}))\\&\qquad \left( \nabla _{\textbf{x}}\gamma _{ij}(\textbf{x},\textbf{y})^\top \textbf{V}(\textbf{x}) + \nabla _{\textbf{y}}\gamma _{ij}(\textbf{x},\textbf{y})^\top \textbf{V}(\textbf{y}) \right) \\&\qquad + (u^0(\textbf{x}) -u^0(\textbf{y}))(v^0(\textbf{x}) - v^0(\textbf{y}))\gamma _{ij}(\textbf{x},\textbf{y}) {{\,\textrm{div}\,}}\textbf{V}(\textbf{x}) ~d\textbf{y}d\textbf{x}. \end{aligned}$$
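For concreteness, the two truncated kernels used in the experiments can be evaluated pointwise as in the following Python sketch. The indicator `in_omega1` (here a circle of radius 0.25 around (0.5, 0.5), matching the target shape) and the choice \(s = 0.5\) are purely illustrative assumptions: \(\Omega _1\) changes during the optimization and \(s\) is left generic in the text.

```python
import numpy as np

DELTA = 0.1  # interaction radius delta, as in the experiments
S = 0.5      # fractional order s (illustrative choice, not fixed by the text)

def in_omega1(x):
    # Illustrative indicator for Omega_1: the disk of radius 0.25
    # around (0.5, 0.5); the actual Omega_1 varies during optimization.
    return np.linalg.norm(np.asarray(x) - np.array([0.5, 0.5])) < 0.25

def gamma_sym(x, y):
    """Truncated singular symmetric kernel gamma^sym_Gamma(x, y)."""
    r = np.linalg.norm(np.asarray(x) - np.asarray(y))
    if r >= DELTA or r == 0.0:
        return 0.0  # truncation by the Euclidean delta-ball
    d_delta = (2 - 2 * S) / (np.pi * DELTA ** (2 - 2 * S))  # scaling constant
    if in_omega1(x) and in_omega1(y):
        c = 100.0   # (x, y) in Omega_1 x Omega_1
    elif (not in_omega1(x)) and (not in_omega1(y)):
        c = 1.0     # (x, y) in Omega_2 x Omega_2
    else:
        c = 10.0    # mixed subdomain pair
    return c * d_delta / r ** (2 + 2 * S)

def gamma_nonsym(x, y):
    """Truncated nonsymmetric integrable kernel gamma^nonsym_Gamma(x, y)."""
    if np.linalg.norm(np.asarray(x) - np.asarray(y)) >= DELTA:
        return 0.0
    c_delta = 1.0 / DELTA ** 4  # scaling constant
    return (5.0 if in_omega1(x) else 3.0) * c_delta
```

Note that `gamma_nonsym` depends only on the subdomain of \(\textbf{x}\), which is what makes the kernel nonsymmetric across the interface.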

In order to solve problem (12), we apply a finite element method, employing continuous piecewise linear basis functions on triangular grids for the discretization of the nonlocal constraint equation. In particular, we use the free meshing software Gmsh [30] to construct the meshes and a customized version of the Python package nlfem [35] to assemble the stiffness matrices of the nonlocal state and adjoint equations as well as the load vector corresponding to the shape derivative \(D_\Gamma {A}_\Gamma \). Moreover, to assemble the load vectors of the state and adjoint equations and the shape derivatives \(D_\Gamma J\) and \(D_\Gamma F_\Gamma \), we employ the open-source finite element software FEniCS [1, 2]. For a detailed discussion of the assembly of the nonlocal stiffness matrix we refer to [22, 35]. Here we solely emphasize how to implement a subdomain-dependent kernel of type (3). During the mesh generation each triangle is labeled according to its subdomain affiliation. Thus, whenever we integrate over a pair of triangles, we can read out the labels \((i,j)\) and choose the corresponding kernel \(\gamma _{ij}\).
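The label-based dispatch just described can be sketched as follows; the label convention (1 and 2 for \(\Omega _1\) and \(\Omega _2\)) and all function names are our own illustrative assumptions, not the actual nlfem interface.

```python
# Sketch: select the kernel gamma_ij from the subdomain labels of the
# two triangles currently being integrated over. The table below uses a
# piecewise-constant integrable kernel purely for illustration.

def make_labeled_kernel(kernel_table):
    """kernel_table maps a label pair (i, j) to a kernel function gamma_ij."""
    def gamma(x, y, label_x, label_y):
        # label_x, label_y are the subdomain labels read from the mesh
        return kernel_table[(label_x, label_y)](x, y)
    return gamma

table = {
    (1, 1): lambda x, y: 100.0,  # both points in Omega_1
    (2, 2): lambda x, y: 1.0,    # both points in Omega_2
    (1, 2): lambda x, y: 10.0,   # mixed pairs
    (2, 1): lambda x, y: 10.0,
}
gamma = make_labeled_kernel(table)
```

During assembly, the quadrature loop over a triangle pair would call `gamma` once per quadrature point pair, passing the two precomputed element labels.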

The data \({\bar{u}}\) is generated as the solution \(u({\overline{\Gamma }})\) of the constraint equation associated with a target shape \({\overline{\Gamma }}\). Thus the data is represented as a linear combination of basis functions from the finite element basis. For the interpolation task in Line 3 of Algorithm 1 we only need to translate between (non-matching) finite element grids by using the project function of FEniCS. In all examples below the target shape \({\overline{\Gamma }}\) is chosen to be a circle of radius 0.25 centered at (0.5, 0.5).

Fig. 2

Example 1 for the singular symmetric kernel

Fig. 3

Example 1 for the nonsymmetric integrable kernel

Fig. 4

Example 2 for the singular symmetric kernel

Fig. 5

Example 2 for the nonsymmetric integrable kernel

We now present two different non-trivial examples, which differ in the choice of the initial guess \(\Gamma _0\). They are presented and described in Figs. 2 and 4 for the singular symmetric kernel and in Figs. 3 and 5 for the nonsymmetric integrable kernel. In each plot of the aforementioned figures the black line represents the target interface \({\overline{\Gamma }}\). Moreover, the blue area depicts \(\Omega _1\), the grey area \(\Omega _2\), and the red area the nonlocal interaction domain \(\Omega _I\).

Since the start shapes are smaller than the target shape, the shape needs to expand during the first few iterations. In the process, the nodes of the mesh are pushed towards the boundary, so that the mesh quality decreases and the algorithm stagnates, since nodes are not allowed to be pushed outside of \(\Omega \). Therefore, we apply a remeshing technique, where we remesh after the fifth and tenth iterations. In order to remesh, we save the points of our current shape as a spline in a dummy.geo file, which also contains the information on the nonlocal boundary, and then compute a new mesh with Gmsh. In this new mesh the distance between the nodes and the boundary is sufficiently large, so that the new mesh deformations can again achieve a noticeable improvement of the objective function value.
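The spline export described above can be sketched as follows; the point numbering, the mesh size `h`, and the omission of the surrounding geometry (including the nonlocal boundary, which the actual dummy.geo file also encodes) are simplifications of ours.

```python
def write_interface_geo(points, filename="dummy.geo", h=0.02):
    """Write the current interface polygon as a closed Gmsh spline.

    `points` is a list of (x, y) coordinates of the interface nodes.
    Closing the spline with the id of the first point yields a
    periodic curve, as required for a closed interface.
    """
    lines = []
    for i, (x, y) in enumerate(points, start=1):
        lines.append(f"Point({i}) = {{{x}, {y}, 0, {h}}};")
    ids = ", ".join(str(i) for i in range(1, len(points) + 1))
    lines.append(f"Spline(1) = {{{ids}, 1}};")
    with open(filename, "w") as f:
        f.write("\n".join(lines) + "\n")
```

The resulting .geo file would then be completed with the outer rectangle and surface definitions before being passed to Gmsh for mesh generation.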

It is important to mention that computation times and the performance of Algorithm 1 in general are very sensitive to the choice of parameters and may vary strongly, which is why reporting exact computation times is not very meaningful at this stage. Particularly delicate choices are those of the system parameters, including the kernel (diffusion and convection) and the forcing term, which both determine the identifiability of the model. Another delicate choice is that of the Lamé parameters controlling the step size, specifically \(\mu _{max}\) (we set \(\mu _{min} = 0\) in all experiments). The convergence history of each experiment is shown in Fig. 6.

Fig. 6

In the first six or seven iterations the improvement regarding the objective function value is quite high. After that, the objective function value decreases in a much slower fashion. Due to the regularization term, the objective functional value will not converge to zero.

Moreover, especially in the case of system parameters with high interface-sensitivity combined with an inconveniently small \(\mu _{max}\), mesh deformations may be large in the early phase of the algorithm. Mesh deformations \(\tilde{\textbf{U}}_k\) of such high magnitude lead to destroyed meshes, so that evaluating the reduced objective functional \(J^{red}(({{\,\mathrm{\textbf{Id}}\,}}+ \alpha \tilde{\textbf{U}}_k)(\Omega _k))\), which requires the assembly of the nonlocal stiffness matrix, becomes a pointless computation. In order to avoid such computations, we first perform a line search based on one simple mesh quality criterion. More precisely, we downscale the step size, i.e., \(\alpha = \tau \alpha \), until all finite element nodes of the resulting mesh \(({{\,\mathrm{\textbf{Id}}\,}}+\alpha \tilde{\textbf{U}}_k)(\Omega _k)\) lie inside \(\Omega \). After that, we continue with the backtracking line search in Line 18 of Algorithm 1.
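This mesh-quality line search can be sketched as follows, assuming for illustration that \(\Omega \) is the unit square (consistent with \(\Omega \cup \Omega _I\subset [-\delta ,1+\delta ]^2\)) and that only interior nodes are moved; both assumptions, and the function name, are ours.

```python
import numpy as np

def mesh_quality_line_search(nodes, U, alpha=1.0, tau=0.5, max_iter=50):
    """Downscale alpha until all deformed nodes stay inside Omega = (0, 1)^2.

    `nodes` and `U` are (n, 2) arrays of interior node coordinates and the
    deformation field U_k; tau in (0, 1) is the downscaling factor.
    """
    for _ in range(max_iter):
        deformed = nodes + alpha * U  # nodes of (Id + alpha * U_k)(Omega_k)
        if np.all((deformed > 0.0) & (deformed < 1.0)):
            return alpha  # mesh stays admissible; continue with backtracking
        alpha *= tau      # downscale the step size, alpha = tau * alpha
    return alpha
```

Only once this cheap geometric check passes would the (expensive) nonlocal stiffness matrix be assembled for the backtracking line search.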

6 Concluding remarks and future work

We have conducted a numerical investigation of shape optimization problems that are constrained by nonlocal system models. Through numerical experiments we have demonstrated the applicability of established shape optimization techniques, for which the shape derivative of the nonlocal bilinear form represents the novel ingredient. All in all, this work is only a first step in the exploration of the interesting field of nonlocally constrained shape optimization problems.