The dual approach to optimal control in the coefficients of nonlocal nonlinear diffusion

We derive the dual variational principle (principle of minimal complementary energy) for the nonlocal nonlinear scalar diffusion problem, which may be viewed as the nonlocal version of the $p$-Laplacian operator. We establish existence and uniqueness of solutions (two-point fluxes) as well as their quantitative stability, which holds uniformly with respect to the small parameter (nonlocal horizon) characterizing the nonlocality of the problem. We then focus on the nonlocal analogue of the classical optimal control in the coefficient problem associated with the dual variational principle, which may be interpreted as that of optimally distributing a limited amount of conductivity in order to minimize the complementary energy. We show that this nonlocal optimal control problem $\Gamma$-converges to its local counterpart, when the nonlocal horizon vanishes.


Introduction
arXiv:2306.08435v3 [math.AP] 6 Oct 2023
More than half a century ago Céa and Malanowski [1] introduced an optimization problem, which can be viewed as that of distributing a limited amount of "conductivity" in a given design domain in order to minimize the weighted average steady-state temperature for a given volumetric heat source. This seminal paper catalyzed the development of what is nowadays referred to as topology optimization, see for example the monographs [2][3][4] and the references therein. The original problem has been generalized and extended in countless ways. We will state it here almost in its original form, except we will assume that the linear diffusion governed by the Laplace operator is replaced by a nonlinear diffusion governed by the $p$-Laplace operator, $1 < p < +\infty$. Namely, let $\Omega$ be a connected bounded open Lipschitz domain in $\mathbb{R}^n$, $n \geq 2$, and $f \in L^q(\Omega)$ with $p^{-1} + q^{-1} = 1$ be a given volumetric heat source. We would like to find the optimal distribution of the conductivity $\kappa \in A \subset L^\infty(\Omega)$ and the corresponding temperature field $u \in U_0 = W^{1,p}_0(\Omega)$, solving the following convex saddle-point problem: where $\check{\imath}_{\mathrm{loc}}(\kappa) := \inf_{u \in U_0} \check{I}_{\mathrm{loc}}(\kappa; u)$, For both physical and solvability reasons we will assume that there are constants $0 < \underline{\kappa} \leq \overline{\kappa} < +\infty$, such that and that the admissible set of conductivities $A$ is nonempty, convex, and weakly* compact in $L^\infty(\Omega)$. A model example of $A$ is given by where $V > 0$ is an upper bound on the total available conductivity.
We now outline two ideas, which provide the proper context for this work.

Dual formulation of the state problem
It is often desirable to replace the "inner" optimization problem (2) in the saddle point problem (1) with its convex conjugate. In the present case, this leads to the following dual variational statement: $\hat{I}_{\mathrm{loc}}(\kappa; \sigma)$, where $\operatorname{div}$ is the distributional divergence operator, and $Q$ is a separable, reflexive Banach space with respect to the graph norm $\|\sigma\|^q_Q = \|\sigma\|^q_{L^q(\Omega;\mathbb{R}^n)} + \|\operatorname{div}\sigma\|^q_{L^q(\Omega)}$, which also makes $Q(f)$ into a closed subset of $Q$.
As we shall recall, the problem (6) is solvable for each $f \in L^q(\Omega)$, and the equality $\check{\imath}_{\mathrm{loc}}(\kappa) + \hat{\imath}_{\mathrm{loc}}(\kappa) = 0$ holds. This allows us to equivalently restate the problem (1) as (9). The numerous reasons for considering (9) in lieu of (1) may now become clearer. Physically, the conjugate variables $\sigma$ provide us with valuable information about the propagation of the quantity $u$ through the heterogeneous domain $\Omega$; for example, in the case of heat conduction they correspond to heat fluxes. Algorithmically, one may interchange the order of minimization in (9) and (6), or indeed minimize simultaneously with respect to the conduction coefficients and the fluxes, see for example [5][6][7][8]. Analytically, one derives useful optimality conditions with respect to $\kappa$ from the assumed knowledge of optimal fluxes $\sigma$, see for example [2][3][4,9] and the references therein.
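The conjugation step behind the dual formulation can be illustrated at the level of a single point. The following is only a sketch: it assumes that the primal energy density has the form $h(\xi) = \frac{\kappa}{p}|\xi|^p$ with a constant $\kappa > 0$, and the symbols $h$, $h^*$ and $\xi$ are introduced here purely for illustration. Up to the sign convention chosen for the flux, the maximizer relates $\sigma$ to $\kappa|\nabla u|^{p-2}\nabla u$, and the density $\frac{1}{q}\kappa^{1-q}|\sigma|^q$ is what one would expect a complementary energy to integrate.

```latex
% Fenchel conjugate of h(\xi) = (\kappa/p)|\xi|^p, with 1/p + 1/q = 1.
% The names h, h^*, \xi are illustrative and do not appear in the text.
\begin{align*}
h^*(\sigma) &= \sup_{\xi \in \mathbb{R}^n}
   \Big\{ \sigma \cdot \xi - \frac{\kappa}{p}|\xi|^p \Big\}. \\
\intertext{Stationarity gives $\sigma = \kappa|\xi|^{p-2}\xi$, hence
  $|\xi| = (|\sigma|/\kappa)^{q-1}$ and}
\xi &= \kappa^{1-q}|\sigma|^{q-2}\sigma, \\
h^*(\sigma) &= \kappa^{1-q}|\sigma|^{q}
   - \frac{\kappa}{p}\,\kappa^{p(1-q)}|\sigma|^{p(q-1)}
   = \Big(1 - \frac{1}{p}\Big)\kappa^{1-q}|\sigma|^{q}
   = \frac{1}{q}\,\kappa^{1-q}|\sigma|^{q}.
\end{align*}
% Here we used p(q-1) = q and 1 + p(1-q) = 1 - q.
```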

Nonlocal physics
Another avenue along which (1) may be developed is to assume that the diffusion is governed by a nonlocal operator instead of a differential one. Nonlocal models have lately become an active research direction owing to their numerous attractive mathematical properties and applications, see for example [10][11][12][13][14]. In particular, nonlocal models arise naturally when long-range interactions between points in space and/or in time are considered. Nonlocality appears in a variety of contexts, such as image processing [15], pattern formation [16], population dispersal [17], nonlocal diffusion [11,12], nonlocal characterization of Sobolev spaces [18][19][20] and fractional Laplacian [21] and applications of these. In the current work we will focus on peridynamics, a nonlocal paradigm in solid mechanics, see for example [10,14].
We will utilize a model which corresponds to the bond-based peridynamic model of diffusion. The "degree of nonlocality" of the model will be characterized by a small parameter $\delta > 0$, which will be referred to as the nonlocal interaction horizon. In this model only points that are less than $\delta$ distance units apart may interact with each other. To this end we let $\Omega_\delta = \cup_{x \in \Omega} B(x, \delta)$ be the domain $\Omega$ including the nonlocal "$\delta$-halo"/"collar"/boundary. The strength of the nonlocal interactions will be characterized by a radial kernel $\omega_\delta : \mathbb{R}^n \to \mathbb{R}_+$, which is supported in $B(0, \delta)$ and is normalized to satisfy the condition (10), in which $S^{n-1}$ is the unit sphere in $\mathbb{R}^n$ and $e \in S^{n-1}$ is arbitrary, see [18]. Owing to the radial nature of $\omega_\delta$, sometimes we will write $\omega_\delta(|x|)$ instead of $\omega_\delta(x)$, in particular in connection with integration in spherical coordinates. A typical example of $\omega_\delta$ is given by where $\alpha \in (-1 - n/p, -1]$ is a modelling parameter, and the normalization constant $c_{\delta,\alpha} > 0$ is computed from the normalization condition above. More singular behaviour of $\omega_\delta$ at zero (that is, more negative values of $\alpha$) results in higher solution regularity to nonlocal diffusion problems.
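The lower end of the admissible range of $\alpha$ can be understood through an integrability check in spherical coordinates. The sketch below rests on two assumptions that are not spelled out in the text above: that the typical kernel has the truncated power-law form $\omega_\delta(x) = c_{\delta,\alpha}|x|^{\alpha}$ for $|x| < \delta$, and that the normalization condition involves the integral of $|x|^p\omega_\delta^p(x)$.

```latex
% Assumed kernel: \omega_\delta(x) = c_{\delta,\alpha}|x|^\alpha on B(0,\delta), zero elsewhere.
% |S^{n-1}| denotes the surface measure of the unit sphere.
\begin{align*}
\int_{B(0,\delta)} |x|^p\,\omega_\delta^p(x)\,dx
  &= c_{\delta,\alpha}^p\,|S^{n-1}| \int_0^\delta r^{\,p(1+\alpha)+n-1}\,dr
   = c_{\delta,\alpha}^p\,|S^{n-1}|\,
     \frac{\delta^{\,p(1+\alpha)+n}}{p(1+\alpha)+n}.
\end{align*}
% The radial integral is finite iff p(1+\alpha)+n > 0, that is, \alpha > -1 - n/p.
```

Under these assumptions the constant $c_{\delta,\alpha}$ would be fixed by equating the right-hand side with the prescribed normalization value.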
The linear operator $G_\delta : D(G_\delta) \subseteq L^p(\Omega) \to L^p(\Omega_\delta \times \Omega_\delta)$ defined by (11) will be called the nonlocal gradient, where we employ the convention that functions are extended to vanish outside of their explicit domain of definition, unless explicitly stated otherwise. We will recall and summarize its properties in Proposition 2.1. For now, it is sufficient to state that $U_{\delta,0} = D(G_\delta)$ is a separable, reflexive Banach space when equipped with the norm $\|u\|_{U_{\delta,0}} = \|G_\delta u\|_{L^p(\Omega_\delta \times \Omega_\delta)}$, which is furthermore equivalent (uniformly for small $\delta > 0$) to the graph norm associated with the operator $G_\delta$. The central result on which this construction hinges is the nonlocal Poincaré inequality, see for example [18][19][20].
With this in mind we can now state the direct nonlocal analogue of (1)-(3), namely (12)-(14), where $\check{\imath}_\delta(\kappa)$ is defined as an infimum in (13). The coefficient $\kappa$ may be thought of as the "nonlocal conductivity," in the sense that $\kappa(x, x')$ provides an additional characterization of the bond strength between the points $(x, x') \in \Omega_\delta \times \Omega_\delta$. We emphasize the word additional, because in this model the bond strength depends not only on the distance between the points, which is accounted for in the radial kernel $\omega_\delta$, but also on what can be thought of as the material property $\kappa(x, x')$. The quantity $\check{I}_\delta(\kappa; u)$ defined in (14) is the direct analogue of the usual, local, Dirichlet/potential energy given by (3) associated with the $p$-Laplacian. For now the exact structure of the set of admissible nonlocal conductivities $A_\delta$ is not going to be of great importance. Similarly to (5), we will assume that nonlocal conductivities satisfy the bounds for some constants $0 < \underline{\kappa} \leq \overline{\kappa} < +\infty$. Similarly, we will assume that $A_\delta$ is a nonempty, convex, and weakly* compact set in $L^\infty(\Omega_\delta \times \Omega_\delta)$. Finally, we will assume that all nonlocal conductivities are symmetric, that is, The symmetry assumption on $\kappa(x, x')$ is reasonable because the quantity $|G_\delta u(x, x')|^p$ satisfies it, and consequently the Dirichlet energy $\check{I}_\delta(\kappa; u)$ depends only on the symmetric part of $\kappa(x, x')$.
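The claim that the energy sees only the symmetric part of $\kappa$ can be checked in a few lines. In the sketch below, $g$ is a placeholder (not used elsewhere in the text) for any nonnegative symmetric two-point quantity, such as $|G_\delta u(x, x')|^p$, and $\kappa_s$ denotes the symmetric part of $\kappa$.

```latex
% g(x,x') = g(x',x) is a placeholder for |G_\delta u(x,x')|^p.
\begin{align*}
\iint_{\Omega_\delta\times\Omega_\delta} \kappa(x,x')\,g(x,x')\,dx'\,dx
  &= \iint_{\Omega_\delta\times\Omega_\delta} \kappa(x',x)\,g(x',x)\,dx\,dx'
     && \text{(rename } x \leftrightarrow x'\text{)} \\
  &= \iint_{\Omega_\delta\times\Omega_\delta} \kappa(x',x)\,g(x,x')\,dx'\,dx
     && \text{(symmetry of } g\text{, Fubini)} .
\end{align*}
% Averaging the first and the last expressions:
\[
\iint_{\Omega_\delta\times\Omega_\delta} \kappa\,g\,dx'\,dx
  = \iint_{\Omega_\delta\times\Omega_\delta} \kappa_s\,g\,dx'\,dx,
\qquad
\kappa_s(x,x') = \tfrac12\big[\kappa(x,x') + \kappa(x',x)\big].
\]
```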
In the context of the nonlocal $p$-Laplacian the well-posedness of the state problem has been analyzed, for example, in [22] utilizing the nonlocal vector calculus of [23], see also [24]. The corresponding saddle point problem (12) has been introduced and studied in [25][26][27][28][29], even if in certain cases only for $p = 2$.
An especially interesting research question for these models is whether they can approximate the local models, in some appropriate sense, when one considers the limit $\delta \to 0$ in, for example, (13) or (12). These questions have been answered affirmatively in the cited references, with the critical tools provided in [18][19][20] and their generalization to problems with heterogeneous coefficients [30,31].

The contribution of this work
The main contribution of this work effectively lies in combining the lines of thought outlined in Subsections 1.1 and 1.2.
More specifically, we will define the dual variational principle corresponding to the nonlocal state problem (13). We will demonstrate its well-posedness, strong duality with the primal statement (13), and finally carry out the analysis of convergence towards the local problems as $\delta \to 0$. This plan has been successfully executed in the quadratic case $p = 2$ in [32]. Its generalization to the case $p \in (1, \infty)$ is not quite straightforward. In particular, in the non-quadratic case we may no longer rely upon the global stability of solutions to the mixed linear variational principles (uniformly with respect to small $\delta > 0$). Furthermore, the explicit construction of a "recovery sequence" involved in obtaining a lower bound for the local problem (see Subsection 4.2 and Section 5) relies upon a nonlinear operator in the present case, which is also interesting.
Owing to the strong duality, the $\Gamma$-convergence result for the dual nonlocal problem is nearly identical to the one mentioned in Section 1.2, see for example [28][29][30][31]. However, our approach utilizes different ideas, namely the stability of optimal values to convex constrained optimization problems, and is of interest on its own merits.

Outline
The outline of the rest of this paper is as follows. In Section 2 we introduce the nonlocal divergence operator, state the dual variational statement for the nonlocal $p$-Laplacian, and establish its well-posedness and quantitative stability.
With these results in place, in Section 3 we state the nonlocal analogue of (6), which is the main subject of study in the present work, and establish its well-posedness. Finally, Sections 4 and 5 deal with the question of $\Gamma$-convergence of the nonlocal control in the coefficients towards its local counterpart. We have chosen to collect most of the computations needed to establish the consistency of the limit of the recovery sequence constructed in Subsection 4.2 and needed for $\Gamma$-convergence into Section 5.

Nonlocal divergence and the nonlocal dual variational principle for p-Laplacian
Let us begin this section by summarizing the well-known but pertinent properties of the nonlocal gradient operator defined in (11) for future reference.
1. The domain

The range of Gδ
In view of (10) we can conclude that
2. Convergence in $L^p$ implies a.e. pointwise convergence, up to a subsequence; $G_\delta$ is defined pointwise.
3. The proof of the fact that the constant in the nonlocal Poincaré-type inequality may be chosen uniformly for all small $\delta > 0$ can be found, for example, in [27, Lemma 4.2] for $p = 2$. The proof holds verbatim for $1 < p < \infty$.
4. This is shown in [33, Theorem 2.21] as a consequence of the previous points.
5. We follow the standard argument and consider $U_{\delta,0} = D(G_\delta)$ equipped with the graph norm $\|u\|^p = \|u\|^p_{L^p(\Omega)} + \|G_\delta u\|^p_{L^p(\Omega_\delta \times \Omega_\delta)}$, and an isometry $T : U_{\delta,0} \to L^p(\Omega) \times L^p(\Omega_\delta \times \Omega_\delta)$ given by $Tu = (u, G_\delta u)$. Point 2 implies that $R(T) = \operatorname{graph} G_\delta$ is a closed subspace of the reflexive and separable Banach space $L^p(\Omega) \times L^p(\Omega_\delta \times \Omega_\delta)$, and is consequently also reflexive, separable, and complete [33, Propositions 3.20, 3.25]. The same properties hold for $U_{\delta,0}$ equipped with the graph norm because $T$ is a linear isometry. Finally, the conclusion does not change when we equip $U_{\delta,0}$ with the norm $\|u\|_{U_{\delta,0}} = \|G_\delta u\|_{L^p(\Omega_\delta \times \Omega_\delta)}$, since this norm is equivalent to the graph norm in view of point 3.
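The norm equivalence invoked in point 5 is a two-line consequence of the Poincaré-type inequality. The sketch below assumes that the inequality of point 3 takes the form $\|u\|_{L^p(\Omega)} \leq C\|G_\delta u\|_{L^p(\Omega_\delta\times\Omega_\delta)}$ with $C$ independent of small $\delta$; the exact form is not reproduced in the text above.

```latex
% Assumed Poincare-type inequality: \|u\|_{L^p(\Omega)} \le C \|G_\delta u\|_{L^p(\Omega_\delta\times\Omega_\delta)}.
\begin{align*}
\|G_\delta u\|_{L^p(\Omega_\delta\times\Omega_\delta)}
 &\le \Big( \|u\|^p_{L^p(\Omega)}
      + \|G_\delta u\|^p_{L^p(\Omega_\delta\times\Omega_\delta)} \Big)^{1/p} \\
 &\le \big(1 + C^p\big)^{1/p}\,
      \|G_\delta u\|_{L^p(\Omega_\delta\times\Omega_\delta)} .
\end{align*}
% Both equivalence constants, 1 and (1+C^p)^{1/p}, are independent of \delta.
```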
Owing to the density of $D(G_\delta)$ in $L^p(\Omega)$ we can define a negative adjoint operator $D_\delta : D(D_\delta) \subset L^q(\Omega_\delta \times \Omega_\delta) \to L^q(\Omega)$, where we use Riesz representation (see, for example, [33, Theorem 4.11]) to identify the duals of $L^p$-spaces with $L^q$: We will refer to this operator as the nonlocal divergence operator. As an adjoint operator, it is closed and linear. We summarize some of its properties we are going to utilize below.
1.The claim follows from the definition of D δ using the integration by parts formula.Namely, for all , and for all v ∈ U δ,0 we have, owing to Fubini's theorem: dx, where The last equality holds owing to the continuity of Lebesgue's integral and owing to the following estimate, which holds for each x ∈ Ω: where we have utilized Hölder inequality and (10).Consequently, D δ σ ∈ L ∞ (Ω) ⊂ L q (Ω) in this case.
2. We note that D δ is a closed operator as the adjoint operator.One can then follow the proof of Proposition 2.1, point 5.

The surjectivity of
This allows us to formulate the following nonlocal version of (6), that is, the dual variational principle for the $p$-Laplacian:
Theorem 2.3 (Existence of solutions and optimality conditions). The problem (18) admits a unique optimal solution $\sigma^* \in Q_\delta(f)$ for each small $\delta > 0$, $f \in L^q(\Omega)$, and $\kappa \in A_\delta$. Furthermore, there is a unique Lagrange multiplier $u^* \in L^p(\Omega)$ such that the pair $(\sigma^*, u^*)$ satisfies the first order necessary and sufficient optimality conditions (20).
Proof. The infimum in (18) is attained: indeed we minimize a convex, continuous, and coercive function over a convex, closed, and nonempty subset of the reflexive Banach space $Q_\delta$, and consequently the generalized Weierstrass theorem is applicable [34, Theorem 7.3.7]. Thereby the existence of $\sigma^*$ is established.
We now note that Robinson's constraint qualification (cf. [35, Equation (3.12)]) holds: owing to the surjectivity of $D_\delta$, see Proposition 2.2, point 3. Therefore, we can apply [35, Theorem 3.6] to assert the necessity of the optimality conditions (20) and the existence of Lagrange multipliers. The sufficiency of (20) for optimality under convexity assumptions is standard, see for example [35, Proposition 3.3].
The uniqueness of u * may be verified directly from (20) using the surjectivity of D δ , see Proposition 2.2, point 3.
Indeed, the difference $v \in L^p(\Omega)$ between any two Lagrange multipliers would have to satisfy the equality which is easily obtainable by subtracting two versions of optimality conditions (20) corresponding to two potential Lagrange multipliers from each other.
Symmetry will play a critical role in some of the forthcoming results. In order to describe it we introduce the following notation for two subspaces of symmetric and anti-symmetric functions in $L^q(\Omega_\delta \times \Omega_\delta)$: We will also write To characterize the symmetric and antisymmetric functions we will use the following auxiliary result.
The corresponding statement obtained by replacing s and a also holds. Proof.
1. Both $L^q_s(\Omega_\delta \times \Omega_\delta)$ and $L^q_a(\Omega_\delta \times \Omega_\delta)$ are closed, since convergence in $L^q(\Omega_\delta \times \Omega_\delta)$ implies pointwise convergence along a subsequence. Furthermore, each function in $L^q(\Omega_\delta \times \Omega_\delta)$ may be represented as a sum of a symmetric and an antisymmetric function.
2. This can be shown using "symmetric mollification." Indeed, let us take an arbitrary function $\sigma \in L^q_a(\Omega_\delta \times \Omega_\delta)$, and an arbitrary mollifier
3. We perform the direct computation: where the first equality stems from the definition of the symmetric and antisymmetric functions, in the second we change the order of the integration (Fubini theorem, [33, Theorem 4.5]), and in the third we simply rename the variables $x \leftrightarrow x'$.
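The splitting used in point 1 and a computation of the type described in point 3 can be written out as follows. This is a sketch rather than a reproduction of the lost display: the pairing of a symmetric $\sigma_s \in L^q_s$ with an antisymmetric $\tau_a \in L^p_a$ is an assumed instance, chosen because it follows exactly the three steps named in the text (definition, Fubini, renaming of variables).

```latex
% Decomposition into symmetric and antisymmetric parts.
\[
\sigma = \sigma_s + \sigma_a, \qquad
\sigma_s(x,x') = \tfrac12\big[\sigma(x,x') + \sigma(x',x)\big], \qquad
\sigma_a(x,x') = \tfrac12\big[\sigma(x,x') - \sigma(x',x)\big].
\]
% Assumed pairing: \sigma_s symmetric, \tau_a antisymmetric.
\begin{align*}
I := \int_{\Omega_\delta}\!\int_{\Omega_\delta} \sigma_s(x,x')\,\tau_a(x,x')\,dx'\,dx
  &= -\int_{\Omega_\delta}\!\int_{\Omega_\delta} \sigma_s(x',x)\,\tau_a(x',x)\,dx'\,dx
     && \text{(definitions)} \\
  &= -\int_{\Omega_\delta}\!\int_{\Omega_\delta} \sigma_s(x',x)\,\tau_a(x',x)\,dx\,dx'
     && \text{(Fubini)} \\
  &= -\int_{\Omega_\delta}\!\int_{\Omega_\delta} \sigma_s(x,x')\,\tau_a(x,x')\,dx'\,dx = -I
     && \text{(rename } x \leftrightarrow x'\text{)} ,
\end{align*}
% hence I = 0.
```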
We will now show that $D_\delta$ possesses a bounded right inverse operator, whose norm is uniformly bounded for small $\delta > 0$. This will allow us to make uniform a priori estimates of optimal solutions to (18).
Proposition 2.6. Let the constants $\bar{\delta} > 0$ and $C > 0$ be as in Proposition 2.1, point 3. Then for each $f \in L^q(\Omega)$
Proof. Let us consider the following infima: The infimum on the left is clearly attained: we minimize a convex, continuous, and coercive function over a convex, closed, and nonempty subset of the reflexive Banach space $Q_\delta$, and consequently the generalized Weierstrass theorem is applicable [34, Theorem 7.3.7]. We let $\sigma_{\delta,f}$ If $i_{\delta,f} = 0$ then also $\sigma_{\delta,f} = 0$, and we do not need to proceed any further since the claimed inequality clearly holds.
and ker D δ , see the second infimum in (22).We now define a linear functional F : ker D δ ⊕ span(σ δ,f ) → R by: Note that for each σ ∈ ker D δ and α ∈ R we have the inequality where the second equality is owing to the fact that ker is a linear subspace.We now apply Hahn-Banach theorem where we use the same symbol F to denote the extension.Note that our definition implies that F (σ) = 0, ∀σ ∈ ker D δ .Owing to the closed range theorem (cf.[33,Theorem 2.19]), we have that F ∈ (ker D δ ) ⊥ = R( Gδ ), and consequently there is u ∈ U δ,0 such that where we use Riesz representation (cf.[33,Theorem 4.11]).In particular, where we have utilized the definition of σδ,f , F , D δ through the integration by parts, Hölder's inequality, Riesz representation, and most importantly the Poincaré-type inequality for Gδ , see Proposition 2.1, point 3.
Finally, the proof is concluded by observing that Corollary 2.7 (A priori estimates).Let the constants δ > 0 and C > 0 be as in Proposition 2.6, and let κ ∈ A δ be arbitrary.Then for each f ∈ L q (Ω) and each δ ∈ (0, δ) the unique optimal solution σ * ∈ Q δ (f ) to (18) satisfies the a priori estimates Proof.Let the flux σδ,f ∈ Q δ (f ) be as in Proposition 2.6.Since it is feasible in (18), we have the following string of inequalities: Finally, Corollary 2.8 (Local Lipschitz semicontinuity of the optimal value, uniformly for small δ > 0).Let the constants δ > 0 and C > 0 be as in Proposition 2.6, and let κ ∈ A δ be arbitrary.We will denote by σ1 ∈ Q δ (f 1 ) and σ2 ∈ Q δ (f 2 ) the unique optimal solutions to (18) corresponding to the heat sources Then there is a positive constant L = L(∥f 1 ∥ L q (Ω) , q, C, κ), continuous and nondecreasing with respect to ∥f 1 ∥ L q (Ω) , but independent from δ > 0 such that Proof.Instead of trying to fit the current situation into the framework of [35,Chapter 4], we provide a direct and simple proof of this claim.Note that σ1 + σδ,f2−f1 ∈ Q δ (f 2 ), where we use the notation σδ,f from Proposition 2.6.
Consequently $\hat{I}_\delta(\kappa; \sigma_2) \leq \hat{I}_\delta(\kappa; \sigma_1 + \sigma_{\delta,f_2-f_1})$. In particular, utilizing the triangle inequality we get the estimate We apply the following estimates to each of the terms: We can now apply Taylor's formula to the function $\mathbb{R}_+ \ni t \mapsto t^q \in \mathbb{R}_+$ to get the estimate where $\theta_{\alpha,\beta} \in [0, 1]$, we have utilized the non-negativity of $\alpha$ and $\beta$, and the monotonicity of the function $\mathbb{R}_+ \ni t \mapsto t^{q-1} \in \mathbb{R}_+$. Finally, the Lipschitz constant in the estimate above can be explicitly expressed as C. It remains to utilize the upper estimates on $\alpha$ and $\beta$, while having the assumption $\|f_2 - f_1\|_{L^q(\Omega)} \leq 1$ in mind, to conclude the proof.
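The Taylor step above has the following generic form, stated here for arbitrary numbers $\alpha, \beta \geq 0$ and $q > 1$. It is meant as an illustration of the mechanism (first order Taylor formula plus monotonicity of $t \mapsto t^{q-1}$), not as a reproduction of the exact estimate in the proof.

```latex
% First-order Taylor (mean value) formula for t -> t^q on [0, \infty), q > 1.
\begin{align*}
\beta^q - \alpha^q
  &= q\,\big(\alpha + \theta_{\alpha,\beta}(\beta - \alpha)\big)^{q-1}(\beta - \alpha),
  \qquad \theta_{\alpha,\beta} \in [0,1], \\
|\beta^q - \alpha^q|
  &\le q\,\max\{\alpha,\beta\}^{q-1}\,|\beta - \alpha| ,
\end{align*}
% because \alpha + \theta(\beta-\alpha) lies between \alpha and \beta,
% both are nonnegative, and t -> t^{q-1} is nondecreasing on [0, \infty).
```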
In order to rigorously recover the duality relationship between (18) and (13) we are going to need a slightly different characterization of $U_{\delta,0}$, which is akin to [33, Proposition 9.18], see also [32, Proposition 4.5].
Proposition 2.9. Assume that $u \in L^p(\Omega)$. Then $u \in U_{\delta,0}$ if and only if there is a constant $c \geq 0$ such that
Proof. The "only if" part follows immediately from the integration by parts formula (17) and Hölder's inequality; in this case we can use $c = \|G_\delta u\|_{L^p(\Omega_\delta \times \Omega_\delta)}$.
To obtain the "if" part, we first note that the assumed inequality holds trivially also for the symmetric Lipschitz continuous fluxes owing to Proposition 2.2, part 1. Consequently, for each $\sigma \in C^{0,1}(\Omega_\delta \times \Omega_\delta)$ we can write where in the last inequality we estimate the norm $\|\sigma_a\|_{L^q(\Omega_\delta \times \Omega_\delta)}$ of the antisymmetric part $\sigma_a(x, x') = \frac{1}{2}[\sigma(x, x') - \sigma(x', x)]$ of $\sigma$ using the triangle inequality. Proceeding with integration by parts as in the proof of Proposition 2.2, part 1, we define a bounded linear functional where as before We now apply the Hahn-Banach theorem (cf. [33, Corollary 1.2]) to extend $F_u$ to a functional on $L^q(\Omega_\delta \times \Omega_\delta)$. Owing to Riesz representation (see, for example, [33, Theorem 4.11]), , and (25) implies that $\tau$ agrees with $-G_\delta u$, almost everywhere in $O_\epsilon$.
The strong duality holds, that is, $\check{\imath}_\delta(\kappa) + \hat{\imath}_\delta(\kappa) = 0$.
Proof. We compute the dual function $\tilde{\imath}_\kappa : L^p(\Omega) \to \mathbb{R} \cup \{\pm\infty\}$ for (18) as where we have utilized Proposition 2.9 to arrive at the first case, and the integration by parts in the second. To compute the remaining infimum, we utilize the fact that $Q_\delta$ is dense in $L^q(\Omega_\delta \times \Omega_\delta)$, see Proposition 2.2. Therefore, for $u \in U_{\delta,0}$ we have where we have utilized the separability of the problem. The last infimum is attained at $\sigma$ , which in turn leads to The dual problem may thus be stated as In particular, the Lagrange multiplier $u^* \in L^p(\Omega)$, whose existence and uniqueness has been established in Theorem 2.3, is also the solution to the dual problem and is therefore in $U_{\delta,0}$; see [35, Section 3.1.1]. Consequently, the strong duality holds, leading to the equality $\check{\imath}_\delta(\kappa) + \hat{\imath}_\delta(\kappa) = 0$.

Control in the coefficients of the nonlocal p-Laplacian
Having done all the preparatory work in Section 2, we now turn our attention to the nonlocal version (26) of (9). Owing to the strong duality between (18) and (13) established in Proposition 2.10, this problem is equivalent to (12).
One may therefore appeal to the existence of solutions to (12) to assert the existence of solutions to (26). A direct proof of existence, utilizing the convexity of the problem, is not much more difficult.
Proof.Owing to Theorem 2.3, the claim is equivalent to establishing that the infimum is attained.The latter follows using the direct method of calculus of variations while utilizing the convexity of the objective and the feasible set.

Localization
We will now investigate the connection between the problem (26) and the corresponding local problem (9). In order to do this properly, we have to make assumptions about the relationship between the local and nonlocal conductivities, that is, between $A$ and $A_\delta$. We will first extend all functions in $A$ by some fixed constant in the interval although we note that other approaches are clearly possible. Once this is done, we will assume that Generally speaking, there is no reason to assume that the set $A_\delta$ defined in (28) is convex, and consequently Theorem 3.1 as stated may not be applicable in the present case. However, since there is a one-to-one correspondence between $A_\delta$ and $A$ in this case, we can write (with a slight abuse of notation) instead of $\hat{I}_\delta(\kappa; \sigma)$ and $\hat{\imath}_\delta(\kappa)$, having in mind that $\kappa \in A$ and $\kappa \in A_\delta$ are related through (28). With this notational agreement we have the trivial equality The reasoning of Theorem 3.1 applies to the third infimum from the left: indeed the objective is an integral of a convex nonnegative function, uniformly coercive with respect to the second argument, and $A$ is convex and weakly* compact in $L^\infty(\Omega)$. This infimum is therefore attained at some $(\kappa^*, \sigma^*) \in A \times Q_\delta(f)$. As a result, the rest of the infima are also attained at the corresponding $\kappa^* \in A_\delta$.
Before we proceed with the estimates, we would like to record the following simple observation, which will be utilized in what follows.If in (29) we have σ ∈ L q a (Ω δ × Ω δ ), then

An upper estimate for the local problem
For each σ ∈ L q (Ω δ × Ω δ ) let us define the following "flux recovery" operator 6 and ∀κ ∈ A.
Proof.We begin by applying Hölder inequality to obtain the estimate where the last inequality is owing to the normalization of the kernel ω δ (•).It only remains to integrate both sides with respect to x to arrive at the first claim.
Let us now take an arbitrary $\psi \in L^p(\Omega_\delta; \mathbb{R}^n)$, $\kappa \in A$, and $\sigma \in L^q_a(\Omega_\delta \times \Omega_\delta)$. We apply Hölder's inequality as follows: , where we have used the fact that for all $x \in \Omega_\delta$ we have the inequality where $e \in S^{n-1}$ is an arbitrary unit vector. It remains to take
Proof. Let $\psi \in C^\infty_c(\Omega)$ be arbitrary. Utilizing the definitions (4) and (30) and the second order Taylor theorem for $\psi$, we obtain the estimate where we have used Hölder's inequality and the fact that $\omega_\delta \equiv 0$ for $|x - x'| > \delta_k$. By passing to the limit $\delta_k \to 0$ we arrive at the equality which concludes the proof.
We hereby obtained the following one-sided estimate.
Proof.Suppose that the claim is false.Then there is ϵ > 0 and a sequence of indices k ′ such that îδ k ′ (κ Let σδ k = arg min σ∈Q δ k (f ) Îδ k (κ δ k , σ), cf.Theorem 2.3 which establishes its existence and uniqueness.Note that , owing to Corollary 2.5.Furthermore, the sequence ∥σ δ k ∥ Q δ k is bounded, see the a priori estimate given in Proposition 2.6.Consequently, the sequence R δ k σδ k is bounded in L q (Ω; R n ), see Proposition 4.1.

A lower estimate for the local problem
Let us define the following nonlinear operator, which maps vector-valued functions σ : R n → R n to two-point quantities Fδ σ : R n × R n → R: Note that in the quadratic case p = 2, Fδ reduces to the Hilbert space adjoint operator for R δ , when the latter is restricted to L 2 a (Ω δ × Ω δ ), see [32, Section 6].

An informal derivation of Fδ
In this subsection we present an informal derivation of (32). Throughout the derivation we shall assume that the conductivity field $\kappa(x) \equiv \kappa$ is constant, and consequently also $\kappa(x, x') \equiv \kappa$. Let $\sigma \in L^q(\Omega_\delta \times \Omega_\delta)$ be a two-point flux corresponding to a sufficiently smooth temperature field, for example $u \in C^2_c(\Omega)$. We can then state the identity where the first equality is obtained by isolating $\sigma$ from the first of the optimality conditions (20), and the second one is owing to the symmetry of $\kappa$ and anti-symmetry of $G_\delta$. We can now make the following finite difference approximations stemming directly from the definition of $G_\delta$, see (11): where we have utilized the radial symmetry of $\omega_\delta$. Finally, assuming that $u$ is also the temperature field corresponding to the "local" vector flux $\sigma$ for the $p$-Laplacian, we get the formulae $\nabla u(x) = -\kappa^{1-q}|\sigma(x)|^{q-2}\sigma(x)$, and It remains to substitute (35) into (34), and the result subsequently into (33) and note that $\kappa\kappa^{(p-1)(1-q)} = 1$ and $|\sigma(x)|^{(p-1)(q-2)} = |\sigma(x)|^{2-p}$ in order to arrive at (32).
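The two exponent identities used in the last step follow from the conjugacy relation $p^{-1} + q^{-1} = 1$ alone; a short verification is recorded below.

```latex
% From 1/p + 1/q = 1 we have q = p/(p-1), i.e. (p-1)(q-1) = 1.
\begin{align*}
(p-1)(1-q) &= -(p-1)(q-1) = -1
  &&\Longrightarrow\quad \kappa\,\kappa^{(p-1)(1-q)} = \kappa\,\kappa^{-1} = 1, \\
(p-1)(q-2) &= (p-1)(q-1) - (p-1) = 1 - (p-1) = 2 - p
  &&\Longrightarrow\quad |\sigma(x)|^{(p-1)(q-2)} = |\sigma(x)|^{2-p}.
\end{align*}
```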

Rigorous analysis of Fδ
We will now show that $F_\delta$ possesses several properties, which are complementary to those enjoyed by $R_\delta$. Namely, we will show the following:
• Antisymmetry: $F_\delta\sigma \in L^q_a(\Omega_\delta \times \Omega_\delta)$, for each $\sigma \in L^q(\Omega; \mathbb{R}^n)$.
Antisymmetry follows directly from the definition of F , since Furthermore, for each σ ∈ L q (Ω; R n ) we can apply Hölder's inequality to show that Fδ σ ∈ L q (Ω δ × Ω δ ): We now proceed to establishing norm-stability, which is only marginally more difficult to obtain.Proof.We proceed with the direct computation.Let us put z = x ′ − x.Then we can write: dx, where we have utilized the fact that ω δ is radial, the anti-symmetry of Fδ and the convexity of | • | q .Arguing as in (31) we establish that E 1 (x) = |σ(x)| q , for each x ∈ Ω.Our strategy now is to show that E 2 (x) is close to E 1 (x), in the sense that Lebesgue's dominated convergence theorem is applicable as δ ↘ 0. Since we know the value of E 1 (x) only for x ∈ Ω, we will also need to show that the integrals of E 1 and E 2 over Γ δ are small.
To this end we first state the upper bound where we utilized Hölder's inequality.We now consider the integrand g(r) = |σ(x + rs)| q−p |σ(x + rs) • s| p , where s ∈ S n−1 and r ∈ R are arbitrary.If σ(x) ̸ = 0, the same holds over B(x, δ), for small enough δ > 0. We can then directly compute for r ∈ (−δ, δ).Consequently, utilizing Cauchy-Bunyakovsky-Schwarz inequality we can estimate and therefore owing to Taylor's theorem we have the inequality If, on the other hand, σ(x) = 0, then also g(0) = 0 and utilizing Cauchy-Bunyakovsky-Schwarz inequality and Taylor's theorem we can write Regardless of whether σ(x) = 0 or not, and having in mind that q > 1, for small |z| we can write the estimate with the proportionality constant depending on p, q, ∥σ∥ L ∞ (R n ;R n ) and ∥∇σ∥ L ∞ (R n ;R n×n ) .We now use s = z/|z| and r = |z| in the definition of g to write the following: for each x ∈ Ω δ .Since we have the upper bound (37), Lebesgue's dominated convergence theorem applies, implying the inequality with the last limit being 0 owing to the continuity of Lebesgue's integral.In summary, we can conclude that lim sup δ↘0 Îδ (κ; Fδ σ) ≤ Îloc (κ; σ).
We postpone the analysis of consistency of Fδ to Section 5.For now, we only state the final result without a proof.

Application to the lower estimate for the local problem and Γ-convergence
With these results at our disposal, we can prove the following claim.
According to Proposition 4.6, we can select $\delta(\epsilon) > 0$, such that for all $\delta \in (0, \delta(\epsilon))$ we have the bound: Let $\sigma_\delta \in Q_\delta(f)$ and $\tau_{\delta,\epsilon} \in Q_\delta(f_{\delta,\epsilon})$ be the solutions to (18) corresponding to the two different volumetric heat sources $f$ and $f_{\delta,\epsilon}$. In view of the stability estimate, Corollary 2.8, we have the following estimate: where $L_\epsilon$ is continuous and nondecreasing with respect to $\|f_{\delta,\epsilon}\|_{L^q(\Omega_\delta)} \leq \|f\|_{L^q(\Omega)} + 2\epsilon$, and therefore remains bounded for all small $\epsilon > 0$. Consequently, we can write In particular, the optimal values of the local (9) and nonlocal (26) problems satisfy the inequality
Proof. The first claim follows from applying Proposition 4.7 to $\sigma = \arg\min_{\tau \in Q(f)} \hat{I}_{\mathrm{loc}}(\kappa; \tau)$, while the second follows from applying the same result to a solution to (26).
Now, as a direct consequence of Proposition 4.3 and Corollary 4.8 we obtain the following nonlocal-to-local approximation result.
As a direct consequence of $\Gamma$-convergence, we obtain the convergence of optimal values, namely and consequently owing to the strong duality also see [37]. Furthermore, in view of a priori bounds on the optimal solutions we have established, we also get their convergence in the following sense. Let $\{(\kappa_\delta, \sigma_\delta)\}$ be a family of optimal solutions to (26) as $\delta \searrow 0$. Then, the family $\{(\kappa_\delta, R_\delta\sigma_\delta)\}$ is sequentially compact, with respect to the weak* topology of $L^\infty(\Omega)$ and the weak topology of $L^q(\Omega; \mathbb{R}^n)$. Each limit point of $\{(\kappa_\delta, R_\delta\sigma_\delta)\}$ as $\delta \searrow 0$ is an optimal solution to (9).

Consistency of Fδ
We now proceed to establishing the consistency of Fδ , that is, we will prove Proposition 4.6.Owing to the non-linearity of Fδ for p ̸ = 2, the proofs are somewhat longer than in the quadratic case p = 2.We begin with the following lemma, which will be applied to Taylor polynomials of D δ Fδ σ later on.
Lemma 5.1. Let $e \in S^{n-1}$ and $A \in \mathbb{R}^{n \times n}$ be arbitrary. Then
Proof. Let $Q \in \mathbb{R}^{n \times n}$ be an orthogonal transformation mapping $e$ to some other unit vector $\tilde{e} = Qe \in S^{n-1}$. Then the left hand side of (39) reduces to where we have utilized the variable substitution $\tilde{s} = Qs$. The right hand side does not change, since $\operatorname{tr} A = \operatorname{tr}(QAQ^T)$.
Therefore, without any loss of generality we may assume that e = e n , where e 1 , . . ., e n ∈ S n−1 are the standard basis vectors in R n . Then with integrals corresponding to i ̸ = n or j ̸ = n evaluating to 0 owing to the orthogonality and symmetry considerations. Furthermore where we denoted the value of the integral corresponding to i = j ̸ = n by E i,n and used the fact that the integrals corresponding to i ̸ = j evaluate to 0. Owing to the symmetry, , so we only need to compute one of them.
Let us directly evaluate $E_{n-1,n}$ using the spherical coordinates
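The "orthogonality and symmetry considerations" invoked in this proof are easiest to see in the quadratic analogue, where no weight depending on $s \cdot e$ is present. The identities below are a standard illustration of that mechanism and are not the statement of Lemma 5.1 itself; $|S^{n-1}|$ denotes the surface measure of the unit sphere.

```latex
% Second moments of the surface measure on the unit sphere S^{n-1}.
\begin{align*}
\int_{S^{n-1}} s_i s_j \,ds &= 0 \quad (i \neq j)
  && \text{(odd under the reflection } s_i \mapsto -s_i\text{)}, \\
\int_{S^{n-1}} s_i^2 \,ds &= \frac{1}{n}\int_{S^{n-1}} |s|^2 \,ds = \frac{|S^{n-1}|}{n}
  && \text{(all coordinates play the same role)}, \\
\int_{S^{n-1}} (As \cdot s)\,ds &= \sum_{i,j=1}^n A_{ij}\int_{S^{n-1}} s_i s_j \,ds
  = \frac{|S^{n-1}|}{n}\,\operatorname{tr} A .
\end{align*}
```

In the weighted case treated in the lemma the direction $e$ is distinguished, which is why the integrals involving the index $n$ have to be computed separately after the reduction $e = e_n$.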
A simple consequence of Lemma 5.1 and the normalization condition (10) is the following statement about volumetric integrals related to (39).
Corollary 5.2. Let $e \in S^{n-1}$ and $A \in \mathbb{R}^{n \times n}$ be arbitrary. Then
Proof. We evaluate the integral (44) in spherical coordinates: for smooth enough $\sigma$, where $s \in S^{n-1}$ is an arbitrary unit vector. The reason for this is as follows.
Proposition 5.3.Let σ be such that the family J p,δ,ϵ [σ] is dominated by some L q (Ω)-function, uniformly with respect to ϵ > 0. Then Fδ σ ∈ Q δ and Proof.Let us take an arbitrary u ∈ U δ,0 .We begin with the integration by parts: where The steps rely primarily on (anti-)symmetries of the terms, including the radial symmetry of ω δ (•), and Fubini's theorem.Finally, we can move the limit under the integral sign owing to the assumed uniform boundedness and Lebesgue's dominated convergence theorem.
For $p = 2$ the nonlinearity in $F_\delta$ disappears, and and small $\delta > 0$, see for example [32, Proposition 6.1]. We proceed with the analysis of the two remaining cases $p > 2$ and $1 < p < 2$, starting with the simpler case $p > 2$. In both cases we will show that $J_{p,\delta,\epsilon}[\sigma]$ is bounded and converges pointwise to $\operatorname{div} \sigma$ for smooth enough $\sigma$, so that Lebesgue's dominated convergence theorem yields the desired conclusion. In other words, in the remainder of the paper we will check that the prerequisites of Proposition 5.3 hold for $\sigma \in C^2_c(\mathbb{R}^n; \mathbb{R}^n)$, which in turn is sufficient to conclude that Proposition 4.6 holds.
Proof. We consider three cases.
Then for all 0 < δ < δ(x)/2 and for all z ∈ B(0, δ) we have We apply Taylor's theorem as in (47) with y 0 = x and y 1 = x + z, let s = z/|z|, and integrate each of the terms.For the constant term we get B(0,δ)\B(0,ϵ) owing to the symmetry.For one of the first order terms we proceed as follows: B(0,δ)\B(0,ϵ) because the integrand is a product of at least once continuously differentiable functions with respect to x in B(0, δ).
Case 2: $\sigma(x) = 0$, $\nabla\sigma(x) = 0$. By utilizing Taylor's formula of degree two, we improve (48) to with constants depending only on the second order derivatives of $\sigma$. After integration this yields the estimate Therefore, in this case (50) also holds.
Finally, at each point of $G_{ijk}$, the implicit function theorem applies, allowing us to express $x_j$ as a $C^1$ function of the remaining coordinates with the derivative bounded by $O(k\|\nabla\sigma(x)\|_{L^\infty(\mathbb{R}^n;\mathbb{R}^{n\times n})})$. The Lebesgue measure of the graph of such a function is 0.
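The role of the radius $\delta(x) = |\sigma(x)|/\|\nabla\sigma\|_{L^\infty}$ in the case $\sigma(x) \neq 0$ can be illustrated numerically. The sketch below assumes a hypothetical test field `sigma` (not compactly supported, chosen only because its Jacobian is bounded by 1) and checks the elementary consequence of the mean value theorem that $|\sigma(x+z)| \geq |\sigma(x)|/2$ whenever $|z| < \delta(x)/2$. The factor $1/2$ is what the triangle inequality gives; the precise form of the bound (51) is not reproduced here.

```python
import math
import random


def sigma(x):
    """Hypothetical smooth test field; its Jacobian diag(cos x1, -sin x2) has norm <= 1."""
    return (math.sin(x[0]), math.cos(x[1]))


LIPSCHITZ = 1.0  # assumed upper bound for the sup-norm of the Jacobian of sigma


def norm(v):
    return math.hypot(v[0], v[1])


def worst_ratio(num_trials=20_000, seed=0):
    """Smallest observed |sigma(x + z)| / |sigma(x)| over random x and |z| < delta(x) / 2."""
    rng = random.Random(seed)
    worst = float("inf")
    for _ in range(num_trials):
        x = (rng.uniform(-3.0, 3.0), rng.uniform(-3.0, 3.0))
        sx = norm(sigma(x))
        if sx < 1e-8:
            continue  # skip points where sigma (almost) vanishes
        delta_x = sx / LIPSCHITZ
        radius = rng.random() * delta_x / 2.0  # radius in [0, delta_x / 2)
        angle = rng.uniform(0.0, 2.0 * math.pi)
        z = (radius * math.cos(angle), radius * math.sin(angle))
        shifted = sigma((x[0] + z[0], x[1] + z[1]))
        worst = min(worst, norm(shifted) / sx)
    return worst


ratio = worst_ratio()
```

The value `ratio` stays at or above one half, which is the kind of uniform positive lower bound on $|\sigma(x+z)|$ that makes the integrand differentiable on the whole ball $B(0,\delta)$.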

Analysis of J
We begin with some additional notation. In the proof of case 1 of Proposition 5.6, we put $\delta(x) = |\sigma(x)|/\|\nabla\sigma\|_{L^\infty(\mathbb{R}^n;\mathbb{R}^{n\times n})}$ and when needed considered $0 < \delta < \delta(x)/2$ in order to have a uniform positive lower bound (51) on $|\sigma(x + z)|$, for all $z \in B(0, \delta)$. In the present case, we will also need a uniform lower bound on the absolute value of the inner product $\sigma(x + z) \cdot s$, $z \in B(0, \delta)$, $s \in S^{n-1}$. With this in mind we estimate this quantity as and define We will now estimate each of the terms $E_1$, $E_2$, $E_3$, and $E_4$.
To estimate $E_2$ we can for example consider the sets Additionally, we recall that a differentiable concave function $\phi : \mathbb{R} \to \mathbb{R}$ satisfies and that the reverse inequality holds assuming convexity. Utilizing the monotonicity and concavity of the function $[0, \infty) \ni t \mapsto |t|^{p-2} t$ on $S_1$, we have the estimate which is valid everywhere on $S_1$ outside of the null-set $\{ s \in S^{n-1} : \sigma(x) \cdot s = 0 \}$, which will become irrelevant after integration. Similarly, utilizing the monotonicity and convexity of the function $(-\infty, 0] \ni t \mapsto |t|^{p-2} t$, on $S_2$ we have the estimate with the same comment as for the estimate for $S_1$.
We now focus on $T$. We note the bounds due to Taylor's theorem: Thus the sign difference between $\sigma(x) \cdot s$ and $\sigma(x + rs) \cdot s$ may only occur when $s$ belongs to the spherical segment $S^{-}(x, r)$, see (56). Hence $T \subset S^{-}(x, r)$, whose measure is bounded by Armed with (60), (61), and (62) we can estimate the contribution to $E_2$ coming from $S_1 \cup S_2$ by
Let us finally focus on contributions to $E_2$ from $S_3 = (S_3 \cap S^{-}(x, r)) \cup (S_3 \cap S^{+}(x, r))$. On $S_3 \cap S^{-}(x, r)$, utilizing the monotonicity of the function $(-\infty, 0] \ni t \mapsto t^{p-1}$ we can simply write Let $s \in S^{+}(x, r)$ be an arbitrary vector; in particular, $0 < r < r(x, s)$. Let further $t \in [0, 1]$ and $r \in [r, r(x, s))$ be arbitrary. We note the inclusion $s \in S^{+}(x, r)$, which follows directly from the definition (56), as well as the trivial inequality $rt < r(x, s)$. Consequently, utilizing (55) we get the inequality By taking the supremum of the right hand side over $r \in [r, r(x, s))$ we arrive at the estimate After integration, this yields:
Estimate for $E_4$. We proceed in the same way as with After integration, this yields: Summing up the estimates (58), (59), (63), (64), (66), (67), and (68), and collecting all the constants into $L$ concludes the proof of the claim.
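The concavity and convexity inequalities recalled in the estimate for $E_2$ are the standard tangent-line bounds, and they can be sanity-checked for $\phi(t) = |t|^{p-2}t$ with $1 < p < 2$. The sketch below assumes the usual form $\phi(b) \leq \phi(a) + \phi'(a)(b - a)$ for concave $\phi$; the value $p = 1.5$ and the grid are arbitrary illustrative choices.

```python
import math


def phi(t, p):
    """The map t -> |t|^(p-2) t, written so that it is also defined at t = 0."""
    return math.copysign(abs(t) ** (p - 1), t)


def dphi(t, p):
    """Derivative (p - 1) |t|^(p-2) of phi, valid for t != 0."""
    return (p - 1) * abs(t) ** (p - 2)


def tangent_gap(a, b, p):
    """phi(a) + phi'(a) (b - a) - phi(b); nonnegative wherever phi is concave."""
    return phi(a, p) + dphi(a, p) * (b - a) - phi(b, p)


p = 1.5  # any exponent with 1 < p < 2 behaves the same way
grid = [0.05 * k for k in range(1, 81)]  # sample points in (0, 4]

# On (0, infinity) phi is concave: the tangent line lies above the graph.
min_gap_pos = min(tangent_gap(a, b, p) for a in grid for b in grid)

# On (-infinity, 0) phi is convex: the tangent line lies below the graph.
max_gap_neg = max(tangent_gap(-a, -b, p) for a in grid for b in grid)
```

Both extremal gaps are attained at $a = b$ and vanish there, so `min_gap_pos` is nonnegative and `max_gap_neg` is nonpositive. The sign flip between the two half-lines is the reason the sets $S_1$ and $S_2$ are treated separately.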
Corollary 5.8. Assume that $\sigma \in C^1_c(\mathbb{R}^n; \mathbb{R}^n)$. Then we have the inequality where the constant $L(\sigma, p, n) > 0$ is as in Lemma 5.7.
Proof. Note that cases 2 and 3 in the proof of Proposition 5.6 apply without any change. Therefore, we will focus on the remaining case $\sigma(x) \neq 0$. We remind the reader of the notation (56).
We begin by dealing with the integral over O − δ,ϵ (x).Recall the upper estimate of the measure of S − (x, r) provided by (62), which implies that The other first order term is more troublesome: To the last integral we can apply Proposition 5.9, which does not provide us with an error estimate but is sufficient for establishing its convergence and identifying its limit:

Numerical verification
Jupyter notebooks that allow one to numerically verify the equality asserted in Lemma 5.1 and the pointwise convergence established in Propositions 5.6 and 5.10 are publicly available on GitHub at https://github.com/anev-aau/nloc_dual_lapl_p.
$D_\delta$ follows from Proposition 2.1, points 1, 2, and 3 and [33, Theorem 2.21]. The fact that it is a bounded operator follows directly from the definition of the graph norm on $Q_\delta$. Proposition 2.2, point 3 implies that the affine subspace $Q_\delta$