Optimal Distributed Control for a Viscous Non-local Tumour Growth Model

In this paper, we address an optimal distributed control problem for a non-local model of phase-field type, describing the evolution of tumour cells in presence of a nutrient. The model couples a non-local and viscous Cahn–Hilliard equation for the phase parameter with a reaction-diffusion equation for the nutrient. The optimal control problem aims at finding a therapy, encoded as a source term in the system, both in the form of radiotherapy and chemotherapy, which could lead to the evolution of the phase variable towards a desired final target. First, we prove strong well-posedness for the system of non-linear partial differential equations. In particular, due to the presence of a viscous regularisation, we can also consider double-well potentials of singular type and cross-diffusion terms related to the effects of chemotaxis. Moreover, the particular structure of the reaction terms allows us to prove new regularity results for this kind of system. Then, turning to the optimal control problem, we prove the existence of an optimal therapy and, by studying Fréchet-differentiability properties of the control-to-state operator and the corresponding adjoint system, we obtain the first-order necessary optimality conditions.


Introduction
In recent years the mathematical modelling community has devoted much attention towards the understanding of intrinsic phenomena behind tumour growth, as well as the possibility of simulating medical treatments, which could drive the evolution of tumours.Without being by any means exhaustive, we cite [3,4,13,37,38] and references therein.Regarding the mathematical analysis of these models, the main questions concern not only well-posedness, regularity and long-time behaviour of solutions, but also related issues such as optimal control or parameter identification problems, see for example [1,5,12,29,30,43].
In the present work, we consider a diffuse interface model, which is a non-local and viscous variant of the model originally introduced in [34].We start by recalling the original model in [34], which can be written in the following form: together with homogenous Neumann boundary conditions and initial conditions.The main variable of the model is ϕ, a phase-field which represents the difference in volume fractions between tumour cells and healthy cells coexisting in the spatial domain Ω ⊆ R N , N = 2, 3.
Physically, ϕ should assume only values between −1 and 1, where ϕ ≃ 1 is identified as the tumour phase and ϕ ≃ −1 as the healthy phase.As a phase-field approximation, ϕ is free to assume values in (−1, 1), corresponding to a mixture of the two cell types; however, the interfacial region is generally kept small through an appropriate parameter.The evolving tumour region is, then, recovered as a level set of ϕ at values close to 1.Besides being easier to deal with from a numerical point of view, one of the main advantages of diffuse interface models, when compared to free-boundary ones, where ϕ would assume only the values −1 or 1, is the possibility of accounting for topological changes in the tumour mass, which can be quite common in certain stages of its evolution.The other variable σ of the system is related to the concentration of a nutrient in the extracellular water, such as oxygen or glucose.The idea is that tumour cells proliferate by absorbing such substance, whereas the concentration of nutrient consequently varies depending on the areas in which the tumour evolves.Also in this case, σ should physically be between 0 and 1, where σ ≃ 0 represents low concentrations of nutrient and σ ≃ 1 high ones.In particular, ϕ satisfies a Cahn-Hilliard equation, which is wellsuited in describing phase-separation phenomena (see [39] and references therein), while the nutrient variable σ satisfies a reaction-diffusion equation, and the two equations are coupled through an explicit reaction term, which models the interaction between nutrient and evolving tumour.We also account for another coupling term, associated with the parameter χ ≥ 0, that is responsible for the chemotaxis phenomenon, which concerns the natural movement of tumour cells towards regions with higher concentrations of nutrient.
As we briefly said at the beginning, we intend to consider a non-local and viscous variant of the system (1.1)- (1.3).Regarding the non-local aspect of the model, this is achieved by substituting the local term −B∆ϕ with one involving a convolution operator with a symmetric kernel J, possibly accounting for more complex interactions between cells, involving longrange competition.In this way, ϕ satisfies a non-local Cahn-Hilliard equation (see [26,31]), which can be seen as an approximation of the local one, when the kernel J is suitably peaked around zero (see [14][15][16]).Analysis and application of non-local phase-field models on tumour growth phenomena is still flourishing; we just mention some interesting works in this direction [24,25,43].The second modification applied to the original model is the addition of a viscosity term τ ϕ t in the equation for the chemical potential µ.Physically, it can be interpreted as a further dissipation acting on the system, coming from internal frictions (see [40]).From the mathematical point of view, however, it acts as a regularising term, since it allows some additional regularity estimates.Moreover, we also add two external sources u and v, which will then represent possible therapies in our optimal control problem.
In its most general form, the model that we are going to analyse is the following: where T > 0 is a fixed final time, Ω ⊂ R N , N = 2, 3, is an open bounded sufficiently smooth domain and Q T = Ω × (0, T ).We complement it with the following homogeneous Neumann boundary conditions and initial conditions: ) where n is the exterior normal unit vector to ∂Ω.
We now briefly introduce all the parameters that appear in the system.The auxiliary variable µ can be interpreted as a chemical potential in the Cahn-Hilliard equation, and, up to the additional viscosity term τ ∂ t ϕ, with τ > 0, it is the variational derivative of the non-local free energy of Ginzburg-Landau type The function F is generally a potential of double-well type, possessing two global minima in correspondence of the two pure phases −1 and 1.In most applications, typical choices of such potentials are a regular or polynomial one, a singular or logarithmic one, with derivatives exploding at ±1, or a double-obstacle one, which are respectively of the following form: Actually, the global minima of the logarithmic potential F sing are not exactly in −1 and 1, but in two intermediate values, which can be sufficiently close to them, depending on the choice of the parameters θ and θ 0 .In this paper, we will consider only potentials of regular or singular type, excluding the double obstacle case, which can generally be treated with slightly different methods, as in [10,45].From a physical perspective, potentials of singular type are more fitting, since they can effectively limit the evolution of ϕ in the interval [−1 , 1]; in this case it is crucial to prove the so-called separation property, see the recent work [26] for a detailed overview of the matter.This property is related to the fact that, during its evolution, the phase variable stays strictly inside the physical interval (−1, 1), which implies that the derivative of the potential does not actually blow-up.This is to be expected, since the variational structure of the system drives the evolution of ϕ towards the minima of the potential, which, in case of the singular one, are strictly inside the interval.In general this property is not always easy to prove and its validity is still an open problem in some situations, see however the recent works [32,41] for substantial advances in the three dimensional nonlocal case.When considering a regular potential, instead, this property actually means that the phase parameter stays globally bounded during its evolution.Here, thanks to the presence of the viscosity term, we are able to simplify these arguments and consider both regular and singular potentials, even in presence of reaction terms and sources.Proving this kind of property is important since it allows to effectively treat the singular potential as a regular one and compute its derivatives, and this is crucial to prove higher regularity for the solution of the system.The second term in the energy stands as a non-local approximation of the local term B|∇ϕ| 2 /2, which would penalise steep transitions in the interfaces between the two phases.
The convolution kernel J is generally taken of Newton or Bessel type; however, when studying non-local to local asymptotics of Cahn-Hilliard equations, one can also consider families of kernels suitably peaking around zero, see for instance [14][15][16].Finally, the parameters A > 0 and B > 0 are related to the width of the diffuse interface between the two phases, and are generally of the form A = 1/ε and B = ε, for ε ≪ 1; whereas the parameter χ ≥ 0 and its corresponding term are related to the chemotactic effect.Going back to the system (1.4)- (1.8), the functions m(ϕ) and n(ϕ) are called mobility functions and regulate the diffusion processes of the two variables.Generally, they can be constant or bounded both above and below, but in some cases, especially when F is singular, m can be degenerate on ±1.For instance, when dealing with the logarithmic potential F sing , one can choose m(s) = 1 − s 2 , in order to further enforce the physical condition and to compensate the singularities of the potential, see [19,20,23].The function P is a proliferation function that calibrates the strength of the reaction terms, which, in turn, are written in this form due to some chemical phenomenological laws (see [34]).Typically, P can be of the form so that P (−1) = P 0 and P (1) = P 1 , with 0 ≤ P 0 < P 1 , i.e. the proliferation is faster in the tumour phase.Generally, one takes P 0 = 0 and P 1 = 1.Note that, for technical reasons, we will be forced to assume that P is strictly positive, albeit still possibly very close to zero.This will be needed in proving the existence of strong solutions to (1.4)- (1.8); in case of a singular potential, it will be used also for weak solutions (see Remark 2.5 below).However, this is not unprecedented; for example, the same hypothesis was needed in [5] to study long-time dynamics of a similar tumour growth model.Observe also that this kind of reaction term P (ϕ)(σ + χ(1 − ϕ) − µ), with opposite signs in (1.4) and (1.6), implies the conservation of mass for the sum of ϕ and σ, in absence of external sources (i.e.u = v = 0).Indeed, by multiplying (1.4) and (1.6) by 1, integrating over Ω and summing up, we obtain the identity: Finally, the external sources u and v are the controls that we are going to use in our optimal control problem to direct the evolution of the system towards a certain objective.They can be interpreted as external therapies acting on the tumour.In particular, u can represent a radiotherapy directly acting on the proliferation of the tumour cells, whereas v can be seen as a chemotherapy acting on the tumour through drugs.The function (ϕ) is an additional parameter that can be used to distribute the radiotherapy through particular strategies, see for example [7,28].
Regarding the well-posedness and regularity of this kind of system, there are already many contributions to the local version of the model; we just cite [8,18,21].On the nonlocal model, there are fewer ones, but we recall [23], where the authors showed well-posedness for the system without viscosity (i.e.τ = 0) in the case of a regular potential and bounded mobility or of a singular potential and degenerate mobility.Then, for the same system, but without chemotaxis (i.e.χ = 0), in [24] the authors proved strong regularities for the solutions of the system, when considering singular potentials and degenerate mobility.Finally, we also cite [44], where, with an additional regularising term α∂ t µ in (1.4) and τ > 0, the authors proved strong well-posedness for a similar system with full chemotaxis, constant mobility and both regular and singular potentials.However, they consider a different reaction term, introduced by Garcke et al. (see [27,30]) through thermodynamic and kinetic considerations.Moreover, they also perform an asymptotic analysis for α → 0, but they are not able to recover the full strong regularity for the solutions, and only for regular potentials, if part of the chemotactic effect is neglected.Regarding these themes, the main novelty of this work is to prove strong well-posedness for (1.4)- (1.8), in the case of constant mobilities, with full chemotaxis, positive viscosity and both regular and singular potentials, i.e. for the system: ) ) where we take m(ϕ) ≡ m > 0 and n(ϕ) ≡ n > 0 constant.This can be done without resorting to the additional regularisation α∂ t µ, used in [44], which, in general, has little physical meaning.The crucial point is that our different reaction term, with respect to [44], together with the additional assumption that P is strictly positive, still allows us to prove some further regularity on µ.For more details, see Theorems 2.3, 3.5 and 3.7 below.
The other main novelty, as well as the key objective of this paper, is the analysis of an optimal distributed control problem on a non-local tumour growth model, with different therapy techniques.Indeed, up to the author's knowledge, this is the first time that such a problem is considered for tumour growth phenomena in a non-local framework (see for instance [22] for other optimal control problems in non-local phase-field systems related to different applications).While we can count many contributions in the local case (see for instance [6,12,28]), in our setting we only have some applications to parameter identification (see [43]) or inverse identification problems (see [24]).In general, optimal control problems are being widely studied in applied mathematics related to bio-medical issues due to their applicability in predicting and directing possible effects of therapies, we cite for instance [11,42].In the present work, we consider the following optimal control problem: (CP) Minimise the cost functional subject to the control constraints and to the state system (1.9)-(1.13)with constant mobilities.
Here α Ω , α Q , β Ω , β Q , α u , β v are non-negative parameters that can be used to select which targets have to be privileged.Some, but not all of them, can clearly be equal to zero, and their relative ratios determine which strategy is being pursued.The function ϕ Ω is a final target for the tumour distribution, for instance one that could be suitable for surgery or possibly one that, after some time, stabilises on a certain shape; whereas ϕ Q is a desired evolution.In the same way, σ Ω and σ Q are respectively a final target and a desired evolution for the nutrient.Finally, the last two terms in the cost functional penalise large use of radiotherapy or drugs, which could still harm the patient.The aim of the optimal control problem is, then, to find the best therapies u and v, which can lead the evolution of the tumour to the desired targets.Indeed, this is a quite standard tracking-type cost functional, which mostly measures the L 2 distance between the current solutions and some fixed targets.Regarding (CP), in Theorem 4.2 we show the existence of an optimal pair (u, v).Then, through careful analysis of the control-to-state operator and the introduction of the adjoint system, in Theorem 4.8 we prove the first-order necessary optimality conditions, which have to be satisfied by the optimal pair.We would like to stress that proving global boundedness of the solution in the right function spaces, as well as the separation property for the tumour phase parameter, is of utmost importance when analysing the control-to-state operator.On this theme, let us point out that in Theorem 4.4 we prove the Fréchet differentiability of said operator.While Gâteaux-differentiability would be sufficient to derive first-order necessary conditions, we choose to prove a stronger notion of differentiability since our regularity setting allows to get it with relative ease.The point is that the key step in getting higher regularity for the solution is proving the separation property, and, once one has that, all the rest follows practically for free.However, one cannot do without the separation property, even if resorting to Gâteaux-differentiability, since for example it is needed for the first continuous dependence estimate, as well as the well-posedness of the linearised system (see Theorems 3.7 and 4.3).Finally, let us remark that one could go further with the study of the optimal control problem (CP) and try to prove also second-order sufficient conditions, as done for instance in [12].While very interesting from an applicative point of view, this is presently out of the scope of this paper and is left to further investigations.The plan of the paper is the following.In section 2, we introduce the notation and the main hypotheses that we are going to use throughout the paper, and we also prove the existence of weak solutions to (1.4)- (1.8).In section 3, we focus on strong well-posedness of the system with constant mobilities, by proving regularity and continuous dependence results.In section 4, we shift our attention to the optimal control problem, by first proving an existence result.Then, we introduce the linearised system and prove the Fréchet-differentiability of the controlto-state operator.Finally, after studying the adjoint system, we shall prove the main result of the paper, namely the first-order necessary optimality conditions for (CP).

Preliminaries and existence of weak solutions
We now introduce some notation that will be used throughout the paper.We denote with Ω ⊂ R N , N = 2, 3 an open bounded domain with boundary ∂Ω of class C 2 , whereas T > 0 is a fixed final time.The C 2 requirement for ∂Ω is needed for elliptic regularity estimates in Section 3, while for weak solutions one can just assume that ∂Ω is Lipschitz.For convenience, we also denote Q t = Ω × (0, t), for any t ∈ (0, T ].
Next, we recall the usual conventions regarding the Hilbertian triplet used in this context.If we define then we have the continuous and dense embeddings: We denote by •, • V the duality pairing between V * and V and by (•, •) H the scalar product in H. Regarding Lebesgue and Sobolev spaces, we will use the notation • p for the L p (Ω)-norm and • k,p for the W k,p (Ω)-norm, with k ∈ N and 1 ≤ p ≤ ∞.Moreover, we observe that, by elliptic regularity theory, an equivalent norm on W is Finally, we recall the Riesz isomorphism N : It is well-known that for u ∈ W we have N u = −∆u + u ∈ H and that the restriction of N to W is an isomorphism from W to H. Additionally, by the spectral theorem, there exists a sequence of eigenvalues 0 < λ 1 ≤ λ 2 ≤ . . ., with λ j → +∞, and a family of eigenfunctions w j ∈ W such that N w j = λ j w j , which forms an orthonormal basis in H and an orthogonal basis in V .In particular, w 1 is constant.Finally, we recall some useful inequalities that will be used throughout the paper: In particular, we recall the following versions with N = 2, 3: In particular, we recall the following versions with N = 2, 3: Note that all constants C > 0 mentioned above depend only on the measures of the sets and the parameters, not on the actual functions.Now we introduce the structural assumptions on the parameters of our model (1.4)-(1.8):A3.F : R → R ∪ {+∞} can be written as where F 1 is such that the effective domain of for some l ∈ (0, +∞], and it is extended by A4.There exists c 0 > 0 such that AF ′′ (s) + Ba(x) ≥ c 0 ∀s ∈ (−l, l) a.e.x ∈ Ω.
Remark 2.1.Regarding hypotheses A3 and A4 on the potential, we wish to point out the following aspects.Firstly, l = 1 corresponds to the physical case and here F can be taken as the logarithmic potential F sing .Moreover, if l = +∞, F is actually a regular potential.Due to the strong hypothesis A5 on the proliferation function P , in principle F could have any type of growth at infinity.However, by assuming polynomial growth for F as in [23], one can relax the hypothesis on P .Indeed, P can have polynomial growth up to a certain order and the hypothesis P 0 > 0 can be avoided.This will be more clear in Remark 2.4.Secondly, hypothesis A3 is needed to work with a Yosida approximation of the singular potential within the proof.In particular, the logarithmic singular potential shown before, and also the polynomial double-well, satisfy this hypothesis.
Remark 2.2.Concerning hypothesis A2, instead, it is easily seen that Newton and Bessel type kernels satisfy these requirements (see [2]).Another class of kernels which satisfy A2 and A4 is for example the one used in [16] for non-local-to-local asymptotics.Without going into the details, they are a class of kernels J ε , depending on ε > 0, for which a * ε and b ε are finite for any fixed ε, therefore J ε ∈ W 1,1 loc (R N ), but they blow up to +∞ as ε → 0. In this way, also A4 is satisfied if ε is small enough.
Finally, we would like to stress that, in the following, we will extensively use the symbol C > 0 to denote positive constants, which may change from line to line.They will depend only on Ω, T , the parameters and on the norms of the fixed functions introduced in hypotheses A1-A8 and possible subsequent ones.Sometimes, we will also add subscripts on C to highlight some particular dependences of these constants.Above all, when possible, we will focus on the dependence on the parameter τ , since one may be interested in sending it to 0 in some future applications.
We can now start with the first result about existence of global weak solutions for our tumour growth system (1.4)-(1.8).
Theorem 2.3.Under assumptions A1-A8, there exists a weak solution (ϕ, µ, σ) to (1.4)-(1.8),with and the following variational formulation for a.e.t ∈ (0, T ) and for any ζ ∈ V : In particular, there exists a constant C > 0, depending only on the parameters of the model and on the data ϕ 0 , σ 0 and u, v, such that: Remark 2.4.In the case of a regular potential of polynomial growth, one can still prove Theorem 2.3 with less restricting hypotheses on the function P .Indeed, following [23], one can replace hypotheses A3, A5 and A7, respectively, with the following: A3'.F ∈ C 2 (R) and there exist c 1 ∈ R and c 2 > χ 2 A such that Moreover, there exist c 3 > 0 and c 4 ≥ 0 such that A5'. P ∈ C 0 (R) and there exist c 5 > 0 and q ∈ [1, 4] such that In the following Remark 2.7, we provide some hints as to how to modify the proof in this case.
Remark 2.5.Before starting the proof, we would like to briefly comment our strategy.As usual with Cahn-Hilliard type equations, the main issue is to recover in some way also an L 2 bound on µ from the standard energy estimate used in this context.This is generally done by means of an estimate on the mean value of µ, which then allows the application of Poincaré-Wirtinger inequality.To do this, one either starts studying the evolution of the mean value of ϕ, as in standard Cahn-Hilliard-Oono equations (see for instance [17,33,35]), or directly imposes some growth conditions on the potential F , as in [23].In our framework, both these possibilities have to be ruled out, since our reaction term is too complex to give some information on ϕ Ω (t) and we need uniform estimates independent on the regularisation of our possibly singular potential.Therefore, hypothesis A5 on the strict positivity of P , as well as its boundedness, comes into play, together with the structure of our reaction term.Both properties of the proliferation function are needed already in the first energy estimate.
Clearly, this is an alternative way to the additional regularisation αµ t introduced in [44].
Proof of Theorem 2.3.For simplicity of exposition, we proceed with formal a priori estimates.However, we would like to give some details on how to construct a nice approximation framework.First, one needs to approximate the possibly singular potential with a one-parameter family of regular ones.By hypothesis A3, we can employ a Yosida approximation of ∂F 1 , where with this symbol we denote the subdifferential in the sense of convex analysis, which coincides with F ′ 1 inside (−l, l).Moreover, we recall that we have extended F 1 to +∞ outside [−l, l], so that it is proper, convex and lower-semicontinuous by hypothesis.Then, we can approximate ∂F 1 with a family where I stands for the identity operator.By the properties of the regularisation, we know that F ′ 1,λ is 1/λ-Lipschitz for any λ > 0.Then, we employ the Moreau regularisation of F 1 by defining Finally, we set The second level of approximation is a Faedo-Galerkin discretisation scheme with discrete spaces W n , generated by the eigenvectors of the operator N .Note that, within this discretisation, all eigenfunctions satisfy homogeneous Neumann boundary conditions, and this turns useful when integration by parts is needed in the following estimates.For more details about the fully approximated problem, we refer the reader to the proof of [44,Theorem 2.1].Here, we assume to be working with both approximations, and we show uniform estimates on the variables, independent of the regularisation parameters.For simplicity, however, we suppress the dependences on these additional parameters.
For the main a priori energy estimate, we substitute (2.5).Note that all these substitutions are allowed in the discretisation framework.By adding the resulting identities together and using standard results on differentiation in Bochner spaces, we get: where In particular, we used hypothesis A2 on the symmetry of J to get the corresponding term in the expression of E J (t).Now, we can rewrite the term involving P (ϕ) and, by using A5, Cauchy-Schwarz and Young's inequalities with carefully chosen constants, we can estimate it in the following way: Then, we can also treat the terms on the right-hand side of (2.7) by means of Cauchy-Schwarz and Young's inequalities, together with hypothesis A7, indeed: Now, by using A2 and Cauchy-Schwarz and Young's inequalities, we can estimate the energy E(t) from below as where C depends only on the measure of Ω and χ.In a similar way, one can also estimate the initial energy E(0) from above as owing to assumption A8.It may be useful to note that at the discretisation level, one has to work with F λ (Π n ϕ 0 ), where Π n ϕ 0 is the projection of ϕ 0 onto the discrete spaces.In this case, one has to use the fact that, by construction, F λ as at most quadratic growth and the hypothesis on ϕ 0 ∈ H, together with the properties of projections.Then, after passing to the limit in the Galerkin discretisation, one can also recover an upper bound independent of λ, by using the fact that F λ ≤ F by construction and the hypothesis F (ϕ 0 ) ∈ L 1 (Ω).For more details, see [44,Theorem 2.1].From the previous inequality and by using A6, we get: 3) and we multiply both sides by a constant M > 0 such that Then, by using hypotheses A5, A6, A7 and Cauchy-Schwarz and Young's inequalities, we get: (2.9) Finally, to compensate also the term C ∇ϕ 2 H , we substitute ζ = −∆ϕ in (1.5), which is possible within Galerkin's discretisation, and, integrating by parts, without getting any extra boundary terms, due to the discrete spaces, we obtain: Next, we add and subtract χ 2 ∇ϕ 2 on the right-hand side, and by using A2, A4, Cauchy-Schwarz and Young's inequalities, we obtain that (2.10) At this point, we can add together (2.8), (2.9) and (2.10) and integrate on (0, t), for any t ∈ (0, T ) to get:

Now, we can apply Gronwall's inequality and infer that
where the constant C depends only on Ω, T and all the parameters introduced in hypotheses A1-A8, but not on the Galerkin approximation parameter n.At this stage, C can still depend on λ, but this dependence can be eliminated after passing to the limit as n → +∞, as explained before.Moreover, it is straight-forwards to deduce that also Finally, by comparison in (1.6), from (2.11) it follows that Note that, to be rigorous within Galerkin's approximation, one would have to test (1.6) for any ζ ∈ W n and use the orthogonality properties of the eigenvectors of N , in order to recover the same estimate.Now, all that remains to do is to pass to the limit in the approximation framework.Indeed, by the uniform estimates found before, one can extract weakly and strongly convergent subsequences, that can be used to pass to the limit in the discretised variational form as n → +∞.Then, by weak lower-semicontinuity and the properties of the Yosida approximation, one can recover uniform estimates in λ and then pass to the limit as λ ց 0. This is done very similarly to [44, Theorem 2.1], so we leave the details to the interested reader.Finally, regarding initial data, we observe that ϕ ∈ H by standard results, so they make sense respectively in V and H.This concludes the proof of Theorem 2.3.
Remark 2.6.In case of a singular potential, i.e. for l < +∞, the uniform boundedness of which is the case for the logarithmic potential, one can also recover that |ϕ(x, t)| < l for a.e.(x, t) ∈ Q T .Indeed, by comparison in (1.5), since µ ∈ L 2 (0, T ; H), also F ′ (ϕ) ∈ L 2 (0, T ; H), which readily implies the thesis.By improving this kind of hypothesis (see B2), in Section 3 we will also be able to prove the so-called strict separation property for ϕ.
Remark 2.7.As anticipated before in Remark 2.4, we would like to give some hints regarding the differences in the proof when considering only a regular potential of polynomial growth, actually leaving most of the details to the interested reader.
The main energy estimate is still (2.7), however this time one can keep the reaction term H on the left-hand side and estimate differently the energy E(t) and the term ( (ϕ)u, µ) H . Indeed, by using Hölder and Young's inequalities and the new hypothesis A3', one has: where this holds for any α > 0. Now, if χ = 0, one keeps the estimate as it is, otherwise if χ > 0, one can choose α = 1 χ 1 2 − δ with δ ∈ 0, 1 2 and obtain: Then, one can test (1.5) with ζ = 1, which is also possible within Galerkin's discretisation, since the first eigenfunction of N is constant, and integrate on (0, t), for any t ∈ (0, T ).Consequently, by using A2, A3', the continuous embedding of L 2 (Ω) into L 1 (Ω) and Young's inequality, one gets: where ε > 0 is yet to be chosen.Therefore, by using also A7', one can estimate the source term as follows: )) and renaming the constants.Then, by integrating in time and using Gronwall's lemma, the following uniform estimate can be obtained: (2.12) Moreover, observe that by testing again (1.5) with ζ = 1, one can see that uniformly with respect to the parameters.Therefore, by using Poincaré-Wirtinger's inequality, it follows that also: Then, one tests (1.5) by −∆ϕ to recover the remaining L ∞ (0, T ; V ) estimate on ϕ and proceeds by comparison for the H 1 (0, T ; V * ) estimate for σ.Here we have the second main difference, which concerns the reaction term R : As a matter of fact, observe that formally, by using the embedding L 6/5 (Ω) ֒→ V * and Hölder's inequality, one has: Indeed, one already knows that P (ϕ)(σ + χ(1 − ϕ) − µ) ∈ L 2 (0, T ; H) by (2.12) and, by using hypothesis A5' with q ≤ 4, one can estimate: thanks to the embedding V ֒→ L 6 (Ω).Hence, one obtains that and by comparison in (1.6) also: Finally, one can similarly pass to the limit in the discretisation framework.For more details, we refer to [23,Theorem 2.1], where the same system without viscosity is studied.This would conclude the proof of Theorem 2.3.

Strong well-posedness
From now on, we consider the simplified version (1.9)-(1.13) of our starting system, obtained by considering constant mobilities.Without loss of generality, we can fix the values of the constant mobilities as m = n = 1.Then, we have the following system: paired with boundary and initial conditions: Moreover, we also assume stronger hypotheses on the parameters and on the data.First let us observe that, to gain further regularity, we would need to assume that J ∈ W 2,1 loc (R N ), but this hypothesis is incompatible with widely used convolution kernels, such as those of Newton or Bessel type.However, following [2, Definition 1], we can still introduce a suitable class of kernels, which includes the ones mentioned before and satisfies our needs.Indeed, we recall the following definition: ) is admissible if it satisfies the following conditions: • J is radially symmetric and non-increasing, i.e.J(•) = J (|•|) for a non-increasing function J : R + → R.
• There exists We also need stronger hypotheses on F and on the initial data to guarantee the strict separation property.Indeed, we assume the following: B2. F ∈ C 4 ((−l, l)) and the constant c 0 of hypothesis A4 is such that c 0 > χ 2 ≥ 0.Moreover, we assume that lim and there exists P 0 > 0 such that and it is separated, i.e. there exists s 0 ∈ (0, l) such that ϕ 0 L ∞ (Ω) ≤ s 0 .
Remark 3.2.We recall that if J satisfies B1, then, by [2, Lemma 2], for any p ∈ (1, +∞) there exists a constant b p > 0 such that: This is just what one needs to control second-order derivatives of the convolution term J * ϕ.
Remark 3.3.Assumption B2 is needed to have some coercivity, due to the presence of chemotaxis, and a control on how F ′ blows up at the extrema, in order to prove the separation property.Moreover, on hypothesis B5, note that ϕ 0 being separated does not take away from practical situations.Indeed, if F is regular, it simply means that ϕ 0 ∈ L ∞ (Ω).Whereas if F is singular, we recall that the actual minima of the logarithmic potential are not exactly in ±1, but in two intermediate values close to them (i.e. the actual pure phases in this setting).Therefore, we are just asking that F ′ (ϕ 0 ) does not explode, in a uniform way.
Remark 3.4.We also observe that, since ϕ 0 ∈ H 2 (Ω) and it is separated, one can freely differentiate the potential F ′ (ϕ 0 ).Therefore, by comparison in the variational formulations of (3.1) and (3.2) at time 0, we also have that: Indeed, by considering (2.3) and (2.4) at time t = 0, we have that for any ζ ∈ V Then, by substituting the first one into the second one and isolating the terms containing µ(0), we infer that µ(0) satisfies the following variational problem: where Then, µ(0) is a weak solution of the following elliptic partial differential equation: where the homogeneous Neumann boundary condition is a consequence of the test functions ζ being in V = H 1 (Ω).Therefore, since, by B5, A5 and Sobolev embeddings, we have that f ∈ H, (1 + τ P (ϕ 0 )) ∈ C 0 (Ω) and it is non-negative, by standard elliptic regularity theory we can infer that µ(0) ∈ W .Then, by comparison, we also have that ϕ t (0) ∈ H.
We have the following result about existence of strong solutions: Theorem 3.5.Under assumptions A1-A4 and B1-B5, there exists a strong solution to (3.1)-(3.5),with the following regularity: In particular, there exists a constant C > 0, depending only on the parameters of the model and on the data ϕ 0 , σ 0 and u, v, such that: Moreover, ϕ satisfies the strict separation property, i.e. there exists s * ∈ (s 0 , l) such that Proof.We proceed with only formal estimates for the sake of exposition.These can be made rigorous by going back to the Galerkin discretisation and by using finite differences operators for time derivatives of higher order.Before starting, we observe that, from the weak regularities of Theorem 2.3, we can immediately have more regularity on some terms, without extra assumptions.In particular, regarding the reaction term, we can say that In turn, this implies, by comparison in (3.1), that which means that we also have a uniform bound on µ in L 2 (0, T ; W ), i.e.
For the first estimate, we test equation (3.3) by ∂ t (σ − χϕ) and, by using Cauchy-Schwarz and Young's inequalities, we get: Then, by integrating on (0, t), for any t ∈ (0, T ), and by using Gronwall's lemma, hypothesis B5 and (2.6), we infer that Also, since ϕ ∈ L ∞ (0, T ; V ) by (2.6), we can conclude that Next, we test again (3.3) by −∆(σ − χϕ) and with analogous estimates we get: Then, again by integrating on (0, T ) and using (2.6) and (3.11), we infer that Now, for the main estimate, we test (3.1) by ∂ t µ and the time-derivative of (3.2) by −∂ t ϕ and sum them up.To be precise, these time-derivatives should be done by using the approximation , and then by sending h → 0. However, to give the idea of the procedure, we stick to formal estimates.Indeed, after cancellations, we obtain: By using assumptions A2, A4 and Cauchy-Schwarz and Young's inequalities, we infer that Now, in order to estimate the term I 1 , we use Leibniz's rule in time, generalised Hölder's inequality, Young's inequality, hypotheses B3, B4 and the embeddings V ֒→ L 6 (Ω) and W ֒→ L ∞ (Ω), as follows: In a similar way, we also estimate the term I 2 : By putting all together, we have: where Now we can estimate function G at time t and at time 0, by using B3, Remark 3.4, Hölder and Young's inequalities as follows: Finally, for some 0 < α < min{P 0 , 1}/2, after integrating on (0, t), for any t ∈ (0, T ) and using Remark 3.4, we obtain that where the term in front of ϕ t 2 H on the right-hand side is in L 1 (0, T ), thanks to (2.6), (3.12), (3.10) and B5.Recall also that µ(0) V ≤ C and ϕ t (0) H ≤ C by Remark 3.4.Then, by Gronwall's lemma, we infer the following uniform estimates: In particular, we now observe that then, by comparison in (3.1), we have that where both these inclusions hold uniformly with respect to the parameters and the data.Therefore, we also have the uniform bound: which in turn, thanks to the embedding W ֒→ L ∞ (Ω), also implies that Now, we mostly follow the procedure used in the proof of [44, Theorem 2.5], in order to prove the strict separation property and get more regularity for ϕ.Even if the argument is very similar to the referenced one, we repeat it for the sake of completeness.First, we rewrite equation (3.2) as then we sum the term −χ 2 ϕ to both sides, to obtain the following evolution equation in Q T : Next, regarding the right-hand side, we observe that µ ∈ L ∞ (0, T ; W ) by (3.15), σ − χϕ ∈ L 2 (0, T ; W ) by (3.13) and also J * ϕ ∈ L ∞ (0, T ; H 2 (Ω)) by B1, since ϕ ∈ L ∞ (0, T ; H).Indeed, one just has to pass the derivatives onto the convolution kernel J, by using also Remark 3.2.
Then, it follows that we have the uniform bound: Additionally, by adding −χϕ t on both sides, we can rewrite (3.3) as where the right-hand side is in L ∞ (0, T ; H) by (3.14).Then, since σ 0 − χϕ 0 ∈ L ∞ (Ω) thanks to B5, by parabolic regularity theory (see [36,Theorem 7.1, page 181]), it follows that σ − χϕ is uniformly bounded in L ∞ (Q T ).Therefore, since H 2 (Ω) ֒→ L ∞ (Ω), we can actually infer that there exists a constant M such that Now we can prove the strict separation property.By hypothesis B5, we know that there exists s 0 ∈ (0, l) such that ϕ 0 L ∞ (Ω) ≤ s 0 .Moreover, by hypothesis B2, we also know that Therefore, there surely exists s * ∈ (s 0 , l) such that ) where M is the one of (3.18).Now, we test (3.16) by (ϕ − s * ) + in H and we integrate on (0, t), for any t ∈ (0, T ): Then, observe that (ϕ − s * ) + = 0 if ϕ < s * , therefore we can restrict the integrals on Q t ∩ {ϕ > s * }.In this way, we can infer that where (ϕ 0 − s * ) + = 0 by B5, since s * > s 0 .Moreover, by (3.19) and (3.18), it follows that Consequently, we have that but the left-hand side is clearly non-negative, so the only possibility is that (ϕ(x, t)− s * ) + = 0 for a.e.(x, t) ∈ Q T , which implies ϕ(x, t) ≤ s * a.e. in Q T .By repeating the same procedure with −(ϕ + s * ) + , one can also recover the estimate from below, i.e. ϕ(x, t) ≥ −s * for a.e.(x, t) ∈ Q T .Therefore, the separation property (3.9) is proved.Now, exactly as in [44, Section 3.7], we test the gradient of (3.16) by ∇ϕ|∇ϕ| p−2 with p > 1 to be set later.Note that, to be rigorous, here one should use a truncation argument, even within the Galerkin discretisation.However, by proceeding formally, we have: Next, by using hypotheses A4 and B2, we arrive at the inequality: Now, we integrate both sides over (0, t), for any t ∈ (0, T ), and we use Hölder's inequality and the generalised Young's inequality with conjugate exponents p and p/(p − 1), in order to get: where, in the last line, we used the fact that, for p > 1, the real function x → x p p−1 is strictly increasing and the supremum is preserved under increasing functions.Then, passing to the supremum over (0, t) also on the left-hand side, we get: where the terms on the right-hand side are uniformly bounded if p ≤ 6 by respectively B5, (2.6) and (3.17).Then, we can conclude that Finally, for any i, j = 1, 2, 3 we apply the differential operator ∂ x i x j to (3.16) and we test the resulting equation by ∂ x i x j ϕ, indeed: Then, by using A4, Hölder's and Young's inequality and B2 together with the already proven separation property (see the following Remark 3.6 for more details), we infer that , therefore, by summing on i, j = 1, 2, 3, integrating on (0, t) for any t ∈ (0, T ) and using Gronwall's lemma, together with B5, (3.21) and (3.17), we get the uniform estimate: In particular, since we already knew that σ − χϕ ∈ L 2 (0, T ; W ) by (3.13), we also infer that To conclude, we observe that since F ∈ C 3 ((−l, l)) and the separation property holds, starting from (3.22) we can also get that Then, by comparison in (3.16), we can also see that meaning that we also have the uniform bound Finally, starting from all these estimates, a strong solution can then be recovered, up to passing to the limit in the discretisation framework.Moreover, estimate (3.8) can be deduced by weak lower-semicontinuity.This concludes the proof of Theorem 3.5.
Remark 3.6.Since F ∈ C 4 ((−l, l)) by B2 and the separation property (3.9) holds, one can freely use the chain rule to compute the derivatives of F ′ (ϕ), as we already did in the previous proof.In particular, since F and its derivatives are locally bounded in (−l, l), by (3.9) we can deduce that where the constant C > 0 depends only on F , s * and M .
then, up to adding and subtracting some terms, they solve: paired with boundary and initial conditions: Now, for the main estimate, we test equation (3.26) by µ in H and, by using Cauchy-Schwarz and Young's inequalities and the fact that P and are Lipschitz functions on bounded intervals, together with the uniform estimates on the L ∞ (Q T )-norms of ϕ 1,2 , we obtain: where ε > 0 is to be chosen later and ∞ ∈ L 1 (0, T ), since every term is bounded in L 2 (0, T ; H 2 (Ω)) and H 2 (Ω) ֒→ L ∞ (Ω).Next, we test equation (3.27) by −ϕ t in H and we use Leibniz's rule in time and rewrite the resulting terms accordingly.Then, adding and subtracting (Baϕ, ϕ t ) H = 1 2 d dt (Baϕ, ϕ) H and by using the separation property together with the fact that F ′′ is Lipschitz on [−s * , s * ], we get: (3.24).Then, we test equation (3.27) by δµ in H, with δ > 0 to be chosen later, and, again since F ′ is Lipschitz on [−s * , s * ] and the separation property holds, we infer that H , where we used Young's inequality with γ > 0 to be set later and ε > 0 is the same one used before.Finally, we test equation (3.28) by (σ − χϕ) and, with similar techniques, we find that ∞ ∈ L 1 (0, T ).Therefore, by summing up the previous four inequalities, after cancellations we get: where Now, we can independently choose first ε > 0, γ = γ(τ ) > 0 and then δ = δ(τ ) > 0, for any value of τ > 0, such that Moreover, by using Lagrange's theorem and the fact that ϕ 1 , ϕ 2 ∈ C 0 (Q T ) by standard embeddings, we can say that F ′ (ϕ 1 (x, t)) − F ′ (ϕ 2 (x, t)) = F ′′ (s)ϕ(x, t) for some s = s(x, t) ∈ R, which is at least a measurable function, for a.e.(x, t) ∈ Q T .Then, by hypotheses A4 and B2, together with Cauchy-Schwarz and Young's inequalities, we can estimate G(t) from below as Consequently, by integrating (3.32) on (0, t) for any t ∈ (0, T ), we infer that and, by using Gronwall's lemma, we obtain the first continuous dependence estimate: Observe that here the constant C τ on the right-hand side depends on τ , but this can be avoided under mild additional assumptions (see Remark 3.9 below).Next, we test the gradient of equation (3.27) by ∇ϕ in H and, after a careful rewriting of the terms, we get that Hence, by adding and subtracting χ 2 (∇ϕ, ∇ϕ) H and by using hypotheses A2, A4, B2, the Lipschitz properties of F ′′ and Hölder and Young's inequalities, we infer that Then, by integrating on (0, t), for any t ∈ (0, T ), and applying Gronwall's lemma, together with (3.34), we obtain the estimate and by comparison with the previous estimate on σ − χϕ ∈ L 2 (0, T ; V ), we also get that Now, we test equation (3.28) by ∂ t (σ − χϕ) and, by using hypothesis B3 and Cauchy-Schwarz and Young's inequalities, we infer that Therefore, by integrating in (0, t) for any t ∈ (0, T ) and then applying Gronwall's lemma and (3.34), we deduce that Next, we test equation (3.28) by −∆(σ − χϕ) and, by similar methods, we infer that then, by integrating on (0, T ) and using (3.34) and (3.37), we get the estimate: Moreover, in a similar way, we can also test equation (3.26) by −∆µ and then integrate on (0, T ) to get: Finally, for any i, j = 1, 2, 3, we apply the differential operator ∂ x i x j to (3.27), which makes sense in H, and we test the resulting equation by ∂ x i x j ϕ.Then, after careful rewriting of the terms arising from the derivatives of F , we get: Hence, by using hypotheses A4, B1, B2, Remark 3.6 and the generalised Hölder's and Young's inequalities, together with (2.1) and (2.2), we infer that where ϕ 2 4 6 ∈ L ∞ (0, T ), since, by (3.22), ϕ 2 ∈ L ∞ (0, T ; H 2 (Ω)), which is embedded into L ∞ (0, T ; W 1,6 (Ω)).Therefore, by summing on i, j = 1, 2, 3, integrating on (0, t), for any t ∈ (0, T ), and using Gronwall's lemma, together with the previous estimates (3.35), (3.38), (3.39), we infer that Moreover, by comparison with (3.38), we also deduce that This concludes the proof of Theorem 3.7, since all the constants that appear in front of the estimates depend only on parameters and possibly on the norms of {(ϕ 0 i , σ 0 i )} i=1,2 , but not on their difference.
Remark 3.9.We wish to point out that, under the additional assumption that 0 < τ ≤ M , for some M > 0, one can show that all constants C τ appearing in the proof are actually independent of τ for any τ ∈ (0, M ].This can be useful in applications, where τ is generally kept very small and is possibly tending to 0. Indeed, going back to (3.33), one can first choose ε > 0 and 0 < γ < 1/M in such a way that Then, C γ is now fixed independently of τ ∈ (0, M ], therefore one can choose δ > 0 small enough such that At this point, it is clear that the constants C τ , appearing from (3.34) onwards, depend only on the other parameters and on M , which is fixed.
Remark 3.10.In the previous proof, in order to close the first continuous dependence estimate (3.34), we need a positive term on the left-hand side for µ 2 H .This is obtained by testing (3.27) by δµ, with a small constant δ, and leads to either constants depending on τ or to an additional hypothesis on τ as in Remark 3.9.Another possibility could be to directly use hypothesis B3 on P 0 > 0 on the term (P (ϕ 1 )µ, µ), which can be brought on the left when testing (3.26) by µ.Namely, inequality (3.31) would become which then, by B3, Cauchy-Schwarz and Young, would immediately imply thus removing the need of testing (3.27) by δµ.However, we choose the first method, since the hypothesis P 0 > 0 could be restrictive in applications, so we would like to avoid heavy use of it, even if it is necessary in the proof of Theorem 3.5.Moreover, the procedure we used could possibly be adapted to even different reaction terms, provided that one is able to prove existence of strong solutions.The same strategy shall be pursued also in Section 4.

Optimal control problem
From now on, we consider the initial data ϕ 0 , σ 0 , satisfying B5, fixed.We recall the optimal control problem that we want to study: (CP) Minimise the cost functional subject to the control constraints and to the state system (3.1)-(3.5).
Regarding the parameters at play, we make the following hypotheses: , with u min ≥ 0 a.e. in Ω.Let also M ′ > 0 be such that for any u ∈ U ad and for any v ∈ V ad Remark 4.1.The hypothesis u min ≥ 0 comes from modelling assumptions, see also [7] and references therein.Moreover, we need the stronger hypothesis σ Ω ∈ H 1 (Ω) to prove wellposedness of the adjoint system in Section 5.3.
Finally, we would like to comment on the structure of U ad .While the best choice, from an applicative viewpoint, would be a subset of L ∞ (Q T ) with box constraints, e.g.{u ∈ L ∞ (Q T ) | u min ≤ u ≤ u max a.e. in Q T }, the presence of u as a source term in the Cahn-Hilliard equation demands higher regularity, if one wants to achieve strong well-posedness.Indeed, we also need a bound on the H 1 (0, T ; H)-norm, as can be seen by B4 and Theorems 3.5 and 3.7.By Theorems 3.5 and 3.7, we know that for any (u, v) ∈ U ad × V ad there exists a unique strong solution (ϕ, µ, σ) ∈ X to (3.1)-(3.5),where therefore the optimal control problem (CP) is well-defined.Our goal is to prove existence of an optimal control and then find the first-order necessary optimality conditions.Theorem 4.2.Assume hypotheses A1-A4, B1-B5 and C1-C4.Then the optimal control problem (CP) admits at least one solution (u, v) ∈ U ad × V ad , such that if (ϕ, µ, σ) is the solution to (3.1)-(3.5)associated to (u, v), one has that where (ϕ n , µ n , σ n ) are the solutions to (3.1)-(3.5)associated to (u n , v n ), with the regularities given by Theorem 3.5.Since {(u n , v n )} n∈N ⊂ U ad × V ad , we have that {u n } is uniformly bounded in L ∞ (Q T ) ∩ H 1 (0, T ; H) and {v n } is uniformly bounded in L ∞ (Q T ), therefore, by the Banach-Alaouglu theorem, we can deduce that there exists Moreover, since U ad × V ad is convex and closed in the larger space H 1 (0, T ; H) × L 2 (Q T ), it is also weakly sequentially closed and thus (u, v) ∈ U ad × V ad .Now, consider the solutions (ϕ n , µ n , σ n ) ∈ X corresponding to (u n , v n ) for every n ∈ N, then, by the uniform estimate (3.8), we have that these solutions are uniformly bounded with respect to n in the spaces of strong solutions, given by Theorem 3.5.Therefore, again by Banach-Alaoglu, we can say that, up to a subsequence, In particular, by the compact embeddings of Aubin-Lions-Simon (see [46,Section 8,Corollary 4]), it follows that ϕ n → ϕ strongly in C 0 ([0, T ], H s (Ω)) for any 0 < s < 2, which implies that ϕ n → ϕ strongly in C 0 (Q T ), since H s (Ω) ֒→ C 0 (Q T ) if s > 3/2 for N = 3.In particular, by uniform convergence, ϕ still satisfies the strict separation property.This fact, by the continuity of the functions P , and F ′ , implies that Moreover, in the same way, we also have that σ n → σ strongly in C 0 ([0, T ], H).Therefore, by using the weak and strong convergences written above, it is easy to show that, starting from the equations (3.1)-(3.5)satisfied by (ϕ n , µ n , σ n ) with respect to (u n , v n ), also the limit functions (ϕ, µ, σ) ∈ X satisfy (3.1)-(3.5)with respect to (u, v).Then, we can infer that Now, we observe that the functional J , if defined on the larger space is strongly continuous and convex, therefore it is weakly lower-semicontinuous with respect to this weaker topology.Consequently, we can deduce that which means that (u, v) ∈ U ad × V ad is an optimal control.This concludes the proof of Theorem 4.2.

Linearised system
As an ansatz for the Fréchet-derivative of the control-to-state operator, which maps any (u, v) ∈ U ad × V ad into the corresponding solution of the state system, we now start studying the linearised version of our system.Indeed, we fix an optimal state (ϕ, µ, σ) ∈ X corresponding to (u, v) ∈ U ad × V ad and linearise near (u, v): Then, by approximating the non-linearities at the first order of their Taylor expansion, we see that (ξ, η, ρ) satisfy the equations: together with boundary and initial conditions: Observe that to derive the linearised system, we needed to start from h ∈ L ∞ (Q T ) ∩ H 1 (0, T ; H) and k ∈ L ∞ (Q T ), however the system actually makes sense even for h, k ∈ L 2 (0, T ; H).Indeed, we will prove well-posedness in this more general case.
Proof.We proceed with only formal estimates, which can be done rigorously through a Faedo-Galerkin discretisation scheme, with discrete spaces made of eigenvectors of the operator N .Then, one can pass to the limit in a standard matter to recover the estimates also in the continuous case.For the first estimate, we test in H equation (4.5) by η and we get: Then, by using hypothesis B3 and Cauchy-Schwarz and Young's inequality with ε > 0 to be chosen later, we obtain: where we recall the uniform bound on σ +χ(1−ϕ)−µ 2 ∞ ∈ L 1 (0, T ), since H 2 (Ω) ֒→ L ∞ (Ω), and on u 2 ∞ ∈ L ∞ (0, T ) by B4.Next, we test (4.6) in H by −ξ t , which is possible within the discretisation, and, by using Leibniz's rule for the time-derivative, we get: We further estimate the right-hand side by using Hölder's and Young's inequalities and by recalling that, by Theorem 3.5, we have the separation property for ϕ, indeed: where ) by Theorem 3.5.Moreover, we test (4.7) in H by ρ − χξ and, by similar techniques, we infer that 1 2 Then, by summing up all three inequalities, after cancellations, we get: Observe that, by hypothesis A4 and Young's inequality, we can infer that moreover, by the initial conditions, we know that G(0) = 0. Finally, we test again (4.6) in H by δη to obtain: . By using Remark 3.6, Cauchy-Schwarz and Young's inequalities with γ > 0 to be chosen later and the same ε > 0 used before, we can estimate At this point, we can sum this last inequality to (4.10), integrate everything over (0, t), for any t ∈ (0, T ), and infer that Now, exactly as in the proof of Theorem 3.7 (see equation (3.33)), we can respectively choose ε > 0, γ = γ(τ ) > 0 and then δ = δ(τ ) > 0 such that the constants above are strictly positive.Therefore, we can apply Gronwall's inequality and recover the estimate: with C τ > 0, depending only on the parameters of the system and on τ > 0. By reasoning as in Remark 3.9, this dependence on τ can be avoided, if one additionally assumes that the range of values that τ can assume is bounded from above.For simplicity, from now on we will remove the explicit dependence on τ from the constants, keeping in mind that the only constant that could depend on τ is the one in (4.11).Now, we can test (4.6) in H by −∆ξ, which makes sense in the Galerkin discretisation, and by using integration by parts formulae, without any extra boundary terms, due to the discrete spaces, we have: In addition, if we add χ 2 (∇ξ, ∇ξ) H on both sides of the equality, then, by rearranging the terms and by using hypothesis A4 and Cauchy-Schwarz and Young's inequalities, we infer that In order to estimate the last term, we use Remark 3.6, Hölder's inequality, (2.1) and the generalised Young's inequality as follows: Then, going back to (4.12) and integrating on (0, t) for any t ∈ (0, T ), we arrive at and, by using (4.11) together with Gronwall's inequality, we deduce that Moreover, since ρ − χξ ∈ L 2 (0, T ; V ) by (4.11), we also infer that Finally, we test (4.7) in H by ∂ t (ρ − χξ), which is still possible within the discretisation, and by usual techniques we get: Then, by integrating on (0, t) for any t ∈ (0, T ) and by using Gronwall's lemma and (4.11), we infer that and, by (4.13), we also get Before concluding, observe that, by comparison in (4.5) and (4.7), we can also deduce that Indeed, due to the regularity of strong solutions given by Theorem 3.5, one just has to notice that At this point, with all these uniform estimates, it is easy to pass to the limit in the discretisation framework and prove the existence of a strong solution with the prescribed regularities.Moreover, since the system is linear, it is straightforward to prove the uniqueness of the solution from the energy estimate (4.11).This concludes the proof of Theorem 4.3.

Differentiability of the control-to-state operator
We now want to study the control-to-state operator S, which associates to any control (u, v) ∈ U ad × V ad the corresponding solution of the system (3.1)-(3.5).We introduce the following space: Observe that the space of strong solutions X is continuously embedded into Y, which is exactly the space where we proved the continuous dependence estimates.Indeed, from Theorem 3.5 and Theorem 3.7 we respectively know that Indeed, by hypothesis C3, we can take: Note that, in U R × V R , the continuous dependence estimate of Theorem 3.7 holds with K depending only on R and the fixed data of the system.Our aim is to show that S : U R ×V R → W is also Fréchet-differentiable in the larger space: Indeed, we can prove the following theorem: together with boundary and initial conditions: where: Before going on, we can rewrite in a better way the terms F h , U h and Q h , by using the following version of Taylor's theorem with integral remainder for a real function f ∈ C 2 at a point x 0 ∈ R: Indeed, with straightforward calculations one can see that and also, up to adding and subtracting some additional terms, that where Regarding these last remainder terms, we can immediately say that there exists a constant C > 0, depending only on the parameters, Λ and T , such that Indeed, by using the fact that F ∈ C 3 and ϕ, ϕ are bounded uniformly in L ∞ (Q T ) with values in [−s * , s * ] by Theorem 3.5, we have that The other two terms are done in an analogous way, by exploiting hypothesis C4.Moreover, for later purposes, we also observe that Indeed, by using Lebesgue's theorem to differentiate inside the integral sign, Jensen's inequality, Fubini's theorem, the embedding H 2 (Ω) ֒→ W 1,6 (Ω), F ∈ C 4 , the separation property and the uniform bound on ϕ, ϕ in L ∞ (0, T ; H 2 (Ω)) as before, we have: Then, by taking the essential supremum on (0, T ) we deduce (4.23).Now we can start with a priori estimates on the system (4.17)-(4.21).Firstly, we test (4.17) in H by ζ and, by using (4.22),Cauchy-Schwarz and Young's inequalities, the embedding H 2 (Ω) ֒→ L ∞ (Ω) and the local Lipschitz continuity of ∈ C 1 , we get: where ε > 0 from Young's inequality is to be chosen later.Next, to estimate the term (Q h , ζ) H , we use use again (4.22), Hölder and Young's inequalities, the embedding V ֒→ L 4 (Ω) and the local Lipschitz continuity of P ∈ C 1 , indeed: where again ε > 0 is yet to be chosen and without loss of generality, up to changing the constants C > 0, can be thought the same as before.Then, by putting all together and integrating on (0, t) for any t ∈ (0, T ), we have: At this point, we use the continuous dependence estimate found in Theorem 3.7 and the fact that σ + (1 − χ)ϕ − µ 2 ∞ ∈ L 1 (0, T ) uniformly, to obtain: where ϕ t H 2 (Ω) ∈ L 2 (0, T ) by Theorem 3.5.Then, by integrating on (0, t), for any t ∈ (0, T ), and, by exploiting hypothesis A4 and the continuous dependence result of Theorem 3.7, we find that For the third estimate, we test (4.19) in H by θ − χψ and with similar techniques we find that Therefore, by integrating on (0, t) for any t ∈ (0, T ) and by using again Theorem 3.7, we obtain: Finally, we test (4.18) in H by δζ, with δ > 0 to be chosen later, getting: Exactly as we already did in the proof of Theorems 3.7 and 4.3, by using Cauchy-Schwarz and Young's inequalities with γ > 0 to be chosen later and the same ε > 0 used before, up to changing the constants C > 0, we can estimate Then, by integrating on (0, t), for any t ∈ (0, T ) and using Theorem 3.7, we obtain: Now, we can sum the inequalities (4.24), (4.25), (4.26) and (4.27), so that after cancellations and after observing that we arrive at the inequality: Observe that once again, exactly as in the proof of Theorem 3.7 (see equation (3.33)), we can respectively choose ε > 0, γ = γ(τ ) > 0 and then δ = δ(τ ) > 0 such that the constants above are strictly positive, Therefore, we can apply Gronwall's inequality and recover the estimate: where the constant C τ > 0 depends only on the parameters of the system, R, Λ and possibly on τ .As in the previous proofs, the dependence on τ can be avoided by arguing as in Remark 3.9.
To conclude, we just need two more estimates.First we test the gradient of (4.18) by ∇ψ in H and, after using the chain rule and adding and subtracting χ 2 (∇ψ, ∇ψ) H , we have: Then, by using (4.22), (4.23), F ∈ C 3 , Remark 3.6, Hölder's inequality, (2.1) and Sobolev embeddings, we can infer that where ∇ϕ 4 6 ∈ L ∞ (0, T ).Therefore, by integrating on (0, t) for any t ∈ (0, T ), using Theorem (3.7) and the previous estimate (4.28), after applying Gronwall's lemma we obtain the following: . Moreover, by comparison with the estimate for ∇(θ − χψ), we can also see that . Finally, we test (4.19) in H by ∂ t (θ −χψ) and, with Cauchy-Schwarz and Young's inequalities, we get: Regarding Q h 2 H , one can argue exactly like we did for the first estimate and then, after integrating in time and applying Theorem 3.7, arrive at: . Hence, by applying Gronwall's lemma and the previous estimate (4.28), we infer that , and then also by comparison In the end, by putting all together, we showed that , which is exactly (4.16) with s = 4 > 2 and so we are done.

Adjoint system and first-order necessary conditions
In order to write down the necessary conditions of optimality and make them useful for applications, we now need to introduce the adjoint system to the optimal control problem (CP).Indeed, we fix an optimal state (ϕ, µ, σ) = S(u, v), then, by using the formal Lagrangian method with adjoint variables (p, q, r), one can find that the adjoint system, which is formally solved by these variables, has the following form: together with the following boundary and final conditions: First, we prove the well-posedness of this adjoint system in the following theorem.
Proof.We observe that (4.29)-(4.33) is a backward linear system, so once we prove wellposedness through an energy estimate, the uniqueness of the solution will follow easily.Also in this case the proof can be made rigorous through a Galerkin approximation scheme, the details of which will be left to the reader.We will limit ourselves to derive formal energy estimates, which can then be used to pass to the limit in the approximation.We start by testing (4.29) in H by p + τ q, so that, by also adding and subtracting τ q on every occurrence of p in the source terms, we get: Then, by using the separation property for ϕ, the regularity properties of the functions F, P, , Cauchy-Schwarz inequality and Young's inequality with ε > 0 to be chosen later, we infer that where the constants C ε > 0 depend only on the parameters and on ε.Next, we multiply (4.30) by M > 0, to be chosen later, and we test it in H by p.By adding and subtracting τ q where necessary, we get: Then, by using Cauchy-Schwarz and Young's inequalities with the same ε > 0 as before, without loss of generality, we deduce that Then, again by Cauchy-Schwarz and Young, we obtain: with δ > 0 again to be chosen later.Finally, we need to compensate the term 1 8 ∆r 2 H in (4.34).This is where we will need the stronger hypothesis σ Ω ∈ V in C2, after timeintegration.Indeed, now we test (4.31) in H by −∆r, which is possible within the discretization.By adding and subtracting again τ q where necessary, we get: We now estimate each term on the right-hand side by using Cauchy-Schwarz's and Young's inequality, indeed: Now, we sum all inequalities (4.34), (4.35), (4.36), (4.37) and we obtain the following: At this point, we can respectively choose M = M (τ ) > 0 big enough and then ε = ε(τ ) > 0 and δ = δ(τ ) > 0 small enough such that Therefore, by integrating on (t, T ), for any t ∈ (0, T ), and using the prescribed final data, we eventually arrive at the inequality: Hence, by the regularities given by Theorem 3.5 and C2, it is possible to apply Gronwall's lemma to obtain the following uniform estimates: p + τ q 2 L ∞ (0,T ;H) + τ q 2 L 2 (0,T ;H) + p 2 L 2 (0,T ;V ) + r 2 L ∞ (0,T ;V )∩L 2 (0,T ;W ) where C τ > 0 is a constant depending only on the parameters of the system.Now, by comparison in (4.30), since P ∈ L ∞ and p, q, r ∈ L 2 (0, T ; H) we can deduce that also ∆p is uniformly bounded in L 2 (0, T ; H).In an analogous way, by comparison in (4.31), we can see that r t is also uniformly bounded in L 2 (0, T ; H).Therefore we have that p 2 L 2 (0,T ;W ) + r 2 H 1 (0,T ;H) ≤ C τ α Ω (ϕ(T ) Finally, we observe that by Hölder's inequality and Sobolev embeddings: which is uniformly bounded by Theorem 3.5 and (4.38).Then, by comparison in (4.29), we deduce that also With all these uniform estimates, one can easily pass to the limit in a discretisation framework an prove the existence of a solution.Then, by linearity of the system, estimate (4.38) provides also uniqueness.
Remark 4.7.Notice that in this case, even if τ is bounded from above as in Remark 3.9, the constant C τ in (4.38) cannot be made independent of τ .This could be possible if χ = 0, indeed the definition of the coefficient γ would become: with γ independent of τ .However, if for instance one intends to try sending τ → 0, then, in general, different estimates would be required.Regarding this problem, we refer the reader to [43].
Finally, with the adjoint variables, we can determine and then simplify the first-order necessary conditions.Indeed, we have the following result: Theorem 4.8.Assume hypotheses A1-A4, B1-B5 and C1-C4.Let (u, v) ∈ U ad × V ad be an optimal control for (CP) and let (ϕ, µ, σ) = S(u, v) be the corresponding optimal state, i.e. the solution of (3.1)-(3.5)with these (u, v).Let also (p, q, r) be the adjoint variables to (ϕ, σ, µ), i.e. the solutions to the adjoint system (4.29)-(4.33).Then, they satisfy the following variational inequality, which holds for any (u, v) ∈ U ad × V ad : Proof.Before starting, observe that, by the regularities guaranteed by Theorems 3.5, 4.3 and 4.6, all the integrals that we are going to write make sense in L 1 (Q T ).
At this point, we can rewrite our optimal control problem (CP) through the reduced cost functional as the minimisation problem arg min (u,v)∈ U ad ×V ad f (u, v).
Then, if (u, v) is optimal, since U ad × V ad is convex and f is Fréchet-differentiable, it has to satisfy the necessary optimality condition When computing explicitly the derivative of f , we get that for any (u, v) ∈ U ad × V ad Now we integrate by parts in time, by using also the initial conditions (4.9) on the linearised system, and in space, by using the boundary conditions (4.32) and (4.8), and, after cancellations, we find that equivalently for any (u, v) ∈ U ad × V ad where we also used the symmetry of the kernel J.By factoring out p, q and r respectively, we can rewrite the previous inequality as Then, by adding this to the previous inequality, we at last infer that for any (u, v) ∈ U ad × V ad To conclude, we notice that the expressions enclosed in the parentheses are exactly the equations (4.5), (4.6), (4.7) of the linearised system, up to their source terms.Hence, by substituting those into our inequality, we find that for any (u, v) ∈ U ad × V ad Remark 4.9.Observe that, since U ad × V ad is closed and convex, (4.39) means that, if α u > 0 and β v > 0, the optimal control (u, v) is exactly the L 2 (Q T ) 2 -orthogonal projection of (α −1 u (ϕ) p, −β −1 v r) onto U ad × V ad .In particular, if u ≡ 0, i.e. we are only looking for a chemotherapy v through the nutrient, it can be shown that, due to the structure of V ad , the L 2 (Q T )-projection of −β −1 v r has the explicit form: v(x, t) = min v max (x, t), max −β −1 v r(x, t), v min (x, t) for a.e.(x, t) ∈ Q T .
If we also want to optimise for a radiotherapy u, the actual description of the projection onto the space U ad is more difficult and has to be tackled numerically, due to the presence of the additional constraint u H 1 (0,T ;H) ≤ M .However, this cannot be avoided, since the extra regularity on u is needed to prove the global boundedness of the variable ϕ, which is crucial for the whole analysis.Note that this kind of restriction is not unprecedented for controls acting as source terms in Cahn-Hilliard type equations, see for instance [9].