Why more physics can help achieving better mathematics

In this paper, we discuss the question whether a physical “simplification” of a model makes it always easier to study, at least from a mathematical and numerical point of view. To this end, we give different examples showing that these simplifications often lead to worse mathematical properties of the solution to the model. This may affect the existence and uniqueness of solutions as well as their numerical approximability and other qualitative properties. In the first part, we consider examples where the addition of a higher-order term or stochastic noise leads to better mathematical results, whereas in the second part, we focus on examples showing that also nonlocal models can often be seen as physically more exact models as they have a close connection to higher-order models.


Introduction
Differential equations are nowadays a standard tool to describe many physical phenomena and processes. The goal of this paper is to illustrate that, in many cases, a physical "simplification" of a model (meaning here the omission of higher-order derivatives or stochastic noise) does not imply a mathematical simplification as it reduces the mathematical "quality" of solutions in terms of regularity, uniqueness, numerical approximability and other properties such as, for example, long-term behaviour. This raises the question if it might be more adequate to study the physically more exact model with respect to mathematics and numerics. Besides, we also wish to draw attention to the less common nonlocal models: In many cases, the equation with higher-order terms can be transformed into a nonlocal equation, or it is at least an approximation, in some sense, of a nonlocal equation. In this way, nonlocal models can be seen as models comprising different local models as well as models of different characteristic scale. 1 This connection might also help improving the numerics of nonlocal equations, for which up to now less is known in comparison to the numerics of local models.
In the first part of this paper, we give a short introduction to some mathematical solution concepts for differential equation problems to make the following sections of this work easier to understand.
In the second part, we present some examples where the simplified model has a mathematically "worse" solution than the so-called regularised problem including higher-order terms or stochastic perturbations, which both have a physical meaning in most cases. We give a concrete example where the higher-order term already appears in the physical derivation of the model but is then omitted because it seems to be of negligibly small order of magnitude.
In the third and last part, we give some examples showing interesting connections between nonlocal models and local higher-order models, for example the transformation of a higher-order equation into a nonlocal one, or the approximation of a nonlocal equation via a higher-order equation.

Different solution concepts
We briefly introduce some of the most important mathematical solution concepts for differential equation problems. Throughout this paper, let Ω be an open subset of R d with sufficiently smooth boundary. We start with the probably most popular concept, the concept of classical solutions: a classical solution is sufficiently often continuously differentiable and the differential equation is fulfilled pointwise everywhere. A similar concept is the concept of strong solutions 2 : a strong solution is sufficiently often differentiable in the weak sense and the equation is fulfilled pointwise almost everywhere. We call a function u : Ω → R weakly differentiable in the direction x i , i ∈ {1, . . . , d}, if it, together with its weak derivative v : Ω → R, fulfils some kind of integration-by-parts condition, i.e., the equation holds for all test functions ϕ : Ω → R that are infinitely many times continuously differentiable and vanish outside of a compact subset of the domain Ω. This concept of weak derivatives can be seen as a generalisation of the concept of classical derivatives since every integrable classical derivative fulfils Eq. (1). The boundary term appearing if integration by parts is applied vanishes here because the function ϕ vanishes at the boundary of Ω. The weak derivative also motivates another quite popular concept of solutions, the so-called weak solutions. One of the first using it was Jean Leray [3] in 1934 to prove the existence of solutions to the Navier-Stokes equations in two and three spatial dimensions. Since then it has become one of the most often used concepts of solutions. To obtain the weak formulation of the differential equation problem, we multiply the original differential equation with test functions ϕ with the same properties as in the definition of the weak derivative. Then we integrate on both sides of the equation over Ω and formally perform an integration by parts in the terms with higher derivatives such that half of the derivatives is shifted onto the test function ϕ. The solution to the resulting integral equation in an appropriate solution space we call weak solution. Note that the weak solution in general has a lower regularity than both classical and strong solutions, e.g., a weak solution to the common heat equation only has to be once differentiable in space in the weak sense although two derivatives appear in the original equation. It can be shown that a classical solution is also always a weak solution. Therefore, the concept of weak solutions is a generalisation of the concept of classical solutions. It allows for example discontinuities in some derivatives of the solution and is able to handle more general data. Moreover, there are many (especially nonlinear) problems that do not admit classical solutions but weak solutions. A more detailed introduction into the concept of weak solutions can, e.g., be found in Chipot [4] or Roubíček [5].
The last concept of solutions we want to discuss in detail is the concept of the so-called (Young-) measure-valued solutions. In the case of a nonlinear equation, even the existence of weak solutions sometimes cannot be proven, especially if certain oscillation effects come into play. These oscillation effects are now covered by a measure-valued mapping ν which replaces the function u (or derivatives of u, respectively) in the weak formulation in the term with the nonlinearity. Thus, a measure-valued solution always consists of the function u and the measure-valued mapping ν. The concept of measure-valued solutions is again a generalisation of the concept of weak solutions, since every weak solution is also a measure-valued solution taking as ν the point measure of the solution (or derivatives of it, respectively). However, the concept of measure-valued solutions is quite controversial, as it is a rather weak concept. For example, uniqueness of measure-valued solutions cannot be expected in general. Even for problems that admit a unique weak solution, existence of infinitely many measure-valued solutions can be shown [6]. Apart from that, the numerical approximability is worse than for weak solutions, which, because of their nice structure, can be easily approximated via the finite element method, for example. A more detailed introduction into the concept of measure-valued solutions can, e.g., be found in Málek et al. [7].
Between these concepts there are many others, for example very weak solutions where, in contrast to weak solutions, all derivatives are shifted onto the test function, or entropy solutions which have to fulfil an additional entropy inequality. An overview of some of these concepts using the example of the Navier-Stokes equations can be found in Amann [8].
Another overview of results concerning classical, weak and measure-valued solutions in the theory of elastodynamics is given in Emmrich and Puhst [9].

Higher-order terms in physics and mathematics
In this section, we present two examples where physically motivated higher-order terms lead to mathematically "better" solutions to the regularised problem than to the original problem. Additionally, we consider the regularisation using stochastic perturbations instead of higher-order terms, which requires less assumptions on the given data, compared to the problem without a stochastic perturbation, to obtain existence and uniqueness of solutions.

Example 1: backward-forward heat equation
The first example is the so-called backward-forward heat equation where u represents the temperature and φ is a nonmonotone function representing the heat flux density. This equation occurs, for example, in cases where Fourier's law of heat conduction cannot be applied to simplify the general heat equation. Apart from thermodynamics, it is also very important for the so-called anisotropic diffusion considered in image processing. There are several articles proving existence of measurevalued solutions to the backward-forward heat equation, for example, Slemrod [10] and Thanh et al. [11]. In both articles, the method of regularisation is used to prove existence. In the first work [10], the term ε Δ 2 u is added to the left-hand side, where ε > 0 is small. For the regularised equation existence of weak solutions can be proven. The limit ε → 0 then yields the existence of a measure-valued solution to Eq. (3). As mentioned in Sect. 2 on solution concepts, measure-valued solutions are weaker than weak solutions. Thus, we get "better" solutions to the regularised equation with a higher-order term than to the original equation without this higher-order term.
In the second work [11], the term − ε Δ∂ t u is added to the left-hand side of the backward-forward heat equation. Again, existence of weak solutions can be shown for the regularised equation but in the limit ε → 0, we again only end up with a measurevalued solution to the original equation. So far, the existence of weak solutions to the backwardforward heat equation is only known in a special case, where the spatial dimension is equal to one and where the heat flux density φ is piecewise linear (cf. [12,13]). Besides, uniqueness of solutions is also a problem. Uniqueness of measure-valued solutions cannot be expected due to the solution concept, and in the case mentioned above, where existence of weak solutions is known, it can additionally be proven that there exist infinitely many weak solutions. So far, uniqueness is only known for a special kind of classical solutions and again only in the case of one spatial dimension and for special cases of the heat flux density φ (cf. [14,15]). Whether solutions of this kind do even exist, could, to the best knowledge of the authors, not yet be proven.
These observations raise one central question of this article: Is passing to the limit ε → 0 and thus reducing the mathematical quality of solutions necessary for the equations to be an adequate model of the underlying physics? In fact, both regularisations shown above are motivated by physics. Slemrod [10] refers to the higher-order theory of heat conduction due to Maxwell [16], which is also mentioned in Truesdell and Noll [17]. It is based upon a moment expansion of the Boltzmann equation, which leads to an infinite hierarchy of coupled moment equations for the mass density, the momentum density, the energy density, the energy flux etc., and has to be truncated by appropriate closure approximations. The common approximation of the heat flux by Fourier's law in the mean energy balance equation gives the standard heat conduction equation. However, already in [16] it was shown for rarified gases that higher-order terms can arise due to density gradient contributions to the energy balance, and in [10] it was pointed out that a double Laplacian similar to Eq. (4) also occurs in the Cahn-Hilliard equation which describes the process of phase separation in a two-component binary fluid. Moment expansions of the Boltzmann equation, resulting in hydrodynamic balance equations for charge carrier densities and electron temperature, have also widely been used in nonlinear electron transport in semiconductors [18][19][20], giving higher-order terms at various levels of approximation.
Thanh et al. [11] describe the regularisation term as some kind of viscous effects referring, amongst others, to Binder et al. [21] and Novick-Cohen and Pego [22].
Indeed, there is a whole mathematical theory called vanishing viscosity method based upon the regularisation with a diffusive term multiplied with the regularisation parameter ε. At the end, letting the parameter ε tend to zero yields a solution to the original problem. The method goes back to von Neumann and Richtmyer [23] who introduced it as a method to calculate hydrodynamic shocks numerically.
which can be seen as a simple model for a one-dimensional flow. To apply the vanishing viscosity method, the diffusion term ε ∂ x x u is added, where ε is the regularisation parameter mentioned above.
Intuitively, the discontinuities that may appear in the case of the inviscid Burgers' equation (6) are first smoothed out by the additional diffusion term and then, this smoothing effect is reduced more and more (cf. Fig. 1), such that at the end, a solution to the original Eq. (6) is obtained. A detailed discussion of the application of the vanishing viscosity method in the case of the Burgers' equation can, e.g., be found in the lecture notes of Kruzhkov on first-order quasilinear partial differential equations [24]. Another example where the vanishing viscosity method can successfully be applied is a system of Boussinesq equations, shown in the monograph of Guo et al. [25]. It reads and describes the propagation of the long surface wave in a pipe with constant depth. Here, ρ is the density, u is the velocity, ω = 1 + δρ denotes the altitude from the bottom to the free surface of the flow and α, β, γ , δ, ν are constants. In this case, the regularisation term ε ∂ x x ρ is added to the first equation and at the end, ε tends to zero. Here, the regularised problem admits a unique classical smooth solution, whereas the original problem only admits weak solutions. A broad overview of other examples for the vanishing viscosity method can also be found in Guo et al. [25].

Example 2: kinetic models for dilute polymers
In other examples, like the following one, the regularisation term is already existent in the derivation of the physical model but is then omitted because it seems to be of negligibly small order of magnitude. Barrett and Süli [26] consider a kinetic bead-spring model for dilute polymers, where the extra-stress tensor is defined through the associated probability density function ψ. This function satisfies the Fokker-Planck-type parabolic equation Here, u is the velocity of the fluid considered, q is the elongation vector of the dumbbell representing a polymer chain, J x l 0 ,q is the directional Friedrichs mollifier with respect to x over an interval of length l 0 |q| in the direction q, and U is the potential of the elastic force of the spring connecting two beads. The constant ε corresponds to the quantity De Pe , where De denotes the Deborah number and Pe the Péclet number, and the constant λ corresponds to the relaxation time constant of the dumbbells.
One interesting feature of this model is the presence of the diffusion term ε Δ x ψ on the right-hand side of Eq. (9). As Barrett and Süli [26] already mention, this term is usually omitted in standard derivations of bead-spring models, because it is several orders of magnitude smaller than the other terms in Eq. (9). Actually, Bhave et al. [27] estimate the quantity De Pe to be in the range of 10 −9 -10 −7 , whereas the expected important length scales of stress diffusion start at 10 −5 -10 −3 .
Mathematically however, omitting the diffusion term is quite detrimental as it leads to a hyperbolically degenerate parabolic equation which is much harder to handle than Eq. (9). 3 In fact, the existence result in the case ε = 0 is again proven via showing the existence of solutions for ε > 0 and then passing to the limit as ε → 0. This leads to less regularity for the probability density function ψ. So again we state that the original model with the higher-order term delivers mathematically "better" solutions than the model without this higher-order term. In this case, the model with the higherorder term even seems to be more adequate physically.

Example 3: regularisation by noise
A quite recent approach to regularise an equation is to add a certain stochastic noise in order to obtain the existence of a unique solution where, without noise, only existence or uniqueness or none of these two is known so far.
There is some work by, e.g., Gyöngy and Pardoux [30,31] using additive noise to prove existence of a unique solution under assumptions which, in the deterministic case, are so far not known to suffice for obtaining existence or uniqueness. Gyöngy and Pardoux [30,31] consider the equation equipped with either homogeneous Dirichlet or homogeneous Neumann boundary conditions, where f is a nonlinear function and ∂ t x W denotes space-time white noise. Under the assumption that f satisfies some measurability and boundedness condition, Gyöngy and Pardoux [30,31] are able to prove existence and uniqueness of a solution in a generalized sense defined in the work of Walsh [32], which may be compared to a very weak solution (cf. Sect. 2) but in a stochastic sense. In the deterministic case, the assumptions on f are, to the best knowledge of the authors, not enough to prove existence or uniqueness of solutions.
More recently, there has been research on linear multiplicative noise by, e.g., Flandoli et al. [33], considering the linear transport equation driven by the vector field b. Assuming that b is sufficiently regular, uniqueness of solutions to the initial-boundary value problem governed by this equation can be proven (cf., e.g., DiPerna and Lions [34] or Ambrosio [35]), but if this is not the case then examples of non-uniqueness are known, as is shown in the work of Flandoli et al. [33]. However, if a certain amount of linear multiplicative noise is added to Eq. (11), existence and uniqueness of solutions can be proven under weaker assumptions on b, see again [33]. To be precise, the stochastic equation is considered, where e i , i = 1, . . . , d, are the unit vectors in is a standard Brownian motion in R d , and the notation • is used for the stochastic integration in the sense of Stratonovich.
Since real-world systems often include noise, the consideration of stochastic differential equations is physically also very important, and there are many works considering the influence of noise on various physical systems. We just want to mention some examples here. Additive noise has been shown to have an important effect, e.g., upon chimera states (coexisting coherent and incoherent space-time patterns in networks), which can be either destructive, see, e.g., Loos et al. [36,37], or constructive, see, e.g., Zakharova et al. [38,39]. Multiplicative noise has been considered, for example, in the work on nonequilibrium phase transitions by Van

Connection of higher-order and nonlocal equations
In this part, we show some interesting connections between nonlocal models and local higher-order models. As it turns out, many equations containing higher-order terms can be rewritten as some nonlocal equation or can be seen as approximations of nonlocal equations. Thus, nonlocal models can also be seen as physically more exact regularisations of local models.

Green's function as integral kernel
Let us start with a simple example, which was, amongst others, mentioned in Duruk et al. [41][42][43]. We consider the equation arising, e.g., in the nonlocal theory of elasticity which considers the stress at a point x not only as a function of the strain at the point x but the strain field at every point in the body. Thus, it also takes long-range interactions between the particles into account. In this case, u represents the displacement field and G is the kernel function representing the stress-strain relation. Here, G is the Green's function for the differential operator 1 − ε ∂ x x considered on the full space (therefore we can neglect boundary conditions), i.e., is the unique solution to the problem Based on Eq. (13), we can now conclude that ∂ tt u at least formally fulfils Therefore, we can see the nonlocal Eq. (13) as a regularised form of the usual wave equation This method can obviously be applied using any arbitrary Green's function G in an arbitrarily high spatial dimension, generating many different regularisations of arbitrarily high order. Considering, for example, the general differential operator where n ∈ N, a k > 0, k = 1, . . . , n, the Green's function for A reads where F −1 denotes the inverse Fourier transform. If we start, for example, with a nonlocal equation, this operator A can be used to approximate the differential operator of which the given kernel function is the Green's function. This has been done, for example, for lattice models in nonlocal continuum mechanics in Eringen [44, Ch. 6.9] and Lazar et al. [45]. Thus, we can consider nonlocal models as a generalisation of local ones as different kernels lead to different local, potentially higher-order models.

Nonlocal coupling in reaction-diffusion systems
A more involved example is a reaction-diffusion system with asymmetric nonlocal coupling, studied in Siebert et al. [46], which is derived as a limiting case of the activator-inhibitor reaction-diffusion-advection model Here, u and w are the activator and the inhibitor, respectively, linearly coupled by the terms − g w and h u with g, h ∈ R, F is a nonlinear function, τ is the inhibitor relaxation time, f is a constant, ξ represents the advection velocity and D > 0 is the inhibitor diffusion coefficient. Passing to the limit τ → 0, i.e., the case of fast inhibitor dynamics, the Green's function is used to reformulate the system by writing the solution w of the second equation of (20) as with the Green's function G defined as Inserting Eq. (21) into the first equation of (20), the reactiondiffusion equation with asymmetric nonlocal coupling is obtained, where σ = g h denotes the nonlocal coupling strength. This method is more or less the inverse of the method shown in Sect. 4.1. If we solve the first equation of (20) for w and insert it into the second equation, we end up with the same local higher-order equation which we would get using the method of Sect. 4.1. Thus, higher-order and nonlocal terms may be considered as physically representing some kind of nonlocal coupling.

Expansion of integral operators
Another way to approximate a nonlocal equation by a local one including higher-order terms is the expansion of the corresponding integral operator. We will illustrate this by the example of a linear peridynamic model, 4 as was done in Emmrich and Weckner [48]. We consider the equation where ρ denotes the mass density, and u is the displacement field of the body, Ω represents the volume that the body occupies, B(x, δ) is the ball of radius δ around x, d is the spatial dimension, δ > 0 is the so-called peridynamic horizon of interaction, λ d,δ is a real-valued function which depends upon d and δ and determines the specific material model, and b represents the external forces. The symbol ⊗ denotes the outer product. Additionally, the function λ d,δ has the property λ d,δ (r ) = 0 for all r ≥ δ.
We now consider only the integral operator L d,δ . The idea is to perform a Taylor expansion of the argument u up to the order of m, supposing sufficient regularity. In order to make the following easier to read, we omit the time dependence of u and obtain with some remainder r m (u;x, x) of order o(|x − x| m ). Now we insert the Taylor expansion into the operator L d,δ and get d,δ are local differential operators of order k. Furthermore, for odd k, the integrand is an odd function in x − x so that the integral vanishes and thus (L (k) d,δ u)(x) = 0 for all odd k. In summary, we can approximate the nonlocal Eq. (24) by the local equation For the second order m = 2, this approximation yields the classical Navier equation of linear elasticity with the Lamé coefficients μ = λ = 3K /5 where K denotes the bulk modulus. Using the above approximation, it is shown that the nonlocal operator L d,δ converges to the local operator L of the Navier equation for vanishing nonlocality δ → 0, at least in the interior of the domain Ω. Near the boundary, this is to the best knowledge of the authors not yet known.
The reason for that is the problem of finding appropriate boundary conditions for the Navier equation. For the peridynamic model, there are no boundary conditions since no spatial derivatives appear. More information about vanishing nonlocality in linear and also nonlinear peridynamic models can be found, e.g., in Puhst [49]. A quite similar approach to the one above, the so-called inner expansion, was used by Arndt and Griebel [50] to derive a new scheme upscaling the atomistic level to the continuum level for the model of a crystalline solid. There, a Taylor expansion of the deformation function around the points of the discrete lattice representing the atoms is inserted into the local energy potential to obtain, after some more steps, a continuum formulation which represents the upscaled model. One of this technique's advantages is the well-posedness of the resulting equation, which is not guaranteed by other techniques like the direct expansion technique proposed by Kruskal and Zabusky [51] and Zabusky and Kruskal [52], which relies on a Taylor expansion of the right-hand side of the equation instead of the deformation function. On the other hand, it covers higher-order effects to an arbitrarily high order in contrast to the common scaling technique, which, simply spoken, lets the number of atoms tend to infinity, and was studied, amongst others, by Blanc et al. [53]. This inclusion of higher-order effects prevents the loss of regularity of the solution that is often observed for the scaling technique because the dispersion which the discrete system inherits is not contained in the resulting continuum system.
Again we see how important higher-order terms are and, although there is no nonlocal model involved in this example, there is still a connection to nonlocal models since they can also be seen as some kind of upscaling of a microscale model, as already mentioned in the introduction.

Conclusion
In the first part of this paper, we presented many different examples showing that simplifications of differential equations in the physical sense often lead to worse mathematical properties of the solutions to these equations. This may affect the existence of various types of solutions, the uniqueness of these solutions if they exist, their numerical approximability, and their qualitative properties such as longterm behaviour. The example of the kinetic model for dilute polymers (Sect. 3.2), studied in Barrett and Süli [26], shows that the higher-order term-artificially added to the equation in order to regularise it-is in some cases already existent in the original physical model. However, it is usually omitted because its order of magnitude is several orders smaller than the other terms appearing in the model. This sheds new light on the question if physical simplification is admissible, or if, on the contrary, even small terms should be kept in physical models since they improve the numerical solvability and the mathematical properties of the solution.
In the second part, we extended our results by demonstrating an intriguing connection between nonlocal and local higher-order models. The example of Green's function as the integral kernel of the nonlocal equation (Sect. 4.1) shows that many local higher-order equations can be rewritten as a nonlocal equation, and in this sense nonlocal models can be seen as a generalisation of local models. The example of the expansion of integral operators (Sect. 4.3) shows that even in more general situations, there is still a connection between nonlocal models and local higher-order models since the latter can be used as an approximation of the former. Overall, we wish to motivate the reader to consider also nonlocal models when studying a physical problem, especially in applications where nonlocal models have not received much attention previously. The connection mentioned might also improve the numerical understanding of nonlocal models.
As a physical interpretation of this connection, we suggest that nonlocal coupling terms can in certain situations be avoided by considering a more detailed description using additional dynamical variables. Considering a single differential equation with higher-order derivatives is equivalent to a system of several differential equations with lower-order derivatives, i.e., to extending the number of degrees of freedom of a dynamical model. These additional differential equations can be eliminated using Green's functions, which results in nonlocal models with fewer dynamical variables. However, some nonlocal models like, for example, peridynamics would, at least formally, require an infinite number of degrees of freedom to be exactly represented. Additionally, they need much less regularity and are therefore more suitable to study discontinuous phenomena like, e.g., crack propagation.