1 Introduction

The porous medium equation (PME) is given by

$$\begin{aligned} \frac{\partial \rho }{\partial t}-\nabla \cdot \nabla \left( \rho ^m\right) =0, \end{aligned}$$
(1)

where \(x \in \Omega \subset \mathbb R^d\), \(\rho (t,x) \ge 0\) is the unknown density function, and \(m \ge 1\). The PME can be found in many physical and biological phenomena, including the flow of an ideal gas through porous media, groundwater infiltration, the spread of viscous fluids, and boundary layer theory [10, 18, 39]. The PME is an example of a non-linear parabolic PDE with an underlying Wasserstein gradient flow structure [33] that guarantees that any solution for initially positive data will remain so for all times, and will be energy dissipative.

Moreover, the PME has the peculiar finite speed of propagation property which states that any compactly supported initial data will remain so for all times, and thus, the solution at the domain interface, referred to as the free boundary, will only propagate at a finite speed. The PME admits a classical self-similar weak solution [4, 34] that is compactly supported and shows this property. It is also a commonly known fact that for certain initial data, the solution exhibits the waiting-time phenomenon in which the free-boundary does not move, but the interior profile continues to evolve until a certain positive finite time [1].
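
For reference, the self-similar solution mentioned above is commonly written as the Barenblatt profile. The following is a minimal sketch, assuming the standard closed form \(\rho (t,x)=t^{-\alpha }\left( C-k|x|^2t^{-2\beta }\right) _+^{1/(m-1)}\) with \(\alpha =d/(d(m-1)+2)\), \(\beta =\alpha /d\), and \(k=\alpha (m-1)/(2md)\) for \(m>1\); the function names, the free constant C, and the default parameters are illustrative and are not taken from this paper.

```python
import numpy as np

def _barenblatt_constants(m, d):
    """Exponents alpha, beta and coefficient k of the assumed closed form."""
    alpha = d / (d * (m - 1.0) + 2.0)
    beta = alpha / d
    k = alpha * (m - 1.0) / (2.0 * m * d)
    return alpha, beta, k

def barenblatt(r, t, m=2.0, d=1, C=1.0):
    """Barenblatt profile for rho_t = Laplacian(rho^m), m > 1, t > 0.

    r holds radial distances |x|; C > 0 is a free constant fixing the mass.
    """
    alpha, beta, k = _barenblatt_constants(m, d)
    core = C - k * np.abs(r) ** 2 * t ** (-2.0 * beta)
    # The positive part (.)_+ produces the compact support.
    return t ** (-alpha) * np.maximum(core, 0.0) ** (1.0 / (m - 1.0))

def support_radius(t, m=2.0, d=1, C=1.0):
    """Radius of the free boundary, which grows like t**beta."""
    _, beta, k = _barenblatt_constants(m, d)
    return np.sqrt(C / k) * t ** beta
```

The support radius grows like \(t^{\beta }\), which makes the finite speed of propagation explicit, and the total mass is independent of t.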

The existence of a free boundary due to the degeneracy of the PME makes traditional parabolic numerical techniques ineffective. Various numerical methods have been developed for the PME [7, 8, 12, 14, 15, 19,20,21, 24, 26, 28, 30, 31, 35, 40, 42]. We summarize some of the broad themes of these methods:

  • Standard numerical routines that invoke standard finite element procedures for spatial discretization and a predictor-corrector formulation for temporal discretization (PCSFE method) can suffer from oscillations at the free boundary which cannot be suppressed by increasing the polynomial degree or refining the spatial mesh [42].

  • Discontinuous Galerkin [12], Local Discontinuous Galerkin [42], and WENO [28] methods have been adapted to the PME to suppress these oscillations arising near the free boundary. Particularly, [42] makes use of a non-negativity preserving limiter, while [12] implements a maximum-principle-satisfying (MPS) limiter. However, these schemes introduce additional numerical viscosity, due to which they fail to accurately track the free boundary, and thus, to accurately estimate the waiting time.

  • Several interface tracking schemes [7, 8, 21, 30] have been developed that track the free boundary by solving the equation of the interface in Lagrangian coordinates. However, these schemes have limited applicability in higher dimensions and to initial data with complex support due to the complexity of implementation. Recent works [11, 27] develop fully Lagrangian schemes that can be applied to 2D problems. Nonetheless, being Lagrangian, these schemes may not handle initial data with complex support well. If any “tangling” of the mesh occurs, the solution has to be manually interpolated onto a new mesh.

  • Perturbation techniques have also been used to remove the degeneracy of the PME [15, 20, 35]. In particular, the non-negative initial data is perturbed by a small parameter \(\varepsilon > 0\) to make it positive everywhere. Using this approach, [20] constructs a first-order scheme that is unconditionally energy dissipative as well as provably bound preserving on structured tensor-product grids. However, in addition to the perturbation by \(\varepsilon \), this scheme has the limitation that it cannot be applied to complex geometries.

In this work, we present two positivity-preserving and energy-stable nonlinear schemes for the PME: one is based on a log-density formulation, and the other uses a mixed formulation. Our log-density formulation scheme is closely related to the work in [20] but can handle fully unstructured grids in any number of dimensions. Additionally, neither scheme requires any perturbation of initial data that is not strictly positive. Here we focus on lowest-order finite element methods for both formulations and leave the detailed analysis of high-order positive and energy-stable schemes to our ongoing work; see the brief comments on extensions to higher-order schemes in Remarks 2.2 and 3.2 below.

  • The log-density based scheme uses a classical conforming piecewise linear finite element space for the spatial discretization in combination with a first-order semi-implicit time discretization. Global mass conservation, the positivity of density, and energy dissipation for the energy \(\int _\Omega \rho (\log (\rho )-1)\,\textrm{dx}\) are proven for this fully discrete scheme on general unstructured meshes. Unique solvability of the nonlinear equations in each time step is established when the mass matrix is lumped to be a diagonal matrix. Moreover, under the condition of a Delaunay triangulation along with a special edge-based discretization of the nonlinear diffusion term [41], the scheme can be further proven to satisfy a discrete maximum principle. While the theoretical results for this scheme are proven under the condition that the initial density \(\rho ^0(x)>0\) is positive everywhere, we show a practical implementation to handle compactly supported initial data which simply deactivates the degrees of freedom if the associated diagonal entry of the stiffness matrix is below a threshold value (e.g., \(10^{-14}\)).

  • On the other hand, the mixed method is based on a first-order reformulation of the PME, where the unknowns are density, potential, and velocity. The classical RT0-P0 finite element pair is used for spatial discretization, where the velocity is approximated via the lowest order Raviart-Thomas finite element space [36], and the density and potential are approximated by a discontinuous piecewise constant space. A first-order semi-implicit time discretization is then applied to this spatial discretization, which results in a nonlinear fully discrete scheme. The fully discrete scheme is locally mass conservative and positivity preserving under a classical CFL condition on general unstructured meshes. Furthermore, we prove energy dissipation for the physical energy \(\int _{\Omega }\frac{\rho ^m}{m-1}\,\textrm{dx}\) of this scheme when mass lumping [3] is applied to evaluate the velocity mass matrix. Positivity of this lumped velocity mass matrix requires the triangulated mesh to be Delaunay. We note that this mass lumping is simply a trapezoidal rule on tensor-product meshes, on which one can further prove the solvability of the nonlinear system and unconditional positivity preservation following similar arguments as in the log-density formulation; see also [20]. The nonlinear system in each time step can be efficiently solved by expressing velocity and potential in terms of the density unknowns and then solving the parabolic system of nonlinear equations for density alone using Newton’s method. This scheme naturally handles a compactly supported initial density profile as zero density/potential is allowed in the scheme.

The rest of the paper is organized as follows. In Sect. 2, we describe the construction of the log-density scheme and prove several properties like mass conservation, energy stability, unique solvability, and bound preservation. In Sect. 3, we describe the mixed method and prove properties like local mass conservation, energy stability, and positivity. In Sect. 4, we discuss several numerical experiments and compare the results of the two schemes. In Sect. 5, we provide some concluding remarks.

2 Log-Density Formulation

We consider the PME (1) expressed using the log-density variable \(u:= \log (\rho )\) on a bounded polyhedral domain \(\Omega \subset \mathbb R^d, d=1,2,3\) with a homogeneous Neumann boundary condition:

$$\begin{aligned} \frac{\partial \exp (u)}{\partial t}-\nabla \cdot \left( m\exp (m\,u) \nabla u\right)&=0,\quad {\text {in} }\quad \Omega , \end{aligned}$$
(2a)
$$\begin{aligned} \frac{\partial u}{\partial n}&= 0, \quad {\text {on} }\quad \partial \Omega , \end{aligned}$$
(2b)

and initial data

$$\begin{aligned} u(0,x) =&\; \log (\rho ^0(x)) \text { in }\Omega , \end{aligned}$$
(2c)

where the initial density \(\rho ^0(x)>0\) is assumed to be positive everywhere in the domain. This log-density based formulation was first developed in [29] for the Poisson-Nernst-Planck equations, see also [16], and is closely related to the entropy-stable schemes based on the entropy variables for hyperbolic conservation laws and compressible flow in the CFD literature [6, 22, 23, 38]. In particular, the density \(\rho = \exp (u)\) is guaranteed to stay positive as long as the initial data \(\rho ^0(x)\) is positive. We note that the recently introduced bound preserving and energy dissipative finite difference scheme by Gu and Shen [20] is also closely related to this log-density formulation. Here, we consider homogeneous Neumann boundary conditions for simplicity of presentation, which covers the important case of compactly supported initial data. The finite element scheme can be naturally modified to incorporate general inhomogeneous mixed boundary conditions with Dirichlet and Neumann conditions on parts of the boundary. For discussions on enforcing inhomogeneous essential and natural boundary conditions in Galerkin and mixed frameworks, please refer to [25] and [9], respectively.

Note that the PME has a special finite speed of propagation property [39] which states that if the initial data \(\rho ^0\) has compact support, then the solution to the Cauchy problem of the PME will also have compact support at any other time, \(t > 0\). For this reason, special care has to be taken in the case of compactly supported initial data as u is negative infinity at locations where \(\rho ^0\) is zero. In [20], the authors add a small perturbation \(\varepsilon \) to the initial data to make it positive everywhere. We employ a different approach in which we deactivate degrees of freedom wherever \(\rho ^0\) is close to zero; see more discussion at the end of this section.

The PME (2) satisfies the following three important properties:

  1. (i)

    Mass conservation:

    $$\begin{aligned} \int _{\Omega }\rho (t, x)dx = \int _{\Omega }\rho ^0(x)dx. \end{aligned}$$
    (3a)
  2. (ii)

    Positivity:

    $$\begin{aligned} \text { If }\rho ^0(x)>0, \text { then }\rho (t, x)>0 \text { for any }t>0. \end{aligned}$$
    (3b)
  3. (iii)

    Energy dissipation:

    $$\begin{aligned} \frac{d}{dt} E=-\int _{\Omega }m\rho ^m|\nabla u|^2dx, \end{aligned}$$
    (3c)

    where the energy E is given by

    $$\begin{aligned} E:= \int _{\Omega }\rho (\log (\rho )-1)dx = \int _{\Omega }\exp (u)(u-1)dx. \end{aligned}$$
    (3d)
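
As a brief sketch of why (3c) holds (a formal calculation, assuming a smooth positive solution so that integration by parts is justified): since \(\frac{d}{d\rho }\left[ \rho (\log (\rho )-1)\right] =\log (\rho )=u\) and \(\nabla (\rho ^m)=m\rho ^{m-1}\nabla \rho =m\rho ^m\nabla u\), we have

$$\begin{aligned} \frac{d}{dt}E = \int _{\Omega } u\,\frac{\partial \rho }{\partial t}\,dx = \int _{\Omega } u\,\nabla \cdot \nabla \left( \rho ^m\right) dx = -\int _{\Omega } \nabla \left( \rho ^m\right) \cdot \nabla u\,dx = -\int _{\Omega } m\rho ^m|\nabla u|^2dx, \end{aligned}$$

where the boundary term vanishes because of the homogeneous Neumann condition (2b).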

Our goal is to design a numerical scheme that preserves these three properties.

2.1 Spatial and Temporal Discretizations

We describe the method on a general unstructured simplicial mesh, although quadrilateral/hexahedral meshes can also be used. Let \(\Omega _h:=\{K_i\}_{i=1}^{N_K}\) be a conforming simplicial triangulation of the domain \(\Omega \) with \(N_K\) elements. Denote \(\mathcal {E}_h=\{E_i\}_{i=1}^{N_E}\) as the collection of \(N_E\) edges of \(\Omega _h\), and \(\mathcal {V}_h=\{v_i\}_{i=1}^{N_V}\) as its collection of \(N_V\) vertices. For any element \(K\in \Omega _h\), denote \(\mathcal {E}_K:=\{E\in \mathcal {E}_h: \; E\subset \bar{K}\}\) as its edges, and \(\mathcal {V}_K:=\{v\in \mathcal {V}_h: \; v\in \bar{K}\}\) as its vertices.

We shall use the \(H^1\)-conforming finite element space

$$\begin{aligned} V_h := \{v_h\in H^1(\Omega ):\; v_h|_K\in P_1 (K),\quad \forall K\in \Omega _h\}, \end{aligned}$$
(4)

where \(P_1(K)\) is the space of linear polynomials on a simplex K. The space \(V_h\) is equipped with the standard nodal (hat) basis \(\{\phi _{i}(x)\}_{i=1}^{N_V}\) in which \(\phi _{i}(v_j) = \delta _{ij}\) where \(\delta _{ij}\) is the Kronecker delta function. Hence any function \(w_h\in V_h\) can be expressed as

$$\begin{aligned} w_h=\sum _{i=1}^{N_V}w_i\phi _i, \end{aligned}$$

where \(\underline{w}:=[w_1,\ldots , w_{N_V}]'\) is the coefficient vector satisfying \(w_i=w_h(v_i)\).

The spatial discretization for (2) then reads: find \(u_h \in V_h\) such that, for \(t>0\),

$$\begin{aligned} \int _{\Omega }\frac{\partial \exp (u_{h})}{\partial t} v_h\,\textrm{dx} +\int _{\Omega } m\exp (m\,u_{h})\nabla u_{h}\cdot \nabla v_h\,\textrm{dx} =&0,\quad \forall v_h\in V_h, \end{aligned}$$
(5)

with initial conditions

$$\begin{aligned} u_h(0, v_i)=\log \left( \rho ^0_h(v_i)\right) , \quad \forall v_i\in \mathcal {V}_h. \end{aligned}$$
(6)

We apply a first-order semi-implicit discretization for the ODE system (5) to arrive at the following fully discrete scheme: Given data \(u_h^{n-1}\in V_h\) at time \(t^{n-1}\) and time step size \(\Delta t\), find \(u_h^n\in V_h\) at time \(t^n=t^{n-1}+\Delta t\) such that

$$\begin{aligned} \mathcal {M}\left( \frac{\exp (u_{h}^n)-\exp (u_h^{n-1})}{\Delta t}, v_h\right) + \mathcal {A}\left( m\exp (m u_h^{n-1}); \nabla u_h^n, \nabla v_h\right) =0, \quad \forall v_h \in V_h, \end{aligned}$$
(7)

where the mass operator \(\mathcal {M}\) and stiffness operator \(\mathcal {A}\) read as follows:

$$\begin{aligned} \mathcal {M}(\alpha , \beta ):=&\; \int _{\Omega } \alpha \cdot \beta \textrm{dx},\quad \quad \mathcal {A}(\gamma ; \alpha , \beta ):=\; \int _{\Omega } \gamma \nabla \alpha \cdot \nabla \beta \textrm{dx}. \end{aligned}$$

Unique solvability of the scheme requires the mass matrix to be mass lumped, which is achieved by applying the following vertex-based quadrature rule

$$\begin{aligned} \mathcal {M}_h(\alpha , \beta ):= \sum _{K\in \Omega _h} \sum _{v\in \mathcal {V}_K}\frac{|K|}{d+1} \alpha (v) \beta (v)= \sum _{i=1}^{N_V}\frac{|S_i|}{d+1} \alpha (v_i) \beta (v_i), \end{aligned}$$
(8)

where \(S_i:=\cup _{v_i\in \bar{K}}\{\bar{K}\}\) is the vertex patch of \(v_i\) and \(|S_i|\) is its volume.
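
The vertex-based rule (8) amounts to giving each vertex an equal share \(|K|/(d+1)\) of every element that contains it. The following is a minimal sketch of how the resulting diagonal could be assembled; the function name and the array layout (`points`, `cells`) are illustrative assumptions, not part of the scheme's description.

```python
import math
import numpy as np

def lumped_mass_diagonal(points, cells):
    """Diagonal entries |S_i| / (d + 1) of the lumped mass matrix.

    points: (N_V, d) vertex coordinates; cells: (N_K, d + 1) vertex indices.
    """
    points = np.asarray(points, dtype=float)
    cells = np.asarray(cells, dtype=int)
    d = points.shape[1]
    # |K| = |det[v_1 - v_0, ..., v_d - v_0]| / d!
    edge_vectors = points[cells[:, 1:]] - points[cells[:, :1]]
    volumes = np.abs(np.linalg.det(edge_vectors)) / math.factorial(d)
    diag = np.zeros(len(points))
    # Each element contributes |K| / (d + 1) to each of its d + 1 vertices.
    np.add.at(diag, cells.ravel(), np.repeat(volumes / (d + 1), d + 1))
    return diag
```

Summing the diagonal returns the volume of the mesh, which is the discrete counterpart of integrating the constant function 1.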

Furthermore, we make use of the following edge-based integration formula [41] for the stiffness matrix, which will be used to prove the uniform boundedness of the scheme:

$$\begin{aligned} \begin{aligned} \mathcal {A}_h(\gamma ; \alpha , \beta ):=&\; \sum _{K\in \Omega _h} \sum _{E\in \mathcal {E}_K} \omega ^K_{E} \tilde{\gamma }_{E}\delta _{E}(\alpha )\delta _{E}(\beta )\\ =&\; \sum _{E\in \mathcal {E}_h} \omega _E {\tilde{\gamma }}_{E}\delta _{E}(\alpha )\delta _{E}(\beta ), \end{aligned} \end{aligned}$$
(9)

where \( \omega _E:=\sum _{K \supset E} \omega ^K_{E}\). Here for an edge E with vertices \(v_i\) and \(v_j\), we have

$$\begin{aligned} \delta _{E}(u_h) = u_h(v_i) - u_h(v_j). \end{aligned}$$
(10)

The quantity \(\tilde{\gamma }_E\) is the following harmonic average on E,

$$\begin{aligned} \tilde{\gamma }_E = \left[ \frac{1}{|E|}\int _E \frac{1}{\gamma } \,ds \right] ^{-1}. \end{aligned}$$
(11)

Also, the weights \(\omega ^K_E\) are given by the identity [5, 41]:

$$\begin{aligned} \omega _E^K=\frac{1}{d(d-1)}\left| \kappa _E^K\right| \cot \theta _E^K, \end{aligned}$$
(12)

with d being the number of dimensions, \(\theta _E^K\) being the angle between faces not containing edge E, and \(\kappa _E^K\) the (\(d-2\)) dimensional simplex formed by their intersection.
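
To make the weights (12) concrete, the sketch below evaluates them in the special case \(d=2\), where \(\kappa _E^K\) is the vertex opposite to E (with measure one) and the formula reduces to \(\omega _E^K=\frac{1}{2}\cot \theta _E^K\). It also builds the exact piecewise linear stiffness matrix of a single triangle for \(\gamma =1\), so that the two can be compared. The function names are hypothetical and the code is only an illustration under these assumptions.

```python
import numpy as np

def cotangent_edge_weights(tri):
    """Return {(i, j): omega_E^K} for the three edges of one triangle.

    tri: (3, 2) array of vertex coordinates. Edge (i, j) is weighted by
    0.5 * cot(theta), with theta the angle at the opposite vertex k.
    """
    tri = np.asarray(tri, dtype=float)
    weights = {}
    for k in range(3):
        i, j = (k + 1) % 3, (k + 2) % 3
        a, b = tri[i] - tri[k], tri[j] - tri[k]
        cross = a[0] * b[1] - a[1] * b[0]
        # cot(theta) = cos(theta) / sin(theta) = (a . b) / |a x b|
        weights[(min(i, j), max(i, j))] = 0.5 * np.dot(a, b) / abs(cross)
    return weights

def p1_stiffness(tri):
    """Exact P1 stiffness matrix on one triangle with gamma = 1."""
    tri = np.asarray(tri, dtype=float)
    B = np.hstack([np.ones((3, 1)), tri])   # rows [1, x_i, y_i]
    grads = np.linalg.inv(B)[1:, :]         # column i holds grad(phi_i)
    area = 0.5 * abs(np.linalg.det(B))
    return area * grads.T @ grads
```

For constant \(\gamma \), the off-diagonal stiffness entry for vertices \(v_i\) and \(v_j\) equals \(-\omega _E^K\), so the edge sum in (9) reproduces the standard element stiffness matrix; an obtuse angle opposite to an edge gives a negative local weight.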

2.2 Properties

In this section, we prove several important results for the scheme (7). We begin by showing that the scheme satisfies discrete versions of the properties in (3).

Theorem 2.1

The fully discrete, semi-implicit scheme (7) conserves mass, preserves positivity, and dissipates energy in the following form

$$\begin{aligned} E_h^n- E_h^{n-1}\le -\int _{\Omega _h} m \Delta t \exp \left( m u_h^{n-1}\right) |\nabla u_h^n|^2 dx, \end{aligned}$$
(13)

where the energy \(E_h^n:=\mathcal {M}(\exp (u_h^n), u_h^n-1).\)

Proof

The positivity of the scheme is guaranteed by the log-density variable formulation since \(\exp (u_h^n) >0\) always. The mass conservation can be proved by picking \(v_h=1\) in (7):

$$\begin{aligned} \mathcal {M}\left( \frac{\exp (u_{h}^n)-\exp (u_h^{n-1})}{\Delta t}, 1\right) + 0 = 0 \implies \mathcal {M}\left( \exp (u_{h}^n), 1\right) = \mathcal {M}\left( \exp (u_{h}^{n-1}), 1\right) .\nonumber \\ \end{aligned}$$
(14)

The only non-trivial property to prove is energy dissipation. Observe that by Taylor expansion, we get

$$\begin{aligned} (\exp (a)-\exp (b)) a= \exp (a)(a-1)- \exp (b)(b-1)+\frac{1}{2} \exp (\xi )(a-b)^2, \end{aligned}$$
(15)

where \(\xi \) is a function between a and b. Now, picking \(v_h=u_h^n\) in (7) and using the above Taylor expansion, we get

$$\begin{aligned} \mathcal {M} \left( \exp (u_{h}^n), u_h^{n}-1\right) - \mathcal {M} \left( \exp (u_{h}^{n-1}), u_h^{n-1}-1\right) =&\; -\Delta t \mathcal {A} \left( m\exp (m u_h^{n-1}); \nabla u_h^n, \nabla u_h^n\right) \nonumber \\&- \mathcal {M} \left( \exp (\xi ), \frac{(u_h^n-u_h^{n-1})^2}{2} \right) , \end{aligned}$$
(16)

where \(\xi \) is a function between \(u_h^n\) and \(u_h^{n-1}\). Both terms on the right-hand side are non-positive, so dropping the last one yields (13). This completes the proof. \(\square \)

Next we prove unique solvability of the scheme (7) where mass lumping (8) is used for the mass operator.

Theorem 2.2

The fully discrete scheme (7) is uniquely solvable provided that mass lumping (8) is used to evaluate the mass operator.

Proof

We prove this result using matrix–vector notation. Denoting \(\underline{u}^j\) as the coefficient vector of solution \(u_h^j\in V_h\), the scheme (7) with mass lumping can then be written in the following matrix–vector form:

$$\begin{aligned} {\textbf {M}}(\exp (\underline{u}^n)-\exp (\underline{u}^{n-1})) + \Delta t {\textbf {A}}^{n-1}\underline{u}^n= 0, \end{aligned}$$
(17)

where \(\textbf{M}\) is the diagonal mass matrix, by virtue of mass-lumping, with entries \(\textbf{M}_{ii} = \frac{|S_i|}{(d+1)},\) and \({\textbf {A}}^{n-1}\) is a symmetric positive semidefinite stiffness matrix with entries:

$$\begin{aligned} {\textbf {A}}^{n-1}_{ij} = \mathcal {A}\left( m\exp (m u_h^{n-1}); \nabla \phi _i, \nabla \phi _j\right) . \end{aligned}$$
(18)

It is clear that the nonlinear system (17) is the Euler-Lagrange equation of the following minimization problem:

$$\begin{aligned} \underline{u}^n:=\textrm{argmin}_{\underline{u}\in \mathbb {R}^{N_V}} F(\underline{u}), \end{aligned}$$

where the energy functional is

$$\begin{aligned} F(\underline{u}) = \underline{1}\cdot {\textbf {M}}\exp (\underline{u}) - \underline{u}\cdot {\textbf {M}}\exp (\underline{u}^{n-1})+ \frac{\Delta t}{2}\underline{u} \cdot {\textbf {A}}^{n-1}\underline{u}, \end{aligned}$$
(19)

where \(\underline{1}\) is the vector of ones of size \(N_V\). Hence, unique solvability of (17) is equivalent to the coercivity (existence) and strict convexity (uniqueness) of the functional (19). Both properties can be easily verified using the positivity of the mass and stiffness matrices and elementary calculation. We leave out the details. \(\square \)
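
One possible way to fill in these details, given here only as a sketch under the standing assumption \(\exp (\underline{u}^{n-1})>0\), is the following. Since \(\textbf{M}\) is diagonal, the gradient and Hessian of (19) are

$$\begin{aligned} \nabla F(\underline{u}) = {\textbf {M}}\left( \exp (\underline{u})-\exp (\underline{u}^{n-1})\right) + \Delta t {\textbf {A}}^{n-1}\underline{u}, \qquad \nabla ^2 F(\underline{u}) = {\textbf {M}}\,\textrm{diag}(\exp (\underline{u})) + \Delta t {\textbf {A}}^{n-1}. \end{aligned}$$

Setting the gradient to zero recovers (17). The Hessian is positive definite because \({\textbf {M}}\,\textrm{diag}(\exp (\underline{u}))\) is a positive diagonal matrix while \({\textbf {A}}^{n-1}\) is positive semidefinite, which gives strict convexity. For coercivity, dropping the non-negative quadratic term gives

$$\begin{aligned} F(\underline{u}) \ge \sum _{i=1}^{N_V}\textbf{M}_{ii}\left( \exp (u_i)-u_i\exp (u_i^{n-1})\right) , \end{aligned}$$

and each summand is bounded from below and tends to \(+\infty \) as \(|u_i|\rightarrow \infty \).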

Finally, we prove the uniform boundedness of the scheme (7) when mass lumping (8) is used for the mass operator, and the edge-based integration (9) is used for the stiffness operator.

Theorem 2.3

If the triangulation \(\Omega _h\) is Delaunay, then the solution to the fully discrete scheme (7) with mass lumping (8) being used for the mass operator and edge-based integration (9) being used for the stiffness operator is uniformly bounded. That is, given \(0<\varepsilon _1<\varepsilon _2\) such that \(\varepsilon _1 \le \exp (u_h^{n-1}) \le \varepsilon _2\), we have \(\varepsilon _1 \le \exp (u_h^n) \le \varepsilon _2\).

Proof

We only prove the lower bound, i.e., if \(\exp (u_h^{n-1})\ge \varepsilon _1\), then \(\exp (u_h^n)\ge \varepsilon _1\), as the upper bound follows from the same argument.

Taking a non-negative test function \(v_h\in V_h\) in (7) such that its coefficient \(v_i=\max \{\varepsilon _1-\exp (u_i^n), 0\}\), and using the matrix–vector form (17), we get

$$\begin{aligned} \underline{v}\cdot {\textbf {M}}(\exp (\underline{u}^n)-\exp (\underline{u}^{n-1})) + \Delta t \underline{v}\cdot {\textbf {A}}^{n-1}\underline{u}^n= 0. \end{aligned}$$
(20)

Using the fact that \(\textbf{M}\) is a diagonal positive matrix, \(\exp (\underline{u}^{n-1})-\varepsilon _1\ge 0\), and the definition of \(\underline{v}\), we have

$$\begin{aligned} \underline{v}\cdot {\textbf {M}}(\exp (\underline{u}^n)-\exp (\underline{u}^{n-1})) = \underbrace{\underline{v}\cdot {\textbf {M}}(\exp (\underline{u}^n)-\varepsilon _1)}_{\le 0}- \underbrace{\underline{v}\cdot {\textbf {M}}(\exp (\underline{u}^{n-1})-\varepsilon _1) }_{\ge 0} \le 0. \end{aligned}$$

Next, using the edge integration formula (9), we have

$$\begin{aligned} \underline{v}\cdot {\textbf {A}}^{n-1}\underline{u}^n= \sum _{E_{ij}\in \mathcal {E}_h}\omega _{E_{ij}}\tilde{\gamma }_{E_{ij}} (v_i-v_j)(u_i-u_j), \end{aligned}$$
(21)

where \(\gamma =m\exp (m\,u_h^{n-1})\) and \(u_i:=u_i^n\). Since \(v_i\) is a non-increasing function of \(u_i\) by the definition of \(v_h\), we have

$$\begin{aligned} (v_i-v_j)(u_i-u_j) = (v_i-v_j)((u_i-\varepsilon _1) - (u_j-\varepsilon _1)) \le 0. \end{aligned}$$

Hence, the term (21) is non-positive as long as \(\omega _{E_{ij}}>0\) for all \(E_{ij}\in \mathcal {E}_h\), which is equivalent to the requirement that the triangulation \(\Omega _h\) is Delaunay; see [41]. In this case, we have

$$\begin{aligned} \underline{v}\cdot \textbf{M}(\exp (\underline{u}^n)-\varepsilon _1) = \sum _{i=1}^{N_V} \textbf{M}_{ii}(\exp ({u}_i^n)-\varepsilon _1) \max \{\varepsilon _1-\exp (u_i^n), 0\} = 0 \end{aligned}$$

thanks to (20). Hence, \(\exp (u_i^n)\ge \varepsilon _1\), which completes the proof. \(\square \)

Remark 2.1

We remark that the edge-based integration in Theorem 2.3 above is only used to prove the non-positivity of the term (21) on Delaunay triangulations. When the mesh is a structured tensor-product grid, such bound preservation can be easily proven with other standard numerical integration rules; see, e.g., [20].

Moreover, although we use the edge integration formula (9) to prove the uniform boundedness result in Theorem 2.3, in practical implementation, we simply use the mass-lumping quadrature (8) to compute both \(\textbf{M}\) and \({\textbf {A}}^{n-1}\). The resulting scheme is still very robust, is provably uniquely solvable, and satisfies the properties in Theorem 2.1.

Remark 2.2

We comment that the log-density scheme admits higher-order spatial discretizations on general unstructured meshes while preserving the properties stated in Theorem 2.1. Unique solvability in Theorem 2.2 requires a lumped mass matrix, which can be achieved for high-order finite elements on tensor-product meshes using Gauss-Lobatto quadrature rules or using special simplicial finite element spaces that allow mass lumping [13, 17]. However, the analysis in Theorem 2.3 for bound preservation will not go through as it requires the special edge-based integration in (9), which is only available for piecewise linear elements on simplicial meshes. Finally, the energy stability analysis in Theorem 2.1 relies on the Taylor expansion property (15), which is only available for the first-order backward Euler temporal discretization. Construction of higher-order temporal discretization schemes that are energy stable within the log-density formulation is the subject of our on-going work.

Remark 2.3

We conclude this section by remarking on the practical implementation of the scheme (7) with compactly supported initial data \(\rho ^0\).

In this case, we still interpolate the initial data using (6). Note that we have \(u_i^0=-\infty \) whenever \(\rho ^0(v_i)=0\). Now, the k-th Newton iteration for (17) takes the form

$$\begin{aligned} \left( {\textbf {M}} {\textbf {D}}^{(k-1)} + \Delta t {\textbf {A}}^{n-1}\right) \underline{u}^{(k)} = {\textbf {M}}\left( {\textbf {D}}^{(k-1)} \underline{u}^{(k-1)} -\exp {(\underline{u}^{(k-1)})} + \exp {(\underline{u}^{n-1})} \right) , \end{aligned}$$
(22)

where \(\textbf{D}^{(k-1)}\) is a diagonal matrix with diagonal entry \(\textbf{D}^{(k-1)}_{ii} = \exp {u_i^{(k-1)}}\), and \(\underline{u}^{(0)} = \underline{u}^{n-1}\). To solve the above linear system for \(\underline{u}^{(k)}\), we only activate the i-th degree of freedom if the diagonal entry \(\left( {\textbf {M}} {\textbf {D}}^{(k-1)} + \Delta t {\textbf {A}}^{n-1}\right) _{ii}\) is greater than a small cutoff value (e.g., \(10^{-14}\)), and set the inactive degrees of freedom to \(-\infty \).
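
The Newton update (22) can be illustrated with a small 1D sketch. It assumes a uniform mesh, strictly positive data (so no degrees of freedom need to be deactivated), the lumped mass matrix (8) with \(d=1\), and a stiffness matrix whose coefficient \(\gamma =m\exp (m\,u_h^{n-1})\) is averaged over the two vertices of each element; the function name, tolerance, and iteration cap are illustrative choices rather than the settings used in this paper.

```python
import numpy as np

def log_density_step(u_prev, h, dt, m, tol=1e-12, max_iter=50):
    """One time step of (17) on a uniform 1D mesh via the Newton update (22)."""
    n = u_prev.size
    # Lumped mass (8) with d = 1: half the length of each vertex patch.
    M = np.full(n, h)
    M[0] = M[-1] = 0.5 * h
    # Coefficient gamma = m * exp(m * u^{n-1}), averaged per element
    # (vertex quadrature; an assumption made for this sketch).
    gamma = m * np.exp(m * u_prev)
    w = 0.5 * (gamma[:-1] + gamma[1:]) / h
    A = np.zeros((n, n))
    i = np.arange(n - 1)
    np.add.at(A, (i, i), w)
    np.add.at(A, (i + 1, i + 1), w)
    np.add.at(A, (i, i + 1), -w)
    np.add.at(A, (i + 1, i), -w)
    rho_prev = np.exp(u_prev)
    u = u_prev.copy()                      # u^{(0)} = u^{n-1}
    for _ in range(max_iter):
        D = np.exp(u)                      # diagonal of D^{(k-1)}
        lhs = np.diag(M * D) + dt * A
        rhs = M * (D * u - D + rho_prev)
        u_new = np.linalg.solve(lhs, rhs)
        converged = np.max(np.abs(u_new - u)) < tol
        u = u_new
        if converged:
            break
    return u, M
```

A call such as `log_density_step(np.log(rho0), h, dt, m)` returns the new log-density together with the lumped mass diagonal, so that the discrete mass `M @ np.exp(u)` and the lumped energy `M @ (np.exp(u) * (u - 1))` can be monitored from step to step.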

3 Mixed Method

In this section, we develop our mixed formulation to solve the PME (1).

The physical energy for the PME is given by \(U(\rho ) = \frac{\rho ^m}{m-1}\). We define a new potential variable equal to the derivative of the physical energy:

$$\begin{aligned} \mu = U'(\rho ) = \frac{m}{m-1}\rho ^{m-1}. \end{aligned}$$
(23)

We also define a velocity variable, \(\textbf{u}\), and set it equal to the negative gradient of the potential, i.e.,

$$\begin{aligned} \textbf{u}= -\nabla \mu . \end{aligned}$$
(24)

Notice that using this definition, we easily observe that

$$\begin{aligned} \nabla \rho ^m = \rho \nabla \mu = -\rho \textbf{u}. \end{aligned}$$
(25)

We have now successfully reformulated the PME (1) into the following first-order system

$$\begin{aligned} {\left\{ \begin{array}{ll} &{}\rho _t+\nabla \cdot (\rho \textbf{u}) =0, \\ &{} \textbf{u}+\nabla \mu = 0, \\ &{} \mu - \frac{m}{m-1}\rho ^{m-1}=0, \end{array}\right. } \end{aligned}$$
(26)

where the unknowns are density, potential, and velocity. We again equip the PME system (26) with the homogeneous Neumann boundary condition

$$\begin{aligned} \textbf{u}\cdot \textbf{n}=0 \text { on }\partial \Omega . \end{aligned}$$

3.1 Spatial and Temporal Discretizations

Our energy-stable mixed method can be constructed on structured tensor product meshes in any space dimension, or on general unstructured Delaunay triangular meshes in two dimensions. Due to the use of velocity mass lumping as a key tool to establish the energy stability result, the method does not work on general simplicial meshes in three dimensions. Below we describe the method in detail on 2D triangular meshes, following the meshing notation as in Sect. 2.1. For any triangular element \(K\in \Omega _h\), we denote \(\partial K\) as its boundary whose associated outward unit normal is \(\textbf{n}_K\).

We shall use the following two finite element spaces:

$$\begin{aligned} \begin{aligned}&Q_{h}=\left\{ q_{h} \in L^{2}(\Omega ): \hspace{5.0pt}\left. q_{h}\right| _{K} \in P_{0}(K), \hspace{5.0pt}\forall K \in \Omega _{h}\right\} , \\&\textbf{V}_{h}=\left\{ \textbf{v}_{h} \in {\text {H}}({\text {div}}; \Omega ): \hspace{5.0pt}\left. \textbf{v}_{h}\right| _{K} \in RT_0(K), \hspace{5.0pt}\forall K \in \Omega _{h}, \;\; \textbf{v}_h\cdot \textbf{n}= 0,\quad \text { on } \partial \Omega \right\} , \end{aligned} \end{aligned}$$
(27)

where \(P_0(K)\) is the space of constants, and \(RT_0(K)=[P_0(K)]^2\oplus \textbf{x}P_0(K)\) is the lowest-order Raviart-Thomas space on the triangle K. Note that the Neumann boundary condition is encoded in the velocity space \(\textbf{V}_h\).

The spatial discretization now reads as follows: find \(\rho _h, \mu _h \in Q_h\) and \(\textbf{u}_h \in \textbf{V}_h\) such that, for \(t>0\),

$$\begin{aligned} \sum _{K \in \Omega _h}\left[ \int _K (\rho _h)_t q_h\,\textrm{dx} + \int _{\partial K} \hat{\rho }_h\textbf{u}_h \cdot \textbf{n}_K q_h\,\textrm{ds} \right] =&\; 0 , \qquad \forall \hspace{5.0pt}q_h \in Q_h, \end{aligned}$$
(28a)
$$\begin{aligned} \sum _{K \in \Omega _h}\left[ \int _K \textbf{u}_h \cdot \textbf{v}_h\,\textrm{dx} - \int _{\partial K} \mu _h \textbf{v}_h \cdot \textbf{n}_K\,\textrm{ds} \right] =&\;0, \qquad \forall \hspace{5.0pt}\textbf{v}_h \in \textbf{V}_h, \end{aligned}$$
(28b)
$$\begin{aligned} \sum _{K \in \Omega _h} \int _K (\mu _h-\tfrac{m}{m-1} \rho _h^{m-1}) r_h dx =&\;0, \qquad \forall \hspace{5.0pt}r_h \in Q_h, \end{aligned}$$
(28c)

where \(\hat{\rho }_h\) is the upwinding numerical flux, i.e., given an edge E shared by two elements \(K^+\) and \(K^-\) with \(\textbf{u}_h\cdot \textbf{n}_{K^-}|_E\ge 0\), \(\hat{\rho }_h\) takes its value from \(K^-\):

$$\begin{aligned} \left. \hat{\rho }_h \right| _{E} = (\rho _h|_{K^-})|_{E}, \quad \text { where } \textbf{u}_h\cdot \textbf{n}_{K^-}|_E\ge 0. \end{aligned}$$
(29)

Analogous to the log-density approach, we apply a first-order semi-implicit discretization of the ODE system (28) to obtain the following fully discrete scheme: Given time step size \(\Delta t\) and data \(\rho _h^{n-1}\in Q_h\) at time \(t^{n-1}\), find \(\mu ^n_h, \rho ^n_h \in Q_h\), and \(\textbf{u}_h^n \in \textbf{V}_h\), at time \(t^n\), such that

$$\begin{aligned} \mathcal {M}_h \left( \frac{\rho _h^n-\rho _h^{n-1}}{\Delta t}, q_h \right) + \mathcal {A}_h\left( \hat{\rho }_h^{n-1}; {\textbf {u}}_h^n, q_h \right) = 0, \quad&\forall q_h \in Q_h, \end{aligned}$$
(30a)
$$\begin{aligned} \bar{\mathcal {M}_h}(\textbf{u}_h^n, \textbf{v}_h) - \bar{\mathcal {A}_h} (\mu ^n_h, \textbf{v}_h) = 0, \quad&\forall \textbf{v}_h \in \textbf{V}_h, \end{aligned}$$
(30b)
$$\begin{aligned} \mathcal {M}_h \left( \mu ^n_h-\frac{m}{m-1}(\rho _h^n)^{m-1} ,r_h \right) = 0, \quad&\forall r_h \in Q_h, \end{aligned}$$
(30c)

where the associated operators are defined as:

$$\begin{aligned}&\mathcal {M}_h \left( \alpha ,\beta \right) = \sum _{K \in \Omega _h} \int _K \alpha \beta \textrm{dx},{} & {} \bar{\mathcal {M}}_h(\varvec{\alpha }, \varvec{\beta }) = \sum _{K \in \Omega _h} \int _K \varvec{\alpha }\cdot \varvec{\beta }\textrm{dx}, \end{aligned}$$
(31a)
$$\begin{aligned}&\mathcal {A}_h\left( \gamma ; \varvec{\alpha }, \beta \right) = \sum _{K \in \Omega _h} \int _{\partial K} \gamma \varvec{\alpha }\cdot \textbf{n}_K \beta \textrm{ds},{} & {} \bar{\mathcal {A}}_h\left( \alpha , \varvec{\beta }\right) = \sum _{K \in \Omega _h} \int _{\partial K} \alpha \varvec{\beta }\cdot \textbf{n}_K \textrm{ds}. \end{aligned}$$
(31b)

Again, the numerical flux \(\hat{\rho }_h^{n-1}\) in (30a) is taken to be the upwind flux: given an edge E shared by two elements \(K^+\) and \(K^-\) with \(\textbf{u}_h^n\cdot \textbf{n}_{K^-}|_E\ge 0\), we take

$$\begin{aligned} \left. \hat{\rho }_h^{n-1} \right| _{E} = (\rho _h^{n-1}|_{K^-})|_{E}, \quad \text { where } \textbf{u}_h^n\cdot \textbf{n}_{K^-}|_E\ge 0. \end{aligned}$$
(32)

Standard one-point numerical integration rules are used in the evaluation of the operators \(\mathcal {M}_h, \mathcal {A}_h,\) and \(\bar{\mathcal {A}}_h\). However, the energy stability of the scheme requires that the velocity mass matrix be mass lumped. As such, the operator \(\bar{\mathcal {M}}_h\) must be evaluated using an appropriate mass-lumping quadrature. On tensor-product grids, this is readily achieved by the use of the trapezoidal rule. On triangular meshes, mass-lumping is achieved by using the following formula given in [3]:

$$\begin{aligned} \bar{\mathcal {M}_h }\left( \varvec{\alpha },\varvec{\beta }\right) := \sum _{K \in \Omega _h} \sum _{E \in \mathcal {E}_K} \omega _E^K \varphi _E(\varvec{\alpha })\varphi _E(\varvec{\beta }) = \sum _{E \in \mathcal {E}_h} \omega _E\varphi _E(\varvec{\alpha })\varphi _E(\varvec{\beta }), \end{aligned}$$
(33)

where \(\omega _E:= \sum _{K \supset E} \omega _E^K\), and \(\varphi _E(\varvec{\alpha }):=\varvec{\alpha }\cdot \textbf{n}_E\) denotes the normal flux of \(\varvec{\alpha }\) through edge E. The weights \(\omega _E^K\) are given by

$$\begin{aligned} \omega _E^K=\frac{1}{2} \cot \theta _E^K, \end{aligned}$$
(34)

where \(\theta _E^K\) is the angle opposite to edge E in K. It is clear that the mass matrix associated with the integration rule in (33) is a diagonal matrix, whose diagonal entries are positive provided that the mesh is Delaunay.
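
One way to see the role of the Delaunay condition (a short check, under the assumption that E is an interior edge shared by \(K^+\) and \(K^-\) with opposite angles \(\theta ^+\) and \(\theta ^-\) in \((0,\pi )\)) is the identity

$$\begin{aligned} \omega _E = \frac{1}{2}\left( \cot \theta ^+ + \cot \theta ^-\right) = \frac{\sin (\theta ^+ +\theta ^-)}{2\sin \theta ^+ \sin \theta ^-}, \end{aligned}$$

so \(\omega _E>0\) exactly when \(\theta ^+ +\theta ^-<\pi \), which is the strict form of the classical Delaunay angle criterion for an interior edge.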

Remark 3.1

To efficiently solve the scheme (30), we first apply static condensation to locally solve the potential and velocity variables \(\mu _h^n\) and \(\textbf{u}_h^n\) as functions of the density \(\rho _h^n\) using equations (30b)–(30c), and then use Newton’s method to solve the resulting parabolic nonlinear system (30a) for density alone.
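
The elimination described above can be sketched in a simplified setting. The code below is a hypothetical 1D analogue on a uniform grid, where the lumped (trapezoidal) velocity mass matrix gives \(h\,u_f=\mu _{\text {left}}-\mu _{\text {right}}\) at each interior face f; it evaluates the residual of (30a) as a function of the density alone, with the upwind flux (32) taken from \(\rho _h^{n-1}\). The function name and array layout are assumptions made for illustration, and the nonlinear solve itself is not shown.

```python
import numpy as np

def mixed_residual_1d(rho, rho_prev, h, dt, m):
    """Residual of (30a) after eliminating mu and u on a uniform 1D grid.

    rho, rho_prev: piecewise constant densities at t^n and t^{n-1}.
    """
    # (30c): the potential is a pointwise function of the density.
    mu = m / (m - 1.0) * rho ** (m - 1.0)
    # (30b) with lumped velocity mass: h * u_f = mu_left - mu_right.
    u_face = -(mu[1:] - mu[:-1]) / h
    # (32): upwind value of rho^{n-1}, taken from the cell the flow leaves.
    rho_up = np.where(u_face >= 0.0, rho_prev[:-1], rho_prev[1:])
    # The boundary condition u . n = 0 gives zero flux at both ends.
    flux = np.concatenate(([0.0], rho_up * u_face, [0.0]))
    return h * (rho - rho_prev) / dt + (flux[1:] - flux[:-1])
```

Because the face fluxes telescope, the residual sums to \(h\sum _K(\rho _K^n-\rho _K^{n-1})/\Delta t\), so any density that zeroes the residual conserves mass. Cells with zero density pose no difficulty for evaluating the residual when \(m\ge 2\).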

3.2 Properties

Theorem 3.1

Provided a Delaunay triangulation \(\Omega _h\), the fully discrete scheme (30) is mass conservative and energy dissipative in the following forms

$$\begin{aligned} \mathcal {M}_h \left( \rho _h^n, 1 \right)&= \mathcal {M}_h \left( \rho _h^{n-1}, 1 \right) \end{aligned}$$
(35a)
$$\begin{aligned} \mathcal {M}_h \left( U(\rho _h^n),1 \right) - \mathcal {M}_h \left( U(\rho _h^{n-1}),1 \right)&\le -\Delta t \sum _{E \in \mathcal {E}_h} \omega _E \hat{\rho }_h^{n-1} \varphi _E({\textbf {u}}_h^n)^2 , \end{aligned}$$
(35b)

where \(U^n = \frac{m}{m-1}(\rho ^n)^{m-1}\) is the physical energy, and the right-hand side in (35b) comes from the quadrature formula (33).

Proof

By picking \(q_h = 1\) in (30a) and using the homogeneous boundary condition \(\textbf{u}_h^n\cdot \textbf{n}|_{\partial \Omega }=0\), we can easily recover the mass conservation (35a). Now, to prove energy stability, we take the test function \(\textbf{v}_h\) in (30b) to be a function in \(\textbf{V}_h\) such that its normal flux through edge E is given by \(\varphi _E({\textbf {v}}_h) = \varphi _E(\hat{\rho }_h^{n-1}{\textbf {u}}_h^n)\). For this choice of \(\textbf{v}_h\), we have

$$\begin{aligned} \mathcal {A}_h\left( \hat{\rho }_h^{n-1}; {\textbf {u}}_h^n, \mu ^n_h \right) = \bar{\mathcal {A}_h} (\mu ^n_h, {\textbf {v}}_h). \end{aligned}$$
(36)

Additionally, through an application of (33), we obtain

$$\begin{aligned} \bar{\mathcal {M}_h}({\textbf {u}}_h^n, {\textbf {v}}_h) = \sum _{E \in \mathcal {E}_h} \omega _E\hat{\rho }_h^{n-1}\varphi _E({\textbf {u}}_h^n)^2. \end{aligned}$$
(37)

Next, taking the test function \(r_h=\frac{\rho _h^n-\rho _h^{n-1}}{\Delta t}\) in (30c), we get

$$\begin{aligned} \begin{aligned}&\mathcal {M}_h \left( \frac{\rho _h^n-\rho _h^{n-1}}{\Delta t}, \mu _h^n \right) = \mathcal {M}_h \left( \frac{\rho _h^n-\rho _h^{n-1}}{\Delta t}, U'(\rho _h^n) \right) \\&= \frac{1}{\Delta t}\mathcal {M}_h \left( U(\rho _h^n),1 \right) - \frac{1}{\Delta t}\mathcal {M}_h \left( U(\rho _h^{n-1}),1 \right) + \frac{1}{\Delta t} \mathcal {M}_h\left( \frac{U''(\xi )}{2}(\rho _h^n-\rho _h^{n-1})^2 \right) , \end{aligned} \end{aligned}$$
(38)

where we have used Taylor expansion with \(\xi \) being a function between \(\rho ^n_h\) and \(\rho _h^{n-1}\). Finally, taking the test function \(q_h=\mu _h^n\) in (30a), and using the above relations, we get

$$\begin{aligned} \begin{aligned} \mathcal {M}_h \left( U(\rho _h^n),1 \right) - \mathcal {M}_h \left( U(\rho _h^{n-1}),1 \right)&= -\Delta t\sum _{E \in \mathcal {E}_h} \omega _E\hat{\rho }_h^{n-1}\varphi _E({\textbf {u}}_h^n)^2\\ {}&\quad - \mathcal {M}_h\left( \frac{U''(\xi )}{2}(\rho _h^n-\rho _h^{n-1})^2 \right) \\&\le -\Delta t\sum _{E \in \mathcal {E}_h} \omega _E\hat{\rho }_h^{n-1}\varphi _E({\textbf {u}}_h^n)^2. \end{aligned} \end{aligned}$$
(39)

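The inequality in (39) uses \(U''(\xi ) \ge 0\), that is, the convexity of the energy density. For non-negative \(\hat{\rho }_h^{n-1}\), the right-hand side is non-positive if \(\omega _E \ge 0\), which is guaranteed if the triangulation \(\Omega _h\) is Delaunay. This completes the proof. \(\square \)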

Next, we prove that the scheme (30) is positivity preserving under a usual CFL time stepping constraint.

Theorem 3.2

Given \(\rho ^{n-1}_h \ge 0\), the fully discrete scheme (30) is positivity preserving under the CFL condition,

$$\begin{aligned} {\Delta t} \sum _{E \in \mathcal {E}^-_K} |\textbf{u}_h^n\cdot {\textbf{n}_K}|_E\frac{|E|}{|K|} \le 1, \quad \forall K \in \Omega _h, \end{aligned}$$
(40)

where

$$\begin{aligned} \mathcal {E}_K^-:=\{E\in \mathcal {E}_K: \quad \textbf{u}_h^n\cdot \textbf{n}_K|_E\ge 0 \}. \end{aligned}$$

Proof

Restricting the mass conservation equation (30a) to a single element \(K\in \Omega _h\), we have

$$\begin{aligned} \frac{|K|}{\Delta t}(\rho _K^n - \rho _K^{n-1}) + \sum _{E \in \mathcal {E}_K} \hat{\rho }^{n-1}_E \left. ({\textbf {u}}^n_h \cdot {\textbf {n}}_K)\right| _E |E| = 0, \end{aligned}$$
(41)

where \(\rho _K^n:=\rho _h^n|_K\) is the restriction of the function \(\rho _h^n\in Q_h\) to the element K. This implies that

$$\begin{aligned} \rho _K^n = \rho _K^{n-1}-\frac{\Delta t}{|K|} \sum _{E \in \mathcal {E}_K} \hat{\rho }^{n-1}_E\left. ({\textbf {u}}^n_h \cdot {\textbf {n}}_K)\right| _E |E|. \end{aligned}$$

By the definition of \(\mathcal {E}^-_K\) and the upwind flux (32), we have \(\hat{\rho }_E^{n-1}=\rho _K^{n-1}\) for all \(E\in \mathcal {E}^-_K\), while \(\textbf{u}_h^n\cdot \textbf{n}_K|_E< 0\) on the remaining edges. Hence,

$$\begin{aligned} \rho _K^n = \rho _K^{n-1}\left( 1-\frac{\Delta t}{|K|} \sum _{E \in \mathcal E^-_K} \left| {\textbf {u}}^n_h \cdot {\textbf {n}}_K\right| _E |E|\right) + \frac{\Delta t}{|K|} \sum _{E \in \mathcal {E}_K\backslash \mathcal {E}^-_K} \hat{\rho }_E^{n-1}\left| {\textbf {u}}^n_h \cdot {\textbf {n}}_K\right| _E |E|. \end{aligned}$$

Both terms on the right-hand side are nonnegative under the assumption that \(\rho _h^{n-1}\ge 0\) and the CFL condition (40). This completes the proof. \(\square \)
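The argument above can be checked numerically with a short Python sketch of the cell update (41) on a uniform one-dimensional grid. This is only an illustration under simplifying assumptions: the face velocities are treated as given random data, whereas in the scheme (30) the velocity \(\textbf{u}_h^n\) is coupled to \(\rho _h^n\); the function names upwind_step and cfl_number and all parameter values are hypothetical.

```python
import numpy as np

def upwind_step(rho, u_face, h, dt):
    """One step of the cell update (41) on a uniform 1D grid with cell size h.

    rho    : cell values rho_K^{n-1}, shape (N,)
    u_face : velocity at the N+1 faces (positive to the right); the two
             boundary entries must be zero (no-flux boundary condition)
    """
    rho_left = np.concatenate(([0.0], rho))    # cell to the left of each face
    rho_right = np.concatenate((rho, [0.0]))   # cell to the right of each face
    rho_hat = np.where(u_face >= 0.0, rho_left, rho_right)  # upwind value
    flux = rho_hat * u_face
    return rho - dt / h * (flux[1:] - flux[:-1])

def cfl_number(u_face, h, dt):
    """Left-hand side of (40), maximized over all cells (|E| = 1, |K| = h)."""
    outflow = np.maximum(u_face[1:], 0.0) + np.maximum(-u_face[:-1], 0.0)
    return dt / h * outflow.max()

rng = np.random.default_rng(0)
n_cells, h = 50, 0.1
rho_old = rng.random(n_cells)                  # non-negative data
u = rng.normal(size=n_cells + 1)               # illustrative velocity field
u[0] = u[-1] = 0.0                             # homogeneous boundary condition
dt = 0.9 / cfl_number(u, h, 1.0)               # CFL number 0.9, inside (40)
rho_new = upwind_step(rho_old, u, h, dt)
print(rho_new.min() >= 0.0, np.isclose(rho_new.sum(), rho_old.sum()))
```

The time step is chosen slightly below the limit in (40) so that round-off cannot produce small negative values in cells with outflow through every face.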

Remark 3.2

The scheme (30) cannot be extended to higher-order spatial discretizations on unstructured triangular grids while preserving the energy stability property, owing to the absence of suitable mass-lumping procedures. Higher-order spatial discretizations are nevertheless possible on tensor-product grids, where the mass matrix can be lumped by using appropriate Gauss-Lobatto quadrature rules. Extensions to higher-order time discretizations that preserve the energy stability property are not possible, for a reason similar to that mentioned in Remark 2.2.

Remark 3.3

We note that for tensor-product meshes, the stronger property of unconditional positivity preservation can be proven following similar arguments as in the proof of Theorem 2.3; see also [20]. The key is to locally eliminate the velocity and potential degrees of freedom to express the scheme as a finite volume scheme for the piecewise constant density unknown only. In this case, unconditional positivity preservation holds for any consistent numerical flux, that is, the upwinding flux is not needed for the positivity proof on tensor-product meshes. We omit the detailed derivation.

4 Numerical Results

In this section, we present our numerical findings. All computations are performed using the Python interface of the open-source library NGSolve [37]. The source code for these computations is available at the following git repository: https://github.com/avj-jpg/pme.

Table 1 Convergence results for 1D Barenblatt initial data

Fig. 1 Evolution of 1D Barenblatt initial data by the two schemes for \(m=3\), \(\Delta t = 0.05\), and \(N=200\) elements

4.1 1D Barenblatt Solution

The porous medium equation admits an exact weak solution formulated by Barenblatt [4] and Pattle [34]. In the one-dimensional case, the Barenblatt solution is given by the equation:

$$\begin{aligned} \rho _B(x,t) = (t+1)^{-k} \left( s_0 - \frac{k(m-1)}{2m} \frac{x^2}{(t+1)^{2k}}\right) _+^{\frac{1}{m-1}}, \quad t > 0, \end{aligned}$$
(42)

where \(k = (m+1)^{-1}\), and \(s_0\) denotes a scaling factor. Note that this solution is compactly supported in the interval \([-\eta _m(t), \eta _m(t)]\), where the right boundary \(\eta _m(t)\) moves as:

$$\begin{aligned} \eta _m(t) = \sqrt{\frac{2ms_0}{k(m-1)}}(t+1)^k. \end{aligned}$$
(43)
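For concreteness, (42) and (43) can be evaluated with the short Python sketch below. The function names barenblatt_1d and interface_1d are our own, and the snippet is independent of the NGSolve implementation; it only illustrates the formulas. The parameters \(m=3\) and \(s_0=3\) are among those used in the convergence study of this subsection.

```python
import numpy as np

def barenblatt_1d(x, t, m, s0):
    """Evaluate the 1D Barenblatt profile (42)."""
    k = 1.0 / (m + 1.0)
    core = s0 - k * (m - 1.0) / (2.0 * m) * x**2 / (t + 1.0) ** (2.0 * k)
    # (.)_+ is the positive part; it produces the compact support
    return (t + 1.0) ** (-k) * np.maximum(core, 0.0) ** (1.0 / (m - 1.0))

def interface_1d(t, m, s0):
    """Position of the right free boundary, formula (43)."""
    k = 1.0 / (m + 1.0)
    return np.sqrt(2.0 * m * s0 / (k * (m - 1.0))) * (t + 1.0) ** k

m, s0 = 3, 3.0
x = np.linspace(-10.0, 10.0, 2001)
for t in (0.0, 1.0):
    rho = barenblatt_1d(x, t, m, s0)
    mass = (x[1] - x[0]) * rho.sum()   # simple quadrature of the total mass
    print(t, interface_1d(t, m, s0), mass)
```

For these parameters the initial interface sits at \(\eta _3(0)=6\), so the support stays well inside \([-10,10]\) up to \(t=1\), while the interval \([-5,5]\) lies in the interior of the support.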

To verify the accuracy of the two schemes, we use (42) as the initial data, with \(x \in [-10,10]\), \(s_0=3\), and \(m\in \{2,3,4\}\), and conduct a spacetime mesh-refinement study for the \(L^2\)-error of density at final time \(t=1\). We record the \(L^2\)-error in the entire domain \([-10,10]\) and in the interval \([-5,5]\) away from the interface. We utilize a sequence of meshes consisting of \(\{100 \cdot 2^{i}\}_{i=0}^3\) spatial elements and set the timestep size to \(\frac{1}{5\cdot 4^{i}}\) for the log-density method and to \(\frac{1}{10\cdot 2^{i}}\) for the mixed method, respectively. The results of this convergence study are recorded in Table 1. For all three values of m, we observe that the log-density method is second-order accurate in space and first-order accurate in time in the region \([-5,5]\) where the solution remains smooth at final time. For larger values of m, the order of convergence deteriorates when the \(L^2\)-error is calculated in the entire domain. This is anticipated since the error is expected to be greater at the interface for larger values of m, owing to the decreased regularity of the solution.

On the other hand, we observe that the mixed scheme has the expected first order accuracy in both space and time in the interval \([-5,5]\). Similar to the case of the log-density method, the order of convergence is observed to decay with m if the \(L^2\)-error is measured in the entire domain \([-10,10]\). However, the decay in order is slower than in the case of the log-density scheme.
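The refinement schedule and the way an observed order is extracted from a sequence of errors can be sketched as follows. The error values in this snippet are synthetic, generated from exact power laws purely for illustration; they are not the data of Table 1. The helper name observed_orders is hypothetical. The schedule reflects the stated accuracy of the two methods: refining \(\Delta t\) by a factor of 4 per level balances a second-order spatial error against a first-order temporal error, while a factor of 2 balances first-order errors in both.

```python
import math

def observed_orders(errors):
    """Observed order between consecutive levels, assuming h is halved."""
    return [math.log2(e0 / e1) for e0, e1 in zip(errors, errors[1:])]

levels = range(4)
n_elems = [100 * 2**i for i in levels]                 # spatial elements
dt_log_density = [1.0 / (5 * 4**i) for i in levels]    # dt scales like h^2
dt_mixed = [1.0 / (10 * 2**i) for i in levels]         # dt scales like h

# Synthetic errors (NOT the values of Table 1): exact h^2 and h decay.
h = [20.0 / n for n in n_elems]                        # domain [-10, 10]
synthetic_second = [0.3 * hi**2 for hi in h]
synthetic_first = [0.3 * hi for hi in h]

print(observed_orders(synthetic_second))
print(observed_orders(synthetic_first))
```

Applied to measured \(L^2\)-errors, the same function returns the orders reported in a convergence table, provided the mesh size is halved between consecutive levels.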

Figure 1 displays plots of the numerically computed Barenblatt solution using the two schemes at time \(t=1\), for \(m=3\) and \(\Delta t=0.05\). These plots illustrate that the initial profile is accurately evolved by both schemes, with no oscillations emerging near the interface, and density staying non-negative at all times.

Fig. 2 Observation of waiting time phenomenon for \(m=3\) and \(\theta =0\). Panels (A) and (C) show the solutions at various times, and panels (B) and (D) show the density over time at the node (\(x_R^0\)) corresponding to the initial location of the right interface

4.2 Waiting Time Phenomenon

The waiting time phenomenon is a well-known feature of the PME for initial data of the form

$$\begin{aligned} \rho ^0(x)= {\left\{ \begin{array}{ll}\left( \frac{m-1}{m}\left( (1-\theta ) \cos ^2 x+\theta \cos ^4 x\right) \right) ^{\frac{1}{m-1}}, &{} \text{ if } -\frac{\pi }{2} \le x \le \frac{\pi }{2} \\ 0, \quad &{} \text{ otherwise, } \end{array}\right. } \end{aligned}$$
(44)

with \(\theta \) in the interval [0, 1]. For this data, the interface does not move until t is greater than a certain waiting time \(t^*\), even though the internal profile continues to evolve. The theoretical waiting time for \(0 \le \theta \le \frac{1}{4}\) is given by [1]

$$\begin{aligned} t^*=\frac{1}{2(m+1)(1-\theta )}. \end{aligned}$$
(45)

In our numerical experiment, we set the parameters \(m=3\) and \(\theta =0\), and solve the PME (1) with the initial data (44) using the two schemes on a sequence of meshes with \(\{200\cdot 2^i\}_{i=0}^6\) elements and a fixed time step \(\Delta t = 0.001\). The simulation is performed until the final time \(t=0.15\), past the theoretical waiting time \(t^*=0.125\). We denote by \(x_R^0\) the node that corresponds to the location of the right interface at time \(t=0\) and track the density value at this node for each time step.
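The initial data (44) and the waiting time (45) are easy to evaluate directly. The Python sketch below is a minimal illustration with function names of our own choosing (waiting_time_data and waiting_time); it is not an excerpt from the simulation code.

```python
import numpy as np

def waiting_time_data(x, m, theta):
    """Initial density (44); zero outside [-pi/2, pi/2]."""
    x = np.asarray(x, dtype=float)
    c2 = np.cos(x) ** 2
    profile = (m - 1.0) / m * ((1.0 - theta) * c2 + theta * c2**2)
    inside = np.abs(x) <= np.pi / 2
    return np.where(inside, profile, 0.0) ** (1.0 / (m - 1.0))

def waiting_time(m, theta):
    """Theoretical waiting time (45), stated for 0 <= theta <= 1/4."""
    if not 0.0 <= theta <= 0.25:
        raise ValueError("formula (45) is stated for 0 <= theta <= 1/4")
    return 1.0 / (2.0 * (m + 1.0) * (1.0 - theta))

m, theta, dt = 3, 0.0, 0.001
t_star = waiting_time(m, theta)
x_right = np.pi / 2                      # initial right interface x_R^0
print(t_star, round(t_star / dt), float(waiting_time_data(x_right, m, theta)))
```

With \(m=3\), \(\theta =0\), and \(\Delta t=0.001\), the waiting time is reached after 125 time steps, which leaves 25 further steps before the final time \(t=0.15\).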

The results of our experiment are presented in Fig. 2. Panels (A) and (B) show the density profiles at times \(t=0\), \(t=0.125\), and \(t=0.15\), while (C) and (D) plot the density at \(x_R^0\) as a function of time for different values of N, as obtained by the log-density and the mixed method, respectively. In both (A) and (B), we observe that the interface remains stationary at \(t=0.125\) while the internal profile continues to evolve. At time \(t=0.15\), a clear movement of the interface is noticed. Panels (C) and (D) reveal that, for both methods, when t is equal to the theoretical waiting time (indicated by a black vertical line), the density value at \(x_R^0\) gradually decreases towards 0 as the mesh is refined by increasing N. Furthermore, for each value of N, the mixed method gives a smaller density, and thus a smaller error, at the node \(x_R^0\) when \(t=t^*\). The smaller error can be ascribed to the piecewise constant spatial discretization of the mixed method, which allows for a sharper capture of the interface, reducing the error near it.

4.3 Higher Dimensional Barenblatt Solution

The d-dimensional version of (42) is given by

$$\begin{aligned} \rho _B(\varvec{x},t) = (t+1)^{-k} \left( s_0 - \frac{k(m-1)}{2dm} \frac{|\varvec{x}|^2}{(t+1)^{2k/d}}\right) _+^{\frac{1}{m-1}}, \quad t > 0, \end{aligned}$$
(46)

where \(k=\frac{1}{m-1+2/d}\). Analogous to the one-dimensional case, we verify the accuracy of the two schemes in two dimensions by using (46) as initial data with \(x \in [-6,6]^2\), \(s_0=1\), and \(m\in \{2,3,4\}\) and conduct a spacetime mesh-refinement study for the \(L^2\)-error at final time \(t=0.2\). The \(L^2\)-error is recorded in the entire domain and in the box \([-3,3]^2\) away from the interface. We use a sequence of spatial meshes with \(\{(32\times 32)\cdot 2^i\}_{i=0}^3\) tensor-product elements and set the timestep size to \(\frac{1}{5\cdot 4^{i}}\) for the log-density method and to \(\frac{1}{10\cdot 2^{i}}\) for the mixed method, respectively. The results of the study are summarized in Table 2. Akin to the one-dimensional case, we observe second-order spatial accuracy and first-order temporal accuracy of the log-density method in the box \([-3,3]^2\) where the solution remains smooth at final time. In the same region, the mixed method is observed to be first-order accurate in space and time as expected. Additionally, the order of convergence is observed to decay with m when the error is measured in the entire domain. This deterioration is slower for the case of the mixed method.
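A dimension-independent evaluation of (46) can be sketched in Python as follows. The function names barenblatt and support_radius are illustrative choices of ours; the radius formula is not stated in the text but follows from setting the bracket in (46) to zero, and it reduces to (43) for \(d=1\).

```python
import numpy as np

def barenblatt(x, t, m, s0):
    """Evaluate (46) at points x of shape (..., d); d is read from x."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    d = x.shape[-1]
    k = 1.0 / (m - 1.0 + 2.0 / d)
    r2 = np.sum(x**2, axis=-1)
    core = s0 - k * (m - 1.0) / (2.0 * d * m) * r2 / (t + 1.0) ** (2.0 * k / d)
    return (t + 1.0) ** (-k) * np.maximum(core, 0.0) ** (1.0 / (m - 1.0))

def support_radius(t, m, s0, d):
    """Radius where the bracket in (46) vanishes (derived, not from the text)."""
    k = 1.0 / (m - 1.0 + 2.0 / d)
    return np.sqrt(2.0 * d * m * s0 / (k * (m - 1.0))) * (t + 1.0) ** (k / d)

m, s0 = 3, 1.0
for d in (2, 3):
    print(d, support_radius(0.0, m, s0, d), support_radius(0.2, m, s0, d))

# Density on a small 2D grid of points in [-6, 6]^2 at the final time
g = np.linspace(-6.0, 6.0, 5)
pts = np.stack(np.meshgrid(g, g, indexing="ij"), axis=-1)   # shape (5, 5, 2)
print(barenblatt(pts, 0.2, m, s0).shape)
```

Because the dimension is inferred from the last axis of the input, the same function covers the 1D, 2D, and 3D tests of this section.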

Table 2 Convergence results for 2D Barenblatt initial data

Fig. 3 Numerical results at final time \(t=0.2\) for 2D Barenblatt initial data with \(m=3\), \(\Delta t = 0.025\), and \(N\approx 1024\) elements

Fig. 4 Numerical results at final time \(t=0.2\) using the log-density method for 3D Barenblatt initial data with \(m=3\), \(\Delta t = 0.025\), and \(N\approx 4100\) elements

In Fig. 3, we plot the density and error profiles obtained using the two schemes at final time \(t=0.2\) for the data (46) with \(m=3\), \(\Delta t = 0.025\), and \(N\approx 1024\) elements. Results on both triangular and quadrilateral meshes are shown. Panels (A)-(D) display the profiles computed using the log-density method, while (E)-(F) display the profiles computed using the mixed method.

We additionally evolve the three-dimensional version of (46) on tetrahedral and hexahedral meshes of the domain \(\Omega _h = [-6,6]^3\) using the log-density method until final time \(t=0.2\) with \(\Delta t=0.025\) and \(N \approx 4100\) elements. We set \(m=3\) and \(s_0=1\). The plots of this simulation are shown in Fig. 4. In the region \([-3,3]^3\), the \(L^2\)-error is 0.0196 for hexahedral elements and 0.0674 for tetrahedral elements. We remark that the mixed method can also be used for this case, but only hexahedral or regular tetrahedral meshes may be used due to the lack of a proper mass-lumping quadrature formula for general tetrahedral elements.

4.4 Merging Gaussians

We further investigate the robustness of the two schemes on a popular test case consisting of two initial Gaussian peaks that merge into a single peak under the action of the PME [11, 26, 31]. The initial condition for this test is given by

$$\begin{aligned} \rho ^0(x, y)=e^{-20\left( (x-0.3)^2+(y-0.3)^2\right) }+e^{-20\left( (x+0.3)^2+(y+0.3)^2\right) }, \end{aligned}$$
(47)

where the domain is taken to be \(\Omega = [-1,1]^2\). We set \(m=3\), \(\Delta t = 0.001\), and evolve the initial data (47) using the two schemes on a triangular grid with 3750 elements. We record the solution at \(t=0\), \(t=0.15\), and \(t=0.3\), and plot the results of the simulation in Fig. 5, where one can observe that in both cases the two peaks move towards each other and eventually start merging.
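The initial condition (47) can be sampled with the following Python sketch. The function name merging_gaussians and the sampling grid are illustrative assumptions; the snippet does not reproduce the triangular mesh used in the experiment.

```python
import numpy as np

def merging_gaussians(x, y):
    """Initial density (47): two Gaussian peaks at (0.3, 0.3) and (-0.3, -0.3)."""
    g1 = np.exp(-20.0 * ((x - 0.3) ** 2 + (y - 0.3) ** 2))
    g2 = np.exp(-20.0 * ((x + 0.3) ** 2 + (y + 0.3) ** 2))
    return g1 + g2

# Sample on a uniform grid of the domain [-1, 1]^2 (grid size is arbitrary)
g = np.linspace(-1.0, 1.0, 101)
X, Y = np.meshgrid(g, g, indexing="ij")
rho0 = merging_gaussians(X, Y)
print(rho0.shape, rho0.max())
```

The data are symmetric under the point reflection \((x,y)\mapsto (-x,-y)\), so a computed solution should retain this symmetry up to the asymmetry of the mesh.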

Fig. 5 Merging Gaussians test with \( m =3, \Delta t = 0.001\), and \(N = 3750\) triangular elements

4.5 Complex Support

In the final numerical example, we consider the following initial data [2, 26, 32]

$$\begin{aligned} \rho ^0(x, y)= \left\{ \begin{aligned} 25\left( 0.25^2-\left( \sqrt{x^2+y^2}-0.75\right) ^2\right) ^{\frac{3}{2(m-1)}},&\quad \sqrt{x^2+y^2} \in [0.5,1] \text{ and } (x<0 \text{ or } y<0), \\ 25\left( 0.25^2-x^2-(y-0.75)^2\right) ^{\frac{3}{2(m-1)}},&\quad x^2+(y-0.75)^2 \le 0.25^2 \text{ and } x \ge 0, \\ 25\left( 0.25^2-(x-0.75)^2-y^2\right) ^{\frac{3}{2(m-1)}},&\quad (x-0.75)^2+y^2 \le 0.25^2 \text{ and } y \ge 0, \\ 0,&\quad \text{ otherwise }, \end{aligned}\right. \end{aligned}$$
(48)

on the domain \([-2,2]^2\). The support of the initial data (48) has the shape of a horseshoe or a partial donut. We set \(m=3\), \(\Delta t = 0.001\), and evolve the initial data using the two schemes on a triangular grid consisting of 3750 elements. The solution is recorded at times \(t=0\), \(t=0.5\), and \(t=1.0\), and the computed profiles are presented in Fig. 6. The plots demonstrate that both schemes deliver reliable results, as the horseshoe ends are observed to evolve towards each other before ultimately intersecting. Additionally, in line with previous observations, the boundary of the support is captured more sharply by the mixed method.
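Since (48) is defined piecewise, a direct implementation helps to see how the three pieces fit together. The Python sketch below is one possible transcription; the function name horseshoe and the masking strategy are our own choices.

```python
import numpy as np

def horseshoe(x, y, m=3):
    """Initial density (48): three-quarter ring with two rounded end caps."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    p = 3.0 / (2.0 * (m - 1.0))
    r = np.sqrt(x**2 + y**2)
    # Regions of the three nonzero branches of (48); they do not overlap
    ring = (r >= 0.5) & (r <= 1.0) & ((x < 0) | (y < 0))
    cap_top = (x**2 + (y - 0.75) ** 2 <= 0.25**2) & (x >= 0)
    cap_right = ((x - 0.75) ** 2 + y**2 <= 0.25**2) & (y >= 0)
    # Clip the bases at zero so the fractional power is never applied
    # to a negative number outside the corresponding region
    b_ring = np.maximum(0.25**2 - (r - 0.75) ** 2, 0.0)
    b_top = np.maximum(0.25**2 - x**2 - (y - 0.75) ** 2, 0.0)
    b_right = np.maximum(0.25**2 - (x - 0.75) ** 2 - y**2, 0.0)
    rho = np.zeros(np.broadcast(x, y).shape)
    rho = np.where(ring, 25.0 * b_ring**p, rho)
    rho = np.where(cap_top, 25.0 * b_top**p, rho)
    rho = np.where(cap_right, 25.0 * b_right**p, rho)
    return rho

# Peak on the ring, the gap in the first quadrant, and the center of the ring
print(float(horseshoe(-0.75, 0.0)), float(horseshoe(0.5, 0.5)), float(horseshoe(0.0, 0.0)))
```

The gap of the horseshoe is the part of the annulus in the open first quadrant that is not covered by the two caps; this is where the two ends eventually meet.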

Fig. 6 Complex support test with \( m =3, \Delta t = 0.001\), and \(N= 3750\) triangular elements

We note that, unlike Lagrangian schemes, the two methods handle the topology change robustly, without requiring any interpolation of the solution onto a new mesh when this event occurs.

Remark 4.1

We emphasize that the density range shown in each plot, whether by a color bar or by an axis, is the actual range of the computed density. Both schemes preserve positivity: the minimal value of the density in all cases is 0, owing to the compactly supported nature of the solutions.

Remark 4.2

We conclude this section with a brief note on the computational efficiency of the two schemes. The primary computational cost of both schemes is the non-linear parabolic system solved for the density variable in (22) for the log-density formulation and in (30a) for the mixed method. Due to the choices of the corresponding finite element spaces, the total number of density degrees of freedom is the number of vertices for the log-density formulation, while that for the mixed method is the number of elements in the mesh. Hence, the mixed method is computationally more efficient on the same mesh than the log-density method. However, the log-density method achieves second-order spatial accuracy, while the mixed method is only first-order accurate in space. Moreover, on structured tensor-product grids, we expect the log-density scheme to have a computational performance similar to that of the finite difference scheme by Gu and Shen [20], which also achieves both bound preservation and energy stability due to a very similar setup.

5 Conclusion

We presented two distinct, first-order spacetime accurate, finite element approaches to the PME. The log-density approach is constructed for a problem with Neumann boundary conditions, and the properties of mass conservation, energy stability, unique solvability, and bound preservation on unstructured Delaunay meshes are proved. The scheme is shown to be second order in space and first order in time. The mixed approach is also constructed for a problem with Neumann boundary conditions and is shown to be mass conservative, energy stable, and positivity preserving under a CFL condition. The mixed scheme is shown to be first-order in both space and time. Both schemes can handle compactly supported initial data without the need for any perturbation.