A convexity enforcing C0 interior penalty method for the Monge–Ampère equation on convex polygonal domains

We design and analyze aC0 interior penalty method for the approximation of classical solutions of the Dirichlet boundary value problem of the Monge–Ampère equation on convex polygonal domains. Themethod is based on an enhanced cubic Lagrange finite element that enables the enforcement of the convexity of the approximate solutions. Numerical results that corroborate the a priori and a posteriori error estimates are presented. It is also observed from numerical experiments that this method can capture certain weak solutions. Mathematics Subject Classification 65N30 · 65K10 · 35G30 · 90C06 · 90C26

The method in this paper is also based on a nonlinear least-squares approach. It is different from the least-squares method of Dean and Glowinski [19,29,31] in that our least-squares problem is posed only on the finite element spaces and the discrete problems are solved purely as optimization problems.
The key ingredient in our method is an enhanced cubic Lagrange element with exotic degrees of freedom (dofs) that enables us to enforce the convexity of the finite element solutions, which then allows us to develop a simple error analysis based on existing results for second order elliptic problems in non-divergence form.
The rest of the paper is organized as follows. We introduce the enhanced cubic Lagrange element in Sect. 2 together with the discrete nonlinear least-squares problem. We then present a priori and a posteriori error analyses in Sect. 3 and numerical results in Sect. 4. We end with some concluding remarks in Sect. 5. We also put some of the details in three appendices so that the main flow of the presentation is not distracted. Appendix A contains the derivation of a stability result for elliptic problems in nondivergence form needed for the error analysis in Sect. 3. Details of the optimization algorithm that we use to solve the discrete problems are given in Appendix B. An algorithm that we use to check the elementwise convexity of the approximate solutions is outlined in Appendix C.
Throughout the paper we will use C to denote a generic positive constant independent of the mesh size.

The discrete problem
The discrete problem is a nonlinear least-squares problem with box constraints. It is based on an exotic finite element space whose degrees of freedom (dofs) can enforce the convexity of the solutions.

An enhanced cubic Lagrange finite element
We begin by introducing a new finite element where some of the degrees of freedom (dofs) are associated with nodes outside the element domain. Consequently the construction of the local basis requires information from outside an element. Below we will treat a polynomial on an element as the restriction of a polynomial on R 2 and use the same notation to denote both. In other words, we will identify P k (R 2 ) with the space P k (T ) of polynomials of (total) degree ≤ k on a triangle T .

Proof
Suppose v vanishes at the 9 vertex and edge nodes, then v belongs to ϕT ⊕ ϕ 2 T P 1 (T ). A direct calculation shows that 0 is the only polynomial in ϕT ⊕ ϕ 2 T P 1 (T ) that vanishes at the center ofT and whose Laplacian also vanishes at the three points (1, 1), (−1, 1) and (1, −1).

Remark 2.2
The space P 3 (T ) ⊕ ϕ 2 T P 1 (T ) and the 13 dofs in Lemma 2.1 do not define a finite element onT in the classical sense of Ciarlet in [23, page 78] because the shape functions are treated as functions defined globally on R 2 , and not as functions defined just on the element domain.

Remark 2.3
The vertices of the reference simplex are the midpoints of the edges of the triangle with vertices (1, 1), (−1, 1) and (1, −1). For a general triangle T , the triangle whose midpoints are the vertices of T will be denoted by T † .
On an arbitrary triangle T , the space of shape functions of the enhanced cubic Lagrange element is where ϕ T is the cubic bubble function that vanishes on the boundary of T . The dofs of v ∈ P 3 (T ) ⊕ ϕ 2 T P 1 (T ) are (i) the values of v at the three vertices, (ii) the values of v at the two points that trisect each edge, (iii) the value of v at the center of T , and (iv) the values of tr(J t T D 2 v J T ) at the three vertices of T † , where J T ∈ R 2×2 is the Jacobian matrix of an affine map that maps the reference simplex to T .

Remark 2.4
The enhanced cubic Lagrange element is affine-equivalent (cf. [17,23]) by construction. The exotic dofs at the vertices of T † are responsible for enforcing the elementwise convexity of the discrete solutions of (1.1).

The finite element space V h
Let T h be a quasi-uniform simplicial triangulation of . A function v belongs to the finite element space V h ⊂ H 1 ( ) if and only if (i) v belongs to C(¯ ) and (ii) the restriction of v to T ∈ T h belongs to P 3 (T ) ⊕ ϕ 2 T P 1 (T ). The (global) dofs of v ∈ V h are (i) the values of v at the vertices of T h , (ii) the values of v at the points that trisect the edges of T h , (iii) the values of v at the centers of the triangles in T h , and (iv) the values of tr(J t T D 2 v T J T ) at the three vertices of T † for each T ∈ T h , where v T is the restriction of v to T and J T is the Jacobian of an affine map that maps the reference simplex to T .

Remark 2.5
The dofs in (iii) and (iv) define the bubble functions in ϕ T ⊕ ϕ 2 T P 1 (T ).
It follows from the extension theorems for Sobolev spaces (cf. [1,Chapter 5]) that the solution u ∈ H 4 ( ) of (1.1) can be extended to a strictly convex function in H 4 (˜ ) where˜ is an open set that contains¯ in its interior. We will denote this extension again by u. We assume that h is sufficiently small so that The nodal interpolant h u ∈ V h is then defined by the condition that u and h u share the same global dofs mentioned above. We will denote the piecewise Hessian operator by D 2 h , the set of the interior edges of T h by E i h , the length of an edge e by |e|, and the jump of the normal derivative of v across an (interior) edge by [[∂v/∂n]].

Lemma 2.6
The following estimates are valid for h u: In view of the estimate for D 2 h (u − h u) L 2 ( ) in (2.2) and the bound for max T ∈T h | h u| W 2,∞ (T ) in (2.5), we immediately arrive at where the positive constant C is independent of h.

Remark 2.7
All the estimates for h u are also valid for h φ. In particular we have

A nonlinear least-squares problem with box constraints
Let φ h be the one dimensional cubic Lagrange interpolant of φ along ∂ and the (convex) subset L h of V h be defined by (2.7)

Remark 2.8
The inequality constraints in the definition of L h are motivated by the observation in Remark 1.4 and they are box constraints for the dofs of V h introduced at the beginning of Sect. 2.2.

Remark 2.9 Note that
The discrete problem is to find where the cost function J h is defined by and v T is the restriction of v to T . Note that the Frobenius norm of J T satisfies (2.10) We will use · h to denote the mesh-dependent norm defined by (2.11)

Remark 2.10
The first two terms in the definition of J h are regularization terms that are crucial for the well-posedness of the discrete problem and for enforcing the elementwise convexity of the discrete solutions. The third term is a penalty term that compensates for the fact that V h ⊂ H 2 ( ). The last term is the least-squares term for (1.1a).
The solvability of (2.8) is justified by the following result.

Some a priori bounds for u h
Since h u belongs to L h , it follows from (1.1a), (2.4)-(2.6) and (2.10) that Consequently we have Let T ∈ T h be arbitrary and ψ T be the P 1 Lagrange interpolant of ψ on T . We have, by a standard inverse estimate (cf. [17,23] which together with (3.5) and the assumption that ψ ∈ H 2 ( ) implies

The elementwise convexity of u h
Let q T be a polynomial defined on T ∈ T h . Recall that q T is the restriction of a polynomial defined on R 2 which is also denoted by q T . We define I T q T to be the restriction of I T † q T on T , where I T † is the P 1 nodal interpolation operator associated with T † . Note that any linear polynomial on T is invariant under I T , and according to (2.7), We have, by (3.3), a standard inverse estimate (cf. [17,23]) and the Bramble-Hilbert lemma, (3.8)

Lemma 3.1 There exists a positive constant α independent of h such that, for h sufficiently small, we have
or equivalently the minimum eigenvalue of D 2 h u h is bounded below by a positive constant independent of h.
if h is sufficiently small. Consequently, in view of (2.10), we also have On the other hand, on each T ∈ T h , we have We conclude from (3.10) and (3.11) that, for h sufficiently small, the minimum eigenvalue of h −2 J t T D 2 (u h ) T J T on the triangle T is bounded below by a positive constant independent of T and h, which implies that the same is true for D 2 h u h because T h is a quasi-uniform triangulation.
Therefore, for h sufficiently small, u h is a strictly convex polynomial on each T ∈ T h . Note that (3.9) is equivalent to

A priori error estimates
We have, by the fundamental theorem of calculus (cf. which is valid for all points in except those on the edges of T h . Here and below we use the colon to denote the Frobenius inner product between matrices. Let A ∈ [L ∞ ( )] 2×2 be defined by the integral on the right-hand side of (3.13). For h sufficiently small, we have, by (1.6), (3.6) and (3.12), where 0 < α ≤ β are constants independent of h. The proof of the following lemma is given in Appendix A.

Lemma 3.2
Under the condition (3.14) we have and the positive constant C † only depends on the shape regularity of T h .
We can now establish an a priori error estimate for u h .

An a posteriori error estimate
Since finding a global minimizer of a nonconvex function is in general NP-hard, an optimization algorithm usually only produces an approximate stationary pointũ h of the cost function J h . Therefore we need more than the a priori error estimate (3.17) to ensure the convergence of the approximate solutions to the solution u of (1.1). In the following discussion, it suffices to assume that u belonging to W 2,∞ ( ) is strictly convex, i.e., (1.5) is satisfied almost everywhere in .
Letũ h ∈ L h be elementwise strictly convex. Then the relation (3.13) is valid with u h replaced byũ h , and we have, by Lemma 3.2 and the arguments in the proof of Theorem 3.3, where the positive constant C depends only on the minimum and maximum eigenvalues of D 2 u and D 2 hũ h over and the shape regularity of T h . Therefore, after verifying the elementwise strict convexity of an approximate solutionũ h for (2.8), we can monitor the convergence ofũ h by evaluating the residual-based error estimator According to (2.11) and (3.25), the estimator η h is reliable for the error measured by the norm · h : In the other direction the obvious relations imply that η h (ũ h ) is also locally efficient. Therefore we can use η h to generate adaptive meshes when the solution of (1.1) is less smooth.

Numerical results
We have tested our method on three examples with known solutions. For each example we solve (2.7)-(2.9) by an active set algorithm (cf. Appendix B) that produces an approximate stationary point of the cost function in (2.9). The elementwise convexity of the approximate solutions are checked numerically by Algorithm C.1 in Appendix C. For Example 4.2 and Example 4.3, where the known solutions do not belong to H 4 ( ), we have also solved (2.7)-(2.9) on adaptive meshes generated by the error estimator in (3.26) and a Dörfler marking strategy [32].
The relative errors of the approximate solutionũ h in various norms are defined by where V h is the set of the vertices of the triangulation T h . All the numerical experiments were carried out on a MacBook Pro laptop computer with a 2.8GHz Quad-Core Intel Core i7 processor and with 16GB 2133 MHz LPDDR3 memory. We use MATLAB (R2018b v.9.5.0) in our computations.

Example 4.1
This example is from [28], where is the unit square (0, 1) 2 , ψ(x) = (1 + |x| 2 )e |x| 2 /2 and φ(x) = e x 2 /2 .  The exact solution is u = e x 2 /2 . The assumptions (1.2)-(1.4) are satisfied. The errors of the approximate solutionsũ h obtained by the optimization algorithm on uniform meshes are reported in Table 1. The order of convergence for e r 2,h is 2, which agrees with the estimate in Theorem 3.3 for the solutions u h of (2.8). The orders of convergence for e r 1,h , e r 0,h and e r ∞,h are higher. The residual η h (ũ h ) and the cost J h (ũ h ) are reported in Table 2, their behaviors agree with the estimates (3.1), (3.4) and (3.5) for the minimizer u h . It is observed from the CPU times in Table 2 that a good approximate solution at h = 2 −2 was computed in 3 seconds.
We have verified that all the approximate solutions are elementwise strictly convex, and the reliability of the error estimator η h can be observed by comparing e r 2,h in Table 1 and η h (ũ h ) in Table 2.
It is observed from the convergence histories of e r 2,h and e r ∞,h in Fig. 3 that the performance of our method is similar for all five polygons. the exact solution is and φ ∈ H 4 ( ) equals u in a neighborhood of ∂ . For this example, the function ψ is piecewise smooth and discontinuous along the circle defined by |x| = 1/2, and the Aleksandrov solution u is a piecewise smooth C 1 function. The computational meshes are generated by a bisection procedure to fit the circle where ψ is discontinuous. The first two meshes and the final mesh are presented in Fig. 4. We have verified that all the approximate solutions are elementwise strictly convex. The relative errors are reported in Table 3. The convergence of e r 2,h is of a reduced order ≈ 0.5, and the orders of convergence are higher for the lower order norms.
The residual η h (ũ h ), the cost J h (ũ h ) and the CPU time are provided in Table 4. It is observed that a satisfactory approximate solution with 1009 dofs was computed in less than 3 seconds. The reliability estimate (3.27) is confirmed by comparing the values of e r 2,h and η h (ũ h ). We also tested the performance of the a posteriori error estimator in (3.26) for this example. The convergence histories of e r 2,h and e r ∞,h on bisection and adaptive meshes are shown in Fig. 5. The advantages of the adaptive meshes can be observed until round-off errors interfere at finer meshes.
The discrete solution on the final bisection mesh and the adaptive mesh with 3385 dofs are displayed in Fig. 6. [64], which is a modification of an example in [48]. The domain is the unit square = (0, 1) 2 ,

Example 4.3 This example is from
the exact solution of this example is  and φ ∈ H 4 ( ) equals u in a neighborhood of ∂ . For this example the function ψ vanishes on the disc defined by |x − (1/2, 1/2)| ≤ 0.2 and the Aleksandrov solution u is a piecewise smooth C 1 function.
We solved this problem by the nonlinear least-squares method on uniform meshes. The approximate solutions are elementwise strictly convex outside the disc where ψ = 0. The relative errors ofũ h are reported in Table 5. The order of convergence for e r 2,h is roughly 0.5 and the orders of convergence for e r 1,h , e r 0,h and e r ∞,h are better than 1. The discrete solution at h = 2 −5 can be found in Fig. 8.
The residual η h (ũ h ), the cost J h (ũ h ) and the CPU time are provided in Table 6. Comparing e r 2,h in Table 5 and η h (ũ h ) in Table 6, one can see that the reliability estimate (3.27) is no longer valid because of the lack of strict convexity for both the discrete solutions and the continuous solution inside the disc where ψ = 0. It can also be seen that a satisfactory approximate solution at h = 2 −3 was computed in less than 8 seconds.   We also tested the performance of the a posteriori residual error estimator in (3.26) for this example. The convergence histories of e r 2,h and e r ∞,h on uniform and adaptive meshes are shown in Fig. 7. The advantage of adaptive meshes is observed. The adaptive mesh with 30187 dofs in Fig. 8 clearly captures the singularities of the exact solution along the circle defined by |x − (1/2, 1/2)| = 0.2.

Concluding remarks
By going beyond the classical definition of a finite element, we are able to construct a C 0 interior penalty method for the Dirichlet boundary value problem of the Monge-Ampère equation, where the elementwise convexity of the approximate solutions can be enforced. This in turn enables us to use existing results for second order elliptic equations in nondivergence form to obtain both a priori and a posteriori error estimates. The a posteriori error estimate is a significant part of our method since it allows us to access the convergence of the approximate solutions generated by optimization algorithms that are not necessarily global minimizers.
The approach in this paper can be extended to smooth domains. We also note that convexity enforcing is useful for the problem of prescribed Gaussian curvature (cf. [38,42]) and the nonlinear least-squares approach can be applied to the Pucci equations (cf. [20,42]). These are some of our ongoing projects.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

The first ingredient is the Miranda-Talenti estimate
that is valid for convex domains (cf. [56,71]). In the case of a polygon, the two sides of (A.1) are actually equal (cf. [43,Sect. 4.3]). The second ingredient is the existence of an operator E h : where C † only depends on the shape regularity of T h (cf. [62,Lemma 3]

It follows from (A.1) and (A.2) that
for all ζ ∈ H 2 ( ) ∩ H 1 0 ( ) and v ∈ V h ∩ H 1 0 ( ). Following the treatment of second order linear elliptic equations in nondivergence form in [22,55,70], we introduce the function where I is the 2 × 2 identity matrix.
Note that A(x) : I is the sum of the eigenvalues of A(x) and A(x) : A(x) is the sum of the squares of the eigenvalues of A(x). Therefore we have, by (3.14), the following upper bound of γ (x): and also the following Cordes condition (cf. [24]): where δ (< 1) is given by (3.16). It follows from (A.3) and (A.5) that

Appendix B. An optimization algorithm
An active set algorithm is implemented to solve the bound constrained optimization where f : R n → R is twice continuously differentiable on the the set B = {x ∈ R n : l ≤ x ≤ u}. Our algorithm is based on the active set approach proposed in [45] for solving nonlinear optimization with bound constraints, which was further developed in [47] for handling more general polyhedral constrained optimization.
Here, we just very briefly explain the structure and convergence results of our algorithm. For more details on the theory of the algorithm, one may refer to references [45,47]. Our active set algorithm consists of two phases: a nonmonotone gradient projection phase and an unconstrained optimization phase, and a set of rules for switching between these two phases for achieving both global and fast local convergence. In particular, a projected cyclic Barzilai-Borwein (PCBB) algorithm is used in the gradient projection phase, where the line search direction at iteration x k is generated by Here, P B (·) is the projection on the feasible region B, α k is the cyclic Barzilai-Borwein stepsize [25] and g k = ∇ f (x k ). Along the search direction d k , an adaptive nonmonotone line search proposed in [26] is used to ensure global convergence. This PCBB algorithm of phase one is not only robust in the sense that it converges to a stationary point under mild assumptions, but also very effective for identifying the optimal active constraints where the components of the solution are on the boundary of B. However, the convergence rate of PCBB is often at best linear. Hence, to accelerate the convergence, a more efficient unconstrained optimization algorithm is used in phase two to optimize the objective function by fixing some components of variable x on the boundary of B, that is Here, A is the active index set given by phase one and b A indicates the partial boundary of B where the components of x with index A are fixed. When one iteration of the phase two algorithm lies out of the feasible region B, the set rules developed in the algorithm determine whether the algorithm will switch to the gradient projection phase or restart the unconstrained optimization phase by projecting the iterate back to B. A limited memory nonlinear conjugate gradient method (L-CG_DESCENT) [46] is used to solve the subspace optimization (B.3) in phase two. L-CG_DESCENT is a very efficient first-order method which has much more rapid convergence than most gradient descent methods, and maintains cheap iterations since only up to first-order information is used. However, when the optimization problem gets very ill-conditioned, which is often the case for a discrete optimization problem resulting from a finite difference method or a finite element method (such as the C 0 interior penalty method studied in this paper), slow convergence is often detected near the solution. Under this situation, in phase two we would switch L-CG_DESCENT to the second-order Newton's method, which is generally more expensive but often quickly leads to more accurate solutions. The convergence theories developed in [45,47] guarantee our active set algorithm converges at least to a stationary point of problem (B.1). Furthermore, the active set algorithm would asymptotically reduce the bound constrained optimization (B.1) to an unconstrained optimization (B.3) even when the problem is degenerate. Hence, fast local convergence would be expected by combining the more rapid convergence algorithms such as L-CG_DESCENT and Newton's method in the phase two optimization.

Appendix C. Elementwise convexity
Since the enhanced cubic Lagrange finite element is affine-equivalent, we can focus on the reference simplex. It is convenient to first consider the convexity of a tensor product polynomial on the unit square (0, 1) × (0, 1), for which we will need some explicit inverse estimates.

C.2 Convexity on the unit square
LetK be the (closed) unit square [0, 1] × [0, 1] with verticesp 1 ,p 2 ,p 3 ,p 4 , and QK be the nodal interpolation operator for the Q 1 finite element onK . For any q ∈ Q m,n and x ∈K , we have QK q − q ∈ Q m,n and therefore Similarly, since det D 2 q − QK (det D 2 q) ∈ Q 2m−2,2n−2 , we have which implies that q is strictly convex onK .

C.3 Convexity on the reference simplex
Let q ∈ P 3 (T ) ⊕ ϕ 2 T P 1 (T ). Then q ∈ Q 5,5 and we begin by computing (cf. (C.2) and (C. If L ,K > 0 and L det,K > 0, then q is strictly convex onK (and therefore also onT ). If this is not the case, then we divideK into four sub-squares and use scaled versions of (C.4) and (C.5) to check the convexity of q on the sub-squares whose intersection withT has a positive area. By repeating this procedure we arrive at the following algorithm for checking the convexity of q ∈ P 3 (T ) ⊕ ϕ 2 T P 1 (T ) onT .
Algorithm C.1 Let R 0 = {K } and L be the maximum refinement level.
1. If R l = ∅ and l ≤ L, compute for K ∈ R l the quantities and where p K ,i (i = 1, 2, 3, 4) are the vertices of K , h l is the width/height of the squares in R l , and Q K is the nodal interpolation operator for the Q 1 element associated with K . Set R nc l = {K ∈ R l : L ,K ≤ 0 or L det,K ≤ 0}.

Remark C.2
In our numerical experiments we were able to verify the elementwise strict convexity by observing that the Algorithm C.1 terminated before the refinement reached level 6.