Numerical quadratic energy minimization bound to convex constraints in thin-film micromagnetics

We analyze the reduced model for thin-film devices in stationary micromagnetics proposed in DeSimone et al. (R Soc Lond Proc Ser A Math Phys Eng Sci 457(2016):2983–2991, 2001). We introduce an appropriate functional analytic framework and prove well-posedness of the model in that setting. The scheme for the numerical approximation of solutions consists of two ingredients: The energy space is discretized in a conforming way using Raviart–Thomas finite elements; the non-linear but convex side constraint is treated with a penalty method. This strategy yields a convergent sequence of approximations as discretization and penalty parameter vanish. The proof generalizes to a large class of minimization problems and is of interest beyond the scope of thin-film micromagnetics.


Introduction and abstract setting
Let ⊆ R n be a domain and H ⊆ L 2 ( ) denote a continuously embedded Hilbert space with norm u 2 L 2 ( ) u 2 H = (u, u) H . Let |||u||| 2 = u , u be a continuous semi-norm on the space H which is induced by the symmetric bilinear form · , · . Let g : H → L 2 ( ) be a continuous mapping that is defined on the whole space H. We assume that g is pointwise convex, i.e., for two functions u, v ∈ H and for any choice of λ ∈ (0, 1), the inequality holds true pointwise almost everywhere. Given some linear and continuous functional : H → R, we define the quadratic energy functional Minimization problem (M): Find a minimizer u * of e(·) subject to the convex side constraint The side constraint (2)  Under these assumptions the direct method of calculus of variations [5] provides existence of a minimizer u * ∈ A. Moreover, if ||| · ||| is a norm, the minimizer u * is unique. The side constraint, however, may be difficult to treat numerically. Suitable optimality conditions-also referred to as KKT equations-may be hard to derive. A black box scheme for the numerical solution of this class of minimization problems is desirable, and we proceed to present a convergent penalty method. We define the function Minimizing the energy (1) over the set A of admissible functions can be interpreted as minimizing the energy functional e 0 (u) := e(u), for u ∈ A ∞, else over the full space H. We observe that this functional is not smooth since it has a sharp energy barrier at the boundaries of A. The idea of the penalty method is to approximate the energy functional e 0 by a regularization. Given ε > 0, we seek a minimizer u ε ∈ H of the penalized energy e ε (u) := e(u) + 1 2ε (g(u)) + 2 L 2 ( ) .
Throughout, we assume coercivity of the penalized energy functional for each fixed ε > 0, which essentially is an additional assumption on the function g: In order to solve the unconstrained penalized minimization problem we need to discretize the energy space H. Let (h n ) n∈N be a positive zero sequence and let (X h n ) n∈N be a sequence of finite dimensional subspaces of H. We assume These conditions state that the sequence of discrete spaces is nested and that for vanishing discretization parameter the discrete subspaces become dense in H. Both of these assumptions are usually satisfied for regular discretizations based on some mesh T h n of the domain , where T h n+1 is a uniform refinement of T h n . Let u 0 h denote a solution to the discretized constrained minimization problem. A solution to the penalized continuous problem is denoted by u ε 0 . Only u ε h , a solution to the discretized and penalized problem, is computed in practice. In Sect. 2, we prove that this numerical approach is justified in the sense that any choice of zero sequences (ε n , h n ) → (0, 0) yields convergence of the computable quantities u n := u ε n h n , i.e., lim n→∞ u n = u * in an appropriate sense. To be precise, we prove convergence with respect to the weak topology of H as well as the topology induced by the semi-norm ||| · |||. Recall that the solution u * is in general not unique. We clarify our notion of convergence: Each subsequence of (u n ) n∈N has a convergent subsequence (u n k ) k∈N whose limit u * := lim k u n k is a minimizer. If u * is uniquely determined, the full sequence converges u n → u * . We apply our numerical scheme to the minimization problem proposed in [7] for the simulation of thin-film devices in micromagnetics. For that model, we define a functional analytic framework and analyze existence and uniqueness of solutions as well as continuous dependence on the given data in Sect. 3. Since the model problem fits into our abstract setting, we may use the penalty method to treat the side constraint. In Sect. 4 we use Raviart-Thomas finite elements to discretize the energy space H. The behavior of our algorithm is studied with numerical experiments.
There is a vast literature on penalty methods. None of the results we found, however, could be applied to the thin-film model in micromagnetics discussed in Sect. 3. For discrete minimization problems with finite dimensional energy space arising, for example, in the context of mathematical finance, penalty methods are well understood and convergence is established, see [21] and the references therein. One may therefore be tempted to first discretize the continuous problem and then apply a penalty method to the obtained discrete minimization problem. There is no obvious mathematical justification, however, that refinement of the discretization does not increase the penalty error introduced.
In the context of infinite dimensional problems, a lot of effort has been put into developing efficient numerical schemes for the solution of quadratic minimization problems bound to side constraints, see e.g., [15,16] and the references therein. Since it is known that the system of equations of the penalized discrete problem becomes ill-conditioned as ε → 0, it is natural to ask for other, more robust methods for specific applications. All of these methods, however, are based on properties of the corresponding KKT-system or on orthogonal projections onto the admissible set. Apparently the non-linearity of the function g and the non-local norm of our energy space are significant enhancements to the problem complexity, and we failed to transfer ideas from the literature to our model problem.
In some cases, as the large-body limit in micromagnetics or contact problems, penalty methods have been applied successfully. Even convergence in certain norms has been proven, see e.g., [2,3,19]. These convergence results, however, are based on KKT-like conditions and require knowledge of the nature of the Lagrange multipliers associated with the inequality constraints. Since the thin-film model problem under consideration resembles the one treated in [2,3], as a first attempt we tried to transfer or modify the proofs in the publications mentioned to our setting. One crucial ingredient in the analysis of the works concerned with the large-body limit is, however, that the energy space is L 2 ( ). In particular its dual space consists of Lebesgue functions only. In our case the dual space includes distributions, which required us to develop a new approach to prove convergence of our method. The new approach applies to a large class of problems, and we are confident that it is of interest beyond the specific application in thin-film micromagnetics. It avoids use of KKT equations; it avoids estimates that require information about the Lagrange multipliers.

An abstract convergence result
Before stating the abstract convergence result, we introduce some notation and collect the necessary assumptions: Let ε, h ≥ 0. We denote by (M) the continuous and constrained minimization problem with solution u * . The discretized and penalized minimization problem is referred to as (M ε h ) with solution u ε h . In case of ε = 0 we have a constrained-and possibly discrete-problem (M 0 h ) with solution u 0 h ; the case h = 0 corresponds to a continuous-and possibly penalized-problem (M ε 0 ) with solution u ε 0 . Our notation identifies (M 0 0 ) = (M) and u 0 0 = u * . We stress that if ||| · ||| is a norm, all problems (M ε h ) allow for a uniquely determined solution u ε h . In this case, the particularly interesting situation is the one where the energy norm ||| · ||| is not equivalent to the norm · H and where H endowed with ||| · ||| is hence not a Hilbert space.
Our aim is to prove convergence of u n := u ε n h n to a minimizer u * in an appropriate sense. We first need to specify the assumptions on the choice of penalty and discretization parameters ε n and h n . For the penalty parameter, we assume that (ε n ) n∈N ⊆ R >0 is an arbitrary zero sequence ε n → 0. Concerning the discretization, we assume that (X h ) h∈I with I ⊆ R >0 is a monotone family of finite dimensional subspaces of H with h>0 X h = H. An arbitrary zero sequence (h n ) n∈N ⊆ I then satisfies the conditions (3). The corresponding discrete admissible sets are denoted by With these notions and assumptions in hand, we show convergence u n → u * : n∈N and (ε n ) n∈N be arbitrary positive zero sequences. A sequence of minimizers u n := u ε n h n of (M ε n h n ) satisfies convergence in the following sense: Any subsequence (u n k ) k∈N contains a convergent subsequence (u n k ) ∈N whose limit is a minimizer u * of the continuous constrained problem (M). Convergence holds with respect to both the weak topology of H and the topology induced by the semi-norm ||| · |||, i.e., u n k u * and |||u n k − u * ||| → 0.
Moreover, the entire sequence of energies converges: e(u n ) → e(u * ) as well as e ε n (u n ) → e(u * ).
If the minimizer u * is uniquely determined, we do not only have convergence of the energy sequences (5), but also convergence of the full sequence, i.e., (4) holds with u n k replaced by u n .
The remainder of the section is concerned with the proof of Theorem 1. It consists of four parts. As a first observation, we state the stability of the penalty method. By this we mean that given some sequence (u n ) n∈N ⊆ H and some zero sequence ε n → 0 such that the energies e ε n (u n ) are uniformly bounded, a weak limit u must satisfy the admissibility condition u ∈ A.
Proposition 2 (Stability of the penalty method) Let ε n → 0 be a non-negative zero and u n u ∞ for some u ∞ ∈ H. Then, lim n→∞ (g(u n )) + 2 L 2 ( ) = 0 and u ∞ ∈ A, i.e., the weak limit u ∞ satisfies the constraint Proof By convexity of the functions · 2 L 2 ( ) , (·) + , and g, the function is convex. Together with continuity we deduce that it is weakly lower semi-continuous. From the weak convergence u n u ∞ we thus obtain From (6) The weak lower semicontinuity of e(·)-together with u n u ∞ -yields the lower bound The combination of (8) and (9) shows that both sequences e(u n ) and e ε n (u n ) are bounded. Moreover, we have e ε n (u n ) = e(u n ) + 1 2ε n (g(u n )) + 2 L 2 ( ) according to the definition of both energies. This produces 2ε n (e ε n (u n ) − e(u n )) = (g(u n )) + 2 L 2 ( ) .
Next, we analyze convergence for fixed h ≥ 0 and ε n → 0.
In addition, the energy sequence converges: If the solution u 0 h of the constrained problem is uniquely determined, we have weak convergence of the full sequence, i.e., (10) holds with u ε n k h replaced by u ε n h . Proof We may assume without loss of generality that ε n < 1. Since u ε n h is a minimizer of the unconstrained problem (M ε n h ) and by definition of e ε n (·) we have Therefore, the sequence of minimizers u ε n h has bounded energy. From assumption (A2) and in particular coercivity of e 1 (·), we obtain boundedness of u ε n h in norm. This means that any subsequence (u ε n k h ) k∈N is bounded and must have a weakly convergent subsequence u ε n k h u * h . It remains to prove that first u * h ∈ A h and that second it is indeed a minimizer of e(·).
The first statement, i.e., u * h ∈ A h , follows immediately from Proposition 2. From and weak lower semicontinuity of e(·) we conclude Since u 0 h ∈ A h is a minimizer of e(·), there must hold e(u * h ) = e(u 0 h ), i.e., u * h ∈ A h is a minimizer of (M 0 h ) as well. Moreover, we obtain convergence in energy e(u In particular, this proves that each subsequence e(u ε n k h ) contains a subsequence e(u ε n k h ) which tends to the independent limit e(u 0 h ). Elementary calculus thus predicts that the entire energy sequence e(u ε n h ) converges to e(u 0 h ). Finally, if u * h is uniquely determined, we have seen that each subsequence of (u ε n h ) n∈N contains a subsequence whose weak limit is the unique u * h . From this one can already conclude lim n→∞ u ε n h = u * h in the weak topology.
In a third step we use similar arguments to prove convergence for fixed ε ≥ 0 as h → 0. Lemma 4 (Convergence as h → 0) Let ε ≥ 0 be fixed and let (h n ) n∈N ⊆ I be an arbitrary zero sequence h n → 0. Then, minimizers u ε h n of (M ε h n ) satisfy weak convergence with respect to the norm topology of H in the following sense: Any subsequence (u ε h n k ) k∈N contains a weakly convergent subsequence (u ε h n k ) ∈N whose limit is a Moreover, the entire energy sequence converges: If the solution u ε 0 of the continuous problem is uniquely determined, we have weak convergence of the full sequence, i.e., (13) Proof We assume without loss of generality h n ≤ 1. Then, from monotonicity of spaces we have X 1 ⊆ X h n and hence Put differently, the sequence (u ε h n ) n∈N has bounded energy. From coercivity (A2) we deduce boundedness of the sequence in H. In particular, each subsequence (u ε h n k ) k∈N is also bounded and has therefore a weakly convergent subsequence It remains to prove that u ε * is a minimizer. Let u ε 0 denote a minimizer of the continuous problem (M ε 0 ). From h>0 X h = H, we know that for each δ > 0 there is an integer L ∈ N such that for all ≥ L there exists some u ε Together with continuity of e ε (·), we know that for arbitrary η > 0 there exists some index L ∈ N such that for all ≥ L we may thus choose u ε Recall that u ε h n k is a minimizer and therefore e ε (u ε ). Since this holds for all η > 0, and together with (weak lower semi-)continuity of e ε (·), we conclude This means that u ε * is in fact a minimizer. Moreover, the preceding estimates yield lim inf ∈N e ε (u ε h n k ) = e ε (u ε 0 ). By extracting an additional subsequence, we may thus generate lim ∈N e ε (u ε h n k ) = e ε (u ε 0 ). Arguing as in Lemma 3, we may conclude convergence of the energy sequence as well as weak convergence of the full sequence provided that the limit is unique.
So far, we have proven if all occuring minimizers u ε h are unique (the limits are taken with respect to the weak topology). We next prove the statement of Theorem 1. We stress that no assumptions on the regularity of the analytical solution are necessary; no knowledge about existence or estimates for Lagrange multipliers are used.
Proof of Theorem 1 Let u ε n 0 denote a minimizer of the continuous and penalized problem (M ε n 0 ) and let u 0 h n denote a minimizer of the discrete constrained problem (M 0 h n ). We observe Therefore the sequence u n := u ε n h n is bounded and every subsequence (u n k ) k∈N has a weakly convergent subsequence u n k u * . From Lemma 4 we obtain the existence of a subsequence, which we write for simplicity as (u ε s h s ) s∈N , such that the associated sequence (u 0 h s ) s∈N converges weakly to a minimizer u * and e(u 0 From this it follows, again by weak lower semicontinuity of e(·), that e( u * ) ≤ e(u * ).
, we see that the sequence u ε s h s satisfies the assumptions of Proposition 2, and therefore u * ∈ A. Finally we obtain that u * is a minimizer, i.e., the desired convergence result with respect to the weak topology. Moreover, proves convergence of the energy e(u ε s h s ) → e(u * ). The same argument with e(·) replaced by e ε s (·) yields e ε s (u ε s h s ) → e(u * ). Arguing as above, we may even derive energy convergence e(u n ) → e(u * ) as well as e ε n (u n ) → e(u * ).
The strong convergence in the energy semi-norm |||·||| is obtained by a bootstrapping argument: Recall the definition of the energy In other words, the energy semi-norm satisfies From the construction above, we have u s := u ε s h s u * and e(u s ) → e( u * ). Recall that (u) is linear and continuous with respect to the H-norm so that that (u s ) → ( u * ).
Together we see Since ||| · ||| stems from a continuous semi-scalar product we immediately conclude which means convergence in the energy semi-norm.

Thin-film micromagnetics
For many applications in stationary micromagnetics, the model due to Landau and Lifshits [18] is nowadays accepted as relevant. From a computational point of view it is highly challenging. Different length scales involved make large samples of several μm in diameter hardly accessible for direct calculations. Therefore, various reduced models have been proposed and analyzed that cover certain asymptotic regimes, see [9] for an overview.
Here, we discuss the reduced model proposed in [7]. It covers the regime of very thin but relatively large ferromagnetic samples. In [8] the authors obtain the reduced model for vanishing thickness of the sample-under certain assumptions-as -limit of the full problem due to Landau and Lifshits. Micromagnetic thin-film problem (TF): Let ⊆ R 2 be a bounded Lipschitz domain that represents the ferromagnetic sample, whose thickness is neglected by the model. With an in-plane applied exterior field f : → R 2 , we seek a magnetization m * : → R 2 that satisfies the convex pointwise constraint g(m) = |m| − 1 ≤ 0 almost everywhere and minimizes the reduced energy The magnetic potential p : R 3 → R is determined by the reduced magnetostatic Maxwell equation stated here in a distributional sense. The material dependent parameter q > 0 measures the strength of the uniaxial crystalline anisotropy. For soft materials such as permalloy, where q 1, one usually drops this energy contribution by setting q = 0.

The magnetostatic Maxwell equation
The magnetic potential is the solution of the variational formulation (18). Understanding the potential p is crucial to define the appropriate function space for the magnetization m. We denote by n the outer normal in R 2 of and by [·] the jump across . For smooth m ∈ C 1 ( ), every weak solution p of (18) which is sufficiently We give a brief definition of some Sobolev spaces needed in the following. The reader is referred to, e.g., [1] for a detailed discussion. The Sobolev space H 1 (G) for some Lipschitz domain G is defined in the usual way by We then define the fractional order Sobolev space as the space of traces of H 1 functions restricted to . This space may be equipped with the norm which yields a Hilbert space. Finally its dual space with respect to the extended L 2 scalar product is denoted by H −1/2 ( ).
Recall the definition of the simple-layer potential associated with the Laplace-operator in 3D. The operator S can be extended continu- can be extended continuously to V ∈ L( H −1/2 ( ); H 1/2 ( )), see [25]. Moreover, the operator V satisfies the ellipticity estimate with some constant C e > 0 that depends only on . Altogether, we have that defines an equivalent scalar product on H −1/2 ( ).
Proposition 5 (Jump conditions of the simple-layer potential). Given some ϕ ∈ H −1/2 ( ), the simple-layer potential Sϕ satisfies Remark The proof of Proposition 5 can be found, e.g., in [24, Thm. 3.3.1] for the simple-layer potential defined on a closed surface in R 3 . Whereas the proof of the continuity follows literally, the proof of the jump relation for the normal derivative needs some modifications for the present case of a screen. We use similar ideas as in [25]: Choose some bounded Lipschitz domain G − such that ⊆ ∂G − and such that the normal vector n of G − satisfies n| = (0, 0, 1). Given some ϕ ∈ H −1/2 ( ), the second Green's formula for u := Sϕ in the interior and exterior domains G − and G + := R 3 \ G − and the same arguments as in the proof of [24, Thm. 3.3.1] yield the statement.
Remark First, we have shown that the regularity assumption ∇ · m ∈ H −1/2 ( ), ensures existence of p. Second, we want to comment on the constraint m · n = 0 on that arises for smooth solutions. Suppose m · n = 0 on , then a solution to the reduced Maxwell equation (18) does not in general satisfy p ∈ H 1 oc (R 3 ). Assume, e.g., m = (1, 0) T constant. Then, m · n = 0 ∈ L 2 ( ) and ∇ · m = 0 ∈ L 2 ( ). Choose some smooth and bounded domain G ⊆ R 3 with ⊆ G. Integration by parts in the Maxwell equation (18) On the right-hand side of (22), the integral over vanishes due to the choice of m. The functional v → (m ·n)v ds cannot be extended to a linear functional on H 1 (G) since the restriction to is not well-defined. We conclude that v → (∇ p, ∇v) does not define a continuous functional in H 1 (G). This means ∇ p ∈ L 2 (G) and, therefore, p ∈ H 1 (G).

Energy space
In this case, and according to the fundamental theorem of calculus of variations, the weak divergence of m is unique, and we simply write ∇ · m := v. We define the space . We have to do one last step to define the appropriate space for the magnetization. Namely, we have to permit ∇ · m ∈ H −1/2 ( ) ⊇ L 2 ( ).
We define the energy space for the magnetization where By construction H is a Hilbert space and D( ) 2 is a dense subspace. Finally, we make a last remark on H before analyzing the existence of minimizers m * in our function setting. Given m ∈ H, we may represent p as the simple-layer potential of ∇ · m ∈ H −1/2 ( ). From the mapping properties of the simple-layer potential, we immedi- is not a normed space. Also, the fact ∇ p ∈ L 2 (R 3 ) 3 implies further regularity and is necessary for the energy e(m) defined in (17) to be finite. We now establish an appropriate Hilbert space for the magnetostatic potential p.
We define the set by factoring out the constant functions. Note that · B 2 1 (R 3 ) now in fact is a norm.
Lemma 10 There is a continuous linear lifting operator L : , which finally proves the full linear and continuous extension.

Well-posedness of the micromagnetic thin-film problem
We proceed to analyze well-posedness of the problem (T F) in our functional analytic setting. As a first observation, in Proposition 12, we prove that for any given m ∈ H there exists a uniquely determined magnetostatic potential p ∈ B 2 1 (R 3 ). Moreover, we will see that R 3 |∇ p| 2 dx = ∇ ·m 2 V . This reveals that we can restate our energy functional from (17) The outline of the remaining section now reads as follows: In Lemma 13, we prove that the energy e(m) and the penalized energy i.e., m * depends Hölder continuously on the applied field f.

We recall the variational formulation
of the magnetostatic Maxwell equation. As discussed above, m ∈ H satisfies all constraints and the necessary regularity. Therefore, (33) may be stated as in our functional setting.

Proposition 12
The following statements are true:  (v) In particular, there holds Pm 2 We consider the cylindrical domain This proves that F m defines a linear and continuous functional F m : , the functional F m may be extended continuously to the entire Beppo-Levi space while preserving the operator norm. Since the left-hand side of (34) is the scalar product of B 2 1 (R 3 ), the variational formulation may be extended to the full space B 2 1 (R 3 ). This proves (ii). The Riesz representation theorem provides existence and uniqueness of a solution p ∈ B 2 1 (R 3 ), since (34) may be written as which yields statement (i). The Riesz theorem furthermore implies In particular the mapping P : H → L 2 (R 3 ) 3 is well defined and continuous. Linearity follows from the composition P : m → ∇ · m → p → −∇ p, which finally proves (iii). Now, we prove the converse estimate ∇ · m H −1/2 (ω) F m . Lemma 10 and (34) imply Taking the supremum over all v ∈ H 1/2 ( ) \ {0} we obtain For P m = −∇ p, the representation p = S(−∇ · m) and the variational equality (34) for Pm = −∇ p imply The choice m = m yields (Pm, Pm) L 2 (R 3 ) 3 = ∇·m 2 V , which allows us to conclude the proof.
Note that as a direct consequence of Proposition 12, we may indeed rewrite our energy functional from (17)- (18) in the form (29)-(30). The semi-norm ||| · ||| · H of (30) is obviously continuous on H. Since H ⊆ L 2 ( ) 2 is by construction a continuously embedded Hilbert space, the thin-film problem fits into the general setting of Sect. 1. It only remains to prove the assumptions (A1) and (A2), which guarantees the existence of minimizers and allows us to apply our convergent numerical scheme of Sect. 2. From the boundedness of the L 2 -norm of m n , we get lim n→∞ ∇ ·m n H −1/2 ( ) = ∞. From the boundedness of the scalar product (f, m n ) L 2 ( ) and the equivalence of norms The proof of assumption (A2) is a little bit more involved as also the L 2 -norm of a sequence (m n ) n∈N with lim sup n m n H = ∞ may be possibly unbounded. From equivalence of norms ∇ · m V ∇ · m H −1/2 (ω) , we get the existence of a constant C 1 > 0 such that Let ≥ denote the set where |m| ≥ 1, and < its complement, i.e., |m(x)| < 1 for x ∈ < . We use Hölder's inequality to estimate the linear contribution Next, for the penalty energy contribution it holds that Applying these inequalities we obtain Defining the constants we conclude the proof with the following observations: Let (m n ) ⊆ H be a sequence of magnetizations with lim sup n m n H = ∞. Then lim sup n ∇ · m n H −1/2 ( ) = ∞ or lim sup n m n L 2 ( ) = ∞. Hence, at least one contribution, either the · V -norm of the divergence or the L 2 -norm of m, will cause the energy to be unbounded.
So far, we have seen that the thin-film minimization problem fits into the general setting of Sects. 1 and 2. The existence of solutions m * follows for any q ≥ 0 by the direct method of calculus of variations. The following lemma gives additional information on the uniqueness of the solutions: If q > 0, then the energy semi-norm ||| · ||| is in fact positive definite and thus a norm. Uniqueness of the minimizer m * follows in this case.
So far, we have proved (i) and (ii) of Theorem 11. As a last issue in this section, we show continuous dependence of the solutions on the given data. To that end we need the corresponding variational inequality. The following standard result can be found, e.g., in [17] and is formulated here in our setting for the convenience of the reader.

Lemma 15
Let H be a Hilbert space with continuous semi-scalar product · , · . Furthermore, let ∈ L(H ; R) and let A ⊆ H denote a closed and convex subset. Given the energy functional an element u * ∈ A is a minimizer, i.e., if and only if u * satisfies the variational inequality

Lemma 16
If m 1 and m 2 are solutions to (T F) for applied fields f 1 and f 2 , respectively, then Proof To obtain the estimate (44), note that solving the minimization problem (M) is equivalent to solving the variational inequality see Lemma 15. Since m 1 solves the variational inequality for f 1 and m 2 solves the variational inequality for f 2 , we have The observation |m 1 (x) − m 2 (x)| ≤ 2 and taking the square root concludes the proof.

Discretization and experimental analysis
The last step to apply our numerical scheme is to provide a conforming discretization of the energy space H. Note that H 1 0 (∇·; ) ⊆ H dense. The Raviart-Thomas finite elements introduced in [23] to discretize H 1 0 (∇·; ) are thus a natural choice for the discretization of such an energy space, cf. [13].

The space of Raviart-Thomas finite elements
Let T h be a regular triangulation of the domain in the sense of Ciarlet, i.e.,: • Each element T j ∈ T h is a non-degenerate and closed triangle, • T h covers , i.e., = T ∈T h T , • The intersection T i ∩T j , for i = j, is either empty, a common vertex, or a common edge.
The global mesh size h is defined by h = max T ∈T h diam(T ). Moreover, the set of all edges of a triangulation is denoted by E h and E h is the set of all interior edges. We define the space of lowest order Raviart-Thomas finite elements by where P 1 (T h ) denotes the space of piecewise linear and discontinuous functions, n E denotes a normal vector on the edge E, and [·] E denotes the jump across an edge of the triangulation. We stress that the crucial property [m h · n E ] E = 0 ensures the H 1 (∇·; ) conformity of the discrete space RT 0 (T h ). Since H 1 0 (∇·; ) ⊆ H, the set is a conforming discretization of our admissible set A. Next, we describe the standard basis of RT 0 (T h ). Each interior edge E ∈ E h belongs to precisely two elements T + and T − . For such an edge E, let P + and P − be the vertices of T + and T − opposite E; in other words, T ± = conv(E ∪ {P ± }) as shown in Fig. 1. For each E ∈ E h , we define and notice that the jump [ψ E ·n] across any edge vanishes. This implies ψ E ∈ RT 0 (T h ). Moreover, it can be shown that the set is a basis of RT 0 (T h ), cf. [13]. Remark The variational inequality (43) for both, the continuous and the discrete constrained minimization problem, provides a tool to derive a priori error estimates. Indeed, (43) applied on the continuous and the discrete level gives us for arbitrary Applying the trivial inequality 2ab ≤ a 2 + b 2 to the term m * − m 0 h , m * − w h , we see Since w h ∈ A h was arbitrary the last inequality still holds when taking the infimum over all w h ∈ A h on the right-hand side. Choosing w h = h m * with a suitable interpolation operator h : H 1 ( ) → RT 0 (T h ), one can prove an a priori rate of provided that m * ∈ H 1 ( ) 2 with ∇ · m ∈ H 1 ( ). Note, however, that norm convergence of the constrained problem does not imply norm convergence of the penalized system.

A simple damped Newton algorithm
The penalized energy functional e ε (m) is Fréchet differentiable with derivative [11] for the detailed derivation of the last term). Since e ε (m) is convex, we have that finding a global minimizer of (M ε h ) is equivalent to solving the variational problem We apply Newton's algorithm to solve this equation. Note that this derivative of the penalized energy functional is not continuously differentiable due to the penalty energy. Hence no classical result on convergence of Newton's algorithm applies. Let T h denote some regular triangulation with #E h =: N D interior edges. Given some coefficient vector x ∈ R N D the discrete Euler-Lagrange equation reads Hence we seek the zeros of the discrete function F : The notation m ε h (x) indicates that the discrete magnetization m ε h = m ε h (x) depends on the given coefficient vector A crucial step is the computation of the derivative of the function F. In Eq. (46), the derivative of the first two scalar products can be computed easily. The fourth contribution vanishes, since it is just a constant. The third term, however, is more involved. The derivative of (|m|−1) + |m| is not defined classically at the points where |m ε h (x)| = 1. However, since the scalar product is computed by use of numerical quadrature, this exceptional situation is not expected to be encountered numerically at any quadrature point. In our implementation, points where |m(x)| = 1 are treated in the same way as points where |m(x)| < 1. The following formula for the Jacobian of the non-linear contribution is obtained by straight forward calculations.
We define the vectors b j := (f, ψ j ) L 2 and a j (x) := Finally the evaluation of the function F(x) and the derivative matrix DF(x) read Note that the matrix V is dense. We apply matrix compression techniques well-known from boundary element analysis to store and work with the matrix in a data-sparse manner. Among the many strategies available, we use here hierarchical matrices [14], which have log-linear complexity for storing V and realizing the matrix-vector multiplication z → Vz. Our implementation utilizes the HLib library (http://www.hlib. org) for the computation of the matrix V.

Numerical experiments
We study the behavior of our algorithm with a simple set of experiments. We choose the sample to be the unit square = (−0.5, 0.5) 2 . The applied field is constant-a standard assumption in thin-film micromagnetics [7]. Since for this non-linear problem no analytical solution is available, we estimate the error by comparing with a reference solution m re f that is computed on a relatively fine grid. Throughout, the anisotropy parameter is q = 1.

Convergence in h
In this first experiment we are interested in the convergence as h → 0 for fixed ε. To be precise, we choose ε = 10 −2 and we compute the solutions m ε h on a sequence of uniformly refined meshes of . The applied field is f = (1, −0.5) T . The initial mesh with #T h = 16 triangles is shown in Fig. 2. The reference solution m re f was computed on a mesh T h with 262, 144 triangles, which corresponds to 392, 704 degrees of freedom for the discrete space RT 0 (T h ). Figure 3 shows the length of the discrete magnetization and the region of where the penalty term is active. Figure 4 shows the estimated error in the energy norm |||m ε h −m re f ||| ≈ |||m ε h −m ε 0 ||| as well as the full space norm m ε h − m re f H . We also plot the contributions of the

Convergence in ε
In this second experiment we study the convergence of m ε h to m 0 h as ε → 0. We compute discrete and penalized solutions to the same problem as in the first experiment in Sect. 4.3.1. But this time we compute the solutions m ε h for varying penalty parameter on a fixed mesh with #T h = 4,096 triangles. This corresponds to 6,080 degrees of freedom of the discrete space RT 0 (T h ) and a mesh size of h = 0.0312. The reference solution m re f was computed with a value of ε = 3.05 · 10 −5 . Figure 6 shows the estimated error in the energy norm |||m ε h −m re f ||| ≈ |||m ε h −m 0 h ||| as well as the full space norm m ε h − m re f H . We also plot the contributions of the L 2 -norms m ε h − m re f L 2 ( ) and m ε h,2 − m re f,2 L 2 ( ) as well as the error of the    We observe two major differences to the error with respect to h-refinements. First, we observe that the error in the energy norm decays with order 1/2, i.e., |||m ε . This is similar to the error decay with respect to h. In contrast, however, the error in the energy also only decays at the same order 1/2 with respect to ε. This is a bit surprising since the energy is a quadratic quantity. Apparently, the error from the approximation by the penalty method is reflected in the energy mainly through the linear L 2 -contribution (f, m) L 2 . The second difference is that the error in the L 2 -norm and the error of the divergence in the V -norm are of the same order.

Choice of ε = h α
It is an open question how to balance the parameters ε and h in numerical calculations. So far, we only looked at the error contributions independently. Putting everything together might yield a different picture. To empirically understand the optimal choice of ε = h α , we perform a series of calculations for varying α ∈ {0.5, 0.8, 1.0, 1.2, 1.5}. We use the same geometry and parameters as in the first experiment in Sect. 4.3.1. The reference solution m re f was computed on a mesh with #T h = 65,536 triangles which corresponds to 98,048 degrees of freedom. The penalty parameter of the reference solution is ε = 6.52 · 10 −4 . Figure 8 (left) shows the error in the energy norm for the sequences of solutions m ε h with ε = h α plotted versus the number of triangles #T h of the corresponding mesh.
We clearly see that α = 0.5 leads to a reduced order of convergence compared to the other choices. The plot is not clearly visible beyond α = 0.8. This is because the discretization error dominates the total error. In order to obtain more conclusive data, in a second experiment, we applied a larger field f = (2, 2) T . This choice emphasizes certain effects. The error contributions from the discretization and the penalty scheme are well balanced for our purposes. Figure 8 (right) shows the results of this second simulation run. We observe that the choice of α = 1 is optimal. Choice of α < 1 leads to a reduced order of convergence; the choice of α > 1 leads to the same order of convergence as α = 1. We note that the energy norm error, in both simulation, is approximately O(#T  ) is of higher order than √ h. One possible explanation is that, as the mesh size decreases, the active set is better resolved, see also the more detailed experimental analysis in [11]. We are not sure, however, whether this accelerated convergence effect is only a pre-asymptotic phenomenon due to an inadequate resolution of the active set at the beginning of the calculations.

Conclusions
In the first part of the paper we presented a general convergence result for penalty methods. The proof applies to a large class of quadratic minimization problems. We stress that it does not involve any optimality conditions, and the assumptions on the inequality constraints are very weak.
In the second part of the paper we analyzed the thin-film model in micromagnetics from [7]. In contrast to the original works [8,9], our perspective is that of the numerical analyst. We construct an appropriate Hilbert space and prove well-posedness of the problem in this setting. We note that the uniqueness of the solution in the case q = 0 is not stated in the prior works [7][8][9][10]. The present work generalizes and extends the results of [11,12].
Parts of the dissertation [10] are also concerned with the numerical simulation of the same thin-film model problem. There, the author uses an interior point method to compute admissible approximations to the magnetization. While admissibility is mandatory in many applications, we feel that a good qualitative and even quantitative understanding of the present thin-film model can be obtained with numerical methods that relax the admissibility constraint. Additionally, we were able to give a mathematical convergence result for the algorithms presented here.
The numerical experiments show that there is still some work to do. The observation that uniform meshes lead to a sub-linear order of convergence is consistent with the observations in [10]. If the applied field is sufficiently weak so that the constraint |m| ≤ 1 is not active, the minimization problem reduces to a certain linear integral equation. From the literature on boundary element methods it is then clear that the divergence ∇ · m has generic singularities along the boundary of . A heuristic adaptive algorithm developed in [11] shows that the rate of convergence can be improved.
There are some open questions in this context, however, since adaptive mesh refinements towards the boundary of is suboptimal for the approximation of m in the L 2 -norm. Also a posteriori error estimators that measure the error with respect to h and ε are not justified rigorously.
Despite these open questions, the numerical experiments presented here did support our analytical results-convergence in the energy norm holds for arbitrary choices of (h, ε) → (0, 0). The system matrix V of (47) is the bottle-neck in the calculations. From that point of view, our simple Newton method is quite successful. The size of the computations is mainly limited by memory consumption and the building time of V, i.e., by effects emerging from the nature of the problem.