1 Introduction

The possibility of using composite meshes in finite element (FE) simulations of industrial problems is a recurrent topic [8, 12, 28, 29, 33]. Composite meshes are involved as soon as the global discretization of a partial differential equation combines discretizations on local (overlapping or non-overlapping) subdomains, each suitably triangulated by non-matching grids. The reasons for using composite meshes are various: fitting the geometry or the local smoothness of the solution, resolving multiple scales in regions with irregular data, using fast solvers on structured grids, or applying a divide-and-conquer/domain decomposition approach to very large problems on parallel machines.

In the present case, we are looking for a simple and practical approach to introduce, in certain parts of the computational domain, FE functions that are not only continuous but also have continuous derivatives of first, second or higher order. In general it is very difficult to introduce FE spaces with such properties over simplicial unstructured meshes. On the other hand, if we work with Cartesian meshes this becomes very simple: it is sufficient to use tensor products of spline spaces with sufficiently high regularity. So, as it is naive to expect that technical devices can be entirely triangulated with Cartesian meshes, we introduce composite meshes involving Cartesian meshes in those subdomains where we want highly regular FE representations and triangular unstructured meshes in those subdomains where we want conformity with the geometry.

The industrial application we consider concerns the free-boundary plasma equilibrium in tokamaks for nuclear fusion [3], which derives from the magnetohydrodynamic (MHD) equilibrium, described by the force balance and Maxwell’s equations in the eddy-current approximation. Nuclear fusion is a highly exothermic reaction in which two light atomic nuclei fuse to form a heavier nucleus. The peaceful use of such reactions for energy production on earth is a multinational research effort with high impact on the long-term perspective of energy production and consumption. The most promising technology to achieve this goal is currently the tokamak, a torus-shaped reactor that uses strong magnetic fields to confine plasma and to achieve the extreme conditions needed to start the fusion reaction. The upcoming ITER (International Thermonuclear Experimental Reactor) tokamak, jointly built by China, Europe, India, Russia, South Korea and the USA, will be the largest magnetic confinement experiment. It aims at demonstrating the principle of producing more energy from the fusion process than is used to initiate it, something that has not yet been achieved in any fusion reactor. Achieving this dream of creating a small sun on earth that provides an almost endless amount of clean and sustainable energy is a main motivation of this joint research work. The various open problems require interdisciplinary research and are a steady source of interesting and challenging questions, which also require high expertise in the field of applied and computational mathematics.

By symmetry considerations the free-boundary plasma equilibrium problem can be reduced to a scalar semi-linear elliptic one for the flux of the poloidal magnetic field (see Sect. 2.1 for details and references or [3, Sect. 1.2]). As the magnetic field and the current density are tangential to the level sets of the poloidal flux, the precise calculation of the level set distribution for the poloidal flux is fundamental in tokamak science. Hence, it is important to have good approximations not only of the poloidal magnetic flux but also of its derivatives.

In Fig. 1 we show a sketch of the cross section of a tokamak. It contains geometrical details such as coils, passive structures and the iron core, which need to be accurately resolved by a triangulation. A peculiarity of the free-boundary plasma equilibrium problem is the unknown plasma domain, which is implicitly given as the domain bounded by the largest contour line of the poloidal flux that is not intersected by the limiter (see Fig. 2). Depending on the combination of coil currents, the plasma boundary, a contour line of the poloidal flux, either touches the limiter (the so-called limiter configuration) or is fully detached from the limiter. The latter is called divertor configuration or X-point configuration, as the plasma boundary contains a saddle point of the poloidal flux. In the very early tokamak devices the plasma was always attached to the limiter, while later the focus shifted towards devices that create free-standing plasmas that are completely detached from material that would otherwise have to resist extremely high temperatures. This allows plasmas to be maintained at much higher temperatures. But even though a direct contact of plasma and material is avoided, there are nevertheless high heat loads on some parts of the material inside the tokamak. Particles that escape from the plasma hit the limiter at places where the limiter intersects with the contour lines of the poloidal flux. But since the toroidal component of the particle velocity is very large, the impact place is in the vicinity of the X-point. This may damage the device since the particles deposit their significant energy on a relatively small area, leading to unacceptably high heat flux densities. To mitigate the heat flux impact several strategies have been proposed. One of these strategies aims at the exploration of the so-called snowflake configuration [30, 31]. A snowflake configuration (see Fig. 2 right) is obtained when there is a point where not only the gradient of the poloidal flux vanishes but also its second order derivatives. This second condition implies that the contour line through this point has more than four different branches. As a result, the heat flux load is distributed over a larger impact area.

Figure 1

Geometric details of a tokamak in the poloidal plane. Coils, passive structures, iron core and limiter

Figure 2

Sketch of characteristic plasma shapes. Contour lines (black) of the poloidal flux and plasma domain (red): The plasma boundary either (left) touches the limiter (gray) or the plasma is enclosed by a flux contour line (middle and right) that passes through a saddle point. This can be a non-degenerate saddle point (middle) in the X-point configuration or a degenerate saddle point (right) in the snowflake configuration

In [21] we introduced an FE method on composite meshes (see Fig. 3) for free-boundary plasma equilibrium problems and showed that the numerical calculation of such equilibria benefits from approximating the poloidal flux through FE functions of higher regularity in the interior of the limiter. In the present paper we extend our previous work to determine coil current distributions that create snowflake configurations. We formulate this task as an optimal control problem that is discretized with FE functions of higher regularity on the composite meshes. While FE methods on composite meshes are widely used in practice, their theoretical foundation is fairly limited in the literature. Therefore, we also report here extensive experimental convergence results that provide reassuring foundations for this application.

Figure 3

Composite meshes. Top: detail view for the WEST tokamak, with the iron core (green), the passive structures (red), the various coils (light blue) and the domain bounded by the limiter (white). Bottom: the composite meshes for the WEST tokamak

The outline is as follows. Sections 2 and 3 deal with the free-boundary plasma equilibrium problem and the numerical methods, respectively. We present the model (Sect. 2.1), the free-boundary equilibrium problem, and its role for the operation of tokamaks (Sect. 2.2). In Sect. 3.1 we recall the classical mortar element method (MEM) for overlapping meshes and introduce a modified method (MEM-M) that simplifies the implementation by avoiding integrals over cut elements. Next, in Sect. 3.2 we present the MEM-M Galerkin formulation for the plasma equilibrium problem. The related optimal control problem for tuning plasma equilibria appears in Sect. 3.3. We explain how an existing implementation of Newton’s method for the free-boundary equilibrium problem can be easily extended to solve the corresponding optimal control problems efficiently. The section on results, Sect. 4, starts with a validation of the convergence of the MEM and MEM-M and ends with a case study for finding snowflake configurations for the future tokamak CFETR. Section 5 gives a brief summary and draws the conclusions.

2 Formulation of the problem

The essential equations for describing plasma equilibrium in a tokamak are force balance, the solenoidal condition and Ampère’s law that read respectively

$$ \operatorname{grad}p = \mathbf {J}\times\mathbf {B},\qquad \operatorname {div}\mathbf {B}= 0,\qquad \operatorname{curl}\frac{1}{\mu} \mathbf {B}= \mathbf {J}, $$
(1)

where p is the plasma kinetic pressure, B is the magnetic induction, J is the current density and μ the magnetic permeability. The magnetohydrodynamic equilibrium (1) is a fundamental concept for nuclear fusion and we refer to standard textbooks, e.g. [3, 13, 15, 16, 35] and [22], for the details. Nevertheless, to keep this contribution self-contained, we give in the subsequent section a brief introduction following the lines of [21, Sect. 2].

2.1 The free-boundary plasma equilibrium problem

Tokamaks are predominantly axisymmetric devices, hence it is convenient to formulate (1) in a cylindrical coordinate system \((r,\varphi,z)\) in order to consider only a section at \(\varphi= \text{ constant}\) of the tokamak, generally referred to as the poloidal section. Working in a poloidal section, the scalar field p does not depend on the angle φ, thus ∇p belongs to the poloidal \((r,z)\)-plane. We introduce \(\mathbb {H} = [0,\infty) \times(-\infty,\infty)\), the positive half plane, to denote the poloidal plane that contains the tokamak centered at the origin. The classical primal unknowns for toroidal plasma equilibria described by (1) are the poloidal magnetic flux \(\psi=\psi(r,z)\), the pressure p and the diamagnetic function f. The poloidal magnetic flux \(\psi:= r\mathbf {A}\cdot\mathbf {e}_{\varphi}\) is the scaled toroidal component (φ-component) of the magnetic vector potential A, such that \(\mathbf {B}= \operatorname{curl}\mathbf {A} \), and \(\mathbf {e}_{\varphi}\) is the unit vector for the φ coordinate. The diamagnetic function \(f = r \mathbf {B}\cdot\mathbf {e}_{\varphi}\) is the scaled toroidal component of the magnetic field B. It can be shown that both the pressure p and the diamagnetic function f are constant on ψ contour lines, i.e. \(p=p(\psi)\) and \(f=f(\psi)\).

Force balance, the solenoidal condition and Ampère’s law in (1) yield, in axisymmetric configuration, the following set of equations for the flux \(\psi(r,z)\):

$$ \begin{aligned}& {-}\nabla\cdot \biggl(\frac{1}{\mu[\psi] r} \nabla\psi \biggr) = \textstyle\begin{cases} r p'(\psi) + \frac{1}{\mu_{0} r} f f'(\psi) & \text{in } \mathcal {P}(\psi); \\ I_{i}/ \vert \mathcal {C}_{i} \vert & \text{in } \mathcal {C}_{i}, 1 \leq i \leq M;\\ j_{\mathcal{S}}& \text{in } \mathcal {S};\\ 0& \text{elsewhere in } \mathbb {H}, \end{cases}\displaystyle \\ &\psi(0,z) = 0; \qquad\lim_{ \Vert (r,z) \Vert \to+\infty} \psi(r,z)= 0; \end{aligned} $$
(2)

where ∇ is the gradient in the half plane \(\mathbb {H}\), \(I_{i}\) is the total current (in At, Ampère turns) in the ith coil \(\mathcal {C}_{i} \subset\mathbb {H}\) and μ is a functional of ψ that reads

$$ \mu[\psi] = \textstyle\begin{cases} \mu_{\mathrm{Fe}}(\frac{ \vert \nabla\psi \vert ^{2}}{ r^{2}}) & \text{in } \mathcal {F},\\ \mu_{0} & \text{elsewhere}, \end{cases} $$
(3)

with \(\mu_{0}\) the constant magnetic permeability of vacuum and \(\mu _{\mathrm{Fe}}\) the non-linear magnetic permeability of iron. \(\mathcal {S}\) is the domain of axisymmetric passive structures where a current density \(j_{\mathcal {S}}\) is prescribed. The plasma domain \(\mathcal {P}(\psi)\) is an unknown, which depends non-linearly on the magnetic flux ψ: \(\mathcal {P}(\psi)\) is a functional of the poloidal flux ψ. The different characteristic shapes of \(\mathcal {P}(\psi)\) are illustrated in Fig. 2: the boundary of \(\mathcal {P}(\psi)\) either touches the boundary of \(\mathcal {L}\), the domain bounded by the limiter (limiter configuration), or the boundary contains one or more saddle points of ψ (divertor configuration). The saddle points of ψ, denoted by \((r_{\mathrm{X}}, z_{\mathrm{X}}) = (r_{\mathrm{X}}(\psi),z_{\mathrm{X}}(\psi))\), are called X-points of ψ. The plasma domain \(\mathcal {P}(\psi)\) is the largest subdomain of \(\mathcal {L}\) bounded by a closed ψ-contour line in \(\mathcal {L}\) and containing the magnetic axis \((r_{\mathrm{{max}}} ,z_{\mathrm{{max}}})\). The magnetic axis is the point \((r_{\mathrm{{max}}},z_{\mathrm {{max}}})=(r_{\mathrm{{max}}}(\psi), z_{\mathrm{{max}}}(\psi))\), where ψ has its global maximum in \(\mathcal {L}\). For convenience, we introduce also the coordinates \((r_{\mathrm{{bdp}}},z_{\mathrm{{bdp}}})=(r_{\mathrm{{bdp}}}(\psi ),z_{\mathrm{{bdp}}}(\psi))\) of the point that determines the plasma boundary. Note that \((r_{\mathrm {{bdp}}},z_{\mathrm{{bdp}}} )\) is either an X-point of ψ or the contact point with the limiter \(\partial\mathcal {L}\). Figure 4 presents the actual geometric setting of three different tokamaks, showing the wide variety of designs.

Figure 4

Poloidal section of the tokamaks ITER (left), WEST (middle) and HL-2M (right). ITER, the International Thermonuclear Experimental Reactor, is currently being built in Cadarache, France, and planned to be operational in 2025. WEST, the Tungsten (W) Environment in Steady-state Tokamak, is the remodeled Tore Supra tokamak of the CEA, also located in Cadarache. Tore Supra was operational from 1988 to 2010 and experiments with WEST started in 2017. HL-2M is a modification of HL-2A, a tokamak in Chengdu, China, operational since 2001

The equation (2) in the plasma domain is the celebrated Grad–Shafranov–Schlüter equation [17, 25, 32]. The domain of \(p'\) and \(f f'\) is the interval \([\psi_{\mathrm{{bdp}}},\psi_{\mathrm{{max}}}]\) with the scalar values \(\psi_{\mathrm{{max}}}\) and \(\psi_{\mathrm {{bdp}}}\) being the flux values at the magnetic axis and at the boundary of the plasma:

$$ \begin{aligned} &\psi_{\mathrm{{max}}}[\psi]:=\psi \bigl(r_{\mathrm{{max}}}[ \psi ],z_{\mathrm{{max}}}[\psi] \bigr), \\ &\psi_{\mathrm{{bdp}}}[\psi]:=\psi \bigl(r_{\mathrm{{bdp}}}[\psi ],z_{\mathrm{{bdp}}}[ \psi] \bigr) . \end{aligned} $$
(4)

The two functions \(p'\) and \(f f'\) and the currents \(I_{i}\) in the coils are not determined by the model (2) and have to be supplied as data. Since the domain of \(p'\) and \(f f'\) depends on the poloidal flux itself, it is more practical to supply these profiles as functions of the normalized poloidal flux \(\psi_{\mathrm{N}}(r,z)\):

$$ \psi_{\mathrm{N}}(r,z) = \frac{\psi(r,z) - \psi_{\mathrm{{max}}}[\psi]}{\psi_{\mathrm{{bdp}}}[\psi] - \psi _{\mathrm{{max}}}[\psi]}. $$
(5)

These two functions, subsequently termed \(S_{p'}\) and \(S_{ff'}\), have, independently of ψ, a fixed domain \([0,1]\). They are usually given as piecewise polynomial functions. Another frequent a priori model is

$$ { S_{p'}(\psi_{\mathrm{N}}) = \frac{\beta}{r_{0}} \bigl(1-\psi_{\mathrm{N}}^{\alpha}\bigr)^{\gamma}, \qquad S_{ff'}(\psi_{\mathrm{N}}) = (1-\beta)\mu_{0} r_{0} \bigl(1-\psi_{\mathrm{N}}^{\alpha}\bigr)^{\gamma}, } $$
(6)

with \(r_{0}\) the major radius (in meters) of the vacuum chamber and \(\alpha,\beta,\gamma\in {\mathbb {R}}\) given parameters. We refer to [26] for a physical interpretation of these parameters. The parameter β is related to the poloidal beta [3, p. 15], whereas α and γ describe the peakage of the current profile.
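The normalization (5) and the profile model (6) can be illustrated with a minimal Python sketch. The function names and the numerical parameter values in the usage note are illustrative assumptions and are not taken from any tokamak code or data set.

```python
import math

MU0 = 4.0e-7 * math.pi  # vacuum permeability in H/m


def normalized_flux(psi, psi_max, psi_bdp):
    """Eq. (5): equals 0 at the magnetic axis and 1 on the plasma boundary."""
    return (psi - psi_max) / (psi_bdp - psi_max)


def s_p_prime(psi_n, r0, alpha, beta, gamma):
    """First profile of Eq. (6), evaluated at a normalized flux in [0, 1]."""
    return beta / r0 * (1.0 - psi_n ** alpha) ** gamma


def s_ff_prime(psi_n, r0, alpha, beta, gamma):
    """Second profile of Eq. (6), evaluated at a normalized flux in [0, 1]."""
    return (1.0 - beta) * MU0 * r0 * (1.0 - psi_n ** alpha) ** gamma
```

For instance, with the hypothetical values `r0 = 6.2`, `alpha = 2.0`, `beta = 0.6`, `gamma = 1.2`, both profiles attain their maximum at the magnetic axis (`psi_n = 0`) and vanish on the plasma boundary (`psi_n = 1`), independently of the actual flux values.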

The total plasma current given by

$$I_{p}:= \int_{\mathcal {P}(\psi)} \biggl( r {p'} \bigl(\psi(r,z) \bigr) + \frac {{f f'}(\psi(r,z))}{\mu_{0} r} \biggr) \,\mathrm{d}r \,\mathrm{d}z $$

is an important quantity. In many cases one prefers to find solutions to the free-boundary equilibrium problem (2) where \(I_{p}\) has a predefined value. Hence it is common to scale \(S_{p'}\) and \(S_{ff'}\) by an unknown coefficient \(\lambda\in\mathbb {R}\) such that:

$$ \int_{\mathcal {P}(\psi)} \lambda \biggl( r S_{p'} \bigl( \psi_{\mathrm{N}}(r,z) \bigr) + \frac{S_{f f'}(\psi_{\mathrm{N}}(r,z))}{\mu_{0} r} \biggr) \,\mathrm{d}r \,\mathrm{d}z = I_{p}. $$
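The scaling condition above determines λ once ψ is known, because λ enters the integral linearly. The following Python sketch shows this for a toy situation that is purely an assumption made for illustration: the plasma domain is a disk of radius `a` centered at `(r0, 0)` and the normalized flux is `psi_n = ((r - r0)^2 + z^2) / a^2`. The integral is approximated by the midpoint rule on a Cartesian grid; all names are hypothetical.

```python
import math

MU0 = 4.0e-7 * math.pi  # vacuum permeability in H/m


def current_density_shape(r, psi_n, r0, alpha, beta, gamma):
    """Unscaled integrand r*S_p'(psi_N) + S_ff'(psi_N)/(mu0*r) with profiles (6)."""
    shape = (1.0 - psi_n ** alpha) ** gamma
    s_p = beta / r0 * shape
    s_ff = (1.0 - beta) * MU0 * r0 * shape
    return r * s_p + s_ff / (MU0 * r)


def scaling_lambda(ip_target, r0, a, alpha, beta, gamma, n=200):
    """Return lambda such that the scaled current density integrates to ip_target.

    Toy assumption: the plasma is the disk of radius a around (r0, 0).
    """
    h = 2.0 * a / n
    total = 0.0
    for i in range(n):
        r = r0 - a + (i + 0.5) * h
        for j in range(n):
            z = -a + (j + 0.5) * h
            psi_n = ((r - r0) ** 2 + z ** 2) / a ** 2
            if psi_n < 1.0:  # cell midpoint lies inside the toy plasma domain
                total += current_density_shape(r, psi_n, r0, alpha, beta, gamma) * h * h
    return ip_target / total
```

In the actual free-boundary problem ψ, and therefore the plasma domain, depends on λ as well, so λ has to be treated as an additional unknown of the non-linear system rather than computed in a single post-processing step as in this sketch.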

2.2 The plasma equilibrium and tokamak experiments

Computing plasma equilibria, the solutions to (2), is a central topic in tokamak fusion science. This is essential for simulations with elaborate high-dimensional magnetohydrodynamic models, but also for experimenters who need to control real tokamak reactors. They need to compute a huge number of equilibria to set up discharge scenarios, to study breakdowns and disruptions, or to design the layout of new machines. The computational challenges for numerical codes for such free-boundary equilibrium problems are the problem setting in an unbounded domain, the non-linearities due to the current density profile in the unknown plasma domain, and the non-linear magnetic permeability if the reactor has ferromagnetic structures. Devising stable iterative schemes is known to be very tricky [27], in particular for computing physically unstable equilibria. The combination of Galerkin methods and Newton-type iterations, first introduced in [4], is among the most successful approaches to this type of free-boundary problem. Computing derivatives of the plasma domain has similarities with shape calculus. We refer to [20] for details and the latest improvements and extensions of this approach. In Figs. 5, 6 and 7 we show a couple of representative examples for such equilibrium calculations that are based on a standard Galerkin method with lowest order Lagrangian finite elements on triangular meshes as described in [4] or [20] and implemented in the MATLAB/Octave library FEEQS.M.

Figure 5

Example WEST and ITER: Contour Lines. Contour lines of the magnetic flux ψ for WEST (left) and ITER (right). The location \((r_{\mathrm{{max}}}(\psi),z_{\mathrm{{max}}}(\psi))\) of the maximum of ψ is indicated with a green circle. The location \((r_{\mathrm{X}},z_{\mathrm{X}})\) of the discrete saddle points of ψ is indicated by black circles. The magenta line indicates the contour line that contains the plasma boundary

Figure 6

Example WEST and ITER, Flux and Current Density. Pseudo-color plot of the magnetic flux ψ and the plasma current density for two different cases: WEST (left) and ITER (right)

Figure 7

Example HL-2M. Contour lines of the magnetic flux ψ for three different configurations of HL-2M. The location \((r_{\mathrm{{max}}} (\psi),z_{\mathrm{{max}}}(\psi))\) of the maximum of ψ is indicated with a green circle. The location \((r_{\mathrm{X}},z_{\mathrm{X}})\) of the discrete saddle points of ψ is indicated by black circles. The magenta line indicates the contour line that contains the plasma boundary

In order to prepare experiments on each machine, it is routine, almost daily work to compute not only the magnetic flux for a given set of coil currents, but also to determine coil currents that create a plasma equilibrium with certain desired properties. Such properties can be, for example, the shape of the plasma domain, the position of the X-point or the distribution of the plasma current density. It is very convenient to formulate such tasks as inverse or optimal control problems by introducing objective functionals that encode the design goals. A common choice would be the quadratic functional

$${C}(\psi) = \sum_{i=1}^{N_{\mathrm{desi}}} \bigl(\psi (r_{i},z_{i},t)-\psi(r_{0},z_{0},t) \bigr)^{2}, $$

that would help to find a plasma equilibrium that has constant ψ values on \(N_{\mathrm{desi}}+1\) given points \((r_{i},z_{i})\). However, from the definition of the equilibrium problem it is clear that the stationary points of the magnetic flux ψ have a very important role and it would be very beneficial to formulate objective functionals for these stationary points. Moreover, the location of the X-point has a big influence on where the extremely hot impurities released from the plasma core hit the walls of the reactor. Very recently it was discovered that the so-called snowflake configuration, with degenerate X-points or with many X-points close to each other (see Fig. 7), can have very positive effects for the heat load mitigation, and hence engineers are interested in preparing tokamak scenarios with such configurations.
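The quadratic functional above only needs point evaluations of ψ. A minimal Python sketch for the static case (the time argument t is dropped) could look as follows; the callable `psi` and the list `points` are hypothetical inputs chosen for illustration.

```python
def shape_objective(psi, points):
    """Sum of squared deviations of psi at points[1:] from its value at points[0].

    psi    : callable (r, z) -> float, e.g. an FE interpolant of the flux
    points : list of (r, z) tuples; points[0] plays the role of (r_0, z_0)
    """
    r_ref, z_ref = points[0]
    psi_ref = psi(r_ref, z_ref)
    return sum((psi(r, z) - psi_ref) ** 2 for r, z in points[1:])
```

The functional vanishes exactly when all given points lie on one ψ contour line, which is why it can encode a desired plasma boundary shape.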

With the current approaches it is not obvious how to formulate objective functionals for such tasks. The gradients or Hessians of the Galerkin approximation of ψ are non-smooth across element boundaries. Point evaluations of these gradients and Hessians are not well defined. Therefore, we prefer to work with higher order regular Galerkin methods. As this is easy with Cartesian meshes, we are interested in combining Cartesian meshes covering the burning chamber with triangle meshes covering the remaining parts of the computational domain.
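To illustrate why tensor-product splines on a Cartesian mesh make point evaluations of gradients and Hessians well defined, the following Python sketch evaluates a uniform cubic B-spline expansion together with its first and second derivatives. Cubic B-splines are twice continuously differentiable, so the Hessian is continuous everywhere. The grid, the coefficient choice and all function names are assumptions made for this example and do not reproduce the discretization used in the paper.

```python
def bspline3(t, deriv=0):
    """Uniform cubic B-spline centered at 0 with support [-2, 2], or a derivative."""
    s = abs(t)
    sign = 1.0 if t >= 0.0 else -1.0
    if s >= 2.0:
        return 0.0
    if s <= 1.0:
        if deriv == 0:
            return 2.0 / 3.0 - s * s + 0.5 * s ** 3
        if deriv == 1:
            return sign * (-2.0 * s + 1.5 * s * s)
        return -2.0 + 3.0 * s
    if deriv == 0:
        return (2.0 - s) ** 3 / 6.0
    if deriv == 1:
        return -sign * 0.5 * (2.0 - s) ** 2
    return 2.0 - s


def spline_eval(coef, r_nodes, z_nodes, h, r, z, dr=0, dz=0):
    """Evaluate sum_ij coef[i][j] B((r-r_i)/h) B((z-z_j)/h), differentiated
    dr times in r and dz times in z (chain rule gives the factor h**-(dr+dz))."""
    total = 0.0
    for i, ri in enumerate(r_nodes):
        br = bspline3((r - ri) / h, dr)
        if br == 0.0:
            continue
        for j, zj in enumerate(z_nodes):
            total += coef[i][j] * br * bspline3((z - zj) / h, dz)
    return total / h ** (dr + dz)


# Example: coefficients r_i^2 - z_j^2 reproduce the saddle r^2 - z^2 away from
# the grid boundary, a toy stand-in for a flux map with an X-point at (0, 0).
h = 0.25
nodes = [-1.0 + k * h for k in range(9)]
coef = [[ri * ri - zj * zj for zj in nodes] for ri in nodes]
```

With these coefficients, `spline_eval(coef, nodes, nodes, h, 0.0, 0.0, 1, 0)` and `spline_eval(coef, nodes, nodes, h, 0.0, 0.0, 0, 1)` give the two gradient components at the saddle point, and the calls with `(dr, dz)` equal to `(2, 0)`, `(1, 1)` and `(0, 2)` give the entries of the Hessian. The same calls with a piecewise linear Lagrangian interpolant would not be meaningful at mesh vertices or across element edges.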

3 Numerical methods

To simplify the presentation of the optimal control formulation and the main ingredients for an implementation, we focus first on the details of the mortar element method (MEM). To keep the discussion concise, we elaborate the MEM for a linear elliptic problem.

3.1 A mortar element method (MEM) with overlapping meshes

To focus on the main idea we consider the following Poisson problem for the unknown ψ in the bounded domain \(\Omega\subset\mathbb {R}^{n}\) with boundary \(\Gamma= \partial\Omega\):

$$ - \nabla\cdot({D} \nabla\psi) = f \quad \text{in } \Omega\quad \text{and}\quad \psi = \psi_{0} \quad\text{on } \Gamma, $$
(7)

where ∇ (resp., ∇⋅) is the gradient (resp., divergence) operator in \(\mathbb {R}^{n}\) and \({D(\mathbf {x})} \in\mathbb {R}\) is positive for any \(\mathbf {x}\in\Omega\). The right-hand side f and the Dirichlet data \(\psi_{0}\) are given. Let \(L^{2}(\Omega)\) be the space of measurable functions that are square integrable in Ω, and \({H}^{1}(\Omega) = \{u \in L^{2}(\Omega), \nabla u \in L^{2}(\Omega )^{n}\}\) the Hilbert space endowed with the norm \(\|u\|^{2}_{{H}^{1}(\Omega)} = \|u\|^{2}_{\Omega} + |u|^{2}_{{H}^{1}(\Omega)}\), where \(|u|^{2}_{{H}^{1}(\Omega)} = \| \nabla u \|^{2}_{\Omega}\). Let \(\Omega^{\mathrm{in}}\subset\Omega\) be a subdomain with \(\Omega ^{\mathrm{in}}\cap \Gamma= \emptyset\) and \(\Omega^{\mathrm{ex}}= \Omega\setminus \Omega^{\mathrm{in}}\) the complement of \(\Omega^{\mathrm{in}}\) in Ω. Further, the boundary of \(\Omega^{\mathrm{in}}\), \(\gamma:= \partial\Omega^{\mathrm{in}}\), is the interface between \(\Omega^{\mathrm{ex}}\) and \(\Omega^{\mathrm{in}}\). To formulate (7) as a variational problem in a domain decomposition framework, let us introduce the functional space

$${\mathcal {H}_{g}} = \bigl\{ (v,w) \in{H}^{1} \bigl( \Omega^{\mathrm{ex}} \bigr) \times{H}^{1} \bigl(\Omega^{\mathrm{in}} \bigr), v_{| \Gamma} = g, v_{| \gamma} = w_{| \gamma} \bigr\} . $$

Then, the weak formulation of (7) is: Find \((\psi ^{\mathrm{ex}},\psi^{\mathrm{in}}) \in{\mathcal {H}_{\psi_{0}}}\) s.t. for all \((v,w) \in{\mathcal {H}_{0}}\)

$$\begin{aligned} &\int_{\Omega^{\mathrm{ex}}} {D}(\mathbf {x}) \nabla\psi^{\mathrm{ex}}(\mathbf {x}) \cdot \nabla v(\mathbf {x}) \,\mathrm{d}\mathbf{x}+ \int_{\Omega^{\mathrm {in}}} {D}(\mathbf {x}) \nabla\psi ^{\mathrm{in}}(\mathbf {x}) \cdot \nabla w(\mathbf {x}) \,\mathrm{d}\mathbf{x} \\ &\quad = \int_{\Omega^{\mathrm {ex}}} f(\mathbf {x}) v(\mathbf {x}) \,\mathrm{d}\mathbf{x} + \int_{\Omega^{\mathrm{in}}} f(\mathbf {x}) w(\mathbf {x})\, \mathrm {d}\mathbf{x}. \end{aligned}$$
(8)

We wish to introduce different types of meshes \(\mathcal{T}^{\mathrm {ex}}\) and \(\mathcal{T}^{\mathrm{in}}\) in the two subdomains \(\Omega^{\mathrm{ex}}\) and \(\Omega^{\mathrm{in}}\). To achieve a maximum of flexibility we do not expect the meshes \(\mathcal{T}^{\mathrm{ex}} \) and \(\mathcal{T}^{\mathrm{in}}\) to be conforming with \(\Omega ^{\mathrm{ex}}\) and \(\Omega^{\mathrm{in}}\). More precisely, we denote by \(\Omega^{\mathrm{ex}}_{h}\) and \(\Omega ^{\mathrm{in}}_{h}\) the domains covered by the mesh elements of \(\mathcal{T}^{\mathrm {ex}}\) and \(\mathcal{T}^{\mathrm{in}}\), respectively, and we only require that \(\Omega^{\mathrm{ex}}\subset\Omega^{\mathrm {ex}}_{h}\subset\Omega\), \(\Gamma\subset\partial\Omega^{\mathrm{ex}}_{h}\) and \(\Omega ^{\mathrm{in}}\subset\Omega^{\mathrm{in}}_{h} \subset\Omega\). Hence the approximation of (8) enters into the framework of overlapping domain decomposition methods. Let \(\gamma^{\mathrm{ex}}= \partial\Omega^{\mathrm{ex}}_{h}\setminus \Gamma\) and \(\gamma^{\mathrm{in}}= \partial\Omega^{\mathrm{in}}_{h}\) be the two boundaries of \(\Omega^{\mathrm{ex}}_{h}\) and \(\Omega^{\mathrm {in}}_{h}\) in Ω that replace the interface γ. Then we introduce the space

$$\mathcal {V}_{g} = \bigl\{ (v,w) \in\mathcal{V}^{\mathrm{ex}} \times \mathcal{V}^{\mathrm{in}}, v_{| \Gamma} = \Pi^{\mathrm {Dir}}g, v_{| \gamma^{\mathrm{ex}}} = \Pi^{\mathrm{ex}}w, w_{| \gamma^{\mathrm{in}}} = \Pi^{\mathrm{in}}v \bigr\} , $$

where \(\mathcal{V}^{\mathrm{ex}}\) and \(\mathcal{V}^{\mathrm{in}}\) are \(H^{1}(\Omega^{\mathrm{ex}}_{h})\) and \(H^{1}(\Omega^{\mathrm{in}}_{h})\) conforming FE spaces defined over \(\mathcal{T}^{\mathrm{ex}}\) and \(\mathcal {T}^{\mathrm{in}}\). The operators \(\Pi^{\mathrm{Dir}} \), \(\Pi^{\mathrm{ex}}\) and \(\Pi^{\mathrm{in}}\) are projections onto the Dirichlet trace spaces \(\mathcal{V}_{\Gamma}:= \operatorname{tr}_{|\Gamma} \mathcal{V}^{\mathrm{ex}}\), \(\mathcal{V}^{\mathrm{ex}}_{\gamma}:= \operatorname{tr}_{|\gamma ^{\mathrm{ex}}} \mathcal{V}^{\mathrm{ex}}\) and \(\mathcal {V}^{\mathrm{in}}_{\gamma}:= \operatorname{tr}_{|\gamma^{\mathrm{in}} } \mathcal{V}^{\mathrm{in}}\). The \(\text{MEM}_{s,t}\) with overlapping domains [1, 6, 23] applied to (8) reads: Find \((\psi^{\mathrm{ex}},\psi^{\mathrm{in}}) \in\mathcal {V}_{\psi_{0}}\) such that

$$ \begin{aligned} \mathsf{a}^{\mathrm{ex}}_{{s}} \bigl( \psi^{\mathrm{ex}},v \bigr) + \mathsf {a}^{\mathrm{in}}_{{t}} \bigl( \psi^{\mathrm{in}},w \bigr) = \mathsf{\ell}^{\mathrm{ex}}_{{s}}(f,v) + \mathsf{\ell}^{\mathrm {in}}_{{t}}(f,w) \quad \forall(v,w) \in\mathcal {V}_{0}, \end{aligned} $$
(9)

where

$$\begin{aligned} & \mathsf{a}^{\mathrm{loc}}_{\mathrm{coef}}(\psi,v) : = \int_{\Omega ^{\mathrm{{loc}}}_{h}} {D}(\mathbf {x}) \nabla\psi(\mathbf {x}) \cdot \nabla v( \mathbf {x}) \,\mathrm{d}\mathbf{x}- \int_{\Omega^{\mathrm {ex}}_{h}\cap\Omega^{\mathrm{in}}_{h}} {\mathrm{coef}} {D}(\mathbf {x}) \nabla\psi(\mathbf {x}) \cdot \nabla v(\mathbf {x}) \,\mathrm{d}\mathbf{x}, \\ &\mathsf{\ell}^{\mathrm{loc}}_{\mathrm{coef}}(f,v): = \int_{\Omega ^{\mathrm{{loc}}}_{h}} f(\mathbf {x}) v(\mathbf {x}) \,\mathrm{d}\mathbf{x}- \int_{\Omega^{\mathrm{ex}}_{h}\cap\Omega ^{\mathrm{in}}_{h}} {\mathrm{coef}} f(\mathbf {x} ) v(\mathbf {x}) \,\mathrm{d}\mathbf{x}, \end{aligned}$$

for \(\mathrm{loc} = \mathrm{ex}\) (resp., \(\mathrm{loc} = \mathrm {in}\)) and \({\mathrm{coef}} = s\) (resp., \({\mathrm{coef}} = t\)), with \({\mathrm{coef}} \in[0,1]\). Optimal convergence results are available when \(s+t=1\) and \(\Pi^{\mathrm{ex}}\), \(\Pi^{\mathrm{in}}\), are the \(L^{2}\) projections onto \(\operatorname{tr}_{|\gamma^{\mathrm{ex}}} \mathcal{V}^{\mathrm{ex}}\), \(\operatorname{tr}_{|\gamma^{\mathrm{in}}} \mathcal{V}^{\mathrm {in}}\), respectively [1, 6]. However, two very restrictive disadvantages occur with the formulation (9):

  1. Assembling of the stiffness matrices associated to \(\mathsf {a}^{\mathrm{ex}}_{{s}}(\cdot,\cdot)\) and \(\mathsf{a}^{\mathrm {in}}_{{t}}(\cdot,\cdot)\) involves products of basis functions defined on different meshes. Similarly, assembling of the load vectors corresponding to \(\mathsf {\ell}^{\mathrm{ex}}_{{s}}(f,\cdot)\) and \(\mathsf{\ell}^{\mathrm {in}}_{{t}}(f,\cdot)\) involves integration over intersections of elements from different meshes.

  2. The stability of \(\text{MEM}_{s,t}\) requires the projections \(\Pi^{\mathrm{ex}}\) and \(\Pi^{\mathrm{in}}\) to be stable in \(H^{\frac{1}{2}}\). The obvious choice of \(L^{2}\) projections involves again surface integrals of products of basis functions defined on different meshes.
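The role of the weights s and t in the bilinear forms of (9) can be checked in a one-dimensional toy setting. The Python sketch below is an illustration under assumptions that are not part of the source: Ω = (0, 1), the two meshes cover (0, 0.6) and (0.4, 1), and the arguments are globally defined smooth functions. It shows that for s + t = 1 the overlap is counted exactly once, whereas for s = t = 0 it is counted twice.

```python
def integrate(g, a, b, n=2000):
    """Composite midpoint rule for the integral of g over (a, b)."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))


def a_loc(du, dv, D, domain, overlap, coef):
    """1D analogue of a^loc_coef: integral over the covered domain minus
    coef times the integral over the overlap of the two covered domains."""
    g = lambda x: D(x) * du(x) * dv(x)
    return integrate(g, *domain) - coef * integrate(g, *overlap)


OMEGA_EX = (0.0, 0.6)  # toy region covered by the exterior mesh
OMEGA_IN = (0.4, 1.0)  # toy region covered by the interior mesh
OVERLAP = (0.4, 0.6)   # intersection of the two covered regions
```

Evaluating `a_loc(..., OMEGA_EX, OVERLAP, s) + a_loc(..., OMEGA_IN, OVERLAP, t)` with `s + t = 1` reproduces the integral over the whole interval (0, 1). The price, as stated in the first item above, is that the overlap integrals couple basis functions from two different meshes.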

In the following we introduce two mortar-like mappings, different from the standard \(L^{2}\) projection, that allow us to choose \(s=t=0\) in (9) and hence avoid the expensive assembling of the stiffness matrix for basis functions on two different meshes.

We recall that the FE spaces \(\mathcal{V}^{\mathrm{ex}}\) and \(\mathcal{V}^{\mathrm{in}}\) can be represented as direct sums \(\mathcal{V}^{\mathrm{ex}}= \mathcal{V}^{\mathrm{ex}}_{\circ}\oplus E \mathcal{V}^{\mathrm{ex}}_{\gamma}\) and \(\mathcal{V}^{\mathrm {in}}= \mathcal{V}^{\mathrm{in}}_{\circ}\oplus E \mathcal{V}^{\mathrm {in}}_{\gamma}\) where \(\mathcal{V}^{\mathrm{ex}}_{\gamma}= \operatorname{tr}_{|\gamma ^{\mathrm{ex}}} \mathcal{V}^{\mathrm{ex}}\) and \(\mathcal {V}^{\mathrm{in}}_{\gamma}= \operatorname{tr} _{|\gamma^{\mathrm{in}}} \mathcal{V}^{\mathrm{in}}\) are the earlier introduced trace spaces of \(\mathcal{V}^{\mathrm{ex}}\) and \(\mathcal{V}^{\mathrm{in}}\) and E is the trivial extension operator. Let us introduce two mappings \(\Pi^{\mathrm{ex}}_{f} \psi^{\mathrm{in}}\) for \(\psi^{\mathrm{in}}= \psi^{\mathrm{in}}_{\circ}+ \psi^{\mathrm{in}}_{\gamma}\), with \(\psi^{\mathrm{in}}_{\circ}\in \mathcal{V}^{\mathrm{in}}_{\circ}\), \(\psi^{\mathrm{in}}_{\gamma}\in \mathcal{V}^{\mathrm{in}}_{\gamma}\) and \(\Pi^{\mathrm{in}} _{f} \psi^{\mathrm{ex}}\) for \(\psi^{\mathrm{ex}}= \psi^{\mathrm {ex}}_{\circ}+ \psi^{\mathrm{ex}}_{\gamma}\), with \(\psi^{\mathrm{ex}}_{\circ}\in\mathcal{V}^{\mathrm{ex}}_{\circ}\), \(\psi^{\mathrm{ex}}_{\gamma}\in\mathcal{V}^{\mathrm{ex}}_{\gamma}\). The mapping \(\Pi^{\mathrm{ex}}_{f}\) is defined as:

$$\Pi^{\mathrm{ex}}_{f} \psi^{\mathrm{in}}:= \Pi^{\mathrm{ex}} \bigl(\psi ^{\mathrm{in}}_{\gamma}+ \Psi^{\mathrm{in}} \bigr), $$

where \(\Psi^{\mathrm{in}} \in\mathcal{V}^{\mathrm{in}}_{\circ}\) such that

$$\mathsf{a}^{\mathrm{in}}_{{0}} \bigl(\Psi^{\mathrm{in}},w \bigr) = \mathsf{\ell }^{\mathrm{in}}_{{0}}(f,w)-\mathsf{a}^{\mathrm{in}}_{{0}} \bigl(\psi ^{\mathrm{in}}_{\gamma},w \bigr)\quad \forall w \in \mathcal{V}^{\mathrm{in}}_{\circ}, $$

and \(\Pi^{\mathrm{ex}}\) is either the \(L^{2}\)-projection or standard nodal interpolation operator onto \(\mathcal{V}^{\mathrm{ex}}_{\gamma}\). The mapping \(\Pi^{\mathrm{in}}_{f}\) is defined analogously. We then introduce the space

$$\mathcal {V}_{g,f} = \bigl\{ (v,w) \in\mathcal{V}^{\mathrm{ex}} \times \mathcal{V}^{\mathrm{in}}, v_{| \Gamma} = \Pi^{\mathrm {Dir}}g, v_{| \gamma^{\mathrm{ex}}} = \Pi^{\mathrm{ex}}_{f} w, w_{| \gamma^{\mathrm{in}}} = \Pi^{\mathrm{in}}_{f} v \bigr\} , $$

and obtain the following modified version of the MEM for overlapping meshes: Find \((\psi^{\mathrm{ex}},\psi^{\mathrm{in}}) \in\mathcal {V}_{\psi _{0},f}\) such that

$$ \mathsf{a}^{\mathrm{ex}}_{{0}} \bigl(\psi^{\mathrm{ex}},v \bigr) + \mathsf {a}^{\mathrm{in}}_{{0}} \bigl(\psi^{\mathrm{in}},w \bigr) = \mathsf{\ell}^{\mathrm{ex}}_{{0}}(f,v) + \mathsf{ \ell}^{\mathrm {in}}_{{0}}(f,w)\quad \forall(v,w) \in\mathcal {V}_{0,0} {.} $$
(10)

A similar approach with the lowest order FE spaces has been proposed in [9, 10] in the context of non-destructive testing. The auxiliary variable \(\Psi^{\mathrm{in}}\) is equal to \(\psi ^{\mathrm{in}}_{\circ}\), since we have both

$$\mathsf{a}^{\mathrm{in}}_{{0}} \bigl(\Psi^{\mathrm{in}},w \bigr) + \mathsf {a}^{\mathrm{in}}_{{0}} \bigl(\psi^{\mathrm{in}}_{\gamma},w \bigr) = \mathsf {\ell}^{\mathrm{in}}_{{0}}(f,w) \quad \forall w \in \mathcal{V}^{\mathrm{in}}_{\circ}$$

and

$$\mathsf{a}^{\mathrm{in}}_{{0}} \bigl(\psi^{\mathrm{in}}_{\circ},w \bigr) + \mathsf{a}^{\mathrm{in}}_{{0}} \bigl(\psi^{\mathrm{in}}_{\gamma},w \bigr) = \mathsf{\ell}^{\mathrm{in}}_{{0}}(f,w) \quad \forall w \in \mathcal{V}^{\mathrm{in}}_{\circ}. $$

Likewise the auxiliary variable \(\Psi^{\mathrm{ex}}\) is equal to \(\psi^{\mathrm{ex}}_{\circ}\). Hence it is easy to see that (10) is equivalent to the following formulation. Find \((\psi^{\mathrm{ex}},\psi^{\mathrm{in}}) \in\mathcal{V}^{\mathrm {ex}}\times\mathcal{V}^{\mathrm{in}}\), \(\psi^{\mathrm{ex}}_{|\Gamma} = \Pi^{\mathrm{Dir}}g\) such that:

$$ \begin{aligned}& \mathsf{a}^{\mathrm{ex}}_{{0}} \bigl( \psi^{\mathrm{ex}},v \bigr) + \mathsf {a}^{\mathrm{in}}_{{0}} \bigl( \psi^{\mathrm{in}},w \bigr) = \mathsf{\ell}^{\mathrm{ex}}_{{0}}(f,v) + \mathsf{\ell}^{\mathrm {in}}_{{0}}(f,w)\quad \forall(v,w) \in \mathcal{V}^{\mathrm{ex}}_{\circ}\times\mathcal {V}^{\mathrm{in}}_{\circ}, v_{|\Gamma} = 0, \\ &\psi^{\mathrm{ex}}_{|\gamma^{\mathrm{ex}}} = \Pi^{\mathrm{ex}}\psi ^{\mathrm{in}},\qquad \psi^{\mathrm{in}}_{|\gamma^{\mathrm{in}}} = \Pi^{\mathrm {in}}\psi^{\mathrm{ex}}, \end{aligned} $$
(11)

which is similar to the numerical zoom formulation in [19], where both the spaces \(\mathcal{V}^{\mathrm{ex}}\) and \(\mathcal{V}^{\mathrm{in}}\) are defined over triangular meshes instead. This form of the modified MEM was also the starting point in [21]. When \(\Pi^{\mathrm{ex}}\) and \(\Pi ^{\mathrm{in}}\) are interpolation operators and \(\mathcal{V}^{\mathrm{ex}}\) and \(\mathcal {V}^{\mathrm{in}}\) are lowest order Lagrangian FE spaces on triangular meshes, we can recall an optimal convergence result from [24, Theorem 1] for the error in the \(L^{\infty}\)-norm, under the assumption that discrete maximum principles and a decay condition for the numerical solution away from the boundary (see [24, p. 199]) hold. A similar result was established earlier in [7] (see also [14]) for finite difference schemes on overlapping uniform (Cartesian) meshes. Generalizing such results to more general meshes, combinations of triangular and uniform meshes, higher order FE methods or less restrictive conditions is an open problem. Such a generalization is nevertheless crucial, as in many cases discrete maximum principles will not hold.

3.2 MEM-M Galerkin formulation for the equilibrium problem

To adapt to the notation from Sect. 3.1 we introduce \(\mathbf {x} :=(x_{r},x_{z}):=(r,z)\). Next, we choose a semi-circle Γ of radius \(\rho_{\Gamma}>0\) surrounding the iron domain \(\mathcal {F}\) and the coil domains \(\mathcal {C}_{i}\). Our computational domain \(\Omega\subset\mathbb {H}\) is the half circle domain with the boundary \(\partial\Omega= \Gamma\cup\Gamma_{0}\), where \(\Gamma_{0}:=\{(0,x_{z}), -\rho_{\Gamma}\le x_{z} \le\rho_{\Gamma}\}\). The exterior domain \(\Omega^{\mathrm{ex}}\), that will be covered by a triangular mesh, is the complement of the limiter-bounded domain \(\mathcal {L} \) in Ω: \(\Omega^{\mathrm{ex}}= \Omega\setminus\mathcal {L}\). The interior domain \(\Omega^{\mathrm{in}}\) is the limiter-bounded domain \(\mathcal {L}\) (see Figs. 4 and 8). We arrive at the following MEM-M Galerkin formulation of the non-linear plasma equilibrium problem (2): Find \(\lambda\in\mathbb {R}\) and \((\psi^{\mathrm{ex}},\psi ^{\mathrm{in}}) \in\mathcal{V}^{\mathrm{ex}}\times\mathcal {V}^{\mathrm{in}}\), \(\psi^{\mathrm{ex}}_{|\Gamma_{0}} =0\) such that:

$$\begin{aligned} &\int_{\Omega^{\mathrm{ex}}_{h}} \frac{\nabla\psi ^{\mathrm{ex}}(\mathbf {x}) \cdot\nabla v(\mathbf {x})}{\mu[\psi^{\mathrm{ex}}] x_{r}} \,\mathrm{d}\mathbf{x}+ c \bigl( \psi^{\mathrm{ex}},v \bigr) \\ &\quad= {\sum_{i=1}^{M}} \int_{\mathcal {C}_{i}} \frac{I_{i} v(\mathbf {x}) }{ \vert \mathcal {C}_{i} \vert } \,\mathrm{d}\mathbf{x}\quad \forall v \in \mathcal{V}^{\mathrm{ex}}_{\circ}, v_{|\Gamma_{0}} = 0, \\ &\int_{\Omega^{\mathrm{in}}_{h}} \frac{\nabla\psi ^{\mathrm{in}}(\mathbf {x}) \cdot\nabla w(\mathbf {x})}{\mu_{0} x_{r}} \,\mathrm {d}\mathbf{x} \\ &\qquad{-}\int_{\mathcal {P}(\psi^{\mathrm{in}})} {\lambda} \biggl( x_{r} S_{p'} \bigl(\psi ^{\mathrm{in}}(\mathbf {x}) \bigr) + \frac{S_{f f'}(\psi^{\mathrm{in}}(\mathbf {x}))}{\mu _{0} x_{r}} \biggr) w( \mathbf {x}) \,\mathrm{d}\mathbf{x} = 0 \quad\forall w \in \mathcal{V}^{\mathrm{in}}_{\circ}, \\ & \int_{\mathcal {P}(\psi^{\mathrm{in}})} \lambda \biggl( x_{r} S_{p'} \bigl(\psi ^{\mathrm{in}}(\mathbf {x}) \bigr) + \frac{S_{f f'}(\psi^{\mathrm{in}}(\mathbf {x}))}{\mu _{0} x_{r}} \biggr) \,\mathrm{d}\mathbf{x} = I_{p}, \\ &\int_{\gamma^{\mathrm{ex}}} \psi^{\mathrm{ex}}(\mathbf {x}) {\xi }(\mathbf {x}) \,\mathrm{d}\mathbf{ s}(\mathbf {x}) - \int_{\gamma^{\mathrm{ex}}} \Pi^{\mathrm{ex}}\psi^{\mathrm{in}}(\mathbf {x}){\xi}(\mathbf {x}) \,\mathrm{d}\mathbf{ s}(\mathbf {x}) = 0 \quad\forall{\xi} \in \mathcal{V}^{\mathrm{ex}}_{\gamma}, \\ &\int_{\gamma^{\mathrm{in}}} \psi^{\mathrm{in}}(\mathbf {x}) {\chi }(\mathbf {x}) \,\mathrm{d}\mathbf{ s}(\mathbf {x}) - \int_{\gamma^{\mathrm{in}}} \Pi^{\mathrm{in}}\psi^{\mathrm{ex}}(\mathbf {x}) {\chi}(\mathbf {x}) \,\mathrm{d}\mathbf{ s}(\mathbf {x} ) = 0 \quad \forall{\chi} \in \mathcal{V}^{\mathrm{in}}_{\gamma}. \end{aligned}$$
(12)

The bilinear form \(c(\cdot,\cdot)\) [2, 11, 18] takes into account the boundary conditions at infinity using Green's functions of the operator \(-\nabla\cdot (\frac{1}{\mu x_{r}} \nabla (\cdot ) )\). It is defined as follows

$$\begin{aligned} c(\psi,\xi):= {}& \frac{1}{\mu_{0}} \int_{\Gamma}\psi(\mathbf {x}) N(\mathbf {x}) \xi (\mathbf {x}) \,\mathrm{d}\mathbf{ s}(\mathbf {x}) \\ & {} + \frac{1}{2 \mu_{0}} \int_{\Gamma}\int_{\Gamma}\bigl(\psi (\mathbf {x})-\psi(\mathbf {y}) \bigr) {Q}( \mathbf {x},\mathbf {y}) \bigl(\xi(\mathbf {x})-\xi(\mathbf {y}) \bigr) \,\mathrm{d}\mathbf{ s}(\mathbf {x}) \,\mathrm{d}\mathbf{ s}(\mathbf {y}), \end{aligned}$$
(13)

with

$$ \begin{aligned}& {Q}(\mathbf {x},\mathbf {y}) = \frac{k(\mathbf {x},\mathbf {y})}{2 \pi(x_{r} y_{r})^{\frac{3}{2}}} \biggl( \frac{2-k(\mathbf {x},\mathbf {y})^{2}}{2-2k(\mathbf {x},\mathbf {y})^{2}} E \bigl(k(\mathbf {x},\mathbf {y}) \bigr)-K \bigl(k(\mathbf {x}, \mathbf {y}) \bigr) \biggr), \\ &N(\mathbf {x}) = \frac{1}{x_{r}} \biggl(\frac{1}{\delta_{+}}+\frac {1}{\delta _{-}} -\frac{1}{\rho_{\Gamma}} \biggr)\quad \text{and}\quad \delta_{\pm}= \sqrt {x_{r}^{2} + ( \rho_{\Gamma}\pm x_{z} )^{2}} \end{aligned} $$

and

$$k^{2}(\mathbf {x},\mathbf {y}) = \frac{ 4 x_{r} y_{r}}{ (x_{r} + y_{r})^{2} + (x_{z} - y_{z})^{2}}. $$

\(K(k)\) and \(E(k)\) are the complete elliptic integrals of the first and second kind respectively.
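As an illustration of how the kernel \(Q\), the weight \(N\) and the modulus \(k\) might be evaluated pointwise, the following minimal sketch uses only the Python standard library. The function names (`ellip_ke`, `k_squared`, `kernel_Q`, `weight_N`) are hypothetical and not taken from any existing code; the elliptic integrals are computed with the classical arithmetic-geometric mean iteration, which is one possible choice among several. The sketch does not treat the singularity of \(Q\) at \(\mathbf {x}=\mathbf {y}\) (where \(k=1\)) or points with \(x_{r}=0\).

```python
import math

def ellip_ke(k):
    """Complete elliptic integrals K(k), E(k) for modulus 0 <= k < 1 (AGM iteration)."""
    a, b, c = 1.0, math.sqrt(1.0 - k * k), k
    s = 0.5 * c * c        # running sum of 2^(n-1) * c_n^2, starting with n = 0
    weight = 1.0           # factor 2^(n-1) for n = 1
    while abs(c) > 1e-15:
        a, b, c = 0.5 * (a + b), math.sqrt(a * b), 0.5 * (a - b)
        s += weight * c * c
        weight *= 2.0
    K = math.pi / (2.0 * a)
    return K, K * (1.0 - s)

def k_squared(x, y):
    """k^2(x, y) for points x = (x_r, x_z), y = (y_r, y_z)."""
    (xr, xz), (yr, yz) = x, y
    return 4.0 * xr * yr / ((xr + yr) ** 2 + (xz - yz) ** 2)

def kernel_Q(x, y):
    """Kernel Q(x, y) of the double boundary integral, valid for x != y."""
    k2 = k_squared(x, y)
    k = math.sqrt(k2)
    K, E = ellip_ke(k)
    prefactor = k / (2.0 * math.pi * (x[0] * y[0]) ** 1.5)
    return prefactor * ((2.0 - k2) / (2.0 - 2.0 * k2) * E - K)

def weight_N(x, rho):
    """Weight N(x) for a semi-circle of radius rho."""
    xr, xz = x
    delta_plus = math.sqrt(xr ** 2 + (rho + xz) ** 2)
    delta_minus = math.sqrt(xr ** 2 + (rho - xz) ** 2)
    return (1.0 / delta_plus + 1.0 / delta_minus - 1.0 / rho) / xr
```

When replacing `ellip_ke` by library routines, the argument convention should be checked: SciPy's `scipy.special.ellipk` and `scipy.special.ellipe`, for instance, expect the parameter \(m=k^{2}\) rather than the modulus \(k\).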

Figure 8

Setting for Validation of MEM. We may choose \(\Omega ^{\mathrm{in}}_{h}\) to have a minimal overlap with \(\Omega^{\mathrm{ex}}_{h}\) (left), that is \(\gamma^{\mathrm{ex}}\) is contained in the layer of elements of \(\mathcal{T}^{\mathrm{in}}\) which define \(\gamma^{\mathrm{in}}\). Otherwise, we say that \(\Omega^{\mathrm{in}}_{h}\) has a large overlap with \(\Omega ^{\mathrm{ex}}_{h}\) (right)

The MEM-M Galerkin formulation (12) is similar to the formulation in [21], with the difference that we treat the scaling parameter λ here as an unknown in order to match a prescribed total plasma current \(I_{p}\). We work with the free-boundary equilibrium problem with fixed plasma current because in most applications the total plasma current \(I_{p}\) is a prescribed quantity.

We want to stress that an implementation of the MEM-M Galerkin formulation (12) requires quadrature rules for the approximation of the integrals. Moreover, since the problem is non-linear we also need an iteration scheme. It is the non-linearity due to the unknown plasma domain \(\mathcal {P}(\psi^{\mathrm{in}})\) that makes these steps non-standard, and we refer to [3, 20] for the details for FE spaces over triangular meshes. The case of FE over Cartesian meshes is presented in [21, Sect. 4.4], and we mention here only that we use a Newton method for the fully discretized system as opposed to a discretization of the Newton method for the weak formulation. In many cases these two approaches are identical, but in the free-boundary setting it makes a difference. The difference is not too important for finding approximate solutions of (12) but it is essential for the optimal control formulation in the following section.

Let \(\mathbf {y}^{\mathrm{ex}}\) and \(\mathbf {y}^{\mathrm{in}}\) represent the vectors of the values of the degrees of freedom of \(\psi^{\mathrm{ex}}\in\mathcal{V}^{\mathrm{ex}}\) and \(\psi^{\mathrm{in}}\in \mathcal{V}^{\mathrm{in}}\). Then we have the decomposition \(\mathbf {y}^{\mathrm{ex}}= (\mathbf {y}^{\mathrm{ex}}_{\circ}, \mathbf {y}^{\mathrm {ex}}_{\gamma})\) and \(\mathbf {y}^{\mathrm{in}}= (\mathbf {y}^{\mathrm{in}}_{\circ}, \mathbf {y}^{\mathrm{in}}_{\gamma})\), where \(\mathbf {y}^{\mathrm{ex}}_{\circ}\) (resp., \(\mathbf {y}^{\mathrm{in}}_{\circ}\)) and \(\mathbf {y}^{\mathrm {ex}}_{\gamma}\) (resp., \(\mathbf {y}^{\mathrm{in}}_{\gamma}\)) are the degrees of freedom in \(\mathcal{V}^{\mathrm{ex}}_{\circ}\) (resp., \(\mathcal{V}^{\mathrm{in}}_{\circ}\)) and \(\mathcal{V}^{\mathrm {ex}}_{\gamma}\) (resp., \(\mathcal{V}^{\mathrm{in}}_{\gamma}\)). Further, let \(\mathbf {u}\) represent the vector of coil currents. The weak formulation (12) yields the following non-linear algebraic system:

$$ \begin{aligned} &{\mathbf {d}^{\mathrm{ex}} \bigl(\mathbf {y}^{\mathrm{ex}}_{\circ},\mathbf {y}^{\mathrm{ex}}_{\gamma},\mathbf {u} \bigr)} = 0, \\ &{\mathbf {d}^{\mathrm{in}} \bigl(\mathbf {y}^{\mathrm{in}}_{\circ},\mathbf {y}^{\mathrm{in}}_{\gamma},\lambda \bigr)} = 0, \\ &{\mathbf {d}^{\mathrm{pc}} \bigl(\mathbf {y}^{\mathrm{in}}_{\circ},\mathbf {y}^{\mathrm{in}}_{\gamma},\lambda \bigr)} = 0, \\ &\mathbf {y}^{\mathrm{ex}}_{\gamma}+ \mathbf {P}^{\mathrm{ex}}_{\circ } \mathbf {y}^{\mathrm{in}}_{\circ}+ \mathbf {P}^{\mathrm{ex}}_{\gamma } \mathbf {y}^{\mathrm{in}}_{\gamma} = 0, \\ &\mathbf {y}^{\mathrm{in}}_{\gamma}+ \mathbf {P}^{\mathrm{in}}_{ \circ } \mathbf {y}^{\mathrm{ex}}_{\circ}+ \mathbf {P}^{\mathrm{in}}_{ \gamma } \mathbf {y}^{\mathrm{ex}}_{\gamma}= 0 ,\end{aligned} $$
(14)

where \(\mathbf {d}^{\mathrm{ex}}(\mathbf {y}^{\mathrm{ex}}_{\circ},\mathbf {y}^{\mathrm{ex}}_{\gamma},\mathbf {u})=0\), \(\mathbf {d}^{\mathrm {in}}(\mathbf {y}^{\mathrm{in}}_{\circ},\mathbf {y}^{\mathrm{in}}_{\gamma },\lambda )=0\) and \(\mathbf {d}^{\mathrm{pc}}(\mathbf {y}^{\mathrm{in}}_{\circ},\mathbf {y}^{\mathrm{in}}_{\gamma},\lambda)=0\) are the discretizations of the first, second and third lines of (12). \(\mathbf {P}^{\mathrm{ex}}_{\circ}\) and \(\mathbf {P}^{\mathrm {ex}}_{\gamma}\) (resp., \(\mathbf {P}^{\mathrm{in}}_{ \circ}\) and \(\mathbf {P}^{\mathrm{in}}_{ \gamma}\)) are the discretizations of the projection in the fourth (resp., fifth) line of (12). Combining the unknowns \(\mathbf {y}^{\mathrm {ex}}_{\circ}, \mathbf {y}^{\mathrm{ex}}_{\gamma}, \mathbf {y}^{\mathrm {in}}_{\circ}, \mathbf {y}^{\mathrm{in}}_{\gamma}\) and λ in one vector \(\mathbf {y}:=(\mathbf {y}^{\mathrm{ex}}_{\circ}, \mathbf {y}^{\mathrm {ex}}_{\gamma}, \mathbf {y}^{\mathrm{in}}_{\circ}, \mathbf {y}^{\mathrm{in}}_{\gamma}, \lambda)\) we can recast (14) in the form

$$ 0={\mathbf {b}}(\mathbf {y},\mathbf {u}):= \begin{pmatrix} {\mathbf {d}^{\mathrm{ex}}(\mathbf {y}^{\mathrm{ex}}_{\circ},\mathbf {y}^{\mathrm{ex}}_{\gamma},\mathbf {u})} \\ \mathbf {y}^{\mathrm{ex}}_{\gamma}+ \mathbf {P}^{\mathrm{ex}}_{\circ }\mathbf {y}^{\mathrm{in}}_{\circ}+ \mathbf {P}^{\mathrm{ex}}_{\gamma }\mathbf {y}^{\mathrm{in}}_{\gamma}\\ {\mathbf {d}^{\mathrm{in}}(\mathbf {y}^{\mathrm{in}}_{\circ},\mathbf {y}^{\mathrm{in}}_{\gamma},\lambda)} \\ \mathbf {y}^{\mathrm{in}}_{\gamma}+ \mathbf {P}^{\mathrm{in}}_{ \circ }\mathbf {y}^{\mathrm{ex}}_{\circ}+ \mathbf {P}^{\mathrm{in}}_{ \gamma }\mathbf {y}^{\mathrm{ex}}_{\gamma}\\ {\mathbf {d}^{\mathrm{pc}}(\mathbf {y}^{\mathrm{in}}_{\circ},\mathbf {y}^{\mathrm{in}}_{\gamma},\lambda)} \end{pmatrix} , $$
(15)

which will be convenient in the following section. The Newton iterations for (15) (see Algorithm 1) require the implementation of the derivative \({\mathbf {b}}_{\mathbf {y}}(\mathbf {y},\mathbf {u})\), which is already available from previous work [21, Sect. 4.5].

3.3 The inverse problem for tuning plasma equilibria

Combining the discretized free-boundary plasma equilibrium problem (12) with a discretized objective functional \(C(\mathbf {y})\) and regularization \(R(\mathbf {u})\) we arrive at a finite dimensional optimal control formulation that is of the general form

$$ \min_{\mathbf {u}, \mathbf {y}} {C}(\mathbf {y})+R(\mathbf {u}) \quad\text{s.t. } {\mathbf {b}}(\mathbf {y},\mathbf {u}) = 0. $$
(16)

The state variable \(\mathbf {y}\in\mathbb {R}^{N}\) contains the N unknowns of the poloidal flux approximation in \(\mathcal{V}^{\mathrm {ex}}\) and \(\mathcal{V}^{\mathrm{in}}\). The components of the control variable \(\mathbf {u}\in\mathbb {R}^{M}\) are the currents in the M different coils.

By the first order optimality conditions we know that for solutions \((\mathbf {u}^{\ast},\mathbf {y}^{\ast})\) of (16) there exist so called adjoint states \(\mathbf {p}^{\ast}\in\mathbb {R}^{N}\) such that the following \(2N+M\) non-linear equations hold

$$ \begin{aligned}& {C}^{T}_{\mathbf{y}} \bigl(\mathbf {y}^{\ast}\bigr) + {\mathbf {b}}_{\mathbf{y}}^{T} \bigl( \mathbf {y}^{\ast},\mathbf {u}^{\ast}\bigr) \mathbf {p}^{\ast}= 0, \\ &R^{T}_{\mathbf{u}} \bigl(\mathbf {u}^{\ast}\bigr) + { \mathbf {b}}_{\mathbf{u}}^{T} \bigl(\mathbf {y}^{\ast},\mathbf {u}^{\ast}\bigr) \mathbf {p}^{\ast}= 0, \\ &{\mathbf {b}} \bigl(\mathbf {y}^{\ast},\mathbf {u}^{\ast}\bigr) =0. \end{aligned} $$
(17)

Here the subscripts y and u denote differentiation with respect to y and u, respectively. Splitting \(\mathbf {p}= ( \mathbf {p}^{\mathrm{ex}}_{\circ}, \mathbf {p}^{\mathrm{ex}}_{\gamma},\mathbf {p}^{\mathrm{in}}_{\circ}, \mathbf {p}^{\mathrm{in}}_{\gamma}, \mathbf {p}_{\lambda})^{T}\) analogously to \(\mathbf {y}=(\mathbf {y}^{\mathrm {ex}}_{\circ}, \mathbf {y}^{\mathrm{ex}}_{\gamma}, \mathbf {y}^{\mathrm {in}}_{\circ}, \mathbf {y}^{\mathrm{in}}_{\gamma}, \lambda)^{T}\) we can provide the more detailed depictions

$$ \mathbf {b}^{T}_{\mathbf{u}}(\mathbf {y}, \mathbf {u}) \mathbf {p} = \begin{pmatrix} \mathbf {d}^{\mathrm{ex}}_{\mathbf {u}}(\mathbf {y}^{\mathrm{ex}}_{\circ},\mathbf {y}^{\mathrm{ex}}_{\gamma},\mathbf {u})\\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix} ^{T} \begin{pmatrix} \mathbf {p}^{\mathrm{ex}}_{\circ}\\ \mathbf {p}^{\mathrm{ex}}_{\gamma}\\ \mathbf {p}^{\mathrm{in}}_{\circ}\\ \mathbf {p}^{\mathrm{in}}_{\gamma}\\ \mathbf {p}_{\lambda}\end{pmatrix} $$

and \(\mathbf {b}^{T}_{\mathbf {y}}(\mathbf {y},\mathbf {u}) \mathbf {p}\) with

$$\begin{aligned} & \mathbf {b}^{T}_{\mathbf{y}}(\mathbf {y}, \mathbf {u}) \\ &\quad = \begin{pmatrix} \mathbf {d}^{\mathrm{ex}}_{\mathbf {y}^{\mathrm{ex}}_{\circ}}(\mathbf {y}^{\mathrm{ex}}_{\circ},\mathbf {y}^{\mathrm{ex}}_{\gamma},\mathbf {u}) & \mathbf {d}^{\mathrm{ex}}_{\mathbf {y}^{\mathrm{ex}}_{\gamma}}(\mathbf {y}^{\mathrm{ex}}_{\circ},\mathbf {y}^{\mathrm{ex}}_{\gamma},\mathbf {u}) & 0 & 0 & 0\\ 0 & \mathbf {I}^{\mathrm{ex}} & \mathbf {P}^{\mathrm{ex}}_{\circ}& \mathbf {P}^{\mathrm{ex}}_{\gamma}& 0 \\ 0 & 0 & \mathbf {d}^{\mathrm{in}}_{\mathbf {y}^{\mathrm{in}}_{\circ}}(\mathbf {y}^{\mathrm{in}}_{\circ},\mathbf {y}^{\mathrm{in}}_{\gamma },\lambda) & \mathbf {d}^{\mathrm{in}}_{\mathbf {y}^{\mathrm{in}}_{\gamma}}(\mathbf {y}^{\mathrm{in}}_{\circ},\mathbf {y}^{\mathrm{in}}_{\gamma},\lambda)& \mathbf {d}^{\mathrm {in}}_{\lambda}(\mathbf {y}^{\mathrm{in}}_{\circ},\mathbf {y}^{\mathrm {in}}_{\gamma},\lambda)\\ \mathbf {P}^{\mathrm{in}}_{ \circ}& \mathbf {P}^{\mathrm{in}}_{ \gamma }& 0 & \mathbf {I}^{\mathrm{in}} & 0 \\ 0 & 0 & \mathbf {d}^{\mathrm{pc}}_{\mathbf {y}^{\mathrm{in}}_{\circ}}(\mathbf {y}^{\mathrm{in}}_{\circ},\mathbf {y}^{\mathrm{in}}_{\gamma },\lambda) & \mathbf {d}^{\mathrm{pc}}_{\mathbf {y}^{\mathrm{in}}_{\gamma}}(\mathbf {y}^{\mathrm{in}}_{\circ},\mathbf {y}^{\mathrm{in}}_{\gamma},\lambda)& \mathbf {d}^{\mathrm {pc}}_{\lambda}(\mathbf {y}^{\mathrm{in}}_{\circ},\mathbf {y}^{\mathrm {in}}_{\gamma},\lambda) \end{pmatrix} ^{T}, \end{aligned}$$

where \(\mathbf {I}^{\mathrm{ex}}\) and \(\mathbf {I}^{\mathrm{in}}\) are unit matrices.
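As a reminder of where the conditions (17) come from, they can be read as the stationarity conditions of a Lagrangian. The symbol \(\mathcal{L}\) is introduced here only for this remark and is an assumption of notation, not part of the formulation above:

$$ \mathcal{L}(\mathbf {y},\mathbf {u},\mathbf {p}) := C(\mathbf {y}) + R(\mathbf {u}) + \mathbf {p}^{T} \mathbf {b}(\mathbf {y},\mathbf {u}), \qquad \mathcal{L}_{\mathbf {y}}^{T} = C_{\mathbf {y}}^{T} + \mathbf {b}_{\mathbf {y}}^{T}\mathbf {p}, \quad \mathcal{L}_{\mathbf {u}}^{T} = R_{\mathbf {u}}^{T} + \mathbf {b}_{\mathbf {u}}^{T}\mathbf {p}, \quad \mathcal{L}_{\mathbf {p}}^{T} = \mathbf {b}(\mathbf {y},\mathbf {u}). $$

Setting these three derivatives to zero at \((\mathbf {y}^{\ast},\mathbf {u}^{\ast},\mathbf {p}^{\ast})\) reproduces the three lines of (17).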

A Newton-type method for solving (17) [5, Chap. 14] consists of iterations of the type

$$ \begin{pmatrix} {C}_{\mathbf {y}\mathbf {y}}(\mathbf {y}^{k}) & 0 & {\mathbf {b}}_{\mathbf{y}}^{T}(\mathbf {y}^{k},\mathbf {u}^{k})\\ 0 & R_{\mathbf {u}\mathbf {u}}(\mathbf {u}^{k})& {\mathbf {b}}_{\mathbf{u}}^{T}(\mathbf {y}^{k},\mathbf {u}^{k}) \\ {\mathbf {b}}_{\mathbf{y}}(\mathbf {y}^{k},\mathbf {u}^{k}) & {\mathbf {b}}_{\mathbf{u}}(\mathbf {y}^{k},\mathbf {u}^{k}) & 0 \end{pmatrix} \begin{pmatrix} \mathbf {y}^{k+1}- \mathbf {y}^{k} \\ \mathbf {u}^{k+1} -\mathbf {u}^{k}\\ \mathbf {p}^{k+1} \end{pmatrix} = - \begin{pmatrix} {C}_{\mathbf{y}}^{T}(\mathbf {y}^{k}) \\ R_{\mathbf{u}}^{T}(\mathbf {u}^{k}) \\ {\mathbf {b}}(\mathbf {y}^{k},\mathbf {u}^{k}) \end{pmatrix}. $$
(18)

The iteration scheme (18) differs from Newton's method for (17), since it neglects second order derivatives of \({\mathbf {b}}(\mathbf {y},\mathbf {u})\). Such modifications are known to be prone to convergence issues, but this does not seem to be a problem here. In the terminology of Newton methods, we thus use an inexact rather than an exact Newton method.

Since the number of coils is much smaller than the dimension of the approximation spaces \(\mathcal{V}^{\mathrm{ex}}\) and \(\mathcal {V}^{\mathrm{in}}\), the size of the system (18) is roughly twice the size of the non-linear discrete free-boundary equilibrium problem (12). Even though it would be possible to solve the linear system in (18) with a direct solver, we have implemented an algorithm (see Algorithm 1) based on the Schur complement, as this amounts to a minor modification of Newton's method for the constraint (15). When the iteration stops, the auxiliary matrix-valued variable \(\mathbf {Y}\) is equal to the sensitivity \(\mathbf {y}_{\mathbf {u}}(\mathbf {u}) = -\mathbf {b}_{\mathbf {y}}^{-1}(\mathbf {y}(\mathbf {u}),\mathbf {u}) \mathbf {b}_{\mathbf {u}}(\mathbf {y}(\mathbf {u}),\mathbf {u})\). In general, it is recommended to avoid the explicit calculation of these sensitivities, and adjoint methods were introduced for exactly that purpose. However, as we have only very few control parameters, this is not an issue. To motivate Algorithm 1 we introduce

$$ \mathbf {Y}_{k}:= -\mathbf {b}_{\mathbf{y}}^{-1} \bigl( \mathbf {y}^{k},\mathbf {u}^{k} \bigr) \mathbf {b}_{\mathbf{u}} \bigl(\mathbf {y}^{k},\mathbf {u}^{k} \bigr) \quad\text{and}\quad \Delta \mathbf {y}_{k}: = -\mathbf {b}_{\mathbf{y}}^{-1} \bigl( \mathbf {y}^{k},\mathbf {u}^{k} \bigr) \mathbf {b} \bigl(\mathbf {y}^{k},\mathbf {u}^{k} \bigr) $$

and obtain from (18) first the identity

$$ \mathbf {y}^{k+1} -\mathbf {y}^{k} = \mathbf {Y}_{k} \bigl(\mathbf {u}^{k+1}-\mathbf {u}^{k} \bigr) + \Delta\mathbf {y}_{k} $$

and then the following linear system for the increment \(\mathbf {u}^{k+1}-\mathbf {u}^{k}\)

$$ \mathbf {M} \bigl( \mathbf {y}^{k}, \mathbf {u}^{k} \bigr) \bigl( \mathbf {u}^{k+1}-\mathbf {u}^{k} \bigr) = -\mathbf {m} \bigl( \mathbf {y}^{k}, \mathbf {u}^{k} \bigr), $$

with

$$ \mathbf {M} \bigl( \mathbf {y}^{k}, \mathbf {u}^{k} \bigr):= R_{\mathbf {u}\mathbf {u}} \bigl(\mathbf {u}^{k} \bigr) + \mathbf {Y}_{k}^{T} C_{\mathbf {y}\mathbf {y}} \bigl(\mathbf {y}^{k} \bigr) \mathbf {Y}_{k} $$

and

$$ \mathbf {m} \bigl( \mathbf {y}^{k}, \mathbf {u}^{k} \bigr):= R_{\mathbf {u}}^{T} \bigl(\mathbf {u}^{k} \bigr) + \mathbf {Y}_{k}^{T} C_{\mathbf {y}}^{T} \bigl(\mathbf {y}^{k} \bigr) + \mathbf {Y}_{k}^{T} C_{\mathbf {y}\mathbf {y}} \bigl(\mathbf {y}^{k} \bigr) \Delta\mathbf {y}_{k}. $$
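The elimination behind this reduced system can be spelled out as follows. The shorthand \(\Delta\mathbf {u}_{k} := \mathbf {u}^{k+1}-\mathbf {u}^{k}\) and the omission of the arguments \((\mathbf {y}^{k},\mathbf {u}^{k})\) are used here only for brevity. The third, first and second block rows of (18) give, in this order,

$$ \begin{aligned} &\mathbf {b}_{\mathbf {y}} \bigl(\mathbf {y}^{k+1}-\mathbf {y}^{k} \bigr) + \mathbf {b}_{\mathbf {u}} \Delta\mathbf {u}_{k} = -\mathbf {b} \quad\Longrightarrow\quad \mathbf {y}^{k+1}-\mathbf {y}^{k} = \mathbf {Y}_{k} \Delta\mathbf {u}_{k} + \Delta\mathbf {y}_{k}, \\ &\mathbf {p}^{k+1} = -\mathbf {b}_{\mathbf {y}}^{-T} \bigl( C_{\mathbf {y}}^{T} + C_{\mathbf {y}\mathbf {y}} ( \mathbf {Y}_{k} \Delta\mathbf {u}_{k} + \Delta\mathbf {y}_{k} ) \bigr), \\ &R_{\mathbf {u}\mathbf {u}} \Delta\mathbf {u}_{k} + \mathbf {b}_{\mathbf {u}}^{T} \mathbf {p}^{k+1} = -R_{\mathbf {u}}^{T}. \end{aligned} $$

Inserting the second relation into the third and using \(\mathbf {b}_{\mathbf {u}}^{T}\mathbf {b}_{\mathbf {y}}^{-T} = -\mathbf {Y}_{k}^{T}\) yields

$$ \bigl( R_{\mathbf {u}\mathbf {u}} + \mathbf {Y}_{k}^{T} C_{\mathbf {y}\mathbf {y}} \mathbf {Y}_{k} \bigr) \Delta\mathbf {u}_{k} = - \bigl( R_{\mathbf {u}}^{T} + \mathbf {Y}_{k}^{T} C_{\mathbf {y}}^{T} + \mathbf {Y}_{k}^{T} C_{\mathbf {y}\mathbf {y}} \Delta\mathbf {y}_{k} \bigr), $$

which is the stated system with \(\mathbf {M}\) and \(\mathbf {m}\).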

In the case that \(\mathbf {b}(\mathbf {y}^{k},\mathbf {u}^{k})\) (and hence \(\Delta \mathbf {y}_{k}\)) vanishes, a vanishing \(\mathbf {m}( \mathbf {y}^{k}, \mathbf {u}^{k})\) also implies that the first order optimality conditions (17) hold true. Hence, if both the increment \(\Delta\mathbf {u}_{k} = \mathbf {u}^{k+1}-\mathbf {u}^{k}\) and \(\Delta\mathbf {y}_{k}\) vanish, then the first order optimality conditions (17) hold true.
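To make the structure of this Schur complement iteration concrete, the following minimal sketch applies it to a scalar toy problem, so that all matrices reduce to numbers. The toy constraint \(b(y,u) = y + y^{3} - u\), the objective \(\frac{1}{2}(y-y_{d})^{2}\), the regularization \(\frac{1}{2}\alpha u^{2}\) and the function name `reduced_newton` are assumptions chosen purely for illustration; they are unrelated to the plasma equilibrium problem and to Algorithm 1 as actually implemented.

```python
def reduced_newton(y, u, y_d=1.0, alpha=0.1, tol=1e-12, max_iter=200):
    """Schur complement iteration for the toy problem
    min 0.5*(y - y_d)^2 + 0.5*alpha*u^2  subject to  y + y^3 - u = 0."""
    for k in range(max_iter):
        b = y + y ** 3 - u            # constraint residual b(y, u)
        b_y = 1.0 + 3.0 * y ** 2      # derivative of b with respect to the state
        b_u = -1.0                    # derivative of b with respect to the control
        C_y, C_yy = y - y_d, 1.0      # first and second derivative of the objective
        R_u, R_uu = alpha * u, alpha  # first and second derivative of the regularization

        Y = -b_u / b_y                # sensitivity Y_k = -b_y^{-1} b_u
        dy_k = -b / b_y               # Newton increment for the constraint alone

        M = R_uu + Y * C_yy * Y       # reduced system matrix
        m = R_u + Y * C_y + Y * C_yy * dy_k
        du = -m / M                   # control increment
        dy = Y * du + dy_k            # state increment

        y, u = y + dy, u + du
        if abs(du) <= tol * max(1.0, abs(u)) and abs(dy) <= tol * max(1.0, abs(y)):
            return y, u, k + 1
    raise RuntimeError("iteration did not converge")

y_opt, u_opt, iterations = reduced_newton(0.0, 0.0)
print(y_opt, u_opt, iterations)
```

Because second derivatives of the constraint are neglected, the iteration in this toy setting should be expected to converge linearly rather than quadratically. At the computed point both the constraint residual and the reduced gradient \(R_{u} + Y C_{y}\) vanish up to round-off, which mirrors the statement about the first order optimality conditions.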

The iteration scheme (18) for the constrained optimization problem (16) involves first order derivatives of \({\mathbf {b}}(\mathbf {y},\mathbf {u})\) and first and second order derivatives of \(R(\mathbf {u})\) and \({C}(\mathbf {y})\). The derivative \({\mathbf {b}}_{\mathbf {y}}(\mathbf {y},\mathbf {u})\) is already available from the Newton iterations for (15), and as we have explicit expressions for \({\mathbf {b}}(\mathbf {y},\mathbf {u})\), \(R(\mathbf {u})\) and \({C}(\mathbf {y})\) that are algebraic in \(\mathbf {u}\) and \(\mathbf {y}\), we can also provide the remaining derivatives.

4 Results and discussion

We highlight that, to our knowledge, no theory is yet available that rigorously justifies convergence of the MEM-M. Only for lowest order Lagrangian elements do we have a convergence assertion in \(L^{\infty}\) [24, Theorem 1]. Therefore, we first present an experimental validation of the MEM-M and continue afterwards, in Sect. 4.2, with the application.

All the numerical results are based on the MATLAB/Octave library FEEQS.M developed by one of the authors. This library relies largely on vectorization, so that the running time is comparable to C/C++ implementations.

4.1 Experimental validation

For validation of the MEM, we consider the square domain \(\Omega= [-1,1]^{2}\) and define \(\Omega^{\mathrm{in}}\) as the polygon with vertices \((-0.125, 0.5)\), \((0.375, 0.25)\), \((0.375, -0.375)\), \((0, -0.5)\), \(( -0.375, -0.375)\), and \((-0.5, 0.25)\). The meshes \(\mathcal{T}^{\mathrm{in}}\) and \(\mathcal {T}^{\mathrm{ex}}\) for the interior and exterior domain are a Cartesian mesh and a triangular mesh, respectively. For simplicity we prefer to take \(\Omega^{\mathrm{ex}}_{h}= \Omega ^{\mathrm{ex}}= \Omega \setminus\Omega^{\mathrm{in}}\). For the numerical test, we take \({D}=1\) and choose the data \(f(x,y)\) and \(\psi_{0}\) in agreement with \(\psi(x,y) = \cos(\pi x) \sin(\pi y) \) as solution of (7). If \(h_{\mathrm{ex}}\) (\(h_{\mathrm{in}}\)) is the maximal diameter of elements in \(\mathcal{T}^{\mathrm{ex}}\) (\(\mathcal{T}^{\mathrm {in}}\)), and \(p_{\mathrm{ex}}\) (\(p_{\mathrm{in}}\)) the local polynomial degree of the FE spaces \(\mathcal{V}^{\mathrm{ex}}\) (\(\mathcal {V}^{\mathrm{in}}\)), one has optimal convergence if, for a smooth solution, the approximation error in the \(H^{1}(\Omega^{\mathrm{ex}}_{h})\) and \(H^{1}(\Omega^{\mathrm {in}}_{h})\)-norms behaves as \(O(h^{p})\), with \(h = \max(h_{\mathrm{ex}},h_{\mathrm{in}})\) and \(p = \min (p_{\mathrm{ex}},p_{\mathrm{in}})\) (in the \(L^{2}(\Omega^{\mathrm{ex}}_{h})\) and \(L^{2}(\Omega^{\mathrm {in}}_{h})\)-norms one can hope to obtain \(O(h^{p+1})\)). To keep the presentation as clear as possible, we always show in the following figures the maximum of the error in \(\Omega^{\mathrm {ex}}_{h}\) and that in \(\Omega^{\mathrm{in}}_{h}\).
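Convergence rates of this kind are typically read off from error values on a sequence of refined meshes. The following minimal sketch shows one common way to do this, the experimental order of convergence between consecutive refinement levels. The helper name `observed_orders` is hypothetical, and the error values are synthetic numbers generated from a power law, not results of the experiments reported here.

```python
import math

def observed_orders(h, err):
    """Experimental orders of convergence between consecutive refinement levels."""
    return [math.log(err[i] / err[i + 1]) / math.log(h[i] / h[i + 1])
            for i in range(len(h) - 1)]

# Synthetic data for illustration only: errors decaying exactly like 3 * h^2.
h = [0.5 ** j for j in range(1, 6)]
err = [3.0 * hj ** 2 for hj in h]
print(observed_orders(h, err))
```

For measured errors, the computed orders would be compared with the optimal exponent for the norm under consideration.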

We consider two different pairings of FE spaces \(\mathcal{V}^{\mathrm {ex}}\)-\(\mathcal{V}^{\mathrm{in}}\). The first, denoted by P1-Q1, uses lowest order linear FEs over \(\mathcal{T}^{\mathrm{ex}}\) and lowest order bilinear FEs over \(\mathcal{T}^{\mathrm{in}}\). The second, denoted by P2-Q3, uses quadratic FEs over \(\mathcal{T}^{\mathrm{ex}}\) and bicubic FEs over \(\mathcal{T}^{\mathrm{in}}\). The elements of P2-Q3 are not only continuous on \(\Omega^{\mathrm {in}}_{h}\) and \(\Omega^{\mathrm{ex}}_{h}\) but also have continuous gradients on \(\Omega^{\mathrm{in}}_{h}\).

We focus on the overlapping MEM-M (10) which uses the modified mortar mappings and is equivalent to (11). We also analyze the influence on the error curves of using either \(L^{2}\) projections or interpolation to realize the gluing across \(\gamma^{\mathrm{ex}}\) and \(\gamma ^{\mathrm{in}}\) for the MEM-M. We start with the case where \(\Omega^{\mathrm{in}}_{h}\) has minimal overlap with \(\Omega^{\mathrm{ex}}_{h}\) (see Fig. 8, left). Thus \(\gamma^{\mathrm{in}}\) is adapted with the refinements in \(\Omega^{\mathrm{in}}_{h}\) as shown in Fig. 9. Convergence results with MEM-M are presented in Fig. 10. The convergence rate with MEM-M is optimal for the error in the \(H^{1}\)-norm. The results look slightly better if we apply the interpolation instead of the \(L^{2}\) projection in the definition of the mortar mapping.

Figure 9

Definition of \(\gamma^{\mathrm{in}}\). Adaptive definition of \(\gamma^{\mathrm{in}}\) in the case of minimal overlap between \(\Omega^{\mathrm{in}}_{h}\) and \(\Omega ^{\mathrm{ex}}_{h}\). Note that \(\gamma^{\mathrm{ex}}\) (magenta) remains fixed, while \(\gamma^{\mathrm{in}}\) (red) changes due to the refinements in \(\Omega^{\mathrm{in}}_{h}\). The interior edges of elements of \(\mathcal{T}^{\mathrm{ex}}\) and \(\mathcal{T}^{\mathrm{in}}\) are omitted for clarity

Figure 10

Convergence of MEM-M I. Convergence in \(L^{2}\) and \(H^{1}\) of the scheme MEM-M using \(L^{2}\)-projection (left) or nodal interpolation (right)

Next, we study the convergence rates for MEM-M when \(\Omega^{\mathrm {in}}_{h}\) has a large overlap with \(\Omega^{\mathrm{ex}}_{h}\). For this we fix \(\Omega ^{\mathrm{in}}_{h}\) to be the square \([-0.76,0.65]\times[-0.76,0.78]\) (see Fig. 8, right). Note that both \(\gamma^{\mathrm{ex}}\) and \(\gamma^{\mathrm{in}}\) remain fixed during the refinements in \(\Omega^{\mathrm{in}}_{h}\). Once again, the MEM-M yields optimal convergence rate in the \(H^{1}\) norm (see Fig. 10). Moreover in the case of larger overlap we observe even optimal convergence in the \(L^{2}\)-norm. There is no qualitative difference between MEM-M based on the \(L^{2}\)-projection or on the interpolation. More detailed numerical tests of the MEM-M can be found in [34].

With the classical overlapping \(\text{MEM}_{s,t}\) (9) with the parameters s and t set to zero (\(\text{MEM}_{0,0}\)), the convergence rates in the \(H^{1}\) and \(L^{2}\) norms are not optimal in the case of minimal overlap between \(\Omega^{\mathrm {in}}_{h}\) and \(\Omega^{\mathrm{ex}}_{h}\) (see Fig. 11). The \(\text{MEM}_{0,0}\) does not yield convergence in the case of a large overlap between \(\Omega^{\mathrm{in}}_{h}\) and \(\Omega^{\mathrm{ex}}_{h}\).

Figure 11

Convergence of MEM II. Convergence in \(L^{2}\) and \(H^{1}\) of the scheme \(\text{MEM}_{0,0}\) using \(L^{2}\)-projection (left) or nodal interpolation (right)

Our experimental results for \(\text{MEM}_{0,0}\) are not very surprising. All available convergence assertions assume \(s+t=1\), which leads to the cumbersome integration over cut elements, which we prefer to avoid.

4.2 A case study

We present a first case study for the tokamak CFETR. CFETR, the China Fusion Engineering Test Reactor, is a planned device in the road map for the realization of fusion energy in China that will follow ITER. The geometry of the machine is sketched in Fig. 12. All the following calculations are based on the MEM-M discretization (12) of the free-boundary equilibrium problem (2). We use lowest order Lagrangian finite elements for \(\mathcal{V}^{\mathrm{ex}}\) and Bogner–Fox–Schmit finite elements for \(\mathcal{V}^{\mathrm{in}} \). In order to create snowflake-like configurations similar to the ones for HL-2M in Fig. 7, we introduce two objective functionals for \(\psi^{\mathrm{in}}\in\mathcal{V}^{\mathrm{in}}\):

$$ \begin{aligned}& {C}_{1} \bigl(\psi^{\mathrm{in}} \bigr) = \sum_{i=2}^{N_{\mathrm{desi}}} \bigl( \psi^{\mathrm{in}}(\mathbf {x}_{i})-\psi^{\mathrm{in}}(\mathbf {x}_{1}) \bigr)^{2}, \\ &{C}_{2} \bigl(\psi^{\mathrm{in}},\mathbf {x}_{0} \bigr) = \bigl\Vert \nabla\psi^{\mathrm{in}}(\mathbf {x} _{0}) \bigr\Vert ^{2}. \end{aligned} $$
(19)

The objective functional \({C}_{1}\) forces \(\psi^{\mathrm{in}}\) to be constant on the prescribed points \(\mathbf {x}_{1}, \ldots,\mathbf {x}_{N_{\mathrm{desi}}}\). The objective functional \({C}_{2}\) forces \(\psi^{\mathrm{in}}\) to have a stationary point at \(\mathbf {x}_{0}\). Using \({C}_{1}\) alone in the formulation of the optimal control problem (16) is the standard approach to find a configuration of coil currents that gives an equilibrium boundary close to the prescribed points \(\mathbf {x}_{i}\).
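The two functionals in (19) only require point evaluations of the flux and of its gradient. The following minimal sketch illustrates this with an analytic toy flux in place of a finite element function; the names `C1`, `C2`, `psi_toy` and `grad_psi_toy` as well as the toy flux itself are assumptions made for illustration and do not correspond to an actual equilibrium.

```python
def C1(psi, points):
    """Sum of squared deviations of psi at x_2, ..., x_N from its value at x_1."""
    reference = psi(points[0])
    return sum((psi(x) - reference) ** 2 for x in points[1:])

def C2(grad_psi, x0):
    """Squared Euclidean norm of the gradient of psi at the point x0."""
    g_r, g_z = grad_psi(x0)
    return g_r ** 2 + g_z ** 2

# Toy flux (illustration only): circular contour lines around the point (1, 0).
def psi_toy(x):
    return (x[0] - 1.0) ** 2 + x[1] ** 2

def grad_psi_toy(x):
    return (2.0 * (x[0] - 1.0), 2.0 * x[1])

# Four points on one contour line of the toy flux, and its stationary point.
points = [(1.5, 0.0), (1.0, 0.5), (0.5, 0.0), (1.0, -0.5)]
print(C1(psi_toy, points), C2(grad_psi_toy, (1.0, 0.0)))
```

In this toy setting \({C}_{1}\) vanishes because all points lie on the same contour line, and \({C}_{2}\) vanishes at the stationary point. For a finite element function, evaluating `grad_psi` at an arbitrary point is only well defined if the gradient is continuous there.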

Figure 12

Tokamak CFETR. The geometry (left), a zoom of the composite meshes (center), and the location of the points \(\mathbf {x}_{i}\) in the definition of the objective functional \({C}_{1}\) (right)

In the following we set \(\mathbf {x}_{0} = (5.42, -4.62)\) and then solve optimal control problems (16) with the objective functional

$${C}(\mathbf {y};w,\mathbf {x}_{0}):= {C}_{1} \bigl( \psi^{\mathrm{in}} \bigr) + w {C}_{2} \bigl(\psi ^{\mathrm{in}}, \mathbf {x}_{0} \bigr) $$

and regularization functional

$$R(\mathbf {u}) = 10^{-14} \sum_{i=1}^{M} I_{i}^{2} $$

for changing values of \(w\). Here \(\mathbf {y}\) denotes the vector of degrees of freedom of \((\psi^{\mathrm{ex}},\psi^{\mathrm{in}}) \in \mathcal{V}^{\mathrm{ex}} \times\mathcal{V}^{\mathrm{in}}\). The current density profiles (6) use the parameter values \(\alpha= 1\), \(\beta =1.2\), \(\gamma= 1.1\) and \(r_{0}=6.65\). The total plasma current has the value \(I_{p} = 11 \times 10^{6}\,\mathrm{A}\). The iteration stops if the relative increments in Algorithm 1 are smaller than \(tol= 10^{-11}\). In Fig. 13 we see that for a sufficiently large value of \(w\) our approach is capable of creating snowflake-like plasma equilibrium configurations.

Figure 13

Results (A). The poloidal flux contour lines (top) and the 16 coil currents (bottom) in Ampère turns for the optimal control problem (16) with objective functional \({C}(\mathbf {y};w,\mathbf {x}_{0})\) for \(w=0,1,10,100,1000\) (from left to right); \(\mathbf {x}_{0} = (5.42, -4.62)\) is indicated by the blue circle

In a second experiment we fix the weight to \(w=1000\) and solve the optimal control problem for varying \(\mathbf {x}_{0}\). The results (see Figs. 14 and 15) show that it is easily possible to find configurations for a great variety of locations of the lower stationary point. Even though most of these configurations are fairly close to snowflake configurations, we were not able to find exact snowflake configurations with this approach. Nevertheless, such configurations are useful, as it is common practice to start from snowflake-like configurations and to manually tune the coil currents to bring the equilibrium closer to an exact snowflake configuration.

Figure 14

Results (B). The poloidal flux contour lines (top) and the 16 coil currents (bottom) in Ampère turns for the optimal control problem (16) with objective functional \({C}(\mathbf {y};1000,\mathbf {x}_{0})\) for \(\mathbf {x}_{0} = (5.02, -4.62)\), \((5.22, -4.62)\), \((5.42, -4.62)\), \((5.62, -4.62)\), \((5.82, -4.62)\) (from left to right); \(\mathbf {x}_{0}\) is indicated by the blue circle

Figure 15

Results (C). The poloidal flux contour lines (top) and the 16 coil currents (bottom) in Ampère turns for the solutions of the optimal control problem (16) with objective functional \({C}(\mathbf {y};1000,\mathbf {x}_{0})\) for \(\mathbf {x}_{0}= (5.02, -4.22)\), \((5.22, -4.42)\), \((5.42, -4.62)\), \((5.62, -4.82)\), \((5.82, -5.02)\) (from left to right); \(\mathbf {x}_{0}\) is indicated by the blue circle

In the spirit of this idea we formulate a second optimal control problem (16) with the objective and regularization functionals

$${C}(\mathbf {y};\mathbf {x}_{0}):= {C}_{2} \bigl( \psi^{\mathrm{in}},\mathbf {x}_{0} \bigr) \quad\text{and}\quad R(\mathbf {u}) = 10^{-15} \sum_{i=1}^{M} (I_{i}-I_{\mathrm{ref},i})^{2}, $$

where the reference currents \(I_{\mathrm{ref},i}\) are the \(M=16\) currents from snowflake-like configurations. This formulation penalizes coil currents that deviate strongly from the previously computed currents, while at the same time forcing \(\psi^{\mathrm{in}}\) to have a stationary point at \(\mathbf {x}_{0}\). With this, it is possible to find coil currents that merge two distinct nearby stationary points, leading to an exact snowflake configuration (see Fig. 16).

Figure 16

Results (D). The poloidal flux contour lines (left) and the 16 coil currents (right, in Ampère turns) for the solution of the optimal control problem (16) with objective and regularization functionals \({C}(\mathbf {y};\mathbf {x}_{0}):= {C}_{2}(\psi ^{\mathrm{in}},\mathbf {x}_{0})\) and \(R(\mathbf {u}) = 10^{-15} \sum_{i=1}^{16} (I_{i}-I_{\mathrm{ref},i})^{2}\) for \(\mathbf {x}_{0} = (5.439, -4.168)\), with the currents of the fourth row in the table in Fig. 15 as the reference coil currents \(I_{\mathrm{ref},i}\). Here we found \(\lambda=1.580\times10^{6}\)

5 Conclusions

Handling the heat where the plasma touches the vessel wall is one of the outstanding challenges in magnetically confined fusion energy research. Indeed, the predicted heat load on the ITER vessel walls will be greater than that on the soil beneath a launching rocket. Current experiments seek magnetic field configurations that provide the most effective heat load reduction. One such advanced configuration is the so-called snowflake configuration, where the plasma boundary is the flux contour line that passes through a degenerate saddle point of the poloidal flux.

The main control parameters for shaping the contour lines of the poloidal flux, and hence for shaping the form and position of a plasma equilibrium in a tokamak, are the currents in the surrounding poloidal field coils. Optimal control formulations combined with finite element methods are an obvious and fairly established means [3, 20] of determining currents that ensure a desired form and position of the plasma. However, the low regularity of standard \(H^{1}\)-conforming finite elements, e.g. the lack of well-defined pointwise derivatives, appears to be an obstacle to defining good objective functions for finding snowflake configurations, which are characterized by degenerate saddle points.

We therefore presented an extension of this approach that combines the optimal control formulation with a mortar-type FE method. The mortar-type FE method has the advantage that we can introduce FE spaces of higher regularity in the places where the objective functions involve pointwise values of flux derivatives. This is achieved by combining FE on Cartesian meshes with FE on triangular meshes.

The examples for the CFETR tokamak demonstrate the viability and flexibility of the presented approach.