Multigrid methods are among the most efficient solvers for elliptic model problems such as (1.1); see, e.g., [6, 35]. Multigrid methods for meshes in polar coordinates were considered in, e.g., [1, 3, 23, 33, 35], but they have been studied less extensively. In the following sections, we develop special multigrid components for the model problem in curvilinear coordinates such as the generalized polar coordinates proposed in (2.2).
To fix the notation, we first define a hierarchy of \(L+1\) grids with \(\varOmega _{l-1}\subset \varOmega _{l}\), \(1\le l\le L\), and \(|\varOmega _L|=n_{r}\cdot n_{\theta }\). To identify matrices and vectors on grid \(\varOmega _l\), we use the subindex l, \(0\le l\le L\). The iterates of step m are characterized by a superindex m, \(m\ge 0\). The restriction operator from grid l to grid \(l-1\) is denoted \(I_l^{l-1}\), and \(I_{l-1}^{l}\) represents the interpolation from grid \(l-1\) to grid l. The presmoothing operation with \(\nu _1\) steps is denoted \(\mathbf{S }^{\nu _1}\), and the postsmoothing operation with \(\nu _2\) steps is denoted \(\mathbf{S }^{\nu _2}\). The multigrid cycle \(u_L^{m+1}=\mathbf{MGC} (L,\gamma ,u_L^m,A_L,f_L,\nu _1,\nu _2)\) is then given recursively for \(0\le l\le L\).
In the recursive call, \(\diamondsuit \) stands for zero as a first approximation and in further calls (W-cycle) for an approximation taken from the previous cycle.
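The recursive structure of such a cycle can be illustrated by a minimal Python sketch. It is not the algorithm of this paper but a simplified stand-in under stated assumptions: a 1D Poisson problem on (0, 1) discretized with linear finite elements, lexicographic Gauss-Seidel instead of line smoothers, and hypothetical helper names (`apply_K`, `smooth`, `restrict`, `prolong`, `mgc`). The restriction is the unscaled transpose of linear interpolation, and the coarse grid correction starts from zero and, for \(\gamma =2\), continues from the result of the previous recursive call.

```python
def apply_K(u, h):
    """Apply the 1D linear FE stiffness matrix (1/h)*tridiag(-1, 2, -1)."""
    n = len(u)
    return [(2.0 * u[i]
             - (u[i - 1] if i > 0 else 0.0)
             - (u[i + 1] if i < n - 1 else 0.0)) / h for i in range(n)]

def smooth(u, f, h, steps):
    """Lexicographic Gauss-Seidel sweeps (assumed stand-in smoother)."""
    u = list(u)
    n = len(u)
    for _ in range(steps):
        for i in range(n):
            left = u[i - 1] if i > 0 else 0.0
            right = u[i + 1] if i < n - 1 else 0.0
            u[i] = (h * f[i] + left + right) / 2.0
    return u

def restrict(r):
    """Unscaled transpose of linear interpolation; coarse node j is fine node 2j+1."""
    nc = (len(r) - 1) // 2
    return [0.5 * r[2 * j] + r[2 * j + 1] + 0.5 * r[2 * j + 2] for j in range(nc)]

def prolong(ec):
    """Linear interpolation from the coarse to the fine grid."""
    e = [0.0] * (2 * len(ec) + 1)
    for j, v in enumerate(ec):
        e[2 * j] += 0.5 * v
        e[2 * j + 1] += v
        e[2 * j + 2] += 0.5 * v
    return e

def mgc(l, gamma, u, f, h, nu1, nu2):
    """One multigrid cycle on level l (gamma=1: V-cycle, gamma=2: W-cycle)."""
    if l == 0:
        return [h * f[0] / 2.0]          # single unknown: solve (2/h) u = f exactly
    u = smooth(u, f, h, nu1)             # presmoothing
    r = [fi - ki for fi, ki in zip(f, apply_K(u, h))]
    rc = restrict(r)
    ec = [0.0] * len(rc)                 # zero as first approximation
    for _ in range(gamma):               # further calls reuse the previous result
        ec = mgc(l - 1, gamma, ec, rc, 2.0 * h, nu1, nu2)
    u = [ui + ei for ui, ei in zip(u, prolong(ec))]
    return smooth(u, f, h, nu2)          # postsmoothing
```

In this sketch, level l carries \(2^{l+1}-1\) interior nodes, and the load vector is assumed to be scaled with the mesh size, so that the unscaled transpose is a consistent restriction.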
Optimized Zebra Line Smoothers
For highly anisotropic problems, point relaxation and standard coarsening (i.e., coarsening by a factor of 2 in each dimension) do not yield satisfactory results. Pointwise smoothing then has poor smoothing properties with respect to weakly coupled degrees of freedom (dofs); cf. [35, Sec. 5.1]. In the context of multigrid, we speak of strong coupling of one dof to another if the corresponding offdiagonal entry of the considered matrix is “relatively” large compared to the other offdiagonal entries of the same dof. If the entry is “relatively” small, we speak of weak coupling.
If the anisotropy is aligned with the grid, standard coarsening can be kept and only the smoothing operation has to be adapted to obtain good multigrid performance. Line relaxations are block relaxations where all the connections between degrees of freedom of one line are taken into account to update this line in one single step. Using line relaxation, errors become smooth if strongly connected degrees of freedom are updated together. For a more detailed introduction to line smoothers, see, e.g., [35, Sec. 5.1].
For compact finite difference stencils and linear nodal basis functions, zebra line smoothers correspond to Gauss-Seidel line relaxation methods where all even and all odd lines (rows or columns), respectively, are processed simultaneously. For operators where the anisotropy changes across the domain, alternating zebra relaxation has been proposed; see [33]. The polar coordinate transformed Laplace operator yields strong connections on circle lines in the interior part of the domain and strong connections on radial lines in the outer part; cf. (2.4). Consequently, alternating zebra relaxation was proposed for the unit disk [1]. We will now briefly introduce zebra relaxation and then explain our particular choice of smoothers for all parts of the (deformed) domain from Fig. 1 described by curvilinear coordinates.
Let \(n_l=n_{l,r}\times n_{l,\theta }\) be the number of nodes on grid \(l\in \{0,\ldots ,L\}\). Furthermore, let \(B_l\) and \(W_l\) be disjoint index sets such that \(B_l\cup W_l=\{1,2,\ldots ,n_l\}\). By reordering, we obtain
$$\begin{aligned} u_l^{m}=\begin{pmatrix} u^{m}_{l,B}\\ u^{m}_{l,W} \end{pmatrix},\; f_l=\begin{pmatrix} f_{l,B}\\ f_{l,W} \end{pmatrix},\text { and } K_l=\begin{pmatrix} K_{l,BB} &{} K_{l,BW}\\ K_{l,WB} &{} K_{l,WW} \end{pmatrix} \end{aligned}$$
(4.1)
for any grid \(l\in \{0,\ldots ,L\}\). Note that we drop the second index l in B and W to avoid a proliferation of indices.
In the following, we will focus on zebra colorings such that \(K_{l,BB}\) and \(K_{l,WW}\) can be partitioned into a block diagonal system with blocks of size \({\mathcal {O}}(\sqrt{n_l})\). Note that this property does not hold for the radial directions if a full (deformed) disk is considered; if \(r_1=0\), then all these directions are coupled by the origin.
For curvilinear coordinates, the two natural line smoothing operations are denoted circle and radial zebra relaxation. For circle zebra relaxation, all nodes \((r_{l,i},\theta _{l,j})\), \(j\in \{1,\ldots ,n_{l,\theta }\}\) get the same color while \((r_{l,i-1},\theta _{l,j})\) and \((r_{l,i+1},\theta _{l,j})\), \(j\in \{1,\ldots ,n_{l,\theta }\}\), get another color. For radial zebra relaxation \((r_{l,i},\theta _{l,j})\), \(i\in \{1,\ldots ,n_{l,r}\}\) are colored together; see Fig. 5 (left and second to left).
Let us color the lines (rows or columns) alternately black and white. Then, the diagonal blocks of size \({\mathcal {O}}(\sqrt{n_l})\) in \(K_{l,BB}\) and \(K_{l,WW}\) only have three entries per row for all finite difference stencils and finite element basis functions introduced in Sect. 3. For a coloring in accordance with the ordering of the nodes, the local block can be tridiagonal. However, banded systems with three entries per row that are not tridiagonal can also be solved in \({\mathcal {O}}(\sqrt{n_l})\) operations by a direct solver; see Fig. 5 (second to right and right) for the nonzero structure.
The presmoothing operation \(\mathbf{S }^{\nu _1}(u_l^m,K_l,f_l)\) can be expressed as follows.
The postsmoothing operation \(\mathbf{S }^{\nu _2}(u_l^{m+\frac{2}{3}},K_l,f_l)\) is obtained analogously. In order to smooth the coarse degrees of freedom first, we always color them black.
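To illustrate the two half-steps of a zebra line relaxation, the following Python sketch may be helpful. It is a simplified stand-in, not the smoother of this paper: it assumes a Cartesian anisotropic five-point stencil with a hypothetical coupling parameter `eps` between lines, homogeneous Dirichlet values outside the grid, and a tridiagonal (Thomas) solve per line. All function names are hypothetical.

```python
def thomas(sub, diag, sup, rhs):
    """Solve a tridiagonal system (sub[0] and sup[-1] are not used)."""
    n = len(rhs)
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = sup[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        m = diag[i] - sub[i] * c[i - 1]
        c[i] = sup[i] / m
        d[i] = (rhs[i] - sub[i] * d[i - 1]) / m
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def coupling_rhs(u, f, eps, i):
    """Right-hand side of line i: f plus the weak coupling to both neighbor lines."""
    n_pts = len(u[i])
    up = u[i - 1] if i > 0 else [0.0] * n_pts
    down = u[i + 1] if i < len(u) - 1 else [0.0] * n_pts
    return [f[i][j] + eps * (up[j] + down[j]) for j in range(n_pts)]

def zebra_sweep(u, f, eps):
    """One zebra step: all black (even) lines first, then all white (odd) lines."""
    n_pts = len(u[0])
    sub = sup = [-1.0] * n_pts
    diag = [2.0 + 2.0 * eps] * n_pts
    for color in (0, 1):
        # lines of one color only couple to lines of the other color,
        # so they could be processed simultaneously
        for i in range(color, len(u), 2):
            u[i] = thomas(sub, diag, sup, coupling_rhs(u, f, eps, i))
    return u

def line_residual(u, f, eps, i):
    """Maximum norm of the residual restricted to line i."""
    rhs = coupling_rhs(u, f, eps, i)
    n_pts = len(u[i])
    res = []
    for j in range(n_pts):
        left = u[i][j - 1] if j > 0 else 0.0
        right = u[i][j + 1] if j < n_pts - 1 else 0.0
        res.append(rhs[j] - ((2.0 + 2.0 * eps) * u[i][j] - left - right))
    return max(abs(v) for v in res)
```

After one sweep, the residual vanishes (up to rounding) on the lines of the color that was processed last, which reflects the block-triangular structure of the relaxation.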
Remark 4.1
Note that the zebra-line Gauss-Seidel preconditioner is not triangular but block-triangular. That means that all nonzero entries shown in Fig. 5 (also those in the upper triangular part) remain on the left hand side of the system. The shown entries all belong to the same line (row or column). For larger finite difference stencils or hierarchical finite element bases, \(K_{l,BB}\) and \(K_{l,WW}\) from (4.2) may have more than three nonzeros per row. Then, either more colors have to be used or a part of the upper triangular matrix has to be brought to the right hand side.
Let us consider the annulus \({\overline{\varOmega }}_{h_i}:=[r_i,r_i+h_i]\times [0,2\pi ]\) as an individual domain with a constant discretization parameter \(k_j=k\) in the second dimension, i.e., \(n_{\theta }k=2\pi \). From [1], we know that the smoothing factors of circle and radial relaxation, \(\mu _{\text {CZ},h_i,k_j}\) and \(\mu _{\text {RZ},h_i,k_j}\), on \({\overline{\varOmega }}_{h_i}\) are given by
$$\begin{aligned} \begin{aligned} \mu _{\text {CZ},h_i,k_j}&=\max _{r_i\le r\le r_i+h_i}\left\{ \left( \frac{q_{i,j}^2r^2}{1+q_{i,j}^2r^2}\right) ^2,C_C\right\} \\ \mu _{\text {RZ},h_i,k_j}&=\max _{r_i\le r\le r_i+h_i}\left\{ \left( \frac{1}{1+q_{i,j}^2r^2}\right) ^2,C_R\right\} \\ \end{aligned} \end{aligned}$$
(4.3)
with \(q_{i,j}=\frac{k_j}{h_i}\) as well as \(C_C\in \{0.23,0.34\},\) depending on \(r_i\ge 0\), and \(C_R=0.23\), independently of \(r_i\). From Fig. 6 (left), we see that both relaxations behave very differently on different annuli of size \(h_i\) of the global domain. We see that radial relaxation is prohibitive around the origin but shows good smoothing behavior for \(r \rightarrow 1.3\). Circle relaxation shows good smoothing behavior around the origin but does not provide essential smoothing where the mesh was refined and for \(r\rightarrow 1.3\). In order to obtain a reasonable smoothing procedure on the entire domain, we thus have to combine circle relaxation with radial relaxation. In [1], alternating zebra relaxation, consisting of one step with each smoothing operator, was proposed.
To reduce the workload and to optimize the smoothing operation, we propose the following smoothing procedure. Since circle relaxation leads to good smoothing around the origin, we color the nodes around the origin in circle lines. For each following circle with radius \(r_i>r_1\), we then check, in accordance with (4.3), whether
$$\begin{aligned} q_{i,j}^2r^2>1\quad \Leftrightarrow \quad \frac{k_j}{h_i}r_i>1 \end{aligned}$$
(4.4)
and change to radial relaxation if this is the case. Note that we use the fact that \(k_j\) is constant on each circle line represented by \(r_i\), \(i=1,\ldots ,n_r\). We then obtain a decomposition of the domain into two subdomains, on which different relaxation methods are used; see Fig. 6.
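The smoothing factors (4.3) and the switching rule (4.4) can be evaluated with a few lines of Python. The sketch below assumes that the radii \(r_i\), the radial mesh sizes \(h_i\), and a constant angular mesh size k are given as plain lists and numbers; the function names and the default values of the constants are illustrative assumptions (in particular, which of the two values of \(C_C\) applies must be chosen by the caller).

```python
import math

def smoothing_factors(r_i, h_i, k, c_c=0.23, c_r=0.23):
    """Evaluate (4.3) on the annulus [r_i, r_i + h_i] with q = k / h_i.

    The circle term increases with r, so its maximum is attained at r_i + h_i;
    the radial term decreases with r, so its maximum is attained at r_i.
    """
    q = k / h_i
    t_c = (q * (r_i + h_i)) ** 2
    t_r = (q * r_i) ** 2
    mu_cz = max((t_c / (1.0 + t_c)) ** 2, c_c)
    mu_rz = max((1.0 / (1.0 + t_r)) ** 2, c_r)
    return mu_cz, mu_rz

def first_radial_index(radii, hs, k):
    """Index of the first circle line satisfying (4.4), i.e. (k / h_i) * r_i > 1.

    Lines before this index are assigned to circle relaxation, the remaining
    ones to radial relaxation; len(radii) is returned if no line qualifies.
    """
    for i, (r, h) in enumerate(zip(radii, hs)):
        if (k / h) * r > 1.0:
            return i
    return len(radii)

# Hypothetical uniform example grid: 11 circle lines, 21 angular intervals.
radii = [0.1 * i for i in range(11)]
hs = [0.1] * len(radii)
k = 2.0 * math.pi / 21
switch = first_radial_index(radii, hs, k)
```

For nonuniform radial refinement, the ratio \(k/h_i\) changes from line to line, which is why the rule is evaluated per circle line rather than once for the whole grid.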
Although the decomposition rule (4.4) was developed for a domain described by polar coordinates, we also use this as a rule of thumb for the deformed geometries described by transformation (2.2). See Sect. 5.2 for a numerical evaluation.
The optimized presmoothing operation \(\mathbf{S }^{\nu _1}(u_l^m,K_l,f_l)\) is then given with six colors: black (for circle and radial, denoted \(B_C\) and \(B_R\)), white (for circle and radial, denoted \(W_C\) and \(W_R\)), and orange (denoted \(O_C\) and \(O_R\)), which itself is not smoothed; see Fig. 6. The values of the previous half-step of relaxation are implicitly used as Dirichlet boundary conditions on the orange-colored part of the decomposition.
The values \(u_{l,O_{*}}^{m,i}\) on the orange-colored part of the domain contain the interface boundary conditions for each half-step of the smoother. Note that only those values next to the interior interface, which represent the interface boundary conditions, have to be updated in each step of the iterative process. The larger the stencil, the more lines have to be updated in practice.
Note that (4.5) is not parallelized across the two different smoothers (circle and radial) but that the radial smoothers use information from the circle smoothers. If no information is exchanged, an increase in iterations is to be expected. However, if the color of the outermost circle-smoother line is smoothed first, then for compact FE or FD stencils such as provided in this paper, the two sequential colors of the radial smoothers can be executed in parallel with the second color of the circle smoother. Since the circle color lines are of larger size than the radial color lines, both parallel steps are expected to finish at similar times.
Coarsening and Intergrid Transfer Operators
The coarsening and intergrid transfer operators use the classical choices. We always employ standard coarsening. If the additional extrapolation algorithm is not used, we use bilinear interpolation, which is also well-defined for anisotropic meshes. In the case of implicit extrapolation, we use bilinear interpolation for \(l=1,\ldots ,L-1\) only, and the transfer between the two finest grids is adapted. As presented in Sect. 4.3, extrapolation only affects the transfer between the two finest grid levels. If (0, 0) is an actual discretization node and is chosen as the first coarse node, we have to adapt the restriction and prolongation there. Our restriction operator is always defined as the adjoint
$$\begin{aligned} I_{l}^{l-1} = \left( I_{l-1}^l\right) ^T,\quad l=1,\ldots ,L. \end{aligned}$$
(4.6)
Remark 4.2
Note that there is no scaling constant in definition (4.6) since for the finite element discretizations as well as for our tailored finite difference schemes, the right hand side is locally scaled with \({\mathcal {O}}(h_ik_j)\), \(1\le i\le n_{r}\) and \(1\le j\le n_{\theta }\); cf. [18] for details on the derivation of the finite difference stencils. As a potential source of implementation error, this has to be taken into account.
Implicit Extrapolation
In this section, we introduce the implicit extrapolation step within our multigrid algorithm, based on the extrapolation strategy of [15, 16, 18]. The extrapolation step is only conducted between the two finest levels of the multigrid hierarchy, affecting the operators on and the interpolation between \(\varOmega _L\) and \(\varOmega _{L-1}\).
Let us assume that the coarse degrees of freedom are ordered before the fine degrees of freedom. By using the indices \(\cdot _c\) for coarse and \(\cdot _f\) for fine nodes, we have
$$\begin{aligned} K_L=\begin{pmatrix} K_{L,cc} &{} K_{L,cf}\\ K_{L,fc} &{} K_{L,ff} \end{pmatrix},\quad f_L=\begin{pmatrix} f_{L,c} \\ f_{L,f} \end{pmatrix},\quad u^m_L=\begin{pmatrix} u^m_{L,c} \\ u^m_{L,f} \end{pmatrix}, \end{aligned}$$
and equivalently for any other entity defined on \(\varOmega _L\).
In accordance with [15, p. 173], we present the new smoothing procedure that excludes coarse grid nodes from the (pre- or post-)smoothing procedure
$$\begin{aligned} \begin{aligned}&u_{L,f}^{m+1/3}=\mathbf{S }^{\nu _1}(u_{L,f}^m,K_{L,ff},f_{L,f}-K_{L,fc}u_{L,c}^m)\\ \text {and}\quad&u_{L,f}^{m+1}=\mathbf{S }^{\nu _2}(u_{L,f}^{m+2/3},K_{L,ff},f_{L,f}-K_{L,fc}u_{L,c}^{m+2/3}) \end{aligned} \end{aligned}$$
(4.7)
The new smoother on the finest level is the previously defined smoother only acting on the fine nodes.
Remark 4.3
Only the fine grid nodes are smoothed on the finest level, and the nodes belonging to the coarse grid are excluded from the smoothing operation. This differs from the introduction of \(\tau \)-extrapolation in [2, 6, 14]. The weaker smoother may lead to a reduced algebraic convergence of the multigrid iteration, but it has the advantage that the fixed point of the multigrid iteration is uniquely defined. For more details, we refer to [15, p. 173] and the references therein.
Before presenting the extrapolated multigrid cycle, we must also introduce the modified intergrid transfer operators \(I_{L-1}^L\) and \(I_L^{L-1}:=(I^L_{L-1})^T\). In order to do so, denote by \({\mathcal {T}}_{L-1}\) the triangulation on \(\varOmega _{L-1}\). We then define
$$\begin{aligned} I_{L-1}^L:=\begin{pmatrix}I_c \\ T_{fc} \end{pmatrix}, \end{aligned}$$
(4.8)
where \(I_c\) is the identity matrix on the coarse degrees of freedom and
$$\begin{aligned} \left( T_{fc}\right) _{s-n_{L-1},t}:&=\left\{ \begin{array}{ll}\frac{1}{2}, &{} \text {if there exists an edge }e \text { in }{\mathcal {T}}_{L-1} \text { s.t. }x_s\in e \text { and }x_t\in \partial e,\\ 0, &{} \text {otherwise.} \end{array}\right. \end{aligned}$$
Note that edges are open sets, i.e., \(\overset{\circ }{e}=e\).
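The assembly of the transfer operator (4.8) can be sketched in Python as follows. The sketch assumes 0-based indices, coarse nodes numbered before fine nodes, and that every fine node lies on exactly one (open) edge of \({\mathcal {T}}_{L-1}\); the input format, a dictionary mapping each fine node index to the two coarse endpoints of its edge, and the function names are hypothetical.

```python
def build_prolongation(n_coarse, edge_of_fine_node):
    """Assemble I_{L-1}^L = (I_c; T_fc) as a dense list of rows.

    edge_of_fine_node maps a fine node index s >= n_coarse to the pair
    (t1, t2) of coarse nodes bounding the edge that contains x_s.
    """
    n_fine = len(edge_of_fine_node)
    P = [[0.0] * n_coarse for _ in range(n_coarse + n_fine)]
    for t in range(n_coarse):
        P[t][t] = 1.0                       # identity block I_c
    for s, (t1, t2) in edge_of_fine_node.items():
        P[s][t1] = 0.5                      # row s - n_coarse of T_fc
        P[s][t2] = 0.5
    return P

def transpose(P):
    """Restriction as the adjoint of the prolongation."""
    return [list(col) for col in zip(*P)]

# Hypothetical example: one coarse triangle with nodes 0, 1, 2 and
# fine nodes 3, 4, 5 on its three edges.
P = build_prolongation(3, {3: (0, 1), 4: (1, 2), 5: (0, 2)})
R = transpose(P)
```

In contrast to bilinear interpolation, every fine node only receives contributions from the two endpoints of its edge, so each row of \(T_{fc}\) has exactly two nonzero entries.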
The implicitly extrapolated multigrid cycle \(u_L^{m+1}=\mathbf{IEMGC} (L,\gamma ,u_L^m,K_L,f_L,\nu _1,\nu _2)\) is then given as in [15, Algorithm 1].
Remark 4.4
In [15, pp. 169f], it was shown that the implicitly extrapolated multigrid algorithm for linear elements can be interpreted as a multigrid algorithm solving the original PDE when discretized by quadratic nodal basis functions.
In [15], only constant coefficients were considered. Note that in our applications, due to the transformation of the physical domain, even \(\alpha \equiv 1\) leads to nonconstant coefficients; cf. Sect. 2. Nonconstant coefficients were considered with hierarchical bases in [16]. In contrast to [16], we use the intergrid transfer operator given in [15]. This results from the discretization by nodal basis functions.
The proof of Remark 4.4 is based on the relation between linear nodal, quadratic nodal, and h- and p-hierarchical basis functions. The transfer operator \(I_{L-1}^L\) is part of the transformation between a nodal and a hierarchical basis; see also [18, Sec. 4.4.1]. The necessary relations [15, (55) and (56)] are formally proven for nonconstant coefficients in [18, Lemma 4.2, Theorem 4.3]. In particular, we can write
$$\begin{aligned}&\frac{4}{3}I_L^{L-1}\Big (f_L-K_Lu_L^{m+1/3}\Big )-\frac{1}{3} \Big (f_{L-1}-K_{L-1}u_{L,c}^{m+1/3}\Big )\nonumber \\&\quad =I_L^{L-1}\Big [\underset{=:f^{ex}_{L}}{\underbrace{\begin{pmatrix} \frac{4}{3}f_{L,c}-\frac{1}{3}f_{L-1} \\ \frac{4}{3}f_{L,f} \end{pmatrix}}} - \underset{=:K^{ex}_{L}}{\underbrace{\begin{pmatrix} \frac{4}{3}K_{L,cc}-\frac{1}{3}K_{L-1} &{} \frac{4}{3}K_{L,cf}\\ \frac{4}{3}K_{L,fc} &{} \frac{4}{3}K_{L,ff} \end{pmatrix}}}u_L^{m+1/3} \Big ], \end{aligned}$$
(4.9)
where the term in brackets corresponds to the residual computation of the quadratic approach.
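The equality in (4.9) is purely algebraic: it only uses the block structure of \(I_L^{L-1}=(I_c,\,T_{fc}^T)\) and holds for arbitrary matrices of matching size. This can be checked numerically with the following Python sketch, in which all matrices and vectors are random stand-ins (not actual discretization matrices) and the function names are hypothetical.

```python
import random

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def check_identity(nc, nf, seed=0):
    """Return the maximal difference between both sides of (4.9)."""
    rng = random.Random(seed)
    n = nc + nf
    rand_mat = lambda rows, cols: [[rng.uniform(-1, 1) for _ in range(cols)]
                                   for _ in range(rows)]
    rand_vec = lambda m: [rng.uniform(-1, 1) for _ in range(m)]
    K, Kc = rand_mat(n, n), rand_mat(nc, nc)   # stand-ins for K_L and K_{L-1}
    f, fc, u = rand_vec(n), rand_vec(nc), rand_vec(n)
    T = rand_mat(nf, nc)                       # stand-in for T_fc

    def restrict(v):
        # I_L^{L-1} v = v_c + T_fc^T v_f (coarse dofs are ordered first)
        return [v[t] + sum(T[s][t] * v[nc + s] for s in range(nf))
                for t in range(nc)]

    # left-hand side: combination of fine and coarse residuals
    r = [a - b for a, b in zip(f, matvec(K, u))]
    rc = [a - b for a, b in zip(fc, matvec(Kc, u[:nc]))]
    lhs = [4.0 / 3.0 * a - 1.0 / 3.0 * b for a, b in zip(restrict(r), rc)]

    # right-hand side: restricted residual of the extrapolated system
    f_ex = [4.0 / 3.0 * f[i] - (fc[i] / 3.0 if i < nc else 0.0)
            for i in range(n)]
    K_ex = [[4.0 / 3.0 * K[i][j] - (Kc[i][j] / 3.0 if i < nc and j < nc else 0.0)
             for j in range(n)] for i in range(n)]
    rhs = restrict([a - b for a, b in zip(f_ex, matvec(K_ex, u))])
    return max(abs(a - b) for a, b in zip(lhs, rhs))
```

The check works because the restriction acts as the identity on vectors that vanish on the fine nodes, so the coarse residual can be moved inside the brackets without change.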
Remark 4.5
We note that the direct discretization of a PDE with higher order finite elements will typically lead to a denser matrix structure and consequently to a higher flop cost per matrix-vector multiplication or smoother application. Here, we construct an equivalent higher order discretization by way of a recombination of low order components as they arise canonically in a multigrid solver. In this way, we avoid the explicit setup of any more expensive higher order discrete operator. In other words, the implicit extrapolation multigrid method leads to a qualitatively equivalent high order discretization at reduced cost. This reduces the memory cost by avoiding the setup of more densely populated matrices, and it avoids the corresponding memory traffic and higher flop cost incurred in each iteration of an iterative solution process. In fact, except for the computation of the extrapolated residual in the restriction phase, the cost of the extrapolated multigrid algorithm is the same as for a standard low order discretization.
Of course, this alone does not account for other solver costs, such as those induced by the possibly slower (algebraic) convergence of the extrapolated multigrid algorithm (meaning that more iterations are needed) and by the need to solve the discrete system with higher (algebraic) accuracy in order to exploit the lower discretization error. Because of these two effects, computing a proper solution with the extrapolated multigrid algorithm is still expected to be more expensive than solving for a low order discretization. For an in-depth analysis of the so-called textbook efficiency of parallel multigrid algorithms, see also [11, 17].