The task of constructing a multigrid solver for a particular problem is typically performed by a human expert with profound knowledge in numerical mathematics. To automate this task, we first need to represent multigrid solvers in a formal language that we can then use to construct different instances on a computer. The rules of this language must ensure that only valid solver instances can be defined, which means that we can automatically determine their convergence speed and execution time. Additionally, we want to enforce that the generated method works on a hierarchy of grids, which requires the availability of inter-grid operations to obtain approximations of the same operator or grid on a finer or coarser level. Consider the general system of linear equations defined on a grid with spacing h
$$\begin{aligned} A_{h} u_{h} = f_{h}, \end{aligned}$$
(1)
where \(A_{h}\) is the coefficient matrix, \(u_{h}\) the unknown and \(f_h\) the right-hand side of the system. Each component of a multigrid solver can be written in the form
$$\begin{aligned} u_{h}^{i+1} = u_{h}^i + \omega B_{h} (f_h - A_{h} u_{h}^i), \end{aligned}$$
(2)
where \(u_{h}^i\) is the approximate solution in iteration i, \(\omega \in {\mathbb {R}}\) the relaxation factor and \(B_{h}\) an operator defined on the given level with spacing h. For example, with the splitting \(A_{h} = D_{h} - U_{h} - L_{h}\), where \(D_{h}\) represents the diagonal, \(-U_{h}\) the upper triangular and \(-L_{h}\) the lower triangular part of \(A_h\), we can define the Jacobi
$$\begin{aligned} u_{h}^{i+1} = u_{h}^i + D_{h}^{-1} (f_h - A_{h} u_{h}^i) \end{aligned}$$
(3)
and the lexicographical Gauss-Seidel method
$$\begin{aligned} u_{h}^{i+1} = u_{h}^i + (D_{h} - L_{h})^{-1} (f_h - A_{h} u_{h}^i). \end{aligned}$$
(4)
If we assume the availability of a restriction operator \(I_h^H\), that computes an approximation of the residual on a coarser grid with spacing H, a prolongation operator \(I_H^h\), that interpolates a correction obtained on the coarser grid into a finer grid, and an approximation for the inverse of \(A_h\) on the coarser grid, a coarse grid correction can be defined as
$$\begin{aligned} u_{h}^{i+1} = u_{h}^i + I_H^h A_H^{-1} I_h^H(f_h - A_{h} u_{h}^i). \end{aligned}$$
(5)
Furthermore, we can substitute \(u_{h}^i\) in (5) with (3) and obtain a two grid with Jacobi pre-smoothing
$$\begin{aligned} u_{h}^{i+1}&= (u_{h}^i + D_{h}^{-1} (f_{h} - A_{h} u_{h}^i)) \nonumber \\&\quad+ I_H^h A_H^{-1} I_h^H(f_{h} - A_{h} (u_{h}^i + D_{h}^{-1} (f_{h} - A_{h} u_{h}^i))). \end{aligned}$$
(6)
By repeatedly substituting subexpressions, we can automatically construct a single expression for any multigrid solver. If we take the set of possible substitutions as a basis, we can define a list of rules according to which such an expression can be generated. We specify these rules in the form of a context-free grammar, which is described in Table 1. Table 1a contains the production rules while Table 1b describes their semantics. Within the former the symbol \(A_h^{+}\) corresponds to a given splitting of the system matrix \(A_h = A_h^{+} + A_h^{-}\) such that \(A_h^{+}\) is efficiently invertible. For instance, in case of the Jacobi method \(A_h^{+} = D_h\) is defined as the diagonal of \(A_h\). Each rule defines the set of expressions by which a certain production symbol, denoted by \(\langle \cdot \rangle\), can be replaced. Starting with \(\langle S \rangle\), symbols are recursively replaced until the produced expression contains only terminals or the empty string \(\lambda\). The construction of a multigrid solver comprises the recursive generation of cycles on multiple levels. Consequently, it must be possible to create a new system of linear equations on a coarser level, including a new initial solution, right-hand side, and coefficient matrix. Moreover, if we decide to finish the computation on a particular level, we need to restore the state of the next finer level, i.e., the current solution and right-hand side, when applying the coarse grid correction. The current state of a multigrid solver on a level with grid spacing h is represented as a tuple (\(u_h\), \(f_{h}\), \(\delta _h\)), where \(u_h\) represents the current iterate, \(f_{h}\) the right-hand side and \(\delta _h\) a correction expression. To restore the current state on the next finer level, we additionally include a reference \(state_h\) to the corresponding tuple. According to Table 1a, the construction of a multigrid solver always ends when the tuple (\(u^0_h\), \(f_h\), \(\lambda\), \(\lambda\)) is reached. This tuple contains the initial solution and right-hand side on the finest level and therefore corresponds to the original system of linear equations that we aim to solve. Here we have neither computed a correction nor need to restore the state, and both \(\delta _h\) and \(state_h\) contain the empty string.
Table 1 Formal grammar for constructing three-grid multigrid cycles—The first column contains the list of production rules where each symbol on the left side of the \(\vDash\) sign can be replaced by the corresponding symbol on its right side In general, our grammar includes three functions that operate on a fixed level. The function iterate generates a new state tuple based on the previous one by applying the correction \(\delta\) to the current iterate u using the relaxation factor \(\omega\). If available, a partitioning can be included to perform the update in multiple sweeps on subsets of u and \(\delta\), for example, a red-black Gauss-Seidel iteration. The function residual creates a residual expression based on the given state, which is assigned to the newly created symbol \(\delta\). A correction \(\delta\) can be transformed with the function apply, which generates a new correction \(\tilde{\delta }\) by applying the linear operator B to the old one. For example, the following function applications evaluate to one iteration of damped Jacobi smoothing:
$$\begin{aligned} \begin{aligned}&\; \textsc {iterate}(\textsc {apply}(D_h^{-1},\; \textsc {residual}(A_h,\; (u_h^0,\; f_h,\; \lambda ,\; \lambda ))), \; 0.7, \; \lambda ) \\ \rightarrow&\;\textsc {iterate}(\textsc {apply}(D_h^{-1},\; (u_h^0,\; f_h,\; f_h - A_h u_h^0,\; \lambda )), \; 0.7, \; \lambda ) \\ \rightarrow&\; \textsc {iterate}((u_h^0,\; f_h,\; D_h^{-1}(f_h - A_h u_h^0),\; \lambda ), \; 0.7, \; \lambda ) \\ \rightarrow&\; (u^0_h + 0.7 \cdot D_h^{-1}(f_h - A_{h} u^0_h), \; f_h, \; \lambda , \; \lambda ) \end{aligned} \end{aligned}$$
Finally, it remains to be shown how one can recursively create a multigrid cycle on the next coarser level and then apply the result of its computation to the current approximate solution. This is accomplished through the functions cocy and cgcFootnote 1. The former expects a state to which the restriction \(I_h^{H}\) has been already applied. It then creates a new state on the next coarser level using the initial solution \(u^0_H\), the operator \(A_H\), and the restricted correction \(\delta _H\) as a right-hand side \(f_{H}\). Note that on the coarsest level, the resulting system of linear equations can be solved directly, which is denoted by the application of the inverse coarse-grid operator. For restoring the previous state, a reference is stored in \(state_H\). If the computation on the coarser level is finished, the function cgc comes into play. It first restores the previous state on the next finer level and then computes a coarse-grid correction by applying the prolongation operator to the solution computed on the coarser grid, which is then used as a new correction \(\tilde{\delta }_h\) on the finer level. Again the following example application demonstrates the semantics of these functions:
$$\begin{aligned} \begin{aligned}&\; \textsc {cgc}(I_{2h}^{h}, \textsc {iterate}(\textsc {cocy}(A_{2h},\; u^0_{2h}, \; (u_h^0, \; f_h, \; I_h^{2h} (f_h - A_h u_h^0), \; \lambda )), \; 1, \; \lambda ))\\ \rightarrow&\; \textsc {cgc} (I_{2h}^{h}, \; \textsc {iterate}((u^0_{2h}, \; I_h^{2h} (f_h - A_h u_h^0), \; I_h^{2h} (f_h - A_h u_h^0) - A_{2h} u_{2h}^0, \\&\; (u_h^0, \; f_h, \; \lambda , \; \lambda )), \; 1, \; \lambda ))\\ \rightarrow&\; \textsc {cgc} (I_{2h}^{h}, \; (u^0_{2h} + 1 \cdot (I_h^{2h} (f_h - A_h u_h^0) - A_{2h} u^0_{2h}), \; I_h^{2h} (f_h - A_h u_h^0), \; \lambda , \\&\; (u_h^0, \; f_h, \; \lambda , \; \lambda ))) \\ \rightarrow&\; (u^0_h, \; f_h, \; I_{2h}^{h} \cdot (u^0_{2h} + 1 \cdot (I_h^{2h} (f_h - A_h u_h^0) - A_{2h} u^0_{2h})), \; \lambda ) \end{aligned} \end{aligned}$$
Finally, note that Table 1a can produce multigrid cycles with a hierarchy of at most three discretization levels (or coarsening steps), whereas the only viable operation on the lowest level is the application of a coarse grid solver. However, since its rules can be applied recursively, the depth of the resulting grammar expression tree is not restricted, and, in principle, all three discretization levels can be traversed an infinite number of times. In practice, it is often favorable to construct multigrid solvers that employ an even greater number of coarsening steps. For this purpose, additional production rules for the generation of inter-grid transfer operations, i.e., cocy and cgc, must be defined on the respective discretization levels, whereas the general structure of the grammar remains unchanged. Since we have shown how it is possible to generate expressions that uniquely represent different multigrid solvers using the formal grammar defined in Table 1, this paper’s remainder focuses on the evaluation and optimization of the algorithms resulting from this representation.