
1 Introduction

The self-adaptive hp-Finite Element Method (FEM) has been developed for many years by the community of applied mathematicians working in the field of numerical analysis [3, 4, 5, 6, 9]. The problems they address require extremely high numerical accuracy, which is difficult to obtain with other numerical methods. In this paper, we refer to the iterative Algorithm 1 proposed in [3], and we introduce the simplified, one-step Algorithm 2 as a kernel for selecting the optimal refinements of element interiors. The orders on element edges are then set to the minimum of the orders of the adjacent interiors. We further propose how to replace Algorithm 2 with a Deep Neural Network (DNN).
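
The rule for edge orders mentioned above can be illustrated with a minimal sketch. It assumes a single scalar order per element interior and a hypothetical edge-to-element adjacency map; the element and edge names are illustrative only and are not taken from the authors' code.

```python
from typing import Dict, List


def edge_orders(interior_order: Dict[str, int],
                edge_to_elements: Dict[str, List[str]]) -> Dict[str, int]:
    """Minimum rule: an edge takes the lowest interior order among the elements sharing it."""
    return {
        edge: min(interior_order[element] for element in elements)
        for edge, elements in edge_to_elements.items()
    }


# Toy mesh (assumption): two elements K1 and K2 sharing the edge e12.
interior_order = {"K1": 3, "K2": 5}
edge_to_elements = {"e12": ["K1", "K2"], "e1": ["K1"], "e2": ["K2"]}

orders = edge_orders(interior_order, edge_to_elements)
print(orders)  # {'e12': 3, 'e1': 3, 'e2': 5}
```

On quadrilateral elements with separate horizontal and vertical orders, the same rule would presumably be applied per direction.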

Fig. 1. The convergence of accuracy on the training (left) and validation (right) datasets.

The DNN can make mesh refinement decisions of similar quality to Algorithm 2, while the online computational time is reduced. The main motivation for this work is the following observation: randomizing 10% of the element refinement decisions made by the self-adaptive hp-FEM algorithm does not disturb the algorithm's exponential convergence. Thus, teaching the deep neural network to make decisions that are optimal in 90% of cases is enough to preserve the exponential convergence.

figure a

2 Self-adaptive hp-FEM with Neural Network

We focus on the L-shape domain model problem [5, 6] to illustrate the applicability of the self-adaptive hp-FEM algorithm to a model problem with a singular point. The gradient of the solution tends to infinity at this point, and intensive mesh refinements are needed to approximate this behavior properly.

Algorithm 1 describes the self-adaptive hp-algorithm, initially introduced in [3]. It uses Algorithm 2 to select the optimal refinement of each element K. This algorithm delivers exponential convergence of the numerical error with respect to the mesh size, which has been verified experimentally on multiple numerical examples [3, 4].
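
Exponential convergence of this kind is usually checked by fitting the error history to an exponential model. The sketch below assumes the form \(e \approx C \exp(-b N^{1/3})\), which is commonly stated for two-dimensional hp-FEM with \(N\) the number of degrees of freedom; the exponent, the function name, and the synthetic data are assumptions made for illustration, not results from this paper.

```python
import math


def fit_exponential_rate(dofs, errors, exponent=1.0 / 3.0):
    """Least-squares fit of log(error) = log(C) - b * N**exponent; returns (C, b)."""
    xs = [n ** exponent for n in dofs]
    ys = [math.log(e) for e in errors]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


# Synthetic error history (assumption): exactly exponential with C = 2, b = 0.5.
dofs = [100, 400, 1600, 6400]
errors = [2.0 * math.exp(-0.5 * n ** (1.0 / 3.0)) for n in dofs]

C, b = fit_exponential_rate(dofs, errors)
print(C, b)
```

A positive fitted rate `b` together with a nearly straight line on a plot of log-error against \(N^{1/3}\) is the usual visual evidence of exponential convergence.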

figure b

Our goal is to replace Algorithm 2 with a deep neural network. The left panel in Fig. 2 presents the optimal distribution of refinements, as provided by the deterministic algorithm. We can see that all the h refinements (breaking of elements) are performed towards the point singularity. We also see that the p refinements surround the singularity in layers of different colors. They change from red, light pink, and dark pink (\(p=6,7,8\)), through brown (\(p=5\)), yellow (\(p=4\)), green (\(p=3\)), and blue (\(p=2\)), to dark blue (\(p=1\)) close to the singularity.

The refinements performed by the iterative Algorithm 1 are executed first closer to the singularity. With the iterations, the differences between the coarse and fine mesh solution tend to zero [3].
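
The control flow of such an iterative coarse/fine adaptive loop can be sketched as follows. The algorithm figure is not reproduced here, so the step order is an assumption based on the surrounding description, and every callable (`solve_coarse`, `solve_fine`, `relative_error`, `select_refinement`, `apply_refinements`) is a hypothetical placeholder. The toy stand-ins only mimic a one-dimensional mesh with a singularity at its left end.

```python
def self_adaptive_loop(mesh, solve_coarse, solve_fine, relative_error,
                       select_refinement, apply_refinements,
                       tolerance, max_iterations=50):
    """Generic coarse/fine adaptive loop; all callables are user-supplied placeholders."""
    for _ in range(max_iterations):
        u_coarse = solve_coarse(mesh)
        u_fine = solve_fine(mesh)
        # Stop once the coarse and fine solutions are close enough.
        if relative_error(u_coarse, u_fine) <= tolerance:
            break
        # Per-element decision: the step a deterministic kernel or a DNN would provide.
        decisions = [select_refinement(k, mesh, u_coarse, u_fine)
                     for k in range(len(mesh))]
        mesh = apply_refinements(mesh, decisions)
    return mesh


# Toy stand-ins (assumptions): the mesh is a list of element sizes.
def toy_coarse(mesh):
    return list(mesh)

def toy_fine(mesh):
    return [h / 2 for h in mesh]

def toy_error(u_coarse, u_fine):
    return abs(u_coarse[0] - u_fine[0])  # only the leftmost element contributes

def toy_select(k, mesh, u_coarse, u_fine):
    return "split" if k == 0 else "keep"

def toy_apply(mesh, decisions):
    refined = []
    for h, decision in zip(mesh, decisions):
        refined.extend([h / 2, h / 2] if decision == "split" else [h])
    return refined


final_mesh = self_adaptive_loop([1.0, 1.0], toy_coarse, toy_fine, toy_error,
                                toy_select, toy_apply, tolerance=0.1)
print(final_mesh)  # [0.125, 0.125, 0.25, 0.5, 1.0]
```

Keeping `select_refinement` as a parameter makes it possible to swap the deterministic selection for a learned one without touching the rest of the loop.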

Dataset. We propose the following samples to train the DNN:

Input variables: coarse mesh solution \(u_{hp} \in V_{hp}\) for element K, the element sizes and coordinates, the norm of the fine mesh solution over element K, and the maximum norm of the fine mesh solution over elements.

Output variables: Optimal refinement \(V_{opt}^K\) for element K.

Fig. 2. The meshes provided by the deterministic hp-FEM algorithm (left panel) and by the deep learning-driven hp-FEM algorithm (right panel) for the original L-shape domain. Different colors denote different polynomial orders of approximation on element edges and interiors. The sequence of hp-refined meshes is shown at zoom 1\(\times \), 1000\(\times \), and 100000\(\times \) towards the center.

Fig. 3. Left panel: The comparison of the deterministic and DNN hp-FEM on the original L-shape domain. Right panel: The sizes (horizontal h1/vertical h2 directions), from \(10^{-2}\) (right) down to \(10^{-8}\) (left), of the elements where the MLP network made incorrect decisions during verification.

Fig. 4. Left panel: The execution times of the parts of the self-adaptive hp-FEM algorithm. Right panel: The refinements generated by the DNN for a distorted mesh.

We construct the dataset by executing the deterministic Algorithm 1 for the model L-shape domain problem. We perform 50 iterations of the hp-adaptivity, generating over 10,000 deterministic element refinements, resulting in 10,000 samples. We repeat this operation for boundary conditions (4) rotated by the following angles: 10, 20, 30, 40, 50, 60, 70, 80, and 90\(^\circ \). Each rotation changes the solution and the samples. Together with the unrotated case, we obtain a total of 100,000 samples. We randomly select 90% of the samples for training and use the remaining 10% as a test set. We further sample the training data and use 10% of it as a validation set. After one-hot encoding the categorical variables, each sample is represented by a 136-dimensional vector. Since the deterministic algorithm makes some h refinement decisions (nref) much more often than others for the L-shape domain problem, the dataset is imbalanced. To mitigate this, we apply oversampling of the underrepresented nref classes.

DNN Architecture. We use a feed-forward DNN [1] with 12 fully-connected layers. After 8 layers, the network splits into 6 branches of 4 layers each: the first branch decides on the optimal nref parameter (h refinement), and the remaining branches decide on modifications of the polynomial orders (p refinement). Experiments have shown that expanding the network further makes it prone to overfitting [8]. Splitting the network into branches ensures sufficient parameter freedom for each variable. This approach also simplifies the model: there is no need to train a separate DNN for each variable. Since all possible decisions are encoded as categorical variables, we use cross-entropy as the loss function. We encode the input data as a 136-dimensional normalized vector, as detailed in Table 1. We assume that the polynomial degree will not exceed \(n=11\).
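
A branched classifier of this kind produces one categorical output per branch. The sketch below shows one common way to combine them, assuming that the per-branch cross-entropy terms are simply summed and that each decision is the arg-max class of its branch. The head names (`nref`, `p1`), the class counts, and the summation itself are assumptions for illustration; the source only states that cross-entropy is the loss function.

```python
import math
from typing import Dict, List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over one head's logits."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits: List[float], target: int) -> float:
    """Negative log-probability assigned to the target class."""
    return -math.log(softmax(logits)[target])


def multi_head_loss(head_logits: Dict[str, List[float]],
                    targets: Dict[str, int]) -> float:
    """Sum of per-head cross-entropy terms (summation is an assumption)."""
    return sum(cross_entropy(head_logits[name], targets[name])
               for name in head_logits)


def decode(head_logits: Dict[str, List[float]]) -> Dict[str, int]:
    """Pick the arg-max class independently for every head."""
    return {name: max(range(len(z)), key=z.__getitem__)
            for name, z in head_logits.items()}


# Two hypothetical heads with made-up class counts.
head_logits = {"nref": [2.0, 0.0, 0.0, 0.0], "p1": [0.0, 0.0, 0.0]}
targets = {"nref": 0, "p1": 1}

loss = multi_head_loss(head_logits, targets)
decisions = decode(head_logits)
print(loss, decisions)
```

With six branches, the same functions would apply unchanged to one `nref` head and five polynomial-order heads.
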
We train the network for up to 200 epochs with validation loss-based early stopping on an Nvidia Tesla V100 GPGPU with 640 tensor cores, available in the ACK Cyfronet PROMETHEUS cluster [2]. To minimize the loss function, we use the Adam optimizer [7], with the learning rate set to \(10e^{-3}\). For regularization, we apply a kernel L2 penalty throughout the training and dropout [10] with probability 0.5. The network converges after approximately 110 epochs.
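
Validation loss-based early stopping, as used in the training described above, can be sketched in a framework-independent way. The `patience` value and the two callables are hypothetical; the source states only the 200-epoch cap and the use of validation loss as the stopping signal.

```python
def train_with_early_stopping(train_epoch, validation_loss,
                              max_epochs=200, patience=10):
    """Stop when the validation loss has not improved for `patience` epochs.

    Returns (best_epoch, best_loss). `train_epoch` and `validation_loss`
    are placeholders for the actual training and evaluation steps.
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch in range(max_epochs):
        train_epoch(epoch)
        loss = validation_loss(epoch)
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            break
    return best_epoch, best_loss


# Toy validation curve (assumption): improves until epoch 5, then degrades.
epochs_run = []
result = train_with_early_stopping(
    train_epoch=epochs_run.append,
    validation_loss=lambda epoch: (epoch - 5) ** 2,
    patience=3,
)
print(result, len(epochs_run))  # (5, 0) 9
```

In practice, the model weights from the best epoch would also be saved and restored after the loop stops.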

Table 1. Dimensionality of specific input features to the DNN, encoded in a single 136-dimensional vector. Polynomial coefficients that do not exist in a given polynomial order are always 0.

DNN Performance. The network achieved over 92% accuracy on the test set. We run four tests to assess whether such a network can be used in the hp-FEM.

First Numerical Experiment. The first test reproduces the deterministic Algorithm 2 for the original L-shape domain problem; it is presented in Figs. 1 and 2 and in the left panel of Fig. 3. Both the deterministic and the DNN-driven algorithms provide exponential convergence. The verification phase shows that the DNN makes up to 50% incorrect decisions when the element sizes go down to \(10^{-7}\) and below; see the right panel in Fig. 3. Thus, at a zoom of 100,000 times, we see some differences in Fig. 2. Despite that, the algorithm still converges exponentially.

Second Numerical Experiment. We run the self-adaptive hp-FEM algorithm and provide zeros as the coarse mesh solution degrees of freedom. We get the same convergence. This second test shows that the DNN is not sensitive to the coarse mesh solution and that the norm of the fine mesh solution, the maximum norm, and the coordinates and dimensions of the elements are enough to make proper decisions. The DNN looks at the norms of the fine mesh solution on the given and neighboring elements and, based on these data, decides whether the element is to be broken and how it should be broken. Thus, we can replace Algorithm 2 and the coarse mesh solution phase with the DNN. The left panel in Fig. 4 presents the execution times of particular parts of the hp-FEM algorithm. Removing the coarse mesh solution phase and replacing Algorithm 2 with the DNN saves up to 50% of the execution time.
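
The zero-input test described above amounts to masking one block of the feature vector before inference. A minimal sketch follows, assuming the coarse mesh degrees of freedom occupy a contiguous slice of the input; the slice position and the helper name are hypothetical, since the actual layout is defined by Table 1.

```python
from typing import List


def zero_coarse_features(sample: List[float], coarse_slice: slice) -> List[float]:
    """Return a copy of the sample with the coarse-solution block set to zero."""
    masked = list(sample)
    for index in range(*coarse_slice.indices(len(masked))):
        masked[index] = 0.0
    return masked


# Toy 5-dimensional sample (assumption): the first three entries play the
# role of coarse mesh degrees of freedom, the rest are norms and geometry.
sample = [0.7, -0.2, 0.4, 1.5, 0.25]
masked = zero_coarse_features(sample, slice(0, 3))
print(masked)  # [0.0, 0.0, 0.0, 1.5, 0.25]
```

Comparing the network's decisions on `sample` and `masked` over a verification set is one way to quantify how much the predictions depend on the coarse mesh solution.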

Fig. 5. The convergence of the deterministic and DNN hp-FEM algorithms for the L-shape domain with b.c. rotated by 45\(^\circ \). The meshes produced by the deterministic hp-FEM algorithm and by the DNN-driven hp-FEM algorithm for the L-shape domain with b.c. rotated by \(45^{\circ }\), at zoom \(10^5\times \).

Third Numerical Experiment. The third test, illustrated in Fig. 5, concerns the L-shape domain problem with boundary conditions rotated by 45\(^\circ \) (no samples for this case were provided in the training set). The DNN hp-FEM also provides exponential convergence in this case.

Fourth Numerical Experiment. The last test, illustrated in the right panel in Fig. 4, concerns a randomly disturbed mesh, different from those in the training set. The DNN captures both the top and bottom singularities. It produces hp refinements towards the bottom singularity and p refinements towards the top singularity. The resulting accuracy was 1% relative error after ten iterations.

3 Conclusions

We replaced the algorithm selecting optimal refinements in the self-adaptive hp-FEM with a deep neural network. We obtained over 92% correct answers, the same accuracy of the final mesh, and exponential convergence of the mesh refinement algorithm. An interesting observation is that the DNN requires only the coordinates of elements (to recognize the adjacency between elements), the dimensions of elements (to recognize the refinement level), the \(H^1\) norm of the solution over the element, and the maximum norm of the solutions over elements. By "looking" at the norms over adjacent elements, the DNN recognizes the proper p-refinement of the element with 92% accuracy. Replacing the coarse mesh solver (line 2 in Algorithm 1) and the optimal refinement selection (Algorithm 2) with the DNN allows for a 50% reduction of the computational time.

The DNN used is available at

http://home.agh.edu.pl/paszynsk/dnn_hp2d/dnn_hp2d.tar.gz