Towards p-Adaptive Spectral/hp Element Methods for Modelling Industrial Flows
Abstract
There is an increasing requirement from both academia and industry for high-fidelity flow simulations that are able to accurately capture complicated and transient flow dynamics in complex geometries. Coupled with the growing availability of high-performance, highly parallel computing resources, there is therefore a demand for scalable numerical methods and corresponding software frameworks which can deliver the next generation of complex and detailed fluid simulations to scientists and engineers in an efficient way. In this article we discuss recent and upcoming advances in the use of the spectral/hp element method for addressing these modelling challenges. To use these methods efficiently for such applications, it is critical that computational resolution is placed in the regions of the flow where it is needed most, which is often not known a priori. We propose the use of spatially and temporally varying polynomial order, coupled with appropriate error estimators, as key requirements in permitting these methods to achieve computationally efficient high-fidelity solutions to complex flow problems in the fluid dynamics community.
1 Introduction
Computational modelling is now regularly used in the fluid dynamics community, giving insight into flow problems where experimentation is too difficult, impractical or costly to realise. The complex geometries and time constraints involved in modern industrial studies imply that, to date, most numerical simulations are restricted to being steady in time. This limits their capabilities, particularly when the problem of interest involves fundamentally unsteady flow dynamics, such as vortex shedding. However, with the wider availability and reducing cost of large-scale computing power, academic and industrial fluid dynamicists are increasingly looking to perform finely detailed unsteady simulations. These high-fidelity simulations will allow us to obtain deeper insight into many challenging engineering problems, where steady-state solvers struggle to capture the relevant unsteady flow structures.
One of the main challenges in conducting such simulations is that the complex geometries that are a natural consequence of studying industrial problems will inherently generate flow structures across a large range of time and length scales. From a practical perspective, it becomes difficult or impossible to predict where numerical resolution is required in the computational domain before the simulation is run in order to accurately resolve the flow. Since uniform refinement across very large domains is computationally prohibitive, the community is turning to adaptive methods, where resolution is dynamically adjusted within the domain as a function of time, in order to overcome this issue.
The spectral/hp element method [11], in which an unstructured elemental decomposition capable of resolving complex geometries is equipped with high-order polynomial bases to give routes to convergence in terms of both element size h and polynomial order p, has been used in academic applications for several years. However, it is now emerging as one of the enabling technologies for high-fidelity industrial simulations. From a numerical perspective, these methods offer attractive properties such as low diffusion and dispersion errors, meaning that for smooth solutions fewer degrees of freedom are required to attain the same accuracy as compared to traditional low-order methods [19]. From a computational perspective, the use of a higher polynomial order leads to compact data structures and enables a balance between the computational and memory intensiveness of the method. This is increasingly becoming a key factor in the efficient use of modern many-core hardware.
On the whole, the development of adaptive methods has been mostly focused around h-adaption, where the elements are refined or coarsened in order to adjust the numerical resolution. The use of p-adaption, on the other hand, has received far less attention. Most of the work in this area has focused on hp-adaption, which has been an area of significant attention with various works investigating these techniques for elliptic problems [4, 8, 24] that are not necessarily immediately applicable for fluid-based problems. However, p-adaption has been shown to be a viable technique in a study by Li and Jameson [14], where adaption in p was shown to provide the highest accuracy with respect to the numerical resolution and computing time.
However, high-order methods have presented inherent difficulties that have only started to be overcome in the last few years. These challenges are both mathematical and practical. On the theoretical side, there has been a need to overcome stability issues arising due to aliasing of the solution [17] and timestep size [6]; investigate the generation of curved meshes which conform to the boundary of complex three-dimensional domains [20, 22]; and investigate parallel scaling of these methods [27]. On the practical side, the mathematical complexity of the methods has necessitated the development of software frameworks [3] to improve accessibility to academia and industry. These developments now mean that these high-order methods are being applied in very high Reynolds number flows that are of significant interest to, for example, the aerodynamics and aeronautics community [15].
In this article, we will discuss some practicalities of implementing spectral/hp element solvers which use a spatially variable polynomial order across the computational domain. We do this both in the context of incompressible and compressible flow. For the former we use a continuous Galerkin approach to solving a semi-implicit form of the incompressible Navier-Stokes equations; for the latter we use a discontinuous Galerkin projection with an explicit time-stepping method. Section 2 discusses the formulation of these methods and how variable polynomial orders are handled in each case. Section 3 illustrates the capabilities of adaptivity in p, before concluding with a brief outlook in Sect. 4.
2 Formulation
This section begins with a brief discussion of the formulation of the spectral/hp element method, the basis being used to represent elemental expansions and how this relates to discontinuous and continuous formulations. We then describe how this formulation can be adapted to allow variable polynomial order across the computational domain, provide implementation details and give an overview of the techniques required to make this approach computationally tractable.
2.1 Domain Discretisation
2.1.1 Choice of Basis
2.2 Implementation Details
2.2.1 Continuous Galerkin Formulation
The key operation of the CG formulation is assembly, wherein local elemental contributions are gathered to impose the C^{0}-continuity of the underlying function space, as depicted visually in Fig. 1. The assembly operation associates a vector of concatenated local elemental coefficients \(\mathbf{\hat{u}}_{l} = (\mathbf{\hat{u}}^{1},\ldots,\mathbf{\hat{u}}^{N_{\text{el}}})\) to their global counterparts \(\mathbf{\hat{u}}_{g}\) through an injective map. Here, we note that each \(\mathbf{\hat{u}}^{e}\) corresponds to the vector of local coefficients in Eq. 1. The coefficients in \(\mathbf{\hat{u}}_{g}\) describe the contribution to the solution of the modes which span D^{CG}(Ω). Mathematically, this operation is expressed through a sparse matrix-vector operation \(\mathbf{\hat{u}}_{g} = \mathbf{A}\mathbf{\hat{u}}_{l}\). For a uniform polynomial order mesh, the columns of A are nonzero where local degrees of freedom meet to form global degrees of freedom, and zero otherwise, so that the valency of a global degree of freedom i is defined as the number of nonzero columns in the i-th row of A. In practice, the high sparsity of A means that we use array indirection to implement the action of A without explicitly constructing it, as defined in Algorithm 1.
Algorithm 1 Continuous C^{0} assembly operation
- map[e][i] stores the index of the global degree of freedom corresponding to mode i of element Ω^{e};
- sign[e][i] stores either 1 or −1 to align modes that are of odd polynomial orders such that the basis remains continuous (see [11] for more details).
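The gather described by map and sign can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the framework's own routine: the function names `assemble` and `valency`, the ragged-list storage of per-element arrays, and the two-segment example mesh with its local mode ordering are all hypothetical. It applies the action of A by array indirection, without forming the matrix.

```python
import numpy as np

def assemble(local_coeffs, dof_map, sign, n_global):
    """Gather local elemental coefficients into a global vector (u_g = A u_l)."""
    u_g = np.zeros(n_global)
    for u_e, map_e, sign_e in zip(local_coeffs, dof_map, sign):
        # np.add.at accumulates correctly even if an index repeats in map_e.
        np.add.at(u_g, map_e, sign_e * u_e)
    return u_g

def valency(dof_map, n_global):
    """Count how many local modes contribute to each global degree of freedom."""
    counts = np.zeros(n_global, dtype=int)
    for map_e in dof_map:
        np.add.at(counts, map_e, 1)
    return counts

# Hypothetical mesh: two 1D segments sharing one vertex.
# Element 0 has P = 2 (two vertex modes, one interior mode); element 1 has P = 1.
# Assumed local ordering: [left vertex, right vertex, interior modes...].
dof_map = [np.array([0, 1, 3]), np.array([1, 2])]
sign = [np.array([1.0, 1.0, 1.0]), np.array([1.0, 1.0])]
local_coeffs = [np.array([1.0, 2.0, 0.5]), np.array([3.0, 4.0])]

u_g = assemble(local_coeffs, dof_map, sign, n_global=4)
val = valency(dof_map, n_global=4)
```

Because each element carries its own map and sign arrays, elements of different polynomial order need no special treatment in the loop itself; only the construction of the maps changes.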
We note that in practice, even at moderately low polynomial orders, L + λM is rarely explicitly constructed. The use of the mapping above allows us to apply the action of this operator and leverage the computational optimisations possible due to the rich structure of the elemental matrices.
2.2.2 Discontinuous Galerkin Formulation
To project the trace contributions back into the volume consistently, on the higher side we may simply copy the coefficients directly from Γ to Γ_{2}. For Γ_{1}, we have a higher-degree polynomial that must be incorporated into a lower-degree edge. To do this in a conservative fashion, we perform a change of basis of the elemental coefficients from the hierarchical basis of Eq. (4) onto an orthogonal space of Legendre polynomials. We then apply a low-pass filter, by zeroing the unwanted high-frequency polynomials. This is necessary since the basis functions given in Eq. (4) are not orthogonal and performing a filtering in this space will alter the mean flux, leading to a loss of conservation. Finally, we perform a change of basis back to the lower-order hierarchical basis.
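The reason for filtering in Legendre space can be illustrated on the reference edge [-1, 1]. The sketch below is an assumption-laden stand-in: it uses the monomial basis in place of the hierarchical basis of Eq. (4), since both are non-orthogonal, and the helper names are hypothetical. Truncating the Legendre coefficients leaves the zeroth mode, and hence the mean, untouched, whereas truncating the non-orthogonal coefficients directly changes the mean.

```python
import numpy as np
from numpy.polynomial import legendre as leg

def mean_on_reference_edge(mono):
    """Mean over [-1, 1] of the polynomial sum_k mono[k] * x**k."""
    mono = np.asarray(mono, dtype=float)
    k = np.arange(len(mono))
    # The integral of x**k over [-1, 1] is 2/(k+1) for even k and 0 for odd k.
    integrals = np.where(k % 2 == 0, 2.0 / (k + 1), 0.0)
    return float(mono @ integrals) / 2.0

def lowpass_via_legendre(mono, p_low):
    """Reduce a polynomial to degree p_low by truncating its Legendre modes."""
    c = leg.poly2leg(mono)       # change of basis: monomial -> Legendre
    c_low = c[: p_low + 1]       # low-pass filter: drop modes above p_low
    return leg.leg2poly(c_low)   # change of basis back: Legendre -> monomial

# Degree-4 trace data to be represented on a degree-2 edge (made-up values).
mono = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

filtered = lowpass_via_legendre(mono, p_low=2)
naive = mono[:3]  # truncation performed directly in the non-orthogonal basis

mean_full = mean_on_reference_edge(mono)
mean_filtered = mean_on_reference_edge(filtered)
mean_naive = mean_on_reference_edge(naive)
```

Here `mean_filtered` agrees with `mean_full` to rounding error, while `mean_naive` does not, which is the loss of conservation the text refers to.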
2.3 Efficiency Across a Range of Polynomial Orders
As a final note on implementation considerations, a clear observation that can be made when using a variable polynomial order is that the sizes of elemental matrices can vary drastically, particularly when considering three-dimensional elements. This is important since operator evaluations, such as the Laplacian matrix of Eq. (6), form the bulk of the computational cost of the spectral/hp element method, either in computing quantities such as the inner product or in solving a system of equations in an iterative fashion. Efficient evaluation of these operators across a wide range of polynomial orders is therefore an important component to the efficacy of a variable polynomial order simulation.
The underlying mathematical formulation and tensor-product form of the basis admits a number of different implementation choices for the evaluation of these operators, each of which exhibits differing performance across polynomial orders and choice of hardware [1, 2, 16, 26]. Furthermore, the results of [21] suggest that for modern hardware, where memory bandwidth is a valuable commodity, elemental operations should be amalgamated wherever possible to minimise data transfer and efficiently utilise the memory hierarchy. In the context of variable polynomial orders, the amalgamation of elements that are of the same type and polynomial order, combined with an appropriate implementation strategy as described in [21], should be performed to maximise the computational performance of the method.
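One way to picture this amalgamation is to bucket elements by (type, polynomial order) and apply a shared operator to each bucket with a single matrix-matrix product. The sketch below is illustrative only and is not the strategy of [21]: the tuple layout of `elements`, the `operators` dictionary and both function names are hypothetical, and it assumes every element in a group shares one reference-element operator, so geometric factors would have to be handled separately.

```python
import numpy as np
from collections import defaultdict

def group_by_type_and_order(elements):
    """Map (shape, order) -> list of indices of elements with that key."""
    groups = defaultdict(list)
    for idx, (shape, order, _) in enumerate(elements):
        groups[(shape, order)].append(idx)
    return dict(groups)

def apply_grouped(elements, operators):
    """Apply operators[(shape, order)] to all elements of a group at once."""
    results = [None] * len(elements)
    for key, idxs in group_by_type_and_order(elements).items():
        # Stack coefficient vectors as columns: the group becomes one
        # matrix-matrix product instead of many matrix-vector products.
        U = np.column_stack([elements[i][2] for i in idxs])
        V = operators[key] @ U
        for col, i in enumerate(idxs):
            results[i] = V[:, col]
    return results

# Hypothetical data: three quadrilaterals, two at P = 2 and one at P = 3,
# with (P + 1)**2 coefficients each.
elements = [
    ("quad", 2, np.ones(9)),
    ("quad", 3, np.ones(16)),
    ("quad", 2, 2.0 * np.ones(9)),
]
operators = {("quad", 2): 2.0 * np.eye(9), ("quad", 3): np.eye(16)}

groups = group_by_type_and_order(elements)
results = apply_grouped(elements, operators)
```

With a variable polynomial order the number of groups grows, so each group holds fewer elements; this is the trade-off that makes the choice of evaluation strategy per group relevant.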
3 Results
This section gives a brief overview of results achieved to date using adaption in the polynomial order p with the compressible and incompressible formulations, focusing on error indicators and how this affects the ability to capture the underlying flow physics, and on the computational cost of these approaches.
3.1 Incompressible Flow
1. Advance the equations for n_{steps} time steps.
2. Calculate S_{e} for each element.
3. Modify the polynomial order in each element:
   - if S_{e} ≥ ε_{u} and P < P_{max}, increase P by 1;
   - if S_{e} ≤ ε_{l} and P > P_{min}, decrease P by 1;
   - maintain the same P if neither of the above is true.
4. Project the solution to the new polynomial space.
5. Repeat for n_{runs}.
In the above, ε_{u} is the tolerance above which the polynomial order is increased, ε_{l} ≤ ε_{u} is the tolerance below which the polynomial order is decreased, and P_{min} and P_{max} are the minimum and maximum polynomial orders imposed on the procedure.
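The order-update rule of step 3 can be sketched as a vectorised function over all elements. This is a minimal illustration, assuming the sensor values S_{e} have already been computed elsewhere; the function name `update_orders` and the tolerance values in the example are hypothetical and are not taken from the simulations reported here.

```python
import numpy as np

def update_orders(P, S, eps_u, eps_l, P_min, P_max):
    """Return new per-element polynomial orders from sensor values S."""
    P = np.asarray(P)
    S = np.asarray(S)
    increase = (S >= eps_u) & (P < P_max)
    decrease = (S <= eps_l) & (P > P_min)
    # With eps_l <= eps_u both tests can hold only when S == eps_l == eps_u;
    # in that edge case this sketch lets refinement take precedence.
    decrease &= ~increase
    return P + increase.astype(int) - decrease.astype(int)

# Made-up example: five elements with assumed tolerances and bounds.
P_old = [2, 5, 9, 4, 6]
S = [1e-9, 1e-2, 1e-1, 1e-5, 1e-8]
P_new = update_orders(P_old, S, eps_u=1e-3, eps_l=1e-7, P_min=2, P_max=9)
```

In the example the first element is already at P_{min} and the third at P_{max}, so both keep their order even though their sensor values lie outside the tolerance band.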
It is important to note that changing the polynomial order during the solution is costly, due to the need to assemble and decompose the linear systems for the implicit part of the method. Therefore, the choice of n_{steps} plays a key role in obtaining an efficient solution. A lower value of n_{steps} will lead to the refinement step being performed more frequently, at the expense of a higher average computational cost per timestep.
Figure 5 shows the distribution of polynomial order obtained using n_{steps} = 4000, P_{min} = 2 and P_{max} = 9. It is clear that the boundary layers and the regions of turbulent separated flow are represented by high-order polynomials, while lower orders are used in regions of laminar flow far from the wing. In this case, the average number of degrees of freedom per element is approximately 49, which is equivalent to the value for a constant P = 6 simulation.
Comparison of the computational cost of the adaptive-order case of Fig. 5 with uniform polynomial order cases and with a variable-order case without the adaptive procedure
Case  Cost  \(1/\text{Cost}\)
P = 5  0.60  1.66
P = 6  0.72  1.39
P = 7  1.08  0.93
P = 8  1.19  0.84
P = 9  1.53  0.65
Variable order (fixed)  0.95  1.05
Adaptive order  1.00  1.00
3.2 Compressible Flow Using Explicit Timestepping
The accurate solution of compressible flow is an important topic in a number of application areas. For instance, the aeronautical community is concerned with accurately predicting the lift and drag coefficients of different wing configurations whilst keeping the computational cost low. This allows a wide range of geometries to be considered during the design lifecycle and provides the basis for aerodynamic shape optimization. In these applications, the key to accurately predicting lift and drag lies in determining the regions of the domain which influence these coefficients the most. Adaptive methods, combined with appropriate error estimators, are one route to producing fast, accurate and reliable results. This section describes progress made in [5], where a goal-based error estimator based on an adjoint problem derived from the underlying equations has been applied together with the p-adaptive techniques described in the previous section. The error estimator derives an adjoint problem from a coarsely resolved base flow, the solution of which determines the areas of the domain which have the greatest sensitivity to the lift and drag coefficients. Although this technique has been explored previously, a review of which can be found in [7], this has mostly focused around h-adaptivity, where the element size is refined or coarsened, and p-adaptivity at low values of p. The purpose of this work has been to consider a wider range of polynomial orders for this problem.
3.2.1 Governing Equations
3.2.2 Adaptive Procedure

1. Run a low-order simulation to obtain a steady flow field.
2. Use this flow field to solve a goal-based adjoint problem by considering an infinitesimal perturbation to the flow field.
3. Compute a distribution of the polynomial order according to a goal-based error estimator based on the adjoint solution.
4. Using the techniques of Sect. 2, perform the simulation again to compute a solution with a lower error of the lift or drag.

- a high-resolution case at P = 9 is used as a reference solution, for which the obtained x-momentum can be seen in Fig. 6a;
- uniform polynomial order simulations are performed at P = 3, 5 and 7;
- variable polynomial order simulations are performed with 3 ≤ P ≤ P_{max}, with P_{max} ranging from 5 to 9.
Summary of normalised CPU cost and error ε in drag c_{d} for various constant and spatially variable polynomial orders, compared to a uniform simulation at P = 9
Case  Cost  ε
P = 3  0.28  1.2 × 10^{−3}
P = 5  0.29  1.57 × 10^{−4}
P = 7  0.64  2.69 × 10^{−5}
P = 9  1.0  –
3 ≤ P ≤ 5  0.31  3.19 × 10^{−4}
3 ≤ P ≤ 6  0.32  7.44 × 10^{−5}
3 ≤ P ≤ 7  0.34  3.47 × 10^{−5}
3 ≤ P ≤ 8  0.36  2.71 × 10^{−5}
3 ≤ P ≤ 9  0.45  5.63 × 10^{−6}
4 Conclusions
In this article we have discussed the use and implementation of adaptive polynomial order in the spectral/hp element method. The canonical flows considered here show the clear benefits of this adaptive process, bringing a reduction in both the computational cost and the number of degrees of freedom required to resolve a given problem. However, there are still a number of challenges that need to be addressed before these methods can be brought to bear on extremely large-scale problems. Numerically, future work should focus on the development of more robust error estimators, particularly in the context of unsteady simulations, perhaps based around an unsteady formulation of the adjoint approach used for compressible simulations in Sect. 3. We note that this is inherently more expensive than the subcell estimator; however, it will give a better indication of error throughout the domain. More sophisticated techniques also need to be developed for parallel simulations. In particular, the efficient preconditioning of these systems remains an open problem, and very large-scale simulations require the development of adaptive load-balancing techniques that can be used to redistribute the workload evenly across processors as the polynomial order changes. Finally, when dealing with complex geometries, techniques need to be developed to couple the change in polynomial order to the treatment of curvilinear surfaces and the elements that connect to them, in order to preserve the accurate representation of the underlying geometry.
Notes
Acknowledgements
D.M. acknowledges support from the EU Horizon 2020 project ExaFLOW (grant 671571) and the PRISM project under EPSRC grant EP/L000407/1. D.S. is grateful for the support received from CNPq (grant 231787/2013–8) and FAPESP (grant 2012/23493-0). D.E. acknowledges support from the EU ITN project ANADE (grant PITN-GA-289428). S.J.S. acknowledges Royal Academy of Engineering support under their research chair scheme. R.M.K. acknowledges support from the US Army Research Office under W911NF-15-1-0222 (overseen by Dr. M. Coyle). Computing resources were supported by the UK Turbulence Consortium (EPSRC grant EP/L000261/1) and the Imperial College HPC service.
References
1. C. Cantwell, S. Sherwin, R. Kirby, P. Kelly, From h to p efficiently: strategy selection for operator evaluation on hexahedral and tetrahedral elements. Comput. Fluids 43(1), 23–28 (2011)
2. C.D. Cantwell, S.J. Sherwin, R.M. Kirby, P.H.J. Kelly, From h to p efficiently: selecting the optimal spectral/hp discretisation in three dimensions. Math. Mod. Nat. Phenom. 6, 84–96 (2011)
3. C.D. Cantwell, D. Moxey, A. Comerford, A. Bolis, G. Rocco, G. Mengaldo, D. de Grazia, S. Yakovlev, J.E. Lombard, D. Ekelschot, B. Jordi, H. Xu, Y. Mohamied, C. Eskilsson, B. Nelson, P. Vos, C. Biotto, R.M. Kirby, S.J. Sherwin, Nektar++: an open-source spectral/hp element framework. Comput. Phys. Commun. 192, 205–219 (2015)
4. L. Demkowicz, W. Rachowicz, P. Devloo, A fully automatic hp-adaptivity. J. Sci. Comput. 17(1), 117–142 (2002)
5. D. Ekelschot, D. Moxey, S.J. Sherwin, J. Peiró, A p-adaptation method for compressible flow problems using a goal-based error estimator. Comput. Struct. 181, 55–69 (2017)
6. E. Ferrer, D. Moxey, S.J. Sherwin, R.H.J. Willden, Stability of projection methods for incompressible flows using high order pressure-velocity pairs of same degree: continuous and discontinuous Galerkin formulations. Commun. Comput. Phys. 16(3), 817–840 (2014)
7. K.J. Fidkowski, D.L. Darmofal, Review of output-based error estimation and mesh adaptation in computational fluid dynamics. AIAA J. 49(4), 673–694 (2011)
8. G. Giorgiani, S. Fernández-Méndez, A. Huerta, Goal-oriented hp-adaptivity for elliptic problems. Int. J. Numer. Methods Fluids 72(1), 1244–1262 (2013)
9. D. de Grazia, G. Mengaldo, D. Moxey, P. Vincent, S.J. Sherwin, Connections between the discontinuous Galerkin method and high-order flux reconstruction schemes. Int. J. Numer. Methods Fluids 75(12), 860–877 (2014)
10. H.T. Huynh, A flux reconstruction approach to high-order schemes including discontinuous Galerkin methods, in: 18th AIAA Computational Fluid Dynamics Conference, p. 4079 (2007)
11. G. Karniadakis, S. Sherwin, Spectral/hp Element Methods for Computational Fluid Dynamics, 2nd edn. (Oxford University Press, Oxford, 2005)
12. G.E. Karniadakis, Spectral element-Fourier methods for incompressible turbulent flows. Comput. Methods Appl. Mech. Eng. 80(1–3), 367–380 (1990)
13. G.E. Karniadakis, M. Israeli, S.A. Orszag, High-order splitting methods for the incompressible Navier-Stokes equations. J. Comput. Phys. 97(2), 414–443 (1991)
14. L.Y. Li, Y. Allaneau, A. Jameson, Comparison of h- and p-adaptations for spectral difference methods, in: 40th AIAA Fluid Dynamics Conference and Exhibit (2010)
15. J.E.W. Lombard, D. Moxey, S.J. Sherwin, J.F.A. Hoessler, S. Dhandapani, M.J. Taylor, Implicit large-eddy simulation of a wingtip vortex. AIAA J. 54(2), 506–518 (2016)
16. G. Markall, A. Slemmer, D. Ham, P. Kelly, C. Cantwell, S. Sherwin, Finite element assembly strategies on multi-core and many-core architectures. Int. J. Numer. Methods Fluids 71(1), 80–97 (2013)
17. G. Mengaldo, D. de Grazia, D. Moxey, P.E. Vincent, S.J. Sherwin, Dealiasing techniques for high-order spectral element methods on regular and irregular grids. J. Comput. Phys. 299, 56–81 (2015)
18. G. Mengaldo, D. de Grazia, P.E. Vincent, S.J. Sherwin, On the connections between discontinuous Galerkin and flux reconstruction schemes: extension to curvilinear meshes. J. Sci. Comput. 67(3), 1272–1292 (2016)
19. R.C. Moura, G. Mengaldo, J. Peiró, S.J. Sherwin, On the eddy-resolving capability of high-order discontinuous Galerkin approaches to implicit LES/under-resolved DNS of Euler turbulence. J. Comput. Phys. 330, 615–623 (2017)
20. D. Moxey, M.D. Green, S.J. Sherwin, J. Peiró, An isoparametric approach to high-order curvilinear boundary-layer meshing. Comput. Methods Appl. Mech. Eng. 283, 636–650 (2015)
21. D. Moxey, C.D. Cantwell, R.M. Kirby, S.J. Sherwin, Optimizing the performance of the spectral/hp element method with collective linear algebra operations. Comput. Methods Appl. Mech. Eng. 310, 628–645 (2016)
22. D. Moxey, D. Ekelschot, Ü. Keskin, S.J. Sherwin, J. Peiró, High-order curvilinear meshing using a thermo-elastic analogy. Comput. Aided Des. 72, 130–139 (2016)
23. P.O. Persson, J. Peraire, Sub-cell shock capturing for discontinuous Galerkin methods. AIAA paper 112 (2006)
24. P. Solín, L. Demkowicz, Goal-oriented hp-adaptivity for elliptic problems. Comput. Methods Appl. Mech. Eng. 193(1), 449–468 (2004)
25. P.E. Vincent, P. Castonguay, A. Jameson, A new class of high-order energy stable flux reconstruction schemes. J. Sci. Comput. 47(1), 50–72 (2011)
26. P.E. Vos, S.J. Sherwin, R.M. Kirby, From h to p efficiently: implementing finite and spectral/hp element methods to achieve optimal performance for low- and high-order discretisations. J. Comput. Phys. 229(13), 5161–5181 (2010)
27. S. Yakovlev, D. Moxey, S.J. Sherwin, R.M. Kirby, To CG or to HDG: a comparative study in 3D. J. Sci. Comput. 67(1), 192–220 (2016)