Introduction

Shell finite elements (FE) are standard options to model two-dimensional (2D) curved structures. In commercial codes, shell FE have the assumptions of the classical theories [1,2,3] leading to up to six degrees of freedom (DOF) per node. Such assumptions may be too restrictive in the case of composite structures in which the high transverse deformability and the transverse anisotropy require the proper modeling of shear and normal transverse stresses, and variations of the displacement field at the interface between two layers with different mechanical properties, i.e., the Zig-Zag effect [4]. 3D FE can incorporate such effects but can lead to prohibitive computational costs due to severe aspect ratio constraints. 2D FE remain computationally more efficient and attractive and, over the years, many strategies emerged to extend their capabilities via, for instance, the use of higher-order polynomial thickness expansions leading to increasing DOF per node [5]. This paper presents a new methodology to assess shell FE for linear static analyses of composites, and the following literature survey focuses on this specific area. More comprehensive reviews are in [6,7,8,9].

Concerning the solution schemes, analytical and FE strategies are among the most used. Analytical models received a great deal of interest as they provide very useful exact solutions to, for instance, verify FE models. Such exact solutions can take into account the shear deformability [10,11,12,13,14,15,16] or directly provide 3D solutions [17,18,19,20,21,22,23]. Research on refined shell FE focused on higher-order models [24,25,26], the inclusion of transverse stretching and continuity [27,28,29], and the development of solid-shell elements [30,31,32,33,34,35,36,37]. Regardless of the solution scheme, the most important strategies to enhance the capabilities of shell models are either asymptotic or axiomatic. The former exploit asymptotic expansions of the most relevant parameter, e.g., the thickness ratio, to build models with a priori known accuracy as compared to 3D models [38,39,40,41]. The latter, on the other hand, build models based on assumptions and, usually, fewer assumptions lead to more cumbersome models. The axiomatic approach has followed various directions, starting from the improvement of classical models [42,43,44,45,46,47,48,49]. As mentioned above, the proper modeling of the transverse behavior of composites is decisive, as proved by the efforts of many researchers over the past few years. The focus is on improved modeling of the interlaminar stresses and through-the-thickness continuity [50,51,52,53,54,55], shear correction factors [56], Zig-Zag models [57,58,59,60,61], Layer-Wise (LW) models [62,63,64,65], and mixed formulations [66,67,68] allowing for the a priori modeling of transverse stresses.

Another powerful approach is the Proper Generalized Decomposition (PGD) method [69, 70] in which the construction of the refined model and the solution of the problem take place simultaneously.

From the structural standpoint, the methodology in this paper adopts the Carrera Unified Formulation (CUF), which makes it possible to obtain shell theories of any order without formal changes in the problem matrices [4, 71, 72]. One of the capabilities of CUF is the axiomatic/asymptotic method (AAM) [73, 74] to analyze the relevance of any generalized displacement variable. The systematic use of AAM leads to the definition of the Best Theory Diagram (BTD), i.e., a 2D plot to locate shell models with minimum DOF and maximum accuracy [75, 76]. One of the aims of this paper is to reduce the computational cost of obtaining the BTD via neural networks (NN). Such networks are mathematical models inspired by biological nervous systems and composed of simple computational units interlinked by a system of connections [77] that learn through training on samples. In this paper, CUF FE provides the samples for the supervised learning of multilayer perceptrons, which then evaluate the accuracy of refined shell models without assembling FE matrices or running FE analyses. The use of NN in structural and material simulation is increasing due to their superior computational efficiency [78,79,80]. Recent applications for composites concern the prediction of the elastic properties [81], buckling load [82, 83], failure strength [84, 85], natural frequencies [86,87,88], and geometry optimization [89].

In this paper, the “Finite element formulation” section provides a brief theoretical description of CUF and its FE formulation. The “Best Theory Diagram” section introduces the concept of BTD. The “Neural networks and coding” section describes the use of NN to evaluate the accuracy of a shell model. Results and conclusions are in the “Results” and “Conclusions” sections, respectively.

Finite element formulation

The CUF displacement field for a 2D model is

$$\begin{aligned} {\mathbf {u}}(\alpha , \beta , z)=F_{\tau }(z){\mathbf {u}}_{\tau }(\alpha , \beta )\qquad \tau =1, \dots , M \end{aligned}$$
(1)

The Einstein notation acts on \(\tau \). \({\mathbf {u}}\) is the displacement vector, \(({\hbox {u}}_{x}\; {\hbox {u}}_{y}\; {\hbox {u}}_{z})^T\). \({\hbox {F}}_{\tau }\) are the thickness expansion functions. \({\mathbf {u}}_{\tau }\) is the vector of the generalized unknown displacements. M is the number of expansion terms. A fourth-order model, referred to as N = 4, is

$$\begin{aligned} \begin{aligned}&u_{x}=u_{x_{1}}+z\,u_{x_{2}}+z^{2}\,u_{x_{3}}+z^{3}\,u_{x_{4}}+z^{4}\,u_{x_{5}}\\&u_{y}=u_{y_{1}}+z\,u_{y_{2}}+z^{2}\,u_{y_{3}}+z^{3}\,u_{y_{4}}+z^{4}\,u_{y_{5}}\\&u_{z}=u_{z_{1}}+z\,u_{z_{2}}+z^{2}\,u_{z_{3}}+z^{3}\,u_{z_{4}}+z^{4}\,u_{z_{5}}\\ \end{aligned} \end{aligned}$$
(2)

and has 15 nodal DOF. The order and type of expansion are free parameters; thus, the theory of structure is an input of the analysis. The metric coefficients \({\hbox {H}}^k_\alpha \), \({\hbox {H}}^k_\beta \) and \({\hbox {H}}^k_z\) of the \({\hbox {kth}}\) layer are

$$\begin{aligned} \begin{aligned} H^k_\alpha = A^k (1 + z_k/R^k_\alpha ), \;\;\; H^k_\beta = B^k (1 + z_k/R^k_\beta ), \;\;\; H^k_z = 1\; \end{aligned} \end{aligned}$$
(3)
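As a minimal sketch of Eqs. 1 to 3, the following Python fragment evaluates the thickness expansion at a fixed in-plane point and the metric coefficients of a layer. It assumes Taylor-type functions \(F_{\tau } = z^{\tau -1}\), as in Eq. 2; the function names and argument names (`r_alpha`, `r_beta` for the principal radii, `a`, `b` for the first-fundamental-form coefficients) are illustrative and not part of the original formulation.

```python
def taylor_expansion_functions(order):
    """Return F_tau(z) = z**(tau-1) for tau = 1..order+1 as callables."""
    return [lambda z, p=p: z ** p for p in range(order + 1)]


def displacement(z, u_tau, order=4):
    """Evaluate u(z) = F_tau(z) u_tau (Eq. 1) at a fixed in-plane point.

    u_tau is a list of (u_x, u_y, u_z) triples, one per expansion term.
    """
    functions = taylor_expansion_functions(order)
    if len(u_tau) != len(functions):
        raise ValueError("need one generalized displacement triple per term")
    return tuple(
        sum(f(z) * u[c] for f, u in zip(functions, u_tau)) for c in range(3)
    )


def metric_coefficients(z, r_alpha, r_beta, a=1.0, b=1.0):
    """H_alpha, H_beta, H_z of Eq. 3 for constant radii of curvature."""
    return a * (1.0 + z / r_alpha), b * (1.0 + z / r_beta), 1.0


def nodal_dof(order):
    """Three displacement components times (order + 1) expansion terms."""
    return 3 * (order + 1)
```

With `order=4` the count returned by `nodal_dof` is 15, which matches the number of nodal DOF of the N = 4 model.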
Fig. 1 Shell geometry

\({\hbox {R}}^k_\alpha \) and \({\hbox {R}}^k_\beta \) are the principal radii of the middle surface of the \({\hbox {kth}}\) layer, \({\hbox {A}}^k\) and \({\hbox {B}}^k\) the coefficients of the first fundamental form of \(\Omega _k\), see Fig. 1. This paper focused only on shells with constant radii of curvature with \({\hbox {A}}^k = {\hbox {B}}^k = 1\). The geometrical relations are

$$\begin{aligned} \begin{aligned} {{\varvec{\epsilon }}}^k_p&= \begin{Bmatrix} \epsilon ^k_{\alpha \alpha }, \epsilon ^k_{\beta \beta }, \epsilon ^k_{\alpha \beta } \end{Bmatrix}^T = ({{\varvec{D}}}^k_p + {{\varvec{A}}}^k_p) {{\varvec{u}}}^k \\ {{\varvec{\epsilon }}}^k_n&= \begin{Bmatrix} \epsilon ^k_{\alpha z}, \epsilon ^k_{\beta z}, \epsilon ^k_{zz} \end{Bmatrix}^T = ({{\varvec{D}}}^k_{n\Omega } +{{\varvec{D}}}^k_{nz} - {{\varvec{A}}}^k_n) {{\varvec{u}}}^k \end{aligned} \end{aligned}$$
(4)

where

$$\begin{aligned} {\varvec{D}}^k_p= & {} \left[ \begin{array}{c@{\quad }c@{\quad }c} \frac{\partial _{\alpha }}{H^k_{\alpha }} &{} 0 &{} 0 \\ 0 &{} \frac{\partial _{\beta }}{H^k_{\beta }} &{} 0 \\ \frac{\partial _{\beta }}{H^k_{\beta }} &{} \frac{\partial _{\alpha }}{H^k_{\alpha }} &{} 0 \end{array} \right] \; \quad {\varvec{D}}^k_{n\Omega } = \left[ \begin{array}{c@{\quad }c@{\quad }c} 0 &{} 0 &{} \frac{\partial _{\alpha }}{H^k_{\alpha }} \\ 0 &{} 0 &{} \frac{\partial _{\beta }}{H^k_{\beta }} \\ 0 &{} 0 &{} 0 \end{array} \right] \; \quad {\varvec{D}}^k_{nz} = \left[ \begin{array}{ccc} \partial _z &{} 0 &{} 0 \\ 0 &{} \partial _z &{} 0 \\ 0 &{} 0 &{} \partial _z \end{array} \right] \; \end{aligned}$$
(5)
$$\begin{aligned} {\varvec{A}}^k_{p}= & {} \left[ \begin{array}{c@{\quad }c@{\quad }c} 0 &{} 0 &{} \frac{1}{H^k_{\alpha }R^k_{\alpha }} \\ 0 &{} 0 &{} \frac{1}{H^k_{\beta }R^k_{\beta }} \\ 0 &{} 0 &{} 0 \end{array} \right] \; {\varvec{A}}^k_{n} = \left[ \begin{array}{c@{\quad }c@{\quad }c} \frac{1}{H^k_{\alpha }R^k_{\alpha }} &{} 0 &{} 0 \\ 0 &{} \frac{1}{H^k_{\beta }R^k_{\beta }} &{} 0 \\ 0 &{} 0 &{} 0 \end{array} \right] \; \end{aligned}$$
(6)

The stress–strain relations are

$$\begin{aligned} \begin{aligned} {{\varvec{\sigma }}}_{p}^k&= \begin{Bmatrix} \sigma _{\alpha \alpha }^k, \sigma _{\beta \beta }^k, \sigma _{\alpha \beta }^k \end{Bmatrix}^T = {{\varvec{C}}}_{pp}^k {{\varvec{\epsilon }}}_{p}^k + {{\varvec{C}}}_{pn}^k {{\varvec{\epsilon }}}_{n}^k \\ {{\varvec{\sigma }}}_{n}^k&= \begin{Bmatrix} \sigma _{\alpha z}^k, \sigma _{\beta z}^k, \sigma _{z z}^k \end{Bmatrix}^T = {{\varvec{C}}}_{np}^k {{\varvec{\epsilon }}}_{p}^k + {{\varvec{C}}}_{nn}^k {{\varvec{\epsilon }}}_{n}^k \\ \end{aligned} \end{aligned}$$
(7)

where

$$\begin{aligned} {\begin{matrix} {{\varvec{C}}}_{pp}^k=&{}\left[ \begin{array}{c@{\quad }c@{\quad }c} C_{11}^k &{} C_{12}^k &{} C_{16}^k \\ C_{12}^k &{} C_{22}^k &{} C_{26}^k \\ C_{16}^k &{} C_{26}^k &{} C_{66}^k \end{array} \right] \qquad {{\varvec{C}}}_{pn}^k=\left[ \begin{array}{c@{\quad }c@{\quad }c} 0 &{} 0 &{} C_{13}^k\\ 0 &{} 0 &{} C_{23}^k\\ 0 &{} 0 &{} C_{36}^k \end{array} \right] \\ {{\varvec{C}}}_{np}^k= &{}\left[ \begin{array}{c@{\quad }c@{\quad }c} 0 &{} 0 &{} 0 \\ 0 &{} 0&{} 0\\ C_{13}^k &{} C_{23}^k &{} C_{36}^k \end{array} \right] \qquad {{\varvec{C}}}_{nn}^k=\left[ \begin{array}{c@{\quad }c@{\quad }c} C_{55}^k &{} C_{45}^k &{} 0 \\ C_{45}^k &{} C_{44}^k &{} 0 \\ 0 &{} 0 &{} C_{33}^k \end{array} \right] \end{matrix}} \end{aligned}$$
(8)
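The partition in Eqs. 7 and 8 can be sketched as an index selection on a \(6 \times 6\) stiffness matrix. The fragment assumes the usual Voigt ordering (11, 22, 33, 23, 13, 12) with 1 = \(\alpha \), 2 = \(\beta \), 3 = z, which is consistent with the entries shown in Eq. 8 but is not stated explicitly in the text; all names are illustrative.

```python
# Voigt positions (0-based) in a 6x6 stiffness matrix ordered as
# (11, 22, 33, 23, 13, 12), with 1 = alpha, 2 = beta, 3 = z (assumed ordering).
IN_PLANE = (0, 1, 5)      # alpha-alpha, beta-beta, alpha-beta
OUT_OF_PLANE = (4, 3, 2)  # alpha-z, beta-z, z-z


def submatrix(c, rows, cols):
    """Extract the rows x cols block of a nested-list matrix."""
    return [[c[r][k] for k in cols] for r in rows]


def partition_stiffness(c):
    """Split a 6x6 Voigt stiffness matrix into C_pp, C_pn, C_np, C_nn."""
    return (
        submatrix(c, IN_PLANE, IN_PLANE),
        submatrix(c, IN_PLANE, OUT_OF_PLANE),
        submatrix(c, OUT_OF_PLANE, IN_PLANE),
        submatrix(c, OUT_OF_PLANE, OUT_OF_PLANE),
    )


def matvec(m, v):
    """Matrix-vector product for nested lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def stresses(c, eps_p, eps_n):
    """In-plane and out-of-plane stresses of one layer, following Eq. 7."""
    c_pp, c_pn, c_np, c_nn = partition_stiffness(c)
    sigma_p = [x + y for x, y in zip(matvec(c_pp, eps_p), matvec(c_pn, eps_n))]
    sigma_n = [x + y for x, y in zip(matvec(c_np, eps_p), matvec(c_nn, eps_n))]
    return sigma_p, sigma_n
```

For a monoclinic layer, the blocks returned by `partition_stiffness` have the sparsity pattern of Eq. 8, and `C_np` is the transpose of `C_pn`.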

The FE formulation uses a nine-node shell element based on the Mixed Interpolation of Tensorial Components (MITC) method [90]. The displacement vector becomes

$$\begin{aligned} \delta {{\varvec{u}}}_{s} = N_j \delta {{\varvec{u}}}_{s j}, \quad \quad {{\varvec{u}}}_{\tau } = N_i {{\varvec{u}}}_{\tau i} \quad \quad i,j = 1,\cdots ,9 \end{aligned}$$
(9)

\({{\varvec{u}}}_{\tau i}\) and \(\delta {{\varvec{u}}}_{s j}\) are the nodal displacement vector and its virtual variation, respectively. The strain expression becomes

$$\begin{aligned} \begin{aligned} {{\varvec{\epsilon }}}_p&= F_{\tau } ({{\varvec{D}}}_p + {{\varvec{A}}}_p) N_i {{\varvec{u}}}_{\tau i} \\ {{\varvec{\epsilon }}}_n&= F_{\tau } ({{\varvec{D}}}_{n \Omega } - {{\varvec{A}}}_n) N_i {{\varvec{u}}}_{\tau i} + F_{\tau _{,z}} N_i {{\varvec{u}}}_{\tau i} \end{aligned} \end{aligned}$$
(10)

MITC counteracts membrane and shear locking via a specific interpolation strategy for the strain components on the nine-node shell element, as follows:

$$\begin{aligned} \begin{aligned} {{\varvec{\epsilon }}}_{p}&= \begin{bmatrix} \epsilon _{\alpha \alpha }\\ \epsilon _{\beta \beta }\\ \epsilon _{\alpha \beta } \end{bmatrix} = \begin{bmatrix} N_{m1} &{}0 &{}0 \\ 0 &{}N_{m2} &{}0 \\ 0 &{}0 &{}N_{m3} \end{bmatrix} \begin{bmatrix} \epsilon _{\alpha \alpha _{m1}}\\ \epsilon _{\beta \beta _{m2}}\\ \epsilon _{\alpha \beta _{m3}} \end{bmatrix}\\ {{\varvec{\epsilon }}}_{n}&= \begin{bmatrix} \epsilon _{\alpha z}\\ \epsilon _{\beta z}\\ \epsilon _{zz} \end{bmatrix} = \begin{bmatrix} N_{m1} &{}0 &{}0 \\ 0 &{}N_{m2} &{}0 \\ 0 &{}0 &{}1 \end{bmatrix} \begin{bmatrix} \epsilon _{\alpha z_{m1}}\\ \epsilon _{\beta z_{m2}}\\ \epsilon _{zz_{m3}} \end{bmatrix} \end{aligned} \end{aligned}$$
(11)

Strains \(\epsilon _{\alpha \alpha _{m1}}\), \(\epsilon _{\beta \beta _{m2}}\), \(\epsilon _{\alpha \beta _{m3}}\), \(\epsilon _{\alpha z_{m1}}\), and \(\epsilon _{\beta z_{m2}}\) stem from Eq. 10 and

$$\begin{aligned} \begin{aligned} N_{m1}&= [N_{A1}, N_{B1}, N_{C1}, N_{D1}, N_{E1}, N_{F1} ] \\ N_{m2}&= [N_{A2}, N_{B2}, N_{C2}, N_{D2}, N_{E2}, N_{F2} ] \\ N_{m3}&= [N_{P}, N_{Q}, N_{R}, N_{S}] \end{aligned} \end{aligned}$$
(12)

Subscripts m1, m2 and m3 indicate the point groups (A1,B1,C1,D1,E1,F1), (A2,B2,C2,D2,E2,F2), and (P,Q,R,S), respectively, see Fig. 2. Via the Principle of Virtual Displacements (PVD) for the static analysis, the equilibrium equation reads

$$\begin{aligned} \quad {{\varvec{k}}}^{k}_{\tau s i j} {{\varvec{u}}}^{k }_{\tau i} = {{\varvec{p}}}^k_{s j} \end{aligned}$$
(13)
Fig. 2 MITC9 tying points

The \(3 \times 3\) matrix \({\varvec{k}}^{k}_{\tau s i j}\) is the fundamental mechanical nucleus whose expression is independent of the order of the expansion. \({{\varvec{p}}}^k_{s j}\) is the load vector. More details regarding the finite element formulation are in [72].
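The role of the fundamental nucleus can be illustrated with a short sketch: the same \(3 \times 3\) block is expanded over the indices \(\tau \), s, i and j, so changing the expansion order only changes the loop bounds. The fragment is a simplified illustration for full expansions only; `nucleus` is a placeholder callable, the DOF numbering in `dof_index` is an arbitrary choice, and the actual integrals are in [72].

```python
NODES_PER_ELEMENT = 9  # nine-node shell element


def dof_index(node, term, component, n_terms):
    """Position of (node, expansion term, displacement component) in the element vector."""
    return (node * n_terms + term) * 3 + component


def assemble_element(nucleus, n_terms):
    """Expand the 3x3 nucleus over tau, s, i, j into the element stiffness matrix.

    `nucleus(tau, s, i, j)` is a placeholder returning a 3x3 nested list.
    Rows follow the virtual variation (s, j), columns the unknowns (tau, i),
    as in Eq. 13.
    """
    size = NODES_PER_ELEMENT * n_terms * 3
    k = [[0.0] * size for _ in range(size)]
    for i in range(NODES_PER_ELEMENT):
        for j in range(NODES_PER_ELEMENT):
            for tau in range(n_terms):
                for s in range(n_terms):
                    block = nucleus(tau, s, i, j)
                    for r in range(3):
                        row = dof_index(j, s, r, n_terms)
                        for c in range(3):
                            col = dof_index(i, tau, c, n_terms)
                            k[row][col] += block[r][c]
    return k
```

For the N = 4 model (`n_terms=5`) the element matrix is 135 by 135, i.e., nine nodes with 15 DOF each.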

Fig. 3 Best Theory Diagram

Best Theory Diagram

One of the CUF capabilities is the axiomatic/asymptotic method (AAM) to evaluate the relevance of generalized variables and the accuracy of structural theories [73, 74]. The fourth-order equivalent single-layer shell model is the reference model of this paper, and all the theories evaluated stem from the combinations of the terms of the full fourth-order expansion, i.e., \(2^{15}\) models. The CUF generates the governing equations for the theories considered. In particular, the CUF generates reduced models having combinations of the starting terms as generalized unknowns. Two parameters can identify a theory, namely, the number of active terms and the error or accuracy provided. The Best Theory Diagram (BTD) is the curve composed of all models providing the minimum error with the least number of variables, see Fig. 3. Given the accuracy, models with fewer variables than those on the BTD do not exist. Given the number of variables, models with better accuracy than those on the BTD do not exist. In this paper, the error refers to the maximum transverse displacement,

$$\begin{aligned} Error = 100\times \frac{|u_z - u_z^{N = 4}|}{|u_z^{N = 4}|} \end{aligned}$$
(14)
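A minimal sketch of how Eq. 14 and the BTD definition could be coded is given here. It assumes that each evaluated theory is stored as a pair of an activation mask and its error, and that the BTD keeps, for each number of active terms, the model with the lowest error; the data layout and function names are illustrative.

```python
def percent_error(u_z, u_z_ref):
    """Eq. 14: percentage error of u_z with respect to the N = 4 reference."""
    return 100.0 * abs(u_z - u_z_ref) / abs(u_z_ref)


def best_theory_diagram(results):
    """Pick, for each number of active terms, the model with the lowest error.

    `results` is an iterable of (mask, error) pairs, where `mask` is a tuple of
    0/1 flags marking inactive/active generalized displacement variables.
    Returns {n_active: (mask, error)}.
    """
    best = {}
    for mask, error in results:
        n_active = sum(mask)
        if n_active not in best or error < best[n_active][1]:
            best[n_active] = (mask, error)
    return best
```

Plotting the number of active terms against the stored errors of the returned dictionary gives one candidate point of the curve per DOF count.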

The combined use of CUF and AAM allows the evaluation of the accuracy of any finite element, as shown in Table 1. Black and white triangles indicate active and inactive generalized displacement variables, respectively, and DOF indicates the nodal degrees of freedom of the element. N = 4 is the full fourth-order expansion. Three other models, well known from the literature, have incomplete expansions, namely,

  • The First-Order Shear Deformation Theory (FSDT) with five DOF,

    $$\begin{aligned} \begin{aligned}&u_{x}=u_{x_{1}}+z\,u_{x_{2}}\\&u_{y}=u_{y_{1}}+z\,u_{y_{2}}\\&u_{z}=u_{z_{1}} \\ \end{aligned} \end{aligned}$$
    (15)
  • A seven DOF model with parabolic transverse displacement, referred to as PTD,

    $$\begin{aligned} \begin{aligned}&u_{x}=u_{x_{1}}+z\,u_{x_{2}}\\&u_{y}=u_{y_{1}}+z\,u_{y_{2}}\\&u_{z}=u_{z_{1}}+z\,u_{z_{2}}+z^{2}\,u_{z_{3}}\\ \end{aligned} \end{aligned}$$
    (16)
  • A nine DOF model with third-order in-plane displacements referred to as TSDT,

    $$\begin{aligned} \begin{aligned}&u_{x}=u_{x_{1}}+z\,u_{x_{2}}+z^{2}\,u_{x_{3}}+z^{3}\,u_{x_{4}}\\&u_{y}=u_{y_{1}}+z\,u_{y_{2}}+z^{2}\,u_{y_{3}}+z^{3}\,u_{y_{4}}\\&u_{z}=u_{z_{1}}\\ \end{aligned} \end{aligned}$$
    (17)
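The models of Eqs. 2 and 15 to 17 can be written compactly as sets of active polynomial orders per displacement component, which makes the DOF count and the relation to the N = 4 reference easy to check. The dictionary layout and helper names are illustrative assumptions.

```python
# Active polynomial orders per displacement component (0 = constant term).
MODELS = {
    "N=4": {"x": (0, 1, 2, 3, 4), "y": (0, 1, 2, 3, 4), "z": (0, 1, 2, 3, 4)},
    "FSDT": {"x": (0, 1), "y": (0, 1), "z": (0,)},
    "PTD": {"x": (0, 1), "y": (0, 1), "z": (0, 1, 2)},
    "TSDT": {"x": (0, 1, 2, 3), "y": (0, 1, 2, 3), "z": (0,)},
}


def count_dof(model):
    """Nodal DOF = total number of active generalized displacement variables."""
    return sum(len(orders) for orders in model.values())


def is_reduced_from(model, reference):
    """True when every active term of `model` is also active in `reference`."""
    return all(set(model[c]) <= set(reference[c]) for c in "xyz")


for name, model in MODELS.items():
    print(name, count_dof(model), is_reduced_from(model, MODELS["N=4"]))
```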
Table 1 Examples of shell models assessed
Fig. 4 CUF and NN framework

Neural networks and coding

CUF FE analyses generate inputs to train NN. In this paper, the inputs are the structural theories and the thickness ratio, and outputs are the maximum transverse displacements. Figure 4 shows the two ways adopted in this paper to build the BTD, i.e.,

  • CUF generates the governing FE equations for all the shell theories stemming from subsets of the fourth-order expansions. Given that the expansion has 15 terms, overall, \(2^{15}\) FE shell models are available. For instance, FSDT is one of these models in which five terms are active—\({\hbox {u}}_{x1}\), \({\hbox {u}}_{y1}\), \({\hbox {u}}_{z1}\), \({\hbox {u}}_{x2}\), and \({\hbox {u}}_{y2}\)—and ten inactive.

  • The FE way runs \(2^{15}\) static FE analyses and reports the error and number of active terms of each case in a 2D plot.

  • The NN way runs one-tenth of the FE analyses and uses them for training. Then, the 2D plot stems from querying the trained NN with all \(2^{15}\) shell models.

  • If a/h is a training variable, and, e.g., three a/h values are available, the overall number of analyses is \(3\times 2^{15}\), and the query of the NN includes the shell model and the thickness ratio.

The aim is to build the BTD with fewer than \(2^{15}\) analyses and avoid new FE analyses as the thickness ratio changes. In Fig. 4, the NN training set has 10% of all analyses as this is a typical value used in this paper. Also, the figure shows only one hidden layer, although more layers could be useful.
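The two ways of building the BTD differ only in how many theories are sent to the FE solver. A minimal sketch of the enumeration and of a random 10% training selection follows; the seed, the rounding of the subset size, and the function names are assumptions made for illustration.

```python
import itertools
import random


def enumerate_theories(n_terms=15):
    """All 2**n_terms activation masks of the fourth-order expansion."""
    return list(itertools.product((0, 1), repeat=n_terms))


def split_training(theories, fraction=0.10, seed=0):
    """Randomly pick `fraction` of the theories for FE analysis and training."""
    rng = random.Random(seed)
    n_train = round(fraction * len(theories))
    training = rng.sample(theories, n_train)
    return training, n_train


theories = enumerate_theories()
training, n_train = split_training(theories)
print(len(theories), n_train)
```

In the FE way every mask in `theories` requires a static analysis, whereas in the NN way only the masks in `training` do, and the remaining ones are evaluated by querying the trained network.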

The NN configuration is a multilayer feed-forward network with early stopping and mean squared error as the objective function. Each layer has ten neurons. This paper adopts Levenberg–Marquardt training functions [91]. The input coding is a vector with 16 elements, that is, all the fourth-order expansion generalized displacement variables and the thickness ratio. Each generalized variable is either ‘1’ or ‘0’ to indicate its active or inactive status. Each input has an associated output consisting of a vector containing the error, Eq. 14. As an example, the following equation shows the coded input of a generic shell model with h/a = 0.1:

$$\begin{aligned} \begin{array}{lcl} u_{x}=u_{x_{1}}+z\,u_{x_{2}}+z^4\,u_{x_{5}} &{}&{}\\ u_{y}=u_{y_{1}}+z\,u_{y_{2}}+z^3\,u_{y_{4}} &{} => &{} [1\,1\,1\,1\,1\,1\,0\,0\,1\,0\,1\,0\,1\,0\,0\,0.1]\\ u_{z}=u_{z_{1}}+z\,u_{z_{2}}+z^{2}\,u_{z_{3}} &{}&{}\\ \end{array} \end{aligned}$$
(18)
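The coding of Eq. 18 can be reproduced with a few lines of Python. The ordering of the flags (by expansion order first, then by component x, y, z) is inferred from the vector in Eq. 18; the function and variable names are illustrative.

```python
COMPONENTS = ("x", "y", "z")
MAX_ORDER = 4


def encode_theory(active, thickness_ratio):
    """Return the 16-element input: 15 activation flags plus h/a.

    `active` maps each component to the polynomial orders kept in the model.
    Flags are ordered by expansion order first, then by component (x, y, z),
    which reproduces the vector in Eq. 18.
    """
    flags = [
        1 if order in active.get(component, ()) else 0
        for order in range(MAX_ORDER + 1)
        for component in COMPONENTS
    ]
    return flags + [thickness_ratio]


# Shell model of Eq. 18 with h/a = 0.1
example = {"x": (0, 1, 4), "y": (0, 1, 3), "z": (0, 1, 2)}
print(encode_theory(example, 0.1))
# -> [1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0.1]
```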

Table 2 presents the computational costs of the various processes involved in this paper. The cost normalization used the most expensive process as the reference. The number of layers was chosen via a convergence analysis, as the adoption of more than one layer led to negligible increments of the computational cost. For the type of NN adopted here, the use of 1–3 layers is a standard choice [91]. The training data generation used a random selection of the structural theories, and no significant variations in the results were observed between different randomly chosen training sets.
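To make the size of the adopted networks concrete, the following sketch builds a fully connected feed-forward network with 16 inputs, hidden layers of ten neurons, and one output, and evaluates it together with the mean squared error. It is only an illustration of the architecture: the tanh activation and the random initialization are assumptions, and the Levenberg–Marquardt training with early stopping used in this paper is not reproduced.

```python
import math
import random


def init_network(layer_sizes, seed=0):
    """Random weights and zero biases for a fully connected feed-forward net."""
    rng = random.Random(seed)
    return [
        {
            "w": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)],
            "b": [0.0] * n_out,
        }
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    ]


def forward(network, x):
    """tanh hidden layers (assumed activation) followed by a linear output layer."""
    for depth, layer in enumerate(network):
        z = [sum(w * xi for w, xi in zip(row, x)) + b
             for row, b in zip(layer["w"], layer["b"])]
        is_output = depth == len(network) - 1
        x = z if is_output else [math.tanh(v) for v in z]
    return x


def mean_squared_error(predictions, targets):
    """Objective function: average of the squared residuals."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def n_parameters(network):
    """Total number of weights and biases."""
    return sum(len(layer["w"]) * len(layer["w"][0]) + len(layer["b"]) for layer in network)


# 16 inputs (15 flags + thickness ratio), one hidden layer of ten neurons, one output.
net = init_network([16, 10, 1])
print(n_parameters(net))
```

With one hidden layer the network has 181 parameters and with three hidden layers it has 401, which is small compared with the number of training samples.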

Table 2 Overview of computational costs
Table 3 Comparison between NN and FE
Table 4 0/90/0, \({\overline{u}}_{z} \, (\hbox {z} = 0) = 100 {\hbox {u}}_{z} \, {\hbox {E}}_{T} \, {\hbox {h}}^3 /({\overline{p}}_z\,{\hbox {a}}^4)\)
Table 5 0/90/0, a/h = 100, 10% training sets, one layer, influence of the number of neurons on some particular cases from Figs. 5, 6 and 7
Fig. 5 FE and NN results for 0/90/0, a/h = 100, 10% training sets, one layer, influence of the number of neurons

Fig. 6 FE and NN results for 0/90/0, a/h = 100, one layer, influence of the training set size

Fig. 7 FE and NN results for 0/90/0, a/h = 100, one layer with ten neurons

Table 3 shows an overview of the analyses employed to obtain the BTD. In all cases, the input is the structural theory. The capability of setting the theory as an input is a feature provided by CUF. As mentioned in previous sections, CUF allows one to handle the kinematics with no restrictions concerning the order and type of expansions adopted. Such a capability is decisive to obtain the BTD as a tool to verify the accuracy of any structural theory. In other words, via the BTD, the effect of the addition of a new generalized variable can be estimated. As the structural theory to be verified is set, the FE option requires the computation of the stiffness matrix and the solution of the linear static analysis. On the other hand, the trained NN can provide the output by encoding the structural model.

The use of NN aims to overcome two current limitations of the BTD. First, the computational cost can be very high, as thousands of analyses are needed, and it grows with the complexity of the problem. Second, the evaluation of the BTD becomes even more challenging as various problem characteristics vary, e.g., boundary conditions or material properties, and multiple outputs are considered, e.g., displacements and stresses. The use of NN may be a solution to both issues. This paper aims to address the first limitation and partially handle the second one. To address the second issue comprehensively, other NN architectures are needed, e.g., convolutional NN, as they can manage high-dimensional input features with high efficiency [92].

Results

The numerical results focus on cases from [93]. The shell has a = b, \({\hbox {R}}_{\alpha } = {\hbox {R}}_{\beta } = \hbox {R}\) and R/a = 5. The load is bi-sinusoidal and applied on the top surface, \({\hbox {p}}_z = {\hat{p}}_{z} \sin (\pi \alpha /\hbox {a}) \sin (\pi \beta /\hbox {b})\). The material properties are \({\hbox {E}}_1/{\hbox {E}}_2 = 25\), \({\hbox {G}}_{12}/{\hbox {E}}_2 = {\hbox {G}}_{13}/{\hbox {E}}_2 = 0.5\), \({\hbox {G}}_{23}/{\hbox {E}}_2 = 0.2\), \(\nu = 0.25\). The finite element model of a quarter of the shell has a \(4 \times 4\) mesh as this discretization provides sufficiently accurate results [93]. In all cases, the BTD vertical axis ranges from five to fifteen since models with four or fewer DOF provide very high errors and are not of practical interest. The numerical results stemmed from two methodologies as follows:

  • The finite element method, FE, required \(2^{12}\) static analyses to build the BTD, i.e., one static analysis per shell theory having a combination of 12 generalized variables. To lessen the computational cost, the three zeroth-order terms of the expansion are always active as, usually, their influence is very high.

  • NN required 10% of \(2^{12}\) to train, i.e., some 400 static analyses. Depending on the cases, the architecture of the network had one or three layers and ten neurons per layer.

Table 6 0/90/0, influence of a/h on some particular cases from Figs. 8 and 9
Fig. 8 FE and NN results for 0/90/0, a/h = 50, one layer with ten neurons

Fig. 9 FE and NN results for 0/90/0, a/h = 10, one layer with ten neurons

0/90/0

The first numerical case refers to a simply-supported shell with symmetric lamination. Table 4 shows the reference values of transverse displacements adopted to build the BTD. The current N = 4 model provides good accuracy, although, for thicker shells, the match with 3D solutions is not perfect. However, for the scope of the paper, its accuracy is sufficient.

Table 7 BTD models, 0/90/0, a/h = 100
Table 8 BTD models, 0/90/0, a/h = 50
Table 9 BTD models, 0/90/0, a/h = 10
Table 10 0/90/0/90, \({\overline{u}}_{z} (\hbox {z} = 0) = 100 \mathbf{u}_{z} \,{\hbox {E}}_{T}\, {\hbox {h}}^3/({\overline{p}}_z\,a^4)\)
Table 11 0/90/0/90, influence of a/h on some particular cases from Figs. 13, 14 and 15
Table 12 BTD models, 0/90/0/90, a/h = 100

First, the analysis concerned the choice of the network parameters. Figures 5 and 6 show the BTD from NN via 5 and 10 neurons and using 5% and 10% of the \(2^{12}\) cases for training. Table 5 presents some particular cases focused on structural theories from the literature. The FE BTD serves as a benchmark. The results show that the use of 10 neurons and 10% of cases provides very good matches. The remaining analyses made use of such network architecture and focused on the effect of the thickness ratio on the BTD, given that, as seen in previous papers like [76], this is the most relevant parameter to determine the sets of most important generalized variables. Figure 7 shows the results for a/h = 100 in which (a) reports the accuracy of all \(2^{12}\) shell models as provided by the FE and by the trained NN. On the other hand, (b) shows the BTD from FE and NN together with the accuracy of models from the literature. For instance, ‘FSDT FE’ indicates the accuracy of the first-order shear deformation theory as obtained via the FE model, whereas ‘FSDT NN’ refers to the output of the trained NN. Figures 8 and 9 report the results of a/h = 50 and 10, respectively, and Table 6 presents the numerical values related to the models from the literature. The BTD models from NN are in Tables 7, 8 and 9. For instance, the six DOF best model for a/h = 10 is the following:

$$\begin{aligned} \begin{aligned}&u_{x}=u_{x_{1}}+z\,u_{x_{2}}+z^{3}\,u_{x_{4}}\\&u_{y}=u_{y_{1}}+z\,u_{y_{2}}\\&u_{z}=u_{z_{1}}\\ \end{aligned} \end{aligned}$$
(19)
Table 13 BTD models, 0/90/0/90, a/h = 50
Table 14 BTD models, 0/90/0/90, a/h = 10
Table 15 Influence of a/h and lamination on some particular cases from Figs. 17 and 18
Table 16 BTD models, 0/90/0, a/h = 75

The last row of each table reports the relevance factor of the expansion orders (RF). The RF is the ratio between the number of active instances and the total number of cases. For instance, \({\hbox {RF}}_0 = 1\) indicates that the zeroth-order terms are always present in the BTD. The combined information stemming from the previous figures and tables is in Figure 10 for a/h = 10 with the explicit indication of the seven, six, and five DOF best displacement fields. The results suggest that

  • The proposed NN framework can detect the FE results with satisfactory accuracy. Two capabilities are relevant, namely, the possibility of using the NN to evaluate theories from the literature and the ability to cover the discontinuous error range entirely.

  • The discontinuity in the error range, i.e., the presence of accuracy bands, indicates that there may not exist structural theories satisfying a given error requirement. As shown in [76], such gaps widen as the thickness ratio increases. For thin shells, the lower-order terms, i.e., the FSDT variables, play a decisive role, and their absence causes high errors. As the shell becomes thicker, higher-order terms gain relevance, leading to more homogeneous error distributions.

  • There are no relevant differences in the BTD for a/h = 100 and 50 except that the latter has a broader error range as the five DOF model, coinciding with the FSDT, yields a 2% error. The models from the literature, although not always on the BTD curve, provide satisfactory accuracies.

  • For a/h = 10, at least six DOF are necessary to have errors smaller than 1% and the variables required to meet such a requirement are the cubic in-plane ones.

  • The analysis of the RF shows that, as is well known, for thin shells, zeroth- and first-order variables are the most relevant. As the thickness increases, the third-order terms gain importance with smaller relevance for first-order ones. The NN detected very similar RF as compared to FE from [76], meaning that the proposed framework can detect the accuracy of a given structural model and determine the models on the BTD curve reliably.
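The relevance factor can be sketched as follows. The fragment assumes that the RF of an expansion order is the number of active terms of that order over all BTD models, divided by the number of terms of that order times the number of BTD models, which is one reading of the definition given in the text; the masks are assumed to be ordered by expansion order first and then by component.

```python
def relevance_factors(btd_masks, n_orders=5, n_components=3):
    """RF per expansion order over a list of BTD activation masks.

    Masks are ordered by expansion order first, then by component (x, y, z).
    """
    factors = []
    for order in range(n_orders):
        start = order * n_components
        active = sum(sum(mask[start:start + n_components]) for mask in btd_masks)
        factors.append(active / (n_components * len(btd_masks)))
    return factors


# Two illustrative models: the FSDT and the six DOF model of Eq. 19.
fsdt = (1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
six_dof = (1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0)
print(relevance_factors([fsdt, six_dof]))
```

A value of 1 for the first entry reproduces the statement that the zeroth-order terms are always present.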

Further analyses concerned the comparison of NN with linear regression (LR). LR is computationally cheaper than NN and can provide explicit weights related to each training feature. Figures 11 and 12 show the results for two training set sizes, namely, 10% and 100%. The accuracy of LR is acceptable only in the second case and remains lower than that of NN.

0/90/0/90

The second numerical case investigated the effect of an asymmetric lamination on the BTD. All other parameters remained as those of the previous case. Table 10 presents the transverse displacement values with comparisons with other models from the literature, when available. Figures 13, 14 and 15 show the BTD from FE and NN, and Table 11 presents the numerical values of the models from the literature. For a/h = 50 and 10, the NN had three layers of ten neurons as one layer was not enough to fit the BTD curve that, in these cases, presents a more irregular shape than for a/h = 100. Tables 12, 13 and 14 show the BTD models and relevance factors. Figure 16 shows the BTD curve for a/h = 10 and the displacement field retrieved from Table 14. The results show that

  • As mentioned, a more complex NN architecture was necessary, and the match between FE and NN BTD is not perfect. Some differences are visible for higher DOF models. However, such differences are still acceptable, given that, in the worst case, they remain within the 1% error range. The BTD curve presents various portions having different shapes, leading to a more difficult curve fitting.

  • As in the previous case, a/h = 100 and 50 have similar BTD, and seven DOF are enough to have very low errors with the FSDT providing accurate results.

  • For a/h = 10, 11 DOF are necessary to have an error lower than 1% with full fourth-order expansions for the in-plane terms. Besides linear terms, the third-order terms are decisive, as their absence leads to errors around 10%.

  • The considered models from the literature provide high accuracy for thin shells. On the other hand, for a/h = 10, only the TSDT can provide acceptable accuracy.

Table 17 BTD models, 0/90/0, a/h = 25
Table 18 BTD models, 0/90/0/90, a/h = 75
Fig. 10 BTD for 0/90/0, a/h = 10 with seven, six and five DOF models indicated

Fig. 11 0/90/0, a/h = 10, comparison between FE and linear regression (LR), 10% training set

Fig. 12 0/90/0, a/h = 10, comparison between FE and linear regression (LR), 100% training set

Fig. 13 FE and NN results for 0/90/0/90, a/h = 100, one layer with ten neurons

Fig. 14 FE and NN results for 0/90/0/90, a/h = 50, three layers with ten neurons

Fig. 15 FE and NN results for 0/90/0/90, a/h = 10, three layers with ten neurons

Fig. 16 BTD for 0/90/0/90, a/h = 10 with eleven, nine, seven and five DOF models indicated

Fig. 17 FE and NN results for 0/90/0, a/h = 25 and 75, three layers with ten neurons

Fig. 18 FE and NN results for 0/90/0/90, a/h = 25 and 75, three layers with ten neurons

a/h as a training variable

This section concerns the use of the thickness ratio as an additional training variable. The aim is to show the possibility of using NN to test structural theories and obtain results as the typical parameters of the structure change without the need to create and run a new FE analysis. In this section, the training inputs are 13, i.e., 12 generalized displacement variables and the thickness ratio. The NN had three layers with ten neurons each. The results refer to the two lamination schemes of the previous sections, and a/h = 100, 50 and 10 were the training sets, i.e., the training set size was 10% of \(3\times 2^{12}\). a/h = 25 and 75 are the thickness ratios evaluated via the trained NN. The BTD curves are in Figs. 17 and 18, values from selected models are in Table 15, whereas the BTD models are in Tables 16, 17, 18, and 19. The results suggest that

  • There is a good match between the FE and NN results. Some differences are observable for a/h = 25 but within the 1% error range at worst.

  • a/h = 25 tends to have curves and BTD models similar to those of a/h = 10, with the increasing relevance of third-order terms. Such a tendency may explain the better match of FE and NN for thin shells given that the training used a/h = 100, 50, and 10. As seen in the previous sections, the latter case tends to have BTD curves quite different from those of the first two cases.

  • The trained NN provides outputs concerning models from the literature with good accuracy except for the PTD in the moderately thick, symmetric lamination case.
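The data layout of this section can be sketched as follows: each input combines one of the \(2^{12}\) activation masks with a thickness ratio, the training pool uses a/h = 100, 50 and 10, and the queries use a/h = 25 and 75. Storing a/h directly as the thirteenth feature, the seed, and the rounding of the subset size are assumptions; any scaling of the feature is omitted.

```python
import itertools
import random

N_FREE_TERMS = 12                 # zeroth-order terms are always active
TRAINING_RATIOS = (100, 50, 10)   # a/h values used for training
QUERY_RATIOS = (25, 75)           # a/h values evaluated by the trained NN


def build_inputs(ratios):
    """One 13-element input per (theory, a/h) pair: 12 flags plus the ratio."""
    masks = itertools.product((0, 1), repeat=N_FREE_TERMS)
    return [list(mask) + [ratio] for mask in masks for ratio in ratios]


def sample_training_inputs(fraction=0.10, seed=0):
    """Random subset of the training pool that would be sent to the FE solver."""
    pool = build_inputs(TRAINING_RATIOS)
    rng = random.Random(seed)
    return rng.sample(pool, round(fraction * len(pool)))


training_inputs = sample_training_inputs()
query_inputs = build_inputs(QUERY_RATIOS)
print(len(training_inputs), len(query_inputs))
```

Only the inputs in `training_inputs` require FE analyses; the inputs in `query_inputs` are evaluated by the trained network without new FE models.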

Table 19 BTD models, 0/90/0/90, a/h = 25

Conclusions

This paper presented a new approach to evaluating the accuracy of shell models for composites via the use of neural networks (NN). The NN training used results from shell finite elements (FE) stemming from the Carrera Unified Formulation (CUF) and adopting a 15 DOF, fourth-order polynomial expansion along the thickness as the reference solution. The first set of training inputs considered one-tenth of the combinations of active and inactive terms, i.e., with the constant terms always active, one-tenth of \(2^{12}\) shell theories. The second set of inputs added the thickness ratio as a further variable. In all cases, the target was the maximum transverse displacement of a square, simply-supported shell under bi-sinusoidal transverse load. The NN architecture ranged from one to three layers, with ten neurons each. The result verification exploited the FE results of all \(2^{12}\) cases. The NN provided the Best Theory Diagram (BTD), i.e., a curve giving the computationally cheapest model for a given accuracy. The BTD makes it possible to evaluate the accuracy of any structural model and provides guidelines on the relevance of each generalized displacement variable. The main findings of this paper are the following:

  • The use of NN proved to be valid as it matched the FE solutions very well. The main convenience of NN is in the use of some 10% of FE analyses for training to obtain the BTD and evaluate the accuracy of a structural model without the need for further FE analyses.

  • The NN training can incorporate physical features of the problem, such as the thickness ratio, making it possible to obtain results without the need for new FE analyses and preprocessing.

  • Potential critical aspects of this approach emerged as the training considered simultaneously thin and thick shells. Such a scenario required the use of more hidden layers and needs further investigations.

  • The BTD stemming from the NN matched very well those obtained via FE and presented in [76]. As is well known, the third-order in-plane terms are the most relevant variables to include to refine classical theories.

  • Most of the models from the literature provide good accuracy, although increments in thickness or asymmetric laminations can make such models inaccurate.

The combined use of CUF and NN is promising given that the former can provide thousands of data sets in minutes and benchmarking for the rigorous assessment of results, whereas the latter can boost the computational efficiency and widen the applicability of virtual modeling. Future investigations should focus on the use of NN for multiple targets, e.g., multi-point stress values, and inverse problems to establish the best model by inputting the accuracy requirement.