1 Introduction

The recent advancements in high-performance computing and machine learning (ML) techniques have led to significant strides in neural network (NN) applications across various fields, including structural optimization [1, 2], flow prediction [3,4,5], additive manufacturing [6, 7], and the exploration of structure–property relations [8, 9]. Besides approximating the underlying physics via simulation data (data-driven NNs), various physics-informed NNs have been proposed that embed physical principles directly into the loss function, thereby alleviating the need for simulation data [10,11,12,13,14,15]. Trained NNs have been successfully used as efficient surrogate models in inverse design [1, 16] and design optimization [17, 18]. For a comprehensive overview of the application of NNs in computational mechanics, see [19].

NNs can be trained to approximate a particular solution, or they can be trained to approximate the underlying physics/mathematical operator for a class of problems in what is known as operator learning [20]. Previous researchers have proposed different architectures for operator learning, two notable ones being the Fourier neural operator (FNO) and the Deep Operator Network (DeepONet). FNO was first proposed by Li et al. [21] to solve partial differential equations with parametric inputs. Inspired by the use of the Fourier transform in solving differential equations, FNO passes the input function through multiple Fourier layers and maps the encoded information onto the output function space. Each Fourier layer takes the fast Fourier transform (FFT) of its input and filters out high-frequency modes. FNO and its improved versions have been successfully applied to Burgers' equation, Darcy flow, and the Navier–Stokes equations [21], as well as to elasticity and plasticity problems in solid mechanics [22]. Modified versions of FNO have been widely used in areas such as material modeling [23], seismic wave equations [24], and 3D turbulence for large eddy simulation [25]. However, since FNO relies on the FFT, it has difficulty handling complex geometries and intricate non-periodic boundary conditions, and Fourier-based operations are computationally expensive for high-dimensional problems. More recently, the DeepONet proposed by Lu et al. [26] emerged as another capable architecture for operator learning. It learns an operator from the governing differential equation of the system, so no retraining is necessary when the input to the system changes, be it parametric functions, boundary conditions, or even varying geometry. DeepONet consists of two separate neural networks, named the branch and the trunk. The branch network takes the parametric function as input, while the trunk network takes the domain geometry. In the original form suggested by Lu et al. [26], both the branch and the trunk were fully connected neural networks (FNNs), each encoding its respective input. The outputs of the branch and trunk are then combined via a dot product to map the parametric function to the target, the solution of the governing differential equation. Koric et al. [27] successfully predicted the stress field on a 2D domain with plastic deformation. Moreover, DeepONets have recently been applied to many science and engineering problems, such as the inverse design of nanoscale heat transport systems [28], prediction of elastic–plastic stress on complex topology-optimized domains [29], earthquake localization by Haghighat et al. [30], heat conduction by Koric and Abueidda [31], non-equilibrium thermodynamics of chemical mixtures combined with physics-informed neural networks by Li et al. [32], and a convection–diffusion reaction system by Kobayashi et al. [33]. A comprehensive comparison between FNO and DeepONet was performed by Lu et al. [34].

Many real-world engineering problems involve dynamic, time-dependent loading conditions caused by impact, vibrations, or cyclic loading, resulting in a time-dependent response of the system. Often the full flow field (for CFD) or the stress field (for solid mechanics) is needed to gain critical insights into different local regions. Obtaining a time-dependent response with classical techniques, including finite element analysis (FEA), topology optimization (TO), and sensitivity analysis, is computationally expensive and time-consuming. Hence, a scientific need exists for more robust data-driven surrogate models for time-dependent problems capable of predicting full-field contours for multiple time steps and vector components. Recurrent neural network (RNN) models like long short-term memory (LSTM) [35] and gated recurrent unit (GRU) [36] are typically used to capture the time-dependent and causal relationships between inputs and outputs. However, RNNs are typically used to predict a 1D time sequence rather than a full-field contour [8, 37,38,39]. Realizing the effectiveness of the DeepONet in predicting full-field solution contours with parametric inputs, He et al. [40] proposed a modification of the DeepONet model (termed S-DeepONet) that combines the temporal encoding capability of RNNs with the spatial encoding capability of DeepONet to predict the outcome at the last time step, and only for a scalar output. In this work, we propose an improvement over the original S-DeepONet architecture to simultaneously predict the full-field solution at different time steps and for different vector components, addressing the need to predict time-dependent vector fields in a single model. We tested our approach with two example problems: (1) a lid-driven cavity flow and (2) a path-dependent plasticity problem, both involving time dependence and representing real-world engineering use cases. We used the proposed S-DeepONet formulation to predict full-field solutions and compared its performance with the classical DeepONet formulation. Furthermore, we demonstrate the use of the trained NN in inverse parameter identification via a genetic algorithm.

This paper is organized as follows: Sect. 2 introduces the neural network architectures and provides details on the data generation methods. Section 3 presents and discusses the performance of the NN models. Section 4 summarizes the outcomes and limitations and highlights future work.

2 Methods

2.1 S-DeepONet with multiple output dimensions

The key idea in the original S-DeepONet architecture as proposed by the authors [40] is the separation of the temporal component (handled by a GRU branch network) and the spatial component (handled by the FNN trunk network) of the solution operator, and combining the components via a matrix–vector product and a bias:

$$\begin{aligned} \varvec{G}_{n} = \sum _{h=1}^{HD} \varvec{B}_{h} \varvec{T}_{nh} + \beta , \end{aligned}$$
(1)

where \(\varvec{G}\), \(\varvec{B}\), and \(\varvec{T}\) denote the outputs of the S-DeepONet, branch network, and trunk network, respectively. Dimension index h represents the hidden dimension (HD) of branch and trunk networks, and index n represents the flattened spatial dimension, which contains N nodes in the simulation domain. Finally, \(\beta \) is a bias added to the product. This structure allows the prediction of a full-field solution discretized by N nodes, where the time-dependent input load information is encoded in the branch network and the spatial geometry information is encoded in the trunk network.

Although the input load includes time-dependent information, the original S-DeepONet predicts only a scalar solution field at the end of the load. To extend the S-DeepONet architecture to predict vector solution fields at different time steps, we further exploit the idea of separating the spatial and temporal components into the trunk and branch networks. To this end, a novel S-DeepONet structure is proposed, and its schematic is shown in Fig. 1.

Fig. 1 Schematic of the improved S-DeepONet used in this work. \(m_V\), x, y, and \(\hat{G}\) denote the load magnitude, X coordinates, Y coordinates, and the approximated solution operator. Dimensions HD, S, N, and C are the number of hidden dimensions, time steps, nodes in the domain, and output vector components, respectively. The black GRU blocks return a sequence (2D outputs), while the green GRU block compresses the output into 1D

In the proposed architecture, we leverage the GRU branch network to produce encoded hidden outputs for all the output time steps as a tensor \(\varvec{B}\) of shape [HD, S], where S is the number of time steps. Similarly, when provided an [N, 2] input matrix containing the 2D coordinates of all nodes in the domain, the trunk network in the proposed S-DeepONet produces an encoded output tensor \(\varvec{T}\) of shape [N, HD, C], where C is the number of output vector components. \(\varvec{T}\) contains the encoded hidden outputs for all nodes and all vector components. To account for the new output dimensions, we combine the information from the branch and trunk via the following product:

$$\begin{aligned} \varvec{G}_{nsc} = \sum _{h=1}^{HD} \varvec{B}_{hs} \varvec{T}_{nhc} + \beta , \end{aligned}$$
(2)

where the lowercase indices s and c correspond to the time step and output component dimensions. The tensor product nature of this combination allows for efficient simultaneous generation of full-field vector outputs at multiple output time steps. This combination can be understood as a simultaneous identification of a set of “basis” shapes (from the trunk network) and corresponding weights (from the branch network), where the final solution contours are expressed as the weighted linear combination of those basis contours. This idea of basis and weight identification of the S-DeepONet architecture is illustrated further in Fig. 2.
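For concreteness, Eq. (2) is a single tensor contraction over the hidden dimension; a minimal NumPy sketch (with placeholder dimension values matching the CFD example in Sect. 3.1) is:

```python
import numpy as np

HD, S, N, C = 32, 25, 4961, 3  # hidden size, time steps, nodes, components

B = np.random.rand(HD, S)      # branch output: encoded load history
T = np.random.rand(N, HD, C)   # trunk output: encoded coordinates per component
beta = 0.1                     # scalar bias

# G[n, s, c] = sum_h B[h, s] * T[n, h, c] + beta, i.e., Eq. (2)
G = np.einsum('hs,nhc->nsc', B, T) + beta
print(G.shape)                 # (4961, 25, 3): all nodes, time steps, components
```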

Fig. 2 Basis and weight interpretation of the S-DeepONet combination operator. For each vector component c, the S-DeepONet identifies HD basis shapes from the trunk network; simultaneously, for each time step s, it identifies HD weights from the branch network. The solution contour is expressed as the linear combination of all basis shapes, including the data mean

By predicting C output vector components together, we are eliminating the need to train C different single-component S-DeepONets, each specialized in one output component, thus leading to tremendous computational time savings.

The branch network of the developed S-DeepONet consists of four GRU layers. The encoder–decoder structure of the original S-DeepONet is adopted, with the first two layers acting as the encoder and the last two as the decoder. All GRU layers use a tanh activation function. Finally, a time-distributed dense layer with linear activation outputs the branch results with a hidden dimension (HD) of 32 for all S time steps. The trunk network is an FNN. All NNs were implemented in the DeepXDE framework [41] with a TensorFlow backend [42]. All models were trained for 300,000 epochs with a batch size of 64. The Adam optimizer [43] was used, and the scaled mean squared error (MSE) served as the loss function, which is defined as:

$$\begin{aligned} \mathrm{{MSE}} = \frac{ 1 }{ N } \sum ^N_{i=1} (f_{FE} - f_{Pred})^2, \end{aligned}$$
(3)

where N, \(f_{FE}\), and \(f_{Pred}\) denote the number of data points, the FE-simulated field value, and the NN-predicted field value, respectively.

The choice of hyperparameters is crucial for the performance of NN models. Since the optimal set of hyperparameters depends on the specific problem at hand, instead of employing automatic hyperparameter optimization methods [44, 45], we chose a baseline model size by gradually increasing the number of trainable parameters and used the same baseline hyperparameters across the two example problems. Using the same model for different problems without any problem-specific fine-tuning highlights the robustness of the proposed architecture. In the process of identifying the baseline hyperparameters, we considered the following three model sizes (a code sketch of the branch and trunk networks follows the list):

1. Model 1: GRU \(=[64,32,32,64]\), FNN \(=[2,101, HD \times C]\)

2. Model 2: GRU \(=[128,64,64,128]\), FNN \(=[2,101,101,101, HD \times C]\)

3. Model 3: GRU \(=[256,128,128,256]\), FNN \(=[2,101,101,101,101,101, HD \times C]\)

where the numbers in square brackets denote neurons in each layer and C takes different values in the numerical examples presented in this work. The numbers of trainable parameters are 59,144, 220,932, and 800,384, respectively, when \(C=3\).
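For illustration, the following Keras sketch assembles a branch and trunk of the Model 1 sizes. This is a minimal stand-in, not the exact DeepXDE [41] implementation used in this work; the input shapes and the trunk activation functions are assumptions:

```python
import tensorflow as tf

S, HD, C = 25, 32, 3  # time steps, hidden dimension, output components (CFD example)

# Branch: 4 GRU layers (Model 1 sizes [64, 32, 32, 64]) + time-distributed dense.
# All GRU layers return sequences so the dense head emits HD features per time step.
branch = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(S, 1)),  # load amplitude at S time steps
    tf.keras.layers.GRU(64, activation='tanh', return_sequences=True),
    tf.keras.layers.GRU(32, activation='tanh', return_sequences=True),
    tf.keras.layers.GRU(32, activation='tanh', return_sequences=True),
    tf.keras.layers.GRU(64, activation='tanh', return_sequences=True),
    tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(HD, activation='linear')),
])  # output (batch, S, HD), used as B in Eq. (2)

# Trunk: FNN [2, 101, HD*C] mapping nodal (x, y) coordinates to encoded outputs
trunk = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(2,)),
    tf.keras.layers.Dense(101, activation='tanh'),
    tf.keras.layers.Dense(HD * C),  # reshaped to (N, HD, C), used as T in Eq. (2)
])
```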

2.2 Data generation

In this work, we demonstrate the application of the proposed improved S-DeepONet to a time-dependent computational fluid dynamics (CFD) problem and a history-dependent plastic deformation problem. Using the trained S-DeepONet, inverse parameter identification is performed with a genetic algorithm.

2.2.1 Lid-driven cavity flow

In the first example, we consider a classical CFD benchmark problem: the lid-driven cavity flow. Consider a rectangular cavity of length 3 and height 1, discretized into a \(121 \times 41\) uniform grid with 4961 nodes. A lid located at \(y=1\) moves with a time-dependent velocity profile \(\bar{u}(t)\) in the X direction, driving the fluid motion in the cavity. Radial basis interpolation (RBI) was used to generate the smooth lid velocity profile, which was defined by six uniformly spaced control points. The starting velocity (i.e., at \(t=0\)) is 0, and the velocity at all other control points was sampled from the range \([-2,2]\). The fluid is assumed to be incompressible, and the system is governed by the Navier–Stokes equation:

$$\begin{aligned} \frac{ \partial \varvec{u} }{ \partial t } = \mu \nabla ^2 \varvec{u} - ( \varvec{u} \cdot \nabla ) \varvec{u} - \frac{1}{\rho } \nabla P, \end{aligned}$$
(4)

where \(\varvec{u}\), \(\mu \), \(\rho \), and P denote the velocity vector, viscosity, mass density, and pressure, respectively. In this example, a hypothetical fluid with \(\rho =1\) and \(\mu =0.1\) was used. In 2D, let the X and Y components of the velocity vector be denoted as u and v. No-slip boundary condition was used for all four boundaries:

$$\begin{aligned} \begin{aligned} u = \bar{u}(t), \, v = 0 \;\; \textrm{if} \; { {y}} = 1,\\ \varvec{ u } = \varvec{0}, \;\; \textrm{Otherwise}.\\ \end{aligned} \end{aligned}$$
(5)

For the pressure degree of freedom, the following boundary conditions were used:

$$\begin{aligned} \begin{aligned} P = 0 \;\; \textrm{if} \; { {y}} = 1,\\ \frac{\partial P}{\partial x} = 0 \;\; \textrm{if} \; { {x}} = 0,3,\\ \frac{\partial P}{\partial y} = 0 \;\; \textrm{if} \; { {y}} = 0.\\ \end{aligned} \end{aligned}$$
(6)

The trivial initial conditions \(\varvec{u}=\varvec{0}\) and \(P=0\) were used. For this simple geometry, the second-order central finite difference (FD) method was used to discretize the governing equation, and the pressure projection method with explicit time integration was used to evolve the system for 10,000 time steps with a time step size of \(2 \times 10^{-4}\). The Python solver was adapted from the work of Barba et al. [46]. A total of 4000 simulations were conducted with distinct lid velocity profiles. The primary variables P, u, and v were stored at 25 uniformly spaced output time steps over the simulation period. For stable NN training and better accuracy, the input and output data should be scaled to suitable ranges. A time-dependent scale factor for each output variable was first calculated to account for the change in solution scale as the fluid flow develops. For time step i, this scale is defined as the maximum absolute value of the field values (over all cases and all points) at that time step. The data at time step i were then divided by this scale, yielding a data range of \([-1,1]\).
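A sketch of this data-generation recipe is given below: the control-point sampling and RBI follow the description above (with a total simulated time of 2, i.e., 10,000 steps of \(2\times10^{-4}\)), while the function names and array layout are illustrative:

```python
import numpy as np
from scipy.interpolate import Rbf

rng = np.random.default_rng(0)

def sample_lid_velocity(n_out=25, t_end=2.0):
    """Smooth lid velocity from 6 uniform control points; zero at t = 0."""
    t_ctrl = np.linspace(0.0, t_end, 6)
    v_ctrl = np.concatenate(([0.0], rng.uniform(-2.0, 2.0, 5)))
    return Rbf(t_ctrl, v_ctrl)(np.linspace(0.0, t_end, n_out))

def scale_per_time_step(data):
    """Time-dependent scaling of one output variable to [-1, 1].

    data: array of shape (n_cases, n_steps, n_nodes).
    """
    scale = np.abs(data).max(axis=(0, 2), keepdims=True)  # max |value| per step
    scale[scale == 0.0] = 1.0  # guard the trivial initial condition at t = 0
    return data / scale, scale
```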

2.2.2 Dog bone axial loading

In the second example, we consider the plastic deformation of a dog bone specimen under time-dependent axial loads. The specimen has a length of 110 mm and a gauge width of 20 mm. A total of 4756 linear plane stress elements were used. The specimen is fixed at its left end, and the displacement is applied at the right end. A schematic of the specimen and boundary conditions is shown in Fig. 3a.

Fig. 3 a Schematic of the dog bone specimen and the mesh used in FE simulations. The axial displacement is along the global X direction. b A typical applied displacement in the plastic deformation problem

In the absence of any body and inertial forces, the equilibrium equations and boundary conditions are:

$$\begin{aligned} \begin{aligned} \nabla \cdot \varvec{\sigma } = \varvec{0}, \;\; \forall \varvec{X} \in \Omega ,\\ \varvec{ u } = \bar{\varvec{u}}, \;\; \forall \varvec{X} \in \partial \Omega _u,\\ \end{aligned} \end{aligned}$$
(7)

where \(\varvec{\sigma }\) and \(\bar{\varvec{u}}\) denote the Cauchy stress and prescribed displacement, respectively. The small-strain assumption is applied, which leads to the following definition of the total strain tensor:

$$\begin{aligned} \varvec{\epsilon } = \frac{1}{2} ( \nabla \varvec{u} + \nabla \varvec{u}^T ), \end{aligned}$$
(8)

as well as its additive decomposition into elastic and plastic strain parts:

$$\begin{aligned} \varvec{\epsilon } = \varvec{\epsilon }^e + \varvec{\epsilon }^p. \end{aligned}$$
(9)

For a linear elastic isotropic material in plane stress, the stress is obtained from the elastic strain via:

$$\begin{aligned} \begin{bmatrix} \sigma _{11} \\ \sigma _{22} \\ \sigma _{12} \end{bmatrix} = \frac{E}{1-\nu ^2} \begin{bmatrix} 1 &{}\quad \nu &{}\quad 0 \\ \nu &{}\quad 1 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 1-\nu \end{bmatrix} \begin{bmatrix} \epsilon ^e_{11} \\ \epsilon ^e_{22} \\ \epsilon ^e_{12} \end{bmatrix}, \end{aligned}$$
(10)

where E and \(\nu \) are the Young's modulus and Poisson's ratio, respectively. Plasticity is modeled with \(J_2\) plasticity and a linear isotropic hardening law:

$$\begin{aligned} \sigma _y = \sigma _{y0} + H \bar{\epsilon }_p, \end{aligned}$$
(11)

where \(\sigma _y\), \(\bar{\epsilon }_p\), \(\sigma _{y0}\), and H denote the flow stress, equivalent plastic strain, initial yield stress, and hardening modulus, respectively. Relevant material properties are shown in Table 1. \(\bar{\epsilon }_p\) is an internal history variable that records the accumulation of plastic strain over the loading history. Due to its vital role in plasticity, it is one of the output variables of the S-DeepONet; the other is the von Mises stress:

$$\begin{aligned} \bar{\sigma } = \sqrt{ \sigma _{11}^2 + \sigma _{22}^2 - \sigma _{11}\sigma _{22} + 3\sigma _{12}^2 }. \end{aligned}$$
(12)
Table 1 Material properties of the elastic–plastic material model

The time-dependent displacement histories were generated similarly to those in Sect. 2.2.1, with six uniformly spaced control points. The applied displacement is 0 at \(t=0\), and the displacement magnitude at each control point was randomly selected such that the nominal axial strain magnitude remains below 5%. A typical example of the applied displacement is shown in Fig. 3b. A total of 4000 FE simulations were generated using Abaqus/Standard [47], and \(\bar{\sigma }\) and \(\bar{\epsilon }_p\) were stored at 40 uniformly spaced time steps as the ground truth labels for NN training. As in Sect. 2.2.1, data scaling is needed for best model performance. The min–max scaler in scikit-learn [48] was used for this example, with two different scalers for the two output components to account for the drastically different scales of stress and plastic strain.
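A minimal sketch of the two-scaler setup, with random placeholder arrays standing in for the Abaqus-generated labels:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Placeholder labels shaped (cases, time steps * nodes); real values come from FE
stress = 500.0 * np.random.rand(4000, 40 * 100)  # von Mises stress, O(1e2) MPa
peeq = 0.05 * np.random.rand(4000, 40 * 100)     # equivalent plastic strain, O(1e-2)

# Separate scalers so both outputs map to comparable ranges despite the scale gap
stress_scaler = MinMaxScaler().fit(stress)
peeq_scaler = MinMaxScaler().fit(peeq)
stress_scaled = stress_scaler.transform(stress)
peeq_scaled = peeq_scaler.transform(peeq)
```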

2.3 Inverse history identification with trained S-DeepONet

Once the improved S-DeepONet is trained, it can be used to infer time-dependent full-field solution contours efficiently. Take the example of predicting von Mises stress as described in Sect. 2.2.2. Further post-processing of the predicted fields yields a plot of the mean von Mises stress (over the entire dog bone specimen) as a function of load duration, which is a function of the input load history. Given a known curve of the mean stress over time, it is possible to infer the load history necessary to generate this stress history using the trained S-DeepONet and an optimizer. The outputs of this inverse identification are the five scalar displacement values at the control points, since the smooth load curve in Sect. 2.2.2 can be uniquely characterized by these values (the value at the first control point is 0). For this work, the genetic algorithm (GA) implementation in PyGAD [49] was used, which provides a gradient-free optimization framework to leverage the trained S-DeepONet as a black-box model. Using GA, 25 generations of optimization were performed with a population size of 100, and the number of parents mating was set to 10. The GA seeks to maximize a scalar fitness value, which in the current case is defined as the inverse of the mean absolute error (MAE) between the predicted and known stress histories. The process of evaluating the fitness value with a trained S-DeepONet is presented in the following algorithm:

Algorithm 1 Evaluation of the fitness value
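Since the original listing is rendered as an image, the sketch below reproduces its logic under stated assumptions: `predict_mean_stress` is a hypothetical stand-in for S-DeepONet inference followed by spatial averaging, and the normalized control-point times and target history are placeholders:

```python
import numpy as np
import pygad
from scipy.interpolate import Rbf

target_history = np.zeros(40)  # known mean-stress history (placeholder)

def predict_mean_stress(load_curve):
    """Hypothetical wrapper: S-DeepONet inference + averaging stress over the mesh."""
    return np.zeros(40)  # placeholder output

def fitness(ga_instance, solution, solution_idx):  # PyGAD >= 2.20 signature
    # Rebuild the smooth load curve from the 5 free control-point displacements
    t_ctrl = np.linspace(0.0, 1.0, 6)           # normalized control-point times
    d_ctrl = np.concatenate(([0.0], solution))  # first control point fixed at 0
    load_curve = Rbf(t_ctrl, d_ctrl)(np.linspace(0.0, 1.0, 40))
    mae = np.abs(predict_mean_stress(load_curve) - target_history).mean()
    return 1.0 / (mae + 1e-8)                   # fitness = inverse of MAE

ga = pygad.GA(num_generations=25, sol_per_pop=100, num_parents_mating=10,
              num_genes=5, fitness_func=fitness)
ga.run()
best_load, best_fitness, _ = ga.best_solution()
```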

3 Results and discussion

All simulations were conducted with six high-end AMD EPYC 7763 Milan CPU cores. All NN training and inference were conducted using a single Nvidia A100 GPU card on Delta, an HPC cluster hosted at the National Center for Supercomputing Applications (NCSA).

To evaluate the model performance on the test set, three quantitative metrics were used, namely the relative \(L_2\) error, the mean absolute error (MAE), and the \(R^2\) value:

$$\begin{aligned} \begin{aligned} \mathrm{{Relative \; L_2 \; error}} = \frac{ | f_{FE} - f_{Pred} |_2 }{ |f_{FE}|_2 + \epsilon } \times 100\%,\\ \mathrm{{MAE}} = \frac{1}{N_T} \sum _{i=1}^{N_T} \left| f_{FE} - f_{Pred} \right| ,\\ R^2 = 1 - \frac{ \sum _{i=1}^{N_T} \left( f_{FE} - f_{Pred} \right) ^2 }{ \sum _{i=1}^{N_T} \left( f_{FE} - \bar{f}_{FE} \right) ^2 } , \end{aligned} \end{aligned}$$
(13)

where \(f_{FE}\), \(f_{Pred}\), \(N_T\), and \(\bar{f}_{FE}\) denote the finite element (FE) simulated field value, NN-predicted field value, number of test cases, and the mean value of the FE-simulated field values, respectively. A small numerical offset \(\epsilon =1\times 10^{-8}\) was added to prevent division by 0 when the FE solution is 0.
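A NumPy sketch of the three metrics (applied here to a whole test array at once; in practice they can also be evaluated per case and averaged):

```python
import numpy as np

def evaluate(f_fe, f_pred, eps=1e-8):
    """Relative L2 error (%), MAE, and R^2 between FE and predicted fields."""
    rel_l2 = 100.0 * np.linalg.norm(f_fe - f_pred) / (np.linalg.norm(f_fe) + eps)
    mae = np.abs(f_fe - f_pred).mean()
    ss_res = ((f_fe - f_pred) ** 2).sum()
    ss_tot = ((f_fe - f_fe.mean()) ** 2).sum()
    return rel_l2, mae, 1.0 - ss_res / ss_tot
```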

3.1 Time-dependent fluid flow

To solve the lid-driven cavity problem with the improved S-DeepONet, we set \(C=3\) to predict the three output components P, u, and v, and \(S=25\) for the 25 output time steps. First, we studied the effect of model size using this example and identified the baseline model size among the three sets of hyperparameters listed in Sect. 2.1. The classical 80–20 data split was used when evaluating the different model sizes, and the prediction errors for the three components are shown in Fig. 4a.

Fig. 4 Relative \(L_2\) error of different solution variables for different a model sizes and b training data fractions

The results in Fig. 4a clearly show that the prediction errors of all components decreased with increasing model size. Therefore, among the three sets of hyperparameters, we chose the Model 3 configuration as the baseline and used it for all examples shown in this work. Then, with the baseline model size identified, different percentages of the entire dataset were used in training to investigate the data efficiency of the proposed architecture. In three different instances of the model, we tested training with 50%, 65%, and 80% of all available data; the relative \(L_2\) errors for the three solution components are compared in Fig. 4b. Figure 4b shows that the prediction error generally decreases with an increasing amount of training data. Therefore, the classical 80–20 split, with 80% of the data used for training and the rest for testing, was adopted for all examples in this work. To test the repeatability of the model performance, we trained the S-DeepONet three times with randomly divided training and testing data. The results show repeatable performance, with a mean (over all repetitions and components) relative \(L_2\) error of 5.891% and a standard deviation of 0.498%. Detailed performance metrics for the best-performing of the three runs are shown in Table 2. The average training and inference (per case) times over the three runs were 22,278 s and \(8\times 10^{-3}\) s, respectively. On average, each FD simulation took 18 s, making inference about 2240 times faster than direct numerical simulation.

Table 2 Performance metrics for the CFD example

The contour and quiver plots for pressure and velocity at different time steps are shown in Fig. 5. The cases are ranked by the percentile of mean relative \(L_2\) error in the predictions, and the 0th (best case), 80th, and 100th (worst case) percentiles are shown for representation. For all plots in Fig. 5, the filled contour is colored by pressure, and the velocity vectors are shown as arrows whose lengths are proportional to the velocity magnitude.

Fig. 5 Contour and quiver plots of the flow field at different times and percentiles of prediction accuracy. In each sub-figure, the top and bottom rows show the finite difference solution and the S-DeepONet prediction, respectively. The background contour is colored by pressure, and the flow velocity field is rendered as arrows with length proportional to the velocity magnitude

When predicting full-field solutions at multiple time steps, two types of errors are worth investigating: the time-averaged error (i.e., for each prediction case, the error averaged over all prediction time steps) and the frame-averaged error (i.e., for each prediction case at each time frame, the error averaged over all nodes in the domain). The time-averaged error provides a holistic measure of prediction accuracy over the entire load history, while the frame-averaged error provides a more instantaneous insight into how prediction accuracy changes with the current magnitude of the input function. To that end, scatter plots of the mean absolute error versus the instantaneous lid velocity are shown in Fig. 6.
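The two averages differ only in which axes are reduced; with an assumed error-array layout of (cases, time steps, nodes):

```python
import numpy as np

# Absolute prediction errors, assumed layout (cases, time steps, nodes)
err = np.abs(np.random.rand(800, 25, 4961) - np.random.rand(800, 25, 4961))

frame_avg = err.mean(axis=2)       # (cases, steps): error per case per time frame
time_avg = frame_avg.mean(axis=1)  # (cases,): error per case over the whole history
```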

Fig. 6 Scatter plots and trend lines of frame-averaged errors for a pressure (P), b x velocity (u), and c y velocity (v)

From the performance metrics in Table 2, we see that the S-DeepONet generates accurate predictions, with relative \(L_2\) errors of less than 10% and \(R^2\) values above 0.99 for all components, using only 3200 training data points. The contour plots in Fig. 5 further confirm the accuracy of the model: the S-DeepONet-predicted contours match closely with those computed from direct numerical simulations. In the worst-case scenario (the last row of Fig. 5), the predicted pressure contour differs significantly from the ground truth at step 1. However, for some time steps (e.g., step 17, third column), the S-DeepONet-predicted contours are similar to the ground truth even in the worst case. This observation is reasonable: the percentiles are ranked by time-averaged error, which does not necessarily mean that the predictions are inaccurate at all time steps (the errors can be concentrated in a few time steps while the predictions at other steps remain accurate). When inspecting the frame-averaged error versus the instantaneous lid velocity, as shown in Fig. 6, we see a consistent, generally increasing trend with increasing lid velocity. The coefficient of determination (\(R^2\)) of the trend line is around 0.4, indicating a moderate correlation, which we will further investigate and compare in Sect. 3.2.

3.2 History-dependent plastic deformation

For the plastic deformation problem, we set \(C=2\) to predict two output components, namely the von Mises stress and the equivalent plastic strain. The means and standard deviations of the metrics over three repetitions of the S-DeepONet training are summarized in Table 3.

Table 3 Result repeatability, plastic deformation
Fig. 7 Contour plots of the stress and plastic strain fields at different times and percentiles of prediction accuracy. In each sub-figure, the top and bottom rows show stress and plastic strain, respectively. In each contour, only the top half of the NN prediction is shown due to symmetry, with the FE solution shown in the bottom half

To show the statistical distribution of prediction error among the 800 test cases, stress contours corresponding to the 0th (best case), 80th, and 100th (worst case) percentile prediction errors are displayed in Fig. 7 for the improved S-DeepONet model. To rank the prediction errors, the relative \(L_2\) error averaged over all time steps and all output components was used as a single scalar measure of prediction accuracy. Due to the symmetry of the loading, only the top half of each NN prediction is shown in Fig. 7, with the bottom half reserved for (the flipped top half of) the FE simulation; the two are separated by a black dashed line.

To highlight the efficiency of the current vector output model, we trained two scalar S-DeepONets (i.e., by setting \(C=1\)), one for \(\bar{\sigma }\) and the other for \(\bar{\epsilon }_p\), and compared the computational efficiency and accuracy of the models. For a fair comparison, the three models were trained using identical training and testing data points. The model size, training time, and inference time are compared in Table 4.

Table 4 Vector model versus scalar models

The prediction quality metrics are summarized in Table 5.

Table 5 Vector model versus scalar models, performance metrics

The contour plots predicted by the vector S-DeepONet model and the two scalar S-DeepONet models at different time steps are shown in Fig. 8 for model comparison. The test case corresponding to the 50\(^{th}\) (median case) percentile prediction error is shown for representation.

Fig. 8 Contour plots of the stress and plastic strain fields predicted by the vector and scalar S-DeepONets. In each sub-figure, the top and bottom rows show stress and plastic strain, respectively. Due to symmetry, each contour combines two half-views: the top half shows the prediction from the vector S-DeepONet and the bottom half shows that from the scalar S-DeepONets

We compare the histograms of the time-averaged error for von Mises stress and equivalent plastic strain for the vector and scalar S-DeepONets in Fig. 9. The scatter plots and trend lines for the frame-averaged mean absolute error versus the magnitude of the applied displacement are shown in Fig. 10.

Fig. 9 Histograms of time-averaged errors for a von Mises stress and b equivalent plastic strain

Fig. 10 Scatter plots and trend lines of frame-averaged errors for a von Mises stress and b equivalent plastic strain

From the repeatability results in Table 3, we see that the performance of the proposed vector S-DeepONet is consistent, with minimal variation when trained on different data splits. For the von Mises stress, the model achieves a relative \(L_2\) error of about 5% and an \(R^2\) value of 0.997, which is accurate considering that only 3200 data points were used in training. The relative \(L_2\) error over all test data for the equivalent plastic strain is as high as 21%, but this is expected, as this metric is significantly inflated numerically when the specimen is elastic. Inspecting the mean absolute error in plastic strain, we see that the error is only about \(8.4\times 10^{-5}\), and the \(R^2\) value is 0.999, again indicating accurate model predictions. It is also worth highlighting that the proposed architecture solved two problems with completely different physics with only minor changes in the network structure to account for the different numbers of time steps and output vector components, thereby demonstrating the generalizability and versatility of the proposed framework. The contour plots at different percentiles and time steps, shown in Fig. 7, provide a more direct view of the prediction performance. In the best case (first row of Fig. 7), the S-DeepONet accurately captures the hot spots in stress and plastic strain at different time steps, showing close agreement with the corresponding FE simulations. At the 80\(^{th}\) percentile (second row of Fig. 7), the S-DeepONet was unable to predict the stress contour initially, when the stress is small and the response is fully elastic, but it still predicts similar contours of both quantities at later time steps, once the magnitudes of the stress and plastic strain increase. This observation is reasonable, since the relative \(L_2\) error is numerically inflated when the magnitude of the ground truth (i.e., the denominator of Eq. 13(a)) is small. A similar situation is observed for the worst case (last row of Fig. 7), where the specimen remains completely elastic throughout the loading history. In this case, the relative \(L_2\) error for plastic strain is numerically inflated throughout the entire loading history; hence, it is ranked as the worst case.

The key improvement of the current architecture is the ability to predict multiple vector components at multiple time steps with one model. For the case of two output components, Table 4 shows that the vector model has only 0.4% more trainable parameters than a scalar S-DeepONet model. Training the vector model took 15,547 s, which is 20.8% faster than the combined training time of the two scalar models, while inference is about 80% faster. On average, each FE simulation requires 48 s of CPU time to solve, making S-DeepONet inference about 11,900 times faster than running an FE simulation. Moreover, as revealed by the performance metrics presented in Table 5, the performance of the vector S-DeepONet is highly similar to (albeit slightly worse than) that of the individual scalar networks. Figure 8 shows that the predicted contour plots of the von Mises stress and equivalent plastic strain are also highly similar at different time steps. The exception is at the beginning of the loading history (first column of Fig. 8), where the specimen is fully elastic: there, the vector and scalar models predicted different contours for the equivalent plastic strain (which should be identically 0). Figure 9 compares the histograms of the time-averaged errors from the three models; the distributions of prediction accuracy are similar between the vector and scalar S-DeepONet models. Considering the significant savings in time and model size, training the proposed vector S-DeepONet instead of two scalar models individually is the superior option from an efficiency perspective.

Lastly, for a multi-step prediction, it is worth studying how the frame-averaged prediction error changes as a function of the instantaneous applied displacement magnitude. Figure 10 shows that there is little correlation (\(R^2=0.02\)) between the current displacement magnitude and the frame-averaged error. As a comparison, recall from Sect. 3.1 that all three components (P, u, and v) show a noticeable correlation, with an \(R^2\) of around 0.4. This difference is reasonable given the differences in the governing physics. Plasticity is path-dependent: the current stress magnitude depends not only on the current displacement magnitude but also on the past loading history, hence the lower correlation with the current displacement magnitude. On the other hand, the steady-state flow field is uniquely defined by the lid velocity; hence, the time-dependent flow field magnitude shows a stronger correlation with the current lid velocity.

3.3 Application of the trained S-DeepONet model

Three additional load curves were randomly generated following the procedure outlined in Sect. 2.2.2. The trained S-DeepONet model did not see these curves during training or testing. FE simulations were performed on these load curves to obtain the reference stress history curves, which served as the inputs to the inverse identification. To validate the results, FE simulations were performed using the load paths identified by S-DeepONet. The load curves and stress histories are compared in Fig. 11.

Fig. 11 Comparison of the load curves and stress histories identified by S-DeepONet and GA

With the efficient forward inference of the S-DeepONet model, the three GA cases were completed in an average of 24 s, which is fast considering that a single forward FE simulation of the dog bone specimen takes around 48 s. The results show that in all three cases, the S-DeepONet-predicted stress histories match closely with the given ground truths. Moreover, a comparison of the S-DeepONet predictions and the corresponding FE validation results (generated from the same load curves identified by GA) shows that the S-DeepONet predictions remain highly accurate for load curves outside of the original training and testing data points, again showing the high accuracy of the trained model. Despite the high similarity between the identified and given stress histories, the load curves identified by GA and S-DeepONet do not always match the known ground truths. We see two characteristic cases in Fig. 11: (1) matching load curves (Fig. 11a) and (2) load curves with equal but opposite displacements (Fig. 11b). It is also interesting to see a combination of both behaviors in Fig. 11c, where the second half of the predicted load curve matches the ground truth closely while the first half shows opposite displacements. This is reasonable from a mechanics perspective, since the von Mises stress is a positive quantity regardless of tensile or compressive loading, and the material considered in this work does not exhibit tension–compression asymmetry.

4 Conclusions, limitations, and future work

The sequential DeepONet (S-DeepONet) model previously proposed by the authors [40] is a variant of the DeepONet architecture [26] that uses a gated recurrent unit (GRU) in the branch network to capture the temporal information in time-dependent input functions. However, the original S-DeepONet architecture predicts only the last time frame of a time-dependent evolution, and the predicted field is a scalar. Recognizing this limitation, we introduced an improved version of the S-DeepONet capable of simultaneously predicting multiple time steps of solution fields with multiple vector components. This feature is made possible by the tensor product structure of the operation that combines the encoded temporal information from the branch network with the encoded spatial information (for all vector components) from the trunk network. We further showed that this tensor product combination can be viewed as the simultaneous identification of a set of basis shapes and weights, with which the final contour is expressed as a weighted linear combination. This architecture expands on the idea of exploiting the powerful temporal encoding capability of the GRU and the spatial encoding capability of the DeepONet architecture. To the best of the authors' knowledge, this is also the first time in the literature that DeepONet has been extended to predict a transient field with multiple vector components at different time steps.

To demonstrate the application of the improved S-DeepONet, we presented a lid-driven cavity flow example and a plastic deformation example, both subjected to time-dependent input loads and having multiple output components at many time steps. In both cases, the improved S-DeepONet provided accurate predictions for all output components using only 3200 training data points. For all components, the predictions achieved an \(R^2\) value of over 0.99 and a relative \(L_2\) error of less than 10% (except for the equivalent plastic strain in Sect. 3.2). We highlight that minimal architecture changes (only the numbers of output components and time steps) were needed to solve these two problems with totally different underlying physics, which shows that the proposed S-DeepONet is highly versatile. Once trained, the S-DeepONet can infer accurate full-field results at different time steps at least three orders of magnitude faster than direct numerical simulations. The trained model can also be used in conjunction with gradient-free optimizers such as the genetic algorithm to perform accurate inverse parameter identification. The plasticity example showed that the inverse identification is highly efficient (finished in about half the time of one FE simulation) and accurate. Using the plasticity example, we also demonstrated the effectiveness of predicting all output components at once with a single vector network instead of training a separate scalar network for each output component. With only 0.4% more parameters, the vector S-DeepONet trained 20.8% faster than the two scalar networks combined while maintaining a very similar level of prediction accuracy. Therefore, we recommend the vector S-DeepONet proposed in this work whenever multiple output components are of interest at multiple time steps.

Through the analysis of the mean error magnitude as a function of the mean input load magnitude, it was revealed that the current S-DeepONet is unable to accurately predict the field contours when the field magnitude is small, even with a time-dependent data scaling scheme. This is likely due to the use of the MSE loss function, in which data points with small magnitudes contribute only a small fraction of the loss.

With the capability to accurately and efficiently predict the solution history at multiple time steps for multiple components, the improved S-DeepONet architecture provides a versatile tool for the engineering community to build surrogate models of complex nonlinear numerical simulations. The versatility of this architecture is demonstrated in this work through its application to both solid and fluid mechanics problems, and it can be further employed in different engineering applications. With the weights and biases of the trained model, almost instant full-field forward predictions can be evaluated even on low-end platforms like laptops, providing a novel tool for rapid preliminary design, sensitivity analysis, uncertainty quantification, online control, and, as we have demonstrated, black-box surrogate-based optimization. In future work, we will investigate the use of transformer models [50] and attention mechanisms [51] in the DeepONet architecture to achieve higher prediction accuracy with lower computational costs.