1 Introduction

Shape optimization plays an important role in industrial structure design. Non-parametric shape optimization, which dates back to the work by Zienkiewicz and Campbell (1973), employs nodal coordinates in a finite element system as design variables. Due to a detailed description of structure shape and enlarged design space, non-parametric shape optimization shows great success in engineering applications (Clausen and Pedersen 2006; Furbatto et al. 2009; Böhm and Clausen 2012; Shimoda et al. 2019).

Both gradient-based and gradientless methods have been investigated for non-parametric shape optimization with linear finite element systems (Schnack 1979; Meske et al. 2005; Meske 2007; Le et al. 2011). In the last decade, the focus has shifted toward the optimization of path-dependent nonlinear problems (Pedersen et al. 2017). So far, almost all of the literature on structural optimization considering elastoplasticity is based on small strain theory, where an additive decomposition of the total strain applies (Schwarz et al. 2001; Bogomolny and Amir 2012; Amir 2017; Li and Khandelwal 2017; Shi et al. 2019). Only a few exceptions have appeared in recent years (Wallin et al. 2016; Ivarsson et al. 2018).

One major challenge of gradient-based optimization for path-dependent finite element systems is the significantly increased computational cost of the sensitivity analysis. Due to path dependency, mechanical responses, and hence their sensitivities, at one load step are determined not only by the current load condition but also by the previous load history (Park and Choi 1999; Choi and Kim 2005). Direct sensitivity analysis, also called the direct differentiation method, must follow an incremental sensitivity analysis procedure (Spivey and Tortorelli 1994; Chattopadhyay and Guo 1995; Kleiber and Kowalczyk 1996; Kim et al. 2000; Schwarz and Ramm 2001; Schwarz 2001; Wisniewski et al. 2003; Gu et al. 2009). A unified framework has been presented for formulating sensitivities with the adjoint variable method for a wide range of path-dependent system behaviors (Michaleris et al. 1994; Alberdi et al. 2018). It shows that the adjoint variable method should follow a backward solution procedure (Michaleris et al. 1994; Lee 1999; Schwarz and Ramm 2001; Maute et al. 2003; Chung et al. 2003; Alberdi et al. 2018). Whether the direct or the adjoint approach is employed, the computational effort of path-dependent sensitivity analysis grows in proportion to the number of load steps. Therefore, techniques that reduce the computational cost of the sensitivity analysis are in high demand.

For this purpose, sensitivity reanalysis within the independent coefficients strategy has been suggested, which exploits local modifications of a large-scale structure to avoid repeated full finite element solutions (Liu and Wang 2008). However, this has been demonstrated to be effective only for linear structural systems. Various strategies have been investigated to reduce the number of load steps in the sensitivity analysis of small strain elastoplasticity. Numerical studies of a structure with a single tetrahedral element and of a two-bar truss structure with a bilinear isotropic hardening model show that, for a monotonic load history, sensitivities may only need to be calculated at the last load step (Köbler 2015). For a cyclic load history, the sensitivities need to be calculated only at the unloading and reloading points (Cardoso 2005). It has been proven theoretically that, for small strain elastoplasticity, intermediate elastic load steps can be skipped in the sensitivity analysis, and that the earlier ones among consecutive plastic steps can also be skipped if the directions of the flow vectors are unchanged (Wang et al. 2017).

The present publication extends the load step reduction strategies from the small strain case to finite strain elastoplasticity. It also demonstrates their applicability to various hardening models. This extension is not straightforward for several reasons. Firstly, the multiplicative decomposition of the deformation gradient in finite strain theory leads to a more complicated formulation of the adjoint sensitivity. Secondly, as will be shown, the prerequisites for load step reduction in finite strain problems are stricter than in small strain cases, so the extent to which load step reduction is possible must be investigated. Thirdly, the accuracy of sensitivities with reduced load steps must be demonstrated for large strain problems.

The paper is organized as follows: Sect. 2 introduces the finite strain elastoplastic analysis procedure employed in this study. In Sect. 3, the adjoint sensitivity formulation is presented. In Sect. 4, the properties of the adjoint variables are investigated theoretically and the load step reduction method for elastic and plastic steps is proposed. In Sect. 5, the efficiency and accuracy of the load step reduction method are demonstrated with a solid beam structure under severe deformations and a large-scale connecting rod example. In Sects. 2 to 5, an isotropic hardening model is assumed to reduce the complexity of the discussion. In Sect. 6, the load step reduction method is extended to combined hardening and kinematic hardening cases. Additionally, a short discussion on the applicability to nonlinear elasticity and multilinear elastoplasticity is presented. Finally, the conclusions are drawn in Sect. 7.

2 Finite strain elastoplastic analysis

In this section, the finite strain elastoplastic analysis is introduced based on the total Lagrangian formulation and logarithmic strain measure. Following Lee’s multiplicative decomposition (Lee and Liu 1967; Lee 1969), the total deformation gradient \({_{0}^{t}{\varvec{X}}}\) is expressed as the multiplication of an elastic deformation gradient \({{_{i}^{t}{\varvec{X}}}}^{\mathrm{e}}\) and a plastic deformation gradient \({_{0}^{i}{\varvec{X}}}^{\mathrm{p}}\):

$${_{0}^{t}{\varvec{X}}}={{_{i}^{t}{\varvec{X}}}}^{\mathrm{e}}{_{0}^{i}{\varvec{X}}}^{\mathrm{p}}$$
(1)

where 0 refers to the undeformed configuration and \(i\) refers to the intermediate stress-free configuration. The elastic deformation gradient can be further decomposed by right polar decomposition:

$${{_{i}^{t}{\varvec{X}}}}^{\mathrm{e}}={{\varvec{R}}}^{\mathrm{e}}{{\varvec{U}}}^{\mathrm{e}}$$
(2)

where \({{\varvec{R}}}^{\mathrm{e}}\) is a unitary matrix, called the elastic rotation tensor, and \({{\varvec{U}}}^{e}\) is a symmetric positive-definite matrix, called the elastic right stretch tensor.

The logarithmic strain, also known as the Hencky strain, is often employed as a proper strain measure for finite strain problems. The elastic logarithmic strain tensor is defined by:

$${{\varvec{\varepsilon}}}^{\mathrm{e}}=\frac{1}{2}\mathrm{ln}{{{\varvec{X}}^{\mathrm{e}}}^{\mathrm{T}}}{{\varvec{X}}}^{\mathrm{e}}=\ln{{\varvec{U}}}^{e}$$
(3)

The rotated Kirchhoff stress tensor \(\overline{{\varvec{\tau}} }\), which is the spatial Kirchhoff stress rotated to the intermediate stress-free configuration by \({{\varvec{R}}}^{\mathrm{e}}\) (Gabriel and Bathe 1995; Caminero et al. 2011; Neff et al. 2016), is the work-conjugate stress measure to the elastic logarithmic strain:

$$\overline{{\varvec{\tau}} }={{\varvec{D}}}^{e}{{\varvec{\varepsilon}}}^{\mathrm{e}}$$
(4)

where \({{\varvec{D}}}^{\mathrm{e}}\) is the elastic constitutive relation matrix. The spatial Kirchhoff stress \({\varvec{\tau}}\) is eventually obtained by back rotating \(\overline{{\varvec{\tau}} }\):

$${\varvec{\tau}}={{\varvec{R}}}^{\mathrm{e}}\overline{{\varvec{\tau}}}{{{{\varvec{R}} }^{\mathrm{e}}}^{\mathrm{T}}}$$
(5)

The Kirchhoff stress is related to the Cauchy stress \({\varvec{\sigma}}\) through the Jacobian determinant:

$${\varvec{\tau}}=J{\varvec{\sigma}}$$
(6)
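The chain of Eqs. (2) to (6) can be traced numerically for a single material point. The following is a minimal sketch, not the implementation used in this study: it assumes an isotropic elastic law written with shear modulus and bulk modulus, a purely elastic deformation (so that the total and elastic deformation gradients coincide), and illustrative material values. The function names `sym_fun` and `hencky_stress` are hypothetical.

```python
import numpy as np

def sym_fun(S, fun):
    """Apply a scalar function to a symmetric matrix via its spectral decomposition."""
    w, Q = np.linalg.eigh(S)
    return (Q * fun(w)) @ Q.T  # Q diag(fun(w)) Q^T

def hencky_stress(X_e, E=210e3, nu=0.3):
    """Evaluate Eqs. (2)-(6) for an elastic deformation gradient X_e.

    Assumptions (not from the paper): isotropic elasticity, no plastic
    deformation, illustrative values for E and nu.
    """
    C = X_e.T @ X_e
    U = sym_fun(C, np.sqrt)                        # elastic right stretch tensor
    R = X_e @ np.linalg.inv(U)                     # elastic rotation tensor, Eq. (2)
    eps = sym_fun(C, lambda w: 0.5 * np.log(w))    # logarithmic strain, Eq. (3)
    G = E / (2.0 * (1.0 + nu))
    K = E / (3.0 * (1.0 - 2.0 * nu))
    I = np.eye(3)
    tr = np.trace(eps)
    tau_bar = 2.0 * G * (eps - tr / 3.0 * I) + K * tr * I  # rotated Kirchhoff stress, Eq. (4)
    tau = R @ tau_bar @ R.T                        # spatial Kirchhoff stress, Eq. (5)
    sigma = tau / np.linalg.det(X_e)               # Cauchy stress, Eq. (6)
    return R, U, eps, tau_bar, tau, sigma

if __name__ == "__main__":
    th = np.deg2rad(30.0)
    Rot = np.array([[np.cos(th), -np.sin(th), 0.0],
                    [np.sin(th),  np.cos(th), 0.0],
                    [0.0, 0.0, 1.0]])
    X_e = Rot @ np.diag([1.10, 0.95, 1.00])        # stretch followed by rotation
    R, U, eps, tau_bar, tau, sigma = hencky_stress(X_e)
    print(np.round(eps, 5))
```

Working with the spectral decomposition of the symmetric tensor keeps the square root and the logarithm exact for symmetric positive-definite input and avoids a general matrix logarithm.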

The internal force can be obtained by integration over the deformed volume \(V\) or the initial volume \({V}^{0}\):

$${{\varvec{F}}}^{\mathrm{int}}={\int }_{V}{\varvec{B}}{\varvec{\sigma}}\mathrm{d}v={\int }_{{V}^{0}}{\varvec{B}}{\varvec{\tau}}\mathrm{d}v$$
(7)

The Newton–Raphson method is employed in the nonlinear finite element analysis. The flowchart of the method is depicted in Fig. 1, where \({}^{t}{\varvec{U}}\) is the nodal displacement, \({{}^{t}\varepsilon }_{\mathrm{eqv}}^{\mathrm{p}}\) is the equivalent plastic strain, \({{}^{t}{\varvec{X}}}^{\mathrm{p}}\) is the plastic deformation gradient and \({}^{t}{\varvec{F}}\) is the external force. The upper left superscript denotes the load step.

Fig. 1 Newton–Raphson solution procedure for finite strain elastoplasticity

The return mapping algorithm is employed to obtain the plastic deformation gradient, the equivalent plastic strain and the stress tensor at step t + 1 inside each Newton–Raphson iteration. Given the deformation gradient \({}_{0}{}^{t+1}{\varvec{X}}\) at step t + 1 as well as the plastic deformation gradient and the equivalent plastic strain at the previous step t, the workflow of the return mapping algorithm (Eterovic and Bathe 1990; Dvorkin et al. 1994) is summarized in Table 1. It should be noted that the algorithm works with the rotated stress tensor. The spatial stress is obtained thereafter by back rotation.

Table 1 Return mapping algorithm for finite strain von-Mises elastoplasticity

As has been pointed out (Montáns and Bathe 2005; Caminero et al. 2011), Eqs. (16) and (17) can be approximated by

$${{\varvec{\varepsilon}}}^{\mathrm{e}}\approx {{\varvec{\varepsilon}}}_{*}^{\mathrm{e}}-\Delta {{\varvec{\varepsilon}}}^{\mathrm{p}}$$
(19)

This approximation holds for moderately large elastic strains, which is typically fulfilled in metal plasticity. Eterovic and Bathe (1990) also claim that Eq. (19) is exact for isotropic hardening plasticity with associated flow rule or for combined isotropic-kinematic hardening cases where the stress and back stress tensors commute. In view of this, Eq. (18) can be written as

$$\overline{{\varvec{\tau}} }={{\varvec{D}}}^{\mathrm{e}}{{\varvec{\varepsilon}}}_{\boldsymbol{*}}^{\mathrm{e}}-{{\varvec{D}}}^{\mathrm{e}}\Delta {{\varvec{\varepsilon}}}^{\mathrm{p}}$$
(20)

When the stopping condition of the return mapping algorithm is satisfied, the following quantities at step t + 1 can be obtained by

$${}^{t+1}{\varvec{\tau}}={}{}^{t+1}{{\varvec{R}}}_{\boldsymbol{*}}^{\mathrm{e}}\overline{{\varvec{\tau}}}{{}{ }^{t+1}{{\varvec{R}}}_{\boldsymbol{*}}^{\mathrm{e}}}^{\mathrm{T}}$$
(21)
$${{}^{t+1}\varepsilon }_{\mathrm{eqv}}^{\mathrm{p}}={{}{}^{t}\varepsilon }_{\mathrm{eqv}}^{\mathrm{p}}+\Delta {\varepsilon }_{\mathrm{eqv}}^{\mathrm{p}}$$
(22)
$${{}^{t+1}{\varvec{X}}}^{\mathrm{p}}={\mathrm{e}}^{\Delta {{\varvec{\varepsilon}}}^{\mathrm{p}}}{{}{}^{t}{\varvec{X}}}^{\mathrm{p}}$$
(23)

In Eq. (21), the trial elastic rotation tensor \({}^{t+1}{{\varvec{R}}}_{*}^{\mathrm{e}}\) is used to back rotate the stress tensor to the spatial configuration. Under the associated flow rule, the incremental plastic strain \(\Delta {{\varvec{\varepsilon}}}^{\mathrm{p}}\) and the trial elastic right stretch tensor \({}^{t+1}{{\varvec{U}}}_{*}^{\mathrm{e}}\) have the same eigenvectors. Therefore, it can be verified that the trial elastic rotation tensor equals the real elastic rotation tensor \({}^{t+1}{{\varvec{R}}}^{\mathrm{e}}\).
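The stress correction of Eq. (20) and the updates of Eqs. (22) and (23) can be sketched for one material point as follows. Since the steps of Table 1 are not reproduced here, the sketch uses the standard closed-form radial return for von Mises plasticity and assumes linear isotropic hardening with hardening modulus `E_p`; both are assumptions made for illustration, as are the function names and the material values in the usage example.

```python
import numpy as np

def sym_exp(S):
    """Matrix exponential of a symmetric matrix via its spectral decomposition."""
    w, Q = np.linalg.eigh(S)
    return (Q * np.exp(w)) @ Q.T

def return_mapping(eps_trial, eps_p_eqv, X_p, G, K, sigma_y0, E_p):
    """Radial return in the rotated configuration (hypothetical helper).

    eps_trial : trial elastic logarithmic strain (3x3, symmetric)
    eps_p_eqv : equivalent plastic strain of the previous step
    X_p       : plastic deformation gradient of the previous step
    Assumed hardening law: sigma_Y = sigma_y0 + E_p * eps_p_eqv.
    """
    I = np.eye(3)
    tr = np.trace(eps_trial)
    s_trial = 2.0 * G * (eps_trial - tr / 3.0 * I)      # deviatoric trial stress
    tau_trial = s_trial + K * tr * I                    # D^e eps_*^e
    q_trial = np.sqrt(1.5 * np.sum(s_trial * s_trial))  # von Mises equivalent stress
    f = q_trial - (sigma_y0 + E_p * eps_p_eqv)          # trial yield function
    if f <= 0.0:                                        # elastic step: nothing to update
        return tau_trial, eps_p_eqv, X_p.copy()
    d_eqv = f / (3.0 * G + E_p)                         # equivalent plastic strain increment
    a = 1.5 * s_trial / q_trial                         # flow vector
    d_eps_p = d_eqv * a                                 # incremental plastic strain
    tau_bar = tau_trial - 2.0 * G * d_eps_p             # Eq. (20); D^e acts as 2G on deviators
    X_p_new = sym_exp(d_eps_p) @ X_p                    # Eq. (23)
    return tau_bar, eps_p_eqv + d_eqv, X_p_new          # Eq. (22)

if __name__ == "__main__":
    eps_trial = np.diag([0.02, -0.01, -0.01])
    tau_bar, alpha, X_p = return_mapping(eps_trial, 0.0, np.eye(3),
                                         G=80e3, K=170e3, sigma_y0=300.0, E_p=1000.0)
    print(alpha, np.linalg.det(X_p))
```

Because the incremental plastic strain is deviatoric, its exponential has unit determinant, so the update of Eq. (23) preserves the plastic volume.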

Following Eq. (7), the consistent tangent stiffness matrix is

$${{\varvec{K}}}_{\mathrm{tan}}={\int }_{{V}^{0}}\frac{\mathrm{d}{\varvec{B}}}{\mathrm{d}{\varvec{U}}}{\varvec{\tau}}\mathrm{d}v+{\int }_{{V}^{0}}{\varvec{B}}\frac{\mathrm{d}{{\varvec{R}}}^{\mathrm{e}}}{\mathrm{d}{\varvec{U}}}\overline{{\varvec{\tau}}}{{{\varvec{R}}}^{\mathrm{e}}}^{\mathrm{T}}\mathrm{d}v+{\int }_{{V}^{0}}{\varvec{B}}{{\varvec{R}}}^{\mathrm{e}}{\varvec{D}}\frac{\mathrm{d}\ln {{\varvec{U}}}_{*}^{\mathrm{e}}}{\mathrm{d}{\varvec{U}}}{{{\varvec{R}}}^{\mathrm{e}}}^{\mathrm{T}}\mathrm{d}v+{\int }_{{V}^{0}}{\varvec{B}}{{\varvec{R}}}^{\mathrm{e}}\overline{{\varvec{\tau}}}\frac{\mathrm{d}{{{\varvec{R}}}^{\mathrm{e}}}^{\mathrm{T}}}{\mathrm{d}{\varvec{U}}}\mathrm{d}v$$
(24)

where the constitutive relation matrix \({\varvec{D}}\) is (Simo and Taylor 1985; Crisfield 2000)

$${\varvec{D}}=\left\{\begin{array}{ll}{{\varvec{D}}}^{\mathrm{e}} & (\mathrm{elastic\ step})\\ {{\varvec{D}}}^{\mathrm{ep}}={{\varvec{Q}}}^{-1}{{\varvec{D}}}^{\mathrm{e}}-\frac{{\varvec{d}}}{{\mathbf{a}}^{\mathrm{T}}{\varvec{r}}+{E}^{\mathrm{p}}} & (\mathrm{plastic\ step})\end{array}\right.$$
(25)

where

$${\varvec{Q}}={\varvec{I}}+{{\varvec{D}}}^{\mathrm{e}}\frac{\mathrm{d}\mathbf{a}}{\mathrm{d}{\varvec{\sigma}}}\Delta {\varepsilon }_{\mathrm{eqv}}^{\mathrm{p}}$$
(26)
$$\frac{\mathrm{d}\mathbf{a}}{\mathrm{d}{\varvec{\sigma}}}=\frac{1}{{\left|\overline{{\varvec{\tau}} }\right|}_{\mathrm{eqv}}}\left(\left[\begin{array}{cccccc}1& -0.5& -0.5& & & \\ -0.5& 1& -0.5& & & \\ -0.5& -0.5& 1& & & \\ & & & 3& & \\ & & & & 3& \\ & & & & & 3\end{array}\right]-{\mathbf{a}}^{\mathrm{T}}\cdot \mathbf{a}\right)$$
(27)
$${\varvec{r}}={{\varvec{Q}}}^{-1}{{\varvec{D}}}^{\mathrm{e}}\mathbf{a}$$
(28)
$${\varvec{d}}={\varvec{r}}{{\varvec{r}}}^{\mathrm{T}}$$
(29)
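Equations (25) to (29) can be assembled in Voigt notation as sketched below. This is an illustration under stated assumptions rather than the authors' code: the stress vector is ordered as three normal components followed by three shear components, engineering shear strains are used in the elastic matrix, the product of the flow vectors in Eq. (27) is read as an outer product, and all function names and numerical values are hypothetical.

```python
import numpy as np

# Constant matrix appearing in Eq. (27): 1 / -0.5 in the normal block, 3 on the shear diagonal.
M = np.zeros((6, 6))
M[:3, :3] = 1.5 * np.eye(3) - 0.5
M[3:, 3:] = 3.0 * np.eye(3)

def elastic_matrix(E, nu):
    """Isotropic elastic matrix D^e for engineering shear strains (assumed convention)."""
    lam = E * nu / ((1.0 + nu) * (1.0 - 2.0 * nu))
    G = E / (2.0 * (1.0 + nu))
    D = np.zeros((6, 6))
    D[:3, :3] = lam + 2.0 * G * np.eye(3)
    D[3:, 3:] = G * np.eye(3)
    return D

def eqv_stress(tau):
    """von Mises equivalent stress of a Voigt stress vector."""
    return np.sqrt(tau @ M @ tau)

def flow_vector(tau):
    """Flow vector a, i.e. the derivative of the equivalent stress w.r.t. the stress."""
    return M @ tau / eqv_stress(tau)

def flow_derivative(tau):
    """Eq. (27), with the flow vector product taken as an outer product (assumption)."""
    a = flow_vector(tau)
    return (M - np.outer(a, a)) / eqv_stress(tau)

def consistent_tangent(tau, d_eps_eqv, D_e, E_p):
    """Elastoplastic constitutive matrix D^ep following Eqs. (25), (26), (28), (29)."""
    a = flow_vector(tau)
    Q = np.eye(6) + D_e @ flow_derivative(tau) * d_eps_eqv   # Eq. (26)
    r = np.linalg.solve(Q, D_e @ a)                          # Eq. (28)
    d = np.outer(r, r)                                       # Eq. (29)
    return np.linalg.solve(Q, D_e) - d / (a @ r + E_p)       # Eq. (25), plastic step

if __name__ == "__main__":
    tau = np.array([300.0, -50.0, 20.0, 40.0, -10.0, 5.0])
    D_ep = consistent_tangent(tau, 1e-3, elastic_matrix(210e3, 0.3), 1000.0)
    print(np.round(D_ep[:3, :3], 1))
```

A finite-difference check of `flow_derivative` against `flow_vector` is a cheap way to confirm the reading of Eq. (27) before the tangent is used inside a Newton–Raphson iteration.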

3 Adjoint sensitivity formulation for finite strain elastoplasticity

The formulation of adjoint sensitivity analysis for finite strain elastoplasticity is derived in this section. The solution procedure for the adjoint variables is also explained.

There are two key points in deriving the adjoint formulation. One is the selection of a set of state variables, from which all quantities at each finite element analysis load step can be reconstructed. The selection of state variables is not unique. A proper selection of them will reduce the complexity of the sensitivity formulation. The other key issue is a set of governing equations of the state variables. These equations should be identically equal to zero and have the same number as state variables.

Displacement, stress, strain, plastic strain, equivalent plastic strain and flow vector are typical quantities to be determined in an elastoplastic analysis. Some of these variables are not independent and can be derived from others. After investigating different combinations, the following four quantities are selected as state variables: the displacement, the rotated stress tensor, the equivalent plastic strain and the inverse of the plastic deformation gradient:

$${}^{t}{\varvec{U}},{}^{t}{\varvec{V}}\stackrel{\scriptscriptstyle\mathrm{def}}{=}\left\{{}^{t}\overline{{\varvec{\tau}} },{}{}^{t}{\varepsilon }_{\mathrm{eqv}}^{\mathrm{p}},{{{}{}^{t}{\varvec{X}}}^{\mathrm{p}}}^{-1}\right\}$$
(30)

The inverse of the plastic deformation gradient is used because it leads to a simpler evaluation of the derivatives of the trial rotated stress tensor with Eqs. (8) to (11).

One natural governing equation of the state variables is the equilibrium condition. The residual force is identical to zero at each load step:

$$0\equiv {}^{t}{\varvec{R}}\left({}^{t}{\varvec{U}},{}^{t}{\varvec{V}},{}^{t-1}{\varvec{V}},s\right)={}^{t}{\varvec{F}}-{\int }_{{V}^{0}}{\varvec{B}}\,{}^{t}{{\varvec{R}}}^{\mathrm{e}}\,{}^{t}\overline{{\varvec{\tau}}}\,{{}^{t}{{\varvec{R}}}^{\mathrm{e}}}^{\mathrm{T}}\mathrm{d}v$$
(31)

where s denotes the design variables, and the elastic rotation tensor \({}^{t}{{\varvec{R}}}^{\mathrm{e}}\) is a function of \({}^{t}{\varvec{U}}\) and \({{{}^{t-1}{\varvec{X}}}^{\mathrm{p}}}^{-1}\).

Other governing equations are found on the element level. According to Eqs. (15) and (20), the rotated stress tensor and the elastic right stretch tensor should follow

$${}^{t}\overline{{\varvec{\tau}} }={{\varvec{D}}}^{\mathrm{e}}\mathrm{ln}{{}^{t}{\varvec{U}}}_{*}^{\mathrm{e}}-{{\varvec{D}}}^{\mathrm{e}}\left({{}{}^{t}\varepsilon }_{\mathrm{eqv}}^{\mathrm{p}}-{{}{}^{t-1}\varepsilon }_{\mathrm{eqv}}^{\mathrm{p}}\right){}^{t}\mathbf{a}$$
(32)

The yield and consistency condition is

$$\left({{}^{t}\varepsilon }_{\mathrm{eqv}}^{\mathrm{p}}-{{}{}^{t-1}\varepsilon }_{\mathrm{eqv}}^{\mathrm{p}}\right)\cdot \left[{\left|{}^{t}\overline{{\varvec{\tau}} }\right|}_{\mathrm{eqv}}-{\sigma }_{\mathrm{Y}}\left({{}{}^{t}\varepsilon }_{\mathrm{eqv}}^{\mathrm{p}}\right)\right]=0$$
(33)

According to Eq. (23), the plastic deformation gradients of two consecutive steps follow

$${{{}^{t}{\varvec{X}}}^{\mathrm{p}}}^{-1}={{{}{}^{t-1}{\varvec{X}}}^{\mathrm{p}}}^{-1}\cdot {\mathrm{e}}^{-\left({{}{}^{t}\varepsilon }_{\mathrm{eqv}}^{\mathrm{p}}-{{}{}^{t-1}\varepsilon }_{\mathrm{eqv}}^{\mathrm{p}}\right){}^{t}\mathbf{a}}$$
(34)

The governing Eqs. (32) to (34) on element level define the so-called dependent residual \({}^{t}{\varvec{H}}\) (Michaleris et al. 1994), which identically equals zero at each load step:

$${}^{t}{\varvec{H}}\left({}^{t}{\varvec{U}},{}^{t}{\varvec{V}},{}^{t-1}{\varvec{V}},s\right)=\left(\begin{array}{c}{}^{t}\overline{{\varvec{\tau}} }-{{\varvec{D}}}^{e}\ln{{}^{t}{\varvec{U}}}_{*}^{e}+{{\varvec{D}}}^{e}\left({{}{}^{t}\varepsilon }_{\mathrm{eqv}}^{\mathrm{p}}-{{}{}^{t-1}\varepsilon }_{\mathrm{eqv}}^{\mathrm{p}}\right){}^{t}\mathbf{a}\\ \left({}{}^{t}{\varepsilon }_{\mathrm{eqv}}^{\mathrm{p}}-{}{}^{t-1}{\varepsilon }_{\mathrm{eqv}}^{\mathrm{p}}\right)\cdot \left[{\left|{}^{t}\overline{{\varvec{\tau}} }\right|}_{\mathrm{eqv}}-{\sigma }_{\mathrm{Y}}\left({}{}^{t}{\varepsilon }_{\mathrm{eqv}}^{\mathrm{p}}\right)\right]\\ {{{}{}^{t}{\varvec{X}}}^{p}}^{-1}-{{{}{}^{t-1}{\varvec{X}}}^{p}}^{-1}\cdot {e}^{-\left({{}{}^{t}\varepsilon }_{\mathrm{eqv}}^{\mathrm{p}}-{{}{}^{t-1}\varepsilon }_{\mathrm{eqv}}^{\mathrm{p}}\right){}^{t}\mathbf{a}}\end{array}\right)\equiv 0$$
(35)

A system response can be expressed as a function of all state variables and the design variables:

$$f=f\left({}^{1}{\varvec{U}},\dots ,{}^{N}{\varvec{U}},{}^{1}{\varvec{V}},\dots ,{}^{N}{\varvec{V}},s\right)$$
(36)

where the superscript N is the total number of load steps.

The direct differentiation of the response with respect to a design variable s is

$$\frac{\mathrm{d}f}{\mathrm{d}s}=\frac{\partial f}{\partial s}+\sum_{\mathrm{t}=1}^{\mathrm{N}}{\frac{\partial f}{\partial {}^{t}{\varvec{U}}}}^{\mathrm{T}}\frac{\mathrm{d}{}^{t}{\varvec{U}}}{\mathrm{d}s}+\sum_{\mathrm{t}=1}^{\mathrm{N}}{\frac{\partial f}{\partial {}^{t}{\varvec{V}}}}^{\mathrm{T}}\frac{\mathrm{d}{}^{t}{\varvec{V}}}{\mathrm{d}s}$$
(37)

To avoid time-consuming evaluations of \(\mathrm{d}{}^{t}{\varvec{U}}/\mathrm{d}s\) and \(\mathrm{d}{}^{t}{\varvec{V}}/\mathrm{d}s\), two adjoint variable vectors \({}^{t}{\varvec{\lambda}}\) and \({}^{t}{\varvec{\gamma}}\) are introduced. They have the same sizes as \({}^{t}{\varvec{R}}\) and \({}^{t}{\varvec{H}}\), respectively. Since \({}^{t}{\varvec{R}}\) in Eq. (31) and \({}^{t}{\varvec{H}}\) in Eq. (35) are identically zero at all load steps, adding the dot product of \({}^{t}{\varvec{\lambda}}\) and \({}^{t}{\varvec{R}}\) and the dot product of \({}^{t}{\varvec{\gamma}}\) and \({}^{t}{\varvec{H}}\) to the response function does not change its value, i.e.

$$f=f-\sum_{\mathrm{t}=1}^{\mathrm{N}}{{}^{t}{\varvec{\lambda}}}^{\mathrm{T}}{}^{t}{\varvec{R}}-\sum_{\mathrm{t}=1}^{\mathrm{N}}{{}^{t}{\varvec{\gamma}}}^{\mathrm{T}}{}^{t}{\varvec{H}}$$
(38)

From Eqs. (31) and (35), the derivatives of \({}^{t}{\varvec{R}}\) and \({}^{t}{\varvec{H}}\) with respect to the design variable are

$$\frac{\mathrm{d}{}^{t}{\varvec{R}}}{\mathrm{d}s}=\frac{\partial {}^{t}{\varvec{R}}}{\partial s}+\frac{\partial {}^{t}{\varvec{R}}}{\partial {}^{t}{\varvec{U}}}\cdot \frac{\mathrm{d}{}^{t}{\varvec{U}}}{\mathrm{d}s}+\frac{\partial {}^{t}{\varvec{R}}}{\partial {}^{t}{\varvec{V}}}\cdot \frac{\mathrm{d}{}^{t}{\varvec{V}}}{\mathrm{d}s}+\frac{\partial {}^{t}{\varvec{R}}}{\partial {}^{t-1}{\varvec{V}}}\cdot \frac{\mathrm{d}{}^{t-1}{\varvec{V}}}{\mathrm{d}s}$$
(39)
$$\frac{\mathrm{d}{}^{t}{\varvec{H}}}{\mathrm{d}s}=\frac{\partial {}^{t}{\varvec{H}}}{\partial s}+\frac{\partial {}^{t}{\varvec{H}}}{\partial {}^{t}{\varvec{U}}}\cdot \frac{\mathrm{d}{}^{t}{\varvec{U}}}{\mathrm{d}s}+\frac{\partial {}^{t}{\varvec{H}}}{\partial {}^{t}{\varvec{V}}}\cdot \frac{\mathrm{d}{}^{t}{\varvec{V}}}{\mathrm{d}s}+\frac{\partial {}^{t}{\varvec{H}}}{\partial {}^{t-1}{\varvec{V}}}\cdot \frac{\mathrm{d}{}^{t-1}{\varvec{V}}}{\mathrm{d}s}$$
(40)

By taking the total derivative of the response in Eq. (38) with respect to the design variable s, and substituting Eqs. (37), (39) and (40) into it, it follows

$$\frac{\mathrm{d}f}{\mathrm{d}s}= \frac{\partial f}{\partial s}-\sum_{\mathrm{t}=1}^{\mathrm{N}}{{}^{t}{\varvec{\lambda}}}^{\mathrm{T}}\frac{\partial {}^{t}{\varvec{R}}}{\partial s}-\sum_{\mathrm{t}=1}^{\mathrm{N}}{{}^{t}{\varvec{\gamma}}}^{\mathrm{T}}\frac{\partial {}^{t}{\varvec{H}}}{\partial s}+ \sum_{\mathrm{t}=1}^{\mathrm{N}}\left({\frac{\partial f}{\partial {}^{t}{\varvec{U}}}}^{\mathrm{T}}-{{}^{t}{\varvec{\lambda}}}^{\mathrm{T}}\frac{\partial {}^{t}{\varvec{R}}}{\partial {}^{t}{\varvec{U}}}-{{}^{t}{\varvec{\gamma}}}^{\mathrm{T}}\frac{\partial {}^{t}{\varvec{H}}}{\partial {}^{t}{\varvec{U}}}\right)\frac{\mathrm{d}{}^{t}{\varvec{U}}}{\mathrm{d}s}+ \left({\frac{\partial f}{\partial {}^{N}{\varvec{V}}}}^{\mathrm{T}}-{{}^{N}{\varvec{\lambda}}}^{\mathrm{T}}\frac{\partial {}^{N}{\varvec{R}}}{\partial {}^{N}{\varvec{V}}}-{{}^{N}{\varvec{\gamma}}}^{\mathrm{T}}\frac{\partial {}^{N}{\varvec{H}}}{\partial {}^{N}{\varvec{V}}}\right)\frac{\mathrm{d}{}^{N}{\varvec{V}}}{\mathrm{d}s}+\sum_{\mathrm{t}=1}^{\mathrm{N}-1}\left({\frac{\partial f}{\partial {}^{t}{\varvec{V}}}}^{\mathrm{T}}-{{}^{t}{\varvec{\lambda}}}^{\mathrm{T}}\frac{\partial {}^{t}{\varvec{R}}}{\partial {}^{t}{\varvec{V}}}-{{}^{t}{\varvec{\gamma}}}^{\mathrm{T}}\frac{\partial {}^{t}{\varvec{H}}}{\partial {}^{t}{\varvec{V}}}-{{}^{t+1}{\varvec{\gamma}}}^{\mathrm{T}}\frac{\partial {}^{t+1}{\varvec{H}}}{\partial {}^{t}{\varvec{V}}}-{{}^{t+1}{\varvec{\lambda}}}^{\mathrm{T}}\frac{\partial {}^{t+1}{\varvec{R}}}{\partial {}^{t}{\varvec{V}}}\right)\frac{\mathrm{d}{}^{t}{\varvec{V}}}{\mathrm{d}s}$$
(41)

By setting the coefficients of \(\mathrm{d}{}^{t}{\varvec{U}}/\mathrm{d}s\) and \(\mathrm{d}{}^{t}{\varvec{V}}/\mathrm{d}s\) in Eq. (41) to zero, a series of systems of linear equations for the adjoint variables is obtained:

$${\left(\begin{array}{cc}\frac{\partial {}^{N}{\varvec{R}}}{\partial {}^{N}{\varvec{U}}}& \frac{\partial {}^{N}{\varvec{H}}}{\partial {}^{N}{\varvec{U}}}\\ \frac{\partial {}^{N}{\varvec{R}}}{\partial {}^{N}{\varvec{V}}}& \frac{\partial {}^{N}{\varvec{H}}}{\partial {}^{N}{\varvec{V}}}\end{array}\right)}^{\mathrm{T}}\left(\begin{array}{c}{}^{N}{\varvec{\lambda}}\\ {}^{N}{\varvec{\gamma}}\end{array}\right)=\left(\begin{array}{c}\frac{\partial f}{\partial {}^{N}{\varvec{U}}}\\ \frac{\partial f}{\partial {}^{N}{\varvec{V}}}\end{array}\right)$$
(42)
$${\left(\begin{array}{cc}\frac{\partial {}^{t}{\varvec{R}}}{\partial {}^{t}{\varvec{U}}}& \frac{\partial {}^{t}{\varvec{H}}}{\partial {}^{t}{\varvec{U}}}\\ \frac{\partial {}^{t}{\varvec{R}}}{\partial {}^{t}{\varvec{V}}}& \frac{\partial {}^{t}{\varvec{H}}}{\partial {}^{t}{\varvec{V}}}\end{array}\right)}^{\mathrm{T}}\left(\begin{array}{c}{}^{t}{\varvec{\lambda}}\\ {}^{t}{\varvec{\gamma}}\end{array}\right)=-{\left(\begin{array}{c}0\\ \frac{\partial {}^{t+1}{\varvec{H}}}{\partial {}^{t}{\varvec{V}}}\end{array}\right)}^{\mathrm{T}} {}^{t+1}{\varvec{\gamma}}-{\left(\begin{array}{c}0\\ \frac{\partial {}^{t+1}{\varvec{R}}}{\partial {}^{t}{\varvec{V}}}\end{array}\right)}^{\mathrm{T}} {}^{t+1}{\varvec{\lambda}}+\left(\begin{array}{c}\frac{\partial f}{\partial {}^{t}{\varvec{U}}}\\ \frac{\partial f}{\partial {}^{t}{\varvec{V}}}\end{array}\right) (\mathrm{t}=N-1,\dots ,1)$$
(43)

With these adjoint variables, the adjoint sensitivity formulation of the response is

$$\frac{\mathrm{d}f}{\mathrm{d}s}=\frac{\partial f}{\partial s}-\sum_{\mathrm{t}=1}^{\mathrm{N}}{{}^{t}{\varvec{\lambda}}}^{\mathrm{T}}\frac{\partial {}^{t}{\varvec{R}}}{\partial s}-\sum_{\mathrm{t}=1}^{\mathrm{N}}{{}^{t}{\varvec{\gamma}}}^{\mathrm{T}}\frac{\partial {}^{t}{\varvec{H}}}{\partial s}$$
(44)

The expressions of the partial derivatives in Eqs. (42) and (43) are presented in Appendix A. As shown in Appendix B, solving Eqs. (42) and (43) directly for the adjoint variables yields:

$${{}^{t}{\varvec{K}}}_{\mathrm{tan}}{}^{t}{\varvec{\lambda}}=\frac{\partial f}{\partial {}^{t}{\varvec{U}}}-{\frac{\partial {}^{t}{\varvec{H}}}{\partial {}^{t}{\varvec{U}}}}^{\mathrm{T}}{\frac{\partial {}^{t}{\varvec{H}}}{\partial {}^{t}{\varvec{V}}}}^{-\mathrm{T}}\left(\frac{\partial f}{\partial {}^{t}{\varvec{V}}}-{\frac{\partial {}^{t+1}{\varvec{H}}}{\partial {}^{t}{\varvec{V}}}}^{\mathrm{T}}{}^{t+1}{\varvec{\gamma}}-{\frac{\partial {}^{t+1}{\varvec{R}}}{\partial {}^{t}{\varvec{V}}}}^{\mathrm{T}}{}^{t+1}{\varvec{\lambda}}\right)$$
(45)
$${}^{t}{\varvec{\gamma}}={\frac{\partial {}^{t}{\varvec{H}}}{\partial {}^{t}{\varvec{V}}}}^{-\mathrm{T}}\left(\frac{\partial f}{\partial {}^{t}{\varvec{V}}}-{\frac{\partial {}^{t+1}{\varvec{H}}}{\partial {}^{t}{\varvec{V}}}}^{\mathrm{T}}{}^{t+1}{\varvec{\gamma}}-{\frac{\partial {}^{t+1}{\varvec{R}}}{\partial {}^{t}{\varvec{V}}}}^{\mathrm{T}}{}^{t+1}{\varvec{\lambda}}-{\frac{\partial {}^{t}{\varvec{R}}}{\partial {}^{t}{\varvec{V}}}}^{\mathrm{T}}{}^{t}{\varvec{\lambda}}\right)$$
(46)

where \({{}^{t}{\varvec{K}}}_{\mathrm{tan}}\) is the consistent tangent stiffness matrix following Eq. (24). Since the adjoint variables at step t depend on the adjoint variables at the next step t + 1, the solution for the adjoint variables must be carried out backwards from the last load step to the first load step.
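The backward recursion of Eqs. (42) to (44) can be checked on a small stand-in problem. The sketch below replaces the elastoplastic residuals by linear ones with constant partial derivatives, so it is a toy model and not a finite strain analysis; all matrices, names and sizes are hypothetical. The block matrix of Eqs. (42) and (43) is interpreted as the transpose of the Jacobian of the stacked residuals with respect to the stacked state variables, and the response depends only on the last load step.

```python
import numpy as np

def make_toy(n=3, m=2, seed=0):
    """Constant partial derivatives of a linear stand-in for tR and tH (hypothetical values)."""
    rng = np.random.default_rng(seed)
    small = lambda *shape: 0.3 * rng.random(shape)
    return dict(
        R_U=5.0 * np.eye(n) + small(n, n),  # d(tR)/d(tU)
        R_V=small(n, m),                    # d(tR)/d(tV)
        R_Vp=small(n, m),                   # d(tR)/d(t-1 V)
        R_s=small(n),                       # d(tR)/ds
        H_U=small(m, n),                    # d(tH)/d(tU)
        H_V=5.0 * np.eye(m) + small(m, m),  # d(tH)/d(tV)
        H_Vp=small(m, m),                   # d(tH)/d(t-1 V)
        H_s=small(m),                       # d(tH)/ds
        f_U=small(n),                       # df/d(NU), response acts on the last step only
        f_V=small(m),                       # df/d(NV)
    )

def jacobian(p):
    return np.block([[p["R_U"], p["R_V"]], [p["H_U"], p["H_V"]]])

def response(s, loads, p):
    """Forward analysis: enforce tR = 0 and tH = 0 step by step and return f."""
    n = p["R_U"].shape[0]
    J = jacobian(p)
    V_prev = np.zeros(p["H_V"].shape[0])
    for F in loads:
        # tR = R_U U + R_V V + R_Vp V_prev + R_s s - F
        # tH = H_U U + H_V V + H_Vp V_prev + H_s s
        rhs = np.concatenate([F - p["R_Vp"] @ V_prev - p["R_s"] * s,
                              -p["H_Vp"] @ V_prev - p["H_s"] * s])
        x = np.linalg.solve(J, rhs)
        U, V_prev = x[:n], x[n:]
    return p["f_U"] @ U + p["f_V"] @ V_prev

def adjoint_sensitivity(n_steps, p):
    """Backward solution of Eqs. (42)-(43) and accumulation of Eq. (44)."""
    n, m = p["R_U"].shape[0], p["H_V"].shape[0]
    JT = jacobian(p).T
    lam_next, gam_next = np.zeros(n), np.zeros(m)   # no step beyond N
    dfds = 0.0                                      # explicit df/ds is zero in this toy
    for t in range(n_steps, 0, -1):
        last = t == n_steps
        rhs_U = p["f_U"] if last else np.zeros(n)
        rhs_V = (p["f_V"] if last else np.zeros(m)) \
            - p["H_Vp"].T @ gam_next - p["R_Vp"].T @ lam_next
        x = np.linalg.solve(JT, np.concatenate([rhs_U, rhs_V]))
        lam_next, gam_next = x[:n], x[n:]
        dfds -= lam_next @ p["R_s"] + gam_next @ p["H_s"]
    return dfds

if __name__ == "__main__":
    p = make_toy()
    loads = [np.full(3, k) for k in (1.0, 2.0, 3.0, 4.0)]
    h = 1e-3
    fd = (response(1.0 + h, loads, p) - response(1.0 - h, loads, p)) / (2.0 * h)
    print(adjoint_sensitivity(len(loads), p), fd)
```

One linear solve per load step appears in the backward loop, which is the cost that the load step reduction aims to avoid.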

4 Load step reduction in the adjoint sensitivity analysis

According to Eqs. (44) to (46), the major computational effort in the adjoint sensitivity analysis is the backward solution for the adjoint variables. For each load step, a system of linear equations whose size equals the number of degrees of freedom of the underlying finite element model needs to be solved. Therefore, the computational cost increases in proportion to the total number of load steps. In this section, strategies to reduce the number of load steps used in the sensitivity analysis are investigated.

Besides the computational cost, the memory cost of an implementation may also be challenging if it is not properly handled. The memory cost is measured by the number of non-zero values that must be stored at the same time. Following Eqs. (45) and (46), the tangent stiffness matrices at the equilibrium points, the partial derivatives of the residual force and the partial derivatives of the dependent residual are required in the solution for the adjoint variables. These quantities are collected gradually after the nonlinear analysis at each step. Due to the backward solution procedure, they must be kept in memory until the finite element analysis of the last step is finished. Although the stiffness matrices are sparse, the direct storage of all these quantities at all load steps is not the most cost-efficient approach.

According to the formulation of these quantities in Appendices A and B, the tangent stiffness matrices are fully determined by the state variables in Eq. (30). This is also the case for the partial derivatives. Therefore, to minimize the memory cost, it is suggested to store only the state variables at all load steps. During the sensitivity analysis, the intermediate quantities are then regenerated from the state variables. The regeneration happens only on the element level, where the computational effort is negligible. However, it is worth mentioning that the storage space is still proportional to the number of load steps in the sensitivity analysis. Therefore, strategies to reduce the load steps in the sensitivity analysis also help to reduce the memory cost.

Before presenting the load step reduction method, one assumption regarding the responses should be made clear. Given a sequence of load steps L = {1, 2,…, t − 1, t, t + 1,…, N}, the system responses investigated in this paper are assumed to be functions of quantities of the last load step only, i.e. the system response \(f\) is assumed to satisfy

$$\frac{\partial f}{\partial {}^{t}{\varvec{U}}}=0\,\mathrm{for}\,\mathrm{all}\,t<\mathrm{N}$$
(47)
$$\frac{\partial f}{\partial {}^{t}{\varvec{V}}}=0\,\mathrm{for}\,\mathrm{all}\,t<\mathrm{N}$$
(48)

Many responses fulfill these requirements, such as maximum equivalent stress or final displacement. Besides that, many other responses can also be expressed as a composition of such functions, i.e.

$$g\left({}^{1}{\varvec{U}},\dots ,{}^{N}{\varvec{U}},{}^{1}{\varvec{V}},\dots ,{}^{N}{\varvec{V}}\right)=h\circ \left({f}_{1}\left({}^{1}{\varvec{U}},{}^{1}{\varvec{V}}\right),\dots ,{f}_{\mathrm{N}}\left({}^{N}{\varvec{U}},{}^{N}{\varvec{V}}\right)\right)$$
(49)

where the function \({f}_{\mathrm{k}}\) depends only on quantities at step k. In this case, the sensitivity of each individual \({f}_{\mathrm{k}}\) can be calculated first, with step k treated as the last load step. The sensitivity of \(g\) is then obtained by the chain rule.

4.1 Load step reduction for elastic steps

In the following, an elastic load step describes a step of finite element analysis, in which all elements behave elastically. Otherwise, if any element behaves plastically in the load step, then the step is called a plastic load step.

The following property of the adjoint variables at elastic steps was first established for the small strain case based on the additive decomposition (Wang et al. 2017). This property also holds for finite strain elastoplasticity based on the multiplicative decomposition.

Property 1.

If load step t is an intermediate elastic step (t is not the last step), then the adjoint variable \({}^{t}{\varvec{\lambda}}\) at this step is zero.

Due to limited space, the proof of this property is presented in Appendix C. The property eventually leads to the following load step reduction rule.

Elastic load step reduction rule

For a sequence of load steps L = {1, 2,…, t − 1, t, t + 1,…, N}, if step t is an intermediate elastic load step, then exactly the same sensitivity results can be obtained by skipping step t, i.e. by using only the load steps S = {1, 2,…, t − 1, t + 1,…, N}.

Mathematically, it means

$$\frac{\mathrm{d}{}_{L}f}{\mathrm{d}s}=\frac{\mathrm{d}{}_{S}f}{\mathrm{d}s}$$
(50)

where

$$\frac{\mathrm{d}{}_{L}f}{\mathrm{d}s}=\frac{\partial f}{\partial s}-\sum_{\mathrm{n}=1}^{\mathrm{N}}{{}_{L}^{n}{\varvec{\lambda}}}^{\mathrm{T}}\frac{\partial {}_{L}^{n}{\varvec{R}}}{\partial s}-\sum_{\mathrm{n}=1}^{\mathrm{N}}{{}_{L}^{n}{\varvec{\gamma}}}^{\mathrm{T}}\frac{\partial {}_{L}^{n}{\varvec{H}}}{\partial s}$$
(51)
$$\frac{\mathrm{d}{}_{S}f}{\mathrm{d}s}=\frac{\partial f}{\partial s}-\sum_{\mathrm{n}=1,\ldots ,\mathrm{N},\,\mathrm{n}\ne \mathrm{t}}{{}_{S}^{n}{\varvec{\lambda}}}^{\mathrm{T}}\frac{\partial {}_{S}^{n}{\varvec{R}}}{\partial s}-\sum_{\mathrm{n}=1,\ldots ,\mathrm{N},\,\mathrm{n}\ne \mathrm{t}}{{}_{S}^{n}{\varvec{\gamma}}}^{\mathrm{T}}\frac{\partial {}_{S}^{n}{\varvec{H}}}{\partial s}$$
(52)

The left subscripts S and L denote the step set from which a quantity is calculated. In Eq. (52), the partial derivatives of \({\varvec{R}}\) and \({\varvec{H}}\) are based on the load step set S, in which the original steps t − 1 and t + 1 become adjacent load steps. The theoretical proof of this rule is presented in Appendix D. It follows that all intermediate elastic load steps can be skipped in the sensitivity analysis. It should be noted that this rule applies to all types of elements and that no accuracy is lost through the load step reduction.
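The consequence of the elastic reduction rule can be illustrated on a deliberately simple surrogate. The sketch below uses a force-controlled one-dimensional bar with small strain, linear isotropic hardening and the cross-sectional area as design variable; it is not the finite strain setting of this paper, and all names, values and the finite-difference check are assumptions made for illustration. The sensitivity of the final displacement is evaluated once with the full load history and once with the intermediate elastic steps removed.

```python
import numpy as np

def bar_analysis(loads, A, L=1.0, E=200e3, sig_y0=250.0, H=2000.0):
    """Force-controlled 1D elastoplastic bar (small strain surrogate, hypothetical values).

    Returns the final displacement and a flag per load step telling whether it was plastic.
    """
    u, eps_p, alpha = 0.0, 0.0, 0.0
    tol_yield = 1e-9 * sig_y0
    plastic_flags = []
    for F in loads:
        for _ in range(20):                              # Newton-Raphson iterations
            sig_trial = E * (u / L - eps_p)
            f = abs(sig_trial) - (sig_y0 + H * alpha)    # trial yield function
            if f > tol_yield:                            # plastic: 1D return mapping
                d_gamma = f / (E + H)
                sig = sig_trial - E * d_gamma * np.sign(sig_trial)
                E_tan = E * H / (E + H)
            else:                                        # elastic
                d_gamma, sig, E_tan = 0.0, sig_trial, E
            res = F - A * sig                            # residual force
            if abs(res) < 1e-10 * max(1.0, abs(F)):
                break
            u += res * L / (A * E_tan)
        eps_p += d_gamma * np.sign(sig_trial)            # update history at convergence
        alpha += d_gamma
        plastic_flags.append(d_gamma > 0.0)
    return u, plastic_flags

def fd_sensitivity(loads, A, h=1e-3):
    """Central finite-difference sensitivity of the final displacement w.r.t. the area."""
    return (bar_analysis(loads, A + h)[0] - bar_analysis(loads, A - h)[0]) / (2.0 * h)

def reduce_elastic_steps(loads, plastic_flags):
    """Keep the plastic steps and the last step, drop intermediate elastic steps."""
    kept = [F for F, pl in zip(loads[:-1], plastic_flags[:-1]) if pl]
    return kept + [loads[-1]]

if __name__ == "__main__":
    loads = [100.0, 300.0, 200.0, 50.0, 350.0]           # load, unload, reload
    u_full, flags = bar_analysis(loads, A=1.0)
    reduced = reduce_elastic_steps(loads, flags)
    print(flags, reduced)
    print(fd_sensitivity(loads, 1.0), fd_sensitivity(reduced, 1.0))
```

In this surrogate the two sensitivities agree because the elastic steps leave the plastic history variables untouched; the sketch only illustrates the statement of the rule and does not replace the proof in Appendix D.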

4.2 Load step reduction for plastic steps

Under certain conditions, the adjoint variable \({\varvec{\lambda}}\) equals zero even at plastic steps. The following property is proved in Appendix E.

Property 2.

If \(\frac{\partial {}^{t}\mathbf{a}}{\partial {}^{t}\overline{{\varvec{\tau}}} }=0\), \({}^{t}\mathbf{a}={}^{t+1}\mathbf{a}\), \(\frac{\partial {}{}^{t+1}{{\varvec{R}}}^{e}}{\partial {{{}^{t}{\varvec{X}}}^{p}}^{-1}}=0\), \({{}^{t+1}\Delta{\varvec{\varepsilon}}}^{p}\) and \({{}^{t}{\varvec{X}}}^{p}\) commute, then \({}^{t}{\varvec{\lambda}}=0\).

Physically, the four prerequisites mean that the flow vectors at adjacent load steps should be in the same direction, the elastic rotation tensor should be constant with respect to the plastic deformation gradient, and the incremental plastic strain should have the same principal directions as the accumulated plastic strain. If all these conditions are met, then the following load step reduction rule for plastic steps can be proved. Due to limited space, the proof is presented in Appendix F.
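The commutation prerequisite can be illustrated numerically: two symmetric tensors commute exactly when they share a set of principal directions. The sketch below is only an illustration with made-up tensor values; the helper names `commute` and `rotate_about_z` are hypothetical, and the diagonal tensors merely stand in for the accumulated and incremental plastic quantities.

```python
import numpy as np


def commute(a, b, tol=1e-12):
    """Return True if the matrices a and b commute within tolerance tol."""
    return bool(np.allclose(a @ b, b @ a, rtol=0.0, atol=tol))


def rotate_about_z(tensor, angle_deg):
    """Rotate a second-order tensor about the z-axis by angle_deg degrees."""
    c = np.cos(np.radians(angle_deg))
    s = np.sin(np.radians(angle_deg))
    q = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return q @ tensor @ q.T


# Illustrative values only (not taken from the examples in this paper).
accumulated = np.diag([0.10, -0.04, -0.06])
coaxial_increment = np.diag([0.02, -0.01, -0.01])
rotated_increment = rotate_about_z(coaxial_increment, 30.0)

print(commute(accumulated, coaxial_increment))  # True: same principal axes
print(commute(accumulated, rotated_increment))  # False: axes rotated by 30 deg
```

The second case corresponds to a plastic increment whose principal directions have rotated relative to the accumulated plastic deformation, which violates the prerequisite.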

Plastic load step reduction rule

For a sequence of load steps L = {1, 2,…, t − 1, t, t + 1,…, N}, if the prerequisites in Property 2 are all fulfilled and, additionally, \({{}^{t}\Delta{\varvec{\varepsilon}}}^{\mathrm{p}}\) and \({{}^{t-1}{\varvec{X}}}^{\mathrm{p}}\) commute and \(\partial {}^{t}{{\varvec{R}}}^{\mathrm{e}}/\partial {{{}^{t-1}{\varvec{X}}}^{\mathrm{p}}}^{-1}=0\), then the same sensitivity results are obtained by skipping load step t, i.e. by using only the load step set S = {1, 2,…, t − 1, t + 1,…, N}.

In comparison with the small strain case (Wang et al. 2017), the condition on the incremental plastic flow is stricter. Not only should two consecutive steps have the same flow vector, but the incremental quantities should also be consistent with the accumulated plastic quantities. The prerequisites of the plastic reduction rule can be satisfied only by 1D bar elements when there is no switch between tension and compression in two consecutive plastic steps. For general 2D and 3D elements, it is impossible to fulfill all these requirements. Therefore, the following empirical rule is suggested for finite strain elastoplasticity.

Empirical rule

If two consecutive plastic steps are in a monotonic loading procedure, and the incremental plastic flow is in close direction to the accumulated plastic flow, then the former step can be skipped in the sensitivity analysis.

Load step reduction following the empirical rule will not lead to exact sensitivity results for 2D and 3D elements. Therefore, the accuracy of the sensitivities needs to be verified. The applicability of the rule in practice and its influence on the sensitivity results are investigated through numerical examples in the next sections.
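A possible way to combine the elastic rule and the empirical rule is sketched below. It is not an implementation from this work: the function name and the three flag lists are hypothetical inputs, and the judgment whether the loading is monotonic and whether the plastic flow directions are close is assumed to be made beforehand, since the rule does not quantify these conditions.

```python
def select_load_steps(is_plastic, monotonic_with_next, flow_aligned_with_next):
    """Return the 1-based load steps to keep in the sensitivity analysis.

    is_plastic[i]             -- step i + 1 is plastic (structural level)
    monotonic_with_next[i]    -- loading stays monotonic from step i + 1
                                 to step i + 2 (length N - 1)
    flow_aligned_with_next[i] -- plastic flow at step i + 2 is judged close
                                 in direction to the accumulated flow
                                 (length N - 1)
    """
    n_steps = len(is_plastic)
    kept = []
    for t in range(1, n_steps + 1):
        i = t - 1
        if t == n_steps:
            kept.append(t)  # the last step is always required
        elif not is_plastic[i]:
            continue  # elastic rule: skip intermediate elastic steps
        elif not (is_plastic[i + 1] and monotonic_with_next[i]
                  and flow_aligned_with_next[i]):
            kept.append(t)  # empirical rule not applicable: keep the step
    return kept


# Setting of Sect. 5.1: 11 plastic steps, load reversal after step 3.
n = 11
monotonic = [t != 3 for t in range(1, n)]
print(select_load_steps([True] * n, monotonic, [True] * (n - 1)))  # [3, 11]
```

A plastic step is skipped only if the next step is also plastic, the loading remains monotonic and the flow directions are judged to be close; in all other cases the step is kept, which errs on the side of accuracy.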

It must be made clear that the proposed reduction rules are derived on the structural level. This means that, as defined at the beginning of Sect. 4.1, the whole structure is treated as plastic even if only one element behaves plastically. If one plastic element is not in monotonic loading, then the load step for the whole structure cannot be skipped. This strategy works well for shape optimization, where abnormal behavior rarely occurs in only one or a few local elements.

In topology optimization, the load step reduction may be limited by the complicated behavior of a few or even a single intermediate-density element. This does not mean that the proposed scheme cannot be applied to topology optimization. In such cases, the proposed reduction rules provide guidance on which load steps can be skipped in the sensitivity analysis. A further reduction, however, may still be possible by investigating adjoint variables on the element level. The element level load step reduction is an open issue to investigate. On the other hand, there are already studies on how to avoid abnormal plastic behavior of intermediate elements in topology optimization. One typical solution is to choose separate penalization exponents and lower bounds for stiffness and yield properties in the SIMP interpolation (Maute et al. 1998; Alberdi and Khandelwal 2017). This is done to avoid large plastic strains in intermediate material regions and to mitigate convergence issues caused by low density elements near their yield limits (Amir 2017; Zhang et al. 2017). The combination of these techniques with the load step reduction rules in topology optimization may be interesting for further investigation.

5 Numerical examples of adjoint sensitivity analysis with load step reduction

5.1 Solid beam under severe bending

In this section, the load step reduction rules are verified through a cantilever beam structure meshed with 3D tetrahedral elements. The finite element model is depicted in Fig. 2. One end of the structure is fixed. Two external forces in horizontal x-direction and in vertical y-direction are applied on the free end of the structure simultaneously. A bilinear isotropic hardening material is assumed. The Young’s modulus is 210 GPa, plastic modulus is 50 GPa, initial yield stress is 235 MPa and Poisson’s ratio equals 0.3.

Fig. 2 Cantilever beam structure in size 300 mm × 15 mm × 15 mm. Design nodes are indicated by red dots

There are 11 load steps as described in Fig. 3. The vertical load pointing downwards increases in the first three steps and then decreases gradually. The horizontal load in x direction increases throughout the procedure. According to the nonlinear finite element analysis, all the load steps are plastic steps.

Fig. 3 Load history in horizontal and vertical direction with pentagrams depicting the reduced load steps in the sensitivity analysis

The maximum equivalent plastic strain at the last step is 25.2%. The contour of the equivalent plastic strain is presented in both initial and deformed configurations in Fig. 4. Areas where the equivalent plastic strain is larger than 5% are depicted in red. The deformation shows that the beam structure is severely bent under the given loads.

Fig. 4 Contour of the equivalent plastic strain under original and deformed configurations

In the sensitivity analysis, the design variables are the vertical coordinates of center nodes on the bottom surface of the beam. These nodes are indicated by red dots in Fig. 2. The maximum equivalent plastic strain at the fixed end and average vertical displacement at the free end are defined as two system responses.

Since the structure is bent progressively under the given loads, the plastic flows are all in close directions. According to the empirical rule, all the intermediate load steps in a monotonic loading stage can be skipped. Therefore, only step 3, where the vertical load turns from increasing to decreasing, and the last load step must be included in the sensitivity analysis. The sensitivity results using only step 3 and step 11 are compared with the sensitivities using all load steps in Figs. 5 and 6. In subfigures (a), the sensitivity values are plotted with respect to the position of the design nodes. Subfigures (b) present the percentage error in terms of the relative value, which is defined by

$${\text{relative}}\,\text{value} = \frac{{\text{sensitivities}}\,{\text{with}}\,{\text{reduced}}\,{\text{load}}\,{\text{steps}}}{{\text{sensitivities}}\,{\text{with}}\,{\text{all}}\,{\text{load}}\,{\text{steps}}}$$
(53)

Fig. 5 Sensitivity of average vertical displacement at the free end. a Comparison of sensitivities with all load steps and only the reduced load steps (step 3 + step 11). b Relative value to show the percentage error between two results

Fig. 6 Sensitivity of maximum equivalent plastic strain. a Comparison of sensitivity with all load steps and only the reduced load steps (step 3 + step 11). b Relative value to show the percentage error between two results

In subfigures (a), sensitivities obtained by a central finite differencing scheme with a perturbation size of \(10^{-5}\) mm are also presented. Both Figs. 5a and 6a show that the adjoint sensitivity results match well with the finite differencing results. Therefore, the adjoint sensitivity analysis procedure is validated.
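The central finite differencing check can be sketched as follows. The analytic toy response and its gradient are hypothetical stand-ins: in the actual validation each response evaluation is a full path-dependent nonlinear finite element analysis, and the reference gradient comes from the adjoint method.

```python
import numpy as np


def central_difference(response, s, h=1e-5):
    """Approximate d(response)/ds by central differences with step h."""
    s = np.asarray(s, dtype=float)
    grad = np.zeros_like(s)
    for i in range(s.size):
        step = np.zeros_like(s)
        step[i] = h
        # two response evaluations per design variable
        grad[i] = (response(s + step) - response(s - step)) / (2.0 * h)
    return grad


def toy_response(s):
    """Hypothetical smooth response used only to exercise the scheme."""
    return s[0] ** 2 * np.sin(s[1]) + 3.0 * s[2]


def toy_gradient(s):
    """Analytic gradient of toy_response, playing the role of the adjoint result."""
    return np.array([2.0 * s[0] * np.sin(s[1]), s[0] ** 2 * np.cos(s[1]), 3.0])


s0 = np.array([1.5, 0.7, -2.0])
print(np.max(np.abs(central_difference(toy_response, s0) - toy_gradient(s0))))
```

The cost of two analyses per design variable is the reason why finite differencing serves only as a reference here and not as the sensitivity method itself.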

Focusing on the adjoint sensitivity results, the results with reduced load steps match well with those using all load steps. The percentage errors at most of the design variables are small, but they increase significantly towards the free end of the beam. This increase is partially because the sensitivity at the free end is close to zero: the relative error is calculated with a small denominator and hence is amplified. After filtering out sensitivities whose absolute value is smaller than 1% of the maximum sensitivity, the sensitivity errors for both responses are summarized in Table 2. The average errors are smaller than 3%, which demonstrates a good match of sensitivities when the load steps are reduced. Hence the empirical rule applies for this example.
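The error evaluation can be sketched as below. Two points are assumptions made for this illustration: the percentage error is taken as the deviation of the relative value in Eq. (53) from one, and the 1% threshold is applied with respect to the largest absolute sensitivity obtained with all load steps. The function name and the numbers in the usage lines are hypothetical.

```python
import numpy as np


def sensitivity_errors(reduced, full, threshold=0.01):
    """Return (mean, max) percentage error after filtering small sensitivities.

    reduced, full -- sensitivities with reduced and with all load steps
    threshold     -- entries with |full| below threshold * max|full| are ignored
    """
    reduced = np.asarray(reduced, dtype=float)
    full = np.asarray(full, dtype=float)
    keep = np.abs(full) >= threshold * np.max(np.abs(full))
    relative_value = reduced[keep] / full[keep]          # Eq. (53)
    percentage_error = 100.0 * np.abs(relative_value - 1.0)
    return float(percentage_error.mean()), float(percentage_error.max())


# Made-up numbers: the third entry is below the 1% threshold and is ignored,
# although its relative value of 2.0 would otherwise dominate the statistics.
full = [1.0, 0.5, 0.001]
reduced = [1.02, 0.49, 0.002]
print(sensitivity_errors(reduced, full))
```

The filtered entry shows how a near-zero denominator inflates the relative error without indicating a meaningful loss of accuracy.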

Table 2 Error of sensitivities of the cantilever beam example

5.2 Solid beam under severe bending and twisting

One of the prerequisites of the empirical rule is that the incremental plastic flow should be in close direction to the accumulated plastic flow. Due to the vagueness of this statement, it is worth presenting a case in which the load steps cannot be reduced further even under a monotonic loading procedure.

In this example, the same cantilever beam model as in Sect. 5.1 is employed. The horizontal and vertical load histories are plotted in Fig. 7, and there are 20 load steps in total. Besides these two forces, a constant force of 2 kN is applied at the free end in horizontal Z direction. All the load steps are plastic steps.

Fig. 7 Load history for beam under bending and twisting with pentagrams depicting the reduced load steps in the sensitivity analysis

The deformation of the beam at representative load steps is depicted in Fig. 8 together with the contour of the equivalent plastic strain. The red color represents the area with equivalent plastic strain larger than 5%. The maximum equivalent plastic strain at the final step is 36.3%. The deformations show that the beam slightly bends out of the X–Y plane at the beginning. After a sufficiently large plastic strain has accumulated near the fixed end of the beam, the out-of-plane load in Z direction gradually causes the beam to twist.

Fig. 8 Deformation and contour of equivalent plastic strain at representative load steps of solid beam example under severe bending and twisting

According to the empirical rule, the turning points of monotonic load procedures should be included in the sensitivity analysis. These are step 3, step 13, and step 20. Under twisting of the beam, the principal directions of the stress tensor change significantly, which changes the direction of the associated plastic flow. Therefore, the prerequisite of the empirical rule is violated. Following the empirical rule, all the steps from step 15 onwards must be included in the sensitivity analysis although they are in a monotonic loading procedure. This means that the load steps in the sensitivity analysis can only be reduced to step 3, step 13, and steps 15 to 20, as depicted by pentagrams in Fig. 7. The vertical displacement at the free end is taken as the response function. The sensitivity results with all load steps and with the reduced load steps are compared in Fig. 9, which shows a good match between the two results.

Fig. 9 Sensitivity of vertical displacement at the free end

To demonstrate the necessity of step 15 to step 19 in the sensitivity analysis, several try-outs are also presented in Fig. 9. In each of these results, the same reduced load steps are used in the sensitivity analysis except one step between 15 and 19 is additionally skipped. It shows that, if any step between 15 and 19 is skipped in the sensitivity analysis, the result will have significant errors.

This example shows that, to follow the empirical rule, good engineering judgment may be needed in practical applications. A solution to avoid the subjectivity could be quantification of the empirical rule. This is also important for implementing the empirical rule into a non-intrusive optimization procedure because there are cases where structural behaviors change essentially between iterations and hence an automatic load steps reselection is required. How to properly quantify the conditions in the empirical rule and set criteria for selection are still open issues to be investigated.

5.3 Demonstration with a connecting rod structure

In this section, the applicability of the load step reduction rules is demonstrated through a connecting rod example under a cyclic load history. The finite element model of a typical connecting rod (also called conrod) in an internal combustion piston engine is presented in Fig. 10a. The design variables are the x-coordinates of 223 nodes, which are highlighted by red dots in Fig. 10b. These nodes lie on the outer surface between the small end and the big end of the conrod. A bilinear elastoplastic material is assumed. Material properties are listed in Table 3.

Fig. 10 Finite element model of a connecting rod structure

Table 3 Material parameters of the connecting rod

The rod bolts of the structure are fixed, and a horizontal force (Fx) and a vertical force (Fy) are applied uniformly on the inner surface of the small end. The load history is depicted in Fig. 11. The forces are generated by the internal pressure on the piston in a four-stroke cycle. The magnitudes of the forces are artificially enlarged to represent an extreme load case during engine failure.

Fig. 11 Load history on the small end of the conrod. Pentagrams depict the reduced load steps in the sensitivity analysis

In the load cycle, the structure is purely compressed in y-direction at step 1. From step 2 to step 7, the compression load increases. At the same time, a horizontal force in x-direction is gradually applied which causes bending of the rod. From step 8 to step 12, the x-direction force is gradually unloaded to zero and the y-direction force is partially unloaded. The horizontal load increases in negative x-direction at step 13 to step 16 accompanied by a slight increase of the compression load. From step 17 to step 19, the horizontal force is fully unloaded to zero while the y-direction force gradually increases to the same level as in step 1.

The contour of the equivalent plastic strain at the last load step is depicted in both initial and deformed configurations in Fig. 12. Areas where the equivalent plastic strain is larger than 10% are depicted in red. The maximum equivalent plastic strain is 22.3%, which is close to the strain at break of typical steel. The point of maximum value lies on the outer surface near the big end. The average equivalent plastic strain of 106 elements around this point is defined as one system response. The average x-displacement and average y-displacement of the small end at the last step are defined as the other two responses.

Fig. 12 Equivalent plastic strain of the connecting rod at the last load step

According to the nonlinear analysis, steps 2 to 7 and step 16 are plastic; the other load steps are elastic. Following the elastic load step reduction rule, all elastic load steps are skipped in the sensitivity analysis except step 19. Based on the empirical rule, plastic steps 2 to 6 are also skipped since they lie in a monotonic load procedure. The three load steps that must be involved in the sensitivity analysis are highlighted by pentagrams in Fig. 11 and listed in Table 4.

Table 4 Reduced load steps in sensitivity analysis for the connecting rod example

In Fig. 13, the contours of the sensitivities calculated with the reduced load steps are compared with those using all load steps. The results match very well for all three responses.

Fig. 13 Comparison of sensitivity results of three system responses. In each subfigure, the result with reduced load steps is depicted on the left and the result using all load steps is presented on the right

The percentage errors of sensitivities are evaluated and summarized in Table 5. To eliminate the effect of small denominator, sensitivities whose absolute value is smaller than 1% of the maximum sensitivity are ignored. It shows that the average percentage errors of sensitivities for all three responses are less than 10%. Hence, the empirical rule applies well in this example.

Table 5 Relative errors of sensitivities for the connecting rod example

5.4 Efficiency in terms of computational time

So far, the efficiency of the load step reduction rules has been measured by the number of load steps used in the sensitivity analysis. It has been assumed that the computational time of the sensitivity analysis is reduced in proportion to the number of load steps. This assumption is checked explicitly in this section.

Besides the number of load steps, the computational time of the sensitivity analysis also depends on many other factors, including the degrees of freedom of the underlying FE-model, the number of responses, the number of design variables, the computing environment, the coding efficiency, etc. To isolate the influence of the number of load steps, the ratio of the sensitivity analysis time using reduced load steps to the time using all load steps is evaluated.

In Fig. 14, the ratio of sensitivity analysis times is plotted on the vertical axis. The horizontal axis shows the ratio of the number of reduced load steps to the number of all load steps. There are four points in the figure, representing the results of the four numerical examples in this paper. For better organization of the paper, the example presented later in Sect. 6.1 is also included here.

Fig. 14 Relation between ratio of sensitivity analysis time and ratio of number of load steps

The figure shows that the reduction of sensitivity analysis time is proportional to the reduction of the number of load steps, with a proportionality constant close to one. Although four data points are too few for a statistical conclusion, they indicate a linear relation, which is also in accordance with intuition.
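The proportionality constant can be estimated by a least-squares fit of a line through the origin, as sketched below. The step counts are those stated in the text for the four examples; the function names are hypothetical, and the time ratios in the usage lines are a placeholder, not measured data, because the measured values are only available graphically in Fig. 14.

```python
import numpy as np

# (reduced load steps, all load steps) as stated for the four examples.
STEP_COUNTS = {
    "Sect. 5.1": (2, 11),
    "Sect. 5.2": (8, 20),
    "Sect. 5.3": (3, 19),
    "Sect. 6.1": (4, 19),
}


def step_ratios(counts):
    """Ratio of reduced to all load steps for each example."""
    return np.array([reduced / total for reduced, total in counts.values()])


def proportionality_constant(step_ratio, time_ratio):
    """Least-squares slope k of time_ratio = k * step_ratio (no intercept)."""
    x = np.asarray(step_ratio, dtype=float)
    y = np.asarray(time_ratio, dtype=float)
    return float(x @ y / (x @ x))


x = step_ratios(STEP_COUNTS)
placeholder_time_ratio = x.copy()  # PLACEHOLDER: replace with measured ratios
print(proportionality_constant(x, placeholder_time_ratio))
```

With measured time ratios in place of the placeholder, a slope near one would correspond to the relation described above.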

5.5 Influence on the computational cost of optimization

As shown in the previous examples, some accuracy of the sensitivities is lost when the number of plastic load steps is reduced. This naturally raises the question of how the loss of sensitivity accuracy influences the optimization. Will the gain in the sensitivity analysis be consumed by an increased number of optimization iterations, especially when the evaluation of a new design requires a time-consuming nonlinear finite element analysis? These are open questions to be investigated. The following three aspects deserve attention when addressing them; they show that the answers are not straightforward.

Firstly, does the slight loss of accuracy jeopardize the information provided to an optimization algorithm, and if so, how? In gradient-based optimizations, sensitivities are usually used to generate a search direction in which to improve system responses. As demonstrated, following the empirical rule, the degree of inaccuracy of the sensitivities is very limited. There is no obvious change in the sensitivity magnitudes, and the spatial distributions are also the same. Therefore, it is reasonable to believe that the directional information is still well preserved after proper load step reduction. The influence on the optimization process is thus expected to be small.
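One possible way to quantify how well the directional information is preserved is the cosine of the angle between the two sensitivity vectors. This measure is not used in the examples of this paper; the sketch below, with a hypothetical function name and made-up vectors, only illustrates the idea.

```python
import numpy as np


def direction_cosine(g_reduced, g_full):
    """Cosine of the angle between two sensitivity vectors (1.0 = same direction)."""
    a = np.asarray(g_reduced, dtype=float)
    b = np.asarray(g_full, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Made-up sensitivities: a few percent error in magnitude per entry.
g_full = np.array([1.00, 0.50, -0.20, 0.05])
g_reduced = np.array([1.02, 0.49, -0.21, 0.05])
print(direction_cosine(g_reduced, g_full))
```

A value close to one indicates that a steepest-descent type update would point in nearly the same direction with either set of sensitivities.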

Secondly, the answers to the aforementioned questions depend on the optimization algorithm, i.e. on the framework in which the sensitivities are utilized. In the literature, efforts have been made to reduce the computational cost by improving optimization procedures and algorithms. An accelerated gradient algorithm has been proposed (Arouri and Sayyafzadeh 2020), which is less sensitive to the accuracy of the gradient approximation than the steepest descent algorithm. By switching back and forth between accurate sensitivities and a Broyden approximation (Li et al. 2007), both trust-region and SQP-based minmax algorithms may take even fewer optimization iterations to converge to an optimal point. The method of moving asymptotes (MMA) is a popular algorithm for shape and topology optimization (Svanberg 1987). An interesting study by Amir (2021) shows that, when the aim is to deliver viable engineering designs within a limited number of function evaluations, inexact design sensitivities do not lead the MMA algorithm to a significantly inferior solution. In short, different algorithms place different requirements on the sensitivity accuracy.

Last but not least, how the sensitivities are used in the shape optimization procedure also matters. Two general problems in gradient-based shape optimization are the mesh-dependency of sensitivities and non-smooth shapes resulting from noisy sensitivity fields. Successful techniques to address these issues are sensitivity filters (Le et al. 2011; Stück and Rung 2011; Sigmund and Maute 2012; Bletzinger 2014), sensitivity weighting (Kiendl et al. 2014) and vertex-morphing (Hojjat et al. 2014). What these methods have in common is that the shape sensitivities are smoothed before they are used for the design update. These intentional modifications unavoidably lead to a discrepancy between post-processed sensitivities and raw sensitivities. The requirement on the accuracy of the raw sensitivities is loosened to some extent under these circumstances.

Generally speaking, how the loss of raw sensitivity accuracy influences the optimization performance is a case-by-case question. This paper focuses on the reduction of computational cost in the sensitivity analysis with the least loss of raw accuracy. In practical optimization applications, the requirement on the sensitivity accuracy should be considered in a comprehensive way.

6 Extension to more general constitutive models

6.1 Extension to kinematic and combined hardening model

In this section, the load step reduction technique is extended to the kinematic hardening and mixed hardening cases. The adjoint sensitivity analysis and the reduction rules are briefly discussed under these models. The applicability of the empirical rule is demonstrated through a conrod example.

In the adjoint sensitivity analysis with kinematic or mixed hardening, the major differences from the isotropic hardening case lie in the state variables and the dependent residual. They are caused by the introduction of the back stress and the hardening ratio. Denote the back stress under stress-free configuration by \({}^{t}\overline{{\varvec{b}} }\). For bilinear mixed hardening elastoplasticity, the yield surface is

$${\left|{}^{t}\overline{{\varvec{\tau}} }-{}^{t}\overline{{\varvec{b}} }\right|}_{\mathrm{eqv}}={{}{}^{0}\sigma }_{\mathrm{Y}}+\beta \cdot {E}^{p}\cdot {{}{}^{t}\varepsilon }_{\mathrm{eqv}}^{\mathrm{p}}$$
(54)

where \(\beta\) is the hardening ratio, which lies in the range from 0 to 1. If \(\beta\) equals 1, the model describes pure isotropic hardening; if \(\beta\) equals 0, it describes pure kinematic hardening. The back stress of two consecutive steps follows

$${}^{t}\overline{{\varvec{b}} }={}^{t-1}\overline{{\varvec{b}} }+\left(1-\beta \right)\cdot {{\varvec{D}}}^{\boldsymbol{\alpha }}\cdot \left({{}{}^{t}\varepsilon }_{\mathrm{eqv}}^{\mathrm{p}}-{{}{}^{t-1}\varepsilon }_{\mathrm{eqv}}^{\mathrm{p}}\right){}^{t}\mathbf{a}$$
(55)

where

$${{\varvec{D}}}^{\boldsymbol{\alpha }}=\frac{{E}^{p}}{3}\left[\begin{array}{cccccc} 2 & & & & & \\ & 2 & & & & \\ & & 2 & & & \\ & & & 1 & & \\ & & & & 1 & \\ & & & & & 1 \end{array}\right]$$
(56)
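Equations (54) to (56) can be written down compactly in vector notation. The following sketch rests on several assumptions that are not confirmed by the text above: the equivalent measure is taken as the von Mises equivalent stress, stress-like vectors are ordered as [xx, yy, zz, xy, yz, zx], and the flow vector is passed in as given data. The function names are hypothetical, and the numerical values in the usage lines reuse the plastic modulus of Sect. 5.1 purely for illustration.

```python
import numpy as np


def von_mises(stress):
    """Von Mises equivalent of a stress-like vector [xx, yy, zz, xy, yz, zx]."""
    s = np.asarray(stress, dtype=float).copy()
    s[:3] -= s[:3].mean()  # deviatoric part
    return float(np.sqrt(1.5 * (np.sum(s[:3] ** 2) + 2.0 * np.sum(s[3:] ** 2))))


def yield_function(tau, back_stress, eps_p, sigma_y0, e_p, beta):
    """Left-hand side minus right-hand side of Eq. (54); zero on the yield surface."""
    shifted = np.asarray(tau, dtype=float) - np.asarray(back_stress, dtype=float)
    return von_mises(shifted) - (sigma_y0 + beta * e_p * eps_p)


def update_back_stress(b_prev, eps_p, eps_p_prev, flow_vector, e_p, beta):
    """Back stress update of Eq. (55) with D^alpha from Eq. (56)."""
    d_alpha = (e_p / 3.0) * np.diag([2.0, 2.0, 2.0, 1.0, 1.0, 1.0])
    increment = (1.0 - beta) * (eps_p - eps_p_prev) * (d_alpha @ np.asarray(flow_vector, dtype=float))
    return np.asarray(b_prev, dtype=float) + increment


# Illustrative uniaxial flow vector and a plastic strain increment of 1%.
a = np.array([1.0, -0.5, -0.5, 0.0, 0.0, 0.0])
b_kin = update_back_stress(np.zeros(6), 0.01, 0.0, a, e_p=50000.0, beta=0.0)
b_iso = update_back_stress(np.zeros(6), 0.01, 0.0, a, e_p=50000.0, beta=1.0)
print(von_mises(b_kin), von_mises(b_iso))
```

With \(\beta = 1\) the back stress does not evolve and only the yield stress grows, whereas with \(\beta = 0\) the yield stress stays at its initial value and the yield surface translates.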

In addition to the state variables in Eq. (30), the back stress is taken as a further state variable, i.e.

$${}^{t}{\varvec{V}}\stackrel{\scriptscriptstyle\mathrm{def}}{=}\left\{{}^{t}\overline{{\varvec{\tau}} },{}{}^{t}{\varepsilon }_{\mathrm{eqv}}^{\mathrm{p}},{{{}^{t}{\varvec{X}}}^{p}}^{-1},{}^{t}{\overline{\varvec{b}}}\right\}$$
(57)

Correspondingly, Eq. (55) is an additional governing equation in the dependent residual:

$${}^{t}{\varvec{H}}\left({}^{t}{\varvec{U}},{}^{t}{\varvec{V}},{}^{t-1}{\varvec{V}},s\right)=\left(\begin{array}{c}\begin{array}{c}{}^{t}\overline{{\varvec{\tau}} }-{{\varvec{D}}}^{{\mathrm{e}}}\ln{{}^{t}{\varvec{U}}}_{*}^{e}+{{\varvec{D}}}^{{\mathrm{e}}}\left({{}{}^{t}\varepsilon }_{\mathrm{eqv}}^{\mathrm{p}}-{{}{}^{t-1}\varepsilon }_{\mathrm{eqv}}^{\mathrm{p}}\right){}^{t}\mathbf{a}\\ \left({}{}^{t}{\varepsilon }_{\mathrm{eqv}}^{\mathrm{p}}-{}{}^{t-1}{\varepsilon }_{\mathrm{eqv}}^{\mathrm{p}}\right)\cdot \left[{\left|{}^{t}\overline{{\varvec{\tau}} }-{}^{t}\overline{{\varvec{b}} }\right|}_{\mathrm{eqv}}-{\sigma }_{\mathrm{Y}}\left({}{}^{t}{\varepsilon }_{\mathrm{eqv}}^{\mathrm{p}}\right)\right]\end{array}\\ {{{}{}^{t}{\varvec{X}}}^{p}}^{-1}-{{{}{}^{t-1}{\varvec{X}}}^{p}}^{-1}\cdot {e}^{-\left({{}{}^{t}\varepsilon }_{\mathrm{eqv}}^{\mathrm{p}}-{{}{}^{t-1}\varepsilon }_{\mathrm{eqv}}^{\mathrm{p}}\right){}^{t}\mathbf{a}}\\ {}^{t}\overline{{\varvec{b}} }-{}^{t-1}\overline{{\varvec{b}} }-\left(1-\beta \right){{\varvec{D}}}^{\boldsymbol{\alpha }}\left({{}{}^{t}\varepsilon }_{\mathrm{eqv}}^{\mathrm{p}}-{{}{}^{t-1}\varepsilon }_{\mathrm{eqv}}^{\mathrm{p}}\right){}^{t}\mathbf{a}\end{array}\right)\equiv 0$$
(58)

There are no other changes in the adjoint sensitivity analysis procedure. The two properties and the two load step reduction rules presented in Sects. 4.1 and 4.2 can be proved following the same procedures as in Appendices C to F. Therefore, the same empirical rule as in Sect. 4.2 is also proposed to reduce plastic load steps in the sensitivity analysis.

The following example demonstrates the applicability of the empirical rule for kinematic hardening elastoplasticity. In this example, the same conrod model as in Sect. 5.3 is used, including the structure, boundary conditions and design nodes. The material properties are also the same as in Table 3, except that the hardening model is pure kinematic, i.e. \(\beta =0\). The load history on the small end of the conrod is depicted in Fig. 15, which is the same as in Fig. 11. However, it should be noted that all the load steps behave plastically with the kinematic hardening material.

Fig. 15 Load history of connecting rod example with kinematic hardening model

The contours of the equivalent plastic strain at the turning steps of monotonic load procedures are depicted in Fig. 16. Areas where the equivalent plastic strain is larger than 20% are depicted in red. The maximum equivalent plastic strain at the final step is 73%. The contours show that the structure experiences severe back and forth bending under the given load history.

Fig. 16 Deformation and contour of equivalent plastic strain at representative load steps of conrod example under kinematic hardening model

According to the empirical rule, all the turning steps which are plastic should be involved in the sensitivity analysis. These are step 7, step 12, step 16 and step 19. In comparison with the isotropic hardening case, step 12 is additional, because it is a plastic step under the kinematic hardening model. The sensitivity results of the average equivalent plastic strain response, which is defined in Sect. 5.3, are presented in Fig. 17.

Fig. 17 Sensitivity results of equivalent plastic strain response

The figure shows that the sensitivity results with reduced load steps match well with the results using all load steps. Excluding sensitivities whose absolute value is smaller than 1% of the largest value, the maximum percentage error is 41% with an average error of only 3.6%. The point with the maximum error is also identified in Fig. 17. It shows that the large error is due to the small denominator in the calculation with Eq. (53). Therefore, the applicability of the empirical rule to the kinematic hardening model is demonstrated.

6.2 Extension to nonlinear elasticity and multilinear plasticity

The deductions and examples in this paper are all based on bilinear elastoplasticity. However, the applicability can be extended to nonlinear elasticity and multilinear plasticity without difficulties.

For nonlinear elasticity, the change lies in the dependency of the elastic modulus on the total strain. The elastic constitutive relation \({{\varvec{D}}}^{e}\) is not constant, but depends on the strain at each load step:

$${{\varvec{D}}}^{{\mathrm{e}}}={}^{t}{\varvec{D}}(\ln{}{}^{t}{{\varvec{U}}}_{\boldsymbol{*}}^{e})$$
(59)

Therefore, the dependent residual is formulated as

$${}^{t}{\varvec{H}}\left({}^{t}{\varvec{U}},{}^{t}{\varvec{V}},{}^{t-1}{\varvec{V}},s\right)=\left(\begin{array}{c}\begin{array}{c}{}^{t}\overline{{\varvec{\tau}} }-{}^{t}{\varvec{D}}\ln{{}^{t}{\varvec{U}}}_{*}^{e}\\ {}{}^{t}{\varepsilon }_{\mathrm{eqv}}^{\mathrm{p}}-{}{}^{t-1}{\varepsilon }_{\mathrm{eqv}}^{\mathrm{p}}\end{array}\\ {{{}{}^{t}{\varvec{X}}}^{p}}^{-1}-{{{}{}^{t-1}{\varvec{X}}}^{p}}^{-1}\end{array}\right)\equiv 0$$
(60)

For multilinear plasticity, the change is in the relation of yield strength to the equivalent plastic strain. The formulation of dependent residual is the same as in Eq. (35). The only change appears when the derivative of yield strength with respect to equivalent plastic strain is calculated, where the result is not a constant \({E}^{\mathrm{p}}\), but a function of \({}^{t}{\varepsilon }_{\mathrm{eqv}}^{\mathrm{p}}\):

$$\frac{\mathrm{d}{{}{}^{t}\sigma }_{\mathrm{Y}}}{\mathrm{d}{}_{ }{}^{t}{\varepsilon }_{\mathrm{eqv}}^{\mathrm{p}}}={}_{ }{}^{t}{E}_{ }^{p}({}_{ }{}^{t}{\varepsilon }_{\mathrm{eqv}}^{\mathrm{p}})$$
(61)
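For a multilinear hardening curve, Eq. (61) simply returns the slope of the segment that contains the current equivalent plastic strain. The sketch below uses a hypothetical three-point hardening table (the values are illustrative and not taken from Table 3) and extrapolates the last segment beyond the final tabulated point, which is an assumption of this illustration.

```python
import numpy as np

# Hypothetical multilinear hardening table: plastic strain vs. yield stress (MPa).
EPS_P = np.array([0.0, 0.02, 0.10])
SIGMA_Y = np.array([235.0, 335.0, 415.0])


def segment_index(eps_p):
    """Index of the linear segment containing eps_p (last segment extrapolated)."""
    i = np.searchsorted(EPS_P, eps_p, side="right") - 1
    return int(np.clip(i, 0, len(EPS_P) - 2))


def plastic_modulus(eps_p):
    """Eq. (61): derivative of the yield stress w.r.t. equivalent plastic strain."""
    i = segment_index(eps_p)
    return float((SIGMA_Y[i + 1] - SIGMA_Y[i]) / (EPS_P[i + 1] - EPS_P[i]))


def yield_stress(eps_p):
    """Yield stress on the multilinear curve, consistent with plastic_modulus."""
    i = segment_index(eps_p)
    return float(SIGMA_Y[i] + plastic_modulus(eps_p) * (eps_p - EPS_P[i]))


for eps in (0.01, 0.05):
    print(eps, yield_stress(eps), plastic_modulus(eps))
```

In the bilinear case the table has a single segment and `plastic_modulus` reduces to the constant \({E}^{\mathrm{p}}\).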

With these changes, all the derivations presented in this paper still hold. Therefore, the two properties, the two load step reduction rules and the empirical rule still apply.

7 Conclusion

In this paper, a load step reduction method for adjoint sensitivity analysis of finite strain elastoplasticity is investigated. Two properties regarding adjoint variables are proved theoretically. Based on these properties, it is proved that intermediate elastic load steps can be skipped in the sensitivity analysis without loss of accuracy; under certain conditions, the former of two consecutive plastic load steps can also be skipped.

The applicability for general element types is demonstrated through structures with 3D solid elements. Both isotropic hardening and kinematic hardening examples are presented. The numerical examples show that the strategies apply well for structures under complicated load histories and severe deformation. Following the presented method, satisfactory results can be obtained with a significantly reduced number of load steps in the sensitivity analysis.