Automatic Differentiation for Solid Mechanics


Automatic differentiation (AD) is an ensemble of techniques that allows to evaluate accurate numerical derivatives of a mathematical function expressed in a computer programming language. In this paper we use AD for stating and solving solid mechanics problems. Given a finite element discretization of the domain, we evaluate the free energy of the solid as the integral of its strain energy density, and we make use of AD for directly obtaining the residual force vector and the tangent stiffness matrix of the problem, as the gradient and the Hessian of the free energy respectively. The result is a remarkable simplification in the statement and the solution of complex problems involving non trivial constraints systems and both geometrical and material non linearities. Together with the continuum mechanics theoretical basis, and with a description of the specific AD technique adopted, the paper illustrates the solution of a number of solid mechanics problems, with the aim of presenting a convenient numerical implementation approach, made easily available by recent programming languages, to the solid mechanics community.

This is a preview of subscription content, access via your institution.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9


  1. 1.

    Vigliotti A (2019) Automatic differentiation for solid mechanics, example scripts. Mendeley Data, v1. Accessed 07 Jan 2020

  2. 2.

    Asaro R, Lubarda V (2006) Mechanics of solids and materials. Cambridge University Press, Cambridge.

    Google Scholar 

  3. 3.

    Bathe K (2014) Finite element procedures. Prentice Hall, Upper Saddle River

    Google Scholar 

  4. 4.

    Bezanson J, Edelman A, Karpinski S, Shah V (2017) Julia: a fresh approach to numerical computing. SIAM Rev 59(1):65–98.

    MathSciNet  Article  MATH  Google Scholar 

  5. 5.

    Bischof C, Khademi P, Mauer A, Carle A (1996) Adifor 2.0: automatic differentiation of fortran 77 programs. IEEE Comput Sci Eng 3(3):18–32.

    Article  Google Scholar 

  6. 6.

    Bonet J, Wood RD (2008) Nonlinear continuum mechanics for finite element analysis, 2nd edn. Cambridge University Press, Cambridge.

    Google Scholar 

  7. 7.

    Callen H (1985) Thermodynamics and an introduction to thermostatistics. Wiley, New York

    Google Scholar 

  8. 8.

    Corliss G, Faure C, Griewank A, Hascoet L, Naumann U (2002) Automatic differentiation of algorithms: from simulation to optimization. Springer, New York

    Google Scholar 

  9. 9.

    Deshpande VS, Ashby MF, Fleck NA (2001a) Foam topology: bending versus stretching dominated architectures. Acta Mater 49(6):1035–1040

    Article  Google Scholar 

  10. 10.

    Deshpande VS, Fleck NA, Ashby MF (2001b) Effective properties of the octet-truss lattice material. J Mech Phys Solids 49(8):1747–1769

    Article  Google Scholar 

  11. 11.

    Elliott C (2018) The simple essence of automatic differentiation (differentiable functional programming made easy). CoRR abs/1804.00746. arXiv:1804.00746

  12. 12.

    Fike JA, Alonso JJ (2011) The development of hyper-dual numbers for exact second-derivative calculations. In: AIAA paper 2011-886, 49th AIAA aerospace sciences meeting

  13. 13.

    Forth S, Hovland P, Phipps E, Utke J, Walther A (eds) (2012) Recent advances in algorithmic differentiation. Springer, Berlin

    Google Scholar 

  14. 14.

    Fuller RB (1966) U.s. patent serial no. 2,986,241

  15. 15.

    Griewank A, Walther A (2008) Evaluating derivatives: principles and techniques of algorithmic differentiation, 2nd edn. Other titles in applied mathematics. Society for Industrial and Applied Mathematics, Philadelphia

    Google Scholar 

  16. 16.

    Hogan R (2014) Fast reverse-mode automatic differentiation using expression templates in c++. ACM Trans Math Softw.

    MathSciNet  Article  MATH  Google Scholar 

  17. 17.

    Hughes T, Cottrell J, Bazilevs Y (2005) Isogeometric analysis: Cad, finite elements, nurbs, exact geometry and mesh refinement. Comput Methods Appl Mech Eng 194(39):4135–4195.

    MathSciNet  Article  MATH  Google Scholar 

  18. 18.

    Kantor IL, Solodovnikov AS (1989) Hypercomplex numbers: an elementary introduction to algebras. Springer, Berlin

    Google Scholar 

  19. 19.

    Korelc J, Wriggers P (2016) Automation of finite element methods. Springer, Cham.

    Google Scholar 

  20. 20.

    Logg A (2007) Automating the finite element method. Arch Comput Methods Eng 14(2):93–138.

    MathSciNet  Article  MATH  Google Scholar 

  21. 21.

    Logg A, Mardal KA, Wells GN et al (2012) Automated solution of differential equations by the finite element method. Springer, Berlin.

    Google Scholar 

  22. 22.

    Lyness J (1968) Differentiation formulas for analytic functions. Math Comput 22(102):352–362.

    MathSciNet  Article  MATH  Google Scholar 

  23. 23.

    Lyness J, Moler C (1967) Numerical differentiation of analytic functions. SIAM J Numer Anal 4(2):202–210.

    MathSciNet  Article  MATH  Google Scholar 

  24. 24.

    Margossian CC (2018) A review of automatic differentiation and its efficient implementation. CoRR abs/1811.05031. arXiv:1811.05031

  25. 25.

    Naumann U (2012) The art of differentiating computer programs: an introduction to algorithmic differentiation. Software, environments, and tools. Society for Industrial and Applied Mathematics, Philadelphia

    Google Scholar 

  26. 26.

    Ogden R (2013) Non-linear elastic deformations. Dover Publications, Dover Civil and Mechanical Engineering, New York

    Google Scholar 

  27. 27.

    Perkel J (2019) Julia: come for the syntax, stay for the speed. Nature 572(7767):141–142.

    Article  Google Scholar 

  28. 28.

    Revels J, Lubin M, Papamarkou T (2016) Forward-mode automatic differentiation in Julia. arXiv:160707892 [csMS]

  29. 29.

    Timoshenko S, Goodier J (1987) Theory of elasticity. McGraw-Hill Book Company, New York

    Google Scholar 

  30. 30.

    Zienkiewicz O, Taylor R (2005) The finite element method for solid and structural mechanics, 6th edn. Elsevier, Amsterdam

    Google Scholar 

  31. 31.

    Zienkiewicz O, Taylor R, Zhu J (2005) The finite element method: its basis and fundamentals. Elsevier, Amsterdam

    Google Scholar 

Download references


A.V. acknowledges support from the Italian Aerospace Research Program (PRO.R.A., through the funding of the METMAT project.

Author information



Corresponding author

Correspondence to Andrea Vigliotti.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Appendix A

Implementation of Dual Number Systems in the Julia Programming Language

In this section we illustrate a possible implementation of the dual number system in the Julia programming language. Julia is a dynamically typed scientific programming language, whose semantic is particularly suitable for the description of physical problems [4, 27]. Similarly to other object-oriented programming languages, Julia allows the user-defined data types, and permits to overload existing operators or functions to be evaluated on the new types. Therefore the same script that evaluate a numeric function on floating point numbers, can be used to operate on dual numbers, once their arithmetic has been implemented, and produce dual number as a result.

The script block 2 shows a possible implementation of dual numbers in Julia. The dual number type is called D2 and is defined in lines 1–5, having a scalar component v, that stores the current value of the variable, a one dimensional array, d1, that stores all of the first derivatives of v, and a two dimensional array, d2, that stores all of the second derivatives of v.


The script block that follows, extend some ordinary maths operators to function with the D2 type. The first line in the script block informs the language that the scope of the mentioned operators, defined in the Base module, will be extended, and the lines 2–12 implement the arithmetic of dual numbers as defined in Sect. 3.3.1, where each component of a dual number value is accessed through the dot syntax (.), and the single quote (\(^\prime\)) denotes array transposition.


In the following block the function given by Eq. (31) is defined in the first line, and it is immediately evaluated for the values x1=x2=x3=1. In the lines 5-7, the variables x1, x2 and x3 are defined as dual quantities, of the D2 type, where the first argument of the call to the D2 constructor is the value of the variable, which we defined as real part, the second argument is the gradient of each variable, and the second argument is the Hessian. We remark that the gradient of the variable identifies them as independent variables, whose only non zero gradient component is the one relative to the same variable, an is equal to one, while all other derivatives of any order is nought.


The section that follows shows the output of the println commands in the listing 3. As we can observe, y0, the output of call to y(x1,x2,x3) with floating point number is a floating point number, while yd, the output of call to y(x1,x2,x3) with D2-type number is a dual number, whose value coincides with y0, and whose gradient and Hessian coincides with the gradient and Hessian of Eq. (31), in the point where the function has been evaluated, as given by Eq. (33).


We observe that having overloaded the operators involved in the definition of y(x1,x2,x3), allowed us to call the same function with both data type, without making any modification, or having to add any specification to the function itself.

We remark that the implementation of dual numbers in Julia as presented in this section is an attempt to provide a brief and clear illustration of a possible computer implementation of AD, through operators overloading, nonetheless in this form it does not exploit any of the powerful features offered by the Julia programming language, like parametric types, and macros [4, 27]. The implementation developed for the solution of the example presented in the paper, available through [1], which is based on [28], makes a better use of Julia’s features and functionalities and ensures better performances than the example presented in this section.

Appendix B

Arbitrary Order Dual Number Systems

In this section we briefly generalize the definition of dual numbers to an arbitrary order of differentiation. Let \(\varvec{x}\) be a dual number of dimension N and order K

$$\begin{aligned} \begin{aligned}&\varvec{x} \equiv x_0 + x_{i_1} \imath _{i_1} + x_{i_1i_2} \,\imath _{i_1i_2} + x_{i_1i_2i_3} \,\imath _{i_1i_2i_3} + \cdots \\&+ x_{i_1\dots i_K} \,\imath _{i_1\dots i_K} \quad \text {with} \qquad {\left\{ \begin{array}{ll} i_1 &{}\quad \in 1\dots N \\ i_2 &{}\quad \in i_1 \dots N \\ i_3 &{}\quad \in i_2 \dots N \\ &{}\quad \vdots \\ i_{K} &{}\quad \in i_{K-1} \dots N \\ \end{array}\right. } \end{aligned} \end{aligned}$$

with \(\imath _{j}\) the canonical base of \({\mathcal {R}}^N\), with \(j \in 1 \dots N\), and \(\varvec{\imath }_i, \varvec{\imath }_{ij}, \imath _{ijk} \dots , \imath _{i_1\dots i_K}\) are symbols defined as

$$\begin{aligned} \begin{aligned} \imath _{i_1 i_2}&\equiv \imath _{i_1}\otimes \imath _{i_2} + \imath _{i_2}\otimes \imath _{i_1}\\ \imath _{i_1 i_2 i_3}&\equiv \imath _{i_1} \otimes \imath _{i_2} \otimes \imath _{i_3} + \imath _{i_1} \otimes \imath _{i_3}\otimes \imath _{i_2} + \imath _{i_3} \otimes \imath _{i_1}\otimes \imath _{i_2} \\&\quad + \imath _{i_3} \otimes \imath _{i_2}\otimes \imath _{i_1} + \imath _{i_2} \otimes \imath _{i_3}\otimes \imath _{i_1} + \imath _{i_2} \otimes \imath _{i_1}\otimes \imath _{i_3} \\& \vdots \vdots \\ \imath _{i_1\dots i_K}&\equiv \sum _{I^K \in \Pi (K)} \imath _{I^K_1} \otimes \cdots \otimes \imath _{I^K_K} \end{aligned} \end{aligned}$$

where \(I^K\) is a permutation of the indices \(1 \dots K\), \(I_i^K\) are its elements, and \(\Pi (K)\) is the set of all the permutations of the indices \(1 \dots K\). With respect to Eq. (70) we observe that the following holds

$$\begin{aligned} \begin{aligned} \varvec{\imath }_{ij}&= \varvec{\imath }_{ji}\\ \imath _{ijk}&= \imath _{ikj} = \imath _{jik} = \imath _{jki} = \imath _{kij} =\imath _{kji} \\& \vdots \vdots \\ \imath _{I^K}&= \imath _{J^K} \forall \, I^K,J^K \in \Pi (K) \end{aligned}. \end{aligned}$$

The quantities \(x_0, x_i, x_{ij}, x_{ijk} \dots , x_{i_1\dots i_K}\) are real scalars and are the components of \(\varvec{x}\), \(x_0\) is the real part of \(\varvec{x}\), the remaining are dual parts of order \(1, 2, \dots K\). Two dual numbers of dimension N and order K are identical if and only if all of their components are identical, as follows

$$\begin{aligned} \varvec{x}=\varvec{y} \quad \iff \quad {\left\{ \begin{array}{ll} x_0 = y_0 \\ x_i = y_i \\ \vdots \qquad \vdots \\ x_{i_1\dots i_K} = y_{i_1\dots i_K} \end{array}\right. } \end{aligned}$$

The sum of two dual numbers is defined as the dual number whose components are the sum of the components, as follow

$$\begin{aligned} \varvec{z} = \varvec{x}+\varvec{y} \quad \iff \quad {\left\{ \begin{array}{ll} z_0 = x_0 + y_0 \\ z_i = x_i + y_i \\ \vdots \vdots \\ z_{i_1\dots i_K} = x_{i_1\dots i_K} + y_{i_1\dots i_K} \end{array}\right. } \end{aligned}$$

The product of two dual numbers is a dual number obtained as the sum of the mixed products of their components, where the following rules applies for the product of the symbols \(\varvec{\imath }_i, \varvec{\imath }_{ij}, \imath _{ijk} \dots , \imath _{i_1\dots i_K}\)

$$\begin{aligned} \begin{aligned} \varvec{\imath }_i \varvec{\imath }_j&\equiv \varvec{\imath }_{ij} \\ \varvec{\imath }_i \varvec{\imath }_j \imath _k&= \imath _{i} \imath _{jk} \equiv \imath _{ijk}\\&\vdots \vdots \\ \varvec{\imath }_{1} \dots \imath _K&= \varvec{\imath }_{1} \imath _{2\dots K} \equiv \imath _{1\dots K} \end{aligned}{,} \end{aligned}$$
$$\begin{aligned} \varvec{\imath }_i\imath _{1\dots K} \equiv 0 , \end{aligned}$$

where Eq. (74) produce the contribution to higher terms in the product as results of the products of lower order terms in the factors, and Eq. (75) ensures that no component with order higher than K appears in the result. The components of the product are given as

$$\begin{aligned} \varvec{z} = \varvec{x} \varvec{y} \iff {\left\{ \begin{array}{ll} z_0 = x_0 y_0 \\ z_i = x_iy_0 + x_0y_i\\ z_{ij} = x_{ij}y_0 + x_i y_j + x_j y_i + x_0 y_{ij} \\ z_{ijk} = x_{ijk}y_0 + x_{ij} y_k + x_i y_{jk} + x_0 y_{ijk} \\ \quad \vdots \vdots \\ z_{i_1\dots i_K} = x_{i_1\dots i_K}y_0 + x_{i_1\dots i_{K-1}}y_{i_K} + \dots + x_0y_{i_1\dots i_K} \end{array}\right. }, \end{aligned}$$

With reference to the quotient of two dual numbers, we observe that this operation is equivalent to the product of the first time the inverse of the second, where the inverse of a dual number is obtained by solving the following

$$\begin{aligned} \frac{1}{\varvec{x}} = \varvec{y} \iff \ \varvec{y}\varvec{x} = 1, \end{aligned}$$

from which it results

$$\begin{aligned} \frac{1}{\varvec{x}} = \varvec{y} \iff {\left\{ \begin{array}{ll} y_0x_0 = 1 \\ x_iy_0 + x_0y_i =0 \\ x_{ij}y_0 + x_i y_j + x_j y_i + x_0 y_{ij} =0\\ x_{ijk}y_0 + x_{ij} y_k + x_i y_{jk} + x_0 y_{ijk} =0\\ \quad \vdots \vdots \\ x_{i_1\dots i_K}y_0 + x_{i_1\dots i_{K-1}}y_{i_K} + \dots + x_0y_{i_1\dots i_K} =0 \end{array}\right. }, \end{aligned}$$

where we observe that the system (78) is an lower diagonal system, which can be easily solved by recursive substitution starting from the first equation.

Rights and permissions

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Vigliotti, A., Auricchio, F. Automatic Differentiation for Solid Mechanics. Arch Computat Methods Eng (2020).

Download citation