A thermomechanical finite strain shape memory alloy model and its application to bistable actuators

This work presents a thermomechanical finite strain shape memory alloy model that utilizes a projection method to deal with the incompressibility constraint on inelastic strains. Due to its finite strain formulation, it is able to accurately predict the behavior of shape memory alloys with high transformation strains. The key feature of this model is the thermomechanical modeling of the shape memory effect and superelastic behavior by optimizing a global, incremental mixed thermomechanical potential, the variation of which yields the linear momentum balance, the energy balance, the evolution equations of the internal variables as well as boundary conditions of Neumann- and Robin-type. The proposed thermal strain model allows to properly capture transformation induced volume changes, which occur in some shape memory alloys. A finite strain dissipation potential is formulated, which incorporates the disappearance of inelastic strains upon austenite transformation. This important property is consistently transferred to the time-discrete potential using a logarithmic strain formulation. Yield and transformation criteria are derived from the dual dissipation potential. The implementation based on an active set search and the algorithmically consistent linearization are discussed in detail. The model is applied in three-dimensional simulations of a bistable actuator design to explore its capabilities.


Introduction
Since the discovery of their unique properties, shape memory alloys are used in many medical and engineering fields in various applications. Their frequent appearance stems from their unique features, like the shape memory effect and superelasticity. Both effects emerge from the characteristic first-order phase transition from the austenite to the martensite state and vice versa. While the occurring crystallographic effects involving the detwinning of the martensite phase, which enables the shape memory effect, are well understood, the thermomechanical modeling of the occurring effects is not straightforward. In recent times, there has been a high effort to improve shape memory alloy models and to obtain fitting models for specific use cases (see, for reviews, Section 4 of Lester et al. [27] or Cisse et al. [8]). They can be roughly categorized into three classes: models based on statistical thermodynamics, models founded in micromechanics and phenomenological models.
The models based on statistical thermodynamics rely on finding the phase equilibrium through a minimization of a three well potential energy (e.g., Seelecke and Müller [43] or Govindjee et al. [16]). Since these models yield results that also consider the microstructure of the materials, they come mostly with a computational cost which is too high for large, structural simulations. Additionally, gathering required micromechanical material parameters is sometimes an elaborate task.
On the other hand, models based on micromechanics usually consider the mechanics of shape memory alloy (SMA) single-crystals. Many models are then extended into the regime of polycrystal modeling by usage of homogenization techniques (e.g., see Patoor et al. [35] and Lagoudas et al. [25] or the more recent models by Mirzaeifar et al. [29] and Yu et al. [55]). While these models consider the deep, underlying phenomena of shape memory alloys, this advantage again comes with a high computational cost. This makes it really challenging to use these models in large structural simulations of shape memory alloy actuators.
The third group of models is the class of phenomenological models. Usually, they come with the advantage of only having macroscopic material parameters, which mostly are obtained through tensile tests at different temperatures. In recent years, due to the plethora of shape memory alloys effects and applications, many new shape memory alloy models were published, which try to include more and more physical phenomenons. They can be divided into two subgroups: models which include a geometrically linear theory (see, e.g., Auricchio et al. [3], Auricchio et al. [4] and Sedlak et al. [42]), and models which include a geometrically nonlinear theory.
The two main ways to include a geometric nonlinearity in the shape memory alloy model is to either employ a multiplicative split of the deformation gradient going back to [10,24,26] or to make use of an additive split (see Nemat-Nasser [32]) of the rate of deformation, which are both well known from plasticity. While models based on the additive split are computationally enticing, they only allow for small strains, while still capturing large rotations well (see, e.g., Qidwai and Lagoudas [37], Müller and Bruhns [31] or Zhang and Baxevanis [57]). On the other hand, models employing a multiplicative split are computationally more elaborate, but can represent finite stretches well (see, e.g., Reese and Christ [38], Arghavani et al. [1] or Wang et al. [49]). Additionally, there exist many new models considering geometric nonlinearities, which are limited to superelasticity (see, e.g., Bellini et al. [5], Wang et al. [48] and Rezaee-Hajidehi et al. [39]). Furthermore, some models also consider transformation induced plasticity (see, e.g., Hartl et al. [19], Xu et al. [53] and it's extension to partial phase transformations by Scalet et al. [41]).
Many of these models capture thermomechanics, martensite reorientation and detwinning as well as superelasticity at different temperatures or allow for different elastic properties of the materials while being numerically efficient and robust. However, when trying to model bistable shape memory actuators (see the actuator design in Arivanandhan [2]), one needs a numerically robust, fully thermomechanically coupled finite strain model that also models thermal expansion and volumetric effects during phase transition (see, e.g., Potapov et al. [36]). To our knowledge, there is no model in the literature fulfilling all of the aforementioned requirements, which led to the model described in this paper. For example, the model of Wang et al. [49] considers a thermomechanically coupled finite strain theory which is capable of modeling the shape memory effect as well as superelasticity, neglecting volumetric effects due to transformation as well as different expansion coefficients of the SMA phases. The model at hand falls into the aforementioned category of phenomenological models. It is embedded into the generalized standard materials framework developed by Halphen and Nguyen [17], which was extended to thermomechanics by Yang et al. [54] and which allows to ensure thermodynamic consistency. The energies as well as the dissipation potential can be seen as an extension of Sedlak et al. [42] to the finite strain case. Because the satisfaction of the incompressibility of inelastic strains for finite strains is not as straightforward as for the small strain case, a projection method developed for plasticity is incorporated into the model (see Hurtado et al. [21] and Sielenkämper et al. [44]). Further, due to the character of the energies used in this model, special numerical treatment is necessary to solve the model equations using a Newton scheme. Since our aim is to model microactuators in which the R-Phase is not present, it is not incorporated into the model. Additionally, tension-compression anisotropy, which is an important effect in many shape memory alloys (for experimental publications see, e.g., Gall et al. [15] or Wang and Zhu [47], and for modeling approaches see, e.g., Zaki et al. [56] or Sedlak et al. [42]), is not included in the model.
In the past, numerous advancements to current microelectromechanical systems (MEMS) based on electrostatics, magnetism and electrothermal principals have been made. For example, Hoffmann et al. [20] proposed a microactuator based on electrothermal activation making use of bimetal effects. This, however, comes with the downside of low actuation frequencies and a high power consumption. Han et al. [18] proposed an electrostatic actuator-based micro-switch for photonics. Devices based on electrostatics usually can be actuated with high frequencies and are adaptable to many applications. Devices using optomechanics were developed by, e.g., Eichenfield et al. [11], which come with a rather small tuning range, but allow for a very high operation speed [9]. Despite their unique advantages, current MEMS devices are challenging to use in downsized applications, where a high work output combined with a high power efficiency and bistability is crucial. The microactuator modeled in this paper is based on a concept published by Winzek et al. [50], which utilizes a high temperature shape memory alloy with a large thermal hysteresis. This concept may overcome the aforementioned weaknesses of other actuation principles, as shape memory alloys usually have a large work output density and favorable downscaling capabilities (see, e.g., Kohl [22]). Additionally, downscaling this design is expected to drastically increase the possible actuation frequency in comparison to other, electrothermally activated actuators due to the decreasing masses and increasing thermal gradients. One further key aspect is, that the proposed actuator design requires no power in the stable states.
The paper is structured as follows: First, the energies as well as the dissipation potential is derived. Then, in Sect. 3, numerical strategies necessary to solve the model equations are discussed. The numerical results are subsequently shown in Sect. 4 before a summary and outlook concludes the paper in Sect. 5. Notation Throughout this paper, a direct tensor notation is preferred. Scalars and scalar valued functions are typeset using light-face italic characters, e.g., a or A. First and second-order tensors and tensor-valued functions are represented by bold-face italic letters, e.g., a or A. Further, blackboard bold-faced letters are used to denote fourth-order tensors, e.g., or C.
Additionally, the transpose of a second-order tensor is designated by A T , while the major transpose of a fourth-order tensor is given by C T . The symmetric and deviatoric part of a second-order tensor A are denoted by sym( A) = 1 2 ( A + A T ) and A = A − 1 3 tr( A)I, respectively. Here, I is the second-order identity tensor and tr( A) denotes the trace of A. A double contraction of two tensors A and B is denoted by A : B, while the dyadic product is denoted by a ⊗ b. A determinant of a tensor A is either designated by det( A) or by A's third invariant III A .

Kinematics
The deformation gradient F maps a line element from the reference configuration of a body with Volume V 0 into the current configuration of a body with Volume V and is defined as where Grad(•) refers to the gradient with respect to the reference configuration while X and x are the position vectors of a material point in the reference and current configuration, respectively. We consider a multiplicative split of the deformation gradient in the form (see Wang et al. [49]), where F e is the elastic, F i the isochoric part of the deformation due to transformation 1 and F θ the part which describes the volume change due to thermal expansion and transformation 2 of the deformation gradient. This is fairly similar to the multiplicative split in plasticity going back to the works of Eckart [10], Kröner [24] and Lee [26]. Further, we define the determinant J θ = J θ (θ, ξ ) = det(F θ ) to be a function of the absolute temperature θ and the martensite volume fraction ξ ∈ [0, 1]. Since the thermal deformation is assumed to be volumetric, we can express F θ in terms of J θ as As is commonly done and will be useful later, we define the elastic and inelastic left Cauchy-Green tensors as Likewise, we define the inelastic right Cauchy-Green tensor and the inelastic Green-Lagrange strain 1 For brevity, F i is called the inelastic part of the deformation gradient throughout this paper. 2 For brevity, F θ is called the thermal part of the deformation gradient throughout this paper.
Motivated by the observation that the shape-memory effect is caused by an almost 3 volume preserving transformation of the crystal lattice, we assume C i to be volume preserving. Therefore, because det(F i ) = 1 has to hold, the determinant of the deformation gradient is given by with J e = det(F e ). Additionally, we define the velocity gradient l and its symmetric part d as Finally, we define the inelastic 'velocity gradient' L i and its symmetric part D i in analogy to l and d by 2.2 (Im-)Balance equations

Momentum balances
The quasistatic linear momentum balance for a body with volume V in the current configuration is given by where b is the body force and ρ is the mass density. Further, Cauchy's lemma t = σ n as well as the angular momentum balance σ = σ T are assumed to hold.

Energy balance
Next, the balance of the energies is given bẏ where u is the internal energy density per unit reference volume, τ denotes the Kirchhoff stress tensor, q is the heat flux vector in the current configuration and w represents the energy source term. Further, the divergence of the heat flux vector Q = J F -1 q with respect to the reference configuration is then given by J div(q) = Div( Q).

Clausius-Planck inequality
To be thermodynamically consistent, the Clausius-Planck inequality where D is the mechanical dissipation density per unit reference volume and s is the entropy density per unit reference volume, has to be fulfilled at any time. Further, the Helmholtz free energy, which is defined by ψ = u − θ s, is introduced. We assume ψ to be a function of the deformation gradient F, the absolute temperature θ and the internal variables ξ and C i , i.e., ψ = ψ(F, C i , ξ, θ). Furthermore, we make the common assumption that D i.e., there is no energy dissipation when the internal variables are virtually fixed. As Eq. (12) must hold for arbitrary processes, one can show the following standard results: Now, it is easy to show that D = i : D i + qξ ≥ 0 ( 1 4 ) with shorthand notations for the effective Mandel stress with respect to the intermediate configuration i = −2F i ∂ψ/∂ C i F iT and the thermodynamic force q = −∂ψ/∂ξ associated with the martensite volume fraction ξ .

Helmholtz free energy
The Helmholtz free energy density ψ for this model is assumed to be a sum of elastic, chemical and hardeninglike contributions in the form

Elastic energy
The elastic energy ψ e is assumed to be isotropic and to follow a modified Neo-Hookean formulation: Here, λ(ξ ) and μ(ξ ) are the Lamé parameters in dependence of the martensite volume fraction ξ . Further, we use a Reuss-like mixture rule to estimate the elastic constants, i.e., Here, and subsequently, the indices • A and • M refer to the austenite and martensite phase, respectively. Manipulating Eq. (13), one can show that the ordinary form holds for elastic isotropy. Using Eqs. (14) and (18), one can show that where e = C e F e−1 τ F e−T is the Mandel stress with respect to the intermediate, elastically unloaded configuration, which is symmetric due to the assumption of elastic isotropy.

Chemical energy
For the chemical energy, we assume a standard relationship (see, e.g., Lexcellent et al. [28] or Panico and Brinson [34]): where c is the specific heat capacity and s AM is the difference in specific entropy of the austenite and martensite phase: Here, c is assumed constant (compare [42]). In Eq. (20), we made use of the definition of the equilibrium temperature of austenite and martensite, which is θ 0 = u AM / s AM . Further, we assume for the equilibrium temperature A s > θ 0 > M s , where A s is the temperature where the reverse transformation starts and M s is the starting temperature of the forward transformation.

Hardening energy
Since the inelastic strains vanish as ξ → 0, the inelastic strains E i are assumed to satisfy the relation E i = ξ E t (see Otsuka and Ren [33]), where E t is a measure for the effective transformation strain. Now, the hardeninglike energy adapted from Sedlak et al. [42] for finite strains is assumed to be given by Here, k is the maximum transformation strain, E int is a hardening related parameter and E t is a modified von Mises equivalent strain. Obviously, ψ h → ∞ for E t → 1. This is desired, since it captures the martensite becoming fully detwinned, which is assumed to cost additional energy.

Thermal strains
Modeling thermal strains of multi-phase materials requires attention and special treatment. In this work, the determinant of the thermal part of the deformation gradient is connected to the coefficients of thermal expansion (CTEs) by Here, ε θ (ξ, θ ) is the thermal strain. Additionally, α M and α A are the CTEs of the martensite and austenite phase, respectively. Further, θ refM and θ refA are the reference temperatures for austenite and martensite. Here, we want to note that it is necessary to distinguish the two in order to properly represent the transformation-induced volume change which is present in some shape memory alloys (see below). Example We assume a one-dimensional SMA rod with only one reference temperature. Further, we simplify the SMA model by assuming that martensite and austenite transformation occur at the distinct temperatures θ M and θ A , and not at temperature ranges (see Fig. 1). The thermal strains are then given by Furthermore, we assume that we know the shape of our rod at the thermal annealing temperature θ ref (see Fig. 1). During forward transformation, the jump of ε θ at θ M can be analytically calculated and is given by Here, we can not only see that this jump is dependent on the reference temperature, but that it can be positive or negative, depending on the reference temperature θ ref being larger or smaller than the forward transformation temperature θ M . Using two reference temperatures θ ref M and θ ref A , we can not only circumvent this problem, but also model the magnitude of this jump in compliance with, e.g., the experimental results of Potapov et al. [36], where they calculated the jump V in volume from the lattice parameters for Ni 49.8 Ti 35.2 Hf 15 to be 0.47%. However, we want to emphasize that this effect does not play a significant role in many other shape memory alloys. The dissipative behavior of the shape memory alloy is assumed to be governed by the dissipation potential Here, depending on the direction of transformation, either φ M or φ A are used as the active dissipation potential. As is commonly done for plasticity, the dissipation is assumed to be infinite if tr( D i ) = 0, which guarantees volume-preserving inelastic deformations (compare Rockafellar [40]). Additionally, we define the dissipation potential for an increasing martensite volume fraction as where M f is the finish temperature of forward transformation and σ reo is the reorientation stress. Likewise, the dissipation potential when going to a higher austenite volume fraction is defined as Here, A f is the finish temperature of reverse transformation and ε i = 1 2 ln b i . The significance of using the logarithmic inelastic strain ε i as a strain measure for the dissipation is explained below. The dissipation potential represents a geometrically nonlinear generalization of the small strain potential proposed in Sedlak et al. [42], which is based on the works of Bernardini and Pence [6], Panico and Brinson [34] and Moumni et al. [30].

Transformation/yield criteria and inelastic evolution equations
It is assumed that the evolution of C i and ξ follows from the minimization problem which is equivalent to the Legendre-Fenchel transformed problem (compare Eqs. (14), (19)) where φ * is the dual dissipation potential of φ: Due to the minimization in Eq. (27), there is a tendency that forξ < 0 we get D i − (ξ/ξ)ε i = 0 as a result of the last term in Eq. (26). In that case b i → I as ξ → 0 in time (for a proof, see Appendix A), i.e., for vanishing martensite content, the inelastic strain vanishes, which is a physical necessity. With the reparametrization where Q M and Q A are defined in Eqs. (25) and (26) and capture the transformation hysteresis due to temperature change. Further reparametrizing D i / t = λN with λ ≥ 0 and the obvious solution it follows that Thus, we find the following transformation and yield criteria in the classical Karush-Kuhn-Tucker form as well as evolution equations: It is then straightforward to prove thermodynamic consistency (see Eq. (14)). The '0'-branch in Eq. (32) implies that (see Eq. (28)) Thus, we find with Eqs. (4) and (10) the following alternative form of the energy balance:

Rate potential
To simplify the discussion, we start by the isothermal case, i.e., in a first step, the temperature θ is considered to be a given parameter. The rate potential and its time discretized form read Here, ψ n refers to the Helmholtz free energy at the previous time, t is the time step from t n to t n+1 4 and φ is the time discretized version of the dissipation potential multiplied by t, defined by with sg (•) referring to the sign function and ξ = ξ − ξ n . Additionally, III i is the third invariant of C i , i.e., The constraint III i = 1 in Eq. (37) is consistent with the requirement tr( D i ) = 0 (see Eq. (24)). Additionally, we compute the effective inelastic strain increment α as where the inelastic right stretch is U i = √ C i , r C i = C i − C i r and the shorthand notation Now, for ξ going to zero in any given step, i.e., ξ = ξ n + ξ = 0 ⇔ ξ = −ξ n → C i r = C i n 0 = I, C i r goes back to unity again. Thus, the time discrete potential (37) based on definition (40) of C i r is the key to ensure that the inelastic strain consistently disappears during the transformation from martensite to austenite. Further, the function Q(C i n , ξ n+ 1 2 , sg ( ξ )) is given by where ξ n+ 1 2 = (ξ + ξ n )/2 is the midpoint evaluation of ξ , which is employed to obtain a reasonable transformation when the material is not stressed (see Frost et al. [14] for a similar concept). Moreover, we used C i n and ξ n for the reverse transformation in Eq. (41) to circumvent the eigenvalue problem as well as its linearization in every local Newton iteration. In this way, it suffices to solve the eigenvalue problem once per time step.
Using some further shorthand notations for constant terms in Q, we obtain where H and Q summarize H M , H A , Q M0 and Q A0 , respectively. Further, det(C i r ) = det(C i n ) = 1 clearly holds when looking at the definition of C i r in Eq. (40). Additionally, the consistency of φ with the timecontinuous theory is trivially proven except for α. Therefore, we have a look at the approximation of a 1+x at x 1: as ln a x ≈ a x − 1. Hence, we can approximate C i r as We can use this result for computation of the time-discrete derivative of α: Therefore, with the time step width going to zero, we obtain which is consistent with the time-continuous Eq. (26). Furthermore, since III i = det(C i ) = 1, we get for the time derivative of III i :İ Hence, since φ is time-continuous, we can use the discretized potential to solve for C i and ξ :

Numerical strategies
The potential π , as we formulated it in Eq. (48) carries two major numerical difficulties. First, C i is constrained to be volume preserving, i.e., det(C i ) = 1, which is very important to be exactly satisfied to comply with physics and not accumulate errors. Second, π is not differentiable at r C i = 0. The strategies employed to overcome these and other numerical difficulties as well as general numerical approaches are presented in this section.

Inelastic volume preservation
To deal with the constraint in C i , we employ a strategy using a projection of C i into the space of unimodular tensors, which was introduced by Hurtado et al. [21] for crystal plasticity. Our approach here closely follows the approach presented in Sielenkämper et al. [44]. First, we express C i in terms of the unconstrained inelastic auxiliary right Cauchy-Green tensorĈ i : Thus, det(C i ) = 1 is automatically satisfied. This idea is borrowed from various formulations in hyperelasticity, where similar approaches are used to decouple volumetric and deviatoric deformations (see, e.g., Flory and Volkenstein [13] or Simo et al. [45]). Now, we replace C i by C i (Ĉ i ) in the minimization problem in Eq. (48): However, while this removes the constraint from the minimization problem, π is now invariant with respect to changes ofÎ II i , and is therefore not uniquely solvable. For this reason, we add the regularization energy to π . We want to emphasize that this has no effect on the solution. This is due to the fact thatÎ II i = B is exactly satisfied after converging to a solution, since ψ r is the only term in π that is dependent onÎ II i . Therefore, ψ r is zero at any solution of the minimization problem. Further, the constants A and B can be chosen arbitrarily. They do not have to be chosen particularly large. In this paper, we chose B = 1, but any other positive value could be chosen and would lead to the exact same results. In this special case, C i =Ĉ i is exactly satisfied once the solution algorithm is converged. However, we want to note that this equality does not hold in the not yet converged state.
The following tensor, which is used in the residuals and stiffness terms later, is also known from hyperelastic models: where I s is the fourth-order identity on symmetric second-order tensors.

Differentiability at r
The ansatz C i =Î II i− 1 3Ĉ i renders the solution C i a priori volume preserving, but it does not solve the lack of differentiability of φ for C i = C i r . However, this is achieved by the following reparametrization: whereÑ s is an unconstrained, symmetric 2nd-order tensor. The final unconstrained minimization problem now reads It is easy to show that π is invariant with respect to Ñs . In order to render the solution unique, we further modify ψ r : where the choice of A, with the same argument as in Sect. 3.1, still has no effect on the solution. Additionally, the last term with a very small and constant is explained in Sect. 3.5. Now, the minimization problem from Eq. (54) is computed by solving the stationary conditions: where in general only a subset of these three equations is involved, depending on the 'active' variables. The active set of variables is determined by the activation or yield criteria presented in the sequel (see Sects. 3.3 and 3.10).

Algorithmic yield criterion
In a given time step, C i will only change (i.e., C i = C i n ⇔ γ > 0), if this decreases (minimizes) the potential. In order to decide whether the couple ( γ ,Ñ s ) is activated for a given state (F, C i r , ξ, θ), we evaluate the algorithmic yield condition If f > 0, γ is activated, i.e., ( γ ,Ñ s ) are put into the active set. Otherwise, γ = 0 minimizes π , which corresponds to the case where f ≤ 0. If ( γ ,Ñ s ) are part of the active set, the related equations and concerning residuals are: ∂π /∂ γ ! = 0 and ∂π /∂Ñ s ! = 0. For a given F, ξ and θ , we obtain the explicit form of f by variation of π: with the shorthand notation N s =Ñ s / Ñs for an arbitraryÑ s = 0 and δγ ≥ 0. Further, taking a closer look at the variation of π , we obtain (compare Appendix B) Using the definition of the unimodular projector in Eq. (52), we obtain δπ (52) Now, since inequality (58) must hold for arbitrary symmetricÑ s , we get Thus, the algorithmic yield criterion reads However, note that for ξ = 0, and thus C i r = C i n , we have a simplified yield criterion where the superscript 'tr' denotes the trial state and R i = F i U i−1 . This result again shows the consistency with the time-continuous theory (see Eq. (33)).

Initial guess forÑ s
In general, the direction of inelastic flowÑ s is determined by the minimization of π through a Newton scheme, which requires a reasonable initial guess when γ is activated. Coincidentally, an analytical solution for γ → 0 exists forÑ s :Ñ s,tr = The proof is given in Appendix B.
It is noted that for γ = 0, it follows that C i = C i r , which is independent ofÑ s . That means that any choice ofÑ s minimizes π , i.e., there is no unique solution. For very small γ , the sensitivity of π with respect toÑ s is also small, which can lead to a bad condition of the nonlinear equation system, which needs to be solved to minimize π . To stabilize the solution process, the last term in Eq. (55) is added to π, since for γ → 0 it is known thatÑ s =Ñ s,tr minimizes π . In theory, can be chosen arbitrarily small, in practice a finite value is necessary due to the limited numerical accuracy. Thus, the last term in Eq. (55) is used to 'guide' the algorithm toward the right solution when it is no longer able to find it by itself.
3.6 Computing derivatives of α for γ → 0 The derivatives of α are numerically tough to obtain for γ → 0, due to an almost zero denominator of M C i (see Eq. (112) and preceding equations). To overcome this issue, we derived the derivatives of γ with respect to ξ ,Ñ s and γ separately for γ → 0, which are then numerically feasible to obtain. However, this seems to be a rather theoretical as we never observed the case that γ was too close to zero to obtain the derivatives using the usual way shown in Appendix C. Therefore, and for brevity, we do not discuss them here.

Regularization of ψ h
Numerically, ψ h is challenging, since with E t → 1, ψ h and ∂ψ h /∂ E t go to infinity. This is very demanding, because values of E t ≥ 1 can occur during the iterative solution process of the Newton scheme, leading to a loss of convergence. The term (compare to Eq. (21)) is illustrated in Fig. 2 in black.
To stabilize the solution process, we partially replace the function ϕ by a linear approximation in the region where E t > c, where c is close to 1 (see Fig. 2 in green). This regularization approach is adopted from crystal plasticity, where it successfully improved the numerical treatment of the power law in Wulfinghoff and Böhlke [52]. In order to prevent the solution E t from taking nonphysical solutions beyond 1, we add a penalty-type energy as illustrated in Fig. 3. We penalize all states ( E i , ξ) (see Fig. 3) below the dashed half-line starting at ξ 0 by the energy where l F is the minimum distance of the point ( E i , ξ) from the half-line and H p is a large penalty parameter. To further improve the numerical robustness, we added an artificial viscosity-like term for the martensite volume fraction. Since it is a dissipative term, we add this term to the discretized dissipation potential: where η ξ is a small, positive constant.

Thermomechanical coupling
Like the displacements, the temperature is an unknown for our model. Both fields are coupled through the thermomechanical energy and through the dissipative terms, which lead to a heating of the material. Additionally, the temperature changes the material behavior and leads to thermal strains. Both fields are coupled using a potential-based monolithic approach similar to the one presented in Yang et al. [54]. Further, we assume Fourier's heat conduction law with heat conductivity κ. The thermomechanic quasistatic time-discrete potential then reads where we neglect the influence of body forces. The integral form of the local potential π is then given by where w denotes the heat source andt as well asQ denote prescribed tractions and normal heat flux at the Neumann-type boundaries ∂ V 0t and ∂ V 0Q , respectively. Now, the classical weak form of the linear momentum balance is obtained by variation of Eq. (71) with respect to u: with d δ = sym(Grad(δu) F -1 ). Likewise, we obtain the weak form of the energy balance by variation of Eq. (71) with respect to θ : Subsequently, we find u, θ, ξ, C i by finding the solution to the saddle point problem where κ u = {u : u =ū on ∂ V 0u } is the set of admissible displacements satisfying the Dirichlet boundary conditions imposed on the boundary ∂ V 0u and κ θ = {θ : θ =θ on ∂ V 0θ } is the set of admissible temperatures satisfying the Dirichlet boundary conditions imposed on the boundary ∂ V 0θ . Additionally, z is the vector of internal variables, i.e., z = (ξ, C i ).
Manipulating Eq. (74) using the definition s = −∂ψ/∂θ as well as applying Gauss theorem, we obtain where N is the external normal on the boundary ∂ V 0 in the reference configuration. To show consistency with the time-continuous theory, we take a look at the integrand over the volume integral in Eq. (76). By multiplying with θ n / t, we get Here, we want to note that this requires κ to be independent of the temperature θ . However, one possibility is to consider κ = κ(θ n ), which could circumvent this limitation. Now, for t → 0, we get (also see Eq. (35)) Hence, we arrived at the energy balance (Eq. (35)) with Fourier's law Q = −κGrad(θ ), which proves the consistency with the time-continuous theory. Further, the surface integrals in Eq. (76) imply the Neumann boundary conditionQ = −κGrad(θ ) · N on ∂ V 0Q . Alternatively, one can include Robin-type boundary conditions into the model by replacing the integrand tQθ/θ n in Eq. (72) by 1/2 th(θ − θ s ) 2 /θ n , where h is the heat convection coefficient and θ s is the temperature of the surrounding medium. In that case, the variation of yields the boundary condition −κGrad(θ ) · N = h(θ − θ s ) on ∂ V 0Q .

Active set search
When solving the set of Eq. (56), one has to decide which variables will evolve using the activation criteria given in Eqs. (64) and (66). They will then be put into the active array of variables A. If then, at a later state, an activation criterion is inactive, the variable is taken from A again. This is done using the active set search algorithm outlined in Algorithm 1. The algorithm is structured as follows: First, if A = ∅ or no variable from A was rescaled (see the end of this subsection for an explanation of rescaling) during the last iteration, we need to evaluate the activation criteria. With them at hand, we determine our new set A j+1 . The details for determining which variable to activate are given in Algorithm 2. Subsequently, we check if the active set changed from the last iteration. If this is not the case, the solution from the last iteration is confirmed as solution of the equation system. In that case, we compute the algorithmic tangent (see Appendix E for details), save the history variables and exit the material routine. Otherwise, we solve the minimization problem for the current active set A j+1 . This is done Algorithm 1 Active set algorithm solving the minimization problem in Eq. (56). using a Newton scheme. First, we compute our residual and stiffness concerning active set A j+1 , i.e., only derivatives with respect to the active set are computed. If the maximum iterations are exceeded or the norm of the residual is lower than the tolerance, we exit the loop. Otherwise, we check if updating ξ j+1 would result in a negative ξ when applying the Newton step. If this is the case, we rescale, i.e., multiply the increments by a scalar such that ξ k+1 is set to ξ hard min , which is 5 × 10 −5 in this work. Subsequently, we compute ξ k+1 , γ j+1 and Ñs j+1 . In fact, we do not update ξ j+1 , γ j+1 orÑ s j+1 yet. We update them only after exiting the Newton scheme and being sure that ξ incremented in a direction matching g A and g M , which were calculated before the Newton loop. If, due to a bad starting solution, this is not the case, we rescale the Newton step such that ξ j+1 = ξ n . If this happens, we make sure that in the next iteration of the active set algorithm, we won't activate ξ again.

Numerical results
In this section, the previously presented model is tested for different examples. First, to test the model's time convergence behavior and to show the superelastic as well as martensite reorientation behavior at high and low temperatures, respectively, thermomechanical Gauss point evaluations are conducted. Finally, a full actuator model is simulated. To cope with the thin structures occurring in the actuator, the SMA model at hand is embedded into a hexahedral element formulation with reduced integration and hourglass stabilization for the displacement, while the heat conduction terms are fully integrated.

Gauss point evaluations
For the Gauss point evaluations, thermal expansion as well as the transformation induced volume change, which is discussed in Sect. 2.3.4, are neglected for simplicity. The simulations are conducted using a reduced integration hexahedral element with hourglass stabilization, which is embedded into the finite element program FEAP [46]. The material constants used in the Gauss-point are given in Table 1.
Here, it is noted that θ refA and θ refA are chosen such that V = 0.47% (see Eq. (23)), matching the findings reported in Potapov et al. [36]. The numerical parameters are summarized in Table 2. Figure 4 shows a tensile test at a temperature of 270 • C for 49 and 40,000 time steps. Clearly, convergence with regard to time step width is not an issue, as both time step widths yield accurate results.
To demonstrate that the model results indeed do not depend on the numerical parameters, we compare results for two different sets of numerical parameters in a tensile test. The first set of numerical parameters is the one given in Table 2, which is used in the remainder of this paper. The second set is defined in Table 3 just    Fig. 5. Clearly, the numerical parameters do not have a noticeable effect on the results. Next, we modeled tensile tests at 50 • C, 100 • C, 160 • C and 200 • C. The resulting stress-strain curves are shown in Fig. 6.
For low temperatures, i.e. 50 • C and 100 • C, the model predicts a remaining martensite reorientation. On the other hand, for rather high temperatures, the model captures superelastic material behavior when unloading. However, we want to emphasize that these results do not consider any damage or plastic deformations which might already occur. Finally, the model captures the shape memory effect well. This is shown in Fig. 7, where we applied different stresses (160 MPa, 300 MPa, 400 MPa and 500 MPa) and then started a thermal cycle.

Fig. 5
Investigation of the numerical parameters influence (set 1 is given in Table 2, set 2 in Table 3) For 160 MPa, one obtains only twinned martensite, i.e., E t = 0 in the context of this work. Therefore, one only gets a small hysteresis due to the different elastic constants of austenite and martensite. For the larger prestresses, the austenite is transformed into detwinned martensite, i.e., E t = 0. This leads to the typical shape memory effect.

Plate with a hole
In this example, we simulate a plate with a cylindrical hole (see Fig. 8), which is at first loaded by a traction t, then unloaded and subsequently heated, which lets it recover the initial shape.
The material and numerical parameters are unchanged from Sect. 4.1, except the reference temperatures are now θ refA = 80 • C and θ refM = 87.66 • C. Due to the symmetry, using appropriate symmetry conditions, only one-eighth of the entire plate is simulated.
During the loading by the traction, the temperature at the upper and lower end is held constant at the initial temperature ofθ = 80 • C (M s <θ < A s ). Additionally, in the initial state, the plate is fully austenite. First, the tractiont is increased linearly to a maximum of 330 MPa in longitudinal direction over the duration of 50s. Subsequently, the tractiont is decreased to zero over the duration of 50s. Finally, the temperature at both ends of the plate is increased to 200 • C over the course of 20s.
The mesh used in the eighth of the plate consists of 2907 uniformly distributed elements and is shown in Fig. 9.
For the entire simulation, 293 time steps were required, taking a total CPU-time of 2524s. At first, during the loading, almost the entire plate is transformed to martensite. The plate at maximum traction is shown in Fig. 10a. Here, the E x x strain at the edge of the hole reaches roughly up to 18% (see Fig. 11).
Additionally, the plate heats up slightly due to the latent heat and mechanical dissipation (Fig. 10a). However, most of the heat is conducted out of the plate at the temperature Dirichlet boundaries due to the small size of the plate and the long simulated time 5 . Subsequently, after unloading, the plate does not transform back, which is shown in Fig 10b. Finally, when increasing the temperature at both ends, the martensite is transformed back to austenite, leading to a recovery of the initial shape (see Fig. 10c). Further, Fig. 12 shows the temperature and displacement of point A (see Fig. 8) over time. Here, the increase in slope of the displacement at roughly 15s is the point where we have a forward transformation in the vicinity of the hole. Then, the second change in slope of the displacement at roughly 36s is the rest of the plate starting to undergo forward transformation. At 50s, when releasing the traction, the displacements decrease. Then again, due to thermal expansion, the displacement increases slightly after increasing the temperature at the ends at 100s. However, when the temperature reaches the backward transformation temperatures, the material almost Overall, the examples have been chosen such that the stresses in the vicinity of the hole reach unphysical values, at which a real material is expected to show irrecoverable strains or even fracture. Thus, this simulation rather serves as an example showing that the implementation is able to find a solution, even under severe loads.

Finite element actuator model
In this section, we model a bistable shape memory microactuator using in the finite element program FEAP [46]. The actuator concept was published by Winzek et al. [50] for large structures and is built with three layers: a bottom layer of molybdenum, a middle layer of NiTiHf and a top layer of polymethyl methacrylate (PMMA) (Winzek et al. [50] used PMMA layers on both sides). For simplicity, a hexahedral element formulation with reduced integration and hourglass stabilization for the displacements is used for all materials. The material and numerical parameters for the shape memory alloy remain unchanged (see Tables 1 and 2).

Actuating principle
The actuator works through interlacing of the large hysteresis of NiTiHf with the polymers hysteresis. This is shown in Fig. 13.
It is actuated by joule heating to specific temperatures and cooling back to room temperature. Depending on the heat cycle, the martensite state or austenite state is held in place by the polymer. This is understood best when looking at the following example, which is depicted in Fig. 14, where we neglected the CTE of PMMA   for simplicity. Here, we compare a bimorph of SMA and molybdenum on the left to the proposed actuator on the right. When heating from ambient temperature (blue, Fig. 14a) to a temperature above A f (red), the glass transition temperature is reached, which drastically decreases the stiffness of the PMMA. Subsequently, the SMA in the bimorph as well as the bistable actuator reach reverse transformation temperatures, which makes them bend up (Fig. 14b). Now, upon cooling, the actuator reaches θ g , which 'freezes' it's current shape. Thus, when subsequently reaching forward transformation temperatures, unlike the bimorph, it will not revert into the original shape at room temperature (Fig. 14c). Now, we can heat the actuator to a temperature above θ g but below A s (orange in Fig. 14d) to soften the polymer, which makes the actuator adopt the deformation of the bimorph. When now cooling back to room temperature again, the actuator is in it's initial stable state again and the actuation cycle could be started again. The two stable states at room temperature are given by Figs. 14c, e. For the actuation behavior, it is important to have a homogeneous temperature profile in the actuator. Therefore, a wing-shape actuator developed in Arivanandhan et al. [2] is used to obtain rather homogeneous temperatures in the double beam cantilever.
To rapidly optimize the actuators properties, it turned out that building and modeling only the beam section of the actuator is advantageous. Therefore, we simulated just the beam part with suitable boundary conditions instead of simulating the entire wing structure including the attached wafer material. This thin film actuator with a zoom into the layers is shown in Fig. 15, where the Dirichlet boundary condition for the displacements and the Robin boundary condition for the temperature are indicated as well.

Polymer and molybdenum model
Since the interest in modeling the actuator lies rather in the states at θ 1 , θ 2 and θ 3 , and less in the states between them, we chose a very simple polymer model. It is governed by a thermally coupled viscoelastic Maxwell model for finite strains (Young's modulus and Poisson's ratio are 500 MPa and 0.4, respectively), where the viscosity is high (10 7 MPas) at low temperatures and low (1 MPas) at high temperatures. Therefore, it has almost no stiffness at high temperatures while it behaves almost elastically at low ones. Additionally, the glass transition temperature is given by θ g = 77 • C. The molybdenum is modeled with a thermally coupled Neo-Hookean elastic model. For the molybdenum, Young's modulus, Poisson's ratio, the thermal expansion coefficient as well as the reference temperature are E = 65 × 10 3 MPa, ν = 0.31, α = 5 × 10 −6 K −1 and θ ref = 500 • C, respectively.

Mesh convergence
The actuator considered is shown in Fig. 15. It is clamped on the left side. It has a length of 10 mm, a width of 5 mm. The layer thicknesses of the molybdenum and TiNiHf are for this work 20 um and 10 um, respectively. The polymer layer thickness is 160 um for now, before different layer thicknesses are compared in Sect. 4.3.4. For the initial conditions, we assume zero displacements as well as a temperature of 500 • C, i.e., the temperature at which the actuator is thermally annealed in a flat state. Furthermore, we assume the material to be in its austenite state at t = 0, which directly implies that initially C i = I. Subsequently, it is cooled down to room temperature at 20 • C, which bends the actuator due to the mismatch in the coefficients of thermal expansions and difference in cell volume between the austenite and martensite phase.
The heating cycle is realized through applying a heat source in the Mo and NiTiHf material. The heat source magnitude is modeled by a sine function, which is cut off when below 0. It is given by for the higher and lower heat cycles, respectively.
At the top and at the bottom face of the actuator a Robin boundary condition is applied, which cools the thin film to room temperature over time. Due to their small areas, the heat transfer at the lateral faces is neglected. The surrounding air's temperature is 20 • C while the convective heat transfer coefficient is assumed to be 70 W m −2 K −1 in accordance with Kohl et al. [23].
For these thin structures, a sufficiently fine mesh is crucial to obtain converged results. Therefore, we conducted convergence studies with regard to the necessary elements in each layer and the amount of elements needed over the length and width of the actuator. First, it turned out that due to the bending deformation of the thin film, it is sufficient to only use one element over the width of the actuator. Then, the convergence with respect to the amount of elements over the length is tested using four elements over the thickness for each material layer. The results are shown in Fig. 16, where the stroke and temperature of the SMA at the tip is plotted over time. Here, only the temperature for 20 elements over the length is shown, since there was virtually no difference in temperature for the different discretizations. Further, we concluded that using 20 elements over the length leads to an acceptable error in stroke while the main features of the actuator are conveyed well. Additionally, roughly 160 time steps were used for one actuation cycle, e.g., from 0 s to 180 s in Fig. 16.
Next, the convergence with regards to the elements used over the depth for each material was studied. The results are depicted in Fig. 17.
Here, for the number of elements over the SMA thickness, one element is already enough to obtain results with an acceptable error. For the molybdenum and polymer layer, one needs at least two elements to obtain converged results.
To illustrate the importance of modeling the V effect, Fig. 18 shows the stroke and temperature over time for V being 0 and 0.49%. Here, one can see that the volume change contributes to large parts of the achieved stroke between the two stable states at room temperature.

Influence of the polymer thickness
The polymer layer thickness plays a key role for this actuator design-if it is too thin, the high temperature shape cannot be held at room temperature by the polymer layer. On the other hand, it should not be too thick, since body forces and manufacturing problems come up in that case. Further, since the polymer insulates   the actuator thermally to one side, thinner polymer layers lead to the possibility of faster actuation cycles. Therefore, we tested several polymer layer thicknesses, the results are shown in Fig. 19.
First, at 1 (see Fig. 19), the actuator is heated. At first, the larger CTE of the polymer makes the actuator bend down. Then, when reaching θ g , the polymer softens up and the actuator relaxes. Subsequently, at 2 , the actuator is cooled down to room temperature by the surrounding air. The polymer hardens again and the device reaches stable configuration A at room temperature. Now, when heating again, the actuator bends down due to the large CTE of the PMMA at 3 . As soon as the temperature reaches θ g , the polymer softens up. Afterward, at 4 the shape memory alloy reaches A f and the martensite is transformed back to austenite, which makes the actuator deflect up. Finally, when we remove the source at 5 , the polymer hardens before M f is reached. Thus, depending on the polymer thickness, the polymer may hold the actuators shape or release some of it's stroke. Finally, the actuator reaches it's second stable state B at room temperature, which is also it's initial state. For many actuator designs, one wants to maximize the achievable stroke while keeping the device as small and power efficient as possible. Therefore, we try to maximize the difference in stroke between states A and B . In turn, one must choose a sufficient polymer thickness, such that the actuator does not release too much stroke at 5 . For example, 0.1mm of PMMA is not thick enough to hold the shape, while 0.3 mm holds the shape perfectly (see Fig. 19 at 5 ). Depending on the actuation frequency in mind and necessary achievable stroke, either value in between might be a suitable choice.

Summary and outlook
In this paper, a new thermomechanical shape memory alloy model for finite strains is presented. It uses a projection method to fulfill the incompressibility constraint for the inelastic stretches. Further, the model is realized in the generalized standard material formulation being extended to thermomechanics. The optimization of the global, incremental mixed thermomechanical potential by variation yields the mechanical balance principles as well as the evolution equations of the internal variables and boundary condition integrals. The presented model employs a thermal strain formulation for the shape memory alloy which allows to describe the transformation induced volume changes found in some shape memory alloys. Using a logarithmic strain formulation, a finite strain dissipation potential incorporates vanishing inelastic strains upon austenite transformation in a manner consistent with the time-continuous case. Additionally, yield and transformation criteria as well as the algorithmically consistent tangent for the coupled problem are given and discussed. Due to numerical difficulties, a regularization of the hardening energy term is implemented.
The numerical results show, that the model is capable of producing reasonable results in tensile tests. The dependency on the temperature at which the tensile tests are carried out is captured accurately. Furthermore, it enables the solution of thin film problems.
Using the model, we are now able to estimate viable layer thicknesses and device sizes as well as fitting joule heating parameters. Additionally, it turned out that only applying polymer to one side increases the maximum actuation frequency while also introducing more deflection between the stable states and requiring more power. Moreover, we found that the V effect as well as the thermal expansion of the SMA, due to the design being very sensitive to volume changes, is fairly important for the actuators stroke and therefore needs to be modeled accurately as well. Furthermore, the model results show that the bistability at room temperature, which is enabled by the polymer locking in the deformation, is achievable for the shown actuator concept.
In the future, it remains interesting to find ways of solving the model problem without any penalty terms, which are not reasoned physically and are numerically challenging to deal with. A possible solution to this problem could be to make use of the Fischer-Burmeister [12] complementary function. In fact, this has been done in Auricchio et al. [4] for shape memory alloys and in Brepols et al. [7] for damage-plasticity with great success. Furthermore, there is still space for improvements to the algorithms convergence behavior which could increase the speed of the proposed material model, especially under severe loading conditions. Additionally, an inclusion of functional fatigue properties, e.g., a shifting of the TiNiHf transformation temperatures over several transformation cycles into the model would enable to predict the long-term behavior of the actuators.

A Discussion of the dissipation potential φ
In this section, we have a close look at the term from Eq. (26). If this term is zero, While this only shows that b i goes to I for one special ξ(t), one could easily rescale the time to obtain arbitrary ξ(t), for which b i → I with ξ → 0.
B Proof ofÑ s,tr minimizing π for γ → 0 Proof We are looking forÑ s minimizing π for γ → 0: where the variation is done exclusively with respect toÑ s . Here, the dissipation potential φ is neglected because Since π is constructed such that it is minimized byÎ II i = 1, we constrainÑ s to be in the subspace of symmetric 2nd-order tensors satisfying det(Ĉ i ) = det(C i r + 2 γ N s ) = 1 (i.e., we only search this subspace for the minimizer). Hence,Î II i = 1 and which is independent ofÑ s . With the definition of the tensor Eq. (81) becomes Since N i is normalized (i.e., its norm is one), the solution reads which leads toÑ s,tr in Eq. (67).

C First derivatives of π
To minimize the potential using the active set algorithm including the Newton scheme, we need to obtain first and second-order derivatives of the potential with respect to the set of internal variables and temperature. The differential of the discretized potential with respect to the internal variables (u and θ are fixed) is given by dπ (53), (59) = + ∂(ψ e + ψ h + ψ r + ψ p ) where, for simplicity, the factor θ/θ n is neglected in front of σ reo , Q and H . Here, the tensor P N s is given by (see Eq. (59)) Further, the derivative of C i r with respect to ξ is (see Eq. (40)) where H − ( ξ ) is the double negative Heaviside function: H − ( ξ ) = −H (− ξ ). The remaining occurring derivatives are given by ∂ψ c ∂ξ (20) ∂ψ r ∂Ñ s (59), (55) = A Ñs Here, a 1 is the slope of the linear approximation in Fig. 2 at E t = ck and a 0 is the value of the function ϕ at that same point. Further, the terms related to ψ p are set to zero if ξ > E i /c + ξ 0 .

D Second derivatives of π
The second derivatives of the potential π are necessary to compute the algorithmic tangent. They are briefly given in this appendix. First, we define the symmetrizing box product as where A and B are arbitrary second-order tensors. Now, we compute the differential of the operator P N s : where B is an arbitrary symmetric second-order tensor and Here, B is defined as Similarly, we define the derivative of P i via Subsequently, we find for the derivatives of π : Here, for the occurring derivatives we find: ∂ 2 ψ e ∂ξ ∂ C i = −B e i : : ∂ 2 α ∂Ñ s 2 = 2 3 2 γ D N s P i :

E Consistent tangent
When using the finite element method to solve boundary value problems, we make use of the consistent tangent operator. To derive it with respect to d d from the potential, we start with the virtual work of the internal forces Here, d δ is defined as Now, we need to calculate the differential of virtual work of the internal forces: where through using the symmetry of τ we can simplify to where the first term, i.e., l d τ l T δ is related to the so-called geometric tangent and the second term arises from the incremental material stiffness. Further, l d and d d are defined in analogy to l δ and d δ . Here, we neglected, in a first step, the influence of perturbations dθ of the temperature (for the full linearization, see below). Now, we can split a into the elastic and inelastic parts: where (with a small abuse of notation) A denotes the vector of active variables and may contain ξ , γ and N s . Additionally, the elastic tangent e is given by e = (λ(ξ )det(b e ))I ⊗ I + (2μ(ξ ) − λ(ξ(det(b e ) − 1)))I s .
Next, we need to obtain ∂A ∂ E . Therefore, we take a look at the local Newton scheme, where Rearranging the result from Eq. (163) we obtain Additionally, the algorithmic consistent tangent with respect to θ as well as the mixed contributions can be computed through linearizing the weak form in Eq. (74): Here, the derivative ∂τ/∂θ is given by Now, using standard formulations for the element ansatz functions N and its gradients B, one can assemble the element stiffness matrix (see, e.g., Wriggers [51]).