Augmented Lagrangian Method for Constrained Nuclear Density Functional Theory

The augmented Lagrangiam method (ALM), widely used in quantum chemistry constrained optimization problems, is applied in the context of the nuclear Density Functional Theory (DFT) in the self-consistent constrained Skyrme Hartree-Fock-Bogoliubov (CHFB) variant. The ALM allows precise calculations of multidimensional energy surfaces in the space of collective coordinates that are needed to, e.g., determine fission pathways and saddle points; it improves accuracy of computed derivatives with respect to collective variables that are used to determine collective inertia; and is well adapted to supercomputer applications.


Introduction
The HFB equations of the superconducting nuclear DFT [1] can be viewed as a constrained nonlinear optimization problem in which the total energy of the nucleus, represented by a functional of one-body densities, is minimized subject to constraints on the values of several independent variables. In addition to the usually imposed conditions on the number of particles (protons and neutrons), one is often interested in constraining angular momentum components (to study nuclear behavior at nonzero angular momentum) or nuclear multipole moments (or deformations) -to investigate the large amplitude collective motion, such as shape coexistence, fission or fusion.
Constrained calculations are also used when going beyond the standard single-reference DFT, e.g., within the Generator Coordinate Method [2] of the multi-reference DFT [3,4], where the constrained HFB solutions are used to generate a set of basis wave functions employed in further optimization. Another set of applications concerns the adiabatic approximation to the time-dependent HFB (ATDHFB) [2,5,6,7,8,9] wherein derivatives with respect to collective coordinates are often approximated by finitedifference expressions [10].
The fission problem is of particular interest as it involves many constrained calculations along collective degrees of freedom representing families of mean fields characterizing fission pathways and nuclear dynamics during the fission process. In particular, care should be taken to identify saddle points in a multidimensional energy surface [11,12,13]. In this respect, constrained calculations in many variables can be very helpful as they can separate potential energy sheets that overlap when studied in reduced-deformation spaces [14].
An effective approach to satisfy constraints is the method of Lagrange multipliers [15]. For example, the minimization of energy E at the condition that the nuclear quadrupole momentQ has an expectation value q 0 , is equivalent to minimization of the Lagrangian function (or Routhian) In many cases, however, the procedure based on a linear constraint method (LCM) fails and the standard technique adopted is the method of quadratic constraint (the quadratic penalty method, QPM) [16,17,18]. In the above example, the corresponding Lagrangian function reads E ′ = E + c( Q − q 0 ) 2 . As noted in early nuclear self-consistent applications [19,20], results of calculations based on QPM strongly depend on the magnitude of c. For example, when the value of c is too small, one ends up with a solution having the constrained moment quite far away from the requested value. Increasing the value of c is often impossible as the self-consistent procedure ceases to converge. This is a serious deficiency of the method as it leaves important domains of the collective space unre-solved thus obstructing (or even preventing) the theoretical description.
An effective procedure that avoids some of the difficulties pertaining to the standard LCM but not introducing the penalty term is the method proposed in [21] and used in early CHF calculations of Refs. [22,23], and in CHFB calculations of Refs. [24,25], in which the λ is changed iteratively to satisfy the condition (1) at each step. Also, the constrained optimization problem is often treated by means of the conjugate gradient method [26]. However, except for a few cases [27], most of the existing HFB solvers are based on a direct diagonalization approach and a mixing of intermediate solutions during the iteration process using a linear or Broyden mixing [28].
In this study, we demonstrate that the augmented Lagrangian method [29,30] is an excellent alternative for nuclear-constrained HFB calculations. We show that the method always yields self-consistent solutions corresponding to requested values of constraints, independently of the value of the Lagrange multiplier selected. In this way, adopting the ALM, one can always access any region of the multi dimensional energy surface requested by the particular physical phenomena investigated. At the same time, practical implementations of ALM do not require more computational resources as compared to QPM. A procedure, based on QPM but introducing a modifications of q 0 during the iterations through a linear constraint has been used in [31]. While not based on the ALM algoritm, the spirit of this method is close to ALM. This paper is organized as follows. The method of Lagrange multipliers, in its linear and quadratic variants, is briefly discussed in Sec. 2 together with the augmented Lagrangian method. The ALM algorithm adopted in this work for diagonalization-based HFB solvers is laid out in Sec. 3, and the illustrative results are given in Sec. 4. Finally, conclusions are contained in Sec. 5.

The Method of Lagrange Multipliers
A constrained optimization problem is usually specified in terms of equality and inequality constraints [16,17,18]. We consider here a finite-dimensional, equality-constrained nonlinear optimization problem (ECP) of the form where we assume that E : R n → R (an objective function) and g i : R n → R (the constraint functions) are smooth functions, and n > m. The Lagrangian function E ′ : R n+m → R associated with ECP is defined as where λ={λ i } is the vector of Lagrange multipliers.
The following set of necessary and sufficient conditions allow the problem (2) to be formulated in terms of the Lagrangian function (3): -The first-order necessary (local zero-slope) condition: if x * ∈ R n is a local solution of ECP (2), then there exists a unique vector λ * ∈ R m such that (x * , λ * ) is a stationary point for the Lagrangian function -The second-order necessary (local convexity) condition: if the functions E(x) and g i (x) are twice continuously differentiable, then the Hessian matrix of the Lagrangian function (with respect to x) must be positive semidefinite at (x * , λ * ) In this context it should be mentioned, that Eq. (5) is the necessary and sufficient condition for convexity of the E ′ (·, λ * ) function on its convex domain. -The second-order sufficient conditions: assume that E(x) and g i (x) are twice continuously differentiable, and let x * ∈ R n and λ * ∈ R m satisfy the equations and the Hessian matrix then the vector x * is a strict local solution of ECP (2).
In reality, these conditions are not always satisfied. For example, even when Eq. (2) has a solution, the Lagrangian function E ′ could be unbounded [32,33] since the Hessian of the Lagrangian function is not necessarily positively defined. However, it can be shown that if ECP is convex, i.e., E(x) is a convex (∪-shaped) function and g(x) is an affine function, then the local constrained minimum is unique and represents the global minimum.
Suppose that x(q) and λ(q) are continuously differentiable functions giving the local minima of a family of ECPs (2) parameterized by q ∈ R m , and x(q 0 ) = x * , λ(q 0 ) = λ * . This implies [17] that where the primal function E(q) is given by The interpretation of the Lagrangian multipliers is therefore that −λ(q) defines the parametrical gradient of E(q).
When E(x) is the convex function, then a primal function E(q) turns out to be a convex function as well [34].
The following sections illustrate the method of Lagrangian multipliers for a one-dimensional problem. We first discuss the linear constraint method, then quadratic penalty method, and augmented Lagrangian method.

The linear constraint method
The curve E(q) in Fig. 1(a) represents schematically how the primal function might vary as a function of q. It should be noted that we analyze the primal function E(q), not the objective function E(x). To solve the optimization problem, one can draw the tangent to the function E(q) at the point q = q 0 , and use this line as the abscissa axis of a new coordinate system rotated by the angle α 0 with tan α 0 = −λ 0 = dE dq (q 0 ), see Eq. (8). The ordinate axis in the new frame corresponds to the Lagrangian function E ′ (q, λ 0 ) = E(q 0 )−λ 0 (q−q 0 ). An unconstrained minimization of E ′ gives the local minimum in the rotated system at the requested point q 0 . In this way, the constrained minimization of E(q) is achieved by an unconstrained minimization of E ′ (q, λ 0 ). However, as discussed above, the constrained minimization procedure with LCM can be applied only in the regions of E(q) which are convex. In the concave (∩shaped) region, shaded in Fig. 1(a), the function E ′ in the rotated frame has a maximum at point q 1 . The minimization procedure does not yield a stable solution around the maximum, and the constrained calculation with a linear constraint function fails to converge in the whole shaded region, see discussion in Refs. [19,20,21] and in Sec. 7.6 of Ref. [2].

The quadratic constraint approach
The local convexity assumption (5) plays a crucial part in solving the constrained problem (2). The QPM can be applied when the convexity of original ECP is not preserved. In such cases, one approximates the original constrained problem by an unconstrained minimization problem that involves a penalty for violation of the constraints, see Refs. [16,17,18], and also Refs. [2,19,20]: where c > 0 is called penalty parameter and · denotes Euclidean norm. It should be noted that if c is taken sufficiently large, then the local convexity condition can be shown to hold for the Lagrangian function E ′ c (x). Figure 1(b) shows the same one-dimensional case as in Fig. 1(a), but for the QPM. The primal function E(q) is plotted with a solid (black) line, while the penalty function is plotted with dashed lines around the requested point q 0 for two different values c 1 and c 2 of the penalty parameter c. The resulting Lagrangian functions E ′ c (q) = E(q)+c(q− q 0 ) 2 are indicated.
It is immediately seen in Fig. 1(b) that the minimum of the Lagrangian function does not correspond to q 0 but rather to the values q 1 and q 2 corresponding to the penalty parameter c 1 and c 2 , respectively. One can obtain the values of the function E(q) in a broad range by changing the requested point q 0 or the penalty parameter c, or both. But one can neither predict in advance which value q will be reached, nor to produce a regular mesh of values, which is often of interest.
We thus see that the main drawback of the QPM is that it never delivers exactly the requested constraint values. If one denotes the solution to unconstrained problem (10) by x * (c) (i.e., E ′ c (x * (c)) ≈ E(x * ) = E(q 0 )), then it has been shown [17] that lim c→∞ x * (c) = x * . However, the Hessian matrix H(E ′ c )(x) is ill-defined for large values c. One is therefore forced to make a compromise between satisfying the constraints and having a well-conditioned problem when using QPM [17,35].

The augmented Lagrangian method
The ALM can be viewed as a combination of LCM and QPM. It was introduced as a computational tool in 1969, by Hestenes [29] and Powell [30], as an attempt to solve the difficulties of linear and quadratic constraint methods.
Let us begin by introducing the augmented Lagrangian function (11) which is the Lagrangian function for the problem: that has the same local minima as original ECP (2). The If λ k is a good approximation to the solution λ * , then it is possible to approach the optimum x * through the unconstrained minimization of E ′ c k ( ·, λ k ) without using large values of the corresponding penalty constant c k . The only condition is that c k is sufficiently large to ensure that the augmented Lagrangian function E ′ c k is locally convex with respect to x.
It has been shown in Refs. [29,30] that the use of the iterative Lagrange multipliers leads to x k which minimize E ′ c k ( ·, λ k ). Figure 1(c) provides a geometric interpretation of the iteration (15). To understand this figure, note that if x k minimizes E ′ c k ( ·, λ k ), then the vector q k = g(x k ) mini- and One can see in Fig. 1(c) that if λ k is sufficiently close to λ * and/or c k is sufficiently large, the next multiplier λ k+1 will be closer to λ * than λ k (see also Ch. 4 of [17] and Ch. 17 of [18]). The general iterative algorithm, including an adjustment of c k , can be found in Nocedal and Wright [18].

The ALM Algorithm
This section provides guidance on how to implement the ALM given a working QPM algorithm (which is the standard way of carrying out constrained minimization with HFB solvers based on diagonalization iterations).
Let us consider, for simplicity, a solver which defines the energy of the system E(q), and q is the expectation value of the (quadruple) operatorQ, i.e., q = Q . If one wishes to compute E(q 0 ) for a nuclear state with a requested quadruple deformation q 0 , the minimized quantity in QPM is Taking the variational derivative of (18), the resulting mean-field potential can be written as In our study the penalty parameter c we keep fixed. With an appropriate value of c, the self-consistent procedure yields a solution with quadrupole deformation, which is close to the requested value q 0 . Let us now apply the ALM. By taking the variational derivative of the resulting mean-field potential becomes where q 0 (λ) = q 0 − λ/2c. Comparing Eqs. (19) and (21), we see that the use of ALM practically does not need any change in the part of the solver that deals with constrains. One simply needs to substitute q 0 → q 0 (λ). The new information that needs to be supplied is the way λ is updated during the iteration process. According to Eq. (15), the value λ k+1 depends on the previous value λ k as: The iterations can start from a zero value, λ 0 = 0. This is a well-defined starting point as for λ = 0, ALM reduces itself to QPM for which one assumes the solver is already working. Comparing Eqs. (19) and (21), one can see that in ALM the originally requested value q 0 is replaced by an effective value q 0 (λ) which is adjusted during the iteration process according to Eq. (22). At the end of iterations, of course, one ends up with the originally requested value q 0 . The generalization of the ALM algorithm to the case of many constraint variables is straightforward.

Results
The ALM algorithm of Sec. 3 has been implemented and tested with the HFB solvers HFODD v.2.43c [36] and HF-BTHO [37]. The illustrative examples of calculations presented below concern the spontaneous fission of 252 Fm. We used the solver HFODD, which is capable of treating simultaneously all the possible collective degrees of freedom that might appear on the way to fission.
In the particle-hole channel, we employed the SkM * energy density functional [38]. In the pairing channel, we adopted a seniority pairing force with the strength parameters fitted to reproduce the experimental gaps in 252 Fm [39]. The single-particle basis consisted of the lowest 1,140 stretched states originating from the lowest 31 major oscillator shells. The details of the calculations can be found in Ref. [14]. We wish to remark only that special care should be taken when combining ALM with the Broyden mixing [28]. Figure 2 shows the results of constrained DFT calculations on a two-dimensional Cartesian grid of quadrupole (Q 20 ) and octupole (Q 30 ) moments. The requested values of constrained moments correspond to the grid points. While the ALM yields these values very precisely, the standard QPM yields multipole moments that are often very different from the requested ones. In particular, QPM is unable to cover the whole area of interest.

Conclusions
The augmented Lagrangiam method to solve a constrained nonlinear problem of CHFB has been compared with the standard variants of the method of Lagrange multipliers, i.e., quadratic penalty method and linear constraint method. We discuss the numerical strategy beyond QPM and ALM algorithms and show how to implement ALM in HFB solvers based on the diagonalization approach.
Compared to QPM, we find ALM to be superior: it enables precise constrained calculations in many dimensions thus uncovering regions of collective space that are not accessible with the standard method. The method is well adapted to supercomputer applications and its stability makes it a tool of choice for large-scale CHFB calculations, such as computations of multidimensional fission pathways discussed in this work.