Dynamic Programming for Finite Ensembles of Nanomagnetic Particles

We use optimal control via a distributed exterior field to steer the dynamics of an ensemble of N interacting ferromagnetic particles which are immersed into a heat bath by minimizing a quadratic functional. Using the dynamic programming principle, we show the existence of a unique strong solution of the optimal control problem. By the Hopf–Cole transformation, the associated Hamilton–Jacobi–Bellman equation of the dynamic programming principle may be re-cast into a linear PDE on the manifold M=(S2)N\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\mathcal {M}} = ({\mathbb {S}}^{2})^{N}$$\end{document}, whose classical solution may be represented via Feynman–Kac formula. We use this probabilistic representation for Monte-Carlo simulations to illustrate optimal switching dynamics.


Introduction
We control a ferromagnetic N -spin system which is exposed to thermal fluctuations via an exterior forcing u := (u 1 , . . . , u N ) : [0, T ] × → (R 3 ) N . A relevant application includes data storage devices, for which it is crucial to control the dynamics of the magnetization m := (m 1 , . . . , m N ) : [0, T ] × → (S 2 ) N in order to ensure a reliable transport of data which are represented by those magnetic structures. Besides the external force H ext,i (u i ) = C ext u i , the i-th spin of the ensemble with magnetization m i is exposed to the different forces H ani,i (m i ), H d,i (m i ), and H exch,i (m): • the anisotropic force H ani,i (m i ) = Am i for some A ∈ R 3×3 diag , which favors alignment of m i with the given material dependent easy axis e ∈ R 3 ,  (m, u). The Stratonovich type of the stochastic integral of the diffusion term ensures that each state process m i takes values in S 2 . We refer to [3,4] for further details on the model. Our aim is to find a control process u * such that the solution process m * from (1.2) approximates a given deterministic profile m ≡ ( m 1 , . . . , m N ) ∈ L 2 0, T ; (S 2 ) N . More precisely, we aim to solve the following problem. Problem 1.1 Let the parameters δ, ν, α ≥ 0, and T , λ > 0, N ∈ N as well as h ∈ C 2 (S 2 ) N ; R be given. Let , P, F , {F t } 0≤t≤T be a given stochastic basis with the usual conditions, and W be a {F t } 0≤t≤T -adapted (R 3  subject to (1.2). We call such a minimizer a strong solution of the optimally controlled Landau-Lifschitz-Gilbert equation.
The problem (1.2) with u = 0 has been studied in [3]. In the case of a finite ensemble of nanomagnetic particles, the deterministic optimal control problem has been studied in [1,2] we also refer to [5] for a deterministic control problem for infinitely many ferromagnetic particles (a PDE). In [7], some of the present authors have studied Problem 1.1 in its weak form and constructed a weak optimal solution * := * , P * , F * , {F * t } 0≤t≤T , m * , u * , W * of the underlying problem via the Young measure (relaxed control) approach for a compact control set U ⊂ (R 3 ) N such that 0 ∈ U, which may be generalized to the case U = (R 3 ) N thanks to the coercivity of the cost functional with respect to u; see also [6] for an extension to infinite spin ensembles. To approximate it numerically, implementable strategies may be developed that rest on Pontryagin's maximum principle which characterizes minimizers. In [7], a stochastic gradient method is proposed to generate a sequence of functionaldecreasing approximate feedback controls, where the update requires to solve a coupled forward-backward SDE system. A relevant part here is to simulate a (time-discrete) backward SDE via the least-squares Monte-Carlo method, which requires significant data storage resources [6,7], and thus limits the complexity of practically approachable Problem 1.1.
In this work, we use an alternative strategy which rests on the dynamic programming principle. This allows us to prove the existence of a unique strong solution of Problem 1.1, which sharpens results of [7]. Since the solution of the underlying SDE lies on M = (S 2 ) N , the Hamilton-Jacobi-Bellman equation is defined on the manifold [0, T ] × M. In Sect. 3 we verify that the value function is the unique solution of that Bellman equation and that it belongs to C 1,2 [0, T ] × M . To solve this semi-linear PDE by deterministic numerical strategies seems non-accessible due to the high dimension of the underlying manifold M; we also want to avoid a direct probabilistic representation of its solution which would involve a backward SDE. Indeed, we demonstrate how the nonlinear HJB equation may be replaced with a linear parabolic PDE (3.12) by applying the Hopf-Cole transformation. The quadratic form resp. linearity of the control in the cost functional resp. in the equation (1.2) together with the geometric character of the problem then lead to an isotropic quadratic term in the HJB equation (3.8), which is crucial for this transformation. The regularity of the value function and the optimal policy mapping is explicitly expressed through the regularity of the terminal condition h. Furthermore, the solution w of the linear parabolic PDE can now be represented via a Feynman-Kac formula. This is the starting point for the numerical scheme proposed in Sect. 5. To approximate the optimal pair (m * , u * ) numerically, a Monte-Carlo method for the solution w of the linear equation (3.8) and its tangential gradient ∇ M w is proposed, from which the optimal feedback functionū can be obtained directly via (3.15). To approximate ∇ M w through a difference quotient with needed accuracy, we choose a stencil diameterh = O 1/ √ M for a sufficiently large number of Monte-Carlo realizations M; see Remark 6.1. Importantly, this approach does not require larger data storage resources as [6,7] does, but an ample calculation of iterates from related SDEs. Computational studies for the switching dynamics of single and multiple ferromagnetic particles are reported in Sect. 6.
The strategy which is proposed in this work to efficiently approximate the minimizer of the stochastic optimal control Problem 1.1 exploits its special structure to successfully apply the Hopf-Cole transformation-which then (i) guarantees the existence and uniqueness of a classical solution of the associated Hamilton-Jacobi-Bellman equation, and of a strong solution of the optimal control problem, (ii) leads to a characterisation of optimality through a linear parabolic PDE, which in turn allows the efficient numerical approximation in high-dimensional spaces through Monte-Carlo simulation without the need for computationally more costly backward SDEs.

The Stochastic Landau-Lifschitz-Gilbert Equation
The solution process m of (1.2) attains values in M = (S 2 ) N . For any Then, using σ σ σ also here, and combining H ani, For m i ∈ S 2 , one has  ; C(t, T ; (R 3 ) N ) of (2.3). Indeed by considering the truncation of the control and then using the stochastic version of the Arzela-Ascoli theorem, Prohorov's lemma and Jakubowski-Skorokhod representation theorem (cf. [14]), we have existence of a weak solution of (2.3). Moreover, an application of Gyöngy-Krylov's characterization of convergence in probability introduced in [12] along with pathwise uniqueness of weak martingale solutions gives existence of a unique strong solution, see [6,Appendix]. Furthermore, by applying Itô's formula to the functional x → x 2 R 3 for m i and using the vector identity a, a × b = 0 for any a, b ∈ R 3 , we have P-a.s., Since m i (t) ∈ S 2 , we see that P-a.s., each m i is S 2 -valued, and thus m ∈ L 2 Because the paths of the Landau-Lifschitz-Gilbert process stay on the manifold M, the natural domain for the value function of the control problem is As in the proof of [18, Proposition 3.2] we conclude that Taking the expectation then leads to (2.6), recalling that the Itô integral is a martingale.

Dynamic Programming and HJB Equation
For any (t, m m m) ∈ [0, T ] × M, we consider problem (2.3) to now construct the associated Hamilton-Jacobi-Bellman equation, following the formal rules of dynamic programming. We then use the Hopf-Cole transformation to replace the nonlinear HJB equation by a linear PDE and show the existence of a unique classical solution, which then implies existence of a unique classical solution of the original nonlinear HJB equation. Next, we present a verification theorem which shows that the value function is indeed equal to the solution of the HJB equation. We describe the optimal control through an optimal feedback function which is written explicitly in terms of the value function.

Let us define the Lagrangian
where the parameters δ, λ are given in Problem 1.1, and L m m m, u u u appears in the cost functional. Let , P, F , {F s } t≤s≤T be a given filtered probability space satisfying the usual hypotheses, and W is a {F s } t≤s≤T -adapted (R 3 ) N -valued Wiener process on it. We denote by U s s s m m m [t, T ] the set of all admissible pairs (m, u) such that u ∈ L 2 , and m(·) is the unique {F s } t≤s≤T -adapted M-valued strong solution of (2.3). In fact, the superscript s s s refers to the fact that we search an optimal admissible pair on the given filtered probability space. It follows for admissible (m, u) that L m(·), u(·) ∈ L 1 , recalling that h is a continuous function on a compact manifold. As a further consequence, Dynkin's formula (2.6) then holds for all Note that, thanks to [3, Proposition 1.33], there exists a unique strong solution m of (2.3) for u = 0, and hence the value function is uniformly bounded since h is a given continuous function and m 2 (R 3 ) N = N .

The Hamilton-Jacobi-Bellman Equation
We define the Hamiltonian Note that in the definition of H, the letter m m m stands for the identity map m m m → m m m. Using (2.1), we evaluate the supremum analytically. We have for the tangential gradient ∇ M ψ where * denotes the convex conjugate function, and Q is the 3N × 3N matrix given by Then, in summary, we have We point out that (3.6) ensures that the quadratic term in the Hamiltonian is isotropic, which is crucial for the Hopf-Cole transformation below. We now state the Hamilton-Jacobi-Bellman equation, whose solution we denote by W : (3.8) The Hopf-Cole transformation: The HJB equation (3.8) is a second-order nonlinear PDE on a high-dimensional domain and therefore without further understanding of its structure computationally expensive to solve; the study of its wellposedness as well as the regularity of its solution is non-trivial. We use the Hopf-Cole transformation w = exp −β W , β ∈ R given below, to substitute (3.8) by the linear PDE (3.12).
We span the tangent space of M at any point m m m by the orthonormal tangent vectors It is convenient to conceptually let ∂ i,1 , ∂ i,2 span the local coordinates associated to the ith sphere. Then we have the following relations: for j ∈ {1, 2} and i ∈ {1, 2, · · · , N }, We see that Therefore, we have the following correspondences: λν 2 to obtain a cancellation of the quadratic nonlinearity via (3.10). Substituting the respective terms in (3.8) and multiplying by −β w = 0, we arrive at the second-order linear equation The following theorem shows that weak solutions w may be examined in the Sobolev-Bochner space is the unique classical solution of the Bellman equation (3.8).
Proof The existence of a unique solution w of (3.13) in is for instance given in [19, p. 224]. Using charts of M and a partition of unity to represent (3.13) locally on the flat space, and an application of parabolic regularity theory ensures that w belongs to We shall now establish that w is positive so that (3.14) may be applied. Let φ(t, x) = exp(γ t)[w(t, x) − ] with := minx w(T ,x) > 0 and γ := divb ∞ + 1. In (3.13) we substitute ψ by exp(γ t)ψ and then add and subtract γ φ · ψ. Applying the product rule one obtains By construction ψ(T ) ≡ 0. We conclude that ψ ≡ 0 and therefore φ ≥ 0. Thus w ≥ > 0. The above implies that the nonlinear HJB equation (3.8) has a classical solution W . Moreover, the construction of the Hopf-Cole transformation directly ensures that a function w ∈ C 1,2 [0, T ] × M is a classical solution of (3.12) if and only if W of (3.14) solves (3.8) classically, which then implies the existence of a unique classical solution W , given by (3.14), of the HJB equation (3.8).
It is easy to see that additional smoothness of the terminal condition h directly translates into additional regularity of w and W .

The Verification Theorem
In this subsection, we show that V = W , i.e., the value function of the optimal control problem is equal to the solution of the above HJB equation (3.8). The following verification theorem also provides an explicit formula for the optimal control, which inherits the smoothness of ∇ M W .
Proof The proof is divided into two steps.
Step 1: T ] be any admissible pair. Now, for any ψ ∈ C 1,2 [0, T ) × M , we have, thanks to the definition of the Hamiltonian H, (3.1), and (2.5), and therefore one has Because W is a smooth classical solution, we may substitute ψ by W , in which case the left-hand side of (3.17) vanishes. In other words, Since W ∈ C 1,2 [0, T ] × M , by the mean-value theorem, either applied in combination with Whitney's extension theorem [8, Section 6.5] in the ambient space (R 3 ) N or directly to the geodesics of M, we have Thus, Recalling that by Theorem 3.1 the HJB equation (3.8) has a unique solution W , we conclude from the above that V = W . Now we show the uniqueness of the optimal admissible pair (m * , u * ), and thus in particular of the strong solution of Problem 1.1. We remark that the uniqueness of the optimal admissible pair is not automatically provided from the uniqueness of solutions of the HJB equation, which instead was used to characterize the value function through the differential equation (2.3).  of (m, u). This implies that also (3.16) holds with equality for a.e. s ∈ [t, T ], P-a.s. Hence (m * , u * ) and thus every optimal pair satisfies with m * (t) = 0. In view of (2.2), we observe that We now apply Itô's formula to the functional x → x 2 (R 3 ) N for the above equation, and then use (3.20) to have for some constant C > 0. An application of Gronwall's lemma then implies P-a.s., m * (s) = m * 1 (s), and therefore from (3.20) we get u * (s) = u * 1 (s) for a.e. s ∈ [t, T ]. Thus, the optimal control problem admits a unique strong solution, which is an improvement over [7].

Remark 3.4 In [7]
, P-a.s. the orthogonality of an optimal state and control was shown both theoretically and numerically. In our approach, we also see P-a.s. the orthogonality of m * and u * . Indeed, by using the vector identity a, a × b = 0 for any a, b ∈ (R 3

Probabilistic Representation of the Value Function
To solve the linear PDE (3.12) numerically by deterministic methods is still demanding since it is posed on M ⊂ (R 3 ) N . Therefore, we choose a probabilistic representation of the solution of (3.12) which requires to solve the following forward stochastic differential equation, defined on a given stochastic basis (  We note that because of the linearity of (3.12) the Feynman-Kac representation can be used in place of a backward SDE.

A Numerical Scheme for (4.1)
To approximate the solution m m m of (4.1), we use the semi-implicit method proposed in [17]. Note that (4.3a)-(4.3b) is a system of linear equations, leading to fast simulation times. Furthermore, the numerical schemes ensures m m m j takes values on M. This is exploited when applying arguments from [3] in this finite ensemble setting to conclude weak convergence of iterates {m m m j } J j= +1 , which here are driven by a random walk, towards the strong solution of (4.1) for τ → 0.

HJB Solution
The classical solution w of (3.12) is given by (4.2). In order to approximate it, and hence the classical solution W of the nonlinear HJB equation

Optimal Feedback Transformation
To approximate the functionū at any point

Optimal State
We use again the semi-implicit method proposed in [17] to approximate the solution in a i m j + e e e j 2 ,ū t j , g such that the iterates {m j } J j= converge towards a weak solution of (2.3) for τ → 0. Moreover, the iterates {u j =ū(t j , m j )} J j= defines the discrete optimal control along I J . In summary, we have the following algorithm to compute the optimal solution and control along with Method B.

Computational Experiments
We computationally study the behavior of the optimal state and control for the switching dynamics of an ensemble of N particles by using the algorithms from Sect. 5. For this purpose, we employ discretely distributed random numbers from the GNU Scientific Library [10]. All computations are performed on an Intel Core i5-4670 3.40GHz processor with 16GB RAM in double precision arithmetic. The arising linear algebraic systems are solved by the Gaussian elimination method [10].

Test Studies
We start with test problems to compare the two methods from Sect. 5.2. For this purpose, we omit certain energy contributions in (1.1), and allow only one or two spins such that an exact solution of (3.12) becomes available. As w = exp(−W ), we know that w has to be positive, while w 0,1 may also take negative values. Therefore we also use the constant spherical harmonic Letū exct (t, m m m) resp.ū app (t, m m m) be the function defined in (3.15) associated to (6.2) resp. (4.2), and m * exct (t) resp. m * app (t) be the solution of (2.3) withū exct t, m * exct (t) resp. u app t, m * app (t) . By denoting the error we ,m = e 1 , and other parameters as specified at the beginning of this subsection. We observe that the error err(t) for Method B is significantly smaller (by a factor of 1 20 in our simulations) if compared to Method A, see Fig. 1. Moreover, at least M ≈ 10 6 realizations are needed to balance the approximate computation via Method B with the remaining error sources. We compute the tangential gradient of the functions w 00,00 , w 01,00 and w 00,01 : ,m = e 1 , e 2 , and other parameters as specified in the problem. We observe that the error err(t) decreases if one increases the sample size M.
We observe that the error err(t) for the both test problems 1 and 2 is of the same magnitude as the error made in the approximation of ∇ M w (hence ∇ M W ). Optimal control of two interacting isotropic spins. Remaining in the setting of test problem 2 we next study the time evolution of a single trajectory of the optimal state, as well as the magnitude and direction of the optimal control. In this case, the trajectory of the optimal control lies in x 1 -x 2 plane to balance the random influences; see Fig. 2.

Remark 6.2
Computational studies for both test problems suggest stability of the scheme (4.3a)-(4.3b). However, convergence resp. termination of the scheme depends crucially on

Fig. 2
Test problem 2: time evolution of a single trajectory of the optimal state t → m * i (t) (red), the direction of the optimal control t → u * , and the magnitude of the optimal control t → u * i (t) 2 , an exponential overflow occurs during truncation in simulations, and therefore the computed value of w t, m m m in (4.2) is set to zero then. Hence, in this case, log w(t, m m m) is not defined, and thus the approximation procedure to approximate ∇ M W (t, m m m) terminates. This is one reason that Examples 5.1 and 5.2 from [7] may not directly be simulated here. Notice that no exponential overflow occurs for both test problems above, since δ = 0.

Optimal Control of Three Interacting Spins
We now study an ensemble of N = 3 particles, which additionally are subjected to exchange forces. We are interested in the switching control for one (i = 2) of these particles fromm 2 (at initial time) to −m 2 at given final time T . Take h(m(T )) = 1 2 m(T ) − m(T ) 2 (R 3 ) N , where the deterministic target profile m : [0, T ] → (S 2 ) 3 is given by We use again Method B to approximate ∇ M W . To simulate the optimal pair of the underlying problem, we have used the methodology described in Sects. 4.1 and 5.1-5.3, along with the following set of parameters: with the positive semi-definite matrix J such that for any  1, 2, . . . , N ) , In this case, the minimum value of the cost functional is J * sto ≡ 0.9078. Though the first and third spins start already at the desired state, it is due to the noise, and the exchange forces in particular, that the optimal control is acting on the whole time interval and on all spins. For the second spin, we observe that at the beginning and end, less control is needed opposed to the applied control at the intermediate times; see Fig. 3. The orthogonality of the optimal pair (m * , u * ) (e.g. Remark 3.4) is shown in Fig. 3g-i by displaying the temporal evolution

Optimal Control of Four Interacting Spins
We consider here the switching control for an ensemble of N = 4 particles. Set-up 1: We use the parameters as in Sect. 6.2 withm = e 1 , −e 1 , e 1 , −e 1 , and m(t) = e 1 , m 2 (t), e 1 , m 2 (t) . In this case, the first and third spins start already at the desired state; the associated optimal controls are acting on the whole time interval. Moreover, for the second and fourth spins, significant controls are required to approximately meet the terminal state profile. The time evolution of t → u * 2 (t) 2 R 3 is similar to the results for N = 3 spin constellations (see Fig. 3e), while u * 4 (t) 2 R 3 is delayed in time for the fourth spin. We observe a loop of the orientation of u *

Optimal Control of Ten Interacting Spins
We consider here an ensemble of N = 10 particles to optimally control the dynamics to reach a deterministic target profile  Fig. 3 Time evolution of a single trajectory of the optimal state t → m * i (t) (red), the direction of the optimal control t → u * i (t) u * i (t) −1 R 3 (blue), the magnitude of the optimal control t → u * i (t) 2 R 3 , and the angle between optimal pair t → u * Time evolution of a single trajectory of the optimal state t → m * i (t) (red), the direction of the optimal control t → u * i (t) u * i (t) −1 R 3 (blue), and the magnitude of the optimal control t → u * i (t) 2 R 3 with set-up 2 for i = 1, 2, 3, 4 (Color figure online) is penalized more strongly than in the previous experiments, which has a noticeable effect on the magnitude of u * , compare Figs. 5e-h and 6c, e. At the beginning a stronger control is applied to move towards the desired target profile. Because of the large noise intensity ν, and the less control, some particles of this ensemble do not reach the target profile appropriately. For illustration, we plotted the behavior of the optimal state t → m * i (t) (red), the direction

Fig. 6
A single realization of the optimal state m * (red) at final time T , the time evolution of the optimal state t → m * i (t) (red), the direction of the optimal control t → u * i (t) u * i (t) −1 R 3 (blue), and the magnitude of the optimal control t → u * i (t) 2 R 3 for i = 3, 7 (Color figure online) of the optimal control t → u * i (t) u * i (t) −1 R 3 (blue), and the magnitude of the optimal control t → u * i (t) 2 R 3 for i = 3, 7; see Fig. 6. Let us specify the cost of the proposed method in terms of computations and storage. In the definition of J sto resp. w in Problem 1.1 resp. (4.2), we approximate the expectation via Monte-Carlo estimation along with the variance reduction method of antithetic variates. There, each realization of the integrals requires to store the iterates {m j } J j=0 resp. {m m m j } J j=0 with O(N J) storage complexity due to the necessity of piecewise interpolation as in (5.1). The latter iterates are computed via scheme (4.3a)-(4.3b) by solving 2N linear 3×3 systems, which can be done analytically in O(1) time and in parallel, cf. the discussion in [17, Section 2.3]. The most computationally intensive part is to approximate ∇ M w(t j , m j ), whose accuracy has a direct impact on both, the optimal solution and control. Comparative computational studies show that at least O(10 6 ) realizations are needed to approximate ∇ M w(t j , m j ), whereas O(10 4 ) realizations are sufficient to approximate the cost function. This dependence on the sample size M ∈ N to approximate the optimal control is strengthened in the case of fast changing optimal states.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.