Introduction

Recent advancements of data-driven methods and machine-learned (ML) interatomic potentials have led to dramatically improved descriptions of the potential energy surface (PES) for many material systems. However, the incorporation of spin degrees of freedom (DOF), which are crucial to capture finite temperature phenomena in magnetic materials, has remained a challenging endeavor. In spin density functional theory (SDFT), magnetization emerges from the competition of magnetic exchange and band energy contributions1,2, where the energy required for reshuffling electrons in up and down spin channels depends on the local density of states (DOS). The bimodal DOS of iron in the body-centred cubic (bcc) structure affords large DOS values close to the Fermi level, leading to larger magnetic moments than in the face-centred cubic (fcc) structure with its more unimodal DOS that is lower at the Fermi level3,4. This intricate interplay between magnetic and atomic structure implies that multi-atom multi-spin interactions are necessary for capturing different magnetic and atomic structures in a single model.

Unlike approaches that were derived from electronic structure theory and that seamlessly incorporate the complexity of magnetic interactions5,6,7, classical interatomic potentials needed to be supplemented via suitable interaction terms that mimic the quantum exchange interactions. The simplest possibility was to employ a classical Heisenberg Hamiltonian8, where the atomic spin operators are substituted by spin vectors and the exchange interactions are parameterized using first-principles calculations9. Such strategies have been adopted also in most current ML approaches for magnetic systems.

Nikolov et al.10 augmented the spectral neighborhood analysis potential (SNAP) framework with a two-spin bi-linear Heisenberg model with atomic magnetic moment magnitudes being fixed and independent of the environment. A similar approach, where a neural network was trained to describe contributions to the Heisenberg Hamiltonian based on the local magnetic environment, was developed by Yu et al.11. However, this approach did not include information about the underlying lattice and treated the magnetic moments as unit vectors. Eckhoff et al.12 extended the formalism based on Behler-Parrinello symmetry functions13 in a framework that was limited to collinear configurations. Magnetic moments as additional DOF were incorporated by Novikov et al.14 in the moment tensor potential framework15. Even though the description was confined to collinear moments only, the magnetic moment tensor potential was able to reproduce a number of thermodynamic properties of bulk bcc Fe. Recently, Domina et al.16 extended the SNAP framework to deal with arbitrary vectorial fields and demonstrated its functionality by training to non-collinear spin configurations generated using a model Landau-Heisenberg Hamiltonian. In a follow-up work, Suzuki et al.17 showed that it is necessary to include higher-order spin-dependent partial spectra to discriminate configurations with different spin orientations and magnetic anisotropy. Finally, aiming at large-scale spin-lattice dynamics simulations, Chapman et al.18 added a neural network correction term to an embedded atom method potential augmented with a Heisenberg-Landau Hamiltonian. The model was successfully applied in finite temperature simulations of bulk Fe phases as well as complex defects. However, due to its simplicity, absolute errors in some cases exceeded a few tens of meV, which is comparable to the fluctuations of exchange parameters with temperature.
Thus, none of the existing magnetic ML approaches has so far succeeded in achieving a transferable and quantitatively accurate description of magnetic interactions suitable for modeling magnetism in different crystal structures.

We present an explicit treatment of non-collinear magnetic DOF within the atomic cluster expansion (ACE)19,20, which provides a complete basis in the space of atomic environments19,21. Accurate, transferable and computationally efficient parameterizations of ACE have been developed for diverse bonding environments including bulk metallic systems as well as covalent molecules22,23,24,25,26. Thanks to ACE universality, additional scalar, vectorial or tensorial DOF can be incorporated seamlessly into ACE models20. Specifically for magnetic systems, ACE provides a body-ordered decomposition of combined atomic and magnetic PES in terms of a complete set of basis functions that depend on atomic and magnetic DOF. The inclusion of magnetic DOF requires an extension of the ACE equivariant basis such that any transformation of the relevant translation and rotation symmetry group acting on both atomic and magnetic spaces leaves the energy invariant. Magnetic ACE can therefore be considered as a generalization of most existing magnetic ML models as well as the classical spin-cluster expansion (SCE)27,28,29,30,31.

In this work, we develop a non-collinear magnetic ACE parameterization for the prototypical magnetic element Fe. The model is trained on a large dataset of both collinear and non-collinear DFT calculations and validated for a broad range of structural, thermodynamic, and defect properties. The resulting interatomic potential is able to describe accurately complex potential energy landscapes of different magnetic and atomic phases of Fe as a function of both atomic positions and local magnetic moment vectors.

Results

Reference DFT data

A comprehensive sampling of variations in both atomic positions and magnetic moments is crucial for the construction of any atomistic magnetic ML model. Sampling of the atomic DOF can be carried out following well established protocols employed in ML fitting of PES, commonly by choosing a set of structures and varying their geometries and atomic positions. In contrast, sampling of the magnetic DOF presents a significant difficulty, both from the computational and methodological point of view. Firstly, the number of required calculations increases drastically due to the additional spin degrees of freedom and, secondly, the local atomic magnetic moments need to be constrained to desired magnitudes and orientations32. While it is, in principle, possible to fix both the direction and the magnitude of each atomic magnetic moment to a target vector32, these calculations are computationally demanding. Furthermore, as atomic magnetic moments are computed by integrating over a sphere, different magnetization densities within the sphere may in principle lead to identical moments.

To generate the training dataset for magnetic ACE, we considered both conventional unconstrained collinear and constrained non-collinear spin-polarized configurations. These configurations ranged from various spin spirals in ideal bcc cells to supercells with random orientations of the moments and perturbed atomic positions. For bulk phases along the Bain transformation path, we sampled the magnitudes of the collinear magnetic moment over the whole physically reasonable range from 0 to ~ 3 μB atom−1. The simultaneous sampling of both atomic and magnetic DOF enabled us to generate a set of uniformly distributed configurations that are relevant for the properties of interest for a wide range of atomic densities as well as magnitudes and directions of the atomic magnetic moments. An example of data collected with this strategy is given in Fig. 1 for the bcc and fcc ferromagnetic (FM) phases. Each data point corresponds to the energy of either structure at a given volume and a constrained value of the magnetic moment. The ground state configurations are marked by the black curve. While the bcc phase has only one minimum, corresponding to the ground state FM bcc phase, the fcc phase exhibits two minima corresponding to high-spin and low-spin configurations.

Fig. 1: DFT energy vs volume for FM bcc and fcc.

Constant magnetic moment energy-volume curves for FM bcc and fcc phases computed using constrained DFT. The black curve marks the ground state configurations without any applied constraint. The two minima for fcc correspond to the high- and low-spin magnetic configurations.

The constrained magnetic calculations required convergence of the energy and forces with respect to a constraining penalty term32. In some limited cases it was computationally prohibitive to achieve numerically small penalty contributions, mainly for configurations far from equilibrium such as highly distorted structures and defects (see Supplementary Note 2 for representative examples). Therefore, we excluded configurations for which the penalty energy was larger than ≈ 5 meV atom−1 as these would significantly increase the noise in the data and adversely affect the parameterization.
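The exclusion criterion above can be sketched in a few lines. This is only an illustration: the threshold of about 5 meV atom−1 comes from the text, while the record layout (the keys `name`, `n_atoms` and `penalty_energy_eV`) and the example values are hypothetical.

```python
# Threshold from the text: about 5 meV per atom, expressed in eV.
PENALTY_THRESHOLD_EV_PER_ATOM = 5.0e-3

def filter_by_penalty(configs, threshold=PENALTY_THRESHOLD_EV_PER_ATOM):
    """Keep configurations whose penalty energy per atom is within the threshold."""
    kept = []
    for cfg in configs:
        per_atom = abs(cfg["penalty_energy_eV"]) / cfg["n_atoms"]
        if per_atom <= threshold:
            kept.append(cfg)
    return kept

# Hypothetical records, for illustration only.
example = [
    {"name": "bcc_spiral", "n_atoms": 2, "penalty_energy_eV": 0.002},
    {"name": "distorted_defect", "n_atoms": 54, "penalty_energy_eV": 0.81},
]
kept_names = [cfg["name"] for cfg in filter_by_penalty(example)]
```

Here the second record carries 15 meV atom−1 of penalty energy and is dropped, while the first (1 meV atom−1) is retained.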

The resulting training dataset contained about 70,000 structures in total that can be divided into several categories, each associated with a particular property of interest. The categories are listed in Table 1, where we specify the number of configurations and the range of volumes and magnetic moment magnitudes that we considered. The free atom data, obtained from calculations of a bcc unit cell with lattice parameter equal to 12 Å at different magnitudes of the magnetic moment, was used to fit the first-order contribution of the expansion that characterizes the asymptotic large volume limit for each structure with a given magnetic moment. Detailed information about the free atom reference is provided in the Supplementary Note 1.

Table 1 Summary of the database.

Training procedure

The fitting of the magnetic ACE potential for Fe was done following procedures that were established for the non-magnetic ACE22,33. A hierarchical basis extension was employed, starting from the one-body contribution and gradually adding contributions with higher body orders. In the first step, expansion coefficients for the first order contribution were parameterized using the free magnetic atom data. This term can be reduced to a Ginzburg-Landau expansion \({\sum }_{n}{A}_{n}{{{{\bf{m}}}}}_{i}^{2n}\), where a maximum number of three terms is commonly used31,34,35,36,37. In our parameterization, excellent agreement with the reference data (~2 meV atom−1 error) could be obtained using four terms (n = 4). After the first order contribution was fixed, we fitted second-order contributions. The ACE second-order contributions are formally equivalent to a distance-dependent Heisenberg Hamiltonian \({\sum }_{i\ > \ j}{J}_{ij}({r}_{ij}){{{{\bf{m}}}}}_{i}\cdot {{{{\bf{m}}}}}_{j}\), its biquadratic correction \({\sum }_{i\ > \ j}{B}_{ij}({r}_{ij}){({{{{\bf{m}}}}}_{i}\cdot {{{{\bf{m}}}}}_{j})}^{2}\), and its bicubic term for \({l}_{max}^{{\prime} }=1\), 2 and 3, respectively (see Methods). In addition, a third-order magnetic contribution, analogous to a screened three-spin interaction \({\sum }_{ijk}{K}_{ijk}({{{{\bf{m}}}}}_{i}\cdot {{{{\bf{m}}}}}_{j})({{{{\bf{m}}}}}_{j}\cdot {{{{\bf{m}}}}}_{k})\), was also included. Angular contributions in higher-order magnetic terms did not improve the fit significantly and were neglected, which substantially reduced the number of basis functions. Radial and angular indices for the atomic contributions were then incremented following a hierarchical basis expansion scheme33, where contributions with increasing body order were gradually added. The cutoff distance of the present parametrization was set to 4.5 Å, but it can be extended if necessary28.
Additional hyperparameters, relevant to magnetic DOF only, include the magnetic cutoff mcut = 4 μB, which defines the upper bound of the possible magnitude of atomic magnetic moments, and the upper bounds for magnetic radial functions and spherical harmonics \({n}_{max}^{{\prime} }\) and \({l}_{max}^{{\prime} }\) for each body order (see Methods for details).
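As an illustration of the first fitting step, the following sketch fits a four-term expansion of the Ginzburg-Landau form to energy-versus-moment data by linear least squares. It is not the fitting code used for the potential: the data are synthetic, the coefficients in `true_coeffs` are hypothetical, and the sum is assumed to run over n = 1 to 4 with the energy referenced to zero at m = 0.

```python
import numpy as np

def fit_landau(m, e, n_terms=4):
    """Least-squares fit of E(m) = sum_{n=1..n_terms} A_n m^(2n)."""
    design = np.stack([m ** (2 * n) for n in range(1, n_terms + 1)], axis=1)
    coeffs, *_ = np.linalg.lstsq(design, e, rcond=None)
    return coeffs

def landau_energy(m, coeffs):
    """Evaluate the even-power expansion for given coefficients A_1, A_2, ..."""
    return sum(a * m ** (2 * (n + 1)) for n, a in enumerate(coeffs))

# Synthetic stand-in for free-atom reference data (hypothetical coefficients, eV).
m_grid = np.linspace(0.0, 3.0, 31)            # moment magnitudes in mu_B
true_coeffs = np.array([-0.40, 0.05, -0.002, 0.0001])
e_ref = landau_energy(m_grid, true_coeffs)

fitted = fit_landau(m_grid, e_ref)
max_error = float(np.max(np.abs(landau_energy(m_grid, fitted) - e_ref)))
```

Because the model is linear in the coefficients, no iterative optimization is needed at this stage; only even powers of the moment appear, so the energy is unchanged when the moment is reversed.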

The resulting model consists of 6519 parameters and its overall accuracy is equal to 8 meV atom−1 and 37 meV Å−1 for energies and forces, respectively. The main limiting factor in reducing the error further was numerical noise in the reference DFT data that originated from the magnetic moment confinement procedure. In addition, another parameterization was constructed with a particular focus on defect properties (see Supplementary Note 3).

Predicted properties at 0 K

We carried out a thorough validation of the non-collinear magnetic ACE against the reference DFT data and evaluated a broad range of properties of various bulk Fe phases that were not included explicitly in the training. The predicted volume-energy curves for the bcc and fcc magnetic and non-magnetic (NM) Fe phases are plotted in Fig. 2, where the corresponding cohesive energies are given with respect to the non-magnetic free atom. It is obvious that ACE predictions agree closely with the reference DFT data for all considered magnetic and non-magnetic phases, including the portion of the magnetic energy landscape where the NM to magnetic transitions take place. Moreover, our potential is able to distinguish subtly different magnetic states within one structure, such as the low-spin and high-spin states of the FM fcc structure.

Fig. 2: Energy vs volume for bcc and fcc.

Volume-energy curves for both magnetic and non-magnetic structures of bcc (left) and fcc (right) with corresponding DFT data (small circles).

Variations of the magnetic energy as a function of magnetic moment are displayed for FM bcc in Fig. 3, where each curve corresponds to a constant volume. As expected, these dependencies are positive and monotonic for small volumes (dark blue curves), while above a certain critical volume their behavior qualitatively changes to include a minimum at a finite value of the magnetic moment in analogy to a Landau expansion. In the limit of large volumes (dark red curves), the magnetic energy approaches the free atom value. Graphs for other bcc and fcc structures are given in the Supplementary Figure 6.
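The analogy with a Landau expansion can be made explicit with a fourth-order truncation. The coefficients a(V) and b below are illustrative symbols, not fitted quantities, and b > 0 is assumed:

$$E(m)=a(V)\,{m}^{2}+b\,{m}^{4},\qquad \frac{\partial E}{\partial m}=2a(V)\,m+4b\,{m}^{3}=0\quad \Rightarrow \quad m=0\quad {{{\rm{or}}}}\quad {m}_{0}=\sqrt{-\frac{a(V)}{2b}}.$$

For a(V) ≥ 0 the only stationary point is m = 0 and the energy increases monotonically with the moment. For a(V) < 0 a minimum appears at the finite moment m0, with \(E({m}_{0})=-a{(V)}^{2}/(4b)\). In this simplified picture, the critical volume is the one at which a(V) changes sign.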

Fig. 3: Magnetic energy for FM bcc.

Magnetic energy vs magnetic moment magnitude at different volumes for bcc FM. Dashed lines and black dots correspond to equilibrium volume ACE and reference DFT data, respectively.

Two contour plots of magnetic PES for FM bcc and fcc phases are shown in Fig. 4. These plots demonstrate that ACE can capture simultaneously PES of different phases over a broad range of volumes and magnetic moments (from zero up to ≈ 3.2 μB). In agreement with DFT, the bcc phase has a single global minimum at the corresponding equilibrium volume and magnetic moment, while the FM fcc phase exhibits two local minima corresponding to the low-spin and high-spin states.

Fig. 4: Contour plots for FM bcc and fcc.

Contour plots for the bcc (left) and fcc (right) FM phases.

The equilibrium properties of the most important bulk Fe phases are listed in Table 2. Further properties, such as magnetic moment variations and phonon spectra, are presented in the Supplementary Figures 7 and 8. As one can see, the equilibrium lattice parameters, magnetic moments, and elastic constants of the magnetic phases are in good agreement with the reference DFT values. Larger discrepancies exist for the non-magnetic phases since only very few of these configurations were included in the training dataset.

Table 2 Equilibrium elastic properties for bcc and fcc.

The Bain transformation path is closely related to the bcc-fcc phase transformation. In the case of Fe, the energetics of this transformation depends sensitively on the magnetic state of both phases38,39,40,41. Variations of energy along the Bain path, computed at the FM bcc equilibrium volume for different magnetic phases of Fe are shown in Fig. 5 for ACE and DFT. Unlike the ground state FM bcc phase, the AFM and NM bcc phases are not mechanically stable with respect to tetragonal distortion, as reflected by negative values of the \({C}^{{\prime} }=\frac{1}{2}({C}_{11}-{C}_{12})\) elastic constant (cf. Table 2). For the fcc phase (\(c/a=\sqrt{2}\)), the energies of the FM and AFM magnetic states are almost identical, but both phases are unstable. The minimum energy AFM structure is a body-centered tetragonal phase with c/a ≈ 1.45. The excellent agreement between ACE and DFT for the Bain path is due to correct incorporation of the coupling between the magnetic and lattice DOF, which is anomalously strong in Fe40.
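The geometry of a constant-volume Bain path can be sketched as follows, assuming a two-atom body-centred tetragonal cell whose c/a ratio runs from 1 (bcc) to \(\sqrt{2}\) (fcc). The volume per atom used here is a hypothetical placeholder, not the equilibrium value from Table 2.

```python
import math

def bain_cell(volume_per_atom, c_over_a):
    """Lattice parameters (a, c) of a two-atom body-centred tetragonal cell
    with the given volume per atom and c/a ratio (cell volume = a^2 * c)."""
    a = (2.0 * volume_per_atom / c_over_a) ** (1.0 / 3.0)
    return a, c_over_a * a

volume = 11.4  # Å^3 per atom, hypothetical value for illustration

a_bcc, c_bcc = bain_cell(volume, 1.0)              # bcc end point, c/a = 1
a_bct, c_fcc = bain_cell(volume, math.sqrt(2.0))   # fcc end point, c/a = sqrt(2)

# Conventional fcc lattice parameter at the same volume (four atoms per cell).
a_fcc_conventional = (4.0 * volume) ** (1.0 / 3.0)
```

At c/a = \(\sqrt{2}\) the tetragonal c axis coincides with the conventional fcc lattice parameter, which is why this single parameter connects the two cubic phases.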

Fig. 5: Bain transformation paths for FM, AFM, and NM phases.

Bain transformation paths between the FM, AFM, and NM bcc and fcc phases (ACE: lines, DFT: circles).

The energy barrier for spin rotations depends sensitively on angular interactions between atomic magnetic moments and changes in moment magnitudes. Here, we demonstrate that ACE captures the energetics of spin rotation between FM and AFM bcc phases. In Fig. 6, we show the energetics associated with the rotation of one magnetic moment in a two-atom bcc cell. As the moment on the central atom is rotated, the magnetic configuration gradually transforms from FM to AFM. The contour plot in Fig. 6(b) depicts PES as a function of volume and rotation angle. The black arrow marks the minimum energy path between the FM and AFM phases. While the equilibrium volumes of both phases are not very different, the magnetic moment of the AFM phase is significantly lower than that of the FM phase (cf. Table 2). This is also correctly reproduced by ACE, as shown in Fig. 6(c), where we plot the rotation energy barriers evaluated at constant magnetic moments. The minimum energy path (dashed gray curve), corresponding to a reduction of the absolute value of magnetic moment from 2.22 μB in FM bcc to 1.25 μB in AFM bcc, is in excellent agreement with the DFT reference (black points).

Fig. 6: FM to AFM spin rotation for bcc.

Analysis of the FM to AFM transformation in the bcc phase via rotation of the spin on the central atom: a A schematic picture of the transformation. b A contour plot of PES as a function of volume vs rotation angle. c FM to AFM spin rotation energy barriers at constant magnetic moment. The minimum energy path is marked by the gray dashed curve together with the DFT reference (black points).

The energy of magnetic moment orientations that deviate only slightly from the collinear alignment can be described by lowest order contributions only, i.e., a bilinear Heisenberg model. From the distance-dependent exchange interactions Jij in the bilinear Heisenberg model, the magnon spectrum can be obtained in the adiabatic approximation as

$${E}_{i}\left({{{\bf{q}}}}\right)=\mathop{\sum}\limits_{j}{J}_{ij}\left[1-\cos \left({{{\bf{q}}}}\cdot {{{{\bf{R}}}}}_{ij}\right)\right].$$
(1)
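A minimal numerical sketch of Eq. (1) is given below. It assumes a bcc lattice with a single nearest-neighbour exchange constant; the lattice parameter `a` and the value `J1` are hypothetical and are not the exchange interactions obtained from ACE.

```python
import numpy as np

def bcc_first_shell(a):
    """Eight nearest-neighbour vectors R_ij of a bcc lattice with parameter a."""
    signs = np.array([[sx, sy, sz]
                      for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)])
    return 0.5 * a * signs

def magnon_energy(q, neighbours, exchange):
    """E(q) = sum_j J_ij [1 - cos(q . R_ij)], following Eq. (1)."""
    phases = neighbours @ q
    return float(np.sum(exchange * (1.0 - np.cos(phases))))

a = 2.83     # Å, hypothetical lattice parameter
J1 = 0.02    # eV, hypothetical nearest-neighbour exchange constant
R = bcc_first_shell(a)
J = np.full(len(R), J1)

e_gamma = magnon_energy(np.zeros(3), R, J)                               # zone centre
e_H = magnon_energy(2.0 * np.pi / a * np.array([1.0, 0.0, 0.0]), R, J)   # H point
```

With only the first shell included, the energy vanishes at the zone centre and reaches 16 J1 at the H point, where every phase q ⋅ Rij equals ±π. Further shells are added by extending `R` and `J`.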

We determined the exchange interactions for different coordination shells following the real space approach by Liechtenstein et al.42,43,44, where infinitesimal perturbations to the directions of two neighboring magnetic moments are applied. Calculating the energy δEij for rotating two spins at atomic sites i and j by opposite infinitesimal angles ± θ/2 and comparing to the energy for rotating the two spins individually, δEi and δEj, results in

$$\delta {E}_{ij}-(\delta {E}_{i}+\delta {E}_{j})={J}_{ij}\left(1-\cos \theta \right) \sim \frac{1}{2}{J}_{ij}{\theta }^{2}.$$
(2)

The distance dependent exchange interactions are then obtained by fitting δEij − (δEi + δEj) with respect to the tilting angle for consecutive coordination shells in a large supercell. The resulting adiabatic magnon spectrum is shown in Fig. 7 with reference data obtained using the spin-polarized relativistic Korringa-Kohn-Rostoker (SPRKKR) framework45. Small discrepancies between the ACE and SPRKKR results visible for some high frequencies in the magnon spectrum can be attributed to the long range part of the magnetic interactions neglected in the present ACE parameterization. Nevertheless, the overall agreement is good, indicating the ability of our parameterization to describe spin spirals with different frequencies.
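The fitting step can be illustrated with a one-parameter least-squares fit of the small-angle form of Eq. (2). The energy differences below are synthetic, generated from an assumed value `J_true`; in practice they would come from energy evaluations of the tilted configurations.

```python
import numpy as np

def fit_exchange(theta, delta_e):
    """Least-squares estimate of J from delta_e ≈ 0.5 * J * theta**2."""
    basis = 0.5 * theta ** 2
    return float(np.dot(basis, delta_e) / np.dot(basis, basis))

J_true = 0.015                               # eV, hypothetical exchange constant
theta = np.linspace(0.01, 0.10, 10)          # tilting angles in radians
# Synthetic stand-in for dE_ij - (dE_i + dE_j), using the closed form of Eq. (2).
delta_e = J_true * (1.0 - np.cos(theta))

J_fit = fit_exchange(theta, delta_e)
```

The angles are kept small so that the quartic correction to the quadratic form remains far below the fitted value.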

Fig. 7: Jijs and magnon spectra.

Exchange interactions (left) and adiabatic magnon spectra (right) predicted by ACE (red) in comparison with SPRKKR calculations (black).

Phase transformations at finite temperatures

The magnetic ACE can be applied in large-scale finite temperature simulations to investigate properties that depend on both spin and lattice DOF. The ACE prediction of the FM to paramagnetic (PM) phase transition in bcc Fe is presented in Fig. 8. In a simulation with 3456 atoms, we employed coupled molecular dynamics (MD) - Monte Carlo (MC) sampling46, where the atoms follow Langevin dynamics while MC is employed for updating the directions of the atomic magnetic moments. A direct simulation of the dynamics of the combined atomic and magnetic system is difficult due to the lack of numerically stable and efficient symplectic integrators for multi-spin models beyond Heisenberg-Landau. The MC sampling enabled us to overcome this problem and to investigate the effect of longitudinal spin fluctuations (LSF) on the FM-PM transition by carrying out the simulations with either constant or variable spin magnitudes. The variation of magnetization with temperature shown in Fig. 8 is consistent with previous theoretical studies34,47,48,49. The predicted Curie temperatures from the two approaches are about 980 K (without LSF) and 890 K (with LSF). The smaller value of TC obtained in the latter case can be attributed to a decrease of the average local magnetic moment magnitude with temperature, which was reported in previous studies34,50. The underestimation in comparison with the experimental value of 1043 K is likely related to the neglect of thermal expansion, as all simulations were performed at the volume corresponding to the FM bcc phase at 0 K.
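The MC part of such a scheme can be illustrated with a Metropolis update of spin directions at fixed magnitude. The sketch below is not the simulation setup used here: the atoms are frozen, the energy is a nearest-neighbour Heisenberg model on a ring (a stand-in for the ACE energy), and the values of `J`, `kT` and the system size are hypothetical.

```python
import numpy as np

def random_unit_vector(rng):
    """Uniformly distributed direction on the unit sphere."""
    v = rng.normal(size=3)
    return v / np.linalg.norm(v)

def site_energy(spins, i, J):
    """Heisenberg energy of spin i with its two ring neighbours (stand-in model)."""
    n = len(spins)
    return -J * np.dot(spins[i], spins[(i - 1) % n] + spins[(i + 1) % n])

def metropolis_sweep(spins, J, kT, rng):
    """One sweep: propose a new direction for each spin, magnitude kept fixed."""
    for i in range(len(spins)):
        old = spins[i].copy()
        e_old = site_energy(spins, i, J)
        spins[i] = random_unit_vector(rng)
        d_e = site_energy(spins, i, J) - e_old
        # Accept downhill moves; accept uphill moves with Boltzmann probability.
        if d_e > 0.0 and rng.random() >= np.exp(-d_e / kT):
            spins[i] = old
    return spins

rng = np.random.default_rng(0)
spins = np.tile([0.0, 0.0, 1.0], (16, 1))   # start from the ferromagnetic state
for _ in range(200):
    metropolis_sweep(spins, J=1.0, kT=0.01, rng=rng)
magnetization = float(np.linalg.norm(spins.mean(axis=0)))
```

In a coupled scheme, sweeps of this kind would alternate with MD steps for the atomic positions; sampling longitudinal fluctuations would additionally require trial moves that change the moment magnitudes.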

Fig. 8: Magnetization vs temperature.

Magnetization vs temperature with and without LSF. The vertical dashed lines indicate the estimated Curie temperatures TC. The experimental value of TC is 1043 K. Insets show snapshots of parts of the simulation cell with atomic magnetic moments marked by red arrows.

Apart from the magnetic transition, we also investigated the structural transitions from α to γ (bcc to fcc) and γ to δ (fcc to bcc) phases of Fe using the stress-strain thermodynamic integration method (SSTI)51,52 (see Methods). In these simulations, it is essential to include the effect of lattice expansion. As shown in Fig. 9(a), ACE predicts qualitatively correctly the increase of lattice parameters with increasing temperature for both bcc and fcc phases. The discrepancy with respect to the experimental values can be traced to overbinding of the GGA functional.

Fig. 9: Thermal expansion and Gibbs free energy difference.

Lattice thermal expansion for bcc and fcc phases predicted by ACE in comparison to experimental and theoretical results (a); GAP (ref. 68), Exp. 1 (ref. 69), MSLP (ref. 18), Exp. 2 (ref. 53). Gibbs free energy difference between bcc and fcc phases as a function of temperature for both ACE and CALPHAD70 (b). Vertical green and red dashed lines indicate ACE and experimental transition temperatures53.

The estimated transition temperatures of the two transitions are 1430 and 1710 K, respectively. These theoretical predictions agree reasonably well with the experimental values \({T}_{\alpha -\gamma }^{exp}=1185\) K and \({T}_{\gamma -\delta }^{exp}=1667\) K53, as also visible in Fig. 9(b). The overestimation of Tαγ could be due to the insufficient description of the effect of magnetic fluctuations on the free energy difference, that is responsible for the α to γ transformation54, while the interplay between vibrations and magnetic excitations, which largely affects the γ to δ transition54,55, is correctly captured by our parameterization.

Defects

To demonstrate that ACE is able to capture properties of crystal defects, we also included several defect configurations in the DFT training data. However, as discussed in Sec. Reference DFT data, it is often not possible to reach sufficiently small penalty energies in the constrained DFT calculations for such distorted configurations. Therefore, we needed to resort in many cases to unconstrained spin-polarized calculations only, which limited the sampling of the magnetic PES for defects.

Here we present results for three types of defects: a monovacancy, generalized stacking faults, and a screw dislocation. For most defects, the Heisenberg model is insufficient and it is necessary to include higher-order terms in the magnetic Hamiltonian56. In addition, an accurate reproduction of defect properties can be achieved only if the coupling between spin and lattice excitations is taken into account.

The monovacancy formation and migration energies of 2.57 eV and 0.65 eV, respectively, agree well with the reference DFT data (equal to 2.17 and 0.67 eV, respectively). The generalized stacking fault energy surface, the γ-surface, for the {110} plane is shown in Fig. 10(a). Figure 10(b) shows cuts along the 〈111〉 direction on both the {110} and {211} planes that are related to atomic structures of \(\frac{1}{2}\langle 111\rangle\) screw dislocations. The ACE predictions for both cuts are in excellent agreement with the DFT reference. ACE also predicts the core structure of the \(\frac{1}{2}\langle 111\rangle\) screw dislocation, which governs the low-temperature plasticity of Fe, in quantitative agreement with DFT reference57,58 as demonstrated in Fig. 10(c).

Fig. 10: γ-surfaces and dislocations.

a Predicted γ-surface for the {110} crystallographic plane. b Cuts along the 〈111〉 direction for the {110} and {211} γ-surfaces (ACE: lines, DFT: dots). c Differential displacement map of the \(\frac{1}{2}\langle 111\rangle\) screw dislocation predicted by the magnetic ACE potential.

To examine the ability of ACE to reproduce properties of other defects, we constructed a smaller training dataset containing also surfaces and interstitials and generated another ACE parameterization. As shown in the Supplementary Note 3, ACE can reproduce properties of these defects as well.

Discussion

By incorporating magnetic DOF in the form of atomic magnetic moment vectors into ACE, we demonstrated that constrained non-collinear DFT reference data can be reproduced with excellent accuracy and transferability, exceeding those of existing magnetic ML interatomic potentials. We constructed a non-collinear ACE parametrization for Fe and validated it for a wide range of properties, including volume-energy curves, elastic moduli, phonon spectra, Bain transformation paths, spin rotations and magnon spectra, and point and extended defects. These tests showed that magnetic ACE is not only able to capture large structural and magnetic variations but also resolves subtle spin fluctuations that are crucial for a correct reproduction of phase transitions and thermodynamic properties. To this end, it is necessary to include multi-atom multi-spin interactions that are missing in simple models with pairwise couplings between atoms and/or magnetic moments. In iron, magnetic angular contributions of body order four and higher are numerically small and can be neglected.

The magnetic ACE was parameterized from DFT reference data that was generated by constraining both the magnitude and direction of the atomic magnetic moments. For configurations with defects or significant atomic displacements, it was often difficult to achieve self-consistency. Furthermore, as the constraints were implemented by integrating magnetization over spheres about atoms, various intra-atomic magnetization distributions could result in the same atomic magnetic moment. This non-uniqueness effectively led to noise in the DFT reference data that ultimately limited the accuracy of our parameterization. Therefore, there is a strong incentive to implement more advanced constraints in DFT that would help to increase the accuracy of magnetic ACE as well as other magnetic ML approaches.

The numerical efficiency of ACE makes it possible to carry out large-scale molecular and spin dynamics simulations to study the dynamics of combined magnetic and structural phase transitions. Nevertheless, to predict both magnetic and structural phase transitions in iron, we employed MD simulations for the atomic positions and MC sampling to vary the atomic magnetic moments. One of the reasons is the lack of classical or semi-classical equations of motion, and of corresponding numerically robust integrators, applicable to combined atomic and spin dynamics in systems with multi-atom multi-spin interactions and changing magnitudes of the magnetic moments59.

The ACE for iron can be extended directly to multicomponent systems, such as technologically important magnetic alloys and carbides. While this is straightforward from a formal point of view, the generation of accurate and comprehensive DFT reference data for magnetic multicomponent materials is challenging. Here efficient sampling based on D-optimality active learning60 extended to include magnetic DOF will help to reduce the number of required DFT reference calculations.

Methods

We provide a summary of the magnetic ACE formalism together with aspects of its implementation. Further explanations on implementation and workflow are available in the Supplementary Methods. We also provide computational details of the DFT calculations, the combined MD-MC simulations that were employed for the calculation of the FM-PM transition and the SSTI method.

Energies, forces, and magnetic gradients

We define state variables σji of atom j neighboring atom i in terms of interatomic distance vectors rji, chemical species μj, magnetic moments mj, etc. as

$${\sigma }_{ji}=\left({\mu }_{j},{{{{\bf{r}}}}}_{ji},{{{{\bf{m}}}}}_{j}\right),$$
(3)

with \({\sigma }_{ii}=\left({\mu }_{i},{{{{\bf{m}}}}}_{i}\right)\). A neighbor density on atom i including atomic and magnetic contributions can then be written as

$${\varrho }_{i}\left(\sigma \right)=\mathop{\sum}\limits_{j}\delta \left(\sigma -{\sigma }_{ji}\right).$$
(4)

Magnetic contributions also enter the single bond basis functions,

$${\phi }_{v}\left({\sigma }_{ji}\right)={\delta }_{k}\left({\mu }_{j}\right){R}_{nl}^{{\mu }_{j}{\mu }_{i}}\left({r}_{ji}\right){Y}_{l}^{m}\left({\hat{r}}_{ji}\right){M}_{{n}^{{\prime} }{l}^{{\prime} }}^{{\mu }_{j}{\mu }_{i}}\left({m}_{j}\right){Y}_{{l}^{{\prime} }}^{{m}^{{\prime} }}\left({\hat{m}}_{j}\right)$$
(5)

where \(v=(knlm{n}^{{\prime} }{l}^{{\prime} }{m}^{{\prime} })\) and the primed indices are used to label basis functions that depend on magnetic contributions, and \({\phi }_{v}\left({\sigma }_{ii}\right)={\delta }_{k}\left({\mu }_{i}\right){M}_{{n}^{{\prime} }{l}^{{\prime} }}^{{\mu }_{i}{\mu }_{i}}\left({m}_{i}\right){Y}_{{l}^{{\prime} }}^{{m}^{{\prime} }}\left({\hat{m}}_{i}\right)\).

The projection of the density in Eq. (4) on the corresponding single bond basis functions leads to the atomic basis Aiv and \({A}_{iv}^{(0)}\)

$${A}_{iv}=\langle {\varrho }_{i}| {\phi }_{v}\rangle =\mathop{\sum}\limits_{j\ne i}{\phi }_{v}\left({\sigma }_{ji}\right)$$
(6)

and

$${A}_{iv}^{(0)}=\left\langle {\varrho }_{i}^{(0)}| {\phi }_{v}\right\rangle ={\phi }_{v}\left({\sigma }_{ii}\right).$$
(7)

From the two atomic bases the tensor product basis is formed

$${{{{\bf{A}}}}}_{i{{{\bf{v}}}}}={A}_{i{v}_{0}}^{(0)}\mathop{\prod }\limits_{t=1}^{N}{A}_{i{v}_{t}},$$
(8)

and symmetrized to ensure invariance with respect to rotation and inversion, leading to equivariant basis functions

$${{{{\bf{B}}}}}_{i}={{{\mathcal{C}}}}\cdot {{{{\bf{A}}}}}_{i},$$
(9)

where \({{{\mathcal{C}}}}\) is a sparse matrix of products of the Clebsch-Gordan coefficients of the atomic and magnetic systems. The coupling tree, used to form possible tuples v (see Supplementary Methods for an example), can be simplified assuming that spin-orbit coupling can be neglected. This is typically an excellent approximation as the spin-orbit coupling energy is on the order of a few μeV for iron bulk systems. Then the atomic and magnetic systems can be completely decoupled and the total angular momenta of the atomic and magnetic channels couple to zero individually, leading to a significant reduction in the number of basis functions (see Supplementary Methods for a detailed explanation). A further reduction of the allowed combinations of atomic and magnetic indices can be obtained by requiring inversion invariance for both atomic and magnetic spaces by restricting the sum of the corresponding angular momenta to even numbers.
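The decoupling of the atomic and magnetic channels can be illustrated with a toy calculation. The sketch below builds an atomic base of the form of Eq. (6) restricted to l = l′ = 1, contracts it to a scalar, and checks that the result is unchanged when positions and moments are rotated independently. It is not the ACE implementation: the Gaussian radial function, the linear magnetic radial function and all numerical values are hypothetical, and the central-atom factor \({A}^{(0)}\) is omitted.

```python
import numpy as np

def sph_harm_l1(u):
    """Complex spherical harmonics Y_1^m (m = -1, 0, 1) for unit vectors u, shape (N, 3)."""
    x, y, z = u[:, 0], u[:, 1], u[:, 2]
    c = np.sqrt(3.0 / (8.0 * np.pi))
    return np.stack([c * (x - 1j * y), np.sqrt(2.0) * c * z, -c * (x + 1j * y)], axis=1)

def atomic_base(r, m, r_width=2.0, m_cut=4.0):
    """A_{m m'} = sum_j R(r_j) Y_1^m(r_hat_j) M(|m_j|) Y_1^{m'}(m_hat_j)."""
    r_len = np.linalg.norm(r, axis=1)
    m_len = np.linalg.norm(m, axis=1)
    # Hypothetical radial functions: Gaussian in distance, linear in moment magnitude.
    weights = np.exp(-(r_len / r_width) ** 2) * (m_len / m_cut)
    y_r = sph_harm_l1(r / r_len[:, None])
    y_m = sph_harm_l1(m / m_len[:, None])
    return np.einsum("j,ja,jb->ab", weights, y_r, y_m)

def invariant(a):
    """Contract both angular channels to zero: sum over m and m' of |A_{m m'}|^2."""
    return float(np.sum(np.abs(a) ** 2))

def random_rotation(rng):
    """Random proper rotation matrix from a QR decomposition."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] *= -1.0
    return q

rng = np.random.default_rng(3)
r = 2.5 * rng.normal(size=(8, 3))   # neighbour vectors r_ji (hypothetical)
m = rng.normal(size=(8, 3))         # neighbour moments m_j (hypothetical)

b_ref = invariant(atomic_base(r, m))
# Rotate the atomic and the magnetic space with two different rotations.
b_rot = invariant(atomic_base(r @ random_rotation(rng).T, m @ random_rotation(rng).T))
```

The contraction reduces to a double sum over neighbours of \(({\hat{r}}_{j}\cdot {\hat{r}}_{k})({\hat{m}}_{j}\cdot {\hat{m}}_{k})\) weighted by the radial functions, which makes the invariance under separate rotations of the two spaces apparent.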

We represent the energy for atom i including atomic and magnetic contributions as a linear expansion

$${\varepsilon }_{i}={{{{\bf{c}}}}}^{T}{{{{\bf{B}}}}}_{i},$$
(10)

where c is the vector of the expansion coefficients.

The energy can be rewritten in terms of the \(\tilde{{{{\bf{c}}}}}\) basis introduced in refs. 20,22 as

$${\varepsilon }_{i}={{{{\bf{c}}}}}^{T}{{{{\bf{B}}}}}_{i}={{{{\bf{c}}}}}^{T}{{{\mathcal{C}}}}{{{{\bf{A}}}}}_{i}={\tilde{{{{\bf{c}}}}}}^{T}{{{{\bf{A}}}}}_{i}.$$
(11)
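The regrouping in Eq. (11) is a plain matrix identity, which the following minimal sketch checks numerically. The basis sizes, the random stand-ins for \({{{{\bf{A}}}}}_{i}\), \({{{\mathcal{C}}}}\) and c, and the use of real numbers are assumptions made only for illustration; they are not the coupling matrices or coefficients of the fitted model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical basis sizes, chosen only to keep the example small.
n_A, n_B = 6, 3

A_i = rng.normal(size=n_A)        # stand-in for the product basis A_i of Eq. (8)
C = rng.normal(size=(n_B, n_A))   # stand-in for the coupling matrix of Eq. (9)
C[np.abs(C) < 0.8] = 0.0          # zero out small entries to mimic sparsity
c = rng.normal(size=n_B)          # stand-in for the expansion coefficients c

# Route 1: symmetrize first, then contract with c (Eqs. (9) and (10)).
B_i = C @ A_i
eps_from_B = c @ B_i

# Route 2: fold the coupling matrix into the coefficients (Eq. (11)).
c_tilde = C.T @ c
eps_from_A = c_tilde @ A_i

print(eps_from_B, eps_from_A)
```

Both routes give the same atomic energy. The second route needs the coupling matrix only once, when \(\tilde{{{{\bf{c}}}}}\) is formed, and not at every energy evaluation.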

This expansion was used to fit the DFT energies and forces. So that the expression reduces to the non-magnetic ACE when the magnetic moments are zero, the first-order equivariant basis was set to \({A}_{i{\mu }_{i}{n}^{{\prime} }}^{(0)}\left({{{\bf{m}}}}={{{\bf{0}}}}\right)=1\) through our choice of magnetic radial functions (see the section Magnetic radial functions below).

Expressions for forces and magnetic gradients are obtained by taking the derivative of the energy with respect to atomic positions and magnetic moments, respectively, and are written in a compact notation as

$${{{{\bf{F}}}}}_{k}=\mathop{\sum}\limits_{i}\left({{{{\bf{f}}}}}_{ik}-{{{{\bf{f}}}}}_{ki}\right),$$
(12)

and

$${{{{\bf{T}}}}}_{k}=\mathop{\sum}\limits_{i}{{{{\bf{t}}}}}_{ki}+{{{{\bf{t}}}}}_{k}.$$
(13)
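A consequence of the antisymmetric combination in Eq. (12) is that the total force vanishes for any set of pairwise contributions. The sketch below illustrates this with random numbers standing in for the \({{{{\bf{f}}}}}_{ki}\) of Eq. (14); the array layout `f[a, b]` for \({{{{\bf{f}}}}}_{ab}\) and the number of atoms are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n_atoms = 5  # hypothetical system size

# f[a, b] stands for the pairwise contribution f_ab (random stand-in values).
f = rng.normal(size=(n_atoms, n_atoms, 3))
idx = np.arange(n_atoms)
f[idx, idx] = 0.0  # no contribution of an atom to itself

# Eq. (12): F_k = sum_i (f_ik - f_ki)
F = f.sum(axis=0) - f.sum(axis=1)

# The two double sums cancel exactly, so the net force is zero.
total_force = F.sum(axis=0)
print(total_force)
```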

The pairwise atomic forces \({{{{\bf{f}}}}}_{ki}\) are given by

$${{{{\bf{f}}}}}_{ki}=\mathop{\sum}\limits_{nlm{n}^{{\prime} }{l}^{{\prime} }{m}^{{\prime} }}{\omega }_{i{\mu }_{k}nlm{n}^{{\prime} }{l}^{{\prime} }{m}^{{\prime} }}{\nabla }_{{{{{\bf{r}}}}}_{ki}}{\phi }_{{\mu }_{k}{\mu }_{i}nlm{n}^{{\prime} }{l}^{{\prime} }{m}^{{\prime} }}$$
(14)

and the magnetic forces \({{{{\bf{t}}}}}_{k}\) and \({{{{\bf{t}}}}}_{ki}\) by

$${{{{\bf{t}}}}}_{k}=\mathop{\sum}\limits_{{n}^{{\prime} }{l}^{{\prime} }{m}^{{\prime} }}{\omega }_{k{\mu }_{k}{n}^{{\prime} }{l}^{{\prime} }{m}^{{\prime} }}^{(0)}{\nabla }_{{{{{\bf{m}}}}}_{k}}{A}_{k{\mu }_{k}{n}^{{\prime} }{l}^{{\prime} }{m}^{{\prime} }}^{(0)}$$
(15)

and

$${{{{\bf{t}}}}}_{ki}=\mathop{\sum}\limits_{nlm{n}^{{\prime} }{l}^{{\prime} }{m}^{{\prime} }}{\omega }_{i{\mu }_{k}nlm{n}^{{\prime} }{l}^{{\prime} }{m}^{{\prime} }}{\nabla }_{{{{{\bf{m}}}}}_{k}}{\phi }_{{\mu }_{k}{\mu }_{i}nlm{n}^{{\prime} }{l}^{{\prime} }{m}^{{\prime} }}.$$
(16)

The calculation of the adjoints \({\omega }_{i{\mu }_{i}nlm{n}^{{\prime} }{l}^{{\prime} }{m}^{{\prime} }}\) and \({\omega }_{i{\mu }_{i}{n}^{{\prime} }{l}^{{\prime} }{m}^{{\prime} }}^{(0)}\) can be further decomposed into the evaluation of two distinct terms,

$$\begin{array}{rcl}{\omega }_{i{\mu }_{i}nlm{n}^{{\prime} }{l}^{{\prime} }{m}^{{\prime} }}&=&\mathop{\sum}\limits_{N=1}\mathop{\sum}\limits_{{{{\boldsymbol{\mu }}}}{{{\bf{n}}}}{{{\bf{l}}}}{{{\bf{m}}}}{{{{\bf{n}}}}}^{{\prime} }{{{{\bf{l}}}}}^{{\prime} }{{{{\bf{m}}}}}^{{\prime} }}{\Theta }_{{\mu }_{i}{{{\boldsymbol{\mu }}}}{{{\bf{n}}}}{{{\bf{l}}}}{{{{\bf{n}}}}}^{{\prime} }{{{{\bf{l}}}}}^{{\prime} }}^{(N)}\\ &\times &{A}_{i{\mu }_{i}{n}_{0}^{{\prime} }{l}_{0}^{{\prime} }{m}_{0}^{{\prime} }}^{(0)}\mathop{\sum }\limits_{s=1}^{N}d{A}_{i{{{\boldsymbol{\mu }}}}{{{\bf{n}}}}{{{\bf{l}}}}{{{\bf{m}}}}{{{{\bf{n}}}}}^{{\prime} }{{{{\bf{l}}}}}^{{\prime} }{{{{\bf{m}}}}}^{{\prime} }}^{(s)}\end{array}$$
(17)

where

$${\Theta }_{{\mu }_{i}{{{\boldsymbol{\mu }}}}{{{\bf{n}}}}{{{\bf{l}}}}{{{{\bf{n}}}}}^{{\prime} }{{{{\bf{l}}}}}^{{\prime} }}^{(N)}={\tilde{c}}_{{\mu }_{i}{{{\boldsymbol{\mu }}}}{{{\bf{n}}}}{{{\bf{l}}}}{{{{\bf{n}}}}}^{{\prime} }{{{{\bf{l}}}}}^{{\prime} }}^{(N)}$$
(18)

and

$$\begin{array}{rcl}d{A}_{i{{{\boldsymbol{\mu }}}}{{{\bf{n}}}}{{{\bf{l}}}}{{{\bf{m}}}}{{{{\bf{n}}}}}^{{\prime} }{{{{\bf{l}}}}}^{{\prime} }{{{{\bf{m}}}}}^{{\prime} }}^{(s)}&=&{\delta }_{\mu {\mu }_{s}}{\delta }_{n{n}_{s}}{\delta }_{l{l}_{s}}{\delta }_{m{m}_{s}}{\delta }_{{n}^{{\prime} }{n}_{s}^{{\prime} }}{\delta }_{{l}^{{\prime} }{l}_{s}^{{\prime} }}{\delta }_{{m}^{{\prime} }{m}_{s}^{{\prime} }}\\ &\times &\mathop{\prod}\limits_{k\ne s}{A}_{i{\mu }_{k}{n}_{k}{l}_{k}{m}_{k}{n}_{k}^{{\prime} }{l}_{k}^{{\prime} }{m}_{k}^{{\prime} }}\,.\end{array}$$
(19)

The adjoint \({\omega }_{i{\mu }_{i}{n}^{{\prime} }{l}^{{\prime} }{m}^{{\prime} }}^{(0)}\) does not contain the onsite basis contribution and is simply given by

$${\omega }_{i{\mu }_{i}{n}^{{\prime} }{l}^{{\prime} }{m}^{{\prime} }}^{(0)}=\mathop{\sum}\limits_{N=0}\mathop{\sum}\limits_{{{{\boldsymbol{\mu }}}}{{{\bf{n}}}}{{{\bf{l}}}}{{{\bf{m}}}}{{{{\bf{n}}}}}^{{\prime} }{{{{\bf{l}}}}}^{{\prime} }{{{{\bf{m}}}}}^{{\prime} }}{\Theta }_{{\mu }_{i}{{{\boldsymbol{\mu }}}}{{{\bf{n}}}}{{{\bf{l}}}}{{{{\bf{n}}}}}^{{\prime} }{{{{\bf{l}}}}}^{{\prime} }}^{(N)}d{A}_{i{{{\boldsymbol{\mu }}}}{{{\bf{n}}}}{{{\bf{l}}}}{{{\bf{m}}}}{{{{\bf{n}}}}}^{{\prime} }{{{{\bf{l}}}}}^{{\prime} }{{{{\bf{m}}}}}^{{\prime} }}^{(0)}$$
(20)

with

$$d{A}_{i{{{\boldsymbol{\mu }}}}{{{\bf{n}}}}{{{\bf{l}}}}{{{\bf{m}}}}{{{{\bf{n}}}}}^{{\prime} }{{{{\bf{l}}}}}^{{\prime} }{{{{\bf{m}}}}}^{{\prime} }}^{(0)}=\mathop{\prod }\limits_{s=1}^{N}{A}_{i{\mu }_{s}{n}_{s}{l}_{s}{m}_{s}{n}_{s}^{{\prime} }{l}_{s}^{{\prime} }{m}_{s}^{{\prime} }}.$$
(21)

The summation over N in Eq. (20) starts from zero because even a single atom contributes to the total magnetic gradient.
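Equations (19) and (21) are the product rule applied to the tensor product basis of Eq. (8): differentiating with respect to one factor leaves the product of all remaining factors. A minimal sketch with made-up real numbers in place of the atomic basis values follows. The helper name `leave_one_out_products` and the prefix/suffix strategy are illustrative assumptions, not the implementation used in this work.

```python
import numpy as np

def leave_one_out_products(a):
    """Return d[s] = prod_{t != s} a[t], computed without division
    so that factors equal to zero are handled correctly."""
    n = len(a)
    prefix = np.ones(n + 1, dtype=a.dtype)  # prefix[s] = prod_{t < s} a[t]
    suffix = np.ones(n + 1, dtype=a.dtype)  # suffix[s] = prod_{t >= s} a[t]
    for t in range(n):
        prefix[t + 1] = prefix[t] * a[t]
        suffix[n - 1 - t] = suffix[n - t] * a[n - 1 - t]
    return prefix[:-1] * suffix[1:]

# Hypothetical values for one product of order N = 3.
A0 = 0.5                        # stand-in for the onsite factor A^(0)
A = np.array([2.0, 3.0, 5.0])   # stand-ins for the N neighbour factors

dA_s = leave_one_out_products(A)  # products over k != s, as in Eq. (19)
dA_0 = np.prod(A)                 # full neighbour product, as in Eq. (21)

# Derivatives of the full product A0 * prod(A) with respect to each factor.
grad_wrt_neighbours = A0 * dA_s   # the A^(0) prefactor appearing in Eq. (17)
grad_wrt_onsite = dA_0
print(grad_wrt_neighbours, grad_wrt_onsite)
```

For N = 0 the product in Eq. (21) is empty and equals one, which is why the onsite term contributes to the magnetic gradient even for an isolated atom.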

Magnetic radial functions

The magnetic radial functions \({M}_{{n}^{{\prime} }{l}^{{\prime} }}^{{\mu }_{j}{\mu }_{i}}\) used in this work have a different functional form from their atomic counterparts, which are given in terms of Chebyshev polynomials19,33. In particular, one has to ensure that the energy is invariant under time-reversal symmetry, i.e., \({{{{\bf{m}}}}}_{i}\to -{{{{\bf{m}}}}}_{i}\) for every i. For this reason, we chose a linear combination of Chebyshev polynomials \({T}_{{k}^{{\prime} }}\) as

$${M}_{{n}^{{\prime} }{l}^{{\prime} }}^{{\mu }_{j}{\mu }_{i}}\left(m\right)=\mathop{\sum}\limits_{{k}^{{\prime} }}{c}_{{n}^{{\prime} }{l}^{{\prime} }{k}^{{\prime} }}^{{\mu }_{j}{\mu }_{i}}{g}_{{k}^{{\prime} }}^{{\mu }_{j}{\mu }_{i}}\left(m\right),$$
(22)

with

$${g}_{{k}^{{\prime} }}^{{\mu }_{j}{\mu }_{i}}\left(m\right)={T}_{{k}^{{\prime} }}\left(x(m)\right).$$
(23)

The scaled variable x is an even function of m, which guarantees invariance under time-reversal symmetry

$$x\left(m\right)=1-2{\left(\frac{m}{{m}_{cut}}\right)}^{2},$$
(24)

where \({m}_{cut}\) is the cutoff for the magnetic moment magnitude. The expansion coefficients \({c}_{{n}^{{\prime} }{l}^{{\prime} }{k}^{{\prime} }}^{{\mu }_{j}{\mu }_{i}}\) for both magnetic and atomic radial functions are adjusted during the fitting procedure.
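Equations (22) to (24) can be evaluated in a few lines, as the minimal sketch below shows. The cutoff value, the three expansion coefficients and the function names are hypothetical and serve only to illustrate the construction; they are not fitted parameters of the model.

```python
import numpy as np
from numpy.polynomial import chebyshev

def scaled_moment(m, m_cut):
    """Eq. (24): map the moment magnitude to x = 1 - 2 (m / m_cut)^2."""
    return 1.0 - 2.0 * (m / m_cut) ** 2

def magnetic_radial(m, coeffs, m_cut):
    """Eqs. (22) and (23): sum_k' c_k' T_k'(x(m)) for one (n', l') channel."""
    return chebyshev.chebval(scaled_moment(m, m_cut), coeffs)

m_cut = 4.0                         # hypothetical cutoff
coeffs = np.array([0.5, 0.3, 0.2])  # hypothetical expansion coefficients

m = 1.7
value_plus = magnetic_radial(m, coeffs, m_cut)
value_minus = magnetic_radial(-m, coeffs, m_cut)  # time-reversed moment
value_zero = magnetic_radial(0.0, coeffs, m_cut)  # x = 1, where T_k'(1) = 1
print(value_plus, value_minus, value_zero)
```

Because x depends on m only through its square, the two signs of m give identical values. At m = 0 the scaled variable equals one, where every Chebyshev polynomial is one, so the radial function reduces to the sum of its coefficients.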

DFT calculations

All reference DFT calculations were performed using the non-collinear and collinear versions of VASP 5.4.1 (refs. 61,62,63,64) and the projector augmented wave (PAW) method65. The constrained local moment approach32 was employed to constrain either both the size and direction or only the direction of the atomic magnetic moments. The exchange-correlation energy was represented using the Perdew-Burke-Ernzerhof (PBE) generalized gradient approximation (GGA)66. We carried out carefully converged calculations with tight settings of the principal parameters in order to obtain accurate energies, forces and magnetic moments. Specifically, the kinetic energy cutoff was set to 500 eV, the convergence threshold for the energy to \(1{0}^{-5}\) eV and the k-mesh density to 0.18 Å\({}^{-1}\). The integration radius for the atomic magnetic moments (VASP parameter RWIGS) was kept constant at the value of the Fe PAW (1.302 Å). The LAMBDA parameter was increased in small steps initially and in larger steps afterwards; a typical sequence of values is (1, 2, 6, 10, 15, 20, 25, 30). See Supplementary Notes 1 and 2 for the convergence of the magnetic moment magnitude with respect to the integration radius and for a discussion of the convergence of the penalty energy in the constrained local moment method.

MD-MC calculations

The MD-MC simulations of the FM-PM transition in bcc Fe consisted of alternating MD and MC steps. The MD simulations were performed using Langevin dynamics (from the ASE package67) with a time step of 1 fs. The MC sampling included uniform spin rotations on a unit sphere, with and without additional perturbations of the magnetic moment magnitudes. The simulation supercell consisted of 12 × 12 × 12 bcc unit cells and contained 3456 atoms. The dimensions of the supercell were kept fixed at all temperatures, so the effect of thermal expansion was neglected. At each temperature, we carried out about \(1{0}^{7}\) steps, with the initial 10% used for equilibration.
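The MC part of such a scheme can be sketched as follows, restricted to moves that change only the spin directions. The Metropolis acceptance rule, the toy field-like energy and all function names are assumptions introduced for illustration. The text above does not specify the acceptance criterion, and in the actual simulations the energy would come from the magnetic ACE model and not from this toy function.

```python
import numpy as np

K_B = 8.617333262e-5  # Boltzmann constant in eV/K

def random_unit_vectors(n, rng):
    """Directions drawn uniformly on the unit sphere (normalized Gaussians)."""
    v = rng.normal(size=(n, 3))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

def toy_energy(spins, field=0.05):
    """Hypothetical stand-in energy (eV): spins coupled to a field along z."""
    return -field * spins[:, 2].sum()

def metropolis_accept(delta_e, temperature_k, rng):
    """Assumed Metropolis rule: accept with probability min(1, exp(-dE / kT))."""
    if delta_e <= 0.0:
        return True
    return rng.random() < np.exp(-delta_e / (K_B * temperature_k))

def mc_sweep(spins, temperature_k, rng, energy=toy_energy):
    """Propose one new uniformly random direction per spin, in sequence."""
    e_old = energy(spins)
    for i in range(len(spins)):
        old = spins[i].copy()
        spins[i] = random_unit_vectors(1, rng)[0]
        e_new = energy(spins)
        if metropolis_accept(e_new - e_old, temperature_k, rng):
            e_old = e_new
        else:
            spins[i] = old  # reject: restore the previous direction
    return spins

# Atom count of the supercell described above: two atoms per bcc unit cell.
n_atoms = 2 * 12 ** 3

rng = np.random.default_rng(42)
spins = random_unit_vectors(16, rng)  # small demo system, not the full cell
spins = mc_sweep(spins, 1000.0, rng)
print(n_atoms, np.linalg.norm(spins, axis=1))
```

Normalizing Gaussian-distributed vectors gives directions that are uniform on the sphere, so a trial move does not favour any orientation. Perturbations of the moment magnitudes, which the simulations also considered, are not included in this sketch.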

The free energy difference between bcc and fcc, shown in Fig. 9b, was calculated following the application of the SSTI method51 to magnetic Fe52. In this approach, stresses are integrated along a deformation path between the bcc and fcc structures. Our calculations employed supercells of dimension 8 × 8 × 8 (512 atoms). The lattice parameters of bcc and fcc at each temperature were adjusted to the values obtained from the corresponding thermal expansions.