Combining pressure and temperature control in dynamics on energy landscapes

Complex systems from science, technology or mathematics usually appear to be very different in their specific dynamical evolution. However, the concept of an energy landscape with its basins corresponding to locally ergodic regions separated by energy barriers provides a unifying approach to the description of complex systems dynamics. In such systems one is often confronted with the task to control the dynamics such that a certain basin is reached with the highest possible probability. Typically one aims for the global minimum, e.g. when dealing with global optimization problems, but frequently other local minima such as the metastable compounds in materials science are of primary interest. Here we show how this task can be solved by applying control theory using magnesium fluoride as an example system, where different modifications of MgF2 are considered as targets. In particular, we generalize previous work restricted to temperature controls only and present controls which simultaneously adjust temperature and pressure in an optimal fashion.


Introduction
A characteristic feature of complex systems in science, technology and applied mathematics is their highly nontrivial dynamics that is closely related to the properties of the energy or cost function landscape of the system of interest [1]. Both understanding the unperturbed time evolution of a complex system and gaining insight into its response to external stimuli require a detailed knowledge of the relevant features of the system's energy landscape. Important quantities are the stable regions of the landscape [2][3][4][5][6][7][8] and the probability flows among them [9,10], both as function of external environmental and control parameters. These stable regions correspond to e.g. (meta)stable chemical compounds [6,11,12], folded or unfolded states of a protein [13][14][15][16][17][18], magnetic phases [19], stable attractors [20], or (sub)optimal solutions of combinatorial optimization problems [21,22], while the flows characterize the likelihood of transitions between stable regions [23,24], the relaxation towards equilibrium [25,26], and the progress of optimization algorithms [22,27]. The determination of such regions and flows requires finding the minima of the landscape and measuring the local volume in state space contained within the basins of the landscape, and furthermore the analysis of the connectivity of the landscape including the derivation of the energetic, entropic and kinetic barriers [10,28] that separate individual minima and the multi-minima basins [2]. In the a e-mail: c.schoen@fkf.mpg.de literature, one finds two complementary approaches to identifying such landscape features: indirectly via extraction from long molecular dynamics [29][30][31] and Monte Carlo simulations [32] or even from long time sequences of experimental signals [33,34], or directly from the landscape itself using various global optimization and exploration algorithms [23]. Of course, in practice, combinations of these methods are often employed, depending on the type of system and objective of the study.
Ideally, such global explorations of the landscape yield a detailed (though coarse-grained) model that can explain many aspects of the dynamics of the underlying system, in particular the observed states, phases or chemical modifications, their individual stability and the transitions among them, and, last but not least, the system's response to external stimuli or changes in the boundary conditions and constraints. A typical result of such a global exploration is e.g. the prediction of the equilibrium composition-pressure-temperature (x, p, T ) phase diagram of a chemical system which depicts the thermodynamically stable phase for each given triplet of (x, p, T ) values [35] or in a more generalized version, the prediction of an extended phase diagram that includes metastable phases and their lifetimes [36].
Of course, in the thermodynamic limit and for infinite times, we should always observe the thermodynamically stable phase for a given (x, p, T ) value in the experiment. However, on finite time scales, it frequently proves to be quite difficult to actually access even the global minimum. Unsurprisingly, the situation is even more challenging if the target is some particular metastable modification of the system, even though our landscape information assures us of its potential existence. This is particularly noticeable in the fields of materials science and solid state chemistry, where the rational design of new compounds and routes for their synthesis has been lacking, and only in recent years a change from an inductive towards a deductive approach to the field has been evolving [11,37,38]. The importance of achieving a high degree of control of the chemical synthesis for our ability to develop new materials and routes to their synthesis has been underlined in a recent BRN report of the department of energy (DOE) [39].
To address this issue, we have recently begun to develop a methodology [40] that employs optimal control theory to compute an optimal schedule of the accessible physical control parameters, which drives the system of interest with a high probability to the desired region on the landscape. Starting point are the available landscape information such as the energies of the minima, the local densities of states and the transition probabilities among the basins as function of energy which can be measured using e.g. the so-called threshold algorithm [41,42]. We use this information to construct a transition probability matrix between the basins as function of temperature, employing so-called kinetic factors [10,28]. Finally we design optimal schedules that maximize the probability to reach a particular basin of the energy landscape. For the last step, we employed similar techniques as have been used for the optimization of relaxation schedules on lumped complex energy landscapes in the past; examples are optimal schedules for global optimization algorithms [22,[43][44][45][46], or the dynamics on minimum+saddle point landscape models derived for atom clusters, with the goal to design an optimal simulated annealing schedule to reach the global minimum [47].
In our previous work the solid compound MgF 2 , which is often compared with TiO 2 , served as the example. While for TiO 2 several modifications such as rutile and anatase exist [48], MgF 2 has so far only been synthesized in the rutile modification [48] (and possibly a closely related CaCl 2 [49] structure). Moreover, simulations of thin film growth suggest that a nano-crystalline CdI 2 type modification (already predicted via early landscape investigations [23,50]) should be present that stabilizes the CaCl 2 type modification against transformation into the thermodynamically stable rutile phase [51]. This fact leads to a special interest in MgF 2 , as its anatase modification occupies a large deep-lying basin on the energy landscape which is energetically only slightly above the rutile minimum [23].
In our work on MgF 2 we could show, that possibilities exist to increase the probability to find its anatase modification. However, this earlier study was restricted to the optimization of the temperature schedule, while in experiments one frequently attempts to control the synthesis and its transformation process by adjusting both temperature and pressure [52,53]. Similarly, parallel tempering simulations in both pressure and temperature have been successfully employed to study the liquid-solid transition in Lennard-Jones systems [54] and explore the landscape of metastable alanine dipeptides [55]. Thus, in this work, we extend our optimization to include both thermodynamic parameters at the same time, where we will again use the MgF 2 landscape as the model system for a proofof-concept study.

The model system
The central starting point of our approach is the energy landscape of the system under consideration, which provides the necessary information for understanding the system's dynamical behavior.
In the study presented below the energy landscape is that of the solid compound MgF 2 . It is used in a simplified form investigated in the past [50]. In that work, in order to reduce the number of degrees of freedom to a manageable size, and to allow for the many millions of energy calculations that were needed to explore the energy landscape, MgF 2 was described by a periodically repeated variable unit cell containing two formula units of MgF 2 , and the energy was computed using a simple Lennard-Jones-plus-Coulomb potential. For further details on the potential, we point the reader to reference [50].

Data acquisition and energy landscape
The energy of the system varies as a function of the relative atom positions and the simulation cell properties like its volume, which we refer to as the system variables. The energy landscape which unfolds in this multidimensional state space is explored by random changes in those system variables. Here we will make use of data which was obtained in a study [23] using the threshold algorithm [41,42].
In this exploration method a random walker roams the energy landscape starting from a given energy minimum. The walker chooses a neighbor state randomly and accepts it as the next state. The only restriction for the walk is the requirement that a given energy value (the lid or threshold) cannot be crossed from below. From a physical point of view this corresponds to an infinite temperature walk in that part of the state space which is below the threshold.
In addition quenches from selected starting points along the trajectories are performed, i.e. a random walker can only proceed to states lower in energy until he has reached one of the local minima. In most cases the walker returns to its initial minimum, and in this way the area of attraction of this minimum can be mapped out. These areas are referred to as the basins of the minima.
Such walks (with their quenches) are repeated many times and the local energy density of states in the accessible region of state space is recorded. In this way an estimate for the local densities of states of the most important basins around the minima on the energy landscape is obtained. As the density of states rapidly increases with energy it is advisable to start the walks with an energetically low threshold and increase it in steps.

Basin 1
Basin 2 level l level l+1 Sometimes at the end of a quench the walker reaches a minimum different from its starting minimum which then allows to gain information on the transition frequencies between the different basins. Due to the energy dependence of the energetic, entropic and kinetic barriers separating the basins these frequencies change as the threshold is increased. In addition a transition between two minima only occurs if the threshold is energetically above a certain energy characteristic for the two minima in question (the transition state energy). The overall procedure is visualized in Figure 1.
During the data acquisition for the MgF 2 system the lid set {E th } was used, in which the thresholds are spaced by 0.1 eV/atom in the low energy regime and by 0.25 eV/atom at higher energies and where the largest lid is E th 1 . For each lid and starting minimum, threshold runs of length up to 2.5 × 10 5 Monte-Carlo steps were performed, and every 5 × 10 4 steps, the random walker was quenched into the nearest local minimum.
In summary, we envision the energy landscape as a collection of minima with their respective basins. Within such a basin local equilibrium is attained on a very short time scale, while the transitions between minima occur on a much longer time scale. If desired the features of the energy landscape can be further analyzed and described in a tree-graph representation [23], so-called transition maps [23] or probability flow diagrams [56], and characteristic regions [57].

The stratified basin model
Due to the discrete energy levels at which the thresholds were placed during the data acquisition runs the resolution of the energy dependence of the inter-basin transition frequencies is limited to the energy difference between two subsequent thresholds. This -and the fact that the state space of the system is by far too large for a direct modelling -suggests to simplify the description by a coarse-graining of the state space. However, such a coarse-graining needs to be performed in a fashion which preserves the dynamic features of the original system. For details of such an approach and the potential errors made through coarse-graining see references [25,58,59]. Thus the weakly coupled basins in the energy landscape need to be clearly separated entities in the resulting coarse-grained model. In our example system MgF 2 , the basins used correspond to the following six modifications (for structural data and explanation of the notation cf. Ref. [50]) which are abbreviated by the number in parenthesis: rutile (1), anatase (2), Mp1 (3), 1/2Occp (4), CdI 2 (5), and 1/2BN (6).
Inside each basin the states between subsequent thresholds are coarse-grained into one node. The resulting model structure is shown in Figure 2. We will refer to this model as the stratified basin model.
In this model the nodes are denoted by a double index (j, i), where j describes the basin and i the level of a node, where the count starts at the top energy. Below we use i min j for the level of the lowest node with a nonvanishing density of states in basin j. Within each basin each node is connected to every other, but between basins only nodes at the same energy level communicate. The transitions are depicted by lines in Figure 2. As noted earlier, the connections between a given pair of basins j and k have a minimum energy E G j,k below which no transitions are possible, defined as the lowest energy for which either r j,k or r k,j is non-vanishing.

Thermal relaxation dynamics
Our aim is now to describe the dynamics of the system in contact with a heat bath at temperature T in a coarsegrained fashion by a master equation The time increases in discrete steps from t m to t m+1 , where the time difference between steps is chosen such that local thermal equilibrium within each basin is reached in one time-step. We reach our goal by setting with L(T ) being the matrix establishing local thermal equilibrium in each basin: Here where E th i is the energy of level i, the local partition function for basin j, and d (j,i) an appropriately scaled density of states in node i of basin j.
The horizontal transition matrix H describing the transitions between the nodes of different basins j and k at levels i = n is given by with Here f i {j,k} ≡ f i {k,j} are the kinetic factors controlling the time scale of the transition rates between minima j and k at level i. One way to visualize these kinetic factors in the coarse-grained state space model of the landscape is by assigning each edge at a given energy level a "carrying capacity" proportional to the kinetic factor. On the level of a microscopic model, this carrying capacity would correspond to the number of connections between the microstates belonging to the two different basins at the energy level i. Finally, g k is the relative weight of the local DOS in basin k in relation to the global DOS in that basin The quantities d (j,i) , f i {j,k} , and g k can be obtained from the data collected during the threshold runs. How that is done in detail is described in our previous work [40].

From energy to enthalpy
When extending our stratified basin model to include a pressure dependence, one needs, in principle, to gain information for a large number of pressures on the possible quantitative and qualitative changes in the relevant energy landscape. For a thermal process of a pressure ensemble the relevant quantity is the enthalpŷ where we usedĤ as the symbol for enthalpy in order to distinguish it from the matrix H and its elements. The probability P α to be in a microstate α in thermal equilibrium at temperature T and pressure p will depend on its (potential) energy E α as well as on its volume V α : withĤ α (p) = E α + pV α being the (potential) enthalpy of microstate α. It is apparent that the enthalpy landscape coincides with the energy landscape for p = 0, but the landscape will undergo quantitative as well as qualitative changes as the pressure increases. On the one hand the enthalpy value will change depending on the state space position but on the other hand the set of stable structures, i.e. the basins, might change too. However, such a global exploration of a multitude of enthalpy landscapes is computationally highly expensive, and not required for the proof-of-concept study we present in this work. Thus, we choose a perturbative-type approach by phenomenologically modeling the modifications of the specific MgF 2 landscape as function of pressure. This is sufficient to exhibit the major qualitative trends in the optimal control problem that are observed when allowing the pressure to vary as part of the optimal schedule. Nevertheless, we stress that the formulae we present in the next subsection are capable of describing the lumped dynamics on the full bundle of enthalpy landscapes. In particular, pressure driven first order phase transitions are well within the domain of our general methodology.

The pressure dependence of the enthalpy landscape
The basic assumption we make in our modeling approach is that the same basins are present over the whole pressure range considered -later, we briefly outline how one proceeds if an addition or subtraction of basins is required. Thus the main influence of the pressure change will be a change in the enthalpy values assigned to nodes of our model.
First we turn to the variation of the enthalpy with the height in energy of a given configuration above the minimum. On the model level the microstates coarse-grained into the nodes of the stratified basin model all possess a volume in addition to their energy, and technically the question is what average volume value does one assign to a node at a given energy level.
Physically the question can be approached by looking at the expansion (or contraction) of the structure as function of the energy level. Typically, we are dealing with a non-linear interaction potential among the atoms that leads to an expansion as function of temperature. But this approximately corresponds to an expansion as function of the energy if we average over all configurations that contribute to the same slice in energy of the basin under consideration and furthermore we might use the thermal expansion coefficient to quantify this average increase in volume at higher energies.
Analogously, we can look at the decreases in cell volume due to the finite compressibility of the various structures. These, too, modify the energy associated with each energy slice as function of pressure. Usually, increasing pressure and increasing temperature work in opposite directions, but, in principle, both effects can be treated by simply adding their effects on the cell volume of the various minimum structures.
Since the thermal expansion mostly depends on the non-linearity in the two-body terms of the potential, which are essentially the same for all structures, the thermal expansion coefficients can be taken as essentially the same for all basins -the only subtlety will be the non-linearity of the thermal expansion as function of temperature itself, and thus as function of energy level.
In the case of the pressure susceptibility, this effect is usually more drastic, leading to differences of up to a factor of two among the various minima. The bulk moduli at zero pressure tend to be reasonably constant as function of pressure, at least for a rather compact ionic solid like the one we are considering as an example system. If one does not have such computed values available, a qualitative heuristic we have found to apply in many landscape explorations is that the structures with the lowest densities tend to have the highest compressibility up to some high pressure, i.e. small volume, where the repulsive term in the interaction potential becomes crucial and all the E(V )-curves begin to rise quite rapidly (and typically the initially densest modifications become thermodynamically stable).
Within the perturbative approach employed here, we thus assign the nodes in our model pressure dependent enthalpiesĤ where the volumes V (j,i) capture the properties of the different state space regions coarse-grained into the nodes. For this proof-of-concept study we have chosen a linear increase of the V (j,i) with energy above the basin minimum energy for all basins with a basin dependent V (j,i min j ) . These minimum values were chosen such that the pressure effects to be discussed are highlighted. In particular, the largest difference between any two minima corresponds to an energy difference of 0.8 eV/atom at the maximum pressure allowed. Figure 3 visualizes the effect of changing pressure on the node at level 2 of basin 4. Here the height of the node corresponds to its enthalpy.
One further assumption we make is that the barriers do not change qualitatively in the sense that we still can connect basins at about the same energy values according to the transition probabilities measured for the p ≈ 0 GPa landscape. Clearly, the actual neighborhoods of the configurations do not change, i.e., we do not change the moveclass, and the effect of the shift in energy as function of pressure is only noticed in the correction to the Boltzmann factor. Here, phenomenological arguments do not help us decide to which extent subtle changes in the barrier structure at energies above the saddle points can occur -this kind of information would only be available from threshold runs at high pressure. However, we note that at elevated energies the barrier structure is dominated by entropic and kinetic barriers which are less susceptible to pressure effects than energetic ones (due to the shift of the energy of the regions around saddle points). Thus we do not expect large changes in the infinite-temperature transition rates unless the basin structure of the landscape changes qualitatively due to the elimination/generation of important stable regions on the landscape.
Note that, in principle, vanishing/appearing basins can be taken into account by phantom nodes that do not participate in the transition matrix until they become stable regions at some pressure. Here, probability that is located in a vanishing region is deposited in those still existing regions that had been in contact with the vanishing one at the previous pressure. (We note that the probability is essentially zero that we will use a pressure in the numerical simulations that corresponds exactly to a critical point where the first order transitions end in a continuous second order transition.)

Building pressure dependence into the stratified basin model dynamics
In order to incorporate the pressure dependence into the model, we proceed in the same fashion as in our earlier work [40]. The major change is that in the horizontal transition matrix H and in the local equilibration matrix L(T ) the role of energy is taken over by the enthalpy, and the pressure induced shifts in enthalpy for each node are taken care of.
When constructing the matrix L(T ) that establishes local thermal equilibrium we now have a different equilibrium distribution due to the shifted enthalpies whereĤ (j,i) (p) is the pressure dependent enthalpy of level i in basin j, and is again the local partition function for basin j.
The changes needed in H describing the originally horizontal transitions between the nodes of different basins at one level are due to the fact that because of the pressure dependence of the energies these transitions are no longer horizontal in energy. We thus find with (16) being the transition rate between nodes of basins j and k at level i. The additional exponential term ensures that transitions uphill in energy are suppressed with the appropriate Boltzmann factor. In order to ensure that H is a proper transition probability matrix we have to reset the diagonal elements such that the sums of the columns equal one Finally we combine the two transition matrices H(T, p) and L(T, p) to obtain

G(T, p) = H(T, p) L(T, p),
which will be the basis for our structure selection process.

Optimal structure selection
Within the coarse-grained stratified basin model developed above, the dynamical evolution of the system is governed by a master equation for the probability to be in a certain basin within a specific energy slice Note that a thermal quench at any time will lead all probability within a basin into its minimum.

Temperature-and pressure-dependent transition rates between basins
For the structure selection we can simplify this system further as we are mainly interested in the overall probability to be in a basin. This information is enough to regain the probabilities to be within the different energy strata of that basin as on the time scale of a single step in our dynamics (19) thermal equilibrium is established. Thus we introduce coarse-grained transition probabilities between basins and the corresponding dynamics for the probability to be in a basin j becomes The transition probabilities in equation (20) between basins form the basis for the desired control of the systems dynamics. In Figure 4 they are shown for two different pressures as a function of x = exp(−0.1/T ), where the temperature is measured in units of the energy. As in our previous exposition we will use x instead of T , but will still refer to it as "temperature". The range of allowed temperatures is constricted to 0 ≤ x ≤ 0.6 to insure that the probabilities to be at energies above the highest modeled level are negligible. This limitation of the temperature corresponds to T ≤ 2271 K for the energies used here. Likewise we restricted the range of the pressure to be 0 ≤ p ≤ 1. Note that for p > 0 the energies are always larger than for p = 0 and thus the probabilities will be smaller. The maximum pressure corresponds to about 128 GPa for the enthalpy difference of 0.8 eV and a relative volume/atom difference of 1Å 3 . In Figure 4, the transition probabilities between selected pairs of minima are shown for the limiting pressures Fig. 4. For a selected number of pairings the transition rates between different basins are shown as a function of temperature and pressure. There are several features which are important for the desired structure selection: the large variations between forward and backward rates are strongly temperature dependent, which allows to shift the relative equilibrium probabilities by a temperature control. In addition a pressure change can also affect the rates considerably, for instance in the (2, 4) rates an increased pressure lowers the rates and diminishes the difference between forward and backward rate, while in the (1, 5) rates their relative size is even inverted. Color figure in the open-access online version of this paper. p = 0, 1 as a function of the temperature x. Several different features can be observed. First of all it is apparent that the transition probabilities are temperature and pressure dependent, which is important for the desired structure selection. As the relation between the transition probabilities can be changed by varying temperature and pressure, transitions from one minimum to another can be furthered while others can be suppressed.
The transition rates between anatase (2) and Mp1 (3) show an increase for larger temperatures, with a slight but noticeable difference for the two pressure values shown. For larger x the 2 to 3 transition is favored as compared to the back-transition. A similar behavior can be observed for the transitions between Mp1 (3) and 1/2Occp (4), however, while for large pressure their ratio can be on the order of two, for zero pressure the rates are close together. This is different for the transitions between Mp1 (3) and 1/2BN (6), which show that the difference between rates can be quite substantial. Here a proper temperature and/or pressure selection can certainly influence the probability flows between the two minima.
The transitions between anatase (2) and 1/2Occp (4), and between rutile (1) and CdI 2 (5), respectively, possess a remarkable feature: with increasing pressure the ratio of the forward and backward rates changes from above one to below one when pressure is applied. Thus the probability flow from one basin to the other can be reversed which provides the desired control possibilities.
Finally we point out that the transitions between Mp1 (3) and CdI 2 (5) show a decreasing (backward) 5 to 3 rate, which is related to the energy dependence of the density of states in the two basins.

Optimizing the temperature and pressure schedule
Based on the model developed above we now aim at finding ways to increase the probability to find the system in a particular basin. The controls we have to achieve that goal are the temperature and the pressure, which we can change as the process unfolds. In technical terms, we want to determine those temperature and pressure schedules (or protocols) which maximize the probability to be in a prescribed basin at the end of the process. The complexity of the problem comes from the fact that we have to optimize both controls in parallel. The process duration is fixed by setting the number of time steps taken to a constant number M . The objective function Φ for this maximization is linear in the final probabilities: In particular, the objective Φ k j to maximize the probability to be in minimum k, is given by We analyzed the structure selection problem for different initial probability distributions P j (t 0 ). Here we show results for processes starting from thermal equilibrium distributions at the maximum allowed temperature, i.e. at x = 0.6, and vanishing pressure. Finding optimal schedules for temperature and pressure such that the final probability to be in one of the basins is maximized is a non-trivial problem. At each timestep of the dynamics described by equation (20) the optimal values for T and p have to be chosen. This task is achieved by the use of control theory, which provides the needed mechanisms, see for instance [60][61][62]. The basic features of its application to the case of structure selection were already presented in [40]: at each time step during the iterative method the temperature T (t m ) and pressure p(t m ) are set such that an appropriately chosen control Hamiltonian is maximized, which leads to a maximization of the objective Φ M while the constraints due to the dynamics are observed. While the parallel optimization of both controls leads to a considerable increase in the numerical effort needed, the convergence features of the algorithm did not change as compared to the case with temperature control only.

Results
As in our previous study [40] we perform the structure selection optimization on a reduced system containing the four basins with the lowest minimum energy at standard pressure: these are rutile (1), anatase (2), Mp1 (3), and 1/2Occp (4).
Of course, the ability of the process to drive the system into the desired basin depends on the number M of time steps available. Here we look at schedules for temperature and pressure of length M = 30.
In Figure 5 we show the optimal temperature and pressure schedule starting from a T max -distribution at p = 0. The panels on the left show from top to bottom the optimal schedules to maximize the probability to be in basin rutile (1), anatase (2) As we see from the figure, the controls needed for maximizing the probability to be in the different basins vary considerably.
The rutile schedules require a vanishing pressure and a slowly decaying temperature with a sharp drop to about a third of its initial value after 60% of the time. Anatase on the other hand needs high pressures, which were here limited to p ≤ 1, with a maximum temperature setting. Mp1, after initial jumps in pressure and temperature, needs high pressure and a slightly decaying temperature clearly below the upper limiting value, and finally 1/2Occp shows a sharp decay in pressure and temperature after about 40% of the time.
Comparing the different temperature/pressure combinations reveals interesting features: from all optimizations one can easily see that the rutile basin gains probability as soon as the pressure is low and loses as soon as the pressure is high. This is a direct consequence of the chosen volume dependence of the rutile basin. In addition low temperatures favor the rutile basin which is expected from its global minimum property. Comparing the anatase and Mp1 optimization shows the importance of an appropriate choice of temperature: while the maximum temperature favors anatase, the intermediate temperature adds to the probability gain of the Mp1 basin.
Another very interesting feature can be seen in the 1/2Occp optimization. The parallel drop of pressure and temperature leads from a situation with a clear preference for the anatase and Mp1 basins to a situation, where rutile is gaining but, more importantly, 1/2Occp can obviously benefit from the increased population in the anatase and Mp1 basins established in the first part of the process.

Discussion and summary
The ability to steer the dynamics on the energy landscape of a chemical system such that a specific modification or configuration of a given chemical compound can be synthesized is of great importance [39]. The work presented in this study constitutes a proof-of-concept for a systematic methodology using optimal control methods in temperature-pressure space to reach the target with high probability in a finite time. As an example system, we employed a simplified model of MgF 2 , which we had studied earlier employing a much more limited control space where only the temperature could be varied.
As in the earlier study, a coarse-grained description of the landscape based on the energies of the minima, the local densities of state and the transition probabilities among the basins as function of energy is constructed: the stratified basin model.
But now the energy is replaced by the enthalpy as function of pressure, where the underlying space of atomic configurations is independent of pressure, of course. What is affected is the shape of the minimum basins, up to the point that as function of pressure, different stable regions of the landscape may be present.
Modifying not only the temperature but also the pressure in an optimal fashion greatly enhances our ability to reach target structures compared to only having the temperature as the control variable. This is reflected in experiments, where one often prepares high-pressure modifications at standard conditions via fast pressure-quenching of such phases at sufficiently low temperatures.
Of course, we used a highly simplified model of MgF 2 , where we only concentrated on those modifications that correspond to locally ergodic regions at standard pressure. We did not include experimentally known high-pressure modifications [63] that would have competed with the lowpressure modifications at very high pressures, in order to keep the optimal control problem manageable. As we described above, from a technical point of view inclusion of such additional phases that are only present for a limited range of pressures is straightforward and does not pose any conceptual problems once data from sufficiently many enthalpy landscapes have been collected. Since in this work we are interested in a proof-of-concept demonstration, we have not performed the very large number of such global explorations needed; instead we have employed a perturbative approach to model the enthalpy landscape at elevated pressures.
Nevertheless, we have used maximum temperature and pressure values which are nowadays feasible (but not easy to handle). The maximum temperature of about 2200 K is certainly high, but clearly within experimental reach. Moreover, as usual in theoretical studies such as ours, temperature and time can to some extent be traded in the sense, that at lower temperatures the corresponding reaction times increase, but lead to similar results if only (energy) barrier crossing is the aim. In that sense, the relationship between the computational Markov-process time and the time scale of the experiment is only one of proportionality, where the actual value of the proportionality factor is not crucial for deriving the optimal pressuretemperature schedule.
Similarly, the maximum pressure of 128 GPa may appear to be very high, but current experiments can reach even 750 GPa (or about 7.5 MBar) in static pressure set-ups [64,65].
Conversely, this means that exploiting relative volume/atom differences of 0.2Å 3 to shift the system between basins is within the current technological limits. This value of 0.2-1Å 3 concurs quite nicely with the typical volume differences between the modifications we have investigated here, since the volume/atom V min at the bottom of the respective minima basins on the empirical energy landscape is found to be [50]: V min (rutile) = 10.641Å 3 , V min (anatase) = 11.502Å 3 , V min (Mp1) = 10.169Å 3 , V min (1/2Occp) = 10.641Å 3 , V min (CdI 2 ) = 11.058Å 3 , V min (1/2BN) = 15.409Å 3 .
Another very interesting question is to what extent the assumption that the chemical system can be visualized as reaching thermal equilibrium between two time-steps of the optimal control procedure is realistic.
This depends, of course, on the structure of the energy landscape and the size distribution of the basins that are present and relevant. Experience with threshold simulations such as those that have been used to obtain the data in this study, has shown that both the relaxation times within the basins and the escape times from the basins show a wide size distribution. This is not surprising since even constant temperature relaxations can exhibit a great variety of behavior ranging from simple fast exponential relaxations to essentially infinitely many relaxation time scales as found e.g. in aging processes of glassy systems.
For the purpose of the modeling approach presented here, the assumption of indeed reaching thermal equilibrium between two time-steps is an important one. Here, the experience of many threshold simulations has shown that the local density of states within a single basin is usually sampled much faster than it takes to transfer probability between two basins. Furthermore, this separation of time scales is clearly an appropriate and consistent assumption: if the system cannot equilibrate within a basin, then this basin is not locally ergodic, and the whole Markov model breaks down. But in that case, it does not make sense to treat such an unstable basin as a stable state of the system that can serve as a target modification, and it should either not be included at all or assigned a phantom node instead. On the other hand, once we employ very large numbers of time steps with very short time intervals in-between, then we may well reach the point that these time intervals are comparable to the local equilibration time. However, we can still define probability flows (and measure them in the threshold runs) that refer to very short time scales although here we need to include memory effects, i.e. the flow out of the current basin will depend on from which basin we have entered the basin of interest. In this case, we must use an extended concept of a Markov-model to allow for such memory effects, with all its implications for the optimal control problem. Such models might also be the appropriate tools to model and control syntheses that proceed far away from thermal equilibrium. But while being a fascinating project, we leave this for future work.
But as long as the assumption of local ergodicity of the basins holds and thus the Markov states of the model landscape are well-defined, the procedure we have presented can be transferred essentially one-to-one to more complex systems and applications. By combining optimal control methods with information gained from global landscape explorations, both on the theoretical and experimental side, it will be possible to design temperaturepressure schedules that will assist the experimental solid state chemist and physicist in improving the yield of their syntheses of particular modifications of solid compounds [66,67]. Clearly, the need to investigate not only one energy landscape but a whole bundle of enthalpy landscapes increases the urgency to improve the efficiency of the data acquisition procedures. One possible extension may be to use the parQ-methods [68] that should allow for a more efficient scanning of the landscape, while another direction would be to employ combinations of recently developed robotics-inspired search algorithms [69][70][71] with the threshold algorithm employed when collecting the data for this study.