JAXVacua – A Framework for Sampling String Vacua

: Moduli stabilisation in string compactifications with many light scalars remains a major blind-spot in the string landscape. In these regimes, analytic methods cease to work for generic choices of UV parameters which is why numerical techniques have to be exploited. In this paper, we implement algorithms based on JAX , heavily utilising automatic differentiation, just-in-time compilation and parallelisation features, to efficiently construct string vacua. This implementation provides a golden opportunity to efficiently analyse large unexplored regions of the string landscape. As a first example, we apply our techniques to the search of Type IIB flux vacua in Calabi-Yau orientifold compactifications. We argue that our methods only scale mildly with the Hodge numbers making exhaustive studies of low energy effective field theories with O (100) scalar fields feasible. Using small computing resources, we are able to construct O (10 6 ) flux vacua per geometry with h 1 , 2 ≥ 2, vastly out-performing previous systematic searches. In particular, we showcase the efficiency of our methods by presenting generic vacua with fluxes below the tadpole constraint set by the orientifold with up to h 1 , 2 = 25 complex structure moduli.


Introduction
The string landscape is believed to contain a large number of vacuum solutions, current estimates ranging between 10 500 [1,2] in type IIB flux compactfications to 10 272,000 [3] in F-theory constructions. 1 Despite these statistical estimates, it is fair to say that our capabilities to efficiently construct string vacua are extremely limited.To understand the low-energy physics attainable from string compactifications, a comprehensive numerical approach for analysing string theory vacua is therefore highly desirable.This is the overarching objective of this work.Success in this endeavour would enable an unprecedented look into the nature of consistent theories of quantum gravity and, in particular, their distinguishing features compared to bottom-up Effective Field Theories (EFTs).
More specifically, high-performance methods to search for realistic vacua will allow us to explicitly scrutinise existing constructions of e.g.string inflation [5,6], while at the same time explore previously unknown mechanisms in the landscape.In this context, there are several computational steps which have to be addressed simultaneously: 1. Sampling UV data (compactification data): Zooming in on Calabi-Yau (CY) compactifications, large datasets of compactification spaces have been constructed with the most prominent example being the Kreuzer-Skarke (KS) database of 4-dimensional reflexive polytopes [7].To obtain the associated EFTs, the relevant information about these compact dimensions has to be extracted.This includes the choice of geometry, its topological data like intersection numbers as well as suitable D-brane/O-plane setups satisfying tadpole and anomaly cancellation conditions.This UV data has been constructed on an individual basis in the past (see [8][9][10][11] for examples in Type IIB string theory), and tools have been developed to enable the study of large numbers of examples [12][13][14].By now this provides access to 'all' models in the KS database which is the first step to systematically analyse their vacuum structures.
2. Calculating EFT properties (quantum corrections): Consistently identifying perturbative and non-perturbative effects associated with a given UV sample is a prerequisite for proper control over the EFT.For CY compactifications, this includes in particular the scalar potential for the moduli fields.For these setups, several interesting phenomenological implications (most prominently an additional phase of matter domination in the early Universe) have been established in concrete geometric constructions (see [6] for a recent review).On the methodological side, following over thirty years of development, we now have tools available for computing a wide range of complex structure potentials [15][16][17][18].In addition, to control other moduli (e.g.Kähler moduli and open string moduli), further effects have to be considered such as non-perturbative effects via gaugino condensation or D-brane instantons (see [19] for a review).
3. EFT analysis: For many explicit constructions, analysing the EFTs has been a bottleneck due to a plethora of moduli fields which effectively limits our access to string theory solutions.For example, typical examples in the KS database feature O(100) complex structure and/or Kähler moduli fields with the Hodge numbers being bounded by h p,q ≤ 491.For these systems, analytic approaches are generically impossible except under certain assumptions. 2In general, one faces two significant bottlenecks: on the one hand, the EFT analysis usually has to be hard-coded on an example by example level.On the other hand, the numerical optimisation appeared to be inefficient [34,35].
In this work, we focus on developing tools which naturally combine the data products from the first two of the above points and, in light of the third item, address the problem of efficiently sampling string vacua.Specifically, we focus on Type IIB flux vacua which are typically used as starting points for many moduli stabilisation procedures [18,28,29].We construct UV input from smooth CY orientifolds where no complex structure or Kähler moduli are projected out.Furthermore, we focus on the large complex structure regime in moduli space where the EFT data is easily derived from string dualities [16,36,37].Our examples are obtained from the KS database using CYTools [12] by taking advantage of recent advances in mirror symmetry [24] and orientifold constructions [13,14].
Our main objective is a self-contained framework for computing EFT properties in the aforementioned compactifications.Using automatic differentiation, we connect elementary inputs such as the prepotential F for the complex structure moduli with the Kähler potential K and superpotential W . Subsequently, this allows us -again using automatic differentiation -to calculate the scalar potential and its derivatives.In our implementation we make heavy use of the just-in-time compilation and parallelisation features of JAX [38] which, as discussed later, accelerates our code in comparison to a simple python implementation by more than two orders of magnitude.We highlight that, although we provide a framework for type IIB flux compactifications, all components of our numerical implementation are modular.Little effort is required to build extensions including further aspects of low energy EFTs or exploring different string duality frames.
In our pipeline, we are able to generate efficiently the relevant equations (like the Fflatness conditions D I W = 0) associated with string solutions for any UV input.This defines an underlying optimisation problem which is the main use case in this work.We solve the latter using standard optimisation methods readily available via standard libraries.Arguably, solving this optimisation problem and generating solutions numerically is a challenging task by itself, but it becomes even more difficult to locate trustworthy and phenomenologically relevant solutions.Regarding the former, this means that we seek to find solutions in regimes of moduli space where calculations are under control.Equally important, we constrain our search to those values of the tadpole that can be realised in actual orientifold compactifications of Type IIB superstring theory.If we allowed unconstrained tadpoles, we would naturally find solutions almost anywhere for any number of moduli simply because of the drastically increased density of vacua in our search space.
We stress that, apart from the minimising task itself, evaluating and checking the aforementioned constraints on, say, billions of fluxes and the associated solutions to F-term equations is computationally expensive.The study of such large ensembles is necessary to address phenomenological requirements and to be able to statistically test the properties of the string landscape in a meaningful manner.Our framework naturally connects with other ongoing work to develop methods which allow to efficiently search for phenomenologically viable solutions in the string flux landscape [39][40][41][42].
The rest of this paper is organised as follows.In Section 2 we give a short portrait of CY orientifold flux compactifications and introduce the type of optimisation problems to be solved in the remainder of the paper.Section 3 describes our implementation and numerical approach to finding flux vacua.We present detailed numerical experiments in Section 4 and summarise our results in the conclusions in Section 5.

Type IIB flux compactifications
In this section, we review Type IIB flux compactifications on CY orientifolds and the type of equations that need to be solved for flux vacua.We mainly state results to set our notation and refer to the literature for more detailed discussions (see [43][44][45] for reviews).These considerations are prerequisites for the subsequent implementation of our root finding and sampling methods.

Type IIB flux vacua
Compactifying Type IIB superstring theory on a CY orientifold leads to 4D N = 1 supergravity with h 1,1 + Kähler and h 1,2 − complex structure moduli Z i .The classical Kähler potential K for the Z i and the axio-dilaton τ can be written as Here, we defined the period vector Π in terms of the prepotential F as The presence of 3-form fluxes induces the F -term scalar potential in terms of the Gukov-Vafa-Witten (GVW) superpotential [17] and integer flux quanta For given choices of a 3-form flux background G 3 = f − τ h, the potential (2.3) exhibits a non-trivial structure of minima called flux vacua.In the first instance it is these flux vacua which we aim at identifying.The available choices of fluxes are limited because they contribute to the cancellation condition for the C 4 -tadpole, in terms of the D3-brane charge induced by fluxes where N D3 (N D3 ) denotes the number of (anti-)D3-branes.The quantity χ(Y 4 ) is the Euler characteristic of the F -theory fourfold [46] encoding D3-charge contributions from O-planes and D7-branes.
In this work, we focus on a class of flux vacua for which the F -term conditions D Φ I W = D I W = 0, Φ I ∈ {τ, Z i }, are satisfied. 3One finds that In fact, these two conditions are equivalent to saying that the 3-form G 3 must be imaginary self-dual (ISD), i.e., ⋆ 6 G 3 = iG 3 in terms of the Hodge star operator ⋆ 3 on X 3 .We can express this condition more explicitly as [47] in terms of the gauge kinetic matrix For ISD fluxes, N flux ≥ 0 is non-negative which is why sources with negative D3-brane charge like D-branes and O-planes [18] are required for tadpole cancellation (2.6).Flux choices can lead to equivalent vacua by being identified under transformations of G = SL(2, Z) × Γ where the former is the Type IIB S-duality group and Γ the modular group acting on M cs (X 3 ).This can be avoided by mapping solutions to the fundamental domain of SL(2, Z).For later convenience, let us define the VEV of the superpotential as4 which is manifestly invariant under SL(2, Z) transformations.We note that for models with h 1,2 > 1, very little is known about the solution space of string theory, see however [35,50,51] for examples with h 1,2 ≥ 4 and [34] for h 1,2 = 2.
A key obstacle hereby is systematically solving F -term conditions (2.8), (2.9) for generic choices of fluxes below tadpole.Typically, analytic methods cease to work in regimes with h 1,2 > 1 unless for special classes of vacua like in [49,[52][53][54] where a subset of VEVs can be fixed analytically under certain conditions.
An important class of models concerns Large Complex Structure (LCS) limits in complex structure moduli space M cs (X 3 ).For such scenarios, the analytic structure for the prepotential is well understood and easily computed using mirror symmetry, see [24] for recent progress.In fact, around LCS points, the coordinates Z i of M cs (X 3 ) are identified with the Type IIA Kähler moduli in the large volume limit of Type IIA string theory compactified on the mirror dual CY manifold X3 [36,57,58].This identification implies that the complex structure moduli Z i take values in the Kähler cone of X3 where the sub-varieties are effective curves, effective divisors and X3 itself.It describes the moduli space of Kähler structures on X3 parametrised by a Kähler form J. By abuse of terminology, we simply refer to K X3 as the Kähler cone without mentioning the mirror side explicitly.The prepotential computed from mirror symmetry reads [36,57,58] Here, the parameters κ ijk are the triple intersection numbers of X3 which, along with the other parameters, are defined as Here, c 2 ( X3 ) denotes the second Chern class of the mirror manifold X3 .Further, the J i ∈ H 1,1 ( X3 , Z) are (1, 1)-forms and χ( X3 ) is the Euler characteristic of X3 .
Finally, the worldsheet instantons on the mirror dual side give rise to exponential contributions of the form [36,58] in terms of genus zero Gopakumar-Vafa (GV) invariants n q [59,60] of effective curves q in the Mori cone M( X3 ) of the mirror manifold X3 .It turns out to be more convenient to work with a different set of invariants obtained from a resummation of poly-logarithms.These are the so-called (genus zero) Gromov-Witten (GW) invariants N (0) q which are related to the GV invariants via (2.17) A systematic approach to computing these invariants has been established by HKTY [16,36].In practice, we compute the GV and GW invariants using CYTools [12,24] up to some finite degree.
The validity of the ansatz (2.14) for F is limited to the region of convergence of the LCS expansion [58].The radius of convergence is determined by the singularity of the associated Yukawa couplings [61,62].In this paper, we only check mild conditions to ensure the validity of our solutions up to a given cutoff on the GV invariants.That is, we look at solutions for which for some ε where F inst is computed up to some finite cutoff in the instanton expansion.In practice, we typically choose ε = 0.1, though we also investigate the cutoff dependence of the instanton expansion by including higher order contributions to F inst in the region of interest.
Before we continue, let us stress that going beyond the LCS regime is possible and should be part of future investigations.Here we limit ourselves to the LCS regime as computational tools from mirror symmetry are readily available in this regime [24].An extension of these tools to other regimes is a clear direction for future developments (see [63,64] for work along those lines).

Model construction and data-sets
Having presented the framework for our optimisation problem, we briefly comment on the data-sets which will be studied subsequently.

CY and orientifold construction
Foremost, we focus on the Kreuzer-Skarke (KS) database [7] comprised of 473,800,776 reflexive polytopes in four dimensions.Any fine, regular, star triangulation (FRST) of these polytopes leads to a CY manifold embedded as the anti-canonical hypersurface [65].Given that a single polytope can have many FRSTs, it typically features several nonisomorphic CY manifolds, though the actual number of topologically inequivalent ones remains opaque. 5rientifolds are constructed from Z 2 -involutions of Calabi-Yau manifolds.In the context of the KS database, the fixed point loci can often be obtained from simple conditions on the polytope [13,14].Although such involutions are inherited from the ambient toric variety covering only a subset of all involutions, they allow for the construction of a sizable number of orientifolds.We note that, in some cases, orientifold involutions lead to singularities which have to be resolved appropriately (see [68,69] for detailed discussions).For the examples discussed in this paper, we checked that our orientifolds do not give rise to any such singularities.
For convenience, we consider orientifolds with h 1,1 − = h 1,2 + = 0 for which the D7-tadpole is cancelled locally by putting D7-branes on top of the O7-planes.The D3-charge from O-planes and non-Higgsable SO(8)-stacks is given by (2.19) Throughout the main text, we constructed suitable models from the KS database using CYTools [12] making sure that there exists an orientifold with h 1,2 + = 0.6 Ultimately, this requirement ensures that the orientifold intersects the LCS patch allowing us to directly apply the formulas stated in previous section.

Flux vacua at LCS
In this work, we are concerned with analysing large samples of vacua to investigate their distributions and attainable properties.It is clear that, due to the absence of generic analytic solutions, this requires dedicated numerical tools.On the aforementioned geometries, the tadpole (through (2.19)) and hence the possible flux vacua are bounded [2,70].This number is generally speaking huge and estimates obtained from continuous flux approximations suggest that the number of flux vacua with Here, we defined the 3rd Betti number b 3 = 2(h 1,2 + 1), the curvature 2-form R and the Kähler form ω on moduli space.We refer to [71] for an explicit evaluation of this formula for examples with few moduli.We stress that the absence of solutions for a given choice of fluxes in our subsequent routine has to be seen in conjunction with the restriction to the LCS regime and to a single Kähler cone phase associated with a single FRST of a polytope.To give some heuristic motivation, let us assume that for each of the N moduli half of the parameter range falls within the strict LCS regime Im(Z i ) > 0. This implies that at large N a uniformly sampled point in moduli space has a probability of 1/2 N to fall in the LCS regime.In reality, the angles between two hyperplanes of the cone becomes smaller as we scale up h 1,2 making the cone even narrower than our previous naive estimate [72].
On top of that, even if points can be efficiently sampled from K X3 , the actual question to ask is: given a choice of fluxes, what are the chances of finding a minimum within the LCS patch traced out by K X3 ?In our case, we search for flux vacua in a single toric phase defined by a single FRST of a polytope.Generally, this geometric phase only covers a small subspace of moduli space where the existence of solutions to (2.8), (2.9) for arbitrary flux choices with N flux ≤ Q D3 cannot be guaranteed. 7As we discuss later in Sect.4.3, this seriously affects our ability to sample vacua at large values of h 1,2 .

Algorithmic approach for string vacua
Below we outline the scope and functionalities of our current numerical approach.Our code is modular such that each component can be developed further on an individual basis and allows to be integrated into the pipeline to extract physical properties.In short, the rationale is as follows: (i) Model construction: We start with a high level input which can either be a custom example or directly interface with existing databases such as the KS database or the CICY database [74,75].
(ii) EFT module: We calculate relevant quantities in the 4D supergravity EFT starting from a prepotential.Here, this includes the flux superpotential (2.4) and the Kähler potential (2.1) for complex structure moduli.Subsequently, we use the known analytical framework to calculate quantities like the F -term scalar potential (2.3).This pipeline is implemented such that we can ensure scalability in the presence of many moduli.To do this, we utilise JAX and its just-in-time (jit) and vectorisation (vmap) features.As detailed below, compared to alternative implementations in python or Mathematica, simply evaluating relevant quantities with our methods is faster by several orders of magnitude, see Fig. 1.
(iii) Optimisation module: We formulate the optimisation conditions for identifying minima of moduli potentials and pass them to optimisation algorithms.A priori there is no optimisation algorithm which is singled out.At this stage, we restrict ourselves to the use of scipy.optimize.rootwhich we find to be rather efficient for our purposes.
(iv) Sampling module: Of particular importance for the efficiency of our optimisation algorithm is the choice of initial guesses for the moduli VEVs, i.e., from where the algorithm starts searching for minima.We present novel methods to choose these points based on the choice of flux quanta using the ISD condition (2.10).
(v) Filter module: The minima are identified using a certain numerical tolerance of the respective conditions.To warrant that we are dealing with actual minima, we perform additional checks on these candidates as outlined below.Among others, this involves checking the positivity of the Hessian of the scalar potential.
Below, we describe in more detail each of the individual modules (ii)-(v) building our framework for the search for string vacua.(The model construction (i) is largely outsourced using tools readily available in CYTools.)Again we stress that each of these components could be modified without interfering with the other parts of the code.Given our implementation, our current bottleneck when extrapolating to large h 1,2 is the sampling and root finding which we comment on in due course.

EFT module -moduli potentials in JAX
For our implementation, we opted for JAX [38] which at its heart employs automatic differentiation to compute gradients of functions.It recognises primitive operations inside functions and applies standard rules for differentiation in such a way to numerically evaluate derivatives rather efficiently.In particular, it avoids finite differences to compute approximate derivatives in numerical differentiation, and it bypasses inefficient expressions in symbolic differentiation which would appear for instance when using sympy.
Overall, this makes the implementation differentiable, i.e., derivatives of quantities with respect to intermediate or input variables can be taken and utilised at machine precision.This is ideal for our quest for finding string vacua in 4D supergravity where only a handful of functions are prerequisites, in our case the prepotential (2.14), in order to compute quantities like period vectors, scalar potentials or the Hessian by taking suitable derivatives.In particular, this implies that our code only needs the absolute minimum of functions to be hard-coded, thereby making our implementation extremely versatile.Among others, it is readily applicable to any moduli space limit away from LCS such as conifold regions and, with minor adjustments, can be easily extended to include Kähler moduli.
Apart from the usefulness of auto-differentiation, there are two powerful tools to speed up the implementations.The first is jit-compilation (standing for just-in-time compilation) which, in short, transforms python-functions into language representations via so-called tracer objects that record all the operations being performed.This reconstruction improves the execution by sending full sequences of operations to tensorflow's XLA (Accelerated Linear Algebra) compiler [76], while in standard python-code each operation is sent to the compiler one at a time. 8For our purposes, jit ensures that there is only a modest scaling with respect to the number of moduli fields through essentially C++ speed of the code.Additionally, the usage of JAX makes most modules readily usable on the GPU which can further improve the efficiency.
Secondly, while vectorisation can always be implemented manually (by rearranging indices and axes), this can become increasingly tedious when the range or the number of indices changes dynamically.This is where the automatic vectorisation feature vmap of JAX is particularly useful.For example, we may want to evaluate the F -term conditions for a single flux at many points or, instead, for several fluxes, but for each flux at a different collection of points.Both cases can be addressed by calling vmap on the various inputs (see the JAX documentation for examples).Clearly, the upshot of vectorising expressions in our implementation are enormous gains in efficiency as we now illustrate.
Let us quantify the improvements of our implementation compared to popular alternatives, namely Mathematica (version 13.4) and python (version 3.9).We compare the time required to evaluate the F -term conditions D I W for all fields in different implementations in Fig. 1.We present two models with h 1,2 = 2, 4 complex structure moduli using in both cases GV invariants up to degree 4. 9 In Mathematica and python, we used a hard-coded version of the F -term conditions, while our jit-and vmap-compilation starts from the prepotential (2.14) and constructs the F -terms via automatic differentiation.We use JAX version 0.4.4 in our benchmark comparison.
In both the Mathematica and python implementations, one observes a clear scaling with h 1,2 coming from the increasing number of primitive operations required when computing D I W .In contrast, the implementations with jit show virtually no difference between Figure 1: Efficiency comparison of different implementations: Time required to evaluate the F -term conditions for two models, described in the text, with h 1,2 = 2, 4 at a given number of points using different implementations.Each curve is obtained from averaging over 10 runs performed on the same laptop using a single core.Further improvements are easily obtained when parallelising over multiple cores or using our code on the GPU. the two models.As promised above, the most dramatic speed up is however observed by making use of vmap which improves the evaluation time by roughly two orders of magnitude.A comparison with other implementations like a direct C++ implementation is beyond the scope of this paper.
Before we continue, we note that, while we provide implementations for general supergravity equations, it can occasionally be more efficient to use simplified expressions.For example, say we solved the F -flatness conditions for a given choice of fluxes.Then it can be useful to employ identities for the F -term scalar potential (2.3) and related quantities like the Hessian, see e.g.[77].

Optimisation module -numerical search for extremal points
In our quest for string vacua, we would like to numerically solve D I W = 0 for given inputs of UV data.Since most optimisation algorithms work with real-valued variables we reformulate our complex-valued optimisation conditions (cf.Eqs.(2.8), (2.9)) in terms of their respective real and imaginary parts.We end up with a one-dimensional array of size 2(h 1,2 + 1) for which we find the roots in terms of the real and imaginary parts of the moduli fields. 10or our analyses, finding zeros of D I W is efficiently achieved using the root finding methods of scipy [78], especially when compared with homotopy solvers previously employed in [34,35].From the methods implemented within scipy.optimize.root,we determine via direct comparisons that method='hybr' associated with a modified version of Powell's method [79] works most reliably and efficiently.As input, we provide a choice of integer fluxes together with initial guesses for the roots.We comment on the sampling of both in the subsequent section.In practice, we run the optimisation module on several CPUs in parallel to speed up the computation.As numerical tolerance ϵ tol for the root finding, we use ϵ tol = 10 −10 across all models which was chosen as a robust choice when comparing with known special solutions for the torus [80] and P[1, 1, 1, 6, 9] [49,81].
Beyond scipy.optimize.rootthere are several optimisation methods which could be used.At this stage, however, involving only a small number of checks, we did not find a more efficient method when comparing with e.g.gradient descent approaches.Clearly, choosing Eqs.(2.8), (2.9) as optimisation targets is only one possibility and our framework can be easily applied to the gradient of the scalar potential instead.This is possible due to the efficient automatic differentiation capabilities of our implementation.In the long term, it would be beneficial to apply even more efficient optimisers which can be parallelised easily within the JAX framework and make use of the GPU.These modifications will be the target of the next round of improvements of our framework.

Sampling module -sampling of fluxes and initial guesses
In this section, we detail the strategies to efficiently sample flux choices and starting conditions for our root finding algorithm.As we will show subsequently, this vastly outperforms previous access to solutions.Hereby, it is absolutely crucial to employ vmap to guarantee speedups of several orders of magnitude similar to Fig. 1.
Obviously, there remain biases associated to our selection procedure.Although we believe that one should account for these biases using existing statistical techniques, this analysis goes beyond the current scope of the paper. 11Here, our main focus is to establish novel methods to numerically access these vacua.The various sampling techniques to be introduced below will be compared in the subsequent section, see in particular Fig. 2.

Initial guesses
Initially, we need to specify points in moduli space from which to start the root finding.To this end, we define a region, e.g.sphere, box, or the Kähler cone K X3 , in which points are being uniformly sampled.We treat the size of these regions as a hyperparameter.While the notion of the Kähler cone is mainly useful for LCS limits, conifold regions can be described by spheres centered around the origin.
For our considerations, it is useful to work inside subcones of K X3 defined in (2.13), namely so-called stretched Kähler cones defined for some c ∈ R + via [72,82] The tip of K X3 [ c ] is given by the shortest (in the Euclidean metric) vector v tip in the Kähler cone K X3 for which the generators of the Mori cone M( X3 ) (as the dual of its closure) have volume at least c.It is convenient to work with K X3 [ c ] for some c > 0 because the convergence of F inst is more easily guaranteed. 12In addition, by picking initial guesses far inside K X3 [ c ], we are more likely to find roots inside the domain of validity of the LCS expansion, thereby improving the stability and success rate of our root finding methods.Lastly, we implemented a variant of the cone sampling procedure which is useful in models where the generators of K X3 are not known explicitly.In such cases, one can select initial guesses along the one-dimensional subspace {c • v tip : c ∈ R + } (with v tip here and subsequently denoting the tip of K X3 [ c = 1 ]) corresponding to the tips of the stretched cones for varying c.

Flux choices
We have to specify flux choices where we distinguish the following two approaches: 1. We sample fluxes independently from the starting point: We do this randomly by sampling from the uniform distribution such that N flux is below or at a fixed tadpole Q D3 as set by Eq. (2.19).Similarly, we sample starting points independently from our flux choices using one of the aforementioned techniques.

ISD biased sampling:
We introduce a new sampling technique which ensures that both the flux is below the tadpole and the starting point is close to the point satisfying the ISD condition in Eq. (2.10).The idea is to sample only half of the fluxes together with points (Z i 0 , τ 0 ) inside the Kähler cone.Then, the ISD constraint (2.10) can be solved for the remaining half of the fluxes.We implemented two variants, namely Here, N 0 denotes the gauge kinetic matrix (2.11) evaluated at Z i 0 .We sample the fluxes on the right hand sides of (3.2) and (3.3), while solving for the fluxes on the left. 13In general, the fluxes m, ñ (dropping indices for convenience) obtained in this way are continuous.To get a meaningful flux vacuum consistent with flux quantisation, we have to round the fluxes m, ñ → m, n ∈ Z.In effect, this implies that we no longer solve (3.2), (3.3) exactly at our initial point in moduli space, but pay the price of shifting the moduli VEVs ⟨Z i ⟩, ⟨τ ⟩ slightly away from Z i 0 , τ 0 , i.e., The sizes of the induced shifts {δZ i , δτ } from rounding the fluxes depend crucially on the chosen method ISD ± , see Fig. 2.More importantly, the two sampling procedures give rise to different characteristics of the resulting distribution of ISD solutions as we will show in Sect.4.1.

Filter module -identification of trustworthy string vacua
After computing the stationary points of V F by solving D I W = 0 or ∂ I V F = 0, we feed them into a pipeline to extract trustworthy minima. 14Currently, we implemented the following checks on our solutions: • We ensure positive string couplings and inequivalence to other solutions under SL(2, Z) transformations.This is simply because we are interested only in physically relevant solutions with g s > 0, while avoiding overcounting solutions due to SL(2, Z).Where necessary, we map τ to the fundamental domain of SL(2, Z) by applying translations and S-duality transformations.
• We verify that our solutions meet the Kähler cone conditions.Depending on the available information associated with the background geometry, we use different implementations to verify that the Kähler cone conditions are satisfied.If hyperplanes or generators of the cone are available, then we can easily check whether the given VEVs lie inside the Kähler cone.Alternatively, we check that the Kähler metric is positive definite which is however not a sufficient condition.
• We check that the Hessian of the scalar potential is positive semi-definite by computing its Cholesky decomposition as we find this method to be more efficient than computing eigenvalues directly.This guarantees that the obtained stationary points are actual minima of the scalar potential.In addition, even if non-negativity of the Hessian can be guaranteed, ensuring the absence of flat directions requires special attention.This is because it can be numerically hard to distinguish minima from flat direction, especially when large hierarchies of scales are involved.For vacua with D I W = 0, the number of massless fluctuations around the minimum, i.e., the dimension of the remaining moduli space is given by the rank of the matrix 15 (see e.g.[77]) For vacua with D I W ̸ = 0, we have to consider the number of first order obstructions to ∂ I V F = 0 which is simply the rank of the Hessian.In both cases, the rank is computed using the tolerance ϵ tol used in the optimisation module.
• Validity of the LCS approximation: We check Eq. (2.18) for ε = 0.1.This gives a rough estimate on the radius of convergence for degrees available in each model. 16 solution passing these criteria we refer to as a flux solution or vacuum.All of the above checks are fully vmap compatible (recall Sect.

Numerical experiments
Having described our algorithm, we now showcase some initial capabilities and comment on physical implications of the respective results.To begin -using the well-studied example of a degree 18 hypersurface in the weighted projective space P[1, 1, 1, 6, 9] -we discuss the differences in our sampling procedures.We then generate large samples of vacua for four and five moduli models which were introduced in [35] featuring at least one million flux vacua, vastly outperforming the handful of previously obtained vacua.Interestingly, we identify that the respective |W 0 |−distributions can be well approximated by the same probability distribution.Finally we analyse the scaling behaviour with h 1,2 by studying examples with up to 25 moduli; identifying suitable success rates and timing behaviour of our approach.

Sampling procedures in practice
To characterise the physics associated with the distribution of flux vacua, we need to understand the biases arising from the respective sampling techniques. 17To start the discussion, we present the difference of the vacua distribution in our sampling techniques previously introduced in Section 3.3.To allow for simpler visualisation, we study the degree 18 hypersurface X 3 in weighted projective space P[1, ] with Hodge numbers (h 1,1 , h 1,2 ) = (2, 272).
In detail, we focus on a particular locus in moduli space where the CY is invariant under G = Z 6 × Z 18 [86].By restricting to fluxes invariant under this symmetry, we only need to solve the F -term conditions along the invariant subspace.The corresponding periods can be obtained from the mirror CY giving rise to an effective prepotential depending only on two moduli.The associated topological data is We also computed GV invariants up to degree 100 using methods described in [24] and collected the leading order GV invariants (see Tab. 1 as a reference).We consider an orientifold with h 1,2 + = 0 constructed explicitly in [87] for which the D3-tadpole is given by Q D3 = 276.
The Kähler cone K X3 is simply the positive quadrant generated by the vectors {(1, 0), (0, 1)}.One thus easily samples points inside the cone by taking positive linear combinations of these generators.The tip of the stretched cone K X3 [c = 1] is just given by v tip = (1, 1), recall Sect.3.3.

Benchmarking our performance
Initially, we compare our implementation with existing searches using the Paramotopy method [88] which has been employed for our model in [34] to search for flux vacua with N flux = 34.While the scan of [34] was performed only including classical terms, we are able to conduct two separate searches: the first uses the purely classical prepotential, whereas the second includes instanton terms up to degree 10.We note that in our framework there is no significant change in performance for either of the two scans despite the fact that F inst includes 65 additional terms with non-vanishing GVs.
Our calculations have been performed on the LMU cluster on a single node with 4 cores with 5GB of memory and 50,000 input fluxes.For the two runs, we found 34,542 (with instantons) and 33,019 (classical) vacua with ε = 0.1 in (2.18) in about 45 minutes respectively.From those vacua, using sampling method ISD + , 33,047 and 31,042 survive when choosing instead ε = 0.01 in (2.18) implying that the LCS expansion is well under control in our framework.As a comparison, the equivalent scan in [34] was performed with 100 nodes each with 32 cores and required a total calculation time of around 75,000 hours to find around 24,882 acceptable solutions.

Qualitative comparison of sampling approaches
Turning to the differences of our sampling methods, we show the distribution of moduli VEVs for the different sampling methods in Fig. 2. By sampling points only along tips of the stretched Kähler cone for different choices of c (recall (3.1)), the resulting sample of flux vacua remains largely 1-dimensional (orange).For cone sampling, one covers a much broader range inside the Kähler cone for both ISD ± .Interestingly, randomly sampling fluxes results in moduli VEVs clustered at the boundary of the Kähler cone.Being close to the boundary suggests that these solutions may not survive in the presence of higher order instanton corrections as the region with |F inst /F | < ε shifts further inside the Kähler cone when cutting the instanton prepotential at higher degree.
Looking at the right in Fig. 2, we find that the distribution of flux tadpole contributions N flux are vastly different among the sampling methods.We clearly see that the ISD + method leads to rather large tadpoles, even for models with small h 1,2 . 18In contrast, the ISD − approach generates samples with significantly smaller N flux values.
The bottom row of Fig. 2 shows a subset of the moduli VEVs at the minimum and the associated initial guesses.The relative distance between them depends significantly on the sampling method.These noticeable dissimilarities can lead to algorithmic biases (see [42]) as the identification of minima is subject to input parameters of the optimisation module like the maximal number of steps.For random flux sampling, we find the largest displacements where, despite sampling points far inside the cone, the moduli VEVs end up close to the boundary.In contrast, the induced shifts δZ i (recall Eq. (3.4)) from rounding fluxes in ISD sampling are appreciably smaller, though they are larger for ISD − than for ISD + .This is because, when using the ISD − method, the inverse matrix N IJ appears in (3.3).In effect, this amplifies the error in (3.3) when rounding the continuous fluxes on the left hand side.Hence, the actual solutions to the F -flatness conditions are on average further away from the initial guess than in ISD + , thereby also affecting the success rate.
To conclude, we clearly see that the different biases due to the sampling method have indeed an appreciable effect.These biases have to be accounted for to deduce constraints on the space of flux vacua and to extract probabilistic statements about the string landscape. 19t this stage we postpone a quantification of these effects to future work as we believe that these techniques are best applied in the context of a direct physics question.
In particular, let us highlight that already in this simple example the distribution of N flux depends critically on the chosen sampling and solving methods.As is obvious from Fig. 2, large values of N flux ≳ O(10 3 ) are naturally preferred for ISD + sampling.This should however not be understood as evidence for the tadpole conjecture.It is merely Capabilities to generate vast sets of generic flux vacua in regimes with h 1,2 > 2 has so far been rather limited.Here, we would like to demonstrate the ease with which such datasets can be efficiently produced within our framework even for more elaborate geometries.We therefore revisit models with four and five complex structure moduli presented in [35].
Specifically, we study three CY hypersurfaces presented in the appendix20 of [35], though we construct orientifolds from different Z 2 -involutions. 21The latter are obtained from the orientifold database of [90] with the D3-tadpole values Q D3 = 104 and Q D3 = 192 for (h 1,1 , h 1,2 ) = (4, 98) (example 2) and (h 1,1 , h 1,2 ) = (5, 185) (examples 3, 4) respectively.As in the previous example, we study these models near a Greene-Plesser point [91] where the CY is manifestly invariant under a discrete subgroup G. Turning on fluxes only along the invariant 3-cycles allows to fix the non-invariant moduli fields at their fixed point under G, thereby solving the associated F -flatness conditions.The remaining periods can directly be computed on the mirror dual CY X3 and depend only on the invariant moduli of which we have four (example 2) or five (examples 3,4) in our models.
For these example, we now collect flux vacua by using the ISD + sampling method for the fluxes and initial points.To this end, we randomly generate fluxes in the range [−5, 5] and points inside the Kähler cone up to Euclidean distance 10 from the origin using the cone generators.For each model, we collected approximately 10 6 vacua taking into account instanton corrections up to degree 10 which took 10 hours of run time on a machine with 4 CPUs and 10GB memory.
We show the distribution of |W 0 | and g s on the left of Fig. 3. Apart from a slight shift of the peak, the former seems to be almost identical across the three models.This is a hint at universality of the |W 0 | distribution across different models within our current framework.In contrast, the distribution of g s exhibits non-trivial structures which are distinct for each model.Clearly, at this point, we are unable to infer that these are characteristic features of the respective models given the expected biases in our data, recall Sect.3.3.Nonetheless, it is interesting to point out that, while the distribution for example 2 peaks at small g s , we observe two distinctive peaks at large g s for example 4.
The distribution of ε and N flux /Q D3 is displayed on the right of Fig. 3.While in the analysis of [35] most vacua were unstable against the inclusion of higher order instanton corrections, we find overall an excellent control over the instanton expansion as exhibited by small ε.This can be expected given that our sampling methods allow us to stabilise moduli deep inside the Kähler cone where instanton terms are highly suppressed.In contrast, as we have seen previously in Fig. 2, random sampling fluxes as in [35] tends to drive the Fterm solutions closer to the boundary, thereby loosing control over the instanton expansion.This makes evident the fact that our new sampling techniques are superior to previously employed techniques.
The rescaled distribution of N flux /Q D3 is largely model independent.At first sight, this is surprising given the expected scaling (2.20) of the number of vacua with h 1,2 and the tadpole.We believe that this is due to the chosen sampling method, though this observation deserves further scrutiny.

Scaling behaviour for large numbers of moduli
Having seen the capabilities to generate reliable samples, we next comment on the capabilities of this framework to analyse geometries with even larger number of moduli.To this end, we construct examples with h 1,2 ∈ {5, 10, 15, 20, 25} moduli using CYTools, see Tab. 2 and ancillary files for the details regarding these models.Specifically, we selected trilayer polytopes [14] in KS with the largest h 1,1 for given h 1,2 and constructed smooth O3/O7 orientifolds with h 1,1 − = h 1,2 + = 0.Even though we collected these models by requiring, in addition to the existence of suitable orientifolds, large Q D3 , we note that similar analysis could be done with much smaller Q D3 at the prize of a higher run time.
Given the model data, we look for flux vacua with N flux ≤ Q D3 .We sample vacua by using the ISD + method running on machines with 8 CPU cores and 12GB RAM.For each model, we include instanton contributions up to degree 2.  allowed D3-charge Q D3 .(Note that each sample at fixed h 1,2 contains a different number of vacua as summarised in Tab 2.) As we argued before in Sect.4.1, our ISD + sampling, which we picked because of its reliability, prefers large N flux which is why the distribution in Fig. 5 is peaked near N flux /Q D3 ∼ 1.In contrast, g s is almost uniformly distributed in a wide range of values for h 1,2 ≤ 15, though the distribution for h 1,2 = 5, 10 exhibit moderate peaks around g s 0.2.

Conclusions
We presented an efficient and reliable framework to find flux vacua providing access to largely unexplored territories of string theory solutions.These advances have been achieved by combining various numerical techniques ubiquitously used in the context of machine learning, namely automatic differentiation, just-in-time compilation, and vectorisation.These methods allow for the generation of fast (C++), parallelisable (on CPU and GPU), and versatile code allowing a simple interface with EFT quantities, e.g., the couplings in the prepotential.
In this work, we are only hinting at upcoming physics analysis for these flux solutions.For instance, our framework, JAXVacua, intrinsically calculates the scalar potential and its derivatives at machine precision to any order in some approximation scheme.This enables systematic studies of properties of moduli potentials in general classes of models.Even more importantly, as we showed in the main text, large scale surveys of previously inaccessible regimes in the string landscape become feasible.This provides excellent opportunities to refine our understanding of the statistics of vacuum solutions, their associated mass spectra and couplings to visible matter fields.Ultimately, there is a wide range of applications of our methods to string model building and cosmology.
At this stage, there are a few technical improvements which we leave for the near future to make flux vacua at LCS of the entire KS database accessible: • Optimisation: Although we identify scipy.optimize.rootas an efficient root finding method for flux vacua, it is nevertheless a limiting factor in efficiency.For instance, it would allow for further speed-up if we can use this method on the GPU directly.In this context it is interesting to see whether other optimisers, such as variants of gradient descent can lead to more efficient flux vacua generation, see [92] for applications of gradient descent methods to different types of string theory vacua.
As the successful choice of optimiser is also based on the structure of the respective energy landscape in this optimisation problem, it will be interesting to compare the similarities between the string theory landscape and other energy landscapes (e.g.spin glass systems or deep learning optimisation).
• Characterising the trustable EFT regime: To understand the stability of flux solutions at the boundaries of the LCS expansion, efficient ways of estimating the radius of convergence in our examples are pending (see [16,36,61,62] for examples).This applies to a subset of our solutions where the control parameter F inst /F pert = ε becomes large.Currently, we evaluate this parameter by calculating the GV expansion using CYTools [24] up to high degrees.In this way, we identified a large sample of solutions where these contributions are exponentially suppressed up to degree 10.Crucially, our implementation is able to compute the relevant quantities to any order in the GV expansion at machine precision.
• Sampling input data: We observed that the success rate of finding vacua below tadpole decreases in our current pipeline significantly at large h 1,2 > 15.We believe that this is mainly due to inefficient sampling procedures.To make progress, it would be instructive to formulate conditions that, for a given choice of fluxes, guarantee solutions to exist inside the Kähler cone.Secondly, while the ISD sampling methods have shown outstanding success, it turns out to be hard to sample points in the Kähler cone such that N flux ≤ Q D3 .By implementing this as an optimisation problem, gradient descent methods could help locating such points inside the Kähler cone.
We note that, although our methods and examples are currently limited to LCS limits, our implementation can be adapted to study different asymptotic limits in moduli space such as conifold regimes [63,64].In these cases, the solutions to the Picard-Fuchs equation for the periods have different analytic properties.Such generalisations will be investigated in the future.
More generally speaking, our approach in JAXVacua can be extended to other moduli, such as Kähler moduli stabilisation.This will enable a general search for vacua where the presence of novel minima could be observed that fall neither in the class of the standard KKLT nor the LVS.Such new solutions are believed to arise from the intricate structure of EFTs from string compactifications.In the presence of large numbers of moduli, such regions are most easily discovered in comprehensive numerical explorations as presented in this paper.

Figure 2 :
Figure 2: Comparison between the different sampling procedures.Top left: Distribution of flux vacua in the plane of moduli VEVs, utilising units as outlined in the main text.Top right: Distribution of N flux with the vertical line indicating the tadpole bound Q D3 = 276 and the dark shaded region highlighting solutions with N flux > Q D3 .Bottom row: Distribution of initial guesses (gray) compared to the actual solutions to D I W = 0.

Figure 3 :
Figure 3: Samples of flux vacua with four and five complex structure moduli of [35].Left: Distribution of W 0 and g s , showing universal behaviour in the distribution of W 0 .Right: Distribution of ε and N flux , demonstrating very good control over the instanton expansion.

Figure 4 :
Figure 4: Efficiency analysis of our code for the models summarised in Tab.2: Left: Time required to apply the optimisation module on a given number of input fluxes obtained from the average over five independent runs.Right: The distribution of the instanton control parameter ε (recall Eq. (2.18)).Note that all ε values are far below our cut-off 0.1.

Figure 5 :
Figure 5: Solutions for large number of moduli.Left: Distribution of the ratio of N flux over the maximally allowed D3-charge Q D3 .Right: Distribution of the string coupling g s .