The hand-made tail: Non-perturbative tails from multifield inflation

It is becoming increasingly clear that large but rare fluctuations of the primordial curvature field, controlled by the tail of its probability distribution, could have dramatic effects on the current structure of the universe -- {\it e.g.} via primordial black-holes. However, the use of standard perturbation theory to study the evolution of fluctuations during inflation fails in providing a reliable description of how non-linear interactions induce non-Gaussian tails. Here, we use the stochastic inflation formalism to study the non-perturbative effects from multi-field fluctuations on the statistical properties of the primordial curvature field. Starting from the effective action describing multi-field fluctuations, we compute the joint probability density function and show that enhanced non-Gaussian tails are a generic feature of slow-roll inflation with additional degrees of freedom.


Introduction
The reconstruction of our universe's history relies on the assumption that the primordial curvature fluctuation ζ( x) (responsible for our universe's inhomogeneities) was initially distributed according to a Gaussian statistics, parametrised by an almost scale invariant power spectrum P ζ (k).Although this assumption agrees with every relevant cosmological observation [1], there are good reasons to suspect that our primordial universe could not have been perfectly Gaussian.To start with, the simplest models of cosmic inflationthe theory that explains the origin of ζ( x)-predict tiny, but non-vanishing, levels of non-Gaussianities [2].Unfortunately, this minimal prediction will likely remain out of reach for the next generation of cosmological surveys.On the other hand, large non-Gaussianity can arise if, during inflation, ζ( x) evolved experiencing large self-interactions and/or interactions with other relevant degrees of freedom [3][4][5][6][7].One way to parametrise the observable effects of these interactions on the distribution of ζ( x) is in the form of n-point correlation functions ζ( x 1 ) • • • ζ( x n ) .The shape of these n-point functions in momentum space can display distinctive signatures, providing a powerful diagnostic of the types of fields present during inflation.For example, massive fields with spin can leave oscillatory features in the primordial bispectrum (the amplitude of the 3-point correlation function of primordial fluctuations) with a shape determined by their spin [8][9][10][11].
However, n-point correlation functions computed with standard perturbation theory are inappropriate to assess the occurrence of large statistical excursions of ζ( x).The prevalence of large statistical excursions is dictated by the shape of the tail of the probability distribution function describing the statistics of ζ( x).But perturbative methods fail to correctly determine the profile of tails.As emphasised in [12], perturbation theory schematically relates the 1-point probability distribution of ζ and connected nth-moments ζ n c [which, in turn, are related to connected n-point correlation functions of ζ( x)] as where σ 2 ζ is the Gaussian variance of the distribution, determined by the power spectrum In terms of the usual f NL and g NL parameters for the first few terms in the expansion, the previous expression takes the form For typical statistical excursions ζ ∼ σ ζ 1, the expansion of the distribution function in terms of moments ζ n c remains under control as long as ζ n c /σ n ζ 1, which can be satisfied in perturbation theory even for values of f NL and g NL of order 1.On the other hand, for unlikely large statistical excursions ζ ∼ 1 this expansion may fail, particularly in models predicting f NL and g NL of order 1 (or larger) commonly encountered in theories of inflation involving sizable non-linear interactions (for instance, in the form of interactions with other degrees of freedom).In such models, not only ζ 3 c and ζ 4 c , but all n-point moments are expected to contribute corrections of order 1 on the tail of the distribution, making the expansion (1.2) useless to study extreme fluctuations.This failure of perturbation theory to parametrise large statistical excursions of ζ in certain models of inflation motivates the consideration of non-perturbative techniques to study the consequence of non-linear interactions of ζ during inflation [12][13][14][15][16][17][18].
As unlikely as they might be, large statistical fluctuations of the primordial field ζ can have dramatic effects on the formation of structure in our universe.More to the point, after inflation, large fluctuations of ζ( x) can lead to overdense regions of space that inevitably collapse into primordial black holes (PBHs) (see Refs. [19,20] for recent reviews).These black holes could become the seeds of supermassive black holes at the center of galaxies, and form a substantial part of the dark matter content of our universe.The abundance and clustering properties of these PBHs are extremely sensitive to the shape of the tails of the PDF dictating the distribution of ζ( x) [21][22][23][24].Thus, to correctly understand the possible generation of PBHs as a result of inflation, we need a reliable, non-perturbative approach to reconstruct the non-Gaussian tails of the primordial fluctuation's PDF.
The purpose of this work is to quantify precisely the effects of light isocurvature fluctuations on the probability density function of ζ( x) by using the non-perturbative approach offered by the stochastic inflation formalism [25][26][27][28][29][30][31][32][33][34][35][36][37][38][39][40][41].The authors of [42] have argued that the interaction between ζ and a light field ψ can introduce non-Gaussian corrections that modify the shape of tails of the probability density function of ζ( x), valid at the end of inflation.Here we confirm this scenario and we show that the joint PDF describing the statistic of ζ and ψ in two-field models of inflation with canonical kinetic terms is given by where κ is related to the strength of the coupling between ζ and ψ, and the ellipses stand for additional subleading contributions that we calculate in some specific examples.The non-perturbative nature of (1.3) is not obvious, but it becomes apparent after integrating over ψ to reveal the tail of the distribution for ζ, which becomes strongly non-Gaussian The dependence on κ makes manifest the non-perturbative sensitivity of tails to nonlinear interactions between ζ and other degrees of freedom.Similar non-Gaussian tails have been found in other single-field scenarios where the background is non trivial [43][44][45][46], and quantum diffusion plays an important role.And in [42], for a sudden, transient coupling between the curvature field and a light spectator field.Instead, in our calculation, slow-roll is preserved throughout and all interactions and couplings are constant in time.
In multi-field models of slow-roll inflation with non-canonical kinetic terms we expect corrections in (1.3) that would change the details on how (1.4) is obtained, leading to a different profile for the tail.
An important advantage of the analysis presented here, based on the fluctuations, is that we are able to show that such non-Gaussian tails are a generic consequence of multifield inflation, and also that they do not require the interruption of slow roll.The leading non-Gaussian contribution to the PDF can be traced back to an ever-present quadratic derivative coupling1 ζψ between the curvature and the isocurvature perturbations [47,48].In minimal multifield scenarios the coupling κ is related to the angular velocity Ω of the inflationary trajectory in field space.However, our results apply to any model where this derivative coupling is present.
We will be particularly interested in the case of a very light or even massless (ultralight) isocurvature fluctuation [49,50].This provides an alternative inflationary scenario -potentially relevant to string compactifications-with predictions currently indistinguishable from those of single-field inflation but where light fields do not need to be stabilized.We consider the background to be quasi de Sitter during the whole duration of inflation.The UV completion of such systems in terms of an effective field space metric and an effective multifield potential has been discussed in [10,[50][51][52].
To derive (1.3), we start from the effective action for the perturbations of a two field model of inflation [47,48].From the action, we will coarse grain the equations of motion to obtain a Fokker-Planck equation for the long wavelength modes.In order to introduce derivative interactions we write the equation in phase space.Because the time scale associated with the approach to equilibrium of the velocity field v ζ is much shorter than the one of the other fields, we can integrate out v ζ directly from the Fokker-Planck equation.This leads to Eq. (4.48) which involves only ψ and ζ.This equation assumes that the entropy mass of the second field is light, that the coupling Ω 2 < H 2 and that the curvature power spectrum is smaller that one, and to our knowledge has not been previously derived.Surprisingly enough, it will be possible to solve the time dependent Fokker-Planck equation on a myriad of cases, which among other consequences show that the derivative coupling Ω, both enhances the Gaussian variance and modifies the PDF introducing a coupling κ ∼ Ω 2 /H 2 .
The structure of the paper is as follows.In Section 2 we study the statistics of primordial curvature perturbations and we show how to integrate out v ζ .We also review some known results in the case of spectator fields on fixed de Sitter.In Section 3 we present the linear Fokker-Planck equation for the curvature perturbation coupled to another light field.Since the distribution is Gaussian is possible to obtain exact expressions for the variances of the fields, that as we will show, match known results using standard techniques.Section 4 contains the main results of this paper, where we study the full non linear Fokker-Planck equation.After integrating out v ζ we will show how it is to obtain (1.3) and under which assumptions it holds.Finally, in Section 5 we conclude and present different ideas to explore in the future.There are a series of appendices where we present technical details of the calculations.

Statistics of primordial curvature perturbations
Before studying the effects of isocurvature fields on the statistics of ζ, we first review the use of the stochastic formalism, showing how it allows a derivation of the probability density function describing the statistics of single fields in quasi-de Sitter backgrounds.

Primordial curvature perturbation
We start by considering the task of deriving the probability distribution of the primordial curvature perturbation ζ.First, let us recall that the canonical quadratic action for ζ describing its dynamics during inflation is given by where M Pl is the reduced Planck mass.In the previous expression, a = a(t) is the usual scale factor, and is the first slow-roll parameter, determined by the Hubble parameter H = ȧ/a as = − Ḣ/H 2 , and required to be much smaller than 1 throughout inflation.
For simplicity, we will disregard slow-roll corrections and take both H and as constants.
Then, the equation of motion for the long wavelength modes (with wavelengths much larger than the Hubble radius H −1 ) is given by One can modify this equation to quantify the influence of short wavelength fluctuations on the evolution of ζ by introducing a source term representing noise [53].The resulting equation takes the form where η ζ = η ζ (t) is a time-dependent Gaussian noise with a two-point correlator given by: where we have identified φ 1 = ζ, and φ 2 = ζ.Equation (2.5) is a Langevin equation with a drift matrix A ij and noise vector f i given by From (2.4) it follows that the noise vector f i must satisfy f i (t)f j (t ) = D ij δ(t−t ), where D ij is the diffusion matrix, given by (2.7) In general, one might be interested in computing correlation functions of the stochastic fields fields φ i of Eq. (2.5).These can be computed with the help of a probability density function P (φ i , t) derived from the associated Fokker-Planck equation [54].The Fokker-Planck equation is determined by A ij and D ij as Using (2.6) and (2.7), we re-express the Fokker-Planck equation in terms of φ 1 = ζ and To solve this equation, let us assume a general Gaussian profile of the form where S −1 ij are the elements of the (symmetric) covariance matrix, whose inverse S ij is constituted by two-point moments as The time dependence of S is determined by (2.9) together with initial conditions.To determine S we can take the fields ζ and v ζ to be coordinates with Fourier transforms p and q respectively.Then, the Fourier transformed version of (2.9) is where P represents the Fourier transform of P .The ansatz given in Eq. (2.10) then implies the following form for P : Replacing this expression back into (2.12)we get the following set of equations satisfied by the elements of the matrix S: (2.16) To solve these equations, we need to impose initial conditions at a given time t 0 .For instance, consider an initial Gaussian distribution (2.10) such that at t = t 0 the matrix S contains initial values (2.17) Solving Eqs.(2.14)-(2.16)with these initial conditions, we then arrive at: The initial values ζv and S (0) vv are the variances associated with long wavelength fluctuations that have already crossed the horizon prior to t 0 .If we are interested only in the statistics of those modes that cross the horizon starting at t 0 , we can set the initial values of S ij to 0. This corresponds to a distribution where the position ζ and rapidity v ζ of the fluctuation are exactly localized at the origin of the field phase space Then, the solutions take the form: Replacing these expressions back into (2.10)we obtain the desired expression for the distribution P .The coefficients S ζζ , S ζv and S vv depend on time with a characteristic timescale determined by H −1 .In the limit t − t 0 H −1 the distribution simplifies to an asymptotic expression given by (2.24) Notice that the widths associated with ζ and v ζ differ in their time dependence, with v ζ sharply localized around 0 (signaling that v ζ decays quickly after it becomes superhorizon).In fact, we may marginalize v ζ by integrating it from the distribution (2.24), in which case we obtain which is a Gaussian distribution for ζ with variance σ 2 ζ given by Recall that this expression is valid for t H −1 , provided the initial condition S ij = 0 at t 0 = 0.The time dependence of the variance σ 2 ζ just reflects the fact that as time progresses, more and more modes populate the long wavelength regime.In this way, at a given time t, the probability distribution P (ζ, t) describes the statistics of long-wavelengths that crossed the horizon between t 0 and t.
whose solution is precisely given by Eq. (2.25).Notice that (2.9) included a term involving a second derivative with respect to v ζ , and only a first derivative with respect to ζ.In contrast, Eq. (2.27) contains a second derivative with respect to ζ.This can be understood as the effect of integrating out v ζ over the Fokker-Planck equation (2.9) with the ideas of Refs.[54,55], which go as follows: First, by noticing that the time scale in which v ζ becomes time independent is given by t v = 1/3H, we can rewrite Eq. (2.9) as: (2.28) Here, the terms on the right hand side (RHS) are much smaller than those on the left hand side (LHS).Indeed if we assume that the fields are given by their typical values ζ ∼ σ 2 ζ then we have that the term containing the time derivative is of order t v σ2 ζ P ∼ ∆ 2 ζ P while the second term is of order 1/ √ HtP .On the other hand the terms on the LHS are both of order O(1) × P .We can make use of this hierarchy if we expand the PDF in powers of Ht v : Then, by replacing this expression back into (2.28)we find that at leading order in t v the equation for the first term P (0) is simply given by whose solution can be written as where φ 0 is a function of ζ and t that can be determined by considering the equation for P (1) : Now, notice that in this last equation the LHS corresponds to a total derivative of v ζ , whereas the RHS is proportional to Gaussian function of v ζ .Hence we can integrate it with respect to v ζ and obtain the constraint equation ∂φ 0 ∂t = 0.This in turns allows us to write a solution for P 1 given by which now depends on another function φ 1 (ζ, t).Repeating the same step we can find, at the next order in t v , the following equation for P (2) : Given that this equation has the same structure as (2.32), we immediately infer, after integrating over v ζ , that Collecting the terms for the PDF we obtain, (2.36) After integrating over v ζ , this result reduces to (2.37) Finally, using the constraint equations for φ 0 and φ 1 we find, up to first order in t v , that P (ζ; t) must satisfy which is the Fokker-Planck equation (2.27) obtained by ignoring the ζ term in the Langevin equation.Going beyond second order in t v does not add any further correction as the equations obtained for the other terms in the PDF expansion are the same as those in (2.34).

Spectator fields in de Sitter
To complement the previous discussions, we now study the statistics of a light spectator field on a de Sitter background.The light scalar field ψ has potential V (ψ) and its equation of motion is given by We take V (ψ) H 2 since the field is light.As we did with ζ, the statistical properties of ψ can be studied by dividing the field into long and short wavelength modes.In this way, the long wavelength field ψ, satisfies the Langevin equation where η ψ is a Gaussian noise representing the effects of the short wavelength modes.The correlation function of the noise term is given by This equation is highly non linear since the drift − V (ψ) 3H is an arbitrary function of ψ.Nevertheless it is possible to find an exact solution.This is done by first noticing that (2.42) has an equilibrium solution which is obtained by imposing that P is time independent.To obtain solutions to (2.42) we can now write P (ψ, t) as where the coefficients Λ n and the functions Φ n satisfy the following eigenvalue problem The time dependence of P (ψ, t) is controlled by the eigenvalues Λ n , which are positive and, for general potentials V (ψ), their value increase with n (with Λ 0 = 0).This implies that the decay rate to the equilibrium distribution is given by 1/Λ 1 .For instance, when the potential is quadratic (V (ψ) = 1 2 m 2 ψ 2 ) one finds the solution of (2.45) is given by Hermite polynomials with eigenvalues given by Λ n = m 2 3H 2 × n.In this case, the solution reaches equilibrium for ∆N Using the decomposition (2.44) it is also possible to deduce the statistical properties of the equal time correlation function G(R), where R is the distance between two points, found as where the coefficients A n are given by From this result, we may conclude that the stochastic approach is valid for a patch of size R ∼ H −1 e H/Λ 1 .For a quadratic potential this is of order R ∼ H −1 e H 2 /m 2 H −1 , which implies that the statistical average occurs over a large number of Hubble patches.This quantity has to be compared with the correlation length of the observed universe, given by ∼ H −1 e ∆N .This implies that a field fits inside the observed universe if ∆N < H 2 /m 2 .
The present analysis assumed a fixed de Sitter background, but it can be generalised to the case of quasi-de Sitter backgrounds, as required to study slow-roll inflation.In this case there are added difficulties.One problem involves the role of gauge transformations on Hamiltonian constraints satisfied at the level of the Langevin equations.In the following discussion we will avoid this issue by assuming that the graviton is decoupled from scalar fields (i.e.we consider the decoupling limit, in which the mixing with gravity is negligible for energies larger than √ H).Furthermore we will assume that the time dependence of the couplings is negligible over the time scales we will consider (typically an e-fold).Within this regime, the dynamics reduces to study the action for the curvature perturbation ζ coupled to an isocurvature field ψ via derivative couplings.In this way the problem is analogous to studying two coupled spectator fields, evolving on de Sitter.

Statistics for two-field inflation
In this section we use the tools introduced in the previous section to derive the probability density function describing the statistics of fluctuations in multifield theories.For now, we shall restrict our treatment to the case of two-field models, and focus on the case of theories with linear interactions.In Section 4 we consider the role of non-linear interactions.
Our starting point is to consider the two-field action (background plus perturbations) describing inflation: where S EH is the Einstein-Hilbert action, γ ab (φ) is a sigma model metric describing the geometry of the scalar field target space and V (φ) is the scalar potential driving inflation.Deviations from a geodesic trajectory are parametrised by the angular velocity Ω.The action for the curvature field ζ and the isocurvature field ψ is obtained by decomposing the fields along tangent and normal directions to the inflationary trajectory (see [6,47,48,56] for a more detailed explanation).The quadratic action is given by [49] where Ω is the coupling between the two fields, and µ is the so called entropy mass of ψ.In addition, we use The equations of motion resulting from the variation of the previous action are where 2 m 2 = µ 2 − 4Ω 2 .Notice that the coupling Ω mixes both fields, but these equations can still be solved using perturbation theory if we assume that the coupling satisfies Ω/H 1.In this case we have that the two point functions for ζ and ψ at horizon crossing, are given by Since ψ is a light field, it can continue evolving after crossing the horizon.When the coupling Ω = 0, ψ seeds the perturbations of ζ which could lead to its correlation functions growing on superhorizon scales [49].As we will see all these effects can be properly incorporated by studying the linearised Langevin equations.
In order to analyse the stochastic dynamics let us note that ψ couples to v ζ .In order to include these terms it is more convenient to use the phase space formulation of the Langevin equations.As explained in Section 2 this is achieved by introducing the time derivatives of the fields in the Langevin equations.For (3.4) these correspond to where η ζ and η ψ are Gaussian noise terms with correlation functions given by Notice that the Langevin equations are coupled, hence the PDF do not factorise into This adds some complications due to the fact that ζ does not reach equilibrium, and so it is not possible to a priori make use of the decomposition (2.44) to find the PDF.
Nevertheless, since the Langevin equations are linear it is possible to find an exact Gaussian solution.In this case all we need to do is to compute the covariance matrix, as explained in Appendix C. Since we have the drift and the noise matrices given by the covariance matrix is given by where we have assumed initial conditions given by P = δ(ζ)δ(v ζ )δ(ψ)δ(v ψ ) and we have set the initial time to zero.Notice that it is not necessary to write the explicit PDF since the variances are given by where φ a is one of the fields and also the corresponding element on the diagonal of C.
Off diagonal elements of C are cross correlations between different fields.Before writing explicit expressions for C(t) let us notice that the coupled dynamics imply that there are several time scales over which the field decays.A useful way of understanding this is by noticing that the time dependence is encoded in the exponential of the drift matrix A.
Since for µ = 0, A is diagonalisable it can be written as A = U DU −1 with D a diagonal matrix containing the eigenvalues λ i of A. Using this decomposition, the exponential of the drift matrix can be written as exp(At) = U exp(Dt)U −1 which implies that the time dependence will appear in terms containing e λ i t .
For the drift matrix A given in (3.8) the eigenvalues λ i are, 0,−3H, −3H(1 + µ 2 /H 2 ) and −µ 2 /3H.Since the integrand will contain factors of exp t(λ i + λ j ) there are three main cases.First if λ i = λ j = 0 after integrating implies the appearance of terms linear in t, which are due to the variance of ζ always growing with time.The others cases arise when one of the eigenvalues is different from zero.When the sum of the eigenvalues is proportional to 3H(1+µ 2 /H 2 ) the term decays after a time t ∼ 1/(3H) analogous to v ζ in the single field case.After integrating this term will generate two pieces, one is constant because it is evaluated at t = t and the other one decays.The third case is when the sum of the eigenvalues is proportional to µ 2 /H.These terms decay on a longer time scale dictated by the isocurvature mass t ψ ∼ H/µ 2 .Notice, moreover, that the decay depends on the isocurvature mass, not on m.As an example let us look at the variance of ζ.From (3.9), this quantity is found to be given by where we have kept only leading order terms and we have assumed Ht 1.As anticipated, the time dependence appears both linearly in t and in powers of exp(−µ 2 t/H).The first dependence is due to the modes entering the comoving horizon as we discussed in Section 1. Notice, however, that there is a new piece which depends on the mass and the coupling of the extra field.The second term is due to the superhorizon evolution of the light scalar field which takes a longer time to decay.At very large times t t ψ this contribution is suppressed and the time dependence is as usual.However, at intermediate times 1 Ht ≤ H 2 /µ 2 the time dependence from the exponential can dominate.To see this, let us expand the variance in small µ There are now cubic terms in Ht, which do not depend on the entropy mass but only on the factor Ω/H.This effect is due to the light field behaving as a massless (ultralight) field before it settles into an equilibrium distribution.This can be seen more directly if we look at the variance of ψ for Ht 1 At late times the exponential can be neglected and one recovers the equilibrium value.

Interestingly at intermediate times 1
Ht ≤ H 2 /µ 2 , expanding in small µ we find which implies that for small (Ht)µ 2 /H 2 the field behaves as a massless field.Later, when the second piece becomes of order (Ht)µ 2 /H 2 ∼ 1 or larger this expansion stops being valid and we need to consider the full expression.It is also interesting to rewrite the variances in terms of the scale invariant power spectrum.This can be done by writing time in terms of the number of efolds after horizon crossing.If a mode with wavenumber k leaves the horizon at a time t k we can then write where k * and t * are the longest scale and time measured.Since the variance of a field φ is computed as it is possible to invert this relation and write ∆ φ as the logarithmic derivative of the variance with respect to k.It is also convenient to express all quantities in terms of e-folds using ∆N = H(t k + t * ).Finally we find that Notice that at t = 0 we have ∆ 2 ζ at horizon crossing acts as an initial condition for ∆ 2 ζ (t), which continues to evolve since the light field has not yet reached equilibrium.Notice that similar formulae were obtained by using the in-in formalism in [52].From the last formulae we also see that, in general, light fields behave as massless fields during a time 1 ∆N 3H 2 /µ 2 .Expanding again in small µ we find that Notice that this is same superhorizon growth described in the case of ultralight fields [49]; this is the case when the entropy mass is exactly zero.We plot (3.16) and (3.17) in Fig. 1.We see that for curvature field there is an initial ultralight phase which will last as long as µ 2 ∆N/3H 2 1.Notice that if this inequality holds until the end of inflation then H 2 /µ 2 ∆N , the ultralight phase is all there is, since the field does not have time to start moving away from a massless distribution.Another feature which we have previously discussed is that the coupling changes the final amplitude of the primordial fluctuation.In the case of the light field we see that the power spectrum eventually decays to zero

Probability distribution function
Having understood the time dependence of the variances, now we would like to derive the PDF of the two-field model.To start with, let us notice that the time scales associated < l a t e x i t s h a 1 _ b a s e 6 4 = " h 6 10 -4 0.01 We plot the scale invariant power spectra (3.16) amd (3.17).We set ∆ 0 ζ = 2 × 10 −9 .The parameters for the black line are µ = 0.2H, Ω = .05,the orange line µ = 0.2H, Ω = .2Hand for the gray line µ = .05H,Ω = .2Hwith the velocity fields t v ψ and t v ζ are much smaller than the scale of ζ and ψ.Using this it is possible to integrate out v ψ and v ζ from the Fokker-Planck equation.We will explain in detail how this is done in Section 4.1.1.The resulting Fokker-Planck equation is where This Fokker-Planck equation is still linear, even though it has a mixed noise term, and it can be solved with the techniques of Appendix C. Further simplification is possible if we consider the following: As explained in Appendix C, the noise term is computed by using the two-point function of the field at horizon crossing.Nevertheless, we have seen that the value of the variance grows with time on superhorizon scales.On the other hand, a direct computation of the two point function also shows a superhorizon growth [49,50] which, of course, is the same result we obtained in (3.16).This can be understood as follows.
As it has been previously pointed out in Refs.[49,50], in the long wavelength limit the equations of motion for ζ decouple if they are written in terms of Notice that in the long wavelength limit the first equation admits as a solution v ζ = v 0 ζ a −3 .Plugging this solution back into the second equation implies that the last source term vanishes for t 3H, hence both equations are decoupled.Nonetheless, the fact that v ζ also depends on ψ explains the superhorizon growth since the curvature mode will depend on integrals of ψ which does not immediately decay on superhorizon scales.
We can make use of the fact that the equation of motion of v ζ is free if we define it as v ζ = ζ instead of the usual definition.In order to do this consistently we also require the noise of the field v ζ to include the superhorizon growth.This can be done by shifting D ζ to be where ∆ 2 ζ (t) is the same as that given in (3.16).We defer a more detailed analysis on how to modify the noise term to Appendix D. Since the variable that appears in the equations of motion is v ζ there is no difference between which variable we use, which implies that the statistics of ζ are the same than for ζ.We can check explicitly by solving the Fokker-Planck equation, which now becomes This equation can be solved using the techniques described in Appendix C. The solution is found to be given by where the variances are defined in (3.11) and (3.13), and is the off-diagonal variance which grows from an initial value κ ∼ HtΩ/f ζ in the ultralight phase to a value of κ ∼ Ht ψ Ω/f ζ when the distribution for ψ reaches equilibrium.Notice that the combination κ/σ 2 ψ is almost constant and is related to the size of the coupling between the two fields.Also we have that σ 2 ζ σ 2 ψ κ 2 and so the denominator of the second term in the PDF is always positive.
After marginalising over ψ we find that the variance of ζ is given by (3.11), as anticipated, hence the Gaussian statistics of this field are independent on how we define v ζ .Clearly the same happens for ψ.Moreover, since all couplings are linear, the variances do not change but the minima of ψ is displaced.By minimizing the PDF with respect to ψ we find that the minima ψ is at which implies that the classical trajectory of ψ is shifted by the interaction.Of course, this does not mean that the statistical fluctuations are modified, since they are still simply given by σ 2 ψ .From now on, for simplicity, we will remove the tilde from ζ.

Non linear interactions in a two-field model
So far we have focused our attention on the second order action (3.2) where it was possible to find an exact solution of the Fokker-Planck equation.Now we would like to consider the role of non linear terms and study how they modify the probability distribution (3.26).As already stated, for a spectator field there are well known techniques which allow us to find non-perturbative solutions.However these techniques are not useful to uncover the PDF for the curvature field ζ since, as we have discussed, there is no equilibrium distribution for ζ.Despite this shortcoming, we will be able to uncover precise non-perturbative effects on the joint distribution P (ζ, ψ), based on the Gaussian distribution we found in the previous section.To do so, our strategy will be to ignore non linear terms in v ζ , while keeping higher order terms in ψ.
Let us start this discussion by writing down the action for perturbations in the case of a canonical two-field model of inflation [57,58].Up to leading order in slow-roll the action is, For simplicity we consider the first few powers of the potential A crucial point is that the action contains non linear interactions between ζ and ψ, which are due to the non geodesic motion of the background trajectory.Apart from those appearing explicitly in (4.1), there are no further interactions between the two fields, which can be understood as arising from the original canonical kinetic term in the action.Non canonical kinetic terms will generate higher order interactions between ζ and ψ which we are assuming to be suppressed.Furthermore, notice that we have not written interactions including gradients coming from gravitational couplings, as they will be negligible for the stochastic dynamics.Finally, as in the linear case, it will be convenient to write the action in terms of Doing this, the action becomes where we have kept terms up to cubic order with respect to ψ.Higher interactions are suppressed by further powers of Ω/H (which we take as a small parameter) although mixed terms are only up to fourth order and they can be reincorporated without trouble.Notice that in the same way as the mass term of ψ becomes the entropic mass µ, other self interaction couplings are also modified, g = g For simplicity, let us examine the case when there is a large cubic interaction for ψ but the equation for ζ can be considered as free.The equations of motion are where we have disregarded higher order self interactions of ψ (which can be included back at any point of our analysis).In order to apply the stochastic approximation for ζ we have to demand that higher order interactions are suppressed.The first term on the RHS of Eq. 4.4 is suppressed for typical fluctuations, since Ω f ζ .We will assume that the third term is at most of the size of the second one.For the second term on the RHS we have that which follows from the fact that Ω f ζ and that the variance of ζ is much larger than the one for ψ for Ω = 0.For larger fluctuations of ζ the inequality (4.5) still holds.Since at leading order Eq.(4.3) is free then v ζ decays after leaving the horizon and the LHS of Eq. (4.4) is negligible.

Fokker-Planck equation
In order to study the effect of v ζ over ψ more systematically we will analyse the stochastic dynamics of the two-field system.For this, we use the strategy employed for the linear case, that is, we coarse grain the fields directly from the equation of motion for the perturbations (4.4).Leading non linearities come only from long-wavelength modes, with interactions involving short-wavelength modes being subdominant.In the end, the effects of short wavelength modes reduce to the same linear noise terms as in the linear Langevin equations [30].Moreover, we will consider the couplings to be small with respect to H so it is possible to treat interactions using perturbation theory.This implies that the noise terms are as in the linear case considered in (3.6).Another simplification comes from the fact that there are no interactions involving ψ (as they are gravitationally suppressed).Because of this, we can neglect all terms with time derivatives of ψ except for the leading friction term.Indeed this is related to the fact that for typical fluctuations ψ H 2 ψ, since in the long wavelength limit we have that ψ where the last inequality follows from the fact that we are considering light fields.By the same argument we may ignore the second derivative of ψ in the first equation.Of course this can be understood as integrating out ψ from the Fokker-Planck equation and the details will be analogous to those examined in the case of the curvature field in Section 2.2.Finally, after separating the equations into long-and short-wavelength modes, we find that the Langevin equations for the long-wavelength mode are As previously discussed, there are two ways of introducing ζ to the Langevin equations.
We follow the simpler one, whereby we consider an extra Langevin equation for the field ζ = v ζ .As we described in Section 3.1, this means that we need to include a time dependent noise for v ζ .After considering these steps, we finally find that the associated Fokker-Planck equation is given by where ∆ 2 ζ (t) is the same quantity found in Eq. (3.16), D ψ = H 3 /4π 2 , and where the variances are given by Before continuing, let us comment on the fact that drift terms including powers of v ζ will become ζ derivatives.This can be understood as a consequence of the shift symmetry of the curvature mode (see also [59]).Using this we can deduce that terms including two derivatives of ζ will change the variance of the ζ distribution, and the tail of the distribution of ψ.Since we are interested in the tail of the distribution of ζ we can ignore them for now and include them later.This is achieved by imposing that the quadratic terms in the drift for ψ is larger than the quadratic terms in the drift in v ζ or, equivalently, that where we have used the fact that v ζ ∼ Hζ.Notice that this is achieved only for Ω relatively large, although not necessarily larger than H.A non zero Ω increases the variance of ζ making it much larger than that of ψ, which otherwise will be very similar.Besides that, if Ω/H is very suppressed, it will make the inequality in (4.12) impracticable.In what follows we will assume that (4.12) holds, and we will comment on its effect on the PDF later.Similarly as we saw in Section 2.2, terms in the drift containing powers of which we will show is due to the shift symmetry of ζ.Using this property, we can neglect the term proportional to v 2 ζ in the drift of ψ since is subleading with respect to the term proportional to ψv ζ .

Adiabatic elimination of v ζ
We have used the variable v ζ since it was a convenient way of studying the derivative couplings that appear in the action.Nevertheless, we are not interested in the statistical properties of v ζ and moreover it decays faster than the other fields.It is then useful to eliminate v ζ from the Fokker-Planck equation and obtain P (ζ, σ) directly.The way of doing this systematically is called adiabatic elimination of fast variables [54,55].We made a similar computation in Section 2.2, which we now generalise to include the coupling with another field.The idea is to expand the probability density function in powers of the time scale of the fast variable.Then replacing order by order it will be possible to factorise the dependence on the fast variable from the slow variables.For our case, it is convenient to write the Fokker-Planck equation as This expansion becomes useful if we write the Fokker-Planck equation as where we have written the mass in terms of the time scale t −1 ψ = µ 2 /3H.Notice that the operator on the RHS of Eq. (4.14) is of order t v /t ψ 1 whereas the operator on the LHS is of order one, which is the reason why the approximation is well justified.After replacing (4.13) into the Fokker-Planck equation this can also be written as a series in powers of t v .At zeroth order in t v we find the following equation Notice that this equation only specifies the v ζ dependence of P (0) .Since this is a first order ODE we can solve it and write P (0) as which is a Gaussian distribution of v ζ with variance given by (3/2)H 2 ∆ 2 ζ (t) in line with the linear solution for P .Of course, had we included higher order terms in v ζ , they would have entered into the LHS of (4.14), and the solution (4.16) would have had to be modified accordingly.To figure out what the restrictions are for P (1) we plug (4.16) into the expansion for P .We find that at order t v Which has a similar structure to (4.15) but with a more complicated RHS.Due to this, it is not possible to immediately solve for P (1) .Yet, it is possible to simplify the last equation by noticing that the LHS is a total derivative and that the RHS is multiplied by a Gaussian function.After integrating over v ζ the LHS vanishes while on the RHS only some terms with even powers of v ζ remain.Since these have to add up to zero, we finally find Notice that the first three terms form a Fokker-Planck equation for φ 0 (ψ, t) and hence are of order 1/t ψ .The last term is of order t −1 v so we can neglect it.We can in principle solve for the ψ dependence of φ 0 , which is just given by a Gaussian with variance σ 2 ψ , although the explicit solution will not be important.Instead, we can now find the solution for P 1 by using the above condition.Equation (4.17) then reduces to whose solution can be written as (4.20) If we now plug this solution into the terms of order t 2 v , we find that This equation is similar to the constraint for P (1) , thus we can again integrate over v ζ .We find that where we have neglected sub leading terms.Notice that the terms proportional to Ω 4 can be recast as subleading corrections to the noise of ζ of the form Ω 2 ζ 2 .These terms have been recently studied [38,59,60] and we will leave their analysis to future work.Instead of finding a solution for P 2 , let us collect the terms in the PDF up to order t v (4.23) Since we are not interested in the full distribution, but only on the one on configuration space, we can integrate out v ζ , which leads to . Finally collecting all the ingredients, taking the time derivative of P (ζ, ψ, t) and using (4.18) and (4.23) we find that which is a Fokker-Planck equation for ζ and ψ after integrating out v ζ .Notice that the couplings between v ζ and ψ now translate into mixed derivatives between ψ and ζ.Of course in the absence of such couplings, the system reduces to two linear Fokker-Planck equations.We can find an analytical solution if we Fourier transform ζ and ψ to p and q.
The resulting equation can be solved by looking for solutions of the form which translates into two independent ODEs for M and L: ) Imposing that at t = 0, M = 0, we find . (4.28) We can approximate the integral using a saddle point approximation to find

.29)
This approximation works well for early times 1 Ht Ht ψ times, when ψ has not reached its equilibrium distribution.At later times a better approximation is obtained by considering that M (t) is time independent and we will comment on this later.Following the same method a straightforward computation shows that L(t) is which is valid for 1 Ht Ht ψ .

Non Gaussian tails
In order to understand the effect of the non linear interactions in the PDF it is useful to ignore first the linear mixing term.This is not well justified since it means that we are ignoring a ψζ interaction that is important for large ζ.Nevertheless its addition does not change the qualitative effect of adding the non linear derivative interaction.As we discuss in Section 3.1, this is because the effect of the linear mixing term is shifting the trajectory of ψ but not its variance.Due to this let us ignore this term for the moment.Furthermore, we see that by doing this, it becomes possible to integrate over ψ analytically which greatly simplifies the analysis.After these considerations let us study the following PDF where we have defined which is a time independent parameter.Notice that this coupling is related to (3.27) through the relation by which it should be clear that κ is related to the size of the interactions.We are interested in the distribution for ζ which we obtain after integrating over ψ.The integral can be done analytically and expressed in term of Bessel functions: Toy model of the distribution described in (4.35).The black region is the when ζ < σ 2 ζ /κ while the orange line is otherwise.The dotted line is a Gaussian distribution that fits the region in the left of the plot.The parameters are made up to highlight the fact that the distribution on the right has a very non Gaussian tail.
To appreciate the distribution we plot it in Fig. 2. From there we see that for it is a displaced Gaussian around the centre but it becomes strongly non Gaussian for ζ > ζ cr ≡ σ 2 ζ κ .We can understand the asymptotic behavior of P (ζ) as emerging from a change on the saddle points in (4.32).Let us notice from it, that the shifted distribution of ζ has the overall effect of changing the coefficient in front of ψ 2 , which becomes negative for large values of ζ > ζ cr .This implies that there are three saddle points for ψ, one at ψ = 0, and other two at The behaviour for the asymptotics of P (ζ) are then similar to the Stokes phenomena, in the sense that for large values of ζ the saddle point changes from 0 to ψ.If we expand around 0 we find a Gaussian distribution for ζ, whereas if we expand (4.32) around ψ after integrating over ψ, we find that valid for ζ ζ cr and that it coincides with the large ζ limit of (4.35).This effect is non perturbative in nature since κ 1 and so typically ζ/κ ≥ 1.Let us now comment on the regime of validity of κ.As we mentioned before Eq.(4.29) was only valid for intermediate times t v t t ψ , whereas for later t ≥ t ψ it is more accurate to consider that M (t) is time independent.Solving for M and L we find that the PDF is given by, g X f k H Q 9 a v c q P Z w X 9 K e / 0 G 1 P J 0 K g = < / l a t e x i t > ⇣ < l a t e x i t s h a 1 _ b a s e 6 4 = " w B j d Z u 9 y b l U Z 9 7 S u t n < l a t e x i t s h a 1 _ b a s e 6 4 = " S F w C L 0 v a w 6 a 4 4 G 6 b r D / o 8 / q X t 7 < l a t e x i t s h a 1 _ b a s e 6 4 = " g p J A s N a e X e 5 f G E A T Y 4 g o 9 with κ replaced by κ and σ 2 0ψ = 3H 4 /(8π 2 µ 2 ) the equilibrium distribution for ψ.In absence of ζ, the PDF in (4.38) reduces to the equilibrium distribution of in accordance with the fact the we are considering the distribution at late times t ≥ t ψ .In this sense, (4.38) is the distribution when the field ψ has settled into its equilibrium distribution.Of course if the field is ultralight then t ψ → ∞ and the transition between κ and κ does not take place during inflation.Furthermore, we see that for t t ψ , κ < κ but otherwise κ > κ since κ keeps growing.Expanding in powers of t ψ we find that from where we can see that κ grows until it reaches the equilibrium value κ .Anyhow since at equilibrium σ ζ ≈ Ht ψ ∆ 2 ζ , we can write κ ∼ Ω 2 /H 2 σ 2 ζ to see that the coupling does not change its dependence on the parameters.In any case is worth mentioning that κ is a limiting value where the time dependence has become negligible.
All of this implies that initially the tail of the distribution becomes non Gaussian at smaller values of ζ until it settles down on σ 2 ζ /κ .When this happens the coupling of the tail becomes constant Still at early times the distribution is more localised, hence the values at which the tail becomes non-Gaussian are smaller, as can be seen in Fig. 3.In the end, even though the saddle point changes at larger ζ, larger values of ζ are more likely due to the growth of σ 2 ζ .Finally, let us pay attention to the fact that the tail is typically very suppressed since we have that since κ−1 1.In order to have a larger effect the power spectrum has to be several orders of magnitude larger than the CMB values.Furthermore, let us note that the critical value for ζ does not depend on the amplitude of the fluctuations ∆ 2 ζ (t), which in the end implies that for smaller values of ∆ 2 ζ (t) the probability for the regions where the tail changes is very suppressed.In any case the tail we have found is always larger than a Gaussian tail, which is due to the fact that κζ/σ To conclude let us stress the point that changing Ω has a large effect, as can be seen in Fig. 3b, where the overall effect is flattening and shifting the PDF.

Local non-Gaussianity
In order to understand how the tail is related to usual perturbation theory, we can estimate the size of the non-Gaussianities produced by the interaction ∼ v 2 ζ ψ we are considering.From the Langrangian (4.2), we have that f NL can be estimated to be of order where the last term corresponds to the off diagonal coupling parameter.Indeed, using (3.27) the last relation can be recast as which implies the following: Perturbation theory is usually under control when f NL ζ 1 which in our case corresponds to expanding around the Gaussian saddle point.Observables can be computed by expanding them in powers of the power spectrum since it is always suppressed.On the other hand, when f NL ζ ∼ 1 perturbation theory fails since the distribution is non Gaussian, which means that the expansion in powers of the power spectrum cannot be justified.We can see this explicitly now since ζ cr ∼ 1/f NL .Naively one should expect that for ζ ≥ ζ cr perturbation theory fails which we now see translates into the tail becoming strongly non-Gaussian.Clearly if f NL 1 one might be worried about corrections from other interactions becoming important for large values of the tail.Nevertheless, since there is a finite number of interactions in the EFT we are considering we can still deduce the behaviour of the tail up to certain (large) values of ζ when other interactions become important.In order to estimate the validity of our results we need to include the terms we have been neglecting so far, which is what we are going to discuss now.

Including the linear mixing
In the previous section we ignored the linear mixing term, since we argued it does not affect the appearance of non-Gaussian tails.When adding it, it turns out that it is not possible to integrate over ψ analytically but it is still possible to obtain the saddle points and check how the tail changes for large ζ.Firstly, let us notice that the point where the saddle point changes is not modified significantly.To see this we can expand the PDF and notice that the terms proportional to ψ 2 in Eq. (4.31) are which implies that for small ζ < 1/3 the saddle point does not change even when Ω 2 /H 2 is very large.Since we are interested in small Ω 2 /H 2 this does not significantly change our results.Next, if we ignore suppressed terms we find that the displaced saddle points are now at The fact that the saddle point now contains a constant piece is reflected in the tail.Indeed, expanding around ψ we find that for large ζ where the different signs correspond to different saddle points.In general there is one correct saddle point which can be picked based on the analytical properties of the PDF, a task which we leave for future work. 3In any case, if we assume that the fluctuation in ζ is large we find that the tail gets a small correction.Finally let us notice that the effects of the linear mixing term can also be understood as f NL ζ becoming large.In this case it corresponds to the interaction L 3 ∼ 2f ζ Ω/Hψv 2 ζ , and so the constant piece of the saddle point, which appears for smaller ζ, appears as long as Ω/Hζ ∼ 1.Nevertheless, the tail of the distribution will not change until Ω 2 /H 2 ζ ∼ 1 where the other saddle point becomes real.
As a side comment let us notice that the coupling does not grow unbounded but it settles into a constant value at approximately t ∼ t ψ , given by as explained in Eq. (3.27).

Adding more interactions
Having understood the leading interaction effect we can now add the rest of the terms to the system.In particular, let us consider a general potential V (ψ).Since the new terms do not add any new shorter time scale than t v we can eliminate v ζ from the Fokker-Plank equation following the same steps we described in Section 4.1.1.This results in the following Fokker-Plank equation where we have ignored sub leading terms in the noise, and expanded to cubic order in t v in order to obtain the last two terms.Before solving the Fokker-Planck equation let us pay attention to the fact that ζ appears only through derivatives of the PDF.This stems from the fact that ζ posses a shift symmetry, by which the only allowed interaction contains time derivatives or gradients.As for the stochastic dynamics, this result in the drift being an explicit function of v ζ .Now, at leading order in t v , we see that this translates into derivatives of ζ.Indeed allowing higher powers of v ζ in the drift, translates into higher derivatives of ζ in the Fokker-Planck equation [33,64].Another important feature of Eq. (4.48) is that, as expected, in the absence of the coupling Ω the equation reduces to uncoupled Fokker-Plank equations whose solution factorises as Finally let us stress that in order to obtain Eq. (4.48) we only need to assume that H 2 , and µ 2 H 2 .The assumption about Ω can be relaxed, but doing that will modify the value of the two point function which will modify the noise function.

Including higher ζ derivatives
Let us now study the last two terms in (4.48) and restrict to V (ψ) = µ 2 ψ 2 /2.These contain two derivatives of ζ, and as expected will modify the variance of the distribution for ζ.As we will see they restrict the range of the Gaussian fluctuations for ψ.Indeed we can find the solution for the PDF at equilibrium to be log where we have neglected other terms that appear in the denominator which are suppressed by additional powers of ∆ 2 ζ .We see from (4.49) that the variance gets shifted for larger values of ψ.Notice, moreover, that for typical values of ψ this effect is very suppressed since Ω f ζ .In any case, at that point the distribution is not valid and other terms will become dominant.This effect would modify the tail of ψ, and as long as we consider small fluctuations around the new trajectory of ψ the description we have given for the distribution in ζ remains valid.If we expand in small ψ we find that there are new terms proportional to ψ 2 ζ 2 .These terms would actually change the saddle point moving it back to ψ = 0 at larger values of ζ.This effect is related to the condition we impose in (4.12).In the end we find that a violation of (4.12) is related to the fact that there is no change in the saddle points due to the quadratic term in ψ.In general we found that for late times and larger Ω 2 /H 2 the effect of these new terms is suppressed.The reason behind this is that for late times t t ψ the variance of σ ψ stops growing whereas the variance of ζ grows until the end of inflation.

General V (ψ)
We have focused only on quadratic potentials for ψ but it possible to study a general potential.To start with, let us neglect the linear coupling and focus only on the leading order interactions.If we Fourier transform in p and write P = exp(−σ 2 ζ /2p 2 ) P we obtain   a Fokker-Planck equation with only derivatives of ψ and with a corrected drift term We can rewrite this equation using the replacement by which the Fokker-Planck equation becomes an eigenvalue problem of the form which is similar to Eq. (2.45) with the particular difference that the equilibrium distribution exp(−v) in this case is time independent.Solving this equation in general lies beyond the scope of this paper, but we can still comment on the case that ψ reaches the equilibrium PDF.Fourier transforming in p we find that the PDF is given by which is a generalisation of (4.38), including general potentials.Let us consider the example V (ψ) = µ 2 ψ 2 /2 + λψ 4 /4 which allows us to integrate ψ out of the equilibrium distribution.This leads to This expression is a bit complicated but we can see that a quartic self interaction reduces the amplitude of the tail.This has an overall effect of making the distribution Gaussian for larger values of ζ.We plot (4.54) in Fig. 4 where it is noticeable how the non Gaussian effects diminish by increasing λ.This can be shown by expanding around the non trivial saddle point.In this case after integrating out ψ we find that Notice that the tail contains a quadratic term, whose variance has a correction which is the inverse of the correction of the coefficient of the tail.This means that increasing λ makes the correction of the quadratic term smaller while the correction for the linear term becomes larger.The effect can be understood by estimating the sizes of the non Gaussianites.Indeed we have that for the quartic interaction f (λ) NL , is of order

.56)
If we compare the ratio between f NL and f NL we find that it coincides with the ratio that controls whether the Gaussian term dominates Let us first detail the case in which the ratio is much larger than one.When this happens, since perturbation theory breaks down for f NL ζ ∼ 1, the action cannot be trusted anymore and the effect of the cubic coupling between ζ and Ω is not seen.Clearly at this point other self interactions have to be taken into account and the PDF at large ζ might become dominated by other higher order terms.On the other hand when the ratio in (4.57) becomes smaller than one the saddle point changes before f (λ) NL ζ ∼ 1 and so the tail becomes non Gaussian.For any non zero λ there it will be a point when f (λ) NL ζ ∼ 1.At that stage the computations of the tail in are not valid.Nevertheless this will produce an exponentially small effect on the whole PDF.

Conclusions
We have studied the statistics of large but rare fluctuations within the multi-field inflation paradigm using the stochastic inflation formalism.In the simplest class of two-field models, the primordial curvature fluctuation ζ interacts with an isocurvature field ψ as a result of turns of the background trajectory in the target space of scalar fields.This translates into a derivative coupling proportional to Ω, the rate of turn of the trajectory, appearing at both linear and non-linear level in the evolution of perturbations.We found that the non-linear interactions induced by Ω imply non-Gaussian deformations affecting the tails of the joint probability distribution of the perturbations.
By assuming that the evolution of the background is close to de Sitter, we derived the Fokker-Planck equation that is satisfied by the probability density function characterising both fields ζ and ψ.When only the linear evolution of the fields is considered, we find that a non vanishing Ω enhances the growth of the variance of ζ.This matches with results obtained previously using perturbation theory.A particular case of this scenario is when the entropy mass is exactly zero (the ultralight limit), studied recently in Ref. [49].We showed that initially all spectator fields behave as ultralight fields after horizon crossing but after some time, that depends on their mass, the fields decay.If the entropy mass is zero, we recover the exact ultralight case.
On the other hand, when non linearities are taken into account, we find that after integrating out ψ the tail of the PDF of ζ becomes non-Gaussian for values of ζ H 2 /Ω 2 .This can be understood as the Stokes phenomenon, whereby a Gaussian saddle point leads to non-Gaussian saddle points for large values of a parameter.Crucially the coupling makes the exponential tail to be larger than the Gaussian tail.Such a result has important consequences.For instance the abundance of PBHs formed during inflation depends strongly on the tail of the PDF.In this way a non-Gaussian tail implies that the abundance of PBHs can be substantially enhanced in models with derivative couplings.Another consequence would be a possible modification on the clustering of galaxies which depend on rare large fluctuations, whose probability would now be enhanced.
There are several paths along which our work can be expanded.For instance, here we considered a simple EFT of multi field inflation in which the number of interaction terms with derivatives couplings is limited.There are other known examples with a larger number of interaction terms which might become relevant for large values of ζ.In those cases one might need to resum the implied corrections to obtain accurate expressions for the tail.Another interesting topic would be to understand how corrections to the Fokker-Planck equation are related to resummation of loops.This has been well understood for light spectator fields on pure de Sitter [20,28,38,65], and a similar result should follow from the Fokker-Planck equation arising from our two field model.Also, it would be interesting to compare our results to other derivations of the Fokker-Planck equations within the multi-field paradigm [27,41,66,67] and its relation with the Hamilton-Jacobi formalism [50,68].On these examples the derivation of the Fokker-Planck equation was done directly from the background equations, whereas in this article we obtained the stochastic dynamics directly from perturbations.
Our results suggest that the abundance of PBHs in multifield models can be much larger than that obtained when the Gaussian approximation is used to study the production of PBHs through the enhancement of the power spectrum.It would be interesting to apply our results to models such us those of Refs.[69,70] to reassess the production of PBHs.To do this one would need to go beyond the assumption that Ω H, implying that some of the terms that in our analysis were suppressed would now become dominant.Finally we see that stochastic inflation might allow one to go beyond perturbation theory.In that sense it will be interesting to understand our results in the light of recent works such as [12,59] (see [71,72] for a discussion on the importance of this for PBHs).For instance it has been suggested that when there is a non perturbative tail there is an exponential enhancement of the large N point correlation function [73].Whether those result apply to the case we study here, we leave for future work.
Eq. (B.2) can be understood as follows, the first piece corresponds to the flat space two point function in physical coordinates.As we move deeper into the bulk this expression dominates.This is expected as we have picked a vacuum that reproduces the Minkowski vacuum.This term dilutes as we approach the horizon.The second term does not depend explicitly on time and it corresponds to the two point function on a scale invariant theory.This is due to the symmetries of inflation at horizon crossing.
In order to compare to the stochastic result let us compute the correlation functions at coincident points.This corresponds to the variance of δφ.Since the expression diverges let us introduce a time dependent cut-off such that For shorter distances we evaluate the two point at the cut-off, which schematically implies that which is the result we have obtained by solving the Fokker-Planck equation.Notice that the time dependence comes from the fact that at each time more modes are included in the region below the cut off.A similar computation shows that at coincident points the φ correlation function goes as where we have pointed out that these are vacuum fluctuations, to distinguish them from the statistically averaged two point functions.To compute this, let us note that after smearing the field the noise function is directly proportional to the smeared speed, as can be seen after taking the time derivative from (A.6).Then, we have that the correlation function is given by Notice that, while at separate time both decay to zero, at equal times the correlation function diverges while the statistical average is finite.This is an effect of smearing out over a region where there are statistical fluctuations, which in the end translates into the correlation function for the speed being finite at equal time.

C General solution for linear coefficients
In this appendix we will follow [54] to derive general solutions of the Fokker-Planck equation.Let us start by considering the following equation, where both A ij and D ij are n × n constant matrices and in addition D ij is symmetric and semipositive definite.Subject to initial conditions The solution of this equation is Gaussian which we will show.First if we multiply the equation for φ i and integrate over φ we find after integration by parts, whose solution is given by φ i = e tA y 0 , (C.4) in matrix notation.Now if we insert φ i φ j into the Fokker-Planck equation we find, Even though this expression looks abstract it can be easily computed .To conclude given that we assume that the distribution was Gaussian then it is fully determined by the covariance matrix, hence we have, Let us check now that the expression for C agrees with what we found before.We first have that, which coincides with the expression we found before.

D Ultralight field
In this appendix we will show how to modify the noise term in the Fokker-Planck equation to take into account the superhorizon time dependence of the curvature power spectrum.This can be done in general but in this case we will focus on the case of an ultralight field [49].Let us start from the Langevin Eqs.Notice that in this case ζ grows outside the horizon due to the interactions with ψ.This implies that the noise term η ζ (t), which is computed by coarse graining ζ, should also grow on superhorizon scales.To do so, let us first recall that for an ultralight field, the two point function for the curvature mode is given by [49]., Notice that it diverges in the limit kτ → 0. However, since inflation will last for a finite amount of time, then there is an natural cut-off for the power spectrum.A more systematic way to deal with this IR behaviour is to regularise the growing logs by introducing boundary counterterms as in [76].Doing so results in the following regularised expression, To compute the covariance matrix we can use the results from Appendix C, which are compatible with a time dependent difussion matrix.At leading order in λ 2 ∆N 2 , and for t 1/H, we have that, where we have used that 1 H log(k * /H) = ∆N , is the number of efolds until the end of inflation and where D ζ = 9H 5 /(8 π 2 ) and λ = √ 2 α/H.This result coincide with the power spectrum computed in [49].Finally let us notice that the faster growth in the variance avoids inflation becoming eternal.This can be seen by the following argument.In general inflation becomes if during an interval t ∼ H−1 the quantum fluctuations δφ 2 1/2 is larger than the classical change of the field ∆φ = φ/H.From (D.9) we have that this is avoid if φ where we have used that at horizon crossing ζ = − H φ δφ.The last inequality implies that the condition for eternal inflation is more strict than in single fields inflation (which is that ∆ 2 ζ ≤ 1).Indeed (D.13) can be written as, Notice that this assumes that the power spectrum didn't vary significantly during the whole inflation.If there is a momentarily increase of the power spectrum, such that λ 1 then the last inequality will not hold.
E Computing further corrections to the PDF.
In this appendix we will include higher order corrections to the solution of the Fokker-Planck equation in (4.31) .
, (E.5) where we have discarded the second solution since it grows for large √ pψ.In order to obtain a simplified expression let us notice that a typical fluctuation of p ∼ 1/ σ 2 ζ .Using this we can deduce that the at leading order f (p) is constant, as the ratio between the two leading order terms is given by, This inequality still holds for larger values of the ζ.If we write f ( p) at leading order we find that the Hermite function reduces to one and we recover the usual distribution (4.32).When adding the linear mixing term the distribution is more complicated but still depends on f (p).If we ignore this term, we find that at leading order the PDF contains further corrections at order p 2 .The effect of those add up to the quadratic terms that appeared in the drift for v ζ .In the end this will modify the value of the tail for very large values of ζ, acting as exponetentially suppressed corrections as expected.

2. 2
Integrating out v ζ Although ζ and v ζ had the same status in the treatment leading to (2.24) we are ultimately interested only in the statistics of ζ (after all, v ζ decays quickly on super-horizon scales).This led us to derive (2.25) after marginalizing v ζ .Alternatively, we can integrate out v ζ at an early stage, and obtain a Fokker-Planck equation only for ζ.For instance, if we neglect the second derivative of ζ in (2.5), the system reduces to a single Langevin equation ζ = η ζ , yielding the following Fokker-Planck equation r 7 X 8 b + 2 8 1 s 6 d 8 2 H V i d D / g 8 N O O 9 p u d z 7 5 1 H b J t B b I C 9 I i r 0 l E d s g e e U 8 O S J d w Y s h X c k O + B S + D d 8 G H 4 O O 0 N W j 8 9 q y R e x X 0 f g G n U d S t < / l a t e x i t > t e x i t s h a 1 _ b a s e 6 4 = " z D O E 5 U k k Z e / N c 6 L d n K 8 p O l 4 S b W R e + 3 d + Q d 9 1 u 9 y n / P T / S h v J M 3 7 C j R S g = = < / l a t e x i t > t e x i t s h a 1 _ b a s e 6 4 = " T l o 7 6 q + G S e b P b f 7 Z m 5 E H I 4 5 x 5 u c k + i O D M 2 D F 8 a w d j 4 l 6 8 T k 1 P N 6 Z n Z b 3 P z C 9 8 7 J s s 1 h T b N e K b P E 2 K A M w l t y y y H c 6 W B i I T D W d L f r / S z P 6 A N y + R v W y j o C n I l W c o o s Z 5 q x / d g y e X 8 S t g K o 5 2 t z S 0 8 A J s 1 2

1 <
l a t e x i t s h a 1 _ b a s e 6 4 = " w B j d Z u 9 y b l U Z 9 7 S u t

5 <
l a t e x i t s h a 1 _ b a s e 6 4 = " g p J A s N a e X e 5 fG E A T Y 4 g o 9 1 2 A v x I = " > A A A C l 3 i c b V H d a t s w G F W 8 b s u y n z Y r u x i 9 E c s G H Y x g B 9 b u r o X C y G U y l j Z Q m y A r n x t h S X Y l u W 1 m 9 A K 7 3 G 3 3 G H u Z P c D e Y 3 J + YH H 2 g d D h n O / w S d + J c 8 6 0 8 f 3 f D e / B z s N H j 5 t P W k + f P X + x u 9 d + e a 6 z Q l E Y 0 Y x n a h w T D Z x J G B l m O I x z B U T E H C 7 i 9 K z S L 2 5 A a Z b J r 2 a e Q y T I l W Q J o 8 Q 4 a j w 4 D L + B I e 8 n e x 2 / 6 y 8 K b 4 N g B T q n r 9 L v 4 Z 9 f b w e T d m M e T j N a C J C

Figure 3 .
Figure 3. a) The figure is the PDF of (4.35).The orange line is for N = 30 and the black line is for = 40.The other parameters are Ω = .2Hand ∆ ζ = 10−3.The dotted lines are the region where ζ ≥ σ 2 ζ /κ b) Both curves are at N = 40 but the green line has Ω = 0.3H.

1 <
l a t e x i t s h a 1 _ b a s e 6 4 = " m O 5 U M l T h 0 X 9 m O 0 X O m p k d w V P 3 p p 2 a M 4 0 S 5 I w 1 e s H 8 6 c i K 0 n o n I d Q p i J n p T K 8 h / a g W j d K w 3 5 p v 4 w z B n M s 0 M S L o c H 2 c c m w QX C 8 F j p o A a P n O A U M X c D z C d E E W o c W s r h x K + 0 0 Q I I s d 5 G M V T m x c X x l N r 1 z U F N g / f Y Q c E / g I W O 2 w 3 3 d c r 9 / W m O 4 r T l Z b + r V 2 t t C t r X V i r R P D / Q b 9 e C x q 1 e t e l d o K W t Y N e o y P 0 F g W o i V r o M + q g H q K I o T v 0 A 9 1 7 B 9 4 n r + 2 d L l u 9 0 m / P P l o r 7 + w B X C f R N A = = < / l a t e x i t > ⇣ < l a t e x i t s h a 1 _ b a s e 6 4 = " g p J A s N a e X e 5 f G E A T Y 4 g o 9 1 2 A v x I = " > A A A C l 3 i c b V H d a t s w G F W 8 b s u y n z Y r u x i 9 E c s G H Y x g B 9 b u r o X C y G U y l j Z Q m y A r n x t h S X Y l u W 1 m 9 A K 7 3 G 3 3 G H u Z P c D e Y 3 J + YH H 2 g d D h n O / w S d + J c 8 6 0 8 f 3 f D e / B z s N H j 5 t P W k + f P X + x u 9 d + e a 6 z Q l E Y 0 Y x n a h w T D Z x J G B l m O I x z B U T E H C 7 i 9 K z S L 2 5 A a Z b J r 2 a e Q y T I l W Q J o 8 Q 4 a j w 4 D L + B I e 8 n e x 2 / 6 y 8 K b 4 N g B T q n r 9 L v 4 Z 9 f b w e T d m M e T j N a C J CG c q L 1 Z e D n J i q J M o x y s K 2 w 0 J A T m p I r u H R Q E g E 6 K h c P t v i d Y 6 Y 4 y Z Q 7 0 u A F + 6 + j J E L r u Y h d p y B m p u t a R f 5 X q x i l E 1 2 b b 5 J P U c l k X h i Q d D k + K T g 2 G a 5 2 g q d M A T V 8 7 g C h i r k f Y D o j i l D j N t c K J d z S T A g i p 2 U Y J 6 k t q w v j 1 N p N T Y E t w w / Y A Y G / g M U O 2 7 r 7 b u 2 + q 7 v j J F 9 r + b Z 2 v d a u r X V h B f V o t s F 5 r x s c d X t D l 9 p H t K w m O k B v 0 C E K 0 D E 6 R X 0 0 Q C N E E Uc / 0 D 3 6 6 b 3 2 T r z P X n / Z 6 j V W n n 2 0 U d 7 w L / W r 0 U 8 = < / l a t e x i t > P (⇣)

Figure 4 .
Figure 4. We plot the log of the ζ distribution given by (4.54) for three different values of g.The black line corresponds to λ = 0, the red line to λ = 10 −5 and the orange line to λ = 10 −2 .The dotted lines are the regions where 3ζ > Ht µ 2 Ω 2 .

5 ) 0 e
It is more convenient to use covariance matrix C ij = φ i φ j − φ i φ j .In which case the above equation reduces to,∂ t C = AC + CA t + D , (C.6)in matrix notation.Now if we write the covariance matrix as C = e tA Ce tA t then we get∂ t C = e −tA De −tA t ,(C.7)where assuming that C(0) = 0 has a solution given by, C(t) = t (t−t )A De (t−t )A t dt .(C.8)

1 2k 3 τ 2 1 +
λ 2 A 1 − A 2 log(k/µ IR ) + log 2 (k/µ IR ) , (D.4)where µ IR is an infrared cut-off.If we set µ IR = H the diffusion coefficient for ζ changes to,D ζ → D ζ 1 + λ 2 A 1 − A 2 log(a) + log 2 (a) .(D.5) .30)Using the solution (4.29) and (4.30) we can immediately compute the Fourier transform in k since the PDF is Gaussian.Computing the Fourier transform in p is non trivial, since p appears in the exponents.In order to Fourier transform we expand M (t) and L(t) in powers of ∆ 2 ζ (0).Notice that this requires ζ ∆ 2 ζ (0)Ω 2 /µ 2 which we will assume to hold true and indeed it does for typical values of ζ and small amplitude of the density perturbations.Retaining terms only up to second order in p and Fourier transforming back, we finally obtain: 2 t 2 λ 2 ) 27H 2where we have kept terms at order √ λ∆N 2 .Notice that this steady state is reached within a couple of efolds.If we drop the slow roll terms, the PDF is given by Ht 3 λ 2 ζv ζ − It is possible to compute the correlation function directly from the distribution by integrating over all the fields φ a φ b = is the horizon crossing wavenumber.Since for modes that have crossed the horizon, t can be written as t * = 1 H log(k * /H), we have that * ( φ 2 ) = ∆ 2 φ , (D.11)where k * . First, let us solve the following, Our task will be to add the last term to the PDF (4.31).To simplify we will look for late time solutions such as ψ has reached its equilibrium distribution.We can eliminate some of the terms in the equation by Fourier transforming ζ to p, and look for solutions of the form, After replacing into the Fokker-Planck equation (E.1) we obtain,0 =F (k) + f 2 ζ H 3D ψ t ψ (6ψ + 4if Ht ψ ∆ 2 ζ Ωp + 4itψ∆ 2 ζ Ω 2 ψp)F (ψ) + 1 3D ψ t ψ (36ipt ψ ψ 2 Ω 2 + 2f 2 H(3 + 2ipt ψ ∆ 2 ζ Ω 2 ))F (ψ).(E.3)Ignoring the linear mixing term the solution is given by, P (p, ψ, t) ∼ exp(−σ 2 ζ p 2 /2)F (p, ψ) (E.2)