Charged and Neutral $\bar B_{u,d,s} \to \gamma$ Form Factors from Light Cone Sum Rules at NLO

We present the first analytic ${\cal O}(\alpha_s)$-computation at twist-$1$,$2$ of the $\bar B_{u,d,s} \to \gamma$ form factors within the framework of sum rules on the light-cone. These form factors describe the charged decay $\bar B_u \to \gamma \ell^- \bar{\nu}$, contribute to the flavour changing neutral currents $\bar B_{d,s} \to \gamma \ell^+ \ell^-$ and serve as inputs to more complicated processes. We provide a fit in terms of a $z$-expansion with correlation matrix and extrapolate the form factors to the kinematic endpoint by using the $g_{BB^*\gamma}$ couplings as a constraint. Analytic results are available in terms of multiple polylogarithms as ancillary files. We give binned predictions for the $\bar B_u \to \gamma \ell^- \bar{\nu}$ branching ratio along with the associated correlation matrix. By comparing with three SCET-computations we extract the inverse moment $B$-meson distribution amplitude parameter $\lambda_B = 360 (110) \text{MeV}$. The uncertainty thereof could be improved by a more dedicated analysis. In passing, we extend the photon distribution amplitude to include quark mass corrections with a prescription for the magnetic vacuum susceptibility, $\chi_q$, compatible with the twist-expansion. The values $\chi_q = 3.21 (15) \text{GeV}^{-2}$ and $\chi_s = 3.79 (17) \text{GeV}^{-2}$ are obtained.


Introduction
In this work we provide the (photon) on-shell form factors contributing to semileptonic B u → γ −ν and flavour changing neutral decay (FCNC)B d,s → γ + − with the method of light-cone sum rules (LCSR). This is the first complete next-to leading order (NLO) computation in α s including twist-1,2 and leading order (LO) twist-3 and partial twist-4. The dynamics, contrary to naive expectation, are more involved than forB → V (ρ, ω, K * , φ) [1,2]. This comes as the point-like photon, of leading twist-1, has no counterpart for the mesons and yet the photon is to be described by a photon distribution amplitude (DA) (analogous to the vector meson case and cf. Tab. 11 for a dictionary). The photon DA can be interpreted as ρ 0 , ω 0 , φ-γ conversion and is normalised by the QCD magnetic vacuum susceptibility χ, for which we propose a prescription to include quark mass effects in Sec. 4.2.1. The twist-1 part leading in the heavy quark scaling FB →γ ∝ m , which we discuss in the hadronic picture in the large-N c limit using a dispersion in Sec. 2.3. In the literature the chargedB u → γ form factors have been considered in LCSR at leading order (LO) in α s in [3][4][5] and a partial (no twist-1) NLO contribution in [6]. The decay has received considerable attention in QCD-factorisation and soft-collinear effective theory (SCET), in part, as a toy model for factorisation and for its prospect to extract parameters of the B-meson's distribution amplitude parameters [7][8][9][10][11][12][13][14][15][16][17]. In Sec. 5.4 we extract the B-meson DA parameters λ B by comparing to a SCET evaluation in three representative models. SCET-investigations include [7][8][9][10] at leading power and [11] at next-to-leading log and partial sub-leading powers. Subleading powers, (partially) corresponding to the photon DA, were accounted for by dispersion techniques [12][13][14]. Other approaches include perturbative QCD [18] and a hybrid approach of QCD factorisation and LCSR [15] to which we compare in Sec. 4.3. A first lattice form factor evaluation has been reported for D (s) , K, π → γ in [19] and further evaluations of D s (K) → γ are in progress [20].
The neutralB d,s → γ form factors describe flavour changing neutral currents which are generic probes for physics beyond the Standard Model. The activity in this particular corner has been revived, since it has been pointed out [21] that, in a certain kinematic region,B s → γµ + µ − may be extracted from the very same dataset asB s → µ + µ − at LHC experiments [22][23][24] (cf. also [25]). This triggered a number of phenomenological investigations [26][27][28][29]. Previously these form factors had been considered, in LCSR at LO [30], in [31] using aspects of heavy quark symmetry (with input from [32]), quark model computations [27,33] and more recently in SCET [28]. In addition, in this decay, the dipole operator m bqL σ·F b necessitates photon off-shell form factors. These have been investigated for the first time, at LO with QCD sum rules [34], utilising the on-shell form factors of this paper as an input.
The paper is organised as follows. In Sec. 2 the form factors are defined and their heavy quark scaling is discussed in a hadronic picture. In Sec. 3 the correlation functions entering the computation, Ward identities and contact terms (CTs ) are discussed extensively. The evaluation at LO and NLO for the perturbative and soft parts is documented in Sec. 4. The numerics Sec. 5 contains our main practical results in terms of fits, plots, the extraction of the inverse moment of the B-meson DA and the binned prediction for theB u → γ −ν rate in Secs. 5.1, 5.2, 5.2 and 5.4 respectively. The paper ends with summary and conclusions in Sec. 6. Appendices include conventions and comments on form factor definitions (Apps. A), gauge invariance of the amplitude (App. B), coordinate space equation of motion (including their non-trivial renormalisation) (App. C), a compendium of photon DAs (App. D), results of correlation functions (App. E) and discontinuities of multiple-polylogarithms in (App. F).

Generalities on theB → γ Form Factors
The section provides the method-independent information on the B → γ form factors (FFs), such as the definition and the heavy quark scaling.

Definition of the Form Factors
Amongst the dimension six operators of the b → q effective Hamiltonian there are the standard vector and tensor operators. Note that the scalar operator vanishes by parity conservation of QCD and the pseudoscalar one only contributes for an off-shell photon (cf. reference [34] for a complete discussion). We consider, in addition, an operator which is redundant by the equation of motion (EOM) but that will prove useful as a check and is of interest per se [35,36]. In complete generality, with conventions as in [26,34], extended to include the charged case, we define the FFs 1 wherem ± ≡ m ± /m Bq , m ± ≡ (m b ± m q ) and QB q = Q b − Q q is the charge of theB q -meson. Comparison with literature-conventions are deferred to the next section. Hereafter, denotes the point-like (or scalar QED) contribution, proportional to the decay constant, and consists of the infrared (IR) sensitive Low-term, which captures the gauge variance of the matrix element on the LHS (for chargedB q ) and assures that the FFs themselves are gauge invariant (cf. Sec. 3.2 and App. B.1). The gauge variance of the matrix element is cancelled in the fullB u → γ −ν amplitude when the photon-emission from the lepton is added cf. App. B.2. The same terms appears in P µ -structure whose subtraction from the FF was stressed in [19,37] cf. Sec. 2.2. Furthermore, s e is the sign-convention of the covariant derivative (cf. App. A), p B = q + k, on-shell momenta p 2 B = m 2 B , k 2 = 0, f Bq the decay constant 0|qγ µ γ 5 b|B q (p B ) = ip µ B f Bq , (2.3) local operators (describing the b → q transition) (2.4) and Lorentz structures (ε 0123 = 1 convention, e = √ 4πα > 0 and P ⊥, ,Low µ ≡ * ρ P ⊥, ,Low µρ ) P ⊥ µρ ≡ ε µρβγ (p B ) β k γ , P µρ ≡ i (p B · k g µρ − k µ (p B ) ρ ) , P Low µρ ≡ i(p B ) µ (p B ) ρ . (2.5) The Lorentz structures are transverse, q · P ⊥, = 0, due to the λ = ±1-helicity of the photon. As usual the Lorenz gauge, k · (λ, k) = 0, is assumed and → + k is the residual gauge invariance encoded as P ⊥, | →k = 0. By writing the Lorentz decomposition of γ(k, )|qiσ µν (1+γ 5 )b |B q (p B ) and using the algebraic relation (A.2) the FF relation can be shown to hold, in complete analogy to the B → V FFs, T B→V 1 (0) = T B→V 2 (0). 1 It is important to keep in mind that the results quoted throughout are for theB-meson. For example, by applying a charge C-or CP -transformation, one infers that the vector (but not the axial) parts change sign V, T ⊥ → −V, T ⊥ (assuming C|B = |B and or CP |B = |B ) when passing from theB-to the B-FFs. The adaption to generic phases under C and CP is straightforward.
A short comment on the ⊥, subscripts. Interpreting the FF as aB → 1 −(+) γ transition, ⊥ ( ) correspond to perpendicular (parallel) 3-dimensional polarisation vectors of the photon and the 1 −(+) state (as can be seen from (2.5)). In the B → V literature, e.g. [38], the ⊥, combinations are known as the transversity amplitudes and are useful because of definite CP-parities.

Form Factor Conventions in the Literature
We wish to explain why we have adopted a different notation w.r.t. to some of the literature. There are three reasons: firstly, in the off-shell case there are 7 FFs [34] for which the F A,V notation becomes inefficient. Second, the subtraction of the point-like term in the charged case following [19,37] (cf. Fig. 9 for comparison). Third, in [26] the sign of the FFs was chosen such that theB d,s → γ FFs are positive, and since the spectator quark is dominant this implies that theB u → γ FFs are negative. The latter explains the sign difference (2.7) w.r.t. to the SCET-literature with focus on the charged transition. Compared with [19] we find V Ds→γ assuming their translation to the SCET-literature. Essentially these are the same conventions as in (2.7) with the point-like part separated, which hopefully will become the new standard. When comparing with [27,33] we believe the following map should be used Note that in comparing analytic results one needs to keep in mind that the signs of χ and s e (and s g ) are author-dependent as well.

B → γ Form Factor in the context of QED-corrections to leptonic decays
Since the B → γ FFs are the real emission corrections to the leptonic decay B + → + ν or part of the FCNC B d,s → + − , we consider it worthwhile to clarify a few aspects with regards to QED-corrections to leptonic decays: the point-like term indicated in (2.1), the absence ln m -terms, the importance of the kinematic endpoint and extrapolations. The arguments below go beyond B-physics since infrared effects are rather universal. In the charged case the photon couples to the charged B-meson, there is a point-like or monopole contribution which can be identified by matching the axial current One then easily finds where the metric term comes from the photon field in the covariant derivative. Upon using k 2 = 0 and * · k = 0 one recovers the form in (2.1).
It seems worthwhile to point out that leptonic decays are free from hard-collinear logs of the form O(α ln m ), as these have been shown to vanish from structure-dependent corrections by gauge invariance (cf. section 3.4 in [39]). Thus they can only arise from the point-like contribution which is itself free from ln m -terms. This can be seen as follows. In the real emission the collinear terms would arise from the Low-term which is proportional to the helicity-suppressed LO term and therefore the logs are of the form m 2 ln m at worst. Finally, since real and virtual point-like contributions cancel each other in the photon-inclusive rate the virtual corrections must be free from such terms altogether. This completes the argument.
In the actual measurement of B + → + ν the undetected photon is soft and q 2 is very close to the kinematic endpoint, which is beyond the limit of the direct LCSR computation. This underlines the importance of the extrapolation as discussed in Sec. 5.1.2 (based on the g BB * γ couplings [40]). A more direct sum rule analysis or the lattice approaches mentioned earlier are of course an alternative thereto. We will report on the impact of B s → γ + − as a background to B d → + − elsewhere.

Heavy Quark Scaling of Form Factors
The heavy quark scaling, m b → ∞, of the FFs has been of wide interest in connection with factorisation theorems. In some cases, such as for the B-meson decay constant f Bq ∼ m 3), the scaling can be traced back to the (relativistic) normalisation fo the B-meson state. For the B → V FFs the same reasoning would lead to m by the dynamics of a fast moving s-quark, released by the heavy b, combining with a slow-moving spectator quarkq. However, theB → γ FFs scale like m −1/2 b because the perturbative photon does not need to hadronise from two asymmetric quarks. 2 The question we would like to pose here is whether this scaling can be understood from a hadronic picture. In the heuristic discussion put forward below we argue that the scaling can be understood from a certain number of low lying vector resonances coupling to the strange part of the electromagnetic current. For this purpose it is useful to consider the photon off-shell FF at first, k 2 = 0, and write a dispersion relation in the photon momentum where "s.t." stands for subtraction terms and by T ⊥ (q 2 ) = lim k 2 →0 T * ⊥ (q 2 , k 2 ) the on-shell FF is recovered. It is convenient to assume the limit of large colours, N c → ∞, for which the spectrum is entirely described by vector mesons of zero width and the above dispersion representation assumes the form 3 . It is difficult to say anything precise for the state aboven since by assumption the LC-OPE technique does not apply to this case and we cannot infer the behaviour in n of the form factor. However, it would be rather surprising if this contribution was leading in the heavy quark limit since the finite sum contribution gives a result compatible with no heavy quark suppression other than kinematic factors. How does the sum of terms from n = 1 to n =n affect the overall scaling? In the limit of large colours the Regge ansatz (n = 1, 2, 3, . . . ) m 2 Vn ∝ n with f Vn ∝ n 0 is phenomenologically successful in that it reproduces the correct logarithmic behaviour of the vacuum polarisation [41]. From explicit computation we know that TB s→Vn 1 ∝ n 0 thus every summand scales like n −1/2 m −3/2 b . If we further assume that every summand is of the same sign then the sum (2.12) behaves like It remains to determine hown scales with m b . Let us imagine to stretch m b by a factor of λ, m b → λm b , then m Vn → λm Vn (but Λ QCD and m Vn /m B fixed), then impliesn → λ 2n and thusn ∝ m 2 b . Hence from (2.13) we finally conclude our anticipated result of (2.14) The argumentation for the other three FFs is completely analogous. This is a satisfactory result in that it matches our explicit results but remains heuristic mainly since we are not able to make a precise statement about the n >n-terms other than that they are unlikely to change the scaling (cf. above).

TheB → γ Form Factors from LCSR
The method of QCD sum rules [42,43] was originally established for two-point correlation functions with a local OPE. LCSR were introduced [44] to describe Σ + → pγ FFs. It has been understood that heavy-to-light FFs are to be described by LCSR rather than 3-point functions with local OPE [45]. We refer the reader to [46] for a review on practical aspects of sum rules.

Definition of the Correlation Function appropriate for the Sum Rule
In the LCSR the B-meson is interpolated by a current, with standard choice and the FF is extracted from the correlation function Above and hereafter we use the shorthand x = d 4 x and Γ = V, T, D is the index associated with the operators (2.4). The mass dimensions of the scalar functions are: [Π V,T,D ] = 1. The index ρ is to be contracted by * ρ and the structures X Γ and Y Γ are related to the CTs and can be unambiguously identified from the QED-and or axial-Ward identity (WI), to be discussed in Sec. 3.2.
The LCSR is then obtained by evaluating (3.2), in the light-cone OPE, and equating it to a dispersion representation. For Π V ⊥ the LCSR reads where the narrow width approximation for the B-meson has been assumed and the ellipses correspond to heavier resonances and multiparticle states. The FFs are then extracted via the standard procedures of Borel transformation and approximating the heavier states by the perturbative integral, which is exponentially suppressed due to the Borel transform and M 2 is the Borel mass, with some minimal detail given in App. A.1. If we were able to compute ρ V ⊥ exactly then V ⊥ (q 2 ), obtained from (3.4), would be independent of the Borel mass and its variation serves as a quality measure of the sum rule. The other FFs are obtained in exact analogy. Hereafter we will abbreviate m Bq = m B and f Bq = f B unless otherwise stated.

QED and Axial Ward Identity
This section can be skipped by the casual reader as it deals with the technical issue of how to project on the FFs and the appearance of the QB-terms in (2.1).
Generally on-shell matrix elements can be extracted as pole residues from correlation functions. However for three-point functions, with a current insertion for the photon, this procedure is complicated by CTs which manifest themselves in the axial-and QED-WI. Whereas these matters have been discussed previously in the literature [47], our presentation is more complete in that we discuss the derivative and tensor case as well and work purely at the level of correlation functions.
A general decomposition of the correlator (3.2) with electromagnetic current insertion reads where the p 2 B and q 2 dependence on the right-hand-side (RHS) is suppressed for brevity. Compared to (3.2) the three structures Y Γ , W Γ and U Γ are introduced. The U -and W -terms are irrelevant after contraction with (k) ρ , as they vanish, and will hereafter be ignored. It is the resolution of the ambiguity between Y -Z and Π Γ that needs addressing.
One can derive a QED WI directly from the path-integral by varying q(x) → (1 + ω(x)iQ q )q(x) and demanding invariance under ω(x), similar to the derivation of the EOM given in App. C.1. For the QED WI we find and for the axial WI (q(x) → (1 + ω(x)iγ 5 )q(x)) one obtains where Q ≡ (Q b + Q q ) and the other terms on the RHS are and where the former follows from (C.3) upon omission of the QED current and the latter follows from gauge invariance. Note that for the tensor part all lower point correlation function vanish since there are not enough vectors to impose the antisymmetry in the Lorentz vectors. The correction term proportional to the sum of charges is due to the derivative term in the O D µ -operator cf.
). From the decomposition (3.5) and the QED WI one infers that The following three points are worth clarifying: ) are compatible with the WI (3.10) and would not contribute to the -FFs.
• The X V -term is the analogue of the Low-term in (2.1). This can be seen by rewriting Bq + . . . , (3.11) using the dispersion representation (3.3) and This derivation becomes exact, reproducing (2.1), when applying the LSZ-limit to the correlation function.
• The analogous X D -term can be treated in the very same way. The missing information is the matrix element which is related to f B by the EOM (C.1) implying f D B = −f B . When factors are arranged one reproduces the CT in (2.1).
In practice, is the correct recipe for projecting on Π Γ . It should be added that the entire discussion of these CTs can be avoided if one were to consider directly a physical amplitude, as then invariance under → + k holds cf. App. B.2. The final recipe to obtain V , with the point-like term subtracted, is to use in (3.4), and similarly for D . The expression iX V corresponds to QBf Bq . From (3.10) and the corresponding WI below (3.8) one can match to the standard pseudoscalar correlator [40]. In the notation of this reference the imaginary part of iX V reads

The Computation
The aim of this section is to describe the computation of the correlation function (3.2) which contain both perturbative and light-cone OPE contributions, as discussed in Secs. 4.1 and 4.2 respectively. These computations have been performed using the packages of FeynCalc [48,49], LiteRed [50], Kira [51], pySecDec [52,53] and PolyLogTools [54].
First we discuss the treatment of γ 5 in dimensional regularisation (DR). The emergence of DR as the standard tool for multi-loop calculations has led to a number of prescriptions for γ 5 to allow for its consistent use. In this work we adopt two different prescriptions depending on the number of γ 5 's that appear in the traces. Our case includes either one or two γ 5 's. For a single γ 5 we adopt the so-called Larin scheme [55], whereby  [55]. In practice this amounts to the NLO replacement in (3.2). When two γ 5 's are present we use naive DR, whereby γ 5 is treated as an anticommuting object with regards to all γ µ matrices (µ = 1...d). We have checked that this is equivalent to replacing each one with (4.1).
As the handling of the γ 5 is delicate it is good practice to subject it to consistency tests. To this end we have checked the EOM at LO and NLO which constitutes a non-trivial test (cf. Sec. 4.4 for more detail). We have also verified the axial WI at LO; however since we project directly on scalar function at NLO the same test is not easily performed, but given the EOM test we can be confident of our results. A further test is that the FF relation (2.6) is explicitly obeyed, which is particularly valuable as it tests γ 5 's of even and odd number against each other.
Before going into some detail on the calculation let us comment on light quark mass m q -contributions which can be relevant for the strange quark. These effects are included everywhere except at NLO twist-1 which would necessitate master integrals beyond the current state of the art. These effects are rather small and the inclusion of m s -effects in the magnetic vacuum susceptibility, that we propose, is a more significant effect. In the perturbative and local OPE contribution the photon is treated as a perturbative vertex with γ(k, λ)|A µ (x)|0 = * µ (k, λ)e ik·x . In the literature this has been referred to as the hard contribution [6]. On grounds of the 1/m b power behaviour, the perturbative part and the condensate part are counted as twist-1 and -4 respectively. We will make this clearer when discussing the light-cone OPE where the notion of twist arises. It is helpful to distinguish the diagrams where the photon is emitted from the q-or the b-quarks, proportional to the Q q -and Q b -charge respectively, which we refer to as A-and B-type. It seems worthwhile to point out that the soft contributions are all proportional to Q q .

Leading Order
The LO diagrams shown in Fig. 1 are straight forward to compute. Besides the perturbative graph we include the quark condensate but neglect the gluon condensate since −m b qq G 2 . Explicit analytic results for the correlation function can be found in App. E. A relevant and interesting aspect is that the A-type quark condensate diagram behaves as Q q qq /k 2 and diverges for an on-shell photon. This can be seen as a heuristic motivation for the photon DA which replaces these diagrams.

Next-to-leading Order
The most labour-intensive task of this work is the computation of the radiative corrections to the perturbative part. This computation is performed off-shell, k 2 = 0, as it provides a regularisation to IR-divergences and we can make use of the NLO basis of master integrals [57]. With three external momenta and one internal mass, intended for X 0 → W + W − with X 0 = {H, Z, γ * }, they are state-of-the-art if the method of differential equations in massive 2-loop integrals. The diagrams are depicted in Fig. 2 with the additional derivative diagrams in Fig. 3. After evaluating the traces in FeynCalc [48,49] scalar products of momenta are efficiently replaced with propagator denominators via the LiteRed [50] package, and then reduced to master integrals by Kira [51]. The master integrals (MIs) are expressed in terms of multiple-polylogarithms (MPLs), cf. App. F. The k 2 → 0 limit is taken and thereafter the discontinuities for the dispersion representation (3.3) are found using the PolyLogTools [54] package. We extended the basis of MIs in [57] to include those that are given by the swapping of external legs. Whereas technically this leads to an overcomplete basis, this was necessary in order to perform both the renormalisation and the k 2 → 0 limit efficiently where cancellation of the IR-divergent ln k 2 -terms takes place. We have included these MIs as an ancillary file to the arXiv version. 4 All integrals, including the new MIs, have been numerically verified against pySecDec [52,53] and PolyLogTools [54]. Following this executive summary we give some more detail below. With the help of LiteRed, scalar products of loop and external momenta appearing in the numerators of the integrals depicted in the diagrams of Fig. 2 are written in terms of propagator denominators such that all integrals are reduced to the following form MI[T, {n 1 , n 2 , n 3 , n 4 , n 5 , n 6 , n 7 }] = d d with n i ∈ Z and T ∈ {A, B}. For the A-type basis, corresponding to diagrams proportional 4 Of course these additional integrals follow by exchanging the external momenta, namely p 2 B ↔ q 2 or p 2 1 ↔ p 2 2 , in the MPLs of their original basis counterparts. However, this would lead to lengthy results and make the explicit cancellation of ultraviolet-(UV) and IR-divergences (k 2 → 0 limit) difficult as the MPLs contain many hidden relations. Hence we found it easier to construct the additional basis elements using the procedure in [57].
to the light-quark charge Q q , the propagator denominators are and for the B-type basis, corresponding to diagrams proportional to the heavy-quark charge Q b , we have, with the usual i0-prescription implied. We note that the A-type and B-type diagrams correspond to the (a)-(b) and (c)-(d) topologies in [57]. The bare Π ,⊥ are subject to renormalisation. Besides the standard renormalisation of the b-quark mass in connection with the self-energy graphs, one need to renormalise the operators of the correlation function. 6 The electromagnetic current does not renormalise, Z J b = Z m b and the renormalisation of the weak operators (2.4) is discussed in App. C.2. In particular the renormalisation of O D µ provides a rather non-trivial test of the computation. After this we perform the k 2 → 0 limit and take the discontinuities aided by PolyLogTools. In order to do this efficiently we use shuffle algebra relations, such as G(a, b, z) G(c, z) = G(a, b, c, z) + G(a, c, b, z) + G(c, a, b, z), (4.8) to reorganise the result into a linear sum of MPLs. The task of computing the discontinuity of the correlation function then becomes one of computing the discontinuity of each individual MPL with details given in App. F. It should be emphasised that the Q b -type MIs are much more involved than the Q q -ones, which is also mirrored in their alphabet length of 18 and 10 respectively [57]. The final step in the computation of the perturbative densities is to translate the MPLs into the more familiar classical polylogs. 7 In doing so we move from complicated functions with relatively simple branch cut structures, to simple functions with complicated branch 5 We note that the MIs for the B-type basis, including the additional MIs (4.7), were also recently computed in [58] in the context of mixed QCD-EW corrections to the HW + W − vertex. 6 There is no need to discuss the renormalisation of overall local 1/ 2 UV and 1/ UV terms due to the three operators coinciding as they are free of discontinuities in p 2 B . Such terms do not contain any physical information and consequently do not contribute to the dispersion integral. 7 Alternatively one could use the HandyG package [59] to perform the numerical evaluation.
cut structures. Consequently, the representation of a given MPL in terms of polylogs depends on the regions in which its arguments lie, meaning that a single expression may not be sufficient to span all possible values of the arguments. An explicit example of this can be seen in Eq.(4.2) of [60]. We have numerically verified that the perturbative densities written in terms of classical polylogs are equivalent to those given in terms of MPLs in the region of interest,

The light-cone OPE diagrams of Twist-2,3 and 4
In phenomenology the light-cone OPE has its roots in deep inelastic scattering, which is an inclusive measurement where the operators on the light-cone are the parton distribution functions P (p)|q(z)Γq(0)|P (p) . The Fourier transform of the light-cone direction p − z + describes the probability of finding a constituent parton of a certain momentum fraction. The application of the light-cone OPE to hard exclusive processes [61] is a different matter, as the light-cone operators correspond to DAs, e.g. 0|d(z)Γu(0)|π + (p) and the Fourier transform in the light-cone direction, denoted by the variable u in this work, describes the amplitude for each parton having momentum fraction u. We refer the reader to the technical review [62] where the connection to conformal symmetry is presented in depth and detail. In a concrete setting an OPE is associated with factorisation. The light-cone OPE corresponds to collinear factorisation and is often schematically presented (in momentum space) as where Π is the correlation function, T H a perturbatively calculable hard scattering kernel, φ i a DA, ⊗ denotes the integration over the collinear momentum fractions (suppressed in the formula), µ F is the factorisation scale and the sum runs over DAs of increasing twist. A necessary condition for factorisation to hold is that the µ F -dependence cancels to the given order in a computation. This corresponds to the absorption of collinear IR-divergences of the hard kernel into the DAs. We will discuss this explicitly in Sec. 4.2.3.

Quark Mass Correction to the Photon Distribution Amplitude
This section serves to illustrate the leading twist-2 photon DA and we present a method to incorporate quark mass corrections, which are sizeable in the case of the strange quark. The leading twist-2 photon DA is given by (D.4) where the Wilson line is omitted for brevity and χ is known as the magnetic vacuum susceptibility, which serves as a normalisation of the photon DA in the sense that f ⊥ γ = qq χ is the analogue of the transverse decay constant f T ρ for a ρ-meson. Its value without the inclusion of quark mass effects (and thus no quark subscript) is χ ≈ 3 GeV, at the scale µ = 1 GeV, known from QCD sum rules [63,64] and lattice computations [65,66].
For finite quark mass m q the local matrix element is UV-divergent. More precisely, it mixes with the identity (additive renormalisation) which is symptomatic of any OPE. Thus including finite quark mass corrections requires a specific prescription which we propose. Our starting point is the non-local matrix element for an off-shell photon γ * (k 2 = 0) where we parameterise (µ-dependence suppressed) and remind the reader that, by definition, χ does not include quark mass corrections. The LO contribution is straightforward to compute where K 0 is the modified Bessel function of the second kind. The ln(−x 2 + i0)-term signals the UV-divergence which is regulated by x 2 = 0. Upon taking the local limit from the start and using DR d = 4 − 2 , we identity at Before discussing the LO-prescription to remove the UV-divergence, let us digress on the sign of the ln(−x 2 + i0)-term. Some time ago Ioffe and Smilga [63] analysed the local matrix element using a UV cut-off Λ and a constituent quark mass. Taking into consideration their sign convention of χ < 0, we expect an increase of Λ 2 ↔ 1/x 2 to effectively enhance χ (recall qq < 0 for m q > 0). The same remark applies to the evaluation of this quantity in the background of a constant magnetic field via its Dirac eigenvalues cf. App. B in [65]. Hence our sign is in accordance with these treatments. 8 We would like to motivative the LO-prescription. The Fourier transform of ln(−x 2 ) corresponds to a zero momentum insertion approximation. We have checked by explicit computation that this leads to the same result when one uses a mass-insertion approximation in perturbation theory. 9 Hence at LO in α s dropping the ln(−x 2 µ 2 )-term is the correct prescription in order to avoid double counting. Whether dropping solely the ln(−x 2 µ 2 )term is the correct prescription for higher order or not is an open question and deserves further exploration. Now that the UV-divergent part is separated it remains to be seen what we can do with the DA piece (4.14) If one could trust the k 2 → 0 limit then one could simply add it to the twist-2 DA Φ γ * (0) in place. Clearly this is not the case since perturbation theory is not applicable for k 2 → 0 as illustrated by the singularity. A well-known solution to this problem is to invoke a dispersion relation [64]. We can build on their work for determining χ in the massless quark limit and add our contribution. The only relevant conceptual difference is that we have to use a once-subtracted dispersion relation because of the logarithmic divergence.
It was found in [64] that the sum rules for higher moments are unstable and we thus focus on the zeroth moment χ q (recall f ⊥ γq = qq χ q ). The once subtracted dispersion representation reads The relative sign, between the χ qq φγ, and the ln(−x 2 )-term differs from [44] (eq A.2) (and [64] eq (2.20,3.17)), when χ > 0 is assumed as stated in those papers. This could be due to χ < 0 convention used in previous publications [67]. Note that there is a typo in that paper. Namely g 2 φ /4π = 9m 2 φ /(4πf 2 φ ) ≈ 11.7 and not the inverse of what is quoted in that reference. 9 This is not a good choice per se as it is IR-divergent. Moreover, if one were to use the ln(−x 2 µ 2 )-term directly, via (4.13), this would lead to an IR-divergence. Of course these issues are resolved if one drops the ln(−x 2 µ 2 )-term and uses perturbation theory with quark masses in the denominator. In particular, the IR-divergence becomes regular mq ln mq.
Assuming the quark content (|ρ 0 [ω] (|ūu ∓ |dd )/ √ 2 and |φ |ss ) and adopting the zero width approximation we may rewrite the dispersion relation as follows where r 0 is the hadronic threshold due to higher states and we have adopted the isospin , in order to evaluate the RHS, besides the decay constants (f ρ , f T ρ ) = (0.220(5), 0.160 (7)) GeV, (f φ , f T φ ) = (0.233(4), 0.191(4)) GeV [36] and continuum thresholds (r [64], one needs the subtraction constant and the spectral density ρ(r). Both of these can be obtained from the local OPE f ⊥ γ * s (k 2 ). Using the results in App. B in [64] and adding our m q -correction we find is obtained from (4.17). Using χ q = f ⊥ γq / qq and adopting the notation where we have kept −k 2 0 > 3 GeV 2 in the deep Euclidean region as appropriate for a short distance expansion. In addition we quote an SU (3)-ratio All results are highly stable with respect to the subtraction point k 2 0 which is a sign of consistency in both the formula and its input, as the individual parts show sizeable variation (for k 2 0 ∈ [2, 10] GeV 2 the results vary less than a percent). The value of (4.20) corresponds to a typical SU (3)-ratio in the non-perturbative realm. The m s -contribution in perturbation theory can be estimated to be of the order of m s /m b ≈ 0.02 and can therefore be dropped.
We wish to emphasise that whereas χ q (f γq ) merely reproduces the result in [64] (within minor variation of the input), χ s (f γs ) is the first determination of this quantity. The quantity χ s in [65,66] should not be compared to the χ s above as it differs in the renormalisation prescription. We re-emphasise that our prescription emerges naturally from the twist expansion. The fact that a correction to a non-perturbative matrix element can be computed with perturbative methods is somewhat exceptional and so let us comment. A key difference to QCD condensates or matrix elements is that f ⊥ γq is of mass dimension one which is taken by the quark mass. In QCD this would involve O(Λ n QCD )-terms. The remaining logarithmic divergence ln Λ QCD can be absorbed into the perturbative part (twist-1).

Leading Order
The LO graphs, corresponding to the light-cone DA contributions to the FFs are depicted in Fig. 4. The expansion is performed up to twist-4, including 3-particle DAs, with the definition of the DAs given in App. D. As mentioned above the Q q qq -term is part of this expansion cf. Eq. (D.11) to see this explicitly.

Next-to-leading Order
Further to the LO graphs we have also computed the twist-2 O(α s ) corrections in the photon DA φ γ (u). The relevant diagrams are depicted in Fig. 5, with additional diagrams for Π D µρ shown in Fig. 6, and the total results given in App. E.3 in terms of Passarino-Veltman (PV) functions. A noticeable feature is that in the Feynman gauge the box diagram A 3 ) vanishes. For the radiative corrections there are issues with regard to IRand UV-divergences to be discussed. A particularly clear discussion of these matters can be found in [68]. First, the necessary conditions for collinear factorisation to hold are (i) that there are no divergences on integration over the collinear momentum fraction u and (ii) that the collinear IR-divergences can be absorbed into the bare DA, φ bare γ , which is not observable. We find that our integrals converge when integrated over u therefore satisfying the first condition. The collinear IR-divergences are absorbed into the LO DA. For this to be possible the divergences have to assume the following form T div H,1 = c T H,0 ⊗ V 0 . In this equation T H,0(1) are the LO and NLO hard scattering amplitudes, V 0 is the LO Efremov-Radyushkin-Brodsky-Lepage evolution kernel [69][70][71] and c is a constant or simply the counterterm [68]. One may use the property that the eigenfunctions and eigenvalues of V ⊥ 0 are known [72] is the n th Gegenbauer polynomials of degree 3/2. The divergencies can then be absorbed into determines c = 1 2 . This is indeed the constant that we find in our explicit computation and thus shows µ F -independence and cancellation of the collinear IR-divergences. This is not a new result as the computation is equivalent to the φ ⊥ -amplitude inB → V FFs where it was verified at the twist-2 level in [1,2] and for the VB u→γ in [6]. 10 Figure 6. Additional diagrams originating from the emission of the photon from the O D µ -operator. In principle there are two additional graphs, one for perturbation theory and for Q qq , where the photon is emitted from the covariant derivative of the O D µ -vertex. However, these graphs turn out to be proportional to * µ and are orthogonal to our projection prescription (3.14), and we therefore do not include them. Further not that if they were truly missing the EOM would not close.
Obtaining the dispersion representation (3.3) of the twist-2 NLO contribution is not straightforward from the representation given in (E.10), which we write schematically . Whereas the function Π ⊥ (p 2 B ) has a cut starting at m 2 b , π ⊥ (u, p 2 B ) has additional cuts in the (u, p 2 B )-variables. Two natural choices present themselves. Firstly, one commits to a certain form of the twist-2 photon DA and integrates over du in (4.26) such that the (poly)logs are functions of the external kinematic variables only and one can then obtain the discontinuity in the standard way. The other possibility is that one uses the fact that a FF's discontinuity is equivalent to its imaginary part. One may write, working with a single subtraction term at p 2 B =s in order to regulate the logarithmic UV-divergence, where Γ 1,2 are paths distorted into the upper complex planes C s,u to avoid singularities.
We have validated this procedure numerically.
In the final dispersion integral, after Borel transformation, the recipe reads where the integration endpoint of Γ 1 , which is ∞, is moved to the continuum threshold s 0 . This integral is numerically very stable.

Comparison of Form Factor computation with the Literature
The twist-1 and 2 LO results agree with [3,4]. At twist-3 and 4 we can compare with [6] and find that the qq , U, A, S γ & T 4γ contributions are missing and our result inS is larger by a factor of six. At NLO solely the twist-2 result has been reported in [6,15].
In the first reference it is presented only in numeric form and the agreement is reasonable taking into account the slightly different numerical input. In [15] analytic results are given but there is a sign difference between the twist-1 and the twist-2 term (cf. their figure 5) in contrast to our result. 11 To continue the discussion at this point it is useful to already take into account some of our numerical results of the proceeding section. Correcting for the mentioned sign, and taking into account the global sign difference (2.7), focusing on twist-1 and 2 one gets VB u→γ The discrepancy is too large in view of our O(10%)-uncertainty. In addition the value V ⊥ = −0.36 is too large in view of the Belle exclusion limit (cf. Sec. 5.2). The breakdown suggests that the culprit is the twist-1 contribution for which two sources can be identified. First, the twist-1 result in [15] is only to leading power in 1/m b and this suggests that the non-leading part is sizeable. Second, the not too well-known B-meson DA parameters and the assumption of a specific model for the DA itself (cf. the discussion in Sec. 5.4). To conclude, our analysis suggests that a posteriori the hybrid approach in [15] is numerically not sound because it does not include the full next-leading power corrections in the 1/m b -expansion.

Equations of Motion as a Test of the Computation
Generally EOM give rise to relations between correlation functions, cf. App. C which often involve CTs. For simple local FFs these CTs are absent and imply, in the case at hand, the following relations We wish to stress that these relations are completely general and thus have to be obeyed by any approach. We further note that the point-like part f (pt) Bq (2.1) cancels between the LHS and RHS in the -equation. 11 We would like to thank Yuming Wang for confirming this sign error.
Reassuringly, we find that (4.29) holds in all parameters for which we have complete results: perturbation theory (twist-1), {χ qq , a 2 (γ), a 4 (γ)} (twist-2), f 3γ at LO in 1/m b (twist-3) and Q b qq (twist-4 local OPE). For twist-1 and 2 this includes the NLO correction in α s . There are some comments to be made on the other cases. The parameter f 3γ is defined from a 3-particle matrix element of twist-3 but could mix into a 4-particle matrix element. This effect is 1/m b -suppressed, which we have checked analytically, and fortunately numerically negligible. The only true twist-4 term that we include is Q b qq and now turn to the reason why, which has to our knowledge not been thoroughly discussed in the literature.
It concerns the consistent inclusion of twist-4 3-particle DA parameters. Our finding is that this demands 4-particle DAs and is based on the following two observations. Firstly, 4-particle DAs of the formq(x)G αβ (vx)G γδ (ux)σ ρσ q(0), which do appear in the light-cone expansion of the propagator cf. eq. A.16 in [73], allow for twist-4 contributions. Secondly, these terms would mix via the EOM, generalisations of Eq. 4.40 [64], into the 3-particle DAs of twist-4. 12 The point is that the EOM at the FF-level need to close in each hadronic parameter separately, to be used in the numerics. In fact we find that of the 3-particle twist-4 parameters only ζ + 2 closes and {κ, κ + , ζ 1 , ζ + 1 , ζ 2 } including the qF q -type contributions (D.25) do not. 13 As there is no argument as to whether any of the FF-combination are to be preferred over others we are led to conclude that one needs to drop those parameters altogether. Fortunately quite a few of them are zero (Tab. 1) and the numerical impact of the others is well below 10% (Tab. 6). Whereas the numbers in the table seem higher for the chargedB u , footnote 13 is relevant in that it probably means that the numbers are overestimated. Note that for the photon DA the omission of twist-4 parameters for the 3-particle case then implies that the 2-particle twist-4 DAs are effectively zero as their only contribution comes from the EOM with the 3-particle DA of twist-4. This is different to the vector mesons where the m 2 V -contributions appear as 2-particle ones of twist-4 and are sizeable for the φ-meson for example [36].

Numerics
In this section we turn to the numerics for which we first discuss the input, the sum rule parameters, correlations of parameters due to the EOM and the fit-ansatz in Sec. 5.1. In Sec. 5.2 FF-plots, fit parameters and values for q 2 = 0 and are given in Fig. 7, Tab. 5 Tab. 4 respectively The correlation matrix can be found in an ancillary file to the arXiv version. The prediction of theB u → γ −ν rate with bins and correlation matrix is given in Sec. 5.2 In Sec. 5.4 we present the extraction of the B-meson DA parameter λ B by comparing to a set of three SCET-computations. 12 For example, for vector meson DAs the 3-particle parameter ζ3ω T , of twist-3, mixes into the 2-particle DA h3 twist-4, e.g. [74], due to the EOM. In the same way 3-and 4-particle DAs are coupled through the EOM and lower twist parameter can always mix into a higher twist parameter. 13 For the charged transitions further twist-4 contributions are needed, namely the photon parton of the photon DA connecting to the charged lepton. Strictly speaking, it is then better to consider the amplitude rather than the FF (beyond the rather trivial issue of the QB-CTs in (2.1)).

Input Parameters, Sum Rule Parameters and Fits
The input parameters, including the values for the photon DAs, are collected in Tab. 1. An important aspect is the choice of the mass scheme. The MS-and the pole-scheme give rise to large effects in either higher twist or radiative corrections which suggests that neither is optimal (cf. Fig. 13 and Tab. 10). The kinetic scheme, originally developed for the OPE in inclusive decays [75], can be regraded as a compromise of these two schemes and we have found that it does indeed give rise to stable results. From Fig. 12 it is seen that this leads to stability in µ kin with uncertainties in the 1-2% range. To our knowledge this is the first time this scheme is used in the context of LCSR and we also adapt it in the closely related work on the g BB * γ -type on-shell couplings [40]. The value of the kinetic mass, shown in Tab. 1, has been obtained using the two-loop relation between the MS and kinetic mass given in [76]. The uncertainty on the kinetic mass has been estimated by adding the error on the MS mass and the one from the conversion formula (variation of In line with other applications we set the scale of the kinetic scheme to µ kin = (1 ± 0.4) GeV.
The sum rule specific parameters, given in Tab. 2, are determined by imposing a series of consistency conditions. Whilst we refrain from assigning a q 2 dependence to the SR parameters we consider the constraints set out below to apply for q 2 ∈ [0, 7] GeV 2 . For the effective threshold s 0 it is required that the daughter SR, which is formally exact, reproduces the known B-meson mass to within 1-2%. The SU (3) ratio of effective thresholds should approximately reproduce that of the meson masses s Bs , which we use as an additional loose constraint. The Borel mass is obtained by considering two competing factors. A large Borel mass leads to faster convergence of the LC-OPE at the cost of greater contamination from the continuum states, whereas a small value of the Borel mass achieves exactly the opposite. We balance these two factors by requiring both that the continuum contributes no more than 30% and that the highest twist contribution remains below 10% of the result. Varying the Borel mass to ±2 GeV results in 1-2% changes in the FF-value which is rather satisfactory. Next, we turn to correlations imposed by the EOM.

Equations of Motion as Constraints on Sum Rule Parameters
Besides providing a non-trivial check of both the computation and the formalism of the OPE and DAs itself cf. Sec. 4.4, the EOM serve an additional purpose in that they correlate the vector and tensor FF as the derivative FF turns out to be small. This observation was first made for the B → V -FFs, with V = ρ, φ, ... a light meson, in [35] and more systematically exploited in [36]. Numerically, the derivative FFs in (4.29) are suppressed by an order of magnitude as compared to the vector and tensor ones. This means that the sum rule specific parameters, of Borel mass and continuum threshold, ought to be approximately equal s V 0 s T 0 and M V M T ; otherwise this would imply an unprecedented violation of quark-hadron duality for the derivative FF.
Running coupling parameters α s (m Z ) [77] m Z [77] 0.1176 (20)   Photon distribution amplitude parameters     Does this still hold for the B → γ FFs? Inspecting Tab. 5 we note that this pattern persists for the ⊥-direction but not for the -direction. As the computation is a 1-to-1 formal map with the B → V FFs for higher twist-2 and above (cf. Tab. 11) this suggests that there is something special going on in the perturbative diagrams (twist-1 and Q b qq ).
In order to understand the origin let us briefly recall how the FF-hierarchy can be understood in B → V . We proceed by investigating the heavy quark limit of the FF. Whereas sum rules do not lend themselves to a heavy quark expansion, since some m b dependence is hidden in hadronic parameters, it is nevertheless possible to extract the leading 1/m b -behaviour by making the following well-known substitutions whereΛ, ω 0 and τ are all parameters of O(Λ QCD ). For B → V it was observed that the derivative FFs are 1/m b -suppressed at O(α 0 s ) [35,36]. This pattern is broken at NLO where they show the same scaling. 14 The B → γ scaling are collected in Tab. 3. The twist-2 parameter χ qq , of course, confirms the earlier findings, whereas for twist-1 it is found that the hierarchy is obeyed in Q q but not Q b . The former is in complete accord with the heavy quark scaling argument, presented in Sec. 2.3 where the Q q -part is inferred as a sum of B → V n FFs. As for the latter we can offer a similar argumentation as in Secs. 2.3, Eq. (2.14), by using the dispersion representation in the limit of large colours where we focus on the resonances for simplicity. The heavy quark scaling (m Vn , f em Vn , T Vn→Bs . Contrary to the case for the Q q -charge, there is no obvious significant alteration for the sum when scaling in m b and we conclude TB s→γ , which is indeed consistent with our findings cf. Tab. 3. This explains the scaling but also makes it clear that the Q b -part has nothing to do with light vector mesons and the ideas behind the large energy limit are thus not applicable. 15 Hence there is no reason to believe that such a hierarchy is in place and the pattern in Tab. 5 is now qualitatively understood. We consider it instructive to give a numeric breakdown for the LO contributions up to twist-2 where q = u, d. Above we see that whilst the derivative FF is small in the ⊥-direction it provides a considerable contribution in the -direction, and is the reason we allow the central values of the continuum thresholds of VB →γ and TB →γ to differ, in contrast to the other FFs. Moreover the numerical suppression of D ⊥ in perturbation theory might or might not be accidental. In view of these findings we apply a moderate (smaller than in B → V [36]) correlation of the tensor versus vector thresholds. Further to the correlations induced by the EOM, the effective thresholds of the FFs and the decay constant are correlated, which is warranted as both thresholds are associated with the B-meson state where q = u, d, s and Γ = V, T .
symmetry breaking corrections in [80] and investigated at NNLO in [83]. The FF relations in [80] are compatible with our finding from the EOM at LO in 1/m b . 15 Charmonium states can be neglected since 0|bγµb|ψcc is rather small.

Fit-ansatz
A generic FF, say F n , is fitted to a z-expansion ansatz, following the same setup as in previous work [34,36], where the leading pole is factored out The variable z describes the following map into the complex plane: . It is noted that t ± → m Bq ± 2m π , m Bs ± 2m K is formally correct but we use t ± as above with no impact on our results in the physical region. We note that whilst t + corresponds to the multiparticle production threshold, the choice of t 0 (and therefore t − ) is arbitrary and does not impact our results. The masses of the lowest lying J P = 1 −(+) states, relevant to the ⊥ ( )projection, are m R = m B * q (B 1q ) . Relations between the FFs provide constraints on the fit parameters. For example (2.6) implies TB →γ Further constraints arise from the dispersion representation. Considering the vector FF for concreteness, one can infer the residue of the lowest-lying resonance 16 with some more detail in [40] along with the definition of the decay constants f (T ) B * ,B 1 and the on-shell matrix elements g BB * (B 1 )γ describing the B * (B 1 ) → Bγ transition. This relation enforces the constraint and similar constraints for other FFs. The main idea of the fit is that the four FFs with 15 q 2 -points (ranging from [0, 14] GeV 2 ) and N samples = 2500 samples are subjected to a single fit providing correlations between the fit parameters of the FFs. This is the same Markov Chain Monte Carlo methodology as used in [36]. Data points are generated from normal distributions correlated as in (5.6), with the exception of {µ F , µ cond , µ αs , µ kin , M 2 , M 2 f B } for which a skew normal distribution is applied. The latter is necessary since clearly  not all parameters have a symmetric validity-range. The skew normal distribution is defined by three parameters, which we take to be the mean, the standard deviation and a shape parameter, α shape . The latter is chosen such that the probability of generating a sample below a certain cut-off is less than 10 −4 . For the scales we impose {µ F , µ cond , µ αs } > 0.8 GeV and µ kin > 0.2 GeV whilst for the Borel parameters we require M 2 > 4 GeV 2 and M 2 f B > 2 GeV 2 , below which the twist expansion is non-convergent. The fitted FFs reproduce the mean values of generated FF data sets to within ≈ 0.5% over the fitted region. The values of the fit parameters are given in Tab. 4, with some more description in the caption. Covariance and correlation matrices along with the central values of the fit-parameters are given as ancillary file appended to the arXiv version of this paper. The JSON formatted file is ordered according to charge, statistical quantity and FF(s) of interest and can be readily used in Mathematica. For example, after loading the data Similarly the correlation between αB u V 0 and αB u T 2 can be accessed via OptionValue[data, "Bu"→"correlation"→"VparaTpara"→"a0a2"] .
The uncertainty and covariance matrix are accessed in an analogous manner.

Results and Plots
The central values of the FFs (2.1) at q 2 = 0 and the residues (5.10) are given in Tab. 5. Plots of the neutralB d -,B s -and chargedB u -modes are shown in Fig. 7. Together with the fit-values given in Tab. 4 of the previous section and the correlation matrix given in an ancillary file this constitutes the main practical results of our paper. A breakdown of contributions, split according to twist and DAs, is given in Tab. 6. We remind the reader that all twist-4 contributions, other than the Q b qq condensates, are dropped as they require the inclusion of 4-particle DAs cf. Sec. 4.4. This effect which is fortunately well below 10% as can inferred from Tab. 5.  Whereas formally the fit-parameter α 0 equals the FF-value at q 2 = 0, on comparison with Tab. 5 we see a deviation of O(2%) between the two. This comes as the FFs are evaluated analytically whereas the fit-values follow from the Monte Carlo Markov-chain analysis (N samples = 2500 random samples) and the deviations are due to slightly asymmetric distribution of the uncertainty.
From the plots one infers the typical behaviour of a FF. Its value at q 2 = 0 is below the normalisation of a current, e.g. the electromagnetic pion FF f π→π + (0) = 1 and it raises as a result of hadronic poles (or multiquark thresholds in partonic QCD). One sees that the constraint from the first pole is well in line with the raise reproduced in our computation for q 2 ≈ 14 GeV 2 and would become progressively worse. In the tensor case the ⊥and -directions only show significant deviations for large q 2 , with the coincidence at q 2 = 0 enforced by (2.6). For the vector FFs this pattern would be reproduced if the derivative FFs were small. However, this is not the case in the -direction, as the vector form factor differ more notably (cf. (5.5) and the discussion of Sec. 5.1.1). Furthermore, we find that for both the twist-2 and leading 1/m b twist-1 contribution (proportional to Q q only) the ⊥and -directions give identical results. Hence any degree of asymmetry between ⊥ and must arise from sub-leading perturbative and/or higher twist contributions. Our decision to drop hadronic parameters that do not close under the EOM from the analysis (cf. Sec. 4.4) therefore acts to increase the symmetry between the two directions.

Comparison with the Literature
In this section we compare our FFs to some of the results in the literature. Comparison with SCET computations of these FFs [28] is not straightforward as the B-meson DA parameters are not known with a lot of certainty. It is therefore advisable to turn the tables and use . The plots are reproduced from the fit-values in Tab. 4 and the uncertainty bands stem from the correlation matrix given as an ancillary file cf. Sec. 5.1 for further details. Whereas we consider the LCSR computation valid up to q 2 < 14 GeV 2 , the ansatz (5.7) with the pole residue from [40] allows one to extend the form factor beyond this region. The sign of the charged form factors is reversed so that they are positive. We refer the reader to App. 2.1.1 for discussion of conventions.
our predictions to set bounds as done in Sec. 5.4. Comments on comparison in terms of analytic computations can be found at the beginning of App. E. Lattice results are available for D d,s → γ and K, π → γ modes in [19] and in preparation by another group [20]. At low photon recoil such computations are of importance for QED-corrections. Whereas we have chosen not to present D → γ FFs, the pole residues of the vector FFs, g DD * γ , are computed in our other work [40] and are consistent with experiment in the measured modes. A notable aspect is that the shape in [19] does not seem to be compatible with the D * -pole playing a significant role close to the kinematic endpoint of D → γ ν (q 2 = m 2 D ). We can, however, compare to the quark model computations [27,33] . Two representative plots of the neutral modes are given in Fig. 8. There is rather good agreement at low q 2 but we can clearly see that our FFs are larger for higher q 2 which ought have a significant impact on phenomenology. For illustrative purposes we have added Fig. 9 comparing the form factor with and without the point-like contribution added.

Predictions for theB u → γ −ν Rate
TheB u → γ −ν decay proceeds via a charged current and depends on the local FF only. Neglecting the lepton mass, anticipating a measurement in the electron mode, the differential rate after integrating over the lepton angle is given by (5.13) TheB u → γ −ν rate will, finally, be measured at Belle II [38]. A 90% confidence level (CL), B(B u → γ −ν ) Eγ >1 GeV < 3.0 · 10 −6 @90% CL, with Belle data has been reported two years ago [85]. This limit is about a factor of two above our predictions of 1.60(37) · 10 −6 given in Tab. 7.

Extraction of the Inverse Moment λ B of the B-meson DA
As previously mentioned, besides being a toy model for factorisation,B u → γ −ν is of interest to extract the B-meson DA parameters from experiment. Here, we replace experiment by our computation which comes with the additional bonus that the O(8%) uncertainty of the CKM matrix element |V ub | = 3.82(24) · 10 −3 [77] is eliminated. This is also useful as |V ub | suffers from some tension between exclusive and inclusive b → u transitions [77] .
Let us turn to some minimal definitions in order to clarify the context. The B-meson DA of leading twist-2, φ B + was originally introduced in [86]. Its inverse moment is a genuinely unknown non-perturbative parameter λ B = O(Λ QCD ). Above µ is the renormalisation scale (or factorisation scale in processes), assumed to be µ = 1 GeV if not stated otherwise. Mainly due to its non-local character it has so far evaded a direct Figure 11. Plots taken from [14] (coloured curves) overlaid with our predictions for the integrated branching ratios (cf. Tab. 7) scaled to account for the fact that in [14] |V ub | excl = 3.70(16) · 10 −3 is used. This results an effect of a factor of 0.938 on the branching fraction. Solid black lines represent our central values with uncertainty bands indicated by dashed lines. The colour coding of the plots corresponds to the different models cf. (5.16) and [14] of course.
first principle determination. In computations using the heavy quark limit it is often the leading uncertainty. More precisely, at NLO there are two further non-perturbative parameters that appear, the inverse logarithmic momentsσ 1,2 e.g. [9,11]  In essence a separate (indirect) determination of these parameters is not feasible. Thus actual studies one is forced to use a model-ansatz for φ B + . This is done in reference [14] for a 3-parameter family, which permits an analytic evolution at O(α s ). This 3-parameter model is further constrained to three 2-parameter models which give rise to the following ranges inσ   Fig. 11 for the three φ B + models, rounded to 5 MeV. The rightmost column contains the mean value over the models for each minimum photon energy cut, rounded to the nearest 10 MeV and the E min γ = 1.5 GeV is selected as the our final value in (5.17).
In Fig. 11 we have overlaid our branching fraction predictions, shown in Tab. 7, with the plots from [14]. From Tab. 8 one sees that the values predicted for a given model are largely stable on the minimum photon energy cut. There is, as expected, some dependence on the specific model of φ B + . However, averaging over the ranges of the three models (cf. right hand column of Tab. 8), one again sees the consistency between the different photon cuts, which suggests good agreement in the shape for these values.
Whereas E min γ = 2 GeV is arguably the best result because of the range of validity of the approaches, it still seems preferable to take an average of the three values for λ Model Avg. We now turn to discussing the value obtained in relation to other determinations. The previously mentioned Belle measurement from 2018 sets a bound of λ B > 240 MeV @90% CL using [14] as the reference input. A direct determination via a QCD sum rule, which however uses non-local quark condensates which are not free from model-dependence, was performed in 2003 and gave λ B = 460(100) MeV [89] (and the previously quoted value forσ 1 ). An update [90] using improved numerical input and two models for the B-meson DA yields λ B = 383(153) MeV, a value closer to ours. The same strategy as here has been applied at LO [6,91] and NLO [92,93] with various specific model functions for φ B + . The B → γ [6] LO-analysis has been done in the heavy quark limit and resulting in λ B = 600 MeV with no given error. The LO B → π analysis with an exponential model gave λ B = 460(160) MeV [91]. At NLO there are the works on B → π [92] and B → ρ [93] with λ B = 354 +30 −38 MeV and 343 +64 −79 MeV respectively for which we quote the exponential model values (cf. those references for other model determinations).
To conclude, from Tab. 9 it is seen that the restriction to a single model does reduce the error considerably and underlines the importance of the QCD SR determinations. In summary, the picture looks consistent but a dedicated heavy quark analysis of the NLO B → γ analysis must ultimately be the most reliable method as it will not involve any photon DAs. We anticipate returning to this issue in a forthcoming publication.

Summary and Discussions
In this work we have computed theB u,d,s → γ form factors at NLO for twist-1,2 and LO in twist-3,4 with light-cone sum rules. This involved the evaluation of the diagrams in Fig. 2 which was the technically most challenging part of this work and involved the use of a series of tools [48][49][50][51]54] and state-of-the-art master integrals [57]. The form factors are provided as fits to a z-expansion ansatz (5.7) with fit-parameters given in Tab. 4. The correlation matrix can be found in an ancillary file to the arXiv version. Form factor values at q 2 = 0 and plots are presented in Tab. 5 and Fig. 7 respectively. This constitutes our main practical results. Analytic results of the correlation functions in terms of Goncharov functions are given as an ancillary file (Mathematica notebook) to the arXiv version. An important technical aside is the choice of the m b mass-scheme for which we have adapted the kinetic scheme, which shows good scale-stability (cf. Fig. 12) unlike the pole-or MSscheme for which either the α s -or twist-corrections are unnaturally large. Besides the form factors themselves, we have established a few valuable theoretical results and advances. In Sec. 2.3, starting from a dispersion representation, we motivated the m  (4.29) were verified to hold in all hadronic parameters for which completeness can be expected. In doing so it became clear that in order to include the twist-4 3-particle distribution amplitude parameters, given in Tab. 12, one needs to include 4-particle DAs of twist-4 which have not yet been classified. ForB → γ form factors the impact is fortunately rather small. The charged form factors describe theB u → γ −ν decay, anticipated to be measured at Belle II [38], for which our differential rate prediction, binned and including correlation matrix, are given in Sec. 5.2. Comparing to SCET-computations, of three models of the B-meson distribution amplitude, we have extracted its inverse moment (5.14), λ B = 360(110) MeV. 18 The neutral form factors are the input to the flavour changing neutral current processB d,s → γ + − , which will come into focus by LHCb in the muon-mode [96][97][98][99], alongside off-shell form factors [34] and genuine long distance contributions [26][27][28]. The scope of applications of theB → γ form factors does not end here as they can serve as inputs, to related processes, in terms of subtraction constants of dispersion relations. Examples are, the weak annihilation process inB → V γ e.g. [100], B s → µµ axion/dark photon for which generic photon off-shell B → γ * form factors are needed (cf. App. A [34] for how dispersion relations can bridge the k 2 ≈ m 2 ρ,ω,φ resonance region) andB u → + − −ν (requiring q 2 , k 2 = 0) as recently investigated in [101,102]. Our form factor predictions should be helpful in reducing the theory error in all of these modes and enhance the new physics sensitivity.

Acknowledgments
We are grateful to Gunnar Bali, Martin Beneke, Christoph Bobeth, Vladmir Braun, Matteo Di Carlo, Paolo Gambino, Einan Gardi, Yao Ji, Chris Sachrajda and Yuming Wang for useful discussions and to James Gratrex for careful reading of the manuscript. We are indebted to Thomas Gehrmann and Roberto Bonciani for guidance in the master integral literature, Claude Duhr and James Matthews for help with Polylog Tools and Stefano Di Vita, Pierpaolo Mastrolia, Amedeo Primo & Ulrich Schubert for useful exchange and guidance with regard to [57]. BP is also grateful to Calum Milloy for useful discussions and guidance regarding the method of differential equations and the subtleties therein. RZ acknowledges Damir Becirevic for the kind hospitality at IJCLab, Pôle Théorie Orsay-Paris when this work was in its initial stages. RZ is supported by an STFC Consolidated Grant, ST/P0000630/1. BP is supported by an STFC Training Grant ST/N504051/1.

Ancillary files
We include a number of ancillary files. A file with the renormalised NLO correlation functions for twist-1 in terms of MPLs (corr PT NLO.m); the new set of master integrals (MI vzzb.m) discussed in Sec. 4.1.2 which is linearly dependent on the basis [57] but of value to make cancellations explicit. The z-expansion fits, given as a JSON file (BtoGam FF.json), 18 It seems worthwhile to mention that comparing theory computations with each other eliminates the sizeable |V ub |-uncertainty. Moreover, this analysis is to be regarded as a crude first approach and can be improved by doing a fully differential analysis, including the two logarithmic momentsσ1,2 into a 3parameter fit, which we might turn to in a future publication. with some description in Sec. 5.1.2 and the notebook (FF-plots from json.nb) that exemplifies its implementation.

A Conventions and Additional Plots
We define the covariant derivative as where e > 0 (α = e 2 /(4π)), Q e = −1, Q u,c,t = 2/3 and Q b,d,s = −1/3. This leads to a Feynman rule −s e ieQ f / A − s g ig s / A a T a . Note that the sign of the FFs (2.1) depends on s e which we therefore prefer to keep explicit. The sign of the gluon term s g is also kept which appears in the light-cone propagator in the background field (D.28) as well as the 3particle DAs. The final results are independent of s g and the rate is of course independent of s e , but attention must be paid when compared or combined with other work. Using the g = diag(1, −1, −1, −1) metric the following relation holds, with ε 0123 = 1 and γ 5 = iγ 0 γ 1 γ 2 γ 3 . Moreover we have T a T a = C F 1 with C F = (N 2 c − 1)/(2N c ) and α s = g 2 /(4π) as usual.

A.1 Borel Transformation
The Borel transform is defined as where Q 2 = −q 2 is a Euclidean momentum and Q 2 /n = M 2 B defines the so-called Borel mass.
For the perturbative part and the condensates, which are poles at tree level, one needs the following Borel transformation The DA-part can be reduced to a form du 1 ∆ n (n ≥ 1) for which simple rules for Borel transformation are given in App. A in [36]. Of course one could integrate over the DAparameters du or Dα, then subject the result to a dispersion relation and use (A.4). The advantage of using the former method is that one does not need to commit to a specific form/truncation of the DAs.

A.2 Plots of Scale Dependence and Twist-contributions
In this appendix we collect additional plots and tables. The merit of the NLO versus a LO analysis shows in the mass scale dependence plots in Fig. 12. The advantages of the kinetic-scheme over the pole-and MS-schemes become apparent in Fig. 13 and Tab. 10, respectively. twist DA    Columns represent the percentage contribution of each DA to "Total * ". We remind the reader that the asterisk in total indicates it includes twist-4 contributions that do not close under the EOM (cf. Sec. 4.4). Contributions from 3-particle DAs have been dropped for brevity. Formally one should redetermine the sum rule parameters (cf. Tab. 2) for the MS evaluation, however we choose to use the same parameters as for the kinetic evaluation as this is sufficient to illustrate our point. We note the kinetic scheme shows greater suppression of the higher twist contributions. Additionally, in contrast to the kinetic scheme, radiative corrections in the MS-scheme act to cancel with the LO results.

B Gauge (In)variance and the m = 0 Amplitude
In this appendix we intend to clarify a few issues around gauge invariance.

B.1 Gauge-variant part of the Charged Form Factor
In Sec. 3.2 it was shown how the QB-term in (2.1) appears in the LCSR computation and below its consistency is illustrated on general grounds. The QB-term is present since O µ itself is not gauge invariant (charged) in the case where the weak current is charged. Let us write the matrix element on the LHS of (2.1) as follows and then perform a gauge tranformation → + k, Figure 13. Plots comparing the relative size of the radiative corrections to twist-1 and -2 in the pole-and kinetic-scheme, for two representative form factors. We observe large corrections in the pole scheme, with the O(α s ) contributions being in the region of 50-60% and 40-50% of the LO result at twist-1 and -2, respectively. In contrast, using the kinetic scheme at µ kin = 1 GeV, we find the NLO corrections are approximately ∼ 20% and ∼ 10% of the LO result at twist-1 and -2, respectively.
It is readily seen that this matches the gauge variation of the Low-term on the RHS in (2.1).

B.2 Gauge Invariance Restored, in theB u → γ −ν -Amplitude
Here we illustrate that the total amplitude, emission from the B-meson (i.e. FF) and the lepton leg, is gauge invariant. The total amplitude A is proportional to where H eff ∼ L µ O V µ and L µ =¯ Γ µ ν is the charged lepton current with some unspecified Dirac matrix Γ µ . The total amplitude must be invariant under the residual gauge transformation → + k, which is the case if charge conservation is imposed (Q b = −1/3, Q u = 2/3 and Q = −1).
In conclusion, the total amplitude is indeed invariant under → + k.
The m -case is special in that the emission term is local and allows a redefinition of the FF to absorb all effects [11]. We follow this procedure in our notation as we consider it worthwhile to make contact with the existing literature.
• In the m = 0 case the emission from the lepton is proportional to the tree level matrix element. Essentially * where p B = k + l 1 + l 2 and the EOM was used for the ν. This allows to redefine the FF in such a way that the amplitude is proportional to the FF only.
• Focusing on the γ 5 -part of the FFs (2.1) where the O(q µ ) terms are irrelevant as they vanish in the massless case when contracted with the vector or axial lepton matrix element.
• By gauge invariance of the full amplitude, as argued above, it is then clear that the extra * µ -term must cancel the emission term and thus the full amplitude can be written as It seems worthwhile to comment on why in the m = 0 approximation the pointlike structure (terms proportional to f B (2.2)) disappears. This is the case since the point-like term is given by the LO matrix element times * ·p B and other factors. This is a consequence of Low's theorem [103]and means that as far as angular momentum conservation is concerned it behaves like the LO matrix element. The additional photon forms a Lorentz scalar with p B and is thus in a relative p-wave to the Bmeson. Finally, since the LO matrix element is helicity suppressed (proportional to m ) this contribution vanishes entirely in the m → 0 limit.

C Correlation Functions and Equations of Motion
In this appendix we wish to clarify a few technical aspects regarding the EOM.

C.1 Derivation of the Equations of Motion at Correlation Function Level
To what extent the naive EOM 19 ) are altered in correlation functions by CTs is a typical field theory problem. Below we give the correlation function version of (C.1) with insertion of the photon and B-meson currents. The main result is that with the definition of Π , compatible with the QED WI, the EOM are satisfied in the most straightforward way This equation is the basis for the simple relation (4.29) and the usual statement in the literature that the EOM hold on physics states. Below we show how (C.2) emerges, despite a few CTs on the way, by starting with correlation functions in coordinate space.

C.1.1 Current Insertion (hard Photon)
By variation from the path integral under q(x) → q (x) andb(x) →¯ b (x) and requiring independence in q (x) and b (x) one gets two equations which can be combined to 0|Tqi( one gets the useful forms for the ⊥-direction and the -direction The QED-covariant derivative on σµν is needed in case Q b = Qq. With due apologies for the notation, here Dν (without arrows) is to be understood as a total derivative.
Above all derivatives are understood as being outside the timer-ordered products. This leads to particularly simple rules in momentum space. In order to make contact with our previous discussion we take the Fourier transformation x e iqx y e −ip B y . All CTs are absent for the ⊥-direction, while for the -direction the δ(x − 0)-and the δ(x − y)-terms lead to correlation functions in the variables p 2 B and k 2 respectively. Crucially, when contracting with the photon polarisation tensor * ρ these terms are proportional to * µ and thus do not contribute to our projection prescription (3.14). Hence for the current insertion the EOM do indeed assume the simplest possible form (C.1).

C.1.2 Distribution Amplitude Part (soft Photon)
If the photon is not created by the perturbative current then the analogues of (C.5) and (C.6) are given by where γ|A µ (y)|0 = 0 by parity and γ|V µ (y)|0 = 0 when the photon is not point-like, as is the case here. Not surprisingly the EOM assume the same simple form (C.1) as the current insertion contributions.

C.2 Renormalisation of Correlation Functions and Equation of Motion
Since the (non)renormalisation of the EOM has only been briefly discussed in [36] and involves some mixing of operators we consider it worthwhile to revisit this matter, thereby showing the consistency of our results with the computation of the renormalisation matrix [36].
In order to clarify the notation we remind the reader of the renormalisation of a Wilson coefficient in the OPE. The Wilson coefficients correspond, after Fourier transformation, to the hard scattering kernels T H in the notation in (4.9). We may discuss the renormalisation in the language of the local OPE.
We use the notation O to denote the relation between bare and renormalised operators. The bare renormalisation constants include the wavefunction renormalisation Z q . The local OPE is parameterised by . . ϕ denotes a matrix element over physical states and summation over the index C is understood. The OPE equation may be written as where it was used that the renormalisation matrix of the Wilson coefficient and the operator are each other's inverse. This is valid in mass-independent schemes such as DR with MS. In our case: (i) O B → J Bq which does not renormalise and thusZ BB → 1 in the formula above, (ii) O C → O T ∼qσ µν q which renormalises multiplicatively, and we use which the renormalisation is more involved [36]. We may simplify the task by absorbing the operator O 4 in [36], [36] , 20 for which the renormalisation matrix, computed in [36] cf. App. A, reads (C.10) Above and hereafter 21 ∆ = C F α s 4π and Z q = 1 − 2∆ (Feynman gauge) was used to translate from Z AA toZ AA . In this basis the non-renormalisation of the EOM, is expressed by the columns ofZ AA adding up to zero. Above the dots stand for single insertion of fields leading to CTs on the LHS. If the field insertions are absent the EOM simply assume the form (O In summary, dropping the index B, (C.8) simplifies to After this general discussion it remains to adapt to the operator basis used in this paper. The relation is as follows (C.14) In the basis (C.14), in which our correlation function are presented, the UV-counterterms assume the form taking into account that the O V -O mV relation (C.14) implies a sign flipZ 13 and dropping +6 inZ 33 (it corresponds to the mass renormalisation Z m = 1 − 6∆). Concluding, we have explicitly checked that the renormalised Π Γ quantity Here we focus on the basis relevant to the ⊥-direction. The one for the -direction is obtained by b → γ5b and mq → −mq. 21 This equation corrects an obvious factor of 2 typo in v3 of [36].
is free from UV-divergencies when expanded to order O(α s ). Above the superscripts (0) and (1) stand for LO and NLO respectively. We have explicitly verified (C.16) in our NLO computations, that is twist-1 and twist-2. As usual this shows consistency of the explicit computation and that Z AA is implicitly contained in the computation of this paper.

D Compendium for the Photon Distribution Amplitude
In this appendix we collect the information on the photon DAs. The main paper on the photon DA is [64] where the powerful background field technique is used. The photon DA was originally introduced in [44]. The reasons for writing such an appendix are at least threefold. First we collect the DAs used in this paper in more a compact way, second we include quark mass corrections and thirdly the DAs are presented in a format which is more useful for computations (e.g. elimination of 1/kz-factors through integration by parts and replacing ⊥ etc by standard vectors). The most sizeable quark mass corrections are due to twist-2, the correction to the magnetic vacuum susceptibility Sec. 4.2.1 ,and the mixing into the twist-3 DA via the EOM.
In reference [64] the photon DA are given for an external photon field (i.e. 0| . . . |0 Fµν ) as well as for a single photon state (i.e. γ(q, )| . . . |0 ). The two are related by assuming a plane wave for the external field. In this appendix we use the latter. The photon DAs and their hadronic parameters are defined in Secs. D.1 and D.2 respectively. A small dictionary between DAs of the photon and vector mesons is given in Tab. 11. The light-cone propagator in the background gauge field is given in the Sec. D.3 and which tangentially fits into this appendix.

D.1 Matrix Elements of the Photon Distribution Amplitude
The photon DAs are largely analogous to the vector meson ones, cf. Tab. 11. We identify two main differences. Since the photon is massless, the pure m V -terms are zero 22 and γ|qF µν q|0 matrix elements play a distinguished role as they do not introduce new hadronic parameters.
It is convenient to first introduce ⊥ and g ⊥ µν [104] g µν which are orthogonal to the light-like directions z and k (photon momentum): ⊥ k = ⊥ z = 0 and (g ⊥ q) µ = (g ⊥ z) µ = 0. 23 It is also worth noting that ⊥ is invariant under → + k which is the residual gauge transformation of the Lorenz gauge (k) · k = 0. In the formulae we use g µν and µ rather than the ⊥ version. Factors 1/kz that arise from these quantities are treated, as usual, via integration by parts 1 kz an artificial m 2 V term in their matrix element definition which might led one to think that they vanish altogether for the photon. 23 Above and hereafter we (often) suppress the contractions for brevity, e.g. k · z → kz.
where 1 0 dvf (v) = 0 is assumed and has to hold in order to produce a smooth kz → 0 limit. More concretely, we use the notation When comparing with [64] it should be taken into account that these authors use s e | BBK = −1 and s g | BBK = −1 which is more of a standard in the DA-literature.

Twist-2:
The only twist-2 DA φ γ is defined by where k [α z β] = k α z β − k β z α hereafter. and Wilson lines are not shown for brevity. Here χ q refers to the magnetic vacuum susceptibility for a quark q. 24 The incorporation of quark mass effects is not-trivial as it requires a subtraction prescription and our proposal is presented in the main text in Sec It is noted that RHS is still explicitly gauge invariant.
Twist-3: There are two 2-particle DAs 25 which are the analogue of g (v,a) ⊥ for vector mesons however since the photon mass is zero all its contributions enter through the EOM. In addition there are mass corrections, for which we follow the notation in [36] 24 Unlike in [100] we do not adapt the sign of χ to the convention but keep χ > 0. 25 The notation εµ( * , k, z) is a Levi-Civita tensor contracted with three vectors * , k and z.
which generalisesδ + (V ) = f ⊥ V 2m q /(f V m V ) from the vector meson case upon inspection of the matrix elements Finally there are also the 3-particle DAs 26 which also contribute via the EOM. AboveG αβ ≡ 1 2 αβγδ G γδ with G αβ = ∂ [α A β] + .. and Dα = dα 1 dα 2 dα 3 δ(1 − i α i ) is the measure of the momentum fractions. And finally, the hadronic parameter f 3γ is defined by (D.10) Twist-4 A word of caution before we start out. The twist-4 sector requires the introduction of a yet unknown 4-particle DA in order to be complete in all hadronic parameters cf. Sec. 4.4. We therefore drop the twist-4 terms (other than the pure Q b qq ones) in the numerics but keep them in the analytic results. The twist-4 2-particle DAs h γ and A are defined in (D.4) already and the quark condensate constitutes a third one. The Q q qq condensate terms are not well-defined in the OPE for on-shell photon since the come with a 1/k 2 -factor and hence these terms have to be absorbed into a DA. 27 . We arrive at the same expression as [100], taking into account that s e | LZ = −1, with some more detail where U = 1 is introduced as a flag in order to trace this term in the computation. It is needed in order to satisfy the QED WI for example. In addition to this exotic term there are the 3-particle twist-4 DAs given by DαS(α)e ikz(α 2 +vα 3 ) , γ(k, )|q(0)eQ q F ρσ (vz)q(z)|0 = +ieQ q qq k [ρ and γ(k, )|q(0)gG ρσ (vz)σ µν q(z)|0 = +s g s e eQ q qq kz and We note that {S γ , T 4γ } have the same form as {S, T 4 } and thus it is convenient to define.
as the results will necessarily depend on this combination. The quantities {S γ , T 4γ } are special in that they do not introduce a further hadronic quantity other than qq cf. subsection Sec. D.1. The remaining 1/kz-factor in (D.13) requires special treatment because of the additional integral over v originating from the background gauge field propagator (D.28). Moreover, for the computation it proves efficient to write the integrals over the 3-particle parameters α i as an integral over a single parameters u in complete analogy to the 2-particle case For the 1/kz relation, 1 0 dα 3 e ikz(1−vα 3 ) 1−α 3 0 dα 2 F (α) = 0, is required and has to hold, as before, for a smooth kz → 0 limit. The vanishing of this boundary terms has been explicitly verified for all T i (α). In practice the function G(v) → {1, v}.   properties of the DA. The symmetry properties refer to the DA-variables as used in this work and for the vector mesons G-parity odd states are assumed in accordance with the photon quantum numbers. S and A stand for symmetric and antisymmetric respectively as can be inferred from the explicit DAs given in Sec. D.2. The twist and conformal spin are given by t = l − s and j = (l + s)/2, where l is the dimension of the operator and s its spin whereby the k, ⊥ and z directions count as +1, 0 and −1 with regard to spin. The conformal spin takes into account the lowest non-vanishing hadronic parameter which is different for particles of even or non-definite G-parity. E.g. for the j V K * = 7/2 and not j Vρ = 9/2. The asterisk on the conformal spin indices states that the DA is an admixture of the conformal states e.g. (jq, j q ) = (1, 0) ± (0, 1). We do not show the twist-5 DA G v,a ⊥ introduced in [36]. For the vector meson we give some of the various names given to the DAs in the literature are shown e.g. [104], [105] [106]. For the photon DA we use the [64]-notation which is standard.

D.2 Explicit Distribution Amplitudes
In this section we give the form of the DA used in this paper based on the results in [64] with added quark mass corrections. As before we present in increasing twist. The hadronic parameters with respect to their twist and values are summarised in Tabs. 12 and 1 respectively. Table 12: Hadronic parameters and their association with the twist. The conformal spin of the 2-particle and 3-particle DA-parameters is j χ qq an = 2 + n (with a 0 ≡ 1) and j (f3γ ,ω V γ ,ω A γ ) = (7/2, 9/2, 9/2) respectively. The bottom right corner does not close under the EOM, it requires the inclusion of 4-particle DAs cf. the discussion in Sec. 4.4.
Twist-2: The φ γ DA takes the same form as the ρ 0 meson DA where C 3/2 n is a Gegenbauer polynomial, ξ ≡ 2u − 1, formally a 0 (γ) = 1 and all odd Gegenbauer moments vanish (a 2n+1 (γ) = 0) as the photon is G-parity odd. The corresponding hadronic parameter χ q is given in (D.4) is determined from a sum rule or lattice QCD for m q = 0 and the mass correction to it is new to this work and presented in Sec. 4.2.1. It can be understood as the linear response to an external electromagnetic field qσ xy q Bz = eχ q qq B z + non-linear. Hence the name susceptibility. In [64] the parameter a 2 (γ) is investigated but it is concluded that no reliable information on a 2 (γ) or any higher order Gegenbauer moment can be extracted from the sum rules considered. We take a 2 (γ)| 2 GeV = 0(0.1) as a reference value for its uncertainty given that one of the determinations in [64] yields a 2 (γ)| 1 GeV = 0.07. 28 Twist-3: The twist-3 DAs are characterised by 3-particle DA parameters of which the leading term is defined in (D.10). The DA are given by (ξ ≡ 2u − 1) The instanton model [107] photon DA, quoted in [64], is normalised at µ0 = 0.6 GeV and gives a0(γ, µ0) = 0.96 and a2(γ, µ0) = 0.03 with all others Gegenbauer moments vanishing.
whereδ + is defined in (D.7). These matrix elements match those in eq. 5.4 in [104] upon application of the following map (ζ V,A 3V = f V,A 3V /(m V f V ) as defined in eq 4.5 [104]) and dropping all other hadronic parameters as they vanish for the photon. Moreover, they agree with [64] (where the definition of the hadronic parameters ω V,A γ can be found) in the limit of no quark mass correctionsδ + → 0 as expected.

D.3 Quark Propagator on the Light-cone in Fock-Schwinger Gauge
The Fock-Schwinger gauge is defined by (x − x 0 ) · A a (x) = 0. The power of the technique is that one can replace usual by covariant derivatives (x − x 0 ) · ∂ ≡ (x − x 0 ) · D. For x 0 = 0 a formal solution is given by which is used to compute diagram A D 1 ) in Fig. 6. For the computation of the 3-particle DA contribution the propagator in the gluon background field is of use and its light-cone version has been derived in [73] which we write for a massive propagator in the form 0|T q(0)q(x)|0 A rather than 0|T q(x)q(0)|0 A which amounts to a replacement of v →v in the gluon field. One expands where we only need the first correction S (2) q which can be written in a symmetric form, (D.28) The variable s g is the sign of the covariant derivative (A.1) which we prefer to keep explicit. Note further terms in the expansion can be read off from eq. A.16 in [73], and even though given for a massless propagator the inclusion of quark mass would seem rather straightforward.

E Results of Correlation Functions
In this appendix we collect the result of our computation in terms of correlations functions.

E.1 Leading order Correlation Functions
We introduce the following shorthands and remind the reader that U is a flag that has to be set to one cf. (D.11). We denote C 0 (0, p 2 B , q 2 , m 2 q , m 2 q , m 2 b ) in the limit that m q is small, as C 0 0, p 2 B , q 2 , 0, 0, m 2 b . In terms of PV functions the LO correlators read , , , (E.5) (v) (u) +2m 2 b Q q qq , (E.6) . (E.8) , (E.9) E.2 Next-to-leading Order at twist-1 (Perturbative Diagrams) The computation is too lengthy to present in a paper, instead we include the results in the ancillary file corr PT NLO.m. The results are split according to both FF and charge i.e. PiVperpQq corresponds to the contribution from Π V ⊥ (p 2 B , q 2 )| PT proportional to Q q . The full perturbative correlation function can be obtained by summing the contribution from each charge, e.g. Π T (p 2 B , q 2 )| PT =PiTparaQq+PiTparaQb.

E.3 Next-to-leading Order at twist-2
The O(α s ) corrections to the twist-2 amplitude can be decomposed into the following form, with (K V , K T,D ) = (m b , m B ). For brevity we drop the reference to the projection as both directions yield the same result. The PV functions that appear, along with their corresponding coefficients, are given below with R Γ denoting rational terms.

F Multiple-Polylogarithms
MPLs, or Goncharov functions, can be viewed as generalisations of the classical polylogarithms. MPLs arise naturally when using the iterative method of differential equations to compute Feynman integrals. The development of the method of differential equations and advances in the understanding of MPLs represent a major breakthrough for multi-loop calculations [108]. The MPLs of weight n are defined iteratively by the integral, G(w 1 , . . . w n ; u) = u 0 dt t − w 1 G(w 2 , . . . , w n ; t).

F.1 Discontinuities
To identify the discontinuity of an MPL across some branch cut we require two key ingredients. The first is that since the discontinuity operator preserves weight and the discontinuity itself must be proportional to 2πi the discontinuity of a weight-n MPL can only be comprised of objects of weight-(n−1). 29 The second key ingredient is a piece of machinery from the mathematics of algebras and coalgebras, whose utility arises from the fact that MPLs form a Hopf algebra. That is, an MPL f , satisfies the relation ∆(Disc f ) = (Disc ⊗ id)∆(f ).

(F.4)
The coaction on MPLs ∆ has been used without definition, however we direct the interested reader to [108,109] for a detailed discussion of the coaction and the MPL Hopf algebra. The following calculation is facilitated by the Mathematica package PolyLogTools [54]. Proceeding by means of example, we consider a weight-4 MPL which is the highest weight appearing in the calculation. To keep expressions manageable we consider only the kinematic region of interest, p 2 B > m 2 b > q 2 . From the first ingredient we know that the discontinuity must take the following form, where we have used the fact that Disc G(1, p 2 B /m 2 b ) = −2πi Θ(p 2 B /m 2 b − 1). Comparing (F.6) and (F.7) the function g 3 can be easily read off, (F.8) To obtain g 2 we repeat the process but this time consider the 2, 2 component, The LHS can again be evaluated using (F.4), however this time the discontinuity operator will act on weight-2 MPLs. These discontinuities can be computed in an identical manner to the one outlined here, however only the 1, 1 component and a numerical fit of a constant will be required. In this manner, computing a discontinuity is an iterative process building on the knowledge of discontinuities of lower weight MPLs. Evaluating (F.9) we find g 2 = 0. Proceeding we identify g 1 through the 3, 1 component, = ∆ 31 (2πi g 3 ) + 2i(π 3 ⊗ g 1 ). (F.10) This time the evaluation of the LHS will require discontinuities of weight-3 MPLs which can be determined by considering the 1, 2 and 2, 1 components of their respective coactions and numerically fitting a constant. Inserting all the relevant quantities we find, (F.11) Finally we numerically determine α = 0, such that the full discontinuity in the region of interest is given by, (F.12)

F.2 The MPL Basis
As discussed in section 4.1 it was necessary to compute additional master integrals to those of [57] in order for large cancellations to become manifest. These additional integrals were related to the ones of [57] by the swapping of external legs (p 2 B ↔ q 2 ). Whilst both the Q q (A type) and Q b (B type) computations required these additional integrals it was only necessary to explicitly compute them for the B-type diagrams. For the A-type MI the relative simplicity of the alphabet translates to there being just five unique MPL arguments, Under p 2 B ↔ q 2 the arguments w 2,3,5 map onto themselves and w 1 ↔ w 4 . Consequently the basis of MPLs for the A-type MIs is closed under swapping of external legs. To generate the additional A-type MIs we can therefore simply take p 2 B ↔ q 2 in the appropriate MIs of [57] and large cancellations of MPLs will still be manifest. For the B-type MIs there are 19 possible arguments of the MPLs. These arguments do not close under leg swapping so a similar procedure cannot be applied.