Radiative Decays of Heavy-light Mesons and the $f_{H,H^*,H_1}^{(T)}$ Decay Constants

The on-shell matrix elements, or couplings $g_{H H^*(H_1)\gamma}$, describing the $B(D)_q^* \to B(D)_q \gamma$ and $B_{1 q} \to B_q \gamma$ ($q = u,d,s$) radiative decays, are determined from light-cone sum rules at next-to-leading order for the first time. Two different interpolating operators are used for the vector meson, providing additional robustness to our results. For the $D^*$-meson, where some rates are experimentally known, agreement is found. The couplings are of additional interest as they govern the lowest pole residue in the $B(D) \to \gamma$ form factors which in turn are connected to QED-corrections in leptonic decays $B(D) \to \ell \bar \nu$. Since the couplings and residues are related by the decay constants $f_{H^*(H_1)}$ and $f^T_{H^*(H_1)}$ respectively, we determine them at next-leading order as a by-product. The quantities $\{ f_{H^*}^T, f_{H_1},f_{H_1}^T\}$ have not previously been subjected to a QCD sum rule determination. All results are compared with the existing experimental and theoretical literature.


Introduction
In this paper, we consider the on-shell couplings g HH * γ and g HH 1 γ , for B(D) * q → B(D) q γ and B 1q → B q γ where q = u, d, s, from light-cone sum rules (LCSR) [1,2]. 1 Our own interest in these couplings is two-fold. Firstly they describe the decay H * (H 1 ) → Hγ; secondly they appear as residues of the m 2 H * (H 1 ) -pole for the H → γ form factor e.g. [3] and are likely dominant at the kinematic endpoint. The form factors in this kinematic region are of importance for QED-corrections to H → ν and H → . The neutral form factor is an ingredient for the Standard Model prediction of B s → µµγ [4,5] and invisible particle searches in B s → X (where X could be a flavoured axion or a dark photon at the LHCb, CMS or ATLAS experiment [6]).

The Couplings g HH * (H 1 )γ and their Relation to H → γ Form Factors
The purpose of this section is to discuss relevant method-independent aspects of the computation. For concreteness we shall write H = B, throughout this section, which stands for either of the beauty B u,d,s -or charmed D u,d,s -mesons. The couplings of interest are defined from the on-shell amplitudes 2 A B * →Bγ = i 2 s e e ε αβγδ (p B ) α η β F γδ g BB * γ , where D µ = ∂ µ + ies e A µ (with s e = ±1 depending on convention), η is the vector meson's polarisation, F αβ = ik [α * β] stands for the photon's outgoing plane wave and the coupling's mass dimension is [g BB * γ ] = [g BB 1 γ ] = −1. We refer the reader to App. A for more details on conventions. For the decay rates, with α = e 2 /4π as the fine structure constant, we obtain where the first expression agrees with [20] for example. These rates follow from an effective Lagrangian of the form 3 L ef f = s e g BB * γ 1 2 (B * , ∂B † , F ) − is e g BB 1 γ B * α ∂ β B † F αβ + h.c. .

(2.3)
This Lagrangian can be used at small recoil and has to be supplemented by higher order couplings away from it. As mentioned earlier, another point of interest in the couplings arises from their relation to pole-residues of theB → γ form factors [3] (and cf. App. A). 4 For clarity let us consider the dispersion representation of the vector form factor

4)
2 More concretely the couplings parametrise the on-shell matrix elements B (pB)γ(k)|B * (q) = [−i(2π) 4 δ (4) ( pi)] AB * →Bγ and B (pB)γ(k)|B1(q) = [−i(2π) 4 δ (4) ( pi)] AB 1 →Bγ . 3 One might wonder whether the proximity of the B and B * mass leads to any enhanced terms in the soft photon region in diagrams where the photon couples to an external B-meson and a lepton for instance (e.g. diagram top left of figure 3 in [21] where the weak Hamiltonian corresponds to B * → K ). The behaviour of the denominator in the soft region (i.e. kµ → 0), , is softened by the derivative term F αβ and avoids unsuppressed large logarithms of the form ln(∆M 2 B /m 2 B ). This is another manifestation, with a different twist, of the finding in [21] (cf. section 3.4 therein) that structure dependent terms do not generate large logarithms. 4 We note that when translating between theB → γ and B → γ form factors only the axial, and not the vector parts change sign, as can be inferred by applying a charge C-transformation with C|B = |B . We stress that our results are formally quoted for theB-meson.
where the dots represent higher terms in the spectrum. For the tensor form factor, TB →γ ⊥[ ] (q 2 ), the analogous form holds. The relation of the residues to the couplings are with decay constants f (T ) B * (B 1 ) defined in (3.7). The following exact relations, with µ UVdependence suppressed, are a consequence of the freedom to choose a particle's interpolating operator in field theory. This provides us with a non-trivial consistency check of our SR evaluation. Finally a note on the ultraviolet (UV) scale dependence µ UV . The couplings are of course scaleindependent since they correspond to on-shell matrix elements. Thus the vector residues are scale-independent whereas the tensor ones scale like the tensor decay constant with γ T = α s 4π 2C F + O(α 2 s ) , (2.8) and C F = (N 2 c − 1)/(2N c ) = 4/3.
3 The g HH * (H 1 )γ Couplings from Light-cone Sum Rules

The Computation
The couplings can be computed within the framework of QCD SRs on the light-cone. Proceeding via standard techniques we define two correlation functions [3] with quantum numbers chosen such that Π Γ ⊥ and Π Γ contain information on g BB * γ and g BB 1 γ , respectively and Γ ∈ {V, T }. Above the shorthand x = d 4 x has been adopted and the dots represent structures [3] which are not important for this discussion. The B-meson is interpolated by the operator J Bq J Bq ≡ (m b + m q )biγ 5 q , B q |J Bq |0 = m 2 Bq f Bq , (3.2) and the Lorentz structures P µ ⊥, are given by with the photon's polarisation vector, p B = q + k and on-shell momentum k 2 = 0. The vector and tensor operators of the b → q effective Hamiltonian are given by For brevity, from this point onwards we drop the subscript denoting the quark flavour such that m Bq = m B , f Bq = f B , et cetera. As previously mentioned, the computation of the correlation function is the same as for theB → γ form factor; we refer the reader to [3] for details of the calculation and now turn to the double dispersion relation.

The Dispersion Relation
The hadronic representation of the correlation functions is obtained from the double discontinuity of the correlation function 5 and reads where the sum runs over the vector meson's polarisations. The integration domain Σ ⊥, ranges from a lower cut shifted by two pion masses from the poles up to infinity. The dots indicate single dispersion integrals which do not contribute to the final result, and can be seen as the analogues of the subtraction terms of single dispersion integrals. The matrix elements to the right are the decay constants where η is the vector mesons' polarisation vector e.g. Eq. (A.2). The SR procedure involves the Borel transformation in both variables, , to enhance convergence. In the case of a dispersion relation of the form (3.6) this is straightforward due to the well-known formula (3.8) We refer the reader to App. D for the definition of the Borel transformation. 5 As Schwartz's reflection principle applies, one may use Disc → 2i Im cf. [22] for instance.

The Light-cone Operator Product Expansion
The correlation functions (3.1) are evaluated with perturbative QCD using the light-cone operator product expansion (LC-OPE) ordered, in practice, by a converging expansion in twist. The twist, known from deep inelastic scattering, is the dimension of the operator minus its spin. We refer to [3] for specific details and to the technical [23] and applied [2] reviews on the subject. It seems worthwhile to state that, contrary to intuition, the photon is more involved than an ordinary vector meson as it has both perturbative (twist-1) and non-perturbative nature (higher-twist). The latter is encoded in the photon distribution amplitude (DA) which can be understood as ρ/ω-γ or φ-γ conversions. At LO in α s we perform the computation up to twist-4 including 3-particle DAs, whilst at next-to-LO (NLO) twist-1 and twist-2 contributions have been computed. See however Sec. 3.5 for remarks on the completeness of twist-4.

The "Partonic" Dispersion Relation
One may also write a dispersion relation in perturbative QCD, which is formally distinct by its slightly different analytic structure with the discontinuity starting at m 2 b . 6 The dots have the same meaning as for the "hadronic" dispersion relation. Performing the double dispersion relation at NLO is complicated by pole singularities in q 2 = p 2 B . Taking a single discontinuity, say in p 2 B , one is faced with where the ρ i themselves contain non-trivial cuts. 7 These singularities, dubbed second type singularities [22,24], are solutions of the Landau equation for Π Γ ⊥ (p 2 B , q 2 ) but are not on the physical sheet. However this changes once the discontinuity is taken in p 2 B and they need to be taken into account. We refer the reader to App. C for technical details. 8

Borel Transformation of LO Terms for generic Distribution Amplitudes
As previously stated, for a given dispersion representation (3.9) the Borel transformation is straightforward due to (3.8). However, this demands committing to a specific DA. As 6 The m u,d,s masses are considered in the linear approximation for which we have derived new results such as the mq-correction to the twist-2 photon DA [3]. 7 At LO this is not the case and this is what makes them considerably easier to handle in practice. 8 An alternative is to use Schwartz's reflection principle, Disc → 2i Im, to obtain the discontinuity cf.
footnote 5. One can then deform the dt-integration path into the complex plane, away from the poles, in order to obtain a working dispersion representation. This approach, whilst being computationally inefficient, provides numerically stable results as long as the upper integration boundaries in the dt and ds integrals are sufficiently far apart. However, given the almost degenerate values of the masses mB and mB * a sufficient separation of the upper boundaries can not be justified, rendering this approach sub-optimal.
these can improve over time, due to better determination of hadronic parameters, there is some advantage in keeping them generic. Let us consider where f n (u) is some function proportional to the DA with suitable features in order to be compatible with first principle analytic properties. How to perform the Borel transformation and the continuum subtraction is described in App. D. These results extend those currently seen in the literature and are presented in greater detail. In theory a double Borel transform provides two Borel parameters. In practice however, we content ourselves to setting them equal (and u 0 = 1/2 cf. (D.12)), which is justified since m B ≈ m B * (≈ m B 1 ). The 3-particle DAs can be handled with the same technique as they reduce to an effective 2-particle DA (cf. appendix D in reference [3]).

The Sum Rule
The final step in completing the SR is to invoke semi-global quark-hadron duality. For a double dispersion relation this is not straightforward. Before addressing this issue let us assume an integration region (parametrised by a single parameter a andδ s,t specified in the next subsection), implemented with step function on the spectral density Equating the"partonic" and "hadronic" parts one obtains the sum rule 14) with the relation between the couplings and the residues r Γ [⊥] given in (2.5) and Γ = V, T . The somewhat unconventional factor of two in the exponent is a consequence of our definition of the Borel mass (3.12). The LCSR determines the product f B r Γ [⊥] and to obtain the residues and the couplings one replaces the decay constants by a QCD SR to the same accuracy in α s , e.g. and As previously mentioned, the two determinations for each couplings serve as an additional quality test of our SR. fixed, leads to a range of possible duality windows depending on the value of the parameter a. The solid green, blue, yellow, and orange curves correspond to the a = 1/2, 1, 2, ∞, cases respectively. In the limits 0 →t 0 the curves intersect at a single point, σ

Duality Region as a Function of the Duality Parameter a
Finally we turn to the question of the duality region encoded in (3.13) and derive explicit relations as a function of the parameter a. In defining the duality region, we follow earlier work [1,25] but extend it in that we considers 0 ,t 0 as a function of the parameter a. The solutions to the boundary defined by (3.17), and which therefore enter (3.13), areδ  Its geometric meaning can be inferred from Fig. 1. It takes on the rôle of the single dispersion effective threshold if ρ Γ ∝ δ(s − t) which is the case for a large part of the contributions. Fortunately, variation of the duality parameter a does not lead to large effects when the daughter sum rule is invoked to constrain the SR parameters, as will be discussed in the next section. We turn to the question of which choice of the parameter a is suitable. We find that in the majority of cases the dependence of the couplings on the duality window is rather limited, as evidenced by Tab. 5. The exceptions are the B 1s -and the D * s -meson cases, showing more significant variation. It has been argued that for the Isgur-Wise function [26] and the small velocity limit [27] that a = 1 is a necessary choice. Whether or not this translates to other cases and in particular to the case at hand is an open question. We adopt a = 1 as our default choice, and include variations under the duality window in our estimate of the total uncertainty (cf. Sec. 3.5.1).

Numerical Analysis
Physical input parameters used for the numerical evaluation of the SRs can be found in Tab. 14 in the appendix.
As there are a number of different renormalisation scales involved we discuss them in some detail. The UV scale, µ UV , has already been mentioned below Eq. (2.6) and is set to the pole mass m b (m c ) pole . For the LCSR there remains the scale of the coupling µ αs , the mass µ m (or µ kin cf. below) and the LC-OPE factorisation scale µ F . We set which is a standard albeit not a necessary choice and equate µ αs = µ F . The choice of a mass scale is linked with a choice of mass scheme. For the B → γ form factors we have found [3] that the MS-and the polescheme give rise to large effects in either higher twist or at O(α s ) rendering both of them suboptimal. For the g BB * (B 1 )γ couplings the evidence for adopting the kinetic-over the MS-scheme is less compelling (smaller improvement in twist-convergence). However, in an effort to remain consistent with our previous work [3], we choose to adopt the kineticscheme for the evaluation of both the FF residues and the effective couplings. As the kinetic mass scheme, originally devised for the inclusive decay operator product expansion (OPE) [8], can be considered as a compromise between the MS-and the pole-scheme. Moreover, it is indeed found that the kinetic scheme is stable under scale variation. The kinetic scale is set to µ kin = 1 GeV, with further details in [3]. For the D-meson decays the situation is different and the MS scheme gives more stable results than the kinetic scheme and we thus employ the MS scheme with the standard choice µ m = m c (m c ). This might not come as a surprise since m c itself is closer to µ F as compared to the B-case.
As indicated in Eqs. (3.15) and (3.16), to obtain the physical quantities one needs to divide by the decay constant(s) to the same order (cf. Sec. 4 for their discussion). The new inputs are the condensates, given in Tab. 14, and the factorisation scale of the local OPE, denoted by µ cond , which is set to µ cond = µ F in order to facilitate cancellations in the ratio. A summary of all renormalisation scales is given in Tab. 1 (left). Another aspect is that we drop twist-4 corrections, other than the pure quark condensates, as they are incomplete (requiring the inclusion of 4-particle DAs [3]). The resulting uncertainty ought to be captured, at least in part, by the variation of the Borel parameter. Table 1: Summary of the scales involved in the determination of the residues (Tab. 6) and coupling constants (Tab. 7) to the left of the double separation line. To the right we have the scale changes used for the best determination of the decay constants (Tab. 8). The quantity m c (m c ), above, is the MS mass at the scale m c . The uncertainty in µ F , for the B-meson (D-meson), is chosen to be determined by a number of constraints. As usual the Borel mass is determined subject to two competing factors, contamination from higher states is effectively suppressed by a smallM 2 , whilst fast convergence of the LC-OPE favours a largeM 2 as higher terms in the expansion are accompanied by ever increasing inverse powers of the Borel mass. The compromise of these two criteria, resulting in an approximately flat curve, is known as the Borel-window. To constrain the effective thresholds {s 0 ,t 0 } the, formally exact, daughter SR for the sum of meson masses (3.20) is employed with the ratio ofs 0 ,t 0 matched to the ratio of meson masses in the respective channels cf. caption of Tab. 2. In addition we imposes Bs ] to be satisfied reasonably well. We turn to the dependency on a specific duality parameter a. It is found that in the B-meson cases a single set of SR parameters is sufficient to satisfy (3.20) to within ≈ 2% for the a = {1/2, 1, 2} cases considered. For the D-mesons this no longer holds and a small modification to the SR parameters is made at each value of a.
We consider it worthwhile to comment on the specific numerical values of the thresholds found. The expectation for a single dispersive threshold s 0 is (m B i + 2m π ) 2 < s 0 < (m B i + m ρ ) 2 , and lying closer to the top boundary. Inspecting Tab. 2, we note that this is indeed the case for the single dispersion threshold s f B 0 but not for the double dispersion threshold σ 0 takes on a similar rôle to the single dispersive effective threshold, one must remember that it contains additional information on the excited vector meson channel, cf. (3.18) and might further be a result of the peculiar analytic structure in (s, t) of the LC-OPE. 9 Let us turn to the correlation imposed on parameters based on physical arguments. Whilst the effective threshold for decay constant s f B 0 can be independently determined 9 It is conceivable that if one were to adapt the daughter sum rule method to the extraction of gDD * π and gBB * π in [25], one could even find better agreement with experiment and/or the lattice.
it would contradict the method if it were completely independent of the σ (a) 0 -threshold, since they are both associated with the same state. A 50%-correlation is adopted between the two. The vector versus tensor results are correlated since, by the (exact) equation of motion, their difference is equal to a derivative operator which is numerically (and to some extent parametrically) suppressed at low recoil. In order to remain consistent this implies a correlation of the effective thresholds, as argued in [28] and more systematically exploited in [29]. 10 The correlations are imposed based on the contribution of the derivative operator to the equations of motion, which is ≈ 10% in the B ⊥-and ≈ 20% in the D ⊥-case. In the B -case the contribution is ≈ 40-45%. Another relevant aspect concerning the plethora of predictions is that not all channels are of equal quality. This is highlighted by the two separate determinations of the residue, from the vector and tensor interpolating current. Let us define the ratio U B ideally is close to one. We find reassuringly good values for the for the B 1 -case as anticipated cf. footnote 1. For the D * -case it is the accidental cancellation of the two charge contributions in perturbation theory, to be discussed further below, and the sensitivity to higher twist which gives rise to larger deviation from one. For the B 1 -case the concept of a well isolated resonance is not assured and for the D 1 it simply does not hold cf. also footnote 1. Therefore we do not quote any results for the D 1 whilst for the B 1 the results are deemed just marginally acceptable to present.
It is instructive to present a breakdown in terms of charges for comparison with other work and illustrate the, presumably accidental, cancellations in the charged D-and B 1 -case which unfortunately implies that these results are less reliable. For definiteness we quote the breakdown for the couplings obtained from the vector interpolating current where Q i are the standard quark charges Q b = Q d = Q s = − 1 3 and Q c = Q u = 2 3 and "ht" stands for higher twist and q = u, d . The size of the higher twist can be inferred from   Tab. 3. The twist-3 contribution is up to 5% in some cases and, as previously argued, most twist-4 contributions have to be dropped since they are incomplete without the inclusion of 4-particle DAs (cf. Sec. 3.5.1 for further relevant remarks in this direction).
We now proceed to discuss the numerical features of the B-and D-meson results in turn. Beginning with the B-mesons, for the values given in Tab. 2 we find that the daughter SRs (3.20) are, in all cases, satisfied to within 2%. The continuum contributions range from 25% in the ⊥-modes to 35% in the -modes. In the ⊥-modes the SR is dominated by the perturbative and twist-2 contributions which are approximately equal in size and are of the same sign. The remaining contributions make up ≈ 10-20% of the total. The story is repeated in the tensor -modes, however the situation in the vector -modes is somewhat altered. Here unfortunate cancellations act to suppress the perturbative contribution and the twist-2 sector is numerically dominant providing O(80)% of the total value. A breakdown of contributions according to twist is given in Tab. 3 for a representative selection of residues. The O(α s ) corrections are mass scheme dependent. In the kinetic mass scheme (µ kin = 1 GeV) employed the NLO results are sizeable, providing a correction of ≈ 20-35% and ≈ 20-25% at twist-1 and -2, respectively cf. Tab. 3. The benefit and necessity of an NLO computation is clearly visible in the scale variation plots shown in Fig. 2 as residual effects are then of O(α 2 s ). The SR parameters for the D-mesons are determined subject to the same tests as outlined above. In all cases the continuum contribution remains below 30% and in the neutral modes the daughter SR (3.20) is satisfied to within 3%. In the charged mode the daughter SR shows poor convergence. Again, this is due to the presumably artificial smallness of the perturbative contribution due to cancellation in Q c and Q u (cf. Tab. 3 and (3.22)). In contrast to the B-mesons we note that whilst the dominant contribution arises from the twist-1 or -2 sectors the twist-3 and -4 sectors are sizeable, in particular in the  Table 3: A breakdown of contributions according to twist, "pa" = number of partons and the specific DA. The definitions of the DAs can be found in [3]. The asterisk in total is a reminder that it does not include twist-4 contributions not closing under the equations of motion cf. [3]. charged case. The O(α s ) corrections to twist-2 range from ≈ 12% of the LO result in the vector modes to ≈ 20% in the tensor modes. In the twist-1 sector the tensor modes and the neutral vector mode have radiative corrections ranging between ≈ 2 -30%. In the charged vector modes, however, the corrections are > 50%, due to the previously mentioned large charge cancellation at LO.
The uncertainty due to the input parameters is estimated by varying each parameter, within the given interval, in turn and adding each individual uncertainty in quadrature. To incorporate correlations between the various thresholds, discussed previously, we generate 300 samples of the thresholds according to a Gaussian distribution such that the mean corresponds to the central value of each threshold and the standard deviation reproduces the associated uncertainty. We then evaluate the desired quantity for each of these samples, taking the standard deviation of the resulting points to be the uncertainty due to threshold variation. Our predictions for the couplings are given as the mean value of the vector and tensor interpolating current determinations. We estimate the associated uncertainty as the standard deviation of the two evaluations. Moreover, the uncertainty associated with  Table 4: Breakdown of the main contributions to the uncertainty for a representative selection of residues. ∆s 0 includes the combined uncertainty, incorporating correlations, due varying all effective thresholds, cf. discussion above (3.1). ∆ t=3 contains the total uncertainty due to all twist-3 hadronic parameters {f 3γ , ω A γ , ω V γ }. The uncertainty due to the choice of duality region is encapsulated in the quantity ∆Σ which represents the standard deviation of the a = {1/2, 1, 2} evaluations. The total uncertainty, which also includes smaller contributions such as the gluon condensate, is obtained by added uncertainties in quadrature.
varying the duality window is taken to be the standard deviation of the a = {1/2, 1, 2} determinations (cf. Tab. 5). This provides a small contribution in the B-meson cases, but notably a more significant contribution in the D s mode. Adding in quadrature the uncertainty from all sources, we obtain the total uncertainty as quoted in Tab. 7.
The final values for the residues and the couplings are shown in Tabs. 6 and 7 respectively. The value of the coupling presented in the table is the average of the vector and    tensor determinations.

Comparison with Literature and Experiment
It is of interest to compare to the existing literature and experiment. The values of the couplings obtained in this work, which constitute the mean value of the tensor and vector determinations, along with determinations from other computations as well as experiment are collected in Tab. 7. Unfortunately only two of the six couplings can be inferred from experiment as the widths of the vector mesons are too often unknown. 11 Moreover, in this section we use (B u , B d , D d , D u ) → (B + , B 0 , D + , D 0 ) which is the notation often used in experiment.
With regards to the two experimental values, we update the analysis in [13] and make some further comments. We first turn to the D * + , for which the width Γ(D * + ) = 83.4 (18) keV and branching fraction B(D * + → D + γ) = 0.016(4) are known [33], and with (2.2) give |g D + D * + γ | = 0.47(7) GeV −1 instead of the previous 0.50 (8) GeV −1 in [13]. For 11 The situation is different to the gBB * π couplings as there the B * → Bπ decay is kinematically forbidden and thus it seems unfortunate that the B * (B1) → Bγ transitions are not known because of unknown total widths.    [34], deduced from D * -decays and subject to 1/m c corrections, is rather close to our value given the difference in methods. Whereas Ref. [11] established that the D-couplings can be determined from LCSR we do not include the LO results presented therein as the input is outdated and a numerical comparison seems of limited use. D * 0 the width is unknown and one needs to rely on isospin to infer it [13]. First we deduce g c , related to the D * + Dπ-coupling, as g c = 0.57 (7), which is down from 0.61(7) in [13]. Considering all decay channels one then obtains Γ(D * 0 ) = 56.5(14.0) keV, down from 68(17) keV in [13]. Using the branching fraction B(D * 0 → D 0 γ) = 0.353(9) keV, down from 0.381(29) keV we get |g D 0 D * 0 γ | = 1.77(16) GeV −1 , down from 2.02(26) GeV −1 in [13].

units [ GeV
Our result g D 0 D * 0 γ = 2.11 +0. 35 −0.34 GeV −1 is compatible with experiment and so are the results of the other method. Our value for g D + D * + γ = 0.40 +0.12 −0.13 GeV −1 is again compatible with the new experimental value |g D + D * + γ | = 0.47(7) GeV −1 . Differences between this work and the LCSR computation [12] are noticeable and can be at least partially accounted for by computational differences. Firstly, we have computed twist-1 O(α s )-corrections whereas they did not. Secondly, we include linear quark mass correction at LO and in the magnetic vacuum susceptibility χ q (cf. section 3.2.1 in [3]). Third and most importantly, we drop twist-4 corrections other than the Q b qq condensate cf. previous section. For the Bmeson case the twist-4 corrections are not large and the impact is small and the differences can be attributed to the first two cases. For the D-mesons higher twist corrections are more important per se, as they are less convergent. In addition, for the charged case perturbation theory is presumably artificially suppressed which makes these results less reliable in general. Another important aspect is that in the charged case the inclusion of S γ and T 4γ is definitely incomplete as the photon can connect to the external states. A sign of this is that in the neutral case the twist-4 contributions cancel, whereas in the charged case they are additive cf. Tab. 3.
The lattice determination g DsD * s γ | [14] = 0.11 (2) is approximately three standard deviations lower than our value g DsD * s γ = 0.60 (19). This is where the breakdown (3.22) is useful. We find g DsD * s γ ≈ −0.6 Q c − 3.0 Q s and from Fig. 3 in [14] one deduces Whilst it is noted that in both cases the charm and strange quark contribution largely cancel each other, the effect is more pronounced for the lattice result. The charm contribution is rather close and the deviation is in the strange quark part with almost a factor 2 difference, which seems large but not as large as the initial number would suggest. It is instructive to investigate the D * + → D + γ case as by D-spin symmetry 12 , one would roughly expect a 20-30%-deviation. For our computation this is indeed the case g D + D * + γ ≈ −0.65 Q c − 2.5 Q d ≈ 0.40 (13), which does agree reasonably well with experiment g D + D * + γ = 0.47(7) (cf. Tab. 7). Concerning the question of D-spin breaking, some further guidance can be obtained from the lattice evaluation of the D d,s → γ form factors [35]. The fits to a linear and an extended pole model are in agreement with 20-30% D-spin breaking close to the kinematic endpoint. If the same level of D-spin breaking were valid at the m 2 D * -pole, which some past experience suggests, then g D + D + * γ and g DsD * s γ should not deviate considerably more than 20-30% from each other. If the former is true then this gives rise to a tension between the experimental g D + D * + γ = 0.47(7) and the lattice determination of g DsD * s γ | [14] = 0.11 (2). In conclusion it remains somewhat unclear what the resolution of this puzzle is. Whereas the sum rule results seems consistent, we wish to emphasise that, in exactly these modes, the sum rules are not the best of their kind for various reasons. It may well be that the level of cancellations between the strange and the charm charge contributions are so severe that past experience is overthrown. It would be helpful to have further lattice determinations of these couplings and in particular a more precise one for g D + D + * γ .

The f H , f H * , f H 1 , f T H * and f T H 1 Decay Constants from QCD Sum Rules
The main reason for computing the decay constants is that to the best of our knowledge {f T H * , f T H 1 }, required for the relation between the couplings and the (form factor) residues (2.5), have not been subjected to a QCD SR evaluation and are thus new. The quantities f B and f B * have previously been computed [15][16][17] to NLO with even partial NNLO results. We recompute these SRs and find agreement with the analytic expressions of the first two references. 13 In the work [17] the O(α s ) qq corrections were computed independently and we do disagree with some the expression e.g. the incomplete Gamma function. Compare equation (21) [17] versus (B.6) and equation (59) in [16]. 12 The exchange of d ↔ s, which is still a good approximate symmetry under QED. 13 A direct comparison with [15,16] can be made by taking the limit s0 → ∞ in the results in App. B, as we provide the correlation functions after taking the Borel transform with continuum subtraction.

The Computation
The starting points for the computations are the "diagonal" correlation functions where we have taken H = B for concreteness again. Above J α =qγ α b, J T αβ =qσ αβ b and the previously encountered J B is given in (3.2). The Lorentz structures are The Lorentz invariant functions are related to the hadronic quantities as follows and the remaining structureΓ f B * (p 2 ) is related to Γ f B (p 2 ) up to contact terms by the equation of motion. The correlation functions for the {f B 1 , f T B 1 } decay constants follow with rules for the insertion of the γ 5 into the currents cf. (B.10) and B * → B 1 in (4.1) following the ideas in [36].
The generic SR is parametrised by   Tab. 14. For the B(D)-meson SR a uniform uncertainty ∆s 0 = ±1.5 GeV 2 (∆s 0 = ±0.5 GeV 2 ), ∆M 2 = ±1.5 GeV 2 (∆M 2 = ±0.5 GeV 2 ) is applied to the threshold and the Borel parameter. The slightly smaller uncertainty assigned to the decay constant SR parameters versus those of the residues reflects the fact that the daughter SR is satisfied to within ≈ 0.5% in the former but only to within ≈ 2% in the latter. The relative size of the radiative corrections are denoted by δX such that For comparison we include the most recent lattice determinations. The J P = 0 − decay constants are taken from [37] which averages over values in [39][40][41][42] and [43][44][45][46][47] for the B-and D-mesons, respectively. For the J P = 1 − states we quote the values obtained in [19]. The experimental values are from the PDG review and the extraction of the decay constants involve the CKM matrix |V ub | and |V cd(s) | as inputs.
The PDG-error is from the experiment and the CKM input in the first and second parentheses respectively. Note that the central values for f B1 and f B1s from [17] deviate considerably from ours which might be due to discrepancies in the O(α s ) qq -corrections (cf. remarks at the end of the first paragraph in Sec. 4).  Table 9: Breakdown of the main contributions to the uncertainty for a representative selection of decay constants in units of MeV in the kinetic scheme. The uncertainty ∆ hd covers higher dimensional condensates omitted from the OPE which are estimated as the values of the d = 4, 5 condensates. This is conservative as the four quark condensates are known to be a sub per mille effect. The total uncertainty also includes contributions not shown in the table, such as ∆m 2 0 which has a negligible impact.

Numerical Analysis
The numerical analysis is the same as for the residues/couplings except that the scales are taken to be different as, in contrast, there is no motivation to cancel terms in ratios. Concretely, the condensate and α s scale are changed as shown, to the right of the vertical double separation, in Tab. 1. This enforces a change in SR parameters {M 2 f B , s f B 0 } according to the previous criteria, with thresholds fixed such that the daughter SR reproduces the known value of the associated meson mass to ≈ 0.5%. The continuum contribution is kept below ≈ 45%. The SR parameters are given alongside the main results in Tab. 8 (cf. Tab. 12 for MS-evaluation of the B-meson decay constants) and a representative breakdown of the uncertainty is given in Tab. 9. Isospin breaking effects impact at the sub per mille level e.g. [48] and are therefore not considered as they are 1.17 (7) 1.17 (7) 1.23 (8) (6) 1.29(9) 1.18(9) 1.04 (6) 1.20 (9) 1.15(9) 1.20 (11) 1.07(9)  superseded by the actual uncertainties. If considered, it would seem sensible to include QED effects as well, which would then necessitate the inclusion of the radiative mode in addition.
The uncertainties of the decay constants are around 10% and in agreement with lattice results of O(1-4%)-uncertainty. Moreover, we quote other QCD SR determinations, [16] and [38]. We differ from these results mainly in two aspects. First we do not include partial NNLO effects but treat the mass scheme and the factorisation scale dependence µ cond separately and thus more carefully. Secondly, we use a significant update of the strange quark condensate. We note that our values are also consistent with the classic Jamin and Lange result [15], (f B , f Bs ) = (210 (19), 244(21)) MeV.

Ratios of Decay Constants
Some of the decay constants are related by heavy quark and/or SU (3) F symmetries, and thus there is some tradition in investigating ratios and determining their deviation from unity. A total of 24 ratios are shown in Tab. 10 (cf. Tab. 13 for the MS-evaluation of the B-meson ratios).
SU (3) F -type ratios such as f Bs /f B are typically above 1 as one would intuitively expect. We quote our results, denoted by "PZ" for brevity instead of "this work", against some results from the literature (1.17(7), 1.19 (7) ) PZ SRs Comparison with the lattice average and shows that there is good agreement albeit the precision in lattice QCD, at the sub per mille level, is beyond reach for QCD SRs. The above lattice values are averaged over the works of [39,41,49,50] and [44,45,47] for the B-and D-ratio respectively. The PDG value, for which the CKM matrix elements |V cd(s) | are inputs, deviates close to three standard deviations from the lattice result. Further ratios of interest stem from heavy quark symmetry which groups the B and the B * meson into the same multiplet as in this (non-relativistic) limit the spin ceases to matter. Deviations of the rations from one therefore highlight sensitivities to effects beyond that limit and comparison with the literature  (20)) [19] lattice does show some minor tension between the results. Note that the lattice result [52] is with N f = 2 and [19,53] are with N f = 2 + 1 and thus more reliable. For further discussion of the possible reasons for discrepancies cf. section IV in [53]. We now proceed to give some detail on of the individual uncertainties of the ratios in the SR computation. In both the B-and D-meson rations the effective thresholds prove to be the largest source of uncertainty. Whilst correlations between the thresholds, discussed previously, act to constrain the error the contribution to the total uncertainty is still significant, sitting in the region of ≈ 70-80%. The remaining uncertainty can be mostly attributed to the associated quark mass and in the D-meson rations the coupling scale µ αs provides a contribution to the total uncertainty of a similar order. For the SU (3) F -ratios in (4.4), the quark condensate ratio ss / qq provides a notable contribution to the total uncertainty.

Summary and Discussion
In this work we have determined the couplings of photons to heavy-light quark mesons (2.3) from light-cone sum rules at next-to-leading order in α s at the twist-1,-2 level, at leading order in twist-3, and partial twist-4. 14 We have also investigated the effect of various duality regions (cf. Sec. 3.4.1 and Tab. 5) and have found the impact to be small. Our main results, with uncertainties of O(15%), are given in Tab. 7 along other theoretical and experimental results for comparison. The residues related to theB → γ form factors, as in (2.4) and (2.5), are given in Tab. 6. As a by-product we have determined the heavy decay constants f H , f H * , f H 1 , f T H * and f T H 1 (H = B, D) in QCD sum rules at next-to-leading order. 15 To the best of our knowledge {f T B * (D * ) , f T B 1 } have not been evaluated with QCD sum rules and we therefore close a gap in the literature. Agreement is found with existing results, where comparison is possible, on the analytic and numerical level cf. Tab. 8. Our treatment differs, besides a significant update to the strange quark condensate, in that we treat the mass-scheme and the factorisation scale dependence µ cond separately and thus more carefully, but do not include partial O(α 2 s ) corrections to perturbation theory. Ratios of decay constant are given in Tab. 10 and compared to the literature in Sec. 4.2.1.   We now turn to phenomenological aspects. The coupling determinations lead to the radiative decay predictions given in Tab. 11, consistent with the experimentally known D + /D 0 -rates. It's unfortunate that the B-rates are not experimentally known as our predictions are more reliable in that sector (e.g. independence of the interpolating current and convergence of the twist expansion). Particularly for the D + /D s -channels there is the additional issue of large cancelation of the Q c -and Q q/s -contributions which present a challenge for all theory approaches (cf. the discussion in Sec. 3.5.1). An important aspect is the interplay with the real QED-corrections in leptonic decays H → ν(γ). This is the case since the couplings describe the pole residue (2.4) and [54] which, bearing in mind previously mentioned cancellations, should play a significant role in the soft-photon emission. In view of the importance of QED-corrections at the precision frontier, these couplings will hopefully attract further attention from the experimental and theory community.

A Convention, Definitions and Additional Tables
In this appendix we collect conventions, definitions and input parameters.

A.1 Convention and Definitions
We use the convention ε 0123 = 1 for the Levi-Civita tensor and D µ = ∂ µ + s e ieQ f A µ + s g ig s A µ for the covariant derivative (e > 0 and Q e = −1 for the electron as a u-spinor). Below we will keep explicit factors s i in place, which are assumed s i = 1 throughout the main text, in order to facilitate comparison with the literature. The B q -meson (q = d, u, s) decay constant is defined by and for the B * q (1 − ) and B 1q (1 + ) states via The definition for the D−, D * -and D 1 -mesons are analogous. With these conventions the couplings the effective Lagrangian (2.3) assumes the form For completeness we state the definition of the B → γ form factors used in [3] γ(k, where P ⊥ µ and P µ are defined in the main text (3.3), QB q is theB-meson charge and the dots represents the Low-term (or contact term) which is not important for this paper (cf. [3] for details). Note that, the point-like term, proportional to f B , is not be included for the g BuB 1u γ coupling as it is not associated with the B 1u -pole. The local operators in (A.4) are given by O

A.2 Additional Tables
Here we provide some additional tables, namely the input parameters Tab. 14, MS determinations of the decay constants Tab. 12 and their ratios Tab. 13. We note that when fixing the SR parameters via the daughter SR we observe that the optimal value of the effective threshold for the B * decay constant sits below that of the  f    B. Clearly this does not make sense from a physical point of view and so we relax the condition on the daughter SR (4.3) such that it reproduces the associated meson mass to within 1.5%, which allows for the physical ordering of the thresholds to be imposed. We do not observe this problem when evaluating in the kinetic scheme which is another reason in its favour.
Running coupling parameters α s (m Z ) [33] m Z [33] 0.1176 (20) 91. 19 GeV   Table 14: Summary of input parameters. † Value obtained by using the O(α 2 s ) conversion between the MS and the kinetic mass given in [40]. The uncertainty is obtained by adding in quadrature the uncertainty due to the MS mass and the conversion formula. For the meson masses we have not indicated an uncertainty as they are negligible. We refer to [3] for all the input concerning the photon DA that enters the light-cone sum rule computation.
The Wilson coefficients are presented after integration and can therefore depend on the effective threshold. For comparison with the literature cf. footnote 13 in the main text.
is a factor that depends on the mass scheme. Above we have also included the leading light quark mass corrections to the LO quark condensate contribution. As mentioned in Sec. 4.1, we have verified that the NLO scale dependence, in µ UV and µ cond , is consistent with the LO expression. The SRs for the J P = 1 + decay constants can be obtained from the J P = 1 − ones by changing the sign of certain contributions according to their chirality, in spirit with the parity doubling proposal in [36].

C Double Dispersion Relation
In computing the densities we are faced with the following problem. We have an analytic function F (p 2 B , q 2 ) for which it is straightforward to derive a single dispersion relation where the density is formally given by πρ(s, q 2 ) = Im s F (s, q 2 ). The density can be decomposed into poles in s = q 2 such that The singularities in s = q 2 are of so-called second type, which are special solutions of the Landau equations [22,24]. It is our task to write the q 2 -dependence of (C.2) dispersively, say in an integral over dt, and impose a continuum subtraction. The duality interval is discussed in (3.17) in the main text.

C.1 Leading Order
At LO in PT the ρ i themselves contain no non-trivial cuts. Consequently, the poles provide the only contribution to the discontinuity in q 2 , allowing us to write where the continuum subtraction has been implemented as in (3.17). Partially integrating and performing the integrals over the δ-functions we obtain, where the prime denotes a derivative w.r.t. the variable s and ρ i ≡ ρ i (s, t). Above we have utilised the fact that Im t ρ i (m 2 b , t) = (Im t ρ i (s, t)) | s→m 2 b = 0. Application of the principal part to the double integral of (C.6) leads to a technical splitting of the integration region, which can be most clearly seen in Fig. 1. Schematically, one has ds , (C.8) which corresponds to triangles B, A, and C of Fig. 1 respectively.

D Subtracted Borel Transformation of Tree Level DA Terms
We're faced with the problem of finding the double Borel transformation of the following generic function ( = 0, 1)  If one commits to a specific function f (u) the du-integral can be done and its double dispersion integral can be worked out in a relatively straightforward manner. In the literature the case F (0) 1 has been worked out more generically [58] which we generalise to F wherec k ≡k ! k! ck. Above we have assumed that f n (u) ∝ (uū) n−1 (1 + O(u,ū)) which is a sufficient condition for the function F n, (p 2 B , q 2 ) (D.1) to be free from 1/(p 2 B − q 2 ) singularities. 16 The first dispersion representation can be obtained by a change of variable .

(D.7)
At this level any further singularities are induced by 1/(s − q 2 ) 1+k and, as discussed in the previous section, correspond to so-called second type singularities. These singularities cannot appear for F n,0 (p 2 B , q 2 ) itself which is a fact that we have used in making the specific ansatz (D.4). The double dispersion relation then reads 1,3 } and the mass corrections to {Ψ (a) , Ψ (v) }, for which an accurate polynomial fit can be made. For the case a = 1 ands 0 =t 0 , M 2 1 = M 2 2 ≡ 2M 2 withM 2 →M 2 and u 0 → 1/2, which is the one considered in the literature [58], there are miraculous simplifications. First the exponential factor in (D.15) becomes s-independent and (D.14) assumes a more manageable form, for which the k-dependence in the Ω-terms cancels! It is remarkable that for this special case the continuum subtraction vanishes for n > 1 (n > 2) inF n,0 (F n,1 ) and accidentally renders some results in the literature, where continuum subtractions have been neglected, more accurate. Note thatF 1,0 has previously been computed in App. B of [58] and we agree with their result.