Cottingham formula and nucleon polarizabilities

The difference between the electromagnetic self-energies of proton and neutron can be calculated with the Cottingham formula, which expresses the self-energies as an integral over the electroproduction cross sections – provided the nucleon matrix elements of the current commutator do not contain a fixed pole. We show that, under the same proviso, the subtraction function occurring in the dispersive representation of the virtual Compton forward scattering amplitude is determined by the cross sections. The representation in particular leads to a parameter-free sum rule for the nucleon polarizabilities. We evaluate the sum rule for the difference between the electric polarizabilities of proton and neutron by means of the available parameterizations of the data and compare the result with experiment.


Introduction
The mass difference between proton and neutron had been puzzling for a long time. Ever since Heisenberg had introduced isospin symmetry to explain the near degeneracy of these two levels [1], it was taken for granted that the strong interaction is invariant under isospin rotations and that the mass difference is of electromagnetic origin. In this framework, it was difficult, however, to understand the experimental fact that the neutral particle is heavier than the charged one. A first step towards a resolution of the paradox was taken by Coleman and Glashow, who introduced the tadpole dominance hypothesis [2,3], which associates the bulk of the electromagnetic self-energies with an octet operator. The origin of the tadpole remained mysterious, however. The puzzle was solved only in 1975, when it was realized that the strong interaction does not conserve isospin, because the masses of the up- and down-quarks strongly differ [4]. The crude estimates for the ratios of the three lightest quark masses obtained in that work, m_u/m_d ≃ 0.67, m_s/m_d ≃ 22.5, have in the meantime been improved considerably. In particular, Weinberg [5] pointed out that in the chiral limit, the Dashen theorem provides an independent estimate of the quark mass ratios, as it determines the electromagnetic self-energies of the kaons in terms of those of the pions. Neglecting higher orders in the expansion in powers of m_u, m_d, and m_s, he obtained the estimate m_u/m_d ≃ 0.56, m_s/m_d ≃ 20.1. Also, the decay η → 3π turned out to be a very sensitive probe of isospin breaking [6–10]. The quark mass ratios obtained from that source also confirmed the picture. According to the most recent edition of the FLAG review [11], the current lattice averages are m_u/m_d = 0.46(3), m_s/m_d = 20.0(5).

Cottingham formula, dispersion relations
The analysis of [4] relies on the Cottingham formula [12], which invokes dispersion relations to relate the spin-averaged nucleon matrix elements of the time-ordered product, ⟨p|T j^µ(x) j^ν(y)|p⟩, to those of the commutator of the electromagnetic current, ⟨p|[j^µ(x), j^ν(y)]|p⟩. Lorentz invariance and current conservation determine the Fourier transforms of these matrix elements in terms of two invariant amplitudes, which only depend on the two variables ν = p·q/m and q^2, where m is the nucleon mass and q the photon momentum. We stick to the notation used in [4] and denote the invariant amplitudes by T_1(ν, q^2), T_2(ν, q^2) and V_1(ν, q^2), V_2(ν, q^2), respectively. Explicit formulae that specify the matrix elements ⟨p|T j^µ(x) j^ν(y)|p⟩ and ⟨p|[j^µ(x), j^ν(y)]|p⟩ in terms of the invariant amplitudes are listed in appendix A, where we also exhibit the relations between the structure functions V_1(ν, q^2), V_2(ν, q^2) and the cross sections σ_T and σ_L of electron scattering.
While the functions V_1(ν, q^2), V_2(ν, q^2) are odd under ν → −ν, the time-ordered amplitudes T_1(ν, q^2), T_2(ν, q^2) are even. In view of the contributions arising from Regge exchange, V_1(ν, q^2) ∼ ν^α, V_2(ν, q^2) ∼ ν^(α−2), only T_2 obeys an unsubtracted dispersion relation, while for T_1 a subtraction is needed. For q^2 < 0, the dispersion relations thus take the form The formulae hold in the cut ν-plane; the upper and lower half-planes are glued together along the interval |ν| < Q^2/(2m) of the real axis (throughout, we use Q^2 ≡ −q^2 whenever this is convenient). As illustrated with the discussion in appendix E, it is important that kinematic singularities, zeros, and constraints be avoided – throughout this paper, we work with the amplitudes defined in appendix A, which are free of these [14–16]. We refer to S_1(q^2) as the subtraction function. It represents the value of the amplitude T_1(ν, q^2) at ν = 0. For later use we introduce the analogous notation also for T_2(ν, q^2): S_2(q^2) ≡ T_2(0, q^2).

Reggeons and fixed poles
In [4] it is assumed that the asymptotic behaviour is determined by Reggeon exchange. The contribution of a Regge pole to a scattering amplitude at large center-of-mass energy squared s and small momentum transfer t ≤ 0 has the form (see e.g. [17]): where α(t) and β(t) denote the trajectory and the residue, respectively, and τ is the signature. In the context of the present paper, we are concerned with t = 0 and τ = 1. The continuation of the asymptotic formula (4) to low energies is not unique. For definiteness, we work with the representation where s_+ and s_− stand for s_± = (p ± q)^2 = m^2 ± 2mν − Q^2 and s_0 ≥ m^2 is a constant. The expression (5) is manifestly symmetric under photon crossing. Unless the intercept α is an integer (footnote 2), the first term in the curly brackets contains a branch cut along the positive real axis, starting at 2mν = s_0 − m^2 + Q^2. The second is real there. One readily checks that, on the upper rim of this cut, the individual terms in the sum (5) differ from the asymptotic expression (4) only through contributions of O(s^(α−1)). The basic assumption made in [4] is that, in the limit ν → ∞ at fixed q^2, only the Reggeons survive, so that the difference tends to zero (footnote 3): We refer to this hypothesis as Reggeon dominance.
A nonzero limiting value in (6) would represent a ν-independent term. In Regge language, a term of this type would correspond to a fixed pole at angular momentum J = 0. The Reggeon dominance hypothesis (6) thus excludes the occurrence of such a fixed pole. The presence or absence of a fixed pole at J = 0 in Compton scattering is a standard topic in Regge pole theory [18] and the literature contains several works advocating the presence of such a contribution. In particular, the universality conjecture formulated in [19] has received considerable attention (see e.g. [20] and the papers quoted therein).
Note, however, that these considerations go beyond the safe grounds provided by asymptotic freedom. While the short distance properties of QCD ensure that, if both ν and q^2 are large, the behaviour of T_1(ν, q^2) and T_2(ν, q^2) is governed by the perturbative expansion in powers of the strong coupling constant, the behaviour in the Regge region, where only ν becomes large, is not controlled by the short distance properties of QCD. In particular, values of q^2 of the order of Λ_QCD^2 are outside the reach of perturbation theory, even if ν is large.
The perturbative analysis shows that an infinite set of graphs needs to be summed up to understand the high-energy behaviour of the amplitudes in the Regge region. The dominating contributions can be represented in terms of poles and cuts in the angular momentum plane (Reggeon calculus, Reggeon field theory). The behaviour of the sum thus differs qualitatively from the one of the individual diagrams.
There is solid experimental evidence for the presence of Reggeons also in the data. The relation (6) amounts to the assumption that the asymptotic behaviour of the current correlation function can be understood in terms of these. In the analysis described in the present paper, this assumption plays a key role. In particular, as will be demonstrated explicitly below, it uniquely fixes the subtraction function relevant for the difference between proton and neutron in terms of the electron cross sections, so that the entire self-energy difference can be expressed in terms of these cross sections. In other words, the necessity of a subtraction in the fixed-q^2 dispersion relation for T_1(ν, q^2) modifies the relation between the self-energy difference and the electron cross sections, but does not destroy it. The subtraction functions occurring in the fixed-t dispersion relations relevant for real Compton scattering are analyzed in [21,22]. As shown there, the experimental information on the differential cross sections can be used to impose bounds on the subtraction functions. In particular, these bounds lead to the conclusion that the electric polarizability of the proton is necessarily larger than the magnetic one, in conformity with experiment. An update of this work with the data available today is highly desirable. Unfortunately, this approach to the problem cannot readily be extended to virtual Compton scattering, because data on the differential cross sections are available only for real photons.
Footnote 2: Integer values of α require special treatment, but since this case does not arise for the parameterizations we are working with, we do not discuss it further.
Footnote 3: More precisely, it is assumed that the difference disappears rapidly, so that it obeys an unsubtracted dispersion relation.

Recent work
The numerical analysis of [4] was based on the scaling laws proposed by Bjorken [23]. The data available at the time were perfectly consistent with these, but Bjorken scaling correctly accounts for the short-distance properties of QCD only to leading order in the perturbative expansion in powers of α_s. The higher-order contributions generate specific violations of Bjorken scaling [24,25]. In the meantime, the implications of the phenomenon and the corresponding modification of the short-distance properties of the matrix elements ⟨p|T j^µ(x) j^ν(y)|p⟩ have been investigated by Collins [26]. Unfortunately, however, he did not reevaluate the self-energy difference in this framework. In fact, the question of whether the Reggeons do dominate the asymptotic behaviour or whether the amplitude in addition contains a fixed pole at I = 1, J = 0 is not touched at all in that work.
Motivated in part by the study of hadron electromagnetic mass shifts on the lattice (see, e.g., [27–29]), the Cottingham formula has recently been reexamined [30–36], but the central issue in this context – the possible occurrence of fixed poles – is not addressed in these papers, either. Instead, the electron cross sections σ_T, σ_L and the subtraction function S_1(q^2) are treated as physically independent quantities. The main problem with the framework set up in [31] is that a direct experimental determination of S_1(q^2) is not available. To bridge the gap, the authors set up a model which parameterizes the dependence of the subtraction function on q^2. The overall normalization, S_1(0), can in principle be determined from the difference between the magnetic polarizabilities of proton and neutron, although the experimental value is subject to rather large uncertainties [37]. The main problem in this approach, however, is the momentum dependence of the subtraction function, which leads to a systematic uncertainty that is difficult to quantify.

Structure of the present paper
The remaining sections are organized as follows. In section 2, we show how the Reggeon dominance hypothesis (6) fixes the subtraction function S_1(q^2) from space-like data alone. In section 3, we discuss the splitting of the amplitudes T_i into elastic and inelastic contributions. We derive sum rules for the nucleon polarizabilities in section 4, while a thorough phenomenological analysis is provided in section 5. In particular, the sum rules allow us to predict the difference between the electric polarizabilities of proton and neutron. In view of the fact that the proton polarizabilities are experimentally known more accurately, our result can be turned into a prediction of the electric polarizability of the neutron, which is consistent with observation but somewhat more precise. The magnetic polarizabilities then follow from the Baldin sum rule. Section 6 is devoted to the electromagnetic self-energies of proton and neutron. We discuss the renormalization of the Cottingham formula, in particular the role of the subtraction function in the evaluation of the self-energy, and provide a comparison with recent work on the issue. A summary and concluding remarks are given in section 7. In appendix A, we detail the notation used. Appendix B reviews those properties of Compton scattering we are making use of. In particular, we discuss the frame dependence of the spin average and derive the low-energy theorem which underlies the sum rule for the electric polarizability. Appendices C and D contain a short discussion of the role of causality in our analysis. Last but not least, we note that in [30–33], a comparison with the analysis of [4] is attempted. Unfortunately, many of the statements made there are simply incorrect. Some of the misconceptions are rectified in appendix E.

Determination of the subtraction function
The Regge amplitude obeys a once-subtracted fixed-q^2 dispersion relation: In the space-like region and for ν ≥ 0, the absorptive part of the amplitude specified in (5) is given by The Reggeon dominance hypothesis (6) implies that the difference between the full amplitude and the Regge contributions, T_1(ν, q^2) − T_1^R(ν, q^2), obeys an unsubtracted dispersion relation. In particular, the value of T_1(0, q^2) − T_1^R(0, q^2) is given by the unsubtracted dispersion integral over V_1(ν, q^2) − V_1^R(ν, q^2). Hence the subtraction function can be represented as This formula explicitly represents the subtraction function in terms of measurable quantities: the structure function V_1(ν, q^2) is determined by the cross sections for inclusive electron-nucleon scattering. The high-energy behaviour of these cross sections also determines the Reggeon residues β_α(Q^2) and thereby fixes the term T_1^R(0, q^2), as well as the corresponding contribution to the structure function, V_1^R(ν, q^2). If the trajectory intercepts α were all below zero, the unsubtracted dispersion integral over V_1^R(ν, q^2) would converge and would exactly compensate the first term on the right of (9) – the subtraction function would then be given by the unsubtracted dispersion integral over V_1(ν, q^2). The expression for the subtraction function in (9) shows how the divergence of the unsubtracted dispersion integral generated by the Reggeons is handled: the corresponding contribution is removed from the integrand, so that the integral converges also at the physical values of the intercepts. The modification is compensated by the term T_1^R(0, q^2), which must be added to the integral over the remainder. The procedure amounts to analytic continuation in α from negative values, where T_1^R(0, q^2) is given by the unsubtracted dispersion integral over V_1^R(ν, q^2), to the physical values, where that representation does not hold any more.
We emphasize that the specific form used for the Regge parameterization does not matter. In particular, the Regge amplitude specified in (5) involves a free parameter, s_0. Since it does not affect the leading term in the asymptotic behaviour, the value used for s_0 is irrelevant – our results are independent thereof. In the following, we simplify the equations by taking s_0 in the range s_0 ≥ (m + M_π)^2, which has the advantage that V_1^R(ν, q^2) then vanishes outside the inelastic region.
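The continuation in α described above can be illustrated with a toy integrand; this is only a sketch and not the dispersion integral of the text. It assumes an integrand ν^(α−1) + ν^(−2) on ν ≥ ν_0, where the power law stands in for the Regge tail and ν^(−2) for the remainder. All function names are hypothetical.

```python
# Toy model only: a pure power law stands in for the Regge tail and
# 1/nu**2 for the remainder. This is not the paper's V_1 or V_1^R.

def power_integral(alpha, nu0, cutoff):
    """Closed form of the integral of nu**(alpha - 1) from nu0 to cutoff."""
    return (cutoff**alpha - nu0**alpha) / alpha

def remainder_integral(nu0, cutoff):
    """Closed form of the integral of nu**(-2) from nu0 to cutoff."""
    return 1.0 / nu0 - 1.0 / cutoff

def unsubtracted(alpha, nu0, cutoff):
    """Integral over the full toy integrand, cut off at a finite upper limit."""
    return power_integral(alpha, nu0, cutoff) + remainder_integral(nu0, cutoff)

def continued(alpha, nu0):
    """Convergent remainder integral plus the power-law term continued in alpha.

    For alpha < 0 the power-law integral to infinity equals -nu0**alpha / alpha;
    the same expression is used as its continuation to alpha > 0.
    """
    return 1.0 / nu0 - nu0**alpha / alpha

nu0 = 1.0
# alpha < 0: both forms agree once the cutoff is large.
print(unsubtracted(-0.5, nu0, 1e12), continued(-0.5, nu0))
# alpha > 0: the cut-off integral keeps growing, the continued form stays finite.
for cutoff in (1e2, 1e4, 1e6):
    print(cutoff, unsubtracted(0.55, nu0, cutoff), continued(0.55, nu0))
```

In this toy setting, removing the power law from the integrand and adding back its continued value plays the role that the term T_1^R(0, q^2) plays in the sum rule for the subtraction function.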
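The statement about the range of s_0 can be checked with a few lines of arithmetic. The sketch below assumes approximate values for the nucleon and pion masses (in GeV, chosen for illustration and not taken from the text) and uses the kinematic relation s_+ = m^2 + 2mν − Q^2 quoted earlier. The function names are hypothetical.

```python
# Illustrative inputs in GeV (approximate values, assumed for this sketch).
M_NUCLEON = 0.938
M_PION = 0.138

def regge_branch_point(s0, Q2, m=M_NUCLEON):
    """nu at which the cut of the Regge term starts: 2 m nu = s0 - m^2 + Q^2."""
    return (s0 - m**2 + Q2) / (2.0 * m)

def inelastic_threshold(Q2, m=M_NUCLEON, m_pi=M_PION):
    """nu at which s_+ = (m + M_pi)^2, i.e. nu = M_pi + (M_pi^2 + Q^2) / (2 m)."""
    return m_pi + (m_pi**2 + Q2) / (2.0 * m)

s0_min = (M_NUCLEON + M_PION) ** 2
for Q2 in (0.0, 0.5, 2.0):
    # With s0 = (m + M_pi)^2 the cut starts exactly at the inelastic threshold;
    # any larger s0 moves the start of the cut to higher nu.
    print(Q2, regge_branch_point(s0_min, Q2), inelastic_threshold(Q2),
          regge_branch_point(1.2 * s0_min, Q2))
```

For s_0 = (m + M_π)^2 the two values coincide for every Q^2, and a larger s_0 pushes the start of the cut further into the inelastic region, which is the property used to simplify the equations.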

Elastic part
The contributions to the structure functions arising from the elastic reaction e + N → e + N are determined by the electromagnetic form factors of the nucleon. In the space-like region, these contributions are restricted to the lines q^2 = ±2νm and read (i = 1, 2) where G_E(t) and G_M(t) are the Sachs form factors. The elastic contributions to the time-ordered amplitudes T_1, T_2 cannot be specified as easily. In perturbation theory, they are usually referred to as Born terms and it is not a trivial matter to specify them at higher orders of the calculation. In effective low-energy theories, the decomposition into a Born term and a 'structure part' is not a simple matter, either. For a detailed discussion of these aspects, we refer to [38–40]. In the framework of dispersion theory, however, the decomposition is unambiguous. The reason is that analytic functions are fully determined by their singularities and their asymptotic behaviour: dispersion theory provides a representation of the amplitudes in terms of their singularities. In our framework, this representation is given by the dispersion relations (2) and the sum rule (9). The elastic contribution is the part of the amplitude which is generated by the singularities due to the elastic intermediate states. These are specified in (10). Accordingly, the elastic parts of T_1, T_2 are obtained by simply replacing V_1, V_2 with V_1^el, V_2^el and dropping the Regge contributions.
In the case of T_2, this leads to In the case of T_1(ν, q^2) there are two contributions, one from the subtraction function, the other from the subtracted dispersion integral: The sum rule (9) for the subtraction function implies Taken together, the two terms on the right-hand side of (12) yield the unsubtracted dispersion integral, so that the expression takes the same form as the one for T_2^el(ν, q^2): Inserting the explicit expressions for the elastic contributions to the structure functions, we obtain Both functions tend to zero when ν becomes large: by construction, the elastic part of T_1(ν, q^2) does not contain a singularity at infinity. Moreover, as demonstrated in appendix D, even taken by themselves, the elastic contributions can be represented in manifestly causal form. The explicit expression for the elastic part of the subtraction function, exclusively involves the form factors, which are known very precisely.

Inelastic part
We refer to the remainder as the inelastic part of the amplitude: In contrast to the elastic part, which contains the poles generated by the elastic intermediate states and is singular at the origin, the inelastic part is regular there. At high energies, the converse is true: while the elastic part tends to zero, the inelastic part includes the contributions from the Reggeons, which are singular at infinity. In particular, the sum rule for the inelastic part of the subtraction function reads: where ν_th = M_π + (M_π^2 − q^2)/(2m) denotes the inelastic threshold. The dispersive representation for the inelastic part of T_1(ν, q^2) then becomes In the case of T_2(ν, q^2), a subtraction is not needed. The contribution from the elastic intermediate state to the dispersion integral in (2) coincides with the expression for T_2^el(ν, q^2) in (15). Removing this part, which is even more singular at the origin than T_1^el(ν, q^2), we obtain the following representation for the inelastic part:

Subtraction function in terms of cross sections
The structure function V_1(ν, q^2) is a linear combination of the transverse and longitudinal cross sections, see appendix A: The representation of the subtraction function thus involves integrals over the transverse and longitudinal cross sections. For S_1^inel(q^2) and S_2^inel(q^2), the following integrals are relevant: Expressed in terms of these, S_1^inel(q^2) and S_2^inel(q^2) are given by While the transverse parts of Σ_1(Q^2) and Σ_2(Q^2) only differ in sign, the longitudinal parts are quite different. Regge asymptotics implies that σ_T as well as σ_L grow in proportion to ν^(α−1). Accordingly, the integral Σ^T(Q^2) converges – it represents a generalization of the integral relevant for the Baldin sum rule to Q^2 ≠ 0 (cf. subsection 4.2). While Σ_2^L(Q^2) is dominated by the contributions from the low-energy region and rapidly converges as well, it is essential that Reggeon exchange be accounted for in Σ_1^L(Q^2). We are assuming that, at high energies, the longitudinal cross section can be approximated with a representation of the form At Q^2 = 0, a Reggeon term proportional to ν^α in V_1 corresponds to a contribution to σ̄_L that is proportional to ν^(α+1). For nonzero values of Q^2, however, the factor in front of the sum implies that the corresponding cross section contains sub-leading contributions. As discussed at the end of section 2, the specific form used for the Regge parameterization is not essential – as long as it satisfies a once-subtracted dispersion relation and correctly represents the asymptotic behaviour of the physical cross section. We stick to the one specified in (5), which leads to (27).

Chiral expansion
Chiral perturbation theory (χPT) exploits the fact that in the limit m_u, m_d → 0 (at fixed Λ_QCD, m_s, ..., m_t) QCD acquires an exact chiral symmetry, which strongly constrains the low-energy properties of the amplitudes. The chiral perturbation series provides a representation of the quantities of interest in powers of momenta and quark masses. In the chiral limit, the pion is a massless particle, but when the quark masses m_u, m_d are turned on, the pion picks up mass in proportion to the square root thereof, M_π^2 = (m_u + m_d)B + O(m_q^2 log m_q).
In the context of the present paper, we only need the chiral expansion of the form factors G_E(q^2), G_M(q^2), and of the functions S_1(q^2), S_2(q^2). These quantities involve a single momentum variable, q^2. As we work in the isospin limit m_u = m_d, the corresponding chiral perturbation series involves an expansion in the two variables M_π and q^2. The series can be ordered in powers of M_π; the coefficients then depend on the ratio q^2/M_π^2, which counts as a quantity of O(1). In contrast to the straightforward Taylor series in powers of q^2, the chiral expansion is able to cope with the infrared singularities generated by the pions.
To leading order in the chiral expansion, the infrared singularities are described by a set of one-loop graphs of the effective theory [41]. In the case of the magnetic Sachs form factor, for instance, the evaluation of the relevant graphs within Heavy Baryon χPT leads to the following expression for the first non-leading term in the chiral expansion [42] (footnote 5): Up to and including O(M_π), the magnetic form factor can thus be represented in terms of the magnetic moment µ, the pion decay constant, F_π = 92.21(14) MeV [44], and the nucleon matrix element of the axial charge, g_A = 1.2723(23) [45]. The formula shows that, up to higher-order contributions, the singularity is described by a function of the ratio τ = −q^2/(4M_π^2): the scale is set by the pion mass, not by Λ_QCD. The presence of a scale that disappears in the chiral limit also manifests itself in the slope of the form factor at q^2 = 0, i.e. in the magnetic radius: the above representation shows that the chiral expansion of the magnetic radii of proton and neutron starts with a term of O(1/M_π).
Footnote 5: Note that the range of validity of the representation (29) is limited. The ππ intermediate states generate a branch point in the form factors at q^2 = 4M_π^2, which corresponds to τ = −1. While relativistic formulations of Baryon χPT do cover this region, an infinite series of Heavy Baryon χPT graphs contributes in the vicinity of that point, more precisely in the region where τ + 1 is small, of O(M_π^2/m^2) [43]. In the present paper, however, we make use of the chiral expansion only near τ = 0, where the nonrelativistic framework is adequate.
The low-energy behaviour of the electric Sachs form factors is less singular: Accordingly, the chiral expansion of the electric radii does not start with a term of O(1/M_π), but with a chiral logarithm, comparable to the situation with the charge radius of the pion.
The subtraction function also diverges if the chiral limit is taken at a fixed value of the ratio q^2/M_π^2: the leading term in the chiral expansion of S_1^inel(q^2) is of order 1/M_π and is determined by F_π and g_A as well [46]: The expansion of the analogous term in T_2 starts with [46]: In either case, the leading term is the same for proton and neutron – for S_1^inel(q^2) and S_2^inel(q^2), the chiral expansion of the difference between proton and neutron only starts at O(log M_π).

Low-energy theorems
In contrast to the elastic parts, which are singular at the origin, the inelastic contributions to T_1(ν, q^2), T_2(ν, q^2) do admit a Taylor series expansion in powers of ν and q^2. Two low-energy theorems relate the leading terms in this expansion to the polarizabilities of the nucleon. The theorems amount to rather nontrivial statements, because the functions T_1(ν, q^2), T_2(ν, q^2) represent the virtual Compton scattering amplitude in the forward direction, while the experimental determination of the polarizabilities relies on real Compton scattering at nonzero scattering angle. A concise derivation is given in appendix B.
In the above notation, the low-energy theorems take the simple form: where κ is the anomalous magnetic moment, α_E and β_M are the electric and magnetic polarizabilities of the particle, and α_em is the fine structure constant. These relations show that the polarizabilities contain an elastic as well as an inelastic part, while their sum, α_E + β_M, is purely inelastic: Table 1 shows that the elastic parts only represent a small fraction of the polarizabilities.
Table 1 (caption, truncated): ... [47,49,51] and analyses of the Baldin sum rule [48,50] (see also [37,52]). The latter results were imposed in [47,49], so that the quoted errors for α_E and β_M are anticorrelated.

Sum rules for the polarizabilities
The left-hand side of the low-energy theorem (33) represents the inelastic part of the subtraction function at q^2 = 0: The representation for the subtraction function in (18) thus amounts to a sum rule for the inelastic part of the magnetic polarizability. Adding the elastic contribution, the sum rule takes the form To our knowledge, this sum rule is new. It states that, in the absence of fixed poles, the magnetic polarizabilities of proton and neutron are determined by the cross sections for photo- and electroproduction. If the amplitude T_1(ν, q^2) were to obey an unsubtracted dispersion relation, the Regge terms in the expression for Σ_1^L(Q^2) could be dropped, so that the sum rule would reduce to the one proposed in [53]. Regge asymptotics implies that a subtraction is needed, but if the Reggeon trajectories and residues are known, the subtraction can be expressed in terms of these.
Evaluating the dispersive representation (20) at ν = q^2 = 0, we obtain The low-energy theorem (34) thus represents the familiar Baldin sum rule [54]. The integral occurring here is a limiting case of the quantity Σ^T(Q^2) introduced in (22): the Baldin sum rule amounts to Comparison with (37) shows that the electric polarizability obeys a sum rule that exclusively involves the longitudinal cross section and the anomalous magnetic moment:

Numerical analysis

Experimental information
We evaluate the cross section integrals on the following basis. W < 1.3. At low energies, the resonance ∆(1232) generates the most important inelastic contribution. It decays almost exclusively into πN final states which have been thoroughly explored. The SAID, MAID, Dubna-Mainz-Taipei (DMT), and chiral-MAID collaborations provide pion photo- and electroproduction cross sections into these channels [55–63]. For W < 1.3 and real photons (Q^2 = 0), the transverse cross section is well approximated by the sum over these contributions. In particular, the representations we are using are consistent with isospin symmetry, which implies that the contributions from the ∆ to the proton and neutron cross sections are the same up to symmetry-breaking effects of O(m_u − m_d, α_em), which are expected to be very small. Moreover, as seen from figure 1 (left panel), the ∆ dominates in the transverse cross sections and gives very small contributions to the longitudinal ones. This property is directly related to the smallness of the C2 Coulomb quadrupole form factor for the ∆Nγ* transition. In the non-relativistic quark model, where both the nucleon and the ∆ are zero-orbital-momentum three-quark states, this form factor, as well as the one of the E2 electric quadrupole, vanish altogether [66–68].
The comparison of the full lines in the two panels of figure 1 shows that, in the region where the ∆ generates the dominant contribution, the transverse cross sections for proton and neutron are indeed nearly the same: the differences are smaller than the individual terms by an entire order of magnitude [66]. For the polarizabilities, the behaviour of the ratio σ_L/Q^2 in the limit Q^2 → 0 is relevant. Since MAID and DMT offer a representation also for this quantity, these parameterizations are particularly convenient for us. For definiteness, we identify the central values of the cross sections in the region W < 1.3 with the average of MAID and DMT, abbreviated as MD: σ_MD = (σ_MAID + σ_DMT)/2. As far as the proton cross sections are concerned, the results obtained with SAID, MAID, and DMT are practically the same, but figure 2 shows that for the small differences between proton and neutron, this is not the case. The uncertainties in the input used for the cross sections do affect our numerical results and will be discussed together with these.
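The prescription for the central value in the region W < 1.3 can be written down in a few lines. The sketch below uses hypothetical input numbers in arbitrary units. Taking half the spread between the two parameterizations as an error proxy is an assumption made here for illustration, not a prescription stated in the text.

```python
def md_average(sigma_maid, sigma_dmt):
    """Central value sigma_MD = (sigma_MAID + sigma_DMT) / 2 and half the spread."""
    central = 0.5 * (sigma_maid + sigma_dmt)
    # Assumption of this sketch: half the MAID-DMT spread as an error proxy.
    half_spread = 0.5 * abs(sigma_maid - sigma_dmt)
    return central, half_spread

# Hypothetical cross-section values in arbitrary units, for illustration only.
central, half_spread = md_average(12.0, 10.0)
print(central, half_spread)
```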
In the intermediate region, we rely on the work of Bosted and Christy (BC), who provide parameterizations of the transverse and longitudinal proton and neutron cross sections in the resonance region, m + M_π < W < 3.2, in the range 0 < Q^2 < 8 [69,70]. These contain a wealth of information, but suffer from a number of shortcomings. First, their fit to the data is carried out under the assumption that the ratio σ_L/σ_T is the same for proton and neutron. An experimental analysis that does not rely on this assumption would be most welcome. Second, the parameterization does not properly cover the region of very small photon virtualities (cf. [71]): (a) The algebraic form of the representation used for σ_L implies that the quantity σ̄_L ≡ σ_L ν^2/Q^2 disappears when Q tends to zero instead of approaching a nonzero limiting value. (b) Isospin symmetry implies that the contributions of the resonance ∆(1232) to proton and neutron are the same, but, as noted in [36], the BC-parameterization does not respect this symmetry to the expected accuracy. (c) The parameterization of the contribution from the resonance N(1530) exhibits an unphysical dependence on Q^2: in the tiny interval 0 < Q^2 < 0.001, the contribution from this resonance to the transverse cross section of the proton varies by about 40%. Although this artefact only manifests itself at very small values of Q^2, it seriously affects our calculation because the results obtained for the polarizabilities depend on whether we simply evaluate the sum rules at Q^2 = 0 or use very small positive values of Q^2 – for the physical cross sections, a difference of this sort cannot arise.
In the interval 1.3 < W < 3, we use the following crude estimate for mean values and errors: (i) The central value is identified with the result obtained with the BC-parameterization. (ii) In order to wash out the spikes occurring at very small values of Q^2, we assign an 8% uncertainty to the BC-representation of the proton cross sections: ∆σ_p = 0.08 σ_p. (iii) Since the difference between the proton and neutron cross sections is much smaller than the individual terms, small relative errors in the latter can generate large relative errors in the difference. For this reason, we use the same error estimate for σ_(p−n) as for the individual terms, i.e. work with ∆σ_(p−n) = 0.08 σ_p.
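The effect of rule (iii) is easy to see in a small numerical sketch. The cross-section values below are hypothetical and in arbitrary units; only the 8% figure is taken from the text, and the function name is an illustrative choice.

```python
BC_RELATIVE_ERROR = 0.08  # 8% uncertainty assigned to the BC proton cross section

def bc_errors(sigma_p, sigma_n):
    """Return the p-n difference, the error on sigma_p, and the error on the difference."""
    delta_p = BC_RELATIVE_ERROR * sigma_p
    difference = sigma_p - sigma_n
    # Rule (iii): the difference carries the same absolute error as sigma_p.
    delta_difference = BC_RELATIVE_ERROR * sigma_p
    return difference, delta_p, delta_difference

# Hypothetical values: the difference is small compared with the individual terms.
difference, delta_p, delta_difference = bc_errors(200.0, 190.0)
print(difference, delta_p, delta_difference, delta_difference / abs(difference))
```

With inputs of this kind, an 8% error on the individual cross section already exceeds the difference itself, which illustrates why the uncertainty of the proton-neutron difference is large in this region.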
The comparison of the representations for the difference between the proton and neutron cross sections used below and above W = 1.3 offers a consistency test on our calculations. Figure 2 compares the representations of MAID and DMT with the uncertainty band attached to BC at the transition point. The left panel shows that the representations for the difference of the transverse cross sections used below and above that point agree with one another only for Q^2 > 0.5. The problem arises from the deficiencies mentioned above, which prevent us from reliably evaluating the cross section integrals at low values of Q^2. The right panel shows that the uncertainties in the difference of the longitudinal cross sections are considerable, but within these, the representations used are coherent.
Figure 2: Consistency check at the transition point W = 1.3. The plot compares the representations of MAID and DMT used below that point with the BC-parameterization used above it. The difference between MAID and DMT and the band attached to BC represent an estimate of the uncertainties to be attached to these parameterizations. As discussed in the text, the picture implies that these parameterizations provide a coherent framework only for Q^2 > 0.5.
W > 3. We estimate the contributions from higher energies with the representation of Alwall and Ingelman [72]. It is based on the vector-meson-dominance model of Sakurai, Schildknecht, Donnachie, and Landshoff [73][74][75][76] and offers a parameterization of the transverse and longitudinal cross sections of the form where s = W 2 is the square of the center-of-mass energy. The Pomeron cut is approximated by a Regge pole at α P = 1.091, while the Reggeons with the quantum numbers of f and a 2 are lumped together in a single contribution with α R = 0.55. The Pomeron residues of proton and neutron are the same: For the remainder, we follow [4], invoke SU (3), and stick to the value of the D/F ratio quoted there (for the definition of the Regge couplings D and F and a review of their determination, we refer to [77]): The parameterizations for the structure function F 2 of Capella et al. [78,79] and for the ratio σ L /σ T of Sibirtsev et al. [80] provide an alternative Regge representation of the cross sections, which we refer to as CS. In figure 3, the consistency check made at the transition point W = 1.3 is repeated for W = 3. The plot shows that, for Q 2 < 1.4, the central representation of AI is indeed contained in the uncertainty band attached to BC, while the one of CS runs above it. The comparison indicates that, at low values of Q 2 , working with AI yields a coherent picture, while with CS this is not the case. For Q 2 > 2, however, the situation is reversed: there, the AI-representation yields values for the difference between the transverse cross sections that are too small while the CS-representation is consistent with the values obtained from BC. This confirms the conclusion reached in [72]: the above form of the AI-representation applies as it stands only for Q 2 < 1. 
At higher values of Q 2 , the parameterization underestimates the size of the structure function F 2 and further contributions have to be added for the vector-meson-dominance formulae to become compatible with the observed behaviour. Since we do not account for these and the uncertainties we attach to the central representation do not cover the gap, the input we are working with becomes incoherent for Q 2 > 2. The right panel, on the other hand, shows that the representations we are using for the longitudinal cross section do survive the consistency test, irrespective of the value of Q 2 . Note that we are discussing the properties of the difference between the proton and neutron cross sections. The main problem here is that all of the well-established properties of the proton cross sections drop out when taking the difference between proton and neutron. High precision is required to measure the remainder, in particular also at high energies, where the Pomeron dominates the scenery. Also, since the longitudinal cross section is significantly smaller than the transverse one, pinning it down accurately is notoriously difficult. In both of the above representations, the ratio σ L /σ T is taken to be energy-independent. 8 This appears to be consistent with experiment, but since we are not aware of a theoretical explanation, a test of the validity of this assumption would be very useful. For recent applications of these representations to the amplitudes under consideration we refer to [71,81,82].

Evaluation of Σ T and Σ 2 for the proton
We start the discussion of the cross section integrals with the one over the transverse cross section, Σ T (Q 2 ), which is specified in (22). The value at the origin is relevant for the sum of the electric and magnetic polarizabilities, Σ T (0) = α E + β M . Since the longitudinal cross section vanishes at Q 2 = 0, the function Σ 2 (Q 2 ) takes the same value there. In fact, figure 4 shows that Σ 2 (Q 2 ) is dominated by the transverse part also at nonzero virtuality: the longitudinal part amounts to a modest correction.

8 For the representation (41), this property implies β L R /β T R = β L P /β T P . In the notation of [72], it corresponds to
As pointed out in [71], the structure function F 1 (x, Q 2 ) can also be used to continue the integral relevant for the Baldin sum rule to nonzero values of Q 2 : Since the integrand differs from the one relevant for Σ T (Q 2 ) only by the factor 1 + Q 2 /ν 2 , the quantity Σ F1 also reduces to α E + β M when Q 2 vanishes, but drops off somewhat less rapidly when Q 2 grows. The lines for Σ T , Σ 2 , and Σ F1 in figure 4 are obtained by using the parameterizations specified in subsection 5.1. As stated there, the contributions from the region W < 1.3 are evaluated with the mean of MAID and DMT, but we could just as well have used SAID: on this plot, the difference would barely be visible.
In [71], the function Σ F1 (Q 2 ) is instead evaluated with the BC-parameterization, also in the region of the ∆-resonance. This leads to the behaviour indicated by the dash-dotted line labelled Σ F1 . The topmost line, which is obtained by evaluating the same formula with the MD-parameterization, is higher by about 0.8 units. The difference is closely related to the fact that the BC-parameterization does not respect isospin symmetry to the expected accuracy (see the discussion in subsection 5.3).
As pointed out by Bernard, Kaiser, and Meißner [83], χPT neatly explains the size of the combination of polarizabilities occurring in the Baldin sum rule. The parameter-free expression (32) for the leading term in the chiral perturbation series of Σ 2 (Q 2 ) is shown as a dashed line. The comparison with the experimental result for α E + β M shows that, at small values of Q 2 , the leading term of the chiral series dominates. In the limit Q 2 → 0, this term reduces to In the chiral limit this formula diverges in inverse proportion to M π : if the quarks are taken massless, T inel 2 (ν, q 2 ) contains an infrared singularity at ν = q 2 = 0.
The same singularity also shows up in the Q 2 -dependence, which exhibits the presence of an unusually small scale: at leading order of the chiral expansion, the function Σ 2 (Q 2 ) depends on Q 2 only via the variable Hence the scale is set by 2M π rather than M ρ . Figure 4 shows that, in reality, Σ 2 (Q 2 ) drops even more rapidly, partly on account of the second-sheet pole associated with the ∆, partly due to other higher-order contributions of the chiral series [40,46,84,85].
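The statement that the leading chiral term dominates at Q 2 = 0 can be checked with a short numerical sketch. It assumes the standard leading-order heavy-baryon result α E + β M = 11 α g A 2 /(192 π F π 2 M π ), which may be written differently from the expression used in this paper; the input values for g A and F π are likewise assumptions of this illustration.

```python
import math

ALPHA_EM = 1.0 / 137.036  # fine-structure constant
G_A = 1.27                # axial coupling (assumed input)
F_PI = 92.2               # pion decay constant in MeV (assumed input)
M_PI = 139.57             # charged pion mass in MeV
HBARC = 197.327           # conversion constant in MeV fm


def baldin_leading_chiral(m_pi=M_PI):
    """Leading chiral term of alpha_E + beta_M in units of 1e-4 fm^3.

    Assumed formula: 11 alpha g_A^2 / (192 pi F_pi^2 M_pi).
    """
    value_mev3 = 11.0 * ALPHA_EM * G_A**2 / (192.0 * math.pi * F_PI**2 * m_pi)
    return value_mev3 * HBARC**3 * 1.0e4


print(f"leading chiral term: {baldin_leading_chiral():.1f} x 1e-4 fm^3")
# The 1/M_pi behaviour: halving the pion mass doubles the result.
print(baldin_leading_chiral(M_PI / 2) / baldin_leading_chiral(M_PI))
```

With these inputs the leading term comes out close to the values of (α E + β M ) p quoted in the text, and the explicit factor 1/M π makes the divergence in the chiral limit visible.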
The spike seen in figure 4 at tiny values of Q 2 illustrates the artefact mentioned in subsection 5.1, which concerns the contribution from the resonance N (1530): if the numerical values of the integrals in the region 0.002 < Q 2 < 0.005 are fit with a low order polynomial, the extrapolation to Q 2 = 0 is higher than the result of the direct evaluation at Q 2 = 0, by about 0.4 units. Since the experimental information from real Compton scattering and from photoproduction is more stringent than the one from electron scattering, which for these very small values of Q 2 necessarily involves extrapolations, we think that the results obtained by evaluating the integral over the transverse cross section at Q 2 = 0 are more reliable. The value obtained there with MAID or DMT is (α E + β M ) p = 14.1, while SAID yields a result that is lower by about 0.1 units. The numbers obtained at Q 2 = 0 with the parameterizations we are using thus agree with the result (α E + β M ) p = 13.8(4) quoted in the review [37], which stems from [48].
5.3 Σ T and Σ 2 : proton-neutron difference

Figure 5 shows the difference between the integrals over the proton and neutron cross sections. The picture looks very different from figure 4: while there, the curves start at Σ ≃ 14 and rapidly drop with Q 2 , those in figure 5 start at Σ ≃ 0 and stay there. Since the integrals under consideration are rapidly convergent, the behaviour of the cross sections in the resonance region is relevant. The reason why not much is left in the difference between proton and neutron is that, in that region, the proton and neutron cross sections are nearly the same.
In particular, as mentioned in subsection 5.1, isospin symmetry implies that the most prominent low-energy phenomenon, the ∆, drops out when taking the difference between the proton and neutron cross sections. The cancellation of the main contributions also manifests itself in the chiral perturbation series: the leading terms in Σ p 2 and Σ n 2 are large, of order 1/M π , but the coefficients are the same, so that the chiral expansion of Σ p−n 2 only starts at O(1). For our cross section integrals to exhibit these features, it is essential that the representations we are using in the region of the ∆ respect isospin symmetry. The dash-dotted line illustrates the fact that the BC-parameterization of the cross sections violates this constraint quite strongly: the bump seen around Q 2 ≃ 0.1 arises from the difference between proton and neutron which occurs in that parameterization in the region of the ∆. As mentioned above, the difference between the parameterizations MD and BC in the region W < 1.3 also shows up in figure 4. It so happens that the difference between the results obtained via extrapolation from Q 2 > 0.002 and via evaluation at Q 2 = 0 nearly cancels the one between the contributions from the region of the ∆ obtained with BC and with MD, so that the number obtained for α p E + β p M in [81] agrees with experiment.
The spike seen at very small values of Q 2 is about twice as large as the one in figure 4 and manifests itself much more prominently because the difference between proton and neutron is an order of magnitude smaller than the individual terms. The value obtained at Q 2 = 0 is consistent with the experimental result, (α E + β M ) p−n = −1.4(6).

Pomeron exchange
The integrals considered in the preceding two subsections converge rapidly. Their properties are governed by the low-energy behaviour of the cross sections -the asymptotic behaviour does not play a significant role. For the integral Σ L 1 (Q 2 ) specified in (23), the situation is very different: for this integral to converge, it is essential that the asymptotic behaviour of the longitudinal cross section be known, so that it can properly be accounted for. At high energies, the leading contribution stems from Pomeron exchange, which generates a branch point at J = 1 in the angular momentum plane. In phenomenological parameterizations, such as the one specified in (41), the branch cut is often approximated by a Regge pole in the range 1 < α P < 2. For this parameterization to have the required asymptotic accuracy, it must describe the contribution from the Pomeron up to terms that disappear in the limit ν → ∞.
The Regge representation we are using to describe the asymptotic behaviour of the structure functions leads to the parameterization (27). In this framework, the Pomeron term in (41) not only generates a leading contribution to the cross section with α = α P , but also a daughter with α = α P − 1. Furthermore, in contrast to the situation with the parameterization of the contributions from the nonleading Reggeons, the value of the parameter s 0 does matter here: a change in the value of s 0 generates an asymptotic contribution proportional to ν αP −1 . If the integral in (23) does converge for one particular value of s 0 , it diverges for any other value.
As an illustration of the mathematical problem we are facing here, consider a contribution of the form which is free of fixed poles. For ν ≥ 0, q 2 ≤ 0, the corresponding absorptive part is given by: In the limit δ → 0, the modification of the structure function disappears, while the change in the time-ordered amplitude does not, but takes the form of a fixed-pole contribution, ∆T 1 (ν, q 2 ) → ξ(q 2 ), which can have any desired value.
In short: although the hypothesis that the Reggeons properly account for the behaviour at large values of ν uniquely determines the subtraction function even if Pomeron exchange contributes, the evaluation of (23) requires knowledge of the asymptotic behaviour to an accuracy that is beyond reach. In the absence of theoretical information about the properties of the Pomeron, we are dealing with what Hadamard [86] called an ill-posed problem: in principle, the data do determine the solution, but tiny changes in the data (structure function) can lead to substantial changes in the solution (subtraction function). For this reason, we do not discuss the sum rules for the individual polarizabilities of proton and neutron any further.
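The nature of the ill-posed problem can be illustrated with a toy example that is not the contribution considered above, but shares its essential feature. The family f δ (ν) = δ ν −1−δ , defined for ν ≥ 1, is an assumption of this sketch: it vanishes pointwise when δ → 0, while its integral over the full range stays equal to 1.

```python
def pointwise(nu, delta):
    """Toy modification f_delta(nu) = delta * nu**(-1 - delta), for nu >= 1."""
    return delta * nu ** (-1.0 - delta)


def integral_up_to(cutoff, delta):
    """Exact integral of f_delta from nu = 1 to nu = cutoff.

    For cutoff -> infinity this tends to 1 for every delta > 0.
    """
    return 1.0 - cutoff ** (-delta)


for delta in (1e-1, 1e-2, 1e-3):
    print(
        f"delta = {delta:g}: f(10) = {pointwise(10.0, delta):.2e}, "
        f"integral up to 1e6 = {integral_up_to(1e6, delta):.4f}, "
        f"integral up to infinity = 1"
    )
```

In any finite range of ν the modification and its integral become arbitrarily small, yet the integral over all energies does not change. Data available up to a finite energy therefore cannot exclude a finite shift of a quantity that depends on the full integral.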
A model-independent determination of the subtraction function occurring in the dispersive representation of the proton Compton amplitude is also of interest in connection with the proton radius puzzle (for a recent review see [87]). As pointed out in [88], at least part of the discrepancy could be explained if for some reason the contribution to the Lamb shift that is governed by the virtual Compton scattering amplitude were significantly larger than expected. The χPT analyses [40,46,84,85] as well as the recent works on effective field theory [89] and finite-energy sum rules [90] were largely motivated by this puzzle; an improved knowledge of the subtraction function would be of interest also in that context. Unfortunately, however, a major breakthrough in the theoretical understanding of the Pomeron is required before the sum rule set up above could reliably be evaluated.

Evaluation of Σ L 1 for the proton-neutron difference
In the present subsection, we focus on the difference between proton and neutron, where the Pomeron drops out: the asymptotic behaviour of σ L (ν, Q 2 ) p−n is dominated by the nonleading terms in (41), which grow less rapidly with ν, so that the problems discussed in the preceding subsection do not arise. The analysis of the difference is of interest for two reasons: (1) The sum rule is obtained under the same premises (absence of fixed poles, Reggeon dominance hypothesis) as the Cottingham formula. Consequently, confronting the result with existing experimental information on the polarizabilities, one may test the validity of this hypothesis. (2) Our result for α p−n E is somewhat more accurate than the determination based on the current experimental information. Combined with the experimental values of the polarizabilities of the proton and the Baldin sum rule, this yields an improved prediction for the polarizabilities of the neutron.

Figure 6 compares the integrals over the transverse and longitudinal cross sections, for the difference between proton and neutron. The function Σ T (Q 2 ) p−n already occurred in figure 5; we are now merely focusing on a smaller range in the variable Q 2 . The plot shows that the integral Σ L 1 (Q 2 ) p−n behaves in a qualitatively different way. Both integrals are small, but while Σ T (Q 2 ) exhibits the pronounced spike at Q 2 = 0 discussed earlier, the dependence on Q 2 of Σ L 1 (Q 2 ) is dominated by the contribution from the region of the ∆, which is well understood: in particular, the MAID and DMT representations show nearly the same Q 2 -dependence. Using the mean of the two as central value and half of the difference as an estimate for the uncertainty for the contributions from W < 1.3 would in our opinion represent a fair recipe, but to stay on the conservative side, we double the error estimate. For the value of the integral at Q 2 = 0 this prescription yields Σ L 1 (0) MD = −1.4(4).
The contributions from intermediate energies, 1.3 < W < 3, are small: the estimate Σ L 1 (0) BC = 0.2(2) covers the deficiencies of the representation used there. Above that range, we use the AI-representation, attach an uncertainty of 30% to it, and get Σ L 1 (0) AI = −0.3(1). Adding errors in quadrature, we finally obtain
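As a rough cross-check of the arithmetic, the three contributions quoted above can be combined naively. The sketch below only uses the rounded numbers given in the text, so its output need not coincide digit by digit with the value obtained from unrounded input.

```python
import math

# (central value, uncertainty) of the contributions to Sigma^L_1(0)^{p-n},
# as quoted in the text for the three energy regions.
contributions = {
    "MD, W < 1.3": (-1.4, 0.4),
    "BC, 1.3 < W < 3": (0.2, 0.2),
    "AI, W > 3": (-0.3, 0.1),
}


def combine(parts):
    """Sum the central values and add the uncertainties in quadrature."""
    parts = list(parts)
    total = sum(value for value, _ in parts)
    error = math.sqrt(sum(err**2 for _, err in parts))
    return total, error


total, error = combine(contributions.values())
print(f"sum = {total:.2f}, quadrature error = {error:.2f}")
```

The uncertainty is dominated by the low-energy region, even after the conservative doubling of the error estimate adopted there.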

Prediction for the polarizabilities of the neutron
In view of the relation (40), the result (48) amounts to a prediction for the difference between the electric polarizabilities of proton and neutron: This is consistent with the current experimental value, α p−n E = −0.9(1.6), but significantly more precise. The numerical result obtained from the Baldin sum rule for the difference in the value of α E + β M between proton and neutron, (α E + β M ) p−n = −1.4(6), then implies According to (36), this result also determines the value of the subtraction function relevant for the self-energy difference at Q 2 = 0: Finally, combining the current experimental result for the electric and magnetic polarizabilities of the proton, α p E = 10.65 (50) and β p M = 3.15 (50), with the numbers for α p−n E and (α E +β M ) p−n , we arrive at a prediction for the electric and magnetic polarizabilities of the neutron: α n E = 12.3(7) , β n M = 2.9(0.9) .
Note that the procedure used avoids relying on the available parameterizations of the transverse cross section, which contain sharp spikes at very small values of Q 2 that make the evaluation of Σ T (0) problematic. We make use of the fact that those present in the longitudinal cross section are much milder and allow us to assign a meaningful uncertainty to Σ L 1 (0). We also emphasize that the fluctuations exclusively affect the behaviour at small values of Q 2 . For the evaluation of the electromagnetic self-energy to be discussed in section 6, these deficiencies are of no concern, because phase space suppresses the contributions from the vicinity of the point Q 2 = 0.
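The numbers quoted in this subsection can be checked for mutual consistency with a few lines of arithmetic. The sketch uses only the central values given in the text; the function names are introduced here for illustration.

```python
# Central values quoted in the text (same units as there).
ALPHA_P, BETA_P = 10.65, 3.15   # experimental proton polarizabilities
ALPHA_N, BETA_N = 12.3, 2.9     # predicted neutron polarizabilities
BALDIN_DIFF = -1.4              # (alpha_E + beta_M)^{p-n} from the Baldin sum rule
ALPHA_DIFF_EXP = (-0.9, 1.6)    # experimental alpha_E^{p-n}: (value, error)


def baldin_difference():
    """(alpha_E + beta_M)^p - (alpha_E + beta_M)^n from the quoted values."""
    return (ALPHA_P + BETA_P) - (ALPHA_N + BETA_N)


def implied_alpha_difference():
    """alpha_E^p - alpha_E^n implied by the quoted central values."""
    return ALPHA_P - ALPHA_N


print(f"Baldin difference: {baldin_difference():.2f} (quoted: {BALDIN_DIFF})")
value, error = ALPHA_DIFF_EXP
delta = implied_alpha_difference()
print(f"alpha_E^(p-n) implied: {delta:.2f}, experiment: {value}({error})")
print("within experimental uncertainty:", abs(delta - value) < error)
```

The predicted neutron values reproduce the Baldin difference, and the implied difference of the electric polarizabilities lies well inside the experimental range.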

Result for the subtraction function
According to (26), the inelastic part of the subtraction function relevant for the self-energy is determined by the difference between the integrals Σ L 1 (Q 2 ) p−n and Σ T (Q 2 ) p−n . The central values of these integrals are shown in figure 6. The narrow band in figure 7 indicates the corresponding result for the subtraction function. The width of the band is obtained by evaluating the uncertainties in the contributions arising from the three subintervals, separately for the transverse and longitudinal contributions, and adding the results in quadrature. For better visibility, the vertical axis is stretched with the inverse of the dipole form factor, As discussed in subsection 5.1, the region Q 2 < 0.5 contains unphysical fluctuations -this is why we chop the uncertainty band off there. Note also that, although the calculation returns reasonable results even at Q 2 = 2, it is not reliable there, because it does not account for the contributions by which the AI-parameterization needs to be supplemented in order to agree with experiment at those values of Q 2 (see subsection 5.1).
The figure also indicates the value S inel 1 (0) p−n = 1.0(2.7) obtained from the current experimental result for β p−n M , as well as our prediction in (51). These numbers concern the value of the subtraction function at Q 2 = 0, but are slightly displaced for better visibility.

Comparison with previous work
Recently, Walker-Loud, Carlson, and Miller [31] proposed a simple ansatz for the subtraction function. In our notation, their proposal amounts to The singularity at q 2 = 0 arises from the elastic contribution in (15).

Figure 7: The curves are drawn for the current experimental value, which is indicated by the error bar on the left and concerns the value at Q 2 = 0, but is displaced to make it visible. The range obtained with the model in [36] (ESTY) starts with the same width at Q 2 = 0, but shrinks as Q 2 grows. The comparatively narrow third band represents our work. We do not present an error estimate in the region 0 < Q 2 < 0.5, because there, our results are sensitive to the inadequacies of the parameterizations used for the cross sections, but we do show our prediction for the value of the subtraction function at Q 2 = 0.

The corresponding expression for the
inelastic part of the subtraction function, 9 is regular at q 2 = 0 and one readily checks that the ansatz is consistent with the low-energy theorem (36). It amounts to an extrapolation of that formula to nonzero values of q 2 , controlled by the parameter m 0 . In figure 7, this expression is indicated as a broad band of nearly constant width. The ansatz (53) for the subtraction function generates a logarithmic divergence in the integral (62) for the corresponding contribution to the self-energy difference. As discussed in subsection 6.1, the self-energy difference indeed diverges logarithmically. The divergence is absorbed in the electromagnetic renormalization of m u and m d , which is of order e 2 m u , e 2 m d . As pointed out by Erben et al. [36], the logarithmic divergence generated by the ansatz (53) is not proportional to the masses of the two lightest quarks and can thus not be absorbed in their renormalization: the particular extrapolation proposed in [31] is not consistent with the short-distance properties of QCD. The variant proposed in [36], repairs this shortcoming, as it disconnects the behaviour at small values of q 2 from the asymptotic behaviour. This expression is represented by the central band that gradually shrinks if Q 2 increases. The explicit choice made in [36] for the coefficient c implicitly assumes that the contribution from the subtracted dispersion integral, m disp γ , stays finite when the cut-off is removed, so that the logarithmic divergence then exclusively arises from the term m S γ , which stems from the subtraction function. As discussed in detail in [4], however, the deep inelastic region also contributes to the coefficient of the logarithmic divergence. The scaling violations do not extinguish this contribution [26]. Hence the choice made for c cannot be taken literally, but it does have the proper quark mass factors, so that the divergence arising from the subtraction function is suppressed. 
Since the authors cut the integral over the subtraction function off at Λ 2 = 2 GeV 2 , it barely makes any difference whether c is set equal to zero or taken from [36]. In fact, one of the variants of the model studied in [35] does correspond to c = 0.

Cottingham formula
The electromagnetic self-energy of a hadron diverges logarithmically. To first order in α em the renormalized electromagnetic Lagrangian requires counter terms proportional to the operators 1, q̄q, and O G = G a µν G aµν : where D Λ (x) is the regularized photon propagator in coordinate space. The counter term proportional to the unit operator does not contribute to the self-energy. The remainder is determined by the renormalization of the quark masses and of the coupling constant g required by the electromagnetic interaction. To leading order, these are given by (see for instance [91]): The form of the regularization used for the photon propagator is irrelevant: it exclusively affects the value of the running scale µ. The proton and neutron matrix elements of the operator (56) lead to a version of the Cottingham formula [12] that is valid in QCD: It represents the electromagnetic self-energy in terms of the time-ordered amplitudes T 1 and T 2 specified in appendix A.

Elastic part of the self-energy
Analogously to the electric and magnetic polarizabilities, the self-energy also consists of an elastic and an inelastic part, The contribution from the elastic intermediate states remains finite when the cut-off is removed. It is obtained by replacing T 1 , T 2 with the elastic parts T el 1 , T el 2 , which are given explicitly in (15), and replacing D Λ (q 2 ) with the full photon propagator, D(q 2 ) = (−q 2 − iε) −1 . With a Wick rotation, the expression can be brought to the form where v el 1 (q 2 ) and v el 2 (q 2 ) represent the sums of squares of form factors specified in (10). The variable y stands for y ≡ ν 2 /Q 2 . For the elastic contribution, which is concentrated on the line Q 2 = 2mν, we have y = Q 2 /4m 2 . In [4], the dipole approximation for the Sachs form factors was used, which yields m el γ = 0.63 MeV for the proton and −0.13 MeV for the neutron, so that the elastic contribution to the self-energy difference amounts to (m el γ ) p−n = 0.76 MeV. In the meantime, the precision to which the form factors are known has increased significantly. For a thorough review of the experimental information, we refer to [92]. The above estimates of the elastic contributions to the proton and neutron self-energies do receive significant corrections, but the difference between proton and neutron is affected by less than 0.02 MeV. Compared to the uncertainties in the contributions arising from the deep inelastic region, the departures from the dipole approximation are too small to matter.

Inelastic part of the self-energy
The inelastic part receives three distinct contributions: The term m S γ arises from the subtraction function S inel 1 (q 2 ), m disp γ is given by a dispersion integral over the structure functions, and m ct γ accounts for the fact that the electromagnetic interaction renormalizes the quark masses as well as the coupling constant of QCD. In the above discussion of the polarizabilities, renormalization did not play any role, because these concern the properties of T 1 , T 2 at low energies. In fact, the inelastic part of the magnetic polarizability exclusively picks up the contribution from the subtraction function specified in (36). In the decomposition used in (61), we have The term m S γ is obtained by replacing T 1 (ν, q 2 ) in (58) by the subtraction function S inel 1 (q 2 ), performing a Wick rotation, and averaging over the directions of the Euclidean momentum. The result reads This term measures the size of the self-energy arising from the subtraction function (more precisely, the inelastic part thereof; the remainder is included in δm el γ ). The second term on the right of (61) is obtained by replacing the amplitudes T 1 , T 2 with their inelastic parts T inel 1 , T inel 2 and dropping the contribution from the subtraction function in the dispersive representation for T inel 1 . The explicit expression reads The term with 3/2y makes the difference between the unsubtracted and subtracted dispersion integral over V 1 : it removes the leading term in the behaviour of f 1 when Q 2 is held fixed and ν tends to ∞, so that the integral over ν converges, despite the growth of V 1 generated by Reggeon exchange. On the other hand, when Q 2 becomes large, the behaviour in the deep inelastic region is relevant. In QCD, the contributions from that region diverge logarithmically if the cut-off is removed.
In (63), we have simply cut the integral off at Q 2 = Λ 2 : this amounts to a regularization of the photon propagator in Euclidean space: In the normalization of the states (A.2), the mass shift generated by the counter terms in (56) is given by Neglecting second-order isospin-breaking effects proportional to e 2 (m u − m d ), the proton and neutron matrix elements of operators with isospin zero are the same. Hence the operators O G , s̄s, c̄c, . . . drop out in the self-energy difference. Moreover, isospin symmetry relates the neutron matrix elements of the light quarks to those for the proton, e.g. ⟨k|ūu|k⟩ n = ⟨k|d̄d|k⟩ p . Using these properties, the contribution from the electromagnetic renormalization of the quark masses to the self-energy difference can be brought to the form The formula shows that the coefficient of the logarithmic divergence is proportional to the masses of the two lightest quarks. In the chiral limit the divergence disappears altogether: if u and d are taken massless, the self-energy difference approaches a finite limit if the cutoff is removed. In reality, the contributions from the deep inelastic region do generate a logarithmic divergence, albeit with a small coefficient. An update of the analysis performed in [4] is needed to account for the scaling violations in the corresponding contributions to the renormalized self-energy difference.

Numerical evaluation
In [31], the contribution from the subtraction function to the self-energy difference is evaluated with Λ 2 = 2 GeV 2 . According to (26), the inelastic part of the subtraction function is given by the difference between two cross section integrals. The part which involves the transverse cross sections, Σ T (Q 2 ), generates a convergent contribution to the formula (62) for the self-energy. As discussed above, our numerical representation of Σ T (Q 2 ) becomes incoherent at values of Q 2 below 0.5, but phase space suppresses that region, so that our estimate, m S γ (Σ T ) ≃ −0.14 MeV, should be close to the truth (actually, with a coherent representation of the available experimental information, this part could be evaluated rather accurately, even without cutting the integral off). The corresponding integral over the longitudinal cross section, Σ L 1 (Q 2 ), is less sensitive to the shortcomings of the representation we are using (this is why we were able to obtain a rather accurate prediction for the difference between the electric polarizabilities of proton and neutron). Numerically, the contribution from that integral to the self-energy difference is tiny: m S γ (Σ L 1 ) ≃ −0.03 MeV. In other words: the contributions from the Reggeons do require a subtraction, but taken together with those arising from low energies, the entire contribution from the longitudinal cross section to the subtraction function generates a negligibly small part of the self-energy difference. Together with the number for the contributions from the transverse cross section given above, we obtain This is to be compared with the number obtained by instead inserting the expression (54) (40), respectively, which is lower than our prediction in (50). The difference reflects the fact that, in figure 7, the bands that correspond to their models run above ours. 
While these extractions involve a model dependence which is difficult to quantify, there has recently been progress in the direct calculation of the polarizability from the lattice, see [93].
Note that the momentum dependence of the subtraction function must match the behaviour in the deep inelastic region. Taken by itself, the contribution from the subtraction function is very sensitive to the choice of the cut-off Λ. As shown in [4], the term m disp γ is equally sensitive, but the sum of the two contributions is nearly independent of Λ, because the Cottingham formula only contains the very weak logarithmic divergence that is related to the electromagnetic renormalization of the quark masses m u and m d . As indicated in (65), the coefficient of the divergence is proportional to these masses and hence very small. Also, it does not come exclusively from the subtraction function. The contributions to m disp γ arising from the deep inelastic region contribute to the coefficient of the logarithmic divergence as well. These were estimated in [4] on the basis of the data available at the time, which did not show any violations of Bjorken scaling. In the meantime, there has been considerable progress in understanding the properties of the structure functions in the deep inelastic region and there is very clear evidence for scaling violations. For a thorough review of these developments, we refer to [94]. A corresponding update of the results obtained on the basis of the Cottingham formula would be of high interest, also in view of the progress made in calculating electromagnetic self-energies on the lattice, but this goes beyond the scope of the present paper.
7 Summary and conclusion

1. Causality relates the imaginary part of the amplitude for Compton scattering on the nucleon in the forward direction to the cross section of the process $e + N \to e + \text{anything}$. The relation holds for real photons as well as virtual photons of spacelike momentum, $q^2 \le 0$. The spin-averaged forward scattering amplitude involves two invariants, which we denote by $T_1(\nu, q^2)$ and $T_2(\nu, q^2)$. Their imaginary parts are determined by the transverse and longitudinal cross sections of electron scattering, $\sigma_T$ and $\sigma_L$.

2. Regge asymptotics implies that only $T_2(\nu, q^2)$ obeys an unsubtracted fixed-$q^2$ dispersion relation, while the one for $T_1(\nu, q^2)$ requires a subtraction, which represents the value of the amplitude at $\nu = 0$: $S_1(q^2) = T_1(0, q^2)$. The dispersive representation of the spin-averaged forward Compton scattering amplitude thus consists of two parts: an integral over the cross sections $\sigma_T$, $\sigma_L$ and an integral over the subtraction function $S_1$. The same also holds for the Cottingham formula, which represents the electromagnetic self-energy of the nucleon in terms of the spin-averaged forward Compton amplitude.

3. It had been pointed out long ago [4] that, unless the Compton amplitude contains a fixed pole at $J = 0$, the subtraction function is unambiguously determined by the cross sections of electron scattering. We do not know of a proof that the Compton amplitude of QCD is free of fixed poles, but assume that this is the case and refer to this assumption as Reggeon dominance. As briefly discussed in subsection 1.2, the validity of this hypothesis is questioned in the literature. Indeed, an analysis of the Compton amplitude based on first principles that would determine the behaviour in the Regge region (high energies, low photon virtualities) is not available. If the hypothesis were to fail, this would be most interesting, as it would imply that the known contributions generated by the short-distance singularities and the exchange of Reggeons do not fully account for the high-energy behaviour of QCD.

4. On the basis of Reggeon dominance, we have derived an explicit representation of the subtraction function in terms of the electron scattering cross sections. The representation requires the asymptotic behaviour of the longitudinal cross section to be known up to contributions that disappear at high energies. For the proton Compton amplitude, where Pomeron exchange generates the dominating contribution, the available information does not suffice to reliably evaluate the subtraction function. In the difference between proton and neutron, however, the Pomeron drops out. We have shown that the experimental information available at low photon virtuality does suffice to work out the subtraction function relevant for this difference.

5. In [31], the electron cross sections $\sigma_T$, $\sigma_L$ and the subtraction function $S_1(q^2)$ are instead treated as physically independent quantities. The authors invoke the low-energy theorem that relates the value of the subtraction function at $q^2 = 0$ to the magnetic polarizability and use experimental information about the latter to pin down the value of the subtraction function at the origin. As direct experimental information about the $q^2$-dependence is not available, the authors construct a model for that. Figure 7 compares their model with our prediction. As pointed out in [36], the model of [31] is not consistent with the fact that the coefficient of the logarithmic divergence vanishes in the chiral limit. The alternative ansatz for the subtraction function proposed there, which does obey this constraint, is also shown in figure 7.

6. The authors of [31] use their ansatz for the subtraction function to evaluate the difference between the self-energies of proton and neutron and obtain $m_\gamma^{\text{WCM}} = 1.30(03)(47)$ MeV, significantly higher than the result obtained in [4], $m_\gamma^{\text{GL}} = 0.76(30)$ MeV. The difference is blamed on a 'technical oversight' committed in [4]. This claim is wrong: it suffices to replace their ansatz for the subtraction function with the parameter-free representation used in [4], which is spelt out explicitly in (26) above. Leaving all other elements of their calculation as they are, the central value for the self-energy difference then drops to $m_\gamma = 0.63$ MeV, thereby neatly confirming the old result. The same conclusion is reached with the calculation performed in [36].

7. We emphasize that the present work only concerns low photon virtualities. An update of the analysis carried out in [4] that accounts for the progress made on the experimental and theoretical sides during the last 40 years, in particular an evaluation of the contributions from the deep inelastic region that accounts for the violations of Bjorken scaling, is still missing.

8. Our representation for the subtraction function also leads to a prediction for the difference between the electric polarizabilities of proton and neutron. The result is given in (49). Using the currently accepted results obtained from the Baldin sum rule, this also determines the difference of the magnetic polarizabilities and, using the comparatively precise known value of the electric polarizability of the proton, we also obtain an estimate for the polarizabilities of the neutron. The result is given in (52).

9. The fact that the results obtained from Reggeon dominance are consistent with experiment and even somewhat more precise amounts to a nontrivial test of the hypothesis that the Compton amplitude is free of fixed poles. Quite apart from the possibility of taking new data at small photon virtuality, an improved representation of the available experimental information on the cross sections would allow us to reduce the uncertainties quite substantially: in particular, if the deficiencies of the available parameterizations mentioned in subsection 5.1 could be removed, the main source of uncertainties in our calculation would immediately disappear.

10. The main problem we are facing with our analysis is that all of the well-established features of electron scattering drop out when taking the difference between proton and neutron: the leading terms of the chiral perturbation series are the same, the contribution from the most prominent resonance, the $\Delta(1232)$, is the same, and the leading asymptotic term due to Pomeron exchange is also the same. Since all of these contributions cancel out, not much is left over. Only a fixed pole could prevent the subtraction function relevant for the difference between proton and neutron from being small. The available data do not exclude the occurrence of a fixed pole, but they indicate that if the phenomenon occurs at all, then the pole must have a rather small residue.

In particular, we thank P. Bosted for making the pertinent Fortran code available to us. J. G. thanks the Helmholtz-Institut für Strahlen- und Kernphysik at Rheinische Friedrich-Wilhelms-Universität Bonn for hospitality; part of this work was performed there. Financial support by BMBF ARCHES, the Helmholtz Alliance HA216/EMMI, the Swiss National Science Foundation, the Deutsche Forschungsgemeinschaft (CRC 16, "Subnuclear Structure of Matter" and CRC 110, "Symmetries and the Emergence of Structure in QCD"), the Volkswagenstiftung under contract no. 86260, and the DOE (Grant No. DE-FG02-00ER41132) is gratefully acknowledged.

A Notation
The structure functions are related to the Fourier transform of the spin-averaged matrix element of the current commutator, The states are normalized with and $\langle p|O|p\rangle$ stands for $\frac{1}{2}\sum_s \langle p,s|O|p,s\rangle$. For the Fourier transform of the time-ordered matrix element, $\langle p|T j^\mu(x) j^\nu(y)|p\rangle$, we use the normalization Since Lorentz invariance, current conservation, and parity only allow two independent tensors of this type, these matrix elements contain two invariants each, which only depend on the two variables $\nu \equiv p\cdot q/m$ and $q^2$ ($m$ is the mass of the nucleon). We denote the invariants by $V_1(\nu, q^2)$, $V_2(\nu, q^2)$ and $T_1(\nu, q^2)$, $T_2(\nu, q^2)$, respectively: In contrast to the standard structure functions $F_1$, $F_2$, the invariants $V_1$, $V_2$ are free of kinematic singularities or zeros. The two sets are related by with $x \equiv Q^2/2m\nu$. The notation for the longitudinal structure function $F_L$ is not universal. The convention used in the minireview on the structure functions in The Review of Particle Physics [95] reads The structure functions $V_1(\nu, q^2)$ and $V_2(\nu, q^2)$ represent linear combinations of the transverse and longitudinal cross sections $\sigma_T$ and $\sigma_L$: The value of the structure function $V_2(\nu, 0)$ also determines the total cross section for photoproduction, $\sigma_{\text{tot}} = \lim_{Q^2 \to 0} \sigma_T$: For the one-particle matrix elements of the current, we use the notation where $q = p_1 - p_2$. The nucleon spinors are normalized with $\bar u(p, s')\,u(p, s) = 2m\,\delta_{s's}$. The functions $F_D(t)$ and $F_P(t)$ are referred to as Dirac and Pauli form factors, respectively. Whenever convenient, we replace these by the Sachs form factors, which are defined by (A.11) In dipole approximation, the form factors are parameterized with where $\kappa = F_P(0)$ stands for the anomalous magnetic moment.
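The relation between the Dirac/Pauli and the Sachs form factors can be illustrated numerically. The following is a minimal sketch, not the parameterization used in this paper: it assumes the widely used convention $G_E = F_D + \frac{t}{4m^2}F_P$, $G_M = F_D + F_P$ and a dipole shape with mass scale $0.71\ \text{GeV}^2$. The function names and all numerical inputs are hypothetical choices made for illustration.

```python
def sachs_from_dirac_pauli(f_d, f_p, t, m):
    """Sachs form factors from Dirac/Pauli ones (assumed standard convention)."""
    tau = t / (4.0 * m * m)          # t <= 0 for space-like momentum transfer
    g_e = f_d + tau * f_p
    g_m = f_d + f_p
    return g_e, g_m


def dirac_pauli_from_sachs(g_e, g_m, t, m):
    """Inverse of sachs_from_dirac_pauli; regular for t <= 0 since 1 - tau >= 1."""
    tau = t / (4.0 * m * m)
    f_d = (g_e - tau * g_m) / (1.0 - tau)
    f_p = (g_m - g_e) / (1.0 - tau)
    return f_d, f_p


def dipole_sachs(t, charge, kappa, dipole_mass_sq=0.71):
    """Dipole ansatz (assumption): G_E = Q_e * G_D, G_M = (Q_e + kappa) * G_D."""
    g_d = 1.0 / (1.0 - t / dipole_mass_sq) ** 2
    return charge * g_d, (charge + kappa) * g_d


if __name__ == "__main__":
    m_nucleon = 0.938        # GeV, illustrative value
    kappa_p = 1.793          # proton anomalous magnetic moment, illustrative value
    for t in (0.0, -0.1, -0.5):
        g_e, g_m = dipole_sachs(t, 1.0, kappa_p)
        print(t, dirac_pauli_from_sachs(g_e, g_m, t, m_nucleon))
```

At $t = 0$ the sketch returns $F_D(0) = Q_e$ and $F_P(0) = \kappa$, in line with the statement that $\kappa = F_P(0)$ is the anomalous magnetic moment.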

B Compton scattering
Virtual Compton scattering in the non-forward direction provides the bridge between the two processes discussed in subsection 4.1: scattering of real photons at nonzero scattering angle and scattering of virtual photons in the forward direction. Compton scattering has been thoroughly explored in the literature, for the case where both of the two photons are on the mass shell (real Compton scattering, RCS) as well as when one of them (VCS) or both (VVCS) are off-shell [15,16,40,53,89,96-112]. We normalize the amplitude with and use matrix notation, collecting the different spin orientations in the $2 \times 2$ matrix $\hat T^{\mu\nu}(p_f, q_f|p_i, q_i)$. The spin average is given by the trace of this matrix, (B.14)

B.1 Lorentz invariance
The spin average is not independent of the Lorentz frame used. To see why this is so, consider a Lorentz transformation: We denote the pure Lorentz transformation (boost) that takes a particle at rest into one of four-momentum $p$ by $B_p$ and work in the basis where the state $|p, s\rangle$ is obtained from the corresponding state at rest by application of the relevant boost, $U(B_p)$. Lorentz transformations not only change the momentum, but also subject the spin direction to a rotation, referred to as Wigner rotation: The Wigner rotation arises because the boost $B_{\Lambda p}$ differs from $\Lambda B_p$ by a rotation, which we denote by $W(\Lambda, p)$: The matrix $\hat W_{s's}(\Lambda, p)$ in (B.16) is the spin-$\frac{1}{2}$ representation of $W(\Lambda, p)$. If $\Lambda$ is a pure rotation, we have $W(R, p) = R$. Also, since the product of two boosts in the same direction is again a boost in that direction, a pure Lorentz transformation in the direction of $\mathbf{p}$ does not generate a Wigner rotation.
Lorentz invariance implies the transformation law In the trace, the Wigner rotations only drop out for those Lorentz transformations for which $W(\Lambda, p_i) = W(\Lambda, p_f)$. In general, this condition is violated. Hence knowledge of the spin average in one particular frame of reference does not in general suffice to determine the spin average in a different frame: the transformation law (B.18) involves the entire matrix $\hat T^{\mu\nu}$, including the spin-flip components of the amplitude.
In the Breit frame, $\mathbf{p}_f + \mathbf{p}_i = 0$, the momenta of the initial and final states point in opposite directions. The boost which takes the Breit frame into the Lab frame, where $\mathbf{p}_i = 0$, is a pure Lorentz transformation in that direction. Hence the change of frame does not generate a Wigner rotation. Accordingly, the spin average in the Lab is determined by the spin average in the Breit frame.
For the Lorentz transformation that takes the Breit frame into the centre-of-mass system, however, this is not the case: the Wigner rotation generated by this transformation for the initial state differs from the one relevant for the final state, $W(\Lambda, p_i) \neq W(\Lambda, p_f)$. Hence knowledge of the spin average in the Breit frame does not suffice to evaluate the spin average in the centre-of-mass system or vice versa.

B.2 Crossing symmetry, parity, and time reversal
The symmetry of the time-ordered product, $T j^\mu(x) j^\nu(y) = T j^\nu(y) j^\mu(x)$, implies invariance under crossing of the photons (footnote 10): Invariance under space reflections amounts to: where $\pi = \mathrm{diag}(1, -1, -1, -1)$ inverts the sign of the space components but leaves the time components alone. Time reversal not only inverts the momentum and spin directions, but in addition interchanges the initial and final states. Moreover, the amplitudes are mapped into their complex conjugates. Exploiting the fact that the Hermitian conjugate of the operator $j^\mu(x) j^\nu(y)$ is given by $j^\nu(y) j^\mu(x)$, time reversal invariance can be brought to the form The superscript $T$ indicates that the transposed matrix is relevant. The matrix $\epsilon = i\sigma_2$ flips the spin in the initial and final states. For the Pauli matrices, we have $\epsilon\,\boldsymbol{\sigma}^T \epsilon^{-1} = -\boldsymbol{\sigma}$. The above relations lead to the following symmetry property of $\hat T^{\mu\nu}$: It implies that, in the decomposition $\hat T^{\mu\nu} = T^{\mu\nu}\,\mathbb{1} + \sum_i T_i^{\mu\nu}\sigma_i$ of the amplitude in the basis spanned by $\mathbb{1}, \sigma_1, \sigma_2, \sigma_3$, the spin-independent part, $T^{\mu\nu}$, is even under the operation $p_i \leftrightarrow p_f$, $q_i \to -q_i$, $q_f \to -q_f$, while the spin-dependent part, $T_i^{\mu\nu}$, is odd. The mapping interchanges the Mandelstam variables $s = (p_i + q_i)^2$ and $u = (p_i - q_f)^2$, but $t = (p_f - p_i)^2$ as well as the photon virtualities $q_i^2$, $q_f^2$ stay put. While the Breit frame is invariant under this operation, neither the Lab frame nor the centre-of-mass system has that property.

B.3 Low-energy expansion
The elastic intermediate states generate poles in $T^{\mu\nu}$. In the Mandelstam variables, the poles are located at $s = m^2$ and $u = m^2$. We refer to these contributions as Born terms and denote them by $T_B^{\mu\nu}$, The decomposition is not unique [38,40,102]. The essential property of the Born terms is that they account for the elastic singularities. This ensures that the remainder, $\bar T^{\mu\nu}$, is regular at $q_i = q_f = 0$ and can thus be expanded in a Taylor series in powers of the photon momenta and energies, which the low-energy expansion treats as small. The construction described below leads to Born terms that are conserved, so that this also holds for the remainder (see footnote 11): Keeping $P \equiv \frac{1}{2}(p_i + p_f)$ fixed, momentum conservation determines the initial and final nucleon momenta in terms of the photon variables $q_i$, $q_f$. However, unless $\mathbf{P}$ vanishes, energy conservation leads to a nonlinear constraint on the photon energies, so that it is not consistent to treat all of these as quantities of $O(q)$. The problem disappears for $\mathbf{P} = 0$, i.e. in the Breit frame: energy conservation then implies $\omega_i = \omega_f = \omega$, so that the kinematics is unambiguously determined by the independent variables $\omega$, $\mathbf{q}_i$, $\mathbf{q}_f$, which all count as quantities of $O(q)$. This is why the Breit frame is the preferred frame of reference for the low-energy expansion. For real Compton scattering, a transparent discussion of this issue is given in [113].
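The Breit-frame kinematics described above can be checked with a few lines of code. The sketch below is illustrative only: it builds on-shell nucleon momenta with $\mathbf{p}_i = -\mathbf{p}_f$ from freely chosen values of $\omega$, $\mathbf{q}_i$, $\mathbf{q}_f$ (the numbers and function names are hypothetical) and evaluates the Mandelstam variables. Under these assumptions one finds $s - u = 4E\omega$, where $E$ is the common nucleon energy, which reduces to $\omega = (s-u)/4m$ in the forward direction.

```python
import math


def mdot(a, b):
    """Minkowski product with signature (+, -, -, -)."""
    return a[0] * b[0] - sum(x * y for x, y in zip(a[1:], b[1:]))


def breit_momenta(m, omega, qi_vec, qf_vec):
    """Four-momenta (p_i, q_i, p_f, q_f) in the Breit frame, where p_i + p_f has no space part."""
    delta = [a - b for a, b in zip(qi_vec, qf_vec)]            # three-momentum given to the nucleon
    energy = math.sqrt(m * m + sum(d * d for d in delta) / 4)  # same for initial and final nucleon
    p_i = (energy, *[-d / 2 for d in delta])
    p_f = (energy, *[d / 2 for d in delta])
    q_i = (omega, *qi_vec)                                     # equal photon energies in this frame
    q_f = (omega, *qf_vec)
    return p_i, q_i, p_f, q_f


def mandelstam(p_i, q_i, p_f, q_f):
    """s = (p_i + q_i)^2, t = (p_f - p_i)^2, u = (p_i - q_f)^2."""
    add = lambda a, b: tuple(x + y for x, y in zip(a, b))
    sub = lambda a, b: tuple(x - y for x, y in zip(a, b))
    s_vec, t_vec, u_vec = add(p_i, q_i), sub(p_f, p_i), sub(p_i, q_f)
    return mdot(s_vec, s_vec), mdot(t_vec, t_vec), mdot(u_vec, u_vec)


if __name__ == "__main__":
    m = 0.938
    p_i, q_i, p_f, q_f = breit_momenta(m, 0.05, (0.03, 0.01, -0.02), (0.0, 0.04, 0.01))
    s, t, u = mandelstam(p_i, q_i, p_f, q_f)
    print("s + t + u - (2m^2 + q_i^2 + q_f^2) =", s + t + u - 2 * m * m - mdot(q_i, q_i) - mdot(q_f, q_f))
    print("omega recovered from (s - u)/(4E)  =", (s - u) / (4 * p_i[0]))
```

The design choice here is to treat $\omega$, $\mathbf{q}_i$, $\mathbf{q}_f$ as the independent inputs, mirroring the counting of the low-energy expansion; the nucleon momenta then follow from momentum conservation.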
In the Breit frame, the Taylor series in powers of the variables $\omega$, $\mathbf{q}_i$, $\mathbf{q}_f$ takes the form: . . exclusively contain spin-dependent terms and do not contribute to the spin average.

Footnote 11: In general, it is not a trivial matter to impose current conservation on the Born terms. If one for instance evaluates the one-particle singularities in the space components $T_B^{ab}$ and determines the remaining components of $T_B^{\mu\nu}$ by solving the constraints $q^f_\mu T_B^{\mu\nu} = q^i_\nu T_B^{\mu\nu} = 0$, one in general arrives at a representation for the Born terms that contains kinematic singularities. The presence of kinematic singularities in the Born terms complicates the analysis because the remainder $\bar T^{\mu\nu}$ is then not regular at $q_i = q_f = 0$. Compton scattering on the pion illustrates the problem: the one-particle singularities do not generate a term proportional to $g^{\mu\nu}$, but unless a regular term of this type is allowed for, the representation of the Born terms can be consistent with current conservation only if it contains kinematic singularities.
We now turn to the consequences of Lorentz invariance for the low-energy expansion. If the photon energies and momenta are small of O(q), the momentum transferred to the nucleon is small as well. We thus only need to consider Lorentz frames where the nucleon momenta are also small of O(q). The standard choices (Laboratory, centre-of-mass system, Breit frame) all belong to this category. If a Lorentz transformation Λ is to connect two such frames, then the relative velocity must be small, so that the standard decomposition into a boost and a rotation, Λ = BR only involves a small boost. Hence it suffices to analyze the transformation properties under rotations and under small boosts.
The behaviour under rotations is trivial, because the corresponding Wigner rotations in the initial and final states are identical, $W(R, p_i) = W(R, p_f) = R$. Hence they leave the spin-independent part of the amplitude alone and transform the matrices $\boldsymbol{\sigma} = \{\sigma_1, \sigma_2, \sigma_3\}$ occurring in the spin-dependent part like a vector.
To analyze the properties of small boosts, the $SL(2, C)$ representation of the Lorentz group is more convenient than the one acting on the coordinates and momenta. Consider the matrix $B = \exp(\frac{1}{2}\mathbf{v}\cdot\boldsymbol{\sigma}) \in SL(2, C)$, which represents a pure Lorentz transformation with a small velocity $\mathbf{v} = O(q)$. The product of two such boosts is given by this is a pure Lorentz transformation (footnote 12), but at $O(q^2)$, the product in addition contains a small Wigner rotation: . The essential point here is that, for the pure Lorentz transformation needed to remove the relative velocity of the two systems, the Wigner rotations generated in the initial and final states are at most of order $q^2$. Accordingly, the Lorentz invariance condition (B.18) implies that, if $\Lambda$ is a boost that only generates velocities of $O(q)$, the individual terms of the low-energy expansion transform like ordinary tensors, up to higher order corrections: The corrections only matter if the expansion is taken beyond next-to-leading order. In particular, the leading even term of the series is proportional to the unit matrix in spin space even if the reference system is not identified with the Breit frame.

Footnote 12: The fact that the velocity addition is modified only shows up at $O(q^3)$.
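The size of the Wigner rotation contained in a product of two small boosts can be illustrated numerically. The sketch below is our own check, not a formula taken from the text: expanding $\exp(\frac{1}{2}\mathbf{v}_1\cdot\boldsymbol{\sigma})\exp(\frac{1}{2}\mathbf{v}_2\cdot\boldsymbol{\sigma})$ with the Baker-Campbell-Hausdorff formula suggests a rotation angle of approximately $\frac{1}{2}|\mathbf{v}_1 \times \mathbf{v}_2|$, which is of $O(q^2)$ and vanishes for parallel boosts. The code writes every $2\times 2$ matrix as $a_0 \mathbb{1} + \mathbf{a}\cdot\boldsymbol{\sigma}$ and extracts the rotation from the polar decomposition (boost times rotation); all function names are hypothetical.

```python
import math


def pauli_mul(a, b):
    """Product of 2x2 matrices written as a0*1 + a_vec.sigma (complex coefficients)."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    dot = a1 * b1 + a2 * b2 + a3 * b3
    cross = (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)
    return (a0 * b0 + dot,
            a0 * b1 + b0 * a1 + 1j * cross[0],
            a0 * b2 + b0 * a2 + 1j * cross[1],
            a0 * b3 + b0 * a3 + 1j * cross[2])


def dagger(a):
    """Hermitian conjugate: the Pauli matrices are Hermitian, so only the coefficients change."""
    return tuple(complex(x).conjugate() for x in a)


def boost(v):
    """exp(v.sigma/2) = cosh(|v|/2) + sinh(|v|/2) v_hat.sigma."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return (1.0, 0.0, 0.0, 0.0)
    c, s = math.cosh(norm / 2), math.sinh(norm / 2)
    return (c, s * v[0] / norm, s * v[1] / norm, s * v[2] / norm)


def wigner_angle(v1, v2):
    """Rotation angle in the polar decomposition boost(v1) * boost(v2) = H * U."""
    m = pauli_mul(boost(v1), boost(v2))
    h_sq = pauli_mul(m, dagger(m))                 # H^2: positive, Hermitian, unit determinant
    scale = math.sqrt(2.0 * h_sq[0].real + 2.0)    # sqrt(A) = (A + 1)/sqrt(tr A + 2) if det A = 1
    h = ((h_sq[0] + 1.0) / scale, h_sq[1] / scale, h_sq[2] / scale, h_sq[3] / scale)
    h_inv = (h[0], -h[1], -h[2], -h[3])            # inverse of a unit-determinant matrix
    u = pauli_mul(h_inv, m)                        # unitary part, cos(theta/2) - i sin(theta/2) n.sigma
    sin_half = math.sqrt(sum(abs(x) ** 2 for x in u[1:]))
    return 2.0 * math.asin(min(1.0, sin_half))


if __name__ == "__main__":
    print(wigner_angle((0.1, 0.0, 0.0), (0.0, 0.1, 0.0)))   # perpendicular boosts
    print(wigner_angle((0.1, 0.0, 0.0), (0.2, 0.0, 0.0)))   # parallel boosts
```

Halving both velocities reduces the angle by roughly a factor of four, in line with the statement that the Wigner rotation only enters at second order in the small velocities.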

B.4 Spin average
The Breit frame also offers a convenient decomposition of the amplitude into independent tensors. In the following, we explicitly construct the decomposition for the spin average, which we denote by $T^{\mu\nu}$. Current conservation ($q^f_\mu T^{\mu\nu} = q^i_\nu T^{\mu\nu} = 0$) implies that the amplitude is uniquely determined by its space components. Furthermore, rotation invariance ensures that the spin average involves five independent amplitudes ($a, b = 1, 2, 3$): (B.27) Since this decomposition exclusively makes use of rotation invariance, the coefficients $I_n$ are free of kinematic singularities. They depend on the rotation-invariant quantities $\omega$, $|\mathbf{q}_i|$, $|\mathbf{q}_f|$, and $\mathbf{q}_i\cdot\mathbf{q}_f$, which can be expressed in terms of the Mandelstam variables and the photon virtualities. In view of $s + t + u = 2m^2 + q_i^2 + q_f^2$, only four of these are independent, for instance: $s$, $u$, $q_i^2$, $q_f^2$. The trace of the relation (B.22) implies that the invariants are symmetric under the interchange of $s$ and $u$,

$I_n(s, u, q_i^2, q_f^2) = I_n(u, s, q_i^2, q_f^2)$, $n = 1, \ldots, 5$, (B.28)

and the crossing symmetry relation (B.19) then shows that they are symmetric under $q_i^2 \leftrightarrow q_f^2$ as well, except that $I_3$ and $I_4$ are interchanged,

$I_n(s, u, q_i^2, q_f^2) = I_n(s, u, q_f^2, q_i^2)$, $n = 1, 2, 5$, $\quad I_3(s, u, q_i^2, q_f^2) = I_4(s, u, q_f^2, q_i^2)$. (B.29)

The last relation implies that if the photon virtualities are the same, in particular for real Compton scattering, there are only four independent amplitudes [15]. We add a remark concerning the spin average for the case where the scattering amplitude is written in the form If the momenta $\mathbf{p}_i$ and $\mathbf{p}_f$ are parallel, the sum over the spins can be represented as a product of projectors: so that the spin average can be represented as

(B.32)
This formula might suggest that a frame-independent definition of the spin average does exist. As already noted by Tarrach [15], this is not the case, however: the relation (B.32) is correct only in those Lorentz frames where p i and p f are parallel. In particular, it does not hold in the centre-of-mass frame, where the right hand side of (B.32) does not represent the spin average performed on the left hand side.

B.5 Born terms
The Born terms contain poles along the lines $s = m^2$ and $u = m^2$. The residue of the pole in the $s$-channel involves a sum over the one-particle matrix elements of the current: At the poles, the momentum of the intermediate state is on the mass shell, $p_n^2 = m^2$, and the sum over the spin directions is given by $\sum_s u(p_n, s)\,\bar u(p_n, s) = \slashed{p}_n + m$. (B.34) We specify the residues off the mass shell by analytic continuation, simply replacing $p_n$ with the total momentum, $P = p_i + q_i = p_f + q_f$, also for $P^2 \neq m^2$: The two terms represent the two tree graphs obtained with the standard Feynman rules, except that the photon-nucleon vertices are equipped with form factors according to (A.10). Taken separately, the two terms do not obey current conservation, but taken together they do.
In the Breit frame, the spin average can be evaluated with (B.31). Comparing the resulting expression for the space components with (B.27), we obtain the following explicit representation for the Born terms $I_n^B$, $n = 1, \ldots, 5$: The symbol $F_D^i$ stands for $F_D(q_i^2)$, and $F_D^f$, $F_P^i$, $F_P^f$ are defined analogously. In the Breit frame, the variable $\omega$ represents the energy of the photons. Expressed in terms of the Mandelstam variables, we have These expressions of course satisfy the relations (B.28) and (B.29). In the case of equal photon virtualities, Born terms for the spin-averaged amplitude are also provided in [40]. The representation specified in (3) and (4) of that work differs from ours in the overall normalization of the amplitude: basically, it amounts to replacing the factor $K$ in (B.31) by $2m$.

B.6 Leading low-energy constants
Since $\omega$, $\mathbf{q}_i$, $\mathbf{q}_f$ are independent variables and $\bar T$ is a linear combination of the variables $\omega$, $\mathbf{q}_i$, $\mathbf{q}_f$, but this property is inconsistent with current conservation: since invariance under space reflections requires the components $T$ Indeed, the following calculation shows that there are exactly two independent conserved tensors of that order. As noted above, the contribution of $O(q^2)$ is spin independent, $\bar T_2^{\mu\nu} = T_2^{\mu\nu}\,\mathbb{1}$, and rotation invariance requires the space components $\bar T_2^{ab}$ to be of the form (B.27) with $\bar I_1 = c_1\omega^2 + c_2\,\mathbf{q}_f\cdot\mathbf{q}_i + c_3\,\mathbf{q}_i^2 + c_4\,\mathbf{q}_f^2$, while the other coefficients are constants. For $\omega = 0$, current conservation requires $q_f^a \bar T_2^{ab} = q_i^b \bar T_2^{ab} = 0$. This implies $\bar I_2 = -c_2$, $\bar I_3 = \bar I_4 = \bar I_5 = c_3 = c_4 = 0$. For the space components, the general solution of the conditions imposed by rotation invariance and current conservation at $O(q^2)$ thus reads Current conservation then fixes the remaining components in terms of the same two constants: As observed already by Klein [114], there are only two conserved tensors of polynomial form at $O(q^2)$.
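The counting argument can be made concrete with a small numerical check. The sketch below rests on an assumption about the tensor basis of (B.27), which is not reproduced here: it takes the structure multiplying $I_2$ to be $q_i^a q_f^b$, so that the solution quoted above corresponds to $\bar T_2^{ab} = c_1\omega^2\delta^{ab} + c_2(\mathbf{q}_f\cdot\mathbf{q}_i\,\delta^{ab} - q_i^a q_f^b)$. The code verifies that this expression is transverse at $\omega = 0$ and that an extra term $c_3\,\mathbf{q}_i^2\,\delta^{ab}$ would spoil transversality. Function names and numerical values are hypothetical.

```python
def second_order_space_tensor(c1, c2, omega, q_i, q_f, c3=0.0):
    """Space components at O(q^2); the basis tensor q_i^a q_f^b is an assumption."""
    dot = sum(a * b for a, b in zip(q_f, q_i))
    qi_sq = sum(a * a for a in q_i)
    i1 = c1 * omega ** 2 + c2 * dot + c3 * qi_sq   # coefficient of delta^{ab}
    return [[i1 * (1.0 if a == b else 0.0) - c2 * q_i[a] * q_f[b]
             for b in range(3)] for a in range(3)]


def contractions(tensor, q_i, q_f):
    """Return (q_f^a T^{ab}, T^{ab} q_i^b), which must vanish at omega = 0."""
    left = [sum(q_f[a] * tensor[a][b] for a in range(3)) for b in range(3)]
    right = [sum(tensor[a][b] * q_i[b] for b in range(3)) for a in range(3)]
    return left, right


if __name__ == "__main__":
    q_i, q_f = (0.03, 0.01, -0.02), (0.0, 0.04, 0.01)
    conserved = second_order_space_tensor(1.3, -0.7, 0.0, q_i, q_f)
    print(contractions(conserved, q_i, q_f))       # both vectors vanish up to rounding
    spoiled = second_order_space_tensor(1.3, -0.7, 0.0, q_i, q_f, c3=1.0)
    print(contractions(spoiled, q_i, q_f))         # nonzero: c3 is not allowed
```

Projected onto transverse polarization vectors, the $c_2$ structure reduces to $(\mathbf{q}_f\times\boldsymbol{\epsilon}_f)\cdot(\mathbf{q}_i\times\boldsymbol{\epsilon}_i)$, which is the familiar magnetic coupling.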

B.7 Real Compton scattering
For real photons, $q_i^2 = q_f^2 = 0$, the projection onto the polarization vectors ($\epsilon^0 = 0$, $\boldsymbol{\epsilon}\cdot\mathbf{q} = 0$) annihilates the time components as well as the contributions from $I_3$, $I_4$, $I_5$. At low frequencies, the scattering amplitude is fully determined by the charge of the particle [115], which we express in units of the proton charge, $Q_e \equiv F_D(0)$. At first order in the expansion in powers of the photon frequency $\omega$, the anomalous magnetic moment $\kappa = F_P(0)$ also shows up [116,117], and at $O(\omega^2)$, further contributions, characterized by the two low-energy constants in (B.38), manifest themselves [114]. In the Breit frame, the low-energy expansion of the spin-averaged amplitude starts with where $\mathbf{n}_i \equiv \mathbf{q}_i/\omega$, $\mathbf{n}_f \equiv \mathbf{q}_f/\omega$ are the unit vectors in the direction of the initial and final photon momenta, respectively, and $z \equiv \mathbf{n}_f\cdot\mathbf{n}_i$ (in the Breit frame, the term linear in $\omega$ does not contribute to the spin average). Comparison with the well-known low-energy representation of the Compton scattering amplitude (see for instance (2.1) and (2.5) in [37]) shows that, up to normalization, the low-energy constants $c_1$ and $c_2$ represent the electric and magnetic polarizabilities, respectively: We emphasize that only the sum of the contributions from the Born terms and the polarizabilities manifests itself in Compton scattering. Both the choice of the reference frame and the choice of the Born terms are a matter of convention [38,40,102]. Accordingly, the literature contains several different variants of the above representation. In the context of the present paper, the relation (B.41) amounts to a definition of the polarizabilities.
Together with the Born terms, the two polarizabilities $\alpha_E$, $\beta_M$ determine the low-energy expansion not only of the spin average but of the entire amplitude $T^{\mu\nu}$ up to and including $O(q^2)$. In fact, if not only the photon energies and momenta but also the nucleon momenta are booked as small quantities of $O(q)$, this statement is valid in any reference frame: the leading terms can be written in a manifestly Lorentz invariant manner, The quantities $K_1^{\mu\nu}$, $K_2^{\mu\nu}$ represent the generalization of the conserved tensors specified in (A.4) to nonforward directions, $-\,g^{\mu\nu}(P\cdot q)^2 - P^\mu P^\nu\, q_f\cdot q_i$, (B.43) with $P = \frac{1}{2}(p_i + p_f)$. Conservation of energy and momentum implies that $P\cdot q_f$ coincides with $P\cdot q_i \equiv P\cdot q$. While the Breit frame formulae (B.38), (B.39) only contain terms of $O(q^2)$, the representation (B.42) of the contributions from the polarizabilities includes higher orders of the low-energy expansion.
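Current conservation of tensors of this type is easy to test numerically. The explicit forms used in the sketch below are an assumed reconstruction, chosen to be consistent with the visible tail of (B.43) and with the standard forward-direction tensors; they are not copied from the text: $K_1^{\mu\nu} = q_i^\mu q_f^\nu - g^{\mu\nu}\, q_f\cdot q_i$ and $K_2^{\mu\nu} = (P^\mu q_f^\nu + P^\nu q_i^\mu)\,P\cdot q - g^{\mu\nu}(P\cdot q)^2 - P^\mu P^\nu\, q_f\cdot q_i$. The check only relies on $P\cdot q_f = P\cdot q_i$; the momenta and function names are hypothetical.

```python
METRIC = (1.0, -1.0, -1.0, -1.0)   # diagonal of g^{mu nu}, signature (+, -, -, -)


def mdot(a, b):
    """Minkowski product of two four-vectors with upper indices."""
    return sum(g * x * y for g, x, y in zip(METRIC, a, b))


def metric(mu, nu):
    return METRIC[mu] if mu == nu else 0.0


def k1(q_i, q_f):
    """Assumed form: q_i^mu q_f^nu - g^{mu nu} q_f.q_i."""
    return [[q_i[mu] * q_f[nu] - metric(mu, nu) * mdot(q_f, q_i)
             for nu in range(4)] for mu in range(4)]


def k2(p_avg, q_i, q_f):
    """Assumed form: (P^mu q_f^nu + P^nu q_i^mu) P.q - g^{mu nu} (P.q)^2 - P^mu P^nu q_f.q_i."""
    pq = mdot(p_avg, q_i)
    return [[(p_avg[mu] * q_f[nu] + p_avg[nu] * q_i[mu]) * pq
             - metric(mu, nu) * pq ** 2
             - p_avg[mu] * p_avg[nu] * mdot(q_f, q_i)
             for nu in range(4)] for mu in range(4)]


def conservation_residuals(tensor, q_i, q_f):
    """Largest component of q_f_mu K^{mu nu} and of K^{mu nu} q_i_nu."""
    left = [sum(METRIC[mu] * q_f[mu] * tensor[mu][nu] for mu in range(4)) for nu in range(4)]
    right = [sum(tensor[mu][nu] * METRIC[nu] * q_i[nu] for nu in range(4)) for mu in range(4)]
    return max(abs(x) for x in left), max(abs(x) for x in right)


if __name__ == "__main__":
    p_avg = (0.94, 0.0, 0.0, 0.0)               # Breit frame: P has no space components
    q_i = (0.05, 0.03, 0.01, -0.02)
    q_f = (0.05, 0.0, 0.04, 0.01)               # equal photon energies, so P.q_f = P.q_i
    print(conservation_residuals(k1(q_i, q_f), q_i, q_f))
    print(conservation_residuals(k2(p_avg, q_i, q_f), q_i, q_f))
```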

B.8 Low-energy theorems for $T_1(\nu, q^2)$ and $T_2(\nu, q^2)$

In the forward direction, the spin average is a Lorentz invariant notion and only two of the four invariant variables are independent: the photon virtualities are the same, $q_i^2 = q_f^2 = q^2$, and the momentum transfer $t$ vanishes. The standard variable $\nu = (s - u)/4m$ coincides with the frequency in the Breit frame, $\omega = \nu$.
The spin average involves the two invariants $T_1$, $T_2$ defined in (A.3) and (A.5). In the notation of (B.27) these amplitudes are given by

$T_1 = I_2 + I_3 + I_4 + I_5$, $\quad q^2 T_1 + \nu^2 T_2 = I_1$. (B.44)

While for real Compton scattering only the invariants $I_1$, $I_2$ count, the amplitudes relevant for Compton scattering of virtual photons in the forward direction also pick up a contribution from $I_3$, $I_4$, $I_5$.
The comparison of the above expressions for the Born terms with the elastic part of the forward amplitudes in (15) shows that the singularities on the left and right hand sides of (B.44) are indeed the same, but in the case of $T_1$, the regular parts differ: The difference also shows up when evaluating the relation (B.38) in the forward direction, where it implies a low-energy theorem for $T_1$ [40,53,89,102,107] as well as one for $T_2$ [54], This demonstrates that, although the amplitudes $T_1(\nu, q^2)$, $T_2(\nu, q^2)$ do not determine the angular distribution of real Compton scattering, they do encode the polarizabilities. The calculation described in appendix B.3 removes the apparent contradiction: the amplitudes $I_3 - I_3^B$, $I_4 - I_4^B$, $I_5$ all disappear if the photon energy is set equal to zero. We repeat that the decomposition of the amplitude into a contribution generated by the elastic singularities and a remainder only becomes unique if the asymptotic behaviour is specified. The extra term in $T_1$ arises because the recipe used above to specify the Born terms implies that the amplitudes $I_n^B$ do not tend to zero when $\nu \to \infty$. In the above analysis of non-forward Compton scattering, the asymptotic behaviour does not play any role; accordingly, the regular parts of the Born terms used in that analysis are without physical significance. In contrast, the decomposition of the forward amplitudes into an elastic and an inelastic part set up in section 3 does invoke the asymptotic behaviour. It implies that the polarizabilities do pick up a contribution from the elastic singularities.

C Causality
The structure functions $V_1(\nu, q^2)$, $V_2(\nu, q^2)$ are experimentally accessible only in the space-like region, $q^2 \le 0$. As discussed in detail in [4,118], causality (the fact that the current commutator vanishes outside the light-cone) very strongly constrains their continuation into the time-like region. General properties of causal functions are described in [119,120] and explicit representations that manifestly incorporate causality [121-124] are available. These have been used, in particular, in the analysis of the structure functions at high energies [125]. The further representation constructed in appendix D shows that the contributions generated by the elastic intermediate states or individual resonances can be written in manifestly causal form.
In the present context, the key statement (footnote 13) is that the continuation is uniquely determined up to a polynomial in the variable $\nu$:

$V(\nu, q^2) = \epsilon(\nu) \sum_{n=0}^{N} \sigma_n(q^2)\,\nu^{2n}$, (C.47)

where the coefficients $\sigma_n(q^2)$ vanish for $q^2 \le 0$. In Regge language, such contributions represent fixed poles in the angular momentum plane, located at integer values of the angular momentum.
Regge asymptotics excludes fixed poles in $V_2(\nu, q^2)$, but a term with $n = 0$, that is a fixed pole with $J = 0$, is not a priori ruled out in $V_1(\nu, q^2)$:

$V_1^{\text{fp}}(\nu, q^2) = \epsilon(\nu)\,\sigma(q^2)$. (C.48)

A term of this form is also consistent with the short distance properties of QCD, which ensure that, in the Bjorken limit, where $\nu$ and $q^2$ both become large, the structure functions tend to zero. This merely imposes a constraint on the asymptotic behaviour, which in particular requires that $\sigma(s)$ disappears when $s$ becomes large. If a fixed pole were present in $V_1(\nu, q^2)$, it would not show up in the electron cross sections, but would affect the time-ordered amplitude, through the term

$T_1^{\text{fp}}(\nu, q^2) = \int_0^\infty ds\, \frac{\sigma(s)}{s - q^2 - i\epsilon}$. (C.49)

Accordingly, a formula that expresses the electromagnetic self-energy in terms of the electron cross sections could then not be given, nor would it be possible to express the polarizabilities of the nucleon in terms of these cross sections. The analysis of [4] is based on the assumption that the matrix element of the current commutator is free of fixed poles, so that the electron cross sections unambiguously determine the structure functions $V_1(\nu, q^2)$, $V_2(\nu, q^2)$, not only in the space-like region, but also for time-like momenta. The short distance properties of QCD ensure that there is then no ambiguity in $T_1(\nu, q^2)$, $T_2(\nu, q^2)$ either: the electron cross sections fully determine these. Accordingly, the electromagnetic self-energy as well as the polarizabilities of the nucleon are determined by these cross sections, at least in principle.

D A new causal representation
Consider the product of two retarded propagators

$t_{\text{ret}}(p, q) = \frac{1}{\mu_1^2 - (q^0 + i\epsilon)^2 + \mathbf{q}^2} \times \frac{1}{\mu_2^2 - (p^0 + q^0 + i\epsilon)^2 + (\mathbf{p} + \mathbf{q})^2}$. (D.50)

The Fourier transform of this amplitude,

$\tilde t_{\text{ret}}(p, x) = \int d^4q\, e^{-ix\cdot q}\, t_{\text{ret}}(p, q)$, (D.51)

is given by the convolution of the two propagators in coordinate space. Since these vanish outside the forward light-cone, the same is also true of the convolution: $\tilde t_{\text{ret}}(p, x)$ differs from zero only in the forward light-cone, $x^2 \ge 0$, $x^0 \ge 0$. This reflects the properties of the integrand in (D.51), which contains four poles

$t_{\text{ret}}(p, q) = \frac{1}{(q^0 - \omega_1 + i\epsilon)(q^0 + \omega_1 + i\epsilon)} \times \frac{1}{(q^0 + p^0 - \omega_2 + i\epsilon)(q^0 + p^0 + \omega_2 + i\epsilon)}$, $\quad \omega_1 = \sqrt{\mu_1^2 + \mathbf{q}^2}$, $\quad \omega_2 = \sqrt{\mu_2^2 + (\mathbf{q} + \mathbf{p})^2}$. (D.52)

All of these occur in the lower half of the $q^0$-plane. The path of integration can therefore be deformed into a segment from $-\infty$ to $-R$, a semi-circle of radius $R$ in the upper half-plane and a segment from $R$ to $+\infty$. If $x^0 < 0$, the factor $e^{-iq^0 x^0}$ suppresses the integrand if $R$ is taken large, except for the segments where the imaginary part of $q^0$ is not large. But there, $t_{\text{ret}}(p, q)$ is small, of order $1/R^4$. Since the length of these segments is of order $R$, their contributions also tend to zero if $R$ is taken large. Since the integral is path-independent, it vanishes for $x^0 < 0$. The quantity $\tilde t_{\text{ret}}(p, x)$ is Lorentz invariant. The time component of the vector $x$ depends on the frame chosen, but for any point outside the forward light-cone, there is a frame where the time component is negative. This confirms that $\tilde t_{\text{ret}}(p, x)$ vanishes outside the forward light-cone.
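The pole structure used in this argument can be verified numerically. The sketch below is purely illustrative: it evaluates the product of propagators once in the form (D.50) and once in the factorized form (D.52), for arbitrarily chosen (hypothetical) masses and momenta and a finite value of $\epsilon$, and lists the four pole positions in the complex $q^0$-plane. Function names are our own.

```python
import math


def energies(q_vec, p_vec, mu1, mu2):
    """omega_1 and omega_2 as defined below (D.52)."""
    w1 = math.sqrt(mu1 ** 2 + sum(x * x for x in q_vec))
    w2 = math.sqrt(mu2 ** 2 + sum((a + b) ** 2 for a, b in zip(q_vec, p_vec)))
    return w1, w2


def t_ret_product(q0, q_vec, p0, p_vec, mu1, mu2, eps):
    """Product of two retarded propagators, written as in (D.50)."""
    z = q0 + 1j * eps
    q_sq = sum(x * x for x in q_vec)
    pq_sq = sum((a + b) ** 2 for a, b in zip(q_vec, p_vec))
    return 1.0 / (mu1 ** 2 - z ** 2 + q_sq) / (mu2 ** 2 - (p0 + z) ** 2 + pq_sq)


def t_ret_factorized(q0, q_vec, p0, p_vec, mu1, mu2, eps):
    """Same function with the four poles displayed, as in (D.52)."""
    w1, w2 = energies(q_vec, p_vec, mu1, mu2)
    z = q0 + 1j * eps
    return 1.0 / ((z - w1) * (z + w1) * (z + p0 - w2) * (z + p0 + w2))


def pole_positions(q_vec, p0, p_vec, mu1, mu2, eps):
    """Locations of the poles in the complex q0-plane."""
    w1, w2 = energies(q_vec, p_vec, mu1, mu2)
    return [w1 - 1j * eps, -w1 - 1j * eps, -p0 + w2 - 1j * eps, -p0 - w2 - 1j * eps]


if __name__ == "__main__":
    args = ((0.3, -0.2, 0.1), 0.938, (0.0, 0.0, 0.0), 0.5, 1.0, 1e-3)
    for q0 in (0.7, -1.2, 0.3 + 2.0j):
        print(t_ret_product(q0, *args), t_ret_factorized(q0, *args))
    print(pole_positions(*args))   # all imaginary parts are negative
```

The two minus signs arising from the factorization of the individual denominators cancel in the product, which is why (D.52) carries no overall sign.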
The advanced version of the amplitude only differs in the sign of the $i\epsilon$ prescription. Like the advanced propagators in coordinate space, the Fourier transform of $t_{\text{adv}}(p, q)$ is different from zero only in the backward light-cone, $x^2 \ge 0$, $x^0 \le 0$. The difference between the two,

$v(p, q) = \frac{1}{2\pi i}\left(t_{\text{ret}}(p, q) - t_{\text{adv}}(p, q)\right)$, (D.53)

is therefore causal: the Fourier transform of $v(p, q)$ vanishes outside the light-cone. Since $t_{\text{adv}}(p, q)$ is the complex conjugate of $t_{\text{ret}}(p, q)$, the function $v(p, q)$ is real. For space-like momenta, $v(p, q)$ picks up a contribution only from the poles at $q^0 = -p^0 \pm \omega_2$:

$v(p, q) = \frac{1}{\mu_1^2 - q^2}\,\delta\!\left((p + q)^2 - \mu_2^2\right)$, $\quad q^2 \le 0$, $\nu \ge 0$.
(D.54) This demonstrates that any function which for space-like momenta can be represented as admits a causal continuation into the time-like region (we have replaced $\mu_1^2$ and $\mu_2^2$ by $a = \mu_1^2$, $b = \mu_2^2 - m^2$, respectively, and imposed the condition $b \ge 0$, which ensures that, for space-like momenta, the support of $V(\nu, q^2)$ is contained in the physical region, $Q^2 \le 2m|\nu|$). There is only one continuation that is free of fixed poles. The corresponding time-ordered amplitude is given by This representation, in particular, yields a causal description of the elastic contributions discussed in section 3. Since the form factors are analytic functions of $t$, the coefficients $v_1^{\text{el}}$, $v_2^{\text{el}}$ introduced in (10) do admit a representation of the form

standard Regge behaviour and hence necessarily involves a subtracted dispersion relation for $T_1$. Scaling violations concern the behaviour of the structure functions in the deep inelastic region, where $q^2$ becomes large. It is true that in [4], the scaling violations are ignored, but to claim that this is used in [4] as an argument to evade the subtraction function is plain wrong: a subtraction was made and the subtraction function was determined.