1 Introduction

The present measurements of the proton charge radius from muonic hydrogen spectroscopy [1, 2] differ by a puzzling \(7 \sigma \) from the radius value extracted from hydrogen spectroscopy [3] and from elastic electron–proton scattering data [4]. This huge discrepancy, which has become known as the “proton radius puzzle” (see Ref. [5] for a recent review), has led to intense theoretical and experimental activity in recent years. So far it has defied any explanation that would bring all three experimental techniques into agreement with each other. To shed further light on this puzzle, several new experiments involving muons are being planned. Their aim is to test lepton universality in the interaction of a lepton with a proton. One can compare the elastic scattering of electrons and muons on a proton target and measure the proton charge radius in muon–proton elastic scattering in a similar way as was done in electron–proton elastic scattering [4, 6]. Such an elastic scattering experiment is presently being planned by the MUSE collaboration [7]. Complementarily, one can also compare electron and muon pair photoproduction on the proton, as proposed in [8]. These experiments should be performed with an experimental accuracy at the \( 1~\% \) level or better in order to have an impact on the observed discrepancy in the proton charge radius. Such a level of precision calls for studies of the higher order corrections in these processes, as the corrections to cross sections suppressed by one power of the fine-structure constant \(\alpha = e^2/(4 \pi ) \approx 1/137\) are also in the 1 % to few % range. In particular, it requires studies of the two-photon exchange (TPE) correction to unpolarized lepton–proton elastic scattering for the case when the mass of the lepton cannot be neglected relative to its momentum.
In a previous work [9], we have performed an estimate of the leading TPE contribution from the proton intermediate state, and provided estimates for the MUSE experiment. In this work we account for the inelastic intermediate states, i.e. all possible intermediate states in the TPE box graphs beyond the proton state. As our aim is an estimate of such corrections for the MUSE experiment, which corresponds with very low momentum transfers, we will estimate the inelastic TPE corrections through the near forward doubly virtual Compton scattering process [10]. The essential hadronic information is contained in the unpolarized proton structure functions and in one subtraction function. We clarify the TPE correction coming from the subtraction function, and we provide an empirical determination of the subtraction function based on the high-energy behavior of the forward Compton amplitude \( \mathrm {T}_1 \).

The plan of the present paper is as follows. We review the elastic TPE contribution to the unpolarized lepton–proton scattering in Sect. 2, and derive a low momentum transfer expansion accounting for all terms due to the non-zero lepton mass. We subsequently discuss the forward unpolarized doubly virtual Compton scattering process which will serve as our starting point in the determination of the inelastic TPE corrections in Sect. 3. We evaluate the TPE correction due to the subtraction function in the forward Compton amplitude \( \mathrm {T}_1 \), and we provide an empirical determination of the subtraction function from data in Sect. 4. We provide the expressions for the inelastic TPE correction coming from the unpolarized proton structure functions and present the results of our numerical evaluation for the muon–proton elastic scattering in Sect. 5. Our conclusions are given in Sect. 6.

2 Elastic TPE contribution

We study the TPE correction at low momentum transfer to unpolarized elastic scattering of a charged lepton with mass m and initial (final) momentum k (\( k' \)) on a proton with mass M and initial (final) momentum p (\( p' \)); see Fig. 1 for the notation of kinematics and helicities. We consider the region of small squared momentum transfer \( Q^2 \ll M^2,~M E \), where E is the lepton beam energy (in the lab frame) and \( Q^2 = -\left( k-k'\right) ^2 \).

Fig. 1

TPE graph in elastic lepton–proton scattering

We define the TPE correction \( \delta _{2 \gamma } \) from the difference between the expected cross section \( \sigma ^{\mathrm {exp}} \) and the cross section in the one-photon exchange (OPE) approximation \( \sigma _{1 \gamma } \):

$$\begin{aligned} \sigma ^{\mathrm {exp}} \simeq \sigma _{1 \gamma } \left( 1 + \delta _{2 \gamma } \right) , \end{aligned}$$
(1)

corresponding with first order corrections in the fine structure constant \( \alpha = e^2 / \left( 4 \pi \right) \), with e the unit of electric charge.

In the low \( Q^2 \) region, the dominant contribution to the TPE graph of Fig. 1 results from the proton intermediate state (elastic contribution). We have fully calculated this contribution in a previous work [9]. In this section, we start by studying the quality of some approximate expressions for the elastic contribution in the low \(Q^2\) limit, for which analytical expressions can be provided.

The first TPE estimate is due to Feshbach and McKinley [11], who calculated the TPE contribution, corresponding with Coulomb photon couplings to the static proton (i.e. two \( \gamma ^0 \) vertices). This so-called Feshbach term contribution to \( \delta _{2 \gamma } \) in Eq. (1) is denoted by \( \delta _F \) and can be expressed through the scattering angle in the laboratory frame \( \theta \) and the lepton velocity v as

$$\begin{aligned} \delta _F = \pi \alpha v \frac{ \sin \theta /2 \left( 1 - \sin \theta /2 \right) }{1 - v^2 \sin ^2 \theta /2 }. \end{aligned}$$
(2)
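As a numerical illustration of Eq. (2), the following minimal Python sketch evaluates the Feshbach term for a given laboratory scattering angle and lepton velocity. The function names `feshbach_correction` and `lepton_velocity` and the numerical value of \(\alpha\) are assumptions of this sketch and are not taken from the calculation described here.

```python
import math

ALPHA = 1.0 / 137.035999  # fine-structure constant (approximate value)


def feshbach_correction(theta, v, alpha=ALPHA):
    """Feshbach term of Eq. (2).

    theta: lab scattering angle in radians; v: lepton velocity in units of c.
    """
    s = math.sin(theta / 2.0)
    return math.pi * alpha * v * s * (1.0 - s) / (1.0 - v**2 * s**2)


def lepton_velocity(p, m):
    """Velocity v = p / E of a lepton with momentum p and mass m (same units)."""
    return p / math.hypot(p, m)
```

For instance, `feshbach_correction(math.radians(60.0), lepton_velocity(0.15, 0.1056584))` gives the Feshbach term for a muon with an illustrative beam momentum of 0.15 GeV; the correction vanishes at \(\theta = 0\) and, for \(v < 1\), at \(\theta = \pi\).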

As a next step, one may consider the TPE correction in the scattering of two point-like Dirac particles (corresponding with two \( \gamma ^\mu \) couplings). In Appendix A we provide analytical expressions for this contribution in the limit of small \(Q^2 \ll M^2, M^2 (E^2 - m^2)/s\), with the squared center-of-mass energy \( s = M^2 + m^2 + 2 M E \), both for the case \( Q^2 \ll m^2 \), see Eqs. (A6)–(A8), and for the case when \( Q^2 \) and \( m^2 \) are of similar size, see Eq. (A9).

We show in Fig. 2 (left panel) the comparison between the Feshbach term, the TPE contribution for point-like Dirac particles, and the TPE for a point-like proton, with inclusion of the magnetic moment contribution. It is seen that the Feshbach correction of Eq. (2) with account of the recoil correction factor \( \left( 1 + m/M \right) \) describes the result for point-like Dirac particles quite well in the kinematics of the MUSE experiment.

We also show in Fig. 2 (right panel) the effect of the proton form factors (FFs), according to the full numerical calculation of Ref. [9]. In the low \(Q^2\) kinematics of the MUSE experiment, the inclusion of the FFs provides a reduction of the TPE by around \( 40~\% \) at \( Q^2 \approx 0.025 ~\mathrm {GeV}^2 \). Consequently, one should use the full numerical calculation of Ref. [9] (corresponding with the elastic TPE result in Fig. 2) in MUSE kinematics.

Fig. 2

Left panel: TPE correction for the case of a point-like proton, compared with the case when one neglects the magnetic moment (Dirac particle), as well as the Feshbach result (corresponding with Coulomb photon exchange). Right panel: TPE correction for the case of the proton with electric and magnetic form factors of the dipole form. We compare the box graph calculation with the Feshbach term corrected by the recoil correction \( 1 + m / M\)

3 Forward unpolarized doubly virtual Compton scattering tensor

The TPE contribution, \( \delta _{2 \gamma }\), for the muon–proton scattering process is in general given by the interference of the one-photon exchange amplitude (\( T^{\mathrm {OPE}} \)) and the two-photon exchange amplitude (\( T^{\mathrm {TPE}}\)) as

$$\begin{aligned} \delta _{2 \gamma } = \frac{2 \mathfrak {R}\left( \sum \nolimits _{\mathrm {spin}}T^{\mathrm {TPE}} \left( T^{\mathrm {OPE}} \right) ^* \right) }{\sum \nolimits _{\mathrm {spin}} |T^{\mathrm {OPE}}|^2}. \end{aligned}$$
(3)

The OPE expression in the denominator of Eq. (3) is given by

$$\begin{aligned} \sum \limits _{\mathrm {spin}} |T^{\mathrm {OPE}}|^2 = \frac{8 e^4}{\tau } \frac{1 - \varepsilon _0}{1 - \varepsilon } \left( \varepsilon G^2_\mathrm{E}(Q^2) + \tau G^2_\mathrm{M} (Q^2) \right) , \end{aligned}$$
(4)

where \(G_\mathrm{E}\) (\(G_\mathrm{M}\)) are the proton electric (magnetic) form factors, respectively, and with kinematical quantities \(\tau \) and \(\varepsilon _0\) defined as in Eq. (A3). The interference between the OPE and TPE amplitudes in the numerator of Eq. (3) can be expressed as

$$\begin{aligned}&2 \mathfrak {R}\left( \sum \limits _{\mathrm {spin}}T^{\mathrm {TPE}} \left( T^{\mathrm {OPE}} \right) ^* \right) = - \mathfrak {R}\frac{4 \pi e^4}{Q^2} \mathop {{\int }} \frac{ i \mathrm {d}^4 \tilde{q}}{\left( 2 \pi \right) ^4} \frac{L^{\mu \nu \alpha } H_{\mu \nu \alpha } }{\left( \tilde{q} - \frac{q}{2} \right) ^2 \left( \tilde{q} + \frac{q}{2} \right) ^2 } , \nonumber \\&L^{\mu \nu \alpha } = \mathrm {Tr} \left\{ \left( \gamma ^\mu \frac{\hat{K}-\hat{\tilde{q}}+m}{\left( K-\tilde{q}\right) ^2-m^2} \gamma ^\nu + \gamma ^\nu \frac{\hat{K}+\hat{\tilde{q}}+m}{\left( K+\tilde{q}\right) ^2-m^2} \gamma ^\mu \right) (\hat{k}+m ) \gamma ^\alpha (\hat{k^\prime }+m ) \right\} , \nonumber \\&H^{\mu \nu \alpha } = \mathrm {Tr} \left\{ M^{\mu \nu } \left( \hat{p}+M \right) \Gamma ^\alpha ( Q^2) (\hat{p^\prime }+M ) \right\} , \end{aligned}$$
(5)

with all momenta defined as in Fig. 1, where \( \tilde{q} \) is the loop-momentum over which one integrates, and where \(\Gamma ^\alpha \) denotes the on-shell proton electromagnetic vertex:

$$\begin{aligned} \Gamma ^\alpha ( Q^2 ) \;=\; F_\mathrm{D} (Q^2) \, \gamma ^\alpha \;+\; F_\mathrm{P}(Q^2) \, \frac{i \sigma ^{\alpha \beta } q_\beta }{2 M} \, , \end{aligned}$$
(6)

with \(F_\mathrm{D}\) (\(F_\mathrm{P}\)) the Dirac (Pauli) form factors of the proton, respectively. Furthermore in Eq. (5), \(M^{\mu \nu }\) denotes the proton doubly virtual Compton scattering tensor.

The main aim of the present work is to quantitatively estimate the inelastic TPE contribution in the low \( Q^2 \) region, corresponding with the MUSE kinematics. For this purpose, we will approximate the hadronic tensor \(M^{\mu \nu }\) in Eq. (5) by the forward doubly virtual Compton scattering (VVCS) tensor. The unpolarized forward VVCS process \( \gamma ^*(\tilde{q}) + p (P) \rightarrow \gamma ^*(\tilde{q}) + p (P) \) is described by two invariant amplitudes \( \mathrm {T}_1 \) and \( \mathrm {T}_2 \), which are defined in this work through the tensor decomposition:

$$\begin{aligned} M^{\mu \nu }= & {} - \left( -g^{\mu \nu }+\frac{\tilde{q}^{\mu } \tilde{q}^{\nu }}{\tilde{q}^2 }\right) \mathrm {T}_1 (\tilde{\nu }, \tilde{Q}^2 ) \nonumber \\&- \frac{1}{M^2} \left( P^{\mu }-\frac{ M \tilde{\nu } }{\tilde{q}^2 }\,\tilde{q}^{\mu }\right) \left( P^{\nu }-\frac{ M \tilde{\nu } }{\tilde{q}^2 }\, \tilde{q}^{\nu } \right) \mathrm {T}_2 (\tilde{\nu }, \tilde{Q}^2 ),\nonumber \\ \end{aligned}$$
(7)

with the photon energy \( \tilde{\nu } = \left( P \cdot \tilde{q} \right) /M \) and the squared photon virtuality \( \tilde{Q}^2 \equiv - \tilde{q}^2 \). The absorptive parts of the amplitudes \( \mathrm {T}_1 \) and \( \mathrm {T}_2 \) are related to the proton structure functions \( F_1 \) and \( F_2 \) by

$$\begin{aligned}&\mathfrak {I}\mathrm {T}_1(\tilde{\nu }, \tilde{Q}^2) = \frac{e^2}{4 M} F_1(\tilde{\nu }, \tilde{Q}^2), \nonumber \\&\quad \mathfrak {I}\mathrm {T}_2 (\tilde{\nu }, \tilde{Q}^2) = \frac{e^2}{4 \tilde{\nu }} F_2(\tilde{\nu }, \tilde{Q}^2). \end{aligned}$$
(8)

In this work, we will approximate the unpolarized tensor \(M^{\mu \nu }\) entering Eq. (5) for the process \(\gamma ^*(q_1=\tilde{q} + q/2) + p (P - q/2) \rightarrow \gamma ^*(q_2 = \tilde{q} - q/2) + p (P + q/2)\) in the low momentum transfer limit \(q \rightarrow 0\), i.e. \(q_1 \approx q_2\), by [10]:

$$\begin{aligned} M^{\mu \nu }\approx & {} - \left( -g^{\mu \nu }+\frac{q_1^{\mu } q_2^{\nu }}{q_1 \cdot q_2 }\right) T_1\left( \tilde{\nu }, - q_1 \cdot q_2 \right) \nonumber \\&- \frac{1}{M^2} \left( P^{\mu }-\frac{ M \tilde{\nu } }{q_1 \cdot q_2 }\, q_1^{\mu }\right) \left( P^{\nu }-\frac{ M \tilde{\nu } }{q_1 \cdot q_2 }\, q_2^{\nu } \right) \nonumber \\&\times \,T_2 \left( \tilde{\nu }, - q_1 \cdot q_2\right) . \end{aligned}$$
(9)

By using the electromagnetic gauge invariance of the lepton tensor,

$$\begin{aligned} q_1^\nu L_{\mu \nu \alpha } = 0, \quad q_2^\mu L_{\mu \nu \alpha } = 0, \end{aligned}$$
(10)

the hadronic tensor of Eq. (9) can be expressed equivalently as

$$\begin{aligned} M^{\mu \nu }\approx & {} - \left( -g^{\mu \nu }+\frac{q^{\mu } q^{\nu }}{\tilde{Q}^2 - \frac{Q^2}{4} }\right) T_1\left( \tilde{\nu }, \tilde{Q}^2 - \frac{Q^2}{4}\right) \nonumber \\&- \frac{1}{M^2} \left( P^{\mu }+\frac{ M \tilde{\nu } }{\tilde{Q}^2 - \frac{Q^2}{4} }\,q^{\mu }\right) \left( P^{\nu }-\frac{ M \tilde{\nu } }{\tilde{Q}^2 - \frac{Q^2}{4} }\, q^{\nu } \right) \nonumber \\&{\times }\, T_2\left( \tilde{\nu }, \tilde{Q}^2 - \frac{Q^2}{4}\right) , \end{aligned}$$
(11)

which is the tensor form which we will use in our evaluations of \(\delta _{2 \gamma }\).

The real part of the amplitude \( \mathrm {T}_1 \) can be expressed through a subtracted dispersion relation (DR) as an integral over the squared invariant mass \( W^2\) of the intermediate hadronic state:

$$\begin{aligned}&\mathfrak {R}\mathrm {T}_1 (\tilde{\nu }, \tilde{Q}^2) \nonumber \\&\quad = \mathfrak {R}\mathrm {T}^{\mathrm {Born}}_1 ( \tilde{\nu }, \tilde{Q}^2 ) + \mathrm {T}^{\mathrm {subt}}_1 (0, \tilde{Q}^2) + \frac{2 \tilde{\nu }^2 }{\pi } \nonumber \\&\qquad \times \mathop {{\int }} \limits ^{~~ \infty }_{W^2_{\mathrm {thr}}} \frac{ e^2 M F_1\left( (W^2 - P^2 + \tilde{Q}^2)/(2 M),\tilde{Q}^2\right) \mathrm {d} W^2}{\left( W^2 - P^2 + \tilde{Q}^2 \right) \left( \left( P + \tilde{q} \right) ^2 - W^2 + i \varepsilon \right) \left( \left( P - \tilde{q} \right) ^2 - W^2 + i \varepsilon \right) },\nonumber \\ \end{aligned}$$
(12)

with the pion–proton inelastic threshold: \( W^2_{\mathrm {thr}} = \left( M + m_{\pi } \right) ^2 \approx 1.15 ~\mathrm {GeV}^2 \), where \( m_{\pi } \) denotes the pion mass, and where \( \mathrm {T}^{\mathrm {subt}}_1 (0, \tilde{Q}^2 ) \) is the subtraction function at \( \tilde{\nu } = 0\). The real part of the amplitude \( \mathrm {T}_2 \) can be obtained from an unsubtracted DR:

$$\begin{aligned}&\mathfrak {R}\mathrm {T}_2 (\tilde{\nu }, \tilde{Q}^2) = \mathfrak {R}\mathrm {T}^{\mathrm {Born}}_2 ( \tilde{\nu }, \tilde{Q}^2 ) + \frac{1 }{\pi }\nonumber \\&\quad \times \mathop {{\int }} \limits ^{~~ \infty }_{W^2_{\mathrm {thr}}} \frac{ e^2 M F_2 \left( (W^2 - P^2 + \tilde{Q}^2)/(2M), \tilde{Q}^2\right) \mathrm {d} W^2}{\left( \left( P + \tilde{q} \right) ^2 - W^2 + i \varepsilon \right) \left( \left( P - \tilde{q} \right) ^2 - W^2 + i \varepsilon \right) },\nonumber \\ \end{aligned}$$
(13)

In Eqs. (12) and (13), the Born contributions to the unpolarized Compton amplitudes \( \mathrm {T}^{\mathrm {Born}}_1 \) and \( \mathrm {T}^{\mathrm {Born}}_2 \), due to the proton intermediate state, are given by

$$\begin{aligned} \mathfrak {R}\mathrm {T}_1^{\mathrm {Born}} ( \tilde{\nu }, \tilde{Q}^2 )= & {} \frac{\alpha }{M} \left( \frac{\tilde{Q}^4 G^2_\mathrm{M}(\tilde{Q}^2)}{ \tilde{Q}^4 - 4 M^2 \tilde{\nu }^2 } -F_\mathrm{D}^2 (\tilde{Q}^2) \right) , \end{aligned}$$
(14)
$$\begin{aligned} \mathfrak {R}\mathrm {T}_2^{\mathrm {Born}} ( \tilde{\nu }, \tilde{Q}^2 )= & {} 4 M \alpha \tilde{Q}^2 \frac{ F_\mathrm{D}^2(\tilde{Q}^2) + \frac{\tilde{Q}^2}{4 M^2} F^2_\mathrm{P} (\tilde{Q}^2) }{ \tilde{Q}^4 - 4 M^2 \tilde{\nu }^2}. \end{aligned}$$
(15)
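The Born amplitudes of Eqs. (14) and (15) are straightforward to evaluate once the form factors are specified. The following Python sketch does so under the assumption of dipole Sachs form factors with the standard dipole parameter \(0.71~\mathrm {GeV}^2\); this form-factor model, the function names, and the numerical constants are choices made for this illustration only. The expressions are real away from the nucleon pole at \( \tilde{Q}^4 = 4 M^2 \tilde{\nu }^2 \), which the sketch does not treat.

```python
ALPHA = 1.0 / 137.035999   # fine-structure constant (approximate)
M_P = 0.938272             # proton mass in GeV
MU_P = 2.792847            # proton magnetic moment in nuclear magnetons


def dipole_form_factors(Q2, lam2=0.71):
    """Assumed dipole model: returns (G_E, G_M, F_D, F_P) at Q2 in GeV^2."""
    g_d = (1.0 + Q2 / lam2) ** -2
    g_e, g_m = g_d, MU_P * g_d
    tau = Q2 / (4.0 * M_P**2)
    f_d = (g_e + tau * g_m) / (1.0 + tau)   # Dirac form factor
    f_p = (g_m - g_e) / (1.0 + tau)         # Pauli form factor
    return g_e, g_m, f_d, f_p


def t1_pole(nu, Q2):
    """Nucleon pole term: the first term of Eq. (14)."""
    _, g_m, _, _ = dipole_form_factors(Q2)
    return ALPHA / M_P * Q2**2 * g_m**2 / (Q2**2 - 4.0 * M_P**2 * nu**2)


def t1_born(nu, Q2):
    """Re T_1^Born of Eq. (14), valid away from the pole."""
    _, _, f_d, _ = dipole_form_factors(Q2)
    return t1_pole(nu, Q2) - ALPHA / M_P * f_d**2


def t2_born(nu, Q2):
    """Re T_2^Born of Eq. (15), valid away from the pole."""
    _, _, f_d, f_p = dipole_form_factors(Q2)
    numerator = f_d**2 + Q2 / (4.0 * M_P**2) * f_p**2
    return 4.0 * M_P * ALPHA * Q2 * numerator / (Q2**2 - 4.0 * M_P**2 * nu**2)
```

By construction, `t1_born(nu, Q2) - t1_pole(nu, Q2)` equals \(-\alpha F_\mathrm{D}^2(\tilde{Q}^2)/M\) and does not depend on \(\tilde{\nu }\).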

Note that in the derivation of a DR as given e.g. in Eq. (12) the elastic (nucleon pole) term contribution, given by only the first term of Eq. (14), correctly appears. This pole contribution differs from the Born term by

$$\begin{aligned} \mathrm {T}_1^{\mathrm {Born}} ( \tilde{\nu }, \tilde{Q}^2 ) - \mathrm {T}_1^{\mathrm {pole}} ( \tilde{\nu }, \tilde{Q}^2 ) = - \frac{\alpha }{M} F^2_D ( \tilde{Q}^2 ). \end{aligned}$$
(16)

As this is an energy (\( \tilde{\nu } \)) independent function, we have absorbed it in the definition of \( \mathrm {T}^{\mathrm {subt}}_1 ( 0, \tilde{Q}^2 ) \), which in Eq. (12) is defined as

$$\begin{aligned} \mathrm {T}^{\mathrm {subt}}_1 (0, \tilde{Q}^2 ) \equiv \mathrm {T}_1 (0, \tilde{Q}^2) - \mathrm {T}^{\mathrm {Born}}_1 (0, \tilde{Q}^2) \equiv \tilde{Q}^2 \beta (\tilde{Q}^2).\nonumber \\ \end{aligned}$$
(17)

The advantage of expressing the amplitude \( \mathrm {T}_1 \) relative to its Born contribution is that the non-Born amplitude in Eq. (17) starts at order \(\tilde{Q}^2\) and is usually parametrized in terms of polarizabilities: the function \( \beta (\tilde{Q}^2) \) at \( \tilde{Q}^2 = 0 \) is given by the magnetic polarizability \( \beta _M \), i.e. \( \beta ( 0 ) = \beta _M \) [13, 21].

In order to evaluate the inelastic TPE contribution using the forward non-Born VVCS amplitudes, we will need the information on the proton structure functions \( F_1 \) and \( F_2 \) as well as to specify the subtraction function in Eq. (17), which is parametrized through the function \( \beta ( \tilde{Q}^2 ) \). We will evaluate the subtraction function contribution in Sect. 4, and the dispersive contribution due to the structure functions \( F_1 \) and \( F_2 \) in Sect. 5.

4 Subtraction function contribution to TPE correction

In the present section we will discuss the TPE correction due to the subtraction function \(\mathrm {T}^{\mathrm {subt}}_1 (0, \tilde{Q}^2 )\). For this purpose, we will compare three different estimates for \( \beta ( \tilde{Q}^2 ) \) defined through Eq. (17). At low \( Q^2 \) we will use existing estimates from heavy-baryon and baryon chiral perturbation theory. Furthermore, we will provide an empirical determination of \( \beta ( \tilde{Q}^2 ) \), based on the high-energy behavior of the Compton amplitude.

4.1 Heavy-baryon ChPT subtraction function

First of all, we show the fit of Ref. [13] obtained by matching the heavy-baryon chiral perturbation theory (HBChPT) result to a dipole behavior:

$$\begin{aligned} \beta (\tilde{Q}^2 ) = \frac{\beta _M}{\left( 1 + \tilde{Q}^2 / \Lambda ^2 \right) ^2}, \qquad \Lambda = 530 - 842 ~\mathrm {MeV}, \end{aligned}$$
(18)

with the value of the magnetic polarizability \( \beta _M = (2.5 \pm 0.4) \times 10^{-4} ~\mathrm {fm^3}\) taken from PDG [14]. For the purpose of showing error bands in our numerical estimates, we choose the lower and upper edges of such bands to correspond with the values: \( \Lambda = 530 ~\mathrm {MeV},~\beta _M = 2.1 \times 10^{-4} ~\mathrm {fm^3} \) and \( \Lambda = 842 ~\mathrm {MeV},~\beta _M = 2.9 \times 10^{-4} ~\mathrm {fm^3} \), respectively. The resulting bands for \( \mathrm {T}^{\mathrm {subt}}_1 (0, \tilde{Q}^2 ) \) are shown in Fig. 3, and correspondingly for \( \beta (\tilde{Q}^2) \) in Fig. 4 (blue bands).
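The dipole form of Eq. (18) and the corresponding subtraction function of Eq. (17) can be sketched in a few lines of Python. The conversion from \(\mathrm {fm}^3\) to \(\mathrm {GeV}^{-3}\) via \(\hbar c\), the function names, and the central value \(\Lambda = 686~\mathrm {MeV}\) (simply the midpoint of the quoted range) are assumptions of this illustration; only the two band edges are taken from the text.

```python
HBARC = 0.1973269804  # hbar * c in GeV fm, used to convert fm^3 to GeV^-3


def beta_dipole(Q2, beta_m_fm3=2.5e-4, lam=0.686):
    """Eq. (18): beta(Q2) in GeV^-3, for Q2 in GeV^2, lam in GeV, beta_M in fm^3."""
    beta_m = beta_m_fm3 / HBARC**3
    return beta_m / (1.0 + Q2 / lam**2) ** 2


def t1_subt(Q2, **kwargs):
    """Eq. (17): subtraction function Q2 * beta(Q2), in GeV^-1."""
    return Q2 * beta_dipole(Q2, **kwargs)


# Parameter sets for the lower and upper edges of the error band
LOWER_EDGE = dict(beta_m_fm3=2.1e-4, lam=0.530)
UPPER_EDGE = dict(beta_m_fm3=2.9e-4, lam=0.842)
```

Evaluating `t1_subt(Q2, **LOWER_EDGE)` and `t1_subt(Q2, **UPPER_EDGE)` on a grid of \(\tilde{Q}^2\) values traces out a band of the kind described above.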

Fig. 3

The empirical subtraction function of Eq. (26) in comparison with the subtraction functions from HBChPT of Birse et al. [13], and from BChPT [15]

Fig. 4

The empirical estimate for the magnetic polarizability \( \beta (\tilde{Q}^2) \), based on Eqs. (17) and (26) compared with the HBChPT result of Birse et al. [13] normalized to the PDG value \( \beta (0) = \left( 2.5 \pm 0.4 \right) \times 10^{-4} ~\mathrm {fm^3}\) [14], and with the BChPT result [15]

4.2 Baryon ChPT subtraction function

Second, we also show the prediction for \(\beta (\tilde{Q}^2)\) resulting from the covariant baryon chiral perturbation theory (BChPT) [15], with \(\beta \) decomposed as

$$\begin{aligned} \beta (\tilde{Q}^2 ) = \beta _{\pi N} (\tilde{Q}^2 ) + \beta _{\Delta } (\tilde{Q}^2 ) + \beta _{\pi \Delta } (\tilde{Q}^2 ), \end{aligned}$$
(19)

with \( \beta _{\pi N} (\tilde{Q}^2 ) \) the \( \mathcal {O} ( p^3 ) \) diamagnetic polarizability contribution from \( \pi N \) loops given by Eq. (22) of Ref. [15], \( \beta _{\Delta } (\tilde{Q}^2 ) \) the paramagnetic contribution of the \( \Delta \)-resonance to the magnetic polarizability [16] and \( \beta _{\pi \Delta } (\tilde{Q}^2 ) \) the \( \mathcal {O} ( p^{7/2} ) \) at \( p \simeq m_\pi \) diamagnetic polarizability contribution from \( \pi \Delta \) loops [16].

In Fig. 3, we compare the heavy-baryon and baryon ChPT predictions for \( \mathrm {T}^{\mathrm {subt}}_1 (0, \tilde{Q}^2 ) \). Notice that the HBChPT value of \( \beta (0) \) is taken from a fit to data (PDG 2014) whereas the baryon ChPT value of \( \beta (0) \) results from the sum of the positive paramagnetic part due to the s-channel \( \Delta \)-excitation \( \beta _{\Delta } (0) \simeq 7 \times 10^{-4}~\mathrm {fm}^3 \), and the negative diamagnetic part due to \( \pi N \) and \( \pi \Delta \) loops, i.e. \( \beta _{\pi N} (0) = - 2 \times 10^{-4}~\mathrm {fm}^3 \) and \( \beta _{\pi \Delta } (0) = - 1.2 \times 10^{-4}~\mathrm {fm}^3 \).

4.3 Empirical determination of the subtraction function

In this section, we discuss an empirical estimate of the function \( \beta ( \tilde{Q}^2 ) \) at non-zero \( \tilde{Q}^2 \) from experimental information on inelastic electron–proton scattering. Following the idea of Refs. [17, 18], the subtraction function can be obtained from an unsubtracted dispersion relation for the amplitude \( \mathrm {T}_1 ( \tilde{\nu }, \tilde{Q}^2 ) - \mathrm {T}_1^\mathrm {R} ( \tilde{\nu }, \tilde{Q}^2 ) \), where \(\mathrm {T}_1^\mathrm {R}\) denotes a Regge amplitude which is chosen such as to match the high-energy behavior of the amplitude \( \mathrm {T}_1\), i.e. \(\mathrm {T}_1 - \mathrm {T}_1^\mathrm {R} \rightarrow 0\) for \(\tilde{\nu }\rightarrow \infty \) (see footnote 1). The function \(\mathrm {T}_1^\mathrm {R}\) is chosen as a sum over the leading Regge trajectories:

$$\begin{aligned}&\mathrm {T}_1^{\mathrm {R}} ( \tilde{\nu }, \tilde{Q}^2 ) \nonumber \\&\quad \equiv - \frac{\pi \alpha }{M} \sum _{\alpha _0 > 0} \frac{\gamma _{\alpha _0} (\tilde{Q}^2 )}{\sin \pi \alpha _0}\nonumber \\&\quad \quad \times \left\{ \left( \tilde{\nu }_0 - \tilde{\nu } - i \varepsilon \right) ^{\alpha _0} + \left( \tilde{\nu }_0 + \tilde{\nu } - i \varepsilon \right) ^{\alpha _0} \right\} \nonumber \\&\quad \quad - \frac{\pi \alpha }{M} \sum _{{\alpha _0} > 1} \frac{ {\alpha _0} \tilde{\nu }_0 \gamma _{\alpha _0} (\tilde{Q}^2 )}{\sin \pi \left( {\alpha _0} - 1 \right) }\nonumber \\&\quad \quad \times \left\{ \left( \tilde{\nu }_0 - \tilde{\nu } - i \varepsilon \right) ^{{\alpha _0} - 1} + \left( \tilde{\nu }_0 + \tilde{\nu } - i \varepsilon \right) ^{{\alpha _0} - 1} \right\} , \end{aligned}$$
(20)

where \( {\alpha _0} > 0 \) denotes the Regge intercept, \( \tilde{\nu }_0 \) is a reference hadronic scale which is treated as a free parameter, and \( \gamma _{\alpha _0} (\tilde{Q}^2 ) \) are the Regge residues. Using Eq. (8), the imaginary part of \( \mathrm {T}^\mathrm {R}_1 \) yields the corresponding Regge structure function:

$$\begin{aligned} F_1^\mathrm{R} ( \tilde{\nu }, \tilde{Q}^2 )\equiv & {} \frac{M}{\pi \alpha } \mathfrak {I}\mathrm {T}^{R}_1 ( \tilde{\nu }, \tilde{Q}^2 ) \nonumber \\= & {} \sum _{{\alpha _0} > 0} \gamma _{\alpha _0} (\tilde{Q}^2 ) \left( \tilde{\nu } - \tilde{\nu }_0 \right) ^{{\alpha _0}} \Theta \left( \tilde{\nu } - \tilde{\nu }_0 \right) \nonumber \\&+ \sum _{{\alpha _0} > 1} \gamma _{\alpha _0} (\tilde{Q}^2 ) {\alpha _0} \tilde{\nu }_0 \left( \tilde{\nu } - \tilde{\nu }_0 \right) ^{{\alpha _0}-1} \Theta \left( \tilde{\nu } - \tilde{\nu }_0 \right) .\nonumber \\ \end{aligned}$$
(21)

The Regge residues \( \gamma _{\alpha _0} (\tilde{Q}^2 ) \) can be obtained by performing a fit to inclusive electroproduction data on a proton. In our work we use the Donnachie–Landshoff (DL) high-energy fit [19] to obtain the proton structure function \(F_1\) as

$$\begin{aligned} F_1 ( \tilde{\nu }, \tilde{Q}^2 ) \underset{ \tilde{\nu }\gg }{ \longrightarrow } \sum \limits _{\alpha _0 > 0} \gamma _{\alpha _0} ( \tilde{Q}^2 ) \tilde{\nu } ^{\alpha _0}, \end{aligned}$$
(22)

where the values of the Regge intercepts \( \alpha _0 \) and the residue functions \( \gamma _{\alpha _0} (\tilde{Q}^2 ) \) are detailed in Appendix B.

By comparing Eqs. (21) and (22) we notice that the second term in Eq. (21) is chosen such that for the Regge trajectory with \( 1 < \alpha _0 < 2 \) (“Pomeron”):

$$\begin{aligned} F_1 (\tilde{\nu }, \tilde{Q}^2 ) - F^\mathrm{R}_1 (\tilde{\nu }, \tilde{Q}^2 ) \underset{\tilde{\nu }\gg }{ \sim } \tilde{\nu } ^{\alpha _0-2}, \end{aligned}$$
(23)

whereas for the Regge trajectory with \( 0 < \alpha _0 < 1 \) (Reggeon):

$$\begin{aligned} F_1 (\tilde{\nu }, \tilde{Q}^2 ) - F^\mathrm{R}_1 (\tilde{\nu }, \tilde{Q}^2 ) \underset{\tilde{\nu }\gg }{ \sim } \tilde{\nu } ^{\alpha _0-1}. \end{aligned}$$
(24)

This ensures that in all cases the quantity \([ F_1 ( \tilde{\nu }, \tilde{Q}^2 ) - F_1^\mathrm{R} ( \tilde{\nu }, \tilde{Q}^2 ) ] \rightarrow 0\) when \(\tilde{\nu }\rightarrow \infty \).
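The asymptotic behavior stated in Eqs. (23) and (24) can be checked numerically for a single trajectory. The Python sketch below compares \(F_1 - F_1^\mathrm{R}\) with the leading term obtained by expanding Eq. (21) in powers of \(\tilde{\nu }_0/\tilde{\nu }\). The function names, the unit residue, and the sample intercepts used in the usage note are illustrative assumptions and are not the fit values of Appendix B.

```python
def f1_power(nu, alpha0, gamma=1.0):
    """High-energy form of Eq. (22) for one trajectory: gamma * nu**alpha0."""
    return gamma * nu**alpha0


def f1_regge(nu, nu0, alpha0, gamma=1.0):
    """Eq. (21) for one trajectory; the second term is present only for alpha0 > 1."""
    if nu <= nu0:
        return 0.0  # step function Theta(nu - nu0)
    value = gamma * (nu - nu0) ** alpha0
    if alpha0 > 1.0:
        value += gamma * alpha0 * nu0 * (nu - nu0) ** (alpha0 - 1.0)
    return value


def leading_difference(nu, nu0, alpha0, gamma=1.0):
    """Leading large-nu term of F_1 - F_1^R from a Taylor expansion in nu0 / nu."""
    if alpha0 > 1.0:
        # Pomeron-like case, Eq. (23): falls off as nu**(alpha0 - 2)
        return gamma * 0.5 * alpha0 * (alpha0 - 1.0) * nu0**2 * nu ** (alpha0 - 2.0)
    # Reggeon-like case, Eq. (24): falls off as nu**(alpha0 - 1)
    return gamma * alpha0 * nu0 * nu ** (alpha0 - 1.0)
```

For example, with \(\tilde{\nu }_0 = 1\) and \(\tilde{\nu } = 10^3\) (arbitrary units), the ratio `(f1_power(nu, a) - f1_regge(nu, nu0, a)) / leading_difference(nu, nu0, a)` is close to one for both `a = 1.08` and `a = 0.55`, so the difference indeed decreases with the powers quoted in Eqs. (23) and (24).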

Consequently, one can write down an unsubtracted dispersion relation for \(\mathrm {T}_1 - \mathrm {T}_1^\mathrm {R}\) at fixed \( \tilde{Q}^2\) as:

$$\begin{aligned}&\mathrm {T}_1 ( \tilde{\nu }, \tilde{Q}^2 ) - \mathrm {T}_1^\mathrm {R} ( \tilde{\nu }, \tilde{Q}^2 ) = \mathrm {T}_1^\mathrm {pole} ( \tilde{\nu }, \tilde{Q}^2 ) \nonumber \\&\quad + \frac{2}{\pi }\mathop {{\int }} \mathrm {d} \nu ' \frac{\nu ' \mathfrak {I}\left[ \mathrm {T}_1 ( \nu ', \tilde{Q}^2 ) - \mathrm {T}_1^\mathrm {R} ( \nu ', \tilde{Q}^2 ) \right] }{\nu '^2 - \tilde{\nu }^2} . \end{aligned}$$
(25)

Using Eqs. (8), (16), and (17), this yields an expression for \( \mathrm {T}^{\mathrm {subt}}_1 ( 0, \tilde{Q}^2 ) \) which, expressed in terms of \( W^2 \equiv 2 M \nu ^\prime + M^2 - \tilde{Q}^2 \), is given by

$$\begin{aligned}&\mathrm {T}^{\mathrm {subt}}_1 ( 0, \tilde{Q}^2 ) = \mathrm {T}_1^\mathrm {R} ( 0, \tilde{Q}^2 ) + \frac{\alpha }{M} F^2_D ( \tilde{Q}^2 ) + \frac{2 \alpha }{ M} \nonumber \\&\quad \times \mathop {{\int }} \limits ^{~~ \infty }_{s_{\mathrm {thr}}} \frac{ F_1\left( (W^2 - M^2 + \tilde{Q}^2)/(2 M),\tilde{Q}^2 \right) - F^\mathrm{R}_1\left( (W^2 - M^2 + \tilde{Q}^2)/(2 M),\tilde{Q}^2 \right) }{ W^2 - M^2 + \tilde{Q}^2} \mathrm {d} W^2, \nonumber \\ \end{aligned}$$
(26)

where the lower integration limit in Eq. (26) \(s_{\mathrm {thr}}\) is given by

$$\begin{aligned} s_{\mathrm {thr}} = \mathrm {\mathrm {min}} \left( s_0 \equiv 2 M \tilde{\nu }_0 + M^2 - \tilde{Q}^2,~ W^2_{\mathrm {thr}} = ( M + m_\pi )^2 \right) ,\nonumber \\ \end{aligned}$$
(27)

corresponding with a branch cut of \(F_1\) starting at \(W_{\mathrm {thr}}^2\) and a branch cut of \(F_1^\mathrm{R}\) starting at \(s_0\). Eq. (26) allows one to quantitatively estimate the subtraction function given the structure function \( F_1 \), the Regge fit determining \( F^\mathrm{R}_1 \) of the form of Eq. (21), as well as the corresponding value of \(\mathrm {T}^\mathrm {R}_1 ( 0, \tilde{Q}^2 ) \), which follows from Eq. (20) as

$$\begin{aligned} \mathrm {T}^\mathrm {R}_1 (0, \tilde{Q}^2 )= & {} - \frac{2 \pi \alpha }{M} \sum \limits _{\alpha _0 > 0} \frac{\gamma _{\alpha _0} ( \tilde{Q}^2 )}{\sin \pi \alpha _0} \tilde{\nu }_0^{\alpha _0} \nonumber \\&- \frac{2 \pi \alpha }{M} \sum \limits _{\alpha _0 > 1} \frac{ \alpha _0 \tilde{\nu }_0 \gamma _{\alpha _0} ( \tilde{Q}^2 )}{\sin \pi \left( \alpha _0 - 1 \right) } \tilde{\nu }_0^{\alpha _0 - 1 }, \end{aligned}$$
(28)

and is also fully determined by the Regge fit.
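Equation (28) can be evaluated directly once the intercepts and residues at a given \(\tilde{Q}^2\) are known. The following Python sketch assumes that they are supplied as a list of `(alpha0, gamma)` pairs; this data layout, the function name, and the numerical constants are assumptions of the illustration, and the actual fit values are those of Appendix B.

```python
import math

ALPHA = 1.0 / 137.035999   # fine-structure constant (approximate)
M_P = 0.938272             # proton mass in GeV


def t1_regge_at_zero(trajectories, nu0, alpha=ALPHA, mass=M_P):
    """Eq. (28): T_1^R(0, Q2) for a list of (alpha0, gamma) pairs at fixed Q2."""
    total = 0.0
    for alpha0, gamma in trajectories:
        # First sum: all trajectories with alpha0 > 0
        total += gamma / math.sin(math.pi * alpha0) * nu0**alpha0
        if alpha0 > 1.0:
            # Second sum: only trajectories with alpha0 > 1
            total += (alpha0 * nu0 * gamma / math.sin(math.pi * (alpha0 - 1.0))
                      * nu0 ** (alpha0 - 1.0))
    return -2.0 * math.pi * alpha / mass * total
```

Non-integer intercepts are assumed throughout, since the sine factors in the denominators vanish for integer \(\alpha _0\).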

In our numerical evaluation of Eq. (26), we describe the proton structure function \( F_1 \) in the resonance region by the fit performed by Christy and Bosted (BC) [20]. This fit is valid in the following region of kinematic variables: \( 0 < \tilde{Q}^2 < 8 ~\mathrm {GeV}^2 \), and \(W^2 < 9.61 ~\mathrm {GeV}^2 \approx 10 ~\mathrm {GeV}^2 \). For the dispersion integral in Eq. (26) we connect the BC fit with the DL high-energy fit starting from \( W^2 = 10 ~\mathrm {GeV}^2 \). The latter fit is described in Appendix B. The resulting proton structure function \( F_1 \) is shown in Fig. 5 as it enters the integral of Eq. (26). We add a \( 3~\% \) error band to the BC fit [20] and use the same error estimate for all Regge pole residues. We note that at low values of \(\tilde{Q}^2\), the two fits either overlap or are very close around the matching point \(W^2 \approx 10\) GeV\(^2\). With increasing values of \(\tilde{Q}^2\) there is a slight mismatch between the two fits around \(W^2 = 10\) GeV\(^2\), which is due to the fact that the BC fit has not accounted for the HERA high-energy data, and the DL fit has not accounted for the lower W data. A combined fit of all data would be very worthwhile, and a smooth interpolation between the BC and DL fits could easily be performed. For our purpose, however, we only need data at lower values of \(\tilde{Q}^2\), up to about 1 GeV\(^2\). We can therefore simply split the \(W^2\) integral entering Eq. (26) into a region \(W^2 < 10\) GeV\(^2\), where we use the BC fit, and a region \(W^2 > 10\) GeV\(^2\), where we use the DL fit.

Fig. 5

Fits of the proton structure function \( F_1 \) used in our estimates entering the dispersion integral in Eq. (26)

In Fig. 6, we demonstrate explicitly the vanishing high-energy behavior of the quantity \(F_1 - F_1^\mathrm{R}\), which is the necessary condition for the unsubtracted DR of Eq. (26) to hold.

Fig. 6

High-energy behavior of the function \( F_1 - F_1^\mathrm{R} \) for the fixed value \( s_0 = 1 ~\mathrm {GeV}^2\)

We furthermore provide another consistency check of our numerical implementation. As the Regge function \(\mathrm {T}_1^\mathrm {R}\) of Eq. (20) has an arbitrary scale \(\tilde{\nu }_0\) (or equivalently \(s_0\)), the total result should not depend on the specific choice of this parameter. We demonstrate this in Fig. 7, where we illustrate how the \(s_0\) dependence of the individual contributions in Eq. (26) adds up to yield the total result which is independent of \(s_0\).
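The mechanism behind this scale independence can be illustrated with a toy model, which is not the BC/DL-based evaluation used for our results. The Python sketch below assumes a single Reggeon-like trajectory with \(0< \alpha _0 < 1\) and a structure function that is an exact power law, \(F_1 = \gamma \tilde{\nu }^{\alpha _0}\), above a threshold \(\tilde{\nu }_{\mathrm {thr}}\). It works in the variable \(\tilde{\nu }\) rather than \(W^2\), in units of \(\alpha /M\), and omits the \(\tilde{\nu }_0\)-independent \(F_D^2\) term of Eq. (26). All names and parameter values are hypothetical.

```python
import math


def midpoint(f, lo, hi, n=20000):
    """Composite midpoint rule for the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


def toy_total(nu0, a=0.5, gamma=1.0, nu_thr=0.15):
    """Toy analogue of Eq. (26) in units of alpha/M, without the F_D^2 term.

    Assumes F_1 = gamma * nu**a above nu_thr and one trajectory with 0 < a < 1.
    """
    def f1(nu):
        return gamma * nu**a if nu > nu_thr else 0.0

    def f1_regge(nu):
        return gamma * (nu - nu0) ** a if nu > nu0 else 0.0

    # Real part of the Regge amplitude at nu = 0, cf. Eq. (28)
    regge_real = -2.0 * math.pi * gamma * nu0**a / math.sin(math.pi * a)

    # Region between the two branch points, where only one function is non-zero
    lo, hi = min(nu0, nu_thr), max(nu0, nu_thr)
    low_part = midpoint(lambda nu: (f1(nu) - f1_regge(nu)) / nu, lo, hi) if hi > lo else 0.0

    # Tail nu > hi, mapped onto w in (0, 1) via nu = hi / u with u = w**k
    k = 1.0 / (1.0 - a)

    def tail_integrand(w):
        u = w**k
        x = nu0 * u / hi
        # u**a * (F_1 - F_1^R) at nu = hi / u, written to avoid cancellation
        diff = -gamma * hi**a * math.expm1(a * math.log1p(-x))
        return diff * u ** (-1.0 - a) * k * w ** (k - 1.0)

    tail = midpoint(tail_integrand, 0.0, 1.0)
    return regge_real + 2.0 * (low_part + tail)
```

In this toy model the integrals can also be done in closed form, and the sum equals \(-2 \gamma \tilde{\nu }_{\mathrm {thr}}^{\alpha _0}/\alpha _0\) for any \(\tilde{\nu }_0\). Calling `toy_total` for \(\tilde{\nu }_0\) below and above the threshold reproduces this value numerically, although the individual terms vary strongly with \(\tilde{\nu }_0\).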

Fig. 7

The contribution of the individual terms in Eq. (26) to \({\mathrm {T}}^\mathrm {subt}_1(0,0)\) as a function of \( s_0 \). Dashed curve: the dispersion integral contribution from the BC fit \( \sim \int _{W^2_{\mathrm {thr}}}^{10~ \mathrm {GeV}^2} (F^{\mathrm {BC}}_1 - F_1^\mathrm{R}) \). Dashed-dotted curve: the dispersion integral contribution from the DL fit \( \sim \int _{10~\mathrm {GeV}^2}^\infty (F^{\mathrm {DL}}_1 - F_1^\mathrm{R}) \). Dotted curve: the dispersion integral contribution \( \sim - \int ^{W^2_{\mathrm {thr}}}_{s_0} F_1^\mathrm{R} \) due to \(F_1^\mathrm{R}\). Dashed double-dotted curve: the contribution from the real part \( \mathrm {T}_1^\mathrm {R}(0,0) \) according to Eq. (28). Solid curve: sum of all terms in Eq. (26), yielding the \(s_0\)-independent value of \(\mathrm {T}^{\mathrm {subt}}_1 (0,0) \)

In Fig. 3 we present the empirically extracted subtraction function \( \mathrm {T}^{\mathrm {subt}}_1 ( 0, \tilde{Q}^2 ) \) of Eq. (26) and compare it with the subtraction functions of Refs. [13, 15]. The subtraction function \( \mathrm {T}^{\mathrm {subt}}_1(0, \tilde{Q}^2)\) should vanish linearly when \(\tilde{Q}^2 \rightarrow 0\) according to Eq. (17). This general property therefore provides a quality check on the accuracy of an empirical determination as described above. One notices from Fig. 3 that the value of \( \mathrm {T}^{\mathrm {subt}}_1\) at \(\tilde{Q}^2 = 0\) is compatible with zero within 1–1.5\( ~\sigma \). We would like to note, however, that at present such an empirical determination can unfortunately only give the correct order of magnitude of \( \mathrm {T}^{\mathrm {subt}}_1 ( 0, \tilde{Q}^2 ) \). This is partly due to the imperfect match between the proton \(F_1\) fits for the resonance region and the large W region, as we have shown in Fig. 5. Despite this caveat, it seems that with increasing \( \tilde{Q}^2 \), \( \mathrm {T}^{\mathrm {subt}}_1 ( 0, \tilde{Q}^2 ) \) changes sign somewhere in the range between 0.1 and 0.4 GeV\(^2\), which may be an indication of the range up to which the ChPT-based results can be used. To provide a more accurate determination of the functional dependence of \( \mathrm {T}^{\mathrm {subt}}_1 ( 0, \tilde{Q}^2 ) \), a combined fit of all proton \(F_1\) structure function data over the whole range of W, incorporating the Regge behavior at large W, would be desirable. At intermediate values of \(Q^2\), below and around 1 GeV\(^2\), this will also require more accurate data in the intermediate W range between 3 and 10 GeV. At the lower end of this range, such data can be provided by the JLab 12 GeV facility.

Using our empirical determination of \( \mathrm {T}^{\mathrm {subt}}_1 ( 0, \tilde{Q}^2 ) \), we can extract \(\beta ( \tilde{Q}^2) \) by dividing \( \mathrm {T}^{\mathrm {subt}}_1(0,\tilde{Q}^2)\) by \(\tilde{Q}^2\) according to Eq. (17). To combine our empirical estimate of \( \mathrm {T}^{\mathrm {subt}}_1(0, \tilde{Q}^2)\) with the empirical value of \(\beta (0)\) as determined from RCS, we use the central curve in the empirically determined error band of \( \mathrm {T}^{\mathrm {subt}}_1 ( 0, \tilde{Q}^2 )\) (green band in Fig. 3) to extract \(\beta (\tilde{Q}^2)\) in the range \( \tilde{Q}^2 > 0.12~\mathrm {GeV}^2\), and extrapolate it by a linear function to the PDG value of \(\beta _M\) at \(\tilde{Q}^2 = 0 \). The resulting curve is displayed in Fig. 4. We will use this curve in the following to provide an empirical estimate for the subtraction function contribution to the TPE correction for muon–proton elastic scattering at small momentum transfer.
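The extraction procedure just described can be sketched in a few lines. This is only an illustration under stated assumptions: `t1_subt` is a hypothetical callable standing in for the central curve of Fig. 3, `norm` stands in for whatever constant factor Eq. (17) places between \( \mathrm {T}^{\mathrm {subt}}_1 \) and \( \tilde{Q}^2 \beta \) (set to 1 here purely as a placeholder), and the numbers in the usage lines are toy values, not results of this work.

```python
def make_beta(t1_subt, beta_m, q2_match=0.12, norm=1.0):
    """Build beta(Q2) from a subtraction-function curve.

    t1_subt  : hypothetical callable, T1_subt(0, Q2) with Q2 in GeV^2
    beta_m   : value imposed at Q2 = 0 (the text uses the PDG beta_M)
    q2_match : Q2 above which T1_subt / Q2 is used directly (0.12 GeV^2 in the text)
    norm     : placeholder for the constant factor of Eq. (17) (assumption)
    """
    beta_at_match = t1_subt(q2_match) / (norm * q2_match)

    def beta(q2):
        if q2 >= q2_match:
            # Direct extraction: divide the subtraction function by Q2.
            return t1_subt(q2) / (norm * q2)
        # Linear extrapolation from the matching point down to beta_m at Q2 = 0.
        return beta_m + (beta_at_match - beta_m) * q2 / q2_match

    return beta


def toy_t1_subt(q2):
    """Toy input (not data): vanishes linearly at Q2 = 0 and changes sign."""
    return q2 * (1.0 - 3.0 * q2)


beta = make_beta(toy_t1_subt, beta_m=2.5)
print(beta(0.0), beta(0.12), beta(0.5))
```

By construction the resulting curve is continuous at the matching point and reproduces the imposed value at \( \tilde{Q}^2 = 0 \); the slope below the matching point is fixed entirely by these two endpoints.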

4.4 TPE correction from the subtraction function

Using Eqs. (3)–(5), we can now estimate the TPE correction due to the \( \mathrm {T}^{\mathrm {subt}}_1 ( 0, \tilde{Q}^2 )\) contribution to the first term in the hadronic tensor of Eq. (7). Performing the traces in Eq. (5) explicitly, the subtraction function results in the following TPE correction in the region of low momentum transfers:

$$\begin{aligned} \delta ^{\mathrm {subt}}_{2 \gamma }= & {} \frac{32 \pi G_\mathrm{E} }{\varepsilon G^2_\mathrm{E} + \tau G^2_\mathrm{M} } \frac{1-\varepsilon }{1-\varepsilon _0} \frac{1}{M} \nonumber \\&\times \mathfrak {R}\mathop {{\int }} \frac{ i \mathrm {d}^4 \tilde{q}}{\left( 2 \pi \right) ^4} \beta \left( \tilde{Q}^2 - \frac{Q^2}{4} \right) \Pi _{K}^{+} \Pi _{K}^{-}\Pi _{Q}^{+} \Pi _{Q}^{-} \nonumber \\&\times \left\{ \left( K \cdot P \right) m^2 \left( \tilde{Q}^2 - \frac{Q^2}{4}\right) ^2 + \frac{1}{2} \left( Q^2 \left( P\cdot \tilde{q} \right) \right. \right. \nonumber \\&\quad \left. \left. - 4 \left( K\cdot P\right) \left( K\cdot \tilde{q} \right) \right) \left( K \cdot \tilde{q}\right) \left( \tilde{Q}^2 +\frac{Q^2}{4}\right) \right\} , \end{aligned}$$
(29)

where the lepton (photon) propagators \(\Pi ^\pm _K\) (\(\Pi ^\pm _Q\)) are defined as in Eq. (A5). The second term within the curly brackets of Eq. (29) can be simplified to yield the expression

$$\begin{aligned} \delta ^{\mathrm {subt}}_{2 \gamma }= & {} \frac{32 \pi m^2 G_\mathrm{E} }{\varepsilon G^2_\mathrm{E} + \tau G^2_\mathrm{M} } \frac{1-\varepsilon }{1-\varepsilon _0} \frac{\left( K \cdot P \right) }{M} \nonumber \\&\times \, \mathfrak {R}\mathop {{\int }} \frac{ i \mathrm {d}^4 \tilde{q}}{\left( 2 \pi \right) ^4} \beta \left( \tilde{Q}^2 - \frac{Q^2}{4} \right) \Pi _{K}^{+} \Pi _{K}^{-}\Pi _{Q}^{+} \Pi _{Q}^{-}\nonumber \\&\times \, \left\{ \left( \tilde{Q}^2 - \frac{Q^2}{4}\right) ^2 - \frac{ 2 \left( K \cdot \tilde{q}\right) ^2}{m^2 + \frac{Q^2}{4}} \left( \tilde{Q}^2 + \frac{Q^2}{4}\right) \right\} , \nonumber \\ \end{aligned}$$
(30)

making explicit the overall proportionality of \( \delta ^{\mathrm {subt}}_{2 \gamma } \) to the squared lepton mass \( m^2 \).

The integration in Eq. (30) is performed through a Wick rotation, as detailed in Appendix C, and the resulting TPE correction is given by

$$\begin{aligned} \delta ^{\mathrm {subt}}_{2 \gamma }= & {} \frac{m^2 G_\mathrm{E}\left( Q^2\right) }{\varepsilon G^2_\mathrm{E}\left( Q^2\right) + \tau G^2_\mathrm{M}\left( Q^2\right) } \frac{1-\varepsilon }{1-\varepsilon _0} \frac{\left( K \cdot P \right) }{M} \frac{Q}{K}\nonumber \\&\times \mathop {{\int }} \limits ^{~~\infty }_{x_{\mathrm {min}}} f \left( x, a \right) \beta \left( \frac{Q^2 \left( x -1 \right) }{4}\right) \frac{\mathrm {d} x}{4 \pi }, \end{aligned}$$
(31)

in terms of the dimensionless variable \( x = 4 \tilde{Q}^2/Q^2 \) with the weighting function \( f \left( x, a \right) \):

$$\begin{aligned} f \left( x, a \right)= & {} - \frac{2 x^{\Theta \left( 1-x \right) }}{\sqrt{1 + a}} \Theta \left( x \right) \nonumber \\&+ \frac{1+ 2 a - x}{ 1 + a} \frac{ | 1 - x |}{1+x} \left\{ \ln \left| \frac{ x- z }{ x + z } \right| \Theta \left( x \right) \Theta \left( 1-x \right) \right. \nonumber \\&\left. + \ln \left| \frac{z + 1 }{ z - 1} \right| \Theta \left( x - 1\right) + \ln \left| \frac{ x- z }{ x + z } \frac{z - 1 }{ z + 1} \right| \right. \nonumber \\&\left. \times \Theta \left( x - x_{\mathrm {min}} \right) \Theta \left( -x \right) \right\} , \end{aligned}$$
(32)

with

$$\begin{aligned}&z = \frac{1 - x - \sqrt{\left( 1+x\right) ^2 + 4 a x} }{2 \sqrt{ 1 + a }}, \nonumber \\&x_{\mathrm {min}} = - \left( \sqrt{1+a} - \sqrt{a} \right) ^2, ~\quad a = \frac{4 m^2}{Q^2}. \end{aligned}$$
(33)

At low momentum transfers the result of Eq. (31) starts from a term proportional to \( Q^2 \). We show the x (or \( \tilde{Q}^2 \)) dependence of the weighting function of Eq. (32) in Fig. 8.
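For readers who wish to evaluate the weighting function numerically, Eqs. (32) and (33) can be transcribed directly. The sketch below is a plain transcription, not the code used for the figures. The function names are our own, the step functions are implemented as strict inequalities, and the isolated points \( x = 0 \) and \( x = 1 \), where individual logarithms are singular, are simply excluded (an assumption about how to treat the boundaries). The muon mass and the \( Q^2 \) value in the usage lines are illustrative inputs.

```python
import math


def x_min(a):
    """Lower integration limit of Eq. (33)."""
    return -(math.sqrt(1.0 + a) - math.sqrt(a)) ** 2


def z_of(x, a):
    """Auxiliary variable z of Eq. (33)."""
    disc = (1.0 + x) ** 2 + 4.0 * a * x
    # disc vanishes at x = x_min; clip tiny negative rounding errors.
    return (1.0 - x - math.sqrt(max(disc, 0.0))) / (2.0 * math.sqrt(1.0 + a))


def weight_f(x, a):
    """Weighting function f(x, a) of Eq. (32) for x >= x_min, x != 0, x != 1."""
    if x < x_min(a) or x == 0.0 or x == 1.0:
        raise ValueError("x outside the domain handled by this sketch")
    z = z_of(x, a)
    root = math.sqrt(1.0 + a)
    if x > 1.0:
        first = -2.0 / root                      # x**Theta(1-x) = 1
        log_term = math.log(abs((z + 1.0) / (z - 1.0)))
    elif x > 0.0:
        first = -2.0 * x / root                  # x**Theta(1-x) = x
        log_term = math.log(abs((x - z) / (x + z)))
    else:
        first = 0.0                              # Theta(x) = 0
        log_term = math.log(abs((x - z) / (x + z) * (z - 1.0) / (z + 1.0)))
    prefactor = (1.0 + 2.0 * a - x) / (1.0 + a) * abs(1.0 - x) / (1.0 + x)
    return first + prefactor * log_term


# Illustrative inputs: muon mass in GeV and one Q^2 value in GeV^2.
m_mu, q2 = 0.105658, 0.02
a = 4.0 * m_mu ** 2 / q2
for x in (0.5 * x_min(a), 0.5, 2.0, 50.0):
    print(f"x = {x:8.4f}  f = {weight_f(x, a):.6f}")
```

A useful consistency check on Eq. (33) is that the argument of the square root in \( z \) vanishes exactly at \( x = x_{\mathrm {min}} \), so \( z \) is real over the whole integration range.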

Fig. 8

The weighting function f of Eq. (32) for the range of \(Q^2\) values of the MUSE experiment

The TPE contribution due to the subtraction function also provides a correction to the 2S–2P muonic hydrogen Lamb shift, where it constitutes the largest hadronic uncertainty in this precise quantity [1, 2]. Using the ChPT-based results for \( \beta ( \tilde{Q}^2 ) \) as input, this TPE correction was estimated in Refs. [13, 15] and found to be too small to resolve the proton radius puzzle. The correction from the empirically determined subtraction function discussed above to the 2S energy level in muonic hydrogen, after integration up to \( \tilde{Q}^2 = 1~\mathrm {GeV}^2 \), yields (see footnote 2):

$$\begin{aligned} \Delta E^{\mathrm {subt}}_{\mathrm {\mathrm {2S}}} \approx ( 2.0 - 2.3 ) ~\mu \mathrm {eV}, \end{aligned}$$
(34)

which is somewhat smaller than, but in fair agreement with, the estimate of Birse et al. [13]: \( \Delta E^{\mathrm {subt}}_{\mathrm {2S}} \approx 4.2 \pm 1.0 ~\mu \mathrm {eV} \). Our result of Eq. (34) is also within errors of the analogous evaluation of Ref. [22], where the authors assumed the existence of a \( J = 0 \) fixed pole. It was speculated in Ref. [23] that explaining the proton radius puzzle would require a huge enhancement of \( \beta ( \tilde{Q}^2 ) \) at large \( \tilde{Q}^2 \). Accounting for the experimentally observed discrepancy in \(\Delta E_{\mathrm {2S}}\) of around 310 \(\mu \)eV [5] would require a TPE correction around two orders of magnitude larger than the naturally expected result from the ChPT estimates. For this purpose, an ad hoc subtraction function, to be added as an extra contribution on top of the ChPT-based subtraction functions discussed above, was conjectured in Ref. [23] with the following functional form:

$$\begin{aligned}&\beta _{\mathrm {extra}} ( \tilde{Q}^2 ) = \left( \frac{\tilde{Q}^2}{M^2_0} \right) ^2 \frac{\beta _M}{ \left( 1 + \tilde{Q}^2 / \Lambda _0^2 \right) ^5}, \nonumber \\&\quad M_0 = 0.5 ~\mathrm {GeV}, \quad \Lambda _0 = 3.92 ~\mathrm {GeV}. \end{aligned}$$
(35)

In such a scenario, the large \( \tilde{Q}^2 \) region would also dominate the TPE correction to the muon–proton elastic scattering, and the integral of Eq. (31) would be approximated by

$$\begin{aligned} \delta ^{\mathrm {subt}}_{2 \gamma ,0}&\approx - \frac{3 m^2 G_\mathrm{E} }{\varepsilon G^2_\mathrm{E} + \tau G^2_\mathrm{M} } \frac{1-\varepsilon }{1-\varepsilon _0} \frac{\left( K \cdot P \right) }{ \pi M} \mathop {{\int }} \limits ^{~\infty }_0 \beta (\tilde{Q}^2 ) \frac{\mathrm {d} \tilde{Q}^2}{ \tilde{Q}^2}\nonumber \\&\approx - \frac{ 3 Q^2 m^2 }{2 \pi E } \mathop {{\int }} \limits ^{~\infty }_0 \beta (\tilde{Q}^2) \frac{\mathrm {d} \tilde{Q}^2}{\tilde{Q}^2}, \end{aligned}$$
(36)

where the last step gives the approximate expression in the limit \( Q^2 \ll M^2, ~M E, ~E^2 \). This approximation agrees in magnitude with the result of Ref. [23] for \( \mu ^{-} p \) scattering, but differs from it by an overall sign.
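For the conjectured form of Eq. (35), the logarithmically weighted integral appearing in Eq. (36) can be done in closed form. Substituting \( u = \tilde{Q}^2 / \Lambda _0^2 \) gives \( \int _0^\infty \beta _{\mathrm {extra}} \, \mathrm {d} \tilde{Q}^2 / \tilde{Q}^2 = \beta _M (\Lambda _0 / M_0)^4 \int _0^\infty u \, (1+u)^{-5} \, \mathrm {d} u = \beta _M (\Lambda _0 / M_0)^4 / 12 \), i.e. roughly \( 315 \, \beta _M \) for the quoted parameters. The short sketch below checks this numerically; the function names and the midpoint quadrature are our own choices for illustration, and the result is expressed in units of \( \beta _M \) so that no value of \( \beta _M \) needs to be assumed.

```python
M0 = 0.5        # GeV, Eq. (35)
LAMBDA0 = 3.92  # GeV, Eq. (35)


def beta_extra_ratio(q2):
    """beta_extra(Q2) / beta_M from Eq. (35); q2 in GeV^2."""
    return (q2 / M0 ** 2) ** 2 / (1.0 + q2 / LAMBDA0 ** 2) ** 5


def log_weighted_integral(n=20000):
    """Midpoint rule for int_0^inf beta_extra/beta_M dQ2/Q2.

    The half-line is mapped onto (0, 1) via Q2 = LAMBDA0^2 * s / (1 - s).
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        q2 = LAMBDA0 ** 2 * s / (1.0 - s)
        jacobian = LAMBDA0 ** 2 / (1.0 - s) ** 2   # dQ2/ds
        total += beta_extra_ratio(q2) / q2 * jacobian * h
    return total


closed_form = (LAMBDA0 / M0) ** 4 / 12.0
numeric = log_weighted_integral()
print(f"numeric = {numeric:.4f}, closed form = {closed_form:.4f}")
```

The factor \( (\Lambda _0 / M_0)^4 \) makes explicit why this conjectured function enhances the correction so strongly: the integral is dominated by \( \tilde{Q}^2 \) of order \( \Lambda _0^2 \), far above the region probed by the ChPT-based and empirical determinations.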

In Fig. 9 we compare the TPE correction to elastic muon–proton scattering (for MUSE kinematics) due to the ChPT-based and the empirically determined subtraction functions discussed above. To estimate the size of the uncertainties of the BChPT result [15], we plot a band corresponding with a variation of the upper integration limit in Eq. (31) between \( \tilde{Q}^2 = 0.9\) and 5 GeV\(^2\). We note that the HBChPT and BChPT results are in agreement within their uncertainties.

Fig. 9

Subtraction function contribution to the TPE correction in elastic muon–proton scattering for the muon lab momentum \( \mathrm {k} = 153 ~\mathrm {MeV} \). Blue band result for the HBChPT-based subtraction function [13]. Pink band result for the BChPT-based subtraction function [15]. Dashed doubly dotted (green) curve result based on the empirical subtraction function, corresponding with the dashed doubly dotted curve in Fig. 4. Solid curve result based on the conjectured subtraction function of Ref. [23]. The (black) dashed-dotted curve is the Feshbach term of Eq. (2) for a point-like Dirac particle corrected by the recoil factor \((1 + m / M)\). The sign labels on the curve show the sign of the corresponding expressions for \( \mu ^- p \) scattering

Fig. 10

The dependence of the integral of Eq. (31) on the upper integration limit \( \tilde{Q}^2_{\mathrm {max}} \) for three different estimates of the subtraction function \(\beta (\tilde{Q}^2)\) as described in the text

The TPE correction due to the empirically extracted subtraction function is also shown in Fig. 9, giving a similar though slightly smaller result. This can be understood from the fact that the empirically determined \(\beta (\tilde{Q}^2)\) changes sign as a function of \(\tilde{Q}^2\). The region of \(\tilde{Q}^2\) contributing to the above result is shown in Fig. 10. One sees that the TPE integral has largely converged for an upper integration limit of around \(\tilde{Q}^2_\mathrm {max} \sim 1\) GeV\(^2\).

In Fig. 9, we also show the TPE correction to elastic muon–proton scattering resulting from the subtraction function conjectured in Ref. [23] to explain the proton radius puzzle through enhancing the TPE corrections by nearly two orders of magnitude. Even though the weighting functions entering the TPE corrections in the muonic hydrogen Lamb shift and in elastic muon–proton scattering are different, one notices from Fig. 9 that the subtraction function of Ref. [23] also yields a TPE correction for elastic muon–proton scattering that is larger by nearly two orders of magnitude. To put this in perspective, we also display in Fig. 9 the model-independent estimate of the elastic TPE contribution, which has to be added on top of the inelastic TPE contribution, and which is due to the Feshbach term of Eq. (2) corrected by the recoil factor \((1 + m / M)\). One notices that the use of such a large subtraction function would yield an inelastic TPE correction to elastic muon–proton scattering which in magnitude would already exceed the elastic Feshbach contribution around \(Q^2 = 0.02\) GeV\(^2\), and would increase further with increasing \(Q^2\).

5 Inelastic contribution to TPE correction

Besides the subtraction function contribution, the inelastic TPE correction to elastic muon–proton scattering includes the contribution of the DR integrals in Eqs. (12) and (13). Using Eqs. (3) and (4) and working out the traces in Eq. (5), the corresponding contributions from the unpolarized proton structure functions \( F_1 \) and \( F_2 \) to \(\delta _{2 \gamma }\) are given by

$$\begin{aligned} \delta _{2 \gamma }^{F_1}= & {} F ~ \mathfrak {R}\mathop {{\int }} \limits ^{~~ \infty }_{W^2_{\mathrm {thr}}} \mathrm {d} W^2 \mathop {{\int }} \frac{ i \mathrm {d}^4 \tilde{q}}{\left( 2 \pi \right) ^4}\nonumber \\&\times \Pi _{Q}^{+} \Pi _{Q}^{-} F_1\left( W^2,\tilde{Q}^2-\frac{Q^2}{4}\right) \frac{(P\cdot \tilde{q})^2 }{ W^2 - P^2 + \tilde{Q}^2 } \nonumber \\&\times \frac{\left\{ A \left( \Pi _{K}^{-} + \Pi _{K}^{+} \right) + B \left( \Pi _{K}^{-} - \Pi _{K}^{+} \right) \right\} }{\left( \left( P + \tilde{q} \right) ^2 - W^2 \right) \left( \left( P - \tilde{q} \right) ^2 - W^2 \right) } , \end{aligned}$$
(37)
$$\begin{aligned} \delta _{2 \gamma }^{F_2}= & {} F ~ \mathfrak {R}\mathop {{\int }} \limits ^{~~ \infty }_{W^2_{\mathrm {thr}}} \mathrm {d} W^2 \mathop {{\int }} \frac{ i \mathrm {d}^4 \tilde{q}}{\left( 2 \pi \right) ^4}\nonumber \\&\times \Pi _{Q}^{+} \Pi _{Q}^{-} F_2\left( W^2,\tilde{Q}^2-\frac{Q^2}{4}\right) \nonumber \\&\times \frac{ \left\{ C \left( \Pi _{K}^{-} + \Pi _{K}^{+} \right) + D \left( \Pi _{K}^{-} - \Pi _{K}^{+} \right) \right\} }{\left( \left( P + \tilde{q} \right) ^2 - W^2 \right) \left( \left( P - \tilde{q} \right) ^2 - W^2 \right) } , \end{aligned}$$
(38)

with definitions introduced in Eqs. (A2)–(A5) and the following notation:

$$\begin{aligned} F= & {} \frac{8 e^2}{M^2} \frac{G_\mathrm{E}}{\varepsilon G_\mathrm{E}^2 + \tau G^2_\mathrm{M}} \frac{1-\varepsilon }{1-\varepsilon _0}, \nonumber \\ A= & {} -4 m^2\left( K\cdot P\right) , \nonumber \\ B= & {} \frac{\tilde{Q}^2+\frac{Q^2}{4}}{\tilde{Q}^2-\frac{Q^2}{4}}\left( Q^2 \left( P\cdot \tilde{q} \right) - 4 \left( K\cdot P\right) \left( K\cdot \tilde{q} \right) \right) , \nonumber \\ C= & {} \frac{1}{2} \left( K\cdot P\right) \left( 4 \left( K\cdot P\right) ^2 - Q^2 P^2 \right) + \frac{Q^2}{\tilde{Q}^2-\frac{Q^2}{4}} \left( P\cdot \tilde{q}\right) \nonumber \\&\times \left( \left( K\cdot \tilde{q} \right) P^2 - \left( P\cdot \tilde{q}\right) \left( K\cdot P\right) \right) , \nonumber \\ D= & {} -\frac{1}{4} \left( P^2 + \frac{ (P\cdot \tilde{q})^2 Q^2}{\left( \tilde{Q}^2-\frac{Q^2}{4}\right) ^2 }\right) \nonumber \\&\times \left( Q^2 \left( P\cdot \tilde{q}\right) - 4 \left( K\cdot P\right) \left( K\cdot \tilde{q} \right) \right) \nonumber \\&-\ \frac{1}{2} \left( P\cdot \tilde{q}\right) \frac{ \tilde{Q}^2-\frac{3 Q^2}{4} }{ \tilde{Q}^2-\frac{Q^2}{4} } \left( 4 \left( K\cdot P\right) ^2 - Q^2 P^2 \right) . \end{aligned}$$
(39)

Our numerical studies of the inelastic TPE contribution indicate that, in the limit \( Q^2 \ll m^2, ~M^2, ~ M E \), the momentum transfer expansion starts with a \( Q^2 \) term and contains no \( Q^2 \ln Q^2 \) type of non-analyticity. This is unlike the elastic electron–proton scattering case, where in the limit \( m^2 \ll Q^2 \ll M^2 , ~ M E \) a non-analytic behavior of the type \( Q^2 \ln Q^2\) is present at low \(Q^2\) [10, 12].

Fig. 11

W dependence of the integrand \(f \left( W \right) \), which determines the inelastic TPE correction, as given by Eq. (40). The integrand is shown for the case of \( e^{-} p \) and \( \mu ^{-} p \) elastic scattering. The external kinematics (indicated on the plots) correspond with the MUSE experiment

Fig. 12

Same as Fig. 11, but for the lepton momentum \( {k} = 115 ~\mathrm {MeV} \)

Fig. 13

TPE correction for \( \mu ^{-} p \) elastic scattering for three different muon lab momenta as planned in the MUSE experiment. The TPE correction due to the subtraction function is shown for three subtraction function inputs: Birse et al. [13] (blue bands), BChPT [15] (solid curves), and the empirical determination as described in Sect. 4 (dashed doubly dotted curves). The inelastic TPE correction due to the dispersion integrals over the proton structure functions \( F_1\) and \( F_2 \) is shown by the dashed-dotted curves. The resulting total inelastic TPE correction (sum of both) is shown by the green bands using the subtraction function of Birse et al.

To provide numerical estimates of the inelastic TPE contribution to elastic muon–proton scattering due to the dispersion integrals, we express the corresponding integrals of Eqs. (37) and (38) in the form

$$\begin{aligned} \delta _{2 \gamma }^{F_1, F_2} = \delta _{2 \gamma }^{F_1} + \delta _{2 \gamma }^{F_2} =\mathop {{\int }} \limits ^{~~\infty }_{W^2_{\mathrm {thr}}} f \left( W \right) \mathrm {d} W^2. \end{aligned}$$
(40)

In Figs. 11, 12 we compare the W dependence of the integrand \( f \left( W \right) \) in Eq. (40) for the \(\mu ^- p\) and \(e^- p\) elastic scattering processes. As input for the proton structure functions \(F_1\) and \(F_2\), we use the fit performed by Christy and Bosted [20]. We find that the \(\tilde{Q}^2\) integrations are well saturated when performed up to \(\tilde{Q}^2 = 8\) GeV\(^2\), which is the largest value covered by the BC fit. As a test, we extended the BC fit beyond its fit region and found that the relative contribution from the region \( 8~\mathrm {GeV}^2 < \tilde{Q}^2 < 12~\mathrm {GeV}^2 \) is smaller than \( 0.015 ~\% \).

Figures 11, 12 show results in different kinematics corresponding with the MUSE experiment. The TPE corrections to \(e^- p \) are sizeably larger than for the \(\mu ^- p \) case at low \(Q^2\). With increasing \(Q^2\), the \(\mu ^- p\) TPE corrections increase, as is evident from the result at lower beam momentum in Fig. 12, where at \(Q^2 = 0.03\) GeV\(^2\) both corrections reach similar sizes. We furthermore notice in Fig. 11 that the integrand for the elastic \(e^- p\) scattering displays a narrow peak corresponding with the quasi-real photon singularity (for both photons), see Ref. [10], which is absent for the \(\mu ^- p\) case.

Fig. 14

The total TPE correction for \(\mu ^- p\) elastic scattering is shown as sum of the elastic TPE, the TPE correction from the \( F_1\) and \( F_2 \) proton structure functions and the TPE correction from the subtraction function of Ref. [13]. It is compared with the Feshbach term for point-like particles, see Eq. (2), corrected by the recoil factor \( (1 + m / M )\), the elastic contribution based on the box graph evaluation with dipole form factors [9] and the \( e^- p \) total TPE correction [10]

In estimating the inelastic TPE correction to elastic lepton–proton scattering, we find that the W integration in Eq. (40) is well saturated when performed up to \( W = 3.1 \) GeV, which is the largest value covered by the BC fit. When again extending the BC fit beyond its fit range as a test, we checked that the relative contribution from the region \( 3.1 ~\mathrm {GeV} < W < 4 ~\mathrm {GeV} \) to \(\delta _{2 \gamma }^{F_1, F_2}\) is smaller than \( 1.5 ~\% \). We estimate the uncertainties of the numerical integration coming from the integration regions outside the BC fit and from the inaccuracies in the BC fit to be at the 5–6\( ~\% \) level.

The resulting inelastic TPE corrections for the elastic \(\mu ^- p\) scattering process are shown in Fig. 13 as a function of \(Q^2\) for three values of muon beam momentum, corresponding with the MUSE kinematics. Note that the muon beam lab momenta \( {k} = 115 ~\mathrm {MeV} \), \( {k} = 153 ~\mathrm {MeV} \), and \( {k} = 210 ~\mathrm {MeV} \), correspond with the kinematically allowed regions of \( Q^2 < 0.039~\mathrm {GeV}^2\), \( Q^2 < 0.066~\mathrm {GeV}^2\), and \( Q^2 < 0.116~\mathrm {GeV}^2\), respectively. We notice that for the small momentum transfers corresponding with the MUSE kinematics, the inelastic TPE corrections to elastic \(\mu ^- p\) scattering are very small, in the range of \(\delta _{2 \gamma } \sim 5 \times 10^{-4}\). This is well below the anticipated cross section precision of around 1 % of the MUSE experiment. Furthermore, we notice that the TPE corrections due to the subtraction function and the dispersive \(F_1, F_2\) structure function integrals come with opposite signs, leading to a partial cancellation.
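The kinematic limits quoted above follow from two-body elastic kinematics: the largest momentum transfer corresponds to backward scattering in the centre-of-mass frame, \( Q^2_{\mathrm {max}} = 4 M^2 k^2 / s \) with \( s = m^2 + M^2 + 2 M E \) and \( E = \sqrt{k^2 + m^2} \). A minimal sketch of this check, in which the function name and the rounded mass values are our own inputs rather than quantities taken from the text:

```python
import math

M_PROTON = 0.938272  # GeV (rounded input value)
M_MUON = 0.105658    # GeV (rounded input value)


def q2_max(k, m=M_MUON, big_m=M_PROTON):
    """Largest Q^2 (GeV^2) in elastic scattering of a lepton of mass m
    and lab momentum k (GeV) off a target of mass big_m at rest."""
    energy = math.sqrt(k * k + m * m)            # lepton lab energy
    s = m * m + big_m * big_m + 2.0 * big_m * energy
    return 4.0 * big_m * big_m * k * k / s       # 4 * (c.m. momentum)^2


for k in (0.115, 0.153, 0.210):
    print(f"k = {1000 * k:.0f} MeV  ->  Q2_max = {q2_max(k):.4f} GeV^2")
```

The three values obtained this way should land close to the upper limits of the allowed regions quoted in the text for the MUSE beam momenta.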

We present in Fig. 14 the total TPE correction as a sum of the Born TPE correction of Ref. [9], corresponding with a proton intermediate state, and the inelastic TPE correction of this work using the subtraction function of Birse et al. [13]. We compare our result with the Feshbach term of Eq. (2) for a point-like Dirac particle corrected by the recoil factor \( (1 + m / M )\), with the elastic TPE correction based on the box graph evaluation with proton form factors of the dipole form [9], and with the corresponding TPE correction for elastic \( e^- p \) scattering of Ref. [10]. In the electron–proton scattering case, the subtraction function contribution is negligible [10] as it is proportional to the lepton mass squared. In the case of muon–proton scattering, by contrast, the inelastic proton structure function contribution is partially canceled by the \( \mathrm {T}_1 \) subtraction function, resulting in a negligibly small inelastic TPE correction for the MUSE kinematics. Only with increased lepton beam energy or when going to larger \(Q^2\) values does one need to start accounting for the inelastic TPE correction, which shifts the total correction a little closer to the Feshbach result.

6 Conclusions

In this work we have estimated the TPE correction to muon–proton elastic scattering at low momentum transfer. For the elastic (proton) intermediate state contribution, we have derived a low-momentum transfer expansion accounting for all terms due to the non-zero lepton mass. Besides the elastic contribution, we have accounted for the inelastic intermediate states by expressing the TPE process at low momentum transfer approximately through the forward doubly virtual Compton scattering. The input in our evaluation of the inelastic TPE correction is given by the unpolarized proton structure functions and by one subtraction function, corresponding with the forward Compton amplitude \(\mathrm {T}_1\) at zero photon energy. For the latter, we have compared two estimates based on heavy-baryon and baryon chiral perturbation theory with an empirical determination. For the empirical determination, we have expressed the subtraction function through an unsubtracted dispersion relation for the amplitude \(\mathrm {T}_1 - \mathrm {T}_1^R\). The function \(\mathrm {T}_1^R\) is suitably defined through a Regge pole fit such that at high energies \((\mathrm {T}_1 - \mathrm {T}_1^R) \rightarrow 0\), ensuring convergence and applicability of the unsubtracted dispersion relation. We have provided a numerical evaluation of the subtraction function based on a Regge fit of high-energy proton structure function data. We found that the extracted subtraction function is compatible in magnitude with the chiral perturbation theory calculations, and thus cannot explain the proton radius puzzle through missing TPE corrections, which would require a total TPE correction larger by around an order of magnitude than the empirical and chiral perturbation theory-based evaluations. Besides the subtraction function, the second part of the inelastic TPE contribution was obtained through dispersion integrals over the unpolarized proton structure functions. For these, we used a fit of the data in the proton resonance region. Using our formalism, we have provided estimates for the total TPE corrections in the kinematics of forthcoming muon–proton elastic scattering data of the MUSE experiment. We found that in the MUSE kinematics, the elastic TPE contribution largely dominates, and the size of the inelastic TPE contributions is within the anticipated error of the forthcoming data.