1 Introduction

Generalized parton distributions (GPDs), which are theorized to image the three-dimensional structure of the nucleon in the joint phase space of transverse spatial position and longitudinal momentum, can be probed by exclusive scattering processes. The cleanest of them, Deeply Virtual Compton Scattering (DVCS) [1,2,3,4,5], is illustrated by the handbag diagram in Fig. 1. The DVCS amplitudes in the Bjorken limit can be decomposed either into helicity amplitudes or, equivalently, into complex structure functions, viz. Compton Form Factors (CFFs), which are to be measured experimentally [6,7,8,9]. The CFFs \({\mathcal H}\), \({\mathcal E}\), \(\widetilde{{\mathcal H}}\) and \(\widetilde{{\mathcal E}}\) are convolutions of the corresponding chiral-even GPDs H, E, \(\widetilde{H}\) and \(\widetilde{E}\), respectively, with renormalized coefficient functions calculable at any order of perturbative QCD (pQCD). Several global extractions of CFFs from DVCS experimental data, based on neural networks, have now been released [10,11,12,13,14]. Retrieving GPDs from CFFs, known as the deconvolution problem, is still a challenging endeavour [15,16,17,18,19] and has only recently advanced to neural network modelling [20]. Through the gravitational form factors (GFFs), GPDs can also reveal the mechanical properties of the nucleon, e.g. the mechanical radius and the internal pressure and shear force distributions [21,22,23,24], although these currently suffer from large systematic errors [25,26,27].

The complete and precise extraction of all GPDs from exclusive-process data places the highest demands on both theory and facilities. Cutting-edge GPD experiments have been carried out over the past two decades by ZEUS [28, 29], H1 [30,31,32,33], HERMES [34], COMPASS [35], Hall A [36, 37] and CLAS [38], and are proposed to continue at the JLab upgrade [39,40,41,42] and at future electron-ion colliders [43,44,45,46,47,48]. The available data, limited to the valence region, constrain the GPD H much more strongly than the other GPDs, as recognized in the VGG [49, 50] and Goloskokov-Kroll (GK) models [51,52,53,54,55,56,57,58]. The GPD E appears in measurements with a neutron or a transversely polarized nucleon; such measurements are pioneering but suffer from scarcity and sizable statistical uncertainties [59,60,61,62]. The GPD E has the Sivers function as its partner, in that neither depends on the quark helicity and both involve a flip of the nucleon helicity. The GPD E is important in its own right because of its connection to the parton orbital angular momentum (OAM) [3], which plays an important role in the proton spin decomposition [54, 63, 64]. From this perspective, DVCS measurements with a transversely polarized proton beam are proposed as one of the priority programs at the proposed Electron-Ion Collider in China (EicC), designed to collide a 3.5 GeV polarized electron beam of 80% polarization with a 20 GeV polarized proton beam of 70% polarization at an instantaneous luminosity of \(2 \times 10^{33}\) cm\(^{-2}\) s\(^{-1}\) or higher [46,47,48]. EicC aims to bridge the kinematic coverage, usually referred to as the sea quark region, between JLab and the EIC at Brookhaven National Laboratory (BNL). In view of the precision required at planned facilities to make substantial progress in understanding the DVCS process, an unbiased determination of CFFs or GPDs from all existing measurements is particularly relevant for future experimental design.
For the time being, a global extraction of GPDs from the full data sets has not yet been accomplished, but much progress has been made at the level of CFFs by several groups [10,11,12,13,14] and at the cross-section level by the FemtoNet deep neural network [14, 65]. Among them, the PARTONS framework [12, 13, 66] is publicly available as open source, and its unbiased uncertainty propagation, embodied in the replica method, serves as a valuable toolkit for our purpose.

Fig. 1

The handbag diagram of Deeply Virtual Compton Scattering on the proton \(e p \rightarrow e' p' \gamma \) at leading twist and leading order. Here \(Q^2 = -q^2\) is the virtuality of the exchanged photon between the initial electron and proton with q the virtual photon 4-vector. The x is the average fractional longitudinal momentum of the active parton, and \(\xi \) is half the difference of longitudinal momentum fractions between the initial and the final parton. The Mandelstam variable \(t = (p - p')^2\) is the squared four-momentum transfer between the initial and final proton

In this paper, we introduce a Bayesian reweighting strategy to investigate the impact of EicC pseudo-data on the extraction of CFFs. Specifically, in Sect. 2 we present the methodology applied to generate pseudo-data for the transversely polarized proton beam-spin asymmetry within the kinematic region of EicC. In Sect. 3, after introducing the reweighting tool used in our analysis, we present the sensitivity of the CFF \(\text {Im}\, {\mathcal E}\) to specific data samples in a more quantitative manner within the PARTONS artificial neural network (ANN). We summarize our findings and make further remarks in Sect. 4.

2 Generation of pseudo-data

2.1 Remarks on theory and kinematics

At leading order, the DVCS process off the proton in Fig. 1 factorizes into the partonic channel \( q \gamma ^* \longrightarrow q \gamma \), where the active parton q is re-absorbed into the proton, which remains intact after the collision. The observed \(e p \rightarrow e' p' \gamma \) reaction is the superposition of DVCS with the well-known Bethe-Heitler (BH) process, the latter being initial- and final-state radiation described by electromagnetic form factors in quantum electrodynamics (QED). The GPDs, originally introduced to describe this kind of deeply exclusive process and evolving with the photon virtuality \(Q^2\), depend on the Mandelstam variable t associated with the four-momentum transfer to the proton, the average longitudinal momentum fraction x of the struck parton, and the skewness \(\xi \), which characterizes the longitudinal momentum fraction transferred to the parton. GPDs are connected to many other interesting physical quantities. In particular, performing a Fourier transform with respect to the transverse component of the momentum transfer in the limit \(\xi = 0\) yields the impact-parameter-dependent GPDs [67, 68], which give the probability density of finding a parton with longitudinal momentum fraction x at transverse distance \({b}_{\bot }\) from the momentum centre of the nucleon, and thus provide tomographic images of the nucleon in a 1+2 dimensional joint representation of longitudinal momentum and transverse position.

The first moments of the GPDs H, E, \(\widetilde{H}\), and \(\widetilde{E}\) are related to the Dirac, Pauli, axial and pseudoscalar form factors, respectively. The second moments of the GPDs are relevant to the total angular momenta for quark and gluon through Ji’s sum rule [3],

$$\begin{aligned} J_{q,g}=\frac{1}{2}\int _{-1}^{1}\mathrm dx \,x \left[H_{q,g}(x,\xi ,t =0)+E_{q,g}(x,\xi ,t =0) \right]. \end{aligned}$$
(1)

Since the GPD H (and also \(\widetilde{H}\)) in the forward limit (\(\xi = 0\), \(t =0\)) reduces to the usual parton distributions constrained by deeply inelastic scattering (DIS) experiments, the GPD E is the genuinely interesting quantity for the parton angular momentum. Here \(J_{q,g}\) combine gauge-invariantly into the nucleon spin,

$$\begin{aligned} \frac{1}{2}=J_q + J_g = \frac{1}{2}\Delta \Sigma + L_q+ J_g , \end{aligned}$$
(2)

with \(\frac{1}{2}\Delta \Sigma \), \(L_q\) and \(J_g\) being the quark spin angular momentum, quark OAM and gluon total angular momentum, respectively. To study the OAM of the partons, one needs to go beyond one-dimensional parton distributions. Since the quark helicity contribution \(\Delta \Sigma \) is also known from DIS experiments, the GPD E is the sole missing piece toward a full understanding of quark OAM. Indeed, the GPD E describes amplitudes in which the nucleon spin flips while the parton helicities do not in the light-cone frame, implying one unit of change in OAM between the initial and final nucleon states. This provides a practical way to quantify the quark OAM inside the nucleon on a solid theoretical basis.

At leading twist, the GPDs F (H, E, \(\widetilde{H}\), and \(\widetilde{E}\)), which describe the soft structure of the nucleon, enter the DVCS cross section through its sub-amplitudes, the CFFs \({\mathcal F}\) (\({\mathcal H}\), \({\mathcal E}\), \(\widetilde{{\mathcal H}}\) and \(\widetilde{{\mathcal E}}\)), given by convolutions of the GPDs over the variable x [9, 69],

$$\begin{aligned} {\mathcal F}(\xi ,t,Q^2) = \sum _{q=u,d,s, \cdots } e_q^2 \int _{-1}^1 \mathrm dx\, \left[ \frac{1}{\xi - x- i \epsilon } \mp \frac{1}{\xi + x -i \epsilon } \right] F^q (x,\xi ,t) , \end{aligned}$$
(3)

where the sum runs over quark flavors q, and the upper and lower signs apply to the unpolarized GPDs (H, E) and the polarized GPDs (\(\widetilde{H}\), \(\widetilde{E}\)), respectively. At leading order we have \(\xi \simeq x_B/(2-x_B)\). As genuine observables, CFFs can be precisely measured and separated using various cross sections and asymmetries with respect to different beam spin states, in particular through their azimuthal modulations in different kinematic bins (\(Q^2\), \(\xi \), t) (or equivalently \(Q^2\), \(x_B\), t) [70]. This is the first step towards a global extraction of GPDs from worldwide data.
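The leading-order relation between the skewness and the Bjorken variable quoted above can be illustrated with a minimal sketch. The function names are illustrative only and are not taken from any framework cited in the text; the sample \(x_B\) values are those mentioned for the EicC sea quark region.

```python
def skewness_from_xb(x_b: float) -> float:
    """Leading-order relation xi ~= x_B / (2 - x_B) quoted in the text."""
    if not 0.0 < x_b < 1.0:
        raise ValueError("x_B must lie in (0, 1)")
    return x_b / (2.0 - x_b)


def xb_from_skewness(xi: float) -> float:
    """Algebraic inverse of the relation above: x_B = 2 xi / (1 + xi)."""
    if not 0.0 < xi < 1.0:
        raise ValueError("xi must lie in (0, 1)")
    return 2.0 * xi / (1.0 + xi)


if __name__ == "__main__":
    # Illustrative x_B values spanning the sea quark region discussed in the text.
    for x_b in (0.004, 0.01, 0.1):
        print(f"x_B = {x_b:<6} -> xi ~ {skewness_from_xb(x_b):.5f}")
```

For small \(x_B\) the relation reduces to \(\xi \approx x_B/2\), which is why kinematic bins in (\(Q^2\), \(\xi \), t) and (\(Q^2\), \(x_B\), t) can be used interchangeably.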

However, the GPD E, and hence the CFF \({\mathcal E}\), which is involved in the extraction of the quark OAM along the longitudinal axis, appears only in experiments with a neutron target or a transversely polarized proton beam. Such measurements are currently available only in a limited kinematic domain from the fixed-target experiment HERMES [59,60,61,62] and have recently been proposed at JLab [38, 42]. At present only the valence region has been accessed experimentally, with insufficient accuracy, which is expected to be improved by the JLab 12 GeV upgrade [39]. Understanding how sea quarks and gluons behave inside the nucleon requires future colliders, whose kinematic reach is depicted in Fig. 2 together with existing measurements. It is now widely recognized that such extensive programs, combining large kinematic coverage with accurate measurements of asymmetries and cross sections, are indispensable for a high-precision extraction of GPDs. With the current configuration of electron and proton beam energies (3.5 GeV \(\times \) 20 GeV), EicC will dramatically extend the high-statistics experimental study of GPDs to the sea quark region 0.01 \(< x_B<\) 0.1 when restricted to the safely perturbative region 2 \(< Q^2<\) 30 GeV\(^2\), and will reach down to \(x_B\sim 0.004\) as \(Q^2\) approaches 1 GeV\(^2\). It is also feasible to explore the uncharted domain of larger virtuality \(Q^2>\) 30.0 GeV\(^2\), with acceptable statistics for measuring the differential cross section. EicC will also reach the valence quark region at higher \(Q^2\) than the JLab 12 GeV program. This kinematic coverage makes EicC unique for exploring the spatial tomography of sea quarks inside the nucleon, and thus complementary to the EIC at RHIC, which aims at understanding the glue. EicC is also distinctive in that the interference between the DVCS and BH processes is more prominent at a lower-energy machine, whereas the DVCS process is expected to dominate at high energies.

Fig. 2

The kinematic domain of EicC (green hatched area) for measurements of different single spin asymmetries in comparison with the planned JLab 12 GeV program (blue hatched zone) and future EIC at BNL with low (45 GeV) and high (140 GeV) energy scenario. The points are the current DVCS world data at HERA (H1, ZEUS, HERMES) and JLab 6 GeV and 12 GeV (CLAS and Hall A). The figure is adapted from Refs. [44, 71]

2.2 Pseudo-data production

In the present study an integrated luminosity of 50 fb\(^{-1}\) is assumed for the generated data samples, corresponding to around 290 days of data taking at the current design luminosity of EicC with 100% operational efficiency. The DVCS and BH processes, together with their interference term, have been simulated with the Monte Carlo (MC) generator MILOU [73], slightly modified from its original version [74, 75]. The DVCS amplitude is evaluated from CFF tables generated in a GPD-inspired framework at next-to-leading order (NLO) and twist-two accuracy, including NLO GPD evolution [76]. The exponential t-dependence of the DVCS amplitude is introduced with a constant t-slope parameter of 5.45 GeV\(^{-2}\) in the MILOU steering card. Figure 3 shows the event samples obtained from the MILOU package with the following loose selection criteria: the invariant mass of the \(\gamma p'\) final state is required to satisfy \(W>\) 2.0 GeV to exclude the resonance contribution; the detector acceptance is 0.01 \(< y<\) 0.95; the momentum P\(_{p'}\) of the final proton in the laboratory frame is smaller than 99% of the initial proton momentum; and the scattering angle \({\theta }_{p'}\) is larger than 2 mrad. The scattered-proton acceptance is restricted to 0.01 \(< |t|<\) 1.0 GeV\(^2\) with a resolution of \(\Delta |t|>\) 0.02 GeV\(^2\). The possibility of reaching 0.002 \(< |t|<\) 0.01 GeV\(^2\) is also examined; it depends critically on the final detector performance. For comparison, the values \(|t|>\) 0.03 GeV\(^2\) and \(\Delta |t|>\) 0.03 GeV\(^2\) are adopted in the EIC simulation [74].
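The quoted running time can be cross-checked with a short calculation. The sketch below assumes the design luminosity of \(2 \times 10^{33}\) cm\(^{-2}\) s\(^{-1}\) given in the Introduction and uses the unit conversion 1 fb\(^{-1}\) = \(10^{39}\) cm\(^{-2}\); the function name is illustrative.

```python
FB_INV_TO_CM2_INV = 1.0e39   # 1 fb^-1 expressed in cm^-2
SECONDS_PER_DAY = 86400.0


def running_time_days(int_lumi_fb_inv: float,
                      inst_lumi_cm2_s: float,
                      efficiency: float = 1.0) -> float:
    """Days of data taking needed to collect a given integrated luminosity."""
    if inst_lumi_cm2_s <= 0.0 or not 0.0 < efficiency <= 1.0:
        raise ValueError("luminosity must be positive and efficiency in (0, 1]")
    seconds = int_lumi_fb_inv * FB_INV_TO_CM2_INV / (inst_lumi_cm2_s * efficiency)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    days = running_time_days(50.0, 2.0e33)
    print(f"50 fb^-1 at 2e33 cm^-2 s^-1: about {days:.0f} days")
```

With 100% operational efficiency this gives roughly 289 days, consistent with the figure of around 290 days stated above; a lower efficiency lengthens the running time proportionally.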

Fig. 3

The event distribution of DVCS in the (\(x_B\), \(Q^2\)) plane at EicC. The samples contain DVCS, BH, and their interference term. Perfect detector acceptance and efficiency are assumed apart from the explicitly shown kinematic cuts (blue lines) and bin scheme (red rectangles). The green y = 0.85 line is shown only as a guide

Fig. 4

The projected relative accuracy for the \(\text {A}_{\text {UT}}^{\text {SC}} \equiv \text {A}_{\text {UT}}^{\sin (\phi -\phi _s)\cos \phi }\) asymmetry in DVCS with a transversely polarized proton beam in the region 1.0 GeV\(^2<Q^2<30.0\) GeV\(^2\) for an integrated luminosity of 50.0 fb\(^{-1}\) at EicC. Only the statistical uncertainty, obtained from the MILOU generator, is included. The relative uncertainty of each data point should be read off the scale on the right-hand vertical axis of the plots. The size of \(\text {A}_{\text {UT}}\) is estimated with the GK model [55, 56, 72]. The black star is the HERMES measurement of the \(\text {A}_{\text {UT,I}}^{\sin (\phi -\phi _s)\cos \phi }\) asymmetry [61]. The values of the |t| bins at the same \(Q^2\) are not shown for simplicity

The (\(x_B\), \(Q^2\)) bin scheme of Fig. 3 is logarithmically spaced in seven \(Q^2\)-bins:

$$\begin{aligned} [1.0, 1.6],\quad [1.6, 2.6],\quad [2.6, 4.3],\quad [4.3, 7.0],\quad [7.0, 18.5],\quad [18.5, 30.0],\quad [30.0, 80.0]\, \text {GeV}^2 , \end{aligned}$$

each of which contains at most 7 \(x_B\)-bins. In total, 27 kinematic bins in the (\(x_B\), \(Q^2\)) plane are accessible with considerable statistical significance in the range 1.0 GeV\(^2 < Q^2<\) 30.0 GeV\(^2\). In addition, one bin at \(Q^2 \in \) [30.0, 80.0] GeV\(^2\), whose statistical uncertainty is under good control only for differential cross-section measurements, extends the reachable phase space of EicC to very high \(Q^2\). Each (\(x_B\), \(Q^2\)) bin is further divided into t-bins, and 22 of them have at least 2 t-bins, providing a handle on the impact-parameter space after Fourier transform. Altogether, each of the 69 bins in (\(x_B\), \(Q^2\), \(-t\)) space can be divided into 18 \(\phi \)-bins \(\in \) [0, 2\(\pi \)], where \(\phi \) is the angle between the leptonic plane and the real-photon production plane. The number of events in each kinematic bin is proportional to \(\mathrm d\sigma (\phi ,\phi _S)\), shorthand for the five-fold differential cross section,

$$\begin{aligned} \mathrm d\sigma (\phi ,\phi _S) \equiv \frac{\mathrm d\sigma ^{e p \rightarrow e' p' \gamma }}{\mathrm dx_B\mathrm dQ^2 \mathrm d|t| \mathrm d\phi \mathrm d\phi _S} , \end{aligned}$$
(4)

which arises from the coherent sum of the DVCS and BH amplitudes,

$$\begin{aligned} \mathrm d\sigma = \mathrm d\sigma _{\text {UU}}^{\text {BH}} + e_l \mathrm d\sigma _{\text {UU}}^{\text {I}} + \mathrm d\sigma _{\text {UU}}^{\text {DVCS}} + S_\text {T} ( e_l \mathrm d\sigma _{\text {UT}}^{\text {I}} + \mathrm d\sigma _{\text {UT}}^{\text {DVCS}} ) , \end{aligned}$$
(5)

with \(e_l\) being the electron beam charge in units of the elementary charge. The angle \(\phi _S\) is measured between the lepton scattering plane and \(S_\text {T}\), the transverse component of the incoming proton spin polarization vector, which is orthogonal to the photon direction. The transversely polarized proton beam-spin asymmetry \(\text {A}_{\text {UT}}\) is selected as a trial observable, defined in terms of the charge-normalized cross sections for opposite orientations of the transverse spin of the nucleon,

$$\begin{aligned} \text {A}_{\text {UT}} (x,Q^2) = \frac{\mathrm d\sigma (\phi ,\phi _S)-\mathrm d\sigma (\phi ,\phi _S+\pi )}{\mathrm d\sigma (\phi ,\phi _S)+\mathrm d\sigma (\phi ,\phi _S+\pi )} , \end{aligned}$$
(6)

which is approximately given by a \(\sin {(\phi -\phi _s)}\cos {\phi }\) dependence plus a \(\cos {(\phi -\phi _s)}\sin {\phi }\) modulation [77]. Assuming that the BH term dominates the denominator above, \(\text {A}_{\text {UT}}\) retains a more or less direct linear dependence on the CFFs. For instance, the proton CFF \(\mathcal {E}\) becomes manifest in the \(\sin {(\phi -\phi _s)}\cos {\phi }\) modulation, whose interference part is given by

$$\begin{aligned} \text {A}_{\text {UT,I}}^{\sin {(\phi -\phi _s)}\cos {\phi }} \propto \text {Im}\, \Big [-\frac{t}{4M^2}\big (F_2\mathcal {H}-F_1\mathcal {E}\big ) +\xi ^2 \big (F_1+\frac{t}{4M^2}F_2\big )\big (\mathcal {H}+\mathcal {E}\big ) -\xi ^2\big (F_1+F_2\big )\big (\widetilde{\mathcal {H}} + \frac{t}{4M^2}\widetilde{\mathcal {E}}\big )\Big ] , \end{aligned}$$
(7)

where \(\text {Im}\, {\mathcal E}\) is the imaginary part of the CFF \({\mathcal E}\), and \(F_{1,2}\) are the nucleon Dirac and Pauli form factors. The \(\cos {(\phi -\phi _s)}\sin {\phi }\) modulation is complicated by contributions from both CFFs \(\widetilde{\mathcal {H}}\) and \(\widetilde{\mathcal {E}}\),

$$\begin{aligned} \text {A}_{\text {UT,I}}^{\cos {(\phi -\phi _s)}\sin {\phi }} \propto \text {Im}\, \big (F_2\widetilde{\mathcal {H}}-F_1 \xi \widetilde{\mathcal {E}}\big ) . \end{aligned}$$
(8)

Thus the \(\sin {(\phi -\phi _s)}\cos {\phi }\) modulation, which provides rare access to \(\text {Im}\, {\mathcal E}\) with no kinematic suppression of its contribution relative to those of the other CFFs, is the focus of the present study. The full analytic formulas, which relate CFFs to observables at the twist-two level and include the power-suppressed contributions, are explicitly listed in Eqs. (71, 75) of Ref. [77].

The uncorrelated statistical uncertainties in each bin are calculated with the help of the likelihood method,

$$\begin{aligned} \delta \text {A}_{\text {UT}}^{\sin {(\phi -\phi _s)}\cos {\phi }} = \frac{\sqrt{2}}{f P_\text {T}} \sqrt{ \frac{1 - \langle \text {A}_{\text {UT}}\rangle }{N_{events}} } , \end{aligned}$$
(9)

where \(N_{events}\) is the total number of BH/DVCS events obtained after scaling the generated cross sections to the integrated luminosity of EicC, and \(P_T = 70\)% is the transverse polarization of the nucleon beam. As displayed in Fig. 4, the EicC measurements of \(\text {A}_{\text {UT}}\) with the single angular modulation \({\sin (\phi -\phi _s)\cos \phi }\) have rather small statistical uncertainties over a wide kinematic region, as low as a few percent for \(|t|>\) 0.01 GeV\(^2\) as judged by the GK model. This implies that the measurement is actually limited by systematic uncertainties of a few percent, depending on the facility design, which can easily be added in quadrature. The events accumulate mostly in the 0.002 \(< |t|<\) 0.01 GeV\(^2\) region, giving rise to absolute uncertainties well below 0.01. The relative uncertainties in this region are as high as tens of percent, driven solely by the tiny magnitudes of the asymmetries.
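As a rough illustration of Eq. (9), the sketch below evaluates the statistical uncertainty for a given event count and combines it with a systematic uncertainty in quadrature. The factor f is not defined in the text; it is treated here as a dilution-like factor and set to 1 purely as an assumption. The event count and the systematic value in the example are hypothetical and do not come from the simulation.

```python
import math


def delta_a_ut(n_events: float, mean_asym: float,
               pol_t: float = 0.70, f_factor: float = 1.0) -> float:
    """Statistical uncertainty of Eq. (9), implemented as written in the text.

    pol_t is the transverse beam polarization (70% in the text);
    f_factor stands for the undefined factor f (assumed to be 1 here).
    """
    if n_events <= 0:
        raise ValueError("n_events must be positive")
    return math.sqrt(2.0) / (f_factor * pol_t) * math.sqrt((1.0 - mean_asym) / n_events)


def add_in_quadrature(stat: float, syst: float) -> float:
    """Combine uncorrelated statistical and systematic uncertainties."""
    return math.hypot(stat, syst)


if __name__ == "__main__":
    stat = delta_a_ut(n_events=1.0e6, mean_asym=0.0)   # hypothetical bin content
    total = add_in_quadrature(stat, syst=0.01)          # hypothetical systematic
    print(f"stat = {stat:.5f}, total = {total:.5f}")
```

The \(1/\sqrt{N_{events}}\) scaling makes explicit why bins with many events reach small absolute uncertainties while their relative uncertainties can still be large when the asymmetry itself is tiny.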

3 Description of the impact study

Fig. 5

The pseudo-data of the \(\text {A}_{\text {UT}}^{\sin {(\phi -\phi _s)}\cos {\phi }}\) asymmetry generated within the kinematic coverage of EicC. The error bars of the points correspond to an integrated luminosity of 50.0 fb\(^{-1}\). The blue bands are the current constraint in the sea-quark region evaluated with the PARTONS neural network, and the red bands are those after reweighting with pseudo-data of 0.12 fb\(^{-1}\). All central values are taken from the GK model for demonstration purposes only

Fig. 6

\(\xi \,\text {Im}\, {\mathcal E}\) versus skewness \(\xi \) at \(Q^2 = \) 4.0 GeV\(^2\) and \(-t =\) 0.10, 0.20, 0.40, and 0.60 GeV\(^2\). The blue and red error bands are the uncertainties within the PARTONS ANN framework before and after reweighting with the pseudo-data of the \(\text {A}_{\text {UT}}^{\sin {(\phi -\phi _s)}\cos {\phi }}\) modulation at EicC for an integrated luminosity of 0.12 fb\(^{-1}\). The systematic uncertainty is not yet considered

For about ten years, ANNs have been used to extract the CFFs globally under the assumption of vanishing real parts of \(\widetilde{\mathcal E}\) and \(\widetilde{\mathcal H}\), which are poorly constrained by the available data [10, 27]. This unbiased method has been shown to reduce model dependence and to propagate uncertainties properly, in the sense that the obtained CFFs are hardly constrained in kinematic regions where data are scarce or lacking. The open-source PARTONS framework was subsequently developed and made publicly available [12, 13]; it abandons the aforementioned assumption and substantially extends the set of data used in phenomenological studies. It is an ideal starting point for the subsequent Bayesian reweighting technique, which follows the rules of statistical inference, as demonstrated by NNPDF [78, 79]. To enable this study, we use the ensemble of \(N_{\text {rep}} =\) 101 CFF replicas generated through importance sampling by PARTONS with the DVCS data collected in the past two decades as input. The uncertainties of each CFF are independently described by these initial replicas as a function of (\(\xi \), \(Q^2\), \(-t\)). After locally removing outlier replicas by iterating the 3\(\sigma \) rule [80, 81] and averaging over this distilled ensemble, the mean values of \(\text {A}_{\text {UT}}^{\sin {(\phi -\phi _s)}\cos {\phi }}\) and the predicted size of the CFF in the sea-quark region are nearly zero with large standard deviations, reflecting the weak constraining power of the data in this region.
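The iterative 3\(\sigma \) outlier removal mentioned above can be sketched as follows for the replica values at a single kinematic point. This is only a minimal illustration under assumptions: the text does not specify the exact quantity being clipped or the stopping criterion, so the function below simply repeats the cut until no further replica is rejected, and the input values are invented.

```python
import statistics


def sigma_clip(values, n_sigma=3.0, max_iter=50):
    """Iteratively drop values farther than n_sigma standard deviations from the mean."""
    kept = list(values)
    for _ in range(max_iter):
        if len(kept) < 3:
            break
        mean = statistics.fmean(kept)
        std = statistics.pstdev(kept)
        if std == 0.0:
            break
        survivors = [v for v in kept if abs(v - mean) <= n_sigma * std]
        if len(survivors) == len(kept):
            break  # converged: nothing rejected in this pass
        kept = survivors
    return kept


if __name__ == "__main__":
    # Hypothetical replica values at one (xi, Q^2, -t) point with one obvious outlier.
    replicas = [1.0, -1.0] * 10 + [100.0]
    cleaned = sigma_clip(replicas)
    print(len(replicas), "->", len(cleaned), "replicas; mean =", statistics.fmean(cleaned))
```

Because the cut is applied locally, the number of surviving replicas \(N'_{\text {rep}}\) can differ from one kinematic point to another.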

The superscript “SC”, used in the caption of Fig. 4, stands for \(\sin {(\phi -\phi _s)}\cos {\phi }\); the full expression is written out in the main text. For demonstration purposes, instead of relying on the central values estimated from the ANN replicas, we use those of \(\text {A}_{\text {UT}}^{\sin {(\phi -\phi _s)}\cos {\phi }}\) given by the GK model, see Fig. 5. The equivalent integrated luminosity of the current constraint on \(\text {A}_{\text {UT}}^{\sin {(\phi -\phi _s)}\cos {\phi }}\) in the sea-quark region is estimated with PARTONS to be only around 0.01 fb\(^{-1}\); for comparison see the blue bands in Fig. 5. This figure also reflects the uncertainties of the cross sections and of the extracted slope parameter.

The mean values of the asymmetry evaluated from the PARTONS replicas are then randomly smeared according to the statistical uncertainties at EicC. The pseudo-data generated in this way, collected in a row vector \({\textbf {y}}\) of length \(N_{\text {data}}\), are used to update the ensemble of replicas by calculating the weight of each replica [78, 79],

$$\begin{aligned} \omega _k = \frac{1}{Z} (\chi _k^2)^{\frac{N_{\text {data}}-1}{2}} e^{-\frac{\chi _k^2}{2}} \end{aligned}$$
(10)

with the normalization factor Z fixed by \(\sum _{k=1}^{N'_{\text {rep}}}\omega _k = N'_{\text {rep}}\). Note that \(N'_{\text {rep}} < N_{\text {rep}}\) because of the removal of outliers. Here \(\chi _k^2\) is the goodness-of-fit indicator between the k-th replica and the pseudo-data [82],

$$\begin{aligned} \chi _k^2 = ({\textbf {y}} - {\textbf {y}}_k) \sigma ^{-1} ({\textbf {y}} - {\textbf {y}}_k)^T \end{aligned}$$
(11)

where \({\textbf {y}}_k\) is the row vector of the k-th replica generated by the ANN at the kinematic bins of the pseudo-data, \(\sigma \) is the covariance matrix of the pseudo-data \({\textbf {y}}\), and \(^T\) denotes matrix transposition. The number of effective replicas after reweighting is estimated in the spirit of the Shannon entropy:

$$\begin{aligned} N_{\text {eff}} = \exp \left[ \frac{1}{N'_{\text {rep}}} \sum _{k=1}^{N'_{\text {rep}}} \omega _k \log \frac{N'_{\text {rep}}}{\omega _k} \right] \end{aligned}$$
(12)
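The reweighting chain of Eqs. (10)-(12) can be sketched in a few lines. The sketch makes two assumptions that are not stated in the text: the covariance matrix is taken to be diagonal (uncorrelated statistical uncertainties), and the Shannon entropy is evaluated on the weights rescaled to unit sum, \(p_k = \omega _k / N'_{\text {rep}}\), so that uniform weights give \(N_{\text {eff}} = N'_{\text {rep}}\). All function names and numbers are illustrative.

```python
import math


def chi2_diag(y, y_k, sigma):
    """Eq. (11) for a diagonal covariance: sum of squared pulls."""
    return sum(((a - b) / s) ** 2 for a, b, s in zip(y, y_k, sigma))


def replica_weights(chi2_values, n_data):
    """Eq. (10), evaluated in log space and normalized so the weights sum to N'_rep."""
    log_w = [0.5 * (n_data - 1) * math.log(c) - 0.5 * c if c > 0.0 else -math.inf
             for c in chi2_values]
    shift = max(log_w)                       # guards against overflow/underflow
    raw = [math.exp(lw - shift) for lw in log_w]
    norm = len(raw) / sum(raw)
    return [norm * r for r in raw]


def effective_replicas(weights):
    """Exponential of the Shannon entropy of the unit-sum weights p_k."""
    total = sum(weights)
    entropy = -sum((w / total) * math.log(w / total) for w in weights if w > 0.0)
    return math.exp(entropy)


if __name__ == "__main__":
    y = [0.05, 0.02, -0.01]                  # hypothetical pseudo-data
    sigma = [0.02, 0.02, 0.02]               # hypothetical uncertainties
    replicas = [[0.04, 0.03, 0.00], [0.20, -0.10, 0.15], [0.06, 0.01, -0.02]]
    chi2 = [chi2_diag(y, r, sigma) for r in replicas]
    w = replica_weights(chi2, n_data=len(y))
    print("weights:", [round(x, 3) for x in w], "N_eff:", round(effective_replicas(w), 2))
```

A small \(N_{\text {eff}}\) signals that a handful of replicas dominate the reweighted ensemble, in which case the reweighted averages are no longer statistically reliable.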

The new measurements at EicC are expected to be much more precise than the current uncertainties inferred from the ANN, as illustrated in Fig. 5, resulting in a large reduction of the CFF uncertainty. This is confirmed by a trial run with the 50.0 fb\(^{-1}\) luminosity of EicC, in which few effective replicas survive the reweighting. To keep the reweighting procedure reliable, we therefore reduce the luminosity to 0.12 fb\(^{-1}\). The number of effective replicas remaining is then around 1/3 of the initial replicas, which keeps the importance sampling statistically meaningful.

In kinematic terms, the factorization in Fig. 1 is valid when the virtuality \(Q^2\) of the photon probe is large and the momentum transfer to the nucleon is small compared with this scale, \(-t \ll Q^2\). In order to match the kinematic cuts in PARTONS, the additional conditions

$$\begin{aligned} Q^2 > 1.5~\textrm{GeV}^2 , \qquad \frac{-t}{Q^2} < 0.2 \end{aligned}$$
(13)

are applied to avoid significant higher-twist corrections [36, 37, 83,84,85]. Consequently, the lowest \(Q^2 \) bin [1.0, 1.6] GeV\(^2\) at EicC is not included in the reweighting. The impact of the remaining pseudo-data on the extraction of the imaginary part of the CFF \({\mathcal E}\) is displayed in Fig. 6 for \(\langle Q^2\rangle = \) 4.0 GeV\(^2\) and several typical \(-t\) values ranging from 0.10 to 0.60 GeV\(^2\). The blue bands represent the accuracy driven by the existing data. The red bands show the accuracy after including the projected \(\text {A}_{\text {UT}}^{\sin {(\phi -\phi _s)}\cos {\phi }}\) data of EicC at an integrated luminosity of 0.12 fb\(^{-1}\). One can see that the uncertainty of the extracted \(\text {Im}\, {\mathcal E}\) is clearly reduced in the sea quark region, especially at large \(-t\), with only hours of EicC running. An impact beyond the sea quark region is also revealed. We have thus demonstrated the statistical relevance of transverse polarization asymmetry measurements at EicC for the global extraction of CFFs. Accurate knowledge of the CFF \({\mathcal E}\) would have direct consequences for a first glimpse of the quark OAM inside the proton. The large kinematic coverage will reduce the uncertainties arising in the Fourier transform to impact-parameter space, leading to a precise visualization in transverse position space and illuminating the three-dimensional nucleon structure.
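The kinematic selection of Eq. (13) amounts to a simple filter on the pseudo-data bins. The sketch below assumes each bin is represented by hypothetical central values of \(Q^2\) and \(|t|\) in GeV\(^2\); the bin list is invented for illustration.

```python
Q2_MIN_GEV2 = 1.5        # lower cut on Q^2 from Eq. (13)
MAX_T_OVER_Q2 = 0.2      # upper cut on -t / Q^2 from Eq. (13)


def passes_cuts(q2: float, abs_t: float) -> bool:
    """Return True if a bin satisfies Q^2 > 1.5 GeV^2 and -t/Q^2 < 0.2."""
    return q2 > Q2_MIN_GEV2 and abs_t / q2 < MAX_T_OVER_Q2


if __name__ == "__main__":
    # Hypothetical (Q^2, |t|) bin centres in GeV^2.
    bins = [(1.3, 0.10), (4.0, 0.10), (4.0, 0.60), (2.0, 0.60)]
    kept = [b for b in bins if passes_cuts(*b)]
    print("kept bins:", kept)
```

In this toy list the bin at \(Q^2 = 1.3\) GeV\(^2\) fails the first condition, mirroring the exclusion of the lowest \(Q^2\) bin, while the bin at \(Q^2 = 2.0\) GeV\(^2\) and \(|t| = 0.60\) GeV\(^2\) fails the second.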

Since the new dataset contains a lot of information on the CFFs, a global reweighting or a re-training against all measured and proposed experimental data is needed to extract the full impact of the machine. A re-training with the same pseudo-data as in Fig. 5 was attempted with the KM neural network [10, 11, 27] in the EicC white paper [48]. There the uncertainty bands of the CFFs \({\mathcal H}\) and \({\mathcal E}\) are smaller than those in PARTONS under the hypothesis of vanishing real parts of \(\widetilde{\mathcal E}\) and \(\widetilde{\mathcal H}\). Still, a significant reduction of the uncertainties of \(\text {Im}\, {\mathcal E}\) was observed, as selectively depicted at \(-t = 0.2\,\textrm{GeV}^2\). It also considerably improves the understanding of the real part \(\text {Re}\, {\mathcal E}\) in the range \(\xi < 0.1\) when dispersion relations are used.

4 Discussion and conclusion

The understanding of proton spin and 3D structure in the conventional quark model tends to fall short of the desired accuracy [86, 87]. Although the quark OAM is connected phenomenologically to some transverse-momentum-dependent parton distributions, e.g. the Sivers function [88,89,90,91] or pretzelosity [92, 93], its relation to the GPD E is what rests on a solid theoretical foundation. A precise extraction of the GPD E, and also H, from exclusive processes should lead to a quantification of the quark OAM inside the proton [94,95,96]. Achieving this in practice is extremely challenging because it requires measuring the GPDs for all x at fixed \(\xi \). Nevertheless, future electron-ion colliders worldwide have great potential to advance our knowledge of quark OAM and proton tomography. It is therefore crucial to obtain a wide variety of polarization choices and a wide kinematic reach with the help of high-luminosity accelerators and hermetic detectors. Although global GPD fits over the whole DVCS kinematic domain, from the glue to the valence region, remain intractable, local and global fits of CFFs are available at the amplitude level. These tools motivate feasibility studies for the determination of CFFs from versatile observables at EicC and other facilities, and provide one of the cornerstones for the construction of the machine. In return, the design of EicC offers unprecedented opportunities to test and improve these fits by feeding them with pseudo-data in the sea quark region.

The asymmetry measurement of DVCS with a transversely polarized proton beam is particularly relevant for the experimental determination of the parton OAM. A selected modulation of \(\text {A}_{\text {UT}}\) has the best sensitivity to the GPD E through the imaginary part of the CFF \({\mathcal E}\). The DVCS simulations used for our studies are based on the physical Ansatz of a constant Regge slope B, and the projected events at EicC are selected in the appropriate kinematic region, covering the domain 1.0 GeV\(^2< Q^2 <80.0\) GeV\(^2\), 0.004 \(< x_B<\) 0.3, and \(|t|>\) 0.002 GeV\(^2\). The absolute statistical uncertainty of the measured asymmetry can be as low as 0.01 for \(|t|>\) 0.01 GeV\(^2\). The relative uncertainties can be below a few percent, judged by the magnitude of the asymmetry in the GK model. The uncertainty of the extracted \(\text {Im}\, {\mathcal E}\) is unambiguously reduced around the sea region once the asymmetry pseudo-data of EicC are included in the reweighting procedure. Meanwhile, we take advantage of the high-\(Q^2\) region to disentangle the constraining power on \(\text {Im}\, {\mathcal E}\), since GPDs are also subject to QCD evolution. This serves as the best check so far of the relationship between the DVCS single-spin asymmetry and the CFFs in a non-local fashion, and reveals future experimental constraints on this observable. Measurements of the \(\text {A}_{\text {UT}}\) asymmetry must be completed to realise the full physics potential of EicC.

Finally, let us highlight some specific perspectives for future development. The local extraction of CFFs is another frequently used alternative for propagating data uncertainties to amplitudes [97, 98]. Recent progress toward a Rosenbluth extraction framework for CFFs has captured the essentials of the DVCS structure but still needs to be consolidated [99,100,101]. Moreover, more effort is required to extrapolate theoretical calculations of GPDs at zero skewness, e.g. in nonlocal chiral effective theory [102] and basis light-front quantization [103, 104], to the skewness covered by the facilities. Fortunately, lattice QCD calculations of GPDs have made rapid progress [105,106,107,108,109,110,111,112,113,114]. As these tools develop, it will become apparent that all DVCS observables must be measured at several facilities to obtain continuous kinematic coverage from large \(x_B\) down to the saturation regime, which is needed for complete 3D images of the proton in the language of GPDs. With complete knowledge of these functions it will be possible to address, for instance, the OAM and the proton spin puzzle through Ji’s sum rule. Last but not least, we note that polarized light-ion beams such as \(^3\)He with a polarization of 70% at EicC and EIC potentially allow one to separate the quark flavors through DVCS off the neutron [11, 115], together with hard exclusive meson electroproduction (DEMP) [116] and the positron beam at JLab [81].