A constrained fit of the valence transversity distributions from dihadron production

We present a constrained analysis of the valence transversity Parton Distribution Functions from dihadron production in semi-inclusive DIS. While previous extractions of the transversity distributions rely explicitly on the fulfillment of the Soffer bound, the present analysis relaxes that restriction and implements further constraints through the method of Lagrange multipliers. The results are quantitatively comparable to previous analyses in the kinematical range of the data; the qualitative impact translates into an increased flexibility of the functional form.


Introduction
Parton Distribution Functions (PDFs) describe the structure of hadrons in terms of their constituents. At leading twist, the hadronic structure is defined by the distributions corresponding to the vector, axial-vector and tensor Dirac bilinears. The first two, the unpolarized and the helicity Parton Distribution Functions, while under constant study and improvement, are relatively well determined. In the last decade, the third leading-twist PDF, the one corresponding to the tensor bilinear [1][2][3][4], has been explored in phenomenology. The transversity PDF describes the distribution of the transverse polarization of quarks as related to that of their parent hadron. Transversity is a chiral-odd object that is paired, in physical processes, with another chiral-odd object.
One such candidate object is the single-hadron Transverse Momentum Dependent Fragmentation Function (TMD FF), depending on the fraction of longitudinal momentum of the fragmenting quark carried by the outgoing hadron, as well as on the intrinsic transverse momentum of that active quark. The second candidate is the dihadron FF (DiFF), measured in a fragmentation process into a pair of hadrons whose relative momenta tag the transverse polarization of the active quark via modulations at the cross-section level. Thanks to the data on semi-inclusive pion production in electron-positron annihilation from the B factories, both types of FFs have been measured and used to assess transversity PDFs in both global and single-process fits. (e-mail: aurore@fisica.unam.mx, corresponding author)
Dihadron Fragmentation Functions were proposed by Jaffe et al. [5] to access the transversity PDF and successfully implemented in phenomenological studies [6], as demonstrated by the recent determinations of the transversity PDF [7][8][9] -that is, the collinear PDF discussed in this article, as opposed to the Transverse Momentum Dependent (TMD) transversity distributions that have been recently studied in Refs. [10][11][12].
Yet another possible way to access the transverse polarization of quarks is found through Generalized Parton Distributions relevant for Deeply Virtual Meson Production [13,14].
Analyses of the transversity are based on far smaller experimental inputs than the data sets used for the other two leading-twist PDFs. All three are subject to analogous first principles, e.g., positivity constraints and the support requirement. In this paper, we propose to broaden the methodology by incorporating the assessment of first principles directly in the fitting procedure. This methodology mitigates the low experimental accuracy of processes involving transversity PDFs.
At leading order, the unpolarized, $f_1(x)$, and longitudinally polarized, $g_1(x)$, distributions can be interpreted as probability densities in terms of the helicity representation for Dirac particles, the basis of the positivity bound $|g_1^q(x)| \le f_1^q(x)$. The interpretation of the transversity distribution, $h_1(x)$, must be done in the transversity basis, leading to the bound $|h_1^q(x)| \le f_1^q(x)$. Another positivity constraint for $h_1(x)$ has been stated by Soffer [15],

$|h_1^q(x)| \le \tfrac{1}{2}\left[ f_1^q(x) + g_1^q(x) \right]$,   (1)

for each flavor of quark and anti-quark separately. If the Soffer bound is realized at a low factorization scale Q, it is preserved under QCD evolution towards higher scales [16,17]. The bound in Eq. (1) has been treated as a first-principle constraint in all extractions of transversity until now. Together with the support x ∈ [0, 1] and the corresponding end-point behavior, these constitute the basic properties that any parametrization of the transversity PDF must satisfy. The implementation of positivity constraints generates a hierarchical dependence: the helicity PDF depends on the unpolarized PDF, and the transversity on both the unpolarized and helicity PDFs.
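As a numerical illustration, positivity constraints of this kind can be checked pointwise on a grid in x. The sketch below uses toy functional shapes of our own choosing (illustrative only, not the fitted PDFs of this analysis) for $f_1$, $g_1$ and a candidate $h_1$ of one flavor:

```python
import numpy as np

# Toy shapes for one quark flavor (illustrative placeholders, NOT fitted PDFs):
def f1(x):
    return 2.0 * x**-0.5 * (1 - x)**3       # unpolarized

def g1(x):
    return 0.8 * x**-0.3 * (1 - x)**4       # helicity

def h1(x):
    return 0.6 * x**-0.2 * (1 - x)**4       # transversity candidate

x = np.linspace(1e-3, 0.999, 2000)
soffer = 0.5 * (f1(x) + g1(x))              # r.h.s. of the Soffer bound, Eq. (1)

# Check both positivity bounds on the grid:
assert np.all(np.abs(g1(x)) <= f1(x))       # |g1^q| <= f1^q
assert np.all(np.abs(h1(x)) <= soffer)      # Soffer bound, Eq. (1)
print("toy h1 respects the Soffer bound on the grid")
```

Such a grid check is only as strong as the grid is fine; in the analysis itself the bound is evaluated bin by bin at the data kinematics.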
Because PDFs describe the internal structure of hadrons in a given spin configuration, their first Mellin moments embody the fundamental charges when originating from conserved currents. The first moment of the transversity distribution is known as the tensor charge, in analogy to the vector and axial charges.
The tensor charge provides interesting information for processes involving hadrons that are also relevant for searches of New Physics [18][19][20][21].
On the one hand, recent analyses of the transversity highlight a tension between the data and the expression of the Soffer bound in some kinematical regions [7,9,11]. On the other hand, the tensor charges for the valence and isovector flavor combinations determined through phenomenology and on the lattice [22][23][24][25] differ, with the largest discrepancy coming from the u-valence contribution. In light of the phenomenological conflicts with the Soffer bound and the latest lattice QCD determinations of the tensor charge, we analyze the probability of the Soffer bound given the semi-inclusive DIS data. The derived significance of the bound is then reflected in the objective function. The Soffer bound enters only in that step, which frees the choice of parametrization in the minimization procedure. The parametrization is chosen, first, to fulfill the support in Bjorken x and, second, to optimize an assessment of the uncertainty that reflects as objectively as possible the errors coming from both the experimental data and the theory, i.e., here, the implementation of the positivity bounds. The methodology includes a further iteration of the minimization that imposes inequality constraints, through the method of Lagrange multipliers, to ensure a smooth behavior for x → 1. We carry out the new analysis based on the point-by-point extraction for proton and deuteron SIDIS data of Ref. [8]; there have been no recent improvements on the theory side or new determinations of DiFFs that would justify a complete reanalysis of the two-hadron SIDIS asymmetry. This paper is organized as follows. Our methodology is described in Sect. 2. In Sect. 3 we discuss our results, with particular emphasis on the tensor charge. Conclusions and possible extensions of this work are drawn in Sect. 4. The state-of-the-art formalism for dihadron production in semi-inclusive DIS is reviewed in Appendix A.

Methodology
The determination of the valence transversities proceeds in three main steps. We work within the hypothesis that first-principle constraints can be implemented through the fitting procedure so as to optimize the collective information of the N_d = 22 data points and the first principles. We first assess the probability that the Soffer bound is valid using a Bayesian approach. This first step allows a reweighting of each data point, taking the Soffer bound into account implicitly. Then the fulfillment of other theoretically justified behaviors is achieved through a constrained fit using the method of Lagrange multipliers. Finally, the constrained fit is repeated N times using the bootstrap technique as an integral part of the error treatment [8].

Probability of the Soffer bound
The practical implementation of the Soffer bound relies on the other two leading-twist PDFs, more precisely on the fits of the unpolarized and helicity PDFs. A PDF parametrization should come with an uncertainty that combines the statistical uncertainty obtained in the fit with an error coming from the various choices of physical parameters and hypotheses, e.g., the value of $\alpha_s(M_Z^2)$ or the order in perturbation theory. In the standard parameterizations of $h_1(x)$, the validity of the bound is ensured by construction at the level of the functional form at $Q_0$ and is preserved under QCD evolution [17]. Hence, a theoretical error can analogously be attributed to the Soffer bound (SB) in the first place, introducing extra flexibility in the parametrization.
On the other hand, the confrontation of the SB with the combinations of transversities extracted from proton and deuteron targets, Eqs. (A.3)-(A.4), also depends on the approximations used in the extraction of the DiFFs. The aforementioned theoretical error should ideally comprise both sources of uncertainty. Thus, we would like to estimate the goodness of the Soffer bound, expressed in terms of PDF parameterizations, given the transversity combinations extracted from HERMES [26] and COMPASS [27,28] data, and implicitly include the bound as a source of uncertainty in the fitting procedure.
For that purpose we evaluate the probability of the transversity PDF lying outside the SB. We first map each experimental point, which has a Gaussian distribution, onto an in-out case. Suppose that we knew the limits of the bound exactly, i.e., that we could identify unequivocally whether the data lie inside or outside the bound: $\theta_j$ represents the probability that the true value $A_j$ of the transversity combinations of Eqs. (A.3)-(A.4), evaluated for $j = 1, \ldots, N_d$, lies outside the allowed region. That region is the area delimited by the inequality of Eq. (1), evaluated in each corresponding kinematical bin using MSTW08LO at LO [29] for $f_1(x_j, Q_j^2)$ and JAM15 [30], at NLO, for the helicity distribution. We define a hyperparameter t corresponding to the probability that a data value lies outside the bound, namely a prior distribution for $\theta_j$ given t. The true values of the transversity combinations can only be inferred through the actual data, such that the probability of $A_j$ given the data evaluates the distance of the data point j to the bound, considering both the uncertainty on the data and on the bound. The distributions are assumed to be Gaussians centered on the experimental value, for the data, or as explained after Eq. (2), for the bound, with the corresponding standard deviation. We assign an error from going from NLO to LO for the helicity PDF based on the NLO-LO difference for the unpolarized PDFs. The final desired probability, F(t), is expressed using Bayes' theorem in Eq. (5), where the prior for t is chosen to be flat and the remaining probabilities are defined above. Evaluating F(t), we find a distribution with a mode at t = 0 and a central value $\bar t$ for the 68% confidence interval, as illustrated in Fig. 1. How should we interpret the result shown in Fig. 1? It is unlikely that the bound is incorrect with a probability higher than t = 0.1, as most of the contribution to the integral of F(t) comes from the range 0 < t < 0.1.
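The construction of F(t) can be sketched numerically. The snippet below uses toy bins (not the actual HERMES/COMPASS points) and a simplified one-sided Gaussian estimate of the per-point probability of lying outside the bound; the flat prior on t follows the text:

```python
import numpy as np
from scipy.stats import norm

# Toy inputs per bin j (NOT the real data): asymmetry combination A_j with
# error sA_j, and bound value B_j with theoretical error sB_j.
A  = np.array([0.05, 0.12, 0.20, 0.35])
sA = np.array([0.04, 0.05, 0.06, 0.10])
B  = np.array([0.25, 0.25, 0.24, 0.30])
sB = np.array([0.03, 0.03, 0.03, 0.04])

# One-sided Gaussian probability that the true value exceeds the bound,
# combining data and bound uncertainties in quadrature (a simplification):
p_out = norm.sf(B, loc=np.abs(A), scale=np.sqrt(sA**2 + sB**2))

# Posterior for t with a flat prior, marginalizing the in/out state per bin:
# F(t) ~ prod_j [ t * p_out_j + (1 - t) * (1 - p_out_j) ]
t = np.linspace(0.0, 1.0, 1001)
F = np.prod(t[:, None] * p_out + (1 - t[:, None]) * (1 - p_out), axis=1)
F /= F.sum() * (t[1] - t[0])                 # normalize to unit area

lo, hi = np.searchsorted(np.cumsum(F) / F.sum(), [0.16, 0.84])
print(f"68% interval for t: [{t[lo]:.2f}, {t[hi]:.2f}]")
```

The per-bin likelihood here is a stand-in for the full convolution over the data and bound Gaussians described in the text; only the marginalization over t with a flat prior is meant to be faithful.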
Still, there exists a window in which the agreement between the implemented SB and the transversity combinations is poorer. Since the SB is introduced here as a first principle, we translate this result into a relaxation of the expression of the bound through a Bayesian reweighting, bin by bin, in the objective function, i.e., here, the χ². In Fig. 2, the statistical weight is represented for each bin in the bottom plots. It can be appreciated that the proton combination is almost unaffected by this procedure, while the lowest and highest x-bins of the deuteron combination happen to lie statistically inside that window of poor agreement, as expected from previous analyses [7,8].

Fig. 1 Distribution of the hyperparameter t as given by Eq. (5). The vertical blue line corresponds to the central value $\bar t$, the blue shaded area to the 68% CL evaluated from the upper/lower bound of the integral of F(t). The orange shaded area corresponds to the 95% CL. The y-axis has arbitrary units.
In the bootstrap technique, the minimization of the objective function is performed N = 200 times. As explained in, e.g., Ref. [8], N replicas of the extracted combinations are generated within the corresponding extended Gaussian error bars. The objective function can be written, for each replica r, as in Eq. (9), where $\{p_I\}$ is the set of free parameters to be determined for each replica r. In the next section, the function $h_1^{p/D,\mathrm{theo}}$ will be described extensively.
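The bootstrap loop can be sketched as follows; the data values, per-bin weights, and the two-parameter stand-in for $h_1^{theo}$ are all hypothetical placeholders:

```python
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(0)

# Toy extracted combinations (NOT the 22 real points): centers, errors, and
# per-bin Bayesian weights w_j obtained in the previous step.
x_bins = np.array([0.05, 0.10, 0.18, 0.30])
y      = np.array([0.06, 0.10, 0.16, 0.22])
sigma  = np.array([0.03, 0.03, 0.04, 0.06])
w      = np.array([1.0, 1.0, 0.9, 0.7])

def h1_theo(x, p):
    """Hypothetical two-parameter stand-in for the fitted functional form."""
    return p[0] * x**0.6 * (1 - x)**p[1]

def fit_replica():
    y_r = rng.normal(y, sigma)                 # Gaussian replica of the data
    # chi^2_r = sum_j w_j (y_r,j - h1_theo(x_j))^2 / sigma_j^2
    resid = lambda p: np.sqrt(w) * (y_r - h1_theo(x_bins, p)) / sigma
    return least_squares(resid, x0=[0.5, 1.0]).x

params = np.array([fit_replica() for _ in range(200)])  # N = 200 replicas
print("mean parameters over replicas:", params.mean(axis=0).round(2))
```

The spread of `params` over replicas then propagates the (weighted) data uncertainty into the PDF envelopes, in the spirit of Ref. [8].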
The same exercise has been repeated with CT09MC1 [31] in both Eqs. (A.3)-(A.4) and the expression of the SB. The lightest weights for the deuteron combination change slightly; the repercussions will be discussed in Sect. 3. The overall conclusions of this section are not affected by the choice of the LO unpolarized PDF.

Parameterization and constraints
Now that the main first-principle constraint has been included through a statistical reweighting, which is effectively implemented at the stage of bootstrapping the data, the parameterization can take a more adaptable form: the data alone lead the determination of the free parameters of the chosen form. An unbiased parametrization is particularly welcome in kinematical ranges with little to no data or, a fortiori, in kinematical ranges that exhibit apparent conflict with the positivity bounds. The latter are best accommodated by a flexible form that can realistically be confronted with future data. We wish to impose the required support through the functional form. This is indeed possible for the up distribution. However, the behavior of $x\,h_1^{d_v}$ is affected by larger errors, as it is dominated by the deuteron data, and a judicious choice of parametrization is not sufficient to ensure the expected behavior towards the end point x → 1. A smooth fall-off in x is expected from the underlying QCD evolution occurring at higher x values; see, e.g., [32]. This observation directly relates to the power-law fall-off ruling the two other leading-twist PDFs. We tame undesired behaviors in the large-x region for the down distribution through a constrained fit using the method of Lagrange multipliers; see, e.g., [33]. The minimization itself contains two steps. First, the objective function is minimized via a non-linear least squares, determining the set of best-fit parameters, $\{p_I\}$.
The definition of the $\chi^2(\{p_I\})$, Eq. (9), requires a functional form for $h_1^{p/D,\mathrm{theo}}(x)$. The first consideration in determining the parameterization is to guarantee integrability and support at x = 0; the small-x exponent has been chosen based on previous outputs and the expected small-x behavior [34], to be slightly modified by the polynomial in x. The latter, $P_n^q(g(x))$, of order n, can be as flexible as the data allow. We choose to express $P_n$ in terms of Bernstein polynomials, as has been done in Ref. [35]. They are defined as

$B_{n,k}(g) = \binom{n}{k}\, g^k (1-g)^{n-k}$,   (10)

with $\binom{n}{k}$ the binomial coefficients. These polynomials have the advantage of selecting particular regions in g(x) and as such can be employed to emphasize particularly relevant kinematical regions. That is, for semi-inclusive DIS, the low-x region becomes relevant with respect to the valence and large-x region. A rescaling of the variable helps the parameterization adjust to the data; we choose $g(x) = x^{0.3}$. A desirable feature of the polynomials in Eq. (10) is a statistically representative error outside the data range, yet in agreement with the first-principle constraints at hand. In order to span the Bjorken-x range as significantly as possible, we use four different degrees for the polynomial $P_n$, thus using four different functional forms distinguishable by their order n, and optimize the number of Bernstein polynomials by trial and error. [Table: Bernstein indices k retained for each form, e.g. $p^u_3$: (1, 12, 20), $p^u_4$: (4, 16, 24); $p^d_1$: (1, 3, 4, 5), $p^d_2$: (4, 6, 11), $p^d_3$: (3, 5, 12, 20), $p^d_4$: (2, 8, 16, 24).] In Fig. 3, we show the polynomials that we have adopted, respectively, for the up and the down parameterization. The functional form then combines the small-x factor with the selected Bernstein polynomials; each of the four versions will be evaluated N/4 times.
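Eq. (10) and the rescaled argument $g(x) = x^{0.3}$ can be sketched as follows; the retained indices and coefficients below are illustrative, not the fitted ones:

```python
import numpy as np
from math import comb

def bernstein(n, k, g):
    """B_{n,k}(g) = C(n,k) g^k (1-g)^(n-k), as in Eq. (10)."""
    return comb(n, k) * g**k * (1 - g)**(n - k)

def P_n(x, coeffs, n, ks):
    """Polynomial built from a few selected Bernstein terms of order n.
    Coefficients and indices here are illustrative placeholders."""
    g = x**0.3                                  # rescaling g(x) = x^0.3
    return sum(c * bernstein(n, k, g) for c, k in zip(coeffs, ks))

# B_{n,k} peaks at g = k/n, so each retained index emphasizes one region in x:
x = np.linspace(0.005, 0.995, 200)
vals = P_n(x, coeffs=[0.4, -0.2, 0.3], n=20, ks=[1, 12, 20])

# Partition of unity: the full set of B_{n,k} sums to 1 at any g in [0, 1].
g = 0.37
assert abs(sum(bernstein(20, k, g) for k in range(21)) - 1) < 1e-12
```

The localization of each $B_{n,k}$ around $g = k/n$ is what makes the basis convenient for emphasizing the kinematical regions covered by the data.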
In the following step, we clarify our choice of a limited coverage of the functional form for values of x ≳ 0.6 for the up and x ≳ 0.5 for the down. As mentioned above, we need to ensure a smooth fall-off of the transversity in the limit x → 1 that, for the d contribution, cannot be achieved exclusively through the choice of functional form. In most cases, a second step is required to constrain the functional form within an allowed region. When the objective function is subject to m constraints of the form $C_l(\{p\}) = 0$ with $l = 1, \ldots, m$, the latter are imposed through the Lagrangian, whose stationary point is found by minimizing with respect to both the parameters $\{p\}$ and the Lagrange multipliers $\{\lambda\}$.
We define a new objective function that depends on the new set of best-fit parameters, $\{p_{II}\}$, which consists of the parameters $p^{q,II}_{i,k}$ and replaces the set obtained through the first minimization, $\{p_I\}$. We guide the large-x behavior of the down parameterization only, using $N_c = 4$ constraints. In previous (unpolarized and longitudinally polarized) PDF determinations, the method of Lagrange multipliers was popularized for error estimation [36]. In the present approach, this method is used to impose limits on the fit parameters. This last step completes a methodology that focuses on the adaptability of the parametrization to constraints, as opposed to a parametrization that is constrained a priori.
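A minimal sketch of such a constrained minimization, with a toy two-parameter χ² and a single equality constraint standing in for the $N_c = 4$ conditions (SLSQP solves the stationarity conditions of the corresponding Lagrangian internally):

```python
import numpy as np
from scipy.optimize import minimize

# Toy objective standing in for chi^2({p}); the real one is Eq. (9).
def chi2(p):
    return (p[0] - 1.0)**2 + 4.0 * (p[1] - 2.0)**2

# Hypothetical single equality constraint C(p) = p0 + p1 - 2.5 = 0.
constraints = [{"type": "eq", "fun": lambda p: p[0] + p[1] - 2.5}]

# The stationary point of L(p, lambda) = chi2(p) + lambda * C(p) gives
# p = (0.6, 1.9) analytically; SLSQP recovers it numerically:
res = minimize(chi2, x0=[0.0, 0.0], method="SLSQP", constraints=constraints)
print(res.x.round(3))                       # ~ [0.6, 1.9]
```

In the analysis itself the constraints act on the large-x behavior of the down distribution rather than on a linear combination of parameters; the mechanics of the constrained solve are the same.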

Transversity PDF
Both steps of the minimization procedure are carried out in Python, and convergence is checked for both the main χ² minimization and the constrained fit. In Fig. 4, we show the envelopes of the obtained valence transversities at 68% CL, respectively, for the up and the down distributions. Following the color code introduced in Fig. 3, it can be observed that the higher-order polynomials, n = 30, 40 in yellow and green, allow a larger error bar at smaller values of x, while the lower-order polynomials, n = 10, 20 in red and purple, are more tightly confined. The total transversity $h_1(x, 5\,\mathrm{GeV}^2)$ is built using the four versions of the functional form together; it is shown at 68% and 95% CL in Fig. 5. For consistency, we also show the Soffer bound at NLO using the helicity PDF of JAM15 [30] and the unpolarized PDF of MSTW08 at NLO [29]. Were central values to be defined for the obtained transversities, they would be enclosed inside the Soffer bound. In that sense, our result is similar to the first Hessian approach of Ref. [7].
The small-x behavior predicted in Ref. [34] is fulfilled by the up transversity PDF up to, at least, x = 0.1; no clear conclusion can be drawn for the down above x ∼ 0.03.
In Fig. 6, the 68% and 95% CL proton and deuteron combinations are depicted and compared to the point-by-point extraction of Eqs. (A.3)-(A.4). The final χ²/d.o.f., shown in Fig. 7, is evaluated against those extracted data.

Fig. 6 Combinations of transversities for the proton (upper) and the deuteron (lower) compared to the global 68% CL in blue (full blue contour) and 95% CL in orange (dashed contour) for the present fit, together with the extracted data [8]. The dark blue squares come from HERMES data while the dark orange dots are extracted from COMPASS data.
The analysis has been carried out using the MSTW08 LO parametrization. We have repeated the fitting procedure using the CT09MC1 set for the unpolarized PDF [31]; it is a LO set fitted on real data and NLO pseudo-data with $\alpha_s$ at 1 loop. There is no qualitative difference in the final transversity PDF. We notice a slightly smaller average value for the isovector tensor charge, a quantity defined below.
Compared to previous extractions, our result for the down transversity presents a smaller error band in the data region, as our functional form was built for that purpose. This is achieved at the expense of a slightly wider band for the up distribution in that region. The error band increases at small and large x. In this respect, our down distribution differs from that of Refs. [8,9] and from the less flexible versions of the parameterization of Ref. [7], all of them dihadron-based extractions. The same comment is in order for comparisons with single-hadron semi-inclusive DIS [10,11]. The combination $h_1^{u_v} - h_1^{d_v}$ of the two available collinear extractions is compared in Fig. 8. The small-x behavior differs, and the error band associated with the present analysis is wider in the valence region.
We also compare, in Fig. 8, our results to the lattice QCD evaluation of the isovector transversity PDF [38,39]. There is an acceptable agreement in the region 0.1 < x < 0.3. The lower-x region of our result is more structured, as dictated by the data. Both lattice evaluations exceed the phenomenological determinations at large x, a region in which the parameterizations of the fits are strongly bound by the positivity limits.
As explained in Appendix A, our analysis of the transversity PDF through Eqs. (A.3)-(A.4) has been carried out at an average value of Q² = 5 GeV². In that regard, we have further performed our analysis considering QCD evolution only for the Fragmentation Functions. We find a 15% increase in the χ² when fixing the scale of the unpolarized PDFs in the asymmetry to 5 GeV². The comparison with the isovector transversity from lattice QCD is not substantially improved.
Finally, we notice that the PDF+Lattice result for the isovector combination [12] is systematically higher than the present results over the whole support in x, except for x → 0.

Tensor charge
We next evaluate the first Mellin moment of the transversity PDF to get the tensor charge,

$\delta q(Q^2) = \int_0^1 dx\, h_1^{q_v}(x, Q^2)$.

The distributions of the resulting values correspond best to skewed Gaussian distributions, and the obtained values are quoted at 1σ. The corresponding 2σ uncertainties are, for the up, (−0.42, 0.31) and, for the down, (−0.57, 0.87). In particular, we are interested in the isovector combination, $g_T(Q^2) = \delta u(Q^2) - \delta d(Q^2)$, as it is of particular interest for, e.g., beta-decay observables [18,19]. We show the corresponding stacked histogram in Fig. 9, together with the Gaussian distribution shown on the r.h.s. of that figure. The trend observed above leads to discernible consequences here: the more flexible the parameterization, the wider the distribution of the tensor charge. Our result is compatible with lattice QCD evaluations of the isovector tensor charge [25]. It is at the same time wide enough to encompass all of the other phenomenological determinations. As for the separate up and down contributions, e.g., the ETM Collaboration reports δu = 0.716(28) and δd = −0.210(11). The tensor charge for the up quark differs from our result, as it does from most phenomenologically determined tensor charges. The relaxation of the Soffer bound in the range of the data did not ease that discrepancy, as also discussed in Ref. [40].
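Numerically, the tensor charge and its replica distribution can be sketched as below; the replica shapes and spreads are toy stand-ins for the fitted ensemble, not the actual results:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(1e-4, 1.0, 5000)

def first_moment(f, x):
    """First Mellin moment by trapezoidal integration over [0, 1]."""
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(x)))

# Toy replica ensemble (illustrative shapes, NOT the fitted transversities):
a = rng.normal(1.2, 0.3, size=200)          # up normalizations per replica
b = rng.normal(0.6, 0.4, size=200)          # down normalizations per replica
delta_u = np.array([first_moment(ai * x**0.6 * (1 - x)**2, x) for ai in a])
delta_d = np.array([first_moment(-bi * x**0.6 * (1 - x)**2, x) for bi in b])

gT = delta_u - delta_d                      # isovector combination delta_u - delta_d
lo, med, hi = np.percentile(gT, [16, 50, 84])
print(f"toy g_T = {med:.2f} (+{hi - med:.2f} / -{med - lo:.2f})")
```

Quoting asymmetric percentiles of the replica distribution, rather than a symmetric standard deviation, matches the skewed distributions observed for the fitted charges.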
In regions where scarce or no data are available, the choice of functional form generates a crucial uncertainty. The small-x region, where neither strong expressions of first-principle bounds nor data are available, matters in particular. The addition of proton-proton collision data [41], used in the first global fit of the transversity [9], unfortunately does not increase the coverage at lower x values, though it certainly does in Q². In that sense, data from an Electron Ion Collider would be extremely helpful, be it through a qualitative improvement or a quantitative reduction of the uncertainty. Should the tensor charge become relevant in searches for New Physics, the appropriate observables would not necessarily depend linearly on the uncertainty of the tensor charge, as shown for beta decay in Ref. [19].

Conclusions
The transversity Parton Distribution Function is the least known of the three leading-twist PDFs. Its phenomenological determination has been made possible thanks to independent data on its partners in semi-inclusive DIS, the chiral-odd fragmentation functions. In particular, the formalism developed around dihadron semi-inclusive DIS allows for a collinear extraction of combinations of valence transversity PDFs [6]. It requires knowledge of the Dihadron Fragmentation Functions, which have been studied in Refs. [8,42]. Since the first point-by-point extraction from dihadron semi-inclusive DIS [43], enormous efforts have been made towards the extraction of the transversity PDF, accessible in processes involving fragmentation into a pion pair [9].
In this paper we have presented a constrained fit of the valence transversity PDFs from dihadron semi-inclusive DIS data. The adopted methodology is characterized by the flexibility to accommodate constraints within the fitting procedure. It consists of three steps. First, we have examined the positivity constraints on the transversity distribution: a prior built from the expression of the bound combined with the data is introduced as a theoretical uncertainty on the PDF, which is then implemented through a reweighting of the data. Second, we have chosen flexible functional forms, free from ad hoc expressions of the Soffer bound, to enhance the information obtained from the data and first principles. This results in an improved treatment of the PDF uncertainties. Finally, we have guided the valence down transversity PDF to follow a fall-off behavior at x → 1 through the method of Lagrange multipliers, forcing the set of parameters to observe the theoretical constraints.
The obtained 68% and 95% CL envelopes for the valence transversity distributions fulfill the expected small-x behavior; the up distribution fully realizes the Soffer bound, and the edges of the down envelope reflect the relaxation of the bound combined with the lack of data in the large-x region. Relative to previous extractions, our results show wider error bands outside the data range with respect to the band inside that range. That trend, allowed by the absence of data, translates into a wide distribution of the tensor charge, yet with a more reliable extrapolation over the whole support. The resulting isovector tensor charge is in agreement with the determinations of lattice QCD. We believe that this exercise supports the idea that the choice of functional form contributes to the global error of the PDF determination, especially in kinematical regions with limited data coverage.
At SIDIS scales the relative effect of DGLAP evolution is small compared to the accuracy of the obtained PDFs. The effect of QCD evolution will become important when including proton-proton collision data from RHIC, for which the role of NLO corrections might become important, as discussed in Ref. [9]. Our methodology could be applied, as a natural extension of this work, to the extended data set including the aforementioned proton-proton data, together with the corresponding DGLAP routine. More semi-inclusive DIS data in the valence region are expected from CLAS12 and SoLID at JLab; data from the future Electron Ion Collider would improve the uncertainty in the low-x region.

ration grant SP 778/4-1 (DFG) and 278017 (CONACyT). R.F.-H. is also supported in part by CONACyT project 252167-F. J.B. has been funded by Beca Alianza del Pacífico 2018-2 and UdeI-FC of the Universidad Nacional de Ingeniería, Peru.

Data Availability Statement
This manuscript has no associated data or the data will not be deposited. [Authors' comment: The data are available in Refs. [26][27][28].] Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. Funded by SCOAP3.