Transverse-momentum dependent parton distribution functions beyond leading twist in quark models

Higher-twist transverse momentum dependent parton distribution functions (TMDs) are a valuable probe of the quark-gluon dynamics in the nucleon, and play a vital role for the explanation of sizable azimuthal asymmetries in hadron production from unpolarized and polarized deep-inelastic lepton-nucleon scattering observed in experiments at CERN, DESY and Jefferson Lab. The associated observables are challenging to interpret, and still await a complete theoretical explanation, which makes guidance from models valuable. In this work we establish the formalism to describe unpolarized higher-twist TMDs in the light-front framework based on a Fock-space expansion of the nucleon state in terms of free on-shell parton states. We derive general expressions and present numerical results in a practical realization of this picture provided by the light-front constituent quark model. We review several other popular quark model approaches including free quark ensemble, bag, spectator and chiral quark-soliton model. We discuss how higher-twist TMDs are described in these models, and obtain results for several TMDs not discussed previously in literature. This study contributes to the understanding of non-perturbative properties of subleading twist TMDs. The results from the light-front constituent quark model are also compared to available phenomenological information, showing a satisfactory agreement.

SIDIS is a rich source of information on the nucleon structure including subleading-twist effects. However, in a tree-level factorization approach, twist-3 SIDIS observables receive 4 (or 6) contributions due to twist-3 (or twist-2) transverse momentum dependent parton distribution functions (TMDs) convoluted with twist-2 (or twist-3) transverse momentum dependent fragmentation functions [22]. This makes the theoretical interpretation of data challenging, and motivates model studies to help to clarify the underlying physics. The important impact of model studies for the understanding of TMDs was reviewed in [23]. Model calculations also indicate that the status of TMD factorization in SIDIS beyond leading twist is not yet fully clarified [24]. Information on collinear twist-3 parton distribution functions is limited to g q T (x) accessed in polarized DIS, see [25] for an overview. The interference fragmentation function approach based on collinear factorization offers a way to access further twist-3 parton distribution functions in a collinear factorization [26]. A first extraction of one of these functions, namely e q (x), using this framework was recently reported in Ref. [27].
Higher-twist TMDs can in general be decomposed in contributions from leading-twist, current quark mass terms and pure interaction-dependent ("tilde") terms. This is accomplished by employing equations of motion (EOM) and reveals that tilde-terms are not parton densities but quark-gluon correlation functions. Neglecting the tilde-and mass terms is sometimes referred to as Wandzura-Wilczek approximation [28]. This step can be helpful in phenomenology to disentangle the many contributions to twist-3 SIDIS observables [29][30][31][32], and can in certain cases be a numerically useful approximation [25,33]. But it removes the richness of the largely unexplored but attractive non-perturbative physics of quark-gluon correlations. Precisely this is an important motivation to study subleading-twist effects [34,35].
Higher-twist TMDs and parton distribution functions of quarks are expressed in terms of hadronic matrix elements of bilinear quark-field correlators of the type h|ψ(0)Γψ(z)|h , which makes them amenable to studies in quark models [36], defined in the following as models without explicit gauge-field degrees of freedom. Quark models with interactions allow one, in principle, to model also the interaction-dependent tilde-terms. Quark models have been shown to give a useful description of leading-twist TMDs and related SIDIS observables, provided one applies them carefully within their range of applicability. Much less is known about higher-twist TMDs, and important questions emerge. What precisely can we learn from quark models? To what extent can quark models give estimates for higher-twist TMDs? And how useful are such estimates phenomenologically? This work will not provide an extensive answer to these complex questions. But it will, as we hope, shed new light on the applicability of quark models to TMDs beyond leading twist. In this work we will limit ourselves to unpolarized higher-twist TMDs. Earlier work in this sector was presented in [36][37][38][39][40][41].
The specific goals of this work are as follows. After a brief introduction on unpolarized TMDs in Sec. II, we will work out in Sec. III a general approach to derive unique decompositions of subleading-twist TMDs into twist-2 parts and mass terms by making use of the free EOM, where tilde-terms are absent. In the subsequent sections we will generalize this formalism to include interactions in specific quark models, which will give rise to tilde-terms.
In Secs. IV-VI we will discuss several quark models, starting with the ensemble of free quarks [42], a prototype for parton-model frameworks where interactions are absent. When discussing interacting models, we will include the spectator [38], chiral quark-soliton [39,40] and bag [41] models, and investigate how interaction-dependent tilde-terms arise in those models. Hereby we will not only review available results, but also present new results not discussed previously in the literature. We will also derive a so-called Lorentz-invariance relation (LIR) among unpolarized TMDs valid in frameworks without gauge degrees of freedom, i.e. also in quark models. We will use the LIR to test the theoretical consistency of the model frameworks.
A central part of this work is Sect. V. Here we will extend the light-front constituent quark model approach (LFCQM), which was used in the past to study leading-twist TMDs, to the description of higher-twist TMDs. This model in some sense exhibits features of both free and interacting quark models. In fact, we will find that some (not all) of the relations among TMDs derived from free EOM hold, which can be traced back to the fact that this approach is based on a light-front Fock-state expansion of the nucleon state in terms of on-shell parton states obeying the free EOM. However, we also find that the LIR is not supported in the LFCQM. Technically this is because the single quarks are on-shell, but the three-quark state they form is not, with the off-shellness introduced by the nonperturbative bound state information encoded in the nucleon wave-function. The deeper and more general reasons for the non-compliance with the LIR can be traced back to generic issues with the conservation of the minus-component of the electromagnetic current in light-front approaches, which requires the inclusion of higher Fock states not accounted for in this approach.
The paper is rounded up by Sec. VII, where we will present and compare the numerical results from the quark models. We will also confront predictions from the LFCQM to available results from phenomenology on e q (x). After the conclusions in Sec. IX, we will present Appendices with technical details.

II. TMDs AND EQUATIONS OF MOTION RELATIONS
Quark and antiquark TMDs for flavor q are defined in QCD in terms of quark correlators of the type Φ q ij (P, p, S; path) = d 4 z (2π) 4 e ip·z P, S|ψ j (0) W(0, z; path) ψ i (z)|P, S , where P (S) denotes four-momentum (polarization vector) of the nucleon, and p is the four-momentum of the quark. TMDs are given by such correlators integrated over p − with p + = xP + . Factorization theorems dictate (for p −integrated correlators) the process-dependent "path" indicated in Eq. (1) along which appropriate Wilson lines connect the bi-local quark field operators [43]. (Our notation is For brevity we do not indicate the scale dependence of the correlators and TMDs, and often omit the flavor index q on the quark fields ψ ≡ ψ q .) In order to count independent structures, one decomposes the correlator in terms of scalar "amplitudes" multiplied by independent Lorentz structures allowed by the symmetries of the strong interactions and constructed from the four-vectors P , S, p [42,44] and a (near-)lightlike four-vector n [45] which characterizes the path of the Wilson line (actually, the situation is more complex than that [25], but this does not change the general conclusion [45]). In QCD one has 32 independent amplitudes: A q i with 1 ≤ i ≤ 12 and B q j with 1 ≤ j ≤ 20 [45]. There are also 32 TMDs: namely 8 at leading twist, 16 at twist-3, and 8 (more academic) at twist-4. Thus, one ends up with as many TMDs as amplitudes, and there are a priori no relations among TMDs, unless one resorts to approximations such as the above-mentioned Wandzura-Wilczek approximations.
What distinguishes the A q i and B q i is that the former multiply Lorentz structures made from P , S, p only, while the latter explicitly include also the vector n characterizing the gauge-link. Therefore, in quark models (with no gauge fields) all the B q i amplitudes are absent. Moreover, the amplitudes A q 4 , A q 5 , A q 12 are "naively T-odd" which is allowed in QCD [46][47][48][49], but forbidden in quark models [50]. Thus, in quark models up to twist-3, one has 9 amplitudes describing 14 T-even TMDs, out of which 6 (8) are twist-2 (twist-3). This implies the existence of 5 "Lorentz-invariance relations" (LIRs) among T-even TMDs [42,44] which must hold in quark models [33], but are not valid in QCD [51] due to the presence of B q i -amplitudes [45]. Depending on the quark model, in addition to LIRs, also further relations may arise [37,38,41,52] due to (spherical, spin-flavor) symmetries of model wave-functions [53].
When we focus on the case of an unpolarized target within a quark model, the general decomposition of the correlator is completely specified by 3 terms, where the dots denote T-odd or polarization-dependent terms, or gauge-link related B q i amplitudes. If we denote by P | · · · |P the target spin-averaged matrix element, then the complete set of unpolarized T-even TMDs is given by 4 TMDs, the twist-2 f q 1 (x, p T ), the twist-3 e q (x, p T ) and f ⊥q (x, p T ), and the twist-4 f q 4 (x, p T ): In terms of the Lorentz-scalar amplitudes A q i , these unpolarized TMDs read in quark models Up to twist-3 level in the unpolarized T-even sector, we have 3 TMDs and 3 amplitudes. Thus, even in quark models, there are in general no relations between f q 1 (x, p T ), e q (x, p T ) and f ⊥q (x, p T ). The full structure of the quark correlator (1) in the unpolarized T-even sector is completed by the twist-4 TMD f q 4 (x, p T ) [3,45]. Twist-4 TMDs are rather academic objects. In physical situations, like power corrections to the DIS structure functions, f q 4 (x, p T ) mixes with other twist-4 quark-gluon correlators [34,[54][55][56][57][58][59][60]. While the practical understanding of power corrections is of interest [62,63], our motivation to include f q 4 (x) is rather that it will serve as an important internal consistency check of our approach. In fact, in quark models new features emerge as one goes to higher twists (as in QCD, albeit on a far simpler level).
Including twist-4, we encounter in quark models the situation that 4 unpolarized TMDs {f q 1 , e q , f ⊥q , f q 4 } are expressed in terms of 3 amplitudes {A q 1 , A q 2 , A q 3 } (in QCD the amplitude B q 1 also contributes). This implies a LIR among these TMDs valid in Lorentz-covariant quark models (but not in QCD). Using the methods of [44], see App. A, we find To the best of our knowledge, this relation has not been presented in the literature before. Let us end this section with two general results. In complete analogy to the positivity proof of f q 1 (x, p T ), one can show that the twist-4 TMD satisfies the positivity constraint With N q denoting the valence quark number of flavor q, the following sum rule is formally satisfied We discuss in App. B how this sum rule can be proven, and what is formal about it.

III. EQUATIONS OF MOTION RELATIONS IN FREE QUARK MODELS
Generally speaking, matrix elements of higher-twist operators can be decomposed by means of equations of motion (EOM) into contributions from twist-2, mass terms and tilde-terms [54][55][56]. We present here a general approach to derive such relations tailored for applications in quark models, where the situation is simplified due to the absence of gauge interactions. More precisely, in this section we concentrate on free quark models. It should be noted that, for instance, parton model frameworks [42,[64][65][66][67][68] belong to this class of models. After discussing the LFCQM in the next section, we will further generalize the formalism to models with interactions.
In order to derive a starting formula for EOM relations, we proceed as follows. Let Γ be an arbitrary Dirac matrix. We apply the free EOM within the fully unintegrated correlator, integrate by parts, and obtain Next, repeating the above steps with P |ψ(−z)(−i ← − / ∂ − m q )Γψ(0)|P (or, equivalently, taking the complex conjugate of (14)) and shifting the field positions by z, yields an identity analogous to (14) but with Γ( / p − m q ) replaced by ( / p − m q ) Γ, where Γ = γ 0 Γ † γ 0 is the Dirac conjugate of Γ. Adding up these two identities yields where we introduced the p − -integration and a factor 1 2 for later convenience. In the following we also set p + = xP + . Equipped with the identity (15), we proceed to derive the EOM relations among e q (x, p T ), f ⊥q (x, p T ) and f q 1 (x, p T ). They are obtained by choosing appropriate Γ matrices. Choosing respectively Γ = γ + and Γ = iσ +j T , we obtain which coincide with the EOM relations in QCD [22] but with (in free quark models) consistently neglected tilde-terms. We remark that / p = γ + p − + γ − p + − γ j T p j T in the identity (15) introduces the factors p + = xP + or p j T which become prefactors of x in (16) and (17) or are "absorbed" by the definition of (5). But the piece with γ + p − drops out due to (γ + ) 2 = 0. However, at twist-4 the component p − contributes. In order to eliminate it, we derive the identity which reflects the fact that the correlator (1) describes on-shell quarks in a free quark model. Using the identity (15) with Γ = iσ +− and making use of (18), we derive the EOM relation For Γ ∈ {1, γ − , γ j T , iσ −j T , iσ jk T } we obtain linear combinations of the EOM relations (16), (17) and (19). For example, for the choice Γ = 1 one obtains an EOM connecting all 4 TMDs, which reduces to (19) using (16) and (17), namely All the other Γ-structures are not relevant for unpolarized TMDs. We end this section with three important remarks. First, in free quark models the set of unpolarized T-even TMDs {f q 1 , e q , f ⊥q , f q 4 } can be expressed in terms of one single TMD, say f q 1 . That there is only one independent structure, can also be seen as follows. In Eq. (14) we have shown that in the class of free quark models tr[Γ( / p − m q )Φ q ] = 0 for all Γ. This implies that ( / p − m q )Φ q = 0, and inserting here the decomposition (1) of the correlator, for the case of an unpolarized nucleon in a quark model, yields where we used ( / p − m q ) / pA 3 = −m q ( / p − m q )A 3 if p 2 = m 2 q . Since the Dirac matrices 1, / p, / P and / p / P are linearly independent for p ∝ P , we conclude that Using this result in Eqs. (7)-(10) together with 2xP + p − = p 2 T + m 2 q , we recover the relations (16), (17), and (19). In particular, Eq. (22) shows that in free quark models the unpolarized correlator consists of only one independent amplitude, meaning that all unpolarized TMDs are related to each other.
Second, since the general Lorentz decomposition in models with on-shell quarks is fully specified by a single A i amplitude according to (22), all our free EOM relations (16), (17) and (19) can in some sense be understood as LIRs. It has to be stressed, that the general LIR (11) only explores Lorentz invariance in relativistic quark models, but makes no use of model details such as EOMs. Therefore, none of the EOM relations (16), (17), (19) is equivalent to the general LIR (11). However, a particular linear combination of (16), (17) and (19) can be formally proven to be equivalent to the LIR (11). The proof is formal though, since it can be invalidated by the properties of the amplitudes A q i in a given model, see App. A 2. Third, the EOMs in interacting quark models can be anticipated from Eqs. (16), (17) and (19), and read where the operator definitions of the specific tilde-terms have to be carefully worked out using the EOMs of the models under consideration. In QCD (23) and (24) hold, with the tilde-terms defined in terms of quark-gluon correlators [22]. But the term proportional to p 2 T f q 1 (x, p T ) in (25) could in QCD be naturally expressed in terms of correlators with transverse gluon inclusions of the type N |ψ i / D T γ + i / D T ψ|N [58]. Our free quark model results are recovered in the limit i / D T → i / ∂ T . For QCD treatments of higher-twist distributions we refer to [34,[54][55][56][57][58][59][60]. We also remark that the "brute-force" systematic neglect of all QCD quark-gluon correlations is the basis for WW-type approximations [33], and the general helicity formalism with the twist-2 QCD parton model of Ref. [68].
After discussing models where the quarks obey the free EOM (which is not the same as models without interactions) in Secs. IV and V, we will come back to several interacting quark models in Sec. VI.

IV. ENSEMBLE OF FREE QUARKS
In this section we derive the general expression for the unpolarized T-even TMDs up to twist-4 in quark models in which the quarks obey the free Dirac equations. Following Ref. [42], we assume that the nucleon is described as an ensemble of non-interacting partons of momentum P and spin S, which can be considered as a generic prototype for parton-model approaches [64][65][66][67][68]. We consider the TMD correlator where Γ = {γ + , 1, γ j T , γ − } stands for the matrices entering the definition of unpolarized T-even TMDs. In Eq. (26), we insert the free-field Fourier expansion of the quark field ψ on the surface z + = 0. We could equivalently use light-front as well as instant-form quantization for free fields. However, to make the link with the LFCQM which will be discussed in the following section, we adopt the light-front form with the following Fourier expansion where b q and d q † are the annihilation operator of the quark field and the creation operator of the antiquark field, respectively. Furthermore, λ is the light-front helicity of the partons andk denotes the light-front momentum variable k = (k + , k T ). Using (27) and restricting ourselves to the quark contribution, the operator in the correlator (26) reads By inserting (28) in the correlator (26), we obtain where x = p + /P + and P q λλ ′ is a density matrix in the space of the quark light-front helicity and its trace is the quark density operator evaluated in the target. The light-front spinors are given by Specifying the matrix Γ for the different unpolarized T-even TMDs, we find Using these results in the quark correlator (29), we obtain From the results in Eqs. (34)- (37), it is obvious that the EOM relations (16), (17) and (19) are satisfied. These relations are a consequence of the on-shell relation for the single-quark states. In order to explicitly evaluate integrated relations such as (11), we need to specify the quark momentum density (30) and therefore a model for the target state. To this aim, we will use as an example the LFCQM.

V. LIGHT-FRONT CONSTITUENT QUARK MODEL
The LFCQM has been used successfully to describe many nucleon properties [69][70][71][72][73][74][75][76][77][78] including leading-twist TMDs [52,[79][80][81][82]. Here we extend the analysis to unpolarized T-even TMDs beyond leading twist, restricting ourselves to the three-quark (3Q) Fock sector. The light-front Fock-space expansion of the nucleon state is performed in terms of free on mass-shell parton states with the essential QCD bound-state information encoded in the light-front wave function (LFWF). Restricting ourselves to the 3Q Fock sector, one therefore effectively deals with an ensemble of free quarks as described in Sec. IV. The T-even unpolarized TMDs can be therefore expressed as in Eqs. (34)- (37) where, as we will show, the quark momentum density in the proton P q is given by the overlap of LFWFs averaged over the light-front helicity of the quarks. We will apply the results obtained in this section to a specific model for the LFWFs [83], discuss numerical results, and compare to other models in Sec. VII (after a dedicated discussion how those models describe higher twist TMDs in Sec. VI).
Restricting ourselves to the 3Q Fock sector, the target state with definite four-momentum P = [P + , M 2 2P + , 0 T ] and light-front helicity Λ can be written as follows where ψ Λ;q1q2q3 λ1λ2λ3 is the 3Q LFWF with λ i and q i referring to the light-front helicity and flavor of quark i, respectively, r stands for (r 1 , r 2 , r 3 ) with r i = (x i M 0 , p T i ), and M 0 denotes the mass of the non-interacting 3Q state. We note that the single particle states in (38) 2P + is the minus component of the nucleon momentum. Furthermore, for the 3Q Fock state one also has i p + i = P + , whereas the LFWF depends on the plus component of the momenta of the noninteracting system of three quarks, i.e. k + i = x i M 0 , which is related to p + i by a longitudinal light-front boost. The integration measures in Eq. (38) are defined as The calculation of the T-even unpolarized TMDs proceeds along the lines outlined in the previous section. The explicit expression for the quark-momentum density is obtained by inserting the LFWF expansion of the proton (38) in Eqs. (30) and (31), with the result The matrix elements and scalar products in Eq. (40) read Using (41) and (42), and performing the integrations overk ′ and the quark momentap ′ i , Eq. (40) becomes In the case of SU (6)-symmetric LFWF, the contributions from all quarks q i with i = 1, 2, 3 are equal. We can choose to label the active quark with i = 1 and multiply by three the corresponding contribution. Then, the final results for the unpolarized TMD correlators with SU (6)-symmetric LFWF reads where we used the notation After discussing other quark models in the next section, we will produce numerical results from Eqs. (34)-(37) with the quark momentum density (44) obtained from the LFWFs of Ref. [83]. Before proceeding with that, let us discuss a general result concerning the integrated LIR (11).
In the LFCQM, the LIR (11) is not satisfied. This result is generic, and does not depend on the specific model for the LFWF. We checked that the LIR (11) is not supported neither using the LFWF of Ref. [83] nor those of Ref. [84].
Moreover, we also assured ourselves that the LIR (11) is also not valid in the light-front constituent model of the pion [85], which demonstrates that this feature does not depend on whether one deals with a three-body light-front Fock state |qqq as in the case of the nucleon, or a two-body light-front Fock state |qq as in the case of the pion.
From a technical point of view, the non-compliance with the LIR (11) can be understood as follows. In the integration of the LFWFs the relation i k + i = M 0 = M N with the off-shell energy condition i p − i = P − comes into play, and spoils the relation which would be naively expected for non-interacting quarks. The reason is that LFWFs represent the overlaps of the interacting state with free multiparton Fock states ψ n = n|ψ and contain the information about the interaction. In the LFCQM, we truncate the Fock space to the three-quark sector and use the free EOM to write down the bad components of the quark field in the TMD correlator. It is therefore not surprising that the free EOM relations are satisfied for the unintegrated TMDs, where we single out the free-motion of the individual active quark from the spectator quarks. On the contrary, in the integrated TMDs, we convolute the motion of the "free" active quark with the dynamics of the interacting 3Q system, with a consequent violation of the LIR.
In a light-front approach, such as the LFCQM, the violation of the LIR (11) is an expected feature, and reflects general issues of the light-front approach with sum rules of higher-twist parton distributions and with matrix elements of the minus component of the electromagnetic current. This has been elucidated from various perspectives [86,87]. In order to explain this point, we first remark that in the LFCQM f q 4 (x) vanishes in the limits x → 0 and x → 1 (as do all other parton distribution functions and TMDs). Because of that we can integrate (11) over x and derive in this way the sum rule (13), see App. B. Thus, in the LFCQM the integral dx f q 4 (x) receives contributions from the region x > 0 only. However, as shown in Ref. [86] in 1+1-dimensional QCD calculations, the sum rule (13) is satisfied only if one takes into account a δ(x)-contribution which originates from zero modes in the light-front quantization and whose existence can also be established using dispersion relation techniques [86]. More on δ(x)-contributions to parton distribution functions can be found in App. B 2. The description of light-front zero modes is beyond the scope of the LFCQM, and it is therefore not surprising that this model does not satisfy the sum rule (13) and the LIR from which this sum rule can be derived (within this model).
Alternatively one can explain the non-compliance of the LFCQM with the sum rule (13) by observing that it is related to the matrix elements of the minus component of the electromagnetic current. The latter is of course conserved in the light-front approach. But for that, one has to consider contributions from higher light-front Fock states [87] which are not accounted for in the LFCQM.

VI. EQUATIONS OF MOTION RELATIONS IN MODELS WITH INTERACTIONS
In this section, we discuss three models with interactions: bag model, spectator model, chiral quark-soliton model. We focus on formal aspects. Numerical results from some of these models will be presented in the next section.

A. Bag model
In the MIT bag model, relativistic (in our case massless) quarks are confined due to imposed boundary conditions inside a spherical cavity of radius R 0 fixed by the nucleon mass according to R 0 M N = 4ω [88][89][90]. Here ω ≈ 2.04 is the dimensionless "frequency" of the lowest eigenmode whose momentum space wave-function is given by where σ i (χ m ) denote Pauli matrices (spinors) and p i = p i /p with p = | p|. The normalization factor N and the functions t l (expressed in terms of spherical Bessel functions j l with l = 0, 1) are given by We introduce the convenient notation of [41] A = 16ω 4 In this notation, with t l ≡ t l (p) in the following and with the SU (6) spin-flavor symmetry factors N u = 2 and N d = 1, the results for the T-even unpolarized TMDs read The collinear function e q (x) was discussed in [36], while e q (x) and f q 4 (x) were calculated in [37]. Except for f q 4 (x, p T ), all these TMDs were discussed in detail in [41].
We will not investigate analytically the EOM relations in this model, and content ourselves with a qualitative discussion. The best example to explain the origin of tilde-terms in the bag model is e q (x, p T ) for which the general decomposition is given by x e q (x, p T ) = xẽ q (x, p T ) for massless quarks. The bag model quarks obey the free Dirac equation inside the cavity, and we know that the absence of interactions implies vanishing tilde-terms. Thus, the result for e q (x, p T ) in Eq. (49) is a boundary effect [36]. This is a physically appealing result: the bag boundary "mimics" confinement and hence gluonic effects. In this sense, it can be viewed as a (crude) model for quark-gluon correlations [36]. Note that the massless bag model quarks are off-shell, (11). This can be proven analytically by repeating step by step the proof of a different LIR from the Appendix of [41]. In our case, also a simpler proof is possible. Exploring the fact that the integrand 2 M N t 0 t 1 is a spherically symmetric function of where the last step follows after integration by parts. Combining this result with the expressions for f q 1 and f q 4 in Eq. (49) proves the LIR (11).
That f q 4 (x) satisfies the sum rule (13) can be shown in two ways. 1 One way is to integrate the model expressions (49) over p T and x, with dx = dp z /M N according to Eq. (48). Hereby the odd terms (± p z t 0 t 1 ) in the expressions for f q 4 and f q 1 in Eq. (49) vanish, implying that the integrals 2 dx f q 4 (x) = dx f q 1 (x) = N q are equally normalized. Alternatively, knowing from direct computation that in the bag model d dx f ⊥q(1) (x) is a continuous function at x = 0 (which in general does not need to be the case, see App. B), one can integrate the above-proven LIR (11) to verify (13).

B. Spectator model
In the spectator model, one treats the intermediate states that can be inserted in the definition of the correlator (1) as effective degrees of freedom with quantum numbers of diquarks and definite masses. Adopting the model of Ref. [38] for the diquark spectator system, one can write where m D is the diquark mass, a D is a spin factor taking the values a s = 1 (scalar diquark) and a a = −1/3 (axialvector diquark), and g(p 2 ) is a form factor that takes into account in an effective way the composite structure of the nucleon and the diquark. This form factor is often assumed to be [92] g( where Λ is a cut-off parameter and N is a normalization constant. This choice has the advantage of killing the pole of the quark propagator.
The results for the T-even unpolarized TMDs read where we introduced for convenience The flavor dependence is provided by SU (4) symmetry and similarly for the other TMDs. Except for f q 4 (x, p T ), all these TMDs were already obtained in [38]. Remarkably, the tilde-terms are simply given by which illustrates the connection between interaction and quark off-shellness p 2 − m 2 q . Using the analytic expressions (54), it is straightforward to check that the LIR (11) is satisfied, see also App. A 3 for further details.

C. Chiral quark-soliton model
We proceed with the χQSM. Here the nucleon is described as a chiral soliton in an effective, non-renormalizable lowenergy theory [93] defined in terms of the Lagrangian L = ψ(i / ∂ −M U γ5 −m q )ψ, where U γ5 = exp(iγ 5 τ · π)/f π denotes the chiral field and f π = 93 MeV the pion decay constant. The parameter M = 350 MeV is not a "constituent quark mass", but a dimensionful coupling constant of the quark fields to the chiral field, which is dynamically generated by instanton-anti-instanton interactions in the semi-classical description of the QCD vacuum [94][95][96]. A popular jargon is to refer to M as "dynamical mass". In contrast, m q = O(few MeV) is the current quark mass of light quarks. In many practical calculations, one can work in the chiral limit, and set m q to zero. In the following analytical derivations, we shall keep m q finite. The cutoff Λ cut = O(ρ −1 av ) of the effective theory is set by the inverse of the average instanton size ρ −1 av ≈ 600 MeV and determines the initial scale of the model. The theory can be solved in the limit of a large number of colors N c , where a soliton solution is found for a static pion field with hedgehog symmetry π( x) = f π e r P (r) where e r = x/r and r = | x|. Expressed in terms of the profile function P (r), the chiral field is given by U γ5 = cos P (r) + iγ 5 ( e r · τ ) sin P (r).
In the χQSM, the equation of motion is (i / ∂ − M U γ5 − m q )ψ = 0. The difference with the free quark case is the presence of the interaction term M U γ5 , which will be responsible for the emergence of "interaction-dependent" tilde-terms. Since the interaction contains no derivatives and U γ5 = U γ5 , we obtain an identity analogue to Eq. (15), with the substitution m q → m q + M U γ5 , i.e.
The treatment of the ( / p − m q ) part is precisely the same as in Sec. III, so we can focus on the structure ΓM U γ5 + M U γ5 Γ . Choosing respectively Γ = γ + and Γ = iσ +j T yields where Several comments are in order. First, even in the chiral limit m q → 0, the χQSM predicts a non-zero e q (x, p T ) which arises from the interaction term. That the operator M (U + U † ) is associated with interactions is evident: it is proportional (a) to M which is the dynamically generated mass due to interactions of light quarks in the strongly interacting QCD (instanton) vacuum, and (b) to the chiral field binding the effective quark degrees of freedom to form the nucleon. Second, it is remarkable that the strong chiral interactions do not generate a tilde-term in the case of f ⊥q (x, p T ). This information is very useful for phenomenology. In fact, it supports the WW-approximation for this TMD, which was applied to phenomenology in [97]. Third, the above expressions describe also antiquark TMDs according to with a (−)-sign for f 1 , and a (+)-sign for e and f ⊥ . It is interesting to remark that, from the χQSM, we can recover results of the non-interacting theory by taking formally the limit U → 1, which can be done by letting the size of the soliton go to zero [93]. In this formal limit Interestingly the tilde-term does not vanish but becomes effectively a mass term. 2 By taking U γ5 → 1, we "removed" the soliton field which binds the quarks. But we did not remove effects of the strongly interacting QCD vacuum where our quarks are embedded. In fact, light current quarks (with small masses m q ) acquire M as a response to collective instanton vacuum effects. So M and hence the result in Eq. (63) are of dynamical origin. If we "switched off" QCD vacuum effects, also M → 0. Thus, the result for e q (x, p T ) is clearly an interaction-dependent tilde-term. It is an important cross check that, in the formal limit of vanishing interactions, we recover results from the free theory.
It should be noted that in QCD the first two Mellin moments ofẽ q (x) vanish (see Eq. (B10) in App. B), but not in the χQSM. This is a limitation of the model, but not its failure. QCD sum rules for Mellin moments are specific to gauge theories (one would have basically the same sum rules in QED). The model interactions are different, which results in different but consistently satisfied sum rules within the models [39].
For completeness, we also discuss f q 4 (x, p T ). Exploring hedgehog symmetry, one finds that the sum rule (13) holds. Positivity can be proven within the model in complete analogy to f q 1 (x, p T ) [98]. In order to derive the EOM relation, we use the identity (58) with e.g. Γ = 1. As in free quark case, we encounter a contribution from the structure γ + p − where we have to eliminate p − . This is done by generalizing (18) to the case of the χQSM through the replacement with due care to the fact that U ±γ5 do not need to commute with Γ. After a bit lengthy but straightforward algebra we obtain with (for the flavor-singlet case) where all terms are either matrix elements of the chiral field U γ5 or the dynamical mass M (or both), and hence are manifestly interaction-dependent. Thus, switching off (soliton, instanton vacuum) interactions removesf q 4 (x, p T ). In the language of the light-front Fock-state expansion, the quark correlator in that model contains all |nq, (n − 3)q components for n = 3, 4, 5, · · · summed up. The calculation in terms of a Fock expansion is efficient if one restricts oneself to the minimal Fock state n = 3 [78], and becomes quickly impractical beyond that [99][100][101]. To get the "full answer", one has to evaluate the entire correlator. This numerically laborious task was done at large N c for the flavor-singlet unpolarized TMDs f u+d 1 (x, p T ) and fū +d 1 (x, p T ) [102,103], from which we could immediately obtain flavor-singlet results for xf ⊥q (x, p T ) via (60), but the computation in the non-singlet channel has not yet been performed. Results for the parton distribution function e q (x) were presented in [39,40], while f q 4 (x) was never studied.
In this section we treated the χQSM as a "quark model" as done in [39,40] and other higher-twist studies [104,105]. As in any quark model, also here it is possible to evaluate matrix elements ofψΓψ operators of any twist. We found the results consistent in the sense that the tilde-terms, which we separated off by means of EOM, really encode model interactions and vanish in a formal limit of a non-interacting theory. However, strictly speaking the χQSM should be understood as the "leading-order" approximation of the instanton vacuum model, which is of paramount importance to identify the model distributions of quarks and antiquarks with leading-twist QCD parton distribution functions at low normalization scale µ 0 ∼ ρ −1 av ∼ 600 MeV [106]. The tight connection of the χQSM to instanton vacuum became also apparent in our discussion: we were able to show that tilde-terms vanish, only after switching off all interactions, also those associated with instanton vacuum effects. Fully consistent higher-twist studies require to work directly in the instanton vacuum [94][95][96]. Only in this way a realistic description of the non-perturbative quark-gluon dynamics can be obtained. In some cases, tilde-terms of partons distribution functions were found to be small [107,108] in the instanton vacuum, but not in all [96,109]. Since a fully consistent treatment of higher-twist matrix elements requires instanton vacuum techniques, we refrain from showing here numerical results for higher-twist TMDs within the χQSM, and refer to instanton vacuum model studies [96,[107][108][109].

VII. NUMERICAL RESULTS AND COMPARISON TO PHENOMENOLOGY
In this section we present numerical results from the models. The LFCQM results are new, and discussed in more detail. For comparison we include bag and spectator model results [38,41]. All results refer to a low quark model scale. We then evolve e q (x) obtained in the LFCQM and compare it with a recent extraction [27].
We apply the general light-front formalism elaborated in Sec. V to the model of 3Q LFWF from [110,111]. The parameters of this model were fixed to reproduce the anomalous magnetic moments of the proton and neutron. The parameter of importance for the following discussion is the constituent quark mass m q = 263 MeV. The results of this quark model, as well as any quark model without explicit gluon and sea-quark degrees of freedom, refer to a low initial scale µ 0 LO = 420 MeV.
The results for the integrated TMDs f q 1 (x), e q (x), f ⊥q (x), and f q 4 (x) obtained from this approach are shown in Fig 1. The bag and spectator model results for f q 1 (x), e q (x), and f ⊥q (x) included for comparison in Fig. 1 are from [38,41], while the results for f q 4 (x) in those models are from this work. Because of SU (6) spin-flavor symmetry, the flavor dependence of unpolarized T-even TMDs is trivial in LFCQM and bag model, i.e. u-quark distributions are a factor 2 bigger than d-quark distributions. This is different in the diquark spectator model, where the two-body wave function obeys SU (4) symmetry that does not lead to a simple relation between u-and d-quark distributions (only in the large N c -limit, where the scalar and axial diquark masses become equal, would one have the same trivial flavor dependence in the spectator model as in the other two models).
The results for the twist-2 function are comparable in the three models. For instance, f q 1 (x) exhibits a peak roughly around x ≈ 0.3 in all models. Also the magnitude is similar, which is understandable because the flavor number sum rule determines the normalizations of the lowest moments. In particular, the results from the bag model and LFCQM show a very similar behavior at large x.
The picture is very different for higher twist. As compared to the other models, the magnitude of the higher-twist TMDs is bigger in the LFCQM. This is partly due to the fact that higher-twist TMDs in this model arise from mass effects, and the constituent quark mass of this model m q = 263 MeV is sizable. Also the overall shapes of e q (x), f ⊥q (x) and f q 4 (x) differ largely in the three models. For instance, the maxima of the curves are scattered over a wide interval in x. A very distinctive feature is the node in e d (x) in the diquark model. All models comply with the positivity constraint for f q 4 (x) in Eq. (12). Note also that the distributions do not vanish at x = 0 in the bag and diquark models, in contrast to the LFQCM. This is due to the power-law ansatz of the model for the LFWF, which vanishes when any x i → 0. As a consequence 3 f q 1 (x) ∝ x 3 for x → 0. Therefore e q (x), f ⊥q (x) and f q 4 (x), which are related to f q 1 (x) by means of the EOM relations (16), (17) and (19), have not only regular small-x limits but even vanish for x → 0, too.
Next let us discuss sum rules. The twist-3 parton distribution function e q (x) obeys the sum rules [36] dx e q (x) = 1 2M N P |ψ(0)ψ(0)|P , The LFCQM satisfies both sum rules. In the case of (66) this means that integrating e q (x) yields the same result as evaluating the local matrix element on the right-hand-side of this equation. The compliance of the model with the second sum rule is evident from the EOM relation (16). For the first sum rule, however, this is highly non-trivial. We explain this in detail in App. B 2. Numerically the result for the sum rule (66) in agreement with q dx e q (x) = (6-10) expected in QCD [98] (see App. B 2 for further comments on this result). Next we turn our attention to the p T -dependence of TMDs. We define the mean transverse momenta (n = 1) and the mean squared transverse momenta (n = 2) in the TMD as follows In Table I (a) we show results for these quantities for unpolarized T-even TMDs in the LFCQM. Since in the LFCQM the flavor dependence appears as an overall factor N q , the p n T in Eq. (69) are equal for u-and d-quarks. Compared to f q 1 (x, p T ), the mean transverse momenta in e q (x, p T ) and f ⊥q (x, p T ) are smaller while those of f q 4 (x, p T ) are larger, implying that e q (x, p T ) and f ⊥q (x, p T ) fall off with p T faster than f q 1 (x, p T ) and vice-versa for f q 4 (x, p T ). An instructive quantity is the ratio R G ≡ 2 p T /(π p 2 T ) 1/2 . If the p T -dependence of the TMDs was exactly Gaussian, this ratio would be unity. Table I (a) shows that the LFCQM supports this "measure of Gaussianity" within 5%.
The definitions of p n T in Eq. (69) are not useful in all models. In the bag model, the x-integration in (69) would include unphysical regions and bias the result, see footnote 1. Moreover, p 2 T defined in (69) is divergent for some TMDs [41]. Also in the spectator model (69) is not useful, especially for e q (x, p T ) where nodes in p T occur such that p 2 T is negative. In this situation, one gains more insight with a different definition of p 2 T,v which is chosen such that one obtains (if it is possible) a useful Gaussian approximation of the true p T -dependence at valence-x within a model [41], namely This definition is x-dependent, but typically the x-dependence is weak in the valence-x region [41]. For definiteness, we choose the value x v = 0.3 as reference. Using this definition, we can directly compare all models, see Table I (b).
With the values quoted in Table I (b) the true p T -dependence is approximated within (5-20)% depending somewhat on the TMD and model. As shown in Fig. 2, in the LFCQM the approximations work reasonably well in a large range of p T . However, it is important to realize that the TMD picture holds for p 2 T ≪ µ 2 0 with the initial scale µ 2 0 ≈ 0.176 GeV 2 in quark models. Thus, beyond p 2 T 2 p 2 T,v the non-perturbative results from quark models for TMDs have no physical meaning.
The spectator model is the only model with non-trivial flavor dependence considered here. Interestingly, the p Tdistributions of d quarks are systematically broader than those of u quarks. The reason behind this is the diquark masses, which set the physical scales for the p T -behavior. The d-quark TMDs are given entirely in terms of the heavier axial-vector diquark, and are therefore broader. The u-quark TMDs receive contributions from both scalar and axial-vector diquarks, but the lighter scalar diquark dominates which makes the distributions narrower.

VIII. COMPARISON TO PHENOMENOLOGY
In order to confront the LFCQM results to phenomenology, it is necessary to evolve them from the low initial scale to experimentally relevant scales. Taking evolution effects into account, the LFCQM as described in the previous section, was shown to describe satisfactorily data related to twist-2 TMDs in the valence-x region with an accuracy of (10-30)% [52,[79][80][81][82]. Whether higher-twist TMDs are described with similar success, remains to be seen. The recent study [27] puts us in the position to investigate this question for e q (x).
For the comparison we will need e q (x) at a scale of 1.5 GeV 2 . The pure twist-3 contributionẽ q (x) follows a complicated evolution pattern [112][113][114] typical for subleading-twist distributions, see also the reviews [115,116]. However, in our caseẽ q (x) = 0 and all we have is x e q (x) = mq MN f q 1 (x) with the evolution of the latter given by the standard evolution of f q 1 (x). To be consistent, we also have to make m q subject to LO evolution of the QCD running quark mass (in fact, the quark mass insertion makes the contribution of mq MN f q 1 (x) "chiral odd" and hence a legitimate contribution to the chiral odd e q (x)). It is part of the model, that the value of m q at the initial scale is a sizable constituent quark mass, rather than a small QCD current quark mass. But one has to recall that this constituent mass has to be understood as an effective parameter describing a quark dressed by non-perturbative interactions inside the hadron.
In Fig. 3 (a) we show e u (x) at the initial scale of µ 2 0,LO = 0.176 GeV 2 , and after LO evolution in the above-described way to a final scale of Q 2 = 1.5 GeV (for technical details of the evolution parameters we refer to [81]). The results for the d-quark distribution can be obtained by rescaling by a factor 1/2 the u-quark distribution, according to the SU (6)-flavor factors. Fig. 3 (a) shows that the effects of evolution are sizable, and cannot be neglected. The same observations were made also in twist-2 case [52,[79][80][81][82].
Recently the CLAS collaboration has measured azimuthal distributions of π + π − pairs produced in SIDIS using a longitudinally polarized 6 GeV electron beam off an unpolarized proton target [117]. Correlations of final-state hadrons [118][119][120] provide a handle to access novel information on the nucleon structure in collinear factorization [121] including e q (x) [26]. In this process, one focuses on the kinematics where the struck parton fragments into a hadron pair, which gives rise to various azimuthal asymmetries. If we denote by σ ⇄ the cross sections for producing the hadrons h 1 h 2 from positive or negative helicity electrons with the beam polarization P B impinging on an unpolarized target, e ⇄ (l) + N (P ) → e(l ′ ) + h 1 (P h1 ) + h 2 (P h2 ) + X, then the observables of interest in our context are [26] 1 1 2 where we introduced the abbreviations u hh ≡ {z hh , ζ, m 2 hh } and d 3 u hh = dz hh dζdm 2 hh . The DIS variables describing lepton scattering are q = l − l ′ , Q 2 = −q 2 , x = Q 2 /(2 P · q), and y = (P · q)/(P · l). The kinematics of the produced hadron pair is described by the invariant dihadron mass m 2 hh = (P h1 +P h2 ) 2 , the total longitudinal momentum fraction z hh = z 1 + z 2 transferred from the struck quark to the hadron pair, and its relative distribution ζ = (z 1 − z 2 )/z hh , where z i = (P · P hi )/(P · q). Finally, R T is the component of the relative momentum 1 2 (P h1 − P h2 ) transverse with respect to the total hadron momentum (P h1 + P h2 ) and given by R 2 . The angle φ R is the inclination of the dihadron plane with respect to the lepton scattering plane counted from the direction of the outgoing lepton [26].
The hadrons h 1 h 2 can be produced in different relative partial waves, and H ∢q 1 (u hh ) and G ∢q (u hh ) describe the interference of s-and p-waves [122]. The former is leading twist and arises from the fragmentation of a transversely polarized quark, the latter is subleading twist and due to quark-gluon correlations in the fragmentation process. In contrast, the leading-twist fragmentation function D q 1 (u hh ) is diagonal in the partial waves. By deducing information on D q 1 (u hh ) from the PYTHIA Monte-Carlo event generator [123] tuned to hadron spectra produced from e + e − collisions in the Belle experiment, and analyzing Belle data on azimuthal asymmetries in dihadron production [124], some information on H ∢q 1 (u hh ) was inferred in [125]. On the basis of this information, a first extraction of e q (x) from the CLAS data [117] was reported in [27] (for an earlier attempt to access e q (x) from SIDIS data on TMD observables, see [32]).
In Ref. [27] it was argued that the CLAS data on the ratio of the cross sections (71) and (72) cannot be dominated by the second term in (71) proportional to G ∢q (u hh ). Assuming this term to be zero, an approximation referred to as "WW-scenario" in [27], yields the extracted data points for the combination e V (x) ≡ 4 9 (e u − eū)(x) − 1 9 (e d − ed)(x) shown in Fig. 3 (b) which refers to Q 2 = 1.5 GeV 2 .
For comparison we show in Fig. 3 (b) the results from the LFCQM for the flavor combination e V (x) = 4 9 e u (x) − 1 9 e d (x) at the same scale. The agreement with the extraction is very satisfactory for the two higher-x bins. The description of the lowest x-bin is less good. But it is important to recall that the LFCQM is applicable in the valence x-region and subject to limitations below x 0.2 [79], cf. footnote 3. Let us remark that the "WW scenario" of Ref. [27] is completely in line with the LFCQM. The consistent bruteforce neglect of tilde-terms removes not only G ∢q (u hh ) but also e q (x). This is precisely the situation in the LFCQM where e q (x) = mq MN f q 1 (x) is modeled in terms of a sizable constituent quark mass contribution. It is important to add a cautious remark. The "WW scenario" assumed in Ref. [27] is one possible way of dealing with the unknown contribution of G ∢q (u hh ) but not the only one. In [27] also a "beyond-WW scenario" was explored where this fragmentation function is allowed to be non-zero, with the constraint to reproduce preliminary CLAS data on the double spin asymmetry in dihadron production with a longitudinally polarized beam and target. This asymmetry is due to G ∢q (u hh ) and compatible with zero within error bars according to the preliminary data [126]. Although this strongly constrains the magnitude of this fragmentation function, a non-zero G ∢q (u hh ) compatible with the preliminary data [126] has a non-negligible impact on the extraction of e q (x). This indicates that the extraction shown in Fig. 3 (b) could have sizable unestimated systematic uncertainties. The only safe conclusion at the moment is that e q (x) seems to be non-zero in either scenario [27].
Keeping these cautious reservations in mind, we conclude that the LFCQM prediction for e q (x) is compatible with the presently available preliminary CLAS data [117] extracted in the "WW-scenario" [27] which is conceptually in line with the model.

IX. CONCLUSIONS
Sizable azimuthal asymmetries in SIDIS with (un)polarized beams due to subleading-twist TMD effects have been observed whose theoretical description is not fully clarified. Insights from models can provide valuable guidelines. Quark models in principle offer a tool to evaluate hadronic matrix elements of quark-field correlators of any twist [36], allowing one to model also TMDs, including higher twist. It is therefore of interest to explore them as a resource for the interpretation of available data, or for predictions for future experiments. For that it is important to assess the applicability and limitations of quark models and improve the understanding of how higher-twist TMDs are modeled. The aim of the present work was to contribute to this understanding.
We have shown that exploring the respective equations of motion, higher-twist TMDs can be decomposed in quark models into contributions from leading-twist TMDs, quark-mass terms and pure-interaction dependent ("tilde") terms. This is in some sense analogue to QCD, although the model interactions are far simpler than the QCD gauge interactions. Also the meaning of quark mass may differ, as in some models one may deal with a sizable "constituent quark mass". Nevertheless, the decompositions are fully consistent within the models, and we have shown that the interaction-dependent tilde-terms vanish in formal limits when the model-interactions are "switched off".
We have reviewed how this happens in the bag model: the tilde-terms vanish when one removes the bag boundary condition [36]. Since the latter is designed to "mimic" confinement and hence "gluonic effects", this demonstrates that the modeling of tilde-terms in the bag model is consistent in this sense [36]. We also reviewed how tilde-terms arise in the spectator model, namely due to off-shellness effects [38]. A new result obtained in this work was the discussion of tilde-terms in the chiral quark-soliton model. We have shown that these terms vanish if one formally reduces the strength of the solitonic field which binds the quarks in that model, and removes the instanton interactions which "dress" the light quarks with a dynamically generated mass. These results indicate that tilde-terms are "reasonably" modeled in these approaches and are generated by the respective effective interactions.
A remarkable result obtained in this work is the absence of tilde-terms in the twist-3 TMDs f ⊥q (x, p T ) and f ⊥q (x, p T ) in the chiral quark-soliton model. Other unpolarized higher-twist TMDs receive significant tilde-terms in that model, which arise from the strong chiral interactions that bind the quarks in a solitonic field. For instance, the twist-3 distribution functions e q (x) and eq(x) are, in the chiral limit, solely due to a tilde-term which is rather sizable in that model [39,40]. But in the case of f ⊥ the chiral interactions do not induce tilde-terms, and these TMDs are given by x f ⊥q (x, p T ) = f q 1 (x, p T ) for quarks and analogous for antiquarks in the leading order of the large-N c expansion. This prediction may have interesting phenomenological applications.
We also studied models where quarks do not feel explicit interactions which, however, not always implies truly noninteracting theories. In the ensemble of free quarks [42], which can be understood as a prototype of more sophisticated parton model frameworks, the interactions are simply absent and the tilde-terms are consequently zero. Parton model approaches have important applications, and allow us to separate "kinematical" from "dynamical" effects. This leads to valuable insights [64][65][66][67][68], but does not teach us anything about tilde-terms.
An interesting approach studied in this work in great detail is the light-front constituent quark model (LFCQM) which we extended beyond leading twist. This approach is based on a light-front Fock-state expansion of the nucleon state in terms of on-shell partons -each obeying the free EOM. Certain "unintegrated relations" among TMDs that are valid in free quark models are therefore naturally supported in this model, but not all. In fact, some free quark model relations among p T -integrated TMDs are not supported. One can understand this by recalling that the free quark states in the Fock expansion are used to construct the nucleon light-front wave-function which encodes non-perturbative information and hence the bound-state nature, through certain parameters and the way the free quarks states are arranged to form the nucleon state. Removing the bound state nature in this case would bring us back to the free quark ensemble model.
In order to test the consistency of the different quark model approaches, we derived a so-called Lorentz-invariance relation (LIR). Such relations are spoiled in QCD due to gauge interactions, but they hold in relativistic quark models without gluon degrees of freedom. We have shown that all quark models satisfy the LIR, except for the LFCQM. We traced back the reasons for this to general features of the light-front formalism which appear at subleading twist [86]. The non-compliance of the LFCQM with this specific LIR is equivalent to the violation of the sum rule for the twist-4 parton distribution function f q 4 (x). In order to satisfy this sum rule, one has to include light-front zero modes [86]. An equivalent explanation is that this sum rule is related to the matrix element of the minus-component of the electromagnetic current P |J − |P . In a light-cone approach, one has to consider overlap contributions from higher Fock-state components [87]. Since the modeling of zero-modes or higher Fock-state components is beyond the scope of the LFCQM, the LIR and the sum rule for f q 4 (x) which follows from it, are consequently not supported. In the LFCQM, where the quarks are non-interacting in the above explained sense, tilde-terms are absent and the higher-twist TMDs arise from their respective (and in the model consistently described) twist-2 contributions and mass terms. Due to the size of the constituent quark mass of about 300 MeV in that model, the mass terms are sizable. This feature is reasonable and consistent within this model, recalling that the results refer to a low renormalization point µ 0 ∼ 0.4 GeV . We presented numerical results from the LFCQM model, and compared with other models.
The LFCQM has been used extensively (more than the other models) in the past for phenomenological applications in the context of leading-twist TMDs, and it was shown that its results are compatible with data within a typical model accuracy of about (10-30)%. A comparison to phenomenology in the twist-3 sector is more difficult, as the associated SIDIS observables receive contributions from 4-6 TMDs and require also a good understanding of presently unknown higher-twist fragmentation functions.
However, recently a phenomenological extraction of the twist-3 parton distribution function e q (x) was reported [27] based on the collinear interference fragmentation function framework [26]. Taking into account the evolution from the low initial scale of the LFCQM to the experimentally relevant scale, we observe a very good agreement with the extracted result within model accuracy. One should bear in mind, that the first extraction of information on e q (x) has unestimated systematic uncertainties [27]. Nevertheless, the good agreement of the model predictions and the phenomenological result is an encouraging indication that the LFCQM may be similarly successful in the twist-3 sector as it is in the twist-2 sector.
Future works will shed more light on the applicability of this and other quark models to the description of TMDs beyond leading twist, and allow us to assess with more confidence to which extent quark model approaches are capable to contribute to our understanding of non-perturbative partonic properties at higher twist.
Before discussing the derivation of the LIR, we rewrite the expression for f q 4 in Eq. (10) as follows We see that f q 4 (x, p T ) is expressed in terms of f q 1 (x, p T ) and a remaining part f q rest (x, p T ) related to the amplitude A q 3 . The only TMD defined solely in terms of A q 3 is f ⊥q (x, p T ). The goal is therefore to relate f q rest (x, p T ) to f ⊥q (x, p T ). For that we first follow Ref. [44].

Derivationà la Tangerman-Mulders
In this derivation the variables of the amplitude A i are treated as independent quantities. In the next section we will see that in quark models the situation can be different.
In order to proceed, we integrate f q rest (x, p T ) over p T (in principle, one could formally also take higher transverse moments, i.e. weight by (p 2 T /2M 2 N ) n with n > 1 before p T -integration, though this may raise convergence issues). Recalling that p + = xP + , we introduce the convenient variable where the second relation follows for fixed x. The quark virtuality is then given by Treating σ, x and p 2 T as independent variables, we obtain Notice that in the intermediate step (A5) we integrated by parts with respect to p 2 T which is justified, provided A 3 falls off at large p T faster than 1/p 4 T . This condition also ensures that the (1)-moment f ⊥q(1) (x) is finite. Inserting the result (A6) in (A1) yields the LIR (11).

Derivation for on-shell particles
When the parton (with mass m q ) is onshell as could be the case in models, then under the p − integral defining the TMD in terms of the amplitude A i , both arguments of A i (σ, p 2 ) are fixed in terms of x and p T We simulate this situation as follows The on-shell condition (A8) allows one to perform the p − integration, but it is convenient to refrain from this step. Instead, we make use of (A7) and (A8) under the integral of f q rest (x, p T ) in Eq. (A1) and obtain Thus, in contrast to the general case discussed in the previous section, here we could complete the task of relating f q rest (x, p T ) and f ⊥q (x, p T ) without integrating out transverse momenta. Inserting this result in (A1) yields This relation does not contain new information in "on-shell" models, where it can be derived from the EOM relations. For example, inserting (17) in (A10) yields (19). Nevertheless, we encounter (A10) here as an "unintegrated on-shell version" of the LIR (11). Thus, in both on-shell and general cases one finds a relation expressing f q 4 in terms of f q 1 and f ⊥q , (11) and (A10). At first glance, these relations seem to be different and this is puzzling. The essential ingredient of the derivation of LIRs is the (unique and complete, in quark models) decomposition (1) of the correlator in terms of A i amplitudes, and this is dictated by Lorentz invariance which all (relativistic) quark models obey. However, both versions (11) and (A10) are formally equivalent (what formally means will become clear shortly).
Starting from the derivative of f ⊥(1)q (x) we obtain We again emphasize that in the step leading to (A11) we integrated by parts, which is legitimate provided A 3 falls off at large p T faster than 1/p 4 T . This condition is anyway required in order to have a finite results for f ⊥(1)q (x). It is an interesting question to wonder what would happen if f ⊥(1)q (x) was divergent. In that case, it may (or may not) be possible to introduce an appropriate regularization scheme chosen such that f ⊥(1)q (x) is finite and the LIR (11) is satisfied. In this context it is interesting to remark, that in the bag model the transverse moment f (1)q 1 (x) is divergent and needs regularization. However, d dx f (1)q 1 (x) in that model is finite. In fact, this feature was used in [41] to define the "regularized" f (1) = 0). This in turn implies the interesting possibility that a LIR of the type (11) could hold in model although the associated transverse moment is undefined. So far we have not yet encountered such an example.

Derivation for spectator model
In a spectator model, the spectator system is on-shell m 2 D = (P − p) 2 = M 2 N − 2P · p + p 2 , and the energy of the struck quark is determined by four-momentum conservation. The struck quark is off-shell, but both variables of the amplitude A i are constrained as Like in the free quark model, we can simulate this situation by introducing a new amplitude 4 Note that because of the specific form (A14), it is actually not necessary to integrate over p T , and we can consider directly (A1). If we have four-momentum conservation with the spectator system onshell, we can write Thus, we obtain the following LIR for spectator models which is satisfied by the separate diquark contributions, but not by the total result for f q 4 (x, p T ) due to the different diquark masses. Even if the diquark masses were equal (which is the case in the limit of a large number of colors), one should notice that in contrast to the free quark model, this relation contains the model parameter m D . Hence, it is an internal model relation, with limited or no validity beyond the spectator model.
Finally, we are going to check explicitly that the LIR (A16) reduces to (11) once integrated over p T . We have In QCD the electromagnetic current is defined as J µ = q e q J q µ with J q µ = ψ(0)γ µ ψ(0). The general decomposition of its forward matrix elements is P |J q µ |P = (2P µ )F q 1 (0) with F q 1 (0) = N q . Thus, from Eq. (B2) we conclude that 2 dx f q 4 (x) = dx f q 1 (x) = N q .
A variant of this proof consists in making use of the fact that Mellin moments are Lorentz scalars. Thus, one may go to the nucleon rest frame, where one finds in the expressions for the first moments of f q 1 (x) and f q 4 (x) matrix elements of the type P |ψ(0)γ ± ψ(0)|P = P |ψ † (0)(1 ± γ 0 γ 3 )ψ(0)|P / √ 2. Now, after the x-integration has removed any memory of the light-front direction (local matrix element), the contributions P |ψ † (0)(±γ 0 γ 3 )ψ(0)|P vanish due to rotational symmetry in the nucleon rest frame implying that 2 dx f q 4 (x) and dx f q 1 (x) are equally normalized. In quark models, where LIRs are valid, also another formal proof is possible. Integrating the LIR (11) over x, one formally finds 2 dx f q 4 (x) = dx f q 1 (x), since where we used (62) and explored the fact that TMDs vanish for x → 1. However, here we tacitly assumed that f ⊥q(1) (x) is a continuous function of x including the point x = 0. This can, but does not need, to be the case in models. Thus, in the general case one could find that the small x-behavior invalidates this proof, due to A gaze at models provides intuition. In both the bag model and χQSM (13) is satisfied, which is straightforward to check by directly integrating model expressions and exploring rotational (in bag model) or hedgehog (in χQSM) symmetries. In the bag model (where one has to keep in mind the reservations due to the unphysical negative-x region, see footnote 1), f ⊥q(1) (x) is a continuous function at x = 0, so one can also integrate the LIR to prove (13). But in the χQSM, which describes at x < 0 physical TMDs according to (62), one has x f ⊥q(1) (x) = f q 1 (x) and the latter exhibits a discontinuity at x = 0 that ensures positivity [106]. Thus, in the χQSM the sum rule (13) is valid, but cannot be proven by integrating the LIR.
As the last proof in quark models, we notice that f q rest (x) in the intermediate step (A4) can be rewritten as [44] f q rest (x) = dσ dτ d 2 p T (σ − 2xM 2 N ) Integrating this expression over x we obtain which vanishes because we deal with an integral of the type where the x i are simple zeros of the argument v(x), and our function v(x) = xσ − x 2 M 2 N − p 2 T − τ is such that v ′ (x 1,2 ) = ∓ σ 2 − 4M 2 N (τ + p 2 T ). Using this expression for v ′ (x) in (B8), one formally finds that dx f q rest (x) = 0, confirming (B7) and proving the sum rule (13). However, in a specific model one has to investigate carefully whether x i ∈ [−1, 1] such that the integrated δ(v(x)) has indeed support in the integration region.

Other potentially violated sum rules
Sum rules like (13) are referred to as formal. They are mathematically correct. But in the formal theoretical evaluation of such a sum rule, a δ(x)-singularity (if present) is integrated over, and contributes to the result. However, the experimental test of such a sum rule will only include results inferred (and extrapolated) from data taken at finite x > 0. Hereby of course the contribution of the δ(x)-singularity will be missed, and sum rule perceived as violated.
We are not aware of how (even in principle) the twist-4 sum rule (13) could be tested, but there are other sum rules which can be tested experimentally. The most famous example is the long-discussed and still unsettled possible violation of the Burkardt-Cottingham sum rule which features the twist-3 parton distribution function g q T (x) [127]. Also the sum rule of the twist-3 parton distribution function h q L (x) was debated [128]. But the most interesting case in the context of this work is the Jaffe-Ji sum rule [36] connecting e q (x) to the pion-nucleon sigma-term σ πN . By exploring QCD equations of motion, e q (x) can be decomposed as follows [98] e q (x) = δ(x) 2M N P |ψ(0)ψ(0)|P +ẽ q (x) + e q mass (x).
Hereẽ q (x) and e q mass (x) denote, respectively, the pure twist-3 and mass term, which in QCD have the properties dxẽ q (x) = dx xẽ q (x) = dx e q mass (x) = 0.
For x = 0 the mass term is expressed in QCD as well as in quark models by xe q mass (x) = mq MN f q 1 (x). Thus, in QCD the sum rule (disregarding a small doubly isospin violating term) for e q (x) is given by A δ(x)-contribution in e q (x) was found in (1 + 1)-dimensional models [128], perturbative one-loop light-front calculations [129], and non-perturbative calculations in the χQSM [39,40]. In the one-loop dressed-quark model of [129], δ(x) emerged as a p + zero mode in light-front time-ordered perturbation theory. In the χQSM, the coefficient of the δ(x)-function (and hence σ πN , see [130]) is related to the quark vacuum condensate [39,40], a quantity with central importance as order parameter of spontaneous chiral breaking.
No δ(x) singularity appears in the bag [36,37] or spectator [38] models. Particularly interesting in our context is the model with massive quarks in light-front one-loop Hamiltonian perturbation theory with light-front gauge [131] where also no δ(x) contribution was found (this was in fact impossible, because in contrast to [129], in the calculation of [131] a prescription for the operator 1 ∂ + was chosen, which discards p + zero modes). Theẽ q (x) and e q mass (x) from [131] do not satisfy (B10). However, remarkably dx (ẽ q + e q mass )(x) nevertheless satisfies the sum rule (B11). Thus, in this calculation the information on σ πN is, instead of being concentrated in the point x = 0, redistributed over the whole interval 0 < x < 1. The same kind of "holographic principle" is observed in the LFCQM, see Sec. VII.
which proves that x f ⊥ (x, p T ) = f 1 (x, p T ). Two remarks are in order. First, we see that the relation is satisfied for each quark level separately. This is so, because we used the EOM for the single quark states. Second, we see explicitly the off-shellness of the quark in the n th level p n = (E n , p T , xM N − E n ), namely p 2 n = E 2 n − (xM N − E n ) 2 − p 2 T = 0, which would have been expected for massless onshell quarks.