Variational estimates for discrete operators modeled on multi-dimensional polynomial subsets of primes

We prove extensions of Birkhoff's and Cotlar's ergodic theorems to multi-dimensional polynomial subsets of prime numbers $\mathbb{P}^k$. We deduce them from the $\ell^p$-boundedness of the $r$-variational seminorms of the corresponding discrete operators of Radon type, where $p>1$ and $r>2$.


Introduction.
Let $(X, \mathcal{B}, \mu)$ be a $\sigma$-finite measure space with $d_0$ invertible commuting and measure preserving transformations $T_1, \ldots, T_{d_0} : X \to X$. Let $P = (P_1, \ldots, P_{d_0}) : \mathbb{R}^k \to \mathbb{R}^{d_0}$ denote a polynomial mapping such that each $P_j$ is a polynomial on $\mathbb{R}^k$ having integer coefficients without a constant term. Let $B$ be an open bounded convex subset in $\mathbb{R}^k$ containing the origin such that for some $\iota > 0$ and all $N \in \mathbb{N}$, where for $\lambda > 0$, we have set In this paper we consider the following averages where $k = k' + k''$, $\mathbb{P}$ denotes the set of prime numbers, and $\pi_B(N) = \sum_{n \in \mathbb{N}^{k'}} \sum_{p \in \mathbb{P}^{k''}} 1_{B_N}(n, p)$.
One of the results of this article establishes the following theorem.
Theorem A. Assume that $p \in (1, \infty)$. For every $f \in L^p(X, \mu)$ there exists $f^* \in L^p(X, \mu)$ such that
Sums over prime numbers are irregular, thus it is more convenient to work with weighted averaging operators, Then the pointwise convergence of $(A_N f : N \in \mathbb{N})$ can be deduced from the properties of $(M_N f : N \in \mathbb{N})$, see Proposition 2.1 for details. Next to the averaging operators we also study pointwise convergence of truncated discrete singular operators. To be more precise, let $K \in C^1(\mathbb{R}^k \setminus \{0\})$ be a Calderón-Zygmund kernel satisfying the differential inequality (1.2) $|x|^k |K(x)| + |x|^{k+1} |\nabla K(x)| \le 1$, for all $x \in \mathbb{R}^k$ with $|x| \ge 1$, and the cancellation condition for every $0 < \lambda' \le \lambda$. Then the truncated discrete singular operator $H^P_N$ is defined as The logarithmic weights in $M^P_N$ and $H^P_N$ correspond to the density of prime numbers. In this article we prove the following theorem, which may be thought of as an extension of Cotlar's ergodic theorem, see [4].
The classical approach to the pointwise convergence in L p (X, µ) proceeds in two steps. Namely, one needs to show L p (X, µ) boundedness of the corresponding maximal function reducing the problem to showing the Theorem C. For every p ∈ (1, ∞) there is C p > 0 such that for all r ∈ (2, ∞) and all f ∈ L p (X, µ), The constant C p is independent of the coefficients of the polynomial mapping P.
The variational estimates for discrete averaging operators have been the subject of many papers, see [8,10,11,13,15,16,26]. In [10], Krause studied the case $d_0 = k = k' = 1$ and obtained the inequality (1.4) for $p \in (1, \infty)$ and $r > \max\{p, p'\}$. On the other hand, Zorin-Kranich in [26] obtained (1.4) in the same case for all $r \in (2, \infty)$, but only for $p$ in some vicinity of 2. Only recently, in [11], the variational estimates have been established in the full range of parameters, that is $p \in (1, \infty)$ and $r \in (2, \infty)$, covering the case $k'' = 0$. In [26], Zorin-Kranich also proved (1.4) for the averaging operators modeled on prime numbers, that is when $d_0 = k = k'' = 1$ with the polynomial $P(n) = n$. It is worth mentioning that the variational estimates for discrete operators are based on a priori estimates for their continuous counterparts developed in [9], see also [11, Appendix].
The variational estimates for discrete singular operators have been studied in [3,11,13,16]. In [16], the authors obtained the inequality (1.5) for the truncated Hilbert transform modeled on prime numbers, which corresponds to $d_0 = k = k'' = 1$ and the polynomial $P(n) = n$. In fact, discrete singular operators of Radon type required a new approach. An important milestone was reached by Ionescu and Wainger in [7]. Ultimately, the theory of discrete singular operators of Radon type was fully developed in [11].
Concerning pointwise ergodic theorems over prime numbers, there are some results using oscillation seminorms. In [1], Bourgain showed pointwise convergence of the averages along prime numbers for functions from $L^2(X, \mu)$. His result was then extended to all $L^p(X, \mu)$, $p > 1$, by Wierdl in [24], see also [2, Section 9]. Not long afterwards, Nair in [18] proved Theorem A for $L^2(X, \mu)$, $d_0 = k = k'' = 1$, and any integer-valued polynomial. Nair also studied ergodic averages for functions in $L^p(X, \mu)$ for $p \neq 2$; however, [19, Lemma 14] contains an error. In fact, the estimates on the multipliers $W_N$ are insufficient to show that the sum considered at the end of the proof has bounds independent of $|\alpha - a/b|$. Lastly, the extension of Cotlar's ergodic theorem to prime numbers has been established in [14], see also [16].
In view of the Calderón transference principle, while proving Theorem C, we may work with the model dynamical system, namely, $\mathbb{Z}^{d_0}$ with the counting measure and the shift operators. Let us denote by $M^P_N$ and $H^P_N$ the corresponding operators, namely, and (1.7) $H^P_N f(x) = \sum_{n \in \mathbb{Z}^{k'}} \sum_{p \in (\pm\mathbb{P})^{k''}} f\big(x - P(n, p)\big) K(n, p) 1_{B_N}(n, p) \prod_{j=1}^{k''} \log p_j$.
We now give some details about the method of the proof of Theorem C for the model dynamical system. To simplify the exposition we restrict attention to the averaging operators. Let us denote by $m_N$ the discrete Fourier multiplier corresponding to $M^P_N$. To deal with $r$-variational estimates we apply the method recently used in [13], see also [26]. Namely, given $\rho \in (0, 1)$ we consider the set $D_\rho = \{N_n : n \in \mathbb{N}\}$, where $N_n = 2^{n^\rho}$. Then in view of (5.6) we can split the $r$-variation into two parts: long variations and short variations, and study them separately. For each $p \in (1, \infty)$ we can choose $\rho$ so that the estimate for the $\ell^p$-norm of short variations is straightforward. Next, to control long variations we adopt the partition of unity constructed in [11], that is for some parameter $\beta \in \mathbb{N}_0$. Each projector $\Xi^\beta_{n,s}$ is supported by a finite union of disjoint cubes centered at rational points belonging to $R^\beta_s$. In this way, we distinguish the part of the multiplier where we can identify the asymptotic from the highly oscillating piece. The oscillating part is controlled by a multi-dimensional version of Weyl-Vinogradov's inequality with a logarithmic loss together with $\ell^p(\mathbb{Z}^d)$ estimates for multipliers of Ionescu-Wainger type. By the triangle inequality, to control the first part it is enough to show First, by the circle method of Hardy and Littlewood, we find the asymptotic of the multiplier $m_{N_n}$. Here we encounter the main difference from [11]. Namely, for $\xi$ sufficiently close to the rational point $a/q$ we have The limitation on the size of the denominator is a consequence of the fact that for a larger $q$ the Siegel-Walfisz theorem has an additional term due to the possible exceptional zero of the exceptional quadratic character. The second issue is the slower decay of the error term in (1.9). In particular, the latter has its impact on the size of the cubes in the partition of unity. Both facts make the analysis of the approximating multipliers $\nu^s_{N_n}$ harder.
To overcome this we directly work with $m_N$. Moreover, we get a completely unified approach to the variational estimates for the averaging operators and the truncated discrete singular operators. Going back to the sketch of the proof, in order to show (1.8), we divide the variation into two parts: $s < n \le 2^{\kappa_s}$ and $2^{\kappa_s} < n$, where $\kappa_s \simeq (s + 1)^{\rho/10}$. For large scales $2^{\kappa_s} < n$, we transfer a priori estimates on the $L^p$-norm of the $r$-variation of the related continuous multipliers. Since the Gaussian sums satisfy $|G(a/q)| \lesssim q^{-\delta}$ for some $\delta > 0$, we gain a decay $(s + 1)^{-\delta\beta\rho}$ on $\ell^2$. Consequently, by interpolation the $\ell^p$ norm of the $r$-variation for large scales is bounded by $(s + 1)^{-2}$ provided that $\beta$ is sufficiently large. In the case of small scales $s < n \le 2^{\kappa_s}$, the estimate on $\ell^2$ is obtained with the help of the numerical inequality (2.3). We again show that the $\ell^2$ norm is bounded by $(s + 1)^{-\delta\beta\rho+1}$. Because of the weaker asymptotic (1.9), obtaining $\ell^p$ bounds for $r$-variations over small scales requires a new approach. We further divide the index set into dyadic blocks, then on each block we construct a good approximation to the multiplier giving bounds on the $\ell^p$ norm independent of the block. At the cost of an additional factor of $\kappa_s^2$, we control the $\ell^p$ norm of the $r$-variation. Again, by interpolation combined with a choice of $\beta$ large enough we can make the $\ell^p$ norm bounded by $(s + 1)^{-2}$.
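A small numerical sketch may help to see why the scales $N_n = 2^{n^\rho}$ with $\rho \in (0, 1)$ make short variations easier to handle: the ratio of consecutive scales tends to 1, so each gap between $N_n$ and $N_{n+1}$ is short relative to $N_n$. The function names below are illustrative only and are not taken from the article.

```python
def scale(n: int, rho: float) -> float:
    """Return N_n = 2 ** (n ** rho)."""
    return 2.0 ** (n ** rho)


def consecutive_ratio(n: int, rho: float) -> float:
    """Return N_{n+1} / N_n, computed from the exponent difference
    (n + 1) ** rho - n ** rho so that large n does not overflow."""
    return 2.0 ** ((n + 1) ** rho - n ** rho)


if __name__ == "__main__":
    # For rho = 1 the scales are dyadic and the ratio is always 2;
    # for rho < 1 the ratio decreases towards 1 as n grows.
    for n in (1, 10, 100, 10_000, 1_000_000):
        print(n, consecutive_ratio(n, 1.0), consecutive_ratio(n, 0.5))
```

With $\rho = 1$ one recovers the dyadic scales, for which no such gain is available.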
Let us briefly describe the structure of the article. In Section 2.1 we collect basic properties of the variational seminorm. In Section 2.2, we show how to deduce Theorem A from the $r$-variational estimates (1.4) and (1.5). Then we present the lifting procedure, which allows us to replace any polynomial mapping $P$ by a canonical one $Q$. In the next section, we describe multipliers of Ionescu-Wainger type whose $\ell^p$ norm estimates are essential to our argument. In Section 3, we show a multi-dimensional version of Weyl-Vinogradov's inequality with a logarithmic loss. Moreover, we prove the estimate on the Gaussian sums of a mixed type. Sections 4.1 and 4.2 are devoted to studying the asymptotic behavior of the multipliers $M_N$ and $H_N$, respectively. Finally, to get a completely unified approach to the variational estimates for the averaging operators and truncated singular operators, at the beginning of Section 5, we list the properties shared by them which are sufficient to prove Theorem C. In the next two sections we show the estimates on long and short variations.
Notation. Throughout the whole article, we write $A \lesssim B$ ($A \gtrsim B$) if there is an absolute constant $C > 0$ such that $A \le CB$ ($A \ge CB$). Moreover, $C$ stands for a large positive constant whose value may vary from occurrence to occurrence. If $A \lesssim B$ and $A \gtrsim B$ hold simultaneously then we write $A \simeq B$. Lastly, we write $A \lesssim_\delta B$ ($A \gtrsim_\delta B$) to indicate that the constant $C$ depends on some $\delta > 0$.
and by Minkowski's inequality Moreover, for any j 0 ∈ A, Finally, for any increasing sequence (u k : 0 ≤ k ≤ K), we have The following lemma is essential in studying variational seminorms.
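To make the variational seminorm concrete, here is a minimal sketch that evaluates it for a finite sequence. It assumes the standard definition, $V_r(a_n) = \sup \big(\sum_j |a_{k_{j+1}} - a_{k_j}|^r\big)^{1/r}$ with the supremum over increasing index sequences; the function name `r_variation` is illustrative.

```python
from typing import Sequence


def r_variation(a: Sequence[complex], r: float) -> float:
    """r-variation of a finite sequence, assuming the standard definition:
    the supremum over increasing index sequences k_0 < k_1 < ... of
    (sum_j |a[k_{j+1}] - a[k_j]| ** r) ** (1 / r)."""
    if r < 1:
        raise ValueError("r must be at least 1")
    # best[j] is the largest sum of r-th powers of jumps over index
    # sequences that end at position j (0.0 for the trivial sequence).
    best = [0.0] * len(a)
    for j in range(len(a)):
        for i in range(j):
            best[j] = max(best[j], best[i] + abs(a[j] - a[i]) ** r)
    return max(best, default=0.0) ** (1.0 / r)
```

For a monotone sequence and $r > 1$ the supremum is attained by a single jump from the first to the last term, whereas for an oscillating sequence every jump contributes; this is why the $r$-variation controls both the supremum and the number of large oscillations.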

Pointwise ergodic theorems.
In this section we show how to deduce the pointwise ergodic theorem (Theorem A) from a priori $r$-variational estimates for $M^P_N$.
Proposition 2.1. Let $p \in (1, \infty)$ and $r \in (2, \infty)$. Suppose that there is $C > 0$ such that for all $f \in L^p(X, \mu)$,
Proof. Let us fix $N \in \mathbb{N}$. For each $m \in \{1, \ldots, N\}$ and $s \in \{1, \ldots, k''\}$, we set by the partial summation we obtain Hence, where we have used the trivial estimate which is a consequence of (1.1) and the prime number theorem. Observe that thus by repeated application of (2.5), we arrive at the conclusion that because the prime number theorem implies that $\vartheta_B(N) \simeq N^k$. In particular, by taking $f = 1_X$ and $p = \infty$ in (2.6) we get Hence, for any $p \in [1, \infty]$ and $f \in L^p(X, \mu)$, Next, if $p > 1$ then we can write In view of (2.1), the a priori estimate (2.4) entails that Hence, while proving $\mu$-almost everywhere convergence of the averages $(A_N f : N \in \mathbb{N})$ for $f \in L^p(X, \mu)$, we may assume that the function $f$ is bounded. By (2.7), for $p = \infty$, we can write Therefore, the convergence of $(M^P_N f(x) : N \in \mathbb{N})$ implies the convergence of $(A^P_N f(x) : N \in \mathbb{N})$ to the same limit.
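The passage between unweighted and logarithmically weighted sums over primes can be illustrated in the simplest one-dimensional situation. The sketch below is not the computation from the proof; it assumes the classical Chebyshev function $\vartheta(m) = \sum_{p \le m} \log p$ and checks the Abel summation identity $\pi(N) = \vartheta(N)/\log N + \sum_{m=2}^{N-1} \vartheta(m)\big(1/\log m - 1/\log(m+1)\big)$ together with $\vartheta(N) \simeq N$. All function names are illustrative.

```python
import math


def primes_up_to(n: int) -> list:
    """Sieve of Eratosthenes: all primes p <= n."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytes(len(range(i * i, n + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]


def theta_table(n: int) -> list:
    """theta[m] = sum of log p over primes p <= m, for 0 <= m <= n."""
    prime_set = set(primes_up_to(n))
    theta = [0.0] * (n + 1)
    for m in range(1, n + 1):
        theta[m] = theta[m - 1] + (math.log(m) if m in prime_set else 0.0)
    return theta


def pi_by_partial_summation(theta: list, n: int) -> float:
    """Recover the prime counting function pi(n), n >= 2, from theta
    by summation by parts against the weight 1 / log m."""
    total = theta[n] / math.log(n)
    for m in range(2, n):
        total += theta[m] * (1.0 / math.log(m) - 1.0 / math.log(m + 1))
    return total
```

For instance, with `theta = theta_table(10_000)` the value `pi_by_partial_summation(theta, 10_000)` agrees with `len(primes_up_to(10_000))` up to rounding error.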
Thanks to Calderón's transference principle we can restrict attention to the model dynamical system, that is, $\mathbb{Z}^{d_0}$ with the counting measure and the shift operator. Hence, it suffices to study the operators (1.6) and (1.7) on $\ell^p(\mathbb{Z}^{d_0})$.

Lifting lemma.
For the polynomial mapping $P = (P_1, \ldots, P_{d_0})$, let us define It is convenient to work with the set $\Gamma = \{\gamma \in \mathbb{Z}^k \setminus \{0\} : 0 \le \gamma_j \le \deg P \text{ for each } j = 1, \ldots, k\}$ equipped with the lexicographic order. Then each $P_j$ can be expressed as for some $c_{j,\gamma} \in \mathbb{Z}$. The cardinality of the set $\Gamma$ is denoted by $d$. We identify $\mathbb{R}^d$ with $\mathbb{R}^\Gamma$. Let $A$ be a diagonal $d \times d$ matrix such that for all $\gamma \in \Gamma$ and $v \in \mathbb{R}^d$, For $t > 0$, we set Finally, we introduce the canonical polynomial mapping, then $LQ = P$. The following lemma allows us to reduce the problems to studying the canonical polynomial mappings.
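The lifting can be sketched on a toy example. The code below assumes that the canonical mapping is $Q(x) = (x^\gamma : \gamma \in \Gamma)$ and that $L$ is the linear map built from the coefficients $c_{j,\gamma}$; both formulas are displayed in the original but not reproduced in this copy, so they are assumptions here, as are the function names and the sample polynomials.

```python
from itertools import product


def gamma_set(k: int, deg: int) -> list:
    """Multi-indices gamma != 0 with 0 <= gamma_j <= deg for each j,
    listed in lexicographic order."""
    return [g for g in product(range(deg + 1), repeat=k) if any(g)]


def monomial(x, gamma) -> int:
    """Evaluate x ** gamma = prod_j x_j ** gamma_j."""
    out = 1
    for xi, gi in zip(x, gamma):
        out *= xi ** gi
    return out


def canonical_q(x, gammas) -> list:
    """Assumed canonical mapping Q(x) = (x ** gamma : gamma in Gamma)."""
    return [monomial(x, g) for g in gammas]


def apply_l(coeffs, v, gammas) -> list:
    """Assumed linear map L: coeffs[j] maps gamma to c_{j, gamma};
    returns (sum_gamma c_{j, gamma} * v_gamma : j)."""
    return [sum(c.get(g, 0) * vg for g, vg in zip(gammas, v)) for c in coeffs]


# Hypothetical example with k = 2 and deg P = 2:
# P_1(x, y) = x^2 + 3xy and P_2(x, y) = 2y - xy.
GAMMAS = gamma_set(2, 2)
COEFFS = [{(2, 0): 1, (1, 1): 3}, {(0, 1): 2, (1, 1): -1}]
```

In this example `apply_l(COEFFS, canonical_q((x, y), GAMMAS), GAMMAS)` returns `[x**2 + 3*x*y, 2*y - x*y]`, which is the identity $LQ = P$ for the sample mapping.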

Lemma 2.
[12, Lemma 2.1] Let $R^P_N$ be any of the operators $M^P_N$ or $H^P_N$. Suppose that for some $p \in (1, \infty)$ and $r \in (2, \infty)$, . In the rest of the article by $M_N$ and $H_N$ we denote the averaging and the truncated discrete singular operator for the canonical polynomial mapping $Q$, that is $M_N = M^Q_N$ and $H_N = H^Q_N$.

2.4. Ionescu-Wainger type multipliers.
Let $\mathcal{F}$ denote the Fourier transform on $\mathbb{R}^d$, that is for any $f \in L^1(\mathbb{R}^d)$, To simplify the notation, by $\mathcal{F}^{-1}$ we denote the inverse Fourier transform on $\mathbb{R}^d$ as well as the inverse Fourier transform on the $d$-dimensional torus identified with $(0, 1]^d$. We also fix $\eta : \mathbb{R}^d \to \mathbb{R}$, a smooth function such that $0 \le \eta \le 1$, and 16d . We additionally assume that $\eta$ is a convolution of two non-negative smooth functions with supports contained inside $\big[-\frac{1}{8d}, \frac{1}{8d}\big]^d$.
Next, let us recall the notation necessary to define auxiliary multipliers of Ionescu-Wainger type. For details we refer to [12]. The following construction depends on a parameter $\beta \in \mathbb{N}$.

and
$A_q = \{a \in \mathbb{N}_q^k : \gcd(q, a_1, \ldots, a_k) = 1\}$. Lastly, we set (2.9) $U^\beta_n = \{a/q : a \in A_q \text{ and } q \in P_n\}$.
its discrete counterpart is given by the formula where $E_n$ is a diagonal $d \times d$ matrix with positive entries $(\epsilon_{n,\gamma} : \gamma \in \Gamma)$ such that $\epsilon_{n,\gamma} \le \exp(-n^{1/5})$. Then by [13, Theorem 2.1], for each $p \in (1, \infty)$ and any finitely supported function $f : \mathbb{Z}^d \to \mathbb{C}$, where $r = \max\{\lceil p/2 \rceil, \lceil p'/2 \rceil\}$. The scalar-valued version of (2.10) was proved in [7], see also [12]. The vector-valued extension was recently observed in [13]. Essentially, its proof follows the same lines as the scalar-valued one, except that in place of the Marcinkiewicz-Zygmund inequality one uses Kahane's vector-valued extension of Khinchine's inequality, see [13, Theorem 2.1] for details.

Weyl-Vinogradov sum.
We say that a subset of integers $A$ is polynomially regular if for all $\alpha, \alpha_1 > 0$ there are $\beta_0 > 0$ and a constant $C > 0$ so that for any integer $1 \le Q \le (\log N)^{\alpha_1}$, $\beta > \beta_0$ and any polynomial $P$ of the form for some coprime integers $a$ and $q$, such that $1 \le a \le q$, and we have for all $r \in \{1, \ldots, Q\}$ and $N \in \mathbb{N}$.
Let us check that Z is polynomially regular. We write and hence, by Weyl estimates with logarithmic loss (see e.g. [25, Remark after Theorem 1.5]), proving the claim. Another example of polynomially regular sets is the set of prime numbers. This is a consequence of [6, Theorem 10].
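The cancellation that Weyl-type estimates quantify can be observed numerically in the simplest case of a quadratic phase with a rational frequency $a/q$. The sketch below is only an illustration: it uses the classical fact that a complete quadratic Gauss sum modulo an odd prime $q$ with $\gcd(a, q) = 1$ has modulus $\sqrt{q}$, so a sum of length $N = mq$ has modulus $m\sqrt{q} = N/\sqrt{q}$. The function name `weyl_sum` is hypothetical.

```python
import cmath
import math


def weyl_sum(a: int, q: int, n_terms: int, degree: int = 2) -> complex:
    """Return sum_{n=1}^{n_terms} exp(2 pi i a n^degree / q).
    The phase is reduced modulo q before exponentiating, which keeps
    the floating point error small for large n."""
    return sum(
        cmath.exp(2j * math.pi * ((a * pow(n, degree, q)) % q) / q)
        for n in range(1, n_terms + 1)
    )


if __name__ == "__main__":
    q = 101
    for m in (1, 5, 20):
        n_terms = m * q
        # The normalized size is 1 / sqrt(q): the larger the denominator,
        # the stronger the cancellation.
        print(n_terms, abs(weyl_sum(3, q, n_terms)) / n_terms, 1 / math.sqrt(q))
```

The estimates in this section address the much harder situation where the frequency is only close to $a/q$ and the summation runs over polynomially regular sets such as the primes.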
Our aim is to understand exponential sums over Cartesian products of polynomially regular sets. Let us fix a function φ : The main result of this section is the following theorem.
wherein for some 0 < |γ 0 | ≤ d, for some coprime integers a and q such that 1 ≤ a ≤ q, and The constant C depends on α, d and a constant in (3.3).
Proof. Let us first assume that φ ≡ 1. The proof consists of three steps.
Step 1. We consider the case when $k = 1$ and $|\gamma_0| = d$. Take $\alpha > 0$ and $\alpha_1 > 0$, and let $\beta > \beta_0 = 3\beta_1 + 3d\alpha$, where $\beta_1$ is the value of $\beta_0$ determined by $A_1$ for $\alpha$ and $\alpha_1$. Suppose that $a$ and $q$ are coprime integers such that $1 \le a \le q$, and If $a'/q' \neq a/q$ then Hence, we obtain 1 Observe that the last estimate is also valid if $q' = q$. Let $Q$ be an integer such that proving (3.4). We now set $\theta = \xi_d - a'/q'$ and apply the partial summation to get Since which finishes the proof of Step 1.
Step 2. We next consider $k \ge 2$ and $\gamma_0 \neq (0, \ldots, 0, \ell, 0, \ldots, 0)$ for any $\ell \le d$. Without loss of generality we may assume that $\gamma_0(1) \ge 1$. By the triangle inequality followed by the Cauchy-Schwarz inequality we get which, by another application of the Cauchy-Schwarz inequality, is bounded by Finally, . Notice that the set $\Theta$ is a convex subset of a cube $[-N, N]^{2k}$. Moreover, the polynomial $Q(x, x')$ has degree at least $|\gamma_0|$ having a coefficient $\xi_{\gamma_0}$ in front of the monomial $x^{\gamma_0}$. Therefore, by [12, Theorem 3.1], there are $\beta_0 > 0$ and $C > 0$ such that provided that $\beta > \beta_0$. Hence, by (3.5), (3.6) and (3.7) we obtain
Step 3. Suppose that $k \ge 1$ and $\gamma_0 = (0, \ldots, 0, \ell, 0, \ldots, 0)$ for $1 \le \ell \le d$. Without loss of generality we may assume that $\gamma_0 = (\ell, 0, \ldots, 0)$. The proof is by a backward induction over $\ell \in \{1, \ldots, d\}$. We write If $\ell = d$ the conclusion follows by Step 1. Suppose that $\ell < d$. In view of Step 2 and the inductive hypothesis, the estimate holds for any $|\gamma_0| = j$, $\ell < j \le d$. Let $\beta_1$ be the largest value of $\beta_0$ among those that were determined in Step 2 and resulting from the inductive hypothesis. By Dirichlet's principle, for each $\ell < |\gamma| \le d$, we select coprime integers $a_\gamma$ and $q_\gamma$, such that If for some $\gamma \in \Gamma$, $\ell < |\gamma| \le d$ we have $(\log N)^{\beta_1} \le q_\gamma$, then the conclusion follows by the inductive hypothesis or Step 2. Otherwise, we set $\theta_\gamma = \xi_\gamma - a_\gamma/q_\gamma$ and $Q = \operatorname{lcm}\{q_\gamma : \ell < |\gamma| \le d\}$. We have To estimate the inner sum on the right-hand side of (3.11), we apply the partial summation. Setting we can write By (3.9), for $(n_1, \tilde{n}) \in \Omega$ we have Recall that $\gamma_0 = (\ell, 0, \ldots, 0)$ and thus, by Step 1 applied to $S^{(r)}_{n_1,\tilde{n}}$ we obtain where $\beta_2$ is the value of $\beta_0$ determined in Step 1 for $\alpha + \beta_1$ and $\alpha_1$. Hence, Consequently, by (3.8), (3.10) and (3.11) we get Finally, we deal with a general $\phi$.
Given $\alpha$, let $\beta_0$ be such that We divide the cube $[-N, N]^k$ into $J$ closed cubes $(Q_j : 1 \le j \le J)$ with sides parallel to the axes and having side lengths $O\big(N(\log N)^{-\alpha-1}\big)$. Thus By $Q^o_j$ we denote the interior of $Q_j$. We assume that $Q^o_j$ are disjoint from the axes. Let $n_j$ be the vertex of $Q_j$ at the largest distance to the origin. Then by the mean value theorem and (3.3), we have On the other hand, in view of (3.12), we get which together with (3.14) completes the proof.
We next apply Theorem 1 to get the following variant of Weyl-Vinogradov's inequality.
The constant C depends on α, d and a constant in (3.3).
Proof. We claim that the following holds true.
for some coprime integers a and q, such that 1 ≤ a ≤ q, and The proof is by a backward induction over r. For r = k ′′ the assertion follows by Theorem 1. For r ∈ {1, . . . , k ′′ }, N ∈ N and m ∈ {1, . . . , N }, we set where Ω is a convex subset of [−N, N] k . For 0 ≤ r < k ′′ , by the partial summation, we can write Hence, by the inductive hypothesis we get proving the claim. Now, the theorem follows by Claim 1 for r = 0.

Gaussian sums.
Given $q \in \mathbb{N}$ and $a \in A_q$, the Gaussian sum is where $\varphi$ is Euler's totient function, i.e. $\varphi(q)$ equals the number of elements in $A_q$. The following theorem provides a very useful estimate on the Gaussian sums.

Theorem 3.
There are $C > 0$ and $\delta > 0$ such that for all $q \in \mathbb{N}$ and $a \in A_q$,
Proof. Let us recall that for $a, q \in \mathbb{N}$, (see e.g. [17, Theorem 4.1]) wherein $\mu(q)$ is the Möbius function defined for $q = p_1^{j_1} \cdots p_m^{j_m}$, where $p_j$ are distinct prime numbers, as We start the proof of the theorem by considering $d = 1$. Then Suppose that $k' \ge 1$. If $G(a/q) \neq 0$ then $q \mid a_\gamma$ for all $\gamma = (\gamma', 0) \in \Gamma$. Since $a \in A_q$, we must have $k'' \ge 1$.
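The identity recalled at the beginning of the proof is not reproduced in this copy. The sketch below assumes that it is the classical evaluation of the Ramanujan sum, $\sum_{r \in A_q} e^{2\pi i r a/q} = \mu\big(q/(a,q)\big)\,\varphi(q)/\varphi\big(q/(a,q)\big)$, where in dimension one $A_q$ consists of the residues $1 \le r \le q$ coprime to $q$, and verifies that formula numerically. The function names are illustrative.

```python
import cmath
import math


def mobius(n: int) -> int:
    """Moebius function: 0 if n has a squared prime factor,
    otherwise (-1) ** (number of distinct prime factors)."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def totient(n: int) -> int:
    """Euler's totient: the number of 1 <= r <= n coprime to n."""
    return sum(1 for r in range(1, n + 1) if math.gcd(r, n) == 1)


def ramanujan_sum(q: int, a: int) -> complex:
    """Direct evaluation of the sum of exp(2 pi i r a / q) over reduced residues r."""
    return sum(
        cmath.exp(2j * math.pi * ((r * a) % q) / q)
        for r in range(1, q + 1)
        if math.gcd(r, q) == 1
    )


def closed_form(q: int, a: int) -> int:
    """Assumed closed form mu(q / (a, q)) * phi(q) / phi(q / (a, q))."""
    g = q // math.gcd(a, q)
    return mobius(g) * totient(q) // totient(g)
```

In particular, when $a$ is coprime to $q$ the sum equals $\mu(q)$, so the normalized sum has size at most $1/\varphi(q)$; this is the source of the decay in $q$ in the one-dimensional prime case.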
Next, let us consider the case $d \ge 2$. For a given polynomial $P$ on $\mathbb{R}^k$ with integral coefficients we define where $a \in A_q$. Our aim is to show that there are $C > 0$ and $\delta > 0$ such that for all $q \in \mathbb{N}$ and $a \in A_q$, (3.17) $S(q, P) \le C q^{k-\delta}$.
Therefore, if $q = p_1^{j_1} \cdots p_m^{j_m}$ for some distinct prime numbers $p_j$, then Since $\omega(q)$, the number of distinct prime factors of $q$, satisfies (see e.g. [17, Theorem 2.10]) Hence, it is enough to prove (3.17) for $q = p^j$ with $p$ being a prime number and $j \ge 1$. Since for any arithmetic function $F$, we have where for $\sigma \in \{0, 1\}^{k''}$, we have set Observe that To obtain a contradiction, suppose that $q < Q$. Let $\gamma_0 \in \Gamma$, $|\gamma_0| = 1$ be such that $q_{\gamma_0} = Q$. Thus $q \mid p^{j-\sigma_1}$. For any $r \in \mathbb{N}^k_q$ we can write a γ x γ .

Multipliers.
In this section we develop some estimates on discrete Fourier multipliers corresponding to operators M N and H N .

Averaging operators.
For a function f : Z d → C with a finite support we have where m N is the discrete Fourier multiplier By (1.1) and the prime number theorem, Next, let us define where |B| denotes Euclidean measure of B. By a multi-dimensional version of van der Corput's lemma (see [22,Proposition 2.1]) we have |Φ N (ξ)| min 1, where A is the matrix defined in (2.8). Moreover, Therefore, for N < N ′ ≤ 2N, we have We start with the following proposition.
The constant c is absolute.
Proof. Observe that for a prime number $p$, $p \mid q$ if and only if $(p \bmod q, q) > 1$. Hence, for each $s \in \{1, \ldots, k''\}$, we have Let $\theta = \xi - a/q$. Then by (4.4), Since for $(u, p) \in \mathbb{N}^{k'} \times \mathbb{P}^{k''}$ such that $u \equiv r' \bmod q$, and $p \equiv r'' \bmod q$, By the partial summation we obtain (4.7) where for $x \ge 2$, we have set $\vartheta(x; q, r) = \sum_{p \in \mathbb{P}_x,\ p \equiv r \bmod q} \log p$.
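The function $\vartheta(x; q, r)$ can be tabulated directly for small parameters, which gives a feeling for the equidistribution statement that the Siegel-Walfisz theorem makes quantitative: for $(r, q) = 1$ the value is close to $x/\varphi(q)$. The sketch below is only a numerical illustration, it assumes that $\mathbb{P}_x$ denotes the primes not exceeding $x$, and its function names are hypothetical.

```python
import math


def primes_up_to(n: int) -> list:
    """Sieve of Eratosthenes: all primes p <= n."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytes(len(range(i * i, n + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]


def theta_in_progression(x: int, q: int, r: int) -> float:
    """Sum of log p over primes p <= x with p congruent to r modulo q."""
    return sum(math.log(p) for p in primes_up_to(x) if p % q == r % q)


def totient(n: int) -> int:
    """Euler's totient: the number of 1 <= r <= n coprime to n."""
    return sum(1 for r in range(1, n + 1) if math.gcd(r, n) == 1)


if __name__ == "__main__":
    x, q = 100_000, 5
    for r in range(1, q):
        # Each reduced residue class carries roughly x / phi(q) of the mass.
        print(r, theta_in_progression(x, q, r), x / totient(q))
```

Residue classes that are not coprime to $q$ contain at most one prime, which matches the observation at the start of the proof.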
Analogously, we can write Furthermore, in view of the Siegel-Walfisz theorem ( [20,23], see also [17,Corollary 11.21]), there are C, c > 0 such that for all x ≥ 2, (r, q) = 1 and 1 ≤ q ≤ (log x) 2β ′ , Hence, by (4.7), (4.8) and (4.5), we obtain Thus, In view of (4.1), similar arguments applied to the sums over p 2 , . . . , p k ′′ lead to By [12, Proposition 3.1], the number of lattice points in B N at the distance < q from the boundary of B N is O(qN k−1 ). Moreover, for each (x, y) ∈ [0, 1] k , and (qu + qx, v + y) ∈ B N , we have Hence, by (4.6) and (4.1), Finally, another application of the mean value theorem allows us to replace the sums by the corresponding integrals. Indeed, we have which is again bounded by qN k−1 L. Therefore, In particular, taking ξ = 0, a = 0 and L = 1, we obtain This completes the proof.

Lemma 3.
For each α > 0 there is C > 0 such that for all N ∈ N, and ξ ∈ T d satisfying where 1 ≤ q ≤ L, a ∈ A q , and 1 ≤ L ≤ exp c √ log N (log N) −α , we have Proof. Given α > 0, let β ′ ≥ dβ α , where β α is the value determined in Theorem 2.

Lemma 4.
For all $p \in [1, \infty)$, $N_1, N_2 \in \mathbb{N}$, $N_1 < N_2$, and any $f \in \ell^p(\mathbb{Z}^d)$,
Proof. Let us denote by $m_n$ the convolution kernel corresponding to $M_n$. Consider $(x, y) \in \mathbb{N}^{k'} \times \mathbb{P}^{k''}$. If $(x, y) \in B_{N_1}$ then If $(x, y) \in B_{N_2} \setminus B_{N_1}$ then by setting $n_0 = \min\{n \in \mathbb{N} : x \in B_n\}$, we have Therefore, and hence, by Young's inequality, which finishes the proof since $\vartheta_B(N_1) \simeq N_1^k$.

4.2. Truncated discrete singular operators.
In this section we investigate the asymptotic of Fourier multipliers corresponding to the truncated discrete singular operators $H_N$ with a kernel $K$ satisfying (1.2) and (1.3). Let $h_N$ be the Fourier multiplier corresponding to $H_N$, that is for a finitely supported function $f : \mathbb{Z}^d \to \mathbb{C}$, We also define In view of a multi-dimensional version of van der Corput's lemma (see [22, Proposition 2 Moreover, by (1.3), Hence, We start with a proposition analogous to Proposition 4.1.

Proposition 4.2.
For each $\beta' > 0$ there are $C, c > 0$ such that for all $N < N' \le 2N$, and $\xi \in \mathbb{T}^d$ satisfying
Proof. For a prime number $p$, $p \mid q$ if and only if $(p \bmod q, q) > 1$. Therefore, by (1.1), (1.2), and the prime number theorem, for any $s \in \{1, \ldots, k''\}$, To simplify the notation, for $(x, y) \in \mathbb{R}^k \setminus \{0\}$, we set $F(x, y) = e^{2\pi i \theta \cdot Q(x,y)} K(x, y)$, where $\theta = \xi - a/q$. For any $(u, p) \in \mathbb{N}^{k'} \times \mathbb{P}^{k''}$ such that $u \equiv r' \bmod q$, and $p \equiv r'' \bmod q$, we have By the partial summation Analogously, we have Hence, by (4.9) and (1.2), we obtain Therefore, By similar reasoning applied to the sums over $p_2, \ldots, p_{k''}$, one can show that , thus by the mean value theorem, we obtain

Moreover, in view of [12, Proposition 3.1], the number of lattice points in
Lastly, we can replace the sums by the corresponding integrals because which is bounded by qN −1 L.
Analogously to Lemma 3, we can prove the following statement.

Lemma 5.
For each $\alpha > 0$ there is $C > 0$ such that for all $N \le N' \le 2N$, and $\xi \in \mathbb{T}^d$ satisfying
Lemma 6. For all $p \in [1, \infty)$, $N_1, N_2 \in \mathbb{N}$, $N_1 < N_2$, and any $f \in \ell^p(\mathbb{Z}^d)$,
Proof. Let $h_n$ denote the convolution kernel corresponding to $H_n$. Observe that for $(x, y)$ otherwise the sum equals zero. Thus, by (1.2), we obtain hence, by Young's inequality,

Variational estimates.
In this section we present the estimates for the $\ell^p(\mathbb{Z}^d)$ norm of the $r$-variational seminorm for the averaging operators $(M_N : N \in \mathbb{N})$ and the truncated discrete singular operators $(H_N : N \in \mathbb{N})$. In order to give a unified approach, we set $(Y_N : N \in \mathbb{N})$ to be any of them. By $(y_N : N \in \mathbb{N})$ we denote the corresponding discrete Fourier multipliers and by $(\Upsilon_N : N \in \mathbb{N})$ their continuous counterparts. We start by listing properties that are sufficient to obtain $r$-variational estimates. Let $\rho \in (0, 1)$ and set $N_n = 2^{n^\rho}$.
Property 1. In view of [11] (see also [9]) for each $p \in (1, \infty)$ there is $C_p > 0$ such that for all $r \in (2, \infty)$ and any function $f \in L^p(\mathbb{R}^d) \cap L^2(\mathbb{R}^d)$, where $A$ is the matrix defined in (2.8).
Property 3. By Lemma 4 and Lemma 6 we deduce that for each $p \in (1, \infty)$ and any $f \in \ell^p(\mathbb{Z}^d)$, because by (4.10), In particular,
Property 4. By Theorem 2 and partial summation for each $\alpha > 0$, there is $\beta_\alpha > 0$ so that for any $\beta > \beta_\alpha$, and $n \in \mathbb{N}$, if there is $\gamma_0 \in \Gamma$, such that for some coprime numbers $a$ and $q$ such that $1 \le a \le q$, and (log
Property 5. By Proposition 4.1 and Proposition 4.2, for each $\beta' > 0$ there is $C > 0$ such that for all $n \in \mathbb{N}$, and $\xi \in \mathbb{T}^d$, satisfying
Property 6. By Lemma 3 and Lemma 5, for each $\alpha > 0$, all $n \in \mathbb{N}$, and $\xi \in \mathbb{T}^d$, satisfying
Before we embark on proving variational estimates, we show the following auxiliary result.
Proposition 5.1. For each $p \in (1, \infty)$ there is $C > 0$, such that for all increasing sequences of integers $(n_j : j \in \mathbb{N})$ and any function $f$
Proof. For each $j \in \mathbb{N}$, such that $2^{n-1} \le n_j < 2^n \le 2^m \le n_{j+1} < 2^{m+1}$, we write For every $j_1, j_2 \in \mathbb{N}$, $j_1 < j_2$ such that $n_{j_1-1} < 2^n \le n_{j_1} < n_{j_2} < 2^{n+1} \le n_{j_2+1}$, we estimate Hence, for some increasing sequence of integers $(m_j : j \in \mathbb{N})$, we have The conclusion now follows by [5] and Property 1.
The aim of this section is to prove the following theorem.
Theorem 4. For each p ∈ (1, ∞) and r ∈ (2, ∞) there is C > 0 such that for any finitely supported function f : We split a variational seminorm into two parts long variations V L r , and short variations respectively. Then We first estimate ℓ p -norm of long variations.
We begin with p = 2 and s < n ≤ 2 κ s .

Theorem 5.
For each β ∈ N there is C > 0 such that for all s ∈ N 0 , r ∈ (2, ∞) and any finitely supported where δ is determined in Theorem 3.
Proof. First, let us see that for each m > s, the supports of the functions η m (· − a/q) are disjoint while a/q varies over R β s . Indeed, otherwise there would be a/q, a ′ /q ′ ∈ R β s , a ′ /q ′ ≠ a/q and ξ ∈ T d , such that η m (ξ − a/q) > 0 and η m (ξ − a ′ /q ′ ) > 0. Hence, which is impossible. Next, we consider the following multiplier Let us see that Λ β n,s is sufficiently close to (y N n −y N n−1 )Ξ β j,s . For each a/q ∈ R β s , we have q ≤ exp c 2 √ log N n , thus by (5.5), on the support of η n (· − a/q) we can write Therefore, F −1 (y N n − y N n−1 )Ξ β n,s − Λ β n,s f ℓ 2 ≤ Cn −1−βδρ f ℓ 2, and hence, Therefore, our task is reduced to showing boundedness of the first term on the right-hand side of (5.12).
Observe that for n > s, η n = η n η s , thus we can write Now, in view of Lemma 1, where I i j = j2 i , j2 i + 1, . . . , ( j + 1)2 i − 1 . Let us consider a fixed i ∈ {0, . . . , κ s }. To bound the norm of the square function on the right-hand side of (5.13), we first study its continuous counterpart, that is thus by Property 2, Therefore, Now, by Proposition 5.1, we have thus, in view of (2.10), we conclude that Therefore, by (5.13), we arrive at the Finally, by Plancherel's theorem and hence, by Theorem 3, which together with (5.14) and (5.12) concludes the proof.
Theorem 6. For each β ∈ N and p ∈ (1, ∞) there is C > 0, such that for all s ∈ N 0 , r ∈ (2, ∞), and any finitely supported function f : Proof. For the proof, let us consider the following multiplier Fix s < n 1 < n 2 ≤ min 2 κ s , 2n 1 . Let J n 1 = N n 1 2 −3χ √ log N n 1 . We claim the following holds true.

Claim 2.
For each β ∈ N and p ∈ (1, ∞) there is C > 0, such that for all n 1 ≤ n ≤ n 2 ≤ 2n 1 , The constant C is independent of n 1 and n 2 .

Claim 3.
For each β ∈ N and p ∈ (1, ∞) there is C > 0, such that for all s ∈ N 0 , we have Let us see that (5.20) suffices to finish the proof of the theorem. Indeed, (5.19) together with (5.20) imply that V r n j=n 1 Therefore, by (2.2) and Minkowski's inequality It remains to prove Claim 3. By Lemma 1, we can write Let us fix i ∈ {0, 1, . . . κ s }. In view of Proposition 5.1, where the implied constant is independent of i. Hence, by (2.10), we obtain which together with (5.21) implies (5.20).
We now turn to studying the part of the variational seminorm where 2 κ s < n. For s ∈ N 0 we set Q s = e (s+1) ρ/10 ! Theorem 7. For each β ∈ N there is C > 0, such that for all r ∈ (2, ∞), s ∈ N 0 , and any finitely supported where δ is determined in Theorem 3.
By Plancherel's theorem, for any u ∈ N d Q s and a/q ∈ R β s , we have because in view of (5.1), for each ξ ∈ T d , Therefore, Since the set R β s has at most e (d+1)(s+1) ρ/10 elements, and (d + 1)(s + 1) ρ/10 + (s + 1) ρ/10 e (s+1) ρ/10 − log 2 2d 2 κ s ≤ −(s + 1) ρ , Hence, Let us observe that the functions x → I(x, y) and x → J(x, y) are Q s Z d -periodic. Therefore, by repeated change of variables, we get By [12,Proposition 4.1] (see also [15,Proposition 3.2]), Property 1 entails that for each u ∈ N d Q s , we have Observe that Since by Theorem 3 and disjointness of supports of ̺ s (· − a/q) while a/q varies over R β s , we get which together with (5.24) implies (5.23) and the proof of theorem is completed.
Proof. First, we are going to refine Claim 4.