Non-uniform Berry-Esseen bound by unbounded exchangeable pairs approach

In this paper, a new technique is introduced to obtain non-uniform Berry-Esseen bounds for normal and non-normal approximations by unbounded exchangeable pairs. This technique does not rely on the concentration inequalities developed by Chen and Shao [4,5] and can be applied to quadratic forms and to the general Curie-Weiss model.


§1 Introduction
Since Charles Stein presented his ideas in the seminal paper [15], there has been a great deal of research activity around Stein's method. Stein's method is a powerful tool for bounding the error of normal and non-normal approximations. The reader may refer to Chatterjee [2] for recent developments of Stein's method.
While much of the work on Stein's method concerns uniform error bounds, the method has proved powerful for non-uniform error bounds as well. Using Stein's method, Chen and Shao [4,5] obtained non-uniform Berry-Esseen bounds for independent and locally dependent random variables. The key ingredient in their work is a concentration inequality, which is also closely connected with another approach, the exchangeable pairs approach.
The exchangeable pairs approach has become an important topic within Stein's method. Let W be the random variable under study. The pair (W, W′) is called an exchangeable pair if (W, W′) and (W′, W) share the same distribution. With ∆ = W − W′, Rinott and Rotar [12] and Shao and Su [11] obtained a Berry-Esseen bound for the normal approximation when ∆ is bounded. When ∆ is unbounded, Chen and Shao [6] provided a Berry-Esseen bound and obtained the optimal rate for an independence test. The concentration inequality plays a crucial role in these studies, such as Shao and Su [11] and Chen and Shao [6]. Recently, Shao and Zhang [13] made a significant step forward for unbounded ∆ without using the concentration inequality. They obtained a simple bound, as seen from the following result.

Theorem 1. (Shao and Zhang [13]) Let (W, W′) be an exchangeable pair with ∆ = W − W′, and suppose the relation E(∆|W) = λ(W + R) holds a.s. for some constant λ ∈ (0, 1) and a random variable R. Then,
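As an illustration of the regression condition in Theorem 1, consider the classical case where W is a standardized sum of i.i.d. variables and the pair is built by replacing a uniformly chosen coordinate with an independent copy. The following sketch (with an assumed sample size n = 10; not part of the paper) verifies E(∆|X) = λW exactly, with λ = 1/n and R = 0:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10
x = rng.standard_normal(n)           # one realization of X_1, ..., X_n
w = x.sum() / np.sqrt(n)             # W = (X_1 + ... + X_n) / sqrt(n)

# W' replaces X_theta by an independent copy X'_theta with mean zero, so
# E(Delta | X) = average over theta of X_theta / sqrt(n)
cond_exp = np.mean([x[i] / np.sqrt(n) for i in range(n)])

lam = 1.0 / n
assert np.isclose(cond_exp, lam * w)  # E(Delta|W) = (1/n) W, i.e. R = 0
```

The identity is exact here because averaging out X′_θ kills the replacement term; only the "leave one out" average of the original coordinates remains.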
In this paper, inspired by the idea of Shao and Zhang [13], we extend their results and obtain a non-uniform Berry-Esseen bound for unbounded exchangeable pairs by introducing new techniques. In addition, Chatterjee and Shao [3] introduced a new approach for non-normal approximation by Stein's method in the case of bounded ∆. When ∆ is unbounded, Shao and Zhang [13] obtained Berry-Esseen bounds for non-normal approximation. In this paper, we extend their result to the non-uniform case.
The contribution of this paper is threefold. First, we introduce a new technique to obtain non-uniform Berry-Esseen bounds for unbounded exchangeable pairs; our proof does not rely on the concentration inequality. Second, we present a non-uniform Berry-Esseen bound for non-normal approximation. As far as we know, there are only a few results in this area; Shao, Zhang and Zhang [14] obtained a Cramér-type moderate deviation for non-normal approximation. Finally, we apply our results to quadratic forms and to the general Curie-Weiss model.
The paper is organized as follows. We present the main result in Section 2. Some technical lemmas and the proof of the main result are given in Section 3. The applications of our result are collected in Section 4.

§2 Main result

In this section, we recall some notions and notation related to Stein's method; further details can be found in Shao and Zhang [13]. We then state our main result.
Let g(x), x ∈ R, be a function of class C² satisfying the following conditions:
(A1) g(x) is non-decreasing, and xg(x) ≥ 0 for x ∈ R;
(A2) g′(x) is absolutely continuous and 2(g′(x))² − g(x)g″(x) ≥ 0 for all x ∈ R;
(A3) lim_{x↓−∞} g(x)p(x) = 0 and lim_{x↑+∞} g(x)p(x) = 0, where p(x) is the density function given in (2).
From Shao and Zhang [13], we know that if (A1)∼(A3) hold, then f_z(x) has the following properties for any fixed z ∈ R: For a random variable W, applying Stein's equation and taking expectations on both sides, we have: Before presenting our main result, we introduce another condition that we require g(x) to satisfy:
(A4) There exist a number τ ∈ (0, 1) and a positive constant K_τ such that
There is a large class of functions g satisfying condition (A4) in addition to conditions (A1)∼(A3). A typical example is g(x) = sgn(x)|x|^α, where α ≥ 1 is a real number.
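The example g(x) = sgn(x)|x|^α can be spot-checked numerically. The following sketch (with the assumed choice α = 3, so g(x) = x³; not part of the paper) verifies (A1) and the inequality in (A2) on a grid; for this g one has 2(g′)² − gg″ = 18x⁴ − 6x⁴ = 12x⁴ ≥ 0:

```python
import numpy as np

alpha = 3.0
xs = np.linspace(-5, 5, 1001)

g = np.sign(xs) * np.abs(xs) ** alpha                       # g(x) = sgn(x)|x|^alpha
gp = alpha * np.abs(xs) ** (alpha - 1)                      # g'(x) = alpha|x|^(alpha-1)
gpp = alpha * (alpha - 1) * np.sign(xs) * np.abs(xs) ** (alpha - 2)  # g''(x)

assert np.all(np.diff(g) >= 0)                  # (A1): g is non-decreasing
assert np.all(xs * g >= 0)                      # (A1): x g(x) >= 0
assert np.all(2 * gp ** 2 - g * gpp >= -1e-12)  # (A2): 2(g')^2 - g g'' >= 0
```

The same check can be repeated for any other α ≥ 1 by changing the assumed parameter at the top.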
Let X be a random vector in R^n and W = φ(X) the random variable of interest, where φ is a measurable function. Denote by F(z), z ∈ R, the distribution function whose density function is defined by (2). We now present our main result.
Theorem 2. Let X be a random vector, W = φ(X), and let (W, W′) be an exchangeable pair with ∆ = W − W′ satisfying

E(∆|X) = λ(g(W) + R) a.s. (3)

for some constant λ ∈ (0, 1) and a random variable R. Assume that g(x), x ∈ R, satisfies (A1)∼(A4) and Eg²(W) < ∞. Then, for any z ∈ R,
Here C is a constant depending on τ.

Remark 1. Shao and Zhang [13] provided a Berry-Esseen bound for non-normal approximation similar to (1); Theorem 2 is a non-uniform refinement of their result.
Here, Φ(z), z ∈ R, is the standard normal distribution function. The "leave one out" technique for sums of independent variables is, however, very similar to the exchangeable pair technique. If we begin with (9) (see the proof of the main result) and use some results from Chen and Shao [4], it is not difficult to obtain (5). In some applications, such as the quadratic forms treated later in this paper, where relation (3) is satisfied with R = 0 and g(x) = x, the non-uniform factor C/(1 + |z|) in (4) can be improved significantly by replacing it with C/(1 + |z|)².

§3 Proof of Theorem 2

Proof. In what follows, C denotes a constant whose value may change from occurrence to occurrence.
From Shao and Zhang [13], it is known that and Observe that Then we obtain Combining (6) and (7), for z > 0, we have For z ≤ 0, using (6) and (8), we have The only difference between (9) and (10) lies in the expectation involving E(∆∆*|X). To prove (4), we first assume that z > 0.
Applying Cauchy's inequality to the first term of (9) yields We will now show that Since g(x), x ∈ R, satisfies (A4), for the τ in (A4) we have Thus we conclude that F(x)g(x)/p(x) + 1 is bounded on (−∞, 0) and the bound does not depend on z. We notice further that Hence By (13), we have We notice that and C depends on τ.
Next, we use the fact that ∥f′_z∥ ≤ 1 to see that which completes the proof of (12) for z > 0. By (11) and (12), we have Using Cauchy's inequality for the second term of (9), we find We will show that Since we know that g( Thus we have proved (17) for z > 0. By (16) and (17), we have For the third term of (9), we obtain 1/(2λ) By Markov's inequality, Then (19) becomes From Shao, Zhang and Zhang [14], we know that for z ∈ R. For the last term of (9), we have To show that (22) holds, it suffices to consider z ≥ 0, where |g(z)| = g(z). There is z_0 ∈ (0, +∞] such that is bounded for z > z_0, and 1/g(z) = (g(z)+1)/g(z) · 1/(1+g(z)) ≤ C/(1+g(z)) for z > z_0. Also, for From (9), (15), (18), (20) and (22), it follows that (4) holds for z > 0. For z ≤ 0, we start from (10) and use Cauchy's inequality. For the third term of (10), it is easy to see that 1/(2λ) For the last term of (10), in view of (21), Thus we only need to prove (12) and (17) for z ≤ 0. For z ≤ 0, we have, for any τ ∈ (0, 1), that By the same arguments as above, we obtain Then, following similar steps as in the proof for z ≥ 0, we establish (12) for z ≤ 0. To prove (17) for z ≤ 0, it suffices to show that E(I(W ≤ z) − F(z))² ≤ C/|g(z)|² for z ≤ 0. Indeed, by (25) and Markov's inequality, This completes the proof of Theorem 2.

§4 Applications

Quadratic forms
Let X_1, X_2, ..., X_n be i.i.d. random variables with zero mean, unit variance and a finite fourth moment. Let A = (a_ij)_{1≤i,j≤n} be a real symmetric matrix with a_ii = 0, and let W_n be the corresponding normalized quadratic form. This is a classical example which has been widely discussed in the literature. For example, de Jong [7] obtained the asymptotic normality of W_n, Chatterjee [1] gave an L^1 bound, and Götze and Tikhomirov [10] studied the Kolmogorov distance between the distribution of W_n and that of the same quadratic form with the X_i replaced by corresponding Gaussian random variables. Shao and Zhang [13] established the following bound: The next theorem is a non-uniform refinement of this bound.
Theorem 3. Let X_1, X_2, ..., X_n be i.i.d. random variables with zero mean, unit variance and a finite fourth moment. Let A = (a_ij)_{i,j=1}^n be a real symmetric matrix with a_ii = 0 for all i. Then, where C is a constant depending only on EX_1^4.
Proof. Let (X′_1, X′_2, ..., X′_n) be an independent copy of (X_1, X_2, ..., X_n), and let θ be a discrete random variable uniformly distributed over the set {1, 2, ..., n} and independent of all other random variables. Define Then (W, W′) is an exchangeable pair. It is easy to see that and These relations imply that condition (3) is satisfied with g(x) = x, λ = 2/n and R = 0. By Shao and Zhang [13], and Note that EX_1^4 < ∞ and EW_n^4 < C for all n = 1, 2, .... Then, For the τ involved in (A4), we can take, for example, τ = 1/2 and derive that By (14), we have Then, Using the same arguments as in the proof of the main result, we find that Hence the factor C/(1 + |z|) in (4) can be improved by replacing it with C/(1 + |z|)². Thus, by Theorem 2 and in view of (27) and (28), the proof of the theorem is complete.
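The regression identity used in this proof can be verified exactly for a fixed realization: averaging ∆ over the uniform index θ and over X′_θ (which has mean zero) gives E(∆|X) = (2/n)W. A sketch (with an assumed matrix size n = 8 and an assumed normalization σ; the identity holds for any choice of σ):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8

# random symmetric matrix with zero diagonal, as in Theorem 3
a = rng.standard_normal((n, n))
a = (a + a.T) / 2
np.fill_diagonal(a, 0.0)

x = rng.standard_normal(n)              # one realization of (X_1, ..., X_n)
sigma = np.sqrt(2 * np.sum(a ** 2))     # assumed normalization constant
w = x @ a @ x / sigma                   # W = (1/sigma) * sum_{i,j} a_ij X_i X_j

# E(Delta | X): replacing X_theta by a mean-zero copy and averaging over theta
# leaves (1/n) * sum_t 2 X_t (A x)_t / sigma
cond_exp = np.mean([2.0 * x[t] * (a[t] @ x) / sigma for t in range(n)])

assert np.isclose(cond_exp, (2.0 / n) * w)  # condition (3): g(x)=x, lambda=2/n, R=0
```

Since a_ii = 0 and A is symmetric, each coordinate enters W linearly once the rest are fixed, which is why the conditional expectation is exactly (2/n)W with no remainder R.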

General Curie-Weiss model
The Curie-Weiss model is important in statistical physics and has been extensively discussed in the literature. For some history and the first asymptotic results, the reader is referred to Ellis and Newman [8], [9]. Using the exchangeable pair approach, Chatterjee and Shao [3] studied a version of the Curie-Weiss model. Shao and Zhang [13] studied a general Curie-Weiss model and obtained the optimal convergence rate. In this subsection, we refine the bound of Shao and Zhang [13] to the non-uniform case.
Let L(x), x ∈ R, be a distribution function satisfying the following conditions: For a positive integer k and a real number λ, we say that L is of type k with strength λ if where, recall, Φ(x), x ∈ R, is the standard normal distribution function.
Let (X_1, ..., X_n) be a random vector with joint distribution function P_{n,β}(x), x = (x_1, ..., x_n) ∈ R^n, such that where K_n is the normalizing constant. Let ξ be a random variable with distribution function L. Moreover, assume that: (1) for 0 < β < 1, there exists a constant b > β such that (2) for β = 1, there exist constants b_0 > 0, b_1 > 0 and b_2 > 1 such that We have the following result.

Theorem 4. Suppose that the distribution function of the random vector (X_1, X_2, ..., X_n) is given by (30), where L satisfies (29), and let S_n = X_1 + · · · + X_n.
where F_1(z), z ∈ R, is the distribution function of a random variable Z_1 ∼ N(0, 1/(1 − β)) and C is a constant depending on b and β.
Proof. Recall that S_n = X_1 + · · · + X_n. We first construct an exchangeable pair as follows. For fixed i, 1 ≤ i ≤ n, given {X_j, j ≠ i}, let X′_i be a random variable that is conditionally independent of X_i and has the same conditional distribution as X_i. Let θ be a random index uniformly distributed over {1, ..., n} and independent of all other random variables. Let S′_n = S_n − X_θ + X′_θ. Then (S_n, S′_n) is an exchangeable pair. When 0 < β < 1, let W_n = S_n/√n and W′_n = S′_n/√n. Then (W_n, W′_n) is also an exchangeable pair. By Shao and Zhang [13], the following relations are satisfied: Here C depends on β and b. Thus (3) is satisfied with g(x) = (1 − β)x and λ = 1/n. Using (36), (37), (38) and Theorem 2, we obtain (33).
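For the classical ±1-spin Curie-Weiss model (L the symmetric two-point law, and assuming the Gibbs weight exp(βS_n²/(2n))), the conditional mean of a resampled spin is tanh(β(S_n − X_i)/n), and expanding tanh to first order indicates why (3) holds approximately with g(x) = (1 − β)x and λ = 1/n. A numerical sketch with assumed parameters β = 0.5 and n = 2000, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(2)
beta, n = 0.5, 2000

x = rng.choice([-1.0, 1.0], size=n)   # one spin configuration
s = x.sum()
w = s / np.sqrt(n)                    # W_n = S_n / sqrt(n)

# Under the assumed Gibbs weight exp(beta*S_n^2/(2n)) for +-1 spins,
# E(X'_i | X_j, j != i) = tanh(beta * (S_n - X_i) / n).
cond_exp = np.mean([(x[i] - np.tanh(beta * (s - x[i]) / n)) / np.sqrt(n)
                    for i in range(n)])

# tanh(t) = t + O(t^3), so E(Delta | X) ≈ (1/n) * (1 - beta) * W_n,
# matching g(x) = (1 - beta)x and lambda = 1/n
assert abs(cond_exp - (1 - beta) * w / n) < 1e-5
```

The approximation error comes from the cubic remainder of tanh and from the O(1/n) self-interaction term, both of which are negligible at this scale; for the general model the random variable R in (3) absorbs these terms.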