Local linear conditional cumulative distribution function with mixing data


This paper investigates the conditional cumulative distribution function of a scalar response given a functional random variable, for an α-mixing stationary sample, using a local polynomial technique. The main purpose of this study is to establish asymptotic normality results under selected mixing conditions, satisfied by many time-series models, together with other appropriate conditions needed for the stated results.

The classical kernel approach cannot adequately estimate conditional models, because this technique suffers from a large bias, particularly in the boundary region.
However, the kernel approach can be improved using local polynomial smoothers, and especially local linear smoothers, because they correct the asymptotic bias that adversely affects the boundaries (see Fan and Gijbels [16] for more discussion of this subject in the real case). In recent years, there has been strong interest in local linear smoothers in infinite-dimensional spaces (see, for instance, Baíllo and Grané [1] and Barrientos-Marín et al. [2]). It should be noted that this last precursor work has been extended in many directions, including asymptotic properties (see Demongeot et al. [11,12] and Zhou and Lin [37]), the nature of the variables (see Demongeot et al. [14]), and the dependence type (see Demongeot et al. [10] and Laksaci et al. [27]).
In this regard, our interest in this paper is to give a result concerning the limit in distribution of the local linear estimate of the CDF. More precisely, we consider the case when the observations (X_i, Y_i)_{i≥0} are strongly mixing. We prove the asymptotic normality of a local linear estimator of the CDF by utilizing an appropriate form of Bernstein's blocking argument and a reduction analysis leading to the Lindeberg-Feller central limit theorem. We point out that this contribution has a potential impact in practice as well as in theory. Indeed, from a practical point of view, this asymptotic property is used to derive confidence intervals or to perform statistical tests. On the other hand, from a theoretical point of view, asymptotic normality is a basic ingredient for determining the mean quadratic error or for studying the uniform integrability of the estimator.
Accordingly, this work is structured as follows: Sect. 2 presents the model under study, describes the estimation method by giving the explicit solution to the minimization problem, and provides some basic assumptions and notations. Section 3 states the main asymptotic normality results for the conditional distribution function estimator. Finally, Sect. 4 discusses the applicability of the asymptotic results to some statistical problems, such as the determination of confidence intervals. Detailed proofs of the main results are postponed to the appendix.

The model
The model is defined in the following way. Assume that (X_i, Y_i)_{1≤i≤n} is a stationary α-mixing process, where the X_i are random variables with values in a functional space F and the Y_i are real-valued. Throughout, F is a semi-metric space endowed with a semi-metric d(·, ·). For x ∈ F, the conditional cumulative distribution function of Y_i given X_i = x is classically written as F^x(y) = P(Y_i ≤ y | X_i = x), for y ∈ R. This distribution is assumed to be absolutely continuous with respect to the Lebesgue measure on R.

The estimate
The conditional cumulative distribution function F^x is estimated by â, where the couple (â, b̂) is obtained by the optimization rule: where β(·, ·) and δ(·, ·) are known functions from F² into R, K is a kernel, H is a cumulative distribution function, and h_K and h_H are the bandwidth parameters. Moreover, if the bi-functional operator β is such that β(z, z) = 0 for all z ∈ F, then the quantity F̂^x(y) is explicitly defined by the following: Several asymptotic properties of this estimator have recently been obtained. It turns out that the existing literature addresses either almost-complete consistency or the mean-square error (see Demongeot et al. [12]).
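As a numerical illustration of the local linear fit described above, the sketch below implements the explicit ratio form used in this literature (see, e.g., Barrientos-Marín et al. [2] and Demongeot et al. [12]), namely F̂^x(y) = Σ_{i,j} w_ij K_i K_j H_j(y) / Σ_{i,j} w_ij K_i K_j with w_ij = β_i(β_i − β_j), computed here through the equivalent weighted least-squares intercept. The choice β(z, x) = δ(z, x) = L² distance on discretized curves, the quadratic kernel, the smooth distribution function H, and the data are illustrative assumptions, not the paper's specification.

```python
import numpy as np

def local_linear_cdf(X, Y, x, y, h_K, h_H):
    """Local linear estimate of F^x(y) = P(Y <= y | X = x).

    X : (n, d) array of discretized curves, Y : (n,) responses.
    Illustrative choices: beta(z, x) = delta(z, x) = L2 distance,
    quadratic kernel K, and a smooth distribution function H.
    """
    dist = np.linalg.norm(X - x, axis=1)            # delta(x, X_i)
    K = np.maximum(1.0 - (dist / h_K) ** 2, 0.0)    # kernel weights K_i
    beta = dist                                     # locating values beta(X_i, x)
    H = 0.5 * (1.0 + np.tanh((y - Y) / h_H))        # H((y - Y_j)/h_H)
    # Closed-form least-squares intercept, equivalent to the double-sum
    # ratio sum_{i,j} beta_i (beta_i - beta_j) K_i K_j H_j over the same
    # sum without H_j.
    s0, s1, s2 = K.sum(), (K * beta).sum(), (K * beta ** 2).sum()
    num = s2 * (K * H).sum() - s1 * (K * beta * H).sum()
    den = s0 * s2 - s1 ** 2
    if den <= 0:                                    # degenerate fit:
        return (K * H).sum() / s0                   # fall back to the kernel estimate
    return num / den
```

Note that, unlike the kernel estimator, the local linear estimate is not guaranteed to lie in [0, 1] at every y; it does, however, tend to 0 and 1 in the tails.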

Assumptions and notations
In what follows, x (resp. y) will denote a fixed point in F (resp. in R), N_x (resp. N_y) will denote a fixed neighborhood of x (resp. of y), and φ_x(r_1, r_2) = P(r_2 ≤ δ(X, x) ≤ r_1). Let G denote, for any l ∈ {0, 2}, the partial derivative ∂^l F^x(y)/∂y^l. We now state some conditions which ensure the asymptotic normality of (2):
(H1) (i) For any r > 0, φ_x(r) := φ_x(−r, r) > 0, and there exists a function χ_x(·) such that: (ii) For any l ∈ {0, 2}, the quantities G
The coefficients of the α-mixing sequence (X_i, Y_i)_{i∈N} satisfy the following two conditions:
(H6) The locating operator β satisfies the following two conditions:
(H8) lim_{n→+∞} q_n (n φ_x(h_K))^{1/2} α(v_n) = 0, where q_n is the largest integer such that q_n(r_n + v_n) = O(n).

Comments on the assumptions
It is observed that the assumptions listed above are standard in the FDA context. In particular, hypotheses (H1) and (H6) are not unduly restrictive and are common in the setting of functional local linear fitting (see Barrientos-Marín et al. [2] and Demongeot et al. [12], among others). Concerning the first part of (H1), the reader will find in Ferraty and Vieu [19] a deep discussion of the links between this assumption, the semi-metric d, and small-ball concentration properties. Moreover, this hypothesis intervenes in computing the exact constant terms involved in the asymptotic expansions; for example, it is used to evaluate the constants M_j = E(K_1^j), where j ∈ {1, 2}. The second part of (H1) is needed to evaluate the bias of the estimator in the asymptotic result. To avoid the expression of the covariance in the rate of convergence, assumptions (H3) and (H4) are required; in addition, hypothesis (H3) can be viewed differently, based on the idea of the maximum concentration between the quantities P(X_i ∈ B(x, h_K)) and P(X_j ∈ B(x, h_K)) (see Ferraty et al. [18]). Hypothesis (H4) is used to ensure the absolute convergence of the series Σ_{k∈Z} Cov(X_0, X_k). Conditions on the smoothing parameters h_K and h_H are standard and are stated in the theorem below. The boundedness of the kernel K in (H7)(i) is standard, and assumptions (H7)(ii) and (H8) are technical conditions imposed for brevity of the proofs. Furthermore, the role of assumption (H8) is to allow Bernstein's big-block and small-block technique to be used to prove asymptotic normality for the α-mixing sequence; the choice of the sequences (r_n) and (v_n) in hypothesis (H8) is not surprising, and another choice can be found in Masry [31].
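For concreteness, the small-ball probability behaviors classically discussed in Ferraty and Vieu [19] in connection with the first part of (H1) are of the following two types; these are standard examples from that reference, not results of the present paper:

```latex
% Fractal-type (finite-dimensional-like) processes:
\varphi_x(h) \sim C_x\, h^{\gamma}
\quad \text{as } h \to 0, \qquad \gamma > 0,
% Exponential-type (truly infinite-dimensional) processes,
% e.g. Gaussian measures with the sup-norm semi-metric:
\varphi_x(h) \sim C_x\, h^{\gamma} \exp\!\left(-C\, h^{-p}\right),
\qquad p > 0.
```

In the second case the concentration decays much faster, which is why the choice of the semi-metric d has a direct impact on the rate of convergence.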

Main results
Before stating the main results, the quantities M_j and N(a, b) are introduced in order to express the dominant bias and variance terms: where and where Remark 3.2 If we impose the additional assumption: and if, in addition, we replace the function φ_x(h_K) by its empirical estimator defined by the following: then the bias term can be canceled, yielding the following corollary: when assumptions (H1)-(H9) hold, the following asymptotic result is achieved:

Proof of Theorem 3.1
We start by writing

Denote by
then Relation (7) is important for establishing the asymptotic normality of F̂^x(y); moreover, the continuity of F^x ensures the asymptotic negligibility of B_n(x, y), and if F̂^x_D converges in probability to 1 as n → ∞, then is obtained. The proof of Theorem 3.1 is completed from the above expression and the following results, whose proofs are given in the appendix.

Confidence intervals
In parallel, the precise form of (3) is very useful for constructing confidence intervals for F^x(y) based on the normal-approximation method, which requires estimating the quantities M_1 and M_2 by the following empirical estimators: To obtain the asymptotic (1 − ξ) confidence interval for F^x(y), where 0 < ξ < 1, it is necessary to consider the following estimator of V_HK(x, y): In addition, a kernel K and a distribution function H are chosen to satisfy condition (H7), and the bandwidths h_K and h_H are selected by adapting the cross-validation method. The choice of the locating functions β(·, ·) and δ(·, ·) is an important step in the practical use of this approach. There are several ways to choose the operators β(·, ·) and δ(·, ·) (see Barrientos-Marín et al. [2] for some examples), but the appropriate choice is determined by the shape of the curves and depends on the purpose of the statistical study. For example, if the functional data are smooth curves, one can try the following family of locating functions: where x^(q) denotes the qth derivative of the curve x and θ(t) is the eigenfunction of the empirical covariance operator of the qth derivatives of the curves associated with the q-greatest eigenvalue. Finally, by Corollary 3.3, the asymptotic (1 − ξ) confidence interval for F^x(y) is given by the following: where λ_{ξ/2} is the ξ/2 quantile of the standard normal distribution.
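As a small numerical sketch of the normal-approximation interval described above: given a point estimate F̂^x(y), a plug-in variance estimate V̂_HK(x, y), and the effective sample size n φ̂_x(h_K), the interval is F̂^x(y) ± λ_{ξ/2} (V̂_HK(x, y)/(n φ̂_x(h_K)))^{1/2}. The function name and inputs below are illustrative; in practice the plug-in quantities would come from the empirical estimators defined above.

```python
from statistics import NormalDist

def normal_ci(F_hat, V_hat, n_phi, xi=0.05):
    """Asymptotic (1 - xi) confidence interval for F^x(y).

    F_hat : point estimate of F^x(y)
    V_hat : plug-in estimate of V_HK(x, y)
    n_phi : effective sample size n * phi_x_hat(h_K)
    """
    lam = NormalDist().inv_cdf(1.0 - xi / 2.0)   # lambda_{xi/2}
    half = lam * (V_hat / n_phi) ** 0.5
    # a CDF lies in [0, 1], so the interval may be clipped
    return max(0.0, F_hat - half), min(1.0, F_hat + half)
```

For instance, F̂ = 0.5, V̂ = 0.25, n φ̂ = 100, and ξ = 0.05 give a half-width of about 1.96 × 0.05 ≈ 0.098.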

Proof of Lemma 3.4.
Let us first note that, in view of (6), we have In the same way, using the definition of j , this equality can be rewritten as follows: Denote by and thus Finally, the rest of the proof is based on the following statements: Proof of (8).
Let us write the left-hand side of (8) as follows: Hence, by Slutsky's theorem (see Theorem 11.1.5 in [25]), (8) is a straightforward consequence of the following two claims: Proof of (10).
As a matter of fact, we need to evaluate the variance of T_{2,j}. For this, we have the following: Therefore, to prove that lim_{n→+∞} Var(T_{2,j}) = V_HK(x, y), it is necessary to establish the following results: Proof of (12). One has n Var(T_{2,1}) = Concerning the second term on the right-hand side of (14), we have the following: and by the continuity of F^x, we deduce that: Now, we turn to the first term on the right-hand side of (14). Let us begin by writing: In view of (15), classical computations on the second term on the right-hand side of (16) give: Concerning the first term on the right-hand side of (16), we use the following definition of the conditional variance: Thus, using an integration by parts followed by a change of variable, we get: and by the continuity of F^x, we deduce that Therefore, the second term on the right-hand side of (17) tends to (F^x(y))² as n tends to infinity. Finally, we have the following: Next, using Lemma A.1 of [37], we get:
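The step "integration by parts followed by a change of variable" invoked above is the standard computation for the conditional moment of H; assuming H is a differentiable distribution function and F^x is absolutely continuous, it reads:

```latex
\mathbb{E}\!\left[\,H\!\left(\tfrac{y - Y}{h_H}\right) \,\Big|\, X = x\right]
  = \int_{\mathbb{R}} H\!\left(\tfrac{y - z}{h_H}\right) dF^{x}(z)
  = \int_{\mathbb{R}} H^{(1)}(t)\, F^{x}(y - h_H t)\, dt,
```

where the boundary terms vanish because H and F^x are distribution functions, and the change of variable is t = (y − z)/h_H. By the continuity of F^x, the right-hand side tends to F^x(y) as h_H → 0.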

Proof of (13).
First, the sets E_1 and E_2 are defined by setting where m_n is a sequence of integers such that m_n → +∞ as n → +∞. Now, denote by Considering the sum of covariances over the set E_1, by stationarity: (18) Under (H7)(ii), Since |H_i(y) − F^x(y)| ≤ 1, according to (H3), the following inequality is obtained: Now, by applying Lemma A.1 of [37], we have the following: It follows, by (H5)(iii) and taking m_n = 1 Concerning the sum over the set E_2, Proposition A.10(ii) of [19] is used to get: First, we evaluate the quantity E(|L_i|^q). Conditioning on X_i and using the fact that |H_i − F^x(y)| ≤ 1, we obtain: Again, using Lemma A.1 of [37], we have: Finally, the obtained result is combined with assumption (H4)(ii) and the sequence m_n previously chosen to get: Now, the asymptotic normality of the conditional cumulative distribution estimator is established for dependent random variables: Remark that (13) implies that: Therefore, it suffices to show the following result: Bernstein's big-block and small-block procedure is employed, following arguments similar to those in Theorem 3.1 of Liang and Baek [29]. The set {1, 2, …, n} is split into 2κ_n + 1 subsets, with large blocks of size r_n and small blocks of size v_n, by putting κ_n = ⌊n / (r_n + v_n)⌋.
Assumption (H8)(ii) permits us to define the large block size as follows: Moreover, some easy computations using the same hypothesis give: and it can easily be deduced that, as n → +∞: In addition, if v_n is replaced by r_n, we obtain lim_{n→+∞} κ_n r_n / n = 1.
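The big-block/small-block partition used in this proof can be sketched as follows; the block sizes passed in are illustrative inputs, whereas in the proof r_n and v_n are the sequences from hypothesis (H8):

```python
def bernstein_blocks(n, r, v):
    """Partition {0, ..., n-1} into alternating big blocks (size r) and
    small blocks (size v), plus one remainder block of size n - k*(r + v),
    with k = n // (r + v) pairs, as in Bernstein's blocking argument."""
    k = n // (r + v)
    big, small = [], []
    for a in range(k):
        start = a * (r + v)
        big.append(list(range(start, start + r)))
        small.append(list(range(start + r, start + r + v)))
    rest = list(range(k * (r + v), n))
    return big, small, rest
```

For n = 20, r = 4, v = 2, this produces κ_n = 3 pairs of blocks plus one remainder block, i.e. 2κ_n + 1 = 7 subsets; the sums over the big blocks carry the limit distribution, while the small blocks and the remainder are asymptotically negligible.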

Proof of (22).
By Markov's inequality, it remains to establish that, for all ε > 0: To prove (24), observe that: Noting also that, by second-order stationarity, we retain: Consequently:
By assumption (H8), it is clear that: Concerning A_2, we have: which leads to For (25), we have the following: where μ_n = n − κ_n(r_n + v_n), and by the definition of κ_n, we have μ_n < r_n + v_n. Hence And, again by hypothesis (H8), we get

Proof of (23).
Making use of Volkonskii and Rozanov's lemma [36], the fact that the process (X_i, Y_i) is strongly mixing, and the fact that Υ_a is A_{i_a}^{j_a}-measurable with i_a = a(r_n + v_n) + 1 and j_a = a(r_n + v_n) + r_n, we have, with V_j = exp(itΥ_j/√n), the following: Consequently, according to formula (26), the Υ_j are asymptotically independent. Therefore, for the variance of S_{1,n}, we have the following: Var(S_{1,n}) = (κ_n r_n / n) Var(Z_1) Furthermore, from assumption (H8), κ_n r_n / n → 1 as n → +∞.
Finally, we get Now, to end the proof of (23), we focus on the central limit theorem due to Lindeberg. More precisely, by applying Lindeberg's version of the central limit theorem to the Υ_j, it suffices to show that, for any ε > 0: In view of the first summation of (21), classical computations give |Υ_j|/√n ≤ (r_n/√n) |Z_1|.
Therefore, for all ε > 0 and n large enough, the set {|Υ_j| > ε√(n V_HK(x, y))} becomes empty, and the proof of (23) is therefore complete.
Proof of (11). By the Bienaymé-Chebyshev inequality, it is sufficient to show that, for all ε > 0: In addition, the Cauchy-Schwarz inequality entails Then, (11) is a straightforward consequence of the following results:

Proof of (27).
First, we can write the following: For the first term on the right-hand side of this equality, we have the following: Concerning the second term of the previous equality, we have the following: The proof of this result is very close to the proof of (13). Specifically, keeping the same notation as in (13), and splitting the sum into two separate summations over the sets E_1 and E_2: where the sequence m_n is chosen such that m_n → +∞ as n → +∞. Denoting now by A_{1,n} and A_{2,n} the sums of covariances over E_1 and E_2, respectively, we have By stationarity, we have: Moreover, assumptions (H1) and (H6) imply that: On the other hand, we may apply Jensen's inequality and assumption (H3) to obtain: In the next step, we use the technical Lemma A.1 of [2] to get: Then, choosing m_n = √n, and since the latter quantity is bounded by assumption (H3), we arrive at: Let us now treat the sum over E_2. The application of the inequality for bounded mixing processes (see Proposition A.10(i) in [19]) for all l ≠ i leads to: On the other hand, using the fact that Σ_{j≥x+1} j^{−s} ≤ ∫_x^∞ u^{−s} du = ((s − 1) x^{s−1})^{−1}, and under (H4)(i), it is easy to get: Finally, we have:
We use the same choice of m n as before, and using assumption (H5)(i), we obtain:

Proof of (28).
We start by writing The first term on the right-hand side of (30) tends to V_HK(x, y) as n tends to infinity, as was shown in the proof of (13). Concerning the second term on the right-hand side of (30), we have: where an integration by parts followed by the change of variable t = (y − z)/h_H allows us to write: Moreover, the latter integral can be rewritten as follows: Now, under assumption (H7)(ii) and by the continuity of F^x, we have the following: In addition, by applying the technical Lemma A.1 of [37], we get lim_{n→+∞} Finally, assumption (H5)(ii) allows us to deduce that: Proof of (9). By following the same ideas as those used in (11), we show that:

Proof of (32).
To show the required result (32), it suffices to prove the L² consistency of T_{4,j}: Concerning the first term on the right-hand side of (34), we have the following: On the other hand, by exactly the same arguments as in (13), the second term on the right-hand side of (34) tends to 0 as n tends to infinity, and the desired result (32) is obtained.

Proof of Lemma 3.5
By the definition of F̂^x_D, we have the following: Let us write Finally, the claimed result will be obtained as soon as the following two claims have been checked: Proof of Claim 5.2 By following the same ideas as those used in the first claim, we show that: lim_{n→+∞} Var(T_{2,j}) = 0.
First, we have the following: Var(T_{2,j}) = Var(T_{2,1}) + 2 Σ_{i<j} Cov(T_{2,i}, T_{2,j}), where Var(T_{2,1}) = Second, we use the technical Lemma A.1 of [2] to get: then, by assumption (H5)(i), we have lim_{n→+∞} Var(T_{2,1}) = 0. Moreover: Let us now define the sets E_1 and E_2 as follows: where the sequence m_n is chosen such that m_n → +∞ as n → +∞, and we denote by A_{1,n} and A_{2,n} the sums of covariances over E_1 and E_2, respectively; then By stationarity and assumption (H3), we have: Now, we use the technical Lemma A.1 of [37] to obtain: The fact that the latter quantity is bounded by assumption (H3), together with the choice m_n = √n, permits us to get: A_{1,n} → 0 as n → ∞.
Concerning the sum over E_2, following the same ideas as those used in (29), we get:
We use the same choice of m_n as before, and by assumption (H5)(i), we obtain A_{2,n} → 0 as n → ∞.