Empirical likelihood inference for monotone index model

This paper proposes an empirical likelihood inference method for monotone index models. We construct the empirical likelihood function based on a modified score function developed by Balabdaoui et al. (Scand J Stat 46:517–544, 2019), where the monotone link function is estimated by isotonic regression. It is shown that the empirical likelihood ratio statistic converges to a weighted chi-squared distribution. We suggest inference procedures based on an adjusted empirical likelihood statistic that is asymptotically pivotal, and a bootstrap calibration with recentering. A simulation study illustrates the usefulness of the proposed inference methods.


Introduction
Single index models are widely used in statistics since they balance interpretability of the index coefficients in the parametric part against flexibility of regression modeling in the nonparametric part (see Ch. 8 of Li and Racine, 2007, for a review). Many estimation methods have been proposed for single index models, such as the semiparametric least squares estimator (Härdle et al., 1993; Ichimura, 1993), the M-estimator (Klein & Spady, 1993), and the average derivative estimator (Powell et al., 1989). Although these estimation methods have desirable theoretical properties under certain regularity conditions, they typically require some nonparametric smoothing method to evaluate the unknown link function. Such smoothing involves tuning parameters, such as bandwidth and series length parameters, whose optimal choice poses substantial theoretical and practical problems.
The monotone single index model, in which monotonicity is imposed on the link function, has been studied in recent years. Balabdaoui et al. (2016) showed that the least squares estimator of a monotone single index model generally converges at the cube root rate, but its asymptotic distribution is still unknown. The main difficulty in deriving the asymptotic distribution of the least squares estimator arises from the non-differentiability of the objective function: in a monotone single index model, the link function, which is an infinite-dimensional nuisance parameter, is generally estimated by a nonparametric approach such as isotonic regression, while the index part is parametrically modeled as a linear combination of the covariates. The derivative of the objective function with respect to the index coefficients is then intractable due to the non-smoothness of the estimated nuisance parameter.
To overcome this issue, Groeneboom and Hendrickx (2018) developed a score-type estimator for the current status model, which is a special case of monotone single index models. Their approach is based on an estimating equation that is the same as the first-order condition of the least squares estimator except that it ignores the derivative of the estimated link function. They proved $\sqrt{n}$-consistency and asymptotic normality of their estimator without any tuning parameter. Their result was extended to general monotone single index models by Balabdaoui et al. (2019), who derived $\sqrt{n}$-consistency and asymptotic normality for the parametric component and an $n^{1/3}/\log n$ convergence rate for the nonparametric estimator of the link function.
Although the score estimation approach is remarkable, its main drawback is that it requires smoothing parameters to estimate the asymptotic variance when implementing hypothesis testing and interval estimation. Because the estimating function in the score-type approach depends on the estimated link function, a conditional expectation is involved in the asymptotic variance. Moreover, the derivative of the link function also appears in the asymptotic variance even though the estimated link function is not smooth. Therefore, smoothing methods, such as kernel smoothing, are employed to estimate such quantities, which requires selecting multiple smoothing parameters and makes statistical inference cumbersome.
To address this problem, we propose an empirical likelihood inference method based on the score-type approach for monotone index models. We show that the empirical likelihood statistic based on the estimating equation of Balabdaoui et al. (2019) converges in distribution to a weighted chi-squared distribution. Even in our empirical likelihood approach, the conditional expectation mentioned above appears in the asymptotic distribution. To circumvent the selection of smoothing parameters, we adapt the bootstrap calibration method proposed by Hjort et al. (2009) to our context. Because the estimating equation has the estimated nuisance parameter plugged in, a classical naive bootstrap method is not asymptotically valid. Hjort et al. (2009) provided a modified bootstrap method by recentering and reweighting to deal with such a situation. Combining the empirical likelihood and modified bootstrap methods, our approach provides a simple and theoretically justified method for statistical inference in monotone single index models.
The remainder of this paper is organized as follows. Section 2 presents our basic setup, methodology, and theoretical results. In Sect. 3, we conduct a small simulation study to illustrate the proposed method. All proofs are contained in the appendix.

Main result
We closely follow the setup and notation of Balabdaoui et al. (2019) (hereafter BGH). Consider the monotone index model
$$Y = \psi_0(\alpha_0^\top X) + \epsilon, \qquad (1)$$
where $Y$ is a scalar response variable, $X$ is a $d$-dimensional vector of covariates, $\epsilon$ is an error term, $\alpha_0$ is a $d$-dimensional vector of parameters, and $\psi_0 : \mathbb{R} \to \mathbb{R}$ is an unknown monotone increasing function. For identification, we assume that $\alpha_0$ belongs to the unit sphere $\mathcal{S}_{d-1} = \{\alpha \in \mathbb{R}^d : \|\alpha\| = 1\}$ in $\mathbb{R}^d$. We are interested in conducting statistical inference (i.e., interval estimation and hypothesis testing) on $\alpha_0$ based on the empirical likelihood approach.
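To motivate the score-type approach of BGH, we tentatively assume that $\psi_0$ is known. The population score equation for the least squares estimation of $\beta_0$, where $\alpha_0 = S(\beta_0)$, is
$$E\left[J(\beta_0)^\top X \, \psi_0^{(1)}(S(\beta_0)^\top X)\{Y - \psi_0(S(\beta_0)^\top X)\}\right] = 0, \qquad (2)$$
where $\psi_0^{(1)}$ is the derivative of $\psi_0$ and $J(\beta)$ is the Jacobian of $S(\beta)$. Thus, it is natural to construct an estimator of $\beta_0$ by taking an empirical counterpart of (2) and inserting estimators for $\psi_0^{(1)}$ and $\psi_0$. However, when we estimate $\psi_0$ by the isotonic regression method, the resulting estimator of $\psi_0$ is typically discontinuous, and it is not clear how to evaluate the derivative $\psi_0^{(1)}$ without introducing smoothing parameters. To address this issue, BGH and Groeneboom and Hendrickx (2018) considered the modified population score equation
$$E\left[J(\beta_0)^\top X \{Y - \psi_0(S(\beta_0)^\top X)\}\right] = 0. \qquad (3)$$
In particular, for point estimation of $\alpha_0$, BGH proposed to solve the following score-type equation:
$$\frac{1}{n}\sum_{i=1}^n J(\beta)^\top X_i \{Y_i - \hat\psi_\beta(S(\beta)^\top X_i)\} = 0, \qquad (4)$$
with respect to $\beta$, and estimate $\alpha_0$ by $\hat\alpha = S(\hat\beta)$, where for given $\beta$, $\hat\psi_\beta$ is obtained by the isotonic regression
$$\hat\psi_\beta = \arg\min_{\psi \in \mathcal{M}} \frac{1}{n}\sum_{i=1}^n \{Y_i - \psi(S(\beta)^\top X_i)\}^2, \qquad (5)$$
and $\mathcal{M}$ is the set of monotone increasing functions defined on $\mathbb{R}$.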
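The construction of the score-type moments from an isotonic fit can be sketched in a few lines of Python. This is only a minimal illustration under our own assumptions, not the implementation of BGH: it takes $d = 2$ with the hypothetical parameterization $S(\beta) = (\cos\beta, \sin\beta)$, so that $J(\beta) = (-\sin\beta, \cos\beta)$, and it simulates data with a cubic link, uniform covariates, and Gaussian errors. The function names `pava` and `score_moments` are ours, and ties in the index are not treated specially.

```python
import math
import random


def pava(y):
    """Pool-adjacent-violators: nondecreasing least-squares fit to y (equal weights)."""
    blocks = []  # each block stores [sum of values, number of points]
    for value in y:
        blocks.append([value, 1])
        # Merge backwards while adjacent block means violate monotonicity.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted


def score_moments(beta, xs, ys):
    """Scalar moments J(beta)'x_i * (y_i - psi_hat(S(beta)'x_i)) for d = 2 (assumed setup)."""
    alpha = (math.cos(beta), math.sin(beta))   # S(beta): a point on the unit circle
    jac = (-math.sin(beta), math.cos(beta))    # J(beta): derivative of S(beta)
    index = [alpha[0] * x[0] + alpha[1] * x[1] for x in xs]
    order = sorted(range(len(xs)), key=lambda i: index[i])
    fit_sorted = pava([ys[i] for i in order])  # isotonic fit along the sorted index
    psi_hat = [0.0] * len(xs)
    for rank, i in enumerate(order):
        psi_hat[i] = fit_sorted[rank]
    return [(jac[0] * x[0] + jac[1] * x[1]) * (y - p)
            for x, y, p in zip(xs, ys, psi_hat)]


# Simulated example (all design choices here are illustrative assumptions).
rng = random.Random(0)
beta0 = math.pi / 4
xs = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(200)]
ys = [(math.cos(beta0) * x[0] + math.sin(beta0) * x[1]) ** 3 + rng.gauss(0, 0.1)
      for x in xs]
g0 = score_moments(beta0, xs, ys)
print(sum(g0) / len(g0))  # sample mean of the moments at the true beta
```

At the true parameter the sample mean of the moments should be close to zero, which is the property the score-type estimating equation exploits.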
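In this paper, we employ the score-type equation in (3) as a moment function and propose the following empirical likelihood statistic
$$\ell(\beta) = -2 \sup\left\{\sum_{i=1}^n \log(n p_i) : p_i \ge 0, \ \sum_{i=1}^n p_i = 1, \ \sum_{i=1}^n p_i \hat g_i(\beta) = 0\right\}, \qquad (6)$$
where $\hat g_i(\beta) = J(\beta)^\top X_i \{Y_i - \hat\psi_\beta(S(\beta)^\top X_i)\}$. By the Lagrange multiplier argument, its dual form is obtained as
$$\ell(\beta) = 2\sum_{i=1}^n \log\{1 + \hat\lambda^\top \hat g_i(\beta)\}, \qquad (7)$$
where the Lagrange multiplier $\hat\lambda$ solves
$$\sum_{i=1}^n \frac{\hat g_i(\beta)}{1 + \hat\lambda^\top \hat g_i(\beta)} = 0. \qquad (8)$$
In practice, we use the dual representation in (7) to implement statistical inference. To study the asymptotic properties of the empirical likelihood statistic $\ell(\beta_0)$, we impose the following assumptions. Let $\|\cdot\|$ be the Euclidean norm and $B(a_0, A) = \{a : \|a - a_0\| \le A\}$ be a ball around $a_0$ of radius $A$.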
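The dual form of an empirical likelihood statistic is straightforward to compute once the moment values are available. The following Python sketch handles only the scalar-moment case (as when $d = 2$), which is an assumption made for brevity; the function name `el_statistic` and the use of bisection rather than Newton iterations are our choices, not the paper's.

```python
import math


def el_statistic(g, tol=1e-12, max_iter=200):
    """Dual EL statistic 2 * sum(log(1 + lam * g_i)) for scalar moments g_i.

    lam solves sum(g_i / (1 + lam * g_i)) = 0. Returns inf when zero lies
    outside the convex hull of the g_i, so that no feasible weights exist.
    """
    g_min, g_max = min(g), max(g)
    if g_min >= 0.0 or g_max <= 0.0:
        return math.inf
    # Feasible lam keeps every implied weight positive: 1 + lam * g_i > 0.
    lo, hi = -1.0 / g_max, -1.0 / g_min
    for _ in range(max_iter):
        lam = 0.5 * (lo + hi)
        foc = sum(gi / (1.0 + lam * gi) for gi in g)
        # The first-order condition is strictly decreasing in lam,
        # so bisection on (lo, hi) brackets its unique root.
        if foc > 0.0:
            lo = lam
        else:
            hi = lam
        if hi - lo < tol:
            break
    lam = 0.5 * (lo + hi)
    return 2.0 * sum(math.log1p(lam * gi) for gi in g)


print(el_statistic([-1.0, 2.0]))  # moments with nonzero mean give a positive statistic
print(el_statistic([-1.0, 1.0]))  # moments with zero mean give a statistic near zero
```

For a vector-valued moment function the same first-order condition would be solved with a multivariate Newton or convex optimization step; the scalar case is used here only to keep the example self-contained.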
Assumption A1 $\{Y_i, X_i\}_{i=1}^n$ is an iid sample generated by (1). The support $\mathcal{X}$ of $X$ is convex with a nonempty interior, and $\mathcal{X} \subset B(0, R)$ for some $R > 0$. The Lebesgue density of $X$ has a bounded derivative on $\mathcal{X}$. There exist positive constants c and
These assumptions are adaptations of Assumptions A1-A6 in BGH. Compared to BGH, our assumptions are simpler because we do not need to control the behavior of the score function away from the true parameter $\alpha_0 = S(\beta_0)$. Assumption A1 concerns the distribution of the data. The support condition in A1 may be relaxed by assuming that $X$ follows a sub-Gaussian distribution. The moment condition in A1, which is analogous to BGH's A6, is required to guarantee $\max_{1 \le i \le n} |Y_i| = O_p(\log n)$ in order to control the entropy of a class of score functions. Assumption A2 concerns the true link function $\psi_0$. In contrast to BGH, who consider point estimation, we only need to impose boundedness, which is a mild requirement.
Under these assumptions, our main result is presented as follows.
Remark 2 One way to conduct statistical inference based on $\ell(\beta_0)$ is to estimate the critical values of $w_1 \chi^2_{1,1} + \cdots + w_{d-1} \chi^2_{1,d-1}$ based on estimators of $V$ and of the asymptotic variance of the normalized score in (14). Based on (13), $V$ is consistently estimated by $\hat V$. On the other hand, the latter can be estimated by a plug-in estimator that involves a nonparametric estimator $\hat m(\cdot)$. An alternative way for statistical inference is to adjust the empirical likelihood statistic $\ell(\beta_0)$ to recover asymptotic pivotalness. Based on Rao and Scott (1981) (see also Xue and Zhu, 2006), the above theorem implies that an adjusted version of $\ell(\beta_0)$ converges in distribution to the $\chi^2_{d-1}$ distribution. Then the confidence region of $\alpha_0 = S(\beta_0)$ can be obtained as the set of $S(\beta)$ for which the adjusted statistic evaluated at $\beta$ does not exceed $q_a$, where $q_a$ is the $(1 - a)$th quantile of the $\chi^2_{d-1}$ distribution.
Remark 3 A drawback of the asymptotic inference method presented in the previous remark is that it requires the selection of a tuning parameter to implement the nonparametric estimator $\hat m(\cdot)$. In order to obtain an inference procedure which is free from tuning parameters, we adapt the bootstrap method of Hjort et al. (2009) as follows.
(1) Based on the original sample $\{Y_i, X_i\}_{i=1}^n$, compute $\hat\beta$ as in (4).
Under the additional Assumptions A3-A5 in the appendix, the validity of this bootstrap approximation is obtained as follows.

Theorem 2 Under Assumptions A1-A5, it holds
where P * is the bootstrap distribution conditional on the data.
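To make the idea of bootstrap calibration with recentering concrete, the following Python sketch computes a bootstrap critical value for a scalar-moment empirical likelihood statistic. It is a generic illustration under our own assumptions and does not reproduce the steps of the procedure above: the names `el_statistic`, `bootstrap_critical_value`, and `moment_fn` are hypothetical, the moment is scalar, and the toy usage tests a zero mean instead of refitting an isotonic regression on each bootstrap sample, which is what `moment_fn` would presumably have to do in the monotone index setting.

```python
import math
import random


def el_statistic(g, tol=1e-12, max_iter=200):
    """Dual EL statistic for scalar moments; inf if zero is outside their hull."""
    g_min, g_max = min(g), max(g)
    if g_min >= 0.0 or g_max <= 0.0:
        return math.inf
    lo, hi = -1.0 / g_max, -1.0 / g_min  # keeps all 1 + lam * g_i positive
    for _ in range(max_iter):
        lam = 0.5 * (lo + hi)
        # The first-order condition is decreasing in lam, so bisect on its sign.
        if sum(gi / (1.0 + lam * gi) for gi in g) > 0.0:
            lo = lam
        else:
            hi = lam
        if hi - lo < tol:
            break
    lam = 0.5 * (lo + hi)
    return 2.0 * sum(math.log1p(lam * gi) for gi in g)


def bootstrap_critical_value(data, moment_fn, level=0.05, n_boot=199, seed=0):
    """(1 - level) quantile of the recentered bootstrap EL statistic.

    moment_fn(sample) returns the list of scalar moments computed from
    `sample`, including any nuisance estimation done on that sample.
    """
    rng = random.Random(seed)
    n = len(data)
    g_hat = moment_fn(data)
    center = sum(g_hat) / n  # recentering term from the original sample
    stats = []
    for _ in range(n_boot):
        sample = [data[rng.randrange(n)] for _ in range(n)]  # resample with replacement
        g_star = [g - center for g in moment_fn(sample)]     # recentered moments
        stats.append(el_statistic(g_star))
    stats.sort()
    k = math.ceil((1.0 - level) * (n_boot + 1)) - 1  # order-statistic index
    return stats[min(k, n_boot - 1)]


# Toy usage: the moment is the observation itself (null hypothesis of zero mean).
rng = random.Random(1)
data = [rng.gauss(0.3, 1.0) for _ in range(100)]
cv = bootstrap_critical_value(data, lambda sample: list(sample))
print(cv)  # compare with el_statistic(data) to decide whether to reject
```

Without the recentering term, the bootstrap moments would not have mean zero under the bootstrap distribution, so the resampled statistic would not mimic the null distribution of the original one.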
The column "$\alpha_1$" reports the Monte Carlo averages and standard deviations of the first element of the BGH estimator $\hat\alpha$. It shows that the mean is close to the truth, $\alpha_{01} = 1/\sqrt{3} \approx 0.577$, while the standard deviation becomes smaller as the sample size increases. From the columns (N), we can see that both the adjusted and bootstrap empirical likelihood tests have reasonable size properties. Both tests become more powerful as the sample size increases and as the true values of $\alpha_0$ become more distinct from the null values (i.e., from A1 to A3). We also find that, overall, the bootstrap test rejects slightly more often than the adjusted test.
Overall, our simulation results are encouraging.
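A.1 Proof of Theorem 1
Thus, an expansion of (8) around $\hat\lambda = 0$ using the same argument as in Owen (1991, proof of Theorem 2) based on (11) implies (12). A second-order expansion of (7) around $\hat\lambda = 0$ using (12) then yields a quadratic approximation of $\ell(\beta_0)$. Hence it is enough for the conclusion to show (13), namely $\frac{1}{n}\sum_{i=1}^n \hat g_{0i} \hat g_{0i}^\top \overset{p}{\to} V$, and (14), a central limit theorem for the normalized sum of the $\hat g_{0i}$.
We first show (13). Decompose the left-hand side of (13) into three terms as in (15). By the law of large numbers, the first term of (15) converges to $V$; by Proposition 4 of BGH and Assumption A1, the second term converges to zero; by p. 23 of BGH-supp and Assumption A1, the third term converges to zero. Combining these results, we obtain (13).
We now show (14). Let $P_n$ be the empirical measure of $\{X_i, Y_i\}_{i=1}^n$ and $P_0$ be the true measure of $(X, Y)$, and decompose the sum of the $\hat g_{0i}$ into three terms $I$, $II$, and $III$, whose definitions involve a function $\bar E_n$ and the sequence $\{\tau_{i,S_0}\}$ of jump points of $\hat\psi_{\beta_0}$. By the definition of $\bar E_n$, it holds $III = 0$ (see (C.10) in BGH-supp).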
For $II$, decompose it into $II_a$, $II_b$, and $II_c$. The same argument as in pp. 19-20 of BGH-supp guarantees $II_a = o_p(n^{-1/2})$ and $II_b = o_p(n^{-1/2})$. For $II_c$, (C.11) of BGH-supp and Proposition 4 of BGH give a bound that holds for some $C > 0$, so that $II$ is asymptotically negligible. For $I$, decompose it into $I_a$ and $I_b$. From pp. 21-22 of BGH-supp, we can show that $I_b = o_p(n^{-1/2})$. Therefore, the leading term is $I_a$, and the central limit theorem implies (14). Therefore, the conclusion is obtained.

A.2 Proof of Theorem 2
Based on Groeneboom and Hendrickx (2018), it is sufficient for the conclusion to show that $\hat V$ where $\hat\beta$ is obtained by solving (4). For the validity of the bootstrap, we add the following assumptions.
where the first equality follows from an argument similar to (6.25) in Groeneboom and Hendrickx (2018), and the second equality follows from a rearrangement and $\hat\beta - \beta_0 = O_p(n^{-1/2})$. Combining these results, we have Comparing this and (19), the central limit theorem yields (21). Therefore, the conclusion follows.