Abstract
Semiparametric regression models have received considerable attention over the last decades, because of their flexibility and their good finite sample performance. Here we propose an innovative nonparametric test for the linear part of the models, based on random sign-flipping of an appropriate transformation of the residuals, which exploits a spectral decomposition of the residualizing matrix associated with the nonparametric part of the model. The test can be applied to a vast class of extensively used semiparametric regression models with roughness penalties, with nonparametric components defined over one-dimensional as well as over multi-dimensional domains, including, for instance, models based on univariate or multivariate splines. We prove the good asymptotic properties of the proposed test. Moreover, by means of extensive simulation studies, we show the superiority of the proposed test over current parametric alternatives, demonstrating its excellent control of the Type I error, accompanied by good power, even in challenging data scenarios where current parametric alternatives fail.
1 Introduction
Semiparametric regression models have a long history in statistics (see, e.g., the textbooks Green and Silverman 1994; Bickel et al. 1998; Ruppert et al. 2003, and references therein). Because of their flexibility and versatility, they have been the object of an extensive and still very active literature. In this work, we propose an efficient (conditional) resampling-based test (Pesarin 2001; Hemerik and Goeman 2018b; Chung and Romano 2013) for the linear component in partially linear and semiparametric regression models with roughness penalties. The test can be applied to a vast class of extensively used models, with nonparametric components defined over one-dimensional, as well as over multi-dimensional domains, including manifold domains. This embraces, for instance, the highly popular semiparametric regression models based on splines (see, e.g., Heckman 1986; Yu and Ruppert 2002; Wand and Ormerod 2008; Wang 2019, and references therein), on thin-plate splines (see, e.g., Wood 2003), and on spherical splines (Wahba 1981), as well as semiparametric models based on recent smoothing techniques over two-dimensional (possibly irregularly shaped or curved) domains, such as soap film smoothing (Wood et al. 2008), bivariate-splines over triangulations (Lai and Schumaker 2007; Baramidze et al. 2006; Lai et al. 2009; Guillas and Lai 2010; Lai and Wang 2013; Wang et al. 2020), and Spatial Regression with Partial Differential Equation regularization (SR-PDE) (see, e.g., Sangalli et al. 2013; Azzimonti et al. 2015; Ettinger et al. 2016; Wilhelm et al. 2016; Sangalli 2021).
Various classical approaches are available to make inference in the context of semiparametric regression, and different strategies have been proposed to cope with the bias induced by the roughness penalty. Some possibilities include undersmoothing approaches developed for nonparametric models [see, e.g., the review in Hall and Horowitz (2013)], Bayesian approaches (Wahba 1983; Nychka 1988; Marra and Wood 2012) and various corrections of Wald-type test statistics, such as the sandwich estimators in Gray (1994) and Yu and Ruppert (2002), and the Speckman’s version in Speckman (1988) and Holland (2017). These approaches may nonetheless perform poorly in finite samples, due to the effects of the roughness penalty (see, e.g., Maas and Hox 2004; Freedman 2006). In particular, as also evidenced by the simulation studies reported in this work, such tests offer poor control of the Type I error.
Here we propose an innovative test for the linear part of semiparametric regression models, based on conditional resampling of a transformation of the residuals. This test, unlike other proposals, makes it possible to overcome the problem of dependence in the residuals, which is particularly strong in semiparametric models. Some approaches proposed in the context of classical regression models, such as those in Huh and Jhun (2001) and Kherad-Pajouh and Renaud (2010), derive transformed residuals from a spectral decomposition of the residualizing matrix that projects into the residual space. In the setting considered by these authors, the conditional distribution of the test statistic can be defined on the basis of permutations (see, e.g., Pesarin 2001; Chung and Romano 2013; Pauly et al. 2015; Winkler et al. 2014), rotations (Solari et al. 2014) or sign-flips (Hemerik et al. 2020) of such transformed residuals. These approaches are nonetheless not valid in this context, since the residualizing matrix is not idempotent in the case of penalized regression models. Because of this, the transformed residuals are not spherical (i.e., they are not homoscedastic and independent) and the standard permutation, rotation or sign-flip procedures become invalid in our context. To overcome this problem, we here study a conditional sign-flip procedure, named eigen sign-flip test, that preserves the finite sample covariance structure of the residuals, hence ensuring the asymptotic exactness of the derived test. This idea has been explored in Ferraccioli et al. (2021), restricted to a specific case of SR-PDE model. The current work addresses instead the broad spectrum of highly popular semiparametric regression models mentioned above. Moreover, we study in detail the asymptotic properties of the test. In particular, we prove the asymptotic exactness of the test and derive similar results for interval hypotheses and confidence intervals.
Some of the obtained results build on the asymptotic properties of the estimator of the nonparametric part of the model. Such properties in turn depend on conditions that are model-specific, since they depend, for instance, on the dimension and geometry of the domain over which the nonparametric term is defined, on the roughness term being considered, on the type of basis, etc. In the present work, we hence define assumptions that are general enough to cover a variety of semiparametric regression models, and refer the reader to other works for the appropriate specifications of such assumptions for the specific model being considered (e.g., to Claeskens et al. (2009) for univariate penalized splines estimators, to Holland (2017) for multivariate penalized splines estimators, to Xiao (2019) for general penalized splines, and to Arnone et al. (2021) for SR-PDE).
The paper is organized as follows. In Sect. 2 we briefly review the semiparametric penalized regression framework, outlining the forms of the associated discrete estimators. In Sect. 3 we recall some classical parametric approaches for inference on the linear part of a semiparametric regression model and summarize the properties of the score test statistic in this context. In Sect. 4 we present the eigen sign-flip test and describe its theoretical and asymptotic properties. In Sect. 5 we compare our proposal to more classical parametric approaches in extensive simulation studies. In Sect. 6 we present an application to the study of human development in Nigeria. Finally, some discussions and possible directions for future research are outlined in Sect. 7.
2 Semiparametric regression
Let \(y_{i} \in {\mathbb {R}}\) be the value of the variable of interest observed at covariates \({\textbf{x}}_{i} \in {\mathbb {R}}^{q}\) and at locations \({\textbf{p}}_{i}\in \Omega \subseteq {\mathbb {R}}^{d},\) \(d\ge 1\). We consider the semiparametric model

\(y_{i} = {\textbf{x}}_{i}^{\top }\varvec{\beta } + f({\textbf{p}}_{i}) + \epsilon _{i}, \qquad i = 1, \dots , n,\)
where \(\varvec{\beta } \in {\mathbb {R}}^{q}\) is the vector of regression parameters, f is a real-valued smooth function on \(\Omega \), and \(\epsilon _{i}\) are i.i.d. random errors with \({\mathbb {E}}(\epsilon _{i}) = 0\) and \({\mathbb {E}}(\epsilon _{i}^2) = \sigma ^{2}\).
The interest is in estimating both the linear coefficients \(\varvec{\beta }\) and the nonparametric component f. However, the estimation of \((\varvec{\beta },f)\) in model (1) via maximum likelihood is usually inappropriate or infeasible, due to the infinite-dimensionality of the nonparametric component f. To avoid this problem, some type of roughness penalty can be imposed, in order to reduce the space of possible solutions. In general, the resulting penalized likelihood estimators for \(\varvec{\beta }\) and f are the solution of the minimization problem

\(\min _{\varvec{\beta },\, f}\ \sum _{i=1}^{n}\left( y_{i} - {\textbf{x}}_{i}^{\top }\varvec{\beta } - f({\textbf{p}}_{i})\right) ^{2} + \lambda \,{\mathcal {P}}(f),\)
where \({\mathcal {P}}(\cdot )\) is some type of roughness penalty. Depending on the assumptions on the domain \(\Omega \subseteq {\mathbb {R}}^{d},\) on the dimension d, and on the required smoothness of the function f, various proposals for \({\mathcal {P}}(f)\) have been considered in the literature, and different discretization procedures have been adopted to reduce the infinite-dimensional estimation problem (2) to a finite dimensional one. For instance, for \(d=1\) and \(\Omega \) an interval of the real line, model (1)-(2) can involve the classical and extensively used O’Sullivan splines (O’Sullivan 1986; Heckman 1986; Yu and Ruppert 2002; Wand and Ormerod 2008), whose penalty is the integrated squared derivative of some order, and can, for instance, rely on B-spline bases. When \(\Omega \) is the real plane, it is possible to use thin-plate splines (see, e.g., Duchon 1977; Wahba 1990; Wood 2003), which involve as penalty the so-called thin-plate energy. Moreover, various recent techniques target two-dimensional bounded planar domains \(\Omega \subset {\mathbb {R}}^2\), including: soap-film smoothing (Wood et al. 2008), which considers a penalty involving the Laplacian of f; bivariate-splines over triangulations (Lai and Schumaker 2007; Guillas and Lai 2010; Lai and Wang 2013), whose regularizing term may include high-order derivatives; SR-PDE (Sangalli et al. 2013; Azzimonti et al. 2015), where the regularizing term can involve general second-order partial differential equations, and the estimation problem is discretized via finite element bases (Sangalli et al. 2013; Azzimonti et al. 2015) or advanced spline bases (Wilhelm et al. 2016). Some of these techniques also permit the construction of semiparametric models over spherical domains (Wahba 1981; Baramidze et al. 2006; Lai et al. 2009) and general surface domains (Ettinger et al. 2016; Wilhelm et al. 2016).
2.1 Discrete estimators
The estimation of model (1) usually involves the representation of the nonparametric component f through some type of basis expansion, depending on the penalization being considered. Let \(\Psi \in {\mathbb {R}}^n\times {\mathbb {R}}^K\) be the matrix of the evaluations of the K basis functions \(\psi _{1}, \dots , \psi _{K}\) at the n data locations \({\textbf{p}}_{1}, \dots , {\textbf{p}}_{n}\), that is, \((\Psi )_{ij} = \psi _{j}({\textbf{p}}_{i})\), for \(i = 1, \dots , n\) and \(j = 1, \dots , K\).
Then, we write \((f({\textbf{p}}_{1}), \dots , f({\textbf{p}}_{n}))^{\top }=\Psi \varvec{\gamma }\) for some vector of coefficients \(\varvec{\gamma }\in {\mathbb {R}}^K.\) Moreover, let P denote the \(K\times K\) positive semidefinite matrix representing the discretization of the penalty \({\mathcal {P}}(\cdot )\). Finally, set \({\textbf{y}}=(y_1,\ldots ,y_n)^{\top }\) and denote by \(X\in {\mathbb {R}}^n\times {\mathbb {R}}^q\) the design matrix, whose i-th row is given by \({\textbf{x}}_{i}\). The estimation problem (2) is therefore discretized as

\(\min _{\varvec{\beta },\, \varvec{\gamma }}\ \Vert {\textbf{y}} - X\varvec{\beta } - \Psi \varvec{\gamma }\Vert ^{2} + \lambda \,\varvec{\gamma }^{\top }P\varvec{\gamma }.\)
The solution to (3) is uniquely determined by the normal equations
Setting

\(\Lambda = I_{n} - \Psi (\Psi ^{\top }\Psi + \lambda P)^{-1}\Psi ^{\top },\)
the explicit form of the estimators for \(\varvec{\beta }\) and \(\varvec{\gamma }\) is, respectively,

\(\hat{\varvec{\beta }} = (X^{\top }\Lambda X)^{-1}X^{\top }\Lambda {\textbf{y}}, \qquad \hat{\varvec{\gamma }} = (\Psi ^{\top }\Psi + \lambda P)^{-1}\Psi ^{\top }({\textbf{y}} - X\hat{\varvec{\beta }}),\)
or equivalently
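To fix ideas, the estimators above can be sketched numerically. The following is an illustrative example only, not code from this work: the Gaussian-bump basis standing in for \(\Psi \), the second-difference penalty P, and the simulated data are all hypothetical choices, while the closed forms used, \(\hat{\varvec{\beta }} = (X^{\top }\Lambda X)^{-1}X^{\top }\Lambda {\textbf{y}}\) with \(\Lambda = I_{n} - \Psi (\Psi ^{\top }\Psi + \lambda P)^{-1}\Psi ^{\top }\), are the standard profiled penalized least-squares solutions for this class of models.

```python
import numpy as np

rng = np.random.default_rng(0)
n, K, q, lam = 200, 20, 2, 1.0

p = np.sort(rng.uniform(0, 1, n))       # locations p_i in Omega = [0, 1]
X = rng.normal(size=(n, q))             # design matrix of linear covariates
beta_true = np.array([1.0, -0.5])
y = X @ beta_true + np.sin(2 * np.pi * p) + rng.normal(scale=0.3, size=n)

# Hypothetical basis: Gaussian bumps standing in for a spline basis Psi
knots = np.linspace(0, 1, K)
Psi = np.exp(-0.5 * ((p[:, None] - knots[None, :]) / 0.08) ** 2)

# Discretized roughness penalty: squared second differences of the coefficients
D2 = np.diff(np.eye(K), n=2, axis=0)
P = D2.T @ D2

# Residualizing matrix Lambda = I_n - Psi (Psi'Psi + lam P)^{-1} Psi'
Lam = np.eye(n) - Psi @ np.linalg.solve(Psi.T @ Psi + lam * P, Psi.T)

# Penalized least-squares estimators of beta and gamma
beta_hat = np.linalg.solve(X.T @ Lam @ X, X.T @ Lam @ y)
gamma_hat = np.linalg.solve(Psi.T @ Psi + lam * P, Psi.T @ (y - X @ beta_hat))
```

With these hypothetical data, `beta_hat` recovers `beta_true` up to sampling error, and `Psi @ gamma_hat` estimates the nonparametric component at the observed locations.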
3 Inference on \(\varvec{\beta }\)
In semiparametric regression, a natural question is whether the covariates X have an effect on the variable of interest. We are thus interested in the system of hypotheses

\(H_{0}: \varvec{\beta } = \varvec{\beta }_{0} \qquad \text {vs.} \qquad H_{1}: \varvec{\beta } \ne \varvec{\beta }_{0},\)

for a given \(\varvec{\beta }_{0} \in {\mathbb {R}}^{q}\) (e.g., \(\varvec{\beta }_{0} = {\textbf{0}}\) to test for no effect of the covariates).
A standard approach to verify (9) is to use a Wald-type test (see, e.g., Schervish 2012), based on the asymptotic distribution of \(\hat{\varvec{\beta }}\). The study of the asymptotic distribution of \(\hat{\varvec{\beta }},\) in semiparametric regression models, has been tackled by a number of works. See, for instance, Heckman (1986); Yu and Ruppert (2002); Li and Ruppert (2008); Holland (2017); Xiao (2019); Yu et al. (2019); Wang et al. (2020) for semiparametric models based on univariate and bivariate splines.
The parametric Wald-type test may nonetheless have poor performance in small sample scenarios, due to the overestimation of the variance of the test statistic, induced by the penalization. A number of corrections to Wald-type tests have been proposed to avoid this issue, such as the sandwich estimators in Gray (1994) and Yu and Ruppert (2002) and the Speckman’s version in Speckman (1988) and Holland (2017). Nonetheless, these approaches can only partially solve the problem, and may lead to a poor control of the Type I error, especially when a strong temporal/spatial structure in the covariates is present, as indicated by the simulations carried out in Sect. 5.
In Sect. 4 we introduce an innovative nonparametric alternative for testing on \(\varvec{\beta }\). This proposal is based on the score statistic. For this reason, in the remainder of this section we review the properties of the score statistic in the context of penalized semiparametric regression. The proposed method does not rely on the estimation of the Fisher information matrix to define the null distribution, which is implicitly recovered by an appropriate nonparametric resampling procedure, as described in Sect. 4.
3.1 Properties of the score statistic in penalized semiparametric regression
We first study the distributional properties of the score statistic, which constitute the basis of the nonparametric test defined in Sect. 4. Using the normal equations (4), we can define the classical score test statistic
Since \(\varvec{\gamma }\) is unknown, we use the plug-in estimate \(\hat{\varvec{\gamma }}\). Substituting \(\hat{\varvec{\gamma }}\) in expression (10), we define the test statistic T as

\(T = n^{-1/2}\,X^{\top }\Lambda \,{\textbf{r}},\)
with \({\textbf{r}} = {\textbf{y}} - X{\varvec{\beta }_{0}}\). We make the following assumption:
-
(A1)
For n large enough, the matrix \(\Psi ^{\top }\Psi \) is positive definite.
Assumption (A1) is quite general; its specification depends on the basis considered. In particular, this specification usually involves conditions on the nodes of the basis and their position with respect to the design points \({\textbf{p}}_{1}, \dots , {\textbf{p}}_{n}.\) More specifically, it involves the type of basis, the rate at which the number of bases K grows with n, the minimum distance between the nodes, and the density of the design points inside the domain. For instance, in the case of univariate penalized splines estimators, (A1) follows from Assumptions 1–3 in Claeskens et al. (2009). In the case of multivariate penalized splines estimators, it follows from Assumptions 1–2 in Holland (2017). In the case of SR-PDE, it follows from Assumptions 3–5 in Arnone et al. (2021).
Here we consider the case of fixed designs, thus implicitly conditioning on the sample points and the covariates. Similar results can be obtained in the random design scenario, by introducing further assumptions on the distribution of the design points and covariates (e.g., that the covariates are realizations of continuous processes on \(\Omega \)).
Under (A1), we can consider the Demmler and Reinsch (1975) decomposition

\((\Psi ^{\top }\Psi )^{-1/2}\,P\,(\Psi ^{\top }\Psi )^{-1/2} = U\,\textrm{diag}(\rho _{1}, \dots , \rho _{K})\,U^{\top },\)
where U is the matrix of eigenvectors, and \(\rho \) is the corresponding vector of eigenvalues \(\{\rho _{k}\}_{k = 1}^{K}\) (see Eubank 1999, for details). Let us also denote \(A = \Psi (\Psi ^{\top }\Psi )^{-1/2}U\). Note that this matrix is semi-orthogonal, i.e., \(A^{\top }A = I_{K}\) and \(AA^{\top } = \Psi (\Psi ^{\top }\Psi )^{-1}\Psi ^{\top }\). Following Demmler and Reinsch (1975), we can rewrite the matrix \(\Lambda \) in (6) as

\(\Lambda = I_{n} - A\,\textrm{diag}\left( \frac{1}{1+\lambda \rho _{1}}, \dots , \frac{1}{1+\lambda \rho _{K}}\right) A^{\top }.\)
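These algebraic facts are easy to verify numerically. The following is a small sketch with a hypothetical random basis \(\Psi \) and penalty P (taking \(\Lambda = I_{n} - \Psi (\Psi ^{\top }\Psi + \lambda P)^{-1}\Psi ^{\top }\) as the residualizing matrix), checking the semi-orthogonality of A and the spectral rewriting of \(\Lambda \):

```python
import numpy as np

rng = np.random.default_rng(1)
n, K, lam = 50, 8, 0.7
Psi = rng.normal(size=(n, K))         # hypothetical basis evaluations
B = rng.normal(size=(K, K))
P = B.T @ B                           # positive definite penalty (a.s.)

G = Psi.T @ Psi
evals, evecs = np.linalg.eigh(G)
G_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T

# Demmler-Reinsch eigenpairs of (Psi'Psi)^{-1/2} P (Psi'Psi)^{-1/2}
rho, U = np.linalg.eigh(G_inv_sqrt @ P @ G_inv_sqrt)
A = Psi @ G_inv_sqrt @ U

# Semi-orthogonality: A'A = I_K and A A' = Psi (Psi'Psi)^{-1} Psi'
assert np.allclose(A.T @ A, np.eye(K))
assert np.allclose(A @ A.T, Psi @ np.linalg.solve(G, Psi.T))

# Lambda from its definition vs. its Demmler-Reinsch rewriting
Lam = np.eye(n) - Psi @ np.linalg.solve(G + lam * P, Psi.T)
Lam_dr = np.eye(n) - A @ np.diag(1.0 / (1.0 + lam * rho)) @ A.T
assert np.allclose(Lam, Lam_dr)
```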
Using this decomposition, we can now study the behavior of the bias of the test statistic T, in terms of the eigenvalues \(\rho _{k}\).
Lemma 3.1
Assume (A1) and let \(\tilde{X} = A^{\top }X\) and \(\tilde{{\textbf{f}}} = A^{\top }\Psi \varvec{\gamma }\). Let also \(\tilde{{\textbf{x}}}_{k}\), \(k = 1, \dots , K,\) be the q-dimensional vectors corresponding to the rows of \(\tilde{X}\), and \(\tilde{f}_{k}\) be the elements of the vector \(\tilde{{\textbf{f}}}\). Under the null hypothesis (9), the bias \({\textbf{b}}_{\lambda }\) of T is

\({\textbf{b}}_{\lambda } = n^{-1/2}\sum _{k=1}^{K}\frac{\lambda \rho _{k}}{1+\lambda \rho _{k}}\,\tilde{f}_{k}\,\tilde{{\textbf{x}}}_{k} \;\le \; n^{-1/2}\,\lambda \sum _{k=1}^{K}\rho _{k}\,\tilde{f}_{k}\,\tilde{{\textbf{x}}}_{k},\)
where the inequality is considered element-wise.
Proof
Denote by \(\varvec{\epsilon }\) the n-dimensional vector of i.i.d. errors. Under the null hypothesis, we have

\({\textbf{b}}_{\lambda } = {\mathbb {E}}(T) = n^{-1/2}X^{\top }\Lambda \left( \Psi \varvec{\gamma } + {\mathbb {E}}(\varvec{\epsilon })\right) = n^{-1/2}X^{\top }\Lambda \Psi \varvec{\gamma },\)
since the term \({\mathbb {E}}(\varvec{\epsilon })\) is zero by assumption. Using the decomposition in (3.1), it follows that
Substituting (13) in \({\textbf{b}}_{\lambda }\), we obtain
Using the notation \(\tilde{X} = A^{\top }X\) and \(\tilde{{\textbf{f}}} = A^{\top }\Psi \varvec{\gamma }\), the bias can therefore be rewritten as
where \(\tilde{{\textbf{x}}}_{k}\) are the q-dimensional vectors corresponding to the rows of \(\tilde{X}\), and \(\tilde{f}_{k}\) the elements of the vector \(\tilde{{\textbf{f}}}\). Equation (14) highlights that the bias is a sum of K contributions, weighted by the eigenvalues \(\rho _{k}\), and moderated by \(\lambda \). Since \(x/(1+x) < x\), for \(x > 0\), we can bound the bias as follows
\(\square \)
The expression (14) highlights how the bias depends on the chosen penalization through the eigenvalues \(\rho _k\). We finally make the following assumption.
-
(A2)
The smoothing parameter \(\lambda = \lambda _{n}\) is chosen so that \(\lambda \sum _{i = 1}^{K}\rho _{i} = o(1)\).
Thanks to Lemma 3.1, Assumption (A2) implies the asymptotic unbiasedness of the score statistic T, since f is a continuous function on the bounded domain \(\Omega \) and the covariates are realizations of a continuous process on \(\Omega \). This is a standard assumption when studying the asymptotic properties of semiparametric and nonparametric penalized regression models. As with Assumption (A1), Assumption (A2) needs to be specified depending on the penalty and basis considered. Indeed, Assumptions (A1)–(A2) are intentionally left quite general to embrace various semiparametric models; moreover, the precise rates of convergence are not of direct interest in this work. Theorem 1 in Claeskens et al. (2009) gives, for instance, the appropriate rates for \(\lambda \) in the case of univariate penalized splines estimators, Theorem 3 in Holland (2017) gives them for multivariate penalized spline estimators, and Lemma 3 in Arnone et al. (2021) gives them for SR-PDE estimators.
We can now state the main result for the asymptotic distribution of the test statistic T.
Theorem 3.2
Let \(\nu = \sigma ^{2}X^{\top }\Lambda ^{2}X\). Under the assumptions (A1)–(A2), the test statistic T in (11) is asymptotically normal under the null hypothesis (9), with
Proof
We know that
where the notation \([X^{\top }\Lambda ]_{i}\) is used to indicate the i-th column of the \(q\times n\) matrix \(X^{\top }\Lambda \). Under assumption (A2), it follows from (12) that the bias \({\textbf{b}}_{\lambda }\) is asymptotically zero. The expected value \({\mathbb {E}}(T)\) is therefore asymptotically zero. For the variance, under the null hypothesis we have
Substituting the expression of \(\Lambda \) from equation (13) in the previous expression, we obtain
Using the notation \(\tilde{X} = A^{\top }X\) and completing the square in the second term, we hence get
where \(\tilde{{\textbf{x}}}_{k}\) are the q-dimensional vectors corresponding to the rows of \(\tilde{X}\). Note that the first term does not depend on \(\lambda \). As for the second term, since \(x^{2}/(1+x)^{2} < x^{2}\), for \(x > 0\), we have
where the maximum is taken element-wise. Therefore, for n large enough (since the covariates are realizations of a continuous process on \(\Omega \)), assumption (A2) implies that the second term in (15) vanishes faster than the first term. Concerning the first term in (15), with a similar argument it is easy to check that the matrix \(AA^{\top }\) is idempotent with rank K. Thus, it admits the spectral decomposition \(AA^{\top } = U \text {diag}(1, \dots , 1, 0, \dots , 0)U^{\top }\), with the first K non-null eigenvalues equal to 1. The term \(X^{\top }(I_{n} - AA^{\top })X\) is therefore the sum of \(n-K\) components with bounded variance, since the covariates are realizations of a continuous process on \(\Omega \), thus the Feller condition is satisfied. It follows from the central limit theorem [see, e.g., Van der Vaart (2000)] that the test statistic T is also asymptotically normal. \(\square \)
4 Eigen sign-flip test for the linear component in penalized semiparametric regression models
In the classical linear regression case, under the standard assumption of i.i.d. random noise, the score statistic can also be viewed as a sum of n contributions that have asymptotically zero mean under the null hypothesis \(H_0\) (9). This information can be used to derive the null distribution of the test statistic, without the need of a direct estimation of the Fisher information. In the context of semiparametric regression, instead, a first naive attempt to derive the distribution of the test statistic can be made by random permutations (or sign-flips) of the contributions of the score (Winkler et al. 2014; Hemerik et al. 2020). This approach, attempted in Ferraccioli (2020) for a simple type of SR-PDE model (Sangalli et al. 2013), might nonetheless not be optimal in the semiparametric regression setting. The reason lies in the fact that naive permutation accounts neither for the correlation between residuals, nor for the bias of the estimates, which is inherent to semiparametric models. To solve this issue, again considering a special case of SR-PDE model, Ferraccioli et al. (2021) define a new test statistic that leverages the spectral decomposition of the matrix \(\Lambda \), leading to the definition of the eigen sign-flip test.
Here we define the eigen sign-flip test on \(\varvec{\beta }\) for a general form of penalized semiparametric regression models. We study the properties of the test, deriving its asymptotic distribution. A thorough discussion of the nature of the proposed test is given in Sect. 4.2.
Definition 1
(Eigen sign-flip test) Let us consider the singular value decomposition \(\Lambda = VDV^{\top }\). Set \(\Pi = \textrm{diag}(\pi _{1}, \dots , \pi _{n})\), where \(\varvec{\pi } = (\pi _{1}, \dots , \pi _{n})\) is a random vector uniformly distributed in \(\{-1, 1\}^{n}\). Let us also define the n-dimensional vectors \({\tilde{X}} = D^{1/2}V^{\top }X\) and \(\tilde{{\textbf{r}}} = D^{1/2}V^{\top }{\textbf{r}} = D^{1/2}V^{\top }({\textbf{y}} - X{\varvec{\beta }_{0}})\). The eigen sign-flip statistic is defined as

\(T_{\Pi } = n^{-1/2}\,\tilde{X}^{\top }\Pi \,\tilde{{\textbf{r}}}.\)
Note that the observed statistic \(T = T_{I}\) corresponds to the case where \(\pi _{i} = 1,\ i=1,\ldots ,n\). As standard in permutational approaches, the component-wise p-values are thus computed as the rank of \(T_{I}\) with respect to a sample of M sign-flips \(\varvec{\pi }\), divided by M (see, e.g., Pesarin 2001).
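A compact sketch of the resulting test for a single covariate of interest follows. This is illustrative only, not the implementation used in this work: the basis \(\Psi \), the penalty P, and the \((1+\cdot )/(M+1)\) finite-sample p-value correction are our own choices.

```python
import numpy as np

def eigen_sign_flip_pvalue(X, Psi, P, y, beta0, lam, M=1000, seed=0):
    """Two-sided eigen sign-flip p-value for H0: beta = beta0 (single covariate).

    A sketch following Definition 1: decompose Lambda = V D V', transform X and
    the residuals r = y - X beta0 by D^{1/2} V', then sign-flip the n score
    contributions. Not the authors' code; the (1+.)/(M+1) correction is ours.
    """
    n = len(y)
    Lam = np.eye(n) - Psi @ np.linalg.solve(Psi.T @ Psi + lam * P, Psi.T)
    D, V = np.linalg.eigh(Lam)                 # Lambda = V diag(D) V'
    D = np.clip(D, 0.0, None)                  # guard tiny negative round-off
    X_t = np.sqrt(D)[:, None] * (V.T @ X)      # tilde X = D^{1/2} V' X
    r_t = np.sqrt(D) * (V.T @ (y - X @ np.atleast_1d(beta0)))
    contrib = X_t[:, 0] * r_t                  # n score contributions
    T_obs = contrib.sum() / np.sqrt(n)

    rng = np.random.default_rng(seed)
    signs = rng.choice([-1.0, 1.0], size=(M, n))
    T_flip = signs @ contrib / np.sqrt(n)      # M sign-flipped statistics
    # Two-sided p-value: rank of |T_obs| among the sign-flipped statistics
    return (1 + np.sum(np.abs(T_flip) >= np.abs(T_obs))) / (M + 1)
```

Under the null hypothesis the returned p-value is approximately uniform, while under the alternative it concentrates near zero.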
4.1 Asymptotic properties of the eigen sign-flip test
We now study the asymptotic properties of the test statistic \(T_{\Pi }\) in Definition 1. We first show that the asymptotic distribution of the test statistic \(T_{\Pi }\) is the same as \(T_{I}\). We then show that the eigen sign-flip test is asymptotically exact.
Theorem 4.1
Let \(\nu = \sigma ^{2}X^{\top }\Lambda ^{2}X\). Under the assumptions (A1)–(A2), for any given \(\Pi \), the distribution of \(T_{\Pi }\) is asymptotically normal, with
Proof
For the expected value, under the null hypothesis we have
Following the same reasoning of the proof of Theorem 3.2, but with the quantity \(V \Pi V^{\top }X\) in place of X, we can show that the expected value of \(T_{\Pi }\) is asymptotically zero.
As for the variance, under the null hypothesis we have
It follows from the central limit theorem (Van der Vaart 2000), that the test statistic \(T_{\Pi }\) is also asymptotically normal. \(\square \)
Remark 1
Note that the bias in the mean of the test statistic is intrinsic in the regularization approach, and cannot be avoided in the finite sample scenario. Because of this bias, we are only able to reach asymptotically exact results.
Remark 2
Note also that the matrix \(\Pi \) is defined so that it commutes with D. This is necessary to ensure that the variance of the test statistic is invariant under the action of \(\Pi \).
We now introduce some notation before establishing the main result, which constitutes the pivotal step in proving the asymptotic control of the probability of Type I error. For simplicity of exposition, we consider the single covariate case in the remainder of this section and in Sect. 4.3. In Sect. 4.4, we outline the procedure for the general multivariate case. Let \(\alpha \in [0, 1)\). For any \(a \in {\mathbb {R}}\), let \(\lceil a \rceil \) be the smallest integer which is larger than or equal to a and let \(\lfloor a \rfloor \) be the largest integer which is at most a. We consider all the possible \(w = 2^{n}\) sign-flips \(\Pi _1,\ldots ,\Pi _w,\) where \(\Pi _1=I\). For a given value of the test statistic \(T_{I}^n\), we hence consider all the associated sign-flipped values \(T_{I}^n,T_{\Pi _2}^n, \ldots , T_{\Pi _w}^n\), where the superscript n highlights the sample size. We denote by \(T_{(1)}^n \le \ldots \le T_{(w)}^n\) the corresponding sorted values. Finally, we write \(T_{[1-\alpha ]}^{n} = T_{(\lceil (1-\alpha ) w\rceil )}^{n}\).
Theorem 4.2
Consider the test that rejects \(H_{0}\) if and only if \(T_{I}^{n} > T_{[1-\alpha ]}^{n}\). Then, under the null hypothesis, the test is asymptotically exact, and the rejection probability \({\mathbb {P}}(T_{I}^{n} > T_{[1-\alpha ]}^{n})\) is asymptotically at most \(\alpha \).
Proof
We need to show that the asymptotic distribution of the \(2^n\)-dimensional vector of test statistics \({\textbf{T}}=(T_{I}^n, \ldots , T_{\Pi _w}^n)^{\top }\) is invariant under sign-flip transformations \(\Pi \), that is \({\textbf{T}}{\mathop {=}\limits ^{d}}\Pi \circ {\textbf{T}}\), where \({\mathop {=}\limits ^{d}}\) represents equality in distribution and the composition stands for \(\Pi \circ {\textbf{T}}=\Pi \circ (T_{I}^n,T_{\Pi _2}^n, \ldots , T_{\Pi _w}^n)^{\top }=(T_{\Pi I}^n,T_{\Pi \Pi _2}^n, \ldots , T_{\Pi \Pi _w}^n)^{\top }\). This will prove the asymptotic control of the Type I error through Theorem 15.2.1 in Lehmann and Romano (2008) and Theorem 1 in Hemerik and Goeman (2018a).
Let \(\tilde{{\textbf{X}}}\) be the diagonal \(n \times n\) matrix with elements \(({\tilde{X}}_{1}, \dots , {\tilde{X}}_{n})\). The test statistic in Definition 1 can be rewritten as
where \(\mathbb {1}_{n}\) is the n-dimensional unit vector. The test statistic \(T_{\Pi }\) can hence be viewed as a sum of n contributions, where each element of \(\tilde{{\textbf{X}}}\tilde{{\textbf{r}}}\) is sign-flipped through \(\Pi \). Similarly, the variance of \(T_\Pi \) can be written as
see also Theorem 4.1.
To evaluate the joint distribution of the test statistics \({\textbf{T}}\), let us now define \(\varvec{\Pi }\) as the \(2^n\times n\) matrix collecting all the \(w=2^n\) sign-flip vectors row-wise. Therefore, we can write \({\textbf{T}} = \varvec{\Pi } \tilde{{\textbf{X}}}\tilde{{\textbf{r}}}\), and \(\Pi \circ {\textbf{T}}=\varvec{\Pi } \Pi \tilde{{\textbf{X}}}\tilde{{\textbf{r}}}\). The joint distribution of \({\textbf{T}}\) is multivariate normal with variance \(\text {Var}({\textbf{T}})=\sigma ^2n^{-1}\varvec{\Pi } \tilde{{\textbf{X}}}D\tilde{{\textbf{X}}} \varvec{\Pi }^{\top }\) and asymptotically zero mean. We now have to show that \(\Pi \circ {\textbf{T}}\) follows the same asymptotic multivariate normal distribution. First note that the transformation \(\Pi \) does not affect the expected value, which remains asymptotically zero. Furthermore, for the variance we have
Thanks to Theorem 15.2.1 in Lehmann and Romano (2008) and Theorem 1 in Hemerik and Goeman (2018a), this yields the null invariance \({\textbf{T}}{\mathop {=}\limits ^{d}}\Pi \circ {\textbf{T}}\). It follows that under \(H_0\), \({\mathbb {P}}(T_{I}^{n} > T_{[1-\alpha ]}^{n})\le \alpha \). \(\square \)
Remark 3
The previous result is still valid in the case when \(w \ne 2^{n}\), i.e., when not every element of \(\{1, -1\}^{n}\) is used once (see, e.g., Hemerik and Goeman 2018a). For computational reasons, it is in fact common practice to sample uniformly from \(\{-1, 1\}^{n}\), with or without replacement. The same holds also for the results in the next section.
4.2 On the nature of the eigen sign-flip test
A few comments on this approach may be useful to understand its nature.
In the simpler context of linear regression models, it is possible to define a residualizing matrix that projects into the residual space. This is an orthogonal projection matrix; as such, it is idempotent, and all its eigenvalues are zero or one. Thanks to this, for classical linear regression models, Kherad-Pajouh and Renaud (2010) propose to pre-multiply the residuals by the semi-orthogonal matrix defined by the eigenvectors corresponding to the non-null eigenvalues. This pre-multiplication transforms the n residuals into pseudo-residuals, reducing their cardinality. The number of these new pseudo-residuals equals the rank of the residualizing projection matrix (i.e., the number of non-null eigenvalues, typically n minus the number of covariates). Since the non-null eigenvalues of the residualizing matrix are all equal to one, the resulting pseudo-residuals are independent and homoscedastic (i.e., spherical). In particular, Kherad-Pajouh and Renaud (2010) suggest the use of a permutation approach, while Solari et al. (2014) extend it to the more general framework of rotation matrices.
Unfortunately, within the semiparametric regression framework, the residualizing matrix \(\Lambda \) is not a projection matrix and is not idempotent; therefore, its eigenvalues do not take values in \(\{0, 1\}\). A multiplication by these eigenvalues would act as a scaling factor on the residuals, making them independent, but not homoscedastic. For this reason, defining \(\Pi \) as a permutation or a rotation matrix would not be a valid solution. Defining \(\Pi \) as a sign-flipping matrix instead ensures the commutative property \(\Pi D=D \Pi ,\) as highlighted in Remark 2. This property is indeed crucial, as it guarantees that the variance of the test statistic is constant over \(\Pi ,\) as proved in Theorem 4.1. It is also worth emphasizing that the test guarantees only asymptotic exactness, since the penalization induces a bias in the estimation of the mean, which vanishes with increasing n, while, for fixed n, the variance remains constant across all the test statistics defining the null distribution. On the contrary, the standard parametric Wald test needs asymptotic results for both the mean and the variance, such as those obtained in Sect. 3.1. Similar considerations hold for a naive sign-flip score test that does not decompose the matrix \(\Lambda \). In this case, for finite samples, the sign-flipped test statistics would not have the same variance as the observed test statistic. This would lead to performance comparable to the parametric Wald test, as shown in Ferraccioli (2020). This difference between the eigen sign-flip test and its competitors is crucial in providing adequate control of the Type I error, as shown by the simulations in Sect. 5.
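The role of the commutation property can be illustrated with a toy numerical example (hypothetical numbers, not taken from this work): a sign-flip leaves the variance-type quadratic form unchanged, because \(\Pi D \Pi = D\) for diagonal D, while a permutation of the transformed residuals generally alters it.

```python
import numpy as np

n = 6
D = np.diag(np.linspace(0.1, 1.0, n))           # eigenvalues not in {0, 1}
x = np.arange(1.0, n + 1)                       # a transformed covariate

rng = np.random.default_rng(2)
Pi = np.diag(rng.choice([-1.0, 1.0], size=n))   # random sign-flip matrix
Q = np.eye(n)[np.r_[1:n, 0]]                    # a cyclic permutation matrix

var_obs = x @ D @ x                             # variance up to sigma^2 / n
var_flip = (Pi @ x) @ D @ (Pi @ x)              # invariant: Pi commutes with D
var_perm = (Q @ x) @ D @ (Q @ x)                # changes under permutation

assert np.isclose(var_flip, var_obs)
assert not np.isclose(var_perm, var_obs)
```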
4.3 Interval hypotheses, two-sided tests and confidence intervals
So far we have defined the eigen sign-flip test for point-wise null hypotheses. In practice, however, interval null hypotheses are often of interest. As for standard approaches, we need to prove that the test computed under any point-wise null hypothesis \(H_0: \beta = \beta _0-\epsilon \) (for any \(\epsilon >0\)) also has Type I error probability bounded by \(\alpha \). For convenience, let us define the test statistic as a function of the tested coefficient, that is \(T_{\Pi }(\beta _0) = n^{-1/2}X^{\top }V\Pi V^{\top } \Lambda ({\textbf{y}} - X\beta _0) = n^{-1/2}X^{\top }V\Pi D V^{\top }({\textbf{y}} - X\beta _0)\). We now give two results for interval hypotheses and two-sided tests, and a third result on confidence intervals.
Corollary 4.2.1
(Interval hypotheses) Consider the hypotheses \(H_0^{'}:\ \beta =\beta _0-\epsilon \) and \(H_1^{'}:\ \beta >\beta _0-\epsilon \), with \(\epsilon >0\). Then for every \(\epsilon > 0\), \({\mathbb {P}}(T_{\Pi }(\beta _0-\epsilon ) \ge T_{I}(\beta _0-\epsilon ))\le {\mathbb {P}}(T_{\Pi }(\beta _0) \ge T_{I}(\beta _0))\). The same is true for the opposite hypothesis: that is, \(\epsilon <0\) and \(H_1':\ \beta <\beta _0+\epsilon \).
Proof
We have
Note that the last inequality holds since
Note that \((I-\Pi )\) is a diagonal matrix with non-negative diagonal entries (each equal to 0 or 2), and is thus positive semi-definite for all \(\Pi \). Therefore, any quadratic form in it is nonnegative. \(\square \)
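The positive semi-definiteness used in the proof is immediate to check numerically; in this illustrative snippet, \(\Pi \) is a random sign-flip matrix, so \(I-\Pi \) has diagonal entries in \(\{0, 2\}\):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20

# Pi is diagonal with entries +-1, so I - Pi is diagonal with entries in {0, 2}:
# positive semi-definite by inspection.
signs = rng.choice([-1.0, 1.0], size=n)
IminusPi = np.eye(n) - np.diag(signs)
assert set(np.unique(np.diag(IminusPi))) <= {0.0, 2.0}

# Any quadratic form in a positive semi-definite matrix is nonnegative
for _ in range(100):
    v = rng.standard_normal(n)
    assert v @ IminusPi @ v >= 0
```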
Corollary 4.2.2
(Two-sided test) Consider \(\alpha _{1}, \alpha _{2} \in (0,1]\) such that \(\alpha _{1}+\alpha _{2}<1\). Then, under \(H_{0}: \beta =\beta _0\), as \(n \rightarrow \infty \),
That is, the eigen sign-flip test controls the Type I error asymptotically when testing \(H_0:\ \beta = \beta _0\) against the two sided alternative \(H_1:\ \beta \ne \beta _0\).
Proof
Theorem 4.2 proves that \({\mathbb {P}}\left[ T_{I}^{n} < T_{[\alpha _{1}]}^{n} \right] \rightarrow \alpha _{1}\) and \({\mathbb {P}}\left[ T_{I}^{n} > T_{[1-\alpha _{2}]}^{n}\right] \rightarrow \alpha _{2}\). This, together with the fact that \({\mathbb {P}}\left[ (T_{I}^{n} < T_{[\alpha _{1}]}^{n}) \cap (T_{I}^{n} > T_{[1-\alpha _{2}]}^{n})\right] \rightarrow 0\), proves the corollary. \(\square \)
As a consequence of the two corollaries above, we can also derive confidence intervals for the parameter \(\beta \).
Corollary 4.2.3
(Confidence Interval) Let \(\alpha \in (0,1]\). Then the set
is a one-sided confidence interval for the parameter \(\beta \) with asymptotic coverage \(1-\alpha \). Let now \(\alpha _{1}, \alpha _{2} \in (0,1]\) be such that \(\alpha _{1}+\alpha _{2}<1\). Similarly, the set
is a two-sided confidence interval with asymptotic coverage \(1 - (\alpha _{1} + \alpha _{2})\).
Proof
The proof follows directly from Corollaries 4.2.1 and 4.2.2. \(\square \)
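As an illustration of how such confidence intervals can be obtained in practice, the following sketch inverts a two-sided sign-flip test over a grid of values of \(\beta _0\), collecting the values that are not rejected at level \(\alpha \). The residualizing matrix and the data are fabricated stand-ins, not the paper's estimator:

```python
import numpy as np

rng = np.random.default_rng(2)
n, alpha, n_flips = 60, 0.05, 499

# Fabricated symmetric positive semi-definite stand-in for Lambda, for illustration
A = rng.standard_normal((n, n))
Lam = A @ A.T / n
eigval, V = np.linalg.eigh(Lam)

beta_true = 0.5
x = rng.standard_normal(n)
y = x * beta_true + rng.standard_normal(n)

def p_value(beta0):
    """Two-sided sign-flip p-value for H0: beta = beta0."""
    base = V.T @ x                    # covariate in the eigenbasis
    resid_e = V.T @ (y - x * beta0)   # residuals in the eigenbasis
    def stat(signs):
        # x^T V Pi D V^T (y - x beta0), with Pi = diag(signs) and D = diag(eigval)
        return np.sum(signs * eigval * base * resid_e) / np.sqrt(n)
    T_obs = stat(np.ones(n))
    Ts = np.array([stat(rng.choice([-1.0, 1.0], size=n)) for _ in range(n_flips)])
    return (1 + np.sum(np.abs(Ts) >= np.abs(T_obs))) / (1 + n_flips)

# Invert the test over a grid: the confidence set collects the non-rejected beta0
grid = np.linspace(-1.0, 2.0, 61)
ci = [b for b in grid if p_value(b) > alpha]
```

The resulting set of non-rejected values plays the role of the two-sided confidence interval of Corollary 4.2.3, with asymptotic coverage \(1-\alpha \).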
4.4 Testing a subset of the covariates
We now deal with the case of multiple covariates, where we are interested in testing a subset of them. Specifically, assume \(X\in {\mathbb {R}}^{n\times q}\) represents the set of covariates of interest, with associated vector of coefficients \(\varvec{\beta },\) and \(Z\in {\mathbb {R}}^{n\times p}\) the set of covariates associated with the vector of nuisance coefficients \(\varvec{\zeta }\). The minimization problem in (2) then becomes
We might be interested in testing
for any value of \(\varvec{\zeta }\) and \(\varvec{\gamma }\). Let us define \(\Psi ^{*} = [Z|\Psi ]\) as the \(n\times (p + K)\) matrix composed of the covariates associated with the nuisance parameters and the bases for the nonparametric part of the model, with coefficients \(\varvec{\theta } = (\varvec{\zeta }, \varvec{\gamma })\). We can then rewrite equation (5) as
where \({\mathbb {O}}\) is a matrix of zeros.
Definition 1 of the eigen sign-flip test remains valid in this case as well, with the only modification that, in the matrix \(\Lambda \) in (6), \(\Psi \) is replaced by \(\Psi ^{*}\). Moreover, the following corollary extends Theorem 4.2 to the case where \(\varvec{\beta }\) is a vector.
Corollary 4.2.4
Consider the test that rejects \(\text {H}_{0}: \varvec{\beta } = \varvec{\beta }_{0}\) if and only if \(\varphi (T_{I}^{n}) > \varphi (T_{[1-\alpha ]}^{n})\), where \(\varphi (\cdot )\) is any nonparametric combining function (Pesarin 2001, Section 6.2). Then, under the null hypothesis, the test is asymptotically exact and the rejection probability \({\mathbb {P}}(\varphi (T_{I}^{n}) > \varphi (T_{[1-\alpha ]}^{n}))\) is at most \(\alpha \).
Proof
In order to extend the proof of Theorem 4.2 to the multivariate framework, we rely on the nonparametric combination of dependent test statistics, as defined, e.g., in Section 6.2 of Pesarin (2001). First of all, recall that the test statistic T is itself a vector. Moreover, Theorem 3.2 proves the asymptotic multivariate normality of T, and Theorem 4.1 shows that the sign-flipped vectors of test statistics \(T_{I}^n, \ldots , T_{\Pi _w}^n\) share the same distribution. Therefore, the matrix \({\textbf{T}}=(T_{I}^n, \ldots , T_{\Pi _w}^n)^{\top }\) is equal in distribution to \(\Pi \circ {\textbf{T}}\); that is, each row of \({\textbf{T}}\) is a sample from the multivariate test statistic T. We can therefore use any nonparametric combining function (Pesarin 2001) to obtain a p-value. \(\square \)
Among the most commonly used nonparametric combining functions, defined, e.g., in Pesarin (2001), are the max-T, the sum-T and the Mahalanobis distance. As an illustrative example, a test based on the \(\min \)-p combining function (Westfall and Young 1993) rejects the multivariate null hypothesis if the maximum value of T is larger than the \(1-\alpha \) quantile of the distribution of the maxima computed over the w elements of \((T_{I}^n, \ldots , T_{\Pi _w}^n)\).
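A minimal sketch of the max-T combination (a close relative of the min-p rule when the statistics are on a common scale): row 0 of a hypothetical matrix of sign-flip replicates holds the observed statistics, and the observed maximum is compared with the null distribution of row-wise maxima. The matrix is fabricated for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
w, q = 1000, 3  # w sign-flip replicates, q tested coefficients

# Hypothetical matrix T: row 0 holds the observed statistics (identity flip),
# rows 1..w-1 the sign-flipped replicates; fabricated here for illustration.
T = rng.standard_normal((w, q))
T[0] += np.array([3.0, 0.2, -0.1])  # a strong signal on the first coefficient

# max-T combining: compare the observed maximum against the null maxima
maxima = np.max(np.abs(T), axis=1)
p_maxT = np.mean(maxima >= maxima[0])
```

Because the same flip is applied to all coordinates of the statistic, the dependence among the q components is preserved in the null distribution of the maxima, which is what makes the combination valid without modeling that dependence.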
5 Simulation studies
In this section we present two simulation studies that investigate the finite sample performance of the proposed test. Simulation 1, in Sect. 5.1, considers a semiparametric model based on classical univariate splines (as, for instance, in Heckman 1986; Wand and Ormerod 2008). Simulation 2, in Sect. 5.2, considers instead a semiparametric model based on SR-PDE (Sangalli et al. 2013). In these different settings, we compare the performance of three tests:
- Wald: a classical Wald-type test based on the asymptotic distribution of \(\hat{\varvec{\beta }}\);
- Speck: a similar Wald-type test based on the asymptotic distribution of the Speckman version of the estimator (Speckman 1988), as derived in Holland (2017);
- ESF: the eigen sign-flip score test introduced in Definition 1.
The results show the performance of the tests over 1000 simulation repetitions.
5.1 Simulation 1
In Simulation 1, we simulate from model (1), with \(\Omega =[0,1]\) and \(p_1,\ldots ,p_n\) randomly sampled from a uniform distribution on \(\Omega ,\) with \(n=200.\) For the nonparametric component of the model, we consider the test function 1 from the function gamSim in the R package mgcv (Wood 2015, 2017), defined as \(0.2p^{11}(10(1 - p))^6 + 10(10p)^3(1 - p)^{10}.\) We consider \(q=1\) covariate, and we generate \(x_1,\ldots ,x_n\) according to four different stochastic processes:
(a) an i.i.d. random sample from \({\mathcal {N}}(0, 0.1^2)\);
(b) a Gaussian random field on [0, 1] with mean zero and scale 0.01;
(c) the function \(\sqrt{p + 2}\) on [0, 1], with an added i.i.d. random sample from \({\mathcal {N}}(0, 0.1^2);\)
(d) the function \(\sqrt{p + 2}\) on [0, 1], with an added Gaussian random field with mean zero and scale 0.01.
The covariates and the true f are standardized before computing the response variable y, so that their relative contributions to the response are comparable. We consider both \({\beta }_0=0\) and 10 other values of \({\beta }_0,\) from 0.01 to 0.1, to assess both the Type I error and the power of the test. Finally, we add i.i.d. normal random errors \(\epsilon _1,\ldots ,\epsilon _n,\) with zero mean and standard deviation 0.1. For each test case, the generation of the covariates and noise is repeated 1000 times.
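For concreteness, the data-generating process of Simulation 1 can be sketched as follows (scenario (c) is shown; the nonparametric component is test function 1 of gamSim, and the constants are those stated above):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200

# Points on Omega = [0, 1] and the mgcv gamSim test function 1
p = rng.uniform(0, 1, n)
f = 0.2 * p**11 * (10 * (1 - p))**6 + 10 * (10 * p)**3 * (1 - p)**10

# Scenario (c): deterministic trend sqrt(p + 2) plus i.i.d. N(0, 0.1^2) noise
x = np.sqrt(p + 2) + rng.normal(0, 0.1, n)

# Standardize the covariate and the true f so their contributions are comparable
x = (x - x.mean()) / x.std()
f = (f - f.mean()) / f.std()

beta = 0.05                    # one of the tested effect sizes
eps = rng.normal(0, 0.1, n)    # i.i.d. errors with sd = 0.1
y = x * beta + f + eps
```

Repeating this generation 1000 times, and testing \(H_0:\beta =0\) on each replicate, yields the empirical Type I error (for \(\beta =0\)) and power (for \(\beta >0\)) curves reported below.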
The model is estimated using cubic B-spline bases, with 200 equispaced internal knots on \(\Omega \), using the implementation in Wand and Ormerod (2008). The smoothing parameter is chosen via cross-validation. The tests are performed at the nominal level 0.05. For the proposed eigen sign-flip test, we use 1000 random sign-flips.
Table 1 shows the control of the Type I error, and Fig. 1 shows the power functions of the three competing tests. The table and figure immediately highlight that the most challenging scenarios are cases (b) and (d), where the covariates have been generated with a dependence structure, by sampling from a Gaussian process. The classic parametric test (Wald) shows extremely poor control of the Type I error in these two cases, with an observed proportion of Type I errors above \(26\%\), when the nominal level of the test is \(5\%\). This behavior is possibly due to the poor estimation of the variance induced by the regularized estimates. The Speckman variant appears more robust, partly correcting for the misspecified variance. Nonetheless, this test is markedly underconservative in cases (b) and (d), with a proportion of Type I errors of almost 9%, while it is overconservative in cases (a) and (c), where it returns a proportion of Type I errors of about 2.3%. The proposed eigen sign-flip score test, on the contrary, maintains extremely good control of the Type I error under all scenarios, and is never underconservative. In the challenging cases (b) and (d) as well, at the cost of a slight loss of power, it keeps the proportion of Type I errors very close to (and just slightly below) the nominal level of the test.
We also considered the case of multiple covariates, following the simulation scheme detailed above, but including simultaneously all four covariates (a), (b), (c) and (d) in the data generation, and testing one parameter at a time, considering the other parameters as nuisance, as detailed in Sect. 4.4. The same considerations as those drawn for the simulation in Fig. 1 apply (results not included for the sake of space).
5.2 Simulation 2
In Simulation 2, we simulate from model (1), with \(\Omega =[0,1]\times [0,1]\) and \({\textbf{p}}_1,\ldots ,{\textbf{p}}_n\) randomly sampled from a uniform distribution on \(\Omega ,\) with \(n=225.\) For the nonparametric component of the model, we consider the test function 2 from the function gamSim in the R package mgcv (Wood 2015, 2017), defined as
We consider \(q=1\) covariate, and we generate \(x_1,\ldots ,x_n\) according to four different stochastic processes:
(a) a Gaussian random field with zero mean and scale 0.05;
(b) a Matérn random field with \(\nu = 1\), \(\sigma = 2\) and scale 0.1;
(c) the function \(\cos (5(p_{1} + p_{2})) + (2p_{1} - p_{1}p_{2}^2)^2\) with an added Gaussian random field with scale 0.05;
(d) the function \(\cos (5(p_{1} + p_{2})) + (2p_{1} - p_{1}p_{2}^2)^2\) with an added Matérn random field with \(\nu = 1\), \(\sigma = 2\) and scale 0.1.
The covariates and the true f are standardized before computing the response variable y, so that their relative contributions to the response are comparable. We consider both \({\beta }_0=0\) and 10 other values of \({\beta }_0,\) from 0.01 to 0.1, to assess both the Type I error and the power of the test. Finally, we add i.i.d. normal random errors \(\epsilon _1,\ldots ,\epsilon _n,\) with zero mean and standard deviation 0.1. For each test case, the generation of the covariates and noise is repeated 1000 times.
The model is estimated using SR-PDE, with linear finite elements on a mesh having 225 nodes on a regular lattice over \(\Omega ,\) implemented in the package fdaPDE. The smoothing parameter is chosen via cross-validation. The tests are performed at the nominal level 0.05. For the proposed eigen sign-flip test, we use 1000 random sign-flips.
The results are presented in Table 2 and Fig. 2. The classic parametric test (Wald) performs poorly, with very poor control of the Type I error in all scenarios, with proportions of Type I errors of about 15% and higher. The Speckman variant is always more robust than the Wald test, but it is often severely underconservative, with observed proportions of Type I errors of about 10%. The proposed eigen sign-flip test, on the contrary, at the cost of some power, provides extremely good control of the Type I error, even in the more challenging scenarios, where the covariate has a strong spatial structure.
6 Study of human development in Nigeria
In this section we apply the proposed methodology to the analysis of human development in Nigeria. In particular, we are interested in better understanding the differences in socioeconomic and health conditions across the states of the country. Unfortunately, data at the national and subnational level are often poor or not publicly available. This lack of information and of public domain surveys hampers the efforts to identify and develop targeted interventions in troubled areas (Jerven 2013). An alternative to traditional data consists in using other sources of openly accessible data, such as data from social media, mobile phone networks, or satellites. In particular, a popular recent approach leverages satellite images of luminosity at night to estimate economic activity (Chen and Nordhaus 2011; Jean et al. 2016). These images highlight urban areas, which typically offer better provision of basic services, such as electricity, water and public health, as well as more job opportunities, with respect to rural areas.
Here we use open satellite data (NASA Worldview Snapshots), together with demographic data, to predict human development. Specifically, as the response variable, we consider the Human Development Index (HDI) (available at https://globaldatalab.org/shdi), an aggregated index that takes into account multiple dimensions at the household and individual level in health, education and standard of living. This index is available at the state level, for the 36 states of Nigeria and for the Federal Capital Territory. The values of this index are shown in panel d of Fig. 3. As covariates, in the parametric part of the model, we use the population density, \({\textbf{x}}_{Pop}\), of each state (data from the National Bureau of Statistics, Nigeria), shown in panel e of Fig. 3, and the three satellite images shown in the top panels of the same figure, which are
- Nightlight luminosity, \({\textbf{x}}_{Night}\), obtained via the VIIRS Nighttime Imagery, which captures low-light emission sources under varying illumination conditions (panel a);
- Short-Wave Infrared, \({\textbf{x}}_{SWIR}\), which highlights bare soils, such as deserts (panel b);
- Near Infrared, \({\textbf{x}}_{NIR}\), which highlights vegetation (panel c).
We are interested in identifying significant effects of these covariates on human development, considering the model
Since the HDI, the response variable, and one of the covariates, the population density, are available at the state level, we also aggregate the other three covariates at the state level, considering their areal means. We then apply SR-PDE, considering the data located at the capital of each state. We use a mesh with 320 nodes and select the smoothing parameter through generalized cross-validation (\(\lambda _n = 0.1\)). We finally perform significance tests on each covariate, one at a time, considering the other parameters as nuisance, as described in Sect. 4.4, using the eigen sign-flip procedure with 5000 random sign-flips.
Nightlight is significant (\(p < 0.005\)), with a positive impact on human development (the estimated coefficient is 0.29). This finding is in line with other recent research studies (Chen and Nordhaus 2011; Jean et al. 2016). The presence of urban areas, in fact, plays a major role in the overall wealth of the population. This of course does not imply a causal effect, since increased wealth has itself an impact on the development of urban areas. Nightlight is nonetheless a good indicator of socioeconomic status at the local level, which does not require any official statistics, as previously discussed. Short-wave infrared appears marginally significant (\(0.05< p < 0.1\)), with a negative impact on human development (the estimated coefficient is \(-0.016\)). This result might suggest that the presence of desert areas with large amounts of bare soil leads to a decrease in human development. The more developed states are indeed close to the ocean, in the southern part of the country; the northern part, which is mostly desert, is instead sparsely populated. It is also worth noting that the aggregation at the state level averages out localized features, such as the presence of rivers, lakes or small patches of vegetation, possibly discarding important information. The third satellite covariate, near infrared, does not appear significant (\(p > 0.1\)). The same applies to population density (\(p > 0.1\)). This is possibly due to the fact that its distribution is highly skewed, with most of the population residing in the state of Lagos, in the southwest of the country (see panel e in Fig. 3). Panel f in Fig. 3 shows the predicted HDI values, highlighting the very high explanatory power of the model.
7 Discussion
This paper describes a strongly innovative and highly promising inferential approach in the context of semiparametric regression with roughness penalties. The paper focuses on tests for the linear part of the models; similar ideas, however, can be used to develop tests and confidence bands for the nonlinear part. Moreover, the described approach could be extended to semiparametric regression with spatiotemporal components (see, e.g., Ugarte et al. 2009, 2010; Aguilera-Morillo et al. 2017; Marra et al. 2012; Augustin et al. 2013; Bernardi et al. 2018), further broadening the spectrum of potential models that could benefit from our proposal. These developments will be the object of dedicated future studies. We are confident that this inferential approach will become popular and will prove highly valuable in the varied contexts where semiparametric regression is used.
References
Arnone E, Kneip A, Nobile F, Sangalli LM (2021) Some first results on the consistency of spatial regression with partial differential equation regularization. Stat Sin. https://doi.org/10.5705/ss.202019.0346
Augustin NH, Trenkel VM, Wood SN, Lorance P (2013) Space-time modelling of blue ling for fisheries stock management. Environmetrics 24(2):109–119
Azzimonti L, Sangalli LM, Secchi P, Domanin M, Nobile F (2015) Blood flow velocity field estimation via spatial regression with pde penalization. J Am Stat Assoc 110(511):1057–1071
Baramidze V, Lai M-J, Shum CK (2006) Spherical splines for data interpolation and fitting. SIAM J Sci Comput 28(1):241–259. https://doi.org/10.1137/040620722
Bernardi MS, Carey M, Ramsay JO, Sangalli LM (2018) Modeling spatial anisotropy via regression with partial differential regularization. J Multivar Anal 167:15–30. https://doi.org/10.1016/j.jmva.2018.03.014
Bickel PJ, Klaassen Chris AJ, Ritov Y, Wellner JA (1998) Efficient and adaptive estimation for semiparametric models. Springer, New York. ISBN 0-387-98473-9. Reprint of the 1993 original
Aguilera-Morillo MC, Durbán M, Aguilera AM (2017) Prediction of functional data with spatial dependence: a penalized approach. Stoch Env Res Risk Assess 31(1):7–22
Chen X, Nordhaus WD (2011) Using luminosity data as a proxy for economic statistics. Proc Natl Acad Sci 108(21):8589–8594
Chung EY, Romano JP (2013) Exact and asymptotically robust permutation tests. Ann Stat 41(2):484–507. https://doi.org/10.1214/13-AOS1090
Claeskens G, Krivobokova T, Opsomer JD (2009) Asymptotic properties of penalized spline estimators. Biometrika 96(3):529–544
Demmler A, Reinsch C (1975) Oscillation matrices with spline smoothing. Numer Math 24(5):375–382
Nychka D (1988) Bayesian confidence intervals for smoothing splines. J Am Stat Assoc 83(404):1134–1143
Duchon J (1977) Splines minimizing rotation-invariant semi-norms in Sobolev spaces. Lecture Notes in Math., vol 571, pp 85–100
Ettinger B, Perotto S, Sangalli LM (2016) Spatial regression models over two-dimensional manifolds. Biometrika 103(1):71–88. https://doi.org/10.1093/biomet/asv069
Eubank RL (1999) Nonparametric regression and spline smoothing, volume 157 of Statistics: Textbooks and Monographs, 2nd edn. Marcel Dekker, Inc., New York
Ferraccioli F (2020) Nonparametric methods for complex spatial domains: density estimation and hypothesis testing. PhD thesis, Università degli Studi di Padova
Ferraccioli F, Sangalli LM, Finos L (2021) Some first inferential tools for spatial regression with differential regularization. J Multivar Anal. https://doi.org/10.1016/j.jmva.2021.104866
Freedman DA (2006) On the so-called “huber sandwich estimator’’ and “robust standard errors’’. Am Stat 60(4):299–302
Gray RJ (1994) Spline-based tests in survival analysis. Biometrics, pp 640–652
Green PJ, Silverman BW (1994) Nonparametric regression and generalized linear models, volume 58 of monographs on statistics and applied probability. Chapman & Hall, London. https://doi.org/10.1007/978-1-4899-4473-3
Guillas S, Lai M-J (2010) Bivariate splines for spatial functional regression models. J Nonparam Stat 22(3–4):477–497. https://doi.org/10.1080/10485250903323180
Hall P, Horowitz J (2013) A simple bootstrap method for constructing nonparametric confidence bands for functions. Ann Stat 41(4):1892–1921. https://doi.org/10.1214/13-AOS1137
Heckman NE (1986) Spline smoothing in a partly linear model. J Roy Stat Soc: Ser B (Methodol) 48(2):244–248
Hemerik J, Goeman J (2018) Exact testing with random permutations. TEST 27(4):811–825
Hemerik J, Goeman JJ, Finos L (2020) Robust testing in generalized linear models by sign flipping score contributions. J R Stat Soc Ser B 82(3):841–864
Holland AD (2017) Penalized spline estimation in the partially linear model. J Multivar Anal 153:211–235
Huh M-H, Jhun M (2001) Random permutation testing in multiple linear regression. Commun Stat Theory Methods 30(10):2023–2032
Jean N, Burke M, Xie M, Davis WM, Lobell DB, Ermon S (2016) Combining satellite imagery and machine learning to predict poverty. Science 353(6301):790–794
Jerven M (2013) Poor numbers: how we are misled by African development statistics and what to do about it. Cornell University Press
Kherad-Pajouh S, Renaud O (2010) An exact permutation method for testing any effect in balanced and unbalanced fixed effect anova. Comput Stat Data Anal 54(7):1881–1893
Lai M-J, Schumaker LL (2007) Spline functions on triangulations, volume 110 of Encyclopedia of Mathematics and its Applications. Cambridge University Press, Cambridge. https://doi.org/10.1017/CBO9780511721588
Lehmann EL, Romano JP (2008) Testing statistical hypotheses. Springer Science & Business Media
Li Y, Ruppert D (2008) On the asymptotics of penalized splines. Biometrika 95(2):415–436
Maas CJM, Hox JJ (2004) Robustness issues in multilevel regression analysis. Stat Neerl 58(2):127–137
Marra G, Wood SN (2012) Coverage properties of confidence intervals for generalized additive model components. Scand J Stat 39(1):53–74
Marra G, Miller DL, Zanin L (2012) Modelling the spatiotemporal distribution of the incidence of resident foreign population. Stat Neerl 66(2):133–160
Wilhelm M, Dedè L, Sangalli LM, Wilhelm P (2016) IGS: an IsoGeometric approach for smoothing on surfaces. Comput Methods Appl Mech Eng 302:70–89. https://doi.org/10.1016/j.cma.2015.12.028
Lai M-J, Shum CK, Baramidze V, Wenston P (2009) Triangulated spherical splines for geopotential reconstruction. J Geodesy 83(4):695–708
Lai M-J, Wang L (2013) Bivariate penalized splines for regression. Stat Sin 23(3):1399–1417
O’Sullivan F (1986) A statistical perspective on ill-posed inverse problems. Stat Sci, pp 502–518
Pauly M, Brunner E, Konietschke F (2015) Asymptotic permutation tests in general factorial designs. J Roy Stat Soc B 77(2):461–473. https://doi.org/10.1111/rssb.12073
Pesarin F (2001) Multivariate permutation tests: with applications in biostatistics, volume 240. Wiley Chichester
Ruppert D, Wand MP, Carroll RJ (2003) Semiparametric regression. Number 12. Cambridge University Press
Sangalli LM (2021) Spatial regression with partial differential equation regularisation. Int Stat Rev 89(3):505–531. https://doi.org/10.1111/insr.12444
Sangalli LM, Ramsay JO, Ramsay TO (2013) Spatial spline regression models. J Roy Stat Soc B 75(4):681–703
Schervish MJ (2012) Theory of statistics. Springer Science & Business Media
Solari A, Finos L, Goeman JJ (2014) Rotation-based multiple testing in the multivariate linear model. Biometrics 70(4):954–961
Speckman P (1988) Kernel smoothing in partial linear models. J Roy Stat Soc: Ser B (Methodol) 50(3):413–436
Ugarte MD, Goicoa T, Militino AF, Durbán M (2009) Spline smoothing in small area trend estimation and forecasting. Comput Stat Data Anal 53(10):3616–3629
Ugarte MD, Goicoa T, Militino AF (2010) Spatio-temporal modeling of mortality risks using penalized splines. Environmetrics 21(3–4):270–289
Van der Vaart AW (2000) Asymptotic statistics, volume 3. Cambridge university press
Wahba G (1990) Spline models for observational data, volume 59 of CBMS-NSF regional conference series in applied mathematics. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA. https://doi.org/10.1137/1.9781611970128
Wahba G (1981) Spline interpolation and smoothing on the sphere. SIAM J Sci Stat Comput 2(1):5–16. https://doi.org/10.1137/0902002
Wahba G (1983) Bayesian “confidence intervals” for the cross-validated smoothing spline. J R Stat Soc Ser B Methodol 45(1):133–150
Wand MP, Ormerod JT (2008) On semiparametric regression with O’sullivan penalized splines. Aust N Zealand J Stat 50(2):179–198
Wang Y (2019) Smoothing splines: methods and applications. Chapman and Hall/CRC
Wang L, Wang G, Lai M-J, Gao L (2020) Efficient estimation of partially linear models for spatial data over complex domains. Stat Sin 30:347–369
Westfall PH, Young SS (1993) Resampling-based multiple testing: Examples and methods for p-value adjustment, volume 279. Wiley
Winkler AM, Ridgway GR, Webster MA, Smith SM, Nichols TE (2014) Permutation inference for the general linear model. Neuroimage 92:381–397
Wood S (2015) Package ‘mgcv’. R package version 1:29
Wood SN (2017) Generalized additive models: an introduction with R, 2nd edn. Chapman and Hall/CRC
Wood SN (2003) Thin plate regression splines. J R Stat Soc Ser B 65(1):95–114
Wood SN, Bravington MV, Hedley SL (2008) Soap film smoothing. J R Stat Soc Ser B 70(5):931–955
Xiao L (2019) Asymptotic theory of penalized splines. Electron J Stat 13(1):747–794
Yu Y, Ruppert D (2002) Penalized spline estimation for partially linear single-index models. J Am Stat Assoc 97(460):1042–1054
Yu S, Wang G, Wang L, Liu C, Yang L (2019) Estimation and inference for generalized geoadditive models. J Am Stat Assoc
Acknowledgements
We are grateful to Eleonora Arnone and Augusto Fasano for insightful discussions on this work, and to the anonymous Reviewers for constructive comments that led to a much improved manuscript.
Funding
Open access funding provided by Università degli Studi di Padova within the CRUI-CARE Agreement.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Cite this article
Ferraccioli, F., Sangalli, L.M. & Finos, L. Nonparametric tests for semiparametric regression models. TEST 32, 1106–1130 (2023). https://doi.org/10.1007/s11749-023-00868-9