SNP is a method of multivariate nonparametric time series analysis. SNP is an abbreviation of ‘semi-nonparametric’ which was introduced by Gallant and Nychka (1987) to suggest the notion of a statistical inference methodology that lies halfway between parametric and nonparametric inference. The method employs an expansion in Hermite functions to approximate the conditional density of a multivariate process.

The leading term of this expansion can be chosen, through selection of model parameters, to be a Gaussian vector autoregression (VAR) model, a semiparametric VAR model, a Gaussian ARCH model (Engle 1982), a semiparametric ARCH model, a Gaussian GARCH model (Bollerslev 1986), or a semiparametric GARCH model, either univariate or multivariate in each case. The unrestricted SNP expansion is more general than any of these models. The SNP model is fitted by maximum likelihood together with a model selection strategy that determines the appropriate order of expansion. Because the SNP model possesses a score, it is an ideal candidate for the auxiliary model in connection with efficient method of moments estimation (Gallant and Tauchen 1996). Due to its leading term, the SNP approach does not suffer from the curse of dimensionality to the same extent as kernels and splines. In regions where data are sparse, the leading term helps to fill in smoothly between data points. Where data are plentiful, the higher-order terms accommodate deviations from the leading term. The method was first proposed by Gallant and Tauchen (1989) in connection with an asset pricing application. A C++ implementation of SNP, together with a User’s Guide that is an excellent tutorial introduction to the method, is available at http://econ.duke.edu/webfiles/arg/snp/.

Important adjuncts to SNP estimation are a rejection method for simulating from the SNP density developed in Gallant and Tauchen (1992), which can be used, for example, to set bootstrapped confidence intervals as in Gallant et al. (1992); nonlinear error shock analysis as described in Gallant et al. (1993), which develops the nonlinear analog of conventional error shock analysis for linear VAR models; and re-projection, which is a form of nonlinear Kalman filtering that can be used to forecast the unobservables of nonlinear latent variables models (Gallant and Tauchen 1998).

As stated above, the SNP method is based on the notion that a Hermite expansion can be used as a general purpose approximation to a density function. Letting z denote an M–vector, we can write the Hermite density as \( h(z)\propto {\left[P(z)\right]}^2\phi (z) \), where P(z) denotes a multivariate polynomial of degree Kz and φ(z) denotes the density function of the (multivariate) Gaussian distribution with mean zero and the identity matrix as its variance. Denote the coefficients of P(z) by a, which is a vector whose length depends on Kz and M. When we wish to call attention to the coefficients, we write P(z|a).

The constant of proportionality is \( 1/\int {\left[P(s)\right]}^2\phi (s)\, ds \), which makes h(z) integrate to one. As seen from the expression that results, namely

$$ h(z)=\frac{{\left[P(z)\right]}^2\phi (z)}{\int {\left[P(s)\right]}^2\phi (s)\, ds}, $$

we are effectively expanding the square root of the density in Hermite functions of the form \( P(z)\sqrt{\phi (z)} \). Because the square root of a density is always square integrable, and because Hermite functions of this form are dense in the collection of square integrable functions (Fenton and Gallant 1996), every density has such an expansion. Because \( {\left[P(z)\right]}^2/\int {\left[P(s)\right]}^2\phi (s)\, ds \) is a homogeneous function of the coefficients of the polynomial P(z), the coefficients can only be determined to within a scalar multiple. To achieve a unique representation, the constant term of the polynomial part is put to one. Customarily the Hermite density is written with its terms orthogonalized, and the C++ code is written in the orthogonalized form for numerical efficiency, but reflecting that here would lead to cluttered notation and add nothing to the ideas.
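For concreteness, the following is a minimal univariate (M = 1) sketch of the Hermite density above, with the normalizing constant computed by numerical integration. The function name snp_density and the coefficient values are ours, chosen only for illustration; they are not part of the SNP distribution.

```python
import numpy as np
from numpy.polynomial.polynomial import polyval
from scipy.integrate import quad
from scipy.stats import norm

def snp_density(z, a):
    """Univariate SNP density h(z) = [P(z)]^2 phi(z) / int [P(s)]^2 phi(s) ds.

    a -- coefficients (a_0, ..., a_Kz) of P(z); a_0 = 1 pins down the
         scale, since the coefficients are identified only up to a
         scalar multiple (see text).
    """
    const = quad(lambda s: polyval(s, a) ** 2 * norm.pdf(s), -np.inf, np.inf)[0]
    return polyval(z, a) ** 2 * norm.pdf(z) / const

# Kz = 2 with illustrative coefficients (1, 0.3, -0.2): a skewed tilt of the normal.
grid = np.linspace(-4.0, 4.0, 801)
h = snp_density(grid, [1.0, 0.3, -0.2])
print(h.sum() * (grid[1] - grid[0]))  # approximately 1: h integrates to one
```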

A change of variables using the location-scale transformation y = Rz + μ, where R is an upper triangular matrix and μ is an M-vector, gives

$$ f\left(y|\theta \right)\propto {\left\{P\left[{R}^{-1}\left(y-\mu \right)\right]\right\}}^2\phi \left[{R}^{-1}\left(y-\mu \right)\right]/\left|\det (R)\right|. $$

The constant of proportionality is the same as above, \( 1/\int {\left[P(s)\right]}^2\phi (s)\, ds \). Because \( \phi \left[{R}^{-1}\left(y-\mu \right)\right]/\left|\det (R)\right| \) is the density function of the M-dimensional, multivariate, Gaussian distribution with mean μ and variance-covariance matrix Σ = RR′, and because the leading term of the polynomial part is one, the leading term of the entire expansion is proportional to the multivariate Gaussian density function. Denote the Gaussian density of dimension M with mean vector μ and variance matrix Σ by nM(y|μ, Σ) and write

$$ f\left(y|\theta \right)\propto {\left[P(z)\right]}^2{n}_M\left(y|\mu, \Sigma \right), $$

where \( z={R}^{-1}\left(y-\mu \right) \) for the density above.

When Kz is put to zero, one gets f(y|θ) = nM(y|μ, Σ) exactly. When Kz is positive, one gets a Gaussian density whose shape is modified by multiplication by a polynomial in \( z={R}^{-1}\left(y-\mu \right) \). The shape modifications thus achieved are rich enough to approximate accurately densities from a large class that includes densities with fat, t-like tails, densities with tails thinner than the Gaussian, and skewed densities (Gallant and Nychka 1987).
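Continuing the univariate sketch above, the location-scale step amounts to one line, and setting Kz = 0, that is a = (1), recovers the Gaussian leading term exactly; snp_location_scale is again an illustrative name of ours.

```python
def snp_location_scale(y, a, mu, r):
    """Location-scale SNP density for M = 1: with y = r*z + mu,
    f(y|theta) = h((y - mu)/r) / |r|."""
    return snp_density((y - mu) / r, a) / abs(r)

# Kz = 0 gives the Gaussian density n(y | mu, r^2) exactly:
print(snp_location_scale(0.5, [1.0], mu=0.1, r=2.0))
print(norm.pdf(0.5, loc=0.1, scale=2.0))  # identical value
```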

The parameters θ of f(y|θ) are made up of the coefficients a of the polynomial P(z) plus μ and R, and are estimated by maximum likelihood, which is accomplished by minimizing \( {s}_n\left(\theta \right)=\left(-1/n\right){\sum}_{t=1}^n\log \left[f\left({y}_t|\theta \right)\right] \). As mentioned above, if the number of parameters \( {p}_{\theta } \) grows with the sample size n, the true density and various features of it such as derivatives and moments are estimated consistently (Gallant and Nychka 1987).
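A sketch of the estimation step, under the same univariate setup: sn(θ) is coded directly and handed to a general-purpose optimizer. The parameter packing and the simulated data are assumptions made for illustration; the actual SNP distribution handles optimization and model selection itself.

```python
from scipy.optimize import minimize

def s_n(theta, data):
    """s_n(theta) = -(1/n) sum_t log f(y_t | theta) for the univariate
    location-scale SNP density; theta packs (a_1, ..., a_Kz, mu, r),
    with a_0 fixed at 1."""
    *a_tail, mu, r = theta
    a = np.concatenate(([1.0], a_tail))
    f = snp_location_scale(np.asarray(data), a, mu, r)
    return -np.mean(np.log(f))

rng = np.random.default_rng(0)
data = rng.standard_t(df=5, size=500)   # stand-in for observed data

# Kz = 2: two free polynomial coefficients plus mu and r.
fit = minimize(s_n, x0=[0.0, 0.0, 0.0, 1.0], args=(data,), method="Nelder-Mead")
print(fit.x)                            # fitted (a_1, a_2, mu, r)
```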

This basic approach can be adapted to the estimation of the conditional density of a multiple time series {yt} that has a Markovian structure. Here, the term ‘Markovian structure’ is taken to mean that the conditional density of the M–vector yt given the entire past yt−1, yt−2, ... depends only on L lags from the past. For convenience, we will presume that the data are from a process with a Markovian structure, but one should be aware that, if L is sufficiently large, then non-Markovian data can be well approximated by an SNP density (Gallant and Long 1997). Collect these lags together as xt−1 = (yt−1, yt−2, ..., yt−L), where L exceeds all lags in the following discussion.

To approximate the conditional density of {yt} using the ideas above, begin with a sequence of innovations {zt}. First consider the case of homogeneous innovations; that is, the distribution of zt does not depend on xt−1. Then, as above, the density of zt can be approximated by \( h(z)\propto {\left[P(z)\right]}^2\phi (z) \), where P(z) is a polynomial of degree Kz. Follow with the location-scale transformation yt = Rzt + μx, where μx is a linear function that depends on Lu lags:

$$ {\mu}_x={b}_0+{Bx}_{t-1}. $$

(If Lu < L, then some elements of B are zero.) The density that results is

$$ f\left(y|x,\theta \right)\propto {\left[P(z)\right]}^2{n}_M\left(y|{\mu}_x,\Sigma \right), $$

where \( z={R}^{-1}\left(y-{\mu}_x\right) \). The constant of proportionality is as above, \( 1/\int {\left[P(s)\right]}^2\phi (s)\, ds \). The leading term of the expansion is nM(y|μx, Σ), which is a Gaussian vector autoregression, or Gaussian VAR. When Kz is put to zero, one gets nM(y|μx, Σ) exactly. When Kz is positive, one gets a semiparametric VAR density.
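The conditional-mean function of the VAR leading term is simple enough to state as code. The sketch below, with hypothetical dimensions, just stacks the lags into xt−1 and applies μx = b0 + Bxt−1.

```python
import numpy as np

def var_mean(y_lags, b0, B):
    """mu_x = b0 + B x_{t-1}, where x_{t-1} stacks the lagged M-vectors
    y_{t-1}, ..., y_{t-L} (most recent first).
    y_lags -- shape (L, M); b0 -- shape (M,); B -- shape (M, M*L)."""
    x = y_lags.reshape(-1)       # x_{t-1} as an (M*L,)-vector
    return b0 + B @ x

# Hypothetical M = 2, L = 2:
rng = np.random.default_rng(1)
y_lags = rng.standard_normal((2, 2))
mu_x = var_mean(y_lags, b0=np.zeros(2), B=0.1 * rng.standard_normal((2, 4)))
print(mu_x)                      # the conditional mean mu_x
```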

To approximate conditionally heterogeneous processes, proceed as above but let each coefficient of the polynomial P(z) be a polynomial of degree Kx in x. A polynomial in z of degree Kz whose coefficients are polynomials of degree Kx in x is, of course, a polynomial in (z, x) of degree Kz + Kx. Denote this polynomial by P(z, x). Denote the mapping from x to the coefficients a of P(z) such that \( P\left(z|{a}_x\right)=P\left(z,x\right) \) by \( {a}_x \), and denote the number of lags on which it depends by Lp. The form of the density with this modification is

$$ f\left(y|x,\theta \right)\propto {\left[P\left(z,x\right)\right]}^2{n}_M\left(y|{\mu}_x,\Sigma \right), $$

where \( z={R}^{-1}\left(y-{\mu}_x\right) \). The constant of proportionality is \( 1/\int {\left[P\left(s,x\right)\right]}^2\phi (s)\, ds \). When Kx is zero, the density reverts to the density above. When Kx is positive, the shape of the density will depend upon x. Thus, all moments can depend upon x and the density can, in principle, approximate any form of conditional heterogeneity (Gallant and Tauchen 1989).
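In code, the move from P(z) to P(z, x) is just a change in how the coefficient vector is produced: ax is computed from x before the density is evaluated. The sketch below does this for M = 1 with a scalar x; the array A and the function names are illustrative assumptions of ours, not the SNP code’s conventions.

```python
def coeff_map(x, A):
    """a_x: each coefficient of P(z) is a degree-Kx polynomial in x,
    a_j(x) = sum_k A[j, k] * x**k.  Fixing A[0, :] = (1, 0, ..., 0)
    keeps the constant term of the polynomial part equal to one."""
    return polyval(x, A.T)   # evaluate each coefficient's polynomial at x

def snp_conditional(y, x, A, mu_x, r):
    """f(y|x, theta): the homogeneous density with x-dependent coefficients."""
    return snp_density((y - mu_x) / r, coeff_map(x, A)) / abs(r)

# Kz = 2, Kx = 1: the shape of the density now varies with x.
A = np.array([[1.0, 0.0],      # a_0(x) = 1
              [0.2, 0.1],      # a_1(x) = 0.2 + 0.1 x
              [-0.1, 0.05]])   # a_2(x) = -0.1 + 0.05 x
print(snp_conditional(0.3, x=-1.0, A=A, mu_x=0.0, r=1.0))
print(snp_conditional(0.3, x=+1.0, A=A, mu_x=0.0, r=1.0))  # differs: heterogeneity
```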

In practice the second moment can exhibit marked dependence upon x, and in an attempt to track the second moment, Kx can get quite large. To keep Kx small when data are markedly conditionally heteroskedastic, the leading term nM(y|μx, Σ) of the expansion can be put to a Gaussian GARCH rather than a Gaussian VAR. SNP uses a modified BEKK expression, as described in Engle and Kroner (1995); the modifications are to add leverage and level effects:

$$ {\displaystyle \begin{array}{ll}{\Sigma}_{x_{t-1}}=& {R}_0{R}_0^{\prime }+\sum \limits_{i=1}^{L_g}{Q}_i{\Sigma}_{x_{t-1-i}}{Q}_i^{\prime}\hfill \\ {}& +\sum \limits_{i=1}^{L_r}{P}_i\left({y}_{t-i}-{\mu}_{x_{t-1-i}}\right){\left({y}_{t-i}-{\mu}_{x_{t-1-i}}\right)}^{\prime }{P}_i^{\prime}\hfill \\ {}& +\sum \limits_{i=1}^{L_v}\max \left[0,{V}_i\left({y}_{t-i}-{\mu}_{x_{t-1-i}}\right)\right]\max {\left[0,{V}_i\left({y}_{t-i}-{\mu}_{x_{t-1-i}}\right)\right]}^{\prime}\hfill \\ {}& +\sum \limits_{i=1}^{L_w}{W}_i{x}_{\left(1\right),t-i}{x}_{\left(1\right),t-i}^{\prime }{W}_i^{\prime }.\hfill \end{array}} $$

Above, R0 is an upper triangular matrix. The matrices Pi, Qi, Vi, and Wi can be scalar, diagonal, or full M by M matrices. The notation x(1),t−i indicates that only the first column of xt−i enters the computation. The max(0, x) function is applied elementwise. Because \( {\Sigma}_{x_{t-1}} \) must be differentiable with respect to the parameters of \( {\mu}_{x_{t-1-i}} \), the max(0, x) function is approximated by a twice continuously differentiable cubic spline. Defining \( {R}_{x_{t-1}} \) by the factorization \( {\Sigma}_{x_{t-1}}={R}_{x_{t-1}}{R}_{x_{t-1}}^{\prime } \) and writing x for xt−1, the SNP density becomes

$$ f\left(y|x,\theta \right)\propto {\left[P\left(z,x\right)\right]}^2{n}_M\left(y|{\mu}_x,{\Sigma}_x\right), $$

where \( z={R}_x^{-1}\left(y-{\mu}_x\right) \). The constant of proportionality is \( 1/\int {\left[P\left(s,x\right)\right]}^2\phi (s)\, ds \). The leading term nM(y|μx, Σx) is Gaussian ARCH if Lg = 0 and Lr > 0, and Gaussian GARCH if both Lg > 0 and Lr > 0 (leaving aside the implications of Lv and Lw).
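A sketch of one step of the variance recursion, under hypothetical shapes and names: it follows the displayed formula term by term, except that the plain elementwise max(0, ·) is used where the SNP code substitutes a twice continuously differentiable cubic spline.

```python
import numpy as np

def bekk_step(R0, Qs, Ps, Vs, Ws, Sigma_lags, resid_lags, x1_lags):
    """One step of the modified BEKK recursion for Sigma_{x_{t-1}}.
    R0 -- upper triangular (M, M); Qs, Ps, Vs, Ws -- lists of (M, M)
    matrices of lengths Lg, Lr, Lv, Lw; Sigma_lags -- past Sigma's,
    most recent first; resid_lags -- past residuals y_{t-i} - mu_{x_{t-1-i}};
    x1_lags -- past x_(1),t-i vectors (first column of x only)."""
    Sigma = R0 @ R0.T
    for Q, S in zip(Qs, Sigma_lags):            # GARCH terms
        Sigma = Sigma + Q @ S @ Q.T
    for Pmat, e in zip(Ps, resid_lags):         # ARCH terms
        Sigma = Sigma + np.outer(Pmat @ e, Pmat @ e)
    for V, e in zip(Vs, resid_lags):            # leverage terms
        u = np.maximum(0.0, V @ e)              # spline-smoothed in SNP itself
        Sigma = Sigma + np.outer(u, u)
    for W, x1 in zip(Ws, x1_lags):              # level terms
        Sigma = Sigma + np.outer(W @ x1, W @ x1)
    return Sigma
```

Factoring the result as \( {\Sigma}_{x_{t-1}}={R}_{x_{t-1}}{R}_{x_{t-1}}^{\prime } \), for example by a Cholesky decomposition, then yields the \( {R}_{x_{t-1}} \) needed to form z.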

See Also