The New Palgrave Dictionary of Economics

2018 Edition
| Editors: Macmillan Publishers Ltd

Varying Coefficient Models

  • Andros Kourtellos
  • Thanasis Stengos
Reference work entry


Varying coefficient models offer a compromise between fully nonparametric and parametric models: they give the response coefficients of standard regression models the flexibility needed to uncover hidden structures in the data, without running into the curse of dimensionality.


Functional coefficient models; Heteroskedasticity; Least squares; Linear regression models; Maximum likelihood; Nonparametric estimation; Parameter heterogeneity; Random coefficients model; Smooth coefficient models; Tuning variables; Varying coefficient models

JEL Classifications


One of the most interesting forms of nonlinear regression models is the varying coefficient model (VCM). VCMs were introduced by Hastie and Tibshirani (1993); unlike the linear regression model, they allow the regression coefficients to vary systematically and smoothly in more than one dimension. It is worth noting the distinction between the VCM and the so-called random coefficients model, which assumes that the coefficients vary non-systematically (randomly). Versions of the VCM are encountered in the literature as functional coefficient models (see Cai et al. 2000b) and smooth coefficient models (see Li et al. 2002).

VCMs are very useful tools in applied work in economics as they can be used to model parameter heterogeneity in a general way. For example, Durlauf et al. (2001) estimate a version of the Solow model that allows the parameters for each country to vary as functions of initial income. This work is extended in Kourtellos (2005), who finds parameter dependence on initial literacy, initial life expectancy, expropriation risk and ethnolinguistic fractionalization. Li et al. (2002) use the above smooth coefficient model to estimate the production function of the non-metal mineral industry in China. Stengos and Zacharias (2006) use the same model to examine an intertemporal hedonic model of the personal computer market, where the coefficients of the hedonic regression were unknown functions of time. Hong and Lee (2003) forecast the nonlinearity in the conditional mean of exchange rate changes using a VCM that allows the autoregressive coefficients to vary with investment positions. Ahmad et al. (2005) apply the VCM in the estimation of a production function in China’s manufacturing industry to show that the marginal productivity of labour and capital depends on the firm’s R&D values. Mamuneas et al. (2006) study the effect of human capital on total factor productivity in an empirical growth framework. In what follows we present the basic structure of the standard VCM specification as it appears in the literature and then proceed to discuss certain aspects of estimation and some of its recent generalizations.

Basic Specification

Consider the following VCM
$$ {y}_i=\beta {\left({z}_i\right)}^{\prime }{X}_i+{u}_i \tag{1} $$

with E(ui|Xi) = 0, where Xi = (1, xi2, ..., xip)′ is a p-dimensional vector of slope regressors and β(zi)′ = (β1(zi1), β2(zi2), ..., βp(zip)) is a p-dimensional vector of varying coefficients, which take the form of unknown smooth functions of zi1, zi2, ..., zip, respectively. Notice that β1(zi1) is a varying intercept that measures the direct relationship between the tuning variable zi1 and the dependent variable in a nonparametric way. We refer to the variables zij as tuning variables. The coefficient functions map the tuning variables into a set of local regression coefficients, so that the effect of Xi on yi is not constant but varies smoothly with the tuning variables. Each tuning variable can be a scalar, such as time, or a vector of dimension q. A common situation in the literature arises when zij is the same for all j, so that all coefficients depend on a common tuning variable zi.
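As an illustration of the specification, the following is a minimal simulation sketch, assuming p = 2, a common scalar tuning variable, and two coefficient functions chosen purely for illustration (a sine-shaped varying intercept and a quadratic varying slope). The function name `simulate_vcm` and all numerical settings are hypothetical and do not come from the literature cited here.

```python
import numpy as np

def simulate_vcm(n=500, seed=0):
    """Draw (z, X, y) from y_i = beta1(z_i) + beta2(z_i) * x_i2 + u_i."""
    rng = np.random.default_rng(seed)
    z = rng.uniform(0.0, 1.0, size=n)                      # common scalar tuning variable
    X = np.column_stack([np.ones(n), rng.normal(size=n)])  # X_i = (1, x_i2)'
    # Assumed coefficient functions (illustrative only):
    # varying intercept beta1(z) and varying slope beta2(z).
    beta = np.column_stack([np.sin(2 * np.pi * z), 1.0 + z ** 2])
    u = 0.1 * rng.normal(size=n)                           # error term
    y = np.sum(beta * X, axis=1) + u                       # beta(z_i)' X_i + u_i
    return z, X, y, beta

z, X, y, beta = simulate_vcm()
# The marginal effect of x_i2 on y_i is beta2(z_i), which differs across observations.
print(beta[:, 1].min(), beta[:, 1].max())
```

In this design the marginal effect of the regressor ranges between 1 and 2 depending on the tuning variable, which is the kind of parameter heterogeneity a linear model with constant coefficients would average out.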

It is worth noting that the VCM (1) is a very flexible and rich family of models. One reason is that the additive separable structure of (1) offers a useful compromise relative to general high-dimensional nonparametric regression, which is known to suffer from the curse of dimensionality. This allows for nonparametric estimation even when the conditioning regressor space is high dimensional. Another is that it nests many well-known models as special cases. For instance, if βj(zij) = βj for all j, then we are dealing with the usual linear model. If βj(zij) = βjzij for some j, we simply have the interaction term βjxijzij entering the regression function. If xij = c (a constant) or if zij = xij for all j = 1, …, p, then the model takes the generalized additive form where the additive components are unknown functions (see Hastie and Tibshirani 1990; Linton and Nielsen 1995).

We now set out some estimation issues. A popular estimation approach is based on local polynomial regression, as illustrated by Fan (1992), Fan and Gijbels (1996), and Fan and Zhang (1999), which we present in the context of a random sample design. Given a random sample \( {\left\{\left({z}_i,{X}_i,{y}_i\right)\right\}}_{i=1}^n \), the estimation procedure solves a simple local least squares problem. To be precise, for each given point z0 the functions βj(z), j = 1, ..., p, are approximated by local linear polynomials βj(z) ≈ cj0 + cj1(z − z0) for z in a neighborhood of z0. This leads to the following weighted local least squares problem:
$$ \sum_{i=1}^n{\left[{y}_i-\sum_{j=1}^p\left\{{c}_{j0}+{c}_{j1}\left({z}_i-{z}_0\right)\right\}{X}_{ij}\right]}^2{K}_h\left({z}_i-{z}_0\right) \tag{2} $$

for a given kernel function K and bandwidth h, where Kh(·) = K(·/h)/h. The minimizing value of cj0 is the estimate of βj(z0). While this method is simple, it implicitly assumes that the functions βj(z) possess the same degree of smoothness and hence can be approximated equally well in the same interval. Fan and Zhang (1999) allow for different degrees of smoothness for different coefficient functions by proposing a two-stage method. This is similar in spirit to what Huang and Shen (2004) do for global smoothers: they use regression splines and allow each coefficient function to have its own (global) smoothing parameter.
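The weighted local least squares problem can be sketched in a few lines as a weighted regression of yi on Xij and Xij(zi − z0). The sketch below assumes a common scalar tuning variable, an Epanechnikov kernel, and a fixed bandwidth; the function names, the simulated design, and the bandwidth value are illustrative assumptions, not a recommended implementation.

```python
import numpy as np

def epanechnikov(t):
    """Epanechnikov kernel K(t) = 0.75 (1 - t^2) on [-1, 1], zero elsewhere."""
    return 0.75 * np.clip(1.0 - t ** 2, 0.0, None)

def local_linear_vcm(z, X, y, z0, h):
    """Return the local linear estimates of beta_j(z0), j = 1..p."""
    n, p = X.shape
    dz = z - z0
    w = epanechnikov(dz / h) / h              # K_h(z_i - z0)
    D = np.hstack([X, X * dz[:, None]])       # columns: X_ij, then X_ij (z_i - z0)
    sw = np.sqrt(w)
    # Weighted least squares via ordinary least squares on sqrt-weighted data.
    coef, *_ = np.linalg.lstsq(D * sw[:, None], y * sw, rcond=None)
    return coef[:p]                           # c_j0, the estimates of beta_j(z0)

# Illustrative data (assumed design, not taken from the entry).
rng = np.random.default_rng(0)
n = 2000
z = rng.uniform(0.0, 1.0, n)
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = np.sin(2 * np.pi * z) * X[:, 0] + (1.0 + z ** 2) * X[:, 1] + 0.1 * rng.normal(size=n)

beta_hat = local_linear_vcm(z, X, y, z0=0.5, h=0.1)
print(beta_hat)  # should lie near the true values (sin(pi), 1 + 0.5^2) = (0, 1.25)
```

Repeating the call over a grid of z0 values traces out the estimated coefficient functions. Because a single h is used for every coefficient, the sketch inherits the equal-smoothness assumption discussed above.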

An attractive alternative to local polynomial estimation is a global smoothing method based on general series methods such as polynomial splines and trigonometric approximation (see Ahmad et al. 2005; Huang et al. 2004; Huang and Shen 2004; Xue and Yang 2006a). All these papers emphasize the computational savings from having to solve only one minimization problem. Ahmad et al. (2005) stress the efficiency gains of the series approach over a kernel-based approach when one allows for conditional heteroskedasticity. We should note that inference for the estimated coefficients will differ across choices of approximation, and that the asymptotic properties of such estimators are generally not easy to obtain.
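The "one minimization problem" feature of series methods can be illustrated as follows. In this sketch each coefficient function is approximated by a cubic truncated-power spline in a common scalar tuning variable, so the whole model becomes a single linear regression on the products Xij·Bk(zi). The basis, the knot locations, and all function names are assumptions made for illustration; the cited papers use their own bases and smoothing-parameter rules.

```python
import numpy as np

def spline_basis(z, knots):
    """Cubic truncated-power basis: 1, z, z^2, z^3, (z - k)_+^3 for each knot k."""
    cols = [np.ones_like(z), z, z ** 2, z ** 3]
    cols += [np.clip(z - k, 0.0, None) ** 3 for k in knots]
    return np.column_stack(cols)

def series_vcm_fit(z, X, y, knots):
    """Fit beta_j(z) = sum_k gamma_jk B_k(z) by one global least squares problem."""
    B = spline_basis(z, knots)                         # n x m basis matrix
    n, p = X.shape
    m = B.shape[1]
    # Regressors X_ij * B_k(z_i); column j*m + k pairs regressor j with basis term k.
    D = (X[:, :, None] * B[:, None, :]).reshape(n, p * m)
    gamma, *_ = np.linalg.lstsq(D, y, rcond=None)
    return gamma.reshape(p, m)                         # row j holds the gamma_jk

def series_vcm_predict(gamma, z_grid, knots):
    """Evaluate the fitted coefficient functions; column j is beta_j on the grid."""
    return spline_basis(z_grid, knots) @ gamma.T

# Illustrative data (assumed design, not taken from the entry).
rng = np.random.default_rng(1)
n = 2000
z = rng.uniform(0.0, 1.0, n)
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = np.sin(2 * np.pi * z) + (1.0 + z ** 2) * X[:, 1] + 0.1 * rng.normal(size=n)

knots = [0.25, 0.5, 0.75]
gamma = series_vcm_fit(z, X, y, knots)
z_grid = np.linspace(0.05, 0.95, 19)
beta_grid = series_vcm_predict(gamma, z_grid, knots)
```

Unlike the local approach, the fit is computed once and then evaluated anywhere. Giving each coefficient function its own basis or knot set would correspond to the coefficient-specific smoothing parameters mentioned above.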

Although the model was initially developed for i.i.d. data, it has been extended to time series data by Chen and Tsay (1993), Cai et al. (2000b), Huang and Shen (2004), and Cai (2007) for strictly stationary processes with different mixing conditions. The coefficient functions typically now become functions of time and/or lagged values of the dependent variable. It is worth noting that estimation issues such as bandwidth selection are similar to those in the i.i.d. case (see Cai 2007). The varying coefficient model has also been employed to analyse longitudinal data (see Brumback and Rice 1998; Hoover et al. 1998; Huang et al. 2004).

The Partially Linear Varying Coefficient Model

An interesting special case of eq. (1), where the unknown coefficient functions depend on a common zi, is the partially linear VCM. Here some of the coefficients are constants (independent of zi). In that case, eq. (1) can be rewritten as
$$ {y}_i={\alpha}^{\prime }{W}_i+\beta {\left({z}_i\right)}^{\prime }{X}_i+{u}_i \tag{3} $$

where Wi is the ith observation on a (q × 1) vector of additional regressors that enter the regression function linearly. The estimation of this model requires some special treatment, as the partially linear structure may allow for efficiency gains: the linear part can be estimated at a much faster rate, namely the parametric \( \sqrt{n} \) rate.

The partially linear VCM has been studied by Zhang et al. (2002), Xia et al. (2004), Ahmad et al. (2005), and Fan and Huang (2005). Zhang et al. (2002) suggest a two-step procedure where the coefficients of the linear part are estimated in the first step by polynomial fitting with a small initial bandwidth, using cross validation (see Hoover et al. 1998). In other words, the approach is based on under-smoothing in the first stage. These estimates are then averaged to yield the final first-step estimates of the linear part, which are used to redefine the dependent variable and return to the environment of eq. (1), where local smoothers can be applied as described above. Alternatively, Xia et al. (2004) separate the estimation of α from that of β(zi) by noting that the former can be estimated globally, but the latter locally. This is what they call a ‘semi-local least squares procedure’, and they achieve a more efficient estimate of α without under-smoothing, using standard bandwidth selection methods. Once α has been estimated, the linear part can again be used to redefine the dependent variable and return to the environment of eq. (1).

More recently, Fan and Huang (2005) use a profile least squares estimation approach to provide a simple and useful method for (3). More precisely, they construct a Wald test and a profile likelihood ratio test for the parametric component that share similar sampling properties. More importantly, they show that the asymptotic distribution of the profile likelihood ratio test under the null is independent of nuisance parameters, and follows an asymptotic χ2 distribution. They also propose a generalized likelihood ratio test statistic to test whether certain parametric functions fit the nonparametric varying coefficients. This hypothesis test includes testing for the significance of the slope variables X (zero coefficients) and the homogeneity of the model (constant coefficients). Other work on specification testing includes Li et al. (2002), Cai et al. (2000b), Cai (2007), and Yang et al. (2006), which mainly rely on bootstrapping in their implementation.
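The idea of profiling out the varying-coefficient part can be sketched as follows. For a fixed α, the model yi − α′Wi = β(zi)′Xi + ui is a pure VCM, so its fitted values are S(y − Wα) for a linear smoother matrix S. Substituting back gives an ordinary least squares problem in α on the partialled-out data (I − S)W and (I − S)y. This is only a sketch in the spirit of profile least squares, assuming a Gaussian kernel, a local linear smoother, a common scalar tuning variable, and a hand-picked bandwidth; the function names and the simulated design are hypothetical, and the code does not reproduce the tests or bandwidth rules of the cited papers.

```python
import numpy as np

def smoother_matrix(z, X, h):
    """n x n matrix S with (S v)_i = X_i' beta_hat(z_i) from a local linear fit to v."""
    n, p = X.shape
    S = np.zeros((n, n))
    for i in range(n):
        dz = z - z[i]
        w = np.exp(-0.5 * (dz / h) ** 2)        # Gaussian kernel weights (scale cancels)
        D = np.hstack([X, X * dz[:, None]])     # local linear design at z0 = z_i
        A = D.T * w                             # D'K, shape (2p, n)
        H = np.linalg.solve(A @ D, A)           # (D'KD)^{-1} D'K
        S[i] = X[i] @ H[:p]                     # fitted value uses the level terms only
    return S

def profile_ls(z, X, W, y, h):
    """Estimate alpha by least squares after partialling out the smooth part."""
    S = smoother_matrix(z, X, h)
    R = np.eye(len(y)) - S
    alpha, *_ = np.linalg.lstsq(R @ W, R @ y, rcond=None)
    smooth_fit = S @ (y - W @ alpha)            # fitted beta(z_i)' X_i
    return alpha, smooth_fit

# Illustrative data (assumed design, not taken from the entry).
rng = np.random.default_rng(2)
n = 300
z = rng.uniform(0.0, 1.0, n)
X = np.column_stack([np.ones(n), rng.normal(size=n)])
W = rng.normal(size=(n, 2))                     # regressors with constant coefficients
alpha_true = np.array([0.5, -1.0])
y = (W @ alpha_true + np.sin(np.pi * z) + (1.0 + z ** 2) * X[:, 1]
     + 0.1 * rng.normal(size=n))

alpha_hat, smooth_fit = profile_ls(z, X, W, y, h=0.08)
print(alpha_hat)  # should lie near alpha_true = (0.5, -1.0)
```

Note that W contains no constant column here, since the intercept is already absorbed by the varying intercept in β(zi).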

Generalizations and Extensions

A useful generalization of (1) is to allow the dependent variable to be related to the regression function m(Xi, zi) = β(zi)′Xi nonlinearly via some given link function g(·):
$$ {y}_i=g\left(\beta {\left({z}_i\right)}^{\prime }{X}_i\right)+{u}_i \tag{4} $$

This generalization is known as the generalized varying coefficient model and was originally proposed by Hastie and Tibshirani (1993). Cai et al. (2000a) study this model using local polynomial techniques and propose an efficient one-step local maximum likelihood estimator. Notice that if g(·) is the normal CDF then (4) generalizes the standard tool of the discrete choice literature, namely the probit model.
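To make the local likelihood idea concrete, here is a sketch for the probit case only. It assumes a binary yi with P(yi = 1 | Xi, zi) = Φ(β(zi)′Xi), a common scalar tuning variable, and the same local linear approximation and kernel weights as in the weighted local least squares problem above. The symbols ℓ and ηi are introduced here for illustration and are not notation from the cited papers. The local log-likelihood at a point z0 would then be

$$ \ell \left(c;{z}_0\right)=\sum_{i=1}^n\left[{y}_i\log \Phi \left({\eta}_i\right)+\left(1-{y}_i\right)\log \left\{1-\Phi \left({\eta}_i\right)\right\}\right]{K}_h\left({z}_i-{z}_0\right),\qquad {\eta}_i=\sum_{j=1}^p\left\{{c}_{j0}+{c}_{j1}\left({z}_i-{z}_0\right)\right\}{X}_{ij} $$

Maximizing over the cj0 and cj1 gives the maximizing cj0 as the estimate of βj(z0). With an identity link and Gaussian errors the same construction reduces to the weighted local least squares problem.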

Another strand of the literature allows for a multivariate tuning variable z = (z1, …, zq). Although Hastie and Tibshirani (1993) proposed a back-fitting algorithm to estimate the varying coefficient functions, they did not provide any asymptotic justification. The most notable advance in this context has been by Xue and Yang (2006a), who propose a generalization of the VCM in (1) that gives the varying coefficients an additive structure in order to avoid the curse of dimensionality:
$$ {\beta}_j(z)={\gamma}_{j0}+{\gamma}_{j1}\left({z}_1\right)+\cdots +{\gamma}_{jq}\left({z}_q\right)\quad \mathrm{for}\;\mathrm{all}\;j. \tag{5} $$

Under mixing conditions, Xue and Yang (2006a) propose local polynomial marginal integration estimators, while Xue and Yang (2006b) study this model using polynomial splines.
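Substituting the additive coefficient structure into (1) makes the dimension reduction explicit. Writing zi = (zi1, …, ziq) for the ith observation on the tuning variables (notation assumed here for illustration), the model becomes

$$ {y}_i=\sum_{j=1}^p\left\{{\gamma}_{j0}+\sum_{l=1}^q{\gamma}_{jl}\left({z}_{il}\right)\right\}{X}_{ij}+{u}_i $$

so that only univariate functions γjl need to be estimated, rather than p functions of a q-dimensional argument. Some normalization, for example E[γjl(zil)] = 0 for each j and l, is needed to separate the constants γj0 from the functions; that particular normalization is stated here as an assumption for illustration.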

Finally, Cai et al. (2006) have shifted the discussion to consider a structural VCM. They examine the case of endogenous slope regressors, and propose a two-stage IV procedure based on local linear estimation procedures in both stages. We believe that this line of research is fruitful for economic applications.


VCMs have increasingly been employed as useful tools that allow for a compromise between fully nonparametric and parametric models. This compromise provides the flexibility needed to uncover hidden structures that underlie the response coefficients of standard regression models without running into the curse of dimensionality. More importantly, the structure of the VCM, which allows the regression coefficients to vary with a tuning variable, is very appealing in many economic applications, for it has a natural interpretation in terms of non-constant marginal effects.

See Also


  1. Ahmad, I., S. Leelahanon, and Q. Li. 2005. Efficient estimation of a semiparametric partially varying linear model. Annals of Statistics 33: 258–283.
  2. Brumback, B., and J. Rice. 1998. Smoothing spline models for the analysis of nested and crossed samples of curves. Journal of the American Statistical Association 93: 961–976.
  3. Cai, Z. 2007. Trending time-varying coefficient time series models with serially correlated errors. Journal of Econometrics 136: 163–188.
  4. Cai, Z., J. Fan, and R. Li. 2000a. Efficient estimation and inferences for varying-coefficient models. Journal of the American Statistical Association 95: 888–902.
  5. Cai, Z., J. Fan, and Q. Yao. 2000b. Functional coefficient regression models for nonlinear time series models. Journal of the American Statistical Association 95: 941–956.
  6. Cai, Z., M. Das, H. Xiong, and Z. Wu. 2006. Functional coefficient instrumental variables models. Journal of Econometrics 133: 207–241.
  7. Chen, R., and R. Tsay. 1993. Functional coefficient autoregressive models. Journal of the American Statistical Association 88: 298–308.
  8. Durlauf, S., A. Kourtellos, and A. Minkin. 2001. The local Solow growth model. European Economic Review 45: 928–940.
  9. Fan, J. 1992. Design-adaptive nonparametric regression. Journal of the American Statistical Association 87: 998–1004.
  10. Fan, J., and I. Gijbels. 1996. Local polynomial modelling and its applications. London: Chapman and Hall.
  11. Fan, J., and T. Huang. 2005. Profile likelihood inferences on semiparametric varying-coefficient partially linear models. Bernoulli 11: 1031–1057.
  12. Fan, J., and W. Zhang. 1999. Statistical estimation in varying-coefficient models. Annals of Statistics 27: 1491–1518.
  13. Hastie, T., and R. Tibshirani. 1990. Generalized additive models. New York: Chapman and Hall.
  14. Hastie, T., and R. Tibshirani. 1993. Varying coefficient models. Journal of the Royal Statistical Society, Series B 55: 757–796.
  15. Hong, Y., and T.-H. Lee. 2003. Inference on predictability of foreign exchange rates via generalized spectrum and nonlinear time series models. The Review of Economics and Statistics 85: 1048–1062.
  16. Hoover, D., C. Rice, C. Wu, and L. Yang. 1998. Nonparametric smoothing estimates of time-varying coefficient models with longitudinal data. Biometrika 85: 809–822.
  17. Huang, J., and H. Shen. 2004. Functional coefficient regression models for nonlinear time series: A polynomial spline approach. Scandinavian Journal of Statistics 31: 515–534.
  18. Huang, J., C. Wu, and L. Zhou. 2004. Polynomial spline estimation and inference for varying coefficient models with longitudinal data. Statistica Sinica 14: 763–788.
  19. Kourtellos, A. 2005. Modeling parameter heterogeneity in cross-country growth regression models. Mimeo, Department of Economics, University of Cyprus.
  20. Li, Q., C. Huang, D. Li, and T. Fu. 2002. Semiparametric smooth coefficient models. Journal of Business and Economic Statistics 20: 412–422.
  21. Linton, O., and J. Nielsen. 1995. A kernel method of estimating structural nonparametric regression based on marginal integration. Biometrika 82: 93–100.
  22. Mamuneas, T., A. Savvides, and T. Stengos. 2006. Economic development and the return to human capital: A smooth coefficient semiparametric approach. Journal of Applied Econometrics 21: 111–132.
  23. Stengos, T., and E. Zacharias. 2006. Intertemporal pricing and price discrimination: A semiparametric hedonic analysis of the personal computer market. Journal of Applied Econometrics 21: 371–386.
  24. Stone, C. 1977. Consistent nonparametric regression. Annals of Statistics 5: 595–620.
  25. Xia, Y., W. Zhang, and H. Tong. 2004. Efficient estimation for semivarying-coefficient models. Biometrika 91: 661–681.
  26. Xue, L., and L. Yang. 2006a. Estimation of semiparametric additive coefficient model. Journal of Statistical Planning and Inference 136: 2506–2534.
  27. Xue, L., and L. Yang. 2006b. Additive coefficient modeling via polynomial spline. Statistica Sinica 16: 1423–1446.
  28. Yang, L., B. Park, L. Xue, and W. Härdle. 2006. Estimation and testing for varying coefficients in additive models with marginal integration. Journal of the American Statistical Association 101: 1212–1227.
  29. Zhang, W., S.-Y. Lee, and X. Song. 2002. Local polynomial fitting in semivarying coefficient model. Journal of Multivariate Analysis 82: 166–188.

Copyright information

© Macmillan Publishers Ltd. 2018
