Asymptotic Inferences in a Doubly-Semi-Parametric Linear Longitudinal Mixed Model

Abstract

Warriyar and Sutradhar (Brazilian J. Probab. Stat., 28, 561–586, 2014) studied a semi-parametric linear model in a longitudinal setup with Gaussian errors, where the main regression parameters were estimated using an efficient GQL (generalized quasi-likelihood) estimation approach, and the efficiency properties of the estimators were examined through a simulation study. In this paper we generalize their linear semi-parametric regression model to a wider setup in which the Gaussian assumption is relaxed and the errors are assumed to follow a four-moments based semi-parametric structure, leading to a doubly semi-parametric model. Beyond the estimation of the regression parameters and the nonparametric function, the doubly semi-parametric nature of the model makes the four-moments based estimation of the variance and correlation parameters quite challenging. We resolve this computational issue analytically by developing exact formulas for all necessary higher order moments. Because longitudinal studies involve a large number of independent individuals providing repeated responses, we study the asymptotic properties of the estimators and show that the estimators, including the estimator of the nonparametric function, are consistent.

References

  1. Altman, N.S. (1990). Kernel smoothing of data with correlated errors. J. Am. Stat. Assoc. 85, 749–758.

  2. Amemiya, T. (1985). Advanced econometrics. Harvard University Press, Cambridge.

  3. Bishop, Y.M.M., Fienberg, S.E. and Holland, P.W. (1975). Discrete Multivariate Analysis: Theory and Practice. Springer, New York.

  4. Bun, M.J.G. and Carree, M.A. (2005). Bias-corrected estimation in dynamic panel data models. J. Business Econo. Statist. 23, 200–210.

  5. Chen, J., Li, D., Liang, H. and Wang, S. (2015). Semiparametric GEE analysis in partial linear single-index models for longitudinal data. Ann. Stat. 43, 1682–1715.

  6. Fleishman, A.I. (1978). A method of simulating non-normal distributions. Psychometrika 43, 521–532.

  7. Hsiao, C. (2003). Analysis of panel data. Cambridge University Press, Cambridge.

  8. McDonald, D.R. (2005). The local limit theorem: a historical perspective. J. Iranian Stat. Soc. 4, 73–86.

  9. Pagan, A. and Ullah, A. (1999). Nonparametric econometrics. Cambridge University Press, Cambridge.

  10. Rao, R.P., Sutradhar, B.C. and Pandit, V.N. (2012). GMM versus GQL inferences in linear dynamic panel data models. Brazilian J. Probab. Stat. 26, 167–177.

  11. Sneddon, G. and Sutradhar, B.C. (2004). On semiparametric familial-longitudinal models. Stat. Probab. Lett. 69, 369–379.

  12. Sutradhar, B.C. (2003). An overview on regression models for discrete longitudinal responses. Stat. Sci. 18, 377–393.

  13. Sutradhar, B.C. (2011). Dynamic mixed models for familial longitudinal data. Springer, New York.

  14. Sutradhar, B.C., Rao, R.P. and Pandit, V.N. (2008). Generalized method of moments versus generalized quasi-likelihood inferences in binary panel data models. Sankhya B 70, 34–62.

  15. Wang, N., Carroll, R.J. and Lin, X. (2005). Efficient semi-parametric marginal estimation for longitudinal/clustered data. J. Am. Stat. Assoc. 100, 147–157.

  16. Warriyar, K.V.V. and Sutradhar, B.C. (2014). Estimation with improved efficiency in semi-parametric linear longitudinal models. Brazilian J. Probab. Stat. 28, 561–586.

  17. Wedderburn, R.W.M. (1974). Quasi-likelihood functions, generalized linear models, and the Gauss-Newton method. Biometrika 61, 439–447.

  18. Zeger, S.L. and Diggle, P.J. (1994). Semi-parametric models for longitudinal data with application to CD4 cell numbers in HIV seroconverters. Biometrics 50, 689–699.

Acknowledgments

The authors are grateful to Bhagavan Sri Sathya Sai Baba for providing opportunities to carry out the research at the Sri Sathya Sai University. This research was supported by a grant from the Natural Sciences and Engineering Research Council of Canada. The authors would like to thank two referees and the Associate Editor for their valuable comments on the previous version.

Author information

Corresponding author

Correspondence to Brajendra C. Sutradhar.

Appendices

Appendix: An Outline for the Proof of Lemmas 3.1 to 3.5

In these lemmas we computed all possible fourth order moments for responses collected at different time points. In summary, the following moments were computed as an aid toward the computation of the final variances and covariances:

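  • a. \(E(Y_{it}-\mu_{it})^{4}\), \(t = 1,\ldots,T\)

  • b. (i) \(E[(Y_{iu}-\mu_{iu})^{3}(Y_{it}-\mu_{it})]\); (ii) \(E[(Y_{iu}-\mu_{iu})(Y_{it}-\mu_{it})^{3}]\), \(u < t\); \(u,t = 1,\ldots,T\)

  • c. \(E[(Y_{iu}-\mu_{iu})^{2}(Y_{it}-\mu_{it})^{2}]\), \(u < t\); \(u,t = 1,\ldots,T\)

  • d. (i) \(E[(Y_{iu}-\mu_{iu})^{2}(Y_{i\ell}-\mu_{i\ell})(Y_{im}-\mu_{im})]\); (ii) \(E[(Y_{iu}-\mu_{iu})(Y_{i\ell}-\mu_{i\ell})^{2}(Y_{im}-\mu_{im})]\); (iii) \(E[(Y_{iu}-\mu_{iu})(Y_{i\ell}-\mu_{i\ell})(Y_{im}-\mu_{im})^{2}]\), \(u < \ell < m\); \(u,\ell,m = 1,\ldots,T\)

  • e. \(E[(Y_{iu}-\mu_{iu})(Y_{i\ell}-\mu_{i\ell})(Y_{im}-\mu_{im})(Y_{it}-\mu_{it})]\), \(u < \ell < m < t\); \(u,\ell,m,t = 1,\ldots,T\).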

For this purpose, the deviation responses at different time points (\(u < t < \ell < m\)) were expressed as follows, as per the dynamic model (18):

$$ \begin{array}{@{}rcl@{}} y_{iu}-\mu_{iu}&=& \sigma_{\gamma} z_{i}\gamma_{i}\sum\limits_{j=0}^{u-1}\theta^{j}+\sum\limits_{j=0}^{u-1}\theta^{j}\epsilon_{i,u-j} \end{array} $$
(a.1)
$$ \begin{array}{@{}rcl@{}} y_{it}-\mu_{it}&=&\sigma_{\gamma} z_{i}\gamma_{i}\sum\limits_{j=0}^{t-1}\theta^{j}+\sum\limits_{j=0}^{u-1}\theta^{t-u+j}\epsilon_{i,u-j} +\sum\limits_{j=u}^{t-1}\theta^{j-u}\epsilon_{i,t+u-j} \end{array} $$
(a.2)
$$ \begin{array}{@{}rcl@{}} y_{i\ell}-\mu_{i\ell}&=&\sigma_{\gamma} z_{i}\gamma_{i}\sum\limits_{j=0}^{\ell-1}\theta^{j}+\sum\limits_{j=0}^{u-1}\theta^{\ell-u+j}\epsilon_{i,u-j} +\sum\limits_{j=u}^{t-1}\theta^{j+\ell-t-u}\epsilon_{i,t+u-j} \\ &+&\sum\limits_{j=t}^{\ell-1}\theta^{j-t}\epsilon_{i,\ell+t-j} \end{array} $$
(a.3)
$$ \begin{array}{@{}rcl@{}} y_{im}-\mu_{im}&=&\sigma_{\gamma} z_{i}\gamma_{i}\sum\limits_{j=0}^{m-1}\theta^{j}+\sum\limits_{j=0}^{u-1}\theta^{m-u+j}\epsilon_{i,u-j} +\sum\limits_{j=u}^{t-1}\theta^{j+m-t-u}\epsilon_{i,t+u-j} \\ &+&{\sum}^{\ell-1}_{j=t}\theta^{m-\ell+j-t}\epsilon_{i,\ell+t-j} +{\sum}^{m-1}_{j=\ell}\theta^{j-\ell}\epsilon_{i,m+\ell-j}. \end{array} $$
(a.4)
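The partitioned sums above are the single-sum form of Eq. a.1 with the error terms grouped by the intervals between the earlier time points. As a minimal numerical sketch of that bookkeeping (not the authors' code), the Python snippet below compares the two forms for arbitrary values. The function names, the variable `re_term` (standing for \(\sigma_{\gamma} z_{i}\gamma_{i}\)), and all numeric values are assumptions made for illustration only.

```python
import numpy as np

def deviation_full(theta, re_term, eps, s):
    """Single-sum form: re_term * sum_j theta^j + sum_j theta^j * eps[s - j]."""
    j = np.arange(s)
    return re_term * np.sum(theta ** j) + np.sum(theta ** j * eps[s - j])

def deviation_partitioned(theta, re_term, eps, s, cuts):
    """Same deviation, with the error sum split at the earlier time points in `cuts`."""
    total = re_term * np.sum(theta ** np.arange(s))
    bounds = [0, *cuts, s]
    for a, b in zip(bounds[:-1], bounds[1:]):
        # Errors with time index in (a, b]: j = a, ..., b-1 picks eps at time a + b - j,
        # and the power of theta is the lag between time s and that error's time.
        for j in range(a, b):
            time = a + b - j
            total += theta ** (s - time) * eps[time]
    return total

# Hypothetical values: eps[0] is a placeholder so that eps[k] is the error at time k.
rng = np.random.default_rng(0)
T = 8
eps = np.concatenate(([0.0], rng.standard_normal(T)))
theta, re_term = 0.6, 0.9
u, t, ell, m = 2, 4, 5, 8

same_t = np.isclose(deviation_full(theta, re_term, eps, t),
                    deviation_partitioned(theta, re_term, eps, t, [u]))
same_ell = np.isclose(deviation_full(theta, re_term, eps, ell),
                      deviation_partitioned(theta, re_term, eps, ell, [u, t]))
```

Splitting the error sum at \(u\), \(t\) and \(\ell\) is what allows products of deviations at different times to be grouped into shared and non-shared error terms before taking expectations.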

Next, because each deviation response above is expressed as a sum of several summations, we used the following expansions to compute the powers of a summation up to order 4. Consider, for example, the deviation response at a marginal time \(t\), and express the deviation as

$$ y_{it}-\mu_{it}=\sum\limits_{j=0}^{t-1}\theta^{j}\{\sigma_{\gamma} z_{i}\gamma_{i}+\epsilon_{i,t-j}\}=\sum\limits_{j=0}^{t-1}w_{ij}, $$
(a.5)

which is Eq. a.1 with \(u\) replaced by \(t\). The powers of this summation, up to order 4, are then computed as

$$ (y_{it}-\mu_{it})^{2}=\sum\limits_{j=0}^{t-1}w^{2}_{ij}+2\sum\limits_{j=0}^{t-2}\sum\limits_{k=j+1}^{t-1}w_{ij}w_{ik}, $$
(a.6)
$$ \begin{array}{@{}rcl@{}} (y_{it}-\mu_{it})^{3}&=&\sum\limits_{j=0}^{t-1}w^{3}_{ij}+3\sum\limits_{j=0}^{t-2}\sum\limits_{k=j+1}^{t-1} (w^{2}_{ij}w_{ik}+w_{ij}w^{2}_{ik}) \\ &+&6\sum\limits_{j=0}^{t-3}\sum\limits_{k=j+1}^{t-2} \sum\limits_{l=k+1}^{t-1}w_{ij}w_{ik}w_{il}, \end{array} $$
(a.7)

and

$$ \begin{array}{@{}rcl@{}} (y_{it}-\mu_{it})^{4}&=&\sum\limits_{j=0}^{t-1}w^{4}_{ij}+4\sum\limits_{j=0}^{t-2}\sum\limits_{k=j+1}^{t-1}(w^{3}_{ij}w_{ik}+w_{ij}w^{3}_{ik})+6 \sum\limits_{j=0}^{t-2}\sum\limits_{k=j+1}^{t-1}w^{2}_{ij}w^{2}_{ik} \\[2ex] &+&12\sum\limits_{j=0}^{t-3}\sum\limits_{k=j+1}^{t-2}\sum\limits_{l=k+1}^{t-1}[w^{2}_{ij}w_{ik}w_{il}+w_{ij}w^{2}_{ik}w_{il}+w_{ij}w_{ik}w^{2}_{il}] \\ [2ex] &+&24\sum\limits_{j=0}^{t-4}\sum\limits_{k=j+1}^{t-3}\sum\limits_{l=k+1}^{t-2} {\sum}^{t-1}_{m=l+1}w_{ij}w_{ik}w_{il}w_{im}. \end{array} $$
(a.8)
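The expansions in Eqs. a.6 to a.8 are the usual multinomial expansions written over ordered index tuples. A short numerical check, offered only as a sketch, is given below; the function name `power_expansions` and the test vector are assumptions introduced for illustration.

```python
from itertools import combinations

import numpy as np

def power_expansions(w):
    """Return the square, cube and fourth power of sum(w), built term by term
    from the ordered-index expansions (a.6), (a.7) and (a.8)."""
    w = np.asarray(w, dtype=float)
    idx = range(len(w))
    pairs = list(combinations(idx, 2))    # j < k
    triples = list(combinations(idx, 3))  # j < k < l
    quads = list(combinations(idx, 4))    # j < k < l < m

    square = np.sum(w**2) + 2 * sum(w[j] * w[k] for j, k in pairs)
    cube = (np.sum(w**3)
            + 3 * sum(w[j]**2 * w[k] + w[j] * w[k]**2 for j, k in pairs)
            + 6 * sum(w[j] * w[k] * w[l] for j, k, l in triples))
    fourth = (np.sum(w**4)
              + 4 * sum(w[j]**3 * w[k] + w[j] * w[k]**3 for j, k in pairs)
              + 6 * sum(w[j]**2 * w[k]**2 for j, k in pairs)
              + 12 * sum(w[j]**2 * w[k] * w[l] + w[j] * w[k]**2 * w[l]
                         + w[j] * w[k] * w[l]**2 for j, k, l in triples)
              + 24 * sum(w[j] * w[k] * w[l] * w[m] for j, k, l, m in quads))
    return square, cube, fourth

# Arbitrary test vector playing the role of w_{i0}, ..., w_{i,t-1}.
w = np.random.default_rng(1).standard_normal(6)
s = w.sum()
matches = np.allclose(power_expansions(w), (s**2, s**3, s**4))
```

Writing the powers over strictly ordered indices keeps each distinct product of the \(w_{ij}\) in exactly one term, which is convenient when the expectation of each product type is evaluated separately.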

Finally, the expectations over the moments/distribution of \(\epsilon_{it}\) and \(\gamma_{i}\) were computed following the semi-parametric moment conditions given by Eqs. 19–20 under the DSDM (doubly semi-parametric dynamic mixed) model.

Appendix: Asymptotic Normality of \(\hat {\beta }_{SGQL}\)

We outline the derivation of the asymptotic distribution as follows. Notice from Eq. 87 that this estimator has the closed-form expression

$$ \begin{array}{@{}rcl@{}} \hat{\beta}_{SGQL}&=&\left[ \frac{1}{K}\sum\limits_{i=1}^{K}(X_{i}-\hat{X}_{i})' {{\Sigma}^{*}_{i}}^{-1}(z_{i},\theta,\sigma^{2}_{\gamma},\sigma^{2}_{\epsilon}) (X_{i}-\hat{X}_{i})\right]^{-1} \\ &\times &\frac{1}{K} \sum\limits_{i=1}^{K}(X_{i}-\hat{X}_{i})' {{\Sigma}^{*}_{i}}^{-1}(z_{i},\theta,\sigma^{2}_{\gamma},\sigma^{2}_{\epsilon}) (Y_{i}-\hat{Y}_{i}) \\ &=&\left[\frac{1}{K}\sum\limits_{i=1}^{K}\frac{\partial f_{i}\left( \beta|y_{i}\right)}{\partial \beta^{\prime}}\right]^{-1} \left[\frac{1}{K}\sum\limits_{i=1}^{K}f_{i}(\beta|y_{i})\right] \\ &=&\left[\frac{1}{K}V^{*}_{K}(\beta,\theta,\sigma^{2}_{\gamma},\sigma^{2}_{\epsilon}) \right]^{-1}\bar{f}_{K}(\beta), \text{(say)}. \end{array} $$
(b.1)

Further notice that the \(f_{i}\)'s in Eq. b.1 are clearly independent because \(y_{1},\ldots,y_{i},\ldots,y_{K}\) are independent vectors from \(K\) individuals. However, they are not identically distributed because

$$ (Y_{i}-\hat{Y}_{i}) \sim ((X_{i}-\hat{X}_{i})\beta, {\Sigma}^{*}_{i}(z_{i},\theta,\sigma^{2}_{\gamma},\sigma^{2}_{\epsilon})), $$
(b.2)

by Eqs. 35–41. That is, the means, variances and covariances are individual dependent, i.e., they vary from individual to individual. Now, by applying Eq. b.2, it follows that \(\bar {f}_{K}(\beta )\) in Eq. b.1 has the mean-variance property given by

$$ \begin{array}{@{}rcl@{}} E[\bar{f}_{K}(\beta)]&=&\frac{1}{K}V^{*}_{K}(\beta,\theta,\sigma^{2}_{\gamma},\sigma^{2}_{\epsilon})\beta , \\ \text{cov}[\bar{f}_{K}(\beta)] &=&\frac{1}{K^{2}}V^{*}_{K}(\beta,\theta,\sigma^{2}_{\gamma},\sigma^{2}_{\epsilon}). \end{array} $$
(b.3)

Next we assume that the multivariate version of Lindeberg’s condition holds, that is,

$$ \begin{array}{@{}rcl@{}} {\lim}_{K \rightarrow \infty}{V^{*}}^{-1}_{K}\sum\limits_{i=1}^{K}\sum\limits_{\{f^{\prime}_{i}{V^{*}}^{-1}_{K}f_{i}\}> \epsilon}f_{i}f^{\prime}_{i}p^{*}(f_{i})=0 \end{array} $$
(b.4)

for all \(\epsilon > 0\), \(p^{*}(\cdot)\) being the probability distribution of \(f_{i}\). Then the Lindeberg-Feller central limit theorem [Amemiya (1985, Theorem 3.3.6), McDonald (2005, Theorem 2.2)] implies the following convergence in distribution \((\rightarrow _{d}):\)

$$ \begin{array}{@{}rcl@{}} &&Z_{K}=K[V^{*}_{K}]^{-\frac{1}{2}}\bar{f}_{K}(\beta)\rightarrow_{d} N_{p}([V^{*}_{K}]^{\frac{1}{2}}\beta,I_{p}). \end{array} $$
(b.5)

Here \(I_{p}\) is the \(p \times p\) identity matrix.
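The centering and scaling in Eq. b.5 can be read off from the mean-variance property in Eq. b.3. As a short worked step, writing \(V^{*}_{K}\) for \(V^{*}_{K}(\beta,\theta,\sigma^{2}_{\gamma},\sigma^{2}_{\epsilon})\) and assuming it is symmetric and positive definite so that a symmetric square root exists:

$$ \begin{array}{@{}rcl@{}} E[Z_{K}]&=&K[V^{*}_{K}]^{-\frac{1}{2}}E[\bar{f}_{K}(\beta)] =K[V^{*}_{K}]^{-\frac{1}{2}}\frac{1}{K}V^{*}_{K}\beta =[V^{*}_{K}]^{\frac{1}{2}}\beta, \\ \text{cov}[Z_{K}]&=&K^{2}[V^{*}_{K}]^{-\frac{1}{2}}\,\text{cov}[\bar{f}_{K}(\beta)]\,[V^{*}_{K}]^{-\frac{1}{2}} =[V^{*}_{K}]^{-\frac{1}{2}}V^{*}_{K}[V^{*}_{K}]^{-\frac{1}{2}}=I_{p}. \end{array} $$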

Next, using the notation of Eq. b.5, it follows from Eq. b.1 that

$$ \begin{array}{@{}rcl@{}} \hat{\beta}_{SGQL}&=&\left[\frac{1}{K}V^{*}_{K}(\beta,\theta,\sigma^{2}_{\gamma},\sigma^{2}_{\epsilon}) \right]^{-1}\bar{f}_{K}(\beta) \\ &=&\left[\frac{1}{K}V^{*}_{K}(\beta,\theta,\sigma^{2}_{\gamma},\sigma^{2}_{\epsilon}) \right]^{-1}\frac{1}{K}[V^{*}_{K}]^{\frac{1}{2}}Z_{K} \\ &=&[V^{*}_{K}]^{-\frac{1}{2}}Z_{K} \\ &&\rightarrow_{d} N_{p}(\beta,\left[V^{*}_{K}(\beta,\theta,\sigma^{2}_{\gamma},\sigma^{2}_{\epsilon}) \right]^{-1}), \end{array} $$
(b.6)

showing that \(\hat {\beta }_{SGQL}\) is asymptotically multivariate normal with mean \(\beta\) and covariance matrix \(\left [V^{*}_{K}(\beta ,\theta ,\sigma ^{2}_{\gamma },\sigma ^{2}_{\epsilon }) \right ]^{-1}\).
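The mean and covariance in this result can be illustrated by a small Monte Carlo experiment. The sketch below is not the authors' simulation design. It assumes that the weight matrix used in the estimator is the true covariance of the simulated responses, uses a hypothetical exchangeable-plus-identity covariance rather than the dynamic structure of the paper, replaces \(X_{i}-\hat{X}_{i}\) with arbitrary random matrices `A`, and draws skewed (centered exponential) errors to mimic a non-Gaussian setting. All names and numeric values are illustrative.

```python
import numpy as np

def simulate_gql(K=200, T=4, reps=2000, seed=0):
    """Monte Carlo draws of beta_hat = V^{-1} sum_i A_i' Sigma_i^{-1} y_i."""
    rng = np.random.default_rng(seed)
    beta = np.array([1.0, -0.5])
    p = beta.size
    A = rng.standard_normal((K, T, p))   # stand-in for X_i - X_hat_i
    z = rng.uniform(0.5, 1.5, size=K)    # individual-level covariate
    # Hypothetical individual-specific covariance: 0.8 * z_i^2 * 11' + I_T.
    Sigma = 0.8 * z[:, None, None] ** 2 * np.ones((T, T)) + np.eye(T)
    L = np.linalg.cholesky(Sigma)        # Sigma_i = L_i L_i'
    W = np.einsum('ktp,kts->kps', A, np.linalg.inv(Sigma))  # A_i' Sigma_i^{-1}
    V = np.einsum('kps,ksq->pq', W, A)   # sum_i A_i' Sigma_i^{-1} A_i
    # Skewed errors with mean 0 and variance 1 (non-Gaussian on purpose).
    e = rng.exponential(size=(reps, K, T)) - 1.0
    y = A @ beta + np.einsum('kts,rks->rkt', L, e)
    f_sum = np.einsum('kpt,rkt->rp', W, y)        # sum_i f_i, one row per replication
    beta_hat = np.linalg.solve(V, f_sum.T).T      # shape (reps, p)
    return beta, beta_hat, np.linalg.inv(V)

beta, beta_hat, V_inv = simulate_gql()
empirical_mean = beta_hat.mean(axis=0)
empirical_cov = np.cov(beta_hat, rowvar=False)
```

Under these assumptions `empirical_mean` should be close to `beta` and `empirical_cov` close to `V_inv`, even though the errors are not Gaussian; a histogram of either component of `beta_hat` can be used to inspect the normal approximation.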

Cite this article

Sutradhar, B.C., Rao, R.P. Asymptotic Inferences in a Doubly-Semi-Parametric Linear Longitudinal Mixed Model. Sankhya A (2021). https://doi.org/10.1007/s13171-020-00239-8

Keywords and phrases

  • Asymptotic properties of the estimators
  • consistency
  • doubly semi-parametric
  • dynamic dependence
  • higher order moments up to order four
  • moments and quasi-likelihood estimation.

AMS (2000) subject classification

  • Primary 62F12; Secondary 62G05, 62H20