Penalized Generalized Quasi-Likelihood Based Variable Selection for Longitudinal Data

Nadarajah, Tharshanna; Variyath, Asokan Mulayath; Loredo-Osti, J. Concepción

doi:10.1007/978-3-319-31260-6_8

Part of the book series: Lecture Notes in Statistics (LNSP, volume 218)

Abstract

High-dimensional longitudinal data with a large number of covariates have become increasingly common in many biomedical applications. Identifying a sub-model that adequately represents the data is necessary for easy interpretation, and including redundant variables may hinder the accuracy and efficiency of estimation and inference. The joint likelihood function for longitudinal data is challenging to specify, particularly for correlated discrete data. To overcome this problem, Wang et al. (Biometrics 68:353–360, 2012) introduced penalized generalized estimating equations (PGEEs) with a non-convex penalty function, which require only the first two marginal moments and a working correlation matrix. This method works reasonably well in high-dimensional problems; however, there is a risk of model mis-specification, for example of the variance function or the correlation structure. For such situations, we propose variable selection based on penalized generalized quasi-likelihood (PGQL). Simulation studies show that when the model assumptions hold, the PGQL method performs comparably to PGEEs, whereas when the model is mis-specified, the PGQL method has clear advantages over the PGEEs method. We have applied the proposed method to a real data example.
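
The abstract does not name the non-convex penalty. As an illustration only, the following minimal Python sketch assumes the SCAD penalty of Fan and Li (2001), which appears in the reference list and is a standard non-convex choice in penalized estimating-equation methods. The function names `scad_penalty` and `scad_derivative` and the default `a = 3.7` (the value suggested by Fan and Li) are assumptions of this sketch, not part of the chapter.

```python
def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty p_lam(|theta|) of Fan and Li (2001); illustrative sketch."""
    if lam < 0 or a <= 2:
        raise ValueError("require lam >= 0 and a > 2")
    t = abs(theta)
    if t <= lam:
        # Lasso-like linear region near zero
        return lam * t
    if t <= a * lam:
        # Quadratic transition region
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Constant region: large coefficients receive no further penalty
    return lam * lam * (a + 1) / 2


def scad_derivative(theta, lam, a=3.7):
    """Derivative of the SCAD penalty with respect to |theta|."""
    if lam < 0 or a <= 2:
        raise ValueError("require lam >= 0 and a > 2")
    t = abs(theta)
    if t <= lam:
        return lam
    # Decays linearly to zero at |theta| = a * lam and stays at zero beyond
    return max(a * lam - t, 0.0) / (a - 1)


if __name__ == "__main__":
    for value in (0.0, 0.5, 1.0, 2.0, 5.0):
        print(value, scad_penalty(value, lam=1.0), scad_derivative(value, lam=1.0))
```

The derivative is the quantity that would enter a penalized estimating equation: it equals `lam` for small coefficients, which shrinks them toward zero, and vanishes for coefficients larger than `a * lam`, so large effects are left essentially unpenalized.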

References

Akaike, H.: Information theory and an extension of the maximum likelihood principle. In: Petrov, B.N., Csaki, F. (eds.) Second Symposium of Information Theory, pp. 267–282. Akademiai Kiado, Budapest (1973)
Akaike, H.: A new look at the statistical model identification. IEEE Trans. Autom. Control 19, 716–723 (1974)
Antoniadis, A.: Wavelets in statistics: a review (with discussion). J. Italian Stat. Assoc. 6, 97–144 (1997)
Antoniadis, A., Fan, J.: Regularization of wavelet approximations. J. Am. Stat. Assoc. 96, 939–967 (2001)
Cantoni, E., Flemming, J.M., Ronchetti, E.: Variable selection for marginal longitudinal generalized linear models. Biometrics 61, 507–514 (2005)
Craven, P., Wahba, G.: Smoothing noisy data with spline functions: estimating the correct degree of smoothing by the method of generalized cross-validation. Numer. Math. 31, 377–403 (1979)
Crowder, M.J.: On the use of a working correlation matrix in using generalized linear models for repeated measures. Biometrika 82, 407–410 (1995)
Donoho, D.L., Johnstone, I.M.: Ideal spatial adaptation by wavelet shrinkage. Biometrika 81, 425–455 (1994)
Dziak, J.J., Li, R., Qu, A.: An overview on quadratic inference function approaches for longitudinal data. In: Fan, J., Liu, J.S., Lin, X. (eds.) Frontiers of Statistics, Volume 1: New Developments in Biostatistics and Bioinformatics, Chapter 3, pp. 49–72. World Scientific Publishing, Singapore (2009)
Fan, J.: Comments on “Wavelets in Statistics: A Review” by A. Antoniadis. J. Italian Stat. Assoc. 6, 131–138 (1997)
Fan, J., Li, R.: Variable selection via nonconcave penalized likelihood and its oracle properties. J. Am. Stat. Assoc. 96, 1348–1360 (2001)
Fan, J., Li, R.: New estimation and model selection procedures for semiparametric modeling in longitudinal data analysis. J. Am. Stat. Assoc. 99, 710–723 (2004)
Liang, K.Y., Zeger, S.L.: Longitudinal data analysis using generalized linear models. Biometrika 73, 13–22 (1986)
Lv, J., Fan, Y.: A unified approach to model selection and sparse recovery using regularized least squares. Ann. Stat. 37, 3498–3528 (2009)
McKenzie, E.: Some ARMA models for dependent sequences of Poisson counts. Adv. Appl. Probab. 20, 822–835 (1988)
Nadarajah, T.: Penalized empirical likelihood based variable selection. M.Sc. thesis, Memorial University of Newfoundland, St. John’s (2011)
Pan, W.: Akaike’s information criterion in generalized estimating equations. Biometrics 57, 120–125 (2001)
Schwarz, G.: Estimating the dimension of a model. Ann. Stat. 6, 461–464 (1978)
Sutradhar, B.C.: An overview on regression models for discrete longitudinal responses. Stat. Sci. 18, 377–393 (2003)
Sutradhar, B.C.: Dynamic Mixed Models for Familial Longitudinal Data. Springer, New York (2011)
Sutradhar, B.C., Das, K.: On the efficiency of regression estimators in generalized linear models for longitudinal data. Biometrika 86, 459–465 (1999)
Sutradhar, B.C., Kovacevic, M.: Analysing ordinal longitudinal survey data: generalized estimating equations approach. Biometrika 87, 837–848 (2000)
Tibshirani, R.: Regression shrinkage and selection via the lasso. J. R. Stat. Soc. B 58, 267–288 (1996)
Variyath, A.M.: Variable selection in generalized linear models by empirical likelihood. Ph.D. thesis, University of Waterloo, Waterloo (2006)
Variyath, A.M., Chen, J., Abraham, B.: Empirical likelihood based variable selection. J. Stat. Plan. Infer. 140, 971–981 (2010)
Wang, H., Leng, C.: Unified LASSO estimation via least squares approximation. J. Am. Stat. Assoc. 102, 1039–1048 (2007)
Wang, L., Qu, A.: Consistent model selection and data-driven smooth tests for longitudinal data in the estimating equations approach. J. R. Stat. Soc. Ser. B 71, 177–190 (2009)
Wang, L., Li, H., Huang, J.: Variable selection in nonparametric varying-coefficient models for analysis of repeated measurements. J. Am. Stat. Assoc. 103, 1556–1569 (2008)
Wang, L., Zhou, J., Qu, A.: Penalized generalized estimating equations for high-dimensional longitudinal data analysis. Biometrics 68, 353–360 (2012)
Wedderburn, R.W.M.: Quasi-likelihood functions, generalized linear models, and the Gauss–Newton method. Biometrika 61 (3), 439–444 (1974)
Xu, P., Wu, P., Wang, Y., Zhu, L.X.: A GEE based shrinkage estimation for the generalized linear model in longitudinal data analysis. Technical report, Department of Mathematics, Hong Kong Baptist University, Hong Kong (2010)
Xue, L., Qu, A., Zhou, J.: Consistent model selection for marginal generalized additive model for correlated data. J. Am. Stat. Assoc. 105, 1518–1530 (2010)
Xiao, N., Zhang, D., Zhang, H.H.: Variable selection for semiparametric mixed models in longitudinal studies. Biometrics 66, 79–88 (2009)
Zhang, H.H., Lu, W.: Adaptive Lasso for Cox’s proportional hazards model. Biometrika 94, 691–703 (2007)
Zou, H.: The adaptive Lasso and its oracle properties. J. Am. Stat. Assoc. 101 (476), 1418–1429 (2006)

Acknowledgements

The authors are grateful for the opportunity to present their work at the 2015 International Symposium in Statistics (ISS) on Advances in Parametric and Semiparametric Analysis of Multivariate, Time Series, Spatial-temporal, and Familial-longitudinal Data. Special thanks go to Professor Brajendra Sutradhar for organizing the conference, to members of the symposium audience for insightful discussion of our presentation, and to two anonymous referees for their thoughtful comments on our manuscript. The authors’ research was partially supported by grants from the Natural Sciences and Engineering Research Council of Canada and the Canadian Institutes of Health Research.

Author information

Authors and Affiliations

Memorial University, St. John’s, NL, Canada, A1C5S7
Tharshanna Nadarajah, Asokan Mulayath Variyath & J. Concepción Loredo-Osti

Corresponding author

Correspondence to Tharshanna Nadarajah.

Editor information

Editors and Affiliations

Department of Mathematics & Statistics, Memorial University of Newfoundland, St. John's, Newfoundland and Labrador, Canada
Brajendra C. Sutradhar

About this paper

Cite this paper

Nadarajah, T., Variyath, A.M., Loredo-Osti, J.C. (2016). Penalized Generalized Quasi-Likelihood Based Variable Selection for Longitudinal Data. In: Sutradhar, B. (ed.) Advances and Challenges in Parametric and Semi-parametric Analysis for Correlated Data. Lecture Notes in Statistics, vol 218. Springer, Cham. https://doi.org/10.1007/978-3-319-31260-6_8

DOI: https://doi.org/10.1007/978-3-319-31260-6_8
Published: 16 June 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-31258-3
Online ISBN: 978-3-319-31260-6
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)
