Abstract
In this paper, we consider the variable selection problem in functional linear regression with interactions. Our goal is to identify the relevant main effects and the corresponding interactions associated with the response variable. Heredity is a natural assumption in many statistical models involving two-way or higher-order interactions. Motivated by this, we propose an adaptive group Lasso method for the multiple functional linear model that selects important single functional predictors and pairwise interactions while obeying the strong heredity constraint. The proposed method is based on functional principal component analysis with two adaptive group penalties, one for main effects and one for interaction effects. With appropriate selection of the tuning parameters, we establish the rates of convergence of the proposed estimators and the consistency of the variable selection procedure. Simulation studies demonstrate the performance of the proposed procedure, and a real data example is analyzed to illustrate its practical usage.
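For orientation, the setting summarized in the abstract can be written out as a short sketch. The notation is assumed for illustration only: the symbols \(Y\), \(X_j\), \(\beta_j\), \(\gamma_{jk}\), \(p\), and the domain \(\mathcal{T}\) do not appear in the abstract, and the paper's own formulation may differ.

```latex
% Assumed notation (not taken from the abstract): scalar response Y and
% functional predictors X_1, ..., X_p observed on a common domain \mathcal{T}.
Y = \alpha
  + \sum_{j=1}^{p} \int_{\mathcal{T}} \beta_j(t)\, X_j(t)\, dt
  + \sum_{1 \le j < k \le p} \int_{\mathcal{T}} \int_{\mathcal{T}}
      \gamma_{jk}(s,t)\, X_j(s)\, X_k(t)\, ds\, dt
  + \varepsilon .

% Strong heredity: an interaction can be active only if both of its
% parent main effects are active.
\gamma_{jk} \not\equiv 0
  \;\Longrightarrow\;
  \beta_j \not\equiv 0 \ \text{and}\ \beta_k \not\equiv 0 .
```

Under this reading, a selected model that contains the interaction between \(X_1\) and \(X_2\) must also contain both main effects \(\beta_1\) and \(\beta_2\). Weak heredity, by contrast, would require only one of the two.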
Acknowledgements
The authors sincerely thank the Editor, Associate Editor, and two anonymous reviewers for their insightful comments and suggestions. Sanying Feng’s research was supported by the National Statistical Science Research Project of China (No. 2019LY18), the Natural Science Foundation of Henan Province, China (No. 212300410412), the Foundation of Henan Educational Committee, China (No. 21A910004), and the Excellent Youth Foundation of Zhengzhou University, China (No. 32210452). Tiejun Tong’s research was supported by the General Research Fund (No. HKBU12303918), the National Natural Science Foundation of China (No. 1207010822), and the Initiation Grant for Faculty Niche Research Areas (No. RC-IG-FNRA/17-18/13) of Hong Kong Baptist University.
About this article
Cite this article
Feng, S., Zhang, M. & Tong, T. Variable selection for functional linear models with strong heredity constraint. Ann Inst Stat Math 74, 321–339 (2022). https://doi.org/10.1007/s10463-021-00798-z
DOI: https://doi.org/10.1007/s10463-021-00798-z