Abstract
Multivariate functional linear regression is commonly adopted to model the effects of several function-valued covariates on a scalar response. To select functional covariates with a time-varying effect, we develop a framework based on the reproducing kernel Hilbert space (RKHS). In particular, each coefficient function is assumed to reside in this RKHS and an RKHS norm is chosen as the penalty function in the regularized empirical risk function. This special penalty term enables us to achieve sparsity and smoothness when fitting multivariate functional linear models. Moreover, simulation studies demonstrate that the proposed estimator compares favorably with some traditional methods in variable selection, function estimation and prediction in finite samples. Finally, we apply the proposed framework to two real examples.
References
Ando T, Konishi S, Imoto S (2008) Nonlinear regression modeling via regularized radial basis function networks. J Stat Plan Inference 138(11):3616–3633
Bro R (1999) Exploratory study of sugar production using fluorescence spectroscopy and multi-way analysis. Chemom Intell Lab Syst 46(2):133–147
Cardot H, Ferraty F, Sarda P (2003) Spline estimators for the functional linear model. Stat Sin 13(3):571–591
Craven P, Wahba G (1976) Smoothing noisy data with spline functions: estimating the correct degree of smoothing by the method of generalized cross-validation. Numer Math 31:377–403
Dineen R, Vilisaar J, Hlinka J, Bradshaw C, Morgan P, Constantinescu C, Auer D (2009) Disconnection as a mechanism for cognitive dysfunction in multiple sclerosis. Brain 132(1):239–249
Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. J Am Stat Assoc 96(456):1348–1360
Ferraty F, Vieu P (2006) Nonparametric functional data analysis: theory and practice. Springer, New York
Friedman J, Hastie T, Tibshirani R (2010) Regularization paths for generalized linear models via coordinate descent. J Stat Softw 33(1):1–22
Gertheiss J, Maity A, Staicu AM (2013) Variable selection in generalized functional linear models. Stat 2(1):86–101
Gu C (2013) Smoothing spline ANOVA models, 2nd edn. Springer, New York
Hall P, Horowitz JL (2007) Methodology and convergence rates for functional linear regression. Ann Stat 35(1):70–91
Happ C, Greven S (2018) Multivariate functional principal component analysis for data observed on different (dimensional) domains. J Am Stat Assoc 113(522):649–659
Hastie T, Tibshirani R, Friedman J (2009) The elements of statistical learning: data mining, inference and prediction, 2nd edn. Springer, New York
Hecke WV, Nagels G, Leemans A, Vandervliet E, Sijbers J, Parizel PM (2010) Correlation of cognitive dysfunction and diffusion tensor MRI measures in patients with mild and moderate multiple sclerosis. J Magn Reson Imaging 31(6):1492–1498
Kim YJ, Gu C (2004) Smoothing spline Gaussian regression: more scalable computation via efficient approximation. J R Stat Soc Ser B (Statistical Methodology) 66(2):337–356
Kokoszka P, Reimherr M (2017) Introduction to functional data analysis. CRC, London
Kong D, Xue K, Yao F, Zhang HH (2016) Partially functional linear regression in high dimensions. Biometrika 103(1):147–159
Lin Y, Zhang HH (2006) Component selection and smoothing in multivariate nonparametric regression. Ann Stat 34(5):2272–2297
Matsui H, Konishi S (2011) Variable selection for functional regression models via the L1 regularization. Comput Stat Data Anal 55(12):3304–3310
Munck L, Nørgaard L, Engelsen SB, Bro R, Andersson C (1998) Chemometrics in food science-a demonstration of the feasibility of a highly exploratory, inductive evaluation strategy of fundamental scientific significance. Chemom Intell Lab Syst 44(1–2):31–60
Oreja-Guevara C, Ayuso T, Brieva L, Hernández MÁ, Meca-Lallana V, Ramió-Torrentà L (2019) Cognitive dysfunctions and assessments in multiple sclerosis. Front Neurol 10:581
Ramsay JO, Silverman BW (2005) Functional data analysis, 2nd edn. Springer, New York
Reiss PT, Goldsmith J, Shang HL, Ogden RT (2017) Methods for scalar-on-function regression. Int Stat Rev 85(2):228–249
Sun X, Du P, Wang X, Ma P (2018) Optimal penalized function-on-function regression under a reproducing kernel Hilbert space framework. J Am Stat Assoc 113(524):1601–1611
Wahba G (1990) Spline models for observational data. Society for Industrial and Applied Mathematics, Philadelphia
Yuan M, Cai TT (2010) A reproducing kernel Hilbert space approach to functional linear regression. Ann Stat 38(6):3412–3444
Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. J R Stat Soc Ser B (Statistical Methodology) 68(1):49–67
Appendix
Appendix A.1. Proof of Proposition 1
Proof
Let \(\xi _{ij}(t) = (K_1X_{ij})(t)\) for \(t \in [0, 1]\), and let \(\mathop {}\!\mathcal {L}(\varvec{\xi }_j)\) denote the linear space spanned by \(\xi _{ij}, i = 1, \ldots , n\). By [26], \(\xi _{ij} \in \mathop {}\!\mathcal {H}_1\), and hence \(\mathop {}\!\mathcal {L}(\varvec{\xi }_j)\) is a subspace of \(\mathop {}\!\mathcal {H}_1\). Each \(\beta _j\) can be written as
$$\beta _j(t) = d_j + \sum _{i = 1}^n c_{ij}\xi _{ij}(t) + \rho _j(t)$$
for some coefficients \(d_j\) and \(c_{ij}\) and some \(\rho _j \in \mathop {}\!\mathcal {H}_1 \ominus \mathop {}\!\mathcal {L}(\varvec{\xi }_j)\), where \(\mathop {}\!\mathcal {H}_1 \ominus \mathop {}\!\mathcal {L}(\varvec{\xi }_j)\) denotes the orthogonal complement of \(\mathop {}\!\mathcal {L}(\varvec{\xi }_j)\) in \(\mathop {}\!\mathcal {H}_1\).
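Next, we plug the expression of \(\beta _j\) into (4). Let \(u_i = Y_i - \sum _{j = 1}^p \int _{\mathop {}\!\mathcal {I}_j} d_jX_{ij}(t) \textrm{d}t\) for \(i = 1, \ldots , n\), and let \(\bar{\beta }_j = \sum _{i = 1}^n c_{ij}\xi _{ij} + \rho _j\). Then the \(\hat{\bar{\beta }}_j\)’s minimize the penalized criterion (4) rewritten in terms of \(u_i\) and \(\bar{\beta }_j\).
The loss part of this criterion involves \(\bar{\beta }_j\) only through the integrals \(\int _{\mathop {}\!\mathcal {I}_j} \bar{\beta }_j(t)X_{ij}(t) \textrm{d}t\). By the reproducing property of \(K_1\), \(\int _{\mathop {}\!\mathcal {I}_j} \rho _j(t)X_{ij}(t) \textrm{d}t = \langle \rho _j, \xi _{ij}\rangle _{\mathop {}\!\mathcal {H}_1} = 0\), so the loss does not depend on \(\rho _j\). Moreover, since \(\Vert \bar{\beta }_j\Vert _{\mathop {}\!\mathcal {H}_1}^2 = \Vert \sum _{i = 1}^n c_{ij}\xi _{ij}\Vert _{\mathop {}\!\mathcal {H}_1}^2 + \Vert \rho _j\Vert _{\mathop {}\!\mathcal {H}_1}^2\), the penalty term is minimized when \(\rho _j = 0\). The proof is completed. □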
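The orthogonal-decomposition argument in this proof can be checked numerically on a toy example. The sketch below is only an illustration under stated assumptions: it uses the kernel \(K(s,t) = \min (s,t)\) on \([0, 1]\) as a stand-in for \(K_1\) (the paper's kernel is not given in this section), a single functional covariate with three hypothetical curves, and a hypothetical coefficient function \(\beta (t) = t\). For this assumed kernel the RKHS inner product is \(\langle f, g\rangle = \int f'(t)g'(t)\textrm{d}t\) with \(f(0) = 0\), so the norms can be evaluated by finite differences.

```python
import numpy as np

# Grid on [0, 1] with trapezoidal quadrature weights.
T = 501
t = np.linspace(0.0, 1.0, T)
h = t[1] - t[0]
w = np.full(T, h)
w[0] = w[-1] = h / 2.0

# ASSUMPTION: illustrative kernel K(s, t) = min(s, t), NOT the paper's K_1.
# Its RKHS contains f with f(0) = 0 and has <f, g> = int f'(t) g'(t) dt.
K = np.minimum.outer(t, t)

# Hypothetical covariate curves X_i, i = 1, 2, 3 (a single covariate j).
X = np.vstack([np.ones(T), np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)])

# xi_i(t) = (K X_i)(t) = int K(s, t) X_i(s) ds, one row per curve.
xi = (X * w) @ K

# Gram matrix G[i, k] = <xi_i, xi_k> = int xi_i(t) X_k(t) dt.
G = (xi * w) @ X.T

# Hypothetical coefficient function in the assumed RKHS (beta(0) = 0).
beta = t.copy()

# Project beta onto span{xi_i}: solve G c = b with b_i = int beta X_i.
b = (X * w) @ beta
c = np.linalg.solve(G, b)
proj = c @ xi          # sum_i c_i xi_i
rho = beta - proj      # component orthogonal to span{xi_i}

# The linear predictors int beta X_i are unchanged when rho is dropped.
fit_beta = (X * w) @ beta
fit_proj = (X * w) @ proj
rho_fit = (X * w) @ rho


def h_norm_sq(f):
    """Squared RKHS norm int f'(t)^2 dt for the assumed min kernel."""
    df = np.gradient(f, t, edge_order=2)
    return float(w @ df**2)


beta_norm_sq = h_norm_sq(beta)
proj_norm_sq = float(c @ G @ c)   # ||sum_i c_i xi_i||^2 = c' G c
rho_norm_sq = h_norm_sq(rho)

print("max |int rho X_i| :", np.max(np.abs(rho_fit)))
print("||beta||^2        :", beta_norm_sq)
print("||proj||^2 + ||rho||^2 :", proj_norm_sq + rho_norm_sq)
```

In this toy setting the integrals \(\int \rho X_i\) vanish up to rounding error, and the squared norm of the projection is strictly smaller than that of \(\beta \). This mirrors the two facts used in the proof: dropping \(\rho _j\) leaves the fitted values unchanged and can only reduce the penalty.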
About this article
Cite this article
Yeh, CK., Sang, P. Variable Selection in Multivariate Functional Linear Regression. Stat Biosci (2023). https://doi.org/10.1007/s12561-023-09373-x