Abstract
Multivariate data are collected in many fields, such as chemometrics, econometrics, financial engineering and genetics, and they frequently exhibit heteroscedasticity and collinearity. Selecting the relevant predictors is also a key issue when analyzing multivariate data. To accomplish these tasks, a multivariate linear regression model is often constructed. In this paper, we therefore propose a row-sparse elastic-net regularized multivariate Huber regression model. For this new model, we prove its grouping effect property and its robustness to sample outliers. Based on the KKT conditions, an accelerated proximal subgradient algorithm is designed to solve the proposed model, and its convergence is also established. To demonstrate accuracy and efficiency, simulation and real-data experiments are carried out. The numerical results show that the new model handles heteroscedasticity and collinearity well.
References
Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1), 183–202 (2009)
Breiman, L., Friedman, J.H.: Predicting multivariate responses in multiple linear regression. J. R. Stat. Soc. Ser. B Stat. Methodol. 59, 3–54 (1997)
Chen, B.Z., Kong, L.C.: High-dimensional least square matrix regression via elastic net penalty. Pac. J. Optim. 13(2), 185–196 (2017)
Chen, B.Z., Zhai, W.J., Huang, Z.Y.: Low-rank elastic-net regularized multivariate Huber regression model. Appl. Math. Model. 87, 571–583 (2020)
Das, J., Gayvert, K., Bunea, F., Wegkamp, M., Yu, H.: Encapp: elastic-net-based prognosis prediction and biomarker discovery for human cancers. BMC Genom. 16, 1–13 (2015)
Hastie, T., Tibshirani, R., et al.: ‘Gene shaving’ as a method for identifying distinct sets of genes with similar expression patterns. Genome Biol. 1, 1–21 (2000)
Huber, P.: Robust estimation of a location parameter. Ann. Math. Stat. 35, 73–101 (1964)
Huber, P.: Robust Statistics. Wiley, New York (1981)
Koenker, R., Bassett, G.: Regression quantiles. Econometrica 46, 33–50 (1978)
Mukherjee, A., Zhu, J.: Reduced rank ridge regression and its kernel extensions. Stat. Anal. Data Min. ASA Data Sci J. 4, 612–622 (2011)
Negahban, S., Wainwright, M.: Simultaneous support recovery in high dimensions: benefits and perils of block \(l_1/l_{\infty }\)-regularization. IEEE Trans. Inform. Theory 57, 3841–3863 (2011)
Obozinski, G., Wainwright, M., Jordan, M.: Support union recovery in high-dimensional multivariate regression. Ann. Stat. 39(1), 1–47 (2011)
Rodolà, E., Torsello, A., Harada, T., Kuniyoshi, Y., Cremers, D.: Elastic net constraints for shape matching. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1169–1176 (2013)
Similä, T., Tikka, J.: Input selection and shrinkage in multiresponse linear regression. Comput. Stat. Data Anal. 52, 406–422 (2007)
Skagerberg, S., MacGregor, J.F., Kiparissides, C.: Multivariate data analysis applied to low-density polyethylene reactors. Chemom. Intell. Lab. Syst. 14, 341–356 (1992)
Stransky, N.: The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature 483(7391), 603–607 (2012)
Toh, K., Yun, S.: An accelerated proximal gradient algorithm for nuclear norm regularized least squares problems. Pac. J. Optim. 6, 615–640 (2010)
Tropp, J.A.: Algorithms for simultaneous sparse approximation. Part II: Convex relaxation. Signal Process. 86, 589–602 (2006)
Turlach, B., Venables, W., Wright, S.: Simultaneous variable selection. Technometrics 47, 350–363 (2005)
Xin, X., Hu, J., Liu, L.: On the oracle property of a generalized adaptive elastic-net for multivariate linear regression with a diverging number of parameters. J. Multivar. Anal. 162, 16–31 (2017)
Yi, C., Huang, J.: Semismooth Newton coordinate descent algorithm for elastic-net penalized Huber loss regression and quantile regression. J. Comput. Graph. Stat. 26, 547–557 (2017)
Zou, H., Hastie, T.: Regularization and variable selection via the elastic net. J. R. Stat. Soc. Ser. B Stat. Methodol. 67, 301–320 (2005)
Zou, H., Zhang, H.: On the adaptive elastic-net with a diverging number of parameters. Ann. Statist. 37, 1733–1751 (2009)
Acknowledgements
The authors are very grateful to two anonymous reviewers and associate editor for their insightful remarks and comments which considerably improved the presentation of our paper.
This work was supported by the Key Program of Cangzhou Jiaotong College (HB202001002) and the National Natural Science Foundation of China (12071022).
Appendix
1.1 Proof of Theorem 2
As shown in Chen et al. (2020), the derivative of \(H_\alpha ^n(B)\) is
where \(\varPsi (B)=\left( \psi ^\mathrm{T}\left( {{\varvec{y}}}_1-B^\mathrm{T}{{\varvec{x}}}_1\right) , \cdots , \psi ^\mathrm{T}\left( {{\varvec{y}}}_n-B^\mathrm{T}{{\varvec{x}}}_n\right) \right) ^\mathrm{T}\),
and \(\varPsi (B)\) has the following upper bound
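Since each component of the Huber score \(\psi\) is bounded by \(\alpha\) in absolute value, a bound of the following form holds (assuming \(q\) response variables; this display is a sketch reconstructed from that boundedness, not copied from the original):
\[
\Vert \varPsi (B)\Vert _\mathrm{F}=\Big (\sum _{i=1}^{n}\big \Vert \psi \big ({{\varvec{y}}}_i-B^\mathrm{T}{{\varvec{x}}}_i\big )\big \Vert _2^2\Big )^{1/2}\le \alpha \sqrt{nq}.
\]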
Let \(L(\lambda _1,\lambda _2, B) = \tfrac{1}{n}\sum \nolimits _{i=1}^{n} h_\alpha \left( {{\varvec{y}}}_i-B^\mathrm{T}{{\varvec{x}}}_i\right) +\lambda _1\sum \nolimits _{k=1}^{m}\Vert {{\varvec{b}}}_k\Vert _2+\tfrac{\lambda _2}{2}\sum \nolimits _{k=1}^{m}\Vert {{\varvec{b}}}_k\Vert _2^2\). Then, the Karush–Kuhn–Tucker condition of optimization problem (2) is
where \(S=\left( {{\varvec{s}}}_1,\cdots ,{{\varvec{s}}}_m\right) ^\mathrm{T}\) and \({{\varvec{s}}}_k\) satisfies
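Here \({{\varvec{s}}}_k\) belongs to the subdifferential of the \(\ell _2\)-norm at \(\hat{{{\varvec{b}}}}_k\); in the standard form,
\[
{{\varvec{s}}}_k=\frac{\hat{{{\varvec{b}}}}_k}{\Vert \hat{{{\varvec{b}}}}_k\Vert _2}\quad \text{if } \hat{{{\varvec{b}}}}_k\ne 0,\qquad \Vert {{\varvec{s}}}_k\Vert _2\le 1\quad \text{if } \hat{{{\varvec{b}}}}_k=0.
\]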
Let \({{\varvec{e}}}^{(k)}= (0,\cdots , 0,1, 0,\cdots , 0)^\mathrm{T}\in \mathbb {R}^m\), where “1” is the kth component of \({{\varvec{e}}}^{(k)}\). Multiplying both sides of equation (11) by \({{\varvec{e}}}^{(k)}\), it follows that
i.e.,
To derive the upper bound for \(\Vert \hat{{{\varvec{b}}}}_i-\hat{{{\varvec{b}}}}_j\Vert _2\), we consider the case \(\hat{{{\varvec{b}}}}_i\ne 0\) and \(\hat{{{\varvec{b}}}}_j\ne 0\). By letting \(k=i\) and \(k=j\) in (12), we obtain
Then, we have
It follows that
On the one hand,
On the other hand,
where \(\theta\) is the angle between \(\hat{{{\varvec{b}}}}_i\) and \(\hat{{{\varvec{b}}}}_j\). Thus,
Combining (13), (14) and (15), it is easy to obtain the desired result. \(\square\)
1.2 Proof of Corollary 1
If \(\dot{{{\varvec{x}}}}_{i}=\dot{{{\varvec{x}}}}_{j}\), then the sample correlation coefficient \(\rho =\tfrac{1}{n}\dot{{{\varvec{x}}}}^\mathrm{T}_{i} \dot{{{\varvec{x}}}}_{j}=1\). Considering the upper bound (5), we have \(\Vert \hat{{{\varvec{b}}}}_i -\hat{{{\varvec{b}}}}_j\Vert _2\le 0\). It follows that \(\hat{{{\varvec{b}}}}_i=\hat{{{\varvec{b}}}}_j\). \(\square\)
1.3 Proof of Theorem 3
For the optimization problem (6), the Karush–Kuhn–Tucker conditions are
where \({{\varvec{g}}}_k^{\text {T}}\) is the kth row of G and
If \({{\varvec{b}}}_k=0\), equality (16) becomes
It follows that
Considering the second inequality in (17), we use
to determine \({{\varvec{b}}}_k=0\).
If \({{\varvec{b}}}_k\ne 0\), (16) is in the following form
It is equivalent to
Taking the \(\ell _2\)-norm on both sides, we obtain that
Inserting this expression into (19), we obtain
Combining (18) and (20), the desired result (7) can be obtained.\({\square }\)
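The row-wise update above admits a closed-form proximal map. The following is an illustrative sketch, not the paper's code: the function name and the closed form \({{\varvec{b}}}_k=\max \{1-t\lambda _1/\Vert {{\varvec{g}}}_k\Vert _2,\,0\}\,{{\varvec{g}}}_k/(1+t\lambda _2)\) are our assumptions for the standard row elastic-net penalty \(\lambda _1\Vert {{\varvec{b}}}_k\Vert _2+\tfrac{\lambda _2}{2}\Vert {{\varvec{b}}}_k\Vert _2^2\) with step size \(t\).

```python
import numpy as np

def row_enet_prox(G, t, lam1, lam2):
    """Row-wise proximal map of t*(lam1*sum_k ||b_k||_2 + (lam2/2)*sum_k ||b_k||_2^2).

    Rows g_k with ||g_k||_2 <= t*lam1 are set exactly to zero (row sparsity);
    surviving rows are shrunk by the factor (1 - t*lam1/||g_k||_2) / (1 + t*lam2).
    """
    B = np.zeros_like(G, dtype=float)
    norms = np.linalg.norm(G, axis=1)
    keep = norms > t * lam1
    scale = (1.0 - t * lam1 / norms[keep]) / (1.0 + t * lam2)
    B[keep] = scale[:, None] * G[keep]
    return B

# A row with small norm is zeroed; a large row is shrunk toward zero.
G = np.array([[3.0, 4.0],    # ||row||_2 = 5    -> kept, scaled by (1 - 1/5)/2 = 0.4
              [0.1, 0.1]])   # ||row||_2 ~ 0.14 -> zeroed
B = row_enet_prox(G, t=1.0, lam1=1.0, lam2=1.0)
# B = [[1.2, 1.6], [0.0, 0.0]]
```

The hard zero in surviving/discarded rows is exactly the screening rule used when \({{\varvec{b}}}_k=0\): the penalty's subgradient can absorb any row of the smooth gradient whose norm is below the threshold.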
1.4 Proof of Theorem 4
Following the procedure in Beck and Teboulle (2009) or Toh and Yun (2010), inequality (8) can be easily obtained.
Considering the triangle inequality \(\Vert \hat{B}-B^0\Vert _\mathrm{F}\le \Vert \hat{B}\Vert _\mathrm{F}+\Vert B^0\Vert _\mathrm{F}\) and (8), we have
Note that \(\hat{B}\) is the solution to (2). It follows that
It is easy to obtain
Combining (22) and (23), we can obtain the following upper bound of \(\Vert \hat{B}\Vert _\mathrm{F}\)
Inserting this inequality to (21), it follows that
where \(C=\min \left\{ \Vert Y\Vert _\mathrm{F}^{2}/(2n\lambda _1),~\Vert Y\Vert _\mathrm{F}\sqrt{1/(n\lambda _2)}\right\}\). In order for \(B^k\) to be an \(\epsilon\)-optimal solution, i.e., \(F(B^k)- F(\hat{B})\le \epsilon\), we only need to terminate the algorithm when
It follows that
\({\square }\)
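Concretely, the FISTA-type rate \(F(B^k)-F(\hat{B})\le 2L\Vert B^0-\hat{B}\Vert _\mathrm{F}^2/(k+1)^2\) combined with \(\Vert B^0-\hat{B}\Vert _\mathrm{F}\le \Vert B^0\Vert _\mathrm{F}+C\) suggests a stopping rule of the form (a sketch consistent with the bounds above, with \(L\) the Lipschitz constant of the smooth part):
\[
\frac{2L\big (\Vert B^0\Vert _\mathrm{F}+C\big )^2}{(k+1)^2}\le \epsilon
\quad \Longrightarrow \quad
k\ge \sqrt{\frac{2L}{\epsilon }}\big (\Vert B^0\Vert _\mathrm{F}+C\big )-1.
\]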
Chen, B., Zhai, W. & Kong, L. Variable selection and collinearity processing for multivariate data via row-elastic-net regularization. AStA Adv Stat Anal 106, 79–96 (2022). https://doi.org/10.1007/s10182-021-00403-x