Abstract
Univariate and multivariate lasso estimation methods are highly sensitive to outlying observations because of the sum of squared norms term in the objective function. Using a sum of norms (least absolute deviations, LAD) instead of a sum of squared norms gives a considerably more robust estimate of the regression coefficients. In this paper we combine LAD with the multivariate lasso method and illustrate its estimation using simulated data sets similar to those typically seen in association genetics. We also briefly consider how significance testing is done for non-zero coefficients and how the tuning parameter value can be determined.
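The contrast between the two loss terms can be illustrated with a minimal sketch. It is not the chapter's implementation: the function names, the 1/n scaling, and the row-wise (group) form of the lasso penalty are assumptions made for illustration, and the toy data are invented. It only evaluates the two objective functions at a fixed coefficient matrix to show how a single outlying observation enters each of them.

```python
import numpy as np

def row_norms(M):
    """Euclidean norm of each row of M."""
    return np.sqrt((M ** 2).sum(axis=1))

def group_penalty(B, lam):
    """Assumed penalty: lam times the sum of row norms of B (one row per predictor)."""
    return lam * row_norms(B).sum()

def ls_lasso_objective(B, X, Y, lam):
    """Mean of squared residual norms plus the penalty (least squares type loss)."""
    R = Y - X @ B
    return (row_norms(R) ** 2).mean() + group_penalty(B, lam)

def lad_lasso_objective(B, X, Y, lam):
    """Mean of residual norms plus the penalty (LAD type loss)."""
    R = Y - X @ B
    return row_norms(R).mean() + group_penalty(B, lam)

# Toy data: 4 observations, 2 predictors, 2 responses, fitted exactly by B.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]])
B = np.array([[1.0, 0.0], [0.0, 0.0]])
Y_clean = X @ B

# Contaminate one observation; its residual norm becomes 50.
Y_outlier = Y_clean.copy()
Y_outlier[0] += np.array([30.0, 40.0])

lam = 0.5
ls_clean = ls_lasso_objective(B, X, Y_clean, lam)
lad_clean = lad_lasso_objective(B, X, Y_clean, lam)
ls_out = ls_lasso_objective(B, X, Y_outlier, lam)
lad_out = lad_lasso_objective(B, X, Y_outlier, lam)

print(ls_clean, lad_clean)  # both equal the penalty alone on clean data
print(ls_out, lad_out)      # the outlier enters squared in LS, linearly in LAD
```

In this toy case the outlier adds 50²/4 to the squared-norm objective but only 50/4 to the sum-of-norms objective, which is the sense in which the LAD-type loss limits the influence of a single observation.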
Copyright information
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Möttönen, J., Sillanpää, M.J. (2015). Robust Variable Selection and Coefficient Estimation in Multivariate Multiple Regression Using LAD-Lasso. In: Nordhausen, K., Taskinen, S. (eds) Modern Nonparametric, Robust and Multivariate Methods. Springer, Cham. https://doi.org/10.1007/978-3-319-22404-6_14
DOI: https://doi.org/10.1007/978-3-319-22404-6_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-22403-9
Online ISBN: 978-3-319-22404-6
eBook Packages: Mathematics and Statistics (R0)