Robust Variable Selection and Coefficient Estimation in Multivariate Multiple Regression Using LAD-Lasso

Chapter in: Modern Nonparametric, Robust and Multivariate Methods

Abstract

Univariate and multivariate lasso estimation methods are highly sensitive to outlying observations because of the sum of squared norms term in the objective function. Replacing the sum of squared norms with the sum of norms (least absolute deviations, LAD) gives a considerably more robust estimate of the regression coefficients. In this paper we combine LAD with the multivariate lasso method and illustrate its estimation using simulated data sets similar to those typically seen in association genetics. We also briefly consider how significance testing is done for non-zero coefficients and how the tuning parameter value can be determined.
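The contrast between the two loss terms can be illustrated with a minimal sketch. The abstract does not state the exact objective functions, so the forms below are assumptions: the residual term is averaged over observations, and the penalty is taken to be the sum of Euclidean norms of the rows of the coefficient matrix. The function names, the simulated data, and the value of the tuning parameter `lam` are all hypothetical and chosen only for illustration.

```python
import numpy as np

def row_norms(M):
    """Euclidean norm of each row of a 2-D array."""
    return np.sqrt((M ** 2).sum(axis=1))

def lasso_objective(Y, X, B, lam):
    """Assumed multivariate lasso criterion: mean of squared residual norms
    plus lam times the sum of coefficient-row norms."""
    R = Y - X @ B
    return (row_norms(R) ** 2).mean() + lam * row_norms(B).sum()

def lad_lasso_objective(Y, X, B, lam):
    """Assumed multivariate LAD-lasso criterion: mean of (unsquared) residual
    norms plus the same penalty."""
    R = Y - X @ B
    return row_norms(R).mean() + lam * row_norms(B).sum()

# Hypothetical simulated data: n observations, p predictors, q responses,
# with only the first predictor having a non-zero effect.
rng = np.random.default_rng(0)
n, p, q = 50, 5, 2
X = rng.normal(size=(n, p))
B_true = np.zeros((p, q))
B_true[0] = [1.0, -1.0]
Y = X @ B_true + 0.1 * rng.normal(size=(n, q))

# Contaminate a single observation with a gross outlier.
Y_out = Y.copy()
Y_out[0] += 100.0

lam = 0.1  # illustrative tuning parameter value, not taken from the chapter
ls_change = lasso_objective(Y_out, X, B_true, lam) - lasso_objective(Y, X, B_true, lam)
lad_change = lad_lasso_objective(Y_out, X, B_true, lam) - lad_lasso_objective(Y, X, B_true, lam)

print(f"change in squared-norm criterion: {ls_change:.2f}")
print(f"change in LAD criterion:          {lad_change:.2f}")
```

Under these assumptions, the outlier's contribution to the squared-norm criterion grows with the square of its size, while its contribution to the LAD criterion grows only linearly. This is the mechanism behind the robustness claim; the sketch evaluates the criteria at a fixed coefficient matrix and does not perform the minimization itself.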

Author information

Corresponding author

Correspondence to Jyrki Möttönen.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Möttönen, J., Sillanpää, M.J. (2015). Robust Variable Selection and Coefficient Estimation in Multivariate Multiple Regression Using LAD-Lasso. In: Nordhausen, K., Taskinen, S. (eds) Modern Nonparametric, Robust and Multivariate Methods. Springer, Cham. https://doi.org/10.1007/978-3-319-22404-6_14