A sparse linear regression model for incomplete datasets

  • Theoretical advances
  • Pattern Analysis and Applications

Abstract

Incomplete data are often neglected when designing machine learning methods. A popular strategy among practitioners is to add a preprocessing step that fills in the missing components. Because these preprocessing algorithms are designed independently of the machine learning method applied afterwards, the results may be sub-optimal. An alternative is to redesign classical machine learning methods so that they handle missing data directly. In this paper, we propose a variant of the forward stagewise regression (FSR) algorithm for incomplete data. The original FSR is an iterative procedure for estimating the parameters of sparse linear models. The proposed method, named forward stagewise regression for incomplete datasets with GMM (FSIG), models the missing components as random variables following a Gaussian mixture distribution. In FSIG, the main steps of FSR are adapted to deal with the intrinsic uncertainty of incomplete samples. The performance of FSIG was evaluated in an extensive set of experiments, and our model outperformed classical methods in most of the tested cases.
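For readers unfamiliar with FSR, the following is a minimal sketch of the classical incremental forward stagewise procedure on complete data. It is not the FSIG method proposed in the paper, which additionally handles missing components through a Gaussian mixture model. The function name `forward_stagewise`, its parameters (`step`, `n_iter`, `tol`), and the toy data are assumptions made for illustration, and the columns of `X` and the response `y` are assumed to be centered beforehand.

```python
def forward_stagewise(X, y, step=0.01, n_iter=2000, tol=1e-8):
    """Incremental forward stagewise regression (illustrative sketch).

    X: list of rows, each a list of floats (columns assumed centered).
    y: list of floats (assumed centered).
    Returns the list of estimated coefficients.
    """
    n, d = len(X), len(X[0])
    beta = [0.0] * d
    residual = list(y)
    for _ in range(n_iter):
        # Inner product of every feature column with the current residual.
        corr = [sum(X[i][j] * residual[i] for i in range(n)) for j in range(d)]
        # Pick the feature most correlated (in absolute value) with the residual.
        j = max(range(d), key=lambda k: abs(corr[k]))
        if abs(corr[j]) < tol:
            break  # no feature explains the residual any further
        # Move the chosen coefficient by a small step in the direction of the correlation.
        delta = step if corr[j] > 0 else -step
        beta[j] += delta
        # Update the residual to reflect the new coefficient.
        for i in range(n):
            residual[i] -= delta * X[i][j]
    return beta


# Toy data (hypothetical): the response depends only on the first feature.
X_toy = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y_toy = [2.0, -2.0, 2.0, -2.0]
beta_toy = forward_stagewise(X_toy, y_toy)
print(beta_toy)
```

Because each iteration changes a single coefficient by a small fixed amount, coefficients of irrelevant features tend to stay at zero, which is the source of the sparsity mentioned above. A smaller `step` gives a finer coefficient path at the cost of more iterations.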

Acknowledgements

The authors would like to thank the Brazilian National Council for Scientific and Technological Development (CNPq) for financial support (Grant 302289/2019-4).

Author information

Corresponding author

Correspondence to João P. P. Gomes.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Veras, M.B.A., Mesquita, D.P.P., Mattos, C.L.C. et al. A sparse linear regression model for incomplete datasets. Pattern Anal Applic 23, 1293–1303 (2020). https://doi.org/10.1007/s10044-019-00859-3

  • DOI: https://doi.org/10.1007/s10044-019-00859-3
