Interpretable sparse SIR for functional data

  • Victor Picheny
  • Rémi Servien
  • Nathalie Villa-Vialaneix
Article

Abstract

We propose a semiparametric framework based on sliced inverse regression (SIR) to address variable selection in functional regression. SIR is an effective dimension-reduction method that computes a linear projection of the predictors into a low-dimensional space without loss of information on the regression. To deal with the high dimensionality of the predictors, we consider two penalized versions of SIR: ridge and sparse. We extend the variable-selection approaches developed for multidimensional SIR to select intervals that form a partition of the definition domain of the functional predictors. Selecting entire intervals rather than isolated evaluation points improves the interpretability of the estimated coefficients in the functional framework. A fully automated iterative procedure is proposed to find the critical (interpretable) intervals. The approach proves efficient on simulated and real data. The method is implemented in the R package SISIR, available on CRAN at https://cran.r-project.org/package=SISIR.
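The penalized and interval-selection machinery described above is implemented in the SISIR package. Purely as an illustration of the underlying SIR step on multivariate (non-functional) data, here is a minimal NumPy sketch of vanilla, unpenalized SIR; the function name and the toy single-index model are our own illustrative choices, not part of the package.

```python
import numpy as np

def sir_directions(X, y, n_slices=10, n_components=2):
    """Estimate effective-dimension-reduction (EDR) directions with vanilla SIR."""
    n, p = X.shape
    # Standardize the predictors: Z = (X - mean) @ Sigma^{-1/2}
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)
    inv_sqrt = vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T
    Z = Xc @ inv_sqrt
    # Slice the response into roughly equal-count bins
    order = np.argsort(y)
    slices = np.array_split(order, n_slices)
    # Weighted covariance of the within-slice means of Z
    M = np.zeros((p, p))
    for idx in slices:
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)
    # Leading eigenvectors of M, mapped back to the original predictor scale
    evals, evecs = np.linalg.eigh(M)
    top = evecs[:, ::-1][:, :n_components]
    return inv_sqrt @ top  # columns span the estimated EDR space

# Toy single-index model: y depends on X only through one projection
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
beta = np.array([1.0, -1.0, 0.0, 0.0, 0.0])
y = np.tanh(X @ beta) + 0.1 * rng.normal(size=500)
b = sir_directions(X, y, n_slices=10, n_components=1)[:, 0]
```

The functional and sparse variants studied in the paper replace this unregularized eigendecomposition with ridge- and lasso-type penalties and operate on discretized curves rather than a small design matrix.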

Keywords

Functional regression · SIR · Lasso · Ridge regression · Interval selection

Notes

Acknowledgements

The authors thank the two anonymous referees for relevant remarks and constructive comments on a previous version of the paper.

Supplementary material


Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. MIAT, Université de Toulouse, INRA, Castanet-Tolosan, France
  2. INTHERES, Université de Toulouse, INRA, ENVT, Toulouse, France