A Functional Approach to Variable Selection in Spectrometric Problems
In spectrometric problems, objects are characterized by high-resolution spectra that correspond to hundreds or thousands of variables. In this context, even fast variable selection methods incur a high computational load. However, spectra are generally smooth and can therefore be accurately approximated by splines. In this paper, we propose to use a B-spline expansion as a pre-processing step before variable selection, in which the original variables are replaced by the coefficients of the B-spline expansions. Using a simple leave-one-out procedure, the optimal number of B-spline coefficients can be found efficiently. As there are generally an order of magnitude fewer coefficients than original spectral variables, selecting optimal coefficients is faster than selecting variables. Moreover, a B-spline coefficient depends only on a limited range of original variables, which preserves the interpretability of the selected variables. We demonstrate the benefits of the proposed method on real-world data.
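The pre-processing step described above can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions that the abstract does not state: the knots are uniformly spaced and clamped at both ends, the splines are cubic, the coefficients are obtained by ordinary least squares, and the leave-one-out error is taken over the sampled wavelengths using the standard hat-matrix shortcut for linear least squares. The function names `bspline_basis`, `loo_error`, and `select_n_coef` are hypothetical.

```python
import numpy as np

def bspline_basis(x, n_coef, degree=3):
    """Design matrix (len(x), n_coef) of clamped, uniform-knot B-splines (Cox-de Boor)."""
    lo, hi = x.min(), x.max()
    n_inner = n_coef - degree - 1                    # number of interior knots
    inner = np.linspace(lo, hi, n_inner + 2)[1:-1]
    t = np.concatenate([[lo] * (degree + 1), inner, [hi] * (degree + 1)])
    n0 = len(t) - 1
    # Degree 0: indicator function of each half-open knot interval.
    B = np.zeros((len(x), n0))
    for j in range(n0):
        B[:, j] = (t[j] <= x) & (x < t[j + 1])
    B[x == hi, n_coef - 1] = 1.0                     # right endpoint goes to the last non-empty interval
    # Raise the degree one step at a time; zero-width intervals contribute nothing.
    for d in range(1, degree + 1):
        Bn = np.zeros((len(x), n0 - d))
        for j in range(n0 - d):
            left = t[j + d] - t[j]
            right = t[j + d + 1] - t[j + 1]
            if left > 0:
                Bn[:, j] += (x - t[j]) / left * B[:, j]
            if right > 0:
                Bn[:, j] += (t[j + d + 1] - x) / right * B[:, j + 1]
        B = Bn
    return B

def loo_error(B, spectra):
    """Mean squared leave-one-out error over wavelengths, via the hat matrix H = Q Q^T."""
    Q, _ = np.linalg.qr(B)
    leverage = np.sum(Q ** 2, axis=1)                # diagonal of H
    fitted = spectra @ Q @ Q.T                       # spectra has shape (n_spectra, n_points)
    return np.mean(((spectra - fitted) / (1.0 - leverage)) ** 2)

def select_n_coef(x, spectra, candidates, degree=3):
    """Pick the number of coefficients with the lowest leave-one-out error."""
    errors = {k: loo_error(bspline_basis(x, k, degree), spectra) for k in candidates}
    best = min(errors, key=errors.get)
    B = bspline_basis(x, best, degree)
    coefs = np.linalg.lstsq(B, spectra.T, rcond=None)[0].T   # shape (n_spectra, best)
    return best, coefs, errors
```

Under these assumptions, the rows of `coefs` would replace the original spectra as inputs to whichever variable selection method is used afterwards. Because each B-spline is nonzero only on a few adjacent knot intervals, a selected coefficient can be traced back to a limited wavelength range.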