Abstract
Frequently the predictor space of a multivariate regression problem of the type $y = m(x_1, \ldots, x_p) + \varepsilon$ is intrinsically one-dimensional, or at least of far lower dimension than $p$. Common modeling approaches such as the additive model $y = m_1(x_1) + \ldots + m_p(x_p) + \varepsilon$, which try to reduce the complexity of the regression problem by making additional structural assumptions, are then inefficient, as they ignore the inherent structure of the predictor space and involve complicated model and variable selection stages. In a fundamentally different approach, one may first approximate the predictor space by a (usually nonlinear) curve passing through it, and then regress the response only against the one-dimensional projections onto this curve. This reduces a $p$-dimensional regression problem to a one-dimensional one.
As a tool for the compression of the predictor space we apply local principal curves. Building on the results presented in Einbeck et al. (Classification – The Ubiquitous Challenge. Springer, Heidelberg, 2005, pp. 256–263), we show how local principal curves can be parametrized and how the projections are obtained. The regression step can then be carried out using any nonparametric smoother. We illustrate the technique using data from the physical sciences.
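The two-stage idea described above (project each predictor vector onto a curve, then smooth the response against the resulting one-dimensional parameter) can be illustrated with a minimal sketch. It is not the authors' algorithm: the curve is assumed to be already available as a polyline (here a hand-made half circle stands in for a fitted local principal curve), the parametrization is taken to be arc length along that polyline, and the smoother is a plain Gaussian kernel (Nadaraya-Watson) estimator. The function names `project_onto_polyline` and `kernel_smoother`, the simulated data, and the bandwidth are all hypothetical choices made for illustration.

```python
import numpy as np


def project_onto_polyline(X, curve):
    """Arc-length parameter of the closest point on a polyline for each row of X.

    X     : (n, p) array of predictor vectors.
    curve : (k + 1, p) array of ordered, distinct polyline vertices
            (assumed stand-in for a fitted principal curve).
    """
    seg_start = curve[:-1]                       # (k, p) segment start points
    seg_vec = curve[1:] - curve[:-1]             # (k, p) segment directions
    seg_len = np.linalg.norm(seg_vec, axis=1)    # (k,)  segment lengths
    # Arc length accumulated at the start of each segment.
    start_len = np.concatenate([[0.0], np.cumsum(seg_len)])[:-1]

    # Relative position of the orthogonal foot point on each segment,
    # clipped to [0, 1] so that it stays on the segment.
    diff = X[:, None, :] - seg_start[None, :, :]             # (n, k, p)
    t = np.einsum("nkp,kp->nk", diff, seg_vec) / seg_len**2  # (n, k)
    t = np.clip(t, 0.0, 1.0)

    # Distance from each point to its foot point on each segment.
    foot = seg_start[None, :, :] + t[..., None] * seg_vec[None, :, :]
    dist = np.linalg.norm(X[:, None, :] - foot, axis=2)      # (n, k)

    # Keep the nearest segment and convert to arc length.
    best = np.argmin(dist, axis=1)
    rows = np.arange(X.shape[0])
    return start_len[best] + t[rows, best] * seg_len[best]


def kernel_smoother(s_train, y_train, s_new, bandwidth):
    """Nadaraya-Watson estimate with a Gaussian kernel (one possible smoother)."""
    u = (s_new[:, None] - s_train[None, :]) / bandwidth
    w = np.exp(-0.5 * u**2)
    return (w @ y_train) / w.sum(axis=1)


# Simulated example: two predictors scattered around a half circle,
# with a response that depends only on the position along the arc.
rng = np.random.default_rng(0)
angle = rng.uniform(0.0, np.pi, size=200)
X = np.column_stack([np.cos(angle), np.sin(angle)])
X = X + rng.normal(scale=0.05, size=X.shape)
y = angle**2 + rng.normal(scale=0.1, size=angle.shape)

# Hypothetical "fitted" curve: a polyline tracing the half circle.
grid = np.linspace(0.0, np.pi, 50)
curve = np.column_stack([np.cos(grid), np.sin(grid)])

s = project_onto_polyline(X, curve)            # one-dimensional projections
y_hat = kernel_smoother(s, y, s, bandwidth=0.2)  # regression on the projections
```

In this sketch the regression step only ever sees the scalar `s`, so any other nonparametric smoother could be substituted for `kernel_smoother` without touching the projection step.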
References
Bailer-Jones, C. A. L. (2002). Determination of stellar parameters with GAIA. Astrophysics and Space Science, 280, 21–29.
Banfield, J. D., & Raftery, A. E. (1992). Ice floe identification in satellite images using mathematical morphology and clustering about principal curves. Journal of the American Statistical Association, 87, 7–16.
Brunsdon, C. (2007). Path estimation from GPS tracks. In Geocomputation. NUI Maynooth, Ireland.
Chang, K., & Ghosh, J. (1998). Principal curves for nonlinear feature extraction and classification. SPIE Applications of Artificial Neural Networks in Image Processing III, 3307, 120–129.
Delicado, P. (2001). Another look at principal curves and surfaces. Journal of Multivariate Analysis, 77, 84–116.
Einbeck, J., Tutz, G., & Evers, L. (2005a). Exploring multivariate data structures with local principal curves. In C. Weihs, & W. Gaul (Eds.), Classification – The ubiquitous challenge (pp. 256–263). Heidelberg: Springer.
Einbeck, J., Tutz, G., & Evers, L. (2005b). Local principal curves. Statistics and Computing, 15, 301–313.
Hastie, T., & Stuetzle, W. (1989). Principal curves. Journal of the American Statistical Association, 84, 502–516.
Hastie, T., Tibshirani, R., & Friedman, J. (2001). The elements of statistical learning. New York: Springer.
Kégl, B., & Krzyżak, A. (2002). Piecewise linear skeletonization using principal curves. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24, 59–74.
Kégl, B., Krzyżak, A., Linder, T., & Zeger, K. (2000). Learning and design of principal curves. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22, 281–297.
Peña, M., Barbakh, W., & Fyfe, C. (2008). Elastic maps and nets for approximating principal manifolds and their application to microarray data visualization. In A. N. Gorban, et al. (Eds.), Principal manifolds for data visualization and dimension reduction (pp. 131–150). Berlin: Springer.
Acknowledgements
We would like to thank Coryn Bailer-Jones, leader of the group “Astrophysical parameters” based at MPIA Heidelberg, for providing the simulated GAIA data and explaining the background of the GAIA mission. The collaboration between the first two authors was supported by LMS Grant Ref 4709. The third author was funded by an EPSRC Vacation Bursary.
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Einbeck, J., Evers, L., Hinchliff, K. (2009). Data Compression and Regression Based on Local Principal Curves. In: Fink, A., Lausen, B., Seidel, W., Ultsch, A. (eds) Advances in Data Analysis, Data Handling and Business Intelligence. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01044-6_64
DOI: https://doi.org/10.1007/978-3-642-01044-6_64
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01043-9
Online ISBN: 978-3-642-01044-6
eBook Packages: Mathematics and Statistics (R0)