Data Compression and Regression Based on Local Principal Curves

  • Conference paper
Advances in Data Analysis, Data Handling and Business Intelligence

Abstract

Frequently the predictor space of a multivariate regression problem of the type $y = m(x_1, \ldots, x_p) + \varepsilon$ is intrinsically one-dimensional, or at least of far lower dimension than $p$. Usual modeling attempts such as the additive model $y = m_1(x_1) + \ldots + m_p(x_p) + \varepsilon$, which try to reduce the complexity of the regression problem by making additional structural assumptions, are then inefficient as they ignore the inherent structure of the predictor space and involve complicated model and variable selection stages. In a fundamentally different approach, one may consider first approximating the predictor space by a (usually nonlinear) curve passing through it, and then regressing the response only against the one-dimensional projections onto this curve. This entails the reduction from a $p$- to a one-dimensional regression problem.

As a tool for the compression of the predictor space we apply local principal curves. Building on the results presented in Einbeck et al. (Classification – The Ubiquitous Challenge. Springer, Heidelberg, 2005, pp. 256–263), we show how local principal curves can be parametrized and how the projections are obtained. The regression step can then be carried out using any nonparametric smoother. We illustrate the technique using data from the physical sciences.
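
The three stages described in the abstract (a curve through the predictor space, projection onto it, one-dimensional smoothing) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the Gaussian kernel, the bandwidth `h`, the step length `step`, the stopping tolerance, the simulated helix data and the Nadaraya-Watson smoother are all assumptions chosen for the example. The curve tracer follows the general idea of local principal curves in simplified form, alternating a local centre of mass with a step along the first local principal component.

```python
import numpy as np


def _trace(X, start, h, step, sign, n_steps):
    """Follow the data cloud in one direction, collecting local means."""
    x = np.asarray(start, dtype=float)
    prev_dir, nodes = None, []
    for _ in range(n_steps):
        # Gaussian kernel weights centred at the current position x
        w = np.exp(-0.5 * np.sum((X - x) ** 2, axis=1) / h**2)
        w /= w.sum()
        mu = w @ X  # local centre of mass
        if nodes and np.linalg.norm(mu - nodes[-1]) < 1e-2 * step:
            break  # local means have converged: end of the cloud reached
        nodes.append(mu)
        # First eigenvector of the weighted local covariance around mu
        D = X - mu
        _, vecs = np.linalg.eigh((D * w[:, None]).T @ D)
        direction = vecs[:, -1]  # eigh sorts eigenvalues in ascending order
        if prev_dir is None:
            direction = sign * direction
        elif direction @ prev_dir < 0:
            direction = -direction  # keep a consistent orientation
        prev_dir = direction
        x = mu + step * direction
    return np.array(nodes)


def local_principal_curve(X, start, h, step, n_steps=200):
    """Trace the curve in both directions from `start` (assumed settings)."""
    forward = _trace(X, start, h, step, +1.0, n_steps)
    backward = _trace(X, start, h, step, -1.0, n_steps)
    # Both branches begin at the same local mean, so drop one copy
    return np.vstack([backward[::-1], forward[1:]])


def project(X, curve):
    """Arc-length position of each row of X on the polyline `curve`."""
    a, d = curve[:-1], np.diff(curve, axis=0)  # segment starts and vectors
    seg_len = np.linalg.norm(d, axis=1)
    t0 = np.concatenate([[0.0], np.cumsum(seg_len)])[:-1]
    # Relative foot-point position on every segment, clipped to the segment
    rel = np.einsum("nsp,sp->ns", X[:, None, :] - a[None], d) / seg_len**2
    rel = np.clip(rel, 0.0, 1.0)
    foot = a[None] + rel[:, :, None] * d[None]
    best = np.linalg.norm(X[:, None, :] - foot, axis=2).argmin(axis=1)
    rows = np.arange(len(X))
    return t0[best] + rel[rows, best] * seg_len[best]


def kernel_smoother(t_train, y_train, t_new, bw):
    """Nadaraya-Watson estimate of E[y | t] with a Gaussian kernel."""
    w = np.exp(-0.5 * ((t_new[:, None] - t_train[None, :]) / bw) ** 2)
    return (w @ y_train) / w.sum(axis=1)


# Simulated example: three predictors lying close to a helix segment
rng = np.random.default_rng(0)
s = np.linspace(0.0, 3.0, 300)
X = np.column_stack([np.cos(s), np.sin(s), 0.5 * s])
X = X + rng.normal(scale=0.02, size=X.shape)
y = np.sin(2.0 * s) + rng.normal(scale=0.1, size=s.size)

curve = local_principal_curve(X, start=X[150], h=0.2, step=0.1)
t = project(X, curve)                      # one-dimensional projections
y_hat = kernel_smoother(t, y, t, bw=0.15)  # regression of y on t only
```

Here `t` plays the role of the single predictor in the reduced problem, so any other univariate nonparametric smoother could be used in place of `kernel_smoother`.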

References

  • Bailer-Jones, C. A. L. (2002). Determination of stellar parameters with GAIA. Astrophysics and Space Science, 280, 21–29.

  • Banfield, J. D., & Raftery, A. E. (1992). Ice floe identification in satellite images using mathematical morphology and clustering about principal curves. Journal of the American Statistical Association, 87, 7–16.

  • Brunsdon, C. (2007). Path estimation from GPS tracks. In Geocomputation. NUI Maynooth, Ireland.

  • Chang, K., & Ghosh, J. (1998). Principal curves for nonlinear feature extraction and classification. SPIE Applications of Artificial Neural Networks in Image Processing III, 3307, 120–129.

  • Delicado, P. (2001). Another look at principal curves and surfaces. Journal of Multivariate Analysis, 77, 84–116.

  • Einbeck, J., Tutz, G., & Evers, L. (2005a). Exploring multivariate data structures with local principal curves. In C. Weihs, & W. Gaul (Eds.), Classification – The ubiquitous challenge (pp. 256–263). Heidelberg: Springer.

  • Einbeck, J., Tutz, G., & Evers, L. (2005b). Local principal curves. Statistics and Computing, 15, 301–313.

  • Hastie, T., & Stuetzle, W. (1989). Principal curves. Journal of the American Statistical Association, 84, 502–516.

  • Hastie, T., Tibshirani, R., & Friedman, J. (2001). The elements of statistical learning. New York: Springer.

  • Kégl, B., & Krzyżak, A. (2002). Piecewise linear skeletonization using principal curves. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24, 59–74.

  • Kégl, B., Krzyżak, A., Linder, T., & Zeger, K. (2000). Learning and design of principal curves. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22, 281–297.

  • Peña, M., Barbakh, W., & Fyfe, C. (2008). Elastic maps and nets for approximating principal manifolds and their application to microarray data visualization. In A. N. Gorban, et al. (Eds.), Principal manifolds for data visualization and dimension reduction (pp. 131–150). Berlin: Springer.

Acknowledgements

We would like to thank Coryn Bailer-Jones, leader of the group “Astrophysical parameters” based at MPIA Heidelberg, for providing the simulated GAIA data and explaining the background of the GAIA mission. The collaboration between the first two authors was supported by LMS Grant Ref 4709. The third author was funded by an EPSRC Vacation Bursary.

Author information

Corresponding author

Correspondence to Jochen Einbeck.

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Einbeck, J., Evers, L., Hinchliff, K. (2009). Data Compression and Regression Based on Local Principal Curves. In: Fink, A., Lausen, B., Seidel, W., Ultsch, A. (eds) Advances in Data Analysis, Data Handling and Business Intelligence. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01044-6_64
