Principal Curves of Oriented Points: theoretical and computational improvements

Computational Statistics

Summary

Principal curves were introduced by Hastie & Stuetzle (1989) as smooth parametric curves passing through the middle of a multidimensional data set. Delicado (2001) defines Principal Curves of Oriented Points, based on the fixed points of a function from R^p into itself. This definition is nonparametric, and smoothing methods are used to find the principal curves of a data set. Here we extend this work in two directions. First, we propose a bandwidth choice method based on the Minimum Spanning Tree of the data set. Second, we present an object-oriented application that implements the principal curves computation for any dimension in a flexible recursive way. Examples on synthetic and real data are included.

References

  • Avram, F. & Bertsimas, D. (1992), ‘The minimum spanning tree constant in geometrical probability and under the independent model: a unified approach’, The Annals of Applied Probability 2(1), 113–130.

  • Avram, F. & Bertsimas, D. (1993), ‘On central limit theorems in geometrical probability’, The Annals of Applied Probability 3(4), 1033–1046.

  • Banfield, J. D. & Raftery, A. E. (1992), ‘Ice floe identification in satellite images using mathematical morphology and clustering about principal curves’, Journal of the American Statistical Association 87, 7–16.

  • Beardwood, J., Halton, J. H. & Hammersley, J. M. (1959), ‘The shortest path through many points’, Proc. Cambridge Philos. Soc. 55, 299–327.

  • Bishop, C. M. & Tipping, M. E. (1998), ‘A hierarchical latent variable model for data visualization’, IEEE Transactions on Pattern Analysis and Machine Intelligence 20, 281–293.

  • Bishop, C., Svensén, M. & Williams, C. K. I. (1998), ‘GTM: The generative topographic mapping’, Neural Computation 10(1), 215–234.

  • Caroni, C. & Prescott, P. (1995), ‘On Rohlf’s method for the detection of outliers in multivariate data’, Journal of Multivariate Analysis 52, 295–307.

  • Chang, K. & Ghosh, J. (2001), ‘A unified model for probabilistic principal surfaces’, IEEE Transactions on Pattern Analysis and Machine Intelligence 23, 22–41.

  • Delicado, P. (2001), ‘Another look at principal curves and surfaces’, Journal of Multivariate Analysis 77, 84–116.

  • Delicado, P. & Huerta, M. (2002), Principal Curves of Oriented Points: Theoretical and computational improvements, Technical Report DR 2002/06, EIO, UPC. (Available at http://www-eio.upc.es/~delicado/PCOP/).

  • Dong, D. & McAvoy, T. J. (1996), ‘Nonlinear principal component analysis based on principal curves and neural networks’, Computers Chem. Engng. 20, 65–78.

  • Friedman, J. H. (1991), ‘Multivariate adaptive regression splines’, The Annals of Statistics 19, 1–141. (With discussion).

  • Greenacre, M. J. (1993), Correspondence analysis in practice, Academic Press.

  • Guggenheimer, H. W. (1977), Differential Geometry, Dover Publications.

  • Hastie, T. (1984), Principal curves and surfaces, Laboratory for Computational Statistics Technical Report 11, Stanford University, Dept. of Statistics.

  • Hastie, T. & Stuetzle, W. (1989), ‘Principal curves’, Journal of the American Statistical Association 84, 502–516.

  • IPC (2000), ‘International Data Base’, International Programs Center, U.S. Census Bureau. http://www.census.gov/ipc/www/idbnew.html.

  • Kégl, B., Krzyzak, A., Linder, T. & Zeger, K. (2000), ‘Learning and design of principal curves’, IEEE Trans. Pattern Analysis and Machine Intelligence 22(3), 281–297.

  • LeBlanc, M. & Tibshirani, R. J. (1994), ‘Adaptive principal surfaces’, Journal of the American Statistical Association 89, 53–64.

  • Mulier, F. & Cherkassky, V. (1995), ‘Self-organization as an iterative kernel smoothing process’, Neural Computation 7(6), 1165–1177.

  • Rohlf, F. J. (1975), ‘Generalization of the gap test for the detection of multivariate outliers’, Biometrics 31, 93–102.

  • Sandilya, S. & Kulkarni, S. R. (2000), Principal curves with bounded turn, in ‘IEEE International Symposium on Information Theory’, p. 321.

  • Scott, D. (1992), Multivariate Density Estimation, Wiley.

  • Simonoff, J. (1995), Smoothing Methods in Statistics, Springer, New York.

  • Smola, A. J., Mika, S., Schölkopf, B. & Williamson, R. C. (2001), ‘Regularized principal manifolds’, Journal of Machine Learning Research 1, 179–209.

  • Steele, M. (1988), ‘Growth rates of Euclidean minimal spanning trees with power weighted edges’, The Annals of Probability 16(4), 1767–1787.

  • Steele, M. (1993), ‘Probability and problems in Euclidean combinatorial optimization’, Statistical Science 8(1), 48–56.

  • Tan, S. & Mavrovouniotis, M. L. (1995), ‘Reducing data dimensionality through optimizing neural network inputs’, AIChE Journal 41, 1471–1480.

  • Tarpey, T. & Flury, B. (1996), ‘Self-consistency: A fundamental concept in statistics’, Statistical Science 11, 229–243.

  • Tibshirani, R. J. (1992), ‘Principal curves revisited’, Statistics and Computing 2, 183–190.

  • Tipping, M. E. & Bishop, C. M. (1999), ‘Probabilistic principal component analysis’, Journal of the Royal Statistical Society, Series B, Methodological 61, 611–622.

  • Verbeek, J., Vlassis, N. & Kröse, B. (2002), ‘A k-segments algorithm for finding principal curves’, Pattern Recognition Letters 23, 1009–1017.

  • Wand, M. & Jones, M. (1995), Kernel Smoothing, Chapman and Hall, London.

Author information

Corresponding author

Correspondence to Pedro Delicado.

Additional information

This work was partially supported by the Spanish DGES grants PB98-0919 and BFM 2001-2327, and by the European Commission project HPCF CT-2000-00041. We are grateful to Valentín Navarro, who helped us pre-process the population pyramids data set. The authors would like to thank two anonymous referees for their helpful comments and suggestions. Running head: Principal Curves of Oriented Points.

About this article

Cite this article

Delicado, P., Huerta, M. Principal Curves of Oriented Points: theoretical and computational improvements. Computational Statistics 18, 293–315 (2003). https://doi.org/10.1007/s001800300145

  • DOI: https://doi.org/10.1007/s001800300145
