
On the effects of dimensionality on data analysis with neural networks

  • Conference paper
Artificial Neural Nets Problem Solving Methods (IWANN 2003)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2687)

Abstract

Modern data analysis often faces high-dimensional data. Nevertheless, most neural network data analysis tools are not adapted to high-dimensional spaces, because they rely on conventional concepts (such as the Euclidean distance) that scale poorly with dimension. This paper shows some limitations of such concepts and suggests some research directions, such as the use of alternative distance definitions and of non-linear dimension reduction.
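
The poor scaling of distance-based concepts mentioned in the abstract can be illustrated with a small numerical experiment. The sketch below is not taken from the paper: the function name `relative_contrast`, the uniform sampling in the unit hypercube, the choice of the origin as query point, and the sample size are all assumptions made for illustration. It measures how much the farthest and the nearest point differ, relative to the nearest one, as the dimension grows.

```python
import math
import random


def relative_contrast(dim, n_points=500, p=2.0, seed=0):
    """Return (d_max - d_min) / d_min for Minkowski-p distances from the
    origin to n_points drawn uniformly from the unit hypercube [0, 1]^dim.

    Hypothetical helper for illustration only; p=2 is the Euclidean distance.
    """
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(sum(abs(x) ** p for x in point) ** (1.0 / p))
    d_min, d_max = min(dists), max(dists)
    return (d_max - d_min) / d_min


if __name__ == "__main__":
    # The contrast is expected to shrink as the dimension increases.
    for dim in (2, 10, 100, 1000):
        print(dim, round(relative_contrast(dim), 3))
```

Under these assumptions the printed contrast decreases with the dimension, meaning that the nearest and farthest points become almost equally distant. The `p` parameter is exposed so that other Minkowski exponents can be tried in place of the Euclidean one; whether a given exponent helps on real data is not something this toy experiment can establish.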

MV is a Senior research associate at the Belgian FNRS. GS is funded by the Belgian FRIA. The work of DF and VW is supported by the Interuniversity Attraction Pole (IAP), initiated by the Belgian Federal State, Ministry of Sciences, Technologies and Culture. The scientific responsibility rests with the authors.

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Verleysen, M., Francois, D., Simon, G., Wertz, V. (2003). On the effects of dimensionality on data analysis with neural networks. In: Mira, J., Álvarez, J.R. (eds) Artificial Neural Nets Problem Solving Methods. IWANN 2003. Lecture Notes in Computer Science, vol 2687. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44869-1_14

  • DOI: https://doi.org/10.1007/3-540-44869-1_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40211-4

  • Online ISBN: 978-3-540-44869-3

  • eBook Packages: Springer Book Archive
