Geometrical Complexity of Data Approximators

  • Evgeny M. Mirkes
  • Andrei Zinovyev
  • Alexander N. Gorban
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7902)

Abstract

Many methods have been developed to approximate a cloud of vectors embedded in a high-dimensional space by simpler objects, ranging from principal points and linear manifolds to self-organizing maps, neural gas, elastic maps, and various types of principal curves and principal trees, among others. For each type of approximator, a measure of its complexity has been developed as well. These measures are needed to balance accuracy against complexity and to define the optimal approximation of a given type. We propose a measure of complexity (geometrical complexity) that is applicable to approximators of several types and that allows data approximations of different types to be compared.

Keywords

Data analysis; Approximation algorithms; Data structures; Data complexity; Model selection

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Evgeny M. Mirkes (1)
  • Andrei Zinovyev (2, 3, 4)
  • Alexander N. Gorban (1)

  1. Department of Mathematics, University of Leicester, UK
  2. Institut Curie, Paris, France
  3. INSERM U900, Paris, France
  4. Mines ParisTech, Fontainebleau, France