Manifold Learning in Data Mining Tasks

  • Conference paper
Machine Learning and Data Mining in Pattern Recognition (MLDM 2014)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8556)

Abstract

Many Data Mining tasks deal with data presented in high-dimensional spaces, where the ‘curse of dimensionality’ phenomenon is often an obstacle to the use of many methods for solving these tasks. To avoid this phenomenon, various Representation learning algorithms are used as a first key step in solving these tasks: they transform the original high-dimensional data into lower-dimensional representations that preserve as much as possible of the information about the original data that the considered Data Mining task requires. These Representation learning problems are formulated as various Dimensionality Reduction problems (Sample Embedding, Data Manifold Embedding, Manifold Learning, and the newly proposed Tangent Bundle Manifold Learning), which are motivated by various Data Mining tasks. A new geometrically motivated algorithm is presented that solves the Tangent Bundle Manifold Learning problem and gives new solutions for all the considered Dimensionality Reduction problems.
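
For illustration only, the following minimal sketch shows the simplest linear instance of the embedding and reconstruction idea described in the abstract, using Principal Component Analysis computed with NumPy. It is not the algorithm proposed in the paper. The function names (fit_pca, embed, reconstruct), the synthetic data, and the dimensions are all assumptions made for this example.

```python
import numpy as np


def fit_pca(X, q):
    """Fit a q-dimensional linear embedding to the rows of X (shape n x p)."""
    mean = X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    basis = Vt[:q].T  # shape (p, q), orthonormal columns
    return mean, basis


def embed(X, mean, basis):
    """Map p-dimensional points (sample or out-of-sample) to q-dimensional features."""
    return (X - mean) @ basis


def reconstruct(Y, mean, basis):
    """Map q-dimensional features back to the original p-dimensional space."""
    return mean + Y @ basis.T


rng = np.random.default_rng(0)

# Hypothetical data: 200 points near a 2-D plane in R^10, with small noise.
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 10))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 10))

mean, basis = fit_pca(X, q=2)
Y = embed(X, mean, basis)            # low-dimensional representation
X_hat = reconstruct(Y, mean, basis)  # recovered high-dimensional points

# Relative reconstruction error: small when the data lie close to a 2-D plane.
rel_error = np.linalg.norm(X - X_hat) / np.linalg.norm(X - mean)
```

In this linear case the column span of basis also serves as the estimated tangent space at every point, because a plane has the same tangent space everywhere. On a curved data manifold the tangent space changes from point to point, which is the situation the nonlinear problems listed in the abstract are concerned with.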

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Kuleshov, A., Bernstein, A. (2014). Manifold Learning in Data Mining Tasks. In: Perner, P. (eds) Machine Learning and Data Mining in Pattern Recognition. MLDM 2014. Lecture Notes in Computer Science, vol 8556. Springer, Cham. https://doi.org/10.1007/978-3-319-08979-9_10

  • DOI: https://doi.org/10.1007/978-3-319-08979-9_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-08978-2

  • Online ISBN: 978-3-319-08979-9

  • eBook Packages: Computer Science, Computer Science (R0)
