
Dimension Estimation Using Autoencoders and Application

In: Deep Learning Applications, Volume 3

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1395)


Abstract

Dimension Estimation (DE) and Dimension Reduction (DR) are two closely related topics, but with quite different goals. In DR, one attempts to project a random vector, either linearly or nonlinearly, to a lower-dimensional space that preserves the information contained in the original higher-dimensional space. In DE, by contrast, one attempts to estimate the intrinsic dimensionality, or number of latent variables, in a set of measurements of a random vector. DE and DR are closely linked because reducing the dimension below the value suggested by DE will likely lead to information loss. In particular, for linear methods such as Principal Component Analysis (PCA), DE and DR are often accomplished simultaneously. In this chapter, however, we focus on a particular class of deep neural networks called autoencoders (AEs), which are used extensively for DR but are less well studied for DE. We show that several important questions arise when using AEs for DE, above and beyond those that arise for more classic DR/DE techniques such as PCA. We address AE architectural choices and regularization techniques that allow one to transform AE latent layer representations into estimates of intrinsic dimension. We demonstrate the effectiveness of our techniques on synthetic data, image processing benchmark problems, and, most importantly, diverse applications such as the analysis of financial markets and network security.
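
The following is a minimal sketch, not the chapter's exact procedure: it illustrates the two ideas above with PCA-based dimension estimation (counting singular values needed to capture most of the variance) and a hypothetical analogue for an autoencoder, where intrinsic dimension is read off by counting latent units that remain "active" under regularization. The variance threshold, tolerance, and synthetic data are illustrative assumptions.

```python
# Illustrative sketch only (assumed thresholds; not the authors' exact method).
import numpy as np

def pca_dimension_estimate(X, energy=0.95):
    """Estimate intrinsic dimension as the number of principal components
    needed to explain `energy` of the total variance."""
    Xc = X - X.mean(axis=0)                      # center the data
    s = np.linalg.svd(Xc, compute_uv=False)      # singular values
    explained = np.cumsum(s**2) / np.sum(s**2)   # cumulative variance fraction
    return int(np.searchsorted(explained, energy) + 1)

def active_latent_units(Z, tol=1e-2):
    """Given latent-layer activations Z (samples x latent units) from a
    regularized autoencoder, count units with non-negligible variance;
    units squeezed to (near-)constant output carry no information."""
    activity = Z.var(axis=0)
    return int(np.sum(activity > tol * activity.max()))

if __name__ == "__main__":
    # Synthetic check: 3 latent factors embedded linearly in 20 dimensions.
    rng = np.random.default_rng(0)
    latent = rng.normal(size=(1000, 3))
    X = latent @ rng.normal(size=(3, 20)) + 0.01 * rng.normal(size=(1000, 20))
    print("PCA dimension estimate:", pca_dimension_estimate(X))  # expect 3
```

In the linear case the two views coincide: the optimal bottleneck of a linear autoencoder spans the same subspace as the leading principal components, which is why PCA can perform DE and DR simultaneously; the nonlinear AE setting is where the additional architectural and regularization questions addressed in the chapter arise.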

This chapter is an extension of the conference paper, with additional results. (This document does not contain technology or Technical Data controlled under either the U.S. International Traffic in Arms Regulations or the U.S. Export Administration Regulations.)


Notes

  1. We use the notation \(\theta \) for the nonlinear activation function instead of the more standard \(\sigma \), as \(\sigma \) is also used to denote singular values.

  2. The S&P 500 is an equity index that measures the stock performance of 500 large companies listed on stock exchanges in the United States.


Acknowledgements

Results in this paper were obtained in part using a high-performance computing system acquired through NSF MRI grant DMS-1337943 to WPI.

Author information


Corresponding author

Correspondence to Randy Paffenroth.


Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (zip 190806 KB)


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this chapter


Cite this chapter

Bahadur, N., Lewandowski, B., Paffenroth, R. (2022). Dimension Estimation Using Autoencoders and Application. In: Wani, M.A., Raj, B., Luo, F., Dou, D. (eds) Deep Learning Applications, Volume 3. Advances in Intelligent Systems and Computing, vol 1395. Springer, Singapore. https://doi.org/10.1007/978-981-16-3357-7_4
