Abstract
Dimension Estimation (DE) and Dimension Reduction (DR) are two closely related topics with quite different goals. In DR, one attempts to project a random vector, either linearly or nonlinearly, onto a lower-dimensional space that preserves the information contained in the original higher-dimensional space. In DE, by contrast, one attempts to estimate the intrinsic dimensionality, or number of latent variables, in a set of measurements of a random vector. DE and DR are closely linked because reducing the dimension below the value suggested by DE will likely lead to information loss. In particular, for linear methods such as Principal Component Analysis (PCA), DE and DR are often accomplished simultaneously. In this paper, however, we focus on a particular class of deep neural networks called autoencoders (AEs), which are used extensively for DR but are less well studied for DE. We show that several important questions arise when using AEs for DE, above and beyond those that arise for more classic DR/DE techniques such as PCA. We address AE architectural choices and regularization techniques that allow one to transform AE latent-layer representations into estimates of intrinsic dimension. We demonstrate the effectiveness of our techniques on synthetic data, on image processing benchmark problems, and, most importantly, on diverse applications such as the analysis of financial markets and network security.
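To make the connection between linear DE and DR concrete, the following sketch estimates intrinsic dimension with PCA via the singular value decomposition. The function name and the 95% explained-variance threshold are illustrative assumptions, not the chapter's method:

```python
import numpy as np

def estimate_dimension_pca(X, var_threshold=0.95):
    """Estimate intrinsic dimension as the number of principal
    components needed to explain `var_threshold` of the variance.
    (Illustrative heuristic; the threshold is an assumption.)"""
    Xc = X - X.mean(axis=0)                    # center the data
    s = np.linalg.svd(Xc, compute_uv=False)    # singular values
    var = s**2 / np.sum(s**2)                  # explained-variance ratios
    return int(np.searchsorted(np.cumsum(var), var_threshold) + 1)

# Example: 3 latent variables embedded linearly in 20 dimensions
rng = np.random.default_rng(0)
Z = rng.normal(size=(500, 3))                  # latent variables
A = rng.normal(size=(3, 20))                   # linear embedding
X = Z @ A + 0.01 * rng.normal(size=(500, 20))  # small measurement noise
print(estimate_dimension_pca(X))               # -> 3
```

Here DE and DR coincide: the same singular values that suggest the intrinsic dimension also define the PCA projection onto that many components.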
This chapter is an extension of the conference paper, with additional results. (This document does not contain technology or Technical Data controlled under either the U.S. International Traffic in Arms Regulations or the U.S. Export Administration Regulations.)
Notes
1. We use the notation \(\theta \) for the nonlinear activation function instead of the more standard \(\sigma \), as \(\sigma \) is also used to denote singular values.
2. S&P 500 is an equity index that measures the stock performance of 500 large companies listed on stock exchanges in the United States.
References
Abdulhammed, R., Faezipour, M., Musafer, H., Abuzneid, A.: Efficient network intrusion detection using PCA-based dimensionality reduction of features. In: 2019 International Symposium on Networks, Computers and Communications (ISNCC). IEEE (2019). https://doi.org/10.1109/isncc.2019.8909140
Abramowitz, M., Stegun, I.A., Romer, R.H.: Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables (1988)
Albanese, C., Jackson, K., Wiberg, P.: Dimension reduction in the computation of value-at-risk. The Journal of Risk Finance 3(4), 41–53 (2002)
Alexeev, V., Tapon, F.: Equity portfolio diversification: how many stocks are enough? Evidence from five developed markets. FIRN Research Paper, 28 November 2012 (2012)
Bahadur, N., Paffenroth, R., Gajamannage, K.: Dimension estimation of equity markets. In: 2019 IEEE International Conference on Big Data (Big Data), pp. 5491–5498. IEEE (2019)
Bhuyan, M.H., Bhattacharyya, D.K., Kalita, J.K.: Network anomaly detection: methods, systems and tools. IEEE Commun. Surv. & Tutor. 16(1), 303–336 (2014). https://doi.org/10.1109/surv.2013.052213.00046
Buczak, A.L., Guven, E.: A survey of data mining and machine learning methods for cyber security intrusion detection. IEEE Commun. Surv. & Tutor. 18(2), 1153–1176 (2016). https://doi.org/10.1109/COMST.2015.2494502
Elsayed, M.S., Le-Khac, N.A., Dev, S., Jurcut, A.D.: DDoSNet: a deep learning model for detecting network attacks (2020)
Evans, J.L., Archer, S.H.: Diversification and the reduction of dispersion: An empirical analysis. J. Financ. 23(5), 761–767 (1968)
Golub, G., Reinsch, C.: Singular value decomposition and least squares solutions. Numer. Math. 14, 403–420 (1970)
Hotelling, H.: Analysis of a complex of statistical variables into principal components. Journal of Educational Psychology 24(6), 417 (1933)
Hoyle, B., Rau, M.M., Paech, K., Bonnett, C., Seitz, S., Weller, J.: Anomaly detection for machine learning redshifts applied to SDSS galaxies. Mon. Not. R. Astron. Soc. 452(4), 4183–4194 (2015)
Jolliffe, I.: Principal Component Analysis. Wiley Online Library (2002)
LeCun, Y.: The MNIST database of handwritten digits (1998). http://yann.lecun.com/exdb/mnist/
Lee, I.: Big data: Dimensions, evolution, impacts, and challenges. Bus. Horiz. 60(3), 293–303 (2017)
Lee, J.A., Verleysen, M.: Nonlinear Dimensionality Reduction. Springer Science & Business Media (2007)
Li, C., Farkhoor, H., Liu, R., Yosinski, J.: Measuring the intrinsic dimension of objective landscapes (2018). arXiv:1804.08838
Mira, J., Sandoval, F.: From Natural to Artificial Neural Computation: International Workshop on Artificial Neural Networks, Malaga-Torremolinos, Spain, 7–9 June 1995: Proceedings, vol. 930. Springer Science & Business Media (1995)
Nair, V., Hinton, G.E.: Rectified linear units improve restricted Boltzmann machines. In: Proceedings of the 27th International Conference on Machine Learning (ICML-10), pp. 807–814 (2010)
Nassirtoussi, A.K., Aghabozorgi, S., Wah, T.Y., Ngo, D.C.L.: Text mining of news-headlines for forex market prediction: A multi-layer dimension reduction algorithm with semantics and sentiment. Expert Syst. Appl. 42(1), 306–324 (2015)
Ng, A., et al.: Sparse autoencoder. CS294A Lecture Notes 72(2011), 1–19 (2011)
Parsons, T.L., Rogers, T.: Dimension reduction for stochastic dynamical systems forced onto a manifold by large drift: a constructive approach with examples from theoretical biology. J. Phys. A: Math. Theor. 50(41), 415601 (2017)
Plaut, E.: From principal subspaces to principal components with linear autoencoders (2018). arXiv:1804.10253
Rathnayaka, R., Wang, Z., Seneviratna, D., Nagahawatta, S.: An econometric evaluation of Colombo stock exchange: evidence from ARMA & PCA approach. In: Proceedings of the 2nd International Conference on Management and Economics, p. 10 (2013)
Rumelhart, D.E., Hinton, G.E., Williams, R.J.: Learning internal representations by error propagation. Technical report, California Univ San Diego La Jolla Inst for Cognitive Science (1985)
Samaria, F.S., Harter, A.C.: Parameterisation of a stochastic model for human face identification. In: Proceedings of 1994 IEEE Workshop on Applications of Computer Vision, pp. 138–142. IEEE (1994)
Sharafaldin, I., Lashkari, A.H., Ghorbani, A.A.: Toward generating a new intrusion detection dataset and intrusion traffic characterization. In: Proceedings of the 4th International Conference on Information Systems Security and Privacy. SCITEPRESS - Science and Technology Publications (2018). https://doi.org/10.5220/0006639801080116
Statman, M.: How many stocks make a diversified portfolio? Journal of financial and quantitative analysis 22(3), 353–363 (1987)
Tang, G.Y.: How efficient is naive portfolio diversification? an educational note. Omega 32(2), 155–160 (2004)
Van Der Maaten, L., Postma, E., Van den Herik, J.: Dimensionality reduction: a comparative review. J. Mach. Learn. Res. 10(66–71), 13 (2009)
Vincent, P., Larochelle, H., Bengio, Y., Manzagol, P.A.: Extracting and composing robust features with denoising autoencoders. In: Proceedings of the 25th International Conference on Machine Learning, pp. 1096–1103. ACM (2008)
Wang, X.: On the effects of dimension reduction techniques on some high-dimensional problems in finance. Operations Research 54(6), 1063–1078 (2006)
Wang, Y., Yao, H., Zhao, S.: Auto-encoder based dimensionality reduction. Neurocomputing 184, 232–242 (2016)
Wani, M.A., Bhat, F.A., Afzal, S., Khan, A.I.: Advances in Deep Learning. Springer, Singapore (2020). https://doi.org/10.1007/978-981-13-6794-6
Wani, M.A., Kantardzic, M., Sayed-Mouchaweh, M. (eds.): Deep Learning Applications. Springer, Singapore (2020). https://doi.org/10.1007/978-981-15-1816-4
Wani, M.A., Khoshgoftaar, T.M., Palade, V. (eds.): Deep Learning Applications, vol. 2. Springer, Singapore (2021). https://doi.org/10.1007/978-981-15-6759-9
Wylie, C.R., Barrett, L.C.: Advanced Engineering Mathematics (1960)
Zhao, W., Du, S.: Spectral-spatial feature extraction for hyperspectral image classification: A dimension reduction and deep learning approach. IEEE Trans. Geosci. Remote Sens. 54(8), 4544–4554 (2016)
Zhou, C., Paffenroth, R.C.: Anomaly detection with robust deep autoencoders. In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 665–674 (2017)
Acknowledgements
Results in this paper were obtained in part using a high-performance computing system acquired through NSF MRI grant DMS-1337943 to WPI.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Bahadur, N., Lewandowski, B., Paffenroth, R. (2022). Dimension Estimation Using Autoencoders and Application. In: Wani, M.A., Raj, B., Luo, F., Dou, D. (eds) Deep Learning Applications, Volume 3. Advances in Intelligent Systems and Computing, vol 1395. Springer, Singapore. https://doi.org/10.1007/978-981-16-3357-7_4
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-3356-0
Online ISBN: 978-981-16-3357-7
eBook Packages: Intelligent Technologies and Robotics (R0)