Abstract
Multidimensional scaling (MDS) is a well-known and widely used technique for mapping data from a high-dimensional space to a lower-dimensional one. Although MDS is a versatile dimensionality reduction method for data visualization, it is computationally demanding, especially when the data set is not fixed and keeps growing. Traditional MDS approaches are limited when analyzing very large data sets because they require long computation times and large amounts of memory. A way to minimize MDS stress that is suitable for reducing the dimensionality of large-scale data has been developed using the ideas of Geometric MDS, in which all points in the low-dimensional space change their coordinates simultaneously and independently during a single iteration of stress minimization. This paper shows that Geometric MDS enables parallel computation of dimensionality reduction for large-scale data on multithreaded multi-core processors. We explore how the computation time of dimensionality reduction and multidimensional data visualization depends on the number of processor cores or threads used.
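The property the abstract relies on, that each point's new coordinates depend only on the previous layout, can be illustrated with a minimal Python sketch. The abstract does not state the update rule, so the sketch assumes a centroid-type step: for point j, each other point i contributes an auxiliary point on the ray from Y_i through Y_j at distance d_ij from Y_i, and the new Y_j is the mean of those auxiliary points. The function names, the `workers` parameter, and the handling of coincident points are hypothetical choices made for illustration, not the authors' implementation.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def stress(Y, D):
    """Raw MDS stress: sum over pairs of (d_ij - ||Y_i - Y_j||)^2."""
    m = len(Y)
    return sum((D[i][j] - math.dist(Y[i], Y[j])) ** 2
               for i in range(m) for j in range(i + 1, m))


def update_point(j, Y, D):
    """New position of point j, computed only from the previous layout Y."""
    m, dim = len(Y), len(Y[0])
    acc = [0.0] * dim
    for i in range(m):
        if i == j:
            continue
        dist = math.dist(Y[i], Y[j])
        # Assumed update rule: the auxiliary point lies on the ray from Y_i
        # through Y_j at distance D[i][j] from Y_i. Coincident points have no
        # direction, so the auxiliary point falls back to Y_i (arbitrary choice).
        scale = D[i][j] / dist if dist > 0 else 0.0
        for k in range(dim):
            acc[k] += Y[i][k] + scale * (Y[j][k] - Y[i][k])
    # The new position is the mean of the m - 1 auxiliary points.
    return [a / (m - 1) for a in acc]


def geometric_mds_iteration(Y, D, workers=4):
    """One iteration: all points are updated independently from the old Y."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda j: update_point(j, Y, D), range(len(Y))))


if __name__ == "__main__":
    rng = random.Random(0)
    X = [[rng.random() for _ in range(5)] for _ in range(30)]   # 5-D data
    D = [[math.dist(a, b) for b in X] for a in X]               # proximities
    Y = [[rng.random() for _ in range(2)] for _ in range(30)]   # 2-D start
    print("initial stress:", stress(Y, D))
    for _ in range(20):
        Y = geometric_mds_iteration(Y, D)
    print("final stress:", stress(Y, D))
```

Because `update_point` reads only the old layout and writes nothing shared, the per-point calls can be distributed over workers in any order. In CPython, threads running pure Python code do not execute in parallel, so this sketch shows the independence of the updates rather than a speedup; measuring the dependence on core count would call for processes or a compiled implementation.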
Acknowledgement
This research has received funding from the Research Council of Lithuania (LMTLT), agreement No S-MIP-20-19.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Dzemyda, G., Medvedev, V., Sabaliauskas, M. (2022). Multi-Core Implementation of Geometric Multidimensional Scaling for Large-Scale Data. In: Rocha, A., Adeli, H., Dzemyda, G., Moreira, F. (eds) Information Systems and Technologies. WorldCIST 2022. Lecture Notes in Networks and Systems, vol 469. Springer, Cham. https://doi.org/10.1007/978-3-031-04819-7_8
DOI: https://doi.org/10.1007/978-3-031-04819-7_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-04818-0
Online ISBN: 978-3-031-04819-7
eBook Packages: Intelligent Technologies and Robotics (R0)