An Online Pyramidal Embedding Technique for High Dimensional Big Data Visualization

  • Conference paper
Intelligent Systems (BRACIS 2020)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12320)

Abstract

Visualizing multidimensional Big Data is challenging: high dimensionality hinders or even precludes visual inspection. One way to tackle this issue is to use DR (Dimensionality Reduction) techniques, which produce low-dimensional representations of high-dimensional data. Popular DR algorithms (e.g., Principal Component Analysis, t-Distributed Stochastic Neighbor Embedding), albeit helpful, are computationally expensive. Most have \(\mathcal{O}(n^2)\) or \(\mathcal{O}(n^3)\) ATC (Asymptotic Time Complexity) and/or calculate pairwise distances over the entire data set, which can exceed available memory and render Big Data DR time-consuming or impracticable. These issues impede the use of DR in online learning applications, where recurrent, cumulative model updates are routine. The stochastic component of some approaches likewise obstructs meaningful inspection of how knowledge is spatially arranged. The recently introduced PCS (Polygonal Coordinate System), an incremental, geometry-based technique with linear ATC, is compelling; however, its restriction to 2-D embeddings entails significant information loss. We propose the Big Data ready, incremental PES (Pyramidal Embedding System), which builds on the strengths of PCS by additionally generating 3-D embeddings through its pyramid-like interspace, mitigating quality degradation. Visual inspections, as well as statistical analyses based on pairwise distances, validate the ability of PES to retain structural arrangements when embedding high- and low-dimensional data while remaining flexible in resource consumption.
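
As a rough illustration of what a pairwise-distance-based check of an embedding can look like, the sketch below correlates the distances between all point pairs before and after a reduction. It is not the PES or PCS algorithm, and it is not the statistical analysis used in the paper: the "embedding" is a hypothetical stand-in that simply keeps the first two or three coordinates, and the names `pairwise_distances`, `pearson`, and `distance_preservation` are assumptions introduced here. The sketch also shows why full pairwise computations scale poorly: the number of pairs is n(n-1)/2.

```python
import math
import random
from itertools import combinations


def pairwise_distances(points):
    """Euclidean distance for every unordered pair of points: n*(n-1)/2 values."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def distance_preservation(high_dim, low_dim):
    """Correlation between original and embedded pairwise distances (1.0 = perfect)."""
    return pearson(pairwise_distances(high_dim), pairwise_distances(low_dim))


# Synthetic 5-D data; a fixed seed keeps the run reproducible.
rng = random.Random(0)
data = [[rng.random() for _ in range(5)] for _ in range(50)]

# Stand-in "embeddings" (assumption): keep the first 3 or 2 coordinates.
# A real DR technique would replace these two lines.
embedding_3d = [row[:3] for row in data]
embedding_2d = [row[:2] for row in data]

score_3d = distance_preservation(data, embedding_3d)
score_2d = distance_preservation(data, embedding_2d)
print(f"pairs compared: {len(pairwise_distances(data))}")
print(f"3-D score: {score_3d:.3f}, 2-D score: {score_2d:.3f}")
```

A score closer to 1 suggests that relative distances, and hence structural arrangements, survive the reduction better; comparing a 2-D and a 3-D score on the same data is one simple way to probe the information-loss argument made above.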

Correspondence to Adriano Barreto.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Barreto, A., Moreira, I., Flexa, C., Cardoso, E., Sales, C. (2020). An Online Pyramidal Embedding Technique for High Dimensional Big Data Visualization. In: Cerri, R., Prati, R.C. (eds) Intelligent Systems. BRACIS 2020. Lecture Notes in Computer Science, vol 12320. Springer, Cham. https://doi.org/10.1007/978-3-030-61380-8_20

  • DOI: https://doi.org/10.1007/978-3-030-61380-8_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-61379-2

  • Online ISBN: 978-3-030-61380-8

  • eBook Packages: Computer Science, Computer Science (R0)
