A Novel Multivariate Mapping Method for Analyzing High-Dimensional Numerical Datasets

Part of the Lecture Notes in Computer Science book series (LNAI,volume 9728)

Abstract

In modern science, dealing with high-dimensional datasets is a common task due to the increasing availability of data. Multivariate data analysis poses challenges at both the theoretical and the empirical level. Several methods for dimensionality reduction, such as Principal Component Analysis, the Low Variance Filter, and the removal of highly correlated columns, have been proposed. However, the reduction achieved by existing methods is sometimes insufficient for datasets where, for practical reasons, a stronger reduction of the original dataset is required. In this paper, we propose a new method to transform a high-dimensional dataset into a one-dimensional one. We show that this transformation preserves the properties of the original dataset and is therefore suitable for many applications where a high degree of reduction is required.
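For context, the baseline techniques the abstract names can be sketched in a few lines. The snippet below is an illustrative example only, not the paper's proposed method: a plain SVD-based PCA projection and a simple low-variance filter, with an arbitrarily assumed variance threshold and synthetic data.

```python
import numpy as np

def pca_reduce(X, k=1):
    """Project X (n_samples x n_features) onto its top-k principal components."""
    Xc = X - X.mean(axis=0)                  # center each feature
    # SVD of the centered data; rows of Vt are the principal directions
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                     # n_samples x k projection

def low_variance_filter(X, threshold=0.01):
    """Drop features whose sample variance falls below the threshold."""
    return X[:, X.var(axis=0) > threshold]

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
X[:, 4] = 0.001 * rng.normal(size=100)       # near-constant feature

print(pca_reduce(X).shape)                   # (100, 1)
print(low_variance_filter(X).shape)          # (100, 4)
```

Note that PCA with k=1 already yields a one-dimensional representation, but only the variance along the first principal direction survives; the paper's contribution lies in a mapping that preserves more of the original dataset's properties than such linear projections.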

Keywords

  • Feature selection
  • Dimensionality reduction
  • Density estimation

Figures 1–3 (images not included in this preview).


Acknowledgments

The authors acknowledge the support of the Consejo Nacional de Ciencia y Tecnología (CONACyT) and the Centro de Investigación y de Estudios Avanzados (CINVESTAV).

Author information

Corresponding author

Correspondence to Edwin Aldana-Bobadilla.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Aldana-Bobadilla, E., Molina-Villegas, A. (2016). A Novel Multivariate Mapping Method for Analyzing High-Dimensional Numerical Datasets. In: Perner, P. (ed.) Advances in Data Mining. Applications and Theoretical Aspects. ICDM 2016. Lecture Notes in Computer Science, vol. 9728. Springer, Cham. https://doi.org/10.1007/978-3-319-41561-1_23

  • DOI: https://doi.org/10.1007/978-3-319-41561-1_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-41560-4

  • Online ISBN: 978-3-319-41561-1

  • eBook Packages: Computer Science, Computer Science (R0)