Abstract
With advances in technology, the data produced by sources such as the Internet, health care, financial companies, and social media is increasing continuously at a rapid rate. The growth of this data in volume, variety, and velocity has given rise to a new area of research, Big Data (BD). Continuous storage, processing, monitoring (if required), and real-time analysis are a few of the current challenges of BD. These challenges become more critical when the data is uncertain, inconsistent, or redundant. Dimensionality reduction (DR) is one efficient technique for reducing the overall processing time. In this paper, we therefore use principal component analysis (PCA) and singular value decomposition (SVD) to perform DR over BD, and we compare the performance of the two techniques in terms of accuracy and mean square error (MSE). The comparative results show that, for numerical reasons, SVD is preferred over PCA. However, using PCA for dimensionality reduction when training on image data gives good classification output.
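The comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the synthetic data, the function names (`pca_via_eig`, `pca_via_svd`, `reconstruction_mse`), and the choice of three retained components are all assumptions made for illustration. The sketch reduces the same data once through the eigendecomposition of the covariance matrix and once through the SVD of the centered data, then measures the mean square reconstruction error of each.

```python
import numpy as np

def pca_via_eig(X, k):
    """PCA through the eigendecomposition of the sample covariance matrix."""
    mean = X.mean(axis=0)
    Xc = X - mean                                # center each feature
    cov = Xc.T @ Xc / (X.shape[0] - 1)           # sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]        # indices of the k largest
    W = eigvecs[:, order]                        # projection matrix (features x k)
    return Xc @ W, W, mean

def pca_via_svd(X, k):
    """Same projection, computed from the SVD of the centered data."""
    mean = X.mean(axis=0)
    Xc = X - mean
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:k].T                                 # top-k right singular vectors
    return Xc @ W, W, mean

def reconstruction_mse(X, Z, W, mean):
    """Mean square error between X and its rank-k reconstruction."""
    X_hat = Z @ W.T + mean
    return float(np.mean((X - X_hat) ** 2))

# Hypothetical data: 200 samples, 10 features, rank-3 structure plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 10))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 10))

Z_eig, W_eig, mu = pca_via_eig(X, 3)
Z_svd, W_svd, _ = pca_via_svd(X, 3)
mse_eig = reconstruction_mse(X, Z_eig, W_eig, mu)
mse_svd = reconstruction_mse(X, Z_svd, W_svd, mu)
print(f"MSE (covariance eigendecomposition): {mse_eig:.6f}")
print(f"MSE (SVD of centered data):          {mse_svd:.6f}")
```

On well-conditioned data such as this, the two routes span the same subspace and give nearly identical errors. One commonly cited numerical argument for the SVD route is that it avoids forming the covariance matrix explicitly, a step that squares the condition number of the data matrix; whether this is the reason the paper has in mind is not stated in the abstract.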
Copyright information
© 2018 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Tanwar, S., Ramani, T., Tyagi, S. (2018). Dimensionality Reduction Using PCA and SVD in Big Data: A Comparative Case Study. In: Patel, Z., Gupta, S. (eds) Future Internet Technologies and Trends. ICFITT 2017. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 220. Springer, Cham. https://doi.org/10.1007/978-3-319-73712-6_12
DOI: https://doi.org/10.1007/978-3-319-73712-6_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-73711-9
Online ISBN: 978-3-319-73712-6
eBook Packages: Computer Science (R0)