Dimensionality Reduction Using PCA and SVD in Big Data: A Comparative Case Study

Tanwar, Sudeep; Ramani, Tilak; Tyagi, Sudhanshu

doi:10.1007/978-3-319-73712-6_12

Part of the book series: Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering (LNICST, volume 220)

Included in the following conference series:

International Conference on Future Internet Technologies and Trends

Abstract

With advances in technology, data produced by sources such as the Internet, health care, financial companies, and social media are increasing continuously at a rapid rate. The growth of these data in volume, variety, and velocity has given rise to a new area of research, Big Data (BD). Continuous storage, processing, monitoring (if required), and real-time analysis are among the current challenges of BD. These challenges become more critical when the data are uncertain, inconsistent, or redundant. Dimensionality reduction (DR) is one efficient technique for reducing the overall processing time. In this paper, we use principal component analysis (PCA) and singular value decomposition (SVD) to perform DR over BD, and we compare the performance of the two techniques in terms of accuracy and mean square error (MSE). The comparative results show that, for numerical reasons, SVD is preferred over PCA, whereas using PCA to train the data in dimension reduction for an image gives good classification output.
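
The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic data, the function names (`pca_via_eig`, `pca_via_svd`, `mse`), and the choice of k = 3 components are all assumptions made for illustration. The sketch reduces a matrix to k dimensions in two ways, once by eigendecomposition of the covariance matrix (the textbook PCA route) and once by SVD of the centered data, then measures the mean square error of each reconstruction. One commonly given numerical argument for the SVD route is that it avoids forming the covariance matrix explicitly, which squares the condition number of the data.

```python
import numpy as np


def pca_via_eig(X, k):
    """Reconstruct X from its top-k principal components,
    found by eigendecomposition of the sample covariance matrix."""
    mean = X.mean(axis=0)
    Xc = X - mean
    cov = Xc.T @ Xc / (X.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]       # indices of the k largest
    W = eigvecs[:, order]                       # (n_features, k) projection
    return (Xc @ W) @ W.T + mean


def pca_via_svd(X, k):
    """Reconstruct X from its top-k right singular vectors,
    found by SVD of the centered data (no covariance matrix formed)."""
    mean = X.mean(axis=0)
    Xc = X - mean
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:k].T                                # (n_features, k) projection
    return (Xc @ W) @ W.T + mean


def mse(X, X_hat):
    """Mean square reconstruction error over all entries."""
    return float(np.mean((X - X_hat) ** 2))


# Hypothetical data: 200 samples, 10 features, roughly rank 3 plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 10))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 10))

k = 3
mse_eig = mse(X, pca_via_eig(X, k))
mse_svd = mse(X, pca_via_svd(X, k))
print(f"MSE (eigendecomposition PCA): {mse_eig:.3e}")
print(f"MSE (SVD):                    {mse_svd:.3e}")
```

On well-conditioned data such as this, both routes span the same subspace and give essentially the same error; differences would be expected to appear mainly when the data matrix is ill-conditioned.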

Author information

Authors and Affiliations

Department of CE, Institute of Technology, Nirma University, Ahmedabad, India
Sudeep Tanwar & Tilak Ramani
Department of ECE, Thapar University, Patiala, Punjab, India
Sudhanshu Tyagi

Corresponding author

Correspondence to Sudeep Tanwar.

Editor information

Editors and Affiliations

Sardar Vallabhbhai National Institute of Technology, Surat, India
Zuber Patel
Sardar Vallabhbhai National Institute of Technology, Surat, India
Shilpi Gupta

About this paper

Cite this paper

Tanwar, S., Ramani, T., Tyagi, S. (2018). Dimensionality Reduction Using PCA and SVD in Big Data: A Comparative Case Study. In: Patel, Z., Gupta, S. (eds) Future Internet Technologies and Trends. ICFITT 2017. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 220. Springer, Cham. https://doi.org/10.1007/978-3-319-73712-6_12

DOI: https://doi.org/10.1007/978-3-319-73712-6_12
Published: 21 January 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-73711-9
Online ISBN: 978-3-319-73712-6
eBook Packages: Computer Science, Computer Science (R0)