Dimensionality Reduction Using PCA and SVD in Big Data: A Comparative Case Study

  • Sudeep Tanwar
  • Tilak Ramani
  • Sudhanshu Tyagi
Conference paper
Part of the Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering book series (LNICST, volume 220)


With advances in technology, data produced by sources such as the Internet, health care, financial companies, and social media are increasing continuously at a rapid rate. The growth of this data in volume, variety, and velocity has given rise to a new area of research, Big Data (BD). Continuous storage, processing, monitoring (if required), and real-time analysis are among the current challenges of BD. These challenges become more critical when the data are uncertain, inconsistent, and redundant. Dimensionality reduction (DR) is an efficient technique for reducing the overall processing time. In this paper, we therefore use principal component analysis (PCA) and singular value decomposition (SVD) to perform DR over BD. We compare the performance of both techniques in terms of accuracy and mean square error (MSE). The comparative results show that, for numerical reasons, SVD is preferred over PCA, whereas using PCA for dimension reduction when training on image data gives good classification output.
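The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic data, the function names (`pca_via_eig`, `pca_via_svd`, `reconstruction_mse`), and the choice of `k` are all assumptions made for illustration. The sketch reduces a data matrix to `k` dimensions in two ways, via the eigendecomposition of the covariance matrix (the classical PCA route) and via the SVD of the centered data, then measures the mean square error of the reconstruction.

```python
import numpy as np


def pca_via_eig(X, k):
    """Reduce X (n_samples x n_features) to k dimensions using the
    eigendecomposition of the sample covariance matrix."""
    mean = X.mean(axis=0)
    Xc = X - mean
    cov = Xc.T @ Xc / (X.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]       # indices of the k largest
    W = eigvecs[:, order]                       # n_features x k basis
    return Xc @ W, W, mean


def pca_via_svd(X, k):
    """Reduce X to k dimensions using the SVD of the centered data,
    without forming the covariance matrix explicitly."""
    mean = X.mean(axis=0)
    Xc = X - mean
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:k].T                                # top-k right singular vectors
    return Xc @ W, W, mean


def reconstruction_mse(X, Z, W, mean):
    """Mean square error between X and its rank-k reconstruction."""
    X_hat = Z @ W.T + mean
    return float(np.mean((X - X_hat) ** 2))


# Synthetic, correlated data (an assumption; not the paper's dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10)) @ rng.normal(size=(10, 10))
k = 3

Z_eig, W_eig, mu_eig = pca_via_eig(X, k)
Z_svd, W_svd, mu_svd = pca_via_svd(X, k)

mse_eig = reconstruction_mse(X, Z_eig, W_eig, mu_eig)
mse_svd = reconstruction_mse(X, Z_svd, W_svd, mu_svd)
print(f"MSE (eigendecomposition): {mse_eig:.6f}")
print(f"MSE (SVD):                {mse_svd:.6f}")
```

On well-conditioned data such as this, both routes span the same subspace and give essentially the same error. The SVD route avoids forming the covariance matrix, which squares the condition number of the data; this is one common reading of a "numerical reasons" preference for SVD.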


Dimensionality reduction; Principal component analysis; Singular value decomposition; Big data



Copyright information

© ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 2018

Authors and Affiliations

  1. Department of CE, Institute of Technology, Nirma University, Ahmedabad, India
  2. Department of ECE, Thapar University, Patiala, India
