Robust Principal Component Analysis by Reverse Iterative Linear Programming

  • Andrea Visentin
  • Steven Prestwich
  • S. Armagan Tarim
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9852)

Abstract

Principal Components Analysis (PCA) is a data analysis technique widely used in dimensionality reduction. It extracts a small number of orthonormal vectors that explain most of the variation in a dataset, which are called the Principal Components. Conventional PCA is sensitive to outliers because it is based on the \(L_2\)-norm, so to improve robustness several algorithms based on the \(L_1\)-norm have been introduced in the literature. We present a new algorithm for robust \(L_1\)-norm PCA that computes components iteratively in reverse, using a new heuristic based on Linear Programming. This solution is focused on finding the projection that minimizes the variance of the projected points. It has only one parameter to tune, making it simple to use. On common benchmarks it performs competitively compared to other methods. The data and software related to this paper are available at https://github.com/visentin-insight/L1-PCAhp.
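The outlier sensitivity of conventional \(L_2\)-norm PCA mentioned above can be illustrated with a minimal sketch. This is not the algorithm proposed in the paper; it is plain SVD-based PCA in NumPy applied to synthetic data, and the helper names (`principal_components`, `angle_deg`) and the data itself are assumptions made purely for illustration.

```python
import numpy as np

def principal_components(X, k):
    """Top-k principal components (as rows) of X, via SVD of the centered data."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt[:k]

def angle_deg(u, v):
    """Angle in degrees between two directions, ignoring their sign."""
    c = abs(float(np.dot(u, v))) / (np.linalg.norm(u) * np.linalg.norm(v))
    return float(np.degrees(np.arccos(min(1.0, c))))

# Synthetic data (assumption): 200 points spread along the x-axis with small y-noise.
rng = np.random.default_rng(0)
t = rng.normal(size=200)
clean = np.column_stack([t, 0.1 * rng.normal(size=200)])

# The same data plus a single gross outlier far away along the y-axis.
dirty = np.vstack([clean, [[0.0, 50.0]]])

x_axis = np.array([1.0, 0.0])
pc_clean = principal_components(clean, 1)[0]
pc_dirty = principal_components(dirty, 1)[0]

print("clean data, angle to x-axis:", angle_deg(pc_clean, x_axis))
print("with outlier, angle to x-axis:", angle_deg(pc_dirty, x_axis))
```

On the clean data the first component stays close to the x-axis. With the single outlier added, the squared-error criterion is dominated by that one point and the first component swings towards the y-axis, which is the behaviour that \(L_1\)-norm formulations aim to dampen.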

Keywords

Principal components analysis; Linear programming; L1-norm; Robust

Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Andrea Visentin (1)
  • Steven Prestwich (1)
  • S. Armagan Tarim (2)
  1. Insight Centre for Data Analytics, Department of Computer Science, University College Cork, Cork, Ireland
  2. Department of Management, Cankaya University, Ankara, Turkey
