Using the Multivariate Normal to Improve Random Projections

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10585)

Abstract

Random projection is a dimension reduction technique that can be used to estimate Euclidean distances, inner products, angles [9], or even \(l_p\) distances (for even \(p\)) [10] between pairs of high-dimensional vectors. We extend the work of Li [9] and our prior work [7] to show how marginal information, principal components, and control variates can be used with the multivariate normal distribution to improve the accuracy of inner product estimates between vectors. We call our method COntrol Variates For Estimation via First Eigenvectors (COVFEFE). We demonstrate the results of COVFEFE on the Arcene and MNIST datasets.
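
The setting described in the abstract can be illustrated with a minimal sketch. It is not the COVFEFE method itself: it only shows a plain random projection estimate of an inner product and a generic control variate correction that uses marginal information (the squared norms of the two vectors). The assumptions are ours: the projection matrix has i.i.d. \(N(0, 1)\) entries, the control variate is \(x^2 + y^2\) with known mean \(\|u\|^2 + \|v\|^2\), the coefficient is estimated from the projected samples (which introduces a small bias), and all function names are hypothetical.

```python
import numpy as np


def project(u, v, k, rng):
    """Project u and v onto the same k random directions (entries i.i.d. N(0, 1))."""
    R = rng.standard_normal((u.shape[0], k))
    return u @ R, v @ R


def plain_estimate(x, y):
    """Unbiased inner product estimate: the mean of the elementwise products."""
    return float(np.mean(x * y))


def control_variate_estimate(x, y, norm_u_sq, norm_v_sq):
    """Correct the plain estimate with the control variate z = x^2 + y^2.

    E[z] = ||u||^2 + ||v||^2 is known from the marginal information.
    The coefficient c is estimated from the samples (an assumption of this sketch).
    """
    prod = x * y
    z = x**2 + y**2
    mu = norm_u_sq + norm_v_sq
    c = np.cov(prod, z)[0, 1] / np.var(z, ddof=1)
    return float(np.mean(prod) - c * (np.mean(z) - mu))


def compare(p=1000, k=50, trials=200, seed=0):
    """Return the true inner product and the mean squared error of both estimators."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(p)
    v = u + 0.5 * rng.standard_normal(p)  # v is strongly correlated with u
    u /= np.linalg.norm(u)
    v /= np.linalg.norm(v)
    truth = float(u @ v)
    err_plain, err_cv = [], []
    for _ in range(trials):
        x, y = project(u, v, k, rng)
        err_plain.append((plain_estimate(x, y) - truth) ** 2)
        err_cv.append((control_variate_estimate(x, y, u @ u, v @ v) - truth) ** 2)
    return truth, float(np.mean(err_plain)), float(np.mean(err_cv))
```

Calling `compare()` returns the true inner product together with the mean squared error of each estimator, so the effect of the control variate can be checked for a given choice of `p`, `k`, and `seed`.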

References

  1. Anaraki, F.P., Hughes, S.: Memory and computation efficient PCA via very sparse random projections. In: Proceedings of the 31st International Conference on Machine Learning (2014)
  2. Fowler, J.: Compressive-projection principal component analysis. IEEE Trans. Image Process. 18(10), 2230–2242 (2009)
  3. Guyon, I., Gunn, S., Ben-Hur, A., Dror, G.: Result analysis of the NIPS 2003 feature selection challenge. In: Saul, L.K., Weiss, Y., Bottou, L. (eds.) Advances in Neural Information Processing Systems, vol. 17, pp. 545–552. MIT Press (2005). http://papers.nips.cc/paper/2728-result-analysis-of-the-nips-2003-feature-selection-challenge.pdf
  4. Honda, K., Nonoguchi, R., Notsu, A., Ichihashi, H.: PCA-guided k-means clustering with incomplete data. In: 2011 IEEE International Conference on Fuzzy Systems (FUZZ), pp. 1710–1714. IEEE (2011)
  5. Johnson, W.B., Lindenstrauss, J.: Extensions of Lipschitz mappings into a Hilbert space. Contemp. Math. 26, 189–206 (1984)
  6. Kang, K., Hooker, G.: Improving the recovery of principal components with semi-deterministic random projections. In: 2016 Annual Conference on Information Science and Systems, CISS 2016, Princeton, NJ, USA, 16–18 March 2016, pp. 596–601 (2016). https://doi.org/10.1109/CISS.2016.7460570
  7. Kang, K., Hooker, G.: Random projections with control variates. In: Proceedings of the 6th International Conference on Pattern Recognition Applications and Methods, vol. 1, ICPRAM, pp. 138–147. INSTICC, SciTePress (2017)
  8. LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. In: Proceedings of the IEEE, pp. 2278–2324 (1998)
  9. Li, P., Hastie, T.J., Church, K.W.: Improving random projections using marginal information. In: Lugosi, G., Simon, H.U. (eds.) COLT 2006. LNCS (LNAI), vol. 4005, pp. 635–649. Springer, Heidelberg (2006). doi:10.1007/11776420_46
  10. Li, P., Mahoney, M.W., She, Y.: Approximating higher-order distances using random projections. CoRR abs/1203.3492 (2012). http://arxiv.org/abs/1203.3492
  11. Loia, V., Tomasiello, S., Vaccaro, A.: Using fuzzy transform in multi-agent based monitoring of smart grids. Inf. Sci. 388, 209–224 (2017)
  12. Muirhead, R.J.: Aspects of Multivariate Statistical Theory. Wiley-Interscience, Hoboken (2005)
  13. Petersen, K.B., Pedersen, M.S.: The matrix cookbook. http://www2.imm.dtu.dk/pubdb/p.php?3274, version 20121115
  14. Ross, S.M.: Simulation, 4th edn. Academic Press Inc., Orlando (2006)
  15. Xu, Q., Ding, C., Liu, J., Luo, B.: PCA-guided search for k-means. Pattern Recogn. Lett. 54, 50–55 (2015)

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Singapore University of Technology and Design, Singapore, Singapore
