Beta Random Projection

  • Yu-En Lu
  • Pietro Liò
  • Steven Hand
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5151)

Abstract

Random projection (RP) is a common technique for dimensionality reduction under the L2 norm, for which many significant space embedding results have been demonstrated. In particular, random projection techniques can yield sharp results for R^d under the L2 norm in time linear in the product of the number of data points and the number of dimensions. Inspired by the use of symmetric probability distributions in previous work, we propose an RP algorithm based on hyper-spherical symmetry and give its probabilistic analysis based on the Beta and Gaussian distributions.
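The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of a random projection with hyper-spherical symmetry, not the authors' method. It assumes each projection direction is drawn uniformly from the unit sphere in R^d by normalising a standard Gaussian vector, and that the output is rescaled by sqrt(d/k) so that squared L2 lengths are preserved in expectation. The function names (`random_unit_vector`, `make_projection`, `project`, `l2_distance`) and the scaling choice are illustrative assumptions. Under this construction, the squared projection of a unit vector onto one random direction follows a Beta(1/2, (d-1)/2) distribution, which is one way a Beta distribution can enter the analysis of such schemes.

```python
import math
import random


def random_unit_vector(d, rng):
    """Sample a direction uniformly on the unit sphere in R^d.

    Assumption: uniformity comes from normalising a standard Gaussian
    vector, whose distribution is rotationally symmetric.
    """
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 0.0:  # guard against the (practically impossible) zero vector
            return [x / norm for x in v]


def make_projection(d, k, seed=0):
    """Build k random unit directions in R^d (hypothetical helper)."""
    rng = random.Random(seed)
    return [random_unit_vector(d, rng) for _ in range(k)]


def project(point, directions):
    """Project a d-dimensional point onto the k directions.

    For a uniform unit direction u, E[(u . x)^2] = |x|^2 / d, so scaling
    by sqrt(d / k) keeps squared L2 norms unchanged in expectation.
    """
    d = len(point)
    k = len(directions)
    scale = math.sqrt(d / k)
    return [scale * sum(p * u for p, u in zip(point, row)) for row in directions]


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

Building the directions costs O(dk) and projecting n points costs O(ndk), which for fixed k is linear in the product of the number of points and the number of dimensions. The projected distance `l2_distance(project(a, dirs), project(b, dirs))` approximates `l2_distance(a, b)`, and the approximation tightens as k grows.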

Keywords

Randomised algorithm, dimensionality reduction, multi-dimensional indexing

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Yu-En Lu (1)
  • Pietro Liò (1)
  • Steven Hand (1)
  1. University of Cambridge Computer Laboratory, Cambridge, UK
