Subspace Tracking for Latent Semantic Analysis

  • Radim Řehůřek
Conference paper

DOI: 10.1007/978-3-642-20161-5_29

Part of the Lecture Notes in Computer Science book series (LNCS, volume 6611)
Cite this paper as:
Řehůřek R. (2011) Subspace Tracking for Latent Semantic Analysis. In: Clough P. et al. (eds) Advances in Information Retrieval. ECIR 2011. Lecture Notes in Computer Science, vol 6611. Springer, Berlin, Heidelberg

Abstract

Modern applications of Latent Semantic Analysis (LSA) must deal with enormous (often practically infinite) data collections, calling for a single-pass matrix decomposition algorithm that operates in constant memory w.r.t. the collection size. This paper introduces a streamed distributed algorithm for incremental SVD updates. Apart from the theoretical derivation, we present experiments measuring numerical accuracy and runtime performance of the algorithm over several data collections, one of which is the whole of the English Wikipedia.
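The single-pass, constant-memory idea can be illustrated with a minimal NumPy sketch. This is not the algorithm derived in the paper, only a generic chunk-and-merge scheme under stated assumptions: documents arrive as column blocks of a term-document matrix, only the left singular vectors and singular values are tracked, and the function names `merge_svd` and `streamed_svd` are hypothetical. It relies on the identity A1·A1ᵀ + A2·A2ᵀ = [U1·S1 | U2·S2]·[U1·S1 | U2·S2]ᵀ, so two partial decompositions can be merged without revisiting the original columns.

```python
import numpy as np


def merge_svd(U1, s1, U2, s2, k):
    """Merge two truncated left factorizations into one of rank at most k.

    Illustrative helper (hypothetical name): the scaled bases U1*s1 and U2*s2
    together span the same column space as the documents they summarize.
    """
    stacked = np.hstack([U1 * s1, U2 * s2])  # shape (m, k1 + k2)
    U, s, _ = np.linalg.svd(stacked, full_matrices=False)
    return U[:, :k], s[:k]


def streamed_svd(chunks, k):
    """Single pass over column chunks, keeping only an (m, k) basis and k values.

    Memory use depends on the number of terms m and the rank k,
    not on how many documents (columns) have been seen so far.
    """
    U, s = None, None
    for chunk in chunks:
        # Decompose the current chunk on its own, then truncate to rank k.
        Uc, sc, _ = np.linalg.svd(chunk, full_matrices=False)
        Uc, sc = Uc[:, :k], sc[:k]
        if U is None:
            U, s = Uc, sc
        else:
            U, s = merge_svd(U, s, Uc, sc, k)
    return U, s
```

In this sketch the right singular vectors are deliberately not stored, since their size grows with the collection; a document can instead be mapped into the latent space on demand by projecting it onto the tracked basis. Truncating to rank k at every step makes the result approximate unless the data has rank at most k.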

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Radim Řehůřek
    • 1
  1. NLP lab, Masaryk University in Brno, Czech Republic
