Parallel SVD Computing in the Latent Semantic Indexing Applications for Data Retrieval

  • Chapter
Parallel Computing

Abstract

One of the main sources of information in our society is the written word. Since the time of the Sumerians, the written document has been the main tool to inform, to teach, to entertain, and to archive knowledge. Today, some 6000 years after the Sumerians, nothing has changed with respect to the importance of written text. To become widely available, knowledge must be manipulated in an easy and reliable way, and some type of text encoding on a computer is needed.

Latent Semantic Indexing (LSI) is a concept-based automatic indexing method for overcoming two fundamental problems of traditional lexical-matching retrieval schemes: synonymy and polysemy. It models the term-document relationship using a reduced-dimension representation of the term-document matrix, computed by its partial Singular Value Decomposition (SVD). We describe the main principles of LSI in the form of a mathematical model and discuss its implementation on a parallel computer with distributed memory.
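As a rough illustration of the idea summarized above, the following serial sketch builds a rank-k representation of a tiny term-document matrix and ranks documents against a one-word query. Everything in it is an assumption made for illustration: the five-term vocabulary, the four documents, the helper names `lsi_fit` and `lsi_query`, the choice k = 2, and the query-folding convention (projecting the query as S_k^{-1} U_k^T q and comparing it with the rows of V_k, which is one common convention among several). It uses a dense NumPy SVD on a single machine and does not represent the parallel distributed-memory algorithm the chapter discusses.

```python
import numpy as np

# Hypothetical toy vocabulary and term-document matrix (rows: terms, columns: documents).
terms = ["car", "automobile", "engine", "flower", "garden"]
A = np.array([
    [1, 0, 0, 0],   # car        appears in doc 0
    [0, 1, 0, 0],   # automobile appears in doc 1
    [1, 1, 0, 0],   # engine     appears in docs 0 and 1
    [0, 0, 1, 1],   # flower     appears in docs 2 and 3
    [0, 0, 1, 0],   # garden     appears in doc 2
], dtype=float)

def lsi_fit(A, k):
    """Return the k dominant singular triplets (U_k, s_k, V_k) of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :].T

def lsi_query(q, U_k, s_k, V_k):
    """Cosine similarity between a folded-in query and each document row of V_k."""
    q_hat = (U_k.T @ q) / s_k                      # project query into the k-dim space
    norms = np.linalg.norm(V_k, axis=1) * np.linalg.norm(q_hat)
    return (V_k @ q_hat) / norms

U_k, s_k, V_k = lsi_fit(A, k=2)

# Query containing only the term "car".
q = np.zeros(len(terms))
q[terms.index("car")] = 1.0

sims = lsi_query(q, U_k, s_k, V_k)
for j, sim in enumerate(sims):
    print(f"doc {j}: similarity {sim:.3f}")
```

In this toy setting, document 1 contains "automobile" but not "car", yet it scores close to 1 alongside document 0, because both terms co-occur with "engine" and collapse onto the same latent dimension; the two gardening documents score close to 0. A purely lexical match on "car" would have retrieved only document 0, which is the synonymy problem the abstract refers to.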

Author information

Corresponding author

Correspondence to Gabriel Okša.

Copyright information

© 2009 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Okša, G., Vajteršic, M. (2009). Parallel SVD Computing in the Latent Semantic Indexing Applications for Data Retrieval. In: Trobec, R., Vajteršic, M., Zinterhof, P. (eds) Parallel Computing. Springer, London. https://doi.org/10.1007/978-1-84882-409-6_12

  • DOI: https://doi.org/10.1007/978-1-84882-409-6_12

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-84882-408-9

  • Online ISBN: 978-1-84882-409-6

  • eBook Packages: Computer Science, Computer Science (R0)
