Investigating Unstructured Texts with Latent Semantic Analysis

  • Conference paper

Abstract

Latent semantic analysis (LSA) is an algorithm for approximating the meaning of texts, thereby exposing semantic structure to computation. LSA combines the classical vector-space model, well known in computational linguistics, with a singular value decomposition (SVD), a two-mode factor analysis. In this way, bag-of-words representations of texts can be mapped into a modified vector space that is assumed to reflect semantic structure. In this contribution the authors describe the lsa package for the statistical language and environment R and illustrate its proper use through examples from the areas of automated essay scoring and knowledge representation.
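
The combination of a vector-space model with a truncated SVD described in the abstract can be sketched in a few lines. The following is a minimal illustration in Python with NumPy, not the lsa package for R that the paper describes. The toy corpus, the choice of k = 2, and the helper name cosine are assumptions made purely for this example.

```python
import numpy as np

# Toy corpus (hypothetical, for illustration only).
docs = [
    "human machine interface computer",
    "user interface system computer",
    "graph tree minors",
    "graph minors survey tree",
]

# Bag-of-words term-document matrix: rows are terms, columns are documents.
vocab = sorted({w for d in docs for w in d.split()})
M = np.array([[d.split().count(t) for d in docs] for t in vocab], dtype=float)

# Singular value decomposition: M = T @ diag(s) @ Dt.
T, s, Dt = np.linalg.svd(M, full_matrices=False)

# Keep only the k largest singular values (k is a free parameter, assumed here).
k = 2
M_k = T[:, :k] @ np.diag(s[:k]) @ Dt[:k, :]  # rank-k approximation of M

# Document vectors in the reduced k-dimensional space, one row per document.
doc_vecs = (np.diag(s[:k]) @ Dt[:k, :]).T


def cosine(a, b):
    """Cosine similarity between two vectors (hypothetical helper)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Compare documents in the raw term space and in the reduced space.
raw_same = cosine(M[:, 0], M[:, 1])
sim_same = cosine(doc_vecs[0], doc_vecs[1])
sim_diff = cosine(doc_vecs[0], doc_vecs[2])
print(raw_same, sim_same, sim_diff)
```

In this toy setting the first two documents share only part of their vocabulary, yet they end up closer together in the reduced space than in the raw term space, while documents from the two unrelated topics stay apart. Real applications would typically add term weighting, stop-word removal, and a more careful choice of k.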

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wild, F., Stahl, C. (2007). Investigating Unstructured Texts with Latent Semantic Analysis. In: Decker, R., Lenz, H.J. (eds) Advances in Data Analysis. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70981-7_43
