Abstract
Latent semantic analysis (LSA) is an algorithm applied to approximate the meaning of texts, thereby exposing semantic structure to computation. LSA combines the classical vector-space model — well known in computational linguistics — with a singular value decomposition (SVD), a two-mode factor analysis. Thus, bag-of-words representations of texts can be mapped into a modified vector space that is assumed to reflect semantic structure. In this contribution the authors describe the lsa package for the statistical language and environment R and illustrate its proper use through examples from the areas of automated essay scoring and knowledge representation.
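The pipeline the abstract outlines (bag-of-words vectors, then a truncated SVD) can be sketched in a few lines. What follows is a minimal illustration in Python with NumPy, not the API of the lsa package for R. The toy corpus, the helper names (`term_document_matrix`, `truncated_svd`, `cosine`) and the choice of k = 2 dimensions are all assumptions made for this example.

```python
import numpy as np

# Hypothetical toy corpus, for illustration only.
docs = [
    "dog barks loud",
    "dog bites man",
    "cat meows loud",
    "stock market falls",
    "market prices rise",
]


def term_document_matrix(texts):
    """Bag-of-words counts: one row per term, one column per document."""
    vocab = sorted({word for text in texts for word in text.split()})
    index = {word: i for i, word in enumerate(vocab)}
    matrix = np.zeros((len(vocab), len(texts)))
    for j, text in enumerate(texts):
        for word in text.split():
            matrix[index[word], j] += 1
    return vocab, matrix


def truncated_svd(matrix, k):
    """Keep only the k largest singular values and their vectors."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k], s[:k], vt[:k, :]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


vocab, m = term_document_matrix(docs)

# Rank-k approximation of the term-document matrix (k = 2 is an assumption).
u_k, s_k, vt_k = truncated_svd(m, k=2)
m_k = u_k @ np.diag(s_k) @ vt_k

# Document coordinates in the reduced space: one column per document.
docs_k = np.diag(s_k) @ vt_k

# Documents 1 and 2 share no words, so they are orthogonal in the raw space.
raw_similarity = cosine(m[:, 1], m[:, 2])
latent_similarity = cosine(docs_k[:, 1], docs_k[:, 2])

print("raw cosine:   ", raw_similarity)
print("latent cosine:", latent_similarity)
```

In this toy corpus, "dog bites man" and "cat meows loud" have no term in common, but both overlap with "dog barks loud". The reduced space therefore places them in nearly the same direction, which is the kind of indirect association the modified vector space is assumed to capture. Real applications would typically add preprocessing, such as stemming, stop-word removal and term weighting, which this sketch omits.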
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wild, F., Stahl, C. (2007). Investigating Unstructured Texts with Latent Semantic Analysis. In: Decker, R., Lenz, H.J. (eds) Advances in Data Analysis. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70981-7_43
DOI: https://doi.org/10.1007/978-3-540-70981-7_43
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-70980-0
Online ISBN: 978-3-540-70981-7
eBook Packages: Mathematics and Statistics (R0)