Spotting Topics with the Singular Value Decomposition
The singular value decomposition, or SVD, has been studied in the past as a tool for detecting and understanding patterns in a collection of documents. We show how the matrices produced by the SVD calculation can be interpreted, allowing us to spot patterns of characters that indicate particular topics in a corpus. A test collection, consisting of two days of AP newswire traffic, is used as a running example.
Keywords: Singular Vector, Term Vector, Document Vector, Test Corpus, Negative Entry
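As a rough illustration of the kind of interpretation the abstract describes, the sketch below builds a small term-by-document matrix and inspects the left singular vectors for the terms with the largest entries. It is not the authors' method or software: the three toy documents are invented, the choice of character trigrams as "terms" is an assumption, and the helper names (`char_ngrams`, `term_document_matrix`, `top_terms`) are hypothetical. Only NumPy's `numpy.linalg.svd` is used for the decomposition.

```python
import numpy as np


def char_ngrams(text, n=3):
    """Split text into overlapping character n-grams (an assumed choice of 'term')."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def term_document_matrix(docs, n=3):
    """Return (vocab, A) where A[i, j] counts n-gram i in document j."""
    vocab = sorted({g for d in docs for g in char_ngrams(d, n)})
    index = {g: i for i, g in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for g in char_ngrams(d, n):
            A[index[g], j] += 1.0
    return vocab, A


def top_terms(U, vocab, component, k=5):
    """List the k terms with the largest-magnitude entries in one left singular vector."""
    col = U[:, component]
    order = np.argsort(-np.abs(col))[:k]
    return [(vocab[i], float(col[i])) for i in order]


# Invented toy documents standing in for a real corpus.
docs = [
    "the senate passed the budget bill",
    "the budget bill goes to the senate",
    "the team won the baseball game",
]

vocab, A = term_document_matrix(docs)

# Thin SVD: A = U @ diag(s) @ Vt, with singular values in descending order.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Columns of U are term vectors; rows of Vt are document vectors.
for component in range(len(s)):
    print(component, round(float(s[component]), 3), top_terms(U, vocab, component))
```

The signs of entries in a singular vector are only determined up to a flip of the whole vector, so negative entries are best read relative to the positive ones in the same vector, not as absolute quantities.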