Abstract
This paper will develop and demonstrate Latent Semantic Differentiation, a novel method for analyzing scientific indexes. Using two distinct datasets composed of scientific abstracts, it will demonstrate the procedure’s ability to identify the dominant themes, cluster the articles accordingly, visualize the results, and provide a qualitative description of each cluster. Together, the analyses will highlight the utility of the procedure for scientific document indexing, structuring university departments, facilitating grant administration, and augmenting ongoing research on scientific citation. Because the procedure is extensible to any textual domain, there are numerous avenues for continued research both within the sciences and beyond.
Cite this article
Blatt, E.M. Differentiating, describing, and visualizing scientific space: A novel approach to the analysis of published scientific abstracts. Scientometrics 80, 385–406 (2009). https://doi.org/10.1007/s11192-008-2070-3
DOI: https://doi.org/10.1007/s11192-008-2070-3