Using ‘core documents’ for the representation of clusters and topics
The notion of ‘core documents’, first introduced in the context of co-citation analysis and later re-introduced for bibliographic coupling, refers to the representation of the core of a publication set according to given criteria. In the present study, this notion is extended to the combination of citation-based and textual links. It is shown that core documents defined in this way can be used to represent and describe document clusters and topics at different levels of aggregation. The methodology is illustrated using the example of two ISI Subject Categories selected from the applied and social sciences.
Keywords: Core documents, Cluster analysis, Hybrid clustering, Bibliographic coupling, Text mining
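As a rough illustration of the idea summarised in the abstract, the following Python sketch flags documents that have at least a minimum number of sufficiently strong hybrid links to other documents in the set. The similarity measure (Salton's cosine on reference sets and on term sets), the linear weighting of the two components, the threshold values, the toy data and all function and parameter names are assumptions made for illustration; they are not taken from the study and may differ from the procedure actually used there.

```python
from math import sqrt


def cosine(a: set, b: set) -> float:
    """Salton's cosine for two sets: shared items over the geometric mean of the sizes."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def core_documents(refs, terms, weight=0.5, min_strength=0.25, min_links=2):
    """Return the ids of documents with at least `min_links` hybrid links
    whose strength is at least `min_strength`.

    refs  : dict mapping document id -> set of cited references
    terms : dict mapping document id -> set of index terms
    weight: share of the citation-based component (hypothetical linear mix)
    """
    ids = list(refs)
    strong_links = {doc: 0 for doc in ids}
    for i, d1 in enumerate(ids):
        for d2 in ids[i + 1:]:
            # Hybrid similarity: weighted mix of bibliographic coupling
            # and textual similarity (an illustrative assumption).
            sim = (weight * cosine(refs[d1], refs[d2])
                   + (1 - weight) * cosine(terms[d1], terms[d2]))
            if sim >= min_strength:
                strong_links[d1] += 1
                strong_links[d2] += 1
    return sorted(doc for doc, n in strong_links.items() if n >= min_links)


# Toy data: A, B and C share references and terms; D is isolated.
refs = {
    "A": {"r1", "r2", "r3"},
    "B": {"r1", "r2", "r4"},
    "C": {"r2", "r3", "r4"},
    "D": {"r9"},
}
terms = {
    "A": {"cluster", "citation", "core"},
    "B": {"cluster", "citation", "text"},
    "C": {"cluster", "core", "text"},
    "D": {"polymer"},
}

print(core_documents(refs, terms))  # ['A', 'B', 'C']
```

Raising `min_links` or `min_strength` shrinks the set of core documents, which is one way to obtain representations at different levels of aggregation in this toy setting.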
The methodology was developed in part in the context of the ERACEP project within the Coordination and Support Actions (CSAs) of the ERC work programme. The authors wish to acknowledge this support.