Do second-order similarities provide added-value in a hybrid approach?

Thijs, Bart; Schiebel, Edgar; Glänzel, Wolfgang

doi:10.1007/s11192-012-0896-1


Published: 21 November 2012

Scientometrics, volume 96, pages 667–677 (2013)

Bart Thijs¹, Edgar Schiebel² & Wolfgang Glänzel¹,³


Abstract

Recent studies on first- and second-order similarities have shown that the latter outperform the former as input for document clustering or partitioning applications. First-order similarities based on bibliographic coupling or on lexical approaches come with specific methodological issues, such as sparse matrices and sensitivity to spelling variants or context differences. Second-order similarities were proposed to tackle these problems and to take the lexical context into account. A hybrid combination of both types of similarities has also proved to be an important improvement, integrating the strengths of the two approaches and diminishing their weaknesses. In this paper we extend the notion of second-order similarity by applying it in the context of the hybrid approach. We conclude that there is no added value for the clearly defined clusters, but that the second-order similarity can provide an additional viewpoint for the more general clusters.
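The distinction drawn in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the formulas, so the sketch assumes cosine similarity for the first-order measures, takes second-order similarity to be the cosine between rows of a first-order similarity matrix, and assumes a simple weighted linear combination for the hybrid step. The toy matrices, the function names, and the weight of 0.5 are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def first_order(rows):
    """Document-document similarity from document feature vectors."""
    n = len(rows)
    return [[cosine(rows[i], rows[j]) for j in range(n)] for i in range(n)]


def second_order(sim):
    """Compare documents by their similarity profiles (rows of a similarity matrix)."""
    return first_order(sim)


def combine(sim_a, sim_b, weight=0.5):
    """Assumed hybrid: weighted linear combination of two similarity matrices."""
    return [
        [weight * a + (1.0 - weight) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(sim_a, sim_b)
    ]


# Hypothetical toy data: 3 documents x 4 cited references (bibliographic coupling)
refs = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
]
# Hypothetical toy data: 3 documents x 3 terms (lexical component)
terms = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
]

coupling = first_order(refs)            # documents 0 and 2 share no references
coupling_second = second_order(coupling)

lexical = first_order(terms)
hybrid = combine(coupling, lexical, weight=0.5)
hybrid_second = second_order(hybrid)    # second-order applied to the hybrid matrix

print("first-order  (0, 2):", round(coupling[0][2], 3))
print("second-order (0, 2):", round(coupling_second[0][2], 3))
print("hybrid       (0, 2):", round(hybrid[0][2], 3))
print("hybrid 2nd   (0, 2):", round(hybrid_second[0][2], 3))
```

In this toy example documents 0 and 2 cite no common reference, so their first-order coupling is zero, yet both are coupled to document 1, which gives them a non-zero second-order similarity. This is the sparsity problem the abstract says second-order similarities were proposed to address. Whether to apply the second-order step before or after combining the two components is a design choice that the abstract leaves open; the sketch applies it after.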



Acknowledgments

This is a version of a paper presented by the first author at the 17th International Conference on Science and Technology Indicators (Montreal, 5–8 September 2012). The full proceedings are published as: E. Archambault, Y. Gingras, V. Lariviere (Eds.), Proceedings of STI 2012 Montreal, Science-Metrix and OST, Montréal, Quebec, Canada.

The methodology has partially been developed in the context of the ERACEP project within the Coordination and Support Actions (CSAs) of the ERC work programme. The authors wish to acknowledge this support.

Author information

Authors and Affiliations

Centre for R&D Monitoring (ECOOM) and Department of MSI, KU Leuven, Leuven, Belgium
Bart Thijs & Wolfgang Glänzel
AIT Austrian Institute of Technology GmbH, Vienna, Austria
Edgar Schiebel
Department of Science Policy and Scientometrics, LHAS, Budapest, Hungary
Wolfgang Glänzel


Corresponding author

Correspondence to Wolfgang Glänzel.


About this article

Cite this article

Thijs, B., Schiebel, E. & Glänzel, W. Do second-order similarities provide added-value in a hybrid approach?. Scientometrics 96, 667–677 (2013). https://doi.org/10.1007/s11192-012-0896-1


Received: 19 October 2012
Published: 21 November 2012
Issue Date: September 2013
DOI: https://doi.org/10.1007/s11192-012-0896-1
