The contribution of the lexical component in hybrid clustering, the case of four decades of “Scientometrics”

Scientometrics

Abstract

The introduction of textual analysis and the use of lexical similarities have already proved an important asset in science mapping. Earlier research showed the added value of hybrid document networks over purely link-based ones through the reduction of extreme sparseness. However, it was only after the application of Natural Language Processing and phrase extraction that networks based purely on lexical similarities could be used as input for topic detection in quantitative science studies. This study investigates the contribution of the lexical component in hybrid clustering on a set of articles published in the journal Scientometrics over the four decades since its foundation. Shifting the weight of the lexical component changes the structure of the underlying hybrid network, and these changes can be detected through clustering techniques. We show that these changes do not move documents randomly, but instead identify small groups of papers that either lie at the borderline between different topics or combine them. In addition, the analysis substantiates that the lexical component adopts the structure of the network rather than amplifying hidden structures of the link-based network.
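As an illustration of what shifting the weight of the lexical component can mean in practice, the following minimal Python sketch blends a lexical and a link-based similarity matrix and reports the nearest neighbour of one document at several weights. It rests on assumptions that the abstract does not confirm: the linear (convex) combination, the function names `hybrid_similarity` and `nearest_neighbour`, and the toy similarity values are all hypothetical, and the study itself may combine the two components differently and applies clustering rather than a nearest-neighbour lookup.

```python
from typing import List

Matrix = List[List[float]]


def hybrid_similarity(lexical: Matrix, link: Matrix, weight: float) -> Matrix:
    """Blend two document-similarity matrices.

    `weight` is the share given to the lexical component. The linear
    combination is an assumption made for illustration only.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return [
        [weight * lex + (1.0 - weight) * lnk for lex, lnk in zip(lex_row, link_row)]
        for lex_row, link_row in zip(lexical, link)
    ]


def nearest_neighbour(similarity: Matrix, doc: int) -> int:
    """Return the index of the document most similar to `doc`, excluding itself."""
    candidates = [j for j in range(len(similarity)) if j != doc]
    return max(candidates, key=lambda j: similarity[doc][j])


# Toy data, invented for illustration: document 2 is tied to document 0
# through citation links but shares most of its vocabulary with document 1.
link = [
    [1.0, 0.0, 0.6],
    [0.0, 1.0, 0.1],
    [0.6, 0.1, 1.0],
]
lexical = [
    [1.0, 0.1, 0.2],
    [0.1, 1.0, 0.9],
    [0.2, 0.9, 1.0],
]

for weight in (0.0, 0.5, 1.0):
    hybrid = hybrid_similarity(lexical, link, weight)
    print(f"lexical weight {weight}: document 2 is closest to document "
          f"{nearest_neighbour(hybrid, 2)}")
```

In this toy setting, document 2 sits closest to document 0 when only links count and moves to document 1 once the lexical share reaches 0.5. This is the kind of borderline behaviour the abstract describes, here reduced to a single document.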

Fig. 1

Data sourced from Clarivate Analytics Web of Science Core Collection

Fig. 2

Data sourced from Clarivate Analytics Web of Science Core Collection

Fig. 3

Data sourced from Clarivate Analytics Web of Science Core Collection

Corresponding author

Correspondence to Bart Thijs.

About this article

Cite this article

Thijs, B., Glänzel, W. The contribution of the lexical component in hybrid clustering, the case of four decades of “Scientometrics”. Scientometrics 115, 21–33 (2018). https://doi.org/10.1007/s11192-018-2659-0

  • DOI: https://doi.org/10.1007/s11192-018-2659-0
