Fine-grained classification of journal articles based on multiple layers of information through similarity network fusion: The case of the Cambridge Journal of Economics

Baccini, Alberto; Baccini, Federica; Barabesi, Lucio; Cioni, Martina; Petrovich, Eugenio; Pignalosa, Daria

doi:10.1007/s11192-023-04884-2

Published: 19 December 2023

Scientometrics, Volume 129, pages 373–400 (2024)

Alberto Baccini ORCID: orcid.org/0000-0003-0293-482X¹,
Federica Baccini²,
Lucio Barabesi¹,
Martina Cioni¹,
Eugenio Petrovich³ &
Daria Pignalosa¹

Abstract

To explore whether journal articles can be classified at a fine-grained level by exploiting multiple sources of information, articles are organized in a two-layer multiplex. The first layer conveys similarities based on the full text of articles, and the second conveys similarities based on cited references. The information in the two layers is only weakly associated. The Similarity Network Fusion process is adopted to combine the two layers into a new single-layer network. A clustering algorithm is applied to the fused network to obtain a classification of the articles. To evaluate its coherence, this classification is compared with those obtained by applying the same algorithm to each of the two layers separately. The classification obtained for the fused network is also compared with the classifications obtained when the layers are integrated using other methods available in the literature. In the case of the Cambridge Journal of Economics, Similarity Network Fusion appears to be the best option. Moreover, the resulting classification appears to be fine-grained enough to represent the extreme heterogeneity of the contributions published in the journal.
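The fusion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it is a simplified cross-diffusion in the spirit of Similarity Network Fusion, and it omits refinements that published implementations may apply, such as renormalising the matrices after each iteration. The function names (`full_kernel`, `local_kernel`, `fuse_two_layers`, `block_layer`), the parameters `k` and `iterations`, and the toy similarity values are all assumptions made for illustration. The sketch also assumes that every article has at least one positive similarity to another article in each layer.

```python
import numpy as np


def full_kernel(w):
    """Row-normalise off-diagonal similarities and fix the diagonal at 1/2."""
    w = np.array(w, dtype=float)
    np.fill_diagonal(w, 0.0)
    p = w / (2.0 * w.sum(axis=1, keepdims=True))
    np.fill_diagonal(p, 0.5)
    return p


def local_kernel(w, k):
    """Keep each article's k strongest neighbours, then row-normalise."""
    w = np.array(w, dtype=float)
    np.fill_diagonal(w, 0.0)
    s = np.zeros_like(w)
    for i, row in enumerate(w):
        nearest = np.argsort(row)[-k:]  # indices of the k largest similarities
        s[i, nearest] = row[nearest]
    return s / s.sum(axis=1, keepdims=True)


def fuse_two_layers(w_text, w_refs, k=2, iterations=20):
    """Cross-diffuse two similarity layers and average the result."""
    p_text, p_refs = full_kernel(w_text), full_kernel(w_refs)
    s_text, s_refs = local_kernel(w_text, k), local_kernel(w_refs, k)
    for _ in range(iterations):
        # Each layer is updated through its own local structure,
        # using the other layer's current status matrix.
        p_text, p_refs = (
            s_text @ p_refs @ s_text.T,
            s_refs @ p_text @ s_refs.T,
        )
    fused = (p_text + p_refs) / 2.0
    return (fused + fused.T) / 2.0  # enforce symmetry


def block_layer(within, between, sizes=(3, 3)):
    """Hypothetical toy layer: articles in the same group are more similar."""
    labels = np.repeat(np.arange(len(sizes)), sizes)
    same = labels[:, None] == labels[None, :]
    w = np.where(same, within, between).astype(float)
    np.fill_diagonal(w, 1.0)
    return w


# Toy data: six articles in two groups, with a sharper signal in the
# full-text layer than in the cited-references layer (made-up values).
w_text = block_layer(within=0.9, between=0.1)
w_refs = block_layer(within=0.6, between=0.2)

fused = fuse_two_layers(w_text, w_refs, k=2, iterations=20)
print(np.round(fused, 3))
```

In this toy case the fused matrix keeps higher similarities within each group than between groups. A clustering or community detection algorithm would then be run on `fused` to obtain the classification; the abstract does not name the algorithm, so that step is left out here.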

Data availability

After acceptance, raw data will be available at https://doi.org/10.5281/zenodo.7876691. A preprint of the article is available at https://arxiv.org/pdf/2305.00026.pdf.

Acknowledgements

We thank Alessandra Durio, who contributed to the work by doing all the processing for the construction of the similarity matrices based on bags of words and topic modeling. We also thank two anonymous referees for their insightful comments, which enabled substantial improvement of the article. This article is available as a preprint at https://arxiv.org/pdf/2305.00026.pdf.

Funding

The research is funded by the Italian Ministry of University, PRIN project: 2017MPXW98, PI: Alberto Baccini.

Author information

Authors and Affiliations

Dipartimento di Economia Politica e Statistica, Università degli Studi di Siena, Piazza San Francesco, 7, 53100, Siena, Italy
Alberto Baccini, Lucio Barabesi, Martina Cioni & Daria Pignalosa
Dipartimento di Ingegneria Informatica, Automatica e Gestionale “Antonio Ruberti”, Università degli Studi di Roma “La Sapienza”, Roma, Italy
Federica Baccini
Dipartimento di Filosofia e Scienze dell’Educazione, Università degli Studi di Torino, Torino, Italy
Eugenio Petrovich

Contributions

AB and LB contributed to the study conception and design. Material preparation, data collection and analysis were performed by AB, LB, MC and EP; FB supervised the methods of matrix integration and their comparison; DP interpreted data from the perspective of the methodology of economics. All authors participated in the writing of the manuscript.

Corresponding author

Correspondence to Alberto Baccini.

Ethics declarations

Conflict of interest

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: Supplementary figures

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Baccini, A., Baccini, F., Barabesi, L. et al. Fine-grained classification of journal articles based on multiple layers of information through similarity network fusion: The case of the Cambridge Journal of Economics. Scientometrics 129, 373–400 (2024). https://doi.org/10.1007/s11192-023-04884-2

Received: 06 May 2023
Accepted: 17 November 2023
Published: 19 December 2023
Issue Date: January 2024
DOI: https://doi.org/10.1007/s11192-023-04884-2

Keywords

JEL Classification

B2
A1
