A new clustering method to explore the dynamics of research communities

Cambe, Jordan; Grauwin, Sebastian; Flandrin, Patrick; Jensen, Pablo

doi:10.1007/s11192-022-04463-x

Published: 29 July 2022

Volume 127, pages 4459–4482, (2022)
Scientometrics

Jordan Cambe^1,2, Sebastian Grauwin^1,2, Patrick Flandrin^1 & Pablo Jensen^1,2 (ORCID: orcid.org/0000-0001-9912-2849)

Abstract

Description of temporal networks and detection of dynamic communities have been hot topics of research for the last decade. However, no consensual answers to these challenges have been found due to the complexity of the task. Static communities are not well defined objects, and adding a temporal dimension makes the description even more difficult. In this article, we propose a coherent temporal clustering method to explore the dynamics of research communities: the Best Combination of Local Communities (BCLC). Our method aims at finding a good balance between two contradictory objectives: closely following the short-term evolution by finding optimal partitions at each time step, on the one hand, and temporal smoothing, which privileges historical continuity, on the other hand. We test our algorithm on two bibliographic data sets by comparing their mesoscale dynamic description to those derived from a (static) simple clustering algorithm applied over the whole data set. We show that our clustering algorithm can reveal more complex dynamics than the simple approach and reach a good agreement with expert’s knowledge.

Data availability

Data used in this article are available at:

Notes

References

Aynaud, T., Fleury, E., Guillaume, J. L., & Wang, Q. (2013). Communities in evolving networks: Definitions, detection, and analysis techniques. In A. Mukherjee, M. Choudhury, F. Peruani, N. Ganguly, & B. Mitra (Eds.), Dynamics on and of complex networks, volume 2: applications to time-varying dynamical systems (pp. 159–200). New York, NY: Springer.
Barthélemy, M. (2011). Spatial networks. Physics Reports, 499(1), 1–101. https://doi.org/10.1016/j.physrep.2010.11.002.
Blondel, V. D., Guillaume, J. L., Lambiotte, R., & Lefebvre, E. (2008). Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 2008(10), P10008. http://stacks.iop.org/1742-5468/2008/i=10/a=P10008
Boccaletti, S., Latora, V., Moreno, Y., Chavez, M., & Hwang, D. U. (2006). Complex networks: Structure and dynamics. Physics Reports, 424(4), 175–308. https://doi.org/10.1016/j.physrep.2005.10.009
Cambe, J., Grauwin, S., Flandrin, P., & Jensen, P. (2020). Exploring and comparing temporal clustering methods. https://arxiv.org/abs/2012.01287
Claveau, F., & Gingras, Y. (2016). Macrodynamics of economics: A bibliometric history. History of Political Economy, 48(4), 551–592. https://doi.org/10.1215/00182702-3687259.
Dall’Asta, L., Barrat, A., Barthélemy, M., & Vespignani, A. (2006). Vulnerability of weighted networks. Journal of Statistical Mechanics: Theory and Experiment, 2006(04), P04006. http://stacks.iop.org/1742-5468/2006/i=04/a=P04006
Duan, D., Li, Y., Jin, Y., & Lu, Z. (2009). Community mining on dynamic weighted directed graphs. In Proceedings of the 1st ACM International Workshop on Complex Networks Meet Information & Knowledge Management, CNIKM ’09 (pp. 11–18). New York, NY, USA: ACM. https://doi.org/10.1145/1651274.1651278.
Fortunato, S., & Barthélemy, M. (2007). Resolution limit in community detection. Proceedings of the National Academy of Sciences, 104(1), 36–41. https://doi.org/10.1073/pnas.0605965104
Fortunato, S., & Hric, D. (2016). Community detection in networks: A user guide. Physics Reports, 659, 1–44. https://doi.org/10.1016/j.physrep.2016.09.002
Ghasemian, A., Zhang, P., Clauset, A., Moore, C., & Peel, L. (2016). Detectability thresholds and optimal algorithms for community structure in dynamic networks. Phys. Rev. X, 6, 031005. https://doi.org/10.1103/PhysRevX.6.031005
Görke, R., Maillard, P., Schumm, A., Staudt, C., & Wagner, D. (2013). Dynamic graph clustering combining modularity and smoothness. J. Exp. Algorithmics, 18, 15–129. https://doi.org/10.1145/2444016.2444021.
Görke, R., Maillard, P., Staudt, C., & Wagner, D. (2010). Modularity-driven clustering of dynamic graphs. In P. Festa (Ed.), Experimental Algorithms (pp. 436–448). Berlin, Heidelberg: Springer.
Grauwin, S., & Jensen, P. (2011). Mapping scientific institutions. Scientometrics, 89(3), 943. https://doi.org/10.1007/s11192-011-0482-y
Grauwin, S., & Sperano, I. (2018). Bibliomaps-a software to create web-based interactive maps of science: The case of ux map. Proceedings of the Association for Information Science and Technology, 55(1), 815–816.
Greene, D., Doyle, D., & Cunningham, P. (2010). Tracking the evolution of communities in dynamic social networks. In 2010 International Conference on Advances in Social Networks Analysis and Mining (pp. 176–183). https://doi.org/10.1109/ASONAM.2010.17
Guo, C., Wang, J., & Zhang, Z. (2014). Evolutionary community structure discovery in dynamic weighted networks. Physica A: Statistical Mechanics and its Applications, 413, 565–576. https://doi.org/10.1016/j.physa.2014.07.004
Hartmann, T., Kappes, A., & Wagner, D. (2016). Clustering evolving networks. In L. Kliemann & P. Sanders (Eds.), Algorithm engineering: Selected results and surveys (pp. 280–329). Cham: Springer International Publishing.
Holme, P. (2015). Modern temporal network theory: a colloquium. The European Physical Journal B, 88(9), 234. https://doi.org/10.1140/epjb/e2015-60657-4.
Holme, P., & Saramäki, J. (2012). Temporal networks. Physics Reports, 519(3), 97–125. https://doi.org/10.1016/j.physrep.2012.03.001
Kvålseth, T. O. (2017). On normalized mutual information: Measure derivations and properties. Entropy, 19(11), 631. https://doi.org/10.3390/e19110631
Lariviere, V., Archambault, E., & Gingras, Y. (2008). Long-term variations in the aging of scientific literature: From exponential growth to steady-state science (1900–2004). Journal of the American Society for Information Science and Technology, 59(2), 288–296. https://doi.org/10.1002/asi.20744
Lorenz, P., Wolf, F., Braun, J., Djurdjevac Conrad, N., & Hövel, P. (2018). Capturing the dynamics of hashtag-communities. In C. Cherifi, H. Cherifi, M. Karsai, & M. Musolesi (Eds.), Complex Networks & Their Applications VI (pp. 401–413). Cham: Springer International Publishing.
Lund, K., Jeong, H., Grauwin, S., & Jensen, P. (2017). Une carte scientométrique de la recherche en éducation vue par la base de données internationales scopus. Les Sciences de l’education-Pour l’Ere nouvelle, 50(1), 67–84.
Masuda, N., & Lambiotte, R. (2016). A guide to temporal networks. World Scientific (EUROPE). https://doi.org/10.1142/q0033.
Matias, C., & Miele, V. (2016). Statistical clustering of temporal networks through a dynamic stochastic block model. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 79(4), 1119–1141. https://doi.org/10.1111/rssb.12200
Morini, M., Flandrin, P., Fleury, E., Venturini, T., & Jensen, P. (2017). Revealing evolutions in dynamical networks. https://hal.inria.fr/hal-01558219
Mucha, P. J., Richardson, T., Macon, K., Porter, M. A., & Onnela, J. P. (2010). Community structure in time-dependent, multiscale, and multiplex networks. Science, 328(5980), 876–878. https://doi.org/10.1126/science.1184819
Newman, M. (2003). The structure and function of complex networks. SIAM Review, 45(2), 167–256. https://doi.org/10.1137/S003614450342480.
Newman, M. E. J. (2006). Modularity and community structure in networks. Proceedings of the National Academy of Sciences, 103(23), 8577–8582. https://doi.org/10.1073/pnas.0601602103.
Onnela, J. P., Saramäki, J., Hyvönen, J., Szabó, G., de Menezes, M. A., Kaski, K., Barabási, A. L., & Kertész, J. (2007). Analysis of a large-scale weighted network of one-to-one human communication. New Journal of Physics, 9(6), 179. http://stacks.iop.org/1367-2630/9/i=6/a=179
Pastor-Satorras, R., & Vespignani, A. (2001). Epidemic spreading in scale-free networks. Physical Review Letters, 86, 3200–3203. https://doi.org/10.1103/PhysRevLett.86.3200.
Rossetti, G., & Cazabet, R. (2018). Community discovery in dynamic networks: A survey. ACM Computing Surveys, 51(2), 35:1–35:37. https://doi.org/10.1145/3172867.
Rossetti, G., Pappalardo, L., Pedreschi, D., & Giannotti, F. (2017). Tiles: an online algorithm for community discovery in dynamic social networks. Machine Learning, 106(8), 1213–1241. https://doi.org/10.1007/s10994-016-5582-8.
Wagner, S., Wagner, D. (2007). Comparing clusterings—an overview. Tech. Rep. 4, Karlsruhe
Wang, Y., Wu, B., & Pei, X. (2008). Commtracker: A core-based algorithm of tracking community evolution. In C. Tang, C. X. Ling, X. Zhou, N. J. Cercone, & X. Li (Eds.), Advanced data mining and applications (pp. 229–240). Berlin, Heidelberg: Springer.

Acknowledgements

A previous version of this study has been archived as a preprint (Cambe et al., 2020).

Funding

This work was supported by the ACADEMICS Grant of the IDEXLYON, project of the Université de Lyon, PIA operated by ANR-16-IDEX-0005.

Author information

Authors and Affiliations

Laboratoire de Physique, University Lyon, ENS de Lyon, UCB Lyon 1, CNRS, F-69342, Lyon, France
Jordan Cambe, Sebastian Grauwin, Patrick Flandrin & Pablo Jensen
Institut Rhônalpin des Systemes Complexes IXXI, F-69342, Lyon, France
Jordan Cambe, Sebastian Grauwin & Pablo Jensen

Contributions

Conceptualization: PJ; Methodology: JC, SG, PJ; Formal analysis and investigation: JC, SG, PF, PJ; Writing - original draft preparation: JC, SG, PF, PJ; Writing - review and editing: JC, SG, PF, PJ; Funding acquisition: PJ; Supervision: PJ.

Corresponding author

Correspondence to Pablo Jensen.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Appendix

Dynamics of scientific research communities

We investigated four temporal community detection methods, two global and two local methods. However, as measures from GA and GPA are very close and measures from BMLA and BCLC are also very close, we only presented the GA and BCLC methods in the core of this article. The two other methods (GPA and BMLA) and their measures are described below.

Global projected algorithm (GPA)

Here, we want to include some dynamics in our global algorithm. We thus start with the set of GA-streams obtained by running the Louvain algorithm (Blondel et al., 2008) on the global BC network. Then, we define BC networks in each period, keeping only the articles that share at least two references with at least one other article within the period. Removing the “long-term connections only” articles, which do not share two or more references with another article in their period, results in an average loss of 7.8% of the articles taken into account in the global BC network. For each time period, we define local communities by grouping together the publications that belong to the same GA-stream, resulting in a set of local projected communities in each period. Finally, we compute historical streams by applying our matching algorithm to the projected communities. Interestingly, the streams built with this method do not necessarily correspond to the GA-streams: the predecessors or successors of a cluster may not be subsets of the GA-stream of that cluster, resulting in splits or merges. In practice, a few GA-streams may be cut into two or more GPA-streams localized in different time periods. This approach thus allows us to visualize the evolution of a GA-stream in terms of dynamical events (splits and merges).
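
The projection step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the dictionary-based inputs (`period_of`, `refs`, `stream_of`) and the toy data are all hypothetical, and the final matching step is left out because it is not specified in this appendix.

```python
from collections import defaultdict
from itertools import combinations


def period_bc_articles(refs_by_article, min_shared=2):
    """Articles sharing at least `min_shared` references with at least
    one other article of the same period (hypothetical helper)."""
    kept = set()
    for a, b in combinations(sorted(refs_by_article), 2):
        if len(refs_by_article[a] & refs_by_article[b]) >= min_shared:
            kept.update((a, b))
    return kept


def project_streams(period_of, refs, stream_of, min_shared=2):
    """Project global (GA) stream labels onto each period.

    period_of: article -> period label
    refs:      article -> set of cited references
    stream_of: article -> GA-stream label
    Returns {period: {stream label: set of articles}}.
    """
    by_period = defaultdict(dict)
    for article, period in period_of.items():
        by_period[period][article] = refs[article]

    projected = {}
    for period, refs_in_period in by_period.items():
        # Drop "long-term connections only" articles for this period.
        kept = period_bc_articles(refs_in_period, min_shared)
        communities = defaultdict(set)
        for article in kept:
            communities[stream_of[article]].add(article)
        projected[period] = dict(communities)
    return projected


# Toy data (invented for illustration only).
refs = {
    "a1": {"r1", "r2", "r3"},
    "a2": {"r1", "r2"},
    "a3": {"r1", "r9"},  # shares a single reference with a1 and a2
    "b1": {"r4", "r5"},
    "b2": {"r4", "r5", "r6"},
}
period_of = {"a1": 2000, "a2": 2000, "a3": 2000, "b1": 2005, "b2": 2005}
stream_of = {"a1": "S1", "a2": "S1", "a3": "S1", "b1": "S1", "b2": "S2"}

projected = project_streams(period_of, refs, stream_of)
print(projected)
```

In this toy example, `a3` is dropped from its period because it never shares two references with another article, and stream `S1` appears as a separate projected community in each period. The matching algorithm would then link these per-period communities into GPA-streams.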

Best-modularity local algorithm (BMLA)

For each time period, we perform N independent runs (we used \(N=100\)) of the Louvain algorithm. Because of the noise inherent to the Louvain algorithm, the resulting partitions may differ slightly while having similar modularity values (in practice, the modularity difference between the partitions of different runs is lower than 0.005). Unlike the BCLC method, we do not select the run whose partition best matches the partition of the previous or next period; instead, we keep the partition with the best modularity among the N runs in each time period. BMLA historical streams are then defined by applying the matching algorithm to these ‘best-modularity’ partitions.
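
The selection rule for a single time period can be illustrated as follows. This is a minimal sketch under stated assumptions: `detect` is a hypothetical callable standing in for one noisy run of a community detection algorithm such as Louvain, the graph is assumed to be an undirected weighted edge dictionary, and `toy_detect` together with the toy graph is invented for the example. The modularity function follows the standard Newman definition.

```python
import random


def modularity(edges, partition):
    """Newman modularity of an undirected weighted graph.

    edges:     {(u, v): weight}, each undirected edge listed once
    partition: {node: community label}
    """
    total = sum(edges.values())
    if total == 0:
        return 0.0
    internal = {}  # weight of edges inside each community
    degree = {}    # summed node strength of each community
    for (u, v), w in edges.items():
        cu, cv = partition[u], partition[v]
        degree[cu] = degree.get(cu, 0.0) + w
        degree[cv] = degree.get(cv, 0.0) + w
        if cu == cv:
            internal[cu] = internal.get(cu, 0.0) + w
    return sum(
        internal.get(c, 0.0) / total - (degree[c] / (2 * total)) ** 2
        for c in degree
    )


def best_modularity_partition(edges, detect, n_runs=100, seed=0):
    """Run `detect` n_runs times and keep the highest-modularity partition.

    `detect(edges, rng)` is an assumed interface: it must return a
    {node: community label} dictionary for one run.
    """
    rng = random.Random(seed)
    best, best_q = None, float("-inf")
    for _ in range(n_runs):
        partition = detect(edges, rng)
        q = modularity(edges, partition)
        if q > best_q:
            best, best_q = partition, q
    return best, best_q


# Toy graph: two triangles joined by one bridge edge.
edges = {(0, 1): 1, (1, 2): 1, (0, 2): 1,
         (3, 4): 1, (4, 5): 1, (3, 5): 1, (2, 3): 1}
good = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
bad = {n: "A" for n in range(6)}


def toy_detect(edges, rng):
    # Stand-in for a noisy detection run: returns one of two partitions.
    return rng.choice([good, bad])


best, best_q = best_modularity_partition(edges, toy_detect, n_runs=100)
print(best, round(best_q, 4))
```

Repeating this selection independently in every time period, and then applying the matching algorithm to the selected partitions, would give the BMLA-streams described above.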

BMLA Algorithm

This algorithm returns temporal streams we call BMLA-streams. These streams maximize the modularity at each time t without considering the global modularity of the whole system.

Comparing all algorithms

Tables 4 and 5 show there is very little difference between the local algorithms and between the global algorithms, for all measures on both data sets.

Table 4 Similarly to Table 2, \(\vert P_{X}\vert\) is the number of streams in partition X

Table 5 Similarly to Table 3, in this table each cell contains two lines

Cite this article

Cambe, J., Grauwin, S., Flandrin, P. et al. A new clustering method to explore the dynamics of research communities. Scientometrics 127, 4459–4482 (2022). https://doi.org/10.1007/s11192-022-04463-x

Received: 01 June 2021
Accepted: 30 June 2022
Published: 29 July 2022
Issue Date: August 2022
DOI: https://doi.org/10.1007/s11192-022-04463-x
