A new clustering method to explore the dynamics of research communities

Scientometrics

Abstract

The description of temporal networks and the detection of dynamic communities have been active research topics for the last decade. However, no consensus answers to these challenges have emerged, owing to the complexity of the task. Static communities are not well-defined objects, and adding a temporal dimension makes their description even more difficult. In this article, we propose a coherent temporal clustering method to explore the dynamics of research communities: the Best Combination of Local Communities (BCLC). Our method seeks a balance between two conflicting objectives: closely following the short-term evolution by finding optimal partitions at each time step, on the one hand, and temporal smoothing, which favors historical continuity, on the other. We test our algorithm on two bibliographic data sets by comparing the resulting mesoscale dynamic descriptions to those derived from a simple (static) clustering algorithm applied to the whole data set. We show that our clustering algorithm can reveal more complex dynamics than the simple approach and agrees well with experts' knowledge.

Data availability

Data used in this article are available at:

Notes

  1. http://apps.isiknowledge.com/.

  2. http://www.sebastian-grauwin.com/bibliomaps/.

Acknowledgements

A previous version of this study has been archived as a preprint (Cambe et al., 2020).

Funding

This work was supported by the ACADEMICS Grant of the IDEXLYON, project of the Université de Lyon, PIA operated by ANR-16-IDEX-0005.

Author information

Contributions

Conceptualization: PJ; Methodology: JC, SG, PJ; Formal analysis and investigation: JC, SG, PF, PJ; Writing - original draft preparation: JC, SG, PF, PJ; Writing - review and editing: JC, SG, PF, PJ; Funding acquisition: PJ; Supervision: PJ.

Corresponding author

Correspondence to Pablo Jensen.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Appendix

Dynamics of scientific research communities

We investigated four temporal community detection methods: two global and two local. However, because the measures from GA and GPA are very close, as are those from BMLA and BCLC, we presented only the GA and BCLC methods in the main text. The two other methods (GPA and BMLA) and their measures are described below.

Global projected algorithm (GPA)

Here, we want to include some dynamics into our global algorithm. We thus start with the set of GA-streams obtained by running the Louvain algorithm (Blondel et al., 2008) on the global BC network. Then, we define BC networks in each period, keeping only the articles that share at least two references with at least one other article within the period. Removing the “long-term connections only” articles, which do not share two or more references with another article in their period, results in an average loss of 7.8% of the articles included in the global BC network. For each time period, we define local communities by grouping together the publications that belong to the same GA-stream, resulting in a set of local projected communities in each period. Finally, we compute historical streams by applying our matching algorithm to the projected communities. Interestingly, the streams built with this method do not necessarily correspond to the GA-streams: the predecessors or successors of a cluster may not be subsets of the same GA-stream as that cluster, resulting in splits or merges. In practice, a few GA-streams may in effect be cut into two or more GPA-streams localized in different time periods. This approach thus makes it possible to visualize the evolution of a GA-stream in terms of dynamical events (splits and merges).
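The projection step of GPA can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the data layout (a dictionary of reference sets per article) and the toy data are assumptions made for illustration, and the matching algorithm that links projected communities across periods is not sketched because it is not specified in this appendix.

```python
from collections import defaultdict
from itertools import combinations


def period_bc_articles(refs_by_article, min_shared=2):
    """Return the articles sharing at least `min_shared` references
    with at least one other article of the same period."""
    kept = set()
    for a, b in combinations(refs_by_article, 2):
        if len(refs_by_article[a] & refs_by_article[b]) >= min_shared:
            kept.update((a, b))
    return kept


def project_global_partition(periods, ga_stream_of, min_shared=2):
    """Group the retained articles of each period by GA-stream label.

    periods: {period: {article: set_of_references}}
    ga_stream_of: {article: label of its GA-stream in the global partition}
    Returns {period: {ga_label: set_of_articles}}.
    """
    projected = {}
    for period, refs_by_article in periods.items():
        kept = period_bc_articles(refs_by_article, min_shared)
        clusters = defaultdict(set)
        for article in kept:
            clusters[ga_stream_of[article]].add(article)
        projected[period] = dict(clusters)
    return projected


# Toy data, invented for illustration only.
periods = {
    "t1": {
        "a1": {"r1", "r2", "r3"},
        "a2": {"r1", "r2"},
        # a3 only couples with articles of t2 ("long-term connections only"),
        # so it is dropped from the t1 network.
        "a3": {"r5", "r6"},
    },
    "t2": {
        "a4": {"r1", "r2", "r4"},
        "a5": {"r2", "r4"},
        "a6": {"r5", "r6"},
        "a7": {"r5", "r6", "r7"},
    },
}
# Hypothetical global (GA) partition of all articles.
ga_stream_of = {"a1": "A", "a2": "A", "a3": "B", "a4": "A", "a5": "A",
                "a6": "B", "a7": "B"}

projected = project_global_partition(periods, ga_stream_of)
```

In this toy example, article `a3` belongs to GA-stream `B` globally but shares fewer than two references with any article of its own period, so it does not appear in the projected communities of `t1`.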

Best-modularity local algorithm (BMLA)

For each time period, we perform N independent runs (we used \(N=100\)) of the Louvain algorithm. Because of the randomness inherent to the Louvain algorithm, the resulting partitions may differ slightly while having similar modularity values (in practice, the modularity difference between the partitions of different runs is lower than 0.005). Unlike in the BCLC method, we do not try here to choose the run whose partition best matches the partition from the previous or next period; instead, we keep the partition with the best modularity among the N runs in each time period. BMLA historical streams are then defined by applying the matching algorithm to these ‘best-modularity’ partitions.
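The per-period selection rule can be sketched as follows, under stated assumptions. The detector passed to `best_modularity_partition` is a hypothetical random stand-in, not the Louvain algorithm; in practice a Louvain implementation would be substituted. The function names and the toy graph are illustrative, and modularity is computed for an unweighted, undirected graph.

```python
import random


def modularity(edges, partition):
    """Newman modularity of an unweighted, undirected graph.

    edges: list of (u, v) pairs; partition: list of disjoint node sets.
    """
    m = len(edges)
    community_of = {node: i for i, block in enumerate(partition) for node in block}
    internal = [0] * len(partition)   # edges inside each community
    degree = [0] * len(partition)     # summed degree of each community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2
               for c in range(len(partition)))


def best_modularity_partition(edges, detect, n_runs=100, seed=0):
    """Run `detect` n_runs times and keep the highest-modularity partition."""
    rng = random.Random(seed)
    best, best_q = None, float("-inf")
    for _ in range(n_runs):
        partition = detect(edges, rng)
        q = modularity(edges, partition)
        if q > best_q:
            best, best_q = partition, q
    return best, best_q


def random_two_blocks(edges, rng):
    """Hypothetical stand-in for a stochastic detector (NOT Louvain):
    assigns each node to one of two blocks at random."""
    nodes = sorted({n for e in edges for n in e})
    blocks = [set(), set()]
    for n in nodes:
        blocks[rng.randrange(2)].add(n)
    return [b for b in blocks if b]


# Toy graph for one time period: two triangles joined by one edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
best, best_q = best_modularity_partition(edges, random_two_blocks, n_runs=100)
```

Applying `best_modularity_partition` independently to each period yields the ‘best-modularity’ partitions to which the matching algorithm is then applied. Because each period is treated in isolation, nothing in this selection favors continuity with neighboring periods.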

BMLA Algorithm (figure c)

This algorithm returns temporal streams we call BMLA-streams. These streams maximize the modularity at each time t without considering the global modularity of the whole system.

Comparing all algorithms

Tables 4 and 5 show there is very little difference between the local algorithms and between the global algorithms, for all measures on both data sets.

Table 4 Similarly to Table 2, \(\vert P_{X}\vert\) is the number of streams in partition X
Table 5 Similarly to Table 3, each cell in this table contains two lines

About this article

Cite this article

Cambe, J., Grauwin, S., Flandrin, P. et al. A new clustering method to explore the dynamics of research communities. Scientometrics 127, 4459–4482 (2022). https://doi.org/10.1007/s11192-022-04463-x

  • DOI: https://doi.org/10.1007/s11192-022-04463-x