Higher-order rich-club phenomenon in collaborative research grant networks

Modern scientific work, including writing papers and submitting research grant proposals, increasingly involves researchers from different institutions. In grant collaborations, it is known that institutions involved in many collaborations tend to densely collaborate with each other, forming rich clubs. Here we investigate higher-order rich-club phenomena in networks of collaborative research grants among institutions and their associations with research impact. Using publicly available data from the National Science Foundation in the US, we construct a bipartite network of institutions and collaborative grants, which distinguishes among collaborations involving different numbers of institutions. By extending the concept and algorithms of the rich club for dyadic networks to the case of bipartite networks, we find rich clubs in both the entire bipartite network and the bipartite subnetworks induced by the collaborative grants involving a given number of institutions, up to five. We also find that the collaborative grants within rich clubs tend to be more impactful in a per-dollar sense than the control. Our results highlight advantages of collaborative grants among the institutions in the rich clubs.


Introduction
The reliance on teamwork in scientific research has increased over the last decades 1,2. The fraction of scientific papers written by teams of researchers and the number of authors in a scientific paper have increased over the last century on average 3,4. Various factors affect outcomes of scientific teamwork, including the team size (i.e., the number of authors of a paper) 4,5, internationality (i.e., the number of countries involved in a paper) 6,7, ethnic diversity (i.e., the number of ethnicities involved in a paper) 8, interdisciplinarity (i.e., the number of disciplines of authors involved in a paper) 9,10, and team freshness (i.e., the fraction of authors who have not collaborated with others before) 11. In addition, quantitative approaches to scientific collaboration networks have contributed to the understanding of patterns of collaborations among researchers 1,12 and their relations to research productivity (e.g., the number of published papers) or impact (e.g., the number of citations received by published papers) of researchers 13–22.
A universal trend in modern scientific teamwork is that researchers from different institutions collaborate with each other 23–25. Such teams tend to produce papers with higher citation impacts than those written by teams confined to a single institution 25. Patterns of co-authorships among researchers from different institutions have been characterized through analyses of collaboration networks among institutions 26–28. Grant collaboration involving multiple institutions is also a growing trend 29–31. Ma et al. analyzed a British collaboration network among institutions in which edges represent partnerships between two institutions in funded research projects 31. They found that universities with many edges tend to be densely connected to each other, forming a rich club. Analyses of such grant collaboration networks may inform the government and other stakeholders on how to allocate research funding to institutions 32.
In the present study, we represent collaborations among institutions on research grants as bipartite networks to investigate grant collaborations among two or more institutions. Note that Ma et al. investigated a dyadic collaboration network of research grants in which collaborations among three or more institutions were represented by dyadic collaborations 31. Such a projection onto dyadic networks, called the one-mode projection, is a major method for analyzing networks involving higher-order interactions among nodes 1,12,33. However, evidence suggests limitations of describing such higher-order data using only pairwise interactions 34,35. In fact, despite the coordination cost that collaborating institutions incur, it is not uncommon that more than two institutions participate in a funded research project 23,24,36. Grants with large monetary amounts often require or at least encourage inter-institutional collaboration and are sometimes a main reason for collaboration among institutions 37. Large grant teams in terms of the number of investigators tend to be more productive 38, and collaboration with such large and productive teams tends to receive grants in the future 39. These factors may also lead to an increase in the number of collaborating institutions. Thus motivated, we investigate networks of higher-order grant collaborations among institutions.
The relationships between research funding and research productivity or impact have been investigated for individual grants 40, investigators 41–45, institutions 31,46–49, and geographical regions 50. Understanding such relationships is expected to assist the government and other stakeholders in developing strategies for allocating research funds to different units to enhance research productivity or impact. Evidence supports positive correlations between the monetary amount of research funding received by an institution and its research productivity or impact 31,46–49. On the other hand, the per-dollar productivity or impact of an institution that receives a large amount of research funding tends to diminish 51–54. Given this, in the present study we ask the following question: do institutions participating in many collaborative grants gain advantages in their per-dollar research impact when they densely collaborate with each other (i.e., they form a rich club) in research grants? We examine this question using a bipartite-network representation of collaborative grants among institutions, which allows us to investigate relationships among rich clubs, research impact, and the collaboration size.

Collaborative grants
We use publicly available data on the grants administered by the National Science Foundation (NSF) 1. We focus on the collaborative grants in which multiple institutions participate and each institution is responsible for a separate award. Therefore, each collaborative grant is composed of a set of linked awards, each of which is separately administered by a single institution. For this type of collaborative grant, research proposals submitted by collaborating institutions must have the same project title beginning with 'Collaborative Research:' (e.g., see the latest guide posted by the NSF 2; we confirmed that this rule has been applied at least since 1999 3). Therefore, we first collected the data of the awards with the project title beginning with 'Collaborative Research:' and the start date between January 1, 2000 and December 31, 2020. Second, we identified the set of institutions that received at least one such award. Third, we used the Wikipedia APIs 4 to categorize each institution into one of 48 types; see Table S1 for the complete list of institution types. Fourth, we obtained the data of the awards received by the institutions whose type name includes 'university', 'college', or 'school' (see Table S1 for the list of institution types that we focused on). Among these institutions, there are 14,081 collaborative grants, each of which contains at least two awards (i.e., institutions). Fifth, for each collaborative grant, we identified the set of participating institutions, the 7-digit award number (i.e., ID) assigned to each participating institution, and the monetary amount distributed to each participating institution.
To quantify the research outputs produced under the collaborative grants, we use the Web of Science Core Collection database 5. There are 1,082,349 papers that were published between January 1, 2000 and December 31, 2020 and include at least one of the terms 'National Science Foundation' and 'NSF' in the acknowledgment section. The fraction of papers with acknowledgment data in this data set has increased since 2008 because the Web of Science started recording the funding acknowledgment data in August 2008 6. For each of these papers, we extracted the 7-digit award numbers mentioned in the acknowledgment section, the number of times cited by other papers in the database, the research disciplines assigned to the paper (which are available in the data set), the publication year, and the document type. We retained the 1,066,324 papers whose document types are either 'Article', 'Review', 'Letter', 'Editorial Material', 'Meeting Abstract', or 'Proceedings Paper', as suggested in Ref. 61. Then, for each award comprising a collaborative grant, we identified the papers that mentioned its award number in the acknowledgment section. We removed the collaborative grants with fewer than five published papers in the database because such collaborative grants often have extreme impact values due to the small number of associated papers. Then, we were left with 7,026 collaborative grants, each of which is associated with at least five of the 101,283 published papers. These collaborative grants have been awarded to 570 institutions in total.

Single-institution grants
For comparison, we also analyzed the grants that were composed of just one award given to one institution. To prepare such data, we first identified the awards whose project title did not begin with 'Collaborative Research:' and whose start date was between January 1, 2000 and December 31, 2020. There are 148,795 awards that meet these criteria and have been received by any of the 570 institutions that have participated in at least one collaborative grant. Second, for each of these awards, we identified the institution that received the award, the 7-digit award number (i.e., ID) assigned to the institution, the monetary amount of the award, and the first and last names of the principal investigator (PI) and co-PIs. Third, for each award, we identified the papers that mentioned its award number in the acknowledgment section. We removed the awards associated with fewer than five published papers in the Web of Science database. Then, we were left with 41,510 awards. According to the NSF's guide, these awards belong to one of the following three types of grant: (i) single-institution grant without a co-PI, (ii) single-institution grant in which all the co-PIs are from the same institution as the PI, and (iii) collaborative grant in which at least one co-PI is from a different institution from the PI's and the PI's institution is responsible for the award.
We focus on the awards of types (i) and (ii) because they are genuine single-institution grants. We found 24,866 awards of type (i) among the 41,510 awards. It is not straightforward to classify the remaining 16,644 awards into types (ii) and (iii) because the affiliations of the co-PIs are not available in our data set. Therefore, we attempted to identify the awards of type (ii) as follows. First, for each co-PI in a given award, we obtain the set of candidate affiliations of the co-PI as the set of the affiliations of the authors who have the same first-name initial and the same full last name as the co-PI in any of the papers associated with the award. Second, we regard an award as being of type (ii) if and only if the set of candidate affiliations of every co-PI in the award includes the institution that has received the award. Otherwise, we regard the award as being of type (iii). We obtained 7,854 awards of type (ii) among the 16,644 awards with co-PIs.
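The name-matching heuristic for separating type (ii) from type (iii) awards can be sketched as follows. This is a minimal illustration, not the authors' implementation: the record layout (authors as `(first, last, affiliation)` tuples) and the function names are assumptions made here, and the sketch applies only to awards that have at least one co-PI.

```python
def name_key(first_name, last_name):
    """First-name initial plus full last name, compared case-insensitively."""
    return (first_name.strip()[:1].lower(), last_name.strip().lower())


def candidate_affiliations(co_pi, papers):
    """Affiliations of all authors, in the award's papers, whose key matches the co-PI.

    co_pi  : (first_name, last_name)
    papers : list of papers; each paper is a list of
             (first_name, last_name, affiliation) tuples (hypothetical layout).
    """
    key = name_key(*co_pi)
    found = set()
    for paper in papers:
        for first, last, affiliation in paper:
            if name_key(first, last) == key:
                found.add(affiliation)
    return found


def is_type_ii(awardee, co_pis, papers):
    """True if every co-PI has the awardee institution among the candidates."""
    return all(awardee in candidate_affiliations(c, papers) for c in co_pis)


# Toy data (invented for illustration only).
papers = [
    [("Ada", "Lovelace", "Univ A"), ("Alan", "Turing", "Univ B")],
    [("A.", "Lovelace", "Univ A")],
]
print(is_type_ii("Univ A", [("Ada", "Lovelace")], papers))   # co-PI matched at Univ A
print(is_type_ii("Univ A", [("Alan", "Turing")], papers))    # co-PI only seen at Univ B
```

A co-PI who never appears as an author of any associated paper has an empty candidate set, so the award falls into type (iii) under this rule.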
In summary, we obtained 24,866 + 7,854 = 32,720 single-institution grants, each of which is associated with at least five of the 363,116 published papers. These grants have been awarded to 441 institutions in total.

Bipartite network of institutions and collaborative grants
From the data on the collaborative grants, we construct a bipartite network that consists of a set of institutions $V = \{v_1, \ldots, v_N\}$, where $N$ is the number of institutions, a set of collaborative grants $U = \{u_1, \ldots, u_M\}$, where $M$ is the number of collaborative grants, and a set of edges $E$. An edge $(v_i, u_j)$ exists between institution $v_i$ and collaborative grant $u_j$ if and only if $v_i$ received an award in the collaborative grant $u_j$. A unique 7-digit award number and a unique monetary amount are associated with each edge $(v_i, u_j) \in E$. We denote by $k_i$ the degree of $v_i$, i.e., the number of awards that institution $v_i$ received from collaborative grants. We denote by $s_j$ the degree of $u_j$, i.e., the number of collaborating institutions in collaborative grant $u_j$. We show in Fig. 1 a hypothetical bipartite network of four institutions and three collaborative grants. In this example, we have $N = 4$, $M = 3$, $k_1 = 2$, $k_2 = 3$, $k_3 = 1$, $k_4 = 2$, $s_1 = 2$, $s_2 = 3$, and $s_3 = 3$.
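As a small illustration of the two kinds of degree, the sketch below builds the hypothetical network of Fig. 1 from an edge list and counts the degrees of institutions and grants. The edge list is an assumption: it is inferred from the index pairs of the per-dollar impacts $x_{ij}$ that appear in the worked examples for Fig. 1(b) later in the text.

```python
from collections import Counter

# Edges (institution, grant) inferred from the x_ij terms in the worked examples.
edges = [
    ("v1", "u1"), ("v2", "u1"),
    ("v1", "u2"), ("v2", "u2"), ("v4", "u2"),
    ("v2", "u3"), ("v3", "u3"), ("v4", "u3"),
]

# k_i: number of awards that institution v_i received from collaborative grants.
k = Counter(v for v, _ in edges)
# s_j: number of collaborating institutions in collaborative grant u_j.
s = Counter(u for _, u in edges)

print(dict(k))
print(dict(s))
```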

Detection of rich clubs
A rich club of a dyadic network is defined as a subnetwork in which the nodes with the highest degrees (i.e., the nodes with the largest numbers of connected edges) are densely inter-connected 62,63. There are a few studies on rich clubs in bipartite networks. Opsahl et al. investigated rich clubs in a bipartite network of academic authors and papers 33. They constructed a weighted unipartite network in which the weight of each edge between two authors is equal to the number of coauthored papers, which corresponds to the one-mode projection of the bipartite network to a unipartite network, and then applied a method to detect weighted rich clubs for dyadic networks. The same method was applied to detect a rich club in a bipartite brain network 64, a bipartite transportation network 65, and a bipartite technological network 66. In the present work, we investigate rich clubs in higher-order networks of collaborative grants among institutions, which the one-mode projection does not characterize. Specifically, we develop and apply a method to detect rich clubs in bipartite networks without using the one-mode projection.
We define a rich club of a given bipartite network composed of institutions and collaborative grants as a subnetwork in which the institutions with the largest degrees densely collaborate with each other. To detect the rich club, we first calculate the rich-club coefficient, denoted by $\phi(k)$, for the original bipartite network for a given degree $k$. By extending the definition for dyadic networks 62,63, we define $\phi(k)$ as the number of collaborative grants that are exclusively composed of the institutions with a degree larger than $k$ divided by the maximum possible number of collaborative grants that are exclusively composed of some of these nodes. Formally, we define

$$\phi(k) = \frac{|U_{>k}|}{2^{N_{>k}} - N_{>k} - 1},$$

where $U_{>k}$ is the set of collaborative grants that are exclusively composed of the institutions with a degree larger than $k$, and $N_{>k}$ is the number of institutions with a degree larger than $k$. The denominator counts the distinct sets of two or more institutions that can be formed from these $N_{>k}$ institutions. To examine the presence of a rich club, we need to compare $\phi(k)$ with values for a reference model 63. Therefore, we define the normalized rich-club coefficient, denoted by $\rho(k)$, as

$$\rho(k) = \frac{\phi(k)}{\phi_{\text{rand}}(k)},$$

where $\phi_{\text{rand}}(k)$ is the rich-club coefficient for the reference model of the bipartite network. If $\rho(k)$ is sufficiently larger than 1, we say that the institutions with a degree larger than $k$ form a rich club. For dyadic networks, a standard choice of the reference model is the configuration model, which randomizes the edges of the original network while preserving the degree of each node 63. Here we use a counterpart of the configuration model for bipartite networks in which we randomize the edges of the original bipartite network while preserving the degree of each institution and each collaborative grant 67,68. We compute $\phi_{\text{rand}}(k)$ as the rich-club coefficient averaged over 10,000 randomized bipartite networks.
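The computation of the normalized rich-club coefficient can be sketched as follows. Because the reference model preserves every institution's degree, the number of institutions with a degree larger than k is the same in the original and randomized networks, so the normalized coefficient reduces to a ratio of grant counts and the sketch never evaluates the denominator of the unnormalized coefficient. The randomization scheme is an assumption made here (repeated double-edge swaps that reject duplicate edges); the text does not say which sampler the cited references use. The function names, the default of 200 randomizations (the text uses 10,000), and the toy edge list are likewise illustrative.

```python
import random
from collections import Counter, defaultdict


def count_rich_grants(edges, k):
    """Number of grants whose institutions all have degree larger than k."""
    degree = Counter(v for v, _ in edges)
    members = defaultdict(set)
    for v, u in edges:
        members[u].add(v)
    return sum(all(degree[v] > k for v in vs) for vs in members.values())


def randomize(edges, n_swaps, rng):
    """Degree-preserving randomization by double-edge swaps (assumed scheme)."""
    edges = list(edges)
    present = set(edges)
    for _ in range(n_swaps):
        a, b = rng.sample(range(len(edges)), 2)
        (v1, u1), (v2, u2) = edges[a], edges[b]
        # Reject swaps that would create an edge that already exists; this also
        # covers the cases v1 == v2 and u1 == u2.
        if (v1, u2) in present or (v2, u1) in present:
            continue
        present -= {(v1, u1), (v2, u2)}
        present |= {(v1, u2), (v2, u1)}
        edges[a], edges[b] = (v1, u2), (v2, u1)
    return edges


def normalized_rich_club(edges, k, n_random=200, seed=0):
    """Observed count of rich grants divided by its mean over randomized networks."""
    rng = random.Random(seed)
    observed = count_rich_grants(edges, k)
    total = sum(
        count_rich_grants(randomize(edges, 10 * len(edges), rng), k)
        for _ in range(n_random)
    )
    mean_random = total / n_random
    return observed / mean_random if mean_random > 0 else float("nan")


# Toy network with four institutions and three grants (illustrative only).
EDGES = [
    ("v1", "u1"), ("v2", "u1"),
    ("v1", "u2"), ("v2", "u2"), ("v4", "u2"),
    ("v2", "u3"), ("v3", "u3"), ("v4", "u3"),
]
print(count_rich_grants(EDGES, 1))
print(normalized_rich_club(EDGES, 1))
```

On a network this small the ratio is dominated by sampling noise; the sketch is meant only to show the order of operations.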

Measuring research impact for awards, institutions, and grants
Each award in collaborative grants is associated with a monetary amount and a set of journal and conference papers supported by the award, with which we calculate the per-dollar research impact 40 as follows. First, to compare the citation count across different publication years and research disciplines, we normalize the number of citations received by each of the 101,283 papers, which are associated with at least one collaborative grant 61,69. To this end, we denote by $c$ the number of citations that a given paper $z$ has received. We define $c_0$ as the average number of citations received by a paper that was published in the same year as $z$ and belongs to a research discipline assigned to $z$. Specifically, we set $c_0 = \left(\sum_{d \in D(z)} \bar{c}_{d,y(z)}\right) / |D(z)|$, where $D(z)$ is the set of the research disciplines assigned to $z$, $|D(z)|$ is the number of research disciplines to which $z$ belongs, $y(z)$ is the publication year of $z$, and $\bar{c}_{d,y(z)}$ is the average number of citations received by the papers published in discipline $d$ and year $y(z)$. Each paper is assigned to at least one of the 42 research disciplines 1 (see Supplementary Section S2 for details). We define the normalized number of citations received by $z$ as $c/c_0$. Then, we define the per-dollar impact of the award given to institution $v_i$ in collaborative grant $u_j$, denoted by $x_{ij}$, as the sum of $c/c_0$ over all the papers associated with the award divided by the monetary amount of the award.
We measure the impact of collaborative funded research for a given subset of institutions, denoted by $V'$ ($V' \subseteq V$), as follows. We first calculate the average per-dollar impact of the awards in collaborative grants that the institutions in $V'$ have received, denoted by $\bar{x}_{\text{inst}}(V')$. Then, we define the normalized impact for the set of institutions $V'$ as $\bar{x}_{\text{inst}}(V') / \bar{x}$, where $\bar{x}$ is the average per-dollar impact of all the awards in collaborative grants. For example, when we consider the set of institutions $V' = \{v_1, v_3\}$ in the bipartite network shown in Fig. 1(b), we obtain $\bar{x}_{\text{inst}}(V') = (x_{11} + x_{12} + x_{33})/3$. Note that $\bar{x} = (x_{11} + x_{12} + x_{21} + x_{22} + x_{23} + x_{33} + x_{42} + x_{43})/8$. If the normalized impact is larger than 1, the impact of $V'$ is higher than the average impact of all the institutions.
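The chain from raw citations to the normalized impact of a set of institutions can be sketched as follows. The data layout (papers as dictionaries, a lookup table of field-and-year citation averages, and per-dollar impacts keyed by institution and grant) and all numerical values are assumptions invented for illustration; only the arithmetic follows the definitions above.

```python
def expected_citations(disciplines, year, mean_citations):
    """c_0: mean over the paper's disciplines of the discipline-year citation average."""
    return sum(mean_citations[(d, year)] for d in disciplines) / len(disciplines)


def per_dollar_impact(papers, amount, mean_citations):
    """x_ij: sum of normalized citations c / c_0 over the award's papers, per dollar."""
    total = sum(
        p["citations"] / expected_citations(p["disciplines"], p["year"], mean_citations)
        for p in papers
    )
    return total / amount


def normalized_institution_impact(x, institutions):
    """Average x_ij over awards held by `institutions`, divided by the average over all awards."""
    selected = [value for (v, _), value in x.items() if v in institutions]
    x_inst = sum(selected) / len(selected)
    x_all = sum(x.values()) / len(x)
    return x_inst / x_all


# One award with a single paper assigned to two disciplines (made-up numbers).
mean_citations = {("physics", 2010): 10.0, ("chemistry", 2010): 20.0}
papers = [{"citations": 30, "disciplines": ["physics", "chemistry"], "year": 2010}]
x_example = per_dollar_impact(papers, 100_000.0, mean_citations)

# Per-dollar impacts on the eight edges of the Fig. 1(b) example (made-up values).
x = {
    ("v1", "u1"): 1.0, ("v1", "u2"): 2.0,
    ("v2", "u1"): 3.0, ("v2", "u2"): 4.0, ("v2", "u3"): 5.0,
    ("v3", "u3"): 6.0,
    ("v4", "u2"): 7.0, ("v4", "u3"): 8.0,
}
ratio = normalized_institution_impact(x, {"v1", "v3"})
print(x_example, ratio)
```

With these made-up values, the three awards held by the first and third institutions average 3, all eight awards average 4.5, and the normalized impact is therefore 2/3.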
We measure the impact of a given subset of collaborative grants, denoted by $U'$ ($U' \subseteq U$), as follows. We first calculate the average per-dollar impact of the awards in $U'$, denoted by $\bar{x}_{\text{grant}}(U')$. We are interested in whether institutional collaborations yield higher impact than the average impact of the participating institutions. Therefore, we define the normalized impact of $U'$ as $\bar{x}_{\text{grant}}(U') / \bar{x}_{\text{inst}}(V(U'))$, where $V(U')$ is the set of institutions participating in at least one collaborative grant in $U'$. Note that $\bar{x}_{\text{inst}}(V(U'))$ is the average per-dollar impact of the awards that the institutions in $V(U')$ have received. As an example, let us consider the set of collaborative grants $U' = \{u_1, u_2\}$ in the bipartite network shown in Fig. 1(b). One obtains $\bar{x}_{\text{grant}}(U') = (x_{11} + x_{21} + x_{12} + x_{22} + x_{42})/5$. Because the set of institutions $V(U')$ is $\{v_1, v_2, v_4\}$, one obtains $\bar{x}_{\text{inst}}(V(U')) = (x_{11} + x_{12} + x_{21} + x_{22} + x_{23} + x_{42} + x_{43})/7$. If the normalized impact is larger than 1, the impact of the collaborative grants in $U'$ is higher than the average impact of the institutions participating in a collaborative grant in $U'$.
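The normalized impact of a set of grants can be sketched in the same way. The per-dollar impacts below are made-up values on the eight edges of the Fig. 1(b) example, and the dictionary layout and function name are assumptions for illustration.

```python
def normalized_grant_impact(x, grants):
    """Average x_ij over awards in `grants`, divided by the average x_ij over all
    awards held by the institutions that take part in at least one of those grants."""
    in_grants = [value for (_, u), value in x.items() if u in grants]
    participants = {v for (v, u) in x if u in grants}
    by_participants = [value for (v, _), value in x.items() if v in participants]
    x_grant = sum(in_grants) / len(in_grants)
    x_inst = sum(by_participants) / len(by_participants)
    return x_grant / x_inst


# Made-up per-dollar impacts keyed by (institution, grant).
x = {
    ("v1", "u1"): 1.0, ("v1", "u2"): 2.0,
    ("v2", "u1"): 3.0, ("v2", "u2"): 4.0, ("v2", "u3"): 5.0,
    ("v3", "u3"): 6.0,
    ("v4", "u2"): 7.0, ("v4", "u3"): 8.0,
}
value = normalized_grant_impact(x, {"u1", "u2"})
print(value)
```

Here the five awards in the two selected grants average 17/5, the seven awards held by the three participating institutions average 30/7, and the ratio is below 1, meaning these grants underperform what their participants achieve on average in this toy data.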
To quantify the impact of single-institution grants, we adapt the above procedure for collaborative grants to the case of single-institution grants as follows. First, we construct a bipartite network composed of institutions and single-institution grants. Second, we normalize the number of citations received by each of the 363,116 papers that are associated with at least one single-institution grant by the publication year and research discipline. Then, we directly apply the definitions of impact in the case of bipartite networks of institutions and collaborative grants to the bipartite networks of institutions and single-institution grants.

Higher-order rich clubs in collaborative grants
We explore the possibility of higher-order rich clubs in collaborative grants. We are also interested in how a rich-club phenomenon depends on the number of institutions in a collaborative grant. Therefore, we calculate the normalized rich-club coefficients for the entire bipartite network and the bipartite subnetwork induced by the collaborative grants of degree (i.e., the number of collaborating institutions) s. We consider s ∈ {2, 3, 4, 5} because collaborative grants with s ≥ 6 are rare; there are fewer than 100 grants for each s ≥ 6.
Figure 2(a) shows the normalized rich-club coefficients for the different bipartite networks. Figure 2(a) indicates that the entire bipartite network shows a rich-club phenomenon (i.e., normalized rich-club coefficient > 1.10, although this criterion is arbitrary) for the threshold of the number of awards from collaborative grants, k, in the approximate range 100 ≤ k ≤ 200. (The P-value is less than 0.005 for 1 ≤ k ≤ 193 according to the Bonferroni-corrected permutation test; see Supplementary Section S3.) The normalized rich-club coefficient reaches the maximum value of approximately 1.21 at k = 144. The figure also indicates that, although the bipartite subnetwork with s = 2 has rich clubs that are statistically significant (see Supplementary Section S3), the normalized rich-club coefficient values are modest, with the largest value being 1.13. In contrast, the bipartite subnetwork only composed of collaborations among s = 3 institutions, the subnetwork restricted to s = 4, and that restricted to s = 5 show relatively strong and persistent rich clubs across a range of k. Therefore, the institutions that receive the largest numbers of awards from triadic, quartic, or quintic collaborative grants tend to collaborate more densely with each other than the institutions with the largest numbers of awards from dyadic collaborative grants. Note that the normalized rich-club coefficient for the entire bipartite network (diamonds in Fig. 2(a)) is mostly determined by that for the subnetwork induced by the dyadic collaborative grants (crosses in Fig. 2(a)). This is because dyadic collaborative grants are dominant in number; they account for approximately 67% of all the collaborative grants.
We next compare the rich clubs in the different subnetworks. We focus on the 50 institutions with the largest numbers of awards in the entire bipartite network of collaborative grants. For these institutions, we calculate Spearman's rank correlation coefficient in terms of the number of awards between each pair of the five bipartite networks (i.e., the entire network, s = 2 subnetwork, s = 3 subnetwork, s = 4 subnetwork, and s = 5 subnetwork). We show the rank correlation for all pairs of networks in Fig. 2(b). We find that the entire network is the most strongly correlated with the s = 2 subnetwork. This result is expected because the collaborations between s = 2 institutions are by far the largest contributor to the entire network. Figure 2(b) also indicates that the correlation is larger when the values of s for the two subnetworks are closer.
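For reference, Spearman's rank correlation between the award counts in two networks can be computed as the Pearson correlation of the ranks, with tied counts sharing their average rank. The sketch below is a self-contained illustration in plain Python; the award counts are invented, and the text does not say which implementation the authors used.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for t in range(i, j + 1):
            ranks[order[t]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented award counts of five institutions in two subnetworks.
awards_s2 = [120, 95, 80, 80, 40]
awards_s3 = [60, 70, 30, 35, 10]
print(spearman(awards_s2, awards_s3))
```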
This result led us to hypothesize that some institutions are good at securing collaborative grants involving fewer institutions, while other institutions are the opposite. To test this hypothesis, we classify the same 50 institutions using a principal component analysis (PCA). To run the PCA, we encode each institution into a four-dimensional vector composed of the normalized number of awards in collaborative grants with s = 2, s = 3, s = 4, and s = 5. Specifically, we scale each entry of the vector to have mean 0 and standard deviation 1. Then, we run the PCA on the normalized vectors using the scikit-learn library 71.
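The standardize-then-project step can be sketched as follows. The text uses scikit-learn; to keep the example dependent on NumPy alone, this sketch swaps in a singular value decomposition of the standardized matrix, which should give the same components up to sign. The input matrix is synthetic (random Poisson counts) and the function name is an assumption.

```python
import numpy as np


def pca_standardized(counts):
    """Standardize each column (mean 0, population std 1), then run PCA via SVD.

    Returns the scores, the principal axes (rows of `components`), and the
    fraction of variance explained by each component.
    """
    X = np.asarray(counts, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = S**2 / np.sum(S**2)
    scores = Z @ Vt.T
    return scores, Vt, explained


# Synthetic stand-in for the 50 x 4 matrix of award counts with s = 2, 3, 4, 5.
rng = np.random.default_rng(0)
counts = rng.poisson(lam=[40, 20, 10, 5], size=(50, 4))

scores, components, explained = pca_standardized(counts)
print(explained)        # fraction of variance per component
print(components[:2])   # loadings of the first two components
```

The rows of `components` play the role of the eigenvectors reported for PC1 and PC2, and the first two columns of `scores` give the two-dimensional coordinates of each institution.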
We show the PCA result in Fig. 2(c). Each data point is labeled with the institution's rank in terms of the number of awards in collaborative grants that the institution has received; see Table S2 for the names of the 50 institutions. The first two principal components, denoted by PC1 and PC2, explain 74.7% and 13.1% of the variance of the data, respectively. Therefore, we conclude that the two-dimensional representation of the institutions shown in Fig. 2(c), where the two axes correspond to PC1 and PC2, is sufficient. The eigenvector corresponding to PC1 is (0.53, 0.54, 0.49, 0.44), which indicates that the number of awards from collaborative grants of any size of collaboration contributes approximately equally to PC1. As expected, institutions with a higher rank (i.e., data points labeled with a smaller number in Fig. 2(c)) tend to have a higher PC1 value. The eigenvector corresponding to PC2 is (−0.25, −0.28, −0.22, 0.89). Therefore, PC2 separates the 50 institutions into those frequent in collaborations with smaller numbers of institutions (i.e., 2 ≤ s ≤ 4) and those frequent in collaborative grants with s = 5. For example, University of California, Berkeley ranks 11th, 11th, 3rd, and 1st in the s = 2, s = 3, s = 4, and s = 5 subnetworks, respectively; University of Washington ranks 6th, 2nd, 9th, and 2nd in the same four subnetworks; University of Colorado at Boulder ranks 8th, 7th, 4th, and 4th; University of California, Los Angeles ranks 24th, 29th, 22nd, and 7th; University of California, Santa Barbara ranks 22nd, 38th, 42nd, and 8th; Rice University ranks 45th, 44th, 82nd, and 6th. The latter three universities have a much higher rank in the subnetwork with s = 5 than in the entire network. The behavior of institutions with a low PC2 value is the opposite. For example, University of Illinois at Urbana-Champaign ranks 1st, 1st, 8th, and 10th in the s = 2, s = 3, s = 4, and s = 5 subnetworks, respectively; University of Michigan, Ann Arbor ranks 3rd, 3rd, 5th, and 17th in the same four subnetworks; Massachusetts Institute of Technology ranks 5th, 9th, 12th, and 28th; Duke University ranks 18th, 18th, 34th, and 55th; Virginia Polytechnic Institute and State University ranks 32nd, 19th, 14th, and 53rd.

Research impact of the institutions with the largest numbers of collaborative grants
We now investigate the research impact of the institutions with the largest numbers of awards from collaborative grants. Note that these institutions form putative rich clubs. For comparison, we also analyze the research impact of the institutions with the largest numbers of awards from single-institution grants. Here we analyze the data separately for all the collaborative grants, the collaborative grants comprising s ∈ {2, 3, 4, 5} institutions, and single-institution grants. First, we show the rank plot of the number of awards received by the institution, k, in Fig. 3(a). The figure indicates that k is skewed toward the top-ranked institutions. For example, the top 20% of institutions obtained approximately 82% of the awards in collaborative grants and approximately 79% of the awards in single-institution grants. This result is consistent with the concentration of research funding in top-ranked institutions observed in the NSF 72, the National Institutes of Health grants in the US 53,73, and the Engineering and Physical Sciences Research Council grants in the UK 31. We also found that the top-ranked institutions dominate the distribution of awards less in the case of collaboration with a larger number of institutions (i.e., larger s). For example, the top 20% of institutions account for approximately 79% of the awards in single-institution grants (i.e., s = 1), 76% for s = 2, 70% for s = 3, 60% for s = 4, and 53% for s = 5. To be more quantitative, we have calculated the coefficient of variation for the distribution of the number of awards, which is equal to 1.75, 1.67, 1.49, 1.17, and 0.95 for s = 1, s = 2, s = 3, s = 4, and s = 5, respectively; the Gini coefficient is 0.74, 0.72, 0.66, 0.56, and 0.46 for s = 1, s = 2, s = 3, s = 4, and s = 5, respectively.
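The three concentration measures used here (share held by the top 20% of institutions, coefficient of variation, and Gini coefficient) can be sketched as follows. The exact estimators are assumptions, since the text does not specify them: this sketch uses the population standard deviation for the coefficient of variation and the standard sorted-sum formula for the Gini coefficient. The sample award counts are invented.

```python
import statistics


def top_share(counts, fraction=0.2):
    """Share of all awards held by the top `fraction` of institutions."""
    ordered = sorted(counts, reverse=True)
    n_top = max(1, round(fraction * len(ordered)))
    return sum(ordered[:n_top]) / sum(ordered)


def coefficient_of_variation(counts):
    """Population standard deviation divided by the mean (assumed estimator)."""
    return statistics.pstdev(counts) / statistics.fmean(counts)


def gini(counts):
    """Gini coefficient via G = 2 * sum(i * x_(i)) / (n * sum(x)) - (n + 1) / n,
    where x_(i) is the i-th smallest value and i starts at 1."""
    ordered = sorted(counts)
    n = len(ordered)
    weighted = sum((i + 1) * value for i, value in enumerate(ordered))
    return 2 * weighted / (n * sum(ordered)) - (n + 1) / n


# Invented award counts for ten institutions.
awards = [10, 1, 1, 1, 1, 1, 1, 1, 1, 1]
print(top_share(awards), coefficient_of_variation(awards), gini(awards))
```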
Second, we show the normalized impact of the institutions as a function of k in Fig. 3(b). We find that the institutions with approximately 100 or more awards from collaborative grants tend to be less productive in the per-dollar sense than those with fewer awards. Similarly, the institutions with approximately 100 or more awards from single-institution grants tend to be less productive than those with fewer awards. This result of the diminishing per-dollar productivity or impact at the institution level is consistent with the previous results 51–54. Figure 3(b) also indicates that similar diminishing research impact is present for collaborative grants of different collaboration sizes, s ∈ {2, 3, 4, 5}.

Research impact of the collaborative grants within rich clubs
Given the results shown in Fig. 3, rich clubs may be detrimental to research impact because a rich club is a set of high-degree nodes, i.e., institutions with many awards. However, Fig. 3 does not imply that collaborative grants among rich-club institutions are not productive because Fig. 3 does not examine collaborations among rich-club institutions. Therefore, we now investigate possible associations between the rich clubs in collaborative grant networks and research impact. We first examine the impact of the collaborative grants within rich clubs, which are exclusively composed of the institutions with the largest numbers of awards. We denote by $U_{>k,\geq p}$ the set of collaborative grants in which the fraction of the institutions with more than $k$ awards from collaborative grants is at least $p$. We compare the impact of the collaborative grants, $U_{>k,\geq p}$, for different $p$ values. We show in Fig. 4 the normalized impact of the collaborative grants in $U_{>k,\geq p}$ for different values of $k$ and $p$ for the entire network and the subnetwork of each collaboration size s ∈ {2, 3, 4, 5}. For the entire network, Fig. 4(a) indicates that the collaborative grants in $U_{>k,\geq p}$ with p = 1 and large k tend to be more productive than the expectation for the participating institutions. The maximum value of the normalized impact is approximately 1.15 at k = 159. The figure also indicates that the collaborative grants in $U_{>k,\geq p}$ with p = 1 for a given value of k tend to have a higher normalized impact than those in $U_{>k,\geq p}$ with 0 < p < 1. For example, at k = 159, the normalized impact is 1.15, 1.10, 1.00, 0.97, and 0.98 for p = 1, p = 0.8, p = 0.6, p = 0.4, and p = 0.2, respectively. Figures 4(b)-(e) indicate that the normalized impact for $U_{>k,\geq p}$ with p = 1 tends to be larger than 1 at large k values in the subnetwork with s ∈ {2, 3, 4, 5}. This result is qualitatively the same as that for the entire collaboration network shown in Fig. 4(a). Figures 4(b)-(e) also indicate that the normalized impact for $U_{>k,\geq p}$ with p = 1 tends to be larger than that for $U_{>k,\geq p}$ with 0 < p < 1 in each subnetwork with s ∈ {2, 3, 4, 5}. By definition, the normalized impact of the single-institution grants is exactly equal to 1 for any k. Altogether, these results indicate that collaborations among the institutions with the largest numbers of collaborative grants tend to be productive, not because such institutions tend to be strong in research but because they collaborate.
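Selecting the set of grants $U_{>k,\geq p}$ from an edge list can be sketched as follows. The toy edge list (four institutions, three grants) and the function name are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def grants_with_rich_fraction(edges, k, p):
    """Grants in which at least a fraction p of the institutions have more than k awards."""
    degree = Counter(v for v, _ in edges)
    members = defaultdict(set)
    for v, u in edges:
        members[u].add(v)
    return {
        u
        for u, vs in members.items()
        if sum(degree[v] > k for v in vs) / len(vs) >= p
    }


# Toy edge list of (institution, grant) pairs.
edges = [
    ("v1", "u1"), ("v2", "u1"),
    ("v1", "u2"), ("v2", "u2"), ("v4", "u2"),
    ("v2", "u3"), ("v3", "u3"), ("v4", "u3"),
]
print(sorted(grants_with_rich_fraction(edges, k=1, p=1.0)))
print(sorted(grants_with_rich_fraction(edges, k=1, p=0.6)))
```

With k = 1, three of the four institutions have more than one award. Two grants consist only of such institutions and are selected at p = 1, whereas the third grant, in which two of its three institutions qualify, is added once p is lowered to 0.6.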
To further investigate the association between rich clubs and research impact, we investigate relationships between the normalized rich-club coefficient, $\rho(k)$, and the normalized impact of the collaborative grants that are exclusively composed of the institutions in the rich club. We denote by $U_{>k}$ the set of collaborative grants that are exclusively composed of the institutions with more than $k$ awards from collaborative grants. Note that $U_{>k}$ is equivalent to $U_{>k,\geq p}$ with p = 1. If $\rho(k)$ is sufficiently larger than 1, then $U_{>k}$ is the set of collaborative grants contained in the rich club. Therefore, if rich clubs are associated with high research impact, the normalized impact of $U_{>k}$ should be larger than 1 for the $k$ values at which $\rho(k)$ is sufficiently larger than 1.
We show in Fig. 5 the plots of $\rho(k)$ and the normalized impact of $U_{>k}$ against $k$, separately for the entire network and the subnetworks with s ∈ {2, 3, 4, 5}. The figure indicates that the normalized impact of $U_{>k}$ tends to be larger than 1 if $\rho(k)$ is larger than 1 in the entire network (Fig. 5(a)). For example, $\rho(k)$ is largest at k = 144. The institutions with more than 144 awards collaborate with each other approximately 21% more densely than in a randomized network (i.e., $\rho(144) \approx 1.21$). The impact of the collaborative grants in $U_{>144}$ is approximately 14% higher than expected from the average impact of the institutions participating in a collaborative grant in $U_{>144}$. However, at k = 299, the rich club is absent (i.e., $\rho(299) \approx 0.67$), and the impact of the collaborative grants in $U_{>299}$ is 30% lower than the expectation for the participating institutions. The Pearson correlation coefficient between $\rho(k)$ and the normalized impact, where we regarded a pair of these two quantities for a value of $k$ as a data point, is equal to r = 0.85 (P-value is less than 0.001). We also found a significant positive correlation between these two quantities for the subnetwork with s = 2 (r = 0.89, P < 0.001; see Fig. 5(b)), s = 4 (r = 0.61, P < 0.005; see Fig. 5(d)), and s = 5 (r = 0.98, P < 0.001; see Fig. 5(e)). For the subnetwork with s = 3, while we found a negative correlation (r = −0.81, P < 0.001; see Fig. 5(c)), the normalized impact tends to be larger than 1 if $\rho(k)$ is larger than 1 for approximately 1 ≤ k ≤ 45.

Discussion
We investigated higher-order rich-club phenomena in networks of collaborative research grants. To this end, we developed a method to detect rich clubs in bipartite networks. We observed rich clubs in both the entire bipartite network and the subnetworks induced by the collaborative grants with a given number of collaborating institutions, s, where s ∈ {2, 3, 4, 5}. The subnetworks with s = 3, 4, and 5 had stronger rich clubs than that with s = 2. Regarding the performance of rich clubs, we found that the collaborative grants within rich clubs tend to have higher per-dollar impact than the average impact expected for the institutions participating in the collaboration. We emphasize that the higher impact of rich clubs is a genuine effect of collaboration because the impact of the single-institution grants is normalized to 1. These results support our hypothesis that collaborations among institutions in rich clubs are productive.
Our results extend the findings on the rich clubs in grant collaboration networks shown in a previous study 31 in the following two aspects. First, we found that some collaboration-rich institutions tend to densely collaborate with each other in research grants involving fewer institutions, whereas other collaboration-rich institutions tend to do so in research grants involving more institutions. One factor underlying this phenomenon may be the strategies of individual institutions regarding interdisciplinary research projects. Evidence suggests that interdisciplinary research projects are less likely to attract funding in the short term 74, whereas they positively contribute to long-term funding performance 75. This tendency may affect the funding strategies of individual researchers and institutions, which in turn may affect, for the institution to which the researchers belong, the distribution of collaboration size in terms of the number of institutions. Note that Ma et al. employed the one-mode projection, and therefore the impact of the size of collaboration was not a question that they focused on in their study. Second, the benefits of rich clubs to the per-dollar research impact seem to come from collaborations among the institutions that belong to the rich clubs. Ma et al. indicated that the rich clubs attract a large number or monetary amount of awards and tend to produce a large number of papers with high quality 31. In contrast, our results indicate that collaborations among the institutions in rich clubs are productive in terms of the per-dollar research impact, whereas the institutions with many collaborations are not particularly productive by themselves.
The generality of rich clubs in grant collaboration networks deserves further investigation. For example, the presence of rich-club phenomena and their association with research impact may be stronger in some research disciplines than in others. Our results do not guarantee the association between rich clubs and research impact across different disciplines. In fact, the strength of the correlation between productivity and institutional collaborations in writing papers substantially depends on research disciplines 76. Rich clubs and their relevance to research impact may also depend on funding agencies. The National Institutes of Health financially encourages multiple investigators with expertise in different health profession fields to work together in research projects 77, which may lead to rich-club phenomena in networks in which the node is a department or institution. Moreover, higher-order rich-club phenomena in grant collaboration networks may depend on the definition of the node. In fact, Ma et al. reported that a British collaboration network among investigators, in which an edge represents two investigators' co-funded research projects, does not have rich clubs 31.
We did not address causality between rich clubs and research impact. Furthermore, the higher impact of the collaborative grants within the rich clubs may be associated with various properties of the member institutions other than the density of their collaborations, including the internationality of the faculty 78, departmental and institutional size 79, grant type 42, and funding support from industries 80, which may affect research impact. Additionally, there are other forms of dense mesoscopic structure of grant collaboration networks, the most famous of which is probably the community structure. Such other forms of dense mesoscopic structure may also affect research impact. Examples of collaborations that may form such mesoscopic or community structures include teams composed of private universities that may be subsidized by their financial resources 23, collaborations among investigators from different departmental affiliations 30, and collaborations between universities and industries 81. Moreover, many co-authorship networks among authors also show structures including the community structure and rich clubs 1,33,82. The present method is also applicable to the investigation of higher-order rich-club phenomena in co-authorship networks. Further exploring the associations and causality between mesoscopic structure of networks involving higher-order interaction and research impact for various types of scientific collaborations warrants future work.

Figure 1. An example of three collaborative grants and the corresponding bipartite network of institutions and collaborative grants.

Figure 2. Rich-club phenomena in networks of grant collaboration. (a) Normalized rich-club coefficient ρ(k) as a function of the number of awards that the institution received from collaborative grants. We measured ρ(k) for the entire network (labeled "All collaborations"), the subnetwork only composed of collaboration between s = 2 institutions, that with s = 3, s = 4, and s = 5. In this figure, Fig. 3(b), Fig. 4(a)-(e), and Fig. 5, we omit data points for a given value of k if there are fewer than five instances contributing to the data point. (b) Rank correlation matrix between the different networks, where the rank is in terms of the number of awards in collaborative grants that the institution has received. We used the top 50 institutions in the entire network to calculate the rank correlation. (c) PCA result for the 50 institutions with the largest numbers of awards in the entire network. The number indicates the institution's rank in the entire network. See the Supplementary Materials for the names of the 50 institutions.

Figure 4. Advantage of collaborations among the award-rich institutions. We plot the normalized impact of the collaborative grants in each of which the fraction of the institutions receiving more than k awards from collaborative grants is at least p. We denote by V_{>k,≥p} the set of the institutions participating in at least one collaborative grant in U_{>k,≥p}. (a) Entire network. (b) Subnetwork with s = 2. (c) Subnetwork with s = 3. (d) Subnetwork with s = 4. (e) Subnetwork with s = 5.

Figure 5. Overlay of the rich-club coefficient and research impact of the collaborative grants. Each panel shows the normalized rich-club coefficient and the normalized impact as a function of the number of awards k that the institution has received from collaborative grants. (a) Entire network. (b) Subnetwork with s = 2. (c) Subnetwork with s = 3. (d) Subnetwork with s = 4. (e) Subnetwork with s = 5.