Introduction

Research collaborations have rapidly become a widespread mode of scientific practice in recent decades (Bozeman & Youtie, 2017). Looking at the micro-, meso- and macro-levels of the science system, the explosive spread of collaborative research can be attributed to various causes: by participating in research collaborations, scientists can increase the quantity or quality of their research output (Andrade et al., 2009), enhance their reputation, visibility and recognition in the scientific community (Barabási & Wang, 2021), or gain competitive advantages in the context of an increasingly dynamic allocation of funding (Crow & Dabars, 2019). However, research collaborations are also attractive for the researchers’ home organisations: they secure the inflow of research funds, enhance their reputation and attract qualified (young) scientists (Abramo et al., 2014). Finally, through the pooling of competencies, perspectives, experience and resources, collaborative research enables large, complex research topics that transcend disciplinary boundaries to be addressed (Aboelela et al., 2007). In addition to the manifold advantages of collaborative research, there is also a wide range of challenges and risks associated with it: if, for example, the scientists in a research team do not succeed in bindingly agreeing on shared and clear goals, organising joint work processes in a goal-oriented manner, communicating on content-related issues and generating commitment, social cohesion and trust, the collaboration risks stalling and joint success can be jeopardised (Porter & Birdi, 2018).

Although research collaborations are not a new phenomenon in scientific practice (Vinck, 2010) and the demand for empirically stable knowledge regarding the preconditions for successful research collaboration remains high (Aboelela et al., 2007), there is surprisingly little systematic knowledge available on the conditions contributing to the success of research collaborations. The state of research is characterised by anecdotal, theoretical and qualitative studies that are mostly not generalisable (Shrum et al., 2007). As a result, a wide range of evidence circulates on how successful research collaboration should be designed. However, it remains unclear how this evidence can be put into a hierarchical order with regard to its significance for the success of the collaboration. This is where the present article comes in. On the basis of representative survey data from the research project Determinants and effects of cooperation in homogeneous and heterogeneous research clusters (DEKiF), the question of which cooperation practices and conditions have the most significant influence on the success—defined as the extent to which the goals communicated to the funding agency are achieved—of research clusters is explored with the help of a Random Forest (Breiman, 2001). The empirical reference point of the paper is collaboration between principal investigators and spokespersons in ongoing and completed research clusters of the coordinated programmes and excellence clusters funded by the German Research Foundation (DFG).

The paper is structured as follows: First, a basic definition of the term research collaboration is presented, from which two different types of research collaboration are derived. Based on the current state of research, an overview of the types of influences that various intra- and interpersonal factors exert on the success of a research cluster is given. Following the outline of the data basis, the applied method of analysis and the operationalisations, the results are presented. The article concludes with a discussion of the central findings, the theoretical and practical implications as well as the identification of limitations and the need for further research.

Research collaborations

Because disparate types and facets of research collaborations have been studied in various contexts over the past decades, the term research collaboration is used inconsistently in the literature (Bukvova, 2010). A widely accepted definition is provided by Laudel (2002).

What is a research collaboration?

Following Laudel (2002), a research collaboration can be defined as that situation in which n > 1 scientists functionally relate and coordinate their research activities in order to achieve specific research and/or collaboration goals. The scientists involved in a research collaboration pursue research and/or collaboration goals by participating in a collaboration. Research goals are directed towards the production of knowledge, collaboration goals towards other interests. Only if the research goals of all scientists involved in a collaboration are identical is there a common research goal. If the research goals of the cooperating scientists are not identical, there is only a common cooperation goal. A common research goal is therefore not a necessary prerequisite for a research collaboration. Scientists can also cooperate because their cooperative actions are compatible with other interests, e.g. the fulfilment of social norms, reputational gains or integration into the scientific community. Finally, Laudel assigns the term research collaboration exclusively to those research actions that are based on personal interactions between collaborating scientists: “Thus, formal communication and references to other scientists are regarded as a different phenomenon. While all scientific research is collaborative in certain respects because it makes use of the work of other scientists, collective knowledge production based on formal communication differs from immediate collaboration in its social dynamics, especially in the way actions are coordinated” (Laudel, 2002, p. 5).

What types of research collaboration exist?

Laudel (1999, 2002) makes a global distinction between two different types of research collaboration: (1) in the context of collaborations involving a division of labour, all researchers involved in a collaboration share a common research goal. The researchers reach this goal by making independent, creative contributions in the context of closely interconnected research activities. Laudel assigns a prominent position to collaborations involving a division of labour because only in this type of collaboration do all collaboration partners make creative contributions in a joint research process: “Since research is by its nature a creative activity, it seems justified to apply the term 'division of labour' only to those collaborations in which both partners make creative contributions” (Laudel, 1999, p. 38).Footnote 1 Collaborations involving a division of labour are extremely demanding: they require a high degree of coordination between the researchers involved and the research processes for which they are responsible, and at the same time presuppose constant agreement on the common collaboration goals.

A supporting collaboration (2) differs from a collaboration involving a division of labour to the extent that the collaboration partners support the achievement of their collaboration partners' research goals by taking on non-creative routine work or by granting access to material or immaterial resources. Accordingly, a supporting collaboration occurs when researchers delegate routine activities such as measurements, substance preparation or the exploration of new methods to third-party researchers. Similarly, one can speak of a supporting collaboration when collaboration partners grant each other mutual access to research infrastructures or support each other in the use of new research equipment and infrastructures (Laudel, 2002). According to Laudel, supporting collaboration thus exists even if the researchers involved in a collaboration do not pursue common research goals: collaboration partner A, for example, pursues individual research goals that are not shared by its collaboration partner B. Collaboration partner B nevertheless supports the achievement of the research goals of his collaboration partner A because he pursues other interests with his support, e.g. the fulfilment of social norms, access to resources or reputational gains (Laudel, 2002).
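The distinction between the two types can be made concrete in a small sketch. The Python fragment below is an illustration only and is not part of Laudel's work or of the DEKiF instruments; the class, field and function names (`Partner`, `research_goal`, `creative_contribution`, `classify`) are hypothetical. It assumes that a collaboration involves a division of labour when all partners share one research goal and all contribute creatively, and it treats every other case as a supporting collaboration, which is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partner:
    """One scientist in a collaboration (hypothetical illustration)."""
    name: str
    research_goal: str           # the knowledge this partner wants to produce
    creative_contribution: bool  # False for routine work or resource access


def classify(partners: list[Partner]) -> str:
    """Assign a collaboration to one of the two types described above."""
    if len(partners) < 2:
        raise ValueError("a research collaboration needs n > 1 scientists")
    # A common research goal exists only if all research goals are identical.
    common_goal = len({p.research_goal for p in partners}) == 1
    all_creative = all(p.creative_contribution for p in partners)
    # Simplifying assumption: whatever is not a division of labour
    # is treated as a supporting collaboration.
    if common_goal and all_creative:
        return "division of labour"
    return "supporting"


a = Partner("A", "explain effect X", True)
b = Partner("B", "explain effect X", True)
c = Partner("C", "characterise material Z", False)  # runs routine measurements for A

print(classify([a, b]))  # division of labour
print(classify([a, c]))  # supporting
```

The sketch ignores the further requirement that collaboration rests on personal interaction between the scientists involved.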

Factors influencing successful research collaborations

Since the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) only allows the establishment and continuation of a Research Collaboration (RC) if the Principal Investigators (PIs) involved are able to credibly demonstrate that they have common goals, which they achieve in close collaboration and with interlocking, creative research contributions (German Research Foundation, 2010, 2015, 2019, 2020, 2021b), it can be assumed that the collaboration between the PIs and the spokespersons is essentially characterised by the division of labour.

Particularly in view of the non-routine research processes arising in a collaboration involving division of labour, various challenges arise that stand in the way of successful research collaboration and are multiplied due to the decentralised organisational structures of RCs as well as the low level of control and management of the PIs’ subproject-internal research work (Defila et al., 2008). However, RCs can maximise their chances of success by taking various intra- and interpersonal factors into account. In this context, research suggests that the characteristics of the participating PIs (1) and spokespersons (2), the collaborative development of shared goals and questions (3), interconnecting the subprojects’ research (4), the synthesis of their research results (5), and the team climate (6) of an RC are particularly important (Table 1).

Table 1 Intra- and interpersonal factors of research collaboration

The six dimensions of intra- and interpersonal factors mentioned are not an exhaustive list of possible success factors for RCs. However, the results of the preliminary study of the research project DEKiF and extensive literature research indicate that the six dimensions of the two factors are highly relevant for the performance of a variety of cluster constellations (small and large, interdisciplinary and cross-disciplinary, virtual and co-present RCs in all disciplines).

Intrapersonal factors

Following Misra et al. (2011), intrapersonal factors are understood as those influences on the design and success of a research collaboration that can be traced back to personality traits and characteristics of the members of an RC. The following section focuses on intrapersonal factors associated with two types of RC members: (1) the characteristics and abilities of the PIs and (2) the characteristics and abilities of the spokesperson of an RC.

Characteristics and skills of PIs

The set of qualifications and skills for which the PIs of an RC are hired determines in various ways the foundation for the subsequent collaborative success of an RC: in order for an RC to achieve its overarching, mostly interdisciplinary (Supplementary Fig. 7) goals, the (complementary) expertise and scientific knowledge (KRS1, KRS7)Footnote 2 of the participating researchers are of central relevance to its collective success (Twyman & Contractor, 2019): their combination enables the creation of synergies that allow an RC to achieve research outcomes that are beyond the disciplinary knowledge corpus and methodological inventories of individual PIs (Defila et al., 2006). By creating synergies, highly complex research that exceeds the capacity of solitary or monodisciplinary research becomes processable in a time- and resource-efficient manner (Olechnicka et al., 2019).

However, since an RC is composed of individuals with different (possibly conflicting) personalities, it is not enough to combine highly competent researchers in one research team (Bennett & Gadlin, 2019). An RC is a complex, purpose-driven, social entity (Salas et al., 2017) whose potential can only be realised if the personality traits of all participants exhibit a sufficient fit (KRS5) (Stokols et al., 2008). Only if the personalities of an RC’s PIs harmonise fundamentally with each other, can professional cultural differences be bridged, divergent interests and benefit calculations be reconciled without high friction losses, work processes be organised in a goal-oriented manner, and cooperation problems be identified early on and jointly overcome (Fiore et al., 2015).

In addition to content-related and personal criteria, strategic selection criteria are also important for the success of an RC. For example, a PI can also be important for the success of an emerging RC even if there is no compelling, content-related need for their membership in the RC, but their participation guarantees the RC exclusive access to special research infrastructures (KRS4) that are essential for achieving the intended research goals (Abramo et al., 2017). The same applies to the selection of particularly reputable scientists (KRS3): not only can they be expected to achieve content-related excellence, their participation in an RC can additionally spur the other PIs in the RC to perform at their respective best (Bozeman et al., 2001). Thus, the selection of a scientific superstar may also be strategically advisable because they can act as a motivator and thus as a performance optimiser for an RC (Balsiger, 2005).

Finally, recruiting PIs who are particularly well connected in the (inter)national scientific community (KRS8) can also have various strategic advantages: on the one hand, PIs with far-reaching network connections can help an RC during its initiation phase to recruit particularly sought-after (young) scientists for the RC (Barabási & Wang, 2021). On the other hand, the social capital of a potential PI can also influence the “number of people whose support can be counted on for research proposals […] on the reviewer side” (Münch, 2007, p. 38).Footnote 3 In this respect, the application of strategic selection criteria in the staffing of an RC can tip the scales in various ways for the establishment, continuation and success of an RC.

In contrast to these directly observable characteristics, which can be applied and variously weighted as criteria for the selection of potential PIs during the formation of an RC, the subsequent cooperative behaviour of the PIs depends on a variety of factors and is more difficult to anticipate in advance (Hall et al., 2008). However, the PIs’ willingness to collaborate is constitutive for the success of the later coordination and implementation of the PIs’ collaborative knowledge production at the cluster level: only if the PIs feel responsible for achieving the goals of the RC (ARB4), if they are committed to the concerns of the RC at the cluster level (PRM1), if they cooperate across subprojects (PRM2) and if they reliably deliver their contributions necessary for achieving the common goals (PRM13), can an interconnection of the subproject-internal research work succeed and thus the overarching goal of any collaboration involving a division of labour be achieved: the integration of heterogeneous scientific knowledge across subprojects and thus the generation of (cross-disciplinary) research results (Blanckenburg et al., 2005).

The central strength of collaborative research is also one of its greatest challenges (O’Rourke et al., 2019): on the one hand, the heterogeneity of the competences, scientific knowledge and perspectives brought in by the PIs enables the solution of far-reaching and/or cross-disciplinary scientific problems, but on the other hand it makes it difficult for the PIs to understand each other and, as a consequence, to achieve the common research and collaboration goals of an RC (Derry et al., 2005). For successful collaboration, it is therefore essential that the PIs of an RC are willing to devote sufficient time and energy to comprehending (DIT3) and engaging with methods, ways of thinking and perspectives that are unfamiliar to them (DIT1) (Blanckenburg et al., 2005). Conversely, it is equally important for the success of content-related communication processes that the PIs are willing and able to make their own discipline-specific axioms, methods, ways of thinking and perspectives understandable to their collaboration partners (possibly from outside the discipline) (DIT2) (Cooke & Hilton, 2015). Only if the PIs in a research cluster are willing and able to communicate on a content-related level, can the early interconnection of subproject-internal research work, the integration of subproject-internal research results and thus the generation of synergies succeed (Holbrook, 2013).

Characteristics and skills of the spokesperson

In addition to the characteristics of the PIs, the skills and qualifications of the spokespersons also play an important role in the success of an RC (Salazar et al., 2019). On the one hand, these are science-unspecific and necessary for the successful management and control of all purposeful, social systems (Defila et al., 2008): in order to achieve common goals, the spokesperson of an RC must be able to motivate (cross-subproject) cooperation (LEG4) (Defila et al., 2006), support the members of the cluster in resolving conflicts (LEG6) (Blanckenburg et al., 2005), adequately address the varying information needs of the members of the cluster and project groups (LEG5) (Beer et al., 2020), be open to criticism, suggestions and proposals from PIs (LEG8) (Blanckenburg et al., 2005), act in the interests of the cluster (LEG7) (Salazar et al., 2019) and create an open and participatory environment for PIs to work together at the cluster level (LEG9) (Di Giulio et al., 2008).

On the other hand, because the medium-term continuation of an RC can only be ensured if the goals communicated to the research funder are achieved, a pragmatic approach to obstacles, unforeseen situations and the resulting situational challenges in the course of the implementation phase is of existential importance for the success of an RC: in particular, the steering action of the spokesperson must therefore always be committed to an “opportunistic rationality” (Knorr-Cetina, 1984, p. 124).Footnote 4 Unforeseen situations and obstacles must be addressed pragmatically, opportunity structures must be used strategically or the goals of the cluster must be modified in the face of changing, situational circumstances (Baurmann & Vowe, 2014). It is the responsibility of the spokesperson of an RC to anticipate the need for strategic steering processes at the cluster level in good time, to stimulate them and to moderate and accompany their implementation (Defila et al., 2006; Winter, 2019). For this reason, the leadership of an RC must have both high moderation skills (LEG3) and high strategic skills (LEG1).

Finally, the spokesperson of an RC must also bear the responsibility for ensuring that, in particular, the quality requirement of integrating the subproject-internal research results, which is central to cross-disciplinary collaborations, is met: “This responsibility has a thematic dimension (related to the researched topic), a methodological dimension (especially related to the cognitive processes of integration) and a social-communicative dimension (especially related to the people involved)” (Defila et al., 2015, p. 66).Footnote 5 In addition to strong moderation skills (LEG3), which are necessary to ensure communication between different disciplines and subprojects, the leader of an RC must also have a high level of technical competence (LEG2), e.g. in order to be able to contribute to the knowledge production, identify inconsistencies with regard to subproject-internal research work and results, or prepare synthesis procedures (Defila et al., 2015).

Interpersonal factors

Following Misra et al. (2011), interpersonal factors are understood as those influences on the success of RCs that can be traced back to the interactions of an RC’s PIs and spokespersons. The following section focuses on four tasks involving interpersonal factors: (1) the development processes of common research and cooperation goals, (2) the interconnection of the subprojects, (3) the synthesis of the research results of the subprojects and (4) the team climate generated by group dynamics.

Shared research goals

In view of the dynamics and openness of all research processes, the decentralised organisational structures of RCs, the flat hierarchies between PIs and the resulting low controllability and steerability of subproject-internal research work (Di Giulio et al., 2008), the common goals of an RC function as its central, organisational guardrail (Defila et al., 2006): they help to identify changes necessitated by obstacles, unforeseen situations or situational challenges and to adapt the subproject-internal research work to them (Cooke & Hilton, 2015). However, the goals of an RC can only bring about a steering and coherence-fostering goal commitment among the PIs if they are accepted by all cooperation partners. Accordingly, it is important to prevent particular interests from dominating the development or updating of common research goals (Defila et al., 2006). Therefore, in order to ensure a sustainable communitarisation of goals among all PIs, all represented disciplines must be given equal consideration in the development of joint research and cooperation goals (ENL1) (O’Rourke et al., 2019). In particular, collaborations involving the division of labour, which are strongly interconnected in terms of content and for whose success the integration of the research results of the subprojects is constitutive, must furthermore align the subprojects’ research goals with those of the RC (ENL4) (Defila et al., 2006). In this way, a common and binding fixed point is created against which the subproject-internal research work can be aligned. It is of central importance that the content-related, methodological and theoretical concepts of the RC’s subprojects are coordinated with regard to the common collaborative goals (ARB9, IN21). This allows the research work of the PIs’ subprojects to be continuously interconnected and the research results achieved in the subprojects to be consistently integrated at cluster level (O’Rourke et al., 2019).

Interconnection of the subprojects

In order for the common research questions of an RC to be answered coherently across subprojects, it is necessary that the subprojects of an RC remain in close, continuous exchange (IN26) and that the research work relevant to the achievement of the RC’s goals within the subprojects is (continuously) aligned across subprojects (ARB9) (Di Giulio et al., 2008). Without a continuous, mutual adjustment of the subprojects’ work, the interconnectedness of the subproject-internal research work threatens to successively unravel. As a result, the RC’s subprojects gradually drift apart, which in turn jeopardises the synthesis capacity of the subprojects’ research results (Defila et al., 2006). In this context, it is constitutive for successful interconnectedness of research work in the subprojects that the PIs can rely on a shared, epistemic foundation in the context of their collaboration (Baurmann & Vowe, 2014). Accordingly, an RC’s PIs must have shared terminology (IN24), align their research work with a common theoretical basis (IN23) and apply consistent methods with a view to the integrability of the research results (IN22) (Thompson, 2009).

Synthesis

Only by agreeing on research or collaboration goals that are acknowledged by all PIs and by interlinking the subproject-internal research work, can the central promise of collaborative research be fulfilled (O’Rourke et al., 2019): the synthesis of the subprojects’ research results that enables the PIs to answer the RC’s research questions jointly and consistently (Defila et al., 2006). For synthesis building, following the widely accepted (e.g. Andersen & Wagenknecht, 2013; Defila et al., 2015; Krott, 1996; Pohl et al., 2007) typology developed by Rossini and Porter (1979), four combinable procedure types of synthesis building can be distinguished:

Under procedure type (1) project management, the integration of research findings is delegated to a single person or a small group (IN29). The advantage is that a single person or a small group can carry out the coherent integration of the subprojects’ research results in a more time-efficient and conflict-free manner than would be possible with more consensus-oriented procedure types (see below). In the procedure type (2) group, the integration of the research results is carried out jointly by all PIs (IN28). The subprojects’ research results are prepared for synthesis at group level in such a way that they are comprehensible to all PIs and their diffusion can take place jointly. The advantage of this type of procedure lies in the knowledge sharing between the group members and the high degree of uniformity this creates. Within the framework of procedure type (3) negotiation, the integration of research results takes place through continuous exchange between the subprojects (IN26), which is ensured by an iterative linking of the subprojects’ research work. The central advantage of the procedure type negotiation lies in the successive, decentralised and thus less complex and organisationally easier synthesis formation. Within the framework of procedure type (4) system, the integration of the subprojects’ research results is ultimately ensured by the uniform use of common theories (IN22) and coherent methods (IN23) throughout the cluster. In this way, the subprojects’ research results can be brought together efficiently and without contradictions and interpreted jointly (Defila et al., 2006).

Team climate

The term team climate can generally be used to describe the atmosphere of cooperation in an RC (N. Anderson et al., 2000). An RC’s team climate is constituted by the internal practices, procedures, behaviours, norms or expectations that shape the collaboration of the RC’s PIs in various ways and thereby influence the RC’s performance (Yuan et al., 2008). Following West (1996), three key dimensions of team climate are distinguished below, which influence the success of an RC in different ways: vision, participative safety and task orientation.

Vision

To enable an RC to add value and to bridge PIs’ divergent interests, the PIs involved in it must have common research and cooperation goals. These must be condensed into a clear vision shared by all PIs, which guides cooperation at the cluster level and motivates the participants to the highest degree (Antoni, 2000). Through goals shared by all PIs, the human resources of an RC can be bundled, the coherence of the subproject-internal research work can be ensured and synergies can be created. In order to mobilise the synergy potential of an RC, all PIs must agree on the common goals of the RC (PRM7) and feel sufficiently committed to them (ARB8) (Blanckenburg et al., 2005). “Imposed team visions, on the other hand, are usually ineffective for high performance and innovation in cooperation, i.e. visions must be allowed to develop. If they are not repeatedly reflected upon, modified and negotiated, they degenerate into ineffective footnotes of the past” (N. Anderson et al., 2000, p. 9).Footnote 6 Only if the goals communicated to the funding agency appear realistic and achievable in the long term (ARB6), can the RC’s PIs’ commitment, motivation and willingness to cooperate be maintained (Blanckenburg et al., 2005). If, on the other hand, unforeseen cooperation problems arise that make the achievement of common goals unrealistic, an RC’s PIs are at risk of not deriving adequate benefit from the time- and resource-consuming cooperation. As a result, the PIs may succumb to the notorious loss of motivation and thus a comprehensive erosion of their commitment to the overarching goals of the RC (John, 2019).

Finally, the goals of an RC can only become an effective fixed point for the subproject-internal research work if they are clearly defined (ARB7) and the responsibilities of the PIs and subprojects resulting from the goals are bindingly clarified (ARB5): only if there is sufficient clarity for all PIs at all times about what the subprojects for which they are responsible have to achieve in order to reach the common goals, can the PIs’ research activities be meaningfully related to one another and networked, and the goals of the RC thus be achieved jointly and efficiently (Derry & DuRussel, 2005).

Participative safety

“Participativeness and safety are characterised as a single psychological construct in which the contingencies are such that involvement in decision-making is motivated and reinforced while occurring in an environment which is perceived as interpersonally non-threatening” (West & Farr, 1990, p. 311). Following West and Farr, it can therefore be assumed that the more the PIs of an RC participate in relevant decision-making (ENS4, ENS5, ENT4, ENT5), the more they identify with the consequences resulting from the decisions (W. Anderson & West, 1998). For PIs, participation means being able to actively take part in an RC, to help shape it, to have a say in it. Participative decision-making is thus constitutive for PIs’ sustainable commitment to an RC (Chawla & Singh, 1998).

An atmosphere of mutual trust (PRM9) and a ‘we-feeling’ (ARB1) can be classified as central conditions for active participation in decision-making (W. Anderson & West, 1998): such a cohesive climate of collaboration enables PIs to take (social or epistemic) risks in the course of their collaboration (e.g., to develop risky ideas, to openly address and resolve simmering conflicts) (O’Rourke et al., 2019) without fear of negative sanctions or disparagement from their collaboration partners (Edmondson, 1999). Establishing and maintaining a cohesive collaborative climate encourages open debate and discussion between PIs and stimulates a free exchange of ideas (Hall et al., 2012). Similarly, a cohesive collaborative climate also facilitates the constructive resolution of relationship or task conflicts that arise (De Dreu & Weingart, 2003), the integration of intra-project research findings (Hall et al., 2012) and effective error management at the cluster level (Cooke & Hilton, 2015).

Participatory decision-making processes, in particular, repeatedly lead to lengthy, complicated and even unproductive decision-making processes (Di Giulio et al., 2008). Agile, directive decisions by an assertive spokesperson, individual PIs or a subgroup (ENS1, ENS2, ENS3, ENT1, ENT2, ENT3) can therefore also be important for an RC’s performance time and again (Hollaender, 2003). However, in order to avoid disagreements, misunderstandings or conflicts between the PIs, these must meet with the acceptance and common conviction of all PIs (Shinn, 1982): in the best case, therefore, participation consists of “[…] the process of decision-making being collectively negotiated, whereas individual decisions are also in the hands of individual members, depending on situational requirements and expertise. In this way, collective decision-making does not lead to paralysis of action, but to participatory optimised decision-making strategies that promote efficient cooperation” (N. Anderson et al., 2000, p. 11).Footnote 7

Task orientation

Following West and Farr (1990), task orientation can be understood as the efforts by the PIs in an RC to achieve their common goals to the highest degree. The PIs reach this goal through reflection, evaluation and control of the progress towards the goal as well as through constructive controversies. If, on the other hand, groupthink (Janis, 1972) sets in at the cluster level of an RC due to a lack of task orientation, i.e. if controversial disputes or unpleasant goal evaluations are avoided due to disinterest, social desirability, internal conformity or in order to maintain group harmony, individual freedoms and creativity potentials are in danger of being curtailed and the innovation and performance of an RC impeded (Crutchfield & Ulmann, 1973).

In order for an RC to comprehensively achieve its goals, risks, uncertainties or problems relevant to the research cluster must therefore be constructively negotiated and solved by all PIs (Baurmann & Vowe, 2014). A climate of cooperation characterised by fairness (ARB2) is of central importance (N. Anderson et al., 2000): factual questions or content-related problems can be solved constructively but controversially, appreciatively and objectively by all PIs at the cluster level, and destructive and aggressive disputes that promote loss of value and process can be avoided (Blanckenburg et al., 2005).

Finally, an RC can only achieve its goals if delays or unforeseen situations are anticipated as early as possible and addressed flexibly and pragmatically (Choi & Pak, 2007). On the one hand, this requires regular monitoring of the progress towards goals, during which feedback from the subprojects is obtained on whether the (sub-)goals set in each case can be achieved (ARB3). On the other hand, an RC must be continuously prepared for the fact that delays and a variety of unforeseen situations can occur in the course of the implementation phase (PRM14). Project plans and work processes must therefore be constructed as concretely as necessary and as flexibly as possible in order to be able to modify them agilely according to changing circumstances (Blanckenburg et al., 2005).

Data, methods and operationalisation

Data

The influence of the aforementioned 51 intra- and interpersonal factors on the success of RCs is examined below on the basis of cross-sectional data. These were obtained in the context of the joint project DEKiF in the year 2020 through a web survey. The population targeted by the survey was \(n=15{,}595\) PIs and spokespersons involved in selected research clusters funded by the German Research Foundation (DFG).Footnote 8 During the field phase, \(n=4{,}972\) evaluable questionnaires from PIs and \(n=340\) from spokespersons were returned, resulting in a response rate of approximately 34 per cent. Contact could not be made with approximately 4 per cent of the target persons due to dysfunctional or outdated e-mail addresses. The sample obtained consists of 26.59 per cent female, 74.41 per cent male and 0.11 per cent non-binary respondents. The average age of the respondents was \(\overline{x}=52.67\) years with a standard deviation of \(\sigma =9.52\) years.
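
The reported response rate can be checked directly from the figures given above. The following Python lines are only a small arithmetic sketch; the variable names are illustrative and do not come from the study.

```python
# Figures as reported in the text; the variable names are illustrative.
population = 15595             # invited PIs and spokespersons
returned_pis = 4972            # evaluable questionnaires from PIs
returned_spokespersons = 340   # evaluable questionnaires from spokespersons

returned_total = returned_pis + returned_spokespersons
response_rate = returned_total / population

print(returned_total)                 # 5312
print(round(100 * response_rate, 1))  # 34.1
```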

Before the survey was conducted, a complete list of contact addresses of all target persons in the population was generated via the GEPRIS database (German Research Foundation, 2021a) using a web scraping procedure, which made it possible to aim for a complete survey and to obtain a sample that could be used for inferential statistics. Furthermore, due to the availability of a list of all target persons in the population, the difference between the inferential population, the target population and the sampling frame was eliminated (Weisberg, 2009). Significant distortions of the representativeness of the sample could therefore only occur through contacts that could not be reached (see above) and through unit nonresponse.

In general, the response rate of 34 per cent achieved by the survey is quite positive for a social science survey (Döring & Bortz, 2016): comparable international surveys (e.g. Bozeman & Corley, 2004; Bozeman & Gaughan, 2007) have achieved similar response rates in recent years, national ones even significantly lower (e.g. Ambrasat et al., 2022; Neufeld & Johann, 2016). The reasons why 66 per cent of the invited researchers did not participate in the survey can only be speculated about. Apart from the fact that the survey was conducted in the turmoil of the first year of the COVID-19 pandemic, the nonresponse could be due to survey fatigue, poor timing or lack of incentives. In general, low nonresponse is desirable. However, it is more important for the representativeness of a sample that the nonresponse is not systematic (Döring & Bortz, 2016). To address possible nonresponse bias, a unit nonresponse analysis of the obtained sample (Weisberg, 2009) was conducted. The analyses showed that with regard to the (1) content-related affiliation and the (2) gender of the PIs and spokespersons as well as with regard to their (3) affiliation to ongoing and completed clusters (4) of the various funding lines, the nonresponse error is low: the relative frequencies of the characteristics of the variables mentioned deviate in the sample by an average of 1.9 per cent from those in the population (see Supplementary Figs. 5 and 6). In addition to the nonresponse analyses, various structural equation analyses were conducted on the basis of the data used here, with adjustment for nonresponse through weighting [BLINDED]. The structural equation analyses showed that taking the survey weights into account did not change the analysis results in any significant way.

For these reasons, it is assumed that the following analyses are based on a statistically reliable sample and are not influenced by the relatively low response rate of 34 per cent. After listwise deletion, they are based on the responses of \(n=1{,}417\) PIs and spokespersons from all subjects and funding lines and from ongoing and completed, differently sized, disciplinary and cross-disciplinary RCs (see Supplementary Figs. 7–11).

Methods

The subsequent investigation of the most important factors influencing the success of research collaborations is carried out using the machine learning algorithm Random Forests (Breiman, 2001) (RF). RF is fundamentally based on the non-parametric CART algorithm (Classification and Regression Trees) introduced by Breiman et al. (1983). The CART algorithm generates decision trees that can be used for predicting both categorical output variables and continuous output variables.

CART

CART algorithm-based classification and regression trees (Fig. 1A) partition the input data supplied to them into binary-coded, disjoint subgroupings. The partitioning of the input data is carried out by the CART algorithm using binary decision rules: based on all observations of a data set (root node), the CART algorithm initially searches for an input variable (and, in the context of generating a regression tree, a cut point of an input variable) that divides the input data with respect to an output variable into two disjoint subgroupings with minimum within variance and maximum between variance (split) (Kern et al., 2019). Once an optimal binary decision rule for the initial partitioning of the input data has been found, the CART algorithm searches again for an input variable or a cut point for each of the subgroupings created by the initial split, which divides the previously found subgroupings into two further, disjoint and homogeneous subgroupings. In this respect, the CART algorithm proceeds recursively: each partitioning of the input data produced by the CART algorithm based on the cut point of a variable is partitioned a second time by a binary decision rule into two further disjoint, maximally homogeneous subgroupings in each case (internal nodes). Without further intervention by the user, the CART algorithm repeats the partitioning of the subgroupings produced by the splitting until final subgroupings (leaf nodes) have been found, which cannot be further homogenised by further splitting based on the output variable. Possible measures of the purity of the final leaf nodes include, for classification trees, the misclassification rate, the entropy or the Gini index and, for regression trees, the residual sum of squares of a CART solution.
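
The search for a single binary decision rule can be illustrated with a short Python sketch. It assumes one ordered input variable, a binary output variable and the Gini index as impurity measure; the function names and the toy data are hypothetical and are not taken from the study or from any CART implementation.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(x, y):
    """Return (cut_point, weighted_impurity) of the best rule 'x <= cut_point'."""
    best = (None, gini(y))  # start from the impurity of the unsplit node
    for cut in sorted(set(x))[:-1]:  # the largest value cannot separate anything
        left = [yi for xi, yi in zip(x, y) if xi <= cut]
        right = [yi for xi, yi in zip(x, y) if xi > cut]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        if weighted < best[1]:
            best = (cut, weighted)
    return best

# Toy data (hypothetical): one five-level Likert item and a binary outcome.
x = [1, 2, 2, 3, 3, 4, 4, 5, 5, 5]
y = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
cut, impurity = best_split(x, y)
print(cut, round(impurity, 3))  # 3 0.16
```

A full tree would apply `best_split` recursively to the two resulting subgroupings and would compare candidate splits across all input variables, not just one.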

Fig. 1 A: Example of a decision tree generated by the CART algorithm; B: Example of pruning a decision tree generated by the CART algorithm; C: Example of the decision trees generated by the Random Forest algorithm

The top-down tree structures produced by the CART algorithm can lead to potentially highly complex prediction models that may be overfitted to the observed data (Kern et al., 2019). To prevent such overfitting, the complexity of the decision trees modelled by the CART algorithm can be controlled by pruning: on the one hand, this can be done by a priori defined termination criteria (e.g. exceeding the maximum depth of a tree or falling below a minimum number of observations per leaf node), which stop the tree construction of the CART algorithm when one or more conditions occur. On the other hand, a posteriori cost complexity pruning can be used to extract the part of a decision tree that provides the best predictive performance from the structure of the entire decision tree (Fig. 1B). The quality of the individual tree segments is determined on the basis of their predictive performance on the output variable and their complexity through cross-validations (Nwanganga & Chapple, 2020).

Because the splitting rules found by the CART algorithm during tree construction depend on the distributions of the variables in the input data, the decision trees generated by the CART algorithm are not very robust to variations in the data supplied to them (Strobl et al., 2009): each (sub-)sample passed to the CART algorithm results in a different, possibly completely divergent tree configuration and thus varying predictions. As a result, decision trees generated by the CART algorithm are considered unstable and not very accurate in terms of their predictive power, despite their many advantages, such as their low requirements for input data and their extremely intuitive interpretability and visualisability (Kern et al., 2019).

Random forest

The ensemble approach of the RF algorithm addresses this deficit (Fig. 1C): instead of individual decision trees, the predictions of the RF algorithm are based on ensembles of different sizes, e.g. consisting of 1000 decision trees. The tree ensemble of an RF is constructed by a bagging approach (bootstrap aggregating) (Breiman, 2001): from the input data supplied to the RF algorithm, a separate bootstrap sample (with replacement) is drawn for each tree construction. Because the data basis for an ensemble’s decision trees subsequently differs, the trees of the RF algorithm also vary considerably in their structure and prediction. The diversity of the decision trees of an RF ensemble is further increased because their size and complexity are neither reduced a priori by stop criteria nor a posteriori by pruning. Finally, the RF algorithm also randomly restricts the set of input variables available as candidates at each split of each tree of an RF ensemble. This split-variable randomisation forces the RF algorithm to repeatedly draw on different subsamples of the input variables for the construction of the individual decision trees in the course of constructing a tree ensemble, regardless of the strength of the corresponding influence of the input variables on the output variable. This has the advantage that the RF algorithm cannot repeatedly select the same particularly influential predictors for tree construction. As a result, the path structures of the individual trees of an ensemble cannot be dominated by a small number of particularly influential input variables. In this way, the RF algorithm constructs a highly diverse ensemble of trees whose predictions are combined into a stable and accurate predictive model (Boehmke & Greenwell, 2019): for the prediction of a continuous output variable, the predictions of each decision tree for an observation are determined, and finally, an average prediction is formed for it. For categorical output variables, the predicted class of each classification tree of the ensemble is determined, and finally, the modal value is used to predict an observation (Kern et al., 2019). Since the predictions of an RF are based on an ensemble of de-correlated decision trees, the instability of individual decision trees can be compensated for: “As a result, random forests typically achieve a considerable boost in prediction performance in comparison with CART” (Kern et al., 2019, p. 77).
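
The two building blocks of bagging, drawing a bootstrap sample and aggregating the tree predictions, can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the RF implementation used in the study: the trees themselves are omitted, and the per-tree predictions are hypothetical values.

```python
import random
from collections import Counter
from statistics import mean

def bootstrap_indices(n, rng):
    """Draw n row indices with replacement (one bootstrap sample)."""
    return [rng.randrange(n) for _ in range(n)]

def aggregate_regression(tree_predictions):
    """Average the per-tree predictions for one observation."""
    return mean(tree_predictions)

def aggregate_classification(tree_predictions):
    """Return the modal class among the per-tree predictions."""
    return Counter(tree_predictions).most_common(1)[0][0]

rng = random.Random(42)  # fixed seed so that the sketch is reproducible
n_rows = 10
sample = bootstrap_indices(n_rows, rng)
# Rows that this bootstrap sample never drew ("out-of-bag" rows).
out_of_bag = sorted(set(range(n_rows)) - set(sample))

# Hypothetical predictions of five trees for a single observation.
print(aggregate_regression([3.0, 4.0, 4.0, 5.0, 4.0]))  # 4.0
print(aggregate_classification([1, 1, 0, 1, 0]))        # 1
```

The out-of-bag rows are the observations a given tree has not seen; they are what an out-of-bag error estimate is computed on.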

Method selection

The RF algorithm has several advantages over classical regression and classification methods: decision tree-based methods such as RF are superior to classical regression analyses, especially when modelling a high number of complex, non-linear relationships, such as those between cooperation practices and cooperation success (James et al., 2013). Furthermore, in contrast to classical regression methods, the Random Forest algorithm places only low demands on the specified input and output variables as well as the type of their relationships (Hastie et al., 2009). This is of central importance, especially with regard to the data analysed here: separate analyses showed that in the case of a classical, linear regression, neither linear relationships between the 51 independent variables and the dependent variable could be assumed, nor could the assumption of homoscedasticity or normally distributed residuals be upheld. Furthermore, in the case of (ordinal) logistic regression, a linear relationship between the explanatory variables and the logit of the response variable could not be assumed either. Finally, the RF algorithm is relatively robust to outlier values (Breiman, 2001), which are often found in the data to be analysed (see Supplementary Fig. 12). In this respect, the RF analysis seems to be a suitable alternative to classical regression methods from the author's point of view: it can be interpreted very intuitively, it enables the quantification of the relative importance of the input data for the prediction of the output variable and it allows a detailed evaluation of the quality of the prediction model (Kern et al., 2019).

Operationalisation

The central, Likert-scaled output variable, the extent to which an RC achieved its goals (ER11), was dichotomised before being submitted to the RF classification procedure: a high level of achievement of the RC’s objectives (Likert scale score: 4–5) was rated as complete success (CS) and coded 1. Non-achievement and low achievement (Likert scale score: 1–3) of an RC's goals were scored as no complete success (NCS) and coded 0. The binary coding of the output variable was done for three reasons: (1) the usefulness of RF for predicting nominal and metric output variables is well researched. Random forests predicting ordinal scaled variables (such as the untransformed Likert-scaled output variable), on the other hand, are a rather exotic method and the research literature on their performance is sparse (Janitza et al., 2014). (2) Somewhat less decisive for the decision to dichotomise, but nevertheless important, seemed to the author the fact that separate analyses showed that RF regression would lead to a significantly less accurate model than RF classification. (3) Finally, the stated aim of the paper is to predict the CS of an RC. Not least for reasons of complexity reduction, the author considers the dichotomisation and the associated reduction of the information in the output variable to be appropriate.
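
The coding rule described above can be written as a small Python function. This is a minimal sketch; the function name and the example responses are hypothetical.

```python
def dichotomise(likert_score):
    """Map a 1-5 Likert score to 1 (CS) for 4-5 and to 0 (NCS) for 1-3."""
    if likert_score not in (1, 2, 3, 4, 5):
        raise ValueError("expected an integer Likert score from 1 to 5")
    return 1 if likert_score >= 4 else 0

# Hypothetical ER11 responses, for illustration only.
er11 = [5, 4, 3, 1, 4, 2]
success = [dichotomise(score) for score in er11]
print(success)  # [1, 1, 0, 0, 1, 0]
```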

All 51 input variables were fed into the RF analysis unchanged according to their original scaling as ordered, five-level factor variables (Supplementary Table 2).

Results

After listwise deletion, the specified RF classifier is based on the survey data of \(n=1{,}417\) PIs and spokespersons from all disciplines who were involved in mono- and interdisciplinary RCs of different sizes, subject areas and of different heterogeneity, spatial distribution and duration (Supplementary Figs. 7–11). Fifty-one intra- and interpersonal factors were specified as input variables for the prediction of the dummy-coded success variable. RF analysis was conducted using R (2020) and the randomForest package by Breiman et al. (2018).

A total of 60 per cent of the total data set (\(n=877\) cases) was randomly assigned to a training data set, which was used to train the RF classifier. The remaining 40 per cent of the total data set (\(n=540\) cases) was assigned to a test data set, which was used to assess the predictive quality of the RF classifier (Kern et al., 2019).Footnote 9

According to the rule of thumb of Boehmke and Greenwell (2019), the originally generated forests consisted of approximately 10 times as many trees as input variables, i.e. 500 trees. Since with an increasing number of trees the RF reduces its errors and thus only increases its predictive power at the expense of a linearly increasing computation time (Genuer & Poggi, 2020), the number of trees was successively increased until the error rates of both the CS and the NCS could not be reduced any further. Since the error rate of the NCS only stabilises from about 750 trees, the final RF was constructed from 1000 individual trees (Fig. 2B). The number of predictors that can be randomly selected as potential candidates for each split by the RF algorithm from the total set of input variables was set to \(\sqrt{51}\). All other tuning parameters were set to default according to Breiman et al. (2018).
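
Since \(\sqrt{51}\approx 7.14\) is not an integer, the number of candidate predictors per split has to be rounded. The Python sketch below assumes rounding down, which is what R's randomForest does by default for classification; the variable labels `V1` to `V51` are hypothetical placeholders for the survey items.

```python
import math
import random

n_inputs = 51
# Assumption: the square root is rounded down to the next integer.
mtry = math.floor(math.sqrt(n_inputs))
print(mtry)  # 7

rng = random.Random(1)
# Hypothetical labels standing in for the 51 input variables.
variables = [f"V{i}" for i in range(1, n_inputs + 1)]
# Candidate predictors for one split, drawn without repeats.
candidates = rng.sample(variables, mtry)
```

A fresh set of candidates is drawn in this way at every split of every tree, so that strong predictors are regularly unavailable and weaker ones get the chance to define a split.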

Fig. 2 A: Confusion matrix of the Random Forest generated from the test data; B: Out-of-bag (OOB) error of the 1000 decision trees of the Random Forest; C: Receiver operating characteristic (ROC) curve of the Random Forest

Assessment of model and prediction quality

To test the specified RF classifier for its model and prediction quality, various evaluation measures for Goodness of Fit (GoF) and Goodness of Prediction (GoP) were extracted. GoF measures quantify how well a specified RF model fits the observed data globally. GoP measures can also be used to assess the extent to which an RF classifier is able to accurately predict output variable values for new data (Paluszynska et al., 2020).

The GoF of the RF model specified here was assessed with the Brier Score (Rufibach, 2010). The Brier score yields a value of \(B=0\) for an error-free RF classifier and a value of \(B=.25\) for an uninformative model that predicts a fifty-fifty chance of CS for all observations (Paluszynska et al., 2020). With a Brier score of B = 0.05, the model fit of the specified RF classifier can thus be considered very good.
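
The two reference values of the Brier score follow directly from its definition as the mean squared difference between predicted probabilities and observed 0/1 outcomes. The following Python sketch uses hypothetical outcomes to reproduce them.

```python
def brier_score(probabilities, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if len(probabilities) != len(outcomes) or not outcomes:
        raise ValueError("need two non-empty sequences of equal length")
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(outcomes)

outcomes = [1, 1, 0, 1]  # hypothetical observed CS (1) and NCS (0) cases

print(brier_score([1.0, 1.0, 0.0, 1.0], outcomes))  # 0.0, error-free classifier
print(brier_score([0.5, 0.5, 0.5, 0.5], outcomes))  # 0.25, uninformative model
```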

As part of the evaluation of the GoP of the RF model, a threshold of \(C=.5\) was set for the prediction of the test data. Respondents whose predicted probability of complete success was \(p>.5\) were accordingly classified as CS. Conversely, respondents whose predicted probability of complete success was \(p\le .5\) were classified as NCS.

Figure 2A shows the corresponding confusion matrix resulting from the RF predictions. It shows that overall, 94 per cent of all CS and NCS cases (accuracy)Footnote 10, 99.6 per cent of all CS cases (true positive rate) but only 20 per cent of all NCS cases (true negative rate) were correctly classified by the specified RF.Footnote 11 The extremely good predictive ability for CS and the poor predictive ability for NCS have a simple reason: the output variable is extremely unbalanced in its distribution (Supplementary Table 3). Because the input data contain a wealth of information on successful RCs, but only little and at the same time highly varying information on unsuccessful RCs, the CS of RCs can be predicted well by the RF, while NCS can only be predicted extremely poorly. From the author's point of view, this is not a problem: firstly, the stated aim of the paper is to predict only the CS of research clusters on the basis of the ten most important input variables, but not the NCS of RCs. Secondly, it is in the nature of the research subject that the number of unsuccessful RCs is only between 5 and 10 per cent; this is not due to the selected population, poor response rates or social desirability (Bozeman & Youtie, 2017). The analysis of unsuccessful research clusters therefore requires a fundamentally different research design, i.e. analytical methods that can deal with small numbers of cases and highly varying data.
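
How a strongly unbalanced output variable can combine high accuracy with a low true negative rate is illustrated by the Python sketch below. The counts are hypothetical toy values chosen for readability; they are not the counts of the confusion matrix in Fig. 2A.

```python
def classify(probabilities, threshold=0.5):
    """Label a case CS (1) if its predicted probability exceeds the threshold."""
    return [1 if p > threshold else 0 for p in probabilities]

def confusion_rates(actual, predicted):
    """Return accuracy, sensitivity (TPR) and specificity (TNR) for 0/1 labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == 1 and p == 1 for a, p in pairs)
    tn = sum(a == 0 and p == 0 for a, p in pairs)
    fp = sum(a == 0 and p == 1 for a, p in pairs)
    fn = sum(a == 1 and p == 0 for a, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

print(classify([0.5, 0.51]))  # [0, 1]

# Hypothetical, strongly unbalanced toy data: 96 CS cases and 4 NCS cases.
actual = [1] * 96 + [0] * 4
predicted = [1] * 95 + [0] * 1 + [1] * 3 + [0] * 1
rates = confusion_rates(actual, predicted)
print(rates["accuracy"], rates["specificity"])  # 0.96 0.25
```

In this toy example, 96 per cent of all cases are classified correctly although only one of the four NCS cases is recognised, because the NCS cases carry almost no weight in the overall accuracy.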

The relationship between sensitivity (proportion of CS cases correctly classified as CS) and specificity (proportion of NCS cases correctly classified as NCS) can be visualised using a receiver operating characteristic (ROC) curve.Footnote 12 If the predictions of an RF model resemble a random classification (reference line, Fig. 2C), the corresponding ROC curve lies diagonally across the graph and has an area under the curve (AUC) value of \(AUC=.5\). If, on the other hand, the predictions of an RF model are perfect, the corresponding ROC curve runs along the three coordinates \(A\left(1|0\right)\), \(B\left(1|1\right)\) and \(C\left(0|1\right)\) and has a value of \(AUC=1\) (Paluszynska et al., 2020).

The ROC curve of the specified RF model has a value of \(AUC=.91\). Since the area under the curve is close to 1, the RF classifier can thus be described as extremely efficient in terms of predicting the CS of RCs (Boehmke & Greenwell, 2019).
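
The AUC can also be read as the probability that a randomly chosen CS case receives a higher predicted probability than a randomly chosen NCS case. The Python sketch below computes it in exactly this pairwise way for hypothetical scores; it is meant as an illustration of the two reference values, not as the procedure used to produce Fig. 2C.

```python
def auc(scores, labels):
    """Share of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    positives = [s for s, label in zip(scores, labels) if label == 1]
    negatives = [s for s, label in zip(scores, labels) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

labels = [1, 1, 1, 0, 0]  # hypothetical outcomes
print(auc([0.9, 0.8, 0.7, 0.2, 0.1], labels))  # 1.0, perfect ranking
print(auc([0.5, 0.5, 0.5, 0.5, 0.5], labels))  # 0.5, no better than chance
```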

Importance of the input variables

In order to extract the ten most important variables influencing the complete success of RCs from the RF analysis, two central measures are available: the Mean Decrease in Accuracy indicates (1) how much predictive accuracy an RF model loses when the values of an input variable are randomly permuted, i.e. when its information is effectively excluded. The higher the loss of accuracy caused in this way, the more important that input variable is for the prediction of the output variable of an RF classifier. The Mean Decrease in Gini in turn indicates (2) the extent to which an input variable contributes to the homogeneity of a subgrouping in the context of splitting. The higher the Mean Decrease in Gini value of an input variable, the more important the corresponding predictor is for predicting the output variable of an RF classifier (Biecek & Burzykowski 2021).Footnote 13
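
The idea behind the Mean Decrease in Accuracy can be sketched by shuffling one input variable and measuring the accuracy that is lost. The Python example below is a simplified illustration: it uses a hypothetical stand-in classifier instead of a fitted forest and evaluates it on a single toy data set, whereas RF implementations typically average this loss over the out-of-bag data of all trees.

```python
import random

def accuracy(model, rows, labels):
    """Share of rows for which the model's prediction equals the label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)

def permutation_importance(model, rows, labels, column, rng):
    """Accuracy lost when the values of one column are shuffled across rows."""
    baseline = accuracy(model, rows, labels)
    shuffled = [r[column] for r in rows]
    rng.shuffle(shuffled)
    permuted = [r[:column] + [v] + r[column + 1:] for r, v in zip(rows, shuffled)]
    return baseline - accuracy(model, permuted, labels)

# Hypothetical stand-in for a fitted classifier: it only uses the first item.
def toy_model(row):
    return 1 if row[0] >= 4 else 0

rng = random.Random(0)
rows = [[rng.randint(1, 5), rng.randint(1, 5)] for _ in range(200)]
labels = [toy_model(r) for r in rows]

importance_used = permutation_importance(toy_model, rows, labels, 0, rng)
importance_unused = permutation_importance(toy_model, rows, labels, 1, rng)
print(importance_used, importance_unused)
```

Shuffling the item the classifier relies on lowers its accuracy, whereas shuffling the item it ignores leaves the accuracy unchanged, so that its importance is zero.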

Since the Mean Decrease in Accuracy and the Mean Decrease in Gini do not show sufficient consistency with regard to the ten most important input variables (Supplementary Fig. 4, Supplementary Table 4), a scatterplot was specified for a simultaneous assessment of both measures following Paluszynska et al. (2020). The Mean Decrease in Accuracy was plotted on the x-axis and the Mean Decrease in Gini on the y-axis (Fig. 3A). The ten variables positioned highest on the x- and y-axis simultaneously were classified as the ten most important factors influencing the success of research clusters.

Fig. 3 A: Bidimensional variable importance plot of the mean decrease in accuracy and the mean decrease in Gini; B: Correlation network of the correlated input variables; C: Accumulated local profiles plot of the ten most important input variables for the success of research clusters

The specified scatterplot of the importance of the input variables (Fig. 3A) shows that the realistic achievability of the goals communicated to the funding agency (ARB6) is the most important condition for the CS of an RC. This is followed, with decreasing importance, by PIs’ agreement on the common research and collaboration goals (PRM7), clear requirements for the subprojects with regard to the common goals (ARB5) and fair dealings between the cluster members (ARB2). Furthermore, it is of central importance for an RC’s CS that the PIs feel committed to the common goals (ARB8) and do everything in their power to reliably deliver their contributions to achieve the common goals (PR13). In addition, the top ten predictors of an RC’s CS include that the internal activities of the subprojects are aligned with regard to the RC’s common goals (ARB9), that the PIs strive to sufficiently anticipate the ways of thinking and methodological approaches of unfamiliar disciplines (DIT3), and that cooperation between PIs at the cluster level is characterised by mutual trust (PRM9). Finally, the last of the top ten predictors of an RC’s CS is PIs’ reliable commitment to the interests of the RC (PRM1).

How do the top ten input variables affect the probability of an RC achieving a CS? In principle, the partial dependence (PD) of the individual predictors can be calculated to answer this question within the framework of an RF analysis (Biecek & Burzykowski 2021). However, one condition for the correct calculation of the PD is that the input variables are not significantly correlated with each other (Biecek & Burzykowski 2021). To check this assumption, the pairwise rank correlations (Spearman 1904) of all Likert-scale input variables were calculated. All correlations \(\rho \ge .5\) were visualised in the form of a network graph for a compact and clear presentation (Kuhn et al. 2020). Since the corresponding correlation network (Fig. 3B) clearly shows that the (ten most important) input variables are in part strongly correlated with each other, Accumulated Local Effects (ALE) were calculated instead of PD (Apley & Zhu 2019)Footnote 14: When calculating the ALE of an input variable, intervening effects of third predictors are isolated. In this way, in contrast to classical PD, ALEs provide a correct estimate of a predictor effect even if the input variables in question correlate with other predictors (Biecek & Burzykowski 2021). Interpreting the ALE is intuitive: “The value of the ALE can be interpreted as the main effect of the feature at a certain value compared to the average prediction of the data. For example, an ALE estimate of -2 at \({x}_{j}\)=3 means that when the j-th feature has value 3, then the prediction is lower by 2 compared to the average prediction” (Molnar 2019, p. 131). Accordingly, the specified ALE plots (Fig. 3C) show on the x-axis the range of values of the ten most important input variables, and on the y-axis the deviations of the prediction of the characteristic values of an input variable from the mean prediction of all data.
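
The correlation check that precedes the choice between PD and ALE can be sketched as follows. The Python example computes Spearman's rank correlation as the Pearson correlation of tie-averaged ranks and keeps all item pairs that reach the threshold of .5; the three items and their responses are hypothetical, and the function names are illustrative.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(a, b):
    """Pearson correlation of two equally long numeric sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

def spearman(a, b):
    """Spearman's rho: Pearson correlation of the tie-averaged ranks."""
    return pearson(average_ranks(a), average_ranks(b))

# Hypothetical Likert responses to three items.
items = {
    "A": [1, 2, 3, 4, 5, 5],
    "B": [1, 2, 3, 4, 5, 5],
    "C": [5, 4, 3, 2, 1, 1],
}
names = sorted(items)
strong_pairs = [
    (p, q) for i, p in enumerate(names) for q in names[i + 1:]
    if spearman(items[p], items[q]) >= 0.5
]
print(strong_pairs)  # [('A', 'B')]
```

Pairs that pass this filter would form the edges of a correlation network such as the one in Fig. 3B.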

The specified ALE plots show that the strongest positive-curvilinear effect on an RC’s CS comes from the input variable ARB6, i.e. from the realistic achievability of the goals communicated to the funding agency. Eight of the remaining nine input variables (PRM7, ARB5, ARB2, ARB8, PR13, ARB9, PRM9, PRM1) are less strongly associated with the probability of an RC’s CS, but nevertheless also clearly curvilinear. It is noticeable that low predictor values \((1|2)\) are consistently associated with negative deviations from the average probability of CS (see Fig. 3C), while high scale values \((4|5)\) are associated with a positive deviation from the average probability of CS. This clearly indicates that nine of the ten most important predictors have a significant (positive or negative) influence on an RC’s CS probability, especially when they assume values at the edges of their scales. The input variable DIT3 is an exception: surprisingly, the more the PIs of an RC try to anticipate the ways of thinking and methodological approaches of other PIs or disciplines, the more the probability of success decreases.

Discussion

Summary of the findings

The findings from the state of research regarding central determinants for success in collaborative research were updated with the ten most important variables influencing an RC’s CS using RF analysis. The specified RF classifier showed a very good fit to the data and predicted an RC’s CS with high accuracy. The stated objective of the paper, the exploration of the ten most important determinants of an RC’s CS informed by the state of research, was thus met. From the ten most important factors influencing an RC’s CS, five central conditions for successful research collaboration can be derived:

  • 1. Realistic, clear and shared goals

    Highly realistic research or collaboration goals (ARB6) are by far the most important factor influencing the CS of an RC. The RF analysis further shows that an RC also needs goals that are clearly defined (ARB5) and shared by all PIs (PRM7). Realistic, clear and shared goals avoid excessive demands and demotivation (Blanckenburg et al., 2005), ensure a sustainable commitment of the PIs (Aubé & Rousseau, 2005; John, 2019), enable organisational and content-related control of the subprojects and thus reduce the centrifugal forces that potentially arise from the organisational structures of RCs (Defila et al., 2006). They are the core of an RC, without which “there can be no orientation, no evaluation and consequently also no correction of action with regard to goal achievement” (Vowe & Meißner, 2020, p. 170).Footnote 15

  • 2. Commitment of the PIs

    A strong commitment (PRM1, ARB8, PR13) of the collaboration partners to the common goals of the RC is crucial in the context of collaborative research: particularly at the beginning of a collaborative research project, the PIs have to make considerable up-front investments over long periods of time without being able to fix the amount of returns that the time and energy resources they have invested may yield at a later point in time (Kozlowski & Bell, 2001). However, given the decentralised organisational structures of research clusters (Defila et al., 2008), PIs have significant incentives for opportunistic behaviour: when obstacles or challenges arise, investments in joint collaboration at cluster level may be withheld in favour of the interests of the subprojects for which the PIs are responsible, due to a lack of interest in joint success (Baurmann & Vowe, 2014). However, if the PIs do not invest early and intensively enough in cross-subproject collaboration due to a lack of commitment, the RC cannot sufficiently achieve its goals (Meißner et al., 2022; [BLINDED]).

  • 3. Cohesive team climate

    In order for the RC’s PIs to find common solutions despite controversial positions and opinions, to take social or epistemic risks, to conduct open debates and discussions, or to resolve emerging relationship or task conflicts constructively and effectively, a cooperation climate characterised by fairness (ARB2) and trust (PRM9) is required (Shrum et al., 2001). The cohesive forces of a fair and trusting cooperation climate can mitigate cooperative relationships that are perceived as stressful and the friction losses they promote, prevent wear and tear on commitment, job satisfaction and creativity, and thus efficiently mobilise the synergy potential of an RC (Blanckenburg et al., 2005).

  • 4. Research coordination

    The common research questions of an RC can only be answered in a coherent manner across subprojects if the research work relevant to the achievement of the RC's goals is (continuously) coordinated (ARB9) (Defila et al., 2008). Without continuous coordination across subprojects, the interconnectedness of the subproject-internal research work threatens to be successively dismantled and the PIs’ research work gradually drifts apart. As a result, the ability to synthesise the subprojects’ research results is jeopardised. If the subprojects’ research results cannot be synthesised, the RC’s common research question cannot be answered successfully (O’Rourke et al., 2019): if there is no synthesis at the end, an RC ultimately does not add value, i.e. the subprojects of an RC could just as well have been carried out and funded on a stand-alone basis (Defila et al., 2006).

  • 5. Interdisciplinary communication

    Surprisingly, the RF shows that the ability of PIs to make other cluster members understand their own point of view (DIT3) has a (slightly) negative influence on the probability of success of RCs. This can be taken as an indication that interdisciplinary communication is sometimes a challenging and time-consuming endeavour that can, at worst, slow down or even block the research of individual PIs or an entire RC (O'Rourke et al., 2019). This results in the RC's objectives not being sufficiently achieved within the deadlines set by the funding organisation (Defila et al., 2006). In this respect, it can be assumed that interdisciplinary communication must, on the one hand, enable all PIs to have a sufficient mutual understanding of their research work as well as knowledge of the central results, but at the same time—in order to avoid paralysis of the collaboration—it must not overburden the PIs (Misra et al., 2011). Interdisciplinary communication that follows a straightforward rule is therefore needed: as simple as possible, as complex as necessary.

Theoretical and practical contributions

Even though the success factors for research collaborations have been the implicit or explicit subject of numerous papers, to the author's knowledge no study to date has investigated the influence of intra- and interpersonal factors on the success of RCs in a comparably systematic way and on the basis of representative survey data. With the help of the Random Forest classifier, it was possible to bundle the diverse findings available in the current state of research, to rank them by their importance for the success of collaboration and thus to extract the ten most significant factors influencing the success of RCs. This made it possible to refine the understanding of how the various intra- and interpersonal factors should be weighted and how a successful research collaboration should be designed. Accordingly, various implications for researchers, funding agencies, science managers and policy-makers can be derived from the contribution:

Firstly, it was found that the goals communicated to the funding agency are of utmost relevance for the subsequent success of an RC. Potential founders of RCs who fail to formulate goals that are equally clear, binding, shared and, above all, achievable have a lower chance of success. Secondly, RCs that establish a working atmosphere characterised by fairness and trust can significantly increase their chances of success. Spokespersons and PIs who establish or run an RC should constantly direct their efforts towards ensuring that uncertainties, ambiguities and dissent on substantive issues are resolved constructively and that team members are affirmed in their individual competence and not made to feel insecure. Thirdly, the present study supports the assumption that an RC can increase its chances of success by (continuously) coordinating and mutually interconnecting the subproject-internal research work. The interconnection of research work is a necessary condition for an RC to succeed in integrating the subprojects’ research results into an overall view, in answering the common research question of the RC consistently and thus in meeting the central claim of integration-oriented knowledge production. In particular, the spokespersons of RCs must ensure that the subprojects’ research work is continuously interconnected and mutually integrated. The design of communication, work organisation and the organisational structure of an RC must accordingly be aligned with the synthesis work of the subprojects from the outset. Fourthly, bridging epistemic incompatibilities is an extremely demanding and time-consuming core task and at the same time an integral prerequisite for the success of many RCs. It must be decisively supported by research funding agencies, science managers and policy-makers (for example, by providing the necessary material and human resources), and it must be professionalised and continuously accompanied by targeted research network management.

Researchers, funding agencies, science managers and policy-makers can use the factors identified by the RF analysis to anticipate the success or failure of an RC at an early stage. The factors mentioned are necessary, but of course not sufficient conditions for the success of an RC. However, if the factors mentioned are not taken into account, the probability of failure increases significantly.
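For readers who wish to reproduce the general logic of ranking factors by their importance for a classifier's predictions, the following minimal sketch may be helpful. It is an illustration under stated assumptions, not the analysis carried out in this study: it uses permutation importance (the mean drop in accuracy when one factor's values are shuffled), which is only one of several possible importance measures; the predictor `toy_predict` is a simple stand-in rule rather than a Random Forest; and the factor names and data are hypothetical and purely synthetic.

```python
import random


def accuracy(predict, rows, labels):
    """Share of rows whose predicted label matches the true label."""
    hits = sum(1 for row, y in zip(rows, labels) if predict(row) == y)
    return hits / len(rows)


def permutation_importance(predict, rows, labels, features, repeats=20, seed=0):
    """Mean drop in accuracy when the values of one feature are shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    scores = {}
    for name in features:
        drops = []
        for _ in range(repeats):
            # Shuffle one column while leaving all other columns untouched.
            column = [row[name] for row in rows]
            rng.shuffle(column)
            shuffled = [dict(row, **{name: v}) for row, v in zip(rows, column)]
            drops.append(baseline - accuracy(predict, shuffled, labels))
        scores[name] = sum(drops) / repeats
    return scores


def rank_factors(scores, top_k=10):
    """Return the top_k (factor, importance) pairs, most important first."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


# Hypothetical stand-in for a fitted classifier: success (1) is predicted
# whenever the synthetic item "goal_clarity" is rated 3 or higher.
def toy_predict(row):
    return int(row["goal_clarity"] >= 3)


# Synthetic survey-like data on a 1-5 scale; not taken from the study.
data_rng = random.Random(42)
features = ["goal_clarity", "coordination", "noise_item"]
rows = [{name: data_rng.randint(1, 5) for name in features} for _ in range(200)]
labels = [toy_predict(row) for row in rows]

scores = permutation_importance(toy_predict, rows, labels, features)
ranking = rank_factors(scores, top_k=2)
print(ranking)
```

In this toy setting only `goal_clarity` drives the prediction, so it is the only factor whose shuffling lowers accuracy. With a real fitted model, `toy_predict` would be replaced by that model's prediction function, and importance would ideally be estimated on held-out data.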

Limitations and gaps for future research

The results of the present study are of limited significance in several respects. Firstly, the CS of research clusters was measured on the basis of subjective self-assessments by PIs and spokespersons. It is generally expected that self-evaluations by researchers are subject to greater bias than, for example, evaluations of success based on bibliometric productivity data. Secondly, because the specified RF classifier was built on survey data from PIs and spokespersons of RCs, the ranking of the ten most important factors for the CS of an RC likewise reflects the perspective of PIs and spokespersons. Survey data from the many “research slaves” (Münch, 2007, p. 360), i.e. the research assistants, doctoral students and postdocs who conduct a wide variety of research work within the subprojects and in the execution of countless supporting collaborations (Laudel, 2002), would allow complementary insights into the most important factors influencing the success of the collaborations.

Thirdly, to reduce complexity, various characteristics of RCs (e.g., mono- or interdisciplinary cooperation mode, duration to date, personnel size, or disciplinary composition) were not considered in the analysis. For this reason, it remains open to what extent the ten most important success factors identified by the RF analysis are relevant for different subgroups, and to what extent substantial differences exist between differently constituted RCs.

Fourthly, because the analyses in this article are based on survey data from PIs and spokespersons of DFG research clusters, the results are not directly generalisable to research collaborations in other funding systems and countries.