1 Introduction

Regions continuously undergo structural change. New activities emerge and grow, while old activities shrink or vanish. The ability to diversify into new fields crucially matters for regions’ economic growth and resilience (Content and Frenken 2016). Consequently, regional diversification is a focus of policy makers. For example, the current Smart Specialization Strategy of the European Union explicitly supports and encourages regional diversification strategies (Foray et al. 2011). However, the extent to which regional policy can actually influence such (long-running) developments is still an open question.

In the current paper, we approach this question by focusing on R&D subsidies as one important tool of modern regional innovation policy and analyze its effect on regional technological diversification. In light of the contemporary research underlining the strongly path dependent nature of regional (related) diversification (Neffke et al. 2011; Boschma et al. 2013, 2015; Essletzbichler 2015; Rigby 2015; Balland et al. 2019), we are particularly interested in two questions: Firstly, is policy a part and potential facilitator of such path dependencies? This question refers to the allocation of R&D subsidies, which may be used to support the diversification into new (related) fields that build on already existing development paths. Secondly, can policy intervene and alter the process of regional diversification, and if so, how? We argue that R&D subsidization can be useful and effective in this context. If designed in a suitable manner, such programs alleviate the risks associated with the exploration of new activities and simultaneously stimulate inter-organizational collaboration. Accordingly, they (partly) compensate for uncertainties inherent to diversification activities and stimulate the access to and use of external knowledge at the same time. Put differently, they closely relate to and potentially impact processes at the heart of (technological) diversification.

Our paper thereby fills a gap in the existing literature, as, so far, few efforts have been made to assess systematically the contribution of R&D policy to regional technological diversification (Boschma and Gianelle 2014). Moreover, most evaluations of R&D subsidization programs are restricted to the firm level (Czarnitzki et al. 2007; Czarnitzki and Lopes-Bento 2013), while attention has only recently been drawn to the regional level (Maggioni et al. 2014; Broekel 2015; Broekel et al. 2017).

We support our theoretical arguments with an empirical investigation on the contribution of project-based R&D subsidization by the Federal Government of Germany to regional technological diversification processes. Firstly, we explore the extent to which the allocation of R&D subsidies supports unrelated or related technologies in regions. Secondly, we test if these R&D subsidies increase the chances of successful diversification in general and if they are rather conducive for related or unrelated technological diversification in regions. Thirdly, we differentiate between subsidies for individual and for joint research projects, as previous research showed that the two subsidy modes can have different effects (Broekel 2015; Broekel et al. 2017).

Our empirical study builds on a panel regression approach utilizing data on 141 German labor-market regions covering the period from 1991 to 2010. Patent information is used as an indicator for technology-oriented R&D activities in regions and matched with subsidized R&D projects. Our empirical results confirm the path dependent nature of regional technological diversification, which is driven by technological relatedness. In addition, R&D subsidies are more likely to be allocated to related capabilities in regions, indicating the tendency of policy to be part of the path dependency in regional diversification. Our study confirms that R&D subsidies stimulate technological diversification in regions. The identified positive effects are particularly pronounced and robust in the case of subsidized joint R&D projects. We find that R&D subsidies for joint research projects are an appropriate policy that, to some extent, compensates for missing relatedness and hence facilitates diversification into unrelated technological activities.

The remainder of the study is organized as follows: Sect. 2 provides an overview of the existing literature on regional diversification and R&D policy. We describe our data and empirical approach in Sect. 3. The empirical results are part of Sect. 4. The paper concludes with a discussion of our results regarding their implications for regional innovation policy in Sect. 5.

2 R&D subsidies and regional diversification

2.1 R&D subsidies and diversification

R&D policy programs are justified by knowledge creation and innovation being important production factors for economic growth. Nevertheless, knowledge creation suffers from significant market failures (Nelson 1959; Arrow 1962; McCann and Ortega-Argiles 2013). For instance, firms cannot fully benefit from their R&D investments, as new knowledge might lack appropriability and spills over to third parties, giving rise to positive externalities. Similarly, R&D projects are characterized by significant uncertainty making ex ante calculations of investments into R&D a difficult task. Increasing complexity of technologies also requires efforts exceeding individual firms’ capabilities. Accordingly, collaboration with other organizations becomes a necessity, which raises the danger of moral hazard and unintended knowledge spillovers (Hagedoorn 2002; Cassiman and Veugelers 2002; Broekel 2015). In sum, private R&D investments are likely to fall short of a social optimum. This motivates and justifies public intervention, which seeks to close the gap between actual and socially desired levels of knowledge creation by supporting R&D activities.

There are numerous instruments policy may use to increase the level of R&D activities. Among the most prominent and frequently used tools are project-based R&D subsidies (Aschhoff 2008). These are intended to increase R&D activities of organizations regarding their innovation input and output. Concerning the input, one major question is whether firms use public subsidies as a complementary and additional financial source to realize R&D projects or if they “crowd out” private investments. The large body of empirical research finds mixed results. Although a general crowding-out effect cannot be ruled out and depends largely on firm characteristics, the majority of studies find evidence for additionality effects (Busom 2000; Czarnitzki and Hussinger 2004; Zúñiga-Vicente et al. 2014). Regarding innovation output, public subsidies seem to stimulate R&D activities. A number of studies show the positive effect of R&D subsidies on firms’ innovativeness (Czarnitzki et al. 2007; Czarnitzki and Hussinger 2018; Ebersberger and Lehtoranta 2008). That is, significant parts of private R&D activities would not have been realized without subsidization, implying that public subsidies seem to complement private R&D.

Yet the design of R&D subsidization programs offers a lot of flexibility, which allows for substantial “fine-tuning” of initiatives. For instance, subsidization can be restricted to specific organizations (location, size, industry), to selected fields (technologies, sectors), or to particular modes of R&D (individual or joint). Policy can also decide about starting dates and time periods of support. Usually, R&D subsidies are granted through competitive bidding procedures (Aschhoff 2008), and they are targeted at innovative self-discovery processes (Hausmann and Rodrik 2003) with the stimulation of inter-organizational knowledge exchange becoming an increasingly important feature (Broekel and Graf 2012).

All of these features are used in contemporary policies to varying degrees. For instance, the EU-Framework Programmes (EU-FRP) are focused on supporting R&D and on stimulating inter-regional as well as international knowledge diffusion by exclusively supporting collaborative projects (Scherngell and Barber 2009; Maggioni et al. 2014). Another example of R&D subsidization with specific features is the German BioRegio contest. This initiative focused on advancing one particular technology (biotechnology) and rewarded proposals building on and stimulating intra-regional collaboration (Dohse 2000).

While most empirical studies evaluate the effects of R&D subsidies at the firm level, we follow Broekel (2015) and extend this perspective to the regional level. More precisely, we argue that project-based R&D subsidization may play a role in regional diversification processes. Interestingly, linking policy to regional diversification has rarely been done in the literature. An exception concerns the case study by Coenen et al. (2015) that investigates opportunities, barriers, and limits of regional innovation policy aiming at the renewal of mature industries. The authors show, for the case of the forest industry in North Sweden, that regional innovation policy can accompany the process of regional diversification by supporting the adoption and creation of related technologies. Our study complements this approach by focusing on a particular policy, namely R&D subsidies and their effects on regional diversification.

2.2 Regional diversification and relatedness

Regional diversification is in the focus of contemporary innovation policy. For instance, the EU’s Smart Specialization strategy aims at fostering (technological) diversification around regions’ core activities (Foray et al. 2011). Thereby, policy seeks to exploit the benefits associated with diversification. For instance, diversification positively relates to the level of income, allowing regions to climb the ladder of economic development (Imbs and Wacziarg 2003). Diversified regions are, moreover, less likely to run into the trap of cognitive lock-ins (Grabher 1993) and are less prone to suffer from exogenous shocks because of portfolio effects (Frenken et al. 2007). Regional R&D competences in multiple fields also give rise to synergies increasing the exploitation and experimentation of technological opportunities (Foray et al. 2011).

A large stream of literature increasingly devotes its research to the path dependent feature of regional diversification expressed by the crucial role of relatedness (Hidalgo et al. 2007; Boschma and Frenken 2011; Neffke et al. 2011; Hidalgo et al. 2018). Concepts such as related diversification and regional branching (Boschma and Frenken 2011) highlight that regional diversification is not a random process but that existing capabilities influence the development of future capabilities. The so-called principle of relatedness (Hidalgo et al. 2018) is not only working at the individual level of firms (Teece et al. 1994; Breschi et al. 2003) but shows its importance at different spatial scales. For example, Hidalgo et al. (2007) find that nations are more likely to diversify into new export products that are related to their existing product portfolio. Neffke et al. (2011) transfer this approach to the regional level. By relying on information about products of Swedish manufacturing firms, they show that new industries do not emerge randomly across space. Rather, they are more likely to emerge in regions where related capabilities already exist. Essletzbichler (2015) confirms this finding for industrial diversification in US metropolitan areas. Similar results are obtained by Boschma et al. (2013) for the export profile of Spanish regions. By comparing the impact of relatedness for different spatial levels, the authors also show related industries to play a more crucial role at the regional compared to the national level. Rigby (2015) and Boschma et al. (2015) analyze regional diversification in US metropolitan areas. Both confirm that technology entries are positively, and exits are negatively, correlated with their relatedness to regions’ technology portfolios.

The ample empirical evidence for related diversification being the norm rather than the exception reveals the dominant role of path dependency in diversification processes. By building on related capabilities, economic actors follow existing technological trajectories, rely on established routines, and build on familiar knowledge (Nelson and Winter 1982; Dosi 1988). Building on existing capabilities rather than exploring completely new ones reduces uncertainties and risks while increasing the likelihood of successful diversification.

The path dependency in regional diversification certainly has substantial advantages. For instance, regions can specialize and build competitive advantages in certain activities providing them with important growth opportunities (Martin and Sunley 2006; Boschma and Frenken 2006). The continuous specialization of Silicon Valley into information and communication technologies is a prominent example of successful related diversification along a promising path (Storper et al. 2015). Nevertheless, related diversification can also lead to regional lock-ins by following mature paths with few future prospects, such as in the German Ruhr-Area (Grabher 1993). Diversification into unrelated activities can prevent such lock-ins by broadening the set of regional capabilities. In addition, it increases regional resilience toward external shocks (Frenken et al. 2007). Yet unrelated diversification requires the exploration of new knowledge, which is uncertain, risky, and less promising.

2.3 R&D subsidies and regional diversification

Can project-based R&D subsidies impact regional diversification? If so, how? Firstly, diversification requires organizations to leave existing routines by exploring new activities involving novel (at least to the organization) knowledge and technologies. It further implies less foresight on potential outcomes and lower abilities to plan R&D processes as well as commercialization possibilities. Existing routines are less helpful in designing financial plans, selecting appropriate suppliers, or buying needed equipment. Consequently, diversification-oriented R&D can be expected to represent a risky and uncertain undertaking. Organizations therefore show a tendency to avoid diversification into completely new activities. R&D subsidies can to some extent compensate for the risks associated with diversification and induce actors to explore new activities (Fier et al. 2006). We therefore argue that organizations are highly likely to use R&D subsidies for (risky) diversification activities.

Secondly, the effects of project-based R&D subsidies unfold beyond the individual organization (Broekel 2015; Maggioni et al. 2014). Organizations are embedded into regional economies through labor mobility, collaboration, social networks, input–output linkages, and other types of interactions. This is highlighted in various approaches, including regional innovation systems, learning regions, and clusters (Cooke 1998; Florida 1995; Porter 2000). Accordingly, knowledge and competences that are acquired in subsidized projects are more likely to be picked up and utilized by other regional actors. In this sense, R&D subsidies present a resource inflow into the region’s innovation system supporting innovation activities, including those oriented toward diversification.

Thirdly, regional diversification frequently takes place through spin-off and start-up processes (Boschma and Wenting 2007; Klepper 2007; Boschma and Frenken 2011). At the same time, spin-offs in particular have been identified as frequent and above-average recipients of R&D subsidies (Cantner and Kösters 2012). The added value of the support thereby exceeds what has been discussed above. Fier et al. (2006) identified subsidies to support university spin-outs by adding credibility and strengthening public relations. Under the assumption that there is no discrimination against spin-offs active in technologies new to a region, R&D subsidies thereby directly support regional diversification.

Fourthly, many R&D subsidization initiatives seek to advance particular technologies (e.g., biotechnology) (Dohse 2000). Announcing such initiatives signals to economic actors that these technologies are (at least in the eyes of policymakers) promising and may offer economic potential. If effective, this is likely to stimulate actors to expand already existing activities in these technologies or diversify into these activities.

In sum, R&D subsidies alleviate risks of research activities with uncertain outcomes. Therefore, they encourage riskier research, expand R&D resources, and exert particular benefits for spin-offs as well as spin-outs. In turn, all these contribute to regional diversification. Notably, the discussed effects are largely independent of the policy being designed to support diversification. Naturally, such diversification-enhancing effects are amplified when R&D subsidization policies aim to support diversification, as was the case in the BioRegio contest (Dohse 2000).

Many of the described mechanisms are working at the level of organizations. However, successful diversification at this level does not necessarily imply that a new activity is also new to the region. Figure 1 illustrates the two scenarios of regional diversification (panels C and D) in contrast to those of no diversification (panel A) and diversification at the organizational but not regional level (panel B). Clearly, the main mechanisms of regional diversification unfold their force at the level of organizations. However, regional diversification goes beyond this: it does not cover the case of an organization engaging in a new activity that is already performed by another organization in the region. In the remainder of the paper, we focus on scenarios C and D when referring to regional diversification. Scenario C occurs when an organization is active in multiple regions and shifts or expands one of its activities from one region into another without any other organization in that region being active in this field. In contrast, scenario D refers to the case of an organization taking up an activity that was not part of its portfolio or of that of any other organization in the region. We argue that R&D subsidies are likely more relevant for diversification activities that are new to the region, as actors face higher risks and uncertainties if they can build neither on their own competences nor on those of other local organizations. While this focus disregards some diversification activities at the organizational level (panel B), it captures substantial additions to the regional technological portfolio.

Fig. 1 The interplay of organizational and regional diversification

We further argue that not all subsidies equally impact all diversification processes. We particularly expect them to matter more for regions diversifying along existing technological trajectories (related diversification). The primary reason for this is that the subsidies are more likely to be received by projects building on existing regional competences. Innovation policy does not allocate R&D subsidies randomly. Applications need to pass a review process, which usually aims at selecting those with the highest chances of being successful (Aubert et al. 2011). This applies to applications with applicants’ competences meeting those necessary for the successful completion of projects. In addition, organizations usually require technological expertise, prior experiences, infrastructure, and matching qualifications to write convincing applications. This is more likely when organizations are active in similar or related activities (Blanes and Busom 2004; Aschhoff 2008).

This selection process is not restricted to the organizational level. For instance, Broekel et al. (2015b) show that even when controlling for organizational characteristics, being located in a regional cluster (of related activities) increases the chances of receiving R&D subsidies (at least in the case of EU-FRP). One of the reasons is that organizations located within clusters “are more likely to learn about subsidization programs, which is probable to translate into higher application rates” (Broekel et al. 2015b, p. 1433). It seems reasonable to assume that this especially applies to policy initiatives related to activities of organizations within the cluster. Consequently, we expect that R&D policy plays a role in the path dependency in regional diversification by preferentially allocating public resources to related, rather than to unrelated, capabilities in regions. Our first hypotheses read as follows:


Project-based subsidization of R&D positively influences technological diversification in regions.


Project-based subsidization of R&D is more likely to contribute to related diversification.

While these hypotheses refer to R&D subsidies in general, we argue that the influence of R&D policy depends on its specific mode. Previous research has shown that the effects of R&D subsidization differ between subsidies granted to individual and joint research projects (Broekel and Graf 2012; Broekel 2015). In contrast to subsidies for individual projects, supporting joint R&D projects has a greater potential for stimulating the exploration of new knowledge and activities, as these require collaboration between organizations. Consequently, such support is likely to change organizations’ and regions’ embeddedness into intra-regional and inter-regional knowledge networks (Fier et al. 2006; Wanzenböck et al. 2013; Broekel 2015; Töpfer et al. 2017). For instance, Broekel et al. (2017) measure the technological similarity of partners in subsidized projects and find these to be rather heterogeneous. Firms are also shown to add science organizations to their portfolio of collaboration partners when participating in subsidized R&D projects (Fier et al. 2006).

The utilization of subsidies to explore new knowledge is further highlighted by the location of collaboration partners. In Germany, only 12% of collaborations established by joint projects subsidized by the federal government connect partners within the same region (Broekel and Mueller 2018). In the case of the EU-FRP for biotechnology, this figure is as small as one percent (Broekel et al. 2015b). Accordingly, project-based subsidies are frequently employed to establish or strengthen relations with dissimilar actors from different regions, which is crucial and typical for diversification activities (Hagedoorn 1993; Boschma and Frenken 2011; van Oort et al. 2015). We therefore expect subsidies for joint (collaborative) research to have stronger effects than individual grants, due to their impact on collaboration and knowledge networks. As collaborative R&D subsidies facilitate knowledge exchange between new and heterogeneous actors, we particularly expect joint research projects to increase the likelihood of unrelated diversification in regions. This is summarized in the following hypotheses:


Subsidized joint R&D projects contribute to a larger extent to technological diversification in regions than do individual R&D projects.


Subsidized joint R&D projects facilitate regional diversification into unrelated activities.

3 Data and methods

3.1 Measuring regional diversification

To study the relationship between R&D subsidies and regional diversification, we focus on 141 German labor-market regions (LMR), as defined by Kosfeld and Werner (2012). Our data cover the years from 1991 to 2010. Following common practice, we use patent data to approximate technological activities (Boschma et al. 2015; Rigby 2015; Balland et al. 2019). Despite well-discussed drawbacks (Griliches 1990; Cohen et al. 2000), patents entail detailed information about the invention process, such as the date, location, and technology, all of which are fundamental for our empirical analysis. We extract patent information from the OECD REGPAT Database, which covers patent applications at the European Patent Office (EPO). Based on inventors’ residences, we assign patents to the corresponding LMR. For smaller regions in particular, annual patent counts are known to fluctuate strongly, challenging robust estimations. We therefore aggregate our data into four 5-year periods (1991–1995, 1996–2000, 2001–2005, 2006–2010).

Technologies are classified according to the International Patent Classification (IPC). The IPC is organized hierarchically, with eight classes at the highest and more than 71,000 classes at the lowest level. We aggregate the data to the four-digit IPC level, which differentiates between 630 distinct technology classes. The four-digit level represents the best trade-off between a maximum number of technologies and sufficiently large patent counts in each of these classes.

Previous studies relied on the location quotient (LQ), also called revealed technological advantage (RTA), to identify diversification processes. For example, LQ values larger than one signal the existence of technological competences in a region, and values below signal their absence. Successful diversification is then identified when the LQ grows from below one to above one between two periods (Boschma et al. 2015; Rigby 2015; Cortinovis et al. 2017; Balland et al. 2019). We refrain from this approach for two important reasons. Firstly, being a relative measure, the LQ approach allows technologies to “artificially” emerge in regions simply by decreasing patent numbers in other regions. Secondly, the LQ is normalized at the regional and technology levels, which can interfere with the inclusion of regional and technology fixed effects in panel regressions.
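The first concern can be illustrated with a minimal sketch. It assumes the standard form of the LQ (a region's share of patents in a technology divided by that technology's share across all regions); the function name `location_quotient` and all patent counts are hypothetical and not taken from our data.

```python
def location_quotient(counts, region, tech):
    """LQ = (region's share of patents in tech) / (overall share of tech).

    counts: dict mapping region -> dict mapping technology -> patent count.
    """
    region_total = sum(counts[region].values())
    tech_total = sum(c.get(tech, 0) for c in counts.values())
    grand_total = sum(sum(c.values()) for c in counts.values())
    regional_share = counts[region].get(tech, 0) / region_total
    overall_share = tech_total / grand_total
    return regional_share / overall_share


# Hypothetical counts: region A holds 2 patents in technology k in both periods,
# while region B's patents in k drop from 10 to 1.
period_t = {"A": {"k": 2, "m": 8}, "B": {"k": 10, "m": 10}}
period_t1 = {"A": {"k": 2, "m": 8}, "B": {"k": 1, "m": 19}}

lq_t = location_quotient(period_t, "A", "k")
lq_t1 = location_quotient(period_t1, "A", "k")
print(lq_t, lq_t1)
```

In this toy example, the LQ of region A in technology k moves from below one to above one although A's own patent counts are unchanged, which an LQ-based definition would record as an entry.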

We therefore rely on an alternative and more direct approach to assess diversification processes by concentrating on absolute changes in regional patent numbers. More precisely, we create the binary dependent variable Entry with a value of 1 if we do not observe any patents in technology k in region r and period t, and a positive number of patents in the subsequent period \(t+1\). We intensively checked the data for random fluctuations between subsequent periods, which can inflate the number of observed entries. The aggregation of regional patent information into 5-year periods, however, eliminated such cases almost completely.
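The construction of Entry can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names (`period_of`, `aggregate`, `entry`) and the patent records are hypothetical, and the sketch simply returns 0 for every region-technology pair that does not satisfy the entry condition.

```python
from collections import Counter

PERIOD_START, PERIOD_LENGTH = 1991, 5  # 1991-1995, 1996-2000, 2001-2005, 2006-2010


def period_of(year):
    """Index of the 5-year period a year falls into (0 for 1991-1995)."""
    return (year - PERIOD_START) // PERIOD_LENGTH


def aggregate(patents):
    """Count patents per (region, technology, period) from (region, tech, year) records."""
    return Counter((r, k, period_of(y)) for r, k, y in patents)


def entry(counts, region, tech, t):
    """1 if there are no patents in period t and at least one in period t+1, else 0."""
    return int(counts[(region, tech, t)] == 0 and counts[(region, tech, t + 1)] > 0)


# Hypothetical records: (region, four-digit IPC class, application year).
patents = [("R1", "A61K", 1993), ("R1", "H01L", 1998), ("R1", "H01L", 1999)]
counts = aggregate(patents)

print(entry(counts, "R1", "H01L", 0))  # absent in 1991-1995, present in 1996-2000
print(entry(counts, "R1", "A61K", 0))  # already present in 1991-1995
```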

3.2 Information on R&D subsidies

Our main explanatory variable, Subsidies, represents the sum of R&D projects in technology class k and region r at time t. The so-called Foerderkatalog of the German Federal Ministry of Education and Research (BMBF) serves as our data source. The BMBF data cover the largest parts of project-based R&D support at the national level in Germany (Czarnitzki et al. 2007; Broekel and Graf 2012) and have been used in a number of previous studies (Broekel and Graf 2012; Broekel et al. 2015a, b; Cantner and Kösters 2012; Fornahl et al. 2011). The data provide detailed information on granted individual and joint R&D projects, such as the starting and ending dates, the location of the executing organization, and a technological classification called Leistungsplansystematik (LPS).

The LPS is a classification scheme developed by the BMBF and consists of 47 main classes. The main classes are, similarly to the IPC, disaggregated into more fine-grained subclasses, which comprise 1395 unique classes at the most detailed level. To create the variable Subsidies, we need to match the information on R&D subsidies with the patent data. Both are based on different classification schemes (IPC and LPS), which prevents a direct matching. Moreover, there is no existing concordance of the two classifications.

We therefore develop such a concordance. To build the concordance, we reduce the information contained in the Foerderkatalog by excluding classes that are irrelevant for patent-based innovation activities. This primarily refers to subsidies in the fields of social sciences, general support for higher education, gender support, and labor conditions. Next, we utilize a matched-patent-subsidies-firm database created by the Halle Institute of Economic Research. This database includes 325,497 patent applications by 5398 German applicants between 1999 and 2017. It also contains information on 64,156 grants of the Foerderkatalog with 10,624 uniquely identified beneficiaries. In this case, beneficiaries represent so-called executive units (“Ausführende Stelle”) (see Broekel and Graf 2012).

In this database, grant beneficiaries and patent applicants are linked by name-matching. Hence, the IPC classes of beneficiaries' patents can be linked to the LPS classes of their grants. In principle, this information allows for a matching of the most fine-grained level of the IPC and LPS. In this case, however, the majority of links are established by a single incidence of IPC classes coinciding with LPS classes, i.e., there is only one organization with a patent in IPC class k and a grant in LPS class l. Moreover, the concordance is characterized by an excessive number of zeros, as only few matches of the 71,000 (IPC) \(\times\) 1395 (LPS) cases are realized.

To render the concordance more robust, we therefore establish the link on a more aggregated level, which also makes the concordance correspond to the data employed in this study. More precisely, we aggregate the IPC classes to the four-digit level and the LPS to the 47 main classes defined in BMBF (2014). It is important to note that not all LPS main classes are relevant for patent-based innovation (e.g., arts and humanities). We eliminate such classes and eventually obtain 30 LPS main classes that are matched to 617 out of 630 empirically observed IPC classes. For these, we calculate the share of organizations \(S_{l,k}\) with grants in LPS l that also patent in IPC k:

$$\begin{aligned} S_{l,k}=\frac{n_{l,k}}{\sum _{x=1}^{X_l} n_{x}} \end{aligned}$$

with \(n_{l,k}\) being the number of organizations with at least one patent in k and grant in l. \(X_l\) is the total number of organizations with grants in l. On this basis, we calculate the number of subsidized projects, \( \hbox {Subsidies}_{l,k}\), assigned to region r and technology k by multiplying the number of grants in l acquired by regional organizations with patents in k with \(S_{l,k}\). Following the discussion in Sect. 2, we calculate Subsidies in three versions: on the basis of all subsidized projects (\( \hbox {Subsidies}_{k,r}\)), for individual projects (\( \hbox {Subsidies}^{\mathrm{Single}}_{k,r}\)), and considering only joint projects (\( \hbox {Subsidies}^{\mathrm{Joint}}_{k,r}\)) in technology class k and region r.
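The two steps can be sketched in a few lines. This is a minimal illustration under stated assumptions, not our actual implementation: the function names, organization identifiers, and class labels are hypothetical; the denominator is taken to be the number of organizations with grants in l, as described in the text; and summing the weighted grant counts over all LPS classes to obtain one value per technology and region is an assumption of the sketch.

```python
def concordance_shares(grant_classes, patent_classes):
    """S[(l, k)]: share of organizations with grants in LPS class l that also patent in IPC class k.

    grant_classes: dict org -> set of LPS classes in which the org holds a grant.
    patent_classes: dict org -> set of IPC classes in which the org holds a patent.
    """
    orgs_with_grant = {}
    for org, lps_classes in grant_classes.items():
        for l in lps_classes:
            orgs_with_grant.setdefault(l, set()).add(org)

    shares = {}
    for l, orgs in orgs_with_grant.items():
        matches = {}  # k -> number of organizations with a grant in l and a patent in k
        for org in orgs:
            for k in patent_classes.get(org, set()):
                matches[k] = matches.get(k, 0) + 1
        for k, n_lk in matches.items():
            shares[(l, k)] = n_lk / len(orgs)
    return shares


def weighted_subsidies(regional_grants, shares, tech):
    """Weight regional grant counts per LPS class by S[(l, tech)] and sum over l (assumption)."""
    return sum(n * shares.get((l, tech), 0.0) for l, n in regional_grants.items())


# Hypothetical organizations, LPS classes (L1, L2), and IPC classes (K1, K2).
grant_classes = {"org1": {"L1"}, "org2": {"L1"}, "org3": {"L1", "L2"}, "org4": {"L2"}}
patent_classes = {"org1": {"K1"}, "org2": {"K1", "K2"}, "org3": {"K2"}}
shares = concordance_shares(grant_classes, patent_classes)

# Hypothetical grant counts of one region's organizations that patent in K2.
subsidies_k2 = weighted_subsidies({"L1": 3, "L2": 4}, shares, "K2")
print(shares, subsidies_k2)
```

Running the same calculation separately on individual and on joint projects would yield the two mode-specific versions of the variable.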

3.3 Relatedness density

Our second most important explanatory variable is relatedness. We follow the literature in constructing this variable as a density measure (Hidalgo et al. 2007; Rigby 2015; Boschma et al. 2015). More precisely, relatedness density reveals how well technologies fit to the regional technology landscape. It is constructed in two steps.

Firstly, we measure technological relatedness between each pair of technologies. The literature suggests four major approaches: (1) entropy-based (Frenken et al. 2007), (2) input–output linkages (Essletzbichler 2015), (3) spatial co-occurrence (Hidalgo et al. 2007), and (4) co-classification (Engelsman and van Raan 1994). We follow the fourth approach and calculate technological relatedness between two technologies (four-digit patent classes) based on their co-classification pattern (co-occurrence of patent classes on patents). The cosine similarity gives us a measure of technological relatedness between each technology pair (Breschi et al. 2003).
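A minimal sketch of a co-classification-based cosine measure is given below. It assumes one common variant, in which each class is represented by a binary vector over patents, so that the cosine equals the number of co-occurrences divided by the square root of the product of the two class frequencies; the exact variant may differ. The function name and the toy patents are hypothetical.

```python
from itertools import combinations
from math import sqrt


def cosine_relatedness(patents, classes):
    """Cosine similarity between classes based on their co-occurrence on patents.

    patents: list of lists, each holding the classes assigned to one patent.
    classes: list of all classes; every class on a patent must appear here.
    Returns a dict mapping (a, b) and (b, a) to the similarity of a and b.
    """
    occurrences = {c: 0 for c in classes}
    co_occurrences = {}
    for patent_classes in patents:
        present = sorted(set(patent_classes))
        for c in present:
            occurrences[c] += 1
        for a, b in combinations(present, 2):
            co_occurrences[(a, b)] = co_occurrences.get((a, b), 0) + 1

    rho = {}
    for a, b in combinations(sorted(classes), 2):
        denominator = sqrt(occurrences[a] * occurrences[b])
        value = co_occurrences.get((a, b), 0) / denominator if denominator else 0.0
        rho[(a, b)] = rho[(b, a)] = value
    return rho


# Hypothetical patents with their four-digit IPC classes.
classes = ["A61K", "C07D", "H01L"]
patents = [["A61K", "C07D"], ["A61K", "C07D"], ["C07D", "H01L"], ["H01L"]]
rho = cosine_relatedness(patents, classes)
print(rho[("A61K", "C07D")], rho[("C07D", "H01L")], rho[("A61K", "H01L")])
```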

Secondly, we determine which technologies belong to regions’ technology portfolios at a given time. Straightforwardly, we use patent counts with positive numbers indicating the presence of a technology in a region. Following Hidalgo et al. (2007), we measure relatedness density on this basis as:

$$\begin{aligned} \hbox {Density}_{k,r} = \frac{\sum _{m} x_{m} \; \rho _{k,m}}{\sum _{m} \rho _{k,m}} \times 100 \end{aligned}$$

where Density stands for relatedness density. \(\rho\) indicates the technological relatedness between technology k and m, while \(x_{m}\) is equal to 1 if technology m is part of the regional portfolio (Patents \(> 0\)) and 0 otherwise (Patents \(=\) 0). Consequently, we obtain a 141 \(\times\) 630 matrix including the relatedness density for each of the 630 IPC classes in all 141 LMRs indicating their respective relatedness to the existing technology portfolio of regions.
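The two steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `cosine_relatedness` and `relatedness_density` are hypothetical, the diagonal of the co-classification matrix is ignored, and the density of a technology with no related technologies (zero denominator) is set to 0 by convention.

```python
from math import sqrt

def cosine_relatedness(cooc):
    """Cosine similarity between the co-classification profiles of technologies.

    cooc[i][j] = number of patents assigned to both technology i and j.
    Assumption: the diagonal (self co-occurrence) is ignored.
    """
    n = len(cooc)
    rows = [[cooc[i][j] if i != j else 0 for j in range(n)] for i in range(n)]
    norms = [sqrt(sum(v * v for v in row)) for row in rows]
    rho = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and norms[i] > 0 and norms[j] > 0:
                dot = sum(a * b for a, b in zip(rows[i], rows[j]))
                rho[i][j] = dot / (norms[i] * norms[j])
    return rho

def relatedness_density(patents_r, rho):
    """Density_{k,r} for a single region r, one value per technology k.

    patents_r[m] = patent count of technology m in the region;
    x_m = 1 if the count is positive and 0 otherwise.
    """
    x = [1 if p > 0 else 0 for p in patents_r]
    density = []
    for k in range(len(rho)):
        total = sum(rho[k])
        present = sum(x[m] * rho[k][m] for m in range(len(rho)))
        # Assumption: density is set to 0 when technology k has no related technologies.
        density.append(100.0 * present / total if total > 0 else 0.0)
    return density
```

Applying `relatedness_density` to each of the 141 regions in turn would yield one row of the 141 \(\times\) 630 matrix described above.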

3.4 Control variables

In addition to R&D subsidies and relatedness density, the empirical literature has identified a number of other determinants of regional technological diversification. Knowledge spillovers from adjacent regions can potentially impact regional diversification processes (Boschma et al. 2013). We account for these potential spatial spillovers by including technological activities in neighboring regions (\(Neighbor \; Patents_{k,r}\)) as a spatially lagged variable. The variable counts the number of patents in technology k across all neighboring regions s of region r. Regions s and r are neighbors if they share a common border.
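The spatially lagged variable described above can be illustrated as follows. The sketch assumes a simple dictionary layout for patent counts and border relations; the function name `neighbor_patents` and the data structures are hypothetical and chosen only for illustration.

```python
def neighbor_patents(patents, borders):
    """Sum patents per technology over all regions sharing a border with r.

    patents: dict mapping region -> list of patent counts, one per technology
    borders: dict mapping region -> set of regions with a common border
    Returns a dict mapping region -> list of neighbor patent counts.
    """
    n_tech = len(next(iter(patents.values())))
    lag = {}
    for r in patents:
        neighbors = [s for s in borders.get(r, ()) if s != r]  # a region is not its own neighbor
        lag[r] = [sum(patents[s][k] for s in neighbors) for k in range(n_tech)]
    return lag
```

For three regions arranged in a row (A borders B, B borders C), the middle region B receives the summed counts of A and C, while A and C each receive only the counts of B.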

We also control for a number of time-varying regional and technology characteristics that influence regional diversification processes. Firstly, regional diversification depends on the development stage of regions (Petralia et al. 2017). Economically well-performing regions have more opportunities to diversify into new and more advanced activities than less developed regions. We follow existing approaches and use the gross domestic product per capita (\(Regional\;GDP_{r}\), log transformed) to control for the economic performance of regions (Petralia et al. 2017; Balland et al. 2019). Secondly, the size of the region also plays a role. Regions with a larger workforce tend to be more successful in terms of diversification (Boschma et al. 2015; Balland et al. 2019). We therefore include the number of employees in a region (\(Regional\;Employment_{r}\), log transformed) in our empirical model. Both variables, \(Regional\;GDP_{r}\) and \(Regional\;Employment_{r}\), are obtained from the German “Arbeitskreis Volkswirtschaftliche Gesamtrechnungen der Länder” (August 2018). Thirdly, we consider the number of regional patents (\(Regional\;Patents_{r}\)) to control for the size of the regional patent stock, which also serves as a measure of regions’ overall technological capabilities. Fourthly, diverse regions with larger sets of capabilities have more opportunities to move into new fields than regions with narrow sets (Hidalgo et al. 2007). The regional diversity variable (\(Regional\;Diversity_{r}\)) captures this and is defined as the number of technologies k with positive patent counts in a region. Lastly, we control for the size of technologies by considering the number of patents in a given technology (\(Technology\;Size_{k}\)). Descriptive statistics and correlations for all variables are reported in Table 1.

Table 1 Summary statistics and correlation matrix

3.5 Empirical model

We follow an established approach in the literature on regional diversification to set up our empirical model (Boschma et al. 2015; Balland et al. 2019). More precisely, we rely on panel regressions to explain the status of technological diversification in a region. Our basic model is specified as follows:

$$\begin{aligned} Entry_{k,r,t}= & {} \beta _{1}\hbox {Subsidies}_{k,r,t-1} + \beta _{2}Density_{k,r,t-1}\nonumber \\&+ X_{k,r,t-1} + R_{r,t-1} + T_{k,t-1} + \tau _{k} + \pi _{r} + \omega _{t} + \epsilon _{k,r,t} \end{aligned}$$

Entry indicates the status of diversification into technology k of region r at time t. Accordingly, all estimations are conducted at the region-technology level. Subsidies is the number of subsidized R&D projects. In alternative models, it is replaced with the number of individual (\( \hbox {Subsidies}^{\mathrm{Single}}\)) and joint projects (\( \hbox {Subsidies}^{\mathrm{Joint}}\)). Density is the relatedness density, and X, R, and T are vectors of control variables at the technology-region, region, and technology level. All estimations include technology (\(\tau\)), region (\(\pi\)), and time (\(\omega\)) fixed effects capturing time-invariant unobserved heterogeneity. We assume that our dependent variable responds to variation in the explanatory variables with a time delay. R&D subsidies, for example, are unlikely to cause immediate effects visible in innovation activities as approximated by patents. Rather, they unfold their influence in subsequent years. Consequently, we lag the explanatory variables by one time period, which corresponds to 5 years (see Footnote 1).

As Entry is a binary variable, a logit regression would be a natural choice. However, logit regressions with many fixed effects and few time periods are prone to the well-known incidental parameters problem, which biases the results (Neyman and Scott 1948). Therefore, we rely on a linear probability model (LPM) to assess the probability that technology k emerges in region r. We nevertheless report the results of the three-way fixed effects logit regression in our robustness checks (see Table 7 in the Appendix). An entry model implies restricting the observations to those cases in which an entry is possible. Accordingly, we reduce the sample to all potential cases of entry, which corresponds to technology k being absent from the regional technology portfolio in \(t-1\) (zero patents).
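The sample restriction can be made concrete with a short sketch. It assumes, for illustration only, that an entry is recorded when a technology with zero patents in \(t-1\) shows a positive patent count in t; the exact definition of Entry given earlier in the paper takes precedence. The function names and the dictionary layout are hypothetical.

```python
def entry_observations(patents_prev, patents_now):
    """Build the at-risk sample of region-technology pairs.

    patents_prev, patents_now: dicts mapping (region, technology) -> patent count
    in period t-1 and period t. Only pairs with zero patents in t-1 are kept.
    Assumption: Entry = 1 if the pair has a positive patent count in t.
    """
    obs = {}
    for key, prev in patents_prev.items():
        if prev == 0:  # technology absent from the regional portfolio in t-1
            obs[key] = 1 if patents_now.get(key, 0) > 0 else 0
    return obs

def entry_rate(obs, region):
    """Realized entries divided by possible entries for one region."""
    outcomes = [e for (r, _k), e in obs.items() if r == region]
    return sum(outcomes) / len(outcomes) if outcomes else None
```

Pairs in which the technology was already present in \(t-1\) never enter the estimation sample, so the dependent variable only distinguishes realized from unrealized potential entries.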

4 Results

4.1 The allocation of R&D subsidies

We start by exploring the allocation of R&D subsidies. Panel A in Fig. 2 shows the distribution of R&D subsidies across the 630 IPC subclasses between 2006 and 2010. The colors indicate the eight main sections of the IPC. Panel A shows that subsidies are not widely scattered across all main sections but rather concentrate in specific domains. A large portion of subsidies flows into technologies belonging to physics, chemistry, electricity, and human necessities. In contrast, textiles, mechanical engineering, and construction technologies receive considerably fewer subsidies. IPC subclasses such as G01N (Investigating or Analysing Material), H01L (Semiconductors), A61K (Preparation for Medical Purposes), and C12N (Microorganisms and Genetic Engineering) are among the most strongly subsidized technologies.

Fig. 2 Distribution of (a) R&D subsidies and (b) percentage of entries across IPC subclasses between 2006 and 2010. Colors indicate the eight IPC main sections. The dashed horizontal lines represent the sample mean

Panel B of Fig. 2 shows how frequently technologies emerge in regions. Larger entry numbers indicate that many regions diversified into the corresponding technologies. This reflects the spatial diffusion of these technologies within Germany. Entry numbers vary considerably between technologies, with each IPC section containing both low- and high-entry technologies. The visual inspection of Fig. 2 reveals that subsidies are not necessarily allocated to technologies with the highest numbers of entries. For example, technologies in mechanical engineering and fixed construction show large numbers of entries and receive comparatively few subsidies. In other cases, there seems to be some alignment. For instance, the four technologies with the highest entry numbers (F24J = Production or use of heat, C10L = Fuels, F03D = Wind motors, and E21B = Earth and rock drilling) represent technological fields related to renewable energy production or energy usage. Renewable energies have become very popular in Germany and are still strongly subsidized to support the transition from fossil energy sources to renewables (Jacobsson and Lauber 2006). This is also reflected in our data, as in this case, subsidization seems to correspond to technological entry.

Fig. 3 Relationship between relatedness density and R&D subsidies in different time periods: (a) 1991–1995, (b) 1996–2000, (c) 2001–2005, and (d) 2006–2010

Another interesting aspect is the relationship between subsidy allocation and relatedness density. Figure 3 visualizes relatedness density for subsidized and non-subsidized technologies over all four time periods (panels A to D). It is striking that relatedness density differs substantially between the two groups. Subsidized technologies are on average characterized by higher relatedness densities than non-subsidized ones. Notably, this difference has grown over time. This suggests that R&D policy has increasingly subsidized related technologies in regions.

We complement the visual inspection of the relationship between subsidy allocation and relatedness density with a linear panel regression. Subsidies (and its disaggregation into \( \hbox {Subsidies}^{\mathrm{Single}}\) and \( \hbox {Subsidies}^{\mathrm{Joint}}\)) serves as the dependent variable and Density as the main explanatory variable. Control variables capture potential confounders, and fixed effects account for time-invariant omitted variables. Table 2 reports the results. The findings clearly support the previous visual interpretation. Technologies in regions are more likely to receive R&D subsidies when they are related to existing regional capabilities. In sum, the results for the allocation of subsidies in Germany suggest that contemporary project-based R&D subsidization tends to support path-dependent, related diversification in regions.

Table 2 Regression results for the allocation of subsidies

4.2 The relationship between R&D subsidies and technological diversification in regions

The link between R&D subsidies and technological diversification in regions is central to the present paper. Figure 4 maps entry rates (panel A; see Footnote 2), the average relatedness density (panel B), the spatial allocation of R&D subsidies (panel C), and the number of patents (panel D) across the 141 German regions. The maps highlight a number of interesting spatial patterns. Firstly, entry rates tend to be larger in regions with higher patenting activities. For example, South Germany, with Munich and Stuttgart as innovative regions, is characterized by particularly high entry rates. Similar patterns are also observed for the West of Germany with Cologne and North Germany with Hamburg and Hanover as centers of innovation and technological entries. Nevertheless, some regions experience high entry rates while being only moderately successful in patenting (e.g., Chemnitz and Dresden in Saxony).

Fig. 4 (a) Entry rates, i.e., realized entries divided by possible entries, (b) average relatedness density of realized entries, (c) number of subsidized R&D projects, and (d) number of patents in German LMRs between 2006 and 2010

Secondly, higher entry rates seem to correlate strongly with the average relatedness density in regions. That is, regions characterized by higher relatedness densities also realize a larger share of their possible entries. This visual observation corresponds to the ample empirical evidence that related activities are more likely to emerge in regions than unrelated activities (Neffke et al. 2011; Boschma et al. 2013, 2015; Rigby 2015; Balland et al. 2019).

Thirdly, regions with lower patenting activities and lower entry rates (e.g., North-Eastern regions) receive more R&D subsidies than innovative regions with higher entry rates. More precisely, 9 out of the top 10, and 12 of the top 20 regions with the most subsidized R&D projects are located in the North and East of Germany. Accordingly, the allocation of R&D subsidies seems to follow a convergence strategy by favoring regions with fewer technological activities.

The central results of our regression analysis linking subsidies to entries are reported in Table 3. Regarding the control variables (see Models 2d, 2e, and 2f), we find patenting activities in neighboring regions (\(Neighbor\;Patents\)) to be positively associated with regional technological diversification, as indicated by the significantly positive coefficients for this variable in all models. Accordingly, being in spatial proximity to regions already successful in a particular technology renders diversification into this technology more likely. The positive link between activities in neighboring regions and regional diversification supports the idea of spatial knowledge spillovers, which are intensified by geographic proximity (Jaffe et al. 1993).

In addition, our models suggest that entries are less likely to occur in regions with large knowledge stocks. The corresponding coefficient of \(Regional\;Patents\) is significantly negative. Most likely, this is the outcome of a level effect: regions with strong inventive activities are already well diversified and successful. Hence, there are fewer opportunities for further diversification (see, for example, Imbs and Wacziarg 2003). A similar argument applies to the size of technologies, \(Technology\;Size\). Its coefficient is significantly negative, indicating that large technologies are less likely to emerge in regions. This is likely because large technologies are already well diffused in space and hence have fewer (remaining) opportunities to emerge. \(Regional\;Diversity\) remains insignificant, which is most likely due to its effect being captured by \(Regional\;Patents\) or by the fixed effects. The regional employment size (\(Regional\;Employment\)) and the economic performance of regions (\(Regional\;GDP\)) are not significant and thus do not appear to play an important role in regional technological diversification in German LMRs.

Table 3 Regression results of linear probability model for entries

In all models, relatedness density is significantly positive. Technologies that are related to existing regional capabilities are more likely to emerge in regions, which confirms the path dependency of regional diversification and the idea of regional branching. Hence, our results are in line with the numerous empirical studies on this matter (Boschma et al. 2013, 2015; Rigby 2015; Balland et al. 2019).

We now turn to the heart of our analysis. The variable Subsidies is included in the base Model 2a without any additional variables. Its coefficient is significantly positive. The variable remains significant when including relatedness density (Model 2c) and further control variables (Model 2e). Accordingly, we confirm our hypothesis H1a, as the relationship between subsidized R&D projects and regional technological diversification is positive.

To test our hypothesis H1b regarding a potential interplay between subsidies and relatedness, we included an interaction term of Density and Subsidies in Model 2f. However, the corresponding coefficient remains insignificant. Accordingly, entries are not more likely to occur when the underlying technologies are both related to the regional technology portfolio and recipients of R&D subsidies. Hence, our results do not support hypothesis H1b.

Besides the significance of the coefficient, the strength of the effect is also of interest. However, our matching of subsidies to patent data has severe implications for the interpretation of the effect size of Subsidies. Most subsidized R&D projects are allocated (i.e., divided) to multiple technologies (IPC subclasses). This results in a fractional counting of projects, such that for each observation (technology-region combination), the absolute numbers of assigned projects do not reflect full projects but rather the shares of a project assigned to this technology by the matching procedure presented in Sect. 3.2. Accordingly, the obtained coefficient of Subsidies does not correspond to full projects but to fractionally allocated project numbers. With this in mind, we suggest the following interpretation: increasing the number of fractionally allocated subsidized R&D projects by 0.012 increases the probability of entry by approximately 0.35% (see Footnote 3). Accordingly, the effects of subsidies appear to be relatively small.

We hypothesized that subsidies for single and joint projects are likely to have distinct effects on regional technological diversification (H2a). Table 4 reports the corresponding results of this differentiation. We include the two subsidy types in separate models. Both variables’ coefficients are significantly positive in all model specifications, confirming the previously identified positive relation between subsidies and diversification. In line with previous studies (Fornahl et al. 2011; Broekel et al. 2015a), however, the coefficient of \( \hbox {Subsidies}^{\mathrm{Joint}}\) [lower bound = 0.69, upper bound = 1.06], as reported in Model 3b, is significantly larger than that of \( \hbox {Subsidies}^{\mathrm{Single}}\) [lower bound = 0.20, upper bound = 0.40], as reported in Model 3a. This suggests that subsidies for joint R&D projects increase the likelihood of entries to a larger extent than do subsidies for individual projects, which confirms our hypothesis H2a. Expanding the number of joint projects by the average change between two consecutive time periods of 0.015 increases the entry probability by approximately 1.31% (see Footnote 4).
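As a rough plausibility check of this magnitude, recall that in a linear probability model the change in the entry probability equals the coefficient multiplied by the change in the regressor. The point estimate itself is not restated in the text; assuming, purely for illustration, that it lies at the midpoint of the reported bounds for \( \hbox {Subsidies}^{\mathrm{Joint}}\) (0.69 and 1.06, i.e., 0.875), the calculation reads:

$$\begin{aligned} \Delta \Pr (Entry = 1) \approx \hat{\beta }^{\mathrm{Joint}} \times \Delta \hbox {Subsidies}^{\mathrm{Joint}} \approx 0.875 \times 0.015 \approx 0.0131 \end{aligned}$$

Expressed in percent, this is about 1.31, which is consistent with the magnitude reported above. The midpoint is an assumption and need not coincide exactly with the estimate in Table 4.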

We also test for potential interaction effects between the two subsidy modes and relatedness to investigate hypothesis H2b. Interestingly, and in contrast to the findings for all subsidies, we find a significantly negative coefficient for the interaction of \( \hbox {Subsidies}^{Joint}\) and Density (Model 3f). This finding suggests that subsidized joint research projects can compensate for a lack of relatedness to some extent.

Table 4 Regression results of linear probability model for entries and subsidies for individual and joint projects

We investigate the interaction of Subsidies and Density in more detail by grouping our observations into three subsamples. The subsamples represent different parts of the distribution of relatedness density values, namely, low, medium, and high relatedness values (see Footnote 5). Models 4a and 4b in Table 5 report the results for the subsample with low relatedness density. Density is found to be insignificant, while the estimated coefficient of Subsidies is significantly positive. Again, our results suggest that it is important to consider the subsidy mode, as \( \hbox {Subsidies}^{\mathrm{Single}}\) (lower bound \(= -\) 0.080, upper bound \(=\) 0.157) is insignificant and \( \hbox {Subsidies}^{\mathrm{Joint}}\) (lower bound \(=\) 0.116, upper bound \(=\) 0.913) is significantly positive. This suggests that R&D subsidies for collaborative projects can compensate for missing relatedness, as there are no instances of high density in this sample and, hence, they cannot drive entry probabilities. The results change for larger relatedness values. Now Density becomes significant as well, while the coefficient of \( \hbox {Subsidies}^{\mathrm{Joint}}\) (lower bound \(=\) 0.251, upper bound \(=\) 0.525) decreases in size (Model 4f). Accordingly, these results confirm our hypothesis H2b: subsidies for joint projects are able to facilitate unrelated diversification, while this is not the case for subsidized individual projects.

Table 5 Regression results of linear probability model for three different levels of relatedness density

4.3 Robustness analyses

When evaluating the effects of R&D subsidies on regional technological diversification, endogeneity of subsidies represents a crucial concern. In our case, endogeneity can occur if technology entries in regions impact subsidy allocation. The use of time lags of 5 years implies that technology entries would need to influence the allocation of subsidies to that same technology in the region 5 years earlier (when the technology did not yet exist there). While this is an unlikely scenario, there might be effects at work that operate over long time periods.

Another source of endogeneity in our context is the non-random selection of recipients (Busom 2000; David et al. 2000; Aubert et al. 2011). R&D policy is more likely to reward projects with higher likelihoods of success. This is probable when recipients have been successful in acquiring projects in previous periods. For instance, subsidy recipients could have accomplished entries of technologies in regions in previous time periods, which, in turn, positively influenced the likelihood of receiving grants in subsequent projects in related technologies. Addressing this endogeneity problem is not straightforward. One possibility is to apply instrumental variables (IV) regressions. This requires a valid instrument at the level of technology-region pairs that additionally varies over time, in our case from 1991 to 2010. We follow Koski and Pajarinen (2015) and use the total number of subsidized projects (across all regions) in each technology to instrument the potentially endogenous subsidy variables at the region-technology level. Our previous analyses have shown that the two modes of subsidies yield distinct results. We therefore differentiate between individual (\(Total^{Single}_{k}\)) and joint projects (\(Total^{Joint}_{k}\)) in the construction of the instruments. The underlying rationale is that an increase in the total number of subsidized projects generally increases a specific region’s probability of acquiring a subsidized project in this technology.

In our context, the exclusion restriction of our instrumental variable regression states that, conditional on the control variables included in the model, the number of subsidized projects in a technology at the national level has no effect on the entry probability of this technology in a particular region five years later, other than through their direct allocation to this region. The exclusion restriction would not hold if the federal subsidies exerted a direct effect on the entry probability of a certain technology in a specific region. In principle, this is possible if large shares of federal subsidies are allocated to few regions and thus directly influence technological diversification in regions. However, the average share of all subsidized projects in one technology that is received by an individual region is 0.6% (the median share equals 0.22%). Accordingly, the influence of single regions on the general allocation seems to be rather marginal (see Footnote 6).

Another challenge could be that our dependent variable Entry has an effect on the allocation of federal subsidies five years before. We believe this to be highly unlikely, as the emergence of single technologies in some regions does not influence the allocation of subsidies by the federal government five years before. Consequently, we are confident that the total number of subsidized projects in a technology is a reliable instrument for the technology-region specific numbers and is thus suitable to address potential endogeneity concerns.
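The logic of the instrument can be illustrated with a deliberately simplified sketch. It is not the specification estimated in Table 6: it omits all control variables and fixed effects, uses a single instrument and a single endogenous regressor, and all function names and the dictionary layout are hypothetical.

```python
from statistics import mean

def national_totals(subsidies):
    """Sum subsidized projects per technology across all regions.

    subsidies: dict mapping (region, technology) -> number of projects.
    """
    totals = {}
    for (_region, tech), n in subsidies.items():
        totals[tech] = totals.get(tech, 0.0) + n
    return totals

def cov(a, b):
    """Sample covariance of two equally long sequences."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def ols_slope(y, x):
    """Slope of a bivariate OLS regression of y on x with intercept."""
    return cov(x, y) / cov(x, x)

def two_stage_least_squares(y, x, z):
    """2SLS with one endogenous regressor x, one instrument z, and an intercept.

    Returns the first-stage slope and the second-stage coefficient.
    """
    pi1 = ols_slope(x, z)                 # first stage: x regressed on z
    pi0 = mean(x) - pi1 * mean(z)
    x_hat = [pi0 + pi1 * zi for zi in z]  # fitted values of x
    beta = ols_slope(y, x_hat)            # second stage: y regressed on fitted x
    return pi1, beta
```

In this setup, z would correspond to the national total of subsidized projects in a technology, x to the region-technology subsidy count, and y to Entry. When an unobserved factor drives both x and y but is unrelated to z, the simple OLS slope is biased, whereas the second-stage coefficient is not.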

Table 6 reports the results of the first-stage and second-stage regressions. The first-stage regressions indicate that \(Total^{Single}\) (Model 5a) and \(Total^{Joint}\) (Model 5c) are relevant instruments, as they are positively related to the number of subsidized projects at the regional level. The results of the second-stage regression confirm the previously observed weak effect of individual projects on regional technological diversification. The corresponding coefficient of \( \hbox {Subsidies}^{\mathrm{Single}}\) is insignificant (lower bound \(= -\) 0.324, upper bound \(=\) 0.363) in the second-stage regression (Model 5b). Moreover, Model 5d confirms our results for the subsidization of joint research projects. The obtained coefficient of \( \hbox {Subsidies}^{\mathrm{Joint}}\) remains significantly positive (lower bound \(=\) 0.015, upper bound \(=\) 1.764) in the second stage of the IV specification. Consequently, the IV regressions substantiate our previous finding of a positive effect of collaborative R&D subsidies on regional technological diversification and underline that the two subsidy modes have distinct effects on regional technological diversification.

Table 6 Results of instrumental variables regression

5 Discussion and conclusion

Our study discusses and empirically tests the relationship between project-based R&D subsidies and regional technological diversification. It thereby contributes to two literature streams: the assessment of R&D subsidies’ effects and the literature on regional diversification. Existing studies on the effects of R&D subsidies primarily focus on their general contribution to innovation activities and their potential stimulation of R&D efforts, efficiency, and outputs. In this study, we argue that they may also support technological diversification, despite not necessarily being intended to do so. Accordingly, R&D subsidies may induce additional (positive) effects that have not yet been considered in existing evaluations. With respect to the literature on regional diversification, our study adds a crucial perspective that remains underdeveloped. While (related) diversification is empirically well investigated (Hidalgo et al. 2007; Rigby 2015; Boschma et al. 2015; Essletzbichler 2015), little attention has been paid to the role of R&D policy in this context.

We complement our arguments with an empirical study on the technological diversification of German regions and project-based R&D subsidization by the federal government. Our empirical results show that the allocation of these R&D subsidies tends to be positively biased toward related activities in regions. Accordingly, R&D policy seems to be part of the path dependency in regional diversification, as it reinforces related activities. This suggests a rather risk-averse allocation strategy. As related activities have greater chances of becoming successful than other activities (Neffke et al. 2011; Boschma et al. 2015; Rigby 2015), supporting them minimizes the chances of failure (see discussions in Dohse 2000; Cantner and Kösters 2012; Aubert et al. 2011). Most likely, it is the competitive character of the allocation process through which this risk aversion is implemented. When applications are evaluated, the quality of applicants and applications is relatively easy to assess. Therefore, it is likely to be weighted more strongly than less “objective” aspects, such as novelty and future development potential.

From the perspective of the literature on related variety (Frenken et al. 2007; Neffke et al. 2011) and the Smart Specialization strategy of the EU (Foray et al. 2011), our findings have to be evaluated as evidence for a positive contribution of the R&D subsidization policy to regions’ future growth and prosperity. By allocating subsidies to related technologies, R&D policies support the emergence and growth of related variety. The latter has been argued and empirically shown to stimulate regional (related) technological diversification, which, in turn, has been confirmed to matter for regions’ long-term economic growth (Frenken et al. 2007; Neffke et al. 2011; Kogler et al. 2013).

However, our study raises a crucial question rarely discussed in this context: Should policy, in fact, try to (directly or indirectly) facilitate related diversification? Put differently, is related diversification truly troubled by market failures justifying policy intervention? The regional branching mechanism suggests that related technologies are the most likely to emerge in regions (Boschma and Frenken 2010). In addition, one may argue that regional branching implies that diversification is a path dependent process that eventually leads to a thinning out of regional knowledge diversity. This in turn makes lock-in scenarios more likely, which are to be avoided due to their negative impact on growth and future developments.

In contrast, from a market-failure perspective, it can be argued that stimulating unrelated diversification should be the focus of R&D policy in order to break the constraints of existing path dependencies. Supporting unrelated diversification through policy increases regional knowledge diversity. Through a portfolio effect, diversity will render regions more resilient to external shocks, which is proposed as one of the main goals of innovation policy (Martin 2012). In addition, regional technological diversity lays the foundation for unexpected and uncommon knowledge recombination, which frequently forms the basis for breakthrough inventions (Uzzi et al. 2013; Kim et al. 2016).

In accordance with this perspective, our empirical results do not hint at a multiplicative effect of R&D subsidies and relatedness. In contrast, our findings suggest the existence of a substitutional relationship between relatedness and R&D subsidies at the regional level.

In addition, our results reveal the importance of differentiating between subsidies for individual and joint research projects (Broekel 2015). Subsidies for joint R&D projects exert a much stronger effect on regional technological diversification than those for individual projects. The difference becomes even more pronounced when applying instrumental variable regressions. In particular, subsidies for joint R&D projects are also able to compensate for missing relatedness to some extent. A similar effect is not observed for individual R&D subsidies. Most likely, it is their stimulation of interactions between new and heterogeneous actors from different regions, which facilitates inter-organizational learning, that explains the advantage of joint projects in this context. This adds to existing research showing their higher effectiveness for stimulating innovation activities in general (Fornahl et al. 2011; Broekel 2015; Broekel et al. 2017). It also raises the question of why the majority of projects subsidized by the German federal government do not yet involve inter-organizational collaboration (Broekel and Graf 2012).

Our paper opens a number of avenues for future research. The scope of our study is limited to technological diversification in regions, approximated by patent data. Although patent data have their justification and are often used in this context (Boschma et al. 2015; Rigby 2015; Balland et al. 2019), they also limit our analysis to technologies that can be patented. It is therefore important to study the link between subsidies and other forms of diversification to improve our understanding of policy impact on regional diversification. For instance, this concerns sectoral diversification measured with information on the occupational composition in regions, representing a crucial next step for future research.

Additionally, R&D policy still lacks the appropriate tools to identify promising but underdeveloped technologies and to evaluate the spatial context in which they (best) evolve. We believe that our paper takes a step in that direction by showing that regional branching helps in understanding the economic transformation of regions. Moreover, we provide an empirical setup for evaluating the role of a specific policy tool (R&D subsidies) in this context.