1 Introduction

The fight against climate change is arguably at an unprecedented critical phase. Experts concur that while there are enough capital, technology, policy instruments and scientific knowledge to cut carbon emissions significantly by 2030, inaction due to political inertia could trigger a calamitous domino effect for both the environment and society (Haines & Patz, 2004; McMichael et al., 2006). Countries and regions worldwide are actively exploring avenues to deal with the opportunities and challenges of shifting to a low-carbon regime, but such an endeavour also requires policies that promote a wide spectrum of innovations, including low-carbon technologies as well as sustainable production and consumption practices (Stern, 2007). According to Ayres and van den Bergh (2005, p. 116), these policies would enact “economic growth […] accompanied by structural change, which implies continuous introduction of new products and new production technologies, and changes in [energy] efficiency and dematerialisation”.

Against this backdrop, the present paper provides an overview of green technological development in European (EU) regions, which is deemed timely in view of the radical commitments stipulated in the recent EU Green Deal to achieve climate neutrality by 2050. Our goal is threefold. First, we explore the geographical distribution of innovative activities and profile EU regions in terms of technological capabilities. Second, we elaborate a metric to capture the shape of the local knowledge space and, consequently, to identify regions’ green innovation potential. Third, we check whether possessing comparative advantage in specific technological domains is associated with a region’s capacity to develop green technologies.

To frame these goals in current scholarly and policy debates, we call attention to two characteristics of the transition to low-carbon societies. First, geography matters. The European Commission (2015) along with other international bodies emphasises that regions and cities are responsible for implementing as much as 70% of green action plans. Of course, not all territories are equally proactive or well-equipped in terms of their innovation potential—due to differences in availability of natural resources, infrastructures, competences, and institutions—and regions differ also in terms of exposure to environmental impacts. As a result, green technologies may paradoxically emerge in more developed areas while the urgency of deploying those technologies is stronger in poorer regions (Bathiany et al., 2018; Mendelsohn et al., 2006).

This calls for an analytical framework rooted in economic geography that accounts for spatial differences in the transition towards sustainable economies. Territorial differences provide a clear rationale for the regional and local governance of environmental transitions, so that spatially differentiated transformation trajectories reflect local needs and potentials (Truffer & Coenen, 2012). Thereby, the spatial dimension is pivotal in at least two ways. The first is by reaffirming the centrality of institutions that oversee the reorganisation of production and consumption, and thereby the inherently context-specific nature of environmental policies (Gibbs, 2006; York & Rosa, 2003). The second is the importance of co-location and territorial proximity for creating and consolidating synergies for sustainable creation and use of natural resources (Chertow, 2008). Building on these insights, Truffer and Coenen’s (2012) call for cross-fertilisation between regional studies and sustainability transition studies recently paved the way to a strand of empirical work (Barbieri et al., 2020b; Corradini, 2019; Montresor & Quatraro, 2019; Perruchas et al., 2020). The present study draws on and contributes to this flourishing area of research.

At the same time, achieving zero greenhouse gas (GHG) emissions requires, as per recent pronouncements by the European Commission, radical “economic and societal transformations […], engaging all sectors of the economy and society” (European Commission, 2018, p. 5). Put otherwise, the implementation of the Green Deal will require structural changes and, inevitably, a mix of opportunities and challenges. Given the difficulty in separating local and global environmental aspects of climate-related hazards, benefits to some will no doubt come at a cost to others. To illustrate, decarbonising harmful production activities might cause job losses and worker displacement. This calls for analytical instruments that are consistent with the uncertainty of a scenario that features feedback loops, multiple trade-offs and emergent behaviours.

In view of this, we adopt a set of specific tools from the interdisciplinary field of complexity economics to account for the increasingly dynamic and interconnected nature of the necessary socio-economic transformations for environmental sustainability. In this framework, economic systems are understood as adaptive and dynamic by virtue of collective properties that arise from the interactions among their micro-components, rather than from their individual properties (Arthur, 1999; Blume & Durlauf, 2001; Cilliers, 2001). Our goal is to extend the Economic Complexity (EC) approach (Hidalgo & Hausmann, 2009; Tacchella et al., 2012) to the analysis of the environmental competitiveness of European regions. EC methods capture the underlying competences of productive systems in different domains of human activities, i.e. industrial, technological and scientific production, while also providing tools to analyse the interaction across these dimensions. Various international institutions, such as the European Commission, the World Bank, the OECD as well as local and national governments have adopted specific measures of economic complexity.

EC methods have proven effective for quantifying information on technological capabilities at various levels of aggregation, recently also in relation to environmental technologies and products (de Cunzo et al., 2022; Mealy & Teytelboym, 2020; Napolitano et al., 2019; Sbardella et al., 2018). A key ingredient for the success of EC in characterising the structure of regional capabilities on a large scale is an encompassing approach that accounts for variety in regional output rather than a narrow focus on specific areas of regional specialisation. In particular, when mapping the innovative capabilities of a large set of regions across multiple countries, knowing in which technologies a region is competitive is more relevant than knowing ‘how much’ it produces in any specific subset. Since the sustainable transition will entail large-scale industrial, infrastructural and spatial transformations, we envisage that an economic complexity approach holds the promise of shedding new light on the green potential of individual technological competences.

Two main findings arise from our analysis. First, we provide novel insights into the connection between green and non-green technological capabilities, namely that regional know-how in the non-green technological realm can be exploited in the green domain–and vice versa. Second, the shape of the regional technology space matters when it comes to dealing with complex capabilities. In particular, regional technology spaces that exhibit higher propensity to develop technologies connected with green technological fields (i.e. higher green potential) specialise in a wide range of green technological domains that span the spectrum from the less to the most diffused ones.

The remainder of the paper is organised as follows. Section 2 describes the data, measures and methods employed in the empirical analysis. Section 3 shows the geographical distribution of green innovative activities and potential in European regions. Section 4 presents the main findings of the analysis. Finally, Sect. 5 concludes.

2 Data and methods

The goal of this paper is to connect green and non-green capabilities in developing complex technologies by assessing whether and to what extent the latter are conducive to green technological advances. Green technologies recombine different bits of knowledge from different sources (Barbieri et al., 2020a) and the exploration of the nature of these sources is fundamental from a policy perspective. Various scholars argue that green and non-green technical knowledge exhibit complementarity (Barbieri et al., 2021), so that the development of non-green technologies generates positive externalities for the generation of green knowledge (Markard & Hoffmann, 2016; Sinsel et al., 2020), and vice versa (Noailly & Shestalova, 2017). To meet our goal, the empirical analysis is organised in two steps. First, to set the scene we provide a broad overview of the innovation landscape. To this end, we profile European regions based on their (green and non-green) innovation capacity using the Economic Fitness and Complexity (EFC) algorithm (Tacchella et al., 2012). This recursive algorithm is designed to extract information about capabilities from the network of technologies in which regions innovate by simultaneously determining the complexity of each technology and the fitness of each region. Put simply, complexity is a measure of how sophisticated the capabilities needed to innovate in a technology are, while fitness measures the extent to which a region is endowed with complex capabilities. To illustrate, a higher fitness score indicates that the portfolio of (green or non-green) technologies of a region includes technologies that are more complex; hence, that region possesses more advanced technological capabilities.

Building on the above, the second step is the analysis of the mutual influence between green and non-green technological capabilities. To this aim, we propose a measure of ‘green potential’ of each region inspired by recent contributions on the green product space or the knowledge space (see e.g. Fankhauser et al., 2013; Hamwey et al., 2013; Mealy & Teytelboym, 2020; Boschma et al., 2013; Castaldi et al., 2015; Rigby, 2015; Balland & Rigby, 2017). The Green Potential (GP) indicator builds on bipartite networks connecting regions with technological fields similar to the ones employed in the first step. However, rather than extracting information from a single network layer, it is built by projecting the layer connecting regions with non-green technologies at time \({t}_{1}\) on the layer connecting the same regions with green technologies at time \({t}_{2}={t}_{1}+\Delta t\). The resulting network of time-lagged co-occurrences allows the identification of non-green technological classes in a regional portfolio that signal the early emergence of competitiveness in green technologies in the region. Such a methodology relies on studies on the role of spatial knowledge spillovers in the transition towards sustainable economies (Barbieri et al., 2020b; Cheng & Jin, 2020; Nomaler & Verspagen, 2021), and allows us to map the ecosystem of regional technological competences and to identify the combinations of non-green know-how that are most likely to favour a region’s entry in the domain of green technology. This approach also yields evidence on the strengths and weaknesses of green regional specialisation and is therefore relevant for the design of both European climate policy and European regional development policy.

2.1 Data—measuring regional patenting activity

To analyse innovative activities within European NUTS2 regions, we employ PATSTAT 2020a (European Patent Office, 2020), a comprehensive database that collects the records of over 100 million patent documents from national and regional patent authorities around the world. Although patents are widely used to measure innovative activities, they carry well-known limitations. Indeed, the commercial value of patents may differ substantially across inventions, and not all inventions are patented. In addition, some technical knowledge cannot be patented. Finally, the propensity to patent is highly heterogeneous across sectors and countries (Archibugi & Pianta, 1996; Griliches, 1991). Nevertheless, the wealth of information contained in patents makes them a useful data source in innovation studies. For instance, patents are the only measure available at reasonable cost that makes it possible to distinguish between green and non-green technological fields and to cover a vast number of regions worldwide at a very fine spatial level (Popp, 2005).

The information available in PATSTAT is particularly rich. Crucial to our purposes, it records the Cooperative Patent Classification (CPC; Footnote 1) codes used by patent offices to associate the claims contained in patent applications with the specific areas of technology in which the applications make an innovative contribution. As we detail below, CPC codes are proxies of embedded technological competences. Furthermore, PATSTAT records the address of a substantial number of patent inventors and patent applicants, which can be used to assign the patents to geographical regions. In this paper, we focus on the set of patents filed between 1997 and 2017 by Europe-based inventors. To prepare the data for the analysis, we start by geolocalising patent applications. To this end, we assign patents to their inventors' NUTS3 2013 regions of residence by exploiting, whenever possible, two sources of information: PATSTAT and the patent geolocalisation exercise performed by de Rassenfosse et al. (2019), which we henceforth refer to as the Geocoding. Whenever the information differs between the two sources (a rare event), we weight each inventor’s location. For example, if PATSTAT places two inventors of a patent in regions X and Y respectively, while the Geocoding places the same two inventors in regions X and Z, we consider the contribution of region X to the invention to be twice as big as that of regions Y and Z. Notice that the Geocoding associates longitude and latitude, rather than a NUTS code, with inventor addresses. We translate this point-level information to NUTS3 by using GIS data from Eurostat (Footnote 2) and then aggregate NUTS3 2013 codes to build NUTS2 2016 regions (Footnote 3). This way we attribute patent applications to around 300 European NUTS2 regions across 33 countries (Footnote 4).
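
The weighting example above can be illustrated with a short sketch. It assumes that the locations reported by the two sources are simply pooled and that each reported location counts once; both this pooling rule and the helper name `region_weights` are illustrative assumptions rather than the procedure actually implemented for the paper.

```python
from collections import Counter


def region_weights(patstat_regions, geocoding_regions):
    """Pool the inventor locations reported by both sources and return
    each region's share of the pooled count (hypothetical helper)."""
    counts = Counter(patstat_regions) + Counter(geocoding_regions)
    total = sum(counts.values())
    return {region: n / total for region, n in counts.items()}


# Example from the text: PATSTAT reports X and Y, the Geocoding reports X and Z.
weights = region_weights(["X", "Y"], ["X", "Z"])
print(weights)  # X receives 0.5, Y and Z receive 0.25 each
```

Under this assumption region X receives twice the weight of regions Y and Z, as in the example given in the text.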

The second step in the data preparation is the association of patents to technological fields through the associated CPC codes. The CPC has a tree-like structure comprising five hierarchical levels. The top level contains nine sections that gradually branch out into over two hundred thousand subgroups. There are two types of codes. Codes starting with letters A to H are similar to the codes used in the International Patent Classification (IPC; Footnote 5), and represent a traditional classification of technological fields. Codes starting with Y tag cross-sectional technologies spanning several sections of the IPC classification (Footnote 6). In particular, the Y02 class (Technologies or Applications for Mitigation or Adaptation against Climate Change) identifies patents related to climate change adaptation and mitigation, covering a wide range of technologies related to sustainability objectives, such as energy efficiency in buildings, energy generation from renewable sources, sustainable mobility etc. In what follows, we refer to traditional technology codes within the CPC as Non-Green Technologies (NGTs) and to codes belonging to the Y02 class as Green Technologies (GTs).

Finally, we group patents in INPADOC patent families (our main unit of analysis), each of which represents a collection of patent documents covering the same invention. Among those, we select only families that include at least two patent applications, at least one of which was filed at a patent office of the IP5 forum (Footnote 7). This preliminary selection yields a dataset of around 6 million patent families out of 63 million total families in PATSTAT. In addition, it allows us both to avoid country biases and to select relevant innovations for which information is readily available. On the whole, we geocoded more than 90% of inventors and gathered CPC classification codes for at least one patent in each geolocalised family.

We employ a fractional count to assign the innovative contribution of each geolocalised patent family to the corresponding fields of technology and geographical regions. For each family, we take the number \(R\) of NUTS2 regions of residence of the inventors, the number \(C\) of NGTs, and the number \(C_y\) of GTs. We then assign a share of \(1/(R \cdot C)\) to all combinations of NGT and NUTS2 region, and a share of \(1/(R \cdot C_y)\) to all combinations of GT and NUTS2 region.
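
The fractional count can be sketched in a few lines of Python. The record layout (a dictionary with keys `regions`, `ngt` and `gt`), the function name and the example codes are illustrative assumptions; only the shares \(1/(R \cdot C)\) and \(1/(R \cdot C_y)\) come from the text.

```python
from collections import defaultdict


def fractional_counts(families):
    """Accumulate fractional patent counts per (region, technology) pair.

    Each family is a dict with hypothetical keys 'regions', 'ngt' and 'gt'
    listing its distinct NUTS2 regions, non-green codes and Y02 codes.
    """
    x_ngt = defaultdict(float)
    x_gt = defaultdict(float)
    for fam in families:
        regions, ngts, gts = fam["regions"], fam["ngt"], fam["gt"]
        for region in regions:
            for code in ngts:
                # Share 1 / (R * C) for every (NGT, region) combination.
                x_ngt[(region, code)] += 1.0 / (len(regions) * len(ngts))
            for code in gts:
                # Share 1 / (R * C_y) for every (GT, region) combination.
                x_gt[(region, code)] += 1.0 / (len(regions) * len(gts))
    return dict(x_ngt), dict(x_gt)


# Two made-up families used only to illustrate the bookkeeping.
families = [
    {"regions": ["ITC4", "DE21"], "ngt": ["H01L", "F03D"], "gt": ["Y02E"]},
    {"regions": ["ITC4"], "ngt": ["H01L"], "gt": []},
]
x_ngt, x_gt = fractional_counts(families)
```

With this scheme each family contributes a total weight of one to the non-green counts and, if it carries at least one Y02 code, a total weight of one to the green counts.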

2.2 Measures

2.2.1 (Green) Technological Fitness

As mentioned in Sect. 2, the EFC framework allows the computation of the Technological Fitness index for European NUTS2 regions to quantify the complexity of the portfolio of capabilities in each region (Pugliese & Tübke, 2019). Furthermore, knowing the complexity of all technologies, we can build, in line with other measures of local complexity (Balland & Rigby, 2017; Operti et al., 2018; Sbardella et al., 2017), a Green Technological Fitness (GTF) index and a Non-Green Technological Fitness (NGTF) index (Napolitano et al., 2019).

Figure 1a illustrates that the input to the EFC algorithm is a binary bipartite network that connects regions to technologies in which they file patents. To transform the number of patents filed by regions in each technology into a binary variable, it is necessary to set a threshold value. In line with the general convention in Economic Complexity, we set the threshold using the Revealed Comparative Advantage (RCA; see Balassa, 1965) index, with a cut-off value of 1. This leaves us with an adjacency matrix M in which a region shares a link with a technology whenever it files at least its fair share of patents in that technology. Therefore, for each combination of region i and technology j the corresponding matrix element \({M}_{ij}\) is defined as follows:

$$M_{ij} = \left\{ {\begin{array}{*{20}l} 1 & {{\text{if}}\;RCA_{ij} \ge 1} \\ 0 & {{\text{otherwise}}} \\ \end{array} } \right.\;\;{\text{where}}\;RCA_{ij} = \frac{{X_{ij} }}{{\sum\nolimits_{{i^{\prime } }} {X_{{i^{\prime } j}} } }}/\frac{{\sum\nolimits_{{j^{\prime } }} {X_{{ij^{\prime } }} } }}{{\sum\nolimits_{{i^{\prime } j^{\prime } }} {X_{{i^{\prime } j^{\prime } }} } }}$$
(1)

and \({X}_{ij}\) is the sum of the shares of patents in technology class j that can be traced back to region i.

Fig. 1 The binary network that connects NUTS2 regions to the CPC classes in which they have a comparative advantage: graphical representation in (a) and adjacency matrix in (b)
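
The binarisation in Eq. 1 can be sketched as follows. The function name and the treatment of empty rows or columns (no link) are assumptions made for this illustration; the input is a region-by-technology matrix of fractional patent counts.

```python
def rca_matrix(x):
    """Binarise a region-by-technology count matrix following Eq. 1.

    x is a list of rows (regions) of fractional patent counts.
    Returns M with M[i][j] = 1 when RCA_ij >= 1 and 0 otherwise.
    """
    row_tot = [sum(row) for row in x]        # patents of each region
    col_tot = [sum(col) for col in zip(*x)]  # patents in each technology
    grand = sum(row_tot)                     # all patents
    m = []
    for i, row in enumerate(x):
        m_row = []
        for j, x_ij in enumerate(row):
            if row_tot[i] == 0 or col_tot[j] == 0:
                # RCA is undefined without patents; assumed to mean no link.
                m_row.append(0)
                continue
            rca = (x_ij / col_tot[j]) / (row_tot[i] / grand)
            m_row.append(1 if rca >= 1 else 0)
        m.append(m_row)
    return m


x = [[8.0, 2.0],
     [2.0, 8.0]]
print(rca_matrix(x))  # [[1, 0], [0, 1]]
```

In this toy case each region holds 80% of the patents in one technology while accounting for 50% of all patents, so its RCA is 1.6 in that technology and 0.4 in the other.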

The binary matrix is then fed to the EFC algorithm, which yields a measure of fitness for each region (\({F}_{i}\)) and of complexity for each technology (\({Q}_{j}\)). In formulae, for region i and technology j, the recursive algorithm is defined as follows:

$$\left\{ {\begin{array}{*{20}c} {\tilde{F}_{i}^{\left( n \right)} = \sum\nolimits_{j} {M_{ij} Q_{j}^{{\left( {n - 1} \right)}} } } \\ {\tilde{Q}_{j}^{\left( n \right)} = \frac{1}{{\sum\nolimits_{i} {M_{ij} \frac{1}{{F_{i}^{\left( n \right)} }}} }}} \\ \end{array} } \right.\;\;\;\;\;\;\;\;\;\;\left\{ {\begin{array}{*{20}c} {F_{i}^{\left( n \right)} = \frac{{\tilde{F}_{i}^{\left( n \right)} }}{{\left\langle {\tilde{F}_{i}^{\left( n \right)} } \right\rangle_{i} }}} \\ {Q_{j}^{\left( n \right)} = \frac{{\tilde{Q}_{j}^{\left( n \right)} }}{{\left\langle {\tilde{Q}_{j}^{\left( n \right)} } \right\rangle_{j} }}} \\ \end{array} } \right.$$
(2)

where \(\langle \cdot \rangle_{x}\) denotes the arithmetic mean over the index x, and the initial condition is \(Q_{j}^{\left( 0 \right)} = 1 \; \forall j\). The fixed point of the algorithm in Eq. 2 defines the non-monetary metric that quantifies \({F}_{i}\), the fitness of region i, and \({Q}_{j}\), the complexity of technology j.
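
A minimal sketch of the iteration in Eq. 2 is given below, written in plain Python so that it is self-contained. It follows the equation as written, so the complexity update uses the fitness normalised within the same iteration. The function name, the fixed number of iterations and the toy matrix are assumptions for illustration.

```python
def efc(m, n_iter=100):
    """Iterate Eq. 2 on a binary region-by-technology matrix m.

    Returns (fitness, complexity) after n_iter iterations. The fixed
    iteration count is an assumption; a check on the stability of the
    rankings could be used instead.
    """
    n_regions, n_techs = len(m), len(m[0])
    q = [1.0] * n_techs  # initial condition Q_j^(0) = 1
    f = [1.0] * n_regions
    for _ in range(n_iter):
        # Fitness: sum of the complexities of the region's technologies.
        f_tilde = [sum(m[i][j] * q[j] for j in range(n_techs))
                   for i in range(n_regions)]
        mean_f = sum(f_tilde) / n_regions
        f = [v / mean_f for v in f_tilde]
        # Complexity: lowered by the presence of low-fitness regions.
        q_tilde = []
        for j in range(n_techs):
            denom = sum(1.0 / f[i] for i in range(n_regions) if m[i][j])
            q_tilde.append(1.0 / denom if denom > 0 else 0.0)
        mean_q = sum(q_tilde) / n_techs
        q = [v / mean_q for v in q_tilde]
    return f, q


# Perfectly nested toy matrix: region 0 is the most diversified.
m = [[1, 1, 1],
     [1, 1, 0],
     [1, 0, 0]]
fitness, complexity = efc(m)
```

On this nested example the most diversified region obtains the highest fitness, and the technology present only in that region obtains the highest complexity.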

Depending on the structure of the input matrix, the EFC algorithm is known to possibly converge to zero fitness and zero complexity for a subset of geographical areas and technologies respectively. However, this is not an issue because it is always possible to define a consistent ranking along both dimensions; for this reason, we base our analysis on the fitness ranking rather than scores.

The rationale of the algorithm is that the fitness of the regions under analysis and the complexity of the technologies in which they invest can be determined recursively by taking advantage of the information provided by the composition of the technological portfolio of the former. In particular, a region with a more advanced set of capabilities will have a more diversified portfolio of technologies, spanning from the most to the least complex ones, and will therefore have a higher fitness score. In turn, complex technologies are rare and appear almost exclusively in the portfolio of high-fitness regions. Consequently, a region with low fitness has a smaller endowment of capabilities and thus operates exclusively in less complex (green and non-green) technological domains. Figure 1b illustrates this point by displaying matrix M wherein rows and columns are ordered by, respectively, the technological fitness of the regions and the complexity of technologies. The black dots identify the technologies in which regions have RCA greater than 1. Fitness decreases from top to bottom, while complexity increases from the leftmost to the rightmost column. The vertical green stripes correspond to green technologies. This particular ordering of M brings out a peculiar nested structure wherein regions with lower fitness are competitive in a subset of the activities in which higher fitness regions are competitive. Nested structures typically emerge from the application of the EFC algorithm to matrices like M constructed from a variety of data sources, e.g. patents, international trade, scientific publications (Cimini et al., 2014; Napolitano et al., 2019; Tacchella et al., 2012). Nestedness, in turn, points to a key feature of the EFC framework, namely the possibility to capture the capability structure of a country or region in a given domain of human activity based not on how much competitiveness it displays in any subset of activities but, rather, on which activities it is competitive in.

Once the complexity of technologies is determined, it is possible to derive the GTF index employing the Sector Fitness approach, i.e. by including in the computation of fitness only the complexities of the subset of technologies of interest, in this case green technologies. Hence, the GTF of a region is defined as the sum of the complexities of the Y02 codes with which the region shares a link. Similarly, the NGTF of a region is defined as the sum of the complexities over the set of technologies that are not considered green according to the CPC classification, i.e. all codes belonging to sections A-H (Footnote 8).
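
As a rough illustration of the Sector Fitness idea, the sketch below splits the sum of complexities of one region's technologies into a green and a non-green part. The function name, the boolean flag marking Y02 codes and the numbers are assumptions for this example.

```python
def sector_fitness(m_row, complexity, is_green):
    """Return (GTF, NGTF) for one region.

    m_row:      binary links of the region to each technology
    complexity: complexity Q_j of each technology
    is_green:   True for Y02 codes, False for codes in sections A-H
    """
    gtf = sum(q for link, q, green in zip(m_row, complexity, is_green)
              if link and green)
    ngtf = sum(q for link, q, green in zip(m_row, complexity, is_green)
               if link and not green)
    return gtf, ngtf


m_row = [1, 0, 1, 1]
complexity = [0.5, 2.0, 1.5, 0.25]
is_green = [False, True, True, False]
gtf, ngtf = sector_fitness(m_row, complexity, is_green)
print(gtf, ngtf)  # 1.5 0.75
```

The region is linked to one green technology (complexity 1.5) and two non-green ones (0.5 and 0.25); the green technology it is not linked to does not contribute.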

2.2.2 Green potential of the regional knowledge space

To assess whether a region’s non-green knowledge base is significantly correlated with its capacity to patent in environment-related technology fields, we elaborate a measure of regional Green Potential (GP) to predict which technological fields are statistically significant ‘early signals’ of the development of high RCA in green technologies in NUTS2 regions. To this aim, we draw both from recent works on the green product space (Fankhauser et al., 2013; Hamwey et al., 2013; Mealy & Teytelboym, 2020), and the technology space (Boschma et al., 2013; Breschi et al., 2003; Nesta & Saviotti, 2005; Rigby, 2015). The GP indicator builds on Pugliese et al. (2019), who defined a multilayer network analysis to study the knowledge spillovers from the patenting activity and scientific production of countries to their exported goods, and on Pugliese and Tübke (2019), who applied a range of economic complexity techniques to profile the technological competitiveness of European regions. To define our indicator of regional green technological potential we adopt a threefold strategy: (i) definition of the technology space; (ii) selection of statistically significant links in the network; (iii) projection of the technology space onto NUTS2 regional patent portfolios. Therefore, we first construct a ‘time-augmented’ technology space that links GTs and NGTs, i.e. a network in which a link between an NGT and a GT exists if there is a significantly higher than random probability that regions with high RCA in the NGT at time \({t}_{1}\) also have high RCA in the GT after \(\Delta t\) years.

In practice, as shown in Fig. 2a, we start with two binary networks: one connecting NUTS2 regions to NGTs at time \({t}_{1}\); and the other connecting NUTS2 regions to GTs at time \({t}_{2}={t}_{1}+ \Delta t\). The adjacency matrices of the two networks, \({M}^{NGT}({t}_{1})\) and \({M}^{GT}({t}_{2})\), are binarised with the same procedure displayed in Eq. 1: in both networks, a link is established when a region shows a revealed comparative advantage greater than 1 in a technological class. The columns of \({M}^{NGT}\) reflect the 656 CPC 4-digit codes in sections A-H, while the columns of \({M}^{GT}\) reflect the 44 green 8-digit codes under CPC class Y02. The reason for employing different aggregations to characterise GTs and NGTs for this exercise is that there are too few 4-digit CPC codes under Y02 for a meaningful analysis, and this would get in the way of the statistical validation of links in the GT-NGT technology space.

Fig. 2 The binary network that connects A-H CPC non-green technologies at time \({t}_{1}\) to Y02 green technologies at time \({t}_{2}\). Each network link represents the correlation between having a comparative advantage in an NGT and a subsequent comparative advantage in a GT

By contracting the two binary networks over the geographical dimension as in Fig. 2b, we obtain an NGT-GT network that identifies the probability of observing, within the regions included in our dataset, time-lagged empirical co-occurrences between comparative advantages in NGTs at time \({t}_{1}\) and comparative advantages in GTs at time \({t}_{2}\). To avoid “size effects”, we normalise each co-occurrence by \({u}_{j}({t}_{1})=\sum_{i}{M}_{i,j}^{NGT}({t}_{1})\), the ubiquity of non-green technology \(j\) across regions, and by \({d}_{i}({t}_{2})=\sum_{j^{\prime}}{M}_{i,j^{\prime}}^{GT}({t}_{2})\), the diversification in green technologies of region \(i\) in which the co-occurrence is observed. This way, we measure the probability that having a comparative advantage in an NGT is an early signal of developing a comparative advantage in a GT. These probabilities are contained in the “assist” matrix \({B}^{NGT,GT}({t}_{1},{t}_{2})\) (Pugliese et al., 2019), the generic element of which corresponds to the normalised co-occurrences across all regions of non-green technologies \(\left( j \right)\) and green technologies \(\left( {j^{\prime } } \right)\), and is defined as follows:

$$B_{j,j^{\prime}}^{NGT,GT}\left( t_{1},t_{2} \right) = \Pr\left( j^{\prime},t_{2} \mid j,t_{1} \right) = \sum_{i} \Pr\left( j^{\prime},t_{2} \mid i \right)\Pr\left( i \mid j,t_{1} \right) = \sum_{i} \frac{M_{i,j^{\prime}}^{GT}\left( t_{2} \right)}{d_{i}\left( t_{2} \right)}\frac{M_{i,j}^{NGT}\left( t_{1} \right)}{u_{j}\left( t_{1} \right)}$$
(3)
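
Equation 3 can be sketched as a double normalisation of time-lagged co-occurrences. The function name and the handling of regions without any green comparative advantage (they are skipped, since their diversification is zero) are assumptions of this illustration, as are the toy matrices.

```python
def assist_matrix(m_ngt, m_gt):
    """Compute B[j][j'] of Eq. 3 from two binary region-by-technology matrices.

    m_ngt: regions x non-green technologies at time t1
    m_gt:  regions x green technologies at time t2 (same region order)
    """
    n_regions = len(m_ngt)
    n_ngt, n_gt = len(m_ngt[0]), len(m_gt[0])
    # u_j(t1): number of regions with an advantage in non-green technology j.
    ubiquity = [sum(m_ngt[i][j] for i in range(n_regions))
                for j in range(n_ngt)]
    # d_i(t2): number of green technologies in which region i has an advantage.
    diversification = [sum(row) for row in m_gt]
    b = [[0.0] * n_gt for _ in range(n_ngt)]
    for i in range(n_regions):
        if diversification[i] == 0:
            continue  # no green advantage at t2: nothing to add (assumption)
        for j in range(n_ngt):
            if not m_ngt[i][j]:
                continue
            for jp in range(n_gt):
                b[j][jp] += (m_gt[i][jp] / diversification[i]) / ubiquity[j]
    return b


m_ngt = [[1, 0],
         [1, 1],
         [0, 1]]
m_gt = [[1, 0],
        [1, 1],
        [0, 0]]
b = assist_matrix(m_ngt, m_gt)
print(b)  # [[0.75, 0.25], [0.25, 0.25]]
```

In the toy example the first non-green technology is followed by the first green technology in both regions that hold it, which yields the largest entry of the matrix.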

However, a co-occurrence may not be informative per se. An observed high probability of co-occurrence may in fact be due to the ubiquity of technological fields or regional diversification. Therefore, to rule out spurious links, we assess the statistical significance of each link in the network with a null model called the Bipartite Configuration Model (BiCM, see Saracco et al., 2015, 2017; Straka et al., 2017), a maximum entropy algorithm designed to randomise bipartite networks. The null hypothesis of the BiCM is that NGT-GT co-occurrences are random and their probability is determined only by the ubiquity of the NGT and by the diversification of the region that has a comparative advantage in the GT. Once we have filtered the links with our null model, we interpret each statistically significant co-occurrence between an NGT and a GT as a signal of an overlap between the capabilities required to innovate in both. Intuitively, patent codes that share similar inputs will be close to each other in the technology space, and proximity in the statistically validated NGT-GT network is positively related to the probability that acquiring a competitive advantage in the NGT is predictive of a competitive advantage in a connected GT.

We leverage the information stored in the NGT-GT network to build our index of regional green potential. For each non-green technology we define \({N}_{NGT\to GTs}({t}_{1},{t}_{2})\) as the number of significant time-lagged co-occurrences between that NGT and green technologies. We interpret this as a proxy for the strength of the association between the non-green technology and the green knowledge base. Finally, we project this value onto regional patent portfolios by weighting \(N_{NGT \to GTs}\) by \(X_{i,j}^{NGT}\), the patent stock of region i in non-green technology j, and summing over all non-green technologies j:

$$GP_{i}\left( t_{1},t_{2} \right) = \sum_{j} X_{i,j}^{NGT}\left( t_{1} \right) N_{NGT \to GTs}\left( t_{1},t_{2} \right).$$
(4)
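
Equation 4 amounts to a weighted sum over each region's non-green patent portfolio. The sketch below assumes that the number of validated links of each non-green technology has already been computed; the function name and the numbers are illustrative.

```python
def green_potential(x_ngt, n_links):
    """Eq. 4: weight each region's non-green patent counts by the number of
    validated links from each non-green technology to green technologies.

    x_ngt:   regions x non-green technologies, fractional patent counts at t1
    n_links: validated NGT -> GT links per non-green technology
    """
    return [sum(x_ij * n_j for x_ij, n_j in zip(row, n_links))
            for row in x_ngt]


x_ngt = [[3.0, 0.0, 1.0],
         [0.5, 2.0, 0.0]]
n_links = [4, 0, 2]
gp = green_potential(x_ngt, n_links)
print(gp)  # [14.0, 2.0]
```

The second region patents mostly in a non-green technology with no validated link to green technologies, so its green potential is low despite a comparable number of patents.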

It is worth noting that in the context of complexity approaches to capture regional technological capabilities, patenting is a proxy of research activities in a technological field. Thereby, we look at how many patents are produced by a region in each field only to determine the technological activities in which the region is specialised. This implies that the total number of patents is only indirectly relevant to our analysis. In this context, many of the common issues connected with the analysis of patent data are not problematic for our exercises. For example, the fact that a country tends to patent more than another in every field because of different regulations does not affect our analysis because we are looking only at the share of patents in each field. Moreover, the fact that the propensity to patent is higher in some fields does not affect our analysis either, because we are looking only at the relative share of a field in different countries.

3 Exploratory data analysis

In this section we profile European regions based on their Green and Non-Green Technological Fitness rankings (Sect. 3.1), green potential (Sect. 3.2), and the green technological domain in which they strive to innovate (Sect. 3.3).

3.1 Technological fitness in Green and Non-Green Technologies

In this paper, non-green technologies are all CPC classes outside of the Y02 class, while green technologies are all technologies classified under Y02. As shown in Fig. 1b, the EFC algorithm yields a ranking in which highly diversified regions that innovate in non-complex and complex technologies alike hold the top position whereas regions at the bottom specialise in more “common” technological fields.

As mentioned, green technologies contribute to mitigating greenhouse gas emissions and to adapting to climate change, and are classified in the CPC Y02 class (Technologies or Applications for Mitigation or Adaptation against Climate Change), which comprises eight groups. In contrast, NGTs comprise a much larger set of CPC codes covering the entire technological spectrum (over a hundred classes). Green technologies can be more complex, radical, pervasive and impactful than most non-green technologies (Barbieri et al., 2020a) and therefore require a wide range of competences that (at times) are far from established know-how (De Marchi, 2012). Accordingly, we do not expect the regions that have the highest technological fitness in NGTs to necessarily top the fitness ranking in GTs; we also expect the latter ranking to display a relatively turbulent evolution over time.

Figure 3 lends support to this hypothesis. Although a relative majority of regions tends to stay in the same part of the ranking from one period to the next, a noticeable fraction of regions moves between nodes. Indeed, some regions drop all the way from the top of the ranking to the bottom (and vice versa) in the space of just a few years. To an extent, the turbulence in the green fitness ranking is also reflected in the country labels, whose colours are quite mixed by 2015. This implies that multiple regions within the same country can quickly rise (e.g. Lithuania) or fall (e.g. Italy) according to the metric. This stands in contrast with the evolution of the non-green fitness ranking, which is much more stable over time (see Fig. 9 in the Appendix for more details).

Fig. 3 Evolution of the Green Technological Fitness ranking of NUTS2 regions

The maps in Fig. 4 depict the green fitness of NUTS2 regions at the beginning and the end of the period under analysis. The regions with lower green fitness are coloured in yellow, which turns into progressively darker shades of green as green fitness increases. To improve readability, the fitness scores in both maps are rescaled between 0 and 1. The green fitness landscape is quite heterogeneous across countries and relatively stable over time. In particular, we observe a persistent divide between Central and Eastern European regions. A striking element is the substantial lack of coverage in several countries, which signals no patenting activity in fields related to green technology. By 2017 the gap is almost completely closed in terms of the existence of green patents in every European region. Furthermore, the green area is far less concentrated at the end of the period than at the beginning, suggesting that in less than 20 years entire countries started innovating in green technologies, and some regions have also caught up with the leaders.

Fig. 4 Green Technological Fitness of NUTS2 regions in 2002 (a) and 2017 (b)

Focusing on the regions with the highest fitness in green technologies, we calculated the average ranking position for two time periods, 1998–2006 and 2007–2017 as shown in Table 1. The same computation performed on the top regions in terms of non-green fitness yields Table 2.
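
As a minimal sketch of the averaging step described above, the following computes a region's mean ranking position within each of the two periods. The yearly ranks are invented for illustration and the helper `average_rank` is a hypothetical name, not part of the original analysis.

```python
from statistics import mean


def average_rank(yearly_ranks, first_year, last_year):
    """Mean ranking position of one region over an inclusive range of years."""
    values = [rank for year, rank in yearly_ranks.items()
              if first_year <= year <= last_year]
    return mean(values)


# Hypothetical ranking positions of a single region (1 = highest fitness).
ranks = {1998: 3, 2002: 5, 2006: 1, 2010: 2, 2017: 4}

early = average_rank(ranks, 1998, 2006)   # averages the ranks 3, 5 and 1
late = average_rank(ranks, 2007, 2017)    # averages the ranks 2 and 4
print(early, late)
```

Sorting regions by these period averages would then give the kind of top-ten lists reported in the tables.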

Table 1 Top 10 NUTS2 regions in the Green Technological Fitness ranking
Table 2 Top 10 NUTS2 regions in the Non-Green Technological Fitness ranking

Comparing Table 1 with Table 2, we see that diversity is higher in green technologies than in non-green technologies (eight countries represented instead of four) and that the evolution is more turbulent (only five regions remain in the top ten between the two time periods, against seven in the case of non-green technologies). Four regions are in the top ten during the whole time period in both the non-green and green fitness rankings: Oberbayern (DE21), Helsinki-Uusimaa (FI1B), Ile-de-France (FR10) and Noord-Brabant (NL41). This could indicate that the high-quality knowledge and skills available in these regions to develop non-green technologies could also be used for green technologies.

These metrics are in line with other studies of regional technological development in Europe. Among the top thirteen regions present in the non-green fitness ranking in both periods, eleven are classified as “Innovation leaders” (highest level) and two as “Strong innovators” (second highest level) in the Regional Innovation Scoreboard 2021 (European Commission, 2021). On the green technology side, despite the lack of comparative studies of European regions’ performance, some examples are confirmed by the existing literature. The rise of Stockholm (SE11) in the green fitness ranking (Table 1) is documented in a technical report on the development of green technologies in Sweden (DTU Management Engineering, 2019), and the good position of Noord-Brabant (NL41), the region around Eindhoven, is noted by Balland et al. (2019). Finally, the presence of only two regions from the South, Lombardia (ITC4) and Emilia-Romagna (ITH5), seems to be associated with an important development of supportive policies (greenER, 2018; Eco-Innovation Observatory, 2015).

3.2 Green potential: geography and evolution

Figure 5 shows, for each 3-digit A-H CPC class, \({N}_{NGT\to GTs}(2012, 2017)\), the strength of its association with green technologies, i.e. the share of 99% statistically significant \(NGT \left( {t_{1} } \right) {-} GT \left( {t_{2} } \right)\) links over the total possible links in the technology space, where \(t_{1} = 2012\) and \(t_{2} = 2017\). For ease of visualisation, each colour indicates a 1-digit CPC section. The bar chart shows that 85% of 3-digit CPC codes corresponding to non-green technologies have at least one significant link to a green technology. The average share of significant links in the plot is 0.017 (the average number of significant \(NGT \left( {t_{1} } \right) {-} GT \left( {t_{2} } \right)\) links over the total number of links that \(NGT ({t}_{1})\) shows with all green technologies). At the chosen statistical significance, shares lower than 0.01 are compatible with the null hypothesis of random association. Hence, bars that are lower than the dotted horizontal line indicate technologies that, according to the data, are not significant precursors of green technologies. Nevertheless, 59% of non-green technologies display shares higher than that threshold, confirming that eco-innovative fields are inextricably interconnected with other types of technologies, and that they are embedded into different production contexts.
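
The quantities discussed above can be sketched as follows. The link counts, the number of green technologies `N_GREEN` and the helper names are hypothetical, and the sketch assumes that the denominator for each non-green class is the number of green technologies it could be linked to; only the 0.01 threshold is taken from the text.

```python
NULL_THRESHOLD = 0.01   # shares below this are compatible with random association
N_GREEN = 200           # hypothetical number of green technologies


def link_shares(significant_counts, n_green):
    """Share of significant links to green technologies for each non-green class."""
    return {code: count / n_green for code, count in significant_counts.items()}


# Hypothetical numbers of 99% significant NGT(t1)-GT(t2) links per CPC class.
significant_counts = {"B32": 12, "F02": 5, "G06": 1, "A41": 0}

shares = link_shares(significant_counts, N_GREEN)

# Classes with at least one significant link to a green technology.
with_any_link = [code for code, count in significant_counts.items() if count > 0]

# Classes whose share exceeds the threshold, i.e. candidate precursors.
precursors = sorted(code for code, share in shares.items() if share > NULL_THRESHOLD)

print(len(with_any_link) / len(shares), precursors)
```

In this toy example G06 has one significant link but a share below the threshold, which mirrors the gap in the text between the classes with at least one link and those that clear the dotted line.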

Fig. 5 Share of 99% statistically significant links in the non-green–green technology space of each A-H CPC non-green technology (considered at time \({t}_{1}=2012\)) to all Y02 green technologies (considered at time \({t}_{2}=2017\))

In the timeframe under analysis, green technologies appear linked mostly to pre-existing patents on the production or transformation of different types of materials, on engines and pumps, and on technologies used in construction. In more detail, we observe the highest shares in section B (Performing Operations; Transporting), which contains six of the ten CPC codes that co-occur most frequently with some green technology, including B32 (Layered products), B26 (Hand Cutting Tools), B24 (Grinding), and B29 (Working of plastics). Moreover, two technologies from section C (Chemistry) can be found in the top ten, i.e., C04 (Cements; … ceramics) and C07 (Organic chemistry). More generally, sections F (Mechanical Engineering) and E (Fixed constructions) play a significant role in terms of the number of technological fields they contain that co-occur significantly with green technologies. Notable examples in this sense are F05 (Indexing schemes relating to engines or pumps), F02 (Combustion engines), F04 (Pumps for liquids or elastic fluids), F25 (Refrigeration or cooling; … heat pump systems), and E05 (Locks; keys; window or door fittings).

Figure 6 displays the green potential \(G{P}_{i}\left({t}_{1},{t}_{2}\right)\) of each NUTS2 region i; in Fig. 6 (a) \({t}_{1}=1998\) and \({t}_{2}=2002\), while in Fig. 6 (b) \({t}_{1}=2012\) and \({t}_{2}=2017\). \(G{P}_{i}\) projects the information contained in Fig. 5 along the geographical dimension. Comparing the map for 2002 with the map for 2017, we observe several differences in the colour patterns. On the one hand, this is due to the fact that the direction in which regions direct their efforts to innovate changes over time. On the other hand, the technological space is rewired by the technological efforts of each region that give way to new connections between non-green technologies and green technologies. In a way, the green potential index captures to what extent the non-green part of the regional technology portfolios incorporates the pathways that characterised the network at a given point in time.

Fig. 6 The Green Potential of NUTS2 regions in 2002 (a) and 2017 (b)

Notice that the regions in the highest quintiles of green potential are not necessarily those with the highest green fitness, or with the highest technological fitness in general. This suggests that the green potential index provides different information from technological fitness, which instead indicates the complexity of the regional technological knowledge base. Indeed, regions that are highly diversified and competitive in many technologies do not necessarily also have the highest green potential. In this regard, we show in Sect. 4.2 that green potential has a non-trivial relation with Green Technological Fitness. We refer the reader to the Appendix for a more detailed discussion of the empirical evidence concerning the evolution of the green and non-green knowledge bases of regions.

4 Econometric analysis

In this section we present and comment on the empirical analysis on green fitness. As discussed in Sect. 2, the NUTS2 level offers a compromise between data availability and the dimension of the unit of observation. Indeed, the regional unit has been adopted in various empirical works that explore the geography of green innovative activities (see e.g. Ghisetti & Quatraro, 2013; Santoalha & Boschma, 2020).

Since the goal of the study is to explore regional capacity to develop green technologies, viz. Green Technological Fitness and the green potential of the non-green knowledge space, the analysis is organised in two steps. First, we explore the relationship between regional fitness calculated on green and non-green patenting activities. This enables us to check whether there is a relationship between the capabilities to develop these two instantiations of innovation (Sect. 4.1). Secondly, we delve into the relationship between green and non-green technological capabilities by observing whether our measure of regional green potential is correlated with better green fitness performance (Sect. 4.2).

To provide empirical evidence on the relationship between green and non-green regional fitness and investigate the role of a region’s green potential, we estimate the following econometric models:

$$\begin{aligned} GTF_{i,t} &= \alpha + \beta_{1} NGTF_{i,t} + \beta_{2} Controls_{i,t} + \sigma_{i} + \tau_{t} + \phi_{i,t} + \varepsilon_{i,t} \\ GTF_{i,t} &= \alpha + \beta_{1} GP_{i,t} + \beta_{2} Controls_{i,t} + \sigma_{i} + \tau_{t} + \phi_{i,t} + \varepsilon_{i,t} \\ NGTF_{i,t} &= \alpha + \beta_{1} GP_{i,t} + \beta_{2} Controls_{i,t} + \sigma_{i} + \tau_{t} + \phi_{i,t} + \varepsilon_{i,t} \end{aligned} \tag{5}$$

where \(GTF\) is the regional fitness calculated on green technologies developed in region i at time t, \(NGTF\) is the regional fitness measured using non-green technologies, and \(GP\) is the green potential of the knowledge space of region i as defined in Eq. 4. \(Controls\) is a set of variables that control for the volume of the patenting activity performed in the region, population and Gross Domestic Product. Regional fixed effects (\(\sigma\)) control for unobservable heterogeneity that is constant over time and varies across European regions, i.e. idiosyncratic features that characterise each region (e.g. geographical characteristics), whereas time fixed effects (\(\tau\)) control for unobservable variation that is common to all regions but varies over time (e.g. changes in practices at the European Patent Office). Finally, we include region-specific time trends (\(\phi\)) that account for unobservable heterogeneity that varies linearly over time in each EU region (Barbieri et al., 2020b; Charlot et al., 2015). This enables us to capture different aspects that affect regional complexity but that we are not able to control for directly due to data availability (e.g. environmental policy implementation, skills endowment, etc.).
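
To make the role of the fixed effects and trends concrete, the following is a minimal sketch on synthetic data, not the estimation code used for the tables. It assumes a balanced panel and a single regressor (the controls are omitted for brevity) and relies on the Frisch-Waugh-Lovell result: in a balanced panel, removing period means and then detrending each region's series yields the same slope as a regression with the full set of region effects, time effects and region-specific linear trends. All function names and numbers are hypothetical.

```python
import random


def detrend(series):
    """Residuals from regressing one region's series on a constant and a linear time index."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    s_tt = sum((t - t_mean) ** 2 for t in range(n))
    s_ty = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    slope = s_ty / s_tt
    return [y - y_mean - slope * (t - t_mean) for t, y in enumerate(series)]


def partial_out(panel):
    """Remove time effects, region effects and region-specific linear trends.

    `panel` is a list of per-region lists of equal length (a balanced panel).
    """
    n_regions, n_periods = len(panel), len(panel[0])
    period_means = [sum(row[t] for row in panel) / n_regions for t in range(n_periods)]
    demeaned = [[row[t] - period_means[t] for t in range(n_periods)] for row in panel]
    return [detrend(row) for row in demeaned]


def estimate_beta(y_panel, x_panel):
    """Slope of y on x after partialling out both sets of effects and the trends."""
    y_res = [v for row in partial_out(y_panel) for v in row]
    x_res = [v for row in partial_out(x_panel) for v in row]
    return sum(a * b for a, b in zip(y_res, x_res)) / sum(b * b for b in x_res)


# Synthetic data only: 20 hypothetical regions, 10 periods, true beta_1 = 0.8.
rng = random.Random(0)
TRUE_BETA = 0.8
time_effects = [rng.gauss(0, 1) for _ in range(10)]
ngtf, gtf = [], []
for _ in range(20):
    region_effect, region_trend = rng.gauss(0, 1), rng.gauss(0, 0.2)
    x_row = [rng.gauss(0, 1) for _ in range(10)]
    y_row = [TRUE_BETA * x + region_effect + time_effects[t]
             + region_trend * t + rng.gauss(0, 0.05)
             for t, x in enumerate(x_row)]
    ngtf.append(x_row)
    gtf.append(y_row)

beta_hat = estimate_beta(gtf, ngtf)
print(round(beta_hat, 3))
```

With an unbalanced panel or additional controls the two-step shortcut no longer applies as written, and a regression with explicit dummy variables would be the safer choice.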

4.1 The relationship between Green and Non-Green Technological Fitness of European regions

The first empirical exercise consists in exploring the association between Green and Non-Green Technological Fitness at the regional level. In so doing, we aim to capture the extent to which innovative capabilities in non-green technologies are conducive to the development of green, complex technologies. Indeed, regional technological fitness provides a picture of the “rareness” of technological capabilities that characterise the regional knowledge space.

Figure 7 shows the relationship between Green and Non-Green Technological Fitness of EU regions. The scatter plot, weighted by the intensity of the patenting activity in the region, highlights a strong correlation between the two measures. This suggests that green technologies, usually more novel and impactful than other technologies (Barbieri et al., 2020a), require capabilities that are unevenly diffused across regions. Regions that are already patenting in complex non-green technologies may have a comparative advantage for developing more complex green technologies. In other words, developing non-green technologies may require know-how, skills and resources (human, financial, technological, etc.) that can also be useful for green technologies, and vice versa.

Fig. 7 Relationship between Green and Non-Green Technological Fitness in NUTS2 regions. Each point corresponds to a NUTS2 region. Low values on the axes correspond to higher ranks. Each point is weighted by total patenting activity (size of the circle)

Figure 7 provides a descriptive indication of the relationship between green and non-green regional fitness, which we further investigate by estimating the first econometric model in Eq. 5.

Table 3 reports the results of the model estimation. Column (1) shows the results of the OLS model. Columns (2) and (3) add both regional and time fixed effects to the model of Column (1). Finally, in Columns (4) to (6), our preferred specifications, we include region-specific time trends. The results confirm the strong correlation between non-green and green regional fitness: a one percent increase in non-green regional fitness is associated with a 0.8–0.9 percent increase in green regional fitness, depending on the specification. These insights emphasise that although green and non-green technologies may compete, especially when financial resources are constrained, they also show patterns of complementarity in terms of knowledge capabilities.

Table 3 Econometric results on the estimation of the relationship between Green and Non-Green Technological Fitness

4.2 The green potential of the regional knowledge space

So far we have found a correlation between regional green and non-green fitness. Here we investigate whether green regional fitness is associated with specific patterns in the regional knowledge space. In doing so, we delve into the characteristics of the regional knowledge base with a view to identifying connections with higher levels of green fitness.

To this end, we employ the green potential indicator introduced in Sect. 2.2.2 to investigate whether this measure is correlated with the Non-Green and Green Technological Fitness of EU regions. Figure 8a shows the relationship between regional fitness and quintiles of the green potential indicator. We observe that the fitness rankings are similar between green technologies (blue bars) and non-green technologies (red bars). This first descriptive insight follows from the strong, positive relationship between green and non-green regional fitness highlighted in the previous empirical exercise. In addition, when the green potential of the regional knowledge space is low, regions have on average lower ranks in both green and non-green fitness. Moving from the bottom to the top quintiles of green potential, we find regions characterised by higher levels of green and non-green fitness (lower values in the ranking). Figure 8b shows this relationship from a dynamic perspective. Therein, regions in the bottom and top quintiles lose positions in the ranking of green regional fitness, whereas regions in the middle of the green potential distribution gain positions on average.
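
The grouping behind this kind of comparison can be sketched as follows: regions are sorted by green potential, split into five equally sized groups, and the mean fitness rank is computed within each group. The region labels, green potential values and ranks are invented, and the tie-breaking rule (sort order) is an assumption of this sketch rather than the authors' procedure.

```python
from statistics import mean


def assign_quintiles(values):
    """Map each key to a quintile from 1 (lowest values) to 5 (highest values)."""
    ordered = sorted(values, key=values.get)
    n = len(ordered)
    return {key: (position * 5) // n + 1 for position, key in enumerate(ordered)}


def mean_rank_by_quintile(green_potential, fitness_rank):
    """Average fitness rank of the regions falling in each green potential quintile."""
    groups = {}
    for region, quintile in assign_quintiles(green_potential).items():
        groups.setdefault(quintile, []).append(fitness_rank[region])
    return {quintile: mean(ranks) for quintile, ranks in sorted(groups.items())}


# Ten hypothetical regions: green potential rises from R0 to R9,
# while the fitness rank improves from 10 to 1 (1 = highest fitness).
green_potential = {f"R{i}": 0.1 * (i + 1) for i in range(10)}
green_rank = {f"R{i}": 10 - i for i in range(10)}

by_quintile = mean_rank_by_quintile(green_potential, green_rank)
print(by_quintile)
```

In this constructed example the average rank falls monotonically across quintiles; the dynamic pattern reported for the changes in rank would be obtained by replacing the rank levels with rank differences between two periods.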

Fig. 8 Relationship between Green and Non-Green Technological Fitness and the Green Potential of NUTS2 regions (a). Relationship between improvements in Green Technological Fitness and the Green Potential of NUTS2 regions (b)

The relationship between the green potential of the technology space and both green and non-green fitness is further investigated in Table 4. We estimate a model similar to the previous one, in which the key explanatory variable is the green potential of non-green technologies. In Column (1) the relationship between green potential and Green Technological Fitness is negative and significant. However, this result is mainly driven by the fact that the pooled OLS is not able to capture the idiosyncratic features that may explain an important part of the variation in green fitness. In fact, when we include regional and time fixed effects, the coefficient of GreenPotential is positive and significant. Moreover, when we add region-specific time trends (Columns 3 and 4), the coefficient remains significantly different from zero, holding other variables constant. Finally, when we look at non-green regional fitness, the coefficient is positive and non-significant.

Table 4 Estimation of the relationship between Green Potential and Green and Non-Green Technological Fitness

These results confirm the existence of a connection between the regional knowledge space and the green fitness measure. In particular, such a connection relies on the potential of green and non-green technological advances to generate positive spillovers in terms of capabilities to deal with more complex green technologies.

5 Concluding reflections

The Green Deal is the cornerstone of Europe’s commitment to achieve climate neutrality by 2050. Such an ambitious target requires significant efforts on all sides: policy-makers, firms and consumers. Given the scale and the complexity of the environmental transition, a top-down approach would not go very far because action plans need to be implemented from the bottom up and need to deal with the specificities of place in regions and cities. Of course, not all territories are equally proactive, nor are they equally capable of adapting to new criteria of environmental sustainability that entail a radical reconfiguration of production and consumption activities.

Against this backdrop, we propose a novel methodology to help inform policy on regional capabilities that are relevant for green innovation. In particular, we explore the geographical distribution of innovative activities and profile EU regions in terms of technological capabilities to identify regions’ green innovation potential. Finally, we check the association between comparative advantage in specific technological domains and green technology capacity to validate the relevance of the metric in informing policy action.

The results indicate that the regions with the most advanced green capabilities are mainly in central and western Europe, with Germany standing out. Further, only a few regions have the capacity to patent at the highest level in all green technologies, suggesting that local capabilities are important in fostering, or hampering, their development. Last, we find a strong correlation between non-green and green regional fitness, which implies that although green and non-green technologies may compete (for example for financial or human capital resources), the underlying knowledge capabilities exhibit interesting complementarities. The proposed methodology can therefore capture regions’ potential for green technologies without adopting a narrow focus on green technologies.

Let us conclude by offering some policy implications. The Green Deal is a necessary policy for both its expected environmental and economic benefits. While the environmental effects will have global impact via the channel of cooperation, the economic impact in each region will depend on the pre-existing local technological capabilities, among other things. Indeed, the Green Deal may potentially exacerbate centre-periphery tensions between and polarisation within EU economies (Lucchese & Pianta, 2020). A timely assessment of green-specific regional capabilities is therefore relevant both to inform industrial policy and to project possible winners and losers with an eye towards cohesion policies. Capabilities are, however, field-specific and product-specific: measures focusing on how much absorptive capacity a region has can distract policy makers from looking at what a region is able to do. The analysis in this paper sought to fill this gap by identifying which regions show potential in green technology on the basis of the present focus of their innovation efforts.

The analysis and metrics discussed in the present work can form the basis for an organic measurement effort of regional capabilities for the development of green technologies, akin to similar efforts to capture country and regional innovation capabilities in general, such as the European Innovation Scoreboard (Hollanders, 2009) and the Regional Innovation Scoreboard (Markelbach et al., 2019). These could inform regional industrial policy while defining long-term objectives for the region. It is indeed important to note that the need for a quantitative approach connecting sustainable development with local characteristics has already been recognised by policy makers. The European Commission Joint Research Centre is moving to include green policies into its regional cohesion policy, the Smart Specialisation Strategies (S3) (McCann & Soete, 2020). This holistic way of looking at regional and sustainability policies at the same time, named Smart Specialisation Strategies for Sustainability (S4), is based on the same foundational idea behind this paper: the relevance of local characteristics. This shift will require both novel scientific results and novel metrics to inform policies and strategies, and the present paper is a step in that direction.

To conclude, the insights of the present study can be relevant for future research. First, we pointed out that patents capture only a portion of innovative activities. Corroborating these findings with ad hoc innovation surveys or other data sources would be important to support existing evidence and offer more detailed insights that cannot be identified by means of patent analysis. Second, our explorative study points to a mutual interdependence between green and non-green knowledge bases, which raises the question of whether and to what extent capabilities that are not strictly environmental can be useful in the transition to low-carbon societies. We hope that this initial effort will pave the way to future research on empirical designs that strive to identify robust causal effects.