1 Introduction

In economic geography, development economics and management studies, cognitive relatedness (also known as technological proximity, cognitive proximity, technological closeness, non-tradable capabilities similarity or skill relatedness) of firms, industries and regions is increasingly suggested to be an important precondition for economic diversification, industry branching and economic development in countries, cities and regions (Boschma 2005; Frenken et al. 2007; Hidalgo et al. 2007; Neffke and Henning 2013). As was already noted by Smith (1776, p. 430), more developed societies tend to be more diversified in their economies: Though in a rude society there is a good deal of variety in the occupations of every individual, there is not a great deal in those of the whole society (…). In a civilized state, on the contrary, though there is little variety in the occupations of the greater part of individuals, there is an almost infinite variety in those of the whole society.

In this vein, the literature on economic development of countries has focused on the impact of changing patterns of branching and economic complexity on economic growth (Hausmann and Hidalgo 2011; Hidalgo and Hausmann 2009). However, as pointed out by Saviotti and Pyka (2004, 2010), diversification is not only a determinant of growth, but also a result of growth. Hence, diversification and economic growth are co-dependent: diversification is a prerequisite for growth, while growth is simultaneously needed to stimulate demand and changes in demand. Examining the causality of the relationship between diversification and economic growth, Dietrich (2012) found that structural change (irrespective of whether it is measured by employment or value added) induces economic growth, while economic growth slows down structural change in the short run but accelerates it in the longer run, where the aggregate effect depends on whether structural change is measured by employment or value added.

Along these lines, the economic geography literature on related and unrelated variety has studied the impact of cognitive relatedness across industries to predict both economic diversification and growth opportunities of regional economies (related variety) and regional portfolio circumstances that mitigate economic shocks (unrelated variety) (Frenken et al. 2007). The regional and regional-sector levels of analyses comply most naturally to the competitive cluster conditions for specialized or diversified growth. Castaldi et al. (2015) extend the latter discussion to the question under what conditions regions and countries with unrelated variety can also yield product innovation (especially radical ones), and why some regions and countries manage to diversify into (cognitively) unrelated industries.

The broader benefits of diversification may be captured either by the accessibility of knowledge in other than localized networks, especially via international relations of migration, cooperation and multinational enterprises (Content and Frenken 2016; Boschma and Iammarino 2009), or by the absorptive, learning and entrepreneurial capacities of firms that are able, beyond path dependence, to internalize previously unknown knowledge from outside the firm (Fritsch and Kublina 2018). While the former may still be researched with aggregate data (on MNEs, migrants and cooperation), the latter requires studies to focus on firm-level performance and mechanisms of interaction (of agglomerated regions, sectors and clusters, e.g., transfer mechanisms of knowledge and knowledge spillovers), something that is still underdeveloped in the literature (Burger et al. 2011; Van Oort et al. 2012; Knoben et al. 2016). According to Van Oort (2015), such a shift in focus from the regional to the firm level is also needed to capture the sectoral and firm-, entrepreneurial- and skilled-worker-level heterogeneity that actually causes differences in regional development. More specifically, it helps us to better understand for whom related and unrelated variety matter and under which circumstances. This paper explores under which heterogeneous contextual conditions cognitive relatedness (“related and unrelated variety”) is associated with firm-level productivity. Here, it is believed that related variety is positively related to firm productivity through facilitating knowledge spillovers and local outsourcing (Castaldi et al. 2015), while high levels of unrelated variety are expected to be negatively associated with firm productivity because firms operating in very different industries lack complementarities in factor inputs and generate few knowledge spillovers (Aarstad et al. 2016).

From the firm-context literature, it is clear that it is important to take multilevel heterogeneity into account when analyzing productivity (Syverson 2011). Both the relatedness (leading to externalities) literature and the firm performance literature hint at their potential interplay, although this has not yet been substantiated in empirical analyses. Despite the complex and sophisticated methods of conceptually linking region and sector-related spillovers with growth and cities, an ever-growing body of empirical literature on external economies of scale remains inconclusive on the exact agglomeration circumstances that optimally enhance growth (Melo et al. 2009; De Groot et al. 2015). Van Oort et al. (2012) argue that the missing link behind this ambiguity in research results on agglomeration economies may be the hypothesized, yet often missing relationship between heterogeneous agglomeration economies and individual firm performance. The related and unrelated variety hypotheses were originally embedded in this agglomeration literature, suggesting that related variety as a measure of cognitive proximity captures growth and productivity circumstances (externalities) of regions and cities better than other measures of localization and urbanization (see Content and Frenken 2016 for an overview of the burgeoning literature confirming this).

Although the measurement of relatedness across regions has advanced over the last decade from sectoral branching (Frenken et al. 2007; Hidalgo et al. 2007) to include co-location (Ellison et al. 2010; Neffke et al. 2011), skill relatedness (Neffke and Henning 2013), technological cooperation (Basile et al. 2012) and institutional relatedness (Boschma and Capone 2015; Cortinovis et al. 2017), the exact benefits for firms of any of these forms of relatedness remain unknown. This gap is remarkable because the theories that underlie spatial externalities and agglomeration economies are microeconomic in nature. Heterogeneity at the firm level (Baldwin and Okubo 2006; Cingano and Schivardi 2004), the inter-regional level (Acemoglu and Dell 2010), various firm-context interplay levels (Syverson 2011) and the level of mechanisms of agglomeration economies (Faggio et al. 2017) are introduced in the literature as important for understanding variations in firm-level productivity. A link to cognitive relatedness (inspired by agglomeration externalities), however defined, is again missing, although Neffke and Henning (2014), in a framework of agents of change and entrepreneurship, explore the role of the firm in diversification processes (not productivity).

In this study, we use data and methods that allow us to disentangle firm-level variation in productivity from that in heterogeneously composed regions, sectors and size types of firms, while identifying relationships between agglomeration and relatedness variables. Previous work on variety and firm productivity has in this regard argued that related variety positively influences productivity by enabling knowledge spillovers and increasing possibilities of outsourcing locally (Cainelli et al. 2016), while unrelated variety negatively impacts productivity of firms because very different industries do not have complementarities in factor inputs and knowledge spillovers (Aarstad et al. 2016). Building on the firm-level studies (Cainelli and Iacobucci 2012; Aarstad et al. 2016; Cainelli et al. 2016; Innocenti and Zampi 2019) that have examined the relationship between variety and firm performance, we add to the literature by (a) applying firm-level production functions with capital and labor, (b) estimating firm-level total factor productivity, (c) introducing relatedness as a moderating agglomeration externalities mechanism for productivity (controlling for other determinants) and (d) introducing heterogeneity at firm, regional, sector and region–sector (cluster) levels (including business services).

Applying this to Europe (in 2010), we find that regionally and sectorally bounded externalities of relatedness foster firm productivity to a large extent (region more so than sector). More specifically, we find that related variety has a positive impact on firm-level productivity, especially for firms in high-tech sectors and regions, for knowledge-intensive business services and for firms of all sizes except micro-firms. Robustness analyses reveal that firm productivity is not affected by regional and sectoral high-tech characterizations to the same degree, but by regional contexts more than by sectoral ones. Similarly, larger firms systematically have higher relatedness impacts than those that are only sectorally advanced. Unrelated variety is positively associated with productivity of medium-tech firms and micro-firms, and negatively with productivity of low-tech firms, indicating that cognitively unrelated activities can still work out positively for productivity in European firms in specific entrepreneurial circumstances. This heterogeneity has important implications for European and national policies.

The remainder of this paper is organized as follows. Section 2 introduces related and unrelated variety in agglomeration and strategic management studies. Section 3 sets out our data, research methodology and model specification. Section 4 presents modeling results, and Sect. 5 concludes and discusses policy implications.

2 Related literature

In economic geography and regional economics, there is now a long tradition of research that has examined which static and dynamic agglomeration conditions give rise to economic growth. Melo et al. (2009) and De Groot et al. (2015) document accurately the stream of literature that started with the seminal contributions of Glaeser et al. (1992) and Henderson et al. (1995) on localized agglomeration externalities and economic growth. The notion of agglomeration economies refers to external economies of scale arising from the concentration of firms and individuals in space. The effects of agglomeration economies on firm productivity have been researched extensively. Puga (2010) summarizes the evidence of such agglomeration economies based on (1) a clustering of production beyond what can be explained by chance or comparative advantage, (2) on spatial patterns in wages and rents and (3) on systematic variations in productivity with the urban environment. Productivity gains from agglomeration economies have been documented unequivocally in the literature. Melo et al. (2009) in a meta-analysis of 729 elasticities taken from 34 different studies found that the productivity gains of urban agglomeration economies are positive, but vary greatly in their reported estimates. The authors attribute this significant variability in the magnitude of agglomeration effects to country-specific effects, the industrial coverage, the specification of agglomeration economies and the presence of controls for both unobserved cross-sectional heterogeneity and differences in time-variant labor quality.

Concurrently, there is mounting evidence that the question whether sectoral diversity or specialization is most conducive to regional growth, is actually not very informative from the start as it ignores the fundamental heterogeneity on multiple levels involved. Studies on the effects of agglomeration externalities on regional growth have indeed produced mixed results to such a large extent, that the dichotomy between specialization and diversification cannot fully capture the complexity and heterogeneity of agglomeration externalities. Instead, there is by now also a burgeoning literature stressing that specialization and diversification should be seen as complementary over industry life cycles, contracted to diversification opportunities and capabilities of firms and institutions in economies, with entrepreneurial and cognitive skills functioning as catalysts for emerging and renewed economic diversification (Cortinovis et al. 2017; Van Oort et al. 2012; Faggio et al. 2017). Yet, diversification or (in later stages) specialization in itself does not guarantee economic growth.

Featuring the idea of cognitive proximity, Frenken et al. (2007) distinguish between related and unrelated variety: it is not just the presence of different technological or industrial sectors that stimulates regional development; industries need complementarities in terms of shared competences or untraded capabilities and interdependencies (Storper 1997). Related variety is distinguished from unrelated variety, since information and knowledge will not transfer to all different industries evenly due to varying cognitive distances between industry pairs (Ellison et al. 2010). It is argued that industries are more related when they are closer to each other in an industry classification system, measured by co-occurrence and hierarchical industry classification membership. Recent studies have shown that especially related variety contributes to productivity and employment growth at the regional level (see Content and Frenken 2016 for an overview). Whereas the early studies on related variety mainly examined the overall effect of varieties on regional performance, more recently scholars have turned attention to the issues that (a) relatedness can be measured in ways more directly linked to mechanisms of knowledge transfer (Hartog et al. 2012), for example input–output relations (Fan and Lang 2002; Feser 2003), labor flows (Neffke and Henning 2013, 2013; Diodato and Weterings 2014) and innovation-based co-operations (Neffke and Henning 2013; Rigby 2012), and (b) that related variety is perhaps not a conditio sine qua non either for regional performance, but that the impact of related variety (or agglomeration externalities) on growth is dependent on the characteristics of the sector, size, age and evolution phases of firms (Faggio et al. 2017), institutional settings in nations and regions (Boschma et al. 2015; Cortinovis et al. 2017) and absorptive capacities of firms in regional economic systems (Fritsch and Kublina 2018).
In studies that focus on sectoral heterogeneity and diversification, Hartog et al. (2012), Timmermans and Boschma (2013), Marrocu and Paci (2013), Bishop and Gripaios (2010) and Cortinovis and Van Oort (2015) found that related variety is more important for high-tech industries, where acquisition and internalizing of new knowledge is among the key issues for firm survival and growth. Yet, there is a legion of possibilities for the impact of heterogeneity, ranging from sectors to regions to sector-industry combinations to firm-level capabilities and characteristics. This calls for careful quantification and modeling of multilevel processes and characteristics (Knoben et al. 2016).

Increasingly, more empirical studies on agglomeration use disaggregated and firm- and plant-level data, with cities or city industries as a reference unit that is built-up by heterogeneous attributes (Wixe 2015). Studies using aggregated data provide only limited insights and weak support for the effects of agglomeration economies on firm performance (Van Oort et al. 2012). Regional-level relationships are not necessarily reproduced at the firm level because information on the variance between firms is lost when aggregated regional-level data are used. Hence, even if regions endowed with a greater number of agglomeration economies grow faster, this conclusion cannot be generalized to firms.

In addition, agglomeration effects found in area-based studies may be purely compositional (Neumark and Simpson 2015). Combes et al. (2012) and Behrens et al. (2014) show how spatial sorting influences productivity measures. Similarly, Baldwin and Okubo (2006) show that the agglomeration of productive firms may simply be the result of a spatial selection process in which more productive firms are drawn to dense economic areas. For this reason, it remains unclear whether geographical differences are an artifact of location characteristics (e.g., agglomeration economies) or are simply caused by differences in business and economic composition (heterogeneity). Urban economic and spatial econometrics can identify urban and regional-level phenomena while controlling for sorting effects of firms and workers. Alternatively, following social sciences, hierarchical random effects or multilevel modeling, which allows the micro-level and macro-level (agglomerated contexts) to be modeled simultaneously, is becoming an increasingly common practice in strategic management and organization studies (Corrado and Fingleton 2012; Knoben et al. 2016). In this paper, which addresses micro–macro-level heterogeneity and interrelationships (basically, the question of which types of firms profit from which types of agglomeration and externality contexts), our analysis is served by multilevel modeling.

The multilevel character of firm-level productivity has been set out by Cingano and Schivardi (2004) and Syverson (2011), who make a plea for analyzing firm-level total factor productivity in relation to simultaneously important contexts. Syverson (2011) in particular surveys and evaluates recent empirical work addressing the question of why businesses differ in their measured productivity levels. The causes turn out to be manifold and differ depending on the particular setting. He includes elements sourced in production practices (over which producers therefore have some direct control, at least in theory) as well as elements from producers’ external operating environments (contexts). Papers by Maksimovic and Phillips (2002) and Schoar (2002) both investigate the productivity of plants within conglomerate firms (in their setting, those that operate in multiple two- or three-digit SIC industries) and leverage US manufacturer micro-data to argue convincingly that diversification leads firms to outperform in productivity because of learning effects inducing selection, the optimal allocation of resources within a business and capable management. The relatedness literature and its absorptive capacity interpretation use similar arguments, meaning that relatedness (variety) does not directly foster regional development, but only indirectly through its effect on firm performance.

Following the discussed literature, our models, which to the best of our knowledge are among the first to capture firm-level productivity in various contexts simultaneously, should incorporate heterogeneity at the firm level (degree of high-tech and innovativeness), at the regional level (core-periphery distinction of technologically advanced regional production complexes), at the sectoral level (including high-tech manufacturing and knowledge-intensive business services) and at the sector-by-region level (clusters). Previous literature that has linked variety to productivity has argued that

Our study comes closest to the work of Aarstad et al. (2016), who conducted a multilevel study on the relationship between related and unrelated variety and enterprise productivity. Using Norwegian data, the authors concluded that unrelated variety is negatively associated with enterprise productivity, whereas related variety increases the innovation of the enterprises. Cainelli et al. (2016) examined the relation between related variety and total factor productivity of over 12,000 Italian manufacturing companies. They found that there is a positive effect of related variety on firms’ TFP, although this effect was less strong than the effect of the specialization-related localization externalities. In related work, Cainelli and Iacobucci (2012) focused on Italian business groups to analyze the effect of the variety of industries in the local area on the vertical integration of firms. They found that especially a higher level of vertically related variety (vis-à-vis unrelated variety) is associated with a lower need of firms to integrate activities along the production chain because they can obtain intermediate goods and services locally. In addition, the authors showed that in areas with many small firms variety significantly influences the degree of vertical integration, while in areas characterized by many small firms related variety significantly influences activities along the production chain. Conversely, Innocenti and Zampi (2019) did not find a significant association between related and unrelated variety and the growth of innovative start-ups at an early stage of their life.

3 Data and methodology

Using a Cobb–Douglas production function framework and multilevel (mixed) models, we control for firm inputs such as employment and capital, but also control for other regional-level indicators like localization, education level, population density and economic openness. In this, we are also explicitly interested in the extent to which firms in different industries are differently affected by related and unrelated variety. To analyze the relationship between related variety, unrelated variety and firm productivity, we use data from the ORBIS–Amadeus database, a commercial database provided by Bureau van Dijk (Moody’s Analytics). The ORBIS–Amadeus database provides financial and economic information on firms in almost all European countries, including balance sheets and income statement items, and provides a wide range of performance indices. ORBIS–Amadeus also includes information on a firm’s location, ownership and age and assigns NACE codes to companies (the European standard of industry classification, after the French Nomenclature statistique des Activités économiques dans la Communauté Européenne), which can be used to classify firms according to sector. The effort to standardize and harmonize financial information included in the ORBIS–Amadeus database and make it comparable across countries has resulted in its widespread use in research (Kalemli-Özcan 2015). We employ a cross-sectional database of firms for 13 European Union countries for the year 2010. Only firms with known value added for that year are included in the model. Our sample consists of 200,463 firms, located across these 13 European countries.

We distinguish between several classifications in our models. First, we categorized firms into 184 NUTS-2 regions, second into 9 sectors and third into 1525 sectors by regions (clusters). The countries included in the sample are Belgium, Czech Republic, Germany, Spain, Finland, France, Italy, Portugal, Sweden, UK, Slovakia, Poland and Hungary. The 9 sectors include (1) Manufacturing 1 (food and textiles), (2) Manufacturing 2 (chemicals and plastic), (3) Manufacturing 3 (metals), (4) Manufacturing 4 (transportation and storage), (5) information and communication, (6) real estate services, (7) professional, scientific and technical activities, (8) administrative and support service activities and (9) other service activities. Following Wintjes and Hollanders (2010) and Cortinovis and Van Oort (2015), regions were ranked according to their capacities in terms of knowledge accessibility, knowledge absorption and knowledge diffusion (see “Appendix 1”). We categorized our regions into three technological regimes (high technological regime, medium technological regime and low technological regime). Furthermore, firms were categorized according to their technological and knowledge intensity, following Faggio et al. (2017), into four categories: (1) high–medium-technology manufacturing, (2) medium–low technology manufacturing, (3) knowledge-intensive business services and (4) less knowledge-intensive business services. Although service industries are frequently studied in agglomeration analyses (Henderson et al. 1995; Kolko 2010; Jacobs et al. 2013), they are mostly absent in relatedness studies that draw on data from input–output methodologies or manufacturing surveys. We further categorized firms according to their size into four categories (micro-sized, small-sized, medium-sized and large-sized firms), according to the European Union categorization.
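The size classification can be sketched in a few lines. The sketch below assumes the staff-headcount thresholds of the European Commission's SME definition (fewer than 10, 50 and 250 employees); the full EU definition also uses turnover and balance-sheet ceilings, which are ignored here, and the function name `size_class` is purely illustrative rather than part of our data processing.

```python
def size_class(employees):
    """Assign an EU-style size class from staff headcount alone.

    Assumption: only the headcount criterion is applied; the turnover and
    balance-sheet criteria of the EU definition are left out for brevity.
    """
    if employees < 10:
        return "micro"
    if employees < 50:
        return "small"
    if employees < 250:
        return "medium"
    return "large"


# Hypothetical firms, for illustration only.
classes = [size_class(n) for n in (4, 25, 120, 900)]
```

Such a categorical variable can then be used to split the sample or to interact firm size with the variety measures.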

Our study employs three (composed) firm-level variables of the ORBIS–Amadeus database. The dependent variable in this study is firm labor productivity, measured as the logarithm of value added per employee for each firm \( i \) for the year 2010. Employment is measured as the logarithm of the number of employees per firm. Capital is measured as the logarithm of the sum of tangible fixed assets and depreciation, expressed in thousands of euros (firm specific).
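A minimal sketch of how these three firm-level variables could be constructed from raw balance-sheet items is given below. The argument names are hypothetical placeholders rather than actual ORBIS–Amadeus field names, and the use of the natural logarithm is an assumption, since the base is not stated above.

```python
import math


def firm_variables(value_added, employees, tangible_fixed_assets, depreciation):
    """Build the log-transformed firm-level variables.

    Monetary inputs are assumed to be in thousands of euros; the natural
    logarithm is an assumption made for this illustration.
    """
    return {
        "log_productivity": math.log(value_added / employees),
        "log_employment": math.log(employees),
        "log_capital": math.log(tangible_fixed_assets + depreciation),
    }


# Hypothetical firm, for illustration only.
example = firm_variables(value_added=1000.0, employees=10,
                         tangible_fixed_assets=400.0, depreciation=100.0)
```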

Agglomeration variables are measured at the regional level, and besides related and unrelated variety, two other variables are introduced. Localization economies or sectoral specialization is measured as the concentration (location quotient) of employees per region (measured at the region-by-sector level). Localization economies concern sector-specific externalities that are thought to be important for productivity, especially in later stages of firm and industry life cycles (Kemeny and Storper 2015). Next, density is measured as population per km\(^{2}\). This dimension of agglomeration is not directly related to localization economies (specialization) and diversity economies, but to pure urban size effects (Puga 2002; Ciccone 2002; Ciccone and Hall 1996). In general, the literature suggests that higher density enables better interaction, enhancing growth (Rice et al. 2006), although recent research that controls for composition effects also moderates the suggested impact (Henderson 2003).
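For illustration, the location quotient can be sketched as follows. The sketch assumes the standard definition (a sector's share in regional employment divided by that sector's share in total employment across all regions); the exact reference area used for the denominator is not spelled out above, so this should be read as one plausible implementation, with hypothetical region and sector labels.

```python
def location_quotients(employment):
    """Compute location quotients from a {(region, sector): employees} mapping.

    LQ = (sector share in the region) / (sector share in all regions).
    Values above 1 indicate relative specialization of the region.
    """
    region_total, sector_total, grand_total = {}, {}, 0.0
    for (region, sector), emp in employment.items():
        region_total[region] = region_total.get(region, 0.0) + emp
        sector_total[sector] = sector_total.get(sector, 0.0) + emp
        grand_total += emp
    return {
        (region, sector): (emp / region_total[region])
        / (sector_total[sector] / grand_total)
        for (region, sector), emp in employment.items()
    }


# Hypothetical employment counts, for illustration only.
employment = {
    ("A", "manufacturing"): 30, ("A", "services"): 70,
    ("B", "manufacturing"): 70, ("B", "services"): 30,
}
lq = location_quotients(employment)
```

In this toy example, region B is relatively specialized in manufacturing and region A in services.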

Several variables are introduced on the regional level to control for further determinants of regional and cluster heterogeneity (Kim 1997). The degree of openness of European regions (a measure related to firm competition and regional competitiveness) is calculated as the total value of imports and exports in a region divided by the region’s GDP. This volume of trade indicator is based on make and use tables (IO-Table) for 2000 on NUTS-2 level concerning 14 sectors and 59 product categories, including services. This database was developed by the Netherlands Environmental Assessment Agency (PBL); see Thissen et al. (2017). The volume of trade goes up with the size of the region at a declining rate. It is strongly dependent on global economic development with competition on global markets, driving up productivity and attracting new investments and collaborations. High potential may also spill over to nearby regions or in the regional network of specialized and subcontracting industries and regions. The average educational level of regions is measured by the percentage of tertiary and higher educated in the total population. The relationship with (employment and productivity) growth is thought to be positive, as more skilled people can be more productive, and agglomeration may attract more of these people (Moretti 2004; Rauch 1993; for a more critical interpretation see Shapiro 2006).

To measure related and unrelated variety, we use an entropy measure proposed by Frenken et al. (2007) at the NUTS-2 level. When measuring regional variety to study the effects on firm productivity, decomposition is useful as it is expected that variety at a high level of sector aggregation reflects the possibility to switch between input substitutes (unrelated variety), while one expects variety at a low level of sector aggregation to be an indication of possible knowledge spillovers because of cognitive similarity (related variety). We also use the ORBIS–Amadeus micro-data as the source for the calculation of related and unrelated variety at the NUTS 1 level. Since small firms are underrepresented in this database, firm-level data are weighted by turnover values. In this fashion, we best capture the large and sectorally heterogeneous regional economies. We compute entropy using employment data, which are available at the four-digit level from the ORBIS–Amadeus database. Unrelated variety per region is indicated by the entropy of the two-digit distribution; related variety is indicated by the weighted sum of the entropy at the four-digit level within each two-digit class.

Marginal variety, meaning the increase in variety when moving from one digit level to the next, can be computed at all digit levels of the SIC classification in the dataset. Formally, let \( P_{i} \) be the share of four-digit SIC industry \( i \) in total regional employment, and let \( S_{g} \) be the set of four-digit industries that belong to two-digit sector \( g \) (with \( g = 1, \ldots ,G \)). Summing over all four-digit shares, one can measure variety as

$$ V = \mathop \sum \limits_{g = 1}^{G} \mathop \sum \limits_{{i \in S_{g} }} P_{i} \log_{2} \left( {\frac{1}{{P_{i} }}} \right). $$

This measure is variety in a general form. The higher its value, the higher the industrial diversification of a region. This measure can be split into related and unrelated variety. First, by summing the four-digit shares \( P_{i} \) within each two-digit sector, one can derive the two-digit shares \( P_{g} \):

$$ P_{g} = \mathop \sum \limits_{{i \in S_{g} }} P_{i} . $$

Then, unrelated variety is measured as the two-digit level entropy and is given by:

$$ {\text{UV}} = \mathop \sum \limits_{g = 1}^{G} P_{g} \log_{2} \left( {\frac{1}{{P_{g} }}} \right). $$

To measure related variety, we estimate the marginal increase in entropy when moving from the two-digit to the four-digit level. Formally, related variety is the weighted sum of entropy within each two-digit sector and is given by:

$$ {\text{RV}} = \mathop \sum \limits_{g = 1}^{G} P_{g} H_{g} $$

where

$$ H_{g} = \mathop \sum \limits_{{i \in S_{g} }} \frac{{P_{i} }}{{P_{g} }}\log_{2} \left( {\frac{1}{{P_{i} /P_{g} }}} \right) $$
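The entropy decomposition can be illustrated with a short numerical sketch. It assumes that the two-digit class of an industry is given by the first two characters of its four-digit code and uses unweighted, hypothetical employment counts (the turnover weighting described earlier is left out). The function name `variety_measures` is illustrative only.

```python
import math
from collections import defaultdict


def variety_measures(employment_by_code):
    """Return (total, unrelated, related) variety for one region.

    employment_by_code maps a four-digit industry code (string) to employment.
    Assumption: the first two characters identify the two-digit class.
    """
    total_emp = sum(employment_by_code.values())
    # Four-digit shares P_i; industries with zero employment add nothing.
    p_i = {c: e / total_emp for c, e in employment_by_code.items() if e > 0}

    # Two-digit shares P_g: sum of P_i over the industries in each class S_g.
    p_g = defaultdict(float)
    for code, share in p_i.items():
        p_g[code[:2]] += share

    total = sum(p * math.log2(1 / p) for p in p_i.values())
    unrelated = sum(p * math.log2(1 / p) for p in p_g.values())

    # Related variety: P_g-weighted sum of the within-class entropies H_g.
    related = 0.0
    for code, share in p_i.items():
        g = code[:2]
        within = share / p_g[g]  # P_i / P_g
        related += p_g[g] * within * math.log2(1 / within)
    return total, unrelated, related


# Hypothetical regional employment by four-digit code, for illustration only.
region = {"1011": 40, "1012": 40, "2511": 10, "2512": 10}
v, uv, rv = variety_measures(region)
```

In this example each two-digit class contains two equally sized four-digit industries, so every within-class entropy equals one bit and related variety equals one. Total variety equals the sum of unrelated and related variety, which is the decomposition property that motivates the entropy measure.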

The method of hierarchical or multilevel modeling allows the micro-level and macro-level to be modeled simultaneously. There are two distinct advantages to multilevel models (Van Oort et al. 2012). First, multilevel models offer a natural way to assess contextuality, or the extent to which a link exists between the macro-level and the micro-level. Applying multilevel analysis to empirical work on agglomeration begins from the simple observation that firms sharing the same external environment are more similar in their performance than firms that do not share the same external environment because of shared agglomeration or other externalities. Hence, we can assess the extent to which variance in firm-level productivity can be attributed to between-firm variance, between-area variance, between-sector variance or between-cluster (sector-region) variance. With multilevel analysis, we are able to assign variability to the appropriate context. Second, multilevel analysis allows us to incorporate unobserved heterogeneity into the model by including random intercepts and allowing relationships to vary across contexts through the inclusion of random coefficients. Whereas “standard” regression models are designed to model the mean, multilevel analyses focus on modeling variances explicitly.
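The idea of assigning variability to contexts can be illustrated with a simple descriptive decomposition. The sketch below splits the total sum of squares of a firm-level outcome into a between-group and a within-group part for a single grouping level. This is only a descriptive analogue under simplifying assumptions: multilevel models estimate variance components by (restricted) maximum likelihood and for several nested levels at once, so the snippet is not the estimator used in this paper, and the data are hypothetical.

```python
from statistics import mean


def between_group_share(values_by_group):
    """Share of the total sum of squares that lies between groups."""
    all_values = [v for vals in values_by_group.values() for v in vals]
    grand_mean = mean(all_values)
    between = sum(
        len(vals) * (mean(vals) - grand_mean) ** 2
        for vals in values_by_group.values()
    )
    total = sum((v - grand_mean) ** 2 for v in all_values)
    return between / total


# Hypothetical log productivity of firms in two regions, for illustration only.
log_productivity = {
    "region_A": [4.0, 4.2, 4.4],
    "region_B": [5.0, 5.2, 5.4],
}
share = between_group_share(log_productivity)
```

A share close to one indicates that firms within the same region resemble each other much more than firms in different regions, which is the kind of contextual clustering that motivates a multilevel specification.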

A Cobb–Douglas production function is expressed as follows:

$$ Y_{ijk} = A(T_{ijk} )^{{\beta_{n} }} K_{ijk}^{{\beta_{1} }} L_{ijk}^{{\beta_{2} }} e^{{e_{ijk} }} $$

where \( Y_{ijk} \) is labor productivity (value added per worker) of the \( i \)th firm nested within the \( j \)th sector-by-region group, which is nested within the \( k \)th region, \( A(T_{ijk} ) \) is TFP, \( K_{ijk} \) the capital stock and \( L_{ijk} \) the labor force. TFP of firm \( i \) depends on its regional and sectorial environment (see Fig. 1), and it is expressed in terms of related and unrelated variety (sectorial), localization and urbanization economies, openness of the region and level of education of the inhabitants (regional). Taking logs of Eq. (1), with \( y_{ijk} \) denoting the logarithm of \( Y_{ijk} \), and applying a three-level model which involves regions at the higher level, sectors by regions at the meso-level and firms at the lower level, we obtain Eq. (2):

$$ \begin{aligned} y_{ijk} & = \beta_{0} + \beta_{1} {\text{CAP}}_{ijk} + \beta_{2} {\text{EMPL}}_{ijk} + \beta_{3} {\text{REL}}_{ijk} + \beta_{4} {\text{UNREL}}_{ijk} + \beta_{5} {\text{LOC}}_{ijk} \\ & \quad + \,\beta_{6} {\text{URB}}_{ijk} + \beta_{7} {\text{OPEN}}_{ijk} + \beta_{8} {\text{EDUC}}_{ijk} + u_{00k} + u_{0jk} + e_{0ijk} \\ \end{aligned} $$

where the total variance is partitioned into three components: \( u_{00k} \) captures the variance between regions, \( u_{0jk} \) the variance between sectors by regions but within regions, and \( e_{0ijk} \) the variance within sectors by regions. Figure 1 presents the multilevel structure of the analyses.
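The step from Eq. (1) to Eq. (2) can be made explicit. Taking natural logarithms of both sides of Eq. (1) gives the intermediate expression below; this is a worked step added for clarity, not an equation from the original derivation.

$$ \ln Y_{ijk} = \beta_{n} \ln A(T_{ijk} ) + \beta_{1} \ln K_{ijk} + \beta_{2} \ln L_{ijk} + e_{ijk} $$

Eq. (2) then follows under two assumptions that are read off from its form rather than stated explicitly: that CAP and EMPL stand for the logged capital and labor inputs, and that the TFP term is linear in the variety, agglomeration, openness and education variables, with the remaining unexplained part split into the region-level, sector-by-region-level and firm-level error terms.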

Fig. 1 Multilevel structure of analyses
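The nesting of firms within sectors by regions within regions, and the random-intercept structure of Eq. (2), can be illustrated with a small simulation. This is a sketch under stated assumptions: the function `simulate_three_level`, the group sizes and the standard deviations are all hypothetical, predictors are omitted, and normally distributed effects are assumed for convenience.

```python
import random


def simulate_three_level(n_regions=3, n_sectors=2, n_firms=4,
                         sd_region=0.5, sd_cluster=0.5, sd_firm=1.0,
                         beta0=1.0, seed=0):
    """Simulate log productivity with region, sector-by-region and firm terms."""
    rng = random.Random(seed)
    firms = []
    for k in range(n_regions):
        # Region-level random intercept, shared by all firms in region k.
        u_region = rng.gauss(0.0, sd_region)
        for j in range(n_sectors):
            # Sector-by-region (cluster) random intercept within region k.
            u_cluster = rng.gauss(0.0, sd_cluster)
            for i in range(n_firms):
                # Firm-level residual.
                e_firm = rng.gauss(0.0, sd_firm)
                firms.append({
                    "region": k, "sector": j, "firm": i,
                    "u_region": u_region, "u_cluster": u_cluster,
                    "y": beta0 + u_region + u_cluster + e_firm,
                })
    return firms


sample = simulate_three_level()
print(len(sample), "simulated firms")
```

Because firms in the same region share one region-level draw, and firms in the same sector-by-region group additionally share a cluster-level draw, their outcomes are correlated by construction, which is the dependence a single-level regression would ignore.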

In Eq. (2), we assume that the firm-level predictor variables are uncorrelated with the sector- and regional-level error terms and that the sector-level predictor variables are uncorrelated with the regional-level error terms. However, both theoretically and empirically, such an assumption is difficult to meet, and not correcting for this would lead to inconsistent parameter estimates. Following Snijders and Berkhof (2008), we therefore remove the correlation between the lower-level predictor variables and higher-level error terms by including sector and regional means of the firm-level predictor variables in the regression model, a procedure known as the Mundlak (1978) correction.
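The data preparation behind this correction can be sketched as follows. The helper `add_group_means`, the column names and the toy records are hypothetical; the sketch only shows how group means of firm-level predictors are appended as additional regressors, not the estimation itself.

```python
from collections import defaultdict


def add_group_means(rows, group_key, predictors):
    """Return copies of rows with the group mean of each predictor appended."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for row in rows:
        group = row[group_key]
        counts[group] += 1
        for name in predictors:
            sums[group][name] += row[name]
    augmented_rows = []
    for row in rows:
        group = row[group_key]
        new_row = dict(row)  # copy, so the input records stay unchanged
        for name in predictors:
            new_row[f"{name}_mean_{group_key}"] = sums[group][name] / counts[group]
        augmented_rows.append(new_row)
    return augmented_rows


# Toy firm records (illustrative values only).
firms = [
    {"firm": 1, "region": "A", "cluster": "A-1", "cap": 2.0, "empl": 1.0},
    {"firm": 2, "region": "A", "cluster": "A-2", "cap": 4.0, "empl": 3.0},
    {"firm": 3, "region": "B", "cluster": "B-1", "cap": 6.0, "empl": 5.0},
]

# Add sector-by-region means first, then regional means.
augmented = add_group_means(firms, "cluster", ["cap", "empl"])
augmented = add_group_means(augmented, "region", ["cap", "empl"])
```

Including the resulting mean columns alongside the original predictors lets the within-group and between-group parts of each firm-level variable take separate coefficients.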

Please note that although the Mundlak correction addresses part of the endogeneity problem, the multilevel modeling framework does not control for endogeneity arising from reverse causality between firms’ productivity and variety. Specifically, firms’ location choices, which result in lower or higher levels of related and unrelated variety in some places, could be induced by the migration of firms from places of low productivity to places of high productivity (Melo et al. 2009). One solution to this problem would be to instrument related and unrelated variety, but unfortunately, finding credible instruments is hard. Therefore, the results should be interpreted as conditional associations rather than causal relationships.

4 Results

Table 3 reports the results of the main model. Although there is no positive association between related variety and firm productivity in the model without control variables, related variety is positively and significantly related to firm productivity in the last, preferred specification, which includes sector and country fixed effects (dummies). Furthermore, in that specification, employment and capital (firm level) and education and population density (regional level) are positively and significantly related to firm productivity. Localization and unrelated variety do not have a significant impact on labor productivity in the main models presented in Table 3.

Table 1 Descriptive statistics of explaining variables
Table 2 Correlation matrix
Table 3 Regression analyses of firm productivity

Table 4 tests for the effect of related variety on firm labor productivity in a spatial regime setting, where the regimes reflect different degrees of regional high-tech characteristics. Related variety is positively and significantly related to firm productivity only in the high-technology regime; it is insignificant in medium-technology regime regions and negative and significant in low-technology regimes. All other significant results are as expected (with localization impacting negatively on firm productivity, as often encountered in the literature).

Table 4 Regression analyses of firm productivity and related variety across different spatial regimes

Table 5 focuses on the relationship between related variety and firm productivity in different sectors, identified according to knowledge intensity (compare Henderson et al. 1995; Faggio et al. 2017; Jacobs et al. 2013). Related variety is positively and significantly related to firm productivity in higher-technology manufacturing sectors and knowledge-intensive services. Our findings suggest that related variety as an agglomeration externality has the largest impact on high-tech and high-service firms, confirming earlier findings by, inter alia, Bishop and Gripaios (2010), Hartog et al. (2012), Timmermans and Boschma (2013) and Cortinovis and Van Oort (2015). In the sectoral models, unrelated variety has significant impacts on firm productivity similar to those of related variety (compare Castaldi et al. 2015), indicating that diversification opportunities in cognitively unrelated industries (breakthrough technologies) also correlate with firm productivity in high-tech and medium-tech manufacturing sectors.

Table 5 Regression analyses of firm productivity and related variety across different sectors

Table 6 presents models in which the sample is split according to firm size. Related variety is positively related to productivity for firms of all sizes except micro-firms. The results indicate that knowledge spillovers are more likely to be transmitted between firms that are not too small, i.e., that have larger absorptive capacities (Fritsch and Kublina 2018). Very small firms cannot easily adapt to new techniques or absorb diverse forms of knowledge. The positive and significant sign of localization for micro-firms indicates that, for them, specialization externalities are the ones contributing to firm growth. For the other firm size groups, localization is negatively and significantly related to firm productivity (Table 6).

Table 6 Regression analyses of firm productivity and related variety across different firm sizes

Tables 8, 9 and 10 (“Appendix 2”) show the results of robustness checks, in which further analyses were conducted on the interaction of spatial and sector heterogeneity. From these additional analyses, it can be concluded that the significance and the sign of related variety change when the technological level of regimes is combined with the knowledge intensity of sectors. The main sensitivity found is that related variety is foremost positively and significantly related to firm productivity when heterogeneity in the technological level of regions is addressed, while it is less related to the knowledge intensity of sectors. The effect of sectors diminishes when a firm operates in a high-technology regional regime, which implies that the positive and significant coefficient of related variety among high-tech sectors is a result of interdependencies with the regimes’ technological level, and not the result of separating it from low-and-medium-sector knowledge intensity variety. This suggests that related variety has an impact on firm productivity mainly through regionally embedded agglomeration economies.

Similarly, additional analyses are presented on the interaction of sectors (high-tech manufacturing, medium- and low-tech manufacturing, knowledge-intensive services, less knowledge-intensive services) and firm size classes in Tables 11, 12, 13 and 14 in “Appendix 2”. These tables show that the larger the firms are, the larger the impact of related variety and localization economies on firm-level productivity. The knowledge-intensive services show identical patterns. In contrast, medium- and low-tech firms and less knowledge-intensive firms show hardly any productivity impact from related variety.

5 Conclusions and discussion

In this paper, we use a model that takes multilevel heterogeneity into account when analyzing productivity in a firm context. Both the relatedness/externalities literature and the firm performance literature hint at their potential interplay, but this has not yet been substantiated in detailed analyses at the European level. Our analyses make clear that differences in regional, sectoral and firm size compositions of economic activities condition the degree to which firms are more productive in Europe. These insights are important for analyzing the growth opportunities and competitive advantage of regions, and for drafting place-based policies to improve these positions. Given the hypothesized importance of European national and regional integration, the testing of the effectiveness of sectoral policies, and the identification of entrepreneurial and firm-level opportunities and niches, the absence of insights into these policies so far is remarkable.

We used data (for 2010) and methods that allowed us to disentangle firm-level variation in productivity from that in heterogeneously composed regions, sectors, region–sector combinations (clusters) and firm size classes, while identifying relationships between agglomeration and relatedness variables. Applying this to Europe, we found that regionally and sectorally bounded externalities of relatedness fostered firm productivity to a large extent, suggesting that the regional level of absorptive capacity of firms plays an important role. More specifically, we found that related variety impacts positively on firm-level productivity, especially for firms in high-tech sectors and regions, for knowledge-intensive business services and for firms of all sizes except micro-firms.

Robustness analyses revealed that firm productivity was not affected by sectoral and regional high-tech characterizations to the same degree, but by regionally mediated spillovers more than by sectoral ones. Similarly, larger firms systematically had higher relatedness impacts than those that were only sectorally advanced. These findings back the general notion of place-based (regional) development strategies in the European Union being an essential ingredient next to people-based or firm-based policies (Barca et al. 2012), as localized policies are best used to address specific bottlenecks and diversification opportunities given path-dependent development trajectories locally.

This research also has limitations that may be addressed in future research. The analysis was based on a cross-sectional dataset of firms in European regions and sectors for 2010. Dynamic data would allow better identification and additional insights into growth opportunities. As the measurement of relatedness becomes more and more advanced, staying increasingly close to mechanisms of knowledge transfer (labor mobility, innovative cooperation, detailed input–output linkages), future analyses could adopt these measurement improvements to test the robustness of the outcomes reported here. Another problem that has remained unaddressed in the paper is that the effects of related and unrelated variety could be biased by endogeneity arising from reverse causality between firms’ productivity and variety: the location choices of firms, which shape related and unrelated variety, could themselves be induced by the high levels of productivity present in some places (Melo et al. 2009). In other words, differences in productivity between places might result in migration of firms. Finally, finer-grained data at the firm level (on absorptive capacity, for instance) would give us a better grip on the impacts of heterogeneity.