1 Introduction

Innovation through R&D is recognized in endogenous growth theory as a main driver of economic growth (see, for example, Barro and Sala-i-Martin 1995; Bresnahan and Trajtenberg 1995). While this theoretical claim has been backed by abundant empirical evidence, the same strand of research also indicates that the relationship between innovation and growth is rather heterogeneous in nature (Romer 1986, 1990a, b, 1994; Grossman and Helpman 1993; Lichtenberg 1993; Eaton and Kortum 1997). Distinct regions can be identified that outperform others by generating relatively high productivity compared to their level of R&D expenditure, and this efficiency in the utilization of R&D expenditure has often been explained on the basis of a socially dependent diffusion rationale (see Davelaar and Nijkamp 1989; Rodriguez-Pose 1999; Bresnahan and Yin 2010). In the present study, we direct attention to this puzzling issue one step earlier in the knowledge diffusion process, namely, whether and how the social lattice in a locality influences the number of new pieces of knowledge produced by R&D that will eventually appear in that locality. This step is the most essential one, as it defines the initial conditions for endogenous economic growth based on R&D innovation.

Taking this premise as the starting point of the current analysis, we refer here to the recent cultural percolation of ideas hypothesis presented by Tubadji and Nijkamp (2015) (see footnote 1). In line with the cornerstones of the underlying culture-based development (CBD) concept, the authors argue that the emergence of an idea from the world of ideas into the material world of documented R&D innovation resembles the physical process of percolation. Percolation is the process of penetration of a liquid through a porous lattice. Analogously, ideas pass through the social lattice in a locality in order to be selected and implemented by a local research agent, and thus to be transformed into R&D output. This filtering process is generally independent of the size of the investment, but it can be expected to experience a higher cultural bias as the R&D investment increases, due to the perceived higher risk of failure (for more details on culture, cultural bias and risk, see Tubadji 2012, 2013; Batabyal and Nijkamp 2014). Moreover, every percolation is determined by its tipping point (a switching point between a 0 and a 1 mode of existence).

The tipping point of each lattice is a function of the structure of the lattice, but even in the natural sciences such as physics and mathematics the tipping point is known only for some relatively simple lattice structures. The exact structure of the entire local social network is very difficult to capture and is unobservable in its entirety. Therefore, its tipping point is practically impossible to estimate directly. Thus, the only way to approach the cultural percolation model empirically is through an approximation of its tipping point. Some approximations for the structure of social networks (and not for their tipping points) have been tried (see e.g. Zenou 2012), which use the labour market structure to approximate the network structure. Yet this leads, first, to endogeneity issues if the effect of the network on economic processes related to the labour market is investigated. Second, a very strong assumption about the social network structure has to be made, in particular that it depends only on immediate one, two or three connections, for which the tipping point is known and can be calculated. Instead, we propose here an alternative measure related to a purely cultural paradigm, the six degrees of separation in society, as a possible endogeneity-free approximation for the tipping point of cultural percolation.

It should be noted that existing research on the dissemination of information through a network has generated the well-known concept of the ‘six degrees of separation’, i.e. the importance of being able to connect a random draw of six people in a row in order to transmit a piece of information consistently from person A to person B, who do not know each other (Traverse and Milgram 1969; Granovetter 1973; Watts and Strogatz 1998). We reinterpret this notion as the dependence of efficient network communication on the likelihood of having six degrees of social connectedness (SDSC) between six random members of the network. More precisely, we are interested in the likelihood that six random people in the social network share the same attitude levels (i.e. common cultural values). The higher the likelihood of having SDSC, the lower the attitudinal diversity in the locality, but also the higher the expected speed of percolation of certain types of ideas. If the notion of SDSC is empirically based on a variable related to the open-mindedness of the milieu towards new ideas, these SDSC will lead to a more efficient and more productive percolation of ideas, which can be quantified as more original ideas (or more patents). Furthermore, as we know from the literature on homogeneity and social networks (see for instance Lozares et al. 2014), the higher the cultural proximity, the higher the ‘love’ for each other. This means that the likelihood of cultural diversity among six randomly drawn people in a locality will predict how efficiently they will be able to communicate with each other and how connected they will be in their attitudes (see Akerlof and Kranton 2000). Thus, the six degrees of separation (or social connectedness) can be approximated by the likelihood that six random people in a locality have the same cultural origin. If we use people whose cultural origin is foreign and who thus carry a sense of diversity, we can expect our six degrees measure to be positively related to more innovative ideas in R&D investment.

Thus, the aim of the current paper is to generalize the six degrees of separation notion from the perspective of local culture (which can be measured through a general measure of local attitude and cultural diversity; see footnote 2) and to shed additional light on the role of the cultural factor in R&D investment efficiency in the NUTS2 localities of the EU27. The next section offers an overview of the conceptual motivation behind this generalization of the six degrees of separation and their involvement in the percolation mechanism explained in Tubadji and Nijkamp (2014). Section 3 presents the data for the numerical illustration for the available EU27 regions and the difference-in-differences approach applied for operationalizing our six degrees of separation and cultural percolation of R&D ideas hypothesis. Section 4 offers some concluding remarks and discusses the replicability of the numerical illustration offered here.

2 Cultural percolation of R&D ideas as a function of the six degrees of separation

Below follows an exposé of the conceptual blending of a Schumpeterian quality ladder model with a local cultural relativism perspective. We discuss a Schumpeterian-motivated model for the percolation of new ideas through the local social network. Cultural percolation is defined as the passage of an idea from the pool of possible ideas through the social network structure until it becomes an idea selected by the decision-making processes for R&D investment. The idea selected for R&D investment becomes an output of this investment, with actual innovative value added that can be documented by a patent (see Tubadji and Nijkamp 2015 for a more comprehensive elaboration of the cultural percolation model). Every percolation process, however, depends on its threshold, which operates like a tipping point and is generally very difficult to capture quantitatively, as we elaborate below. Thus, our novel proposition here is that the six degrees of separation paradigm can be theoretically justified as an empirical approach to approximating the tipping point of this percolation process through the social network. A gradual elaboration of this rationale follows.

The essence of the cultural percolation model of new ideas lies in the filtering and selection, from the pool of possible ideas, of the distinct ideas in which research investment will be made. This model is based on one of the most refined examples of modeling R&D investment decisions, the Schumpeterian quality-ladder model. As a model of innovation (Schumpeter 1934, 1942; Aghion and Howitt 1992; King and Levine 1993; Ericson and Pakes 1995; Cohen and Klepper 1996a, b; Klette and Griliches 2000), it was developed in the context of firms which choose whether to invest in a specific R&D project based on a net present value financial analysis of the investment opportunity. In this setting, the firm weighs the expected duration of a monopoly profit against the total expenditure required for R&D (without accounting for its distribution across researchers). Moreover, the Schumpeterian ladders (rungs) of quality express the behaviour of the firm vis-a-vis the firm-specific knowledge at a given constant stock of universal knowledge, which is the basis of its evaluation of future opportunities. Formally, the model is driven by (i) the aggregate flow of resources expended by potential innovators in a given sector when the highest rung (the highest step of the ladder of knowledge ever addressed) is targeted for R&D investment; and (ii) the net expected return on investment. The expected return on investment, however, as argued in Tubadji and Nijkamp (2015), is the output of a forecast function that involves uncertainty of the prediction due to cultural bias in the interpretation of present and future possibilities. Put differently, the investment choice in the Schumpeterian quality ladder model is a function of two elements: (i) the available lump sum for investment; and (ii) the expectation of quality, i.e. the expected price of the new idea if it is implemented. While the first element is a hard matter, the second element is above all a function of the decision maker’s preferences and risk aversion biases, which both derive from local culture. This can be expressed formally through the limits of the choice function and the attitude to risk function, as follows:

$$\begin{aligned} \hbox {lim f}({\hbox {Choice}})= & {} \hbox {lim f}( {\hbox {Investment}\_{\mathrm{Z}}}) +\hbox {lim f}( {\hbox {Expectation for NPV}})\end{aligned}$$
(1)
$$\begin{aligned} \hbox {lim f}({\hbox {Choice}})= & {} \hbox {constant1}+\hbox {lim f}({\hbox {All}\_\mathrm{{Knowledge}}_\mathrm{{t}}})\nonumber \\&\times \, \hbox {lim f}({\hbox {culturally}\_\mathrm{{rel}}\_\mathrm{{risk}}\_\mathrm{{attitude}}})\end{aligned}$$
(2)
$$\begin{aligned} \hbox {lim f}({\hbox {Choice}})= & {} \hbox {constant 1}+ \hbox {constant 2}\times \hbox {lim f}( {\hbox {culturally}\_\mathrm{{rel}}\_\mathrm{{risk}}\_\mathrm{{attitude}}})\nonumber \\ \end{aligned}$$
(3)

Equations (1)–(3) together demonstrate the dependence of the limits of choice in the Schumpeterian quality ladder model on the cultural relativity of the chooser. Namely, in (1) choice is a function of some investment lump sum (Investment_Z) and the net present value (NPV) expected from it. As Investment_Z is a constant, its limit is also a constant (constant 1), so the limit of the choice will vary with the expectation for the NPV. The NPV in turn is a function of knowledge and of expectations, the latter shaped by culturally relative risk attitudes, as demonstrated in (2). However, as shown in (3), the highest level of knowledge attainable at a given moment in time is the same for everyone who chooses, as Schumpeter’s model suggests, so its limit is again a constant (constant 2). Therefore, the variation of the limits of the choice is determined by the limits of the cultural relativity of risk attitudes. This means that, as risk attitudes differ across localities, for the same Z and the same knowledge (innovation epoch), the choice will tend towards a particular value depending on the risk attitude, whose limit varies locally according to local culture. Put differently, local culture determines what the limit of the risk attitude function tends to, which ultimately determines the limit of the choice function.

Next, we define choice as a 0 and 1 function, where 1 denotes a chosen and 0 a not chosen idea. This means that local culture determines a risk evaluation which also jumps sharply between 1 and 0. Therefore, our assumption that the choice of ideas is a cultural percolation process seems plausible. The remaining question is what mechanism shapes the cultural evaluation of risk attitudes. To answer this question, we need to understand two things: (a) that the cultural attitude function is a cultural percolation [varying between 0 and 1, being 1 when something is successfully communicated (agreed upon) in the social network, as opposed to 0 when communication on the same matter is unsuccessful (no agreement is reached) in the social network]; and (b) the threshold that the function has to pass for its limit to tend to 1 or 0, in other words, the threshold for the cultural percolation to happen or not. These two matters are explained below.

To fully understand the mechanism of the dependence between choice and culture, we employ here the cultural percolation hypothesis. Percolation processes were introduced by Broadbent and Hammersley (1957) to model the random flow of a fluid through a medium. Percolation theory differs from diffusion theory in that diffusion is defined as a random movement in a structureless medium, whereas percolation takes place in a structured medium (a lattice). The percolation process in a particular environment (a specific constant type of lattice) can be explained and generalized through the following mathematical condition for the occurrence of a percolation \(X_i(e)\):

$$\begin{aligned} X_i (e)=\left\{ \begin{array}{ll} 1 &{}\quad \text {with probability }p_i \\ 0 &{}\quad \text {with probability }1-p_i,\ 0<p_i <1. \end{array}\right. \end{aligned}$$
(4)

What is of special interest in this percolation process is the existence of the so-called percolation threshold. According to Kolmogorov’s zero-one law (see, for instance, van den Berg 2008), for any given p, the probability that an infinite cluster exists is either zero or one. Put differently, the percolation, i.e. the full penetration from top to bottom of a surface, is a dummy variable; it either is a fact (denoted by 1) or is not a fact (denoted by 0). The probability of switching between 0 and 1 is an increasing function of the connectivity in the lattice, denoted by CONN. For this function of CONN, there is a critical value of CONN (denoted by CONN\(_{\mathrm{c}}\), and also written p\(_{\mathrm{c}}\) below) below which the probability of percolation is always 0, while above this critical value the probability of percolation is always 1. This critical CONN is the percolation threshold. Moreover, around the critical value of CONN, the probability that an open path from the top to the bottom is achieved increases sharply from very close to zero to very close to one within a short span of values of CONN. Because of this last characteristic, the percolation threshold closely resembles the behavior of a tipping point (see Lamberson and Page 2012). The tipping point in percolation occurs when p\(_{\mathrm{c}}\) is passed, which allows the percolation of a liquid through the lattice. After a gradual change in CONN (from CONN\(_{\mathrm{c}}-1\) to CONN\(_{\mathrm{c}}\)), there is a change of the initial conditions allowing the percolation state to switch from 0 to 1.
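The sharp switch around the threshold can be illustrated numerically. The following is only a minimal sketch and not part of the analysis in this paper: it simulates bond percolation on a finite n × n square grid (an abstract lattice, not a social network), opens each bond independently with probability p as in (4), and estimates the share of random draws in which an open path connects the top row with the bottom row. The function names, the grid size and the number of trials are illustrative assumptions.

```python
import random


def find(parent, i):
    """Return the root of node i, shortening the path on the way."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def union(parent, a, b):
    """Merge the clusters containing nodes a and b."""
    root_a, root_b = find(parent, a), find(parent, b)
    if root_a != root_b:
        parent[root_a] = root_b


def crosses(n, p, rng):
    """One random draw: is there an open path from the top to the bottom row?"""
    top, bottom = n * n, n * n + 1  # two virtual nodes (illustrative device)
    parent = list(range(n * n + 2))
    for r in range(n):
        for c in range(n):
            node = r * n + c
            if r == 0:
                union(parent, node, top)
            if r == n - 1:
                union(parent, node, bottom)
            # Each bond is open with probability p, independently, as in (4).
            if c + 1 < n and rng.random() < p:
                union(parent, node, node + 1)
            if r + 1 < n and rng.random() < p:
                union(parent, node, node + n)
    return find(parent, top) == find(parent, bottom)


def crossing_share(n, p, trials, seed=0):
    """Share of draws with a top-to-bottom open path (seeded for repeatability)."""
    rng = random.Random(seed)
    return sum(crosses(n, p, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    for p in (0.3, 0.4, 0.5, 0.6, 0.7):
        print(p, crossing_share(n=30, p=p, trials=200))
```

On a finite grid the switch is steep rather than a strict jump; the zero-one behaviour described above refers to the infinite lattice.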

The calculation of the tipping point in percolation is a complex mathematical question. In some cases, CONN\(_{\mathrm{c}}\) can be explicitly calculated. Such a case is, for instance, the percolation threshold for the square lattice \(\mathbb{Z}^2\) in two dimensions, where CONN\(_{\mathrm{c}}= 1/2\) for bond percolation (Kesten 1982). For most infinite lattice graphs, however, p\(_{\mathrm{c}}\) cannot be calculated exactly. For example, p\(_{\mathrm{c}}\) is not known exactly for bond percolation in the hypercubic lattice in three dimensions. Luckily, what is known is that there is a limit case for lattices in many dimensions (the Bethe lattice), whose threshold is at CONN\(_{\mathrm{c}} = 1/(\hbox {z} - 1)\) for a coordination number z (see Bethe 1935). More precisely, a Bethe lattice, introduced by Hans Bethe in 1935, is a connected cycle-free graph in which each node is connected to z neighbours, where z is the coordination number. It can be understood as a tree-like structure starting from a central node, with all the nodes arranged in shells around the central one. The central node is called the root or origin of the lattice. The number of nodes in the kth shell is given by:

$$\begin{aligned} N_k =z(z-1)^{k-1}\quad \hbox {for }k>0. \end{aligned}$$
(5)

In some cases, the definition is modified so that the root node has \(\hbox {z} - 1\) neighbours; the statistical mechanics of lattice models on the Bethe lattice are often exactly solvable. As is well known from Axelrod (1997), social behavior, and culture more specifically, is a coordination game within a social network. Therefore, according to the above formula, for a critical number of coordinated entities (persons at the individual level, or co-existing conflicting cultural attitudes or a given cultural distance at the level of a local society), the tipping point for the percolation of a ‘non-solid’ matter through society can be calculated according to p\(_{\mathrm{c}} = 1/(\hbox {z}-1)\). Beyond this logical similarity, however, it is still unclear what economic reasoning can motivate and plausibly explain the application of such a measurement approach to the percolation of ideas in the social lattice. Put differently, cultural percolation depends on its threshold, and its threshold depends on z, which depends on the particular lattice structure. But the lattice structure, i.e. the connectivity of the social network, is not easily observable. So what is z for the social lattice? How can the threshold of the social lattice be determined when it depends on a structure of connectedness that is not observable in its full extent?
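To make the two Bethe lattice expressions concrete, the short sketch below evaluates the shell size in (5) and the threshold 1/(z − 1) for a few coordination numbers. It is an illustration only: the function names are hypothetical and the values of z are arbitrary examples, not estimates for any social network.

```python
def nodes_in_shell(z, k):
    """Number of nodes in the k-th shell of a Bethe lattice, as in equation (5)."""
    if z < 2 or k < 1:
        raise ValueError("requires z >= 2 and k >= 1")
    return z * (z - 1) ** (k - 1)


def bethe_threshold(z):
    """Percolation threshold 1 / (z - 1) for coordination number z."""
    if z < 2:
        raise ValueError("requires z >= 2")
    return 1 / (z - 1)


if __name__ == "__main__":
    # The coordination numbers below are arbitrary illustrative choices.
    for z in (3, 4, 7):
        shells = [nodes_in_shell(z, k) for k in (1, 2, 3)]
        print(z, shells, bethe_threshold(z))
```

The threshold falls as z grows, which is the sense in which a more densely connected lattice percolates more easily.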

To answer this second question is pivotal to our study. To establish the threshold which drives the limits of choice by determining the percolation through the social network, we draw here on the SDSC literature. A renowned measure of connectedness in society, and an important measure for the transmission of information between people, is the well-known ‘six degrees’ phenomenon, presented below. The six degrees phenomenon was first studied by Traverse and Milgram (1969) with a sample of 296 people from Nebraska. The experimental design required a letter to be sent to a stockbroker in Boston under the rule that the letter be delivered only through first-name-basis acquaintances of the sender. Traverse and Milgram (1969) found that it takes on average six people for a letter to reach the target receiver from the sender. This finding was theoretically proved by Watts and Strogatz (1998). Moreover, Watts (2004) and Dodds et al. (2003) replicated the experiment with a much larger sample of 48,000 people on the internet. The experimental design was roughly the same, with the difference that the target recipients were randomly distributed around the globe. Watts’ (2004) results confirm that the average number of steps of connectedness required for an idea to travel from a sender to a receiver is six connected people. Thus, the main proposition of our study is that the ‘six degrees’ phenomenon represents the connectedness within a social network and characterizes the behaviour of its lattice (see also Watts et al. 2002). From another perspective, the notion of cultural diversity has long been debated as a potential source of positive or negative effects on ideas, innovation and ultimately local productivity (see for example Ottaviano and Peri 2005, 2006; Bellini et al. 2008; Suedekum et al. 2010). Therefore, a change in cultural diversity can be expected to influence the percolation of ideas. Moreover, this effect is most likely to pass through the interaction of local and foreign workers. Thus, we assume that the six degrees (the likelihood of six people in a row being of a different origin than the local cultural origin) is a relevant approximation of the missing information on the unobservable structure of the social network and its threshold for cultural percolation. Put differently, we hypothesize that, since six connected people in a row are the number required for an idea to be successfully transmitted through the social network, successful communication about a new idea proposed for R&D investment may likewise depend on SDSC between people. For the selection of an idea, however, we make the additional assumption that people are not just connected acquaintances, but that they also share a cultural connectedness, i.e. a minimum cultural distance (zero cultural distance, belonging to the same culture), or full cultural proximity (proximity in attitudes and origin amounting to identity). Their connectedness, based on common cultural values, provides a homophily basis for reaching agreement on the final decisions when the idea is passed through the filter of risk attitude evaluation.

Finally, however, we note that connectedness between people of the majority in a locality leads to its closedness and lack of receptiveness to new ideas, while more models and more diversity are associated with better innovation and more efficient solution finding for any problem (see Hong and Page 2001). Therefore, what can be expected to have a positive effect on the R&D selection of the most efficient and value-adding new ideas is the likelihood of SDSC between people of non-local origin. In other words, we transform the six degrees of separation into an innovative measure of diversity, where diversity is defined as the likelihood that six random people in a row are of non-local origin.

Based on the above assumptions, we ultimately suggest that a change in the likelihood of having six people in a row from a diverse cultural origin defines the limit towards which the cultural attitude to new ideas in a locality will tend. In other words, if the limit of risk evaluation tends to 1, the limit of the choice of the most efficient ideas for R&D also tends to 1. Put differently, our cultural percolation of R&D ideas hypothesis embodies the following testable proposition:

H01: The higher the likelihood of having six people in a row coming from a diverse cultural origin, the closer the limit of risk evaluation will tend to one and therefore the choice of most efficient ideas for R&D will have a limit tending to 1 for a higher number of innovative ideas.

The next section provides an illustrative example for an empirical operationalization of this hypothesis.

3 Difference-in-differences operationalization of the cultural percolation of ideas hypothesis

3.1 Database

Our dataset for testing H01 is compiled from data on economic indicators from EUROSTAT and from the European Social Survey (ESS). In particular, for the purpose of our numerical example, we use the ESS data available for the two years 2002 and 2004 to quantify the likelihood of six degrees of cultural diversity in the social network, and we complete the data pool into a panel dataset by taking the remaining economic variables from the EUROSTAT Regional Database. Our dataset is at the NUTS2 level.

From the biennially collected ESS dataset, we use the number of people in a particular NUTS2 region who have indicated that they are immigrants who moved into the area concerned during the last 20 years. Thus, our main explanatory variable of interest, six degrees of separation (or culturally-based social connectedness), is defined as the probability that six random people from the pool of interviewed people in a locality are of an immigrant cultural origin. The exact formula through which we derive the likelihood of six people in a row being of an immigrant cultural origin is:

$$\begin{aligned} \mathrm{likelihood\_six\_in\_a\_row\_M} = \prod _{k=1}^{6} \frac{\mathrm{total\_M} - k}{\mathrm{total} - k}, \end{aligned}$$

where M denotes being of a non-local cultural origin, total_M is the number of observations of type M and total is the overall number of people living in the NUTS2 region for which the likelihood is estimated (see Traverse and Milgram 1969; Granovetter 1973; Watts and Strogatz 1998). As our measure reflects immigrants, a higher likelihood of six degrees of separation indicates a higher likelihood of heterogeneity and cultural diversity. That is why the relationship between six degrees of separation and innovation efficiency is expected to be positive.
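The formula above can be evaluated with a few lines of code. The sketch below is an illustration under stated assumptions rather than the procedure actually used for the estimates: the function name and arguments are hypothetical, the result is set to zero whenever a numerator would be zero or negative (a guard we add for small counts), and the optional `start` argument is our own addition. With `start=1` the product follows the formula in the text; with `start=0` it gives the textbook probability of six successive draws of type M without replacement, shown only for comparison.

```python
def likelihood_six_in_a_row(total_m, total, length=6, start=1):
    """Product over k = start, ..., start + length - 1 of (total_m - k) / (total - k).

    start=1 reproduces the formula in the text; start=0 is the standard
    without-replacement probability (an assumption added for comparison).
    """
    if not 0 <= total_m <= total:
        raise ValueError("requires 0 <= total_m <= total")
    if total - (start + length - 1) <= 0:
        raise ValueError("total is too small for a chain of this length")
    result = 1.0
    for k in range(start, start + length):
        numerator = total_m - k
        if numerator <= 0:
            # Guard (our assumption): too few type-M observations for a chain.
            return 0.0
        result *= numerator / (total - k)
    return result


if __name__ == "__main__":
    # Made-up counts for illustration only, not ESS data.
    print(likelihood_six_in_a_row(total_m=300, total=1000))
    print(likelihood_six_in_a_row(total_m=300, total=1000, start=0))
```

For counts that are large relative to six, the two variants differ only slightly, so the choice of starting index matters mainly for regions with few respondents of type M.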

The main indicators for the outcome variable of interest in our analysis, namely the number of patent applications and R&D expenditure in millions, are measured with data from EUROSTAT. Thus, our main dependent variable is the number of patent applications per million of R&D expenditure. It has been rightly argued in the economics of ideas literature that ideas and innovation are by nature non-rival and non-excludable goods, and the whole concept of providing patents as a compensatory mechanism for innovators is highly contestable (Harabi 1995; Storper 1995). Yet the patent practice has been widely promoted and adopted, in particular throughout the EU, as the ‘rule of the game’, resulting in policy-supported data collection performed consistently in this field by statistical authorities. Moreover, even if the statistical representativeness of the data is hampered by confidential, protectionist, free-riding firm behaviour, the patenting of an innovative idea is still the registration of an ultimate act of cooperative behaviour, in which the innovating firm both tries to take its due economic advantage and allows agglomeration and spillover effects based on its innovation to take place. So, from a regional development perspective, it is indeed the patented innovations that are meaningful, as they are part of the cooperative game and are most likely to quantify properly the likely future effect of innovations on local development.

EUROSTAT is also a source for our control variables: total employment, workers with tertiary education and GDP. The number of agricultural workers approximates the economic structural differences in the localities (see Evans and Harrigan 2003). The motivation for the use of control for human capital (as workers with tertiary educational degree or higher—according to the EUROSTAT classification) stems from the endogenous growth models (see for example Uzawa 1986; Romer 1986; Lucas 1988; Rebelo 1991; Fischer et al. 2009), while GDP per capita is a gross measure of the economic productive capacity of the place and approximates its potential to invest in R&D (see Antonelli 1990; Tödtling 1992).

3.2 Estimation strategy

Our theoretical hypothesis is operationally translated into the existence of three types of conditions over time in the NUTS2 regions: regions with increasing, regions with stable (unchanging) and regions with decreasing diversity in the social networks (namely, in the likelihood of six people in a row being of a different cultural origin). Thus, the six degrees of diversity in the local networks is a proxy that identifies, in a broader sense, the culture in the NUTS2 regions. Next, each region follows an individual path of R&D output efficiency in terms of scientific output (i.e. innovative ideas to be patented per million of R&D investment) in the different periods. Put differently, our study focuses on a specific cultural intervention as a treatment (a change of diversity, measured as a change in the likelihood that six random people drawn from a locality share the same non-local cultural origin) that leads to differences in local R&D productivity over time. In addition, we use human capital per worker and GDP per capita as alternative controls to capture possible omitted variables (see footnote 3). With these data, we apply a difference-in-differences approach, following Villa (2012) (see footnote 4), as exemplified in model (6) below:

$$\begin{aligned} Y_{it}= \beta _0+ \beta _1 X_{it}+ \beta _2 T_{it}+ \beta _3 X_{it}\times T_{it}+\beta _4 Z_{it}+e_{it} \end{aligned}$$
(6)

where \(Y_{it}\) is the R&D output efficiency; \(X_{it}\) indicates whether the observation belongs to the treatment group; \(T_{it}\) identifies whether the observation falls in the earlier period (then the variable is 0) or in the later period (then the dummy is equal to 1); \(Z_{it}\) is a vector of the additional control variables; and \(e_{it}\) stands for the error term. The underlying diversity measure has three possible logical conditions: it either increases, decreases or remains stable between the two time periods, and these three possibilities define the types of treatment and exhaust its domain. \(X_{it}\) is quantified accordingly: when we are, for instance, interested in an increase of diversity as a treatment, \(X_{it}\) is 1 if in the period 2002 to 2004 there is an increase in the likelihood of six people in a row being of foreign cultural origin, and 0 otherwise. The interaction between \(X_{it}\) and \(T_{it}\) captures the treatment effect, reflected in the coefficient \(\beta _3\). The expected effects of the three types of treatment are as follows. When diversity increases, i.e. when the probability of six random people in the population being of foreign origin increases, this treatment is expected to increase the number of patented new ideas in a locality, as more innovative ideas have penetrated more efficiently through the social lattice there. The opposite effect is expected from a decrease of the same likelihood. And no change in the likelihood is expected to be associated with no change in the efficiency of the choice of ideas, i.e. in the number of really innovative ideas percolating through the social lattice in a locality. If one of these three domains of treatment turns out to be statistically significant, our hypothesis cannot be falsified. If more than one domain of treatment shows statistical significance, this would be a sign of a non-linear dependence between treatment and percolation of new ideas.
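The role of \(\beta _3\) in model (6) can be illustrated with a minimal sketch. In the special case without the controls \(Z_{it}\), the OLS estimate of \(\beta _3\) coincides with the difference between the before/after changes in the group means of the treated and the untreated observations, which is what the code below computes. The function name and the numbers are made up for illustration; they are not the data or the estimation routine used in this paper, which follows Villa (2012).

```python
from statistics import mean

# Synthetic (y, x, t) rows, invented purely for illustration:
# y = outcome, x = treatment-group dummy, t = later-period dummy.
EXAMPLE_ROWS = [
    (10, 0, 0), (12, 0, 0),   # untreated, earlier period
    (11, 0, 1), (13, 0, 1),   # untreated, later period
    (9, 1, 0), (11, 1, 0),    # treated, earlier period
    (14, 1, 1), (16, 1, 1),   # treated, later period
]


def did_estimate(rows):
    """Difference-in-differences of cell means for (y, x, t) rows, x and t in {0, 1}."""
    cells = {(x, t): [] for x in (0, 1) for t in (0, 1)}
    for y, x, t in rows:
        cells[(x, t)].append(y)
    if any(not values for values in cells.values()):
        raise ValueError("every (x, t) cell needs at least one observation")
    m = {key: mean(values) for key, values in cells.items()}
    # Change over time among the treated minus change among the untreated.
    return (m[(1, 1)] - m[(1, 0)]) - (m[(0, 1)] - m[(0, 0)])


if __name__ == "__main__":
    print(did_estimate(EXAMPLE_ROWS))
```

Once controls are added, the coefficient has to be obtained from the full regression, so this shortcut is meant only to convey what the interaction term measures.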

Consequently, for our biennial datasets (2002 and 2004), we create a variable ‘path-length’, which expresses the six degrees of cultural diversity in the society by estimating, for 2002 and for 2004, the likelihood that six people in a row from among the interviewed people are not originally from the locality. We take the difference in the path-length of diversity between 2002 and 2004. The resulting change of the six degrees of diversity (i.e. our treatment) falls into three possible categories: increase, stable state and decrease of the likelihood of diversity in the local network. We therefore specify three different types of treatment: X\(_1\) equal to 1 when there is a positive change, and 0 otherwise; X\(_2\) equal to 1 when there is zero change of cultural diversity between 2002 and 2004 in the locality, and 0 otherwise; and X\(_3\) equal to 1 when there is a negative change of the likelihood of diversity in the local social network, and 0 otherwise.
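The construction of the three treatment dummies can be sketched as follows. The function name and the tolerance argument are hypothetical additions: the text defines the stable state as an exactly zero change, which corresponds to the default `tol=0.0`, and a positive tolerance is offered only as an optional way to treat very small changes as stability.

```python
def treatment_dummies(path_2002, path_2004, tol=0.0):
    """Return the dummies X1 (increase), X2 (no change), X3 (decrease).

    tol is an illustrative assumption; tol=0.0 matches the definition in the text.
    """
    change = path_2004 - path_2002
    x1 = int(change > tol)
    x3 = int(change < -tol)
    x2 = int(x1 == 0 and x3 == 0)
    return {"X1": x1, "X2": x2, "X3": x3}


if __name__ == "__main__":
    # Made-up path-length values for three hypothetical regions.
    for before, after in ((0.10, 0.20), (0.20, 0.20), (0.20, 0.10)):
        print(before, after, treatment_dummies(before, after))
```

Exactly one of the three dummies equals 1 for every region, so the three treatments exhaust the possible outcomes.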

Initially, a simple linear OLS regression with robust standard errors is estimated for each of the three possible treatments: increase in diversity, stability (no change in diversity) and decrease in diversity (measured as the likelihood of six people in a row being of foreign origin). Operationalizing model (6), we obtain the interaction of each treatment variable with the time dummy \(T_{it}\). We thus examine the coefficients of the three different interaction terms in order to find out which of the three cultural treatments has an impact on local R&D output efficiency. Next, we examine the reliability of the estimated significant treatment effect either by including observed data (such as the share of human capital or GDP per capita) as additional controls, or by dealing with possible omitted-variable bias by estimating the model with fixed effects for each region. The reliability of the different approaches is also compared through a standard Ramsey test for omitted variables.

3.3 Numerical illustration results

Firstly, we compare the frequency of the three different possible types of cultural treatment occurring over the period 2002 to 2004. Table 1 presents this overview:

Table 1 Six degrees of cultural diversity: three different treatment effects, 2002–2004

The descriptive Table 1 shows that, in our data, the three types of treatment occur with almost equal frequency. This means that almost identical percentages of the EU regions have experienced an increase, a stable state or a decrease of the likelihood of six people in a row being of a different cultural origin. It is therefore interesting to examine whether these treatments, which are almost equally likely to occur, produce different effects on local R&D output efficiency.

All three different types of treatments are examined as three alternative specifications of an operationalization of model (6) in its most reduced form. The results are presented in Table 2 below.

Table 2 Difference-in-differences test for the effect of the three alternative treatments from six degrees of cultural diversity on patents per million investment in R&D

As Table 2 shows, instability of cultural diversity, whether an increase or a decrease of the six degrees of likelihood to encounter cultural diversity in the local network, has a negative but insignificant effect on R&D output efficiency. It is, however, the stability of cultural diversity in a locality that seems to be significantly and positively associated with the increased innovation potential of the locality. The standard statistics of the three models are very close to each other. This result can be interpreted as a sign that diversity instigates innovation in a stable climate without social turbulence. Put differently, if in a given interval of time there is cultural persistence in a locality, this and only this drives the limits of the cultural percolation of new ideas in that locality. The impact of persistent cultural diversity (i.e. of a six degrees of separation measure defined as the likelihood of six random culturally diverse people being connected in a row in the locality) on innovation efficiency (measured by the number of patents) is found to be positive. Thus, this test result fails to reject our working hypothesis and indicates a linear dependence between the limit of the function of choice of ideas from the pool of potential ideas and the limit of the function of local cultural attitude.

Yet, the results from an OLS regression with robust standard errors may still suffer from omitted-variable bias. In an attempt to deal with this problem in estimating our treatment effect of interest, we estimate four alternative specifications, as presented in Table 3 below.

Table 3 Stable six degrees of cultural diversity effect and omitted variables tests

Table 3 presents the following specifications: the first includes the share of agricultural workers and the share of human capital in the locality as controls; the second is a parsimonious version of it, including only the share of human capital as a control; the third introduces GDP per capita as a control variable in the operationalization of model (6); and the last runs model (6) in its fully parsimonious form but in combination with regional fixed effects to absorb any omitted locality-specific characteristics.

The additional variables capturing economic prosperity, such as the share of human capital and GDP per capita, report significant but negative coefficients and t-values. This is in line with the expectation that, when the investment volume can be bigger, risk evaluation becomes more risk-averse, so that less risky and correspondingly less innovative ideas are chosen, which ultimately results in a decrease in (or a negative departure from the optimal) innovative potential in richer localities. Naturally, the standard statistics of the empirical models improve with the inclusion of the control variables. The results of our four specifications are consistent, while an improvement in the Ramsey RESET test for omitted variables is observed from specification 1 to 4. This second set of results confirms our first impressions from the numerical example considered here. Indeed, our empirical findings seem to agree with our hypothesis that the variation in the likelihood of having SDSC of culturally more diverse people in a locality is a statistically significant predictor of the limit of the function of choice for innovative R&D ideas tending to one more often.
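
For readers who wish to replicate the omitted-variables check, the logic of a Ramsey RESET test can be sketched as follows. The sketch assumes the common variant that adds the squared and cubed fitted values to the regression and computes an F statistic for their joint significance; the text does not state which powers were used, so `max_power=3` is an assumption, and the function name `reset_f_stat` is hypothetical. Obtaining a p-value from the F distribution would require an additional library and is left out.

```python
import numpy as np


def _fit(X, y):
    """Return the residual sum of squares and fitted values of an OLS fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    resid = y - fitted
    return float(resid @ resid), fitted


def reset_f_stat(X, y, max_power=3):
    """Ramsey RESET F statistic for the regression of y on X.

    X must already contain a constant column. Powers 2..max_power of the
    fitted values are added and tested jointly. Returns (F, q, df_denom),
    where q is the number of added terms.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape

    rss_restricted, fitted = _fit(X, y)
    powers = np.column_stack([fitted ** p for p in range(2, max_power + 1)])
    rss_unrestricted, _ = _fit(np.column_stack([X, powers]), y)

    q = powers.shape[1]
    df_denom = n - k - q
    f_stat = ((rss_restricted - rss_unrestricted) / q) / (rss_unrestricted / df_denom)
    return f_stat, q, df_denom
```

A large F statistic suggests that powers of the fitted values still explain the outcome, which is read as a sign of misspecification or omitted variables. Note that in a regression containing only the two dummies and their interaction the fitted values take just four distinct values, so the added powers are collinear with the regressors and the test is only informative once continuous controls are included.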

These results, though potentially intriguing, have to be considered in light of certain limitations of the ESS data used. The ESS data need to be double weighted to approximate the real likelihood of an individual being chosen and to adjust for country-specific population sampling. We avoid weighting the data, however, and refer to the raw number of observations, as the weighting procedures themselves do not add much to the reliability of the results. The numerical illustration presented here serves as a template for an empirical application of the model of interest and aims only to aid the replication of the model with a more representative sample. We therefore suggest that the analytical part of the paper be understood as an example of a potential operationalization of a mixed approach between difference-in-differences and six degrees of separation in a diversity context. This original methodological solution may be worth replicating with bigger representative datasets at the regional level.

4 Conclusion

The key message from our difference-in-differences test is that the creation of new knowledge in a locality seems to be positively determined by a stable likelihood of six people who are carriers of cultural diversity being connected in a row in the local network. This result confirms the main cultural premise along three lines of argument: (i) culture (measured with an innovative mixed measure combining diversity and six degrees of separation between people in a social network context) is a critical factor for the limits of the function of choice of R&D ideas determining the innovative process in a locality; (ii) taken together, our results seem to agree with our proposition that the six degrees of separation paradigm approximates the tipping point for the unobservable structure of the whole social network in a locality, i.e. a strong connection between the tipping point of the functioning of the social network communication and the local value system of this locality seems indeed to exist; this conclusion stems from the fact that the tipping point for the percolation of new ideas seems to be determined positively and statistically significantly when the cultural diversity in the locality is used to approximate the six degrees of separation driven tipping point for the percolation of new R&D-investment-ideas; (iii) our results add a caveat to the endogenous growth discourse by showing that, depending on the local cultural values, an objectively important output (such as R&D ideas) may be different, even if the economic investment and other input factor differences (such as the share of human capital) are controlled for.

In summary, given all relevant statistical constraints in our approach, and taking into account the data limitations, we observe that the results from our numerical illustration confirm the main hypothesis of our inquiry, namely that culture has an impact on innovation. Yet, it seems to be stability, rather than an increase or decrease of diversity in the local networks, that stimulates R&D output in the present case.

Above all, the data limitations do not allow for generalized conclusions. However, the current study presents an original conceptual blending and offers an empirical operationalization for exploring the intersection of three streams of literature (innovation, diversity and six degrees of separation) as a determinant of the selection of new ideas in decision-making processes related to investment in R&D.