3.1 Theoretical Background and Types of Convergence

The convergence process is generally regarded as an implication of neoclassical growth theory (Solow, 1956). The assumption of diminishing returns to reproducible capital leads to convergence across countries and regions. Units with relatively low initial capital-to-labour ratios receive technology transfers and capital flows from those with higher ratios. As a consequence, income levels converge across countries and regions.

Baumol (1986) and Barro (1991) define convergence as a catching-up process in the time series of output differences: the deviation between two countries tends to narrow over time. If \( y_{i,t} > y_{j,t} \), then

$$ E\left({y}_{i,t+T}-{y}_{j,t+T}|{\mathrm{I}}_t\right)<{y}_{i,t}-{y}_{j,t} $$
(3.1)

for some T, where \( \mathrm{I}_t \) denotes the information set as of time t.

There are two key concepts of convergence. The first assumes that units starting from a high level of income exhibit lower income growth than units starting from a low level. Since this process is measured by a regression coefficient, it is named β-convergence (Barro & Sala-i-Martin, 1992). The second assumes decreasing dispersion of income across units. Since the dispersion is measured by the standard deviation or the coefficient of variation, it is called σ-convergence.

The existence of σ-convergence means that the dispersion of the cross-sectional distribution of income decreases over time. The faster growth of poorer economies, which β-convergence implies, shows up as a negative coefficient in the regression of the income growth rate on its initial level, known as the growth-initial level regression. Quah (1993) and Friedman (1994) argued that a negative β coefficient in the growth-initial level regression does not necessarily imply a reduction in dispersion. The existence of β-convergence is a necessary, but not sufficient, condition for σ-convergence. This is because random shocks hit economies during the convergence process and can offset the narrowing of income gaps.
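The gap between the two concepts can be illustrated with a small numerical sketch. The log incomes below are hypothetical and chosen so that the changes are large relative to the initial gaps, as a sequence of large shocks might produce; the function names are illustrative and not taken from the cited studies.

```python
from statistics import mean, pstdev


def beta_coefficient(log_y_start, log_y_end):
    """OLS slope of income growth on the initial log income level."""
    growth = [end - start for start, end in zip(log_y_start, log_y_end)]
    x_bar, g_bar = mean(log_y_start), mean(growth)
    cov = sum((x - x_bar) * (g - g_bar) for x, g in zip(log_y_start, growth))
    var = sum((x - x_bar) ** 2 for x in log_y_start)
    return cov / var


def sigma(log_y):
    """Cross-sectional dispersion: population standard deviation of log income."""
    return pstdev(log_y)


# Hypothetical log incomes of three economies at the start and end of a period.
log_y_start = [1.0, 2.0, 3.0]
log_y_end = [3.5, 2.5, 1.0]  # the initially poorest economy overtakes the others

beta = beta_coefficient(log_y_start, log_y_end)
sigma_start, sigma_end = sigma(log_y_start), sigma(log_y_end)
print(f"beta = {beta:.3f}")
print(f"sigma: {sigma_start:.3f} -> {sigma_end:.3f}")
```

In this constructed case the slope is negative, so poorer economies grow faster, yet the dispersion of log income rises. It is only an illustration of why a negative β does not guarantee σ-convergence, not a reproduction of the arguments in Quah (1993) or Friedman (1994).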

Studies on convergence were initially based on cross-sectional data. In that case, the growth regression includes the initial level of income and the income growth rate between the first and last period. Verification of this kind does not confirm the existence of convergence understood as a process. Considering only the first and last period of the time interval, while ignoring the intermediate periods, can lead to erroneous conclusions. To avoid this problem, panel data regression is used. Another advantage of using panel data is that it corrects for the omitted variables problem present in cross-sectional studies (Stock & Watson, 2011).

Another important issue is the distinction between unconditional and conditional convergence. Unconditional convergence assumes that the characteristics of economies that affect steady-state income levels are the same for all units. As a result, the growth-initial level regression does not include explanatory variables other than the initial level of income. The growth equation in a general form is as follows:

$$ \ln \left(\frac{y_{it}}{y_{i,t-1}}\right)=\alpha +\beta \ln \left({y}_{i,t-1}\right)+{u}_{it}, $$
(3.2)

where \( y_{it} \) is the income of the ith economy at time t; \( u_{it} \) has mean zero and finite variance \( \sigma_u^2 \) and is independent over t and i; and α and β are parameters.
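As a minimal sketch of how Eq. (3.2) can be estimated on pooled panel data, the example below generates noise-free income paths from assumed parameter values and recovers them by ordinary least squares. The parameter values, the economy labels, and the helper names are hypothetical; real data would include the error term and call for proper panel estimators.

```python
import math


def simulate_income(log_y0, alpha, beta, periods):
    """Noise-free path implied by Eq. (3.2): ln y_t = alpha + (1 + beta) * ln y_{t-1}."""
    log_path = [log_y0]
    for _ in range(periods):
        log_path.append(alpha + (1.0 + beta) * log_path[-1])
    return [math.exp(value) for value in log_path]


def pooled_growth_regression(panel):
    """Pooled OLS of ln(y_t / y_{t-1}) on ln(y_{t-1}); returns (alpha_hat, beta_hat)."""
    xs, gs = [], []
    for incomes in panel.values():
        for prev, curr in zip(incomes, incomes[1:]):
            xs.append(math.log(prev))
            gs.append(math.log(curr / prev))
    x_bar, g_bar = sum(xs) / len(xs), sum(gs) / len(gs)
    cov = sum((x - x_bar) * (g - g_bar) for x, g in zip(xs, gs))
    var = sum((x - x_bar) ** 2 for x in xs)
    beta_hat = cov / var
    return g_bar - beta_hat * x_bar, beta_hat


# Hypothetical panel: one poor and one rich economy sharing the same parameters.
panel = {
    "poor": simulate_income(1.0, alpha=0.2, beta=-0.1, periods=5),
    "rich": simulate_income(3.0, alpha=0.2, beta=-0.1, periods=5),
}
alpha_hat, beta_hat = pooled_growth_regression(panel)
steady_state_log_income = -alpha_hat / beta_hat  # growth is zero where ln y = -alpha / beta
print(alpha_hat, beta_hat, steady_state_log_income)
```

With a negative β, both simulated economies move towards the same log income level, −α/β, which is the level at which Eq. (3.2) implies zero expected growth.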

Otherwise, if country-specific characteristics cause differences in steady-state income levels, these factors should be controlled for in the regression. The regression augmented with variables such as the rates of physical and human capital accumulation and population growth has the following form:

$$ \ln \left(\frac{y_{it}}{y_{i,t-1}}\right)=\alpha +\beta \ln \left({y}_{i,t-1}\right)+\gamma {Z}_{it}+{u}_{it}, $$
(3.3)

where \( Z_{it} \) is a vector of variables affecting the growth rate and γ represents a vector of parameters. A negative sign of β in the augmented regression indicates conditional convergence.
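The role of the control variables in Eq. (3.3) can be sketched with a single hypothetical control (for instance an investment rate). The data are generated without noise from assumed values of α, β and γ, and the small least-squares solver is written out only to keep the example self-contained; none of this is taken from the cited studies.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(rows, y):
    """OLS via the normal equations; each row lists the regressors of one observation."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical observations: lagged log income and one control variable Z.
log_y_lag = [1.0, 2.0, 3.0, 1.5, 2.5]
z = [0.10, 0.30, 0.20, 0.25, 0.15]
growth = [0.1 - 0.05 * x + 0.4 * c for x, c in zip(log_y_lag, z)]  # assumed parameters

alpha_c, beta_cond, gamma_hat = ols([[1.0, x, c] for x, c in zip(log_y_lag, z)], growth)
alpha_u, beta_uncond = ols([[1.0, x] for x in log_y_lag], growth)
print(beta_cond, beta_uncond)
```

Because the hypothetical control is correlated with initial income, the coefficient on initial income changes when Z is omitted, which is the reason for controlling for steady-state determinants before interpreting β.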

Conditional convergence, which assumes a different steady state for each economy, is to some extent similar to club convergence. Club convergence likewise does not assume a single equilibrium level for all economies, but allows for multiple equilibria for groups of countries. The existence of different equilibria results from groups of countries sharing the same initial position or other attributes. The development of the concept of club convergence can be attributed to Baumol (1986), but its continued formal extension is mainly an outcome of the efforts of Durlauf and Johnson (1995). According to Galor (1996), the two concepts differ as follows. Under conditional convergence, regions or countries that are similar in their structural characteristics converge to one another regardless of their initial conditions. Under club convergence, countries or regions converge only if their initial conditions, as well as their structural characteristics, are similar. Despite the conceptual differences, it is not easy to separate club convergence from conditional convergence empirically (Islam, 2003).

Phillips and Sul (2007, 2009) proposed a procedure for identifying convergence clubs. For this purpose, they use a time-varying factor model which takes into account individual and transitional heterogeneity. Due to its comprehensive capabilities, this methodology is the most popular tool applied to the analysis of club convergence patterns. An extension of this procedure is the ex-post analysis of the factors influencing club formation (Bartkowska & Riedl, 2012).

Another concept of convergence follows from the definition proposed by Bernard and Durlauf (1995) and focuses on the long-run behaviour of output differences across countries. They define convergence in output between countries i and j as equality of the long-term forecasts of output for both countries at a fixed time t:

$$ \underset{k\to \infty }{\lim }E\left({y}_{i,t+k}-{y}_{j,t+k}|{\mathrm{I}}_t\right)=0, $$
(3.4)

where \( \mathrm{I}_t \) denotes the information set available at time t. This means that the long-term forecasts of output for the two countries are equal at a fixed time. The definition implies no stochastic trends in the time series of output differences between countries. In the general case, the data are organized in panel form. The subject of analysis is then the set of differences between the output of a reference country and that of each other country in the panel. Alternatively, the average output of all countries may be treated as the reference (Carlino & Mills, 1993). In the presence of stochastic convergence these differences should follow a zero-mean stationary process, which means that output in all countries tends to evolve along similar equilibrium paths.
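As a rough sketch of the data preparation described above, the example below computes the deviations of log output from the cross-country average and a simple first-order autoregressive coefficient for one deviation series. The three series are hypothetical, and the autoregressive coefficient is only a descriptive measure of persistence; it does not replace a formal unit root test.

```python
from statistics import mean


def deviations_from_average(log_output):
    """Deviation of each country's log output from the cross-country average per period."""
    n_periods = len(next(iter(log_output.values())))
    average = [mean(series[t] for series in log_output.values()) for t in range(n_periods)]
    return {
        country: [series[t] - average[t] for t in range(n_periods)]
        for country, series in log_output.items()
    }


def ar1_coefficient(series):
    """OLS slope of d_t on d_{t-1} without intercept (zero-mean case)."""
    numerator = sum(curr * prev for curr, prev in zip(series[1:], series[:-1]))
    denominator = sum(prev * prev for prev in series[:-1])
    return numerator / denominator


# Hypothetical panel: a common trend plus gaps that halve every period.
periods = range(8)
common = [1.0 + 0.02 * t for t in periods]
log_output = {
    "A": [c + 0.4 * 0.5 ** t for t, c in zip(periods, common)],
    "B": [c - 0.4 * 0.5 ** t for t, c in zip(periods, common)],
    "C": list(common),
}

dev = deviations_from_average(log_output)
rho_a = ar1_coefficient(dev["A"])
print(f"AR(1) coefficient of country A's deviation: {rho_a:.3f}")
```

A coefficient well below one, as in this constructed case, is what mean-reverting output differences would look like; with real data the inference would rest on the unit root tests discussed in this section.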

Based on stationarity tests, Li and Papell (1999) distinguish between stochastic and deterministic convergence. Trend stationarity of output differences is treated as a weak notion of convergence, i.e. stochastic convergence. In that case, a linear trend in the deterministic component of the time series means permanent differences in output levels across countries. In contrast to this weak notion, Li and Papell (1999) propose a strong definition, called deterministic convergence. In this case the time series of output differences is mean stationary: it contains neither deterministic nor stochastic trends, and the differences in output are constant over the long run. Deterministic convergence implies stochastic convergence, but not vice versa.

Because univariate unit root tests of the ADF type have low power in testing for stochastic convergence, panel unit root tests are applied (Bernard & Jones, 1996; Evans & Karras, 1996; Salmerón & Romero-Ávila, 2015). There are two groups of these tests. The first assumes cross-sectional independence between units, and the second allows for cross-sectional dependence.

The initial contributions to the convergence literature were made by papers examining β-convergence. Empirical results of these studies are reported in Table 3.1. The starting point for convergence analyses was the influential study by Baumol (1986). His findings are unambiguous: for the sample of 16 OECD countries, absolute convergence is present, while for the whole group of 72 countries, there is no absolute convergence. Convergence studies for large groups of countries in most cases show a lack of absolute convergence (Barro, 1991; Durlauf & Johnson, 1995; Mankiw et al., 1992). In turn, for small groups of countries or for regions, such as US states, Japanese prefectures, and NUTS regions, convergence is present (Barro & Sala-i-Martin, 1992; Mankiw et al., 1992; Paci, 1997; Sala-i-Martin, 1996).

Table 3.1 Selected studies on income β-convergence

Studies on β-convergence using panel data generally confirm absolute and conditional convergence at both national and regional levels (Esposti & Bussoletti, 2008; Islam, 1995; Lee et al., 1997; Maynou et al., 2016; Próchniak & Witkowski, 2013; Young et al., 2008). The exception is Fingleton’s (1999) study, which indicates neither absolute nor conditional convergence for NUTS regions.

The results of σ-convergence studies are summarized in Table 3.2. Since β-convergence is a necessary, though not sufficient, condition for σ-convergence, the two are often found together: the studies which find evidence for β-convergence also reveal evidence of σ-convergence (Barro & Sala-i-Martin, 1992; Baumol, 1986; Lee et al., 1997; Sala-i-Martin, 1996). In turn, De Long (1988) and Paci (1997) confirm the absence of both σ- and β-convergence.

Table 3.2 Selected studies on income σ-convergence

Table 3.3 presents the results of selected studies on income club convergence. Starting with Baumol and Wolff (1988), who provide strong evidence of absolute convergence in the upper income club of countries and weaker evidence of divergence among the lower income countries, several researchers have investigated income club convergence at the national and regional levels. The pioneering work of Durlauf and Johnson (1995) reports completely different results from those of Baumol and Wolff (1988). Interestingly, Papalia and Bertarelli (2013) reveal the existence of conditional β-convergence with a non-monotonic pattern in a sample similar to the one used by Durlauf and Johnson (1995). Another important contribution is made by Baumont et al. (2003), who explicitly consider spatial regimes, interpreted as spatial convergence clubs. Recently, some empirical studies on club convergence (e.g. Bartkowska & Riedl, 2012; Cavallaro & Villani, 2021; Lyncker & Thoennessen, 2017) have applied the nonlinear time-varying factor model proposed by Phillips and Sul (2007, 2009). As mentioned previously, this methodology is believed by some authors to be the most appropriate for detecting convergence clusters (Lyncker & Thoennessen, 2017).

Table 3.3 Selected studies on income club convergence

3.2 Role of TFP in Measuring Technological Convergence

The concept of Total Factor Productivity (TFP) is rooted in the economic growth literature. Its conceptual framework is based on the dichotomy between technology and capital formation. In other words, the main question for growth economists is how much of output growth should be assigned to changes in technology and how much to capital formation (Hulten, 2000). The seminal papers by Tinbergen (1942) and Solow (1957) made the first attempts to tie the aggregate production function to TFP. According to the growth accounting methodology suggested by Tinbergen (1942) and elegantly formalized by Solow (1957), TFP can be defined and measured using a static or a dynamic approach. The former regards the TFP level as an indicator of ‘the state of technology’ (Chiang & Wainwright, 2005). Lipsey and Carlaw (2004) define technology, measured by TFP, as technological knowledge that consists of the body of knowledge about product technologies, process technologies, and organizational technologies. All these technologies create economic value (product).

To define Total Factor Productivity, one can use the Cobb-Douglas production function with two inputs (see Footnote 1):

$$ Q=A{K}^{\alpha }{L}^{\beta }, $$
(3.5)

where Q is output, K is capital, L is labour, A is technology, α is the partial elasticity of output with respect to K, and β is the partial elasticity of output with respect to L. To calculate the TFP level, both sides of the production function are divided by \( K^{\alpha} L^{\beta} \). This results in:

$$ \mathrm{TFP}=\frac{Q}{K^{\alpha }{L}^{\beta }}=A. $$
(3.6)
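Eq. (3.6) translates directly into a one-line calculation. The sketch below uses hypothetical input quantities and assumed elasticities (α = 0.3, β = 0.7); it builds output from a known technology level and checks that the formula returns that level.

```python
def tfp_level(q, k, l, alpha, beta):
    """TFP level from Eq. (3.6): output divided by the Cobb-Douglas input aggregate."""
    return q / (k ** alpha * l ** beta)


# Hypothetical economy with a known technology level A = 2.0.
alpha, beta = 0.3, 0.7          # assumed output elasticities
capital, labour = 100.0, 50.0   # hypothetical input quantities
output = 2.0 * capital ** alpha * labour ** beta  # Eq. (3.5)

tfp = tfp_level(output, capital, labour, alpha, beta)
print(f"TFP = {tfp:.4f}")
```

In applied work the elasticities are not known and must be estimated or calibrated, so the computed level depends on those choices.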

Within the dynamic perspective, TFP growth is linked to technical progress (Atella & Quintieri, 2001; Barro, 1999; Young, 1992). Under the assumptions of long-run equilibrium with perfect competition (i.e. each input factor is paid its marginal product) and constant returns to scale, the Solow residual, i.e. the portion of output growth that is not explained by the growth of labour and capital, is:

$$ \frac{\dot{\mathrm{TFP}}}{\mathrm{TFP}}=\frac{\dot{Q}}{Q}-{s}_K\frac{\dot{K}}{K}-{s}_L\frac{\dot{L}}{L}, $$
(3.7)

where \( s_K \) is the share of capital in the value of total output and \( s_L \) is the share of labour in the value of total output. This expression indicates that the Solow residual can be computed directly from prices and quantities. As such, it is an index number in the form of the growth rate of the Divisia index (Hulten, 1973). The main limitation of the Solow approach is that the calculation relies on the restrictive assumptions of perfect competition and constant returns to scale, which do not correspond to real-world economies (Roeger, 1995).
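A discrete-time counterpart of Eq. (3.7) can be sketched by replacing the continuous growth rates with log differences between two periods. This approximation, the constant factor shares, and all numbers below are assumptions made for illustration.

```python
import math


def solow_residual(q0, q1, k0, k1, l0, l1, share_k, share_l):
    """Log-difference approximation of Eq. (3.7) between two periods."""
    growth_q = math.log(q1 / q0)
    growth_k = math.log(k1 / k0)
    growth_l = math.log(l1 / l0)
    return growth_q - share_k * growth_k - share_l * growth_l


def cobb_douglas(a, k, l, alpha=0.3, beta=0.7):
    """Hypothetical Cobb-Douglas technology used to generate the output data."""
    return a * k ** alpha * l ** beta


# Two hypothetical periods: technology rises by 2 %, capital by 5 %, labour by 2 %.
q0 = cobb_douglas(1.00, 100.0, 50.0)
q1 = cobb_douglas(1.02, 105.0, 51.0)

residual = solow_residual(q0, q1, 100.0, 105.0, 50.0, 51.0, share_k=0.3, share_l=0.7)
print(f"Solow residual = {residual:.5f}")
```

Because the factor shares are set equal to the output elasticities, which is what perfect competition and constant returns to scale imply, the residual equals the log change in the technology term.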

The focus on TFP levels rather than rates of change is especially important in growth models where technology is the main source of growth and convergence (Benhabib & Spiegel, 2005; Hulten, 2000). From a measurement perspective, the challenge is how to estimate the TFP level of a particular country or region. Within a deterministic approach related to the Solow growth theory, Klenow and Rodriguez-Clare (1997) and Hall and Jones (1999) introduced a methodology, so-called development accounting, that produces estimates of TFP levels. Among alternative parametric methods, Schatzer et al. (2019) set out the advantages of the panel regression approach (i.e. the fixed-effects approach and the fixed-effects approach with a time trend) for obtaining estimates of A. In contrast to the cross-section approach and the pooled panel approach, this methodology makes it possible to estimate TFP levels directly without employing the accounting framework.

Recent trends in the empirical analysis of TFP reveal growing attention to the non-parametric approach first proposed by Jorgenson and Nishimizu (1978). Their study paved the way for the Malmquist index, which makes it possible to compare relative TFP levels (Caves et al., 1982). For two countries (regions), A and B, with production functions \( Q_A = F(K_A, L_A) \) and \( Q_B = G(K_B, L_B) \), respectively, the Malmquist index is the geometric mean of two ratios. The first measures the difference in productivity between technology A and technology B at A's input levels, i.e. \( F(K_A, L_A)/G(K_A, L_A) \). The second measures the same difference at B's input levels, i.e. \( F(K_B, L_B)/G(K_B, L_B) \). Moreover, Färe et al. (1994) apply data envelopment analysis (DEA) to set up a widely used distance function approach to the calculation and decomposition of the Malmquist productivity index.
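The geometric-mean construction described above can be sketched as follows. The two production functions are hypothetical Cobb-Douglas forms chosen for illustration; in practice F and G are unknown and the index is obtained from distance functions, for example by DEA.

```python
import math


def malmquist_index(f, g, inputs_a, inputs_b):
    """Geometric mean of F/G evaluated at A's inputs and at B's inputs."""
    ratio_at_a = f(*inputs_a) / g(*inputs_a)
    ratio_at_b = f(*inputs_b) / g(*inputs_b)
    return math.sqrt(ratio_at_a * ratio_at_b)


def technology_f(k, l):
    """Hypothetical technology of country A."""
    return 2.0 * k ** 0.3 * l ** 0.7


def technology_g(k, l):
    """Hypothetical technology of country B, with different elasticities."""
    return 1.0 * k ** 0.4 * l ** 0.6


def technology_g_same_shape(k, l):
    """Hypothetical technology differing from F only in the level term."""
    return 1.0 * k ** 0.3 * l ** 0.7


inputs_a = (100.0, 50.0)  # (K_A, L_A), hypothetical
inputs_b = (80.0, 20.0)   # (K_B, L_B), hypothetical

index = malmquist_index(technology_f, technology_g, inputs_a, inputs_b)
index_same_shape = malmquist_index(technology_f, technology_g_same_shape, inputs_a, inputs_b)
print(index, index_same_shape)
```

When the two technologies differ only in the level term, both ratios coincide and the index reduces to the ratio of the technology levels. When the elasticities differ, the ratio depends on the input mix, which is why the two evaluation points are averaged.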

Although the Malmquist index is frequently used in the productivity literature, different aggregator functions can be employed to calculate a TFP index, i.e. the ratio of an aggregate output, Q(q), to an aggregate input, X(x):

$$ \mathrm{TFP}=\frac{Q(q)}{X(x)}. $$
(3.8)

Apart from the Malmquist index, TFP indexes include the Laspeyres, Paasche, Fisher, Lowe, Hicks-Moorsteen, Törnqvist, and Färe-Primont indexes. It should be noted that the applicability of TFP indexes depends on whether they satisfy certain economic axioms and tests. As noted by O’Donnell (2011), the Laspeyres, Paasche, Fisher, Malmquist, Hicks-Moorsteen, and Törnqvist indexes can only be used for comparisons involving two observations or two time periods, because they fail the transitivity test. On the other hand, the Färe-Primont index allows reliable comparisons involving many objects and time periods. Moreover, it meets all other economically relevant requirements of index number theory. Contrary to the TFP indexes implying specific production functions (e.g. the quadratic functional form underlying the Fisher index and the translog functional form underlying the Törnqvist index), the DEA estimation of the Färe-Primont index does not require specification of locally linear production frontiers.

In line with Islam’s (2003) argumentation, the TFP level is the closest measure of technology and can be used to study technological convergence (see Footnote 2). The concept of technological convergence is anchored in the research on income convergence (Hall & Jones, 1999). Conceptually, income convergence may be explained by technological catch-up, in which initial TFP differences diminish over time. According to the ‘catch-up’ hypothesis (Wolf, 1991), convergence in total factor productivity levels results from the fact that countries with a lower level of technology than the leading countries should experience a more rapid rate of growth in technology. The further a country (region) is from the technology frontier, the greater the potential benefits of catching up.

One of the most frequently provided explanations of the catching-up process is the powerful economic concept of the ‘advantage of backwardness’, introduced by Gerschenkron (1962) and broadly explored in the contemporary literature on technology diffusion (Stephan et al., 2019; Vu & Asongu, 2020). According to this concept, technical knowledge flows from the technology leaders to the more backward economies. As such, technology is regarded as a quasi-public good that can freely pass international and regional boundaries. Formally, the process of technological catch-up is concisely described in the model of Nelson and Phelps (1966). Rogers (2004) and Vu and Asongu (2020) refer to this model and show that the growth of a country’s technology can be described as follows:

$$ \frac{\dot{A}(t)}{A(t)}=\varphi (.)\left[\frac{T-A(t)}{A(t)}\right]=\varphi (.)\left[\frac{T}{A(t)}-1\right], $$
(3.9)

where T refers to the world practice technology and φ(.) represents a function describing the country’s absorptive capability. The former, relative to A(t), determines the technology gap. According to the above equation, the greater the gap [T/A(t) − 1] is, the higher the growth rate of technology the country experiences, holding φ(.) constant. In other words, the potential to reduce the country’s technology gap is higher when the country is located further from the technology frontier. An increase in the world stock of available technology may clearly affect A positively. If T grows at a constant exogenous rate g, the growth of A has to be equal to g in the long run (Fig. 3.1).

Fig. 3.1 Technology gap model (Rogers, 2004, p. 579). [Line graph of the growth rate of A against A/T, with a horizontal dashed line at g and an arc touching it.]
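The long-run behaviour of the technology gap model can be sketched with a simple numerical simulation of Eq. (3.9). The sketch treats the absorptive capability φ(.) as a constant, uses an Euler step, and relies on hypothetical parameter values; all of these are assumptions made for illustration. Under them, setting the growth rate of A equal to g in Eq. (3.9) gives a long-run relative technology level of A/T = φ/(φ + g).

```python
def simulate_technology_gap(a0, t0, phi, g, years, dt=0.01):
    """Euler simulation of Eq. (3.9) with a frontier T growing at the constant rate g."""
    a_level, frontier = a0, t0
    relative_level, growth_rate = [], []
    for _ in range(round(years / dt)):
        rate = phi * (frontier / a_level - 1.0)  # right-hand side of Eq. (3.9)
        relative_level.append(a_level / frontier)
        growth_rate.append(rate)
        a_level += dt * rate * a_level           # update the country's technology
        frontier *= 1.0 + g * dt                 # update the world practice technology
    return relative_level, growth_rate


# Hypothetical parameters: constant absorptive capability and frontier growth.
phi, g = 0.10, 0.02
relative_level, growth_rate = simulate_technology_gap(a0=0.2, t0=1.0, phi=phi, g=g, years=200)

print(f"initial growth rate of A: {growth_rate[0]:.3f}")
print(f"final growth rate of A:   {growth_rate[-1]:.4f}")
print(f"final A/T:                {relative_level[-1]:.4f}")
```

The simulated country starts far from the frontier and grows quickly. Its growth rate then falls towards g while A/T settles at a level below one, so in this sketch the country tracks the frontier's growth rate without closing the gap completely.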

As regards the absorptive capability φ(.), Dahlman and Nelson (1995, p. 88) define it as ‘the ability to learn and implement the technologies and associated practices of already developed countries’. In line with Rogers’ (2004) argumentation, the absorptive capability is a multifaceted concept and refers to the availability of external technology, the ability to learn, and the economic, social, and political incentives to adopt new technologies.

Early empirical work on technological (TFP) convergence was pioneered by Jorgenson and Nishimizu (1978), who compared the relative TFP levels of the aggregate economies of Japan and the USA, applying a variant of the translog method of estimating TFP differences. Their results show that the technological gap between these countries was reduced significantly during the period 1952–1974. This research was extended by Christensen et al. (1981), who found that US technology was still ahead of that of Japan. The time series growth accounting approach to testing TFP convergence has been adopted by other researchers, including Wolf (1991), Dollar and Wolff (1994), and Dougherty and Jorgenson (1997), but, as stated by Islam (2003), it has some limitations linked mainly to the availability of detailed time series data, especially for large samples.

Apart from the time series growth accounting approach to studying TFP convergence, there are many studies employing specific datasets and various methodologies to capture convergence. Methodological differences in TFP convergence testing relate both to the TFP calculation method and to the convergence type and its verification procedures. Table 3.4 contains selected results of studies on TFP convergence at the regional and country level. As one would expect given the complex nature of convergence processes and the methodological issues involved, these empirical studies produce mixed results, which are strongly affected by the approaches applied and the periods analysed. On the other hand, it is worth noting that some recent studies on TFP convergence apply a multifaceted approach. For example, Rath and Akram (2019) study TFP convergence using the notions of stochastic conditional convergence, σ-convergence and club convergence. Their results suggest that even in a situation of absolute convergence, there may be different transition paths to the steady state.

Table 3.4 Selected studies on TFP convergence

3.3 Innovation as a Source of Technological Convergence

The link between innovation and TFP has theoretical foundations in the literature on endogenous growth theory. Contrary to the neoclassical approach pioneered by Solow (1957), endogenous growth theory tries to find ways to endogenize technological change, which is the main driver of long-run economic growth. As such, endogenization of technological progress means that it is explained by the model rather than being regarded as completely exogenous (Jones, 2005). There are a few channels to leverage technological knowledge stock, chief of which are R&D activities generating innovations. It should be noted that R&D-based models of endogenous growth can be generally classified into two generations (Ang & Madsen, 2011).

The first generation of R&D-based endogenous growth models (e.g. Aghion & Howitt, 1992; Grossman & Helpman, 1991; Romer, 1990; Segerstrom et al., 1990) shows that TFP growth is positively affected by R&D levels. The key differential equation of these models is:

$$ \dot{A}={H}_A{A}^{\varnothing }, $$
(3.10)

where HA is human capital allocated to research.

In line with the models’ supposition of ∅ = 1, the amount of new knowledge (\( \dot{A} \)) is an increasing function of the existing stock of knowledge (A). One of the main reasons for this may be the so-called standing on shoulders effect, according to which past inventions make present discoveries easier and more effective. This can be derived from the main assumption of the ‘acceleration school’ (Machlup, 1962) that each new invention increases the set of possible combinations with huge numbers of current ideas, making further inventions easier. For instance, in the representative R&D-based growth model by Romer (1990), investments in research and development made by profit-maximizing agents increase the variety of intermediate goods, which results in productivity growth. In turn, Aghion and Howitt (1992) introduced the quality ladder model, where vertical innovations in the form of new intermediate goods, which make it possible to produce the final output more efficiently than before, constitute the underlying source of productivity growth.

The second generation of R&D-based endogenous growth models eliminates the Romer-like knife-edge postulation of scale effects. Within this group of models, there are semi-endogenous growth models and fully endogenous ‘Schumpeterian’ models (Ha & Howitt, 2007). The former include, inter alia, the models of Jones (1995a), Kortum (1997), and Segerstrom (1998). These models incorporate diminishing returns to the stock of R&D knowledge (∅ < 1) into the ‘R&D-based’ branch of endogenous growth theory. The decrease in R&D returns may result from the so-called fishing out effect, which means that the most evident inventions are made first and it is thus more difficult to discover the following ones. The fishing out effect is anchored in the ‘retardation school’ (Machlup, 1962), which in its extreme form assumes that the more inventions have been made, the less there remains to be invented. The existence of diminishing productivity of R&D activity is highlighted by the fully endogenous ‘Schumpeterian’ models of Aghion and Howitt (1998), Dinopoulos and Thompson (1998), Peretto (1998), Howitt (1999), and Peretto and Smulders (2002). This is essentially because of the proliferation of R&D effects embodied in intermediate good varieties within many sectors of the economy, which leads to lower productivity of R&D efforts focused on quality improvement. Recently, Peretto (2018) has sought to introduce a new endogenous growth model, in which the explosive nature of vertical innovations is balanced by the expansion of horizontal innovations.
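The difference between the two assumptions about the exponent in Eq. (3.10) can be sketched numerically. Dividing Eq. (3.10) by A gives the growth rate of knowledge as \( H_A A^{\varnothing - 1} \). The research input and the knowledge stocks below are hypothetical, and 0.5 is used only as an illustrative exponent below one.

```python
def knowledge_growth_rate(h_a, a, exponent):
    """Growth rate of knowledge implied by Eq. (3.10): (dA/dt) / A = H_A * A**(exponent - 1)."""
    return h_a * a ** (exponent - 1.0)


h_a = 0.05                    # hypothetical human capital allocated to research
stocks = [1.0, 10.0, 100.0]   # hypothetical knowledge stocks A

rates_first_generation = [knowledge_growth_rate(h_a, a, 1.0) for a in stocks]
rates_diminishing = [knowledge_growth_rate(h_a, a, 0.5) for a in stocks]

print(rates_first_generation)
print(rates_diminishing)
```

With an exponent of one, the growth rate stays equal to the research input whatever the stock of knowledge, which is the scale effect of the first-generation models. With an exponent below one, the same research input yields a falling growth rate as knowledge accumulates.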

Another group of Schumpeterian R&D models focuses on the idea of multiple equilibria in TFP. This group comprises, among others, the model of Howitt (2000) with twin peaks in productivity. A twin-peaked distribution of TFP can be explained by the existence of two clubs of countries. The first includes countries with no investments in R&D and with negligible capacity to absorb external knowledge. These countries stagnate at the lower peak. The second consists of countries that invest in new technologies and can absorb knowledge from the technological frontier to gradually get closer to the upper peak. The main problem with the model of Howitt (2000) is that it does not distinguish between different strategies for creating technology and catching up with the global technological frontier. For this reason, Howitt and Mayer-Foulkes (2005) proposed an extended model of growth through creative destruction, which implies three peaks in productivity. Countries in the highest club converge to a steady state where they perform leading-edge (modern) R&D related to basic and applied research. This R&D strategy needs an appropriate level of human capital, especially creativity skills. In turn, countries in the intermediate club converge to a steady state where they perform experimental development, which requires research and practical experience and allows them to implement technologies developed elsewhere using innovation-effective skills. Finally, countries in the lowest club converge to a stagnation steady state with no R&D and eroding absorptive capacity.

The multimodality of the distribution of TFP can also be explained on the basis of the theory of innovation geography (Feldman & Kogler, 2010). Although there is some scepticism in the new economic geography literature concerning the difficulty of measuring knowledge flows that matter for innovation activity (Krugman, 1991), some authors try to combine insights from endogenous growth theory and economic geography theory to formalize the interconnections between economic regions in terms of convergence or divergence processes (Alexiadis, 2013). There are some extensions of the first generation of R&D-based endogenous growth models to a two-region framework (Baldwin et al., 2001; Davis, 2009; Yamamoto, 2003). Similar efforts are under way in the extensions of semi-endogenous growth models (Fukuda, 2017) and the fully endogenous ‘Schumpeterian’ models (Davis & Hashimoto, 2015). These models deal with spatial externalities for knowledge spillovers that affect agglomeration economies for innovation (Bond-Smith, 2021). In line with Feldman’s (1994) argumentation, geographic proximity reduces the inherent risk of innovation projects by providing firms with access to necessary resources accumulated in a region. Moreover, innovative firms tend to form spatial clusters where knowledge externalities reduce the time and costs of innovation processes. Feldman et al. (2002) show how relational networks, initiated and coordinated by research universities, support knowledge spillovers in the form of technology transfer via local linkages and platforms stimulating interactions among firms, individuals, and government institutions.

Table 3.5 summarizes different theoretical predictions of the role of innovation in TFP convergence or divergence. Some theories/models suggest directly or indirectly that investments in innovation may lead to TFP convergence, while others find them resulting in TFP divergence. As regards the scale of production of inventions, TFP convergence or divergence predictions should be considered within the context of so-called ‘transitional dynamics’. In this case, the gap in technology levels between high-TFP countries/regions (technology leaders) and low-TFP countries/regions (technology laggards) should be expected to narrow and finally disappear during knowledge accumulation due to the decrease in R&D productivity. The opposite conclusion could be drawn when R&D productivity increases during knowledge accumulation. When interpreting the R&D scale assumptions of the first- and second-generation R&D-based endogenous growth models and their implications for TFP convergence or divergence, it should be noted that these models generally do not discuss the scale effects of technological knowledge accumulation for a multi-country (multi-region) world. They also neglect key issues of innovation activities, such as the existence of different types of innovation strategies and their spatial aspects. These limitations are partially mitigated by both the multiple equilibria Schumpeterian R&D models and innovation geography theory, which provide theoretical support for the existence of club convergence in TFP. The former emphasize the key role of different strategies for technological investments and the dynamics of absorptive capacity in explaining the multimodality of the distribution of TFP. The latter focuses on the tendency for innovation activity to cluster spatially, on the principle that innovation benefits most from location.

Table 3.5 Innovation and TFP convergence or divergence in theory

As regards the empirical verification of theories/models that help explain the role of innovation in TFP convergence or divergence, it is noteworthy that Jones (1995b) provides evidence that refutes the first generation of R&D-based theories, according to which more R&D staff ought to stimulate more TFP growth. On the other hand, Ang and Madsen (2011) tested the general validity of the second-generation endogenous growth models. Their results give strong support for the fully endogenous ‘Schumpeterian’ models and only partial support for the semi-endogenous growth models. Similar findings are presented by other authors (Greasley et al., 2013; Ha & Howitt, 2007). As for the effect of innovation on multimodality in the distribution of TFP, the results of empirical research do not allow direct verification of this relation. On the one hand, Henderson et al. (2008) show that the global distribution of total factor productivity tends to become multimodal over time. In turn, there are few empirical studies that deal directly or indirectly with the modality of the TFP distribution in the European regions, including that of Di Liberto and Usai (2013), who found that there are TFP leaders separating themselves from low-TFP regions. This finding is partially in line with the bimodality recorded in the EU regional distribution of output per worker by Fotopoulos (2005) and Rogge (2019). On the other hand, the existence of two or more peaks in the distribution of regional TFP can be seen as the result of the existence of technological clubs with different patterns of innovation activity. As shown by Barrios et al. (2019) and Kijek et al. (2022), there is a tendency for clustering in R&D and patent activity in the European regional space.

Another stream of research on TFP convergence and its determinants that relates directly to innovation/imitation processes corresponds to the aforementioned (Sect. 3.2) technological gap model. In their seminal papers, Griffith et al. (2003, 2004) focus on the two faces of R&D. The first is the conventional role of inducing innovation. The second is R&D-based absorptive capacity. In a sample of 12 OECD countries, they found that the greater the gap between an economy’s level of technology and the technological frontier, the greater the potential for technologies to be transferred to the non-frontier country via R&D. This effect of R&D is more important for those countries that are far from the technological frontier. Interestingly, Griffith et al. (2004) also suggest an alternative interpretation of the catching-up process that presumes that there are sharply diminishing returns to R&D. The methodological framework proposed by Griffith et al. (2004) was adapted by Männasoo et al. (2018), who studied the contribution of R&D spending to total factor productivity growth, taking into account convergence processes in the European regions. They find a convergence effect of the technology gap, but the effect of R&D expenditures was largely absent in the whole sample. However, a more detailed analysis of particular subsamples reveals that the marginal return on R&D declines in advanced regions, while less productive regions benefit relatively more from increases in R&D. Diminishing returns to R&D are also reported by Burda and Severgnini (2018), who analyzed total factor productivity convergence in the German states.