The medium-term effect of R&D on firm growth

UNU-MERIT Working Papers intend to disseminate preliminary results of research carried out at UNU-MERIT and MGSoG to stimulate discussion on the issues raised.

Abstract
This study analyses the medium-term effect of R&D expenditure on firm employment growth. Four cross-sectional waves of an innovation survey conducted in the Netherlands have been used to evaluate the effect on firm growth in the five years following the investment. Panel data fixed effect techniques, also allowing for selection bias corrections, indicate a positive influence of R&D on growth. Limited dependent variable models have been used throughout the whole analysis to consider explicitly the cases of firms exiting the market in the analysed medium term.

The authors want to thank Rudi Bekkers, Alex Coad, Koen Frenken, Önder Nomaler, the participants at the DIME Final Conference held in Maastricht on April 6-8, 2011, and the participants at the ECIS economics seminar held in Eindhoven on October 13, 2011. This study has partially been developed within the project on “Dynamics of Innovation and Markets in Europe” (Work Package 2.10 on innovation, firm entry and exit, and the growth of industries), sponsored by the 6th Framework Programme of the European Union.


Introduction
This paper examines the relation between R&D expenditure and medium-term firm growth. The recent literature has confirmed that firm-level analysis is necessary to capture the heterogeneity of the economy (Reichstein et al., 2010). In particular, the dynamics of some groups of firms, such as gazelles and high-growth firms, seem to be driven by different factors from those characterizing more moderately growing firms (Coad and Rao, 2008; Hölzl, 2009; Stam and Wennberg, 2009). A current challenge for economic researchers is therefore to extend the Gibrat's Law approach (Gibrat, 1931) in order to understand how such heterogeneity can be explained (Stam, 2010). Innovation is one of the usual suspects in defining differences in performance (and especially sustained performance) among firms, and connecting innovation patterns more explicitly with what is known about firm growth is thus a challenge for current research (Cefis and Orsenigo, 2001). In particular, Coad and Rao (2008) have observed that "innovation is of crucial importance for a handful of 'superstar' fast-growth firms".
In this paper, growth in terms of firm employment is the dependent variable. As Coad (2010) states, "employment growth can be seen not only as an input (in the production process) but also as an output if, for example, the policy maker is interested in the generation of new jobs". Henrekson and Johansson (2010) have shown the particular role played by high growth firms in job creation.
We implicitly assume an ideal pattern linking, unidirectionally, R&D to innovation to productivity to employment growth (where direct steps between nonadjacent links of this chain are also possible). However, our study will take into account only the first and the last links of this chain, and our independent variable of interest will be only R&D expenditure. Of course, alternative approaches would be possible that consider three or more links of the same chain at the same time, as in the multistep procedure by Crépon, Duguet, and Mairesse (1998), or that take into account multi-directional causation processes, as in the panel VAR approach followed by Coad and Rao (2010). Instead, we estimate the relation between R&D expenditure and firm growth without considering the intermediate logical steps in terms of innovation success and productivity changes. By doing so, we are technically close to the recent works by Hölzl (2009) and Hölzl and Friesenbichler (2010). Although not explicitly considered here, the potential "feedback" effect of firm growth on R&D could also be important when the analysis of firm survival and performance is not confined to the short term.
Our original contribution to the literature comes from the time span we consider after the R&D expenditure, that is, the amount of time allowed for R&D to exert an influence on firm growth: we analyse the medium-term effect of R&D, and in particular the effect five years after the investment. This choice is due to the suspicion that many of the previously cited firm-level works fail to find appreciable influences of R&D on growth, "in contrast to aggregate evidence which clearly shows that R&D and innovation lead to higher growth at the country level" (Hölzl, 2009), because firm-level growth is often measured only one or two years after the R&D expenditure (e.g. Klomp and van Leeuwen, 2001, and Coad and Rao, 2008), while a "long time lag [is] required for a commercially valuable discovery to finally materialize in terms of growth of sales or profits" and "successful R&D may even entail further short-term costs (e.g. costs related to product development) before yielding long-term benefits" (Coad and Rao, 2010). However, many technical problems arise when considering medium-term performance, as shown in Section 3 (on methodology). Notably, only a few studies on growth performance have considered a medium or long term, the main exceptions being an analysis by Brouwer et al. (1993) and some recent works on the effect of firm strategies on growth (e.g. Pelham and Wilson, 1996; Leitner and Gueldenberg, 2010).

Data and variables
For our research, we use the data from the Community Innovation Survey (CIS) that refer to the Netherlands. The CIS is a firm-level survey conducted every four years in all EU member states (plus non-EU countries like Norway and Iceland). We consider the four waves of the innovation survey conducted in 1996, 1998, 2000 and 2002. As we will explain later, the original data will be used for applying panel data techniques (exploiting the presence of firms in more than one wave), while a reduced version of the database, in which double counting of the same firms is avoided, will be used when pooling the four waves in a single cross-section. Table 1 gives the number of firms available in the four waves, for the original data (after dropping observations for which relevant information is missing) and for the pooled cross-section respectively.
In the Pattern column of Table 1, each digit corresponds to a survey year, where the years are 1996, 1998, 2000 and 2002; a dot indicates that the firm is missing in the corresponding survey, while a 1 indicates that the firm is observed in the corresponding survey. In the right part of the table, firms that were present in more than one survey year have been considered only for the oldest year. Firms with missing information on R&D expenditure, turnover, size, or affiliation to a group have been excluded from the data.
For each firm, we computed R&D intensity as the ratio between R&D expenditure (survey variables uitota, uitota, rtot and rtot for 1996, 1998, 2000 and 2002 respectively) and turnover (survey variables omztot96, omz98imp, turn, turn02 for 1996, 1998, 2000 and 2002 respectively). We exclude from the analysis all the firms for which the ratio between R&D expenditure and turnover is higher than one, and we use a logarithmic transformation to obtain the variable $RD_{it}$ that will be used in the rest of the analysis as our measure of (transformed) observed R&D intensity.
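The construction of the R&D intensity variable described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name and the example figures are hypothetical, and the treatment of zero R&D (returned as missing, since the logarithm is undefined) is an assumption consistent with the text.

```python
import math

def log_rd_intensity(rd_expenditure, turnover):
    """Return log(R&D expenditure / turnover), or None if the observation is unusable.

    Hypothetical helper: None marks cases with no defined log intensity.
    """
    if turnover <= 0 or rd_expenditure <= 0:
        return None  # zero R&D: the logarithm is undefined
    ratio = rd_expenditure / turnover
    if ratio > 1:
        return None  # excluded: R&D expenditure higher than turnover
    return math.log(ratio)

# Illustrative (made-up) firms: (R&D expenditure, turnover)
firms = [(50.0, 1000.0), (0.0, 800.0), (1200.0, 1000.0), (1000.0, 1000.0)]
intensities = [log_rd_intensity(rd, turn) for rd, turn in firms]
print(intensities)
```

Because the ratio is capped at one, the transformed variable can never exceed zero, which is the source of the truncation at zero visible in its distribution.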
Figure 1 shows the (unconditional) distribution of $RD_{it}$ when pooling all the observations. Apart from the right-truncation at zero (due to our exclusion of firms having R&D expenditure higher than turnover), the distribution resembles a Gaussian.
Note to Figure 1: $RD_{it}$ is computed as the logarithm of R&D expenditure over turnover. Firms with R&D expenditure equal to zero have not been considered in the analysis. For values of R&D intensity below -11 (still included in the analysis), the density is too low to be shown in the graph.

If we denote by t each year in which the survey has been conducted, the corresponding medium-term firm performance is computed as the firm growth between t+1 and t+5, where firm size is proxied by firm employment plus one. The data on employment have been retrieved by matching the CIS data with the Business Register, a yearly census of the whole population of firms registered for fiscal purposes in the Netherlands. This match allows us to check the survival of firms, and to measure the growth rate of surviving firms, during the five years following the CIS survey wave in which the same firms were surveyed.
To define firm growth for each firm i and year t (where t=1996, 1998, 2000, 2002), we start from the expression of relative firm growth (subsequent to the R&D expenditure), to which a logarithmic transformation is applied (equation 1). Such a measure of growth can take only values between 0 and +∞ (zero in case of exit). Figure 2 shows that the resulting (unconditional) distribution (obtained when pooling all the observations and not considering exits) resembles a Laplace (in line with the findings of Stanley et al., 1996, and Axtell, 2001, who use a log size difference approximation of growth) and looks symmetric in the body, while its left tail is truncated at zero, and its right tail is very long, as it includes some episodes of outstandingly high growth. The measure obtained in (1) will be the growth proxy used in the rest of our study, and referred to as the "CTV measure" in the text.
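As an illustration only, a measure with the properties stated above (a logarithmic transformation of relative growth, equal to zero at exit and unbounded above) could be written as follows. This specific form is an assumption made for illustration and is not necessarily the authors' expression (1); $S_{it}$ denotes the size proxy and $r_{it}$ the relative growth rate, with exit corresponding to $r_{it} = -1$.

```latex
r_{it} = \frac{S_{i,t+5} - S_{i,t+1}}{S_{i,t+1}}, \qquad
g_{it} = \log\left(2 + r_{it}\right), \qquad
r_{it} \ge -1 \;\Rightarrow\; g_{it} \ge \log(1) = 0 .
```

Under this assumed form, a hypothetical relative growth rate below -100% would map to a negative value of the measure, which matches the way censoring at zero is later interpreted for exiting firms.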
A variety of proxies have been used in studies concerning firm growth. Besides relative or absolute growth measures, the most popular measures are the Birch index (Birch, 1981), which combines relative and absolute growth, and the log size difference. The Birch index has been used especially in studies interested in fast growing firms (Almus, 2002; Hölzl, 2009, 2010), since it weighs proportional growth by the absolute change in the number of employees. For a size proxy $x$, the growth rate $g$ of firm $i$ between periods $t-1$ and $t$ is computed as:
$$g_{it} = (x_{it} - x_{i,t-1}) \cdot \frac{x_{it}}{x_{i,t-1}}$$
It therefore gives more importance to large positive changes in firm size. The log size difference has been chosen in the literature on firm growth and R&D expenditure (Coad and Rao, 2008, 2010; Klomp and van Leeuwen, 2001), and more generally in the literature on firm growth distributions (Bottazzi and Secchi, 2006). This measure approximates proportional growth while reducing the importance of outliers (with large positive growth rates), and is computed as:
$$g_{it} = \log(x_{it}) - \log(x_{i,t-1})$$
We choose to depart from these studies for two reasons. First, the approximation of the relative growth process by a log size difference is possible only for a limited range of values. For instance, consider a number $a$; we know, by first-order Taylor expansion around zero, that $\log(1+a) \approx a$ for small $a$. Note that the expression is not valid for values of $a$ close to -1 or larger than or equal to 1. By defining $a = (x_{it} - x_{i,t-1})/x_{i,t-1}$ we get that
$$\log(x_{it}) - \log(x_{i,t-1}) = \log(1+a) \approx \frac{x_{it} - x_{i,t-1}}{x_{i,t-1}}$$
Therefore, the log size difference is a correct approximation of relative growth for values below 1 and not too close to -1. Outside this range (in the case of extreme growth events), the two measures are not similar.
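The limited range of the first-order approximation can be checked numerically. The short sketch below (function name and sample values chosen here for illustration) compares $\log(1+a)$ with $a$ for small and for extreme relative growth rates.

```python
import math

def approximation_error(a):
    """Absolute gap between log(1 + a) and its first-order approximation a."""
    return abs(math.log(1.0 + a) - a)

# Small growth rates are well approximated; extreme ones are not.
for a in (0.05, 0.5, 1.0, 3.0, -0.9):
    print(f"a = {a:5.2f}  log(1+a) = {math.log(1.0 + a):7.3f}  "
          f"error = {approximation_error(a):6.3f}")
```

The gap is negligible for a growth rate of 5%, but exceeds one full unit both for a quadrupling of size and for a 90% contraction.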
A second characteristic of log size differences, as a proxy for growth rates, has been put forward by Capasso and Cefis (2012) and involves the issue of endogenous truncation of the growth rate distribution: firms cannot have fewer than zero employees. When considering the log size difference as a measure of growth rates, and the number of employees plus one as a proxy for firm size (to avoid the existence of infinite negative growth rates), the distribution of growth, conditional on initial firm size, has a left boundary that depends on the initial size itself: the support of the growth rate distribution, and in particular the minimum growth rate, is sensitive to the firms' initial size. Such a distortion in the growth rate distribution can potentially bias a study on industrial dynamics and innovation, especially when extreme growth events (i.e. the tails of the growth distribution) deserve particular attention. A left truncation of the distribution characterizes all the measures of growth rate, but it has a particularly disturbing impact in the case of the log difference proxy, because only in this case does the left truncation depend on the size of the firms in the sample.
In order to clarify this, the heterogeneity across growth rate measures is made explicit in Figure 3. In Figure 3a, we show the one-year growth rate (measured on the vertical axis) of a firm having 10 employees in period t-1, and a number of employees ranging from 0 to 20 (measured on the horizontal axis) in the following period t, for five different growth indicators: absolute growth (Abs. gr), the Birch index (Birch), proportional growth (Prop), our measure (CTV) and log size difference (Log diff). As expected, the Birch index largely overemphasizes large positive events, but also associates larger negative values to decreases in size as compared with the proportional growth, CTV and log difference measures.
Differences between the latter measures are better visualized in Figure 3b, where we do the same exercise as for Figure 3a, but focusing only on: proportional growth (Prop), our measure (CTV) and log size difference (Log diff).
In Figure 3c, we do the same exercise as for Figure 3b but considering an initial number of employees equal to 100. We can observe the sensitivity of the log difference measure to initial size: the higher the initial size, the lower the minimum log size difference. The log size difference is -2.4 (i.e. the opposite of the natural logarithm of 11, since we proxy firm size by number of employees plus one) for a firm exiting in t with a size of 10 employees in t-1 (Figures 3a and 3b), and -4.6 for a firm with an initial size of 100 employees (Figure 3c). Instead, the minimum growth rate is always -1 in terms of proportional growth, and always 0 in terms of our CTV measure.
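The size dependence of the lower bound can be reproduced with a few lines of code. This is a sketch under stated assumptions: the function names are hypothetical, and proportional growth is computed here on raw employment (an assumption made so that exit yields exactly -1, as stated in the text), while the log size difference uses employment plus one.

```python
import math

def size(employees):
    """Size proxy: number of employees plus one."""
    return employees + 1

def log_size_difference(emp_prev, emp_now):
    """Log size difference between two periods, using the size proxy."""
    return math.log(size(emp_now)) - math.log(size(emp_prev))

def proportional_growth(emp_prev, emp_now):
    """Proportional growth on raw employment (assumption: exit gives -1)."""
    return (emp_now - emp_prev) / emp_prev

for initial in (10, 100):
    print(initial,
          round(log_size_difference(initial, 0), 1),   # bound moves with initial size
          proportional_growth(initial, 0))             # bound stays at -1
```

The minimum of the log size difference shifts from about -2.4 to about -4.6 as initial employment goes from 10 to 100, whereas the minimum proportional growth is the same in both cases.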
The characteristics of the CTV measure in comparison with the log size difference and proportional growth indicators are as follows. For high positive growth rates, the log transformation applied to the relative growth rate (equation 1) makes it similar to the log difference growth rate: it reduces the effects of heteroscedasticity on the econometric outcomes by giving less weight to the extreme positive events (as also noted by Coad and Hölzl, 2012). Instead, in the case of extreme negative events (exit), our measure is less affected by the endogenous truncation issue than the log difference one. This latter feature is of particular relevance since we are interested in the evaluation of performance changes in the medium term, and such a longer term may affect the frequency and the magnitude of extreme (positive or negative) growth events.
Figure 3: (a) One-year growth rate (vertical axis) of a firm having 10 employees in period t-1 and a number of employees ranging from 0 to 20 in the following period t (horizontal axis), for five growth indicators: absolute growth (Abs. gr), the Birch index (Birch), proportional growth (Prop), our measure (CTV) and log size difference (Log diff). (b) Same exercise as in (a), for Prop, CTV and Log diff only. (c) Same exercise as in (b), with an initial number of employees equal to 100. Firm size (used when computing any of the growth measures) is proxied by the number of employees plus one.
A descriptive summary of the main variables used in our analysis is reported in Table 2 (firm size and growth) and in Table 3 (R&D). In Table 4, descriptive statistics are reported for growth rates and R&D intensity after dropping observations that concern exits and cases of R&D expenditure equal to zero. This allows for a direct comparison with the bottom part of Table 2 (statistics on growth, excluding exits but including cases of zero R&D) and the bottom part of Table 3 (statistics on R&D intensity, excluding cases of zero R&D but including exits). This comparison shows that not only the mean, but also the first and third quartiles, of the growth rate distribution become higher after excluding the firms that have zero R&D, which could signal a positive effect of R&D expenditure on the medium-term growth of surviving firms. Analogously, the mean, as well as the first and third quartiles, of the R&D intensity distribution are shifted upward when excluding exits, indicating that surviving firms seem to be more likely than exiting firms to have previously had a positive R&D expenditure.

Methodology
Of the 22024 firms observed in the four CIS survey waves and matched with ABR data (including firms present in more than one wave), 3721 have exited during the five years following the survey. Given the medium-term span over which we measure performance, balancing the panel, and thus excluding the exiting firms from the analysis, would result empirically in a strong reduction of the amount of data used, and theoretically in neglecting the influence that R&D (and in general the whole innovation process) has on firm survival, an influence already shown on similar data by Cefis and Marsili (2005).
We face two problems of variable left-limitation: one for the dependent variable (firm growth) and the other for the independent variable of interest (R&D intensity). The typical way of dealing with such a problem is through the limited variable regression models named Tobit, and in particular either the original Tobit model (Tobit type I, introduced by Tobin, 1958) or its alternate version usually employed for correcting possible selection biases (Tobit type II, also known as Heckit, introduced by Heckman, 1979, and homogenized in the Tobit framework by Amemiya, 1984). The choice between Tobit type I and Tobit type II should be based on the assumptions made about the variable limitation: is the limit value observed for some individuals (censored observations) deriving from the same process that causes the non-limit value for other individuals (noncensored observations)? Rephrasing for our two cases of left-limitation, the question becomes respectively: "Are the firm exits from the market deriving from the same process that defines the growth of surviving firms?" and "Is the decision of declaring no R&D expenditure deriving from the same process that defines the amount of money spent on R&D by firms that declare an R&D expenditure?" We will deal with the growth variable limitation and with the R&D variable limitation in two different ways.
For the growth variable, we assume that firms exiting the market are firms that have experienced strong negative growth rates (relative growth rates lower than -100%, i.e. values of our growth measure lower than zero). In other words, we assume that exit from the market and growth rates of surviving firms are governed by the same process (i.e. by the same relation with the independent variables). The natural consequence of our assumption is adopting a Tobit type I model for explaining exit and growth. We thus distance ourselves from the studies of Hall (1987) and Evans (1987). They instead choose a Tobit type II model, assuming that the decision to exit is governed by a different process than low growth, and therefore must be modeled separately. When analysing growth with panel data techniques (with fixed effects), we will use the Tobit type I panel data version introduced by Honoré (1992, with the corresponding function pantob of the Stata software package) instead of the traditional Tobit type I.
For the R&D variable, we assume that the process leading firms to have (and declare) an R&D expenditure higher than zero is different from the (logically subsequent) process leading firms to decide how much money shall be spent on R&D (and declared in the CIS survey). Therefore, we have two separate equations (respectively, selection and outcome equation of a Tobit type II model) defining respectively the decision of having (and declaring) an R&D expenditure and the amount of money declared to be invested in R&D, where the two equations are allowed to have different independent variables, and different parameters for the same independent variables. Such solution was already used by Hall, Lotti and Mairesse (2009), who estimate a Heckman selection model. First a probit model is estimated to evaluate the influence of some selection variables on the probability of having an observed R&D higher than zero. The observed R&D intensity is then regressed on some control variables plus the inverse Mills' ratio and other byproducts of the probit estimation (to correct for the selection bias). After obtaining an estimation of the parameters of this second equation, it is possible to build a theoretical value of the predicted R&D intensity which is able to proxy the R&D effort also for firms whose observed R&D expenditure is equal to zero. This is particularly useful in cases like ours where R&D expenditure is equal to zero for many firms. This is especially the case for the smallest ones for which a proper assessment of the R&D effort is not easy given that they do not have a separate R&D department.
Using the predicted R&D intensity rather than the observed R&D intensity can partially solve such problem (Hall, Lotti and Mairesse, 2009). Unfortunately, the additional variables we use to explain the probability of having an observed R&D higher than zero (selection equation) are available only for a subset of observations, and therefore we will use, in different models, both the observed R&D intensity and the estimated one.
Summing up, we will estimate four different models:
Model 1: pooled cross-section, Tobit type I model for growth, considering only firms having an observed R&D intensity higher than zero.
We pool the four waves in a single cross-section, and we assume that a latent variable $g^*_{it}$ is, for each firm, linearly related to the independent variables, and is linked to the observed firm growth $g_{it}$, computed as in (1), as follows:
$$g^*_{it} = x_{it}'\beta + u_{it}, \qquad g_{it} = \begin{cases} g^*_{it} & \text{if } g^*_{it} > 0 \\ 0 & \text{if } g^*_{it} \le 0 \end{cases}$$
This is tantamount to saying that exiting firms (i.e. firms for which $g_{it} = 0$) are firms for which the latent variable assumes nonpositive values. The vector of independent variables is structured as
$$x_{it} = (logsize_{it},\ RD_{it},\ group_i,\ sector_i,\ wave_i)$$
where, for each firm i and time t, $logsize_{it}$ is equal to the logarithm of firm employment plus one, $RD_{it}$ is the observed R&D intensity defined as in Section 2, $group_i$ is a dummy variable taking value one if firm i is part of a bigger industrial group and zero otherwise, $sector_i$ is a vector of 51 dummy variables, each one associated to a given 2-digit sector and taking value one if firm i belongs to that sector and zero otherwise, and $wave_i$ is a vector of dummy variables, each one associated to the survey wave to which the observation belongs. To avoid double counting of the same firms in the pooled cross-section, for firms that were present in more than one survey wave, only the observations pertaining to the oldest wave are kept, thus reducing the number of observations to 16510 (i.e. exactly the total number of firms present in the database after cleaning the data).
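The latent-variable structure of Model 1 corresponds to the log-likelihood of a Tobit type I model left-censored at zero, which can be sketched as follows. This is a minimal illustration assuming normally distributed errors with standard deviation sigma (the standard Tobit assumption); the function names and the inputs are hypothetical, and the sketch does not reproduce the authors' estimation routine.

```python
import math

def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def tobit_loglik(growth, fitted, sigma):
    """Log-likelihood of a Tobit type I model censored at zero.

    growth: observed growth values (0 for exiting firms)
    fitted: linear predictions x'beta, one per observation
    sigma:  standard deviation of the (assumed normal) error term
    """
    total = 0.0
    for g, xb in zip(growth, fitted):
        if g > 0:
            # surviving firm: density of the observed growth value
            total += math.log(normal_pdf((g - xb) / sigma) / sigma)
        else:
            # exiting firm: probability that the latent variable is nonpositive
            total += math.log(normal_cdf(-xb / sigma))
    return total

print(tobit_loglik([0.0, 0.9, 1.2], [0.1, 0.8, 1.0], 0.5))
```

Exits and survivors enter the same likelihood through the same linear index, which is what distinguishes this approach from a two-equation selection model.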
Model 2: pooled cross-section, Tobit type I model for growth, considering for all firms the estimated R&D intensity instead of the observed one. Estimated R&D intensity is predicted by a Tobit type II model in which observed R&D intensity is the dependent variable of the outcome equation.
In Model 2, the difference with respect to Model 1 lies in the vector of independent variables:
$$x_{it} = (logsize_{it},\ \widehat{RD}_{it},\ group_i,\ sector_i,\ wave_i)$$
which does not contain the observed log R&D intensity $RD_{it}$, and contains instead an estimate $\widehat{RD}_{it}$ obtained à la Heckman (Heckman, 1979; also named Tobit type II by Amemiya, 1984) following the procedure by Hall, Lotti and Mairesse (2009). In particular, we first estimate a probit selection equation in which the dependent variable is the probability of having an observed R&D higher than zero:
$$RDP_{it} = \begin{cases} 1 & \text{if } RDP^*_{it} = w_{it}'\alpha + \varepsilon_{it} > c \\ 0 & \text{otherwise} \end{cases}$$
where $RDP_{it}$ is an (observable) indicator function that takes value 1 if firm i reports positive R&D expenditures, $RDP^*_{it}$ is a latent indicator variable such that firm i decides to perform R&D expenditures if $RDP^*_{it}$ is above a given threshold $c$, $\varepsilon_{it}$ is the error term, and $w_{it}$ is a set of explanatory variables affecting R&D, including firm log size and the CIS subjective variables that refer to the problems encountered by the firms in implementing innovation strategies. Namely, the additional variables are: per (not sufficiently qualified personnel), kno (insufficient knowledge), lfi (lack of financing), cos (high costs), mar (uncertain market), reg (restrictive regulation), org (not flexible organization), ris (financial risk). We then consider only the observations for which the observed R&D is higher than zero, and estimate the outcome equation
$$RD_{it} = z_{it}'\gamma + e_{it} \quad \text{if } RDP_{it} = 1$$
where $z_{it}$ is a set of determinants of R&D expenditures, including size and the following byproducts of the previous probit estimation: the predicted probability of having R&D higher than zero (eventprob) and the corresponding Mills' ratio (imr), as well as their squares and their product. After obtaining an estimate of the coefficient vector $\gamma$, and given that we know $z_{it}$ for all the observations (including the ones for which the declared R&D was zero), we can then compute a predicted value of the R&D intensity (say, the R&D effort) for all the observations.
The predicted R&D intensity $\widehat{RD}_{it}$ will then be included in the vector $x_{it}$ of the independent variables of the growth equation (in the place of the observed $RD_{it}$ used in Model 1).
Notice that the two stages (selection and outcome) which compose this extended version of the Heckman procedure combine the information about declaring or not an observed R&D expenditure higher than zero, on the one hand, and the information on the amount of R&D expenditure, on the other hand, in order to infer the R&D effort of all the firms (see Hall, Lotti and Mairesse, 2009, for details). Such estimated R&D effort is then used as the independent variable of interest in the final equation, where growth is the dependent variable.
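The byproducts of the selection stage that enter the outcome equation can be sketched as follows. This is an illustration under assumptions: the fitted probit index is taken as given (its estimation is not shown), the function names are hypothetical, and the Mills' ratio is computed in its usual form for selected observations, i.e. the standard normal density divided by the standard normal distribution function at the index.

```python
import math

def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def selection_byproducts(index):
    """Correction terms built from a fitted probit index (assumed given).

    Returns the predicted probability of positive R&D (eventprob), the
    inverse Mills' ratio (imr), their squares and their product.
    """
    eventprob = normal_cdf(index)
    imr = normal_pdf(index) / eventprob
    return {
        "eventprob": eventprob,
        "imr": imr,
        "eventprob_sq": eventprob ** 2,
        "imr_sq": imr ** 2,
        "eventprob_x_imr": eventprob * imr,
    }

terms = selection_byproducts(0.0)
print(terms)
```

Because these terms depend only on the selection variables, they can be computed for every observation in the selection sample, including firms that declared zero R&D, which is what makes an out-of-sample prediction of R&D effort possible.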
In both the probit selection model and the outcome equation, firm sector dummies and survey wave dummies have been included. As in Model 1, only the observations pertaining to the oldest wave have been kept, thus reducing the number of observations to 16510 (i.e. exactly the number of firms present in the database after cleaning the data). The number of observations is further reduced to 4445, because of the additional variables needed for the selection equation (and not available for all the firms). The outcome equation is estimated on the 3794 observations for which the observed R&D is higher than zero, and allows the estimation of the R&D effort for all the 4445 observations used in the selection equation. The same 4445 observations are eventually used in the final equation where the estimated R&D effort is the independent variable of interest.
Model 3: panel data, Tobit type I model with fixed effects, considering only firms having an observed R&D intensity higher than zero.
In Model 3, the only difference with respect to Model 1 lies in the fact that now the four cross-sectional waves are not pooled in a single cross-section, but are analysed as panel data with fixed effects. In particular, the model is estimated by using the Stata function pantob that refers to the model by Honoré (1992), which allows for fixed effects in truncated regression models. Such a model makes it possible to explore the "within" effect of R&D on growth, instead of the "between" effect estimated in the pooled cross-sections of the previous models. Neither sectoral nor survey wave dummy variables have been included in Model 3, as the fixed effects method is aimed at removing idiosyncrasies at the firm level (thus, no need to care for sectoral idiosyncrasies), and examines changes over time in firm performance (thus, no need to care for cross-correlations). To allow the model to estimate fixed effects, we use only the data pertaining to firms that have been surveyed in more than one CIS wave. The panel is reduced to 6593 observations from the original 22024, corresponding to 2747 firms from the original 16510.
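The two sample definitions used across the models (oldest wave only for the pooled cross-sections, firms surveyed more than once for the fixed-effects panel) can be sketched as follows. The firm identifiers and the function name are hypothetical and the data are made up for illustration.

```python
from collections import defaultdict

def split_samples(observations):
    """Build the pooled cross-section and the panel sample.

    observations: list of (firm_id, wave_year) pairs
    Returns (pooled, panel):
      pooled keeps one observation per firm, from its oldest wave;
      panel keeps all observations of firms seen in more than one wave.
    """
    waves_by_firm = defaultdict(list)
    for firm_id, wave in observations:
        waves_by_firm[firm_id].append(wave)
    pooled = sorted((firm, min(waves)) for firm, waves in waves_by_firm.items())
    panel = sorted(
        (firm, wave)
        for firm, waves in waves_by_firm.items()
        if len(waves) > 1
        for wave in waves
    )
    return pooled, panel

observations = [("A", 1996), ("A", 2000), ("B", 1998),
                ("C", 2000), ("C", 2002), ("C", 1996)]
pooled, panel = split_samples(observations)
print(pooled)
print(panel)
```

In this toy example firm B appears in the pooled cross-section but drops out of the panel, which mirrors the reduction in sample size when moving from the pooled models to the fixed-effects models.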
Model 4: panel data, Tobit type I model with fixed effects, considering for all firms the estimated R&D intensity instead of the observed one. Estimated R&D intensity is predicted by a Tobit type II model in which observed R&D intensity is the dependent variable of the outcome equation.
In Model 4, the only difference with respect to Model 2 lies in the fact that now the four cross-sectional waves are not pooled in a single cross-section, but are analysed as panel data with fixed effects. As in Model 3, the model is estimated by using the Stata function pantob that refers to the model by Honoré (1992), which allows for fixed effects in truncated regression models, to explore the "within" effect of R&D on growth. As in Model 2, estimated R&D is used in the place of observed R&D. The estimation of the R&D effort works in the same way as in Model 2, that is, by applying the Heckman procedure to a pooled cross-section in order to obtain a predicted value of R&D intensity for all the observations. However, the pooled cross-section now includes double-counted firms (firms surveyed in more than one wave) for consistency with the fixed effects procedure that takes place in the final step of the model. 6117 observations are thus used in the selection equation (i.e. the observations for which the information on the additional variables required by the Heckman procedure is available, including firms present in more than one wave), and 5284 of them are employed for the outcome equation (i.e. the ones in which the firm has declared an R&D expenditure higher than zero). This allows us to estimate the R&D effort for all the 6117 observations used in the selection equation. However, only 1951 observations pertain to firms that have been surveyed in more than one wave (883 firms) and will be used in the final equation with fixed effects, where the estimated R&D effort is the independent variable of interest, and growth is the dependent variable. As for Model 3, neither sectoral nor survey wave dummy variables have been included in Model 4.

Results
The regression results obtained for the four models are shown in Table 5. In Model 1, the parameter referring to $RD_{it}$ is not significant. The effect of firm size is negative and strongly significant, in line with the literature (since Hymer and Pashigian, 1962). Belonging to a group also seems to exert a significant negative effect on growth.
The influence of R&D is still nonsignificant after using its predicted value ($\widehat{RD}_{it}$) instead of its observed value ($RD_{it}$), i.e. after estimating in the analysis also the innovation effort of firms that report zero R&D expenditure, and controlling for the selection bias (Model 2). In Table 5 only the results on the final equation are reported; see Table 6 (selection equation) and Table 7 (outcome equation) for the results of the Heckman procedure preliminary to the estimation of Model 2. The probability of declaring an R&D expenditure higher than zero is positively related to the declaration of having insufficiently trained personnel and of facing financial risks (variables per and ris, Table 6). This information is used by inserting in the outcome equation the variables imr and eventprob derived from the selection equation. However, the selection bias does not seem to be strong (see the nonsignificant coefficients in the left part of Table 7), nor does R&D become significantly effective when proxied by its predicted value (second column of Table 5). The information jointly provided by the first two columns of Table 5, corresponding to the results for Model 1 and Model 2 respectively, suggests that R&D intensity does not seem to be a good variable for discerning between firms that survive and grow, on the one hand, and firms that do not grow or do not survive, on the other hand. In other words, the pooled cross-section results do not assign particular value to the variables $RD_{it}$ and $\widehat{RD}_{it}$, which means that they do not explain performance differences across firms.
The third column of Table 5 shows the results for Model 3, where the effect of observed R&D becomes positive and significant. While the results of Models 1 and 2 have shown that firms having a higher R&D intensity are not likely to have a higher medium-term growth, the result of Model 3 (a fixed-effect model) indicates that a firm which decides to raise its level of R&D will be likely to raise its medium-term growth ("within" effect). The effect of size is still negative and significant (as in the cross-sectional analysis), while belonging to a group is now shown to be a factor that can hamper growth. Once we consider the estimated R&D effort instead of the declared R&D intensity, the main result does not change (Model 4, last column of Table 5): the R&D coefficient is significant for the fixed-effect model, i.e. an increase in the R&D intensity positively affects the performance path of a firm over time. Note that the coefficient on predicted R&D in Model 4 is of much higher magnitude than the one on observed R&D in Model 3 (0.146 against 0.007). This result may be driven by the larger number of small firms in the sample of Model 4, which report zero R&D expenditure (and are thus not considered in Model 3) but for which estimated R&D is positive. The inclusion of these smaller firms in the sample leads to changes in the magnitude of the R&D coefficient due to a higher average and variance of the dependent variable.

Table 5: Estimation results. In Models 2 and 4, predicted R&D is used instead of observed R&D. The estimation procedure for Model 3 with export among the regressors did not converge. Dummy variables relating to 2-digit sectors have been included in all models, and dummy variables relating to the cross-sectional waves have been included in Models 1 and 2, i.e. in the pooled cross-section models. Standard errors in brackets below the parameter estimates; * significant at 10%, ** significant at 5%, *** significant at 1%.
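The contrast between the pooled and the "within" results can be illustrated with a deliberately simplified example. The sketch below uses a plain linear within (demeaning) estimator as a stand-in for the Honoré (1992) estimator, which additionally handles truncation; the two-firm panel and all variable names are made up for illustration and are not drawn from the survey.

```python
from collections import defaultdict
from statistics import mean


def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    num = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    den = sum((xi - x_bar) ** 2 for xi in x)
    return num / den


def within_transform(firm_ids, values):
    """Subtract each firm's own mean from its observations."""
    by_firm = defaultdict(list)
    for firm_id, value in zip(firm_ids, values):
        by_firm[firm_id].append(value)
    firm_mean = {firm_id: mean(vals) for firm_id, vals in by_firm.items()}
    return [value - firm_mean[firm_id]
            for firm_id, value in zip(firm_ids, values)]


# Made-up panel: two firms observed in two waves each.
firm = ["A", "A", "B", "B"]
rd = [1.0, 2.0, 3.0, 4.0]        # hypothetical R&D intensity
growth = [2.0, 3.0, 1.5, 2.5]    # hypothetical medium-term growth

# Pooled slope: ignores which firm each observation belongs to.
pooled = ols_slope(rd, growth)

# Within slope: uses only each firm's deviations from its own mean.
within = ols_slope(within_transform(firm, rd),
                   within_transform(firm, growth))
```

In this constructed panel the pooled slope is zero, because the firm with the higher R&D level has the lower growth level, whereas the within slope equals one, because each firm grows more in the wave in which it raises its own R&D.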

Conclusion
Our analysis shows that R&D expenditure exerts a positive influence on firm employment in the medium term (five years after the investment). However, the influence appears only when considering fixed-effects methods, which allow for firm idiosyncrasies in the firm growth process. The economic implication is that R&D intensity does not explain performance differences across firms but explains, instead, changes over time in the performance of a given firm. In that respect, our results are consistent with the findings of Mudambi and Swift (2011) on the importance of considering the proactive management of R&D expenditures (as identified by a greater R&D volatility) rather than their level. An increase in R&D intensity will make a firm deviate upward in its performance path, where performance is meant to be not only growth after survival, but also the probability of survival itself. Indeed, throughout the whole analysis, the problem of firm exits has been explicitly addressed by means of truncated regression models and the use of an appropriate growth proxy. At the same time, our results are confirmed when using a predicted value of R&D intensity, instead of the observed one, to obtain a measure of R&D effort also for firms that report zero R&D expenditure.
Previous works have found evidence of a modest role of R&D in explaining short-term firm growth when estimating the 'average effect for the average firm' (Coad and Rao, 2008).
Our study suggests that the influence of R&D on employment growth can easily be underestimated for two intertwined reasons. On the one hand, R&D needs some time to show its effects, and a short-term analysis based on the observation of employment growth during only one or two years after the investment may not be sufficient to capture all the causal links connecting R&D to employment. On the other hand, the effect that R&D exerts on firm survival cannot be ignored, especially when using firm-level analyses to predict the aggregate outcome of innovation policies at the regional or country scale.