1 Introduction

The complementary capabilities thesis states that the existence of an innovative idea is only one part of the successful development of a new product or process (Teece, 1986). To take the idea further into an innovation triumph, firms may need complementary capabilities. These can, for instance, relate to technology or human capital. Theoretical models show that investments in human capital and innovation inputs such as R&D are strategic complements (Acemoglu, 1997; Redding, 1996). Several empirical studies verify the importance of human capital for innovations (Mason et al., 2020; Mohnen & Röller, 2005; Peri et al., 2015; Winters, 2014). It is even possible that innovation and skills may create super-additive effects due to the generation of important synergies (for instance Piva & Vivarelli, 2004). Mohnen and Röller (2005) also demonstrate that the lack of skills is the most important obstacle to innovation activities in a variety of industries and countries. However, few analyses specify what kind of skills are most beneficial. An exception to this is Freel (2005), who finds that technical skills are particularly important for small innovative firms.

Research and development (R&D) intensity is commonly used to describe innovation activities in firms (see for instance Piva & Vivarelli, 2009), although Kleinknecht et al. (2002) conclude that this measure is biased towards manufacturing and thus risks underestimating innovations in services and small firms (Footnote 1). This implies that the broader measure, innovation expenditures, may have some advantages over R&D intensity, but it is seldom utilized in the literature (exceptions include Raymond et al., 2009; Archibugi et al., 2013a, 2013b).

This study attempts to establish the importance of specific formally achieved higher skills for the innovation intensity in firms by use of uniquely linked, representative and comparable official firm-level data for a group of European countries (Finland, France, the Netherlands, Slovenia and the United Kingdom). The intensity is measured as the ratio of innovation expenditures to turnover in firms and the skills are represented by the proportion of employees with higher (tertiary) education in ICT-oriented or general fields (highly ICT- or generally skilled employees). Additional control variables include firm size, firm age and funding status. The analysis makes use of official data on innovation activities in firms for the period 2004–2010, linked to registers on education and businesses as well as to statistics on production, including 34,286 observations.

While some studies investigate the relationship between innovation expenditure and (i) innovation output (Schäfer et al., 2017), (ii) subsidies (Czarnitzki & Lopes-Bento, 2014) and (iii) cooperation activities (Cassiman & Veugelers, 2000; Kaiser, 2002; Piga & Vivarelli, 2003; Arvanitis & Bolli, 2013), few focus specifically on how the orientation of higher skills may affect the absorptive capacity of the firm and its ability to innovate. Additionally, the information on formally achieved higher skills indicates to what extent this knowledge is used by the whole firm, independently of the occupational status of the employee.

This study also contributes to the growing literature on analysis of harmonized and linked firm-level data, appropriate for cross-country comparisons. Examples include Bartelsman et al. (2019), Hagsten and Kotnik (2017), Hagsten and Sabadash (2017) and Pantea et al. (2017) as well as Hagsten (2016). All these studies examine the relationship between firm performance (output, productivity, exports and employment) and different ICT or innovation variables based on multi-linked official firm-level data.

In the next section, the conceptual background is presented. This is followed by the empirical approach as well as by a description of data and their sources. Subsequently, the estimation results are exhibited and discussed. Finally, some concluding remarks are offered.

2 Conceptual background

Physical as well as human resources are important factors for the innovation outcome in firms (Doran et al., 2020; Leiponen, 2005). Literature also shows that skilled employees are a necessary complement to R&D activities since this strengthens the absorptive capacity of the firm (Bartel & Lichtenberg, 1987). Studies at the industry level demonstrate that human resources are a key component in innovation activities and economic growth processes. For example, Griffith et al. (2004) stress the importance of human capital for technological change and innovations in OECD countries. The concept of absorptive capacity strongly relates to the role of human capital in the innovation process, in that internal capacity and external knowledge are considered complementary (Cohen & Levinthal, 1990).

Other studies emphasize the importance of formal human capital, measured as educational achievements. Mangematin and Nesta (1999) claim that highly qualified employees will increase the knowledge base of the firm through spillovers from their daily tasks. By collaborating with professionals outside the firm, access may be gained to external knowledge networks (Rothwell & Dodgson, 1991). Carter (1989) argues that highly educated employees make the largest contribution to the know-how trade because of their extensive knowledge, which means that they will be in a better position to recognize, assimilate and apply external knowledge to the internal innovation process.

Lundvall (2008) highlights the relevance of graduates from different fields for the innovation process, where engineers and scientists are particularly relevant for basic innovations, while people with a management or social science degree are crucial as second level innovators.

Mohnen and Röller (2005) provide quantitative evidence of the importance of human capital by finding that the lack of skills is the main obstacle to innovation activities in a wide range of sectors across countries. They show that there are essential complementarities between technical competence and innovations and that human capital is positively associated with innovation performance. The importance of and possible complementarity between human resources and ICT for innovations is highlighted by Leiponen (2005) as well as by Bourke and Crowley (2015) while Piva and Vivarelli (2004) discuss potential super-additives that appear from synergies between innovations and skills within the organizational structure.

In addition, many studies investigate the relationship between human capital and R&D intensity. The majority of these studies find that R&D intensity and human capital are positively related (Kumar & Saqib, 1996; Piva & Vivarelli, 2009; Van Dijk et al., 1997). Certain analyses point to insignificant relationships between human capital indicators and innovation output (Lund Vinding, 2006; Peters, 2009; Schneider et al., 2010) while others find positive links (e.g. Mason et al., 2020; Rao et al., 2002; Tavassoli, 2015).

Freel (2005, 2006) finds that the proportion of qualified scientists and engineers is significantly positively related to new product innovations in technology-based and knowledge-intensive small and medium-sized firms. Doran and Ryan (2014) show that there is a substantial heterogeneity in the importance of skills for different types of innovations. Incremental innovations for instance, benefit from good knowledge of mathematics and statistics.

Unfortunately, the studies mentioned above are difficult to compare because human capital (skills) is measured in different ways, such as the ratio of white-collar employees, employees with tertiary degrees, kind of degree, firm-specific training, or average wage and salary levels. Blue- and white-collar workers, for instance, relate to the actual occupation of the employee, something that could be, but is not necessarily, related to the education of the individual. Wage is another measure that likely follows the professional achievement or education to some extent. However, educational achievement is a more precise measure that relates directly to the schooling of the employees, independent of where in the organization they are active.

Nevertheless, the studies give some insights into how the relationship between the proportion of highly skilled employees and innovation expenditures might appear: these employees are expected to contribute more to know-how and apply external knowledge to the internal innovation process, there are essential complementarities between innovations and highly skilled employees, specific skills are crucial elements and a deficit of skills may hamper innovation activities. This leads to the expectation that there is a positive and significant association between highly skilled employees and innovation expenditures, where higher ICT skills are particularly important.

Surveys offering new measures of innovation activity such as innovation expenditures are conducted in many countries (Mairesse & Mohnen, 2010). However, given the complexity of the definition in the Eurostat Community Innovation Survey (CIS) (besides internal and external R&D it includes other associated costs for training, market research, marketing, licensing, investment in innovation, design and capital goods), the use of the concept is less widespread (Evangelista et al., 1997; Mairesse & Mohnen, 2010).

A large number of studies examine the determinants of R&D activities at the enterprise level (see Cohen, 1995; Cohen & Levin, 1989 for more comprehensive assessments). These expenditures are usually modeled as a function of technological possibilities (proximity to science, external sources of technical knowledge), firm size, age and appropriability conditions. The dependence of R&D activities on firm size is central in this research (Acs & Audretsch, 1988, 1990; Arvanitis, 1997; Lee & Sung, 2005; Ortega-Argilés et al., 2009; Expósito & Sanchis-Lopis, 2019). Early work by Schumpeter (1934) suggests that larger firms may have advantages in the innovation process because of a higher cash flow and potential economies of scale. Besides the main variables of interest (different kinds of highly skilled employees), aspects highlighted in the literature need to be controlled for in an empirical estimation. This includes firm size, age, external funding and industry affiliation.

Empirical findings on the relationship between R&D intensity and firm size are inconclusive. The seminal study by Cohen et al. (1987) shows that the importance of firm size is negligible, and that industry affiliation is far more crucial. Lee and Sung (2005) also reach the conclusion that firm size only explains a small part of the variation in R&D expenditures across firms. Other studies find a non-linear relationship between R&D intensity and firm size, where both the smallest and largest firms in the dataset have higher R&D intensity (Kumar & Aggarwal, 2005; Rogers, 2002). Few studies examine the role of firm size for innovation inputs using measures other than R&D expenditures. An exception is Acs and Audretsch (1988), who demonstrate that certain small manufacturing firms are highly innovative when measured by the number of innovations. Since innovation expenditures is a broader measure than R&D alone, it is expected to be less discriminating across size, indicating that advantages of scale may not exist.

Age of the firm is another relevant variable. Coad et al. (2016) demonstrate that the empirical literature is ambiguous on this issue, with both negative and positive relationships between firm age and innovation activities, although the theoretical standard view is that young firms are more innovative than old firms (Coad, 2018). García-Quevedo et al. (2014) find that older firms exhibit a more persistent and less erratic innovation behavior than younger ones. On the one hand, firms that have been in the market longer have more experience, but on the other hand, many young firms invest heavily in innovation activities. Research by Pellegrino and Piva (2020) highlights that small firms may be better at translating innovation input into product innovations in entrepreneurial services sectors, while older firms are proficient at creating new processes in routinized manufacturing sectors. Thus, given the heterogeneous picture painted of age in the literature, its relation to the innovation intensity is expected to be marginal.

Another aspect of interest for the innovation activities is the funding status. Governments provide significant subsidies for private R&D and innovation activities. Most studies find a positive relationship between the funding status and R&D activities (Almus & Czarnitzki, 2003; see García-Quevedo, 2004 for a meta-analysis) or funding status and innovation expenditures (Czarnitzki & Lopes-Bento, 2014). Catozzella and Vivarelli (2016) conclude that public funding increases innovation expenditures but does not necessarily lead to a higher innovation output. Given that the funding status and R&D intensity are correlated, it is expected that there is also a positive link between funding and innovation intensity measured as expenditures. However, the strength of the association might depend on the funding source.

Factors shaping industry evolution, including innovation opportunities, technical change, the role of institutions et cetera may vary across industries (Capone et al., 2019; Malerba & Orsenigo, 1997; Malerba et al., 2016). Because of this it is important to pay attention to these presumptive differences in empirical analyses.

Given available literature, highly ICT- (and generally-) skilled employees are expected to be of particular importance for the innovation intensity in firms when other aspects such as size, age, funding status and industry affiliation are controlled for.

3 Empirical approach

Based on the theoretical considerations in the conceptual section, the innovation intensity, expenditures over turnover, \((Inno_{it}/Y_{it})\), is specified as a function of highly skilled employees (ICT- or generally), firm size, firm age and funding status:

$$\ln\left(Inno_{it}/Y_{it}\right)=\beta_{0}+\beta_{1}HSICTpct_{it}+\beta_{2}HSGpct_{it}+\beta_{3}\ln Size_{it}+\beta_{4}Age_{it}+\beta_{5}Age_{it}^{2}+\sum_{F=1}^{3}\beta_{6F}Funding_{it}^{F}+\sum_{Y=1}^{3}\beta_{7Y}Dummy\_year_{it}^{Y}+\sum_{S=1}^{N}\beta_{8S}Industry_{it}^{S}+\varepsilon_{it},$$
(1)

where i denotes the firm, t the time period (t = 2004, 2006, 2008 and 2010), ln() is the natural logarithm, \(\beta_{0}\) is the constant, \(\varepsilon_{it}\) is the error term, which is assumed to be iid, and \(\beta_{1}\) to \(\beta_{8S}\) are parameters to be estimated. The skills variable HSICTpct represents the proportion of employees with tertiary ICT-oriented education and HSGpct the proportion with tertiary education in the remaining fields. For two countries, the distinction between these two variables cannot be made, and therefore the proportion of employees with any tertiary education, HSpct, is used instead. The variable Size represents the number of employees, Age denotes firm age and \(Age^{2}\) captures possible non-linear relationships. Funding includes a set of dummy variables reflecting different sources, with no funding as the reference category: (a) a combination of EU and national funding, (b) national funding and (c) EU funding. The equation also includes two-digit industry and year dummy variables. Estimations are performed by use of a robust regression method on the harmonized, multi-linked and pooled official firm-level data, available every second year, for each country separately (Finland, France, the Netherlands, Slovenia and the United Kingdom). In this approach, influential observations are given a lower weight depending on the size of their residuals (Huber, 1964).
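To illustrate the weighting principle behind the robust regression, the following minimal Python sketch applies iteratively reweighted least squares with Huber weights to a single-regressor version of Eq. (1). It is not the estimation routine used in the study: the synthetic data, the tuning constant c = 1.345, the scale estimate based on the median absolute deviation and all function names are assumptions made for illustration only.

```python
import statistics

def huber_weights(residuals, c=1.345):
    """Weight 1 for small scaled residuals, c/|u| for large ones (Huber)."""
    med = statistics.median(residuals)
    # Robust scale: median absolute deviation, rescaled for normal errors.
    mad = statistics.median([abs(r - med) for r in residuals]) / 0.6745
    scale = mad if mad > 0 else 1.0
    weights = []
    for r in residuals:
        u = abs(r / scale)
        weights.append(1.0 if u <= c else c / u)
    return weights

def weighted_line(x, y, w):
    """Weighted least squares fit of y = a + b * x; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b

def huber_fit(x, y, iterations=50):
    """Iteratively reweighted least squares with Huber weights."""
    w = [1.0] * len(x)
    for _ in range(iterations):
        a, b = weighted_line(x, y, w)
        residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        w = huber_weights(residuals)
    return a, b, w

# Synthetic data (assumption): skill share x, log intensity y = -4 + 2x + noise.
x = [i / 20 for i in range(20)]
y = [-4 + 2 * xi + 0.05 * (-1) ** i for i, xi in enumerate(x)]
y[-1] += 5.0  # two artificially inflated observations
y[-2] += 5.0

ols_a, ols_b = weighted_line(x, y, [1.0] * len(x))
rob_a, rob_b, weights = huber_fit(x, y)
print(f"OLS slope: {ols_b:.2f}, Huber slope: {rob_b:.2f}")
print(f"weights of inflated observations: {weights[-2]:.3f}, {weights[-1]:.3f}")
```

In this toy setting the two inflated observations receive weights below one, so the robust slope stays closer to the value used to generate the data than the ordinary least squares slope does.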

Due to specific sampling strategies in official statistics, intended to reduce the response burden of firms, the cross-sectional overlaps are sometimes small and the attrition over time is high. This implies that static fixed effects or dynamic panel data estimators are not applicable because they considerably reduce the size of the multi-linked unbalanced dataset. Another important aspect is that the within variation over time in innovation datasets might be small compared with the between variation due to a high degree of persistence (Peters, 2009; Raymond et al., 2009, 2010).

4 Data and descriptive statistics

The unique data at hand originate from the national and cross-country sets developed by the European Commission funded ESSLait project, including 14 European countries (Footnote 2). These extensive datasets, spanning most industries, consist of information collected from official business, trade and education registers, as well as from surveys on production (Structural Business Statistics), ICT usage and innovation activities in firms (Footnote 3). The Distributed Microdata Approach (DMD) was used to access and link the confidential firm-level data (Bartelsman, 2004; Bartelsman et al., 2018; Eurostat, 2008, 2013). This approach makes it possible to run a common protocol directly on identically organized firm-level datasets, harmonized over time and across countries. The time period studied includes a revision of the international industry classifications. By transforming observations classified in accordance with NACE Rev. 2 back to NACE Rev. 1.1, the break in the time series is bridged.

This study uses firm-level data for the period 2004–2010 from all underlying sources available except the ICT and trade statistics. Innovation data relate to four waves of the CIS (2004, 2006, 2008 and 2010). The estimation sample includes both manufacturing and service firms with more than ten employees. However, in the innovation survey it is not mandatory to include all service sectors, meaning that retail trade, hotels and restaurants might be covered in a less systematic way.

Multi-linking of datasets implies that the overlaps among different surveys or over time might vary, depending for instance on rotating sampling procedures used to reduce the response burden of firms. Fazio et al. (2006) show that estimated relationships are less affected than descriptive statistics by possible selection bias present in linked datasets. However, a common consequence of data linking is that the new merged datasets are smaller and do not always allow panel data methods and full identification because of the high risk of losing even more observations than in the linking procedure itself.
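The shrinking overlap can be made concrete with a small Python sketch. The firm identifiers and coverage patterns below are invented for illustration and do not describe the ESSLait data; the helper name `linked_sample` is likewise hypothetical. The point is only that every additional source or survey wave intersected with the sample can keep or reduce, but never increase, the number of linked firms.

```python
from functools import reduce

def linked_sample(*sources):
    """Firms observed in every source (inner join on the firm identifier)."""
    return reduce(set.intersection, (set(s) for s in sources))

# Hypothetical coverage of 20 firms across sources (illustrative only).
register = {f"f{i:02d}" for i in range(1, 21)}      # business register: all firms
education = {f"f{i:02d}" for i in range(1, 17)}     # education link exists for 16
cis_2008 = {f"f{i:02d}" for i in range(1, 20, 2)}   # rotating sample, wave 2008
cis_2010 = {f"f{i:02d}" for i in range(1, 20, 3)}   # rotating sample, wave 2010

# Cross-section: firms found in the register, the education data and one wave.
cross_section = linked_sample(register, education, cis_2008)

# Panel: firms that additionally appear in the next survey wave.
panel = linked_sample(cross_section, cis_2010)

print(len(register), len(cross_section), len(panel))
```

With these made-up coverage patterns, 20 registered firms shrink to 8 in the linked cross-section and to 3 when two survey waves are required, which mirrors why panel estimators become impractical on multi-linked samples.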

Joint data on innovation expenditures, highly skilled employees and funding status are only available for five out of the fourteen countries in the dataset. There are two explanations behind this: in some countries linking of education statistics to firm-level data is not possible and in others the voluntary parts of the Community Innovation Survey are not attended to. As a consequence of the absent information on funding (Denmark, Norway and Sweden), the initial group of eight countries with information on formal educational achievement is reduced to five.

The dependent variable in this study is the ratio of innovation expenditures to turnover in its natural logarithm. Turnover is defined as the value of production in firms including purchases of intermediate goods and services. This measure of innovation intensity is broader than that of internal and purchased R&D considered jointly, since it also includes expenditures related to acquisition of machinery, equipment, software and external knowledge in accordance with the Oslo Manual (OECD/Eurostat 2018). In addition, employee training as well as marketing and engineering of new products and processes are regarded as innovation inputs.

The proportions of employees highly skilled in ICT (tertiary education in mathematics, physics, engineering or information technology) or in other fields are identified by use of two-digit ISCED codes (International Standard Classification of Education 1997) (Footnote 4). This definition has similarities with, but is not identical to, that of STEM, as used by for instance Peri et al. (2015). Chemistry is not included, and there is no relation to the actual occupation, only to the formal skills of the employees. For three of the countries a breakdown by kind of orientation is possible (Finland, France and the United Kingdom). Size of firm is measured as the number of full-time employees or head counts. Vintage is represented by firm age and age squared, the latter indicating a possible non-linear relationship.

The underlying CIS dataset encompasses four categories of funding: (i) local or regional authorities, (ii) central government (including agencies and ministries), (iii) the European Union (EU) and (iv) Framework Programmes for Research and Technical Development (a subset of EU funding). In this study, regional and national funding are combined into one dummy variable; the remaining two dummy variables capture EU funding in combination with national funding and sole EU funding. The number of observations in each country dataset ranges from 2,990 for Finland to 12,000 for France, 34,286 in all. A correlation table for the continuous independent variables reveals that they are not strongly related to each other (Appendix).
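The recoding of the four CIS funding categories into the three dummy variables can be sketched as follows. The function and argument names are hypothetical, and the treatment of firms that report Framework Programme funding without ticking the general EU category is an assumption (they are counted as EU funded here, since the Framework Programmes are described as a subset of EU funding).

```python
def funding_category(regional, central, eu, framework=False):
    """Collapse the four CIS funding indicators into one of four categories."""
    national = regional or central   # regional and central funding are combined
    european = eu or framework       # assumption: framework implies EU funding
    if national and european:
        return "joint"
    if national:
        return "national"
    if european:
        return "eu"
    return "none"                    # reference category: no public funding

def funding_dummies(regional, central, eu, framework=False):
    """Return the three 0/1 dummy variables; all zero means no funding."""
    category = funding_category(regional, central, eu, framework)
    return {
        "funding_joint": int(category == "joint"),
        "funding_national": int(category == "national"),
        "funding_eu": int(category == "eu"),
    }

# Example: a firm with regional support and an EU grant falls in the joint group.
example = funding_dummies(regional=True, central=False, eu=True)
print(example)
```

Because the categories are mutually exclusive, at most one dummy equals one for any firm, and a firm without any public support has all three set to zero.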

Descriptive statistics show that the innovation intensity varies between 5 and 12 per cent across countries, with no clear trend over time, possibly related to long-term programs (Table 1). The proportion of highly skilled employees extends between 12 per cent in Slovenia and 26 per cent in Finland (measured as means), and there is a clear surge over time. Finland has also by far the largest group of ICT-skilled employees (14 per cent). Funding status is another variable with certain variation across countries, but domestic grants constitute a larger share than international ones everywhere.

Table 1 Descriptive statistics

5 Empirical results

Pooled robust estimations show a strong and significantly positive relationship between the innovation expenditures ratio and highly skilled employees (Table 2). The proportion with ICT-oriented skills is positive and significant at the one per cent level in all three countries where these data are available (Finland, France and the United Kingdom). Generally oriented higher degrees are also important for the innovation activities, except in the United Kingdom. Corresponding significant and positive results based on all highly skilled employees are found in the Netherlands, but for Slovenian firms the relationship is negative. The latter result is a clear outlier that hypothetically could hide sub-groups of employees with positive links. There is also a possibility that the results are related to the deep downturn Slovenia experienced during the period of time studied.

Table 2 A and B Association between highly skilled employees and innovation intensity

The magnitude of the association with higher ICT skills is larger in France and the United Kingdom than in Finland, feasibly related to the clear difference in the underlying levels of these skills across the countries (by far highest in Finland). A one percentage point increase in the proportion of employees with higher ICT skills (say from two to three per cent) is associated with an increase of roughly 1.6 percentage points in the ratio of innovation expenditures (from 1.9 to 3.5 per cent in France and from 0.8 to 2.3 per cent in the United Kingdom).

Given the deficit of studies using identical measures of innovation activities in firms, including information on formal educational achievement and field of orientation, comparisons with earlier research are challenging to perform. One exception to this is Archibugi et al. (2013a), who find a clear relationship between employees with tertiary degrees and innovation activities based on United Kingdom firm-level data for partly the same period of time as present analysis. Although this innovation measure is based on the same variable, it is used as the change over time rather than a ratio to turnover, so a direct comparison is not possible.

Size has a significantly negative sign for all five countries, indicating that the smallest companies spend disproportionately on innovation activities. This distinct result partly contradicts the literature on drivers of R&D-measured innovation activities, where size appears in several guises, for instance as non-significant or as important only for the firms in the tails of the size distribution (Cohen et al., 1989; Rogers, 2002).

The relationship between the innovation intensity and age follows the disparity exhibited in the literature. Results for Finland and France reveal that it declines in a non-linear shape, while in the other countries age is not significantly related to the innovation expenditures at all. The variability in results could stem from an undetected change in the pattern between innovation expenditures and age during the period of time studied, which covers the economic and financial crisis (Archibugi et al., 2013a, 2013b).

As expected, innovation funding is positively and significantly linked to innovation intensity. The strongest association appears for joint public funding from national governments and EU programs, although national funding is also significant at the one per cent level. In the case of joint innovation funding, the coefficients range between 0.96 and 1.52, while the span is narrower for pure national funding (0.58–0.84). This means that the innovation intensity in jointly supported firms is between 1.5 and 3.5 times higher than in firms without funding (Footnote 5). Similar calculations for national funding reveal an innovation intensity 0.8–1.3 times higher. The innovation intensity varies markedly among industries, with firms in medium- and high-tech manufacturing as well as business and computer services being most active, although this pattern cannot be observed for the United Kingdom. This suggests that, beyond firm-specific factors, the kind of business is of high importance for the innovation intensity, as highlighted by Malerba and Orsenigo (1997).
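A common way to translate a dummy coefficient β from a log-linear specification such as Eq. (1) into a proportional difference is exp(β) − 1. Whether this is exactly the calculation behind the figures reported above is an assumption, since the accompanying footnote is not reproduced in this section; the short Python sketch below merely applies that transformation to the coefficient ranges quoted in the text. The function name is hypothetical.

```python
import math

def relative_effect(beta):
    """Proportional difference implied by a dummy coefficient in a log-linear model."""
    return math.exp(beta) - 1.0

# Coefficient ranges quoted in the text for the funding dummies.
coefficients = {
    "joint funding (low)": 0.96,
    "joint funding (high)": 1.52,
    "national funding (low)": 0.58,
    "national funding (high)": 0.84,
}

effects = {label: relative_effect(beta) for label, beta in coefficients.items()}
for label, effect in effects.items():
    print(f"{label}: intensity higher by a factor of {effect:.2f}")
```

Under this transformation the national funding coefficients give roughly 0.8 and 1.3, in line with the text, while the joint funding coefficients give roughly 1.6 and 3.6, close to the reported range; the small gap may reflect rounding of the published coefficients.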

As a robustness check, separate estimations are performed for a sub-sample of firms in the electrical equipment and telecommunications industries, that is, the ICT producing firms. This group has far more employees with higher ICT skills than other firms. Results reveal that the proportion of employees with higher ICT skills is significant at the one per cent level in both Finland and the United Kingdom, with larger coefficients than for the firms in the total dataset (Table 3).

Table 3 Association between highly skilled employees and innovation intensity in electronic equipment and telecommunications firms

ICT-skilled employees are also more relevant for the innovation intensity than other orientations of skills. In France, the proportion of employees with ICT skills is significant at the 10 per cent level. Results for the Netherlands and Slovenia, where only the total proportion of employees with higher skills is available, coincide with those of the total dataset, except that the higher skills variable for Slovenia is positive at the level of 10 per cent. The control variables follow the same patterns as in the baseline estimations for all five countries.

6 Conclusion

In this study a first attempt is made to shed more light on specific formally achieved higher skills as a determinant of innovation intensity in firms, measured as expenditures. The approach is based on novel, harmonized and multi-linked firm-level datasets for five European countries. Robust estimations reveal that there is a significant and positive relationship between innovation expenditures and highly skilled employees, defined in accordance with international ISCED standards for higher education, especially so for the ICT orientation.

The control variables size, industry affiliation and funding status are also clearly significant, while the role of firm age varies. Innovation intensity significantly declines with firm size, while joint national and EU funding is undoubtedly relevant. The absence of advantages of scale contradicts results in the literature that uses a narrower measure of innovation intensity, one that commonly discriminates against activities by smaller firms. Industry affiliation turns out to be of high importance, since the set of two-digit dummy variables is significant even after controlling for the firm-level characteristics. Estimations on the sub-sample of firms in the electronic equipment and telecommunications industries confirm the results of the baseline model, but with partly stronger associations.

Some limitations need to be considered. The linked datasets do not allow the use of panel data methods. The reason for this is small cross-sectional overlaps and a high rotation of firms over time in the underlying Community Innovation Survey, which is intended to reduce the response burden of firms. This implies that attempts to consider unobservable firm effects lead to an even higher loss of observations and difficulties in finding general patterns. Yet, this shortcoming is partly mitigated by the small within variation over time due to persistence in innovation activities. Due to the data deficiencies, no general conclusions are drawn about the causality of the established relationship between the innovation intensity and highly skilled employees, although most studies emphasize that it mainly goes from skills to innovations. It is likely that there are unmeasurable factors, such as firm strategies, that relate to firms' innovation activities and their requirement for specific skills. However, such information is difficult to deduce from official statistics and therefore needs to be collected from alternative sources.

There are several practical implications arising from this study. Firstly, it shows that the choice of innovation measure may discriminate against certain kinds of firms and thus needs to be carefully considered in connection with analytical work. Secondly, the results emphasize the importance of higher skills in the innovation process, specifically ICT-oriented ones. Thirdly, linking of different microdata sources offers vast opportunities to explore and better understand firm behavior, but before robust new insights can be gained, there are several challenges for future research and statistics producers that need to be met. Particularly important aspects are the data access and the sampling design for specific surveys. In Europe, the opportunity to access and link firm-level data varies across countries, and building a cross-country microdata set is still a long way off. Researchers and the statistical offices should intensify their efforts to provide access to confidential, interrelated firm-level data.