ICT infrastructure in firms and online sales

This study investigates whether the underlying information and communication technology (ICT) infrastructure of firms affects the development of online sales, based on a novel micro-aggregated panel dataset encompassing a large group of European countries. The dataset includes continuous measures of online sales activities, as well as of standard production function input variables. Dynamic System GMM estimations show positive and significant associations between the proportion of firms selling online and the ICT infrastructure, measured as the proportion of broadband internet connected employees. The magnitude of the effect is stronger in the group of manufacturing than in the service firms, but in both industries, there is a threshold beyond which the positive effects of the infrastructure diminish. In addition, there is evidence that improvements in the infrastructure lead to a stronger effect on medium-sized and large rather than on small firms.


Introduction
Early prophecies of the digital economy predicted a rapid global expansion of e-commerce and high valuations of firms engaged in this activity [23,54,55]. Indeed, in certain businesses such as book retailers, music retailers and travel agencies, traditional sales channels hardly exist any longer [29]. Despite this, the overall development of online sales is slow, at least in Europe. In 2019, one out of five firms with more than ten employees engaged in these activities, although the spread across the 27 European Union member countries ranges from ten to 40 per cent (Source: Eurostat Data Browser ISOC_EC_ESELN2). This stands in contrast to a much higher proportion of firms that purchase online and a steady increase in consumer globals, which in contrast to other firms do not internationalise incrementally, but compete globally from inception, and are thus highly likely to depend on digital sales channels from the start [3,56]. These firms tend to be small and rely on specific technologies, knowledge or innovations [8,50].
Sila [52] concludes that the empirical literature on e-commerce adoption is often based on small samples of cross-sectional data, and reports contradictory results for the determinants included, such as size of firm. This study contributes to both a deeper and a more general understanding of ICT infrastructure as one of the important determinants of the proportion of firms that sell online. The understanding is deeper because the underlying Micro Moments Database (MMD) encompasses continuous variables in dimensions not previously available, such as size-class, industry and ICT intensity, and broader due to the large country coverage and longitudinal data [5]. These data characteristics also mean that the estimation approaches can deviate from the commonly used logistic regression based on cross-sectional data, and take into account persistence over time as well as the magnitude of different explanatory variables (see for instance [9,33,46,49]).
The study proceeds as follows: The next section covers the conceptual background and the empirical approach. A description of the dataset and some stylised facts ensues. This is followed by a presentation of the estimation results and some concluding remarks.

Conceptual background and empirical approach
According to theory, diffusion is the process by which an idea or innovation is communicated through different channels among participants in a social system over time [51]. The different stages of this process are recognised as early adopters, early majority, late majority and laggards. OECD [44] employs a refined approach for ICT innovations in firms which identifies three stages: (i) readiness, (ii) intensity and (iii) impact (Fig. 1). Readiness relates to the ability of a firm to adopt an ICT innovation, while intensity (or use) measures the proportion of firms that adopt and the extent of use. Impact relates to changes in behaviour, economic structure or performance as a result of use. Firms across countries, industries and size-classes may operate at different stages of diffusion, implying that this could also be reflected in their ability to use advanced applications dependent on an underlying ICT innovation.
Some studies concur with the definition of e-commerce as being an innovation [7,9,33,52,64] while others rather consider it an advanced computer network application that complements the underlying infrastructure [15,17,21,22,24].
A joint feature of the research into factors influencing the adoption of e-commerce in firms is the importance of the organisational readiness, often expressed as the underlying ICT infrastructure, technology or skills [18,19,33,35,38,39,46,52,53,61,63].
In this case, the underlying ICT infrastructure is considered to be the innovation, approximated by the proportion of employees connected to the internet with a minimum broadband speed. This variable encompasses both aspects of the human capital and the underlying technological infrastructure of the firm [4] and is expected to relate significantly and positively to the proportion of firms engaged in online sales. This leads to the formulation of the first hypothesis:

Hypothesis 1
There is a significant and positive association between the underlying ICT infrastructure and the proportion of firms with online sales.
Another aspect of importance is that adoption may differ among firms of different sizes and sectors [6,12,33,52], something that lies behind the formulation of the second and the third hypotheses:

Hypothesis 2
The magnitude of the significant and positive association between the underlying ICT infrastructure and the proportion of firms with online sales differs across industry sectors.

Hypothesis 3
The magnitude of the significant and positive association between the underlying ICT infrastructure and the proportion of firms with online sales differs across size-classes.
Empirical analyses of the probability that firms adopt e-commerce typically employ probit or logit models based on binary dependent variables [6,9,20,33,46,49,61,64]. Similar kinds of variables are also often used to reflect the underlying ICT infrastructure in firms, for instance the number of ICT elements in use [6,33].
ICT impacts, on the other hand, are commonly estimated by use of augmented Cobb Douglas production functions including the inputs labour, capital and materials [14]. This approach allows an interpretation of the extent to which the different inputs relate to output. Several quantifiable variables of importance suggested in the adoption literature coincide with those typically included in a production function.
Given the availability of data on these elements, together with measurable independent and dependent ICT variables, the present study employs an approach that mirrors the augmented production function, where the proportion of firms selling online (AESELL) is the output and the proportion of broadband internet connected employees (BROADpct) is the infrastructure variable. The standard input factors are represented by capital (K), labour (L) and materials (M). Oliveira and Martins [46] highlight the importance of technological readiness, which may be difficult to uphold without suitable investments, captured here by the capital variable. Labour reflects the size of the firm, and materials reflect the purchases of inputs such as components and services by external experts [9,20,32]. Persistence is captured by the inclusion of the lagged dependent variable. The infrastructure variable is also allowed to enter with a lag, to capture time-delayed reactions.
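The lagged and squared terms described above can be built from the panel before estimation. The following is a minimal sketch in plain Python, assuming one record per country, industry and year; the field names `AESELL_lag1`, `BROADpct_lag1` and `BROADpct_sq`, the helper `add_dynamic_terms` and the toy values are hypothetical and not taken from the Micro Moments Database.

```python
from collections import defaultdict


def add_dynamic_terms(rows):
    """Return copies of the rows with lagged and squared regressors added.

    Lags are taken within each (country, industry) unit so that values
    never leak from one panel unit into another.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[(row["country"], row["industry"])].append(dict(row))

    prepared = []
    for unit_rows in groups.values():
        unit_rows.sort(key=lambda r: r["year"])
        previous = None
        for row in unit_rows:
            # A lag is only defined when the preceding record is the previous year.
            has_lag = previous is not None and previous["year"] == row["year"] - 1
            row["AESELL_lag1"] = previous["AESELL"] if has_lag else None
            row["BROADpct_lag1"] = previous["BROADpct"] if has_lag else None
            row["BROADpct_sq"] = row["BROADpct"] ** 2
            prepared.append(row)
            previous = row
    return prepared


# Toy panel with made-up values, two countries and one industry.
toy_panel = [
    {"country": "SE", "industry": "15", "year": 2004, "AESELL": 0.12, "BROADpct": 0.40},
    {"country": "SE", "industry": "15", "year": 2003, "AESELL": 0.10, "BROADpct": 0.30},
    {"country": "NL", "industry": "15", "year": 2003, "AESELL": 0.20, "BROADpct": 0.35},
    {"country": "NL", "industry": "15", "year": 2004, "AESELL": 0.22, "BROADpct": 0.45},
]
prepared = add_dynamic_terms(toy_panel)
```

The first year of each unit has no lag and is marked with `None`, which is why a dynamic specification loses one year of observations per unit.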
The dataset is organised in two alternative ways, allowing dynamic estimations for two-digit industries (Eq. 1a) as well as for size-classes and broad industry groups (Eq. 1b). In Eq. 1a, i indexes 26 two-digit industries, c denotes country and t denotes year (2003-2010); time effects are included. Specification 1b adds the dimension j for four size-classes and narrows the industries i down to six broad groups. Fixed effects at the industry-country level (indexed ic) in Eq. 1a and at the size-class-industry-country level (indexed ijc) in Eq. 1b hold industry, country and size-class constant. The squared term of the broadband variable (BROADpct²) is included to investigate possible thresholds (non-linearities) beyond which the presumptive impact of the ICT infrastructure changes character (see for instance [17]). Based on the literature, the direction of causality is assumed to run from the ICT infrastructure to online sales adoption. Nevertheless, endogeneity might arise from unobservable factors affecting both the ICT infrastructure and online sales. Thus, to account for a possible correlation between broadband internet connected employees and the error term, the System GMM panel data estimator [10] is used, where BROADpct is treated as predetermined and the groups refer to size-class, industry and country combinations. This estimator is particularly suitable for datasets with a large number of cross-sectional units and a relatively narrow time frame, as is the case here.
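Eq. 1a and 1b themselves are not legible in this version of the text. A plausible form, reconstructed only from the variable definitions given in this section, is sketched below. The coefficient symbols, the time effects written as \lambda_t, the fixed effects written as \mu_{ic} and \tilde{\mu}_{ijc}, the error term \varepsilon, the single lag of BROADpct and the entry of K, L and M in levels are all assumptions and may differ from the author's notation and exact specification.

```latex
\begin{align}
\mathit{AESELL}_{ict} &= \alpha\,\mathit{AESELL}_{ic,t-1}
  + \beta_1\,\mathit{BROADpct}_{ict}
  + \beta_2\,\mathit{BROADpct}_{ic,t-1}
  + \beta_3\,\mathit{BROADpct}_{ict}^{2} \notag\\
 &\quad + \gamma_1 K_{ict} + \gamma_2 L_{ict} + \gamma_3 M_{ict}
  + \lambda_t + \mu_{ic} + \varepsilon_{ict} \tag{1a}\\[4pt]
\mathit{AESELL}_{ijct} &= \alpha\,\mathit{AESELL}_{ijc,t-1}
  + \beta_1\,\mathit{BROADpct}_{ijct}
  + \beta_2\,\mathit{BROADpct}_{ijc,t-1}
  + \beta_3\,\mathit{BROADpct}_{ijct}^{2} \notag\\
 &\quad + \gamma_1 K_{ijct} + \gamma_2 L_{ijct} + \gamma_3 M_{ijct}
  + \lambda_t + \tilde{\mu}_{ijc} + \varepsilon_{ijct} \tag{1b}
\end{align}
```

Under this reading, \alpha measures persistence, \beta_1 and \beta_2 the contemporaneous and delayed infrastructure associations, and a negative \beta_3 would indicate a threshold beyond which the association weakens.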

Data and stylised facts
Data for this analysis stem from the ESSLait Micro Moments Database, available at Eurostat Safe Centre [5]. This database holds linked and micro-aggregated information on firms with ten employees or more originating from the national statistical offices in 14 European countries. Due to laws on disclosure of firm-level data from official sources, there are few opportunities to merge such information into one single international database. A way to partly circumvent this limitation is to work closely with statistical offices that agree to build harmonised datasets that can be micro-aggregated to a level higher than the firm but lower than broad industry groups.
Information in the Micro Moments Database covers several underlying sources: registers on business, trade and education as well as surveys on production, ICT usage and innovation activities (CIS) in firms for the years 2001-2010. Data are reported for the two-digit industry level, for the EUKLEMS alternative hierarchy (broad industry groups including ICT producers) and in several other dimensions such as size-class, age class, ICT intensity, innovation activity, ownership, affiliation and international experience. An overview of the industry classification is presented in Table 4, Appendix 1.
The ICT infrastructure of firms is approximated by the proportion of broadband internet connected employees (BROADpct), a composite variable reflecting the degree of connectivity among employees within and across firms. This continuous variable is derived from information on the number of employees with broadband internet access of a certain minimum speed. It is regarded as more sophisticated than categorical and other commonly used basic ICT infrastructure measures relating to the firm (as suggested by, for instance, [6,17,33]), which do not take employee abilities into account in the same way as the BROADpct variable [4].
Another advantage of the broadband variable is that it does not reach saturation (despite an apparent increase) during the period studied, which is otherwise not uncommon for fast developing technologies. The capital variable (K) is based on either stock or book values, employment (L) measures the number of employees and materials (M) is defined as the gross value of production minus value added and goods for re-sale. Nominal values have been deflated by EUKLEMS or WIOD indexes.
The proportion of firms selling online increases slightly over the period studied, while the infrastructure variable rises much more strongly (Fig. 2). Close to a third of the firms engage in online sales, with a slight advantage in manufacturing, while the share of turnover relating to online sales is markedly lower in both manufacturing and service firms, fourteen and ten per cent, respectively, in 2010 (Table 1). More than every second employee has a broadband internet connection of a certain speed. In service firms, this human capital related ICT infrastructure occurs even more frequently than in manufacturing.
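The variable definitions given above amount to simple arithmetic, sketched here under stated assumptions: BROADpct is read as broadband connected employees divided by all employees, materials as gross production less both value added and goods for re-sale, and deflation as division by a price index with base 100. The function names and the numbers are hypothetical, not values from the database.

```python
def broadband_share(broadband_employees, total_employees):
    """Proportion of employees with a broadband internet connection."""
    if total_employees <= 0:
        raise ValueError("total_employees must be positive")
    return broadband_employees / total_employees


def materials(gross_production, value_added, goods_for_resale):
    """Materials as gross production less value added and goods for re-sale."""
    return gross_production - value_added - goods_for_resale


def deflate(nominal_value, price_index, base=100.0):
    """Convert a nominal value to constant prices with an index (base = 100)."""
    if price_index <= 0:
        raise ValueError("price_index must be positive")
    return nominal_value * base / price_index


# Made-up figures for one industry cell.
share = broadband_share(broadband_employees=55, total_employees=100)
nominal_materials = materials(gross_production=1000.0, value_added=400.0,
                              goods_for_resale=150.0)
real_materials = deflate(nominal_materials, price_index=112.5)
```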
As suggested in the literature, the adoption of e-sales applications may vary not only among industries but also across size-classes (Fig. 3). In this case, large firms engage in online sales more than twice as often as small firms. No such systematic difference can be found for the ICT infrastructure variable, which is generally at a higher level of intensity than online sales. All firms are also much more active in online purchases (AEBUY) than in online sales, possibly indicating that purchasing is a far less complex and costly activity.

Results and discussion
The dynamic System GMM estimations disclose that there is a significant and positive association between the underlying ICT infrastructure, approximated by the proportion of broadband internet connected employees, and the proportion of firms selling online in the group of countries studied (Table 2, Specification (i)). This result implies that Hypothesis 1 cannot be rejected. The same is valid for Hypothesis 2, since the magnitude of the coefficients is larger for the manufacturing firms, which have less experience of this infrastructure, than for the service firms.
Not only the contemporaneous level of broadband connected employees but also its lagged term relates positively to the proportion of firms selling online (Table 2, Specification (ii)), although the former is stronger. In addition, the estimations reveal that online sales are more persistent, or path-dependent, in the manufacturing than in the service industry, implying that the proportion of firms selling online in earlier years influences the present adoption. This could also mean that the service firms are more flexible and quicker to adjust.
Table notes: Asterisks ***, ** and * denote significance at the 1, 5 and 10 per cent levels, respectively. The two-step GMM estimator based on the Windmeijer correction for small samples and robust standard errors is used. The variable broadband internet connected employees is treated as predetermined. In addition, the Hansen J-test supports the validity of the instruments at the one per cent significance level in all cases (p-value) and the AR (2)
Given the strong growth of the infrastructure variable, a possible significant and positive effect might change character after a certain threshold. In this case, the estimates of the squared broadband term relate significantly negatively to the proportion of firms selling online, somewhat more strongly in manufacturing than in service firms, indicating a non-linear relationship where the effect of the infrastructure variable decreases with higher usage. In order to give an idea of the magnitude of the relationship, the marginal effect is calculated, based on the proportion of broadband connected employees. This uncovers that, with a proportion of broadband connected employees of 25 per cent, an increase by ten percentage points gives a marginal effect of three percentage points for service and four percentage points for manufacturing firms (Fig. 4). The calculation also reveals that beyond a threshold of 50 per cent in manufacturing firms and 45 per cent in service firms, the short run ICT infrastructure boost to the proportion of firms engaged in online sales disappears. This aligns with the law of diminishing marginal utility, implying that when a certain level of the infrastructure is reached this is no longer a factor of importance for the choice to sell online.
Further, the results uncover that the standard production factors are not significantly different from zero or show the wrong sign. Several explanations may lie behind this result. One of them could relate to the aggregation level of results.
Another possibility is that the negative effects of labour are attributed to a reverse job-saving effect. ICT is sometimes expected to save jobs, especially in the short run (see for instance [59]). A decrease in the proportion of firms that sell online could follow from investments that require more physical staff, for instance in the super warehouses that compete with online sellers, as suggested by Hortaçsu and Syverson [34]. Alternatively, strategies to raise investments, use additional intermediates or hire new employees without paying attention to their specific quality or skills do not necessarily stimulate an increase in the proportion of firms that sell online.
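The marginal-effect calculation for the squared broadband term discussed above can be illustrated as follows. This is a sketch only: it assumes the short-run effect is the derivative of a quadratic in BROADpct, and the coefficients are hypothetical values back-solved so that they reproduce the figures quoted in the text (three and four percentage points at a 25 per cent share, thresholds of 45 and 50 per cent). They are not the estimates reported in Table 2.

```python
def marginal_effect(beta_linear, beta_squared, broadband_share):
    """Derivative of b1*x + b2*x**2 with respect to x, evaluated at x."""
    return beta_linear + 2.0 * beta_squared * broadband_share


def turning_point(beta_linear, beta_squared):
    """Broadband share at which the marginal effect equals zero."""
    if beta_squared == 0:
        raise ValueError("no turning point without a squared term")
    return -beta_linear / (2.0 * beta_squared)


# Hypothetical coefficients (linear, squared), NOT the paper's estimates.
HYPOTHETICAL_COEFFICIENTS = {
    "manufacturing": (0.8, -0.8),
    "services": (0.675, -0.75),
}

summary = {}
for sector, (b1, b2) in HYPOTHETICAL_COEFFICIENTS.items():
    effect_at_25 = marginal_effect(b1, b2, 0.25)
    # A rise of 0.10 in the share, expressed in percentage points of AESELL.
    change_pp = effect_at_25 * 0.10 * 100
    summary[sector] = (change_pp, turning_point(b1, b2))
    print(f"{sector}: {change_pp:.1f} pp per ten points, "
          f"threshold at {turning_point(b1, b2):.0%}")
```

With a negative squared coefficient the marginal effect falls linearly in the broadband share, so the threshold is simply the share at which the derivative crosses zero.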
Turning to the estimations by size-class, additional details are exposed. The relationship between online sales activities and the ICT infrastructure does vary across size-classes as stated in Hypothesis 3 (Table 3). It is generally more pronounced for the groups of large and medium-sized firms. This is somewhat unexpected since these firms are already more intensive users of online sales applications and there is no major difference in the level of broadband connected employees among size-classes. The path dependency is also stronger for the largest firms. Possibly, other factors such as kind of products or services, supply chain management [37], advantages of scale, sales volumes (units) and client categories (businesses, consumers or governments) may be of importance for the choice of sales channels, although information on all this is not available in the dataset at hand.
The results verify earlier firm-level research in that the underlying infrastructure is of importance for the adoption and diffusion of e-commerce and that it might vary across firm size [6,12,33]. However, the evidence in this study is both more specific and more general: specific because the panel dataset holds clearly defined continuous ICT variables in several dimensions that allow comparisons of magnitudes, and general due to data on the representative firm by industry and size-class for twelve European countries.
Several robustness checks are undertaken. To investigate whether the model is mis-specified, the estimations are also carried out with the more traditional production variable turnover of online sales, despite its quality shortcomings. These estimations (available upon request) reveal a pattern similar to that of the main results, but with smaller magnitudes and weaker significances. Employment, capital and materials turn out equally unimportant. An implication of these results is that during the period of time studied, the proportion of firms selling online could be interpreted as an approximation of the extent to which firms sell online, since both variables develop similarly but on different levels.
Given the possibility of self-selection into online sales by reasons of efficiency or competitiveness [32,42], for instance, estimations are also performed with the ICT infrastructure instrumented by the level of labour productivity in groups of firms. Although labour productivity would be directly related to the extent of sales online, the main dependent variable reflecting the proportion of firms selling online is not expected to exhibit the same association. The average labour productivity in constant prices over time is Euro 59,000 for the manufacturing firms and Euro 76,000 for the services firms (Source: Micro Moments Database). The high amount for services reflects an overrepresentation of business services in the dataset. The estimations reveal that the infrastructure variable is still significant for both services and manufacturing firms (results available upon request).
Despite the use of System GMM for the estimations, the possibility of reverse causality cannot be fully ruled out. Because of this, the main specifications for manufacturing and service firms are estimated with the order of the two ICT variables reversed, ceteris paribus. This renders significant results for the manufacturing but not for the service firms (Table 6, Appendix 1). Since the variables are scaled differently, a one standard deviation change is calculated to investigate which effect dominates. The standard deviations are 0.4 and 0.3 for BROADpct and AESELL, respectively. This implies that the impact on online sales is 40 per cent and that the reverse effect is 4 per cent. Thus, the overall causality goes from infrastructure to online sales.
To verify that the comparisons of estimates across different sub-groups of firms are statistically valid, the 95 per cent confidence intervals for the BROADpct variable are plotted (Figure 5, Appendix 1). These intervals overlap to a certain extent, implying that the sub-groups are not significantly different from each other and thus that the comparison of the magnitudes of the point estimates is valid.
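The interval comparison described above can be sketched in a few lines. The example assumes normal-approximation intervals (estimate plus or minus 1.96 standard errors); the point estimates and standard errors are hypothetical and do not come from Table 2, Table 3 or Figure 5.

```python
def confidence_interval(estimate, std_error, z=1.96):
    """Two-sided interval under a normal approximation (z = 1.96 for 95%)."""
    half_width = z * std_error
    return (estimate - half_width, estimate + half_width)


def intervals_overlap(first, second):
    """True when two (lower, upper) intervals share at least one point."""
    return first[0] <= second[1] and second[0] <= first[1]


# Hypothetical BROADpct estimates and standard errors for two sub-groups.
manufacturing_ci = confidence_interval(estimate=0.80, std_error=0.20)
services_ci = confidence_interval(estimate=0.675, std_error=0.15)
overlap = intervals_overlap(manufacturing_ci, services_ci)
```

Checking for overlap is an informal device; a formal comparison would test the difference between the two coefficients directly.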

Concluding remarks
By use of a novel micro-aggregated dataset encompassing a large group of European countries, and by departing from the standard approach to use binary variables for adoption of ICT, this study investigates the importance of the underlying ICT infrastructure of firms for the extent to which they engage in online sales activities. Online sales activities are with few exceptions (travel agencies and book retailers, for instance), still not widespread among firms. However, large firms more routinely use this sales channel. Dynamic System GMM estimations based on a specification that mirrors a production function show positive and significant relationships between the proportion of firms selling online and their underlying ICT infrastructures, measured as the proportion of broadband internet connected employees. The effect is stronger for manufacturing firms and more persistent over time than for service firms. However, there are indications that the possible boost diminishes after the infrastructure reaches a certain threshold.
When the size-classes are estimated separately, the results reveal that the group of large firms, already more experienced in online sales activities, benefits the most from an improved ICT infrastructure. This could relate to advantages of scale, supply chain management or to the kind of production. The standard production factors capital, labour and materials do not show significant and positive links to the proportion of firms selling online. Possibly, this means that specific rather than general skills and quality of the inputs are needed to stimulate firms to sell online. Instead, the results point to the importance of both human capital and technology for the online sales activities. Resistance to online sales may also indicate that a certain amount of goods and services is less suitable for this channel, such as fresh produce and items that typically require physical inspection before purchase.
Although novel variables, broad country and industry coverage, time series spanning over several years as well as an alternative estimation approach allow both more specific and general conclusions, the study has some limitations. It does not give insights into the exact behaviour of firms or the level of online sales that is adopted. Due to data quality issues, the proportion of firms engaged in online sales is used as the main dependent variable, while the more traditional, turnover based one only occurs in the robustness check. This check leads to similar patterns of the results, although with smaller magnitudes and weaker significances. Because of this and due to the fact that these two online sales variables exhibit almost identical trends during shorter periods of time without changes in definitions and coverage, the underlying infrastructure is expected to associate also with the extent of sales online. Just like in firm-level analyses, the data themselves affect the econometric approach. In this case, the two-digit NACE rev. 1.1 classification is more thorough for manufacturing firms, making the panel data approach in lags and levels extra sensitive in the case of the service firms.
There are several avenues for future research: one is to include more countries, update the dataset and prolong the period of time studied. This would allow an investigation into how the outbreak of the Covid-19 pandemic in 2020 affected the proportion of firms that engage in online sales. Another possibility is to investigate alternative variables or instruments for the ICT infrastructure. Further, the complementarity between the ICT infrastructure and the quality of human capital would deserve a closer look, as would an analysis where the firms have been grouped by their position in the value chain or their competitive status.

Appendix 1
See Fig. 5 and Tables 4, 5, 6.
Table 6 Impact of firms selling online on ICT infrastructure, by industry. System GMM estimations. Source: ESSLait Micro Moments Database and own calculations.
Asterisks ***, ** and * denote significance at the 1, 5 and 10 per cent levels, respectively. The two-step GMM estimator based on the Windmeijer correction for small samples and robust standard errors is used. Variable proportion of firms selling online and its squared term are treated as predetermined. In addition, the Hansen J-test supports the validity of the instruments at the one per cent significance level in all cases (p-value) and the AR (2)